char in Java

 
Sachin Tripathi
Ranch Hand
Posts: 368
3
Why does the "char" primitive data type occupy 2 bytes in Java, while in C and C++ it occupies 1 byte?

And why does Java not support pointers?
 
fred rosenberger
lowercase baba
Bartender
Posts: 12562
49
Chrome Java Linux
You would probably have to ask the designers of the language to get the real truth. All anyone here can do is guess. Here are my guesses:

1) C predates Java by a long chalk. I don't think it was designed with internationalization in mind, only the Latin alphabet. One byte was enough. Java came along later, folks realized that one byte was NOT enough...so they made it two, thinking THAT would be enough. It's not, but that's another topic.

2) It depends on what you mean. Java does use pointers internally. What it does not allow is pointer arithmetic. Pointer arithmetic is notoriously difficult to get right and debug, and was the cause of all kinds of security exploits.
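
A minimal sketch of point 2, assuming nothing beyond the standard library (the class and method names are made up for illustration). It shows that a Java variable of object type holds a reference, which behaves like a pointer you can copy and compare but cannot do arithmetic on:

```java
// Illustrative only: ReferenceDemo and sameObject are hypothetical names.
class ReferenceDemo {
    // Returns true when both variables refer to the very same object.
    static boolean sameObject(Object a, Object b) {
        return a == b; // == on objects compares references, not contents
    }

    public static void main(String[] args) {
        StringBuilder first = new StringBuilder("abc");
        StringBuilder second = first;   // copies the reference, not the object
        second.append("def");           // the change is visible through both variables
        System.out.println(first);                      // abcdef
        System.out.println(sameObject(first, second));  // true
        // There is no Java equivalent of C's "first + 1": a reference cannot be
        // incremented, offset, or converted to an integer address.
    }
}
```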
 
Sachin Tripathi
Thank you, Fred. But if they were trying to incorporate every character of every language on earth this way, they must have known that was never going to work, and they ended up wasting a lot of memory. For Latin symbols 1 byte was sufficient; they should have kept using that.

 
Campbell Ritchie
Marshal
Posts: 56522
172
Why are you worrying about wasting memory?
When Java® was introduced, it was intended to support Unicode which in those days only supported up to 16 bits per character.
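
The 16-bit size can be checked from the constants on java.lang.Character. This is just a sketch; the class name CharSizeDemo is invented for the example, and Character.BYTES assumes Java 8 or later:

```java
// Illustrative only: CharSizeDemo is a hypothetical name.
class CharSizeDemo {
    static int bitsPerChar()  { return Character.SIZE; }   // width of a char in bits
    static int bytesPerChar() { return Character.BYTES; }  // width of a char in bytes (Java 8+)

    public static void main(String[] args) {
        System.out.println(bitsPerChar());               // 16
        System.out.println(bytesPerChar());              // 2
        // The largest value a single char can hold, printed as a number.
        System.out.println((int) Character.MAX_VALUE);   // 65535
    }
}
```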
 
Sachin Tripathi
Then who should worry about it?
 
Sachin Tripathi
Why haven't they tried to incorporate more characters by increasing the number of bytes in later editions?
 
Campbell Ritchie
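They have. Characters outside the original 16-bit range (supplementary code points) are represented by four bytes, that is, two chars known as a surrogate pair. The first char is a high surrogate and the second a low surrogate; neither is a character on its own, but together they encode the one code point.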

No longer a beginner's topic: moving discussion.
 
Sachin Tripathi
I don't follow. Can you please elaborate (with an example)?
 
Campbell Ritchie
It is by no means easy to explain. This Wikipedia article doesn't look easy to understand. Try this article by Joel Spolsky. It tells you more about UTF-8, whereas Java® represents char and String values as UTF-16, but the principles are the same.
 
Stephan van Hulst
Saloon Keeper
Posts: 7963
143
Simply put, you use one char to say that the actual character is two chars long, and then you use both chars to make up the character. So really, sometimes a character on your screen is actually two chars in Java.
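
A small sketch of that idea, using U+12400 (a cuneiform sign) as an arbitrary example of a character outside the 16-bit range. The class and method names are invented for illustration; the calls themselves are standard String and Character methods:

```java
// Illustrative only: SurrogateLengthDemo and its helpers are hypothetical names.
class SurrogateLengthDemo {
    // Number of 16-bit chars in the string.
    static int charCount(String s) {
        return s.length();
    }

    // Number of actual Unicode characters (code points) in the string.
    static int codePointCount(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String plain = "A";
        String supplementary = new String(Character.toChars(0x12400));

        System.out.println(charCount(plain) + " " + codePointCount(plain));                 // 1 1
        System.out.println(charCount(supplementary) + " " + codePointCount(supplementary)); // 2 1

        // The two chars are the two halves of a surrogate pair.
        System.out.println(Character.isHighSurrogate(supplementary.charAt(0))); // true
        System.out.println(Character.isLowSurrogate(supplementary.charAt(1)));  // true
    }
}
```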
 
Sachin Tripathi
Any example of such a character?
 
Stephan van Hulst
https://en.wikipedia.org/wiki/UTF-16#U.2B0000_to_U.2BD7FF_and_U.2BE000_to_U.2BFFFF
 
Sachin Tripathi
Thank you, Stephan. That cleared my doubts.
 
Campbell Ritchie
While we are on about chars, how would you write a \uXXXX escape for supplementary code points?
 
Stephan van Hulst
Don't you just write down the two codes for the surrogate pair?
 
Campbell Ritchie
Probably. If we want a cuneiform two ash (U+12400), you might have the 2400, but what would you write for the preceding character?
 
Paweł Baczyński
Bartender
Posts: 2074
44
Firefox Browser IntelliJ IDE Java Linux Spring
This page says it is \uD809\uDC00.
It prints 𒐀.
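
A quick way to check that escape, sketched with an invented class name (EscapeDemo): write the two surrogate escapes in a String literal and ask Java which code point they decode to.

```java
// Illustrative only: EscapeDemo and firstCodePoint are hypothetical names.
class EscapeDemo {
    // The surrogate pair quoted above, written as two escapes in one literal.
    static final String TWO_ASH = "\uD809\uDC00";

    // Decodes the code point that starts at index 0, combining a surrogate pair if present.
    static int firstCodePoint(String s) {
        return s.codePointAt(0);
    }

    public static void main(String[] args) {
        System.out.println(TWO_ASH.length());                                // 2
        System.out.println(Integer.toHexString(firstCodePoint(TWO_ASH)));    // 12400
    }
}
```

Whether the glyph itself shows up on screen depends on the console encoding and the installed fonts, not on the String.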
 
Campbell Ritchie
I never knew about that sort of page. Thank you. There appears to be an algorithm here, but it looks the sort of thing which takes a half‑hour to understand.
I tried System.out.println("\ud809\udc00"); and got a box with 12400 written in it. Obviously I would have to download some font or other to display it properly.
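
For what it is worth, the UTF-16 algorithm comes down to a subtraction and a 10-bit split. The sketch below (SurrogateMath and toSurrogates are names made up for this example) does it by hand and compares the result with the built-in Character.highSurrogate and Character.lowSurrogate, which assume Java 7 or later:

```java
// Illustrative only: SurrogateMath and toSurrogates are hypothetical names.
class SurrogateMath {
    // Splits a supplementary code point (U+10000 to U+10FFFF) into its surrogate pair.
    static char[] toSurrogates(int codePoint) {
        if (!Character.isSupplementaryCodePoint(codePoint)) {
            throw new IllegalArgumentException("not a supplementary code point");
        }
        int offset = codePoint - 0x10000;                  // leaves a 20-bit value
        char high = (char) (0xD800 + (offset >> 10));      // top 10 bits
        char low  = (char) (0xDC00 + (offset & 0x3FF));    // bottom 10 bits
        return new char[] { high, low };
    }

    public static void main(String[] args) {
        char[] pair = toSurrogates(0x12400);
        System.out.printf("%04X %04X%n", (int) pair[0], (int) pair[1]);    // D809 DC00

        // The library methods give the same halves.
        System.out.println(pair[0] == Character.highSurrogate(0x12400));   // true
        System.out.println(pair[1] == Character.lowSurrogate(0x12400));    // true
    }
}
```

For U+12400 the offset is 0x2400, whose top 10 bits are 9 and bottom 10 bits are 0, which is where D809 and DC00 come from.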

 
Campbell Ritchie
From that link you gave me, Paweł, it appears a font called LastResort includes U+12400. It also includes Z domain antirestriction (⩤) and Z range antirestriction (⩥) which I need and have great difficulty finding.
 