unicode into char

 
Adrian Stent
Greenhorn
Posts: 4
I thought a Unicode value had to be in single quotes when assigned to a char. For some reason, the following seems to compile and run OK, but only in the range of \u0030 to \u0039 .

char d = \u0032; /* Compiles ok */
System.out.print(d);

char d = \u0040; /* Syntax error on token "Invalid Character", invalid VariableInitializer comes up */

I hope someone can explain to me why single quotes are not needed in the first example.
 
Joe Ess
Bartender
Posts: 9626
char is actually a numeric value:


4.2.1 Integral Types and Values
The values of the integral types are integers in the following ranges:

* For byte, from -128 to 127, inclusive
* For short, from -32768 to 32767, inclusive
* For int, from -2147483648 to 2147483647, inclusive
* For long, from -9223372036854775808 to 9223372036854775807, inclusive
* For char, from '\u0000' to '\uffff' inclusive, that is, from 0 to 65535


The Java Language Specification, §4.2.1
It just so happens that \u0030 to \u0039 are the Unicode escapes for the digits 0-9. The compiler makes a first pass that resolves Unicode escapes to their character equivalents, effectively changing your Unicode "characters" into literal digits, and from then on it treats the declarations as perfectly legal integer assignments.
[ February 10, 2005: Message edited by: Joe Ess ]
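That first-pass translation can be seen in a short runnable sketch (the class and variable names here are mine, not from the thread):

```java
public class EscapeTranslation {
    public static void main(String[] args) {
        // Before compilation proper, the compiler rewrites Unicode escapes,
        // so the next line is compiled exactly as if it read:  char d = 2;
        char d = \u0032;
        System.out.println(d == 2);   // true: d holds the integer 2, not the digit '2'
        System.out.println((int) d);  // 2
        // Inside quotes the escape denotes the character itself:
        System.out.println('\u0040'); // @
    }
}
```

Note that \u0040 only fails as a bare initializer because the translated source reads `char d = @;`, which is a syntax error; inside single quotes it is a perfectly good character literal.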
 
Adrian Stent
Greenhorn
Posts: 4
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Cheers for that. I think I'm almost there, but I'm confused about why a dot is displayed instead of a number.
e.g. char b = \u0031;
System.out.println(b); // I expected this to print 1, but a dot appears instead.
 
Joe Ess
Bartender
Posts: 9626
You are skipping a step in your mental compilation. If you write this:

char b = \u0031;

the Java compiler changes the Unicode escape into its character equivalent (all Java source files are assumed to be Unicode):

char b = 1;

Now this is setting the integral char to a value of 0x01 (or \u0001 if you want to stick with Unicode). If you print out b, you will be printing a "Start of Heading" control character, which many consoles render as a dot (or a smiley face).
That line is not the same thing as either of the following lines:

char x = '1';
char y = '\u0031';

The first example, x, uses a character literal to set the integral char to a value of 0x31 (or \u0031). The second example, y, uses a Unicode escape inside a character literal to do the same.
Now if you print out x or y, you will print "1". The lesson here is that char values aren't really "characters"; they are integral values which are interpreted as characters in the correct context. Perhaps a gander at the Unicode Character Table is in order?
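The whole comparison collapses into one runnable sketch (the class name is mine):

```java
public class CharLiterals {
    public static void main(String[] args) {
        char b = \u0031;   // translated to: char b = 1;  -> SOH control character
        char x = '1';      // character literal, value 0x31
        char y = '\u0031'; // escape inside a character literal, also 0x31
        System.out.println((int) b); // 1
        System.out.println((int) x); // 49 (that is, 0x31)
        System.out.println(x);       // 1
        System.out.println(y);       // 1
        System.out.println(x == y);  // true
    }
}
```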
 
abalfazl hossein
Ranch Hand
Posts: 635


System.out.println(Integer.toHexString('\u0633'));

output:

633



http://www.fileformat.info/info/unicode/char/633/index.htm

C/C++/Java source code "\u0633"



It is only 633, but on that page it is \u0633.

What is the reason for this difference? And what about the 'u' and the leading '0'?
 
Ranch Hand
Posts: 98

abalfazl hossein wrote:
It is only 633, but on that page it is \u0633.

What is the reason for this difference?
In your source code you need to let the compiler know that you're specifying a hex or Unicode value instead of a decimal integer, by doing something like this:

char c = 0x633;       // hex literal
char c2 = '\u0633';   // Unicode escape, same value

But the actual hex value is just 633, so that is what is output. If you want to make the output look like source code, you could (almost) do something like this:

System.out.println("\\u" + Integer.toHexString(c));

I say "almost", because you will need to refine that code a little bit to get the leading zero(s) if the hex value is less than 0x1000.
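One way to get those leading zeros is String.format with a zero-padded hex width (a sketch; the class and variable names are mine):

```java
public class EscapePrinter {
    public static void main(String[] args) {
        char c = '\u0633';
        // %04x prints the code point as four hex digits, padding with leading zeros
        System.out.println(String.format("\\u%04x", (int) c));   // \u0633
        System.out.println(String.format("\\u%04x", (int) 'A')); // \u0041
    }
}
```

(The `\\u` in the string is two source characters, backslash and u, so the compiler's escape-translation pass leaves it alone.)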
 
abalfazl hossein
Ranch Hand
Posts: 635
edited
 