What is the range of Unicode values for the char data type?

What is the range of Unicode values that I can use to initialize a char?
A char can range from 0 to 65535. I don't know how many of these values have been assigned a Unicode graphic.
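A minimal sketch of how to confirm that range from code, using the standard Character.MIN_VALUE and Character.MAX_VALUE constants. The class and method names (CharRange, min, max) are made up for illustration.

```java
// Sketch: inspect the numeric bounds of the char type.
class CharRange {
    // Lowest value a char can hold, widened to int.
    static int min() {
        return Character.MIN_VALUE;
    }

    // Highest value a char can hold, widened to int.
    static int max() {
        return Character.MAX_VALUE;
    }

    public static void main(String[] args) {
        System.out.println(min() + " to " + max()); // prints "0 to 65535"

        char c = 65535; // largest constant that fits in a char without a cast
        c++;            // char is an unsigned 16-bit type, so this wraps around
        System.out.println((int) c); // prints "0"
    }
}
```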
You might be interested in the method Character.isDefined(char ch), which returns true if the argument char is defined in Unicode and false otherwise.

In general, you'll find that within the range of possible char values, there are numerous Unicode gaps. For example, in the Unicode version that Java 1.5 uses (Unicode 4.0), \u0237 through \u0249 are not defined. You can assign these values to a char, but they don't correspond to any assigned Unicode character.
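Here is a rough sketch of how Character.isDefined could be used to probe the gaps and to count the defined values in the char range. The class name DefinedChars and the method countDefined are hypothetical. The count depends on the Unicode version your JDK ships with, so don't expect it to match a number reported by someone on a different release.

```java
// Sketch: probe which char values the running JDK treats as defined.
class DefinedChars {
    // Counts the char values for which Character.isDefined returns true.
    static int countDefined() {
        int count = 0;
        // Loop with an int: a char counter would wrap to 0 and never terminate.
        for (int i = Character.MIN_VALUE; i <= Character.MAX_VALUE; i++) {
            if (Character.isDefined((char) i)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(Character.isDefined('A'));      // prints "true"
        // U+FFFF is a permanently reserved noncharacter, so it is never defined.
        System.out.println(Character.isDefined('\uFFFF')); // prints "false"
        // The total varies with the JDK's Unicode version.
        System.out.println(countDefined());
    }
}
```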

Also note that in Java 1.5, some of the values within the char range are used for "surrogate pairs," which allow representation of supplementary characters, that is, Unicode characters with code points greater than U+FFFF. In the context of a single 16-bit char, a surrogate value (\uD800 - \uDFFF) does not represent a character on its own.

"...supplementary characters are represented as a pair of char values, the first from the high-surrogates range, (\uD800-\uDBFF), the second from the low-surrogates range (\uDC00-\uDFFF). A char value, therefore, represents Basic Multilingual Plane (BMP) code points [\u0000 to \uFFFF], including the surrogate code points..."

Ref: http://java.sun.com/j2se/1.5.0/docs/api/java/lang/Character.html
[ February 24, 2005: Message edited by: marc weber ]
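To make the surrogate-pair idea concrete, here is a small sketch using methods that were added to Character in Java 1.5 (toChars, isHighSurrogate, isLowSurrogate, toCodePoint). The class name SurrogateDemo and the choice of U+1D11E (MUSICAL SYMBOL G CLEF) as the sample supplementary character are just for illustration.

```java
// Sketch: a supplementary character needs two char values in UTF-16.
class SurrogateDemo {
    // U+1D11E MUSICAL SYMBOL G CLEF, a code point above U+FFFF.
    static final int G_CLEF = 0x1D11E;

    // Encodes a code point as one or two UTF-16 code units.
    static char[] encode(int codePoint) {
        return Character.toChars(codePoint);
    }

    public static void main(String[] args) {
        char[] units = encode(G_CLEF);
        System.out.println(units.length);                         // prints "2"
        System.out.println(Integer.toHexString(units[0]));        // prints "d834"
        System.out.println(Integer.toHexString(units[1]));        // prints "dd1e"
        System.out.println(Character.isHighSurrogate(units[0]));  // prints "true"
        System.out.println(Character.isLowSurrogate(units[1]));   // prints "true"

        // Combining the pair gives back the original code point.
        System.out.println(Character.toCodePoint(units[0], units[1]) == G_CLEF); // prints "true"

        // A String counts char values, not characters.
        String s = new String(units);
        System.out.println(s.length());                      // prints "2"
        System.out.println(s.codePointCount(0, s.length())); // prints "1"
    }
}
```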

Originally posted by Mike Gershman:
A char can range from 0 to 65535. I don't know how many of these values have been assigned a Unicode graphic.


I count 59177.
I know they range from 0 to 65535 in terms of integers, but I would really like to know the range in terms of Unicode characters.
The point is that I've seen questions on mock exams that ask me, for example, whether

char a = '\u000d'

is valid. In this case it's not, although it really looks like it would be all right. I also checked that

char b = '\101'

is also valid. Is this weird or not? So, am I supposed to memorize all valid Unicode initializations for the exam?

The point is that I've seen questions on mock exams that ask me, for example, whether

char a = '\u000d'

is valid. In this case it's not, although it really looks like it would be all right.


'\u000d' is the carriage return character (not 'a') and is a legal Unicode character. However, the escapes \u000d and \u000a (line feed) should not appear anywhere in a Java source program, not even in comments, because the Java compiler translates Unicode escapes before it parses the program text. It therefore treats them as actual line breaks and splits your statement across two lines. Use '\r' and '\n' instead.
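A short sketch of the safe spelling, assuming nothing beyond the standard escape sequences. The class name LineBreakEscapes is made up. Note that the comments deliberately write the code points as U+000D and U+000A, since the Unicode-escape spelling would break the source file even inside a comment.

```java
// Sketch: use the escape sequences for CR and LF in char literals.
class LineBreakEscapes {
    static final char CR = '\r'; // carriage return, code point U+000D
    static final char LF = '\n'; // line feed, code point U+000A

    public static void main(String[] args) {
        System.out.println((int) CR); // prints "13"
        System.out.println((int) LF); // prints "10"

        // A cast from the numeric value is another spelling that compiles safely.
        System.out.println(CR == (char) 0x000D); // prints "true"
        System.out.println(LF == (char) 0x000A); // prints "true"
    }
}
```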

If you really want to learn some Unicode, just remember those two plus a few landmarks: '\u0020' is a space; digits start at '\u0030' for 0, so '\u0031' is 1, and so on; capital letters start at '\u0041' for A; and lowercase letters start at '\u0061' for a. That is more than enough for the SCJP exam and for ordinary programming in the English language.
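Those landmarks are easy to check in code. The sketch below uses a hypothetical class AsciiLandmarks. It also includes one octal escape for comparison: '\101' is octal for 65, which is another way to write 'A'. Unlike an octal escape, a Unicode escape always needs exactly four hex digits.

```java
// Sketch: verify the landmark code points and do simple char arithmetic.
class AsciiLandmarks {
    // True when each Unicode escape equals the plain literal it stands for.
    static boolean matches() {
        return '\u0020' == ' '
            && '\u0030' == '0'
            && '\u0031' == '1'
            && '\u0041' == 'A'
            && '\u0061' == 'a';
    }

    public static void main(String[] args) {
        System.out.println(matches());        // prints "true"
        System.out.println('\101' == 'A');    // prints "true" (octal 101 is 65)
        System.out.println((char) ('A' + 1)); // prints "B"
        System.out.println('7' - '0');        // prints "7"
        System.out.println('a' - 'A');        // prints "32"
    }
}
```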
Thanks for the explanation Mike, you got to the point I need.