Hi Patrick, I think you did not understand my question. I want to know why the commented-out lines in the code give compilation errors when the comment markers are removed, whereas the lines that are not commented out give no errors, even though both assign Unicode values to a char variable. I hope my question is clear this time. Kezia.
The '\u0027' single-quote literal is invalid because the escape is translated before the literal is parsed, so the compiler sees: ''' That is not a valid character literal and the compiler rejects it. Use '\'' instead. There is a description of this at www.javasoft.com, BUG ID: 4090696
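A minimal sketch of the workaround mentioned above. The class and method names are made up for illustration; the cast from the numeric code point is just one additional option, not something the bug report prescribes.

```java
class QuoteLiteralDemo {
    // The escape sequence \' is resolved when the literal itself is parsed,
    // so the quote character does not close the literal early.
    static char singleQuote() {
        return '\'';
    }

    // Casting the numeric code point is another way to get U+0027
    // without writing a Unicode escape inside the literal.
    static char singleQuoteFromCodePoint() {
        return (char) 0x27;
    }

    public static void main(String[] args) {
        // Both methods return the same character.
        System.out.println(singleQuote() == singleQuoteFromCodePoint());
        System.out.println((int) singleQuote());
    }
}
```

Running main should print true followed by 39, the decimal value of U+0027.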
Unicode escapes are special in a Java program: the compiler scans the source for these sequences and translates them first, before it does anything else. So an assignment like char test = '\u000a'; will not work, because the compiler translates the escape to a newline first and then sees your code as char test = '<line break>'; As you know, you cannot have a line break in a literal. To take this further, you can also have a variable declaration like char \u0061\u0062\u0063\u0064; and the compiler will interpret it as char abcd; [ January 11, 2002: Message edited by: Robert Troshynski ]
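To illustrate the identifier case, here is a small sketch. The class and method names are hypothetical, and the use of '\n' for a line feed is my own suggestion rather than something from the post. Note that the comments deliberately avoid spelling out the line-feed Unicode escape, since it would be translated even inside a comment.

```java
class EscapedIdentifierDemo {
    static int demo() {
        // Declared with Unicode escapes; after translation the compiler
        // sees an ordinary declaration of a variable named abcd.
        int \u0061\u0062\u0063\u0064 = 42;
        // The plain spelling refers to the same variable.
        return abcd;
    }

    // An escape sequence is a safe way to put a line feed (U+000A) in a char,
    // because it is resolved when the literal is parsed, not before.
    static char lineFeed() {
        return '\n';
    }

    public static void main(String[] args) {
        System.out.println(demo());
        System.out.println((int) lineFeed());
    }
}
```

This should print 42 and then 10.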
For char a = '\u000a'; the compiler translates the escape immediately. So the above statement turns into char a = '<line break>'; and since there is a line break in the char literal, the compiler flags an error. The same is the case with char b = '\u0027'; which gets translated to char b = '''; so the compiler flags an error. But for char c = '\u0008'; which is the Unicode value for backspace, and gets translated to char c = '; why does the compiler not flag any error? What about char d = '\u000c'; which is the Unicode value for formfeed, why does the compiler not flag an error in that case either? Can anybody clear this up for me? Thanks, Kezia.
But for char c = '\u0008'; which is the Unicode value for backspace, and gets translated to char c = '; why does the compiler not flag any error? What about char d = '\u000c'; which is the Unicode value for formfeed, why does the compiler not flag an error in that case either?
char c = '\u0008' is not translated to the expression you mention. The compiler simply does not do that. JLS 3.3 says that all Unicode escapes are translated to the corresponding Unicode character. For instance, \u0041 is translated to A. The purpose of this lexical translation is to allow text editors that don't support Unicode to produce Unicode characters, because all the symbols in a Unicode escape are ASCII, and all editors handle ASCII. JLS 3.4 says that the next lexical translation recognizes the line terminator characters: \n (Unix), \r (Mac) and \r\n (Windows). This determines the lines of the program. Because the line terminators are processed at this step, they are not allowed to be part of character or string literals, as stated in JLS 3.10.4 and 3.10.5. Also note that in JLS 3.4 only line terminators are recognized. The rest of the Unicode characters are left untouched: the backspace stays in the literal as an ordinary character and does not erase the quote before it. Such characters can appear within a string or character literal, with the logical exceptions of ' " and \
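A short sketch of the point above, with hypothetical class and method names: the backspace and formfeed escapes compile because neither character is a line terminator, and they yield the same values as the \b and \f escape sequences.

```java
class NonTerminatorEscapeDemo {
    // U+0008 (backspace) is not a line terminator, so the translated
    // character is legal inside a character literal.
    static char backspace() {
        return '\u0008';
    }

    // The same holds for U+000C (form feed).
    static char formFeed() {
        return '\u000c';
    }

    public static void main(String[] args) {
        // Each comparison checks the Unicode escape against the
        // equivalent escape sequence.
        System.out.println(backspace() == '\b');
        System.out.println(formFeed() == '\f');
    }
}
```

Both comparisons should print true.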