Unicode characters

 
Ranch Hand
Posts: 98
From the JLS:
Since Unicode escapes are processed very early, it is not correct to write a character literal as '\u000a'; use the escape sequence '\n' instead. Likewise, it is not correct to write '\u000d' for a carriage return; use '\r' instead. I wrote a small program to test this:
public class unicode
{
    public static void main(String[] args)
    {
        char c1='\u000a';
        char c='\u000d';
        System.out.println(c1);
        System.out.println(c);
    }
}
It gives me a compiler error, which is fine.
1) If I comment out the first statement using '//', it still gives a compiler error, but if I use /* */ comments it compiles. What's happening here?
2) The JLS also says I can't use '\u000d' for a carriage return, so why does the compiler complain about the newline character but not about this one? The code below actually compiles:
public class unicode
{
    public static void main(String[] args)
    {
        /* char c1='\u000a';*/
        char c='\u000d';
        System.out.println(c);
    }
}
 
Wanderer
Posts: 18671
Interesting. I hadn't known this before, but here's what I came up with:
First question:
As the JLS says, Unicode escape sequences are processed very early. In fact, they are translated before the compiler recognizes comments. So:
<code><pre> // char c1='\u000a';</pre></code>
is processed as if it were:
<code><pre> // char c1='
';</pre></code>
...which of course confuses the compiler: the // comment ends at the line break, so the leftover '; on the next line is treated as code. On the other hand,
<code><pre> /* char c1='\u000a'; */</pre></code>
is processed as
<code><pre> /* char c1='
'; */</pre></code>
...which looks strange, but is OK since the compiler interprets it as a multi-line comment.
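A minimal sketch of the same point, as something you can compile yourself. The class and method names are made up for illustration. It assumes only that Unicode escapes are translated before the source is split into tokens and comments: the block comment swallows the line break produced by the escape, and an escape can even spell part of an identifier.

```java
class CommentEscapeDemo {
    static int viaEscapedIdentifier() {
        // The identifier on the next line is the letter "a", spelled with a Unicode escape.
        int \u0061 = 7;
        return a;
    }

    static int afterBlockComment() {
        /* char c1='\u000a'; */
        return 42;
    }

    public static void main(String[] args) {
        System.out.println(viaEscapedIdentifier());
        System.out.println(afterBlockComment());
    }
}
```

Changing the block comment in afterBlockComment() to a // comment should make the class stop compiling, for the reason described above.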
As for your second question, that code does not compile on my machine, running JDK 1.2.2 and 1.3.0 on Windows NT. Are you using Unix or some other system that interprets carriage returns differently? It may be that '\u000d' is OK on some systems, but for platform independence you should avoid using it.
The moral: '\u000a' and '\u000d' are evil. Even if they compile, they probably don't mean what you think they mean. Use '\n' and '\r' instead.
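As a small illustration of that advice, here is a sketch that uses the escape sequences and checks that they denote the code points 10 and 13. The class and method names are hypothetical. Note that the comments deliberately avoid spelling out the line-feed escape, because it would be translated even inside a comment.

```java
class EscapeDemo {
    // '\n' is the line feed character (code point 10).
    static char lineFeed() {
        return '\n';
    }

    // '\r' is the carriage return character (code point 13).
    static char carriageReturn() {
        return '\r';
    }

    public static void main(String[] args) {
        System.out.println((int) lineFeed());        // prints 10
        System.out.println((int) carriageReturn());  // prints 13

        // A Unicode escape that is not a line terminator is harmless in a literal.
        char a = '\u0041';
        System.out.println(a == 'A');                // prints true
    }
}
```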
 
Surya B
Ranch Hand
Posts: 98
Hi Jim,
Thanks for the reply. What you said about the first question makes sense to me. As for the second question, it did compile on my machine; I am using JDK 1.2 on Windows NT. The output is also surprising. Take this piece of code:
public class unicode
{
    public static void main(String[] args)
    {
        char c='\u000d';
        System.out.print("Test");
        System.out.print(c);
        System.out.print("this");
    }
}
Can you guess what the output on my machine is? It is just 'this'. 'Test' and the carriage return are missing. If I comment out the line that prints c, it prints 'Testthis'.
This is proving quite interesting to me: it compiles on my machine, contrary to the JLS, and gives interesting results.
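One likely explanation for seeing only 'this' is that the console treats a carriage return as "move the cursor back to the start of the line", so the four characters of "this" overwrite the four characters of "Test". The sketch below models that behavior; renderLine is a hypothetical helper written for this illustration, not a JDK method, and real consoles may behave differently.

```java
class CarriageReturnDemo {
    // Hypothetical model of one console line: '\r' moves the cursor to
    // column 0, and later characters overwrite whatever is already there.
    static String renderLine(String raw) {
        StringBuilder line = new StringBuilder();
        int cursor = 0;
        for (char ch : raw.toCharArray()) {
            if (ch == '\r') {
                cursor = 0;
            } else {
                if (cursor < line.length()) {
                    line.setCharAt(cursor, ch);
                } else {
                    line.append(ch);
                }
                cursor++;
            }
        }
        return line.toString();
    }

    public static void main(String[] args) {
        System.out.println(renderLine("Test\rthis"));    // prints this
        System.out.println(renderLine("Testing\rthis")); // prints thising
    }
}
```

Under this model, a longer first word would leave its tail visible, which is one way to check the explanation on a real console.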
 
Ranch Hand
Posts: 40
It doesn't compile for me using JDK 1.2.1 on Windows NT.
With '\r' instead, the following program did print "this":
<pre>
public class unicode
{
    public static void main(String[] args)
    {
        char c='\r';
        System.out.print("Test");
        System.out.print(c);
        System.out.print("this");
    }
}
</pre>
Is anyone else running JDK 1.2 on NT? What result do you get with Surya's program?
 
Ranch Hand
Posts: 136
Hi everyone,
I am using JDK 1.2.1 under Windows 95 and tried both versions of Surya's program.
Both give me the following error:
unicode.java:7: Invalid character constant.
char c='\u000d';
^
However, if I change '\u000d' to '\r' in the second version, it compiles and the output is "this".
 
Surya B
Ranch Hand
Posts: 98
Hi everyone,
Thanks for taking the time to compile the program. I am running it under this Java version:
E:\java -version
java version "1.2.2"
Classic VM (build JDK-1.2.2-W, native threads, symcjit)
The program compiles fine with this version, and it also compiles fine with JDK 1.1.6, which I have installed as well. Maybe it has something to do with the combination of NT service pack and JDK. Anyway, thanks once again.
 