
Why is the length of the byte array the same as the String length?

 
vikas varshney
Greenhorn
Posts: 3



String s = "I m the best";
byte[] b = s.getBytes();

When I checked the length of the byte array, it was the same as the length of the String.

But in a String each character takes 2 bytes, because Java uses Unicode for characters.
Why is that? Shouldn't the array be twice the String's length?

 
Tim Moores
Bartender
Posts: 2947
There is no "Unicode" encoding. Unicode defines several encodings (UTF-8, UTF-16, UTF-32, etc.). UTF-8 in particular, the most commonly found Unicode encoding, does NOT use 2 bytes for all characters. For characters below 128 it is identical to ASCII and thus uses a single byte per character. (To confuse you further, UTF-8 takes between 1 and 4 bytes per character; the original design allowed up to 6, but the current standard caps it at 4.)
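To make the variable width concrete, here is a minimal sketch, assuming Java 7 or later for `StandardCharsets`. The class and method names are made up for illustration, and the sample characters are just ones that happen to fall in each width range.

```java
import java.nio.charset.StandardCharsets;

class Utf8Widths {
    // Number of bytes UTF-8 needs to encode the given text.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // Unicode escapes keep the example independent of the source file encoding.
        System.out.println(utf8Length("A"));            // 1 byte:  U+0041, plain ASCII
        System.out.println(utf8Length("\u00E9"));       // 2 bytes: U+00E9, e with acute accent
        System.out.println(utf8Length("\u20AC"));       // 3 bytes: U+20AC, euro sign
        System.out.println(utf8Length("\uD83D\uDE00")); // 4 bytes: U+1F600, given as a surrogate pair
    }
}
```

Only the first case matches the "one byte per character" result from the original question, which is why an all-ASCII String hides the difference.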

You should read this: http://www.joelonsoftware.com/articles/Unicode.html

If you call String.getBytes() without arguments, it may not use any of the Unicode encodings at all: it uses the platform default encoding. That could be Cp1252, MacRoman, ISO-8859-1, UTF-8 or any of a number of other encodings (from Java 18 on, the default is UTF-8 unless configured otherwise). If you want to encode in UTF-8 regardless of platform, call String.getBytes("UTF-8").
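A rough sketch of the difference between the default and an explicit encoding, using the String from the original question. It assumes Java 7 or later and uses the `StandardCharsets` constants instead of the `"UTF-8"` name, which avoids the checked `UnsupportedEncodingException`. The class and helper names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingLengths {
    // Length in bytes of the text when encoded with the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String s = "I m the best"; // 12 chars, all ASCII

        // The no-argument overload depends on the platform, so no fixed result is shown.
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("default:    " + s.getBytes().length);

        System.out.println("ISO-8859-1: " + encodedLength(s, StandardCharsets.ISO_8859_1)); // 12
        System.out.println("UTF-8:      " + encodedLength(s, StandardCharsets.UTF_8));      // 12
        System.out.println("UTF-16BE:   " + encodedLength(s, StandardCharsets.UTF_16BE));   // 24
        System.out.println("UTF-16:     " + encodedLength(s, StandardCharsets.UTF_16));     // 26, includes a 2-byte byte order mark
    }
}
```

The "double the length" result the question expected only appears when a UTF-16 encoding is requested explicitly.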

vikas varshney wrote: while in a String each character takes 2 bytes as it uses Unicode for the character.

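Yes, a Java String is a sequence of UTF-16 code units (char values) of two bytes each, and characters outside the Basic Multilingual Plane take two code units. But how the JVM actually stores a String in memory is an implementation detail that you cannot get at through the API, so that fact is moot.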
 
Joanne Neal
Rancher
Posts: 3742
Read the javadoc for the String.length method: it returns the number of Unicode code units (char values) in the string, not the number of bytes.
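As a small illustration of what String.length counts, the sketch below compares it with `codePointCount`. The class and method names are invented for this example; the second String is a single character outside the Basic Multilingual Plane, written as a surrogate pair.

```java
class LengthVsCodePoints {
    // What String.length() reports: UTF-16 code units (char values).
    static int codeUnits(String text) {
        return text.length();
    }

    // Number of Unicode code points (roughly, "characters") in the text.
    static int codePoints(String text) {
        return text.codePointCount(0, text.length());
    }

    public static void main(String[] args) {
        String ascii = "best";
        String emoji = "\uD83D\uDE00"; // U+1F600 as a surrogate pair

        System.out.println(codeUnits(ascii) + " " + codePoints(ascii)); // 4 4
        System.out.println(codeUnits(emoji) + " " + codePoints(emoji)); // 2 1
    }
}
```

Neither number is a byte count; bytes only come into play once an encoding is chosen via getBytes.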
 
vikas varshney
Greenhorn
Posts: 3
What if I use a String that contains characters that need UTF-16 encoding (for example, characters from another language)?
What will happen then?
 
Paul Clapham
Sheriff
Posts: 21416
Try it and see.
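One way to try it, as a sketch under a few assumptions: Java 7 or later, a Hindi word chosen arbitrarily as the test String, and class and method names made up for the example. The Devanagari text is written with Unicode escapes so the result does not depend on the source file encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class NonAsciiBytes {
    // "namaste" in Devanagari: six code points, all in the Basic Multilingual Plane.
    static final String NAMASTE = "\u0928\u092E\u0938\u094D\u0924\u0947";

    static int byteCount(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    // Encode and decode with the same charset to see whether the text survives.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        System.out.println(NAMASTE.length());                                 // 6 chars
        System.out.println(byteCount(NAMASTE, StandardCharsets.UTF_8));       // 18, three bytes each
        System.out.println(byteCount(NAMASTE, StandardCharsets.UTF_16BE));    // 12, two bytes each
        System.out.println(byteCount(NAMASTE, StandardCharsets.ISO_8859_1));  // 6, but every byte is '?'

        System.out.println(roundTrip(NAMASTE, StandardCharsets.UTF_8).equals(NAMASTE)); // true
        System.out.println(roundTrip(NAMASTE, StandardCharsets.ISO_8859_1));            // ??????
    }
}
```

The ISO-8859-1 case shows the risk with a default encoding that cannot represent the text: getBytes silently substitutes a replacement byte, so the byte count looks plausible while the content is lost.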
 