Why is the length of the byte array the same as the String length?

 
Greenhorn



String s = "I m the best";
byte[] b = s.getBytes();

When I checked the length of the byte array, it was the same as the length of the String.

But in a String each character takes 2 bytes, since it uses Unicode for its characters.
Why is that? Shouldn't it be double the String length?

 
Saloon Keeper
There is no "Unicode" encoding. Unicode defines several encodings (UTF-8, UTF-16, UTF-32, etc.). UTF-8 in particular, the most commonly found Unicode encoding, does NOT use 2 bytes for all characters. For characters below 128 it is identical to ASCII, so each of those characters takes a single byte. (To confuse you further, UTF-8 takes between 1 and 4 bytes per character, depending on the character.)

You should read this: http://www.joelonsoftware.com/articles/Unicode.html

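If you use String.getBytes() with no argument, it uses the platform default encoding, which is not necessarily one of the Unicode encodings. That could be Cp1252, MacRoman, ISO-8859-1, UTF-8 or any of a number of other encodings. (Since Java 18 the default is UTF-8 unless it is overridden.) All of those encode the ASCII characters in your string as one byte each, which is why the array length matches the String length. If you want to encode in UTF-8, say so explicitly: call String.getBytes("UTF-8") or, to avoid the checked UnsupportedEncodingException, String.getBytes(StandardCharsets.UTF_8).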
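To make the point about encodings concrete, here is a minimal sketch that encodes the string from the question with a few explicitly named charsets. The class name EncodedLengths and the helper byteLength are made up for illustration; the charset constants come from java.nio.charset.StandardCharsets. For this all-ASCII string, UTF-8 and ISO-8859-1 should both give 12 bytes and UTF-16BE 24, while the result for the default charset depends on the platform.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodedLengths {
    // Hypothetical helper: number of bytes produced when s is encoded with charset.
    static int byteLength(String s, Charset charset) {
        return s.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String s = "I m the best";

        // length() counts char values (UTF-16 code units), not bytes.
        System.out.println("chars:      " + s.length());

        // No-argument getBytes() uses whatever the platform default charset is.
        System.out.println("default (" + Charset.defaultCharset() + "): "
                + s.getBytes().length);

        // Explicit charsets give the same answer on every platform.
        System.out.println("UTF-8:      " + byteLength(s, StandardCharsets.UTF_8));
        System.out.println("ISO-8859-1: " + byteLength(s, StandardCharsets.ISO_8859_1));
        System.out.println("UTF-16BE:   " + byteLength(s, StandardCharsets.UTF_16BE));
    }
}
```

UTF_16BE is used rather than UTF_16 so that the count is not inflated by the two-byte byte-order mark that Java's UTF-16 encoder writes at the start.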

"while in String each character takes 2 bytes as it uses Unicode for the character."


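Yes, a Java String is exposed as a sequence of UTF-16 code units (char values) of two bytes each, and characters outside the Basic Multilingual Plane take two chars. But there is no way of getting at the internal representation of any Java object (how the JVM actually stores the characters is an implementation detail), so that fact is moot: getBytes() gives you the bytes of whichever encoding was used, not the in-memory form.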
 
Rancher
Read the Javadoc for the String.length() method.
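What that Javadoc says, in short, is that length() returns the number of UTF-16 code units (char values) in the string, which is neither a byte count nor always a character count. A small sketch, with the class name LengthDemo and the helper codePoints invented for illustration:

```java
class LengthDemo {
    // Hypothetical helper: number of Unicode code points in s.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String ascii = "best";
        // U+1F600 (a character outside the Basic Multilingual Plane),
        // written as its surrogate pair so the source file stays ASCII.
        String emoji = "\uD83D\uDE00";

        // For plain ASCII text both counts agree.
        System.out.println(ascii.length() + " chars, " + codePoints(ascii) + " code points");

        // For a supplementary character, length() reports two chars for one code point.
        System.out.println(emoji.length() + " chars, " + codePoints(emoji) + " code point");
    }
}
```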
 
vikas varshney
Greenhorn
What if I use a String that contains characters that only use UTF-16 encoding (maybe characters from some other language)?
What will happen then?
 
Marshal
Try it and see.
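One way to try it, as a rough sketch: the class TryItAndSee and its helpers hex and show are hypothetical names, and the three sample characters are arbitrary picks. It prints the encoded bytes in hex for each charset so the differences are visible. Note that String.getBytes(Charset) substitutes the charset's replacement byte for characters it cannot represent, so a character missing from ISO-8859-1 comes out as a single '?' (3F) rather than causing an error.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class TryItAndSee {
    // Hypothetical helper: bytes as space-separated uppercase hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Hypothetical helper: print char count and encoded bytes for a few charsets.
    static void show(String label, String s) {
        System.out.println(label + ": " + s.length() + " char(s)");
        Charset[] charsets = {
            StandardCharsets.UTF_8, StandardCharsets.UTF_16BE, StandardCharsets.ISO_8859_1
        };
        for (Charset cs : charsets) {
            byte[] encoded = s.getBytes(cs);
            System.out.println("  " + cs + " -> " + encoded.length + " byte(s): " + hex(encoded));
        }
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file pure ASCII.
        show("U+00E9 (e with acute)", "\u00e9");
        show("U+20AC (euro sign)", "\u20ac");
        show("U+1F600 (emoji)", "\uD83D\uDE00");
    }
}
```

With explicit charsets the byte count no longer tracks the char count: the euro sign is one char but three bytes in UTF-8, and the emoji is two chars and four bytes in both UTF-8 and UTF-16BE.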
 