Why is the byte array length the same as the String length?

Greenhorn

Posts: 3

posted 12 years ago


String s = "I m the best";
byte[] b = s.getBytes();

When I checked the length of the byte array, it was the same as the String length.

But in a String each character takes 2 bytes, since it uses Unicode for characters.
Why is that? Shouldn't the array be double the String length?

Tim Moores

Saloon Keeper

Posts: 7590


posted 12 years ago


There is no single "Unicode" encoding. Unicode defines several encodings (UTF-8, UTF-16, UTF-32, etc.). UTF-8 in particular, the most commonly found Unicode encoding, does NOT use 2 bytes for all characters. Specifically, for characters below 128 it is identical to ASCII, and thus uses a single byte for each of them. (To confuse you further, UTF-8 takes between 1 and 4 bytes per character; the original design allowed sequences of up to 6 bytes, but the current standard caps it at 4.)
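To make the variable width concrete, here is a minimal sketch that counts the UTF-8 bytes of one character from each size class. The class name Utf8Widths and the helper utf8Length are made up for this illustration; the only library call is String.getBytes(Charset) with StandardCharsets.UTF_8.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any library.
class Utf8Widths {
    // Number of bytes UTF-8 needs to encode the given text.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // Non-ASCII characters are written as escapes so the result
        // does not depend on the encoding of the source file.
        System.out.println(utf8Length("A"));            // U+0041, ASCII range: prints 1
        System.out.println(utf8Length("\u00e9"));       // U+00E9, e with acute accent: prints 2
        System.out.println(utf8Length("\u20ac"));       // U+20AC, euro sign: prints 3
        System.out.println(utf8Length("\uD83D\uDE00")); // U+1F600, outside the BMP: prints 4
    }
}
```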

You should read this: http://www.joelonsoftware.com/articles/Unicode.html

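If you call String.getBytes() with no arguments, it does not necessarily use any of the Unicode encodings, though - it uses the platform default encoding. That could be Cp1252, MacRoman, ISO-8859-1, UTF-8 or any of a number of other encodings. All of the ones listed here encode each ASCII character as a single byte, which is why your array has the same length as your string. If you want to encode in UTF-8, call String.getBytes("UTF-8").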
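As a rough sketch of the difference, the following compares the array length for the string from the question under a few explicitly named encodings. The class GetBytesLengths and the helper encodedLength are invented for this example. It passes a Charset constant from StandardCharsets rather than the name "UTF-8", which is an alternative overload that avoids the checked UnsupportedEncodingException; the result for the default charset depends on your platform, so no value is claimed for it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any library.
class GetBytesLengths {
    // Length of the byte array produced by encoding text with the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String s = "I m the best"; // 12 characters, all ASCII

        // Platform dependent: this is what the no-argument getBytes() uses.
        System.out.println(Charset.defaultCharset() + ": "
                + encodedLength(s, Charset.defaultCharset()));

        System.out.println(encodedLength(s, StandardCharsets.US_ASCII)); // prints 12
        System.out.println(encodedLength(s, StandardCharsets.UTF_8));    // prints 12
        System.out.println(encodedLength(s, StandardCharsets.UTF_16BE)); // prints 24
        // Java's "UTF-16" encoder prepends a 2-byte byte order mark.
        System.out.println(encodedLength(s, StandardCharsets.UTF_16));   // prints 26
    }
}
```

Only the UTF-16 variants give the "double the length" result the question expected, and only because they were requested explicitly.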

Quote: "while in String each character takes 2 bytes as it uses Unicode for the character."

Yes - a Java String is a sequence of UTF-16 code units, and each char is two bytes (characters outside the Basic Multilingual Plane take two chars, i.e. four bytes). But there is no way of getting at the internal representation of any Java object, so that fact is moot.
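A small sketch of what that means for String.length(), which counts UTF-16 code units rather than bytes or characters. The class CharsVersusCodePoints and the helper codePoints are names chosen for this example; String.length() and String.codePointCount(int, int) are the standard methods.

```java
// Hypothetical demo class, not part of any library.
class CharsVersusCodePoints {
    // Number of Unicode code points (as opposed to UTF-16 code units) in the text.
    static int codePoints(String text) {
        return text.codePointCount(0, text.length());
    }

    public static void main(String[] args) {
        String ascii = "I m the best";
        String emoji = "\uD83D\uDE00"; // U+1F600, stored as a surrogate pair

        System.out.println(ascii.length());    // prints 12
        System.out.println(codePoints(ascii)); // prints 12
        System.out.println(emoji.length());    // prints 2 (two chars)
        System.out.println(codePoints(emoji)); // prints 1 (one character)
    }
}
```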

Joanne Neal

Rancher

Posts: 3742

posted 12 years ago