
getBytes() in String

String s = new String("12345");
byte[] a = s.getBytes();

// Print each encoded byte as a number
for (int i = 0; i < a.length; i++)
    System.out.println(a[i]);

I am getting ASCII values as output, but according to the JavaDoc I should get 1 2 3 4 5 as output, right?
The byte[] will contain character codes for the characters in the String, encoded with the platform's default charset, which varies with the operating system and locale settings. For many installations, you will indeed effectively get the ASCII codes for the characters 1, 2, 3, 4, and 5: 49, 50, 51, 52, 53. If you want the characters themselves, use toCharArray().
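To make the difference concrete, here is a minimal sketch that prints both views of the same string. The class name BytesVersusChars and the helper asciiBytes are made up for illustration, and the use of StandardCharsets assumes Java 7 or later; the charset is named explicitly so the result does not depend on the platform default.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BytesVersusChars {
    // Hypothetical helper: encodes explicitly as US-ASCII instead of the platform default.
    static byte[] asciiBytes(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String s = "12345";
        // Encoded bytes: the numeric character codes
        System.out.println(Arrays.toString(asciiBytes(s)));   // [49, 50, 51, 52, 53]
        // The characters themselves
        System.out.println(Arrays.toString(s.toCharArray())); // [1, 2, 3, 4, 5]
    }
}
```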
You get different values from the getBytes() method depending on the character encoding in use, because that is what getBytes() does: it assigns one or more bytes to represent each character, depending on the encoding.

ASCII values fit in one byte, but other character sets, such as Unicode, contain far more than 256 characters. Hence, an encoding of such a set has to represent some characters using more than one byte.

You can determine the current character encoding settings by means of the System class:

String currentEncoding = System.getProperty("file.encoding");

Or by means of the Charset class: Charset.defaultCharset()
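Both lookups can be tried side by side with a small sketch like the one below. The class name ShowDefaultEncoding and the helper defaultCharsetName are illustrative only; the values printed depend on your JVM and operating system, so no output is shown.

```java
import java.nio.charset.Charset;

public class ShowDefaultEncoding {
    // Hypothetical helper: canonical name of the JVM's default charset.
    static String defaultCharsetName() {
        return Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        // The system property as the JVM reports it
        System.out.println("file.encoding  = " + System.getProperty("file.encoding"));
        // The charset that getBytes() without arguments will use
        System.out.println("defaultCharset = " + defaultCharsetName());
    }
}
```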

You can change the default encoding used by your application by setting this property when you launch the application, for instance:

> java -Dfile.encoding=UTF-8
> java -Dfile.encoding=ASCII
> java -Dfile.encoding=UTF-16
> java -Dfile.encoding=Cp1252
> java -Dfile.encoding=Cp500

If you use UTF-16, for instance, every character in this example occupies two bytes (and getBytes() also prepends a two-byte byte-order mark), whereas with ASCII every character occupies just one byte.

The String class has a method getBytes(String charset) that lets you set the encoding used to generate the bytes.

Notice that different encodings yield different numbers of bytes for the same string.
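As a rough illustration of the byte counts, the sketch below encodes "12345" with four charsets that every Java platform is required to support. The class EncodedLengths and the helper encodedLength are hypothetical names, and the sketch uses the getBytes(Charset) overload (Java 6 and later) rather than getBytes(String) only to avoid the checked UnsupportedEncodingException.

```java
import java.nio.charset.Charset;

public class EncodedLengths {
    // Hypothetical helper: number of bytes the string occupies in the named charset.
    static int encodedLength(String s, String charsetName) {
        return s.getBytes(Charset.forName(charsetName)).length;
    }

    public static void main(String[] args) {
        String s = "12345";
        String[] names = {"US-ASCII", "UTF-8", "UTF-16", "UTF-16BE"};
        for (String name : names) {
            System.out.println(name + ": " + encodedLength(s, name) + " bytes");
        }
        // US-ASCII: 5 bytes
        // UTF-8: 5 bytes
        // UTF-16: 12 bytes (two-byte byte-order mark plus two bytes per character)
        // UTF-16BE: 10 bytes (no byte-order mark)
    }
}
```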

Another option is to use the CharsetEncoder and CharsetDecoder classes.
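A minimal sketch of that route might look like this. EncoderRoundTrip and roundTrip are names invented for the example, and wrapping the checked CharacterCodingException in an IllegalArgumentException is just one possible choice. Unlike getBytes(), the encode() convenience method reports characters it cannot map instead of silently substituting them.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;

public class EncoderRoundTrip {
    // Hypothetical helper: encodes the string to bytes and decodes it back with the same charset.
    static String roundTrip(String s, String charsetName) {
        Charset charset = Charset.forName(charsetName);
        CharsetEncoder encoder = charset.newEncoder();
        CharsetDecoder decoder = charset.newDecoder();
        try {
            ByteBuffer bytes = encoder.encode(CharBuffer.wrap(s)); // characters -> bytes
            return decoder.decode(bytes).toString();               // bytes -> characters
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException("Cannot encode with " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("12345", "UTF-8")); // 12345
    }
}
```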

If you use ASCII, the generated bytes correspond to the ASCII character codes. This means that if you assign every byte of the array to a char variable, you get the corresponding ASCII character back again.
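The byte-to-char assignment described above can be sketched as follows. The class AsciiBytesToChars and the helper bytesToChars are illustrative names, and StandardCharsets assumes Java 7 or later. The cast is only meaningful here because every ASCII code lies in the range 0 to 127.

```java
import java.nio.charset.StandardCharsets;

public class AsciiBytesToChars {
    // Hypothetical helper: widens each byte to a char, which is only valid for ASCII data.
    static String bytesToChars(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            char c = (char) b; // 49 becomes '1', 50 becomes '2', and so on
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] a = "12345".getBytes(StandardCharsets.US_ASCII);
        System.out.println(bytesToChars(a)); // 12345
    }
}
```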

But it will print unknown characters if you use an encoding that is not ASCII-compatible, such as UTF-16 or Cp500.
[ May 26, 2006: Message edited by: Edwin Dalorzo ]