getBytes() in String

 
Greenhorn
Posts: 7
String s = "12345";
byte[] a = s.getBytes();  // encodes with the platform default charset

for (int i = 0; i < a.length; i++) {
    System.out.println(a[i]);  // prints the numeric byte value, e.g. 49 for '1'
}

I am getting ASCII values as output, but according to the Javadoc I should get 1 2 3 4 5 as output, right?
 
author and iconoclast
Posts: 24207
The byte[] will contain the character codes for the characters in the String, encoded with the platform's default encoding, which varies with operating system and locale settings. For many installations, you will indeed effectively get the ASCII codes for the characters 1, 2, 3, 4, and 5: 49, 50, 51, 52, 53. If you want the characters themselves, use toCharArray().
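To make the difference concrete, here is a minimal sketch that prints both views of the same string. The class name BytesVersusChars and the helper asciiBytes are made up for illustration, and StandardCharsets assumes Java 7 or later; the charset is passed explicitly so the result does not depend on the platform default.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the original question.
public class BytesVersusChars {

    // Encodes with an explicit charset instead of the platform default.
    static byte[] asciiBytes(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String s = "12345";

        // Byte view: the numeric codes of the characters.
        System.out.println(Arrays.toString(asciiBytes(s)));    // [49, 50, 51, 52, 53]

        // Character view: the characters themselves.
        System.out.println(Arrays.toString(s.toCharArray()));  // [1, 2, 3, 4, 5]
    }
}
```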
 
Ranch Hand
Posts: 961
The values you get from the getBytes() method depend on the character encoding you are using. That is what getBytes() does: it represents each character as one or more bytes, depending on the encoding.

ASCII values fit in one byte, but other character sets, such as Unicode, contain far more than 256 characters. Hence, encodings of those sets have to represent some or all characters using more than one byte.

You can determine the current character encoding settings by means of the System class:

String currentEncoding = System.getProperty("file.encoding");

Or by means of the Charset class: Charset.defaultCharset()
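Both lookups can be tried side by side. This is only a sketch: the class name DefaultEncoding and the helper defaultCharsetName are invented for the example, and the printed values depend on the machine and JVM it runs on.

```java
import java.nio.charset.Charset;

// Hypothetical demo class for inspecting the default encoding.
public class DefaultEncoding {

    // Canonical name of the charset the JVM uses by default.
    static String defaultCharsetName() {
        return Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        // Output varies by platform, locale, and JVM version.
        System.out.println("file.encoding  = " + System.getProperty("file.encoding"));
        System.out.println("defaultCharset = " + defaultCharsetName());
    }
}
```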

You can change the default encoding used by your application by setting this property when you launch the application, for instance:

> java -Dfile.encoding=UTF-8
> java -Dfile.encoding=ASCII
> java -Dfile.encoding=UTF-16
> java -Dfile.encoding=Cp1252
> java -Dfile.encoding=Cp500

If you use UTF-16, for instance, every Java char occupies two bytes (and getBytes() with "UTF-16" also prepends a two-byte byte-order mark), but if you use ASCII, every character occupies just one byte.

The String class has a method getBytes(String charsetName) that lets you choose the encoding used to generate the bytes.

Notice how different encodings yield different numbers of bytes:
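A minimal sketch of such a comparison, using the string from the original question. The class name EncodingLengths and the helper byteCount are hypothetical; the checked UnsupportedEncodingException is wrapped in an unchecked exception only to keep the example short.

```java
import java.io.UnsupportedEncodingException;

// Hypothetical demo class comparing encoded lengths.
public class EncodingLengths {

    // Number of bytes the string occupies in the named charset.
    static int byteCount(String s, String charsetName) {
        try {
            return s.getBytes(charsetName).length;
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String s = "12345";
        String[] names = {"US-ASCII", "UTF-8", "UTF-16", "UTF-16BE"};
        for (String name : names) {
            System.out.println(name + ": " + byteCount(s, name) + " bytes");
        }
        // US-ASCII: 5 bytes
        // UTF-8: 5 bytes
        // UTF-16: 12 bytes   (2-byte byte-order mark + 5 * 2)
        // UTF-16BE: 10 bytes (no byte-order mark)
    }
}
```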



Another option is to use the CharsetEncoder and CharsetDecoder classes.
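As a rough illustration of that option, the sketch below encodes a string to bytes and decodes it back with the java.nio.charset classes. The class name EncoderRoundTrip and the helper roundTrip are made up for the example. Unlike getBytes(), the convenience methods encode() and decode() report an error for characters the charset cannot represent instead of silently substituting a replacement.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;

// Hypothetical demo class for CharsetEncoder / CharsetDecoder.
public class EncoderRoundTrip {

    // Encodes s in the named charset, then decodes the bytes back to a String.
    static String roundTrip(String s, String charsetName) {
        Charset charset = Charset.forName(charsetName);
        CharsetEncoder encoder = charset.newEncoder();
        CharsetDecoder decoder = charset.newDecoder();
        try {
            ByteBuffer bytes = encoder.encode(CharBuffer.wrap(s));
            CharBuffer chars = decoder.decode(bytes);
            return chars.toString();
        } catch (CharacterCodingException e) {
            // Thrown, for example, when a character has no mapping in the charset.
            throw new IllegalArgumentException("Cannot encode in " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("12345", "US-ASCII"));  // 12345
    }
}
```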

If you use ASCII, the generated bytes correspond to the ASCII character codes. This means that if you assign every byte of the array to a char variable, you get the corresponding ASCII character back again:
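A minimal sketch of that byte-to-char assignment, with UTF-16 included for contrast. The class name BytesToChars and the helper castEachByte are hypothetical, and StandardCharsets assumes Java 7 or later.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: naive byte-to-char conversion.
public class BytesToChars {

    // Assigns every byte to a char, one by one, ignoring the real encoding.
    static String castEachByte(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            char c = (char) b;
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "12345";

        // ASCII: one byte per character, so the cast recovers the text.
        System.out.println(castEachByte(s.getBytes(StandardCharsets.US_ASCII)));  // 12345

        // UTF-16: byte-order mark and zero bytes turn into stray characters.
        String mangled = castEachByte(s.getBytes(StandardCharsets.UTF_16));
        System.out.println(mangled.length());  // 12
    }
}
```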



But it can print unrecognizable characters if you use another encoding, such as UTF-16.
[ May 26, 2006: Message edited by: Edwin Dalorzo ]
 