
Are the code pages in charsets.jar documented somewhere?

 
Tom Katz
Ranch Hand
Posts: 169
I need to convert raw bytes into various code pages. I tried using the String constructor that takes a byte[] and a String naming the code page - e.g. new String(bytes, "cp037") - and some of the characters I got back seemed to differ from what's documented for that code page.
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
Hmmm - no offense, but that sounds kind of backwards. The new String(byte[], String) constructor doesn't convert to different encodings - it converts from a given encoding of bytes to a standard Java String, which is a sequence of chars whose numeric values are interpreted as Unicode. Strings do not have encodings; bytes have encodings. So if you've got an array of bytes, you need to find out which encoding was used to generate the bytes in the array. Then you can name that encoding in the new String(byte[], String) constructor to get a String. If you then wish to convert that String to bytes using some other encoding, you can use getBytes(String), naming whatever encoding you want.
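A minimal sketch of that two-step round trip, in case it helps. The class name TranscodeSketch and the helpers decode and transcode are made up for illustration, and the sample bytes are assumptions rather than anything from the original question. Since cp037 ships in the extended charsets and may not be installed on every JRE, the EBCDIC part is guarded with Charset.isSupported.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.util.Arrays;

// Hypothetical helper class, for illustration only.
class TranscodeSketch {

    // Step 1: interpret raw bytes using the encoding they were written in.
    // The result is an ordinary (Unicode) String with no encoding of its own.
    static String decode(byte[] bytes, String sourceEncoding) {
        try {
            return new String(bytes, sourceEncoding);
        } catch (UnsupportedEncodingException e) {
            // Rethrown unchecked to keep the sketch short.
            throw new IllegalArgumentException("Unknown encoding: " + sourceEncoding, e);
        }
    }

    // Step 2: decode, then re-encode the same text as bytes in another encoding.
    static byte[] transcode(byte[] bytes, String sourceEncoding, String targetEncoding) {
        try {
            return decode(bytes, sourceEncoding).getBytes(targetEncoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown encoding: " + targetEncoding, e);
        }
    }

    public static void main(String[] args) {
        // Assumed sample data: the single byte 0xE9 is 'é' in ISO-8859-1.
        byte[] latin1 = { (byte) 0xE9 };
        System.out.println(Arrays.toString(transcode(latin1, "ISO-8859-1", "UTF-8")));

        // Assumed sample data: "Hello" as EBCDIC bytes in code page 037.
        if (Charset.isSupported("Cp037")) {
            byte[] ebcdic = { (byte) 0xC8, (byte) 0x85, (byte) 0x93, (byte) 0x93, (byte) 0x96 };
            System.out.println(decode(ebcdic, "Cp037"));
        } else {
            System.out.println("Cp037 is not available on this JRE");
        }
    }
}
```

If the characters still look wrong after decoding, the usual suspect is that the bytes were not actually produced with the encoding named in the constructor.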
 
Tom Katz
Ranch Hand
Posts: 169
Thanks for the info, I appreciate it.
 