
Retrieve charset information from a String object

 
Rade Koncar
Greenhorn
Posts: 12
Hello,

This question already tackles the following problem, but the answer seems a bit unclear to me.
A String object can be created with the following constructor ( link: http://docs.oracle.com/javase/6/docs/api/java/lang/String.html#String%28byte[],%20java.nio.charset.Charset%29 ), so I guessed that the charset information is stored somewhere in the String object. Is there a way to retrieve this information?

From what I understand, an encoding and the underlying bytes of data are not connected: only with the correct encoding information can you display text correctly from raw bytes.
However, a String can be constructed with an encoding in mind (and consequently it is not just a plain byte array), so is there perhaps a way to retrieve that encoding?
 
Richard Tookey
Bartender
Posts: 1166
The charset is not stored in the String. The encoding of the string content is implicit: it is always Unicode, held as a sequence of UTF-16 code units. The constructor needs the charset parameter only so that it knows how to convert the bytes into those UTF-16 code units.
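
To make that concrete, here is a minimal sketch. The class name CharsetNotStored and the helper decode are made up for illustration, and StandardCharsets assumes Java 7 or later (on Java 6, Charset.forName("ISO-8859-1") would play the same role). It decodes four ISO-8859-1 bytes and then shows that the resulting String can be re-encoded into any charset, with nothing on the String recording where the bytes came from.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetNotStored {
    // Hypothetical helper: the charset is used for the conversion and then forgotten.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // "café" encoded as ISO-8859-1: one byte per character.
        byte[] latin1 = {0x63, 0x61, 0x66, (byte) 0xE9};
        String s = decode(latin1, StandardCharsets.ISO_8859_1);

        // The String now exposes UTF-16 code units, not the original bytes.
        System.out.println(s.length());               // 4
        System.out.println(s.charAt(3) == '\u00e9');  // true

        // Getting bytes back means naming a charset again; each one gives a different result.
        System.out.println(s.getBytes(StandardCharsets.ISO_8859_1).length); // 4
        System.out.println(s.getBytes(StandardCharsets.UTF_8).length);      // 5
        System.out.println(s.getBytes(StandardCharsets.UTF_16BE).length);   // 8
    }
}
```

If the original charset matters later on, it has to be kept alongside the String by the calling code, since the String itself offers no accessor for it.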
 
Rade Koncar
Greenhorn
Posts: 12
Thanks, you're right. All strings are stored internally in a common format, so you can, for example, compare them regardless of the encoding their original bytes were in (I checked this).
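
A check along those lines might look like the sketch below. The class name CompareDecoded and the helper decode are illustrative only, and StandardCharsets again assumes Java 7 or later. The same text arrives as UTF-8 bytes and as ISO-8859-1 bytes, and the two decoded Strings compare equal. The last lines show the caveat: the comparison only works if each byte array was decoded with the charset it was actually written in.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CompareDecoded {
    // Hypothetical helper wrapping the String(byte[], Charset) constructor.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // The text "café" in two different encodings.
        byte[] utf8 = {0x63, 0x61, 0x66, (byte) 0xC3, (byte) 0xA9};
        byte[] latin1 = {0x63, 0x61, 0x66, (byte) 0xE9};

        String a = decode(utf8, StandardCharsets.UTF_8);
        String b = decode(latin1, StandardCharsets.ISO_8859_1);
        System.out.println(a.equals(b)); // true: same characters after decoding

        // Decoding the UTF-8 bytes with the wrong charset yields different text.
        String wrong = decode(utf8, StandardCharsets.ISO_8859_1);
        System.out.println(a.equals(wrong)); // false
        System.out.println(wrong.length());  // 5: each byte became its own character
    }
}
```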
 