Retrieve charset information from a String object

Rade Koncar
Greenhorn

Joined: Sep 14, 2012
Posts: 10
Hello,

This question already tackles the following problem, but the answer seems a bit unclear to me.
If you can create a String object with the following constructor (link: http://docs.oracle.com/javase/6/docs/api/java/lang/String.html#String%28byte[],%20java.nio.charset.Charset%29 ), then I guess the charset information is stored somewhere in the String object. Is there a way to retrieve it?

From what I understand, the encoding and the underlying bytes of data are not connected; only with the correct encoding information can you display the text correctly (from raw bytes).
However, a String can be constructed with an encoding in mind (and consequently it is not just a plain byte array), so is there perhaps a way to retrieve that encoding?
Richard Tookey
Ranch Hand

Joined: Aug 27, 2012
Posts: 1067

The charset is not stored in the String. A String's content encoding is implicit: it is always Unicode, stored as UTF-16 code units. The charset parameter is required only so that the constructor knows how to convert the bytes to UTF-16 code units.
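
A minimal sketch of that point, assuming Java 7 or later for StandardCharsets (the class and method names here are made up for illustration): the charset is consumed while decoding, and the resulting String holds only UTF-16 chars. To get bytes back you have to name a charset again.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class CharsetNotStored {
    // Hypothetical helper: decode raw bytes with the given charset,
    // then re-encode the resulting String as UTF-8.
    static byte[] decodeThenEncodeUtf8(byte[] raw, Charset source) {
        String s = new String(raw, source); // the charset is used here and not kept
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] latin1 = {(byte) 0xE9}; // "é" encoded as ISO-8859-1
        String s = new String(latin1, StandardCharsets.ISO_8859_1);

        System.out.println(s.length());        // prints 1 (one UTF-16 code unit)
        System.out.println((int) s.charAt(0)); // prints 233 (U+00E9)

        // Asking for UTF-8 bytes gives a different byte sequence than the input.
        System.out.println(Arrays.toString(
                decodeThenEncodeUtf8(latin1, StandardCharsets.ISO_8859_1))); // prints [-61, -87]
    }
}
```

If you need to know which charset the bytes originally had, you have to keep that information yourself, next to the byte array.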
Rade Koncar
Greenhorn

Joined: Sep 14, 2012
Posts: 10
Thanks, you're right. All strings are stored internally in a common format, so you can, for example, compare them regardless of the encoding their bytes originally had (I checked this).
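
For anyone who wants to repeat that check, here is a rough sketch (class and method names are illustrative only, and it assumes Java 7 or later for StandardCharsets). The same text decoded from two different byte encodings yields equal Strings, while decoding with the wrong charset yields a different String.

```java
import java.nio.charset.StandardCharsets;

public class SameTextDifferentBytes {
    static final byte[] LATIN1_BYTES = {(byte) 0xE9};            // "é" in ISO-8859-1
    static final byte[] UTF8_BYTES = {(byte) 0xC3, (byte) 0xA9}; // "é" in UTF-8

    // Each byte array is decoded with its own, correct charset.
    static boolean equalWhenDecodedCorrectly() {
        String a = new String(LATIN1_BYTES, StandardCharsets.ISO_8859_1);
        String b = new String(UTF8_BYTES, StandardCharsets.UTF_8);
        return a.equals(b);
    }

    // The UTF-8 bytes are decoded with the wrong charset, giving two chars instead of one.
    static boolean equalWhenDecodedWrongly() {
        String a = new String(LATIN1_BYTES, StandardCharsets.ISO_8859_1);
        String wrong = new String(UTF8_BYTES, StandardCharsets.ISO_8859_1);
        return a.equals(wrong);
    }

    public static void main(String[] args) {
        System.out.println(equalWhenDecodedCorrectly()); // prints true
        System.out.println(equalWhenDecodedWrongly());   // prints false
    }
}
```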
 
 