
ResourceBundle character set

 
Andy Canfield
Greenhorn
Posts: 4
I am trying to use the ResourceBundle class.
The strings I need are in LanguageText_th.properties.
This is a UTF-8 text file.
Apparently ResourceBundle assumes ISO-8859-1 encoding.
Is it possible to get it to read a UTF-8 text file?
Or do I have to reinvent the wheel?
 
Joe Ess
Bartender
Posts: 9428
There's a native2ascii converter mentioned in the Properties JavaDoc that can convert file encodings. Will that help?
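For readers unfamiliar with the tool: native2ascii rewrites a file so that every non-ASCII character becomes a `\uXXXX` escape, which the ISO-8859-1 properties loader then decodes back into the right character. The sketch below only imitates that idea on a single string so the transformation is visible; the class name `UnicodeEscaper` is hypothetical and this is not the tool's actual implementation.

```java
final class UnicodeEscaper {
    /** Returns text with every char outside 7-bit ASCII replaced by a four-digit hex escape. */
    static String escape(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 128) {
                out.append(c);                                   // plain ASCII passes through
            } else {
                out.append(String.format("\\u%04x", (int) c));   // backslash, 'u', four hex digits
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // One line of a properties file; the value is the Thai word from this thread.
        String line = "q_male=\u0E1C\u0E39\u0E49\u0E0A\u0E32\u0E22";
        System.out.println(escape(line));
    }
}
```

A properties file converted this way contains only ASCII, so it reads the same under ISO-8859-1 and UTF-8.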
 
Andy Canfield
Greenhorn
Posts: 4
Joe Ess wrote: There's a native2ascii converter mentioned in the Properties JavaDoc that can convert file encodings. Will that help?

The text string "ascii" never appears in
http://download.oracle.com/javase/1.5.0/docs/api/java/util/ResourceBundle.html
which is where I get my information.

Even so, the name is not encouraging. I don't want to convert ResourceBundle's concept of "native" to ASCII; I want to convert it to a Java String. A hex dump of the string shows that the original bytes are unchanged; i.e., the String holds three chars, one per byte, for each UTF-8-encoded character. What I really need is something like MsgText.getString( "keyword", "UTF-8" );

Ahah! I found a workaround!

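// MsgText is the ResourceBundle; Q1 holds the UTF-8 bytes misread as ISO-8859-1.
String Q1 = MsgText.getString( "q_male" );
// Recover the original bytes, then decode them as UTF-8.
byte[] Q2 = Q1.getBytes( "ISO-8859-1" );
String Q3 = new String( Q2, "UTF-8" );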

Q1 is the garbled form of "ผู้ชาย" (three Latin-1 characters for each Thai character)
Q3 is (correct) "ผู้ชาย"

Certainly this will require an interface class to correct the bug in ResourceBundle. If you think of any other way to do it let me know. Otherwise, thank you very much.
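The re-decoding trick works because ISO-8859-1 maps every byte value to exactly one char, so the original UTF-8 bytes survive the wrong decode untouched. Here is a minimal, self-contained sketch of it; the class name `Utf8Redecoder` is hypothetical, and `StandardCharsets` assumes Java 7 or later (on older versions the charset names would be passed as strings, as in the post above).

```java
import java.nio.charset.StandardCharsets;

final class Utf8Redecoder {
    /** Re-interprets a string that was decoded as ISO-8859-1 but really held UTF-8 bytes. */
    static String redecode(String misread) {
        byte[] raw = misread.getBytes(StandardCharsets.ISO_8859_1); // one byte per char, lossless
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String correct = "\u0E1C\u0E39\u0E49\u0E0A\u0E32\u0E22"; // the Thai word from the post
        // Simulate what the bundle hands back: UTF-8 bytes decoded with the wrong charset.
        String misread = new String(correct.getBytes(StandardCharsets.UTF_8),
                                    StandardCharsets.ISO_8859_1);
        System.out.println(misread.length());                   // 18: three chars per Thai char
        System.out.println(redecode(misread).equals(correct));  // true
    }
}
```

This only repairs strings that really were UTF-8 to begin with; applying it to a value that was correctly decoded would corrupt any non-ASCII characters in it.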
 
Joe Ess
Bartender
Posts: 9428
Take a look at the Properties page. It specifically mentions the non-XML format using ISO 8859-1 character encoding and how to convert other encodings.
 
Paul Clapham
Sheriff
Posts: 22480
That's the Java 5 documentation; Java 6 added the Properties.load(Reader) method, which you could use to load the properties from an InputStreamReader configured to use UTF-8 or any other charset.
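A rough sketch of that suggestion, assuming Java 7 or later for `StandardCharsets` and try-with-resources. The class name `Utf8PropertiesLoader` is hypothetical, and an in-memory byte array stands in for the real LanguageText_th.properties file so the example runs on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class Utf8PropertiesLoader {
    /** Loads properties from a stream whose bytes are UTF-8 rather than ISO-8859-1. */
    static Properties load(InputStream in) {
        Properties props = new Properties();
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            props.load(reader); // the Reader overload does no charset decoding of its own
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Stand-in for the file contents: one key with a Thai value, encoded as UTF-8.
        byte[] fileBytes = "q_male=\u0E1C\u0E39\u0E49\u0E0A\u0E32\u0E22"
                .getBytes(StandardCharsets.UTF_8);
        Properties props = load(new ByteArrayInputStream(fileBytes));
        System.out.println(props.getProperty("q_male").length()); // 6
    }
}
```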
 
Joe Ess
Bartender
Posts: 9428
True, but Andy's dealing with ResourceBundles, not property files.
 
Alex Hurtt
Ranch Hand
Posts: 98
It would seem that the default implementation of ResourceBundle expects an ISO-8859-1 file, and any characters that cannot be represented in ISO-8859-1 must be written as Unicode escapes. However, check out java.util.PropertyResourceBundle. It has a constructor that accepts a Reader, as well as one that takes an InputStream. The InputStream one is what ResourceBundle.getBundle() seems to use, which is where the ISO-8859-1 limitation comes from. The Javadoc for the Reader constructor claims the following:

/**
* Creates a property resource bundle from a {@link java.io.Reader
* Reader}. Unlike the constructor
* {@link #PropertyResourceBundle(java.io.InputStream) PropertyResourceBundle(InputStream)},
* there is no limitation as to the encoding of the input property file.
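As a sketch of how that Reader constructor could be used (available since Java 6; `StandardCharsets` assumes Java 7 or later). The class name `ReaderBundleDemo` and the in-memory bytes standing in for LanguageText_th.properties are assumptions made so the example is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

final class ReaderBundleDemo {
    /** Builds a bundle from a UTF-8 encoded properties stream via the Reader constructor. */
    static ResourceBundle open(InputStream in) {
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            return new PropertyResourceBundle(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the properties file: one Thai value, encoded as UTF-8.
        byte[] fileBytes = "q_male=\u0E1C\u0E39\u0E49\u0E0A\u0E32\u0E22"
                .getBytes(StandardCharsets.UTF_8);
        ResourceBundle bundle = open(new ByteArrayInputStream(fileBytes));
        System.out.println(bundle.getString("q_male").length()); // 6
    }
}
```

Note that this bypasses ResourceBundle.getBundle(), so locale lookup and fallback (for example from _th to the base bundle) would have to be handled by the caller.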
 