
iso-2022-jp loading problem

 
Greenhorn
Posts: 19
Trying to load and display some iso-2022-jp characters, but they just look like garbage. Tested with Java 1.5 and 6. It's supposed to look like this:

纊鍈昱曻伀佖侔倞傔冝匀咜喆埇夋奣孖寘

File is saved here.

Here's the code I'm using to load and display:



You may have to change the font for your system. It's definitely not the font, since I have successfully loaded and displayed the following HTML-encoded version of the text:

 
Rob Shields
Oops, that HTML-encoded text should look like:

纊鍈昱曻伀佖侔倞傔冝匀咜喆埇夋奣孖寘

Just as a correction, it doesn't actually look like garbage when it is displayed. It looks like this:

������������������

The file I linked to displays properly in a web browser when you select iso-2022-jp as the encoding.
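For comparison with the browser behaviour, here is a minimal sketch of decoding raw bytes in Java with an explicitly named charset rather than the platform default. It is not the code from the original post; the class and method names are hypothetical, and it assumes the JVM ships the `ISO-2022-JP` charset.

```java
import java.nio.charset.Charset;

// Hypothetical helper, for illustration only.
class ExplicitCharsetDecode {
    // Decode raw bytes using the named charset instead of the platform default.
    // Throws UnsupportedCharsetException if the JVM does not provide the charset.
    static String decode(byte[] data, String charsetName) {
        return new String(data, Charset.forName(charsetName));
    }
}
```

Bytes that the chosen charset cannot map are replaced with U+FFFD during decoding, which is consistent with the row of replacement characters shown above.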

However, when I copy the text from my browser into a new text file in gedit and hit file -> save as and choose iso-2022-jp as the encoding, it says:

The document contains one or more characters that cannot be encoded using the specified character coding.
Select a different character coding from the menu and try again.


I guess this may be a clue.
 
Rob Shields
I notice that the file starts with the iso-2022-jp escape sequence 1B 24 42, which is defined here as the escape sequence for the charset JIS X 0208-1983, as part of iso-2022-jp-2.
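A quick way to confirm that observation programmatically is to compare the first three bytes of the file against 1B 24 42 (ESC $ B). The sketch below assumes the file contents have already been read into a byte array; the class and method names are made up for illustration.

```java
// Hypothetical helper, for illustration only.
class EscapeSequenceCheck {
    // ESC $ B: the three bytes 1B 24 42 mentioned in the post.
    static final byte[] ESC_DOLLAR_B = {0x1B, 0x24, 0x42};

    // Returns true if data begins with exactly the bytes in prefix.
    static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Calling `EscapeSequenceCheck.startsWith(fileBytes, EscapeSequenceCheck.ESC_DOLLAR_B)` on the loaded bytes would then report whether the file opens with that designation.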

I have tried using iso-2022-jp-2 as the charset, but this is not supported by Java 1.5.
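Whether a particular JVM knows a charset name can be checked at runtime with `Charset.isSupported`, rather than finding out from an exception. A small sketch, with a hypothetical wrapper class:

```java
import java.nio.charset.Charset;

// Hypothetical helper, for illustration only.
class CharsetSupportCheck {
    // Returns true if this JVM provides the named charset.
    // Illegal or null names are treated as "not available" instead of throwing.
    static boolean isAvailable(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalArgumentException e) {
            // IllegalCharsetNameException is a subclass of IllegalArgumentException.
            return false;
        }
    }
}
```

The result of `CharsetSupportCheck.isAvailable("ISO-2022-JP-2")` depends on the Java version and installed charset providers, so it is worth running on each target JVM.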

I suppose I could manually parse the escape sequences and try a different charset.
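Before parsing escape sequences by hand, one cheap experiment is to decode the same bytes with every installed charset whose name contains a given fragment (for example "2022") and count how many U+FFFD replacement characters each one produces. The sketch below only automates that trial and error. The class and method names are hypothetical, and it makes no claim about which charset, if any, decodes this particular file cleanly.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
class CharsetTrial {
    // Count U+FFFD characters produced when decoding data with the named charset.
    static int countReplacements(byte[] data, String charsetName) {
        String text = new String(data, Charset.forName(charsetName));
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == '\uFFFD') {
                count++;
            }
        }
        return count;
    }

    // List installed charsets whose canonical name contains the fragment (case-insensitive).
    static List<String> candidates(String fragment) {
        List<String> names = new ArrayList<String>();
        String wanted = fragment.toUpperCase();
        for (String name : Charset.availableCharsets().keySet()) {
            if (name.toUpperCase().contains(wanted)) {
                names.add(name);
            }
        }
        return names;
    }
}
```

Looping over `CharsetTrial.candidates("2022")` and printing `countReplacements(fileBytes, name)` for each gives a quick overview. A count of zero suggests the charset at least maps every byte sequence, though the output still needs to be checked by eye against the expected text.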
 