Unicode conversion question

 
Tony Evans
Ranch Hand
Posts: 598
1
I am writing a small application that should be able to take in non-ASCII characters and display them, for example characters from the Cyrillic or Arabic alphabets.

At the moment any non-ASCII characters are displayed as a seemingly random collection of characters.

For example, ἀγαυός is output as ??a???.

Is the best approach to convert ἀγαυός into a Unicode escape representation, \u1f00\u03b3\u03b1\u03c5\u03cc\u03c2, and then translate that back into a string, which hopefully would come out as ἀγαυός?

Thanks for any help
 
John de Michele
Rancher
Posts: 600
Tony,

Does the font that you're using support Unicode characters?

John.
 
Paul Clapham
Sheriff
Posts: 22827
43
Tony Evans wrote: Is the best approach to convert ἀγαυός into a Unicode escape representation, \u1f00\u03b3\u03b1\u03c5\u03cc\u03c2, and then translate that back into a string, which hopefully would come out as ἀγαυός?


No. Java already represents characters in Unicode. If there is "translation" to be done, for instance because you are converting data between Java characters and some non-Unicode external format, then you should choose a suitable encoding for that translation. Those encodings are already built into Java; you don't need to invent your own.
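To illustrate the point about built-in encodings, here is a minimal sketch using only the standard `java.nio.charset` classes. The class and method names (`EncodingRoundTrip`, `roundTrip`) are made up for this example. It encodes the word from the question to bytes and decodes it again: UTF-8 preserves it, while US-ASCII cannot represent any of the characters and substitutes question marks.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingRoundTrip {
    // Encode a String to bytes with the given charset, then decode those bytes back.
    // Characters the charset cannot represent are replaced (with '?' for US-ASCII).
    static String roundTrip(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // Unicode escapes are only source-code notation; at run time this is
        // an ordinary six-character String, the same as the Greek word itself.
        String word = "\u1f00\u03b3\u03b1\u03c5\u03cc\u03c2";

        // UTF-8 can represent every Unicode character, so nothing is lost.
        System.out.println(word.equals(roundTrip(word, StandardCharsets.UTF_8))); // true

        // US-ASCII has no Greek letters, so every character becomes '?'.
        System.out.println(roundTrip(word, StandardCharsets.US_ASCII)); // ??????
    }
}
```

The escape form in the question is therefore not a separate "Unicode representation" to convert to and from; it is just another way of typing the same string in source code.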

Your first decision is likely to be what encoding to use for wherever the data is coming from. That decision will likely already have been made by somebody else, particularly if the data is in a file or in a database table. So your task would be to find out what encoding they chose to use (or happened to use).
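When nobody can say which encoding the data uses, one rough check is to see whether the bytes decode strictly under a candidate charset. The sketch below assumes you already have the raw bytes in hand; `EncodingCheck` and `decodesCleanly` are hypothetical names. Treat it as a sanity check, not as detection: ISO-8859-1 accepts every byte sequence, so it will always "pass".

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingCheck {
    // Returns true if the bytes decode without errors in the given charset.
    static boolean decodesCleanly(byte[] bytes, Charset charset) {
        try {
            // A new decoder reports malformed input by throwing,
            // instead of silently substituting a replacement character.
            charset.newDecoder().decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String word = "\u1f00\u03b3\u03b1\u03c5\u03cc\u03c2";
        byte[] utf8 = word.getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);

        System.out.println(decodesCleanly(utf8, StandardCharsets.UTF_8));   // true
        System.out.println(decodesCleanly(latin1, StandardCharsets.UTF_8)); // false

        // Decoding UTF-8 bytes with the wrong charset does not fail here;
        // it quietly yields one garbage character per byte.
        System.out.println(new String(utf8, StandardCharsets.ISO_8859_1).length()); // 13
    }
}
```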

At any rate you should first learn the basics of Unicode. Try reading this: Joel On Unicode.
 
Tony Evans
Ranch Hand
Posts: 598
1
Cheers Paul, I will read through that doc tonight.

Cheers Tony
 
Jesper de Jong
Java Cowboy
Sheriff
Posts: 16059
88
Where / how are you displaying the output of your program? In the Windows console window? Note that the Windows console window has only very limited support for Unicode characters (at least in a normal Western language version of Windows) - the console uses a font that doesn't contain the complete Unicode character set, so that it will display question marks for the missing characters.
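Apart from the font, the encoding of the output stream itself can also turn characters into question marks. The sketch below is only an illustration, with made-up names (`ConsoleEncodingSketch`, `printedBytes`), and it assumes Java 10 or later for the `PrintStream` constructor that takes a `Charset`. It captures what a `PrintStream` actually writes for the word from the question under two different charsets.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ConsoleEncodingSketch {
    // Print text through a PrintStream using the given charset
    // and return the raw bytes that were written.
    static byte[] printedBytes(String text, Charset charset) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(sink, true, charset);
        out.print(text);
        out.flush();
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        String word = "\u1f00\u03b3\u03b1\u03c5\u03cc\u03c2";

        // UTF-8: one 3-byte character plus five 2-byte characters.
        System.out.println(printedBytes(word, StandardCharsets.UTF_8).length); // 13

        // US-ASCII: each unmappable character is written as a single '?'.
        byte[] ascii = printedBytes(word, StandardCharsets.US_ASCII);
        System.out.println(new String(ascii, StandardCharsets.US_ASCII)); // ??????

        // Wrapping System.out makes the program emit UTF-8 bytes; whether they
        // render correctly still depends on the console's code page and font.
        PrintStream utf8Out = new PrintStream(System.out, true, StandardCharsets.UTF_8);
        utf8Out.println(word);
    }
}
```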
 
Tony Evans
Ranch Hand
Posts: 598
1
I am displaying it on a JSP page. The page lets me enter the text ἀγαυός, but when I process the request to send it elsewhere, the output contains question marks.

Since I want to output it to a mobile device, I thought I should convert it into Unicode.

But I think what everyone is saying is that, since it is already in Unicode, I don't need to translate it into Unicode; rather, the browser I use for my application does not contain the complete Unicode character set.

So I need to make sure my JVM or browser supports the complete Unicode character set. Then when I process the string ἀγαυός I should see it in my browser, and it can be streamed as Unicode when it hits the gateway.

I think most gateway applications will do that for you.

Cheers Tony
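One possible cause of the JSP symptom, offered as an assumption since the thread does not show the code, is that the form data arrives as percent-encoded UTF-8 bytes but is decoded on the server with a different charset. The sketch below simulates that with the JDK's `URLEncoder` and `URLDecoder` (the `Charset` overloads need Java 10 or later); `FormDecodingSketch`, `submit` and `receive` are hypothetical names, not part of any servlet API.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class FormDecodingSketch {
    // Roughly what a browser does with a field on a UTF-8 page:
    // percent-encode the UTF-8 bytes of the value.
    static String submit(String fieldValue) {
        return URLEncoder.encode(fieldValue, StandardCharsets.UTF_8);
    }

    // Roughly what the server does: turn the percent-encoded bytes
    // back into characters using whatever charset it assumes.
    static String receive(String wireValue, Charset assumed) {
        return URLDecoder.decode(wireValue, assumed);
    }

    public static void main(String[] args) {
        String word = "\u1f00\u03b3\u03b1\u03c5\u03cc\u03c2";
        String wire = submit(word);

        // Decoding with the same charset restores the original text.
        System.out.println(word.equals(receive(wire, StandardCharsets.UTF_8))); // true

        // Decoding with ISO-8859-1 yields one wrong character per byte;
        // writing those out through a limited charset later produces '?'.
        System.out.println(receive(wire, StandardCharsets.ISO_8859_1).length()); // 13
    }
}
```

In a servlet container, the usual way to make both sides agree is to call `request.setCharacterEncoding("UTF-8")` before reading any parameters and to declare UTF-8 in the JSP page directive (`contentType` and `pageEncoding`). Whether that applies here depends on how the page and the gateway are actually configured.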
 