
UTF8 java + arabic

 
Lucy Sommerman
Ranch Hand
Posts: 61
Hi - I need to get Arabic text into a Java String. I have saved the Arabic as UTF-8 and am wondering about the correct way to get it into a String. Googling gives me lots of suggestions, so I am just wondering which is the correct one.

Thanks

L
 
Max Habibi
town drunk
( and author)
Posts: 4118
How is this data currently stored? Binary in a database?
 
Lucy Sommerman
Ranch Hand
Posts: 61
In a text file, saved as UTF-8.
 
Max Habibi
town drunk
( and author)
Posts: 4118
OK, then what you need to be concerned with is how you read that data in. That is, use a ByteBuffer with UTF-8 decoding when you read the file.
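
A minimal sketch of that approach, not the original poster's code: read the file as raw bytes, wrap them in a ByteBuffer, and decode them explicitly as UTF-8. The class name `Utf8FileReader`, its method names, and the sample text are hypothetical, chosen only for illustration. The sketch assumes Java 7 or later and a file small enough to read into memory, and it does not strip a leading byte order mark if the file has one.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class Utf8FileReader {

    // Decode raw UTF-8 bytes into a Java String.
    static String decodeUtf8(byte[] bytes) {
        return StandardCharsets.UTF_8.decode(ByteBuffer.wrap(bytes)).toString();
    }

    // Read the whole file as bytes, then decode them explicitly as UTF-8
    // instead of relying on the platform default charset.
    static String readUtf8File(Path path) throws IOException {
        return decodeUtf8(Files.readAllBytes(path));
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample: an Arabic word written with Unicode escapes,
        // so the encoding of this source file does not matter.
        String sample = "\u0645\u0631\u062d\u0628\u0627";

        // Write it to a temporary file as UTF-8, then read it back.
        Path file = Files.createTempFile("arabic", ".txt");
        try {
            Files.write(file, sample.getBytes(StandardCharsets.UTF_8));
            String text = readUtf8File(file);
            System.out.println(text.equals(sample)); // prints "true"
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The important part is naming the charset explicitly. Constructors such as `new FileReader(file)` historically used the platform default encoding, which is a common reason Arabic text comes out garbled.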


[ September 15, 2005: Message edited by: Max Habibi ]
 
Lucy Sommerman
Ranch Hand
Posts: 61
Thanks, you are a lifesaver. L
 
Lucy Sommerman
Ranch Hand
Posts: 61
Just to check:

The string itself will be UTF-8, though, and not converted to UTF-16? This is plugging into something else that will not handle UTF-16. Thanks.

L
 
Grahamsmit Smith
Greenhorn
Posts: 2
Hi:

I am having some difficulty with UTF-8 encoded characters in Java.

My XML has a question which contains Cyrillic characters. My Java servlet renders this as HTML with a form for the reply. The HTML produced displays OK in the browser (the response content type on the Java servlet has to be set to "text/html; charset=UTF-8" for this to work).

I have to send Cyrillic characters back in the response to the question, in a text field on the form. The browser is sending back a byte stream (which I am printing here as hex): d0b3d0bed180d0bed0b4 (this is a Cyrillic word correctly encoded as UTF-8).

However, on collecting the response (using request.getParameterValues(fieldname)), the servlet returns the byte stream d0b3d0bed13fd0bed0b4. There is a mistake in the sixth byte: 0x80 has become 0x3f!

Has anyone heard of this problem? I suspect the problem is in the Java UTF-8 converter.

Regards

Graham
 
Grahamsmit Smith
Greenhorn
Posts: 2
I now know the answer, thanks to Bruno Van Haetsdaele.

Before calling request.getParameterValues(fieldname), one should call request.setCharacterEncoding("UTF-8").

Grahamsmit
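
Why the charset matters can be sketched without a servlet container. The same bytes quoted above decode to very different strings depending on the charset used. This assumes the container fell back to ISO-8859-1 because no request encoding had been set, which is a common default but is not confirmed in the thread. The class `FormDecodingDemo` and its helper methods are hypothetical, and the sketch does not try to reproduce the exact 0x3f byte, which depends on the container and platform.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class FormDecodingDemo {

    // Convert a hex string such as "d0b3" into the bytes it denotes.
    static byte[] hexToBytes(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // Decode the bytes denoted by the hex string using the given charset.
    static String decode(String hex, Charset charset) {
        return new String(hexToBytes(hex), charset);
    }

    public static void main(String[] args) {
        String sent = "d0b3d0bed180d0bed0b4"; // bytes the browser sent, from the post above

        String asUtf8 = decode(sent, StandardCharsets.UTF_8);
        String asLatin1 = decode(sent, StandardCharsets.ISO_8859_1);

        System.out.println(asUtf8.length());   // prints 5: five Cyrillic letters
        System.out.println(asLatin1.length()); // prints 10: one char per byte, not the intended text
    }
}
```

Calling request.setCharacterEncoding("UTF-8") before the first parameter is read asks the container to take the first path rather than the second.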
 
Ranch Hand
Posts: 245

Originally posted by Lucy Sommerman:
Just to check:

The string itself will be UTF-8, though, and not converted to UTF-16? This is plugging into something else that will not handle UTF-16. Thanks.

L



Strings are sequences of 16-bit characters (UTF-16 code units), so a String in memory is never "UTF-8". To plug the text into "something else", you can (and probably need to) convert the String to a byte array or write it to a stream. In both cases the character encoding can be specified.
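
Both options can be sketched as follows. The class `Utf8Output`, its method names, and the sample text are hypothetical, and the ByteArrayOutputStream merely stands in for whatever the real target is.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class Utf8Output {

    // Option 1: encode the String into a UTF-8 byte array.
    static byte[] toUtf8Bytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Option 2: write the String to a stream through a Writer that encodes as UTF-8.
    static void writeUtf8(String text, OutputStream out) throws IOException {
        Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
        writer.write(text);
        writer.flush(); // push buffered characters through to the stream
    }

    public static void main(String[] args) throws IOException {
        String text = "\u0645\u0631\u062d\u0628\u0627"; // five Arabic letters

        System.out.println(text.length());            // prints 5 (UTF-16 code units)
        System.out.println(toUtf8Bytes(text).length); // prints 10 (two UTF-8 bytes per letter)

        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stand-in target
        writeUtf8(text, sink);
        System.out.println(sink.size());              // prints 10
    }
}
```

Either way, the consumer only ever sees UTF-8 bytes. The UTF-16 representation stays internal to the JVM.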