
Special characters problem

 
Baba Bizlowsky
Ranch Hand
Posts: 39
Hi everybody.

I'm trying to build a non-English web application. I need it to contain special characters from my language (čćšžđ).

When I put those chars in a JSP, it works fine. However, when I embed them in Java code, the Strings containing them display ?s instead of the chars. For instance, čačići becomes ?a?i?i. Very irritating.

Can someone tell me how Java manages this? Changing the code page for Explorer doesn't do anything since the source page already contains ?s.

Thanks for any kind of help. Cheers!
 
danny liu
Ranch Hand
Posts: 185
Baba,

It is probably related to the character-encoding setup of the JVM. When strings with special characters pass through your Java code, characters that the encoding in use cannot represent are replaced with question marks. The default character encoding of the JVM depends on the platform; it is often ISO 8859-1.
You should check whether ISO 8859-1 supports those special characters, or you may have to use another character-encoding parameter.

Dan
 
Baba Bizlowsky
Ranch Hand
Posts: 39
Thanks for answering.

But can I please bother you to tell me how and where to set these char settings? You see, I have a JavaBean which defines a String. A JSP page uses this bean while building an HTML page, and the page gets shown in IE5.

I have put Unicode escapes in place of these chars in the original JavaBean code (for instance "This is a \u0161ip"). I have set "charset=iso-8859-2" (which contains my chars) at the top of the JSP page. Still, the HTML source shows the message "This is a ?ip". It's driving me nuts.

Can you please tell me what I'm missing? Thanks.
 
Zbyszek Cyktor
Greenhorn
Posts: 3
Try this:

If you use a servlet (e.g. as a controller), make sure it contains

request.setCharacterEncoding("UTF-8");
response.setContentType("text/html; charset=UTF-8");

In the JSP itself, remember to include

<%@ page
contentType="text/html;charset=UTF-8"
pageEncoding="UTF-8"
%>

In the generated HTML, don't forget to add

<meta http-equiv="content-type" content="text/html; charset=UTF-8"/>

to your head section.

Of course, you can replace UTF-8 with any other appropriate encoding,
although I strongly encourage you to use UTF-8.

Your post says that the String which is wrongly displayed is defined
in some Java class. In that case you also have to remember to
tell the compiler which encoding to use when compiling this class.
Check the javac docs for the -encoding option.
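
To illustrate why the compiler's encoding matters, here is a minimal sketch that simulates the mismatch: the text is turned into bytes with one charset (as an editor would save it) and decoded with another (as the compiler would read it). The class and method names are made up for this example, and UTF-8 versus ISO-8859-1 is only an assumed pair of encodings, not necessarily the ones involved in your setup.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class SourceEncodingMismatch {

    // Simulates a compiler reading source bytes with the wrong charset:
    // the editor saved the text as 'savedAs', the compiler decoded it as 'readAs'.
    static String misread(String literal, Charset savedAs, Charset readAs) {
        byte[] bytesOnDisk = literal.getBytes(savedAs);
        return new String(bytesOnDisk, readAs);
    }

    public static void main(String[] args) {
        // The sample word from the first post, written with Unicode escapes
        // so that this file itself compiles the same under any encoding.
        String original = "\u010Da\u010Di\u0107i";

        String matching = misread(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        String mismatched = misread(original, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);

        System.out.println("matching charsets keep the text:  " + matching.equals(original));
        System.out.println("mismatched charsets keep the text: " + mismatched.equals(original));
        // Each two-byte UTF-8 character turns into two wrong characters.
        System.out.println("length before: " + original.length()
                + ", after mismatch: " + mismatched.length());
    }
}
```

Compiling with a matching value, for example javac -encoding UTF-8 MyBean.java (MyBean is a hypothetical class name), avoids this kind of damage. Literals written purely with \uXXXX escapes are not affected by the compiler's encoding at all.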

Zbyszek
 
danny liu
Ranch Hand
Posts: 185
Zbyszek,

You list how to read and write special characters in the request and response objects. However, when Baba's JSP gets that string from the bean, the special character has already been changed to ?. Thus it is useless to put "charset=iso-8859-2" in the JSP.

Baba,

You may use the following steps to fix that bug.

a. check what character encoding is used by your application

put this into your bean

String foo = new java.io.InputStreamReader(new java.io.ByteArrayInputStream(new byte[0])).getEncoding();
System.out.println(foo);

b. usually, it should be an "ISO 8859" variant. If it is not, check the JDK version.
There were some charset bugs when migrating the JDK from 1.3 to 1.4.2.

c. try upgrading JDK to 1.5

d. or use some application-server-specific parameters.

Here is a thread for that http://forum.java.sun.com/thread.jspa?threadID=556966&messageID=2731737
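
As an alternative for step a, the check can also be sketched with java.nio.charset.Charset, which is available from JDK 1.5 on. The class name below is made up for the example, and the value printed depends on the platform and on how the JVM was started, so treat the output as a diagnostic only.

```java
import java.nio.charset.Charset;

public class DefaultCharsetCheck {

    // Returns the canonical name of the charset the JVM uses by default
    // when no encoding is given explicitly (e.g. new String(bytes)).
    static String defaultCharsetName() {
        return Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println("Default charset: " + defaultCharsetName());
        // The system property usually reflects the same setting.
        System.out.println("file.encoding:   " + System.getProperty("file.encoding"));
    }
}
```

Whatever this prints, passing an explicit charset to readers, writers and String conversions keeps the application independent of the default.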

Hope it helps.

Dan
 
Zbyszek Cyktor
Greenhorn
Posts: 3
Dan, You didn't read my email carefully.

Originally posted by danny liu:
However, when Baba's JSP gets that string from the bean, the special character has already been changed to ?.
Thus it is useless to put "charset=iso-8859-2" in the JSP.


The encoding cannot become broken while passing a String from the
bean to the JSP, because we are just using a reference to the very
same String object.

However, it might break when the JSP code is writing text to the output stream,
for which I gave a solution (correct headers and page definition in the JSP).
Another possibility is that (as I mentioned at the end of my email) the string itself might be written in the source code of the class in a different encoding than the one used during compilation, which will also garble the characters.
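
The output-stream case can be reproduced outside a servlet container. The sketch below is only an approximation of what a JSP writer does: it writes the bean's string through an OutputStreamWriter with a given charset and reads the bytes back. The class and method names are invented for the example; ISO-8859-1 stands in for a response encoding that lacks the character, and UTF-8 for one that has it.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class OutputEncodingDemo {

    // Writes the text with the given charset, then decodes the resulting
    // bytes with the same charset, the way a browser honouring the
    // declared charset would.
    static String roundTrip(String text, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, charset)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), charset);
    }

    public static void main(String[] args) {
        String text = "This is a \u0161ip"; // the String object itself is intact

        // ISO-8859-1 has no mapping for U+0161, so the encoder substitutes '?'.
        System.out.println(roundTrip(text, StandardCharsets.ISO_8859_1)); // prints: This is a ?ip

        // UTF-8 can encode every Unicode character, so nothing is lost.
        System.out.println(roundTrip(text, StandardCharsets.UTF_8).equals(text)); // prints: true
    }
}
```

The ? appears only at the moment the characters are encoded to bytes, which matches the "This is a ?ip" seen in the HTML source when the writer's encoding cannot represent the character.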

Zbyszek
 
danny liu
Ranch Hand
Posts: 185
Zbyszek,

Thanks for your comments.

However, I still doubt it a little bit.

First you say,
The encoding cannot become broken while passing a String from the
bean to the JSP, because we are just using a reference to the very
same String object.


Then you add,
Another possibility is that (as I mentioned at the end of my email) the string itself might be written in the source code of the class in a different encoding than the one used during compilation, which will also garble the characters.


That shows it is possible for the string to be garbled before it is passed to the JSP, due to a runtime or compilation problem.

In that case, you must somehow reset the character-encoding parameter of the JVM.

Dan
 