Dan Cane

since Nov 23, 2009

Recent posts by Dan Cane



(Problem Exists Between Keyboard and Chair)

Well, sorta... The server I was publishing on had #%@ SET JAVA_OPTS -Djavax.servlet.request.encoding="ISO-8859-1" in the shell script, which changed the default to ISO-8859-1 everywhere I didn't explicitly "force" UTF-8. I looked everywhere in the environment, but didn't think to look at the Tomcat startup script...
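For anyone chasing the same thing: a minimal sketch, assuming the flag reaches the JVM as an ordinary -D system property, of dumping the encoding-related settings so the two machines can be compared side by side. The class and method names (EncodingReport, describe) are made up for illustration; only System.getProperty and Charset.defaultCharset are real JDK calls.

```java
import java.nio.charset.Charset;

// Hypothetical diagnostic helper; not part of Tomcat or Struts.
final class EncodingReport {

    // Formats one JVM system property, showing "(not set)" when it is absent.
    static String describe(String key) {
        String value = System.getProperty(key);
        return key + " = " + (value == null ? "(not set)" : value);
    }

    public static void main(String[] args) {
        // The property named in the startup script above.
        System.out.println(describe("javax.servlet.request.encoding"));
        // The JVM-wide default, which differs between Windows and UNIX installs.
        System.out.println(describe("file.encoding"));
        System.out.println("default charset = " + Charset.defaultCharset());
    }
}
```

Running this once on the Windows box and once inside the server's JVM (with the same JAVA_OPTS) should make a stray override like the one above stand out.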

Problem solved!!!

The site looks cool in Chinese

Thanks for all of your help!

10 years ago

Paul Clapham wrote:
And when I look at the headers sent to Firefox, I see things like this:

Content-Type: text/html;charset=ISO-8859-1

Where are you able to see that? I'm looking at Firebug's view of the request for the Struts action, and here is what I have:

Response Headers
Date Tue, 24 Nov 2009 13:37:23 GMT
Content-Type text/html
Content-Language zh
Vary Accept-Encoding
Content-Encoding gzip
Content-Length 2183
Keep-Alive timeout=15, max=100
Connection Keep-Alive

Request Headers
Host pbd.modernizingderm.com
User-Agent Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv: Gecko/20091102 Firefox/3.5.5 (.NET CLR 3.5.30729)
Accept text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language en-us,en;q=0.5
Accept-Encoding gzip,deflate
Accept-Charset ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive 300
Connection keep-alive
Referer http://pbd.modernizingderm.com/xmd/app/FirmHome.action?locale=en
Cookie JSESSIONID=028CC2436C8702324EA9E90DE70B8588
Cache-Control max-age=0

I see the "Accept-Charset" request header listing both ISO-8859-1 and utf-8 (what my local browser can support), but the Content-Type response header only has text/html, with no charset... I think you've found the issue, but I'm trying to figure out how to reproduce that (so I can fix it. :) )

FWIW: All pages have the meta-tag: <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
and my taglib for the "header" (used on all pages) contains <%@ page contentType="text/html; charset=UTF-8" language="java" %>
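The difference between the two header dumps comes down to whether the Content-Type value carries a charset parameter. A rough sketch of checking for that in plain Java follows; ContentTypeCharset and charsetOf are hypothetical names, and the parsing is deliberately simplified (it does not handle every quoting rule a real HTTP library would).

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper for illustration only.
final class ContentTypeCharset {

    // Returns the charset parameter of a Content-Type header value, if present.
    static Optional<String> charsetOf(String contentType) {
        if (contentType == null) {
            return Optional.empty();
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type (e.g. text/html); the rest are parameters.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq <= 0) {
                continue;
            }
            String name = param.substring(0, eq).trim().toLowerCase(Locale.ROOT);
            if (name.equals("charset")) {
                String value = param.substring(eq + 1).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? Optional.empty() : Optional.of(value);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/html"));                    // no charset declared
        System.out.println(charsetOf("text/html;charset=ISO-8859-1")); // charset declared
    }
}
```

When the header has no charset, the browser falls back to other hints such as the meta tag, so the page can be detected as UTF-8 even though the bytes were produced with a different encoding on the server.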

10 years ago

Paul Clapham wrote:When something renders as ?, that means that it went through an encoding which couldn't encode it.

Normally there's two steps between a properties file and what you are looking at: (1) read the data into memory from the properties file which as you know uses ISO-8859-1 plus some special handling, and (2) ... well, you didn't say how you are looking at that data.

You seem to have dealt with (1) correctly. So the problem would most likely be with (2) ... whatever it is.


Ahh - you might need that detail :)

I'm looking at it in a browser. I've checked IE8 and FF, and the page encoding is being picked up correctly as UTF-8. I'm super stumped, as the "lower" Unicode characters work (e.g. a Spanish "ñ"), but anything "higher" doesn't... Why it works when served from Windows but not from UNIX is a clue, but a clue to what, I have no idea...

The URL to take a peek at this is pbd.modernizingderm.com/xmd/ (just plain ol' http, I just didn't want spiders crawling it yet)

As you can see, latin chars OK - UTF8 = check, but asian chars are a no-go... :(
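This "Latin works, Asian doesn't" pattern is what you would expect if the text passes through ISO-8859-1 somewhere on its way out. A small sketch, assuming that is the culprit: String.getBytes(Charset) substitutes '?' for any character the charset cannot encode, so accented Latin letters survive and CJK characters do not. QuestionMarkDemo and throughLatin1 are illustrative names, and the sample strings are arbitrary.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: simulates a layer that encodes output as ISO-8859-1.
final class QuestionMarkDemo {

    // Encodes to ISO-8859-1 bytes and decodes them back again.
    static String throughLatin1(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // U+00F1 (n with tilde) exists in ISO-8859-1, so it round-trips intact.
        System.out.println(throughLatin1("a\u00f1o").equals("a\u00f1o")); // true
        // U+56FD is a CJK character with no ISO-8859-1 mapping.
        System.out.println(throughLatin1("\u56fd")); // ?
    }
}
```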

10 years ago
I'm beating my head against the wall here, so I thought I'd ping you to see if you had some insight..

I didn't see a natural area to post this in, so let me know if I should move it somewhere else. My app is struts based, so....

I've localized my application - it's designed to run as UTF-8. I have bundles for 5 languages -- and the application works perfectly on my local windows box, but when I run it off the UNIX server, *some* of the UTF8 characters don't render (they render as ? marks). I'm positive that the character encoding is set correctly, and the browser is correctly detecting the encoding as UTF8. Again, this works 100% on my local box, but not on the server.

I've been eliminating variables as best I can, and now I'm down to something at the OS level? It also doesn't add up that SOME UTF8 encoded text works (Spanish and French letters, which are encoded as UTF8) -- but Asian languages (Traditional Chinese and Japanese) do not...

Completely at a loss.

It's (probably) not a Tomcat setting, since I'm running the same Tomcat locally and it works... Grrrr... Could the property files be read in differently on UNIX vs. Windows? Java .properties files are supposed to be ISO-8859-1 encoded, and when I look at the file on the UNIX file system it looks OK...

Any ideas?

If you have any insight I'd appreciate it. I'm happy to provide access to the page I'm working on, I just didn't want it posted in a public place (yet).

Thanks in advance!


- edit -

A little more information, in case it helps:

1) My dev environment: Eclipse 3.5 on Windows Vista 64
2) All of the property files use the \uxxxx format to encode the text. I know the file is supposed to be ISO-8859-1 with these \uxxxx escapes.

A sample of the bundles:


address.country=Pa\u00eds <--- THIS WORKS EVEN ON UNIX. Odd....

address.city=\u753a <--- Renders as ?
address.country=\u56fd <--- Renders as ?
address.state=\u72b6\u614b <--- Renders as ??
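To rule out the bundle files themselves, here is a minimal sketch of loading entries like these with java.util.Properties and printing the resulting code points. The \uxxxx escapes are pure ASCII on disk, so loading them should give the same result on Windows and UNIX. BundleEscapeDemo and load are hypothetical names; the keys are taken from the sample above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Illustrative check that \\uXXXX escapes decode to the intended characters.
final class BundleEscapeDemo {

    // Parses properties-format text; Properties.load decodes \\uXXXX escapes.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Double backslashes here produce a literal backslash-u in the input text.
        Properties props = load("address.city=\\u753a\naddress.state=\\u72b6\\u614b\n");
        String city = props.getProperty("address.city");
        // Print the code point instead of the glyph so the console encoding cannot interfere.
        System.out.println("city: " + city.length() + " char, U+"
                + Integer.toHexString(city.codePointAt(0)).toUpperCase());
        System.out.println("state: " + props.getProperty("address.state").length() + " chars");
    }
}
```

If the code points come out right, the strings are fine in memory and the corruption is happening later, when the response is written.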

10 years ago