what should the default charset be?

Mike Curwen
I'm having a bit of trouble with charsets and encodings. My problem is specifically related to JavaMail and webapps, but I'm posting in the general forum because I think my difficulty comes from a general misunderstanding of charsets and encodings.

I've got a website that I am in the process of i18n-enabling.

The two languages are French and English. So far, I've had no real trouble with the French accented characters. Everything just appears to work the way I'd expect.

In TextPad, I can see my é and è (and any other accents) fine. When I view the file info, it tells me my document's "code set" is ANSI. I'm not sure what "code set" is; perhaps they mean charset?

Anyways.. I upload the i18n properties file containing French words (and thus, special characters) to my web server. I then use the java.util.Locale to retrieve the localized text and it all works. The web pages have the é, etc, etc.
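
One possible reason the web pages come out right: `java.util.Properties.load(InputStream)` decodes the file as ISO-8859-1, and the accented French letters in a Windows "ANSI" (Windows-1252) file have the same byte values in ISO-8859-1. The sketch below assumes the site ends up loading the file that way. The class name `PropertiesEncodingDemo`, the key `msg`, and the sample bytes are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class PropertiesEncodingDemo {

    // Loads properties from raw bytes via Properties.load(InputStream),
    // which decodes them as ISO-8859-1, and returns the value for the key.
    static String loadValue(byte[] fileBytes, String key) {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(fileBytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // Hypothetical file content: one entry whose accents are single bytes (0xE9).
        byte[] ansiBytes = "msg=D\u00e9sol\u00e9s".getBytes(StandardCharsets.ISO_8859_1);
        String value = loadValue(ansiBytes, "msg");
        // Print code points instead of the text so the console encoding cannot mislead.
        value.chars().forEach(c -> System.out.printf("U+%04X ", c));
        System.out.println();
    }
}
```

If the loaded string holds U+00E9 at this point, the properties file itself is probably fine and the damage happens later, when the string is turned back into bytes.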

Another part of the site I'm i18n'ing is the generated/feedback emails. The body of each email contains static text as well as dynamic text. The static text is pulled out of the properties file as well. When I pull these strings out of the file and send them through JavaMail, I get message bodies that look like:

"D?sol?s. Le syst?me d?extraction des mots de passe est pr?sentement hors d?usage."

When it should read:
"Désolés. Le système d'extraction des mots de passe est présentement hors d'usage."

The special characters are not being properly decoded? It's using the wrong charset?
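
A `?` in place of each accented letter is what Java produces when a string is encoded to bytes with a charset that has no mapping for those characters: `String.getBytes(Charset)` substitutes the charset's replacement byte, which is `?` for US-ASCII. A minimal sketch of that effect, assuming US-ASCII is the charset in play (the class and method names are illustrative, and JavaMail is not involved here):

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class AsciiLossDemo {

    // Encodes text to bytes with the given charset and decodes it back,
    // mimicking what a reader sees after the text has travelled as bytes.
    static String roundTrip(String text, Charset charset) {
        byte[] wire = text.getBytes(charset);
        return new String(wire, charset);
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file pure ASCII.
        String original = "D\u00e9sol\u00e9s";
        // Each e-acute has no US-ASCII mapping and becomes '?'.
        System.out.println(roundTrip(original, StandardCharsets.US_ASCII));
        // UTF-8 can represent every character, so the round trip is lossless.
        System.out.println(roundTrip(original, StandardCharsets.UTF_8).equals(original));
    }
}
```

The same substitution would hit a typographic apostrophe (U+2019), which would be consistent with the `d?extraction` and `d?usage` in the broken message.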

I view the message headers, and observe:
Content-Type: text/plain; charset=ANSI_X3.4-1968

I was under the impression that UTF-8 was Java's 'default'?

Investigating my System properties programmatically, I discover:

file.encoding = ANSI_X3.4-1968

Hmm.. the same as my email.
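
`ANSI_X3.4-1968` is a registered alias for US-ASCII, which is the charset a JVM typically reports when it starts under the POSIX/C locale. A small sketch for checking what a given JVM reports (class and method names are illustrative):

```java
import java.nio.charset.Charset;

public class DefaultCharsetCheck {

    // Resolves a charset alias to the canonical name Java uses for it.
    static String canonicalName(String alias) {
        return Charset.forName(alias).name();
    }

    public static void main(String[] args) {
        // What the JVM was started with; depends on OS locale and JVM flags.
        System.out.println("file.encoding  = " + System.getProperty("file.encoding"));
        System.out.println("defaultCharset = " + Charset.defaultCharset().name());
        // The alias seen in the email header, resolved to its canonical name.
        System.out.println("ANSI_X3.4-1968 = " + canonicalName("ANSI_X3.4-1968"));
    }
}
```

If the default does turn out to be ASCII, the commonly used ways around it are starting the JVM with `-Dfile.encoding=UTF-8`, setting JavaMail's `mail.mime.charset` system property, or passing the charset explicitly with `MimeMessage.setText(String text, String charset)`.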

To make matters worse, there are other emails the system generates that have a different header (just text/plain, with no charset specified), and *these* emails manage to output the correct special characters.

Where might my encodings/charsets be off?