what should the default charset be?

 
I'm having a bit of trouble with charsets and encodings. My problem is specifically related to JavaMail and webapps, but I'm posting in the general forum because I think my difficulty comes from a general misunderstanding of charsets and encodings.

I've got a website that I am in the process of i18n-enabling.

The two languages are French and English. So far, I've had no real trouble with the French accented characters. Everything just appears to work the way I'd expect.

In TextPad, I can see my accented characters fine. When I view the file info, it tells me my document's "code set" is ANSI. I'm not sure what "code set" is; perhaps they mean charset?

Anyway, I upload the i18n properties file containing French words (and thus special characters) to my web server. I then use java.util.Locale to retrieve the localized text and it all works. The web pages show the accented characters correctly.
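One possible reason the pages come out right: `Properties.load(InputStream)` always reads bytes as ISO-8859-1, and for French accents an "ANSI" (Windows-1252) file has the same byte values. The sketch below is only an illustration under that assumption; the class name, the `greeting` key, and the in-memory "file" are hypothetical, and the real site may load its bundle differently.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical helper class, for illustration only.
final class PropertiesEncodingSketch {

    // Loads properties from raw file bytes and returns the value for one key.
    // Properties.load(InputStream) decodes the bytes as ISO-8859-1.
    static String loadValue(byte[] fileBytes, String key) throws IOException {
        Properties props = new Properties();
        props.load(new ByteArrayInputStream(fileBytes));
        return props.getProperty(key);
    }

    public static void main(String[] args) throws IOException {
        // "greeting" is a made-up key; \u00e9 is the e-acute character.
        String line = "greeting=D\u00e9sol\u00e9s";

        // File saved as ISO-8859-1 (same bytes as "ANSI" for these letters).
        byte[] latin1 = line.getBytes(StandardCharsets.ISO_8859_1);
        // Same text saved as UTF-8 instead.
        byte[] utf8 = line.getBytes(StandardCharsets.UTF_8);

        // The accents survive only when the file bytes are ISO-8859-1.
        System.out.println(line.endsWith(loadValue(latin1, "greeting"))); // true
        System.out.println(line.endsWith(loadValue(utf8, "greeting")));   // false
    }
}
```

If this is what is happening, the text is already correct once it is inside a Java `String`, and any damage would have to occur later, when the string is turned back into bytes.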

Another part of the site I'm i18n'ing is the generated/feedback emails. The bodies of the emails contain static text as well as dynamic text. The static text is pulled out of the properties file as well. When I pull these strings out of the file and send them through JavaMail, I get message bodies that look like:

"D?sol?s. Le syst?me d?extraction des mots de passe est pr?sentement hors d?usage."

When it should read:
"Désolés. Le système d'extraction des mots de passe est présentement hors d'usage."

Are the special characters not being encoded properly? Is it using the wrong charset?
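The question marks are consistent with the text being encoded as US-ASCII somewhere along the way: `String.getBytes(Charset)` replaces every character the charset cannot represent with its replacement byte, which for US-ASCII is `?`. A minimal sketch, assuming that is what happens to the message body (the class and method names are hypothetical, and this does not involve JavaMail itself):

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class AsciiRoundTrip {

    // Encodes the text to US-ASCII bytes and decodes it back.
    // Characters outside ASCII are replaced with '?' during encoding.
    static String throughAscii(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // \u00e9 = e-acute, \u00e8 = e-grave, \u2019 = typographic apostrophe.
        String original = "D\u00e9sol\u00e9s. Le syst\u00e8me d\u2019extraction";
        System.out.println(throughAscii(original));
        // prints: D?sol?s. Le syst?me d?extraction
    }
}
```

Note that a typographic apostrophe is also lost, which would explain `d?extraction` in the broken email.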

I view the message headers, and observe:
Content-Type: text/plain; charset=ANSI_X3.4-1968

I was under the impression that UTF-8 was Java's 'default'?

Investigating my System properties programmatically, I discover:

file.encoding = ANSI_X3.4-1968

Hmm.. the same as my email.
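For reference, a small sketch of one way to run this kind of check. It prints the `file.encoding` property and the JVM's default charset, and resolves `ANSI_X3.4-1968`, which Java treats as an alias of US-ASCII. The class and method names are hypothetical, and the first two values depend on the JVM and the environment it is started in.

```java
import java.nio.charset.Charset;

// Hypothetical helper class, for illustration only.
final class DefaultCharsetCheck {

    // Resolves a charset name or alias to its canonical Java name.
    static String canonicalName(String nameOrAlias) {
        return Charset.forName(nameOrAlias).name();
    }

    public static void main(String[] args) {
        // The raw system property as reported by the running JVM.
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        // The charset the JVM falls back to when an API is given none.
        System.out.println("default charset = " + Charset.defaultCharset().name());
        // Canonical name behind the alias seen in the mail header.
        System.out.println("ANSI_X3.4-1968  = " + canonicalName("ANSI_X3.4-1968"));
    }
}
```

So a default of `ANSI_X3.4-1968` means plain 7-bit ASCII, which has no accented characters at all.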

To make matters worse, there are other emails the system generates that have a different header (just text/plain, with no charset specified), and *these* emails manage to output the correct special characters.

Where might my encodings/charsets be off?
 