escaping foreign chars in generating xml, why?

 
Sven Anderson
Ranch Hand
Posts: 58
Hi,

I just started working on a system where foreign (non-ASCII) characters in XML output are automatically escaped (using Spring HtmlUtils) to their equivalent HTML character references. I don't really see the point in doing this if you have a db that stores its data as UTF-8 and a web server that also serves pages as UTF-8.

Are there any advantages/disadvantages to using HTML character references instead of outputting foreign chars just as they are?

Many thanks
E
 
Paul Clapham
Sheriff
Posts: 21416
The advantage is that when you do that, the XML you produce is resistant to being botched up by mis-encoding. You may be carefully ensuring that everything you do is encoded in UTF-8 but that is certainly not a common attitude in the Web world.
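
To illustrate the point, here is a minimal sketch in plain Java. The class `EscapeDemo` and its helpers `escapeNonAscii` and `misdecode` are hypothetical names made up for this example; this is a hand-rolled stand-in, not Spring's HtmlUtils. It uses numeric character references, since named HTML entities are not predefined in XML. The escaped text is pure ASCII, so it comes out unchanged even when a consumer wrongly reads the UTF-8 bytes as ISO-8859-1, while the raw text does not.

```java
import java.nio.charset.StandardCharsets;

public class EscapeDemo {
    // Hypothetical helper: replaces every non-ASCII code point with a
    // decimal character reference. It does NOT escape &, < or >.
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (cp < 128) {
                out.appendCodePoint(cp);
            } else {
                out.append("&#").append(cp).append(';');
            }
        });
        return out.toString();
    }

    // Simulates a consumer that reads UTF-8 bytes as ISO-8859-1.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Sample word with two non-ASCII letters, written with \u escapes
        // so the source file's own encoding cannot interfere.
        String raw = "sm\u00F6rg\u00E5s";
        String escaped = escapeNonAscii(raw);

        System.out.println(escaped);                            // sm&#246;rg&#229;s
        System.out.println(misdecode(escaped).equals(escaped)); // true: ASCII survives
        System.out.println(misdecode(raw).equals(raw));         // false: raw text is garbled
    }
}
```

The trade-off is that the escaped output is larger and harder to read by eye, which is the price paid for that robustness.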
 
Sven Anderson
Ranch Hand
Posts: 58
Hi Paul,

Does this mean that if you have an environment where database/web-server successfully serve utf-8 you shouldn't really have to bother with escaping characters and instead rely on the utf-8 encoding and leave the characters as they are?

Thanks
E
 
Ulf Dittmer
Rancher
Posts: 42968
Yes, roundtripping of UTF-8 text from DB through web server to browser, back to web server and into the database is possible, and it's not even all that difficult. For starters, make sure that the DB encoding is set to Unicode, and that all pages you serve are declared as UTF-8 encoded.
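
On the XML side, the same idea can be sketched roughly as follows, assuming plain Java with the JDK's built-in DOM parser. The class `Utf8RoundTrip` and its methods are hypothetical names for illustration, and the text passed in is assumed to contain no markup characters such as & or <. The document declares UTF-8 in its XML declaration and is actually encoded as UTF-8, so the unescaped characters come back intact when parsed.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class Utf8RoundTrip {
    // Builds a tiny XML document whose declared encoding matches the
    // encoding actually used to produce the bytes.
    static byte[] writeXml(String name) {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                   + "<name>" + name + "</name>";
        return xml.getBytes(StandardCharsets.UTF_8);
    }

    // Parses the bytes; the parser picks up the encoding from the declaration.
    static String readName(byte[] xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String original = "sm\u00F6rg\u00E5s"; // non-ASCII sample text
        String roundTripped = readName(writeXml(original));
        System.out.println(original.equals(roundTripped)); // true
    }
}
```

If the declaration and the real byte encoding disagree, the parser may either reject the document or silently produce wrong characters, which is the failure that escaping guards against.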