
XML XSL Encoding Issue

 
vinay purohit
Greenhorn
Posts: 3
Hi all,

I am generating an XML document and then transforming it using an XSL stylesheet.

While generating the XML document, certain strings contain special characters.
Example string: "... clients through the MYWeb® Web site".
Notice the ® symbol.

But when I generate the XML, somehow some other characters get appended too.
<DESCRIPTION>
... clients through the MyWebÂ® Web site.
</DESCRIPTION>
Notice the Â symbol.

and it comes out as garbage characters in the resulting HTML.

Any ideas why this could be happening? Can anyone shed light on this?

Thanks in advance.

cheers
Vinay
[ June 14, 2005: Message edited by: vinay purohit ]
 
Balaji Loganathan
author and deputy
Bartender
Posts: 3150
As you can guess, it's a Unicode problem: parsers like Xerces/Xalan expect the input XML to be compliant with the UTF standards (in general).
So make sure the special chars are Unicode-compliant.
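
A small sketch of how this kind of garbage can arise, assuming (this is not confirmed in the thread) that the text is encoded as UTF-8 at one step and decoded as ISO-8859-1 at another. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: reproduces a UTF-8 / ISO-8859-1 mismatch.
class MojibakeDemo {

    // Encode the string as UTF-8, then (wrongly) decode the bytes as ISO-8859-1.
    static String misdecode(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "MyWeb\u00AE Web site"; // \u00AE is the registered sign
        String garbled = misdecode(original);
        // The single registered sign becomes two characters: U+00C2 followed by U+00AE.
        for (char c : garbled.toCharArray()) {
            if (c > 0x7F) {
                System.out.printf("U+%04X%n", (int) c);
            }
        }
    }
}
```

In UTF-8 the registered sign is the two bytes C2 AE, so a reader that treats each byte as one ISO-8859-1 character sees an extra character in front of it.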
 
vinay purohit
Greenhorn
Posts: 3
I did try generating the xml with encoding as UTF-8.

<?xml version="1.0" encoding="UTF-8"?>

and my stylesheet as
<xsl:output method="html" encoding="UTF-8"/>

But it didn't work. Is there anything that I am doing wrong? Am I missing anything?

cheers
Vinay
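
The encoding declaration only labels the document; the bytes actually written have to match it. Below is a minimal sketch of one way to keep the two in sync with the standard JAXP classes, assuming the XML is built as a DOM and serialized to a byte stream. The class name Utf8XmlWriter and its methods are hypothetical, and the thread does not say how the XML is really produced.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper: serializes a one-element document as UTF-8 bytes.
class Utf8XmlWriter {

    static byte[] serialize(String description) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element el = doc.createElement("DESCRIPTION");
            el.setTextContent(description);
            doc.appendChild(el);

            Transformer t = TransformerFactory.newInstance().newTransformer();
            // Ask the serializer for UTF-8 so the declaration and the bytes agree.
            t.setOutputProperty(OutputKeys.ENCODING, "UTF-8");

            // Write to an OutputStream, not a Writer with the platform default charset.
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Parses the bytes back and returns the DESCRIPTION text, as a round-trip check.
    static String parseDescription(byte[] xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().parse(new ByteArrayInputStream(xml));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

If the text survives the round trip through parseDescription, the serialization step is probably not where the extra character comes from, and the place where the strings are first read (database, file, request) would be the next thing to check.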
 
Balaji Loganathan
author and deputy
Bartender
Posts: 3150
Well... welcome to JavaRanch, Vinay.

The additional characters it generates are strange; maybe it's because of improper input XML.
Where are you sourcing your XML from? I mean, a DB or some text files?

I use IDEs like XMLSpy Home Edition and Turbo to make sure that my input XML is UTF-compliant. Sometimes I manually (using Java) convert special chars to numeric character references; for your case, &#174; is the numeric character reference for ®.
Try this and see how it works!
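
A rough sketch of the manual conversion mentioned above, assuming the text is written directly into the XML markup as a string. The class and method names are invented for illustration; the method only handles non-ASCII characters and does not escape &, < or >.

```java
// Hypothetical helper: replaces every non-ASCII character with a
// decimal numeric character reference such as &#174;.
class NumericCharRefs {

    static String escapeNonAscii(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i); // handles surrogate pairs as one code point
            if (cp > 0x7F) {
                sb.append("&#").append(cp).append(';');
            } else {
                sb.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeNonAscii("MyWeb\u00AE Web site"));
    }
}
```

The result is pure ASCII, so it reads the same whichever encoding a later step assumes. It should not be applied to text that is then passed to a DOM API, because the serializer would escape the & again.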
 
vinay purohit
Greenhorn
Posts: 3
Well, the XML is generated at runtime. I will try what you have suggested and let you know.
 