I am trying to read XML data in UTF-8 encoding to display Japanese characters. The XML file looks like this, where each Japanese character is replaced by its decimal numeric character reference (the real file has no space between the & and the #):
<?xml version="1.0" encoding="UTF-8"?>
<translate>
<Data1>& #12507;</Data1>
<Data2>& #12507;</Data2>
<Data3>& #12507;</Data3>
<Data4>& #12507;</Data4>
<Data5>& #12507;</Data5>
</translate>
I am using DOM to traverse the nodes, but when I print the string that was read, it displays ??? instead of the actual Japanese characters.
The JSP function traverseTree that does the tree traversal looks like this:
public String[] traverseTree(Node node, JspWriter out, int i, String[] Data) throws Exception
{
    if (node == null)
    {
        return Data;
    }
    int type = node.getNodeType();
    switch (type)
    {
        // handle document nodes
        case Node.DOCUMENT_NODE:
        {
            traverseTree(((Document) node).getDocumentElement(), out, i, Data);
            break;
        }
        // handle element nodes
        case Node.ELEMENT_NODE:
        {
            String elementName = node.getNodeName();
            NodeList childNodes = node.getChildNodes();
            if (childNodes != null)
            {
                int length = childNodes.getLength();
                for (int loopIndex = 0; loopIndex < length; loopIndex++)
                {
                    traverseTree(childNodes.item(loopIndex), out, i, Data);
                    i++;
                }
            }
            break;
        }
        // handle text nodes
        case Node.TEXT_NODE:
        {
            String data = node.getNodeValue().trim();
            int changed = i + 1;
            changed = changed / 2;
            if ((data.indexOf("\n") < 0) && (data.length() > 0))
            {
                String now = new String(data.getBytes(), "UTF-8");
                Data[changed] = new String(data.getBytes(), "UTF-8");
            }
        }
    }
    return Data;
}
%>
So at the end of the tree traversal, I get the Japanese text in the String array Data, which I use for display in the JSP code.
Can you please explain to me where I am making a mistake? I am also using charset="x-sjis" for the JSP, but I still can't see the Japanese characters.
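In case it helps to narrow this down, here is a minimal standalone sketch of the step I suspect, outside of the JSP. The class name EntityRoundTrip and the helpers readText and roundTrip are made up for this post. It also assumes that the no-argument getBytes() on my server uses a default charset that cannot encode Japanese; ISO-8859-1 stands in for that here so the result is repeatable, but I have not confirmed what the server actually uses.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical test class, not part of the JSP above.
class EntityRoundTrip {

    // Parse the XML with DOM and return the text of the first element named `tag`.
    static String readText(String xml, String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName(tag).item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // The same encode/decode round trip as in traverseTree, but with an explicit
    // charset in place of the no-argument getBytes().
    // Assumption: ISO-8859-1 stands in for a default charset without Japanese.
    static String roundTrip(String data) {
        byte[] bytes = data.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<translate><Data1>&#12507;</Data1></translate>";
        String parsed = readText(xml, "Data1");
        // Code point of the character the parser produced from &#12507;
        System.out.println(Integer.toHexString(parsed.charAt(0))); // prints 30db
        // The same character after the getBytes / new String round trip
        System.out.println(roundTrip(parsed)); // prints ?
    }
}
```

Under that assumption the parser already returns the correct character (U+30DB) and it is the round trip that turns it into a question mark, so the string from getNodeValue() may not need re-decoding at all.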
Thanks in advance.
Kapil
[This message has been edited by Kapil Sabharwal (edited September 26, 2001).]