
XML / Java / Japanese Characters

 
Gaurav Mac Mathur
Ranch Hand
Posts: 47
The problem is that I am reading Japanese characters from a database and writing them to XML.
I see characters like "教育" in the XML, and while parsing the XML I get an error because of them.
I am missing something...
Guide me, gurus: what's missing?
Regards
 
Michael Morris
Ranch Hand
Posts: 3451

This is just a guess, but it's probably an encoding issue. What is the encoding on the XML header line? It's usually UTF-8:

<?xml version="1.0" encoding="UTF-8"?>

What parser are you using to read the XML file? There could be some known issue there with exotic characters.
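One way to rule out the encoding is a round trip in plain Java. The following is only a sketch, assuming the JDK's built-in DOM parser; the class name `EncodingRoundTrip`, the method name `roundTrip`, and the `<word>` element are made up for illustration. It writes the text as UTF-8 bytes under a UTF-8 declaration and checks what the parser hands back.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper, not from the original thread.
final class EncodingRoundTrip {

    // Wraps the text in a tiny document, encodes it as UTF-8, and parses it again.
    // The text must not contain '<' or '&', since no escaping is done here.
    static String roundTrip(String text) {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><word>" + text + "</word>";
        // The bytes must really be in the encoding that the declaration names.
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        String kyouiku = "\u6559\u80b2"; // the two characters from the first post
        System.out.println(roundTrip(kyouiku).equals(kyouiku));
    }
}
```

If the bytes were written in the platform default encoding instead while the header still said UTF-8, the parser could report malformed input or return different characters.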
 
Gaurav Mac Mathur
Ranch Hand
Posts: 47
And I am surprised to see the correct Japanese characters in my post. In the XML, what I see is (I am putting spaces between the characters):
& # x 9 5 a 2 ;
& # x 3 0 b 9 ;
So that shows that my XML is populated correctly.
Now I have to see how to parse this XML.
I will look into the encoding and will update you.
Thanks.
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
You may have an extra layer of escapes involved here. When you look at the XML, are you using some sort of XML editor, or a plain text editor? If it's an XML editor and you see something like "&#x95a2;" then it's possible that what's actually in the file (if you look with a text editor) is "&amp;#x95a2;". This means that however the file was generated, an extra escape sequence was generated somewhere along the way. On the other hand, if the text editor shows "&#x95a2;", then a good XML editor may render that as "関". Or not; it may depend on what mode it's in. (I'm not actually familiar with the current state of XML tools.)
Note that in order to write this post, I often put in extra & escapes myself. I think that what you see is what I intended to write, but it's easy to get confused with this sort of thing.
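The difference between one layer of escaping and two can be checked with a parser instead of an editor. This is a minimal sketch, assuming the JDK's built-in DOM parser; the class name `DoubleEscapeDemo`, the method name `textOf`, and the `<w>` element are invented for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not from the original thread.
final class DoubleEscapeDemo {

    // Parses a small XML string and returns the text of its root element.
    static String textOf(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        // One layer: the parser resolves the reference to a single character.
        String single = textOf("<w>&#x95a2;</w>");
        // Two layers: the parser only undoes the ampersand escape,
        // so the text is the eight literal characters of the reference.
        String doubled = textOf("<w>&amp;#x95a2;</w>");
        System.out.println(single.length());
        System.out.println(doubled.length());
    }
}
```

If the application ends up displaying the reference itself rather than the character, the second case is the likely cause.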
 
Gaurav Mac Mathur
Ranch Hand
Posts: 47
Well, here is an update on this problem.
If I simply replace
&#x2; by あ
the parsing works fine...
Any clue what's wrong?
Regards
[ July 18, 2003: Message edited by: Jim Yingst ]
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
Ummm... why would you have &#x2; in an XML file anyway? It's not a printable character. Unicode value 2 represents the "start of text" control character. No one uses those control characters any more; they are inherited from the dark ages. More importantly, they are not allowed as part of an XML 1.0 document, even as escaped character references. Allowed characters are:
#x9, #xA, #xD, [#x20-#xD7FF], [#xE000-#xFFFD], [#x10000-#x10FFFF]
Basically this means that the only control characters allowed are tab, newline, and carriage return. You just can't have &#x2; in XML; no conforming XML parser is supposed to accept it.
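The allowed ranges translate directly into a check that can be run on text before it is written to XML. The following is a sketch only; the class name `XmlChars` and the methods `isAllowed` and `strip` are made up here, and silently dropping characters is just one possible policy (rejecting the data or logging it may be better in practice).

```java
// Hypothetical helper, not from the original thread.
final class XmlChars {

    // True if the code point falls in one of the ranges listed above.
    static boolean isAllowed(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns a copy of the input with every disallowed code point removed.
    static String strip(String s) {
        StringBuilder out = new StringBuilder(s.length());
        s.codePoints()
                .filter(XmlChars::isAllowed)
                .forEach(out::appendCodePoint);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(isAllowed(0x2));    // the control character from this thread
        System.out.println(isAllowed(0x3042)); // hiragana "a"
        System.out.println(strip("A\u0002c").length());
    }
}
```

Stripping on the way out only hides the symptom; the stray character is still in whatever source the text came from.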
 
Gaurav Mac Mathur
Ranch Hand
Posts: 47
I am surprised myself.
I looked into the problem more deeply and figured out that this character is coming from the database.
There was a varchar field containing this text; somehow, in a couple of places, 'A' was replaced by 'c' and this character (hex value 02).
This character is not visible at the SQL prompt; its presence was discovered by comparing the length with the number of visible characters and looking at the hex codes.
I would like to share the query with you.


Problem resolved.
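The same length-and-hex-code check can also be done on the Java side, once the string has been read from the database. This is only a sketch and not the query mentioned above; the class name `HiddenChars` and its methods are invented for illustration.

```java
// Hypothetical helper, not from the original thread.
final class HiddenChars {

    // Lists every code point of the string in U+XXXX form, separated by spaces.
    static String hexDump(String s) {
        StringBuilder out = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(String.format("U+%04X", cp));
        });
        return out.toString();
    }

    // Counts ISO control characters. Note that this also counts tab, newline,
    // and carriage return, which are legal in XML.
    static long controlCount(String s) {
        return s.codePoints().filter(Character::isISOControl).count();
    }

    public static void main(String[] args) {
        String suspect = "Ac\u0002"; // looks like two characters when printed
        System.out.println(suspect.length());
        System.out.println(hexDump(suspect));
        System.out.println(controlCount(suspect));
    }
}
```

A length larger than the number of visible characters, together with a code such as U+0002 in the dump, points to the same kind of hidden character.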
 