Dom Parser and special character

 
Ranch Hand
Posts: 107
Hi there,

I am using the org.w3c.dom DOM parser to parse my XML document.
One node in that XML is:

I am using the following code to get the url node's value:




I then put that value into a Hashtable. However, getNodeValue() only returns http://server-img.ppapi.com/transcode?refresh=true from the url node. It does not return the part after the '&', which is location=http://ad.pp.com/adserver/image%3fpath%3db0a18ef22074416d9f2c01d73cf9adc4.jpg&qpub=0&dfmt=jpg. It should return the full value:
http://server-img.ppapi.com/transcode?refresh=true&location=http://ad.pp.com/adserver/image%3fpath%3db0a18ef22074416d9f2c01d73cf9adc4.jpg&qpub=0&dfmt=jpg

Any help is highly appreciated !

Thanks,
Bharat
 
Ranch Hand
Posts: 110
An XML parser can't parse special characters such as & and < when they appear unescaped in text. "&" has to be written as &amp; in order to be processed by the DOM parser.
See the XML specification at http://www.w3.org/TR/REC-xml/#syntax for more info.
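
To illustrate the escaping rule, here is a minimal sketch using the JDK's built-in parser. The class name AmpersandEscapeDemo, its helper methods and the example.com URL are made up for the illustration; only the url element name comes from the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class AmpersandEscapeDemo {

    // Parses the given XML string and returns the full text of the root element.
    public static String rootText(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    // Returns true if the parser rejects the XML as not well-formed.
    public static boolean isRejected(String xml) throws Exception {
        try {
            rootText(xml);
            return false;
        } catch (SAXException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        // Escaped ampersand: parses, and the text comes back with a plain '&'.
        System.out.println(rootText("<url>http://example.com/a?x=1&amp;y=2</url>"));

        // Unescaped ampersand: the document is not well-formed, so parsing fails.
        System.out.println(isRejected("<url>http://example.com/a?x=1&y=2</url>"));
    }
}
```

The parser unescapes &amp; for you, so the value you read from the DOM contains a plain '&'. The escaping only has to exist in the XML file itself.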

 
Marshal
Posts: 28288

Venkat Divvela wrote: An XML parser can't parse special characters such as & and < when they appear unescaped in text. "&" has to be written as &amp; in order to be processed by the DOM parser.



That's very true. But if the ampersand hadn't been properly escaped in the document, then the parser would have thrown an exception. It wouldn't have just ignored it. So I suspect it was escaped in the real document, and the escaping has been lost somewhere in the process of copying it, posting it here and displaying it in our browsers.

And that doesn't explain why only the part of the node before the ampersand was returned, either. My theory is that the element actually has more than one text child, and Bharat has just assumed that the text is all in a single child. Note that the Node interface has a "normalize" method, and if you read its documentation you'll see a mention of "adjacent Text nodes".
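
Here is a rough sketch of what that theory would look like in code. It does not reproduce Bharat's parser output; it builds an element by hand with two adjacent Text children to imitate the suspected situation. The class name AdjacentTextDemo, the helper buildUrlElement and the example.com URL are made up for the illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class AdjacentTextDemo {

    // Builds a <url> element whose text is split across two adjacent Text nodes
    // (an assumption made to imitate the suspected situation).
    public static Element buildUrlElement() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element url = doc.createElement("url");
        doc.appendChild(url);
        url.appendChild(doc.createTextNode("http://example.com/a?x=1"));
        url.appendChild(doc.createTextNode("&y=2"));
        return url;
    }

    public static void main(String[] args) throws Exception {
        Element url = buildUrlElement();

        // Reading only the first child gives just the part before the '&'.
        System.out.println(url.getFirstChild().getNodeValue());

        // getTextContent() concatenates the text of all children.
        System.out.println(url.getTextContent());

        // normalize() merges adjacent Text nodes into a single node.
        url.normalize();
        System.out.println(url.getFirstChild().getNodeValue());
    }
}
```

If the element really does have several text children, either calling normalize() before reading the first child or using getTextContent() should return the whole URL.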
 