Dom Parser and special character

 
Ranch Hand
Posts: 107
Hi there,

I am using the org.w3c.dom DOM parser to parse my XML document.
One node in that XML is

I am using the following code to get the url node value




and I am putting that value in a Hashtable, but the getNodeValue() method only returns http://server-img.ppapi.com/transcode?refresh=true from the url node. It does not return the part after the '&', which is location=http://ad.pp.com/adserver/image%3fpath%3db0a18ef22074416d9f2c01d73cf9adc4.jpg&qpub=0&dfmt=jpg. It should return the full value, like
http://server-img.ppapi.com/transcode?refresh=true&location=http://ad.pp.com/adserver/image%3fpath%3db0a18ef22074416d9f2c01d73cf9adc4.jpg&qpub=0&dfmt=jpg

Any help is highly appreciated !

Thanks,
Bharat
 
Ranch Hand
Posts: 110
An XML parser can't parse unescaped special characters such as & and <. "&" has to be written as &amp; in order to be processed by the DOM parser.
See the XML specification at http://www.w3.org/TR/REC-xml/#syntax for more info.
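
To illustrate the escaping rule, here is a minimal sketch using the JDK's built-in DOM parser. The class name, the helper methods and the shortened example.com URL are made up for the illustration; they are not taken from Bharat's document or code.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

public class AmpersandEscapeDemo {

    // Parses an XML string and returns the text content of its root element.
    static String rootText(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getDocumentElement().getTextContent();
    }

    // Returns true if the parser rejects the input as not well-formed.
    static boolean isRejected(String xml) throws Exception {
        try {
            rootText(xml);
            return false;
        } catch (SAXException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical, shortened URLs (assumption, not the original data).
        String escaped = "<url>http://example.com/transcode?refresh=true&amp;qpub=0</url>";
        String raw = "<url>http://example.com/transcode?refresh=true&qpub=0</url>";

        // The escaped form parses, and the parser hands back a plain '&'.
        System.out.println(rootText(escaped));

        // The unescaped form is not well-formed XML, so parsing fails.
        System.out.println(isRejected(raw));
    }
}
```

With the escaped input, the value read back from the DOM contains an ordinary '&'; the `&amp;` exists only in the serialized document. The default error handler may also print a fatal-error message to stderr for the unescaped input.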

 
Sheriff
Posts: 27235

Venkat Divvela wrote: An XML parser can't parse unescaped special characters such as & and <. "&" has to be written as &amp; in order to be processed by the DOM parser.



That's very true. But if the ampersand hadn't been properly escaped in the document, then the parser would have thrown an exception. It wouldn't have just ignored it. So I suspect it was, and the escaping has been lost somewhere in the process of copying it and posting it here and displaying it in our browsers.

And that doesn't explain why only the part of the node before the ampersand was returned, either. My theory is that the element actually has more than one text child, and Bharat has just assumed that the text is all in a single child. Note that the Node interface has a "normalize" method, and if you read its documentation you'll see a mention of "adjacent Text nodes".
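
If that theory is right, the effect can be reproduced with a small sketch. Since the original document and code are not shown, this builds a `url` element by hand with two adjacent Text children (an assumption standing in for whatever the parser produced) and uses a shortened, hypothetical example.com URL. It compares reading only the first child against `getTextContent()` and against the first child after `normalize()`.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class AdjacentTextDemo {

    // Builds a <url> element whose text is deliberately split into two Text children.
    static Element buildSplitUrl() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        Element url = doc.createElement("url");
        doc.appendChild(url);
        // Hypothetical values; the split point mimics the '&' in the question.
        url.appendChild(doc.createTextNode("http://example.com/transcode?refresh=true"));
        url.appendChild(doc.createTextNode("&location=http://example.com/image.jpg"));
        return url;
    }

    public static void main(String[] args) throws Exception {
        Element url = buildSplitUrl();

        // Two Text children: reading only the first one loses the rest.
        System.out.println(url.getChildNodes().getLength());
        System.out.println(url.getFirstChild().getNodeValue());

        // getTextContent() concatenates the text of all descendants.
        System.out.println(url.getTextContent());

        // normalize() merges adjacent Text nodes into one.
        url.normalize();
        System.out.println(url.getChildNodes().getLength());
        System.out.println(url.getFirstChild().getNodeValue());
    }
}
```

Either calling `normalize()` before reading the first child, or using `getTextContent()` on the element, would avoid depending on how the text happens to be split.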
 