Parsing the CDATA section in XML using XML Pull Parser

 
Greenhorn
Posts: 8
Sample XML



With the code below, I was able to retrieve the <title>, <published> and <author> values within the <entry> tag.



From the <content> tag, how can I retrieve the "href" value of the <a> tag and the text (The BJP is likely to anoint Narendra Modi.....) inside the <p><span> tag?
 
Marshal
Posts: 25454
It looks like the content of that CDATA section is a fragment of HTML: not an HTML document, just a bunch of HTML tags. So the first step is to get the content of the CDATA section into a String (using your XML parser). The second step is to parse that String using an HTML parser; no XML parser will be able to deal with that. Make sure you choose an HTML parser which is capable of dealing with "tag soup".
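The two steps could be sketched roughly as follows. This is only an illustration under several assumptions: it uses the JDK's StAX reader as a stand-in for XmlPullParser (with XmlPullParser, `nextText()` on the <content> start tag should play the same role), it uses the lenient HTML parser bundled with Swing as a stand-in for a dedicated tag-soup parser, and the class and method names are made up for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, not taken from the thread.
class CdataHtmlSketch {

    // Step 1: return the character data of the first <content> element.
    // getElementText() concatenates plain text and CDATA sections alike.
    static String contentText(String xml) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "content".equals(reader.getLocalName())) {
                    return reader.getElementText();
                }
            }
            return null; // no <content> element found
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Malformed XML", e);
        }
    }

    // Step 2: feed the HTML fragment to a lenient HTML parser and
    // collect the href attribute of every <a> start tag.
    static List<String> hrefs(String html) {
        List<String> result = new ArrayList<>();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.A) {
                    Object href = attrs.getAttribute(HTML.Attribute.HREF);
                    if (href != null) {
                        result.add(href.toString());
                    }
                }
            }
        };
        try {
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new IllegalStateException("Could not read HTML", e);
        }
        return result;
    }
}
```

Calling `CdataHtmlSketch.hrefs(CdataHtmlSketch.contentText(xml))` chains the two steps. A dedicated tag-soup parser would likely cope better with badly broken markup than the Swing parser used here.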
 
Ranch Hand
Posts: 729
Before suggesting how to do it, I would point out that the provider can at any time supply the payload as ordinary (escaped) text rather than as a CDATA section without violating much, so a solution should allow for that freedom.

Further, to simplify life, I would simply use a regex to pick up the href, hoping the href is normal enough.

This is what you can do.
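A minimal sketch of the regex idea, not the code originally posted here, assuming the href is always quoted and that the <content> text is already in a String. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
class HrefRegexSketch {

    // Matches href="..." or href='...' inside an <a ...> start tag.
    // Group 1 is the opening quote, group 2 is the attribute value.
    private static final Pattern HREF = Pattern.compile(
            "<a\\b[^>]*?\\bhref\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    static List<String> hrefs(String html) {
        List<String> out = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            out.add(m.group(2));
        }
        return out;
    }
}
```

Unquoted href values and entity-encoded characters are not handled; if the markup is less regular than hoped, an HTML parser is the safer route.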
 
sn omen
Greenhorn
Posts: 8
With the above code I'm getting the href value within the <a> tag. How can I retrieve the text within the <span> tag?
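One possible way to get the span text in the same regex style is sketched below. It assumes the spans are not nested inside one another and that the <content> text is already in a String; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
class SpanTextSketch {

    // Captures everything between <span ...> and the nearest </span>.
    private static final Pattern SPAN = Pattern.compile(
            "<span\\b[^>]*>(.*?)</span\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Any remaining tag inside the span, such as <b> or <i>.
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    static List<String> spanTexts(String html) {
        List<String> out = new ArrayList<>();
        Matcher m = SPAN.matcher(html);
        while (m.find()) {
            // Drop nested markup and surrounding whitespace, keep the text.
            String inner = TAG.matcher(m.group(1)).replaceAll("");
            out.add(inner.trim());
        }
        return out;
    }
}
```

Entities such as `&amp;` are left undecoded, and nested spans would be cut at the first closing tag, so this only suits markup that is as regular as the sample.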
 