
How to extract part of the data from an XML file

 
Pingili Vishwanath
Greenhorn
Posts: 7
Hi,

I have an XML file like this:

<?xml version="1.0" encoding="UTF-8"?>
<BILLING11><CARRIER>Verizon</CARRIER>
<DID>999998</DID>
<GMT>20071113</GMT>
<BR><EI>1768221198</EI>
<CID>200</CID>
<ADS>211</ADS>
</BR>
<BR><EI>1768221200</EI>
<CID>200</CID>
<ADS>219</ADS>
</BR>
<CT>2</CT>
</BILLING11>

Here there are two <BR> records, and <CT> gives the record count.

I want to extract each record and write it to a new XML file.
For example, I want to extract the following from the XML file above:

<BR><EI>1768221198</EI>
<CID>200</CID>
<ADS>211</ADS>
</BR>

Any idea how to do this?

Thanks in advance.
Vishwa
 
sudha swami
Ranch Hand
Posts: 186
You can use either a SAX parser or a DOM parser.
sudha
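As a rough illustration of the DOM route, the sketch below parses the sample, finds every <BR> element, and serializes each one on its own. The class name `DomBrExtractor`, the method `extractRecords`, and the closing `</BILLING11>` tag in the sample are assumptions made for this example; only standard JDK APIs (`javax.xml.parsers`, `javax.xml.transform`) are used.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
class DomBrExtractor {
    // Sample from the question; the closing </BILLING11> tag is assumed.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<BILLING11><CARRIER>Verizon</CARRIER>\n"
        + "<DID>999998</DID>\n"
        + "<GMT>20071113</GMT>\n"
        + "<BR><EI>1768221198</EI>\n<CID>200</CID>\n<ADS>211</ADS>\n</BR>\n"
        + "<BR><EI>1768221200</EI>\n<CID>200</CID>\n<ADS>219</ADS>\n</BR>\n"
        + "<CT>2</CT>\n"
        + "</BILLING11>";

    // Returns one serialized XML string per <BR> element, in document order.
    static List<String> extractRecords(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList records = doc.getElementsByTagName("BR");

            // An identity transformer copies a DOM node to the output unchanged.
            Transformer serializer = TransformerFactory.newInstance().newTransformer();

            List<String> result = new ArrayList<>();
            for (int i = 0; i < records.getLength(); i++) {
                StringWriter out = new StringWriter();
                serializer.transform(new DOMSource(records.item(i)), new StreamResult(out));
                result.add(out.toString());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not extract <BR> records", e);
        }
    }

    public static void main(String[] args) {
        for (String record : extractRecords(SAMPLE)) {
            System.out.println(record);
            System.out.println("----");
        }
    }
}
```

To write files instead of strings, the `StreamResult` can be constructed from a `java.io.File` rather than a `StringWriter`.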
 
Raees Uzhunnan
Ranch Hand
Posts: 126
StAX is the most appropriate parser for efficiently reading parts of an XML document.
With StAX you can move through the XML data like a cursor.

Thanks,
Raees
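A minimal sketch of the cursor style with StAX could look like the following. It pulls events one at a time and collects the child values of each <BR> into a map. The class name `StaxBrReader`, the method `readRecords`, and the closing `</BILLING11>` tag in the sample are assumptions for illustration, and the code assumes every child of <BR> contains text only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, not part of any library.
class StaxBrReader {
    // Sample from the question; the closing </BILLING11> tag is assumed.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<BILLING11><CARRIER>Verizon</CARRIER>\n"
        + "<DID>999998</DID>\n"
        + "<GMT>20071113</GMT>\n"
        + "<BR><EI>1768221198</EI>\n<CID>200</CID>\n<ADS>211</ADS>\n</BR>\n"
        + "<BR><EI>1768221200</EI>\n<CID>200</CID>\n<ADS>219</ADS>\n</BR>\n"
        + "<CT>2</CT>\n"
        + "</BILLING11>";

    // Returns one map (child element name to text) per <BR> record.
    static List<Map<String, String>> readRecords(String xml) {
        List<Map<String, String>> records = new ArrayList<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            Map<String, String> current = null; // non-null only while inside a <BR>
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if ("BR".equals(name)) {
                        current = new LinkedHashMap<>();
                    } else if (current != null) {
                        // getElementText() reads the text and leaves the cursor
                        // on the matching end tag of this child element.
                        current.put(name, reader.getElementText());
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && "BR".equals(reader.getLocalName())) {
                    records.add(current);
                    current = null;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not read <BR> records", e);
        }
        return records;
    }

    public static void main(String[] args) {
        for (Map<String, String> record : readRecords(SAMPLE)) {
            System.out.println(record);
        }
    }
}
```

Elements outside a <BR>, such as <CARRIER> or <CT>, are skipped without being stored, which is the main attraction of the streaming approach for large files.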
 
Nitesh Kant
Bartender
Posts: 1638
By any chance, do you have an XSD for this XML?
If yes, you can use XMLBeans, which does two things:

  • Hides the parser implementation
  • Lets you use XPath/XQuery to point to any element inside the XML and work on it

If not, you can use any open-source XPath engine, which will do the trick. Parsing the whole XML yourself with any sort of parser will always be more cryptic and give you awful performance compared to XPath.
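For readers who want to see what pointing at an element with XPath looks like, here is a small sketch. It uses the `javax.xml.xpath` API bundled with the JDK rather than XMLBeans or a third-party engine, and it parses the document into a DOM first. The class name `XPathBrLookup`, its two methods, and the closing `</BILLING11>` tag in the sample are assumptions for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
class XPathBrLookup {
    // Sample from the question; the closing </BILLING11> tag is assumed.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<BILLING11><CARRIER>Verizon</CARRIER>\n"
        + "<DID>999998</DID>\n"
        + "<GMT>20071113</GMT>\n"
        + "<BR><EI>1768221198</EI>\n<CID>200</CID>\n<ADS>211</ADS>\n</BR>\n"
        + "<BR><EI>1768221200</EI>\n<CID>200</CID>\n<ADS>219</ADS>\n</BR>\n"
        + "<CT>2</CT>\n"
        + "</BILLING11>";

    // The XML is parsed into a DOM tree before any XPath is evaluated.
    private static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Evaluates the expression and returns the string value of the first match.
    static String evaluateString(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, parse(xml));
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed: " + expression, e);
        }
    }

    // Evaluates the expression as a node set and returns how many nodes matched.
    static int countNodes(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(expression, parse(xml), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(evaluateString(SAMPLE, "/BILLING11/BR[1]/EI"));
        System.out.println(countNodes(SAMPLE, "/BILLING11/BR"));
    }
}
```

The expression `/BILLING11/BR[1]` selects the first record; XPath positions start at 1, not 0.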
     
    William Brogden
    Author and all-around good cowpoke
    Rancher
    Posts: 13071
    Nitesh:
    Parsing the whole xml using any sort of parser will anytime be more cryptic and give you awful performance as compared to XPath.


    Alas, this is incorrect. XPath has to work on top of the standard library, so it will always be slower. The advantage of XPath is clarity of expression and fewer lines of code, not speed.

    I did some timing experiments for this article - the difference is big.

    Bill
     
    sudha swami
    Ranch Hand
    Posts: 186
    Hi,
    Correct me if I am wrong.

    If I want to parse part of the XML data and speed is not a concern, then XPath would be better than a SAX/DOM parser.

    Regards,
    sudha
     
    Paul Clapham
    Sheriff
    Posts: 21322
    No, that isn't even wrong. You can't use XPath until you have already parsed the XML using the DOM parser.
     
    sudha swami
    Ranch Hand
    Posts: 186
    Thanks for the info.
     
    Nitesh Kant
    Bartender
    Posts: 1638
    William:
    Alas, this is incorrect. XPath has to work on top of the standard library so will always be slower. The advantage of XPath is clarity of expression and reduced lines of code, not speed.


    Do you think this is generic behavior, or does it also depend on the XPath engine implementation? (I wanted to do some R&D and come up with results, but I really did not have time, so I am asking for an opinion. Obviously the comparison must use the same parsing methodology both inside the XPath engine and in the hand-written search.)
    I was just wondering: it does not make sense for performance to drop alarmingly with XPath compared to a node-wise search, at least for the simple XPath used in your article, and even more so if the XPath is precompiled. What do you say?
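To make the comparison concrete, the sketch below runs a precompiled `XPathExpression` and a hand-written node-wise walk over the same parsed DOM and checks that both return the same values. The class name `CompiledXPathVsDomWalk`, its methods, the expression `/BILLING11/BR/EI`, and the closing `</BILLING11>` tag in the sample are assumptions for this example; the expression used in the article is not reproduced here. The timing loop in `main` is only a crude indication, not a proper benchmark.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical comparison harness, not part of any library.
class CompiledXPathVsDomWalk {
    // Sample from the question; the closing </BILLING11> tag is assumed.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<BILLING11><CARRIER>Verizon</CARRIER>\n"
        + "<BR><EI>1768221198</EI>\n<CID>200</CID>\n<ADS>211</ADS>\n</BR>\n"
        + "<BR><EI>1768221200</EI>\n<CID>200</CID>\n<ADS>219</ADS>\n</BR>\n"
        + "<CT>2</CT>\n"
        + "</BILLING11>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Compile once, then reuse the expression for every evaluation.
    static XPathExpression compile(String expression) {
        try {
            return XPathFactory.newInstance().newXPath().compile(expression);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Bad XPath: " + expression, e);
        }
    }

    static List<String> eiViaXPath(Document doc, XPathExpression expr) {
        try {
            NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    // Node-wise search: <BR> children of the root element, then their <EI> children.
    static List<String> eiViaDomWalk(Document doc) {
        List<String> values = new ArrayList<>();
        Node root = doc.getDocumentElement();
        for (Node br = root.getFirstChild(); br != null; br = br.getNextSibling()) {
            if (br.getNodeType() != Node.ELEMENT_NODE || !"BR".equals(br.getNodeName())) {
                continue;
            }
            for (Node f = br.getFirstChild(); f != null; f = f.getNextSibling()) {
                if (f.getNodeType() == Node.ELEMENT_NODE && "EI".equals(f.getNodeName())) {
                    values.add(f.getTextContent());
                }
            }
        }
        return values;
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        XPathExpression expr = compile("/BILLING11/BR/EI");
        int rounds = 10_000; // arbitrary; timings vary by JVM and machine

        long t0 = System.nanoTime();
        for (int i = 0; i < rounds; i++) eiViaXPath(doc, expr);
        long t1 = System.nanoTime();
        for (int i = 0; i < rounds; i++) eiViaDomWalk(doc);
        long t2 = System.nanoTime();

        System.out.println("XPath (precompiled): " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("Node-wise walk:      " + (t2 - t1) / 1_000_000 + " ms");
    }
}
```

Parsing is kept outside both timed loops so that only the search itself is compared, which is the condition the question asks for.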
     
    William Brogden
    Author and all-around good cowpoke
    Rancher
    Posts: 13071
    Nitesh:
    I was just wondering that it does not make sense for the performance to go down alarmingly using XPath as compared to the node wise search, at least for the simple xpath that is used in your article. More so if the xpath is pre-compiled. What do you say?


    I say it makes perfect sense. No magic is going on here: XPath interpretation has only the tools in org.w3c.dom (in whatever actual implementation) to work with. The search has to interpret the path in those terms.

    I suggest you take a look at the XPath specification (for example XPath 2.0) - it is always in terms of a DOM.

    Now - if you want to try to figure out an XPath-like high-speed scan using SAX or StAX - great, but it won't be XPath. XPath-like syntax seems to be getting popular - for example, this Apache project applies the syntax to object graphs!

    Bill
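One way such an XPath-like scan over SAX might be sketched is shown below. The handler tracks the current element path as a string and collects text whenever that path equals a target such as `/BILLING11/BR/EI`. As the post says, this is not XPath: it supports only simple absolute child paths, with no predicates, axes, or functions. The class name `SaxPathScan`, the method `textAt`, and the closing `</BILLING11>` tag in the sample are assumptions for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of any library.
class SaxPathScan {
    // Sample from the question; the closing </BILLING11> tag is assumed.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<BILLING11><CARRIER>Verizon</CARRIER>\n"
        + "<BR><EI>1768221198</EI>\n<CID>200</CID>\n<ADS>211</ADS>\n</BR>\n"
        + "<BR><EI>1768221200</EI>\n<CID>200</CID>\n<ADS>219</ADS>\n</BR>\n"
        + "<CT>2</CT>\n"
        + "</BILLING11>";

    // Returns the text of every element whose path from the root equals targetPath.
    static List<String> textAt(String xml, final String targetPath) {
        final List<String> hits = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder path = new StringBuilder(); // e.g. "/BILLING11/BR"
            private StringBuilder text; // non-null only while inside a matching element

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                path.append('/').append(qName);
                if (path.toString().equals(targetPath)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (text != null && path.toString().equals(targetPath)) {
                    hits.add(text.toString());
                    text = null;
                }
                // Drop the "/qName" segment that startElement appended.
                path.setLength(path.length() - qName.length() - 1);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("SAX scan failed", e);
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(textAt(SAMPLE, "/BILLING11/BR/EI"));
    }
}
```

No tree is built here, so memory use stays roughly constant regardless of document size; the price is that each new kind of query needs more handler logic.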
     