
DOM parser

 
avihai marchiano
Ranch Hand
Posts: 342
Can you please comment on DOM parsers?

I work with JBoss 4.2, and if I understand correctly, dom4j has problems running under JBoss.

Thank you
 
fred rosenberger
lowercase baba
Bartender
Posts: 12180
This sounds more like a thread for our XML forum. I'm going to move it over there. Any followups will be in that one, not here.
 
Paul Clapham
Sheriff
Posts: 21298
Sounds like a question about JBoss to me. Avihai, should I move this post to the JBoss forum or would you like to expand on your question? Right now I don't really see anything that can be answered.
 
avihai marchiano
Ranch Hand
Posts: 342
Let's ignore JBoss.

What is the best DOM parser?

Thank you
 
Rahul Bhattacharjee
Ranch Hand
Posts: 2308
Originally posted by avihai marchiano:

What is the best dom parser?


This is how I would approach it if someone asked me which DOM parser is best.

As you may know, DOM constructs an in-memory tree model of the XML, and that tree is much larger than the document itself. So I would give weight to the parser that consumes less memory.

The second thing to look at is how much time the parser takes to parse the XML into an in-memory tree model.

Another thing to check is how that time scales with the size of the documents.
I have never done this analysis myself, but doing it might answer your question.
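
A rough sketch of the timing part of that analysis, assuming the JDK's default javax.xml.parsers.DocumentBuilder as the parser under test. The class name DomParseTiming, the helper methods, and the synthetic <items> document are made up for illustration. It measures parse time only; memory use would need a profiler or heap measurements on top of this.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical harness, not a rigorous benchmark.
class DomParseTiming {

    // Builds a synthetic document with the given number of <item> elements (made-up test data).
    static byte[] makeXml(int items) {
        StringBuilder sb = new StringBuilder("<items>");
        for (int i = 0; i < items; i++) {
            sb.append("<item id=\"").append(i).append("\">value ").append(i).append("</item>");
        }
        return sb.append("</items>").toString().getBytes(StandardCharsets.UTF_8);
    }

    // Parses the bytes into a DOM tree and returns the elapsed time in nanoseconds.
    static long timeParse(DocumentBuilder builder, byte[] xml) throws Exception {
        long start = System.nanoTime();
        Document doc = builder.parse(new ByteArrayInputStream(xml));
        long elapsed = System.nanoTime() - start;
        // Use the result so the parse is not dead code.
        if (doc.getDocumentElement() == null) {
            throw new IllegalStateException("empty document");
        }
        return elapsed;
    }

    public static void main(String[] args) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        for (int items : new int[] {1_000, 10_000, 100_000}) {
            byte[] xml = makeXml(items);
            timeParse(builder, xml); // warm-up run, result ignored
            long nanos = timeParse(builder, xml);
            System.out.printf("%d bytes parsed in %.2f ms%n", xml.length, nanos / 1e6);
        }
    }
}
```

To compare parsers, the same loop would be run once per implementation and the times set against the document sizes. A single timed run per size is noisy, so repeated runs and real documents would give a fairer picture.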
 
Ulf Dittmer
Rancher
Posts: 42968
Do you have reason to believe that the default implementation used by javax.xml.parsers.DocumentBuilder won't do? If so, in what way is it insufficient?
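
For reference, a minimal sketch of using that default implementation and checking which factory class JAXP actually picked on a given JVM or app server. The class name DefaultDomParser and the sample XML are made up for illustration. As far as I know, the choice can be overridden with the javax.xml.parsers.DocumentBuilderFactory system property, which is one way to try another parser without changing code.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class for illustration.
class DefaultDomParser {

    // Parses an XML string with whatever DocumentBuilder implementation JAXP selects.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // Prints the factory class in use, which differs between JDKs and app servers.
        System.out.println(DocumentBuilderFactory.newInstance().getClass().getName());
        Document doc = parse("<greeting lang=\"en\">hello</greeting>");
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```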
 
avihai marchiano
Ranch Hand
Posts: 342
I work with the default, and parsing the XML takes too much time.
 
Ulf Dittmer
Rancher
Posts: 42968
Have you considered using SAX instead of DOM?
 
avihai marchiano
Ranch Hand
Posts: 342
Yes, but I need all the data, so I decided to go with DOM.
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13071
Here is a recent parser performance article.

Note how the throughput "winner" varies according to document size.

Bill
(Have you considered using SAX parsing with a custom approach to saving the data in Java classes? Building a DOM involves a LOT of object creation you may be able to avoid. Does your program require a DOM for data manipulation?)
[ September 19, 2007: Message edited by: William Brogden ]
 
avihai marchiano
Ranch Hand
Posts: 342
Wow, thank you, that is great.

Can you please explain your last comment about considering SAX even if you need the whole document?

As far as I know, if you need the whole document you should prefer DOM.

I only need to read data.
 
Ulf Dittmer
Rancher
Posts: 42968
Originally posted by avihai marchiano:

As far as I know, if you need the whole document you should prefer DOM.

I wouldn't say prefer. If you need access to the whole document at the same time, then DOM is probably the way to go. SAX presents the whole document as well, but sequentially. So if you're processing (say) the 5th <foobar> element, and the code suddenly thinks "uh oh, I need the value of the 3rd <foobar> element, and I didn't save it when it was parsed", then you're out of luck. But if upon reading the 3rd element you already know that you're going to need it later, then you can store it somewhere and access it later.

Originally posted by avihai marchiano:

I only need to read data.

That's actually a strong reason to prefer SAX, because DOM does a good many things (and uses quite a bit of memory) to set things up so that you can change and save the document.
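
The "store it when it is parsed" idea can be sketched roughly like this. FoobarCollector is a made-up name, and the sketch assumes <foobar> elements contain only text and are not nested inside each other. Every value is kept as it streams past, so the 3rd one is still available while the 5th is being handled.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: remembers the text of every <foobar> element it sees.
class FoobarCollector extends DefaultHandler {
    final List<String> values = new ArrayList<>(); // values seen so far, in document order
    private StringBuilder text;                    // non-null only while inside a <foobar>

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("foobar".equals(qName)) {
            text = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one text node, so append rather than assign.
        if (text != null) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("foobar".equals(qName)) {
            values.add(text.toString()); // saved now, usable by any later event
            text = null;
        }
    }

    // Convenience method: runs the default SAX parser over an XML string.
    static List<String> collect(String xml) throws Exception {
        FoobarCollector handler = new FoobarCollector();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.values;
    }
}
```

Only the strings you choose to keep stay in memory; everything else in the document is discarded as soon as its events have been delivered.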
 
avihai marchiano
Ranch Hand
Posts: 342
Thank you.

I certainly agree.
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13071
Originally posted by avihai marchiano:

Can you please explain your last comment about considering SAX even if you need the whole document?


I will give it a try. Suppose your XML document represents a collection of books, with the data for each one inside a <book> element. Each starting book tag contains some attributes you want to keep, and there are additional elements with various bits of data.
We are going to define a Book class where each instance represents all the data inside one <book> element, so the collection of instances represents the usable data from the document.

In your custom SAX event handler you do this:

1. When a startElement event for "book" occurs, create a new book object, passing the constructor the "Attributes" - keep a reference to the new object as your working object.
2. For each subsequent event, keep track of the current element and/or pass the text data you need to keep to some method in the working book object. (Remember that characters() events may contain only part of the data for a Text node.)
3. When you get the endElement event for "book" that object is complete - add the reference to some collection.

This saves all the object creation that would go into a DOM and lets you skip data you don't need for a particular application.
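
A minimal sketch of the three steps above, not a definitive implementation. The isbn attribute, the fields map, and the assumption that each <book> has only flat, text-only child elements are invented for illustration, as are the class and method names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical data class: one instance per <book> element.
class Book {
    final String isbn;                                  // assumed attribute on the start tag
    final Map<String, String> fields = new HashMap<>(); // child element name -> text

    Book(Attributes atts) {
        // Copy the values needed now; the parser may reuse the Attributes object.
        this.isbn = atts.getValue("isbn");
    }

    void setField(String name, String value) {
        fields.put(name, value);
    }
}

class BookHandler extends DefaultHandler {
    final List<Book> books = new ArrayList<>();
    private Book current;                               // the working object
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0);                              // start collecting text for this element
        if ("book".equals(qName)) {
            current = new Book(atts);                   // step 1
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);                 // may deliver a text node in pieces
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("book".equals(qName)) {
            books.add(current);                         // step 3: the object is complete
            current = null;
        } else if (current != null) {
            current.setField(qName, text.toString().trim()); // step 2
        }
        text.setLength(0);
    }

    // Convenience method: runs the default SAX parser over an XML string.
    static List<Book> parse(String xml) throws Exception {
        BookHandler handler = new BookHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.books;
    }
}
```

Calling BookHandler.parse(xml) returns the list of Book objects. Elements you don't care about could be skipped by checking qName in endElement before calling setField.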

Bill
 