JAXB and large XML document

 
Hanna Habashy
Ranch Hand
Posts: 532
Hello all:

I am still green in web services, and I need your collective wisdom to choose a technology for parsing and building large XML documents.
I have some knowledge of DOM, SAX and JAXB, and I want to know which technology suits the job better.
There will be 2 modules:

The first module:
Receives large unformatted XML documents, parses them, converts them to formatted XML documents, and hands them to the second module.

The second module:
Receives the formatted XML documents from the first module and saves them to the database. It also builds formatted XML documents from the database and serves them to clients through a web service.

I know SAX doesn't allow me to manipulate XML data in memory, so it is not an option for building XML documents. Correct me if I am wrong?
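On the point above: SAX is indeed a read-only, event-driven parsing API, but writing a document does not have to go through an in-memory tree either. The JDK's StAX API includes `XMLStreamWriter`, which emits elements one at a time. The sketch below is only an illustration under assumptions: the `patients`/`patient` element names and the `String[][]` rows standing in for database results are hypothetical and not from this thread.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

final class StaxWriterSketch {

    // Writes one <patient> element per row without building a tree.
    // Each row is {id, name}; in a real module the rows would come
    // from a database cursor (assumption, not shown here).
    static String writePatients(String[][] rows) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        w.writeStartElement("patients");
        for (String[] row : rows) {
            w.writeStartElement("patient");
            w.writeAttribute("id", row[0]);
            w.writeCharacters(row[1]); // special characters such as & are escaped
            w.writeEndElement();
        }
        w.writeEndElement();
        w.writeEndDocument();
        w.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(writePatients(new String[][] {{"1", "Ann"}, {"2", "Bob"}}));
    }
}
```

In a real service the `StringWriter` would be replaced by the response stream, so memory use stays roughly constant regardless of how many rows are written.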

DOM and JAXB build in-memory structures and should be suitable technologies; however, I don't know about their performance. Are there any performance issues with either one?

Do you recommend any other technology?

Thanks
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13078
The following questions occur to me:
1. How large is your primary document - can you build a DOM and have memory left over?
2. How much rearrangement of elements and computation is required to create the formatted XML output - do values get changed or just rearranged? Perhaps XSLT will be the simplest approach if you just have to rearrange.
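
To make the XSLT option in question 2 concrete, here is a minimal sketch using the JDK's built-in `javax.xml.transform` API. The input layout (`records`/`rec` with `id` and `nm` attributes) and the output layout (`patients`/`patient`) are hypothetical, chosen only to show a pure rearrangement. Note that common XSLT 1.0 processors read the whole input into memory, so question 1 applies here as well.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XsltRearrangeSketch {

    // Hypothetical stylesheet: turns attribute-style <rec id=".." nm=".."/>
    // records into element-style <patient><id/><name/></patient> records.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='/records'>"
        + "<patients><xsl:apply-templates select='rec'/></patients>"
        + "</xsl:template>"
        + "<xsl:template match='rec'>"
        + "<patient><id><xsl:value-of select='@id'/></id>"
        + "<name><xsl:value-of select='@nm'/></name></patient>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet to the given XML text and returns the result.
    static String transform(String inputXml) throws TransformerException {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(inputXml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(transform("<records><rec id='1' nm='Ann'/></records>"));
    }
}
```

If values have to be recomputed rather than just moved around, the stylesheet grows quickly, and that is usually the point at which a code-based approach becomes easier to maintain.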

I really can't see any role for JAXB in this problem.

Bill
 
Hanna Habashy
Ranch Hand
Posts: 532
William, thanks for the reply.

The two modules that I described above are a small portion of a JEE application. The main application consists of multiple modules. It is an information gathering and reporting application for healthcare.

The second module (described above), which serves formatted XML as a web service, reads patient and provider data from the database, builds the XML, and then serves it through a web service, and vice versa.

The first module, which converts unformatted XML documents to formatted documents, acts as a gateway for data synchronization between our database and our partners' databases. In fact, unformatted XML is only one type of document that is accepted. Other types include text files and Excel sheets.

The unformatted XML documents can potentially get very large. The data itself can change, because it lives in the database and can be updated and deleted. So the formatted XML documents cannot be static; they have to be generated on the fly.

Which technology do you think could be used?
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13078
"The size of the unformatted XML documents can potentially get so large."

I am guessing that means you cannot count on being able to create a DOM in memory. Having to process this large XML file plus a mix of other possible document types suggests that you will end up with a streaming approach like SAX.
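
As a rough illustration of what a streaming approach looks like, the sketch below uses the JDK's SAX parser to hand each value to a callback as soon as its element closes, so no tree of the whole document is ever built. The element name `name` and the `forEachName` helper are hypothetical, picked only for the example.

```java
import java.io.StringReader;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class SaxStreamingSketch {

    // Parses the input as a stream and passes the text of every <name>
    // element to the sink. Only the current element's text is buffered.
    static void forEachName(InputSource in, Consumer<String> sink) throws Exception {
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside <name>

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if ("name".equals(qName)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times for one element, so append.
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("name".equals(qName)) {
                    sink.accept(text.toString());
                    text = null;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<patients><patient><name>Ann</name></patient>"
                   + "<patient><name>Bob</name></patient></patients>";
        forEachName(new InputSource(new StringReader(xml)), System.out::println);
    }
}
```

For a large file the `InputSource` would wrap a file or network stream instead of a string, and the sink would write each record onward (to a database or an output writer) rather than print it.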

I suggest you look into some of the "pipeline" style tools. I did two survey articles on XML pipelines: part one and part two.

The ServingXML open source toolkit does a lot of things that sound similar to your problem.

Let us know what you come up with. The general problem of restructuring large data sets into good XML comes up a lot in this forum and in the XML forum.

Bill
 