DOM and large XML

 
Ranch Hand
Posts: 297
I'm working on a routine that displays a JTree generated from an XML file. The XML file is potentially gigantic, since it's an application log file. I'm thinking of using SAX to filter out unwanted records, i.e., only DOM-ing the last x entries.
Wondering if anyone has advice or resources for doing this type of thing. Any input is greatly appreciated.
Many Regards,
Michael
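
A minimal sketch of that plan, assuming a hypothetical format where each record is an <entry> element under a <log> root: one SAX pass keeps only the last N entries in a bounded buffer, and the kept records are re-wrapped and parsed into a small DOM. For brevity it preserves only the text content of each entry (assumed free of markup characters); a real version would also re-serialize attributes and child elements.

import java.io.File;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// SAX pass that keeps only the last N <entry> records, then DOMs the subset.
public class TailFilter extends DefaultHandler {
    private final int max;
    private final Deque<String> kept = new ArrayDeque<>();
    private StringBuilder current; // non-null while inside an <entry>

    TailFilter(int max) { this.max = max; }

    @Override public void startElement(String uri, String local, String qName, Attributes atts) {
        if ("entry".equals(qName)) current = new StringBuilder("<entry>");
    }

    @Override public void characters(char[] ch, int start, int len) {
        if (current != null) current.append(ch, start, len);
    }

    @Override public void endElement(String uri, String local, String qName) {
        if ("entry".equals(qName) && current != null) {
            kept.addLast(current.append("</entry>").toString());
            if (kept.size() > max) kept.removeFirst(); // drop the oldest record
            current = null;
        }
    }

    Document lastEntriesAsDom() throws Exception {
        StringBuilder xml = new StringBuilder("<log>");
        for (String entry : kept) xml.append(entry);
        xml.append("</log>");
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml.toString())));
    }

    public static void main(String[] args) throws Exception {
        TailFilter filter = new TailFilter(100);
        SAXParserFactory.newInstance().newSAXParser().parse(new File(args[0]), filter);
        Document doc = filter.lastEntriesAsDom(); // small DOM: only the last 100 entries
        System.out.println(doc.getDocumentElement().getChildNodes().getLength() + " entries kept");
    }
}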
 
Sheriff
Posts: 5782
You could even use a stylesheet that filters out just the nodes you want from the XML: feed the document into an XSLT engine and get the XML "subset" you want. (Remember, stylesheets transform one XML document into another.)
This way you don't have to do the SAX parsing yourself.
Once the target subset of your XML document is generated, you can feed it to DOM.
Sounds like a plan?
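
A minimal sketch of that idea with the standard JAXP/TrAX API, assuming the same hypothetical <log>/<entry> structure; the embedded stylesheet copies only the last 100 entries into a fresh document, which comes back directly as a DOM:

import java.io.File;
import java.io.StringReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;

public class XsltSubset {
    // Hypothetical stylesheet: copy the <log> root but only its last 100 <entry> children.
    private static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>" +
        "  <xsl:template match='/log'>" +
        "    <log><xsl:copy-of select='entry[position() &gt; last() - 100]'/></log>" +
        "  </xsl:template>" +
        "</xsl:stylesheet>";

    public static void main(String[] args) throws Exception {
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
        DOMResult result = new DOMResult();            // transform straight into a DOM
        t.transform(new StreamSource(new File(args[0])), result);
        Document subset = (Document) result.getNode(); // the filtered "subset" document
        System.out.println(subset.getDocumentElement().getChildNodes().getLength() + " entries kept");
    }
}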
------------------
Ajith Kallambella M.
Sun Certified Programmer for the Java 2 Platform.
IBM Certified Developer - XML and Related Technologies, V1.
 
Sheriff
Posts: 6920
I don't think the stylesheet would do what you want. XSLT pretty much requires DOM, so the whole document would still be loaded in. Preprocessing with SAX (or awk) sounds like a good idea, though.
 
Michael Hildner
Ranch Hand
Posts: 297
I appreciate the input, as I'm new to this. Thank you.
I don't understand the reference to awk, though... is it a joke?
 
Ranch Hand
Posts: 180
I'd use SAX and create custom "document" objects that are closely tied to the data structure of the log. That way you can still load the whole thing, if you want to, without running out of memory; your objects would be "lighter" than the DOM Document. If your data set is still too big, I'd use SAX to ignore all but the most recent log entries (or whatever else is meaningful).
Hope this helps.
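
A sketch of what those lighter objects might look like, with made-up fields (timestamp, level, message) standing in for whatever the real log format contains:

import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// One lightweight record per log entry -- far smaller than a DOM subtree.
class LogEntry {
    String timestamp, level, message;
}

public class LogHandler extends DefaultHandler {
    final List<LogEntry> entries = new ArrayList<>();
    private LogEntry entry;
    private final StringBuilder text = new StringBuilder();

    @Override public void startElement(String uri, String local, String qName, Attributes atts) {
        if ("entry".equals(qName)) entry = new LogEntry();
        text.setLength(0); // reset the text buffer for the new element
    }

    @Override public void characters(char[] ch, int start, int len) {
        text.append(ch, start, len);
    }

    @Override public void endElement(String uri, String local, String qName) {
        if (entry == null) return;
        switch (qName) {
            case "timestamp": entry.timestamp = text.toString().trim(); break;
            case "level":     entry.level     = text.toString().trim(); break;
            case "message":   entry.message   = text.toString().trim(); break;
            case "entry":     entries.add(entry); entry = null;         break;
        }
    }

    public static void main(String[] args) throws Exception {
        LogHandler handler = new LogHandler();
        SAXParserFactory.newInstance().newSAXParser().parse(new java.io.File(args[0]), handler);
        System.out.println(handler.entries.size() + " entries loaded");
    }
}

A custom TreeModel could then sit straight on that List to back the JTree, with no DOM involved at all.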
------------------
Chris Stehno (Sun Certified Programmer for the Java 2 Platform)
 
Frank Carver
Sheriff
Posts: 6920
I just mentioned awk to remind people that sometimes you don't need the big, powerful tools when simple ones will do the job. If all you are interested in is the contents of one particular element, and that element is always on a line of its own, awk is a good choice for extracting the data. It doesn't really understand XML, and it is vulnerable to changes in the XML formatting, but it's fast and easy...
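
A Java-only sketch of the same trick, under the same assumption (a hypothetical <message> element that always sits on its own line); like awk, it knows nothing about XML and breaks if the formatting changes:

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GrepElement {
    public static void main(String[] args) throws Exception {
        // Assumes each <message>...</message> sits on its own line, like awk would.
        Pattern p = Pattern.compile("<message>(.*)</message>");
        try (BufferedReader in = new BufferedReader(new FileReader(args[0]))) {
            String line;
            while ((line = in.readLine()) != null) {
                Matcher m = p.matcher(line);
                if (m.find()) System.out.println(m.group(1)); // print the element's contents
            }
        }
    }
}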
 
Greenhorn
Posts: 4
http://65.1.136.127/developerlife/parsertest2/performance.html has performance statistics for reading large XML files with various parsers. According to their testing, the IBM parser will read an XML file up to 20 MB; beyond that, an OutOfMemoryError is thrown. With the Oracle XML parser, though, I am only able to create a DOM, process it, and discard it 20-25 times for a file under 1 MB before getting an OutOfMemoryError.
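
For reference, that create-process-discard loop looks roughly like this sketch (the file name and iteration count are placeholders, and normalize() stands in for real work):

import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class ChurnTest {
    public static void main(String[] args) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        for (int i = 0; i < 25; i++) {
            Document doc = builder.parse(new File(args[0])); // create DOM
            process(doc);                                    // process it
            doc = null; // discard: drop the only reference so the tree can be GC'd
            System.out.println("pass " + (i + 1) + " ok");
        }
    }

    static void process(Document doc) {
        doc.getDocumentElement().normalize(); // stand-in for real processing
    }
}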
 