I'm working on a routine that will display a JTree generated from an XML file. The XML file is potentially gigantic, as it is an application log file. I'm thinking of using SAX to filter out unwanted records, i.e. only DOM-ing the last x number of entries. Wondering if anyone has advice/resources for doing this type of thing. Any input is greatly appreciated. Many Regards, Michael
You could even use a stylesheet that filters the nodes you want from the XML: feed it into an XSLT engine and get the XML "subset" you want. (Remember, stylesheets transform one XML document into another XML document.) This way you don't have to do the SAX parsing yourself. Once the target subset of your XML document is generated, you can feed it to DOM. Sounds like a plan? ------------------ Ajith Kallambella M. Sun Certified Programmer for the Java 2 Platform. IBM Certified Developer - XML and Related Technologies, V1.
Open Group Certified Distinguished IT Architect. Open Group Certified Master IT Architect. Sun Certified Architect (SCEA).
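A minimal sketch of the stylesheet approach, assuming (purely for illustration) the log looks like `<log><entry>…</entry>…</log>` and we want only the last two entries; the class and element names here are made up, not from the original posts:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class LogFilter {
    // Hypothetical stylesheet: copies only the last 2 <entry> elements of <log>.
    private static final String XSLT =
        "<xsl:stylesheet version='1.0' "
      + "    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "  <xsl:output omit-xml-declaration='yes'/>"
      + "  <xsl:template match='/log'>"
      + "    <log><xsl:copy-of select='entry[position() &gt; last() - 2]'/></log>"
      + "  </xsl:template>"
      + "</xsl:stylesheet>";

    // Runs the transform and returns the filtered "subset" document as a string,
    // which could then be handed to a DOM parser.
    public static String filter(String xml) throws TransformerException {
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<log><entry>1</entry><entry>2</entry><entry>3</entry></log>";
        System.out.println(LogFilter.filter(xml));
    }
}
```

Note that a standard XSLT 1.0 engine still builds the whole source tree in memory before transforming, so for a truly gigantic log this only shifts the memory cost, it doesn't remove it.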
I'd use SAX and create your own "document" objects that are closely tied to the data structure of the log. That way you can still load the whole thing if you want to without running out of memory, since your objects would be "lighter" than a DOM Document. If your data set is still too big, I'd use SAX to ignore all but the most recent log entries (or some other meaningful subset). Hope this helps. ------------------ Chris Stehno (Sun Certified Programmer for the Java 2 Platform)
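A sketch of that SAX idea, again assuming the hypothetical `<log><entry>…</entry></log>` shape: the handler keeps only the text of the last N entries in a bounded deque, so memory stays constant no matter how large the file is. The class name `TailHandler` and the element names are my own illustration, not from the posts above:

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Keeps only the last `max` <entry> texts; older entries are discarded as
// parsing streams along, so the whole document is never held in memory.
public class TailHandler extends DefaultHandler {
    private final int max;
    private final Deque<String> tail = new ArrayDeque<>();
    private StringBuilder text;  // non-null only while inside an <entry>

    public TailHandler(int max) { this.max = max; }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        if ("entry".equals(qName)) text = new StringBuilder();
    }

    @Override
    public void characters(char[] ch, int start, int len) {
        if (text != null) text.append(ch, start, len);
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        if ("entry".equals(qName)) {
            tail.addLast(text.toString());
            if (tail.size() > max) tail.removeFirst();  // drop the oldest entry
            text = null;
        }
    }

    public Deque<String> entries() { return tail; }

    public static void main(String[] args) throws Exception {
        String xml = "<log><entry>a</entry><entry>b</entry><entry>c</entry></log>";
        TailHandler h = new TailHandler(2);
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), h);
        System.out.println(h.entries());
    }
}
```

The retained entries (or richer per-entry objects built the same way) could then be fed into the JTree model directly, without ever building a DOM.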
I just mentioned awk to remind people that sometimes you don't need the big, powerful tools when simple ones will do the job. If all you are interested in is the contents of one particular element, and that element is always on a line of its own, awk would be a good choice for extracting the data. It doesn't really understand XML, and it is vulnerable to changes in the XML formatting, but it's fast and easy ...
http://188.8.131.52/developerlife/parsertest2/performance.html has performance statistics for reading large XML files with various parsers. According to their testing, the IBM parser will read XML files up to about 20 MB in size; beyond that, an "OutOfMemoryError" is thrown. By contrast, with the Oracle XML parser I am able to repeatedly (20 - 25 times) create a DOM, process it, and discard it for a file under 1 MB before getting an "OutOfMemoryError".