Slow SAX parsing with Java 1.4

 
Greenhorn
Posts: 14
I've been using the Xerces SAX parsers with good results, but I decided to try the built-in Java 1.4 parsers. I was surprised at the awful parse times when I turned on validation. Here's one example:
Xerces 2.0.1:
Without Validation = 0.7 seconds
With Validation = 1.2 seconds
Java 1.4:
Without Validation = 0.3 seconds
With Validation = 22.7 seconds
I had heard that Crimson, the parser on which the Java 1.4 implementation is based, was fast, and I found that to be true without validation. With validation, though, it's prohibitively slow for my application.
Is this consistent with others' results? Any chance something else, not obvious, is going on here?
Ron
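For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of timing a SAX parse with and without validation through the standard JAXP API. The class name `SaxTiming`, the `timeParse` helper, and the tiny sample document with an internal DTD are all made up for illustration; they are not the files measured above, and the numbers will depend on your JDK and on which parser JAXP picks up.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxTiming {
    // Hypothetical sample document; the internal DTD gives the
    // validating parser something to check.
    static final String SAMPLE =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE note [\n"
        + "  <!ELEMENT note (to, body)>\n"
        + "  <!ELEMENT to (#PCDATA)>\n"
        + "  <!ELEMENT body (#PCDATA)>\n"
        + "]>\n"
        + "<note><to>Ron</to><body>hello</body></note>\n";

    // Parses the given XML once and returns the elapsed time in nanoseconds.
    // Only the parse call is timed, not the factory and parser setup.
    static long timeParse(String xml, boolean validating) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(validating);
            SAXParser parser = factory.newSAXParser();
            long start = System.nanoTime();
            // DefaultHandler ignores all content events and non-fatal errors.
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return System.nanoTime() - start;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Repeat a few times so class loading and JIT warm-up
        // do not dominate the first measurement.
        for (int run = 0; run < 5; run++) {
            long plain = timeParse(SAMPLE, false);
            long validated = timeParse(SAMPLE, true);
            System.out.println("run " + run
                + ": without validation = " + plain / 1_000_000.0 + " ms"
                + ", with validation = " + validated / 1_000_000.0 + " ms");
        }
    }
}
```

To compare parsers, run the same class with a different parser implementation on the classpath. Timing a realistic document is what matters; a document this small mostly measures startup cost.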
 
Sheriff
Posts: 5782
I have seen that. The JAXP bundled with 1.4 is noticeably slower than its predecessor, i.e., plain Crimson with 1.3. To a certain extent the slowness can be attributed to the 'fat' JDK 1.4 bundle, which has about 60% more core classes than 1.3.
You should bear in mind that validation in general is an expensive operation. Almost every parser performs poorly when validation is turned on.
If you must validate your document but response time is critical, I recommend that you check out the Multi Schema Validator (MSV) from Sun. It is a lightweight validator for checking constraints against a schema definition. Since it makes use of an "Abstract Grammar Language" instead of standard XML parsing with validation, the performance is great even for large documents. What's really cool is that you can feed the document to MSV before you feed it to the parser! Simply check well-formedness with any parser and then check validity with MSV.
Just my two cents worth....
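The first half of that two-step approach, checking well-formedness with a plain non-validating parser, can be sketched roughly like this using standard JAXP. The class and method names (`WellFormedCheck`, `isWellFormed`) are hypothetical, and the MSV validity check is deliberately left out here because its API is not shown in this thread.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedCheck {
    // Returns true if the XML parses without a fatal error
    // using a non-validating SAX parser.
    static boolean isWellFormed(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(false); // well-formedness only, no DTD validation
            factory.newSAXParser()
                   .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            // A well-formedness violation is reported as a fatal error,
            // which DefaultHandler rethrows as a SAXException.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<a><b/></a>"));  // prints true
        System.out.println(isWellFormed("<a><b></a>"));   // prints false
        // Step two (not shown): hand the document to a separate validator
        // such as MSV to check it against the schema.
    }
}
```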
 
author and deputy
Posts: 3150
I too recommend MSV when it comes to performance.
For me, it took 9 seconds to validate a small XML file against a 2 MB XSD (which includes three more XSDs of 2.5 KB, 3 KB, and 1.75 MB).
Online XSV -- 9-15 secs
XMLSPY -- Poof!! System Hangup
Turbo XML -- Could not even open the XSD successfully
Xerces -- 50 mins (Really!!)
Regards
Balaji

Originally posted by Ajith Kallambella:
I have seen that. JAXP bundled with 1.4 is ...

 