I am using the Apache Xerces parser 2.6.2 to validate some XML documents against a W3C XML Schema document (XSD). The XSD contains a few complex rules and semantics that implement the business logic of the application.
Whenever I give the parser an invalid XML file (one that does not conform to my schema), it throws a SAXParseException. My error handler retrieves these errors and displays them in the user interface.
For example:
org.xml.sax.SAXParseException: cvc-minInclusive-valid: Value '-240' is not facet-valid with respect to minInclusive '1'.
org.xml.sax.SAXParseException: cvc-minInclusive-valid: Value '-400' is not facet-valid with respect to minInclusive '1'.
org.xml.sax.SAXParseException: cvc-maxInclusive-valid: Value '40000000' is not facet-valid with respect to maxInclusive '1500'.
As you can see, the error messages are not very useful. I want to get the name of the node in the XML file that caused the problem. If I can get the name of the node/element, I will be able to display more meaningful error messages to the user, such as:
"The value entered for cost is invalid: -240", and so on.
Is there any way to extract more information from SAXParseException, so that I can get the node's name and/or value? The users are not necessarily XML specialists, so I want to show them messages that make sense from a business perspective rather than technical jargon.
SAXParseException does provide the methods getLineNumber() and getColumnNumber(). If you have the XML file at hand, you could at least include, say, a hundred characters around the erroneous line/column in what you show to the user.
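A minimal sketch of that idea, assuming the XML text is still available as a String when the error is reported. The class name ErrorContext and its two methods are made up for illustration; only getLineNumber(), getColumnNumber() and getMessage() come from the SAX API.

```java
import org.xml.sax.SAXParseException;

// Hypothetical helper: quotes the source line that a SAXParseException points at.
public class ErrorContext {

    // Returns the trimmed source line the exception refers to, or "" when the
    // parser supplied no usable line number (SAX reports -1 in that case).
    public static String offendingLine(String xmlText, SAXParseException e) {
        int line = e.getLineNumber();               // 1-based
        String[] lines = xmlText.split("\r\n|\r|\n", -1);
        if (line < 1 || line > lines.length) {
            return "";
        }
        return lines[line - 1].trim();
    }

    // Builds a message that combines the position, the parser's own text,
    // and the quoted source line.
    public static String describe(String xmlText, SAXParseException e) {
        return "Line " + e.getLineNumber() + ", column " + e.getColumnNumber()
                + ": " + e.getMessage()
                + " [near: " + offendingLine(xmlText, e) + "]";
    }
}
```

This only quotes the raw text near the error; it does not know which element is involved, so it is a fallback rather than an answer to the question.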
I am aware of the line number and column number methods available through the SAXParseException class. However, what I am after is some method that gives me the name of the node that triggered the parse error. One possible solution is to write a custom handler that uses the line number to go back to the XML file and extract the name of the node where the problem occurred.
Seems to me that you could have your SAX startElement method store the name of the "current" node as an instance variable and use that in your exception logging in addition to the line and column numbers.
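The suggestion above could be sketched roughly as follows. ElementTrackingHandler and getLastMessage() are hypothetical names, and a stack is used instead of a single variable so that the "current" element falls back to the parent once a child has been closed. One caveat: a validator may report an attribute error before startElement is delivered for the element that owns the attribute, so the name should be treated as a hint.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import org.xml.sax.Attributes;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: remembers which elements are open so that an error
// callback can name the element being processed when the error arrived.
public class ElementTrackingHandler extends DefaultHandler {

    // Names of the currently open elements, innermost first.
    private final Deque<String> openElements = new ArrayDeque<>();
    private String lastMessage;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        openElements.push(qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        openElements.pop();
    }

    @Override
    public void error(SAXParseException e) {
        String element = openElements.isEmpty() ? "(unknown)" : openElements.peek();
        lastMessage = "The value entered for " + element + " is invalid (line "
                + e.getLineNumber() + ", column " + e.getColumnNumber() + "): "
                + e.getMessage();
    }

    // Most recent message built by error(), or null if no error was reported.
    public String getLastMessage() {
        return lastMessage;
    }
}
```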
Originally posted by William Brogden: Seems to me that you could have your SAX startElement method store the name of the "current" node as an instance variable and use that in your exception logging in addition to the line and column numbers.
Doh! So obvious. That's definitely the best way to do it.
Pardon my ignorance, but where is the startElement method implemented in Apache Xerces's SAXParser? All I am using is the SAXParser.parse() method, and I have set the error handler to my custom error handler using the following property,
Ah - that is the whole trick about SAX - the parser needs to be connected to one or more handlers that you, the programmer, define. The parser can be thought of as "firing events" that can be handled by - in this case - a startElement method. So in SAX parsing the parser is only concerned with locating the various parts of an XML document and giving your handler a shot at using them. The exact classes and interfaces involved have gone through a lot of evolution during the last few years. Look for the JavaDocs for org.xml.sax.helpers.DefaultHandler in the Java 1.4 and 1.5 standard library. Bill
No no no - it means you have to extend DefaultHandler (org.xml.sax.helpers package) and attach your custom event handler to the SAX parser. See the various parse() methods in SAXParser. The parser will reliably deliver events (method calls) to the event handling object you attach. Just like the error handler but for content events. Bill
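As a rough illustration of attaching such a handler, here is a sketch that assumes a JAXP 1.3 or later runtime (Java 5+), where SAXParserFactory.setSchema() and javax.xml.validation are available. That differs from the Xerces 2.6.2 property-based setup mentioned earlier in the thread. ValidatingParseDemo, ReportingHandler and validate() are made-up names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: validates XML text against XSD text and collects
// messages that name the element being processed when each error arrived.
public class ValidatingParseDemo {

    // One object receives both content events and validation errors.
    static class ReportingHandler extends DefaultHandler {
        final List<String> messages = new ArrayList<>();
        private String currentElement = "(none)";

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            currentElement = qName;   // remember the most recently opened element
        }

        @Override
        public void error(SAXParseException e) {
            messages.add("Invalid value in <" + currentElement + "> at line "
                    + e.getLineNumber() + ": " + e.getMessage());
        }
    }

    public static List<String> validate(String xsd, String xml) throws Exception {
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsd)));

        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setSchema(schema);    // turns on schema validation for parsers it creates

        SAXParser parser = factory.newSAXParser();
        ReportingHandler handler = new ReportingHandler();
        // parse(InputSource, DefaultHandler) registers the handler for content
        // events and for error events in one call.
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.messages;
    }
}
```

Calling validate() with a schema that restricts cost to a minimum of 1 and a document containing <cost>-240</cost> should produce messages that mention <cost>. A parser may report more than one error for the same bad value, so the list can contain several entries per element.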