Hi,
To start with, sorry about the length of the post. I can't get to a file-hosting site because my overlords have locked down the Internets, and I am not sneaky enough to find a way around the blocking.
Stack
JDK version build 1.6.0_30-b12
Xerces-J 2.6.2
I am working on a project that parses XML input and writes it to a POJO. To do this we use the internal Xerces to parse the XML and a Handler to write to the POJO. We pretty much need the whole tree to be parsed before we work with the results, and we don't use a schema.
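Roughly, the setup has the shape below. This is a minimal sketch rather than our actual code: the names PojoSketch, Message and MessageHandler and the <body> element are made up for illustration, and obtaining the parser through the JAXP SAXParserFactory is an assumption.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class PojoSketch {

    // Hypothetical POJO; the real one has more fields.
    static class Message {
        String body;
    }

    // Hypothetical handler: copies the text of <body> into the POJO.
    static class MessageHandler extends DefaultHandler {
        private final Message message = new Message();
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            text.setLength(0); // start collecting text for the new element
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // SAX may deliver one text node in several chunks, so append rather than assign.
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("body".equals(qName)) {
                message.body = text.toString();
            }
            text.setLength(0);
        }

        Message getMessage() {
            return message;
        }
    }

    // Parses the XML string and returns the populated POJO.
    static Message parse(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        MessageHandler handler = new MessageHandler();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.getMessage();
    }

    public static void main(String[] args) throws Exception {
        Message m = parse("<message><body>a/b/c</body></message>");
        System.out.println(m.body);
    }
}
```

The sketch appends in characters() and only reads the buffer in endElement(), so a text node split across several callbacks is still reassembled in full.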
The issue shows up as a malformed body being returned to the POJO. It does not happen every time, which made the defect difficult to catch, but we have eventually been able to isolate an XML
string that consistently produces it.
From inspecting the source, the defect appears to occur in the
XMLEntityScanner.load function, in particular at line 1742, where it reads further from the XML String. By calling function
There seems to be an issue with the read operation when the boundary char is a "/", at which point the read operation loses its lunch-box and alters the xmlString offset and length members. If we add another 5 chars to any of the bodies prior to line 40 in the data, the issue goes away.
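One way to check this offset sensitivity in isolation is a padding probe. The following is only a sketch under assumptions, not the attached demo: PaddingProbe, the <filler> and <target> elements and the maximum padding are all invented for illustration. It grows the text before the target one character at a time and reports the first padding length at which the reassembled target text no longer matches what went in.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class PaddingProbe {

    private static final SAXParserFactory FACTORY = SAXParserFactory.newInstance();

    // Returns the text of the hypothetical <target> element, joining all characters() chunks.
    static String targetText(String xml) throws Exception {
        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            private boolean inTarget;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if ("target".equals(qName)) inTarget = true;
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (inTarget) text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("target".equals(qName)) inTarget = false;
            }
        };
        FACTORY.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        return text.toString();
    }

    // Returns the first padding length in 0..maxPad at which the parsed text differs, or -1 if none does.
    static int firstMismatch(String expected, int maxPad) throws Exception {
        StringBuilder pad = new StringBuilder();
        for (int n = 0; n <= maxPad; n++) {
            String xml = "<root><filler>" + pad + "</filler><target>" + expected + "</target></root>";
            if (!expected.equals(targetText(xml))) {
                return n;
            }
            pad.append('x'); // shift the target one character further into the input
        }
        return -1;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstMismatch("some/body/with/slashes", 10000));
    }
}
```

If the handler is reassembling chunks correctly, a result other than -1 would point at the parser rather than the handler; a result of -1 only says this particular input did not trigger anything.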
To demonstrate this defect I have included the parser and demo StandAloneBynamicHandler.java.
So my question really is: are we doing this correctly, or is there an underlying issue that I do not fully understand?
As this is in code that has been out there for years, I am very doubtful that we have stumbled upon a new defect, but I have been unable to find anything with specific reference to the above class.
I also do not think it is related to
https://coderanch.com/t/129602/XML/SAX-parser-character-call-back as this happens before the call to
characters(char ch[], int start, int length).
Thanks in advance
Allain