SAXBuilder builder = new SAXBuilder();
Document doc = builder.build(fileName);
The XML file I receive as input has been modified by the end user. The user adds data to it, and my functionality then loads that data into the system. In this scenario I get a JDOM parser exception: "Document root element is missing."
The reason is that the editor used to modify the XML prepends some characters to the beginning of the file, and the parser fails when it reads them.
Is there any way I can handle this problem, either by removing these characters or by ignoring them? The characters are not predictable; they might also be garbage values.
If you can't change the process to avoid the prepended characters but you can detect some rhyme or reason to them, you can make this approach more robust.
I don't believe your users are keying in those few non-printable characters which are not allowed in XML documents. I think this "garbage" you're talking about is more likely to be missing ">" characters at the end of tags or unescaped ampersands or just plain fat-finger mistyping. There's no way to fix that sort of thing via a preprocessor.
But if you're stuck with people modifying XML by hand, then give them an XML editor to do that with. Don't let them use Notepad or something like that. Make sure they send you well-formed XML.
Maybe the XML parser that you are using does not understand the BOM and throws an exception.
Does anybody have any thoughts about this?
Vinod Borole wrote:I guess the special characters that you are mentioning are called the Byte Order Mark (BOM). These characters are inserted by the editor at the start of the file when you edit the XML file.
This is definitely a possibility. Notepad has a bad habit of doing that. So if you gave people a proper XML editor to edit their XML with, that wouldn't happen.
Sure you can handle it in code. You just wrap the InputStream that contains the XML (with possible byte order mark) in another InputStream which skips over anything preceding the first "<" character when it's created. PushbackInputStream is a useful basis for that.
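Here is a rough sketch of that wrapper idea, not a tested production class. The class name LeadingJunkSkippingInputStream is made up for illustration, and the sketch assumes the file is UTF-8 or another single-byte-compatible encoding. For UTF-16 files, discarding bytes up to the first 0x3C could misalign the stream, so that case would need different handling.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

/**
 * Hypothetical helper: discards every byte that precedes the first '<'
 * (a byte order mark or other junk) when the stream is created.
 * Assumes UTF-8 or a single-byte encoding.
 */
class LeadingJunkSkippingInputStream extends FilterInputStream {

    LeadingJunkSkippingInputStream(InputStream source) throws IOException {
        // A one-byte pushback buffer is enough to put the '<' back.
        super(new PushbackInputStream(source, 1));
        PushbackInputStream pushback = (PushbackInputStream) this.in;

        int b;
        while ((b = pushback.read()) != -1) {
            if (b == '<') {
                // Found the start of the markup: return it to the stream
                // so the parser sees it as the first byte.
                pushback.unread(b);
                break;
            }
            // Anything else is leading junk and is dropped.
        }
        // If no '<' was found, the stream is now at end-of-file and the
        // parser will report its usual error.
    }
}
```

With JDOM you would then pass the wrapped stream to the builder instead of the file name, for example builder.build(new LeadingJunkSkippingInputStream(new FileInputStream(fileName))). This only removes what comes before the first "<"; it does nothing for damage inside the document.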
Aditya Bhanose wrote:Yes, my problem is only about the byte order mark. I think there is no way we can handle this in code.
But don't underestimate the ability of end-users to botch up XML documents in other ways. I did it myself yesterday -- I made a little change to a configuration file which caused my application to fail to start up correctly.