Processing XML as "raw" Strings

Posts: 24
A question for the seasoned XMLers here:

The company that I work for does a lot of XML processing. Mainly, they generate XML content from legacy data in the form of text files, using hand-crafted Java code built on String and StringBuffer methods such as append, substring, indexOf and replace.

I've seen this cause them many problems with namespaces and with referencing elements and attributes correctly. For example, when you call String.indexOf("myAttribute"), how do you know that you have found the correct element? "myAttribute" has no guarantee of uniqueness within a document. Moreover, in general I like to use well-defined APIs where they are proven to work well.
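To make the indexOf concern concrete, here is a minimal sketch. The sample document, the class name and the method names are all made up for illustration. It compares a textual search with a parser-based lookup that asks for the attribute on one specific element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class IndexOfPitfall {
    // Hypothetical sample: "myAttribute" appears in a comment, on <other> and on <target>.
    static final String XML =
        "<root>"
      + "<!-- myAttribute is documented elsewhere -->"
      + "<other myAttribute=\"wrong\"/>"
      + "<target myAttribute=\"right\"/>"
      + "</root>";

    /** Textual lookup: returns the value after the FIRST match, whichever element owns it. */
    static String naiveLookup(String xml) {
        String marker = "myAttribute=\"";
        int start = xml.indexOf(marker) + marker.length();
        return xml.substring(start, xml.indexOf('"', start));
    }

    /** Parser-based lookup: reads the attribute from the first <target> element. */
    static String parsedLookup(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            Element target = (Element) doc.getElementsByTagName("target").item(0);
            return target.getAttribute("myAttribute");
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(naiveLookup(XML));  // prints "wrong"
        System.out.println(parsedLookup(XML)); // prints "right"
    }
}
```

The textual search has no notion of which element it is inside, so it silently returns the value from the wrong one. DOM is used here only because it keeps the sketch short; the same point holds for any real parser.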

The people in the company always cite performance concerns, like "DOM is slow and uses too much memory". I know that DOM, SAX and data binding all have their merits and weaknesses, and thus their areas of application, but I can see no justification, when programming in Java, for building XML documents by hand. I am just wondering whether anyone else out there has had similar experiences, and whether anyone can make the case for a situation in which processing or generating XML documents by hand would be preferable to building a DOM or generating SAX events?

I'm just very interested in your comments and experiences and thoughts and opinions... many thanks in advance, Jonathan.
Posts: 24939
I've generated XML by hand in the past and it's surprisingly difficult: getting the ampersands and so on in text nodes escaped properly, making sure that unusual characters are output in the correct encoding (the cent sign was the killer here), and so on. Nowadays I don't do that any more; I use code written by XML professionals (SAX, DOM, as you say).
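As a rough illustration of the escaping and encoding problem, the sketch below builds the same element twice: once by string concatenation and once with the JDK's StAX XMLStreamWriter. StAX is my choice for brevity and is not something mentioned in this thread; the sample text, the element name and the class and method names are likewise assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;
import org.xml.sax.helpers.DefaultHandler;

class EscapingDemo {
    // Hypothetical text containing a cent sign (U+00A2), an ampersand and angle brackets.
    static final String TEXT = "5\u00A2 & <cheap>";

    /** Hand-built XML: the raw text is pasted in without any escaping. */
    static byte[] byHand(String text) {
        return ("<price>" + text + "</price>").getBytes(StandardCharsets.UTF_8);
    }

    /** Library-built XML: the writer escapes markup characters and handles the encoding. */
    static byte[] withWriter(String text) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            XMLStreamWriter writer =
                XMLOutputFactory.newInstance().createXMLStreamWriter(out, "UTF-8");
            writer.writeStartDocument("UTF-8", "1.0");
            writer.writeStartElement("price");
            writer.writeCharacters(text);
            writer.writeEndElement();
            writer.writeEndDocument();
            writer.close();
            return out.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException("Could not write XML", e);
        }
    }

    /** Parses the bytes and returns the root element's text, or null if parsing fails. */
    static String parseText(byte[] xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler()); // fatal errors are thrown, not printed
            return builder.parse(new ByteArrayInputStream(xml))
                          .getDocumentElement().getTextContent();
        } catch (Exception e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Does each version parse, and does the text survive the round trip?
        System.out.println("by hand parses: " + (parseText(byHand(TEXT)) != null));
        System.out.println("writer round-trips: " + TEXT.equals(parseText(withWriter(TEXT))));
    }
}
```

Parsing the output back is the useful check here: it tests that a consumer gets the original text, rather than that the bytes match some expected string.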

I would take the performance allegations head-on. It's possible that writing your XML via SAX events could be faster than the hand-crafted code you're already using, because the writers of Xerces et al. have spent years optimizing their code. Try converting an existing program and doing the comparison; you might be surprised.

I don't find DOM particularly easy to work with, but fortunately I haven't had to use it. All of the generating of XML I have had to do is converting non-XML sources into XML documents sequentially, and SAX works great for that. I can't imagine writing my own code for something that really needs DOM, though.
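For anyone who hasn't seen it, here is a minimal sketch of that sequential style, assuming the JDK's TransformerHandler as the thing that receives the SAX events and serializes them. The "key=value" input format, the element and attribute names, and the class name are invented for the example; real legacy files will need their own parsing.

```java
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.helpers.AttributesImpl;

class LegacyToXml {
    /**
     * Converts "key=value" lines (a made-up legacy format) into XML by firing
     * SAX events one record at a time. Assumes every line contains an '='.
     */
    static String convert(List<String> lines) {
        try {
            SAXTransformerFactory factory =
                (SAXTransformerFactory) TransformerFactory.newInstance();
            TransformerHandler handler = factory.newTransformerHandler();
            handler.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            handler.setResult(new StreamResult(out));

            handler.startDocument();
            handler.startElement("", "records", "records", new AttributesImpl());
            for (String line : lines) {
                int eq = line.indexOf('=');
                AttributesImpl attrs = new AttributesImpl();
                attrs.addAttribute("", "name", "name", "CDATA", line.substring(0, eq));
                handler.startElement("", "record", "record", attrs);
                // The serializer escapes '&' and '<' in character data for us.
                char[] value = line.substring(eq + 1).toCharArray();
                handler.characters(value, 0, value.length);
                handler.endElement("", "record", "record");
            }
            handler.endElement("", "records", "records");
            handler.endDocument();
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not generate XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(convert(Arrays.asList("owner=Smith & Sons", "note=a<b")));
    }
}
```

Nothing is held in memory beyond the current line, which is why this style suits large sequential conversions, and the escaping is done by the serializer rather than by the calling code.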