Search large XML files

 
Ranch Hand
Posts: 89
I need to search XML files that could be as large as 1 MB. The user types in one or more words, and I search the whole XML document for occurrences of them. The search has to cover everything: the nodes, the attributes, and the text/data.
I have tried both an XML DOM and SAX, and the search sometimes takes a couple of minutes.
What is the fastest way to search an XML document? What can I do to speed up my search times?
 
High Plains Drifter
Posts: 7289
DOM of course is going to take a lot of time up front, building a tree in memory. SAX essentially guarantees a method call on every element. Both approaches are predicated on the idea that context is as important as the word you want to find.
If all you really want to do is find a string, don't use either tool. Put the parser tools away, and just use regular expression pattern matching to find it.
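For what it's worth, here is a rough sketch of the "skip the parser" idea, assuming the file is UTF-8 and fits comfortably in memory (1 MB certainly does). The class name `RawXmlSearch` and the method `findOffsets` are made up for illustration. `Pattern.quote` is there so that whatever the user types is matched literally instead of being interpreted as regex syntax.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: treats the XML file as plain text and never parses it.
public class RawXmlSearch {

    /** Returns the character offset of every case-insensitive occurrence of word in text. */
    static List<Integer> findOffsets(String text, String word) {
        // Quote the input so characters like '.' or '*' are matched literally.
        Pattern pattern = Pattern.compile(Pattern.quote(word), Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(text);
        List<Integer> offsets = new ArrayList<>();
        while (matcher.find()) {
            offsets.add(matcher.start());
        }
        return offsets;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: java RawXmlSearch <file.xml> <word>");
            return;
        }
        // Assumption: the document is encoded as UTF-8.
        String xml = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        List<Integer> hits = findOffsets(xml, args[1]);
        System.out.println(hits.size() + " match(es) at offsets " + hits);
    }
}
```

Because this scans the raw text, it finds the word in tag names, attributes and character data alike, but it also sees entity references and CDATA markers exactly as written in the file, and it cannot tell you which element a hit belongs to.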
 
Saloon Keeper
Posts: 27763
... Or consider an XML-structured DBMS.
 
Sheriff
Posts: 5782
A combination of JDOM with custom data structures to facilitate search/lookup (e.g. hashtables) can perform a lot better than plain DOM.
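A minimal sketch of the lookup-table idea, with one substitution: it uses the SAX parser that ships with the JDK instead of JDOM, so it runs without extra libraries. The document is read once, and every word found in element names, attribute names, attribute values and text goes into a `HashMap` keyed by the lowercased word. The class name `XmlWordIndex` and the path-like location strings are invented for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical index: word -> set of locations where the word occurs.
public class XmlWordIndex extends DefaultHandler {
    private final Map<String, Set<String>> index = new HashMap<>();
    private final Deque<String> path = new ArrayDeque<>();
    private final StringBuilder text = new StringBuilder();

    /** Parses the document once and returns the filled index. */
    static XmlWordIndex build(String xml) {
        XmlWordIndex handler = new XmlWordIndex();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        return handler;
    }

    /** Case-insensitive lookup; returns an empty set when the word is absent. */
    Set<String> lookup(String word) {
        return index.getOrDefault(word.toLowerCase(Locale.ROOT), Collections.<String>emptySet());
    }

    private String currentPath() {
        return "/" + String.join("/", path);
    }

    // Splits on anything that is not a letter or digit and records each token.
    private void add(String value, String where) {
        for (String token : value.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                index.computeIfAbsent(token, k -> new LinkedHashSet<>()).add(where);
            }
        }
    }

    // SAX may deliver text in several chunks, so buffer it and index it in one go.
    private void flushText() {
        if (text.length() > 0) {
            add(text.toString(), currentPath() + "/text()");
            text.setLength(0);
        }
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        flushText();
        path.addLast(qName);
        add(qName, currentPath());
        for (int i = 0; i < atts.getLength(); i++) {
            String where = currentPath() + "/@" + atts.getQName(i);
            add(atts.getQName(i), where);
            add(atts.getValue(i), where);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        flushText();
        path.removeLast();
    }
}
```

The parse cost is paid once per document. After that, `XmlWordIndex.build(xml).lookup("java")` is a single hash lookup per search word, which only pays off if the same document is searched repeatedly. Note that this matches whole words only, not substrings.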
 