Read large xml file for searching

 
Ranch Hand
Posts: 82
Hi,

Can someone please help me? I have a large XML file (more than 12 MB). I have a search form where a user enters some data, and I need to search for that content in the XML file and display the results on the page in a data table. Since the file is large, what is the best approach to searching it, given that any number of users may be searching at the same time? The XML file is stored in a specific folder, and that never changes.

Is it OK to read the file data, store it in a static HashSet kept in memory, and search the HashSet? Or is there a better way?

Thanks
 
Rancher
Posts: 43081
Can you cache it as a DOM Document in memory? Then you could run XPath or XQuery queries over it without having to do any file I/O or parsing.
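
A minimal sketch of what querying an in-memory DOM with XPath could look like, using only the JAXP classes that ship with the JDK. The element names (`catalog`, `book`, `author`, `title`) and the class and method names are assumptions for illustration, since the thread does not show the real file's structure. The user's search term is bound as an XPath variable rather than concatenated into the expression.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathSearchSketch {

    /** Parses XML text into a DOM. The real application would parse the file on disk instead. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    /** Returns the titles of all books by the given author (hypothetical schema). */
    static List<String> findTitlesByAuthor(Document doc, String author) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Bind the search term as $author so user input never becomes part of the expression.
            xpath.setXPathVariableResolver(
                    name -> "author".equals(name.getLocalPart()) ? author : null);
            NodeList hits = (NodeList) xpath.evaluate(
                    "/catalog/book[author = $author]/title", doc, XPathConstants.NODESET);
            List<String> titles = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                titles.add(hits.item(i).getTextContent());
            }
            return titles;
        } catch (Exception e) {
            throw new IllegalStateException("XPath search failed", e);
        }
    }
}
```

The returned list can then be handed to whatever renders the data table.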
 
Mike Boota
Ranch Hand
Posts: 82
But what if many users are accessing the file? Do I have to load it into memory for each request, or can it be loaded into memory once at server startup?

thanks
 
Ulf Dittmer
Rancher
Posts: 43081
Caching means that you'd load the document once, and then keep it in memory. Subsequent accesses are faster that way.
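
One way to get "load once, keep in memory" is a small cache class that parses the file on first access and hands back the same `Document` afterwards. This is only a sketch: `XmlDocumentCache` is a hypothetical name, and a web application might instead do the loading in a startup hook of its framework.

```java
import java.nio.file.Path;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public final class XmlDocumentCache {

    // One parsed Document per file path, shared by all requests.
    private static final Map<Path, Document> CACHE = new ConcurrentHashMap<>();

    private XmlDocumentCache() {
    }

    /** Parses the file the first time it is requested and returns the cached DOM afterwards. */
    public static Document get(Path xmlFile) {
        Path key = xmlFile.toAbsolutePath().normalize();
        return CACHE.computeIfAbsent(key, XmlDocumentCache::parse);
    }

    private static Document parse(Path xmlFile) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(xmlFile.toFile());
        } catch (Exception e) {
            // A failed parse leaves the cache empty, so the next call will retry.
            throw new IllegalStateException("Could not parse " + xmlFile, e);
        }
    }
}
```

Every request would call `XmlDocumentCache.get(...)` with the same path; only the first call pays for file I/O and parsing.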
 
Ranch Hand
Posts: 2308
Originally posted by Mike Boota:
But what if many users are accessing the file? Do I have to load it into memory for each request, or can it be loaded into memory once at server startup?



As already suggested, you can read the XML into a DOM and then use XPath to get the required values. Once it is loaded, all XPath queries can be directed at the same DOM, so there is no need to load it again and again.
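
With many users hitting the same DOM, one point to watch is thread safety: the JDK documents `XPath` objects as not thread-safe, and DOM implementations generally do not promise safe concurrent access either. A cautious sketch is to funnel all queries through one object that serializes them with a lock. `SharedDomSearcher` and its methods are hypothetical names, not an existing API.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public final class SharedDomSearcher {

    private final Object lock = new Object();
    private final Document document;
    // Only ever used while holding the lock, because XPath objects are not thread-safe.
    private final XPath xpath = XPathFactory.newInstance().newXPath();

    public SharedDomSearcher(Document document) {
        this.document = document;
    }

    /** Convenience factory for trying the class out with inline XML. */
    public static SharedDomSearcher fromXml(String xml) {
        try {
            return new SharedDomSearcher(DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    /** Evaluates the expression against the shared DOM, one caller at a time. */
    public List<String> findText(String expression) {
        synchronized (lock) {
            try {
                NodeList hits = (NodeList) xpath.evaluate(
                        expression, document, XPathConstants.NODESET);
                // Copy the text out so callers never touch DOM nodes outside the lock.
                List<String> result = new ArrayList<>();
                for (int i = 0; i < hits.getLength(); i++) {
                    result.add(hits.item(i).getTextContent());
                }
                return result;
            } catch (XPathExpressionException e) {
                throw new IllegalArgumentException("Bad XPath: " + expression, e);
            }
        }
    }
}
```

The lock means searches run one after another, which is simple but may become a bottleneck under heavy load.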

One thing to note: for a file of about 12 MB, the DOM would take at least 18 MB of RAM (possibly a few MB more).

So if the XML holds user-specific information, it will grow as you get more users in production. The file would then be much larger and the DOM would take a lot of memory, in which case you might have to think of something else.
 