XML vs Database

 
Greenhorn
Posts: 3
I am looking for views on this:

We have 10 TB (terabytes) of data stored across multiple disks. The metadata (data describing the data, such as filename, location, author, description, etc.) may run to gigabytes, say 5 GB. For a web-based application, should the metadata be stored in XML files or in a database such as Oracle or MySQL?

Since the data will grow in the future, scalability is required. Which approach will give better performance?

A typical use case: a user wants to find data matching particular criteria, e.g. all files generated between a specified start date and end date, then extract the required data and analyse it to produce statistics, generate plots, etc. The results are generated at runtime, so the user should get good performance.

Since the XML file will be large, we can't use DOM. But is a SAX parser scalable, and does it give good performance?
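
To make the SAX option concrete, here is a rough sketch of a streaming date-range search over a metadata file, written in Python with the standard `xml.sax` module. The XML layout (a `<catalog>` of `<file>` elements with `name` and `created` attributes), the handler class, and the `find_files` helper are all assumptions made up for illustration, not an existing schema. It shows that memory use stays flat, but also that every query has to read the whole document.

```python
import io
import xml.sax
from datetime import date


class DateRangeHandler(xml.sax.ContentHandler):
    """Collects names of <file> elements whose 'created' date is in [start, end]."""

    def __init__(self, start, end):
        super().__init__()
        self.start = start
        self.end = end
        self.matches = []

    def startElement(self, name, attrs):
        # Called once per opening tag; nothing is kept in memory except matches.
        if name != "file":
            return
        created = date.fromisoformat(attrs["created"])
        if self.start <= created <= self.end:
            self.matches.append(attrs["name"])


# Hypothetical metadata layout, invented for this sketch.
SAMPLE = b"""<catalog>
  <file name="run_001.dat" created="2008-01-15" author="alice"/>
  <file name="run_002.dat" created="2008-01-31" author="alice"/>
  <file name="run_003.dat" created="2008-03-02" author="bob"/>
</catalog>"""


def find_files(xml_stream, start, end):
    """Stream the whole document once and return the matching file names."""
    handler = DateRangeHandler(start, end)
    xml.sax.parse(xml_stream, handler)
    return handler.matches


matches = find_files(io.BytesIO(SAMPLE), date(2008, 1, 1), date(2008, 1, 31))
print(matches)
```

The cost of each search grows with the size of the file, since the parser cannot skip ahead to the records of interest.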


Thanks
Ashish
 
Jeanne Boyarsky
author & internet detective
Posts: 40747
Ashish,
Databases are designed for search. There are performance optimizations, such as indexes. While XML allows search, it involves reading the whole file. This is going to be slower than an index.
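
As a minimal sketch of the indexed approach, the example below uses Python's built-in `sqlite3` module as a stand-in for Oracle or MySQL. The table name `file_metadata`, its columns, the index name, and the sample rows are assumptions invented for illustration. Dates are stored as ISO-8601 text, which sorts in chronological order, so a plain index on that column can serve the date-range query from the original question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema mirroring the metadata fields mentioned in the question.
conn.execute(
    """CREATE TABLE file_metadata (
           id          INTEGER PRIMARY KEY,
           filename    TEXT NOT NULL,
           location    TEXT NOT NULL,
           author      TEXT,
           description TEXT,
           created     TEXT NOT NULL  -- ISO-8601 date, e.g. '2008-01-15'
       )"""
)
# The index lets the database jump to the date range instead of scanning all rows.
conn.execute("CREATE INDEX idx_file_created ON file_metadata (created)")

rows = [
    ("run_001.dat", "/disk1/run_001.dat", "alice", "January run", "2008-01-15"),
    ("run_002.dat", "/disk2/run_002.dat", "alice", "Late January run", "2008-01-31"),
    ("run_003.dat", "/disk2/run_003.dat", "bob", "March run", "2008-03-02"),
]
conn.executemany(
    "INSERT INTO file_metadata (filename, location, author, description, created) "
    "VALUES (?, ?, ?, ?, ?)",
    rows,
)

QUERY = (
    "SELECT filename, location FROM file_metadata "
    "WHERE created BETWEEN ? AND ? ORDER BY created"
)


def files_between(connection, start, end):
    """Return (filename, location) pairs created within [start, end]."""
    return connection.execute(QUERY, (start, end)).fetchall()


result = files_between(conn, "2008-01-01", "2008-01-31")
print(result)

# Ask SQLite how it intends to run the query; the last column describes each step.
plan = conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("2008-01-01", "2008-01-31")).fetchall()
uses_index = any("idx_file_created" in step[-1] for step in plan)
print(uses_index)
```

The same idea carries over to other databases: the date column is indexed once, and each range query then touches only the matching entries rather than the full 5 GB of metadata.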
 
Bartender
Posts: 10336

Originally posted by Jeanne Boyarsky:
Ashish,
Databases are designed for search. There are performance optimizations, such as indexes. While XML allows search, it involves reading the whole file. This is going to be slower than an index.



A counter argument would be that the file system plus something like Lucene would give a far quicker (and richer) search capability than an RDBMS can provide.

Replicating an XML document structure in database entities is a lot of maintenance. I'd avoid it if at all possible.

Does your data require any referential integrity or other constraints? If not, then I'd go for the file system every time.
 
Jeanne Boyarsky
author & internet detective
Posts: 40747
Paul,
I interpreted the question differently than you. If it is a matter of leaving the data in XML, it should definitely stay on the file system.

I thought Ashish just had data and had the choice of putting it in XML or in tables.
 