Read and load xml file into a db table using SAX

 
Ranch Hand
Posts: 116
Hi!

Wondering if anyone has any sample code to read an XML file and load the parsed attribute values of a particular matching tag into a database table.
I already have code which does the above using DOM, but as expected it performs extremely poorly when the XML file is very large (~500 MB). I have similar SAX code with hardcoded tag names which currently writes a pipe-delimited input file; that file is then used separately by Oracle's SQL*Loader utility to do the table load. This two-step process is very efficient. I was wondering if I could incorporate the data loading into the existing code efficiently using JDBC, possibly via the PreparedStatement addBatch and executeBatch methods. I have been able to do one-insert-at-a-time loading by dynamically constructing and executing the insert statement, but where the existing process normally takes only a few minutes, that approach took hours (God knows exactly how long!) to load 170,000 rows.
Any idea how I can get this SAX + JDBC effort done efficiently?
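A minimal sketch of the streaming half of this, assuming the rows come from attributes of one repeated tag (the tag and attribute names here are made up for illustration). The handler pushes each row to a callback as it is parsed, so nothing accumulates in memory; in real use the callback would bind a PreparedStatement and call addBatch/executeBatch, as shown in the comments:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxLoadSketch {

    /** SAX handler that extracts the named attributes of each matching
     *  element and streams them to a row consumer. */
    static class RowHandler extends DefaultHandler {
        private final String tag;
        private final String[] attrs;
        private final Consumer<String[]> sink;

        RowHandler(String tag, String[] attrs, Consumer<String[]> sink) {
            this.tag = tag;
            this.attrs = attrs;
            this.sink = sink;
        }

        @Override
        public void startElement(String uri, String local,
                                 String qName, Attributes a) {
            if (tag.equals(qName)) {
                String[] row = new String[attrs.length];
                for (int i = 0; i < attrs.length; i++) {
                    row[i] = a.getValue(attrs[i]);
                }
                sink.accept(row);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Tiny inline document standing in for the ~500 MB file.
        String xml = "<rows><row id='1' name='a'/><row id='2' name='b'/></rows>";

        // Here the sink just collects rows; against a real database the
        // sink would instead do something like:
        //   ps.setString(1, row[0]); ps.setString(2, row[1]);
        //   ps.addBatch();
        //   if (++count % 1000 == 0) ps.executeBatch();
        List<String[]> rows = new ArrayList<>();
        SAXParserFactory.newInstance().newSAXParser().parse(
            new InputSource(new StringReader(xml)),
            new RowHandler("row", new String[] {"id", "name"}, rows::add));

        System.out.println(rows.size());    // prints 2
        System.out.println(rows.get(1)[1]); // prints b
    }
}
```

Keeping the JDBC work behind a plain Consumer also makes it easy to reuse the same handler for the existing pipe-delimited output path.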

Thanking you all!
 
Marshal
Posts: 28193
So, you have the SAX part of the code working satisfactorily, but you have a question about JDBC aspects such as executeBatch()? From your description it sounds to me like you should be using PreparedStatement as well. Isn't this more of a JDBC question?
 
Ranch Hand
Posts: 775
Doesn't sound like SAX is so much of a factor unless you are holding too much data in memory or causing the garbage collector to work overtime.

The thing to watch out for with SQL batching is that there tends to be a moderately optimal batch size, depending on your database, driver, network configuration, vendor-specific options to the driver, etc. You may have to experiment to find the right size. Also, you may find it helps if you do commits periodically. If you keep loading data via one transaction you'll be pounding pretty heavily on assorted database internals that are trying to allow for a possible rollback.
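The batch/commit cadence described above can be sketched as below. The specific sizes are assumptions to be tuned per database, driver, and network; only the decision logic is shown runnable here, with the JDBC calls it would drive indicated in comments:

```java
public class BatchPolicy {
    // Illustrative starting points only; experiment as suggested above.
    static final int BATCH_SIZE = 500;      // rows per executeBatch()
    static final int COMMIT_EVERY = 10_000; // rows per commit()

    static boolean shouldExecuteBatch(long rowCount) {
        return rowCount % BATCH_SIZE == 0;
    }

    static boolean shouldCommit(long rowCount) {
        return rowCount % COMMIT_EVERY == 0;
    }

    public static void main(String[] args) {
        // In the real load loop, after each row (n = rows so far):
        //   ps.addBatch();
        //   if (shouldExecuteBatch(n)) ps.executeBatch();
        //   if (shouldCommit(n))       conn.commit();
        // (with conn.setAutoCommit(false) set up front, and a final
        //  executeBatch()/commit() after the last row).
        System.out.println(shouldExecuteBatch(500));  // prints true
        System.out.println(shouldCommit(500));        // prints false
        System.out.println(shouldCommit(10_000));     // prints true
    }
}
```

Committing on a multiple of the batch size keeps the two cadences aligned, so a commit never lands mid-batch.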

There are other things that can slow you down, but it depends on the database and version. For example, in most databases, for huge loads you may want to do things like disabling or dropping indexes and triggers, re-establishing them at the end of the job. You have to be particularly careful with Oracle 7.x/8.x versions because dictionary-managed tablespaces can get progressively slower as you add rows.

If you are willing to use SQLLoader *and* you are loading the data from the same system that the database is installed on, your best option is generally to do a direct-path load. You can't write JDBC code that will ever compare with a direct-path load for performance.
[ January 27, 2006: Message edited by: Reid M. Pinchback ]
 