Oracle ResultSet limits

 
bob connolly
Ranch Hand
Posts: 204
Hello!

Is there a limit on the amount of data that can be retrieved into a ResultSet?

I need to download over 1 billion records, and the last time I tried I ran into a memory error using the ResultSet approach!

Or is there a way to 'stream' the data from Oracle into another process? I know this seems to work with the XMLSerializer!

Thanks much!
bc
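
A minimal sketch of the streaming idea, with the caveat that the table name, columns, and connection details below are placeholders: the Oracle JDBC driver fetches rows from the server in batches of the statement's fetch size (10 by default), so iterating the ResultSet row by row and raising the fetch size keeps memory flat; the heap usually blows up only when every row is copied into a collection.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class StreamRows {
    public static void main(String[] args) throws SQLException {
        // Placeholder connection details -- substitute your own.
        String url = "jdbc:oracle:thin:@//dbhost:1521/ORCL";
        try (Connection con = DriverManager.getConnection(url, "user", "password");
             PreparedStatement ps = con.prepareStatement(
                     "SELECT id, name, amount FROM big_table",        // hypothetical table
                     ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {

            // Pull rows from the server in larger batches; the driver only
            // holds one batch in memory at a time, not the whole table.
            ps.setFetchSize(1000);

            try (ResultSet rs = ps.executeQuery()) {
                long count = 0;
                while (rs.next()) {
                    // Process each row as it arrives instead of collecting
                    // every row into a List (which is what exhausts the heap).
                    long id = rs.getLong("id");
                    String name = rs.getString("name");
                    // ... validate or write the row out here ...
                    count++;
                }
                System.out.println("Processed " + count + " rows");
            }
        }
    }
}

With this shape, memory use is bounded by the fetch size rather than the table size, which is effectively the 'stream' asked about above.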
 
Mark Spritzler
ranger
Posts: 17347
Mac IntelliJ IDE Spring
Do you actually need all 1 billion records at one time? I have yet to see a system that would require that.

Mark
 
bob connolly
Ranch Hand
Posts: 204
Hi Mark, yes we do!

I just typed in a detailed response and hit some key and it all disappeared; this has happened many times, and I wonder what key combo it is! Anyway,

We need to validate the data on our user tables before they extract their data and send it to us in some pretty big flat files. What we'd like to do is extract the data into some kind of a "stream" and pipe that stream into a Java program which will validate each field on the table for RI rules, missing values, etc.!

And some of the tables have a billion+ records on them!

Today I found some references to TopLink solutions on the Oracle site, but really haven't found any real working examples to go by!

Anyway Mark, thanks for any ideas or suggestions!
bc
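
As an illustration of the per-field checks described above, here is a rough sketch of a validator that works directly against the current ResultSet row. The column names (name, customer_id) and the rules are hypothetical, and the RI check assumes the set of valid parent keys fits in memory:

import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical per-record checks: missing values plus a simple RI lookup
// against a set of valid parent keys loaded up front.
class RecordValidator {
    private final Set<Long> validCustomerIds;   // parent keys for the RI check

    RecordValidator(Set<Long> validCustomerIds) {
        this.validCustomerIds = validCustomerIds;
    }

    /** Returns the problems found on the current row, empty if the row is clean. */
    List<String> validate(ResultSet rs) throws SQLException {
        List<String> errors = new ArrayList<>();

        String name = rs.getString("name");
        if (name == null || name.trim().isEmpty()) {
            errors.add("missing name");
        }

        long customerId = rs.getLong("customer_id");
        if (rs.wasNull() || !validCustomerIds.contains(customerId)) {
            errors.add("customer_id " + customerId + " has no parent row");
        }
        return errors;
    }
}

Because each record is checked in isolation, the same validator works whether the rows arrive through paging or through a single streamed cursor.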
 
Mark Spritzler
ranger
Posts: 17347
Mac IntelliJ IDE Spring
Each validation is done on a per-record basis, so you can work in parts: bring back part of the full data, check it for validation, write it to a file, then go back for the next set of data. This is called paging.

This process that you are doing, are there any time limits? A billion records would take a long time to write to disk. And think about the disk size of the file: the bytes per record times the number of records is the number of bytes the file would be. If one record were just one byte, you are already at 1 GB. How are you even going to transfer that file to the customer?

I think, if you have data that big, you need a different architecture/approach than querying all the data, validating all the data, then writing it to a file. It just isn't right.

Sorry.

Mark
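
A sketch of the paging idea, assuming a hypothetical big_table keyed by a positive numeric id: each query brings back one page in primary-key order and remembers the last key seen, so at most one page of rows is ever held on the client. The subquery-plus-ROWNUM form below works on older Oracle releases; Oracle 12c and later can use FETCH FIRST n ROWS ONLY instead.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class PagedExtract {

    private static final int PAGE_SIZE = 10_000;

    // Reads the (hypothetical) big_table in primary-key order, one page at a
    // time, so only PAGE_SIZE rows are ever on the client side.
    static void processInPages(Connection con) throws SQLException {
        String sql =
            "SELECT * FROM (SELECT id, name, amount FROM big_table "
          + "WHERE id > ? ORDER BY id) WHERE ROWNUM <= " + PAGE_SIZE;
        long lastId = 0;              // assumes ids start above 0
        boolean more = true;

        try (PreparedStatement ps = con.prepareStatement(sql)) {
            while (more) {
                ps.setLong(1, lastId);
                more = false;
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        lastId = rs.getLong("id");   // remember where this page ended
                        // ... validate the row, add any errors to the report ...
                        more = true;                 // page was non-empty, keep going
                    }
                }
            }
        }
    }
}

Keyset paging like this stays fast on huge tables because each page starts from an index seek on id rather than skipping an ever-growing offset.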
 
bob connolly
Ranch Hand
Posts: 204
Hi Mark!

Thanks for the paging idea, I'm going to look into it today!

Well, on the validation approach, we will only create a report showing the records in error, so it's not that often that we'll find a bad record; but out of a billion records, we might end up with 10,000 iffy records that need to be discussed before generating a flat file to be FTP'd.

For the architecture, we process over a hundred files that vary from 1,000 to 1 billion records, with field counts from 3 to 50 fields. We normally use the Oracle XMLSAXSerializer to download the data into an XML buffer and pipe that buffer directly to the SAX validator. That works fine for files of 10 million records or less, but with billion-record files it takes over 4 days, because the XML inflates the record count to over 25 billion!

So basically, if there were a utility like the XMLSAXSerializer that extracts the data into a 'stream' without putting it in XML format, that would be ideal!

Anyway, I'm going to look into the paging and TopLink approaches today and see how they might work out!

Thanks again Mark!
bc
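
For what it's worth, the 'stream without the XML' idea can be sketched with plain JDBC: rows flow from the cursor straight into a validator (reusing the hypothetical RecordValidator sketched earlier in the thread), and only the problem rows are written to the report, so no XML document is ever built. The table, columns, and report file name are again placeholders.

import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.List;

// Sketch of a cursor-to-report pipeline: each row is validated as it is
// fetched, and only rows with errors are written out.
public class ValidationPipeline {

    static void run(Connection con, RecordValidator validator) throws SQLException, IOException {
        try (PreparedStatement ps = con.prepareStatement(
                     "SELECT id, name, customer_id FROM big_table");   // hypothetical table
             BufferedWriter report = Files.newBufferedWriter(Paths.get("validation-report.txt"))) {

            ps.setFetchSize(1000);        // stream in batches instead of all at once

            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    List<String> errors = validator.validate(rs);
                    if (!errors.isEmpty()) {
                        report.write(rs.getLong("id") + ": " + String.join("; ", errors));
                        report.newLine();
                    }
                }
            }
        }
    }
}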
 