
accessing 1 million records

 
kri shan
Ranch Hand
Posts: 1489
I want to access 1 million records through servlets from an Oracle database. The retrieved records are written out as a CSV file (all at once); pagination is not required. What is the best (most optimized) way to access all 1 million records?
 
Jeroen Wenting
Ranch Hand
Posts: 5093
Very carefully, or you'll run out of memory...

Create an SQL query to return them, and DON'T tell the driver to prefetch everything or you'll likely run out of memory.

Read each row in turn and add the data to your report.

Close the result set, statement, and connection.
Clean up any memory you've allocated (temporary objects etc.).

Return the data.

The amount of data will likely be massive, so test for timeouts.

Be very careful about memory leaks...
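
Something along these lines; this is just a rough sketch, and the table, column names, and connection URL are made-up placeholders rather than anything from your actual schema:

import java.io.IOException;
import java.io.PrintWriter;
import java.sql.*;
import javax.servlet.http.*;

public class CsvExportServlet extends HttpServlet {
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        resp.setContentType("text/csv");
        Connection con = null;
        Statement st = null;
        ResultSet rs = null;
        try {
            con = getConnection();
            st = con.createStatement();
            // Oracle's driver only fetches 10 rows per round trip by default;
            // a few hundred is a reasonable batch size, never "everything"
            st.setFetchSize(500);
            rs = st.executeQuery("SELECT id, name, amount FROM big_table");
            PrintWriter out = resp.getWriter();
            while (rs.next()) {
                // write each row as it arrives instead of buffering all rows;
                // real CSV output would also need quoting/escaping of values
                out.print(rs.getLong(1));
                out.print(',');
                out.print(rs.getString(2));
                out.print(',');
                out.println(rs.getBigDecimal(3));
            }
            out.flush();
        } catch (SQLException e) {
            throw new IOException("extract failed: " + e.getMessage());
        } finally {
            // close in reverse order of creation; ignore failures on close
            if (rs != null) try { rs.close(); } catch (SQLException ignored) {}
            if (st != null) try { st.close(); } catch (SQLException ignored) {}
            if (con != null) try { con.close(); } catch (SQLException ignored) {}
        }
    }

    private Connection getConnection() throws SQLException {
        // placeholder: in practice look up a pooled DataSource via JNDI
        return DriverManager.getConnection(
                "jdbc:oracle:thin:@//dbhost:1521/ORCL", "user", "password");
    }
}

The two things that matter here are the modest fetch size (so the driver pulls rows in small batches rather than all at once) and writing each row straight to the response instead of building the whole CSV in memory first.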

For performance reasons I'd probably run the extraction at regular intervals (or create some mechanism to trigger it) and store the result to a file.
Then when a client requests the data, I return that file instead of a live result.
This only works if the data is fairly stagnant, of course, but most of it probably will be.
At worst you'd be better off with some mechanism that appends newly created records to the existing data every few minutes, and refreshes the entire data set once every hour or less frequently (keeping track of the number of records already read somewhere, such as in a static member of the process that generates the report).
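
A minimal sketch of that cached-file variant, assuming an hourly refresh and a fixed file path (both arbitrary choices for illustration); the append-every-few-minutes refinement is left out:

import java.io.*;
import java.util.concurrent.*;
import javax.servlet.http.*;

public class CachedCsvServlet extends HttpServlet {
    private static final File EXPORT = new File("/var/tmp/big_table.csv");
    private ScheduledExecutorService scheduler;

    public void init() {
        scheduler = Executors.newSingleThreadScheduledExecutor();
        // refresh the whole extract once an hour, starting immediately
        scheduler.scheduleAtFixedRate(new Runnable() {
            public void run() {
                try {
                    regenerate();
                } catch (Exception e) {
                    log("CSV extract failed", e);
                }
            }
        }, 0, 1, TimeUnit.HOURS);
    }

    private void regenerate() throws Exception {
        File tmp = new File(EXPORT.getPath() + ".tmp");
        PrintWriter out = new PrintWriter(new BufferedWriter(new FileWriter(tmp)));
        try {
            // run the streaming JDBC query from the sketch above here,
            // writing each row to 'out' instead of the servlet response
        } finally {
            out.close();
        }
        tmp.renameTo(EXPORT);
    }

    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        resp.setContentType("text/csv");
        resp.setContentLength((int) EXPORT.length());
        InputStream in = new BufferedInputStream(new FileInputStream(EXPORT));
        try {
            OutputStream out = resp.getOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n); // stream the cached file to the client
            }
        } finally {
            in.close();
        }
    }

    public void destroy() {
        scheduler.shutdown();
    }
}

The temp-file-plus-rename step is what keeps a client from ever downloading a half-written extract.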
 
Mark Spritzler
ranger
Sheriff
Posts: 17309
"refresh the entire data set once every hour "

My guess is that the process of getting all those records into a file will take well over an hour.

Mark
 
Mark Spritzler
ranger
Sheriff
Posts: 17309
In that regard, what is the data going to be used for? Is it for some import/export process into another database? I don't know of anyone who would be able to read 1 million records in a .csv file.

I question the means and the reasons why this needs to be done; there must be better solutions for you than pushing 1 million records to a file through a servlet.

Mark
 
Jeroen Wenting
Ranch Hand
Posts: 5093
Originally posted by Mark Spritzler:
"refresh the entire data set once every hour "

My guess is that the process of getting all those records into a file will take well over an hour.

Mark


hehe, could well be.
The biggest database conversion I ever had to program ran over several tables, the largest of which was 2 GB on disk (at maybe 1 KB per record, so roughly 2 million records).
That one ran for 36 hours using 6 CPUs in parallel on a 32-CPU Tandem, and was programmed directly against the database (so no JDBC layer in between) in C with embedded SQL.

The day before it was to be executed, the customer decided the risk was too great and withdrew the entire change that had necessitated the conversion.
Two months of work (programming, design, testing, documentation) down the drain, but at least we got paid...
 