
Adding millions of entries in database

Hello all,

My application will need to parse some files, which will generate more than 1 million records in the database.
Using a StatelessSession from Hibernate with PostgreSQL, I've reached a time of 70 seconds for 1 million records (the entity has four character columns and a bigint primary key).
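For reference, a minimal sketch of the kind of loop being described, with a plain `Consumer` standing in for Hibernate's `StatelessSession.insert()` so the sketch is self-contained; the `MyRecord` entity and its field names are assumptions for illustration:

```java
import java.util.function.Consumer;

// Hypothetical entity: bigint primary key plus four short text columns,
// matching the shape described in the post.
class MyRecord {
    long id;
    String c1, c2, c3, c4;
    MyRecord(long id) { this.id = id; c1 = c2 = c3 = c4 = "abcd"; }
}

public class BulkInsertSketch {
    // `insert` stands in for StatelessSession.insert(). A StatelessSession
    // skips the first-level cache and dirty checking, which is why it suits
    // bulk loads. In real code you would open the session from the
    // SessionFactory, begin a transaction, insert inside the loop, commit
    // once at the end, and close the session.
    static long run(Consumer<MyRecord> insert, long count) {
        long inserted = 0;
        for (long i = 0; i < count; i++) {
            insert.accept(new MyRecord(i));
            inserted++;
        }
        return inserted;
    }

    public static void main(String[] args) {
        System.out.println(run(r -> {}, 1_000_000L)); // prints 1000000
    }
}
```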

I have 2 questions:
1) Since I haven't worked with this much data until now, do you think this is a good time?
2) Do you have any suggestions to improve this time?

Thanks.

1. 70 seconds for 1 million records is not bad at all. Is this just the parsing process? That's just over a minute!

2. See #1

WP

William P O'Sullivan wrote: Is this just the parsing process?
WP



For now I have just created one million objects, similar to what I will get from the files, and saved them. I'm just concerned about the insert-into-database process. Reading from the file will be fast, and if there are any problems I can use multithreading for the parsing, but the database insert (the Java object -> database row step) is usually the bottleneck.

Bogdan Baraila wrote: ...and saved them. I'm just concerned about the inserting into database process...


Maybe it's just me, but I still don't get it. Is this the time for parsing a file, for saving to the database, or for both?

Like the title says, it's just about inserting 1 million entities into the database.

If you want help, you need to clarify:

Is your 70-second timing for the actual insert into Postgres (and commit) of 1,000,000 entries?

You say you are processing from memory only, so your file I/O will add some overhead.

Be clear please.

WP

Yes, 70 seconds is just the time for creating the objects in memory and saving them to my database.

Is that in-memory? That is still very, very fast.

I would suggest modifying your code to create a test file from your 1,000,000 records.

Then modify again to read that file, and see what your baseline timing is.

WP

Yes, it's in memory. I'm just instantiating the objects in a for loop, setting some values based on the loop index, and saving them to the database.
I know that the file processing will add some extra time, but right now I'm more concerned about the database saving process (I have lots of ideas for speeding up the file processing, but for the database saving this is all I've got so far).

I suggest using batch processing: split the work and insert a few records per iteration, flushing after each batch.
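One common shape for that suggestion is to insert in fixed-size chunks so the driver can send each chunk as a single JDBC batch (with `hibernate.jdbc.batch_size` set to match). A self-contained sketch of just the chunking arithmetic, with the real insert/flush calls reduced to comments; the batch size of 50 is an assumption:

```java
import java.util.ArrayList;
import java.util.List;

// Counts how many flushes a chunked bulk insert would perform: one flush
// per full batch, plus one for any final partial batch. In real Hibernate
// code, each "flush" would be session.flush() + session.clear() (or just
// periodic commits with a StatelessSession), letting the JDBC driver send
// each chunk as one batched statement.
public class BatchingSketch {
    static int countBatches(int totalRows, int batchSize) {
        int batches = 0;
        List<Integer> pending = new ArrayList<>();
        for (int i = 0; i < totalRows; i++) {
            pending.add(i);                // session.insert(record) in real code
            if (pending.size() == batchSize) {
                batches++;                 // flush: one round trip per batch
                pending.clear();
            }
        }
        if (!pending.isEmpty()) batches++; // flush the final partial batch
        return batches;
    }

    public static void main(String[] args) {
        System.out.println(countBatches(1_000_000, 50)); // prints 20000
    }
}
```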

Using multithreading and StatelessSession I have now gotten down to 35 seconds for 1 million records. It could go even lower with more threads, but for now this is enough.
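A sketch of how that multithreaded split might look: partition the 1,000,000 records into one range per worker, since a Hibernate session (stateless or not) is not thread-safe and each worker needs its own. Here an `AtomicLong` stands in for the per-row insert; the thread count of 4 is an assumption:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

public class ParallelInsertSketch {
    // Splits [0, total) into roughly equal ranges, one per worker thread.
    // In real code each worker would open its own StatelessSession and
    // transaction, insert its range, and commit independently.
    static long run(long total, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicLong inserted = new AtomicLong();
        long chunk = (total + threads - 1) / threads; // ceiling division
        List<Future<?>> futures = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            long start = t * chunk;
            long end = Math.min(start + chunk, total);
            futures.add(pool.submit(() -> {
                for (long i = start; i < end; i++) {
                    inserted.incrementAndGet(); // session.insert(...) per row
                }
            }));
        }
        for (Future<?> f : futures) f.get(); // wait for all workers
        pool.shutdown();
        return inserted.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(1_000_000L, 4)); // prints 1000000
    }
}
```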