Win a copy of Kubernetes in Action this week in the Cloud/Virtualization forum!
  • Post Reply Bookmark Topic Watch Topic
  • New Topic

How to insert to MySql efficiently?  RSS feed

 
Ranch Hand
Posts: 224
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I'm trying to insert data into an empty table. At the end, it will have 60 million+ records.

This is the pseudocode of what I do currently:


It's taking 3.5 - 4 hours. I'm trying to cut down the time as much as possible.

Then I read this article: http://dev.mysql.com/doc/refman/5.1/en/insert-speed.html
It says that:

If you are inserting many rows from the same client at the same time, use INSERT statements with multiple VALUES lists to insert several rows at a time.



I'm thinking it must be something like this:


What I'm also wondering is whether addBatch mechanism that I do is already like what is mentioned in the article? Can somebody please advise?

Thanks in advance for all your help.
 
Marshal
Posts: 56942
174
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I tried that sort of thing, but more complicated, more as a test of performance than anything else, and I got nearly 60,000,000 records in. Took about a week . . .

So I don't think you need worry about 3� hours.

You can't tell from the API how that method is implemented; you could try getting whichever class implements Statement and see whether that tells you. Statement and PreparedStatement are actually interfaces, so they don't have actual implementations.
 
author
Bartender
Posts: 4093
21
Eclipse IDE Flex Google Web Toolkit
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
60 million? How often do you perform the insert? I'd recommend putting everything in a SQL file and inserting it directly into the database. That number of records never works that well via JDBC even with batch updates.
 
Susan Smith
Ranch Hand
Posts: 224
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
It's a one time insert for the 60+ million. But in other parts in my project, I have to do insert/ update in millions too (< 10 millions approximately), that's why I'm looking for a really really efficient way to make things run more smoothly.
 
Sheriff
Posts: 22968
43
Eclipse IDE Firefox Browser MySQL Database
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
If you're looking for an efficient method, and you have more than one method to choose from, then I would advise setting up some fairly large test data sets and running them through the different methods. Measure whatever counts as "efficiency" (in your case it sounds like it's wall-clock time) and see which method wins.
 
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
Boost this thread!