Win a copy of The Little Book of Impediments (e-book only) this week in the Agile and Other Processes forum!
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic

Help improve performance jdbcTemplate update

 
Jehan Jaleel
Ranch Hand
Posts: 196
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hi all,

I have a Spring batch application where I read data from a source table (called SETTLEMENT_DETAIL), I do some processing on that data, and then I need to update the status for each record that I processed. The problem is that in a typical batch run of this program this source table can have large # of records to process, usually 300K. I used batch update of jdbcTemplate, but it sill takes almost 2 hours to update all 300K records...



As you can see I maintained a list of Settlement detail objects that I need to update and I passed in this list to jdbcTemplate's batch update method.

Does anyone have any suggestions on how I can make this faster?

Thanks in advance for any help.
 
Martin Vajsar
Sheriff
Posts: 3752
62
Chrome Netbeans IDE Oracle
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
The most performing solution would probably be to do the update in the database, skipping Java altogether. Most databases support pretty complicated updates right in SQL. It saves the time needed to transfer the data over the network, and also might allow the database to perform some optimizations that are not possible otherwise.

If this is not an option, you'll need to do some measurements to find out where the bottlenecks are. I'd suggest trying to update these 300K records in the database (using TOAD or other SQL client) just to see how long does this process take in itself. You won't be able to beat that time, unless you get rid of some indexes on the columns being updated. (Sidenote: if you're updating a significant fraction of the rows in the table - say, 20% or more, in some databases it might be quicker to create a new table as select (do the update here), drop the old table, rename new to the proper name, create indexes and constraints etc. This might require significant re-engineering of your processes, but it might be actually faster than updating 300K rows.)

Populating a temporary table from Java (ie. inserts instead of updates) and then doing a bulk update in SQL might also save some time, but you'd have to try it out to see. It is also possible that it would make things worse, not better.

If you find out the plain update takes significantly less time than your Java processing even after taking time spent on the network into account, you should try to profile the Java application to find possible bottleneck there.

(In the code you've posted I'd suggest to reuse the BatchPreparedStatementSetter instead of creating a new one for every row, but this certainly is not the thing that will shave the processing time significantly, if at all.)
 
Jehan Jaleel
Ranch Hand
Posts: 196
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hi Martin,

First of thanks for taking the time to respond to my post, I appreciate it.

Regarding letting the DB to the update, I do not see how that would work because how would the DB know which records my code processed.

Regarding updating the 300K records directly using DB Visualizer, I am not sure how I can do that. If you look at the SQL I am firing I am still sending a separate SQL for each update, something like this..

UPDATE SETTLEMENT.SETTLEMENT_DETAIL SET ADVICE_HEADER_ID = 1 where SETTLEMENT_DETAIL_ID = 5
UPDATE SETTLEMENT.SETTLEMENT_DETAIL SET ADVICE_HEADER_ID = 1 where SETTLEMENT_DETAIL_ID = 6
UPDATE SETTLEMENT.SETTLEMENT_DETAIL SET ADVICE_HEADER_ID = 2 where SETTLEMENT_DETAIL_ID = 7
UPDATE SETTLEMENT.SETTLEMENT_DETAIL SET ADVICE_HEADER_ID = 2 where SETTLEMENT_DETAIL_ID = 8
UPDATE SETTLEMENT.SETTLEMENT_DETAIL SET ADVICE_HEADER_ID = 1 where SETTLEMENT_DETAIL_ID = 9

I am not sure how I can create 300K of such statements in my DB Visualizer SQL editor.

Thanks again for all your help.
 
Martin Vajsar
Sheriff
Posts: 3752
62
Chrome Netbeans IDE Oracle
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Jehan Jaleel wrote:Regarding letting the DB to the update, I do not see how that would work because how would the DB know which records my code processed.

That would mean moving the processing itself into the database. Either as a plain SQL (if possible - a lot is possible in plain SQL), or as a stored procedure. It would be a major task to do. Some databases (eg. Oracle) even allow to code stored procedures in java, so it might be possible to reuse some of your current code directly.

Regarding updating the 300K records directly using DB Visualizer, I am not sure how I can do that. If you look at the SQL I am firing I am still sending a separate SQL for each update, something like this.....

You could create a table containing 300K existing SETTLEMENT_DETAIL_IDs (selected at random, probably), and then issue an update similar to
Make sure to set all columns in the test run, so that the conditions are as equal to the real ones as possible. Of course you'll do the test in a test environment

How long does the processing itself take? If it took a significant amount of time, you could probably save data to database in smaller batches while continuing the processing in a separate thread. That way could at least part of the time spent waiting for the database be used for the processing.
 
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic