• Post Reply
  • Bookmark Topic Watch Topic
  • New Topic

Extreme performance degradation using prepared statement

 
Saskia de Jong
Ranch Hand
Posts: 34
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I'm using a number of queries that basically perform a sum for a few columns. The table the queries operate on is fairly large (couple of milion records). A query uses a number of parameters that control whether some rows are taken along in the final result or not.

So far nothing fancy. The odd thing is that if I execute any of these queries by building a string for it and executing it through a regular JDBC statement, then the execution time is about 800ms. However, if I execute the same query using a prepared statement the execution time sky rockets to a whopping 20 seconds!

I'm using PostgreSql 8.1. JDBC is used through Tomcat 5.5.15 and its standard connection pool DBCP. The VM is Sun's 5.0 update 6 JVM. Both Tomcat and PostgreSQL are running on a debian linux box.

Does someone have any clue as to what causes this extreme slowness when using prepared statements?

Just in case, the code I use is:


[ January 24, 2006: Message edited by: Saskia de Jong ]
 
Jeanne Boyarsky
author & internet detective
Marshal
Posts: 35279
384
Eclipse IDE Java VI Editor
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Saskia,
Welcome to JavaRanch!

A few questions to narrow down the problem:
1) Can you post the code you are comparing it to that uses Statement?
2) Are you timing the whole method or just the executeQuery()? If the former, can you time just the executeQuery?
3) What types are the parameters? int, varchar...
 
Saskia de Jong
Ranch Hand
Posts: 34
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Originally posted by Jeanne Boyarsky:
Saskia,
Welcome to JavaRanch!


Thanks! This looks like a real nice forum. Never seen so much specialized sub-groups anywhere


1) Can you post the code you are comparing it to that uses Statement?


Sure, here it is:



The convert to own structure code basically iterates over the entire result set and copies everything to a detached dataset that I created myself.


2) Are you timing the whole method or just the executeQuery()? If the former, can you time just the executeQuery?


I'm timing the whole method but I also enabled PG query logging, and this gives the exact same numbers (minus at most a 100ms difference between those two readings).

I'm going to time just the execute method now and will post back the results of that later.


3) What types are the parameters? int, varchar...


The types in query are all int4. I use these to limit the amount of rows that are processed for my query in the where clause.

e.g.


Here, foo and bar are both columns with type int4.

The types I set in Java with setObject are all string representations of integers. E.g. the code given above would come down to:



One other thing I noticed is that when I use a more complex filtering in the where clause, the difference between prepared and non prepared is even larger. E.g.



In the last case, the difference between prepared and non-prepared is a factor 10. I re-tested this 12 times, and restarted the DB in between.

When I use only 1 filtering value for "foo" then the difference between prepared and non prepared is 'only' a factor 2.
 
Saskia de Jong
Ranch Hand
Posts: 34
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I now additionally timed just the preparedStatament.executeQuery(); method.

For the case with 3 parameters in the above mentioned IN the results for 3 runs were:

full method / executeQuery() only
22709 / 22696
22742 / 22730
22762 / 22752

For the case with 1 parameter the results for 3 runs were:
6148 / 6118
6116 / 6110
6128 / 6114

All numbers are in milliseconds. As can be seen the difference is actually very small, ranging from 6ms to about 30ms.
 
Reid M. Pinchback
Ranch Hand
Posts: 775
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
It sounds like the database is choosing a different execution plan for the dynamic sql than for the prepared statement. Not familiar with the guts of postgresql, but it sounds like you need to dig into whatever tuning documentation you can find to figure out when it does full table scans versus using indexes, etc.

Generally prepared statements are a better approach; databases tend to scale to heavy volumes more effectively with them. The downside is that sometimes a database can use literal values in a query to come up with a better execution plan. This is particularly likely when one or more of the parameters are from columns that have highly skewed value distributions (like status codes where some codes are much more likely to be used than other codes).
 
Saskia de Jong
Ranch Hand
Posts: 34
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
As if I had nothing else to do today , I spend the entire day testing through a dozens of configurations. I wrote a series of unit tests for the query and tried several things.

First I added 1 test where I used setInteger instead of setObject. On the 8.1 DB this gave interesting results.

For "prepared with setInteger", "prepared with setObject" and "non-prepared" resp, this gave on average the execution times:

1416, 11396, 1257

Here, using setInteger is significantly faster than setObject, but still a bit slower compared with the non-prepared version.

After this I tested with another version of postgres (7.4) and with more complicated queries, but they all gave the same kind of results. setObject unrealistically slow, setInteger fast, but non-prepared the fastest.

Then an interesting test-case came up. I changed the driver from postgresql-8.1-404.jdbc3.jar to pg74jdbc3.jar and all of a sudden each version of the execution method gave the exact same timings. For the same config as the numbers I gave above I was now seeing resp:

1188, 1158, 1124

I re-run my tests with different queries, different DBs (e.g. 7.4 driver on 7.4 DB, 7.4 driver on 8.1 DB) and the execution time of prepared vs non-prepared and setObject vs setInteger is always about equal. For the record, I repeated each test atleast 6 times, and the DBs had no other load on them.

So, in this case there's definitely something wrong with the postgresql-8.1-404.jdbc3.jar driver!

update: I also tried with the 8.0 driver (both 313 and 314), but this gave the same results as the 8.1 driver again. So there's probably nothing inherently wrong with the current 8.1 driver. Maybe the old 7.4 driver simply does not support true prepared statements and always sends the same SQL text to the DB?
[ January 25, 2006: Message edited by: Saskia de Jong ]
 
Wei Dai
Ranch Hand
Posts: 86
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Repeated PreparedStatement execution should be quicker than repeated Statement execution. If your test show a slower speed, it should mean that your database engine doesn't support PreparedStatement in fact, all PreparedStatement execution were converted into Statement execution.
 
Saskia de Jong
Ranch Hand
Posts: 34
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Originally posted by Wei Dai:
Repeated PreparedStatement execution should be quicker than repeated Statement execution.


It should, but it's slower...

Like I mentioned, I repeated each test at least a couple of time. Apart from the fact that every query is faster after the initial execute (prepared or not-prepared), it wasn't seeing any differences for repeated execution.

I even enabled the prepared statement cache in my connection pool (DBCP), but this also did not help.


If your test show a slower speed, it should mean that your database engine doesn't support PreparedStatement in fact, all PreparedStatement execution were converted into Statement execution.[/QB]


The PG JDBC 8.1 driver definitely does support prepared statements (wire protocol 3). PG 8.1 itself also supports prepared statements.
 
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic