Win a copy of Kotlin in Action this week in the Kotlin forum!
  • Post Reply Bookmark Topic Watch Topic
  • New Topic

Datasource.getConnection takes longer for the first run when called from particular method  RSS feed

 
Alex Lieb
Ranch Hand
Posts: 61
3
Java Netbeans IDE Notepad
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hi!

So I have a method that's decided it wants to be special.

We have a connection pool set up so we can just draw connections out whenever we need to run a query, and that's all fine and good, but the performance restrictions on a very small section of our code just got much, much more strict. So I've been parcelling out the various things this code has to do and looking for what's taking up all those milliseconds. There's roughly 3 distinct processes that need to happen all at once as fast as possible now, where earlier we were really ok if it took a feww seconds. I've inspected the first of those 3 distinct processes and I've managed to optimize a lot of it and get it down to the point where there aren't any steps that have wide variation in the time they take to execute, and I sidestepped some other steps entirely, but there's one database call that has an unfortunate combination of attributes: We *do* need it to happen, and we care about what it returns, and it sometimes takes a long time to execute.

I know this isn't the absolute most accurate timing method, but it's working well enough for what I'm doing; it's ok if it's off by a few milliseconds, I just need to know what steps in each method account for most of any given method's run time. Also I'm paraphrasing this code somewhat, but basically it looks like this right now:

DISCLAIMER: I know I have a hanging PreparedStatement and a lazily closed ResultSet, I know I could include those in the try with resources, I just took them out temporarily so I could put little timers around them.



Most of the time, this query takes about 40-50 milliseconds to execute. That's fine. It just gets a connection, runs the query, pulls the data, and we all move on with our lives. But for some reason the first time it runs through this method, it takes somewhere between 450 and 500 milliseconds to run. I *think* it's only the first time it runs, but I don't really know; since I don't know why it's taking longer in the first place I really have no idea if it's going to just randomly decide to do that sometimes or if it's just doing that the first time.

I've actually taken out quite a few of the timers I had in here; I've accounted for every millisecond that this method runs; the first time it runs, the total runtime is about 500 milliseconds, 40-50 of those are occupied by executeQuery();, which is pretty consistent with query execution/response times in other methods. The other ~450 are spent executing connectionPool.getConnectionFromPool();

In subsequent runs, executeQuery takes the same amount of time to run, but getConnectionFromPool() is in line with how long it takes when it's run from other methods; it takes 0-1 milliseconds

getConnectionFromPool is a method that we defined, so I added timer things in that method too; all of the runtime for THAT method is being taken up by a single line:

Connection conn = ds.getConnection();

Connection is of type java.sql.Connection, and ds is of type javax.sql.DataSource

There's only one way I can try to account for the lag, but it doesn't make any sense based on everything I know about how these work:

The datasource is one of 4 datasources we have; each datasource has access to multiple databases. This query uses a datasource that's been used previously, so it's not like it's connecting to the datasource for the first time (which, by the way, takes about 900 milliseconds), but this query *is* the first query in our application that makes a call to one of the databases *in* the datasource...

That seems like it would explain why it takes a little longer to get the connection, but... When it calls getConnection(), it has no idea what database it's going to call, it's just getting a connection to the datasource. If they were the reason for the delay, wouldn't the delay be happening during prepareStatement();? Or is there some kind of look-ahead happening behind the scenes here? Why does this query want to be special?
 
Alex Lieb
Ranch Hand
Posts: 61
3
Java Netbeans IDE Notepad
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Note: I was doing this all while connected to a database 8 states away, and I was under the impression that in production we would also have to deal with a database/project location separation like that for at least a few more months, but apparently I will not, so if I want to get a good approximation of how long a process would take on production I should actually be connected to my local database. So I did that. The query execution/response times are way lower now, but that spike the first time you run getImportantNumber is still evident, and I'm still curious about why.

The first time you call that method, it takes about 20-25 milliseconds to run "getConnection" and about 4-8 milliseconds to actually execute the query. This is interesting because it means the most important factor in both of these database interactions isn't execution time; it's transmission time. So for some reason it's deciding that this method's first call to getConnection, which should be identical to all the getConnection calls before it, needs to go through about 4 or 5 times as many request/response cycles with the database as you would need to execute a simple query. I feel like there *has* to be some kind of lookahead going on here...
 
Alex Lieb
Ranch Hand
Posts: 61
3
Java Netbeans IDE Notepad
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Duuuude... Coding is trippy.

I figured it out.

It goes back to this:

...I sidestepped some other steps entirely, but there's one database call that has an unfortunate combination of attributes: We *do* need it to happen, and we care about what it returns, and it sometimes takes a long time to execute.


So I realized there was only one thing it could possibly be doing in that time, since it's not connecting to a new datasource and it doesn't know it needs to connect to a new database yet. The only thing it was doing was getting a connection from the pool...

I looked up the connection pool behavior and found that it does NOT do look-ahead, but I was reminded of something that sent me in the right direction: the connection pool has to create and destroy connections as they're needed; most of the time when I'm just testing stuff by myself, I have the luxury of thinking of it as a single connection object that gets passed around, because I'm the only user using the project, so that's more or less how it *behaves*. But that isn't what the connection pool is; it stores a single Connection *Pool*, from which *multiple* connections are drawn, so if you ask it for a connection while one is in use, it has to make a new one, and from then on, unless it determines that there isn't enough activity to warrant having the second one around, it'll get rid of the second one.

Earlier on in the code, I decided to run a separate query in a separate thread because it was ok to run that query asynchronously. The method that runs that query is executed *directly* before the getImportantNumber method, so I figured it probably wasn't done running yet, so when it tried to run that query, it had to create a new connection and add it to the pool. So to test that, I told it to run the separate thread method 3 times instead of 1;

The first time it runs the separate thread query, it takes 0 milliseconds to get the connection.
The second time, it took 14 milliseconds to get the connection
The third time, it took 15 milliseconds to get the connection
And when it ran the importantNumber query after that, it also took 14 seconds to get the connection, and I couldn't find a way to log it and it goes by too fast to check in the debugger, but I'd be willing to bet that at that moment there were 4 Connections stored in the connection pool. So apparently it takes about 14-15 milliseconds to create a database connection and add it to the pool.

This also explains why the delay appears to only show up the first time you run that method; it's not about that method, it's about the current size of the connection pool and the threaded method that runs immediately before it. If I wait a while and let the connection pool decide it doesn't need all those connections any more, then try it again, it'll probably have that delay again the next time I call that method even if I don't restart the program.

Man, this stuff is just really cool...
 
It is sorta covered in the JavaRanch Style Guide.
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
Boost this thread!