In this regard, I happen to have to develop a DB Pool and stability and recovery issues are tricky to get right, and even then you cannot cover 100% of the cases.
We went through a period of time when our DB hosts were "less than reliable" and we ended up developing, in our opinion, quite a stable pool. Some of the things that we do are:
.- Check each non-used connection every X milliseconds with a dummy SQL
String (select sysdate from dual, for example). If the statement fails, drop the connection and get another.
.- If all the connections are broken (probable indication of DB host being down), shut down the pool and try again first with a single Connection when the pool is accesed again. If/When that single conn works, start the pool again.
.- After X seconds of no pool usage, close all the connections and shutdown the pool until first access again. It saves connections from timing out when there's no access, as some DB close connections when not in use.
.- Check that the connections you request to be created are really created and take that number into account for the max-connections number. Some times DB take a long time to return a new connection, or they just plain hang, so if you don't take into account the number of "connections requested but not yet created", you might overwhelm your DB and create too many threads.
.- Recycle the connections after X number of usages. Why? Some drivers/DBs don't clean properly all objects and they throw errors in the long run ("too many cursors open" anyone? ;) )
With all of that, there's still a rare case where we have had the pool get in trouble, due to a
JDBC driver bug, but DB hosts are stopped daily for maintenance and applications recover nicely.
Cheers!
[originally posted on jforum.net by GreenEyed]