We've been encountering an interesting scenario in WebSphere (WAS 6.1). We know the root cause for the problem in question and fixed it, and things have been working fine since then. But what we don't understand is WebSphere's behavior in the scenario. This is what our process does:
1. It is a stateless session bean that uses bean managed transaction.
2. In a nutshell, it starts a transaction, selects a row in Oracle database table for update, do some processing on the retrieved result set, writes to a MQ queue and commits/rollback.
3. Normally the above logic works perfectly as the whole transaction will be completed with in 30-100 millisecs.
4. But recently we introduced some other MDBs into the same JVM as this session bean. These MDBs also read and write to some queues under the same queue connection factory as the session bean. The caveat though is that we've got the queue connection in the ejbCreate and closed it in ejbRemove() as against doing it just for the period needed with in the onMessage somewhere.
And here is the configuration we've setup in WAS:
a. Total transaction lifetime timeout - 120 secs
b. Queue Connection timeout - 1800 secs
c. Queue Connection Pool Maximum Connections - 10
5. When these MDBs and the session beans are all running simultaneously, we believe the MDBs are hogging all the queue connections and the session bean waits forever to get the queue connection.
6. So the scenario we face is that: the session bean starts a transaction, locks the db row for update and tries to get the queue connection. Since there is no available queue connection, it just keeps waiting. Meanwhile after 120 secs, the container times out the transaction. At this point, I would assume the container would roll back the transaction as part of the timeout, but it didn't, continuing to hold the db lock. When the queue connection request finally timed out after 30 mnts, it threw the error that the queue connection cannot be obtained. But it also throws an error that the transaction cannot be rolled back because I try to rollback when the queue connection cannot be obtained. So when I try to rollback, it throws javax.transaction.RollbackException but it still doesn't release the lock on the db row. In fact that lock was not released until we kill the JVM. The WebSphere documentation claims that the transaction will be rolled back when the transaction times out, but I don't understand why it wouldn't release the resources if it had really executed the rollback correctly.