Hi,
We have a severe issue in our web application that the server (
tomcat) goes down and the login page itself will not be accessable. The root cause is same each time. We identified that there were table row level locks happened in the oracle database. This situation is happening once in a while in our production application. We request the DB team to kill the sessions and some times they fail to do so. We restart the application by restarting all the apache and tomcat servers.
We are operating our application in cluster mode with four Apache & Tomcat servers. In our tomcat server logs we will have all the HTTP requests of every user. We observed that the HTTP requests of one server will be kept pending and the request are successfully completed in rest 3 servers. This number of pending HTTP requests will be increased gradually in all the servers one after the other and the application will be down. Users can't access the login page also.
From the Oracle DB team, we came to know that some of the sessions are aquiring locks on the rows of the oracle tables while doing a Read/Write operations and are not releasing the locks for other sessions. And due to this the HTTP requests are kept pending for longer time.
We also identified some sideeffects of this situations. When the applications doesnot respond to the users operation, they again retry. Due to that another HTTP request is getting generated for the same operation and these HTTP requests will multiply when he clicks on the same HTML buttons like "Done" "Next" "Submit" e.t.c. The previous requests for the same operation are not getting killed when user retrys.
We are not getting any ways or means to find out which HTTP request or which users session or which server has lead to the DB locks.
Can anyone help us on this issue.
We ran a script to find out which tables contains the locks and the ouput is atatched here