
Tomcat suddenly stopped working for half an hour without any error in the log

 
Greenhorn
Posts: 3
Hi folks,



I have two Tomcats here working in parallel on separate servers. Both have exactly the same configuration and receive requests from a load balancer. My Struts webapp is deployed on each Tomcat. Both webapps access the same database. And both suddenly stopped working without any kind of error message.

What happened yesterday, in detail: server A suddenly stopped working. The logs stopped at 09:55 am. No error message, nothing. When I tried to connect to the server, my connection timed out. Half an hour later server B also stopped working in exactly the same way: no connection possible, no error in the log. After I reset server A at 10:40 am, the other server (B) suddenly resumed working. There were some minor errors in the log caused by lost connections... but that's all. Server B just kept working as if nothing had happened.

My first guess was a deadlock in the database... but during the 'pause' there weren't any waiting connections. Besides, even if the database isn't available, Tomcat should still do something, shouldn't it? Like accepting the connection and doing some logging (I log every time a user connects, regardless of whether they access data from the DB or not).... Additionally, database access from Tomcat times out after 10 s.

I keep coming back to the DB since it is the only thing that connects the two servers. There is no interaction between the two servers except via the database.

Since the restart of Tomcat A everything has been working perfectly again.



Does any of you have an idea what could have happened here?



Thanks in advance for your help!
Alex

 
Saloon Keeper
Posts: 27762
Welcome to the JavaRanch, Alexander!

There's another possibility. There might be some other, unrelated process running on those servers that's stealing all the CPU, disk, and/or network resources from the Tomcat server and its apps. In a worst-case scenario, your servers might have been "pwned", and are serving (unbeknownst to you) as nodes in a botnet.

If this problem recurs, you should monitor the systems as a whole to see if you can detect unusual loads.
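Whole-system monitoring is normally done with the operating system's own tools, but a rough view of the load can also be logged from inside a JVM. The following is only a minimal sketch using the standard `java.lang.management` API; the class name `LoadSnapshot` and the output format are made up for illustration, and the load average is reported as "n/a" on platforms where the JVM cannot provide it.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper (not from the thread): logs system load and JVM heap use.
class LoadSnapshot {

    /** Builds one log line with the CPU count, 1-minute load average, and heap use. */
    static String describe() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        Runtime rt = Runtime.getRuntime();
        long usedMb = (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
        long maxMb = rt.maxMemory() / (1024 * 1024);
        // getSystemLoadAverage() returns a negative value if the platform does not supply it.
        double load = os.getSystemLoadAverage();
        String loadText = load < 0 ? "n/a" : String.format("%.2f", load);
        return String.format("cpus=%d loadavg1m=%s heapUsedMb=%d heapMaxMb=%d",
                os.getAvailableProcessors(), loadText, usedMb, maxMb);
    }

    public static void main(String[] args) throws InterruptedException {
        // Print three samples, one second apart.
        for (int i = 0; i < 3; i++) {
            System.out.println(java.time.Instant.now() + " " + describe());
            Thread.sleep(1000);
        }
    }
}
```

A load average well above the CPU count while Tomcat itself is idle would point to some other process on the machine.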
 
Alexander Rondel
Greenhorn
Posts: 3
Hi Tim,


Thanks for your answer!

I will have a look at this.

Both machines run on VMs and aren't connected to the World Wide Web, so I think (& hope) we didn't get pwned....
The underlying machine was being monitored and didn't show any performance problems....

I hope this does not happen again....


Cheers
Alex
 
Author and all-around good cowpoke
Posts: 13078
If this were my problem, I would use the Management app to look at the memory and thread use.

I once had the same thing happen (all connection attempts timing out), and the management app revealed that all available threads were busy working on a request, some of them for days!
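If the management app is not at hand, the same kind of information can be pulled from inside the JVM. This is only a sketch with the standard `Thread.getAllStackTraces()` call; the class `ThreadReport` is a made-up name, and it only sees threads of the JVM it runs in, so for Tomcat it would have to be called from within the webapp (or you could take a dump from outside with the JDK's `jstack` tool instead).

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical helper (not from the thread): summarizes what the JVM's threads are doing.
class ThreadReport {

    /** Counts the live threads of this JVM by thread state. */
    static Map<Thread.State, Integer> countByState() {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(t.getState(), 1, Integer::sum);
        }
        return counts;
    }

    /** Prints each thread's name, state, and topmost stack frame. */
    static void dump() {
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            StackTraceElement[] stack = e.getValue();
            String top = stack.length > 0 ? stack[0].toString() : "(no frames)";
            System.out.println(e.getKey().getName() + " [" + e.getKey().getState() + "] at " + top);
        }
    }

    public static void main(String[] args) {
        System.out.println(countByState());
        dump();
    }
}
```

Many request threads stuck at the same frame (a socket read, a lock, a database call) would be the first thing to look for.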

Turned out my code was trying to connect to a service that was not running and I had failed to account for that possibility.
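One way to account for that possibility is to never connect without a timeout. A minimal sketch with plain `java.net.Socket`, assuming the service is reached over TCP; the class name `ServiceProbe`, the host, the port, and the timeout value are placeholders, not anything from the actual application:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper (not from the thread): a connection attempt that cannot hang forever.
class ServiceProbe {

    /** Returns true if a TCP connection to host:port succeeds within connectTimeoutMs. */
    static boolean isReachable(String host, int port, int connectTimeoutMs) {
        try (Socket socket = new Socket()) {
            // connect(SocketAddress, int) throws SocketTimeoutException when the timeout expires.
            socket.connect(new InetSocketAddress(host, port), connectTimeoutMs);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unresolved host: report the service as unavailable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Placeholder host and port, for illustration only.
        System.out.println(isReachable("localhost", 8080, 2000));
    }
}
```

The connect timeout only covers establishing the connection. Reads on an open socket need their own limit (`Socket.setSoTimeout`), otherwise a request thread can still block indefinitely on a service that accepts the connection and then never answers.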

Bill
 
Alexander Rondel
Greenhorn
Posts: 3
Hi Bill,


Contrary to what the server administrators said at first, there was an enormous peak in CPU usage at the time of the error.

There seems to be something going wrong in the application's source code... I'll add massive logging to it so I can track the problem down the next time it occurs (which hopefully isn't during the next 20 years ;) )


Thanks a lot for your help
Alex
 