I'm facing what seems to be a silent drop of the HTTP connection to a Tomcat
server while it serves a long-running request from a browser.
I was wondering whether anybody else has run into this problem, and how they solved it.
The situation is like this:
My webapp uses Struts 1.2 / JSP / jQuery, and runs on Tomcat 6.0.29 on a remote 64-bit Ubuntu VPS.
It involves video conversion: users upload video files (typically 5-20 MB), and the app converts them to FLV.
Upload form is submitted from a JSP page, using jQuery ajaxForm plugin.
When user clicks Upload, ajax request is POSTed, encoded as "multipart/form-data".
A Struts Action receives uploaded file, converts to FLV and executes other logic involving DB operations.
During processing, the user just sees an animated busy icon. The server sends no HTTP response while processing is going on.
Once processing is complete, success, failure or error messages are returned to browser in JSON ("application/json") format with a HTTP SC_OK (200) status.
The ajaxForm's success() or error() handlers then execute and show the received information.
Now the problem:
On a single machine or LAN, everything works fine, though the entire processing may sometimes take as much as 15 minutes.
But on the remote Ubuntu VPS, what I find is that after some time (~2 minutes), the HTTP connection drops silently. This is revealed by netstat -an.
No exceptions are thrown - neither in the app code nor any seen in Tomcat logs.
I'm catching Throwables, not just checked Exceptions, and logging them - so there's definitely no exception from the app.
Moreover, the processing on server goes through completely successfully (revealed by logs).
Even the response.getWriter().write() for the JSON response goes through without exception.
But client machine never receives it (revealed by wireshark). So browser is not notified at all about the response.
The user ends up seeing the animated busy icon for about 30 minutes, until some browser timeout comes into play and ajaxForm's error() handler is called with error=timeout.
Initially, I assumed it was an issue with this particular server. But surprisingly, the exact same problem occurs on another remote server too: an Amazon EC2 AMI running Amazon 64-bit Linux and Tomcat.
In order to systematically isolate the issue, I did some experiments:
- Is it a Struts issue? To find out, I wrote another simple webapp with just 1 servlet, which simulates a long-running operation by sleeping for about 15 minutes without sending any HTTP response. It then wakes up, sends 1 response and ends. The same problem is seen here too. So the issue is not with Struts or file upload.
- Is it Tomcat configuration? I experimented by setting Tomcat's Connector properties in conf/server.xml. I played around with all combinations and ranges of connectionUploadTimeout, all to no avail.
- Problem is seen on all 3 browsers I tested (IE,FF,Chrome), so it can't be a browser issue.
- Experimented with the client side keep alive values (net.http.keep-alive.timeout in FF's about:config). Again, no success.
- One other clue I got, though it's from the Tomcat JK connector docs (which are not applicable to me), describes a problem that seems similar:
...One particular problem with idle connections comes from firewalls, that are often deployed between the web server layer and the backend. Depending on their configuration, they will silently drop connections from their status table if they are idle for to long...
- The response "Transfer-encoding" is set by default to "chunked" by Tomcat itself (revealed by Wireshark). So it can't be a problem of clients thinking response is over.
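The single-servlet experiment above can be approximated without Tomcat at all, which may help confirm that the drop happens somewhere on the network path rather than in the container. The following is only a sketch under that assumption: it swaps the servlet for the JDK's built-in com.sun.net.httpserver.HttpServer, and the class name SilentRequestRepro, the /slow path and port 8080 are made up for illustration. Passing 0 as the response length makes the JDK server use chunked transfer encoding, similar to what Wireshark showed for Tomcat.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

final class SilentRequestRepro {

    /**
     * Starts a server whose /slow handler stays completely silent for
     * silentMillis and then sends a single JSON response.
     * The default executor is used, so requests are handled one at a time,
     * which is enough for a single-client experiment.
     */
    static HttpServer start(int port, long silentMillis) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/slow", exchange -> {
            try {
                Thread.sleep(silentMillis); // nothing is written during this time
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            byte[] body = "{\"status\":\"done\"}".getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, 0); // 0 = chunked transfer encoding
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical values: port 8080, 15 minutes of silence.
        start(8080, 15L * 60 * 1000);
        System.out.println("Request http://<server>:8080/slow from a remote client");
    }
}
```

If a remote client never receives the response from this server either, while a client on the same LAN does, that would point at something between the two machines (for example a firewall or NAT device dropping idle connections) rather than at Tomcat or Struts.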
My current workaround
The only solution I've found so far is to write some small bits of response periodically about every 30 secs - while processing is going on - just to keep the connection alive.
It's unnecessary for functionality and smells like a hack to me!
Worse, I have to bring in some kind of timer thread to write the periodic response while a worker thread does the processing, so I'll be creating threads in the Tomcat environment!
Overall, I'm not comfortable at all with this solution that seems more of a workaround.
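For reference, the workaround described above could look roughly like the sketch below. It is an illustration under assumptions, not the actual app code: the names HeartbeatSketch and runWithHeartbeat are hypothetical, the Writer parameter stands in for response.getWriter(), and the heartbeat is assumed to be a single space character, which JSON parsers ignore as leading whitespace. Here the request thread does the processing and only the timer is a separate thread.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.concurrent.Callable;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class HeartbeatSketch {

    /**
     * Runs task on the calling thread while a timer thread writes one space
     * to out every periodMillis. The task must not write to out itself.
     * The timer is fully stopped before this method returns.
     */
    static <T> T runWithHeartbeat(Writer out, long periodMillis, Callable<T> task)
            throws Exception {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        try {
            timer.scheduleAtFixedRate(() -> {
                try {
                    out.write(' ');
                    out.flush(); // push the byte onto the wire right away
                } catch (IOException e) {
                    // Client is gone; an exception suppresses further runs.
                    throw new UncheckedIOException(e);
                }
            }, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
            return task.call();
        } finally {
            timer.shutdownNow();
            timer.awaitTermination(5, TimeUnit.SECONDS);
        }
    }

    public static void main(String[] args) throws Exception {
        StringWriter out = new StringWriter(); // stand-in for response.getWriter()
        String json = runWithHeartbeat(out, 100, () -> {
            Thread.sleep(350); // stand-in for the video conversion
            return "{\"status\":\"ok\"}";
        });
        out.write(json); // safe: the timer has already stopped
        System.out.println("[" + out + "]");
    }
}
```

One consequence to keep in mind: in a servlet, the first flush commits the response, so the status code and headers can no longer be changed afterwards. That fits the current design, where every outcome is reported as JSON with a 200 status, but it would rule out returning a different HTTP status on failure.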
And now my questions:
1) I don't think this use case of a long running operation is all that rare. So have any of you run into this issue and how did you solve/workaround it?
2) Is it really necessary to send a response periodically just to keep the connection alive?
3) The timer thread to send periodic responses and the worker thread to do the processing are just an idea that is simple to implement at the moment. But what would be a good solution from your point of view? This app has a roadmap to eventually become a 3-tier one where processing will be handed off to an EJB in an app server, but at this point everything has to run in Tomcat due to other constraints. And even with a separate tier, the problem of a long-running operation in the web tier still remains. So what would you suggest as a good solution?
All suggestions are welcome.
Thanks in advance,