
Hung threads caused by HashMap in dom4j?

 
Surender Suri
Ranch Hand
Posts: 46
Hi,

We are using dom4j to create an XML request and make an HTTP POST call to an external client, but for the past few weeks we have been experiencing hung threads. When I took a thread dump, this is what I saw, and there are a lot of them. Is this a known issue, or am I missing something?

Any help would be appreciated.

"http-80-20" - Thread t@688
java.lang.Thread.State: RUNNABLE
at java.util.HashMap.get(HashMap.java:303)
at org.dom4j.tree.QNameCache.get(QNameCache.java:79)
at org.dom4j.DocumentFactory.createQName(DocumentFactory.java:157)
at org.dom4j.tree.AbstractElement.addElement(AbstractElement.java:704)
at com.hypertechsolutions.ipm.cars.carconnector.CarServiceHandler.ā(CarServiceHandler.java:1062)

Thanks,
Suri
 
Henry Wong
author
Sheriff
Posts: 23283
This is probably something else, but I did run into cases where the hashmap becomes a trap for threads.

On particular JVM versions, on machines with lots of physical cores, and in applications using lots of threads: if these threads all share a HashMap that is *not* synchronized, then under certain usage (which requires luck) the HashMap can become corrupted in a circular way. The data structure is corrupted in such a way that a get() operation will go into an infinite loop.

So... once this happens, any thread that calls the get() operation will get stuck in that method -- eventually, since there is no synchronization, all of the threads will get stuck in that method.

Henry
 
Mike Simmons
Ranch Hand
Posts: 3090
Henry's theory sounds very plausible to me - even if he said it's probably something else.

Look at the first line of each stack trace (at least, each stack trace that's involved in this HashMap problem). Is it always pointing to line 303 of HashMap.java? Or is it often pointing to different lines, close to 303?

Here's the get() method I see for JDK 6 (1.6.0_22), which is what I happen to have available right now:

Line 303 is the one with e.next - though it may be different on your machine. What exact version do you get when you type java -version? Can you download the source for that version? Anyway, if line 303 is inside that for loop, and if there's a circularity in the bucket entries, then you've got an infinite loop, and the line numbers will vary from thread to thread, and vary over time (if you get a second or third stack dump of the same threads). But they will always be in a small range close to 303.
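
To make the "circularity in the bucket entries" idea concrete, here is a minimal sketch of a single bucket chain. It is a toy model, not the JDK's HashMap source: the Entry class, boundedGet and the step limit are all made up for illustration. The step limit exists only so the demo terminates; a real unbounded walk over the circular chain would never return for a key that isn't there.

```java
// Toy model of a single hash bucket; this is NOT the JDK's HashMap source.
final class CircularBucketDemo {

    /** Minimal stand-in for one entry in a bucket's linked list. */
    static final class Entry {
        final String key;
        final String value;
        Entry next;

        Entry(String key, String value, Entry next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    /**
     * Walks the chain the way a chained hash map's get() would, but gives up
     * after maxSteps entries so the demo terminates. Without the limit, a
     * lookup for an absent key on a circular chain would spin forever.
     */
    static String boundedGet(Entry head, String key, int maxSteps) {
        int steps = 0;
        for (Entry e = head; e != null; e = e.next) {
            if (e.key.equals(key)) {
                return e.value;
            }
            if (++steps >= maxSteps) {
                throw new IllegalStateException(
                        "gave up after " + steps + " steps; chain looks circular");
            }
        }
        return null; // reached the end of a healthy chain
    }

    /** Builds a -> b -> null. */
    static Entry healthyChain() {
        Entry b = new Entry("b", "2", null);
        return new Entry("a", "1", b);
    }

    /** Builds a -> b -> a -> ..., the kind of cycle described above. */
    static Entry corruptedChain() {
        Entry a = healthyChain();
        a.next.next = a; // b now points back at a
        return a;
    }

    public static void main(String[] args) {
        System.out.println(boundedGet(healthyChain(), "missing", 1000));
        System.out.println(boundedGet(corruptedChain(), "b", 1000));
        try {
            boundedGet(corruptedChain(), "missing", 1000);
        } catch (IllegalStateException ex) {
            System.out.println(ex.getMessage());
        }
    }
}
```

Note that a lookup for a key that is present still returns on the corrupted chain; only lookups that walk past every entry get trapped. That would fit a thread dump where the stuck threads all sit on the same loop line.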

As for how to fix it - do you provide or create this hashmap yourself? Or is it internal to dom4j? If you provide it, then it should be simple to replace it with a HashMap wrapped in Collections.synchronizedMap(). Or maybe a Hashtable or ConcurrentHashMap. If it's internal to the dom4j parser, then maybe you should try using a different XML parser. Or maybe you can limit the way different threads access dom4j. I'm not very familiar with how dom4j works or how you're using it, so there's not much more I can say here.
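
For the case where you do own the map, a rough sketch of the two replacements mentioned above might look like this. The class name, the worker count and the key ranges are assumptions made up for the demo; only Collections.synchronizedMap() and ConcurrentHashMap are real JDK APIs. It needs Java 8 or later because of the lambda.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class SafeMapDemo {

    /**
     * Starts `threads` workers that each put `keysPerThread` distinct keys
     * into the shared map, waits for them, and returns the final size.
     */
    static int fillConcurrently(Map<Integer, Integer> shared, int threads, int keysPerThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * keysPerThread; // disjoint key range per worker
            workers[t] = new Thread(() -> {
                for (int i = 0; i < keysPerThread; i++) {
                    shared.put(base + i, i);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", ex);
            }
        }
        return shared.size();
    }

    public static void main(String[] args) {
        // Option 1: wrap a plain HashMap so every call takes the wrapper's lock.
        Map<Integer, Integer> wrapped = Collections.synchronizedMap(new HashMap<>());
        // Option 2: use a map designed for concurrent access.
        Map<Integer, Integer> concurrent = new ConcurrentHashMap<>();

        System.out.println(fillConcurrently(wrapped, 8, 5000));
        System.out.println(fillConcurrently(concurrent, 8, 5000));
    }
}
```

Both maps should end up holding every key. Passing a bare HashMap to the same method is the unsafe case this thread is about: it may lose entries or, on some JVMs, hang.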
 
Surender Suri
Ranch Hand
Posts: 46
Hi Mike/Henry,

The JDK version on our server is jdk1.6.0_17, and yes, it's always pointing to line 303 of HashMap.java. I will download this version's source and verify the code, though I guess I won't find much difference. The HashMap code is actually internal to dom4j, so I guess using a different XML parser may help.

Will it help if the JDK version is changed, say back to JDK 1.5 update 16? That was the version we were using before, and we never faced this issue until we upgraded to 1.6.0_17.

Thanks,
Suri
 
Henry Wong
author
Sheriff
Posts: 23283
Mike Simmons wrote: Henry's theory sounds very plausible to me - even if he said it's probably something else.


To be fair, I wasn't that confident because it was a very hard state to get into. I only saw it with machines with lots of processor cores. I guess there were changes with the new versions of the JVM that made it more likely.

Henry
 
Henry Wong
author
Sheriff
Posts: 23283
Surender Suri wrote:
Will it help if the JDK version is changed, say back to JDK 1.5 update 16? That was the version we were using before, and we never faced this issue until we upgraded to 1.6.0_17.


Keep in mind that you always had the issue -- this symptom just never appeared. In your case, it appeared with the JVM upgrade, but it could also have appeared with a hardware upgrade, an OS upgrade, or even a software upgrade of something else in your system that changed the timing. And since it is a race condition, it could also have manifested itself with different symptoms that you just never noticed.

I would recommend looking to see if there is a newer version of dom4j. Maybe it has been fixed.

Henry
 
Mike Simmons
Ranch Hand
Posts: 3090
Suri, is the CarServiceHandler part of the stack trace your code? Or code that's under the control of people you work with? It looks like you're calling addElement from multiple threads. Are all these threads working with the same document? Are all these threads able to modify the same shared data? If so, does dom4j offer any guarantees of thread-safety? I can't find any, in my limited googling. I suspect the fundamental problem here is that there's no thread safety in this code in the first place, and that's just now being detected. Much as Henry just suggested.

Aside from trying a newer dom4j or other parsers, it might be worthwhile to add some synchronization yourself. This is risky if we don't really understand just how the dom4j code works. But let's assume that CarServiceHandler.java line 1062 contains something like this:

Try replacing that with:

If that doesn't work, try replacing synchronized(parent) with something like synchronized(document), where document is the root of the whole object you're editing. I don't know if the HashMap that's having problems is associated with each individual element, or with the document as a whole. And I don't know if there are other data structures within dom4j that also need protection from concurrent access. These are just guesses, perhaps worth a try.

But in the long run, the best solution is probably to find an XML parser that explicitly claims to be thread-safe. If no such claim is made, you should assume that it isn't thread-safe.
 
Mike Simmons
Ranch Hand
Posts: 3090
Mike Simmons wrote: Try replacing that with:

Looking at the source for dom4j-1.6.1, it's unlikely this will work. The DocumentFactory is likely to be a singleton, so it's shared by all Element instances. Synchronizing on an individual element will not provide protection when two elements call createQName at the same time. You probably need to sync on something static.
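
As a rough illustration of "sync on something static", here is a sketch in which every access to a shared, non-thread-safe cache goes through one static lock. None of this is dom4j code: NAME_CACHE, CACHE_LOCK and cachedName are hypothetical stand-ins for a factory-level name cache that all elements share. It needs Java 8 or later because of the lambda.

```java
import java.util.HashMap;
import java.util.Map;

final class SharedCacheLockDemo {

    // Hypothetical stand-in for a cache shared by every element instance.
    private static final Map<String, String> NAME_CACHE = new HashMap<>();

    // One static lock, so all callers contend on the same monitor no matter
    // which element they are working on.
    private static final Object CACHE_LOCK = new Object();

    /** Returns the cached copy of the name, adding it on first use. */
    static String cachedName(String name) {
        synchronized (CACHE_LOCK) {
            String cached = NAME_CACHE.get(name);
            if (cached == null) {
                cached = name;
                NAME_CACHE.put(name, cached);
            }
            return cached;
        }
    }

    static int cacheSize() {
        synchronized (CACHE_LOCK) {
            return NAME_CACHE.size();
        }
    }

    /** Runs `threads` workers that each look up `namesPerThread` distinct names. */
    static int runWorkers(int threads, int namesPerThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < namesPerThread; i++) {
                    cachedName("t" + id + "-" + i);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", ex);
            }
        }
        return cacheSize();
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(8, 1000));
    }
}
```

Locking on an individual element would not help here, because two different elements would take two different locks and still reach the shared map at the same time. The same reasoning applies to any lock you wrap around library calls from the outside: it only protects the shared state if every caller uses the same lock object.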

Actually, looking at the 1.6.1 code, it's apparent that the HashMap in question is already wrapped in a Collections.synchronizedMap(). So at least this version of the code is designed for thread safety. It's also clear from the differing line numbers that 1.6.1 is not the version you're using. So I recommend doing as Henry says: upgrade to a newer version of dom4j. They've quite possibly already fixed this problem for you.
 
Surender Suri
Ranch Hand
Posts: 46
Hi Mike,

CarServiceHandler is part of our code. It is invoked by multiple threads and uses the code below to get an instance of DocumentFactory and create a Document object. We are using dom4j version 1.1, and I'm not sure whether that guarantees any thread safety.

Document carReservationRequest = DocumentFactory.getInstance().createDocument();
...
Element emailInfo = primary.addElement("Email");
emailInfo.setText(bookingRequest.getEmailId());
Element address = primary.addElement("Address");
Element address1 = address.addElement("AddressLine"); // This is line 1062, where the thread hangs.
address1.setText(bookingRequest.getAddressLine1());
Element address2 = address.addElement("AddressLine");
address2.setText(bookingRequest.getAddressLine2());

I guess, as you and Henry suggested, upgrading dom4j to version 1.6 is the best bet. I will try it and see if that resolves the issue.

Thanks for all your help.
 