• Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other Pie Elite all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Jeanne Boyarsky
  • Ron McLeod
  • Paul Clapham
  • Liutauras Vilda
Sheriffs:
  • paul wheaton
  • Rob Spoor
  • Devaka Cooray
Saloon Keepers:
  • Stephan van Hulst
  • Tim Holloway
  • Carey Brown
  • Frits Walraven
  • Tim Moores
Bartenders:
  • Mikalai Zaikin

Not so simple NPE

 
Greenhorn
Posts: 23
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Howdy ranchers, I found NullPointerException that I can't explain.
I'm going mad because of it so maybe you will help me.

Here is method that throws NPE, line numbers are in comments


NPE was thrown from line 321 during stress tests under heavy load, it happened only few times and can't be reproduced, code is multi-threaded.

Any ideas ?

Jacek
 
Rancher
Posts: 4803
7
Mac OS X VI Editor Linux
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Originally posted by Jacek Ostrowski:

Any ideas ?



Nothing obvious, but I'd wrap the code with a Mutex and see if that improved it.

Can you tell if its is 'param' that is null, I would expect it is your allocated Object array. I'd add some if ( x== null) printStackTrace code, if you can't set breakpoints.
 
Wanderer
Posts: 18671
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Since objs is a local variable, that really seems impossible. I think the most likely scenario is that you're accidentally executing an older version of the code, and the line number is incorrect. Otherwise, it seems like it would have to be a JVM bug of some sort. You could try looking for a bug report, but I suspect there are many NullPointerException reports that are not related to this problem.

It may be possible to put in a workaround, catching the NPE, logging it, and retrying the failed code - but I'd be very reluctant to do something like that without understanding what the problem really is.

What kind of heavy stress are you dealing with? Lots of threads? Sheer CPU load? IO? Or something else?
 
Sheriff
Posts: 22781
131
Eclipse IDE Spring VI Editor Chrome Java Windows
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Does the NPE also occur if you change it into the following:
 
Jacek Ostrowski
Greenhorn
Posts: 23
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Can you tell if its is 'param' that is null



Yes, there is another log method that calls this one with a null param.
But it should have no meaning and it should not throw NPE.

Since objs is a local variable, that really seems impossible


Exactly what I'm thinking.

I think the most likely scenario is that you're accidentally executing an older version of the code


That is the first what I've checked. Line numbers are correct.

it seems like it would have to be a JVM bug of some sort. You could try looking for a bug report, but I suspect there are many NullPointerException reports that are not related to this problem.


It is my conclusion too. There are many NPEs in bug database and so far I haven't found anything close to mine.

What kind of heavy stress are you dealing with ?


We do load tests and this means that we have many threads and high CPU load.


Thanks for those suggestions. I'm going back to java bug database

Jacek
 
Ranch Hand
Posts: 1282
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Originally posted by Jacek Ostrowski:
Yes, there is another log method that calls this one with a null param.
But it should have no meaning and it should not throw NPE.


Why not ? if(level == null || msg == null || params == null)result == npe;// duh

Since objs is a local variable, but not the objs[0] ? ( no ? ) - so I'd take P.F.'s approach.
 
Jacek Ostrowski
Greenhorn
Posts: 23
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator


Why not ? if(level == null || msg == null || params == null)result == npe;// duh



I agree that NPE may be thrown from lines 322 or 324 if some of arguments are null, but I can't see how it could be thrown from line 321. It looks like line 320 was not executed or at least not completed before line 321 was called.

So far the most important for me is to understand how it happened.

Jacek
 
Nicholas Jordan
Ranch Hand
Posts: 1282
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Well the only thing that makes any sese to me is JLS - 17.3 Incorrectly Synchronized Programs Exhibit Surprising Behaviors, which I gravitate towards because the problem description is it happened only few times and can't be reproduced therefore I go on the board that we may never find it on the basis that the choice of words by working engineers - adjuncted by real-world constraints - suggests that if multiple threads can access the code - a reasonable assumption given your problem definition - then all of the fixes suggested here should be applied.

What I meant was if at any line, some thread other than the one that entered, does something with any of the ref's, not just the one under suspicion, then we get an npe. The fact that this occurs only under heavy load complicates isolation and recovery, and I sought to generalize this. So if the problem is to be found, I suggest wrapping the code in some sort of harness. An app that runs reliably is usually better than one that runs fast but fails occasionally.
[ December 03, 2007: Message edited by: Nicholas Jordan ]
 
author and iconoclast
Posts: 24207
46
Mac OS X Eclipse IDE Chrome
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Umm, yeah, no not so much. They're local variables. Nothing else can access them. Stale memory caches of other variables can't affect anything here.

Jacek, tell us something about the environment this code is running in. Who is the JVM vendor (is it Sun's or something else?)

Are you using any native code (that you wrote, or third party?) Threads can hit each other with stray pointer accesses, since they're in the same memory space; perhaps native code on some other thread is smashing your stack?
 
Jim Yingst
Wanderer
Posts: 18671
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
I wonder if it's just a bug in reporting which line the error is coming from. It could be that some JIT optimizations are confusing the line number info. I haven't seen that happen, but it's the most plausible explanation I can think of. Perhaps you should just check for null arguments, and throw a clear exception if you find them:

Of course, maybe it's OK for some of those to be null. If you're using classes from java.util.logging, then the Level is the only one that should throw an NPE if it's null. At least when using the SimpleFormatter. Anyway, try testing what happens when different inputs are null.
 
Jacek Ostrowski
Greenhorn
Posts: 23
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
This code runs as standalone process on sun jdk 1.5.0_11 on RedHat Linux.
As you can suspect shown method is a part of logging utility.
Application uses third party native code via JNI but as far as I know we never had any problems with invalid stack traces.

As for JLS 17.3 suggestion - I agree that it looks like there is some synchronization issue but I can't find what it could be.
Class that thrown NPE is a log4j Logger wrapper and it only has static field logger. Classes using this wrapper follow log4j usage pattern - they declare static field logger.


Jacek
 
Ernest Friedman-Hill
author and iconoclast
Posts: 24207
46
Mac OS X Eclipse IDE Chrome
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Originally posted by jaceko:

Application uses third party native code via JNI



Well, this is very significant because given this, "impossible" stuff now becomes possible. Thread-safety or JNI bugs in that native code could cause pretty much anything to happen. If the JNI code itself is third-party, have you talked to that vendor at all? Otherwise, you need to look hard at the code yourself. Are you using local vs. global references correctly? Are you pinning things when needed, and unpinning them afterwards? JNI is full of dragons.
 
reply
    Bookmark Topic Watch Topic
  • New Topic