I've been collecting debugging "war stories" for a while now - several of them are used to illustrate points in the book.
I'd be very interested to hear your own stories. What was the most difficult bug you've had to track down? Or the most bizarre cause you eventually uncovered? Perhaps you invented a novel technique to unearth the problem?
Everyone has at least one bug that they thought they would never track down, until finally the penny drops. What was yours?
One problem that I faced was deploying a war file that was compiled with one JDK version to a server running a different JDK/JRE version ... I forget the exact exception, but you can identify it by an L that is appended to the end of the stack trace ... this might be very hazy as to what I am trying to say, but I thought I would just share my experience ..
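One way to check for a mismatch like this is to read the version stamped into a compiled class. The sketch below assumes the problem was a class-file version mismatch; the post does not name the exception, so that is a guess (`java.lang.UnsupportedClassVersionError` is the usual symptom in that case). Every `.class` file starts with the magic number `0xCAFEBABE`, followed by a two-byte minor and a two-byte major version. The class name `ClassVersionCheck` and the hand-built 8-byte header are made up for illustration.

```java
import java.nio.ByteBuffer;

public class ClassVersionCheck {

    /** Returns the major version from the first 8 bytes of a class file. */
    public static int majorVersion(byte[] classFileBytes) {
        // ByteBuffer is big-endian by default, which matches the class-file format.
        ByteBuffer buf = ByteBuffer.wrap(classFileBytes);
        if (buf.remaining() < 8 || buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        buf.getShort();                 // skip the minor version
        return buf.getShort() & 0xFFFF; // major version as an unsigned value
    }

    public static void main(String[] args) {
        // Hypothetical header: magic number, minor 0, major 52 (Java 8).
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52};
        System.out.println("major version: " + majorVersion(header));
    }
}
```

If the major version of the classes inside the war is higher than what the server's JRE supports, the classes will not load there.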
The nastiest bug that I had to hunt down was a heisenbug when I was developing a J2ME (racing) game where two players could compete via Bluetooth. I can't recall all the details of it, but as far as I remember it was like this: the two mobile phones connected via Bluetooth, and a few seconds after the level started one application just froze while the other one was still functioning.
On the PC in the emulator everything worked fine. If I used a remote debugger: everything fine. If I used the RMS (= record management system; on mobile phones you couldn't use files directly, you had to use the RMS) for logging: everything fine. Using a different combination of phones: everything fine.
I can't remember if both phones were the same model or if they were different models with different processor speeds. In the end it was a race condition that made the application deadlock. I seem to remember that I could narrow the source of the bug down to a few lines. But I just couldn't see what it was. (Some shots in the dark using synchronized blocks at different locations didn't help.) I printed the method and walked around with this paper. It was driving me nuts.
It took me two weeks to figure out why the application deadlocked.
And whoever has developed a J2ME application knows how painful deployment is. For every change you had to compile, copy it via Bluetooth or cable onto both mobile phones, start both applications, connect, play the level and pray. And as soon as the application froze: battery out, restart the mobile phone (you couldn't kill the application in any other way).
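The post does not say what the actual cause was, so the following is only a generic illustration of why this kind of deadlock depends on timing. A classic pattern is two threads taking the same two locks in opposite order: it only hangs when both threads grab their first lock before either reaches the second. In this sketch the latches force exactly that interleaving, and `tryLock` with a timeout stands in for a plain `lock()` so the program finishes instead of hanging. All names (`LockOrderDemo`, `attempt`, `run`) are made up.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public class LockOrderDemo {

    /** Runs two threads that take locks a and b in opposite order.
     *  Returns how many of them managed to get their second lock. */
    public static int run() {
        ReentrantLock a = new ReentrantLock();
        ReentrantLock b = new ReentrantLock();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        AtomicInteger acquired = new AtomicInteger();

        Thread t1 = new Thread(() -> attempt(a, b, bothHoldFirst, bothTried, acquired));
        Thread t2 = new Thread(() -> attempt(b, a, bothHoldFirst, bothTried, acquired));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return acquired.get();
    }

    private static void attempt(ReentrantLock first, ReentrantLock second,
                                CountDownLatch bothHoldFirst, CountDownLatch bothTried,
                                AtomicInteger acquired) {
        first.lock();
        try {
            // Wait until the other thread holds its first lock as well.
            bothHoldFirst.countDown();
            bothHoldFirst.await();
            // With a plain second.lock() both threads would block here forever.
            if (second.tryLock(100, TimeUnit.MILLISECONDS)) {
                try {
                    acquired.incrementAndGet();
                } finally {
                    second.unlock();
                }
            }
            // Keep holding the first lock until both threads have tried.
            bothTried.countDown();
            bothTried.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("threads that got their second lock: " + run());
    }
}
```

Without the latches, whether the two threads collide depends on scheduling and processor speed, which would fit the observation that a debugger, extra logging, or a different pair of phones made the freeze disappear.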
As a sidenote: due to some legal obscurities the game was never published *sigh*
Paul Butcher wrote: What was the most difficult bug you've had to track down?
I can't share details due to NDAs, but the two most typical difficult types I've had:
1) Randomness - something that only happens when certain entries are inserted into a HashMap. (It's only deterministic if you know what the entries are.)
2) Multi-condition bugs coupled with something that doesn't happen often. Such as a bug that only happens on Tuesdays in a leap year if you pick two particular account numbers in succession.
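As an illustration of point 1 (not the NDA-covered bug itself): one common way HashMap contents decide whether a bug shows up is code that quietly relies on iteration order. `HashMap` makes no promise about order, so the result depends on which keys are in the map, while `LinkedHashMap` always iterates in insertion order. A minimal sketch; the class name, method name and keys are made up.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MapOrderDemo {

    /** Puts the keys into the given map and returns them in iteration order. */
    public static List<String> keysInIterationOrder(Map<String, Integer> map, String... keys) {
        for (int i = 0; i < keys.length; i++) {
            map.put(keys[i], i);
        }
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        String[] keys = {"banana", "apple", "cherry"};
        // Order here is unspecified and depends on the keys' hash codes.
        System.out.println("HashMap:       " + keysInIterationOrder(new HashMap<>(), keys));
        // Order here is always the insertion order.
        System.out.println("LinkedHashMap: " + keysInIterationOrder(new LinkedHashMap<>(), keys));
    }
}
```

Code that takes "the first entry" from the `HashMap` version can pass every test with one data set and fail with another.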
Deployment: Imagine having 6 instances of app servers and a key file or two doesn't make it to just one instance.
Logging: Creating a logger/log-file that the team isn't familiar with, and logging key info there (e.g. SQL exceptions).
Settings: Users who have Windows DPI settings at 120%, which causes all sorts of UI display grief.
Clients: who often use the word "bug" to denote a "feature" they feel is missing [face-palm]
Dates: over the past 10 years, I've repeatedly seen issues with date comparisons across many languages.
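One example of the date-comparison kind (a sketch of a common pitfall, not a specific bug from the list above): the same moment in time falls on different calendar dates depending on the time zone, so comparing "dates" computed on two machines can disagree. The class name, method name and the chosen instant are made up for illustration.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;

public class DateCompareDemo {

    /** Returns the calendar date that the given instant falls on in the given zone. */
    public static LocalDate dateIn(Instant instant, String zoneId) {
        return instant.atZone(ZoneId.of(zoneId)).toLocalDate();
    }

    public static void main(String[] args) {
        Instant instant = Instant.parse("2024-03-01T02:00:00Z");
        LocalDate utc = dateIn(instant, "UTC");
        LocalDate newYork = dateIn(instant, "America/New_York");
        // prints "2024-03-01 vs 2024-02-29 equal? false"
        System.out.println(utc + " vs " + newYork + " equal? " + utc.equals(newYork));
    }
}
```

Comparing the `Instant` values directly, or converting both sides to the same zone before comparing, avoids the mismatch.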
I worked at a shop where we ran the DB, Web, and App server all on one machine. After a particular deployment - our app was massively slow. We churned through all the code, analyzed the logs, profiled the network traffic... and couldn't figure it out. We were getting a sizable amount of traffic, so the decision was made to purchase more servers/CPU/Memory. After all this shiny new hardware was installed - we found a rogue SQL statement (too many nested sub-queries) that was killing the DB. We refactored the SQL, and things returned to normal. Except that we kept all that new hardware.
Never underestimate the power of a well-timed bug!