I hope you don't mind, but I'm going to be a little greedy today and ask a second question in addition to my previous post.
I've been working on some very CPU-intensive applications (image processing, computational geometry, etc.). One thing I've found in working with performance issues is that one little experiment can undo a vast amount of theorizing. So every time I look at a way to improve the performance of my code, I try to run it through realistic timing tests to see if my ideas actually work. I've lost track of the times that some idea that I was sure was going to improve performance turned out to do just the opposite.
The thing is, as valuable as timing tests are, Windows is a terribly noisy testing environment. I never seem to know when something is going to kick off in the background when I'm running a test. I try to compensate for that by running multiple repetitions of the same test (and discarding the initial tests during which class loaders and JIT operations get involved). Even so, I routinely see time values that can bounce around by 20 or 30 percent (most of my tests take a few seconds to run). Does your book address ways of better isolating system performance issues when conducting timing tests?
If you are looking to test small amounts of code (outside the context of an application), you are heading down the path of writing a microbenchmark. There are many perils to doing this, especially if you are hand-rolling your own test harness. One example is that the Just-In-Time compiler only kicks in once code has been executed many times (10,000 invocations is the classic default, though the exact threshold depends on the JVM and its settings), so if you haven't got the right warm-up you won't see the true profile. There's also the issue, which I've seen many times, where the benchmarks people write are simply optimised away because the author does nothing with the result. You could log the results, but then the cost of logging ends up in the measurement, so that's not a true performance measure.
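To make those two perils concrete, here is a minimal hand-rolled sketch, not a recommendation. The class name `NaiveHarness`, the `workload` method and the iteration counts are all invented for illustration. It assumes that a discarded warm-up pass of 20,000 calls is enough to get past the JIT threshold mentioned above, and it folds every result into a `sink` value so the work cannot be thrown away as dead code.

```java
import java.util.Arrays;

class NaiveHarness {
    // Hypothetical workload standing in for the code under test.
    static long workload(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += ((long) i * i) % 7;
        }
        return sum;
    }

    // Runs the workload `reps` times and returns the elapsed nanoseconds of each run.
    // Every result is added to sink[0] so the JIT cannot treat the work as unused.
    static long[] timeRuns(int reps, int n, long[] sink) {
        long[] nanos = new long[reps];
        for (int r = 0; r < reps; r++) {
            long start = System.nanoTime();
            long result = workload(n);
            nanos[r] = System.nanoTime() - start;
            sink[0] += result;
        }
        return nanos;
    }

    public static void main(String[] args) {
        long[] sink = new long[1];

        // Warm-up pass (assumed sufficient here): timings are thrown away.
        timeRuns(20_000, 1_000, sink);

        // Measured pass: report the median, which is less sensitive to stray spikes.
        long[] nanos = timeRuns(1_000, 1_000, sink);
        Arrays.sort(nanos);
        System.out.println("median ns per run: " + nanos[nanos.length / 2]);

        // Printing the sink once, outside the timed region, keeps the results live.
        System.out.println("sink (ignore): " + sink[0]);
    }
}
```

Even with both precautions this is still a hand-rolled harness, so the number it prints deserves the same suspicion as any other.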
A benchmark always gives you a number, the issue is being able to trust that number.
If you are going to do microbenchmarking you really should be using JMH (the Java Microbenchmark Harness), which is developed within OpenJDK by people who work on the JVM itself. It will do things like warm-up for you and prevent unwanted optimisations from happening in your benchmark (e.g. removing the test itself).
Even JMH, though, won't necessarily help with proving a speed-up in your target application, as the optimisations applied at runtime inside the application could be very different from those applied to the isolated code under JMH.
The first principle is that you must not fool yourself – and you are the easiest person to fool. - Richard Feynman
One of the big traps in performance tuning is Confirmation Bias - we go look for the evidence of the causes that we think are behind our problems - we tend to cherry-pick data if we're not careful.
So, we need to take a step back, not second guess, and let the data speak. Part of that is recognising that we operate in noisy environments.
20-30% noise in an environment is, if anything, quite low.
Consider this bit of code:
All it's really doing is showing that the time taken to hash a string is proportional to its length - that is, O(n) - but what it actually shows us is a measure of how noisy our execution environment is.
Grab the data points and analyse their distribution against string length, and you'll see how much random error there is - easily 20-30% on some measurements on most real systems, I bet.
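The listing referred to above is not reproduced in this thread, so what follows is only a guess at the kind of experiment described, not the original code. The class `HashNoise`, the sample count and the string lengths are all assumptions. It times `String.hashCode()` on freshly built strings (a fresh string each time, because `String` caches its hash after the first call), fits a straight line of time against length, and reports how far the points scatter around that line.

```java
import java.util.Random;

class HashNoise {
    // Times a single hashCode() call on a newly built random string.
    static long timeHash(int length, Random rng, int[] sink) {
        char[] chars = new char[length];
        for (int i = 0; i < length; i++) {
            chars[i] = (char) ('a' + rng.nextInt(26));
        }
        String s = new String(chars);
        long start = System.nanoTime();
        int h = s.hashCode();
        long elapsed = System.nanoTime() - start;
        sink[0] ^= h; // keep the hash live so the call is not optimised away
        return elapsed;
    }

    // Ordinary least-squares fit of y = a + b*x; returns {a, b}.
    static double[] fitLine(double[] x, double[] y) {
        int n = x.length;
        double sx = 0, sy = 0, sxx = 0, sxy = 0;
        for (int i = 0; i < n; i++) {
            sx += x[i];
            sy += y[i];
            sxx += x[i] * x[i];
            sxy += x[i] * y[i];
        }
        double b = (n * sxy - sx * sy) / (n * sxx - sx * sx);
        double a = (sy - b * sx) / n;
        return new double[] {a, b};
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        int[] sink = new int[1];
        int samples = 200; // assumed sample count

        // Warm-up runs, timings discarded.
        for (int i = 0; i < 2_000; i++) {
            timeHash(10_000, rng, sink);
        }

        double[] len = new double[samples];
        double[] ns = new double[samples];
        double totalNs = 0;
        for (int i = 0; i < samples; i++) {
            int length = 50_000 + rng.nextInt(950_000);
            len[i] = length;
            ns[i] = timeHash(length, rng, sink);
            totalNs += ns[i];
        }

        // Scatter around the fitted line, relative to the mean measured time.
        double[] ab = fitLine(len, ns);
        double absResidual = 0;
        for (int i = 0; i < samples; i++) {
            absResidual += Math.abs(ns[i] - (ab[0] + ab[1] * len[i]));
        }
        double relativeNoise = (absResidual / samples) / (totalNs / samples);

        System.out.printf("ns per char (slope): %.4f%n", ab[1]);
        System.out.printf("mean deviation from fit: %.1f%% of mean time%n", 100 * relativeNoise);
        System.out.println("sink (ignore): " + sink[0]);
    }
}
```

The slope is the O(n) part; the deviation figure is the part that describes the machine rather than the code.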
My conclusion from this is that, unless we're squeezing out the last few glistening drops of performance, we have to concentrate on the big points first and take the noise as a fact of life. If a performance effect can't be made to rise above the noise, then we shouldn't regard it as a real effect. That's what Feynman would have done in physics, after all.
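One rough way to apply that rule to two sets of timings is sketched below. The criterion is my own assumption rather than a standard test: the difference between the medians has to exceed the larger interquartile spread of the two samples. The class `AboveNoise` and the sample timings are made up for illustration.

```java
import java.util.Arrays;

class AboveNoise {
    // Median of the values (the input array is left untouched).
    static double median(double[] values) {
        double[] s = values.clone();
        Arrays.sort(s);
        int n = s.length;
        return n % 2 == 1 ? s[n / 2] : (s[n / 2 - 1] + s[n / 2]) / 2.0;
    }

    // Crude interquartile spread: sorted value at index 3n/4 minus the one at n/4.
    static double spread(double[] values) {
        double[] s = values.clone();
        Arrays.sort(s);
        int n = s.length;
        return s[(3 * n) / 4] - s[n / 4];
    }

    // Assumed rule of thumb: the shift in medians must exceed the larger spread.
    static boolean risesAboveNoise(double[] before, double[] after) {
        double shift = Math.abs(median(before) - median(after));
        double noise = Math.max(spread(before), spread(after));
        return shift > noise;
    }

    public static void main(String[] args) {
        // Invented timings in milliseconds, each with one background-task spike.
        double[] before = {100, 102, 98, 101, 99, 130, 100, 97};
        double[] clearlyFaster = {70, 72, 69, 71, 68, 95, 70, 73};
        double[] aboutTheSame = {99, 101, 97, 100, 98, 128, 99, 103};

        System.out.println(risesAboveNoise(before, clearlyFaster)); // true
        System.out.println(risesAboveNoise(before, aboutTheSame));  // false
    }
}
```

Medians and quartiles are used here because a single spike from a background task barely moves them, whereas it can drag a mean a long way.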
Gary W. Lucas
posted 8 months ago
Thanks! I've never heard of JMH, and I am very grateful for the suggestion.
It's funny you should mention the compiler optimizing away the test logic. I've gotten burned by that a few times. A test would do all sorts of interesting work, but because the application didn't collect the computation results, the optimizer just removed the test logic from the code... which made it run much faster, of course, but didn't provide me with much information. So I've gotten into the habit of making sure that my test collects (and prints) a data result even if it's just a dummy value I don't care about.