Interesting - your results are surprising to me in several ways. For one, comparing the first two techniques: I would have expected startsWith() to be faster (or at least no slower) than a compiled regex, simply because I'd expect Sun's startsWith() to be pretty well-optimized (it's a simple operation, after all), while a compiled regex, being more complex and variable in nature, might not be optimized as well. I'm at a loss to guess why the regex might be significantly faster in this case.

The final result is surprising too, in two ways. First, that it's so much faster initially. I would have thought it would waste a lot of time searching the entire file, considering that for a final result you're only interested in the matches that occur in a particular field. But I guess it's faster to keep the initial search as simple as possible, avoiding the extra object churn of creating a new String (or possibly some other CharSequence) for each field before examining it. I assume that after you identify the records that seem to have a match (using the initial big CharSequence search), you then check each record more carefully to see whether the regex match actually occurs in the appropriate field? That would of course be more time-intensive, but at least it's only done on the small subset of records that matches the initial regex search, right?

Come to think of it, aside from object churn there's the fact that the first two techniques probably involve a lot of lock acquisition (and release and reacquisition) as you request a read of each individual record. The "one big search" strategy just requires a single big read - once you've got a copy of the whole file in a buffer, you can be completely independent of the other threads, right? That's a nice plus...

Anyway, the other odd thing about your final results is that while "one big read" is very, very fast initially, its performance seems to degrade sharply for the later threads.
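For what it's worth, here's roughly how I'd compare the two prefix techniques myself - a crude sketch, not a proper benchmark (no warmup control, and the JIT can skew things), and the sample record and all the names are made up:

```java
import java.util.regex.Pattern;

// Crude timing sketch: String.startsWith() vs. a precompiled regex
// anchored at the start of the input. Sample data is hypothetical.
public class PrefixTiming {
    public static void main(String[] args) {
        String record = "Smith, John; 123 Main St; Springfield";
        String prefix = "Smith";
        // Pattern.quote() keeps any regex metacharacters in the prefix literal
        Pattern p = Pattern.compile("^" + Pattern.quote(prefix));
        int n = 1_000_000;
        boolean sink = false;

        long t0 = System.nanoTime();
        for (int i = 0; i < n; i++) sink ^= record.startsWith(prefix);
        long t1 = System.nanoTime();
        for (int i = 0; i < n; i++) sink ^= p.matcher(record).find();
        long t2 = System.nanoTime();

        System.out.println("startsWith: " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("regex:      " + (t2 - t1) / 1_000_000 + " ms");
        System.out.println(sink); // use the results so the JIT can't drop the loops
    }
}
```

Run it a few times and ignore the first results - HotSpot needs a while to settle down before the numbers mean anything.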
The average performance still seems better than the first two methods, but I think there's cause for concern - what happens if you use 200 or 300 threads rather than 100? I wonder what the cause of this slowdown is. One possibility is that "one big read" requires more memory at one time, and the later threads are forcing more memory to be allocated to the JVM. Perhaps you can try running these tests with the -verbose:gc option, to see what's going on with total memory usage at the same time.

I'm just now considering how to implement find() for my assignment. I wouldn't have expected OBR to be worthwhile, but based on your results maybe I'll give it a try. Since the instructions don't mention performance much, I'm thinking there's little reason to worry about what happens if there are 100 simultaneous requests. But a single user may very well notice the difference between a 47 ms response and a 1.5-2.5 ms response, even if performance wasn't cited as a big concern. Thanks for the info - I'll let you know if I come up with additional data.
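In case it helps anyone else mulling this over, here's the kind of thing I have in mind for an OBR-style find(): one cheap regex pass over the whole buffer, then a careful per-field check only for the candidate records. The record layout (fixed-length records with the searched field at a fixed offset) and every name here are my own invention, not the assignment's actual API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch of the "one big read" strategy under an assumed layout:
// fixed-length records of RECORD_LEN chars, with the searched field
// occupying columns [FIELD_START, FIELD_END) of each record.
public class OneBigRead {
    static final int RECORD_LEN = 30;
    static final int FIELD_START = 10;
    static final int FIELD_END = 20;

    // Returns the record numbers whose target field contains the pattern.
    static List<Integer> find(String buffer, Pattern p) {
        List<Integer> hits = new ArrayList<>();
        Matcher m = p.matcher(buffer);
        int lastChecked = -1;
        while (m.find()) {
            int rec = m.start() / RECORD_LEN;   // record containing this match
            if (rec == lastChecked) continue;   // verify each record only once
            lastChecked = rec;
            int base = rec * RECORD_LEN;
            // cheap scan flagged a candidate; now confirm the match is in the field
            String field = buffer.substring(base + FIELD_START, base + FIELD_END);
            if (p.matcher(field).find()) {
                hits.add(rec);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // two 30-char records: "Alice" is in record 0's field, but only in
        // record 1's key column, so record 1 should be filtered out
        String rec0 = "key-000001Alice     trailing..";
        String rec1 = "Alice-key2Bob       trailing..";
        System.out.println(find(rec0 + rec1, Pattern.compile("Alice"))); // [0]
    }
}
```

The nice part is that the whole scan works off one private copy of the file, so no locks are held while matching - the per-field verification only runs on the handful of candidate records.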
"I'm not back." - Bill Harding, Twister