IO Performance problem

 
Bill Compton
Ranch Hand
Posts: 186
I have a Java program doing some IO that's about 2x slower than the corresponding C code, and I would like help optimizing the Java version. The C version runs in an average of 64 seconds and the Java version in about 127, not counting the time for the JVM to load and start. The task is to read many (~6000) small text files from a given directory. The salient part of the Java code is:

The text files look like:
AREA_NAME_GOES_HERE
40.67281150817,22.93920917511
40.23754310607,22.93920917511
40.23754310607,22.50393657684
40.67281150817,22.50393657684
END
In case it matters, I'm using Java version "Classic VM (build JDK-1.2-V, native threads)" and Borland's freebie C++ compiler version 5.5 on Win 2000 with plenty of memory (224 MB). To get consistent runtimes, I'm testing each just after a reboot to ensure they're neither penalized by other stuff running nor helped by files cached in memory. Suggestions...?
 
paul wheaton
Trailboss
Posts: 21736
I would start by making it multithreaded, maybe ten threads running at once. A lot of time gets sucked up opening each file.
Next, rather than using the StringTokenizer stuff, I would use the much faster String.indexOf(',').
Next, string comparison stuff is pretty slow, in Java as well as in C. I hate while loops that contain a lot of stuff, and I hate seeing code duplicated outside of the loop for initialization.
Closing your buffered reader will close all the other file stuff.
try this:
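
A minimal sketch of the suggestions above, not the code from the original post. The class and method names are hypothetical, and it assumes the file layout shown earlier (a name line, "number,number" lines, then END). It finds the comma with indexOf instead of a StringTokenizer, and it avoids a string comparison by treating the first comma-free line as the END marker, which is an assumption about the data.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the names are illustrative and not from the thread.
class AreaFileParser {

    // Reads one area file: a name line, then "number,number" lines, then END.
    static List<double[]> readPoints(Reader source) {
        List<double[]> points = new ArrayList<>();
        // Closing the BufferedReader also closes the reader it wraps.
        try (BufferedReader in = new BufferedReader(source)) {
            in.readLine(); // area name, ignored in this sketch
            String line;
            while ((line = in.readLine()) != null) {
                int comma = line.indexOf(',');
                if (comma < 0) {
                    break; // assumption: the only comma-free line after the name is END
                }
                double first = Double.parseDouble(line.substring(0, comma));
                double second = Double.parseDouble(line.substring(comma + 1));
                points.add(new double[] {first, second});
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return points;
    }
}
```

For a file on disk, the call would look something like AreaFileParser.readPoints(new FileReader(file)).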


 
Bill Compton
Ranch Hand
Posts: 186
Thanks for the suggestions and code, Sheriff. The leaner string / parsing stuff whittled a little off the time -- down to an average of 124 from 127 seconds. I'll have a go at multi-threading. Should be interesting; haven't done threads in Java yet -- good chance to explore that area.
 
paul wheaton
Trailboss
Posts: 21736
Heavy I/O is the primary reason we have threads in Java. Check out the books listed at the top of the threads forum.
What O/S and VM are you using? I wonder if the VM you are using is clunky.

 
Bill Compton
Ranch Hand
Posts: 186
I'm using Java version "Classic VM (build JDK-1.2-V, native threads)" and Borland's freebie C++ compiler version 5.5 on Win 2000 with plenty of memory (224 MB).
The first (crude) whack at a multi-threaded loader has shaved some more off the time. It's down to ~93 seconds from ~124 using 10 threads. Now that I understand the basics, I'm going to refactor my initial implementation to clean up the architecture and hopefully further reduce the time. I'll report back on that in a day or so.
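
A rough sketch of what a ten-thread loader along these lines could look like. This is not the poster's code: the class and method names are hypothetical, and it relies on java.util.concurrent and java.nio.file, which arrived long after the JDK 1.2 discussed in this thread. Each file becomes one task, so while one thread waits on a file open, the others keep working.

```java
import java.io.File;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical loader; the names are illustrative and not from the thread.
class ParallelLoader {

    // Reads every regular file in dir using a fixed pool of worker threads.
    // Returns one list of lines per file, in the order listFiles() reported them.
    static List<List<String>> loadAll(File dir, int threadCount) {
        File[] files = dir.listFiles(File::isFile);
        if (files == null) {
            throw new IllegalArgumentException("Not a readable directory: " + dir);
        }
        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        try {
            // Submit one task per file; the pool runs at most threadCount at once.
            List<Future<List<String>>> pending = new ArrayList<>();
            for (File file : files) {
                Callable<List<String>> task = () -> Files.readAllLines(file.toPath());
                pending.add(pool.submit(task));
            }
            // Collect the results, blocking until each task has finished.
            List<List<String>> results = new ArrayList<>();
            for (Future<List<String>> future : pending) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

A call such as ParallelLoader.loadAll(new File("areas"), 10) would match the ten threads mentioned above; the directory name is made up, and the best thread count depends on the disk and the OS, so it is worth measuring.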
 
paul wheaton
Trailboss
Posts: 21736
So is your VM the Sun VM?
I suspect that the VM could do a bit more optimizing with I/O stuff.
I suppose that increasing the buffer size won't make much difference since the file sizes are already so small?
 
Bill Compton
Ranch Hand
Posts: 186
Uh, dunno. How can I tell? java -version just says:
java version "1.2"
Classic VM (build JDK-1.2-V, native threads)
Can you recommend where I can get a better VM?
Yeah, bigger buffer definitely seems unlikely to help. Most files are just 6 lines long.
 
paul wheaton
Trailboss
Posts: 21736
Mine says:
java version "1.2.2"
Classic VM (build JDK-1.2.2-W, native threads, symcjit)
And I know that this is the Sun VM. The "jit" part on mine should give a huge amount of optimization.
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
Also try JDK 1.3. It uses HotSpot by default, which usually speeds things up nicely.

 
Bill Compton
Ranch Hand
Posts: 186
I loaded jdk 1.3 and that made a big difference. Between this change and the others (multiple threads, simpler string stuff) the Java code is down to the same runtime as the (unoptimized) C version. This is probably good enough for our purposes. Thanks for all the helpful suggestions!
 
Jack Shirazi
Author
Ranch Hand
Posts: 96
I would guess that the C code doesn't use two-byte characters or String objects. For this kind of task, converting bytes into chars and creating multiple String objects both impose significant overheads. Once again (see the 'speed of Integer' thread), Java provides you with the ability to get maximum speed, but to do so your code ends up looking very similar to the C code. It depends on whether the speed is more important than using good object-oriented coding.
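
To illustrate the point about bytes and Strings, here is a small C-style sketch that parses one coordinate line straight from a byte array, without creating chars or String objects. The class and method names are hypothetical. It assumes plain ASCII digits with an optional decimal point, no sign or exponent, and no more than about 15 significant digits, which holds for the sample data shown earlier but is otherwise an assumption.

```java
// Hypothetical byte-level parser; the names are illustrative and not from the thread.
class ByteLineParser {

    // Parses an unsigned decimal such as "40.67281150817" from buf[start..end).
    static double parseDecimal(byte[] buf, int start, int end) {
        long digits = 0;    // every digit, with the decimal point dropped
        double scale = 1.0; // 10 raised to the number of fraction digits
        boolean seenPoint = false;
        for (int i = start; i < end; i++) {
            byte b = buf[i];
            if (b == '.') {
                seenPoint = true;
            } else {
                digits = digits * 10 + (b - '0');
                if (seenPoint) {
                    scale *= 10.0;
                }
            }
        }
        return digits / scale;
    }

    // Parses a "number,number" line held in buf[start..end) into two doubles.
    static double[] parsePair(byte[] buf, int start, int end) {
        int comma = start;
        while (comma < end && buf[comma] != ',') {
            comma++; // linear scan for the separator, like strchr in C
        }
        return new double[] {
            parseDecimal(buf, start, comma),
            parseDecimal(buf, comma + 1, end)
        };
    }
}
```

The trade-off is the one described above: this reads much like the C version and gives up the validation that Double.parseDouble performs, so malformed input yields a wrong number instead of an exception.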
 