
Have 5.0 and 6.0 gotten slower in this respect?

 
Darrin Smith
Ranch Hand
Posts: 276
I just ran a little test pitting Java 1.4.2 against 5.0 and 6.0. The test is fairly simple: create 100 classes, and in each class create 1000 Strings from a StringBuffer (I also tried StringBuilder in 5.0, with almost no change), then use a pattern matcher to check the Strings.

The results:
Java 1.4.2: 359 milliseconds (less than half a second)
Java 5.0: 10079 milliseconds (yes, over 10 seconds)
Java 6.0: 8859 milliseconds (just under 9 seconds)

I then profiled the Java 5 and Java 6 runs (my profiler requires Java 5 or above) to see what was taking all of the time, and over 90% of it is spent creating the Strings.

Any thoughts as to why 5 and 6 are so slow?

The code follows.




 
Pj Murray
Ranch Hand
Posts: 194
Can you provide a little more information? For example, which JVM are you using (Sun, BEA, IBM, etc.) and what hardware are you running on (processor, memory, I/O, etc.)?
 
Darrin Smith
Ranch Hand
Posts: 276
It's the Sun JVM.

Hardware is a DELL OptiPlex GX620 (Pentium 4 2.8GHz, 1GB RAM)

I ran the tests on two similar machines, and both showed the same thing.
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
Here's a shorter test that shows the same drastic performance difference from 1.4 to 5:

Yes, there was an implementation change for Strings and StringBuffers back when JDK 5 came out. If you look at the source code for StringBuffer in 1.4 and earlier, you will see that Strings and StringBuffers used to share the same char[] array. So toString() and substring() would create a new String that was actually a view into the same char[] that the StringBuffer was using. To make this work, the StringBuffer had to keep track of whether its char[] content was shared. And if the shared content ever changed, the StringBuffer needed to copy the content to a new char[] so that any immutable String using the old char[] would not be affected. It was fairly complex to manage, but it allowed very fast toString() and substring() operations.
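To make the sharing scheme described above a bit more concrete, here is a toy model of it. This is only a sketch, not the JDK 1.4 source: the names SharedBuffer, View, toView() and copiesOnWrite are all made up for illustration. The idea is that handing out a view is cheap because no characters are copied, appending only writes past the region any view can see, and the buffer copies its array only when something tries to change characters a view might be looking at.

```java
// Toy model of a buffer that lends its char[] to the views it hands out.
// NOT the JDK 1.4 source; SharedBuffer and View are hypothetical names.
class SharedBuffer {
    private char[] value = new char[16];
    private int count = 0;
    private boolean shared = false; // true once a View has borrowed 'value'
    int copiesOnWrite = 0;          // how often a mutation forced a copy

    /** Immutable view over the first 'length' chars of a borrowed array. */
    static final class View {
        private final char[] chars;
        private final int length;

        View(char[] chars, int length) {
            this.chars = chars;
            this.length = length;
        }

        @Override
        public String toString() {
            return new String(chars, 0, length);
        }
    }

    // Appending only writes beyond 'count', so existing views stay valid.
    SharedBuffer append(String s) {
        int newCount = count + s.length();
        if (newCount > value.length) {
            char[] bigger = new char[Math.max(value.length * 2, newCount)];
            System.arraycopy(value, 0, bigger, 0, count);
            value = bigger;
            shared = false; // the new array is private to this buffer
        }
        s.getChars(0, s.length(), value, count);
        count = newCount;
        return this;
    }

    // Changing an existing char must not disturb views: copy first if shared.
    SharedBuffer setCharAt(int index, char c) {
        if (index < 0 || index >= count) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        if (shared) {
            value = value.clone();
            shared = false;
            copiesOnWrite++;
        }
        value[index] = c;
        return this;
    }

    // Cheap: no characters are copied, the view just borrows the array.
    View toView() {
        shared = true;
        return new View(value, count);
    }
}
```

With this model, calling append() and toView() in a loop never copies any characters for the views, while a single setCharAt() after a toView() costs one full copy of the array.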

Apparently, however, there were some subtle bugs in this approach, and in JDK 5 they were fixed with a much simpler scheme - now a new String always gets its own new char[] array - which means that the content must be copied from the old char[]. That copying can be time-consuming when the content is very long and you repeat the operation many times.
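As a rough illustration of why the copying adds up, here is a sketch of the pattern being discussed. It is not the test code from this thread; the class name GrowingToStringDemo and the method names are invented, and it assumes the test kept appending to one StringBuffer and called toString() after each append. Because every toString() copies everything appended so far, the total number of characters copied grows roughly with the square of the number of iterations.

```java
// Hypothetical sketch, not the original test: append to one buffer and
// snapshot it with toString() on every pass.
class GrowingToStringDemo {
    static String[] createStrings(int n) {
        StringBuffer sb = new StringBuffer();
        String[] strings = new String[n];
        for (int x = 0; x < n; x++) {
            sb.append(x).append(" ABCDEFGHIJKLMNOPQRSTUVWXYZ");
            // On JDK 5 and later this copies the whole buffer content.
            strings[x] = sb.toString();
        }
        return strings;
    }

    // Sum of all string lengths, i.e. how many chars toString() had to copy.
    static long totalChars(String[] strings) {
        long total = 0;
        for (int i = 0; i < strings.length; i++) {
            total += strings[i].length();
        }
        return total;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        String[] strings = createStrings(1000);
        long elapsedMicros = (System.nanoTime() - start) / 1000;
        System.out.println("last length  = " + strings[strings.length - 1].length());
        System.out.println("chars copied = " + totalChars(strings));
        System.out.println("elapsed (us) = " + elapsedMicros);
    }
}
```

The timing printed by main() will vary a lot between machines and JVMs, so the character count is the more useful number to look at.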

Note that by appending onto a StringBuffer, there's no chance that you will change any previously-shared char[] data. So your code above is very fast in 1.4. But try changing data within the shared region - e.g. insert
into the createStrings() loop. Suddenly the 1.4 performance becomes about the same as the JDK 5 performance.

I'm surprised that the difference was as great as it is in this case. You really created a situation that took advantage of the particular way toString() was optimized in the past. Most of the time I just call toString() once for a particular StringBuffer, and that makes the performance difference much less notable.
 
Darrin Smith
Ranch Hand
Posts: 276
Thanks for the feedback!

I thought that the problem was with the append method more than toString, but now that you say that, it makes sense.

To get around it, I replaced the test code's createString method with the following, and 5.0 and 6.0 suddenly became faster than 1.4.2:

for (int x = 0; x < strings.length; x++)
{
    strings[x] = x + " ABCDEFGHIJKLMNOPQRSTUVWXYZ";
}

From what I have always been told, this should be SLOWER than using the StringBuffer.append method (as it was before), but since the toString call is no longer needed, that may have made the difference.
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
Well, you're now making much, much shorter strings than you were previously. You're no longer appending onto an ever-lengthening base. A closer equivalent to your old code would be:

which is slower on both 1.4 and 5.
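To illustrate the difference between the two shapes of loop, here is a small sketch. It is not the snippet from the post above; the class ConcatComparison and the method growingStrings() are assumptions about what a loop that appends "onto an ever-lengthening base" might look like, while shortStrings() mirrors the replacement loop posted earlier.

```java
// Hypothetical comparison; class and method names are made up for illustration.
class ConcatComparison {
    static final String SUFFIX = " ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Mirrors the replacement loop: every string is short and independent.
    static String[] shortStrings(int n) {
        String[] strings = new String[n];
        for (int x = 0; x < n; x++) {
            strings[x] = x + SUFFIX;
        }
        return strings;
    }

    // Assumed closer equivalent: each string extends the previous one,
    // so the amount of text copied grows on every pass.
    static String[] growingStrings(int n) {
        String[] strings = new String[n];
        String base = "";
        for (int x = 0; x < n; x++) {
            base = base + x + SUFFIX;
            strings[x] = base;
        }
        return strings;
    }
}
```

In shortStrings() every element stays around 30 characters long, whereas in growingStrings() element x holds all of the text from the x + 1 passes so far, which is where the extra work comes from.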
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13078
toString() and substring() would create a new String that was actually a view into the same char[]


That approach bit me really hard one time. Since the new String is just a view into a char[], that whole char[] has to stay in memory. I was reading a huge text into a single String and then grabbing bits of it for new Strings. I could not figure out WHY memory use was so high until some kind soul pointed this out.
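For anyone hitting the same problem on one of those older JDKs, a commonly cited workaround was to wrap the substring in new String(...), which gave the small String its own right-sized array so the huge one could be garbage collected. The sketch below shows the idiom; the class and method names are made up, and on recent JDKs, where substring() already copies, the extra wrapper is redundant.

```java
// Hypothetical helper showing the "detach the substring" idiom.
class SubstringCopy {
    // Returns a String that does not depend on the source's backing array.
    static String detachedSubstring(String source, int begin, int end) {
        // On JDKs where substring() shared the parent's char[],
        // new String(...) forced a copy of just the requested range.
        return new String(source.substring(begin, end));
    }

    public static void main(String[] args) {
        String huge = "imagine several megabytes of text here: KEY=value";
        String key = detachedSubstring(huge, 40, 43);
        System.out.println(key);
    }
}
```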

Bill
 