• Post Reply Bookmark Topic Watch Topic
  • New Topic

Nested Array Initialization

 
Keith Cummings
Greenhorn
Posts: 2
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hello.
I'm trying to optimize some code I inherited. In this code an array was initialized like this:

I thought, "what a waste since arrays are always initialized to default values (false for boolean) so there is no need to loop through these nested arrays." So I changed the code to this:

The original code is 5 to 10 times faster! This is completely opposite of what I expected. What am I missing?
 
Ernest Friedman-Hill
author and iconoclast
Sheriff
Posts: 24213
35
Chrome Eclipse IDE Mac OS X
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
How are you measuring the difference? Either way, this should be pretty quick. Does this code get executed many times?
 
Keith Cummings
Greenhorn
Posts: 2
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Ernest Friedman-Hill wrote:How are you measuring the difference? Either way, this should be pretty quick. Does this code get executed many times?


I'm testing my code for a high-performance system. I'm running both blocks 1000 times and capturing the runtime using System.currentTimeMillis(); before and after each block. I'm guessing the difference may be because of memory allocation. The looping initialization is allocating in many small blocks where the single-line initialization might be asking for one contagious block of memory. That's just a guess at this point though.
 
R van Vliet
Ranch Hand
Posts: 144
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Keith Cummings wrote:
Ernest Friedman-Hill wrote:How are you measuring the difference? Either way, this should be pretty quick. Does this code get executed many times?


I'm testing my code for a high-performance system. I'm running both blocks 1000 times and capturing the runtime using System.currentTimeMillis(); before and after each block. I'm guessing the difference may be because of memory allocation. The looping initialization is allocating in many small blocks where the single-line initialization might be asking for one contagious block of memory. That's just a guess at this point though.


Interesting, this is the case on both -client and -server settings (the former is twice, the latter 5 times as fast using the old version). As you point out it might be related to memory allocation or it being JITted early but this seems strange at best. With 56k entries it's not exactly an amount of data that should kick the memory manager into a fit. I'd be curious to see who knows where this strange behaviour is coming from,.
 
Ernest Friedman-Hill
author and iconoclast
Sheriff
Posts: 24213
35
Chrome Eclipse IDE Mac OS X
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
OK, I just gave it a try, and I was surprised to see that I can reproduce this observation too. On Mac OS X, using the 64-bit JDK 1.6.0_13 , I get a factor of 5 difference. For the 32-bit JDK 1.5 kit, they're both a lot slower, but the difference is only 50% or so (still in favor of the hand-written loop.)

The single line version is compiled to a single bytecode (multianewarray), while the hand-written loop is compiled to lots and lots of code. It's odd to think that the single bytecode isn't backed by highly optimized machine code, but the conclusion seems to be that it is not, that multianewarray's implementation needs improvement.
 
Tim Holloway
Bartender
Posts: 18408
58
Android Eclipse IDE Linux
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Ernest Friedman-Hill wrote:OK, I just gave it a try, and I was surprised to see that I can reproduce this observation too. On Mac OS X, using the 64-bit JDK 1.6.0_13 , I get a factor of 5 difference. For the 32-bit JDK 1.5 kit, they're both a lot slower, but the difference is only 50% or so (still in favor of the hand-written loop.)

The single line version is compiled to a single bytecode (multianewarray), while the hand-written loop is compiled to lots and lots of code. It's odd to think that the single bytecode isn't backed by highly optimized machine code, but the conclusion seems to be that it is not, that multianewarray's implementation needs improvement.


Optimizers are funny things. Since so much of optimization is a trade-off, either in generated code or in the amount of code (and code development!) it takes to generate the code, every optimizer is going to have its little quirks.

Multi-dimensional arrays aren't as common in Java as they are in FORTRAN. Maybe not even as common as in COBOL. So I wouldn't expect a uniform high degree of polish in most JVMs.

Truthfully, I probably would have implemented lazy construction myself, so the various nooks and crannies would be built up on demand. Which would make initial allocation quite fast, but at the expense of some slowdown in runtime, especially for non-sparse arrays. For sparse arrays, this can also result in significant runtime storage savings.

Evil practices like that are why real-life benchmarks are better than coding what you "know" is efficient.
 
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
Boost this thread!