Win a copy of Functional Reactive Programming this week in the Other Languages forum!
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic

Performance comparison fork/join vs normal loop

 
Marco Farrugia
Ranch Hand
Posts: 34
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I wrote the small program below that sums the content of all the elements in an long array and puts the result in the last element of the array. The algorithms was implemented both using Fork/Join and without and the time taken was compared to confirm the performance advantage of fork/join. Surprisingly the fork/join took much longer.

Can someone understand and explain to me why did this happen?

In line 24, I am using an AtomicLongArray, however I also tried using normal Java addition in a synchronized block which took less time, but still more than the normal way. In the latter case I had to use a synchronized block, as without it the result was inconsistent and incorrect due to thread interference.

Below find the output:


And find the code hereunder:



Thanks
 
Edward Harned
Ranch Hand
Posts: 291
Java
  • Likes 1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
There are so many reasons that it is best to start here: The Calamity

The F/J framework needs to "warm up." There have been discussions about this on the concurrency-interest list. The framework was designed for compiled code. So it takes a while for the JIT compiler to perform.

Then there is the work-stealing problem. All forked tasks go into one deque and all the threads have to go looking for work. SLOW. This is one reason your simple code works faster.

The article lists others.
 
Vlado Zajac
Ranch Hand
Posts: 245
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Unnecessary synchronization slows that down.
In each task (which needs to be RecursiveTask not RecursiveAction) compute sum from start to end to local variable (using values from join()ed sub-tasks) and return that as result of the task. Set the last element only at the very end (using result of main task).

But since simple sum of an array is more limited by memory speed than computing power, parallel algorithm won't be faster than simple loop.

 
Edward Harned
Ranch Hand
Posts: 291
Java
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
The performance boost comes from the size of the array. When you get into millions of elements and many CPUs then F/J is worthwhile (once it warms up.)

You might want to try the TymeacDSE project mentioned in the article. It scatters the array subsets to all the threads using work-sharing. Much, much faster.
 
Jayesh A Lalwani
Rancher
Posts: 2756
32
Eclipse IDE Spring Tomcat Server
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Wow, Edward, that article of yours is a very good read. Thanks for posting that.
 
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic