
Question for Cay: Java 8 streams performance

 
Thillai Sakthi
Ranch Hand
Posts: 109
Hello
There have been some conflicting opinions about the performance of streams versus good old for loops. Could you please tell us whether programmers need to be concerned about performance degradation when using streams and the other FP constructs that Java 8 offers?

Thanks for your advice.
 
Cay Horstmann
author
Ranch Hand
Posts: 197
22
Yes, you need to be concerned. As with every power tool, you can create real damage when you abuse it.

Let me talk about parallel streams first, because that's most interesting from a performance point of view.

Parallel streams are effective when you have a collection that is (a) large, (b) in memory, and (c) processed with non-blocking operations. In those situations, parallel streams are a big win because they are easy, and the alternative is so hard that few programmers would attempt it.

When your collection is small, you don't want to parallelize--the overhead won't pay off. And if you have blocking operations, you really don't want to parallelize the stream, or performance will be terrible.
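To make those three conditions concrete, here is a minimal sketch, not a benchmark. It assumes a large `int[]` already in memory and a cheap, non-blocking per-element computation; the class name, the sample data, and the array size are all made up for illustration. It only shows that switching between sequential and parallel is a one-call change that gives the same answer; whether the parallel version is actually faster depends on your data size and hardware, so measure before you rely on it.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

public class ParallelSumSketch {
    // Hypothetical sample data: a large array that already lives in memory.
    static int[] sampleData(int size) {
        int[] data = new int[size];
        for (int i = 0; i < size; i++) {
            data[i] = i % 100;
        }
        return data;
    }

    // Sums the squares of all elements; the per-element work is pure CPU, no blocking.
    static long sumOfSquares(int[] data, boolean parallel) {
        IntStream stream = Arrays.stream(data);
        if (parallel) {
            stream = stream.parallel(); // the only change needed to parallelize
        }
        return stream.mapToLong(x -> (long) x * x).sum();
    }

    public static void main(String[] args) {
        int[] data = sampleData(5_000_000);
        long sequential = sumOfSquares(data, false);
        long parallel = sumOfSquares(data, true);
        System.out.println(sequential == parallel); // prints true
    }
}
```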

Now what about plain old sequential streams when compared to plain old loops? Of course, there is an overhead for all the stream goodness. But it isn't huge, and in many situations you may not care. In particular, when you do exploratory programming, to slice and dice through a largish data set, it is so much better to use streams to try out various things. As an example, I used streams to find interesting facts in a large data set of movies, and it took me quite a few tries to find interesting nuggets. Without streams, it would have taken much longer. And the performance was just fine. The figure one usually hears quoted is an overhead of maybe 10%, but I haven't benchmarked that myself.
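For comparison, here is the same small query written both ways. This is only an illustrative sketch: the list of titles and the method names are hypothetical, not taken from the movie data set mentioned above. Both methods compute the same count; the stream version is the one that is quicker to reshape when you are exploring.

```java
import java.util.Arrays;
import java.util.List;

public class LoopVsStreamSketch {
    // Plain old loop: count the titles with at least minLength characters.
    static long countLongTitlesLoop(List<String> titles, int minLength) {
        long count = 0;
        for (String title : titles) {
            if (title.length() >= minLength) {
                count++;
            }
        }
        return count;
    }

    // Sequential stream: same result, expressed as a filter-then-count pipeline.
    static long countLongTitlesStream(List<String> titles, int minLength) {
        return titles.stream()
                .filter(title -> title.length() >= minLength)
                .count();
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList("Up", "Alien", "Casablanca");
        System.out.println(countLongTitlesLoop(titles, 5));   // prints 2
        System.out.println(countLongTitlesStream(titles, 5)); // prints 2
    }
}
```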

Where were you thinking of putting streams to use?

Cheers,

Cay
 
Campbell Ritchie
Marshal
Posts: 56584
172
Thank you for the answer. Does a 10% overhead mean anything? If you can search a million‑element collection in 1 second, who will care if it increases to 1.1 seconds?
I have a copy of Maurice Naftalin's Mastering Lambdas (Manning 2015), which has a chapter (pages 138‑158) about performance. There are diagrams about the relative speeds of sequential and parallel streams, and he gives links to the Java code: http://git.io/r6BtKQ 4Nluqw YB9V6g aMoy6w and others. As a general rule, the more often you do something and the longer that something takes, the better the chance that parallel streams will provide faster performance.

Well done asking such an interesting question.
 
Cay Horstmann
author
Ranch Hand
Posts: 197
22
I had a long discussion with Maurice about that benchmarking work at last year's JCrete conference. And the consensus was that it is really difficult to come up with realistic situations where parallel streams give you amazing performance in practice. If you read the data from disk, say one line at a time, you are already hosed--that's inherently sequential. There is a way of getting around that with memory-mapped files, and he also has a proof of concept in his book. A more robust implementation will come to Java 9.

And Heinz Kabutz has a scary example where real programmers started using parallel streams for everything, even when calling blocking operations. That interferes badly with the fork-join pool that is used for parallelizing the operations.
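This is not Heinz's example, just a hedged sketch of one common alternative. Parallel streams run their work on the shared fork-join common pool by default, so a blocking call inside a parallel stream ties up threads that every other parallel stream in the JVM is also relying on. The sketch below instead hands the blocking calls to a dedicated `ExecutorService`. The `slowLookup` method is hypothetical and simulates a blocking call with a short sleep; the pool size is an arbitrary assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BlockingCallsSketch {
    // Hypothetical blocking operation (think: network or database call), simulated with sleep.
    static String slowLookup(String key) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return key.toUpperCase();
    }

    // Runs the blocking lookups on a dedicated pool, leaving the common fork-join pool alone.
    static List<String> lookupAll(List<String> keys, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String key : keys) {
                futures.add(pool.submit(() -> slowLookup(key)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // waits for each result, in submission order
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(lookupAll(Arrays.asList("a", "b", "c"), 3)); // prints [A, B, C]
    }
}
```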

It is a good idea to have a mental image of what the stream pipeline does. Understand how laziness works, and how parallelization works. Then you will intuitively do the right thing (I hope). But if you consider .stream and .parallel as black boxes, it is easy to go astray.
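One way to build that mental image is to watch which elements a pipeline actually touches. The sketch below (class and method names are made up for illustration) records every element that flows past `peek`. With a sequential stream, `findFirst` stops the pipeline as soon as it has a match, and without a terminal operation nothing runs at all.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LazinessSketch {
    // Returns the elements the pipeline visited before findFirst found an even number.
    static List<Integer> visitedBeforeFirstEven(List<Integer> numbers) {
        List<Integer> visited = new ArrayList<>();
        numbers.stream()
                .peek(visited::add)          // records each element as it is pulled through
                .filter(n -> n % 2 == 0)
                .findFirst();                // short-circuits on the first match
        return visited;
    }

    // Builds a pipeline but never calls a terminal operation, so no element is processed.
    static int visitedWithoutTerminalOperation(List<Integer> numbers) {
        List<Integer> visited = new ArrayList<>();
        numbers.stream()
                .peek(visited::add)
                .filter(n -> n % 2 == 0);    // intermediate operations only: nothing executes
        return visited.size();
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 3, 4, 5, 6);
        System.out.println(visitedBeforeFirstEven(numbers));          // prints [1, 3, 4]
        System.out.println(visitedWithoutTerminalOperation(numbers)); // prints 0
    }
}
```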

Cheers,

Cay
 