Modern Java Recipes: Performance in Stream

 
Greenhorn
Posts: 14
Hi Ken Kousen. First of all, congratulations on the book; it seems very interesting.

People are working with streams more and more often these days, but what about the performance implications? I have sometimes heard that a for-loop performs better (in both memory and speed) than a stream. Are there any useful guidelines you could share on when to use streams and when to avoid them?
 
gunslinger & author
Ranch Hand
Posts: 130
I have an example in the book that deals with that, taken from a similar example in "Java 8 in Action" (which is being revised to be Java 8 and 9 in Action). In my example, I use the JMH benchmarking tool (the Java Microbenchmark Harness) to evaluate the performance of methods that sum the first 10 million long values.

In one method, I simply add the longs in a loop. In another, I use a LongStream with a range and a sum method. Then I make that (sequential by default) stream parallel. Then I do the sum in the most inefficient way possible, by using a Stream<Long> with an iterate method and a reduce, and finally I do the same thing in parallel.
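For readers who want to picture the five variants described above, here is a minimal sketch. It is not the benchmark code from the book's repository: the class and method names are hypothetical, the JMH annotations are left out so the file runs on its own, and main uses a smaller n than the 10 million used in the book.

```java
import java.util.stream.LongStream;
import java.util.stream.Stream;

// Hypothetical names for illustration; not the code from the book's repository.
final class SumVariants {

    // 1. Plain loop over primitive longs.
    static long loopSum(long n) {
        long total = 0;
        for (long i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    // 2. Primitive LongStream over a range; sequential by default.
    static long rangeSum(long n) {
        return LongStream.rangeClosed(1, n).sum();
    }

    // 3. The same primitive stream, made parallel.
    static long parallelRangeSum(long n) {
        return LongStream.rangeClosed(1, n).parallel().sum();
    }

    // 4. Boxed Stream<Long> built with iterate and limit, summed with reduce.
    static long iterateSum(long n) {
        return Stream.iterate(1L, i -> i + 1).limit(n).reduce(0L, Long::sum);
    }

    // 5. The boxed iterate version, made parallel.
    static long parallelIterateSum(long n) {
        return Stream.iterate(1L, i -> i + 1).limit(n).parallel().reduce(0L, Long::sum);
    }

    public static void main(String[] args) {
        long n = 1_000_000L; // assumption: kept small so the boxed variants finish quickly
        System.out.println("loop:             " + loopSum(n));
        System.out.println("range:            " + rangeSum(n));
        System.out.println("parallel range:   " + parallelRangeSum(n));
        System.out.println("iterate:          " + iterateSum(n));
        System.out.println("parallel iterate: " + parallelIterateSum(n));
    }
}
```

All five methods return the same sum; they differ only in how the work is expressed. Timing them with System.nanoTime in main would be misleading because of JIT warm-up, which is why a harness such as JMH is the better way to compare them.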

The results are that the sequential LongStream sum is even faster than the simple loop, though the difference in speed is probably not significant. Making it parallel doesn't help, but mostly because summing primitives is, as the kids used to say, wicked fast already. By contrast, summing the Stream<Long> is much, much slower, on the order of 10 to 15 times slower. Making that one parallel actually makes the performance worse, mostly because using iterate with a limit is not an easy structure for the system to partition. The bottom line is that as long as you don't do something silly, like using a stream of wrapped values where a primitive stream is available, the performance is about the same as a regular loop. So go ahead and use streams to write your code, and then you can experiment with parallelization afterwards.

When is parallelization worth it? The general rules are: you need a stateless, associative operation (like addition), you need either a lot of data or a process that takes a lot of time on each element, and you need a source of data that is easy to partition. If those conditions apply, you're likely to see a benefit that exceeds the cost of splitting the work and joining all the individual results together again.
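As a rough illustration of those conditions, the sketch below uses an ArrayList as a source that splits cleanly and contrasts addition (associative) with subtraction (not associative). The class and method names are hypothetical and chosen only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.LongStream;

// Hypothetical names for illustration only.
final class ParallelConditions {

    // An ArrayList is array-backed, so the stream framework can split it evenly.
    static List<Long> firstN(long n) {
        List<Long> values = new ArrayList<>();
        for (long i = 1; i <= n; i++) {
            values.add(i);
        }
        return values;
    }

    // Addition is stateless and associative: partial sums can be combined
    // in any grouping, so the parallel result matches the sequential one.
    static long parallelSum(List<Long> values) {
        return values.parallelStream().mapToLong(Long::longValue).sum();
    }

    // Subtraction is not associative. Sequentially this computes
    // 0 - 1 - 2 - ... - n, which is well defined.
    static long sequentialDifference(long n) {
        return LongStream.rangeClosed(1, n).reduce(0L, (a, b) -> a - b);
    }

    // In parallel the same reduction combines partial results in a different
    // grouping, so the answer is not reliable and may change with the split.
    static long parallelDifference(long n) {
        return LongStream.rangeClosed(1, n).parallel().reduce(0L, (a, b) -> a - b);
    }

    public static void main(String[] args) {
        System.out.println("parallel sum:          " + parallelSum(firstN(100)));
        System.out.println("sequential difference: " + sequentialDifference(100));
        System.out.println("parallel difference:   " + parallelDifference(100));
    }
}
```

The parallel difference is deliberately left without an expected value: it depends on how the range happens to be split, which is exactly the problem with a non-associative operation.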

Trisha Gee (and if you don't know that name, look her up -- she's awesome) has published some studies that show similar results. As Brian Goetz likes to say, parallelization is an optimization. Get your code working sequentially first, and then see what you can do in parallel. The streams in Java 8 have been optimized enough to make it worthwhile to write your code that way and then try to optimize.
 
Kenneth A. Kousen
gunslinger & author
Ranch Hand
Posts: 130
In case you want to see the tests I'm referring to in the previous message, they're in the GitHub repo for the book: https://github.com/kousen/java_8_recipes . Specifically, I'm talking about https://github.com/kousen/java_8_recipes/blob/master/src/jmh/java/manning/ParallelStreamBenchmark.java .

I used three separate GitHub repos for the book: the "java_8_recipes" one just mentioned, a similar one called "java_9_recipes" for the Java 9 stuff, and one called "cfboxscores" for the larger CompletableFuture example I did. That last one involves downloading MLB boxscore data for all the games played on a range of dates concurrently and then post-processing them in various ways.

Anyone is of course welcome to anything in the repositories, whether you actually buy the book or not.
 
Saloon Keeper
Posts: 9145
Great answer, Kenneth. I'm really interested in getting your book.
 