using reduce to compute an average

Jeanne Boyarsky
author & internet detective
Marshal
Posts: 35077
380
One of the exercises in "Java 8 for the Really Impatient" (page 44, question 10) says "write a call to reduce that can be used to compute the average of a Stream<Double>". I'm at a loss.

I do know how to compute an average of Stream<Double> if I actually needed to do so:
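
Something along these lines, as a minimal sketch that assumes the stream holds no nulls (the class and method names are just for illustration):

```java
import java.util.OptionalDouble;
import java.util.stream.Stream;

public class SimpleAverage {
    // Unboxes to a DoubleStream, whose average() returns an OptionalDouble
    // (empty when the stream has no elements).
    static OptionalDouble average(Stream<Double> values) {
        return values.mapToDouble(Double::doubleValue).average();
    }

    public static void main(String[] args) {
        System.out.println(average(Stream.of(1.0, 2.0, 3.0, 6.0)));
    }
}
```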

This doesn't use reduce though. Using reduce implies being able to specify the computation in terms of pairs of elements. I can't think of a way to determine the average that way. While using reduce() does feel like an academic exercise, I really want to know what Cay had in mind. Any ideas? The only thing I can think of is the following. It uses reduce, but doesn't feel like the spirit of the question because it cheats and uses a local variable for the count.

The second part of that exercise was much easier: why can't you compute the sum and divide by count()? (Because you can only go through the stream once.)
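
To illustrate the single-pass restriction, here is a minimal sketch (the class and method names are made up for the example). The second terminal operation on the same stream is rejected with an IllegalStateException:

```java
import java.util.stream.Stream;

public class SingleUse {
    // Returns true if a second terminal operation on the same stream is rejected.
    static boolean secondPassFails() {
        Stream<Double> values = Stream.of(1.0, 2.0, 3.0);
        double sum = values.reduce(0.0, Double::sum); // first terminal operation
        try {
            long count = values.count();              // second terminal operation
            System.out.println(sum / count);          // not reached
            return false;
        } catch (IllegalStateException e) {
            // The stream has already been operated upon.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(secondPassFails());
    }
}
```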

Piet Souris
Rancher
Posts: 1400
29
hi Jeanne,

I experimented a little, using local int variables to do the counting,
but I got the error that local variables used in lambdas must be final or effectively final.
So that didn't work, therefore I could not think of anything better than

You can make it easier when you use a simple ArrayList<Double>, of which you can
determine the length; you wouldn't need an AtomicInteger to do the counting.
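
A sketch of what counting with an AtomicInteger inside reduce could look like. This is an illustration with made-up names, and it assumes a sequential stream: the running-mean lambda depends on encounter order, and 0.0 is only a starting value here, not a true identity, so it is not safe for parallel streams.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

public class CountingReduce {
    static double average(Stream<Double> values) {
        // The reference is effectively final; only the counter's state changes.
        AtomicInteger count = new AtomicInteger();
        // Running mean: newMean = mean + (x - mean) / n
        return values.reduce(0.0, (mean, x) -> mean + (x - mean) / count.incrementAndGet());
    }

    public static void main(String[] args) {
        System.out.println(average(Stream.of(1.0, 2.0, 3.0, 6.0)));
    }
}
```

Note that this sketch returns 0.0 for an empty stream.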

Greetz,
Piet

Jeanne Boyarsky
author & internet detective
Marshal
Posts: 35077
380
I like that yours does all the calculation inside reduce. I feel like this wasn't the point of the exercise. The ones before/after it were much easier.

Matthew Brown
Bartender
Posts: 4568
9
Can you zip* the list with an ascending sequence? Then you could reduce the resulting list of pairs, and you'd have the current count available at each step of the calculation.

* I haven't looked at Java 8 yet, so I don't know if it would use this word. Most functional languages would have an operation to pair up values of two sequences. Some have a "zip with index" that does exactly what I'm talking about here.

Jeanne Boyarsky
author & internet detective
Marshal
Posts: 35077
380
Matthew Brown wrote:Can you zip* the list with an ascending sequence?

No. Java got rid of the zip method. Piet and Stephen showed how to implement it, but it makes my eyes bleed, and it is much, much, much harder to read than the hacks we have above.

Matthew Brown
Bartender
Posts: 4568
9
OK, then, here's another way that doesn't use mutable state. Instead of each step passing on the current mean alone, get it to pass on a pair of the mean and the current count. So you have to initialise the reduce with (0, 0). The reducing function needs to take one of these pairs and the next number, and return the next pair.

Still don't know if that's the solution they had in mind, though.
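
In Java that might look roughly like the sketch below. MeanCount is a made-up helper class, not something from the standard library, and the merge step is only there because the three-argument form of reduce asks for a way to combine partial results.

```java
import java.util.stream.Stream;

public class RunningMean {
    // Hypothetical immutable (mean, count) pair.
    static final class MeanCount {
        final double mean;
        final long count;

        MeanCount(double mean, long count) {
            this.mean = mean;
            this.count = count;
        }

        // Takes the current pair and the next number, returns the next pair.
        MeanCount add(double x) {
            long n = count + 1;
            return new MeanCount(mean + (x - mean) / n, n);
        }

        // Merges two partial results, weighting each mean by its count.
        MeanCount merge(MeanCount other) {
            long n = count + other.count;
            if (n == 0) {
                return this;
            }
            return new MeanCount((mean * count + other.mean * other.count) / n, n);
        }
    }

    static double average(Stream<Double> values) {
        // Initialised with (0, 0); no mutable state is shared between steps.
        return values.reduce(new MeanCount(0, 0), MeanCount::add, MeanCount::merge).mean;
    }

    public static void main(String[] args) {
        System.out.println(average(Stream.of(1.0, 2.0, 3.0, 6.0)));
    }
}
```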

Jeanne Boyarsky
author & internet detective
Marshal
Posts: 35077
380
I guess. Java doesn't have built-in pairs that I know of. I could create my own class, but this is back to feeling more complicated than it should. Plus I have to map to pairs to call reduce, right?

Mike Simmons
Ranch Hand
Posts: 3090
14
Yup. This is a good use case for built-in tuples, but Java still doesn't believe in them. Too bad - the equivalent Scala code is fairly slick.

Stephan van Hulst
Bartender
Posts: 6475
83
Yeah, maybe it's best to ask the author. Here's my solution:

Rob Spoor
Sheriff
Posts: 20707
68
Why not let get() simply return x / i? If i is 0, then get() will return Double.NaN which is what I would expect for the average of nothing - something that's not a number, not 0.

Never mind, x / 0 is + or - infinity, not NaN. Still, I'd return NaN instead of 0.
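
For reference, the result of dividing by zero depends on the numerator. A quick check, assuming x is a double and i is an int (divide is a throwaway helper for this sketch):

```java
public class DivideByZero {
    // The int divisor is promoted to double, so no ArithmeticException is thrown.
    static double divide(double x, int i) {
        return x / i;
    }

    public static void main(String[] args) {
        System.out.println(divide(1.0, 0));   // positive numerator
        System.out.println(divide(-1.0, 0));  // negative numerator
        System.out.println(divide(0.0, 0));   // zero numerator
    }
}
```

A nonzero numerator gives positive or negative infinity, while 0.0 / 0 gives NaN.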

Cay Horstmann
author
Ranch Hand
Posts: 172
15
The key idea is that, when you accumulate the results, you need to keep track of a pair (sum, count). Your reduction function is

((sum, count), value) -> (sum + value, count + 1)

There are a couple of technical issues.

1) Lamely, Java doesn't have pairs. You can use AbstractMap.SimpleEntry<Double, Integer> to stay within the standard library, but it would be reasonable to define a generic class Pair or Tuple2 and keep it in your toolbox.

2) You are forced into the third (most complex) variation of reduce since the parameter types vary, so you also need to supply a method for combining partial results:

((sum, count), (sum2, count2)) -> (sum + sum2, count + count2)
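
Putting the two functions together, a minimal sketch using AbstractMap.SimpleEntry<Double, Integer> as the pair might look like this. The class and method names are illustrative, and the final division is an extra step outside the reduce call itself:

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.stream.Stream;

public class ReduceAverage {
    static double average(Stream<Double> values) {
        // Identity pair: (sum = 0.0, count = 0)
        SimpleEntry<Double, Integer> zero = new SimpleEntry<>(0.0, 0);
        SimpleEntry<Double, Integer> total = values.reduce(
            zero,
            // ((sum, count), value) -> (sum + value, count + 1)
            (acc, v) -> new SimpleEntry<>(acc.getKey() + v, acc.getValue() + 1),
            // ((sum, count), (sum2, count2)) -> (sum + sum2, count + count2)
            (a, b) -> new SimpleEntry<>(a.getKey() + b.getKey(), a.getValue() + b.getValue()));
        // For an empty stream this is 0.0 / 0, which is NaN.
        return total.getKey() / total.getValue();
    }

    public static void main(String[] args) {
        System.out.println(average(Stream.of(1.0, 2.0, 3.0, 6.0)));
    }
}
```

Each step creates a new entry instead of mutating the accumulator, so the same call should also work on a parallel stream.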

Stephan van Hulst
Bartender
Posts: 6475
83
Rob Spoor wrote:Still, I'd return NaN instead of 0.

You are correct. I work with doubles so seldom that I hadn't even considered using NaN, or that /0 is a valid operation.