Eran’s answer described the differences between the reduce versions of the two-arg and three-arg versions, in which the former reduces Stream<T> to T , while the latter decreases Stream<T> to U However, in reality this did not explain the need for an additional combiner function when reducing Stream<T> to U
One of the principles of designing Streams APIs is that the API should not differentiate between serial and parallel threads, or else, a specific API should not prevent the flow from working sequentially or in parallel. If your lambdas have the correct properties (associative, non-interfering, etc.), a thread executed sequentially or in parallel should give the same results.
Let's first consider a shorthand version with two arguments:
T reduce(I, (T, T) -> T)
Consistent implementation is simple. The identity value I “accumulates” with the zero-flow element to give a result. This result accumulates with the first element of the flow to give another result, which, in turn, accumulates with the second element of the flow, etc. After the last element is accumulated, the final result is returned.
Parallel implementation begins by dividing the stream into segments. Each segment is processed by its own stream in the sequential manner described above. Now, if we have N threads, we have N intermediate results. They need to be reduced to one result. Since each intermediate result is of type T, and we have several, we can use the same accumulator function to reduce the intermediate results of N to one result.
Now consider a hypothetical operation to reduce two arguments, which reduces Stream<T> to U In other languages, this is called the “fold” or “fold-left” operation, so I will name it here. Please note that this does not exist in Java.
U foldLeft(I, (U, T) -> U)
(Note that the identity value of I is of type U.)
The serial version of foldLeft similar to the serial version of reduce , except that the intermediate values are of type U instead of type T. But the rest is the same. (The hypothetical foldRight operation would be similar, except that the operations would be performed from right to left, and not from left to right.)
Now consider the parallel version of foldLeft . Let the separation of the flow into segments begin. Then we can have that each of the N threads decreases the values of T in its segment by N intermediate values of type U. Now what? How do we get from N values of type U to one result of type U?
There is no other function that combines several intermediate results of type U into a single result of type U. If we have a function that combines two values of U into one that are sufficient to reduce any number of values to one, just like the original reduction above . Thus, the reduction operation, which gives a different type of result, needs two functions:
U reduce(I, (U, T) -> U, (U, U) -> U)
Or using Java syntax:
<U> U reduce(U identity, BiFunction<U,? super T,U> accumulator, BinaryOperator<U> combiner)
So, for parallel reduction for another type of result, we need two functions: one that accumulates the elements of T to intermediate values of U, and the second that combines the intermediate values of U into a single result of U. If we do not switch types, it turns out that the battery function is same as combiner function. The fact that reduction to the same type has only the function of the battery, and reduction to another type requires separate functions of the battery and combiner.
Finally, Java does not provide the operations foldLeft and foldRight , because they imply a specific order of operations, which is essentially sequential. This contradicts the design principle described above for providing APIs that equally support serial and parallel operation.