estimateSize() on a sequential Spliterator

I am implementing a Spliterator that explicitly prevents parallelization by having trySplit() return null. Would implementing estimateSize() offer any performance improvement for a stream created from this spliterator? Or is the estimated size useful only for parallelization?

EDIT: To clarify, I am asking specifically about the estimated size. In other words, my spliterator does not have the SIZED characteristic.
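For concreteness, here is a minimal sketch of the kind of spliterator I mean. The class name SequentialOnlySpliterator and the sizeHint parameter are made up for illustration; the only points that matter are that trySplit() returns null and that SIZED is not reported.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

// Hypothetical example: a spliterator that never splits and only offers a size estimate.
public class SequentialOnlySpliterator<T> implements Spliterator<T> {
    private final Iterator<T> source;
    private final long sizeHint; // a guess at the remaining element count, not a guarantee

    public SequentialOnlySpliterator(Iterator<T> source, long sizeHint) {
        this.source = source;
        this.sizeHint = sizeHint;
    }

    @Override
    public boolean tryAdvance(Consumer<? super T> action) {
        if (!source.hasNext()) {
            return false;
        }
        action.accept(source.next());
        return true;
    }

    @Override
    public Spliterator<T> trySplit() {
        return null; // refuse to split, so the source cannot be processed in parallel
    }

    @Override
    public long estimateSize() {
        return sizeHint; // only an estimate, because SIZED is not reported below
    }

    @Override
    public int characteristics() {
        return ORDERED; // deliberately no SIZED
    }

    public static void main(String[] args) {
        Iterator<String> it = Arrays.asList("a", "b", "c").iterator();
        Spliterator<String> sp = new SequentialOnlySpliterator<String>(it, 10);
        List<String> out = StreamSupport.stream(sp, false).collect(Collectors.toList());
        System.out.println(out); // prints [a, b, c]
    }
}
```

The hint of 10 is intentionally wrong here; since the spliterator is not SIZED, the value is treated as an estimate only.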

java java-8 java-stream spliterator
2 answers

A look at the call hierarchy for the relevant spliterator characteristic (SIZED) shows that it matters at least for the performance of stream.toArray().

In addition, the internal stream implementation has an equivalent flag, which appears to be used for sorting.

So, apart from parallel stream operations, the size information appears to be used for these two operations.

I do not claim that my search was exhaustive, so just take these as examples.


Without the SIZED characteristic, the only calls to estimateSize() I can find are ones related to parallel execution of the stream pipeline.

Of course, this could change in the future, and a Stream implementation other than the standard JDK one could behave differently.
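To see the difference between an exact size and a mere estimate from the outside, a small sketch using only public API may help. It does not show what the JDK does internally; it only shows what a consumer of the spliterator can observe. The class name SizedVersusEstimate and the describe helper are hypothetical.

```java
import java.util.Arrays;
import java.util.Spliterator;
import java.util.Spliterators;

public class SizedVersusEstimate {
    // Summarizes what a spliterator reports about its size.
    static String describe(Spliterator<?> sp) {
        return "SIZED=" + sp.hasCharacteristics(Spliterator.SIZED)
                + " exact=" + sp.getExactSizeIfKnown()
                + " estimateIsMax=" + (sp.estimateSize() == Long.MAX_VALUE);
    }

    public static void main(String[] args) {
        // A list-backed spliterator reports SIZED, so the exact size is available.
        Spliterator<Integer> sized = Arrays.asList(1, 2, 3).spliterator();

        // An iterator-backed spliterator of unknown size is not SIZED;
        // getExactSizeIfKnown() then returns -1.
        Spliterator<Integer> unsized = Spliterators.spliteratorUnknownSize(
                Arrays.asList(1, 2, 3).iterator(), Spliterator.ORDERED);

        System.out.println(describe(sized));   // SIZED=true exact=3 estimateIsMax=false
        System.out.println(describe(unsized)); // SIZED=false exact=-1 estimateIsMax=true
    }
}
```

Code that wants an exact count has to go through getExactSizeIfKnown() or check SIZED itself, which is consistent with the estimate alone not helping operations such as toArray().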


A spliterator can traverse elements:

1. Individually (tryAdvance())

2. Sequentially in bulk (forEachRemaining())
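The two traversal modes listed above can be sketched as follows. This is only an illustration; the class name TraversalModes and the traverse helper are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;

public class TraversalModes {
    // Consumes the first element individually and the rest in bulk,
    // recording which mode delivered each element.
    static List<String> traverse(List<String> input) {
        List<String> seen = new ArrayList<>();
        Spliterator<String> sp = input.spliterator();

        // Individual traversal: handles at most one element per call.
        sp.tryAdvance(e -> seen.add("first:" + e));

        // Bulk traversal: handles every remaining element in one call.
        sp.forEachRemaining(e -> seen.add("rest:" + e));
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(traverse(Arrays.asList("a", "b", "c")));
        // prints [first:a, rest:b, rest:c]
    }
}
```

Neither call consults estimateSize(); both simply walk the source until it is exhausted.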

According to the Java docs, estimateSize() comes in handy when splitting:

Spliterators can provide an estimate of the number of remaining elements via the estimateSize() method. Ideally, as reflected in characteristic SIZED, this value corresponds exactly to the number of elements that would be encountered in a successful traversal. However, even when not exactly known, an estimated value may still be useful to operations being performed on the source, such as helping to determine whether it is preferable to split further or traverse the remaining elements sequentially.

Since your spliterator does not have the SIZED characteristic and never splits, estimateSize() will not offer any performance benefit (there is no parallelism for it to guide). Keep in mind, however, that the estimateSize() Javadoc itself says nothing about parallelism. All it states is:

Returns: the estimated size, or Long.MAX_VALUE if infinite, unknown, or too expensive to compute.
