Why doesn't Collection<T> implement Stream<T>?

Question

This is a question about API design. When extension methods were added to C#, IEnumerable gained all the methods that enable the use of lambda expressions directly on every collection.

With the advent of lambdas and default methods in Java, I would expect Collection to implement Stream and provide default implementations for all of its methods. That way, we would not need to call stream() in order to use the power it provides.

Why did the library architects choose the less convenient approach?

+7

java lambda java-8 api-design

Vitaliy Feb 11 '15 at 16:26

4 answers

A collection is an object model.
A stream is a subject model.

Definition of Collection in the documentation:

A collection represents a group of objects, known as its elements.

Definition of Stream in the documentation:

A sequence of elements supporting sequential and parallel aggregate operations.

Seen this way, a stream is a specific kind of collection, not the other way around. Therefore, Collection should not implement Stream, regardless of backward compatibility.

So why doesn't Stream<T> implement Collection<T>? Because it is another way of looking at a bunch of objects: not as a group of elements, but through the operations you can perform on it. That is why I say that Collection is an object model and Stream is a subject model.

+1

Umnyobe Feb 11 '15 at 16:42

First, from the documentation of Stream:

Collections and streams, while bearing some superficial similarities, have different goals. Collections are primarily concerned with the efficient management of, and access to, their elements. By contrast, streams do not provide a means to directly access or manipulate their elements, and are instead concerned with declaratively describing their source and the computational operations which will be performed in aggregate on that source.

So you want to keep the concepts of stream and collection apart. If Collection implemented Stream, every collection would be a stream, which conceptually it is not. The way it is done now, every collection can give you a stream that works on that collection, which is something different if you think about it.

Another factor that comes to mind is cohesion and coupling, as well as encapsulation. If every class that implements Collection also had to implement the operations of Stream, it would serve two (somewhat) different purposes and might become too long.
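The distinction between "being a stream" and "giving out a stream" can be made concrete with a minimal sketch. The class name StreamViewDemo and its helper methods are hypothetical and exist only for illustration; the library calls (stream(), map, collect, count) are standard java.util.stream API. The sketch shows that a pipeline leaves its source collection untouched, and that a Stream object is single-use while the collection can hand out a fresh one whenever asked.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class, not part of any library.
class StreamViewDemo {
    // Builds a new list of upper-cased names; the source list is not modified.
    static List<String> upperCased(List<String> names) {
        return names.stream()
                    .map(String::toUpperCase)
                    .collect(Collectors.toList());
    }

    // Returns true if a second terminal operation on the same Stream is rejected.
    static boolean secondUseFails(List<String> names) {
        Stream<String> s = names.stream();
        s.count();                  // first terminal operation consumes the stream
        try {
            s.count();              // reusing a consumed stream is not allowed
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>(Arrays.asList("ann", "bob"));
        System.out.println(upperCased(names));     // the transformed copy
        System.out.println(names);                 // the original, unchanged
        System.out.println(secondUseFails(names)); // whether reuse was rejected
    }
}
```

A collection can be traversed any number of times, whereas each stream is traversed once, which is one practical reason the two are hard to merge into a single type.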

0

André Stannek Feb 11 '15 at 16:40

I would guess it was done this way to avoid breaking existing code that implements Collection. It would be hard to provide a default implementation that works correctly with all existing implementations.
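For context, what Java 8 actually added to Collection is a small set of default methods, among them stream(), whose default implementation is derived from the collection's spliterator and, ultimately, its iterator. The sketch below uses a hypothetical class RangeBag, written the way a pre-Java-8 implementation would be (only iterator() and size() on top of AbstractCollection), to illustrate that such a class gets a working stream() without any changes. It is an illustration under those assumptions, not a statement about every third-party collection.

```java
import java.util.AbstractCollection;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical legacy-style collection holding the integers in [from, to).
class RangeBag extends AbstractCollection<Integer> {
    private final int from;
    private final int to; // exclusive upper bound

    RangeBag(int from, int to) {
        this.from = from;
        this.to = Math.max(from, to);
    }

    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int cursor = from;

            @Override
            public boolean hasNext() {
                return cursor < to;
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return cursor++;
            }
        };
    }

    @Override
    public int size() {
        return to - from;
    }

    // Relies only on the inherited default stream(); nothing stream-specific is written above.
    static int sumOfEvens(RangeBag bag) {
        return bag.stream()
                  .filter(n -> n % 2 == 0)
                  .mapToInt(Integer::intValue)
                  .sum();
    }

    public static void main(String[] args) {
        System.out.println(sumOfEvens(new RangeBag(1, 7))); // 2 + 4 + 6
    }
}
```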

-1

John r Feb 11 '15 at 16:30

John Kugelman · Accepted Answer · 2015-02-11T16:38:24+0000

From Maurice Naftalin's Lambda FAQ:

Why are Stream operations not defined directly on Collection?
Early drafts of the API exposed methods like filter, map, and reduce on Collection or Iterable. However, user experience with this design led to a more formal separation of the "stream" methods into their own abstraction. Reasons included:
Methods on Collection such as removeAll make in-place modifications, in contrast to the new methods, which are more functional in nature. Mixing two different kinds of methods on the same abstraction forces the user to keep track of which are which. For example, given the declaration
 Collection<String> strings;
the two very similar-looking method calls
 strings.removeAll(s -> s.length() == 0);
 strings.filter(s -> s.length() == 0); // not supported in the current API
would have surprisingly different results; the first would remove all empty String objects from the collection, whereas the second would return a stream containing all the non-empty Strings, while having no effect on the collection.
Instead, the current design ensures that only an explicitly obtained stream can be filtered:
 strings.stream().filter(s -> s.length() == 0)...;
where the ellipsis represents further stream operations, ending with a terminating operation. This gives the reader a much clearer intuition about the action of filter;
With lazy methods added to Collection, users were confused by a perceived, but erroneous, need to reason about whether the collection was in "lazy mode" or "eager mode". Rather than burdening Collection with new and different functionality, it is cleaner to provide a Stream view with the new functionality;
The more methods added to Collection, the greater the chance of name collisions with existing third-party implementations. By adding only a few methods (stream, parallel), the chance of conflict is greatly reduced;
A view transformation is still needed to access a parallel view; the asymmetry between the sequential and the parallel stream views was unnatural. Compare, for example,
 coll.filter(...).map(...).reduce(...); 
with
 coll.parallel().filter(...).map(...).reduce(...); 
This asymmetry would be particularly obvious in the API documentation, where Collection would have many new methods to produce sequential streams, but only one to produce parallel streams, which would then have all the same methods as Collection. Factoring these into a separate interface, StreamOps say, would not help; that would still, counterintuitively, need to be implemented by both Stream and Collection;
A uniform treatment of views also leaves room for other additional views in the future.
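The first and fourth reasons above can be illustrated with a short sketch against the released Java 8 API. Where the quoted text uses removeAll with a lambda and parallel, the closest released methods are Collection.removeIf and parallelStream(); the class name InPlaceVsStreamDemo and its helpers are hypothetical. Since filter keeps the elements that match its predicate, the predicate is negated here to obtain the non-empty strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class, not part of any library.
class InPlaceVsStreamDemo {
    // Stream view: returns a new list of non-empty strings; 'strings' is not modified.
    static List<String> nonEmptyCopy(List<String> strings) {
        return strings.stream()
                      .filter(s -> !s.isEmpty())
                      .collect(Collectors.toList());
    }

    // In-place: removes the empty strings from 'strings' itself.
    static void dropEmptyInPlace(List<String> strings) {
        strings.removeIf(String::isEmpty);
    }

    // The sequential and parallel pipelines differ only in how the stream is obtained.
    static int totalLengthSequential(List<String> strings) {
        return strings.stream().mapToInt(String::length).sum();
    }

    static int totalLengthParallel(List<String> strings) {
        return strings.parallelStream().mapToInt(String::length).sum();
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<>(Arrays.asList("a", "", "bc", ""));
        System.out.println(nonEmptyCopy(strings)); // new list, source untouched
        System.out.println(strings.size());        // still includes the empty strings
        dropEmptyInPlace(strings);
        System.out.println(strings);               // source itself has changed
        System.out.println(totalLengthSequential(strings) == totalLengthParallel(strings));
    }
}
```

Because the call to stream() or parallelStream() is explicit, a reader can tell at the call site whether an operation mutates the collection or builds a pipeline over it.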
