Java 8 Stream: Multiple Collector Grouping

I want to use a Java 8 Stream and group by a single classifier, but apply several Collector functions to each group. So when grouping, for example, both the average and the sum of one field (or possibly of another field) are calculated.

I will try to simplify this a bit with an example:

    public void test() {
        List<Person> persons = new ArrayList<>();
        persons.add(new Person("Person One", 1, 18));
        persons.add(new Person("Person Two", 1, 20));
        persons.add(new Person("Person Three", 1, 30));
        persons.add(new Person("Person Four", 2, 30));
        persons.add(new Person("Person Five", 2, 29));
        persons.add(new Person("Person Six", 3, 18));

        Map<Integer, Data> result = persons.stream().collect(
            groupingBy(person -> person.group, multiCollector)
        );
    }

    class Person {
        String name;
        int group;
        int age;

        // Constructor, getter and setter
    }

    class Data {
        long average;
        long sum;

        public Data(long average, long sum) {
            this.average = average;
            this.sum = sum;
        }

        // Getter and setter
    }

The result should be a map that associates each grouping key with its results, for example:

    1 => Data(average(18, 20, 30), sum(18, 20, 30))
    2 => Data(average(30, 29), sum(30, 29))
    3 => ....

This works perfectly fine with a single function such as Collectors.counting(), but I would like to chain more than one (ideally an arbitrary number, taken from a list).

 List<Collector<Person, ?, ?>> 

Is it possible to do something like this?

4 answers

For the specific task of summing and averaging, use collectingAndThen together with summarizingDouble:

    Map<Integer, Data> result = persons.stream().collect(
        groupingBy(Person::getGroup,
            collectingAndThen(summarizingDouble(Person::getAge),
                dss -> new Data((long) dss.getAverage(), (long) dss.getSum()))));

For the more general problem (collecting various things about your Persons) you can create a complex collector like this:

    // Individual collectors are defined here
    List<Collector<Person, ?, ?>> collectors = Arrays.asList(
        Collectors.averagingInt(Person::getAge),
        Collectors.summingInt(Person::getAge));

    @SuppressWarnings("unchecked")
    Collector<Person, List<Object>, List<Object>> complexCollector = Collector.of(
        () -> collectors.stream().map(Collector::supplier)
            .map(Supplier::get).collect(toList()),
        (list, e) -> IntStream.range(0, collectors.size()).forEach(
            i -> ((BiConsumer<Object, Person>) collectors.get(i).accumulator()).accept(list.get(i), e)),
        (l1, l2) -> {
            IntStream.range(0, collectors.size()).forEach(
                i -> l1.set(i, ((BinaryOperator<Object>) collectors.get(i).combiner()).apply(l1.get(i), l2.get(i))));
            return l1;
        },
        list -> {
            IntStream.range(0, collectors.size()).forEach(
                i -> list.set(i, ((Function<Object, Object>) collectors.get(i).finisher()).apply(list.get(i))));
            return list;
        });

    Map<Integer, List<Object>> result = persons.stream().collect(
        groupingBy(Person::getGroup, complexCollector));

The map values are lists in which the first element is the result of applying the first collector, and so on. You can add a custom finisher step using Collectors.collectingAndThen(complexCollector, list -> ...) to convert this list into something more suitable.
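
As a rough illustration of that finisher step, the sketch below wraps the list-based collector in a helper and converts each result list into a Data object. The names `ListToDataDemo`, `combine`, `summarize` and `samplePersons` are hypothetical and not part of the original answer, and the casts assume the first collector yields a Double and the second an Integer, as `averagingInt` and `summingInt` do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class ListToDataDemo {
    static class Person {
        final String name; final int group; final int age;
        Person(String name, int group, int age) { this.name = name; this.group = group; this.age = age; }
        int getGroup() { return group; }
        int getAge() { return age; }
    }

    static class Data {
        final long average; final long sum;
        Data(long average, long sum) { this.average = average; this.sum = sum; }
        @Override public String toString() { return "Data(average=" + average + ", sum=" + sum + ")"; }
    }

    // Hypothetical helper: the list-based complex collector, written with plain loops.
    @SuppressWarnings("unchecked")
    static <T> Collector<T, List<Object>, List<Object>> combine(List<Collector<T, ?, ?>> collectors) {
        List<Collector<T, Object, Object>> cs = new ArrayList<>();
        for (Collector<T, ?, ?> c : collectors) {
            cs.add((Collector<T, Object, Object>) c); // unchecked: container types are treated as Object
        }
        return Collector.of(
            () -> {
                List<Object> state = new ArrayList<>();
                for (Collector<T, Object, Object> c : cs) state.add(c.supplier().get());
                return state;
            },
            (state, e) -> {
                for (int i = 0; i < cs.size(); i++) cs.get(i).accumulator().accept(state.get(i), e);
            },
            (s1, s2) -> {
                for (int i = 0; i < cs.size(); i++) s1.set(i, cs.get(i).combiner().apply(s1.get(i), s2.get(i)));
                return s1;
            },
            state -> {
                for (int i = 0; i < cs.size(); i++) state.set(i, cs.get(i).finisher().apply(state.get(i)));
                return state;
            });
    }

    static Map<Integer, Data> summarize(List<Person> persons) {
        List<Collector<Person, ?, ?>> collectors = Arrays.asList(
            Collectors.averagingInt(Person::getAge),
            Collectors.summingInt(Person::getAge));
        Collector<Person, List<Object>, List<Object>> complexCollector = combine(collectors);
        // The extra finisher turns [average, sum] into a Data instance.
        Collector<Person, ?, Data> toData = Collectors.collectingAndThen(complexCollector,
            list -> new Data(((Double) list.get(0)).longValue(), ((Integer) list.get(1)).longValue()));
        return persons.stream().collect(Collectors.groupingBy(Person::getGroup, toData));
    }

    static List<Person> samplePersons() {
        return Arrays.asList(
            new Person("Person One", 1, 18), new Person("Person Two", 1, 20),
            new Person("Person Three", 1, 30), new Person("Person Four", 2, 30),
            new Person("Person Five", 2, 29), new Person("Person Six", 3, 18));
    }

    public static void main(String[] args) {
        System.out.println(summarize(samplePersons()));
    }
}
```

The casts in the finisher are the price of the untyped list: if the order of the collectors changes, the conversion lambda has to change with it.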


By using a map as the output type, you could have a potentially unbounded list of reducers, each producing its own statistic and adding it to the map.

    // Returns a copy of the map with one extra entry; the input map is left untouched
    public static <K, V> Map<K, V> addMap(Map<K, V> map, K k, V v) {
        Map<K, V> mapout = new HashMap<K, V>();
        mapout.putAll(map);
        mapout.put(k, v);
        return mapout;
    }

...

    List<Person> persons = new ArrayList<>();
    persons.add(new Person("Person One", 1, 18));
    persons.add(new Person("Person Two", 1, 20));
    persons.add(new Person("Person Three", 1, 30));
    persons.add(new Person("Person Four", 2, 30));
    persons.add(new Person("Person Five", 2, 29));
    persons.add(new Person("Person Six", 3, 18));

    // Each reducer takes the current map and a Person and returns an updated copy
    List<BiFunction<Map<String, Integer>, Person, Map<String, Integer>>> listOfReducers = new ArrayList<>();
    listOfReducers.add((m, p) -> addMap(m, "Count", Optional.ofNullable(m.get("Count")).orElse(0) + 1));
    // Sums the second constructor argument (group), which gives 10 for the data above
    listOfReducers.add((m, p) -> addMap(m, "Sum", Optional.ofNullable(m.get("Sum")).orElse(0) + p.group));

    // Applies every reducer in turn to one Person
    BiFunction<Map<String, Integer>, Person, Map<String, Integer>> applyList = (mapin, p) -> {
        Map<String, Integer> mapout = mapin;
        for (BiFunction<Map<String, Integer>, Person, Map<String, Integer>> f : listOfReducers) {
            mapout = f.apply(mapout, p);
        }
        return mapout;
    };

    // Merges two partial results by adding the values per key; a plain putAll
    // would silently drop one side's partial counts in a parallel stream
    BinaryOperator<Map<String, Integer>> combineMaps = (map1, map2) -> {
        Map<String, Integer> mapout = new HashMap<>(map1);
        map2.forEach((k, v) -> mapout.merge(k, v, Integer::sum));
        return mapout;
    };

    Map<String, Integer> map = persons
        .stream()
        .reduce(new HashMap<String, Integer>(), applyList, combineMaps);

    System.out.println("map = " + map);

Produces:

 map = {Sum=10, Count=6} 

You could chain them.

A collector can produce only one object, but that object can hold several values. You could return a Map, for example, with one entry for each collector whose result you are returning.

You can use Collector.of(HashMap::new, accumulator, combiner);

Your accumulator would hold a Map of Collectors, where the keys of the Map produced match the names of the Collectors. The combiner would need a way to combine multiple results, especially when the stream runs in parallel.
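
One possible reading of this map-based idea is sketched below. The helper `named`, the class `NamedCollectorsDemo` and the keys "average" and "sum" are assumptions made for illustration. The supplier here pre-populates one container per named collector instead of using a bare `HashMap::new`, which keeps the accumulator simple.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class NamedCollectorsDemo {
    static class Person {
        final int group; final int age;
        Person(int group, int age) { this.group = group; this.age = age; }
        int getGroup() { return group; }
        int getAge() { return age; }
    }

    // Hypothetical helper: one result entry per named collector.
    @SuppressWarnings("unchecked")
    static <T> Collector<T, Map<String, Object>, Map<String, Object>> named(
            Map<String, Collector<T, ?, ?>> collectors) {
        Map<String, Collector<T, Object, Object>> cs = new LinkedHashMap<>();
        collectors.forEach((name, c) -> cs.put(name, (Collector<T, Object, Object>) c));
        return Collector.of(
            // supplier: a fresh container for every named collector
            () -> {
                Map<String, Object> state = new HashMap<>();
                cs.forEach((name, c) -> state.put(name, c.supplier().get()));
                return state;
            },
            // accumulator: feed the element to every collector
            (state, e) -> cs.forEach((name, c) -> c.accumulator().accept(state.get(name), e)),
            // combiner: merge partial containers key by key (needed for parallel streams)
            (s1, s2) -> {
                cs.forEach((name, c) -> s1.put(name, c.combiner().apply(s1.get(name), s2.get(name))));
                return s1;
            },
            // finisher: replace each container with its final result
            state -> {
                cs.forEach((name, c) -> state.put(name, c.finisher().apply(state.get(name))));
                return state;
            });
    }

    static Map<Integer, Map<String, Object>> summarize(List<Person> persons) {
        Map<String, Collector<Person, ?, ?>> byName = new LinkedHashMap<>();
        byName.put("average", Collectors.averagingInt(Person::getAge));
        byName.put("sum", Collectors.summingInt(Person::getAge));
        Collector<Person, Map<String, Object>, Map<String, Object>> downstream = named(byName);
        return persons.stream().collect(Collectors.groupingBy(Person::getGroup, downstream));
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
            new Person(1, 18), new Person(1, 20), new Person(1, 30),
            new Person(2, 30), new Person(2, 29), new Person(3, 18));
        System.out.println(summarize(persons));
    }
}
```

Compared with a positional list, looking results up by name makes the calling code less sensitive to the order in which the collectors were registered, although the values are still typed as Object.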


Typically, the built-in collectors use a dedicated data type for complex results.

From Collectors:

    public static <T> Collector<T, ?, DoubleSummaryStatistics> summarizingDouble(ToDoubleFunction<? super T> mapper) {
        return new CollectorImpl<T, DoubleSummaryStatistics, DoubleSummaryStatistics>(
            DoubleSummaryStatistics::new,
            (r, t) -> r.accept(mapper.applyAsDouble(t)),
            (l, r) -> { l.combine(r); return l; },
            CH_ID);
    }

and in its own class:

    public class DoubleSummaryStatistics implements DoubleConsumer {
        private long count;
        private double sum;
        private double sumCompensation; // Low order bits of sum
        private double simpleSum; // Used to compute right sum for non-finite inputs
        private double min = Double.POSITIVE_INFINITY;
        private double max = Double.NEGATIVE_INFINITY;

Instead of chaining the collectors, you should build an abstraction that is an aggregator of collectors: implement the Collector interface with a class that accepts a list of collectors and delegates each method invocation to every one of them. Then, at the end, you return new Data() with all the results that the nested collectors produced.

You can avoid creating a custom class with all the method declarations by using Collector.of(supplier, accumulator, combiner, finisher, Collector.Characteristics... characteristics). The finisher lambda will call the finisher of every nested collector and then return the Data instance.
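
A minimal, type-safe sketch of this delegation for exactly two nested collectors might look like the following. The helper name `both`, the local holder class `Pair` and the class `PairCollectorDemo` are hypothetical. Only `Collector.of` and the standard `Collectors` factories are real API here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class PairCollectorDemo {
    static class Person {
        final int group; final int age;
        Person(int group, int age) { this.group = group; this.age = age; }
        int getGroup() { return group; }
        int getAge() { return age; }
    }

    static class Data {
        final long average; final long sum;
        Data(long average, long sum) { this.average = average; this.sum = sum; }
        @Override public String toString() { return "Data(average=" + average + ", sum=" + sum + ")"; }
    }

    // Hypothetical helper: delegates every step to two nested collectors.
    static <T, A1, A2, R1, R2, R> Collector<T, ?, R> both(
            Collector<T, A1, R1> c1, Collector<T, A2, R2> c2,
            BiFunction<? super R1, ? super R2, R> merger) {
        // Mutable holder for the two intermediate containers.
        class Pair {
            A1 a1 = c1.supplier().get();
            A2 a2 = c2.supplier().get();
        }
        return Collector.of(
            Pair::new,
            (pair, e) -> {
                c1.accumulator().accept(pair.a1, e);
                c2.accumulator().accept(pair.a2, e);
            },
            (p1, p2) -> {
                p1.a1 = c1.combiner().apply(p1.a1, p2.a1);
                p1.a2 = c2.combiner().apply(p1.a2, p2.a2);
                return p1;
            },
            // finisher: call each nested finisher, then build the final result
            pair -> merger.apply(c1.finisher().apply(pair.a1), c2.finisher().apply(pair.a2)));
    }

    static Map<Integer, Data> summarize(List<Person> persons) {
        Collector<Person, ?, Data> toData = both(
            Collectors.averagingInt(Person::getAge),
            Collectors.summingInt(Person::getAge),
            (avg, sum) -> new Data(avg.longValue(), sum.longValue()));
        return persons.stream().collect(Collectors.groupingBy(Person::getGroup, toData));
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
            new Person(1, 18), new Person(1, 20), new Person(1, 30),
            new Person(2, 30), new Person(2, 29), new Person(3, 18));
        System.out.println(summarize(persons));
    }
}
```

Fixing the number of nested collectors at two keeps the result types visible to the compiler, so no casts are needed. More collectors could be handled by nesting calls to `both`. Later Java versions (12 and up) ship `Collectors.teeing`, which covers the same two-collector case.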
