Is there any advantage to calling map after mapToInt, wherever it is required?

I am trying to calculate the sum of the squared values in a list. Below are three options that all calculate the required value. I want to know which one is the most efficient. I expect the third to be more efficient, since auto-unboxing is performed only once.

    // sum of squares
    int sum = list.stream().map(x -> x * x).reduce((x, y) -> x + y).get();
    System.out.println("sum of squares: " + sum);

    sum = list.stream().mapToInt(x -> x * x).sum();
    System.out.println("sum of squares: " + sum);

    sum = list.stream().mapToInt(x -> x).map(x -> x * x).sum();
    System.out.println("sum of squares: " + sum);
+10
java performance java-8 java-stream
3 answers

If in doubt, try it out! Using jmh, I get the following results on a list of 100,000 items (in microseconds, lower is better):

 Benchmark                    Mode  Samples     Score   Error  Units
 capSO32462798.for_loop       avgt       10   119.110   0.921  us/op
 capSO32462798.mapToInt       avgt       10   129.702   1.040  us/op
 capSO32462798.mapToInt_map   avgt       10   129.753   1.516  us/op
 capSO32462798.map_reduce     avgt       10  1262.802  12.197  us/op
 capSO32462798.summingInt     avgt       10   134.821   1.203  us/op

So, from fastest to slowest:

  • for(int i : list) sum += i*i;
  • mapToInt(x -> x * x).sum() and mapToInt(x -> x).map(x -> x * x).sum()
  • collect(Collectors.summingInt(x -> x * x))
  • map(x -> x * x).reduce((x, y) -> x + y).get()

Note that the results depend heavily on JIT optimizations. If the logic being benchmarked is more complicated, some optimizations may be unavailable (longer code = less inlining), in which case the stream versions may take 4-5 times longer than the for loop; but if that logic is CPU-heavy, the difference shrinks again. Profiling your actual application will give you more information.


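Benchmark code:

    import static java.util.stream.Collectors.toList;

    import java.util.List;
    import java.util.Random;
    import java.util.stream.Collectors;
    import org.openjdk.jmh.annotations.Benchmark;
    import org.openjdk.jmh.annotations.BenchmarkMode;
    import org.openjdk.jmh.annotations.Mode;
    import org.openjdk.jmh.annotations.Scope;
    import org.openjdk.jmh.annotations.Setup;
    import org.openjdk.jmh.annotations.State;

    @State(Scope.Benchmark)
    @BenchmarkMode(Mode.AverageTime)
    public class SO32462798 {

        List<Integer> list;

        // Fill the list with 100,000 random boxed integers before measuring.
        @Setup
        public void setup() {
            list = new Random().ints(100_000).boxed().collect(toList());
        }

        @Benchmark
        public int for_loop() {
            int sum = 0;
            for (int i : list) sum += i * i;
            return sum;
        }

        @Benchmark
        public int summingInt() {
            return list.stream().collect(Collectors.summingInt(x -> x * x));
        }

        @Benchmark
        public int mapToInt() {
            return list.stream().mapToInt(x -> x * x).sum();
        }

        @Benchmark
        public int mapToInt_map() {
            return list.stream().mapToInt(x -> x).map(x -> x * x).sum();
        }

        @Benchmark
        public int map_reduce() {
            return list.stream().map(x -> x * x).reduce((x, y) -> x + y).get();
        }
    }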
+10

I expect the second to be the fastest.

There is no boxing in either the second or the third example (assuming the list already contains boxed elements). But there is unboxing.

In your second example, there may be two unboxings (one for each x in x * x), while the third unboxes only once. However, unboxing is fast, and I think it's not worth optimizing, since a longer pipeline with an additional function call will certainly slow it down.
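To make it easier to see where boxing and unboxing happen, here is a small sketch of the three variants from the question. The class and method names are made up for illustration, and the comments describe what the source code asks for conceptually; the JIT may well optimize some of it away.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical class and method names, used only for illustration.
class UnboxingSketch {

    // Variant 1: stays on Stream<Integer>. Each x * x unboxes its operands
    // and the result is boxed back into an Integer for the next stage.
    static int boxedReduce(List<Integer> list) {
        return list.stream().map(x -> x * x).reduce(0, Integer::sum);
    }

    // Variant 2: x is an Integer inside the lambda, so x * x unboxes
    // both operands; the result is a primitive int, so nothing is boxed.
    static int squareInMapToInt(List<Integer> list) {
        return list.stream().mapToInt(x -> x * x).sum();
    }

    // Variant 3: one unboxing per element in mapToInt, then plain int
    // arithmetic in map, at the cost of one more pipeline stage.
    static int unboxThenSquare(List<Integer> list) {
        return list.stream().mapToInt(Integer::intValue).map(x -> x * x).sum();
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(1, 2, 3, 4);
        System.out.println(boxedReduce(list));      // 30
        System.out.println(squareInMapToInt(list)); // 30
        System.out.println(unboxThenSquare(list));  // 30
    }
}
```

All three return the same value; they differ only in how many boxing or unboxing steps and pipeline stages the source code requests.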

Sidenote: in general, you should not expect streams to be faster than regular iteration over arrays or lists. When you are doing math and speed matters (as it does here), it's better to go the other way: just iterate over the elements. If your result is an aggregated value, then aggregate it; if it is a mapping, then allocate a new array or list of the same size and fill it with the calculated values.
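As a rough sketch of the two cases in the sidenote (aggregation and mapping), assuming the data is held in an int[]; the class and method names are hypothetical:

```java
import java.util.Arrays;

// Hypothetical names; a plain-loop sketch, not a benchmarked implementation.
class PlainLoopSketch {

    // Aggregation: accumulate the result directly in a local variable.
    static int sumOfSquares(int[] values) {
        int sum = 0;
        for (int v : values) {
            sum += v * v;
        }
        return sum;
    }

    // Mapping: allocate a result array of the same size and fill it.
    static int[] squares(int[] values) {
        int[] result = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            result[i] = values[i] * values[i];
        }
        return result;
    }

    public static void main(String[] args) {
        int[] values = {1, 2, 3};
        System.out.println(sumOfSquares(values));             // 14
        System.out.println(Arrays.toString(squares(values))); // [1, 4, 9]
    }
}
```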

+1

The primitive variants of the map operation, such as mapToInt(), mapToDouble(), etc., create streams specialized for primitive types, such as IntStream and DoubleStream. Whenever we need to use a method of the IntStream class after mapping the stream, we can use mapToInt().
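For example, sum(), max() and summaryStatistics() exist on IntStream but not on Stream<Integer>, so they only become available after mapToInt(). A minimal sketch, with a made-up class name and made-up method names:

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;

// Hypothetical names, for illustration only.
class IntStreamMethodsSketch {

    // sum() is defined on IntStream, not on Stream<Integer>.
    static int sumOfSquares(List<Integer> list) {
        return list.stream().mapToInt(x -> x * x).sum();
    }

    // max() on IntStream needs no Comparator and returns an OptionalInt;
    // 0 is an arbitrary fallback chosen here for the empty list.
    static int maxSquare(List<Integer> list) {
        return list.stream().mapToInt(x -> x * x).max().orElse(0);
    }

    // summaryStatistics() collects count, min, max, sum and average in one pass.
    static IntSummaryStatistics squareStats(List<Integer> list) {
        return list.stream().mapToInt(x -> x * x).summaryStatistics();
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(1, 2, 3);
        System.out.println(sumOfSquares(list)); // 14
        System.out.println(maxSquare(list));    // 9
        System.out.println(squareStats(list).getCount()); // 3
    }
}
```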

0