I know where you are hinting at your question, and I will do my best to explain.
Consider a list of input data, consisting of 8 elements:
[1, 2, 3, 4, 5, 6, 7, 8]
And suppose that the threads will parallellize it as follows, in reality they do not, the exact process of parallelization is quite difficult to understand. But for now, suppose they would divide the size into two until there are two elements left.
The branching will look like this:
First Division:
[1, 2, 3, 4]
[5, 6, 7, 8]
The second division:
[1, 2]
[3, 4]
[5, 6]
[7, 8]
Now we have four pieces that (in our theory) will be processed by four different threads that do not know each other.
This can be fixed by synchronizing on some external resource, but then you lose the benefits of parallelization, so we need to assume that we are not synchronizing, and when we are not synchronizing, other threads will not see what other threads are done, so our result it will be rubbish.
Now to the question of where you are asking about statelessness, how can it be handled correctly in parallel? How to add items that are processed in parallel in the correct order to the list?
First, suppose a simple display function in which you map lambda i -> i + 10 , and then print it using System.out::println in foreach.
Now after the second division, the following will happen:
[1, 2] -> [11, 12] -> { System.out.println(11); System.println(12); }
[3, 4] -> [13, 14] -> { System.out.println(13); System.println(14); }
[5, 6] -> [15, 16] -> { System.out.println(15); System.println(16); }
[7, 8] -> [17, 18] -> { System.out.println(17); System.println(18); }
There is no guarantee on the order, except that all elements processed by the same thread (internal state, not rely) are processed in order.
If you want to process them in order, you need to use forEachOrdered , which ensures that all threads will work in the correct order and you will not lose too many parallelization advantages because of this, since it only applies to the final state.
To find out how you can add parelllized items to the list, look at this using Collectors.toList() , which provides methods for:
- Create a new list.
- Adding a value to the list.
- Combining two lists.
Now after the second division, the following will happen:
For every four threads, it will do the following (only one thread is displayed here):
- We had
[1, 2] . - We list it on
[11, 12] . - We create an empty
List<Integer> . - We will add
11 to the list. - We will add
12 to the list.
Now all the threads have done this, and we have four two-item lists.
Now the following merges occur in the order specified:
[11, 12] ++ [13, 14] = [11, 12, 13, 14][15, 16] ++ [17, 18] = [15, 16, 17, 18]- Finally
[11, 12, 13, 14] ++ [15, 16, 17, 18] = [11, 12, 13, 14, 15, 16, 17, 18]
And therefore, the resulting list is ordered and the mapping is done in parallel. Now you must also understand why parallelization requires a higher minimum, but only two elements, since creating new lists and merging become too expensive.
Hopefully now you will understand why stream operations must be idle in order to take full advantage of parallelization.