First, it's important to note: do not use .par on a List, since it requires copying all the data (a List can only be read sequentially). Instead, use something like Vector, for which the .par conversion can happen without copying.
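A minimal sketch of that advice, assuming Scala 2.12 or earlier, where .par is part of the standard library (on 2.13+ it lives in the separate scala-parallel-collections module and needs the import shown in the comment). The names `lines` and `parsed` are made up for the illustration:

```scala
// On Scala 2.13+ (assumption: scala-parallel-collections is on the classpath):
// import scala.collection.parallel.CollectionConverters._

// Hypothetical input held in an indexed structure rather than a List.
val lines: Vector[String] = Vector("0", "1", "2", "3")

// .par on a Vector wraps the existing data; the elements are not copied.
val parsed = lines.par.map(_.toInt)

// .seq converts back to a sequential collection; element order follows the input.
println(parsed.seq)
```

If the data arrives as an Iterator (for example from getLines), collecting it straight into a Vector or IndexedSeq avoids building an intermediate List.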
You seem to think of parallelism the wrong way. Here is what will happen:
If you have a file like this:
0
1
2
3
4
5
6
7
8
9
And the functions f and g :
def f(line: String) = {
  println("running f(%s)".format(line))
  line.toInt
}

def g(n: Int) = {
  println("running g(%d)".format(n))
  n + 1
}
Then you can do:
io.Source.fromFile("data.txt").getLines.toIndexedSeq[String].par.map(l => g(f(l)))
And we get the output:
running f(3)
running f(0)
running f(5)
running f(2)
running f(6)
running f(1)
running g(2)
running f(4)
running f(7)
running g(4)
running g(1)
running g(6)
running g(3)
running g(5)
running g(0)
running g(7)
running f(9)
running f(8)
running g(9)
running g(8)
So, although the whole operation g(f(l)) for a given line runs on a single thread, you can see that different lines are processed in parallel. Many f and g calls can run simultaneously on separate threads, but f and g for a particular line are always executed sequentially.
This, after all, is what you should expect, because there really is no way to read a line, run f, and run g in parallel. For example, how could g run on the output of f if the line has not yet been read?
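To see this for yourself, one option is to print the thread name inside each function. The following is only a sketch, assuming Scala 2.12 or earlier (see the commented import for 2.13+); the in-memory `lines` Vector stands in for the contents of data.txt, and the thread names you see depend on your JVM and pool:

```scala
// On Scala 2.13+ (assumption: scala-parallel-collections is on the classpath):
// import scala.collection.parallel.CollectionConverters._

def f(line: String): Int = {
  println("running f(%s) on %s".format(line, Thread.currentThread.getName))
  line.toInt
}

def g(n: Int): Int = {
  println("running g(%d) on %s".format(n, Thread.currentThread.getName))
  n + 1
}

// Stand-in for the lines of data.txt.
val lines = Vector("0", "1", "2", "3", "4", "5", "6", "7", "8", "9")

// For each line, f and then g run back to back on whichever worker
// thread picked up that line; different lines may use different threads.
val results = lines.par.map(l => g(f(l)))

// The println order above varies between runs, but the order of the
// results follows the order of the input.
println(results.seq)
```

For any one input, the f line and the matching g line should show the same thread name, while the interleaving across inputs changes from run to run.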