I don't think you'll get good results with this data type. Since it is a linked list, the data is necessarily consumed sequentially. Although you can evaluate the elements in parallel, you won't gain much in the reduction step. If you really need a List, I think the best function would be simply
parFold f = foldl1' f . withStrategy (parList rseq)
or maybe
parFold f = foldl1' f . withStrategy (parBuffer 5 rseq)
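For reference, both one-liners need a couple of imports to compile. A minimal, self-contained sketch (the type signatures are mine, and I've renamed the second variant parFoldBuffered so the two can coexist in one module) looks like this:

import Data.List (foldl1')
import Control.Parallel.Strategies (parBuffer, parList, rseq, withStrategy)

-- Spark every element of the list, then fold the results sequentially.
parFold :: (a -> a -> a) -> [a] -> a
parFold f = foldl1' f . withStrategy (parList rseq)

-- Same idea, but keeps only a rolling window of 5 sparks outstanding.
parFoldBuffered :: (a -> a -> a) -> [a] -> a
parFoldBuffered f = foldl1' f . withStrategy (parBuffer 5 rseq)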
If the reduction step is expensive, you may get a win by subdividing the list like this:
parReduce f = foldl' f mempty . reducedList . chunkList . withStrategy (parList rseq)
  where
    chunkList [] = []  -- base case: stop when the input is exhausted
    chunkList list = let (l, ls) = splitAt 1000 list in l : chunkList ls
    reducedList = parMap rseq (foldl' f mempty)  -- reduce each chunk in parallel
I've taken the liberty of assuming your data is a Monoid so that mempty is available; if that isn't possible, you can either replace mempty with your own identity element or, failing that, use foldl1'.
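A sketch of that non-Monoid fallback might look like the following; the name parReduce1 is mine, and note that, like foldl1', it is partial on empty input.

-- Non-Monoid variant: foldl1' replaces mempty both per chunk and in the
-- final combine. Errors on an empty input list, just like foldl1'.
parReduce1 :: (a -> a -> a) -> [a] -> a
parReduce1 f = foldl1' f . reducedList . chunkList . withStrategy (parList rseq)
  where
    chunkList [] = []
    chunkList list = let (l, ls) = splitAt 1000 list in l : chunkList ls
    reducedList = parMap rseq (foldl1' f)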
parReduce uses two combinators from Control.Parallel.Strategies. parList sparks the evaluation of all list elements in parallel. After that, chunkList divides the list into chunks of 1000 elements, and each of those chunks is then reduced in parallel by parMap.
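To make the chunking concrete, here is one hypothetical way to call the parReduce defined above; the Sum monoid, the sample input, and the RTS flags mentioned in the comment are my own additions.

import Data.Monoid (Sum (..))

-- Hypothetical usage of the parReduce above: sum a million numbers via
-- the Sum monoid. Build with -threaded and run with +RTS -N so the
-- sparks can actually run on multiple cores.
main :: IO ()
main = print (getSum (parReduce (<>) (map Sum [1 .. 1000000 :: Int])))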
You can also try
parReduce2 f = foldl' f mempty . reducedList . chunkList
  where
    chunkList [] = []
    chunkList list = let (l, ls) = splitAt 1000 list in l : chunkList ls
    reducedList = parMap rseq (foldl' f mempty)
Depending on exactly how the work is distributed, one of these may be more efficient than the others.
If you can use a data structure with good support for indexing (Array, Vector, Map, etc.), though, you can do binary subdivisions for the reduction step, which will probably be better overall.
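To illustrate, here is a rough sketch of that binary-subdivision idea over Data.Vector, using par and pseq from Control.Parallel; the function name parReduceVec, the sequential cutoff of 1000, and the Monoid constraint are my own choices, and it assumes the combining function is associative.

import qualified Data.Vector as V
import Control.Parallel (par, pseq)

-- Rough sketch: split the vector in half, reduce the halves in
-- parallel, and combine the results. Assumes f is associative.
parReduceVec :: Monoid a => (a -> a -> a) -> V.Vector a -> a
parReduceVec f v
  | V.null v           = mempty
  | V.length v <= 1000 = V.foldl' f mempty v   -- small enough: go sequential
  | otherwise          = left `par` (right `pseq` f left right)
  where
    (l, r) = V.splitAt (V.length v `div` 2) v
    left   = parReduceVec f l
    right  = parReduceVec f r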
John L