The problem is that parVector does not force evaluation of the elements of the vector. Each element remains a thunk, and nothing is sparked until the vector is consumed (by being printed), which is too late for the sparks to do any work. You can force evaluation of each element by composing the parVector strategy with rdeepseq.
    import qualified Data.Vector as V
    import qualified Data.Vector.Unboxed as U
    import Data.Vector.Strategies
    import Control.DeepSeq
    import Control.Parallel.Strategies

    main = do
      -- Spark chunks of 20 elements, then force the result to normal form.
      let res = genVVec 200 `using` (rdeepseq `dot` parVector 20)
      print res

    genUVec :: Int -> U.Vector Int
    genUVec n = U.map (ack 2) $ U.enumFromN n 75

    genVVec :: Int -> V.Vector (U.Vector Int)
    genVVec n = V.map genUVec $ V.enumFromN 0 n

    ack :: Int -> Int -> Int
    ack 0 n = n+1
    ack m 0 = ack (m-1) 1
    ack m n = ack (m-1) (ack m (n-1))

    -- Newer versions of vector ship their own NFData instances;
    -- if yours does, drop the two instances below.

    -- WHNF is enough for an unboxed vector.
    instance (NFData a, U.Unbox a) => NFData (U.Vector a) where
      rnf vec = seq vec ()

    -- A boxed vector needs every element forced.
    instance (NFData a) => NFData (V.Vector a) where
      rnf = rnf . V.toList
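To see the same effect without the vector packages, here is a minimal sketch that uses plain lists and only the parallel package. The names genRow, evalRowsWhnf and evalRows are made up for this illustration, and parListChunk stands in for parVector, so treat it as an analogy rather than a drop-in replacement. With rseq each spark only evaluates a row to its first cons cell, so almost no work runs in parallel; with rdeepseq each spark computes the whole row.

```haskell
import Control.Parallel.Strategies (parListChunk, rdeepseq, rseq, using)

-- Ackermann function, as in the code above.
ack :: Int -> Int -> Int
ack 0 n = n + 1
ack m 0 = ack (m - 1) 1
ack m n = ack (m - 1) (ack m (n - 1))

-- Hypothetical helper: one row of work, built lazily.
genRow :: Int -> [Int]
genRow n = map (ack 2) [n .. n + 74]

-- Sparks evaluate each row only to WHNF (its first cons cell),
-- so the ack calls are still thunks when the sparks finish.
evalRowsWhnf :: Int -> [[Int]] -> [[Int]]
evalRowsWhnf chunk rows = rows `using` parListChunk chunk rseq

-- Sparks evaluate each row to normal form, so the ack calls
-- actually run inside the sparks.
evalRows :: Int -> [[Int]] -> [[Int]]
evalRows chunk rows = rows `using` parListChunk chunk rdeepseq

main :: IO ()
main = do
  let rows = map genRow [0 .. 199]
  print (sum (map sum (evalRows 20 rows)))
```

Both variants return the same values; only where the work happens differs. Building with -threaded -rtsopts and running with +RTS -N -s prints the spark statistics, which lets you compare the two.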
I also changed your instance of NFData (U.Vector a). Since a U.Vector is unboxed, evaluating it to WHNF is sufficient, and forcing each element through a list conversion is wasteful. In fact, the default definition of rnf works fine if you prefer.
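A small sketch of why WHNF is enough for the unboxed case: building an unboxed vector has to evaluate every element in order to store it, while a boxed vector can hold unevaluated thunks. The helper whnfSucceeds and the values boxed and unboxed are invented for this demonstration; it only needs base and vector.

```haskell
import Control.Exception (SomeException, evaluate, try)
import qualified Data.Vector as V
import qualified Data.Vector.Unboxed as U

-- True if evaluating the argument to WHNF finishes without an exception.
whnfSucceeds :: a -> IO Bool
whnfSucceeds x = do
  r <- try (evaluate x)
  return (either failed (const True) r)
  where
    failed :: SomeException -> Bool
    failed _ = False

-- A boxed vector can store the failing element as a thunk.
boxed :: V.Vector Int
boxed = V.fromList [1, error "boom"]

-- An unboxed vector must evaluate each element to store it.
unboxed :: U.Vector Int
unboxed = U.fromList [1, error "boom"]

main :: IO ()
main = do
  whnfSucceeds boxed >>= print    -- prints True: the bad element is never touched
  whnfSucceeds unboxed >>= print  -- prints False: WHNF already forces every element
```

This is why seq is all the U.Vector instance needs, while the V.Vector instance has to walk the elements.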
With these two changes, I get the following:
SPARKS: 200 (200 converted, 0 pruned)
and the runtime is reduced by almost 50%. I also changed the chunk size to 20, but the result is very similar with a chunk size of 2.