Your IO actions do not actually evaluate the result,
    pickRandom :: [a] -> IO a
    pickRandom [] = error "List is empty"
    pickRandom (x:xs) = do
        stdgen <- newStdGen
        return (pickRandom' xs x 1 stdgen)
only obtains a new StdGen and returns a thunk. That is very fast.
    pickRandomWithLen :: [a] -> IO a
    pickRandomWithLen [] = error "List is empty"
    pickRandomWithLen xs = do
        gen <- newStdGen
        (e, _) <- return $ randomR (0, (length xs) - 1) gen
        return $ xs !! e
computes the length of the list and then returns a thunk, which of course is much slower.
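The difference between an action that returns a thunk and one that evaluates its result first can be seen in a minimal sketch that needs only `base`. The names `lazyAction`, `strictAction`, and `describe` are hypothetical and are not part of the code discussed here; the list with a broken spine is just a cheap way to observe whether `length` was actually run.

```haskell
import Control.Exception (SomeException, try)

-- Hypothetical helper: returns an unevaluated thunk for the length.
lazyAction :: [Int] -> IO Int
lazyAction xs = return (length xs)

-- Hypothetical helper: forces the length (to WHNF) before returning.
strictAction :: [Int] -> IO Int
strictAction xs = return $! length xs

-- Report whether running the action raised an exception.
-- The Right case deliberately does not inspect the result.
describe :: String -> Either SomeException Int -> String
describe name (Left _)  = name ++ ": exception while running the action"
describe name (Right _) = name ++ ": action finished, result not inspected"

main :: IO ()
main = do
    -- Walking the whole spine of this list hits the error call.
    let broken = 1 : 2 : error "spine forced" :: [Int]
    r1 <- try (lazyAction broken)
    putStrLn (describe "lazy" r1)
    r2 <- try (strictAction broken)
    putStrLn (describe "strict" r2)
```

Under these assumptions the lazy action finishes without ever computing the length, while the strict one computes it (and here fails) inside the action itself. A benchmark that only runs the lazy action therefore measures almost nothing.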
Forcing both to evaluate the result, with

    return $! ...

makes the version using length much faster
    benchmarking Using length
    mean: 14.65655 ms, lb 14.14580 ms, ub 15.16942 ms, ci 0.950
    std dev: 2.631668 ms, lb 2.378186 ms, ub 2.937339 ms, ci 0.950
    variance introduced by outliers: 92.581%
    variance is severely inflated by outliers

    benchmarking Using reservoir
    collecting 100 samples, 1 iterations each, in estimated 47.00930 s
    mean: 451.5571 ms, lb 448.4355 ms, ub 455.7812 ms, ci 0.950
    std dev: 18.50427 ms, lb 14.45557 ms, ub 24.74350 ms, ci 0.950
    found 4 outliers among 100 samples (4.0%)
      2 (2.0%) high mild
      2 (2.0%) high severe
    variance introduced by outliers: 38.511%
    variance is moderately inflated by outliers
(after forcing the input list to be fully evaluated before printing the sum), since it requires only one PRNG call, while reservoir sampling uses length list - 1 calls.
The difference is likely to be smaller if a faster PRNG than StdGen is used.
Indeed, using System.Random.Mersenne instead of StdGen (this requires pickRandom' to have an IO a result type, and since it offers no generation within a specific range but only over the default range, it slightly skews the distribution of the picked elements; but since we are only interested in the time needed to generate the pseudo-random numbers, that does not matter), the reservoir sampling time drops to
    mean: 51.83185 ms, lb 51.77620 ms, ub 51.91259 ms, ci 0.950
    std dev: 482.4712 us, lb 368.4433 us, ub 649.1758 us, ci 0.950
(the pickRandomWithLen time does not change noticeably, of course, since it uses only one generation). That is roughly nine times faster, which shows that pseudo-random number generation is the dominant factor.
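To make the PRNG call counts concrete, here is a rough sketch that counts the generator calls for both strategies. It is not the original `pickRandom'`, which is not shown in this answer; `reservoirPick` and `lengthPick` are hypothetical reconstructions that assume the `random` package and thread a pure `StdGen` through by hand.

```haskell
{-# LANGUAGE BangPatterns #-}
import System.Random (StdGen, mkStdGen, randomR)

-- Hypothetical reservoir sampling with a reservoir of size one.
-- Returns the picked element and the number of PRNG calls made.
reservoirPick :: [a] -> StdGen -> Maybe (a, Int)
reservoirPick []     _   = Nothing
reservoirPick (x:xs) gen = Just (go xs x 1 0 gen)
  where
    -- seen  = number of elements inspected so far
    -- calls = number of PRNG calls made so far
    go :: [b] -> b -> Int -> Int -> StdGen -> (b, Int)
    go []     cur _     !calls _ = (cur, calls)
    go (y:ys) cur !seen !calls g =
        let (r, g') = randomR (0, seen) g          -- one PRNG call per remaining element
            cur'    = if r == 0 then y else cur    -- replace with probability 1/(seen+1)
        in go ys cur' (seen + 1) (calls + 1) g'

-- Hypothetical length-based pick: a single PRNG call, then an index lookup.
lengthPick :: [a] -> StdGen -> Maybe (a, Int)
lengthPick [] _   = Nothing
lengthPick xs gen =
    let (i, _) = randomR (0, length xs - 1) gen
    in Just (xs !! i, 1)

main :: IO ()
main = do
    let xs = [1 .. 1000] :: [Int]
    print (fmap snd (reservoirPick xs (mkStdGen 42)))  -- Just 999
    print (fmap snd (lengthPick xs (mkStdGen 42)))     -- Just 1
```

For a 1000-element list the reservoir version calls the generator 999 times and the length version once, so the per-call cost of the generator dominates the reservoir timing.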