Haskell histogram calculation faster

I am new to Haskell and I want to create a histogram. I use Data.Vector.Unboxed for fusible data operations; which burns quickly (when compiled with -O -fllvm), and my bend application is the bottleneck; which integrates a bucket count.

How can I do it faster? I read about trying to reduce the number of tricks, strictly speaking, so I made everything rigorous using seq and foldr ', but without observing a significant increase in performance. Your ideas are highly recommended.

import qualified Data.Vector.Unboxed as V histogram :: [(Int,Int)] histogram = V.foldr' agg [] $ V.zip kv where n = 10000000 c = 1000000 k = V.generate n (\i -> i `div` c * c) v = V.generate n (\i -> 1) agg kv [] = [kv] agg kv@ (k,v) acc@ ((ck,cv):as) | k == ck = let a = (ck,cv+v):as in a `seq` a | otherwise = let a = kv:acc in a `seq` a main :: IO () main = print histogram 

Compiled with

 ghc --make -O -fllvm histogram.hs 
+7
haskell ghc
source share
1 answer

First compile the program with -O2 -rtsopts . Then, to get the first idea where you can optimize, run the program with the parameters +RTS -sstderr :

 $ ./question +RTS -sstderr [(0,1000000),(1000000,1000000),(2000000,1000000),(3000000,1000000),(4000000,1000000),(5000000,1000000),(6000000,1000000),(7000000,1000000),(8000000,1000000),(9000000,1000000)] 1,193,907,224 bytes allocated in the heap 1,078,027,784 bytes copied during GC 282,023,968 bytes maximum residency (7 sample(s)) 86,755,184 bytes maximum slop 763 MB total memory in use (0 MB lost due to fragmentation) Tot time (elapsed) Avg pause Max pause Gen 0 1964 colls, 0 par 3.99s 4.05s 0.0021s 0.0116s Gen 1 7 colls, 0 par 1.60s 1.68s 0.2399s 0.6665s INIT time 0.00s ( 0.00s elapsed) MUT time 2.67s ( 2.68s elapsed) GC time 5.59s ( 5.73s elapsed) EXIT time 0.02s ( 0.03s elapsed) Total time 8.29s ( 8.43s elapsed) %GC time 67.4% (67.9% elapsed) Alloc rate 446,869,876 bytes per MUT second Productivity 32.6% of total user, 32.0% of total elapsed 

Please note that 67% of your time is spent on GC! Obviously, something is wrong. To find out what is wrong, we can run the program with heap profiling enabled (using +RTS -h ), resulting in the following figure:

First heap profile

So you are seeping. How did this happen? Looking at the code, the only time a thunk is created (recursively) in agg is when you add. Creating cv strict by adding an interference pattern fixes the problem:

 {-# LANGUAGE BangPatterns #-} import qualified Data.Vector.Unboxed as V histogram :: [(Int,Int)] histogram = V.foldr' agg [] $ V.zip kv where n = 10000000 c = 1000000 k = V.generate n (\i -> i `div` c * c) v = V.generate n id agg kv [] = [kv] agg kv@ (k,v) acc@ ((ck,!cv):as) -- Note the ! | k == ck = (ck,cv+v):as | otherwise = kv:acc main :: IO () main = print histogram 

Output:

 $ time ./improved +RTS -sstderr [(0,499999500000),(1000000,1499999500000),(2000000,2499999500000),(3000000,3499999500000),(4000000,4499999500000),(5000000,5499999500000),(6000000,6499999500000),(7000000,7499999500000),(8000000,8499999500000),(9000000,9499999500000)] 672,063,056 bytes allocated in the heap 94,664 bytes copied during GC 160,028,816 bytes maximum residency (2 sample(s)) 1,464,176 bytes maximum slop 155 MB total memory in use (0 MB lost due to fragmentation) Tot time (elapsed) Avg pause Max pause Gen 0 992 colls, 0 par 0.03s 0.03s 0.0000s 0.0001s Gen 1 2 colls, 0 par 0.03s 0.03s 0.0161s 0.0319s INIT time 0.00s ( 0.00s elapsed) MUT time 1.24s ( 1.25s elapsed) GC time 0.06s ( 0.06s elapsed) EXIT time 0.03s ( 0.03s elapsed) Total time 1.34s ( 1.34s elapsed) %GC time 4.4% (4.5% elapsed) Alloc rate 540,674,868 bytes per MUT second Productivity 95.5% of total user, 95.1% of total elapsed ./improved +RTS -sstderr 1,14s user 0,20s system 99% cpu 1,352 total 

This is much better.


So now you may ask why the problem occurred even if you used seq ? The reason for this is that seq forces the first argument to WHNF , and for the pair (_,_) (where _ are the unvalued thunks) WHNF! In addition, seq aa matches a , because it seq ab (informally) means: evaluates the value of a before b is evaluated, so seq aa simply means: estimates a before the a is evaluated, and this is the same, which is just evaluating a!

+15
source share

All Articles