What are the options for speeding up this function?

I am trying to speed up the following function:

    {-# LANGUAGE BangPatterns #-}
    import Data.Word
    import Data.Bits
    import Data.List (foldl1')
    import System.Random
    import qualified Data.List as L

    data Tree a = AB (Tree a) (Tree a) | A (Tree a) | B (Tree a) | C !a deriving Show

    merge :: Tree a -> Tree a -> Tree a
    merge (C x) _               = C x
    merge _ (C y)               = C y
    merge (A ta) (A tb)         = A (merge ta tb)
    merge (A ta) (B tb)         = AB ta tb
    merge (A ta) (AB tb tc)     = AB (merge ta tb) tc
    merge (B ta) (A tb)         = AB tb ta
    merge (B ta) (B tb)         = B (merge ta tb)
    merge (B ta) (AB tb tc)     = AB tb (merge ta tc)
    merge (AB ta tb) (A tc)     = AB (merge ta tc) tb
    merge (AB ta tb) (B tc)     = AB ta (merge tb tc)
    merge (AB ta tb) (AB tc td) = AB (merge ta tc) (merge tb td)

To stress-test its performance, I implemented a sort using merge:

    fold ab a b c list = go list where
        go (AB a' b') = ab (go a') (go b')
        go (A a')     = a (go a')
        go (B b')     = b (go b')
        go (C x)      = c x

    mergeAll :: [Tree a] -> Tree a
    mergeAll = foldl1' merge

    foldrBits :: (Word32 -> t -> t) -> t -> Word32 -> t
    foldrBits cons nil word = go 32 word nil where
        go 0 w !r = r
        go l w !r = go (l-1) (shiftR w 1) (cons (w.&.1) r)

    word32ToTree :: Word32 -> Tree Word32
    word32ToTree w = foldrBits cons (C w) w where
        cons 0 t = A t
        cons 1 t = B t

    toList = fold (++) id id (\ a -> [a])

    sort = toList . mergeAll . map word32ToTree

    main = do
        is <- mapM (const randomIO :: a -> IO Word32) [0..500000]
        print $ sum $ sort is

The performance looks quite good: about 2.5 times slower than Data.List's sort. Nothing I tried sped it up any further, though: inlining several functions, adding bang annotations in many places, putting UNPACK on C !a. Is there a way to speed up this function?

1 answer

You definitely have too many thunks allocated. I will show how to analyze the code:

 merge (A ta) (A tb) = A (merge ta tb) 

Here you allocate an A constructor with one argument, which is a thunk. Can you tell when the thunk merge ta tb will be forced? Probably only at the very end, when the resulting tree is used. Try adding a bang to each argument of each Tree constructor to make it strict:

 data Tree a = AB !(Tree a) !(Tree a) | A !(Tree a) | B !(Tree a) | C !a 
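To see what the bangs change, here is a minimal sketch that is not part of the original code. LazyTree, StrictTree and forcesOk are hypothetical names introduced only for this illustration. It puts undefined into a constructor field and checks whether evaluating the constructor to weak head normal form touches that field:

```haskell
import Control.Exception (SomeException, catch, evaluate)

-- Hypothetical lazy variant: the field may hold an unevaluated thunk.
data LazyTree a = LA (LazyTree a) | LC a

-- Hypothetical strict variant: the bang forces the field when the
-- constructor itself is evaluated.
data StrictTree a = SA !(StrictTree a) | SC !a

-- True if evaluating the value to weak head normal form succeeds,
-- False if doing so throws an exception.
forcesOk :: a -> IO Bool
forcesOk x = (evaluate x >> return True) `catch` handler
  where
    handler :: SomeException -> IO Bool
    handler _ = return False

main :: IO ()
main = do
  lazyOk   <- forcesOk (LA undefined)  -- the field stays a thunk
  strictOk <- forcesOk (SA undefined)  -- the field is forced immediately
  print (lazyOk, strictOk)             -- prints (True,False)
```

With the strict fields, A (merge ta tb) can no longer hold merge ta tb as a suspended computation: the recursive call is evaluated as soon as the constructor is. Whether that actually makes the sort faster still has to be measured.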

The following example:

 go l w !r = go (l-1) (shiftR w 1) (cons (w.&.1) r) 

Here you allocate thunks for l-1, shiftR w 1 and cons (w.&.1) r. The first will be forced in the next iteration when l is compared with 0, the second will be forced when the third thunk is forced in the next iteration (w is used there), and the third will be forced in the next iteration because of the bang on r. So this particular case is probably fine.
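If you want to rule this loop out anyway, one experiment is to put a bang on every loop argument and compare timings. The following is only a sketch: foldrBitsStrict is a hypothetical name, and the bit-counting call in main is there just to exercise the function.

```haskell
{-# LANGUAGE BangPatterns #-}
import Data.Bits (shiftR, (.&.))
import Data.Word (Word32)

-- Hypothetical variant of foldrBits with a bang on every loop argument,
-- so none of the three arguments is passed on as a thunk.
foldrBitsStrict :: (Word32 -> t -> t) -> t -> Word32 -> t
foldrBitsStrict cons nil word = go (32 :: Int) word nil
  where
    go 0 _ !r = r
    go !l !w !r = go (l - 1) (shiftR w 1) (cons (w .&. 1) r)

main :: IO ()
main =
  -- Sums the 32 bits of the word, i.e. counts the bits that are set.
  print (foldrBitsStrict (\b acc -> acc + fromIntegral b) (0 :: Int) 0xF0F0F0F0)
  -- prints 16
```

Since each of these thunks is forced one iteration later anyway, the difference may well be negligible. GHC's strictness analysis with -O may already produce the same code.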
