Why two definitions of "reverse" in Haskell Data.List

Question

Why two definitions of "reverse" in Haskell Data.List

Looking at the source Data.List shows reverse , defined as

 #ifdef USE_REPORT_PRELUDE reverse = foldl (flip (:)) [] #else reverse l = rev l [] where rev [] a = a rev (x:xs) a = rev xs (x:a) #endif

I would like to know why the second definition is provided? In a way, is it superior?

EDIT:

Like @nm below, the second version is "as a simple reworking of the first version, with the extension foldl and flip (:) ". Indeed, the Data.List itself defines foldl as

 foldl :: (b -> a -> b) -> b -> [a] -> b foldl f z0 xs0 = lgo z0 xs0 where lgo z [] = z lgo z (x:xs) = lgo (fzx) xs

It is impossible to find out the motivation of the authors of Data.List (if only one of them does not visit this page), but as a beginner, is it normal if I write code as the first version of reverse and leave the compiler to do inlining for me?

+7

haskell

Jyotirmoy bhattacharya Jun 24 '14 at 15:49

source share

2 answers

I believe the first version,

 reverse = foldl (flip (:)) []

is the reverse version defined in the Haskell report, and the second version

 reverse l = rev l [] where rev [] a = a rev (x:xs) a = rev xs (x:a)

- equivalent function, more efficient. You can see that the second version uses the accumulation parameter, so all this is a tail call, which is very effective for most Haskell implementations.

The second version is the one provided by default, perhaps the first one is provided so that the compiler authors can check how well the programs work with shorter definitions of functions in the report.

Note. . Others seem to have come to the same conclusion in a message to a haskell cafe .

+10

Dietrich epp Jun 24 '14 at 15:55

source share

Reid barton · Accepted Answer · 2014-06-26T14:47:28+0000

We can compile copies of these two versions and see what ghc outputs are.

 module Rev where myReverse1 = foldl (flip (:)) [] myReverse2 l = rev l [] where rev [] a = a rev (x:xs) a = rev xs (x:a)

Build with -ddump-simpl to see the generated kernel, and -dsuppress-all to eliminate some non-local noise:

 rwbarton@morphism :/tmp$ ghc -O -ddump-simpl -dsuppress-all -fforce-recomp Rev [1 of 1] Compiling Rev ( Rev.hs, Rev.o ) ==================== Tidy Core ==================== Result size of Tidy Core = {terms: 40, types: 58, coercions: 0} Rec { myReverse3 myReverse3 = \ @ a_aNh z_aNB ds_aNC -> case ds_aNC of _ { [] -> z_aNB; : x_aNH xs_aNI -> myReverse3 (: x_aNH z_aNB) xs_aNI } end Rec } myReverse1 myReverse1 = \ @ a_aNh xs0_aNz -> myReverse3 ([]) xs0_aNz Rec { myReverse4 myReverse4 = \ @ a_aMV ds_dNu a1_auj -> case ds_dNu of _ { [] -> a1_auj; : x_auk xs_aul -> myReverse4 xs_aul (: x_auk a1_auj) } end Rec } myReverse2 myReverse2 = \ @ a_aMV l_auh -> myReverse4 l_auh ([])

myReverse3 at myReverse3 compared to myReverse4 shows that they are the same, except that they take their arguments in the reverse order. Indeed, you can see that lgo in foldl has its arguments, canceled from rev in myReverse2 . I am sure that as a result of this there is no noticeable difference in performance, and if it is unintentional.

So yes, with optimizations, the GHC will compile the two definitions of reverse , essentially the same thing. My assumptions about why there is a built-in definition,

The implementation of most standard libraries dates back a long time and was used for sharing between GHC, Hugs, and several other Haskell compilers. Perhaps GHC or one of the other systems at that time was not so good at optimizing.
Today, it’s still useful to use these manually optimized versions when developing GHC: It is common to create a compiler and its libraries with optimizations disabled (since it is much faster), and then manual optimization, for example, this means that the resulting compiler and programs created by it are more effective , I checked, and the usual "fast" BuildFlavour still creates libraries with -O , so there is no truth in that.

Why two definitions of "reverse" in Haskell Data.List

More articles: