foldl is tail recursive, so how come foldr runs faster than foldl?

I wanted to test foldl vs foldr. From what I've seen, you should use foldl over foldr whenever you can, due to tail recursion optimization.

It makes sense. However, after running this test, I am confused:

foldr (takes 0.057 s when using the time command):

 a :: a -> [a] -> [a]
 a x = ([x] ++)
 main = putStrLn (show (sum (foldr a [] [0..100000])))

foldl (takes 0.089 s when using the time command):

 b :: [b] -> b -> [b]
 b xs = (++ xs) . (\y -> [y])
 main = putStrLn (show (sum (foldl b [] [0..100000])))

Clearly this example is trivial, but I'm confused about why foldr beats foldl. Shouldn't this be a clear case where foldl wins?

+52
optimization tail-recursion haskell fold combinators
Aug 07 '10 at 7:54
7 answers

Welcome to the world of lazy evaluation.

When you think about it in terms of strict evaluation, foldl looks "good" and foldr looks "bad", because foldl is tail recursive while foldr would have to build a tower on the stack in order to process the last element first.

However, lazy evaluation turns the tables. Take, for example, the definition of the map function:

 map :: (a -> b) -> [a] -> [b]
 map _ [] = []
 map f (x:xs) = f x : map f xs

This would not be too good if Haskell used strict evaluation, since it would have to compute the whole tail first and then prepend the element (for every element in the list). The only way to do it efficiently, it would seem, is to build the elements in reverse.

However, thanks to Haskell's lazy evaluation, this map function is actually efficient. Lists in Haskell can be thought of as generators, and this map function generates its first element by applying f to the first element of the input list. When it needs a second element, it does the same thing again (this does not use extra space).
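To see the generator behaviour concretely, here is a small sketch (the names naturals and firstThree are mine, not from the answer): map can even consume an infinite list, because each output element is produced only on demand.

```haskell
-- Sketch: map produces elements on demand, so it can consume an
-- infinite list. ('naturals' and 'firstThree' are illustrative names.)
naturals :: [Integer]
naturals = [0..]                          -- an infinite list

firstThree :: [Integer]
firstThree = take 3 (map (+ 1) naturals)  -- only three elements are ever computed

main :: IO ()
main = print firstThree                   -- [1,2,3]
```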

It turns out that map can be described in terms of foldr :

 map f xs = foldr (\x ys -> f x : ys) [] xs

It is hard to see at a glance, but lazy evaluation kicks in because foldr can give f its first argument right away:

 foldr f z [] = z
 foldr f z (x:xs) = f x (foldr f z xs)

Because the f used by map can return the first element of the result list using only its first parameter, the fold can operate lazily in constant space.
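As a sketch of that point (myMap and headOfInfinite are illustrative names): because foldr hands f the list head right away, a foldr-built list can be consumed incrementally, even from an infinite input.

```haskell
-- Because foldr applies f to the head immediately, the first element
-- of the result is available before the rest of the input is touched.
myMap :: (a -> b) -> [a] -> [b]
myMap f = foldr (\x ys -> f x : ys) []

headOfInfinite :: Integer
headOfInfinite = head (myMap (* 2) [1..])  -- terminates, even on infinite input

main :: IO ()
main = print headOfInfinite                -- 2
```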

Now, lazy evaluation bites back. For example, try running sum [1..1000000]. It causes a stack overflow. Why should it? It should just evaluate from left to right, right?

Let's look at how Haskell evaluates it:

 foldl f z [] = z
 foldl f z (x:xs) = foldl f (f z x) xs

 sum = foldl (+) 0

 sum [1..1000000]
 = foldl (+) 0 [1..1000000]
 = foldl (+) ((+) 0 1) [2..1000000]
 = foldl (+) ((+) ((+) 0 1) 2) [3..1000000]
 = foldl (+) ((+) ((+) ((+) 0 1) 2) 3) [4..1000000]
 ...
 = (+) ((+) ((+) (...) 999999) 1000000)

Haskell is too lazy to perform the additions as it goes. Instead, it ends up with a tower of unevaluated thunks that must be forced to get a number. The stack overflows during this evaluation, since it has to recurse deeply to evaluate all the thunks.

Fortunately, there is a special function in Data.List called foldl' that operates strictly. foldl' (+) 0 [1..1000000] will not overflow the stack. (Note: I tried replacing foldl with foldl' in your test, but it actually made it run slower.)
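For reference, a minimal sketch of the strict fold in use (strictSum is an illustrative name; foldl' comes from Data.List):

```haskell
import Data.List (foldl')

-- foldl' forces the accumulator at every step, so no thunk tower
-- builds up and the sum runs in constant stack space.
strictSum :: [Integer] -> Integer
strictSum = foldl' (+) 0

main :: IO ()
main = print (strictSum [1..1000000])  -- 500000500000
```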

+80
Aug 07 '10 at 8:14

EDIT: Looking at this question again, I think all the current explanations are somewhat inadequate, so I've written a longer explanation.

The difference is how foldl and foldr apply their reduction function. Considering the case of foldr , we can expand it as

 foldr (\x -> [x] ++ ) [] [0..10000]
 [0] ++ foldr a [] [1..10000]
 [0] ++ ([1] ++ foldr a [] [2..10000])
 ...

This list is processed by sum , which consumes it as follows:

 sum = foldl' (+) 0

 foldl' (+) 0 ([0] ++ ([1] ++ ... ++ [10000]))
 foldl' (+) 0 (0 : [1] ++ ... ++ [10000])     -- get head of list from '++' definition
 foldl' (+) 0 ([1] ++ [2] ++ ... ++ [10000])  -- add accumulator and head of list
 foldl' (+) 0 (1 : [2] ++ ... ++ [10000])
 foldl' (+) 1 ([2] ++ ... ++ [10000])
 ...

I've left out the details of the list concatenation, but this is how the reduction proceeds. The important part is that everything gets processed in an order that minimizes list traversals. foldr only traverses the list once, the concatenations don't require continuous traversals of the list, and sum finally consumes the list in one pass. Critically, the head of the list is available from foldr immediately, passed right up to sum, so sum can begin working straight away and values can be garbage-collected as they are generated. With fusion frameworks such as vector, even the intermediate lists will likely be fused away.

Contrast this with the foldl function:

 b xs = (++ xs) . (\y -> [y])

 foldl b [] [0..10000]
 foldl b ( [0] ++ [] ) [1..10000]
 foldl b ( [1] ++ ([0] ++ []) ) [2..10000]
 foldl b ( [2] ++ ([1] ++ ([0] ++ [])) ) [3..10000]
 ...

Note that now the head of the list is not available until foldl is finished. This means the entire list must be constructed in memory before sum can begin to work, which is much less efficient overall. Running the two versions with +RTS -s shows miserable garbage collection performance from the foldl version.

This is also a case where foldl' doesn't help. The added strictness of foldl' doesn't change the way the intermediate list is created. The head of the list remains unavailable until foldl' has finished, so the result will still be slower than with foldr.

I use the following rule of thumb to determine the best choice of fold:

  • For folds that are a reduction, use foldl' (i.e., when the fold will be the only/final traversal)
  • Otherwise use foldr .
  • Do not use foldl .

In most cases foldr is the best fold function because the traversal direction is optimal for lazy evaluation of lists. It is also the only one capable of processing infinite lists. The extra strictness of foldl' can make it faster in some cases, but this depends on how you use the structure and how lazy it is.
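A small sketch of that rule of thumb in practice (total and doubles are illustrative names): use foldl' for a reduction to a single value, and foldr when the result is a structure consumed lazily.

```haskell
import Data.List (foldl')

-- Reduction to a single value: foldl' keeps the accumulator evaluated.
total :: [Int] -> Int
total = foldl' (+) 0

-- Building a structure that is consumed lazily: foldr streams,
-- and even works on infinite input.
doubles :: [Int] -> [Int]
doubles = foldr (\x ys -> 2 * x : ys) []

main :: IO ()
main = do
  print (total [1..100])          -- 5050
  print (take 3 (doubles [1..]))  -- [2,4,6]
```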

+23
Aug 07 '10 at 9:53

I don't think anyone has actually said the real answer to this one yet, unless I'm missing something (which may well be true and welcomed with downvotes).

I think the biggest difference in this case is that foldr builds the list like this:

[0] ++ ([1] ++ ([2] ++ (... ++ [1000000])))

While foldl builds the list as follows:

((((([0] ++ [1]) ++ [2]) ++ ...) ++ [999998]) ++ [999999]) ++ [1000000]

The difference is subtle, but notice that in the foldr version ++ always has only a single list element as its left argument. In the foldl version there are up to 999999 elements in ++'s left argument (around 500000 on average), but only a single element in the right argument.

However, ++ takes time proportional to the size of its left argument, since it has to walk down the entire left list to the end and then attach that last element to the first element of the right argument (and at best, it may actually need to make a copy). The right-argument list is unchanged, so it doesn't matter how big it is.
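That cost follows directly from the standard definition of ++, sketched here under the name append to avoid clashing with the Prelude: it recurses down its whole left argument, rebuilding it cell by cell, while the right argument is reused untouched.

```haskell
-- (++) recurses down its entire left argument, rebuilding it cell by
-- cell, while the right argument is shared untouched.
append :: [a] -> [a] -> [a]
append []     ys = ys
append (x:xs) ys = x : append xs ys

main :: IO ()
main = print (append [1, 2] [3, 4 :: Int])  -- [1,2,3,4]
```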

This is why the foldl version is much slower. In my opinion, it has nothing to do with laziness.

+6
Mar 14 '13 at 5:39

The problem is that tail recursion optimization is a memory optimization, not a runtime optimization!

Tail recursion optimization avoids the need to remember values for each recursive call.

So foldl is actually “good” and foldr is “bad”.

For example, given the definitions of foldr and foldl:

 foldl f z [] = z
 foldl f z (x:xs) = foldl f (z `f` x) xs

 foldr f z [] = z
 foldr f z (x:xs) = x `f` (foldr f z xs)

How the expression "foldl (+) 0 [1,2,3]" is evaluated:

 foldl (+) 0 [1, 2, 3]
 foldl (+) (0+1) [2, 3]
 foldl (+) ((0+1)+2) [3]
 foldl (+) (((0+1)+2)+3) []
 (((0+1)+2)+3)
 ((1+2)+3)
 (3+3)
 6

Note that foldl does not remember the values 0, 1, 2, ..., but lazily passes the whole expression (((0+1)+2)+3) along as an argument and does not evaluate it until the last call of foldl, where it reaches the base case and returns the value passed as the second parameter (z), which has not yet been evaluated.

On the other hand, how foldr works:

 foldr (+) 0 [1, 2, 3]
 1 + (foldr (+) 0 [2, 3])
 1 + (2 + (foldr (+) 0 [3]))
 1 + (2 + (3 + (foldr (+) 0 [])))
 1 + (2 + (3 + 0))
 1 + (2 + 3)
 1 + 5
 6

The important difference here is that while foldl evaluates the whole expression in the last call, avoiding the need to come back to reach remembered values, foldr does not: foldr remembers one integer for each call and performs an addition in each call.
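To make the contrast concrete, here is a sketch of a hand-rolled strict left fold (foldlStrict is an illustrative name): forcing the accumulator with seq at each step, which is essentially what foldl' does, prevents the thunk tower from building up.

```haskell
-- A hand-rolled strict left fold: seq forces the accumulator before
-- each recursive call, so no nested (+) thunks pile up. This is
-- essentially how Data.List.foldl' behaves.
foldlStrict :: (b -> a -> b) -> b -> [a] -> b
foldlStrict _ z []     = z
foldlStrict f z (x:xs) = let z' = f z x in z' `seq` foldlStrict f z' xs

main :: IO ()
main = print (foldlStrict (+) 0 [1, 2, 3 :: Int])  -- 6
```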

It is important to bear in mind that foldr and foldl are not always equivalent. For example, try computing these expressions in Hugs:

 foldr (&&) True (False : repeat True)
 foldl (&&) True (False : repeat True)

foldr and foldl are equivalent only under certain conditions described here
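A sketch of that non-equivalence (shortCircuit is an illustrative name): the foldr expression terminates because && never looks at its second argument once it sees False, while the foldl version, left commented out here, would loop forever.

```haskell
-- (&&) ignores its second argument when the first is False, so foldr
-- never forces the infinite tail.
shortCircuit :: Bool
shortCircuit = foldr (&&) True (False : repeat True)

-- The foldl version must reach the end of the list before producing
-- anything, so it would never terminate:
-- diverges = foldl (&&) True (False : repeat True)

main :: IO ()
main = print shortCircuit  -- False
```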

(sorry for my bad english)

+5
Feb 16 '13 at 16:12

For a, the [0..100000] list needs to be expanded right away so that foldr can start with the last element. Then, as it folds things back together, the intermediate results are

 [100000]
 [99999, 100000]
 [99998, 99999, 100000]
 ...
 [0..100000]  -- i.e., the original list

Since nothing can ever change the value of this list (Haskell is a pure functional language), the compiler can reuse it. The intermediate values, such as [99999, 100000], can even simply be pointers into the expanded [0..100000] list instead of separate lists.

For b, look at the intermediate values:

 [0]
 [0, 1]
 [0, 1, 2]
 ...
 [0, 1, ..., 99999]
 [0..100000]

None of these intermediate lists can be reused, because if you changed the end of a list you would change any other values that point to it. So you are creating a bunch of extra lists that take time to build up in memory. In this case, you spend much more time allocating and filling in these lists that are intermediate values.

Since all a is really doing is making a copy of the list, it runs faster: it starts by expanding the full list and then just keeps moving a pointer from the back of the list to the front.

+2
Aug 07 '10

Neither foldl nor foldr is tail-call optimized. Only foldl' is.

But in your case, using ++ with foldl' isn't a good idea either, because successive evaluation of ++ will keep re-traversing the growing accumulator.

+1
Aug 07 '10

OK, let me rewrite your functions so that the difference is obvious:

 a :: a -> [a] -> [a]
 a = (:)

 b :: [b] -> b -> [b]
 b = flip (:)

You can see that b is more complex than a. To be precise, a needs one reduction step for a value to be computed, while b needs two. That makes the time difference you are measuring: in the second example, twice as many reductions have to be performed.
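A small sketch checking that these rewrites agree with the question's original definitions (the equality checks are mine, not from the answer):

```haskell
-- The question's original a and b, checked against the simplified forms.
a :: a -> [a] -> [a]
a x = ([x] ++)                 -- behaves exactly like (:)

b :: [b] -> b -> [b]
b xs = (++ xs) . (\y -> [y])   -- behaves exactly like flip (:)

main :: IO ()
main = do
  print (a 1 [2, 3 :: Int] == (:) 1 [2, 3])       -- True
  print (b [2, 3 :: Int] 1 == flip (:) [2, 3] 1)  -- True
```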

// edit: But the time complexity is the same, so I wouldn't worry about it much.

-2
Aug 07 '10 at 8:12


