Is an intermediate data structure created in list comprehensions?

Question

It seems that foldr does some kind of fusion with the list comprehension, so it allocates less memory (11 MB) than foldl (21 MB) in, for example:

 myfunc = sum $ foldr g acc [ f x | x <- xs ]
 f x = ..
 g x y = ..

Can anyone explain how and why? An explanation of how lazy evaluation helps would also be welcome.

+7

haskell fold list-comprehension

vis Dec 9 '11 at 17:01

3 answers

We can desugar the comprehension into essentially map f xs. If you compile this, then GHC really should be able to fuse the sum, the fold, and the map into one pass: http://www.haskell.org/haskellwiki/Correctness_of_short_cut_fusion. But even if it did not, laziness is your friend when it comes to memory use. The list created by the map is lazy: f is only applied when it is demanded, and f is only demanded when the fold demands it. And since your foldr evidently produces another (lazy) list, each step of the fold is in turn only demanded by the sum. So you still use every function, but you never have to build the complete intermediate data structures all at once. Although you have written a whole set of function compositions, the evaluation model will bounce around in that particular piece of code, modulo a whole bunch of handwaving, somewhat like a loop (although, without fusion, a loop with a fair amount of indirection).
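The desugaring described above can be sketched as follows. The question elides the real definitions of `f`, `g`, `acc`, and `xs`, so the ones below are hypothetical stand-ins chosen only so that the code compiles; `g` is assumed to cons onto its second argument, which makes the foldr produce another lazy list.

```haskell
module Main where

-- Hypothetical stand-in: the question elides the real f.
f :: Int -> Int
f x = x * 2

-- Hypothetical stand-in for g. It conses onto its second argument without
-- inspecting it first, so foldr g yields another lazy list.
g :: Int -> [Int] -> [Int]
g x rest = x + 1 : rest

-- The pipeline as written in the question, with a list comprehension.
viaComprehension :: [Int] -> Int
viaComprehension xs = sum $ foldr g [] [ f x | x <- xs ]

-- The same pipeline with the comprehension written as map f xs.
viaMap :: [Int] -> Int
viaMap xs = sum $ foldr g [] (map f xs)

main :: IO ()
main = do
  print (viaComprehension [1 .. 1000])
  print (viaMap [1 .. 1000])
```

Both definitions compute the same value. Whether GHC actually fuses them into a single pass depends on the optimization settings and on the real `g`, so this only illustrates the shape of the code, not the memory figures from the question.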

+8

sclv Dec 9 '11 at 17:19

This is a feature of the GHC compiler. Basically, GHC can recognize when a list is used in a pipeline, and can transform the whole construct into the equivalent of a while-loop in C that does not allocate a list at all.

The reason this works with foldr rather than foldl depends on the function g that you use in your example. foldl needs the entire list before it can actually evaluate g, so in this case it builds up a huge "thunk" of unevaluated function applications ending at the final element of the list, and therefore uses a lot more memory. foldr, on the other hand, hands g each element together with the lazily evaluated result of folding the rest of the list, so g can start being evaluated as soon as any input from the list is available. That lets the compiler make some assumptions which can give rise to optimization.

If, for example, the function g yields a value that is a list, GHC can continue the aforementioned pipeline optimization strategy, basically treating the foldr as a map and running the whole construct (from list generation to list consumption) in a strict loop. This is only possible because such a foldr yields exactly one list element for every list element it consumes, which foldl is not guaranteed to do (especially for infinite lists).
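As an illustration of a foldr that behaves like a map, here is a small sketch. The name `mapViaFoldr` is made up for this example and does not come from the question.

```haskell
module Main where

-- A map written as a right fold: one output element per input element.
-- mapViaFoldr is a hypothetical name used only for this illustration.
mapViaFoldr :: (a -> b) -> [a] -> [b]
mapViaFoldr h = foldr (\x rest -> h x : rest) []

main :: IO ()
main =
  -- The input list is infinite, yet only the five demanded elements
  -- are ever computed.
  print (take 5 (mapViaFoldr (* 2) [1 ..]))
```

A foldl over the same infinite input would never return, because it has to reach the end of the list before it can produce anything.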

+1

dflemstr Dec 9 '11 at 17:23

Daniel Fischer · Accepted Answer · 2011-12-09T17:24:43+0000

A left fold cannot produce any output (any part of the result) before the entire list has been traversed. Depending on which function you fold with, this can build up a large data structure or a large thunk that uses a lot of memory (or it can run in constant memory if you fold, for example, (+) over a list of Int).

A right fold can, for suitable functions (those that can produce a [partial] result without inspecting their second argument), produce its result incrementally, so that if the result is consumed appropriately and the input list is generated appropriately, the whole computation can run in small constant space. As sclv said, in such cases it basically boils down to a loop.
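The two cases can be sketched like this. The helper names are hypothetical, and the left fold uses `foldl'` from `Data.List`, a strict variant swapped in here so that the accumulator is forced at every step; with plain `foldl`, constant memory depends on the optimizer.

```haskell
module Main where

import Data.List (foldl')

-- Right fold with a function that can answer without inspecting its
-- second argument: (||) ignores the rest once its left operand is True.
-- anyViaFoldr is a hypothetical name used only for this illustration.
anyViaFoldr :: (a -> Bool) -> [a] -> Bool
anyViaFoldr p = foldr (\x rest -> p x || rest) False

-- Left fold: no result is available until the whole list is traversed.
-- foldl' forces the accumulator at each step, so no chain of thunks builds up.
sumStrict :: [Int] -> Int
sumStrict = foldl' (+) 0

main :: IO ()
main = do
  -- Terminates on an infinite list, because the fold stops at 11.
  print (anyViaFoldr (> 10) [1 ..])
  -- Needs the whole (finite) list before it can print anything.
  print (sumStrict [1 .. 1000000])
```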
