Fold Optimization

I'm just wondering if there are any (only for polymorphic) first order optimizations with folds.

There is deforestation for maps: map g (map f ls) => map (g . f) ls and rev (map f ls) => rev_map f ls (faster in Ocaml).

But the addition is so powerful that it seems that it can not be optimized.

+8
performance functional-programming fold ocaml
source share
3 answers

The obvious:

 fold_left f acc (List.map g li) => fold_left (fun acc x -> f acc (gx)) acc li fold_right f li acc => fold_left f acc li (* if (f,acc) is a monoid *) 

You might be interested in a classic article on the topic "Functional programming with bananas, lenses, envelopes, and barbed wire." Beware, however, that it is technical and has an impenetrable notation.

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.41.125

Edit: my first version of the first rule was incorrect, edited thanks to vincent-hugot.

+4
source share

You can use deforestation in the folds. In fact, map/map fusion is a special case of this.

The trick is to replace the list design with a special build function:

 build :: (forall b. (a -> b -> b) -> b -> b) -> [a] build g = g (:) [] 

Now using the standard definition of foldr

 foldr :: (a -> b -> b) -> b -> [a] -> b foldr cn [] = n foldr cn (x:xs) = cx (foldr cn xs) 

We have the following equivalence:

 foldr cn (build g) == gcn 

(Actually, this is true only under certain, but general circumstances. For details, see "Short circuit correctness" ).

If you write your functions to create a list (including map ) using build and your consumers using foldr , then the above equality can remove most of the intermediate lists. Haskell list enumerations translate to a combination of build and foldr .

The disadvantage of this approach is that it cannot handle left folds. Stream Fusion handles this just fine. It expresses producers of lists and transformers as streams (co-inductive data types like iterators). The above article is very readable, so I recommend a look.

The banana article mentioned by Gasche provides more detailed information on fold types and their equivalents.

Finally, there is Bird and Moor “Programming Algebra,” which mentions transformations such as combining two folds into one .

+3
source share

If you are interested in delving deeper into theory, I suggest you read something about catamorphisms , anamorphisms, and hylomorphisms . Although the category theory surrounding it may seem a little intimidating, the concept is not so difficult.

Catamorphisms are functions that consume recursive data structures and produce some kind of value. Anamorphisms are functions that, with a certain value (type of seed), create recursive data structures. In particular, foldr and build , mentioned in other anniversaries, are functions for constructing catamorphisms and anamorphisms in lists. But this concept can be applied to almost any recursive data structure, for example, to various types of trees, etc.

Now, if you create a recursive data structure with anamorphism, and then destroy it with a catamorphism, you get the so-called hilomorphism. In this case, you actually do not need an intermediate structure. You can skip its creation and destroy. This is often called a deforestation .


Regarding map : this function is interesting in that it is both a catamorphism and anamorphism:

  • map consumes a list and produces something; but also
  • map creates a list, consumes something.

Thus, you can consider the composition of two maps map f . map g map f . map g as a composition of anamorphism ( map g ) with a catamorphism ( map f ), forming a hilomorphism. Thus, you know that you can optimize (deforestation) without creating an interim list.

Specifically: you can write map two ways: using foldr , and the other using build :

 mapAna :: (a -> b) -> [a] -> [b] mapAna f xs = build (mapAna' f xs) mapAna' :: (a -> b) -> [a] -> (b -> c -> c) -> c -> c mapAna' f [] cons nil = nil mapAna' f (x : xs) cons nil = (fx) `cons` (mapAna' f xs cons nil) mapCata :: (a -> b) -> [a] -> [b] mapCata f xs = foldr (\x ys -> fx : ys) [] xs 

and the composition of map f (map g zs) as mapCata f (mapAna g zs) , which after some simplifications and application of foldr cn (build g) == gcn leads to map (f . g) .

+1
source share

All Articles