Short-lived memoization in Haskell?

In an object-oriented language, when I need to cache/memoize the results of a function for a known lifetime, I usually follow this pattern:

  • Create a new class
  • Add a data member and a method to the class for each function result I want to cache
  • Implement each method to first check whether the result has already been stored in the data member. If so, return that value; otherwise, call the function (with the appropriate arguments) and save the returned result in the data member.
  • Objects of this class are initialized with the values needed for the various function calls.
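For concreteness, here is a minimal sketch of that pattern in Haskell's IO flavor, since that is the language the answers below use. The names (`Cache`, `cachedSquare`) and the squaring "expensive" function are illustrative stand-ins of my own, not part of any library:

```haskell
import Data.IORef

-- A per-item "cache object": one slot per function result we want to cache.
data Cache = Cache
  { cacheArg   :: Int               -- the value the object was initialized with
  , squareSlot :: IORef (Maybe Int) -- Nothing until the result is computed
  }

newCache :: Int -> IO Cache
newCache x = Cache x <$> newIORef Nothing

-- Check the slot first; on a miss, run the function and store the result.
cachedSquare :: Cache -> IO Int
cachedSquare c = do
  cached <- readIORef (squareSlot c)
  case cached of
    Just v  -> pure v
    Nothing -> do
      let v = cacheArg c * cacheArg c  -- stand-in for an expensive call
      writeIORef (squareSlot c) (Just v)
      pure v
```

Once a work item is processed, the Cache is simply dropped and everything it stored becomes garbage-collectable.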

This object-oriented approach is very similar to the functional memoization pattern described here: http://www.bardiak.com/2012/01/javascript-memoization-pattern.html

The main advantage of this approach is that the results are kept only for the lifetime of the cache object. A common use case is processing a list of work items: for each work item, a cache object is created for that item, the item is processed with that cache object, and then both the work item and the cache object are discarded before moving on to the next item.

What are good ways to implement short-lived memoization in Haskell? And does the answer depend on whether the functions to be cached are pure or involve IO?

Just to repeat: it would be nice to see solutions for functions that involve IO.

+8
haskell memoization
4 answers

Let's use Luke Palmer's memoization library: Data.MemoCombinators

    import qualified Data.MemoCombinators as Memo
    import Data.Function (fix) -- we'll need this too

I am going to define things slightly differently from how his library works, but it is basically the same thing (and compatible with it). A "memoizable" thing takes itself as input and produces the "real" thing.

 type Memoizable a = a -> a 

A "memoizer" takes a function and produces a memoized version of it.

    type Memoizer a b = (a -> b) -> a -> b

Let's write a little function to tie these two things together: take a Memoizable function and a Memoizer, and get back the resulting memoized function.

    runMemo :: Memoizer a b -> Memoizable (a -> b) -> a -> b
    runMemo memo f = fix (f . memo)

This is a little bit of magic using the fix combinator. Don't worry about it; you can puzzle through it if you are interested.

Now let's write a Memoizable version of the classic example:

    fib :: Memoizable (Integer -> Integer)
    fib self = go
      where
        go 0 = 1
        go 1 = 1
        go n = self (n-1) + self (n-2)

Using the name self is a convention that keeps the code simple. Remember that self is, by assumption, the memoized version of this very function, so recursive calls must go through self. Now fire up ghci:

    ghci> let fib' = runMemo Memo.integral fib
    ghci> fib' 10000
    WALL OF NUMBERS CRANKED OUT RIDICULOUSLY FAST

Now, the neat thing about runMemo is that you can create several fresh memoized versions of the same function, and they will not share memo tables. That means I can write a function that creates and uses fib' locally, and then, as soon as fib' falls out of scope (or earlier, depending on how smart the compiler is), it can be garbage collected. It does not have to be memoized at the top level. This may or may not play well with memoization techniques that rely on unsafePerformIO. Data.MemoCombinators uses a pure, lazy trie, which works perfectly with runMemo. Rather than creating an object that essentially becomes a memoization manager, you can simply create memoized functions on demand.

The catch is that if your function is recursive, it must be written as Memoizable. The good news is that you can then plug in any Memoizer you want. You could even use:

    noMemo :: Memoizer a b
    noMemo f = f

    ghci> let fib' = runMemo noMemo fib
    ghci> fib' 30 -- wait a while; it's computing stupidly
    1346269
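If you don't have Data.MemoCombinators installed, here is a self-contained toy Memoizer you can try instead. listMemo is my own stand-in for Memo.integral (it assumes non-negative Int arguments and backs the function with a lazy infinite list), but the runMemo machinery is exactly as above:

```haskell
import Data.Function (fix)

type Memoizable a = a -> a
type Memoizer a b = (a -> b) -> a -> b

runMemo :: Memoizer a b -> Memoizable (a -> b) -> a -> b
runMemo memo f = fix (f . memo)

-- Toy memoizer for non-negative Ints: back the function with a lazy
-- infinite list, so each result is computed at most once.
listMemo :: Memoizer Int b
listMemo f = (results !!)
  where results = map f [0 ..]

fib :: Memoizable (Int -> Integer)
fib self = go
  where
    go 0 = 1
    go 1 = 1
    go n = self (n - 1) + self (n - 2)
```

With this, `runMemo listMemo fib 30` returns 1346269 immediately, and even `fib 50` is instant, while `runMemo noMemo fib 30` takes noticeably long.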
+12

Lazy evaluation in Haskell is, in a sense, the memoization paradigm taken to the extreme. Also, anything you do in an imperative language is possible in Haskell, using either the IO monad, the ST monad, monad transformers, arrows, or you name it.

The only problem is that these abstraction devices are much more complicated than the imperative equivalents you mentioned, and they require a pretty deep rewiring of your mindset.

+4

I believe the answers above are more complex than necessary, though they may be more portable than what I am about to describe.

As I understand it, GHC guarantees that each value bound in a where clause is evaluated at most once each time the enclosing lambda expression is entered. So you can create your short-lived memoization object precisely as follows.

    import qualified Data.Vector as V

    indexerVector :: (t -> Int) -> V.Vector t -> Int -> [t]
    indexerVector idx vec = \e -> tbl V.! e
      where
        m   = maximum $ map idx $ V.toList vec
        tbl = V.accumulate (flip (:)) (V.replicate (m + 1) [])
                           (V.map (\v -> (idx v, v)) vec)

What does this do? It groups all the elements of the Data.Vector t passed as the second argument vec according to the integer computed by the first argument idx, storing the grouping as a Data.Vector [t]. It returns a function of type Int -> [t] that looks up this precomputed grouping by index value.

Our GHC compiler has promised that tbl will be evaluated only once per call to indexerVector. We can therefore bind the lambda expression \e -> tbl V.! e returned by indexerVector to another name and reuse it, without fear that tbl will ever be recomputed. You can verify this by inserting a trace on tbl.

In short, your caching object is just that lambda expression.

I have found that almost everything you would accomplish with a short-lived memoization object can be achieved at least as well by returning a lambda expression like this.
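To make the technique concrete without the vector dependency, here is the same build-the-table-then-return-a-lambda idea using Data.Array from the array package that ships with GHC. indexerArray and the word-grouping example are my own illustration, not from the answer above:

```haskell
import Data.Array

-- Same idea as indexerVector: build tbl once, return a lambda
-- that closes over it.
indexerArray :: (t -> Int) -> [t] -> Int -> [t]
indexerArray idx xs = \e -> tbl ! e
  where
    m   = maximum (map idx xs)
    tbl = accumArray (flip (:)) [] (0, m) [ (idx v, v) | v <- xs ]
```

For example, `indexerArray length ["a", "bb", "cc", "ddd"]` gives a lookup function that maps 2 to ["cc","bb"] (flip (:) prepends, so later elements come first). The table tbl is a thunk captured by the returned lambda: it is built on the first lookup and shared by every subsequent call.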

+2

You can use the same pattern in Haskell. Lazy evaluation takes care of checking whether a value has already been evaluated. This has been mentioned several times already, but sample code may be useful. In the example below, memoedValue is evaluated at most once, and only when it is needed.

    data Memoed = Memoed
      { value       :: Int
      , memoedValue :: Int
      }

    memo :: Int -> Memoed
    memo i = Memoed
      { value       = i
      , memoedValue = expensiveComputation i
      }

Even better, you can memoize values that depend on other memoized values. Just avoid dependency loops; those lead to non-termination.

    data Memoed = Memoed
      { value        :: Int
      , memoedValue1 :: Int
      , memoedValue2 :: Int
      }

    memo :: Int -> Memoed
    memo i = r
      where
        r = Memoed
          { value        = i
          , memoedValue1 = expensiveComputation i
          , memoedValue2 = anotherComputation (memoedValue1 r)
          }
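Here is a runnable version of that sketch. expensiveComputation and anotherComputation are toy stand-ins I have made up, since the answer leaves them abstract:

```haskell
expensiveComputation :: Int -> Int
expensiveComputation i = i * i  -- pretend this is costly

anotherComputation :: Int -> Int
anotherComputation v = v + 1

data Memoed = Memoed
  { value        :: Int
  , memoedValue1 :: Int
  , memoedValue2 :: Int
  }

-- memoedValue2 reads memoedValue1 through r: laziness ties the knot,
-- and each field is evaluated at most once.
memo :: Int -> Memoed
memo i = r
  where
    r = Memoed
      { value        = i
      , memoedValue1 = expensiveComputation i
      , memoedValue2 = anotherComputation (memoedValue1 r)
      }
```

With these stand-ins, memoedValue2 (memo 5) is 26, and memoedValue1 is computed only once even though two fields refer to it.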
+1
