There is a well-known trick in functional programming that takes a non-tail-recursive function and rewrites it in continuation-passing style (CPS), thereby trivially making it tail recursive. Many questions cover this, for example:
Take this example:
let rec count n =
    if n = 0 then 0
    else 1 + count (n - 1)

let rec countCPS n cont =
    if n = 0 then cont 0
    else countCPS (n - 1) (fun ret -> cont (ret + 1))
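For readers who want to experiment outside F#, here is a Python analogue of the two functions (the names `count` and `count_cps` are mine); Python does not eliminate tail calls, but the structural difference between the two versions is the same:

```python
def count(n):
    # Direct style: the addition happens AFTER the recursive call
    # returns, so each level needs its own stack frame.
    if n == 0:
        return 0
    return 1 + count(n - 1)

def count_cps(n, cont):
    # CPS: the recursive call is in tail position; all pending work
    # is wrapped into the continuation closure instead.
    if n == 0:
        return cont(0)
    return count_cps(n - 1, lambda ret: cont(ret + 1))

print(count(10))                   # 10
print(count_cps(10, lambda x: x))  # 10
```

Note that each recursive `count_cps` call has its own scope, so each lambda correctly captures that call's `cont`.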
The first version, count, accumulates a stack frame on each recursive call, causing a stack overflow at around n = 60000 on my machine.
The idea behind the CPS trick is that the implementation of countCPS is tail recursive, so the computation
let f = countCPS 60000
will actually be optimized to run as a loop and execute without problems. Instead of stack frames, what accumulates at every step is the continuation to be executed later, but that is an ordinary object on the heap, where memory is not a problem. The CPS style is therefore said to trade stack space for heap space. But I am skeptical about this.
Here's why: evaluating the computation by actually running the continuation, as in countCPS 60000 (fun x -> x), blows my stack! Every call
countCPS (n - 1) (fun ret -> cont (ret + 1))
constructs a new continuation closure from the old one and invokes it with a single function application. So when evaluating countCPS 60000 (fun x -> x), we end up invoking a nested chain of 60000 closures, and even though their data lives on the heap, the invocations are still nested function applications, so we get stack frames again.
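The same effect can be observed in a Python sketch (again an analogue of the F# code, not the .NET behavior): building the chain of closures is a heap-only loop, but running the finished chain nests one call frame per closure:

```python
import sys

def identity(x):
    return x

def make_chain(n):
    # Build the continuation chain iteratively: this loop only
    # allocates closures on the heap; no stack growth happens here.
    cont = identity
    for _ in range(n):
        # Wrap in an immediately-applied lambda so each closure
        # captures the current `cont`, not the loop variable.
        cont = (lambda inner: lambda ret: inner(ret + 1))(cont)
    return cont

print(make_chain(100)(0))  # 100: running the chain nests 100 calls

sys.setrecursionlimit(1000)
try:
    make_chain(100_000)(0)  # building succeeds; *running* overflows
except RecursionError:
    print("stack exhausted while running the continuation chain")
```

Building 100,000 closures is no problem; it is the final invocation that exhausts the stack, which is exactly the point of the question.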
Let me dive into the disassembled C# of the generated code.
For countCPS we get
public static a countCPS<a>(int n, FSharpFunc<int, a> cont)
{
    while (n != 0)
    {
        int arg_1B_0 = n - 1;
        cont = new Program<a>.countCPS@10(cont);
        n = arg_1B_0;
    }
    return cont.Invoke(0);
}
There we go: the tail recursion is indeed optimized into a loop. However, the closure class looks like this:
internal class countCPS@10<a> : FSharpFunc<int, a>
{
    public FSharpFunc<int, a> cont;

    internal countCPS@10(FSharpFunc<int, a> cont)
    {
        this.cont = cont;
    }

    public override a Invoke(int ret)
    {
        return this.cont.Invoke(ret + 1);
    }
}
So invoking the outermost closure causes its .Invoke to call the child closure's .Invoke, which calls its own child, and so on... We really do have 60,000 nested function calls again.
So I do not see how the continuation trick can actually do what is advertised.
Now one could argue that this.cont.Invoke is itself a tail call, so it does not need a stack frame. Does .NET perform this kind of optimization? And what about more complex examples, such as
let rec fib_cps n k = match n with
    | 0 | 1 -> k 1
    | n -> fib_cps (n-1) (fun a -> fib_cps (n-2) (fun b -> k (a+b)))
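As a concrete Python analogue (my naming), the same shape for Fibonacci shows that here a continuation can even capture a pending recursive call, not just a deferred increment:

```python
def fib_cps(n, k):
    # CPS Fibonacci: both recursive calls are in tail position.
    # The second recursive call and the addition are deferred into
    # the continuations instead of living on the stack.
    if n == 0 or n == 1:
        return k(1)
    return fib_cps(n - 1,
                   lambda a: fib_cps(n - 2,
                                     lambda b: k(a + b)))

print(fib_cps(10, lambda x: x))  # 89
```

With the base cases fib(0) = fib(1) = 1, this yields the sequence 1, 1, 2, 3, 5, 8, ..., so fib(10) = 89.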
At the very least, we would have to explain why the deferred function calls stored in the continuation can be optimized as well.
Edit
interface FSharpFunc<A, B>
{
    B Invoke(A arg);
}

class Closure<A> : FSharpFunc<int, A>
{
    public FSharpFunc<int, A> cont;

    public Closure(FSharpFunc<int, A> cont)
    {
        this.cont = cont;
    }

    public A Invoke(int arg)
    {
        return cont.Invoke(arg + 1);
    }
}

class Identity<A> : FSharpFunc<A, A>
{
    public A Invoke(A arg)
    {
        return arg;
    }
}

static void Main(string[] args)
{
    FSharpFunc<int, int> computation = new Identity<int>();

    for (int n = 10; n > 0; --n)
        computation = new Closure<int>(computation);

    Console.WriteLine(computation.Invoke(0));
}
To be precise, this C# code models the closure chain that the CPS-style function builds up.
Clearly, the data lives on the heap. However, evaluating computation.Invoke(0) leads to a cascade of nested Invoke calls through the child closures. Just set a breakpoint on Identity.Invoke and look at the stack trace! So how can CPS be said to trade stack space for heap space, if it actually makes heavy use of both?