TL;DR: recursion is used to process inductively defined data, which is ubiquitous.
Recursion is natural when you work at higher levels of abstraction. Functional programming is not just about coding with functions; it's about working at higher levels of abstraction, where you naturally use functions. And using functions, it is natural to reuse the same function (call it again) from any context where it makes sense.
The world is built by repetition of similar/same building blocks. If you cut a piece of fabric in two, you get two pieces of fabric. Mathematical induction is at the heart of mathematics. We humans count (1, 2, 3, ...). Anything defined inductively (for example, {numbers from 1} are {1, and numbers from 2}) is natural to process/analyze with a recursive function, following the same cases by which that thing is defined/constructed.
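As a small sketch of that idea (my illustration, not part of the original answer): the natural numbers can be defined inductively as a Haskell data type, and a function over them follows exactly the same cases the type is built from.

```haskell
-- Peano naturals, defined inductively: a natural number is either
-- Zero, or the successor of some other natural number.
data Nat = Zero | Succ Nat

-- Converting to Int follows the same two cases the type is defined by.
toInt :: Nat -> Int
toInt Zero     = 0
toInt (Succ n) = 1 + toInt n
```

For instance, `toInt (Succ (Succ (Succ Zero)))` evaluates to 3: the function's structure mirrors the data's structure.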
Recursion is everywhere. Any iterative loop is recursion in disguise anyway, because when you re-enter that loop, you enter the same loop again (just with possibly different loop variables). So it's not like inventing new concepts in computing; it's more like discovering the foundations and making them explicit.
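For example (a hypothetical illustration of my own), an imperative loop summing the numbers 1..n corresponds directly to a tail-recursive helper whose arguments play the role of the loop variables:

```haskell
-- The imperative loop
--   s := 0; i := 1; while (i <= n) { s := s + i; i := i + 1 }; return s
-- re-enters the same loop body with new values of (s, i); the pure version
-- re-calls the same function with those values as new arguments.
sumTo :: Int -> Int
sumTo n = go 0 1
  where
    go s i
      | i > n     = s                -- loop exit: return the sum
      | otherwise = go (s + i) (i + 1)  -- next iteration, updated "loop variables"
```

Here `sumTo 10` computes 55, just as the loop would.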
So recursion is natural. We just write down some laws about our problem, some equations involving the function we're defining that preserve some invariant (under the assumption that the function is coherently defined), re-stating the problem in simplified terms, and voilà! We have the solution.
An example: a function to calculate the length of a list (an inductively defined, recursive data type). Suppose it is defined, and returns the list's length, unsurprisingly. What laws must it obey? What invariant is preserved under what simplification of the problem?
The most immediate one is taking the list apart into its head element and the rest, aka the tail of the list (according to how a list is defined/built). The law:
length (x:xs) = 1 + length xs
Duh! But what about an empty list? It must be that
length [] = 0
So how do we write such a function? ... Wait ... We've already written it! (That's Haskell, if you're wondering, where function application is expressed by juxtaposition, parentheses are used only for grouping, and (x:xs) is a list with x its head element and xs the rest of it.)
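Spelled out as a runnable sketch (under a fresh name, `len`, only to avoid a clash with the Prelude's `length`), the two laws above are the whole definition:

```haskell
-- The two laws from the text, verbatim, are the complete definition.
len :: [a] -> Int
len []     = 0            -- law for the empty list
len (x:xs) = 1 + len xs   -- law for a non-empty list

-- Unfolding by hand:
--   len [1,2,3] = 1 + len [2,3] = 1 + (1 + len [3])
--               = 1 + (1 + (1 + len [])) = 1 + 1 + 1 + 0 = 3
```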
All we need from a language to allow this style of programming is that it has TCO (and perhaps, a bit luxuriously, TRMCO), so there's no stack blowup, and we're set.
Another thing is purity: the immutability of code variables and/or data structures (record fields, etc.).
What that does, besides freeing our minds from having to track what is changing when, is that it makes time explicitly apparent in our code, instead of hiding it in our "changing" mutable variables/data. In imperative code we can only "change" a variable's value from now on; we can't very well change its value in the past, can we?
And so we end up with lists of recorded change history, as explicitly apparent expressions in our code: instead of x := x + 2 we write let x2 = x1 + 2. That makes reasoning about code much easier.
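A tiny sketch of that contrast (my illustration): each "update" just binds a new name, so every intermediate value stays visible in the code.

```haskell
-- Imperative:  x := 5 ; x := x + 2 ; x := x * 3
-- Pure: each step gets its own name, so the whole history is recorded.
steps :: (Int, Int, Int)
steps = (x1, x2, x3)
  where
    x1 = 5
    x2 = x1 + 2   -- instead of  x := x + 2
    x3 = x2 * 3   -- instead of  x := x * 3
```

`steps` evaluates to (5, 7, 21), and all three moments in the variable's "history" remain available for inspection or reuse.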
To illustrate immutability in the context of tail recursion with TCO, consider this tail-recursive rewrite of the above length function under the accumulator argument paradigm:
length xs = length2 0 xs              -- the invariant:
length2 a []     = a                  --   1st arg, plus
length2 a (x:xs) = length2 (a+1) xs   --   length of 2nd arg
Here TCO means reusing the call frame, in addition to the direct jump, so the chain of calls for length [1,2,3] can be seen as actually mutating the call stack frame entries corresponding to the function's parameters:
length [1,2,3]
length2 0 [1,2,3]     -- a=0  (x:xs)=[1,2,3]
length2 1 [2,3]       -- a=1  (x:xs)=[2,3]
length2 2 [3]         -- a=2  (x:xs)=[3]
length2 3 []          -- a=3  (x:xs)=[]
3
In a pure language, without any value-mutating primitives, the only way to express change is to pass the updated values as arguments to a function, for further processing. If the further processing is the same as before, then naturally we must call the same function for it, passing it the updated values as arguments. And that's recursion.
And the following makes the whole history of computing the length of the argument list explicit (and available for reuse, if needed):
length xs = last results
  where results = length3 0 xs
        length3 a []     = [a]
        length3 a (x:xs) = a : length3 (a+1) xs
In Haskell this is known as guarded recursion, or corecursion (at least I think it is).
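As an aside (my observation, not from the answer itself), this running-accumulator-history pattern is what the Prelude's `scanl` captures, so the same list of partial counts can be written without explicit recursion:

```haskell
-- lengths xs produces the same history of partial counts as length3 0 xs:
-- each element of the input bumps the accumulator by one.
lengths :: [a] -> [Int]
lengths = scanl (\a _ -> a + 1) 0

-- last (lengths xs) is then the length of xs, just as in the
-- "length xs = last results" definition above.
```

For example, `lengths [1,2,3]` is [0,1,2,3], whose last element, 3, is the length.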