Yes, you can define numbers (and indeed, arbitrary data types) inside the lambda calculus. Here is one way to do it.
First, let me pick which numbers we will define. The simplest numbers to work with are the natural numbers: 0, 1, 2, 3, and so on. How do we define them? The usual approach is to use Peano's axioms:
- 0 is a natural number.
- If n is a natural number, then Sn is a natural number.
Here Sn denotes the successor of n, i.e., n + 1. Thus, the first few Peano natural numbers are 0, S0, SS0, SSS0, etc.; this is a unary representation.
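If it helps to see this unary representation concretely, here is a minimal Haskell sketch of the Peano naturals (essentially the same declaration shows up again near the end of this answer); the names `zero` through `three` are just for illustration.

```haskell
-- Peano naturals as a Haskell data type: Z is 0, S is the successor.
data Nat = Z | S Nat
  deriving Show

zero, one, two, three :: Nat
zero  = Z
one   = S Z          -- S0
two   = S (S Z)      -- SS0
three = S (S (S Z))  -- SSS0
```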
Now, in the lambda calculus, we can represent function application, so we can represent Sn; but we don't know how to represent 0 and S themselves. Fortunately, the lambda calculus gives us a way to defer this choice: we can take them as arguments and let somebody else decide! Write z for a given value of 0, and s for a given S. Then we can represent the first few numbers as follows, writing ⟦n⟧ for "the lambda-calculus representation of the natural number n":
- ⟦0⟧ = λz s. z
- ⟦1⟧ = λz s. s z
- ⟦2⟧ = λz s. s (s z)
- ⟦3⟧ = λz s. s (s (s z))
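Concretely, here are these numerals as a small Haskell sketch; the `Church` synonym and the `toInt` helper are names I've introduced just for this illustration.

```haskell
-- Church numerals ⟦n⟧ = λz s. sⁿ z written as ordinary Haskell functions.
type Church a = a -> (a -> a) -> a

c0, c1, c2, c3 :: Church a
c0 z s = z            -- ⟦0⟧
c1 z s = s z          -- ⟦1⟧
c2 z s = s (s z)      -- ⟦2⟧
c3 z s = s (s (s z))  -- ⟦3⟧

-- Decode a numeral by choosing z = 0 and s = (+1).
toInt :: Church Integer -> Integer
toInt n = n 0 (+ 1)   -- toInt c3 == 3
```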
Just as the natural number n is n applications of S to 0, the lambda-calculus representation of n is the application of n copies of an arbitrary successor function s to an arbitrary zero z. We can also define the successor itself:
- ⟦0⟧ = λz s. z
- ⟦S⟧ = λn. λz s. s (n z s)
Here we see that the successor applies one extra copy of s to n, making sure that n uses the same z and s. We can check that this gives us the same values as before when we evaluate:
- ⟦0⟧ = λz s. z
- ⟦1⟧ = ⟦S0⟧
  = (λn. λz s. s (n z s)) (λz′ s′. z′)
  ⇝ λz s. s ((λz′ s′. z′) z s)
  ⇝ λz s. s z
- ⟦2⟧ = ⟦SS0⟧
  = (λn. λz s. s (n z s)) ((λn′. λz′ s′. s′ (n′ z′ s′)) (λz″ s″. z″))
  ⇝ (λn. λz s. s (n z s)) (λz′ s′. s′ ((λz″ s″. z″) z′ s′))
  ⇝ (λn. λz s. s (n z s)) (λz′ s′. s′ z′)
  ⇝ λz s. s ((λz′ s′. s′ z′) z s)
  ⇝ λz s. s (s z)
- ⟦3⟧ = ⟦SSS0⟧
  = (λn. λz s. s (n z s)) ((λn′. λz′ s′. s′ (n′ z′ s′)) ((λn″. λz″ s″. s″ (n″ z″ s″)) (λz‴ s‴. z‴)))
  ⇝ (λn. λz s. s (n z s)) ((λn′. λz′ s′. s′ (n′ z′ s′)) (λz″ s″. s″ ((λz‴ s‴. z‴) z″ s″)))
  ⇝ (λn. λz s. s (n z s)) ((λn′. λz′ s′. s′ (n′ z′ s′)) (λz″ s″. s″ z″))
  ⇝ (λn. λz s. s (n z s)) (λz′ s′. s′ ((λz″ s″. s″ z″) z′ s′))
  ⇝ (λn. λz s. s (n z s)) (λz′ s′. s′ (s′ z′))
  ⇝ λz s. s ((λz′ s′. s′ (s′ z′)) z s)
  ⇝ λz s. s (s (s z))
(Yes, this gets dense and hard to read quickly. Working through it yourself is a pretty good exercise if you feel you need more practice; it's how I caught a mistake in what I originally wrote!)
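In the Haskell rendering above, the successor is a one-liner; `csucc` is my own name for it, and the commented checks just mirror the reductions we worked through.

```haskell
-- ⟦S⟧ = λn. λz s. s (n z s): apply one more copy of s, reusing the same z and s.
csucc :: Church a -> Church a
csucc n = \z s -> s (n z s)

-- Reusing c0 and toInt from the earlier sketch:
--   toInt (csucc c0)                  == 1
--   toInt (csucc (csucc (csucc c0)))  == 3
```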
Now we have defined 0 and S, which is a good start, but we also need the principle of induction; after all, that's what makes the natural numbers what they are! So how will this work? Well, it turns out we're basically already set. When we think about the principle of induction computationally, we want a function that takes the base case and the inductive case as inputs and produces a function from natural numbers to some kind of output; call that output "a proof for n". Then our inputs should be:
- The base case, which is our proof for 0.
- The inductive case, which is a function that takes a proof for n and produces a proof for Sn.
In other words, we need some sort of zero value and some sort of successor function. But these are exactly our arguments z and s! So it turns out that we've represented natural numbers as their own induction principle, which I think is pretty cool.
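To make "a numeral is its own induction principle" concrete, here is a small sketch (still reusing the `Church` synonym from above): to define a function on naturals, you just supply the base case and the inductive step directly as z and s.

```haskell
-- Render a numeral in unary by induction: base case "Z", inductive step prepends "S ".
showUnary :: Church String -> String
showUnary n = n "Z" ("S " ++)   -- showUnary c2 == "S S Z"

-- Doubling by induction: base case 0, inductive step "apply s twice".
double :: Church a -> Church a
double n = \z s -> n z (s . s)  -- toInt (double c3) == 6
```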
And that means we can define the basic operations. I'll just define addition here, and leave everything else as an exercise. In our inductive formulation, we can define addition as follows:
- m + 0 = m
- m + Sn = S (m + n)
This is defined by induction on the second argument. So how do we translate it? It becomes:
- ⟦+⟧ = λm n. λz s. n (m z s) s
Where does this come from? Well, we apply our induction principle to n. In the base case, we return m (using the ambient z and s), just as above. In the inductive case, we apply a successor (the ambient s) to what we have so far. So it has to be right!
Another way to look at it: since n z s applies n copies of s to z, the term n (m z s) s applies n copies of s to m z s, for a total of n + m copies of s applied to z. Again, the right answer!
(If you're still not convinced, I recommend working through a small example like ⟦1 + 2⟧, which should be small enough to be tractable but large enough to be at least somewhat interesting; see the sketch just below.)
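Here is the same definition in the running Haskell sketch, along with the ⟦1 + 2⟧ check; `cadd` is again just a name I've picked.

```haskell
-- ⟦+⟧ = λm n. λz s. n (m z s) s: start from m's result and apply n more copies of s.
cadd :: Church a -> Church a -> Church a
cadd m n = \z s -> n (m z s) s

-- The ⟦1 + 2⟧ check, reusing c1, c2, and toInt from the earlier sketches:
--   toInt (cadd c1 c2) == 3
```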
So now we've seen how to define the natural numbers and addition on them inside the pure untyped lambda calculus. Here are some additional thoughts for further reading if you're interested; they're more condensed and less thoroughly explained.
This representation technique applies more generally; it's not just for natural numbers. It's called Church encoding, and it can be adapted to represent arbitrary algebraic data types. Just as we represented natural numbers by their induction principle, we represent every data type by its structural recursion scheme (its own fold): the representation of a value is a function that takes one argument per constructor and then applies the appropriate "constructor" to all the necessary arguments. So:
- Booleans:
  - ⟦False⟧ = λf t. f
  - ⟦True⟧ = λf t. t
- Tuples:
  - ⟦(x, y)⟧ = λp. p x y
- Sum types (`data Either a b = Left a | Right b`):
  - ⟦Left x⟧ = λl r. l x
  - ⟦Right y⟧ = λl r. r y
- Lists (`data List a = Nil | Cons a (List a)`):
  - ⟦Nil⟧ = λn c. n
  - ⟦Cons x l⟧ = λn c. c x l
Note that in the last case, l will itself be an encoded list.
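As a sketch of how a few of these look as Haskell functions (names like `caseBool`, `firstOf`, and `secondOf` are mine, not standard):

```haskell
-- Church booleans: one argument per constructor, in the order False, True.
false, true :: a -> a -> a
false f t = f   -- ⟦False⟧ = λf t. f
true  f t = t   -- ⟦True⟧  = λf t. t

-- "Pattern matching" is just application: supply a result for each constructor.
caseBool :: (a -> a -> a) -> a -> a -> a
caseBool b ifFalse ifTrue = b ifFalse ifTrue   -- caseBool true "no" "yes" == "yes"

-- Church pairs: the single constructor packages up both components.
pair :: x -> y -> ((x -> y -> a) -> a)
pair x y = \p -> p x y                          -- ⟦(x, y)⟧ = λp. p x y
firstOf  q = q (\x _ -> x)
secondOf q = q (\_ y -> y)

-- Church-style Either: ⟦Left x⟧ = λl r. l x, ⟦Right y⟧ = λl r. r y.
left  :: x -> (x -> a) -> (y -> a) -> a
left  x l r = l x
right :: y -> (x -> a) -> (y -> a) -> a
right y l r = r y
```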
This technique also works in a typed setting, where we can talk about folds (or catamorphisms) for data types. (I mostly mention this because I personally think it's really cool.) `data Nat = Z | S Nat` is then isomorphic to `forall a. a -> (a -> a) -> a`, and lists of `e` are isomorphic to `forall a. a -> (e -> a -> a) -> a`, which is just a rearrangement of the type signature of the usual `foldr :: (e -> a -> a) -> a -> [e] -> a`. The universally quantified `a` is the type of the natural number or of the list itself; it has to be universally quantified, so representing these requires higher-rank types. The isomorphism is witnessed by the fact that `foldr Cons Nil` is the identity function; for an encoded natural number `n`, applying `n Z S` likewise recovers the original number.
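A sketch of the typed version, assuming GHC's `RankNTypes`; `toChurch`/`fromChurch` and `toFold`/`fromFold` are names I've made up for the two directions of each isomorphism.

```haskell
{-# LANGUAGE RankNTypes #-}

data Nat = Z | S Nat

-- Nat is isomorphic to its fold type...
type ChurchNat = forall a. a -> (a -> a) -> a

toChurch :: Nat -> ChurchNat
toChurch Z     = \z s -> z
toChurch (S n) = \z s -> s (toChurch n z s)

fromChurch :: ChurchNat -> Nat
fromChurch n = n Z S            -- "n Z S recovers the original number"

-- ...and lists of e are isomorphic to theirs (a rearranged foldr).
type ChurchList e = forall a. a -> (e -> a -> a) -> a

toFold :: [e] -> ChurchList e
toFold xs = \n c -> foldr c n xs

fromFold :: ChurchList e -> [e]
fromFold l = l [] (:)           -- the "foldr Cons Nil is the identity" direction
```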
If you're bothered that we only have natural numbers so far, we can also define a representation for the integers; for example, a common unary-style representation is
data Int = NonNeg Nat | NegSucc Nat
Here `NonNeg n` represents n, and `NegSucc n` represents -(n+1); the extra +1 in the negative case ensures there is a unique representation of 0. You should convince yourself that you could, if you wanted to, implement the various arithmetic functions on this `Int` in a programming language with real data types; those functions can then be encoded in the untyped lambda calculus via Church encoding, and so we're set. Rationals can similarly be represented as pairs, although I don't know of a representation that guarantees every fraction is represented uniquely. Representing real numbers gets complicated, but IEEE 754 floating-point numbers can be represented as tuples of 32, 64, or 128 booleans, which is terribly inefficient and clunky, but encodable.
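For instance, here is what negation looks like on that representation; `neg` is just an illustrative name, and I call the type `Int'` to avoid clashing with Haskell's built-in `Int`.

```haskell
data Nat  = Z | S Nat
data Int' = NonNeg Nat   -- NonNeg n  represents  n
          | NegSucc Nat  -- NegSucc n represents  -(n+1)

-- Negation: 0 stays 0; n+1 and -(n+1) swap places.
neg :: Int' -> Int'
neg (NonNeg Z)     = NonNeg Z
neg (NonNeg (S n)) = NegSucc n
neg (NegSucc n)    = NonNeg (S n)
```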
More efficient representations of the natural numbers (and the integers, etc.) are also available; e.g.,
data Pos = One | Twice Pos | TwiceSucc Pos
encodes positive numbers in binary (`Twice n` is 2*n, i.e., appending a 0 bit; `TwiceSucc n` is 2*n + 1, i.e., appending a 1 bit; the base case is `One`, a single 1 bit). Encoding the natural numbers on top of this is as simple as
data Nat = Zero | PosNat Pos
but then our functions, such as addition, become more complicated (though faster).
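For example, even the successor already has to handle carrying; here's a sketch, with `succPos` and `succNat` as made-up names.

```haskell
data Pos = One           -- the single bit 1
         | Twice Pos     -- 2*n, i.e. append a 0 bit
         | TwiceSucc Pos -- 2*n + 1, i.e. append a 1 bit

data Nat = Zero | PosNat Pos

succPos :: Pos -> Pos
succPos One           = Twice One          -- 1 + 1 = 2
succPos (Twice n)     = TwiceSucc n        -- 2n + 1
succPos (TwiceSucc n) = Twice (succPos n)  -- (2n + 1) + 1 = 2(n + 1): a carry

succNat :: Nat -> Nat
succNat Zero       = PosNat One
succNat (PosNat p) = PosNat (succPos p)
```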