A Brief Introduction to the λ-Calculus (Part 2)

In the second part of this series on λ-calculus, we’ll explore how to perform recursion in λ-calculus and how to represent numbers in λ-calculus using Peano arithmetic. At the end of this post, I’ll show how to code an algorithm to compute the nth number in the Fibonacci sequence using λ-calculus.

A word of warning, this post contains some challenging concepts. Take it slow and make sure you understand each step before moving on to the next one.

Self application

To begin with, we’ll look at an interesting behavior that happens when you call the self_apply function with itself as an argument.

Recall that self_apply is defined as:

self_apply = λi.(i i)

Applying self_apply to itself:

(self_apply self_apply)
=> (λi.(i i) λi.(i i))
=> (λi.(i i) λj.(j j)) [α-conversion] 
=> (λj.(j j) λj.(j j)) [β-reduction]
=> (self_apply self_apply)

Notice that applying α-conversion and β-reduction the statement (self_apply self_apply) results in (self_apply self_apply). No matter how many operations are applied to this expression we get the same expression as a result.

If you had a machine which could choose which operation to apply to a given λ-calculus expression and which would apply them automatically, this expression would cause an automated infinite self repeating operation. We’ll take advantage of this behavior soon.

Recursion

Recall from part 1 that we can’t perform recursion in λ-calculus by using function names inside function definitions. Let’s temporarily lift that restriction to imagine an example of what we’d like to achieve:

WARNING: What follows is not real λ-calculus! It is only presented for illustrative purposes.

f = λi.(f (G i))

Repeatedly expanding the definition of such a function would look like this:

(f x)
=> (λi.(f (G i)) x)
=> (λi.(λi'.(f (G i')) (G i)) x)
=> (λi.(λi'.(λi''.(f (G i'')) (G i')) (G i)) x)

I had to rename variables as part of these pseudo-operations because this isn’t actually valid λ-calculus.

Let’s stop the recursion there by deleting the f and see what would result if we could terminate this iteration. Again, this is not a real λ-calculus operation:

(λi.(λi'.(λi''.(G i'') (G i')) (G i)) x)
=> (λi'.(λi''.(G i'') (G i')) (G x)
=> (λi''.(G i'') (G (G x))
=> (G (G (G x)))

As you can see, this would prefix a series of calls to G applied to x.

This illustrates the kind of behavior we want, repeatedly expanding f, resulting in a series of recursive operations.

We can achieve this kind of recursion without referring to the function name in its function body by taking the next iteration of f as an argument, f’, to the function f. We’re back to real λ-calculus from now on:

f = λf'.λi.λc.(c i (f' f' (G i)))

Notice that we need to pass a copy of the f function to f as its first argument to use for further recursion. We use a selector function, c, to terminate the recursion by discarding the (f’ f’ (G i)) expression if c is true. In the case where c is false we continue the recursion by selecting the (f’ f’ (G i)) expression. We have two instances of f’ in the recursion expression so that further recursive calls to f can re-use the copy of f we already have.

When c is false, f returns a function, (f’ f’ (G i)), which takes a selector function as an argument to terminate the iteration. This way we can specify when to terminate the function by passing it a series of false and true values.

We can call the function recursively to duplicate G a finite number of times in front of x as follows. We need to call f with itself as an argument followed by its intended argument, x, and a string of Boolean selector functions:

(f f x false false true)
=> (λf'.λi.λc.(c i (f' f' (G i))) f x false false true)
=> (λi.λc.(c i (f f (G i))) x false false true)
=> (λc.(c x (f f (G x))) false false true)
=> ((false x (f f (G x))) false true)
=> ((λf'.λi.λc.(c i (f' f' (G i))) f (G x)) false  true)
=> ((λc.(c (G x) (f f (G (G x)))) false true)
=> ((false (G x) (f f (G (G x)))) true)
=> ((λf'.λi.λc.(c i (f' f' (G i))) f (G (G x))) true)
=> ((λi.λc.(c i (f f (G i))) (G (G x))) true)
=> ((λc.(c (G (G x)) (f f (G (G (G x)))))) true)
=> (true (G (G x)) (f f (G (G (G x)))))
=> (G (G x))

Notice how we use the false function to continue the recursion and the true function to terminate the recursion.

Y

This kind of manual recursive function call is fine for performing recursion when we know the number of iterations we need ahead of time, but what about if we don’t know the number of iterations ahead of time? We need an automatic mechanism for performing recursion which can terminate when we encounter a terminating condition.

We saw a function, self_apply, which performs an automatic infinite operation sequence in the first section of this article. Notice in that section how the body (i i) causes the infinite repetition? Let’s modify self_apply by adding in a conditional function as an argument that can conditionally discard the repetition (i i):

half_recur = λf.λi.(f (i i))

If we pass false to this function, the (i i) self application is not evaluated:

half_recur false F G 
=> λf.λi.(f (i i)) false F G
=> λi.(false (i i)) F G
=> (false (F F)) G
=> (λfirst.λsecond.second (F F)) G
=> λsecond.second G
=> G

false terminates the automatic self application!

If we apply this function to true, we get the self apply behavior, (F F), as a result.

Applying half_recur to a function, F, we get another function:

half_recur F
=> λf.λi.(f (i i)) F
=> λi.(F (i i))

By applying this function to itself, we get the famous Y-combinator, also known as the fixed point combinator, discovered by Haskell Curry:

Y = λf.(λi.(f (i i)) λi.(f (i i)))

The Y-combinator has a very useful behavior when it’s applied to a function:

(Y F)
=> (λf.(λi.(f (i i)) λi.(f (i i))) F)
=> (λi.(F (i i)) λi.(F (i i)))
=> (λi.(F (i i)) λi'.(F (i' i'))) [α-conversion]
=> (F (λi'.(F (i' i')) λi'.(F (i' i'))))
=> (F (λi.(F (i i)) λi.(F (i i)))) [α-conversion]
== (F (Y F))

The Y-combinator duplicates the function it’s passed as a prefix to itself!

This behavior allows us to perform general recursion.

Notice that this is not necessarily an infinite recursion. If F is a conditional function, a false condition would discard the Y-combinator and terminate the recursion!

As a personal note, I think that Curry’s discovery of the Y-combinator was an act of sheer genius. It is a remarkable mechanism.

Numbers

In the late 19th century, Guiseppe Peano created a set of axioms for natural numbers which we can use to perform arithmetic in λ-calculus. (I’m going to skip over the undecidability of Peano’s arithmetic for the purpose of this article).

Peano’s axioms use a “successor” function, as well as defining the number 0 as a natural number. We’ll use the following two functions for zero and successor:

zero = identity == λi.i
successor = λn.λs.((s false) n)

This kind of numbering system, described in Greg Michaelson’s book, is an alternative to Church Numerals. Michaelson’s formulation of Peano’s arithmetic makes it easier for us to perform comparisons on numbers.

Any natural number can be created by applying the successor function a number of times to zero. For example:

one = (successor zero)
two = (successor (successor zero))
three = (successor (successor (successor zero)))

Every time we apply successor to a number, it creates a pair of arguments false and n to a function s. We can define the natural numbers like so:

one = successor zero
=> λn.λs.((s false) n) zero
=> λs.((s false) zero)

two = successor one == successor successor zero
=> λn.λs.((s false) n) λn.λs.((s false) n) zero 
=> λn.λs.((s false) n) λs.((s false) zero)
=> λn.λs.((s false) n) λs'.((s' false) zero) 
=> λs.((s false) λs'.((s' false) zero))

and so on...

Notice that all numbers are functions which take a selector function as an argument.

We choose to define numbers this way because in the case where we apply a number to true, the result will be false, unless the number is zero, in which case we’ll get true as a result:

zero true
=> λi.i true
=> true

one true
=> λs.((s false) zero) true
=> ((true false) zero)
=> ((λfirst.λsecond.first false) zero)
=> (λsecond.false zero)
=> false

This is because true has the same behavior as select_first and the first argument which will be passed to the selector for any non-zero number will always be false because of how we defined successor (successor always puts false as the first argument to the selector).

This behavior allows us to create an is_zero function:

is_zero = λn.(n true)

is_zero applied to zero will evaluate to true, and if applied to any other natural number, it will evaluate to false.

Since we defined natural numbers as recursive applications of the successor function, we can discard one application of the successor function to get the predecessor of a number. We’ve previously seen a function which will discard the first argument in a pair of functions, the select_second function (a.k.a false).

Here’s what happens when we apply a non-zero natural number (successor X) to false:

((successor X) false)
=> ((λn.λs.((s false) n) X) false)
=> (λs.((s false) X) false)
=> ((false false) X)
=> ((λfirst.λsecond.second false) X)
=> (λsecond.second X)
=> X

By applying the number to false, we’ve stripped the successor function prefix from X!

We can define a predecessor function like this:

predecessor = λn.((is_zero n) n (n false))

We need to handle zero as a special case, so we use is_zero in this function.

Addition

In order to add two numbers to one another, we’ll take the approach of incrementing one number while decrementing the other:

add_iter = λadd_iter'.λx.λy.((is_zero y) x (add_iter' (successor x) (predecessor y)))

We can then define the add function by calling this increment/decrement function recursively with the Y-combinator:

add = Y add_iter

Let’s try adding 1+1.

This next section is pretty long, but it demonstrates that addition can work just by applying the four operations of λ-calculus:

add one one
=> (Y add_iter) one one
=> λf.(λi.(f (i i)) λi.(f (i i))) add_iter one one
=> (λi.(add_iter (i i)) λi’.(add_iter (i’ i’))) one one
=> add_iter (λi’.(add_iter (i’ i’)) λi’.(add_iter (i’ i’))) one one

Replacing (λi’.(add_iter (i’ i’)) λi’.(add_iter (i’ i’))) by (Y add_iter)

=> add_iter (Y add_iter) one one
=> λadd_iter'.λx.λy.((is_zero y) x (add_iter' (successor x) (predecessor y))) (Y add_iter)) one one
=> λx.λy.((is_zero y) x ((Y add_iter) (successor x) (predecessor y))) one one
=> λy.((is_zero y) one ((Y add_iter) (successor one) (predecessor y))) one
=> ((is_zero one) one ((Y add_iter) (successor one) (predecessor one)))
=> (false one ((Y add_iter) (successor one) (predecessor one)))
=> (false one ((Y add_iter) (successor one) (predecessor one)))
=> ((Y add_iter) (successor one) (predecessor one))

Replacing successor one by two and predecessor one by zero

=> (Y add_iter) two zero
=> (λf.(λi.(f (i i)) λi.(f (i i))) add_iter add_iter) two zero
=> ((λi.(add_iter (i i)) λi.(add_iter (i i))) add_iter) two zero
=> ((λi.(add_iter (i i)) λi’.(add_iter (i’ i’)))) two zero
=> add_iter (λi’.(add_iter (i’ i’)) λi’.(add_iter (i’ i’))) two zero

Replacing (λi’.(add_iter (i’ i’)) λi’.(add_iter (i’ i’))) by (Y add_iter)

=> add_iter (Y add_iter) two zero
=> λadd_iter'.λx.λy.((is_zero y) x (add_iter' (successor x) (predecessor y))) (Y add_iter) two zero
=> λx.λy.((is_zero y) x ((Y add_iter) (successor x) (predecessor y))) two zero
=> λy.((is_zero y) two ((Y add_iter) (successor two) (predecessor y))) zero
=> ((is_zero zero) two ((Y add_iter) (successor two) (predecessor zero)))
=> (true two ((Y add_iter) (successor two) (predecessor zero)))
=> two

Yes, I know that took a lot of work; but it is a valid arithmetic procedure which is performed using nothing but functions!

Fibonacci

Now that we have numbers, arithmetic, and recursion, it’s trivial to define a function to calculate the nth number in the Fibonacci sequence:

fib_iter = λfib_iter'.
           λn.((is_zero n)
                 zero
                 ((is_zero (predecessor n))
                    one
                    (add
                       (fib_iter' (predecessor n))
                       (fib_iter' (predecessor (predecessor n))))))

fib = Y fib_iter

Conclusion

That’s essentially all we need to use λ-calculus to perform general computations on the natural numbers. As Countess Ada Lovelace famously realized, any machine which can perform general numerical computations is sufficiently powerful to perform computations on any kind of data.

There’s a lot more to λ-calculus than what I’ve discussed in these posts, but this should be enough to give you the tools you need to continue exploring λ-calculus.