Recursion is an interesting, and important computation tool that enables us to formulate patterns of self-similarity and express infinity, using a finite set of elements. There is a strong relation between recursion and the principle of induction in mathematics. Imperative programming languages usually allow recursion in a program to take place by letting us define a function that calls itself. Here is a simple recursive function in C++:

#include <iostream.h>

void count(int n)
{
    std::cout << "Counting: " << n << "\n";
    if (n < 5) {
        count(n+1);
    }
}

int main(int argc, char **argv)
{
    count(1);
    return 0;
}

Run this example from codepad:
http://codepad.org/YmxQcTIX

In functional programming, recursion has a much more central position. Purely functional languages do not define any looping constructs, but instead rely entirely on recursion to achieve iterative algorithms. Many data types are defined recursively. In Haskell, recursion even allows us to define (conceptually) infinite data structures:

n = 5 : n
take 3 n                -- [5, 5, 5]

Codepad: http://codepad.org/sRbFkSH5

Here, we take the first three elements from an infinite list of fives. The list n is defined as a single element, 5, followed by a list – itself.

n is a 5, followed by n, which is a 5 followed by n, which is a 5 followed …

Fixed Point

A fixed point is a value x for a function f, that satisfies the equation f(x) = x.

f(x) = x This function has an infinite number of fixed points.	f(x) = x + 1 This function has no fixed points.
f(x) = 1 This function has one fixed point: f(1) = 1.	f(x) = sin(5x) + x / 1.1

Now, to see how the concept of fixed points is relevant to recursion, consider the following definition in Haskell:

x = f x

This equation is clearly that of a fixed point for the function f;

x = f(x) ⇔ f(x) = x.

As a declaration, we can also think of this as x being defined in terms of itself. This definition captures the basic pattern of recursion. One way to visualize this is to reiterate the expression on a new line, thinking of each line as a computational step, replacing x on the right-hand-side with the previous definition, then repeating that process:

x = f x
  = f (f x)
  = f (f (f x))
  = f (f (f (f x)))   ... etc

So far, we haven't actually defined the function f. To make it simple, we could define f as the identity function x → x, that is f(x) = x.

f x = x
x = f x
print (x)

As expected, this leads to infinite recursion.

Higher-order fixed points

Whereas a fixed point for a first-order function is a simple scalar value, a fixed point for any higher-order function is, of course, itself a function.

f fx = \n -> 5
x = f x

print (x 1)    -- 5

Codepad: http://codepad.org/idlciTa0

Here, f is a function that takes a function as argument and returns a function:

fx → (n → 5)

The anonymous function n → 5 is a fixed point for the function f.

(n → 5) → (n → 5)

At this point, we need to think about why we actually get a definite value for x at all. Consider, for example, the following:

f x = 1
x = f x

print (x)         -- 1

Codepad: http://codepad.org/fUYbEf5l

This makes sense, except, based on what we said earlier, x = f x should lead to infinite recursion, which is obviously not the case. We must realize that the approach to evaluating the expression is of great importance here.

Strict vs. non-strict semantics

In the context of programming languages, and more specifically the study of denotational semantics, the term evaluation model refers to a strategy used to determine how expressions and function arguments are evaluated. The difference between a non-strict and a strict evaluation model is, essentially, the direction in which evaluation of an expression propagates. Most languages rely on strict semantics, and evaluate expressions from the inside out. Haskell, on the other hand, does the opposite, and uses something called lazy evaluation. An expression is only evaluated when its result is needed.

This Javascript program is only meant to illustrate the notion of strict evaluation, and should not be seen as an actual program. Don't run it, unless you want your browser/JS runtime to freeze!

function fn(x) {
    return 'strict';
}

function fn2() {
     while (true) {}
}

alert(fn(fn2()));

In a strict environment, expressions such as fn(fn2()) will not terminate, since evaluation is "eager" and starts with the inner function call fn2(), which doesn't terminate.

Lazy evaluation

Non-strict evaluation starts from the "outside":

module Main where

fn x = "lazy"
n = n                   -- non-terminating definition

main = print (fn n)

Codepad: http://codepad.org/JbqRzimu

Since evaluation of fn is not depending on its parameter, x, the compiler is lazy and ignores the argument. This corresponds to a call-by-name treatment of arguments.

In a sense, this is similar to the short-circuit evaluation operators || and && in C/C++:

bool a = true || nonTerminatingFunction();

Infinite data structures

Haskell's laziness allows us to define an infinite list of numbers, and then take a specific element from that list:

n = [1..]
n!!8                 -- 9

Codepad: http://codepad.org/PoZZlJPx

The double exclamation mark (!!) is an infix function that returns a specific index from a list (similar to the subscript operator [ ] in C/C++).

doubled (x:xs) = x+x : doubled xs
numbers = [1..]
doubledNumbers = doubled numbers

print (take 5 doubledNumbers)      -- prints [2, 4, 6, 8, 10]

Codepad: http://codepad.org/CCj54Ieq

Here, doubled takes a list [e₁, e₂, e₃ … e_n] and returns a list [e₁×2, e₂×2, e₃×2 … e_n×2]. Then, we define an infinite list of numbers, apply doubled to this list, and take the first five elements from the resulting list. It looks almost as if we are performing calculations on infinite sets of data. Of course, the key here is the fact that evaluation only takes place when the result is needed.

Defining recursive functions

To use one of the most well-known examples of recursive function definitions, here is how we can intuitively define the factorial function in Haskell:

 factorial n = if (n == 0) then 1 else n * factorial (n - 1)

If we now recall the general pattern for recursion we identified earlier, we can express this in a slightly different way:

fn f = \x -> if (x == 0) then 1 else x * f (x - 1)
factorial = fn factorial

factorial 4                 -- 24

Note how the factorial = fn factorial definition matches the x = f x pattern.

The fixed point combinator

Based on the same idea, we can now define a more general function that takes any function (t → t), and computes a fixed point for that function. This is called a fixed point combinator.

fn f = \x -> if (x == 0) then 1 else x * f (x - 1)

fpc f = f (fpc f)           -- fixed point combinator
factorial = fpc fn

factorial 4                 -- 24

Codepad: http://codepad.org/uGjfg5Jb

So, a fixed point combinator is a function that computes a fixed point for other functions,

(t → t) → t.

In this case, the type of the fixed point is (Int → Int), so the signature would be:

((Int → Int) → (Int → Int)) → (Int → Int).

We can now try this fixed point combinator with some other recursive functions:

Fibonacci numbers:

fn2 f = \x -> if (x == 0 || x == 1) then x else f (x-1) + f (x-2)
fibonacci = fpc fn2

print (fibonacci 4)         -- 3

Codepad: http://codepad.org/ui3cUXo7

Greatest common divisor:

fn3 f = \x -> \y -> if (y == 0) then x else f y (rem x y)
divisor = fpc fn3

print (divisor 72 81)       -- 9

Codepad: http://codepad.org/3YymharR

The fixed point combinator in Lambda Calculus

Definitions

V is the set of variables
Λ is the set of lambda terms

Combinators

If M ∈ Λ, we denote the free variables of M as FV(M).

FV(v) = {v}
FV(M N) = FV(M) U FV(N)
FV(λx.M) = FV(M) − {x}

M is a closed λ-term if FV(M) = ∅. The set of closed λ-terms is denoted by Λ^o.

FV(M) = ∅ ⇔ M ∈ Λ^o

Another name for a closed λ-term is combinator. Here are some examples:

λxy.x
λx.x

This: λab.abc is not a combinator, since c ∈ FV(λab.abc).

The standard combinators are defined as:

I	≡ λx.x
K	≡ λxy.x
K_∗	≡ λxy.y
S	≡ λxyz.xz(yz)

Fixed Point Theorem

The Fixed Point Theorem in λ-Calculus states that:

∀F ∃X FX = X

For all F ∈ Λ there exists X ∈ Λ such that FX = X. So, every λ-term has at least one fixed point.

The Y-Combinator

Recursion in the λ-Calculus can be expressed using the Y-Combinator, a fixed point combinator discovered by Haskell Curry. It is defined as:

Y ≡ λf.(λx.f(xx))(λx.f(xx))

We have:

∀F F(YF) = YF ⇔ YF is a fixed point for all F ∈ Λ.

The factorial function in λ-Calculus

We can now define the λ-term that represents the factorial function,

fact x = if (x == 0) then 1 else x × fact (x - 1);

using Y:

Y(λf.λx.if (iszero x) 1 (mult x (f(sub x 1)))).

With the additional definitions for if, iszero, mult, sub, and the Church encoded numerals:

c₀ ≡ λf.λx.x
c₁ ≡ λf.λx.f x
c₂ ≡ λf.λx.f(f x)
c_n ≡ λf.λx.fⁿ(x),

we can now apply the function to the Church numeral for 3 ≡ λf.(λx.(f (f (f x)))):

((λf.((λx.(f (x x)))(λx.(f (x x)))))(λf.(λn.((((λx.(λy.(λz.((x y) z))))((λn.((n (λx.(λx.(λy.y))))(λx.(λy.x)))) n)) (λf.(λx.(f x))))(((λm.(λn.(λf.(λx.((m (n f)) x))))) n)(f ((λn.(λf.(λx.((λx.(x (λx.(λy.y))))((n ((λf.(λp.(((λx.(λy.(λp.((p x) y))))(λx.(λy.y)))((((λx.(λy.(λz.((x y) z))))((λx.(x (λx.(λy.x)))) p))((λx.(x (λx.(λy.y)))) p))(f ((λx.(x (λx.(λy.y)))) p)))))) f))(((λx.(λy.(λp.((p x) y))))(λx.(λy.x))) x)))))) n))))))) (λf.(λx.(f (f (f x)))))

—»_β λf.λx.f (f (f (f (f (f (x)))))) ≡ c₆ (3! = 3 × 2 × 1 = 6)

in 3677 β-reductions.

If you don't feel like doing this using pen and paper, you can use the same tool I used. Just remember to replace the λ sign with a backslash (\) in the above expression.

The Syntactic Sugar

Saturday, August 6, 2011

The Pattern of Recursion and Fixed Point Combinators