Wednesday, July 27, 2011

Lambdas in C++0x, Part I

The new C++0x standard adds support for anonymous functions; so called lambda expressions. The concept of anonymous, or locally defined functions plays a central role in functional programming and has its origin in the Lambda Calculus (λ-Calculus); a theory formulated by Alonzo Church in 1936 (yes, that was before FORTRAN). The basic idea here is to treat functions as first class objects – functions can be stored in variables, passed as arguments, and be used as return values etc.
Lambda expressions are similar to functors (function objects), but with the added advantage of now being an integral part of the language. We can think of them as syntactic sugar for a functor. Just like function objects, lambdas are type-safe and can maintain state.

Using lambda expressions

The lambda term λx.3 represents the anonymous function x → 3, i.e., a function which takes a single input, x and (always) returns 3.

Using the new C++0x lambda syntax, this can be expressed as:
[](int x) { return 3; }
…where the initial square brackets ( [ ] ) are called the lambda introducer and tells the compiler that a lambda expression follows. Since x is irrelevant here, we may simply omit the parameter specification in this case. This leaves us with a very basic function that returns the value 3:
[] { return 3; }
The meaning of "function" in mathematics is, of course, very different from functions in imperative programming, but from here on when we talk about functions, this means C++ functions.
Now, if we want to call this function and store its return value somewhere, we can just casually append (), similar to a normal function call:
int n = [] { return 3; }();            // note the parentheses
This statement initializes the variable n with 3 – the return value of our function. Perhaps the true power of λ is not yet evident, but we are about to take things to the next level by introducing the auto keyword. In the previous example, we stored the return value of the function in a variable n, but what if we want to store the function itself? This is where auto comes in handy:
auto my_function = [](int x) { return (x + 1); };
Here, we have actually defined a local function, on-the-fly; my_function takes an int, x as parameter and returns x + 1. This function can be invoked, just like any other function:
auto my_function = [](int x) -> int { return (x + 1); };
std::cout << my_function(4);       // prints 5
Note also that I added -> int after the parameter specification. This is called a trailing return type specification, and can actually be applied to any function declaration in C++0x (not only lambdas).

The lambda function block is just like any ol' block, and normal block scope rules apply:
auto func1 = [] {
    auto func2 = [] {
    };
};
auto f1 = func1;
auto f2 = func2;       // error!

Higher-orderism

A function which takes another function as parameter and/or returns another function, is called a higher-order function. In this example, func1 is a locally defined function that defines another (inner) function, func2, and then returns that function:
auto func1 = [] {
    auto func2 = [] {
        return 3;
    };
    return func2;
};

auto f = func1();
f();               // 3

// or we could write
func1()();         // 3

The type of lambda

Until now, we haven't mentioned what the actual type of a lambda is. Using the auto keyword we can leave the job of specifying the type to the compiler, but in order to pass a function as parameter to another function, we need to explicitly state its type:
auto local_func = [] {
    std::cout << "hello";
}

yield(local_func);

void yield(std::tr1::function<void(void)> lambda)
{
    lambda();
}
There we have it – std::tr1::function<void(void)> is a function that returns void and takes no parameters (tr1 stands for Technical Report 1). Knowing that these locally defined functions are really function objects, we can pass the function as const reference. The signature would then be:
void yield(const std::tr1::function<void(void)> &lambda)
Another example; here is a local function that takes two int parameters, and returns an int:
auto local = [](int a, int b) -> int {
    return (a + b);
};

yield(local);

void yield(const std::tr1::function<int(int, int)> &f)
{
    std::cout << f(1, 2);
}

Templates

An even more generic approach would be to use templates – a function that takes a function as parameter would then look something like this:
template <typename F>
void yield(const F &f)
{
    f();
}

auto f = [] {
    std::cout << "what is my name?";
};

yield(f);
And here is a templated function that returns another function:
template <typename F>
F function_factory()
{
    auto f = []() {
        // ...
        return 123;
    };
    return f;
}

Algorithms

Lambdas can also be used with the STL algorithms, here is an example of a for_each loop with a lambda function:
#include <algorithm>

std::vector<int> list;
// ...
std::for_each(list.begin(), list.end(), [](int x) {
    std::cout << x;
});

Syntax

The detailed syntax for a lambda expression is shown here (gray items are optional):
  [C](P) M -> R { body }(Q)
  C = capture specification
  P = parameter specification
  M = mutable specification
  R = return type specification
  Q = actual call parameters
So far, we haven't mentioned the mutable, or capture specifications. More on those will follow in the second part of this post, where we will also discuss how a lambda expression can have access to identifiers declared outside of the function itself (closure).

0 comments:

Post a Comment