Wednesday, August 31, 2011

Behind the Scenes: Polymorphism and Function Resolution in C++

This post describes how function resolution is typically (but not always) implemented by compilers.
For non-virtual functions, the compiler matches a function call with the corresponding function definition at compile-time. This is known as static binding.

















Dynamic Polymorphism

When a member function is declared virtual, it can be overridden in a class that directly, or indirectly, derives from the class where it was declared (the base class).
class Base
{
public:
    virtual ~Base() {}
    virtual void hello() = 0;     // hello is pure virtual
};

class A : public Base
{
public:
    void hello() { std::cout << "A says 'hello'\n"; }
};

class B : public Base
{
public:
    void hello() { std::cout << "B says 'hello'\n"; }
};

void sayHello(Base& helloer)
{
    helloer.hello();
}

int main()
{
    Base *a = new A;
    Base *b = new B;
    sayHello(*a);
    sayHello(*b);
    delete a;
    delete b;
}
This program generates the following output:
A says 'hello'
B says 'hello'
Even though helloer is a reference to the base class, which function that will actually be called depends on the type of object we pass to sayHello(). This means, of course, that the function call to hello() must be resolved at run-time. This is referred to as dynamic binding. Internally this is achieved using two constructs known as v-tables (virtual tables, or dispatch tables) and v-pointers.

















  • A v-table is created for every class that: declares one or more virtual functions, or overrides a virtual function. (A)
    1. Such class is called a polymorphic class.
    2. Functions that override virtual functions from a parent class are adjusted in the v-table of the derived class. (B)
    3. The function call is resolved at run-time, from the matching index in the v-table, pointed to by the actual object's v-pointer. (C)
    4. Each object of a polymorphic class has its own v-pointer, but there is only one v-table per class.
    5. The v-tables and v-pointers are added by the compiler, and not visible to the programmer.
    class Fruit
    {
    public:
        Fruit()
            : m_color("fruit-colored"),
              m_peeled(false)
        {}
    
        virtual ~Fruit() {}
        virtual void eat();    
        virtual void peel();
        void setColor(const std::string &color) { m_color = color; }
        const std::string &color() const { return m_color; }
    
    protected:
        std::string m_color;
        bool m_peeled;
    };
    
    void Fruit::eat()
    {
        std::cout << "Now eating this " << m_color << " fruit!\n";
    }
    
    void Fruit::peel()
    {
        if (!m_peeled)
             m_peeled = true;
    }
    
    class Orange : public Fruit
    {
    public:
         Orange()
             : Fruit()
         {
             m_color = "orange";
         }
         void eat();
    };
    
    void Orange::eat()
    {
        if (m_peeled) {
            Fruit::eat();
        } else {
            std::cout << "Can't eat orange before peeling it!\n";
        }
    }
    int main()
    {
        Fruit *fruit1 = new Fruit;
        Fruit *fruit2 = new Orange;
        Fruit *fruit3 = new Orange;
    
        fruit1->eat();       // Now eating this fruit-colored fruit!
        fruit2->eat();       // Can't eat orange before peeling it! 
    
        fruit2->peel();
        fruit2->eat();       // Now eating this orange fruit! 
    
        std::cout << fruit2->color();        // orange
    
        delete fruit1;
        delete fruit2;
        delete fruit3;
    
        return 0;
    }
    Run this example from codepad: http://codepad.org/QUyLRq6l

    Friday, August 26, 2011

    Lambda Calculus Cheat Sheet





















    Click here to download in PDF format

    Tuesday, August 16, 2011

    The Meaning and Usefulness of Closures

    In the previous post, I talked about the Fixed Point Combinator in λ-Calculus, and mentioned that combinator is another word for a closed lambda term, i.e., an expression with no free variables:

    λx.x, λxy.x, λxy.y, etc.

    Take for example the term λxy.y, which is really shorthand for λx.λy.y:

    FV(λx.λy.y)   =   FV(λy.y) − {x}   =   FV(y) − {y} − {x}   =   {y} − {y} − {x}   =   Ø.

    The opposite of a combinator – any expression that uses free variables, is called a closure. Even though the meaning of the term is slightly different in different contexts, the idea is basically the same:

    Examining the term λx.(f x), we have the free variables FV(λx.f x) …

    = FV(f x) − {x} = FV(f) ∪ FV(x) − {x} = {f} ∪ {x} − {x} = {f, x} − {x}
    = {f}.

    In Haskell, the definition for a function, func:
    func f = \x -> f x
    … uses the variable f, bound outside of the abstraction. It is therefore a closure.

    Saturday, August 6, 2011

    The Pattern of Recursion and Fixed Point Combinators

    Recursion is an interesting, and important computation tool that enables us to formulate patterns of self-similarity and express infinity, using a finite set of elements. There is a strong relation between recursion and the principle of induction in mathematics. Imperative programming languages usually allow recursion in a program to take place by letting us define a function that calls itself. Here is a simple recursive function in C++:
    #include <iostream.h>
    
    void count(int n)
    {
        std::cout << "Counting: " << n << "\n";
        if (n < 5) {
            count(n+1);
        }
    }
    
    int main(int argc, char **argv)
    {
        count(1);
        return 0;
    }
    Run this example from codepad:
    http://codepad.org/YmxQcTIX

    In functional programming, recursion has a much more central position. Purely functional languages do not define any looping constructs, but instead rely entirely on recursion to achieve iterative algorithms. Many data types are defined recursively. In Haskell, recursion even allows us to define (conceptually) infinite data structures:
    n = 5 : n
    take 3 n                -- [5, 5, 5]
    Codepad: http://codepad.org/sRbFkSH5

    Here, we take the first three elements from an infinite list of fives. The list n is defined as a single element, 5, followed by a list – itself.
    n is a 5, followed by n, which is a 5 followed by n, which is a 5 followed …