Wednesday, November 9, 2011

Reinventing the Wheel, Part I: String Reverse Algorithm

A string reverse algorithm takes a string of characters, and produces a string with the characters in reverse order.


In doing this, we could
  • use a more high-level C++ approach,
std::string reverse(const std::string &str);

// ...

std::string reversed = reverse("uncategorized");
printf(reversed);                                 // dezirogetacnu
where the function returns a reversed copy of the original string. (Btw, to reverse a std::string, use the std::reverse algorithm, see bottom of this post.)

Alternatively, we can
  • write a C-styled function, using a "raw" pointer and then modify the string, in-place:
void reverseInPlace(char *str);

// ...

char s[] = "november";
reverseInPlace(s);
printf(s);                                // rebmevon
Since we are going for minimal memory footprint and optimal performance, we will pursue the latter approach.

Saturday, November 5, 2011

Non-Public and Virtual Inheritance in C++

In addition to public inheritance, C++ also supports three more rarely used forms of derivation; private, protected and virtual.

Public inheritance

class A {};
class B : public A {};
Public inheritance models an is-a relationship (e.g., Employee is a Person). Class members can be inherited or redefined in the derived class. Dynamic polymorphism (run-time binding) is achieved through the use of virtual member functions.
class A
{
public:
    virtual void stuff() { std::cout << "A::stuff()" << std::endl; }
};

class B : public A
{
public:
    void stuff() { std::cout << "B::stuff()" << std::endl; }
};

// ...

A* x = new B;
x->stuff();        // B::stuff()
The scope resolution operator (::) can be used to qualify hidden names:
struct A { int x; };
struct B : public A { int x; };   // A::x is now hidden
                                  // in the scope of B
// ...
B b;
b.x = 1;            // refers to B::x
b.A::x = 2;         // uses the qualified name

Composition

Many times, inheritance introduces strong coupling, and a general rule of OO design is that you should prefer composition over inheritance. Composition, or containment, expresses the type of relationship where one object of a specific type contains an object of another type. This is also known as a has-a relationship.
class A
{
public:
    void doStuff() { std::cout << "doing stuff!" << std::endl; }
};

class B
{
private:
    A m_a;              // B has an A
};

Friendship

In C++, it is also possible to declare classes friends of each other. Doing so gives the "befriended" class access to private and protected members of the class in which the friend keyword appears.
class B;     // forward declaration
class A
{
private:
    void doStuff() { std::cout << "doing stuff!" << std::endl; }

    friend class B;
};

class B
{
public:
    void doStuffWithA()
    {
        m_a.doStuff();         // Possible due to friendship
    }

private:
    A m_a;
};
Friendship is a powerful feature that should be used sparingly, and only in cases where classes are closely related. Now, let's take a look at private and protected inheritance, and how they relate to containment.

Private inheritance

class A {};
class B : private A {};
Private inheritance results in all public functions of a class being inherited as private.
class A
{
public:
    void doStuff() { std::cout << "doing stuff!" << std::endl; }
};

class B : private A {};

// ...

B b;
b.doStuff();         // error: ‘void A::doStuff()’ is inaccessible
As it turns out, this is very similar to composition, and the has-a kind of relationship described earlier.

Has-a vs. is-implemented-in-terms-of

Composition can actually signify two different types of relationship, namely
  • has-a, and
  • is-implemented-in-terms-of.

Scott Meyers explains the difference in this interview by Bill Venners. It goes something like this: In software design, and object oriented programming in particular, we are dealing with two separate domains. Objects can be described as either
  • real world objects, or
  • implementation artifacts.

The former is referred to as the application domain and the latter as implementation domain. Classes in the application domain would be, for example, User, Calendar, Document, and Brush. The implementation domain could include Mutex, List, Array, Buffer, SmartPointer etc. Implementation artifacts are abstractions that only make sense in the context of the program itself. A has-a relationship means the type of composition that corresponds to the application domain, whereas is-implemented-in-terms-of relates more to the implementation. Perhaps an Employee has a Role,
class Role
{
    // ...
};

class Employee
{
private:
    Role m_role;
};
but Staff is implemented in terms of a List (of Employees).
class Staff : private std::list<Employee>
{
    // ...
};
The distinction is not 100% obvious to me, but … nevertheless, it should be clear that the reason for not using public inheritance here is to avoid the interface of std::list being exposed through our Staff class.

Private inheritance is similar to containment, but in the case of containment, the containing class operates only on the public interface of the contained object. Using private inheritance, the derived class can also access protected members of the base class.

Other differences between (simple) composition and private inheritance are, that:
  • Using composition, a class can contain more than one object.
  • Using private inheritance, the derived class can override the base class' virtual functions.

Composition gives us control over how, and to what extent, the public interface of a contained object is exposed:
class A
{
public:
    void doStuff() {}
};

class B
{
public:
    void doStuff()
    {
        m_a.doStuff();
    }

private:
    A m_a;
};
Private inheritance makes this even more straightforward. The interface of the derived class is inaccessible from the outside world, however, a using directive can relax the terms of this contract:
class B : private A
{
public:
    using A::doStuff;
};
When should we use private inheritance then? The C++ FAQ Lite says:
Use composition when you can, private inheritance when you have to.

Protected inheritance

class A {};
class B : protected A {};
Protected inheritance is similar to private, with the difference that the base class' public functions now are being inherited as protected instead of private.
class A
{
public:
    void hello() {}
};

class B : private A {};

class C : public B
{
public:
    C() { hello(); }  // error: ‘void A::hello()’ is inaccessible
};
Using private inheritance for class B, class C, derived from B, does not have access to A's public members, hence the error:
error: ‘void A::hello()’ is inaccessible
On the other hand, if we change the declaration of B to,
class B : protected A {};
the program will compile.

Virtual inheritance

Multiple inheritance is a source of much confusion, and usually best avoided. Virtual inheritance exists to address a problem called the Diamond Problem. This is a situation that arises due to multiple inheritance, when the class structure becomes ambiguous as a result of a class being inherited more than once:
class Base
{
public:
     virtual void doStuff() {}
};

class A : public Base {};
class B : public Base {};

class C : public A, public B {};

// ...

C c;
c.doStuff();
This produces the following error:
request for member ‘doStuff’ is ambiguous
The problem comes from the fact that both A and B inherit Base, hence doStuff() could refer to both A::doStuff() and B::doStuff(). Looking at the class hierarchy, the diamond shape appears.


This ambiguity can be resolved. One way would be to simply qualify all calls.
c.A::doStuff();
This, however, is usually not the best solution. Instead, we can make Base a virtual base class to A and B, and therefore behave as if it was a direct base class of C.
class Base
{
public:
     virtual void doStuff() {}
};

class A : virtual public Base {};
class B : virtual public Base {};

class C : public A, public B {};

// ...

C c;
c.doStuff();         // ok!
A derived class can have both virtual and non-virtual base classes.

See this article for a more in-depth discussion: http://www.cprogramming.com/tutorial/virtual_inheritance.html

Thursday, November 3, 2011

The Reference-to-Pointer Syntax in C++

The syntax of pointer declarations is sometimes difficult to decipher, just consider the following three, elementary cases:
  • int *p[2]
  • int (*p)[2]
  • int (*p)()

 

Example 1

int *p[2];
This is actually an array of two int pointers. We can also think of this as a pointer to two consecutive int pointers.

int *p[2];

int *a = new int;
*a = 21;
int *b = new int;
*b = 22;

p[0] = a;
p[1] = b;

std::cout << *p[0] << std::endl;          // 21
std::cout << *p[1] << std::endl;          // 22

int **q = p;

std::cout << **q << std::endl;
std::cout << **(q + 1) << std::endl;

delete a;
delete b;
http://codepad.org/6c07RgKf

 

Example 2

int (*p)[2];
This is a pointer to an array of two ints. Just as in the first example, we can think of this a pointer to a pointer, since an array is basically a const pointer.


int n[2];
n[0] = 4;
n[1] = 7;

int (*p)[2] = &n;

std::cout << (*p)[0] << std::endl;       // 4
std::cout << (*p)[1] << std::endl;       // 7

// or ...

std::cout << **p << std::endl;           // 4
std::cout << *(*p + 1) << std::endl;     // 7
http://codepad.org/UDcH5cIQ

 

Example 3

int (*p)();
This is a pointer to a function returning an int.
int takeThisInt()
{
    return 123;
}

// ...

int (*p)() = &takeThisInt;
int x = p();                      // note the parentheses
std::cout << x << std::endl;      // 123
http://codepad.org/iL701mg5

Monday, October 31, 2011

Quiz Time: X-Y Path Permutations


Q: In an x-y coordinate system (or Cartesian coordinate system), how many paths are there from the point of origin to a point with coordinates ⟨x, y⟩, where
  • x ∧ y ∈ Z + (x and y are positive integers),
  • a path is built up from exactly x + y segments, each with a length of one (1), and
  • a line segment must be either horizontal or vertical?

See below for answer.
 

Tuesday, October 25, 2011

Behind the Scenes: Name Hiding in C++

Name hiding happens when an overloaded member function is declared in the scope of a derived class. Here is an example:
class Base
{
public:
    virtual ~Base() {}

    virtual void someFunction() const;
    virtual void someFunction(int x) const;
};

void Base::someFunction() const {}
void Base::someFunction(int x) const {}

class Derived : public Base
{
public:
    void someFunction() const;
};

void Derived::someFunction() const {}

int main(int argc, char **argv)
{
    Derived d;
    d.someFunction(123);         

    return 0;
}
This, in fact, will not compile and gcc reports the following error:
error: no matching function for call to 'Derived::someFunction(int)'
While if we remove Derived::someFunction(), the code compiles without errors. First of all, note that there are two orthogonal language features involved here: function overloading and inheritance (overloading vs. overriding).


The scope of Derived becomes nested within the scope of its base class, however, introducing a name in the derived class obscures all overloaded versions of the same function in the Base class, hence the compiler error. This is actually standard behavior, for reasons which will be discussed below.

Thursday, October 6, 2011

Lambdas in C++0x, Part II

Let's recall the syntax of a lambda expression, from the first part of this post:
  [C](P) M -> R { body }(Q)
  C = capture specification
  P = parameter specification
  M = mutable specification
  R = return type specification
  Q = actual call parameters
Lambda expressions are locally defined functions, also known as anonymous functions. A minimal example could look something like this:
int main()
{
    auto f = []() { std::cout << "hello"; };
    f();
    return 0;
}
The lambda can also be invoked as it is defined, on-the-fly:
[]() { std::cout << "hello"; }();
Here is a lambda that takes an int as argument:
auto fn = [](int x) { std::cout << "x = " << x << "\n"; };
fn(123);
Side note: I tend to use the terms parameter and argument more or less interchangeably. To make things clear, we should, however, make a distinction between the two:
  • A parameter is a variable declared in the prototype or declaration of a function.
  • The argument is the value that is actually passed to the function.

A return value can be specified, using the trailing return type specification (→ R):
auto fn = [](int x) -> int { return x + 1; };
int n = fn(1);
std::cout << "->" << n << "\n";
The auto keyword saves us some work when the compiler is able to infer the type of an expression automatically. Alternatively, we could write:
std::tr1::function<int(int)> fn = [](int x) -> int { return x + 1; };
In this post, I will continue to examine the capture specification and mutable keyword.

Monday, September 26, 2011

Copy Semantics and the Rule of Three

Commonly referred to as "the three amigos", the
  • copy constructor,
  • copy assignment operator, and
  • destructor
are sometimes generated by the compiler, when not defined directly by the programmer. Given a class definition such as the one below, the compiler will provide its own, synthesized version of the copy constructor and assignment operator for us.
class Basic
{
public:
    Basic()
        : m_data(new char[1024])
    {
    }

    ~Basic()
    {
        if (m_data)
            delete[] m_data;
         m_data = 0;
     }

private:
     char *m_data;
};
Here is how these default, compiler adaptations deal with different member types when we copy, assign to, or delete an object:

Default copy constructor:
  • Primitive types: Value is copied
  • Pointer types: Pointer is copied (shallow copy)
  • Objects: Calls object's copy constructor

Default assignment operator:
  • Primitive types: Value is copied
  • Pointer types: Pointer is copied (shallow copy)
  • Objects: Calls object's assignment operator

Default destructor:
  • Primitive types: Does nothing
  • Pointer types: Does nothing
  • Objects: Calls object's destructor

With C++11 also comes two new amigos (more about these later), they are the:
  • move constructor, and
  • move assignment operator.

A shallow copy simply means that the value of the pointer is copied, i.e., the address of the data, not the data itself. In our class definition above, the compiler-generated copy constructor and assignment operator are implicit. Consequently, the two objects in the following statements will end up with m_data pointers pointing to the same data.
Basic obj1;
Basic obj2(obj1);

The problem with our compiler-generated copy constructor (or, the problem with the Basic class destructor, depending on how you look at it) manifests when the objects are destroyed.
int main(int argc, char **argv)
{
    Basic obj1;
    Basic obj2(obj1);
    return 0;
}
This program will crash, and here is why:

  • The first object falls out of scope, and the data pointed to by m_data is released (in the destructor). So far, so good.
  • The second object falls out of scope. Since this object's m_data pointer is still pointing to the address of the memory we just released, the program will attempt to release the same memory again, which will cause the program to crash.

Rule of three

Dr. Dobbs describes the problem in great detail, in an article by Andrew Koenig and Barbara E. Moo. The observations are summarized in the following two rules, collectively referred to as "the rule of three":
  • If a class has a nonempty destructor, it almost always needs a copy constructor and an assignment operator.
  • If a class has a nontrivial copy constructor or assignment operator, it usually needs both of these members and a destructor as well.

Simply put, the rule states that; if we define one of the three members, we should usually define all three of them. To do so, there are essentially four different approaches. Each will be described in more detail below.
  • No copying
  • Deep copy
  • Reference counting
  • Copy-on-write