cplusplus.co.il

A question of memory layout

Posted on: 20/01/2010

The actual memory layout of an object can get a little tricky when inheritance and virtual tables are involved, and it gets even trickier once pointer arithmetic is employed. Do you consider yourself a low-level expert?

Let us consider the following main program:

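#include <iostream>

// A and B are defined here, using one of the three versions below.

void f (A *a) {
    // a[2] steps forward in units of sizeof(A), not sizeof(B)
    std::cout << a[2].x << std::endl;
}

int main () {
    B b[10];
    f(b);   // the array of B decays to B*, which converts to A*
}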

In this post we will present three different definitions for classes A and B. Each definition will vary slightly from the previous one, and is likely to generate different output.

The first definition is the most straightforward:

// version I

struct A {
    A () : x(1) {}
    unsigned x;

    void dummy () {}
};

struct B : A {
    B () : y(2) {}
    unsigned y;

    void dummy () {}
};

As you could guess, we have defined two classes: A, and B which inherits from A. A contains one member, x, which is initialized to 1. B contains an extra member, y, which is initialized to 2. What will this version print?
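One way to reason about version I is to compare the stride that pointer arithmetic uses on an A* with the stride of the real array of B. The sketch below is only an illustration and not part of the original puzzle: the names A1, B1, offset_via_base_pointer, offset_of_array_element and report_version_one are made up for it, and it assumes the A subobject sits at offset 0 inside B, which is typical for this layout but not guaranteed. Strictly speaking, indexing an array of B through an A* is undefined behavior, so the sketch only computes offsets and never performs that access.

```cpp
#include <cstddef>
#include <iostream>

// Version I from the post, renamed to A1/B1 so this sketch compiles on its own.
struct A1 {
    A1() : x(1) {}
    unsigned x;
    void dummy() {}
};

struct B1 : A1 {
    B1() : y(2) {}
    unsigned y;
    void dummy() {}
};

// Byte offset that a[i] addresses when a has static type Base*.
template <class Base>
constexpr std::size_t offset_via_base_pointer(std::size_t i) {
    return i * sizeof(Base);
}

// Byte offset at which element i of an array of Derived actually begins.
template <class Derived>
constexpr std::size_t offset_of_array_element(std::size_t i) {
    return i * sizeof(Derived);
}

// Hypothetical helper: prints both strides for index 2.
void report_version_one() {
    std::cout << "sizeof(A1) = " << sizeof(A1)
              << ", sizeof(B1) = " << sizeof(B1) << '\n'
              << "a[2] addresses byte " << offset_via_base_pointer<A1>(2)
              << ", while b[2] begins at byte "
              << offset_of_array_element<B1>(2) << '\n';
}
```

Calling report_version_one() from main shows which element of the B array, and which member of it, the expression a[2].x is likely to land on with your compiler.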

At this point we shall add an innocent-looking ‘virtual’ specifier to the member function B::dummy:

// version II

struct A {
    A () : x(1) {}
    unsigned x;

    void dummy () {}
};

struct B : A {
    B () : y(2) {}
    unsigned y;

    virtual void dummy () {}
};

What will now be printed?
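In version II only B has a virtual function, so only B needs a pointer to a virtual table. Where the compiler puts that pointer, and whether the A subobject still starts at offset 0 of B, is an ABI detail. The following sketch measures it instead of assuming it; A2, B2, base_subobject_offset and report_version_two are names invented for this illustration.

```cpp
#include <cstddef>
#include <iostream>
#include <type_traits>

// Version II from the post, renamed to A2/B2 so this sketch compiles on its own.
struct A2 {
    A2() : x(1) {}
    unsigned x;
    void dummy() {}
};

struct B2 : A2 {
    B2() : y(2) {}
    unsigned y;
    virtual void dummy() {}
};

// Distance in bytes from the start of a B2 object to its A2 base subobject.
std::ptrdiff_t base_subobject_offset() {
    B2 b;
    const char* whole = reinterpret_cast<const char*>(&b);
    const char* base =
        reinterpret_cast<const char*>(static_cast<const A2*>(&b));
    return base - whole;
}

// Hypothetical helper: prints what the compiler decided for this layout.
void report_version_two() {
    std::cout << std::boolalpha
              << "A2 polymorphic: " << std::is_polymorphic<A2>::value << '\n'
              << "B2 polymorphic: " << std::is_polymorphic<B2>::value << '\n'
              << "sizeof(A2) = " << sizeof(A2)
              << ", sizeof(B2) = " << sizeof(B2) << '\n'
              << "A2 subobject starts at byte " << base_subobject_offset()
              << " of B2\n";
}
```

If the reported offset is not zero, the conversion from B* to A* in f(b) already adjusts the pointer before any indexing happens, and a[2] is then counted in units of sizeof(A2) from that adjusted address.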

We shall be fair and give A::dummy its own ‘virtual’ specifier as well:

// version III

struct A {
    A () : x(1) {}
    unsigned x;

    virtual void dummy () {}
};

struct B : A {
    B () : y(2) {}
    unsigned y;

    virtual void dummy () {}
};

What do you expect to be printed now?
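With version III both classes are polymorphic, so the size of A itself changes. A rough way to see this is to compare A with a plain struct that holds the same data member. The sketch below does that; A3, B3, PlainA, strides_match and report_version_three are illustrative names, and the sizes it prints depend on the compiler and target.

```cpp
#include <cstddef>
#include <iostream>
#include <type_traits>

// Version III from the post, renamed to A3/B3 so this sketch compiles on its own.
struct A3 {
    A3() : x(1) {}
    unsigned x;
    virtual void dummy() {}
};

struct B3 : A3 {
    B3() : y(2) {}
    unsigned y;
    virtual void dummy() {}
};

// Same data member as A3 but no virtual functions, for comparison only.
struct PlainA {
    unsigned x;
};

// True when stepping an A3* and stepping a B3* advance by the same byte count.
constexpr bool strides_match() {
    return sizeof(A3) == sizeof(B3);
}

// Hypothetical helper: prints the sizes chosen by the compiler.
void report_version_three() {
    std::cout << std::boolalpha
              << "sizeof(PlainA) = " << sizeof(PlainA) << '\n'
              << "sizeof(A3) = " << sizeof(A3)
              << ", sizeof(B3) = " << sizeof(B3) << '\n'
              << "strides match: " << strides_match() << '\n';
}
```

Even when the two strides happen to match on a given platform, indexing the B array through an A* remains undefined behavior as far as the language is concerned.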

… .. .

Hint: Each of the suggested implementations generates a different output.


6 Responses to "A question of memory layout"

Hi RMN!

I am surprised that the second and the third definitions have a different output. To my understanding, once a single virtual function is declared in the path, the entire path becomes virtual, which means that the second example should be the same as the third.

Nadav

Nadav, I don’t know what the standard has to say about this, but I do think such a compiler/linker design would be infeasible. A and B can be declared in different files, and in fact in different libraries altogether. When the compiler is faced with an ‘A*’ argument, it has only A’s type declaration to consult, and *not* its children.

Ofek, you raise a really good point. I looked at the compiled code and was surprised by the memory layout. Once you take into account that class “A” may not be modified, it does make sense. My mistake: if a single virtual function is declared in the path, the path becomes virtual DOWNWARDS from that class and not upwards, as we can see from this example.

rmn, I really enjoyed this one!

Roman, you could devise an even nastier ‘gotcha’ with virtual inheritance… You can fiddle with the size of the virtual *base* table, as well as the virtual *function* table – as both precede the data members in the object memory layout.

Good suggestion! I’ll add this to my TODO list 🙂

Thanks for the input.
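As a starting point for the virtual inheritance variant suggested above, the sketch below measures where a virtual base ends up inside the derived object. The types V and D and the helpers virtual_base_offset and report_virtual_base are hypothetical and not taken from the post; how the compiler locates the virtual base (a virtual base table, an entry in the virtual function table, or something else) is an implementation detail that the sketch does not assume.

```cpp
#include <cstddef>
#include <iostream>
#include <type_traits>

// Hypothetical types for illustration only.
struct V {
    V() : v(0) {}
    unsigned v;
};

struct D : virtual V {
    D() : d(0) {}
    unsigned d;
};

// Distance in bytes from the start of a D object to its virtual base V.
std::ptrdiff_t virtual_base_offset() {
    D obj;
    const char* whole = reinterpret_cast<const char*>(&obj);
    const char* base =
        reinterpret_cast<const char*>(static_cast<const V*>(&obj));
    return base - whole;
}

// Hypothetical helper: prints the sizes and the measured offset.
void report_virtual_base() {
    std::cout << "sizeof(V) = " << sizeof(V)
              << ", sizeof(D) = " << sizeof(D) << '\n'
              << "V subobject starts at byte " << virtual_base_offset()
              << " of D\n";
}
```

Comparing sizeof(D) with the combined size of its two data members gives a feel for how much bookkeeping the compiler adds for the virtual base.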

It is understandable that the different versions of the classes above will result in different output. But isn’t A* the wrong way of accessing the array in all cases?

C++ does not have an inherent array type, and we assume A* to be an array. Secondly, related to the intent of your post which is object layout, the user should be aware of what he is accessing.

Nice post!
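One way to keep the element type intact, in the spirit of the comment above, is to pass the array by reference so that it never decays to a base-class pointer. This is a minimal sketch and not something proposed in the post; x_at is a hypothetical helper name, and it only assumes that the element type has a member x convertible to unsigned.

```cpp
#include <cstddef>

// Reads member x of element I. The array is taken by reference, so the
// element type T and the length N are deduced and indexing uses sizeof(T).
template <std::size_t I, class T, std::size_t N>
constexpr unsigned x_at(const T (&arr)[N]) {
    static_assert(I < N, "index is outside the array");
    return arr[I].x;
}
```

With the declarations from the post, x_at<2>(b) on B b[10] reads b[2].x using the stride of B, and an index of 10 or more is rejected at compile time.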
