cplusplus.co.il

Thunksgiving

Posted on: 31/10/2010

Quoting Wikipedia:

The word thunk has at least three related meanings in computing science. A “thunk” may be:

  1. A piece of code to perform a delayed computation (similar to a closure)
  2. A feature of some virtual function table implementations (similar to a wrapper function)
  3. A mapping of machine data from one system-specific form to another, usually for compatibility reasons

In all three senses, the word refers to a piece of low-level code, usually machine-generated, that implements some detail of a particular software system.

In this post (whose name looks like an unrelated typo) we shall observe the need for a thunk of the second kind, in C++.

Let us consider the following multiple inheritance hierarchy:

#include <iostream>

struct A {
    virtual A *clone () const {
        return new A(); 
    }

    int a;
};

struct B {
    virtual B *clone () const {
        return new B();
    }

    int b;
};

struct C : A, B {
    virtual C *clone () const {
        C *c = new C();
        std::cout << c << std::endl;  // address of the new object, as a C*
        return c;
    }

    int c;
};

int main () {
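    B *b = new C();                      // b points at the B subobject of a C
    B *b_cloned = b->clone();            // virtual call made from a B context
    std::cout << b_cloned << std::endl;  // address of the clone, as a B*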
}

Please, do not let the length of the code snippet intimidate you. The test case is actually pretty simple: three classes are defined (A, B, and C), and C derives from the other two, which are unrelated to each other. Every class has a single virtual function, clone(), which allows cloning it, and a single member variable, just so it isn’t empty. The purpose of the two printouts will become clear soon.

One nuance worth paying attention to is that C::clone() refines the return type of clone() to pointer-to-C instead of pointer-to-A/B. This is called Covariance (here, a covariant return type). I could talk a lot more about Covariance, Contravariance, and C++’s lack of support for Contravariance in function parameters, but these topics will have to wait for a later post.

Let us recap the memory layout of A, B, and C (on a 32-bit machine):

A: [ Avptr | int(a) ] 8 bytes
B: [ Bvptr | int(b) ] 8 bytes
C: [ Cvptr | int(a) | BCvptr | int(b) | int(c) ] 20 bytes

Needless to say, a cast from C* to A* requires no effort at all, since the A subobject sits at the very start of C. The second virtual table pointer (vptr), labelled BCvptr (intentionally different from Bvptr, because it refers to C’s virtual table rather than B’s), is placed where it is so that the B subobject inside a C looks just like a standalone B. A cast from C* to B* therefore only needs to add a fixed offset, and it is not surprising to discover that the compiler carries out exactly this much needed pointer adjustment in the following case:

C c;
B *b = &c; // the address stored in b is not the address of c: it is 8 bytes further, i.e. (char*)b == (char*)&c + 8.

Now, onwards to the interesting part: what happens when we execute line #31 of the original code snippet, B *b_cloned = b->clone(); (an invocation of C::clone(), which yields a pointer-to-C, from a B context)? And more importantly, what is actually going on under the hood?

The call b->clone() is dispatched through BCvptr. Since the dynamic type of the object is C, we expect C::clone() to be called. But here’s the catch: C::clone() yields a pointer to a newly constructed C object, and that pointer points to the head of the C object rather than to its B subobject! If C::clone() were called as-is, we would end up with a pointer-to-B which points at the wrong memory area, and that would just make us all unhappy. So what dark magic is there, within BCvptr, that allows this code to work flawlessly?

Using the magical -fdump-class-hierarchy flag for g++ (renamed -fdump-lang-class in GCC 8 and later), and the wondrous c++filt utility, we can obtain the actual structure of C’s virtual table:


Vtable for C
C::vtable for C: 6u entries
0 (int (*)(...))0
4 (int (*)(...))(& typeinfo for C)
8 C::clone
12 (int (*)(...))-0x00000000000000008
16 (int (*)(...))(& typeinfo for C)
20 C::covariant return thunk to C::clone() const

I will not go into much detail about the whole structure, as that is enough information for a post of its own, but I do believe that what’s important to us is pretty apparent here. The table consists of two groups of entries. The first group starts at offset 0 and is used through Cvptr; its clone slot, at offset 8, holds the normal C::clone. The second group starts at offset 12 and is used through BCvptr; its clone slot, at offset 20 (which is exactly 12+8), holds a “covariant return thunk to C::clone”. (Strictly speaking, with g++ each vptr stores the address of the first function slot, just past the typeinfo entry, so Cvptr holds the address of offset 8 and BCvptr that of offset 20, with the two header entries sitting at negative offsets from it.) Therefore, when making the aforementioned b->clone() call, the function address is loaded from the clone slot relative to the current vptr; and since the current vptr is BCvptr, a call to the thunk is made.

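Obviously, the compiler essentially implements this thunk as a simple wrapper around C::clone(): it shifts the incoming this pointer from the B subobject back to the start of the C object, calls C::clone(), and then shifts the returned pointer-to-C forward by the same offset to yield a proper pointer-to-B. Some of you may find this technique pretty similar to the notion of a Trampoline.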

To wrap things up, it should come as no surprise to you that the two printouts produce addresses which differ by 8 bytes, which is exactly the required adjustment.

13 Responses to "Thunksgiving"

Very Interesting.

I have never considered how covariance works with multiple inheritance.

Thanks!

Nice writeup!
I think the ‘pointer adjustment to yield a proper pointer-to-B’ cannot be part of the thunk itself, as it cannot know its return value is cast to a B*.
I also vaguely recall reading about a more complicated case that required a similar thunk. Something involving virtual bases – gonna look it up..

Ah, there: http://www.freepatentsonline.com/5297284.html
The case you described is in the paragraph starting with ‘The first case occurs when a function member in a derived class overrides a function member that is defined in more than one base class…’.
The other case is on the following paragraph: ‘The second case occurs when a derived class has a base class that overrides a function member in a virtual base class and the derived class itself does not override the function member…’

This stuff is exactly the reason multiple inheritance is banned in most coding standards, and thrown right out in C# (and i think also in most other modern languages).

Thanks 🙂

Actually, the ‘covariant return thunk’ is very specific to the C::clone which yields a C* that gets called from a B base pointer which should return a B*. Therefore, it has to make the pointer adjustment from C* to B* each time it’s called from a B* pointer, and that’s why that adjustment is part of the thunk itself.

Another interesting point, which I did not want to make in the post, is that just above the typeinfo within the vtable there’s the much needed ‘this’ pointer adjustment, which makes the code within the actual virtual function (clone() in our case) work properly: 0 if we’re pointing at a C* object, and -8 if we’re pointing at its B* portion. But that’s another story.

Why should a clone() call made from a ‘B context’ necessarily return a B*? The following code is just as legal –

B *b = new C();
C *b_cloned = b->clone();

There’s no reason for clone(), compiled in either context, to assume anything about the casting its return value would undergo. Any such casting should be external to the function. (Or am I still missing something in your intention?)

The static b->clone() call (regardless of the dynamic nature of the call itself) must always return a B* pointer, and in your case we can statically make proper adjustment from B* to C* for the assignment (a downcast which requires static/dynamic_cast, by the way, and that’s where the actual adjustment takes place).
This is exactly why the thunk is needed and always does the same (makes sure a B* pointer is returned, as B::clone() requires).

That is exactly the point of clone() being virtual – the call is *not* the static call, and we expect it to behave identically to C::clone(), regardless of the pointer type used for the invocation. In particular, we do not expect its return type to change to B*. Perhaps the use of clone() for the sake of demo is misleading – in other methods it would be obvious that the return type should be independent of the invoking pointer type (what you call a context). The sole job of the thunk should be to adjust the ‘this’ input argument.
I’ll try and check the generated assembly for the thunk tomorrow, but i’d be *very* surprised to discover otherwise.

I’ve slightly edited my comment, please let me know if it still does not make sense.
I also think you might have missed the need for a dynamic/static_cast in your example, as it involves a downcast. Maybe that’s the missing bit.

Let me try again: your code printouts verify that you called C::clone() – as you obviously expected – that’s what being virtual is all about. So, static invocations of B::clone() have nothing to do with it.
Now, C::clone’s signature is –
C* C::clone () const

From a compiler perspective, any call to this function, even via a B*, must conform to this signature. (It is unthinkable that a function’s very *signature* depends on the invoking pointer type.) The compiler must adjust the this argument exactly so this function can do what it was compiled to do, and in particular return a C*. Only at the caller site can the compiler determine what further manipulations (e.g., casting) this return value must undergo. So, the thunk must leave the return type untouched; any other behaviour seems weird to me.

i re-read MS’s patent and it is clear that at least they only adjust the input. Beyond checking the assembly – maybe you can use an ‘auto’ var to hold the b->clone() return value, and then test its typeid?
(last one for today – i’d be happy to continue tomorrow).

I think you are mixing stuff up. Adjusting ‘this’ for use within C::clone itself is irrelevant at this point so we will leave that at the moment (despite it being related).

Now, the specific thunk we are talking about is only called from a B* pointer within a bigger C* object. Now, since

B* b_clone = b->clone();

must always result in the same code, be it when a B* is returned and be it when it’s a C*, somebody has to make a magical adjustment when a C* is actually returned. That happens when the thunk is executed, and the thunk is what makes the call conform to the normal case in which B::clone is invoked (when it’s a real B object).

Another point which you are missing is that in order for your example to work, we must use static/dynamic_cast, and that cast will only know what to do because the static type of the b->clone() expression is B*.

Last one on my end as well, good night 🙂

Ahh, nothing like sleep & coffee to clear one’s mind. I re-read your explanations and you are completely right.
Two personal lessons:
(1) Never claim anything about C++ that you didn’t test in code,
(2) Never argue online past 11:30PM.

Thanks for your patience!

Don’t mention it, glad it worked out 🙂

Thanks for the feedback!

[…] Roman just posted a nice investigation he did, mostly using the g++ switch -fdump-class-hierarchy – which dumps to stdout a graphical representation of generated classes layout. […]
