Virtual Function

Introduction

The virtual table is the mechanism that implements polymorphism in C++. This post looks at the virtual table and its structure, and in particular at how virtual functions are handled under Linux and Windows. All tests are done on 64-bit platforms. 32-bit platforms differ in the details, but the basic idea is the same.

Sample Code

This post is based on the following code. On Windows, we compile the code with Dev C++ [1]. On Linux, we compile the code with gcc (6.3.0). Because compilers follow different conventions on different platforms, we will discuss them separately.

#include<iostream>
using namespace std;

class Test
{
    public :   
        int count;
        virtual void Show()
        {
            cout<<"I am in Test Class"<<endl;
        }
        
        Test()
        {
             count = -1;     
        }
};

class Test1 : public Test
{
      public:
           virtual void Show()
           {
            cout<<"I am in Test1 Class"<<endl;
           }
           virtual void T1Show()
           {
            cout<<"I am in Test1 own Class"<<endl;
           }
	  Test1() {t1 = 1;}
	  private:
		  int t1;

};

class Test2 : public Test
{
      public:
           virtual void Show()
           {
            cout<<"I am in Test2 Class"<<endl;
           }
           virtual void T2Show()
           {
            cout<<"I am in Test2 own Class"<<endl;
           }
           Test2() {t2 = 2;}
	   private:
		  int t2;

};

class Test3: public Test1, public Test2
{
	  public:
			virtual void Show()
			{
				cout<<"I am in Test Derived Class"<<endl;
			}
			virtual void T1Show()
			{
				cout<<"I am in T1 Derived Class"<<endl;
			}
			virtual void T2Show()
			{
				cout<<"I am in T2 Derived Class"<<endl;
			}
                        Test3() {t3 = 3;}
	  private:
		  int t3;
};

int foo(Test3 *t)
{
	t->Show();
	return 0;
}

int bar(Test t)
{
	t.Show();
	return 0;
}

int main()
{
    Test *Obj;
    Test BaseObj;   // Base Class Object
    Test1 Obj1;
    Test2 Obj2;  
    Test3 Obj3;
    Test3 *Obj4 = new Test3(); 
    int x;
    Obj = &BaseObj;
    Obj->Show();  // Obj points to BaseObj, so Test::Show() is called.
    
    cin>>x;
    if(x%2==1)
    {
              Obj = &Obj1;
    }
    else if(x%2==0)
    {
         Obj = &Obj2;
    }
    Obj->Show();
    foo(Obj4);
    bar(BaseObj);
    return 0;
}

Constructor Function

Constructors are responsible for initializing objects. For objects that contain virtual functions, the constructor is also responsible for initializing the virtual table pointer of the object. Let’s see how the constructor does this.

MSVC


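Under the Microsoft x64 calling convention, the object instance (the this pointer) is passed in the rcx register. Note that Dev C++ ships a MinGW build of GCC, so the listings below show GCC-style mangled names (_ZN...) rather than MSVC-style ones; what differs from the Linux build is the calling convention. We can observe all the constructors involved in the sample code. Here we will introduce Test::Test(), Test1::Test1() and Test3::Test3().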
Test::Test()

public _ZN4TestC1Ev
_ZN4TestC1Ev proc near
push    rbp
mov     rbp, rsp
mov     [rbp+arg_0], rcx
mov     rax, [rbp+arg_0]
lea     rdx, off_4905F0
mov     [rax], rdx
mov     rax, [rbp+arg_0]
mov     dword ptr [rax+8], 0FFFFFFFFh
pop     rbp
retn
_ZN4TestC1Ev endp


The virtual table entry for Test::Show (_ZN4Test4ShowEv) is located at 0x4905f0. This address also serves as the virtual table pointer (vptr) and is stored at the top of the allocated object, at offset 0. The count member follows at offset 8 and is set to -1.

Test1::Test1()

public _ZN5Test1C2Ev
_ZN5Test1C2Ev proc near
push    rbp
mov     rbp, rsp
sub     rsp, 20h
mov     [rbp+arg_0], rcx
mov     rax, [rbp+arg_0]
mov     rcx, rax        ; this
call    _ZN4TestC2Ev    ; Test::Test(void)
mov     rax, [rbp+arg_0]
lea     rdx, off_490610
mov     [rax], rdx
mov     rax, [rbp+arg_0]
mov     dword ptr [rax+0Ch], 1
add     rsp, 20h
pop     rbp
retn
_ZN5Test1C2Ev endp


The virtual table entry for Test1::Show (_ZN5Test14ShowEv) is located at 0x490610. An interesting discovery here is that the constructor of Test1 calls Test::Test() first and only then overwrites the virtual table pointer with its own. The constructor of Test2 is similar, so we will skip it here.

Test3::Test3()
Now we move on to the constructor of Test3.

public _ZN5Test3C1Ev
_ZN5Test3C1Ev proc near
push    rbp
mov     rbp, rsp
sub     rsp, 20h
mov     [rbp+arg_0], rcx
mov     rax, [rbp+arg_0]
mov     rcx, rax        ; this
call    _ZN5Test1C2Ev   ; Test1::Test1(void)
mov     rax, [rbp+arg_0]
add     rax, 10h
mov     rcx, rax        ; this
call    _ZN5Test2C2Ev   ; Test2::Test2(void)
mov     rax, [rbp+arg_0]
lea     rdx, off_490650
mov     [rax], rdx
mov     rax, [rbp+arg_0]
lea     rdx, off_490678
mov     [rax+10h], rdx
mov     rax, [rbp+arg_0]
mov     dword ptr [rax+20h], 3
add     rsp, 20h
pop     rbp
retn

From the constructor Test3::Test3(), we can clearly tell how the data layout of Test3 is prepared. First, the allocated object (this) is passed to Test1::Test1 to initialize the first half of the object. Second, this+0x10 is passed to Test2::Test2 to initialize the second half. Finally, off_490650 is stored at [this] and off_490678 is stored at [this+0x10], so the object carries two virtual table pointers, one per base class. The member t3 is stored after both base parts, at [this+0x20].

GCC

Things are a little bit different on GCC.

The object instance is passed in the rdi register, but the data layout of the object is the same. For simplicity, we will just list Test3::Test3() here.

public _ZN5Test3C2Ev ; weak
_ZN5Test3C2Ev proc near
push    rbp             
mov     rbp, rsp
sub     rsp, 10h
mov     [rbp+var_8], rdi
mov     rax, [rbp+var_8]
mov     rdi, rax        ; this
call    _ZN5Test1C2Ev   ; Test1::Test1(void)
mov     rax, [rbp+var_8]
add     rax, 10h
mov     rdi, rax        ; this
call    _ZN5Test2C2Ev   ; Test2::Test2(void)
lea     rdx, off_201CA8
mov     rax, [rbp+var_8]
mov     [rax], rdx
lea     rdx, off_201CD0
mov     rax, [rbp+var_8]
mov     [rax+10h], rdx
mov     rax, [rbp+var_8]
mov     dword ptr [rax+20h], 3
nop
leave
retn
_ZN5Test3C2Ev endp


Here we can see that the constructor also calls the constructors of Test1 and Test2 first and then replaces the two virtual table pointers with its own.

Virtual Function Call

Next let’s discuss the virtual function call itself. This part is much easier than the constructor. More details can be found in [2] and [3].

MSVC

GCC

From the two call rax virtual function calls in the code snippets above, we can see that a routine virtual function call can be divided into four steps:
(1) Dereference the this pointer of the object to fetch the vptr.
(2) Dereference vptr+offset to fetch vfptr, the virtual function to be called.
(3) Set up the arguments to the function, including the this pointer, on the stack or in registers depending on the calling convention.
(4) Invoke vfptr via an indirect call (or jump under some compiler optimization).

For a virtual function call under 64-bit Windows, the this pointer is passed in the rcx register. For a virtual function call under 64-bit Linux, the this pointer is passed in the rdi register.

Conclusion

This post gives a very basic introduction to virtual function calls under different calling conventions on different systems. The discussion of the constructors gives the background that VTint builds upon. The discussion of the virtual function call shows how a call is dispatched through the virtual table pointer.

Reference

[cover page] https://www.pixiv.net/member_illust.php?mode=medium&illust_id=41688170
[1] http://www.bloodshed.net/devcpp.html
[2] http://www.openrce.org/articles/full_view/23
[3] http://www.lrdev.com/lr/c/virtual.html
