Introduction
The virtual table is the key mechanism behind polymorphism in C++. This post covers the virtual table and its structure, and in particular how virtual functions are implemented on Linux and Windows. All tests are done on 64-bit platforms; 32-bit platforms differ in the details, but the basic idea is the same.
Sample Code
This post is based on the following code. On Windows, we compile the code with Dev C++ [1]. On Linux, we compile the code with gcc (6.3.0). Because compilers follow different conventions on different platforms, we discuss them separately.
#include<iostream>
using namespace std;

class Test
{
public:
    int count;
    virtual void Show() { cout<<"I am in Test Class"<<endl; }
    Test() { count = -1; }
};

class Test1 : public Test
{
public:
    virtual void Show() { cout<<"I am in Test1 Class"<<endl; }
    virtual void T1Show() { cout<<"I am in Test1 own Class"<<endl; }
    Test1() {t1 = 1;}
private:
    int t1;
};

class Test2 : public Test
{
public:
    virtual void Show() { cout<<"I am in Test2 Class"<<endl; }
    virtual void T2Show() { cout<<"I am in Test2 own Class"<<endl; }
    Test2() {t2 = 2;}
private:
    int t2;
};

class Test3: public Test1, public Test2
{
public:
    virtual void Show() { cout<<"I am in Test Derived Class"<<endl; }
    virtual void T1Show() { cout<<"I am in T1 Derived Class"<<endl; }
    virtual void T2Show() { cout<<"I am in T2 Derived Class"<<endl; }
    Test3() {t3 = 3;}
private:
    int t3;
};

int foo(Test3 *t)
{
    t->Show();
    return 0;
}

int bar(Test t)
{
    t.Show();
    return 0;
}

int main()
{
    Test *Obj;
    Test BaseObj; // Base Class Object
    Test1 Obj1;
    Test2 Obj2;
    Test3 Obj3;
    Test3 *Obj4 = new Test3();
    int x;
    Obj = &BaseObj;
    Obj->Show(); // Obj points to a Test object here, so Test::Show is called.
    cin>>x;
    if(x%2==1) { Obj = &Obj1; }
    else if(x%2==0) { Obj = &Obj2; }
    Obj->Show(); // Dispatched at run time through the virtual table.
    foo(Obj4);
    bar(BaseObj);
    return 0;
}
Constructor Function
Constructors are responsible for initializing objects. For objects of a class that has virtual functions, the constructor is also responsible for setting the object's virtual table pointer. Let’s see how the constructor does this.
MSVC
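Under the Windows x64 calling convention, the object instance (the this pointer) is passed in the rcx register. Note that the listings below come from the Dev C++ build, so the symbols use GCC-style name mangling (for example _ZN4TestC1Ev). We can observe all the constructors involved in the sample code. Here we look at Test::Test(), Test1::Test1() and Test3::Test3().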
Test::Test()
public _ZN4TestC1Ev
_ZN4TestC1Ev proc near
    push rbp
    mov rbp, rsp
    mov [rbp+arg_0], rcx
    mov rax, [rbp+arg_0]
    lea rdx, off_4905F0
    mov [rax], rdx
    mov rax, [rbp+arg_0]
    mov dword ptr [rax+8], 0FFFFFFFFh
    pop rbp
    retn
_ZN4TestC1Ev endp
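The pointer to the virtual function Test::Show (_ZN4Test4ShowEv) is stored at 0x4905f0, the first slot of Test's virtual table. This address also serves as the virtual table pointer (vptr) and is written to the start of the allocated object, at offset 0. The member count follows at offset 8 and is set to -1.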
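The layout described above can be observed from C++ itself. The following is a minimal sketch, assuming an ABI that places the vptr in the first pointer-sized word of the object (as the listings in this post show); the C++ standard does not guarantee this layout. The helper names read_first_word and count_offset are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <iostream>

// Same shape as the Test class in the sample code.
class Test {
public:
    int count;
    virtual void Show() { std::cout << "I am in Test Class" << std::endl; }
    Test() { count = -1; }
};

// Hypothetical helper: copies out the first pointer-sized word of the object.
// ASSUMPTION: on the ABIs discussed in this post, that word is the vptr.
std::uintptr_t read_first_word(const Test& obj) {
    std::uintptr_t word = 0;
    std::memcpy(&word, &obj, sizeof(word));
    assert(word != 0);  // a constructed polymorphic object has its vptr set
    return word;
}

// Hypothetical helper: byte offset of the count member inside the object.
std::ptrdiff_t count_offset(const Test& obj) {
    return reinterpret_cast<const char*>(&obj.count) -
           reinterpret_cast<const char*>(&obj);
}
```

Two Test objects should report the same first word, because they share one virtual table, and count_offset should report 8 on the 64-bit builds discussed here, matching the mov dword ptr [rax+8] in the listing.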
Test1::Test1()
public _ZN5Test1C2Ev
_ZN5Test1C2Ev proc near
    push rbp
    mov rbp, rsp
    sub rsp, 20h
    mov [rbp+arg_0], rcx
    mov rax, [rbp+arg_0]
    mov rcx, rax ; this
    call _ZN4TestC2Ev ; Test::Test(void)
    mov rax, [rbp+arg_0]
    lea rdx, off_490610
    mov [rax], rdx
    mov rax, [rbp+arg_0]
    mov dword ptr [rax+0Ch], 1
    add rsp, 20h
    pop rbp
    retn
_ZN5Test1C1Ev endp
The pointer to the overriding function Test1::Show (_ZN5Test14ShowEv) is stored at 0x490610, the first slot of Test1's virtual table. An interesting discovery here is that the constructor of Test1 first calls Test::Test(), which installs Test's virtual table pointer, and only then overwrites that pointer with its own. The constructor of Test2 is similar, so we skip it here.
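This ordering (base constructor first, derived virtual table pointer written afterwards) has a visible consequence in standard C++: a virtual call made inside the base constructor reaches the base version, because the derived constructor has not yet installed its own table. A minimal sketch follows; Base, Derived and name_seen_by_base_ctor are hypothetical names that are not part of the sample code.

```cpp
#include <cassert>
#include <string>

// Hypothetical classes that record which override runs during construction.
struct Base {
    std::string seen_in_ctor;
    // While Base() runs, the object's dynamic type is still Base,
    // so this virtual call resolves to Base::Name.
    Base() { seen_in_ctor = Name(); }
    virtual std::string Name() const { return "Base"; }
    virtual ~Base() = default;
};

struct Derived : Base {
    Derived() : Base() {}  // Base() runs first, then Derived takes over
    std::string Name() const override { return "Derived"; }
};

// Builds a Derived and reports which Name() the Base constructor saw.
std::string name_seen_by_base_ctor() {
    Derived d;
    assert(d.Name() == "Derived");  // after construction, the override is used
    return d.seen_in_ctor;
}
```

Calling name_seen_by_base_ctor() returns "Base", even though the finished object answers "Derived" to the same call.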
Test3::Test3()
Now we move on to the constructor of Test3.
public _ZN5Test3C1Ev
_ZN5Test3C1Ev proc near
    push rbp
    mov rbp, rsp
    sub rsp, 20h
    mov [rbp+arg_0], rcx
    mov rax, [rbp+arg_0]
    mov rcx, rax ; this
    call _ZN5Test1C2Ev ; Test1::Test1(void)
    mov rax, [rbp+arg_0]
    add rax, 10h
    mov rcx, rax ; this
    call _ZN5Test2C2Ev ; Test2::Test2(void)
    mov rax, [rbp+arg_0]
    lea rdx, off_490650
    mov [rax], rdx
    mov rax, [rbp+arg_0]
    lea rdx, off_490678
    mov [rax+10h], rdx
    mov rax, [rbp+arg_0]
    mov dword ptr [rax+20h], 3
    add rsp, 20h
    pop rbp
    retn
From the constructor Test3::Test3(), we can clearly tell how the data layout of Test3 is prepared. First, the allocated object (this) is passed to Test1::Test1 to initialize the first half of the object, the Test1 subobject. Second, this+0x10 is passed to Test2::Test2 to initialize the second half, the Test2 subobject. Finally, the two virtual table pointers are replaced: off_490650 is stored at [this] and off_490678 is stored at [this+0x10]. The member t3 is then written at [this+0x20].
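The this+0x10 adjustment is the same one the compiler applies whenever a Test3 pointer is converted to a Test2 pointer. The sketch below measures that adjustment; the class shapes follow the sample code with the function bodies emptied, and test1_offset and test2_offset are hypothetical helper names. The exact numbers depend on the ABI and are not guaranteed by the C++ standard.

```cpp
#include <cassert>
#include <cstddef>

// Same inheritance shape as the sample code, with empty function bodies.
class Test { public: int count; virtual void Show() {} Test() { count = -1; } };

class Test1 : public Test {
public:
    virtual void Show() {}
    virtual void T1Show() {}
    Test1() { t1 = 1; }
private:
    int t1;
};

class Test2 : public Test {
public:
    virtual void Show() {}
    virtual void T2Show() {}
    Test2() { t2 = 2; }
private:
    int t2;
};

class Test3 : public Test1, public Test2 {
public:
    virtual void Show() {}
    virtual void T1Show() {}
    virtual void T2Show() {}
    Test3() { t3 = 3; }
private:
    int t3;
};

// Byte offset of the Test1 subobject inside a Test3 object.
std::ptrdiff_t test1_offset(Test3& obj) {
    Test1* first = &obj;  // implicit upcast to the first base
    return reinterpret_cast<char*>(first) - reinterpret_cast<char*>(&obj);
}

// Byte offset of the Test2 subobject inside a Test3 object.
std::ptrdiff_t test2_offset(Test3& obj) {
    Test2* second = &obj;  // implicit upcast; the compiler adds the offset
    assert(second != nullptr);
    return reinterpret_cast<char*>(second) - reinterpret_cast<char*>(&obj);
}
```

On the 64-bit builds discussed in this post, test1_offset should report 0 and test2_offset should report 16 (0x10), matching the add rax, 10h that precedes the call to Test2::Test2.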
GCC
Things are slightly different with GCC on Linux: the object instance is passed in the rdi register, but the data layout of the object is the same. For simplicity, we only list Test3::Test3() here.
public _ZN5Test3C2Ev ; weak
_ZN5Test3C2Ev proc near
    push rbp
    mov rbp, rsp
    sub rsp, 10h
    mov [rbp+var_8], rdi
    mov rax, [rbp+var_8]
    mov rdi, rax ; this
    call _ZN5Test1C2Ev ; Test1::Test1(void)
    mov rax, [rbp+var_8]
    add rax, 10h
    mov rdi, rax ; this
    call _ZN5Test2C2Ev ; Test2::Test2(void)
    lea rdx, off_201CA8
    mov rax, [rbp+var_8]
    mov [rax], rdx
    lea rdx, off_201CD0
    mov rax, [rbp+var_8]
    mov [rax+10h], rdx
    mov rax, [rbp+var_8]
    mov dword ptr [rax+20h], 3
    nop
    leave
    retn
_ZN5Test3C2Ev endp
Here too, the constructor first calls the constructors of Test1 and Test2 and then replaces both virtual table pointers with its own.
Virtual Function Call
Next, let’s discuss the virtual function call itself. This part is much simpler than the constructors. More details can be found in [2] and [3].
MSVC
GCC
Both virtual function calls in the two code snippets above end in call rax. From them we can see that a routine virtual function call can be divided into four steps:
(1) Dereference the this pointer of the object to fetch the vptr.
(2) Dereference vptr+offset to fetch vfptr, the address of the virtual function to be called.
(3) Set up the arguments to the function, including the this pointer, on the stack or in registers depending on the calling convention.
(4) Invoke vfptr via an indirect call (or an indirect jump under some compiler optimizations).
In a virtual function call on 64-bit Windows, the this pointer is passed in the rcx register. On 64-bit Linux, the this pointer is passed in the rdi register.
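The four steps can be written out by hand in C++ with an explicit table of function pointers. This is only an illustrative model of what the compiler generates, not the compiler's actual data structures; Object, VFunc, kTestVtable, kTest1Vtable and CallShow are names invented for this sketch.

```cpp
#include <cassert>
#include <string>

struct Object;                                // forward declaration
using VFunc = std::string (*)(Object* self);  // "this" is an explicit argument

// Hand-rolled object: the table pointer comes first, like a real vptr.
struct Object {
    const VFunc* vptr;
    int count;
};

// Stand-ins for Test::Show and Test1::Show; they return the text
// instead of printing it so the result is easy to check.
std::string TestShow(Object*)  { return "I am in Test Class"; }
std::string Test1Show(Object*) { return "I am in Test1 Class"; }

// One table per "class"; slot 0 plays the role of Show.
const VFunc kTestVtable[]  = { TestShow };
const VFunc kTest1Vtable[] = { Test1Show };

std::string CallShow(Object* obj) {
    assert(obj != nullptr);
    const VFunc* vptr = obj->vptr;  // (1) dereference this to fetch the vptr
    VFunc vfptr = vptr[0];          // (2) dereference vptr+offset to fetch vfptr
    // (3) pass obj as the this argument, (4) make the indirect call
    return vfptr(obj);
}
```

Calling CallShow on an Object whose vptr is kTestVtable returns "I am in Test Class", while the same call on an Object whose vptr is kTest1Vtable returns "I am in Test1 Class". The call site is identical in both cases; only the table stored in the object differs.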
Conclusion
This post gives a very basic introduction to virtual function calls under different calling conventions on different systems. The discussion of the constructors gives the background that VTint is built upon. The introduction on virtual function call gives some background
Reference
[cover page] https://www.pixiv.net/member_illust.php?mode=medium&illust_id=41688170
[1] http://www.bloodshed.net/devcpp.html
[2] http://www.openrce.org/articles/full_view/23
[3] http://www.lrdev.com/lr/c/virtual.html