How are virtual functions and vtable implemented?

Question

How are virtual functions and vtable implemented?

We all know what virtual functions are in C++, but how are they implemented at a deep level?

Is it possible to change vtable or even get direct access at runtime?

Is there a vtable for all classes, or only those that have at least one virtual function?

Do abstract classes simply have a NULL for the function pointer of at least one entry?

Does having a single virtual function slow down the whole class, or only the call to the function that is virtual? And is the speed affected if the virtual function is actually overridden or not, or does this have no effect as long as it is virtual?

+80

c++ polymorphism virtual-functions vtable

Brian R. Bondy Sep 19 '08 at 3:29


11 answers

Is it possible to change vtable or even get direct access at runtime?

Not portably, but if you don't mind dirty tricks, sure!

WARNING: This technique is not recommended for use by children, adults under the age of 969, or small furry creatures from Alpha Centauri. Side effects may include demons that fly out of your nose, the abrupt appearance of Yog-Sothoth as a required approver on all subsequent code reviews, or the retroactive addition of IHuman::PlayPiano() to all existing instances.

In most compilers I've seen, the vtbl pointer is the first 4 bytes of the object (that is, one pointer on a 32-bit target), and the vtbl contents are simply an array of function pointers (generally in the order they were declared, with the base class's entries first). There are of course other possible layouts, but that's what I've generally observed.

    class A {
      public:
      virtual int f1() = 0;
    };
    class B : public A {
      public:
      virtual int f1() { return 1; }
      virtual int f2() { return 2; }
    };
    class C : public A {
      public:
      virtual int f1() { return -1; }
      virtual int f2() { return -2; }
    };

    A *x = new B;
    A *y = new C;
    A *z = new C;

Now to pull some shenanigans...

Changing a class at runtime:

    std::swap(*(void **)x, *(void **)y);
    // Now x is a C, and y is a B! Hope they used the same layout of members!

Replacing a method for all instances (monkeypatching a class)

This one is a little trickier, since the vtbl itself is probably in read-only memory.

    int f3(A*) { return 0; }

    // Or VirtualProtect on win32; this part is very OS-specific
    mprotect(*(void **)x, 8, PROT_READ|PROT_WRITE|PROT_EXEC);
    (*(int (***)(A *))x)[0] = f3;
    // Now C::f1() returns 0 (remember we made x into a C above),
    // so x->f1() and z->f1() both return 0

The latter is rather likely to make virus checkers and similar tools sit up and take notice, because of the mprotect manipulations. In a process using the NX bit it may well fail.

+21

puetzk Sep 19 '08 at 13:39


Does one virtual function slow down an entire class?

Or only the call to the function that is virtual? And is the speed affected if the virtual function is actually overridden or not, or does this have no effect as long as it is virtual?

Having virtual functions slows down the whole class insofar as one more data member (the vtable pointer) has to be initialized, copied, and so on whenever you deal with an object of such a class. For a class with half a dozen members or so, the difference should be negligible. For a class that contains just a single char member, or no members at all, the difference might be noticeable.
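To make the per-object cost concrete, here is a minimal sketch that compares the size of a one-char class with and without a virtual function. The type names `PlainChar` and `VirtualChar` and the helper `polymorphic_overhead` are made up for this illustration, and the size relation is what common implementations produce, not something the C++ standard promises.

```cpp
#include <cstddef>

// Hypothetical types, for illustration only.
struct PlainChar {
    char c;
    int get() const { return c; }          // ordinary member function
};

struct VirtualChar {
    char c;
    virtual ~VirtualChar() = default;
    virtual int get() const { return c; }  // makes the class polymorphic
};

// Extra bytes every object carries once its class is polymorphic.
constexpr std::size_t polymorphic_overhead() {
    return sizeof(VirtualChar) - sizeof(PlainChar);
}

// ASSUMPTION: holds on mainstream compilers, where each polymorphic object
// stores a hidden vtable pointer; the standard does not require it.
static_assert(sizeof(VirtualChar) > sizeof(PlainChar),
              "the hidden vtable pointer makes the object larger");
```

On a typical 64-bit target the polymorphic version grows from 1 byte to 16 (pointer plus padding), which is why the relative cost is largest for tiny classes.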

It is also important to note that not every call to a virtual function is a virtual function call. If you have an object of a known type, the compiler can emit code for a normal function call, and can even inline said function if it feels like it. It is only when you make polymorphic calls, through a pointer or reference that might point at an object of the base class or at an object of some derived class, that you need the vtable indirection and pay for it in terms of performance.

    struct Foo { virtual ~Foo(); virtual int a() { return 1; } };
    struct Bar: public Foo { int a() { return 2; } };
    void f(Foo& arg) {
      Foo x; x.a(); // non-virtual: always calls Foo::a()
      Bar y; y.a(); // non-virtual: always calls Bar::a()
      arg.a();      // virtual: must dispatch via vtable
      Foo z = arg;  // copy constructor Foo::Foo(const Foo&) will convert to Foo
      z.a();        // non-virtual Foo::a, since z is a Foo, even if arg was not
    }

The steps the hardware has to take are essentially the same, no matter whether the function is overridden or not. The address of the vtable is read from the object, the function pointer is retrieved from the appropriate slot, and the function is called through that pointer. In terms of actual performance, branch prediction might have some impact. So, for example, if most of your objects refer to the same implementation of a given virtual function, there is some chance that the branch predictor will correctly predict which function to call even before the pointer has been retrieved. But it doesn't matter which function is the common one: it could be most objects delegating to the non-overridden base case, or most objects belonging to the same subclass and therefore delegating to the same overridden case.

How are they implemented at a deep level?

I like jheriko's idea of demonstrating this with a mock implementation. But I would use C to implement something similar to the code above, so the low level is easier to see.

parent class Foo

    typedef struct Foo_t Foo;   // forward declaration
    struct slotsFoo {           // list all virtual functions of Foo
      const void *parentVtable; // (single) inheritance
      void (*destructor)(Foo*); // virtual destructor Foo::~Foo
      int (*a)(Foo*);           // virtual function Foo::a
    };
    struct Foo_t {                      // class Foo
      const struct slotsFoo* vtable;    // each instance points to vtable
    };
    void destructFoo(Foo* self) { }     // Foo::~Foo
    int aFoo(Foo* self) { return 1; }   // Foo::a()
    const struct slotsFoo vtableFoo = { // only one constant table
      0,                                // no parent class
      destructFoo,
      aFoo
    };
    void constructFoo(Foo* self) {      // Foo::Foo()
      self->vtable = &vtableFoo;        // object points to class vtable
    }
    void copyConstructFoo(Foo* self,
                          Foo* other) { // Foo::Foo(const Foo&)
      self->vtable = &vtableFoo;        // don't copy from other!
    }

derived class Bar

    typedef struct Bar_t {              // class Bar
      Foo base;                         // inherit all members of Foo
    } Bar;
    void destructBar(Bar* self) { }     // Bar::~Bar
    int aBar(Bar* self) { return 2; }   // Bar::a()
    const struct slotsFoo vtableBar = { // one more constant table
      &vtableFoo,                       // can dynamic_cast to Foo
      (void(*)(Foo*)) destructBar,      // must cast type to avoid errors
      (int(*)(Foo*)) aBar
    };
    void constructBar(Bar* self) {      // Bar::Bar()
      self->base.vtable = &vtableBar;   // point to Bar vtable
    }

function f making a virtual function call

    void f(Foo* arg) {                  // same functionality as above
      Foo x; constructFoo(&x); aFoo(&x);
      Bar y; constructBar(&y); aBar(&y);
      arg->vtable->a(arg);              // virtual function call
      Foo z; copyConstructFoo(&z, arg);
      aFoo(&z);
      destructFoo(&z);
      destructBar(&y);
      destructFoo(&x);
    }

So you can see that a vtable is just a static block in memory, mostly containing function pointers. Every object of a polymorphic class points to the vtable corresponding to its dynamic type. This also makes the connection between RTTI and virtual functions clearer: you can check what type an object is simply by looking at which vtable it points to. The above is simplified in many ways (it ignores multiple inheritance, for example), but the general concept is sound.

If arg is of type Foo* and you take arg->vtable, but the object is actually of type Bar, you still get the correct vtable address. That is because the vtable pointer is always the first element at the address of the object, no matter whether it is called vtable or base.vtable in a correctly typed expression.

+11

MvG Apr 09 '15 at 18:51


Each object has a vtable pointer that points to an array of member functions.

+2

who Sep 19 '08 at 3:33


This answer was included in the Community Wiki answer.

Do abstract classes just have NULL for a function pointer of at least one record?

The answer is that it is unspecified: calling a pure virtual function results in undefined behavior if it is not defined (which it usually isn't) (ISO/IEC 14882:2003 10.4-2). Some implementations simply place a NULL pointer in the vtable entry; other implementations place a pointer to a dummy method that does something similar to an assertion.

Note that an abstract class can define an implementation for a pure virtual function, but that function can only be called with the qualified-id syntax (that is, by fully specifying the class in the method name, similar to calling a base class method from a derived class). This is done to provide an easy-to-use default implementation while still requiring that a derived class provide an override.
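The qualified-id call mentioned above can be sketched like this. `Shape`, `Circle`, `describe_via_base`, and `demo_qualified_call` are hypothetical names invented for the example; the point is only that a pure virtual function may have a body, and that the body is reached by naming the class explicitly.

```cpp
#include <cassert>
#include <string>

// Hypothetical classes, for illustration only.
struct Shape {
    virtual ~Shape() = default;
    virtual std::string describe() const = 0;  // pure: Shape stays abstract
};

// A pure virtual function may still have a definition; it has to be
// written outside the class, separately from the "= 0" declaration.
std::string Shape::describe() const { return "shape"; }

struct Circle : Shape {
    std::string describe() const override {
        // Qualified call: bound statically to Shape::describe,
        // no virtual dispatch involved.
        return "circle, a kind of " + Shape::describe();
    }
};

// Ordinary polymorphic call through a base reference.
std::string describe_via_base(const Shape& s) { return s.describe(); }

void demo_qualified_call() {
    Circle c;
    assert(describe_via_base(c) == "circle, a kind of shape");
    assert(c.Shape::describe() == "shape");  // qualified-id on an object
}
```

Circle still has to override describe() to be instantiable, which is exactly the "default implementation, but an override is required" arrangement described above.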

+2

Michael Burr Sep 19 '08 at 4:01


You can recreate the functionality of virtual functions in C++ by using function pointers as members of a class and static functions as the implementations, or by using pointers to member functions and member functions for the implementations. There are only notational advantages between the two methods... in fact, virtual function calls are just a notational convenience themselves. In fact, inheritance is just a notational convenience too... it can all be implemented without using the language features for inheritance. :)

Below is an untested, probably buggy code example, but hopefully it demonstrates the idea.

e.g.

    class Foo
    {
    protected:
        void (*MyFunc)(Foo*);   // function pointer standing in for a vtable slot
    public:
        Foo() { MyFunc = 0; }
        void ReplicatedVirtualFunctionCall() { MyFunc(this); }
        ...
    };

    class Bar : public Foo
    {
    private:
        static void impl1(Foo* f) { ... }
    public:
        Bar() { MyFunc = impl1; }
        ...
    };

    class Baz : public Foo
    {
    private:
        static void impl2(Foo* f) { ... }
    public:
        Baz() { MyFunc = impl2; }
        ...
    };
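For anyone who wants to compile the idea, here is one way the elided parts could be filled in. This is a sketch under my own assumptions, not the answerer's code: the functions return an int so the result can be checked, the base class installs a default implementation (`base_impl`) instead of a null pointer, and `call_through_base` and `demo_manual_dispatch` are helper names made up for the example.

```cpp
#include <cassert>

// Hand-rolled "virtual" dispatch: every object stores one function pointer
// that its constructor fills in. All names here are hypothetical.
class Foo {
protected:
    int (*my_func)(Foo*);                      // one-slot stand-in for a vtable
    static int base_impl(Foo*) { return 0; }   // default "virtual" behaviour
public:
    Foo() : my_func(base_impl) {}
    int replicated_virtual_call() { return my_func(this); }
};

class Bar : public Foo {
    static int impl1(Foo*) { return 1; }
public:
    Bar() { my_func = impl1; }                 // "override" by replacing the slot
};

class Baz : public Foo {
    static int impl2(Foo*) { return 2; }
public:
    Baz() { my_func = impl2; }
};

// The caller only sees a Foo&, yet gets the behaviour that the most
// derived constructor installed.
int call_through_base(Foo& f) { return f.replicated_virtual_call(); }

void demo_manual_dispatch() {
    Foo foo;
    Bar bar;
    Baz baz;
    assert(call_through_base(foo) == 0);
    assert(call_through_base(bar) == 1);
    assert(call_through_base(baz) == 2);
}
```

Unlike a real vtable, this stores the function pointer in every object rather than once per class, so it costs one pointer per "virtual" function per instance.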

+2

jheriko Sep 19 '08 at 14:41


Usually with a vtable, an array of pointers to functions.

+1

Lou Franco Sep 19 '08 at 3:31


Something not mentioned in any of these answers is the case of multiple inheritance where the base classes all have virtual methods. The inheriting class then has several pointers to a vmt, so each instance of such a class is bigger. Everybody knows that a class with virtual methods carries 4 extra bytes for the vmt pointer, but with multiple inheritance it is 4 bytes for each base class that has virtual methods, 4 being the size of a pointer.
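A rough way to see this on your own compiler is sketched below. The class names are invented for the example, and both checks rely on the layout that common ABIs use (one vtable pointer per polymorphic base subobject); the language itself does not guarantee either of them.

```cpp
#include <cassert>

// Hypothetical hierarchy, for illustration only.
struct Greeter { virtual ~Greeter() = default; virtual int greet() { return 1; } };
struct Thanker { virtual ~Thanker() = default; virtual int thank() { return 2; } };
struct Polite : Greeter, Thanker {};

// ASSUMPTION: typical ABIs keep one vtable pointer per polymorphic base
// subobject, so Polite has room for at least two pointers.
static_assert(sizeof(Greeter) >= sizeof(void*), "one vtable pointer");
static_assert(sizeof(Polite) >= 2 * sizeof(void*), "two vtable pointers");

// Converting to the second base has to adjust the address so that it
// points at the Thanker subobject (and its own vtable pointer).
bool second_base_is_offset() {
    Polite p;
    const void* whole = &p;
    const void* second = static_cast<Thanker*>(&p);
    return whole != second;
}

void demo_multiple_vptrs() {
    assert(second_base_is_offset());
    Polite p;
    Thanker& t = p;
    assert(t.thank() == 2);   // dispatch still works through the second base
}
```

On a 64-bit build the pointer size is 8 rather than 4, so the same reasoning gives 8 bytes per polymorphic base.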

+1

Philip Stuyck Apr 12 '15 at 19:42


I will try to make it simple :)

We all know what virtual functions are in C++, but how are they implemented at a deep level?

This is an array with pointers to functions that are implementations of a specific virtual function. The index in this array represents the specific index of the virtual function defined for the class. This includes pure virtual functions.

When a polymorphic class is derived from another polymorphic class, we may have the following situations:

The derived class neither adds new virtual functions nor overrides any. In this case the class shares the vtable with the base class.
The derived class adds or overrides virtual methods. In this case it gets its own vtable, where the added virtual functions have indexes continuing after the last inherited one.
Multiple polymorphic classes in the inheritance. In this case there is an index shift between the second and subsequent bases and their indexes in the derived class.

Is it possible to change vtable or even get direct access at runtime?

Not in a standard way: there is no API to access them. Compilers may have some extensions or private APIs to access them, but that can only ever be an extension.

Is there a vtable for all classes, or only those that have at least one virtual function?

Only those that have at least one virtual function (even if it is just the destructor) or that derive from at least one class that has a vtable ("is polymorphic").
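The standard library exposes exactly this "is polymorphic" property, which makes the rule easy to check at compile time. A small sketch, with type names made up for the example:

```cpp
#include <type_traits>

// Hypothetical types, for illustration only.
struct NoVirtuals { int x; void f() {} };
struct HasVirtualDtor { virtual ~HasVirtualDtor() = default; };
struct DerivedFromPolymorphic : HasVirtualDtor { int y; };

// No virtual functions, declared or inherited: not polymorphic.
static_assert(!std::is_polymorphic<NoVirtuals>::value,
              "plain class is not polymorphic");

// A virtual destructor alone is enough.
static_assert(std::is_polymorphic<HasVirtualDtor>::value,
              "virtual destructor makes the class polymorphic");

// Inheriting a virtual function is enough too.
static_assert(std::is_polymorphic<DerivedFromPolymorphic>::value,
              "derived class inherits polymorphism");
```

Note that `std::is_polymorphic` reports the language-level property; whether a vtable exists is still up to the implementation, although in practice the two go together.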

Do abstract classes simply have a NULL for the function pointer of at least one entry?

That is a possible implementation, but not one commonly practiced. Instead, the entry usually points to a function that prints something like "pure virtual function called" and calls abort(). A call to it can happen if you try to call the abstract method in a constructor or destructor.
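The reason the constructor case matters is that, while a base class constructor runs, virtual calls resolve to the base class version. The sketch below shows that with a non-pure function so that it runs safely; `Base`, `Derived`, `seen_in_ctor`, and `demo_ctor_dispatch` are names invented for the example. If `name()` were pure in `Base` and reached indirectly from the constructor, the behavior would be undefined, and that is the situation in which the "pure virtual function called" handler typically fires.

```cpp
#include <cassert>
#include <string>

// Hypothetical classes, for illustration only.
struct Base {
    std::string seen_in_ctor;
    Base() { seen_in_ctor = name(); }   // runs while only the Base part exists
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

void demo_ctor_dispatch() {
    Derived d;
    // During Base::Base() the call resolved to Base::name() ...
    assert(d.seen_in_ctor == "Base");
    // ... but once construction is finished, dispatch reaches Derived.
    assert(d.name() == "Derived");
}
```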

Does having a single virtual function slow down the whole class, or only the call to the function that is virtual? And is the speed affected if the virtual function is actually overridden or not, or does this have no effect as long as it is virtual?

The slowdown depends only on whether the call is resolved as a direct call or as a virtual call. Nothing else matters. :)

If you call a virtual function through a pointer or a reference to an object, it will always be implemented as a virtual call, because the compiler can never know which object will be assigned to this pointer at runtime, or whether it will be of a class in which this method is overridden. Only in two cases can the compiler resolve a call to a virtual function as a direct call:

If you call the method through a value (a variable or the result of a function that returns by value): in this case the compiler has no doubt about the actual class of the object and can "hard-resolve" the call at compile time.
If the virtual method is declared final in the class to which you have the pointer or reference through which you call it (only in C++11). In this case the compiler knows that the method cannot be overridden any further, so it can only be the method from this class.
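The two cases can be written down as follows. This is only a sketch with invented names (`Animal`, `Dog`, `Bird`, and the `legs_*` helpers); whether a given compiler actually emits a direct call is an optimization decision that the code cannot observe, so the checks only confirm that the results are the same either way.

```cpp
#include <cassert>

// Hypothetical hierarchy, for illustration only.
struct Animal {
    virtual ~Animal() = default;
    virtual int legs() const { return 0; }
};
struct Dog final : Animal {                    // whole class is final
    int legs() const override { return 4; }
};
struct Bird : Animal {
    int legs() const final { return 2; }       // only this method is final
};

// Case 1: call on an object held by value; its exact type is known.
int legs_by_value() { Dog d; return d.legs(); }

// Case 2: call through a reference, but no further override can exist.
int legs_via_final_class(const Dog& d) { return d.legs(); }
int legs_via_final_method(const Bird& b) { return b.legs(); }

// General case: the dynamic type is unknown, so dispatch goes via the vtable.
int legs_via_base(const Animal& a) { return a.legs(); }

void demo_final() {
    Dog d;
    Bird b;
    assert(legs_by_value() == 4);
    assert(legs_via_final_class(d) == 4);
    assert(legs_via_final_method(b) == 2);
    assert(legs_via_base(d) == 4);
    assert(legs_via_base(b) == 2);
}
```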

Note, though, that virtual calls carry only the overhead of dereferencing two pointers. Using RTTI (which is only available for polymorphic classes) is slower than calling virtual methods, should you find a case where the same thing can be implemented both ways. For example, defining virtual bool HasHoof() { return false; } and then overriding it only as bool Horse::HasHoof() { return true; } lets you call if (anim->HasHoof()), which will be faster than trying if (dynamic_cast<Horse*>(anim)). This is because dynamic_cast has to walk the class hierarchy, in some cases even recursively, to see whether a path can be built from the actual pointer type to the desired class type, whereas the virtual call is always the same: two pointer dereferences.
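Spelled out, the two alternatives look like this. `Animal`, `Cat`, and the two helper functions are names added for the example; the sketch only shows that both approaches answer the same question, it does not measure the speed difference claimed above.

```cpp
#include <cassert>

// Hypothetical hierarchy built around the HasHoof() example.
struct Animal {
    virtual ~Animal() = default;
    virtual bool HasHoof() { return false; }
};
struct Horse : Animal {
    bool HasHoof() override { return true; }
};
struct Cat : Animal {};

// Variant 1: one virtual call (load the vtable pointer, load the slot, call).
bool hoofed_by_virtual(Animal* anim) { return anim->HasHoof(); }

// Variant 2: RTTI; dynamic_cast may have to search the class hierarchy.
bool hoofed_by_rtti(Animal* anim) {
    return dynamic_cast<Horse*>(anim) != nullptr;
}

void demo_hoof() {
    Horse horse;
    Cat cat;
    assert(hoofed_by_virtual(&horse) && hoofed_by_rtti(&horse));
    assert(!hoofed_by_virtual(&cat) && !hoofed_by_rtti(&cat));
}
```

The two variants are not interchangeable in general: the virtual function asks "does this animal have hooves", while the cast asks "is this animal a Horse", which stops being equivalent as soon as a second hoofed class appears.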

+1

Ethouris Apr 14 '15 at 12:16


Burly's answers are correct here, except for the question:

Do abstract classes just have NULL for a function pointer of at least one record?

The answer is that a virtual table is not created at all for abstract classes. There is no need, since no objects of these classes can be created!

In other words, if we have:

    class B { public: virtual ~B() = 0; };  // Abstract Base class
    B::~B() {}                               // a pure virtual destructor still needs a definition
    class D : public B { public: ~D() {} }; // Concrete Derived class

    D* pD = new D();
    B* pB = pD;

The vtbl pointer accessed through pB will be the vtbl of class D. This is exactly how polymorphism is implemented, that is, how methods of D are reached through pB. There is no need for a vtbl for class B.

In response to Mike's comment below ...

If class B in my description has a virtual method foo() that is not overridden by D and a virtual method bar() that is overridden, then D's vtbl will have a pointer to B's foo() and to its own bar(). There is still no vtbl created for B.

0

Andrew Stein Sep 19 '08 at 4:55


I made a very cute proof of concept a bit earlier (to see whether the order of inheritance matters); let me know if your C++ implementation actually rejects it (my version of gcc only gives a warning for assigning anonymous structs, but that is a bug). I'm curious.

CCPolite.h

    #ifndef CCPOLITE_H
    #define CCPOLITE_H

    /* the vtable or interface */
    typedef struct {
        void (*Greet)(void *);
        void (*Thank)(void *);
    } ICCPolite;

    /**
     * the actual "object" literal as C++ sees it; public variables be here too
     * all CPolite objects use(are instances of) this struct structure.
     */
    typedef struct {
        ICCPolite *vtbl;
    } CPolite;

    #endif /* CCPOLITE_H */

CCPolite_constructor.h

    /**
     * unconventionally include me after defining OBJECT_NAME to automate
     * static(allocation-less) construction.
     *
     * note: I assume CCPOLITE_H is included; since if I use anonymous structs
     * for each object, they become incompatible and cause compile time errors
     * when trying to do stuff like assign, or pass functions.
     * this is similar to how you can't pass void * to windows functions that
     * take handles; these handles use anonymous structs to make
     * HWND/HANDLE/HINSTANCE/void * /etc not automatically convertible, and
     * require a cast.
     */
    #ifndef OBJECT_NAME
    #error CCPolite constructor requires object name.
    #endif

    CPolite OBJECT_NAME = { &CCPolite_Vtbl };

    /* ensure no global scope pollution */
    #undef OBJECT_NAME

main.c

    #include <stdio.h>
    #include "CCPolite.h"

    // | A Greeter is capable of greeting; nothing else.
    struct IGreeter { virtual void Greet() = 0; };

    // | A Thanker is capable of thanking; nothing else.
    struct IThanker { virtual void Thank() = 0; };

    // | A Polite is something that implements both IGreeter and IThanker
    // | Note that order of implementation DOES MATTER.
    struct IPolite1 : public IGreeter, public IThanker{};
    struct IPolite2 : public IThanker, public IGreeter{};

    // | implementation of IPolite1; implements IGreeter BEFORE IThanker
    struct CPolite1 : public IPolite1 {
        void Greet() { puts("hello!"); }
        void Thank() { puts("thank you!"); }
    };

    // | implementation of IPolite2; implements IThanker BEFORE IGreeter
    struct CPolite2 : public IPolite2 {
        void Greet() { puts("hi!"); }
        void Thank() { puts("ty!"); }
    };

    // | imposter Polite Greet implementation.
    static void CCPolite_Greet(void *) { puts("HI I AM C!!!!"); }

    // | imposter Polite Thank implementation.
    static void CCPolite_Thank(void *) { puts("THANK YOU, I AM C!!"); }

    // | vtable of the imposter Polite.
    ICCPolite CCPolite_Vtbl = { CCPolite_Thank, CCPolite_Greet };

    CPolite CCPoliteObj = { &CCPolite_Vtbl };

    int main(int argc, char **argv) {
        puts("\npart 1");
        CPolite1 o1;
        o1.Greet();
        o1.Thank();

        puts("\npart 2");
        CPolite2 o2;
        o2.Greet();
        o2.Thank();

        puts("\npart 3");
        CPolite1 *not1 = (CPolite1 *)&o2;
        CPolite2 *not2 = (CPolite2 *)&o1;
        not1->Greet();
        not1->Thank();
        not2->Greet();
        not2->Thank();

        puts("\npart 4");
        CPolite1 *fake = (CPolite1 *)&CCPoliteObj;
        fake->Thank();
        fake->Greet();

        puts("\npart 5");
        CPolite2 *fake2 = (CPolite2 *)fake;
        fake2->Thank();
        fake2->Greet();

        puts("\npart 6");
    #define OBJECT_NAME fake3
    #include "CCPolite_constructor.h"
        fake = (CPolite1 *)&fake3;
        fake->Thank();
        fake->Greet();

        puts("\npart 7");
    #define OBJECT_NAME fake4
    #include "CCPolite_constructor.h"
        fake2 = (CPolite2 *)&fake4;
        fake2->Thank();
        fake2->Greet();

        return 0;
    }

output:

    part 1
    hello!
    thank you!

    part 2
    hi!
    ty!

    part 3
    ty!
    hi!
    thank you!
    hello!

    part 4
    HI I AM C!!!!
    THANK YOU, I AM C!!

    part 5
    THANK YOU, I AM C!!
    HI I AM C!!!!

    part 6
    HI I AM C!!!!
    THANK YOU, I AM C!!

    part 7
    THANK YOU, I AM C!!
    HI I AM C!!!!

Note that since I never allocate my fake object, there is no need to do any destruction; destructors are automatically placed at the end of the scope of dynamically allocated objects to reclaim the memory of the object literal itself and the vtable pointer.

0

Dmitry Jan 18 '17 at 3:37


Zach Burlingame · Accepted Answer · 2008-09-19 03:36

How are virtual functions implemented at a deep level?

From "Virtual Functions in C++"

Whenever a program has a virtual function declared, a v-table is constructed for the class. The v-table consists of the addresses of the virtual functions for classes that contain one or more virtual functions. An object of a class containing a virtual function contains a virtual pointer that points to the base address of the virtual table in memory. Whenever there is a virtual function call, the v-table is used to resolve the function address. An object of a class that contains one or more virtual functions contains a virtual pointer, called the vptr, at the very beginning of the object in memory. Hence the size of the object in this case increases by the size of the pointer. This vptr contains the base address of the virtual table in memory. Note that virtual tables are class specific, i.e. there is only one virtual table per class, irrespective of the number of virtual functions it contains. This virtual table in turn contains the base addresses of one or more virtual functions of the class. At the time a virtual function is called on an object, the vptr of that object provides the base address of the virtual table for that class in memory. This table is used to resolve the function call, as it contains the addresses of all the virtual functions of that class. This is how dynamic binding is resolved during a virtual function call.

Is it possible to change vtable or even get direct access at runtime?

In general, I think the answer is no. You could do some memory mangling to find the vtable, but you still wouldn't know what the function signature looks like. Anything you would want to achieve with this ability (that the language supports) should be possible without accessing the vtable directly or modifying it at runtime. Also note that the C++ language specification does not specify that vtables are required, although that is how most compilers implement virtual functions.

Is there a vtable for all objects, or only those that have at least one virtual function?

I believe the answer here is "it depends on the implementation", since the specification doesn't require vtables in the first place. In practice, however, I believe all modern compilers only create a vtable if the class has at least one virtual function. There is a space overhead associated with the vtable and a time overhead associated with calling a virtual function versus a non-virtual function.

Do abstract classes simply have a NULL for the function pointer of at least one entry?

The answer is that the language specification leaves it unspecified, so it depends on the implementation. Calling a pure virtual function results in undefined behavior if it is not defined (which it usually isn't) (ISO/IEC 14882:2003 10.4-2). In practice, a slot is allocated in the vtable for the function but no address is assigned to it. This leaves the vtable incomplete, which requires derived classes to implement the function and complete the vtable. Some implementations simply place a NULL pointer in the vtable entry; other implementations place a pointer to a dummy method that does something similar to an assertion.

Note that an abstract class can define an implementation for a pure virtual function, but that function can only be called with the qualified-id syntax (that is, by fully specifying the class in the method name, similar to calling a base class method from a derived class). This is done to provide an easy-to-use default implementation while still requiring that a derived class provide an override.

Does one virtual function have a slowdown of the whole class or just a call to a virtual function?

This is getting to the edge of my knowledge, so please correct me if I'm wrong!

I believe that only the functions that are virtual in the class experience the time performance hit related to calling a virtual function versus a non-virtual function. The space overhead for the class is there either way. Note that if there is a vtable, there is only one per class, not one per object.
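If you want to see the "one per class" part on your own toolchain, the following non-portable sketch peeks at the hidden pointer. It assumes, loudly, that the vtable pointer is stored as the first pointer-sized field of the object, which is what common ABIs do for simple single-inheritance classes but is not guaranteed by the language; `peek_vptr` and `demo_shared_vtable` are names invented for the example.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical classes, for illustration only.
struct A { virtual ~A() = default; virtual int f1() { return 0; } };
struct B : A { int f1() override { return 1; } };
struct C : A { int f1() override { return -1; } };

// ASSUMPTION (not standard C++): the vtable pointer occupies the first
// sizeof(void*) bytes of the object. Treat this as a debugging trick only.
const void* peek_vptr(const A& obj) {
    const void* vptr = nullptr;
    std::memcpy(&vptr, static_cast<const void*>(&obj), sizeof vptr);
    return vptr;
}

void demo_shared_vtable() {
    B b1, b2;
    C c;
    // Two objects of the same class share a single table ...
    assert(peek_vptr(b1) == peek_vptr(b2));
    // ... while a different class gets its own table.
    assert(peek_vptr(b1) != peek_vptr(c));
}
```

So the per-object cost is one pointer, while the table itself is paid for once per class.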

Does the speed change if the virtual function is actually overridden or not, or does this have no effect as long as it is virtual?

I don't believe that calling an overridden virtual function is any slower than calling the base virtual function. However, there is an additional space overhead for the class, associated with defining another vtable for the derived class versus the base class.

Additional resources:

http://www.codersource.net/published/view/325/virtual_functions_in.aspx (via the Wayback Machine)
http://en.wikipedia.org/wiki/Virtual_table
http://www.codesourcery.com/public/cxx-abi/abi.html#vtable
