Consider the following code:
    struct A {
        virtual ~A() {}
        virtual int go() = 0;
    };

    struct B : public A {
        int go() { return 1; }
    };

    struct C : public B {
        int go() { return 2; }
    };

    int main() {
        B b;
        B &b_ref = b;
        return b_ref.go();
    }
In GCC 4.4.1 (using -O2), the call to B::go() gets inlined (that is, no virtual dispatch occurs). This means the compiler recognizes that b_ref really does refer to an object of type B. A B reference could refer to a C, but the compiler is smart enough to see that it doesn't here, so it fully optimizes the call by inlining the function.
Excellent! This is an incredible optimization.
But why doesn't GCC do the same in the following case?
    struct A {
        virtual ~A() {}
        virtual int go() = 0;
    };

    struct B : public A {
        int go() { return 1; }
    };

    struct C : public B {
        int go() { return 2; }
    };

    int main() {
        B b;
        A &b_ref = b;  // the only change: the reference is now an A&
        return b_ref.go();
    }
Any ideas? What about other compilers? Is this kind of optimization common? (I'm very new to this kind of compiler insight, so I'm curious.)
If the second case worked, I could create some really great templates, for example:
    template <typename T>
    class static_ptr_container {
    public:
        typedef T st_ptr_value_type;

        operator T *() { return &value; }
        operator const T *() const { return &value; }

        T *operator ->() { return &value; }
        const T *operator ->() const { return &value; }

        T *get() { return &value; }
        const T *get() const { return &value; }

    private:
        T value;
    };

    template <typename T>
    class static_ptr {
    public:
        typedef static_ptr_container<T> container_type;
        typedef T st_ptr_value_type;

        static_ptr() : container(NULL) {}
        static_ptr(container_type *c) : container(c) {}

        inline operator st_ptr_value_type *() { return container->get(); }
        inline st_ptr_value_type *operator ->() { return container->get(); }

    private:
        container_type *container;
    };

    // "> >" rather than ">>" so this also compiles as C++03
    template <typename T>
    class static_ptr<static_ptr_container<T> > {
    public:
        typedef static_ptr_container<T> container_type;
        typedef typename container_type::st_ptr_value_type st_ptr_value_type;

        static_ptr() : container(NULL) {}
        static_ptr(container_type *c) : container(c) {}

        inline operator st_ptr_value_type *() { return container->get(); }
        inline st_ptr_value_type *operator ->() { return container->get(); }

    private:
        container_type *container;
    };

    template <typename T>
    class static_ptr<const T> {
    public:
        typedef const static_ptr_container<T> container_type;
        typedef const T st_ptr_value_type;

        static_ptr() : container(NULL) {}
        static_ptr(container_type *c) : container(c) {}

        inline operator st_ptr_value_type *() { return container->get(); }
        inline st_ptr_value_type *operator ->() { return container->get(); }

    private:
        container_type *container;
    };

    template <typename T>
    class static_ptr<const static_ptr_container<T> > {
    public:
        typedef const static_ptr_container<T> container_type;
        typedef typename container_type::st_ptr_value_type st_ptr_value_type;

        static_ptr() : container(NULL) {}
        static_ptr(container_type *c) : container(c) {}

        inline operator st_ptr_value_type *() { return container->get(); }
        inline st_ptr_value_type *operator ->() { return container->get(); }

    private:
        container_type *container;
    };
These templates could be used to avoid virtual dispatch in many cases:
    // without static_ptr<>
    void func(B &ref);

    int main() {
        B b;
        func(b); // since func() can't be inlined, there is no telling I'm not
                 // gonna pass it a reference to a derivation of `B`
        return 0;
    }

    // with static_ptr<>
    void func(static_ptr<B> ref);

    int main() {
        static_ptr_container<B> b;
        func(&b); // static_ptr<B> is constructed from a pointer to the container.
                  // here, func() could inline operator->() from static_ptr<> and
                  // static_ptr_container<> and be dead-sure it's dealing with an
                  // object `B`; in cases func() is really *only* meant for `B`,
                  // static_ptr<> serves both as a compile-time restriction for
                  // that type (great!) AND as a big runtime optimization if
                  // func() uses `B`'s virtual methods a lot -- and even gets to
                  // explore inlining when possible
        return 0;
    }
Would it be practical to implement this? (And please don't dismiss this as micro-optimization, because it may well be a huge optimization.)
- edit
I just noticed that the problem with static_ptr<> has nothing to do with the problem I described above. The pointer type is preserved, but the call still isn't inlined. I guess GCC simply doesn't dig deep enough to find out that static_ptr_container<>::value is neither a reference nor a pointer. Sorry about that. But the original question is still unanswered.
- edit
I developed a version of static_ptr<> that really works. I changed the name a little:
template <typename T> struct static_type_container {
The only weakness is that the user has to access ptr->value to get the actual object. Overloading operator->() does not work in GCC. Any method that returns a reference to the actual object, if inlined, breaks the optimization. What a pity.