Slow performance from unordered_map when accessing elements in a member template in a derived class

Question

Slow performance from unordered_map when accessing elements in a member template in a derived class

I am trying to implement component architecture in a game engine project. Each GameObject has an unordered_map that contains a pointer to the base class Component . At the moment, I have only one derived component class, which is the Transform class. I wanted to implement this component-based architecture, similar to the Unity convention: I want to get the component of the game object by calling a member template function like GetComponent<Transform>() .

Here are the headers:

Componentent.h

 enum type{ TRANSFORM // more will be added later }; class Component // base class { public: Component() : _owner(NULL) {} virtual ~Component(){} static type Type; protected: GameObject* _owner; };

Transform.h

 class Transform : public Component { public: Transform(); ~Transform(); static type Type; void Rotate(float deg); // to be encapsulated later on Vector2D _position; float _rotation; Vector2D _scale; };

GameObject.h

 class GameObject { public: GameObject(); ~GameObject(); void Update(); //and more member functions template<class T> T* GetComponent(); private: // some more private members unordered_map<type, Component*> _componentList; // only 1 component of each type }; template<class T> T* GameObject::GetComponent() { return static_cast<T*>(_componentList[T::Type]); }

My initial implementation used std::vector to store Component* , and the application ran at 60 frames per second (I also have a frame rate controller that limits only FPS to 60). When I switched to unordered_map to access these component pointers, performance dropped to 15 FPS.

I draw only two quadrads, and I call GetComponent<Transform>() only 6 times per frame at this point, so there’s not much on the stage.

What have i tried?

I tried using const char* , std::string , type_info and finally enum type as key values for unordered_map , but nothing helps: all implementations got me 15-16 FPS.

What causes this performance problem? How can I isolate the problem?

_{Hope I provided enough details, feel free to ask for more code if necessary}

+7

c ++ performance inheritance unordered-map templates

Varaquilex Oct 29 '15 at 9:47

source share

1 answer

Team upvote · Answer 1 · 2015-11-26T17:22:31+0000

Disclaimer: Although the information below should be carried out independently, the first basic sanity check, given such a sharp difference in performance compared to such a simple case, is to first make sure that construction optimization is turned on. With this off the road ...

unordered_map is ultimately designed as a rather large-scale container, predetermining a potentially large number of buckets.

See here: std :: unordered_map very high memory usage

And here: How does C ++ STL unordered_map resolve conflicts?

When calculating the hash index is trivial, the amount of memory (and steps between them) that can be accessed for such a small unordered_maps can very easily turn into a bottleneck with no cache with something that is often accessed, since retrieving the component interface from an object .

For systems with entity components, you usually do not have many entity related components - maybe something like dozens of vertices, and often only a few. As a result, std::vector is actually a much more suitable structure and, above all, in terms of locality of links (small arrays that can often be accessed every time, every time you extract a component interface from an object). While the smaller dot, std::vector::operator[] also a function that is trivially nested.

If you want to do even better than std::vector here (but I recommend this only after profiling and defining it for you), provided that you can output some general upper case bound, N , for the number of components usually available in essence, something like this might work even better:

 struct ComponentList { Component* components[N]; Component** ptr; int num; };

Start by installing ptr on components and then access them through ptr . Insertion of a new component increases num . When num >= 4 (a rare case), change ptr to point to a dynamically allocated block with a large size. When destroying a ComponentList free dynamically allocated memory if ptr != components ComponentList . This takes a little memory if you store fewer N elements (although std::vector also usually does this with initial bandwidth and how it grows), but it turns your entity and component list into completely contiguous if num > N As a result, you get better link locality and possibly even better results than where you started (I assume that due to a significant reduction in frame rate, you extract components from objects quite often from loops, which is not uncommon in ECS )

Given how often access to component interfaces can be obtained from an object, and very often in very tight cycles, this can be worth the effort.

However, your initial choice of std::vector was actually the best considering the typical data scale (the number of components available in the entity). With very small data sets, basic linear sequential searches often outperform more complex data structures, and you often want to focus on memory / caching efficiency instead.

I tried to use const char *, std :: string, type_info and finally enum enter unordered_map as key values, but nothing really helps: the whole implementation got me 15-16 FPS.

Just on this note, for the keys you want something that can be compared in constant time, as an integer value. One of the features that may be convenient in this case is an interned string, which simply stores int for a balance between convenience and performance (allowing clients to create them via string , but compare them using int while searching for components).

Slow performance from unordered_map when accessing elements in a member template in a derived class

More articles: