I have a huge tree that can occupy up to several gigabytes. The node structure is shown below. You will notice that the last member is an array of size 1. The reason is that I can over-allocate Node with a flexible size, similar to what C supports with flexible array members. I could use std::unique_ptr<T[]> or std::vector<T> instead, but then every tree node costs a double dynamic allocation, double indirection, and extra cache misses. In my last test this made my program take about 50% more time, which is simply unacceptable for my application.
    template<typename T>
    class Node {
    public:
        Node<T> *parent;
        Node<T> *child;
        T &operator[](int);
    private:
        int size;
        T array[1]; // over-allocated; the real element count is `size`
    };
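For context, a node is created with a single over-allocated block, roughly like this (a sketch, not my real code; make_node is a hypothetical helper and would need to be a friend of Node to set the private members):

    #include <cstdlib>
    #include <new>

    // Hypothetical factory: one malloc covers the header plus n elements.
    // Assumes the class contains a matching friend declaration, e.g.
    //     template<typename U> friend Node<U> *make_node(int);
    template<typename T>
    Node<T> *make_node(int n)
    {
        // array[1] already accounts for one element, so add n - 1 more.
        void *mem = std::malloc(sizeof(Node<T>) + (n - 1) * sizeof(T));
        if (!mem)
            throw std::bad_alloc();
        Node<T> *node = static_cast<Node<T> *>(mem);
        node->parent = nullptr;
        node->child  = nullptr;
        node->size   = n; // requires the friend declaration above
        return node;
    }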
The easiest way to implement operator[] is this:
    template<typename T>
    T &Node<T>::operator[](int n)
    {
        return array[n];
    }
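With the hypothetical make_node from the sketch above, usage then looks like this:

    Node<int> *node = make_node<int>(16); // room for 16 ints
    (*node)[0] = 42;                      // indexes into the trailing array
    std::free(node);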
This should work fine on any reasonable C++ implementation. But since the C++ standard also allows implementations that bounds-check array accesses, and since, as far as I know, indexing past the declared size of array is technically undefined behavior, can I do this instead?
    template<typename T>
    T &Node<T>::operator[](int n)
    {
        return (&array[0])[n];
    }
Now I am a bit confused. For built-in types, the [] operator is just syntactic sugar for pointer arithmetic and dereferencing: a[n] means *(a + n). Thus (&array[0])[n] is equivalent to (&*(array + 0))[n], which I think can be simplified back to array[n], making it do exactly the same as the first version. Fine, but I can still do this:
    #include <cstddef> // offsetof

    template<typename T>
    T &Node<T>::operator[](int n)
    {
        // Recompute the element address from `this` via offsetof instead
        // of naming the array member directly.
        return *(reinterpret_cast<T *>(
                     reinterpret_cast<char *>(this) + offsetof(Node<T>, array)) + n);
    }
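For what it's worth, all three formulations should compute the same address. Here is a standalone check (a sketch; NodeLike is a hypothetical stand-in with public members so the expressions can be written outside the class):

    #include <cassert>
    #include <cstddef>

    struct NodeLike {
        NodeLike *parent;
        NodeLike *child;
        int size;
        int array[1];
    };

    int main()
    {
        NodeLike node = {};
        int n = 0; // stay within the declared bound for this demo

        int *p1 = &node.array[n];
        int *p2 = (&node.array[0]) + n;
        int *p3 = reinterpret_cast<int *>(
                      reinterpret_cast<char *>(&node) + offsetof(NodeLike, array)) + n;

        assert(p1 == p2 && p2 == p3); // all three name the same element
        return 0;
    }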
Hopefully I am now free of any possible undefined behavior. Perhaps inline assembly would express my intent even better. But do I really have to go this far? Can someone clarify this for me?
By the way, T is always a POD type, and Node itself is also a POD.
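The trivial-type half of that claim can be spot-checked at compile time (a sketch; Node<int> is just a representative instantiation):

    #include <type_traits>

    static_assert(std::is_trivial<Node<int>>::value,
                  "Node<T> should be trivial for POD T");
    static_assert(std::is_trivially_copyable<Node<int>>::value,
                  "Node<T> should be trivially copyable for POD T");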