Your first choice should depend on the expected access patterns and the amount of data that you are likely to store: ...
- if there will never be a lot of data (n less than 30, say), an unsorted array will be fine;
- if you almost never add, delete, or update, a sorted array will be fine;
- if n is less than, say, 1 million, and you are only ever looking at the top element (whether that's the smallest or the largest), a heap will be fine, especially if you often update randomly selected elements, as you would in, say, an LRU (least recently used) queue for a cache, because on average such an update is O(1), not O(log n) (see the sketch near the end of this answer);
- if n is less than, say, 1 million, and you are not sure what you will be looking up, a balanced tree (say, red-black or AVL) will be fine;
- if n is large (1 million or more, say), you are probably better off with a B-tree or a trie (balanced binary tree performance tends to "fall off a cliff" once n is large enough: memory accesses become too scattered, and cache misses really start to hurt).
... but I recommend keeping the choice as open as possible, so that you can benchmark at least one of the alternatives and switch to it if it performs better.
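As a minimal sketch of what "keeping the choice open" can look like in practice (my own illustration, not from the original answer; the class and method names are invented), here are two interchangeable backends exposing the same tiny interface, so switching after a benchmark is a one-line change:

```python
import bisect
import heapq

class SortedArrayMin:
    """Sorted-array backend: cheap scans, O(n) inserts."""
    def __init__(self):
        self._items = []

    def add(self, x):
        bisect.insort(self._items, x)   # O(n) shift; fine for small n

    def pop_min(self):
        return self._items.pop(0)       # O(n); acceptable if rare

class HeapMin:
    """Binary-heap backend: O(log n) inserts and pop-min."""
    def __init__(self):
        self._items = []

    def add(self, x):
        heapq.heappush(self._items, x)

    def pop_min(self):
        return heapq.heappop(self._items)

# Both backends expose the same two methods, so swapping is one line:
backend = HeapMin()          # or SortedArrayMin(); benchmark and pick
for x in (5, 1, 4, 2):
    backend.add(x)
print(backend.pop_min())     # -> 1 with either backend
```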
Over the past twenty years, I have only worked on two applications where heaps were the best choice for anything: once for an LRU cache, and once in a nasty operations-research application (restoring additivity to randomly perturbed k-dimensional hypercubes, where most cells in the hypercube appeared in k different heaps, and memory was at a premium). In both of those cases, though, heaps performed far better than the alternatives: literally ten times faster than balanced binary trees or B-trees.
For the hypercube problem I mentioned in the previous paragraph, my team lead thought that red-black trees would do better than heaps, but benchmarking showed the red-black trees were slower (as far as I remember, about twenty times slower), and although the B-trees were much faster than the red-black trees, the heaps still beat them comfortably.
The important feature of heaps in both of those cases, as I mentioned above, was not the O(1) lookup of the minimum value, but rather the O(1) average update time for a randomly selected element.
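To make that O(1)-average-update point concrete, here is a minimal sketch of a binary min-heap with an in-place update operation; this is my own illustration of the technique, not the author's code. The intuition: roughly half of a heap's elements are leaves and another quarter sit one level above them, so a randomly chosen element, perturbed to a random new value, only travels a constant number of levels on average. Note that callers must know an element's current index; real implementations track this with a handle map, which is elided here.

```python
class UpdatableMinHeap:
    """Binary min-heap supporting in-place updates of arbitrary elements.

    A sketch of the technique under stated assumptions, not a
    production implementation: the index-tracking (handle map) that
    real applications need in order to call update() is elided.
    """

    def __init__(self, items=()):
        self._a = list(items)
        # Standard O(n) heapify: sift down from the last internal node.
        for i in range(len(self._a) // 2 - 1, -1, -1):
            self._sift_down(i)

    def min(self):
        return self._a[0]               # O(1) peek at the top element

    def update(self, i, value):
        """Change element i, then restore the heap property.

        A small random perturbation of a randomly chosen element
        moves it O(1) levels on average, since most elements sit in
        the bottom levels of the tree.
        """
        old, self._a[i] = self._a[i], value
        if value < old:
            self._sift_up(i)            # got smaller: may rise
        else:
            self._sift_down(i)          # got larger: may sink

    def _sift_up(self, i):
        a = self._a
        while i > 0:
            parent = (i - 1) // 2
            if a[i] >= a[parent]:
                break
            a[i], a[parent] = a[parent], a[i]
            i = parent

    def _sift_down(self, i):
        a, n = self._a, len(self._a)
        while True:
            left, right, smallest = 2 * i + 1, 2 * i + 2, i
            if left < n and a[left] < a[smallest]:
                smallest = left
            if right < n and a[right] < a[smallest]:
                smallest = right
            if smallest == i:
                break
            a[i], a[smallest] = a[smallest], a[i]
            i = smallest

# Usage: perturb the root upward and watch the heap repair itself.
h = UpdatableMinHeap([7, 3, 9, 1, 8, 5])
print(h.min())      # -> 1
h.update(0, 10)     # element at index 0 sifts down in O(log n) worst case
print(h.min())      # -> 3
```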
-James Barbetti (Well, I thought I knew, but captcha keeps telling me that I'm not human)