Data structures: what should I use for these conditions?

Question

This shouldn't be a tricky question, but I'd like to bounce it off someone before I continue. I just need to decide which data structure to use based on these expected operations:

I will often need to iterate in sorted order (starting from the head).
I will need to remove arbitrary elements from the sorted view and restore them to it.
Later, I will often re-sort the data and work with several sorted views.
Also later, I will often change the position of elements in my sorted views.

This, by the way, is in Java.

My best guess is that I will either roll some kind of custom linked hash set (to arrange the links in sorted order), or maybe just use a TreeSet. But I'm still not quite sure. Recommendations?

Edit: I think that, because of the arbitrary removal / restoration, I should probably stick with a TreeSet, right?

Actually, not necessarily. Hmmm ...

+6

java performance sorting data-structures multiway-tree

Daddy warbox Feb 21 '10 at 9:49

2 answers

Use the standard LinkedHashSet, or LinkedHashMultiset from Google Collections if you want your data structure to store non-unique values.
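One thing worth checking against the requirements in the question: a LinkedHashSet iterates in insertion order, not in sorted order. The sketch below is only an illustration (the class and method names are made up for this example). It loads the set in sorted order, removes one element and adds it back.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper class, for illustration only.
class LinkedHashSetOrder {
    // Loads the values in sorted order, removes one element and restores it.
    static List<Integer> removeAndRestore(List<Integer> values, Integer victim) {
        // Copying from a TreeSet means the initial insertion order is the sorted order.
        LinkedHashSet<Integer> set = new LinkedHashSet<>(new TreeSet<>(values));
        set.remove(victim); // hash lookup, no walk along the links
        set.add(victim);    // re-inserted at the tail, not at its sorted position
        return new ArrayList<>(set);
    }
}
```

Removal is cheap, but the restored element ends up at the end of the iteration order, so a restore would need a re-sort or a custom structure to put it back in place.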

+3

Roman Feb 21 '10 at 10:17

Steve314 · Accepted Answer · 2010-02-21T11:19:08+0000

In theory, I would say that the correct data structure is a multiway tree, preferably something like a B+ tree. Traditionally, this is a disk-based data structure, but modern main memory has many similar characteristics due to cache layers and virtual memory.

Iterating over a B+ tree in order is very efficient, because (1) you only iterate over the linked list of leaf nodes, so the branch nodes are not needed, and (2) you get very good locality.
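To make the leaf-chain point concrete, here is a minimal sketch of only the leaf level of a B+ tree. It is not a full tree: there are no branch nodes, no splits and no merges. The names `Leaf` and `LeafChain` and the fixed block capacity are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical leaf node: a small sorted block of keys plus a link to the next leaf.
final class Leaf {
    final int[] keys; // sorted within the block
    int count;        // number of slots in use
    Leaf next;        // right sibling, or null for the last leaf

    Leaf(int capacity) {
        keys = new int[capacity];
    }
}

final class LeafChain {
    // Builds the leaf level from already-sorted input (capacity must be at least 1).
    // Each block is filled completely; a real tree would leave slack for inserts.
    static Leaf build(int[] sorted, int capacity) {
        Leaf head = new Leaf(capacity);
        Leaf tail = head;
        for (int key : sorted) {
            if (tail.count == capacity) {
                tail.next = new Leaf(capacity);
                tail = tail.next;
            }
            tail.keys[tail.count++] = key;
        }
        return head;
    }

    // In-order traversal: one pass along the chain, contiguous reads inside each block.
    static List<Integer> inOrder(Leaf head) {
        List<Integer> out = new ArrayList<>();
        for (Leaf leaf = head; leaf != null; leaf = leaf.next) {
            for (int i = 0; i < leaf.count; i++) {
                out.add(leaf.keys[i]);
            }
        }
        return out;
    }
}
```

Each step of the traversal either reads the next slot of the same array or follows one `next` link, which is where the locality comes from.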

Searching for, deleting and inserting arbitrary elements is O(log n), as with any balanced tree, but with different constant factors.

Re-sorting within the tree is mainly a question of choosing an algorithm that gives good performance when working with a linked list of blocks (the leaf nodes) while minimizing the need to touch the branch nodes. Variants of quicksort or merge sort seem likely candidates. Once the elements are sorted in the leaf nodes, just propagate the summary information back up through the branch nodes.

BUT, pragmatically, this is only something you would do if you were sure that you needed it. The odds are good that you are better off using a standard container. Algorithm / data structure optimization is the best kind of optimization, but it can still be premature.
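As a rough illustration of the standard-container route, the sketch below keeps one `java.util.TreeSet` per sort order over the same objects. The `Item` and `SortedViews` types are hypothetical, and the sketch assumes that names are unique so they can act as a tie-breaker (a TreeSet silently drops elements that its comparator considers equal).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// Hypothetical element type, used only for illustration.
final class Item {
    final String name;
    int priority;

    Item(String name, int priority) {
        this.name = name;
        this.priority = priority;
    }
}

// Two sorted views over the same Item objects, one TreeSet per sort order.
class SortedViews {
    final TreeSet<Item> byName =
            new TreeSet<>(Comparator.comparing((Item i) -> i.name));
    final TreeSet<Item> byPriority = new TreeSet<>(
            Comparator.comparingInt((Item i) -> i.priority).thenComparing(i -> i.name));

    // Inserting (or restoring) an element is O(log n) per view.
    void add(Item item) {
        byName.add(item);
        byPriority.add(item);
    }

    // Removing an arbitrary element is also O(log n) per view.
    void remove(Item item) {
        byName.remove(item);
        byPriority.remove(item);
    }

    // Changing a sort key: remove under the old key, mutate, then re-insert.
    // Mutating the key while the element is still in the set would corrupt the ordering.
    void changePriority(Item item, int newPriority) {
        byPriority.remove(item);
        item.priority = newPriority;
        byPriority.add(item);
    }

    // Collects the names of a view in its iteration (sorted) order.
    static List<String> names(Iterable<Item> view) {
        List<String> out = new ArrayList<>();
        for (Item i : view) {
            out.add(i.name);
        }
        return out;
    }
}
```

Iterating either set visits the elements in that view's order, starting from the head. Moving an element within a view is a remove followed by an add, so it costs two O(log n) operations rather than a full re-sort.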
