What is a good data structure for constructing equivalence classes on tree nodes?

Question

What is a good data structure for constructing equivalence classes on tree nodes?

I am looking for a good data structure for creating equivalence classes on tree nodes. In an ideal structure, the following operations should be fast (O (1) / O (n) as needed) and easy (without clauses of the secret code):

(A) Go through the tree from the root; on each child transition node -> list all equivalent versions of the child element node
(B) Combine two equivalence classes
(C) Create new nodes from the list of existing nodes (children) and other data.
(D) Find any nodes structurally equivalent to a node (i.e. they have the same number of children, the corresponding children belong to the same equivalence class, and their "other data" are equal) so that new (or newly changed)) nodes can be placed to the correct equivalence class (via merge)

So far I have considered (some of them could be used in combination):

Parfait, where children are links to collections of nodes, not nodes. (A) is fast, (B) requires a tree walk and node updates to indicate a merged collection, (C) you need to find a collection containing each child of the new node, (D) requiring a tree transition
Maintaining a hash of nodes by their characteristics. This makes (D) much faster, but (B) slower (since the hash needs to be updated when combining equivalence classes)
Link the nodes together in a circular linked list. (A) fast, (B) will be fast, but for the fact that this “merging” of the part of the circular list with itself actually splits the list (C), it will be fast (D) will require a tree walk
As above, but with an optional "up" pointer in each node, which can be used to search for the canonical member of the circular list.

Am I missing a sweet alternative?

+6

language-agnostic algorithm data-structures tree equivalence-classes

Markusq Mar 23 '09 at 19:17

source share

3 answers

I do not think that any one structure will solve your problems, but you can take a look at the Data Structure with unrelated sets . After all, an equivalence class is the same as a partition of a set. He should be able to quickly process some of these operations.

+3

Jay kominek Mar 23 '09 at 20:40

source share

Stepping back for a moment, I suggest not using the tree at all. The last time I had to deal with a similar problem, I started with a tree, but later switched to an array.

The reasons for multiple, but number one reasons were performance, my classes are up to 100 or so, children could actually work better by manipulating them as an array than through tree nodes, mainly due to hardware locality and processor prefetching logic, and pipelining CPU processing

Thus, although the structure of an array algorithmically requires more N operations than a tree, performing these dozens of operations is likely faster than chasing pointers in memory.

+1

Robert Gould Apr 14 '09 at 1:13

source share

Kevin beck · Accepted Answer · 2009-04-09T00:44:26+0000

You seem to have two forms of equivalence. Regular equivalence (A), tracked as equivalence classes that are kept up to date and structural equivalence (D), for which you collect one equivalence class from time to time and then drop it.

It seems to me that the problem will be conceptually simpler if you keep equivalence classes for both simple and structural equivalence. If this leads to too much outflow of structural equivalence, you can preserve equivalence classes for some aspects of structural equivalence. Then you can find a balance where you can afford to maintain these equivalence classes, but still significantly reduce the number of nodes that you need to study when building a list of structurally equivalent nodes.

What is a good data structure for constructing equivalence classes on tree nodes?

More articles: