O(1) Make, Find and Union in a disjoint-set data structure

Today I had a discussion with someone about Kruskal's Minimum Spanning Tree algorithm, prompted by this slide.

The presentation's author says that if we implement disjoint sets using a (doubly) linked list, Make and Find both run in O(1). Union(u, v) takes O(min(nu, nv)) time, where nu and nv are the sizes of the sets containing u and v.
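To make the claim concrete, here is a rough sketch of how I understand that linked-list representation: every element keeps a direct pointer to its set record, so Union has to rewrite the pointer of every element of the smaller set, which is where the O(min(nu, nv)) cost comes from.

    // Classic linked-list disjoint sets: each node points directly to its set,
    // and union rewrites those pointers for the smaller of the two lists.
    import java.util.LinkedList;

    class ListSet {
        LinkedList<Node> members = new LinkedList<>();

        static Node makeSet(Vertex v) {              // O(1)
            ListSet s = new ListSet();
            Node n = new Node();
            n.vertex = v;
            n.set = s;
            s.members.add(n);
            return n;
        }

        static Node find(Node n) {                   // O(1): the representative is the list head
            return n.set.members.getFirst();
        }

        static void union(Node u, Node v) {          // O(min(nu, nv))
            ListSet a = u.set, b = v.set;
            if (a == b) return;
            if (a.members.size() < b.members.size()) { ListSet t = a; a = b; b = t; }
            for (Node n : b.members) n.set = a;      // rewrite every pointer in the smaller set
            a.members.addAll(b.members);
        }
    }

    class Node { Vertex vertex; ListSet set; }
    class Vertex { }                                 // stand-in for the graph vertex type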

I said that we can improve the Union(u, v) time to O(1) by giving each element a representative pointer that points to a locator, which in turn holds a pointer to the actual representative of the set.

In Java, the data structure will look like this:

    import java.util.LinkedList;

    class DisjointSet {
        LinkedList<Member> list = new LinkedList<Member>(); // holds the members; we might need it for printing

        static Member makeSet(Vertex v) {
            Member m = new Member();
            DisjointSet set = new DisjointSet();
            m.set = set;
            set.list.add(m);
            m.vertex = v;
            Locator loc = new Locator();
            loc.representation = m;
            m.locator = loc;
            return m;
        }
    }

    class Member {
        DisjointSet set;
        Locator locator;
        Vertex vertex;

        Member find() {
            return locator.representation;
        }

        static void union(Member u, Member v) {
            // assume nv is less than nu
            u.set.list.append(v.set.list); // hypothetical method, appends a list in O(1)
            v.set = u.set;
            v.locator.representation = u.locator.representation;
        }
    }

    class Locator {
        Member representation;
    }

    class Vertex { } // stand-in for the graph vertex type

Sorry for the minimal code. If this can be done, then the running time of every disjoint-set operation (Make, Find, Union) will be O(1). But the person I spoke with cannot see the improvement. I would like to know your opinion on this.

Also, what is the fastest Find/Union performance among the various implementations? I am not an expert in data structures, but from a quick browse of the Internet it seems there is no constant-time data structure or algorithm for this.

2 answers

My intuition agrees with your colleague's. You write:

u.set.list.append(v.set.list); // hypothetical method, append a list in O(1)

It looks like your intention is for the merge to be done via append. But to implement Union you would need to remove duplicates so that the result is a set. So I can see an O(1) algorithm for a fixed, given size, for example...

    Int32 set1;
    Int32 set2;
    Int32 unionSets1And2 = set1 | set2;

But that feels like cheating to me. If you do this for the general case of N elements, I don't see how you avoid some form of iteration (or hash lookup), and that makes it O(n) (or at best O(log n)).
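To illustrate, here is a minimal sketch (HashUnionDemo is just an illustrative name) of a general-case union using hashing; even with a hash set, the merge still has to touch every element of one of the inputs:

    import java.util.HashSet;
    import java.util.Set;

    class HashUnionDemo {
        // General-case union via hashing: addAll walks every element of b,
        // so the merge costs O(|a| + |b|), not O(1).
        static <T> Set<T> union(Set<T> a, Set<T> b) {
            Set<T> result = new HashSet<T>(a);  // copy a: O(|a|)
            result.addAll(b);                   // hash-insert each element of b: O(|b|)
            return result;
        }
    }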

FYI: it was not easy for me to follow your code. In makeSet you create a local Locator that never escapes the function; it is not clear what it is doing, and it is not clear what your intention is. You might want to edit and clarify your approach.


Using Tarjan's version of the Union-Find structure (with path compression and union by rank), a sequence of m Finds and (n-1) intermixed Unions runs in O(m·α(m, n)), where α(m, n) is the inverse Ackermann function, whose value is at most 4 for all practical values of m and n. So this essentially means that Union-Find gives amortized almost-constant operations in the worst case, but not truly constant ones.
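For reference, a minimal sketch of that structure, in the usual array-based form with path compression and union by rank:

    // Array-based Union-Find with path compression and union by rank.
    class UnionFind {
        private final int[] parent;
        private final int[] rank;

        UnionFind(int n) {
            parent = new int[n];
            rank = new int[n];
            for (int i = 0; i < n; i++) parent[i] = i;        // each element starts as its own set
        }

        int find(int x) {
            if (parent[x] != x) parent[x] = find(parent[x]);  // path compression
            return parent[x];
        }

        void union(int x, int y) {
            int rx = find(x), ry = find(y);
            if (rx == ry) return;
            if (rank[rx] < rank[ry]) { int t = rx; rx = ry; ry = t; }  // attach lower rank under higher
            parent[ry] = rx;
            if (rank[rx] == rank[ry]) rank[rx]++;
        }
    }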

As far as I know, it is impossible to get a better theoretical complexity, although refinements have led to better practical efficiency.

For special cases of disjoint sets, such as the ones arising in language theory, it has been shown that linear adaptations (i.e. everything in amortized O(1)) are possible, essentially by grouping the nodes together, but these improvements do not carry over to the general problem. On the other hand, a somewhat similar core idea was used, with great success and ingenuity, in Chazelle's near-linear algorithm for minimum spanning trees.

As for your code, it is not quite right. Moron pointed out the error: when you merge two sets, you only update the "representation" of the handle of each list, but not of all the other elements, even though your find function assumes that each element knows its representative directly.
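To make find behave correctly, union would have to visit every member of the absorbed set and redirect its representative, which brings back exactly the O(min(nu, nv)) cost the locator was meant to avoid. A minimal sketch of that missing step, using the classes from the question:

    // Corrected union for the classes in the question: every member of the
    // absorbed set gets its representative updated, so the cost is O(nv), not O(1).
    static void union(Member u, Member v) {
        // assume nv (the size of v's set) is not larger than nu
        Member rep = u.find();
        for (Member m : v.set.list) {
            m.locator.representation = rep;   // without this loop, m.find() stays stale
            m.set = u.set;
        }
        u.set.list.addAll(v.set.list);        // LinkedList.addAll also iterates over v's list
    }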

