Set Union Run Time

Given two sets A and B, what is the general algorithm used to find their union, and what is its run time?

My intuition:

    a = set((1, 2, 3))
    b = set((2, 3, 5))
    union = set()
    for el in a:
        union.add(el)
    for el in b:
        union.add(el)

add checks for a collision, which is O(1), and then adds the element, which is (??). This is done n times (where n is |a| + |b|). So this is O(n * x), where x is the run time of the add operation.

Is this correct?

+4
4 answers

It is very implementation-dependent. Others have mentioned comparison-based sets (the elements have a less-than operator for sorting) and hash-based sets (the elements have a good hash function). Another possible implementation is "union-find", which supports only a specialized subset of the usual set operations but is very fast (union, I think, is amortized nearly constant time?). You can read about it here:

http://en.wikipedia.org/wiki/Union_find

and see an example application here:

http://lorgonblog.spaces.live.com/blog/cns!701679AD17B6D310!220.entry
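To make the "specialized subset" concrete, here is a minimal union-find sketch. It assumes the elements are the integers 0..n-1, and the class and method names are illustrative, not taken from the links above. Note that it tracks a partition into disjoint sets: it merges the sets containing two elements, which is a different problem from taking the union of two arbitrary overlapping sets.

```python
class DisjointSets:
    """Union-find over the integers 0..n-1 (hypothetical helper class)."""

    def __init__(self, n):
        self.parent = list(range(n))  # every element starts as its own root
        self.size = [1] * n           # size of the tree rooted at each element

    def find(self, x):
        # Walk up to the root, shortening the path along the way
        # (path halving).
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, x, y):
        # Merge the sets containing x and y.
        # Returns False if they were already in the same set.
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False
        if self.size[rx] < self.size[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx          # attach the smaller tree under the larger
        self.size[rx] += self.size[ry]
        return True


ds = DisjointSets(6)
ds.union(0, 1)
ds.union(2, 3)
ds.union(1, 3)
same = ds.find(0) == ds.find(2)    # True: 0 and 2 now share a set
apart = ds.find(0) == ds.find(5)   # False: 5 was never merged
```

With union by size and path shortening, a sequence of operations runs in nearly constant amortized time per operation, which is where the "very fast" claim comes from.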

+3

The complexity of add / find (collision) will depend on the implementation of the set.

If you use a hash-table-based data structure, your collision check will indeed be constant time, given a good hash function.

Otherwise, add will probably be O(log N) for a sorted list / tree structure.

+4

First answer: if you are dealing with sets of numbers, you can implement a set as a sorted vector of distinct elements. Then you can implement union(S1, S2) simply as a merge operation (checking for duplicates), which takes O(n) time, where n is the sum of the cardinalities.
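A minimal sketch of that merge, assuming both inputs are Python lists that are already sorted and contain no duplicates (the function name `union_sorted` is made up for illustration):

```python
def union_sorted(s1, s2):
    """Union of two sorted, duplicate-free lists in O(len(s1) + len(s2))."""
    result = []
    i = j = 0
    while i < len(s1) and j < len(s2):
        if s1[i] < s2[j]:
            result.append(s1[i])
            i += 1
        elif s2[j] < s1[i]:
            result.append(s2[j])
            j += 1
        else:
            # Same element in both inputs: keep one copy, advance both.
            result.append(s1[i])
            i += 1
            j += 1
    # At most one of these tails is non-empty.
    result.extend(s1[i:])
    result.extend(s2[j:])
    return result


merged = union_sorted([1, 2, 3], [2, 3, 5])  # [1, 2, 3, 5]
```

Each step advances at least one index, so the loop body runs at most len(s1) + len(s2) times.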

Now, my first answer is a bit naive, and Akusete is right: you can, and should, implement the set as a hash table (a set should be a generic container, and not all objects can be sorted!). Then both lookup and insertion are O(1), and, you guessed it, the union takes O(n) time.

(Looking at your Python code) Python sets are implemented using hash tables. Read this interesting thread. See also this implementation, which uses sorted vectors instead.
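For reference, Python's built-in `set` already provides the union directly, so the explicit loops in the question are not needed. A short sketch using the question's values:

```python
a = {1, 2, 3}
b = {2, 3, 5}

union_op = a | b           # new set, operator form
union_method = a.union(b)  # new set, method form

in_place = set(a)          # copy a so the original stays unchanged
in_place.update(b)         # add every element of b to the copy
```

Because hash-table insertion is O(1) on average, each of these takes expected time proportional to len(a) + len(b). The worst case is slower if many elements collide in the hash table.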

+3

If you can use bits (each bit in an array of ints corresponds to one possible element of your set), you can simply walk the int arrays and OR the corresponding elements together. This has complexity O(N), where N is the length of the array, that is, O((M + 31) / 32), where M is the number of possible elements.
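A rough sketch of the bit-array idea, assuming the elements are small non-negative integers below a known bound and using 32-bit words as in the formula above. The helper names are hypothetical, and Python ints are not really fixed-width, so this only mimics the layout:

```python
WORD_BITS = 32


def make_bitset(elements, universe_size):
    # One bit per possible element, packed into (M + 31) // 32 words.
    words = [0] * ((universe_size + WORD_BITS - 1) // WORD_BITS)
    for e in elements:
        words[e // WORD_BITS] |= 1 << (e % WORD_BITS)
    return words


def bitset_union(x, y):
    # OR the arrays word by word; both must cover the same universe.
    return [wx | wy for wx, wy in zip(x, y)]


def bitset_members(words):
    # Decode the bit array back into a sorted list of elements.
    return [
        i * WORD_BITS + bit
        for i, w in enumerate(words)
        for bit in range(WORD_BITS)
        if (w >> bit) & 1
    ]


a = make_bitset([1, 2, 3], 40)
b = make_bitset([2, 3, 35], 40)
u = bitset_union(a, b)
members = bitset_members(u)  # [1, 2, 3, 35]
```

The union itself touches each word once, so its cost depends on the size of the universe rather than on how many elements the sets actually contain.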

0
