Parallel Algorithm for Multiple Intersections

Question

Parallel Algorithm for Multiple Intersections

I have n-sets (distributed over n-rows) of data that represent grid nodes, and I would like to know an efficient parallel algorithm to find the intersection of these sets, i.e. common nodes. An intersection is defined as soon as any 2 sets share a node.

For instance:

Input:

Rank 0: Set 1 - [0, 1, 2, 3, 4] Rank 1: Set 2 - [2, 4, 5, 6] Rank 2: Set 3 - [0, 5, 6, 7, 8]

Implement a parallel algorithm → Result: (after finding the intersections)

 Rank 0: [0, 2, 4] Rank 1: [2, 4, 5, 6] Rank 2: [0, 5, 6]

The algorithm should be executed on n-rows with 1 set per rank.

+4

set algorithm parallel-processing distributed-computing set-intersection

Gk-f3d Dec 12 '12 at 18:15

source share

1 answer

Ira Baxter · Answer 1 · 2012-12-12T22:37:20+0000

You must be capable of this fast O (N) in parallel, with hash tables.

For each set S_i for each member m_x (all of which can be executed in parallel), put the set element in the hash table associated with the given name, for example. At any time, when you get hit in the hash table on m_x from the set S_j, now you have the corresponding given number S_i, and you immediately know that S_i intersects S_j. You can put m_x in derived intersection sets.

You need a table with secure access. It's simple; lock buckets during updates.

[Another answer suggested sorting sets. With most sorting algorithms, it would be O (N ln N) time, and not so fast].

Parallel Algorithm for Multiple Intersections

More articles: