Parallel convex hull algorithm in gpu

Question

Parallel convex hull algorithm in gpu

I am implementing a separation and subjugation approach to a convex hull in CUDA. This is my approach: Below:

Create a list of lists for storing convex hulls;
curSize = input size (all points);
for i: 1 to register N
to begin
curSize = curSize / 2;
Take every 2 adjacent convex hulls in the list of lists and combine them into a larger hull (using the standard upper and lower common tangent approach to expand and conquer the convex hull) in curSize streams
// Initially, it combines every 2 adjacent points in the list with the body, then in the next iteration it combines the convex hulls of size 2 into larger convex hulls, etc., finally, the list of lists will have a single list of points on the hull
end

But it becomes too complicated, and I feel that I am not using CUDA parallel power, because at each level of the tree I create N / 2 ^ i threads, the complexity of which is O (N) when combining all adjacent cases at this level, Therefore, the network complexity is still O (N logN).

Can you tell me how to do it better or give an alternative parallel lead algorithm for the convex hull (it would be great if I can get an algorithm for the parallel version of the Graham scan)?

+4

algorithm computational-geometry gpu-programming

alpha_cod Jul 30 '11 at 12:40

source share

1 answer

Sergey Bankevich · Answer 1 · 2011-07-30T15:17:04+0000

The complexity of your algorithm is still O (N) (unchanged from the single-threaded version) because you are doing 3 things:

Flow control: O (N)
Remove some vertices from the lists: O (N) is depreciated since there are only N points.
Look at the points adjacent to the deleted one, but not removed during the current step: O (N), since there are no more than two such points for each merge.

But if your points are not sorted, it is better to parallelize the sorting.

Parallel convex hull algorithm in gpu

More articles: