Parallel convex hull algorithm in gpu

I am implementing a separation and subjugation approach to a convex hull in CUDA. This is my approach: Below:

  • Create a list of lists for storing convex hulls;
  • curSize = input size (all points);

  • for i: 1 to register N

  • to begin
  • curSize = curSize / 2;
  • Take every 2 adjacent convex hulls in the list of lists and combine them into a larger hull (using the standard upper and lower common tangent approach to expand and conquer the convex hull) in curSize streams
  • // Initially, it combines every 2 adjacent points in the list with the body, then in the next iteration it combines the convex hulls of size 2 into larger convex hulls, etc., finally, the list of lists will have a single list of points on the hull
  • end

But it becomes too complicated, and I feel that I am not using CUDA parallel power, because at each level of the tree I create N / 2 ^ i threads, the complexity of which is O (N) when combining all adjacent cases at this level, Therefore, the network complexity is still O (N logN).

Can you tell me how to do it better or give an alternative parallel lead algorithm for the convex hull (it would be great if I can get an algorithm for the parallel version of the Graham scan)?

+4
source share
1 answer

The complexity of your algorithm is still O (N) (unchanged from the single-threaded version) because you are doing 3 things:

  • Flow control: O (N)
  • Remove some vertices from the lists: O (N) is depreciated since there are only N points.
  • Look at the points adjacent to the deleted one, but not removed during the current step: O (N), since there are no more than two such points for each merge.

But if your points are not sorted, it is better to parallelize the sorting.

+1
source

All Articles