Merge overlapping rectangles aligned along the axis

I have a set of rectangles aligned along the axis. When two rectangles overlap (partially or completely), they must be combined into their common bounding box. This process works recursively.

Detecting all overlaps and using union-find to form groups that you merge at the end will not work, since merging two rectangles spans a large area and can create new overlaps. (In the figure below, after combining two overlapping rectangles, a new overlap appears.)

enter image description here

As in my case, the number of rectangles is moderate (for example, N <100), you can use a brute force solution (try all the pairs and if a match is found, merge and restart from the very beginning). Anyway, I would like to reduce the complexity, which is probably O (NΒ³) in the worst case.

Any suggestion how to improve this?

+7
algorithm computational-geometry rectangles
source share
3 answers

I think R-Tree will be here. R-Tree indexes rectangular areas and allows you to insert, delete, and query (for example, intersecting queries) in O (log n) for "regular" queries in low sizes.

The idea is to process your rectangles in a row, for each rectangle you do the following:

  • execute the intersection request of the current R-tree (empty to the beginning)

  • If there are results, delete the results from the R-tree, merge the current rectangle with all the rectangles of the results and insert the newly merged rectangle (for the last step, go to step 1.).

  • If there are no results, just paste the rectangle into the R-Tree

Total you follow

  • O (n) traverse requests in step 1. (O (n log n))
  • O (n) enter the steps in step 3. (O (n log n))
  • no more than n delete and n insert steps in step 2. This is because every time you perform step 2, you reduce the number of rectangles by at least 1 (O (n log n))

In theory, you should avoid O (n log n) , however the merge steps at the end (with large rectangles) may have low selectivity and need more than O (log n), but depending on the data distribution this should not ruin the overall execution time .

+1
source share

Use a balanced normalized square tree.

Normalized: collect all x coordinates, sort them and replace them with an index in a sorted array. Same for y coordinates.

Balanced: when building a quad tree, there is always a split in the middle coordinate.

So, when you get the rectangle you want to go, and mark the correct nodes in the tree with some rectangle id. If you find other rectangles at the bottom (which means they will overlap), assemble them in a set. When this is done, if the vector is not empty (you find the overlapping rectangles), then we create a new rectangle to represent the union of the subelements. If the computed rectangle is larger than the one you just inserted, then apply the algorithm again using the new computed rectangle. Repeat this until it grows larger, then move on to the next input rectangle.

For performance, each node in the quad-core storage stores all the rectangles that overlap this node, in addition to marking it as end-node.

Difficulty: initial normalization of O(NlogN) . Inserting and checking for overlaps will be O(log(N)^2) . You need to do this for the original N rectangles, as well as for the ceilings. Each time you detect an overlap, you eliminate at least one of the original rectangles so that you can find the maximum (N-1) overlap. So overall you need 2N operations. So the overall complexity will be O(N(log(N)^2)) .

This is better than other approaches because you do not need to check any rectangles for overlapping.

0
source share

This can be solved using a combination of a flat scan and a spatial data structure: we combine the intersecting rectangles along the scan line and put any rectangles behind the scan line in the spatial data structure. Each time we get a new merged rectangle, we check the spatial data structure for any rectangles intersecting this new rectangle and merge them if they are found.

If a rectangle (R) behind the scan line intersects a rectangle (S) along the scan line, then either of the two angles R closest to the scan line is inside S. This means that the spatial data structure must store points (two points for each rectangle) and respond to requests for any point lying inside this rectangle. One obvious way to implement such a data structure is to use a segment tree, where each sheet contains a rectangle with top and bottom sides in the corresponding y-coordinate, and each other node points to one of its descendants containing the rightmost rectangle (where its right side is the nearest to the scan line).

To use such a tree of segments, we must compress (normalize) the y-coordinates of the corners of the rectangles.

  • Y-coordinate compression
  • Sort rectangles by x coordinates on the left side.
  • Move the sweep line to the next rectangle, if it passes the right parts of some rectangles, move them to the segment tree.
  • Make sure the current rectangle intersects something along the sweep line. If you do not go to step 3.
  • Make sure the rectangle union found in step 4 intersects something in the segment tree and merges recursively, then go to step 4.
  • When step 3 reaches the end of the list, we get all the rectangles under the scan line and all the rectangles in the segment tree and unzip their coordinates.

The complexity of the temporary case is determined by the tree of segments: O (n log n). The complexity of the space is O (n).

0
source share

All Articles