Find the maximum overlapping subset of ranges

If you have a set of ranges, for example, the following simple example ...

[ [12, 25], #1 [14, 27], #2 [15, 22], #3 [17, 21], #4 [20, 65], #5 [62, 70], #6 [64, 80] #7 ] 

... how do you calculate the maximum intersecting subset (not exactly how to express it, but I mean “the subset of ranges that intersect and have the highest power”) and determine the degree of intersection (the power of the ranges in this subset)?

Logically, I can work it out and maybe I can translate it into a naive algorithm. Going down the list, we see that 1-5 intersect and 5-7 intersect, and # 5 intersects both sets.

As a result, I just want a subset, as it gives me information about the power, and I can easily calculate the intersection of the set while they all intersect. In the above example, this will be [[14, 27],[15, 22],[12, 25],[17, 21],[20, 65]] .

At the top of my head, I can try converting each range into a node graph by connecting the ones that intersect, and I find the largest fully connected graph.

I also thought iteratively starting from the beginning, keep creating a list of intersecting ranges with passing intersections on each, to check against - until you click on an element that doesn't intersect, and then start a new list. Continue checking each element at existing intersections. However, I am not sure if this is complete.

I could strike at the implementation of something (lang is ruby ​​FWIW), but I would like to hear how others can solve this problem, and what the most efficient and elegant way can be.

Update:

I believe that this is a specific case of the maximum click problem, which is NP-hard and therefore actually complex. Proposals for proximity / use in the real world will be rated most highly!

See also: http://en.wikipedia.org/wiki/Maximum_clique / Find all complete subgraphs in a graph

Update 2

Found good evidence of this problem NP-hardness and NP-completeness here: http://www.cs.bris.ac.uk/~popa/ipl.pdf

This seems to be the end of the line. Sorry people! I will work with a rather greedy approach. Thank you. The strike>

As the answers said, I don’t think this article describes this problem ... we probably have more information here based on the ranges.

+6
source share
1 answer

If I understand the problem correctly, this is not an example of the NP problem described in the document you are attached to. Here is my understanding of the problem and the solution to polynomial time.

  • We are given a finite set of ranges of real numbers, for example n: [A1, B1], [A2, B2], ..., [An, Bn], where Ai <= Bi.

  • Create a sorted list of start and end points, sorted by number, indicating whether the point is a start or end point.

In your example, it will be: 12+, 14+, 15+, 17+, 20+, 21-, 22-, 25-, 27-, 62+, 64+, 65-, 70-, 80 -

  • Initialize curOverlap and maxOverlap to zero.

  • Scroll through the list, increasing curOverlap for each + and decreasing it for each -. Set maxOverlap = max (curOverlap, maxOverlap) for each increment.

Continue your example:
val, cur, max
12, 1, 1
14, 2, 2
15, 3, 3
17, 4, 4
20, 5, 5
21, 4, 5
22, 3, 5
25, 2, 5
27, 1, 5
62, 2, 5
64, 3, 5
65, 2, 5
70, 1, 5
80, 0, 5

The maximum overlap is 5. You can also save the val value associated with max if you want to know where the maximum overlap occurred. This will give you 20. in this example. Then it is trivial to go through the initial set of ranges and find 5 that include 20.

-edit- If you have duplicate values, consider the pluses before the minuses for each value so that you include ranges that overlap at one point.

+8
source

All Articles