Possible question for an interview: how to find all overlapping intervals

Question

Possible question for an interview: how to find all overlapping intervals

This is not a question of the interview itself, as I came across this in my project, but I decided that it could be a worthy question.

You have N pairs of intervals, say integers. You need to specify all intervals that overlap with each other in O (N) time. For example, if you have

{1, 3} {12, 14} {2, 4} {13, 15} {5, 10}

answer {1, 3}, {12, 14}, {2, 4}, {13, 15}. Please note that you do not need to group them, so the result can be in any order, as in the example.

I just scored O (N) time, because the KMP algorithm takes O (N) to search for strings .: D

The best that I came up with and what I am currently using in the project is O (N ^ 2). Yes, brute force is rather sad, but no one complains, so I won’t reorganize it .: P However, I was curious if the smarter solution has a more elegant solution.

+55

algorithm

Tom Tucker Dec 28 2018-10-12T00:

source share

13 answers

Sort intervals by starting point. Then collapse this list by combining each interval with its neighbor (i.e. (1,4), (2,6) → (1,6)) if they overlap. The resulting list contains those intervals that do not have a matching partner. Filter them from the source list.

This is linear in time after the initial sorting operation, which can be performed with any algorithm (n log n). Not sure how you get around this. Even the “filter duplicates” operation at the end is linear if you use the already sorted input and output order of the first operation.

Here is the implementation in Haskell:

 -- Given a list of intervals, select those which overlap with at least one other inteval in the set. import Data.List type Interval = (Integer, Integer) overlap (a1,b1)(a2,b2) | b1 < a2 = False | b2 < a1 = False | otherwise = True mergeIntervals (a1,b1)(a2,b2) = (min a1 a2, max b1 b2) sortIntervals::[Interval]->[Interval] sortIntervals = sortBy (\(a1,b1)(a2,b2)->(compare a1 a2)) sortedDifference::[Interval]->[Interval]->[Interval] sortedDifference [] _ = [] sortedDifference x [] = x sortedDifference (x:xs)(y:ys) | x == y = sortedDifference xs ys | x < y = x:(sortedDifference xs (y:ys)) | y < x = sortedDifference (x:xs) ys groupIntervals::[Interval]->[Interval] groupIntervals = foldr couldCombine [] where couldCombine next [] = [next] couldCombine next (x:xs) | overlap next x = (mergeIntervals x next):xs | otherwise = next:x:xs findOverlapped::[Interval]->[Interval] findOverlapped intervals = sortedDifference sorted (groupIntervals sorted) where sorted = sortIntervals intervals sample = [(1,3),(12,14),(2,4),(13,15),(5,10)]

+25

gcbenison Mar 19 '12 at 18:25

source share

The standard approach to interval tasks in a row is to sort them by starting point, and then just go from the first to the last. O(n*logn) ( O(n) if already sorted)

 end = 0; for (current in intervals) { if current.start < end { // there an intersection! // 'current' intersects with some interval before it ... } end = max(end, current.end) }

+6

Nikita Rybak Dec 28 2018-10-12T00:

source share

Not sure about O (N), but what if we first sort them by the first number in each tuple, and then sequentially find those where the first number of the tuple is greater than the largest number specified in previous tuples that also do not intersect with next tuple.

So you first get:

{1, 3}, {2,4}, {5, 10}, {12, 14}, {13, 15}

since 4 (largest) 5 and 10 <12, {5, 10}.

This will lead to the fact that we will track the largest number that we encounter, and every time we find a tuple whose initial number is larger, we check whether it overlaps with the next.

This becomes dependent on the efficiency of the sorting algorithm, because the last process will be O (N)

+4

jbx Dec 28 2018-10-12T00:

source share

Suppose that the difference between the start and end points is small, for example, 32. for example. 1..32. Then each interval can be written as a bit pattern in a 32-bit word. for example, [1, 2] -> 001; [2, 3]-> 010; [1, 3] -> 011; [2, 3, 4] -> 110 [1, 2] -> 001; [2, 3]-> 010; [1, 3] -> 011; [2, 3, 4] -> 110 [1, 2] -> 001; [2, 3]-> 010; [1, 3] -> 011; [2, 3, 4] -> 110 . Two intervals or interval combinations overlap if their bitwise AND value is nonzero. eg. [1,2] overrides [1,3] because 001&011 == 001 is non-zero. O (n) alg - maintain bitwise OR intervals visible so far, and AND each new:

 bitsSoFar = 0 for (n=0; n < bitPatterns.length; n++) if (bitPatterns[n] & bitsSoFar != 0) // overlap of bitPatterns[n] with an earlier pattern bitsSoFar |= bitPatterns[n]

Left as an exercise:

modify the algorithm to also identify a bit pattern overlap with a later
work out a bit pattern for an interval in O (1)

+1

Jan Newmarch Feb 20 '11 at 4:25

source share

If N pairs of intervals are integers, then we can get them in O (n).

Sort by the first number in a pair, then by the second number. If all are integers, we can use bucket sort or radius sort to get it through O (n).

{1, 3}, {2,4}, {5, 10}, {12, 14}, {13, 15}

Then combine one by one,

{1,3}

{1,4} with overlapping {1,3} and {2,4}

{1,4}, {5,10}

{1,4}, {5,10}, {12,14}

{1,4}, {5,10}, {12,15} with overlapping {12,14} and {13,15}

The combination will take time O (N)

+1

will Mar 19 '12 at 17:12

source share

This problem can be reduced to the problem of the uniqueness of an element.

The uniqueness of the element has an Omega (n log n) lower bound (counting the number of comparisons), so you cannot do better than this.

+1

Tea Tagger Mar 22 '14 at 17:45

source share

It has been quite a while since I used it, but the solution I used was derived from the red-black tree described in the Introduction to Algorithms called the interval tree. This is a tree sorted by launch interval, so you can quickly (binary search) get the first node first. IIRC, nodes have been sorted by a property that allows you to stop "walking" through the tree when candidate nodes cannot match your interval. Therefore, I believe that this is O (m) search, where m is the number of matching intervals.

I found a search for this implementation .

Brett

[edit] Rereading the question, this is not what you asked. I think this is the best implementation when you have a list (for example) of meetings already scheduled in the conference rooms (which are added to the tree) and you want to find which rooms are still available for a meeting with a new start and duration ( search query). We hope that this decision has some significance.

0

Brett Stottlemyer Dec 28 2018-10-12T00:

source share

 int find_no_overlapping_intervals(const vector< pair<int, int>& a) { // a[i].first -> X ,a[i].second->Y // sort by start time sort.begin(a.begin(), a.end()); // maintain <end-time, its frequency> in m map<int, int> m; // i // for each point, we know its a[i].X >= a[0].X. ....a[i-1].X // there will be overlap, if for some j < i, a[j].Y >= a[i].X // lower_bound to find this.. it can be found in O(log(n)) as we use std::map for maintaining y int ans = 0; for (int i=0; i < a.begin(); i++) { auto sit = m.lower_bound(m.begin(), m.end(), a[i].first); auto eit = m.upper_bound(m.begin(), m.end(), a[i].first); for (auto it = sit; it != eit; it++) ans += it->second; m[a[i].second]++; } return ans; }

0

Jimmy Aug 01 '13 at 11:27

source share

 public ArrayList<Interval> merge(ArrayList<Interval> intervals) { ArrayList<Interval> res = new ArrayList<Interval>(); if (intervals == null || intervals.isEmpty()) return res; Comparator<Interval> comparator = new Comparator<Interval>() { @Override public int compare(Interval i1, Interval i2) { if (i1.start < i2.start) return -1; else if (i1.start > i2.start) return 1; else { if (i1.end < i2.end) return -1; else if (i1.end > i2.end) return 1; else return 0; } } }; Collections.sort(intervals, comparator); for (int i = 0; i < intervals.size(); i++) { Interval cur = intervals.get(i); if (res.isEmpty()) { res.add(cur); } else { Interval last = res.get(res.size() - 1); if (last.end >= cur.start) { last.end = Math.max(last.end, cur.end); } else { res.add(cur); } } } return res; }

0

mihirk Mar 22 '14 at 5:20

source share

This is a simple O(N*log(N)) implementation in Python:

 def overlapping(intervals): last = (-1, -1) overlapping = set() for curr in sorted(intervals, key=lambda p: p[0]): if curr[0] < last[1]: overlapping.add(curr) overlapping.add(last) last = max(curr, last, key=lambda p: p[1]) return list(overlapping - set((-1, -1))) print overlapping([(1, 3), (12, 14), (2, 4), (13, 15), (5, 10)]) #=> [(1, 3), (13, 15), (2, 4), (12, 14)]

First, it sorts the intervals by the end time than for each interval, compares the start time with the longest end time, and if it is shorter, this means that there is overlap.

A sorting element is one that takes more time, so the final complexity is N*log(N) .

0

fede1024 May 11 '14 at 15:35

source share

Implementation in C ++ using the sweep algorithm ( O(N log N) ):

 #include <algorithm> #include <iostream> #include <set> #include <tuple> #include <vector> struct Interval { double start; double end; }; //----------------------------------------------------------------------------- // Enum to qualify event: Start/End of "segment-line" enum class EIntervalState { Start, End }; //----------------------------------------------------------------------------- // Events used for collision detection struct Event { double pos; EIntervalState state; std::size_t id; }; //----------------------------------------------------------------------------- // Comparator to use if {1, 2} and {2, 3} should overlap class LessIncludeLimit { public: bool operator()(const Event& lhs, const Event& rhs) const { return std::tie(lhs.pos, lhs.state) < std::tie(rhs.pos, rhs.state); } }; //----------------------------------------------------------------------------- // Comparator to use if {1, 2} and {2, 3} should not overlap class LessExcludeLimit { public: bool operator()(const Event& lhs, const Event& rhs) const { return std::tie(lhs.pos, rhs.state) < std::tie(rhs.pos, lhs.state); } }; //----------------------------------------------------------------------------- std::vector<Event> MakeEvents(const std::vector<Interval>& intervals) { std::vector<Event> res; std::size_t id = 0; for (const auto& interval : intervals) { res.push_back({interval.start, EIntervalState::Start, id}); res.push_back({interval.end, EIntervalState::End, id}); ++id; } return res; } //----------------------------------------------------------------------------- template <typename Less> std::vector<std::pair<std::size_t, std::size_t>> ComputeOverlappingIntervals(std::vector<Event>&& events, Less less) { std::sort(events.begin(), events.end(), less); std::set<std::size_t> activeIds; std::vector<std::pair<std::size_t, std::size_t>> res; for (const auto& event : events) { switch (event.state) { case EIntervalState::Start: { for (const auto& id : activeIds) { res.emplace_back(std::minmax(event.id, id)); } activeIds.emplace(event.id); break; } case EIntervalState::End: { activeIds.erase(event.id); break; } } } return res; }

And use it:

 int main() { const std::vector<Interval> intervals = { {1, 3}, {12, 14}, {2, 4}, {13, 15}, {5, 10} }; for (const auto& p : ComputeOverlappingIntervals(MakeEvents(intervals), LessExcludeLimit{})) { std::cout << "intervals " << p.first << " and " << p.second << " overlap\n"; } }

Demo

0

Jarod42 Jun 22 '17 at 18:30

source share

You can list the list once and save a hash table of all the intervals encountered so far. If the input interval is part of some interval from the hash table, combine it into a hash table. Mark non-primitive intervals (combined intervals that consist of more than one interval).

Then you go through the list a second time and for each interval check the hash table whether it was contained in the combined interval or not.

I do not know if this is O (N), but it is much better than O (N ^ 2).

-one

Ilya Kogan Dec 28 2018-10-12T00: 00Z

source share

marcog · Accepted Answer · 2010-12-28 00:46

Throw the endpoints of the intervals into the array, labeling them as start or end points. Sort them by breaking the tacks, placing the end points in front of the starting points if the intervals are closed, or vice versa if they are half open.

1S, 2S, 3E, 4E, 5S, 10E, 12S, 13S, 14E, 15E

Then iterate over the list, tracking how many intervals we are in (this corresponds to the number of processed start points minus the number of end points). Whenever we press the starting point while we are already in the interval, this means that we must have overlapping intervals.

 1S, 2S, 3E, 4E, 5S, 10E, 12S, 13S, 14E, 15E ^ ^ overlap overlap

We can find which intervals overlap with them, storing data along with the endpoints and keeping track of which gaps we are in.

This solution is O (N logN), and the main factor is sorting.

Possible question for an interview: how to find all overlapping intervals

More articles: