Java algorithm for finding intersection between intervals

I have such a time interval:

[5,10] 

and I have more list of time points with different lengths, for example:

 t1=[3,6,9,10] t2=[2,4,5,6,10] .. 

where in t1 [3,6] is the first interval, [6,9] second, etc.

The same goes for t2 and another list.

Now I need to save a list and a specific interval that intersect with the first time interval. For example, in t1 , I have [3,6] that intersect at [5,10], [6,9] that intersect with [5,10] , etc.

I made an algorithm, but I work with a lot of data, and I need a fast algorithm. For example, if I work with 300,000 lists, and each list has 200 times, my algorithm is 1 fine in about 5-10 seconds. But if I have 10,000 or more time, the algorithm is very slow.

My algorithm is as follows:

 First time interval <- [t1,t2] For each list For each time interval [s1,s2] in list if(s1>= t1 && s2 <= t2) { saveIntervall() } else if (s1<= t2 && s2 > t2) { saveIntervall() } else if(s1 < t1 && s2 >= t1) { saveIntervall() } else if (s1 < t1 && s2 > t2) { saveIntervall() } 

Edit1: Yes I ordered the list.

Edit2: I have another problem, when I find the interpolation, I need to calculate the score between 0 and 1 of how big the intersection is. I use this:

  double score = 0.0d; if(s1>= t1 && s2 <= t2) { score = (s2-s1) / (t2-t1); } else if(t2 >= s2 && t1 > s1) { score = (s2-t1) / (t2-t1); } else if(t2 < s2 && t1 <= s1) { score = (t2-s1) / (t2-t1); } else { score = 1; } 

Can I speed it too?

+8
java algorithm
source share
3 answers

First of all, your data structure is confusing - if you are trying to talk about discrete time intervals, structure your data like this; for example int[][] , where the internal array always has a length of 2, so your t1 becomes:

 int[][] t1 = {{3,6}, {6,9}, {9,10}}; 

Using the right structure will probably help you simplify your algorithm and make it easier to work with.


However, it would be better than properly structured arrays to use a dedicated type to represent these intervals, so that you can walk around List<Interval> objects and perform any checks on them. But do not reinvent the wheel. The amazing Guava library provides a robust Range class that you can use. However, it also provides RangeSet and RangeMap classes that allow you to easily do what you are talking about. See also the Ranges Explained article, which covers the basics.

Please note that you can easily convert your current project to Range objects inside if you cannot reverse engineer the array structure from the outside.

Having tried to create your own IntervalSet class at some point, let me tell you that this is a complex problem and you can save a lot of headaches by using their well-designed and highly detailed range utilities.

Here's how I will do what you describe using Guava - note that we avoid even having to think about the math involved - Range does the right thing for us:

 /** * Given a Range and an group of other Ranges, identify the set of ranges in * the group which overlap with the first range. Note this returns a Set<Range> * not a RangeSet, because we don't want to collapse connected ranges together. */ public static <T extends Comparable<?>> Set<Range<T>> getIntersectingRanges(Range<T> intersects, Iterable<Range<T>> ranges) { ImmutableSet.Builder<Range<T>> builder = ImmutableSet.builder(); for(Range<T> r : ranges) { if(r.isConnected(intersects) && !r.intersection(intersects).isEmpty()) { builder.add(r); } } return builder.build(); } /** * Given a 2-length array representing a closed integer range, and an array of * discrete instances (each pair of which therefore represents a closed range) * return the set of ranges overlapping the first range. * Example: the instances array [1,2,3,4] maps to the ranges [1,2],[2,3],[3,4]. */ public static Set<Range<Integer>> getIntersectingContinuousRanges(int[] intersects, int[] instances) { Preconditions.checkArgument(intersects.length == 2); Preconditions.checkArgument(instances.length >= 2); ImmutableList.Builder<Range<Integer>> builder = ImmutableList.builder(); for(int i = 0; i < instances.length-1; i++) { builder.add(Range.closed(instances[i], instances[i+1])); } return getIntersectingRanges(Range.closed(intersects[0], intersects[1]), builder.build()); } 

Using your examples:

 public static void main(String[] args) { int[] interval = {5,10}; int[] t1 = {3,6,9,10}; int[] t2 = {2,4,5,6,10}; System.out.println(getIntersectingContinuousRanges(interval, t1)); System.out.println(getIntersectingContinuousRanges(interval, t2)); } 

Above above:

 [[3β€₯6], [6β€₯9], [9β€₯10]] [[4β€₯5], [5β€₯6], [6β€₯10]] 
+1
source share

The intersection of two intervals [s1, s2] and [t1, t2] is empty if and only if :

  t2 < s1 or s2 < t1 

So, for two intervals, in order to check whether these two intersect or not, you only need to perform the above test.

Also, as soon as you find out that s2 <t1, then it makes no sense to go further in the list that cited t1, since large intervals never intersect, which means that you must move on.

Naive Psuedo algorithm:

  given [s1, s2] for each list [t1, t2, ... t(n)] in search_lists for each interval [t(x), t(x+1)] from [t1, t2, ... t(n] (x goes from 0 to n-1) if t(x+1) < s1 continue if s2 < t(x) break saveInterval() 

This can be improved quite a bit to really use the fact that [t1, t2, .., t (n)] is sorted.

first of all, note that [s1, s2] will intersect with [t(x), t(x+1)] iff t(x+1) >= s1 and s2 >= t(x)

but

 if t(x) >= s1 then for every h>0 `t(x+h) >= s1` 

also

 if s2 >= t(x) then for every h>0 `s2 >= t(xh)` 

therefore, if we find the smallest i, so that t (i + 1)> = s1, then all intervals from [t(i), t(i+1)] occur on the first intersection condition; those. ( [t(i+1), t(i+2)] , [t(i+2), t(i+3)] ...)

and if we find the largest j, so that s2> = t (j-1), then all the intervals from [t(j-1), t(j)] back will correspond to the second condition. those. ( [t(j-2), t(j-1)] , [t(j-3), t(j-2)] ...)

All intervals between i and j satisfy both criteria and only them.

So, the final algorithm:

 given [s1, s2] for each list [t1, t2, ... t(n)] in search_lists find the smallest i such that t(i+1)>=s1 find the biggest j such that s2>= t(j-1) if j>i then all the intervals between `{t(i)... t(j)}` intersect with [s1, s2] otherwise there is no intersection. 

Since {t1, t2, t3...t(n)} sorted, we can use binary search to efficiently find indices i and j

EDIT2:

Intersection of [s1, s2] and [t1, t2]: [max(s1, t1), min(s2,t2)]

dimensions: L1 = s2-s1 L2 = t2-t1 L3 = min(s2,t2) - max(s1,t1)

The rating you are looking for: L3/ min(L2, L1) rating from 0 to 1.

 (min(s2,t2) - max(s1,t1)) / ( min(s2-s1, t2-t1) ) 

The calculation cost is 3 tests, 3 minus operations and one floating point operation. But I assume that the intervals are valid and the intersection exists, otherwise additional tests are needed. ( s2>s2 , t2>t1 and min(s2,t2) > max(s1,t1) . The final test is the same iff condition for the intersection from the discussion above.

+5
source share

Given the left border and the length of the two intervals, the intersection can be checked using the following code.

 protected boolean intervalOverlap( int pos1, int length1, int pos2, int length2 ){ int pos_A_left = pos1; int pos_A_right = pos1 + length1; int pos_B_left = pos2; int pos_B_right = pos2 + length2; return pos_B_left < pos_A_right && pos_A_left < pos_B_right; } 

There is a short article discussing this approach. In addition, an alternative representation of intervals is presented (using the center and length) for which the intersection test can be implemented more efficiently.

0
source share

All Articles