Request for sorted intervals

I am looking for a data structure that works efficiently at closed intervals with the following properties:

  • dynamically add or remove an interval

  • set and at any time change the number ("depth") for each interval. none of the two depths is the same.

  • find all intervals that overlap with any given interval, sorted by depth

The closest structure I found is the Interval Tree , but it lists the found intervals in an arbitrary order relative to their depths. I could collect all the β€œunsorted” intervals, as reported, and sort them later, but I was sure that you could avoid sorting the result for each query.

Please, does anyone know about such a data structure or have any suggestion on how (if at all possible) to extend the interval tree to support such sorting?

Example:

  • add [1,2] to the empty structure and set its depth to 1
  • add [10,100], depth = 2
  • add [5.55], depth = 3
  • for [5.50] reports [10,100] and [5.55]
  • set the depth from [10,100] to 3 and from [5.55] to 2
  • for [5.50] reports [5.55] and [10,100]

Edit

I'm more interested in quick add / removes and queries than when updating depths. Depth can take as much as O (n) if this helps speed up other operations.

+7
algorithm data-structures interval-tree
source share
4 answers

Suppose the algorithm you want exists. Then create a set of a million intervals, each of which [1, 1] with random depths and inserts them into such an interval tree. Then request the interval [1, 1] . It should return all intervals in sorted order with complexity O(M + log N) , but N = 1 , so we sort the set of elements of M in linear time.

In other words, sorting the elements by depth after they are obtained from the interval tree is also good from the point of view of complexity, since this is theoretically possible.

+4
source share

Depth as it is set is equivalent to the position of the intervals in their imaginary list . So, the usual list of pairs of numbers is enough. A list can easily add, delete, or toggle your items.

If you also need to find the depth for a given interval, make a function for it (you did not indicate the need, however)

+1
source share

Here's my solution in Java using TreeMap , which is basically a Binary-Tree

Test: http://ideone.com/f0OlHI

Complexity

  Insert : 2 * O(log n) Remove : 2 * O(log n) Search : 1 * O(log n) ChangeDepth : 7 * O(log n) findOverlap : O(n) 

IntervalDataSet.java

 class IntervalDataSet { private TreeMap<Integer,Interval> map; public IntervalDataSet () { map = new TreeMap<Integer,Interval> (); } public void print () { for(Map.Entry<Integer,Interval> entry : map.entrySet()) { Integer key = entry.getKey(); Interval value = entry.getValue(); System.out.println(key+" => ["+value.min+","+value.max+"] "); } } public boolean changeDepth (int depth, int newDepth) { if (!map.containsKey(depth)) return false; if (map.containsKey(newDepth)) return false; Interval in = map.get(depth); in.depth = newDepth; remove(depth); insert(in); return true; } public boolean insert (Interval in) { if (in == null) return false; if (map.containsKey(in.depth)) return false; map.put(in.depth, in); return true; } public boolean remove (int depth) { if (!map.containsKey(depth)) return false; map.remove(depth); return true; } public Interval get (int depth) { return map.get(depth); } public void print (int depth) { if (!map.containsKey(depth)) System.out.println(depth+" => X "); else map.get(depth).print(); } public void printOverlappingIntervals (Interval in) { for (Interval interval : map.values()) if (interval.intersect(in)) interval.print(); } public ArrayList<Interval> getOverlappingIntervals (Interval in) { ArrayList<Interval> list = new ArrayList<Interval>(); for (Interval interval : map.values()) if (interval.intersect(in)) list.add(interval); return list; } public int size () { return map.size(); } } 

Interval.java

 class Interval { public int min; public int max; public int depth; public Interval (int min, int max, int depth) { this.min = min; this.max = max; this.depth = depth; } public boolean intersect (Interval b) { return (b != null && ((this.min >= b.min && this.min <= b.max) || (this.max >= b.min && this.max <= b.max)) ); } public void print () { System.out.println(depth+" => ["+min+","+max+"] "); } } 

Test.java

 class Test { public static void main(String[] args) { System.out.println("Test Start!"); System.out.println("--------------"); IntervalDataSet data = new IntervalDataSet (); data.insert(new Interval( 1,3, 0 )); data.insert(new Interval( 2,4, 1 )); data.insert(new Interval( 3,5, 3 )); data.insert(new Interval( 4,6, 4 )); System.out.println("initial values"); data.print(); System.out.println("--------------"); System.out.println("Intervals overlapping [2,3]"); data.printOverlappingIntervals(new Interval( 2,3, -1 )); System.out.println("--------------"); System.out.println("change depth 0 to 2"); data.changeDepth( 0, 2 ); data.print(); System.out.println("--------------"); System.out.println("remove depth 4"); data.remove( 4 ); data.print(); System.out.println("--------------"); System.out.println("change depth 1 to 4"); data.changeDepth( 1, 4 ); data.print(); System.out.println("--------------"); System.out.println("Test End!"); } } 

IntervalDataSet2

Complexity

 initialization : O(n) findOverlap : 2 * O(log n) + T(merge) 

 class IntervalDataSet2 { private Integer [] key; private TreeMap<Integer,Interval> [] val; private int min, max, size; public IntervalDataSet2 (Collection<Interval> init) { TreeMap<Integer,TreeMap<Integer,Interval>> map = new TreeSet<Integer,TreeMap<Integer,Interval>> (); for (Interval in : init) { if (!map.containsKey(in.min)) map.put(in.min, new TreeMap<Integer,Interval> ()); map.get(in.min).put(in.depth,in); if (!map.containsKey(in.max)) map.put(in.max, new TreeMap<Integer,Interval> ()); map.get(in.max).put(in.depth,in); } key = new Integer [map.size()]; val = new TreeMap<Integer,Interval> [map.size()]; int i = 0; for (Integer value : map.keySet()) { key [i] = value; val [i] = map.get(value); i++ ; } this.size = map.size(); this.min = key [0]; this.max = key [size-1]; } private int binarySearch (int value, int a, int b) { if (a == b) return a; if (key[(a+b)/2] == value) return ((a+b)/2); if (key[(a+b)/2] < value) return binarySearch(value, ((a+b)/2)+1, b); else return binarySearch(value, (a, (a+b)/2)-1); } public TreeMap<Integer,Interval> findOverlap (Interval in) { TreeMap<Integer,Interval> solution = new TreeMap<Integer,Interval> (); int alpha = in.min; int beta = in.max; if (alpha > this.max || beta < this.min) return solution; int i = binarySearch(alpha, 0,(size-1)); int j = binarySearch(beta, 0,(size-1)); while (alpha <= beta && key[i] < alpha) i++; while (alpha <= beta && key[j] > beta) j--; for (int k = i; k <= j; k++) solution.addAll ( val[k] ); return solution; } } 
+1
source share

In the end, it's pretty hard to think about these issues. What you actually have is a one-dimensional space acting on a line. By adding depth, you get the second coordinate. By requesting a unique depth, you can actually draw a picture.

Your request should cross each interval (line) of your image with a rectangle from x1 to x2 and every y in Y.

So a naive view of this problem is to compare each line in y order if it intersects. Since you need all the results, it takes O (n) to find the answer. And this answer should also be sorted by depth in O (m log m).

You are trying to go with a one-dimensional R-tree. Allows you to define areas on the go. Within such a structure, each node spans a region from min to max. Now you can divide this region further. Since there are gaps in both sections, since the separation actually separates it, they are stored in the node (and not in the child nodes). Within these child nodes, you again get such a list, and so on.

To search, you check all nodes and child nodes whose regions intersect with the search interval.

In each node, the list of intervals is sorted by their depth values.

Thus, the problem boils down to a series of sorted lists (of all the intervalls that may be contained within your search interval). These lists should now be filtered for each interval that really crosses your search interval. But you do not need to filter everything. Since if such a node interval is completely contained within your search interval, all its intervals and all its child intervals really intersect with the search interval (since they are completely contained inside).

To combine all of these lists, you simply use a combination of compounds in which you select the next element of this association, which has the smallest depth. You sort all of these lists by the depth of your first item (the item with the smallest depth in each list). Now you are looking for the first element and move it to the result. Now you compare the next item in the list with the first item in the next list, and if the depth is even less, you copy it. If the depth becomes greater, you simply sort it with the correct position, taking log k (where k is the number of non-empty lists in your collection) and continue with the first list and repeat it until all the lists are empty or go through work since You keep the cursor position for each list.

This way you only sort the lists and compare them with the next element if it is even smaller or inserts it.

This is the best structure that I can think of. at first you easily exclude almost all intercepts that cannot intersect with it. How do you compose a result based on lists of potentials in which you have lists that, as you know, are part of the result in full (it can be argued that you need to check only a certain number of jump lists, since the trees diverge very quickly). By controlling the separation strategy, you control the value of each piece. If, for example, you split only at> 10 intervals, you will provide k

In the worst case, this is bad, but I assume that the expected actual performance will be good and better than O (n + m log m), since you are already using various sorted lists of potential intersections.

And remember that if a node has children, it itself contains a list of all the intervals that intersect with the split point. Therefore, if your search interval also intersects with the split point, each interval of this node is also part of your result.

Therefore, feel free to experiment with this structure. This should be easy to implement in one day.

0
source share

All Articles