Quick search in the list of intervals

Question

Quick search in the list of intervals

I have a list of starting positions with ~ 280,000 items. Fully covers about 73 million positions.

For performance reasons, I have already decomposed them into parts in a dictionary (by the coefficient of the subset), which, in turn, contains a list of tuples (beginning, end).

Finally, I get a list of positions that I want to check if they are located in regions stretched to the beginning and end.

posit = (start,end)
dict[subset].append(posit)

for position in dict[subset]:
    if posit[0] < varpos < posit[1]:
    # do some stuff here

These searches currently take a lot of time. But due to memory reasons, I also do not want to create a faster set containing all the positions between start and stop.

Do you have any pointers to creating a quick start structure, end position or better search strategy?

+4

python

sargas Jul 20 '14 at 13:23

source share

1

Nanook · Answer 1 · 2015-06-12T20:20:43+0000

, , 280000 . - . findRange.

, 280000 . 1000 "PositionMatches" findRange .

7.260579 100 'PositionMatches' 71.96268 1000 'PositionMatches'.

import random
import time

values = list()
for a in range(0,73000000,250) :
    values.append([a, a+200])

possiblePositionMatches = list()
count = 1000
while count:
    count = count - 1
    possiblePositionMatches.append(random.randint(0,73000000))

matches = []

def findRange(value) :
    for x in range(len(values)) :
        if (value >= values[x][0]) and (value < values[x][1]) :
            matches.append([value, values[x]])

def main():
    t1 = time.process_time()
    for y in possiblePositionMatches: 
        findRange(y)
    print (matches)
    t2 = time.process_time() - t1
    print("Total Time: {0} seconds".format(t2))

main()

Quick search in the list of intervals

More articles: