Random access to all combinations of a large list in Python

Background:

I have a list of 44906 elements: large = [1, 60, 17, ...]. I also have a personal computer with limited memory (8 GB) running under Ubuntu 14.04.4 LTS.

Purpose:

I need to find all pairwise combinations of large in a memory-efficient manner, without first filling a list with every combination.

The problem and what I have tried so far:

When I use itertools.combinations(large, 2) and try to assign the result to a list, my memory fills up immediately and performance becomes very slow. The reason is that the number of pairwise combinations is n*(n-1)/2, where n is the number of elements in the list.

For n = 44906, the number of combinations is 44906*44905/2 = 1008251965. A list with that many entries is too large to hold in memory. I would like to write a function that takes a number i and returns the i-th pairwise combination of this list, computing it on the fly without referring to a list of 1008251965 elements that cannot be stored in memory.
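As a quick check of the arithmetic above, here is a minimal sketch. The memory estimate assumes 64-bit CPython, where a list holds one 8-byte pointer per entry; that figure is an assumption about the platform and ignores the tuple objects themselves, which would need considerably more.

```python
n = 44906
pair_count = n * (n - 1) // 2   # number of unordered pairs
print(pair_count)               # 1008251965

# Assumption: one 8-byte pointer per list entry (64-bit CPython).
# The tuples the pointers refer to are not counted here.
pointer_bytes = pair_count * 8
print(pointer_bytes / 2**30)    # size of the pointer array alone, in GiB
```

Under that assumption the pointer array alone already comes to roughly 7.5 GiB, close to the 8 GB available.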

An example of what I'm trying to do:

Say I have an array small = [1,2,3,4,5]

With the code as I have it now, list(itertools.combinations(small, 2)) returns a list of tuples like this:

[(1, 2), # 1st entry
 (1, 3), # 2nd entry
 (1, 4), # 3rd entry
 (1, 5), # 4th entry
 (2, 3), # 5th entry
 (2, 4), # 6th entry 
 (2, 5), # 7th entry
 (3, 4), # 8th entry
 (3, 5), # 9th entry
 (4, 5)] # 10th entry

A function call like find_pair(10) would return:

(4, 5)

that is, the 10th entry of the would-be list, without first triggering the entire combinatorial explosion.

I know that combinations can be consumed lazily, one pair at a time:

>>> from itertools import combinations
>>> it = combinations([1, 2, 3, 4, 5], 2)
>>> next(it)
(1, 2)
>>> next(it)
(1, 3)
>>> next(it)
(1, 4)
>>> next(it)
(1, 5)

However, to reach the 10th entry this way I would have to call next() 10 times, stepping through every entry before the 10th.

- , , ? , , ?


You could use something like this:

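import math

def comb(k):
    # Row r of the triangle starts at linear index r*(r-1)/2.
    row = int((math.sqrt(1 + 8 * k) + 1) / 2)
    # Offset of k within its row.
    column = k - row * (row - 1) // 2
    return [row, column]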

For example:

small = [1, 2, 3, 4, 5]
length = len(small)
size = length * (length - 1) // 2
for i in range(size):
    n, m = comb(i)
    print(i, [n, m], "(", small[n], ",", small[m], ")")

0 [1, 0] ( 2 , 1 )
1 [2, 0] ( 3 , 1 )
2 [2, 1] ( 3 , 2 )
3 [3, 0] ( 4 , 1 )
4 [3, 1] ( 4 , 2 )
5 [3, 2] ( 4 , 3 )
6 [4, 0] ( 5 , 1 )
7 [4, 1] ( 5 , 2 )
8 [4, 2] ( 5 , 3 )
9 [4, 3] ( 5 , 4 )

, , .

, comb .

As @Blckknght points out, to get the pairs in the same order as itertools, reverse the indices:

for i in range(size):
    n, m = comb(size - 1 - i)
    print(i, [n, m], "(", small[length - 1 - n], ",", small[length - 1 - m], ")")


0 [4, 3] ( 1 , 2 )
1 [4, 2] ( 1 , 3 )
2 [4, 1] ( 1 , 4 )
3 [4, 0] ( 1 , 5 )
4 [3, 2] ( 2 , 3 )
5 [3, 1] ( 2 , 4 )
6 [3, 0] ( 2 , 5 )
7 [2, 1] ( 3 , 4 )
8 [2, 0] ( 3 , 5 )
9 [1, 0] ( 4 , 5 )
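To check that the reversed-index mapping above really reproduces the itertools order, the loop can be wrapped in a function and compared against itertools.combinations on a small list. This is a sketch, not the answerer's code: find_pair is a hypothetical helper taking a 0-based index (the question counts from 1), and comb is rewritten with math.isqrt (Python 3.8+, an assumption about the interpreter) so that it uses integer arithmetic only.

```python
import math
from itertools import combinations

def comb(k):
    # Integer-only variant of the mapping: linear index -> (row, column).
    row = (math.isqrt(1 + 8 * k) + 1) // 2
    column = k - row * (row - 1) // 2
    return row, column

def find_pair(seq, i):
    # Hypothetical helper: i is 0-based, in itertools.combinations order.
    length = len(seq)
    size = length * (length - 1) // 2
    n, m = comb(size - 1 - i)
    return seq[length - 1 - n], seq[length - 1 - m]

small = [1, 2, 3, 4, 5]
expected = list(combinations(small, 2))
assert [find_pair(small, i) for i in range(len(expected))] == expected
print(find_pair(small, 9))  # (4, 5)
```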

itertools.combinations returns a lazy iterator, not a list. For example:

>>> from itertools import combinations
>>> it = combinations([1, 2, 3, 4, 5], 2)
>>> next(it)
(1, 2)
>>> next(it)
(1, 3)
>>> next(it)
(1, 4)
>>> next(it)
(1, 5)
>>> next(it)
(2, 3)
>>> next(it)
(2, 4)

.. : .

, , n'th, ( ), , combinations() (.. , )?
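One way to fetch a single pair from the iterator without storing anything is itertools.islice. The sketch below uses a hypothetical helper name and a 0-based index. It keeps memory constant, but it still steps through every preceding pair, so it is a baseline rather than true random access.

```python
from itertools import combinations, islice

def nth_pair_by_iteration(seq, n):
    # Hypothetical helper: skip the first n pairs (n is 0-based), return the next one.
    # Constant memory, but the running time grows linearly with n.
    return next(islice(combinations(seq, 2), n, None))

print(nth_pair_by_iteration([1, 2, 3, 4, 5], 9))  # (4, 5)
```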


Here you need to go from k to row and col, which is the reverse of computing k from row and col.

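For a list of length N, let

b = 2*N - 1

Then the k-th pair (counting k from 0) is given by

import math
row = int((b - math.sqrt(b*b - 8*k)) // 2)
col = k - (2*N - row - 1)*row // 2
kth_pair = (large[row], large[row + 1 + col])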

, .
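A formula of this shape can be derived as follows. The derivation assumes that k and the rows are counted from 0 and that the pairs are ordered as in itertools.combinations; neither assumption is stated explicitly in the answer. Under that convention row r contains N-1-r pairs, and the row index is the floor of the smaller root of a quadratic in r.

```latex
% Assumptions: k is 0-based, rows are 0-based, order as in itertools.combinations.
\begin{align*}
S(r) &= \sum_{j=0}^{r-1} (N-1-j) = \frac{r\,(2N-r-1)}{2}
  && \text{(index of the first pair in row } r\text{)} \\
S(r) \le k &\iff r^2 - b\,r + 2k \ge 0, \qquad b = 2N-1 \\
\text{row} &= \left\lfloor \frac{b - \sqrt{b^2 - 8k}}{2} \right\rfloor \\
\text{col} &= k - S(\text{row}), \qquad
\text{pair} = \bigl(\text{large}[\text{row}],\ \text{large}[\text{row}+1+\text{col}]\bigr)
\end{align*}
```

For N = 5 and k = 9 this gives b = 9, row = 3, col = 0, that is, the pair (large[3], large[4]).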


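There are 44906 elements in the list. The first 44905 pairs all contain large[0], so for an index i with 1 <= i <= 44905 the pair is (large[0], large[i]).

For 44905 < i <= 89809 the pair is (large[1], large[i-44904]).

In general, the pair is (large[j], large[i - (exclusive lower bound for j) + j]). The lower bounds (0 for j = 0, 44905 for j = 1, and so on) are running sums: 44905, 44905 + 44904, 44905 + 44904 + 44903, ...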
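The running sums described above can be precomputed once (one integer per element, so about 45 thousand integers for this list) and searched with bisect. The following is a sketch under stated assumptions: build_offsets and pair_at are hypothetical names, and the index i is 0-based here, unlike the 1-based i in the text.

```python
from bisect import bisect_right
from itertools import accumulate

def build_offsets(length):
    # offsets[j] = number of pairs whose first element sits before position j.
    # For length 44906 this is 0, 44905, 44905 + 44904, ...
    return [0] + list(accumulate(range(length - 1, 0, -1)))

def pair_at(seq, offsets, i):
    # Hypothetical helper; i is 0-based, in itertools.combinations order.
    if not 0 <= i < offsets[-1]:
        raise IndexError("pair index out of range")
    j = bisect_right(offsets, i) - 1          # index of the first element
    return seq[j], seq[j + 1 + i - offsets[j]]

small = [1, 2, 3, 4, 5]
offsets = build_offsets(len(small))           # [0, 4, 7, 9, 10]
print(pair_at(small, offsets, 9))             # (4, 5)
```

Each lookup costs one binary search over the offsets, so it is logarithmic in the list length rather than constant.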


For a well-defined ordering of the generated pairs, the indices of the first and second elements must be functions of n and the length of the sequence. Once you find those functions, you get constant-time lookup, since list indexing is O(1).

The pseudocode will look like this:

def find_nth_pair(seq, n):
    idx1 = f1(n, len(seq))  # some formula of n and len(seq)
    idx2 = f2(n, len(seq))  # some formula of n and len(seq)
    return (seq[idx1], seq[idx2])

You need to find the formulas for idx1 and idx2.
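One possible pair of formulas is sketched below. It assumes a 0-based n and the itertools.combinations ordering (both assumptions, since the pseudocode leaves the ordering open) and uses math.isqrt, which requires Python 3.8 or later, to avoid floating-point rounding.

```python
import math

def find_nth_pair(seq, n):
    # Assumption: n is 0-based and follows itertools.combinations(seq, 2) order.
    length = len(seq)
    total = length * (length - 1) // 2
    if not 0 <= n < total:
        raise IndexError("pair index out of range")
    b = 2 * length - 1
    d = b * b - 8 * n                  # at least 9 for every valid n
    ceil_sqrt = math.isqrt(d - 1) + 1  # smallest integer >= sqrt(d)
    idx1 = (b - ceil_sqrt) // 2        # floor((b - sqrt(d)) / 2)
    row_start = idx1 * (2 * length - idx1 - 1) // 2
    idx2 = idx1 + 1 + (n - row_start)
    return (seq[idx1], seq[idx2])

print(find_nth_pair([1, 2, 3, 4, 5], 9))  # (4, 5)
```

Because only integer arithmetic is involved, the same function works unchanged for the 44906-element list, for example find_nth_pair(large, 1008251964) for the last pair.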
