Effective enumeration of ordered subsets in Python

I am not sure of the proper mathematical terminology for the code I'm trying to write. I would like to generate combinations of unique integers where the "ordered subsets" of each combination are used to exclude some later combinations.

Hope the example will be clear:

from itertools import chain, combinations ​ mylist = range(4) max_depth = 3 rev = chain.from_iterable(combinations(mylist, i) for i in xrange(max_depth, 0, -1)) for el in list(rev): print el 

This code outputs an output containing all the subsets I want, but also some additional ones that I don't have. I manually inserted comments to indicate which items I do not want.

 (0, 1, 2) (0, 1, 3) (0, 2, 3) (1, 2, 3) (0, 1) # Exclude: (0, 1, _) occurs as part of (0, 1, 2) above (0, 2) # Exclude: (0, 2, _) occurs above (0, 3) # Keep (1, 2) # Exclude: (1, 2, _) occurs above (1, 3) # Keep: (_, 1, 3) occurs above, but (1, 3, _) does not (2, 3) # Keep (0,) # Exclude: (0, _, _) occurs above (1,) # Exclude: (1, _, _) occurs above (2,) # Exclude: (2, _) occurs above (3,) # Keep 

So the desired output of my generator or iterator would be:

 (0, 1, 2) (0, 1, 3) (0, 2, 3) (1, 2, 3) (0, 3) (1, 3) (2, 3) (3,) 

I know that I can make a list of all (desired and undesirable) combinations, and then filter out the ones that I don't need, but I was wondering if there was a more efficient, generator or iterative method.

+6
source share
2 answers

You are trying to exclude any combination that is a prefix of a previously returned combination. It's simple.

  • If the tuple t has a length max_depth , it cannot be the prefix of the previously returned tuple, since any tuple its prefix must be longer.
  • If the tuple t ends with mylist[-1] , it cannot be the prefix of the previously returned tuple, since there are no elements that can be legitimately added to the end of t to expand it.
  • If the tuple t has a length less than max_depth and does not end with mylist[-1] , then t is the prefix of the previously returned tuple t + (mylist[-1],) , and t not returned.

Thus, the combinations you must generate exactly match the max_depth and shorter lengths that end with mylist[-1] . The following code does this in exactly the same order as your source code, and handles cases like maxdepth > len(mylist) :

 def nonprefix_combinations(iterable, maxlen): iterable = list(iterable) if not (iterable and maxlen): return for comb in combinations(iterable, maxlen): yield comb for length in xrange(maxlen-2, -1, -1): for comb in combinations(iterable[:-1], length): yield comb + (iterable[-1],) 

(I assumed that in the case where maxdepth == 0 , you still do not want to include empty tuples in your output, although for maxdepth == 0 it is not a prefix of the previous version, Returned tuple. If you need an empty tuple in this case , you can change if not (iterable and maxlen) to if not iterable .)

+2
source

I noticed an interesting pattern in your desired output, and I have a generator that produces this. Does this work for all of your cases?

 from itertools import combinations def orderedSetCombination(iterable, r): # Get the last element of the iterable last = (iterable[-1], ) # yield all the combinations of the iterable without the # last element for iter in combinations(iterable[:-1], r): yield iter # while r > 1 reduce r by 1 and yield all the combinations while r>1: r -= 1 for iter in combinations(iterable[:-1], r): yield iter+last # yield the last item yield last iter = [0,1,2,3] for el in (list(orderedSetCombination(iter, 3))): print(el) 

Here is my explanation of the logic:

 # All combinations that does not include the last element of the iterable # taking r = max_depth items at a time (0,1,2) # from here on, its the combinations of all the elements except # the last element and the last element is added to it. # so here taking r = r -1 items at a time and adding the last element # combinations([0,1,2], r=2) (0,1,3) (0,2,3) (1,2,3) # the only possible value right now at index r = 2 is the last element (3) # since all possible values of (0,1,_) (0,2,_) (1,2,_) are already listed # So reduce r by 1 again and continue: combinations([0,1,2], r=1) (0, 3) (1, 3) (2, 3) # continue until r == 0 and then yield the last element (3,) 
+1
source

All Articles