The most efficient way to generate "ordered subsets" of a sequence

Question

The most efficient way to generate "ordered subsets" of a sequence

I need to generate all the "ordered subsets" (apologies if I do not use the correct mathematical terminology) of the sequence in Python, while the omitted elements are replaced by None .Given [1, 2] , I want [(1, 2), (1, None), (None, 2), (None, None)] . Each "ordered subset" must have the property that in each position it is either the same element as in the sequence of seeds, or it is None .

I can quite easily generate subsets with missing elements that are missing in the following:

 from itertools import combinations for length in xrange(len(items), 0, -1): for combination in combinations(items, length): yield combination

I can’t figure out what is the most effective way to repair missing items. My first thought is to do something like this:

 from itertools import combinations indexes = range(len(items)) for length in xrange(len(items), 0, -1): for combination in combinations(indexes, length): yield tuple(items[i] if i in combination else None for i in indexes)

Just wondering if anyone can spot any obvious flaws in this, or if there is a more efficient solution that I missed. (Note that items will be a fairly short list, usually under 10 items, so I'm not interested in finding O (N) "combinations" in the inner loop).

+7

python

dcrosta Aug 17 '12 at 20:53

source share

4 answers

You can start with an empty list, for each element of your seed you can copy all the final lists and add the seed to the end.

eg.

 solutions = [] solutions.append([]) for elem in seed: newPartials = [] for partial in solutions: newPartial = partial[:] newPartial.append(elem) newPartials.append(newPartial) solutions.extend(newPartials)

or you could create the number of possible solutions, 2^n , where n is the length of the list of seeds and the use of modular arithmetic, delete elements, for example:

 solutions = [] for i in xrange(2**n): solutions.append(seed[:]) seedLen = len(seed) for i in xrange(2**(n-1)): // % 0 case of following loop solutions[i].pop(0) for elemLoc in xrange(1,seedLen): for solutionNum in xrange(2**n): if solutionNum % elemLoc = 0: solutions[solutionNum].pop(elemLoc)

This solution is rather inefficient, I basically turned it on because it is an interesting way to solve the problem.

+2

rsegal Aug 17 '12 at 21:15

source share

Alternative - in case you want to show how stupid you are NOT to use itertools:

 >>> given=[1,2] >>> gz=zip(given,[None]*len(given)) >>> [(i,j) for i in gz[0] for j in gz[1]] [(1, 2), (1, None), (None, 2), (None, None)]

+2

the wolf Aug 18 '12 at 0:45

source share

Here is another one:

 >>> given=[1,2,3,4] >>> rtr=[[]] >>> for t in map(None,*(given,[None])): ... rtr=[x+[y] for x in rtr for y in t] ... >>> rtr [[1, 2, 3, 4], [1, 2, 3, None], [1, 2, None, 4], [1, 2, None, None], [1, None, 3, 4], [1, None, 3, None], [1, None, None, 4], [1, None, None, None], [None, 2, 3, 4], [None, 2, 3, None], [None, 2, None, 4], [None, 2, None, None], [None, None, 3, 4], [None, None, 3, None], [None, None, None, 4], [None, None, None, None]]

+2

dawg Aug 18 '12 at 1:51

source share

Joel cornett · Accepted Answer · 2012-08-17T20:57:14+0000

 from itertools import product, repeat given = [1, 2] with_nones = zip(given, repeat(None)) print(list(product(*with_nones)))

The most efficient way to generate "ordered subsets" of a sequence

More articles: