
Random chain from itertools

I am copying an example from the Python docs:

```python
from itertools import chain, combinations

def powerset(iterable):
    "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))
```

How can we randomize the order of the values we get, while the powerset result remains lazily evaluated?

EDIT: The reason I want this is that I would like to compute the sums of the derived sets and stop as soon as I find two sets with the same sum. If I am not mistaken, the problem is NP-complete.
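The early-stopping part of the question can be sketched independently of how the powerset is randomized. Here is a minimal sketch (the names `lazy_powerset` and `first_equal_sum_pair` are my own, not from the question; the plain ordered docs recipe stands in for whichever lazy generator you end up using):

```python
from itertools import chain, combinations

def lazy_powerset(seq):
    # any lazy powerset generator works here; the ordered docs recipe
    # is shown only for brevity
    return chain.from_iterable(combinations(seq, r) for r in range(len(seq) + 1))

def first_equal_sum_pair(seq):
    # lazily walk the powerset and stop as soon as two subsets share a sum
    seen = {}  # sum -> first subset seen with that sum
    for subset in lazy_powerset(seq):
        total = sum(subset)
        if total in seen:
            return seen[total], subset
        seen[total] = subset
    return None  # all 2**len(seq) subset sums are distinct

print(first_equal_sum_pair([1, 2, 3]))  # ((3,), (1, 2))
```

Since the generator is lazy, the search touches only as many subsets as it takes to find the first collision.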

4 answers

itertools.combinations() yields its results in the order given by the input. Given that, we can shuffle the input list to get a random order of elements (obviously, far fewer orderings of the result are possible this way).

```python
from itertools import chain, combinations
from random import shuffle

def random_powerset(iterable):
    s = list(iterable)
    lengths = list(range(len(s) + 1))
    shuffle(lengths)
    return chain.from_iterable(
        combinations(s, r) for r in lengths if not shuffle(s)
    )
```

(This is a bit of an ugly hack: we know that shuffle(s) always returns None, which is falsy, so adding it as a condition ensures it runs before each combinations() call.)

We pre-generate the list of lengths so that we can shuffle it as well.

This is not entirely random (there is still order: all subsets of length n are clustered together, for example, and within a cluster the order depends on the random order of the input), but there is a good deal of randomness, if that is enough for you.

Output Example:

```python
>>> list(random_powerset(range(3)))
[(), (2,), (0,), (1,), (2, 1), (2, 0), (1, 0), (1, 2, 0)]
>>> list(random_powerset(range(3)))
[(), (0, 1), (0, 2), (1, 2), (0, 1, 2), (2,), (0,), (1,)]
>>> list(random_powerset(range(3)))
[(0, 1, 2), (2,), (1,), (0,), (0, 2), (0, 1), (2, 1), ()]
>>> list(random_powerset(range(3)))
[(1, 2, 0), (0,), (2,), (1,), (), (0, 1), (0, 2), (1, 2)]
>>> list(random_powerset(range(3)))
[(), (2, 1), (2, 0), (1, 0), (0,), (2,), (1,), (2, 1, 0)]
>>> list(random_powerset(range(3)))
[(1, 0), (1, 2), (0, 2), (0, 2, 1), (), (1,), (0,), (2,)]
```

I think this is the best you can do without giving up laziness.


Here's another idea: store the combination generators and draw from them at random until they are all consumed. This also randomizes the order across set sizes.

Edit: I assume you do not care about the order of elements within the same set, since you sum them. If you do, you can apply random.shuffle to next_value before yielding it (note that next_value is a tuple, so convert it to a list first).

```python
import itertools
import random

def random_powerset(l):
    combs = [itertools.combinations(l, i) for i in range(len(l) + 1)]
    while combs:
        comb_index = random.randrange(len(combs))
        try:
            next_value = next(combs[comb_index])
            yield next_value
        except StopIteration:
            combs.pop(comb_index)
```

Output:

```python
In : list(random_powerset(range(3)))
Out: [(0, 1), (0, 2), (0, 1, 2), (1, 2), (), (0,), (1,), (2,)]
In : list(random_powerset(range(3)))
Out: [(0, 1, 2), (0,), (), (0, 1), (1,), (0, 2), (1, 2), (2,)]
In : list(random_powerset(range(3)))
Out: [(0, 1), (0, 1, 2), (0, 2), (), (0,), (1,), (1, 2), (2,)]
In : list(random_powerset(range(3)))
Out: [(), (0,), (0, 1), (0, 1, 2), (1,), (0, 2), (2,), (1, 2)]
In : list(random_powerset(range(3)))
Out: [(), (0, 1), (0,), (0, 1, 2), (1,), (0, 2), (2,), (1, 2)]
In : list(random_powerset(range(3)))
Out: [(0, 1), (0,), (0, 2), (1, 2), (), (1,), (2,), (0, 1, 2)]
In : list(random_powerset(range(3)))
Out: [(), (0, 1, 2), (0,), (1,), (2,), (0, 1), (0, 2), (1, 2)]
```
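The edit about shuffling element order can be folded into the generator-pool approach. A sketch (the name `random_powerset_shuffled` is mine, not from the answer):

```python
import itertools
import random

def random_powerset_shuffled(l):
    # same generator-pool idea, but also randomizes the order of elements
    # inside each yielded set; random.shuffle works in place on lists only,
    # and combinations yields tuples, so copy to a list first
    combs = [itertools.combinations(l, i) for i in range(len(l) + 1)]
    while combs:
        i = random.randrange(len(combs))
        try:
            value = list(next(combs[i]))
            random.shuffle(value)
            yield tuple(value)
        except StopIteration:
            combs.pop(i)
```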

You can improve Lattyware's solution a little if you replace itertools.chain with your own:

```python
from itertools import combinations
from random import shuffle

def chain_random(iterables):
    iterables = list(iterables)
    icount = len(iterables)
    if icount == 0:
        return
    while icount > 1:
        shuffle(iterables)
        try:
            yield next(iterables[-1])
        except StopIteration:
            iterables.pop()
            icount -= 1
    for element in iterables[0]:
        yield element

def random_powerset(iterable):
    s = list(iterable)
    lengths = list(range(len(s) + 1))
    shuffle(lengths)
    return chain_random(combinations(s, r) for r in lengths if not shuffle(s))
```

Output Example:

```python
>>> list(random_powerset(range(3)))
[(), (2, 1, 0), (1, 0), (1, 2), (2,), (0, 2), (1,), (0,)]
>>> list(random_powerset(range(3)))
[(1, 0), (1, 2), (0, 2, 1), (2,), (), (0, 2), (0,), (1,)]
>>> list(random_powerset(range(3)))
[(0, 1), (), (0, 2), (0,), (1, 2), (2, 0, 1), (1,), (2,)]
>>> list(random_powerset(range(3)))
[(), (1, 2), (2,), (1, 0), (0,), (2, 0), (1,), (1, 0, 2)]
>>> list(random_powerset(range(3)))
[(0, 1), (), (2,), (0, 2), (1, 2), (1,), (1, 2, 0), (0,)]
>>> list(random_powerset(range(3)))
[(0, 2, 1), (0,), (), (2, 0), (1,), (2, 1), (2,), (0, 1)]
```

itertools is written in C, so chain_random will be slower than itertools.chain, but you get more randomization this way.


Here is a lazy and random solution:

```python
import random

def powerset(seq):
    n = 2 ** len(seq)
    used = set()
    while len(used) < n:
        choice = random.randint(0, n - 1)
        if choice not in used:
            used.add(choice)
            binary = bin(choice)[2:].zfill(len(seq))
            yield [x for bit, x in zip(binary, seq) if bit == '1']
            # or, with itertools (the digits must be ints, since '0' is truthy):
            # yield list(itertools.compress(seq, map(int, binary)))

print(list(powerset([1, 2, 3])))
print(list(powerset([1, 2, 3])))
# example output:
# [[3], [1], [2, 3], [], [1, 2], [2], [1, 3], [1, 2, 3]]
# [[2, 3], [1, 3], [1], [1, 2, 3], [1, 2], [2], [3], []]
```

If you index the subsets of [1, 2, 3] in binary:

```
n   123
0   000
1   001
2   010
3   011
4   100
5   101
6   110
7   111
```

Each combination can be labeled with a unique binary identifier, and there are always 2**len(seq) combinations. So:

  • Randomly pick an integer between 0 and 2**len(seq) - 1.
  • Check that we have not used it before (if we have, draw again).
  • Convert it to binary.
  • Zip the binary string with seq.
  • Where a binary digit is '0', exclude the corresponding element from the output.
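The steps above, worked by hand for one draw (the index 5 is just an example value):

```python
seq = [1, 2, 3]
choice = 5                                 # a randomly drawn index
binary = bin(choice)[2:].zfill(len(seq))   # '101'
subset = [x for bit, x in zip(binary, seq) if bit == '1']
print(subset)                              # [1, 3]
```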

This is lazy and will work for large seq .

One caveat:

A problem may occur, though it probably does not matter to you: toward the end of the sequence, you may spend a long time redrawing already-used numbers. Since the probability of drawing an already-drawn number is (number of successful draws) / 2**len(seq), with g successful draws so far the expected number of draws to find a new, unused number is:

```
n / (n - g)   # where n = 2**len(seq)
```

This is fine if n is small, or, for large n, if g << n (one or both of these situations is very likely, so this should not be a big problem). In fact, with large n you can do without used and the repetition check entirely, since the expected number of draws before the first repetition grows like n**0.5 (the birthday problem).


Source: https://habr.com/ru/post/1410905/

