How to use random.shuffle() on a generator in Python?

How can I use random.shuffle() on a generator without first building a list from the generator? Is it even possible? If not, how else should I use random.shuffle() on my list?

    >>> import random
    >>> random.seed(2)
    >>> x = [1,2,3,4,5,6,7,8,9]
    >>> def yielding(ls):
    ...     for i in ls:
    ...         yield i
    ...
    >>> for i in random.shuffle(yielding(x)):
    ...     print i
    ...
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python2.7/random.py", line 287, in shuffle
        for i in reversed(xrange(1, len(x))):
    TypeError: object of type 'generator' has no len()

Note: random.seed() is there so that the script produces the same output on every run.
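
As a small sketch of what seeding buys you, assuming only the standard library: shuffling twice with the same seed gives the same order. The helper name `seeded_shuffle` is made up for this illustration and does not appear in the question.

```python
import random

def seeded_shuffle(items, seed):
    """Return a shuffled copy of items that is reproducible for a given seed."""
    rng = random.Random(seed)  # private generator, leaves the global state alone
    out = list(items)          # copy, because shuffle works in place
    rng.shuffle(out)
    return out

x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
first = seeded_shuffle(x, 2)
second = seeded_shuffle(x, 2)
print(first == second)  # True: same seed, same order
```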

+13
python generator list random shuffle
5 answers

To shuffle a sequence uniformly, random.shuffle() needs to know how long the input is. A generator cannot provide this; you have to materialize it into a list:

    lst = list(yielding(x))
    random.shuffle(lst)  # shuffles in place and returns None
    for i in lst:
        print(i)

Alternatively, you could use sorted() with random.random() as the key:

    for i in sorted(yielding(x), key=lambda k: random.random()):
        print(i)

but since this also builds a list, there is little point in going this route.

Demo:

    >>> import random
    >>> x = [1,2,3,4,5,6,7,8,9]
    >>> sorted(iter(x), key=lambda k: random.random())
    [9, 7, 3, 2, 5, 4, 6, 1, 8]
+29

It is not possible to randomize the output of a generator without temporarily storing all the elements. Fortunately, this is pretty easy in Python:

    tmp = list(yielding(x))
    random.shuffle(tmp)
    for i in tmp:
        print(i)

Pay attention to the list() call, which will read all the elements and put them in a list.

If you do not want to, or cannot, store all the elements, you will need to change the generator itself so that it yields its elements in random order.

+3

Depending on the case, if you know in advance how much data you have, you can index the data and compute or read each item from a shuffled index. This amounts to "don't use a generator for this problem," and without a specific use case it is hard to suggest a general method.
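
A minimal sketch of the shuffled-index idea, assuming the item count `n` is known up front and each item can be computed from its index. The names `expensive` and `in_random_order` are hypothetical stand-ins, not part of the original answer.

```python
import random

def expensive(i):
    """Hypothetical stand-in for a costly per-item computation or read."""
    return i * i

def in_random_order(n, compute, rng=random):
    """Yield compute(i) for every i in range(n), in shuffled order."""
    indices = list(range(n))  # only the indices are stored, not the data
    rng.shuffle(indices)
    for i in indices:
        yield compute(i)      # each item is computed lazily, exactly once

results = list(in_random_order(10, expensive))
print(sorted(results) == [i * i for i in range(10)])  # True
```

Only the list of indices is held in memory, so this helps when the items themselves are large or slow to produce.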

Alternatively, if you need to use a generator, it depends on how "shuffled" you want the data. As others have noted, generators have no length, so at some point you have to evaluate the generator, which can be expensive. If you do not need perfect randomness, you can introduce a shuffle buffer:

    from itertools import islice
    import numpy as np

    def shuffle(generator, buffer_size):
        # iter() guards against islice restarting from the beginning
        # when a list or other re-iterable is passed in
        generator = iter(generator)
        while True:
            # pull up to buffer_size items and shuffle only that chunk
            buffer = list(islice(generator, buffer_size))
            if len(buffer) == 0:
                break
            np.random.shuffle(buffer)
            for item in buffer:
                yield item

    shuffled_generator = shuffle(my_generator, 256)

This shuffles the data in chunks of buffer_size, so you can avoid memory problems if that is your limiting factor. Of course, this is not a truly random shuffle, so it is a poor fit for input that arrives sorted, but if you just need to add some randomness to your data, this may be a good solution.
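
To make the chunk boundaries visible, here is a sketch of the same idea using the standard library's random module instead of NumPy (a substitution made for this example). The name `chunk_shuffle` is hypothetical. Each block of buffer_size outputs contains exactly the corresponding block of inputs, only reordered.

```python
import random
from itertools import islice

def chunk_shuffle(iterable, buffer_size, rng=random):
    """Shuffle an iterable chunk by chunk, holding buffer_size items at most."""
    it = iter(iterable)  # one shared iterator, so islice keeps advancing
    while True:
        buffer = list(islice(it, buffer_size))
        if not buffer:
            return
        rng.shuffle(buffer)
        yield from buffer

out = list(chunk_shuffle(range(10), 4))
# Sorting each output chunk recovers the input chunks, so no item
# ever leaves the chunk it arrived in.
chunks = [sorted(out[i:i + 4]) for i in range(0, len(out), 4)]
print(chunks)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```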

+1

I needed a solution to this problem so that I could get expensive-to-compute elements in shuffled order without wasting computation on generating values up front. This is what I came up with for your example. It involves creating another function that indexes the original array.

You will need NumPy installed

 pip install numpy 

The code:

    import numpy as np

    x = [1, 2, 3, 4, 5, 6, 7, 8, 9]

    def shuffle_generator(lst):
        return (lst[idx] for idx in np.random.permutation(len(lst)))

    def yielding(ls):
        for i in ls:
            yield i

    # for i in random.shuffle(yielding(x)):
    #     print i

    for i in yielding(shuffle_generator(x)):
        print(i)
0

You could also pick at random from the items yielded so far, which gives a result that is not fully random but is somewhat shuffled within a range. This is similar to @sturgemeister's code above, but it is not chunked, so there are no fixed boundaries between shuffled regions.

For example:

    import random

    def scramble(gen, buffer_size):
        buf = []
        i = iter(gen)
        while True:
            try:
                e = next(i)
                buf.append(e)
                if len(buf) >= buffer_size:
                    # swap a random buffered item to the end and emit it
                    choice = random.randint(0, len(buf)-1)
                    buf[-1], buf[choice] = buf[choice], buf[-1]
                    yield buf.pop()
            except StopIteration:
                # input exhausted: flush the remaining buffer in random order
                random.shuffle(buf)
                yield from buf
                return

The results should be thoroughly random within a window of buffer_size:

    import itertools

    for e in scramble(itertools.count(start=0, step=1), 1000):
        print(e)

Any 1000 consecutive elements of this stream look random. Over a longer stretch (well beyond 1000), though, the overall trend is clearly increasing.

To test, confirm that this returns 1000 unique elements:

    for e in scramble(range(1000), 100):
        print(e)
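
Rather than checking the printed values by eye, the same test can be asserted in code. This sketch restates scramble with a plain for loop (an equivalent rewrite made for this example, not the answer's exact code) and checks two things: every input appears exactly once, and the item at output position k can only come from the first k + buffer_size inputs, which is why the long-run trend increases.

```python
import random

def scramble(gen, buffer_size):
    """Emit a random buffered item once the buffer holds buffer_size items."""
    buf = []
    for e in gen:
        buf.append(e)
        if len(buf) >= buffer_size:
            choice = random.randrange(len(buf))
            buf[-1], buf[choice] = buf[choice], buf[-1]
            yield buf.pop()
    random.shuffle(buf)  # input exhausted: flush what is left
    yield from buf

buffer_size = 100
out = list(scramble(range(1000), buffer_size))

# Every input element comes out exactly once.
print(sorted(out) == list(range(1000)))  # True

# With range() input each value equals its input index, so this checks that
# no element is emitted more than buffer_size - 1 positions early.
print(all(v <= k + buffer_size - 1 for k, v in enumerate(out)))  # True
```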
0