How to prevent iterator exhaustion in Python (3.x)?

If I create two lists and zip them

    a = [1, 2, 3]
    b = [7, 8, 9]
    z = zip(a, b)

Then I convert z to a list twice

    l1 = list(z)
    l2 = list(z)

Then the content of l1 turns out to be fine, [(1, 7), (2, 8), (3, 9)], but the content of l2 is just [].

I assume this is the general behavior of Python with regard to iterators. But as a novice programmer migrating from the C family, this makes no sense to me. Why does it behave this way? And is there a way to get around this problem?

I mean, yes, in this particular example I can just copy l1 to l2, but is there a general way to "reset" whatever Python uses to iterate over z after I have iterated over it once?

+7
5 answers

There is no way to reset a generator. However, you can use itertools.tee to "copy" an iterator.

    >>> import itertools
    >>> z = zip(a, b)
    >>> zip1, zip2 = itertools.tee(z)
    >>> list(zip1)
    [(1, 7), (2, 8), (3, 9)]
    >>> list(zip2)
    [(1, 7), (2, 8), (3, 9)]

This involves caching values, so it only makes sense if you advance both iterators at roughly the same rate. (In other words, don't use it the way I do here!)

Another approach is to pass around the generator function itself and call it whenever you want to iterate over its values.
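As a rough illustration of a case where tee fits better, here is a sketch in which both copies advance in lockstep, so the cache never holds more than one item. The helper name pairwise is only illustrative (it mirrors the well-known itertools recipe), and the input lists are the ones from the question.

```python
import itertools


def pairwise(iterable):
    """Yield consecutive overlapping pairs: (s0, s1), (s1, s2), ..."""
    first, second = itertools.tee(iterable)
    next(second, None)  # advance the second copy by one item
    # zip pulls from both copies in turn, so they stay one item apart.
    return zip(first, second)


pairs = list(pairwise(zip([1, 2, 3], [7, 8, 9])))
print(pairs)  # [((1, 7), (2, 8)), ((2, 8), (3, 9))]
```

After calling tee, the original iterator should not be used directly any more, since advancing it would make the copies skip items.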

    def gen(x):
        for i in range(x):
            yield i ** 2

    def make_two_lists(gen):
        return list(gen()), list(gen())

But now you have to bind the arguments to the generator function when passing it. You can use lambda for this, but many people find lambda ugly. (Not me, though! YMMV.)

    >>> make_two_lists(lambda: gen(10))
    ([0, 1, 4, 9, 16, 25, 36, 49, 64, 81], [0, 1, 4, 9, 16, 25, 36, 49, 64, 81])

I hope it goes without saying that in most cases it is best to just build a list and copy it.

Also, as a more general way of explaining this behavior, consider this. The point of a generator is to produce a series of values while maintaining some state between iterations. Now, sometimes, instead of simply iterating over a generator, you might want to do something like this:

    z = zip(a, b)
    while some_condition():
        fst = next(z, None)
        snd = next(z, None)
        do_some_things(fst, snd)
        if fst is None and snd is None:
            do_some_other_things()

Say this loop may or may not exhaust z. Now we have a generator in an indeterminate state! So at this point it is important that the behavior of a generator is constrained in a well-defined way. Although we do not know where the generator is in its output, we know that a) all subsequent accesses will produce later values in the series, and b) once it is "empty", we have received every item in the series exactly once. The more ability we have to manipulate the state of z, the harder it is to reason about it, so it is best to avoid situations that break those two promises.
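To make those two promises concrete, here is a small sketch using the a and b lists from the question. The variable names first, remaining and after_empty are purely illustrative.

```python
a = [1, 2, 3]
b = [7, 8, 9]

z = zip(a, b)
first = next(z, None)    # consume one item by hand
remaining = list(z)      # later accesses only produce later items
after_empty = list(z)    # once empty, z stays empty

print(first)        # (1, 7)
print(remaining)    # [(2, 8), (3, 9)]
print(after_empty)  # []
```

No item appears twice and none is skipped, which is exactly what makes a partially consumed iterator safe to hand to other code.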

Of course, as Joel Cornett points out below, it is possible to write a generator that accepts messages via the send method, and it would be possible to write a generator that could be reset using send. But note that in that case, all we can do is send a message. We cannot directly manipulate the generator's state, and so all changes to the state of the generator are well-defined (by the generator itself, assuming it was written correctly!). send is really intended for implementing coroutines, so I would not use it for this purpose. Everyday generators almost never do anything with values sent to them, I think for the very reasons I give above.
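For the curious, a send-based reset might look roughly like the sketch below. The generator name resettable_squares and the "reset" message are assumptions made up for this illustration, not an established convention.

```python
def resettable_squares(n):
    """Yield 0**2 .. (n-1)**2; sending "reset" restarts the series."""
    i = 0
    while i < n:
        message = yield i ** 2
        if message == "reset":
            i = 0   # the generator itself decides how to change state
        else:
            i += 1  # plain next() sends None, so we just move on


gen = resettable_squares(3)
before = [next(gen), next(gen)]  # [0, 1]
restarted = gen.send("reset")    # send() returns the next yielded value: 0
after = list(gen)                # [1, 4]
print(before, restarted, after)
```

Note that the reset only works while the generator is still running. Once it has raised StopIteration it is finished for good, which is one more reason to prefer simply calling the generator function again.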

+8

Just create a list from your iterator using list() once, and then use it.

It just so happens that zip returns an iterator (much like a generator), which you can only iterate over once.

You can iterate over the list as many times as you want.
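A minimal sketch of that advice, using the lists from the question (the names pairs, total_first and total_second are just for illustration):

```python
a = [1, 2, 3]
b = [7, 8, 9]

pairs = list(zip(a, b))  # consume the iterator exactly once

total_first = sum(x for x, _ in pairs)   # first pass over the list
total_second = sum(y for _, y in pairs)  # second pass, same data

print(total_first, total_second)  # 6 24
```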

+2

If you need two copies of the list, which you do if you need to modify them, then I suggest you make the list once and then copy it:

    a = [1, 2, 3]
    b = [7, 8, 9]
    l1 = list(zip(a, b))
    l2 = l1[:]

+2

No, there is no way to "reset" them.

Generators produce their output once, one item at a time on demand, and are then done when the output is exhausted.

Think of them like reading through a file: once you are through, you have to start over if you want another go at the data.

If you need to keep the generator's output around, consider storing it, for example in a list, and then reusing it as often as you need. (This is quite similar to the considerations that guided the choice in Python 2 between xrange(), which produced its values lazily, and range(), which created a whole list of items in memory.)

Update: fixed terminology, temporary brain interruption ...

+1

Another explanation. As a programmer, you probably understand the difference between classes and instances (i.e. objects). The official documentation calls zip() a built-in function. In Python 3, however, zip is actually a class whose instances are iterators. You can check this interactively:

    >>> zip
    <class 'zip'>

Classes are types. Because of this, the following should also be clear:

    >>> type(zip)
    <class 'type'>

Your z is an instance of the class, and you can think of calling zip() as calling the class constructor:

    >>> a = [1, 2, 3]
    >>> b = [7, 8, 9]
    >>> z = zip(a, b)
    >>> z
    <zip object at 0x0000000002342AC8>
    >>> type(z)
    <class 'zip'>

z is an iterator object that holds the iterators for a and b internally. Because its implementation is generic, z (that is, the zip class) has no means of resetting those inner iterators for a, b, or whatever sequences it was given. Because of this, there is no way to reset z. The cleanest way to solve your specific problem is to copy the list (as you mentioned in the question and as Lennart Regebro suggests). Another straightforward way is to call zip(a, b) twice, thus constructing two separate z-like iterators that each start from the beginning:

    >>> lst1 = list(zip(a, b))
    >>> lst2 = list(zip(a, b))

However, this cannot be relied on to give identical results in the general case. Consider a or b being one-off sequences generated from some current conditions (for example, temperatures read from several thermometers).
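If calling zip twice by hand feels clumsy, the same idea can be wrapped in a small iterable class that builds a fresh zip every time iteration starts. This is only a sketch: ReusableZip is a hypothetical helper, not something from the standard library, and it carries the same caveat as above, since it only repeats itself if the underlying inputs can be iterated more than once.

```python
class ReusableZip:
    """Hypothetical helper: an iterable that builds a fresh zip per loop."""

    def __init__(self, *iterables):
        self._iterables = iterables

    def __iter__(self):
        # Called at the start of every for loop or list() call,
        # so each pass gets its own brand-new zip iterator.
        return zip(*self._iterables)


z = ReusableZip([1, 2, 3], [7, 8, 9])
l1 = list(z)
l2 = list(z)
print(l1 == l2)  # True
```

With one-shot inputs, such as another iterator or live sensor readings, the second pass would see whatever is left, so storing the results in a list remains the safer choice there.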

0
