How to count elements in a generator consumed by other code

I am creating a generator that is consumed by another function, but I still would like to know how many elements were generated:

    lines = (line.rstrip('\n') for line in sys.stdin)
    process(lines)
    print("Processed {} lines.".format( ? ))

The best I can come up with is to wrap the generator in a class that keeps a count, or perhaps to turn it inside out and use send(). Is there an elegant and efficient way to find out how many elements a generator produced when you are not the one consuming it, in Python 2?

Edit: here is what I ended up with:

    import itertools
    import operator
    from collections import Iterable

    class Count(Iterable):
        """Wrap an iterable (typically a generator) and provide a ``count``
        attribute counting the number of items.

        Accessing the ``count`` attribute before iteration has finished
        will invalidate the count.
        """
        def __init__(self, iterable):
            self._iterable = iterable
            self._counter = itertools.count()

        def __iter__(self):
            return itertools.imap(operator.itemgetter(0),
                                  itertools.izip(self._iterable, self._counter))

        @property
        def count(self):
            self._counter = itertools.repeat(self._counter.next())
            return self._counter.next()
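For reference, the same idea can be rendered in Python 3 syntax, where zip and map are already lazy and replace izip and imap. This is an adaptation, not part of the original answer:

```python
import itertools
import operator

class Count:
    """Python 3 adaptation of the wrapper above (zip and map are lazy)."""
    def __init__(self, iterable):
        self._iterable = iterable
        self._counter = itertools.count()

    def __iter__(self):
        # Pair each item with the counter, then discard the counter value.
        return map(operator.itemgetter(0),
                   zip(self._iterable, self._counter))

    @property
    def count(self):
        # Freeze the counter at its current value so repeated reads agree.
        self._counter = itertools.repeat(next(self._counter))
        return next(self._counter)

c = Count(iter(range(7)))
items = list(c)
print(items)    # [0, 1, 2, 3, 4, 5, 6]
print(c.count)  # 7
```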
6 answers

Here is another way, using itertools.count():

    import itertools
    import re

    def generator():
        for i in range(10):
            yield i

    def process(l):
        for i in l:
            if i == 5:
                break

    def counter_value(counter):
        # Extract the counter's current value from its repr, e.g. "count(6)".
        return int(re.search(r'\d+', repr(counter)).group(0))

    counter = itertools.count()
    process(i for i, v in itertools.izip(generator(), counter))
    print "Elements consumed by process: %d" % counter_value(counter)
    # output: Elements consumed by process: 6

Hope this was helpful.


If you don't mind consuming the generator, you can simply use:

 sum(1 for x in gen) 
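For example (note the trade-off: the generator is exhausted afterwards):

```python
gen = (x * x for x in range(5))
n = sum(1 for _ in gen)
print(n)          # 5
print(list(gen))  # [] -- the generator is now exhausted
```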

Usually I just turn the generator into a list and take its length. If you have reason to believe this would consume too much memory, your best bet is indeed the wrapper class you proposed yourself. It's not so bad:

    class CountingIterator(object):
        def __init__(self, it):
            self.it = it
            self.count = 0

        def __iter__(self):
            return self

        def next(self):
            nxt = next(self.it)
            self.count += 1
            return nxt

        __next__ = next

(The last line is for forward compatibility with Python 3.x.)
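A usage sketch of the wrapper, written here with `__next__` defined first so it runs under Python 3 as well; `process` is stood in for by `list` purely for illustration:

```python
class CountingIterator(object):
    def __init__(self, it):
        self.it = it
        self.count = 0

    def __iter__(self):
        return self

    def __next__(self):
        nxt = next(self.it)
        self.count += 1
        return nxt

    next = __next__  # Python 2 spelling

lines = CountingIterator(iter(["a", "b", "c"]))
consumed = list(lines)  # stands in for process(lines)
print(consumed)         # ['a', 'b', 'c']
print(lines.count)      # 3
```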


Here is another approach. Using a list to return the count is a little ugly, but it's pretty compact:

    def counter(seq, count_output_list):
        for x in seq:
            count_output_list[0] += 1
            yield x

Used like this:

    count = [0]
    process(counter(lines, count))
    print count[0]

You could also change counter() to take a dict in which it increments a "count" key, or an object on which it sets a count attribute.
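A sketch of the dict variant; the "count" key and the use of sum in place of process are assumptions for illustration, and print() syntax is used so it runs on both Python 2 and 3:

```python
def counter(seq, stats):
    # Record the number of items passed through in stats["count"].
    stats["count"] = 0
    for x in seq:
        stats["count"] += 1
        yield x

stats = {}
total = sum(counter(range(5), stats))  # stands in for process(...)
print(total)           # 10
print(stats["count"])  # 5
```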


If you do not need to return the count and just want to log it, you can use a finally block:

    def generator():
        i = 0
        try:
            for x in range(10):
                i += 1
                yield x
        finally:
            print '{} iterations'.format(i)

    [n for n in generator()]

This produces:

    10 iterations
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
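One property worth noting: the finally clause also fires when the consumer stops early and the generator is closed, so a partial count is still reported. A minimal sketch, appending to a list instead of printing so the value is observable:

```python
def generator(counts):
    i = 0
    try:
        for x in range(10):
            i += 1
            yield x
    finally:
        counts.append(i)

counts = []
g = generator(counts)
next(g)
next(g)        # consume only two items
g.close()      # raises GeneratorExit inside the generator; finally runs
print(counts)  # [2]
```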

This is another solution, similar to @sven-marnach's:

    class IterCounter(object):
        def __init__(self, it):
            self._iter = it
            self.count = 0

        def _counterWrapper(self, it):
            for i in it:
                yield i
                self.count += 1

        def __iter__(self):
            return self._counterWrapper(self._iter)

I wrapped the iterator in a generator function and avoided overriding next. The result is iterable (not an iterator, since it lacks a next method), but iterating over it is faster; in my tests, about 10% faster.
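A usage sketch of this class; note that because the count is incremented after each yield, the last item is only counted once the consumer asks for the element after it (as sum does here by draining the iterable completely):

```python
class IterCounter(object):
    def __init__(self, it):
        self._iter = it
        self.count = 0

    def _counterWrapper(self, it):
        for i in it:
            yield i
            self.count += 1

    def __iter__(self):
        return self._counterWrapper(self._iter)

ic = IterCounter(iter(range(4)))
total = sum(ic)   # stands in for process(ic)
print(total)      # 6
print(ic.count)   # 4
```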

