Itertools or handwritten generator - which is preferable?

I have several Python generators that I want to combine into a new generator. I can easily do this with a handwritten generator function using a bunch of yield statements.

On the other hand, the itertools module is designed for exactly this kind of thing, and it seems to me that the Pythonic way to create the generator I need is to plug together various iterators from that module.

However, for the problem at hand this soon becomes quite complicated: the generator needs to maintain some state (for example, whether the first or a later element is being processed), the i-th output additionally depends on conditions on the i-th input element, and the various input lists have to be processed differently before they are joined into the generated sequence.

Because the composition of standard iterators that solves my problem is, due to the one-dimensional nature of written source code, nearly incomprehensible, I wonder whether there are any advantages to using the standard itertools generators over handwritten generator functions (mainly in the more complicated cases). In fact, I think that in 90% of cases the handwritten versions are much easier to read, probably because of their more imperative style compared to the functional style of chained iterators.

EDIT

To illustrate my problem, here is a (toy) example: let a and b be two iterables of the same length (the input). The elements of a are integers; the elements of b are themselves iterables whose individual elements are strings. The output should correspond to the output of the following generator function:

    from itertools import *

    def generator(a, b):
        first = True
        for i, s in izip(a, b):
            if first:
                yield "First line"
                first = False
            else:
                yield "Some later line"
            if i == 0:
                yield "The parameter vanishes."
            else:
                yield "The parameter is:"
                yield i
            yield "The strings are:"
            comma = False
            for t in s:
                if comma:
                    yield ','
                else:
                    comma = True
                yield t

If I write the same program in a functional style using generator expressions and itertools, I get something like:

    from itertools import *

    def generator2(a, b):
        return (z for i, s, c in izip(a, b, count())
                for y in (("First line" if c == 0 else "Some later line",),
                          ("The parameter vanishes.",) if i == 0
                          else ("The parameter is:", i),
                          ("The strings are:",),
                          islice((x for t in s for x in (',', t)), 1, None))
                for z in y)

Example

    >>> a, b = (1, 0, 2), ("ab", "cd", "ef")
    >>> print([x for x in generator(a, b)])
    ['First line', 'The parameter is:', 1, 'The strings are:', 'a', ',', 'b', 'Some later line', 'The parameter vanishes.', 'The strings are:', 'c', ',', 'd', 'Some later line', 'The parameter is:', 2, 'The strings are:', 'e', ',', 'f']
    >>> print([x for x in generator2(a, b)])
    ['First line', 'The parameter is:', 1, 'The strings are:', 'a', ',', 'b', 'Some later line', 'The parameter vanishes.', 'The strings are:', 'c', ',', 'd', 'Some later line', 'The parameter is:', 2, 'The strings are:', 'e', ',', 'f']

This may be more elegant than my first solution, but it looks like write-once, never-understand-later code. I am wondering whether this way of writing my generator has enough advantages to justify doing it.

PS: I think part of my problem with the functional solution is that, in order to minimize the number of keywords in Python, some keywords such as "for", "if" and "else" have been reused inside expressions, so that their placement within an expression takes some getting used to (the ordering in the generator expression z for x in a for y in x for z in y looks, at least to me, less natural than the ordering in the classic for loop: for x in a: for y in x: for z in y: yield z).
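To make the ordering concrete: the for clauses of a generator expression run in the same left-to-right order as the equivalent nested loops; only the yielded expression moves to the front. A minimal sketch, where the names flatten_loops and flatten_genexp and the sample data are made up for illustration:

```python
def flatten_loops(a):
    # Classic nested loops: outermost loop first, yield at the innermost level.
    for x in a:
        for y in x:
            for z in y:
                yield z


def flatten_genexp(a):
    # Same clause order as the loops above; only "z" has moved to the front.
    return (z for x in a for y in x for z in y)


# Hypothetical sample data: a list of lists of tuples of strings.
nested = [[("a", "b"), ("c",)], [("d", "e")]]

loops_result = list(flatten_loops(nested))
genexp_result = list(flatten_genexp(nested))
print(loops_result == genexp_result)
```

Both forms visit the elements in the same order, so the comparison prints True.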

1 answer

I did some profiling and the regular generator function was faster than your second generator or my implementation.

    $ python -mtimeit -s'import gen; a, b = gen.make_test_case()' 'list(gen.generator1(a, b))'
    10 loops, best of 3: 169 msec per loop
    $ python -mtimeit -s'import gen; a, b = gen.make_test_case()' 'list(gen.generator2(a, b))'
    10 loops, best of 3: 489 msec per loop
    $ python -mtimeit -s'import gen; a, b = gen.make_test_case()' 'list(gen.generator3(a, b))'
    10 loops, best of 3: 385 msec per loop

It also happens to be the most readable, so I think I am going to go with that. That said, I will still post my solution, because I think it is a cleaner example of the kind of functional programming you can do with itertools (though clearly still not optimal; I feel it should be able to smoke the regular generator function, and I will keep hacking on it).

    import itertools as it

    def generator3(parameters, strings):
        # replace strings with a generator of generators for the individual characters
        strings = (it.islice((char for string_char in string_ for char in (',', string_char)), 1, None)
                   for string_ in strings)

        # interpolate strings with the notices
        strings = (it.chain(('The strings are:',), string_) for string_ in strings)

        # nest them in tuples so they're at the same level as the other generators
        separators = it.chain((('First line',),), it.cycle((('Some later line',),)))

        # replace the parameters with the appropriate tuples
        parameters = (('The parameter is:', p) if p else ('The parameter vanishes.',)
                      for p in parameters)

        # combine the separators, parameters and strings
        output = it.izip(separators, parameters, strings)

        # flatten it twice and return it
        output = it.chain.from_iterable(output)
        return it.chain.from_iterable(output)

For reference, the test case:

    def make_test_case():
        a = [i % 100 for i in range(10000)]
        b = [('12345'*10)[:(i%50)+1] for i in range(10000)]
        return a, b
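The code above targets Python 2, where izip lives in itertools. For anyone re-running the comparison on Python 3, a port might look like the sketch below. It assumes the only change needed is swapping izip for the built-in zip, which is already lazy on Python 3; the shortened loop variable names in generator3 are mine. It checks that the handwritten and the itertools version agree on the toy input from the question, and makes no claim about how the timings carry over.

```python
import itertools as it


def generator1(a, b):
    """Handwritten generator from the question, using zip instead of izip."""
    first = True
    for i, s in zip(a, b):
        if first:
            yield "First line"
            first = False
        else:
            yield "Some later line"
        if i == 0:
            yield "The parameter vanishes."
        else:
            yield "The parameter is:"
            yield i
        yield "The strings are:"
        comma = False
        for t in s:
            if comma:
                yield ","
            else:
                comma = True
            yield t


def generator3(parameters, strings):
    """itertools pipeline from the answer, using zip instead of it.izip."""
    # Put a comma before every character, then drop the leading comma.
    strings = (
        it.islice((x for ch in s for x in (",", ch)), 1, None)
        for s in strings
    )
    # Prefix each character stream with its notice.
    strings = (it.chain(("The strings are:",), s) for s in strings)
    # One-element tuples, so separators sit at the same nesting level.
    separators = it.chain((("First line",),), it.cycle((("Some later line",),)))
    parameters = (
        ("The parameter is:", p) if p else ("The parameter vanishes.",)
        for p in parameters
    )
    rows = zip(separators, parameters, strings)
    # Flatten twice: rows -> groups -> individual items.
    return it.chain.from_iterable(it.chain.from_iterable(rows))


# Toy input from the question.
a, b = (1, 0, 2), ("ab", "cd", "ef")
result1 = list(generator1(a, b))
result3 = list(generator3(a, b))
print(result1 == result3)
```

On the toy input both versions produce the same list, so this prints True.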