Pythonic way to iterate over sliding windows in a list?

What is the most efficient Pythonic way to iterate over a list in moving pairs? Here is an example:

    >>> l
    ['a', 'b', 'c', 'd', 'e', 'f', 'g']
    >>> for x, y in itertools.izip(l, l[1::2]):
    ...     print x, y
    ...
    a b
    b d
    c f

This iterates in pairs, but how can I iterate over a moving pair? The pairs I want are:

    a b
    b c
    c d
    d e
    etc.

That is, each pair is offset from the previous one by one element rather than two. Thanks.

+8
7 answers

What about:

    for x, y in itertools.izip(l, l[1:]):
        print x, y
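The snippet above is Python 2 (`itertools.izip` and the `print` statement). A rough Python 3 equivalent is sketched below; the name `moving_pairs` is a hypothetical helper, not something from the answer. It assumes the input is a re-iterable sequence such as a list, since it iterates over it twice.

```python
from itertools import islice


def moving_pairs(items):
    """Return an iterator over (items[0], items[1]), (items[1], items[2]), ...

    Assumes `items` can be iterated more than once (a list, tuple or string),
    not a one-shot iterator.
    """
    # islice skips the first element without copying the list,
    # and zip is already lazy in Python 3.
    return zip(items, islice(items, 1, None))


l = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
for x, y in moving_pairs(l):
    print(x, y)
```

On Python 3.10 and later, `itertools.pairwise` does the same for any iterable.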
+7

It can be even simpler. Just zip the list with a copy of itself offset by one.

    In [4]: zip(l, l[1:])
    Out[4]: [('a', 'b'), ('b', 'c'), ('c', 'd'), ('d', 'e'), ('e', 'f'), ('f', 'g')]
+12

Here is a small generator that I wrote some time ago for a similar scenario:

    def pairs(items):
        items_iter = iter(items)
        prev = next(items_iter)
        for item in items_iter:
            yield prev, item
            prev = item
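One caveat on newer interpreters: since Python 3.7 (PEP 479), a `StopIteration` that escapes a generator body is turned into a `RuntimeError`, so calling `next()` on an empty input inside the generator fails instead of ending quietly. A minimal sketch of a guarded variant, assuming that an empty input should simply yield nothing:

```python
def pairs(items):
    """Yield consecutive overlapping pairs from any iterable."""
    items_iter = iter(items)
    try:
        prev = next(items_iter)
    except StopIteration:
        return  # empty input: nothing to yield
    for item in items_iter:
        yield prev, item
        prev = item


print(list(pairs("abcd")))
print(list(pairs([])))
```

Because it only keeps one previous element, this also works on one-shot iterators and generators.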
+4

Here is a function for sliding windows of arbitrary size that works for iterators and generators as well as lists:

    from itertools import izip, starmap, islice, tee, count, repeat

    def sliding(seq, n):
        return izip(*starmap(islice, izip(tee(seq, n), count(0), repeat(None))))

Nathan's solution is probably more efficient.
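For readers on Python 3, where `izip` no longer exists, the same `tee` and `islice` idea could be sketched as follows. This is an adaptation under that assumption, not the answerer's code: each of the `n` copies of the iterator is advanced by its index, and the copies are then zipped together.

```python
from itertools import islice, tee


def sliding(iterable, n):
    """Return an iterator over tuples of n consecutive items."""
    copies = tee(iterable, n)
    # The i-th copy starts i items later than the first one.
    shifted = [islice(copy, i, None) for i, copy in enumerate(copies)]
    # zip stops as soon as the most advanced copy is exhausted.
    return zip(*shifted)


for window in sliding(iter("abcde"), 3):
    print(window)
```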

+3

Timings for adding each pair of consecutive entries in the list are shown below, ordered from fastest to slowest.

Gilles

    In [69]: timeit.repeat("for x,y in itertools.izip(l, l[1::1]): x + y", setup=setup, number=1000)
    Out[69]: [1.029047966003418, 0.996290922164917, 0.998831033706665]

Jeff Reedy

    In [70]: timeit.repeat("for x,y in sliding(l,2): x+y", setup=setup, number=1000)
    Out[70]: [1.2408790588378906, 1.2099130153656006, 1.207326889038086]

Alestanis

    In [66]: timeit.repeat("for i in range(0, len(l)-1): l[i] + l[i+1]", setup=setup, number=1000)
    Out[66]: [1.3387370109558105, 1.3243639469146729, 1.3245630264282227]

chmullig

    In [68]: timeit.repeat("for x,y in zip(l, l[1:]): x+y", setup=setup, number=1000)
    Out[68]: [1.4756009578704834, 1.4369518756866455, 1.5067830085754395]

Nathan Villaescusa

    In [63]: timeit.repeat("for x,y in pairs(l): x+y", setup=setup, number=1000)
    Out[63]: [2.254757881164551, 2.3750967979431152, 2.302199125289917]

sr2222

Note the reduced number of runs here (number=100 instead of 1000):

    In [60]: timeit.repeat("for x,y in SubsequenceIter(l,2): x+y", setup=setup, number=100)
    Out[60]: [1.599524974822998, 1.5634570121765137, 1.608154058456421]

Setup code:

    setup = """
    from itertools import izip, starmap, islice, tee, count, repeat

    l = range(10000)

    def sliding(seq, n):
        return izip(*starmap(islice, izip(tee(seq, n), count(0), repeat(None))))

    class SubsequenceIter(object):
        def __init__(self, iterable, subsequence_length):
            self.iterator = iter(iterable)
            self.subsequence_length = subsequence_length
            self.subsequence = [0]

        def __iter__(self):
            return self

        def next(self):
            self.subsequence.pop(0)
            while len(self.subsequence) < self.subsequence_length:
                self.subsequence.append(self.iterator.next())
            return self.subsequence

    def pairs(items):
        items_iter = iter(items)
        prev = items_iter.next()
        for item in items_iter:
            yield (prev, item)
            prev = item
    """
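To repeat a comparison like this on Python 3, a small harness along the following lines could be used. It is only a sketch: the labels, the `time_candidates` helper and the choice of three candidates are assumptions made for illustration, and the numbers it prints will differ from the Python 2 figures above.

```python
import timeit

# Python 3 setup: zip is already lazy, so izip is not needed.
SETUP = """
l = list(range(10000))

def pairs(items):
    items_iter = iter(items)
    prev = next(items_iter)
    for item in items_iter:
        yield prev, item
        prev = item
"""

# Each statement adds every pair of consecutive entries in the list.
CANDIDATES = {
    "zip + slice": "for x, y in zip(l, l[1:]): x + y",
    "index loop": "for i in range(len(l) - 1): l[i] + l[i + 1]",
    "generator": "for x, y in pairs(l): x + y",
}


def time_candidates(number=100, repeat=3):
    """Return {label: best time in seconds} for each candidate statement."""
    return {
        label: min(timeit.repeat(stmt, setup=SETUP, number=number, repeat=repeat))
        for label, stmt in CANDIDATES.items()
    }


if __name__ == "__main__":
    results = time_candidates()
    for label, seconds in sorted(results.items(), key=lambda kv: kv[1]):
        print(f"{label}: {seconds:.4f} s")
```

Taking the minimum of the repeats is the usual way to read `timeit.repeat` output, since the higher values mostly reflect interference from other processes.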
+1

Not the most efficient, but flexible enough:

    class SubsequenceIter(object):
        def __init__(self, iterable, subsequence_length):
            self.iterator = iter(iterable)
            self.subsequence_length = subsequence_length
            self.subsequence = [0]

        def __iter__(self):
            return self

        def next(self):
            self.subsequence.pop(0)
            while len(self.subsequence) < self.subsequence_length:
                self.subsequence.append(self.iterator.next())
            return self.subsequence

Usage:

    for x, y in SubsequenceIter(l, 2):
        print x, y
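Note that the class above hands back the same list object on every step, so a caller who stores the windows will see them all change together. A possible Python 3 alternative, sketched here with `collections.deque` and a hypothetical function name `windows`, yields an independent tuple each time:

```python
from collections import deque
from itertools import islice


def windows(iterable, n):
    """Yield tuples of n consecutive items from any iterable."""
    if n < 1:
        raise ValueError("window size must be at least 1")
    it = iter(iterable)
    # Fill the first window; maxlen makes later appends drop the oldest item.
    window = deque(islice(it, n), maxlen=n)
    if len(window) == n:
        yield tuple(window)
    for item in it:
        window.append(item)
        yield tuple(window)


for x, y in windows(['a', 'b', 'c', 'd'], 2):
    print(x, y)
```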
0

No imports are needed. This works on a list of objects or a string, or anything else that supports slicing (var[i:]). Tested on Python 3.6.

    # This will create windows with all but 1 overlap
    def ngrams_list(a_list, window_size=5, skip_step=1):
        return list(zip(*[a_list[i:] for i in range(0, window_size, skip_step)]))

With the alphabet as a_list, the list comprehension produces the following (window_size=5 shown; the OP would want window_size=2):

 ['ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'BCDEFGHIJKLMNOPQRSTUVWXYZ', 'CDEFGHIJKLMNOPQRSTUVWXYZ', 'DEFGHIJKLMNOPQRSTUVWXYZ', 'EFGHIJKLMNOPQRSTUVWXYZ'] 

zip(*result_of_for_loop) then collects every complete vertical column as one window. If you want the windows to overlap by less than all but one element:

    # You can sample that output to get less overlap:
    def sliding_windows_with_overlap(a_list, window_size=5, overlap=2):
        zip_output_as_list = ngrams_list(a_list, window_size)
        return zip_output_as_list[::overlap+1]

With overlap=2 it skips the windows starting with B and C and picks the one starting with D:

 [('A', 'B', 'C', 'D', 'E'), ('D', 'E', 'F', 'G', 'H'), ('G', 'H', 'I', 'J', 'K'), ('J', 'K', 'L', 'M', 'N'), ('M', 'N', 'O', 'P', 'Q'), ('P', 'Q', 'R', 'S', 'T'), ('S', 'T', 'U', 'V', 'W'), ('V', 'W', 'X', 'Y', 'Z')] 
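A step of `overlap + 1` leaves exactly `overlap` shared elements only when `window_size == 2 * overlap + 1`, as in the example above (5 and 2). If `overlap` is meant as the number of elements two neighbouring windows share, which is an assumption about the intent, a step of `window_size - overlap` gives that for any sizes. A minimal sketch with a hypothetical function name:

```python
import string


def windows_with_overlap(seq, window_size=5, overlap=2):
    """Return full windows of seq where neighbours share `overlap` items."""
    step = window_size - overlap
    if step < 1:
        raise ValueError("overlap must be smaller than window_size")
    last_start = len(seq) - window_size
    # Only start positions that leave room for a complete window are used.
    return [tuple(seq[i:i + window_size]) for i in range(0, last_start + 1, step)]


result = windows_with_overlap(string.ascii_uppercase, window_size=5, overlap=2)
print(result[:2])
```

For the alphabet with the default arguments this selects the same windows as the output shown above.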

UPDATE: it looks like this is what @chmullig provided, with options

0