Elegant way to remove adjacent duplicate items in a list?

I'm looking for a clean, Pythonic way to remove, from the following list:

li = [0, 1, 2, 3, 3, 4, 3, 2, 2, 2, 1, 0, 0] 

every run of adjacent repeated elements (any run longer than one element), to get:

 re = [0, 1, 2, 4, 3, 1] 

I have working code, but it feels non-Pythonic, and I'm sure there must be a way (maybe some less well-known itertools functions?) to achieve this much more concisely and elegantly.
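For context, here is a plain-loop version, my own sketch of what such "working code" might look like (not the asker's actual code), that drops every run longer than one element:

```python
li = [0, 1, 2, 3, 3, 4, 3, 2, 2, 2, 1, 0, 0]

result = []
i = 0
while i < len(li):
    j = i
    while j < len(li) and li[j] == li[i]:
        j += 1              # advance j past the run of equal elements
    if j - i == 1:          # run of length one: keep the element
        result.append(li[i])
    i = j

print(result)  # [0, 1, 2, 4, 3, 1]
```

It works, but the explicit index bookkeeping is exactly the kind of thing the answers below replace with itertools.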

+4
4 answers

Here is a version based on Karl's that does not require copies of the list (no tmp, no slices, and no zipped list). izip is significantly faster than (Python 2) zip for large lists. chain is a bit slower than slicing, but does not require a tmp object or a copy of the list. islice plus creating tmp is slightly faster, but requires more memory and is less elegant.

 from itertools import izip, chain
 [y for x, y, z in izip(chain((None, None), li),
                        chain((None,), li, (None,)),
                        chain(li, (None, None)))
  if x != y != z]

timeit shows that it is about twice as fast as Karl's version or my fastest groupby version when the groups are short.

Be sure to use a value other than None (e.g. object()) as the padding if your list can contain None.
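Following that advice, here is a Python 3 sketch of the same idea: zip is already lazy there, and a fresh object() sentinel compares unequal to everything, including None:

```python
from itertools import chain

li = [0, 1, 2, None, None, 4, 3, 2, 2, 2, 1, 0, 0]

pad = object()  # unique sentinel: never equal to any real element, even None
# Zip each element with its left and right neighbors, padding both ends,
# and keep only the elements that differ from both neighbors.
result = [y for x, y, z in zip(chain((pad, pad), li),
                               chain((pad,), li, (pad,)),
                               chain(li, (pad, pad)))
          if x != y != z]

print(result)  # [0, 1, 2, 4, 3, 1]
```

The run of two None values is removed correctly because the sentinel is compared by identity, not by value.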

Use this version if you need it to work on an iterator / iterable that is not a sequence, or your groups are long:

 from itertools import groupby
 [key for key, group in groupby(li)
  if (next(group) or True) and next(group, None) is None]

timeit shows it is about ten times faster than the other version for 1000-element groups.
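To illustrate why this version suits iterables that are not sequences, the same expression works unchanged on a generator (a small sketch, assuming Python 3):

```python
from itertools import groupby

def stream():
    # Yield values one at a time, as a file or socket might.
    for value in [0, 1, 2, 3, 3, 4, 3, 2, 2, 2, 1, 0, 0]:
        yield value

# next(group) consumes the first element (the "or True" keeps falsy values
# from breaking the condition); next(group, None) checks there is no second.
result = [key for key, group in groupby(stream())
          if (next(group) or True) and next(group, None) is None]

print(result)  # [0, 1, 2, 4, 3, 1]
```

No index-based padding or slicing is needed, so nothing is ever materialized as a list.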

Earlier, slower versions:

 [key for key, group in groupby(li) if sum(1 for i in group) == 1]
 [key for key, group in groupby(li) if len(tuple(group)) == 1]
+8

agf's answer is good if the groups are small, but if there are enough consecutive duplicates, it's more efficient not to "sum 1" over those groups:

 from itertools import groupby
 [key for key, group in groupby(li) if all(i == 0 for i, j in enumerate(group))]
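The efficiency comes from short-circuiting: all() stops as soon as enumerate(group) yields index 1, so a long run is abandoned after its second element instead of being counted to the end. A runnable sketch (Python 3):

```python
from itertools import groupby

li = [0, 1, 2, 3, 3, 4, 3, 2, 2, 2, 1, 0, 0]

# all() returns False the moment enumerate yields index 1, so groups with
# more than one element are rejected after reading only two of them.
result = [key for key, group in groupby(li)
          if all(i == 0 for i, j in enumerate(group))]

print(result)  # [0, 1, 2, 4, 3, 1]
```

groupby then skips the rest of the abandoned group itself when the outer loop advances.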
+4
 tmp = [object()] + li + [object()]
 re = [y for x, y, z in zip(tmp[2:], tmp[1:-1], tmp[:-2]) if y != x and y != z]
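A quick check of this approach (renamed re to res here, since re shadows the standard-library module name; the object() sentinels guarantee the padding never compares equal to a real element):

```python
li = [0, 1, 2, 3, 3, 4, 3, 2, 2, 2, 1, 0, 0]

tmp = [object()] + li + [object()]   # pad both ends with unique sentinels
# Each element y is compared with its right neighbor x and left neighbor z.
res = [y for x, y, z in zip(tmp[2:], tmp[1:-1], tmp[:-2])
       if y != x and y != z]

print(res)  # [0, 1, 2, 4, 3, 1]
```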
+1

Other solutions use various itertools helpers and comprehensions, and probably look more "Pythonic". However, a quick timing test I ran shows that this generator is a bit faster:

 _undef = object()

 def itersingles(source):
     cur = _undef
     dup = True
     for elem in source:
         if dup:
             if elem != cur:
                 cur = elem
                 dup = False
         else:
             if elem == cur:
                 dup = True
             else:
                 yield cur
                 cur = elem
     if not dup:
         yield cur

 source = [0, 1, 2, 3, 3, 4, 3, 2, 2, 2, 1, 0, 0]
 result = list(itersingles(source))
+1
