Tracking value changes in a list with repeated values in Python

I have a list with duplicate values, as shown below:

x = [1, 1, 1, 2, 2, 2, 1, 1, 1] 

This list is built from regular-expression matches (the pattern is not shown here). It is guaranteed to contain repeated values (many, many repetitions: hundreds, if not thousands), and it is never randomly ordered, because that is what the regular expression matches every time.

I want to keep track of the indexes at which an entry differs from the previous one. For the list x above, I want to get the change-tracking list [3, 6], indicating that x[3] and x[6] differ from the entries immediately before them.

I managed to do this, but I was wondering if there is a cleaner way. Here is my code:

    x = [1, 1, 1, 2, 2, 2, 1, 1, 1]
    flag = []
    for index, item in enumerate(x):
        if index != 0:
            if x[index] != x[index-1]:
                flag.append(index)
    print flag

Output: [3, 6]

Question: Is there a cleaner way to do what I want in fewer lines of code?

+5
5 answers

This can be done using a list comprehension with the range function.

    >>> x = [1, 1, 1, 2, 2, 2, 3, 3, 3]
    >>> [i for i in range(1, len(x)) if x[i] != x[i-1]]
    [3, 6]
    >>> x = [1, 1, 1, 2, 2, 2, 1, 1, 1]
    >>> [i for i in range(1, len(x)) if x[i] != x[i-1]]
    [3, 6]
+6

You can do something like this using itertools.izip, itertools.tee and a list comprehension:

    from itertools import izip, tee

    it1, it2 = tee(x)
    next(it2)
    print [i for i, (a, b) in enumerate(izip(it1, it2), 1) if a != b]
    # [3, 6]
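The snippet above is Python 2 (izip and the print statement). A minimal Python 3 sketch of the same tee-based pairing might look like the following; the function name change_indices is made up for illustration and is not part of the original answer.

```python
from itertools import tee


def change_indices(values):
    """Return the indices at which an element differs from its predecessor."""
    first, second = tee(values)
    # Advance the second iterator by one element; the default keeps this
    # safe when the input is empty.
    next(second, None)
    # zip stops at the shorter iterator, so the last element is never
    # compared against a missing successor.
    return [i for i, (a, b) in enumerate(zip(first, second), 1) if a != b]


print(change_indices([1, 1, 1, 2, 2, 2, 1, 1, 1]))  # [3, 6]
```

On Python 3.10 and later, itertools.pairwise(values) yields the same consecutive pairs without the manual tee/next step.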

Another alternative uses itertools.groupby on enumerate(x). groupby groups consecutive equal elements together, so we only need the index of the first element of each group, except the first group:

    from itertools import groupby
    from operator import itemgetter

    it = (next(g)[0] for k, g in groupby(enumerate(x), itemgetter(1)))
    next(it)  # drop the first group
    print list(it)
    # [3, 6]
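For Python 3, the groupby idea could be sketched roughly as below. The function name group_start_indices is hypothetical, and islice is used here (instead of a bare next call) only so that an empty list does not raise StopIteration.

```python
from itertools import groupby, islice
from operator import itemgetter


def group_start_indices(values):
    """Return the start index of every run of equal values except the first."""
    # Group (index, value) pairs by value; consecutive equal values share a group.
    groups = groupby(enumerate(values), key=itemgetter(1))
    # Each group is non-empty, so next(group) always succeeds; [0] is the index.
    starts = (next(group)[0] for _, group in groups)
    # Skip the first run, which always starts at index 0.
    return list(islice(starts, 1, None))


print(group_start_indices([1, 1, 1, 2, 2, 2, 1, 1, 1]))  # [3, 6]
```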

If NumPy is an option:

    >>> import numpy as np
    >>> np.where(np.diff(x) != 0)[0] + 1
    array([3, 6])
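np.diff relies on subtraction, so it should not be expected to work when the values are strings, which regex matches usually are. A possible variant, assuming NumPy is installed, compares shifted slices directly; the function name change_indices_np is made up for this sketch.

```python
import numpy as np


def change_indices_np(values):
    """Return a NumPy array of indices where an element differs from the previous one."""
    arr = np.asarray(values)
    # Element-wise comparison of each item with its predecessor; True marks a change.
    changed = arr[1:] != arr[:-1]
    # flatnonzero gives the positions of True; +1 maps them back to indices in arr.
    return np.flatnonzero(changed) + 1


print(change_indices_np([1, 1, 1, 2, 2, 2, 1, 1, 1]))  # [3 6]
print(change_indices_np(["a", "a", "b", "b", "a"]))    # [2 4]
```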
+3

Instead of indexing the list twice on every iteration, you can use an iterator to fetch the next item in the list (list indexing is itself O(1), so this is a matter of style rather than complexity):

    >>> x = [1, 1, 1, 2, 2, 2, 3, 3, 3]
    >>> i_x = iter(x[1:])
    >>> [i for i, j in enumerate(x[:-1], 1) if j != next(i_x)]
    [3, 6]
+2

I am here to add the obligatory answer containing a list comprehension.

    flag = [i+1 for i, value in enumerate(x[1:]) if x[i] != value]
+2

itertools.izip_longest is what you are looking for:

    from itertools import islice, izip_longest

    flag = []
    # leader runs one element ahead of trailer
    leader, trailer = islice(x, 1, None), iter(x)
    for i, (current, previous) in enumerate(izip_longest(leader, trailer), 1):
        # Skip comparing the last entry to nothing
        # If None is a valid value use a different sentinel for izip_longest
        if current is None:
            continue
        if current != previous:
            flag.append(i)
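In Python 3 the function is called itertools.zip_longest. A rough sketch of the same leader/trailer idea with a dedicated sentinel, so that None can appear as a real value in the list, is shown below; _MISSING and change_indices_longest are hypothetical names, and the input is assumed to be a list rather than a one-shot iterator.

```python
from itertools import islice, zip_longest

# Hypothetical sentinel object; it cannot compare identical to any real list value.
_MISSING = object()


def change_indices_longest(values):
    """Return indices where an element differs from its predecessor."""
    flag = []
    # leader starts at the second element, trailer at the first.
    leader, trailer = islice(values, 1, None), iter(values)
    pairs = zip_longest(leader, trailer, fillvalue=_MISSING)
    for i, (current, previous) in enumerate(pairs, 1):
        # The final pair has no leader element; skip it.
        if current is _MISSING:
            continue
        if current != previous:
            flag.append(i)
    return flag


print(change_indices_longest([1, 1, 1, 2, 2, 2, 1, 1, 1]))  # [3, 6]
print(change_indices_longest([1, None, None, 2]))           # [1, 3]
```

Since the padded final pair is only ever skipped, plain zip would give the same result here; zip_longest is kept only to mirror the answer.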
+1

Source: https://habr.com/ru/post/1212294/

