Removing duplicates from a Python generator

How can I either not add duplicate entries to the generator, or delete them if they already exist?

If I should be using something else instead, please advise.

+4
2 answers

If the values are hashable, the easiest, dumbest way to remove duplicates is to use a set:

    values = mygenerator()
    unique_values = set(values)

But be careful: sets do not remember the order in which values were inserted, so this scrambles the sequence.
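If the order matters but you do not need lazy evaluation, one alternative is to go through a dict instead of a set, since dict keys keep insertion order in Python 3.7 and later. This is only a sketch: mygenerator here is a hypothetical stand-in for your own generator, and the whole input is consumed up front.

```python
def mygenerator():
    # Hypothetical generator that yields some repeated values.
    for value in [3, 1, 3, 2, 1, 4]:
        yield value

# dict.fromkeys keeps the first occurrence of each key, in insertion order
# (guaranteed from Python 3.7 on). It reads the entire generator eagerly.
unique_values = list(dict.fromkeys(mygenerator()))
print(unique_values)  # [3, 1, 2, 4]
```

Like set, this requires hashable values, and it will never finish on an infinite generator.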

The function below may be better than set for your purpose. It filters out duplicates without putting any of the other values out of order:

    def nub(it):
        seen = set()
        for x in it:
            if x not in seen:
                yield x
                seen.add(x)

Call nub with one argument, any iterable of hashable values. It returns an iterator that yields the same elements in the same order, but with duplicates removed.
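A quick usage sketch may help. The nub function is repeated so the snippet runs on its own, and mygenerator is a hypothetical generator made up for illustration. Because nub is itself a generator, it should also work on an endless input as long as you only take a bounded number of items from it.

```python
import itertools

def nub(it):
    seen = set()
    for x in it:
        if x not in seen:
            yield x
            seen.add(x)

def mygenerator():
    # Hypothetical generator that yields some repeated values.
    for value in [3, 1, 3, 2, 1, 4]:
        yield value

# Duplicates are dropped and first-seen order is preserved.
unique_in_order = list(nub(mygenerator()))
print(unique_in_order)  # [3, 1, 2, 4]

# nub is lazy: take three unique values from an infinite iterator.
first_three = list(itertools.islice(nub(itertools.cycle([1, 2, 3])), 3))
print(first_three)  # [1, 2, 3]
```

Note that the seen set grows with the number of distinct values, so memory use is proportional to how many unique items pass through.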

+9

itertools.groupby() can collapse adjacent duplicates if you are willing to do a little work.

    import itertools
    print([key for key, _ in itertools.groupby([1, 2, 2, 3])])  # prints [1, 2, 3]
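The "little work" matters because groupby only merges runs of equal neighbours. A small sketch of the difference, using a made-up input list: duplicates that are not adjacent survive unless the input is sorted first, and sorting gives up both the original order and lazy evaluation.

```python
import itertools

values = [1, 2, 2, 3, 1, 2]

# Only adjacent duplicates are collapsed; the later 1 and 2 remain.
collapsed = [key for key, _group in itertools.groupby(values)]
print(collapsed)  # [1, 2, 3, 1, 2]

# Sorting first makes all equal values adjacent, so every duplicate goes,
# but the result is in sorted order rather than the original order.
deduped = [key for key, _group in itertools.groupby(sorted(values))]
print(deduped)  # [1, 2, 3]
```

Unlike the set-based approaches, this works for values that are sortable but not hashable.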
+3