What is the fastest way to remove all elements that occur more than once from a list?

What is the fastest way to remove every element that occurs more than once from a list of arbitrary elements (in my example, a list of lists)? The result should contain only the elements that occur exactly once in the list, so every duplicated element is dropped entirely.

input: [[1, 2], [1, 3], [1, 4], [1, 2], [1, 4], [1, 2]]

output: [[1, 3]]

This solution was slow:

output = [item for item in input if input.count(item)==1] 

This solution was faster:

    duplicates = []
    output = []
    for item in input:
        if item not in duplicates:
            if item in output:
                # Second occurrence: drop it and remember it as a duplicate.
                output.remove(item)
                duplicates.append(item)
            else:
                output.append(item)

Is there a better solution, perhaps by sorting the list first? Any ideas are welcome.

2 answers

If you do not care about maintaining order:

    from collections import Counter

    def only_uniques(seq):
        # items() works on both Python 2 and Python 3.
        return [k for k, n in Counter(seq).items() if n == 1]

If you care about maintaining order:

    from collections import Counter

    def only_uniques_ordered(seq):
        counts = Counter(seq)
        return [k for k in seq if counts[k] == 1]

Both functions run in O(n) time on average, because counting and lookups in a Counter are hash-table operations.
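
To see how this compares with the `count`-based comprehension from the question, a rough timing sketch might look like the following. The helper name `only_uniques_count`, the generated test data, and the repetition count are assumptions made for illustration, and absolute timings will vary from machine to machine.

```python
from collections import Counter
import random
import timeit

def only_uniques_count(seq):
    # Quadratic: seq.count() rescans the whole list for every element.
    return [item for item in seq if seq.count(item) == 1]

def only_uniques_ordered(seq):
    # Linear on average: one pass to count, one pass to filter.
    counts = Counter(seq)
    return [k for k in seq if counts[k] == 1]

# Hypothetical test data: 2000 small tuples with many repeats.
random.seed(0)
data = [(1, random.randrange(1500)) for _ in range(2000)]

# Both approaches must agree before timing them.
assert only_uniques_count(data) == only_uniques_ordered(data)

t_count = timeit.timeit(lambda: only_uniques_count(data), number=3)
t_counter = timeit.timeit(lambda: only_uniques_ordered(data), number=3)
print("count():  %.4f s" % t_count)
print("Counter: %.4f s" % t_counter)
```

The gap between the two grows with the length of the list, since the first version does work proportional to n squared.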


Edit: I forgot that your example is a list of lists. Counter needs hashable elements, and lists are not hashable because they are mutable, so convert each inner list to a tuple first:

 list_of_tuples = [tuple(k) for k in list_of_lists] 

Then run list_of_tuples through one of the functions above. Note that you will get a list of tuples back, but unless you need to modify the sequences afterwards, tuples should work just as well for your purposes.

If you need to convert back, it's about the same:

 list_of_lists = [list(k) for k in list_of_tuples] 
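
Putting the pieces together, a minimal end-to-end sketch for the list-of-lists case could look like this. The wrapper name `unique_sublists` is hypothetical, and the sketch assumes every inner list contains only hashable items.

```python
from collections import Counter

def unique_sublists(list_of_lists):
    # Lists are unhashable, so count tuple copies of them instead.
    list_of_tuples = [tuple(k) for k in list_of_lists]
    counts = Counter(list_of_tuples)
    # Keep the original order and convert the survivors back to lists.
    return [list(k) for k in list_of_tuples if counts[k] == 1]

data = [[1, 2], [1, 3], [1, 4], [1, 2], [1, 4], [1, 2]]
print(unique_sublists(data))  # [[1, 3]]
```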

A one-liner that removes the duplicates but keeps one copy of each element:

    a = [[1, 2], [1, 3], [1, 4], [1, 2], [1, 4], [1, 2]]
    print(list(set(tuple(i) for i in a)))

user $ time python foo.py
[(1, 2), (1, 3), (1, 4)]

real 0m0.037s
user 0m0.024s
sys 0m0.010s

To print only the items that occur exactly once, here is a variant of Amber's solution that does not use the collections module:

    a = [[1, 2], [3, 4], [1, 3], [1, 4], [1, 2], [1, 4], [1, 2]]
    # Map each item (as a tuple) to the number of times it occurs in a.
    d = {tuple(i): a.count(i) for i in a}
    print([k for k, v in d.items() if v == 1])

Output:

 [(1, 3), (3, 4)] 
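
Note that `a.count(i)` rescans the list for every element, so the dictionary comprehension above takes quadratic time. If the goal is only to avoid the collections module, a plain dictionary can do the counting in a single pass. This is a sketch, and `only_uniques_dict` is a name made up for illustration:

```python
def only_uniques_dict(list_of_lists):
    counts = {}
    for item in list_of_lists:
        key = tuple(item)  # tuples are hashable, lists are not
        counts[key] = counts.get(key, 0) + 1
    # Second pass keeps the original order of the surviving items.
    return [item for item in list_of_lists if counts[tuple(item)] == 1]

a = [[1, 2], [3, 4], [1, 3], [1, 4], [1, 2], [1, 4], [1, 2]]
print(only_uniques_dict(a))  # [[3, 4], [1, 3]]
```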