Python: filter a list, keeping only the elements that occur once

I would like to filter this list,

l = [0,1,1,2,2]

to leave only

[0].

I'm struggling to do this in a "pythonic" way :o) Is this possible without nested loops?

+6
python list filter
9 answers

Here is another dictionary-oriented approach:

    l = [0, 1, 1, 2, 2]
    d = {}
    for i in l:
        d[i] = i in d

    [k for k in d if not d[k]]  # unordered, loops over the dictionary
    [k for k in l if not d[k]]  # ordered, loops over the original list
+8

You will need two loops (or, equivalently, a loop and a listcomp, as shown below), but not nested ones:

    import collections
    d = collections.defaultdict(int)
    for x in L:
        d[x] += 1
    L[:] = [x for x in L if d[x] == 1]

This solution assumes that the list items are hashable, i.e. usable as keys in dicts, members of sets, etc.

The OP indicated that they may care about object IDENTITY rather than VALUE (for example, two sub-lists both equal to [1, 2, 3], which are equal but need not be the same object, would not be considered duplicates). If so, this code can still be used: just replace d[x] with d[id(x)] in both places, and it will work for ANY type of object in list L.
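A minimal sketch of that identity-based variant (the variable names here are illustrative, not from the answer): two equal but distinct sub-lists are counted separately, so only the object that appears exactly once by identity survives.

```python
import collections

a = [1, 2, 3]
b = [1, 2, 3]          # equal to a, but a distinct object
L = [a, b, a]          # a appears twice by identity, b only once

d = collections.defaultdict(int)
for x in L:
    d[id(x)] += 1      # count by object identity, not by value

result = [x for x in L if d[id(x)] == 1]
print(result)          # only b occurs exactly once
```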

Mutable containers (lists, dicts, sets, ...) are normally not hashable and therefore cannot be used this way. User-defined objects are hashable by default (with hash(x) == id(x)), unless their class defines comparison special methods ( __eq__ , __cmp__ , ...), in which case they are hashable if and only if their class also defines __hash__ .
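A quick illustration of that rule (shown with Python 3 semantics, where defining __eq__ without __hash__ makes instances unhashable; the class names are made up for the example):

```python
class Plain:
    pass                      # no comparison methods: hashable by default


class EqOnly:
    def __eq__(self, other):  # defines __eq__ but no __hash__
        return isinstance(other, EqOnly)


print(isinstance(hash(Plain()), int))   # True: default id-based hash works

try:
    hash(EqOnly())
    eq_only_hashable = True
except TypeError:
    eq_only_hashable = False
print(eq_only_hashable)                 # False: __eq__ without __hash__
```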

If the items of list L are not hashable but are comparable for inequality (and therefore sortable), and you do not care about their order in the list, you can get the task done in O(N log N) by first sorting the list and then applying itertools.groupby (almost, but not quite, the way another answer suggested).
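A sketch of that sort-then-groupby route (the function name is mine): sorting makes equal items adjacent, so each group with exactly one element is an item that occurs once.

```python
from itertools import groupby


def once_via_groupby(items):
    # O(N log N): sort so equal items are adjacent, then keep the
    # representative of every group that contains exactly one element
    result = []
    for key, group in groupby(sorted(items)):
        first = next(group)
        if next(group, None) is None:   # no second element: occurs once
            result.append(first)
    return result


print(once_via_groupby([0, 1, 1, 2, 2]))   # [0]
```

Note that the output comes back in sorted order, which is fine under the stated assumption that the original order does not matter.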

Other approaches, of gradually decreasing performance and increasing generality, can deal with unhashable but sortable items when you do care about the list's original order (make a sorted copy, and in a second loop check for repetitions in it with bisect - also O(N log N), but a tad slower), and with items whose only usable property is that they are comparable for equality (there is no way to avoid the dreaded O(N**2) performance in that maximally general case).
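A sketch of the order-preserving bisect variant described above (helper names are mine): make a sorted copy once, then count each item's occurrences with two binary searches.

```python
from bisect import bisect_left, bisect_right


def once_keep_order(items):
    # Sorted copy costs O(N log N); each count is two O(log N) bisects
    snapshot = sorted(items)

    def occurrences(x):
        # number of copies of x in the sorted snapshot
        return bisect_right(snapshot, x) - bisect_left(snapshot, x)

    return [x for x in items if occurrences(x) == 1]


print(once_keep_order([2, 0, 1, 1, 2]))   # [0], original order preserved
```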

If the OP can clarify which case applies to their specific problem, I will be happy to help (and in particular, if their items are hashable, the code I gave above should suffice ;-).

+12
    [x for x in the_list if the_list.count(x) == 1]

Although this still hides a nested loop behind the scenes.

+9

In the same vein as Alex's solution, you can use a Counter/multiset (built into 2.7, with a recipe compatible with 2.5 and above) to do the same:

    In [1]: from counter import Counter
    In [2]: L = [0, 1, 1, 2, 2]
    In [3]: multiset = Counter(L)
    In [4]: [x for x in L if multiset[x] == 1]
    Out[4]: [0]
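Since the answer notes that Counter is built in from 2.7 on (and it is also in Python 3), the same idea without the recipe module reads:

```python
from collections import Counter

L = [0, 1, 1, 2, 2]
multiset = Counter(L)                       # maps each element to its count
print([x for x in L if multiset[x] == 1])   # [0]
```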
+4
    >>> l = [0, 1, 1, 2, 2]
    >>> [x for x in l if l.count(x) == 1]  # == rather than "is": identity comparison of ints is unreliable
    [0]
+3
    l = [0, 1, 2, 1, 2]

    def justonce(l):
        once = set()
        more = set()
        for x in l:
            if x not in more:
                if x in once:
                    more.add(x)
                    once.remove(x)
                else:
                    once.add(x)
        return once

    print justonce(l)
+3

I think the actual timings are interesting:

Alex's answer:

    python -m timeit -s "l = range(1,1000,2) + range(1,1000,3); import collections" "d = collections.defaultdict(int)" "for x in l: d[x] += 1" "l[:] = [x for x in l if d[x] == 1]"
    1000 loops, best of 3: 370 usec per loop

Mine:

    python -m timeit -s "l = range(1,1000,2) + range(1,1000,3)" "once = set()" "more = set()" "for x in l:" " if x not in more:" "  if x in once:" "   more.add(x)" "   once.remove(x)" "  else:" "   once.add(x)"
    1000 loops, best of 3: 275 usec per loop

sepp2k's O(n**2) version, to demonstrate why complexity matters ;-)

    python -m timeit -s "l = range(1,1000,2) + range(1,1000,3)" "[x for x in l if l.count(x)==1]"
    100 loops, best of 3: 16 msec per loop

Roberto's groupby version, plus sorted:

    python -m timeit -s "l = range(1,1000,2) + range(1,1000,3); import itertools" "[elem[0] for elem in itertools.groupby(sorted(l)) if elem[1].next()== 0]"
    1000 loops, best of 3: 316 usec per loop

mhawke's:

    python -m timeit -s "l = range(1,1000,2) + range(1,1000,3)" "d = {}" "for i in l: d[i] = d.has_key(i)" "[k for k in d.keys() if not d[k]]"
    1000 loops, best of 3: 251 usec per loop

I like the last one: clever and fast ;-)

+1
    >>> l = [0, 1, 1, 2, 2]
    >>> [x for x in l if l.count(x) == 1]
    [0]
+1

A similar question came up in a coding challenge. My solution is not the most elegant, but it was the first one I came up with on my own, so I'd like to share it!

    testlist = [2, 4, 6, 8, 10, 2, 6, 10]

    def unique_elements(testlist):
        final_list = []
        for x in testlist:
            if testlist.count(x) == 1:
                final_list.append(x)
        print(final_list)

    unique_elements(testlist)
0
