Removing duplicates from a list of lists in Python

Can someone suggest a good solution for removing duplicates from nested lists if you want to evaluate duplicates based on the first element of each nested list?

The main list is as follows:

L = [['14', '65', 76], ['2', '5', 6], ['7', '12', 33], ['14', '22', 46]] 

If another nested list has the same element in the first position [k][0] as one that has already occurred, I would like to delete that list and get this result:

 L = [['14', '65', 76], ['2', '5', 6], ['7', '12', 33]] 

Can you suggest an algorithm to achieve this?

+9
python list
Jul 17 '09 at 13:45
5 answers

Do you care about maintaining order, or about which duplicate gets deleted? If not, then:

 dict((x[0], x) for x in L).values() 

will do it. If you want to keep the order, and keep the first one you find, then:

 def unique_items(L):
     found = set()
     for item in L:
         if item[0] not in found:
             yield item
             found.add(item[0])

 print(list(unique_items(L)))
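On Python 3.7+, where plain dicts preserve insertion order, the same first-occurrence-wins behavior can also be had without a generator — a sketch of my own, not part of the original answer:

```python
L = [['14', '65', 76], ['2', '5', 6], ['7', '12', 33], ['14', '22', 46]]

# setdefault only stores a value if the key is absent,
# so the first row for each key wins.
seen = {}
for item in L:
    seen.setdefault(item[0], item)

result = list(seen.values())
print(result)  # [['14', '65', 76], ['2', '5', 6], ['7', '12', 33]]
```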
+27
Jul 17 '09 at 13:54

Use a dict, like this:

 L = {'14': ['65', 76], '2': ['5', 6], '7': ['12', 33]}
 L['14'] = ['22', 46]

If you get the original list from some external source, convert it like this:

 L = [['14', '65', 76], ['2', '5', 6], ['7', '12', 33], ['14', '22', 46]]
 L_dict = dict((x[0], x[1:]) for x in L)
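Note that with this conversion, later rows overwrite earlier ones for a repeated key, so each key ends up with the *last* duplicate's tail. A quick sketch (my own) of looking up a key and rebuilding a list of lists from the dict:

```python
L = [['14', '65', 76], ['2', '5', 6], ['7', '12', 33], ['14', '22', 46]]
L_dict = dict((x[0], x[1:]) for x in L)

# Later entries overwrite earlier ones, so '14' maps to the last row's tail.
print(L_dict['14'])  # ['22', 46]

# Rebuild the list-of-lists form from the dict.
L_rebuilt = [[k] + v for k, v in L_dict.items()]
```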
+3
Jul 17 '09 at 13:52

I'm not sure what you mean by "another list", so I assume you mean the lists inside L:

 a = []
 L = [['14', '65', 76], ['2', '5', 6], ['7', '12', 33], ['14', '22', 46], ['7', 'a', 'b']]
 for item in L:
     if item[0] not in a:
         a.append(item[0])
         print(item)
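The loop above only prints the unique rows; a small variant (my sketch) collects them into a new list instead:

```python
L = [['14', '65', 76], ['2', '5', 6], ['7', '12', 33], ['14', '22', 46], ['7', 'a', 'b']]

seen_keys = []
unique = []
for item in L:
    if item[0] not in seen_keys:  # linear scan; fine for small lists
        seen_keys.append(item[0])
        unique.append(item)

print(unique)  # [['14', '65', 76], ['2', '5', 6], ['7', '12', 33]]
```

For long lists a set would make the membership test O(1), as in the accepted answer.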
0
Jul 17 '09 at 13:50

If the order doesn't matter, the code below

 print([[k] + v for (k, v) in dict([[a[0], a[1:]] for a in reversed(L)]).items()])

gives

[['2', '5', 6], ['14', '65', 76], ['7', '12', 33]]

(the key order may vary, since it depends on the dict's iteration order)
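Feeding reversed(L) into dict works because later duplicates are inserted first and then overwritten by earlier rows, so each key keeps its *first* value from L. A quick check (my sketch):

```python
L = [['14', '65', 76], ['2', '5', 6], ['7', '12', 33], ['14', '22', 46]]

# The duplicate ['14', '22', 46] is inserted first (reversed order) and is
# then overwritten by ['14', '65', 76], the first occurrence in L.
deduped = [[k] + v for k, v in dict([a[0], a[1:]] for a in reversed(L)).items()]
```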

0
Jul 17 '09 at 14:03

Use Pandas:

 import pandas as pd

 L = [['14', '65', 76], ['2', '5', 6], ['7', '12', 33], ['14', '22', 46], ['7', 'a', 'b']]
 df = pd.DataFrame(L)
 df = df.drop_duplicates()
 L_no_duplicates = df.values.tolist()

Note that drop_duplicates() with no arguments only drops rows that are identical in every column. If you want to drop duplicates based on specific columns only, pass them as the subset argument instead:

 df = df.drop_duplicates(subset=[1, 2])
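For this question specifically, that means deduplicating on the first column only — a sketch (keep='first' is the pandas default, shown here for clarity):

```python
import pandas as pd

L = [['14', '65', 76], ['2', '5', 6], ['7', '12', 33], ['14', '22', 46]]
df = pd.DataFrame(L)

# Drop rows whose value in column 0 has been seen before.
deduped = df.drop_duplicates(subset=[0], keep='first').values.tolist()
print(deduped)  # [['14', '65', 76], ['2', '5', 6], ['7', '12', 33]]
```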
0
Mar 17 '16 at 8:57
