Count duplicates in tuple list

I have a list of tuples: a = [(1,2),(1,4),(1,2),(6,7),(2,9)]. I want to check whether an element of each tuple matches the element at the same position in another tuple, and how many times this happens.

For example: if only the 1st element in some tuples has a duplicate, return the tuple and how many times it was duplicated. I can do this with the following code:

a = [(1,2), (1,4), (1,2), (6,7), (2,9)]

coll_list = []
for t in a:
    # Count the tuples whose first element matches t's first element.
    coll_cnt = 0
    for b in a:
        if b[0] == t[0]:
            coll_cnt += 1
    print("%s,%d" % (t, coll_cnt))
    coll_list.append((t, coll_cnt))

print(coll_list)

Is there a more efficient way to do this?

+5
5 answers

Use the collections module. In the following code, val_1 and val_2 give the number of occurrences of each first element and each second element of the tuples, respectively.

import collections
val_1=collections.Counter([x for (x,y) in a])
val_2=collections.Counter([y for (x,y) in a])

>>> print(val_1)
<<< Counter({1: 3, 2: 1, 6: 1})

This is the number of occurrences of the first element of each tuple.

>>> print(val_2)
<<< Counter({2: 2, 9: 1, 4: 1, 7: 1})
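
To get back the list of (tuple, count) pairs from the question, the per-position counts can be looked up in a second pass. This is only a sketch, assuming Python 3; the name first_counts is made up for illustration, and coll_list mirrors the variable in the question.

```python
from collections import Counter

a = [(1, 2), (1, 4), (1, 2), (6, 7), (2, 9)]

# First pass: count how often each first element occurs.
first_counts = Counter(x for x, _ in a)

# Second pass: pair every tuple with the count of its first element.
coll_list = [(t, first_counts[t[0]]) for t in a]

print(coll_list)
# [((1, 2), 3), ((1, 4), 3), ((1, 2), 3), ((6, 7), 1), ((2, 9), 1)]
```

That is two passes over the list instead of one inner loop per element.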

+6

You can use Counter:

from collections import Counter
a = [(1,2),(1,4),(1,2),(6,7),(2,9)]
counter = Counter(a)
print(counter)

Output:

Counter({(1, 2): 2, (6, 7): 1, (2, 9): 1, (1, 4): 1})

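The result is a dictionary-like object (a Counter) with the tuples as keys and their counts as values. For example, (1,2) occurs twice, and you can look that up directly: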

>>> counter[(1,2)]
2

If you want the counts per position instead, count the first and second elements separately:

first_element = Counter([x for (x,y) in a])
second_element = Counter([y for (x,y) in a])

first_element and second_element are now Counter objects holding the counts for each position:

>>> first_element
Counter({1: 3, 2: 1, 6: 1})
>>> second_element
Counter({2: 2, 9: 1, 4: 1, 7: 1})

These are dictionary-like as well, so you can look up the count of a specific value:

>>> first_element[2]
1

This means that 2 occurs 1 time as a first element.
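
If the tuples might have more than two elements, the same idea can probably be applied to every position at once. A rough sketch, assuming Python 3 and that all tuples have the same length; position_counts and duplicates are names invented here, not part of the answer above.

```python
from collections import Counter

a = [(1, 2), (1, 4), (1, 2), (6, 7), (2, 9)]

# zip(*a) transposes the list: one tuple of values per position.
position_counts = [Counter(column) for column in zip(*a)]

# Keep only the values that occur more than once at each position.
duplicates = [
    {value: n for value, n in counts.items() if n > 1}
    for counts in position_counts
]

print(duplicates)
# [{1: 3}, {2: 2}]
```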

+11

You can also build the counts yourself in a dictionary, here called count_map.

>>> count_map = {}
>>> for t in a:
...     count_map[t] = count_map.get(t, 0) + 1
... 
>>> count_map
{(1, 2): 2, (6, 7): 1, (2, 9): 1, (1, 4): 1}
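
A possible variation on the loop above, sketched with collections.defaultdict so the .get(t, 0) default is not needed. This assumes Python 3.7 or later, where dictionaries keep insertion order, which is why the printed order may differ from the output shown above.

```python
from collections import defaultdict

a = [(1, 2), (1, 4), (1, 2), (6, 7), (2, 9)]

# Missing keys start at int() == 0, so each tuple can be incremented directly.
count_map = defaultdict(int)
for t in a:
    count_map[t] += 1

print(dict(count_map))
# {(1, 2): 2, (1, 4): 1, (6, 7): 1, (2, 9): 1}
```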
+4

If you use pandas, you can do it like this:

import pandas
print(pandas.Series(data=[(1,2),(1,4),(1,2),(6,7),(2,9)]).value_counts())

(1, 2)    2
(1, 4)    1
(6, 7)    1
(2, 9)    1
dtype: int64
+2

A dictionary may work better. Your code loops over the whole list once for every element in it, which makes its complexity O(n^2). That is not very good :)

A better way is to traverse the list only once and do one or two checks per element. Here is my first solution for this kind of problem.

a = [(1,2),(1,4),(1,2),(6,7),(2,9)]

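# Named counts rather than dict, so the built-in dict is not shadowed.
counts = {}
for (i, j) in a:
    # "in" replaces has_key(), which only exists in Python 2.
    if i in counts:
        counts[i] += 1
    else:
        counts[i] = 1

print(counts)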

This code produces the following result:

{1: 3, 2: 1, 6: 1}
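
The same single-pass loop could be wrapped in a small helper so it works for any tuple position. This is just a sketch, assuming Python 3.7 or later (dictionaries keep insertion order, so the key order may differ from the result shown above); count_by_position is a hypothetical name, not an existing library function.

```python
def count_by_position(tuples, index):
    """Count how often each value appears at the given tuple position."""
    counts = {}
    for t in tuples:
        key = t[index]
        # One dictionary lookup and one update per tuple: a single pass.
        counts[key] = counts.get(key, 0) + 1
    return counts


a = [(1, 2), (1, 4), (1, 2), (6, 7), (2, 9)]

print(count_by_position(a, 0))  # {1: 3, 6: 1, 2: 1}
print(count_by_position(a, 1))  # {2: 2, 4: 1, 7: 1, 9: 1}
```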

I hope this will be helpful.

+2