I am trying to get the number of unique elements in a CSV column using Python.
Example CSV file (no header):
AB,asd
AB,poi
AB,asd
BG,put
BG,asd
I have tried this so far.
import csv
from collections import defaultdict, Counter

with open('Results/1_sample.csv') as input_file:
    csv_reader = csv.reader(input_file, delimiter=',')
    # Group the second-column values by the first-column key
    data = defaultdict(list)
    for row in csv_reader:
        data[row[0]].append(row[1])

for k, v in data.items():
    print(k)
    print(Counter(v))
This gives the result in this format:
AB
Counter({'asd': 2, 'poi': 1})
BG
Counter({'asd': 1, 'put': 1})
But I want the output to be as follows:
AB:2
BG:2
total_unique_count:3 #unique count of column[1], irrespective of the data in column[0]
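For reference, here is a minimal sketch of one way the desired output could be produced, assuming that collecting the second-column values into sets is acceptable. The helper name unique_counts is hypothetical, and the sample data is read from an in-memory string rather than from Results/1_sample.csv so the snippet runs on its own.

```python
import csv
import io
from collections import defaultdict


def unique_counts(rows):
    """Return (unique count per first-column key, overall unique count).

    `rows` is any iterable of lists, e.g. a csv.reader object.
    The function name is hypothetical, chosen for illustration only.
    """
    per_key = defaultdict(set)   # key -> set of distinct second-column values
    all_values = set()           # distinct second-column values across all keys
    for row in rows:
        if len(row) < 2:
            continue             # skip blank or malformed lines
        per_key[row[0]].add(row[1])
        all_values.add(row[1])
    counts = {key: len(values) for key, values in per_key.items()}
    return counts, len(all_values)


# In-memory copy of the example CSV from the question (assumption: no header).
sample = io.StringIO("AB,asd\nAB,poi\nAB,asd\nBG,put\nBG,asd\n")
per_key, total = unique_counts(csv.reader(sample))

for key in sorted(per_key):
    print("{}:{}".format(key, per_key[key]))
print("total_unique_count:{}".format(total))
```

To use it on the real file, the csv.reader built from the opened file could be passed to unique_counts in place of the in-memory sample. A set is used instead of Counter because only the number of distinct values matters here, not how often each one occurs.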