If your CSV files are not so large that loading them into memory would bring your machine to its knees, you can try something like:
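    import csv

    # DictReader yields dicts, which are unhashable, so freeze each row before adding it to a set
    with open('file1.csv', newline='') as f1, open('file2.csv', newline='') as f2:
        set1 = {frozenset(row.items()) for row in csv.DictReader(f1)}
        set2 = {frozenset(row.items()) for row in csv.DictReader(f2)}

    # rows that appear in file1.csv but not in file2.csv
    print(set1 - set2)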
For large files, you can load them into an SQLite database and use SQL queries to do the same, or sort both files by the appropriate keys and then merge them.
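As a rough sketch of the SQLite route only, something like the following could work. It assumes both files have a header row and the same columns in the same order, and that every data row has as many fields as the header. The helper names `load_csv` and `rows_only_in_first` and the table names `t1` and `t2` are made up for illustration.

```python
import csv
import sqlite3


def load_csv(conn, table, path):
    """Create `table` from the CSV header and stream the data rows into it."""
    with open(path, newline='') as f:
        reader = csv.reader(f)
        header = next(reader)
        # Quote column names so headers with spaces or keywords still work
        cols = ', '.join('"{}"'.format(name.replace('"', '""')) for name in header)
        marks = ', '.join('?' for _ in header)
        conn.execute('CREATE TABLE "{}" ({})'.format(table, cols))
        # executemany consumes the reader lazily, so the file is never held in memory at once
        conn.executemany('INSERT INTO "{}" VALUES ({})'.format(table, marks), reader)


def rows_only_in_first(path1, path2, db_path=':memory:'):
    """Return the distinct rows of path1 that do not appear in path2."""
    conn = sqlite3.connect(db_path)
    try:
        load_csv(conn, 't1', path1)
        load_csv(conn, 't2', path2)
        conn.commit()
        # EXCEPT is the SQL counterpart of the set difference used above
        return conn.execute('SELECT * FROM t1 EXCEPT SELECT * FROM t2').fetchall()
    finally:
        conn.close()
```

For files that really do not fit in memory, pass a path to a new database file as `db_path` instead of the default `':memory:'`, for example `rows_only_in_first('file1.csv', 'file2.csv', 'diff.db')`. All values are stored and compared as text, so `1` and `1.0` count as different.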