Merging lists by value

Question


Is there an efficient way to merge two lists of tuples in Python based on a common value? I am currently doing the following:

    name = [
        (9, "John", "Smith"),
        (11, "Bob", "Dobbs"),
        (14, "Joe", "Bloggs")
    ]

    occupation = [
        (9, "Builder"),
        (11, "Baker"),
        (14, "Candlestick Maker")
    ]

    name_and_job = []

    for n in name:
        for o in occupation:
            if n[0] == o[0]:
                name_and_job.append( (n[0], n[1], n[2], o[1]) )

    print(name_and_job)

returns:

 [(9, 'John', 'Smith', 'Builder'), (11, 'Bob', 'Dobbs', 'Baker'), (14, 'Joe', 'Bloggs', 'Candlestick Maker')]

Although this code works fine for small lists, it is incredibly slow for longer lists with millions of entries. Is there a more efficient way to write this?

EDIT: The numbers in the first column are unique.

EDIT: I changed @John Kugelman's code a bit. I added get(), in case an id in the names dictionary has no matching key in the occupation dictionary:

    >>> names_and_jobs = {id: names[id] + (jobs.get(id),) for id in names}
    >>> print(names_and_jobs)
    {9: ('John', 'Smith', None), 11: ('Bob', 'Dobbs', 'Baker'), 14: ('Joe', 'Bloggs', 'Candlestick Maker')}
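
The output above has None for id 9, which suggests that id was missing from the jobs dictionary at the time. A minimal sketch that reproduces this, assuming (for illustration only) a jobs dictionary without an entry for 9:

```python
names = {9: ("John", "Smith"), 11: ("Bob", "Dobbs"), 14: ("Joe", "Bloggs")}

# Assumption for illustration: id 9 has no entry in jobs.
jobs = {11: "Baker", 14: "Candlestick Maker"}

# jobs[9] would raise KeyError; jobs.get(9) returns None instead,
# so every id in names still ends up in the result.
names_and_jobs = {id_: names[id_] + (jobs.get(id_),) for id_ in names}
print(names_and_jobs)
```

If None is not a useful placeholder, get() also accepts a default as its second argument, for example jobs.get(id_, "unknown").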


performance python dictionary list for-loop

Jesse Reilly Jun 03 '15 at 10:32


2 answers

    from collections import OrderedDict
    from itertools import chain

    od = OrderedDict()
    for ele in chain(name, occupation):
        od.setdefault(ele[0], []).extend(ele[1:])

    print([[k] + val for k, val in od.items()])

    [[9, 'John', 'Smith', 'Builder'], [11, 'Bob', 'Dobbs', 'Baker'], [14, 'Joe', 'Bloggs', 'Candlestick Maker']]

If you want the data ordered by the way it appears in name, use an OrderedDict: regular dicts only guarantee insertion order from Python 3.7 onward, and before that they were unordered.
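
On Python 3.7 and later, plain dicts keep insertion order, so the same grouping can be sketched without OrderedDict. This is only a variation on the code above, assuming the name and occupation lists from the question:

```python
from itertools import chain

name = [(9, "John", "Smith"), (11, "Bob", "Dobbs"), (14, "Joe", "Bloggs")]
occupation = [(9, "Builder"), (11, "Baker"), (14, "Candlestick Maker")]

# A regular dict preserves insertion order on Python 3.7+,
# so ids come out in the order they first appear in name.
merged = {}
for ele in chain(name, occupation):
    # ele[0] is the shared id; the remaining fields are appended to its list.
    merged.setdefault(ele[0], []).extend(ele[1:])

result = [[k] + val for k, val in merged.items()]
print(result)
```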

You can also build the desired tuples directly in the loop, and then call od.values() to get the tuples:

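    from collections import OrderedDict
    from itertools import chain

    od = OrderedDict()
    for ele in chain(name, occupation):
        k = ele[0]
        if k in od:
            # id already seen: append the remaining fields to the stored tuple
            od[k] = od[k] + ele[1:]
        else:
            # first time this id appears: store the whole tuple
            od[k] = ele

    # list() is needed on Python 3, where values() returns a view
    print(list(od.values()))

    [(9, 'John', 'Smith', 'Builder'), (11, 'Bob', 'Dobbs', 'Baker'), (14, 'Joe', 'Bloggs', 'Candlestick Maker')]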


Padraic Cunningham Jun 03 '15 at 10:39


John Kugelman · Accepted Answer · 2015-06-03T22:37:38+0000

Use dictionaries instead of flat lists.

    names = {
        9: ("John", "Smith"),
        11: ("Bob", "Dobbs"),
        14: ("Joe", "Bloggs")
    }

    jobs = {
        9: "Builder",
        11: "Baker",
        14: "Candlestick Maker"
    }

If you need to convert them to this format, you can do:

    >>> {id: (first, last) for id, first, last in name}
    {9: ('John', 'Smith'), 11: ('Bob', 'Dobbs'), 14: ('Joe', 'Bloggs')}
    >>> {id: job for id, job in occupation}
    {9: 'Builder', 11: 'Baker', 14: 'Candlestick Maker'}

Then it is a piece of cake to combine the two:

 names_and_jobs = {id: names[id] + (jobs[id],) for id in names}
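
To get back a list of tuples in the question's original format, the steps can be put together roughly as follows. This is a sketch, not a definitive implementation; the helper name merge_by_id is made up for illustration. It does one pass over each list, so the work grows with len(name) + len(occupation) rather than with their product as in the nested loops.

```python
name = [(9, "John", "Smith"), (11, "Bob", "Dobbs"), (14, "Joe", "Bloggs")]
occupation = [(9, "Builder"), (11, "Baker"), (14, "Candlestick Maker")]


def merge_by_id(name, occupation):
    """Join the two tuple lists on their first element via a dict lookup.

    merge_by_id is a hypothetical helper name, not part of any library.
    """
    # One pass over occupation builds the lookup table: id -> job.
    jobs = {id_: job for id_, job in occupation}
    # One pass over name; each dict lookup is O(1) on average.
    return [
        (id_, first, last, jobs[id_])
        for id_, first, last in name
        if id_ in jobs  # skip ids without a job, like the original nested loop
    ]


name_and_job = merge_by_id(name, occupation)
print(name_and_job)
```

This relies on the ids being unique, as stated in the question; with duplicate ids in occupation, the dict would keep only the last job for each id.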
