Matching two lists of dictionaries: the most elegant solution (Python)

I have two lists of dictionaries, new and old. The dictionaries represent the same objects in both lists. I need to find the differences and build a new list of dictionaries that contains only the objects present in the new list, updated with the attributes from the old dictionaries.
Example:

    list_new = [
        {'id': 1, 'name': 'bob', 'desc': 'cool gay'},
        {'id': 2, 'name': 'Bill', 'desc': 'bad gay'},
        {'id': 3, 'name': 'Vasya', 'desc': None},
    ]

    list_old = [
        {'id': 1, 'name': 'boby', 'desc': 'cool gay', 'some_data': '12345'},
        {'id': 2, 'name': 'Bill', 'desc': 'cool gay', 'some_data': '12345'},
        {'id': 3, 'name': 'vasya', 'desc': 'the man', 'some_data': '12345'},
        {'id': 4, 'name': 'Elvis', 'desc': 'singer', 'some_data': '12345'},
    ]

So, in this example, I want to create a new list containing only the entries from list_new, with updated data. Entries are matched by id. So bob will become boby, Bill's desc will become 'cool gay', and Vasya's desc will become 'the man'. Elvis must be absent from the result.

I am looking for an elegant solution with fewer iterations.

Here is one way I came up with, though it is not the best:

    def match_dict(new_list, old_list):
        # collect the ids present in the new list
        ids_new = []
        for item in new_list:
            ids_new.append(item['id'])
        result = []
        for item_old in old_list:
            if item_old['id'] in ids_new:
                # find the matching new item and copy the extra data over
                for item_new in new_list:
                    if item_new['id'] == item_old['id']:
                        item_new['some_data'] = item_old['some_data']
                        result.append(item_new)
        return result

I have doubts about it because there is a loop nested inside another loop. With lists of 2000 items, the process will take a long time.

+6
python dictionary
9 answers

Steps:

  • Create a search dictionary for list_old by id
  • Loop through the dicts in list_new and create a merged dict for each one that also exists in the old list

The code:

    def match_dict(new_list, old_list):
        # lookup table: id -> old dict
        old = dict((v['id'], v) for v in old_list)
        # merge each new dict with its old counterpart (old values win)
        return [dict(d, **old[d['id']]) for d in new_list if d['id'] in old]

EDIT: Fixed incorrectly named variables inside the function.

+1

Cannot get it in one line, but here is a simpler version:

    def match_new(new_list, old_list):
        # lookup table: id -> new dict
        ids = dict((item['id'], item) for item in new_list)
        return [ids[item['id']] for item in old_list if item['id'] in ids]
+3

Without knowing the constraints on your data, I assume that id is unique within each list and that your dictionaries contain only simple value types (string, int, ...) that are hashable.

    # first index each list by id
    new = {item['id']: item for item in list_new}
    old = {item['id']: item for item in list_old}

    # now you can see which ids appeared in the new list
    created = set(new.keys()) - set(old.keys())
    # or which ids were deleted
    deleted = set(old.keys()) - set(new.keys())
    # or which ids exist in both lists
    intersect = set(new.keys()).intersection(set(old.keys()))

    # using the same 'conversion to set' trick,
    # you can see what is different for each item
    diff = {id: dict(set(new[id].items()) - set(old[id].items())) for id in intersect}

    # using your example data set, diff now contains the differences
    # for items which exist in both lists:
    # {1: {'name': 'bob'}, 2: {'desc': 'bad gay'}, 3: {'name': 'Vasya', 'desc': None}}

    # you can now add the new ids to this diff
    diff.update({id: new[id] for id in created})

    # and get your data back into the original format:
    list_diff = [dict(data, **{'id': id}) for id, data in diff.items()]

Python 3 syntax is used, but it should be easy to port to Python 2.

Edit: here is the same code written for Python 2.5:

    new = dict((item['id'], item) for item in list_new)
    old = dict((item['id'], item) for item in list_old)
    created = set(new.keys()) - set(old.keys())
    deleted = set(old.keys()) - set(new.keys())
    intersect = set(new.keys()).intersection(set(old.keys()))
    diff = dict((id, dict(set(new[id].items()) - set(old[id].items()))) for id in intersect)
    # build (id, item) pairs for the newly created ids
    diff.update(dict((id, new[id]) for id in created))
    list_diff = [dict(data, **{'id': id}) for id, data in diff.items()]

(note how the code is less readable without dict comprehensions)

+2

For each dictionary in old_list, find the dictionary in new_list with the same id, then do: old_dict.update(new_dict)

After the update, remove each matched new_dict from new_list, and once the loop is finished, add the remaining unmatched dicts to the result.
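The approach described above could be sketched roughly as follows. This is only a sketch, assuming ids are unique within each list; the name `merge_by_update` and the `remaining` lookup are illustrative and not part of the answer. It uses a dict lookup instead of a linear search to find the matching entry, and it assumes that old entries without a counterpart in the new list are dropped, which the answer does not state explicitly.

```python
def merge_by_update(new_list, old_list):
    # Hypothetical helper: index the new dicts by id for constant-time lookup.
    remaining = {item['id']: item for item in new_list}
    result = []
    for old_dict in old_list:
        # pop() both finds the match and removes it from the unused pool
        new_dict = remaining.pop(old_dict['id'], None)
        if new_dict is None:
            continue  # assumption: old entries with no new counterpart are dropped
        merged = dict(old_dict)   # copy so the input list is not mutated
        merged.update(new_dict)   # as in the answer: new values overwrite old ones
        result.append(merged)
    # add the new dicts that had no counterpart in old_list
    result.extend(remaining.values())
    return result
```

Note that with `old_dict.update(new_dict)` the values from the new list take precedence, which is the opposite of what the question asks for; swapping the two dicts gives the other behaviour.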

+1

You need something like this:

    l = []
    for d in list_old:
        for e in list_new:
            if e['id'] == d['id']:
                # merge the two dicts; values from the old dict win
                l.append(dict(e, **d))
    print l

Read here on how to merge dictionaries.

+1

You can do something like this:

    def match_dict(new_list, old_list):
        new_dict = dict((obj['id'], obj) for obj in new_list)
        old_dict = dict((obj['id'], obj) for obj in old_list)
        # iterate over a copy of the keys, since entries are deleted inside the loop
        for k in list(new_dict):
            if k in old_dict:
                new_dict[k].update(old_dict[k])
            else:
                del new_dict[k]
        return new_dict.values()

If you do this often, I suggest storing your data as a dictionary keyed by id rather than as a list, so that you do not need to convert it every time.

Edit: Here is an example showing how to store the data in a dictionary.

    list_new = [{'desc': 'cool guy', 'id': 1, 'name': 'bob'},
                {'desc': 'bad guy', 'id': 2, 'name': 'Bill'},
                {'desc': None, 'id': 3, 'name': 'Vasya'}]

    # create a dictionary with the value of 'id' as the key
    dict_new = dict((obj['id'], obj) for obj in list_new)

    # now you can access entries by their id instead of having to loop through the list
    print dict_new[2]
    # {'id': 2, 'name': 'Bill', 'desc': 'bad guy'}
+1

You will be much better off if your top-level data structure is a dict rather than a list. Then it is simply:

 dict_new.update(dict_old) 

However, for what you actually have, try the following:

    result_list = []
    for item in list_new:
        # linear search for the old dict with the same id
        found_item = [d for d in list_old if d["id"] == item["id"]]
        if found_item:
            result_list.append(dict(item, **found_item[0]))

This one actually still has a loop inside the loop (the inner loop is "hidden" in the list comprehension), so it is still O(n**2). On large datasets, it will undoubtedly be noticeably faster to convert the lists to dicts, update, and then convert the result back to a list.
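The dict-based conversion mentioned above might look like the following. This is a minimal sketch, assuming ids are unique within each list; `match_via_dict` and `old_by_id` are illustrative names. Each list is traversed only once, so the work grows roughly linearly with the combined length of the two lists.

```python
def match_via_dict(list_new, list_old):
    # Index the old dicts by id: a single pass over list_old.
    old_by_id = {d['id']: d for d in list_old}
    result = []
    for item in list_new:
        old = old_by_id.get(item['id'])
        if old is not None:
            # Same merge as in the loop version: values from the old dict win.
            result.append(dict(item, **old))
    return result
```

New items without a counterpart in `list_old` are skipped here, mirroring the loop version; whether they should be kept instead depends on the requirements.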

0

You may like this one. Please take a look, thanks.

    def match_dict(new_list, old_list):
        # ids in list order, so index() maps an id back to its position
        id_new = [item_new.get("id") for item_new in new_list]
        id_old = [item_old.get("id") for item_old in old_list]
        for idx_old in id_old:
            if idx_old in id_new:
                new_list[id_new.index(idx_old)].update(old_list[id_old.index(idx_old)])
        return new_list

    from pprint import pprint
    pprint(match_dict(list_new, list_old))

Output:

    [{'desc': 'cool gay', 'id': 1, 'name': 'boby', 'some_data': '12345'},
     {'desc': 'cool gay', 'id': 2, 'name': 'Bill', 'some_data': '12345'},
     {'desc': 'the man', 'id': 3, 'name': 'vasya', 'some_data': '12345'}]
0
 [od for od in list_old if od['id'] in {nd['id'] for nd in list_new}] 
0