How to access all dictionaries in a dictionary where a particular key has a specific value

I have a dictionary of dictionaries, and each nested dictionary has the same keys, for example:

    all_dicts = {
        'a': {'name': 'A', 'city': 'foo'},
        'b': {'name': 'B', 'city': 'bar'},
        'c': {'name': 'C', 'city': 'bar'},
        'd': {'name': 'B', 'city': 'foo'},
        'e': {'name': 'D', 'city': 'bar'},
    }

How can I get a list (or dictionary) of all nested dictionaries where 'city' has the value 'bar'?

The following code works, but does not scale:

    req_key = 'bar'
    selected = []
    for one in all_dicts.keys():
        # compare the value stored under 'city', not the dict's keys
        if all_dicts[one]['city'] == req_key:
            selected.append(all_dicts[one])

Say 'city' can have 50,000 unique values and all_dicts contains 600,000 entries; iterating through the whole dictionary for each value of 'city' is not very efficient.

Is there a scalable and efficient way to do this?

+7
python dictionary
4 answers

What you can do is build an index on this dictionary, for example:

    cityIndex = {}
    for item in all_dicts.values():
        if item['city'] in cityIndex:
            cityIndex[item['city']].append(item)
        else:
            cityIndex[item['city']] = [item]

This requires some up-front processing time and some additional memory, but afterwards lookups are very fast. If you want all the items with a given cityName, you can get them like this:

    mylist = cityIndex[cityName] if cityName in cityIndex else []

This pays off when all_dicts is built once and then queried many times.
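The same index can be built a little more compactly with collections.defaultdict from the standard library. This is only a sketch of the idea above, using the sample data from the question; the function names build_city_index and lookup are made up for illustration.

```python
from collections import defaultdict

all_dicts = {
    'a': {'name': 'A', 'city': 'foo'},
    'b': {'name': 'B', 'city': 'bar'},
    'c': {'name': 'C', 'city': 'bar'},
    'd': {'name': 'B', 'city': 'foo'},
    'e': {'name': 'D', 'city': 'bar'},
}

def build_city_index(dicts):
    """Group the nested dictionaries by their 'city' value in one pass."""
    index = defaultdict(list)
    for item in dicts.values():
        index[item['city']].append(item)
    return index

def lookup(index, city):
    """Return all items for a city, or an empty list if the city is unknown."""
    # .get() avoids creating an empty entry for a missing city,
    # which index[city] would do on a defaultdict.
    return index.get(city, [])

cityIndex = build_city_index(all_dicts)
print([d['name'] for d in lookup(cityIndex, 'bar')])  # ['B', 'C', 'D'] on Python 3.7+
print(lookup(cityIndex, 'baz'))                       # []
```

Building the index is one pass over all_dicts; each later lookup is a single dictionary access, regardless of how many distinct cities there are.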

If all_dicts changes while your program is running, you will need some extra code to keep cityIndex up to date. When an item is added to all_dicts, just do:

    if item['city'] in cityIndex:
        cityIndex[item['city']].append(item)
    else:
        cityIndex[item['city']] = [item]

and when an item is deleted, here is a simple way to remove it from the index (provided that the combination of "name" and "city" is unique among your items):

    for i, val in enumerate(cityIndex[item['city']]):
        if val['name'] == item['name']:
            # delete only when a match is found, then stop searching
            del cityIndex[item['city']][i]
            break

If there are many more lookups than updates, you will still get a significant performance boost.
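To keep the two structures from drifting apart, the add and delete steps can be wrapped together. What follows is a minimal sketch, assuming items are identified by their outer key in all_dicts; the class CityStore and its method names are hypothetical and not part of the answer above.

```python
from collections import defaultdict

class CityStore:
    """Keeps a dict of items and a city -> keys index in sync (illustrative only)."""

    def __init__(self):
        self.items = {}                      # key -> nested dict
        self.by_city = defaultdict(set)      # city -> set of keys

    def add(self, key, item):
        # If the key already exists, drop its old index entry first.
        if key in self.items:
            self.remove(key)
        self.items[key] = item
        self.by_city[item['city']].add(key)

    def remove(self, key):
        item = self.items.pop(key)           # raises KeyError if key is unknown
        keys = self.by_city[item['city']]
        keys.discard(key)
        if not keys:
            # Do not leave empty buckets behind.
            del self.by_city[item['city']]

    def in_city(self, city):
        # .get() so that a missing city does not create an empty bucket.
        return {k: self.items[k] for k in self.by_city.get(city, ())}

store = CityStore()
store.add('b', {'name': 'B', 'city': 'bar'})
store.add('c', {'name': 'C', 'city': 'bar'})
store.add('d', {'name': 'B', 'city': 'foo'})
store.remove('c')
print(store.in_city('bar'))  # {'b': {'name': 'B', 'city': 'bar'}}
```

Storing keys in a set rather than items in a list means deletion does not have to scan the bucket, and it does not rely on the "name" and "city" combination being unique.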

+9

You have to check every value; for a single query there is no way around that. However, you can write the check as a list comprehension, which is usually faster than an explicit for loop:

    selected = [d for d in all_dicts.values() if d['city'] == 'bar']
    print(selected)
    # [{'name': 'B', 'city': 'bar'}, {'name': 'C', 'city': 'bar'}, {'name': 'D', 'city': 'bar'}]

Iterating over dict.values() instead of looking up each key also helps: it avoids one lookup per item, and in Python 3 values() returns a view rather than building a new list.
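The question also allows a dictionary as the result. A dict comprehension over items() keeps the outer keys; this is a small sketch using the sample data from the question, and it is still a full scan of all_dicts.

```python
all_dicts = {
    'a': {'name': 'A', 'city': 'foo'},
    'b': {'name': 'B', 'city': 'bar'},
    'c': {'name': 'C', 'city': 'bar'},
    'd': {'name': 'B', 'city': 'foo'},
    'e': {'name': 'D', 'city': 'bar'},
}

# Keep the outer key alongside each matching nested dictionary.
selected = {key: d for key, d in all_dicts.items() if d['city'] == 'bar'}
print(sorted(selected))  # ['b', 'c', 'e']
```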

+7

Or use filter, in Python 3:

    >>> list(filter(lambda x: x['city'] == 'bar', all_dicts.values()))
    # [{'name': 'D', 'city': 'bar'}, {'name': 'B', 'city': 'bar'}, {'name': 'C', 'city': 'bar'}]

Or using pandas:

    import pandas as pd

    df = pd.DataFrame(all_dicts).T
    df[df.city == 'bar'].T.to_dict()
    # {'e': {'city': 'bar', 'name': 'D'}, 'c': {'city': 'bar', 'name': 'C'}, 'b': {'city': 'bar', 'name': 'B'}}
+3
    all_dicts = {
        'a': {'name': 'A', 'city': 'foo'},
        'b': {'name': 'B', 'city': 'bar'},
        'c': {'name': 'C', 'city': 'bar'},
        'd': {'name': 'B', 'city': 'foo'},
        'e': {'name': 'D', 'city': 'bar'},
    }

    citys = {}
    for key, value in all_dicts.items():
        citys[key] = value['city']
    # {'a': 'foo', 'b': 'bar', 'e': 'bar', 'd': 'foo', 'c': 'bar'}

    for key, value in citys.items():
        if value == 'bar':
            print(all_dicts[key])

Output:

    {'name': 'B', 'city': 'bar'}
    {'name': 'D', 'city': 'bar'}
    {'name': 'C', 'city': 'bar'}

Build a helper dict that stores the city for each key, then use it to find the matching entries in all_dicts.
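As written, the helper dict is still scanned in full for every query. One possible variation, sketched here under the assumption that lookups are always by city, is to invert it so that each city maps to the outer keys that have it; the name keys_by_city is made up for this example.

```python
all_dicts = {
    'a': {'name': 'A', 'city': 'foo'},
    'b': {'name': 'B', 'city': 'bar'},
    'c': {'name': 'C', 'city': 'bar'},
    'd': {'name': 'B', 'city': 'foo'},
    'e': {'name': 'D', 'city': 'bar'},
}

# Map each city to the list of outer keys that have it (one pass).
keys_by_city = {}
for key, value in all_dicts.items():
    keys_by_city.setdefault(value['city'], []).append(key)

# A lookup now touches only the matching entries.
for key in keys_by_city.get('bar', []):
    print(key, all_dicts[key])
```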

0