Python: check if one dictionary is a subset of another larger dictionary

I am trying to write a custom filter method that takes an arbitrary number of kwargs and returns a list containing the elements of a database-like list that contain those kwargs.

For example, suppose d1 = {'a':'2', 'b':'3'} and d2 = the same thing. d1 == d2 results in True. But suppose d2 = the same thing plus a bunch of other things. My method needs to be able to tell whether d1 is in d2, but Python cannot do that with dictionaries.

Context:

I have a Word class, and each object has properties such as word, definition, part_of_speech, and so on. I want to be able to call a filter method on the main list of these words, like Word.objects.filter(word='jump', part_of_speech='verb-intransitive'). I cannot figure out how to manage these keys and values at the same time. But this could have broader use outside this context for other people.
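A minimal sketch of the kind of filter described above, assuming the words live in a plain Python list. The `Word` dataclass, the `filter_words` name, and the sample data are all hypothetical stand-ins for illustration, not part of any real ORM.

```python
from dataclasses import dataclass


# Hypothetical stand-in for the asker's Word class.
@dataclass
class Word:
    word: str
    definition: str
    part_of_speech: str


_MISSING = object()  # sentinel, so an attribute stored as None still compares correctly


def filter_words(words, **criteria):
    """Return the words whose attributes equal every keyword argument."""
    return [
        w for w in words
        if all(getattr(w, name, _MISSING) == value
               for name, value in criteria.items())
    ]


words = [
    Word("jump", "to leap", "verb-intransitive"),
    Word("jump", "a leap", "noun"),
    Word("run", "to move fast", "verb-intransitive"),
]

# Only the first entry matches both criteria.
matches = filter_words(words, word="jump", part_of_speech="verb-intransitive")
```

Treating the kwargs as a dict and requiring every pair to match is the same "is this dict contained in that one" test, applied to object attributes instead of dictionary keys.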

+87
python dictionary filter subset
Feb 17 '12 at 6:15
14 answers

Convert to item pairs and check for containment.

 all(item in superset.items() for item in subset.items()) 

Optimization remains as an exercise for the reader.
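One possible way to take up that exercise, as a sketch rather than the answerer's own solution: look each key up directly in the larger dict instead of scanning its items. The `is_subdict` name and the `_MISSING` sentinel are assumptions made for this example.

```python
_MISSING = object()  # sentinel that cannot be equal to any real stored value


def is_subdict(subset, superset):
    """True if every (key, value) pair of subset also appears in superset."""
    # One hash lookup per key of subset; a missing key yields the sentinel,
    # which never compares equal to the expected value.
    return all(superset.get(key, _MISSING) == value
               for key, value in subset.items())


d1 = {'a': '2', 'b': '3'}
d2 = {'a': '2', 'b': '3', 'c': '4'}
result = is_subdict(d1, d2)  # True
```

The sentinel matters when a value is legitimately None: with a plain `superset.get(key)`, `{'a': None}` would wrongly count as contained in `{}`.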

+92
Feb 17 '12 at 6:18

In Python 3, you can use dict.items() to get a set-like view of the dict's items. You can then use the <= operator to test whether one view is a "subset" of the other:

 d1.items() <= d2.items() 

In Python 2.7, use dict.viewitems() to do the same:

 d1.viewitems() <= d2.viewitems() 

In Python 2.6 and below, you will need a different solution, for example using all() :

 all(key in d2 and d2[key] == d1[key] for key in d1) 
+70
Jan 10 '17 at 22:19

Note for people who need this for unit testing: there is also an assertDictContainsSubset() method in the Python TestCase class.

http://docs.python.org/2/library/unittest.html?highlight=assertdictcontainssubset#unittest.TestCase.assertDictContainsSubset

However, it has been deprecated since 3.2; I am not sure why, and there may be a replacement for it.
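For tests that should not depend on the deprecated method, a small helper built on assertEqual can serve a similar purpose. This is only a sketch: `assertDictSubset` and the test class name are hypothetical, not part of unittest.

```python
import unittest


class SubsetAssertionExample(unittest.TestCase):

    def assertDictSubset(self, subset, superset):
        """Hypothetical helper: fail unless subset's pairs all appear in superset."""
        # Project superset onto the keys under test, so that a failure
        # message only shows the pairs that were actually compared.
        projected = {key: superset[key] for key in subset if key in superset}
        self.assertEqual(subset, projected)

    def test_subset(self):
        self.assertDictSubset({'a': '2', 'b': '3'},
                              {'a': '2', 'b': '3', 'c': '4'})

    def test_not_subset(self):
        # A key missing from the superset makes the helper fail.
        with self.assertRaises(AssertionError):
            self.assertDictSubset({'a': '2', 'z': '0'}, {'a': '2', 'b': '3'})
```

Comparing against a projection keeps the usual assertEqual diff output while ignoring the extra keys of the larger dict.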

+32
Oct 07 '13 at 9:32

For keys and values, check using: set(d1.items()).issubset(set(d2.items()))

If you need to check only the keys: set(d1).issubset(set(d2))
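One caveat worth illustrating: building a set from the items requires every value to be hashable. The sketch below, written for Python 3 with made-up sample data, shows the set-based check failing on a list value while a direct comparison of the items views still works.

```python
d1 = {'a': [1, 2]}
d2 = {'a': [1, 2], 'b': '3'}

# set() has to hash each (key, value) tuple, and a list value is unhashable.
try:
    set(d1.items()).issubset(set(d2.items()))
    set_based_ok = True
except TypeError:
    set_based_ok = False

# Comparing the items views only hashes the keys and compares values with ==.
view_based = d1.items() <= d2.items()  # True

# The keys-only check is unaffected, because keys are always hashable.
keys_only = set(d1).issubset(d2)  # True
```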

+19
Aug 12 '12 at 18:37

For completeness, you can also do this:

 def is_subdict(small, big): return dict(big, **small) == big 

However, I make no claims about speed (or lack thereof) or readability (or lack thereof).
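A related point, shown here as a sketch for Python 3.5 or later: the `**small` form passes the keys as keyword arguments, so in Python 3 it raises TypeError when a key is not a string. Merging with dict unpacking keeps the same idea without that restriction. The variable names below are only for illustration.

```python
def is_subdict(small, big):
    # Layer small's pairs over a copy of big; if nothing changed,
    # every pair of small was already present in big.
    return {**big, **small} == big


string_keys = is_subdict({'a': '2'}, {'a': '2', 'b': '3'})  # True
int_keys = is_subdict({1: 2}, {1: 2, 3: 4})                 # True

# The keyword-argument form rejects non-string keys in Python 3.
try:
    dict({1: 2, 3: 4}, **{1: 2})
    kwargs_form_ok = True
except TypeError:
    kwargs_form_ok = False
```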

+18
Feb 15 '16 at 18:50
 >>> d1 = {'a':'2', 'b':'3'}
 >>> d2 = {'a':'2', 'b':'3','c':'4'}
 >>> all((k in d2 and d2[k]==v) for k,v in d1.iteritems())
 True

Context, step by step (this is a Python 2 session; on Python 3, use items() instead of iteritems() and next(gen) instead of gen.next()):

 >>> d1 = {'a':'2', 'b':'3'}
 >>> d2 = {'a':'2', 'b':'3','c':'4'}
 >>> list(d1.iteritems())
 [('a', '2'), ('b', '3')]
 >>> [(k,v) for k,v in d1.iteritems()]
 [('a', '2'), ('b', '3')]
 >>> k,v = ('a','2')
 >>> k
 'a'
 >>> v
 '2'
 >>> k in d2
 True
 >>> d2[k]
 '2'
 >>> k in d2 and d2[k]==v
 True
 >>> [(k in d2 and d2[k]==v) for k,v in d1.iteritems()]
 [True, True]
 >>> ((k in d2 and d2[k]==v) for k,v in d1.iteritems())
 <generator object <genexpr> at 0x02A9D2B0>
 >>> ((k in d2 and d2[k]==v) for k,v in d1.iteritems()).next()
 True
 >>> all((k in d2 and d2[k]==v) for k,v in d1.iteritems())
 True
+10
Feb 17 '12 at 6:30

My function is for the same purpose, doing it recursively:

 def dictMatch(patn, real):
     """does real dict match pattern?"""
     result = True  # an empty pattern matches anything
     try:
         # items() works on both Python 2 and Python 3
         for pkey, pvalue in patn.items():
             if type(pvalue) is dict:
                 result = dictMatch(pvalue, real[pkey])
                 assert result
             else:
                 assert real[pkey] == pvalue
                 result = True
     except (AssertionError, KeyError):
         result = False
     return result

In your example, dictMatch(d1, d2) should return True even if d2 has other things in it, and it also applies to lower levels:

 d1 = {'a':'2', 'b':{3: 'iii'}}
 d2 = {'a':'2', 'b':{3: 'iii', 4: 'iv'},'c':'4'}

 dictMatch(d1, d2) # True

Note: there could be an even better solution that avoids the if type(pvalue) is dict clause and applies to an even wider range of cases (like lists of hashes, etc.). Also, recursion is not limited here, so use at your own risk. ;)

+4
Aug 20 '13 at 12:30

This seemingly simple question cost me a couple of hours of research to find a 100% reliable solution, so I documented what I found in this answer.

  • "Pythonically" speaking, small_dict <= big_dict would be the most intuitive way, but unfortunately it does not work. {'a': 1} < {'a': 1, 'b': 2} seems to work in Python 2, but it is not reliable, because the official documentation explicitly calls it out. Search for "Outcomes other than equality are resolved consistently, but are not otherwise defined." in this section. Not to mention that comparing 2 dicts in Python 3 raises a TypeError exception.

  • The second most intuitive thing is small.viewitems() <= big.viewitems() for Python 2.7 only, and small.items() <= big.items() for Python 3. But there is one caveat: it is potentially buggy. If your program could potentially be run on Python <= 2.6, its d1.items() <= d2.items() actually compares 2 lists of tuples in no particular order, so the final result is unreliable and becomes a nasty bug in your program. I have no desire to write yet another implementation for Python <= 2.6, but I am still not comfortable with my code containing a known bug (even if it is on an unsupported platform). So I abandoned this approach.

  • I agree with @blubberdiblub's answer (credit goes to him):

    def is_subdict(small, big): return dict(big, **small) == big

    It is worth noting that this answer relies on the == behavior between dicts, which is clearly defined in the official documentation, and therefore should work in every version of Python. Search for:

    • "Dictionaries compare equal if and only if they have the same (key, value) pairs." is the last sentence on this page
    • "Mappings (instances of dict) compare equal if and only if they have equal (key, value) pairs. Equality comparison of the keys and elements enforces reflexivity." on this page
+2
Feb 17 '17 at 21:36

Here is a general recursive solution to a given task:

 import unittest


 def is_subset(superset, subset):
     for key, value in subset.items():
         if key not in superset:
             return False

         if isinstance(value, dict):
             if not is_subset(superset[key], value):
                 return False
         elif isinstance(value, str):
             if value not in superset[key]:
                 return False
         elif isinstance(value, list):
             if not set(value) <= set(superset[key]):
                 return False
         elif isinstance(value, set):
             if not value <= superset[key]:
                 return False
         else:
             if not value == superset[key]:
                 return False

     return True


 class Foo(unittest.TestCase):

     def setUp(self):
         self.dct = {
             'a': 'hello world',
             'b': 12345,
             'c': 1.2345,
             'd': [1, 2, 3, 4, 5],
             'e': {1, 2, 3, 4, 5},
             'f': {
                 'a': 'hello world',
                 'b': 12345,
                 'c': 1.2345,
                 'd': [1, 2, 3, 4, 5],
                 'e': {1, 2, 3, 4, 5},
                 'g': False,
                 'h': None
             },
             'g': False,
             'h': None,
             'question': 'mcve',
             'metadata': {}
         }

     def tearDown(self):
         pass

     def check_true(self, superset, subset):
         return self.assertEqual(is_subset(superset, subset), True)

     def check_false(self, superset, subset):
         return self.assertEqual(is_subset(superset, subset), False)

     def test_simple_cases(self):
         self.check_true(self.dct, {'a': 'hello world'})
         self.check_true(self.dct, {'b': 12345})
         self.check_true(self.dct, {'c': 1.2345})
         self.check_true(self.dct, {'d': [1, 2, 3, 4, 5]})
         self.check_true(self.dct, {'e': {1, 2, 3, 4, 5}})
         self.check_true(self.dct, {'f': {
             'a': 'hello world',
             'b': 12345,
             'c': 1.2345,
             'd': [1, 2, 3, 4, 5],
             'e': {1, 2, 3, 4, 5},
         }})
         self.check_true(self.dct, {'g': False})
         self.check_true(self.dct, {'h': None})

     def test_tricky_cases(self):
         self.check_true(self.dct, {'a': 'hello'})
         self.check_true(self.dct, {'d': [1, 2, 3]})
         self.check_true(self.dct, {'e': {3, 4}})
         self.check_true(self.dct, {'f': {
             'a': 'hello world',
             'h': None
         }})
         self.check_false(
             self.dct, {'question': 'mcve', 'metadata': {'author': 'BPL'}})
         self.check_true(
             self.dct, {'question': 'mcve', 'metadata': {}})
         self.check_false(
             self.dct, {'question1': 'mcve', 'metadata': {}})


 if __name__ == "__main__":
     unittest.main()

NOTE: The original code would fail in certain cases; credit for the fix goes to @olivier-melançon.

+2
Mar 22 '18 at 0:23

I know this question is old, but here is my solution for checking if one nested dictionary is part of another nested dictionary. The solution is recursive.

 def compare_dicts(a, b):
     for key, value in a.items():
         if key in b:
             if isinstance(a[key], dict):
                 if not compare_dicts(a[key], b[key]):
                     return False
             elif value != b[key]:
                 return False
         else:
             return False
     return True
+2
Aug 27 '18 at 9:18

This function works for non-hashable values. I also find it clear and easy to read.

 def isSubDict(subDict,dictionary):
     for key in subDict.keys():
         if (not key in dictionary) or (not subDict[key] == dictionary[key]):
             return False
     return True

 In [126]: isSubDict({1:2},{3:4})
 Out[126]: False

 In [127]: isSubDict({1:2},{1:2,3:4})
 Out[127]: True

 In [128]: isSubDict({1:{2:3}},{1:{2:3},3:4})
 Out[128]: True

 In [129]: isSubDict({1:{2:3}},{1:{2:4},3:4})
 Out[129]: False
0
Oct 08 '15 at 9:31

If you don't mind using pydash, there is is_match that does just that:

 import pydash

 a = {1:2, 3:4, 5:{6:7}}
 b = {3:4.0, 5:{6:8}}
 c = {3:4.0, 5:{6:7}}

 pydash.predicates.is_match(a, b) # False
 pydash.predicates.is_match(a, c) # True
0
Feb 14 '19 at 18:06

A short recursive implementation that works for nested dictionaries:

 def compare_dicts(a,b):
     if not a: return True
     if isinstance(a, dict):
         key, val = a.popitem()
         return isinstance(b, dict) and key in b and compare_dicts(val, b.pop(key)) and compare_dicts(a, b)
     return a == b

This will consume the a and b dicts. If anyone knows a good way to avoid that without resorting to partially iterative solutions, as in other answers, please let me know. I would need a way to split a dict into a head and tail based on a key.
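One possible way to get a head/tail split without mutating anything, offered as a sketch and not as the answerer's method: recurse over an iterator of the pattern's items, so that "tail" is simply the rest of the iterator. The names `is_nested_subdict` and `_match_items` are hypothetical. Unlike the function above, an empty pattern dict here only matches another dict, not an arbitrary value.

```python
def is_nested_subdict(a, b):
    """True if a is contained in b, descending into nested dicts."""
    if isinstance(a, dict):
        return isinstance(b, dict) and _match_items(iter(a.items()), b)
    return a == b


def _match_items(items, b):
    # "Head" is the next (key, value) pair; "tail" is whatever the
    # iterator still holds. Neither input dict is modified.
    head = next(items, None)
    if head is None:
        return True
    key, val = head
    return key in b and is_nested_subdict(val, b[key]) and _match_items(items, b)


a = {'x': {'y': 1}}
b = {'x': {'y': 1, 'z': 2}, 'w': 3}
result = is_nested_subdict(a, b)  # True, and a and b are left intact
```

This still uses one stack frame per item, so very large dicts can hit Python's recursion limit.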

This code is more useful as a programming exercise and is probably much slower than other solutions that mix recursion and iteration. The @Nutcracker solution is pretty good for nested dictionaries.

0
Aug 26 '19 at 12:24

Here is a solution that also properly recurses into lists and sets contained within the dictionary. You can also use this for lists containing dicts, etc.

 def is_subset(subset, superset):
     if isinstance(subset, dict):
         return all(key in superset and is_subset(val, superset[key]) for key, val in subset.items())

     if isinstance(subset, list) or isinstance(subset, set):
         return all(any(is_subset(subitem, superitem) for superitem in superset) for subitem in subset)

     # assume that subset is a plain value if none of the above match
     return subset == superset
0
Aug 27 '19 at 12:57


