I have not worked with threading in Python at all, so I am asking this as a complete newcomer.
I am wondering whether defaultdict is thread-safe. Let me explain:
I have
from collections import defaultdict
d = defaultdict(list)
which by default creates a list for missing keys. Let's say I have several threads doing this at the same time:
d['key'].append('value')
With two threads, I should end up with ['value', 'value']. However, if defaultdict is not thread-safe, and thread 1 yields to thread 2 after checking whether 'key' is in the dict but before executing d['key'] = default_factory(), the operations will interleave: the other thread will create a list in d['key'] and possibly append 'value' to it.
Then, when thread 1 runs again, it will continue from d['key'] = default_factory(), which will destroy the existing list and its value, and we will end up with ['value'] only.
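To make the interleaving I am worried about concrete, here is a single-threaded sketch that replays it step by step on a plain dict. The check-then-assign steps are only my assumption about how a naive implementation might behave; I am not claiming this is what CPython's defaultdict actually does.

```python
# Deterministic replay of the feared interleaving, using a plain dict.
# This models a hypothetical check-then-assign implementation; it is
# NOT taken from CPython's defaultdict source.
d = {}
default_factory = list

# Thread 1: checks for the key, finds it missing, then is preempted.
t1_saw_missing = 'key' not in d

# Thread 2: runs the whole operation while thread 1 is paused.
if 'key' not in d:
    d['key'] = default_factory()
d['key'].append('value')
after_thread_2 = list(d['key'])

# Thread 1: resumes and acts on its stale check, replacing the list.
if t1_saw_missing:
    d['key'] = default_factory()
d['key'].append('value')

print(after_thread_2)  # ['value']
print(d['key'])        # ['value'], thread 2's append was lost
```

If the real implementation can be interrupted between the check and the assignment like this, one of the two appends is silently lost.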
I looked at the CPython source code for defaultdict. However, I could not find any locks or mutexes. I assume it is not thread-safe unless it is documented as such.
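If it turns out that I cannot rely on defaultdict alone, I suppose I could guard the access myself. A minimal sketch, assuming an explicit threading.Lock around the lookup and the append; the names append_value and NUM_THREADS are just mine for illustration.

```python
import threading
from collections import defaultdict

d = defaultdict(list)
lock = threading.Lock()
NUM_THREADS = 8  # arbitrary number, chosen only for illustration

def append_value():
    # Hold the lock across both the implicit list creation and the append,
    # so no other thread can run between the two steps.
    with lock:
        d['key'].append('value')

threads = [threading.Thread(target=append_value) for _ in range(NUM_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(d['key']))  # 8
```

This costs a lock acquisition per append, which may be unnecessary if the operation is already atomic in CPython; that is exactly the part I am unsure about.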
Some people on IRC said last night that Python has a GIL (global interpreter lock), so it is conceptually thread-safe. Others said threading should not be done in Python at all. I'm pretty confused. Ideas?
python defaultdict python-collections
Ahmet Alp Balkan