Why does Python provide locking mechanisms if it obeys the GIL?

Question

I know that only one Python thread can execute bytecode at a time, so why does the threading library provide locks? I assumed that race conditions cannot occur if only one thread executes at a time.

The library provides locks, conditions, and semaphores. Is their sole purpose to synchronize execution?

Update:

I did a little experiment:

    from threading import Thread
    from multiprocessing import Process

    num = 0

    def f():
        global num
        num += 1

    def thread(func):
        # return Process(target=func)
        return Thread(target=func)

    if __name__ == '__main__':
        t_list = []
        for i in xrange(1, 100000):
            t = thread(f)
            t.start()
            t_list.append(t)

        for t in t_list:
            t.join()

        print num

Basically, I start about 100k threads, each of which increments num by 1. The printed result was 99993.

a) How can the result not be 99999 if the GIL synchronizes threads and rules out race conditions?

b) Is it even possible to start 100k OS threads?

Update 2 after seeing the answers:

If the GIL does not even make a simple operation such as an increment atomic, what is the purpose of having it? It does not help with nasty concurrency issues, so why was it put in place? I have heard of use cases for C extensions; can someone explain them?

+7

python multithreading python-multiprocessing gil

dani-h Nov 11 '14 at 20:04

2 answers

Python's native threads are scheduled at the bytecode level. That is, after every bytecode (well, actually, I believe the number of bytecodes is configurable), a thread may yield control to another thread.
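On the "configurable" point: in Python 2 the knob was sys.setcheckinterval, which counted bytecodes. Since Python 3.2 the switch is time-based and exposed as sys.getswitchinterval and sys.setswitchinterval. The sketch below assumes Python 3; the helper name with_switch_interval is made up for illustration. Shortening the interval only makes thread switches more frequent. It does not make any operation atomic.

```python
import sys

def with_switch_interval(seconds, func):
    """Run func() under a temporary thread switch interval (hypothetical helper)."""
    old = sys.getswitchinterval()      # current interval, in seconds
    sys.setswitchinterval(seconds)     # ask the interpreter to switch threads more/less often
    try:
        return func()
    finally:
        sys.setswitchinterval(old)     # always restore the previous setting
```

For example, with_switch_interval(0.0001, run_experiment) would run a hypothetical run_experiment function with far more frequent thread switches, which tends to make races easier to reproduce.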

Any operation on a shared resource that is not a single bytecode needs a lock. And even if a given operation is a single bytecode in one particular version of CPython, that may not be true in every version of every interpreter, so you should use a lock anyway.

It is the same reason you need locks in the first place, really, except at the VM level rather than the hardware level.
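A minimal sketch of that advice, written for Python 3 (the question's code is Python 2). The names locked_count and worker are illustrative, not from the original post. Holding a threading.Lock around the read-modify-write makes the whole increment one critical section, so no update can be lost regardless of where the interpreter switches threads.

```python
import threading

def locked_count(n_threads=8, increments=10000):
    """Increment a shared counter from several threads, guarded by a lock."""
    counter = {"value": 0}
    lock = threading.Lock()

    def worker():
        for _ in range(increments):
            # The lock covers the load, the add, and the store together.
            with lock:
                counter["value"] += 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]
```

With the lock in place the result is always n_threads * increments. Removing the with lock: line reintroduces the race from the question, although how often it shows up depends on the interpreter version.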

+3

kindall Nov 11 '14 at 20:13

Ned Batchelder · Accepted Answer · 2014-11-11T20:18:37+0000

The GIL synchronizes bytecode operations. Only one bytecode can execute at a time. But if you have an operation that requires more than one bytecode, the interpreter can switch threads between those bytecodes. If you need the operation to be atomic, you need synchronization above and beyond the GIL.

For example, incrementing an integer is not a single bytecode:

    >>> def f():
    ...     global num
    ...     num += 1
    ...
    >>> dis.dis(f)
      3           0 LOAD_GLOBAL              0 (num)
                  3 LOAD_CONST               1 (1)
                  6 INPLACE_ADD
                  7 STORE_GLOBAL             0 (num)
                 10 LOAD_CONST               0 (None)
                 13 RETURN_VALUE

It took four bytecodes to implement num += 1. The GIL does not guarantee that num is incremented atomically. Your experiment demonstrates the problem: you lost updates because threads were switched between the LOAD_GLOBAL and the STORE_GLOBAL.
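To make the lost update concrete, here is a single-threaded sketch that replays one unlucky interleaving by hand. It is only a simulation: the two "threads" are just labeled steps, and the function name simulate_lost_update is made up for illustration.

```python
def simulate_lost_update():
    """Replay, step by step, an interleaving in which one increment is lost."""
    num = 0

    # Thread A: LOAD_GLOBAL num, LOAD_CONST 1, INPLACE_ADD (result not stored yet)
    a_result = num + 1

    # The interpreter switches to thread B before A reaches STORE_GLOBAL.
    # Thread B: LOAD_GLOBAL num (still 0), LOAD_CONST 1, INPLACE_ADD
    b_result = num + 1

    # Thread B: STORE_GLOBAL num
    num = b_result

    # Thread A resumes: STORE_GLOBAL num, overwriting B's update with a stale value
    num = a_result

    return num
```

Two increments ran, but the function returns 1 because both threads computed their result from the same old value of num.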

The purpose of the GIL is to ensure that the reference counts of Python objects are incremented and decremented atomically. It is not intended to help you with your own data structures.
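As a rough illustration of the bookkeeping in question, CPython exposes an object's reference count through sys.getrefcount. This sketch assumes CPython; the absolute numbers are an implementation detail and vary between versions, so it only looks at the change caused by one extra reference. The function name refcount_delta is illustrative.

```python
import sys

def refcount_delta():
    """Return how much an object's refcount changes when a list starts referencing it."""
    obj = object()
    before = sys.getrefcount(obj)
    holder = [obj]                 # the list now holds one more reference to obj
    after = sys.getrefcount(obj)
    del holder                     # dropping the list releases that reference again
    return after - before
```

Every assignment, argument pass, and container insertion adjusts counters like this one behind the scenes, which is the interpreter-internal state the answer says the GIL keeps consistent.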
