The GIL protects the Python internals. It means:
- You don't need to worry about the interpreter's internal state getting corrupted by multithreading.
- Most things do not actually run in parallel, because Python bytecode is executed by only one thread at a time due to the GIL (see the sketch right after this list).
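As a quick illustration of that second point, here is a minimal sketch (not part of the original answer; the numbers and function names are my own) showing that a pure-Python, CPU-bound task does not get faster when split across threads under the GIL:

```python
import threading
import time

def count_down(n):
    # Pure-Python, CPU-bound work: only one thread can run this at a time.
    while n > 0:
        n -= 1

N = 10_000_000

# Single-threaded run
start = time.time()
count_down(N)
print('one thread: {:.2f}s'.format(time.time() - start))

# Two threads, half the work each -- usually about the same time (or slower),
# because the GIL lets only one thread execute Python bytecode at any moment.
start = time.time()
t1 = threading.Thread(target=count_down, args=(N // 2,))
t2 = threading.Thread(target=count_down, args=(N // 2,))
t1.start(); t2.start()
t1.join(); t2.join()
print('two threads: {:.2f}s'.format(time.time() - start))
```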
But the GIL does not protect your own code. For example, if you have this code:
```python
self.some_number += 1
```
This will read the value of self.some_number, compute some_number + 1, and then write the result back to self.some_number.
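You can see that this single line really is several separate steps by disassembling it with the standard dis module. This is a quick sketch, not part of the original code, and the variable name is just illustrative; the exact opcodes printed differ between Python versions:

```python
import dis

some_number = 0

def increment():
    global some_number
    some_number += 1  # one source line, but several bytecode instructions

dis.dis(increment)
# Prints a load, an add, and a store as separate instructions; a thread
# switch can happen between these steps, which is what makes the race possible.
```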
If you do this in two threads, the operations (read, add, write) of one thread and the other can interleave so that the result is incorrect.
This may be the order of execution:
- thread1 reads self.some_number (0)
- thread2 reads self.some_number (0)
- thread1 computes some_number+1 (1)
- thread2 computes some_number+1 (1)
- thread1 writes 1 to self.some_number
- thread2 writes 1 to self.some_number
You use locks to enforce an order of execution like this instead:
- thread1 reads self.some_number (0)
- thread1 computes some_number+1 (1)
- thread1 writes 1 to self.some_number
- thread2 reads self.some_number (1)
- thread2 computes some_number+1 (2)
- thread2 writes 2 to self.some_number
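For completeness, here is a minimal sketch (the class and lock names are my own, not from the question) of how a lock could guard the self.some_number example above. The with statement acquires the lock on entry and releases it on exit, even if an exception is raised:

```python
import threading

class Counter:
    def __init__(self):
        self.some_number = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:           # only one thread can be in this block at a time
            self.some_number += 1  # read, add, write happen without interleaving
```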
EDIT: Let me end this answer with some code that demonstrates the behavior explained above:
```python
import threading
import time

total = 0
lock = threading.Lock()

def increment_n_times(n):
    """Increment the global counter n times without any locking."""
    global total
    for i in range(n):
        total += 1

def safe_increment_n_times(n):
    """Increment the global counter n times, holding the lock for each increment."""
    global total
    for i in range(n):
        lock.acquire()
        total += 1
        lock.release()

def increment_in_x_threads(x, func, n):
    """Run func(n) in x threads in parallel and report the result."""
    threads = [threading.Thread(target=func, args=(n,)) for i in range(x)]
    global total
    total = 0
    begin = time.time()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    print('finished in {}s.\ntotal: {}\nexpected: {}\ndifference: {} ({} %)'
          .format(time.time() - begin, total, n * x, n * x - total,
                  100 - total / n / x * 100))
```
There are two functions that implement the increment: one uses a lock and the other does not.
The increment_in_x_threads function runs the given increment function in x threads in parallel.
Now doing this with lots of threads makes it almost certain that an error will occur:
```python
print('unsafe:')
increment_in_x_threads(70, increment_n_times, 100000)
print('\nwith locks:')
increment_in_x_threads(70, safe_increment_n_times, 100000)
```
In my case, it printed:
```
unsafe:
finished in 0.9840562343597412s.
total: 4654584
expected: 7000000
difference: 2345416 (33.505942857142855 %)

with locks:
finished in 20.564176082611084s.
total: 7000000
expected: 7000000
difference: 0 (0.0 %)
```
Thus, without locking there were many errors (about 33% of the increments were lost). On the other hand, the locked version was about 20 times slower.
Of course, both numbers are exaggerated because I used 70 threads, but this shows the general idea.