How locks are implemented on multiple cores

For a uni-processor, the locking algorithm is quite simple.

Lock(threadID) { Disable Interrupts If lock is already owned by same thread{ Restore Interrupts return } if lock is free { make lock busy set current thread as the owner of the lock } else { add threadID to the lock queue. } Restore Interrupts return } 

But how do we implement this code in multiprocessor / multicore systems. What if 2 kernels / procs try to give the same lock for different processes.

+4
source share
2 answers

Mutexes are usually implemented using atomic operations on a single memory value. For example, blocking may be the only word that is free at 0 and blocked at 1 . To obtain a lock, the processor locks the memory bus (so that other processors cannot read or write to the memory), read the current value of the word, set it to 1 if it is 0 , and unlock the memory bus. To unlock, the word can be set to 0 .

This is a simple example that does not take into account what happens when it is blocked. Different operating systems handle using different mechanisms. Linux uses something called futexes . I'm not sure what Windows or Mac do.

Despite the fact that the algorithm you posted is correct, non-kernel code cannot disable processor interrupts, so user space code will use atomic operations even on one core.

+1
source

I say that the easiest way to think about locking is to use atomic exchange instructions. The following gets an X lock.

  LOCK:
   set RegisterA = 1
   Atomic_Exchange (X, RegisterA) // runs such that no other thread can work with X
   if RegisterA == 1:
     Means X was 1 when I esecuted the exchange thus someone else has the lock
     Since I do not have the lock, goto LOCK
   else:
     If A is zero, it means I was the first one to set X to 1, which means I own the lock

 UNLOCK:
     X = 0

Atomic exchange exists on most computers. Intel x86 has an EXCHG instruction for this. Just FYI, Intel x86 also has a comparison and exchange instruction that takes care of the acquisition as well as the comparison for you. Basically, instead of doing an exchange first and then doing a test in software, it uses hardware and only exchanges if X == 0 for starters. This saves the extra write of the X variable, which reduces cache skips for X, which leads to better performance.

0
source

All Articles