Spinlocks, how useful are they?

How often do you actually use spinlocks in your code? How common is it to come across a situation where a busy loop actually outperforms a lock?
Personally, when I write code that requires thread safety, I tend to benchmark it with different synchronization primitives, and as far as I can tell, using locks gives better performance than using spinlocks. No matter how briefly I hold the lock, the amount of contention I get with spinlocks is far greater than what I get with locks (of course, I run my tests on a multiprocessor machine).
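The kind of comparison described above can be sketched roughly as follows. This is only an illustrative harness, not the asker's actual test code: `TasSpinLock`, `RunResult` and `run_contended` are hypothetical names, and the spinlock is a bare test-and-set loop with no backoff.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Minimal test-and-set spinlock (illustrative, hypothetical name).
class TasSpinLock {
public:
    void lock() noexcept {
        // exchange() returns the previous value; loop until it was false.
        while (locked_.exchange(true, std::memory_order_acquire)) {}
    }
    void unlock() noexcept { locked_.store(false, std::memory_order_release); }
private:
    std::atomic<bool> locked_{false};
};

struct RunResult {
    std::uint64_t counter;             // final value of the shared counter
    std::chrono::nanoseconds elapsed;  // wall-clock time for the whole run
};

// Starts `threads` workers that each increment a shared counter `iters`
// times while holding a `Lock`. Works with any type offering lock()/unlock().
template <class Lock>
RunResult run_contended(unsigned threads, std::uint64_t iters) {
    assert(threads > 0);
    Lock lock;
    std::uint64_t counter = 0;
    const auto start = std::chrono::steady_clock::now();
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (std::uint64_t i = 0; i < iters; ++i) {
                std::lock_guard<Lock> guard(lock);
                ++counter;  // deliberately tiny critical section
            }
        });
    }
    for (auto& th : pool) th.join();
    const auto stop = std::chrono::steady_clock::now();
    return {counter,
            std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)};
}
```

Calling `run_contended<TasSpinLock>(4, 100000)` and `run_contended<std::mutex>(4, 100000)` and comparing the `elapsed` fields gives a first impression; the numbers depend heavily on core count, scheduler and critical-section length, so treat them as indicative only.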

I understand that you are more likely to encounter a spinlock in "low-level" code, but I am interested to know whether you find it useful in higher-level programming as well.

+31
c++ multithreading c# locking spinlock
Sep 21 '09 at 18:59
10 answers

It depends on what you are doing. In general application code, you will want to avoid spinlocks.

In low-level code, where you hold the lock for only a few instructions and latency matters, a spinlock can be a better solution than a lock. But those cases are rare, especially in the kind of applications where C# is commonly used.

+32
Sep 21 '09 at 19:07

In C#, spin locks have, in my experience, almost always been worse than taking a lock; it is rare for a spin lock to outperform a lock.

However, that is not always the case. .NET 4 adds the System.Threading.SpinLock structure. It provides benefits in situations where a lock is held for a very short time and is acquired repeatedly. From the MSDN documentation on Data Structures for Parallel Programming:

In scenarios where the wait for the lock is expected to be short, SpinLock offers better performance than other forms of locking.

Spin locks can outperform other locking mechanisms when you are doing something like locking through a tree: if you only hold the lock on each node for a very short period of time, they can outperform a traditional lock. I ran into this at one point in a rendering engine with multithreaded scene updates, where spin locks profiled out to outperform locking with Monitor.Enter.

+20
Sep 21 '09 at 19:18

For my real-time work, especially with device drivers, I have used them a fair bit. It turns out that (the last time I timed it) waiting on a synchronization object such as a semaphore tied to a hardware interrupt chews up at least 20 microseconds, no matter how long the interrupt actually takes to occur. A single check of a memory-mapped hardware register, followed by a check of RDTSC (to allow for a timeout so that you do not lock up the machine), was in the high-nanosecond range (basically down in the noise). For hardware-level handshaking that should not take much time at all, it is really hard to beat a spinlock.
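A user-space approximation of that "poll a register, check a timeout" loop might look like the sketch below. It makes two assumptions that are not in the answer: an `std::atomic<std::uint32_t>` stands in for the memory-mapped register (a real driver would read a volatile device address), and `std::chrono::steady_clock` replaces the raw RDTSC read. The function name `spin_until_set` is hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>

// Polls `reg` until any bit in `mask` is set, or until `timeout` expires.
// Returns true if the bit was observed, false on timeout.
// Assumption: `reg` stands in for a memory-mapped hardware status register.
inline bool spin_until_set(const std::atomic<std::uint32_t>& reg,
                           std::uint32_t mask,
                           std::chrono::nanoseconds timeout) {
    assert(mask != 0);
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    for (;;) {
        // Cheap check first: this is the common, fast path.
        if (reg.load(std::memory_order_acquire) & mask) return true;
        // Timeout check so a dead device cannot hang the caller forever.
        if (std::chrono::steady_clock::now() >= deadline) return false;
    }
}
```

The timeout is what keeps this from locking up the machine when the hardware never answers; the caller decides what to do when `false` comes back.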

+11
Sep 21 '09 at 19:17

My 2c: if your updates satisfy certain access criteria, they are good candidates for spinlocks:

  • fast: you will have time to acquire the spinlock, perform the updates, and release the spinlock within a single thread quantum, so that you do not get pre-empted while holding the spinlock
  • localized: all the data you update is preferably in a single page that is already loaded; you do not want a TLB miss while holding a spinlock, and you definitely do not want a page fault that reads from swap!
  • atomic: you do not need any other lock to perform the operation, i.e. never wait for locks under a spinlock.

For anything that has any potential to yield, you should use a notified (blocking) lock structure (events, mutexes, semaphores, etc.).

+9
Sep 21 '09 at 19:12

One use case for spin locks is when you expect very low contention but are going to have a lot of locks. If you do not need support for recursive locking, a spinlock can be implemented in a single byte, and if contention is very low, the wasted CPU cycles are negligible.

For a practical use case, I often have arrays of thousands of elements, where updates to different elements of the array can safely happen in parallel. The odds of two threads trying to update the same element at the same time are very small (low contention), but I need one lock for every element (I am going to have a lot of them). In these cases, I usually allocate an array of ubytes of the same size as the array I am updating in parallel, and implement spinlocks inline as (in the D programming language):

    // Spin until the byte flips from 0 (free) to 1 (held).
    while (!atomicCasUbyte(spinLocks[i], 0, 1)) {}
    myArray[i] = newVal;
    // Release the lock for element i.
    atomicSetUbyte(spinLocks[i], 0);

On the other hand, if I had to use regular locks, I would have to allocate an array of pointers to objects, and then allocate a Mutex object for each element of this array. In scenarios like the ones described above, this is simply wasteful.
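For readers more at home in C++, the same per-element idea can be approximated as below. This is a sketch under assumptions, not a port of the author's code: `ByteLockedArray` and `bump_all` are hypothetical names, and `std::atomic<std::uint8_t>` plays the role of the ubyte lock (it is typically one byte wide on mainstream platforms, though the standard does not guarantee that).

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// An array of values guarded by one small spinlock per element.
class ByteLockedArray {
public:
    explicit ByteLockedArray(std::size_t n) : values_(n, 0), locks_(n) {
        for (auto& l : locks_) l.store(0, std::memory_order_relaxed);  // 0 = free
    }
    std::size_t size() const { return values_.size(); }

    void add(std::size_t i, long delta) {
        assert(i < values_.size());
        std::uint8_t expected = 0;
        // CAS 0 -> 1 takes the lock for element i only.
        while (!locks_[i].compare_exchange_weak(expected, 1,
                                                std::memory_order_acquire,
                                                std::memory_order_relaxed)) {
            expected = 0;  // a failed CAS overwrites `expected`; reset it
        }
        values_[i] += delta;
        locks_[i].store(0, std::memory_order_release);  // unlock
    }

    // Only meaningful once all writers have finished.
    long get(std::size_t i) const { return values_[i]; }

private:
    std::vector<long> values_;
    std::vector<std::atomic<std::uint8_t>> locks_;
};

// Demo driver: `threads` workers each add 1 to every slot `rounds` times.
inline void bump_all(ByteLockedArray& arr, unsigned threads, int rounds) {
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&arr, rounds] {
            for (int r = 0; r < rounds; ++r)
                for (std::size_t i = 0; i < arr.size(); ++i) arr.add(i, 1);
        });
    }
    for (auto& th : pool) th.join();
}
```

One design caveat: packing the lock bytes tightly means neighbouring locks share a cache line, so threads working on adjacent elements can still slow each other down through false sharing even though they never wait on the same lock.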

+7
Jul 16 '10 at 17:05

If you have performance-critical code, and you have determined that it needs to be faster than it currently is, and you have determined that locking speed is the critical factor, then it would be a good idea to try a spinlock. Otherwise, why bother? Normal locks are easier to use correctly.

+5
Sep 21 '09 at 19:44

Note the following points:

  • Most mutex implementations spin for a little while before the thread is actually descheduled. Because of this, it is hard to compare these mutexes with pure spinlocks.

  • Several threads spinning "as fast as possible" on the same spinlock will consume all the bandwidth and drastically reduce your program's efficiency. You need to add a tiny "sleep" time by putting a no-op in your spin loop.
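The second point is commonly handled with a "test and test-and-set" loop that waits on plain loads and issues a pause hint between checks. The sketch below is one possible shape, not the answerer's code: `BackoffSpinLock`, `cpu_relax` and `hammer` are hypothetical names, `_mm_pause()` is used on x86, and `std::this_thread::yield()` is an assumed fallback elsewhere.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
#include <immintrin.h>
inline void cpu_relax() noexcept { _mm_pause(); }  // x86 PAUSE hint
#else
inline void cpu_relax() noexcept { std::this_thread::yield(); }  // fallback
#endif

class BackoffSpinLock {
public:
    void lock() noexcept {
        for (;;) {
            // Attempt to take the lock; exchange returns the old value.
            if (!locked_.exchange(true, std::memory_order_acquire)) return;
            // Wait on read-only loads so waiters do not keep writing to
            // the cache line, and relax the CPU between checks.
            while (locked_.load(std::memory_order_relaxed)) cpu_relax();
        }
    }
    bool try_lock() noexcept {
        return !locked_.exchange(true, std::memory_order_acquire);
    }
    void unlock() noexcept {
        assert(locked_.load(std::memory_order_relaxed));  // must be held
        locked_.store(false, std::memory_order_release);
    }
private:
    std::atomic<bool> locked_{false};
};

// Demo driver: `threads` workers increment one counter `iters` times each.
inline long hammer(unsigned threads, long iters) {
    BackoffSpinLock lock;
    long counter = 0;
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (long i = 0; i < iters; ++i) {
                std::lock_guard<BackoffSpinLock> guard(lock);
                ++counter;
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Whether the pause hint actually helps, and how long to back off, is workload-dependent; it is worth measuring against the plain loop before adopting it.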

+4
Sep 21 '09 at 19:31

You hardly ever need to use spinlocks in application code; if anything, you should avoid them.

I cannot think of any reason to use a spinlock in C# code running on a normal OS. Busy-waiting locks are mostly a waste at the application level: spinning can burn your entire CPU timeslice, whereas a lock will immediately trigger a context switch if needed.

High-performance code where the number of threads equals the number of processors/cores might benefit in some cases, but if you need performance optimization at that level, you are probably making a next-gen 3D game, working on an embedded OS with poor synchronization primitives, writing an OS or driver, or in any case not using C#.

+3
Sep 21 '09 at 19:12

I used spinlocks for the stop-the-world phase of the garbage collector in my HLVM project because they are easy and it is a toy VM. However, spinlocks can be counterproductive in that context:

One of the major outstanding bugs in the Glasgow Haskell Compiler's garbage collector is so annoying that it has a name, the "last core slowdown". It is a direct consequence of their inappropriate use of spinlocks in their GC and is exacerbated on Linux by its scheduler but, in fact, the effect can be observed whenever other programs are competing for CPU time.

The effect is clear in the second graph here and can be seen affecting more than just the last core here, where the Haskell program sees performance degradation beyond only 5 cores.

+3
Jun 11 '10 at

Always keep these points in mind when using spinlocks:

  • Fast user-mode execution.
  • Synchronizes threads within a single process, or across multiple processes if placed in shared memory.
  • Does not return until the lock is owned.
  • Does not support recursion.
  • Consumes 100% of the CPU while "waiting".
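The "does not support recursion" point is easy to demonstrate. In the sketch below (hypothetical names `NonRecursiveSpinLock` and `reentry_succeeds`, a minimal lock written only for this illustration), a thread that already holds the lock fails to take it again; had it called `lock()` instead of `try_lock()`, it would spin forever.

```cpp
#include <atomic>
#include <cassert>

// Minimal spinlock with no notion of an owning thread.
class NonRecursiveSpinLock {
public:
    bool try_lock() noexcept {
        // Succeeds only if the flag was previously false.
        return !locked_.exchange(true, std::memory_order_acquire);
    }
    void lock() noexcept {
        while (!try_lock()) {}  // busy-wait: burns CPU until acquired
    }
    void unlock() noexcept {
        assert(locked_.load(std::memory_order_relaxed));  // must be held
        locked_.store(false, std::memory_order_release);
    }
private:
    std::atomic<bool> locked_{false};
};

// Returns whether the holder of the lock can acquire it a second time.
inline bool reentry_succeeds() {
    NonRecursiveSpinLock lock;
    lock.lock();
    // The lock does not record who holds it, so a second attempt from
    // the same thread looks exactly like contention from another thread.
    const bool reentered = lock.try_lock();
    lock.unlock();
    return reentered;
}
```

This is the mechanism behind many self-inflicted hangs: a function that takes the lock calls a helper that takes the same lock, and the thread waits on itself indefinitely.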

I have personally seen so many deadlocks just because someone thought it would be a good idea to use a spinlock.

Be very, very careful when using spinlocks.

(I cannot stress this enough).

0
Jun 12 '19 at 16:43