Even with a fair critical section, the code is likely to perform terribly: if the critical section is held for long periods of time, threads will often be waiting for it.
So I suggest you try restructuring the code so that it does not need to hold critical sections for long periods of time. Either use a different approach altogether (passing objects over message queues is often recommended, because it is easy to get right) or, at least, do most of the calculation on local variables without holding the lock, and then take the lock only to store the results. If a lock is held for shorter periods of time, threads will spend less time waiting for it, which will generally improve performance and make fairness much less of an issue. You can also try increasing lock granularity (locking smaller objects separately), which will also reduce contention.
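As a rough sketch of the "calculate on locals, lock only to store" idea, here is a small Python example using the standard `threading` module. The names `expensive`, `worker`, `run`, `results` and `results_lock` are made up for illustration and are not from the question; the same shape applies to a pthread mutex or any other critical section.

```python
import threading

# Hypothetical shared state and the lock that protects it.
results = {}
results_lock = threading.Lock()


def expensive(n):
    """Stand-in for a long computation; it touches only local data."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def worker(key, n):
    # Do the heavy work WITHOUT holding the lock.
    value = expensive(n)
    # Hold the lock only for the short store into shared state.
    with results_lock:
        results[key] = value


def run(jobs):
    """Start one thread per (key, n) job and wait for all of them."""
    threads = [threading.Thread(target=worker, args=(k, n)) for k, n in jobs]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # All workers have finished, so reading without the lock is safe here.
    return dict(results)
```

Calling `run([("a", 10), ("b", 100)])` returns a dictionary with one entry per job. The point is only that the lock is held for a single dictionary store rather than for the whole computation, so other threads rarely find it taken.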
Edit: Thinking about this some more, I believe every critical section in Linux is approximately fair. Whenever there are sleepers, the unlock operation has to enter the kernel to tell it to wake them. On return from the kernel, the scheduler runs and picks the process with the highest priority. Sleepers gain priority while they wait, so at some point their priority will be high enough that the release causes a task switch.
Jan Hudec