Gating Protection for Acquire / Release Synchronization

I have a temp file that is divided into 4K chunks (or some such value). Each 4K chunk is represented by an index starting at zero. For this file, I track which 4K chunks are in use and always return the lowest-indexed chunk that is not in use, or -1 if they are all in use.

The ResourceSet class that manages these indexes exposes public acquire and release methods, both of which take a synchronized lock whose hold time is roughly the cost of generating 4 random numbers (expensive, CPU-wise).

Therefore, as you can see in the code below, I use an AtomicInteger as a "counting semaphore" to keep a large number of threads from piling into the critical section on acquire(), returning -1 (nothing available right now) if too many threads are already contending.

I am currently using the constant 50 as the hard CAS retry limit when incrementing the atomic integer in the acquirer, and the constant 10 as the maximum number of contending threads, assuming a critical section long enough to create contention. My question is: what should these constants be for a moderately loaded servlet engine with many threads trying to acquire these 4K chunks?

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ResourceSet {

    // ??? what should this be
    // maximum number of attempts to try to increment with CAS on acquire
    private static final int CAS_MAX_ATTEMPTS = 50;

    // ??? what should this be
    // maximum number of threads contending for lock before returning -1 on acquire
    private static final int CONTENTION_MAX = 10;

    private AtomicInteger latch = new AtomicInteger(0);

    // ... member variables to track free resources

    private boolean aquireLatchForAquire() {
        for (int i = 0; i < CAS_MAX_ATTEMPTS; i++) {
            int val = latch.get();
            if (val == -1)
                // this means more threads than can exist on any system, so it's a bug!
                throw new AssertionError("bug in ResourceSet");
            if (!latch.compareAndSet(val, val + 1))
                continue;
            if (val < 0 || val >= CONTENTION_MAX) {
                latch.decrementAndGet(); // added to fix the bug a comment pointed out, thanks!
                return false;
            }
            return true; // successfully entered the gate
        }
        return false;
    }

    private void aquireLatchForRelease() {
        do {
            int val = latch.get();
            if (val == -1)
                // this means more threads than can exist on any system, so it's a bug!
                throw new AssertionError("bug in ResourceSet");
            if (latch.compareAndSet(val, val + 1))
                return;
        } while (true);
    }

    public ResourceSet(int totalResources) {
        // ... initialize
    }

    public int acquire(ResourceTracker owned) {
        if (!aquireLatchForAquire())
            return -1;
        try {
            synchronized (this) {
                // ... algorithm to compute minimum free resource, or return -1 if all in use
                return resourceindex;
            }
        } finally {
            latch.decrementAndGet();
        }
    }

    public boolean release(ResourceIter iter) {
        aquireLatchForRelease();
        try {
            synchronized (this) {
                // ... iterate and release all resources
            }
        } finally {
            latch.decrementAndGet();
        }
    }
}
```

Writing a good, robust spinlock is actually quite complicated and requires a solid understanding of memory barriers. Simply picking a constant is not going to cut it, and it certainly will not be portable. Google's gperftools has an example you can look at, but it is probably far more complicated than what you need.

If you really want to reduce contention on the lock, you could consider a finer-grained, more optimistic scheme. A simple one is to divide your chunks into n groups and associate a lock with each group (also known as lock striping). This helps reduce contention and increase throughput, but it will not help reduce latency. You could also associate an AtomicBoolean with each chunk and CAS it to acquire (retrying on failure). Be careful when working with lock-free algorithms, because they are notoriously hard to get right. If you do get it right, it can significantly reduce the latency of acquiring a chunk.
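As a minimal sketch of the per-chunk AtomicBoolean idea (the class and method names here are illustrative, not from the question; under contention the "lowest free index" result is only approximate, since the scan and the CAS are not atomic together):

```java
import java.util.concurrent.atomic.AtomicBoolean;

// One AtomicBoolean per 4K chunk; acquire() claims a slot with a single CAS
// (false -> true), release() flips it back. No global lock is taken.
public class ChunkSet {
    private final AtomicBoolean[] inUse;

    public ChunkSet(int totalChunks) {
        inUse = new AtomicBoolean[totalChunks];
        for (int i = 0; i < totalChunks; i++) {
            inUse[i] = new AtomicBoolean(false);
        }
    }

    // Scans from index 0, so it returns the lowest free chunk it can claim,
    // or -1 if every chunk is in use.
    public int acquire() {
        for (int i = 0; i < inUse.length; i++) {
            // plain read first, CAS only when the slot looks free
            if (!inUse[i].get() && inUse[i].compareAndSet(false, true)) {
                return i;
            }
        }
        return -1;
    }

    public void release(int index) {
        inUse[index].set(false);
    }
}
```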

Note that it’s hard to suggest a finer-grained approach without knowing what your chunk-selection algorithm looks like. I am also assuming that you genuinely have a performance problem here (i.e., you have profiled it).

While I'm at it: your spinlock implementation is broken. You should never spin directly on a CAS, because failed CAS attempts spam the memory bus. With any serious contention this will be incredibly slow (it is related to the thundering herd problem). At a minimum, first check the variable with a plain read before attempting your CAS. Better still would be to not have all your threads spinning on the same value, which avoids the associated cache line ping-ponging between your cores.
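A sketch of the "test, then test-and-set" variant of the question's gate (names are illustrative, and the constants are the question's placeholders, not recommendations): the cheap read lets overcrowded threads bail out, or retry locally, without issuing a CAS at all.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Contention gate: at most CONTENTION_MAX threads may be inside at once.
// tryEnter() reads the counter first ("test") and only then attempts the
// CAS ("test-and-set"), so contended threads mostly spin on a cached read
// instead of hammering the bus with failing CAS operations.
public class ContentionGate {
    private static final int CONTENTION_MAX = 10;
    private static final int MAX_ATTEMPTS = 50;

    private final AtomicInteger latch = new AtomicInteger(0);

    public boolean tryEnter() {
        for (int i = 0; i < MAX_ATTEMPTS; i++) {
            int val = latch.get();              // plain read first
            if (val >= CONTENTION_MAX)
                return false;                   // too crowded: no CAS issued
            if (latch.compareAndSet(val, val + 1))
                return true;                    // entered the gate
            // CAS lost a race: loop, re-read, and retry
        }
        return false;
    }

    public void exit() {
        latch.decrementAndGet();
    }
}
```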

Please note that I do not know exactly which memory-barrier semantics Java's atomic operations carry, so my suggestions above may not be optimal or correct.

Finally, The Art of Multiprocessor Programming is a fun read if you want to get better acquainted with all the arcana I have been spouting in this answer.


I'm not sure you need to roll your own lock class for this scenario, because the JDK already provides ReentrantLock, which also uses CAS internally when locking. Its performance should compare well with your hand-rolled lock class.


You can use Semaphore's tryAcquire method if you want your threads to back off when no resource is available.
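A minimal sketch of that suggestion (class and method names are illustrative): one permit per 4K chunk, with tryAcquire() failing immediately instead of blocking, which mirrors the question's "return -1 when too crowded" behaviour.

```java
import java.util.concurrent.Semaphore;

// A permit per chunk; tryAcquire() returns false without blocking when
// all permits are taken, so callers can back off instead of waiting.
public class ChunkPermits {
    private final Semaphore permits;

    public ChunkPermits(int totalChunks) {
        permits = new Semaphore(totalChunks);
    }

    // true if a chunk permit was obtained without blocking
    public boolean tryAcquireChunk() {
        return permits.tryAcquire();
    }

    public void releaseChunk() {
        permits.release();
    }
}
```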

For a start, I would simply replace your synchronized block with a ReentrantLock and use its tryLock() method. If you want your threads to wait a little, you can use tryLock(timeout) on the same class. Which one to choose, and what value to use for the timeout, should be determined with a performance test.
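A sketch of that replacement, under the assumption that the question's elided free-chunk search is something like the `findLowestFreeChunk` helper below (a hypothetical stand-in, not the asker's actual algorithm):

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// synchronized replaced by ReentrantLock.tryLock: when the lock is
// contended past the timeout, acquire() returns -1 immediately, which the
// caller treats the same as "no chunk available right now".
public class LockedChunkSet {
    private final ReentrantLock lock = new ReentrantLock();
    private final boolean[] inUse;

    public LockedChunkSet(int totalChunks) {
        inUse = new boolean[totalChunks];
    }

    public int acquire(long timeoutMillis) {
        try {
            // timeout 0 fails fast; a small timeout lets threads wait briefly
            if (!lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS))
                return -1;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        }
        try {
            return findLowestFreeChunk();
        } finally {
            lock.unlock();
        }
    }

    public void release(int index) {
        lock.lock();
        try {
            inUse[index] = false;
        } finally {
            lock.unlock();
        }
    }

    // hypothetical stand-in for the question's elided search
    private int findLowestFreeChunk() {
        for (int i = 0; i < inUse.length; i++) {
            if (!inUse[i]) {
                inUse[i] = true;
                return i;
            }
        }
        return -1;
    }
}
```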

Building an explicit gate like this strikes me as unnecessary. I'm not saying it can never help, but IMO it is more likely to hurt performance, and it is certainly added complexity. So unless you actually have a performance problem here (based on a test you ran) and find that this kind of gating helps, I would recommend going with the simplest implementation.


Source: https://habr.com/ru/post/1411816/

