Are there any lock-free blocking queue implementations?

I know about blocking queues and about non-blocking queues (the implementation by Michael and Scott is a great example), but are there any implementations of a queue that is both lock-free and blocking?

In such a queue, dequeue would not require locking, but if there are no elements in the queue it would block the caller. Are there any implementations of such a beast? I'd prefer C# implementations, but any implementation would technically work.

Update:

I think I ended up with a race condition on line D14.1:

initialize(Q: pointer to queue_t)
   node = new node()                // Allocate a free node
   node->next.ptr = NULL            // Make it the only node in the linked list
   Q->Head = Q->Tail = node         // Both Head and Tail point to it
   signal = new ManualResetEvent()  // Create a manual reset event

enqueue(Q: pointer to queue_t, value: data type)
E1:    node = new node()            // Allocate a new node from the free list
E2:    node->value = value          // Copy enqueued value into node
E3:    node->next.ptr = NULL        // Set next pointer of node to NULL
E4:    loop                         // Keep trying until Enqueue is done
E5:       tail = Q->Tail            // Read Tail.ptr and Tail.count together
E6:       next = tail.ptr->next     // Read next ptr and count fields together
E7:       if tail == Q->Tail        // Are tail and next consistent?
E8:          if next.ptr == NULL    // Was Tail pointing to the last node?
E9:             if CAS(&tail.ptr->next, next, <node, next.count+1>)  // Try to link node at the end of the linked list
E10.1:             signal.Set()     // Signal the blocking dequeues
E10.2:             break            // Enqueue is done. Exit loop
E11:            endif
E12:         else                   // Tail was not pointing to the last node
E13:            CAS(&Q->Tail, tail, <next.ptr, tail.count+1>)  // Try to swing Tail to the next node
E14:         endif
E15:      endif
E16:   endloop
E17:   CAS(&Q->Tail, tail, <node, tail.count+1>)  // Enqueue is done. Try to swing Tail to the inserted node

dequeue(Q: pointer to queue_t, pvalue: pointer to data type): boolean
D1:    loop                         // Keep trying until Dequeue is done
D2:       head = Q->Head            // Read Head
D3:       tail = Q->Tail            // Read Tail
D4:       next = head->next         // Read Head.ptr->next
D5:       if head == Q->Head        // Are head, tail, and next consistent?
D6:          if head.ptr == tail.ptr   // Is queue empty or Tail falling behind?
D7:             if next.ptr == NULL    // Is queue empty?
D8.1:              signal.WaitOne()    // Block until an enqueue
D8.X:              // remove the return --- return FALSE  // Queue is empty, couldn't dequeue
D9:             endif
D10:            CAS(&Q->Tail, tail, <next.ptr, tail.count+1>)  // Tail is falling behind. Try to advance it
D11:         else                   // No need to deal with Tail
                // Read value before CAS, otherwise another dequeue might free the next node
D12:            *pvalue = next.ptr->value
D13:            if CAS(&Q->Head, head, <next.ptr, head.count+1>)  // Try to swing Head to the next node
D14.1:             if (head.ptr == tail.ptr && next.ptr == NULL)  // Is queue empty? <--- POSSIBLE RACE CONDITION???
D14.2:                signal.Reset()
D14.3:             break            // Dequeue is done. Exit loop
D15:            endif
D16:         endif
D17:      endif
D18:   endloop
D19:   free(head.ptr)               // It is safe now to free the old dummy node
D20:   return TRUE                  // Queue was not empty, dequeue succeeded
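If Set() and Reset() can indeed interleave badly at D14.1, one common way to avoid the problem entirely is to pair the lock-free queue with a counting semaphore that tracks the number of items, so no Reset() (and hence no emptiness re-check) is ever needed. A minimal sketch in Java, using `ConcurrentLinkedQueue` (which is based on the Michael-Scott algorithm); the class and method names are mine, not from the question:

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Semaphore;

// Sketch: a lock-free queue plus a counting semaphore tracking item count.
// Every enqueue releases exactly one permit and every dequeue acquires
// exactly one, so there is no Set()/Reset()-style emptiness check to race on.
class BlockingLockFreeQueue<T> {
    private final ConcurrentLinkedQueue<T> queue = new ConcurrentLinkedQueue<>();
    private final Semaphore items = new Semaphore(0);

    public void enqueue(T value) {
        queue.add(value);  // lock-free enqueue (Michael-Scott based)
        items.release();   // publish one permit per item
    }

    public T dequeue() throws InterruptedException {
        items.acquire();      // blocks while there are no items
        return queue.poll();  // a successful acquire guarantees an item is present
    }
}
```

The permit is released only after the item is linked in, so a consumer that gets past `acquire()` always finds an element.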
2 answers

EDIT:

SIMPLIFY: I suggest you don't need both a head and a tail for your queue. Just have a head. If head == NULL, the list is empty. Add items at the head; remove items from the head. That's simpler and needs fewer CAS operations.
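A minimal sketch of that head-only suggestion is a Treiber-style stack; the Java below is my own illustration (names and the use of `AtomicReference` are assumptions, not from the answer). Note that pushing and popping at the head gives LIFO order:

```java
import java.util.concurrent.atomic.AtomicReference;

// Head-only lock-free list: one CAS per push, one CAS per pop.
class TreiberStack<T> {
    private static final class Node<T> {
        final T value;
        Node<T> next;
        Node(T value) { this.value = value; }
    }

    private final AtomicReference<Node<T>> head = new AtomicReference<>();

    public void push(T value) {
        Node<T> n = new Node<>(value);
        do {
            n.next = head.get();                 // snapshot current head
        } while (!head.compareAndSet(n.next, n)); // swing head to the new node
    }

    public T pop() {
        Node<T> h;
        do {
            h = head.get();
            if (h == null) {
                return null; // empty: a blocking variant would sleep/retry here
            }
        } while (!head.compareAndSet(h, h.next)); // swing head past the node
        return h.value;
    }
}
```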

HELPER: I suggested in the comments that you think about a helper scheme to deal with the race. In my interpretation of what "lock-free" means, it's fine to have rare race conditions as long as they don't cause problems. I like the extra performance, and an idle thread sleeping a couple of ms too long doesn't bother me.

Helper ideas: when a consumer grabs work, it can check whether any thread is sleeping; when a producer adds work, it can look for sleeping threads to wake.

So keep track of the sleepers. Use a linked list of sleepers. When a thread decides there is no work, it marks itself as !awake and CASes itself onto the head of the sleeper list. When it receives a wake-up signal, the thread marks itself as awake. The newly awakened thread then has to clean itself off the sleeper list. Cleaning a concurrent singly linked list takes care, since you can only use CAS. While the head of the sleeper list is awake, you can CAS the head off. If the head is not awake, keep scanning the list and "lazy unlink" (I made this term up) the remaining awake items. A lazy unlink is simple: just set the previous element's next pointer past the awake element. A concurrent scan will still reach the end of the list even if it runs into elements that are awake, and subsequent scans see a shorter list. Finally, any time you add work or take work, scan the sleeper list for sleeping items. If a consumer notices work remaining after grabbing some work (next work != NULL), the consumer can scan the sleeper list and signal the first thread that is still sleeping. After a producer adds work, the producer can do the same.
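A compact Java sketch of such a sleeper list, under my own naming: `LockSupport` park/unpark stands in for whatever signaling mechanism you use, and the lazy unlink is deliberately simplified (an unlinked awake node is merely skipped by later scans, never a sleeping one):

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.LockSupport;

class SleeperList {
    static final class Sleeper {
        final Thread thread = Thread.currentThread(); // the registering thread
        final AtomicBoolean awake = new AtomicBoolean(false);
        volatile Sleeper next;
    }

    private final AtomicReference<Sleeper> head = new AtomicReference<>();

    // Consumer side: mark ourselves !awake, CAS onto the head of the list,
    // then park until a waker flips our flag and unparks us.
    void sleepUntilWoken() {
        Sleeper s = new Sleeper();
        do {
            s.next = head.get();
        } while (!head.compareAndSet(s.next, s));
        while (!s.awake.get()) {
            LockSupport.park();
        }
    }

    // Waker side: signal the first still-sleeping entry. Already-awake
    // entries get "lazy unlinked": the previous node's next pointer is set
    // past them, so later scans see a shorter list.
    void wakeOne() {
        Sleeper prev = null;
        for (Sleeper s = head.get(); s != null; s = s.next) {
            if (s.awake.compareAndSet(false, true)) {
                LockSupport.unpark(s.thread);
                return;
            }
            if (prev != null) {
                prev.next = s.next; // lazy unlink of an already-awake node
            }
            prev = s;
        }
    }
}
```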

If you have a broadcast scenario and cannot signal a single thread, just keep a count of sleeping threads. While that count is > 0, a consumer that notices remaining work, and a producer that adds work, broadcast the wake-up signal.

In our environment we have one thread per SMT, so the sleeper list can never get that big (well, unless I get hold of one of those new 128-hardware-thread machines!). We generate work items early in a transaction: in the first second we might generate 10,000 work items, and production drops off rapidly after that. Threads then work on those items for a few seconds, so we rarely have a thread in the idle pool.

YOU COULD ALSO JUST USE LOCKS: if you only have one thread and rarely have work, none of this applies to you. In that case the performance of mutexes is not a concern, and you should just use them, with a sleep/wake queue on the lock. Think of it as being "lock-free where it counts."
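For that lock-based case, a compact Java version using a mutex and a condition variable (my own sketch, not code from the answer) is all you need:

```java
import java.util.ArrayDeque;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Plain locked blocking queue: simple, correct, and fine when contention is low.
class LockedBlockingQueue<T> {
    private final ArrayDeque<T> items = new ArrayDeque<>();
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notEmpty = lock.newCondition();

    public void enqueue(T value) {
        lock.lock();
        try {
            items.addLast(value);
            notEmpty.signal(); // wake one sleeping consumer, if any
        } finally {
            lock.unlock();
        }
    }

    public T dequeue() throws InterruptedException {
        lock.lock();
        try {
            while (items.isEmpty()) {
                notEmpty.await(); // sleep until a producer signals
            }
            return items.removeFirst();
        } finally {
            lock.unlock();
        }
    }
}
```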

PREVIOUS POST: You say: there is a queue of work, and there are many consumer threads. A consumer should pull work off and do it if there is any work, and a consumer thread should sleep until there is work.

If so, we do this using only atomic operations, as follows:

A work queue is a linked list. There is also a linked list of sleeping threads.

To add work: CAS the head of the list to the new job. When work is added, we first check whether there are any threads on the sleeper list. If there are, before adding the work we CAS a sleeper off the sleeper list, set its ->work = the new job, and then signal that sleeper to wake up. Otherwise we add the work to the work queue.

To consume work: CAS the head of the list to head->next. If the head of the work list is NULL, the thread CASes itself onto the sleeper list.

Once a thread has a work item, the thread must CAS the work item's state to WORK_INPROGRESS or some such. If that CAS fails, it means the item is being worked on by another thread, so the consumer goes back to looking for work. If a thread wakes up holding a work item, it still has to CAS the state.
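The "CAS the state" step above can be sketched in Java like this (the state values and names are illustrative, not from the post); exactly one thread wins the claim, and every loser goes back to looking for work:

```java
import java.util.concurrent.atomic.AtomicInteger;

// A work item whose state is claimed via CAS, so two consumers holding the
// same item cannot both execute it.
class WorkItem {
    static final int PENDING = 0;
    static final int WORK_INPROGRESS = 1;

    private final AtomicInteger state = new AtomicInteger(PENDING);

    // Returns true only for the single thread that wins the claim.
    boolean tryClaim() {
        return state.compareAndSet(PENDING, WORK_INPROGRESS);
    }
}
```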

So, when work is added, a sleeping consumer always wakes up and takes over the work. pthread_kill() always wakes a thread in sigwait(), because even if the thread reaches sigwait() after the signal is sent, the signal is still received. That solves the problem of a thread that puts itself on the sleeper list but gets signaled before it actually goes to sleep. All that happens is that the thread tries to own its ->work, if one exists. Failing to own the work, or finding none, sends the thread back to consuming. If a thread fails its CAS onto the sleeper list, it means either another thread beat it to the spot or the producer removed a sleeper. For safety, we have the thread act as if it had just been woken up.
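In Java, `LockSupport` gives the same "signal before sleep is not lost" property as the pthread_kill()/sigwait() pairing described above: `unpark()` stores a permit, and a later `park()` consumes it and returns immediately. A tiny demonstration (my own example, not from the answer):

```java
import java.util.concurrent.locks.LockSupport;

public class PermitDemo {
    // Deliver the wake-up before the sleep; the sleep still does not hang.
    static boolean wakeBeforeSleep() {
        Thread self = Thread.currentThread();
        LockSupport.unpark(self); // the "signal" arrives before the "sleep"
        LockSupport.park();       // consumes the stored permit; returns at once
        return true;              // reached because the permit was retained
    }

    public static void main(String[] args) {
        System.out.println(wakeBeforeSleep());
    }
}
```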

We don't get any race conditions doing this with multiple producers and consumers. We were also able to extend this to let threads sleep on individual work items.


.NET Parallel Extensions (built in, for .NET 4.0+):

http://blogs.msdn.com/b/pfxteam/archive/2010/01/26/9953725.aspx


An implementation from a StackOverflow question:

Lock free constructs in .net


Reply to the clarification in the comments:

If the block-when-empty is not a busy wait (i.e. it waits on a signal), it sounds like you need a counting semaphore to wait on.

An alternative approach could be to use a regular queue, plus atomic compare-and-exchange or a spin lock to prevent simultaneous access: if a consumer thread tries to dequeue while the queue is empty, it blocks on a semaphore; when a producer thread enqueues into an empty queue, it signals the semaphore to wake the sleeping consumers (each of which goes back and retries the dequeue, so several threads only get through if there are enough elements in the queue for them).

e.g. .// pseudo code

 /// Non-blocking lock (busy wait)
 void SpinLock()
 {
     while (CompareAndExchange(myIntegerLock, -1, 0) != 0)
     {
         // wait
     }
 }

 void UnSpinLock()
 {
     Exchange(myIntegerLock, 0);
 }

 void AddItem(item)
 {
     // Use CAS for synchronization
     SpinLock(); // Non-blocking lock (busy wait)

     queue.Push(item);

     // Unblock any blocked consumers
     if (queue.Count() == 1)
     {
         semaphore.Increase();
     }

     // End of CAS synchronization block
     UnSpinLock();
 }

 Item RemoveItem()
 {
     // Use CAS for synchronization
     SpinLock(); // Non-blocking lock (busy wait)

     // If empty then block
     if (queue.Count() == 0)
     {
         // End of CAS synchronization block
         UnSpinLock();

         // Block until queue is not empty
         semaphore.Decrease();

         // Try again (may fail again if there is more than one consumer)
         return RemoveItem();
     }

     result = queue.Pop();

     // End of CAS synchronization block
     UnSpinLock();

     return result;
 }
