Moving buffers in streams with one writer and multiple readers

History

There is a stream of letters periodically collecting data from somewhere (in real time, but this does not really matter in the question). After that, many read this data. The usual solution for this is with two reader latches and two such buffers:

Writer (case 1): acquire lock 0 loop write to current buffer acquire other lock free this lock swap buffers wait for next period 

or

 Writer (case 2): acquire lock 0 loop acquire other lock free this lock swap buffers write to current buffer wait for next period 

Problem

In both methods, if another lock operation fails, the replacement is not performed, and the author overwrites his previous data (because the writer is in real time, he cannot wait for readers). Thus, in this case, all readers lose this data frame.

This is not such a big problem, but readers are my own code, and they are short, so this problem is solved with a double buffer, and if a problem occurs, I could make it a triple buffer (or more).

The problem is the delay, which I want to minimize. Imagine case 1:

 writer writes to buffer0 reader is reading buffer1 writer can't acquire lock1 because reader is still reading buffer1 | | | reader finishes reading, | (writer waiting for next period) <- **this point** | | writer wakes up, and again writes to buffer0 

At ** this point **, other theoretical readers could read buffer0 data, if only the writer could swap after the reader has finished, rather than waiting for his next period. What happened in this case is that only one reader was a little late, all readers missed one frame of data, while the problem could be completely avoided.

Case 2 is similar:

 writer writes to buffer0 reader is idle | | | reader finishes reading, | (writer waiting for next period) | | reader starts reading buffer1 writer wakes up | it can't acquire lock0 because reader is still reading buffer1 overwrites buffer0 

I tried mixing solutions, so the writer tries to exchange buffers immediately after recording, and if this is not possible, immediately after waking up in the next period. So something like this:

 Writer (case 3): acquire lock 0 loop if last buffer swap failed acquire other lock free this lock swap buffers write to current buffer acquire other lock free this lock swap buffers wait for next period 

Now the delay problem persists:

 writer writes to buffer0 reader is reading buffer1 writer can't acquire lock1 because reader is still reading buffer1 | | | reader finishes reading, | (writer waiting for next period) <- **this point** | | writer wakes up swaps buffers writes to buffer1 

Again at ** this point ** all readers can start reading buffer0 , which is a short delay after buffer0 been written, but instead they should wait for the next write period.

Question

The question is, how can I handle this? If I want the script to be executed exactly in the right period, it needs to wait for the period using the RTAI function, and I cannot do it like

 Writer (case 4): acquire lock 0 loop write to current buffer loop a few times or until the buffer has been swapped sleep a little acquire other lock free this lock swap buffers wait for next period 

This leads to jitter. because "several times" may turn out to be longer than "waiting for the next period", so the writer may skip the beginning of his period.

To be more clear, here is what I want:

 writer writes to buffer0 reader is reading buffer1 | | | reader finishes reading, | (writer waiting for next period) As soon as all readers finish reading, | the buffer is swapped | readers start reading buffer0 writer wakes up | writes to buffer1 

What i found already

I found read-copy-update , which, as I understand it, continues to allocate memory for buffers and frees them until readers do with them, which is impossible for me for many reasons. First, threads are shared between the kernel and user space. Secondly, with RTAI you cannot allocate memory in a real-time stream (because then your stream will cause Linux system calls and, therefore, disrupt real performance! (Not to mention that using your own RCU implementation in Linux is useless from for the same reasons)

I also thought that there is an extra thread that tries to swap buffers at a higher frequency, but that doesn't seem like such a good idea. Firstly, he himself would have to synchronize with the writer, and secondly, it’s good that I have many of these reader writers working in different parts in parallel, and one additional stream for each writer just seems too big. One thread for all authors seems very complicated in terms of synchronization with each author.

+4
source share
5 answers

What API do you use for read / write locks? Do you have a temporary lock like pthread_rwlock_timedwrlock ? If so, I think this is the solution to your problem, for example, in the following code:

 void *buf[2]; void writer () { int lock = 0, next = 1; write_lock (lock); while (1) { abs_time tm = now() + period; fill (buf [lock]); if (timed_write_lock (next, tm)) { unlock (lock); lock = next; next = (next + 1) & 1; } wait_period (tm); } } void reader () { int lock = 0; while (1) { reade_lock (lock); process (buf [lock]); unlock (lock); lock = (lock + 1) & 1; } } 

What is happening here is that it does not matter to the writer whether he expects blockages or for the next period, until he is sure to wake up before the next period. An absolute timeout provides this.

+3
source

Isn't that a problem that triple buffering should solve. Thus, you have 3 buffers, allows you to call them write1, write2 and read. The write stream alternates between writing to write1 and write2, ensuring that they are never blocked and that the last full frame is always available. Then, in the read streams at some suitable point (say, immediately before or after reading the frame), the read buffer is translated with the available write buffer.

Although this ensures that authors are never blocked (buffer overflows can be done very quickly just by clicking two pointers, perhaps even with a CAS atom instead of locking), there is still a problem when readers have to wait for others to finish reading with the buffer before flipping . I suggest that this may be a bit solved by RCU-esque with a pool of readable buffers, where available can be flipped.

+1
source
  • Use Queue (FIFO linked list)
  • The writer in real time will always add (enqueue) to the end of the queue
  • Readers will always delete (deactivate) from the beginning of the queue.
  • Readers will be blocked if the queue is empty.

to avoid dynamic allocation

I would probably use a circular queue ... I would use the __sync built-in atomic operations. http://gcc.gnu.org/onlinedocs/gcc-4.1.0/gcc/Atomic-Builtins.html#Atomic-Builtins

  • Circular Queue (FIFO 2d Array)
    • ex: byte [] [] Array = new byte [MAX_SIZE] [BUFFER_SIZE];
    • Pointers to start and end points
  • Writer overwrites buffer in Array [End] []
    • Writer can increment Start if it completes the cycle completely
  • Reader gets buffer from array [Start] []
    • Reading blocks if Start == End
+1
source

If you do not want the writer to wait, perhaps he should not acquire a lock that anyone else can hold. I would like it to perform some kind of synchronization to make sure that what it writes is really written out - usually most synchronization calls cause the flash memory command or barrier to be executed, but the details will depend on the memory model of your processor and the implementation of your package streams.

I would see if there is any other synchronization primitive around that works better, but if a push comes in, I would lock the record and unlock the lock that no one has ever used.

Readers should be prepared to miss something from time to time and should be able to detect when they missed the material. I would associate a validity flag and a long sequence counter with each buffer and ask the scriptwriter to do something like "a clear validity flag, count the increment sequence, synchronization, write to the buffer, count the increment sequence, set the validity flag, synchronization". If the reader reads the sequence counter, synchronizes, the truth flag true, reads data, synchronizes and re-reads the same sequence counter, then it is possible that he will not receive distorted data.

If you are going to do this, I will test it exhaustively. It seems plausible to me, but it may not work with your specific implementation of everything: from the compiler model to memory.

Another idea or way to verify this is to add the checksum to the buffer and write it last.

See also search by non-blocking algorithms such as http://www.rossbencina.com/code/lockfree

To go with this, you probably want the writer to be able to signal sleeping readers. Perhaps you can use Posix semaphores for this β€” for example, ask the reader to ask the writer to call sem_post () on a specific semaphore when it reaches a given sequence number or when the buffer becomes valid.

+1
source

Another option is to stick with the lock, but make sure readers never linger too long while holding the lock. Readers can keep time spent with a short and predictable lock, doing nothing while they hold that lock, but copy data from the write buffer. The only problem is that a reader with a lower priority can be interrupted by a task with a higher priority halfway through the record, and for this is http://en.wikipedia.org/wiki/Priority_ceiling_protocol .

Given this, if a writer stream has a high priority, the worst case that needs to be performed for each buffer is the writer stream, which fills the buffer, and for each reader stream, to copy data from this buffer to another buffer. If you can afford it in each cycle, then the flow of letters and a certain number of copies of data to read will always be completed, and readers who process the data that they copied may or may not do their job. If they don’t do this, they will be behind and notice it when they follow the lock capture and look around to see which buffer they want to copy.

FWIW, my experience in reading code in real time (when you want to show that there are errors, not in our code) is that it is incredibly and intentionally simple, very clearly laid out and not necessarily more effective than necessary to fit it deadlines, so some seemingly pointless copying of data to get a direct lock to work can be very useful if you can afford it.

0
source

All Articles