Buffer types

Recently, an interviewer asked me about the types of buffers. What types of buffers exist? Actually this question arose when I said that I would record all system calls in a log file to monitor the system. He said that he would write slowly every call to the file. How to prevent this. I said that I would use a buffer, and he asked me what type of buffer? Can someone explain me the types of buffers, please.

+4
source share
5 answers

C under UNIX (and possibly other operating systems) usually has two buffers, at least in your current scenario.

The first exists in C runtime libraries, where the information to be written is buffered before being delivered to the OS.

The second is located in the OS itself, where information is buffered until it can be physically recorded on the main medium.

As an example, we wrote a registration library many years ago, which forced information to be written to disk so that it was there if either the program crashed or the OS crashed.

This was achieved using the sequence:

fflush (fh); fsync (fileno (fh)); 

The first one actually guaranteed that the information was transferred from the C runtime buffers to the operating system, the second one was written to disk. Keep in mind that this is an expensive operation and should only be done if you absolutely need information written immediately (we did this only at the SUPER_ENORMOUS_IMPORTANT level).

Honestly, I'm not quite sure why your interviewer thought it would be slow if you did not write a lot of information. Two levels of buffering should already work there quite adequately. If this was a problem, you could simply enter another layer yourself, which wrote messages in the memory buffer, and then passed it to one call of type fprint when it was about to overflow.

But if you don’t do this without any function calls, I don’t see it be much faster than the fprint buffer already gives you.


After clarifying in the comments that this question is actually related to buffering inside the kernel:

Basically, you want it to be as fast, efficient and efficient (not prone to a crash or lack of resources) as possible.

Probably the best option would be a buffer statically allocated or dynamically allocated once at boot time (you want to avoid the possibility that dynamic redistribution will fail).

Others have proposed a circular (or circular) buffer, but I would not go this way (technically) for the following reason: using the classic circular buffer means that there will be two independent records to write data when it is wrapped. For example, if your buffer has:

 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |s|t|r|i|n|g| |t|o| |w|r|i|t|e|.| | | | | | |T|h|i|s| |i|s| |t|h|e| | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ^ ^ | | Buffer next --+ +-- Buffer start 

then you will need to write "This is the " and then "string to write." .

Instead, save the next pointer, and if the bytes in the buffer plus the added bytes are less than the size of the buffer, just add them to the buffer without physically writing to the main medium.

Only if you are going to overflow the buffer do you start to do complex things.

You can use one of two ways:

  • Either discard the buffer, how much it costs, return the next pointer to the beginning to process a new message; or
  • Add part of the message to fill the buffer, then clear it and return the next pointer to the beginning to process the rest of the message.

I would choose the second one, given that you have to take into account messages that are too large for the buffer.

I'm talking about something like this:

 initBuffer: create buffer of size 10240 bytes. set bufferEnd to end of buffer + 1 set bufferPointer to start of buffer return addToBuffer (size, message): while size != 0: xfersz = minimum (size, bufferEnd - bufferPointer) copy xfersz bytes from message to bufferPointer message = message + xfersz bufferPointer = bufferPointer + xfersz size = size - xfersz if bufferPointer == bufferEnd: write buffer to underlying media set bufferPointer to start of buffer endif endwhile 

It basically handles messages of any size efficiently, reducing the number of physical records. Of course, they will be optimized - it is possible that the message could have been copied to the kernel space, so it makes no sense to copy it to the buffer if you write it anyway. You can also write information from a copy of the kernel directly to the main medium and only transfer the last bit to the buffer (since you need to save it).

In addition, you probably want to flush an incomplete buffer to the main media if nothing has been written for a while. This would reduce the likelihood of a lack of information that the core itself crashes.

Also: technically, I suppose this is a kind of circular buffer, but it has special case handling to minimize the number of records and does not need to use a tail pointer because of this optimization.

+5
source

There are also ring buffers that have limited space requirements and are probably best known in the Unix dmesg facility.

+2
source

What comes to mind for me is time buffers and size. Thus, you can either just write everything in the buffer to write once every x seconds / minutes / hours or something else. Alternatively, you can wait until x records or byte byte values ​​in bytes are recorded and written immediately. This is one way to make log4net and log4J.

0
source

In general, there are First-In-First-Out ( FIFO ) buffers, also known as queues; and there are "Latest * -In-First-Out" ( LIFO ) buffers, also known as stacks.

To implement FIFO, there are circular buffers that are usually used where a fixed-size byte array is allocated. For example, this method may use a keyboard device driver or serial I / O. This is a common type of buffer, which is used when it is not possible to dynamically allocate memory (for example, in the driver, which is necessary for the OS virtual memory subsystem).

If dynamic memory is available, FIFO can be implemented in various ways, in particular, using list-bound data structures.

In addition, binomial heaps that implement priority queues can be used to implement the FIFO buffer.

A special case of neither the FIFO nor the LIFO buffer is the TCP segment reassembly buffers. They may contain segments received out of order ("from the future") that are held until intermediate segments that have not yet arrived.

* My abbreviation is better, but most will call LIFO "Last In, First Out", not the last.

0
source

Correct me if I am wrong, but I would not use the mmap file for the log to avoid both the overhead of small write syscalls and the possibility of data loss if the application (but not the OS) crashed? It seems like the perfect balance between performance and reliability for me.

0
source

Source: https://habr.com/ru/post/1314794/


All Articles