Linux processes

On Linux, what happens to the state of a process when it needs to read blocks from a disk? Is it blocked? If so, how does another process get chosen to run?

+81
linux process cpu kernel
Sep 25 '09 at 6:12
8 answers

While waiting for read() or write() on a file descriptor to return, the process is put into a special kind of sleep known as "D" or "disk sleep". It is special because the process cannot be killed or interrupted while in this state. A process waiting for ioctl() to return is put to sleep in the same way.

The exception is a file (for example, a terminal or other character device) opened in O_NONBLOCK mode, a flag that is passed when the device (for example, a modem) is expected to need time to initialize. However, your question specifies block devices. Also, I have never tried an ioctl() that is likely to block on an fd opened in non-blocking mode (at least not knowingly).

How another process is chosen depends entirely on the scheduler in use, and on what other processes may have done to change their weights within that scheduler.

Under certain circumstances, some user-space programs are known to remain in this state forever, until a reboot. They are usually lumped together with other "zombies", but that term is not correct, since they are technically not defunct.

+77
Sep 25 '09 at 6:25

When a process needs to fetch data from a disk, it effectively stops running on the CPU to let other processes run, because the operation can take a long time: a typical disk seek takes at least 5 ms, and 5 ms is 10 million CPU cycles (at roughly 2 GHz), an eternity from the program's point of view!
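The 10-million-cycle figure can be checked with a one-line calculation. This is only a sketch: the 2 GHz clock is an assumption (the answer does not name a clock speed), and `cycles_during` is a hypothetical helper name.

```python
def cycles_during(latency_us, clock_mhz):
    """CPU cycles that elapse during a wait.

    A clock of N MHz ticks N times per microsecond, so the product of
    microseconds and MHz is a cycle count.
    """
    return latency_us * clock_mhz


# Assumed values: a 5 ms (5000 microsecond) seek on a 2 GHz (2000 MHz) CPU.
seek_cycles = cycles_during(5_000, 2_000)
print(seek_cycles)  # prints 10000000
```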

From the programmer's point of view (also described as "in user space"), this is called a blocking system call. If you call write(2) (which is a thin libc wrapper around the system call of the same name), your process does not exactly stop at that boundary; it continues, in the kernel, executing the system call code. Most of the time it goes all the way down to a specific disk controller driver (file name → file system / VFS → block device → device driver), where a command to fetch a block from the disk is submitted to the appropriate hardware, which is a very fast operation most of the time.

Then the process is put to sleep (in kernel space, blocking is called sleeping; nothing is ever "blocked" from the kernel's point of view). It is woken up once the hardware has finally fetched the requested data, at which point the process is marked as runnable and scheduled. Eventually, the scheduler runs the process.

Finally, in user space, the blocking system call returns with the proper status and data, and the program flow continues.

Most I/O system calls can be invoked in non-blocking mode (see O_NONBLOCK in open(2) and fcntl(2)). In that case, the system calls return immediately and only report that the operation was submitted. The programmer later has to check explicitly whether the operation completed, successfully or not, and fetch its result (for example, using select(2)). This is called asynchronous or event-based programming.
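A minimal sketch of that pattern, using Python's thin wrappers around the same system calls. It uses a pipe rather than a disk file, as an assumption made for the demo: readiness is easy to control on a pipe. The helper names `read_when_ready` and `demo` are hypothetical.

```python
import os
import select


def read_when_ready(fd, timeout=1.0, size=4096):
    """Wait with select() until fd is readable, then read from it.

    Returns None if nothing became readable within the timeout.
    """
    readable, _, _ = select.select([fd], [], [], timeout)
    if not readable:
        return None
    return os.read(fd, size)


def demo():
    r, w = os.pipe()
    os.set_blocking(r, False)  # sets O_NONBLOCK on the read end

    # Nothing has been written yet: a non-blocking read returns at once
    # with EAGAIN, which Python raises as BlockingIOError.
    try:
        os.read(r, 4096)
        first = "data"
    except BlockingIOError:
        first = "would block"

    os.write(w, b"hello")
    second = read_when_ready(r)  # select() reports the fd as readable

    os.close(r)
    os.close(w)
    return first, second


print(demo())
```

The first read fails immediately instead of putting the process to sleep; the second succeeds because select() reported the descriptor as ready beforehand.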

Most answers that mention state D (called TASK_UNINTERRUPTIBLE in the Linux state names) are incorrect. State D is a special sleep mode that is only entered in a kernel-space code path, when that code path cannot be interrupted (because handling the interruption would be too complicated to program), on the expectation that it will block only for a very short time. I believe most D states are actually invisible; they are very short-lived and cannot be observed with sampling tools like "top".

You may encounter unkillable processes in state D in a few situations. NFS is famous for this, and I have run into it many times. I think there is a semantic clash between some VFS code paths, which assume they always reach local disks with fast error detection (on SATA, an error timeout is around a few hundred ms), and NFS, which actually fetches data from the network, which is more resilient and recovers slowly (a TCP timeout of 300 seconds is common). Read this article for the solution introduced in Linux 2.6.25 with the TASK_KILLABLE state. Before that there was a hack where you could send signals to NFS client processes by sending SIGKILL to the rpciod kernel thread, but forget about that ugly trick ...

+121
Jul 13 2018-11-21T00:

The process performing I/O is put into state D (uninterruptible sleep), which frees the CPU until a hardware interrupt tells the CPU to return to executing the program. See man ps for the other process states.

Depending on your kernel, there is a process scheduler that keeps track of a run queue of processes ready to execute. Together with a scheduling algorithm, it tells the kernel which process to assign to which CPU. Both kernel processes and user processes have to be considered. Each process is allocated a time slice, which is a chunk of CPU time it is allowed to use. Once the process has used up its whole time slice, it is marked as expired and given a lower priority in the scheduling algorithm.

In the 2.6 kernel (before CFS replaced it in 2.6.23), there is a scheduler with O(1) time complexity, so no matter how many processes you have running, it assigns CPUs in constant time. It is more complicated than that, though, since 2.6 introduced preemption, and CPU load balancing is not a simple algorithm. In any case, it is efficient, and the CPUs will not sit idle while you wait for I/O.

+6
Sep 25 '09 at 6:34

As others have explained, processes in state "D" (uninterruptible sleep) are responsible for the ps process hanging. This has happened to me many times with RedHat 6.x and NFS home directories.

To list processes in state D, you can use the following commands:

 cd /proc
 for i in [0-9]*;do echo -n "$i :";cat $i/status |grep ^State;done|grep D
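The same check can be sketched in Python by reading the `State:` line of `/proc/<pid>/status`, which is the field the loop above greps for. The function names `process_state` and `pids_in_state` are hypothetical, and the sketch assumes a Linux system with /proc mounted.

```python
import os


def process_state(pid):
    """Return the one-letter state code (R, S, D, Z, ...) for pid.

    Returns None if the process has exited or its status is unreadable.
    """
    try:
        with open(f"/proc/{pid}/status") as f:
            for line in f:
                if line.startswith("State:"):
                    # The line looks like "State:  S (sleeping)".
                    return line.split()[1]
    except OSError:
        return None
    return None


def pids_in_state(code="D"):
    """List the PIDs whose current state code equals `code`."""
    return [
        int(name)
        for name in os.listdir("/proc")
        if name.isdigit() and process_state(name) == code
    ]


print(pids_in_state("D"))
```

Matching on the exact state letter avoids the false positives that a plain `grep D` can produce on other text in the line.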

To find the current directory of the process and, possibly, the mounted NFS share that is causing the problem, you can use a command like the one below (replace 31134 with the PID of the sleeping process):

 # ls -l /proc/31134/cwd
 lrwxrwxrwx 1 pippo users 0 Aug 2 16:25 /proc/31134/cwd -> /auto/pippo

I found that running the umount command with the -f (force) switch on the affected NFS mount could wake the sleeping process:

 umount -f /auto/pippo 

The file system was not unmounted, because it was busy, but the process associated with it woke up, and I was able to solve the problem without rebooting.

+2
Aug 2 '16 at 15:02

Assuming your process is single-threaded and you are using blocking I/O, your process blocks while waiting for the I/O to complete. The kernel picks another process to run in the meantime, based on niceness, priority, last run time, etc. If there are no other runnable processes, the kernel does not run any; instead, it tells the hardware that the machine is idle (which reduces power consumption).

Processes waiting for I/O to complete are usually shown in state D by tools such as ps and top.

+1
Sep 25 '09 at 6:24

Yes, the task is blocked in the read() system call. Another task that is ready runs, or, if no other tasks are ready, the idle task (for that CPU) runs.

A normal, blocking disk read causes the task to enter state "D" (as others have noted). Such tasks contribute to the load average, even though they are not consuming CPU.

Some other types of I/O, especially ttys and network, do not behave quite the same way: the process ends up in state "S", can be interrupted, and does not count towards the load average.
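The "S" case can be observed with a small experiment, sketched here under a few assumptions: a Linux system with /proc, and a child process that blocks reading from a pipe (used as a stand-in for the tty and network cases mentioned above). The helper names are hypothetical. The parent polls until it sees state S, so the sketch shows that the blocked reader sleeps interruptibly, not how long it takes to get there.

```python
import subprocess
import sys
import time


def state_of(pid):
    """Return the one-letter state code from /proc/<pid>/status."""
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("State:"):
                return line.split()[1]
    return None


def state_while_blocked_on_pipe(timeout=5.0):
    """Start a child that blocks in read() on a pipe and report its state."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import sys; sys.stdin.read()"],
        stdin=subprocess.PIPE,
    )
    try:
        deadline = time.monotonic() + timeout
        state = state_of(child.pid)
        # Poll until the child is seen sleeping, or give up at the deadline.
        while state != "S" and time.monotonic() < deadline:
            time.sleep(0.05)
            state = state_of(child.pid)
        return state
    finally:
        child.stdin.close()  # EOF on the pipe lets the child's read() return
        child.wait()


print(state_while_blocked_on_pipe())
```

Closing the pipe wakes the child normally; a signal would also have interrupted the sleep, which is the difference from state D.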

+1
Sep 25 '09 at 21:57

Yes, tasks waiting for I/O are blocked, and other tasks run in the meantime. The next task is selected by the Linux scheduler.

0
Sep 25 '09 at 6:26

As a rule, the process is blocked. If the read operation is on a file descriptor that is marked as non-blocking, or if the process uses asynchronous I/O, it will not be blocked. In addition, if the process has other threads that are not blocked, they can continue to run.

The decision about which process runs next is up to the scheduler in the kernel.

0
Sep 25 '09 at 6:26


