How does writing with writev (vectored I/O) work?

The writev function takes an array of struct iovec as an input argument:

ssize_t writev(int fd, const struct iovec *iov, int iovcnt);

The input is a list of memory buffers that need to be written to, say, a file. I want to know:

Is writev internally something like:

for (each element in iov) write(element)

so that each element of iov is written to the file in a separate I/O call? Or does writev write everything to the file in one I/O call?

+7
3 answers

As far as the standards are concerned, the for loop you mentioned is not a valid writev implementation, for several reasons:

  • The loop could fail to finish writing one iov before moving on to the next, in the event of a short write. This could be worked around by making the loop more elaborate.
  • The loop could behave incorrectly with respect to atomicity on pipes: if the total write length is less than PIPE_BUF, the write to the pipe is required to be atomic, but the loop would break that requirement. This cannot be worked around except by copying all iov entries into a single buffer before writing, whenever the total length is at most PIPE_BUF.
  • There may be cases where the loop blocks, whereas a single writev call would be required to perform a partial write without blocking. As far as I know, this problem is impossible to work around in the general case.
  • Possibly other reasons that I have not thought of.
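To make the first two points concrete, here is a minimal sketch of the "more elaborate" loop: it finishes each element despite short writes. The name writev_loop is hypothetical, and this is an illustration of the approach being criticized, not how a conforming writev is implemented. Because it issues one write per element, it still cannot provide the pipe atomicity described above.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper (illustration only): emulate writev with one
 * write() per iovec element, retrying after short writes so that each
 * element is finished before the next one starts. */
ssize_t writev_loop(int fd, const struct iovec *iov, int iovcnt)
{
    ssize_t total = 0;

    for (int i = 0; i < iovcnt; i++) {
        const char *p = iov[i].iov_base;
        size_t left = iov[i].iov_len;

        while (left > 0) {
            ssize_t n = write(fd, p, left);
            if (n < 0) {
                if (errno == EINTR)
                    continue;               /* interrupted: retry */
                /* Report progress made so far, or the error if none. */
                return total > 0 ? total : -1;
            }
            if (n == 0)
                return total;               /* no progress: give up */
            p += n;
            left -= (size_t)n;
            total += n;
        }
    }
    return total;
}
```

Another process writing to the same pipe can interleave its data between two of these write calls, which is exactly the atomicity problem from the second point.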

I'm not sure about point 3, but the problem definitely exists in the opposite direction, when reading. Calling read in a loop may block if a terminal has some data available (shorter than the total iov length) followed by an EOF indicator; a readv call should return immediately with a partial read in this case. However, due to a bug in Linux, readv on terminals is actually implemented as a read loop in kernel space, and it does exhibit this blocking bug. I had to work around this bug when implementing musl's stdio:

http://git.etalabs.net/cgi-bin/gitweb.cgi?p=musl;a=commit;h=2cff36a84f268c09f4c9dc5a1340652c8e298dc0

To answer the last part of your question:

Or does writev write everything to the file in a single I/O call?

In all cases, a conforming writev implementation will be a single system call. As for how it is implemented on Linux: for regular files and for most devices, the underlying file driver has methods that implement iov-style I/O directly, without any internal loop. But the Linux terminal driver is very outdated and lacks the modern I/O methods, so the kernel falls back to a write/read loop for writev/readv when operating on a terminal.
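For reference, a minimal sketch of what a caller does: two separate buffers go out through one writev call. The helper name write_two is hypothetical and exists only for illustration.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: write two strings with a single writev call. */
ssize_t write_two(int fd, const char *head, const char *body)
{
    struct iovec iov[2];

    /* The casts drop const only because iov_base is a plain void *;
     * writev does not modify the buffers. */
    iov[0].iov_base = (void *)head;
    iov[0].iov_len  = strlen(head);
    iov[1].iov_base = (void *)body;
    iov[1].iov_len  = strlen(body);

    /* Returns the total number of bytes written, or -1 on error. */
    return writev(fd, iov, 2);
}
```

When fd is a pipe and the combined length is at most PIPE_BUF, both strings should arrive as one uninterrupted unit, in line with the atomicity requirement mentioned earlier.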

+6

A direct way to find out how the code works is to read the source code.

See http://www.oschina.net/code/explore/glibc-2.9/sysdeps/posix/writev.c

That file (the generic sysdeps/posix version) simply allocates a buffer with alloca() or malloc(), copies all the vectors into it, and calls write() once.

That's how it works. Nothing mysterious.
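The copy-then-write approach described above can be sketched roughly as follows. This is a simplified illustration under assumptions, not the glibc source: the name writev_coalesce is hypothetical, it always uses malloc(), and it omits the overflow check on the total length that a real implementation would need.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: gather every iovec element into one temporary
 * buffer, then hand that buffer to a single write() call. */
ssize_t writev_coalesce(int fd, const struct iovec *iov, int iovcnt)
{
    size_t total = 0;

    /* Simplification: no check that the sum fits in ssize_t. */
    for (int i = 0; i < iovcnt; i++)
        total += iov[i].iov_len;
    if (total == 0)
        return 0;

    char *buf = malloc(total);
    if (buf == NULL)
        return -1;

    char *p = buf;
    for (int i = 0; i < iovcnt; i++) {
        if (iov[i].iov_len > 0) {
            memcpy(p, iov[i].iov_base, iov[i].iov_len);
            p += iov[i].iov_len;
        }
    }

    ssize_t n = write(fd, buf, total);   /* one call for all the data */

    int saved = errno;                   /* keep write()'s errno across free() */
    free(buf);
    errno = saved;
    return n;
}
```

The price of this approach is an extra copy of all the data, which is what a native iovec-aware implementation avoids.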

+5

Or does writev write everything to the file in a single I/O call?

I'm afraid not necessarily everything, although sys_writev tries hard to write everything in one call. It depends on the VFS implementation: if the file system does not provide a writev implementation, the kernel calls its write() in a loop. It is best to check the return value of writev/readv to find out how many bytes were actually written, just as you would with write().
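Checking the return value might look like the sketch below, which retries until every byte has gone out. The name writev_all is hypothetical, and note the assumption that the caller does not mind its iov array being modified as the helper advances through it.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: call writev repeatedly until all data is written.
 * Returns 0 on success, -1 on error. Modifies the caller's iov array. */
int writev_all(int fd, struct iovec *iov, int iovcnt)
{
    while (iovcnt > 0) {
        ssize_t n = writev(fd, iov, iovcnt);
        if (n < 0) {
            if (errno == EINTR)
                continue;                /* interrupted: retry */
            return -1;
        }

        size_t done = (size_t)n;

        /* Skip the elements that were written completely. */
        while (iovcnt > 0 && done >= iov->iov_len) {
            done -= iov->iov_len;
            iov++;
            iovcnt--;
        }

        /* Step into an element that was only partially written. */
        if (iovcnt > 0) {
            iov->iov_base = (char *)iov->iov_base + done;
            iov->iov_len -= done;
        }
    }
    return 0;
}
```

Retrying like this gives up any guarantee that the data lands as a single unit, so it suits regular files and streams better than cases that rely on atomic pipe writes.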

You can find the writev code in the kernel in fs/read_write.c, in the function do_readv_writev.

+3