Why does reading from a memory-mapped zero-byte file result in SIGBUS?

Question

Here is an example of the code I wrote.

    #include <stdio.h>
    #include <stdlib.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/mman.h>

    int main()
    {
        int fd;
        long pagesize;
        char *data;

        if ((fd = open("foo.txt", O_RDONLY)) == -1) {
            perror("open");
            return 1;
        }

        pagesize = sysconf(_SC_PAGESIZE);
        printf("pagesize: %ld\n", pagesize);

        data = mmap(NULL, pagesize, PROT_READ, MAP_SHARED, fd, 0);
        printf("data: %p\n", data);
        if (data == (void *) -1) {
            perror("mmap");
            return 1;
        }

        printf("%d\n", data[0]);
        printf("%d\n", data[1]);
        printf("%d\n", data[2]);
        printf("%d\n", data[4096]);
        printf("%d\n", data[4097]);
        printf("%d\n", data[4098]);

        return 0;
    }

If I give this program a zero-byte foo.txt, it terminates with SIGBUS.

    $ > foo.txt && gcc foo.c && ./a.out
    pagesize: 4096
    data: 0x7f8d882ab000
    Bus error

If I give this program a one-byte foo.txt, there is no such problem.

    $ printf A > foo.txt && gcc foo.c && ./a.out
    pagesize: 4096
    data: 0x7f5f3b679000
    65
    0
    0
    48
    56
    10

mmap(2) mentions the following.

Use of a mapped region can result in these signals:
SIGSEGV Attempted write into a region mapped as read-only.
SIGBUS Attempted access to a portion of the buffer that does not correspond to the file (for example, beyond the end of the file, including the case where another process has truncated the file).

So, if I understand this correctly, even the second test case (the 1-byte file) should have led to SIGBUS, because data[1] and data[2] access a part of the buffer (data) that does not correspond to the file.

Can you help me understand why only a zero-byte file causes this program to fail with SIGBUS?

+8

c mmap sigbus

Lone learningner Jan 01 '16 at 13:59

2 answers

A 1-byte file does not crash because mmap maps memory in multiples of the page size and zero-fills the remainder. From the man page:

A file is mapped in multiples of the page size. For a file that is not a multiple of the page size, the remaining memory is zeroed when mapped, and writes to that region are not written out to the file. The effect of changing the size of the underlying file of a mapping on the pages that correspond to added or removed regions of the file is unspecified.

+3

ma_il Jan 01 '16 at 14:08

Andrew Henle · Accepted Answer · 2017-01-01T14:36:34+0000

You get SIGBUS when accessing memory past the end of the last whole mapped page of the file, because the POSIX standard states:

The mmap() function can be used to map a region of memory that is larger than the current size of the object. Memory access within the mapping but beyond the current end of the underlying objects may result in SIGBUS signals being sent to the process.

With a zero-byte file, the entire page you mapped is "beyond the current end of the underlying object". So you get SIGBUS.

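You do not get SIGBUS when you go beyond the 4kB page that you mapped, because that memory is not within your mapping. (Such an access is undefined behavior: it may raise SIGSEGV, or it may happen to land in some other mapped memory, as it did in the 1-byte run.) You do not get SIGBUS when accessing your mapping when the file is larger than zero bytes, because the entire page gets mapped.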

But you would get SIGBUS if you mapped additional pages past the end of the file, for example by mapping two 4kB pages for a 1-byte file. If you access the second 4kB page, you get SIGBUS.
