Stack growth in Linux - one page fault or many?

On Linux, when a process requests some (virtual) memory from the system, it is simply registered in a vma (the process's virtual memory area descriptor), but a physical page for each virtual page is not allocated at the time of the call. Later, when the process first touches such a page, a page fault occurs, and the #PF handler allocates a physical page and updates the process's page tables.
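
A quick way to see this from user space (a minimal sketch, assuming 4 KiB pages; the counts are approximate because printf itself can fault in a few pages):

    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/resource.h>

    static long minflt(void) {
        struct rusage ru;
        getrusage(RUSAGE_SELF, &ru);
        return ru.ru_minflt;
    }

    int main(void) {
        long base = minflt();

        /* Reserve 100 pages of anonymous memory: only a vma is created. */
        char *p = mmap(NULL, 100 * 4096, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        printf("after mmap: +%ld minor faults\n", minflt() - base);

        /* Each first touch of a page triggers a #PF that allocates it. */
        for (int i = 0; i < 10; i++)
            p[i * 4096] = 1;
        printf("after touching 10 pages: +%ld minor faults\n", minflt() - base);

        munmap(p, 100 * 4096);
        return 0;
    }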

There are two cases: a read fault can be resolved by mapping the virtual page to the zero page (a special global pre-zeroed page) that is write-protected, while a write fault (either to a page currently backed by the zero page or to a page that is required but not yet physically backed) leads to the actual allocation of a private physical page.
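
A sketch that makes the read/write distinction visible (assuming both faults are accounted as minor faults, which may vary slightly by kernel version): reading a fresh anonymous page costs one fault to install the zero page, and a subsequent write costs a second fault to break copy-on-write into a private page.

    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/resource.h>

    static long minflt(void) {
        struct rusage ru;
        getrusage(RUSAGE_SELF, &ru);
        return ru.ru_minflt;
    }

    int main(void) {
        volatile char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        long before = minflt();

        char x = p[0];          /* read fault: page backed by the shared zero page */
        long after_read = minflt();

        p[0] = x + 1;           /* write fault: copy-on-write into a private page */
        long after_write = minflt();

        printf("read:  +%ld fault(s)\n", after_read - before);
        printf("write: +%ld fault(s)\n", after_write - after_read);
        return 0;
    }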

For mmap (and brk/sbrk, which internally is also an mmap), this works the same way for every page; each mmapped area is registered as a whole in a vma (it has both a start and an end address). But the stack is handled differently, because it effectively has only a fixed top address and, on typical platforms, grows downward toward lower addresses.
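
The downward growth can be observed from the [stack] entry in /proc/self/maps: its end address stays put while its start address moves down as the stack is actually used. A minimal sketch (assuming a typical x86-64 Linux system where the stack grows down):

    #include <alloca.h>
    #include <stdio.h>
    #include <string.h>

    /* Print the [stack] line of /proc/self/maps. */
    static void show_stack_vma(const char *when) {
        char line[256];
        FILE *f = fopen("/proc/self/maps", "r");
        if (!f)
            return;
        while (fgets(line, sizeof line, f))
            if (strstr(line, "[stack]"))
                printf("%s: %s", when, line);
        fclose(f);
    }

    int main(void) {
        show_stack_vma("before");

        /* Move the stack pointer ~1 MiB down and touch every new page. */
        volatile char *p = alloca(1024 * 1024);
        for (int i = 0; i < 1024 * 1024; i += 4096)
            p[i] = 1;

        show_stack_vma("after");    /* start address should now be lower */
        return 0;
    }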

The question arises:

When I touch new, not-yet-allocated memory near the stack, that triggers a #PF and the stack grows. How is this handled if I access not the page adjacent to the stack, but a page that is 10 or 100 pages below it?

For example:

    int main() {
        int *a = alloca(100);       /* some useful data */
        int *b = alloca(50*4096);   /* skip 49 pages */
        int *c = alloca(100);
        a[0] = 1;
        /* no accesses to b - this is an untouched hole of 49 pages */
        c[0] = 1;
    }

Will this program receive 2 or 50 private physical pages allocated for the stack?

I think it could be beneficial to ask the kernel to allocate several tens of physical pages in one go, rather than doing tens of page-by-page allocations: 1 interrupt + 1 context switch + one simple, cache-friendly loop over N page-allocation requests, versus N interrupts + N context switches + N single-page allocations, where the mm code may be evicted from the Icache in between.

+8
memory-management linux linux-kernel page-fault
2 answers

Automatic stack expansion can be thought of as an implicit mremap that resizes the virtual address area considered to be the "stack". Once that is done, page faults in the stack area are handled exactly like faults in a vanilla mmapped area, i.e. one page at a time.
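
To illustrate the analogy (this is only a user-space sketch, not what the kernel literally does for the stack): growing an anonymous mapping with mremap only resizes the vma, and the new pages are still populated one fault at a time when first touched.

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/resource.h>

    static long minflt(void) {
        struct rusage ru;
        getrusage(RUSAGE_SELF, &ru);
        return ru.ru_minflt;
    }

    int main(void) {
        char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        /* Grow the vma from 1 page to 64 pages; no physical pages are added here. */
        p = mremap(p, 4096, 64 * 4096, MREMAP_MAYMOVE);

        long base = minflt();
        for (int i = 0; i < 64; i++)    /* each first touch faults in one page */
            p[i * 4096] = 1;
        printf("touching 64 pages: +%ld minor faults\n", minflt() - base);
        return 0;
    }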

So you should get ~2 pages, not ~51. @perreal's empirical answer below confirms this.

As for the last part of the question, the cost of repeated page faults is one of the factors that led to the development of "huge pages". I don't think Linux has other ways to "batch" page-fault handling. Maybe madvise can do something, but I suspect it mostly optimizes the really expensive part of a page fault, namely locating and reading in the backing pages from storage; page faults that resolve to the zero page are comparatively cheap.
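
For completeness: for mmapped (not stack) memory there is at least one real batching knob, MAP_POPULATE, which asks the kernel to pre-fault the whole range inside the mmap call itself, so later touches don't fault. A sketch (how the populate work shows up in the fault counters may differ across kernel versions):

    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/resource.h>

    static long minflt(void) {
        struct rusage ru;
        getrusage(RUSAGE_SELF, &ru);
        return ru.ru_minflt;
    }

    int main(void) {
        size_t len = 100 * 4096;
        long base = minflt();

        /* MAP_POPULATE pre-faults the whole range during the single mmap call. */
        char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE, -1, 0);
        long after_map = minflt();

        for (size_t i = 0; i < len; i += 4096)
            p[i] = 1;           /* should add few or no further faults */

        printf("mmap: +%ld, touching afterwards: +%ld minor faults\n",
               after_map - base, minflt() - after_map);
        return 0;
    }

Transparent huge pages (e.g. madvise with MADV_HUGEPAGE, where enabled) don't batch faults either, but they make a single fault cover 2 MiB instead of 4 KiB, which amortizes the per-fault cost in a similar way.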

+2

With this code:

    #include <alloca.h>

    int main() {
        int *a = alloca(100);       /* some useful data */
        int *b = alloca(50*4096);   /* skip 49 pages */
        int *c = alloca(100);
        int i;
    #if TOUCH > 0
        a[0] = 1;                   // [1]
    #endif
    #if TOUCH > 1
        c[0] = 1;                   // [2]
    #endif
    #if TOUCH > 2
        for (i = 0; i < 25; i++)    // [3]
            b[i*1024] = 1;
    #endif
    #if TOUCH > 3
        for (i = 25; i < 50; i++)   // [4]
            b[i*1024] = 1;
    #endif
        return 0;
    }

And this script:

    for i in 1 2 3 4; do
        gcc d.c -DTOUCH=$i
        echo "Upto [$i]" $(perf stat ./a.out 2>&1 | grep page-faults)
    done

Output:

    Upto [1] 105 page-faults # 0.410 M/sec
    Upto [2] 106 page-faults # 0.246 M/sec
    Upto [3] 130 page-faults # 0.279 M/sec
    Upto [4] 154 page-faults # 0.290 M/sec

So touching c[0] in addition to a[0] adds only one fault over the baseline, while each half of the hole in b adds roughly 24-25 faults - i.e. roughly one fault per page actually touched.
+4
