Why is the process address space divided into four segments (text, data, stack and heap)?

Why should the process address space be divided into four segments (text, data, stack, and heap)? What is the advantage? Is it possible to have just one large segment?

+4
3 answers

There are several reasons for dividing programs into parts in memory.

One is that instruction memory and data memory can be architecturally distinct and non-contiguous: reads and writes to each use different instructions and different circuitry inside and outside the CPU, forming two separate address spaces (for example, reading code from address 0 and reading data from address 0 will usually return two different values from two different memories).
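To make the "two address spaces" point concrete, here is a minimal toy model, not any real CPU. The class `HarvardMachine` and its method names are made up for illustration; the only thing it shows is that the same numeric address selects a different byte depending on whether it is used for an instruction fetch or a data load.

```python
class HarvardMachine:
    """Toy model with physically separate code and data memories (hypothetical)."""

    def __init__(self, code: bytes, data: bytes) -> None:
        self.code_memory = bytes(code)      # immutable: programs cannot store into it
        self.data_memory = bytearray(data)  # mutable: ordinary loads and stores

    def fetch_instruction(self, address: int) -> int:
        # Instruction fetches only ever look at code memory.
        return self.code_memory[address]

    def load_data(self, address: int) -> int:
        # Data loads only ever look at data memory.
        return self.data_memory[address]

    def store_data(self, address: int, value: int) -> None:
        # Stores can only reach data memory, never code memory.
        self.data_memory[address] = value & 0xFF


machine = HarvardMachine(code=b"\x90\xc3", data=b"\x2a\x00")
code_byte = machine.fetch_instruction(0)  # byte at code address 0
data_byte = machine.load_data(0)          # byte at data address 0
```

Here `code_byte` and `data_byte` come from the same address, 0, yet they differ because they live in different memories.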

Another is reliability/security. You rarely want program code and constant data to change. In most cases, when that happens, it is because something is wrong (either in the program itself or in its input, which may be maliciously crafted). You want to prevent this and to know when it is attempted. Likewise, you do not want data regions to be executable. If they are, and the program has security bugs, it can easily be made to do something harmful: malicious code is placed into the program's data areas as ordinary data, and the bug (for example, a buffer overflow) is then triggered to run it.
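As a rough analogy for "prevent it and find out when it is attempted", the sketch below maps a temporary file read-only with Python's standard `mmap` module and then tries to overwrite one byte. This is only an illustration: for a real text segment the write would be stopped by the MMU and reported to the process as a fault, whereas here Python rejects the write with a `TypeError`. The function name is made up for the example.

```python
import mmap
import tempfile


def write_to_readonly_mapping_is_rejected() -> bool:
    """Return True if storing into a read-only mapping is refused."""
    with tempfile.TemporaryFile() as f:
        # Fill one page with placeholder bytes standing in for "code".
        f.write(b"\xc3" * mmap.PAGESIZE)
        f.flush()
        # Map the file read-only, the way a loader maps program text.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapping:
            try:
                mapping[0] = 0x90  # attempt to patch the "code"
            except TypeError:
                return True        # the write was blocked and reported
            return False


blocked = write_to_readonly_mapping_is_rejected()
```

The point is the design choice, not the mechanism: a region that is marked read-only turns an accidental or malicious write into a visible error instead of silent corruption.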

Another one is storage. In many programs, several data areas are either not initialized at all or are initialized with one common predefined value (often 0). Memory must be reserved for these areas when the program is loaded and about to start, but they do not need to be stored on disk because they contain no meaningful data.
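The storage argument can be shown with a little arithmetic. The sketch below assumes a simplified, hypothetical section table in which each section records how many bytes it occupies in memory and how many of those are actually stored in the file; the section names and sizes are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Section:
    name: str
    mem_size: int   # bytes reserved in memory when the program is loaded
    file_size: int  # bytes actually stored in the executable file


def image_sizes(sections: list[Section]) -> tuple[int, int]:
    """Return (bytes on disk, bytes in memory) for a list of sections."""
    on_disk = sum(s.file_size for s in sections)
    in_memory = sum(s.mem_size for s in sections)
    return on_disk, in_memory


# Hypothetical program: code, initialized data, and a 64 KB zero-filled area.
sections = [
    Section(".text", mem_size=4096, file_size=4096),
    Section(".data", mem_size=512, file_size=512),
    Section(".bss", mem_size=65536, file_size=0),  # zero-filled, nothing on disk
]

on_disk, in_memory = image_sizes(sections)
```

With these made-up numbers the file carries 4608 bytes while the loaded program occupies 70144 bytes; the difference is the zero-filled area that the loader creates at start-up.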

On some systems, you can have everything in one place (section/segment/etc.). One noteworthy example is MS-DOS, where .COM-style programs have no structure other than that they must be smaller than about 64 KB, the first executable instruction must be at the very beginning of the file, and the code must assume that this location corresponds to IP = 0x100 (where IP is the instruction pointer register). How code and data are placed and interleaved in a .COM program is entirely up to the programmer.
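A minimal sketch of what loading such a single-segment program amounts to, under simplifying assumptions: the whole file is copied into one 64 KB block starting at offset 0x100, and execution begins there. The function `load_com` is hypothetical, and the exact size limit used here (64 KB minus the 0x100 bytes below the load offset) is an approximation of the "about 64 KB" mentioned above.

```python
SEGMENT_SIZE = 0x10000  # one 64 KB segment holds code, data and stack
LOAD_OFFSET = 0x100     # the file's first byte lands here; IP starts here


def load_com(file_bytes: bytes) -> tuple[bytearray, int]:
    """Copy a .COM-style image into a single segment; return (segment, entry IP)."""
    if len(file_bytes) > SEGMENT_SIZE - LOAD_OFFSET:
        raise ValueError("image does not fit in a single 64 KB segment")
    segment = bytearray(SEGMENT_SIZE)  # everything else starts out as zeros
    segment[LOAD_OFFSET:LOAD_OFFSET + len(file_bytes)] = file_bytes
    return segment, LOAD_OFFSET


# A one-byte placeholder program (0xC3 is the x86 near RET opcode).
segment, entry_ip = load_com(b"\xc3")
```

Nothing in the loader distinguishes code from data: the file is just bytes, and only the convention that execution starts at offset 0x100 gives the first byte a meaning.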

There are other architectural artifacts, such as x86 segments. Again, MS-DOS is a good example of an OS that deals with them. .EXE-style programs in it can have multiple segments that relate directly to the x86 CPU's segments, the real-mode addressing scheme in which memory is viewed through 64 KB-long "windows" known as segments. The position of these windows/segments depends on the values of the CPU's segment registers. By changing the segment register values, you can move the "windows". To access more than 64 KB, you need to use different segment register values, and this often implies having several segments in the .EXE (there can be not just one segment for code and one for data, but several segments for either of them).
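The "window" behaviour follows from how real mode forms an address: the 16-bit segment value is shifted left by 4 bits and the 16-bit offset is added. The helper names below are made up for illustration, and the optional wrap at 1 MB models the original 8086's 20-bit address bus, which is an assumption that does not hold on later CPUs with the A20 line enabled.

```python
def real_mode_address(segment: int, offset: int, wrap_at_1mb: bool = True) -> int:
    """Linear address for a real-mode segment:offset pair."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    linear = (segment << 4) + offset  # segment * 16 + offset
    return linear & 0xFFFFF if wrap_at_1mb else linear


def segment_window(segment: int) -> tuple[int, int]:
    """First and last linear address reachable through one segment value (no wrap)."""
    base = segment << 4
    return base, base + 0xFFFF  # a 64 KB window starting at segment * 16


window = segment_window(0x1000)                   # range covered by segment 0x1000
same_byte_a = real_mode_address(0x1000, 0x0010)   # 1000:0010
same_byte_b = real_mode_address(0x1001, 0x0000)   # 1001:0000
```

Changing the segment value by 1 moves the window by only 16 bytes, so windows overlap heavily and the two pairs above name the same byte. Reaching memory outside the current 64 KB window requires loading a different segment value, which is why larger programs end up with several segments.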

+4

At a minimum, the text and data segments are separated to prevent the execution of malicious code stored inside a variable.

Instructions (compiled code) are stored in the text segment, and the contents of your variables are stored in the data segment; the latter is never executed, only read from and written to.


+2

Isn't this separation just a big, hacky workaround for fixing security in the von Neumann architecture, where data and instructions share the same memory?

0
