Please explain the exec() function and its family

What is the exec() function and its family? Why is this function used and how does it work?

Could someone please explain these functions?

+87
c unix
Nov 17 '10 at 13:39
7 answers

Simply put, on UNIX you have the concept of processes and programs. A process is what the program runs in.

The simple idea of the UNIX "execution model" is that there are two operations you can perform.

The first is fork(), which creates a completely new process containing a duplicate of the current program, including its state. There are a few differences between the two processes that allow them to figure out which is the parent and which is the child.

The second is exec(), which replaces the program in the current process with a new program.

From these two simple operations, you can build the entire UNIX runtime model.




To add some details to the above:

Using fork() and exec() illustrates the spirit of UNIX in the sense that it provides a very simple way to start new processes.

The fork() call makes a near duplicate of the current process, identical in almost every way (not everything is copied over, for example resource limits in some implementations, but the idea is to create as close a copy as possible). One process calls fork(), while two processes return from it. That sounds weird, but it's really quite elegant.

The new process (called the child) receives a different process identifier (PID) and has the PID of the old process (parent) as the parent PID (PPID).

Since the two processes are now running exactly the same code, they need to be able to tell which is which, and the return value of fork() provides this information: the child gets 0 and the parent gets the PID of the child (if fork() fails, no child is created and the parent gets -1, with errno set to indicate the error). That way, the parent knows the PID of the child and can communicate with it, kill it, wait for it, and so on. (The child can always find its parent process by calling getppid().)
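As a minimal sketch of how the two return values of fork() and the PID/PPID relationship fit together, the following C function forks once and lets the child check that getppid() matches the PID its parent recorded before the fork. The name child_sees_parent is a hypothetical helper introduced only for this illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork once and return 1 if the child saw the caller as its parent,
 * 0 if not, -1 on error. child_sees_parent is a hypothetical name. */
int child_sees_parent(void)
{
    pid_t parent = getpid();      /* recorded before the fork */
    pid_t pid = fork();
    if (pid == -1)
        return -1;                /* no child was created */
    if (pid == 0) {
        /* Child: fork() returned 0; getppid() names the parent. */
        _exit(getppid() == parent ? 0 : 1);
    }
    /* Parent: fork() returned the child's PID, which we can wait on. */
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Calling child_sees_parent() from main() should normally yield 1, since the parent is still alive (blocked in waitpid) while the child runs.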

The call to exec() replaces all the current contents of the process with a new program. It loads the program into the current process space and runs it from the entry point.

Thus, fork() and exec() are often used in sequence to start a new program as a child of the current process. Shells typically do this whenever you run a program such as find: the shell forks, then the child loads the find program into memory, setting up all the command-line arguments, standard I/O, and so forth.
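The fork-then-exec sequence a shell performs can be sketched roughly as follows. This is a simplified illustration, not how any particular shell is implemented; run_program is a hypothetical helper name, and the exit code 127 for a failed exec follows the common shell convention.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the program named by argv[0] (looked up via PATH) as a child
 * process and return its exit status, or -1 on failure.
 * run_program is a hypothetical helper, not a library function. */
int run_program(char *const argv[])
{
    pid_t pid = fork();
    if (pid == -1) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: replace this process image with the new program. */
        execvp(argv[0], argv);
        /* execvp returns only if it failed. */
        perror("execvp");
        _exit(127);
    }
    /* Parent: wait for the child to finish, as a shell would. */
    int status;
    if (waitpid(pid, &status, 0) == -1) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, passing the vector {"sh", "-c", "exit 3", NULL} should return 3, and a program name that cannot be found should return 127.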

But they are not required to be used together. It is perfectly acceptable for a program to call fork() without a following exec() if, for example, the program contains both parent and child code (you need to be careful about what you do, since each implementation may have restrictions). This was used quite a lot (and still is) for daemons that simply listen on a TCP port and fork a copy of themselves to process a specific request while the parent goes back to listening.

Similarly, programs that know they are finished and just want to run another program do not need to fork(), exec(), and then wait()/waitpid() for the child. They can simply load the new program directly into their current process space with exec().

Some UNIX implementations have an optimized fork() that uses what they call copy-on-write. This is a trick to delay copying the process space in fork() until the program attempts to change something in that space. This is useful for programs that use only fork() and not exec(), because they do not have to copy an entire process space. On Linux, fork() only makes a copy of the page tables and a new task structure; exec() does the grunt work of "separating" the memory of the two processes.

If exec is called right after fork (and this is what mostly happens), that causes a write to the process space, and the affected part is then copied for the child process before the modification is allowed.

Linux also has vfork(), which is even more optimized and shares just about everything between the two processes. Because of that, there are certain restrictions on what the child can do, and the parent is suspended until the child calls exec() or _exit().

The parent has to be stopped (and the child is not permitted to return from the current function) because the two processes even share the same stack. This is slightly more efficient for the classic use case of fork() followed immediately by exec().
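A minimal sketch of the vfork()-then-exec pattern, assuming a system where vfork() is available (the _DEFAULT_SOURCE define is an assumption for glibc under strict standard modes; spawn_with_vfork is a hypothetical helper name). The child does nothing except call execv() or _exit(), in line with the restrictions described above.

```c
#define _DEFAULT_SOURCE /* assumption: asks glibc to declare vfork() */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Start the program at `path` with arguments `argv` using vfork()
 * and return its exit status, or -1 on error.
 * spawn_with_vfork is a hypothetical helper for illustration. */
int spawn_with_vfork(const char *path, char *const argv[])
{
    pid_t pid = vfork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* Child: shares the parent's memory and stack, so it must
         * only exec or _exit, and must not return from here. */
        execv(path, argv);
        _exit(127);
    }
    /* Parent: resumes once the child has called exec or _exit. */
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In new code, posix_spawn() is often suggested for this fork-and-exec-immediately use case; the sketch above is only meant to show the shape of the vfork() pattern.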

Note that there is a whole family of exec calls (execl, execle, execve, and so on), but exec in this context means any of them.

The following diagram illustrates a typical fork/exec operation in which the bash shell is used to list a directory with the ls command:

    +--------+
    | pid=7  |
    | ppid=4 |
    | bash   |
    +--------+
        |
        | calls fork
        V
    +--------+             +--------+
    | pid=7  |    forks    | pid=22 |
    | ppid=4 | ----------> | ppid=7 |
    | bash   |             | bash   |
    +--------+             +--------+
        |                      |
        | waits for pid 22     | calls exec to run ls
        |                      V
        |                  +--------+
        |                  | pid=22 |
        |                  | ppid=7 |
        |                  | ls     |
        V                  +--------+
    +--------+                 |
    | pid=7  |                 | exits
    | ppid=4 | <---------------+
    | bash   |
    +--------+
        |
        | continues
        V
+228
Nov 17 '10 at 13:51

The functions in the exec() family differ in behavior according to the letters in their names:

  • l: arguments are passed to main() as a list of strings
  • v: arguments are passed to main() as an array of strings
  • p: the PATH environment variable is searched to find the new program
  • e: the environment can be specified by the caller

You can mix them, so you have:

  • int execl(const char *path, const char *arg, ...);
  • int execlp(const char *file, const char *arg, ...);
  • int execle(const char *path, const char *arg, ..., char *const envp[]);
  • int execv(const char *path, char *const argv[]);
  • int execvp(const char *file, char *const argv[]);
  • int execvpe(const char *file, char *const argv[], char *const envp[]); (a GNU extension)

For all of them, the first argument is the name of the file to be executed.
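To make the l/v/p/e letters concrete, here is a small sketch that runs the same shell command through four of the variants. The helper name run_variant, the GREETING variable, and the choice of /bin/sh as the target program are assumptions made for this illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `sh -c script` in a child using the exec variant selected by
 * `which` and return the child's exit status (-1 on error).
 * run_variant is a hypothetical helper for this illustration. */
int run_variant(char which, const char *script)
{
    char *argv[] = {"sh", "-c", (char *)script, NULL};
    char *envp[] = {"GREETING=hello", NULL};

    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        switch (which) {
        case 'l': /* argument list, explicit path */
            execl("/bin/sh", "sh", "-c", script, (char *)NULL);
            break;
        case 'v': /* argument vector, explicit path */
            execv("/bin/sh", argv);
            break;
        case 'p': /* argument vector, program found via PATH */
            execvp("sh", argv);
            break;
        case 'e': /* argument list plus a caller-supplied environment */
            execle("/bin/sh", "sh", "-c", script, (char *)NULL, envp);
            break;
        }
        _exit(127); /* exec failed or `which` was not recognized */
    }
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With the 'e' variant the new program sees only the environment passed in envp, so a script such as `test "$GREETING" = hello` succeeds there regardless of the caller's own environment.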

For more information, read the exec(3) man page:

    man 3 exec # if you are running a UNIX system
+30
Jun 01 '16 at 3:26

The exec family of functions makes your process execute a different program, replacing the program that was running. That is, if you call

    execl("/bin/ls", "ls", (char *)NULL);

then the ls program is executed with the process ID, current working directory, and user/group (access rights) of the process that called execl. Afterwards, the original program is no longer running.
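One way to observe that the process ID survives exec is to have a child become a shell that prints its own PID and compare that with what fork() returned. The sketch below assumes /bin/sh exists and that `$$` expands to the shell's PID; exec_keeps_pid is a hypothetical helper name.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the program started by exec reports the same PID that
 * fork() gave the parent, 0 if not, -1 on error. */
int exec_keeps_pid(void)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: send stdout into the pipe, then become /bin/sh. */
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]);
        /* $$ expands to the PID of the shell, i.e. of this process. */
        execl("/bin/sh", "sh", "-c", "echo $$", (char *)NULL);
        _exit(127);
    }

    /* Parent: read what the shell printed, then reap the child. */
    close(fds[1]);
    char buf[32] = {0};
    ssize_t n = read(fds[0], buf, sizeof buf - 1);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    if (n <= 0)
        return -1;
    return atol(buf) == (long)pid;
}
```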

To start a new process, use the fork system call. To execute a program without replacing the original, you need fork, then exec.

+16
Nov 17 2018-10-17

exec is often used in conjunction with fork, which I saw that you also asked about, so I will discuss it with that in mind.

exec turns the current process into another program. If you have ever watched Doctor Who, this is like when he regenerates: his old body is replaced with a new body.

The way this happens with your program and exec is that the OS kernel checks whether the file you pass to exec as the program argument (the first argument) is executable by the current user (the user ID of the process making the exec call). If it is, the kernel replaces the virtual memory mapping of the current process with a virtual memory map for the new program and copies the argv and envp data that were passed in the exec call into an area of this new virtual memory map. Several other things may also happen here, but the files that were open for the program that called exec will still be open for the new program, and the two share the same process ID. The program that called exec ceases to run (unless exec failed).
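The "unless exec failed" case can be seen directly: when exec cannot start the requested file, it returns -1 and the calling program simply keeps running. A minimal sketch, where exec_or_errno is a hypothetical helper name:

```c
#include <errno.h>
#include <unistd.h>

/* Try to replace the current process with the program at `path`.
 * On success this function never returns. If it does return, exec
 * failed, the caller is still running, and the result is the errno
 * value describing why. exec_or_errno is a hypothetical helper. */
int exec_or_errno(const char *path)
{
    execl(path, path, (char *)NULL);
    return errno; /* reached only when execl failed */
}
```

Calling it with a path that does not exist should return ENOENT, which is why code after an exec call is conventionally treated as the error path.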

The reason this is done is that by splitting the launch of a new program into two steps like this, you can do some things between the two steps. The most common thing to do is to make sure that the new program has certain files open as certain file descriptors (remember that file descriptors are not the same as FILE *, but are int values that the kernel knows about). You can do this:

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    /* O_CREAT | O_TRUNC mirror what the shell's > redirection does */
    int X = open("./output_file.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    pid_t fk = fork();
    if (!fk) { /* in child */
        dup2(X, 1); /* fd 1 is standard output, so this makes standard out refer to the same file as X */
        close(X);
        /* I'm using execl here rather than exec because it's easier to type the arguments. */
        execl("/bin/echo", "/bin/echo", "hello world", (char *)NULL); /* the list must end with a null pointer */
        _exit(127); /* should not get here */
    } else if (fk == -1) {
        /* An error happened and you should do something about it. */
        perror("fork"); /* print an error message */
    }
    close(X); /* The parent doesn't need this anymore */

This accomplishes the same thing as running:

    /bin/echo "hello world" > ./output_file.txt

from the shell.

+7
Nov 17 '10 at 21:11

What is the exec function and its family?

The exec function family is all the functions used to execute a file, such as execl, execlp, execle, execv and execvp. All of them are front ends for execve and provide different ways of calling it.

Why is this function used?

The exec functions are used when you want to execute (launch) a file (program).

And how does it work?

They work by overwriting the current process image with the one you launched. They replace (by ending) the currently running process (the one that called the exec function) with the new process that is launched.

For more information, see this link.

+6
Nov 17 '10 at 13:47

The exec(3,3p) functions replace the current process with another. That is, the current process stops and another runs in its place, taking over some of the resources the original program had.

+4
Nov 17 2018-10-17

When a process calls fork(), it creates a duplicate of itself, and that duplicate becomes a child of the process. On Linux, fork() is implemented using the clone() system call, which returns from the kernel twice.

  • A nonzero value (the process ID of the child) is returned to the parent.
  • Zero is returned to the child.
  • If the child could not be created because of a problem such as a lack of memory, fork() returns -1.

Let's look at an example:

    pid_t pid = fork();
    // Both child and parent will now start execution from here.
    if (pid < 0) {
        // child was not created successfully
        return 1;
    } else if (pid == 0) {
        // This is the child process
        // Child process code goes here
    } else {
        // Parent process code goes here
    }
    printf("This is code common to parent and child");

In this example, we assume that exec() is not used inside the child process.

But the parent and child differ in some attributes of the PCB (process control block). These are:

  1. PID - the child and the parent have different process IDs.
  2. Pending signals - the child does not inherit the parent's pending signals. The set of pending signals is empty for the child when it is created.
  3. Memory locks - the child does not inherit the parent's memory locks. Memory locks are locks that pin a memory area so that it cannot be swapped out to disk.
  4. Record locks - the child does not inherit the parent's record locks. Record locks are associated with a block of a file or an entire file.
  5. Process resource utilization and CPU time consumed are set to zero for the child.
  6. The child does not inherit timers from the parent either.

But what about the child's memory? Is a new address space created for the child?

The answer is no. After fork(), the parent and child share the pages of the parent's address space. On Linux, an address space is divided into pages. Only when the child writes to one of the parent's memory pages is a duplicate of that page created for the child (the same happens if the parent writes first). This is called copy-on-write (a page is copied only when one side writes to it).

Let's understand copy-on-write with an example.

    int x = 2;
    pid = fork();
    if (pid == 0) {
        x = 10; // child is changing the value of x, i.e. writing to a page
        // One of the parent's stack pages contains this local variable. That page is
        // duplicated for the child, and the value 10 is stored in x in the duplicate.
    } else {
        x = 4;
    }
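The visible effect of this, independent of whether the kernel copies pages eagerly or lazily, is that a write in the child never changes the parent's variable. A minimal sketch that checks this, with parent_value_after_child_write as a hypothetical helper name and the child's value reported through its exit status:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork, let the child overwrite x, and return the parent's x.
 * The child's value is stored in *child_value. Returns -1 on error. */
int parent_value_after_child_write(int *child_value)
{
    int x = 2;
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        x = 10;    /* the write lands in the child's private copy */
        _exit(x);  /* report the child's value via the exit status */
    }
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    *child_value = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    return x;      /* the parent's x is untouched by the child */
}
```

After the call, the function's return value is the parent's 2 while *child_value holds the child's 10.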

But why is copy-on-write needed?

Typical process creation is done with the fork()-exec() combination. Let's first understand what exec() does.

The exec() family of functions replaces the child's address space with a new program. When exec() is called in the child, a separate address space is created for the child that is completely different from the parent's.

If there were no copy-on-write mechanism associated with fork(), duplicate pages would be created for the child and all the data would be copied to them. Allocating new memory and copying data is a very expensive operation (it takes CPU time and other system resources). We also know that in most cases the child will call exec(), which replaces the child's memory with a new program. So that first copy would have been wasted if copy-on-write did not exist.

    pid = fork();
    if (pid == 0) {
        execlp("/bin/ls", "ls", (char *)NULL);
        printf("will this line be printed"); // Think about it
        // A new memory space will be created for the child, and that memory will contain
        // the "/bin/ls" program (text section), its stack, data section and heap section
    } else {
        wait(NULL); // parent is waiting for the child. Once the child terminates,
                    // the parent will get its exit status and can then continue
    }
    return 1; // The parent exits with status 1; the child gets here only if execlp failed.

Why does a parent wait for a child process?

  1. A parent can assign a task to its child and wait for the child to complete it. Then it can carry on with other work.
  2. Once the child terminates, all the resources associated with it are freed except for its process control block. The child is now in the zombie state. Using wait(), the parent can query the status of the child and then ask the kernel to free the PCB. If the parent does not call wait(), the child remains in the zombie state.

Why do we need the exec() system call?

It is not necessary to use exec() with fork(). If the code that the child will execute is part of the same program as the parent, exec() is not needed.

But consider cases where a child has to run many different programs. Take a shell as an example. It supports many commands such as find, mv, cp, date, and so on. Would it be right to include the code for all these commands in one program, or to have the child load these programs into memory when needed?

It all depends on your use case. Suppose you have a web server that, given an input x, returns 2^x to its clients. For each request, the web server creates a new child and asks it to do the computation. Would you write a separate program to calculate this and use exec()? Or would you just write the computation code inside the parent program?
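For the second option, a sketch of fork() without exec() might look like the following: the child runs code that is already part of the parent program and hands the result back over a pipe. The helper name pow2_in_child and the use of a pipe for the reply are assumptions made for this illustration, not a description of how any real web server works.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Compute 2^x in a forked child and return the result, or -1 on
 * error. Assumes x is small enough that 2^x fits in a long. */
long pow2_in_child(unsigned x)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: same program, no exec. Compute and report back. */
        close(fds[0]);
        long result = 1L << x;
        ssize_t w = write(fds[1], &result, sizeof result);
        _exit(w == (ssize_t)sizeof result ? 0 : 1);
    }

    /* Parent: collect the answer, then reap the child. */
    close(fds[1]);
    long result = -1;
    ssize_t r = read(fds[0], &result, sizeof result);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return r == (ssize_t)sizeof result ? result : -1;
}
```

For instance, pow2_in_child(10) should return 1024. Writing a separate program and using exec() would only pay off if the computation needed to be replaced or shared independently of the server.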

Typically, creating a process involves a combination of calls to fork(), exec(), wait(), and exit().

+2
Aug 14 '19 at 18:40