How does fork() work?

I'm really new to forking. What does pid do in this code? Can someone explain what will be printed at LINE X and LINE Y?

    #include <sys/types.h>
    #include <sys/wait.h>   /* declares wait() */
    #include <stdio.h>
    #include <unistd.h>

    #define SIZE 5

    int nums[SIZE] = {0,1,2,3,4};

    int main()
    {
        int i;
        pid_t pid;

        pid = fork();
        if (pid == 0) {
            for (i = 0; i < SIZE; i++) {
                nums[i] *= -i;
                printf("CHILD: %d ",nums[i]); /* LINE X */
            }
        }
        else if (pid > 0) {
            wait(NULL);
            for (i = 0; i < SIZE; i++)
                printf("PARENT: %d ",nums[i]); /* LINE Y */
        }
        return 0;
    }
+6
5 answers

fork() duplicates the process, so after calling fork there are actually 2 instances of your program.

How do you know which process is the original (parent) and which is the new one (child)?

In the parent process, fork() returns the PID of the child process (which will be a positive integer). This is why the if (pid > 0) { /* PARENT */ } code works. In the child process, fork() simply returns 0.

Thus, because of the if (pid == 0) and else if (pid > 0) checks, the parent process and the child process produce different output, which you can see here (as @jxh pointed out in the comments).

+21

The simplest example of fork()

    printf("I'm printed once!\n");
    fork();
    // Now there are two processes running: one is the parent and the other is the child,
    // and each process will print out the next line.
    printf("You see this line twice!\n");

The return value of fork(): -1 = failed; 0 = in the child process; positive = in the parent process (and the return value is the process ID of the child).

    pid_t id = fork();
    if (id == -1) exit(1); // fork failed
    if (id > 0) {
        // I'm the original parent and
        // I just created a child process with id 'id'
        // Use waitpid to wait for the child to finish
    } else { // returned zero
        // I must be the newly made child process
    }

What is the difference between a child process and a parent process?

  • The parent is notified via a signal (SIGCHLD) when the child process ends, but not vice versa.
  • The child does not inherit pending signals or timer alarms. The full list can be found in the fork() documentation.
  • A process can get its own process ID with getpid() and the ID of its parent process with getppid().

Now let's visualize your program code

    pid_t pid;
    pid = fork();

At this point the OS makes two identical copies of the address space, one for the parent and the other for the child.

Both the parent and the child resume execution immediately after the fork() system call. Because both processes have identical but separate address spaces, variables initialized before the fork() call have the same values in both address spaces. Each process has its own address space, so any modifications are independent of the other process. If the parent changes the value of one of its variables, the modification affects only the variable in the parent's address space. Other address spaces created by fork() system calls are not affected, even though they contain variables with the same names.

In the illustrated example, the parent sees a pid that is not zero and calls the ParentProcess() function. The child, on the other hand, sees a pid of zero and calls ChildProcess().

In your code, the parent process calls wait(), so it stops at that point until the child exits. Thus, the output of the child appears first.

    if (pid == 0) {
        // The child runs this part because fork returns 0 to the child
        for (i = 0; i < SIZE; i++) {
            nums[i] *= -i;
            printf("CHILD: %d ",nums[i]); /* LINE X */
        }
    }

OUTPUT from the child process

What is printed at LINE X:

  CHILD: 0 CHILD: -1 CHILD: -4 CHILD: -9 CHILD: -16 

Then, after the child exits, the parent resumes after the wait() call and prints its own result.

    else if (pid > 0) {
        wait(NULL);
        for (i = 0; i < SIZE; i++)
            printf("PARENT: %d ",nums[i]); /* LINE Y */
    }

OUTPUT from the parent process

What is printed at LINE Y:

 PARENT: 0 PARENT: 1 PARENT: 2 PARENT: 3 PARENT: 4 

Finally, the output of the child and the output of the parent are combined on the terminal as follows:

  CHILD: 0 CHILD: -1 CHILD: -4 CHILD: -9 CHILD: -16 PARENT: 0 PARENT: 1 PARENT: 2 PARENT: 3 PARENT: 4 

For additional information, refer to this link.

+11

The fork() function is special because it actually returns twice: once for the parent process and once for the child process. In the parent process, fork() returns the pid of the child. In the child process, it returns 0. In case of an error, no child process is created, and -1 is returned to the parent.

After a successful call to fork(), the child process is basically an exact duplicate of the parent process. Both have their own copies of all local and global variables and their own copies of any open file descriptors. Both processes continue running in parallel, and since they share the same open files, the output of the two processes will likely be interleaved.

Taking a closer look at the example in the question:

    pid_t pid;
    pid = fork();
    // When we reach this line, two processes now exist,
    // with each one continuing to run from this point
    if (pid == 0) {
        // The child runs this part because fork returns 0 to the child
        for (i = 0; i < SIZE; i++) {
            nums[i] *= -i;
            printf("CHILD: %d ",nums[i]); /* LINE X */
        }
    }
    else if (pid > 0) {
        // The parent runs this part because fork returns the child pid to the parent
        wait(NULL); // this causes the parent to wait until the child exits
        for (i = 0; i < SIZE; i++)
            printf("PARENT: %d ",nums[i]); /* LINE Y */
    }

As a result, you get the following:

 CHILD: 0 CHILD: -1 CHILD: -4 CHILD: -9 CHILD: -16 PARENT: 0 PARENT: 1 PARENT: 2 PARENT: 3 PARENT: 4 

Since the parent process calls wait(), it stops at that point until the child exits. Thus, the child's output appears first. Then, after the child exits, the parent resumes after the wait() call and prints its own result.

+4

fork() is a function call that creates a process. The process that calls fork() is called the parent process, and the newly created process is called the child process.

When the fork() system call returns, the two processes have identical copies of their user-level context, except for the return value, pid.

In the parent process, pid is the child process ID (the process ID of the newly created child process).
In the child process, pid is 0.


The kernel performs the following sequence of operations for fork():

  • It allocates a slot in the process table for the new process.
  • It assigns a unique ID number to the child process.
  • It makes a logical copy of the context of the parent process. Because certain portions of a process, such as the text region, can be shared between processes, the kernel can sometimes increment a region's reference count instead of copying the region to a new physical location in memory.
  • It increments the file and inode table counters for files associated with the process.
  • It returns the ID number of the child to the parent process and a value of 0 to the child process.


Now let's see what happens in your code when you call fork():

    01: pid = fork();
    02: if (pid == 0) {
    03:     for (i = 0; i < SIZE; i++) {
    04:         nums[i] *= -i;
    05:         printf("CHILD: %d ",nums[i]); /* LINE X */
    06:     }
    07: }
    08: else if (pid > 0) {
    09:     wait(NULL);
    10:     for (i = 0; i < SIZE; i++)
    11:         printf("PARENT: %d ",nums[i]); /* LINE Y */
    12: }

Line 01: fork() is called and a child process is created. fork() returns, and the return value is stored in pid.
[Note: the OP's code has no error checking; that is discussed later.]

Line 02: The value of pid is compared with 0. Note that this check is performed both in the parent process and in the newly created child process. As mentioned above, the value of pid is 0 in the child process and the child process ID in the parent process. So this condition evaluates to true in the child process and false in the parent process. Therefore, lines 03-07 are executed in the child process.

Line 03-07: These lines are fairly straightforward. The child process's nums[] array is modified (nums[i] *= -i;) and printed using printf("CHILD: %d ",nums[i]);.

Note that the values printed here belong to the nums[] array of the child process. The nums[] array of the parent process is still the same as before.

fork() uses a neat trick called copy-on-write. The question does not ask about it, but it is still interesting to read about.

Line 08: This line is checked only in the parent process. It is not reached in the child process, since the previous if succeeded there. A process ID is always a positive number, so when the parent process receives the process ID of the newly created child, it always passes the else if (pid > 0) test and enters the block.

[Note: the child's process ID can never be 0, because 0 is reserved. Read here.]

Line 09: This line makes the parent process wait until the child process completes. This is why you see all of the child process's printf() output before any of the parent process's printf() output.

Line 10-12: This is also a fairly simple for loop that prints the values of the nums[] array. Note that the values are unchanged in the parent process, because the earlier modification was made by the child process, which has its own copy of the nums[] array.


If fork() fails

There is a chance that the call to fork() fails. In that case, the return value is -1. A correct program must check for this as well.

    pid = fork();
    if (pid == -1)
        perror("Fork failed");

Some content is taken from the UNIX Operating System Design book.

+1

In the simplest cases, the behavior of fork() is very simple, even if it takes a little figuring out on your first encounter with it. It either returns once with an error, or it returns twice, once in the original (parent) process and once in a brand-new, almost exact copy of the original process (the child process). After the return, the two processes are nominally independent, although they share many resources.

    pid_t original = getpid();
    pid_t pid = fork();
    if (pid == -1)
    {
        /* Failed to fork - one return */
        …handle error situation…
    }
    else if (pid == 0)
    {
        /* Child process - distinct from original process */
        assert(original == getppid() || getppid() == 1);
        assert(original != getpid());
        …be childish here…
    }
    else
    {
        /* Parent process - distinct from child process */
        assert(original != pid);
        …be parental here…
    }

The child process is a copy of the parent. It has the same set of open file descriptors, for example; each file descriptor N that was open in the parent is open in the child, and the two share the same open file description. That means that if one of the processes changes the read or write position in a file, the change also affects the other process. On the other hand, if one of the processes closes a file, that has no direct effect on the file in the other process.

This also means that if data was buffered by the standard I/O package in the parent process (for example, some data was read from the standard input file descriptor, STDIN_FILENO, into the data buffer for stdin), then that data is available to both the parent and the child, and both can read the buffered data without affecting the other, so each sees the same data. On the other hand, once the buffered data has been consumed, if the parent reads another buffer-full, that moves the current file position for both the parent and the child, so the child will not see the data the parent just read (and if the child also reads a buffer-full, the parent will not see that either). This can be confusing. Consequently, it is usually a good idea to make sure there is no pending standard I/O before forking; fflush(0) is one way to do that.

In the code snippet, the assertion assert(original == getppid() || getppid() == 1); allows for the possibility that by the time the child executes the statement, the parent process may have exited, in which case the child will have been inherited by a system process, which normally has PID 1 (I don't know of any POSIX system where orphaned children are inherited by a different PID, but there probably is one).

Other shared resources, such as memory mapped files or shared memory, are still available in both. The subsequent behavior of the memory mapped file depends on the parameters used to create the mapping; MAP_PRIVATE means that two processes have independent copies of data, and MAP_SHARED means that they use the same copy of data, and changes made by one process will be visible in another.

However, not every program that forks is as simple as the story described so far. For example, the parent process might have acquired some (advisory) locks; those locks are not inherited by the child. The parent might be multi-threaded; the child has a single thread of execution, and there are constraints placed on what the child may do safely.

The POSIX specification for fork() details the differences:

The fork() function shall create a new process. The new process (child process) shall be an exact copy of the calling process (parent process) except as detailed below:

  • The child process shall have a unique process ID.

  • The child process ID also shall not match any active process group ID.

  • The child process shall have a different parent process ID, which shall be the process ID of the calling process.

  • The child process shall have its own copy of the parent's file descriptors. Each of the child's file descriptors shall refer to the same open file description with the corresponding file descriptor of the parent.

  • The child process shall have its own copy of the parent's open directory streams. Each open directory stream in the child process may share directory stream positioning with the corresponding directory stream of the parent.

  • The child process shall have its own copy of the parent's message catalog descriptors.

  • The child process values of tms_utime, tms_stime, tms_cutime, and tms_cstime shall be set to 0.

  • The time left until an alarm clock signal shall be reset to zero, and the alarm, if any, shall be canceled; see alarm.

  • [XSI] ⌦ All semadj values shall be cleared. ⌫

  • File locks set by the parent process shall not be inherited by the child process.

  • The set of signals pending for the child process shall be initialized to the empty set.

  • [XSI] ⌦ Interval timers shall be reset in the child process. ⌫

  • Any semaphores that are open in the parent process shall also be open in the child process.

  • [ML] ⌦ The child process shall not inherit any address space memory locks established by the parent process via calls to mlockall() or mlock(). ⌫

  • Memory mappings created in the parent shall be retained in the child process. MAP_PRIVATE mappings inherited from the parent shall also be MAP_PRIVATE mappings in the child, and any modifications to the data in these mappings made by the parent prior to calling fork() shall be visible to the child. Any modifications to the data in MAP_PRIVATE mappings made by the parent after fork() returns shall be visible only to the parent. Modifications to the data in MAP_PRIVATE mappings made by the child shall be visible only to the child.

  • [PS] ⌦ For the SCHED_FIFO and SCHED_RR scheduling policies, the child process shall inherit the policy and priority settings of the parent process during a fork() function. For other scheduling policies, the policy and priority settings on fork() are implementation-defined. ⌫

  • Per-process timers created by the parent shall not be inherited by the child process.

  • [MSG] ⌦ The child process shall have its own copy of the message queue descriptors of the parent. Each of the message descriptors of the child shall refer to the same open message queue description as the corresponding message descriptor of the parent. ⌫

  • No asynchronous input or asynchronous output operations shall be inherited by the child process. Any use of asynchronous control blocks created by the parent produces undefined behavior.

  • A process shall be created with a single thread. If a multi-threaded process calls fork(), the new process shall contain a replica of the calling thread and its entire address space, possibly including the states of mutexes and other resources. Consequently, to avoid errors, the child process may only execute async-signal-safe operations until such time as one of the exec functions is called. Fork handlers may be established by means of the pthread_atfork() function in order to maintain application invariants across fork() calls.

  • When the application calls fork() from a signal handler and any of the fork handlers registered by pthread_atfork() calls a function that is not async-signal-safe, the behavior is undefined.

  • [OB TRC TRI] ⌦ If the Trace option and the Trace Inherit option are both supported:

    If the calling process is being traced in a trace stream that has its inheritance policy set to POSIX_TRACE_INHERITED, the child process shall be traced into that trace stream, and the child process shall inherit the parent's mapping of trace event names to trace event type identifiers. If the trace stream in which the calling process is being traced has its inheritance policy set to POSIX_TRACE_CLOSE_FOR_CHILD, the child process shall not be traced into that trace stream. The inheritance policy is set by a call to the posix_trace_attr_setinherited() function. ⌫

  • [OB TRC] ⌦ If the Trace option is supported, but the Trace Inherit option is not supported:

    The child process shall not be traced into any of the trace streams of its parent process. ⌫

  • [OB TRC] ⌦ If the Trace option is supported, the child process of a trace controller process shall not control the trace streams controlled by its parent process. ⌫

  • [CPT] ⌦ The initial value of the CPU-time clock of the child process shall be set to zero. ⌫

  • [TCT] The initial value of the CPU-time clock of the single thread of the child process shall be set to zero.

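    All other process characteristics defined by POSIX.1-2008 shall be the same in the parent and child processes. The inheritance of process characteristics not defined by POSIX.1-2008 is unspecified by POSIX.1-2008.

    After fork(), both the parent and the child processes shall be capable of executing independently before either one terminates.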

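For the remaining details, consult the POSIX definition of fork().

Inside the kernel, writable memory pages are typically marked COW (copy-on-write), so the parent and the child share the same physical pages until one of them modifies a page. Other resources, such as file descriptors, still have to be duplicated by fork() (which in the child is often followed by one of the exec*() functions). A full account of how this interacts with open() and dup2() requires a discussion of the differences between file descriptors and open file descriptions.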

+1