Python: How does os.fork() work?

I am learning multiprocessing in Python. I tried the multiprocessing module, and after reading its source code I found that it uses os.fork(), so I wrote some code to test os.fork(), but I'm stuck. My code is as follows:

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    import os
    import time

    for i in range(2):
        print '**********%d***********' % i
        pid = os.fork()
        print "Pid %d" % pid

I expected each print to be executed twice, but they are executed three times. I can't understand how this works. I have read "Need to know how fork works?", and from what is said there the prints should also be executed twice, so I'm stuck...

+18
2 answers

To answer the question directly: os.fork() works by calling the underlying OS function fork().

But you are surely interested in what that does. It creates another process, which resumes at exactly the same place as this one. So on the first pass through the loop you get a fork, after which you have two processes: the "original" one (which gets a pid value equal to the PID of the child process) and the forked one (which gets a pid value of 0).

They both print their pid value and go on to the second pass through the loop, where both print the banner line. Then they both fork, leaving you with 4 processes that all print their respective pid values. Two of them should be 0; the other two should be the PIDs of the children they just created.
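The counts described above can be checked without forking at all. The following is only a sketch in Python 3 (the snippets in this thread are Python 2); the function name simulate_fork_loop is made up for illustration, and it assumes every live process runs the same loop body: print the banner, fork once, print the pid.

```python
def simulate_fork_loop(iterations):
    """Count processes and prints for the question's loop, without forking."""
    processes = 1        # only the original process exists at the start
    banner_prints = 0    # the '*****i*****' line, printed before each fork
    pid_prints = 0       # the 'Pid ...' line, printed after each fork
    for _ in range(iterations):
        banner_prints += processes   # every live process prints the banner
        processes *= 2               # every live process forks exactly once
        pid_prints += processes      # parents and new children all print
    return processes, banner_prints, pid_prints


if __name__ == "__main__":
    # For range(2): 4 processes, 3 banner lines, 6 'Pid' lines.
    print(simulate_fork_loop(2))
```

This is why the banner shows up three times rather than twice: once in the first pass, and twice in the second pass because two processes are running by then.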

Change the code to

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    import os
    import time

    for i in range(2):
        print '**********%d***********' % i
        pid = os.fork()
        if pid == 0:
            # We are in the child process.
            print "%d (child) just was created by %d." % (os.getpid(), os.getppid())
        else:
            # We are in the parent process.
            print "%d (parent) just created %d." % (os.getpid(), pid)

You will see better what happens: each process tells you its own PID and what happened at the fork.
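If what you actually want is one child per loop iteration rather than a growing tree, one common pattern is to let each child leave with os._exit() before it can reach the next iteration. This is only a sketch in Python 3, assuming a POSIX system; spawn_children is a made-up name, and using the loop index as the exit code is just a way to make the result visible.

```python
import os


def spawn_children(count):
    """Fork `count` children from the parent only; return their exit codes."""
    pids = []
    for i in range(count):
        pid = os.fork()
        if pid == 0:
            # Child: leave right away so it never reaches the next
            # iteration and therefore never forks on its own.
            os._exit(i)
        pids.append(pid)

    # Parent: reap every child and collect its exit code.
    codes = []
    for pid in pids:
        _, status = os.waitpid(pid, 0)
        codes.append(os.WEXITSTATUS(status))
    return codes


if __name__ == "__main__":
    print(spawn_children(2))
```

With this structure, two iterations give exactly two children, because only the original process ever calls os.fork().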

+24

First of all, remove that print '******...' line. It just confuses everyone. Instead, let's try this code...

    import os
    import time

    for i in range(2):
        print "I'm about to be a dad!"
        time.sleep(5)
        pid = os.fork()
        if pid == 0:
            print "I'm {}, a newborn that knows to write to the terminal!".format(os.getpid())
        else:
            print "I'm the dad of {}, and he knows to use the terminal!".format(pid)
            # waitpid() needs an options argument; 0 means "block until the child exits".
            os.waitpid(pid, 0)

Well, first of all, what is a "fork"? A fork is a feature of modern, standards-compliant operating systems (except for M$ Windows: that joke of an OS is anything but modern and standards-compliant) that allows a process (a.k.a. a "program", and that includes the Python interpreter!) to literally make an exact copy of itself, effectively creating a new process (another instance of the "program"). Once this magic is done, both processes are independent. Changing anything in one of them does not affect the other.
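The independence of the two processes can be observed directly. The sketch below (Python 3, POSIX only) lets the child modify a dictionary and report what it sees through a pipe, while the parent checks its own copy. The function name and the pipe-based reporting are assumptions made for illustration, not something from the answer.

```python
import os


def child_changes_stay_in_child():
    """Return (value seen by the parent, value seen by the child)."""
    data = {"value": "original"}
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            # Child: this modifies the child's own copy of `data` only.
            os.close(read_fd)
            data["value"] = "changed in child"
            os.write(write_fd, data["value"].encode())
        finally:
            os._exit(0)  # never fall through into the parent's code

    # Parent: read what the child reported, then reap it.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        seen_by_child = reader.read()
    os.waitpid(pid, 0)
    return data["value"], seen_by_child


if __name__ == "__main__":
    print(child_changes_stay_in_child())
```

The parent still sees "original" after the child has finished, which is the "changing something in one does not affect the other" behaviour described above.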

The process responsible for casting this dark and ancient spell is known as the parent process. The soulless result of this immoral abomination against life itself is known as the child process.

As should be obvious to everyone, including those to whom it isn't, you can join this select group of programmers who have sold their souls by means of os.fork(). This function performs a fork operation, and thus results in a second process being created out of thin air.

Now, what does this function return, or, more importantly, how does it even return? If you don't want to go insane, please don't go and read the Linux kernel's /kernel/fork.c file! Once the kernel has done what we know it has to do, but don't want to accept, os.fork() returns in both processes! Yes, even the call stack is copied!

So, if they are exact copies, how do you tell parent and child apart? Simple. If the result of os.fork() is zero, you are running in the child. Otherwise, you are running in the parent, and the return value is the PID (Process IDentifier) of the child. In any case, the child can get its own PID from os.getpid(), no?
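To see both return values at once, here is a rough sketch in Python 3 (POSIX only). The child reports its own os.getpid() through a pipe so the parent can compare it with the value that os.fork() returned to it. The name fork_once and the pipe are illustrative assumptions.

```python
import os


def fork_once():
    """Return (pid returned to the parent, pid the child reports for itself)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            # Child: os.fork() returned 0 here, so ask the OS for our PID.
            os.close(read_fd)
            os.write(write_fd, str(os.getpid()).encode())
        finally:
            os._exit(0)  # leave without running the parent's code below

    # Parent: os.fork() returned the child's PID.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        reported = int(reader.read())
    os.waitpid(pid, 0)
    return pid, reported


if __name__ == "__main__":
    parent_view, child_view = fork_once()
    print(parent_view == child_view)
```

The two numbers agree: the non-zero value the parent receives is exactly what the child gets from os.getpid().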

Now, taking that into account, plus the fact that calling fork() inside a loop is a recipe for a mess, here is what happens. Let's call the original process the "master" process...

  • Master: i = 0, forks into child-#1-of-master
    • Child-#1-of-master: i = 1, forks into child-#1-of-child-#1-of-master
    • Child-#1-of-child-#1-of-master: for loop over, exits
    • Child-#1-of-master: for loop over, exits
  • Master: i = 1, forks into child-#2-of-master
    • Child-#2-of-master: for loop over, exits
  • Master: for loop over, exits
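A trace like this can also be generated by a small simulation instead of by hand. The sketch below (Python 3, no real forking) assumes that a child resumes right after the fork that created it, so a child created during iteration i only has iterations i + 1 onward left to run. The function name fork_tree and the label format are made up for illustration.

```python
def fork_tree(iterations, name="master", start=0):
    """Return the trace lines for a process that runs iterations start..end."""
    lines = []
    children = 0
    for i in range(start, iterations):
        children += 1
        child = "child-#%d-of-%s" % (children, name)
        lines.append("%s: i = %d, forks into %s" % (name, i, child))
        # The child continues after the fork, so its next iteration is i + 1.
        lines.extend("  " + line for line in fork_tree(iterations, child, i + 1))
    lines.append("%s: for loop over, exits" % name)
    return lines


if __name__ == "__main__":
    for line in fork_tree(2):
        print(line)
```

For two iterations the simulation reports 3 forks and 4 processes in total, since a child created during the last iteration has no iterations left in which to fork.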

As you can see, a total of 6 parent/child prints come from 4 unique processes, resulting in 6 lines of output (not counting the "I'm about to be a dad!" lines), something like...

    I'm the dad of 12120, and he knows to use the terminal!
    I'm 12120, a newborn that knows to write to the terminal!
    I'm the dad of 12121, and he knows to use the terminal!
    I'm 12121, a newborn that knows to write to the terminal!
    I'm the dad of 12122, and he knows to use the terminal!
    I'm 12122, a newborn that knows to write to the terminal!

But that order is arbitrary; it could just as well print this instead...

    I'm 12120, a newborn that knows to write to the terminal!
    I'm the dad of 12120, and he knows to use the terminal!
    I'm 12121, a newborn that knows to write to the terminal!
    I'm the dad of 12121, and he knows to use the terminal!
    I'm 12122, a newborn that knows to write to the terminal!
    I'm the dad of 12122, and he knows to use the terminal!

Or anything else. The OS (and the funky clocks on your motherboard) is solely responsible for the order in which processes get their time slices, so go blame Torvalds (and expect no self-esteem when you get back) if you don't like how the kernel schedules your processes ;).
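If the order matters to you, the parent has to impose it itself, for example by waiting for the child before writing anything. This is only a sketch in Python 3 (POSIX only); both processes write into a shared pipe rather than the terminal so the resulting order can be inspected, and ordered_messages is a made-up name.

```python
import os


def ordered_messages():
    """Return the messages in the order they were written to a shared pipe."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            # Child: write its message, then exit.
            os.close(read_fd)
            os.write(write_fd, b"child\n")
        finally:
            os._exit(0)

    # Parent: block until the child has exited, and only then write.
    os.waitpid(pid, 0)
    os.write(write_fd, b"parent\n")
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        return reader.read().splitlines()


if __name__ == "__main__":
    print(ordered_messages())
```

Because the parent only writes after os.waitpid() returns, the child's line always comes first here, whatever the scheduler decides.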

I hope this has shed some light for you!

+38
