Sharing a list of lists with multiprocessing

I wrote a program like this:

    from multiprocessing import Process, Manager

    def worker(i):
        x[i].append(i)

    if __name__ == '__main__':
        manager = Manager()
        x = manager.list()
        for i in range(5):
            x.append([])
        p = []
        for i in range(5):
            p.append(Process(target=worker, args=(i,)))
            p[i].start()

        for i in range(5):
            p[i].join()

        print x

I want to create a list of lists that is shared among processes, with each process modifying one of the inner lists. But the result of this program is a list of empty lists: [[], [], [], [], []].

What's going wrong?

1 answer

I think this is due to a quirk in the way Managers are implemented.

If you create two Manager.list objects and then append one of the lists to the other, the type of the list you append changes inside the parent list:

    >>> type(l)
    <class 'multiprocessing.managers.ListProxy'>
    >>> type(z)
    <class 'multiprocessing.managers.ListProxy'>
    >>> l.append(z)
    >>> type(l[0])
    <class 'list'>  # Not a ListProxy anymore

l[0] and z are not the same object, and as a result they don't behave quite the way you'd expect:

    >>> l[0].append("hi")
    >>> print(z)
    []
    >>> z.append("hi again")
    >>> print(l[0])
    ['hi again']

As you can see, changing the nested list has no effect on the ListProxy object, but changing the ListProxy object does change the nested list. The documentation actually notes this explicitly:

Note

Modifications to mutable values or items in dict and list proxies will not be propagated through the manager, because the proxy has no way of knowing when its values or items are modified. To modify such an item, you can re-assign the modified object to the container proxy:
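A minimal sketch of that re-assignment pattern, assuming Python 3 (this is not the documentation's own example; the names reassign_demo, shared and inner are made up for illustration):

```python
from multiprocessing import Manager


def reassign_demo():
    """Return the shared list's contents before and after re-assignment."""
    with Manager() as manager:
        shared = manager.list([[]])  # one plain nested list inside the proxy

        inner = shared[0]    # a local copy of the nested list
        inner.append("hi")   # only the local copy changes
        before = list(shared)

        shared[0] = inner    # re-assign so the manager sees the change
        after = list(shared)
    return before, after


if __name__ == "__main__":
    before, after = reassign_demo()
    print(before)  # [[]]
    print(after)   # [['hi']]
```

The append on its own is lost because it only touches the local copy; the assignment to shared[0] is what sends the new value to the manager.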

Digging through the source code, you can see that when you call append on a ListProxy, the call is actually sent to the manager process via IPC, and the manager then calls append on the shared list. This means the arguments to append have to be pickled and unpickled. During unpickling, the ListProxy object is turned into a regular Python list: the object the ListProxy was pointing to (its referent). This is also noted in the documentation:

An important feature of proxy objects is that they are picklable so they can be passed between processes. Note, however, that if a proxy is sent to the corresponding manager's process then unpickling it will produce the referent itself. This means, for example, that one shared object can contain a second

So, going back to the example above: if l[0] is a plain list rather than z itself, why does updating z also update l[0]? Because the list stored inside the parent, in the manager process, is z's referent, changes made through the ListProxy (z in the example above) show up the next time l[0] is read. What you get back from l[0] in your own process, however, is a plain list that knows nothing about the proxy, so changing it changes neither z nor the list held by the manager.
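Here is a small sketch of the direction that does propagate, assuming Python 3 (nested_proxy_demo, outer and inner are illustrative names). As far as I know, the type you get back from outer[0] depends on the Python version: a plain list in the session shown above, but a ListProxy from Python 3.6 on, where shared objects can be nested. The sketch therefore converts the result with list() and only reports the type name:

```python
from multiprocessing import Manager


def nested_proxy_demo():
    """Append one list proxy to another, then modify the appended proxy."""
    with Manager() as manager:
        outer = manager.list()
        inner = manager.list()
        outer.append(inner)       # the proxy is pickled and sent to the manager
        inner.append("hi again")  # change made through the proxy

        # list(...) yields a plain list whether outer[0] comes back as a
        # plain list or as a ListProxy.
        seen_via_outer = list(outer[0])
        type_name = type(outer[0]).__name__
    return seen_via_outer, type_name


if __name__ == "__main__":
    contents, type_name = nested_proxy_demo()
    print(contents)   # ['hi again']
    print(type_name)  # 'list' or 'ListProxy', depending on the Python version
```

Either way, the change made through inner is visible when reading outer[0], which matches the second half of the interactive session above.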

So, for your example to work, you need to create a new manager.list() object every time you want to modify a sublist, and update that proxy object directly rather than updating the sublist through the parent list's index:

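    #!/usr/bin/python
    from multiprocessing import Process, Manager

    def worker(x, i, *args):
        # `manager` is the global created in the main block below;
        # the child process inherits it when it is started by fork.
        sub_l = manager.list(x[i])  # new proxy built from sublist i
        sub_l.append(i)
        x[i] = sub_l                # assign the updated sublist back

    if __name__ == '__main__':
        manager = Manager()
        x = manager.list([[]]*5)
        print x
        p = []
        for i in range(5):
            p.append(Process(target=worker, args=(x, i)))
            p[i].start()

        for i in range(5):
            p[i].join()

        print x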

Here's the output:

    dan@dantop2:~$ ./multi_weirdness.py
    [[0], [1], [2], [3], [4]]
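As a variation, here is a sketch for Python 3 that uses only the re-assign pattern from the documentation note instead of creating a new manager.list() per sublist. It passes the shared list to the worker explicitly, so it does not rely on a global manager being inherited. The names run_workers, shared and row are my own, and I'm assuming each worker touches only its own index:

```python
from multiprocessing import Manager, Process


def worker(shared, i):
    row = shared[i]   # plain-list copy of sublist i
    row.append(i)     # modify the local copy
    shared[i] = row   # write it back through the proxy


def run_workers(n=5):
    """Start n workers, each updating its own sublist, and return the result."""
    with Manager() as manager:
        shared = manager.list([[] for _ in range(n)])
        procs = [Process(target=worker, args=(shared, i)) for i in range(n)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        return list(shared)  # copy the contents out before the manager shuts down


if __name__ == "__main__":
    print(run_workers())  # [[0], [1], [2], [3], [4]]
```

The read, modify, write-back sequence is not atomic, so if several workers could update the same index you would want to guard it with a lock (for example one from manager.Lock()).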