I am trying to use Python's multiprocessing module (http://docs.python.org/library/multiprocessing) to get better performance on a task that is highly parallelizable.
The documentation says that the chunksize parameter is intended for very long iterables. My iterable itself is not long, but each dict contained in it is large: ~100,000 entries, with tuples as keys and numpy arrays as values.
How should I choose chunksize in this situation, and how can I transfer this data to the worker processes quickly?
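To make the setup concrete, here is a minimal sketch of what I have in mind. The worker function `process_item` and the generated data are placeholders standing in for my real workload, not my actual code:

```python
from multiprocessing import Pool

import numpy as np


def process_item(item):
    # item is a (key, value) pair from the dict:
    # key is a tuple, value is a numpy array.
    key, arr = item
    return key, arr.sum()  # placeholder for the real computation


def main():
    # Hypothetical data shaped like my real dict:
    # tuple keys, numpy-array values (real one has ~100,000 entries).
    data = {(i, i + 1): np.arange(10, dtype=float) for i in range(200)}
    with Pool(2) as pool:
        # chunksize controls how many items are pickled and sent
        # to each worker per task -- this is the knob I am asking about.
        results = pool.map(process_item, data.items(), chunksize=50)
    return results


if __name__ == "__main__":
    print(len(main()))
```

With `chunksize=50`, the 200 items are split into 4 batches, so each pickling round trip to a worker carries 50 (key, array) pairs at once instead of one.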
Thanks.