How to use multiple nodes/cores on a cluster with parallelized Python code

I have a piece of Python code where I use joblib and multiprocessing so that parts of the code run in parallel. I have no problem running this on my desktop, where I can use the task manager to see that it uses all four cores and runs the code in parallel.
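
The parallel parts follow the usual joblib pattern; here is a minimal sketch of what I am running (slow_function and the inputs are placeholders for the real workload):

    from joblib import Parallel, delayed

    def slow_function(x):
        return x ** 2          # stand-in for the real computation

    inputs = range(100)
    # n_jobs=4 matches the four cores I see in the task manager
    results = Parallel(n_jobs=4)(delayed(slow_function)(x) for x in inputs)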

I recently found out that I have access to an HPC cluster with 100+ 20-core nodes. The cluster uses SLURM as its workload manager.

First question: is it possible to run parallel Python code in a cluster?

If possible,

  • Do I need to change my Python code at all for it to work on a cluster, and

  • What #SBATCH directives should be placed in the job submission file so that the parallelized parts of the code run on four cores (or on four nodes)? A sketch of what I imagine follows below.
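
For the single-node case, this is the kind of submission file I have in mind, though I am not sure the directives are right (my_script.py is a placeholder name):

    #!/bin/bash
    #SBATCH --partition=standard
    #SBATCH --nodes=1              # joblib/multiprocessing cannot span nodes
    #SBATCH --ntasks=1
    #SBATCH --cpus-per-task=4      # four cores for the parallel sections
    #SBATCH --time=01:00:00

    python my_script.py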

The cluster that I have access to has the following attributes (sinfo output; CPUS(A/I/O/T) means allocated/idle/other/total, NODES(A/I) means allocated/idle):

PARTITION      CPUS(A/I/O/T)      NODES(A/I)  TIMELIMIT    MEMORY  CPUS  SOCKETS  CORES
standard       324/556/16/896     34/60       5-00:20:00   46000+  8+    2        4+
1 answer

MPI is generally considered the de facto standard for high-performance computing. There are several MPI bindings for Python; mpi4py is the most widely used.
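
For example, mpi4py's basic model looks like this: every process in the job runs the same script and learns its own index (a minimal sketch, not the asker's code):

    # hello_mpi.py - each MPI process reports its rank
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()    # index of this process, 0 .. size-1
    size = comm.Get_size()    # total number of processes in the job

    print("Hello from rank %d of %d" % (rank, size))

You launch it with, e.g., mpirun -n 4 python hello_mpi.py, or under SLURM simply srun python hello_mpi.py inside a job allocation.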

There are also many higher-level parallel and distributed computing frameworks for Python.

Your code will require at least minimal changes, but there should not be many of them.

MPI can also replace multiprocessing within a node, so a single program can span every core of every node it is given: with, say, 100 nodes of 24 cores each, that would be 2400 Python processes.
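
A sketch of that pattern, with a task list split round-robin across all ranks (slow_function and the task count are placeholders):

    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    def slow_function(x):
        return x ** 2              # stand-in for the real work

    inputs = list(range(2400))
    my_chunk = inputs[rank::size]  # this rank's share of the work
    my_results = [slow_function(x) for x in my_chunk]

    # gather the partial results on rank 0
    all_results = comm.gather(my_results, root=0)
    if rank == 0:
        print(sum(len(r) for r in all_results), "results collected")

In the SLURM batch file you would then request the total process count with something like #SBATCH --ntasks=2400 and start the program with srun python script.py.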
