What happens if a node in an OpenMPI / MPICH2 cluster terminates? Is there some kind of mechanism that tolerates this case and lets the job continue executing?
Thanks for your answers, Heinrich
Note that one piece of functionality that has existed since MPI 1.x is the ability to install an error handler; see, for example,
http://www.mpi-forum.org/docs/mpi-11-html/node148.html
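By default the handler is MPI_ERRORS_ARE_FATAL (the program aborts), but another can be installed instead (MPI_ERRORS_RETURN, or a user-defined handler).
Even then, the MPI standard does not guarantee that MPI remains usable after an error has been reported.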
There are also fault-tolerant MPI efforts: FT-MPI, http://icl.cs.utk.edu/ftmpi/ (implements MPI 1.2), and http://osl.iu.edu/research/ft/cifts/, which concerns OpenMPI and checkpoint/restart using BLCR.
Fault-tolerance additions to the MPI API have been discussed for MPI-3.