The mechanisms you are asking about are strictly implementation-specific. MPI is a mid-level standard that sits on top of whatever communication mechanisms the hardware and operating system provide.
ORTE is part of Open MPI, one of the general-purpose MPI implementations in widespread use today. There are also MPICH and MPICH2 and their derivatives (for example, Intel MPI). Most supercomputer vendors provide their own MPI implementations (for example, IBM provides a modified MPICH2 for Blue Gene/Q).
Open MPI's approach is a layered architecture in which each level of functionality is provided by many dynamically loaded modules. A scoring mechanism selects the module that is best suited to the conditions at hand.
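The scoring idea can be pictured with a toy sketch like the following (plain Python, not Open MPI's actual MCA code; the module names and score values are invented for illustration):

```python
# Toy sketch of priority-based module selection: each module reports a
# score for the current environment, and the framework picks the
# highest-scoring one. All names and scores here are made up.

def select_module(modules, environment):
    """Ask each module to score itself; return the best-scoring one."""
    best, best_score = None, -1
    for module in modules:
        score = module["query"](environment)  # -1 means "cannot run here"
        if score > best_score:
            best, best_score = module, score
    return best

# Hypothetical transport modules for different interconnects.
modules = [
    {"name": "tcp", "query": lambda env: 10},  # always usable, low score
    {"name": "sm",  "query": lambda env: 50 if env["same_node"] else -1},
    {"name": "ib",  "query": lambda env: 80 if env["has_infiniband"] else -1},
]

# Shared memory wins for two processes on the same node without InfiniBand.
print(select_module(modules, {"same_node": True, "has_infiniband": False})["name"])
# → sm
```

The same query-and-score pattern then picks the network module, the collective-algorithm module, and so on, each chosen independently for the environment the job actually runs in.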
All MPI implementations provide a mechanism for launching so-called SPMD jobs. An MPI application is in fact a special kind of SPMD (Single Program Multiple Data) program: many copies of one executable are run, and message passing is used as the mechanism for communication and coordination. The SPMD launcher takes a list of execution nodes, starts the processes remotely, and establishes connectivity and the communication scheme between them (in Open MPI this is called the MPI Universe). It is the launcher that creates the global MPI communicator MPI_COMM_WORLD and distributes the initial rank assignment, and it can provide options such as binding processes to CPU cores (which is very important on NUMA systems). Once the processes have been started, an identification mechanism is available (for example, a mapping from ranks to IP address/TCP port) so that other addressing schemes can also be used. Open MPI, for example, starts remote processes using ssh or rsh, or it can use the mechanisms provided by various resource management systems (for example, PBS/Torque, SLURM, Grid Engine, LSF, ...). Once the processes are up, their IP addresses and port numbers are recorded and broadcast within the Universe; the processes can then find each other over other (faster) networks, for example InfiniBand, and establish communication routes over them.
Message routing is usually not performed by the MPI library itself but is left to the underlying communication network. MPI only takes care of constructing the messages and handing them to the network, which delivers them to their destination. Shared memory is usually used for communication between processes that reside on the same node.
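The same-node shared-memory case can be illustrated with a short sketch (Python's `multiprocessing` here, not Open MPI's shared-memory component): two processes exchange data through a common buffer without any network traffic.

```python
# Sketch of same-node communication through shared memory: the child
# process writes into a buffer that the parent can read directly,
# because both map the same memory. This is only an illustration of the
# idea, not how Open MPI implements it.
from multiprocessing import Array, Process

def sender(buf):
    # The "message" is written byte by byte into the shared buffer.
    for i, byte in enumerate(b"hello"):
        buf[i] = byte

def demo():
    buf = Array('B', 16)  # 16-byte shared buffer visible to both processes
    p = Process(target=sender, args=(buf,))
    p.start()
    p.join()
    return bytes(buf[:5])  # parent reads what the child wrote

if __name__ == "__main__":
    print(demo())  # → b'hello'
```

An MPI implementation does essentially this, but with ring buffers, synchronization, and zero-copy tricks layered on top, all hidden behind the same MPI_Send/MPI_Recv interface that is used over the network.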
If you are interested in the technical details, I would recommend reading the Open MPI source code. You can find it on the project website.