It depends on your use case: both on the computational model of the neural network and on the runtime. Here is a recent article (2014) by Plotnikova et al., which uses “Erlang and the Erlang/OTP platform with a predefined base implementation of actor model functions” together with a new model developed by the authors, which they describe as “one neuron-one process,” using a “Gravity Search Algorithm” for learning:
http://link.springer.com/chapter/10.1007%2F978-3-319-06764-3_52
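The “one neuron-one process” idea can be pictured without Erlang: each neuron is an independent process with a mailbox that receives weighted inputs asynchronously and forwards its activation downstream. Below is a minimal sketch of that idea using Python threads and queues as stand-ins for Erlang processes and mailboxes; the topology, weights, and sigmoid activation are my assumptions for illustration, not the paper's implementation.

```python
import math
import queue
import threading

def neuron(n_inputs, inbox, outboxes):
    """One neuron running as its own process ("one neuron-one process"):
    it collects n_inputs weighted signals from its inbox, applies a
    sigmoid, and forwards the activation to every downstream inbox."""
    total = sum(inbox.get() for _ in range(n_inputs))
    activation = 1.0 / (1.0 + math.exp(-total))
    for out in outboxes:
        out.put(activation)

# Hypothetical wiring: one neuron with two inputs. Names, weights,
# and inputs are illustrative, not taken from the paper.
inbox, outbox = queue.Queue(), queue.Queue()
worker = threading.Thread(target=neuron, args=(2, inbox, [outbox]))
worker.start()

inbox.put(0.5 * 1.0)   # weight 0.5 applied to input 1.0
inbox.put(0.25 * 2.0)  # weight 0.25 applied to input 2.0
worker.join()

activation_out = outbox.get()
print(round(activation_out, 4))
```

Because each neuron blocks only on its own mailbox, many such processes can run concurrently and be distributed across nodes, which is what the Erlang/OTP actor model provides natively.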
Quoting their abstract: “The article develops an asynchronous distributed modification of this algorithm and presents the results of experiments. The proposed architecture shows an increase in performance for distributed systems with different environmental parameters (a high-performance cluster and a local area network with a slow gateway).”
In addition, most of the other answers here refer to a computational model that uses matrix operations as the basis for training and modeling, an approach the authors of this article criticize: “in this case the neural network model becomes purely mathematical and its original character (derived from biological prototypes of neural networks) is lost.”
The tests were carried out on three types of systems:
- An IBM cluster represented as 15 virtual machines.
- A distributed system deployed on a local network, in the form of 15 physical machines.
- A hybrid system based on system 2, where each physical machine has four processor cores.
They give the following specific results: “The presented results indicate a good ability to distribute the gravity search algorithm, especially for large networks (801 or more neurons). The acceleration depends on the number of nodes almost linearly. Using 15 nodes, we can get about an eight-fold acceleration of the learning process.”
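It is worth noting what those numbers imply: an eight-fold speedup on 15 nodes corresponds to a parallel efficiency of roughly 53%, i.e. about half of ideal linear scaling. A quick check of that arithmetic:

```python
# Reported figures from the paper: ~8x speedup on 15 nodes.
nodes = 15
speedup = 8.0

# Parallel efficiency = fraction of ideal linear speedup achieved.
efficiency = speedup / nodes
print(f"parallel efficiency = {efficiency:.0%}")
```

So “almost linearly” should be read as linear in trend, not as perfect scaling; message-passing overhead between neuron processes presumably accounts for the rest.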
Finally, they conclude with respect to their model: “The model includes three levels of abstraction: NNET, MLP and NEURON. This architecture allows you to encapsulate some common functions at common levels and some specific functions of the considered neural networks at special levels. Asynchronous message passing between levels allows us to differentiate the synchronous and asynchronous parts of the training and modeling algorithms and, as a result, improve the use of resources.”
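One way to picture that layered architecture is three processes, one per abstraction level, that delegate work downward via asynchronous messages. The level names NNET, MLP, and NEURON come from the quote; the message formats and the toy computation below are purely my assumptions, again using Python threads and queues in place of Erlang processes.

```python
import queue
import threading

def neuron_level(inbox, reply_to):
    # NEURON level: the lowest level does the actual (toy) computation.
    x, w = inbox.get()
    reply_to.put(("activation", x * w))

def mlp_level(inbox, reply_to, neuron_inbox, neuron_reply):
    # MLP level: receives a layer-level request, delegates to NEURON.
    _cmd, x = inbox.get()
    neuron_inbox.put((x, 0.5))            # assumed weight, illustrative
    _tag, act = neuron_reply.get()
    reply_to.put(("layer_output", act))

def nnet_level(inbox, mlp_inbox, mlp_reply, result):
    # NNET level: the network-level entry point, delegates to MLP.
    _cmd, x = inbox.get()
    mlp_inbox.put(("forward", x))
    _tag, out = mlp_reply.get()
    result.put(out)

nnet_in, mlp_in, neu_in = queue.Queue(), queue.Queue(), queue.Queue()
mlp_re, neu_re, result = queue.Queue(), queue.Queue(), queue.Queue()

threads = [
    threading.Thread(target=nnet_level, args=(nnet_in, mlp_in, mlp_re, result)),
    threading.Thread(target=mlp_level, args=(mlp_in, mlp_re, neu_in, neu_re)),
    threading.Thread(target=neuron_level, args=(neu_in, neu_re)),
]
for t in threads:
    t.start()

nnet_in.put(("model", 2.0))               # ask NNET to model one input
for t in threads:
    t.join()

out_value = result.get()
print(out_value)
```

The point of the layering is the same as in the quote: common plumbing (mailboxes, delegation) lives at the upper levels, while network-specific computation is encapsulated at the NEURON level, and every hop between levels is an asynchronous message rather than a function call.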