The main thing to consider when moving from multi-threaded to distributed computing is the added overhead of running tasks on remote machines compared to spinning up another thread on the current machine. The granularity of each work item must be coarse enough to justify the much slower connection between nodes. Messaging between threads on the same machine is many orders of magnitude faster than messaging between different machines over a network.
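To make the granularity point concrete, here is a rough back-of-the-envelope sketch. The function name, the cost model, and the numbers are all my own assumptions for illustration: it charges a fixed per-task overhead for shipping work to a remote node and ignores everything else (serialization size, contention, stragglers).

```python
import math


def estimated_speedup(task_seconds, num_tasks, workers, overhead_seconds):
    """Rough speedup of farming tasks out to `workers` nodes versus
    running them one after another on a single machine.

    Assumed model (hypothetical, not a benchmark): every task pays a fixed
    `overhead_seconds` for dispatch and result transfer, and tasks are
    processed in rounds of `workers` at a time.
    """
    serial_time = task_seconds * num_tasks
    rounds = math.ceil(num_tasks / workers)
    distributed_time = rounds * (task_seconds + overhead_seconds)
    return serial_time / distributed_time


# Fine-grained work: 1 ms tasks with 50 ms of network overhead each.
fine = estimated_speedup(0.001, 1000, workers=10, overhead_seconds=0.05)

# Coarse-grained work: 10 s tasks with the same 50 ms overhead.
coarse = estimated_speedup(10.0, 1000, workers=10, overhead_seconds=0.05)
```

Under these assumed numbers the fine-grained job comes out slower than running it locally (speedup below 1), while the coarse-grained job gets close to the ideal 10x. Batching many small items into one task is the usual way to move from the first case toward the second.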
Sharing resources is harder across machines. Sharing objects in memory is simple between threads in the same process, but it takes some engineering to achieve something similar across different machines. Locks essentially do not exist across machines. Look at using a message queue or service bus to coordinate work between multiple machines, return results to an aggregator, etc.
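The shape of the queue-plus-aggregator pattern can be sketched in a few lines. This is only an illustration under stated assumptions: in-process `queue.Queue` objects and threads stand in for a real network message queue and remote worker machines, and `worker` and `run_job` are hypothetical names. The point is that workers share nothing except the two queues, so no cross-machine locks are needed.

```python
import queue
import threading


def worker(tasks, results):
    """Pull work items until a None sentinel arrives; push results back."""
    while True:
        item = tasks.get()
        if item is None:  # sentinel: no more work for this worker
            break
        # Stand-in for the real computation done on a remote node.
        results.put((item, item * item))


def run_job(inputs, num_workers=4):
    """Distribute inputs over workers and aggregate their results."""
    tasks = queue.Queue()    # stand-in for the work queue
    results = queue.Queue()  # stand-in for the results queue

    threads = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()

    for x in inputs:
        tasks.put(x)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for t in threads:
        t.join()

    # Aggregator: all workers have finished, so drain the results queue.
    aggregated = {}
    while not results.empty():
        key, value = results.get()
        aggregated[key] = value
    return aggregated


squares = run_job(range(5))
```

With a real message queue service the structure stays the same, but the `put` and `get` calls become network operations, which is exactly where the per-message overhead discussed above comes from.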
You mention "offsite". If you are considering off-site computing resources, be sure to look at cloud computing or elastic computing service providers. Oddly enough, cloud computing is not mentioned in the same breath as parallel programming as often as you would think. Cloud computing lets you scale your parallelism out to hundreds or thousands of compute nodes, which you pay for only while you use them. When your computation is finished, or the live source of the data you are analyzing goes home at the end of the day, you can shut down your cloud nodes and stop being billed for hours until you start them again.
Amazon, Google and Microsoft are three major cloud service providers (among others), and each has very different characteristics, strengths and weaknesses. I work on Azure at Microsoft. Azure's built-in message queues are pretty handy for running workflows at scale.
Whether you use LAMP or .NET as your platform is really less about performance and more about the tools and skill sets you have on your development team. Deliberately choosing a target platform that does not match your development team's skill set is a great way to add a lot of time and retraining expense to your project schedule.
C#/.NET works very well for coding parallel systems compared to C++ or scripting in other environments. Explore the language features, debugging tools, and ready-made libraries and services available to you when evaluating which platform best fits your skill set and your desired system design.
dthorpe