There have been a few brief mentions of asynchronous models, but no one has really explained them, so I thought I'd chime in. The most common alternative to multithreading I've seen is an asynchronous architecture. All that really means is that instead of executing code sequentially in a single thread, you kick off some operations, return right away, and then poll periodically until the data is available.
It really only works for models like your aforementioned crawler, where the real bottleneck is I/O rather than the CPU. In broad strokes, the asynchronous approach initiates downloads on several sockets, and a polling loop periodically checks whether any of them has finished; when one has, it moves on to the next step. This lets you have multiple downloads waiting on the network at once by switching between them within a single thread.
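To make that concrete, here is a minimal sketch of such a polling loop using the standard library `selectors` module. It is only an illustration: the names `read_all` and `demo` are made up for this example, and local `socket.socketpair()` connections stand in for real downloads so it runs without a network.

```python
import selectors
import socket


def read_all(socks):
    """Read every socket to EOF from one thread; returns {name: bytes}."""
    sel = selectors.DefaultSelector()
    results = {}
    for name, sock in socks.items():
        sock.setblocking(False)
        results[name] = b""
        # Ask to be told when this socket has data (or EOF) to read.
        sel.register(sock, selectors.EVENT_READ, data=name)

    pending = len(socks)
    while pending:
        # The polling step: returns only the sockets that are ready.
        for key, _events in sel.select(timeout=1.0):
            chunk = key.fileobj.recv(4096)
            if chunk:
                results[key.data] += chunk
            else:
                # Empty read means the peer closed: this "download" is done.
                sel.unregister(key.fileobj)
                key.fileobj.close()
                pending -= 1
    sel.close()
    return results


def demo():
    # Stand-in for remote servers: each peer sends a payload and closes.
    socks = {}
    for name in ("a", "b", "c"):
        ours, theirs = socket.socketpair()
        theirs.sendall(f"payload for {name}".encode())
        theirs.close()
        socks[name] = ours
    return read_all(socks)


if __name__ == "__main__":
    print(demo())
```

In a real crawler the sockets would be connections to remote hosts, but the shape of the loop stays the same: register, poll, handle whichever socket is ready.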
A multithreaded model works the same way, except that each download gets its own thread instead of one polling loop checking multiple sockets in a single thread. In an I/O-bound application, asynchronous polling works nearly as well as threading for many use cases, since the real problem is just waiting for I/O to complete rather than waiting on the CPU to process data.
Another real-world example was a system that needed to run a number of other executables and wait for their results. This can be done with threads, but it is also much simpler, and almost as effective, to launch the external applications as process objects and then check on them periodically until they have all finished. This puts the CPU-intensive parts (the code running in the external executables) into their own processes, while the waiting is handled asynchronously.
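A rough sketch of that pattern, assuming the standard library `subprocess` module is what provides the process objects. The helper names `run_all` and `demo` and the two toy commands are invented for illustration.

```python
import subprocess
import sys
import time


def run_all(commands, interval=0.05):
    """Start every command, then poll until all have exited.

    Returns {name: exit_code}.
    """
    # Popen returns immediately; the children run in parallel.
    procs = {name: subprocess.Popen(cmd) for name, cmd in commands.items()}
    exit_codes = {}
    while procs:
        for name, proc in list(procs.items()):
            code = proc.poll()  # None while the child is still running
            if code is not None:
                exit_codes[name] = code
                del procs[name]
        if procs:
            time.sleep(interval)  # avoid spinning while we wait
    return exit_codes


def demo():
    # Toy stand-ins for the real external executables.
    commands = {
        "fast": [sys.executable, "-c", "import sys; sys.exit(0)"],
        "slow": [sys.executable, "-c",
                 "import sys, time; time.sleep(0.2); sys.exit(3)"],
    }
    return run_all(commands)


if __name__ == "__main__":
    print(demo())
```

The parent never blocks on any single child, so a slow executable does not hold up collecting results from the fast ones.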
The Python FTP server library I've been working on, pyftpdlib, uses Python's asynchronous library to serve FTP clients with only a single thread, using asynchronous socket communication for both file transfers and command/response.
For more information, see the Python Twisted library's page on Asynchronous Programming. Although it is somewhat specific to Twisted, it also introduces asynchronous programming from a beginner's perspective.
Jay