On a synchronous (thread-per-connection) server, you must take locks whenever shared data structures are updated; this costs time and code, and it is a classic source of hard-to-reproduce bugs. In addition, having many threads (thousands, for example) creates technical problems in many implementations (stack allocation, for example), and if the server is I/O-bound, those threads are almost all asleep waiting on the network and simply waste memory.
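To make the locking cost concrete, here is a minimal sketch (names and port are my own, not from the question) of a thread-per-connection server where a shared dict must be guarded by a lock in every handler:

    import socket
    import threading

    stats = {}                        # shared data structure
    stats_lock = threading.Lock()     # every handler thread must take this

    def handle(conn, addr):
        with conn:
            while True:
                data = conn.recv(1024)
                if not data:
                    break
                with stats_lock:      # serialize updates to shared state
                    stats[addr] = stats.get(addr, 0) + len(data)
                conn.sendall(data)

    def serve(host="127.0.0.1", port=9000):
        with socket.socket() as srv:
            srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            srv.bind((host, port))
            srv.listen()
            while True:
                conn, addr = srv.accept()
                # one thread (and one stack) per client, mostly asleep in recv()
                threading.Thread(target=handle, args=(conn, addr), daemon=True).start()

Each idle client still costs a full thread stack, which is the memory waste mentioned above.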
Using an asynchronous model in a single thread, you can ignore locking entirely (so your processing runs as fast as requests can be received), and you use only the memory the clients actually need, since there is just one stack.
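For comparison, a sketch of the same counter server on a single-threaded asyncio event loop (again with assumed names): no lock is needed because only one coroutine runs at a time, and each client costs a small coroutine object instead of a thread stack.

    import asyncio

    stats = {}   # safe to touch without a lock inside the single event-loop thread

    async def handle(reader, writer):
        addr = writer.get_extra_info("peername")
        while data := await reader.read(1024):
            stats[addr] = stats.get(addr, 0) + len(data)   # no lock required
            writer.write(data)
            await writer.drain()
        writer.close()
        await writer.wait_closed()

    async def main(host="127.0.0.1", port=9000):
        server = await asyncio.start_server(handle, host, port)
        async with server:
            await server.serve_forever()

    if __name__ == "__main__":
        asyncio.run(main())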
However, multi-core machines are now quite common, so part of that advantage is lost (you must lock again if you modify a shared data structure from several threads). The best performance is probably achieved by putting a load balancer in front of N asynchronous servers, where N is whatever number of workers is optimal for your environment (often one per core).
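One way to sketch the "N asynchronous servers" idea without an external balancer is to fork one event loop per core and let the kernel spread connections via SO_REUSEPORT (Linux-specific; a front-end such as nginx or HAProxy plays the same role elsewhere). The names and port below are assumptions for illustration:

    import asyncio
    import multiprocessing
    import os

    async def handle(reader, writer):
        data = await reader.read(1024)
        writer.write(data)
        await writer.drain()
        writer.close()
        await writer.wait_closed()

    def worker(host, port):
        async def main():
            # reuse_port lets every worker bind the same port (Linux/BSD only)
            server = await asyncio.start_server(handle, host, port, reuse_port=True)
            async with server:
                await server.serve_forever()
        asyncio.run(main())

    if __name__ == "__main__":
        n = os.cpu_count() or 1        # N workers; tune for your workload
        procs = [multiprocessing.Process(target=worker, args=("127.0.0.1", 9000))
                 for _ in range(n)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()

Because the workers are separate processes, each keeps the single-threaded "no locks" property internally; anything truly shared has to go through an external store or explicit IPC.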
The downside of the asynchronous approach is that, depending on your tools, the code can be fairly ugly and hard to follow, and that if a computation is non-trivial and a bug sends your handler into an endless loop, the entire asynchronous server stops responding (so you should probably add a watchdog timer).
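One simple form such a watchdog can take (a sketch under my own assumptions, not a standard asyncio facility) is a plain thread that checks a heartbeat timestamp which only advances while the event loop is still scheduling coroutines:

    import asyncio
    import threading
    import time

    last_beat = time.monotonic()

    async def heartbeat(interval=1.0):
        global last_beat
        while True:
            last_beat = time.monotonic()   # only advances while the loop is alive
            await asyncio.sleep(interval)

    def watchdog(timeout=5.0):
        while True:
            time.sleep(timeout)
            if time.monotonic() - last_beat > timeout:
                # the loop is stuck: log it, dump stacks, or restart the process
                print(f"event loop blocked for more than {timeout} seconds")

    async def main():
        threading.Thread(target=watchdog, daemon=True).start()
        asyncio.create_task(heartbeat())
        await asyncio.Event().wait()       # stand-in for the real server

    if __name__ == "__main__":
        asyncio.run(main())

In production you would typically restart the worker process instead of just printing, and let the balancer route traffic to the remaining workers in the meantime.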