Is parallel computing important for web development?

Let's say I have a web application running on S servers with an average of C cores each. My application processes an average of R requests at any given time. Assuming R is about 10 times larger than S * C, wouldn't the benefit of spreading the work of a single query across multiple cores be minimal, since each core is already handling about 10 queries?
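
To make my arithmetic concrete (the numbers here are made up, purely to illustrate the ratio):

    # Illustrative numbers only, not a real deployment.
    S = 4              # servers
    C = 8              # cores per server
    R = 10 * S * C     # ~320 requests in flight at any moment

    requests_per_core = R / (S * C)
    print(requests_per_core)   # -> 10.0, each core already juggles ~10 queries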

If I'm right, why does this guy say that concurrency is so important for the future of Python as a language for web development?

I can see several reasons why my argument might be wrong. Perhaps the application receives some very complex queries that could genuinely use more cores than a single request gets. Or perhaps there is a big variation in query complexity, so one core may be unlucky and get 10 consecutive complex queries, with the result that some of them take much longer than is reasonable. Given that the guy who wrote the essay above is much more experienced than I am, I think there is a significant chance I am wrong about this, but I would like to know why.

+4
5 answers

In the hypothetical scenario you describe, with roughly 10 in-flight requests per core, reasonable request-to-core dispatching (perhaps even the simplest round-robin load balancing) works just fine, as long as each request lives out its entire life cycle on a single core.
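
For instance, a minimal sketch of that kind of dispatching (the worker count and handler below are made up; a real server would use its own worker pool and sockets):

    from itertools import cycle
    from multiprocessing import Pool

    NUM_CORES = 4  # assumption: one worker process per core

    def handle_request(request):
        # Each request lives out its whole life cycle inside one worker.
        return f"handled {request}"

    if __name__ == "__main__":
        # One single-worker pool per core; cycle() gives the simplest
        # possible round-robin dispatch across them.
        pools = [Pool(1) for _ in range(NUM_CORES)]
        dispatcher = cycle(pools)
        results = [next(dispatcher).apply_async(handle_request, (f"req-{i}",))
                   for i in range(40)]
        print([r.get() for r in results[:3]])
        for p in pools:
            p.close()
            p.join()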

The point is that this scenario is only ONE possibility - heavy requests that can really benefit (in terms of lower latency) from marshaling multiple cores per request are certainly an alternative. I suspect your scenario is the more common one on today's web, but it would be nice to handle both kinds, plus "batch-like" background processing... especially since the number of cores (as opposed to the speed of each core) is what keeps increasing these days, and will keep increasing.
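
And a rough sketch of the opposite case - one heavy request whose work is spread across several cores to cut its latency (the crunch function and chunking are purely illustrative):

    from multiprocessing import Pool

    def crunch(chunk):
        # Stand-in for the CPU-heavy part of a single large request.
        return sum(x * x for x in chunk)

    def handle_heavy_request(data, workers=4):
        # Split one request's work across several cores, then combine.
        chunk_size = max(1, len(data) // workers)
        chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
        with Pool(workers) as pool:
            return sum(pool.map(crunch, chunks))

    if __name__ == "__main__":
        print(handle_heavy_request(list(range(1_000_000))))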

Far be it from me to dispute the wisdom of Jacob Kaplan-Moss, but I'm used to getting very good concurrency, at my employer, in nicer and more explicit ways than the one he seems to be advocating - mapreduce for batch-like jobs, hash-based distribution for enlisting N backends to work on 1 request, and so on.

Perhaps I just don't have enough real-world experience with (say) Erlang, Scala, or Haskell's new software transactional memory to see how wonderfully they scale to exploiting tens or hundreds of thousands of cores on low-QPS, high-work-per-query workloads... but it seems to me that the silver bullet for that scenario (minus the relatively limited subset of cases where you can turn to mapreduce, pregel, sharding, and the like) has not yet been invented in any language. With an explicit, carefully thought-out architecture, Python is certainly no worse than Java, C# or C++ at handling such scenarios, at least in my experience.
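
As a toy illustration of the hash-distribution idea (the backend addresses below are placeholders, not a real API):

    import hashlib

    # Placeholder backend addresses; in reality these would be real services.
    BACKENDS = [f"backend-{i}.internal:9000" for i in range(8)]

    def pick_backend(key: str) -> str:
        # A stable hash, so the same key always lands on the same backend;
        # one request's keys can be fanned out across the backends this way.
        digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
        return BACKENDS[int(digest, 16) % len(BACKENDS)]

    print(pick_backend("user:12345"))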

+5

Not any time soon, in my estimation. Most individual web requests take well under a second to serve. In light of that, it makes little sense to split the work of a single web request across cores; it makes more sense to distribute whole requests across all the cores, which is exactly what web servers are capable of and are already doing.

+4

Caveat: I only looked at the "Concurrency" section, which seems to be what you are referring to. The argument appears to be (and it is not new, of course):

  • Python threads do not run in parallel because of the GIL.
  • A system with many cores needs that many backends to keep them busy (in practice you probably want at least 2×N threads or processes).
  • Systems keep gaining more cores: typical PCs already have four, and affordable server systems with 128 or more cores are probably just around the corner.
  • Running 256 separate Python processes means nothing is shared; the entire application and any loaded data are replicated in every process, which wastes a massive amount of memory.

That last point is the flaw in the logic. It is true that if you start 256 Python backends naively, nothing is shared. But that is not a fact of nature; it is simply the wrong way to start many backend processes.

The right way is to load your entire application (all the Python modules you depend on, and so on) into a single master process. That master process then forks backend processes to handle requests. They are separate processes, but standard copy-on-write memory management means that all the read-only data loaded by the master is shared. Everything the master preloaded is shared among all the workers, even though they are separate processes.

(Of course, copy-on-write means that if a worker writes to the data, it gets its own private copy of the affected pages, but things like compiled Python bytecode should not change after loading.)
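
A minimal sketch of that pre-fork pattern using only the standard library (the application import and request loop are placeholders; a real server would handle sockets, signals and worker supervision properly):

    import os
    import sys

    # Placeholder: preload the whole application here in the master, so its
    # code and read-only data live in pages the forked workers will share.
    # import myapp   # hypothetical application package

    NUM_WORKERS = 4  # e.g. one per core

    def serve_forever():
        # Placeholder request loop; a real worker would accept connections.
        print(f"worker {os.getpid()} ready")

    if __name__ == "__main__":
        children = []
        for _ in range(NUM_WORKERS):
            pid = os.fork()        # POSIX only; not available on Windows
            if pid == 0:           # child: becomes a backend worker
                serve_forever()
                sys.exit(0)
            children.append(pid)   # parent: keeps supervising
        for pid in children:
            os.waitpid(pid, 0)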

I don't know whether there are Python-specific issues that prevent this, but if so, they are implementation details that need to be fixed. This approach is much easier than trying to eliminate the GIL, and it also sidesteps the traditional locking and threading problems. They are not as bad in this use case as in some other languages and workloads - there is very little interaction or locking between requests - but they never disappear entirely, and race conditions in Python are just as painful to track down as they are in any other language.

+1

One thing you are overlooking is that a web request is not one sequential series of instructions that involves only the CPU.

A typical web request handler may need to do some CPU-bound computation, then read some configuration data from disk, then ask the database server for some records that have to come back over the network, and so on. CPU usage may be low, yet the request can still take a non-trivial amount of time because of all the waiting on I/O between the steps.

I believe that even with the GIL, Python can run other threads while one thread is waiting on I/O. (Other processes certainly can.) However, Python threads are not like Erlang processes: start enough of them and they will start to hurt.
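
A rough illustration of threads overlapping their I/O waits despite the GIL (the URL is a placeholder; any slow I/O such as disk or database access behaves similarly):

    import threading
    import time
    from urllib.request import urlopen

    URLS = ["https://example.com/"] * 5  # placeholder endpoints

    def fetch(url):
        # While this thread blocks inside urlopen(), the GIL is released,
        # so the other threads are free to run.
        with urlopen(url) as resp:
            resp.read()

    start = time.time()
    threads = [threading.Thread(target=fetch, args=(u,)) for u in URLS]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Total time is close to one request's latency, not five times it,
    # because the waits overlap even though only one thread runs Python
    # bytecode at a time.
    print(f"fetched {len(URLS)} urls in {time.time() - start:.2f}s")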

Another problem is memory. C libraries are shared between processes, but (AFAIK) Python modules are not. So running 10 Python processes may hide the I/O waits, but now you have 10 copies of every Python module loaded in memory.

I don't know how significant either of these effects is, but they complicate the picture well beyond "R > 10 * S * C". There is still work to be done in the Python world to address them, because they are not easy problems.

+1

In that article he seems to single out the GIL as the obstacle to concurrency in Python web applications, which I just don't understand. As you grow larger, you eventually add another server, and GIL or no GIL, it makes no difference - you have multiple machines.

If he is talking about squeezing more out of a single machine, then I don't think it matters much, especially for large-scale distributed computing - separate machines do not share a GIL. And indeed, if you have many computers in a cluster, there are many reasons to prefer several medium-sized servers over one super-server.

If he has in mind better support for functional and asynchronous approaches, then I somewhat agree, but that seems tangential to his "we need better concurrency" argument. Python has this now (which he admits), but apparently it isn't good enough (all because of the GIL, of course). Honestly, this reads more like an attack on the GIL than a case for concurrency in web development.

An important point about concurrency and web development is that concurrency is hard. The beauty of something like PHP is that there is no concurrency: you have one process, and you are stuck inside that process. It is simple and easy. You don't have to worry about any concurrency problems, and suddenly programming is much easier.

0
