The threaded.c program was tested using http_load. The program runs behind nginx, and only one instance of it is running. If the requests were serviced sequentially, I would expect 20 requests to take 40 seconds (about 2 seconds each), even if they are sent in parallel. Here are the results (I used the same numbers as Andrew Bradford - 20, 21 and 40) -
20 Requests, 20 in parallel, took 2 seconds -
$ http_load -parallel 20 -fetches 20 request.txt
20 fetches, 20 max parallel, 6830 bytes, in 2.0026 seconds
341.5 mean bytes/connection
9.98701 fetches/sec, 3410.56 bytes/sec
msecs/connect: 0.158 mean, 0.256 max, 0.093 min
msecs/first-response: 2001.5 mean, 2002.12 max, 2000.98 min
HTTP response codes:
  code 200 -- 20
21 Requests, 20 in parallel, took 4 seconds -
$ http_load -parallel 20 -fetches 21 request.txt
21 fetches, 20 max parallel, 7171 bytes, in 4.00267 seconds
341.476 mean bytes/connection
5.2465 fetches/sec, 1791.55 bytes/sec
msecs/connect: 0.253714 mean, 0.366 max, 0.145 min
msecs/first-response: 2001.51 mean, 2002.26 max, 2000.86 min
HTTP response codes:
  code 200 -- 21
40 requests, 20 in parallel, took 4 seconds -
$ http_load -parallel 20 -fetches 40 request.txt
40 fetches, 20 max parallel, 13660 bytes, in 4.00508 seconds
341.5 mean bytes/connection
9.98732 fetches/sec, 3410.67 bytes/sec
msecs/connect: 0.159975 mean, 0.28 max, 0.079 min
msecs/first-response: 2001.86 mean, 2002.62 max, 2000.95 min
HTTP response codes:
  code 200 -- 40
This shows that even though the FCGI_MAX_CONNS, FCGI_MAX_REQS and FCGI_MPXS_CONNS values are hardcoded, requests are serviced in parallel.
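The three timings fit a simple batch model, sketched below. It assumes each request takes about 2 seconds (the msecs/first-response figure) and that the application can work on at least as many requests at once as http_load keeps open. The helper name expected_seconds is made up for illustration and is not part of threaded.c or http_load.

```c
#include <assert.h>

/* Hypothetical helper: predicted wall-clock time when up to `parallel`
 * requests are in flight at once and each takes `secs_per_request`.
 * Assumes the server really does handle `parallel` requests concurrently. */
double expected_seconds(int fetches, int parallel, double secs_per_request)
{
    assert(fetches >= 0 && parallel > 0);
    /* ceil(fetches / parallel): number of back-to-back batches needed */
    int batches = (fetches + parallel - 1) / parallel;
    return batches * secs_per_request;
}
```

With 2 seconds per request, expected_seconds(20, 20, 2.0), expected_seconds(21, 20, 2.0) and expected_seconds(40, 20, 2.0) give 2, 4 and 4 seconds, matching the measurements, while a fully sequential server corresponds to expected_seconds(20, 1, 2.0), which gives 40 seconds.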
When nginx receives several requests, it puts them all into the FCGI application's queue back to back. It does not wait for a response to the first request before sending the second. In the FCGI application, while one thread is serving the first request, another thread does not wait for it to finish; it picks up the second request and starts working on it, and so on.
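The behaviour described above can be imitated with a small pthread simulation. This is not the threaded.c code and it makes no FastCGI library calls: the queue is just a counter guarded by a mutex, and "processing" a request is a sleep. The names run_pool, worker and completed are assumptions chosen for this sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stddef.h>
#include <time.h>

#define MAX_WORKERS 64

static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static int next_request;    /* next unclaimed request in the simulated queue */
static int total_requests;  /* number of requests queued for this run */
static int service_ms;      /* simulated processing time per request */
int completed;              /* requests finished so far */

/* Each worker claims the next queued request without waiting for any other
 * worker to finish, then "processes" it by sleeping. */
static void *worker(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&queue_lock);
        int id = (next_request < total_requests) ? next_request++ : -1;
        pthread_mutex_unlock(&queue_lock);
        if (id < 0)
            return NULL;    /* queue drained */

        struct timespec ts = { .tv_sec = service_ms / 1000,
                               .tv_nsec = (service_ms % 1000) * 1000000L };
        nanosleep(&ts, NULL);

        pthread_mutex_lock(&queue_lock);
        completed++;
        pthread_mutex_unlock(&queue_lock);
    }
}

/* Runs `requests` simulated requests on `workers` threads and returns the
 * elapsed wall-clock time in milliseconds. */
long run_pool(int workers, int requests, int ms)
{
    pthread_t tids[MAX_WORKERS];
    struct timespec t0, t1;
    int started = 0;

    if (workers > MAX_WORKERS)
        workers = MAX_WORKERS;
    next_request = 0;
    completed = 0;
    total_requests = requests;
    service_ms = ms;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < workers; i++)
        if (pthread_create(&tids[started], NULL, worker, NULL) == 0)
            started++;
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    return (t1.tv_sec - t0.tv_sec) * 1000L + (t1.tv_nsec - t0.tv_nsec) / 1000000L;
}
```

Calling run_pool(20, 40, 2000) mirrors the third http_load run under these assumptions: 20 threads drain 40 two-second requests in two rounds, so the elapsed time should come out near 4000 ms rather than the 80000 ms a single thread would need.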
So, the only time you lose is the time it takes to read the request from the queue. This time is usually negligible compared to the time it takes to process the request.