Nginx <=> php-fpm: UNIX socket gives errors, TCP connection is slow

I am running nginx with php-fpm on a high-traffic site. Nginx talks to php-fpm over TCP/IP, with both nginx and the php-fpm pools running on the same server.

When nginx and the php-fpm pools talk over TCP/IP, page loads hang for several (5-10) seconds before anything happens at all; once the response finally starts, it downloads almost instantly. The php-fpm status page shows that the listen queue is full, so I assume requests sit waiting for a while before they are processed. Netstat shows a huge number (20k+) of connections in TIME_WAIT state; I don't know whether that is related, but it seemed relevant.
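
For context, the TCP/IP setup is the usual loopback arrangement, roughly like the following (port 9000 and the exact directives are illustrative, not copied from my real config):

    # nginx vhost - TCP/IP variant
    location ~ \.php$ {
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_pass 127.0.0.1:9000;
    }

    ; php-fpm pool - matching listen directive
    listen = 127.0.0.1:9000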

When I let nginx and php-fpm communicate over a UNIX socket instead, the delay before the page actually starts loading drops to almost zero, and the time until the page finishes loading in my browser is about 1000 times smaller. The only problem with the UNIX socket is that it fills the logs with errors like this:

 *3377 connect() to unix:/dev/shm/.php-fpm1.sock failed (11: Resource temporarily unavailable) while connecting to upstream, client: 122.173.178.150, server: nottherealserver.fake, request: "GET somerandomphpfile HTTP/1.1", upstream: "fastcgi://unix:/dev/shm/.php-fpm1.sock:", host: "nottherealserver.fake", referrer: "nottherealserver.fake" 
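
For reference, the UNIX socket variant that produces these errors is set up roughly like this; only the socket path is taken from the error above, the owner/group lines are assumptions on my part:

    # nginx vhost - UNIX socket variant
    location ~ \.php$ {
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_pass unix:/dev/shm/.php-fpm1.sock;
    }

    ; php-fpm pool - matching listen directive
    listen = /dev/shm/.php-fpm1.sock
    listen.owner = www-data
    listen.group = www-data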

My two questions are:

Does anyone know why the TCP/IP method takes so long before it actually connects to the php-fpm server?

Why does the UNIX socket produce these errors when used instead of TCP/IP?

What I tried:

Set net.ipv4.tcp_tw_recycle and net.ipv4.tcp_tw_reuse to 1, which reduced the number of TIME_WAIT connections (from 30k+ to 20k+); see the sketch after this list.

Increased net.core.somaxconn from the default of 128 to 1024 (I also tried higher values, but still got the same error with the UNIX socket).

Increased the maximum number of open files.
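
Concretely, the kernel-side changes above amount to something like this (the nofile value is just an example, not the exact number I used):

    # /etc/sysctl.conf (applied with sysctl -p)
    net.ipv4.tcp_tw_reuse = 1
    net.ipv4.tcp_tw_recycle = 1
    net.core.somaxconn = 1024

    # /etc/security/limits.conf - raising the open-file limit (example value)
    * soft nofile 65535
    * hard nofile 65535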

Probably also very relevant: I tried lighttpd + FastCGI, and it shows the same long delay before the request is finally processed. MySQL is not particularly busy, so it should not be the reason for the long wait. Disk wait is at 0% (SSD), so a busy disk is not the culprit either.

Hope someone has found a fix for this problem and is ready to share :)

1 answer

Answering my own question since the problem is solved (not sure if this is the right way to do it).

My problem was that APC caching wasn't working at all. It was installed, configured, and enabled, but it never added anything to its cache. After switching from APC to XCache, there was a huge drop in load and in load times. I still don't know why APC wasn't doing anything, but for the moment I'm just glad the problem has been resolved :)
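
In case it helps anyone debugging the same thing: one quick way to see whether APC is actually caching files is something like the following (assuming the apc extension is loaded and apc.enable_cli = 1; the apc.php status page that ships with APC works too):

    # If the returned cache info stays empty, APC is not caching any files at all.
    php -r 'print_r(apc_cache_info());'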

Thanks for all the input from you guys!
