We have a large EC2 instance with 32 cores, currently running Nginx, Tornado and Redis, serving an average of 5K requests per second. Everything seems to be working fine, but CPU utilization is already at 70% and we need to support even more requests. One idea is to replace Tornado with uWSGI, because we do not actually use Tornado's asynchronous features.
Our application consists of a single function: it receives a JSON payload (~4 KB), does some blocking but very fast work (Redis), and returns JSON. The request flow is as follows (a sketch of the handler follows the list):
- Proxy the HTTP request to one of the Tornado instances (Nginx)
- Parse the HTTP request (Tornado)
- Read the POST body (compressed JSON) and convert it to a Python dictionary (Tornado)
- Fetch data (with locks) from Redis on the same machine (redis-py with hiredis)
- Process the data (Python 3.4)
- Update Redis on the same machine (redis-py with hiredis)
- Prepare the compressed JSON response (Python 3.4)
- Send the response back to the proxy (Tornado)
- Send the response to the client (Nginx)
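For concreteness, here is a minimal sketch of that function as a plain WSGI app, the kind of callable we would point uWSGI at. The zlib compression, Redis socket path, and key names are illustrative assumptions, not our actual code:

    import json
    import zlib

    import redis

    # Redis runs on the same machine; redis-py uses hiredis automatically if installed.
    # The Unix socket path is an assumption for illustration.
    r = redis.StrictRedis(unix_socket_path='/tmp/redis.sock')

    def application(environ, start_response):
        # Read and decompress the POST body (assuming zlib-compressed JSON).
        length = int(environ.get('CONTENT_LENGTH') or 0)
        data = json.loads(zlib.decompress(environ['wsgi.input'].read(length)).decode('utf-8'))

        # Two fast, blocking Redis round-trips (hypothetical key names).
        old = r.get(data['key'])
        r.set(data['key'], json.dumps(data['new_value']))

        # Compress the JSON response.
        body = zlib.compress(json.dumps({'old': old.decode('utf-8') if old else None}).encode('utf-8'))
        start_response('200 OK', [('Content-Type', 'application/json'),
                                  ('Content-Length', str(len(body)))])
        return [body]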
We thought a speed improvement could be achieved with the uwsgi protocol: we could move Nginx to a separate server and proxy all requests to uWSGI over the uwsgi protocol. But after trying every configuration we could think of and tweaking OS parameters, we still cannot get it to work even at the current load. In most cases the Nginx log fills with 499 and 502 errors. In some configurations it simply stopped accepting new requests, as if it had hit some OS limit.
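On the Nginx side, the uwsgi-protocol handoff looks roughly like this (a sketch; the Unix socket path matches the uWSGI config below, and with Nginx on a separate server it would be a host:port pair instead):

    server {
        listen 80;
        location / {
            include uwsgi_params;
            uwsgi_pass unix:/tmp/uwsgi.sock;
        }
    }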
So, as I said: we have 32 cores, 60 GB of free memory and a very fast network. We do nothing heavy, only very fast blocking operations. What is the best strategy in this case: processes, threads, or async? And what OS parameters should be set?
Current configuration:
    [uwsgi]
    master = 2
    processes = 100
    socket = /tmp/uwsgi.sock
    wsgi-file = app.py
    daemonize = /dev/null
    pidfile = /tmp/uwsgi.pid
    listen = 64000
    stats = /tmp/stats.socket
    cpu-affinity = 1
    max-fd = 20000
    memory-report = 1
    gevent = 1000
    thunder-lock = 1
    threads = 100
    post-buffering = 1
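For reference, we launch this with the stock uWSGI binary (the ini filename here is an assumption):

    uwsgi --ini uwsgi.ini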
Nginx config:
    user www-data;
    worker_processes 10;
    pid /run/nginx.pid;

    events {
        worker_connections 1024;
        multi_accept on;
        use epoll;
    }
OS configuration:
    $ sysctl net.core.somaxconn
    net.core.somaxconn = 64000
I know these limits are too high; I have started trying every possible value.
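For context, somaxconn is not the only limit in play here. These are the kinds of knobs I have been varying along with it (example values, not recommendations):

    # Cap on the listen() backlog (the app's own listen setting must fit under it).
    sysctl -w net.core.somaxconn=64000
    # Backlog for half-open (SYN_RECV) connections.
    sysctl -w net.ipv4.tcp_max_syn_backlog=64000
    # Ephemeral ports for outgoing connections (relevant when Nginx reaches uWSGI over TCP).
    sysctl -w net.ipv4.ip_local_port_range="10000 65535"
    # System-wide file descriptor limit; uWSGI's max-fd must fit under it.
    sysctl -w fs.file-max=200000
    # Per-process descriptor limit for the shell that starts the servers.
    ulimit -n 200000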
UPDATE
I ended up with the following configuration (uWSGI magic variables: %k expands to the number of CPU cores, %d to the directory containing the ini file, %c to that directory's name):
    [uwsgi]
    chdir = %d
    master = 1
    processes = %k
    socket = /tmp/%c.sock
    wsgi-file = app.py
    lazy-apps = 1
    touch-chain-reload = %dreload
    virtualenv = %d.env
    daemonize = /dev/null
    pidfile = /tmp/%c.pid
    listen = 40000
    stats = /tmp/stats-%c.socket
    cpu-affinity = 1
    max-fd = 200000
    memory-report = 1
    post-buffering = 1
    threads = 2
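With lazy-apps and touch-chain-reload, deploys become graceful one-worker-at-a-time reloads, triggered by touching the reload file (the path below is a hypothetical expansion of %dreload):

    touch /srv/app/reload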