Node.js + Socket.IO scaling with Redis + cluster

I'm currently faced with the task of scaling a Node.js application on Amazon EC2. As I understand it, the way to do this is to have each child server use all available processes via cluster, and to use sticky sessions so that every user connecting to the server is handled by the worker that "remembers" their data from previous sessions.

After that, the next step, as far as I know, is to deploy as many server instances as necessary and use nginx to load balance between all of them, again with sticky sessions so the balancer knows which "child" server each user's data lives on.

So, when a user connects to the server, is this what happens?

Client connects → find/select an instance → find/select a process → Socket.IO handshake/connection, etc.

If not, please help me better understand this load balancing flow. I also don't understand the role of Redis in this setup.

Below is the code I use to spread the load across all CPU cores on a single machine, one Node.js process per core:

    var express = require('express'),
        cluster = require('cluster'),
        net = require('net'),
        sio = require('socket.io'),
        sio_redis = require('socket.io-redis');

    var port = 3502,
        num_processes = require('os').cpus().length;

    if (cluster.isMaster) {
        // This stores our workers. We need to keep them to be able to reference
        // them based on source IP address. It's also useful for auto-restart,
        // for example.
        var workers = [];

        // Helper function for spawning a worker at index 'i'.
        var spawn = function(i) {
            workers[i] = cluster.fork();

            // Optional: restart the worker on exit. (Note: the worker's 'exit'
            // event passes (code, signal), not the worker itself.)
            workers[i].on('exit', function(code, signal) {
                console.log('respawning worker', i);
                spawn(i);
            });
        };

        // Spawn workers.
        for (var i = 0; i < num_processes; i++) {
            spawn(i);
        }

        // Helper function for getting a worker index based on IP address.
        // This is a hot path, so it should be really fast. The way it works
        // is by converting the IP address to a number by removing the dots,
        // then reducing it modulo the number of slots we have.
        //
        // Compared against "real" hashing (from the sticky-session code) and
        // "real" IP number conversion, this function is on par in terms of
        // worker index distribution, only much faster.
        var worker_index = function(ip, len) {
            var s = '';
            for (var i = 0, _len = ip.length; i < _len; i++) {
                if (ip[i] !== '.') {
                    s += ip[i];
                }
            }
            return Number(s) % len;
        };

        // Create the outside-facing server listening on our port.
        var server = net.createServer({ pauseOnConnect: true }, function(connection) {
            // We received a connection and need to pass it to the appropriate
            // worker. Get the worker for this connection's source IP and pass
            // it the connection.
            var worker = workers[worker_index(connection.remoteAddress, num_processes)];
            worker.send('sticky-session:connection', connection);
        }).listen(port);
    } else {
        // Note we don't use a port here because the master listens on it for us.
        var app = new express();

        // Here you might use middleware, attach routes, etc.

        // Don't expose our internal server to the outside.
        var server = app.listen(0, 'localhost'),
            io = sio(server);

        // Tell Socket.IO to use the Redis adapter. By default, the Redis
        // server is assumed to be on localhost:6379. You don't have to
        // specify host and port explicitly unless you want to change them.
        io.adapter(sio_redis({ host: 'localhost', port: 6379 }));

        // Here you might use Socket.IO middleware for authorization etc.

        console.log("Listening");

        // Listen to messages sent from the master. Ignore everything else.
        process.on('message', function(message, connection) {
            if (message !== 'sticky-session:connection') {
                return;
            }

            // Emulate a connection event on the server by emitting the
            // event with the connection the master sent us.
            server.emit('connection', connection);
            connection.resume();
        });
    }
1 answer

I believe your general understanding is correct, but I'd like to make a few comments:

Load balancing

You're right that one way to load balance is to have nginx balance between the different instances, and within each instance have cluster balance between the worker processes it creates. However, that's just one way, and not necessarily always the best one.

Between instances

First, if you're using AWS anyway, you may want to consider ELB. It was designed specifically for load balancing EC2 instances, and it makes the problem of configuring load balancing between instances trivial. It also provides many useful features and (with Auto Scaling) can make scaling extremely dynamic without requiring any effort on your part.

One feature ELB has that is especially relevant to your question is that it supports sticky sessions out of the box: it's just a matter of ticking a checkbox.

However, I have to add one major caveat: ELB can break Socket.IO in strange ways. If you just use long polling, you should be fine (assuming sticky sessions are enabled), but getting actual websockets to work is somewhere between extremely frustrating and impossible.

Between processes

While there are many alternatives to using cluster, both within Node and outside it, I tend to agree that cluster itself is usually perfectly fine.

However, one case where it doesn't work is when you want sticky sessions behind a load balancer, as you apparently do here.

First off, it should be made explicit that the only reason you even need sticky sessions in the first place is that Socket.IO relies on session data stored in memory between requests (during the handshake for websockets, or essentially throughout for long polling). In general, relying on data stored this way should be avoided as much as possible, for a number of reasons, but with Socket.IO you really have no choice.

Now, that doesn't seem too bad, since cluster can support sticky sessions using the sticky-session module mentioned in the Socket.IO documentation, or the snippet you appear to be using.

The thing is, since these sticky sessions are based on the client's IP address, they won't work behind a load balancer, be it nginx, ELB, or anything else, since all that's visible inside the instance at that point is the load balancer's IP. The remoteAddress your code tries to hash is not actually the client's address at all.

That is, when your Node code tries to act as a load balancer between processes, the IP address it tries to hash will always be the address of the other load balancer, the one balancing between instances. Thus, all requests will end up at the same process, defeating the whole purpose of cluster.
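To make this concrete, here is a small sketch using the same worker_index hash from the snippet above. The IP addresses are hypothetical; the point is that once every connection's remoteAddress is the balancer's own address, every connection hashes to the same worker index:

```javascript
// The hash function from the sticky-session snippet above.
function worker_index(ip, len) {
  var s = '';
  for (var i = 0; i < ip.length; i++) {
    if (ip[i] !== '.') s += ip[i];
  }
  return Number(s) % len;
}

// Direct connections from different clients spread across 4 workers:
console.log(worker_index('203.0.113.7', 4));
console.log(worker_index('198.51.100.42', 4));

// Behind a load balancer, every connection's remoteAddress is the
// balancer's own (hypothetical) internal IP, so every single request
// hashes to the same worker:
var balancerIp = '10.0.0.5';
console.log(worker_index(balancerIp, 4));
console.log(worker_index(balancerIp, 4)); // always the same index
```

Whatever the hash function, the result is the same: one input means one output, so the distribution collapses onto a single worker.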

You can see the details of this issue, and a couple of possible ways to solve it (none of them particularly pretty), in this question.

The Importance of Redis

As I mentioned earlier, once you have multiple instances/processes receiving requests from your users, in-memory storage of session data is no longer sufficient. Sticky sessions are one way to go, though other, arguably better solutions exist, among them a central session store, which Redis can provide. See this post for a fairly comprehensive review of the subject.
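To illustrate the central-store idea, here is a minimal sketch in which a plain Map stands in for Redis (an assumption for brevity; in production you'd use a Redis client, and the session id and shape below are made up):

```javascript
// A Map standing in for a shared Redis instance that all workers,
// on all machines, can reach.
const sharedStore = new Map();

const sessions = {
  // Serialize on write, parse on read, as you would with Redis strings.
  set(sid, data) { sharedStore.set(sid, JSON.stringify(data)); },
  get(sid) {
    const raw = sharedStore.get(sid);
    return raw ? JSON.parse(raw) : null;
  },
};

// Worker 1 handles the login and writes the session...
sessions.set('sid-123', { userId: 42 });

// ...worker 2 handles the next request and still finds it, so the
// request no longer has to land on the same process:
console.log(sessions.get('sid-123')); // → { userId: 42 }
```

The point is purely architectural: once session state lives outside any one process's memory, stickiness stops being a correctness requirement.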

Seeing that your question is about Socket.IO, though, I'll assume you had in mind Redis's specific importance for websockets, so:

If you have multiple Socket.IO servers (instances/processes), a given user will be connected to only one of those servers at any given time. However, any of the servers may, at any time, wish to emit a message to that user, or even broadcast to all users, regardless of which server they're currently connected to.

To that end, Socket.IO supports "adapters", of which the Redis adapter is one, which allow the different Socket.IO servers to communicate among themselves. When one server emits a message, it goes into Redis, and then all the servers see it (via Redis Pub/Sub) and can deliver it to their own users, making sure the message reaches its target.

This, again, is explained in the Socket.IO documentation regarding multiple nodes, and perhaps even better in this Stack Overflow answer.

