Scalability tips for a large gaming site

I am building a website where players can play a turn-based game for virtual credits (think of a poker site, but different). This is the setup I came up with:

  • One data server containing all player accounts with related data (database + service). The database and the API can be split across two servers if that helps.
  • One or more web servers that serve the website, connecting to the data server when necessary.
  • One lobby server where players can find each other and set up games (could be several, but that is less user-friendly).
  • Several game servers on which the games run (all rules etc. live on the server; the client is just a remote control and viewer), with one load balancer.
  • The game client.

The client will be built in Flash, and the web server will use PHP. Everything else is Java.

Communication

  • The player logs in on the site. The web server sends the username / password to the data server, which creates a session key (stored in a cookie, for example).
  • The player launches the client. The client connects to the lobby server, passing the session key. The lobby server verifies this key with the data server.
  • When a lobby has been set up and the game starts, the lobby server obtains a game server from the load balancer and sets up the game on that game server.
  • The lobby server tells the clients to connect to the game server, and the game is played.
  • When the game is over, the game server notifies the lobby server. The lobby server checks the accounts and updates the credits on the data server.
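The session-key handshake in the first two steps could be sketched roughly as follows. This is only a minimal in-memory illustration: `SessionRegistry`, `createSession`, `resolve`, and `endSession` are hypothetical names, the password check is assumed to have happened already, and a real data server would persist sessions and expire them.

```java
import java.security.SecureRandom;
import java.util.Base64;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical session registry that would live on the data server. */
final class SessionRegistry {
    private final SecureRandom random = new SecureRandom();
    // session key -> player name; kept in memory only for this sketch
    private final Map<String, String> playerByKey = new ConcurrentHashMap<>();

    /** Called on behalf of the web server once the password check has passed. */
    String createSession(String playerName) {
        byte[] raw = new byte[16]; // 128 random bits, hard to guess
        random.nextBytes(raw);
        String key = Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
        playerByKey.put(key, playerName);
        return key;
    }

    /** Called on behalf of the lobby server when a client presents its key. */
    Optional<String> resolve(String key) {
        if (key == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(playerByKey.get(key));
    }

    /** Invalidates the key, for example on logout. */
    void endSession(String key) {
        if (key != null) {
            playerByKey.remove(key);
        }
    }
}
```

The lobby server would call `resolve` with the key the client passed in and refuse the connection when the result is empty.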

Protocols

  • Java to Java: RMI
  • PHP or Flash to Java: a custom binary protocol over a socket. This protocol supports closing an idle socket while keeping the virtual connection alive, so that it can be resumed later.
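The custom protocol itself is not described here, so the following is only a guess at what one frame of such a protocol might look like: a length-prefixed message whose layout (4-byte length, 1-byte type, payload), type codes, and size limit are all assumptions made for illustration. A resume message would carry the session key, so the server can reattach a fresh socket to an existing virtual connection.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical wire frame: 4-byte length, 1-byte type, then the payload. */
final class Frame {
    static final byte TYPE_RESUME = 1; // assumed: payload is the session key
    static final byte TYPE_MOVE = 2;   // assumed: payload is a game move
    static final int MAX_PAYLOAD = 64 * 1024; // assumed upper bound

    final byte type;
    final byte[] payload;

    Frame(byte type, byte[] payload) {
        this.type = type;
        this.payload = payload;
    }

    void writeTo(DataOutputStream out) throws IOException {
        out.writeInt(payload.length); // big-endian length prefix
        out.writeByte(type);
        out.write(payload);
        out.flush();
    }

    static Frame readFrom(DataInputStream in) throws IOException {
        int length = in.readInt();
        if (length < 0 || length > MAX_PAYLOAD) {
            throw new IOException("bad frame length: " + length);
        }
        byte type = in.readByte();
        byte[] payload = new byte[length];
        in.readFully(payload); // blocks until the whole payload has arrived
        return new Frame(type, payload);
    }

    /** Convenience wrapper for in-memory use. */
    static byte[] encode(Frame frame) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            frame.writeTo(new DataOutputStream(buffer));
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Convenience wrapper for in-memory use. */
    static Frame decode(byte[] bytes) {
        try {
            return readFrom(new DataInputStream(new ByteArrayInputStream(bytes)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

On a real socket, `writeTo` and `readFrom` would be used with streams wrapped around `Socket.getOutputStream()` and `Socket.getInputStream()`.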

If the client gets his way, the site will need to support thousands of simultaneous players. With this information, do you see any bottlenecks in my setup? Personally, I am a little worried about having only one data server, but I am not sure how to split it up. Other considerations, about scalability or anything else, are also welcome.

2 answers

Your architecture contains several individual services that must be up for ANY part of the system to work for ANY user. I believe these are single points of failure (SPOFs).

  • You might want to consider sharding (horizontal partitioning) for your data server.
  • Consider running multiple lobbies on multiple servers. The Flash client can still present them as a single lobby if you want it to. Personally, I do not like playing games with people I cannot talk to because they speak a language I do not understand. Nor do I like joining a lobby server, finding thousands of games, and not knowing anyone. Give the lobby some sensible features (if you put some thought into it, you really can). There is no real use for a lobby with 10,000 people in it. If you still want to go that way, you can try partitioning on the assumption that players filter by certain parameters (opponent level, type of game, etc.), and split the lobby along one or even several of those criteria.
  • The load balancer does not really need a physical server of its own, I suppose. Why not run it on all lobby servers? All it needs to know is the availability of each game server. Assuming you have 10,000 game servers (which I think is a hell of a lot in this case) and a refresh interval of 1 second (far more frequent than needed here), all you synchronize is 10,000 integers per second (assuming you can represent availability as a number, which I suppose you can). And if you come up with something better than connecting every game server to every lobby server, it does not even require many connections on any one machine.
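To put a number on the last point: with 32-bit integers, 10,000 updates per second amount to 10,000 × 4 = 40,000 bytes, or about 40 KB per second of payload before protocol overhead. A balancer of that kind could be as small as the sketch below, which assumes that availability is reported as a count of free game slots; `GameServerDirectory`, `report`, and `pick` are made-up names, not part of any existing library.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical availability table that each lobby server could keep in memory. */
final class GameServerDirectory {
    // server id -> number of free game slots last reported by that server
    private final Map<String, Integer> freeSlots = new ConcurrentHashMap<>();

    /** Called whenever a game server reports its current availability. */
    void report(String serverId, int free) {
        freeSlots.put(serverId, free);
    }

    /** Returns the server with the most free slots, or empty if all are full. */
    Optional<String> pick() {
        return freeSlots.entrySet().stream()
                .filter(entry -> entry.getValue() > 0)
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey);
    }
}
```

Picking the emptiest server is only one possible policy; the table itself stays a single integer per game server, which is what keeps the synchronization cheap.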

For this type of application, I believe horizontal partitioning is a good idea, because it can be done easily and it adds reliability to the system. Suppose your SPOFs are split rather than duplicated. That is simpler and possibly cheaper. If one part of a SPOF goes down (say, 1 of your 20 independent and physically distributed data servers), that is bad, because 5% of your players are locked out. But it will probably be back up soon. If your SPOF is made redundant instead, it is less likely that anything fails. But if it does fail, EVERYBODY is locked out. That is a problem, because everyone will then try to get back online at once. As soon as your SPOF comes back, it will be hit with a number of requests an order of magnitude higher than usual. And you can still use horizontal partitioning and redundancy at the same time, as suggested for the balancing service.
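As a rough illustration of splitting the data server, the sketch below routes each player to one of N data servers by hashing the player name. The class name `ShardRouter` and the host names are hypothetical, and plain modulo hashing is just the simplest possible scheme: changing the number of servers moves most players to a different shard, which a lookup table or consistent hashing would avoid.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical router that maps a player to one of several data servers. */
final class ShardRouter {
    private final List<String> dataServers;

    ShardRouter(List<String> dataServers) {
        if (dataServers.isEmpty()) {
            throw new IllegalArgumentException("at least one data server is required");
        }
        this.dataServers = new ArrayList<>(dataServers); // defensive copy
    }

    /** Deterministic shard index in the range [0, number of servers). */
    int shardIndex(String playerName) {
        // floorMod keeps the result non-negative even for negative hash codes
        return Math.floorMod(playerName.hashCode(), dataServers.size());
    }

    /** The data server that holds this player's account. */
    String serverFor(String playerName) {
        return dataServers.get(shardIndex(playerName));
    }
}
```

With 20 entries in the list, losing one server affects only the players whose names map to that index, roughly 1 in 20 if the names hash evenly.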

Having worked on several Facebook games, I would say the following:

Think about scalability for thousands of players, but keep in mind that you will need tens of thousands of players before the effort you put into scaling pays off.

That is, plan ahead, but worry about getting your first player before designing a system for thousands of simultaneous players.

I suspect that the setup you described will work very well for your initial user base. While you are building it, avoid things like requiring the login server to talk to the lobby server. Make each server stand on its own; the big thing that will kill you is interdependence between the services.

But the most important thing is to build it in whatever way is most convenient for you. If you end up with enough users to tax your system, that is a very good problem to have. You can hire a DBA to help you work out how to scale once you have that many users.
