Handling 7000+ clients? Multithreading (TCP, high traffic)

I am going to build a network system that can handle 7000+ TCP socket clients, each sending 5 KB/s to the server. I looked at a similar question, where the answer was: "Create 1024 threads to handle 1024 clients." I know there is a method called select(), and I cannot open 7000+ threads to handle 7000+ clients, because my server has only 8 CPUs, which means 7000+ threads would be a big mistake. So now I am thinking of creating ~1000 threads and handling a group of 7 sockets in each thread. But then the question becomes: if I run the same application on a 2-CPU machine, I can't get maximum performance with 1000 threads; I should create (maybe) 500. Conversely, on an 8-CPU machine 1000 threads may not be enough for maximum performance, and I might need (maybe) 2000 threads to handle the sockets. As far as I can tell, the real question is: "How many threads can a given processor handle?" Is that right?

EDIT: I think I can create a profiler that watches the program. That is, each thread logs "I finished my work in X seconds," and the profiler processes these messages and decides whether to create or kill a thread. But how do I find out the status of the threads (or of the CPU)?

+7
c++ multithreading networking sockets
2 answers

There is no way a single server can handle 35 Gb/s of traffic (and even if there is, it will be very expensive).

Here is how I would approach this problem:

  • Understand the API.
  • Understand the protocols and choose the best one for the job.
  • Understand compression, the security aspects (encryption, authentication), and licensing.
  • Figure out how you are going to do load balancing.
  • Write a prototype server and client.
  • Generate some server load and understand the limitations of a single instance.
  • Hide N servers behind load balancers and observe their behavior.

Things you want to think about before writing the first line of code:

  • How are you going to scale it horizontally?
  • What are your performance metrics?

Edit

So this is in KB/s, which is much better :). I would still recommend thinking about load balancing in advance.

There are various technologies and protocols that can help you write an efficient application. From personal experience, I would prefer HTTP over raw TCP: there are many good load balancers for it, and adding compression, encryption, and authentication is a matter of days.

I have also heard that node.js is very fast when your client-request handling is I/O-bound.

The servers I wrote were ASP.NET Web API applications, each of which handled several MB/s. It was trivial to hide them behind load balancers (I used HAProxy and nginx, but of course many others are available). I also worked on a C++ socket server, and its development time was much longer. My point is: if you can use a higher-level language, prefer it to a lower-level one; it will cut development time (in my case by a factor of 10!) and make your life easier.

+12

The first question is: what does the server actually do? Will each client request be CPU-bound or IO-bound?

If it is CPU-bound, then it makes no sense to try to process them all in parallel, since you have no more concurrency than the number of cores on the server. In that case, simply create as many threads as there are cores, and process the inputs one at a time as fast as you can.

If the server process is IO-bound, you need to measure how long each thread spends blocked while handling a client request, which will give you an idea of how many threads it makes sense to create. That is the classic approach, but as others have pointed out, a more modern approach here would be to use an asynchronous programming library; for C++ on Windows that would be PPL.

UPDATE

You seem very interested in staying low-level, so to get to the essence of your initial question: how do you calculate how many threads the cores can support?

First, create wrapper functions for any blocking calls the threads make, which record how long each thread spends blocked. From these metrics you can determine the average occupancy of the threads, and once you know that, getting an approximate optimal thread count is quite simple:

thread_occupancy = (thread_run_time - thread_blocked_time) / thread_run_time
optimal_thread_count = num_cores / thread_occupancy

You probably want to add at least 0.1 (10%) to thread_occupancy to cover context-switching overhead.

But, as others have said, this classic multi-threaded approach only works up to a few dozen threads. Once the OS is scheduling a hundred threads or so, the scheduling overhead grows to the point where adding more threads brings no benefit. The point at which this happens is highly dependent on the system and the software, so you simply need to run some tests to find the sweet spot.

If your operations are so IO-bound that you want to handle hundreds or thousands of requests at the same time, then you have no choice but to process multiple requests per thread using asynchronous processing, which usually means using an asynchronous library. In that case you will typically have one fully occupied thread per core, under the control of the asynchronous library. The library may manage this itself, or you may need to configure it manually, but either way you no longer control the number of threads yourself; even if you measured thread occupancy as in the purely multi-threaded approach, there would be little you could do with that information.

0
