The cost of using select() in different threads of the same process

Question

In my application, I have started using the select() call in several places to monitor various things in my process (network connections, IPC, messaging, files ...). Each call uses its own set of file descriptors; that is, no descriptor is passed to more than one select() call.

This means that in some cases I have something like 5 blocking select() calls in different threads.

Is there a performance penalty for calling select() several times in different threads, compared with using a single call and dispatching the results to the corresponding threads?

Is there actually a limit on the number of pending select() calls?

Is there a tool to measure this?

Since the application is likely to grow even further, I suspect that at some point, if this starts to become problematic, I will have to write a centralized select() that collects all the FDs to monitor and notifies the client threads when data is ready for reading or writing.

So I decided it was better to ask before ...

+4

c select

Gui13 Aug 17 '11 at 14:33


2 answers

A call to select() is handled in the OS, and since, as you said, it blocks rather than polls, it should not degrade the performance of your application. I also do not believe there are any restrictions on its use, aside from the limit on the number of file descriptors your OS lets you open at all, which has nothing to do with select().

0

John Humphreys - w00te Aug 17 '11 at 14:36


Nemo · Accepted Answer · 2011-08-17T14:52:54+0000

There is unlikely to be any performance difference that you would notice.

Inside the kernel, select adds your thread to the wait queue of each descriptor you select on, and puts it to sleep. If you select on n descriptors, your thread is added to n wait queues. When something happens on a descriptor (for example, data arrives on a socket), all threads in its wait queue are woken up.

Selecting on a huge number of descriptors adds you to a huge number of wait queues. After you wake up, your thread has to be removed from all of those wait queues, including the ones on which there was no activity. So on this side, there may be a slight advantage to waiting on a small set of descriptors in each of several threads, rather than on one huge set of descriptors in a single thread.

On the other hand, select itself requires the kernel to iterate over all possible descriptors (every descriptor number below the nfds argument) to see which ones are members of your fd_set. So on this side, there might be a slight advantage to having only one thread calling select...

Overall, I would guess that this is a wash.
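To make the descriptor scan mentioned above concrete, here is a minimal sketch of a single select() call over a small set of descriptors. The helper name wait_readable and its timeout parameter are made up for illustration and are not from the question; the point is only that the first argument to select() is the highest descriptor plus one, which is the range the kernel has to examine.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper (not from the original post): block until one of
 * fds[0..count-1] is readable or timeout_ms elapses.
 * Returns a readable descriptor, or -1 on timeout or error. */
int wait_readable(const int *fds, size_t count, int timeout_ms)
{
    fd_set readable;
    int max_fd = -1;

    FD_ZERO(&readable);
    for (size_t i = 0; i < count; i++) {
        /* select() cannot handle descriptors >= FD_SETSIZE */
        assert(fds[i] >= 0 && fds[i] < FD_SETSIZE);
        FD_SET(fds[i], &readable);
        if (fds[i] > max_fd)
            max_fd = fds[i];
    }

    struct timeval tv = {
        .tv_sec = timeout_ms / 1000,
        .tv_usec = (timeout_ms % 1000) * 1000,
    };

    /* nfds is max_fd + 1: the kernel checks descriptor numbers 0..max_fd
     * for membership in the set, not just the ones we added. */
    if (select(max_fd + 1, &readable, NULL, NULL, &tv) <= 0)
        return -1;

    /* On return the set holds only the descriptors that are ready. */
    for (size_t i = 0; i < count; i++)
        if (FD_ISSET(fds[i], &readable))
            return fds[i];
    return -1;
}
```

Each thread could call something like this with its own array of descriptors. The sketch treats an interrupted call (EINTR) the same as a timeout, which real code would probably want to retry.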

If you intend to deal with a large number of descriptors, you are better off using a more scalable (albeit non-portable) mechanism such as epoll. With epoll, several threads servicing a pool of descriptors should scale very well.
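For comparison, a rough Linux-only sketch of the epoll approach might look like the following. The function names epoll_watch and epoll_wait_one are hypothetical; the underlying calls (epoll_create1, epoll_ctl, epoll_wait) are the standard Linux API. Descriptors are registered once, and each wait returns only the ready ones instead of rescanning a whole set.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: create an epoll instance watching fds[0..count-1]
 * for readability. Returns the epoll descriptor, or -1 on error. */
int epoll_watch(const int *fds, size_t count)
{
    int epfd = epoll_create1(0);
    if (epfd < 0)
        return -1;

    for (size_t i = 0; i < count; i++) {
        struct epoll_event ev = {0};
        ev.events = EPOLLIN;   /* level-triggered "readable" */
        ev.data.fd = fds[i];   /* handed back to us by epoll_wait */
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[i], &ev) < 0) {
            close(epfd);
            return -1;
        }
    }
    return epfd;
}

/* Hypothetical helper: wait up to timeout_ms for one ready descriptor.
 * Returns that descriptor, or -1 on timeout or error. */
int epoll_wait_one(int epfd, int timeout_ms)
{
    struct epoll_event ev;

    assert(epfd >= 0);
    /* The cost depends on the number of ready descriptors,
     * not on the number registered. */
    int n = epoll_wait(epfd, &ev, 1, timeout_ms);
    return n == 1 ? ev.data.fd : -1;
}
```

Several threads can call epoll_wait on the same epoll descriptor. With plain level-triggered registration as above, more than one thread may be woken for the same event, so a real worker pool would likely add EPOLLONESHOT or EPOLLEXCLUSIVE; that choice is left out of this sketch.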
