How to iterate over only active file descriptors from fd_set result from select ()?

So, in my current server implementation, this is currently something like this:

void loop(){ // step 1: clear set fd_set readfds; while(true){ // step 1: FD_ZERO(readfds); // step 2: loop_through_sockets_and_add_active_sockets_to(theset); // step 3: switch(select(FD_SETSIZE, &readfds, 0, 0, &tv)) { case SOCKET_ERROR: patia->receiveEvent(Error, net::getError()); return; case 0: return; } // step 4: loop through sockets and check, using FD_ISSET, which read fd have incoming data. } } 

Now, not clearing fd_set (using FD_SET, FD_CLR when adding / removing channels) would be the best way to do something.

My question is, how can you go through fd_set after select () without checking each member of the set, if it is part of the set, without using FD_ISSET?

I mean, when you have 4000 active connections, whenever incoming data arrives, the above loop must go through a potential of up to 4000 sockets before getting into the correct one. The complexity will be n ^ 2 if all threads are active!

+6
c posix pipe
source share
3 answers

My question is, how can you go through fd_set after select () without checking each member of the set, if it is part of the set, without using FD_ISSET?

You can not.

There is a small optimization in which select() returns the number of ready descriptors, so if you keep a counter of the number that you processed, you can stop when you know that you did everything without waiting for the completion of the set.

I mean, when you have 4000 active connections, whenever incoming data arrives, the above loop must go through a potential of up to 4000 sockets before getting into the correct one. The complexity will be n ^ 2 if all threads are active!

I do not see where you get O (n ^ 2) from. Of course, after returning from select() you will go through the set after processing each finished descriptor along the way. If you have 4000 ready-made I / O descriptors, the overhead of looping through an array of 4000 C memory objects will be pretty minor.

+6
source share

It is very likely that you already have a data structure associated with each open file descriptor, which is select() ed. You will already need a way to refer to this when demultiplexing the fd_set returned from select() .

If the number of file descriptors is significantly less than FD_SETSIZE , you are probably better to iterate over this (for example, only open file descriptors) and use FD_ISSET() to check activity.

+3
source share

You can.
This ISSET() over the set without ISSET() - all ISSET() from the array. But you still have to touch all long in the fd_set.__fds_bits .

 #include<sys/select.h> #include<stdio.h> int main(void) { fd_set fds; FD_ZERO(&fds); //Fill the set. FD_SET(6, &fds);FD_SET(20, &fds);FD_SET(33, &fds);FD_SET(200, &fds); int i; unsigned long *m = (unsigned long *)__FDS_BITS(&fds); int fd=0; for (i = 0; i < sizeof (fd_set) / sizeof (unsigned long); ++i) //can use int, long or long long. Using long because internal structure is long. { fd=sizeof (unsigned long)*i*8; while(m[i]!=0) { fd+=__builtin_ctzl(m[i]); //Get Number of trailing zero bits in long. printf("FD=%d\n",fd); /*Found FD*/ m[i]>>=(__builtin_ctzl(m[i]))+1; ++fd; } } return 0; } 

This works fine for me using gcc (SUSE Linux) 4.6.2


Background

on my system, fd_set looks like this (extraction and simplified /usr/include/sys/select.h ):

 typedef struct { __fd_mask __fds_bits[__FD_SETSIZE/__NFDBITS]; } 

with __fd_mask being typedef to long int .
__FD_SETSIZE/__NFDBITS seems 16 on my system.
So this array is bits (__FD_SETSIZE/__NFDBITS)*sizeof(__fd_mask)*8 .
With __NFDBITS = 8*sizeof(__fd_mask) you see that there is a __FD_SETSIZE bit in this.

Looking at the definitions of actual macros in /usr/include/bits/select.h shows that __fd_bits are used to store fds. When fd n , the nth bit in __fd_bits set to 1.

i386 (among others ) processors have one operation for counting the final zero bits of a number. With this number of __fd_bits zeros, you can easily shift the __fd_bits record by that sum + 1, and you will find the next true bit.

Compared to the input data set cycle of your fd_set, you need a minimum of __FD_SETSIZE/__NFDBITS=16 cycles if you do not optimize the use of the return value of select.

But make sure your processor and compiler support the operation for the selected loop type. When it defaults to NOT using a bit operation, but a complex implementation can get worse.

If this is actually better than a loop, then known fds should be proven.

+3
source share

All Articles