There are several possibilities.
Exceeding FD_SETSIZE
Your code checks for a negative file descriptor, but not for one at or above the upper limit, FD_SETSIZE (typically 1024). Whenever that happens, your code is
- corrupting its own stack
- presenting an empty fd_set to select, which will cause a hang
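A minimal sketch of the missing upper-bound check, assuming a blocking wait on a single descriptor. The helper names fd_fits_in_fd_set and wait_readable and the -2 return code are hypothetical, not taken from your code:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Returns 1 if fd can safely be passed to FD_SET, 0 otherwise. */
static int fd_fits_in_fd_set(int fd)
{
    return fd >= 0 && fd < FD_SETSIZE;
}

/*
 * Hypothetical helper: waits until fd is readable.
 * timeout_ms == -1 means wait forever.
 * Returns select()'s result, or -2 if fd does not fit into an fd_set.
 */
static int wait_readable(int fd, long timeout_ms)
{
    fd_set fds;
    struct timeval tv;

    assert(timeout_ms >= -1);
    if (!fd_fits_in_fd_set(fd))
        return -2;  /* FD_SET would write outside the fd_set */

    FD_ZERO(&fds);
    FD_SET(fd, &fds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    return select(fd + 1, &fds, NULL, NULL, timeout_ms < 0 ? NULL : &tv);
}
```

Reporting the out-of-range case separately from select's own -1 makes a descriptor leak visible instead of letting it show up as a hang.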
Assuming you don't need that many concurrently open file descriptors, the solution is probably to find and remove a file descriptor leak, especially in the code further up the stack that handles the closing of abandoned descriptors.
There is a suspicious comment in your code that indicates a possible leak:
If this comment means that somebody sets m_socket to -1 and hopes that a later recv will detect the closed socket and close it, then who knows, maybe we are closing -1 and not the real socket. (Note the difference between closing at the network level and closing at the file descriptor level; the latter requires a separate close call.)
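One way to rule this out is to make "close the descriptor" and "forget its number" a single step. A minimal sketch, where close_socket is a hypothetical helper and the int it receives stands in for m_socket:

```c
#include <assert.h>
#include <unistd.h>

/*
 * Hypothetical helper: releases the descriptor, then marks it invalid.
 * Returns close()'s result, or 0 if there was nothing to close.
 */
static int close_socket(int *sock)
{
    int rc = 0;

    assert(sock != NULL);
    if (*sock >= 0) {
        rc = close(*sock);  /* frees the slot in the descriptor table */
        *sock = -1;         /* only now forget the number */
    }
    return rc;
}
```

If every place that currently assigns -1 went through such a helper, a descriptor could not be dropped without being closed first.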
This could also be addressed by switching to poll, which has no FD_SETSIZE limit, but there are several other limits imposed by the operating system that make this route quite challenging.
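For comparison, a rough sketch of the same wait done with poll, which takes the descriptor number directly instead of a bitmap. The name read_with_timeout and the -2 timeout code are assumptions made for illustration:

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper: waits up to timeout_ms for fd to become readable,
 * then reads from it. timeout_ms == -1 means wait forever.
 * Returns the read() result, -1 if poll() failed, -2 on timeout.
 */
static ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    assert(fd >= 0);  /* poll() silently ignores negative descriptors */
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    rc = poll(&pfd, 1, timeout_ms);
    if (rc < 0)
        return -1;
    if (rc == 0)
        return -2;
    /* Readable, hung up, or in error: read() will not block here. */
    return read(fd, buf, len);
}
```

This removes the FD_SETSIZE ceiling only; a leak would still eventually hit the per-process descriptor limit.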
Out-of-band data
You say that the server is "sending" data. If that means the data is sent using the send call (as opposed to the write call), use strace to determine the flags argument of send. If the MSG_OOB flag is set, the data arrives as out-of-band data, and your select call will not notice it unless you also pass a copy of fds as the fourth (exceptfds) parameter.
fd_set fds_copy = fds;  /* watch the same descriptors for exceptional conditions */
select(m_socket + 1, &fds, 0, &fds_copy, timeout == -1 ? 0 : &tv);
Process starvation
If the box is heavily overloaded and the server runs without any blocking calls and with real-time priority (use top to check this) while the client does not, the client may be starved.
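Besides top, the scheduling policy can be queried programmatically. A minimal sketch using the POSIX sched_getscheduler call; the helper names are hypothetical, and only SCHED_FIFO and SCHED_RR are treated as real-time here:

```c
#include <assert.h>
#include <sched.h>
#include <sys/types.h>

/* Returns 1 for the POSIX real-time policies, 0 for anything else. */
static int is_realtime_policy(int policy)
{
    return policy == SCHED_FIFO || policy == SCHED_RR;
}

/*
 * Hypothetical helper: returns 1 if pid runs under a real-time policy,
 * 0 if not, -1 if the policy could not be queried.
 * pid == 0 means the calling process.
 */
static int process_is_realtime(pid_t pid)
{
    int policy;

    assert(pid >= 0);
    policy = sched_getscheduler(pid);
    if (policy < 0)
        return -1;
    return is_realtime_policy(policy);
}
```

Calling process_is_realtime with the server's pid and then the client's pid would show whether the two run under different policies.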
Paused process
The client could theoretically have been stopped with SIGSTOP. You would probably know if this were the case: you would have pressed ctrl-Z somewhere, or there would be some specific process controlling the client beyond you simply launching it yourself.
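On Linux, the stopped state is also visible as the letter T in the third field of /proc/<pid>/stat. A sketch under that assumption (Linux only; the helper names stat_state and process_is_stopped are hypothetical):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/*
 * Extracts the state letter from one line of /proc/<pid>/stat, which
 * looks like "pid (comm) state ...". The command name may itself
 * contain ')', so search for the last one.
 * Returns '\0' if the line does not have the expected shape.
 */
static char stat_state(const char *line)
{
    const char *p = strrchr(line, ')');

    if (p == NULL || p[1] != ' ' || p[2] == '\0')
        return '\0';
    return p[2];
}

/* Returns 1 if pid is stopped ('T'), 0 if not, -1 if unknown. */
static int process_is_stopped(pid_t pid)
{
    char path[64];
    char line[512];
    char state;
    FILE *f;

    assert(pid > 0);
    snprintf(path, sizeof path, "/proc/%ld/stat", (long)pid);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fgets(line, sizeof line, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);

    state = stat_state(line);
    if (state == '\0')
        return -1;
    return state == 'T';
}
```

A stopped client can be resumed with kill -CONT followed by its pid.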