Linux socket: how to detect a disconnected network in a client program?

Question

I am debugging a C socket program on Linux. Following the examples available on various websites, I used the following structure:

    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    connect(sockfd, (struct sockaddr *) &serv_addr, sizeof(serv_addr));
    send_bytes = send(sockfd, sock_buff, (size_t)buff_bytes, MSG_DONTWAIT);

I can detect the disconnection when the remote server closes its server program. But if I unplug the Ethernet cable, the send function still returns positive values rather than -1.

How can I check the network connection in the client program, assuming that I cannot change the server?

+6

c linux sockets send

user2052197 Feb 08 '13 at 10:09

4 answers

To detect a remote shutdown, call read(): it returns 0 once the peer has closed the connection.

See this question for more information:

Can the read() function on a connected socket return zero bytes?

+1

Forhad ahmed Feb 08 '13 at 22:21

Check whether send returned -1 and whether errno is set to this value:

EPIPE
This socket was connected, but the connection is now broken. In this case, send first generates a SIGPIPE signal; if that signal is ignored or blocked, or if its handler returns, then send fails with EPIPE.

Also handle the SIGPIPE signal (install a handler or ignore it) so that the failure is easier to control.

+1

Ramy Al Zuhouri Feb 08 '13 at 22:24

You cannot detect an unplugged network cable just by calling write(). This is because the TCP stack retransmits the data without your application knowing about it. Here are the solutions.

Even if you have already set the keepalive option on your socket, you cannot detect a dead connection in a timely manner if your application keeps writing to the socket. This is because of TCP retransmission in the kernel's TCP stack. tcp_retries1 and tcp_retries2 are the kernel parameters that control how many times TCP retransmits, and therefore the retransmission timeout. It is difficult to predict the exact timeout, because the interval between retransmissions is derived from the measured round-trip time (RTT). You can see this calculation in RFC 793 (section 3.7, Data Communication).

https://www.rfc-editor.org/rfc/rfc793.txt

All platforms have kernel parameters for TCP retransmission.

 Linux : tcp_retries1, tcp_retries2 : (exist in /proc/sys/net/ipv4)

http://linux.die.net/man/7/tcp

 HPUX : tcp_ip_notify_interval, tcp_ip_abort_interval

http://www.hpuxtips.es/?q=node/53

 AIX : rto_low, rto_high, rto_length, rto_limit

http://www-903.ibm.com/kr/event/download/200804_324_swma/socket.pdf

Set a lower value for tcp_retries2 (the default is 15) if you want to detect a dead connection earlier, but, as I said, this does not give you an exact time. In addition, you currently cannot set these values for a single socket only; they are global kernel parameters. There have been attempts to add a per-socket TCP retransmission option ( http://patchwork.ozlabs.org/patch/55236/ ), but I do not think it was merged into the mainline kernel. I cannot find such an option defined in the system header files.
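To see what the current system-wide value is, a program can read the file under /proc/sys/net/ipv4 mentioned above. A minimal sketch; read_sysctl_int is a hypothetical helper name, and it assumes the file holds a single decimal integer.

```c
#include <stdio.h>

/* Hypothetical helper: read one integer from a /proc/sys file.
 * Returns the value, or -1 if the file cannot be opened or parsed. */
static int read_sysctl_int(const char *path)
{
    FILE *f = fopen(path, "r");
    int value;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}
```

For example, read_sysctl_int("/proc/sys/net/ipv4/tcp_retries2") returns the current retry limit. Changing it requires root privileges, for instance with sysctl -w net.ipv4.tcp_retries2=N, and affects every TCP connection on the machine.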

For reference, you can watch the keepalive timer of your socket with 'netstat --timers', as shown below. https://stackoverflow.com/questions/34914278

 netstat -c --timer | grep "192.0.0.1:43245 192.0.68.1:49742"
 tcp 0 0 192.0.0.1:43245 192.0.68.1:49742 ESTABLISHED keepalive (1.92/0/0)
 tcp 0 0 192.0.0.1:43245 192.0.68.1:49742 ESTABLISHED keepalive (0.71/0/0)
 tcp 0 0 192.0.0.1:43245 192.0.68.1:49742 ESTABLISHED keepalive (9.46/0/1)
 tcp 0 0 192.0.0.1:43245 192.0.68.1:49742 ESTABLISHED keepalive (8.30/0/1)
 tcp 0 0 192.0.0.1:43245 192.0.68.1:49742 ESTABLISHED keepalive (7.14/0/1)
 tcp 0 0 192.0.0.1:43245 192.0.68.1:49742 ESTABLISHED keepalive (5.98/0/1)
 tcp 0 0 192.0.0.1:43245 192.0.68.1:49742 ESTABLISHED keepalive (4.82/0/1)

In addition, when a keepalive timeout occurs, you may see different events depending on the platform you are using, so you should not decide that a connection is dead from the returned event alone. For example, HP-UX returns the POLLERR event and AIX returns just the POLLIN event when the keepalive timeout occurs. In both cases the recv() call that follows fails with ETIMEDOUT.
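For completeness, this is roughly how keepalive is enabled and tuned per socket on Linux. It is a sketch under assumptions: enable_keepalive is a hypothetical helper name, and TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are the Linux option names (other platforms differ).

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: turn on keepalive and set its three timers.
 *   idle_s     seconds of idle time before the first probe
 *   interval_s seconds between unanswered probes
 *   count      unanswered probes before the connection is dropped
 * Returns 0 on success, -1 if any setsockopt() call fails. */
static int enable_keepalive(int sockfd, int idle_s, int interval_s, int count)
{
    int on = 1;

    if (setsockopt(sockfd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;
    if (setsockopt(sockfd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof(idle_s)) < 0)
        return -1;
    if (setsockopt(sockfd, IPPROTO_TCP, TCP_KEEPINTVL, &interval_s, sizeof(interval_s)) < 0)
        return -1;
    if (setsockopt(sockfd, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof(count)) < 0)
        return -1;
    return 0;
}
```

With enable_keepalive(sockfd, 10, 5, 3), an idle connection to an unreachable peer is dropped after roughly 10 + 5 * 3 = 25 seconds. As explained above, this only helps while the connection is idle; if the application keeps writing, retransmission governs the timeout instead.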

In recent kernel versions (2.6.37 and later) you can use the TCP_USER_TIMEOUT socket option, which works well. This option can be set on a single socket.
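A minimal sketch of setting it, assuming a Linux system whose headers define TCP_USER_TIMEOUT; set_user_timeout is a hypothetical helper name. The value is given in milliseconds and bounds how long transmitted data may stay unacknowledged before the kernel closes the connection.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: limit how long sent data may remain
 * unacknowledged on this socket, in milliseconds.
 * Returns 0 on success, -1 on failure (errno is set by setsockopt). */
static int set_user_timeout(int sockfd, unsigned int timeout_ms)
{
    return setsockopt(sockfd, IPPROTO_TCP, TCP_USER_TIMEOUT,
                      &timeout_ms, sizeof(timeout_ms));
}
```

After set_user_timeout(sockfd, 10000), a connection whose data goes unacknowledged for about 10 seconds is closed by the kernel, and subsequent calls on the socket fail with ETIMEDOUT.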

Finally, you can call recv with the MSG_PEEK flag to check whether the socket is still healthy. (MSG_PEEK returns data from the kernel's receive buffer without removing it, so a later read still sees the same data.) You can therefore use this flag purely to check the socket, without side effects.
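A sketch of such a probe, combining MSG_PEEK with MSG_DONTWAIT so that it never blocks. The enum and the function name probe_socket are hypothetical. Keep in mind that it reports only what the kernel already knows: an orderly close or a recorded error, not a cable that was unplugged a moment ago.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

enum peer_state { PEER_OPEN, PEER_CLOSED, PEER_ERROR };

/* Hypothetical helper: look at the socket without consuming any data. */
static enum peer_state probe_socket(int sockfd)
{
    char byte;
    ssize_t n = recv(sockfd, &byte, 1, MSG_PEEK | MSG_DONTWAIT);

    if (n > 0)
        return PEER_OPEN;      /* data is pending and stays in the queue */
    if (n == 0)
        return PEER_CLOSED;    /* peer performed an orderly shutdown */
    if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
        return PEER_OPEN;      /* nothing to read right now */
    return PEER_ERROR;         /* e.g. ECONNRESET or ETIMEDOUT */
}
```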

0

cloudrain21 Jan 25 '16 at 7:29

cnicutar · Accepted Answer · 2013-02-08T22:25:27+0000

"But if I unplug the Ethernet cable, the send function still returns positive values rather than -1."

First of all, you should know that send does not actually send anything; it is just a memory-copying function / system call. It copies the data from your process to the kernel, and some time later the kernel picks up that data and sends it to the other side after packaging it into segments and packets. Therefore, send can only return an error if:

The socket is invalid (e.g. a bogus file descriptor)
The connection is clearly invalid, for example it has not been established or has already been terminated in some way (FIN, RST, timeout - see below)
There is no more room to copy the data

The main point is that send does not send anything, and therefore its return code says nothing about whether the data actually reached the other side.

Returning to your question: when TCP sends data, it expects a valid acknowledgement within a reasonable amount of time. If it does not get one, it retransmits. How often does it retransmit? Each TCP stack does things differently, but the norm is to use exponential backoff. That is, first wait 1 second, then 2, then 4, and so on. On some stacks, this process can take minutes.

The main point is that, in the case of an interruption, TCP will declare a connection dead only after a seriously long period of silence (on Linux it does something like 15 retries, which takes more than 5 minutes).
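To get a feel for the numbers, the backoff can be summed up in a few lines. This is a back-of-the-envelope sketch, not the kernel's actual algorithm: it assumes the timeout doubles after every attempt and is capped at a maximum, and retransmit_budget is a hypothetical name. On Linux the minimum and maximum retransmission timeouts are 200 ms and 120 s; the real starting value depends on the measured round-trip time.

```c
/* Rough estimate, in seconds, of how long TCP keeps trying before it
 * gives up: one wait before each of the `retries` retransmissions plus
 * a final wait after the last one. The timeout starts at rto_initial,
 * doubles each time and is capped at rto_max. */
static double retransmit_budget(int retries, double rto_initial, double rto_max)
{
    double total = 0.0;
    double rto = rto_initial;
    int i;

    for (i = 0; i <= retries; i++) {
        total += rto;
        rto *= 2.0;
        if (rto > rto_max)
            rto = rto_max;
    }
    return total;
}
```

retransmit_budget(15, 0.2, 120.0) comes to about 924.6 seconds, a little over 15 minutes, which is consistent with the "more than 5 minutes" above.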

One way to solve this is to implement an acknowledgement mechanism in your application. For example, you could send the server a request that means "reply within 5 seconds or I will declare this connection dead", and then recv with a timeout.
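A sketch of that idea using poll for the timeout. It assumes the existing protocol has some request that the server always answers, since the server cannot be changed; the function name peer_alive and the request bytes passed in are hypothetical.

```c
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: send a request and wait up to timeout_ms for any
 * reply. Returns 1 if the peer answered, 0 if the connection should be
 * treated as dead (send failed, timeout, orderly close or error). */
static int peer_alive(int sockfd, const void *request, size_t request_len,
                      int timeout_ms)
{
    char reply[64];
    struct pollfd pfd;

    /* MSG_NOSIGNAL: report a broken connection as an error, not SIGPIPE. */
    if (send(sockfd, request, request_len, MSG_NOSIGNAL) < 0)
        return 0;

    pfd.fd = sockfd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    if (poll(&pfd, 1, timeout_ms) <= 0)
        return 0;              /* timed out or poll itself failed */

    /* recv() returns 0 on orderly close and -1 on error. */
    return recv(sockfd, reply, sizeof(reply), 0) > 0;
}
```

A client would call something like peer_alive(sockfd, request, request_len, 5000) whenever the connection has been quiet for a while, and reconnect when it returns 0. A real implementation would also have to parse the reply and keep it apart from ordinary traffic, which this sketch does not attempt.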
