UDP packet drops in the Linux kernel

I have a server that sends UDP packets via multicast and several clients that listen to these multicast packets. Each packet has a fixed size of 1040 bytes, and the total amount of data sent by the server is 3 GB.

My environment:

1 Gbit Ethernet

40 nodes: 1 sender node and 39 receiver nodes. All nodes have the same hardware configuration: 2 AMD processors, each with 2 cores running at 2.6 GHz

On the client side, a single thread reads the socket and puts the data into a queue. An additional thread pops data from the queue and does some light processing.
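For reference, a minimal sketch of this two-thread design (hypothetical names and a simple mutex-protected queue; the poster's actual code may differ):

```cpp
// Illustrative sketch: one thread drains the socket into a queue,
// a second thread pops packets and processes them.
#include <sys/socket.h>
#include <array>
#include <condition_variable>
#include <deque>
#include <mutex>

constexpr size_t kPacketSize = 1040;   // fixed datagram size from the question
using Packet = std::array<char, kPacketSize>;

std::deque<Packet>      queue_;
std::mutex              mtx_;
std::condition_variable cv_;

void reader(int sock) {                 // socket-reading thread
    Packet pkt;
    for (;;) {
        ssize_t n = recv(sock, pkt.data(), pkt.size(), 0);
        if (n <= 0) break;              // error or shutdown
        std::lock_guard<std::mutex> lk(mtx_);
        queue_.push_back(pkt);
        cv_.notify_one();
    }
}

void processor() {                      // processing thread
    for (;;) {
        std::unique_lock<std::mutex> lk(mtx_);
        cv_.wait(lk, [] { return !queue_.empty(); });
        Packet pkt = queue_.front();
        queue_.pop_front();
        lk.unlock();
        // ... do the light processing here ...
    }
}
```

The two functions would be launched as `std::thread` instances sharing the same socket descriptor.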

During the multicast I observe a packet drop rate of 30% on the node side. Looking at the netstat -su statistics, I can say that the packets missed by the client application correspond to the RcvbufErrors value in the netstat output.

This means that all missing packets are discarded by the OS because the socket buffer is full, but I do not understand why the capturing thread cannot drain the buffer in time. During transmission, 2 of the 4 cores run at 75% utilization and the rest are idle. I am the only one using these nodes, and I would expect machines like these to have no problem handling 1 Gbit/s of traffic. I have already done some optimization by adding g++ compiler flags for AMD CPUs, which reduced the packet drop rate to 10%, but in my opinion that is still too high.

Of course, I know that UDP is unreliable; I have my own correction protocol on top of it.

I do not have administrative permissions, so I can’t change the system settings.

Any tips on how to improve performance?

EDIT: I solved this problem by using 2 threads that read the socket. The socket receive buffer still fills up occasionally, but the average drop rate is below 1%, so it is not a problem.

+4

3 answers

Tracking down network packet drops on Linux can be a bit complicated, as there are many components where drops can occur. They can happen at the hardware level, in the network device subsystem, or in the protocol layers.

I wrote a very detailed blog post explaining how to monitor and tune each component. It is hard to summarize in a short answer here, since there are so many different components that need to be monitored and tuned.

+3

Besides the obvious step of removing everything non-essential from the socket-reading loop:

  • Increase the socket receive buffer with setsockopt(2) (this and the next point are sketched after this list),
  • Use recvmmsg(2), if your kernel supports it, to reduce the number of system calls and user/kernel copies,
  • Consider a non-blocking approach with edge-triggered epoll(7),
  • Check whether you really need the threads; locking/synchronization is very expensive.
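A minimal sketch of the first two suggestions, assuming a blocking UDP socket and the question's 1040-byte datagrams (the 8 MiB buffer target and the batch size of 64 are arbitrary examples; without root, the kernel silently clamps SO_RCVBUF to net.core.rmem_max):

```cpp
// Illustrative only: enlarge the receive buffer and batch-read datagrams.
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             // recvmmsg(2) is a GNU/Linux extension
#endif
#include <sys/socket.h>
#include <cstring>

enum { kBatch = 64, kPacketSize = 1040 };

void enlarge_rcvbuf(int sock) {
    // The call needs no special permissions, but unprivileged processes
    // get at most net.core.rmem_max.
    int rcvbuf = 8 * 1024 * 1024;   // assumed 8 MiB target
    setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &rcvbuf, sizeof(rcvbuf));
}

// Drain up to kBatch datagrams with a single system call.
int drain(int sock) {
    static char    bufs[kBatch][kPacketSize];
    struct iovec   iov[kBatch];
    struct mmsghdr msgs[kBatch];
    std::memset(msgs, 0, sizeof(msgs));
    for (int i = 0; i < kBatch; ++i) {
        iov[i].iov_base            = bufs[i];
        iov[i].iov_len             = kPacketSize;
        msgs[i].msg_hdr.msg_iov    = &iov[i];
        msgs[i].msg_hdr.msg_iovlen = 1;
    }
    // Blocks for the first datagram, then returns whatever else is queued.
    return recvmmsg(sock, msgs, kBatch, MSG_WAITFORONE, nullptr);
}
```

On success, recvmmsg returns the number of messages received, and msgs[i].msg_len holds the length of each datagram.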
+2

"On the client side, one thread reads the socket and queues the data." I think the problem is in this thread. He does not receive messages fast enough. Too much time is spent on something else, for example, getting a mutex when putting data in a queue. Try to optimize operations in the queue, for example, use an unblocked queue.

-1

Source: https://habr.com/ru/post/1416172/

