We have a publisher application that sends data using multicast. The application is extremely performance-sensitive (we optimize at the microsecond level). Applications that listen to this published data can be (and often are) on the same machine as the publishing application.
Recently, we noticed an interesting phenomenon: the time spent in sendto() grows in proportion to the number of listeners on the machine.
For example, say that with no listeners, the base time for our sendto() call is 5 µs. Each additional listener increases the sendto() call time by about 2 µs. So with 10 listeners, a sendto() call now takes 2 * 10 + 5 = 25 µs.
This suggests to me that the sendto() call blocks until the data has been copied to every listener.
Analysis of the listening side confirms this. With 10 listeners, each listener receives the data about two microseconds later than the previous one (i.e., the first listener receives the data after about 5 µs, and the last after about 23-25 µs).
Is there a way, either at the application level or at the system level, to change this behavior? Something like a non-blocking / asynchronous sendto()? Or at least one that blocks only until the message is copied into kernel memory, so it can return without waiting on all the listeners?