Fastest socket method for large amounts of data between large numbers of files

I am building a socket application that needs to shuffle a large number of small to medium-sized files (roughly 5-100 KB each) for many different clients (sort of like a web server, but not quite).

Should I just go with standard poll/epoll (Linux) or asynchronous sockets in Winsock (Win32), or are there methods with even better performance (like overlapped I/O on Win32)?

Both Linux and Windows are possible platforms!

+4
6 answers

On Windows, you can try TransmitFile, which can improve performance by avoiding kernel-space ↔ user-space copying.

+1

On Linux, demultiplexing multiple sockets with epoll is the fastest way to do parallel I/O over TCP.
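As a rough illustration of readiness-based multiplexing, here is a minimal Python sketch. It uses the standard `selectors` module, whose `DefaultSelector` picks epoll on Linux; the function names `serve_ready` and `demo`, the uppercase "reply", and the use of `socketpair()` in place of real client connections are assumptions made only for this example.

```python
import selectors
import socket


def serve_ready(sel, timeout=1.0):
    """Service every socket that is readable right now; return bytes handled."""
    handled = 0
    for key, _events in sel.select(timeout):
        conn = key.fileobj
        data = conn.recv(4096)
        if data:
            # Tiny reply, assumed to fit in the socket buffer in this sketch.
            conn.sendall(data.upper())
            handled += len(data)
        else:
            # An empty read means the peer closed the connection.
            sel.unregister(conn)
            conn.close()
    return handled


def demo():
    sel = selectors.DefaultSelector()  # epoll-backed on Linux
    pairs = [socket.socketpair() for _ in range(3)]
    for server_side, _client in pairs:
        server_side.setblocking(False)
        sel.register(server_side, selectors.EVENT_READ)

    requests = [b"req%d" % i for i in range(len(pairs))]
    for (_server_side, client), request in zip(pairs, requests):
        client.sendall(request)

    # One thread services all connections as they become readable.
    expected = sum(len(r) for r in requests)
    handled = 0
    while handled < expected:
        handled += serve_ready(sel)

    replies = [client.recv(4096) for _server_side, client in pairs]
    for server_side, client in pairs:
        sel.unregister(server_side)
        server_side.close()
        client.close()
    sel.close()
    return replies
```

A real server would also register the listening socket and handle partial writes; this only shows the single-threaded "wait, then service whatever is ready" loop.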

But I'd also mention that, in the interest of portability (and since you seem to be interested in either Linux or Windows), you should look into Boost.Asio. It has a portable API but uses epoll on Linux and overlapped I/O on Windows, so you can build network applications that are both high-performance and portable.

In addition, since you are working with files, you should also implement double buffering when performing I/O for maximum performance. In other words, you send/recv each file using two buffers. For example, on the sending side, you read from disk into one buffer and then send that buffer over the network while another thread reads the next block of data from disk into the second buffer. This way you overlap disk I/O with network I/O.
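The double-buffering idea can be sketched roughly like this in Python. The helper name `send_double_buffered`, the 64 KB chunk size, and the queue-based hand-off between the two threads are assumptions for illustration, not part of any particular library.

```python
import io
import queue
import socket
import threading


def send_double_buffered(src, sock, chunk_size=64 * 1024):
    """Send all of file-like `src` over `sock`, overlapping reads with sends."""
    free = queue.Queue()    # buffers the reader may fill
    filled = queue.Queue()  # (buffer, byte_count) pairs ready to send
    for _ in range(2):      # exactly two buffers: one being read, one being sent
        free.put(bytearray(chunk_size))

    def reader():
        while True:
            buf = free.get()
            n = src.readinto(buf)
            filled.put((buf, n))
            if not n:       # 0 bytes read means end of file
                return

    t = threading.Thread(target=reader, daemon=True)
    t.start()
    total = 0
    while True:
        buf, n = filled.get()
        if not n:
            break
        sock.sendall(memoryview(buf)[:n])  # reader fills the other buffer meanwhile
        total += n
        free.put(buf)                      # hand the buffer back for reuse
    t.join()
    return total


def demo():
    payload = bytes(range(256)) * 1200     # 307200 bytes, stands in for a file
    a, b = socket.socketpair()
    received = bytearray()

    def receiver():
        while True:
            chunk = b.recv(65536)
            if not chunk:
                return
            received.extend(chunk)

    rt = threading.Thread(target=receiver, daemon=True)
    rt.start()
    sent = send_double_buffered(io.BytesIO(payload), a)
    a.close()                              # signals EOF to the receiver
    rt.join()
    b.close()
    return sent, bytes(received) == payload
```

For files of only 5-100 KB a single chunk often covers the whole file, so the overlap mostly pays off when several files are queued back to back.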

+3

On Linux, sendfile() is a high-performance API specifically designed to send data from files to sockets (you still have to use poll to multiplex; it is just a replacement for the read/write pair).
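For a quick feel of the API, here is a hedged Python sketch. Python's `socket.sendfile()` uses `os.sendfile()` where the platform provides it and falls back to plain `send()` otherwise; the names `send_file` and `demo`, the temporary file, and the `socketpair()` standing in for a client connection are assumptions for this example.

```python
import os
import socket
import tempfile
import threading


def send_file(sock, path):
    """Send the whole file at `path` over a blocking stream socket."""
    with open(path, "rb") as f:
        # No user-space buffer is involved when os.sendfile() is available.
        return sock.sendfile(f)


def demo():
    payload = os.urandom(50 * 1024)        # a 50 KB stand-in file
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name

    a, b = socket.socketpair()
    chunks = []

    def receiver():
        while True:
            chunk = b.recv(65536)
            if not chunk:
                return
            chunks.append(chunk)

    rt = threading.Thread(target=receiver, daemon=True)
    rt.start()
    try:
        sent = send_file(a, path)
        a.close()                          # signals EOF to the receiver
        rt.join()
    finally:
        os.unlink(path)
        b.close()
    return sent, b"".join(chunks) == payload
```

Note that `socket.sendfile()` does not support non-blocking sockets; in an epoll-driven server you would call `os.sendfile()` directly and track the offset yourself.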

+2

In addition to epoll, Linux sendfile(2) looks like a good fit for your server-side needs.

+2

Unfortunately, if you want the best possible performance, you still have to hand-write your I/O code separately for Windows and Linux, since the current abstraction libraries do not scale well across multiple threads (if at all).

Boost.Asio is probably the best option if you want portability (and ease of use), but it has limitations when it comes to multi-threaded scalability (see Socket Server Socket - Impossible to saturate the CPU). I believe the main problem is integrating timeout handling into a multi-threaded event loop without excessive locking.

Essentially, what you would want to use for maximum performance is I/O completion ports with a worker thread pool on Windows and edge-triggered epoll with a worker thread pool on Linux.
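To show what edge-triggered mode changes on the Linux side, here is a small Python sketch (Linux only, since it uses `select.epoll`). It covers just the "drain until the read would block" rule that edge-triggered mode requires; the worker thread pool is left out, and the names `drain` and `demo` are assumptions for this example.

```python
import select
import socket


def drain(conn):
    """Read until the kernel buffer is empty, as edge-triggered mode requires."""
    chunks = []
    while True:
        try:
            data = conn.recv(4096)
        except BlockingIOError:
            break            # nothing left; the next event comes with new data
        if not data:
            break            # peer closed the connection
        chunks.append(data)
    return b"".join(chunks)


def demo():
    a, b = socket.socketpair()
    b.setblocking(False)     # edge-triggered sockets must be non-blocking
    ep = select.epoll()
    ep.register(b.fileno(), select.EPOLLIN | select.EPOLLET)

    a.sendall(b"x" * 10000)
    first = ep.poll(1.0)     # one readiness event for the new data
    got = drain(b)
    second = ep.poll(0)      # no new data arrived, so no new event

    ep.unregister(b.fileno())
    ep.close()
    a.close()
    b.close()
    return len(first), len(got), len(second)
```

In a multi-threaded design, each worker would typically take an event, drain the socket as above, and only then wait for the next one.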

0

Do not optimize your program prematurely.

Assuming this is not a premature optimization, the easiest approach is to simply keep all the data in memory. You can use mmap() if you want, or just load the files at startup. Sending files that are already in memory is straightforward.

Having said that, multiplexing a lot of connections with something like epoll can be a bit of a headache. Can't you use something that's already written?
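A minimal sketch of the keep-it-in-memory approach, in Python, assuming the file set is known at startup and every file is non-empty (an empty file cannot be mapped). The names `map_files` and `demo` are hypothetical, and a `socketpair()` stands in for a client connection.

```python
import mmap
import os
import socket
import tempfile


def map_files(paths):
    """Map each file read-only at startup; returns {path: mmap object}."""
    cache = {}
    for path in paths:
        with open(path, "rb") as f:
            # Length 0 maps the whole file; the mapping outlives the file object.
            cache[path] = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    return cache


def demo():
    payload = os.urandom(8 * 1024)         # small stand-in file
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name

    cache = map_files([path])
    a, b = socket.socketpair()
    try:
        # An mmap object is bytes-like, so it can be passed straight to sendall().
        a.sendall(cache[path])
        received = bytearray()
        while len(received) < len(payload):
            received.extend(b.recv(65536))
    finally:
        a.close()
        b.close()
        for mapped in cache.values():
            mapped.close()
        os.unlink(path)
    return bytes(received) == payload
```

Loading each file into a `bytes` object at startup works the same way from the sender's point of view; mmap simply leaves paging to the operating system.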

0
