AIO network sockets and zero-copy under Linux - c

I have been experimenting with asynchronous Linux network sockets (aio_read et al. in aio.h/librt), and one thing I have been trying to find out is whether these are zero-copy or not. Pretty much everything I have read so far discusses file I/O, whereas it's network I/O I am interested in.
AIO is a bit of a pain to use and I suspect is non-portable, so I am wondering whether it's worth persevering with. Zero-copy is just about the only advantage (albeit a major one for my purposes) it would have over (non-blocking) select/epoll.

In glibc, AIO is implemented using POSIX threads and a regular pread call, so it is likely more expensive than using select or epoll and doing the read or recv yourself.

Related

aio_read inside signal handler

I'm going to use AIO for asynchronous reads. When an AIO operation completes and the signal handler is triggered, I may need to issue another aio_read call and proceed.
aio_read isn't mentioned among the async-signal-safe functions (in man signal). Ordinary read is, though.
What are the dangers of issuing subsequent aio_read calls inside the AIO signal handler?
As the author of the proposed Boost.AFIO, which can make use of POSIX AIO, I strongly recommend against using POSIX AIO at all. I am hardly alone in this opinion; arvid is similarly against it: http://blog.libtorrent.org/2012/10/asynchronous-disk-io/. The API itself is poorly designed and as a result scales poorly with load unless you use OS-specific alternatives or extensions to AIO such as BSD kqueues. POSIX AIO is essentially useless as-is.
Additionally, AIO calls are not signal-safe on Linux, which you are probably using. This is because on Linux they are implemented in userspace as a threadpool-based emulation. On BSD, AIO calls have a proper kernel syscall interface, but in the kernel they turn into - yes, you guessed it - a threadpool-based emulation unless O_DIRECT is turned on.
You are therefore much better off on POSIX simply always using a threadpool, unless all your I/O is done with O_DIRECT. If O_DIRECT is indeed always on, Linux provides a custom kernel API, detailed at http://man7.org/linux/man-pages/man2/io_submit.2.html, which is fairly effective; and on BSD, if you replace signal-driven handling with kqueues (https://www.freebsd.org/cgi/man.cgi?kqueue, see EVFILT_AIO), then with O_DIRECT things can also scale well - better than a threadpool, anyway.
Use of signal-based completion handling on ANY POSIX platform has dreadful performance. AFIO v2 provides a generic POSIX AIO backend, and it is dreadful, dreadful, dreadful. Avoid it like the plague.
Note that a threadpooled synchronous API design is portable, scales well for most use cases, and is what I (and indeed arvid) would recommend to anybody without highly specialised needs, such as writing a database backend where you need very tight control over the physical storage layer and anything but O_DIRECT|O_SYNC isn't an option.
OK, all that said, if you really, really want to use signal-driven AIO, I assume it is because you want to multiplex your file I/O with non-file I/O, and you therefore can't use aio_suspend(), which is the proper API for doing this. The way AFIO v2 handles this is to use a realtime signal to interrupt aio_suspend() when something non-AIO-related needs to be processed; it can then be handled and aio_suspend() restarted. You need to be very careful in handling races and deadlocks, and you'll need to carefully mask and unmask the signal for the thread calling aio_suspend(), lest the realtime signal get lost and you suffer a lost wakeup. All in all, it's not worth it for the typically much lower I/O performance you get compared to a threadpool plus synchronous APIs.

Is C select() function deprecated?

I am reading a book about network programming in C. It is from 2004.
In the example code, the author uses the select C function to accept multiple connections from clients. Is that function deprecated today?
I see that there are different ways to do multiplexed I/O, like poll and epoll. What are the advantages?
It's not deprecated, and lots of programs rely on it.
It's just not the best tool as it has some limitations:
The number of file descriptors it can watch is limited (OS-specific; it is usually possible to increase the limit by recompiling the kernel).
It doesn't scale well with lots of fds: the whole FD set must be maintained and re-initialized on every call, since select modifies it.
Feel free to use it if these limitations aren't relevant for you. Otherwise use poll, or libevent if you're looking for a cross-platform solution, or in some rare cases epoll/kqueue for platform-specific optimized solutions.
It's not deprecated in its behavior, but its design has performance issues. For example, the Linux epoll(7) documentation states:
The epoll API can be used either as an edge-triggered or a level-triggered interface and scales well to large numbers of watched file descriptors.
Since the efficient alternatives are specific to each operating system, a better option than using select() directly is a cross-platform multiplexing library (which uses the best implementation available on each platform), examples being:
libevent
libev
libuv
If you're developing for a specific operating system, use the recommended implementation for high performance applications.
However, since some people dislike the current I/O multiplexing libraries (finding them "ugly"), select is still a viable alternative.

C network programming?

What libraries are best (in terms of performance) for network programming in C on Windows and UNIX?
I'm particularly interested in high-frequency trading.
I have heard about BSD sockets and POSIX, but I wasn't sure if there were faster, performance-specific libraries.
The fastest way would be to use the OS's own networking functions: socket(), setsockopt(), connect(), listen(), send(), recv(), etc.
There are subtle differences between them across OSes.
To cope with this, there are wrappers around them in several libraries, e.g. in Qt (at least, IIRC). I don't think anything will noticeably slow down if you use them.
What about ZeroMQ (http://www.zeromq.org/)?
It's fast, easy to code with, and can also be used as a message queue.

Does posix aio in linux 2.6 support socket file descriptor?

I've searched for this question on Google and got different answers. I can't determine whether POSIX AIO in Linux 2.6 supports socket file descriptors or not.
If it supports TCP sockets, is aiocb.aio_offset = 0 relative to the first byte read from the TCP socket fd?
If it doesn't, does any asynchronous I/O library on Linux support socket fds?
A comment above states that aio does not support sockets. You ask for possible alternatives.
The obvious ones are:
use an event-driven programming model, either built by hand using poll(2) or the like, or via a library like Niels Provos's "libevent"
use threads
I generally prefer the event-driven way of doing things, and generally use libevent, which is documented here: http://libevent.org/
Bear in mind, however, that event-driven programming is organized rather differently from what you may be used to. Threads are conceptually similar, although often less efficient when handling large numbers of sockets.

libevent2 and file io

I've been toying around with libevent2, and I've got reading files working, but it blocks. Is there any way to make file reading not block within libevent? Or do I need to use another I/O library for files and have it pump the events I need?
fd = open("/tmp/hello_world", O_RDONLY);
evbuffer_read(buf, fd, 4096);
The O_NONBLOCK flag doesn't work either.
In POSIX, disks are considered "fast devices", meaning that they always block (which is why O_NONBLOCK didn't work for you). Only network sockets can be non-blocking.
There is POSIX AIO, but e.g. on Linux it comes with a bunch of restrictions making it unsuitable for general-purpose usage (O_DIRECT only, I/O must be sector-aligned).
If you want to integrate normal POSIX I/O into an asynchronous event loop, the usual resort is a thread pool, where the blocking syscalls are executed in the background by worker threads. One example of such a library is libeio.
No.
I've yet to see a *nix where you can do non-blocking I/O on regular files without resorting to the special AIO library (though on some, e.g. Solaris, O_NONBLOCK has an effect if someone else holds a lock on the file).
Please take a look at libuv, which is used by node.js / io.js: https://github.com/libuv/libuv
It's a good alternative to libeio, because it does perform well on all major operating systems, from Windows to the BSDs, Mac OS X and of course Linux.
It supports I/O completion ports, which makes it a better choice than libeio if you are targeting Windows.
The C code is also very readable and I highly recommend this tutorial: https://nikhilm.github.io/uvbook/
