aio_read inside signal handler - c

I'm going to use aio for asynchronous reads. When an aio operation completes and the signal handler is triggered, I may need to issue another aio_read call and proceed.
aio_read isn't mentioned among the async-signal-safe functions (in man signal). Ordinary read is, though.
What are the dangers of making subsequent aio_read calls inside the aio signal handler?

As the author of the proposed Boost.AFIO, which can make use of POSIX AIO, I strongly recommend against using POSIX AIO at all. I am hardly alone in this opinion; arvid of libtorrent is similarly against it: http://blog.libtorrent.org/2012/10/asynchronous-disk-io/. The API itself is poorly designed and as a result scales poorly with load unless you use OS-specific alternatives or extensions to AIO like BSD kqueues. POSIX AIO is essentially useless as-is.
Additionally, AIO calls are not signal safe on Linux, which you are probably using. This is because on Linux they are implemented in userspace, using an emulation based on a threadpool. On BSD, AIO calls have a proper kernel syscall interface, but in the kernel they turn into - yes, you guessed it - a threadpool-based emulation, unless O_DIRECT is turned on.
You are therefore much better off on POSIX simply always using a threadpool, unless all your i/o is done with O_DIRECT. If O_DIRECT is indeed always on, Linux provides a custom kernel API detailed at http://man7.org/linux/man-pages/man2/io_submit.2.html which is fairly effective, and on BSD, if you replace signal-driven handling with BSD kqueues (https://www.freebsd.org/cgi/man.cgi?kqueue, see EVFILT_AIO), then with O_DIRECT things can also scale well - better than a threadpool, anyway.
Use of signal-based completion handling on ANY POSIX platform has dreadful performance. AFIO v2 provides a generic POSIX AIO backend, and it is dreadful, dreadful, dreadful. Avoid it like the plague.
Note that a threadpooled synchronous API design is portable, scales well for most use cases, and is what I (and indeed arvid) would recommend to anybody without highly specialised needs like writing a database backend where you need very tight control over the physical storage layer, and anything but O_DIRECT|O_SYNC isn't an option.
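To make that concrete, here is a minimal sketch of the threadpooled synchronous design (illustrative only, not AFIO's actual code; the struct io_req type and the file path are made up for the example; compile with -lpthread). A worker thread performs a plain blocking pread() while the caller waits on a condition variable - or, in real code, goes off and does other work:

#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical request object; a real pool would keep a queue of these
 * and a fixed set of long-lived worker threads. */
struct io_req {
    int fd; void *buf; size_t len; off_t off;
    ssize_t result; int done;
    pthread_mutex_t mu; pthread_cond_t cv;
};

static void *worker(void *arg)
{
    struct io_req *r = arg;
    ssize_t n = pread(r->fd, r->buf, r->len, r->off); /* blocks, but off the caller's thread */
    pthread_mutex_lock(&r->mu);
    r->result = n;
    r->done = 1;
    pthread_cond_signal(&r->cv);
    pthread_mutex_unlock(&r->mu);
    return NULL;
}

int main(void)
{
    char buf[4096];
    struct io_req r = { .fd = open("/etc/hostname", O_RDONLY),
                        .buf = buf, .len = sizeof buf, .off = 0 };
    pthread_mutex_init(&r.mu, NULL);
    pthread_cond_init(&r.cv, NULL);

    pthread_t t;
    pthread_create(&t, NULL, worker, &r);   /* a real pool reuses threads */

    /* ... the caller is free to do other work here ... */

    pthread_mutex_lock(&r.mu);
    while (!r.done)
        pthread_cond_wait(&r.cv, &r.mu);
    pthread_mutex_unlock(&r.mu);
    printf("read %zd bytes\n", r.result);
    pthread_join(t, NULL);
    return 0;
}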
OK, all that said, if you really, really want to use signal-driven aio, I assume it is because you want to multiplex your file i/o with non-file i/o, and you therefore can't use aio_suspend(), which is the proper API for doing this. The way AFIO v2 handles this is to use a realtime signal to interrupt aio_suspend() when something not aio-related needs to be processed; it can then be handled and aio_suspend() restarted. You need to be very careful in handling races and deadlocks, and you'll need to carefully mask and unmask the signal for the thread calling aio_suspend(), lest the realtime signal get lost and you suffer a lost wakeup. All in all, it's not worth it for the typically much lower i/o performance you get over a threadpool + synchronous APIs.
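For completeness, here is a minimal, illustrative sketch of that aio_suspend() pattern (this is not AFIO's actual code; link with -lrt, and note that error handling and the careful signal masking warned about above are elided):

#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>

/* A do-nothing handler: its only job is to interrupt aio_suspend(). */
static void wakeup_handler(int sig) { (void)sig; }

int main(void)
{
    /* Install the handler for SIGRTMIN *without* SA_RESTART, so that
     * delivering the signal makes aio_suspend() fail with EINTR. */
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = wakeup_handler;
    sigaction(SIGRTMIN, &sa, NULL);

    int fd = open("/etc/hostname", O_RDONLY);
    static char buf[256];
    struct aiocb cb;
    memset(&cb, 0, sizeof cb);
    cb.aio_fildes = fd;
    cb.aio_buf = buf;
    cb.aio_nbytes = sizeof buf;
    aio_read(&cb);

    const struct aiocb *list[1] = { &cb };
    for (;;) {
        if (aio_suspend(list, 1, NULL) == -1 && errno == EINTR) {
            /* Another thread raised SIGRTMIN: process the non-aio
             * work here, then go back to waiting. */
            continue;
        }
        if (aio_error(&cb) != EINPROGRESS) {
            printf("read %zd bytes\n", aio_return(&cb));
            break;
        }
    }
    return 0;
}

Any other thread can then pthread_kill() the waiting thread with SIGRTMIN to break it out of aio_suspend(); the lost-wakeup hazard arises when the signal arrives while the loop is busy processing rather than waiting, which is exactly why the masking discipline described above matters.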


Why is select(2) called "synchronous" multiplexing?

I am a bit confused at the moment about select(2), whose man page states in the summary:
select, pselect, FD_CLR, FD_ISSET, FD_SET, FD_ZERO - synchronous I/O multiplexing
As far as I am aware, many libraries and programs such as libuv and nodejs use select/epoll/kqueue/iocp for their event loops, which power their corresponding async/await features (and async I/O?).
So, what exactly does synchronous multiplexing mean? Can I achieve async I/O using select? What exactly is the difference between synchronous multiplexing and asynchronous multiplexing?
You've got a parse error there: it is not "synchronous multiplexing" but multiplexing of synchronous I/O - select is used to multiplex synchronous I/O calls. read and write are called synchronous I/O because they either block until the transfer is complete, or do not perform the transfer at all (as with a non-blocking, non-ready socket).
This can be contrasted with truly asynchronous calls, where the system call merely initiates the transfer; it completes in the background, and a notification is given after completion.
nodejs and libuv are different beasts. Even though the underlying C I/O is possibly multiplexed and synchronous, it appears asynchronous to them - there are no blocking synchronous read calls, because it all happens transparently on the C/library side.
So, what exactly does synchronous multiplexing mean?
Synchronous operations are distinguished from asynchronous ones in that the former do not allow the caller to continue until they complete, whereas the latter do. Software (a)synchronicity is closely related to multithreading, and the main characteristic of select() that makes its operation synchronous instead of asynchronous is that it works entirely within a single (user) thread of execution. When you call select() your thread blocks until either one of the file descriptors you specified becomes ready, or the timeout you specified expires.
The alternative would be a programming model where you register interest in I/O on the file descriptors, then come back later to check whether they are ready.
It should be noted, however, that although select() is certainly synchronous itself, the multiplexing is mostly up to the programmer. select() provides a means to achieve it, but performs no I/O itself. Its essential brilliance is in giving you the information you need to avoid blocking while trying to do I/O on one file descriptor when a different one is ready to be serviced.
Can I achieve async I/O using select?
No, select doesn't do anything to particularly facilitate asynchronous I/O. It helps you handle multiple I/O channels efficiently via a single thread, but that thread operates synchronously. This nevertheless tends to be a big win, because I/O is very slow, and that slowness is mostly associated with I/O peripherals and media, not CPU and memory. Generally speaking, a single thread has plenty of processing power to handle multiple I/O channels as long as it chooses wisely which ones to handle at any given opportunity, and select() facilitates that.
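To make the distinction concrete, here is a minimal sketch of a select() loop multiplexing two descriptors; the pipe is just a stand-in for a second I/O source such as a socket:

#include <sys/select.h>
#include <unistd.h>

int main(void)
{
    int fds[2];
    pipe(fds);                           /* stand-in for a second source, e.g. a socket */
    for (;;) {
        fd_set rfds;
        FD_ZERO(&rfds);
        FD_SET(STDIN_FILENO, &rfds);
        FD_SET(fds[0], &rfds);
        int maxfd = fds[0] > STDIN_FILENO ? fds[0] : STDIN_FILENO;

        /* Blocks - synchronously - until at least one descriptor is readable. */
        if (select(maxfd + 1, &rfds, NULL, NULL, NULL) == -1)
            break;

        if (FD_ISSET(STDIN_FILENO, &rfds)) {
            char buf[256];
            ssize_t n = read(STDIN_FILENO, buf, sizeof buf); /* now known not to block */
            if (n <= 0)
                break;
            write(STDOUT_FILENO, buf, n);
        }
        if (FD_ISSET(fds[0], &rfds)) {
            /* service the other descriptor here */
        }
    }
    return 0;
}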

User-level threads context switching: How to detect when a thread is blocking in C?

As the title suggests, is there a way in C to detect when a user-level thread running on top of a kernel-level thread e.g., pthread has blocked (or about to block) for I/O?
My use case is as follows: I need to execute tasks in a multithreaded environment (on top of kernel threads e.g., pthreads). The tasks are basically user functions that can be synchronized and may use blocking operations within. I need to hide latency in my implementation. So, I am exploring the idea of implementing the tasks as user-level threads for better control of their execution context such that, when a task blocks or synchronizes, I context-switch to other ready tasks (i.e., implementing my own scheduler for the user-level threads). Consequently, almost the full use of the OS’s time quantum per kernel thread can be achieved.
There used to be code that did this, for example GNU pth. It's generally been abandoned because it just doesn't work very well and we have much better options now. You have two choices:
1) If you have OS help, you can use the OS mechanisms. Windows provides OS help for this, IOCP dispatching uses it.
2) If you have no OS help, then you have to convert all blocking operations into non-blocking ones that call your dispatcher rather than blocking. So, for example, if someone calls socket, you intercept that call and set the socket non-blocking. When they call read, you intercept that call, and if it returns a "would block" indication, you arrange to resume when the operation might succeed and schedule another thread (a sketch of this approach follows below).
You can look at GNU pth to see how you might make option 2 work. But be warned, GNU pth is full of reported bugs that have never been fixed since it was abandoned. It will give you an idea of how to implement things like mutexes and sleeps in a cooperative user-space threading environment. But don't actually use the code.
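For illustration, here is a minimal sketch of option 2. The names my_read and yield_until_readable are made up for the example; in a real user-level threading library the latter would switch to another ready thread and resume this one when the fd becomes readable, whereas this self-contained stub simply poll()s:

#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Stub: a real scheduler would park this user-level thread, run others,
 * and resume it when a central poll/epoll loop sees fd readable. */
static void yield_until_readable(int fd)
{
    struct pollfd p = { .fd = fd, .events = POLLIN };
    poll(&p, 1, -1);
}

/* Intercepted read(): never lets the underlying kernel thread block. */
static ssize_t my_read(int fd, void *buf, size_t len)
{
    fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);
    for (;;) {
        ssize_t n = read(fd, buf, len);
        if (n >= 0)
            return n;
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;
        yield_until_readable(fd);   /* here you would schedule another user-level thread */
    }
}

int main(void)
{
    char buf[256];
    ssize_t n = my_read(STDIN_FILENO, buf, sizeof buf);
    return n >= 0 ? 0 : 1;
}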

How Blocking IO Affects A Multithreaded Application/Service In Linux

I am exploring several concepts for a web crawler in C on Linux. To decide whether I'll use blocking I/O, multiplexed I/O, AIO, some combination, etc., I especially need to know (I probably should discover it for myself practically via some test code, but for expediency I prefer to hear from others): when a call to I/O in blocking mode is made, is it the particular thread (assuming a multithreaded app/service) or the whole process that is blocked? Even more specifically, in a multithreaded (POSIX) app/service, can a thread dedicated to remote reads/writes block the entire process? If so, how can I unblock such a thread without terminating the entire process?
NB: Whether or not I should use blocking/nonblocking is not really the question here.
Kindly
Blocking calls block only the thread that made them, not the entire process.
Whether to use blocking I/O (with one socket per thread) or non-blocking I/O (with each thread managing multiple sockets) is something you are going to have to benchmark. But as a rule of thumb...
Linux handles multiple threads reasonably efficiently. So if you are only handling a few dozen sockets, using one thread for each is easy to code and should perform well. If you are handling hundreds of sockets, it is a closer call. And for thousands of sockets, you are almost certainly better off using one thread (or process) to manage large groups.
In the latter case, for optimal performance you probably want to use epoll, even though it is Linux-specific.
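To illustrate the latter case, here is the skeleton of an epoll-based loop (a minimal sketch; stdin stands in for the many sockets a crawler would register, and note that epoll_ctl rejects regular files with EPERM):

#include <sys/epoll.h>
#include <unistd.h>

int main(void)
{
    int ep = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = STDIN_FILENO };
    /* A crawler would add hundreds or thousands of sockets here. */
    epoll_ctl(ep, EPOLL_CTL_ADD, STDIN_FILENO, &ev);

    struct epoll_event events[64];
    for (;;) {
        int n = epoll_wait(ep, events, 64, -1);   /* one thread, many descriptors */
        for (int i = 0; i < n; i++) {
            char buf[4096];
            ssize_t r = read(events[i].data.fd, buf, sizeof buf);
            if (r <= 0) {
                close(ep);
                return 0;
            }
            write(STDOUT_FILENO, buf, r);
        }
    }
}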

Linux aio (not posix) examples?

Does anyone have experience with the Linux aio functions (io_*, not POSIX aio)? It would be great if someone could provide a link to some examples (or provide some examples here). Also, what are your general observations/comments about their use?
I am working on an I/O library and someone suggested I have a look at them. They are known to perform better than POSIX aio in certain cases and I would like to have a look.
Thanks.
Update: this shows an example of the native Linux AIO interface.
(This is an example of the POSIX AIO interface.)
As to some of the commenters on the question: the aio library allows a program to issue multiple parallel requests in a way that lets the kernel execute them in whatever order is most efficient for seeks and disk rotation -- i.e. the I/O requests may not be executed in the order in which they were issued, which is different from making synchronous requests in a thread. In very I/O-intensive applications this can dramatically increase I/O performance, but for most applications it will just add complexity.
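Since the question asks for examples, below is a minimal sketch using the libaio wrapper around the native interface (link with -laio; the raw io_setup/io_submit/io_getevents syscalls can also be invoked directly). Note that, as discussed elsewhere on this page, without O_DIRECT the submission may effectively execute synchronously:

#include <fcntl.h>
#include <libaio.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    io_context_t ctx;
    memset(&ctx, 0, sizeof ctx);
    if (io_setup(32, &ctx) < 0) {              /* libaio returns negative errno */
        fprintf(stderr, "io_setup failed\n");
        return 1;
    }

    int fd = open("/etc/hostname", O_RDONLY);  /* O_DIRECT would need aligned buffers */
    static char buf[4096];
    struct iocb cb;
    struct iocb *cbs[1] = { &cb };
    io_prep_pread(&cb, fd, buf, sizeof buf, 0);

    if (io_submit(ctx, 1, cbs) != 1) {
        fprintf(stderr, "io_submit failed\n");
        return 1;
    }

    struct io_event ev;
    io_getevents(ctx, 1, 1, &ev, NULL);        /* block until the one request completes */
    printf("read %lld bytes\n", (long long)ev.res);

    io_destroy(ctx);
    return 0;
}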

libevent2 and file io

I've been toying around with libevent2, and I've got reading files working, but it blocks. Is there any way to make file reading not block within libevent alone? Or do I need to use another I/O library for files and have it pump the events I need?
fd = open("/tmp/hello_world", O_RDONLY);
evbuffer_read(buf, fd, 4096);
The O_NONBLOCK flag doesn't work either.
In POSIX, disks are considered "fast devices", meaning that they always block (which is why O_NONBLOCK didn't work for you). Only network sockets can be non-blocking.
There is AIO, but e.g. on Linux the kernel-native implementation comes with a bunch of restrictions making it unsuitable for general-purpose usage (it works only with O_DIRECT, and I/O must be sector-aligned).
If you want to integrate normal POSIX I/O into an asynchronous event loop, it seems people resort to thread pools, where the blocking syscalls are executed in the background by one of the worker threads. One example of such a library is libeio.
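The core trick such libraries use can be sketched in a few lines (illustrative only, not libeio's actual code; compile with -lpthread): a worker thread performs the blocking pread(), then writes a byte to a pipe, and the event loop watches the pipe's read end alongside its sockets:

#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <sys/select.h>
#include <unistd.h>

static int notify_pipe[2];
static char filebuf[4096];
static ssize_t file_result;

static void *blocking_reader(void *arg)
{
    int fd = *(int *)arg;
    file_result = pread(fd, filebuf, sizeof filebuf, 0); /* blocks here, not in the loop */
    write(notify_pipe[1], "x", 1);                       /* wake the event loop */
    return NULL;
}

int main(void)
{
    pipe(notify_pipe);
    int fd = open("/tmp/hello_world", O_RDONLY);
    pthread_t t;
    pthread_create(&t, NULL, blocking_reader, &fd);

    for (;;) {
        fd_set rfds;
        FD_ZERO(&rfds);
        FD_SET(notify_pipe[0], &rfds);   /* real code would add its sockets too */
        select(notify_pipe[0] + 1, &rfds, NULL, NULL, NULL);
        if (FD_ISSET(notify_pipe[0], &rfds)) {
            char c;
            read(notify_pipe[0], &c, 1);
            printf("file read completed: %zd bytes\n", file_result);
            break;
        }
    }
    pthread_join(t, NULL);
    return 0;
}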
No.
I've yet to see a *nix where you can do non-blocking I/O on regular files without resorting to the separate AIO library (though for some, e.g. Solaris, O_NONBLOCK has an effect if someone else holds a lock on the file).
Please take a look at libuv, which is used by node.js / io.js: https://github.com/libuv/libuv
It's a good alternative to libeio, because it does perform well on all major operating systems, from Windows to the BSDs, Mac OS X and of course Linux.
It supports I/O completion ports, which makes it a better choice than libeio if you are targeting Windows.
The C code is also very readable and I highly recommend this tutorial: https://nikhilm.github.io/uvbook/
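For reference, here is a minimal sketch of reading a file through libuv's filesystem API (compile with -luv); the read is executed on libuv's internal worker pool and the callback fires back on the loop thread, which is exactly the thread-pool approach discussed above:

#include <fcntl.h>
#include <stdio.h>
#include <uv.h>

static char buf[1024];
static uv_buf_t iov;
static uv_fs_t open_req, read_req;

static void on_read(uv_fs_t *req)
{
    if (req->result > 0)
        fwrite(buf, 1, (size_t)req->result, stdout);
    uv_fs_req_cleanup(req);
}

static void on_open(uv_fs_t *req)
{
    if (req->result >= 0) {
        iov = uv_buf_init(buf, sizeof buf);
        /* The read runs on libuv's worker pool; on_read fires on the loop. */
        uv_fs_read(uv_default_loop(), &read_req, (uv_file)req->result,
                   &iov, 1, 0, on_read);
    }
    uv_fs_req_cleanup(req);
}

int main(void)
{
    uv_fs_open(uv_default_loop(), &open_req, "/tmp/hello_world",
               O_RDONLY, 0, on_open);
    return uv_run(uv_default_loop(), UV_RUN_DEFAULT);
}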
