I am a bit confused at the moment about select(2), which states in the summary:
select, pselect, FD_CLR, FD_ISSET, FD_SET, FD_ZERO - synchronous I/O multiplexing
As far as I am aware, many libraries and programs such as libuv and nodejs use select/epoll/kqueue/iocp for their event loop, which in turn backs their async/await feature (and async I/O?).
So, what exactly does synchronous multiplexing mean? Can I achieve async I/O using select? What exactly is the difference between synchronous multiplexing and asynchronous multiplexing?
You've got a parse error there. It is not synchronous multiplexing but multiplexing of synchronous I/O: select is used to multiplex synchronous I/O calls. read and write and such are called synchronous I/O because they will either block until the transfer is complete, or not do the transfer (non-blocking non-ready sockets for example).
This can be contrasted with truly asynchronous calls where the system call just initiates the transfer and it is completed in the background and a notification is given after the completion.
nodejs and libuv are different beasts. Even though the I/O in C is possibly multiplexed and synchronous, it will appear as asynchronous to them: there are no blocking synchronous read calls because it all happens transparently on the C/library side.
So, what exactly does synchronous multiplexing mean?
Synchronous operations are distinguished from asynchronous ones in that the former do not allow the caller to continue until they complete, whereas the latter do. Software (a)synchronicity is closely related to multithreading, and the main characteristic of select() that makes its operation synchronous instead of asynchronous is that it works entirely within a single (user) thread of execution. When you call select() your thread blocks until either one of the file descriptors you specified becomes ready, or the timeout you specified expires.
The alternative would be a programming model where you register interest in I/O on the file descriptors, then come back later to check whether they are ready.
It should be noted, however, that although select() is certainly synchronous itself, the multiplexing is mostly up to the programmer. select() provides a means to achieve it, but performs no I/O itself. Its essential brilliance is in giving you the information you need to avoid blocking on I/O for one file descriptor while a different one is ready to be serviced.
Can I achieve async I/O using select?
No, select doesn't do anything to particularly facilitate asynchronous I/O. It helps you handle multiple I/O channels efficiently via a single thread, but that thread operates synchronously. This nevertheless tends to be a big win, because I/O is very slow, and that slowness is mostly associated with I/O peripherals and media, not CPU and memory. Generally speaking, a single thread has plenty of processing power to handle multiple I/O channels as long as it chooses wisely which ones to handle at any given opportunity, and select() facilitates that.
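To make the "single thread, several channels" point concrete, here is a minimal sketch of waiting on several descriptors with select(). The helper name wait_for_readable is made up for this example and is not part of any library; it only reports which descriptor is ready and leaves the actual read() to the caller.

```c
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

/* Block until at least one of the n descriptors in fds is readable, or until
 * timeout_ms elapses. Returns the first ready descriptor, or -1 on timeout
 * or error. (wait_for_readable is a hypothetical helper name.) */
int wait_for_readable(const int *fds, int n, int timeout_ms)
{
    fd_set readable;
    struct timeval tv;
    int maxfd = -1;

    FD_ZERO(&readable);
    for (int i = 0; i < n; i++) {
        /* select() cannot handle descriptors at or above FD_SETSIZE. */
        assert(fds[i] >= 0 && fds[i] < FD_SETSIZE);
        FD_SET(fds[i], &readable);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* The calling thread sleeps here: this is the synchronous part. */
    if (select(maxfd + 1, &readable, NULL, NULL, &tv) <= 0)
        return -1;

    /* select() did no I/O; it only tells us where read() is worth trying. */
    for (int i = 0; i < n; i++)
        if (FD_ISSET(fds[i], &readable))
            return fds[i];
    return -1;
}
```

The caller would then read() from the returned descriptor and call the helper again, which is the loop that event libraries build on.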
Related
I am trying to understand how non-blocking network IO is working in Node.js/libuv. I already found out that file IO is done using libuv worker threads (thus, in a background thread). However it is stated in various places that network IO is done in a non-blocking fashion using system calls like epoll, kqueue, etc (depending on operating system).
Now I am wondering if this means that the actual IO part (read()) is still done on the main thread, and thus blocking, even if e.g. epoll is used? As far as I understand, epoll only notifies about available events, but does not actually do the read/write. At least in the examples I found (e.g. http://davmac.org/davpage/linux/async-io.html) epoll is always used in combination with the read system call, which is a blocking IO operation.
In other words, if libuv is using a single thread and epoll to get a notification when data is available to read, is the subsequent read operation executed on the main thread, and thus potentially blocking other operations (thinking of network requests) on the main thread?
File descriptors referring to regular files are always reported as ready for read/write by poll/select (epoll refuses to watch them at all: epoll_ctl fails with EPERM). A read or write on them may nevertheless block while the data is fetched from or flushed to disk. This is why file I/O must be done in a separate thread.
By contrast, non-blocking send/recv on pipes and sockets is truly non-blocking and hence can be done in the I/O thread without risk of blocking that thread.
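A small sketch of the difference, assuming Linux or another POSIX system. The helper names readable_now, set_nonblocking and try_read are invented for this example. A regular file polls as readable even when empty, while a non-blocking pipe with nothing in it makes read() return immediately with EAGAIN instead of waiting.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Returns 1 if poll() reports fd readable right now (zero timeout), else 0. */
int readable_now(int fd)
{
    struct pollfd p = { .fd = fd, .events = POLLIN, .revents = 0 };
    return poll(&p, 1, 0) == 1 && (p.revents & POLLIN) != 0;
}

/* Switch fd to non-blocking mode. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Try to read without waiting. Returns the number of bytes read (0 means
 * end of file), -1 if nothing is available yet, or -2 on a real error. */
long try_read(int fd, void *buf, size_t len)
{
    assert(buf != NULL);
    ssize_t n = read(fd, buf, len);
    if (n >= 0)
        return (long)n;
    return (errno == EAGAIN || errno == EWOULDBLOCK) ? -1 : -2;
}
```

For a regular file, readable_now() says nothing about whether the following read() has to wait for the disk, which is the reason given above for pushing file I/O to worker threads.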
I'm going to use aio for async reads. When an aio request completes and the signal handler is triggered, I may need to issue another aio_read call and proceed.
aio_read isn't listed among the async-signal-safe functions (in man signal). Ordinary read is, though.
What are the dangers of doing subsequent aio_read calls inside the aio signal handler?
As the author of proposed Boost.AFIO which can make use of POSIX AIO, I strongly recommend against using POSIX AIO at all. I am hardly alone in this opinion; @arvid is similarly against it: http://blog.libtorrent.org/2012/10/asynchronous-disk-io/. The API itself is poorly designed and as a result scales poorly with load unless you use OS-specific alternatives or extensions to AIO like BSD kqueues. POSIX AIO is essentially useless as-is.
Additionally, AIO calls are not signal safe on Linux, which you are probably using. This is because on Linux they are implemented in userspace using an emulation based on a threadpool. On BSD, AIO calls have a proper kernel syscall interface, but in the kernel turn into - yes you guessed it - a threadpool based emulation unless O_DIRECT is turned on.
You are therefore much better off on POSIX simply always using a threadpool unless all your i/o is with O_DIRECT on. If O_DIRECT is indeed always on, Linux provides a custom kernel API detailed at http://man7.org/linux/man-pages/man2/io_submit.2.html which is fairly effective, and on BSD if you replace signal driven handling with BSD kqueues (https://www.freebsd.org/cgi/man.cgi?kqueue, see EVFILT_AIO) then with O_DIRECT things can also scale well, better than a threadpool anyway.
Use of signal based completion handling on ANY POSIX platform has dreadful performance. AFIO v2 provides a generic POSIX AIO backend, and it is dreadful, dreadful, dreadful. Avoid like the plague.
Note that a threadpooled synchronous API design is portable, scales well for most use cases, and is what I (and indeed arvid) would recommend to anybody without highly specialised needs like writing a database backend where you need very tight control over the physical storage layer, and anything but O_DIRECT|O_SYNC isn't an option.
Ok, all that said, if you really really want to use signal driven aio, I assume this is because you want to multiplex your file i/o with non-file i/o stuff and you therefore can't use aio_suspend() which is the proper API for doing this. The way AFIO v2 handles this is to use a realtime signal to interrupt aio_suspend() when something not aio related needs to be processed, it can then be handled and aio_suspend() restarted. You need to be very careful in handling races and deadlocks, and you'll need to carefully mask and unmask the signal for the thread calling aio_suspend() lest the realtime signal gets lost and you get a lost wakeup. All in all, it's not worth it for the typically much lower i/o performance you get over a threadpool + synchronous APIs.
When writing a non-blocking program (handling multiple sockets) which at a certain point needs to open files using open(2), stat(2) files or open directories using opendir(2), how can I ensure that the system calls do not block?
To me it seems that there's no other alternative than using threads or fork(2).
As Mel Nicholson replied, for everything file descriptor based you can use select/poll/epoll. For everything else you can have a proxy thread per item (or a thread pool) with a small stack that converts (by means of the kernel scheduler) any synchronous blocking wait into a select/poll/epoll-able asynchronous event using eventfd or a unix pipe (where portability is required).
The proxy thread shall block till the operation completes and then write to the eventfd or to the pipe to wake up the select/poll/epoll.
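The proxy-thread idea can be sketched roughly as follows, using a pipe for the wake-up and stat() as the blocking call. Everything named stat_job here is hypothetical, the thread uses the default stack size rather than a small one, and on older glibc versions you may need to build with -pthread.

```c
#include <assert.h>
#include <poll.h>
#include <pthread.h>
#include <sys/stat.h>
#include <unistd.h>

/* One offloaded request (all names are made up for this sketch). */
struct stat_job {
    const char *path;   /* input: file to stat */
    int notify_fd;      /* input: write end of a pipe the event loop polls */
    int result;         /* output: return value of stat() */
    struct stat st;     /* output: file metadata */
};

/* Runs in the proxy thread: block in stat(), then wake the event loop. */
static void *stat_worker(void *arg)
{
    struct stat_job *job = arg;
    char done = 1;
    job->result = stat(job->path, &job->st);   /* may block on disk */
    ssize_t w = write(job->notify_fd, &done, 1);
    (void)w;
    return NULL;
}

/* Start the job in its own thread. Returns 0 on success, like pthread_create. */
int stat_job_start(struct stat_job *job, pthread_t *tid)
{
    assert(job != NULL && job->path != NULL);
    return pthread_create(tid, NULL, stat_worker, job);
}

/* Event-loop side: wait up to timeout_ms for the completion byte.
 * Returns 1 if the job has finished, 0 on timeout or error. */
int stat_job_wait(int read_fd, int timeout_ms)
{
    struct pollfd p = { .fd = read_fd, .events = POLLIN, .revents = 0 };
    char done;
    if (poll(&p, 1, timeout_ms) != 1 || !(p.revents & POLLIN))
        return 0;
    return read(read_fd, &done, 1) == 1;
}
```

In a real event loop the read end of the pipe would simply sit in the same poll/epoll set as the sockets. Joining the thread before touching job->result keeps the hand-over of the result well defined.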
Indeed there is no other method.
Actually there is another kind of blocking that can't be dealt with other than by threads, and that is page faults. Those may happen in program code, program data, memory allocation or data mapped from files. It's almost impossible to avoid them (you can lock some pages into memory, but that's a privileged operation and would probably backfire by making the kernel do a poor job of memory management somewhere else). So:
You can't really weed out every last chance of blocking for a particular client, so don't bother with the likes of open and stat. The network will probably add larger delays than these functions anyway.
For optimal performance you should have enough threads so that some can be scheduled while the others are blocked on a page fault or a similarly unavoidable blocking point.
Also, if you need to read and process, or process and write, data while handling a network request, it's faster to access the file using memory-mapping, but that's blocking and can't be made non-blocking. So modern network servers tend to stick with the blocking calls for most stuff and simply have enough threads to keep the CPU busy while other threads are waiting for I/O.
The fact that most modern servers are multi-core is another reason why you need multiple threads anyway.
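For illustration, this is roughly what the memory-mapped access mentioned above looks like. sum_file_bytes is a made-up helper that just adds up the bytes of a file; the point is that there is no read() call to make non-blocking, because any blocking happens implicitly when a page is first touched.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Sum every byte of a file through a read-only mapping.
 * Returns 0 on success and stores the sum in *out, or -1 on error. */
int sum_file_bytes(const char *path, unsigned long *out)
{
    struct stat st;
    unsigned char *p;
    int fd;

    assert(path != NULL && out != NULL);
    fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    if (fstat(fd, &st) == -1) {
        close(fd);
        return -1;
    }
    *out = 0;
    if (st.st_size == 0) {          /* mmap() rejects a zero length */
        close(fd);
        return 0;
    }
    p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping outlives the descriptor */
    if (p == MAP_FAILED)
        return -1;

    /* The first touch of each page may fault and wait for the disk. */
    for (off_t i = 0; i < st.st_size; i++)
        *out += p[i];

    munmap(p, (size_t)st.st_size);
    return 0;
}
```

The loop looks like plain memory access, so the only way to keep a server responsive while it runs is to have other threads available, as described above.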
You can use the poll() call to check any number of sockets for data using a single thread.
See here for linux details, or man poll for the details on your system.
open() and stat() will block the thread they are called from on all POSIX-compliant systems unless called via an asynchronous tactic (such as in a forked child).
I am exploring several designs for a web crawler in C on Linux. To decide whether I'll use blocking IO, multiplexed IO, AIO, some combination, etc., I especially need to know (I should probably discover it for myself via some test code, but for expediency I'd prefer to learn from others): when a blocking IO call is made, is it the particular thread (assuming a multithreaded app/service) or the whole process that is blocked? Even more specifically, in a multithreaded (POSIX) app/service, can a thread dedicated to remote reads/writes block the entire process? If so, how can I unblock such a thread without terminating the entire process?
NB: Whether or not I should use blocking/nonblocking is not really the question here.
Kindly
Blocking calls block only the thread that made them, not the entire process.
Whether to use blocking I/O (with one socket per thread) or non-blocking I/O (with each thread managing multiple sockets) is something you are going to have to benchmark. But as a rule of thumb...
Linux handles multiple threads reasonably efficiently. So if you are only handling a few dozen sockets, using one thread for each is easy to code and should perform well. If you are handling hundreds of sockets, it is a closer call. And for thousands of sockets, you are almost certainly better off using one thread (or process) to manage large groups.
In the latter case, for optimal performance you probably want to use epoll, even though it is Linux-specific.
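A minimal sketch of the epoll pattern, Linux only. The helper names watch_for_input and collect_ready are invented here; a real server would also handle EPOLLHUP/EPOLLERR and remove closed descriptors.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Create an epoll instance watching the n descriptors in fds for input.
 * Returns the epoll descriptor, or -1 on error. */
int watch_for_input(const int *fds, int n)
{
    int ep = epoll_create1(0);
    if (ep == -1)
        return -1;
    for (int i = 0; i < n; i++) {
        struct epoll_event ev = { .events = EPOLLIN };
        ev.data.fd = fds[i];        /* handed back to us by epoll_wait() */
        if (epoll_ctl(ep, EPOLL_CTL_ADD, fds[i], &ev) == -1) {
            close(ep);
            return -1;
        }
    }
    return ep;
}

/* Wait up to timeout_ms and store up to max ready descriptors in ready[].
 * Returns how many were stored (0 on timeout), or -1 on error. */
int collect_ready(int ep, int *ready, int max, int timeout_ms)
{
    struct epoll_event events[64];
    int n;

    assert(ready != NULL && max > 0);
    if (max > 64)
        max = 64;
    n = epoll_wait(ep, events, max, timeout_ms);
    for (int i = 0; i < n; i++)
        ready[i] = events[i].data.fd;
    return n;
}
```

Unlike select, the set of watched descriptors is registered once and lives in the kernel, so each wait only returns the descriptors that are actually ready.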
Can someone explain to me how event-driven IO system calls like select, poll, and epoll relate to blocking vs non-blocking IO?
I don't understand how related, if at all, these concepts are.
The select system call is supported in almost all Unixes and provides means for userland applications to watch over a group of descriptors and get information about which subset of this group is ready for reading/writing. Its particular interface is a bit clunky and the implementation in most kernels is mediocre at best.
epoll is provided only in Linux for the same purpose, but is a huge improvement over select in terms of efficiency and programming interface. Other Unixes have their specialised calls too.
That said, the event-driven IO system calls do not require either blocking or non-blocking descriptors. Blocking is a behaviour that affects system calls like read, write, accept and connect. select and epoll_wait do have blocking timeouts, but that is something unrelated to the descriptors.
Of course, using these event-driven system calls with blocking descriptors is a bit odd, because you would expect to be able to read the data immediately, without blocking, after you have been notified that it is available. Relying on a blocking descriptor never to block after you have been notified of its readiness is a bit risky, because race conditions are possible.
Non-blocking, event-driven IO can make server applications vastly more efficient because threads are not needed for each descriptor (connection). Compare the Apache web server to Nginx or Lighttpd in terms of performance and you'll see the benefit.
They're largely unrelated, except that you may want to use non-blocking file descriptors with event-driven IO for the following reasons:
Old versions of Linux definitely have bugs in the kernel where read can block even after select indicated a socket was readable (it happened with UDP sockets and packets with bad checksums). Current versions of Linux may still have some such bugs; I'm not sure.
If there's any possibility that other processes have access to your file descriptors and will read/write to them, or if your program is multi-threaded and other threads might do so, then there is a race condition between select determining that the file descriptor is readable/writable and your program performing IO on it, which could result in blocking.
You almost surely want to make a socket non-blocking before calling connect; otherwise you'll block until the connection is made. Use select for writing to determine when the connection attempt has finished, then check SO_ERROR with getsockopt to determine whether it succeeded or failed.
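The last point can be sketched like this. The function names start_connect, finish_connect and connect_loopback are hypothetical, and the loopback wrapper exists only to show the two steps used together. The sketch asks the socket for SO_ERROR once select reports it writable, since writability only means the attempt has finished.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Begin a connect() that never blocks. Returns 0 if the connection is
 * established or in progress, -1 on immediate failure. */
int start_connect(int sock, const struct sockaddr *addr, socklen_t len)
{
    int flags = fcntl(sock, F_GETFL);
    if (flags == -1 || fcntl(sock, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;
    if (connect(sock, addr, len) == 0 || errno == EINPROGRESS)
        return 0;
    return -1;
}

/* Wait up to timeout_ms for the attempt to finish. Returns 0 once connected,
 * -1 on timeout or select() failure, otherwise the errno value reported by
 * the socket (for example ECONNREFUSED). */
int finish_connect(int sock, int timeout_ms)
{
    fd_set writable;
    struct timeval tv;
    int err = 0;
    socklen_t errlen = sizeof err;

    assert(sock >= 0 && sock < FD_SETSIZE);
    FD_ZERO(&writable);
    FD_SET(sock, &writable);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    if (select(sock + 1, NULL, &writable, NULL, &tv) <= 0)
        return -1;
    /* Writable means "finished", not "succeeded": ask the socket. */
    if (getsockopt(sock, SOL_SOCKET, SO_ERROR, &err, &errlen) == -1)
        return errno;
    return err;
}

/* Example use: connect to 127.0.0.1:port. Returns the socket or -1. */
int connect_loopback(unsigned short port, int timeout_ms)
{
    struct sockaddr_in sa = { 0 };
    int sock = socket(AF_INET, SOCK_STREAM, 0);
    if (sock == -1)
        return -1;
    sa.sin_family = AF_INET;
    sa.sin_port = htons(port);
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (start_connect(sock, (struct sockaddr *)&sa, sizeof sa) == -1 ||
        finish_connect(sock, timeout_ms) != 0) {
        close(sock);
        return -1;
    }
    return sock;
}
```

In an event loop you would not call finish_connect with a timeout of its own; the socket would go into the write set of the main select/poll call and the SO_ERROR check would run when it is reported writable.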
select and similar functions (you mentioned a few) are usually used to implement an event loop in an event driven system.
I.e., instead of read()ing directly from a socket or file -- potentially blocking if no data is available -- the application calls select() on multiple file descriptors, waiting for data to be available on any one of them.
When a file descriptor becomes ready, data is normally available and the read() operation should not block.
This is one way of processing data from multiple sources simultaneously without resorting to multiple threads.