Understanding libuv / epoll / non-blocking network IO - c

I am trying to understand how non-blocking network IO works in Node.js/libuv. I have already found out that file IO is done using libuv worker threads (thus, in a background thread). However, it is stated in various places that network IO is done in a non-blocking fashion using system calls like epoll, kqueue, etc. (depending on the operating system).
Now I am wondering whether this means that the actual IO part (read()) is still done on the main thread, and is thus blocking, even if e.g. epoll is used. As far as I understand, epoll only notifies about available events but does not actually perform the read/write. At least in the examples I found (e.g. http://davmac.org/davpage/linux/async-io.html), epoll is always used in combination with the read system call, which is a blocking IO operation.
In other words: if libuv uses a single thread and epoll to be notified when data is available to read, is the subsequent read operation then executed on the main thread, potentially blocking other operations (such as network requests) on that thread?

File descriptors referring to regular files are always reported as ready for read/write by epoll/poll/select; however, the read/write itself may still block waiting for data to be read/written. This is why file I/O must be done in a separate thread.
Non-blocking send/recv on pipes and sockets, by contrast, are truly non-blocking and hence can be done on the I/O thread without risk of blocking it.
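For illustration, here is a minimal sketch of that model (assumptions: Linux, an already-listening socket listen_fd that has been set to O_NONBLOCK, and error handling abbreviated for brevity):

    #include <sys/epoll.h>
    #include <sys/socket.h>
    #include <unistd.h>
    #include <errno.h>

    void event_loop(int listen_fd)
    {
        int epfd = epoll_create1(0);
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = listen_fd };
        epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);

        for (;;) {
            struct epoll_event events[64];
            int n = epoll_wait(epfd, events, 64, -1);  /* the only place we block */
            for (int i = 0; i < n; i++) {
                int fd = events[i].data.fd;
                if (fd == listen_fd) {
                    /* Readiness was reported, so accept() will not hang.
                     * A real server would also set the client socket to
                     * O_NONBLOCK here. */
                    int client = accept(listen_fd, NULL, NULL);
                    struct epoll_event cev = { .events = EPOLLIN, .data.fd = client };
                    epoll_ctl(epfd, EPOLL_CTL_ADD, client, &cev);
                } else {
                    /* recv() runs on this same thread, but on a non-blocking
                     * socket it returns immediately with data or with
                     * EAGAIN/EWOULDBLOCK. */
                    char buf[4096];
                    ssize_t r = recv(fd, buf, sizeof buf, 0);
                    if (r == 0 || (r < 0 && errno != EAGAIN && errno != EWOULDBLOCK)) {
                        epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL);
                        close(fd);
                    }
                }
            }
        }
    }

The key point: the same thread does both the waiting (in epoll_wait) and the reading, yet never stalls on a slow peer, because the sockets are non-blocking.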

Related

Non blocking sockets when using I/O multiplexing

Should I use non-blocking or blocking TCP sockets when using an I/O multiplexing API like poll(2) or epoll(2)?
Some people suggest using non-blocking sockets here, but the I/O multiplexing APIs inform you anyway when there is data to read, so what is wrong with a blocking socket here?
If your TCP server is single-threaded and uses blocking I/O, then it's likely that any client that connects to it will be able to deny service to all of the other clients simply by sending only a partial message, or alternatively by refusing to read any data from its TCP socket after the server sends data. In the former case, the server may block for a long time (perhaps forever) waiting for the entire message to be received from the client; during that time, the server will not be able to respond to other clients. In the latter case, the server will block for a long time (perhaps forever) waiting for the client to read some TCP data so that the server-side socket's send buffer can be drained enough to fit some more outgoing data for that client.
One way to avoid that problem is to set all of the server's sockets to non-blocking I/O mode; that way the server knows it can never get "stuck" inside a recv() or a send() call, and thus can remain responsive to all clients regardless of whether any particular client is behaving nicely, or not. In the non-blocking design, the only place the server ever blocks is inside select() or poll() or similar, because those calls are designed to return whenever any client needs service, rather than blocking on only a single client. (the tradeoff is that with non-blocking I/O your server's buffering/queueing logic will need to be a bit more elaborate, since you can no longer assume that any particular fixed number of bytes will be sent or received during any given send or receive operation)
The other way to avoid the problem is to make a multi-threaded server; that has the advantage that each client gets its own thread, and therefore a badly-behaved client will block only its own thread and not the threads servicing other clients. The disadvantage is that now your server is multi-threaded, with all of the additional pitfalls that multithreading introduces.
(and, for completeness, the third approach is simply to ignore the possibility of badly-behaved/poorly-connected clients, and use a single-threaded/blocking model. That works fine for toy examples where clients are expected to be non-hostile, and where the network they are connecting over is reliable, but doesn't work so well in real life)
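Returning to the first (non-blocking) approach: sockets are typically switched to non-blocking mode with fcntl(2). A small sketch of one common idiom (not the only one):

    #include <fcntl.h>

    /* Put a descriptor into non-blocking mode; returns -1 on failure. */
    int set_nonblocking(int fd)
    {
        int flags = fcntl(fd, F_GETFL, 0);
        if (flags < 0)
            return -1;
        return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
    }

After this, recv() and send() on the descriptor return -1 with errno set to EAGAIN/EWOULDBLOCK instead of stalling the thread.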
Non-blocking IO is used when you prefer an error response (EWOULDBLOCK / EAGAIN) over your thread waiting (blocking) until an IO operation becomes possible.
This leads to the question of how the IO multiplexing is achieved.
If you're using a thread-per-connection model (or a process-per-connection), using blocking IO might be more comfortable.
However, if the same thread is serving multiple IO objects, blocking IO would be hazardous and could bring the whole application to a halt.
It is better to use non-blocking IO when a single thread serves multiple IO objects.
Note that the issue might not be noticeable at first when polling (using select/poll or epoll/kqueue), since the IO operations are then only performed by a code path that already "knows" the operation will not block (the descriptor was polled and reported as ready). This can mask the fact that somewhere else in the code an IO operation might be called directly without polling first, resulting in a blocking call that grinds the application to a halt.
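To make the EWOULDBLOCK/EAGAIN pattern concrete, here is a hedged sketch of a read loop on a non-blocking descriptor; drain_fd and the eof out-parameter are illustrative names, not a standard API:

    #include <errno.h>
    #include <unistd.h>

    /* Read whatever is currently available from a non-blocking fd.
     * Returns bytes read (0 if nothing was pending), -1 on a real error;
     * sets *eof to 1 if the peer closed the connection. */
    ssize_t drain_fd(int fd, char *buf, size_t cap, int *eof)
    {
        size_t total = 0;
        *eof = 0;
        while (total < cap) {
            ssize_t r = read(fd, buf + total, cap - total);
            if (r > 0) {
                total += (size_t)r;
            } else if (r == 0) {
                *eof = 1;              /* end of stream */
                break;
            } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
                break;                 /* no more data right now; not an error */
            } else if (errno == EINTR) {
                continue;              /* interrupted by a signal; retry */
            } else {
                return -1;             /* genuine I/O error */
            }
        }
        return (ssize_t)total;
    }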

Why is select(2) called "synchronous" multiplexing?

I am a bit confused at the moment about select(2), whose man page summary states:
select, pselect, FD_CLR, FD_ISSET, FD_SET, FD_ZERO - synchronous I/O multiplexing
As far as I am aware, many libraries and programs such as libuv and nodejs use select/epoll/kqueue/iocp for their event loops, which in turn power their async/await features (and async I/O?).
So, what exactly does synchronous multiplexing mean? Can I achieve async I/O using select? What exactly is the difference between synchronous multiplexing and asynchronous multiplexing?
You've got a parse error there. It is not synchronous multiplexing but multiplexing of synchronous I/O: select is used to multiplex synchronous I/O calls. read, write and the like are called synchronous I/O because they either block until the transfer is complete, or do not perform the transfer at all (for example, a non-blocking socket that is not ready).
This can be contrasted with truly asynchronous calls where the system call just initiates the transfer and it is completed in the background and a notification is given after the completion.
nodejs and libuv are different beasts. Even though the underlying I/O in C is multiplexed and synchronous, it appears asynchronous to them: there are no blocking synchronous read calls at their level, because it all happens transparently on the C/library side.
So, what exactly does synchronous multiplexing mean?
Synchronous operations are distinguished from asynchronous ones in that the former do not allow the caller to continue until they complete, whereas the latter do. Software (a)synchronicity is closely related to multithreading, and the main characteristic of select() that makes its operation synchronous instead of asynchronous is that it works entirely within a single (user) thread of execution. When you call select() your thread blocks until either one of the file descriptors you specified becomes ready, or the timeout you specified expires.
The alternative would be a programming model where you register interest in I/O on the file descriptors, then come back later to check whether they are ready.
It should be noted, however, that although select() is certainly synchronous itself, the multiplexing is mostly up to the programmer. select() provides a means to achieve it, but performs no I/O itself. Its essential brilliance is in giving you the information you need to avoid blocking while trying to do I/O on one file descriptor when a different one is ready to be serviced.
Can I achieve async I/O using select?
No, select doesn't do anything to particularly facilitate asynchronous I/O. It helps you handle multiple I/O channels efficiently via a single thread, but that thread operates synchronously. This nevertheless tends to be a big win, because I/O is very slow, and that slowness is mostly associated with I/O peripherals and media, not CPU and memory. Generally speaking, a single thread has plenty of processing power to handle multiple I/O channels as long as it chooses wisely which ones to handle at any given opportunity, and select() facilitates that.
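For concreteness, here is a minimal sketch of such a single-threaded select() loop (handle_input is a placeholder for application logic; error handling abbreviated):

    #include <sys/select.h>
    #include <unistd.h>

    void handle_input(int fd, const char *buf, size_t len);  /* placeholder */

    /* Multiplex synchronous reads over nfds descriptors in fds[]. */
    void multiplex(const int *fds, int nfds)
    {
        for (;;) {
            fd_set readable;
            int maxfd = -1;

            FD_ZERO(&readable);
            for (int i = 0; i < nfds; i++) {
                FD_SET(fds[i], &readable);
                if (fds[i] > maxfd)
                    maxfd = fds[i];
            }

            /* The thread blocks here, and only here, until some
             * descriptor is ready (no timeout given). */
            if (select(maxfd + 1, &readable, NULL, NULL, NULL) < 0)
                continue;  /* e.g. EINTR; a real program would check errno */

            for (int i = 0; i < nfds; i++) {
                if (FD_ISSET(fds[i], &readable)) {
                    char buf[4096];
                    ssize_t r = read(fds[i], buf, sizeof buf);  /* ordinary synchronous read */
                    if (r > 0)
                        handle_input(fds[i], buf, (size_t)r);
                }
            }
        }
    }

Everything, including the read() calls, happens synchronously on one thread; select() merely tells that thread where its next read will not be wasted.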

Non-blocking access to the file system

When writing a non-blocking program (handling multiple sockets) which at a certain point needs to open files using open(2), stat files using stat(2), or open directories using opendir(2), how can I ensure that these system calls do not block?
To me it seems that there is no alternative other than using threads or fork(2).
As Mel Nicholson replied, for everything file-descriptor based you can use select/poll/epoll. For everything else you can have a proxy thread per item (or a thread pool) with a small stack that converts (by means of the kernel scheduler) any synchronous blocking wait into a select/poll/epoll-able asynchronous event, using an eventfd or a unix pipe (where portability is required).
The proxy thread blocks until the operation completes and then writes to the eventfd or to the pipe to wake up the select/poll/epoll loop.
Indeed there is no other method.
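A hedged sketch of that proxy-thread idea on Linux, using stat(2) as the blocking operation and an eventfd as the pollable completion signal (stat_req, stat_worker and submit_stat are illustrative names):

    #include <sys/eventfd.h>
    #include <sys/stat.h>
    #include <pthread.h>
    #include <stdint.h>
    #include <unistd.h>

    /* Illustrative request object: a blocking stat(2) done off-thread,
     * completion signalled through an eventfd the event loop can poll. */
    struct stat_req {
        const char *path;
        struct stat st;
        int result;
        int efd;        /* pollable completion notification */
    };

    static void *stat_worker(void *arg)
    {
        struct stat_req *req = arg;
        req->result = stat(req->path, &req->st);  /* may block; that's fine here */
        uint64_t one = 1;
        write(req->efd, &one, sizeof one);        /* wakes up select/poll/epoll */
        return NULL;
    }

    /* Caller side: create the eventfd, spawn the proxy thread, then add
     * req->efd to the existing select/poll/epoll set. */
    int submit_stat(struct stat_req *req, pthread_t *tid)
    {
        req->efd = eventfd(0, EFD_NONBLOCK);
        if (req->efd < 0)
            return -1;
        return pthread_create(tid, NULL, stat_worker, req);
    }

When req->efd becomes readable, the blocking stat has finished and req->result / req->st can be consumed by the event loop.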
Actually there is another kind of blocking that can't be dealt with other than by threads, and that is page faults. Those may happen in program code, program data, memory allocation or data mapped from files. It's almost impossible to avoid them (you can actually lock some pages into memory, but it's a privileged operation and would probably backfire by making the kernel do a poor job of memory management somewhere else). So:
You can't really weed out every last chance of blocking for a particular client, so don't bother with the likes of open and stat. The network will probably add larger delays than these functions anyway.
For optimal performance you should have enough threads that some can be scheduled while others are blocked on a page fault or a similar unavoidable blocking point.
Also, if you need to read and process (or process and write) data while handling a network request, it's faster to access the file using memory mapping, but that is blocking and can't be made non-blocking. So modern network servers tend to stick with the blocking calls for most things and simply have enough threads to keep the CPU busy while other threads are waiting for I/O.
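To see why the memory-mapped path cannot be made non-blocking, consider this sketch (send_file_mmap is an illustrative name; error handling abbreviated). The mmap() call itself returns quickly, but touching the pages afterwards may fault and wait on the disk:

    #include <sys/mman.h>
    #include <sys/socket.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <unistd.h>

    ssize_t send_file_mmap(int sock, const char *path)
    {
        int fd = open(path, O_RDONLY);   /* may itself block on metadata I/O */
        if (fd < 0)
            return -1;

        struct stat st;
        if (fstat(fd, &st) < 0 || st.st_size == 0) {
            close(fd);
            return -1;
        }

        void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        close(fd);
        if (p == MAP_FAILED)
            return -1;

        /* The page faults, and hence the blocking disk reads, happen here
         * as send() walks the mapping; no readiness API covers them. */
        ssize_t n = send(sock, p, (size_t)st.st_size, 0);
        munmap(p, (size_t)st.st_size);
        return n;
    }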
The fact that most modern servers are multi-core is another reason why you need multiple threads anyway.
You can use the poll() system call to check any number of sockets for data using a single thread.
See here for Linux details, or man poll for the details on your system.
open() and stat() will block in the thread they are called from on all POSIX-compliant systems, unless called via an asynchronous tactic (like in a fork).
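For the socket part, a short poll(2) sketch (MAX_FDS and the one-second timeout are arbitrary illustrative choices):

    #include <poll.h>
    #include <unistd.h>

    #define MAX_FDS 64

    /* Wait up to one second for data on any of nfds sockets. */
    void poll_sockets(const int *socks, int nfds)
    {
        struct pollfd pfds[MAX_FDS];
        if (nfds > MAX_FDS)
            nfds = MAX_FDS;

        for (int i = 0; i < nfds; i++) {
            pfds[i].fd = socks[i];
            pfds[i].events = POLLIN;
            pfds[i].revents = 0;
        }

        if (poll(pfds, (nfds_t)nfds, 1000) <= 0)
            return;  /* timeout or error */

        for (int i = 0; i < nfds; i++) {
            if (pfds[i].revents & POLLIN) {
                char buf[512];
                read(pfds[i].fd, buf, sizeof buf);  /* data is waiting */
            }
        }
    }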

Event Driven IO And Blocking vs NonBlocking

Can someone explain to me how event-driven IO system calls like select, poll, and epoll relate to blocking vs non-blocking IO?
I don't understand how these concepts are related, if at all.
The select system call is supported in almost all Unixes and provides a means for userland applications to watch over a group of descriptors and get information about which subset of that group is ready for reading/writing. Its particular interface is a bit clunky and the implementation in most kernels is mediocre at best.
epoll is provided only in Linux for the same purpose, but is a huge improvement over select in terms of efficiency and programming interface. Other Unixes have their specialised calls too.
That said, the event-driven IO system calls do not require either blocking or non-blocking descriptors. Blocking is a behaviour that affects system calls like read, write, accept and connect. select and epoll_wait do have blocking timeouts, but that is something unrelated to the descriptors.
Of course, using these event-driven system calls with blocking descriptors is a bit odd, because you would expect to be able to read the data immediately, without blocking, after being notified that it is available. Always relying on a blocking descriptor not to block after you have been notified of its readiness is a bit risky, because race conditions are possible.
Non-blocking, event-driven IO can make server applications vastly more efficient because threads are not needed for each descriptor (connection). Compare the Apache web server to Nginx or Lighttpd in terms of performance and you'll see the benefit.
They're largely unrelated, except that you may want to use non-blocking file descriptors with event-driven IO for the following reasons:
Old versions of Linux definitely have bugs in the kernel where read can block even after select indicated a socket was readable (it happened with UDP sockets and packets with bad checksums). Current versions of Linux may still have some such bugs; I'm not sure.
If there's any possibility that other processes have access to your file descriptors and will read/write to them, or if your program is multi-threaded and other threads might do so, then there is a race condition between select determining that the file descriptor is readable/writable and your program performing IO on it, which could result in blocking.
You almost surely want to make a socket non-blocking before calling connect; otherwise you'll block until the connection is made. Use select for writing to determine when it's successfully connected, and select for errors to determine if the connection failed.
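A sketch of that non-blocking connect pattern (connect_nb and connect_result are illustrative helper names; error handling abbreviated):

    #include <sys/socket.h>
    #include <fcntl.h>
    #include <errno.h>

    /* Start a non-blocking connect; 0 = connected or in progress, -1 = error. */
    int connect_nb(int fd, const struct sockaddr *addr, socklen_t len)
    {
        fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);
        if (connect(fd, addr, len) == 0)
            return 0;                   /* connected immediately (rare) */
        return (errno == EINPROGRESS) ? 0 : -1;
    }

    /* Call when select() reports fd writable: 0 on success, else an errno value. */
    int connect_result(int fd)
    {
        int err = 0;
        socklen_t elen = sizeof err;
        if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &elen) < 0)
            return errno;
        return err;
    }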
select and similar functions (you mentioned a few) are usually used to implement an event loop in an event driven system.
I.e., instead of read()ing directly from a socket or file (potentially blocking if no data is available), the application calls select() on multiple file descriptors, waiting for data to become available on any one of them.
When a file descriptor becomes available, you can be assured data is available and the read() operation will not block.
This is one way of processing data from multiple sources simultaneously without resorting to multiple threads.

questions about multi threading for sockets/tcp-connections

I have a server that connects to multiple clients using TCP/IP connections, using C on Unix. Since it won't have more than 20 connections at a time, I figured I would use a thread per connection/socket. But the problem is writing to the sockets, as I'll be sending user-prompted messages to clients. Once each socket is handled by a thread, how do I interact with the created thread to write to the sockets? Should each thread just read from the sockets while I write to the sockets in the main program? Not sure if that's a good way to go about it.
My rule of thumb is that any given socket should only be operated on by a single thread(*). So if you spawn a separate I/O thread for each socket, and your main thread wants something written to an I/O thread's socket, then the main thread should send that data to the I/O thread, whereupon the I/O thread can write it to the socket.
Of course, this means you need to have a good communications method between the main thread and the I/O thread; which you could do by spawning a socket-pair for each I/O thread and having the I/O threads select()/poll() on their end of the socket-pair (to handle data coming from the main thread) as well as on their network socket.
But once you've done that, you're dealing with the complexity of using select()/poll() AND multithreading, which is a lot of complexity overhead. So unless you absolutely need multithreading for some reason, I agree with the previous posters: it's better to just handle all the sockets in a single thread, via select() or poll().
(*) It's possible to have multiple threads reading/writing to the same socket at the same time, but it's error-prone. In particular, startup and shutdown sequences can be tricky to get 100% right. That's why I try to avoid 'sharing' a given socket amongst multiple threads.
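For illustration, a hedged sketch of that design: each I/O thread select()s on its network socket plus its end of a socketpair(2), and the main thread writes outgoing data into the other end. Setup such as socketpair(AF_UNIX, SOCK_STREAM, 0, sp) is assumed to have happened elsewhere:

    #include <sys/socket.h>
    #include <sys/select.h>
    #include <unistd.h>

    /* The I/O thread owns net_fd and one end of a socketpair; the main
     * thread writes outgoing data into the other end. */
    void io_thread_loop(int net_fd, int from_main_fd)
    {
        for (;;) {
            fd_set rds;
            FD_ZERO(&rds);
            FD_SET(net_fd, &rds);
            FD_SET(from_main_fd, &rds);
            int maxfd = net_fd > from_main_fd ? net_fd : from_main_fd;

            if (select(maxfd + 1, &rds, NULL, NULL, NULL) < 0)
                continue;  /* e.g. EINTR */

            char buf[4096];
            if (FD_ISSET(from_main_fd, &rds)) {
                /* Main thread handed us data: forward it to the socket. */
                ssize_t n = read(from_main_fd, buf, sizeof buf);
                if (n > 0)
                    send(net_fd, buf, (size_t)n, 0);
            }
            if (FD_ISSET(net_fd, &rds)) {
                ssize_t n = recv(net_fd, buf, sizeof buf, 0);
                if (n <= 0)
                    return;  /* connection closed or failed */
                /* ... process incoming data ... */
            }
        }
    }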
Sounds like you'd probably be better off with a single thread and multiplexing the sockets (using select, poll, etc.). This will avoid the race conditions and locking requirements that would otherwise make the program more difficult to write.
Unless you are doing significant processor-intensive work, or waiting for IO on behalf of these clients, you'll get no benefit from using threads anyway, but the race conditions will still be there.
So I'd say, get a working implementation using a single thread, THEN if in performance testing you discover that it is lacking, refactor it to use multithreading if that seems like the best option to beat the performance problems (of course you'll be profiling it etc).
Having the main thread write to the sockets is fine; you only need to worry about having multiple threads writing to a socket at the same time.
However, I'd test the performance of using a single thread with select/poll before bothering with the multi-threaded approach.
