Linux sockets terminating listening thread - c

I have a thread that is essentially just for listening on a socket. I have the thread blocking on accept() currently.
How do I tell the thread to finish any current transaction and stop listening, rather than staying blocked on accept?
I don't really want to do non-blocking if I don't have to...

Use the select(2) call to check which file descriptors are ready to read.
The file descriptors it reports as ready can be read without blocking, e.g. accept() on a listening fd reported as readable will immediately return a new connection.
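A minimal sketch of that idea, assuming a loopback listener on an ephemeral port. The names make_listener, accept_with_timeout and demo_select_accept are made up for illustration and are not from the question:

```c
#define _GNU_SOURCE 1
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: listen on an ephemeral loopback port. */
static int make_listener(struct sockaddr_in *addr)
{
    socklen_t len = sizeof *addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(addr, 0, sizeof *addr);
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(fd, (struct sockaddr *)addr, sizeof *addr) < 0 ||
        listen(fd, 8) < 0 ||
        getsockname(fd, (struct sockaddr *)addr, &len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Wait up to timeout_ms for a pending connection, then accept it.
 * Returns the new fd, or -1 on timeout or error. */
static int accept_with_timeout(int listen_fd, int timeout_ms)
{
    fd_set rfds;
    struct timeval tv = { timeout_ms / 1000, (timeout_ms % 1000) * 1000 };

    FD_ZERO(&rfds);
    FD_SET(listen_fd, &rfds);
    if (select(listen_fd + 1, &rfds, NULL, NULL, &tv) <= 0)
        return -1;
    return accept(listen_fd, NULL, NULL);
}

/* Returns 1 if an idle listener times out and a pending client is accepted. */
int demo_select_accept(void)
{
    struct sockaddr_in addr;
    int lfd = make_listener(&addr);
    int client, conn, ok;

    if (lfd < 0)
        return 0;
    ok = accept_with_timeout(lfd, 100) == -1;      /* nobody is connecting yet */
    client = socket(AF_INET, SOCK_STREAM, 0);
    ok = ok && client >= 0 &&
         connect(client, (struct sockaddr *)&addr, sizeof addr) == 0;
    conn = accept_with_timeout(lfd, 1000);         /* now a client is pending */
    ok = ok && conn >= 0;
    if (conn >= 0)
        close(conn);
    if (client >= 0)
        close(client);
    close(lfd);
    return ok;
}
```

A listening thread could call accept_with_timeout() in a loop and check a stop flag after every timeout. A pending connection can disappear between select() and accept(), so making the listener non-blocking as well is the safer variant.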

Basically you have two options. The first one is to interrupt the call with a signal, e.g.:
http://www.cs.cf.ac.uk/Dave/C/node32.html (see the signal handler section; it also supplies a th_kill example).
From accept man page:
accept() shall fail if:
EINTR
The system call was interrupted by a signal that was caught before a valid connection arrived.
Another option is to use non-blocking sockets and select(), e.g.:
http://publib.boulder.ibm.com/infocenter/iseries/v5r3/index.jsp?topic=%2Frzab6%2Frzab6xnonblock.htm
Anyhow, in multi-threaded servers there is usually one thread that accepts new connections and spawns another thread for each connection, since accept()ing and then recv()ing in the same thread can delay new connection requests. (Unless you're working with a single client, in which case accept()ing and receiving in one thread might be OK.)
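The signal option can be sketched as follows. This is only an illustration under assumptions: SIGUSR1 is free for this purpose, and stop_requested, listener_done and demo_interrupt_accept are hypothetical names. The handler is installed without SA_RESTART so that accept() fails with EINTR, and the signal is re-sent until the thread reports that it has left its loop, which covers the case where a signal lands just before the thread enters accept():

```c
#define _GNU_SOURCE 1
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

static atomic_int stop_requested;
static atomic_int listener_done;

static void on_wakeup(int sig) { (void)sig; }  /* only has to exist */

static void *listener(void *arg)
{
    int lfd = *(int *)arg;

    while (!atomic_load(&stop_requested)) {
        int cfd = accept(lfd, NULL, NULL);
        if (cfd >= 0)
            close(cfd);            /* a real server would serve the client */
        else if (errno != EINTR)
            break;                 /* genuine error */
        /* on EINTR the loop condition re-checks the stop flag */
    }
    atomic_store(&listener_done, 1);
    return NULL;
}

/* Returns 1 once the listening thread has been stopped and joined. */
int demo_interrupt_accept(void)
{
    struct sigaction sa;
    struct sockaddr_in addr;
    pthread_t tid;
    int lfd;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_wakeup;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;               /* no SA_RESTART: accept() returns EINTR */
    if (sigaction(SIGUSR1, &sa, NULL) < 0)
        return 0;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0 || bind(lfd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(lfd, 8) < 0)
        return 0;
    if (pthread_create(&tid, NULL, listener, &lfd) != 0)
        return 0;

    atomic_store(&stop_requested, 1);
    while (!atomic_load(&listener_done)) {
        pthread_kill(tid, SIGUSR1);  /* poke this thread only */
        usleep(10000);
    }
    pthread_join(tid, NULL);
    close(lfd);
    return 1;
}
```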

Use pthread_cancel on the thread. You'll need to make sure you've installed appropriate cancellation handlers (pthread_cleanup_push) to avoid resource leaks, and you should disable cancellation except for the duration of the accept call to avoid race conditions where the cancellation request might get acted upon later by a different function than accept.
Note that, due to bugs in glibc's implementation of cancellation, this approach could lead to lost connections and file descriptor leaks. This is because glibc/NPTL provides no guarantee that accept did not already finish execution and allocate a new file descriptor for the new connection before the cancellation request is acted upon. It should be a fairly rare occurrence but it's still an issue to consider...
See: http://sourceware.org/bugzilla/show_bug.cgi?id=12683
and for a discussion of the issue: Implementing cancellable syscalls in userspace
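A rough sketch of the cancellation approach described above, with the same caveat about the glibc race. Cancellation stays disabled except around accept(), and a cleanup handler closes the listening socket. The names close_fd, listener and demo_cancel_listener are illustrative only:

```c
#define _GNU_SOURCE 1
#include <arpa/inet.h>
#include <netinet/in.h>
#include <pthread.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

static void close_fd(void *arg) { close(*(int *)arg); }

static void *listener(void *arg)
{
    int lfd = *(int *)arg;
    int old;

    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old);
    pthread_cleanup_push(close_fd, arg);   /* runs if we are cancelled */
    for (;;) {
        int cfd;

        pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, &old);
        cfd = accept(lfd, NULL, NULL);     /* the only cancellation point */
        pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old);
        if (cfd < 0)
            break;
        close(cfd);                        /* serve the client here */
    }
    pthread_cleanup_pop(1);                /* normal exit: close it too */
    return NULL;
}

/* Returns 1 if the thread ended because of the cancellation request. */
int demo_cancel_listener(void)
{
    struct sockaddr_in addr;
    pthread_t tid;
    void *res = NULL;
    int lfd;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0 || bind(lfd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(lfd, 8) < 0)
        return 0;
    if (pthread_create(&tid, NULL, listener, &lfd) != 0)
        return 0;

    usleep(50000);            /* let the thread reach accept() */
    pthread_cancel(tid);      /* stays pending while cancellation is disabled */
    pthread_join(tid, &res);
    return res == PTHREAD_CANCELED;
}
```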

From Wake up thread blocked on accept() call
I just used the shutdown() system call and it seems to work...
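For what it's worth, here is a small sketch of that trick. It relies on Linux behaviour, where accept() on a listening socket that has been shut down fails (with EINVAL, as far as I have seen); other systems may behave differently. The names accept_errno and demo_shutdown_listener are made up for the example:

```c
#define _GNU_SOURCE 1
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <pthread.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

static int accept_errno;

static void *listener(void *arg)
{
    int lfd = *(int *)arg;

    for (;;) {
        int cfd = accept(lfd, NULL, NULL);
        if (cfd < 0) {
            accept_errno = errno;   /* assumption: EINVAL on Linux */
            break;
        }
        close(cfd);                 /* finish the current transaction here */
    }
    return NULL;
}

/* Returns 1 if shutdown() made the blocked accept() fail. */
int demo_shutdown_listener(void)
{
    struct sockaddr_in addr;
    pthread_t tid;
    int lfd;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0 || bind(lfd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(lfd, 8) < 0)
        return 0;
    if (pthread_create(&tid, NULL, listener, &lfd) != 0)
        return 0;

    usleep(50000);                  /* let the thread block in accept() */
    shutdown(lfd, SHUT_RDWR);       /* the descriptor itself stays valid */
    pthread_join(tid, NULL);
    close(lfd);                     /* safe: nobody is using it any more */
    return accept_errno != 0;
}
```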

Related

Login timeout for thread-per-connection approach in C

I am coding a game-server that allows up to 1100 concurrent connections using a thread-per-connection approach. Every time a login packet is read from the client socket I want to be able to give it 5 seconds to connect, otherwise gracefully close the connection and release the thread to the pool.
I know about alarm() for sending the process a SIGALRM, but which thread receives the signal is undefined behavior. I also tried the setitimer function, but it also sends the signal to the process. Blocking the signal in all threads but ours is impossible because I need to get the signals in all 5 threads.
Is there any way of doing this without changing the entire server architecture?
Note: This is not a personal project, so changing the thread-per-connection model is not an option; please consider such answers off-topic.
Threads and signals don't mix well, for the reasons you found out -- it's indeterminate which thread will receive the signal.
A better way to get a timeout within a thread is to set the socket to non-blocking mode and then run a while-loop around select() and recv(). Use the timeout argument to select() to ensure that select() will wake up at the end of your 5-second deadline, pass your socket in as part of the read-fd_set argument, and keep in mind that if the connection is TCP, the data from your socket may arrive in multiple small chunks (hence the while-loop, to collect all of them into a buffer).
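Something along these lines, as a sketch only. recv_with_deadline and demo_login_deadline are hypothetical names, the packet length is assumed to be known in advance, and a socketpair stands in for the client connection:

```c
#define _GNU_SOURCE 1
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Milliseconds remaining until the deadline (may be zero or negative). */
static long ms_left(const struct timespec *deadline)
{
    struct timespec now;

    clock_gettime(CLOCK_MONOTONIC, &now);
    return (deadline->tv_sec - now.tv_sec) * 1000L +
           (deadline->tv_nsec - now.tv_nsec) / 1000000L;
}

/* Collect exactly len bytes from a non-blocking socket within timeout_ms.
 * Returns len on success, -1 on error, peer close, or timeout (ETIMEDOUT). */
ssize_t recv_with_deadline(int fd, char *buf, size_t len, long timeout_ms)
{
    struct timespec deadline;
    size_t got = 0;

    clock_gettime(CLOCK_MONOTONIC, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    while (got < len) {
        long left = ms_left(&deadline);
        struct timeval tv;
        fd_set rfds;
        ssize_t n;
        int r;

        if (left <= 0) {
            errno = ETIMEDOUT;
            return -1;
        }
        tv.tv_sec = left / 1000;
        tv.tv_usec = (left % 1000) * 1000;
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        r = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (r < 0 && errno != EINTR)
            return -1;
        if (r <= 0)
            continue;               /* the loop re-checks the deadline */
        n = recv(fd, buf + got, len - got, 0);
        if (n == 0)
            return -1;              /* peer closed before the packet was complete */
        if (n < 0) {
            if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
                continue;
            return -1;
        }
        got += (size_t)n;           /* TCP may deliver the packet in pieces */
    }
    return (ssize_t)got;
}

/* Returns 1 if a chunked packet is collected and an idle wait times out. */
int demo_login_deadline(void)
{
    int sv[2], ok;
    char buf[5];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return 0;
    fcntl(sv[0], F_SETFL, fcntl(sv[0], F_GETFL) | O_NONBLOCK);
    ok = write(sv[1], "LOG", 3) == 3 && write(sv[1], "IN", 2) == 2;
    ok = ok && recv_with_deadline(sv[0], buf, sizeof buf, 5000) == 5;
    ok = ok && recv_with_deadline(sv[0], buf, sizeof buf, 100) == -1 &&
         errno == ETIMEDOUT;
    close(sv[0]);
    close(sv[1]);
    return ok;
}
```

Since everything happens inside the connection's own thread, no signals are involved and the thread-per-connection model stays as it is.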

Posix select()/poll() and pthread IPC

This is kind of generic question - however I met this problem several times already and I still haven't found the best possible solution.
Let's imagine you have a program (e.g. an HTTP application server) that is multithreaded and communicates over sockets (TCP, Unix, ...). The main thread uses asynchronous I/O and the POSIX select() or poll() calls to dispatch traffic from/to sockets. There are also worker threads that process requests and provide responses. To send a response back to the client, a worker thread synchronises with the main thread (the one that polls) 'somehow'. The core of the question is 'how', in terms of what is efficient. I can use pipe() or a similar socket-based IPC mechanism, but this seems to me like quite a large overhead. I tend to use pthread IPC techniques such as mutexes and condition variables, but these will not work with select() or poll().
Is there a common technique in POSIX (and surroundings) that address this conflict?
I guess on Windows there is WaitForMultipleObjects() function that allows that.
Example program is crafted to illustrate an issue, I know that I can design master/worker pattern in a different way but this is not what I'm asking for. I have other cases where I'm in the same situation.
You could use a signal to poke the worker thread, which will interrupt the select() call and return EINTR. This gets even easier to do with pselect().
For this to work:
decide on a signal (or allocate a real-time signal)
attach an empty handler function to it (if the signal were ignored, the system call would be automatically restarted)
block the signal, at least in the worker thread.
use the signal mask argument in pselect() to unblock the signal while waiting.
Between threads, you can use pthread_kill to deliver the signal to the worker thread specifically. When another process should send the signal, you can either make sure the signal is blocked in all but the worker thread (so it will be delivered there), or use the signal handler to find out whether the signal was sent to the worker thread, and use pthread_kill to forward it explicitly (the worker thread still doesn't need to do anything in the signal handler).
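The steps above could look roughly like this. It is a sketch, not the LibreVISA code: SIGUSR1 is an arbitrary choice, and waiter, wait_errno and demo_pselect_poke are made-up names. Because the signal is blocked everywhere except inside pselect(), a poke that arrives early stays pending and is not lost:

```c
#define _GNU_SOURCE 1
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <unistd.h>

static int wait_errno;

static void on_poke(int sig) { (void)sig; }  /* empty handler, never ignored */

static void *waiter(void *arg)
{
    int fd = *(int *)arg;
    sigset_t block, wait_mask;
    fd_set rfds;

    /* Keep SIGUSR1 blocked in this thread during normal operation... */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    pthread_sigmask(SIG_BLOCK, &block, &wait_mask);
    /* ...and build the mask used while waiting: old mask minus SIGUSR1. */
    sigdelset(&wait_mask, SIGUSR1);

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    if (pselect(fd + 1, &rfds, NULL, NULL, NULL, &wait_mask) < 0)
        wait_errno = errno;        /* EINTR when poked */
    return NULL;
}

/* Returns 1 if the poke interrupted pselect() with EINTR. */
int demo_pselect_poke(void)
{
    struct sigaction sa;
    sigset_t block;
    pthread_t tid;
    int fds[2];

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_poke;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) < 0 || pipe(fds) < 0)
        return 0;

    /* Block before creating the thread so the new thread inherits the mask. */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    pthread_sigmask(SIG_BLOCK, &block, NULL);

    if (pthread_create(&tid, NULL, waiter, &fds[0]) != 0)
        return 0;
    pthread_kill(tid, SIGUSR1);    /* deliver to that thread specifically */
    pthread_join(tid, NULL);
    close(fds[0]);
    close(fds[1]);
    return wait_errno == EINTR;
}
```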
Due to laziness on my part, I don't have a source code viewer online, but you can clone the LibreVISA git tree, and take a look at src/messagepump.cpp, where this method is used to poke the worker thread after another thread added a file descriptor to the watch list.
Simon Richther's answer is very good.
Another alternative might be to make main thread only responsible for listening for new connections and starting up a worker thread with the connection information so that the worker is responsible for all subsequent ‘transactions’ from this source.
My understanding is:
Main thread uses select.
Worker threads processes requests forwarded to it by main thread.
So need to synchronize between workers and main thread e.g. when
worker finishes a transaction need to send response back to main
thread which in turn forwards the response back to the source.
Why don't you remove the problem of having to synchronize between the worker thread and the main thread by making the worker thread responsible for all transactions from a particular connection?
Thus the main thread is only responsible for listening for new connections and starting up a worker thread with the connection information i.e. the file descriptor for the new connection.
First of all, the way to wake another thread is for thread A to wait using pthread_cond_wait / pthread_cond_timedwait, and for thread B to wake it up using pthread_cond_broadcast / pthread_cond_signal. So, for instance, if B is a producer and A is the consumer, the producer might add items to a linked list protected by a mutex. There would be an associated condition variable so that, after adding an item, B can wake thread A, which then checks whether any new items have arrived on the list and, if so, removes them. I say 'associated' because the same mutex that protects the list can be associated with the condition variable.
So far so good. Now you mention asynchronous I/O. What I've wanted to do several times is select() or poll() on a set of FDs and a set of condition variables, so that the select() or poll() is interrupted when the condition variable is broadcast to. There is no easy way of doing this directly; you cannot simply mix and match.
You thus need to do one of two things. Either:
work around the problem (for instance, use a self-connected pipe() to send one byte to wake the select() up, either instead of the condition variable, as well as the condition variable, or from some additional thread waiting on the condition variable); or
convert to a more threaded model, i.e. use one thread for sending and one thread for receiving with a producer / consumer model, so the sender thread simply removes from a list / buffer and sends (blocking if necessary), and the receiver waits for I/O (blocking if necessary) and adds it to the list (this is what you put in italics at the end).
The second is a major design change for those of us brought up on asynchronous I/O, and the first is ugly. You are not the first to be dismayed by this, but I've not found an easy way around it. As for the inefficiency of the first: if you only write one character to the self-pipe to wake the select loop, I don't think you are going to see much overhead.
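To make the self-pipe workaround concrete, here is a minimal sketch using poll(). The shared counter pending_responses stands in for whatever queue the workers fill, and all names are hypothetical:

```c
#define _GNU_SOURCE 1
#include <poll.h>
#include <pthread.h>
#include <unistd.h>

static int wake_pipe[2];
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int pending_responses;      /* stands in for a real response queue */

static void *worker(void *arg)
{
    char byte = 1;

    (void)arg;
    pthread_mutex_lock(&lock);
    pending_responses++;           /* publish the result under the mutex */
    pthread_mutex_unlock(&lock);
    if (write(wake_pipe[1], &byte, 1) != 1)   /* wake the polling thread */
        return (void *)1;
    return NULL;
}

/* Returns the number of responses the polling thread picked up. */
int demo_self_pipe(void)
{
    struct pollfd pfd;
    pthread_t tid;
    int got = 0;

    if (pipe(wake_pipe) < 0)
        return -1;
    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return -1;

    /* A real main loop would poll the client sockets in the same array. */
    pfd.fd = wake_pipe[0];
    pfd.events = POLLIN;
    pfd.revents = 0;
    if (poll(&pfd, 1, 5000) == 1 && (pfd.revents & POLLIN)) {
        char byte;

        if (read(wake_pipe[0], &byte, 1) == 1) {   /* drain the wake-up byte */
            pthread_mutex_lock(&lock);
            got = pending_responses;
            pending_responses = 0;
            pthread_mutex_unlock(&lock);
        }
    }
    pthread_join(tid, NULL);
    close(wake_pipe[0]);
    close(wake_pipe[1]);
    return got;
}
```

The mutex still protects the shared data; the pipe only carries the wake-up, so one byte per notification is all that crosses it.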

Using multiple threads with accept() on a nonblocking listener in each process

The following strategies seem to work well:
Using a single thread/process with a nonblocking accept() call on the listener socket, regardless of how the program handles the accepted request.
Using multiple threads/processes with a blocking accept() call in each process. When a connection comes in, this wakes up exactly one accept().
What doesn't work well is watching the listener socket with EPOLLIN in each thread/process, with accept() in a callback. This wakes every thread/process up, though only one can succeed in actually accept()ing. This is just like the bad old days of blocking accept() causing a stampede when a connection would come in.
Is there a way to only have a single thread/process wake up to accept() while still using EPOLLIN? Or should I rewrite to use blocking accept()s, just isolated using threads?
It's not an option to have only a single thread/process run accept() because I'm trying to manage the processes as a pool in a way where each process doesn't need to know whether it's the only daemon accept()ing on the listener socket.
You need to use EPOLLET or EPOLLONESHOT so that exactly one thread gets woken by the EPOLLIN event when a new connection comes in. The handling thread then needs to either call accept in a loop until it returns EAGAIN (EPOLLET) or manually re-arm the descriptor with epoll_ctl (EPOLLONESHOT) in order for more connections to be handled.
In general when using multiple threads and epoll, you want to use EPOLLET or EPOLLONESHOT. Otherwise when an event happens, multiple threads will be woken to handle it and they may interfere with each other. At best, they'll just waste time figuring out that some other thread is handling the event before waiting again. At worst they'll deadlock or corrupt stuff.
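A single-threaded sketch of the EPOLLET variant, just to show the accept-until-EAGAIN loop. It is Linux-only, and accept_until_eagain and demo_epollet_accept are illustrative names:

```c
#define _GNU_SOURCE 1
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Accept everything that is queued; returns the number of connections. */
static int accept_until_eagain(int lfd)
{
    int n = 0;

    for (;;) {
        int cfd = accept(lfd, NULL, NULL);
        if (cfd >= 0) {
            close(cfd);            /* hand the connection to a worker here */
            n++;
        } else if (errno != EINTR && errno != ECONNABORTED) {
            break;                 /* EAGAIN: queue drained (or a real error) */
        }
    }
    return n;
}

/* Returns how many of three loopback clients were accepted. */
int demo_epollet_accept(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    struct epoll_event ev;
    int clients[3];
    int lfd, ep, i, accepted = 0;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    lfd = socket(AF_INET, SOCK_STREAM | SOCK_NONBLOCK, 0);
    if (lfd < 0 || bind(lfd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(lfd, 8) < 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &len) < 0)
        return -1;

    ep = epoll_create1(0);
    memset(&ev, 0, sizeof ev);
    ev.events = EPOLLIN | EPOLLET; /* edge-triggered */
    ev.data.fd = lfd;
    if (ep < 0 || epoll_ctl(ep, EPOLL_CTL_ADD, lfd, &ev) < 0)
        return -1;

    for (i = 0; i < 3; i++) {
        clients[i] = socket(AF_INET, SOCK_STREAM, 0);
        if (clients[i] < 0 ||
            connect(clients[i], (struct sockaddr *)&addr, sizeof addr) < 0)
            return -1;
    }

    /* One edge may cover several queued connections, so drain each time. */
    while (accepted < 3 && epoll_wait(ep, &ev, 1, 1000) == 1)
        accepted += accept_until_eagain(lfd);

    for (i = 0; i < 3; i++)
        close(clients[i]);
    close(ep);
    close(lfd);
    return accepted;
}
```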
How about multiple sockets listening on the same proto+address+port? This can be accomplished with the Linux SO_REUSEPORT option. https://lwn.net/Articles/542629/ . I have not tried it, but I think it should work even with epoll, since only one socket gets the actual event.
Caveat emptor: this is a non-portable, Linux-only solution. SO_REUSEPORT also suffers from some bugs/features that are detailed in the linked article.
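A small sketch showing two listeners bound to the same address and port, assuming Linux 3.9 or later. reuseport_listener and demo_reuseport are hypothetical names; in a real pool each process or thread would create its own socket this way:

```c
#define _GNU_SOURCE 1
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a listener with SO_REUSEPORT set before bind(). */
static int reuseport_listener(const struct sockaddr_in *addr)
{
    int one = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof one) < 0 ||
        bind(fd, (const struct sockaddr *)addr, sizeof *addr) < 0 ||
        listen(fd, 8) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Returns 1 if a second socket could listen on the very same port. */
int demo_reuseport(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int a, b, ok;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);  /* port 0: kernel picks */

    a = reuseport_listener(&addr);
    if (a < 0)
        return 0;
    if (getsockname(a, (struct sockaddr *)&addr, &len) < 0) {
        close(a);
        return 0;
    }
    b = reuseport_listener(&addr);  /* same address, same port */
    ok = b >= 0;
    if (b >= 0)
        close(b);
    close(a);
    return ok;
}
```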

Trying to exit from a blocking UDP socket read

This is a question similar to Proper way to close a blocking UDP socket. I have a thread in C which is reading from a UDP socket. The read is blocking. I would like to know if it is possible to exit the thread without relying on recv() returning. For example, can I close the socket from another thread and safely expect the socket read thread to exit? I didn't see any highly voted answer on that thread, which is why I am asking again.
This really depends on what system you're running under. For example, if you're running under a POSIX-compliant system and your thread is cancelable, the recv() call will be interrupted when you cancel the thread since it's a cancel point.
If you're using an older socket implementation, you could set a signal handler for your thread for something like SIGUSR1 and hope nobody else wanted it and signal, since recv() will interrupt on a signal. Your best option is not to block, if at all possible.
I don't think closing a socket involved in a blocking operation is a safe guaranteed way of terminating the operation. For instance, kernel.org warns darkly:
It is probably unwise to close file descriptors while they may be in
use by system calls in other threads in the same process. Since a
file descriptor may be reused, there are some obscure race conditions
that may cause unintended side effects.
Instead you could use a signal and make recv fail with EINTR (make sure SA_RESTART is not enabled). You can send a signal to a specific thread with pthread_kill.
You could enable SO_RCVTIMEO on the socket before starting the recv call.
Personally I usually try to stay clear of all the signal nastiness but it's a viable option.
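The SO_RCVTIMEO option is the simpler of the two to sketch. The 200 ms timeout and the name demo_rcvtimeo below are arbitrary choices for illustration; a reader thread would check its own stop flag each time recv() times out:

```c
#define _GNU_SOURCE 1
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Returns 1 if a blocking recv() on an idle UDP socket timed out. */
int demo_rcvtimeo(void)
{
    struct sockaddr_in addr;
    struct timeval tv;
    char buf[64];
    ssize_t n;
    int fd, timed_out;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0 || bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0)
        return 0;

    tv.tv_sec = 0;
    tv.tv_usec = 200000;           /* give up after 200 ms */
    if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv) < 0) {
        close(fd);
        return 0;
    }

    n = recv(fd, buf, sizeof buf, 0);   /* nobody sends anything */
    timed_out = n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK);
    close(fd);
    return timed_out;
}
```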
You've got a couple of options for that. A signal will interrupt the read operation, so all you need to do is make sure a signal goes off. The recv operation should fail with error number EINTR.
The simplest option is to set up a timer to interrupt your own process after some timeout e.g. 30 seconds:
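/* needs <stdio.h> and <sys/time.h>; install a SIGALRM handler first,
   because the default action for SIGALRM terminates the process */
struct itimerval timer;
struct timeval time;
time.tv_sec = 30;
time.tv_usec = 0;
timer.it_value = time;          /* fire once, 30 seconds from now */
timer.it_interval.tv_sec = 0;   /* zero interval: do not re-arm */
timer.it_interval.tv_usec = 0;
if( setitimer( ITIMER_REAL, &timer, NULL ) != 0 )
    printf( "failed to start timer\n" );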
You'll get a SIGALRM after the specified time, which will interrupt your blocking operation, and give you the chance to repeat the operation or quit.
You cannot deallocate a shared resource while another thread is or might be using it. In practice, you will find that you cannot even write code to do what you suggest.
Think about it. When you go to call close, how can you possibly know that the other thread is actually blocked in recv? What if it's about to call recv, but then another thread calls socket and gets the descriptor you just closed? Now, not only will that thread not detect any error, but it will be calling recv on the wrong socket!
There is probably a good way to solve your outer problem, the reason you need to exit from a blocking UDP socket read. There are also several ugly hacks available. The basic approach is to make the socket non-blocking and instead of making a blocking UDP socket read, fake a blocking read with select or poll. You can then abort this loop several ways:
One way is to have select time out and check an 'abort' flag when select returns.
Another way is to also select on the read end of a pipe. Send a single byte to the pipe to abort the select.
On a POSIX-compliant system, you can try to monitor your thread:
Call pthread_create with a function that performs your recv, calls pthread_cond_signal right after it, then returns.
The calling thread calls pthread_cond_timedwait with the desired timeout and terminates the called thread if the wait timed out.

Can a socket be closed from another thread when a send / recv on the same socket is going on?

Can a socket be closed from another thread when a send / recv on the same socket is going on?
Suppose one thread is in blocking recv call and another thread closes the same socket, will the thread in the recv call know this and come out safely?
I would like to know if the behavior will differ between different OS / Platforms. If yes, how will it behave in Solaris?
In Linux, closing a socket won't wake up recv(). Also, as @jxh says:
If a thread is blocked on recv() or send() when the socket is closed
by a different thread, the blocked thread will receive an error.
However, it is difficult to detect the correct remedial action after
receiving the error. This is because the file descriptor number
associated with the socket may have been picked up by yet a different
thread, and the blocked thread has now been woken up on an error for a
"valid" socket. In such a case, the woken up thread should not call
close() itself.
The woken up thread will need some way to differentiate whether the
error was generated by the connection (e.g. a network error) that
requires it to call close(), or if the error was generated by a
different thread having called close() on it, in which case it should
just error out without doing anything further to the socket.
So the best way to avoid both problems is to call shutdown() instead of close(). shutdown() leaves the file descriptor allocated, so its number cannot be handed out to another thread; it also wakes up recv() with an error, and the thread that called recv() can then close the socket the normal way, as if an ordinary error had happened.
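A sketch of that pattern, using a socketpair as a stand-in for a real connection. This reflects behaviour I would expect on Linux: the blocked recv() returns 0 or -1 after shutdown(), and the reader thread then closes the descriptor itself. The names reader, recv_result and demo_shutdown_recv are illustrative:

```c
#define _GNU_SOURCE 1
#include <pthread.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

static ssize_t recv_result = 1;    /* sentinel: recv() has not returned yet */

static void *reader(void *arg)
{
    int fd = *(int *)arg;
    char buf[64];

    recv_result = recv(fd, buf, sizeof buf, 0);  /* blocks: peer is silent */
    close(fd);       /* the reader owns the descriptor and closes it itself */
    return NULL;
}

/* Returns 1 if shutdown() from another thread woke the blocked recv(). */
int demo_shutdown_recv(void)
{
    pthread_t tid;
    int sv[2];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return 0;
    if (pthread_create(&tid, NULL, reader, &sv[0]) != 0)
        return 0;

    usleep(50000);                 /* let the reader block in recv() */
    shutdown(sv[0], SHUT_RDWR);    /* wakes the reader, fd number stays taken */
    pthread_join(tid, NULL);
    close(sv[1]);
    return recv_result <= 0;
}
```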
I don't know Solaris network stack implementation but I'll throw out my theory/explanation of why it should be safe.
Thread A enters some blocking system call, say read(2), for this given socket. There's no data in the socket receive buffer, so thread A is taken off the processor and put onto the wait queue for this socket. No network stack events are initiated here; the connection state (assuming TCP) has not changed.
Thread B issues close(2) on the socket. While kernel socket structure should be locked while thread B is accessing it, no other thread is holding that lock (thread A released the lock when it was put to sleep-wait). Assuming there's no outstanding data in the socket send buffer, a FIN packet is sent and the connection enters the FIN WAIT 1 state (again I assume TCP here, see connection state diagram)
I'm guessing that socket connection state change would generate a wakeup for all threads blocked on given socket. That is thread A would enter a runnable state and discover that connection is closing. The wait might be re-entered if the other side has not sent its own FIN, or the system call would return with eof otherwise.
In any case, internal kernel structures will be protected from inappropriate concurrent access. This does not mean it's a good idea to do socket I/O from multiple threads. I would advise to look into non-blocking sockets, state machines, and frameworks like libevent.
For me, calling shutdown() on the socket from another thread does the job on Linux.
If a thread is blocked on recv() or send() when the socket is closed by a different thread, the blocked thread will receive an error. However, it is difficult to detect the correct remedial action after receiving the error. This is because the file descriptor number associated with the socket may have been picked up by yet a different thread, and the blocked thread has now been woken up on an error for a "valid" socket. In such a case, the woken up thread should not call close() itself.
The woken up thread will need some way to differentiate whether the error was generated by the connection (e.g. a network error) that requires it to call close(), or if the error was generated by a different thread having called close() on it, in which case it should just error out without doing anything further to the socket.
Yes, it is ok to close the socket from another thread. Any blocked/busy threads that are using that socket will report a suitable error.

Resources