I have a small server program that accepts connections on a TCP or local UNIX socket, reads a simple command and (depending on the command) sends a reply.
The problem is that the client may have no interest in the answer and sometimes exits early. So writing to that socket will cause a SIGPIPE and make my server crash.
What's the best practice to prevent the crash here? Is there a way to check if the other side of the line is still reading? (select() doesn't seem to work here as it always says the socket is writable). Or should I just catch the SIGPIPE with a handler and ignore it?
You generally want to ignore the SIGPIPE and handle the error directly in your code. This is because signal handlers in C have many restrictions on what they can do.
The most portable way to do this is to set the SIGPIPE handler to SIG_IGN. This will prevent any socket or pipe write from causing a SIGPIPE signal.
To ignore the SIGPIPE signal, use the following code:
signal(SIGPIPE, SIG_IGN);
If you're using the send() call, another option is to use the MSG_NOSIGNAL option, which will turn the SIGPIPE behavior off on a per call basis. Note that not all operating systems support the MSG_NOSIGNAL flag.
Lastly, you may also want to consider the SO_NOSIGPIPE socket option, which can be set with setsockopt() on some operating systems. It prevents SIGPIPE from being raised by writes to just the sockets it is set on.
Another method is to change the socket so it never generates SIGPIPE on write(). This is more convenient in libraries, where you might not want a global signal handler for SIGPIPE.
On most BSD-based systems (macOS, FreeBSD, ...), assuming you are using C/C++, you can do this with:
int set = 1;
setsockopt(sd, SOL_SOCKET, SO_NOSIGPIPE, (void *)&set, sizeof(set));
With this in effect, instead of the SIGPIPE signal being generated, EPIPE will be returned.
I'm super late to the party, but SO_NOSIGPIPE isn't portable, and might not work on your system (it seems to be a BSD thing).
A nice alternative if you're on, say, a Linux system without SO_NOSIGPIPE would be to set the MSG_NOSIGNAL flag on your send(2) call.
Example replacing write(...) by send(...,MSG_NOSIGNAL) (see nobar's comment)
char buf[888];
//write( sockfd, buf, sizeof(buf) );
send( sockfd, buf, sizeof(buf), MSG_NOSIGNAL );
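The snippet above drops the return value of send(). A slightly fuller sketch, assuming a platform that defines MSG_NOSIGNAL (e.g. Linux) and a blocking socket; send_all_nosignal() is a hypothetical name, not something from the original answer:

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: sends the whole buffer without ever raising SIGPIPE.
 * Returns 0 on success, -1 on failure with errno set
 * (errno == EPIPE means the peer is gone). */
int send_all_nosignal(int sockfd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = send(sockfd, p, len, MSG_NOSIGNAL);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before sending anything: retry */
            return -1;          /* EPIPE, ECONNRESET, ... */
        }
        p += n;                 /* short send: continue with the remainder */
        len -= (size_t)n;
    }
    return 0;
}
```

When this returns -1, the sensible reaction is usually to close the socket and forget the client rather than to retry.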
In this post I described a possible solution for the Solaris case, where neither SO_NOSIGPIPE nor MSG_NOSIGNAL is available.
Instead, we have to temporarily suppress SIGPIPE in the current thread that executes library code. Here's how to do this:
To suppress SIGPIPE, we first check whether it is pending. If it is, this means that it is blocked in this thread, and we have to do nothing. If the library generates an additional SIGPIPE, it will be merged with the pending one, and that's a no-op. If SIGPIPE is not pending, then we block it in this thread, and also check whether it was already blocked. Then we are free to execute our writes.
When we are to restore SIGPIPE to its original state, we do the following: if SIGPIPE was pending originally, we do nothing. Otherwise we check if it is pending now. If it is (which means that our actions have generated one or more SIGPIPEs), then we wait for it in this thread, thus clearing its pending status. To do this we use sigtimedwait() with a zero timeout; this is to avoid blocking in a scenario where a malicious user sent SIGPIPE manually to the whole process: in this case we will see it pending, but another thread may handle it before we have had a chance to wait for it. After clearing the pending status we unblock SIGPIPE in this thread, but only if it wasn't blocked originally.
Example code at https://github.com/kroki/XProbes/blob/1447f3d93b6dbf273919af15e59f35cca58fcc23/src/libxprobes.c#L156
Handle SIGPIPE Locally
It's usually best to handle the error locally rather than in a global signal event handler since locally you will have more context as to what's going on and what recourse to take.
I have a communication layer in one of my apps that allows my app to communicate with an external accessory. When a write error occurs I throw an exception in the communication layer and let it bubble up to a try/catch block to handle it there.
Code:
The code to ignore a SIGPIPE signal so that you can handle it locally is:
// We expect write failures to occur but we want to handle them where
// the error occurs rather than in a SIGPIPE handler.
signal(SIGPIPE, SIG_IGN);
This code will prevent the SIGPIPE signal from being raised, but you will get a read / write error when trying to use the socket, so you will need to check for that.
You cannot prevent the process on the far end of a pipe from exiting, and if it exits before you've finished writing, you will get a SIGPIPE signal. If you SIG_IGN the signal, then your write will return with an error - and you need to note and react to that error. Just catching and ignoring the signal in a handler is not a good idea -- you must note that the pipe is now defunct and modify the program's behaviour so it does not write to the pipe again (because the signal will be generated again, and ignored again, and you'll try again, and the whole process could go on for a long time and waste a lot of CPU power).
Or should I just catch the SIGPIPE with a handler and ignore it?
I believe that is right on. You want to know when the other end has closed their descriptor and that's what SIGPIPE tells you.
Sam
What's the best practice to prevent the crash here?
Either disable SIGPIPE as everybody else suggests, or catch and ignore the error.
Is there a way to check if the other side of the line is still reading?
Yes, use select().
select() doesn't seem to work here as it always says the socket is writable.
You need to select on the read bits. You can probably ignore the write bits.
When the far end closes its file handle, select will tell you that there is data ready to read. When you go and read that, you will get back 0 bytes, which is how the OS tells you that the file handle has been closed.
The only time you can't ignore the write bits is if you are sending large volumes, and there is a risk of the other end getting backlogged, which can cause your buffers to fill. If that happens, then trying to write to the file handle can cause your program/thread to block or fail. Testing select before writing will protect you from that, but it doesn't guarantee that the other end is healthy or that your data is going to arrive.
Note that you can get a SIGPIPE from close(), as well as when you write.
Close flushes any buffered data. If the other end has already been closed, then close will fail, and you will receive a SIGPIPE.
If you are using buffered TCP/IP, then a successful write just means your data has been queued to send; it doesn't mean it has been sent. Until you successfully call close, you don't know that your data has been sent.
SIGPIPE tells you something has gone wrong; it doesn't tell you what, or what you should do about it.
Under a modern POSIX system (e.g. Linux), you can use the sigprocmask() function.
#include <signal.h>

void block_signal(int signal_to_block /* e.g. SIGPIPE */ )
{
    sigset_t set;
    sigset_t old_state;

    // get the current state
    //
    sigprocmask(SIG_BLOCK, NULL, &old_state);

    // add signal_to_block to that existing state
    //
    set = old_state;
    sigaddset(&set, signal_to_block);

    // block that signal also
    //
    sigprocmask(SIG_BLOCK, &set, NULL);

    // ... deal with old_state if required ...
}
If you want to restore the previous state later, make sure to save the old_state somewhere safe. If you call that function multiple times, you need to either use a stack or only save the first or last old_state... or maybe have a function which removes a specific blocked signal.
For more info read the man page.
Linux manual said:
EPIPE The local end has been shut down on a connection oriented
socket. In this case the process will also receive a SIGPIPE
unless MSG_NOSIGNAL is set.
But for Ubuntu 12.04 this isn't right. I wrote a test for that case and I always receive EPIPE without SIGPIPE. SIGPIPE is generated only if I try to write to the same broken socket a second time. So you don't need to ignore SIGPIPE; if this signal happens, it means there is a logic error in your program.
select() reports a socket as readable, but then no data arrives from the recv() call; instead it returns -1 with errno == EAGAIN.
I can guarantee that no other thread touches the socket.
I think that this behavior is not correct. If a subsequent close from the other side occurs, I would expect a return value of 0 (graceful close) or another error code from recv(), but not EAGAIN, because in my opinion that means data will arrive in the future.
I have found some previous thread about the problem here but without solution.
This behavior happens to me on Ubuntu Linux Oneiric and on other recent Linux distros. The claim in the link posted there that it will be fixed in the kernel does not hold for the 3.0.0 kernel or the latest 2.6.x.
Does anybody have an idea why it happens and how to avoid this unwanted behavior?
Select() reporting a socket as readable does not mean that there is something to read; it implies that a read will not block. The read could return -1 or 0, but it would not block.
UPDATE:
After select returns readable: if read() returns -1, check errno.
EAGAIN/EWOULDBLOCK and EINTR are the values to be treated specially: mostly by reissuing the read(), but you can also rely on the select loop reporting the socket as readable the next time around.
If there are multiple threads involved, things may get more difficult.
I'm getting the same problem but with epoll. I noticed, that it happens whenever the system is reusing the FD numbers of the sockets that are already closed.
After some research, I've noticed that this behavior is caused by closing the sockets while epolling on them. Try to avoid running select on a socket while closing it - that may help.
This is a question similar to Proper way to close a blocking UDP socket. I have a thread in C which is reading from a UDP socket. The read is blocking. Is it possible to exit the thread without relying on recv() returning? For example, can I close the socket from another thread and safely expect the socket read thread to exit? I didn't see any highly voted answer on that thread, which is why I am asking again.
This really depends on what system you're running under. For example, if you're running under a POSIX-compliant system and your thread is cancelable, the recv() call will be interrupted when you cancel the thread since it's a cancel point.
If you're using an older socket implementation, you could set a signal handler for your thread for something like SIGUSR1 (and hope nobody else wanted it) and then send that signal, since recv() will be interrupted by a signal. Your best option is not to block, if at all possible.
I don't think closing a socket involved in a blocking operation is a safe guaranteed way of terminating the operation. For instance, kernel.org warns darkly:
It is probably unwise to close file descriptors while they may be in
use by system calls in other threads in the same process. Since a
file descriptor may be reused, there are some obscure race conditions
that may cause unintended side effects.
Instead you could use a signal and make recv fail with EINTR (make sure SA_RESTART is not enabled). You can send a signal to a specific thread with pthread_kill.
You could enable SO_RCVTIMEO on the socket before starting the recv call.
Personally I usually try to stay clear of all the signal nastiness but it's a viable option.
You've got a couple of options for that. A signal will interrupt the read operation, so all you need to do is make sure a signal goes off. The recv operation should fail with error number EINTR.
The simplest option is to set up a timer to interrupt your own process after some timeout e.g. 30 seconds:
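#include <stdio.h>
#include <sys/time.h>

struct itimerval timer;
timer.it_value.tv_sec = 30;      // fire once, 30 seconds from now
timer.it_value.tv_usec = 0;
timer.it_interval.tv_sec = 0;    // zero interval: do not re-arm
timer.it_interval.tv_usec = 0;
if( setitimer( ITIMER_REAL, &timer, NULL ) != 0 )
    printf( "failed to start timer\n" );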
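You'll get a SIGALRM after the specified time, which will interrupt your blocking operation and give you the chance to repeat the operation or quit. Note that a handler must be installed for SIGALRM (without SA_RESTART), because the default action for SIGALRM is to terminate the process.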
You cannot deallocate a shared resource while another thread is or might be using it. In practice, you will find that you cannot even write code to do what you suggest.
Think about it. When you go to call close, how can you possibly know that the other thread is actually blocked in recv? What if it's about to call recv, but then another thread calls socket and gets the descriptor you just closed? Now, not only will that thread not detect any error, but it will be calling recv on the wrong socket!
There is probably a good way to solve your outer problem, the reason you need to exit from a blocking UDP socket read. There are also several ugly hacks available. The basic approach is to make the socket non-blocking and instead of making a blocking UDP socket read, fake a blocking read with select or poll. You can then abort this loop several ways:
One way is to have select time out and check an 'abort' flag when select returns.
Another way is to also select on the read end of a pipe. Send a single byte to the pipe to abort the select.
On a POSIX-compliant system, you can try to monitor your thread:
Call pthread_create with a function that performs your recv, calls pthread_cond_signal right after it, and then returns.
The calling thread does a pthread_cond_timedwait with the desired timeout and terminates the called thread if the wait timed out.
I have a thread that is essentially just for listening on a socket. I have the thread blocking on accept() currently.
How do I tell the thread to finish any current transaction and stop listening, rather than staying blocked on accept?
I don't really want to do non-blocking if I don't have to...
Use the select(2) call to check which fds are ready to read.
The file descriptors returned from the call can be read without blocking, e.g. accept() on a returned fd will immediately create a new connection.
Basically you have two options. The first one is to use interrupts (signals), e.g.:
http://www.cs.cf.ac.uk/Dave/C/node32.html (see the signal handler section; it also supplies a th_kill example).
From accept man page:
accept() shall fail if:
EINTR
The system call was interrupted by a signal that was caught before a valid connection arrived.
Another option is to use non-blocking sockets and select(), e.g.:
http://publib.boulder.ibm.com/infocenter/iseries/v5r3/index.jsp?topic=%2Frzab6%2Frzab6xnonblock.htm
Anyhow, in multi-threaded servers there's usually one thread which accepts new connections and spawns other threads for each connection, since accept()ing and then recv()ing in the same thread can delay new connection requests... (Unless you're working with one client, in which case accept()ing and receiving in one thread might be OK.)
Use pthread_cancel on the thread. You'll need to make sure you've installed appropriate cancellation handlers (pthread_cleanup_push) to avoid resource leaks, and you should disable cancellation except for the duration of the accept call to avoid race conditions where the cancellation request might get acted upon later by a different function than accept.
Note that, due to bugs in glibc's implementation of cancellation, this approach could lead to lost connections and file descriptor leaks. This is because glibc/NPTL provides no guarantee that accept did not already finish execution and allocate a new file descriptor for the new connection before the cancellation request is acted upon. It should be a fairly rare occurrence but it's still an issue to consider...
See: http://sourceware.org/bugzilla/show_bug.cgi?id=12683
and for a discussion of the issue: Implementing cancellable syscalls in userspace
From Wake up thread blocked on accept() call
I just used the shutdown() system call and it seems to work...