I'm writing an epoll-based network server in C. When I create my socket to listen for incoming connections, I make it non-blocking using fcntl. Similarly when incoming connections arrive from clients, I make their sockets non-blocking before doing anything with them, and likewise for outgoing connections' sockets.
Sometimes my server gets a SIGPIPE -- I think this happens when I try to write to a client connection that has been closed by the client. This seems strange to me; I thought that with non-blocking sockets, instead of a SIGPIPE I should get -1 back from the call to write and ECONNRESET in errno.
Is there something I'm missing? Or is it just normal to get both a SIGPIPE and an error code even with non-blocking sockets (meaning that I should explicitly ignore the signal with signal(SIGPIPE, SIG_IGN) in my setup)?
Yes, this is normal. If you write to a socket (non-blocking or not) where the other end has closed the connection, you will get a SIGPIPE or (if you are blocking the SIGPIPE signal) an error return (-1) with errno set to EPIPE.
From the man page for write:
EPIPE: fd is connected to a pipe or socket whose reading end is closed. When this happens the writing process will also receive a SIGPIPE signal. (Thus, the write return value is seen only if the program catches, blocks or ignores this signal.)
The POSIX standard is here: http://pubs.opengroup.org/onlinepubs/009695399/functions/write.html and says:
[EPIPE] An attempt is made to write to a pipe or FIFO that is not open for reading by any process, or that only has one end open. A SIGPIPE signal shall also be sent to the thread.
The SIGPIPE is normal. Another option, besides installing a signal handler solely for this purpose, is to use the MSG_NOSIGNAL flag whenever you send.
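For example, a minimal sketch of that (hedged: send_no_sigpipe, sockfd, buf and len are my own names, and MSG_NOSIGNAL is assumed to be available, as it is on Linux):

#include <errno.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Send on a connected stream socket without ever raising SIGPIPE;
 * the caller decides what to do when the peer has gone away. */
static ssize_t send_no_sigpipe(int sockfd, const void *buf, size_t len)
{
    ssize_t n = send(sockfd, buf, len, MSG_NOSIGNAL);
    if (n == -1) {
        if (errno == EPIPE || errno == ECONNRESET) {
            /* Peer closed the connection: reported as an error, not a signal. */
            fprintf(stderr, "peer closed connection on fd %d\n", sockfd);
        } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
            /* Non-blocking socket with a full send buffer: retry later. */
        } else {
            perror("send");
        }
    }
    return n;
}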
Related
I'm working on an embedded Linux kernel 2.6 device and need to know whether a previously established socket is still valid or not. Also, I cannot do this with the usual send function and check the returned value, because if I send to an invalid socket descriptor, my application will crash and Linux will shut down my process. Is there any other function/suggestion for this?
EDIT:
There is an app manager installed on the device, and when I try to send to a socket descriptor that does not refer to an open socket, the app manager will end my application. So if I close a socket connection and then try to write to it, my application will be shut down by the lower-level app manager. Also, I'm using TCP sockets.
I think this question is either misstated or based on false premises. There is no sense of "invalidity" which a socket could come to have asynchronously by the action of another process/host. The closest thing is probably the other end of the socket being closed, which does not invalidate your socket, but it does cause subsequent writes to your socket to result in an EPIPE error and SIGPIPE signal if not blocked. SIGPIPE in turn terminates your process by default. If that's your problem, the easiest way to avoid it is to block SIGPIPE with sigprocmask/pthread_sigmask, or ignore it with signal(SIGPIPE, SIG_IGN).
I have a small server program that accepts connections on a TCP or local UNIX socket, reads a simple command and (depending on the command) sends a reply.
The problem is that the client may have no interest in the answer and sometimes exits early. So writing to that socket will cause a SIGPIPE and make my server crash.
What's the best practice to prevent the crash here? Is there a way to check if the other side of the line is still reading? (select() doesn't seem to work here as it always says the socket is writable). Or should I just catch the SIGPIPE with a handler and ignore it?
You generally want to ignore the SIGPIPE and handle the error directly in your code. This is because signal handlers in C have many restrictions on what they can do.
The most portable way to do this is to set the SIGPIPE handler to SIG_IGN. This will prevent any socket or pipe write from causing a SIGPIPE signal.
To ignore the SIGPIPE signal, use the following code:
signal(SIGPIPE, SIG_IGN);
If you're using the send() call, another option is to use the MSG_NOSIGNAL option, which will turn the SIGPIPE behavior off on a per call basis. Note that not all operating systems support the MSG_NOSIGNAL flag.
Lastly, you may also want to consider the SO_NOSIGPIPE socket option, which can be set with setsockopt() on some operating systems. This prevents SIGPIPE from being raised by writes, but only on the sockets it is set on.
Another method is to change the socket so it never generates SIGPIPE on write(). This is more convenient in libraries, where you might not want a global signal handler for SIGPIPE.
On most BSD-based systems (macOS, FreeBSD...), assuming you are using C/C++, you can do this with:
int set = 1;
setsockopt(sd, SOL_SOCKET, SO_NOSIGPIPE, (void *)&set, sizeof(int));
With this in effect, instead of the SIGPIPE signal being generated, EPIPE will be returned.
I'm super late to the party, but SO_NOSIGPIPE isn't portable, and might not work on your system (it seems to be a BSD thing).
A nice alternative if you're on, say, a Linux system without SO_NOSIGPIPE would be to set the MSG_NOSIGNAL flag on your send(2) call.
Example replacing write(...) with send(..., MSG_NOSIGNAL) (see nobar's comment):
char buf[888];
//write( sockfd, buf, sizeof(buf) );
send( sockfd, buf, sizeof(buf), MSG_NOSIGNAL );
In this post I describe a possible solution for the Solaris case, when neither SO_NOSIGPIPE nor MSG_NOSIGNAL is available.
Instead, we have to temporarily suppress SIGPIPE in the current thread while it executes the library code. Here's how to do this. To suppress SIGPIPE we first check whether it is pending. If it is, this means it is blocked in this thread and we have to do nothing; if the library generates an additional SIGPIPE, it will be merged with the pending one, which is a no-op. If SIGPIPE is not pending, we block it in this thread and also check whether it was already blocked. Then we are free to execute our writes.

When we want to restore SIGPIPE to its original state, we do the following. If SIGPIPE was pending originally, we do nothing. Otherwise we check whether it is pending now. If it is (which means our actions generated one or more SIGPIPEs), we wait for it in this thread, thus clearing its pending status. To do this we use sigtimedwait() with a zero timeout; this avoids blocking in a scenario where a malicious user sent SIGPIPE manually to the whole process: in that case we would see it pending, but another thread might handle it before we had a chance to wait for it. After clearing the pending status we unblock SIGPIPE in this thread, but only if it wasn't blocked originally.
Example code at https://github.com/kroki/XProbes/blob/1447f3d93b6dbf273919af15e59f35cca58fcc23/src/libxprobes.c#L156
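A condensed sketch of the same idea (hedged: sigpipe_guard, sigpipe_suppress and sigpipe_restore are my own names, error handling is mostly omitted, and a POSIX-threads environment is assumed, hence pthread_sigmask):

#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <time.h>

/* Helper pair implementing the steps described above. */
typedef struct { int was_pending; int was_blocked; } sigpipe_guard;

static void sigpipe_suppress(sigpipe_guard *g)
{
    sigset_t pending, old_mask, pipe_set;

    sigemptyset(&pipe_set);
    sigaddset(&pipe_set, SIGPIPE);

    sigpending(&pending);
    g->was_pending = sigismember(&pending, SIGPIPE);
    g->was_blocked = 1;              /* only meaningful when !was_pending */
    if (g->was_pending)
        return;                      /* already pending, hence blocked: nothing to do */

    /* Block SIGPIPE in this thread, remembering whether it was blocked before. */
    pthread_sigmask(SIG_BLOCK, &pipe_set, &old_mask);
    g->was_blocked = sigismember(&old_mask, SIGPIPE);
}

static void sigpipe_restore(const sigpipe_guard *g)
{
    sigset_t pending, pipe_set;
    struct timespec zero = { 0, 0 };

    if (g->was_pending)
        return;                      /* it was pending before our writes: leave it alone */

    sigemptyset(&pipe_set);
    sigaddset(&pipe_set, SIGPIPE);

    sigpending(&pending);
    if (sigismember(&pending, SIGPIPE)) {
        /* Our writes generated a SIGPIPE: consume it without blocking. */
        while (sigtimedwait(&pipe_set, NULL, &zero) == -1 && errno == EINTR)
            ;
    }

    if (!g->was_blocked)
        pthread_sigmask(SIG_UNBLOCK, &pipe_set, NULL);
}

The library writes would then be wrapped between sigpipe_suppress() and sigpipe_restore() in the calling thread.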
Handle SIGPIPE Locally
It's usually best to handle the error locally rather than in a global signal event handler since locally you will have more context as to what's going on and what recourse to take.
I have a communication layer in one of my apps that allows my app to communicate with an external accessory. When a write error occurs I throw an exception in the communication layer and let it bubble up to a try/catch block to handle it there.
Code:
The code to ignore a SIGPIPE signal so that you can handle it locally is:
// We expect write failures to occur but we want to handle them where
// the error occurs rather than in a SIGPIPE handler.
signal(SIGPIPE, SIG_IGN);
This code will prevent the SIGPIPE signal from being raised, but you will get a read / write error when trying to use the socket, so you will need to check for that.
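For instance, a minimal sketch of that check (hedged: write_checked, sockfd, buf and len are my own names; it assumes SIGPIPE has been ignored as above):

#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* With SIGPIPE ignored, a write to a closed connection fails with EPIPE
 * instead of killing the process; report it and let the caller clean up. */
static int write_checked(int sockfd, const void *buf, size_t len)
{
    ssize_t n = write(sockfd, buf, len);
    if (n == -1) {
        if (errno == EPIPE)
            fprintf(stderr, "fd %d: peer closed, stop using this socket\n", sockfd);
        else
            perror("write");
        return -1;
    }
    return 0;
}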
You cannot prevent the process on the far end of a pipe from exiting, and if it exits before you've finished writing, you will get a SIGPIPE signal. If you SIG_IGN the signal, then your write will return with an error - and you need to note and react to that error. Just catching and ignoring the signal in a handler is not a good idea -- you must note that the pipe is now defunct and modify the program's behaviour so it does not write to the pipe again (because the signal will be generated again, and ignored again, and you'll try again, and the whole process could go on for a long time and waste a lot of CPU power).
Or should I just catch the SIGPIPE with a handler and ignore it?
I believe that is right on. You want to know when the other end has closed their descriptor and that's what SIGPIPE tells you.
What's the best practice to prevent the crash here?
Either disable SIGPIPE as the other answers describe, or catch and ignore the error.
Is there a way to check if the other side of the line is still reading?
Yes, use select().
select() doesn't seem to work here as it always says the socket is writable.
You need to select on the read bits. You can probably ignore the write bits.
When the far end closes its file handle, select will tell you that there is data ready to read. When you go and read that, you will get back 0 bytes, which is how the OS tells you that the file handle has been closed.
The only time you can't ignore the write bits is if you are sending large volumes, and there is a risk of the other end getting backlogged, which can cause your buffers to fill. If that happens, then trying to write to the file handle can cause your program/thread to block or fail. Testing select before writing will protect you from that, but it doesn't guarantee that the other end is healthy or that your data is going to arrive.
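A minimal sketch of that read-side check (hedged: peer_has_closed is my own name; it peeks one byte so pending data is not consumed, and it assumes a connected stream socket):

#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Returns 1 if the peer has closed the connection, 0 if it still looks
 * open, -1 on error.  Zero timeout: only reports what select() already knows. */
static int peer_has_closed(int sockfd)
{
    fd_set rfds;
    struct timeval tv = { 0, 0 };

    FD_ZERO(&rfds);
    FD_SET(sockfd, &rfds);

    int r = select(sockfd + 1, &rfds, NULL, NULL, &tv);
    if (r < 0)
        return -1;
    if (r == 0 || !FD_ISSET(sockfd, &rfds))
        return 0;                               /* nothing readable yet */

    char c;
    ssize_t n = recv(sockfd, &c, 1, MSG_PEEK);  /* peek: leave data in place */
    if (n == 0)
        return 1;                               /* orderly shutdown by the peer */
    return n < 0 ? -1 : 0;
}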
Note that you can get a SIGPIPE from close(), as well as when you write.
close() flushes any buffered data. If the other end has already been closed, then close() will fail, and you will receive a SIGPIPE.
If you are using buffered TCP/IP, then a successful write just means your data has been queued to send; it doesn't mean it has been sent. Until you successfully call close(), you don't know that your data has been sent.
SIGPIPE tells you something has gone wrong; it doesn't tell you what, or what you should do about it.
Under a modern POSIX system (e.g. Linux), you can use the sigprocmask() function (or pthread_sigmask() in multithreaded code).
#include <signal.h>

void block_signal(int signal_to_block /* i.e. SIGPIPE */)
{
    sigset_t set;
    sigset_t old_state;

    // get the current signal mask
    sigprocmask(SIG_BLOCK, NULL, &old_state);

    // add signal_to_block to that existing state
    set = old_state;
    sigaddset(&set, signal_to_block);

    // block that signal also
    sigprocmask(SIG_BLOCK, &set, NULL);

    // ... deal with old_state if required ...
}
If you want to restore the previous state later, make sure to save the old_state somewhere safe. If you call that function multiple times, you need to either use a stack or only save the first or last old_state... or maybe have a function which removes a specific blocked signal.
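For example, a short sketch of that restore step (assuming old_state was saved by a variant of block_signal() that hands it back to the caller):

#include <signal.h>

/* Put the signal mask back exactly as it was before block_signal() ran. */
void restore_signal_state(const sigset_t *old_state)
{
    sigprocmask(SIG_SETMASK, old_state, NULL);
}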
For more info read the man page.
The Linux manual says:
EPIPE The local end has been shut down on a connection-oriented socket. In this case the process will also receive a SIGPIPE unless MSG_NOSIGNAL is set.
But for Ubuntu 12.04 that isn't right. I wrote a test for that case and I always receive EPIPE without SIGPIPE on the first write. SIGPIPE is generated only if I try to write to the same broken socket a second time. So you don't need to ignore SIGPIPE; if this signal happens, it means there is a logic error in your program.
I have a client application communicating with a QEMU process through a QMP Unix domain socket. Sometimes after the client calls close() on the socket connection, 'netstat -ap unix' still shows it in CONNECTED state. I do check the return value of the close() call and it returns successfully with a value of 0, but the connection still seems to be lingering.
Since QMP doesn't really support multiple connections on its socket, all the subsequent calls to connect to the socket fail since they wait indefinitely for the lingering connection to be closed.
Is there a way to make sure from the code that the socket is really closed, and is there a way to force the socket to close?
It could be that the file descriptor has been duplicated (dup), inherited across a fork(), or leaked; the kernel only really tears the connection down when the last descriptor referring to it is closed.
Call shutdown(sock, SHUT_RDWR) on it to close the connection for sure before closing.
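A minimal sketch of that (close_for_sure is my own name; sock is assumed to be the connected descriptor):

#include <errno.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/* Tear the connection down even if duplicated descriptors still refer to it,
 * then release our own descriptor. */
static void close_for_sure(int sock)
{
    if (shutdown(sock, SHUT_RDWR) == -1 && errno != ENOTCONN)
        perror("shutdown");
    if (close(sock) == -1)
        perror("close");
}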
Have you tried closing the socket from the other end? It's asynchronous, but it gives both sides a chance to ensure socket closure.
You can send a close command through to the listener on the other end and have it recycle the socket. When the socket gets closed, you should end up getting a SIGPIPE. Catch the SIGPIPE and close your end of the socket. If you end up with an EPIPE doing that, then ignore it. That just means you were already notified of the socket closure.
You could just try the SO_LINGER option via setsockopt(2) with a timeout of 0. This way, when you call close(2), the socket is forcibly closed, sending an RST instead of going into the FIN/ACK closing behavior.
The purpose of the SO_LINGER option is to control how the socket is shut down when the function close(2) is called. This option applies only to connection-oriented protocols such as TCP.
The default behavior of the kernel is to allow the close(2) function to return immediately to the caller. Any unsent TCP/IP data will be transmitted and delivered if possible, but no guarantee is made. Because the close(2) call returns control immediately to the caller, the application has no way of knowing whether the last bit of data was actually delivered.
The SO_LINGER option can be enabled on the socket, to cause the application to block in the close(2) call until all final data is delivered to the remote end. Furthermore, this assures the caller that both ends have acknowledged a normal socket shutdown. Failing this, the indicated option timeout occurs and an error is returned to the calling application.
One final scenario can be applied, by use of different SO_LINGER option values. If the calling application wants to abort communications immediately, appropriate values can be set in the linger structure. Then, a call to close(2) will initiate an abort of the communication link, discarding all pending data and immediately close the socket.
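A sketch of that abortive variant (abortive_close is my own name; sock is assumed to be a connected TCP socket):

#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/* Abortive close: discard pending data and send an RST instead of the
 * normal FIN/ACK shutdown sequence. */
static void abortive_close(int sock)
{
    struct linger lin;
    lin.l_onoff = 1;      /* enable SO_LINGER */
    lin.l_linger = 0;     /* zero timeout: abort the connection on close */

    if (setsockopt(sock, SOL_SOCKET, SO_LINGER, &lin, sizeof(lin)) == -1)
        perror("setsockopt(SO_LINGER)");
    close(sock);
}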
I'm working with some code that needs to be safe against killing the caller due to SIGPIPE, but the only socket writes it's performing are going to datagram sockets (both UDP and Unix domain datagram sockets). Do I need to worry about SIGPIPE? I'm using connect on the socket, but preliminary testing (on Linux) showed that I just get ECONNREFUSED on send if there's nobody listening on the Unix domain socket. Not sure what happens with UDP.
I can wrap the whole thing in hacks to get rid of SIGPIPE, but if it's a non-issue I'd rather save the overhead and keep the code complexity down.
The answer is in the specification for send:
[EPIPE] The socket is shut down for writing, or the socket is connection-mode and is no longer connected. In the latter case, and if the socket is of type SOCK_STREAM or SOCK_SEQPACKET and the MSG_NOSIGNAL flag is not set, the SIGPIPE signal is generated to the calling thread.
http://pubs.opengroup.org/onlinepubs/9699919799/functions/send.html
Thus, no, writes to datagram sockets do not generate SIGPIPE or an EPIPE error.
The Open Group is one thing, and Apple is another.
It is definitely possible to get a SIGPIPE on iOS when writing to a dead UDP socket, as some of my crash logs have revealed lately.
iOS tends to close UDP sockets while the app is in the background; writing to these sockets can pop a SIGPIPE.
From my crash log (courtesy of testflightapp):
Exception Latest Victim Occurrences
SIGPIPE
2 libsystem_c.dylib 0x32df47ec _sigtramp + 48
3 instant talk 0x0005b10e -[IPRSNetDatagramSocket send:size:to:] (iprs_iphone_net.m:671)...
Don't recall this happening on Linux, Solaris or Windows - though I never tried to close a socket and then write to it.
According to man 2 write on my Debian box,
EPIPE: fd is connected to a pipe or socket whose reading end is closed. When this happens the writing process will also receive a SIGPIPE signal. (Thus, the write return value is seen only if the program catches, blocks or ignores this signal.)
It appears that it is possible to get SIGPIPE when writing to a socket, but it's not clear whether it can happen for UDP sockets specifically.
Can a socket be closed from another thread when a send / recv on the same socket is going on?
Suppose one thread is in blocking recv call and another thread closes the same socket, will the thread in the recv call know this and come out safely?
I would like to know if the behavior will differ between different OS / Platforms. If yes, how will it behave in Solaris?
In Linux, closing a socket won't wake up recv(). Also, as #jxh says:
If a thread is blocked on recv() or send() when the socket is closed by a different thread, the blocked thread will receive an error. However, it is difficult to detect the correct remedial action after receiving the error. This is because the file descriptor number associated with the socket may have been picked up by yet a different thread, and the blocked thread has now been woken up on an error for a "valid" socket. In such a case, the woken up thread should not call close() itself.

The woken up thread will need some way to differentiate whether the error was generated by the connection (e.g. a network error) that requires it to call close(), or if the error was generated by a different thread having called close() on it, in which case it should just error out without doing anything further to the socket.
So the best way to avoid both problems is to call shutdown() instead of close(). shutdown() keeps the file descriptor valid, so it cannot be reused for another socket, and it wakes up recv() with an error; the thread doing the recv() can then close the socket the normal way, just as if an ordinary error had happened.
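A minimal sketch of that pattern (request_stop and receiver_loop are my own names; the thread doing the recv() is assumed to own the final close()):

#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/* Called from another thread: break the connection so recv() wakes up,
 * but do NOT close() here. */
static void request_stop(int sock)
{
    shutdown(sock, SHUT_RDWR);
}

/* The receiving thread: recv() returns 0 or -1 once shutdown() has been
 * called, and this thread performs the one and only close(). */
static void receiver_loop(int sock)
{
    char buf[4096];
    for (;;) {
        ssize_t n = recv(sock, buf, sizeof(buf), 0);
        if (n <= 0) {
            if (n < 0)
                perror("recv");
            break;            /* peer closed, error, or shutdown() requested */
        }
        /* ... handle n bytes of received data ... */
    }
    close(sock);              /* closed exactly once, by the owning thread */
}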
I don't know the Solaris network stack implementation, but I'll throw out my theory/explanation of why it should be safe.
Thread A enters some blocking system call, say read(2), for this given socket. There's no data in the socket receive buffer, so thread A is taken off the processor and put onto the wait queue for this socket. No network stack events are initiated here; the connection state (assuming TCP) has not changed.
Thread B issues close(2) on the socket. While the kernel socket structure should be locked while thread B is accessing it, no other thread is holding that lock (thread A released the lock when it was put into sleep-wait). Assuming there's no outstanding data in the socket send buffer, a FIN packet is sent and the connection enters the FIN_WAIT_1 state (again I assume TCP here; see the connection state diagram).
I'm guessing that the socket connection state change would generate a wakeup for all threads blocked on the given socket. That is, thread A would enter a runnable state and discover that the connection is closing. The wait might be re-entered if the other side has not sent its own FIN, or the system call would return EOF otherwise.
In any case, internal kernel structures will be protected from inappropriate concurrent access. This does not mean it's a good idea to do socket I/O from multiple threads. I would advise looking into non-blocking sockets, state machines, and frameworks like libevent.
For me, calling shutdown() on the socket from another thread does the job on Linux.
If a thread is blocked on recv() or send() when the socket is closed by a different thread, the blocked thread will receive an error. However, it is difficult to detect the correct remedial action after receiving the error. This is because the file descriptor number associated with the socket may have been picked up by yet a different thread, and the blocked thread has now been woken up on an error for a "valid" socket. In such a case, the woken up thread should not call close() itself.
The woken up thread will need some way to differentiate whether the error was generated by the connection (e.g. a network error) that requires it to call close(), or if the error was generated by a different thread having called close() on it, in which case it should just error out without doing anything further to the socket.
Yes, it is ok to close the socket from another thread. Any blocked/busy threads that are using that socket will report a suitable error.