reply = redisCommand(rcontext, "HGET %u %u", env->cr[3], KeyHandle);
if (reply == NULL)
{
    printf("in preNtDeletKey rediscommand error ! and the err type is %d the string is %s \n", rcontext->err, rcontext->errstr);
}
Here I get an error: the reply comes back NULL.
The output is:
in preNtDeletKey rediscommand error ! and the err type is 1 the string is Interrupted system call
I use this in my project, and when I grep the hiredis source I don't find "Interrupted system call" anywhere.
I want to know what causes an "Interrupted system call".
How does hiredis write that string into the redisContext (I can't find it in the source)?
How can we avoid the "Interrupted system call"?
The hiredis package marshals your command using the Redis protocol, and sends it to the Redis server. It then synchronously waits for the reply.
You will find the functions dealing with the sockets in the hiredis.c file:
int redisBufferRead(redisContext *c)
int redisBufferWrite(redisContext *c, int *done)
In these functions, the EAGAIN error is handled, but not the EINTR error which corresponds to the "Interrupted system call" message.
The consequence is that any Unix signal received by the process while hiredis is doing a write or (more likely) a read operation can interrupt the operation and cause this error.
You first need to understand what kind of signal the application receives. Depending on the nature of the signal and the application, there are various ways to handle this situation:
masking or deferring signal handlers before doing Redis calls (see the sketch below)
binding the signal to an event loop handler (if any) so that the signal is not processed when it is not expected
dedicating a given thread to handling all signals (and avoiding any Redis calls in this thread)
using the SA_RESTART option (in sigaction) to tell the system to restart interrupted system calls automatically
simply retrying the operation (though that may not always be possible)
Personally, I would prefer hiredis to handle the situation more gracefully (i.e. processing EINTR just like EAGAIN).
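For the first of those options, here is a minimal sketch that masks the interrupting signal around the synchronous Redis call. SIGUSR1 is only an assumed placeholder and guarded_hget() is a name I made up; substitute whatever signal your application actually receives:

#include <pthread.h>
#include <signal.h>
#include <hiredis/hiredis.h>

redisReply *guarded_hget(redisContext *c, unsigned cr3, unsigned key)
{
    sigset_t block, old;
    redisReply *reply;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);                 /* assumed: the signal your app receives */

    pthread_sigmask(SIG_BLOCK, &block, &old);   /* the signal stays pending for now */
    reply = redisCommand(c, "HGET %u %u", cr3, key);
    pthread_sigmask(SIG_SETMASK, &old, NULL);   /* restore; a pending signal fires here */

    return reply;                               /* still NULL on other I/O errors */
}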
UPDATE:
The EAGAIN error is normally returned in two situations:
when non-blocking mode has been activated by calling redisConnectNonBlock() or redisConnectUnixNonBlock()
when the connection is in blocking mode (default) and the redisSetTimeout() method has been called to set a timeout
Please note that calling the redisSetTimeout() function on the client side just sets the SO_RCVTIMEO and SO_SNDTIMEO properties of the socket. It is completely unrelated to the timeout defined in the Redis configuration file, which is a server-side idle timeout (the Redis server being able to close a connection if it has been inactive for more than N seconds).
Getting EAGAIN in the second situation means the Redis instance is not responsive enough for the provided timeout. You may want to simply increase the timeout or investigate the latency issues further on the Redis server side.
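For reference, a minimal sketch of setting such a client-side timeout; the host, port and the 2-second value are just assumptions for illustration:

#include <hiredis/hiredis.h>
#include <sys/time.h>

redisContext *c = redisConnect("127.0.0.1", 6379);
if (c == NULL || c->err) {
    /* handle the connection error */
}
struct timeval tv = { 2, 0 };   /* 2 s, an arbitrary example value */
redisSetTimeout(c, tv);         /* sets SO_RCVTIMEO / SO_SNDTIMEO on the socket */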
No clue, but a quick search says that (on Linux kernel) a system call can be interrupted when nothing is wrong and when that happens, the typical thing is to just do it again. My guess, since there is nothing to go on here, is the Redis database or some part of your code isn't handling that situation. http://www.win.tue.nl/~aeb/linux/lk/lk-4.html
Related
I have a small server program that accepts connections on a TCP or local UNIX socket, reads a simple command and (depending on the command) sends a reply.
The problem is that the client may have no interest in the answer and sometimes exits early. So writing to that socket will cause a SIGPIPE and make my server crash.
What's the best practice to prevent the crash here? Is there a way to check if the other side of the line is still reading? (select() doesn't seem to work here as it always says the socket is writable). Or should I just catch the SIGPIPE with a handler and ignore it?
You generally want to ignore the SIGPIPE and handle the error directly in your code. This is because signal handlers in C have many restrictions on what they can do.
The most portable way to do this is to set the SIGPIPE handler to SIG_IGN. This will prevent any socket or pipe write from causing a SIGPIPE signal.
To ignore the SIGPIPE signal, use the following code:
signal(SIGPIPE, SIG_IGN);
If you're using the send() call, another option is to use the MSG_NOSIGNAL option, which will turn the SIGPIPE behavior off on a per call basis. Note that not all operating systems support the MSG_NOSIGNAL flag.
Lastly, you may also want to consider the SO_NOSIGPIPE socket option, which can be set with setsockopt() on some operating systems. This will prevent SIGPIPE from being raised by writes, but only on the sockets it is set on.
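Combining the pieces above, a rough sketch of ignoring the signal process-wide and then handling the failure at the write site (sock_fd, buf and len are assumed to come from your own code):

#include <errno.h>
#include <signal.h>
#include <sys/socket.h>

signal(SIGPIPE, SIG_IGN);                  /* process-wide: writes now fail with EPIPE */

ssize_t n = send(sock_fd, buf, len, 0);
if (n < 0 && errno == EPIPE) {
    /* the peer has gone away: close the socket and stop writing to it */
}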
Another method is to change the socket so it never generates SIGPIPE on write(). This is more convenient in libraries, where you might not want a global signal handler for SIGPIPE.
On most BSD-based (MacOS, FreeBSD...) systems, (assuming you are using C/C++), you can do this with:
int set = 1;
setsockopt(sd, SOL_SOCKET, SO_NOSIGPIPE, (void *)&set, sizeof(int));
With this in effect, instead of the SIGPIPE signal being generated, EPIPE will be returned.
I'm super late to the party, but SO_NOSIGPIPE isn't portable, and might not work on your system (it seems to be a BSD thing).
A nice alternative if you're on, say, a Linux system without SO_NOSIGPIPE would be to set the MSG_NOSIGNAL flag on your send(2) call.
Example replacing write(...) by send(...,MSG_NOSIGNAL) (see nobar's comment)
char buf[888];
//write( sockfd, buf, sizeof(buf) );
send( sockfd, buf, sizeof(buf), MSG_NOSIGNAL );
In this post I described a possible solution for the Solaris case, where neither SO_NOSIGPIPE nor MSG_NOSIGNAL is available.
Instead, we have to temporarily suppress SIGPIPE in the current thread that executes the library code. Here's how to do this: to suppress SIGPIPE we first check whether it is pending. If it is, this means it is blocked in this thread, and we have to do nothing; if the library generates an additional SIGPIPE, it will be merged with the pending one, which is a no-op. If SIGPIPE is not pending, we block it in this thread and also record whether it was already blocked. Then we are free to execute our writes.
When we want to restore SIGPIPE to its original state, we do the following: if SIGPIPE was pending originally, we do nothing. Otherwise we check whether it is pending now. If it is (which means that our actions have generated one or more SIGPIPEs), we wait for it in this thread, thus clearing its pending status. To do this we use sigtimedwait() with a zero timeout; this avoids blocking in a scenario where a malicious user has sent SIGPIPE manually to the whole process: in that case we would see it pending, but another thread might handle it before we had a chance to wait for it. After clearing the pending status we unblock SIGPIPE in this thread, but only if it wasn't blocked originally.
Example code at https://github.com/kroki/XProbes/blob/1447f3d93b6dbf273919af15e59f35cca58fcc23/src/libxprobes.c#L156
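A condensed sketch of that suppress/restore pair, under the assumptions described above (function and variable names are mine, not taken from the linked file):

#include <errno.h>
#include <pthread.h>
#include <signal.h>

void suppress_sigpipe(int *was_pending, int *was_blocked)
{
    sigset_t pending, block, old;

    sigpending(&pending);
    *was_pending = sigismember(&pending, SIGPIPE);
    *was_blocked = 1;                  /* pending implies it is already blocked here */
    if (*was_pending)
        return;

    sigemptyset(&block);
    sigaddset(&block, SIGPIPE);
    pthread_sigmask(SIG_BLOCK, &block, &old);
    *was_blocked = sigismember(&old, SIGPIPE);
}

void restore_sigpipe(int was_pending, int was_blocked)
{
    sigset_t pending, set;
    struct timespec zero = { 0, 0 };

    if (was_pending)
        return;                        /* we changed nothing, so restore nothing */

    sigemptyset(&set);
    sigaddset(&set, SIGPIPE);

    sigpending(&pending);
    if (sigismember(&pending, SIGPIPE)) {
        /* our writes generated SIGPIPE: consume it without blocking */
        while (sigtimedwait(&set, NULL, &zero) < 0 && errno == EINTR)
            ;
    }
    if (!was_blocked)
        pthread_sigmask(SIG_UNBLOCK, &set, NULL);
}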
Handle SIGPIPE Locally
It's usually best to handle the error locally rather than in a global signal event handler since locally you will have more context as to what's going on and what recourse to take.
I have a communication layer in one of my apps that allows my app to communicate with an external accessory. When a write error occurs, I throw an exception in the communication layer and let it bubble up to a try/catch block to handle it there.
Code:
The code to ignore a SIGPIPE signal so that you can handle it locally is:
// We expect write failures to occur but we want to handle them where
// the error occurs rather than in a SIGPIPE handler.
signal(SIGPIPE, SIG_IGN);
This code will prevent the SIGPIPE signal from being raised, but you will get a read / write error when trying to use the socket, so you will need to check for that.
You cannot prevent the process on the far end of a pipe from exiting, and if it exits before you've finished writing, you will get a SIGPIPE signal. If you SIG_IGN the signal, then your write will return with an error - and you need to note and react to that error. Just catching and ignoring the signal in a handler is not a good idea -- you must note that the pipe is now defunct and modify the program's behaviour so it does not write to the pipe again (because the signal will be generated again, and ignored again, and you'll try again, and the whole process could go on for a long time and waste a lot of CPU power).
Or should I just catch the SIGPIPE with a handler and ignore it?
I believe that is right on. You want to know when the other end has closed their descriptor and that's what SIGPIPE tells you.
Sam
What's the best practice to prevent the crash here?
Either disable sigpipes as per everybody, or catch and ignore the error.
Is there a way to check if the other side of the line is still reading?
Yes, use select().
select() doesn't seem to work here as it always says the socket is writable.
You need to select on the read bits. You can probably ignore the write bits.
When the far end closes its file handle, select will tell you that there is data ready to read. When you go and read that, you will get back 0 bytes, which is how the OS tells you that the file handle has been closed.
The only time you can't ignore the write bits is if you are sending large volumes, and there is a risk of the other end getting backlogged, which can cause your buffers to fill. If that happens, then trying to write to the file handle can cause your program/thread to block or fail. Testing select before writing will protect you from that, but it doesn't guarantee that the other end is healthy or that your data is going to arrive.
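A small sketch of that read-side check (sock_fd is assumed to be the connected socket):

#include <sys/select.h>
#include <sys/socket.h>

fd_set rfds;
FD_ZERO(&rfds);
FD_SET(sock_fd, &rfds);

if (select(sock_fd + 1, &rfds, NULL, NULL, NULL) > 0 && FD_ISSET(sock_fd, &rfds)) {
    char buf[256];
    ssize_t n = recv(sock_fd, buf, sizeof(buf), 0);
    if (n == 0) {
        /* 0 bytes read: the peer has closed its end; stop writing to this socket */
    }
}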
Note that you can get a sigpipe from close(), as well as when you write.
Close flushes any buffered data. If the other end has already been closed, then close will fail, and you will receive a sigpipe.
If you are using buffered TCPIP, then a successful write just means your data has been queued to send, it doesn't mean it has been sent. Until you successfully call close, you don't know that your data has been sent.
Sigpipe tells you something has gone wrong, it doesn't tell you what, or what you should do about it.
Under a modern POSIX system (e.g. Linux), you can use the sigprocmask() function.
#include <signal.h>
void block_signal(int signal_to_block /* i.e. SIGPIPE */ )
{
sigset_t set;
sigset_t old_state;
// get the current state
//
sigprocmask(SIG_BLOCK, NULL, &old_state);
// add signal_to_block to that existing state
//
set = old_state;
sigaddset(&set, signal_to_block);
// block that signal also
//
sigprocmask(SIG_BLOCK, &set, NULL);
// ... deal with old_state if required ...
}
If you want to restore the previous state later, make sure to save the old_state somewhere safe. If you call that function multiple times, you need to either use a stack or only save the first or last old_state... or maybe have a function which removes a specific blocked signal.
For more info read the man page.
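If you do want the restore step, a matching pair could look like this (the function names are mine, not from the man page):

#include <signal.h>

void block_signal_save(int sig, sigset_t *old_state)
{
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, sig);
    sigprocmask(SIG_BLOCK, &set, old_state);    /* previous mask saved for the caller */
}

void restore_signal_state(const sigset_t *old_state)
{
    sigprocmask(SIG_SETMASK, old_state, NULL);  /* put the saved mask back as-is */
}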
The Linux manual says:
EPIPE The local end has been shut down on a connection oriented
socket. In this case the process will also receive a SIGPIPE
unless MSG_NOSIGNAL is set.
But on Ubuntu 12.04 that isn't right. I wrote a test for that case and I always receive EPIPE without SIGPIPE; SIGPIPE is only generated if I try to write to the same broken socket a second time. So you don't need to ignore SIGPIPE: if this signal happens, it means there is a logic error in your program.
I'm writing an application that uses a timer to do some data acquisition and processing at a fixed sample rate (200 Hz).
The application acts like a server and runs in the background. It should be controllable from other processes or other machines over UDP.
To do so, I use the timer_create() API to generate SIGUSR1 periodically and call a handler that does the acquisition and the processing.
The code to configure the timer is as follows (minus error checking, for clarity):
struct sigaction sa;
struct sigevent sev;
timer_t timerid;

sa.sa_flags = SA_SIGINFO;
sa.sa_sigaction = handler;
sigemptyset(&sa.sa_mask);
sigaction(SIGUSR1, &sa, NULL);
sev.sigev_notify = SIGEV_SIGNAL;
sev.sigev_signo = SIGUSR1;
sev.sigev_value.sival_ptr = &timerid;
timer_create(CLOCK_REALTIME, &sev, &timerid);
timer_settime(...)
The code above is called when a 'start' command is received over UDP. To check for commands I have an infinite loop in my main program that calls the recvfrom() syscall.
The problem is that when a 'start' command is received and the timer is properly started and running (using the code above), I get an 'Interrupted system call' error (EINTR) because the SIGUSR1 signal sent by the timer interrupts the recvfrom() call. If I check for this particular error code and ignore it, I eventually get a 'connection refused' error when calling recvfrom().
So here are my questions:
How do I solve this 'Interrupted system call' error, given that simply ignoring it and redoing the recvfrom() doesn't work?
Why do I get the 'connection refused' error after about twenty tries?
I have the feeling that using SIGEV_THREAD could be a solution; as I understand it, it creates a new thread (as with pthread_create) without generating a signal. Am I right?
Is the signal number important here? Is there any benefit to using a real-time signal?
Is there any other way to do what I intend to do: have a background loop checking for commands from UDP plus a real-time periodic task?
And here is the bonus question:
Is it safe to do the data acquisition and the processing in the handler, or should I use a semaphore mechanism to wake up a thread that does it?
Solution:
As suggested in an answer and in the comments, using SA_RESTART seems to fix the main issue.
Solution 2:
Using SIGEV_THREAD instead of SIGEV_SIGNAL works too. I've read somewhere that SIGEV_THREAD could require more resources than SIGEV_SIGNAL; however, I have not seen a significant difference in the timing of the task.
Timers tend to be implemented using SIGALRM.
Signal receipt, including SIGALRM, tends to cause long-running system calls to return early with EINTR in errno.
SA_RESTART is one way around this, so that system calls interrupted by receipt of a signal are automatically restarted. Another is to check your system calls' errnos for EINTR and restart them when you see it.
With read() and write() of course, you can't just restart, you need to pick up where you left off. That's why these return the length of data transmitted.
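Applied to the question's code, the SA_RESTART variant is a one-flag change (sketch):

#include <signal.h>
#include <string.h>

struct sigaction sa;
memset(&sa, 0, sizeof(sa));
sa.sa_flags = SA_SIGINFO | SA_RESTART;   /* restart recvfrom() after SIGUSR1 */
sa.sa_sigaction = handler;
sigemptyset(&sa.sa_mask);
sigaction(SIGUSR1, &sa, NULL);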
Given that you're using Linux, I would opt for using timerfd_create instead.
That way you can just select(2), poll(2) or epoll(7) instead and handle timer events without the difficulty of signal handlers in your main loop.
As for EINTR (Interrupted System Call), those are properly handled by just restarting the specific system call that got interrupted.
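A rough sketch of the timerfd approach for a 200 Hz tick (udp_fd stands for the question's existing command socket; error checking omitted):

#include <poll.h>
#include <stdint.h>
#include <sys/timerfd.h>
#include <unistd.h>

int tfd = timerfd_create(CLOCK_MONOTONIC, 0);
struct itimerspec its = { { 0, 5000000 }, { 0, 5000000 } };   /* 5 ms period, 5 ms first expiry */
timerfd_settime(tfd, 0, &its, NULL);

struct pollfd fds[2] = {
    { tfd,    POLLIN, 0 },
    { udp_fd, POLLIN, 0 },
};

for (;;) {
    poll(fds, 2, -1);
    if (fds[0].revents & POLLIN) {
        uint64_t expirations;
        read(tfd, &expirations, sizeof(expirations));   /* acknowledge the tick(s) */
        /* data acquisition and processing here */
    }
    if (fds[1].revents & POLLIN) {
        /* recvfrom(udp_fd, ...) and handle the command */
    }
}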
Restarting the interrupted system call is the correct response to EINTR. Your 'Connection Refused' problem is an unrelated error: on a UDP socket, it indicates that a previous packet sent on that socket was rejected by the destination (notified through an ICMP message).
Question 5: Your use of a message and real-time periodic thread is perfectly fine. However, I would suggest you avoid using timers altogether, precisely because they use signals. I've run into this problem myself and eventually replaced the timer with a simple clock_nanosleep() that uses TIMER_ABSTIME with time updated to maintain the desired rate (i.e. add the period to the absolute time). The result was simpler code, no more problems with signals, and a more accurate timer than the signal-based timer. BTW, you should measure your timer's period in the handler to make sure it is accurate enough. My experience with timers was 8 years ago, so the problem with accuracy might be fixed. However, the other problems with signals are inherent to signals themselves and thus can't be "solved" -- only worked around.
Also, I see no problem with doing data acquisition from the handler, it should certainly reduce latency in retrieving the data.
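For completeness, a sketch of the clock_nanosleep() loop described above, at the question's 200 Hz rate:

#include <time.h>

const long period_ns = 5000000;            /* 5 ms = 200 Hz */
struct timespec next;
clock_gettime(CLOCK_MONOTONIC, &next);

for (;;) {
    next.tv_nsec += period_ns;
    if (next.tv_nsec >= 1000000000L) {     /* carry into seconds */
        next.tv_nsec -= 1000000000L;
        next.tv_sec  += 1;
    }
    clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
    /* data acquisition and processing here */
}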
I have a thread that is essentially just for listening on a socket. I have the thread blocking on accept() currently.
How do I tell the thread to finish any current transaction and stop listening, rather than staying blocked on accept?
I don't really want to do non-blocking if I don't have to...
Use the select(2) call to check which fds are ready to read.
The file descriptors it reports can then be used without blocking; e.g. accept() on a listening fd that select() marks as readable will immediately return a new connection.
Basically you have two options. The first one is to use interrupts (signals), e.g.:
http://www.cs.cf.ac.uk/Dave/C/node32.html (see the signal handler section; it also supplies a th_kill example).
From accept man page:
accept() shall fail if:
EINTR
The system call was interrupted by a signal that was caught before a valid connection arrived.
Another option is to use non-blocking sockets and select(), e.g.:
http://publib.boulder.ibm.com/infocenter/iseries/v5r3/index.jsp?topic=%2Frzab6%2Frzab6xnonblock.htm
Anyhow, usually in multi-threaded servers there's one thread which accepts new connections and spawns other threads for each connection, since accept()ing and then recv()ing in the same thread can delay new connection requests. (Unless you're working with a single client, in which case accept()ing and then receiving might be OK.)
Use pthread_cancel on the thread. You'll need to make sure you've installed appropriate cancellation handlers (pthread_cleanup_push) to avoid resource leaks, and you should disable cancellation except for the duration of the accept call to avoid race conditions where the cancellation request might get acted upon later by a different function than accept.
Note that, due to bugs in glibc's implementation of cancellation, this approach could lead to lost connections and file descriptor leaks. This is because glibc/NPTL provides no guarantee that accept did not already finish execution and allocate a new file descriptor for the new connection before the cancellation request is acted upon. It should be a fairly rare occurrence but it's still an issue to consider...
See: http://sourceware.org/bugzilla/show_bug.cgi?id=12683
and for a discussion of the issue: Implementing cancellable syscalls in userspace
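A sketch of that pattern (function and variable names are mine); cancellation is only enabled for the duration of accept(), and a cleanup handler closes the connection fd if the thread is cancelled just after accept() returned:

#include <pthread.h>
#include <sys/socket.h>
#include <unistd.h>

static void close_conn(void *arg)
{
    int fd = *(int *)arg;
    if (fd >= 0)
        close(fd);                 /* cleanup if cancelled after accept() returned */
}

void *listener(void *arg)
{
    int listen_fd = *(int *)arg;
    int conn_fd = -1;
    int old;

    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old);
    for (;;) {
        pthread_cleanup_push(close_conn, &conn_fd);
        pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, &old);
        conn_fd = accept(listen_fd, NULL, NULL);   /* the only cancellation point */
        pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old);
        pthread_cleanup_pop(0);

        if (conn_fd >= 0) {
            /* hand the connection off to a worker, then forget it here */
            conn_fd = -1;
        }
    }
    return NULL;
}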
From Wake up thread blocked on accept() call
I just used the shutdown() system call and it seems to work...
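In code, that approach is simply the following, with listen_fd being the shared listening socket (the exact errno that accept() then returns varies by platform):

#include <sys/socket.h>

/* From the controlling thread: */
shutdown(listen_fd, SHUT_RDWR);

/* In the listening thread, accept() then returns -1 (commonly EINVAL or
 * ECONNABORTED); treat that as the signal to finish up and stop listening. */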
I've been working on a polling TCP daemon for some time now. Recently, I've read that non-blocking sockets can sometimes throw an EWOULDBLOCK error during a send() or recv(). My understanding is that if recv() throws an EWOULDBLOCK, this (usually) means that there's nothing to receive. But what I'm unclear on is under what circumstances send() would throw an EWOULDBLOCK, and what would be proper procedure for handling such an event?
If send() throws an EWOULDBLOCK, should the daemon simply move on from that event, onto the next one? Using a polling interface like epoll, will a new event be fired when the descriptor becomes ready for writing?
what I'm unclear on is under what circumstances send() would throw an EWOULDBLOCK
When the send buffer (typically held by the OS but, anyway, somewhere in the TCP/IP stack) is full and the counterpart hasn't yet acknowledged any of the data sent to it from the buffer (so the stack must retain everything in the buffer in case a resend is necessary).
what would be proper procedure for handling such an event?
In one way or another you must wait until the counterpart does acknowledge some of the packets sent to it, thereby allowing the TCP/IP stack to free some space for more "sending". Both classical select and more modern epoll (and in other OS's, kqueue &c) provide smart ways to perform such waiting (whether you're waiting to read something, write something, or "whichever of the two happens first"). Yep, watched-descriptors becoming ready (be it for reading or for writing) is the typical reason for epoll events!
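Sketch of the epoll side of that (epfd, conn_fd, buf and len are assumed from the daemon's own state):

#include <errno.h>
#include <sys/epoll.h>
#include <sys/socket.h>

ssize_t n = send(conn_fd, buf, len, 0);
if (n < 0 && (errno == EWOULDBLOCK || errno == EAGAIN)) {
    /* The kernel send buffer is full: keep the unsent data queued locally and
     * ask epoll to report when the descriptor becomes writable again. */
    struct epoll_event ev;
    ev.events = EPOLLIN | EPOLLOUT;
    ev.data.fd = conn_fd;
    epoll_ctl(epfd, EPOLL_CTL_MOD, conn_fd, &ev);
} else if (n >= 0 && (size_t)n < len) {
    /* Partial send: the remaining len - n bytes also wait for EPOLLOUT. */
}
/* When epoll_wait() later reports EPOLLOUT on conn_fd, resume sending and
 * drop EPOLLOUT from the interest set once the local backlog is drained. */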