Sockets on Linux question
I have a worker thread that is blocked on an accept() call. It simply waits for an incoming network connection, handles it, and then returns to listening for the next connection.
When it is time for the program to exit, how do I signal this network worker thread (from the main thread) to return from the accept() call while still being able to gracefully exit its loop and run its cleanup code?
Some things I tried:
pthread_kill to send a signal. Feels kludgy to do this, plus it doesn't reliably allow the thread to do its shutdown logic. It also makes the program terminate. I'd like to avoid signals if at all possible.
pthread_cancel. Same as above. It's a harsh kill on the thread. That, and the thread may be doing something else.
Closing the listen socket from the main thread in order to make accept() abort. This doesn't reliably work.
Some constraints:
If the solution involves making the listen socket non-blocking, that is fine. But I don't want to accept a solution that involves the thread waking up via a select call every few seconds to check the exit condition.
The thread condition to exit may not be tied to the process exiting.
Essentially, the logic I am going for looks like this.
void* WorkerThread(void* args)
{
DoSomeImportantInitialization(); // initialize listen socket and some thread specific stuff
while (HasExitConditionBeenSet()==false)
{
listensize = sizeof(listenaddr);
int sock = accept(listensocket, &listenaddr, &listensize);
// check if exit condition has been set using thread safe semantics
if (HasExitConditionBeenSet())
{
break;
}
if (sock < 0)
{
printf("accept returned %d (errno==%d)\n", sock, errno);
}
else
{
HandleNewNetworkCondition(sock, &listenaddr);
}
}
DoSomeImportantCleanup(); // close listen socket, close connections, cleanup etc..
return NULL;
}
void SignalHandler(int sig)
{
printf("Caught CTRL-C\n");
}
void NotifyWorkerThreadToExit(pthread_t thread_handle)
{
// signal thread to exit
}
int main()
{
void* ptr_ret= NULL;
pthread_t workerthread_handle = 0;
pthread_create(&workerthread_handle, NULL, WorkerThread, NULL);
signal(SIGINT, SignalHandler);
sleep((unsigned int)-1); // sleep until the user hits ctrl-c
printf("Returned from sleep call...\n");
SetThreadExitCondition(); // sets global variable with barrier that worker thread checks on
// this is the function I'm stalled on writing
NotifyWorkerThreadToExit(workerthread_handle);
// wait for thread to exit cleanly
pthread_join(workerthread_handle, &ptr_ret);
DoProcessCleanupStuff();
}
Close the socket using the shutdown() call. This will wake up any threads blocked on it, while keeping the file descriptor valid.
close() on a descriptor another thread B is using is inherently hazardous: another thread C may open a new file descriptor which thread B will then use instead of the closed one. dup2() a /dev/null onto it avoids that problem, but does not wake up blocked threads reliably.
Note that shutdown() only works on sockets -- for other kinds of descriptors you likely need the select+pipe-to-self or cancellation approaches.
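A minimal sketch of the shutdown() approach, assuming Linux TCP sockets, where a blocked accept() on a shut-down listening socket returns -1 (with errno EINVAL, as far as I know; other systems may behave differently). The struct worker type and the make_listener, worker_main and stop_worker names are made up for illustration and are not from the question.

```c
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <pthread.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical worker state; these names are not from the question. */
struct worker {
    int listen_fd;     /* listening socket, never modified after start */
    int accept_errno;  /* errno of the accept() call that ended the loop */
    int cleaned_up;    /* set by the worker once its cleanup code ran */
};

/* Create a TCP listening socket on 127.0.0.1 with a kernel-chosen port. */
static int make_listener(void)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 8) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

static void *worker_main(void *arg)
{
    struct worker *w = arg;

    for (;;) {
        int conn = accept(w->listen_fd, NULL, NULL);
        if (conn < 0) {
            if (errno == EINTR)
                continue;
            w->accept_errno = errno;  /* expected after shutdown() */
            break;
        }
        close(conn);  /* a real worker would handle the connection here */
    }
    w->cleaned_up = 1;  /* stand-in for the worker's own cleanup code */
    return NULL;
}

/* Called from the main thread: wake the worker, wait for it, and only then
 * close the descriptor, so the number cannot be reused while the worker
 * might still touch it. */
static int stop_worker(pthread_t tid, struct worker *w)
{
    int rc;

    if (shutdown(w->listen_fd, SHUT_RDWR) < 0)
        return -1;
    rc = pthread_join(tid, NULL);
    close(w->listen_fd);
    return rc;
}
```

The main thread would create the worker with pthread_create(&tid, NULL, worker_main, &w) and later call stop_worker(tid, &w). Closing the descriptor only after the join is what avoids the descriptor-reuse hazard described above.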
You can use a pipe to notify the thread that you want it to exit. Then you can have a select() call which selects on both the pipe and the listening socket.
For example (compiles but not fully tested):
// NotifyPipe.h
#ifndef NOTIFYPIPE_H_INCLUDED
#define NOTIFYPIPE_H_INCLUDED
class NotifyPipe
{
int m_receiveFd;
int m_sendFd;
public:
NotifyPipe();
virtual ~NotifyPipe();
int receiverFd();
void notify();
};
#endif // NOTIFYPIPE_H_INCLUDED
// NotifyPipe.cpp
#include "NotifyPipe.h"
#include <unistd.h>
#include <assert.h>
#include <fcntl.h>
NotifyPipe::NotifyPipe()
{
int pipefd[2];
int ret = pipe(pipefd);
assert(ret == 0); // For real usage put proper check here
m_receiveFd = pipefd[0];
m_sendFd = pipefd[1];
fcntl(m_sendFd,F_SETFL,O_NONBLOCK);
}
NotifyPipe::~NotifyPipe()
{
close(m_sendFd);
close(m_receiveFd);
}
int NotifyPipe::receiverFd()
{
return m_receiveFd;
}
void NotifyPipe::notify()
{
write(m_sendFd,"1",1);
}
Then select with receiverFd(), and notify for termination using notify().
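For the worker side, here is a rough sketch in plain C of what "select on both the pipe and the listening socket" could look like. The function names and the enum are my own invention; notify_fd plays the role of receiverFd() and notify_stop() the role of notify().

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/select.h>
#include <unistd.h>

enum wake_reason { WAKE_ERROR = -1, WAKE_STOP = 0, WAKE_CONNECTION = 1 };

/* Block until listen_fd is readable (a connection is pending) or a byte
 * has been written to the notify pipe. There is no timeout, so the thread
 * never wakes up just to poll a flag. */
static enum wake_reason wait_for_connection_or_stop(int listen_fd, int notify_fd)
{
    for (;;) {
        fd_set readfds;
        int maxfd = listen_fd > notify_fd ? listen_fd : notify_fd;

        FD_ZERO(&readfds);   /* select() overwrites the set: rebuild it */
        FD_SET(listen_fd, &readfds);
        FD_SET(notify_fd, &readfds);

        if (select(maxfd + 1, &readfds, NULL, NULL, NULL) < 0) {
            if (errno == EINTR)
                continue;    /* interrupted by a signal: just retry */
            return WAKE_ERROR;
        }
        if (FD_ISSET(notify_fd, &readfds))
            return WAKE_STOP;        /* the exit request wins */
        if (FD_ISSET(listen_fd, &readfds))
            return WAKE_CONNECTION;  /* caller should now call accept() */
    }
}

/* Main-thread side: write one byte to the sending end of the pipe. */
static int notify_stop(int send_fd)
{
    ssize_t n;

    do {
        n = write(send_fd, "1", 1);
    } while (n < 0 && errno == EINTR);
    return n == 1 ? 0 : -1;
}
```

The worker loop would call wait_for_connection_or_stop() instead of blocking in accept() directly, and call accept() only on WAKE_CONNECTION. Making the listening socket non-blocking as well guards against the rare case where the pending connection disappears between select() and accept().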
Close the listening socket and accept will return an error.
What doesn't reliably work with this? Describe the problems you're facing.
Using pthread_cancel to cancel a thread blocked in accept() is risky if the pthread implementation does not implement cancellation properly. That is, if accept() has just created a new socket for the thread and pthread_cancel() is called for that thread just before accept() returns to your code, the thread is canceled and the newly created socket is leaked. FreeBSD 9.0 and later does not have this race condition, but you should check your OS first.
Related
Background: My code structure: I have a master socket on the main thread. Each time a new client connects, the thread pool is notified and one pre-allocated thread takes the task.
Inside this thread, I pass it a slave socket and let it use the accept call to listen to the client.
Scenario: In my thread pool, thread A is currently listening to a client. Now I want to stop all the pre-allocated threads and close all connections to the clients. The main thread tries to close the connection to the client, and tries to terminate thread A using pthread_join.
main() {
// create threadpool
// logic to create mastersocket
startServer(masterSock)
IwantToCloseServer() // this function is not called directly in main, but triggered by a terminal signal, like kill -quit pid.
}
int startServer(int msock) {
int ssock; // slaveSocket
struct sockaddr_in client_addr; // the address of the client...
unsigned int client_addr_len = sizeof(client_addr); // ... and its length
while (!stopCondition) {
// Accept connection:
ssock = ::accept((int)msock, (struct sockaddr*)&client_addr, &client_addr_len); // the return value is a socket
// I tried replacing this line with poll(), but it does not behave the same as before
if (ssock < 0) {
if (errno == EINTR) continue;
perror("accept");
running =0;
return 0;
// exit(0);
} else {
// push task to thread pool to deal with logic
}
// main thread continues with the loop...
}
return 1;
}
IwantToCloseServer(slaveSocket) {
// when I call close() or shutdown() to close the connections, both functions always return -1, because the thread is blocked in the accept call
// the logic that tries to terminate all the pre-allocated threads gets stuck in pthread_join, because the thread is blocked in accept
}
Problem: Thread A stays blocked in the ::accept function, the close and shutdown functions return -1 and won't close the connection, and pthread_join never returns because thread A is blocked in accept.
Things I tried:
1. I tried changing the while loop around the accept call, for example by setting a flag stopCondition:
while(!stopCondition) {
ssock = ::accept((int)msock, (struct sockaddr*)&client_addr, &client_addr_len);
}
However, when the main thread changes stopCondition, thread A is still blocked inside the accept function.
It never gets back to the loop condition, so changing the flag has no effect on accept and this does not work.
2. I also tried sending a signal to the blocked thread A, using
pthread_cancel or pthread_kill(Thread A, 9)
However, if I do this, the whole process gets killed.
3. I tried replacing the line that calls accept with poll() with a timeout.
However, the program doesn't behave like before: it can't listen to clients anymore.
How do I terminate thread A (which is currently blocked in the accept call), so that I can clean up this pre-allocated thread and restart my server?
By the way, I cannot use a library like Boost in my current program, and this is on Linux, not Winsock.
To check stopCondition periodically in your while(!stopCondition) { loop, first call select/poll with a timeout to find out whether there is something new on msock, then depending on the result call accept, else do nothing.
> I was trying to replace this line of code to poll()
> try to use poll() to replace the line, where the accept functions at, with a timeout
You cannot replace accept with poll. You have to call select/poll first, check the result, and only then, if msock is readable, call accept.
Apart from that,
while(!stopCondition) {
if(!stopCondition) {
is redundant and can be replaced by
while(!stopCondition) {
gcc (GCC) 4.6.3
valgrind-3.6.1
I have created an application that sends and receives messages in 2 different threads, one for sending and one for receiving. It uses pthreads, with condition variables and mutexes for locking.
The sender sends messages and then signals the receiver to receive and process them. It does this in a while loop.
However, a problem occurs if I want to quit the application by using ctrl-c and handling the interrupt. If no messages are being sent, the receiver is stuck in the while loop waiting to receive.
The main thread calls join and blocks waiting for the receiver to finish. But it never does, as it is waiting in pthread_cond_wait.
I was thinking of using pthread_cancel or pthread_kill, but I don't like to do that as it doesn't allow the thread to exit normally.
Many thanks for any suggestions.
main function
int main(void)
{
int rc;
/* Do some stuff here */
/* Start thread to receive messages */
rc = pthread_create(&thread_recv_id, &thread_attr, thread_recv_fd, NULL);
if(rc != 0) {
/* pthread_create returns the error number; it does not set errno */
fprintf(stderr, "Failed to create thread for receiving, reason [ %s ]\n",
strerror(rc));
return 1;
}
printf("Start listening for receiving data\n");
/* Start thread that will send a message */
rc = pthread_create(&thread_send_id, &thread_attr, thread_send_fd, NULL);
if(rc != 0) {
fprintf(stderr, "Failed to create thread for sending, reason [ %s ]\n",
strerror(rc));
return 1;
}
/* Clean up threading properties */
pthread_join(thread_send_id, NULL);
pthread_join(thread_recv_id, NULL); /* <---- blocking here waiting for the recv thread to finish */
pthread_mutex_destroy(&mutex_queue);
pthread_cond_destroy(&cond_queue);
return 0;
}
sender thread
void *thread_send_fd()
{
pthread_mutex_lock(&mutex_queue);
if(send_fd((int)fd) == FALSE) {
/* Just continue to send another item */
continue;
}
/* Signal the waiting thread to remove the item that has been sent */
pthread_cond_signal(&cond_queue);
pthread_mutex_unlock(&mutex_queue);
}
receiver thread
void *thread_recv_fd()
{
while(is_receiving()) {
pthread_mutex_lock(&mutex_queue);
/* Wait for an item to be sent on the queue */
pthread_cond_wait(&cond_queue, &mutex_queue); /* <---- waiting here */
queue_remove();
pthread_mutex_unlock(&mutex_queue);
}
pthread_exit(NULL);
}
You basically have 3 choices:
1. Use pthread_cancel. This will interrupt the pthread_cond_wait call, and then exit the thread, invoking the cancellation handlers registered with pthread_cleanup_push on the way up.
2. Use pthread_kill to send a signal to the thread. This doesn't "kill" the thread, it just sends a signal. In this case, you must have registered a signal handler in that thread for the signal you use, and that signal handler must do something to tell the thread to exit. This isn't particularly better than the third option, since the signal handler still needs to do something to make the pthread_cond_wait loop exit.
3. Add a manual interruption feature to your thread that knows to set a flag and signal the condition variable. The loop around pthread_cond_wait should then check the flag and exit the thread if the flag is set.
I would recommend (1) or (3). Using pthread_cancel is most generic, but requires careful handling in the thread to ensure there are suitable pthread_cleanup_push calls for cleaning up all resources allocated by the thread, unlocking all mutexes, and so forth. Writing a manual interruption feature is potentially more work, but can be most easily tailored to your application.
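A sketch of the manual interruption feature could look like this. It reuses the mutex_queue and cond_queue names from the question; queued_items, stop_requested, items_processed, enqueue_item and request_stop are placeholders I introduced, since the real queue code is not shown.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t mutex_queue = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond_queue  = PTHREAD_COND_INITIALIZER;
static int  queued_items    = 0;      /* protected by mutex_queue */
static bool stop_requested  = false;  /* protected by mutex_queue */
static int  items_processed = 0;      /* protected by mutex_queue */

static void *thread_recv_fd(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&mutex_queue);
    for (;;) {
        /* Wait on a predicate, not on the bare condition variable: this
         * also copes with spurious wakeups and with signals that were
         * sent before the thread started waiting. */
        while (queued_items == 0 && !stop_requested)
            pthread_cond_wait(&cond_queue, &mutex_queue);
        if (queued_items == 0 && stop_requested)
            break;               /* queue drained and asked to stop */
        queued_items--;          /* stand-in for queue_remove() */
        items_processed++;
    }
    pthread_mutex_unlock(&mutex_queue);
    return NULL;                 /* normal exit, cleanup can go here */
}

/* Sender side: add one item and wake the receiver. */
static void enqueue_item(void)
{
    pthread_mutex_lock(&mutex_queue);
    queued_items++;
    pthread_cond_signal(&cond_queue);
    pthread_mutex_unlock(&mutex_queue);
}

/* Main thread: set the flag under the mutex, then wake every waiter. */
static void request_stop(void)
{
    pthread_mutex_lock(&mutex_queue);
    stop_requested = true;
    pthread_cond_broadcast(&cond_queue);
    pthread_mutex_unlock(&mutex_queue);
}
```

The main thread would call request_stop() just before pthread_join(thread_recv_id, NULL). Since pthread functions are not async-signal-safe, request_stop() should be called from normal thread context, not directly from the Ctrl-C handler.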
using select() with pipe - this is what I am doing and now I need to catch SIGTERM on that. how can I do it? Do I have to do it when select() returns error ( < 0 ) ?
First, SIGTERM will kill your process if not caught, and select() will not return. Thus, you must install a signal handler for SIGTERM. Do that using sigaction().
However, the SIGTERM signal can arrive at a moment where your thread is not blocked at select(). It would be a rare condition, if your process is mostly sleeping on the file descriptors, but it can otherwise happen. This means that either your signal handler must do something to inform the main routine of the interruption, namely, setting some flag variable (of type sig_atomic_t), or you must guarantee that SIGTERM is only delivered when the process is sleeping on select().
I'll go with the latter approach, since it's simpler, albeit less flexible (see end of the post).
So, you unblock SIGTERM just before calling select(), and re-block it right away after the function returns, so that your process only receives the signal while sleeping inside select(). But note that this actually creates a race condition. If the signal arrives just after the unblock, but just before select() is called, the system call will not have been called yet and thus it will not return -1. If the signal arrives just after select() returns successfully, but just before the re-block, you have also lost the signal.
Thus, you must use pselect() for that. It does the blocking/unblocking around select() atomically.
First, block SIGTERM using sigprocmask() before entering the pselect() loop. After that, just call pselect() with the original mask returned by sigprocmask(). This way you guarantee your process will only be interrupted while sleeping in pselect().
In summary:
Install a handler for SIGTERM (that does nothing);
Before entering the pselect() loop, block SIGTERM using sigprocmask();
Call pselect() with the old signal mask returned by sigprocmask();
Inside the pselect() loop, now you can check safely whether pselect() returned -1 and errno is EINTR.
Please note that if, after pselect() returns successfully, you do a lot of work, you may experience bigger latency when responding to SIGTERM (since the process must do all processing and return to pselect() before actually processing the signal). If this is a problem, you must use a flag variable inside the signal handler, so that you can check for this variable in a number of specific points in your code. Using a flag variable does not eliminate the race condition and does not eliminate the need for pselect(), though.
Remember: whenever you need to wait on some file descriptors or for the delivery of a signal, you must use pselect() (or ppoll(), for the systems that support it).
Edit: nothing better than a code example to illustrate the usage.
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/select.h>
#include <unistd.h>
// Signal handler to catch SIGTERM.
void sigterm(int signo) {
(void)signo;
}
int main(void) {
// Install the signal handler for SIGTERM.
struct sigaction s;
s.sa_handler = sigterm;
sigemptyset(&s.sa_mask);
s.sa_flags = 0;
sigaction(SIGTERM, &s, NULL);
// Block SIGTERM.
sigset_t sigset, oldset;
sigemptyset(&sigset);
sigaddset(&sigset, SIGTERM);
sigprocmask(SIG_BLOCK, &sigset, &oldset);
// Enter the pselect() loop, using the original mask as argument.
fd_set set;
FD_ZERO(&set);
FD_SET(0, &set);
while (pselect(1, &set, NULL, NULL, NULL, &oldset) >= 0) {
// Do some processing. Note that the process will not be
// interrupted while inside this loop.
sleep(5);
}
// See why pselect() has failed.
if (errno == EINTR)
puts("Interrupted by SIGTERM.");
else
perror("pselect()");
return EXIT_SUCCESS;
}
The answer is partly in one of the comments in the Q&A you point to:
> Interrupt will cause select() to return a -1 with errno set to EINTR
That is, for any caught interrupt (signal), select will return and errno will be set to EINTR.
Now if you specifically want to catch SIGTERM, then you need to set that up with a call to signal, like this:
signal(SIGTERM,yourcatchfunction);
where your catch function should be defined something like
void yourcatchfunction(int signalNumber) { .... }
So in summary: you have set up a signal handler yourcatchfunction and your program is currently in a select() call waiting for IO. When a signal arrives, yourcatchfunction will be called, and when you return from it the select call will return -1 with errno set to EINTR.
However, be aware that SIGTERM can occur at any time, so you may not be in the select call when it occurs, in which case you will never see the EINTR but only a regular call of yourcatchfunction.
Hence select() returning an error with errno EINTR is just so you can take non-blocking action; it is not what catches the signal.
You can call select() in a loop. This is known as restarting the system call. Here is some pseudo-C.
int retval = -1;
int select_errno = 0;
do {
retval = select(...);
if (retval < 0)
{
/* Cache the value of errno in case a system call is later
* added prior to the loop guard (i.e., the while expression). */
select_errno = errno;
}
/* Other system calls might be added here. These could change the
* value of errno, losing track of the error during the select(),
* again this is the reason we cached the value. (E.g, you might call
* a log method which calls gettimeofday().) */
/* Automatically restart the system call if it was interrupted by
* a signal -- with a while loop. */
} while ((retval < 0) && (select_errno == EINTR));
if (retval < 0) {
/* Handle other errors here. See select man page. */
} else {
/* Successful invocation of select(). */
}
I have to code a multithreaded (say 2 threads) program where each of these threads does a different task. Also, these threads must keep running indefinitely in the background once started. Here is what I have done. Can somebody please give me some feedback on whether the method is good and whether you see any problems? Also, I would like to know how to shut down the threads in a systematic way once I terminate the execution, say with Ctrl+C.
The main function creates two threads and lets them run indefinitely, as below.
Here is the skeleton:
void *func1();
void *func2();
int main(int argc, char *argv[])
{
pthread_t th1,th2;
pthread_create(&th1, NULL, func1, NULL);
pthread_create(&th2, NULL, func2, NULL);
fflush (stdout);
for(;;){
}
exit(0); //never reached
}
void *func1()
{
while(1){
//do something
}
}
void *func2()
{
while(1){
//do something
}
}
Thanks.
Edited code using inputs from the answers:
Am I exiting the threads properly?
#include <stdlib.h> /* exit() */
#include <stdio.h> /* standard in and output*/
#include <pthread.h>
#include <unistd.h>
#include <time.h>
#include <sys/time.h>
#include <sys/types.h>
#include <signal.h>
#include <semaphore.h>
sem_t end;
void *func1();
void *func2();
void ThreadTermHandler(int signo){
if (signo == SIGINT) {
printf("Ctrl+C detected !!! \n");
sem_post(&end);
}
}
void *func1()
{
int value;
for(;;){
sem_getvalue(&end, &value);
while(!value){
printf("in thread 1 \n");
}
}
return 0;
}
void *func2()
{
int value;
for(;;){
sem_getvalue(&end, &value);
while(!value){
printf("value = %d\n", value);
}
}
return 0;
}
int main(int argc, char *argv[])
{
sem_init(&end, 0, 0);
pthread_t th1,th2;
int value = -2;
pthread_create(&th1, NULL, func1, NULL);
pthread_create(&th2, NULL, func2, NULL);
struct sigaction sa;
sigemptyset(&sa.sa_mask);
sa.sa_flags = SA_SIGINFO;
sa.sa_sigaction = ThreadTermHandler;
// Establish a handler to catch CTRL+c and use it for exiting.
if (sigaction(SIGINT, &sa, NULL) == -1) {
perror("sigaction for Thread Termination failed");
exit( EXIT_FAILURE );
}
/* Wait for SIGINT. */
while (sem_wait(&end)!=0){}
//{
printf("Terminating Threads.. \n");
sem_post(&end);
sem_getvalue(&end, &value);
/* SIGINT received, cancel threads. */
pthread_cancel(th1);
pthread_cancel(th2);
/* Join threads. */
pthread_join(th1, NULL);
pthread_join(th2, NULL);
//}
exit(0);
}
There are mainly two approaches for thread termination.
Use a cancellation point. The thread will terminate when requested to cancel and it reaches a cancellation point, thus ending execution in a controlled fashion;
Use a signal. Have the threads install a signal handler which provides a mechanism for termination (setting a flag and reacting to EINTR).
Both approaches have caveats. Refer to Kill Thread in Pthread Library for more details.
In your case, it seems a good opportunity to use cancellation points. I will work with a commented example. The error-checking has been omitted for clarity.
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
void sigint(int signo) {
(void)signo;
}
void *thread(void *argument) {
for (;;) {
// Do something useful.
printf("Thread %u running.\n", *(unsigned int*)argument);
// sleep() is a cancellation point in this example.
sleep(1);
}
return NULL;
}
int main(void) {
// Block the SIGINT signal. The threads will inherit the signal mask.
// This will avoid them catching SIGINT instead of this thread.
sigset_t sigset, oldset;
sigemptyset(&sigset);
sigaddset(&sigset, SIGINT);
pthread_sigmask(SIG_BLOCK, &sigset, &oldset);
// Spawn the two threads.
pthread_t thread1, thread2;
pthread_create(&thread1, NULL, thread, &(unsigned int){1});
pthread_create(&thread2, NULL, thread, &(unsigned int){2});
// Install the signal handler for SIGINT.
struct sigaction s;
s.sa_handler = sigint;
sigemptyset(&s.sa_mask);
s.sa_flags = 0;
sigaction(SIGINT, &s, NULL);
// Restore the old signal mask only for this thread.
pthread_sigmask(SIG_SETMASK, &oldset, NULL);
// Wait for SIGINT to arrive.
pause();
// Cancel both threads.
pthread_cancel(thread1);
pthread_cancel(thread2);
// Join both threads.
pthread_join(thread1, NULL);
pthread_join(thread2, NULL);
// Done.
puts("Terminated.");
return EXIT_SUCCESS;
}
The reason for blocking/unblocking signals is that if you send SIGINT to the process, any thread may catch it. You block the signal before spawning the threads so that they don't have to do it themselves and synchronize with the parent. After the threads are created, you install a handler and restore the mask.
Cancellation points can be tricky if the threads allocate a lot of resources; in that case, you will have to use pthread_cleanup_push() and pthread_cleanup_pop(), which are a mess. But the approach is feasible and rather elegant if used properly.
The answer depends a lot on what you want to do when the user presses Ctrl+C.
If your worker threads are not modifying data that needs to be saved on exit, you don't need to do anything. The default action of SIGINT is to terminate the process, and that includes all threads that make up the process.
If your threads do need to perform cleanup, however, you've got some work to do. There are two separate issues you need to consider:
How you handle the signal and get the message to threads that they need to terminate.
How your threads receive and handle the request to terminate.
First of all, signal handlers are a pain. Unless you're very careful, you have to assume most library functions are not legal to call from a signal handler. Fortunately, sem_post is specified to be async-signal-safe, and can meet your requirements perfectly:
At the beginning of your program, initialize a semaphore with sem_init(&exit_sem, 0, 0);
Install a signal handler for SIGINT (and any other termination signals you want to handle, like SIGTERM) that performs sem_post(&exit_sem); and returns.
Replace the for(;;); in the main thread with while (sem_wait(&exit_sem)!=0).
After sem_wait succeeds, the main thread should inform all other threads that they should exit, then wait for them all to exit.
The above can also be accomplished without semaphores using signal masks and sigwaitinfo, but I prefer the semaphore approach because it doesn't require you to learn lots of complicated signal semantics.
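The semaphore steps could be sketched roughly as follows. exit_sem is the name used above; the function names are invented for this example, and error handling is kept minimal.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <semaphore.h>
#include <signal.h>
#include <string.h>

static sem_t exit_sem;

/* Handler: sem_post() is async-signal-safe. errno is saved and restored
 * so the interrupted code does not see it change. */
static void on_exit_signal(int signo)
{
    int saved_errno = errno;

    (void)signo;
    sem_post(&exit_sem);
    errno = saved_errno;
}

/* Call once at program start, before installing any handler. */
static int exit_sem_init(void)
{
    return sem_init(&exit_sem, 0, 0);
}

/* Route signo (SIGINT, SIGTERM, ...) to on_exit_signal(). */
static int install_exit_handler(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_exit_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Replacement for the busy loop in main(): sleeps until a handled signal
 * has posted the semaphore. Returns 0 on success, -1 on a real error. */
static int wait_for_exit_request(void)
{
    while (sem_wait(&exit_sem) != 0) {
        if (errno != EINTR)
            return -1;
    }
    return 0;
}
```

In main() this would be exit_sem_init(), install_exit_handler(SIGINT), then thread creation, then wait_for_exit_request(), followed by whatever mechanism tells the workers to quit and the pthread_join calls.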
Now, there are several ways you could handle informing the worker threads that it's time to quit. Some options I see:
1. Having them call sem_getvalue(&exit_sem, &value) periodically, and clean up and exit if the stored value is nonzero. Note however that this will not work if the thread is blocked indefinitely, for example in a call to read or write.
2. Use pthread_cancel, and carefully place cancellation handlers (pthread_cleanup_push) all over the place.
3. Use pthread_cancel, but also use pthread_setcancelstate to disable cancellation during most of your code, and only re-enable it when you're going to perform blocking IO operations. This way you need only put the cleanup handlers just in the places where cancellation is enabled.
4. Learn advanced signal semantics, and set up an additional signal with an interrupting signal handler, which you send to all threads via pthread_kill; this causes blocking syscalls to return with an EINTR error. Then your threads can act on this and exit the normal C way via a string of failure returns all the way back up to the start function.
I would not recommend approach 4 for beginners, because it's hard to get right, but for advanced C programmers it may be the best because it allows you to use the existing C idiom of reporting exceptional conditions via return values rather than "exceptions".
Also note that with pthread_cancel, you will need to periodically call pthread_testcancel if you are not calling any other functions which are cancellation points. Otherwise the cancellation request will never be acted upon.
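To illustrate the variant that keeps cancellation disabled except around blocking calls, here is a small sketch. The worker and cleanup_handler functions and the cleanup_ran marker are hypothetical, and sleep(1) merely stands in for a blocking read() or accept().

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <unistd.h>

static int cleanup_ran = 0;  /* hypothetical marker, read after pthread_join() */

static void cleanup_handler(void *arg)
{
    (void)arg;
    cleanup_ran = 1;  /* a real handler would free buffers, close fds, ... */
}

static void *worker(void *arg)
{
    int oldstate;

    (void)arg;
    /* Cancellation stays disabled for ordinary work, so that code needs
     * no cleanup handlers. A cancel request sent meanwhile stays pending. */
    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &oldstate);
    for (;;) {
        /* ... non-cancellable processing would go here ... */

        /* Only the blocking call is cancellable, and only that window is
         * covered by a cleanup handler. push/pop must pair up in the
         * same lexical scope. */
        pthread_cleanup_push(cleanup_handler, NULL);
        pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, &oldstate);
        sleep(1);  /* cancellation point; stand-in for blocking I/O */
        pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &oldstate);
        pthread_cleanup_pop(0);  /* 0: do not run the handler on the normal path */
    }
    return NULL;
}
```

The main thread calls pthread_cancel() and then pthread_join(); the join result is PTHREAD_CANCELED once the worker has been cancelled inside the enabled window.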
This is a bad idea:
for(;;){
}
because your main thread will execute unnecessary CPU instructions.
If you need to wait in the main thread, use pthread_join as answered in this question: Multiple threads in C program
What you have done works, I see no obvious problems with it (except that you are ignoring the return value of pthread_create). Unfortunately, stopping threads is more involved than you might think. The fact that you want to use signals is another complication. Here's what you could do.
In the "children" threads, use pthread_sigmask to block signals
In the main thread, use sigsuspend to wait for a signal
Once you receive the signal, cancel (pthread_cancel) the children threads
Your main thread could look something like this:
/* Wait for SIGINT. */
sigsuspend(&mask);
/* SIGINT received, cancel threads. */
pthread_cancel(th1);
pthread_cancel(th2);
/* Join threads. */
pthread_join(th1, NULL);
pthread_join(th2, NULL);
Obviously, you should read more about pthread_cancel and cancellation points. You could also install a cleanup handler. And of course, check every return value.
Looked at your updated code and it still does not look right.
Signal handling must be done in only one thread. Signals targeted at a process (such as SIGINT) get delivered to any thread that does not have that signal blocked. In other words, there is no guarantee that, given the three threads you have, it is going to be the main thread that receives SIGINT. In multi-threaded programs the best practice is to block all signals before creating any threads, and once all threads have been created, unblock the signals in the main thread only (normally it is the main thread that is in the best position to handle signals). See Signal Concepts and Signalling in a Multi-Threaded Process for more.
pthread_cancel is best avoided; there is no reason to ever use it. To stop the threads you should somehow communicate to them that they should terminate and wait till they have terminated voluntarily. Normally, the threads will have some sort of event loop, so it should be relatively straightforward to send the other thread an event.
Wouldn't it be much easier to just call pthread_cancel and use pthread_cleanup_push in the thread function to clean up any data that was dynamically allocated by the thread, or do any termination tasks that are required before the thread stops?
So the idea would be:
write the code to handle signals
when you do ctrl+c ... the handling function is called
this function cancels the thread
each thread which was created set a thread cleanup function using pthread_cleanup_push
when the thread is cancelled the pthread_cleanup_push's function is called
join all threads before exiting
It seems like a simple and natural solution.
static void cleanup_handler(void *arg)
{
printf("Called clean-up handler\n");
}
static void *threadFunc(void *data)
{
ThreadData *td = (ThreadData*)(data);
pthread_cleanup_push(cleanup_handler, (void*)something);
while (1) {
pthread_testcancel(); /* A cancellation point */
...
}
pthread_cleanup_pop(cleanup_pop_arg);
return NULL;
}
You don't need the for loop in main. A pthread_join(th1, NULL); pthread_join(th2, NULL); will suffice as a wait condition since the threads never end.
To stop the threads you could use a global shared variable like bool stop = false;, then when catching the signal (Ctrl+C generates the SIGINT signal in UNIX), set stop = true so that the threads leave their loops. Since you are waiting in pthread_join(), the main program will also exit.
example
void *func1(){
while(!stop){
//do something
}
}
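A slightly fuller sketch of the shared-flag idea, under the assumption that a C11 compiler is available. I declare the flag as atomic_bool here because a plain bool written by one thread and read by another is a data race. The iterations counter, short_nap and request_stop are invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

static atomic_bool stop = false;   /* set once, read by every worker */
static atomic_int iterations = 0;  /* hypothetical counter for the demo */

/* Sleep for about one millisecond. */
static void short_nap(void)
{
    struct timespec ts = { .tv_sec = 0, .tv_nsec = 1000000 };

    nanosleep(&ts, NULL);
}

static void *func1(void *arg)
{
    (void)arg;
    while (!atomic_load(&stop)) {
        atomic_fetch_add(&iterations, 1);  /* do something */
        short_nap();
    }
    return NULL;  /* normal return: per-thread cleanup can go right here */
}

/* Ask all workers to leave their loops. */
static void request_stop(void)
{
    atomic_store(&stop, true);
}
```

As far as I know, a lock-free atomic store is also permitted inside a signal handler, so request_stop() could be called from the SIGINT handler on platforms where atomic_bool is lock-free. The flag only helps while the worker keeps returning to the loop condition; a thread stuck in a blocking call will not notice it.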
I discovered an issue with a thread implementation that is strange to me. It would be great if some of you could explain it.
I am working on something like a proxy: a program (running on different machines) that receives packets over eth0 and sends them through ath0 (wireless) to another machine which is doing exactly the same thing. Actually I am not at all sure what is causing my problem, because I am new to all of this, Linux and C programming.
I start two threads,
one is listening (socket) on eth0 for incoming packets and sends it out through ath0 (also socket)
and the other thread is listening on ath0 and sends through eth0.
If I use threads, I get an error like this:
sh-2.05b# ./socketex
Failed to send network header packet.
: Interrupted system call
If I use fork(), the program works as expected.
Can someone explain that behaviour to me?
Just to show the sender implementation here comes its code snippet:
while(keep_going) {
memset(&buffer[0], '\0', sizeof(buffer));
recvlen = recvfrom(sockfd_in, buffer, BUFLEN, 0, (struct sockaddr *) &incoming, &ilen);
if(recvlen < 0) {
perror("something went wrong / incoming\n");
exit(-1);
}
strcpy(msg, buffer);
buflen = strlen(msg);
sentlen = ath_sendto(sfd, &btpinfo, &addrnwh, &nwh, buflen, msg, &selpv2, &depv);
if(sentlen == E_ERR) {
perror("Failed to send network header packet.\n");
exit(-1);
}
}
UPDATE: my main file, starting either threads or processes (fork)
int main(void) {
port_config pConfig;
memset(&pConfig, 0, sizeof(pConfig));
pConfig.inPort = 2002;
pConfig.outPort = 2003;
pid_t retval = fork();
if(retval == 0) {
// child process
pc2wsuThread((void *) &pConfig);
} else if (retval < 0) {
perror("fork not successful\n");
} else {
// parent process
wsu2pcThread((void *) &pConfig);
}
/*
wint8 rc1, rc2 = 0;
pthread_t pc2wsu;
pthread_t wsu2pc;
rc1 = pthread_create(&pc2wsu, NULL, pc2wsuThread, (void *) &pConfig);
rc2 = pthread_create(&wsu2pc, NULL, wsu2pcThread, (void *) &pConfig);
if(rc1) {
printf("error: pthread_create() is %d\n", rc1);
return(-1);
}
if(rc2) {
printf("error: pthread_create() is %d\n", rc2);
return(-1);
}
pthread_join(pc2wsu, NULL);
pthread_join(wsu2pc, NULL);
*/
return 0;
}
Does it help?
update 05/30/2011
-sh-2.05b# ./wsuproxy 192.168.1.100
mgmtsrvc
mgmtsrvc
Failed to send network header packet.
: Interrupted system call
13.254158,75.165482,DATAAAAAAmgmtsrvc
mgmtsrvc
mgmtsrvc
Still get the interrupted system call, as you can see above.
I blocked all signals as follows:
sigset_t signal_mask;
sigfillset(&signal_mask);
sigprocmask(SIG_BLOCK, &signal_mask, NULL);
The two threads are working on the same interfaces, but on different ports. The problem seems to appear still in the same place (please find it in the first code snippet). I can't go further and have not enough knowledge of how to solve that problem. Maybe some of you can help me here again.
Thanks in advance.
EINTR does not itself indicate an error. It means that your process received a signal while it was in the sendto syscall, and that syscall hadn't sent any data yet (that's important).
You could retry the send in this case, but a good thing would be to figure out what signal caused the interruption. If this is reproducible, try using strace.
If you're the one sending the signal, well, you know what to do :-)
Note that on Linux, you can receive EINTR on sendto (and some other functions) even if you haven't installed a handler yourself. This can happen if:
the process is stopped (via SIGSTOP for example) and restarted (with SIGCONT)
you have set a send timeout on the socket (via SO_SNDTIMEO)
See the signal(7) man page (at the very bottom) for more details.
So if you're "suspending" your service (or something else is), that EINTR is expected and you should restart the call.
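Restarting the call can be sketched as below. This is a minimal illustration that wraps the plain POSIX sendto(); the name sendto_retry is hypothetical, and the ath_sendto() in the question is the asker's own function, so the same loop would have to be adapted to its signature and error convention.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper (not part of any library): call sendto() again
 * for as long as it fails with EINTR. Any other error, or a successful
 * send, is returned to the caller unchanged. */
ssize_t sendto_retry(int fd, const void *buf, size_t len, int flags,
                     const struct sockaddr *dest, socklen_t destlen)
{
    ssize_t n;

    do {
        n = sendto(fd, buf, len, flags, dest, destlen);
    } while (n < 0 && errno == EINTR);

    return n;
}
```

A retry loop like this only hides the symptom. If the signal arrives frequently, it is still worth finding out which signal it is and where it comes from.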
Keep in mind, if you are using threads with signals, that a signal delivered to the process can be handled by any thread whose signal mask is not blocking it. That means if you have blocked incoming signals in one thread but not in another, the thread that has not blocked the signal will receive it, and if no signal handler is set up for that signal, you will end up with the signal's default behavior for the entire process (i.e., all the threads, both those blocking the signal and those not blocking it). For instance, if the default behavior of a signal is to terminate the process, one thread receiving that signal and triggering its default behavior will terminate the entire process, for all the threads, even though some threads may have been masking the signal. Also, if two threads are not blocking a signal, it is not deterministic which of them will handle it. Therefore mixing signals and threads is typically not a good idea, but there are exceptions to the rule.
One thing you can try, since a spawned thread inherits its signal mask from the thread that creates it, is to dedicate a "daemon" thread to handling signals: at the start of your program, block all incoming signals (or at least all non-important signals), and then spawn your threads. Those spawned threads will then have every signal in the parent thread's mask blocked, so those signals will not be delivered to them. If you need to handle some specific signals, you can still make them part of the blocked signal mask in the main thread before spawning your threads. But leave one thread (it could even be the main thread, after it has spawned all the worker threads) as a "daemon" thread that waits for those specific (and now blocked) signals using sigwait(). That thread then dispatches whatever functions are necessary when a given signal is received by the process. This keeps signals from interrupting system calls in your worker threads while still allowing you to handle them.
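The inheritance of the signal mask can be checked with a small sketch like the one below. The function names check_mask and child_inherits_block are made up for this illustration, and SIGUSR1 is just an arbitrary example signal; pthread_sigmask(), pthread_create() and sigismember() are the standard POSIX calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <stddef.h>

/* Thread body: record whether SIGUSR1 is blocked in this thread's mask. */
static void *check_mask(void *arg)
{
    int *blocked = arg;
    sigset_t cur;

    /* With a NULL 'set' argument the call only queries the current mask. */
    pthread_sigmask(SIG_BLOCK, NULL, &cur);
    *blocked = sigismember(&cur, SIGUSR1);
    return NULL;
}

/* Hypothetical demo: block SIGUSR1 in the calling thread, spawn a thread,
 * and return 1 if the new thread started with SIGUSR1 blocked as well
 * (0 if not, -1 if the thread could not be created). */
int child_inherits_block(void)
{
    sigset_t set, old;
    pthread_t tid;
    int blocked = -1;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    pthread_sigmask(SIG_BLOCK, &set, &old);

    if (pthread_create(&tid, NULL, check_mask, &blocked) != 0) {
        pthread_sigmask(SIG_SETMASK, &old, NULL);
        return -1;
    }
    pthread_join(tid, NULL);

    /* Restore the caller's original mask. */
    pthread_sigmask(SIG_SETMASK, &old, NULL);
    return blocked;
}
```

Because the mask is copied at pthread_create() time, the blocking has to happen before the worker threads are spawned; blocking afterwards only affects the calling thread.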
The reason your forked version may not be having issues is that a signal arriving at the parent process is not propagated to its child processes. So I would try, if you can, to find out which signal is interrupting your system call and, in your threaded version, block that signal; if you need to handle it, create a daemon thread that handles that signal's arrival, with the rest of the threads blocking it.
Finally, if you don't have access to any external libraries or debuggers, etc. to see what signals are arriving, you can set up a simple procedure for seeing what signals might be arriving. You can try this code:
#include <signal.h>
#include <stdio.h>
int main()
{
//block all incoming signals
sigset_t signal_mask;
sigfillset(&signal_mask);
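//use pthread_sigmask(): the behavior of sigprocmask() is unspecified in a multithreaded process
pthread_sigmask(SIG_BLOCK, &signal_mask, NULL);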
//... spawn your threads here ...
//... now wait for signals to arrive and see what comes in ...
int arrived_signal;
while(1) //you can change this condition to whatever to exit the loop
{
sigwait(&signal_mask, &arrived_signal);
switch(arrived_signal)
{
case SIGABRT: fprintf(stderr, "SIGABRT signal arrived\n"); break;
case SIGALRM: fprintf(stderr, "SIGALRM signal arrived\n"); break;
//continue for the rest of the signals defined in signal.h ...
default: fprintf(stderr, "Unrecognized signal arrived\n");
}
}
//clean-up your threads and anything else needing clean-up
return 0;
}
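As an alternative to listing every signal in the switch, the waiting step could use strsignal() to print a description of whatever arrives. The sketch below is only an illustration: wait_and_report is a hypothetical helper name, and it assumes the signals in the set have already been blocked in every thread, as in the program above.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: wait for one signal from 'set' (which the caller
 * must already have blocked) and print its number and description.
 * Returns the signal number, or -1 if sigwait() fails. */
int wait_and_report(const sigset_t *set)
{
    int sig = 0;
    int rc = sigwait(set, &sig);   /* returns an error number, not -1/errno */

    if (rc != 0) {
        fprintf(stderr, "sigwait failed: %s\n", strerror(rc));
        return -1;
    }
    fprintf(stderr, "signal %d (%s) arrived\n", sig, strsignal(sig));
    return sig;
}
```

Calling this in the while loop in place of the sigwait() and switch would log each arriving signal without a case per signal name.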