I ran into a problem with my threaded implementation that seems strange to me. It would be great if someone could explain it.
I am working on something like a proxy: a program (running on different machines) that receives packets over eth0 and sends them through ath0 (wireless) to another machine, which is doing exactly the same thing. I am not at all sure what is causing my problem, because I am new to all of this, both Linux and C programming.
I start two threads:
one listens (on a socket) on eth0 for incoming packets and sends them out through ath0 (also a socket),
and the other listens on ath0 and sends through eth0.
If I use threads, I get an error like this:
sh-2.05b# ./socketex
Failed to send network header packet.
: Interrupted system call
If I use fork(), the program works as expected.
Can someone explain that behaviour to me?
Just to show the sender implementation here comes its code snippet:
while(keep_going) {
    memset(&buffer[0], '\0', sizeof(buffer));
    recvlen = recvfrom(sockfd_in, buffer, BUFLEN, 0, (struct sockaddr *) &incoming, &ilen);
    if(recvlen < 0) {
        perror("something went wrong / incoming\n");
        exit(-1);
    }
    strcpy(msg, buffer);
    buflen = strlen(msg);
    sentlen = ath_sendto(sfd, &btpinfo, &addrnwh, &nwh, buflen, msg, &selpv2, &depv);
    if(sentlen == E_ERR) {
        perror("Failed to send network header packet.\n");
        exit(-1);
    }
}
UPDATE: my main file, starting either threads or processes (fork)
int main(void) {
port_config pConfig;
memset(&pConfig, 0, sizeof(pConfig));
pConfig.inPort = 2002;
pConfig.outPort = 2003;
pid_t retval = fork();
if(retval == 0) {
// child process
pc2wsuThread((void *) &pConfig);
} else if (retval < 0) {
perror("fork not successful\n");
} else {
// parent process
wsu2pcThread((void *) &pConfig);
}
/*
wint8 rc1, rc2 = 0;
pthread_t pc2wsu;
pthread_t wsu2pc;
rc1 = pthread_create(&pc2wsu, NULL, pc2wsuThread, (void *) &pConfig);
rc2 = pthread_create(&wsu2pc, NULL, wsu2pcThread, (void *) &pConfig);
if(rc1) {
printf("error: pthread_create() is %d\n", rc1);
return(-1);
}
if(rc2) {
printf("error: pthread_create() is %d\n", rc2);
return(-1);
}
pthread_join(pc2wsu, NULL);
pthread_join(wsu2pc, NULL);
*/
return 0;
}
Does it help?
update 05/30/2011
-sh-2.05b# ./wsuproxy 192.168.1.100
mgmtsrvc
mgmtsrvc
Failed to send network header packet.
: Interrupted system call
13.254158,75.165482,DATAAAAAAmgmtsrvc
mgmtsrvc
mgmtsrvc
I still get the interrupted system call, as you can see above.
I blocked all signals as follows:
sigset_t signal_mask;
sigfillset(&signal_mask);
sigprocmask(SIG_BLOCK, &signal_mask, NULL);
The two threads are working on the same interfaces, but on different ports. The problem still seems to appear in the same place (see the first code snippet). I am stuck and do not know enough to solve the problem. Maybe some of you can help me here again.
Thanks in advance.
EINTR does not itself indicate an error. It means that your process received a signal while it was in the sendto syscall, and that syscall hadn't sent any data yet (that's important).
You could retry the send in this case, but a good thing would be to figure out what signal caused the interruption. If this is reproducible, try using strace.
If you're the one sending the signal, well, you know what to do :-)
Note that on Linux, you can receive EINTR on sendto (and some other functions) even if you haven't installed a handler yourself. This can happen if:
the process is stopped (via SIGSTOP for example) and restarted (with SIGCONT)
you have set a send timeout on the socket (via SO_SNDTIMEO)
See the signal(7) man page (at the very bottom) for more details.
So if you're "suspending" your service (or something else is), that EINTR is expected and you should restart the call.
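To illustrate the "restart the call" advice, here is a minimal sketch of a retry wrapper. The name sendto_retry is made up for this example, and since the question's ath_sendto is not shown, I am only assuming that it ends up in an ordinary sendto call.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper (not from the question): call sendto() again
 * whenever it fails with EINTR, i.e. a signal arrived before any data
 * was transferred. Every other error is passed back to the caller. */
ssize_t sendto_retry(int fd, const void *buf, size_t len, int flags,
                     const struct sockaddr *dest, socklen_t destlen)
{
    ssize_t n;

    do {
        n = sendto(fd, buf, len, flags, dest, destlen);
    } while (n < 0 && errno == EINTR);

    return n;
}
```

Retrying only hides the symptom, so it is still worth finding out which signal is arriving (for example with strace, as mentioned above).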
Keep in mind, if you are using threads with signals, that a given signal, when delivered to the process, could be delivered to any thread whose signal mask is not blocking the signal. That means if you have blocked incoming signals in one thread, and not in another, the non-blocking thread will receive the signal, and if there is no signal handler set up for the signal, you will end up with the default behavior of that signal for the entire process (i.e., all the threads, both signal-blocking threads and non-signal-blocking threads). For instance, if the default behavior of a signal is to terminate the process, one thread receiving that signal and executing its default behavior will terminate the entire process, for all the threads, even though some threads may have been masking the signal. Also, if you have two threads that are not blocking a signal, it is not deterministic which thread will handle the signal. Therefore mixing signals and threads is typically not a good idea, but there are exceptions to the rule.
One thing you can try, since the signal mask of a spawned thread is inherited from the thread that creates it, is to dedicate a daemon thread to handling signals: at the start of your program, block all incoming signals (or at least all non-important signals), and then spawn your threads. The spawned threads will then not be interrupted by any signal in the mask inherited from the parent thread. If you need to handle some specific signals, you can still make those signals part of the blocked signal mask of the main thread, and then spawn your threads. But when you're spawning the threads, leave one thread (it could even be the main thread, after it has spawned all the worker threads) as a "daemon" thread that waits for those specific incoming (and now blocked) signals using sigwait(). That thread then dispatches whatever functions are necessary when a given signal is received by the process. This keeps signals from interrupting system calls in your worker threads, yet still allows you to handle signals.
The reason your forked version may not be having issues is that a signal arriving at the parent process is not propagated to any child processes. So I would try, if you can, to find out which signal is interrupting your system call, block that signal in your threaded version, and, if you need to handle it, create a daemon thread that handles that signal's arrival while the rest of the threads block it.
Finally, if you don't have access to any external libraries, debuggers, etc. to see what signals are arriving, you can set up a simple procedure to find out. You can try this code:
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
int main(void)
{
    //block all incoming signals (SIGKILL and SIGSTOP cannot be blocked)
    sigset_t signal_mask;
    sigfillset(&signal_mask);
    //pthread_sigmask is the thread-aware counterpart of sigprocmask;
    //threads created after this call inherit the mask
    pthread_sigmask(SIG_BLOCK, &signal_mask, NULL);
    //... spawn your threads here ...
    //... now wait for signals to arrive and see what comes in ...
    int arrived_signal;
    while(1) //you can change this condition to whatever to exit the loop
    {
        if(sigwait(&signal_mask, &arrived_signal) != 0)
            break; //sigwait failed, stop waiting
        switch(arrived_signal)
        {
            case SIGABRT: fprintf(stderr, "SIGABRT signal arrived\n"); break;
            case SIGALRM: fprintf(stderr, "SIGALRM signal arrived\n"); break;
            //continue for the rest of the signals defined in signal.h ...
            default: fprintf(stderr, "Unrecognized signal %d arrived\n", arrived_signal);
        }
    }
    //clean-up your threads and anything else needing clean-up
    return 0;
}
Related
I'm writing a multithreaded server program in C that works with AF_UNIX sockets.
The basic structure of the server is:
The main thread initializes the data structures and spawns a pool of "worker" threads.
Worker threads start waiting for new requests on an empty thread-safe queue.
The main thread listens on various sockets (the connection socket and already connected clients) with a select() call.
When select() reports a possible read on the connection socket, the main thread calls accept() and puts the returned file descriptor in the fd_set (read set).
When select() reports a possible read on already connected sockets, the main thread removes the ready file descriptors from the fd_set (read set) and puts them in the thread-safe queue.
A worker thread extracts a file descriptor from the queue and starts communicating with the corresponding client to serve the request. When it is done, the worker thread puts the socket file descriptor back into the fd_set (I wrote a function to make this operation thread-safe) and goes back to waiting on the queue for a new request.
This routine is repeated in an infinite loop until a SIGINT is raised.
Another function has to be performed on SIGUSR1 without exiting from the loop.
My doubt is about this part, because if I raise a SIGINT my program exits with EINTR (Interrupted system call).
I know about the pselect() call and the "self-pipe" trick, but I can't figure out how to make things work in a multithreaded situation.
I'm looking for a (POSIX-compatible) way of managing signals that prevents the EINTR error while the main thread is waiting on pselect().
I post some pieces of code for clarification:
Here I set up the signal handlers (ignore the errorConsolePrint function):
if(signal(SIGINT, &on_SIGINT) == SIG_ERR)
{
errorConsolePrint("File: %s; Line: %d; ", "Setting SIGINT handler", __FILE__, __LINE__);
exit(EXIT_FAILURE);
}
if(signal(SIGTERM, &on_SIGINT) == SIG_ERR)
{
errorConsolePrint("File: %s; Line: %d; ", "Setting SIGINT handler", __FILE__, __LINE__);
exit(EXIT_FAILURE);
}
if(signal(SIGUSR1, &on_SIGUSR1) == SIG_ERR)
{
errorConsolePrint("File: %s; Line: %d; ", "Setting to SIGUSR1 handler", __FILE__, __LINE__);
exit(EXIT_FAILURE);
}
if(signal(SIGPIPE, SIG_IGN) == SIG_ERR)
{
errorConsolePrint("File: %s; Line: %d; ", "Setting to ignore SIGPIPE", __FILE__, __LINE__);
exit(EXIT_FAILURE);
}
Here I set up the signal mask for pselect:
sigemptyset(&mask);
sigemptyset(&saveMask);
sigaddset(&mask, SIGINT);
sigaddset(&mask, SIGUSR1);
sigaddset(&mask, SIGPIPE);
Here I call pselect:
test = saveSet(masterSet, &backUpSet, &saveMaxFd);
CHECK_MINUS1(test, "Server: creating master set's backup ");
test = pselect(saveMaxFd+1, &backUpSet, NULL, NULL, &waiting, &mask);
if(test == -1 && errno != EINTR)
{
...error handling...
continue;
}
Hoping for some help!
Thank you all in advance.
What you should probably do is dedicate a thread to signal handling. Here's a sketch:
In main, before spawning any threads, block all signals (using pthread_sigmask) except for SIGILL, SIGABRT, SIGFPE, SIGSEGV, and SIGBUS.
Then, spawn your signal handler thread. This thread loops calling sigwaitinfo for the signals you care about. It takes whatever action is appropriate for each; this could include sending a message to the main thread to trigger a clean shutdown (SIGINT), queuing the "another function" to be processed in the worker pool (SIGUSR1), etc. You do not install handlers for these signals.
Then you spawn your thread pool, which doesn't have to care about signals at all.
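A rough sketch of that layout follows. It is only an illustration under some assumptions: the names start_signal_thread, stop_requested and usr1_count are invented here, the "actions" are reduced to setting atomic counters, and I use sigwait rather than sigwaitinfo to keep it short.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>

static sigset_t handled;            /* signals the signal thread waits for */
static atomic_int stop_requested;   /* set to 1 on SIGINT or SIGTERM */
static atomic_int usr1_count;       /* number of SIGUSR1 signals seen */

/* Signal thread: receives signals synchronously, so no handler is installed. */
static void *signal_thread(void *arg)
{
    (void)arg;
    for (;;) {
        int sig;
        if (sigwait(&handled, &sig) != 0)
            return NULL;                      /* sigwait itself failed */
        if (sig == SIGUSR1) {
            atomic_fetch_add(&usr1_count, 1); /* e.g. queue the extra job */
        } else {                              /* SIGINT or SIGTERM */
            atomic_store(&stop_requested, 1); /* ask for a clean shutdown */
            return NULL;
        }
    }
}

/* Call from main before creating any other thread, so that every later
 * thread inherits the blocked mask. Returns 0 or a pthread error number. */
int start_signal_thread(pthread_t *tid)
{
    sigset_t blocked;
    int rc;

    /* block everything except the synchronous fault signals */
    sigfillset(&blocked);
    sigdelset(&blocked, SIGILL);
    sigdelset(&blocked, SIGABRT);
    sigdelset(&blocked, SIGFPE);
    sigdelset(&blocked, SIGSEGV);
    sigdelset(&blocked, SIGBUS);
    rc = pthread_sigmask(SIG_BLOCK, &blocked, NULL);
    if (rc != 0)
        return rc;

    sigemptyset(&handled);
    sigaddset(&handled, SIGINT);
    sigaddset(&handled, SIGTERM);
    sigaddset(&handled, SIGUSR1);
    return pthread_create(tid, NULL, signal_thread, NULL);
}
```

The main loop would then poll stop_requested (or be woken through a pipe) instead of relying on EINTR from its system calls.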
I would suggest the following strategy:
During initialization, set up your signal handlers, as you do.
During initialization, block all (blockable) signals. See for example "Is it possible to ignore all signals?".
Use pselect in your main thread to unblock the signals for the duration of the call, again as you do.
This has the advantage that all of your system calls, including all those in all your worker threads, will never return EINTR, except for the single pselect in the main thread. See for example the answers to "Am I over-engineering per-thread signal blocking?" and "pselect does not return on signal when called from a separate thread but works fine in single thread program".
This strategy would also work with select: just unblock the signals in your main thread immediately before calling select, and re-block them afterwards. You only really need pselect to prevent hanging if your select timeout is long or infinite, and if your file descriptors are mostly inactive. (I've never used pselect myself, having worked mostly with older Unixes which did not have it.)
I am presuming that your signal handlers are suitable: for example, they just atomically set a global variable.
BTW, in your sample code, do you need sigaddset(&mask, SIGPIPE), as SIGPIPE is already ignored?
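The block-then-pselect strategy could look roughly like this. It is a sketch under assumptions: block_sigusr1 and read_or_signal are names invented for the example, only SIGUSR1 is covered, and sigprocmask is used for brevity (in a threaded server you would call it before creating any thread, or use pthread_sigmask).

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/select.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

static volatile sig_atomic_t pendingSIGUSR1 = 0;

static void on_SIGUSR1(int signum)
{
    (void)signum;
    pendingSIGUSR1 = 1;          /* just set a flag, nothing else */
}

/* Install the handler and block SIGUSR1. On success *wait_mask holds the
 * mask to hand to pselect(): the previous mask with SIGUSR1 unblocked. */
int block_sigusr1(sigset_t *wait_mask)
{
    struct sigaction sa;
    sigset_t block;

    sa.sa_handler = on_SIGUSR1;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) == -1)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, wait_mask) == -1)
        return -1;
    sigdelset(wait_mask, SIGUSR1);
    return 0;
}

/* Wait for data on fd; SIGUSR1 can only be delivered inside pselect().
 * Returns the number of bytes read (0 means end of file), or -1 with errno
 * set: EINTR if a signal was handled, ETIMEDOUT if the timeout expired. */
ssize_t read_or_signal(int fd, void *buf, size_t len,
                       const sigset_t *wait_mask,
                       const struct timespec *timeout)
{
    fd_set rfds;
    int ready;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    ready = pselect(fd + 1, &rfds, NULL, NULL, timeout, wait_mask);
    if (ready == 0) {
        errno = ETIMEDOUT;
        return -1;
    }
    if (ready < 0)
        return -1;
    return read(fd, buf, len);
}
```

Because the signal stays blocked everywhere else, a SIGUSR1 that arrives between two calls simply remains pending and interrupts the next pselect, which closes the race that plain select has.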
Ok, finally I got a solution.
The heart of my problem was about the multithreading nature of my server.
After a long search I found out that when signals are raised by another process (asynchronously), it doesn't matter which thread catches the signal, because the behaviour remains the same: the signal is caught and the previously registered handler is executed.
Maybe this could be obvious for others but this was driving me crazy because I did not know how to interpret errors that came out during execution.
After that I found and solved another problem, which concerns the obsolete signal() call.
During execution, the first time I raise SIGUSR1 the program catches and handles it as expected, but the second time it exits with "User defined signal 1".
I figured out that the signal() call sets a "one-time" handler for a specific signal: after the signal has been handled once, its disposition returns to the default (this is the System V behaviour of signal(); it varies between systems).
So here's what I did:
Here the signal handlers:
N.B.: I reset handler for SIGUSR1 inside the handler itself
static void on_SIGINT(int signum)
{
if(signum == SIGINT || signum == SIGTERM)
serverStop = TRUE;
}
static void on_SIGUSR1(int signum)
{
if(signum == SIGUSR1)
pendingSIGUSR1 = TRUE;
if(signal(SIGUSR1, &on_SIGUSR1) == SIG_ERR)
exit(EXIT_FAILURE);
}
Here I set handlers during server's initialization:
if(signal(SIGINT, &on_SIGINT) == SIG_ERR)
exit(EXIT_FAILURE);
if(signal(SIGTERM, &on_SIGINT) == SIG_ERR)
exit(EXIT_FAILURE);
if(signal(SIGUSR1, &on_SIGUSR1) == SIG_ERR)
exit(EXIT_FAILURE);
if(signal(SIGPIPE, SIG_IGN) == SIG_ERR)
exit(EXIT_FAILURE);
And here the server's listening cycle:
while(!serverStop)
{
if (pendingSIGUSR1)
{
... things i have to do on SIGUSR1...
pendingSIGUSR1 = FALSE;
}
test = saveSet(masterSet, &backUpSet, &saveMaxFd);
CHECK_MINUS1(test, "Server: creating master set's backup ");
test = select(saveMaxFd+1, &backUpSet, NULL, NULL, &waiting);
if((test == -1 && errno == EINTR) || test == 0)
continue;
if (test == -1 && errno != EINTR)
{
perror("Server: Monitoring sockets: ");
exit(EXIT_FAILURE);
}
for(int sock=3; sock <= saveMaxFd; sock++)
{
if (FD_ISSET(sock, &backUpSet))
{
if(sock == ConnectionSocket)
{
ClientSocket = accept(ConnectionSocket, NULL, 0);
CHECK_MINUS1(ClientSocket, "Server: Accepting connection");
test = INset(masterSet, ClientSocket);
CHECK_MINUS1(test, "Server: Inserting new connection in master set: ");
}
else
{
test = OUTset(masterSet, sock);
CHECK_MINUS1(test, "Server: Removing file descriptor from select ");
test = insertRequest(chain, sock);
CHECK_MINUS1(test, "Server: Inserting request in chain");
}
}
}
}
Read signal(7) and signal-safety(7) first; you might want to use the Linux-specific signalfd(2), since it fits nicely (for SIGTERM, SIGQUIT and SIGINT) into event loops around poll(2) or the old select(2) (or the newer pselect or ppoll).
See also this answer (and the pipe(7) to self trick mentioned there, which is POSIX-compatible) to a very similar question.
Also, signal(2) documents:
The effects of signal() in a multithreaded process are unspecified.
so you really should use sigaction(2) (which is POSIX).
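For illustration, the signal() calls from the self-answer could be replaced along these lines. This is a minimal sketch: install_handlers is a name I made up, and only SIGUSR1 and SIGPIPE are shown.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t pendingSIGUSR1 = 0;

static void on_SIGUSR1(int signum)
{
    (void)signum;
    pendingSIGUSR1 = 1;      /* no need to re-install the handler here */
}

/* Returns 0 on success, -1 on failure (errno set by sigaction). */
int install_handlers(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;         /* no SA_RESETHAND: the handler stays installed */
    sa.sa_handler = on_SIGUSR1;
    if (sigaction(SIGUSR1, &sa, NULL) == -1)
        return -1;

    sa.sa_handler = SIG_IGN; /* ignore SIGPIPE, as in the original code */
    return sigaction(SIGPIPE, &sa, NULL);
}
```

With sigaction the handler persists across deliveries unless SA_RESETHAND is requested, so the second SIGUSR1 no longer terminates the process.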
I am doing a simple server/client program in C which listens on a network interface and accepts clients. Each client is handled in a forked process.
The goal I have is to let the parent process know, once a client has disconnected from the child process.
Currently my main loop looks like this:
for (;;) {
/* 1. [network] Wait for new connection... (BLOCKING CALL) */
fd_listen[client] = accept(fd_listen[server], (struct sockaddr *)&cli_addr, &clilen);
if (fd_listen[client] < 0) {
perror("ERROR on accept");
exit(1);
}
/* 2. [process] Call socketpair */
if ( socketpair(AF_LOCAL, SOCK_STREAM, 0, fd_comm) != 0 ) {
perror("ERROR on socketpair");
exit(1);
}
/* 3. [process] Call fork */
pid = fork();
if (pid < 0) {
perror("ERROR on fork");
exit(1);
}
/* 3.1 [process] Inside the Child */
if (pid == 0) {
printf("[child] num of clients: %d\n", num_client+1);
printf("[child] pid: %ld\n", (long) getpid());
close(fd_comm[parent]); // Close the parent socket file descriptor
close(fd_listen[server]); // Close the server socket file descriptor
// Tasks that the child process should be doing for the connected client
child_processing(fd_listen[client]);
exit(0);
}
/* 3.2 [process] Inside the Parent */
else {
num_client++;
close(fd_comm[child]); // Close the child socket file descriptor
close(fd_listen[client]); // Close the client socket file descriptor
printf("[parent] num of clients: %d\n", num_client);
while ( (w = waitpid(-1, &status, WNOHANG)) > 0) {
printf("[EXIT] child %d terminated\n", w);
num_client--;
}
}
}/* end of while */
It all works well, the only problem I have is (probably) due to the blocking accept call.
When I connect to the above server, a new child process is created and child_processing is called.
However when I disconnect with that client, the main parent process does not know about it and does NOT output printf("[EXIT] child %d terminated\n", w);
But, when I connect with a second client after the first client has disconnected, the main loop is able to finally process the while ( (w = waitpid(-1, &status, WNOHANG)) > 0) part and tell me that the first client has disconnected.
If only one client ever connects and then disconnects, my main parent process will never be able to tell whether it has disconnected.
Is there any way to tell the parent process that my client already left?
UPDATE
As I am a real beginner with C, it would be nice if you could provide some short snippets with your answer so I can actually understand it :-)
Your waitpid usage is not correct. You make a non-blocking call, so if the child has not finished yet the call returns 0:
waitpid(): on success, returns the process ID of the child whose state
has changed; if WNOHANG was specified and one or more child(ren)
specified by pid exist, but have not yet changed state, then 0 is
returned. On error, -1 is returned.
So you leave the while loop immediately. Of course the termination can be caught later, when the first child has terminated and a second connection lets you reach the waitpid again.
Since you need a non-blocking wait, I suggest that you do not manage termination directly but through the SIGCHLD signal, which lets you catch the termination of any child and then call waitpid appropriately in the handler:
void handler(int signal) {
    while (waitpid(...)) { // find an adequate condition and parameters for your needs
    }
}
...
struct sigaction act;
act.sa_flags = 0;
sigemptyset(&(act.sa_mask));
act.sa_handler = handler;
sigaction(SIGCHLD,&act,NULL);
... // now ready to receive SIGCHLD when at least one child changes its state
If I understand correctly, you want to be able to service multiple clients at once, and therefore your waitpid call is correct in that it does not block if no child has terminated.
However, the problem you then have is that you need to be able to process asynchronous child termination while waiting for new clients via accept.
Assuming that you're dealing with a POSIXy system, merely having a SIGCHLD handler established and having the signal unmasked (via sigprocmask, though IIRC it is unmasked by default), should be enough to cause accept to fail with EINTR if a child terminates while you are waiting for a new client to connect - and you can then handle EINTR appropriately.
The reason for this is that a SIGCHLD signal will be automatically sent to the parent process when a child process terminates. In general, system calls such as accept will return an error of EINTR ("interrupted") if a signal is received while they are waiting.
However, there would still be a race condition, where a child terminates just before you call accept (i.e. between your existing waitpid call and accept). There are two main possibilities to overcome this:
Do all the child termination processing in your SIGCHLD handler, instead of the main loop. This may not be feasible, however, since there are significant limits to what you are allowed to do within a signal handler. You may not call printf for example (though you may use write).
I do not suggest you go down this path: although it may seem simpler at first, it is the least flexible option and may prove unworkable later.
Write to one end of a non-blocking pipe in your SIGCHLD signal handler. Within the main loop, instead of calling accept directly, use poll (or select) to look for readiness on both the socket and the read end of the pipe, and handle each appropriately.
On Linux (and OpenBSD, I'm not sure about others) you can use ppoll (man page) to avoid the need to create a pipe (and in this case you should leave the signal masked, and have it unmasked during the poll operation; if ppoll fails with EINTR, you know that a signal was received, and you should call waitpid). You still need to set a signal handler for SIGCHLD, but it doesn't need to do anything.
Another option on Linux is to use signalfd (man page) to avoid both the need to create a pipe and set up a signal handler (I think). You should mask the SIGCHLD signal (using sigprocmask) if you use this. When poll (or equivalent) indicates that the signalfd is active, read the signal data from it (which clears the signal) and then call waitpid to reap the child.
On various BSD systems you can use kqueue (OpenBSD man page) instead of poll and watch for signals without needing to establish a signal handler.
On other POSIX systems you may be able to use pselect (documentation) in a similar way to ppoll as described above.
There is also the option of using a library such as libevent to abstract away the OS-specifics.
The Glibc manual has an example of using select. Consult the manual pages for poll, ppoll, pselect for more information about those functions. There is an online book on using Libevent.
Rough example for using select, borrowed from Glibc documentation (and modified):
/* Set up a pipe and set signal handler for SIGCHLD */
int pipefd[2]; /* must be a global variable */
pipe(pipefd); /* TODO check for error return */
fcntl(pipefd[1], F_SETFL, O_NONBLOCK); /* set write end non-blocking */
/* signal handler */
void sigchld_handler(int signum)
{
    int saved_errno = errno; /* write() may change errno */
    char a = 0; /* write anything, doesn't matter what */
    write(pipefd[1], &a, 1);
    errno = saved_errno;
}
/* set up signal handler */
signal(SIGCHLD, sigchld_handler);
Where you currently have accept, you need to check status of the server socket and the read end of the pipe:
fd_set set, outset;
/* Initialize the file descriptor set. */
FD_ZERO (&set);
FD_SET (fd_listen[server], &set);
FD_SET (pipefd[0], &set);
for (;;) {
    outset = set; /* select() overwrites its argument, so pass a copy */
    if (select (FD_SETSIZE, &outset, NULL, NULL, NULL /* no timeout */) < 0) {
        if (errno == EINTR)
            continue; /* interrupted by a signal: just retry */
        /* TODO handle other errors */
        break;
    }
    if (FD_ISSET(fd_listen[server], &outset)) {
        /* now do accept() etc */
    }
    if (FD_ISSET(pipefd[0], &outset)) {
        /* now do waitpid(), and read a byte from the pipe */
    }
}
Using other mechanisms is generally simpler, so I leave those as an exercise :)
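For the curious, here is roughly what the signalfd variant mentioned above might look like on Linux. Treat it as a sketch: open_sigchld_fd and reap_children are invented names, and a real server would put the listening socket into the same poll call instead of polling the signalfd alone.

```c
#define _GNU_SOURCE
#include <poll.h>
#include <signal.h>
#include <sys/signalfd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Block SIGCHLD and return a descriptor that becomes readable whenever a
 * SIGCHLD is pending. Returns -1 on error. Linux only. */
int open_sigchld_fd(void)
{
    sigset_t mask;

    sigemptyset(&mask);
    sigaddset(&mask, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &mask, NULL) == -1)
        return -1;
    return signalfd(-1, &mask, SFD_CLOEXEC);
}

/* Wait up to timeout_ms for a SIGCHLD on sfd, then reap every child that
 * has already terminated. Returns the number of children reaped, 0 on
 * timeout, -1 on error. */
int reap_children(int sfd, int timeout_ms)
{
    struct pollfd pfd = { .fd = sfd, .events = POLLIN };
    struct signalfd_siginfo si;
    int reaped = 0;
    int n = poll(&pfd, 1, timeout_ms);

    if (n <= 0)
        return n;
    /* reading consumes the pending signal */
    if (read(sfd, &si, sizeof si) != (ssize_t)sizeof si)
        return -1;
    /* several exits may be merged into one SIGCHLD, so loop */
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped++;
    return reaped;
}
```

No signal handler is needed here, and printf is safe to call after reap_children returns because the code runs in the normal flow of the main loop.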
gcc (GCC) 4.6.3
valgrind-3.6.1
I have created an application that sends and receives messages in two different threads, one for sending and one for receiving, using pthreads, condition variables and mutexes for locking.
The sender sends messages and then signals the receiver to receive and process them. It does this in a while loop.
However, a problem occurs if I want to quit the application by pressing Ctrl-C and handling the interrupt. If no messages are being sent, the receiver is stuck in the while loop waiting to receive.
The main thread calls join and blocks waiting for the receiver to finish, but it never does, because it is waiting in pthread_cond_wait.
I was thinking of using pthread_cancel or pthread_kill, but I would rather not, as that doesn't allow the thread to exit normally.
Many thanks for any suggestions.
main function
void main(void)
{
/* Do some stuff here */
/* Start thread that will send a message */
if(pthread_create(&thread_recv_id, &thread_attr, thread_recv_fd, NULL) == -1) {
fprintf(stderr, "Failed to create thread, reason [ %s ]",
strerror(errno));
break;
}
printf("Start listening for receiving data'\n");
/* Start thread to receive messages */
if(pthread_create(&thread_send_id, &thread_attr, thread_send_fd, NULL) == -1) {
fprintf(stderr, "Failed to create thread for receiving, reason [ %s ]",
strerror(errno));
break;
}
/* Clean up threading properties */
pthread_join(thread_send_id, NULL);
pthread_join(thread_recv_id, NULL); <---- blocking here waiting for the recv thread to finish
pthread_mutex_destroy(&mutex_queue);
pthread_cond_destroy(&cond_queue);
return 0;
}
sender thread
void *thread_send_fd()
{
pthread_mutex_lock(&mutex_queue);
if(send_fd((int)fd) == FALSE) {
/* Just continue to send another item */
continue;
}
/* Signal the waiting thread to remove the item that has been sent */
pthread_cond_signal(&cond_queue);
pthread_mutex_unlock(&mutex_queue);
}
receiver thread
void *thread_recv_fd()
{
while(is_receiving()) {
pthread_mutex_lock(&mutex_queue);
/* Wait for an item to be sent on the queue */
pthread_cond_wait(&cond_queue, &mutex_queue); <---- waiting here
queue_remove();
pthread_mutex_unlock(&mutex_queue);
}
pthread_exit(NULL);
}
You basically have 3 choices:
1. Use pthread_cancel. This will interrupt the pthread_cond_wait call, and then exit the thread, invoking the cancellation handlers registered with pthread_cleanup_push on the way up.
2. Use pthread_kill to send a signal to the thread. This doesn't "kill" the thread, it just sends a signal. In this case, you must have registered a signal handler in that thread for the signal you use, and that signal handler must do something to tell the thread to exit. This isn't particularly better than the third option, since the signal handler still needs to do something to make the pthread_cond_wait loop exit.
3. Add a manual interruption feature to your thread that knows to set a flag and signal the condition variable. The loop around pthread_cond_wait should then check the flag and exit the thread if the flag is set.
I would recommend (1) or (3). Using pthread_cancel is most generic, but requires careful handling in the thread to ensure there are suitable pthread_cleanup_push calls for cleaning up all resources allocated by the thread, unlocking all mutexes, and so forth. Writing a manual interruption feature is potentially more work, but can be most easily tailored to your application.
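A minimal sketch of the manual interruption option follows. The mutex, condition variable and thread function names are taken from the question; queued_items, processed, stop_requested, enqueue_item and request_stop are stand-ins I invented, since the real queue is not shown.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t mutex_queue = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond_queue  = PTHREAD_COND_INITIALIZER;
static int  queued_items   = 0;     /* stand-in for the real queue */
static int  processed      = 0;     /* items removed so far */
static bool stop_requested = false; /* protected by mutex_queue */

void *thread_recv_fd(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&mutex_queue);
    for (;;) {
        /* wait on a predicate: this also copes with spurious wakeups */
        while (queued_items == 0 && !stop_requested)
            pthread_cond_wait(&cond_queue, &mutex_queue);
        if (queued_items == 0)
            break;                  /* nothing left and stop was requested */
        queued_items--;             /* stands in for queue_remove() */
        processed++;
    }
    pthread_mutex_unlock(&mutex_queue);
    return NULL;
}

/* Sender side: add one item and wake the receiver. */
void enqueue_item(void)
{
    pthread_mutex_lock(&mutex_queue);
    queued_items++;
    pthread_cond_signal(&cond_queue);
    pthread_mutex_unlock(&mutex_queue);
}

/* Call from a normal thread (not from a signal handler) to end the receiver. */
void request_stop(void)
{
    pthread_mutex_lock(&mutex_queue);
    stop_requested = true;
    pthread_cond_broadcast(&cond_queue);
    pthread_mutex_unlock(&mutex_queue);
}
```

In this version the receiver drains whatever is still queued before it exits, after which pthread_join in main returns. Note that request_stop takes a mutex, so the Ctrl-C handler itself should only set a flag and let the main thread make the call.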
Hi I'm building a program that uses a signal handler shown below ...
struct sigaction pipeIn;
pipeIn.sa_handler = updateServer;
sigemptyset(&pipeIn.sa_mask);
pipeIn.sa_flags = SA_ONESHOT;
if(sigaction(SIGUSR1, &pipeIn, NULL) == -1){
printf("We have a problem, sigaction is not working.\n");
perror("\n");
exit(1);
}
The problem is that this handler is getting tripped when it's not supposed to. The only thing that should send the SIGUSR1 signal is my child process which exists inside an infinite while loop which listens for incoming connections. The child process is forked as you can see below. I redo the pipeIn handler to run a different function that the child process uses which the parent does not. The code is shown below.
while(1){
newSock = accept(listenSock,(struct sockaddr *)&their_addr,&addr_size);
printf("A\n");
if(!fork()){
// We want to redefine the interrupt
pid_t th;
th = getpid();
printf("child pid: %d\n",th);
pipeIn.sa_handler = setFlag;
if(sigaction(SIGUSR1, &pipeIn, NULL) == -1){
printf("We have a problem, sigaction is not working.\n");
perror("\n");
exit(1);
}
close(listenSock);
kill(getppid(),SIGUSR1);
waitForP();
}*/
close(newSock);
exit(0);
}
close(newSock);
//waitForP();
//break;
}
When I run this code, I connect to this server program from another computer. It accept()s that one request just fine, and the child process eventually gets around to sending SIGUSR1 to the parent process. However, the parent process receives a SIGUSR1 signal before the child process even gets to send it. The handler runs before it should, then the child process finally calls kill() and the handler goes off a second time. Lastly, accept() returns again even though no new connection has been made, and the incoming IP address is a weird, seemingly random IPv6 address. I don't know what's going on. Any help would be great.
Repeat after me: always check for errors returned from a system call (which accept(2) is). You are getting -1 instead of a socket descriptor, EINTR in errno and an undefined connecting address.
This might seem obvious, but have you compiled your code with full warnings on and assured you don't have any? Mysterious behavior is often caused by errors that are only mentioned by the C compiler if you ask it...
Sockets on Linux question
I have a worker thread that is blocked on an accept() call. It simply waits for an incoming network connection, handles it, and then returns to listening for the next connection.
When it is time for the program to exit, how do I signal this network worker thread (from the main thread) to return from the accept() call while still being able to gracefully exit its loop and handle its cleanup code?
Some things I tried:
pthread_kill to send a signal. It feels kludgy to do this, plus it doesn't reliably allow the thread to do its shutdown logic. It also makes the program terminate. I'd like to avoid signals if at all possible.
pthread_cancel. Same as above. It's a harsh kill on the thread. That, and the thread may be doing something else.
Closing the listen socket from the main thread in order to make accept() abort. This doesn't reliably work.
Some constraints:
If the solution involves making the listen socket non-blocking, that is fine. But I don't want to accept a solution that involves the thread waking up via a select call every few seconds to check the exit condition.
The thread condition to exit may not be tied to the process exiting.
Essentially, the logic I am going for looks like this.
void* WorkerThread(void* args)
{
DoSomeImportantInitialization(); // initialize listen socket and some thread specific stuff
while (HasExitConditionBeenSet()==false)
{
listensize = sizeof(listenaddr);
int sock = accept(listensocket, &listenaddr, &listensize);
// check if exit condition has been set using thread safe semantics
if (HasExitConditionBeenSet())
{
break;
}
if (sock < 0)
{
printf("accept returned %d (errno==%d)\n", sock, errno);
}
else
{
HandleNewNetworkCondition(sock, &listenaddr);
}
}
DoSomeImportantCleanup(); // close listen socket, close connections, cleanup etc..
return NULL;
}
void SignalHandler(int sig)
{
printf("Caught CTRL-C\n");
}
void NotifyWorkerThreadToExit(pthread_t thread_handle)
{
// signal thread to exit
}
int main()
{
void* ptr_ret= NULL;
pthread_t workerthread_handle = 0;
pthread_create(&workerthread, NULL, WorkerThread, NULL);
signal(SIGINT, SignalHandler);
sleep((unsigned int)-1); // sleep until the user hits ctrl-c
printf("Returned from sleep call...\n");
SetThreadExitCondition(); // sets global variable with barrier that worker thread checks on
// this is the function I'm stalled on writing
NotifyWorkerThreadToExit(workerthread_handle);
// wait for thread to exit cleanly
pthread_join(workerthread_handle, &ptr_ret);
DoProcessCleanupStuff();
}
Close the socket using the shutdown() call. This will wake up any threads blocked on it, while keeping the file descriptor valid.
close() on a descriptor another thread B is using is inherently hazardous: another thread C may open a new file descriptor which thread B will then use instead of the closed one. dup2() a /dev/null onto it avoids that problem, but does not wake up blocked threads reliably.
Note that shutdown() only works on sockets -- for other kinds of descriptors you likely need the select+pipe-to-self or cancellation approaches.
You can use a pipe to notify the thread that you want it to exit. Then you can have a select() call which selects on both the pipe and the listening socket.
For example (compiles but not fully tested):
// NotifyPipe.h
#ifndef NOTIFYPIPE_H_INCLUDED
#define NOTIFYPIPE_H_INCLUDED
class NotifyPipe
{
int m_receiveFd;
int m_sendFd;
public:
NotifyPipe();
virtual ~NotifyPipe();
int receiverFd();
void notify();
};
#endif // NOTIFYPIPE_H_INCLUDED
// NotifyPipe.cpp
#include "NotifyPipe.h"
#include <unistd.h>
#include <assert.h>
#include <fcntl.h>
NotifyPipe::NotifyPipe()
{
int pipefd[2];
int ret = pipe(pipefd);
assert(ret == 0); // For real usage put proper check here
m_receiveFd = pipefd[0];
m_sendFd = pipefd[1];
fcntl(m_sendFd,F_SETFL,O_NONBLOCK);
}
NotifyPipe::~NotifyPipe()
{
close(m_sendFd);
close(m_receiveFd);
}
int NotifyPipe::receiverFd()
{
return m_receiveFd;
}
void NotifyPipe::notify()
{
write(m_sendFd,"1",1);
}
Then select with receiverFd(), and notify for termination using notify().
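The select side is not shown above, so here is one way it might look in plain C. This is an assumption-laden sketch: wait_for_wakeup and notify_worker are invented names, notify_fd is the read end of the pipe (what receiverFd() returns), and error handling is minimal.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/select.h>
#include <unistd.h>

enum wake_reason { WAKE_ERROR = -1, WAKE_NOTIFY = 0, WAKE_CONNECTION = 1 };

/* Block until listen_fd has a pending connection or notify_fd (the read
 * end of the pipe) has a byte. A notification takes priority. */
enum wake_reason wait_for_wakeup(int listen_fd, int notify_fd)
{
    fd_set rfds;
    int maxfd = listen_fd > notify_fd ? listen_fd : notify_fd;
    int rc;

    do {
        FD_ZERO(&rfds);
        FD_SET(listen_fd, &rfds);
        FD_SET(notify_fd, &rfds);
        rc = select(maxfd + 1, &rfds, NULL, NULL, NULL);
    } while (rc == -1 && errno == EINTR);   /* retry if a signal interrupts */

    if (rc == -1)
        return WAKE_ERROR;
    if (FD_ISSET(notify_fd, &rfds))
        return WAKE_NOTIFY;                 /* time to leave the loop */
    return WAKE_CONNECTION;                 /* accept() will not block now */
}

/* Main-thread side: write one byte to the write end of the pipe.
 * Returns 0 on success, -1 on failure. */
int notify_worker(int notify_write_fd)
{
    return write(notify_write_fd, "1", 1) == 1 ? 0 : -1;
}
```

The worker would call accept only when WAKE_CONNECTION comes back and jump to its cleanup code on WAKE_NOTIFY, so there is no periodic wake-up and no signal involved.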
Close the listening socket and accept will return an error.
What doesn't reliably work with this? Describe the problems you're facing.
Using pthread_cancel to cancel a thread blocked in accept() is risky if the pthread implementation does not implement cancellation properly. That is, if accept() has created the new socket and pthread_cancel() is called just before it returns to your code, the thread is cancelled and the newly created socket is leaked. FreeBSD 9.0 and later do not have this race condition, but you should check your OS first.