I have a program in C where I want to do some calculations which may or may not take a very long time. It is hard to know beforehand how much time the calculations will take. The program has a CLI, so right now I usually do something like this
./program
do calculation 243
and it starts calculating. If I want to cancel it because it takes too much time, I press Ctrl+C and restart the program with another calculation. Now I would like the program to cancel the calculation itself, either after q has been pressed or after, for example, 10 seconds have passed.
I have found a way which seems to do what I expect using pthreads. I'm however wondering if this is recommended or if there are for example any memory leaks or other things that can happen.
The following is my code
void *pthread_getc(void *ptr) {
    int c = '\0'; /* int, so that EOF can be told apart from a character */
    while (c != 'q' && c != EOF)
        c = getc(stdin);
    if (c == 'q')
        pthread_cancel((pthread_t)ptr);
    return NULL;
}
void *pthread_sleep(void *ptr) {
    sleep(10);
    pthread_cancel((pthread_t)ptr);
    return NULL;
}
void pthread_cancellable(void *(*ptr)(void *), struct arg *arg) {
    pthread_t thread_main, thread_getc, thread_sleep;
    pthread_create(&thread_main, NULL, ptr, (void *)arg);
    pthread_create(&thread_getc, NULL, pthread_getc, (void *)thread_main);
    pthread_create(&thread_sleep, NULL, pthread_sleep, (void *)thread_main);
    pthread_join(thread_main, NULL);
    pthread_cancel(thread_getc);
    pthread_cancel(thread_sleep);
    pthread_join(thread_getc, NULL);
    pthread_join(thread_sleep, NULL);
}
The idea is that both pthread_getc and pthread_sleep can cancel main, and once main is cancelled so are these two. Then I simply call pthread_cancellable, where the first argument is the function doing the calculation and the second argument is the argument passed to that function.
Can something go wrong with memory leaks here or something else? Is there an easier/better way to do this in C?
What happens if main is cancelled twice, or if a thread gets cancelled when it's already done?
Can something go wrong with memory leaks here or something else?
If the program is going to terminate after aborting the computation then there is no issue with memory leaks. The system does not rely on processes to clean up after themselves -- it will reclaim all memory allotted to the process no matter how the process used it.
But your code violates the #1 rule of pthread_cancel(): never call pthread_cancel(). And although monitoring stdin for a q keystroke could work, that's a bit odd, and it potentially gets in the way of using stdin for something else you want to add to your program later.
Is there an easier/better way to do this in C?
Yes. In the first place, if the objective is simply to terminate the program at timeout / user interrupt, then do that. That is, have any thread call exit() when you want to terminate. You do not need to cancel any threads for that.
In the second place, I don't see what is gained by implementing a custom keyboard action (type 'q' to abort) when the standard interrupt signal sent by Ctrl-C works fine, and you even get the latter for free. If you want or need to perform some kind of extra behavior in response to an interrupt signal (before or instead of terminating), then register a handler for it.
There are multiple ways you could implement the early termination behavior, but here are outlines of two I like:
No-frills abortion upon timeout (or Ctrl-C):
Only the program's initial thread is needed.
Before it launches the computation, it creates and starts an interval timer (timer_create()) to count down the timeout. Configure the timer to raise SIGINT when it expires.
That's it. You get termination via the keyboard (albeit with Ctrl-C as you already do, not 'q') and the same termination behavior as far as an external observer can see in the event of a timeout.
optional addition 1:
If desired, you can install a handler for SIGINT to get extra or different behavior upon cancellation than you otherwise would. Note, however, that there are significant limits on what a signal handler may do: it should call only async-signal-safe functions. For example, maybe you want to emit a message to stderr (use write(), not fprintf(), for such things), or you want to terminate with a non-zero exit status instead of terminating (directly) because of the signal (use _exit() for that inside the handler, because exit() is not async-signal-safe).
optional addition 2:
If the program reaches a point where it is not finished but it no longer wants to be terminated when the timeout is reached then it may at that point use timer_delete() to disable the timer.
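To make the no-frills outline concrete, here is a minimal sketch, assuming a system with POSIX timers (older glibc versions need -lrt at link time). The helper names start_timeout_ms(), install_abort_handler() and on_sigint() are made up for this illustration; only the timer_*(), sigaction(), write() and _exit() calls are real API.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Optional addition 1: report and leave with a non-zero status.
 * Only async-signal-safe calls (write, _exit) are used here. */
static void on_sigint(int signo)
{
    static const char msg[] = "calculation aborted\n";
    (void)signo;
    (void)write(STDERR_FILENO, msg, sizeof msg - 1);
    _exit(1);
}

/* Hypothetical helper: install on_sigint() for SIGINT. Returns 0 on success. */
int install_abort_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGINT, &sa, NULL);
}

/* Hypothetical helper: create a one-shot timer that raises SIGINT after
 * timeout_ms milliseconds. Returns 0 on success, -1 on failure. */
int start_timeout_ms(timer_t *timer, long timeout_ms)
{
    struct sigevent sev;
    struct itimerspec its;

    memset(&sev, 0, sizeof sev);
    sev.sigev_notify = SIGEV_SIGNAL;   /* deliver a signal on expiry ... */
    sev.sigev_signo = SIGINT;          /* ... the same one Ctrl-C sends  */
    if (timer_create(CLOCK_MONOTONIC, &sev, timer) != 0)
        return -1;

    memset(&its, 0, sizeof its);       /* it_interval == 0: fire only once */
    its.it_value.tv_sec = timeout_ms / 1000;
    its.it_value.tv_nsec = (timeout_ms % 1000) * 1000000L;
    if (timer_settime(*timer, 0, &its, NULL) != 0) {
        timer_delete(*timer);
        return -1;
    }
    return 0;
}
```

The intended use would be to call start_timeout_ms(&t, 10000) just before launching the calculation, and timer_delete(t) at the point described in optional addition 2. Without install_abort_handler(), the default SIGINT action simply terminates the process.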
With-frills abortion upon timeout (or Ctrl-C):
If you want to perform work in response to abort of the computation that is unsuited for a signal handler (too much, needs to call functions that are not async-signal-safe, ...) then you need a thread to do that in, and additional control structures and mechanisms. This is one way to do it:
Create and initialize a mutex, a condition variable, and a flag of type sig_atomic_t, all at file scope. The contract for these is that the flag may be accessed (read or write) only by a thread that currently holds the mutex locked, and that the mutex is the same that will be associated with all waits on the CV.
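Install a signal handler for SIGINT (or, because pthread_mutex_lock() and pthread_cond_broadcast() are not async-signal-safe, a dedicated thread that receives SIGINT via sigwait()) that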
locks the mutex
Provided that the flag does not indicate completion, updates it to indicate cancellation
unlocks the mutex
broadcasts to the CV
The last thing the computational thread will do after completing its work is (with the mutex locked) set the flag to a value indicating completion, and then broadcast to the CV.
The initial thread will then do this:
Setup as described in the previous points
lock the mutex
Create / start an interval timer (timer_create()) that raises SIGINT when it expires, after the chosen timeout period.
Start the computational thread
loop while the flag indicates ongoing computation. In the loop body
perform a wait on the CV
The computation having either completed successfully or been canceled at this point, perform whatever final actions are appropriate and then terminate, either by returning from main() or by calling exit().
That's still pretty clean, gets you both timeout-based and keyboard-based cancellation (albeit the latter with Ctrl-C instead of 'q'), puts all the cancellation handling in one place, and requires only one thread in addition to the computational one.
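A rough sketch of the with-frills outline follows. It deliberately swaps one ingredient: POSIX does not list pthread_mutex_lock() or pthread_cond_broadcast() as async-signal-safe, so instead of an asynchronous handler the sketch blocks SIGINT everywhere and lets a small helper thread collect it with sigwait(). All names (run_with_timeout(), finish(), quick_work(), slow_work(), ...) are invented for the illustration, error paths are abbreviated, and the function is meant to be called once per run of the program, as in the outline.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <string.h>
#include <time.h>

enum { RUNNING, DONE, CANCELLED };

static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cv = PTHREAD_COND_INITIALIZER;
static int state = RUNNING;                    /* protected by mtx */
static struct { void *(*fn)(void *); void *arg; } job;
static sigset_t sigint_set;

/* Move RUNNING -> s (the first caller wins), then wake every waiter. */
static void finish(int s)
{
    pthread_mutex_lock(&mtx);
    if (state == RUNNING)
        state = s;
    pthread_mutex_unlock(&mtx);
    pthread_cond_broadcast(&cv);
}

/* Receives SIGINT synchronously, so ordinary thread code is allowed here. */
static void *signal_waiter(void *unused)
{
    int sig;
    (void)unused;
    if (sigwait(&sigint_set, &sig) == 0)
        finish(CANCELLED);
    return NULL;
}

static void *compute(void *unused)
{
    (void)unused;
    job.fn(job.arg);
    finish(DONE);
    return NULL;
}

/* Stand-in calculations, purely for illustration. */
void *quick_work(void *arg) { return arg; }
void *slow_work(void *arg)
{
    struct timespec ts = { 5, 0 };
    nanosleep(&ts, NULL);
    return arg;
}

/* Returns DONE or CANCELLED, or -1 if setup failed. */
int run_with_timeout(void *(*fn)(void *), void *arg, long timeout_ms)
{
    struct sigevent sev;
    struct itimerspec its;
    timer_t timer;
    pthread_t waiter, worker;
    int result;

    /* Block SIGINT before creating threads so they all inherit the mask. */
    sigemptyset(&sigint_set);
    sigaddset(&sigint_set, SIGINT);
    pthread_sigmask(SIG_BLOCK, &sigint_set, NULL);

    memset(&sev, 0, sizeof sev);
    sev.sigev_notify = SIGEV_SIGNAL;
    sev.sigev_signo = SIGINT;
    memset(&its, 0, sizeof its);
    its.it_value.tv_sec = timeout_ms / 1000;
    its.it_value.tv_nsec = (timeout_ms % 1000) * 1000000L;
    if (timer_create(CLOCK_MONOTONIC, &sev, &timer) != 0)
        return -1;

    job.fn = fn;
    job.arg = arg;
    state = RUNNING;
    if (timer_settime(timer, 0, &its, NULL) != 0 ||
        pthread_create(&waiter, NULL, signal_waiter, NULL) != 0 ||
        pthread_create(&worker, NULL, compute, NULL) != 0)
        return -1;                 /* abbreviated: a real program would clean up */
    pthread_detach(worker);

    pthread_mutex_lock(&mtx);
    while (state == RUNNING)
        pthread_cond_wait(&cv, &mtx);
    result = state;
    pthread_mutex_unlock(&mtx);

    timer_delete(timer);
    if (result == DONE)
        pthread_kill(waiter, SIGINT);   /* release the helper from sigwait() */
    pthread_join(waiter, NULL);
    return result;
}
```

After run_with_timeout() returns CANCELLED the computational thread may still be running, which is fine under the outline's assumption that the initial thread then does its final work and terminates the process.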
optional addition: abort in response to 'q'
Although I do not recommend it, if you really must have that termination by typing 'q', then you can set up another thread that monitors for that keypress / character, and performs a raise(SIGINT) if it sees it.
Related
This really is two questions, but I suppose it's better they be combined.
We're working on a client that uses asynchronous TCP connection. The idea is that the program will block until certain message is received from the server, which will invoke a SIGPOLL handler. We are using a busy waiting loop, basically:
var = 1
while (var) usleep(100);
//...and somewhere else
void sigpoll_handler(int signum){
......
var = 0;
......
}
We would like to use something more reliable instead, like a semaphore. The thing is, when a thread is blocked on a semaphore, will the signal get through still? Especially considering that signals get delivered when it switches back to user level; if the process is off the runqueue, how will it happen?
Side question (just out of curiosity):
Without the "usleep(100)" the program never progresses past the while loop, although I can verify the variable was set in the handler. Why is that? Printing changes its behaviour too.
Cheers!
[too long for a comment]
Accessing var from inside the signal handler invokes undefined behaviour (at least for a POSIX conforming system).
From the related POSIX specification:
[...] if the process is single-threaded and a signal handler is executed [...] the behavior is undefined if the signal handler refers to any object [...] with static storage duration other than by assigning a value to an object declared as volatile sig_atomic_t [...]
So var shall be defined:
volatile sig_atomic_t var;
The busy-waiting while loop can be replaced by a single call to the blocking pause(), as it returns on reception of the signal.
From the related POSIX specification:
The pause() function shall suspend the calling thread until delivery of a signal whose action is either to execute a signal-catching function or to terminate the process.
Using pause(), by the way, makes the use of any global flag like var redundant, not to say needless.
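One caveat with pause() on its own: if the signal is delivered just before pause() is entered, the call can sleep indefinitely. A common way around that is to keep the volatile sig_atomic_t flag and wait with sigsuspend() while the signal is blocked. The sketch below assumes that pattern; it uses SIGUSR1 in the usage as a stand-in for SIGPOLL, and install_handler() and wait_for_signal() are names made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;            /* the only thing the handler does */
}

/* Hypothetical helper: install on_signal() for signo. Returns 0 on success. */
int install_handler(int signo)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Block signo, then sleep in sigsuspend() until the handler has run.
 * Testing the flag while the signal is blocked closes the window in which
 * the signal could slip in between the test and the wait. */
void wait_for_signal(int signo)
{
    sigset_t block, old, wait_mask;

    sigemptyset(&block);
    sigaddset(&block, signo);
    sigprocmask(SIG_BLOCK, &block, &old);

    wait_mask = old;
    sigdelset(&wait_mask, signo);      /* signo must be deliverable while waiting */
    while (!got_signal)
        sigsuspend(&wait_mask);        /* atomically swaps the mask and sleeps */

    got_signal = 0;                    /* ready for the next wait */
    sigprocmask(SIG_SETMASK, &old, NULL);
}
```

In the program from the question, the `while (var) usleep(100);` loop would become a single wait_for_signal(SIGPOLL) call after install_handler(SIGPOLL).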
Short answer: yes, the signal will get through fine with a good implementation.
If you're going to be using a semaphore to control the flow of the program, you'll want to have the listening be on one child with the actual data processing be on another. This will then put the concurrency fairness in the hands of the OS which will make sure your signal listening thread gets a chance to check for a signal with some regularity. It shouldn't ever be really "off the runqueue," but cycling through positions on the runqueue instead.
If it helps you to think about it, what you have right now seems to basically be a very rough implementation of a semaphore on its own -- a shared variable whose value will stop one block of code from executing until another code block clears it. There isn't anything inherently paralyzing about a semaphore on a system level.
I kind of wonder why whatever function you're using to listen for the SIGPOLL isn't doing its own blocking, though. Most of those utilities that I've seen will stop their calling thread until they return a value. Basically they handle the concurrency for you and you can code as if you were dealing with a normal synchronous program.
With regard to the usleep loop: I'd have to look at what the optimizer's doing, but I think there are basically two possibilities. I think it's unlikely, but it could be that the no-body loop is compiling into something that isn't actually checking for a value change and is instead just looping. More likely to me would be that the lack of any body steps is messing up the underlying concurrency handling, and the loop is executing so quickly that nothing else is getting a chance to run -- the queue is being flooded by loop iterations and your signal processing can't get a word in edgewise. You could try just watching it for a few hours to see if anything changes; theoretically, if it's just a concurrency problem, then the random factor involved could clear the block on its own with a few billion chances.
I'm currently writing a program in which the main thread creates three child threads. These threads run simultaneously, and what I want to do is this: once one of the child threads is done, I check whether its output is right. If it is, I terminate the other two threads; if not, I throw away this thread's result and wait for the other two threads' results.
I'm creating the three threads in the main function with pthread_create, but I do not know how to use the join function. If I call join three times in the main function, it just waits for the threads one by one until all three are done.
My plan is like this:
int return_value;
main(){
pthread_create(&pid[0], NULL, fun0, NULL);
pthread_create(&pid[1], NULL, fun1, NULL);
pthread_create(&pid[2], NULL, fun2, NULL);
}
fun0(){
...
if( check the result is right ){
return_value = result;
if (pid[1] is running) pthread_kill( pid[1], SIGTERM );
if (pid[2] is running) pthread_kill( pid[2], SIGTERM );
}
fun1() ...
fun2() ...
function 0, 1, and 2 are similar to each other and once one function has the right answer, it will kill the other two threads. However, while running the program, once the pthread_kill is processed, the whole program is terminated, not just one thread. I don't know why.
And I still do not know if there are any other ways to code this program. Thanks for helping me out of this.
The pthread_kill() function is not designed to terminate threads, just like kill() is not designed to terminate processes. These functions just send signals, and their names are unfortunate byproducts of history. The default action of many signals is to terminate the process. Using pthread_kill() allows you to select which thread handles a signal, but signal dispositions are shared by the whole process, so the signal will still do the exact same thing (e.g., terminate the process).
To terminate a thread, use pthread_cancel(). This will normally terminate the thread at the next cancellation point. Cancellation points are listed in the man pages (pthreads(7) on Linux); only certain functions, such as write(), sleep(), and pthread_testcancel(), are cancellation points.
However, if you set the cancelability type of the thread (with pthread_setcanceltype()) to PTHREAD_CANCEL_ASYNCHRONOUS, you can cancel the thread at any time. This can be DANGEROUS and you must be very careful. For example, if you cancel a thread in the middle of a malloc() call, you will get all sorts of nasty problems later on.
You will probably find it much easier to either test a shared variable every now and then, or perhaps even to use different processes which you can then just kill() if you don't need them any more. Canceling a thread is tricky.
Summary
Easiest option is to just test a variable in each thread to see if it should be canceled.
If this doesn't work, my next recommendation is to use fork() instead of pthread_create(), after which you can use kill().
If you want to play with fire, use asynchronous pthread_cancel(). This will probably explode in your face. You will have to spend hours of your precious time hunting bugs and trying to figure out how to do cleanup correctly. You will lose sleep and your cat will die from neglect.
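As an illustration of the easiest option, here is a minimal sketch using a C11 atomic flag that every worker tests as it goes. The worker body is a placeholder for fun0/fun1/fun2 (it only counts, and its result is always treated as "right"); run_race(), stop_requested and winner are names invented for this example.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool stop_requested;   /* set by the first thread that succeeds */
static atomic_int winner = -1;       /* id of that thread, -1 while undecided */

/* Placeholder for fun0/fun1/fun2: worker i counts to (i + 1) million,
 * checking the shared flag on every iteration. */
static void *worker(void *arg)
{
    int id = *(int *)arg;
    long target = 1000000L * (id + 1);

    for (long i = 0; i < target; i++) {
        if (atomic_load(&stop_requested))
            return NULL;             /* someone else already has the answer */
    }

    /* "check the result is right" would go here */
    int expected = -1;
    if (atomic_compare_exchange_strong(&winner, &expected, id))
        atomic_store(&stop_requested, true);   /* tell the others to stop */
    return NULL;
}

/* Starts three workers, waits for all of them, returns the winner's id. */
int run_race(void)
{
    pthread_t tid[3];
    int ids[3] = { 0, 1, 2 };
    int started = 0;

    atomic_store(&winner, -1);
    atomic_store(&stop_requested, false);

    while (started < 3 &&
           pthread_create(&tid[started], NULL, worker, &ids[started]) == 0)
        started++;
    if (started < 3)
        atomic_store(&stop_requested, true);   /* could not start all: stop early */

    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);  /* joins return quickly once the flag is set */
    return atomic_load(&winner);
}
```

Because the losing threads return on their own, plain pthread_join() calls in main no longer wait "one by one" for full computations, and nothing has to be killed or cancelled.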
I am writing a multithreaded program and I have this question:
Suppose that, while executing in the main thread, I want to terminate all
child threads. I can't just send them a termination signal because I want them
to free dynamically allocated memory first. Can I define a specific signal handler
in each thread function, which in turn calls a cleanup function that I will write
to do so? If not, how can I accomplish my goal?
Thanks,
Nikos
Look at the man page for pthread_cancel:
When a cancellation requested is acted on, the following steps occur for
thread (in this order):
1. Cancellation clean-up handlers are popped (in the reverse of the order in
which they were pushed) and called. (See pthread_cleanup_push(3).)
2. Thread-specific data destructors are called, in an unspecified order. (See
pthread_key_create(3).)
3. The thread is terminated. (See pthread_exit(3).)
So you can use pthread_cancel from your main, provided you have registered your cleanup handlers correctly using the above functions.
(Do read that man page completely though, it has a lot of important information.)
Edit: (from comments) If you plan on using PTHREAD_CANCEL_DEFERRED and need to insert a cancellation point somewhere in your code, then use pthread_testcancel. This function checks if a cancellation was requested. If that is the case, the cancellation is serviced (i.e. that call never returns). Otherwise it has no effect.
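For illustration, a small sketch of deferred cancellation with a cleanup handler, assuming the thread owns a single malloc()ed buffer. cancel_worker_demo(), free_buffer() and cleanup_ran are hypothetical names; the pthread_*() calls are the ones named above.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdlib.h>
#include <time.h>

static int cleanup_ran;   /* written by the worker, read after pthread_join() */

/* Cleanup handler: runs when the thread is cancelled inside the push/pop pair. */
static void free_buffer(void *p)
{
    free(p);
    cleanup_ran = 1;
}

static void *worker(void *unused)
{
    (void)unused;
    char *buf = malloc(4096);
    if (buf == NULL)
        return NULL;

    pthread_cleanup_push(free_buffer, buf);
    for (;;) {
        struct timespec ts = { 0, 1000000 };   /* 1 ms of pretend work */
        nanosleep(&ts, NULL);                  /* also a cancellation point */
        pthread_testcancel();                  /* explicit cancellation point */
    }
    pthread_cleanup_pop(1);   /* never reached, but must pair with the push */
    return NULL;
}

/* Returns 1 if the thread was cancelled and its cleanup handler ran. */
int cancel_worker_demo(void)
{
    pthread_t tid;
    void *res;

    cleanup_ran = 0;
    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return 0;
    pthread_cancel(tid);           /* default type is PTHREAD_CANCEL_DEFERRED */
    pthread_join(tid, &res);
    return res == PTHREAD_CANCELED && cleanup_ran;
}
```

The buffer is only protected while the push is in effect, so anything allocated before the pthread_cleanup_push() call should not be followed by a cancellation point until the handler is registered.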
The most robust strategy requires cooperation from the child threads: you set a flag that the threads periodically check and, when the flag is set, free whatever resources they're using and then terminate.
Cancellation (Mat's answer) is the correct and canonical one, but if you want a different approach, you can install a no-op signal handler using sigaction without the SA_RESTART flag and use pthread_kill with whatever signal number you chose in order to interrupt (EINTR) whatever the thread might have been blocked on. Combined with this, aix's answer works.
Right now I have a function connected to SIGALRM that goes off after 1 second and re-arms itself to go off again a second later, every time. There's a test in the logic of the SIGALRM function I wrote to see if a certain timeout has been reached, and when it has I need it to kill a function that's running. Does anybody know how I can do this?
I forgot to mention: in the function that needs to be killed it waits on scanf() and the function needs to die even if scanf() hasn't returned yet.
One approach that might be worth looking into is using select to poll stdin and see if any data is ready. select lets you wait for some period of time on a file descriptor, controlling when you can be interrupted and by what, and seems like it's perfect here. You could just sit in a loop waiting for up to a second, then failing gracefully if no data is available. That way, SIGALRM wouldn't need to kill the function; it would take care of that all by itself.
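A minimal sketch of that idea, assuming plain select() on a file descriptor; wait_readable() and stdin_ready() are helper names invented here.

```c
#include <sys/select.h>
#include <unistd.h>

/* Wait until fd is readable or timeout_ms has elapsed.
 * Returns 1 if data is ready, 0 on timeout, -1 on error
 * (including interruption by a signal, errno == EINTR). */
int wait_readable(int fd, long timeout_ms)
{
    fd_set rfds;
    struct timeval tv;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    int rc = select(fd + 1, &rfds, NULL, NULL, &tv);
    return rc > 0 ? 1 : rc;
}

/* Convenience wrapper for standard input. */
int stdin_ready(long timeout_ms)
{
    return wait_readable(STDIN_FILENO, timeout_ms);
}
```

The function that currently blocks in scanf() could instead loop on stdin_ready(1000), count the elapsed seconds itself, and only read once data is actually available, giving up cleanly when the timeout is reached. Keep in mind that select() looks at the file descriptor, not at whatever stdio has already buffered.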
Not sure exactly what you're asking or what the structure of the program is. If I understand correctly: some function is running and you want to terminate it if it's been running for X time. You have a SIGALRM wake up every second, and that will check the running time of the other function and do the termination.
How do you plan to kill the function? Is it a function in the same process, or is it a separate process? Is your question how to terminate it or how to tell when it needs to be terminated?
I've done something which I believe is similar. I had a multi-threaded application with a structure which contained information about the threads I wished to monitor. The structure contained a member variable "startTime". My monitoring (SIGALRM) function had access to a list of threads. When the monitor woke up it would traverse the list, compare the current time to each thread's startTime, and send a message to the function if it had exceeded its allowed runtime.
Does this help at all?
You could use a (global) variable to communicate between the signal handler and the function that should be stopped. The function then would check that variable to see if it should still continue running or if it should exit.
Something like this:
volatile sig_atomic_t worker_expired = 0; /* needs <signal.h> */
void worker(void) {
    while (!worker_expired) {
        // ...
    }
}
void sig_alrm(int signum) {
    (void)signum; /* unused */
    worker_expired = 1;
}
If you want the signal to terminate IO operations, you need to make sure it's an interrupting signal handler. On modern systems, system calls interrupted by signals automatically restart unless you specify otherwise. Use the sigaction function rather than the signal function to setup your signal handlers if you want control over things like this. With sigaction, unless you specify SA_RESTART, signal handlers can interrupt.
If you're using file-descriptor IO functions like read, you should now get the effects you want.
If you're using stdio functions like fscanf, getting interrupted by a signal will put the FILE into an error state that can only be cleared by clearerr, and will lose any partial input in the buffer. Interrupting signals do not mix very well with stdio unless you just want to abort all operations on the file and close it when a signal is received.
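Here is a sketch of the interrupting-handler approach with file-descriptor IO, assuming a no-op SIGALRM handler installed through sigaction() without SA_RESTART. read_with_timeout() is a made-up wrapper name, and it uses alarm() with whole seconds only to keep the example short.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* No-op handler: its only job is to make the blocked system call return. */
static void on_alarm(int signo)
{
    (void)signo;
}

/* Like read(), but gives up after `seconds`:
 * on timeout it returns -1 with errno set to EINTR. */
ssize_t read_with_timeout(int fd, void *buf, size_t len, unsigned seconds)
{
    struct sigaction sa, old;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;                 /* deliberately no SA_RESTART */
    if (sigaction(SIGALRM, &sa, &old) != 0)
        return -1;

    alarm(seconds);
    ssize_t n = read(fd, buf, len);  /* interrupted by SIGALRM on timeout */
    int saved = errno;

    alarm(0);                        /* cancel an alarm that is still pending */
    sigaction(SIGALRM, &old, NULL);  /* restore the previous disposition */
    errno = saved;
    return n;
}
```

This replaces whatever SIGALRM handling the program already has for the duration of the call, so in a program with a recurring one-second alarm the same effect would come from installing that existing handler without SA_RESTART. There is also a small window in which the alarm could fire before read() is entered, which matters only for very short timeouts.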
So ... to restate slightly: it isn't so much that you want to kill the function as that you want any pending i/o to terminate and the function to exit.
I would either:
1. Use select() to periodically wake up and check a flag set by the signal handler. If the flag isn't set and there's no input pending, then loop and call select() again.
2. I suspect that your SIGALRM handler is doing more than just checking this one timer, and so using pselect() to check for I/O OR SIGALRM is probably not an option for you. I wonder if you could grab a user-defined signal and pass that to pselect(). Then your alarm handler would send that user-defined signal.
Regarding choice 1, if SIGALRM is waking every second then you can adjust the time that select() sleeps to be within your maximum error latency. In other words, assume that the timeout occurs immediately after the call to select(); then it will take until select() wakes up to detect the flag set by the SIGALRM handler. So if select() wakes up 10 times per second, it could take up to 1/10 second to detect the setting of the "give up" flag (set by the SIGALRM handler).
I'm running a multi-threaded C program (process?) , making use of semaphores & pthreads. The threads keep interacting, blocking, waking & printing prompts on stdout continuously, without any human intervention. I want to be able to exit this process (gracefully after printing a message & putting down all threads, not via a crude CTRL+C SIGINT) by pressing a keyboard character like #.
What are my options for getting such an input from the user?
What more relevant information could I provide that will help to solve this problem?
Edit:
All your answers sound interesting, but my primary question remains. How do I get user input, when I don't know which thread is currently executing? Also, semaphore blocking using sem_wait() breaks if signalled via SIGINT, which may cause a deadlock.
There is no difference in reading standard input from threads except if more than one thread is trying to read it at the same time. Most likely your threads are not all calling functions to read standard input all the time, though.
If you regularly need to read input from the user you might want to have one thread that just reads this input and then sets flags or posts events to other threads based on this input.
If the kill character is the only thing you want, or if this is just going to be used for debugging, then what you probably want to do is occasionally poll for new data on standard input. You can do this by setting standard input to non-blocking and trying to read from it occasionally. If read() fails with errno set to EAGAIN (or EWOULDBLOCK), then no keys were pressed; a return value of 0 means end of file. This method has some problems, though. I've never used stdio.h functions on a FILE * after having set the underlying file descriptor (an int) to non-blocking, but I suspect that they may act oddly. You could avoid the stdio functions and use read instead. There is still an issue I read about once where the block/non-block flag could be changed by another process if you forked and exec-ed a new program that had access to a copy of that file descriptor. I'm not sure if this is a problem on all systems. Non-blocking mode can be set or cleared with an fcntl call.
But you could use one of the polling functions with a very small (0) timeout to see if there is data ready. The poll system call is probably the simplest, but there is also select. Various operating systems have other polling functions.
#include <poll.h>
...
/* return 0 if no data is available on stdin.
> 0 if there is data ready
< 0 if there is an error
*/
int poll_stdin(void) {
    struct pollfd pfd = { .fd = 0, .events = POLLIN };
    /* Since we only ask for POLLIN we assume that that was the only thing that
     * the kernel would have put in pfd.revents */
    return poll(&pfd, 1, 0); /* timeout 0: return immediately */
}
You can call this function within one of your threads, and for as long as it returns 0 you just keep on going. When it returns a positive number, you need to read a character from stdin to see what it was. Note that if you are using the stdio functions on stdin elsewhere, there could actually be other characters already buffered up in front of the new character. poll tells you that the operating system has something new for you, not what C's stdio has.
If you are regularly reading from standard input in other threads then things just get messy. I'm assuming you aren't doing that (because if you are and it works correctly you probably wouldn't be asking this question).
You would have a thread listening for keyboard input, and then it would join() the other threads when receiving # as input.
Another way is to trap SIGINT and use it to handle the shutdown of your application.
The way I would do it is to keep a global int "should_die" or something, whose range is 0 or 1, and another global int "died," which keeps track of the number of threads terminated. should_die and died are both initially zero. You'll also need two semaphores to provide mutex around the globals.
At a certain point, a thread checks the should_die variable (after acquiring the mutex, of course). If it should die, it acquires the died_mutex, ups the died count, releases the died_mutex, and dies.
The main initial thread periodically wakes up, checks that the number of threads that have died is less than the number of threads, and goes back to sleep. The main thread dies when all the other threads have checked in.
If the main thread doesn't spawn all the threads itself, a small modification would be to have "threads_alive" instead of "died". threads_alive is incremented when a thread forks, and decremented when the thread dies.
In general, terminating a multithreaded operation cleanly is a pain in the butt, and besides special cases where you can use things like the semaphore barrier design pattern, this is the best I've heard of. I'd love to hear it if you find a better, cleaner one.
~anjruu
In general, I have threads waiting on a set of events and one of those events is the termination event.
In the main thread, when I have triggered the termination event, I then wait on all the threads having exited.
SIGINT is actually not that difficult to handle and is often used for graceful termination. You need a signal handler and a way to tell all the threads that it's time to stop. One global flag that threads check in their loops and the signal handler sets might do. Same approach works for "on user command" termination, though you need a way to get the input from the terminal - either poll in a dedicated thread, or again, set the terminal to generate a signal for you.
The tricky part is to unblock waiting threads. You have to carefully design the notification protocol of who tells who to stop and what they need to do - put dummy message into a queue, set a flag and signal a cv, etc.
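One way to picture the "set a flag and signal a cv" variant is the sketch below, assuming every worker blocks on the same condition variable. shutdown_demo(), stop and exited are names made up for the example, and the workers do nothing except wait for the shutdown request.

```c
#include <pthread.h>
#include <stdbool.h>

#define NTHREADS 3

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t wake = PTHREAD_COND_INITIALIZER;
static bool stop;      /* protected by lock: the termination event */
static int exited;     /* protected by lock: threads that have checked out */

static void *worker(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&lock);
    while (!stop)                          /* blocked until told to stop */
        pthread_cond_wait(&wake, &lock);
    exited++;                              /* per-thread clean-up would go here */
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Starts the workers, requests shutdown, and returns how many checked out. */
int shutdown_demo(void)
{
    pthread_t tid[NTHREADS];
    int started = 0;

    stop = false;
    exited = 0;
    while (started < NTHREADS &&
           pthread_create(&tid[started], NULL, worker, NULL) == 0)
        started++;

    pthread_mutex_lock(&lock);
    stop = true;                           /* set the flag under the mutex ... */
    pthread_mutex_unlock(&lock);
    pthread_cond_broadcast(&wake);         /* ... then wake every blocked thread */

    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);        /* wait for all threads to exit */
    return exited;
}
```

Threads blocked on something other than this condition variable (a semaphore, a queue, a read) each need their own wake-up step, such as a sem_post() or a dummy message, which is the protocol design mentioned above.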