Rerunning cancelled pthread - c

My problem is that I cannot reuse cancelled pthread. Sample code:
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>
pthread_t alg;
pthread_t stop_alg;
int thread_available;
void *stopAlgorithm() {
while (1) {
sleep(6);
if (thread_available == 1) {
pthread_cancel(alg);
printf("Now it's dead!\n");
thread_available = 0;
}
}
}
void *algorithm() {
while (1) {
printf("I'm here\n");
}
}
int main() {
thread_available = 0;
pthread_create(&stop_alg, NULL, stopAlgorithm, 0);
while (1) {
sleep(1);
if (thread_available == 0) {
sleep(2);
printf("Starting algorithm\n");
pthread_create(&alg, NULL, algorithm, 0);
thread_available = 1;
}
}
}
This sample should create two threads: one is created at program start and tries to cancel the second shortly after it starts; the second should be restarted as soon as it has been cancelled and print "I'm here". But once the algorithm thread has been cancelled, it doesn't start again: the program prints "Starting algorithm" and then does nothing, with no more "I'm here" messages. Could you please tell me how to start a cancelled (immediately stopped) thread again?
UPD: Thanks to your help I understood what the problem is. When I rerun the algorithm thread, pthread_create fails with error 11: "The system lacked the necessary resources to create another thread, or the system-imposed limit on the total number of threads in a process PTHREAD_THREADS_MAX would be exceeded.". Actually I have 5 threads, but only one is cancelled; the others stop via pthread_exit. So after the algorithm stopped and the program went to standby mode, I checked the status of all threads with pthread_join: every thread shows 0 (the cancelled one shows PTHREAD_CANCELED), which, as far as I understand, means that all threads stopped successfully. But another attempt to run the algorithm throws error 11 again. So I checked memory usage: in standby mode before the algorithm, 10428; during the algorithm, with all threads in use, 2026m; in standby mode after the algorithm stopped, 2019m. So even though the threads have stopped they still use memory, and pthread_detach didn't help with this. Are there any other ways to clean up after threads?
Also, sometimes on pthread_cancel my program crashes with "libgcc_s.so.1 must be installed for pthread_cancel to work"

Several points:
First, this is not safe:
int thread_available;
void *stopAlgorithm() {
while (1) {
sleep(6);
if (thread_available == 1) {
pthread_cancel(alg);
printf("Now it's dead!\n");
thread_available = 0;
}
}
}
It's not safe for at least two reasons. Firstly, you've not marked thread_available as volatile. This means that the compiler can optimise stopAlgorithm to read the variable once, and never reread it. Secondly, you haven't ensured access to it is atomic, or protected it by a mutex. Either declare it:
volatile sig_atomic_t thread_available;
(or similar), or better, protect it by a mutex.
But for the general case of triggering one thread from another, you are better off using a condition variable (and a mutex), using pthread_cond_wait or pthread_cond_timedwait in the listening thread, and pthread_cond_broadcast in the triggering thread.
Next, what's the point of the stopAlgorithm thread? All it does is cancel the algorithm thread after an unpredictable amount of time between 0 and 6 seconds. Why not just send the pthread_cancel from the main thread?
Next, do you care where your algorithm is when it is cancelled? If not, just pthread_cancel it. If so (and anyway, I think it's far nicer), regularly check a flag (either atomic and volatile as above, or protected by a mutex) and pthread_exit if it's set. If your algorithm does big chunks every second or so, then check it then. If it does lots of tiny things, check it (say) every 1,000 operations so taking the mutex doesn't introduce a performance penalty.
Lastly, if you cancel a thread (or if it pthread_exits), the way you start it again is simply to call pthread_create again. It's then a new thread running the same code.

Related

Only one thread ever acquiring semaphore

I have a program in which multiple threads are in a loop where they acquire a binary semaphore and then increase a global counter. However, by printing out the thread IDs, I notice that only one thread ever acquires the semaphore. Here's my MRE:
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>
#include <pthread.h>
#include <semaphore.h>
#define NUM_THREADS 10
#define MAX_COUNTER 100
struct threadCtx {
sem_t sem;
unsigned int counter;
};
static void *
threadFunc(void *args)
{
struct threadCtx *ctx = args;
pthread_t self;
bool done = false;
self = pthread_self();
while (!done) {
sem_wait(&ctx->sem);
if ( ctx->counter == MAX_COUNTER ) {
done = true;
}
else {
sleep(1);
printf("Thread %u increasing the counter to %u\n", (unsigned int)self, ++ctx->counter);
}
sem_post(&ctx->sem);
}
return NULL;
}
int main() {
pthread_t threads[NUM_THREADS];
struct threadCtx ctx = {.counter = 0};
sem_init(&ctx.sem, 0, 1);
for (int k=0; k<NUM_THREADS; k++) {
pthread_create(threads+k, NULL, threadFunc, &ctx);
}
for (int k=0; k<NUM_THREADS; k++) {
pthread_join(threads[k], NULL);
}
sem_destroy(&ctx.sem);
return 0;
}
The output is
Thread 1004766976 increasing the counter to 1
Thread 1004766976 increasing the counter to 2
Thread 1004766976 increasing the counter to 3
...
If I remove the call to sleep, the behavior is closer to what I would expect (i.e., the threads being woken up in a seemingly indeterminate manner). Why would this be?
David Schwartz's answer explains what is happening at a low level. That is to say, he's looking at it from the perspective of an OS developer or a hardware designer. Nothing wrong with that, but let's look at your program from the perspective of a Software Architect:
You've got multiple threads all executing the same loop. The loop locks the mutex,* it does some "work," and then it releases the mutex. OK, but what does it do next? Almost the very next thing that your loop does after releasing the mutex is it locks the mutex again. Your loop spends practically 100% of its time doing "work" with the mutex locked.
So, what's the point of running that same loop in multiple threads when there's never any opportunity for two or more threads to work at the same time?
If you want to use threads to do a parallel computation, you need to find/invent safe ways for the threads to do most of their work with the mutex unlocked. They should lock a mutex only for just long enough to post a result or to take another assignment.
Sometimes that means writing code that is less efficient than single threaded code would be. But suppose that program (A) has a single thread that makes almost 100% use of a CPU, while program (B) uses eight CPUs but only uses them with 50% efficiency. Which program is going to win?
* I know, your example uses a sem_t (semaphore) object. But "semaphore" is what you are using. "Mutex" is the role in which you are using it.
Why would this be?
Context switches are expensive and your implementation is, wisely, minimizing them. Your threads are all fighting over the same resource; trying to schedule them closely would make performance much worse, probably for the entire system.
Since the thread that keeps getting the semaphore never uses up its timeslice, it will keep getting the resource. It is your responsibility to write code to do the work that you want done. It's the implementation's responsibility to execute your code as efficiently as it can, and that's what it's doing.
Most likely, what's going on under the hood is this:
The thread that keeps getting the semaphore can always make forward progress except when it is sleeping. But when it is sleeping, no other thread that needs the semaphore can make forward progress.
The thread that keeps getting the semaphore never exhausts its timeslice because it sleeps before that happens.
So there is no reason for the implementation to ever block this thread other than when it is sleeping, meaning that no other thread can get the semaphore. If you don't want this thread to keep sleeping with the semaphore and blocking other threads, then write different code.

Threads, spawn regularly VS. trap in infinite temporised loop?

This isn't a technical question, but a conceptual one. My program needs to handle several tasks in background. In my case, I consider threads more appropriate than processes for several reasons :
Background tasks aren't heavy, but they have to be processed regularly.
All threads need to manipulate a shared resource. Complete processes would require setting up a shared memory segment, which isn't appropriate in my case (the resource doesn't have a fixed size). Of course, this resource is protected by a mutex.
Another thing I take into consideration is that the main() function needs to be able to end all backgrounds tasks when it wants to (which means joining threads).
Now, here are two implementations :
1 thread, looping inside.
void *my_thread_func(void* shared_ressource)
{
while(1){
do_the_job();
sleep(5);
}
}
// main()
pthread_create(&my_thread, NULL, my_thread_func, (void*)&shared_ressource);
pthread_kill(my_thread, 15);
// pthread_cancel(my_thread);
pthread_join(my_thread, NULL);
Note : In this case, main() needs to signal (or cancel) the thread before joining, otherwise it'll hang. This can be dangerous if the thread doesn't get time to sem_post before it gets terminated.
n threads, looping outside.
void *my_thread_func(void* shared_ressource)
{
do_the_job();
}
// main()
while(1){
pthread_create(&my_thread, NULL, my_thread_func, (void*)&shared_ressource);
pthread_join(my_thread, NULL);
sleep(5);
}
Note : In this case, main() wouldn't naturally hang on pthread_join, it would just have to kill its own continuous loop (using a "boolean" for instance).
Now, I would like some help comparing those two. Threads are lightweight structures, but is the spawning process too heavy for the second implementation ? Or is the infinite loop holding the thread when it shouldn't ? At the moment, I prefer the second implementation because it protects the semaphore : threads do not terminate before they sem_post it. My concern here is optimisation, not functionality.
Having your background threads continuously spawning and dying tends to be inefficient. It is usually much better to have some number of threads stay alive, servicing the background work as it becomes available.
However, it's often better to avoid thread cancellation, too. Instead, I advise using a condition variable and exit flag:
void *my_thread_func(void *shared_resource)
{
struct timespec timeout;
pthread_mutex_lock(&exit_mutex);
do
{
pthread_mutex_unlock(&exit_mutex);
do_the_job();
clock_gettime(CLOCK_REALTIME, &timeout);
timeout.tv_sec += 5;
pthread_mutex_lock(&exit_mutex);
if (!exit_flag)
pthread_cond_timedwait(&exit_cond, &exit_mutex, &timeout);
} while (!exit_flag);
pthread_mutex_unlock(&exit_mutex);
return NULL;
}
When the main thread wants the background thread to exit, it sets the exit flag and signals the condition variable:
pthread_mutex_lock(&exit_mutex);
exit_flag = 1;
pthread_cond_signal(&exit_cond);
pthread_mutex_unlock(&exit_mutex);
pthread_join(my_thread, NULL);
(You should actually strongly consider using CLOCK_MONOTONIC instead of the default CLOCK_REALTIME, because the former isn't affected by changes to the system clock. This requires using pthread_condattr_setclock() and pthread_cond_init() to set the clock used by the condition variable.)

Pthread - setting scheduler parameters

I wanted to use read-write locks from the pthread library in such a way that writers have priority over readers. I read in my man pages that
If the Thread Execution Scheduling option is supported, and the threads involved in the lock are executing with the scheduling policies SCHED_FIFO or SCHED_RR, the calling thread shall not acquire the lock if a writer holds the lock or if writers of higher or equal priority are blocked on the lock; otherwise, the calling thread shall acquire the lock.
so I wrote small function that sets up thread scheduling options.
void thread_set_up(int _thread)
{
struct sched_param *_param=malloc(sizeof (struct sched_param));
int *c=malloc(sizeof(int));
*c=sched_get_priority_min(SCHED_FIFO)+1;
_param->__sched_priority=*c;
long *a=malloc(sizeof(long));
*a=syscall(SYS_gettid);
int *b=malloc(sizeof(int));
*b=SCHED_FIFO;
if (pthread_setschedparam(*a,*b,_param) == -1)
{
//depending on which thread calls this functions, few thing can happen
if (_thread == MAIN_THREAD)
client_cleanup();
else if (_thread==ACCEPT_THREAD)
{
pthread_kill(params.main_thread_id,SIGINT);
pthread_exit(NULL);
}
}
}
sorry for those a,b,c but I tried to malloc everything, still I get SIGSEGV on the call to pthread_setschedparam, I am wondering why?
I don't know if these are the exact causes of your problems but they should help you home in on them.
(1) pthread_setschedparam returns a 0 on success and a positive number otherwise. So
if (pthread_setschedparam(*a,*b,_param) == -1)
will never execute. It should be something like:
if ((ret = pthread_setschedparam(*a, *b, _param)) != 0)
{ //yada yada
}
As an aside, it isn't 100% clear what you are doing but pthread_kill looks about as ugly a way to do it as possible.
(2) syscall(SYS_gettid) gets the OS thread ID. pthread_setschedparam expects the pthreads thread ID, which is different. The pthreads thread ID is returned by pthread_create and pthread_self in the datatype pthread_t. Change the pthread_setschedparam call to use this type and the proper values instead and see if things improve.
(3) You need to run as a privileged user to change the scheduling policy. Try running the program as root, via sudo, or similar.

how to run thread in main function infinitely without causing program to terminate

I have a function say void *WorkerThread ( void *ptr).
The function *WorkerThread(void *ptr) has an infinite loop which reads and writes continuously from the serial port
example
void *WorkerThread( void *ptr)
{
while(1)
{
// READS AND WRITES from Serial Port USING MUTEX_LOCK AND MUTEX_UNLOCK
} //while ends
}
The other function I wrote is ThreadTest
example
int ThreadTest()
{
pthread_t Worker;
int iret1;
pthread_mutex_init(&stop_mutex, NULL);
if( iret1 = pthread_create(&Worker, NULL, WorkerThread, NULL) == 0)
{
pthread_mutex_lock(&stop_mutex);
stopThread = true;
pthread_mutex_unlock(&stop_mutex);
}
if (stopThread != false)
stopThread = false;
pthread_mutex_destroy(&stop_mutex);
return 0;
}
In main function
I have something like
int main(int argc, char **argv)
{
fd = OpenSerialPort();
if( ConfigurePort(fd) < 0) return 0;
while (true)
{
ThreadTest();
}
return 0;
}
Now, when I run this sort of code with debug statements, it runs fine for a few hours and then prints a message like "can't able to create thread" and the application terminates.
Does anyone have an idea where I am making mistakes?
Also, is there a way to run ThreadTest in main without using while(true), as I am already using while(1) in WorkerThread to read and write infinitely?
All comments and criticism are welcome.
Thanks & regards,
SamPrat.
You are creating threads continually and might be hitting the limit on number of threads.
The pthread_create man page says:
EAGAIN Insufficient resources to create another thread, or a system-imposed
limit on the number of threads was encountered. The latter case may
occur in two ways: the RLIMIT_NPROC soft resource limit (set via
setrlimit(2)), which limits the number of process for a real user ID,
was reached; or the kernel's system-wide limit on the number of
threads, /proc/sys/kernel/threads-max, was reached.
You should rethink the design of your application. Creating an unbounded number of threads is not a good design.
[UPDATE]
You are using a lock to set an integer variable:
pthread_mutex_lock(&stop_mutex);
stopThread = true;
pthread_mutex_unlock(&stop_mutex);
However, this is not required, as setting an int is atomic (on probably all architectures?). You should use a lock when you are doing non-atomic operations, e.g. test and set:
take_lock ();
if (a != 1)
a = 1;
release_lock ();
You create a new thread each time ThreadTest is called, and never destroy these threads. So eventually you (or the OS) run out of thread handles (a limited resource).
Threads consume resources (memory & processing), and you're creating a thread each time your main loop calls ThreadTest(). And resources are finite, while your loop is not, so this will eventually throw a memory allocation error.
You should get rid of the main loop, and make ThreadTest return the newly created thread (pthread_t). Finally, make main wait for the thread termination using pthread_join.
Your finished pthreads are zombies and still consume system resources. On Linux you can use ulimit -a to list your current limits (ulimit -s shows the stack size and ulimit -u the maximum number of user processes), but they are not infinite either. Use pthread_join() to wait for a thread to finish and release the resources it consumed.
Do you know that select() is able to read from multiple (device) handles? You can also define a user-defined source to stop select(), or a timeout. With this in mind you can start one thread and let it sleep while nothing occurs. If you intend to stop it, you can send an event (or use a timeout) to break out of the select() call.
An additional design concept you should consider is message queues to share information between your main application and/or pthread. select() is compatible with this technique, so you can use one concept for all data sources (devices and message queues).
Here is a reference to good pthread reading and the best pthread book available: Programming with POSIX(R) Threads, ISBN-13:978-0201633924
Looks like you've not called pthread_join() which cleans up state after non-detached threads are finished. I'd speculate that you've hit some per process resource limit here as a result.
As others have noted this is not great design though - why not re-use the thread rather than creating a new one on every loop?

Conditional wait with pthreads

I seem to be running into a possible deadlock with a pthreads condition variable.
Here is the code
thread function(){
for (condition){
do work
/* should the thread continue? */
if (exit == 1){
break; /* exit for */
}
} /* end for */
pthread_mutex_lock(&mtxExit);
exit = 0;
pthread_cond_signal(&condVar);
pthread_mutex_unlock(&mtxExit);
}
The main function is as follows:
function main(){
if (thread is still active){
pthread_mutex_lock(&mtxExit);
exit = 1;
pthread_mutex_unlock(&mtxExit);
} /* end if */
while (exit == 1){
pthread_mutex_lock(&mtxExit);
/* check again */
if (exit == 1)
pthread_cond_wait(&condVar, &mtxExit);
pthread_mutex_unlock(&mtxExit);
}
create new thread()
....
}
The code is always getting stuck at cond_wait. :(
EDIT:
Let me add some clarification to the thread to explain what I am doing.
At any given time, I need only one thread running. I have a function that starts the thread, tells it what to do and the main thread continues it work.
The next time the main thread decides it needs to spawn another thread, it has to make sure the thread that was previously started has exited. I cannot have two threads alive at the same time as they will interfere with each other. This is by design and by definition of the problem I am working on.
That is where I am running in to problems.
This is my approach:
Start the thread, let it do its job.
the thread checks in every step of its job to see if it is still relevant. This is where "exit" comes in to picture. The main thread sets "exit" to 1, if it needs to tell the thread that it is no longer relevant.
In most cases, the thread will exit before the main thread decides to spawn another thread. But I still need to factor in the case that the thread is still alive by the time the main thread is ready to start another one.
So the main thread sets the value of "exit" and needs to wait for the thread to exit. I dont want to use pthread_kill with 0 as signal because then main thread will be in a loop wasting CPU cycles. I need the main thread to relinquish control and sleep/wait till the thread exits.
Since I only need one thread at a time, I dont need to worry about scaling to more threads. The solution will never have more than one thread. I just need a reliable mechanism to test if my thread is still alive, if it is, signal it to exit, wait for it to exit and start the next one.
From my testing, it looks like the main thread still enters the wait on the condition variable even though the thread may have exited, or the signal is not getting delivered to the main thread at all, and it waits there forever. In some cases, in the debugger, I see that the value of exit is set to 0 and the main thread is still waiting for the signal. There seems to be a race condition somewhere.
I am not a fan of how I set up the code right now, its too messy. Its only a proof of concept right now, I will move to a better solution soon. My challenge is to reliably signal the thread to exit, wait on it to exit.
I appreciate your time.
Did you forget to initialize your condition variable?
pthread_cond_init(&condVar, NULL)
while (exit == 1) {
In the code you quote, as quoted, I do not see any particular problem. It is not clean, but it appears functional. This leads me to believe that somewhere else you are setting exit to 0 without signaling the condition, or that the thread is getting stuck somewhere while doing its work.
But considering the comments, which hint that you are trying to signal one thread to terminate before starting another, I think you are doing it wrong. Generally, pthread condition signaling shouldn't be relied upon when a signal must not be missed. Though it seems that the state variable exit covers that, it is still, IMO, a wrong application of pthread conditions.
In that case you can try using a semaphore. While terminating, the thread increments the termination semaphore so that main can wait on (decrement) the semaphore.
thread function()
{
for (condition)
{
do work
/* should the thread continue? */
if (exit == 1) {
break; /* exit for */
}
} /* end for */
sem_post(&termSema);
}
function main()
{
if (thread is still active)
{
exit = 1;
sem_wait(&termSema);
exit = 0;
}
create new thread()
....
}
As a general remark, I suggest looking at some thread pool implementations, because using a state variable to synchronize threads is still wrong, is error-prone, and doesn't scale to more than one thread.
When the code is stuck in pthread_cond_wait, is exit 1 or 0? If exit is 1, it should be stuck.
If exit is 0, one of two things is most likely the case:
1) Some code set exit to 0 but didn't signal the condition variable.
2) Some thread blocked on pthread_cond_wait, consumed a signal, but didn't do whatever it is you needed done.
You have all sorts of timing problems with your current implementation (hence the problems).
To ensure that the thread has finished (and its resources have been released), you should call pthread_join().
There is no need for a pthread_cond_t here.
It might also make more sense to use pthread_cancel() to notify the thread that it is no longer required, rather than a flag like you are currently doing.
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>
void *thread_func(void *arg) {
int i;
for (i = 0; i < 10; i++) {
/* protect any regions that must not be cancelled... */
pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, NULL);
/* very important work */
printf("%d\n", i);
pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, NULL);
/* force a check to see if we're finished */
pthread_testcancel();
/* sleep (for clarity in the example) */
sleep(1);
}
return NULL;
}
int main(void) {
int ret;
pthread_t tid;
ret = pthread_create(&tid, NULL, thread_func, NULL);
if (ret != 0) {
printf("pthread_create() failed %d\n", ret);
exit(1);
}
sleep(5);
ret = pthread_cancel(tid);
if (ret != 0) {
printf("pthread_cancel() failed %d\n", ret);
exit(1);
}
ret = pthread_join(tid, NULL);
if (ret != 0) {
printf("pthread_join() failed %d\n", ret);
exit(1);
}
printf("finished...\n");
}
It's also worth noting:
exit() is a standard library function, so you should not declare a variable of your own with the same name.
Depending on your specific situation, it might make sense to keep a single thread alive at all times and provide it with jobs to do, rather than creating and cancelling threads continuously (research 'thread pools').

Resources