I am trying to implement a single-writer, multiple-reader queue in pthreads. The synchronization pattern works but eventually deadlocks after repeated requests (I believe). It works indefinitely with one writer boss thread and one reader worker thread, but with one writer boss thread and multiple reader worker threads it eventually hangs. When I backtrace in gdb, I see this:
// Boss:
Thread 1 (Thread 0x7ffff7fd1780 (LWP 21029)):
#0 0x00007ffff7bc44b0 in futex_wait
...
// Worker:
Thread 2 (Thread 0x7ffff42ff700 (LWP 21033)):
#0 0x00007ffff7bc39f3 in futex_wait_cancelable
...
// Worker:
Thread 3 (Thread 0x7ffff3afe700 (LWP 21034)):
#0 0x00007ffff7bc39f3 in futex_wait_cancelable
...
To me this seems like the workers are waiting on the signal, and the boss is hanging on the signal and not sending it. But, I don't know why that would happen.
I have tried this synchronization pattern:
// Boss:
pthread_mutex_lock(&queue_mutex);
queue_push(&queue, data);
pthread_cond_signal(&queue_condition);
pthread_mutex_unlock(&queue_mutex);
return;
// Worker(s):
pthread_mutex_lock(&queue_mutex);
while((queue_isempty(&queue)) > 0) {
pthread_cond_wait(&queue_condition, &queue_mutex);
}
data_t *data = queue_pop(&queue);
pthread_mutex_unlock(&queue_mutex);
do_work(data);
To the best of my knowledge, this is the correct synchronization pattern. But, evidence suggests I am not applying the correct pattern. Could someone help me understand why this single-writer, multiple-reader queue access in pthreads would not work as I intend?
Here is my best guess based on the available snippet. The deadlock is probably caused by the workers holding the lock while waiting for the signal, so the boss gets no chance to take the lock (while a worker is holding it) in order to send the signal. The following should avoid the deadlock.
// Boss:
pthread_mutex_lock(&queue_mutex);
queue_push(&queue, data);
pthread_mutex_unlock(&queue_mutex);
pthread_cond_signal(&queue_condition);
return;
// Worker(s):
pthread_mutex_lock(&queue_mutex);
while ((queue_isempty(&queue)) > 0) { //> assume queue_isempty(const void *); the wait must be done with the mutex held
    pthread_cond_wait(&queue_condition, &queue_mutex);
}
data_t *data = queue_pop(&queue);
pthread_mutex_unlock(&queue_mutex);
do_work(data);
Related
I had an exercise about threads, locks, and condition variables in C. I needed to write a program that gets data, turns it into a linked list, starts 3 threads that each calculate the result for each node in the list, and has the main thread print the results after everyone has finished.
This is the main function:
int thread_finished_count;
// Lock and Conditional variable
pthread_mutex_t list_lock;
pthread_mutex_t thread_lock;
pthread_cond_t thread_cv;
int main(int argc, char const *argv[])
{
node *list;
int pairs_count, status;
thread_finished_count = 0;
/* get the data and start the threads */
node *head = create_numbers(argc, argv, &pairs_count);
list = head; // backup head for results
pthread_t *threads = start_threads(&list);
/* wait for threads and destroy lock */
status = pthread_cond_wait(&thread_cv, &list_lock);
chcek_status(status);
status = pthread_mutex_destroy(&list_lock);
chcek_status(status);
status = pthread_mutex_destroy(&thread_lock);
chcek_status(status);
/* print result in original list */
print_results(head);
/* cleanup */
wait_for_threads(threads, NUM_THREADS);
free_list(head);
free(threads);
return EXIT_SUCCESS;
}
Please note that the create_numbers function is working properly, and the list is working as intended.
Here is the start_threads and thread_function code:
pthread_t *start_threads(node **list)
{
int status;
pthread_t *threads = (pthread_t *)malloc(sizeof(pthread_t) * NUM_THREADS);
check_malloc(threads);
for (int i = 0; i < NUM_THREADS; i++)
{
status = pthread_create(&threads[i], NULL, thread_function, list);
chcek_status(status);
}
return threads;
}
void *thread_function(node **list)
{
int status, self_id = pthread_self();
printf("im in %u\n", self_id);
node *currentNode;
while (1)
{
if (!(*list))
break;
status = pthread_mutex_lock(&list_lock);
chcek_status(status);
printf("list location %p thread %u\n", *list, self_id);
if (!(*list))
{
status = pthread_mutex_unlock(&list_lock);
chcek_status(status);
break;
}
currentNode = (*list);
(*list) = (*list)->next;
status = pthread_mutex_unlock(&list_lock);
chcek_status(status);
currentNode->gcd = gcd(currentNode->num1, currentNode->num2);
status = usleep(10);
chcek_status(status);
}
status = pthread_mutex_lock(&thread_lock);
chcek_status(status);
thread_finished_count++;
status = pthread_mutex_unlock(&thread_lock);
chcek_status(status);
if (thread_finished_count != 3)
return NULL;
status = pthread_cond_signal(&thread_cv);
chcek_status(status);
return NULL;
}
void chcek_status(int status)
{
if (status != 0)
{
fputs("pthread_function() error\n", stderr);
exit(EXIT_FAILURE);
}
}
Note that self_id is used for debugging purposes.
My problem
My main problem is about splitting the work. Each thread should take an element from the global linked list, calculate its gcd, then go on and take the next element. I get this behavior only if I add usleep(10) after I unlock the mutex in the while loop. If I don't add the usleep, the FIRST thread does all the work while the other threads just wait and come in after all the work has been done.
Please note: I considered the possibility that the first thread is created, and by the time the second thread is created the first one has already finished all the jobs. This is why I added the "I'm in #threadID" print with usleep(10) when every thread is created. They all come in, but only the first one does all the jobs.
Here is the example output if I do the usleep after the mutex unlock (notice the different thread IDs):
with usleep
./v2 nums.txt
im in 1333593856
list location 0x7fffc4fb56a0 thread 1333593856
im in 1316685568
im in 1325139712
list location 0x7fffc4fb56c0 thread 1333593856
list location 0x7fffc4fb56e0 thread 1316685568
list location 0x7fffc4fb5700 thread 1325139712
list location 0x7fffc4fb5720 thread 1333593856
list location 0x7fffc4fb5740 thread 1316685568
list location 0x7fffc4fb5760 thread 1325139712
list location 0x7fffc4fb5780 thread 1333593856
list location 0x7fffc4fb57a0 thread 1316685568
list location 0x7fffc4fb57c0 thread 1325139712
list location 0x7fffc4fb57e0 thread 1333593856
list location 0x7fffc4fb5800 thread 1316685568
list location (nil) thread 1325139712
list location (nil) thread 1333593856
...
normal result output
...
And that's the output if I comment out the usleep after the mutex unlock (notice the same thread ID):
without usleep
./v2 nums.txt
im in 2631730944
list location 0x7fffe5b946a0 thread 2631730944
list location 0x7fffe5b946c0 thread 2631730944
list location 0x7fffe5b946e0 thread 2631730944
list location 0x7fffe5b94700 thread 2631730944
list location 0x7fffe5b94720 thread 2631730944
list location 0x7fffe5b94740 thread 2631730944
list location 0x7fffe5b94760 thread 2631730944
list location 0x7fffe5b94780 thread 2631730944
list location 0x7fffe5b947a0 thread 2631730944
list location 0x7fffe5b947c0 thread 2631730944
list location 0x7fffe5b947e0 thread 2631730944
list location 0x7fffe5b94800 thread 2631730944
im in 2623276800
im in 2614822656
...
normal result output
...
My second question is about the order in which the threads work. My exercise asks me not to use join to synchronize the threads (only to use it at the end to "free resources"), but to use that condition variable instead.
My goal is that each thread takes an element, does the calculation, and in the meanwhile another thread goes in and takes another element, so that a different thread handles each element (or at least close to that).
Thanks for reading, and I appreciate your help.
First, you are doing the gcd() work while holding the lock... so (a) only one thread will do any work at any one time, though (b) that does not entirely explain why only one thread appears to do (nearly) all the work -- as KamilCuk says, it may be that there is so little work to do, that it's (nearly) all done before the second thread wakes up properly. [More exotic, there may be some latency between thread 'a' unlocking the mutex and another thread starting to run, such that thread 'a' can acquire the mutex before another thread gets there.]
POSIX says that when a mutex is unlocked, if there are waiters then "the scheduling policy shall determine which thread shall acquire the mutex". The default "scheduling policy" is (to the best of my knowledge) implementation defined.
You could try a couple of things: (1) use a pthread_barrier_t to hold all the threads at the start of thread_function() until they are all running; (2) use sched_yield(void) after pthread_mutex_unlock() to prompt the system into running the newly runnable thread.
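For example, here is a minimal standalone sketch of option (1); the start_barrier name and the trivial worker are mine, standing in for the top of thread_function(), not part of the original code:
#include <pthread.h>
#include <stdio.h>

#define NUM_THREADS 3

static pthread_barrier_t start_barrier;   /* hypothetical name, not in the original code */

static void *worker(void *arg)
{
    /* Every worker blocks here until all NUM_THREADS workers have arrived,
       so no single thread can race ahead and drain the whole list alone. */
    pthread_barrier_wait(&start_barrier);
    printf("worker %ld released\n", (long)arg);
    return NULL;
}

int main(void)
{
    pthread_t t[NUM_THREADS];
    pthread_barrier_init(&start_barrier, NULL, NUM_THREADS);
    for (long i = 0; i < NUM_THREADS; i++)
        pthread_create(&t[i], NULL, worker, (void *)i);
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(t[i], NULL);
    pthread_barrier_destroy(&start_barrier);
    return 0;
}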
Second, you should not under any circumstance treat a 'condition variable' as a signal. For main() to know that all threads have finished you need a count -- which could be a pthread_barrier_t; or it could be simple integer, protected by a mutex, with a 'condition variable' to hold the main thread on while it waits; or it could be a count (in main()) and a semaphore (posted once by each thread as it exits).
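For instance, the semaphore variant might be sketched like this (done_sem and the helper functions are illustrative names, not part of the original code; sem_init(&done_sem, 0, 0) would be called in main() before the workers are started):
#include <semaphore.h>

#define NUM_THREADS 3

static sem_t done_sem;              /* hypothetical: sem_init(&done_sem, 0, 0) in main() first */

static void worker_done(void)       /* each worker calls this exactly once, just before returning */
{
    sem_post(&done_sem);
}

static void wait_for_workers(void)  /* main(): returns only after every worker has posted */
{
    for (int i = 0; i < NUM_THREADS; i++)
        sem_wait(&done_sem);
}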
Third, you show pthread_cond_wait(&cv, &lock); in main(). At that point main() must own lock... and it matters when that happened. But: as it stands, the first thread to find the list empty will kick cv, and main() will proceed, even though other threads are still running. Though once main() does re-acquire lock, any threads which are then still running will either be exiting or will be stuck on the lock. (It's a mess.)
In general, the template for using a 'condition variable' is:
pthread_mutex_lock(&...lock) ;
while (!(... thing we need ...))
pthread_cond_wait(&...cond_var, &...lock) ;
... do stuff now we have what we need ....
pthread_mutex_unlock(&...lock) ;
NB: a 'condition variable' does not have a value... despite the name, it is not a flag to signal that some condition is true. A 'condition variable' is, essentially, a queue of threads waiting to be re-started. When a 'condition variable' is signaled, at least one waiting thread will be re-started -- but if there are no threads waiting, nothing happens, in particular the (so called) 'condition variable' retains no memory of the signal.
In the new code, following the above template, main() should:
/* wait for threads .... */
status = pthread_mutex_lock(&thread_lock);
chcek_status(status);
while (thread_finished_count != 3)
{
pthread_cond_wait(&thread_cv, &thread_lock) ;
chcek_status(status);
} ;
status = pthread_mutex_unlock(&thread_lock) ;
chcek_status(status);
So what is going on here?
main() is waiting for thread_finished_count == 3
thread_finished_count is a shared variable "protected" by the thread_lock mutex.
...so it is incremented in thread_function() under the mutex.
...and main() must also read it under the mutex.
if main() finds thread_finished_count != 3 it must wait.
to do that it does: pthread_cond_wait(&thread_cv, &thread_lock), which:
unlocks thread_lock
places the thread on the thread_cv queue of waiting threads.
and it does those atomically.
when thread_function() does the pthread_cond_signal(&thread_cv) it wakes up the waiting thread.
when the main() thread wakes up, it will first re-acquire the thread_lock...
...so it can then proceed to re-read thread_finished_count, to see if it is now 3.
FWIW: I recommend not destroying the mutexes etc until after all the threads have been joined.
I have looked deeper into how glibc (v2.30 on Linux & x86_64, at least) implements pthread_mutex_lock() and _unlock().
It turns out that _lock() works something like this:
if (atomic_cmp_xchg(mutex->lock, 0, 1))
    return <OK> ;                     // mutex->lock was 0, is now 1
while (1)
{
    if (atomic_xchg(mutex->lock, 2) == 0)
        return <OK> ;                 // mutex->lock was 0, is now 2
    ...do FUTEX_WAIT(2)...            // suspend thread iff mutex->lock == 2...
} ;
And _unlock() works something like this:
if (atomic_xchg(mutex->lock, 0) == 2)   // set mutex->lock == 0
    ...do FUTEX_WAKE(1)...              // if there may be waiter(s), wake 1
Now:
mutex->lock: 0 => unlocked, 1 => locked-but-no-waiters, 2 => locked-with-waiter(s)
'locked-but-no-waiters' optimizes for the case where there is no lock contention and there is no need to do FUTEX_WAKE in _unlock().
the _lock()/_unlock() functions are in the library -- they are not in the kernel.
...in particular, the ownership of the mutex is a matter for the library, not the kernel.
FUTEX_WAIT(2) is a call to the kernel, which will place the thread on a pending queue associated with the mutex, unless mutex->lock != 2.
The kernel checks for mutex->lock == 2 and adds the thread to the queue atomically. This deals with the case of _unlock() being called after the atomic_xchg(mutex->lock, 2).
FUTEX_WAKE(1) is also a call to the kernel, and the futex man page tells us:
FUTEX_WAKE (since Linux 2.6.0)
This operation wakes at most 'val' of the waiters that are waiting ... No guarantee is provided about which waiters are awoken (e.g., a waiter with a higher scheduling priority is not guaranteed to be awoken in preference to a waiter with a lower priority).
where 'val' in this case is 1.
Although the documentation says "no guarantee about which waiters are awoken", the queue appears to be at least FIFO.
Note especially that:
_unlock() does not pass the mutex to the thread started by the FUTEX_WAKE.
once woken up, the thread will again try to obtain the lock...
...but may be beaten to it by any other running thread -- including the thread which just did the _unlock().
I believe this is why you have not seen the work being shared across the threads. There is so little work for each one to do, that a thread can unlock the mutex, do the work and be back to lock the mutex again before a thread woken up by the unlock can get going and succeed in locking the mutex.
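As an experiment, you could nudge the scheduler right after the unlock, as suggested earlier; this is only a sketch of the relevant part of the worker loop, reusing the OP's names and adding sched_yield():
#include <sched.h>   /* for sched_yield() */

    /* inside thread_function()'s loop, after taking a node off the list: */
    status = pthread_mutex_unlock(&list_lock);
    chcek_status(status);
    sched_yield();   /* give a thread woken by the unlock a chance to run and
                        grab list_lock before this thread loops round and relocks it */
    currentNode->gcd = gcd(currentNode->num1, currentNode->num2);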
I am trying to better understand how to use pthread_cond_wait() and how it works.
I am just looking for a bit of clarification of an answer I saw on this site.
The answer is the last reply on this page
understanding of pthread_cond_wait() and pthread_cond_signal()
I am wondering how this would look with three threads. Imagine Thread 1 wants to tell Thread 2 and Thread 3 to wake up.
For example
pthread_mutex_t mutex;
pthread_cond_t condition;
Thread 1:
pthread_mutex_lock(&mutex);
/*Initialize things*/
pthread_mutex_unlock(&mutex);
pthread_cond_signal(&condition); //wake up thread 2 & 3
/*Do other things*/
Thread 2:
pthread_mutex_lock(&mutex); //mutex lock
while(!condition){
pthread_cond_wait(&condition, &mutex); //wait for the condition
}
pthread_mutex_unlock(&mutex);
/*Do work*/
Thread 3:
pthread_mutex_lock(&mutex); //mutex lock
while(!condition){
pthread_cond_wait(&condition, &mutex); //wait for the condition
}
pthread_mutex_unlock(&mutex);
/*Do work*/
I am wondering if such a setup is valid. Say that Thread 2 and Thread 3 rely on some initialization options that Thread 1 needs to process.
First: If you wish thread #1 to wake up thread #2 and #3, it should use pthread_cond_broadcast.
Second: The setup is valid (with broadcast). Thread #2 and #3 are scheduled for wakeup and they will try to reacquire the mutex as part of waking up. One of them will, the other will have to wait for the mutex to be unlocked again. So thread #2 and #3 access the critical section sequentially (to re-evaluate the condition).
If I understand correctly, you want thr#2 and thr#3 ("workers") to block until thr#1 ("boss") has performed some initialization.
Your approach is almost workable, but you need to broadcast rather than signal, and are missing a predicate variable separate from your condition variable. (In the question you reference, the predicate and condition variables were very similarly named.) For example:
pthread_mutex_t mtx;
pthread_cond_t cv;
int initialized = 0; // our predicate of interest, signaled via cv
...
// boss thread
initialize_things();
pthread_mutex_lock(&mtx);
initialized = 1;
pthread_cond_broadcast(&cv);
pthread_mutex_unlock(&mtx);
...
// worker threads
pthread_mutex_lock(&mtx);
while (! initialized) {
pthread_cond_wait(&cv, &mtx);
}
pthread_mutex_unlock(&mtx);
do_things();
That's common enough that you might want to combine the mutex/cv/flag into a single abstraction. (For inspiration, see Python's Event object.) POSIX barriers are another way of synchronizing threads: every thread waits until all threads have "arrived." pthread_once is another option: it runs a function once and only once, no matter how many threads call it.
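A minimal sketch of such an abstraction in C might look like the following; the event_t type and function names are illustrative, not a standard API:
#include <pthread.h>

typedef struct {
    pthread_mutex_t mtx;
    pthread_cond_t  cv;
    int             set;           /* the predicate: 0 = not yet signaled, 1 = signaled */
} event_t;

static void event_init(event_t *e)
{
    pthread_mutex_init(&e->mtx, NULL);
    pthread_cond_init(&e->cv, NULL);
    e->set = 0;
}

static void event_set(event_t *e)      /* boss: mark the event and wake every waiter */
{
    pthread_mutex_lock(&e->mtx);
    e->set = 1;
    pthread_cond_broadcast(&e->cv);
    pthread_mutex_unlock(&e->mtx);
}

static void event_wait(event_t *e)     /* workers: block until the event has been set */
{
    pthread_mutex_lock(&e->mtx);
    while (!e->set)
        pthread_cond_wait(&e->cv, &e->mtx);
    pthread_mutex_unlock(&e->mtx);
}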
pthread_cond_signal wakes up one (random) thread waiting on the cond variable. If you want to wake up all threads waiting on this cond variable use pthread_cond_broadcast.
Depending on what you are doing in the critical section, there might also be another solution apart from the ones in the previous answers.
Suppose thread1 executes first (i.e. it is the creator thread), and suppose thread2 and thread3 do not perform any writes to the shared resource in the critical section. In this case, with pthread_cond_wait you are forcing one thread to wait for the other when there is actually no need.
You can use a read-write lock of type pthread_rwlock_t. Basically, thread1 takes a write lock, so the other threads block when they try to acquire a read lock.
The functions for this lock are quite self-explanatory:
//They return: 0 if OK, error number on failure
int pthread_rwlock_init(pthread_rwlock_t *restrict rwlock,
const pthread_rwlockattr_t *restrict attr);
int pthread_rwlock_destroy(pthread_rwlock_t *rwlock);
int pthread_rwlock_rdlock(pthread_rwlock_t *rwlock);
int pthread_rwlock_wrlock(pthread_rwlock_t *rwlock);
int pthread_rwlock_unlock(pthread_rwlock_t *rwlock);
int pthread_rwlock_tryrdlock(pthread_rwlock_t *rwlock);
int pthread_rwlock_trywrlock(pthread_rwlock_t *rwlock);
When thread1 has finished its initialization, it unlocks. The other threads then take a read lock, and since multiple read locks can coexist, they execute simultaneously. Again: this is only valid if thread2 and thread3 do not write to the shared resources.
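A sketch of that approach (the lock name is illustrative); the important detail is that thread1 must take the write lock before the other threads are created:
#include <pthread.h>

static pthread_rwlock_t init_lock = PTHREAD_RWLOCK_INITIALIZER;  /* illustrative name */

/* thread1 (the creator) */
static void boss(void)
{
    pthread_rwlock_wrlock(&init_lock);   /* taken before thread2/thread3 are created */
    /* ... initialize things ... */
    pthread_rwlock_unlock(&init_lock);   /* from now on read locks succeed */
}

/* thread2 and thread3 */
static void worker(void)
{
    pthread_rwlock_rdlock(&init_lock);   /* blocks until the boss drops the write lock */
    pthread_rwlock_unlock(&init_lock);   /* several readers may hold it at the same time */
    /* ... read-only work on the now-initialized shared data ... */
}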
Let's imagine there is a thread which calls pthread_cond_wait and waits for signals:
pthread_mutex_lock(&m);
.....
while(run)
{
do {
pthread_cond_wait(&cond,&m);
} while(!got_signal);
got_signal = false;
do_something();
}
And there are multiple threads which are supposed to deliver signals:
pthread_mutex_lock(&m);
got_signal = true;
pthread_cond_signal(&cond);
pthread_mutex_unlock(&m);
Is this solution safe enough? What will happen if multiple threads send signals? Does m mutex suffice to guarantee all signals are serialized and won't be lost?
In the code you posted, the only place where threads are allowed to call pthread_cond_signal() is when they can acquire m, and that can only happen when your waiting thread is blocked on pthread_cond_wait().
However, it might happen that two signaling threads acquire the mutex after each other, before the waiting thread is woken up and can acquire the mutex. In that case you'll lose the second signal (and any further signals that might arrive after that, before the waiting thread runs), since your waiting thread can only "see" that it has been signaled, but not how many times this has happened.
To make sure that you don't lose any signals, you could use a counter instead of your got_signal flag:
Waiting thread:
pthread_mutex_lock(&m);
.....
while(run)
{
while(signal_count == 0) {
pthread_cond_wait(&cond,&m);
}
--signal_count;
do_something();
}
Signaling threads:
pthread_mutex_lock(&m);
++signal_count;
pthread_cond_signal(&cond);
pthread_mutex_unlock(&m);
(Also note that I've replaced the do...while loop with a while loop, to make sure that pthread_cond_wait() isn't called if there's still an unprocessed signal left.)
Now, if multiple threads end up signaling straight after each other, signal_count will become more than one, which will cause the waiting thread to run its do_something() multiple times instead of just once.
I have a capture program which, in addition to capturing data and writing it into a file, also prints some statistics. The function that prints the statistics
static void report(void)
{
/*Print statistics*/
}
is called roughly every second using an ALARM that expires every second. So the program looks like this:
void capture_program()
{
pthread_t report_thread
while()
{
if(pthread_create(&report_thread,NULL,report,NULL)){
fprintf(stderr,"Error creating reporting thread! \n");
}
/*
Capturing code
--------------
--------------
*/
if(doreport)
/*wakeup the sleeping thread.*/
}
}
void *report(void *param)
{
//access some register from hardware
//sleep for a second
}
The expiry of the timer sets the doreport flag. If this flag is set, report() is called, which clears the flag.
How do I wake up the sleeping thread (that runs the report()) when the timer goes off in the main thread?
You can sleep a thread using sigwait, and then signal that thread to wake up with pthread_kill. Kill sounds bad, but it doesn't kill the thread, it sends a signal. This method is very fast. It was much faster than condition variables. I am not sure it is easier, harder, safer or more dangerous, but we needed the performance so we went this route.
in startup code somewhere:
sigemptyset(&fSigSet);
sigaddset(&fSigSet, SIGUSR1);
sigaddset(&fSigSet, SIGSEGV);
to sleep, the thread does this:
int nSig;
sigwait(&fSigSet, &nSig);
to wake up (done from any other thread)
pthread_kill(pThread, SIGUSR1);
or to wake up you could do this:
tgkill(nPid, nTid, SIGUSR1);
Our code calls this on the main thread before creating child threads. I'm not sure why this would be required.
pthread_sigmask(SIG_BLOCK, &fSigSet, NULL);
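Putting those fragments together, a minimal runnable sketch might look like this (the thread and loop structure are mine, for illustration only):
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static sigset_t fSigSet;

static void *sleeper(void *arg)
{
    for (int i = 0; i < 3; i++) {
        int nSig;
        sigwait(&fSigSet, &nSig);          /* block here until SIGUSR1 is delivered to this thread */
        printf("woken by signal %d\n", nSig);
    }
    return NULL;
}

int main(void)
{
    pthread_t pThread;

    sigemptyset(&fSigSet);
    sigaddset(&fSigSet, SIGUSR1);
    /* Block SIGUSR1 before creating the child thread: the child inherits this
       mask, so the signal stays pending until sigwait() consumes it instead of
       triggering its default action (terminating the process). */
    pthread_sigmask(SIG_BLOCK, &fSigSet, NULL);

    pthread_create(&pThread, NULL, sleeper, NULL);

    for (int i = 0; i < 3; i++) {
        sleep(1);
        pthread_kill(pThread, SIGUSR1);    /* wake the sleeping thread */
    }
    pthread_join(pThread, NULL);
    return 0;
}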
How do I wake up the sleeping thread (that runs the report()) when the
timer goes off in the main thread?
I think a condition variable is the mechanism you are looking for. Have the report-thread block on the condition variable, and the main thread signal the condition variable whenever you want the report-thread to wake up (see the link for more detailed instructions).
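A sketch of what that could look like here, assuming the one-second timer sets a doreport-style flag that is protected by a mutex (the names below are illustrative):
#include <pthread.h>

static pthread_mutex_t report_mtx  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  report_cond = PTHREAD_COND_INITIALIZER;
static int doreport = 0;               /* the predicate the report thread sleeps on */

/* capture/main thread: called whenever the one-second timer has fired */
static void request_report(void)
{
    pthread_mutex_lock(&report_mtx);
    doreport = 1;
    pthread_cond_signal(&report_cond);
    pthread_mutex_unlock(&report_mtx);
}

/* report thread: block until asked, then print the statistics */
static void *report_thread_fn(void *param)
{
    (void)param;
    for (;;) {
        pthread_mutex_lock(&report_mtx);
        while (!doreport)
            pthread_cond_wait(&report_cond, &report_mtx);
        doreport = 0;
        pthread_mutex_unlock(&report_mtx);
        /* ... read the hardware registers and print the statistics ... */
    }
    return NULL;
}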
I had a similar issue when coding a UDP chat server: there is a thread_1 that only works when a signal arrives, either from an alarm interruption (a timeout to see if the client is still alive) OR from another thread_2 (the thread that serves client requests). What I did was put thread_1 to sleep (sleep(n*TICK_TIMER), where TICK_TIMER is the alarm expiration value and n is some integer > 1), and wake this thread up with the SIGALRM signal. See the sleep() documentation.
The alarm handler (to use this you have to set it up first: signal(SIGALRM, tick_handler); alarm(5);)
void tick_handler() { tick_flag++; alarm(5); }
will send a SIGALRM each time the timeout occurs.
And the call to wake this sleeping thread_1 from another thread_2 is:
pthread_kill(X,SIGALRM);
where X is of type pthread_t. If thread_1 is your main thread, you can get this value with pthread_t X = pthread_self();
I'm writing a code in which I have two threads running in parallel.
1st is the main thread which started the 2nd thread.
2nd thread is just a simple thread executing empty while loop.
Now I want to pause / suspend the execution of 2nd thread by 1st thread who created it.
And after some time I want to resume the execution of 2nd thread (by issuing some command or function) from where it was paused / suspended.
This question is not about how to use mutexes, but how to suspend a thread.
In Unix specification there is a thread function called pthread_suspend, and another called pthread_resume_np, but for some reason the people who make Linux, FreeBSD, NetBSD and so on have not implemented these functions.
So to understand it, the functions simply are not there. There are workarounds, but unfortunately it is just not the same as calling SuspendThread on Windows. You have to do all kinds of non-portable stuff to make a thread stop and start using signals.
Stopping and resuming threads is vital for debuggers and garbage collectors. For example, I have seen a version of Wine which is not able to properly implement the "SuspendThread" function. Thus any Windows program using it will not work properly.
I thought that it was possible to do it properly using signals, based on the fact that the JVM uses this signal technique for its garbage collector, but I have also seen some articles online where people are noticing deadlocks and so on with the JVM, sometimes unreproducible.
So to come around to answer the question, you cannot properly suspend and resume threads with Unix unless you have a nice Unix that implements pthread_suspend_np. Otherwise you are stuck with signals.
The big problem with signals is when you have about five different libraries all linked into the same program and all trying to use the same signals at the same time. For this reason I believe that you cannot actually use something like Valgrind and, for example, the Boehm GC in one program. At least not without major coding at the very lowest levels of userspace.
Another answer to this question could be: do what Linus Torvalds does to NVidia, flip the finger at him and get him to implement the two most critical parts missing from Linux. First, pthread_suspend, and second, a dirty bit on memory pages so that proper garbage collectors can be implemented. Start a large petition online and keep flipping that finger. Maybe by the time Windows 20 comes out, they will realise that suspending and resuming threads, and having dirty bits, is actually one of the fundamental reasons Windows and Mac are better than Linux, or any Unix that does not implement pthread_suspend and also a dirty bit on virtual pages, like VirtualAlloc does in Windows.
I do not live in hope. Actually, I spent a number of years planning my future around building stuff for Linux, but I have abandoned that hope, as it all seems to hinge on the availability of a dirty bit for virtual memory and on being able to suspend threads cleanly.
As far as I know, you can't really just pause some other thread using pthreads. You have to have something in your 2nd thread that checks, using something like a condition variable, whether it should be paused. This is the standard way to do this sort of thing.
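A sketch of that standard approach, with a paused flag that thread 1 flips and thread 2 checks at the top of each loop iteration (all names are illustrative):
#include <pthread.h>

static pthread_mutex_t pause_mtx  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  pause_cond = PTHREAD_COND_INITIALIZER;
static int paused = 0;

/* thread 1: ask thread 2 to stop at its next check point */
static void request_pause(void)
{
    pthread_mutex_lock(&pause_mtx);
    paused = 1;
    pthread_mutex_unlock(&pause_mtx);
}

/* thread 1: let thread 2 continue */
static void request_resume(void)
{
    pthread_mutex_lock(&pause_mtx);
    paused = 0;
    pthread_cond_broadcast(&pause_cond);
    pthread_mutex_unlock(&pause_mtx);
}

/* thread 2: call this at the top of every loop iteration */
static void check_pause(void)
{
    pthread_mutex_lock(&pause_mtx);
    while (paused)
        pthread_cond_wait(&pause_cond, &pause_mtx);
    pthread_mutex_unlock(&pause_mtx);
}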
I tried suspending and resuming a thread using signals; here is my solution. Please compile and link with -pthread.
Signal SIGUSR1 suspends the thread by calling pause() and SIGUSR2 resumes the thread.
From the man page of pause:
pause() causes the calling process (or thread) to sleep until a signal is delivered that either terminates the process or causes the invocation of a signal-catching function.
#include <stdio.h>
#include <unistd.h>
#include <pthread.h>
#include <signal.h>
// Since I have only 2 threads I am using two variables;
// an array of flags would be more useful for `n` threads.
static volatile sig_atomic_t is_th1_ready = 0;
static volatile sig_atomic_t is_th2_ready = 0;
static void cb_sig(int signal)
{
switch(signal) {
case SIGUSR1:
pause();
break;
case SIGUSR2:
break;
}
}
static void *thread_job(void *t_id)
{
int i = 0;
struct sigaction act;
pthread_detach(pthread_self());
sigemptyset(&act.sa_mask);
act.sa_flags = 0;
act.sa_handler = cb_sig;
if (sigaction(SIGUSR1, &act, NULL) == -1)
printf("unable to handle siguser1\n");
if (sigaction(SIGUSR2, &act, NULL) == -1)
printf("unable to handle siguser2\n");
if (t_id == (void *)1)
is_th1_ready = 1;
if (t_id == (void *)2)
is_th2_ready = 1;
while (1) {
printf("thread id: %p, counter: %d\n", t_id, i++);
sleep(1);
}
return NULL;
}
int main()
{
int terminate = 0;
int user_input;
pthread_t thread1, thread2;
pthread_create(&thread1, NULL, thread_job, (void *)1);
// Spawned thread2 just to make sure it isn't suspended/paused
// when thread1 received SIGUSR1/SIGUSR2 signal
pthread_create(&thread2, NULL, thread_job, (void *)2);
while (!is_th1_ready || !is_th2_ready); /* busy-wait until both threads have installed their handlers */
while (!terminate) {
// to test, I am sending signals depending on input from STDIN
printf("0: pause thread1, 1: resume thread1, -1: exit\n");
scanf("%d", &user_input);
switch(user_input) {
case -1:
printf("terminating\n");
terminate = 1;
break;
case 0:
printf("raising SIGUSR1 to thread1\n");
pthread_kill(thread1, SIGUSR1);
break;
case 1:
printf("raising SIGUSR2 to thread1\n");
pthread_kill(thread1, SIGUSR2);
break;
}
}
pthread_kill(thread1, SIGKILL);
pthread_kill(thread2, SIGKILL);
return 0;
}
There are no pthread_suspend() / pthread_resume() style APIs in POSIX.
Mostly, condition variables are used to control the execution of other threads.
The condition variable mechanism allows threads to suspend execution and relinquish the processor until some condition is true. A condition variable must always be associated with a mutex to avoid a race condition created by one thread preparing to wait and another thread which may signal the condition before the first thread actually waits on it, resulting in a deadlock.
For more info
Pthreads
Linux Tutorial Posix Threads
If you can use processes instead, you can send job control signals (SIGSTOP / SIGCONT) to the second process. If you still want to share the memory between those processes, you can use SysV shared memory (shmop, shmget, shmctl...).
Even though I haven't tried it myself, it might be possible to use the lower-level clone() syscall to spawn threads that don't share signals. With that, you might be able to send SIGSTOP and SIGCONT to the other thread.
To implement pausing a thread, you need to make it wait for some event to happen. Busy-waiting on a spin lock wastes CPU cycles; IMHO this method should not be used, since those CPU cycles could have been used by other processes/threads.
Instead, wait on a non-blocking descriptor (a pipe, socket or something similar). Example code for using pipes for inter-thread communication can be seen here; a sketch follows below.
The above solution is useful if your second thread receives input from multiple sources rather than just the pause and resume signals. A top-level select/poll/epoll can be used on the non-blocking descriptors. You can specify a wait time for the select/poll/epoll system calls, so only that many microseconds worth of CPU cycles are spent waiting.
I mention this solution anticipating that your second thread will have more things or events to handle than just getting paused and resumed. Sorry if it is more detailed than what you asked for.
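A sketch of the descriptor-based idea, using a pipe into which thread 1 writes single-byte commands and which thread 2 polls (the names and the one-byte 'p'/'r' protocol are illustrative):
#include <poll.h>
#include <unistd.h>

/* pipefd[0] is polled by thread 2, pipefd[1] is written by thread 1;
   created with pipe(pipefd) before the threads are started. */
static int pipefd[2];

/* thread 1: 'p' asks thread 2 to pause, 'r' asks it to resume */
static void send_cmd(char cmd)
{
    write(pipefd[1], &cmd, 1);
}

/* thread 2's loop: do a unit of work, then see whether a command has arrived */
static void worker_loop(void)
{
    int running = 1;
    for (;;) {
        struct pollfd pfd = { .fd = pipefd[0], .events = POLLIN };
        int timeout = running ? 0 : -1;    /* paused: block on the pipe; running: just peek */
        if (poll(&pfd, 1, timeout) > 0 && (pfd.revents & POLLIN)) {
            char cmd;
            read(pipefd[0], &cmd, 1);
            running = (cmd == 'r');
        }
        if (running) {
            /* ... do one unit of work ... */
        }
    }
}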
Another, simpler approach can be to have a shared boolean variable between these threads (see the sketch below).
The main thread is the writer of the variable: 0 signifies stop, 1 signifies resume.
The second thread only reads the value of the variable. To implement the '0' (stopped) state, call usleep for some microseconds and then check the value again, assuming a delay of a few microseconds is acceptable in your design.
To implement the '1' (running) state, check the value of the variable after doing a certain amount of work.
Otherwise, you can also use a signal for moving from the '1' to the '0' state.
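A sketch of that polling approach; the flag name is illustrative, and without a mutex the flag should at least be volatile (C11 atomics would be more robust):
#include <unistd.h>

static volatile int run_flag = 1;     /* 1 = run, 0 = pause; written only by the main thread */

/* second thread's loop */
static void polling_worker(void)
{
    for (;;) {
        while (!run_flag)
            usleep(1000);             /* paused: re-check roughly every millisecond */
        /* ... do a small batch of work, then fall through and re-check ... */
    }
}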
You can use a mutex to do that; pseudocode would be (a concrete pthreads sketch follows below):
while (true) {
    /* pause / resume */
    lock(my_lock);      /* if this is locked by thread1, thread2 will wait here until thread1 unlocks it */
    unlock(my_lock);    /* unlock straight away so that thread2 can be blocked again on the next iteration */
    /* do actual work here */
}
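In concrete pthreads terms that might look like the following sketch (names are illustrative; note that thread1 must hold my_lock for the whole time it wants thread2 paused, and must be the thread that unlocks it):
#include <pthread.h>

static pthread_mutex_t my_lock = PTHREAD_MUTEX_INITIALIZER;

/* thread1: hold the mutex for as long as thread2 should stay paused */
static void pause_worker(void)  { pthread_mutex_lock(&my_lock); }
static void resume_worker(void) { pthread_mutex_unlock(&my_lock); }

/* thread2 */
static void *worker_thread(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&my_lock);    /* blocks here while thread1 holds the lock */
        pthread_mutex_unlock(&my_lock);  /* release immediately so the next iteration can be paused too */
        /* ... actual work here ... */
    }
    return NULL;
}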
You can suspend a thread simply with a signal:
#include <pthread.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

pthread_mutex_t mutex;

static void thread_control_handler(int n, siginfo_t* siginfo, void* sigcontext) {
    // the signaled thread parks here until thread_suspend() releases the mutex (i.e. waits out the time)
    pthread_mutex_lock(&mutex);
    pthread_mutex_unlock(&mutex);
}
// suspend a thread for some time
void thread_suspend(int tid, int time) {
struct sigaction act;
struct sigaction oact;
memset(&act, 0, sizeof(act));
act.sa_sigaction = thread_control_handler;
act.sa_flags = SA_RESTART | SA_SIGINFO | SA_ONSTACK;
sigemptyset(&act.sa_mask);
pthread_mutex_init(&mutex, 0);
if (!sigaction(SIGURG, &act, &oact)) {
pthread_mutex_lock(&mutex);
kill(tid, SIGURG);
sleep(time);
pthread_mutex_unlock(&mutex);
}
}
Not sure if you will like my answer or not. But you can achieve it this way.
If it is a separate process instead of a thread, I have a solution using signals (this might even work for a thread; maybe someone can share their thoughts).
There is no mechanism currently in place to pause or resume the execution of processes, but you can surely build one.
Steps I would take if I wanted this in my project:
Register a signal handler for the second process.
Inside the signal handler, wait on a semaphore.
Whenever you want to pause the other process, just send it the signal you registered the handler for. The program will go into a sleep state.
When you want to resume the process, send a different signal. Inside that signal handler, check whether the semaphore is locked; if it is, release the semaphore, and process 2 will continue its execution.
If you implement this, please do share your feedback on whether it worked for you. Thanks.