I am learning the basics of POSIX threads. I want to create a program that prints "Hello World!" 10 times with a delay of a second between each printout. I've used a for loop to print it 10 times, but I am stuck on how to implement the time delay part.
This is my code so far:
#define MAX 10
void* helloFunc(void* tid)
{
printf("Hello World!\n", (int)(intptr_t)tid);
}
int main(int ac, char * argv)
{
pthread_t hej[MAX];
for (int i = 0; i < MAX; i++)
{
pthread_create(&hej[i], NULL, helloFunc, (void*)(intptr_t)i);
pthread_join(&hej[i], NULL);
}
pthread_exit(NULL);
return(0);
}
Thanks in advance!
There are two major problems with your code:
First of all, you must wait for the threads to finish. You do that by joining them with pthread_join. For that to work, you must save the pthread_t value of every thread (for example, in an array).
If you don't wait for the threads, then the exit call (explicit, or implied by returning from main) will end the process, and that also kills every thread still running in it.
For all threads to run in parallel, you should wait in a separate loop after you have created them all:
pthread_t hej[MAX];
for (int i = 0; i < MAX; i++)
{
pthread_create(&hej[i], ...);
}
for (int i = 0; i < MAX; i++)
{
pthread_join(hej[i], NULL);
}
The second problem is that you pass a pointer to i to the thread, so tid inside every thread function will be the same (the address of i, which prints as a very large and weird value). To pass a value you must first cast it to intptr_t and then to void *:
pthread_create(..., (void *) (intptr_t) i);
And in the thread function you do the opposite casting:
printf("Hello World %d!\n", (int) (intptr_t) tid);
Note that this is an exception to the rule that one should never pass values as pointers (or vice versa).
Finally, for the "delay" bit... On POSIX systems there are many ways to delay execution, or to sleep. The natural and simple solution is sleep(1), which sleeps for one second.
The problem is where to put this sleep(1) call. If you do it in the thread function after the printf, then all threads will race to print the message and then all sleep at the same time.
If you do it in the loop where you create the threads, then the threads won't really run in parallel but in series: one thread prints its message and exits, then the main thread waits one second before creating the next thread. That makes the threads kind of useless.
As a possible third solution, use the value passed to the thread function as the sleep time, so the thread that is created first (when i == 0) will print immediately, the second thread (when i == 1) will sleep one second, and so on, until the tenth thread, which will sleep nine seconds before printing its message.
That could be done as:
void* helloFunc(void* tid)
{
int value = (int) (intptr_t) tid;
sleep(value);
printf("Hello World %d!\n", value);
// Must return a value, as the function is declared as such
return NULL;
}
I have to compute partial sums using a parallel reduction approach in C, but I don't have any idea how to do it, so I need the community's guidance to achieve this.
What I need to achieve: for example, with eight computational threads (0 to 7), in the first reduction step each thread adds two elements of the array: thread 4 should wait for thread 0 to finish, thread 5 for thread 1, thread 6 for thread 2, and thread 7 for thread 3.
In the second step, thread 6 waits for thread 4 to finish, and thread 7 waits for thread 5.
In the third step, thread 7 waits for thread 6. Then I need to print the whole array.
Please help me and give me some guidance on how to achieve this.
The Intel oneTBB library includes a "parallel_reduce" algorithm, which can be used directly for such a task.
https://spec.oneapi.io/versions/latest/elements/oneTBB/source/algorithms/functions/parallel_reduce_func.html
Since oneTBB only supports C++, if you can only use C you could consider OpenMP instead, but that also needs toolchain support.
https://www.intel.com/content/www/us/en/develop/documentation/advisor-user-guide/top/model-threading-designs/add-parallelism-to-your-program/replace-annotations-with-openmp-code/openmp-reduction-operations.html
I'm confused about how to work out which computing thread a given thread should wait for: 2^r-i, where r = log(m).
There are at least two easy ways to do that:
Save all thread IDs in a global pthread_t ptid[N]; array.
Save all thread IDs in a local pthread_t ptid[N]; array, and pass the address of that array into each thread via the thread argument.
Example pseudo-code for the latter (error handling omitted for clarity):
struct Arg {
    pthread_t *ptid; // Address of ptid[N].
    int idx;         // Index of this thread.
};
void *partial_sum(void *p)
{
    struct Arg *arg = (struct Arg *)p;
    int sum = 0;
    ... // Compute my portion of the partial sum.
    int other_thread_idx = ...; // Compute index of the other thread
                                // this thread should join, -1 if none.
    if (other_thread_idx >= 0) {
        void *ret;
        // Get the result from the other thread. pthread_join stores a
        // void *, so receive it as one and convert it back to int
        // (intptr_t comes from <stdint.h>).
        pthread_join(arg->ptid[other_thread_idx], &ret);
        int other_sum = (int)(intptr_t)ret;
        printf("Thread %d joined thread %d which returned sum %d\n",
               arg->idx, other_thread_idx, other_sum);
        sum += other_sum;
    }
    printf("Thread %d, sum: %d\n", arg->idx, sum);
    return (void *)(intptr_t)sum;
}
int main()
{
    struct Arg args[N];
    pthread_t ptid[N];
    for (int i = 0; i < N; ++i) {
        struct Arg *arg = &args[i];
        arg->idx = i;
        arg->ptid = &ptid[0];
        pthread_create(&ptid[i], NULL, partial_sum, arg);
    }
    // Get the final result.
    void *ret;
    // Note: joining only the last thread -- all others have been joined
    // already.
    pthread_join(ptid[N - 1], &ret);
    printf("Sum: %d\n", (int)(intptr_t)ret);
    return 0;
}
int g_ant = 0;
void *writeloop(void *arg)
{
while(g_ant < 10)
{
g_ant++;
usleep(rand()%10);
printf("%d\n", g_ant);
}
exit(0);
}
int main(void)
{
pthread_t time;
pthread_create(&time, NULL, writeloop, NULL);
writeloop(NULL);
pthread_join(time, NULL);
return 0;
}
Hi! I have four questions which I believe go under the category race condition...? :-)
1. I'm trying to figure out why the printf of g_ant, on my computer, starts on 2 and continues to 10 in 90% of the cases, with an occasional 1, 3->10 output. My guess is that it is because of the usleep, which may hold up thread1 long enough to let thread2 increment and printf before thread1 reaches its printf.
2. Wouldn't this also mess up the numbers from 2->10?
3. I'm also struggling to understand pthread_join's function in this program. My understanding is that it's used to wait for a thread to complete. Is it waiting for the writeloop function started by pthread_create?
4. Is writeloop(NULL) considered a second thread?
g_ant++; isn't an atomic operation, and two threads executing it at the same time without synchronization is a data race, which is undefined behaviour. You should wrap the access in pthread_mutex_lock(&mutex); and pthread_mutex_unlock(&mutex);
The reason it starts at 2 about 90% of the time is that thread time enters the function, increments g_ant, and goes to sleep. The OS then tends to take it off the CPU and run another thread that is not asleep, in your case the main thread, which increments g_ant again and calls usleep. Now g_ant has the value 2; thread time resumes, prints 2, and increments it to 3. The main thread resumes, prints 3, and increments it again. This switching continues, which is why you see the numbers 2 -> 10 most of the time.
Hopefully that is clear enough and answers your second question as well.
pthread_join makes sure that the other thread finishes its job before your main thread quits the program.
No, it is not considered a second thread; writeloop(NULL) runs the function on the main thread.
Hope it helps.
The main thread is considered another thread. The following might help you understand what's going on before you add mutexes (assuming you have to do that next). Usually, you don't exit() the whole process from a thread, because it would never be joined in the main thread.
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int g_ant = 0;
void *writeloop(void *arg)
{
while(g_ant < 10)
{
g_ant++;
usleep( rand() % 10 );
printf("thread: %u global: %d\n", (unsigned int)pthread_self(), g_ant);
}
return NULL;
}
int main(void)
{
pthread_t t;
pthread_create(&t, NULL, writeloop, NULL);
writeloop(NULL);
pthread_join(t, NULL);
printf("Joined\n");
return 0;
}
I am relatively new to threads and forks. To understand them a bit better I have been writing simple programs. I have written two little programs: one that prints a counter from two processes, and another that does the same with two threads.
What I noticed is that the fork version prints the counters interleaved, while the thread version prints one thread's counter and then the other's. So the threads don't look parallel; they behave more serially. Why is that? Am I doing something wrong?
Also, what exactly does pthread_join do? Even when I don't call pthread_join, the program runs similarly.
Here is my code for the thread
void * thread1(void *a){
int i =0;
for(i=0; i<100; i++)
printf("Thread 1 %d\n",i);
}
void * thread2(void *b){
int i =0;
for(i=0; i<100; i++)
printf("Thread 2 %d\n", i);
}
int main()
{
pthread_t tid1,tid2;
pthread_create(&tid1,NULL,thread1, NULL);
pthread_create(&tid2,NULL,thread2, NULL);
pthread_join(tid1,NULL);
pthread_join(tid2,NULL);
return 0;
}
And here is my code for fork
int main(void)
{
pid_t childPID;
childPID = fork();
if(childPID >= 0) // fork was successful
{
if(childPID == 0) // child process
{ int i;
for(i=0; i<100;i++)
printf("\n Child Process Counter : %d\n",i);
}
else //Parent process
{
int i;
for(i=0; i<100;i++)
printf("\n Parent Process Counter : %d\n",i);
}
}
else // fork failed
{
printf("\n Fork failed, quitting!!!!!!\n");
return 1;
}
return 0;
}
EDIT:
How can I make the threaded program behave more like the fork program? i.e. the counter prints interweave.
You are traveling down a bad road here. The lesson you should be learning is not to try to outthink the OS scheduler. No matter what you do (processing schedules, priorities, or whatever knobs you turn), you cannot do it reliably.
You have backed your way into discovering the need for synchronization mechanisms - mutexes, semaphores, condition variables, thread barriers, etc. What you want to do is exactly why they exist and what you should use to accomplish your goals.
On your last question, pthread_join reclaims some resources from dead, joinable (i.e. not detached) threads and allows you to inspect the return value of the expired thread. In your program the calls mostly serve as a blocking mechanism. That is, main will block on those calls until the threads expire. Without the pthread_joins your main would end and the process would die, including the threads you created. If you don't want to join the threads and aren't doing anything useful in main, then use pthread_exit in main, as this allows main to exit while the threads continue processing.
First of all, a fork creates a second process while creating a thread creates a "dispatchable unit of work" within the same process.
Getting two different processes to be interleaved is usually a simple matter of letting the OS run. However, within a process you need to know more about how the OS chooses which of several threads to run.
You could probably, artificially, get the output from the threads to be interleaved by calling sleep for different times from each thread. That is, create thread A (code it to output one line, and then sleep for 100) then create thread B (code it to output one line and then sleep for 50, etc.)
I understand wanting to see how threads can run in parallel, similar to processes. But, is this a real requirement or just a "zoo" request?
I am writing my first basic multithreading programs and am running into several difficulties.
In the program below, if I put the sleep at position 1, the value of the shared data printed is always 10, while with the sleep at position 2 the value is always 0.
Why do I get this output?
How do I decide where the sleep should go?
Does this mean that if we place a sleep inside the mutex, the other thread is not executed at all, so the shared data stays 0?
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include<unistd.h>
pthread_mutex_t lock;
int shared_data = 0;
void * function(void *arg)
{
int i ;
for(i =0; i < 10; i++)
{
pthread_mutex_lock(&lock);
shared_data++;
pthread_mutex_unlock(&lock);
}
pthread_exit(NULL);
}
int main()
{
pthread_t thread;
void * exit_status;
int i;
pthread_mutex_init(&lock, NULL);
i = pthread_create(&thread, NULL, function, NULL);
for(i =0; i < 10; i++)
{
sleep(1); //POSITION 1
pthread_mutex_lock(&lock);
//sleep(1); //POSITION 2
printf("Shared data value is %d\n", shared_data);
pthread_mutex_unlock(&lock);
}
pthread_join(thread, &exit_status);
pthread_mutex_destroy(&lock);
}
When you sleep before you lock the mutex, then you're giving the other thread plenty of time to change the value of the shared variable. That's why you're seeing a value of "10" with the 'sleep' in position #1.
When you grab the mutex first, you're able to lock it fast enough that you can print out the value before the other thread has a chance to modify it. The other thread sits and blocks on the pthread_mutex_lock() call until your main thread has finished sleeping and unlocked it. At that point, the second thread finally gets to run and alter the value. That's why you're seeing a value of "0" with the 'sleep' at position #2.
This is a classic case of a race condition. On a different machine, the same code might not display "0" with the sleep call at position #2. It's entirely possible that the second thread has the opportunity to alter the value of the variable once or twice before your main thread locks the mutex. A mutex can ensure that two threads don't access the same variable at the same time, but it doesn't have any control over the order in which the two threads access it.
I had a full explanation here but ended up deleting it. This is a basic synchronization problem and you should be able to trace and identify it before tackling anything more complicated.
But I'll give you a hint: It's only the sleep() in position 1 that matters; the other one inside the lock is irrelevant as long as it doesn't change the code outside the lock.
I just want my main thread to wait for any and all my (p)threads to complete before exiting.
The threads come and go a lot for different reasons, and I really don't want to keep track of all of them - I just want to know when they're all gone.
wait() does this for child processes, returning ECHILD when there are no children left; however, wait() does not appear to work with (p)threads.
I really don't want to go through the trouble of keeping a list of every single outstanding thread (as they come and go), then having to call pthread_join on each.
Is there a quick-and-dirty way to do this?
Do you want your main thread to do anything in particular after all the threads have completed?
If not, you can have your main thread simply call pthread_exit() instead of returning (or calling exit()).
If main() returns it implicitly calls (or behaves as if it called) exit(), which will terminate the process. However, if main() calls pthread_exit() instead of returning, that implicit call to exit() doesn't occur and the process won't immediately end - it'll end when all threads have terminated.
http://pubs.opengroup.org/onlinepubs/007908799/xsh/pthread_exit.html
Can't get too much quick-n-dirtier.
Here's a small example program that will let you see the difference. Pass -DUSE_PTHREAD_EXIT to the compiler to see the process wait for all threads to finish. Compile without that macro defined to see the process stop threads in their tracks.
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <time.h>
/* Sleep for ms milliseconds (named sleep_ms so it cannot be confused
   with the POSIX sleep(), which takes seconds). */
static
void sleep_ms(int ms)
{
    struct timespec waittime;
    waittime.tv_sec = (ms / 1000);
    ms = ms % 1000;
    waittime.tv_nsec = ms * 1000 * 1000;
    nanosleep( &waittime, NULL);
}
void* threadfunc( void* c)
{
    int id = (int) (intptr_t) c;
    int i = 0;
    for (i = 0 ; i < 12; ++i) {
        printf( "thread %d, iteration %d\n", id, i);
        sleep_ms(10);
    }
    return 0;
}
int main()
{
    int i = 4;
    for (; i; --i) {
        /* Deliberately never freed or joined: quick-n-dirty. */
        pthread_t* tcb = malloc( sizeof(*tcb));
        pthread_create( tcb, NULL, threadfunc, (void*) (intptr_t) i);
    }
    sleep_ms(40);
#ifdef USE_PTHREAD_EXIT
    pthread_exit(0);
#endif
    return 0;
}
The proper way is to keep track of all of your pthread_t IDs, but you asked for a quick and dirty way, so here it is. Basically:
just keep a total count of running threads,
increment it in the main loop before calling pthread_create,
decrement the thread count as each thread finishes.
Then sleep at the end of the main process until the count returns to 0.
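int running_threads = 0;   /* only read or written with running_mutex held */
pthread_mutex_t running_mutex = PTHREAD_MUTEX_INITIALIZER;
void *threadStart(void *arg)
{
    // do the thread work
    pthread_mutex_lock(&running_mutex);
    running_threads--;
    pthread_mutex_unlock(&running_mutex);
    return NULL;
}
int main()
{
    int i, remaining;
    for (i = 0; i < num_threads; i++)
    {
        pthread_mutex_lock(&running_mutex);
        running_threads++;
        pthread_mutex_unlock(&running_mutex);
        // launch thread
    }
    for (;;)
    {
        // read the counter under the same mutex the threads use
        pthread_mutex_lock(&running_mutex);
        remaining = running_threads;
        pthread_mutex_unlock(&running_mutex);
        if (remaining == 0)
            break;
        sleep(1);
    }
}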
If you don't want to keep track of your threads then you can detach the threads so you don't have to care about them, but in order to tell when they are finished you will have to go a bit further.
One trick would be to keep a list (linked list, array, whatever) of the threads' statuses. When a thread starts it sets its status in the array to something like THREAD_STATUS_RUNNING and just before it ends it updates its status to something like THREAD_STATUS_STOPPED. Then when you want to check if all threads have stopped you can just iterate over this array and check all the statuses.
Don't forget though that if you do something like this, you will need to control access to the array so that only one thread can access (read and write) it at a time, so you'll need to use a mutex on it.
You could keep a list of all your thread IDs and then call pthread_join on each one. Of course, you will need a mutex to control access to the thread ID list. You will also need some kind of list that can be modified while it is being iterated over, maybe a std::set<pthread_t>?
int main() {
    void *data;
    pthread_mutex_lock(&mutex);
    for(threadId in threadIdList) { // pseudo-code: the list may change while we iterate
        pthread_mutex_unlock(&mutex);
        pthread_join(threadId, &data);
        pthread_mutex_lock(&mutex);
    }
    pthread_mutex_unlock(&mutex);
    printf("All threads completed.\n");
}
// called by any thread to create another
void CreateThread()
{
    pthread_t id;
    pthread_mutex_lock(&mutex);
    // id is a local variable, so its address must not be handed to the
    // new thread; the thread can get its own id from pthread_self()
    pthread_create(&id, NULL, ThreadInit, NULL);
    threadIdList.add(id);
    pthread_mutex_unlock(&mutex);
}
// called by each thread before it dies, e.g. RemoveThread(pthread_self())
void RemoveThread(pthread_t id)
{
    pthread_mutex_lock(&mutex);
    threadIdList.remove(id);
    pthread_mutex_unlock(&mutex);
}
Thanks all for the great answers! There has been a lot of talk about using memory barriers etc - so I figured I'd post an answer that properly showed them used for this.
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>
#define NUM_THREADS 5
unsigned int thread_count;
void *threadfunc(void *arg) {
    printf("Thread %p running\n",arg);
    sleep(3);
    printf("Thread %p exiting\n",arg);
    __sync_fetch_and_sub(&thread_count,1); // atomic decrement
    return NULL;
}
int main() {
    int i;
    pthread_t thread[NUM_THREADS];
    thread_count=NUM_THREADS;
    for (i=0;i<NUM_THREADS;i++) {
        pthread_create(&thread[i],NULL,threadfunc,&thread[i]);
    }
    // spin until every thread has decremented the counter
    do {
        __sync_synchronize();
    } while (thread_count);
    printf("All threads done\n");
    return 0;
}
Note that the __sync functions are non-standard GCC builtins. LLVM supports them too, but if you're using another compiler, you may have to do something different.
Another big thing to note: why would you burn an entire core, or waste "half" of a CPU, spinning in a tight poll loop just waiting for others to finish, when you could easily put it to work? The following mod uses the initial thread to run one of the workers, then waits for the others to complete:
thread_count=NUM_THREADS;
for (i=1;i<NUM_THREADS;i++) {
pthread_create(&thread[i],NULL,threadfunc,&thread[i]);
}
threadfunc(&thread[0]);
do {
__sync_synchronize();
} while (thread_count);
printf("All threads done\n");
}
Note that we start creating the threads starting at "1" instead of "0", then directly run "thread 0" inline, waiting for all threads to complete after it's done. We pass &thread[0] to it for consistency (even though it's meaningless here), though in reality you'd probably pass your own variables/context.