I'm a bit uncertain whether the following code will lead to undefined behavior.
//global
pthread_t thread1;
void *worker(void *arg){
//do stuff
return NULL;
}
void spawnThread(){
//init stuff
int iret1 = pthread_create( &thread1, NULL, worker, (void*) p);
}
My spawnThread will make a new thread using the global thread1.
If a previous thread is still running, will I somehow cause undefined behaviour by starting a new thread using the same thread1 variable?
If this is a problem, would it make sense to make my pthread_t variable local to a function? I think that might be a problem because it would live on the stack, and as soon as I return from my function it will be gone.
If I make my pthread_t local to a function, I can't use pthread_join in another part of my program. Is the canonical solution to have a mutex-protected counter keeping track of how many threads are currently running?
Thanks
The pthread_t is just an identifier. You can copy it round or destroy it at will. Of course, as you mention, if you destroy it (because it is local) then you cannot use it to call pthread_join.
If you reuse the same pthread_t variable for multiple threads then unless there is only one thread active at a time you are overwriting the older values with the new ones, and you will only be able to call pthread_join on the most recently started thread. Also, if you are starting your threads from inside multiple threads then you will need to protect the pthread_t variable with a mutex.
If you need to wait for your thread to finish, give it its own pthread_t variable, and call pthread_join at the point where you need to wait. If you do not need to wait for your thread to finish, call pthread_detach() after creation, or use the creation attributes to start the thread detached.
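A minimal sketch of the two options just described. The worker double_it and the helpers run_joined and run_detached are hypothetical names made up for illustration; only the pthread calls are real API.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical worker: doubles the integer carried in the void* argument. */
static void *double_it(void *arg)
{
    intptr_t n = (intptr_t)arg;
    return (void *)(n * 2);
}

/* Joinable case: the thread gets its own pthread_t and we wait for its result.
   Returns the worker's result, or -1 on error. */
long run_joined(long n)
{
    pthread_t tid;
    void *result = NULL;

    if (pthread_create(&tid, NULL, double_it, (void *)(intptr_t)n) != 0)
        return -1;
    if (pthread_join(tid, &result) != 0)
        return -1;
    return (long)(intptr_t)result;
}

/* Detached case: nobody will join, so the thread's resources are reclaimed
   automatically when it finishes. Returns 0 on success. */
int run_detached(long n)
{
    pthread_t tid;   /* may go out of scope right away; it is only an identifier */

    if (pthread_create(&tid, NULL, double_it, (void *)(intptr_t)n) != 0)
        return -1;
    return pthread_detach(tid);
}
```

The detached variant throws the return value away, so it only makes sense for fire-and-forget work.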
pthread_t is just an identifier, and you can do whatever you like with it. Thread state is maintained internally in the C library (in the case of Glibc/NPTL, on an internal struct thread on Thread Local Storage, accessed on x86 via the GS register).
Problem is, your thread1 variable is the only way to refer to your first thread.
The solution I often use is to keep an array of pthread_t in which I store the thread IDs I need to refer to. In this example it's a static array, but you can also use dynamically allocated memory.
static pthread_t running_threads[MAX_THREAD_RUNNING_LIMIT];
static unsigned int running_thread_count = 0;
// each time you create a new thread:
pthread_create( &running_threads[running_thread_count], blabla...);
running_thread_count++;
// don't forget to check running_thread_count against the size
// of your running thread size MAX_THREAD_RUNNING_LIMIT
When you need to join() them, simply do it in a loop:
for(i =0; i<running_thread_count; i++)
{
pthread_join(running_threads[i], &return_value); // pthread_join takes the pthread_t by value, not its address
}
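A self-contained sketch of the same array pattern, under assumed names: MAX_THREADS, the worker square and the helper sum_of_squares are hypothetical and not part of the answer above.

```c
#include <pthread.h>
#include <stdint.h>

#define MAX_THREADS 8   /* hypothetical limit, plays the role of MAX_THREAD_RUNNING_LIMIT */

/* Hypothetical worker: returns the square of its argument. */
static void *square(void *arg)
{
    intptr_t n = (intptr_t)arg;
    return (void *)(n * n);
}

/* Starts `count` threads, joins every one that was started, and returns
   the sum of their results, or -1 if anything failed. */
long sum_of_squares(unsigned count)
{
    pthread_t threads[MAX_THREADS];
    unsigned started = 0;
    long sum = 0;
    int failed = 0;

    if (count > MAX_THREADS)          /* check against the array size first */
        return -1;

    for (unsigned i = 0; i < count; i++) {
        if (pthread_create(&threads[started], NULL, square,
                           (void *)(intptr_t)i) != 0) {
            failed = 1;
            break;
        }
        started++;
    }

    /* Join only the threads that were actually created. */
    for (unsigned i = 0; i < started; i++) {
        void *ret = NULL;
        if (pthread_join(threads[i], &ret) != 0)
            failed = 1;
        else
            sum += (long)(intptr_t)ret;
    }
    return failed ? -1 : sum;
}
```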
I am new to pthreads, and I am trying to understand it. I saw some examples like the following.
I could see that the main() is blocked by the API pthread_exit(), and I have seen examples where the main function is blocked by the API pthread_join(). I am not able to understand when to use what?
I am referring to the following site - https://computing.llnl.gov/tutorials/pthreads/. I am not able to get the concept of when to use pthread_join() and when to use pthread_exit().
Can somebody please explain? Also, a good tutorial link for pthreads will be appreciated.
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#define NUM_THREADS 5
void *PrintHello(void *threadid)
{
long tid;
tid = (long)threadid;
printf("Hello World! It's me, thread #%ld!\n", tid);
pthread_exit(NULL);
}
int main (int argc, char *argv[])
{
pthread_t threads[NUM_THREADS];
int rc;
long t;
for(t=0; t<NUM_THREADS; t++){
printf("In main: creating thread %ld\n", t);
rc = pthread_create(&threads[t], NULL, PrintHello, (void *)t);
if (rc){
printf("ERROR; return code from pthread_create() is %d\n", rc);
exit(-1);
}
}
/* Last thing that main() should do */
pthread_exit(NULL);
}
One more thing to be aware of:
pthread_cancel(thread);
pthread_join(thread, NULL);
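Sometimes you want to cancel a thread while it is executing.
You can do this with pthread_cancel(thread);.
However, remember that cancellation must be enabled in the target thread (it is by default), and that the thread only acts on the request at a cancellation point unless asynchronous cancellation is selected.
You should also register cleanup code to run upon cancellation.
pthread_cleanup_push(my_thread_cleanup_handler, resources);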
pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, 0);
static void my_thread_cleanup_handler(void *arg)
{
// free
// close, fclose
}
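A hedged sketch of how cancellation and a cleanup handler can fit together. The names release_buffer, cancellable_worker and cancel_and_join are hypothetical, and the semaphore is only there so that the cancel request is not sent before the handler has been installed.

```c
#include <pthread.h>
#include <semaphore.h>
#include <stdlib.h>
#include <unistd.h>

static sem_t handler_installed;   /* posted once the cleanup handler is pushed */
static int cleanup_ran;           /* set by the handler, read after pthread_join() */

/* Hypothetical cleanup handler: releases what the worker allocated. */
static void release_buffer(void *arg)
{
    free(arg);
    cleanup_ran = 1;
}

static void *cancellable_worker(void *arg)
{
    void *buffer = malloc(64);
    (void)arg;

    pthread_cleanup_push(release_buffer, buffer);
    sem_post(&handler_installed);
    for (;;)
        sleep(1);               /* sleep() is a cancellation point */
    pthread_cleanup_pop(0);     /* never reached, but push/pop must pair up lexically */
    return NULL;
}

/* Returns 1 if the thread was cancelled and its handler ran, 0 if not, -1 on error. */
int cancel_and_join(void)
{
    pthread_t tid;
    void *result = NULL;

    cleanup_ran = 0;
    if (sem_init(&handler_installed, 0, 0) != 0)
        return -1;
    if (pthread_create(&tid, NULL, cancellable_worker, NULL) != 0)
        return -1;
    while (sem_wait(&handler_installed) != 0)
        ;                       /* retry if interrupted by a signal */
    if (pthread_cancel(tid) != 0 || pthread_join(tid, &result) != 0)
        return -1;
    sem_destroy(&handler_installed);
    return result == PTHREAD_CANCELED && cleanup_ran;
}
```

A cancelled thread reports PTHREAD_CANCELED as its exit value, which is how the joiner can tell cancellation apart from a normal return.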
As explained in the Open Group documentation,
pthread_exit() will exit the thread that calls it.
In your case, since main calls it, the main thread will terminate whereas your spawned threads will continue to execute. This is mostly used in cases where the
main thread is only required to spawn threads and leave the threads to do their job.
pthread_join
will suspend execution of the thread that has called it until the target thread terminates.
This is useful in cases when you want to wait for thread/s to terminate before further
processing in the main thread.
pthread_exit terminates the calling thread, while pthread_join suspends execution of the calling thread until the target thread completes execution.
They are pretty much well explained in detail in the open group documentation:
pthread_exit
pthread_join
Both methods ensure that your process doesn't end before all of your threads have ended.
The join method has your thread of the main function explicitly wait for all threads that are to be "joined".
The pthread_exit method terminates your main function and thread in a controlled way. main has the particularity that ending main otherwise would be terminating your whole process including all other threads.
For this to work, you have to be sure that none of your threads is using local variables that are declared inside the main function. The advantage of that method is that your main doesn't have to know all threads that have been started in your process, e.g. because other threads have themselves created new threads that main doesn't know anything about.
The pthread_exit() API
as has been already remarked, is used for the calling thread termination.
After a call to that function a complicated cleanup mechanism is started.
When it completes, the thread is terminated.
pthread_exit() is also called implicitly when the start routine of a thread created by pthread_create() returns.
In fact, returning from the start routine and calling pthread_exit() have the same effect in a thread created by pthread_create().
It is very important to distinguish the initial thread, implicitly created when the main() function starts, and threads created by pthread_create().
Returning from the main() function implicitly calls exit(), and the entire process terminates.
No thread cleanup mechanism is started.
A call to pthread_exit() from the main() function causes the cleanup mechanism to start, and when it finishes its work the initial thread terminates.
What happens to the entire process (and to other threads) when pthread_exit() is called from the main() function depends on the PTHREAD implementation.
For example, on the IBM OS/400 implementation the entire process is terminated, including other threads, when pthread_exit() is called from the main() function.
Other systems may behave differently.
On most modern Linux machines, a call to pthread_exit() from the initial thread does not terminate the entire process until all threads have terminated.
Be careful using pthread_exit() from main(), if you want to write a portable application.
The pthread_join() API
is a convenient way to wait for a thread termination.
You may write your own function that waits for a thread termination, perhaps more suitable to your application, instead of using pthread_join().
For example, it can be a function based on waiting on condition variables.
I would recommend reading David R. Butenhof's book “Programming with POSIX Threads”.
It explains the topics discussed here (and more complicated things) very well, although some implementation details, such as the use of pthread_exit in the main function, are not always reflected in the book.
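As a rough illustration of the condition-variable idea mentioned above, here is a sketch in which a detached worker announces that it is done and the caller waits for that announcement instead of calling pthread_join(). All names (done_lock, done_cond, worker, run_and_wait) are hypothetical. Strictly speaking this waits for the worker's result to be published, not for the thread itself to disappear, and the sketch supports only one outstanding worker at a time.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Shared completion state, kept static so it outlives every thread. */
static pthread_mutex_t done_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  done_cond = PTHREAD_COND_INITIALIZER;
static bool worker_done;
static int  worker_result;

static void *worker(void *arg)
{
    int answer = (int)(intptr_t)arg + 1;    /* stand-in for real work */

    pthread_mutex_lock(&done_lock);
    worker_result = answer;
    worker_done = true;
    pthread_cond_signal(&done_cond);        /* wake the waiter */
    pthread_mutex_unlock(&done_lock);
    return NULL;
}

/* Starts a detached worker and blocks until it has published its result.
   Returns that result, or -1 if the thread could not be created. */
int run_and_wait(int n)
{
    pthread_t tid;
    int result;

    pthread_mutex_lock(&done_lock);
    worker_done = false;
    pthread_mutex_unlock(&done_lock);

    if (pthread_create(&tid, NULL, worker, (void *)(intptr_t)n) != 0)
        return -1;
    pthread_detach(tid);                    /* we will not call pthread_join() */

    pthread_mutex_lock(&done_lock);
    while (!worker_done)                    /* loop guards against spurious wakeups */
        pthread_cond_wait(&done_cond, &done_lock);
    result = worker_result;
    pthread_mutex_unlock(&done_lock);
    return result;
}
```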
You don't need any calls to pthread_exit(3) in your particular code.
In general, the main thread should not call pthread_exit, but should often call pthread_join(3) to wait for some other thread to finish.
In your PrintHello function, you don't need to call pthread_exit because it is implicit after returning from it.
So your code should rather be:
void *PrintHello(void *threadid) {
long tid = (long)threadid;
printf("Hello World! It's me, thread #%ld!\n", tid);
return threadid;
}
int main (int argc, char *argv[]) {
pthread_t threads[NUM_THREADS];
int rc;
intptr_t t;
// create all the threads
for(t=0; t<NUM_THREADS; t++){
printf("In main: creating thread %ld\n", (long) t);
rc = pthread_create(&threads[t], NULL, PrintHello, (void *)t);
if (rc) { fprintf(stderr, "failed to create thread #%ld - %s\n",
(long)t, strerror(rc));
exit(EXIT_FAILURE);
}
}
sched_yield(); // optional, needs <sched.h>; pthread_yield() is non-standard
// join all the threads
for(t=0; t<NUM_THREADS; t++){
printf("In main: joining thread #%ld\n", (long) t);
rc = pthread_join(threads[t], NULL);
if (rc) { fprintf(stderr, "failed to join thread #%ld - %s\n",
(long)t, strerror(rc));
exit(EXIT_FAILURE);
}
}
}
pthread_exit() terminates the calling thread (but the resources used by that thread are not released to the operating system until it is joined, unless it has been detached).
pthread_join() blocks the calling thread until the target thread terminates.
In simple words, it waits for the target thread to exit.
In your code, if main() simply returned instead of calling pthread_exit(), the main thread could exit and terminate the whole process even though PrintHello had not completed, for example if you put a sleep (or delay) in PrintHello before its pthread_exit(). If you call pthread_join() in main before leaving it, the main thread blocks and waits for the PrintHello threads to complete.
Hmm.
POSIX pthread_exit description from http://pubs.opengroup.org/onlinepubs/009604599/functions/pthread_exit.html:
After a thread has terminated, the result of access to local (auto) variables of the thread is
undefined. Thus, references to local variables of the exiting thread should not be used for
the pthread_exit() value_ptr parameter value.
Which seems contrary to the idea that local main() thread variables will remain accessible.
Using pthread_exit in the main thread (in place of pthread_join) will leave the main thread in a defunct (zombie) state. Since pthread_join is not used, other joinable threads that have terminated will also remain in the zombie state and cause resource leakage.
Failure to join with a thread that is joinable (i.e., one that is
not detached), produces a "zombie thread". Avoid doing this, since
each zombie thread consumes some system resources, and when enough
zombie threads have accumulated, it will no longer be possible to
create new threads (or processes).
Another point is that keeping the main thread in the defunct state while other threads are running may cause implementation-dependent issues in various conditions, for example if resources are allocated in the main thread or if variables local to the main thread are used in other threads.
Also, all the shared resources are released only when the process exits, so it does not save any resources. So I think using pthread_exit in place of pthread_join should be avoided.
When pthread_exit() is called, the calling thread's stack is no longer addressable as "active" memory for any other thread. The .data, .text and .bss parts of "static" memory allocations are still available to all other threads. Thus, if you need to pass some memory value into pthread_exit() for some other pthread_join() caller to see, it needs to remain "available" for the thread calling pthread_join() to use. It should be one of: memory allocated with malloc()/new; a value on the stack of the thread calling pthread_join(), which that thread passed to pthread_create() or otherwise made available to the thread calling pthread_exit(); or a statically allocated (.data/.bss) value.
It's vital to understand how memory is divided between a thread's stack and the values stored in the .data/.bss memory sections, which hold process-wide values.
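A small sketch of the malloc() option, assuming hypothetical helper names (make_greeting, collect_greeting). The pointer handed to pthread_exit() refers to heap memory, so it is still valid when the joining thread reads it.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical worker: builds its result on the heap, not on its own stack. */
static void *make_greeting(void *arg)
{
    char *msg = malloc(16);
    (void)arg;

    if (msg != NULL)
        strcpy(msg, "Bye");
    pthread_exit(msg);      /* a heap pointer survives the end of this thread */
}

/* Returns the worker's malloc'd string (the caller must free it),
   or NULL on error. */
char *collect_greeting(void)
{
    pthread_t tid;
    void *res = NULL;

    if (pthread_create(&tid, NULL, make_greeting, NULL) != 0)
        return NULL;
    if (pthread_join(tid, &res) != 0)
        return NULL;
    return res;
}
```

Returning the address of a local array from make_greeting instead would hand the joiner a pointer into a stack that no longer exists.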
#include<stdio.h>
#include<pthread.h>
#include<semaphore.h>
sem_t st;
void *fun_t(void *arg);
void *fun_t(void *arg)
{
printf("Linux\n");
sem_post(&st);
//pthread_exit("Bye");
while(1);
pthread_exit("Bye");
}
int main()
{
pthread_t pt;
void *res_t;
if(sem_init(&st,0,0) != 0) //initialise the semaphore before the thread can post to it
perror("sem_init");
if(pthread_create(&pt,NULL,fun_t,NULL) != 0) //pthread functions return an error number, not -1
perror("pthread_create");
if(sem_wait(&st) != 0)
perror("sem_wait");
printf("Sanoundry\n");
//Try commenting out join here.
if(pthread_join(pt,&res_t) != 0)
perror("pthread_join");
if(sem_destroy(&st) != 0)
perror("sem_destroy");
return 0;
}
Paste this code into an online debugger such as OnlineGDB and see for yourself.
Make sure you understand that once you have created a thread, it runs concurrently with main.
Without the join, the main thread continues to run and returns 0, which ends the process.
With the join, the main thread is stuck waiting, because the thread it waits for never leaves its while loop.
With the join and the commented-out pthread_exit enabled, the thread terminates before reaching the while loop and main continues.
In practice, pthread_exit can be used inside if conditions or case statements to ensure that one branch of some code runs before the thread exits.
void *fun_t(void *arg)
{
printf("Linux\n");
sem_post(&st);
if(2-1 == 1)
pthread_exit("Bye");
else
{
printf("We have a problem. Computer is bugged");
pthread_exit("Bye"); //This is redundant since the thread will exit at the end
//of scope. But there are instances where you have a bunch
//of else if here.
}
}
In this example I want to demonstrate, using a semaphore, how you sometimes need one segment of code to run first.
#include<stdio.h>
#include<pthread.h>
#include<semaphore.h>
sem_t st;
void* fun_t (void* arg)
{
printf("I'm thread\n");
sem_post(&st);
return NULL;
}
int main()
{
pthread_t pt;
sem_init(&st,0,0); // must happen before the thread can call sem_post
pthread_create(&pt,NULL,fun_t,NULL);
sem_wait(&st);
printf("before_thread\n");
pthread_join(pt,NULL);
printf("After_thread\n");
}
Notice how fun_t runs before "before_thread" is printed. If execution were linear from top to bottom, the expected output would be before_thread, I'm thread, After_thread. But here we block main from running any further until the semaphore is released by fun_t. The result can be verified with https://www.onlinegdb.com/
I read that main() is a single thread itself, so when I create 2 threads in my program like this:
#include<stdio.h>
#include<pthread.h>
#include<windows.h>
void* counting(void * arg){
int i = 0;
for(i; i < 50; i++){
printf("counting ... \n");
Sleep(100);
}
}
void* waiting(void * arg){
int i = 0;
for(i; i < 50; i++){
printf("waiting ... \n");
Sleep(100);
}
}
int main(){
pthread_t thread1;
pthread_t thread2;
pthread_create(&thread1, NULL, counting, NULL);
pthread_create(&thread2, NULL, waiting, NULL);
pthread_join(thread1, NULL);
pthread_join(thread2, NULL);
int i = 0;
for(i; i < 50; i++){
printf("maining ... \n");
Sleep(1000);
}
}
Is main really a thread in that case?
In that case, if main sleeps for some time, shouldn't main give the CPU to other threads?
Is main a thread itself here? I am a bit confused here.
Is there a specific order to main thread execution?
pthread_join(thread1, NULL);
pthread_join(thread2, NULL);
You asked the thread to wait until thread1 terminates and then wait until thread2 terminates, so that's what it does.
I read that main() is a single thread itself
No, you have misunderstood. Every C program has a function named main(). C language semantics of the program start with the initial entry into that function. In that sense, and especially when you supply the parentheses, main(), is a function, not a thread.
However, every process also has a main thread which has a few properties that distinguish it from other threads. That is initially the only thread, so it is that thread that performs the initial entry into the main() function. But it is also that thread that runs all C functions called by main(), and by those functions, etc., so it is not, in general, specific to running only the code appearing directly in the body of main(), if that's what you mean by "main() is a single thread itself".
, so when I create 2 threads in my program like this: [...] Is main really a thread in that case?
There is really a main thread in that case, separate from the two additional threads that it starts.
In that case, if main sleeps for some time, shouldn't main give the CPU to other threads?
If the main thread slept while either of the other two were alive, then yes, one would expect one or both of the others to get (more) CPU time. And in a sense, that's exactly what happens: the main thread calls pthread_join() on each of the other threads in turn, which causes it to wait (some would say "sleep") until those threads terminate before it proceeds. While it's waiting, it does not contend with the other threads for CPU time, as that's pretty much what "waiting" means. But by the time the main thread reaches the Sleep() call in your program, the other threads have already terminated and been joined, because that's what pthread_join() does. They no longer exist, so naturally they don't run during the Sleep().
Is main a thread itself here?
There is a main thread, yes, and it is the only one in your particular process that executes any of the code in function main(). Nothing gets executed except in some thread or other.
I am confused a bit here. Is there a specific order to main thread execution?
As already described, the main thread is initially the only thread. Many programs never have more than that one. Threads other than the main one are created only by the main thread or by another thread that has already been created. Of course, threads cannot run before they are created, nor, by definition, after they have terminated. Threads execute independently of each other, generally without any predefined order, except as is explicitly established via synchronization objects such as mutexes, via for-purpose functions such as pthread_join(), or via cooperative operations on various I/O objects such as pipes.
main() is not a thread but a function, so here's a clear "no" to your initial claim. However, if you read a few definitions of what a thread is, you will find that it is something that can be scheduled, i.e. an ongoing execution of code. Further, a running program cannot actually do anything without such an "ongoing execution of code", starting with e.g. main() as the first entry point. So, definitely, all code executed by a program is executed by a thread, without exception.
BTW: You can retrieve the thread ID of the current thread. Try running that from main(). It will work and give you a value that distinguishes this call from calls from other threads.
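A short sketch of that experiment, with hypothetical names (struct id_check, compare_ids, worker_id_differs). It uses pthread_self() to obtain the current thread's ID and pthread_equal() to compare IDs, since pthread_t is an opaque type that should not be compared with ==.

```c
#include <pthread.h>

struct id_check {
    pthread_t creator;   /* ID of the thread that called pthread_create() */
    int differs;         /* set by the worker */
};

/* Hypothetical worker: compares its own ID with its creator's ID. */
static void *compare_ids(void *arg)
{
    struct id_check *check = arg;

    check->differs = !pthread_equal(pthread_self(), check->creator);
    return NULL;
}

/* Returns 1 if the new thread saw an ID different from the caller's,
   0 if they compared equal, -1 on error. */
int worker_id_differs(void)
{
    struct id_check check = { pthread_self(), 0 };
    pthread_t tid;

    if (pthread_create(&tid, NULL, compare_ids, &check) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)   /* check stays valid until the worker is done */
        return -1;
    return check.differs;
}
```

Calling worker_id_differs() from main() works the same way as from any other thread, which is the point being made: main() runs in a thread too.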
My command line tool keeps throwing the bus error: 10 message. Xcode debugger shows EXC_BAD_ACCESS message and highlights the function call that creates the thread. Manual debugging shows that the execution flow breaks at random positions inside the thread flow. I tried another compiler (gcc), but it ended up the same. Disabling pthread_mutex_lock() and pthread_mutex_unlock() doesn't help. I wrote this small example that reproduces the error.
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
typedef struct thread_args {
pthread_mutex_t* mutex;
} thread_args;
void* test(void* t_args) {
printf("Thread initiated\n");
thread_args* args = (thread_args* )t_args;
printf("Args casted\n");
pthread_mutex_lock(args->mutex);
printf("Mutex locked\n");
pthread_mutex_unlock(args->mutex);
printf("Mutex unlocked\n");
pthread_exit(NULL);
}
int main() {
pthread_mutex_t mutex1;
pthread_mutex_init(&mutex1, NULL);
thread_args args;
args.mutex = &mutex1;
pthread_t* thread;
printf("Initiating a thread\n");
pthread_create(thread, NULL, test, &args);
return(0);
}
I think, in your case,
pthread_create(thread, NULL, test, &args);
at this call, thread is a pointer and not allocated memory. So, essentially pthread_create() tries to write into uninitialized memory, which creates undefined behavior.
Referring to the man page of pthread_create():
Before returning, a successful call to pthread_create() stores the ID of the new thread in the buffer pointed to by thread;....
Instead, you can do
pthread_t thread;
...
pthread_create(&thread, NULL, test, &args);
You're using an uninitialized pointer to your pthread_t. The actual storage of the pthread_t needs to be somewhere!
Try :
int main() {
pthread_mutex_t mutex1;
pthread_mutex_init(&mutex1, NULL);
thread_args args;
args.mutex = &mutex1;
pthread_t thread;
printf("Initiating a thread\n");
pthread_create(&thread, NULL, test, &args);
return(0);
}
As other answers pointed out, you need to initialize your pointer thread which you can simply do with:
pthread_t thread;
pthread_create(&thread, NULL, test, &args);
Well, then I'll have to allocate memory dynamically, because different
threads are spawned inside many different functions, hence I can't use
local variables, because I'm not going to join the threads. Then, how
can I free the allocated memory without waiting for the thread to
finish, i.e. without calling join?
No. You don't need to allocate dynamically just because you are going to spawn multiple threads. The thread identifier is no longer needed once a thread has been created, so whether it's a local variable or malloced is not important. It's only needed when you need to join the thread or change some of its characteristics, for which you need the ID. Otherwise, you can even reuse the same pthread_t variable for creating multiple threads. For example,
pthread_t thread;
for( i = 0; i<8; i++)
pthread_create(&thread, NULL, thread_func, NULL);
is perfectly fine. A thread can always get its own ID by calling pthread_self() if needed. But you can't pass the address of a local variable such as mutex1 to thread functions, because once the main thread exits, mutex1 no longer exists while the created thread continues to use it. So you either need to malloc mutex1 or make it a global variable.
Another thing: if you decide to let the main thread exit, you should call pthread_exit(). Otherwise, when the main thread exits (either by calling exit or simply returning from main), the whole process will die, meaning all the threads will die too.
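A sketch of the reuse pattern under stated assumptions: every thread is detached because nobody will join it, each thread receives its own heap-allocated argument and frees it itself, and a semaphore (used here only so the example can tell when the workers are done) counts completions. The names detached_worker, spawn_detached and finished are hypothetical.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdlib.h>

static sem_t finished;   /* posted once by every worker */

/* Hypothetical worker: owns its heap-allocated argument and frees it. */
static void *detached_worker(void *arg)
{
    int *value = arg;

    /* ... use *value ... */
    free(value);
    sem_post(&finished);
    return NULL;
}

/* Starts `count` detached threads through one reused pthread_t variable and
   waits until each has reported in. Returns the number started, or -1. */
int spawn_detached(int count)
{
    pthread_t tid;       /* reused: the ID is not needed after detaching */
    int started = 0;

    if (sem_init(&finished, 0, 0) != 0)
        return -1;

    for (int i = 0; i < count; i++) {
        int *value = malloc(sizeof *value);
        if (value == NULL)
            break;
        *value = i;
        if (pthread_create(&tid, NULL, detached_worker, value) != 0) {
            free(value);
            break;
        }
        pthread_detach(tid);
        started++;
    }

    for (int i = 0; i < started; i++)
        while (sem_wait(&finished) == -1 && errno == EINTR)
            ;            /* retry if interrupted by a signal */
    return started;
}
```

The semaphore is static rather than local for the same reason the answer gives for mutex1: a worker may still be touching it after the spawning function has returned.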
I'm trying to use multithreading to allow two tasks to run in parallel within one DLL, but my application keeps crashing, apparently due to some bad resource conflict management; here are the details:
I need to call the same function(DoGATrainAndRun) from a certain point along the main logic flow, passing a different value for one of the parameters, let the two run, then go back to the main logic flow, and use the two (different) sets of values returned from the 2 calls.
(this is in the main header file):
typedef struct
{
int PredictorId;
int OutputType;
int Delta;
int Scale;
int Debug;
FILE* LogFile;
int TotalBars;
double CBaseVal;
double* HVal;
int PredictionLen;
double*** Forecast;
} t;
(This is in the main logic flow):
hRunMutex=CreateMutex(NULL, FALSE, NULL);
arg->OutputType=FH;
handle= (HANDLE) _beginthread( DoGATrainAndRun, 32768, (void*) arg);
arg->OutputType=FL;
handle= (HANDLE) _beginthread( DoGATrainAndRun, 32768, (void*) arg);
do {} while (hRunMutex!=0);
CloseHandle(hRunMutex);
(this is at the end of DoGaTrainAndRun):
free(args);
ReleaseMutex( hRunMutex );
I'm pretty new to multi-threading, and I can't seem to figure this one out...
There's a few things:
First, you're passing the same structure into both threads but just changing the OutputType value. It's possible that the first thread will see FL and never see the value FH. The reason for this is how threads are scheduled. It's valid for your first thread to start and then be suspended. Then, your main thread creates the second thread, having set OutputType to FL. This thread starts and the first one is resumed as well. However, the first one now sees OutputType as FL.
Next, both threads are using the same structure, but one of them is freeing the memory. If the other thread is still using it after the other releases then you'll get undefined behaviour in the thread that is still using it. This will probably result in a crash.
Your attempt to wait for the threads to exit is wrong. You don't need the mutex and you certainly don't need to be spinning on it testing for zero. Just use WaitForMultipleObjects:
HANDLE handles[2];
handles[0] = (HANDLE)_beginthread(...);
handles[1] = (HANDLE)_beginthread(...);
WaitForMultipleObjects(2, handles, TRUE, INFINITE);
This will stop your main thread from spinning around wasting cpu cycles. When the wait returns you'll know both threads have finished.
Putting this all together should give you something like this:
HANDLE handles[2];
t *arg1=malloc(sizeof(t));
arg1->OutputType=FH;
handles[0] = (HANDLE)_beginthread(DoGATrainAndRun, 32768, (void*)arg1);
t *arg2=malloc(sizeof(t));
arg2->OutputType=FL;
handles[1] = (HANDLE)_beginthread(DoGATrainAndRun, 32768, (void*)arg2);
WaitForMultipleObjects(2, handles, TRUE, INFINITE);
free(arg1);
free(arg2);
And don't release any memory in DoGATrainAndRun.
I'm trying to create a detached thread so I won't need to free the memory allocated for it.
Valgrind is used to check for memory leaks.
I've used IBM example and written:
void *threadfunc(void *parm)
{
printf("Inside secondary thread\n");
return NULL;
}
int main(int argc, char **argv)
{
pthread_t thread;
int rc=0;
rc = pthread_create(&thread, NULL, threadfunc, NULL);
sleep(1);
rc = pthread_detach(thread);
return 0;
}
this works and doesn't create leaks, but a version without "sleep(1);" doesn't.
Why is this sleep(1) needed?
I'm trying to create a detached thread so I won't need to free the
memory allocated for it.
In this case, pthread_detach() is not required and hence should not be used. Additionally, in this code snippet you have not done any explicit memory allocation, so you should not worry about freeing memory.
Why is this sleep(1) needed?
When you create a new thread, the parent and child threads can start executing in any order.
It depends on the OS scheduler and other factors. In this case, if the parent thread gets
scheduled first, it is possible that it goes on and exits the program before the child thread
starts execution.
By adding sleep in the parent context, the child thread gets time to start and finish its execution before the program exits. But this is not a good idea, as we do not know how much time the child thread will take. Hence pthread_join() should be used in the parent context. For detailed information, please refer to the POSIX thread documentation and the great article at the link below:
https://computing.llnl.gov/tutorials/pthreads/
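If the thread really is meant to be detached, one option is to create it detached from the start through the creation attributes, so there is no window between pthread_create() and pthread_detach(). The sketch below assumes a hypothetical helper start_detached; threadfunc mirrors the function from the question. It does not remove the underlying race: if main() returns before the thread has finished, the process still ends with the thread in mid-flight, which is a likely reason for a leak checker to complain.

```c
#include <pthread.h>
#include <stdio.h>

/* Same shape as the thread function in the question. */
void *threadfunc(void *parm)
{
    (void)parm;
    printf("Inside secondary thread\n");
    return NULL;
}

/* Hypothetical helper: starts start_routine in a thread that is detached
   from birth. Returns 0 on success or a pthread error number. */
int start_detached(void *(*start_routine)(void *), void *arg)
{
    pthread_attr_t attr;
    pthread_t tid;
    int rc = pthread_attr_init(&attr);

    if (rc != 0)
        return rc;
    rc = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    if (rc == 0)
        rc = pthread_create(&tid, &attr, start_routine, arg);
    pthread_attr_destroy(&attr);   /* the attribute object is no longer needed */
    return rc;
}
```

A caller would write start_detached(threadfunc, NULL) and then either keep the process alive by other means or leave main() with pthread_exit(NULL) rather than return.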