Understanding pthread_create arguments in C

In the example at
https://computing.llnl.gov/tutorials/pthreads/samples/hello.c
the statement rc = pthread_create(&threads[t], NULL, PrintHello, (void *)t); passes the variable t itself as the fourth argument rather than the address of that variable. Is this code correct? If so, how can a variable be cast to void *?
The link seems to be popular: it is the first Google result for pthreads.

It is a bit unusual, but it does what it is supposed to.
The fourth argument is handed to the PrintHello start routine as its argument. It has to be passed as a void *.
Typically you pass a pointer to a dynamically allocated object, cast to void *. Here the author instead defines a long t, converts its value to void * (so the number sits where an address would normally be) and passes that. PrintHello converts it back to a long, so it works on common platforms, but it is a bit ugly, and it would go horribly wrong if the thread treated the value as a real pointer and tried to access the memory it "points" to.
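The round trip described above can be sketched as follows. This is only an illustration, not the tutorial's code: it uses intptr_t from <stdint.h> instead of long so that the integer is as wide as a pointer, and the names echo_id and run_echo are made up for the example. The result of converting an integer to a pointer is implementation-defined, so treat this as a sketch that works on common platforms.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Start routine: treats its argument as a number, never as an address. */
static void *echo_id(void *arg)
{
    intptr_t id = (intptr_t)arg;   /* convert back; do not dereference */
    return (void *)(id + 100);     /* hand a number back the same way */
}

/* Hypothetical helper: runs one thread with the given id and returns
   the number the thread handed back (id + 100). */
long run_echo(long id)
{
    pthread_t th;
    void *result = NULL;

    int rc = pthread_create(&th, NULL, echo_id, (void *)(intptr_t)id);
    assert(rc == 0);
    rc = pthread_join(th, &result);
    assert(rc == 0);
    (void)rc;
    return (long)(intptr_t)result;
}
```

Calling run_echo(7) from main should give 107; at no point is the "pointer" dereferenced.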

Yes, this code works, as long as the thread does not try to access the memory "pointed to" by the parameter. Just convert it back to a long in the thread:
tid = (long)threadid;
This converts the pointer value to a long without touching the memory at that address, which is most likely not valid memory at all.
For example, if you did:
tid = *(long *)threadid;
that would most likely cause an access violation, because you would be reading the memory at the address stored in threadid.
If you would rather pass a pointer to a long, you could do something like this:
...
/* in the creating thread */
long *pint = malloc(sizeof *pint);
*pint = t;
rc = pthread_create(&threads[t], NULL, PrintHello, pint);
/* start routine: takes ownership of the allocation and frees it */
void *PrintHello(void *threadid)
{
    long *tid = threadid;
    printf("Hello World! It's me, thread #%ld!\n", *tid);
    free(tid);
    pthread_exit(NULL);
}
But that requires the use of malloc and free.
Keep in mind that on common platforms a pointer is represented as a 32- or 64-bit address. You can store an arbitrary number in a pointer variable (the result of the conversion is implementation-defined); just don't try to access the memory it points to.
Hope that helps,
-Dave

The fourth argument is the parameter passed to the new thread. If a value needs to be handed from the main thread to the newly created one, this is how it is done. For example:
Let's say I create a thread from the main loop:
Int32 l_threadid = pthread_create(&l_updatethread, NULL, Thread, &l_filter);
Note that, despite the variable name, pthread_create returns 0 on success or an error number, not a thread ID; the ID itself is stored in l_updatethread. Here I am passing the address of a value, which the new thread then uses in the following way:
void *Thread(void *p_parameter)
{
    int *l_thread_filter = (int *)p_parameter;
    /* ... then play around with this variable ... */
    return NULL;
}
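A minimal sketch of this pattern, under the assumption that the creating function joins the thread before the variable goes out of scope (the names double_filter and run_filter are invented for the illustration):

```c
#include <assert.h>
#include <pthread.h>

/* Start routine: the argument is the address of an int owned by the caller. */
static void *double_filter(void *p_parameter)
{
    int *filter = (int *)p_parameter;
    *filter *= 2;   /* safe only because the caller joins before reading */
    return NULL;
}

/* Hypothetical helper: passes the address of a local variable and keeps
   that variable alive until the thread has finished. */
int run_filter(int initial)
{
    int l_filter = initial;
    pthread_t th;

    int rc = pthread_create(&th, NULL, double_filter, &l_filter);
    assert(rc == 0);            /* rc is an error code, not a thread ID */
    rc = pthread_join(th, NULL);
    assert(rc == 0);
    (void)rc;
    return l_filter;            /* the thread's change is visible after the join */
}
```

Without the pthread_join, run_filter could return while the thread still holds a pointer to l_filter, which is exactly the dangling-pointer problem discussed further down.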

Related

Meaning of C construct

In the code below, what is the role of the int sid=*(int*)args construct?
void *thread_p(void *args)
{
    int sid=*(int*)args,i,size=0; /* What does this initialisation in the thread function mean? */
    char msg[100];
    while(1)
    {
        for(i=0;i<100;i++)
            msg[i]='\0';
        recv(sid,msg,100,0);
        printf("\nClient:%s",msg);
        printf("\nServer:");
        gets(msg);
        size=strlen(msg);
        send(sid,msg,size,0);
        if((strcmp(msg,"exit"))==0)
        {
            close(sid);
            exit(1);
        }
    }
}
I'm going to make an assumption here that thread_p is a pthread_create start routine. Such a start routine is required to accept a void pointer, which in C and C++ means a pointer to an untyped block of memory. Now, in addition to a start routine, pthread_create also accepts a void pointer argument, which it passes to the start routine. It is up to the caller of pthread_create to coordinate the call to pthread_create with the code in the start routine. So if the start routine given to pthread_create assumes the void pointer points to, say, an integer, then the call to pthread_create must pass a pointer to an integer as the arg parameter to keep things in sync. This is a type-unsafe interface. The compiler won't help you make sure these are in sync, and if they are not, you'll get undefined behavior.
Your thread_p function is assuming that the arg passed to it actually points to an integer. It is first casting the void pointer to an integer pointer (that is the (int*)args part) and then dereferencing the pointer to get the integer that it points to (that's the * in front of the (int*)args). If the caller of pthread_create passed it thread_p, but failed to pass the address of an int as the arg parameter, the code would have undefined behavior.

Safe way to pass parameters into a thread

Can you clarify why the following code is a safe way to pass parameters into a new thread:
//Listing 5.3 Passing a Value into a Created Thread
for ( int i=0; i<10; i++ )
    pthread_create( &thread, 0, &thread_code, (void *)i );
And the following code isn't:
//Listing 5.4 Erroneous Way of Passing Data to a New Thread
for ( int i=0; i<10; i++ )
    pthread_create( &thread, 0, &thread_code, (void *)&i );
Quote from the book regarding the code:
It is critical to realize that the child thread can start executing at any point after the call, so the pointer must point to something that still exists and still retains the same value. This rules out passing in pointers to changing variables as well as pointers to information held on the stack (unless the stack is certain to exist until after the child thread has read the value).
A third method, shown below, is a good one:
static int args[10];
for ( int i=0; i<10; i++ ) {
    args[i] = i;
    pthread_create( &thread, 0, &thread_code, (void *)&args[i] );
}
If you want the same variable shared across all the threads, use a local variable in main or, preferably, a static or global variable.
Issues with method 1 and method 2:
Method 1: You are casting an int to void * and then back to int, which is questionable because int and void * may have different sizes. If you instead cast the void * to int * and dereference it, it is even worse: that is undefined behaviour (UB).
Method 2: You are passing the same address to all threads. When i is changed by the main thread or by any of the 10 worker threads, the change is visible to every thread, which is probably not what you intend. Moreover, the scope of i ends after the for loop, so the threads may end up dereferencing a dangling pointer, which is also UB.
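To make the third method concrete, here is a self-contained sketch built around the same static args array. It differs from the listing in a few ways that are assumptions of this example: it keeps one pthread_t per thread so that the threads can be joined, and it adds a seen array and a run_all helper purely so the effect can be checked.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_THREADS 10

static int args[NUM_THREADS];   /* one slot per thread, outlives the loop */
static int seen[NUM_THREADS];   /* slot v is set by the thread that read v */

static void *thread_code(void *param)
{
    int value = *(int *)param;  /* this slot is never modified after creation */
    seen[value] = 1;
    return NULL;
}

/* Hypothetical helper: starts the threads, joins them, and returns how
   many distinct values were observed. */
int run_all(void)
{
    pthread_t threads[NUM_THREADS];
    int count = 0;

    for (int i = 0; i < NUM_THREADS; i++) {
        args[i] = i;
        int rc = pthread_create(&threads[i], NULL, thread_code, &args[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);
    for (int i = 0; i < NUM_THREADS; i++)
        count += seen[i];
    return count;
}
```

Because every thread reads its own element of args, run_all should report all 10 values regardless of how the threads are scheduled.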
Why is the second example wrong?
As your citation says, you must not pass a pointer to the iteration variable, because it changes quickly. You never know exactly when the concurrent thread will dereference the pointer.
// Listing 5.4 Erroneous Way of Passing Data to a New Thread
for ( int i=0; i<10; i++ )
    pthread_create( &thread, 0, &thread_code, (void *)&i );
Imagine the very first call to pthread_create(). It receives a pointer to i and will probably dereference the pointer and read the value. Your value is supposed to be 0 at the time. But your main thread (the one with the for loop) may have already changed i from 0 to 1. That is called a race condition because your program depends on whether one thread is faster to change the value or the other is faster to get it.
There's a second race condition as well: your i variable goes out of scope at the end of the loop. If the threads are slow to start or to read the pointer target, the address on the stack may already have been reused for something else. You must not dereference pointers to variables that no longer exist.
Why doesn't the first example have the same problem?
The first example uses the value of i, not its address. That is good, as pthread_create() will just hold the value and pass it to the thread.
// Listing 5.3 Passing a Value into a Created Thread
for ( int i=0; i<10; i++ )
    pthread_create( &thread, 0, &thread_code, (void *)i );
But pthread_create() only accepts void * (a generic pointer). The example uses a special trick where you cast the integer value to a pointer value. It is expected that the thread function will do the reverse (will cast the pointer back to integer).
This trick is often used to store an integer value where an object is expected, as it avoids having to allocate and deallocate the object. Whether such a technique is good or bad practice is out of scope of a factual answer. It's being used in frameworks like GLib but I guess many programmers will scorn it.
Final notes
The examples in the book are clearly not solutions to real problems, just motivating examples. In actual code you would rarely pass just an integer value, and you would probably want to join the thread at some point. So in a simple scenario you would allocate the thread arguments, fill them in, start the workers, join the workers, retrieve the results and free the allocations.
In a more complicated scenario you would communicate with the threads while they run, so you wouldn't be limited to feeding them at creation and retrieving the results after joining them. You could even just let the workers run and reuse them whenever you need them.
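The simple scenario (allocate, fill in, start, join, retrieve, free) might look like the sketch below. The struct job type, the square_worker routine and the sum_of_squares helper are all invented for this illustration; the work each thread does is deliberately trivial.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct job {
    int input;    /* filled in by the creator before the thread starts */
    int output;   /* filled in by the worker, read after the join */
};

static void *square_worker(void *arg)
{
    struct job *j = arg;
    j->output = j->input * j->input;
    return NULL;
}

/* Hypothetical helper: squares 0..n-1 in n threads and returns the sum. */
long sum_of_squares(int n)
{
    assert(n > 0);
    pthread_t *threads = malloc((size_t)n * sizeof *threads);
    struct job *jobs = malloc((size_t)n * sizeof *jobs);
    assert(threads != NULL && jobs != NULL);

    for (int i = 0; i < n; i++) {           /* fill in and start */
        jobs[i].input = i;
        jobs[i].output = 0;
        int rc = pthread_create(&threads[i], NULL, square_worker, &jobs[i]);
        assert(rc == 0);
        (void)rc;
    }

    long total = 0;
    for (int i = 0; i < n; i++) {           /* join and retrieve */
        pthread_join(threads[i], NULL);
        total += jobs[i].output;
    }

    free(jobs);                             /* free only after every join */
    free(threads);
    return total;
}
```

For n = 4 the squares are 0, 1, 4 and 9, so sum_of_squares(4) should return 14. The jobs array is freed only after every thread has been joined, so no worker can touch freed memory.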

Why is threadID unique?

I've used POSIX threads a few times in C and I never thought about this until the other day: why is the variable taken from arg given to pthread_create() private, given that all the threads call the same function when they start and run the same code to initialise the same variable (most likely a thread ID)? For example, the code:
#include <stdio.h>
#include <pthread.h>
void* threadMethod(void* arg)
{
    int threadID = (int) arg;
    printf("Thread %d reporting in\n", threadID);
    return NULL;
}
int main()
{
    pthread_t threads[8];
    for (int i = 0; i < 8; i++)
        pthread_create(&threads[i], NULL, threadMethod, (void*) i);
    for (int i = 0; i < 8; i++)
        pthread_join(threads[i], NULL);
}
threadID has a unique value to each thread but I don't understand why, given that it's the same variable in the same method that all threads execute. Shouldn't threads be overwriting each others' value of it? I think it's something to do with stacks. Could someone please clarify what exactly is going on here?
The question should really be "Why do all 8 threads get their own argument?" (private means something else).
The answer is that you are passing by value. The content of the variable is copied into a register (or onto the stack, depending on the calling convention) and is then copied into the parameter arg, which lives in the new thread's own registers or stack. The local variable threadID is likewise created separately on each thread's stack, so the threads never share it.
pthread_create is a C function, so there is no concept of private. The argument to your thread function is a void * because void * is a generic pointer that can point to any type of object. What that memory holds is an agreement between the thread function and the function creating the thread. You are free to use it for a thread ID, but it can really be anything, since each thread may be created with a different start routine and different data.
The reason for the warning is that void * is 64 bits on a typical 64-bit machine, whereas int is typically 32 bits. The compiler is warning you that you may lose data in the cast. Using an integer type as wide as a pointer instead of int, such as size_t on most platforms or intptr_t from <stdint.h>, should remove the warning.
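As a rough sketch of both points (each thread gets its own copy of the argument, and a pointer-sized integer type avoids the cast warning), the question's program could be restructured like this. The reported array and the count_correct_ids helper are additions made only so the result can be checked, and intptr_t is one possible choice of type rather than the only one.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define NUM_THREADS 8

static int reported[NUM_THREADS];   /* slot i is written only by thread i */

static void *threadMethod(void *arg)
{
    /* arg is this thread's own copy of the value given to pthread_create */
    int threadID = (int)(intptr_t)arg;
    reported[threadID] = threadID + 1;
    return NULL;
}

/* Hypothetical helper: returns how many threads saw the ID meant for them. */
int count_correct_ids(void)
{
    pthread_t threads[NUM_THREADS];
    int ok = 0;

    for (int i = 0; i < NUM_THREADS; i++) {
        int rc = pthread_create(&threads[i], NULL, threadMethod,
                                (void *)(intptr_t)i);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);
    for (int i = 0; i < NUM_THREADS; i++)
        if (reported[i] == i + 1)
            ok++;
    return ok;
}
```

All 8 threads run the same function, yet each one sees a different threadID, because the value was copied at the time of each pthread_create call.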

Passing a struct to pthread_create, undefined values after cast?

I am trying to pass a struct as a parameter to pthread_create and seem to be getting some strange results.
The struct has two members, first one being an int and the second one is a function pointer:
typedef struct
{
    int num;
    void (*func)(int);
} foo_t;
The function I am trying to use within the struct:
void myfunc(int mynum)
{
    printf("mynum: %d\n", mynum);
}
This is the declaration of the struct I will be passing to my thread:
foo_t f;
f.num = 42;
f.func = &myfunc;
The call to pthread_create:
pthread_create(&mythread, NULL, mythreadfunc, &f);
And finally, my thread function:
void mythreadfunc(void *foo)
{
    foo_t f = *(foo_t *)foo;
    printf("num: %d\n", f.num); // Prints: num: 32776 (undefined?)
    (*f.func)(f.num); // Segfaults (again, undefined?)
    ...
The cast within mythreadfunc doesn't seem to work, and I can't figure out why. Any suggestions? Thanks.
Papergay's answer is definitely one solution, but if you want to avoid dynamic allocation you can use synchronization instead. My favorite approach is to put a semaphore with an initial value of zero in the struct, have the parent wait on the semaphore, and have the child post it once it has finished reading the values out of the structure.
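A possible sketch of that semaphore handshake is shown below. It assumes POSIX unnamed semaphores (sem_init), which Linux provides but some platforms, notably macOS, do not support. The handoff_t type and the child, start_child and run_handoff names are made up for the example and are not taken from the question.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

typedef struct {
    int   num;
    sem_t copied;   /* posted by the child once it has its own copy */
} handoff_t;

static int child_saw;   /* written by the child, read after the join */

static void *child(void *arg)
{
    handoff_t *shared = arg;
    int num = shared->num;        /* copy out everything we need first */
    sem_post(&shared->copied);    /* after this, shared must not be touched */
    child_saw = num;
    return NULL;
}

/* Starts a thread with a stack-allocated argument and returns only after
   the child has copied it, so the argument can safely go out of scope. */
static int start_child(pthread_t *th, int num)
{
    handoff_t h;
    h.num = num;
    if (sem_init(&h.copied, 0, 0) != 0)
        return -1;
    int rc = pthread_create(th, NULL, child, &h);
    if (rc == 0) {
        while (sem_wait(&h.copied) == -1 && errno == EINTR)
            ;                     /* retry if interrupted by a signal */
    }
    sem_destroy(&h.copied);
    return rc;
}

/* Hypothetical helper: returns the value the child read from the struct. */
int run_handoff(int num)
{
    pthread_t th;
    int rc = start_child(&th, num);
    assert(rc == 0);
    (void)rc;
    pthread_join(th, NULL);
    return child_saw;
}
```

The parent blocks in sem_wait until the child has taken its copy, so h is never destroyed while the child still needs it, and no malloc or free is involved.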
You are passing your foo_t f by reference. If f is modified by the function that calls pthread_create or by anything else, or if it is destroyed (for example because the enclosing scope exits and f is removed from the stack), then the pointer held by your thread is most likely invalid and you must not access it any more.
Use pointers to dynamically allocated objects instead.
I cannot prove that this is what is happening from the code you have presented, though.
Edit:
A pointer stores the address of an object or variable. In C/C++ you can have it point to a dynamically allocated block of memory; look up malloc/new for this. Dereferencing the pointer gives you access to the object itself.
You can pass the pointer by value, which means that you don't pass the object itself but its address (its position in memory). Managing dynamically allocated objects is the programmer's responsibility: the object is not deleted when the pointer goes out of scope; only the pointer variable that stores the address is.
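Applied to the foo_t struct from the question, the dynamic-allocation approach could look roughly like this. It is a sketch, not a diagnosis of the original program: the start routine is given the void *(*)(void *) signature that pthread_create expects, and last_num and run_foo are hypothetical additions used to observe the result.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

typedef struct {
    int num;
    void (*func)(int);
} foo_t;

static int last_num;   /* records what myfunc received */

static void myfunc(int mynum)
{
    last_num = mynum;
}

/* Start routine: takes void * and returns void *, as pthread_create expects. */
static void *mythreadfunc(void *arg)
{
    foo_t *f = arg;    /* heap object: stays valid until we free it */
    f->func(f->num);
    free(f);           /* the thread owns the allocation */
    return NULL;
}

/* Hypothetical helper: returns the value that reached myfunc, or -1. */
int run_foo(int num)
{
    foo_t *f = malloc(sizeof *f);
    if (f == NULL)
        return -1;
    f->num = num;
    f->func = myfunc;

    pthread_t th;
    if (pthread_create(&th, NULL, mythreadfunc, f) != 0) {
        free(f);       /* thread never started, so we still own f */
        return -1;
    }
    int rc = pthread_join(th, NULL);
    assert(rc == 0);
    (void)rc;
    return last_num;
}
```

Here the struct lives on the heap, so it stays valid no matter when the creating function returns; ownership passes to the thread, which frees it.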

How to pass parameters to a thread in C multithreading properly

I'm trying to learn C multithreading, and I've seen a couple of strange things.
I understand that passing parameters to a thread must be done with pointers. I've found an example which I don't understand. I'll copy the relevant lines:
pthread_t tid[MAX_THREADS];
int n_veg;
pthread_create(&tid[n], NULL, caracter, (void *)n_veg);
caracter is obviously a predeclared function.
Now, why do we use a void pointer cast instead of an int pointer cast? Is there any relevant difference?
Secondly, why do we use a pointer casting in the first place? Can't we use "&n_veg" like with the first parameter?
Thanks in advance.
Since both your questions are related, I'll answer them together: pthread_create takes a void * parameter, so you can really pass in any pointer you want. In this case, though, we aren't actually passing a pointer, just a simple integer value cast to a pointer. That means you will access it like this in caracter:
int value = (int)n_veg;
As you mentioned, you could very well pass an actual pointer as &n_veg and retrieve the value like this:
int value = *(int *)n_veg;
In fact, in most cases, you will need to pass more data than just an integer, such as a structure, and in that case, you must pass a pointer, as you can't simply cast it to a pointer like an integer.
One thing to keep in mind when you pass a pointer is that n_veg must not go out of scope as long as the thread is running. For example, if you do:
void test() {
    int n_veg;
    pthread_create(&tid[n], NULL, caracter, &n_veg);
}
then &n_veg will be invalid as soon as test returns, but the thread may still be running and will be holding an invalid address. Because of this, structures passed to threads are normally dynamically allocated, say using malloc, and the thread can free it once it has completed.
pthread_create is defined as follows:
int pthread_create(pthread_t *restrict thread, const pthread_attr_t *restrict attr,
                   void *(*start_routine)(void *), void *restrict arg);
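So it expects a void * as its last parameter. If you pass an integer such as n_veg without the cast, the compiler will warn you; an object pointer such as &n_veg converts to void * implicitly and needs no cast.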
