Thread creation in C

void print_hello_world() {
pid_t pid = getpid();
printf("Hello world %d\n", pid);
pthread_exit(0);
}
void main() {
pthread_t thread;
pthread_create(&thread, NULL, (void *) &print_hello_world, NULL);
print_hello_world();
}
I can't understand why the (void *) cast is needed in pthread_create. Also, do we need "&print_hello_world" there, or could we drop the "&"? I have read somewhere that when passing function pointers we don't need to place "&" before the function name.

Yes, there's no need for cast or & operator there:
pthread_create(&thread, NULL, print_hello_world, NULL);
should suffice as a function name gets converted into a function pointer when passing as argument to function(s).
Note that the function pointer you pass to pthread_create() takes a void* as argument and returns a void *. So your function should be:
void* print_hello_world(void *unused) {
...
}
It's the C way of implementing a "generic" data type. For example, you can pass an int* or a struct args* to the thread function and retrieve it there.
E.g.
int i=5;
pthread_create(&thread, NULL, print_hello_world, &i);
and in the function print_hello_world(), you would do:
void *print_hello_world(void *value) {
int i = *(int*)value;
...
}
Basically void* allows you to pass any data pointer to the thread function here. If pthread_create()'s thread function were to take int*, you wouldn't be able to pass a struct args* to it, for example.
I suggest you read the following posts for more information on void pointer and its usage in C:
Concept of void pointer in C programming
and
What does void* mean and how to use it?

Casting a function pointer to/from void * is actually undefined behaviour. See 6.3.2.3, especially p1 and p8. Note that functions are not objects in C.
So the cast is in fact wrong, and the address-of operator & is unnecessary. You can, however, cast one function pointer type to another (see p8 of the reference). But here you should definitely give your function the proper signature, as there is a reason it takes a pointer and returns one, too. So: do not cast, but get the signature (and semantics) correct.
Note: An empty identifier-list in a function declaration which is part of the definition is an obsolescent feature. Using the prototype-style (void) for an empty argument list is recommended. Also the minimal required signature for main is int main(void) (with respect to the above said).

Related

Why is retval a void** in pthread_join?

I am having a hard time understanding why pthread_join's retval argument is a void**. I have read the manpage and tried to wrap my head around it but I still cannot fully understand it. I couldn't convince myself that retval cannot be a void*. Could someone please enlighten me?
Thank you very much in advance!
It's because you are supposed to supply the address of a void* to pthread_join.
pthread_join will then write the address supplied by pthread_exit(void*) into the variable (whose address you supplied).
Example scenario:
typedef struct {
// members
} input_data;
typedef struct {
// members
} output_data;
Starting thread side:
input_data id;
pthread_create(..., start_routine, &id);
void* start_routine(void *ptr) {
input_data *id = ptr;
output_data *od = malloc(sizeof *od);
// use the input data `id`, populate the output data `od`.
pthread_exit(od);
}
Joining side:
output_data *od;
pthread_join(thread, (void**) &od); // thread: the pthread_t filled in by pthread_create
// use `od`
free(od);
Simple enough. The return value of thread func supplied to pthread_create is void*; pthread_join is supposed to return this value to caller.
It cannot return this as the function's return value (because it is already returning an int to indicate the overall status of the call). The only other way is through an out parameter.
And the way C does out parameters is by using a pointer to the actual type of the parameter, i.e. if you want to have an int as an out parameter, the type of the argument would be int*. If your out parameter is void* (because this is what you are returning from the pthread func!), the type of the argument becomes void**.
As an exercise, you can try to write similar code yourself: first, create a function which returns void* (say, void* foo()), and then try to write another function which calls foo() and communicates the result back to the caller.
The exiting thread is going to provide a pointer to some data. The pthread routines do not know what type that data has, so they receive the pointer as a void *.
The caller of pthread_join is going to receive that void *. Since the function return value is used for something else, the void * has to be received through a parameter. So the caller has to pass a pointer to where pthread_join will put the void *. That pointer is a pointer to a void *, which is a void **.
From the manpage:
If retval is not NULL, then pthread_join() copies the exit status of the target thread (i.e., the value that the target thread supplied to pthread_exit(3)) into the location pointed to by retval.
Let's look at the signature of pthread_exit.
noreturn void pthread_exit(void *retval);
So that means if we wanted to return an int from our thread it would look something like this:
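void* foo(void *arg) {
// ...
static int value = 255; // static, so it is still valid after the thread ends
pthread_exit(&value);
}
This works because any object pointer, such as an int*, converts implicitly to void*. Note that value must outlive the thread (hence static here); a pointer to an ordinary local variable would dangle once the thread exits.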
Now we want to actually extract the return value of the thread using pthread_join.
void bar() {
pthread_t thread_id;
void *returnValue;
// create thread etc...
// returnValue is a `void*`, so passing its address with "&"
// gives the `void**` that pthread_join expects
pthread_join(thread_id, &returnValue);
printf("%d\n", *(int *)returnValue); // should print 255
}
In plain English, pthread_join takes the address of a pointer and sets that pointer to the retval from your thread. It's a void** because pthread_join needs the address of the pointer in order to change what the pointer holds.

Meaning of returns char when return value is void, and gets an void argument?

I am learning about threading with C, and I'm a bit confused about the example given. They declared a function print_name with return value of void, but then it returns a string — how and why? The function print_name accepts one argument which is called name but it is a pointer of void; what does a variable of type void mean, and how can it accept a string?
main.c
#include <stdio.h> // I-O
#include <pthread.h> // threading
void *print_name(void *name)
{
puts(name);
return name;
}
int main(int argc, char const *argv[])
{
pthread_t thread_id;
pthread_create(&thread_id, NULL, print_name, "ha ones be eilat");
pthread_join(thread_id, NULL);
return 0;
}
To compile and run with gcc
$ cc main.c -o main -Wall -pthread && ./main
ha ones be eilat
The argument type and return type are not void but pointer to void.
In C both argument passing and returning a value with the return statement happen as if by assignment. void * is a generic pointer type and conversions to and from void * from and to other pointer types will happen implicitly, i.e. without a cast.
The string literal, which is an array of char, decays to char * and is implicitly converted to void * to match the prototype of pthread_create. The void * is implicitly converted to char * to match the prototype of puts.
print_name has prototype void * print_name(void *) so that a pointer to that function will match the type expected by pthread_create (The third parameter is void *(*start_routine) (void *).)
The declaration of pthread_create is
int pthread_create(
pthread_t *thread,
const pthread_attr_t *attr,
void *(*start_routine) (void *),
void *arg
);
The type void * is a 'universal pointer' that can point to any object type. (On many, but not all, machines, a void * can also hold a function pointer — however, that's tangential to this question.) In C, any object pointer can be converted to a void * and back to the original type without change. Using void * can be dangerous; it can be extremely useful (and thread creation can show both dangerousness and usefulness).
Contrary to the claim in the question, the function print_name() is defined to return a void * value as well as accept a void * argument. The pthread_create() function expects a (pointer to a) thread function that matches the signature:
void *thread_function(void *);
And that's what is passed to it. Since the function returns a void *, it is legitimate to return the pointer it was passed, though that is actually an unusual thing to do.
The return value from the thread function can be collected by passing an appropriate, non-NULL pointer to pthread_join(); the example does not demonstrate that.
void *result;
if (pthread_join(thread_id, &result) == 0)
printf("Result: <<%s>>\n", (char *)result);
This would, in the example, print Result: <<ha ones be eilat>> on a line. Many times, you'd convert the returned pointer to an explicit non-void pointer — e.g. char *str = result; — and then use that.
Note that there is nothing in C that would stop you calling:
int i = 0;
if (pthread_create(&thread_id, NULL, print_name, &i) != 0)
…thread creation failed…
The wrong type of data is passed, but that will be treated as OK by the compiler. At run-time, if you're (un)lucky, an empty line will be printed, but anything is possible (because of undefined behaviour) as you passed an int * to a function that requires a char * to work correctly. This inability to check types is a weakness of void *, but it is also a strength as it allows the pthread_create() function to pass a pointer to any type of data to a function. The onus is on the programmer to get the types right: the called function must expect to convert the void * parameter to a pointer to the type that was really passed.
Also, the data passed via the pointer to the function needs to be stable, that is, not changed if another thread is started. There is no guarantee about the order in which threads will read the data passed. Passing a pointer to a structure and changing the value in the structure between calls to pthread_create() is a no-no.
Similarly with the return value. There are some additional wrinkles there. The data pointed at must be valid after the function exits, so it can't be a local variable.

Dereferencing void * just as (int) -- standard practice?

I was trying to print a thread's return value and discovered that I'm still quite confused by the notion of double void-pointers.
My understanding was that a void* is a pointer to any datatype that can be dereferenced with an appropriate cast, but otherwise the "levels" of referencing are preserved like with regular typed pointers (i.e. you can't expect to get the same value that you put into **(int **)depth2 by dereferencing it only once like *depth2. ).
In the code (below) that I have scraped together for my thread-return-print, however, it seems that I'm not dereferencing a void pointer at all when I'm just casting it to (int). Is this a case of an address being used as value? If so, is this the normal way of returning from threads? Otherwise, what am I missing??
(I am aware that the safer way to manipulate data inside the thread might be caller-level storage, but I'm quite interested in this case and what it is that I don't understand about the void pointer.)
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
void *myThread(void *arg)
{
return (void *)42;
}
int main()
{
pthread_t tid;
void *res; // res is itself a void *
pthread_create(&tid, NULL, myThread, NULL);
pthread_join(tid, &res); // i pass its address, so void** now
printf(" %d \n", (int)res); // how come am i able to just use it as plain int?
return 0;
}
First of all, the purpose of pthread_join() is to update the void *
given through its second argument in order to obtain the result of the
thread function (a void *).
When you need to update an int as in scanf("%d", &my_var); the argument
is the address of the int to be updated: an int *.
With the same reasoning, you update a void * by providing a void **.
In the specific situation of your example, we don't use the returned
void * in a normal way: this is a trick!
Since a pointer can be thought about as a big integer counting the bytes in
a very long row, the trick is to assume that this pointer can simply store
an integer value which does not refer to any memory location.
In your example, returning (void *)42, is equivalent to saying
"you will find something interesting at address 42".
But nothing has ever been placed at this address!
Is this a problem? No, as long as nobody tries to dereference this
pointer in order to retrieve something at address 42.
Once pthread_join() has been executed, the variable res has
been updated and contains the returned void *: 42 in this case.
We perform here the reverse-trick by assuming that the information memorised
in this pointer does not refer to a memory location but is a simple integer.
It works but this is very ugly!
The main advantage is that you avoid the expensive cost of malloc()/free(),
which the conventional approach, shown here, would require:
void *myThread(void *arg)
{
int *result=malloc(sizeof(int));
*result=42;
return result;
}
...
void *res; // pthread_join expects the address of a void *
pthread_join(tid, &res);
int result=*(int *)res; // obtain 42
free(res);
A better solution to avoid this cost would be to use the parameter
of the thread function.
void *myThread(void *arg)
{
int *result=arg;
*result=42;
return NULL;
}
...
int expected_result;
pthread_create(&tid, NULL, myThread, &expected_result);
pthread_join(tid, NULL);
// here expected_result has the value 42

how to get the function pointer to a function in C?

I have a function in C:
void start_fun()
{
// do something
}
I want to use pthread_create() to create a thread whose start routine is start_fun(). Without modifying void start_fun(), how do I get the function pointer to start_fun()?
If you write the function name start_fun without any parameters anywhere in your code, you will get a function pointer to that function.
However pthread_create expects a function of the format void* func (void*).
If rewriting the function isn't an option, you'll have to write a wrapper:
void* call_start_fun (void* dummy)
{
(void)dummy;
start_fun();
return 0;
}
then pass call_start_fun to pthread_create:
pthread_create(&thread, NULL, call_start_fun, NULL);
The function name, used as an expression, evaluates to a pointer to the named function. Thus, for instance:
pthread_t thread_id;
int result = pthread_create(&thread_id, NULL, start_fun, NULL);
HOWEVER, the start function you present does not have the correct signature, therefore using it as a pthread start function produces undefined behavior. The start function must have this signature:
void *start_fun(void *arg);
The function may ignore its argument and always return NULL, if appropriate, but it must be declared with both the argument and the return value (of those types).

Typecasting to void pointer & back

I have a function:
void *findPos(void *param)
{
int origPos=(int)param;
...
}
Which I am calling as a thread runner:
pthread_create( &threadIdArray[i], NULL, findPos, (void *)i );
Now, this way, I get the value of origPos as the typecasted void pointer param, ie. i. This feels like a dirty hack to get around the limitation of being allowed to pass only void pointers to a thread runner function.
Can this be done in a cleaner way?
Edit:
Please note that I run the pthread_create() function in a for loop over i, hence passing a pointer to i may not be a safe choice.
Sure: just supply a pointer to the int, as was the intention of the API designer:
void *findPos(void *param)
{
int origPos=*(int *)param;
...
}
pthread_create( &threadIdArray[i], NULL, findPos, &i );
Casting between int and void * is unsafe because the conversion is not necessarily invertible.
You must also ensure that i is still valid when the thread starts executing (if i has automatic storage duration, this would eg be the case if the calling function also calls pthread_join()).
In your case (i being a loop variable), you should duplicate the variable's value in a safe location, eg on the heap via malloc() or in storage with an appropriate lifetime:
static int args[THREAD_COUNT];
for(int i = 0; i < THREAD_COUNT; ++i)
{
args[i] = i;
pthread_create(&threadIdArray[i], NULL, findPos, args + i);
}
You should be sure that on your system the value of your parameter has enough room inside a void-pointer (see data type intptr_t). Passing a double value could be problematic with your "direct" method.
I often use a parameter structure to pass values to thread (or other) functions.
struct Param {
double foo;
int bar;
};
struct Param param; // in C the struct keyword is required here
param.foo = 1.0;
param.bar = 1;
pthread_create( &threadIdArray[i], NULL, findPos, &param );
Well, you could pass a pointer to the value, or wrap the value in a struct and pass a pointer to that. The latter isn't cleaner per se, but more expandable if you ever need more than one int worth of parameters to your thread.
UPDATE:
I used to suggest the use of intptr_t from <stdint.h> to express that you intend to cast this integer to/from void *, but reading the documentation a bit more closely (thanks, Christoph) gives:
The following type designates a signed integer type with the property that any valid pointer to void can be converted to this type, then converted back to a pointer to void, and the result will compare equal to the original pointer: intptr_t
This guarantees only the pointer-to-integer-to-pointer round trip, not the reverse, which would seem to indicate, just as Christoph said, that you're not safe if you go this route, so don't.
This is a hack that you shouldn't do if you want to have portable code. First the conversion back from void* is not necessarily well defined, as somebody else stated already.
But regardless of that, this is a dirty hack that goes against all possible intentions of the pthread_create API. Simply use something like this:
size_t * threadId = calloc(n, sizeof(size_t));
for (size_t i = 0; i < n; ++i) {
threadId[i] = i;
pthread_create(...., &threadId[i]);
}
And you don't have the contention on i that you would have if you passed the same argument to all the threads.
I don't believe it's a dirty hack really. Wikipedia in its pthreads example does the same.

Resources