Method of Passing Thread Parameters Causing Race Condition? - c

I have a question about a program I am writing, one which requires some multi-threading. The thread function requires a couple of parameters, which I am passing by doing the following:
Having a struct defined for the parameters (lets call it ThreadDataStruct)
Initializing this struct and filling the values (call our ThreadDataStruct instance THREAD_DATA)
Passing the values on to my thread during CreateThread by setting lpParameter to a pointer to the struct (&THREAD_DATA)
Re-cast the LPVOID to ThreadDataStruct in the thread function
The issue is that my function which creates the threads returns directly after the call to CreateThread. My question is: can returning so soon after creating the thread cause the thread to not get its parameters? If the struct was created in the function and then the function returns, won't the parameters for the thread function be destroyed if it can't get them fast enough?
Here is some POC to show what I mean:
Definition of our struct:
typedef struct ThreadDataStruct
{
int Number;
};
Our function we are running inside:
void Function()
{
ThreadDataStruct THREAD_DATA;
THREAD_DATA.Number = 1;
CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE)ThreadFunction, &THREAD_DATA, 0, NULL);
return;
}
Our thread function:
void ThreadFunction(LPVOID lpParam)
{
ThreadDataStruct THREAD_DATA = *(ThreadDataStruct *)lpParam;
int NumberFromStruct = THREAD_DATA.Number;
return;
}
Do I need to find another way to pass the parameters safely?

Transferring comments into an answer.
Yes, you have a race condition; and yes, you need to have a different way of passing the parameters to the function safely. The variable passed to the thread function is (normally) passed by address, so it needs to last as long as the thread needs to access it. In your code, the function returns and that means the variable is no longer valid, so all hell breaks loose.
Thank you for clearing that up. What methods are there to pass the parameters safely? I guess I could put a Sleep(5000) after CreateThread(), but that obviously is not the best way to do this.
One option is to have Function() wait for the the thread to complete before it returns. Another option is to have Function() wait on a semaphore, and the ThreadFunction() would signal on the semaphore when it has finished reading its control data.
Another option, probably the best, is to have the function that calls Function() pass in a ThreadDataStruct * that can be used, and for it to keep that around until all the threads have terminated. So, if it is main() that calls Function(), then main() might have an array of ThreadDataStruct thread_params[32]; and it could pass &thread_params[i] to the ith call to Function() (probably within a loop).
void Function(ThreadDataStruct *THREAD_DATA)
{
THREAD_DATA->Number = 1;
CreateThread(NULL, 0, ThreadFunction, THREAD_DATA, 0, NULL);
}
int main(void)
{
ThreadDataStruct thread_params[32];
for (int i = 0; i < 32; i++)
Function(&thread_params[i]);
…code to wait for threads to exit…
}
Changes in Function() include:
Add argument.
Initialize argument via pointer.
Pass argument directly, not &THREAD_DATA.
Don't cast the thread function pointer type; make sure ThreadFunction() is the correct type to be called by CreateThread().
There's no virtue in a return; at the end of a void function.
No doubt there are other equivalent techniques available too; these come to mind immediately. Normally, though, the controlling code has some sort of array of the data structures available, and passes one of the elements of the array to each invocation of the thread-launching function.

Related

Main thread and worker thread initialization

I'm creating a multi-thread program in C and I've some troubles.
There you have the function which create the threads :
void create_thread(t_game_data *game_data)
{
size_t i;
t_args *args = malloc(sizeof(t_args));
i = 0;
args->game = game_data;
while (i < 10)
{
args->initialized = 0;
args->id = i;
printf("%zu CREATION\n", i);//TODO: Debug
pthread_create(&game_data->object[i]->thread_id, NULL, &do_action, args);
i++;
while (args->initialized == 0)
continue;
}
}
Here you have my args struct :
typedef struct s_args
{
t_game_data *object;
size_t id;
int initialized;
}args;
And finally, the function which handle the created threads
void *do_action(void *v_args)
{
t_args *args;
t_game_data *game;
size_t id;
args = v_args;
game = args->game;
id = args->id;
args->initialized = 1;
[...]
return (NULL);
}
The problem is :
The main thread will create new thread faster than the new thread can init his variables :
args = v_args;
game = args->game;
id = args->id;
So, sometime, 2 different threads will get same id from args->id.
To solve that, I use an variable initialized as a bool so make "sleep" the main thread during the new thread's initialization.
But I think that is really sinful.
Maybe there is a way to do that with a mutex? But I heard it wasn't "legal" to unlock a mutex which does not belong his thread.
Thanks for your answers!
The easiest solution to this problem would be to pass a different t_args object to each new thread. To do that, move the allocation inside the loop, and make each thread responsible for freeing its own argument struct:
void create_thread(t_game_data *game_data) {
for (size_t i = 0; i < 10; i++) {
t_args *args = malloc(sizeof(t_args));
if (!args) {
/* ... handle allocation error ... */
} else {
args->game = game_data;
args->id = i;
printf("%zu CREATION\n", i);//TODO: Debug
if (pthread_create(&game_data->object[i]->thread_id, NULL,
&do_action, args) != 0) {
// thread creation failed
free(args);
// ...
}
}
}
}
// ...
void *do_action(void *v_args) {
t_args *args = v_args;
t_game_data *game = args->game;
size_t id = args->id;
free(v_args);
args = v_args = NULL;
// ...
return (NULL);
}
But you also write:
To solve that, I use an variable initialized as a bool so make "sleep"
the main thread during the new thread's initialization.
But I think that is really sinful. Maybe there is a way to do that
with a mutex? But I heard it wasn't "legal" to unlock a mutex which
does not belong his thread.
If you nevertheless wanted one thread to wait for another thread to modify some data, as your original strategy requires, then you must employ either atomic data or some kind of synchronization object. Your code otherwise contains a data race, and therefore has undefined behavior. In practice, you cannot assume in your original code that the main thread will ever see the new thread's write to args->initialized. "Sinful" is an unusual way to describe that, but maybe appropriate if you belong to the Church of the Holy C.
You could solve that problem with a mutex by protecting just the test of args->initialized in your loop -- not the whole loop -- with a mutex, and protecting the threads' write to that object with the same mutex, but that's nasty and ugly. It would be far better to wait for the new thread to increment a semaphore (not a busy wait, and the initialized variable is replaced by the semaphore), or to set up and wait on a condition variable (again not a busy wait, but the initialized variable or an equivalent is still needed).
The problem is that in create_thread you are passing the same t_args structure to each thread. In reality, you probably want to create your own t_args structure for each thread.
What's happening is your 1st thread is starting up with the args passed to it. Before that thread can run do_action the loop is modifying the args structure. Since thread2 and thread1 will both be pointing to the same args structure, when they run do_action they will have the same id.
Oh, and don't forget to not leak your memory
Your solution should work in theory except for a couple of major problems.
The main thread will sit spinning in the while loop that checks the flag using CPU cycles (this is the least bad problem and can be OK if you know it won't have to wait long)
Compiler optimisers can get trigger happy with respect to empty loops. They are also often unaware that a variable may get modified by other threads and can make bad decisions on that basis.
On multi core systems, the main thread may never see the change to args->initiialzed or at least not until much later if the change is in the cache of another core that hasn't been flushed back to main memory yet.
You can use John Bollinger's solution that mallocs a new set of args for each thread and it is fine. The only down side is a malloc/free pair for each thread creation. The alternative is to use "proper" synchronisation functions like Santosh suggests. I would probably consider this except I would use a semaphore as being a bit simpler than a condition variable.
A semaphore is an atomic counter with two operations: wait and signal. The wait operation decrements the semaphore if its value is greater than zero, otherwise it puts the thread into a wait state. The signal operation increments the semaphore, unless there are threads waiting on it. If there are, it wakes one of the threads up.
The solution is therefore to create a semaphore with an initial value of 0, start the thread and wait on the semaphore. The thread then signals the semaphore when it is finished with the initialisation.
#include <semaphore.h>
// other stuff
sem_t semaphore;
void create_thread(t_game_data *game_data)
{
size_t i;
t_args args;
i = 0;
if (sem_init(&semaphore, 0, 0) == -1) // third arg is initial value
{
// error
}
args.game = game_data;
while (i < 10)
{
args.id = i;
printf("%zu CREATION\n", i);//TODO: Debug
pthread_create(&game_data->object[i]->thread_id, NULL, &do_action, args);
sem_wait(&semaphore);
i++;
}
sem_destroy(&semaphore);
}
void *do_action(void *v_args) {
t_args *args = v_args;
t_game_data *game = args->game;
size_t id = args->id;
sem_post(&semaphore);
// Rest of the thread work
return NULL;
}
Because of the synchronisation, I can reuse the args struct safely, in fact, I don't even need to malloc it - it's small so I declare it local to the function.
Having said all that, I still think John Bollinger's solution is better for this use-case but it's useful to be aware of semaphores generally.
You should consider using condition variable for this. You can find an example here http://maxim.int.ru/bookshelf/PthreadsProgram/htm/r_28.html.
Basically wait in the main thread and signal in your other threads.

Initialize global variable in Apache server

I have a simple apache web server with a hook function and a handler function.
int globalVar1;
int globalVar2;
static void register_hooks(apr_pool_t *pool)
{
globalVar1 = 9; /* or any other value... */
/* Create a hook in the request handler, so we get called when a request arrives */
ap_hook_handler(example_handler, NULL, NULL, APR_HOOK_LAST);
}
static int example_handler(request_rec *r)
{
printf("globalVar1=%d", globalVar1); /* print 9 */
globalVar2 = 9; /* or any other value... */
printf("globalVar2=%d", globalVar2); /* sometimes print 9 and sometimes not - race condition. */
/* do something... */
return OK;
}
I noticed that when I initialize globalVar1 in the hook, the variable has the same value as I initialized in the hook,
although the hook and the handler are been called on different processes.
1. what is the reason for that kind of behavior?
As a result, I decided to move the variable initialization to the handler function (globalVar2).
The problem I noticed is happening when the handler gets 2 requests at the same time and therefore the variable is not being initialized correctly.
So if I want to avoid race condition, I need to use lock on the variable initialization, but if I want to use lock, I need to initialize the lock before
and again I have the problem of initialization in multi threaded system.
2. How can I use lock in this kind of situation?
By mentioning variable initialization, I mean any initialization, even calling another function to initialize a global struct.
It could be much easier if I could share memory between the two processes (hook and handler), but from the investigation I did - it is impossible.
To make sure the initialization is done only once in a multithreaded situation, use a function CallOnce, that internally makes sure it is called exactly once.
For example: C11 threads.h call_once, or pthread_once.

Passing a struct by value to another function initializes it to zero

I am trying to create a thread library and my thread is a struct type. Have to follow a certain interface and in that I need to pass the thread by value. For ex: to join on a thread my code is as follows:
int thread_join(thread_t thread, void **status1)
{
printf("Joining thread\n");
long int thId = thread.id;
printf("Thread id: %ld\n", thId);
gtthread_t * thrd = getThreadFromID(thId);
while(thrd->status != EXIT)
{
}
status1 = &(thrd->ret_value);
return 0;
}
And I an passing a struct of type thread_t to this function. My problem is when I see the thread's ID in the calling function, its displayed properly but when I check it in the thread_join function its displayed as 0. The caller function is as follows:
void* caller(void* arg)
{
thread_t th;
thread_create(&th, some_function, NULL);
thread_join(th, NULL);
while(1);
}
Thread create initializes the ID of the thread to a non-zero value and starts the function associated with it.
My thread structure (and other relevant structure is):
typedef enum
{
RUNNING,
WAITING,
CANCEL,
EXIT
} stat;
//Thread
typedef struct
{
ucontext_t t_ctxt;
long int id;
stat status;
void * ret_value;
int isMain;
} thread_t;
int thread_create(thread_t *thread, void *(*start_routine)(void *), void *arg)
{
thread = (thread_t *)malloc(sizeof(thread_t));
thread->id = ++count;
thread->status = RUNNING;
thread->ret_value = NULL;
thread->isMain = 0;
if(getcontext(&(thread->t_ctxt)) == -1)
handle_error("getcontext");
thread->t_ctxt.uc_stack.ss_sp = malloc(SIGSTKSZ);
thread->t_ctxt.uc_stack.ss_size = SIGSTKSZ;
thread->t_ctxt.uc_link = &sched_ctxt;
makecontext(&thread->t_ctxt, (void (*)(void))wrap_func, 2, (void (*)(void))start_routine, arg);
enqueue(gQ, thread);
printf("Thread id: %ld\n", thread->id);
swapcontext(&(curr_thread->t_ctxt),&sched_ctxt);
return 0;
}
Why does this happen? After all, I am passing by value and this should create a copy of the thread with the same values. Thanks.
EDIT:
Basically I am having a queue of threads and there is a scheduler which round-robins. I can post that code here too but I'm sure that's needless and that code works fine.
EDIT2:
I am making a header file from this code and including that header in another file to test it. All my thread_t variables are static. The caller is a function which includes my header file.
What is this line:
thread = (thread_t *)malloc(sizeof(thread_t));
for?
You pass in to thread_create() an address which referrs to a struct thread_t defined in caller() as auto variable.
Doing as you do, you allocate memory to the pointer passed in to thread_create() initialise it and forget the address on return.
The code never writes to the memory being referenced by the address passed in! Besides this it is a memory leak.
To fix this simply remove the line of code quoted above.
You have no mutex guard on thread id getter. Presumably, there is no guard on setter. What can be happening is that the variable is not visible in the other thread yet. And, without a critical section, it may never become visible.
Each variable which is accessed for both read and write from different threads has to be accessed in a critical section (pthread_mutex_lock / unlock).
Another possibility is that you are setting the thread id inside the running thread and you are accessing the variable even before it is set. If you attempt to join immediately after starting a thread it is possible, that the other thread hasn't been run at all yet and the variable is not set.
side note: do yourself a favor and use calloc:)
In caller function,
thread_create(&th, some_function, NULL);
should be
gtthread_create(&th, some_function, NULL);

passing a local variable to thread. is it possible?

i'm working on gcc ,
i'm wondering if this is possible:
I have a function (NOTmain but aLocalFn) and I declare a local variable in it. Then I pass this local argument as a thread argument. is it doable? or there is the chance (depending on what is run first) that the aLocalVar will be lost before threadFunction is run and the reference idxPtr will be pointing to senselessness??
int *threadFunction(void *idxPtr){
int rec_idx=(int) *idxPtr;
//work in the thread with this variabel rec_idx
}
int aLocalFn(){
int aLocalVar=returnsRecordIndex();
pthread_create(&thread_id,&attr_detached,threadFunction, &aLocalVar)!=0)
return 0;
}
thank you for your help
This code is incorrect. The function aLocalFn may return before the thread function starts executing. And so by the time the thread function reads the local variable, the scope of that variable may have ended.
What can confuse matters is that this code may very well appear to work, at least some of the time. However, it is incorrect and you should use heap allocated memory instead.
your code has a life-time issue with "aLocalVar"
if you just want to pass an integer, here is a non-portable way to do it.
it does not work on some platforms, but you are not likely to encounter those.
void threadFunction ( void * idxptr ) {
int rec_idx = (int) idxptr;
....
}
int rec_idx = returnsRecordIndex();
pthread_create (&thread1, &attr_detached, (void *) &threadFunction, (void *)rec_idx);
It's doable, but it's not done in the code in your question. You will have to add a signal variable to indicate when the new thread is done using the variable. Then your outer function can return.
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t signal = PTHREAD_COND_INITIALIZER;
int done;
int *threadFunction(void *idxPtr){
int rec_idx=(int) *idxPtr;
pthread_mutex_lock(&lock);
done = 1;
pthread_cond_signal(&signal);
pthread_mutex_unlock(&lock);
//work in the thread with this variabel rec_idx
}
int aLocalFn(){
int aLocalVar=returnsRecordIndex();
done = 0;
pthread_create(&thread_id,&attr_detached,threadFunction, &aLocalVar)!=0)
pthread_mutex_lock(&lock);
while (!done)
pthread_cond_wait(&signal, &lock);
pthread_mutex_unlock(&lock);
return 0;
}
Note that this example code is itself not thread safe (if multiple threads call aLocalFn).
This does complicate the code, and locking is expensive. So in most cases you're probably better off storing the data in the heap and letting the new thread or pthread_join code free it.
#pizza's answer is what I'd do. Another way for you to do it would be to use malloc/free as #David hinted at. I would certainly do this over the wait loop proposed in other answers here.
int *threadFunction(void *idxPtr){
int rec_idx = *(int *)idxPtr;
// free up our int buffer
free(idxPtr);
...
}
int aLocalFn(){
int aLocalVar = returnsRecordIndex();
// allocate some space for our int
int *intBuf = (int *)malloc(sizeof(int));
*intBuf = aLocalVar;
pthread_create(&thread_id,&attr_detached,threadFunction, intBuf)!=0)
return 0;
}
Whenever you are passing variables to a thread function, it is your job to ensure that the variable remains alive and valid till the thread function is done using it.
In your case aLocalFn() continues to execute simultaneously with the new thread and may even finish execution before the thread, that leaves you with an dangling pointer(pointer pointing to data that may not exist) in thread function since the local variable aLocalVar in the function ceases to exist after function returns.

What is the difference between a thread entry function and a normal function?

I would like to know the difference between a thread entry function:
void* thread_function (void* parameter)
{
struct parameter * thread_data = (struct parameter *)parameter;
char buffer[20];
int temp;
printf_buffer(buffer);
}
and a normal function:
void printf_buffer(char *buffer)
{
printf("buffer is %s",buffer);
return;
}
I know a thread entry is called when a thread is created, and how normal functions are used.
Are there any other differences between thread entry functions and a normal functions in terms of execution , behavior, or creating instances?
There is no difference in the language between what you call a "thread function" (although Justin has edited to call it a "thread entry function") and what you call a "normal function".
With pthreads, the so-called "start routine" of a thread is a function that takes a single void* parameter and returns void*, but there's nothing to stop you calling the same function "normally".
When the start routine of a thread returns, the thread finishes execution, but that's just because the threading implementation calls it, and then finishes the thread. It's not because of anything special to do with the start routine itself.
A thread function is just the entry/exit point of a thread. The execution of that function is no different from what you refer to as a normal function.
man pthread_create defines it quite well:
http://linux.die.net/man/3/pthread_create
one major difference that has not been mentioned yet is that the thread entry should expect to operate on a stack other than the caller's. for this reason, you will often want to pass heap copies of resources as your argument when your thread is detached, then free it in the entry:
// the caller must pass a heap copy of struct parameter* arg
void* detached_entry(void* arg) {
struct parameter* parameter = (struct parameter*)arg;
...
free(parameter);
}

Resources