Memory access in pthreads - c

I am writing a unit test that involves running multiple threads and I ran into a memory access issue that I can't seem to understand.
Here is the original (pseudo) code:
void thread_func(void * err) {
/*foo will return an allocated error_type if error occurs else NULL*/
err = (void *) foo(...)
pthread_exit(NULL);
}
void unit_test() {
int i = 0;
err_type *err_array[10];
pthread_t threads[10];
for (i = 0; i < 10; i++) {
pthread_create(&(threads[i]), NULL, thread_func, (void *) err_array[i]);
}
for(i = 0; i < 10; i++) {
pthread_join(threads[i], NULL);
ASSERT_NULL(err_array[i]);
}
}
What I am confused about is that all the threads return NULL (checked with gdb), yet err_array[1] and err_array[5] are not NULL. Instead of a valid err_type, they contain garbage: err_array[1] contains a string holding the unit_test file path, and err_array[5] contains a bunch of out-of-bounds addresses.
A work-around I've found is to use a global err_array and pass each thread the index of its element, as well as initializing all the elements of the array to NULL.
My question is: why do these two changes work when the original code does not?

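The err variable is local to thread_func. It only holds a copy of the (uninitialized) value of err_array[i], so assigning to it never touches the array, and it goes out of scope when thread_func returns. You need to pass the thread a pointer to the thing you want it to modify, not the current value of the thing you want it to modify.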
So:
void *thread_func(void *arg) {
err_type **err = arg;
/*foo will return an allocated error_type if error occurs else NULL*/
*err = foo(...);
return NULL;
}
And:
pthread_create(&(threads[i]), NULL, thread_func, &err_array[i]);
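The fix above can be sketched as a small self-contained program fragment. This is only an illustration under stated assumptions: the err_type struct, the dummy placeholder, and run_threads() are hypothetical names, and the assignment of NULL stands in for the question's call to foo(...), which is not shown.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 10

/* Hypothetical stand-in for the question's err_type. */
typedef struct { int code; } err_type;

/* Non-NULL placeholder, so a NULL slot later proves that the thread wrote it. */
static err_type dummy;

/* Each thread receives the address of its own slot and writes through it. */
static void *thread_func(void *arg)
{
    err_type **slot = arg;
    *slot = NULL;            /* stands in for "*slot = foo(...)" with no error */
    return NULL;
}

/* Returns the number of slots that are still non-NULL after joining. */
static int run_threads(void)
{
    err_type *err_array[NUM_THREADS];
    pthread_t threads[NUM_THREADS];
    int i, rc, failures = 0;

    for (i = 0; i < NUM_THREADS; i++) {
        err_array[i] = &dummy;   /* never hand out an uninitialized slot */
        rc = pthread_create(&threads[i], NULL, thread_func, &err_array[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (i = 0; i < NUM_THREADS; i++) {
        pthread_join(threads[i], NULL);   /* err_array outlives every thread */
        if (err_array[i] != NULL)
            failures++;
    }
    return failures;
}
```

Because err_array lives on the stack of the function that also joins the threads, every slot stays valid for as long as a thread may write to it.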

Related

Why are my threads I created not printed in order?

I have this program:
void *func(void *arg) {
pthread_mutex_lock(&mutex);
int *id = (int *)arg;
printf("My ID is %d\n" , *id);
pthread_mutex_unlock(&mutex);
}
int main() {
int i;
pthread_t tid[3];
// Let us create three threads
for (i = 0; i < 3; i++) {
pthread_create(&tid[i], NULL, func, (void *)&i);
}
for (i = 0; i < 3; i++) {
pthread_join(tid[i], NULL);
}
pthread_exit(NULL);
return 0;
}
I expected it to output this:
My ID is 0
My ID is 1
My ID is 2
But instead I get random output, such as this:
My ID is 0
My ID is 0
My ID is 2
Since I already added a mutex lock, I thought it would solve the problem. What else did I do wrong? Is this related to a race condition?
Here id points to the same variable i in main for all the threads.
int *id = (int *)arg;
printf("My ID is %d\n" , *id);
But the variable i is constantly being updated by the two for loops in main behind the threads' backs. So by the time a thread reaches the printf, the value of i, and therefore also the value of *id, may have changed.
There are a few ways to solve this. The best way depends on the use case:
Wait in main until the thread signals that it has made a copy of *id before modifying i or letting it go out of scope.
Declare and initialize an array, int thread_id[], and create the threads like this:
pthread_create(&tid[i], NULL, func, &thread_id[i]);
malloc some memory and initialize it with a copy of i:
int *thread_id = malloc(sizeof(*thread_id));
*thread_id = i;
pthread_create(&tid[i], NULL, func, thread_id);
Just don't forget to free the memory in the thread when you are finished using it, or in main if the thread fails to start.
If i fits in a void *, you can pass its value directly as a parameter to the thread. To make sure it fits, you can declare it as intptr_t rather than int.
(We basically abuse the fact that pointers are nothing more than magic integers):
void *func(void *arg) {
pthread_mutex_lock(&mutex);
// Here we interpret a pointer value as an integer value
intptr_t id = (intptr_t )arg;
printf("My ID is %d\n" , (int)id);
pthread_mutex_unlock(&mutex);
return NULL;
}
int main() {
intptr_t i;
pthread_t tid[3];
// Let us create three threads
for (i = 0; i < 3; i++) {
// Here we squeeze the integer value of `i` into something that is
// supposed to hold a pointer
pthread_create(&tid[i], NULL, func, (void *)i);
}
for (i = 0; i < 3; i++) {
pthread_join(tid[i], NULL);
}
// This does not belong here !!
// pthread_exit(NULL);
return 0;
}
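The array-based option from the list above (one int per thread, each passed by address) might look like the following sketch. The seen array and run_with_id_array() are hypothetical names added only so that the effect can be checked; the printing and the mutex from the question are left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 3

static int seen[NUM_THREADS];   /* seen[id] is set by the thread given that id */

static void *func(void *arg)
{
    int id = *(const int *)arg;   /* this slot is never modified after creation */
    seen[id] = 1;                 /* each thread writes a different element */
    return NULL;
}

/* Returns how many distinct IDs the threads observed. */
static int run_with_id_array(void)
{
    pthread_t tid[NUM_THREADS];
    int thread_id[NUM_THREADS];
    int i, rc, count = 0;

    for (i = 0; i < NUM_THREADS; i++) {
        thread_id[i] = i;         /* private copy of i for thread number i */
        rc = pthread_create(&tid[i], NULL, func, &thread_id[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (i = 0; i < NUM_THREADS; i++)
        pthread_join(tid[i], NULL);   /* thread_id must outlive the threads */
    for (i = 0; i < NUM_THREADS; i++)
        count += seen[i];
    return count;
}
```

Each thread now sees its own ID, but the order in which the threads run is still up to the scheduler.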
There is a race condition on i: every thread is started with a pointer to the same i, and main keeps modifying it while the threads read it. The main problem, however, is that nothing guarantees a thread will start and run its critical section while i still holds the value you expect, or in the order you expect.
I'm assuming you declared the variable mutex globally and called pthread_mutex_init() somewhere to initialize it.
Mutexes are great for allowing only one thread at a time to access a critical section of code; they do not impose any order on the threads. So the code as you've written it creates all three threads to run in parallel, but only lets one thread at a time run the following code:
int *id = (int *)arg;
printf("My ID is %d\n" , *id);

pthread_create int instead of void

I have the following code:
for(i = 0 ; i < max_thread; i++)
{
struct arg_struct args;
args.arg1 = file;
args.arg2 = word;
args.arg3 = repl;
if(pthread_create(&thread_id[i],NULL,&do_process,&args) != 0)
{
i--;
fprintf(stderr,RED "\nError in creating thread\n" NONE);
}
}
for(i = 0 ; i < max_thread; i++)
if(pthread_join(thread_id[i],NULL) != 0)
{
fprintf(stderr,RED "\nError in joining thread\n" NONE);
}
int do_process(void *arguments)
{
//code missing
}
How can I transform (void *)do_process into (int) do_process?
That function returns very important info and without those returns I don't know how to read the replies
I get the following error: warning: passing arg 3 of `pthread_create' makes pointer from integer without a cast
The thread function returns a pointer. At minimum, you can allocate an integer dynamically and return it.
void * do_process (void *arg) {
/* ... */
int *result = malloc(sizeof(int));
*result = the_result_code;
return result;
}
Then, you can recover this pointer from the pthread_join() call:
void *join_result;
if(pthread_join(thread_id[i],&join_result) != 0)
{
fprintf(stderr,RED "\nError in joining thread\n" NONE);
} else {
int result = *(int *)join_result;
free(join_result);
/* ... */
}
Just write a helper function that is of the correct type, but all it does is take the void * input parameter, get all the right parameters out of it, call your function, take the return of that, and package it up as a void * for pthread_join to get.
To your specific question, you can't/shouldn't. Just do what I outlined above and you'll be golden.
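A helper of that kind could be sketched as follows. The struct arg_struct fields, the body of do_process, and run_and_collect() are hypothetical placeholders, since the real function is not shown. The int result is carried in the pointer value through intptr_t, which is an assumption that works on common platforms; allocating the result with malloc, as in the other answer, is the more portable route.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical argument bundle; the real arg_struct is not shown. */
struct arg_struct { int a; int b; };

/* The "real" function keeps its natural int return type. */
static int do_process(struct arg_struct *args)
{
    assert(args != NULL);
    return args->a + args->b;      /* placeholder for the real work */
}

/* Wrapper with the signature pthread_create expects. */
static void *do_process_thread(void *arguments)
{
    int rc = do_process(arguments);
    return (void *)(intptr_t)rc;   /* carry the int in the pointer value */
}

/* Runs do_process in a thread and returns its result, or -1 on thread errors. */
static int run_and_collect(int a, int b)
{
    struct arg_struct args = { a, b };   /* stays alive until the join below */
    pthread_t tid;
    void *ret = NULL;

    if (pthread_create(&tid, NULL, do_process_thread, &args) != 0)
        return -1;
    if (pthread_join(tid, &ret) != 0)
        return -1;
    return (int)(intptr_t)ret;           /* unpack what the wrapper packed */
}
```

Note that args must remain valid until the thread has finished with it, which is why this sketch joins before returning.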
pthread_join() is a simple way to communicate between the two threads, but it has two limitations. First, it can pass only a single pointer value (though that pointer can point to a struct holding multiple values). Second, the value is delivered only when the thread is completely done: after returning it, the thread is in the terminated state. So, if you want the threads to communicate in a more granular fashion, you will be better served by shared data. Of course, at the very least, you would need a pthread mutex to synchronize access to the shared data. And if you want the threads to signal each other, you would also need pthread condition variables.
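As a rough illustration of the shared-data approach, the sketch below lets a worker publish a value that the creating thread picks up before the worker has exited. The struct shared layout, the value 42, and wait_for_value() are all hypothetical; only the pthread calls themselves are standard.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical shared state, protected by lock. */
struct shared {
    pthread_mutex_t lock;
    pthread_cond_t  ready_cv;
    int ready;    /* predicate guarded by lock */
    int value;
};

static void *worker(void *arg)
{
    struct shared *s = arg;

    pthread_mutex_lock(&s->lock);
    s->value = 42;                       /* publish an intermediate result */
    s->ready = 1;
    pthread_cond_signal(&s->ready_cv);
    pthread_mutex_unlock(&s->lock);
    /* the thread could keep working here; it has not terminated yet */
    return NULL;
}

static int wait_for_value(void)
{
    struct shared s;
    pthread_t tid;
    int value, rc;

    pthread_mutex_init(&s.lock, NULL);
    pthread_cond_init(&s.ready_cv, NULL);
    s.ready = 0;
    s.value = 0;

    rc = pthread_create(&tid, NULL, worker, &s);
    assert(rc == 0);
    (void)rc;

    pthread_mutex_lock(&s.lock);
    while (!s.ready)                     /* loop guards against spurious wakeups */
        pthread_cond_wait(&s.ready_cv, &s.lock);
    value = s.value;
    pthread_mutex_unlock(&s.lock);

    pthread_join(tid, NULL);
    pthread_cond_destroy(&s.ready_cv);
    pthread_mutex_destroy(&s.lock);
    return value;
}
```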

Stopping pthread as soon as struct is freed in C

I have a worker thread processing a queue of work items. I just implemented a second worker that processes the items which were inserted by worker1. However, I came across some invalid reads while using Valgrind.
I'm assuming this is because struct foo that I pass to worker2() is freed at some point in the main thread. Essentially struct foo is a struct that constantly gets updated (malloc/free), however, I'd like worker2 to insert some missing items into foo.
My question is: is it possible for worker2 to stop processing as soon as struct foo is NULL? and start again when create_foo() is called? I'm not sure what would be the best approach to insert the missing items into foo with a thread? Any feedback is appreciated.
//canonical form
//producer
void push_into_queue(char *item)
{
pthread_mutex_lock(&queueMutex);
if (workQueue.full) { /* full */ }
else
{
add_item_into_queue(item);
pthread_cond_signal(&queueSignalPush);
}
pthread_mutex_unlock(&queueMutex);
}
// consumer1
void *worker1(void *arg)
{
while (true) {
pthread_mutex_lock(&queueMutex);
while (workQueue.empty)
pthread_cond_wait(&queueSignalPush, &queueMutex);
item = workQueue.front; // pop from queue
add_item_into_list(item);
pthread_cond_broadcast(&queueSignalPop);
pthread_mutex_unlock(&queueMutex);
}
return NULL;
}
pthread_create(&thread1, NULL, worker1, NULL);
// consumer2
void *worker2(void *arg)
{
my_struct *foo = (my_struct *) arg;
while (true) {
pthread_mutex_lock(&queueMutex);
while (list.empty)
pthread_cond_wait(&queueSignalPop, &queueMutex);
for (i = 0; i < list.size; i++)
insert_item_into_foo(list[i].item, foo);
pthread_cond_broadcast(&queueSignalPop);
pthread_mutex_unlock(&queueMutex);
}
return NULL;
}
void create_foo()
{
my_struct *foo = calloc(10, sizeof(my_struct));
pthread_create(&thread2, NULL, (void *) &worker2, foo);
}
void free_foo()
{
pthread_mutex_lock(&queueMutex);
int i;
for (i=0; i<5; i++)
free(foo[i].list->string);
free(foo[i].list);
free(foo);
pthread_mutex_unlock(&queueMutex);
}
You did not define a terminating condition for either worker1 or worker2. I suppose that the end of life of foo could be considered as such. This means that both workers must monitor the existence of foo by owning a reference to it (i.e. a foo **).
void *worker2(void *arg)
{
my_struct **foo = (my_struct **) arg;
while(true) {
pthread_mutex_lock(&queueMutex);
while (list.empty)
pthread_cond_wait(&queueSignalPop, &queueMutex);
if (NULL == *foo) {
pthread_mutex_unlock(&queueMutex); /* do not exit while holding the mutex */
break;
}
for (i = 0; i < list.size; i++)
insert_item_into_foo(list[i].item, *foo);
pthread_cond_broadcast(&queueSignalPop);
pthread_mutex_unlock(&queueMutex);
}
free(foo);
return NULL;
}
void create_foo()
{
my_struct *foo = calloc(10, sizeof(my_struct ));
my_struct **foo_ptr = malloc(sizeof(my_struct *));
*foo_ptr = foo;
pthread_create(&thread2, NULL, worker2, foo_ptr);
// more work with foo
}
Note that foo must somehow be assigned to a different variable so as to be reachable in free_foo (your code assumes this without explicitly showing it, hence my comment at the end of create_foo).
With the code above, each instance of worker2 owns a pointer to rely on for its whole lifetime, and which it must take care of before exiting.
Update:
Perhaps a better solution consists in passing a struct to thread2, which contains the foo pointer, as well as a flag indicating if that pointer is still valid. You may add any other piece of information needed by the thread in the struct ad lib.
struct th2_data {
enum {RUNNING, TERMINATING} state;
my_struct *foo;
};
Then allocate an instance of that struct, initialize it as {RUNNING, foo}, and pass it to thread2. Keep a copy of its address somewhere to be able to signal the TERMINATING state to thread2. Indeed, as you asked in your comments, you would have to replace the if (NULL == *foo) test in thread2 with if (data->state == TERMINATING), where data is the struct th2_data pointer the thread received.
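One way the state-flag idea could be put together is sketched below. It makes several assumptions that are not in the question: my_struct is reduced to a counter, the pending variable replaces the real list, and stop_worker2() and run_worker2_demo() are hypothetical helpers. The point is only that the flag is changed under the same mutex the worker waits on, and that the thread is joined before foo is released.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef struct { int items; } my_struct;   /* hypothetical stand-in */

struct th2_data {
    enum { RUNNING, TERMINATING } state;
    my_struct *foo;
};

static pthread_mutex_t queueMutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t queueSignalPop = PTHREAD_COND_INITIALIZER;
static int pending;   /* hypothetical: number of items waiting for worker2 */

static void *worker2(void *arg)
{
    struct th2_data *data = arg;

    pthread_mutex_lock(&queueMutex);
    for (;;) {
        while (pending == 0 && data->state == RUNNING)
            pthread_cond_wait(&queueSignalPop, &queueMutex);
        if (pending == 0)
            break;                     /* nothing left and state is TERMINATING */
        data->foo->items += pending;   /* stands in for insert_item_into_foo() */
        pending = 0;
    }
    pthread_mutex_unlock(&queueMutex);
    return NULL;
}

/* Flip the flag under the mutex, wake the worker, and wait for it to exit. */
static void stop_worker2(pthread_t tid, struct th2_data *data)
{
    pthread_mutex_lock(&queueMutex);
    data->state = TERMINATING;
    pthread_cond_broadcast(&queueSignalPop);
    pthread_mutex_unlock(&queueMutex);
    pthread_join(tid, NULL);           /* only after this is foo safe to free */
}

static int run_worker2_demo(void)
{
    my_struct foo = { 0 };
    struct th2_data data = { RUNNING, &foo };
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, worker2, &data);

    assert(rc == 0);
    (void)rc;
    pthread_mutex_lock(&queueMutex);
    pending = 3;                       /* hand three items to the worker */
    pthread_cond_broadcast(&queueSignalPop);
    pthread_mutex_unlock(&queueMutex);

    stop_worker2(tid, &data);
    return foo.items;
}
```

In this sketch the worker drains whatever is still pending before it honours TERMINATING; a worker that should drop outstanding items instead would test the state first.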
Make foo global and add some pointer check in the loop.
Next time when you call create_foo, it will restart the thread.
my_struct *foo = NULL;
// consumer2
void *worker2(void *arg)
{
while (true) {
if ( foo == NULL )
return NULL;
pthread_mutex_lock(&queueMutex);
while (list.empty)
pthread_cond_wait(&queueSignalPop, &queueMutex);
for (i = 0; i < list.size; i++)
insert_item_into_foo(list[i].item, foo);
pthread_cond_broadcast(&queueSignalPop);
pthread_mutex_unlock(&queueMutex);
}
return NULL;
}

C pthread_join segmentation fault

I am trying to write a C program that calculates the size of a directory tree using threads for my assignment.
My code works fine when there is only one subdirectory, however whenever I have 2 or more subdirectories, I am getting a Segmentation Fault error. I was reading a lot about it and was not able to find a reason for my code to fail.
In my global scope:
pthread_mutex_t mutex;
int total_size = 0; // Global, to accumulate the size
main():
int main(int argc, char *argv[])
{
pthread_t thread;
...
if (pthread_mutex_init(&mutex, NULL) < 0)
{
perror("pthread_mutex_init");
exit(1);
}
pthread_create(&thread, NULL, dirsize, (void*)dirpath);
pthread_join(thread, NULL);
printf("\nTotal size: %d\n\n", total_size);
...
}
My dirsize() function:
void* dirsize(void* dir)
{
...
pthread_t tid[100];
int threads_created = 0;
dp=opendir(dir);
chdir(dir);
// Loop over files in directory
while ((entry = readdir(dp)) != NULL)
{
...
// Check if directory
if (S_ISDIR(statbuf.st_mode))
{
// Create new thread & call itself recursively
pthread_create(&tid[threads_created], NULL, dirsize, (void*)entry->d_name);
threads_created++;
}
else
{
// Add filesize
pthread_mutex_lock(&mutex);
total_size += statbuf.st_size;
pthread_mutex_unlock(&mutex);
}
}
for (i = 0; i < threads_created; i++)
{
pthread_join(tid[i], NULL);
}
}
What am I doing wrong here? Would greatly appreciate if you could point me to the right direction.
Here is what I'm getting through gdb: http://pastebin.com/TUkHspHH
Thank you in advance!
What's the value of NUM_THREADS?
// Check if directory
if (S_ISDIR(statbuf.st_mode))
{
// Create new thread & call itself recursively
pthread_create(&tid[threads_created], NULL, dirsize, (void*)entry->d_name);
threads_created++;
}
Here you should check whether threads_created is equal to NUM_THREADS and, if so, increase the size of the tid array (which I would malloc at the beginning of the function and free at the end, by the way).
Moreover, you should pass the thread a copy of the directory name (malloc + strcpy) instead of entry->d_name, and free that copy at the end of the function. The buffer behind entry->d_name may be overwritten by the next readdir() call on the same stream, so the new thread could otherwise read a different name or stale data.
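The copy-before-create advice can be sketched without touching the file system. In the fragment below, scratch plays the role of entry->d_name (a buffer that is reused on every iteration), total_len stands in for total_size, and visit() and visit_all() are hypothetical names. Here the thread frees its own copy when it is done, which is one reasonable ownership choice.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static size_t total_len;   /* hypothetical stand-in for total_size */

/* The thread owns its argument and frees it when it is done. */
static void *visit(void *arg)
{
    char *name = arg;

    pthread_mutex_lock(&mutex);
    total_len += strlen(name);
    pthread_mutex_unlock(&mutex);
    free(name);
    return NULL;
}

static size_t visit_all(const char *const names[], size_t n)
{
    pthread_t *tid;
    char scratch[64];      /* reused buffer, playing the role of entry->d_name */
    size_t created = 0, i;

    if (n == 0)
        return 0;
    tid = malloc(n * sizeof *tid);   /* sized to the work, not a fixed 100 */
    assert(tid != NULL);

    for (i = 0; i < n; i++) {
        char *copy;

        strncpy(scratch, names[i], sizeof scratch - 1);
        scratch[sizeof scratch - 1] = '\0';

        copy = malloc(strlen(scratch) + 1);   /* private copy for the thread */
        if (copy == NULL)
            break;
        strcpy(copy, scratch);
        if (pthread_create(&tid[created], NULL, visit, copy) != 0) {
            free(copy);                       /* thread never started */
            break;
        }
        created++;
    }
    for (i = 0; i < created; i++)
        pthread_join(tid[i], NULL);
    free(tid);
    return total_len;
}
```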

Malloc accessibility between two functions

I'm writing a variant of the producer-consumer problem with multi threading. I'm trying to use a queue to store the "produced" items until they get "consumed" later on. My problem is that when the consumer thread runs, it only processes the most recent item added to the queue (rather than the oldest item on the queue). Further, it processes that item repeatedly (up to the number of items on the queue itself).
I think that my problem might be that I need to allocate some memory when I push an item onto the queue (not sure about this, though). But then, I need a way to refer to this memory when that item is about to be consumed.
Anyway, here is a pared-down version of my program. I realize that what I am posting here is incomplete (this is an infinite loop), but I'm just trying to show the part that is relevant to this issue. The functions queue_push() and queue_pop() are well tested, so I don't think that the problem lies there. I'll post more if needed.
Can anyone see why my consumer thread only processes the newest queue item? Thank you!
sem_t mutex;
queue q;
FILE* inputFPtr[10];
char host_in[BUFFERSIZE];
char host_out[BUFFERSIZE];
void* p(void* inputFile) {
while (fscanf(inputFile, INPUTFS, host_in) > 0)
{
sem_wait(&mutex);
queue_push(&q, host_in); //this function pushes the hostname onto the back of the queue
fprintf(stdout, "Produced: %d) %s\n", i, host_in);
sem_post(&mutex);
}
fclose (inputFile);
}
void* c() {
while (TRUE)
{
sem_wait(&mutex);
sprintf(host_out, "%s", (char *) queue_pop(&q));
printf("%s\n", host_out);
sem_post(&mutex);
}
}
int main (int argc, char* argv[]) {
int i;
pthread_t *th_in[argc-2];
pthread_t *th_out[2];
for (i = 0; i < (argc-2); i++) {
th_in[i] = (pthread_t *) malloc(sizeof(pthread_t));
inputFPtr[i] = fopen(argv[i+1], "r");
pthread_create (th_in[i], NULL, p, inputFPtr[i]);
}
for (i = 0; i < 2; i++) {
th_out[i] = (pthread_t *) malloc(sizeof(pthread_t));
pthread_create (th_out[i], NULL, c, NULL);
}
for (i = 0; i < (argc - 2); i++) {
pthread_join(*th_in[i], 0);
free(th_in[i]);
}
for (i = 0; i < (2); i++) {
pthread_join(*th_out[i], 0);
free(th_out[i]);
}
return EXIT_SUCCESS;
}
You forgot to post your code. However, from your description, it seems like all the queue members point to the same memory block. This is why all your pops return the same item.
The answer to your question is YES. You need to allocate memory for each one of the items and free it after it has been "consumed".
Try to post some code for more specific answers...
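To illustrate the allocate-per-item idea, here is a minimal sketch with a tiny pointer queue. The ptr_queue type, push_copy(), pop(), and the host names are hypothetical, since the question's queue implementation is not shown; locking is omitted because the question already wraps the calls in a semaphore.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define QUEUE_CAP 8

/* Hypothetical minimal FIFO of pointers. */
typedef struct {
    char *items[QUEUE_CAP];
    int head, count;
} ptr_queue;

/* Stores a private heap copy of s, so the caller may reuse its buffer. */
static int push_copy(ptr_queue *q, const char *s)
{
    char *copy;

    if (q->count == QUEUE_CAP)
        return -1;
    copy = malloc(strlen(s) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, s);
    q->items[(q->head + q->count) % QUEUE_CAP] = copy;
    q->count++;
    return 0;
}

/* Returns the oldest item, or NULL when empty. The caller must free() it. */
static char *pop(ptr_queue *q)
{
    char *s;

    assert(q != NULL);
    if (q->count == 0)
        return NULL;
    s = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    return s;
}

/* Returns 1 if items come back oldest-first even though one buffer is reused. */
static int fifo_survives_buffer_reuse(void)
{
    ptr_queue q = { {0}, 0, 0 };
    char host_in[32];
    char *a, *b;
    int ok;

    strcpy(host_in, "first.example");
    ok = push_copy(&q, host_in) == 0;
    strcpy(host_in, "second.example");    /* overwrite the shared buffer */
    ok = ok && push_copy(&q, host_in) == 0;

    a = pop(&q);
    b = pop(&q);
    ok = ok && a != NULL && b != NULL
            && strcmp(a, "first.example") == 0
            && strcmp(b, "second.example") == 0;
    free(a);                              /* the consumer frees what it consumed */
    free(b);
    return ok;
}
```

Pushing the address of host_in itself would leave every queue entry pointing at the same buffer, which matches the symptom described in the question.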
