I have a question about pthreads. When I allocate a variable inside a thread with malloc and then pass its pointer to a shared structure, e.g. a FIFO, can thread 2 access the pointer passed by thread 1?
Please note that I have no code for the question above; I'm just trying to understand threading better, and the code below is just what I'm thinking about. The environment is pthreads, C, and Linux.
As far as I know, threads share the memory of their parent process. If that's the case, the code below should be correct.
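void *thread_1(void *pointer)   /* '-' is not valid in a C identifier */
{
    int *intp = malloc(sizeof *intp);   /* sizeof instead of a hard-coded 4 */
    send_to_fifo(intp);
    return NULL;
}
void *thread_2(void *pointer)
{
    int *iptr;
    iptr = read_from_fifo();
    do_something(iptr);
    free(iptr);
    return NULL;
}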
Can thread 2 access the pointer passed by thread 1?
Yes: since all threads operate in a common memory space, this is allowed.
malloc, free, and other memory management functions are thread-safe by default, unless compiled with NO_THREADS.
Of course you can do this. However, you must be careful not to write to a variable while it's being used by another thread. You need synchronization.
In your case, you have a race condition if the threads run simultaneously (thread2 not waiting for thread1 to finish): thread2 executes all its code either before thread1 puts anything into the fifo or after that.
I am trying to implement a solution based partly on the discussions below. Basically, I want to malloc() in the main thread and free() in a secondary thread. The discussions linked deal with LabWindows, but I think my question is more aimed at C programmers in general.
LabWindows: implementing thread safe queues that can handle string elements
How to use a thread safe queue to store strings
I create a pointer to char in the main thread and allocate storage using malloc(). I copy some data into the storage, and assign the pointer to an array element (CmtWriteTSQData expects an array). This array gets passed to a thread safe queue.
In a secondary thread, the thread safe queue is read. The data is assigned to a new array.
How do I free the memory allocated within the secondary thread, as the pointer variable is no longer in scope?
Can I just call free() on the array element? Or do I need to create another pointer to char in the secondary thread, copy the array element to it, and then call free() on the pointer?
There doesn't appear to be a return value with free(), so I can't figure out how to ensure the call succeeds.
// Main thread
char *ptr = NULL;
char *array1[1] = {0};
ptr = (char *) malloc (3 * sizeof (char));
strcpy (ptr, "hi");
array1[0] = ptr;
CmtWriteTSQData (queue, array1, 1, 0, NULL);
// Secondary thread
char *array2[1] = {0};
CmtReadTSQData (queue, array2, 1, 0, 0);
printf ("%s", array2[0]); // Prints "hi"
free (array2[0]); // Does this work?
The short answer is yes.
When you call malloc(), you are asking the allocator (and, through it, the operating system) to give you a chunk of usable memory of at least the requested size, and it responds by returning a pointer, which is essentially the virtual address of that chunk of memory.
When a program has multiple threads, this means that several lightweight processes are sharing the same virtual address space, which means if there are several copies of some valid pointer among several threads of the same process, then they must all point to the same memory location.
The operating system does not care which thread asked for the memory, and it does not care which thread gives it back. When malloc() returns a pointer in a process, all of the threads in that process may use it, and when one of the threads free()s that pointer, that memory location becomes invalid for all threads in the process.
I don't know how the functions CmtWriteTSQData() and CmtReadTSQData() behave, but as long as printf("%p\n",ptr) in main and printf("%p\n",array2[0]) in the secondary thread produce the same hex value, your code is good.
Hope that helps!
I'm writing some very simple code in which I need to use some threads.
When I create the first type of thread, I pass an argument with pthread_create:
fman thread_arg;
thread_arg.sd=sda;
char* split = strtok(buffer, "|");
thread_arg.wcount=atoi(split);
split = strtok(NULL,"");
strcpy(thread_arg.id, split);
pthread_create(&thread_temp, NULL, registerF, &thread_arg);
And everything works fine, but in function registerF I need to do something like this:
wman thread_arg;
thread_arg.sd=foremans_fd[ix];
thread_arg.fmanix=ix;
strcpy(thread_arg.id,tmpr);
pthread_create(&thread_temp, NULL, registerW, &thread_arg);
Those arguments are structures defined by me:
typedef struct fman
{
int sd;
char id[100];
int wcount;
} fman;
typedef struct wman
{
int sd;
int fmanix;
char id[100];
} wman;
And when I check it with printf("%x", args) I get the same address, but the values inside are different. Where is my mistake?
One likely problem is here:
fman thread_arg;
[...]
pthread_create(&thread_temp, NULL, registerF, &thread_arg);
Note that the thread_arg object is located on the stack, and thus will be destroyed (and likely overwritten by other stack variables) when the function it is declared in returns.
pthread_create(), on the other hand, launches a thread that will run asynchronously with this function, which means that the thread can (and often will) run after the function you excerpted has returned, which means that by the time the thread dereferences &thread_arg, thread_arg has likely already been destroyed and that pointer is now pointing to some other data that was written into the same stack location later on.
Unless you are doing something special to make sure that the struct's lifetime is long enough to include all of the spawned thread's accesses to the struct, then the fact that this code ever works is pure luck (i.e. the scheduler just happened to schedule the thread to run and perform all of its accesses to the struct before the struct was destroyed/overwritten). You definitely can't depend on that.
In order to fix the problem, you need to either allocate the struct on the heap (so that it won't be destroyed when the function returns -- the spawned thread can then free the struct when it is done using it), or use some kind of synchronization mechanism (e.g. a condition variable) to cause the main thread to block inside your function until the spawned thread has indicated that it is done accessing the struct.
The struct passed to the running thread is treated as a block of memory and accessed using offsets. Since your fman and wman structs have different layouts (4+100+4 vs. 4+4+100), it's likely that you're getting the right struct but reading from different memory locations, given that the struct passed to this thread is an fman and it's being accessed as a wman.
Try changing them both to the same member order (int, int, then the char array) and it should work.
I wanted to know if it is possible to return a pointer address from the main function in C. Here is a very short example:
int main(){
int i = 0;
return &i; //won't work because of type difference and because i..
} //.. will be deallocated.
So is there any way to do this?
And second: I want to do this in order to return a heap object from one program to another one.
Is it possible to keep the heap object alive if the called program terminates on the main() but continues running on a second thread which was started from main?
Thanks in advance!
In short, the answer is no.
The return value from main() is typically for error codes. Data to be kept persistent after the end of the process should be communicated in some other way. When you return from main(), your process (and all its threads) has ended. The memory space allocated by the operating system for the process has also been freed. This includes your heap memory. In short, once the process is over, there is no heap, and your object in the heap is gone.
You seem to have discovered the difficulty of inter-process communication. There are many possible techniques to allow one process to communicate with another. Some of them are:
storing data to a file
pipes
message passing interface (MPI)
shared memory
Different techniques suit different situations, which is why there are multiple options.
Is it possible to keep the heap object alive if the called program terminates on the main() but continues running on a second thread which was started from main?
No, it is not. When you return from main, it is equivalent to calling exit. Hence, the program terminates. No threads can be alive after that.
From the C99 Standard:
5.1.2.2.3 Program termination
1 If the return type of the main function is a type compatible with int, a return from the initial call to the main function is equivalent to calling the exit function with the value returned by the main function as its argument;
You cannot return it in a strictly-conforming program nor on most implementations.
From the C11 standard draft N1570, §5.1.2.2.1:
The function called at program startup is named main. [...] It shall
be defined with a return type of int and with no parameters:
int main(void) { /* ... */ }
or with two parameters [...]:
int main(int argc, char *argv[]) { /* ... */ }
[...] or in some other implementation-defined manner.
Mangling the pointer into an int would be possible but that would be implementation-defined, possibly undefined, and... nasty.
Is it possible to keep the heap object alive if the called program
terminates on the main() but continues running on a second thread
which was started from main?
A program runs as a process, and main runs in one of its threads. Heap memory is shared between all threads, so yes, sharing heap memory between threads is entirely possible.
However, main returning is equivalent to the program terminating, so you need to find another way to share it between threads. An example would be to set a pointer pointing to heap memory in one thread (maybe make that thread yield), and use that pointer in the other threads.
I want to do this in order to return a heap object from one program to
another one.
What now, between processes or threads? Heap memory is shared between threads but not between multiple processes. Sharing between multiple processes is done using shared memory (mmap or shm_open, shmget, etc.).
Two problems:
When main returns, i ceases to exist; any pointer value you returned would be invalid.
I am not aware of any runtime environment that expects to receive a pointer value from any executed program; main returns an int because the runtime environment expects an int.
I have to use two threads; one to do various operations on matrices, and the other to monitor virtual memory at various points in the matrix operation process. This method is required to use a global state variable 'flag'.
So far I have the following (leaving some out for brevity):
int flag = 0;
int allocate_matrices(int dimension)
{
while (flag == 0) {} //busy wait while main prints memory state
int *matrix = (int *) malloc(sizeof(int)*dimension*dimension);
int *matrix2 = (int *) malloc(sizeof(int)*dimension*dimension);
flag = 0;
while (flag == 0) {} //busy wait while main prints memory state
// more similar actions...
}
int memory_stats()
{
while (flag == 0)
{ system("top"); flag = 1; }
}
int main()
{ //threads are created and joined for these two functions }
As you might expect, the system("top") call happens once, then the matrices are allocated, then the program falls into an infinite loop. It seems apparent to me that this is because the thread assigned to the memory_stats function has already completed its duty, so flag will never be updated again.
Is there an elegant way around this? I know I have to print memory stats four times, so it occurs to me that I could write four while loops in the memory_stats function with busy waiting contingent on the global flag in between each of them, but that seems clunky to me. Any help or pointers would be appreciated.
One of the possible reasons for the hang is that flag is a regular variable and the compiler sees that it's never set to a non-zero value between flag = 0; and while (flag == 0) {} or in this while inside allocate_matrices(). And so it "thinks" the variable stays 0 and the loop becomes infinite. The compiler is entirely oblivious to your threads.
You could define flag as volatile to prevent the above from happening, but you'll likely run into other issues after adding volatile. For one thing, volatile does not guarantee atomicity of variable modifications.
Another issue is that if the compiler sees an infinite loop that has no side effects, it may treat it as undefined behavior, in which case anything could happen, or at least not what you expect.
You need to use proper synchronization primitives like mutexes.
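You can lock it with a mutex. I assume you use pthreads.
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER; /* must be initialized before first use */
pthread_mutex_lock(&mutex);
flag=1; /* every read of flag must take the same mutex */
pthread_mutex_unlock(&mutex);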
Here is a very good tutorial about pthreads, mutexes and other stuff: https://computing.llnl.gov/tutorials/pthreads/
Your problem could be solved with a C compiler that follows the latest C standard, C11. C11 has threads and a data type called atomic_flag that can basically be used for a spin lock like the one in your question.
First of all, the variable flag needs to be declared volatile or else the compiler has license to omit reads to it after the first one.
With that out of the way, a sequencer/event_counter can be used: one thread may increment the variable when it's odd, the other when it's even. Since one thread always "owns" the variable, and transfers the ownership with the increment, there is no race condition.
Coming from CUDA, I'm interested in how shared memory is read from a thread and how that compares to the read alignment requirements of CUDA. I'll use the following code as an example:
#include <unistd.h>
#include <pthread.h>
#include <stdlib.h>
#include <stdio.h>
#define THREADS 2
void * threadFun(void * args);
typedef struct {
float * dataPtr;
int tIdx,
dSize;
} t_data;
int main(int argc, char * argv[])
{
int i,
sizeData=5;
void * status;
float *data;
t_data * d;
pthread_t * threads;
pthread_attr_t attr;
data=(float *) malloc(sizeof(float) * sizeData );
threads=(pthread_t *)malloc(sizeof(pthread_t)*THREADS);
d = (t_data *) malloc (sizeof(t_data)*THREADS);
data[0]=0.0;
data[1]=0.1;
data[2]=0.2;
data[3]=0.3;
data[4]=0.4;
pthread_attr_init(&attr);
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
for (i=0; i<THREADS;i++)
{
d[i].tIdx=i;
d[i].dataPtr=data;
d[i].dSize=sizeData;
pthread_create(&threads[i],&attr,threadFun,(void *)(d+i));
}
for (i=0; i<THREADS; i++)
{
pthread_join(threads[i],&status);
if (status != NULL) {
/* handle the error here */
}
}
return 0;
}
void * threadFun(void * args)
{
int i;
t_data * d= (t_data *) args;
float sumVal=0.0;
for (i=0; i<d->dSize; i++)
sumVal+=d->dataPtr[i]*(d->tIdx+1);
printf("Thread %d calculated the value as %-11.11f\n",d->tIdx,sumVal);
return(NULL);
}
In threadFun, the pointer d points into shared memory space (I believe). From what I've read in the documentation, reading from multiple threads is OK. In CUDA, reads need to be coalesced - are there similar alignment restrictions in pthreads? I.e., if I have two threads reading from the same shared address, I'm assuming that somewhere along the line a scheduler has to put one thread ahead of the other. In CUDA this could be a costly operation and should be avoided. Is there a penalty for 'simultaneous' reads from shared memory - and if so, is it so small that it is negligible? E.g., both threads may need to read d->dataPtr[0] simultaneously - I'm assuming that the memory reads cannot occur simultaneously - is this assumption wrong?
Also, I read an article from Intel that said to use a structure of arrays when multithreading - this is consistent with CUDA. If I do this, though, it is almost inevitable that I will need the thread ID - which I believe will require me to use a mutex to lock the thread ID until it is read into the thread's scope. Is this true, or would there be some other way to identify threads?
An article on memory management for multithreaded programs would be appreciated as well.
While your thread data pointer d is pointing into a shared memory space, unless you increment that pointer to try and read from or write to an adjoining thread data element in the shared memory space array, you're basically dealing with localized thread data. Also the value of args is local to each thread, so in both cases if you are not incrementing the data pointer itself (i.e., you're never calling something like d++, etc. so that you're pointing to another thread's memory), no mutex is needed to guard the memory "belonging" to your thread.
Also again for your thread ID, since you're only writing that value from the spawning thread, and then reading that value in the actual spawned thread, there is no need for a mutex or synchronization mechanism ... you only have a single producer/consumer for the data. Mutexes and other synchronization mechanisms are only needed if there are multiple threads that will read and write the same data location.
CPUs have caches. Reads come from caches, so each CPU/core can read from its own cache, as long as the corresponding cacheline is SHARED. Writes force cachelines into EXCLUSIVE state, invalidating the corresponding cachelines on other CPUs.
If you have an array with a member per thread, and there are both reads and writes to that array, you may want to align every member to a cacheline, to avoid false sharing.
Reads of the same memory area from different threads aren't a problem in shared-memory systems (writes are another matter; the pertinent unit is the cache line: 64-256 bytes depending on the system).
I don't see any reason for which getting the thread_id should be a synchronized operation. (And you can feed your thread with any id meaningful for you, it can be simpler than getting a meaningful value from an abstract id)
Coming from CUDA probably makes you think in too complicated terms. POSIX threads are much simpler. Basically, what you are doing should work, as long as you are only reading from the shared array.
Also, don't forget that CUDA is derived from C++ and not from C, so some things might look different from that angle, too. E.g., in your code, the habit of casting the return value of malloc is generally frowned upon by C programmers, since it can be the source of subtle errors.