Memory sharing between threads in C

Can someone please explain exactly what memory is shared between threads? I took the code below from a website just to illustrate what I don't understand. After all the threads are created they will execute the function doSomeThing; will they share the same value for MyVariable, or will every thread have a separate value for it? (Ignore the fact that no value is assigned to the variable.)
#include<stdio.h>
#include<string.h>
#include<pthread.h>
#include<stdlib.h>
#include<unistd.h>
pthread_t tid[2];
void* doSomeThing(void *arg)
{
unsigned long i = 0;
int MyVariable;
pthread_t id = pthread_self();
if(pthread_equal(id,tid[0]))
{
printf("\n First thread processing\n");
}
else
{
printf("\n Second thread processing\n");
}
for(i=0; i<(0xFFFFFFFF);i++);
return NULL;
}
int main(void)
{
int i = 0;
int err;
while(i < 2)
{
err = pthread_create(&(tid[i]), NULL, &doSomeThing, NULL);
if (err != 0)
printf("\ncan't create thread :[%s]", strerror(err));
else
printf("\n Thread created successfully\n");
i++;
}
return 0;
}

You've actually asked two separate questions.
what memory is being shared between threads?
Well, all memory (on typical OSes). A main difference between threads and processes is that different processes have different memory spaces, while threads within a process share the same memory space.
See also:
What is the difference between a process and a thread?
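To see the difference in practice, here is a minimal sketch. It assumes a POSIX system with fork(), and the names shared_flag, flag_after_thread_write and flag_after_child_write are made up for illustration. A thread's write to a global is seen by the thread that created it, whereas a forked child's write lands in the child's own copy of the memory:

```c
#include <pthread.h>
#include <sys/wait.h>
#include <unistd.h>

static int shared_flag = 0;   /* lives in the process's address space */

static void *set_flag(void *arg)
{
    (void)arg;
    shared_flag = 1;          /* a thread writes the one and only copy */
    return NULL;
}

/* Returns what the creating thread sees after another thread wrote the flag. */
int flag_after_thread_write(void)
{
    pthread_t t;

    shared_flag = 0;
    if (pthread_create(&t, NULL, set_flag, NULL) != 0)
        return -1;
    pthread_join(t, NULL);    /* joining also makes the write visible here */
    return shared_flag;
}

/* Returns what the parent sees after a forked child wrote the flag. */
int flag_after_child_write(void)
{
    pid_t pid;

    shared_flag = 0;
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        shared_flag = 1;      /* modifies the child's private copy only */
        _exit(0);
    }
    waitpid(pid, NULL, 0);
    return shared_flag;
}
```

Called from main, flag_after_thread_write() should return 1 and flag_after_child_write() should return 0.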
will [the two threads] share the same value for MyVariable?
No! That's because each thread has its own stack (and its own register state), and MyVariable is a local variable of doSomeThing, so each call gets its own copy. The stacks are all in the shared memory space, but each thread uses a different one.
So: Sharing memory space is not the same as sharing the value of each variable.
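A small sketch of that distinction, not taken from the question: the names worker, run_two_workers, shared_total and INCREMENTS are hypothetical. Each thread counts in a local variable on its own stack and also in a global that both threads share, so the global needs a mutex while the local does not:

```c
#include <pthread.h>

#define INCREMENTS 1000

static int shared_total = 0;   /* one copy, visible to every thread */
static pthread_mutex_t total_lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    int *result = arg;
    int my_variable = 0;       /* one copy per call, on this thread's stack */
    int i;

    for (i = 0; i < INCREMENTS; i++) {
        my_variable++;                     /* private: no locking needed */
        pthread_mutex_lock(&total_lock);
        shared_total++;                    /* shared: must be synchronized */
        pthread_mutex_unlock(&total_lock);
    }
    *result = my_variable;
    return NULL;
}

/* Runs two workers, stores each thread's local count in local_counts,
   and returns the shared count (or -1 if a thread could not be created). */
int run_two_workers(int local_counts[2])
{
    pthread_t tid[2];
    int i, started;

    shared_total = 0;
    for (i = 0; i < 2; i++)
        if (pthread_create(&tid[i], NULL, worker, &local_counts[i]) != 0)
            break;
    started = i;
    for (i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return started == 2 ? shared_total : -1;
}
```

Each local count ends at INCREMENTS, while the shared total ends at twice that.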

Related

Threading In C: Producer Consumer taking forever to run

I'm new to the concept of threading.
I was doing the producer-consumer problem in C, but the consumer thread doesn't run in parallel with the producer.
My code is as follows:
#include<stdio.h>
#include<stdlib.h>
#include<pthread.h>
int S;
int E;
int F;
void waitS(){
//printf("hbasd");
while(S<=0);
S--;
}
void signalS(){
S++;
}
void waitE(){
while(E<=0);
E--;
}
void signalE(){
E++;
}
void waitF(){
while(F<=0);
F--;
}
void signalF(){
F++;
}
int p,c;
void* producer(void *n){
int *j = (int *)n;
int i = *j;
while(1){
waitS();
waitE();
printf("Producer %d\n",E);
signalS();
signalF();
p++;
if(p>=i){
printf("Exiting: producer\n");
pthread_exit(0);
}
}
}
void* consumer(void *n){
int *j = (int *)n;
int i = *j;
while(1){
waitS();
waitF();
printf("Consumer %d\n",E);
signalS();
signalE();
c++;
if(c>=i){
printf("Exiting Consumer\n");
pthread_exit(0);
}
}
}
int main(int argc, char* argv[]){
int n = atoi(argv[1]);
E = n;
S = 1;
F = 0;
int pro = atoi(argv[2]);
int con = atoi(argv[3]);
pthread_t pid, cid;
pthread_attr_t attr;
pthread_attr_init(&attr);
pthread_create(&pid,&attr,producer,(void *)&pro);
pthread_create(&cid,&attr,consumer,(void *)&con);
pthread_join(pid,NULL);
pthread_join(cid,NULL);
}
When I give the input ./a.out 3 4 3
i.e. n = 3, pro = 4, con = 3,
I get no output, just what looks like a deadlock.
I expect an output like
Producer 2
Producer 1
Producer 0
Consumer 0
Consumer 1
Producer 0
Exiting: producer
Consumer 0
Exiting: consumer
...or similar output where the producer runs 4 times and the consumer 3 times.
When I give an input like ./a.out 4 4 3
I get the following output
Producer 3
Producer 2
Producer 1
Producer 0
Exiting: producer
Consumer 0
Consumer 1
Consumer 2
Exiting: consumer
From these results I conclude that the producer thread runs to completion first, and only then does the consumer thread run.
I want both of them to execute simultaneously, so that I get output similar to the first expected output when test cases like 3 4 3 are given.
You are accessing non-atomic variables from different threads without any kind of synchronization; this is a race condition and it leads to undefined behavior.
In particular, each CPU core has its own registers, store buffers and caches, and the compiler is free to keep a shared variable in a register, so if a thread running on core #1 modifies a variable, a thread running on core #2 may not "see" that update for a long time. A loop such as while(S<=0); may even be compiled to read S only once, in which case it never sees the update.
The traditional way to deal with this problem is either to serialize accesses to your shared variables with one or more mutexes (see pthread_mutex_init(), pthread_mutex_lock(), pthread_mutex_unlock(), etc), or use atomic variables rather than standard ints for values you want to access from multiple threads simultaneously. Both of those mechanisms have safeguards to ensure that undefined behavior won't occur (if you are using them correctly).
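As a rough illustration of the atomic-variable option, here is a sketch that assumes a C11 compiler with <stdatomic.h>. The names hits, bump and count_with_atomics are invented for the example and are not part of the question's code:

```c
#include <pthread.h>
#include <stdatomic.h>

#define THREADS 4
#define PER_THREAD 10000

static atomic_int hits;        /* atomic: safe to update from many threads */

static void *bump(void *arg)
{
    int i;

    (void)arg;
    for (i = 0; i < PER_THREAD; i++)
        atomic_fetch_add(&hits, 1);   /* indivisible read-modify-write */
    return NULL;
}

/* Starts THREADS threads that all increment the same counter and
   returns the final value once every thread has been joined. */
int count_with_atomics(void)
{
    pthread_t tid[THREADS];
    int i, started = 0;

    atomic_store(&hits, 0);
    for (i = 0; i < THREADS; i++)
        if (pthread_create(&tid[started], NULL, bump, NULL) == 0)
            started++;
    for (i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return atomic_load(&hits);
}
```

With a plain int and hits++ the final value could come out lower, because concurrent increments can overwrite each other.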
You cannot access the same memory from two different threads without synchronization. The standard for pthreads spells it out quite clearly here:
Applications shall ensure that access to any memory location by more than one thread of control (threads or processes) is restricted such that no thread of control can read or modify a memory location while another thread of control may be modifying it. Such access is restricted using functions that synchronize thread execution and also synchronize memory with respect to other threads.
Besides, even if we ignore that many CPUs don't synchronise memory unless you explicitly ask them to, your code is still incorrect in normal C because if variables can be changed behind your back they should be volatile. But even though volatile might help on some CPUs, it is incorrect for pthreads.
Just use proper locking and don't spin on global variables; there are methods to heat a room that are much cheaper than using a CPU.
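One possible shape for "proper locking" is a bounded buffer guarded by a mutex and two condition variables, so that a thread that cannot proceed sleeps instead of spinning. This is only a sketch: SLOTS, put, get, struct job and transfer are hypothetical names, and to keep it short the calling thread plays the producer while a single new thread plays the consumer.

```c
#include <pthread.h>

#define SLOTS 3                /* buffer capacity, like n in the question */

static int buffer[SLOTS];
static int count = 0, head = 0, tail = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t not_full = PTHREAD_COND_INITIALIZER;
static pthread_cond_t not_empty = PTHREAD_COND_INITIALIZER;

static void put(int v)
{
    pthread_mutex_lock(&lock);
    while (count == SLOTS)                 /* sleep, do not spin */
        pthread_cond_wait(&not_full, &lock);
    buffer[tail] = v;
    tail = (tail + 1) % SLOTS;
    count++;
    pthread_cond_signal(&not_empty);
    pthread_mutex_unlock(&lock);
}

static int get(void)
{
    int v;

    pthread_mutex_lock(&lock);
    while (count == 0)
        pthread_cond_wait(&not_empty, &lock);
    v = buffer[head];
    head = (head + 1) % SLOTS;
    count--;
    pthread_cond_signal(&not_full);
    pthread_mutex_unlock(&lock);
    return v;
}

struct job { int items; long sum; };

static void *consumer(void *arg)
{
    struct job *job = arg;
    int i;

    for (i = 0; i < job->items; i++)
        job->sum += get();
    return NULL;
}

/* Sends 1..items through the buffer and returns the sum the consumer saw. */
long transfer(int items)
{
    pthread_t cid;
    struct job job;
    int i;

    job.items = items;
    job.sum = 0;
    if (pthread_create(&cid, NULL, consumer, &job) != 0)
        return -1;
    for (i = 1; i <= items; i++)           /* this thread is the producer */
        put(i);
    pthread_join(cid, NULL);
    return job.sum;
}
```

For example, transfer(10) pushes ten items through a three-slot buffer and returns 55.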
In general, you should use synchronization primitives, but unlike the other answerers I believe we might not need any if we run this program on the x86 architecture and prevent the compiler from optimizing some critical parts of the code.
According to Wikipedia, the x86 architecture is almost sequentially consistent, which is more than enough to implement a producer-consumer algorithm.
The rules for successfully implementing such a producer-consumer algorithm are quite simple:
We must avoid writing the same variable from different threads, i.e. if one thread writes to variable X, the other thread only reads from X.
We must tell the compiler explicitly that our variables might change at any time, i.e. use the volatile keyword on all variables shared between threads.
And here is a working example based on your code. The producer produces numbers from 5 down to 0, and the consumer consumes them. Please remember that this works on x86 only, because other architectures have weaker ordering:
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
volatile int P = 0;
volatile int C = 0;
volatile int value = 0;
void produce(int v)
{
value = v;
P++;
}
int consume()
{
int v = value;
C++;
return v;
}
void waitForConsumer()
{
while (C != P)
;
}
void waitForProducer()
{
while (C == P)
;
}
void *producer(void *n)
{
int i = *(int *)n;
while (1) {
waitForConsumer();
printf("Producing %d\n", i);
produce(i);
i--;
if (i < 0) {
printf("Exiting: producer\n");
pthread_exit(0);
}
}
}
void *consumer(void *n)
{
while (1) {
waitForProducer();
int v = consume();
printf("Consumed %d\n", v);
if (v == 0) {
printf("Exiting: consumer\n");
pthread_exit(0);
}
}
}
int main(int argc, char *argv[])
{
int pro = 5;
pthread_t pid, cid;
pthread_attr_t attr;
pthread_attr_init(&attr);
pthread_create(&pid, &attr, producer, (void *)&pro);
pthread_create(&cid, &attr, consumer, NULL);
pthread_join(pid, NULL);
pthread_join(cid, NULL);
}
Produces the following result:
$ ./a.out
Producing 5
Producing 4
Consumed 5
Consumed 4
Producing 3
Producing 2
Consumed 3
Consumed 2
Producing 1
Producing 0
Exiting: producer
Consumed 1
Consumed 0
Exiting: consumer
For more information, I really recommend Herb Sutter's presentation called atomic<> Weapons, which is quite long, but has everything you need to know about ordering and atomics.
Although the code listed above works on x86, I really encourage you to watch the presentation above and use built-in atomics such as __atomic_load_n(), which generate the correct assembly code on any platform.
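A hedged sketch of what the same single-slot hand-off might look like with the GCC/Clang __atomic built-ins in place of volatile. The acquire/release orderings are my choice for illustration, hand_off is a hypothetical wrapper, and the calling thread acts as the producer to keep the example short:

```c
#include <pthread.h>
#include <sched.h>

static int P = 0;       /* items produced; written only by the producer */
static int C = 0;       /* items consumed; written only by the consumer */
static int value = 0;   /* the single shared slot */

static void produce(int v)
{
    int p = __atomic_load_n(&P, __ATOMIC_RELAXED);

    /* wait until the previous item has been consumed */
    while (__atomic_load_n(&C, __ATOMIC_ACQUIRE) != p)
        sched_yield();
    value = v;
    /* release: the write to value becomes visible before the new P */
    __atomic_store_n(&P, p + 1, __ATOMIC_RELEASE);
}

static int consume(void)
{
    int c = __atomic_load_n(&C, __ATOMIC_RELAXED);
    int v;

    /* wait until the producer has published a new item */
    while (__atomic_load_n(&P, __ATOMIC_ACQUIRE) == c)
        sched_yield();
    v = value;
    /* release: the read of value is finished before the new C is seen */
    __atomic_store_n(&C, c + 1, __ATOMIC_RELEASE);
    return v;
}

static void *consumer(void *arg)
{
    long *sum = arg;
    int v;

    do {
        v = consume();
        *sum += v;
    } while (v != 0);           /* 0 is the last item, as in the answer */
    return NULL;
}

/* Produces from, from-1, ..., 0 and returns the sum the consumer received. */
long hand_off(int from)
{
    pthread_t cid;
    long sum = 0;
    int i;

    P = C = 0;                  /* no other thread exists yet */
    if (pthread_create(&cid, NULL, consumer, &sum) != 0)
        return -1;
    for (i = from; i >= 0; i--)
        produce(i);
    pthread_join(cid, NULL);
    return sum;
}
```

The sched_yield() calls are only there so a waiting thread gives up the core; the ordering guarantees come from the acquire loads and release stores.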
Create a separate thread for each producer and each consumer, i.e. every producer and every consumer has its own thread.

Count char on file using thread and mutex

I am working on this problem: take a letter and the names of some files from the command line, count the occurrences of the letter in each file using one thread per file, and print the total number of occurrences.
This is my code:
typedef struct _CharFile{
char c;
char *fileName;
} CharFile;
pthread_mutex_t count = PTHREAD_MUTEX_INITIALIZER;
int sum = 0;
void *CountFile(void *threadarg);
int main(int argc, const char * argv[]) {
pthread_t threads[argc-2];
int chck, t;
CharFile cf;
for ( t=0 ; t<argc-2 ; t++ ){
cf.c = argv[1][0];
cf.fileName = (char *)argv[t + 2];
chck = pthread_create(&threads[t], NULL, CountFile, (void *) &cf);
if (chck){
printf("ERROR; return code from pthread_create() is %d\n", chck);
exit(-1);
}
}
printf("%lld occurrences of the letter %c in %lld threads\n", (long long)sum, argv[1][0], (long long)argc-2);
return 0;
}
void *CountFile(void *threadarg){
FILE *in;
CharFile *cf;
char c;
int counter = 0;
cf = (CharFile *) threadarg;
in = fopen(cf->fileName, "r");
if (in == NULL){
perror("Error opening the file!\n");
pthread_exit(NULL);
}
while (fscanf(in, "%c", &c) != EOF){
if(c == cf->c){
counter += 1;
}
}
fclose(in);
pthread_mutex_lock(&count);
sum += counter;
pthread_mutex_unlock(&count);
pthread_exit(NULL);
}
I don't get any error opening the files or creating the threads, but my output is always 0 total occurrences. I also tried printing the counter in the threads, and I get the same number in every thread even though my input files are different. Am I using the mutex wrongly, or is something else wrong?
This is one of my outputs:
61 occurrences of e in this thread
0 occurrences of the letter e in 3 threads
61 occurrences of e in this thread
61 occurrences of e in this thread
Program ended with exit code: 9
There are several threading issues at play here.
1) The main thread continues running asynchronously alongside the newly spawned threads. Given the code, it is very likely that the main thread will complete and exit before the CountFile threads have completed. On Linux, when the main thread returns, the C runtime performs an exit_group system call, which terminates all threads.
You'll need to add some check to ensure the CountFile threads have finished the relevant section of work. In this example, look at using pthread_join() in the main thread.
2) The 'cf' storage in the main thread is a stack local variable, which is passed by pointer to each thread. However, since it is the same storage for every thread, several types of failure could occur: a) the work unit may be updated by the main thread while a worker thread is accessing it; b) the same work unit is sent to multiple or all threads.
You could solve this in several ways: 'cf' could be an array of CharFile with one element per thread, or 'cf' could be dynamically allocated for each thread. The former is a bit more performance- and memory-efficient, but the latter might be better structurally, particularly because the main thread would otherwise be handing addresses in its own stack space to other threads.
3) Once item #1 is addressed and the threads have exited before the main thread's printf, the mutex usage is OK. But it might be better to put pthread_mutex_lock/pthread_mutex_unlock around the main thread's access to 'sum' anyway. It may not be necessary given this code, but future code refactors might change that.
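Points 1 and 2 together might look roughly like the sketch below. To stay self-contained it counts characters in in-memory strings rather than files, which is my simplification, and the names CharJob, count_char, count_in_texts and MAX_JOBS are hypothetical. The parts that matter are the per-thread array of work units and the pthread_join loop before sum is read:

```c
#include <pthread.h>

#define MAX_JOBS 16

typedef struct {
    char c;             /* character to look for */
    const char *text;   /* stand-in for the file contents */
} CharJob;

static pthread_mutex_t sum_lock = PTHREAD_MUTEX_INITIALIZER;
static int sum = 0;

static void *count_char(void *threadarg)
{
    const CharJob *job = threadarg;   /* this thread's own work unit */
    const char *p;
    int counter = 0;

    for (p = job->text; *p != '\0'; p++)
        if (*p == job->c)
            counter++;

    pthread_mutex_lock(&sum_lock);
    sum += counter;
    pthread_mutex_unlock(&sum_lock);
    return NULL;
}

/* Counts c in each of the n texts, one thread per text.
   Returns the total, or -1 if n is too large or a thread failed to start. */
int count_in_texts(char c, const char *texts[], int n)
{
    pthread_t threads[MAX_JOBS];
    CharJob jobs[MAX_JOBS];           /* one slot per thread, never reused */
    int t, started = 0;

    if (n > MAX_JOBS)
        return -1;
    sum = 0;
    for (t = 0; t < n; t++) {
        jobs[t].c = c;
        jobs[t].text = texts[t];
        if (pthread_create(&threads[t], NULL, count_char, &jobs[t]) != 0)
            break;
        started++;
    }
    for (t = 0; t < started; t++)     /* wait for the workers before reading sum */
        pthread_join(threads[t], NULL);
    return started == n ? sum : -1;
}
```

Because every worker has been joined by the time sum is read, the final read does not need the mutex here.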

Producer / Consumer using semaphore

I'm starting my studies of synchronized threads using semaphores.
I just did a test using a binary semaphore (2 threads only) and it's all good.
Imagine a LAN house that has 3 computers (threads) and some clients (threads). If all computers are busy, a client waits in a queue with a known limit (e.g. 15 clients).
I can't understand how the threads will relate to each other.
As far as I know, a semaphore is used to control the access of threads to a certain critical region / memory area / global variable.
1) Create 1 semaphore to control the clients accessing the computers (but both are threads);
2) Create 1 semaphore to control the clients in the queue;
But how do I relate threads to threads? How will the semaphore know which thread(s) it should work with?
I don't need a full answer. I just need to understand how the threads will relate to each other. Some help understanding the situation would be enough.
This is my code so far, and it's not working ;P I can't control the clients' access to the 3 available computers.
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#define ROOM_SIZE 15
sem_t queue, pc, mutex;
int room [ROOM_SIZE];
int pcsAvaliable = 3, nAvaliable = 0, roomAvaliable = ROOM_SIZE;
int computers [3]; // 0 -> Empty | 1 -> Ocuppied
void* Lan(void* arg)
{
//Enter the lanhouse
//Verify if there is a computer avaliable
sem_wait(&pc);
if(pcsAvaliable > 0)
{
sem_wait(&mutex);
pcsAvaliable--;
computers[nAvaliable] = 1;
printf("Client took pc: %d\n", nAvaliable);
nAvaliable++;
sem_post(&mutex);
//Wait for 80~90ms
printf("Client released pc: %d\n", nAvaliable);
computers[nAvaliable] = 0;
nAvaliable--;
sem_post(&pc);
}
else
{
printf("No computer avaliable...\n");
//Check the waiting room for avaliable slot
if(roomAvaliable > 0)
{
roomAvaliable--;
printf("Client entered the waiting room.");
}
else
printf("No avaliable space in waiting room..\n");
}
}
int main(int argc, char const *argv[])
{
int i;
if(argc > 1)
{
int numClients = atoi(argv[1]);
sem_init(&pc, 0, 3);
sem_init(&mutex, 0, 1);
pthread_t clients[numClients];
//Create Clients
for(i=0; i< numClients; i++)
{
pthread_create(&clients[i], NULL, Lan, NULL);
}
//Join Clients
for(i=0; i< numClients; i++)
{
pthread_join(clients[i], NULL);
}
}
else
printf("Please, insert a parameter.");
pthread_exit(NULL);
sem_destroy(&pc);
return 0;
}
Strictly speaking, if you're synchronizing tasks between threads you should use a semaphore. An example is reading input before parsing it.
Here's an answer on semaphores.
But if you're using shared resources and need to avoid race conditions (two threads accessing them at the same time), you should use mutexes. Here's a question on what a mutex is.
Also look at the disambiguation by Michael Barr, which is really good.
I would read both questions and the disambiguation thoroughly; you might actually end up using just mutexes and no semaphore, since from what you explained you're only controlling a shared resource.
Common semaphore functions
int sem_init(sem_t *sem, int pshared, unsigned int value); //use 0 for pshared; initializes the semaphore with the given value
int sem_wait(sem_t *sem);//decrements the semaphore; if it is 0, waits until it is incremented
int sem_post(sem_t *sem);//increments the semaphore by 1
int sem_getvalue(sem_t *sem, int *valp);//stores the semaphore's value in valp; the returned int reports errors
int sem_destroy(sem_t *sem);//destroys a semaphore created with sem_init
Common mutex functions (for Linux; not sure which OS you're running on)
int pthread_mutex_init(pthread_mutex_t *p_mutex, const pthread_mutexattr_t *attr); //starts mutex pointed by p_mutex, use attr NULL for simple use
int pthread_mutex_lock(pthread_mutex_t *p_mutex); //locks the mutex
int pthread_mutex_unlock(pthread_mutex_t *p_mutex); //unlocks the mutex
int pthread_mutex_destroy(pthread_mutex_t *p_mutex);//destroys the mutex
You can treat computers as resources. The data structure for the resource can be initialized by the main thread. Then, there can be client threads trying to acquire an instance of resource (a computer). You can use a counting semaphore with a value 3 for the number of computers. To acquire a computer, a client thread does
P (computer_sem).
Similarly, to release it the client thread does
V (computer_sem).
For more information on threads and semaphore usage, refer to
POSIX Threads Synchronization in C.
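In POSIX terms P and V are sem_wait() and sem_post(). Below is a rough sketch of the LAN house along those lines, assuming unnamed semaphores (sem_init) are available, as on Linux. The names pcs, room, client, run_lan_house and MAX_CLIENTS are made up, and using sem_trywait() for the waiting room is just one possible reading of the queue limit:

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

#define COMPUTERS 3
#define ROOM_SIZE 15
#define MAX_CLIENTS 64

static sem_t pcs;      /* counts free computers */
static sem_t room;     /* counts free waiting-room seats */
static pthread_mutex_t stats_lock = PTHREAD_MUTEX_INITIALIZER;
static int in_use, max_in_use, served, turned_away;

static void *client(void *arg)
{
    (void)arg;
    if (sem_trywait(&room) != 0) {         /* no seat left: leave at once */
        pthread_mutex_lock(&stats_lock);
        turned_away++;
        pthread_mutex_unlock(&stats_lock);
        return NULL;
    }
    while (sem_wait(&pcs) == -1 && errno == EINTR)
        continue;                          /* P: block until a computer is free */
    sem_post(&room);                       /* give the seat back */

    pthread_mutex_lock(&stats_lock);
    if (++in_use > max_in_use)
        max_in_use = in_use;
    pthread_mutex_unlock(&stats_lock);

    /* ... use the computer here ... */

    pthread_mutex_lock(&stats_lock);
    in_use--;
    served++;
    pthread_mutex_unlock(&stats_lock);
    sem_post(&pcs);                        /* V: release the computer */
    return NULL;
}

/* Runs the given number of clients and returns the largest number of
   computers that were ever in use at the same time (or -1 on bad input). */
int run_lan_house(int clients)
{
    pthread_t tid[MAX_CLIENTS];
    int i, started = 0;

    if (clients < 0 || clients > MAX_CLIENTS)
        return -1;
    in_use = max_in_use = served = turned_away = 0;
    sem_init(&pcs, 0, COMPUTERS);
    sem_init(&room, 0, ROOM_SIZE);
    for (i = 0; i < clients; i++)
        if (pthread_create(&tid[started], NULL, client, NULL) == 0)
            started++;
    for (i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    sem_destroy(&pcs);
    sem_destroy(&room);
    return max_in_use;
}
```

The semaphore does not need to know which threads it works with: whichever thread calls sem_wait() either gets a unit or sleeps, and the peak never exceeds COMPUTERS however many clients are started.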

static storage with pthread functions

I was practicing some multithreaded programs, but I could not figure the logic behind this output.
#include<stdio.h>
#include<stdlib.h>
#include<pthread.h>
int print_message(void* ptr);
int main()
{
pthread_t thread1,thread2;
char *mesg1 = "Thread 1";
char *mesg2 = "Thread 2";
int iret1, iret2;
pthread_create(&thread1, NULL, print_message, (void *)mesg1);
pthread_create(&thread2, NULL, print_message, (void *)mesg2);
pthread_join(thread1,(void*)&iret1 );
pthread_join(thread2, (void*)&iret2);
printf("Thread 1 return : %d\n", (int)iret1);
printf("Thread 2 return : %d\n", (int)iret2);
return 0;
}
int print_message(void *ptr)
{
char *mesg;
static int i=0;
mesg = (char *)ptr;
printf("%s\n",mesg);
i++;
return ((void*)i);
}
I was expecting the output
Thread 1
Thread 2
Thread 1 return : 1
Thread 2 return : 2
but I am getting the output
Thread 1
Thread 2
Thread 1 return : 0
Thread 2 return : 2
Could someone please clarify this for me? And please point out any errors in my usage of the pthread functions.
The variable i is shared between both threads because it is static. Modifying a variable from multiple threads without synchronization is undefined behaviour, so, in fact, both the output you get and the output you want are “wrong” in the sense that the compiler is under no obligation to give either to you. In fact, I was able to get the output to change depending on the optimisation level I used, and it will undoubtedly differ between platforms.
If you want to modify i, you should use a mutex:
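void *print_message(void *ptr)   /* a thread function must return void * */
{
    static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
    char *mesg;
    static int i=0;
    int local_i = 0;   /* stays 0 if the lock cannot be taken */
    mesg = (char *)ptr;
    printf("%s\n",mesg);
    if (pthread_mutex_lock(&mutex) == 0) {
        local_i = ++i;   /* i is only touched while the mutex is held */
        pthread_mutex_unlock(&mutex);
    }
    return (void *)(intptr_t)local_i;   /* intptr_t needs <stdint.h> */
}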
If you do not use a mutex, you will never be sure to get the output you think you should get.
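There may be a second problem on the receiving side: pthread_join() stores a void * through its second argument, but the question passes it the address of an int. On a platform where a pointer is wider than an int (an assumption about the asker's machine, though typical for 64-bit systems), the second join can then overwrite a neighbouring variable, which could account for the 0. A sketch of receiving the value through a real void *, with hypothetical names next_ticket and sum_of_two_tickets:

```c
#include <pthread.h>
#include <stdint.h>

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static int counter = 0;

static void *next_ticket(void *arg)
{
    int mine;

    (void)arg;
    pthread_mutex_lock(&counter_lock);
    mine = ++counter;
    pthread_mutex_unlock(&counter_lock);
    return (void *)(intptr_t)mine;    /* small integer carried in the pointer */
}

/* Starts two threads and returns the sum of their tickets, or -1 on error. */
int sum_of_two_tickets(void)
{
    pthread_t t1, t2;
    void *ret1 = NULL, *ret2 = NULL;  /* pthread_join stores a void * here */

    counter = 0;
    if (pthread_create(&t1, NULL, next_ticket, NULL) != 0)
        return -1;
    if (pthread_create(&t2, NULL, next_ticket, NULL) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, &ret1);
    pthread_join(t2, &ret2);
    return (int)(intptr_t)ret1 + (int)(intptr_t)ret2;
}
```

Which thread gets ticket 1 and which gets 2 is still up to the scheduler, but the two values are always 1 and 2.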
There are several good books on multi-threading. I found Butenhof's Programming with POSIX Threads quite interesting, but more recent books exist.
You may also want to read this pthreads tutorial online.
Basically, each thread in your program might not see memory as intuitively as you expect (see cache coherence, multi-processing, memory models, C11).
Practically speaking, any access to data shared between threads should be protected by synchronization primitives, e.g. mutexes or rwlocks.
Also, note that debugging multi-threaded programs is challenging due to non-determinism and heisenbugs.

How can I wait for any/all pthreads to complete?

I just want my main thread to wait for any and all my (p)threads to complete before exiting.
The threads come and go a lot for different reasons, and I really don't want to keep track of all of them - I just want to know when they're all gone.
wait() does this for child processes, returning ECHILD when there are no children left; however, wait() does not appear to work with (p)threads.
I really don't want to go through the trouble of keeping a list of every single outstanding thread (as they come and go), then having to call pthread_join on each.
Is there a quick-and-dirty way to do this?
Do you want your main thread to do anything in particular after all the threads have completed?
If not, you can have your main thread simply call pthread_exit() instead of returning (or calling exit()).
If main() returns it implicitly calls (or behaves as if it called) exit(), which will terminate the process. However, if main() calls pthread_exit() instead of returning, that implicit call to exit() doesn't occur and the process won't immediately end - it'll end when all threads have terminated.
http://pubs.opengroup.org/onlinepubs/007908799/xsh/pthread_exit.html
Can't get too much quick-n-dirtier.
Here's a small example program that will let you see the difference. Pass -DUSE_PTHREAD_EXIT to the compiler to see the process wait for all threads to finish. Compile without that macro defined to see the process stop threads in their tracks.
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <time.h>
static
void sleep(int ms)
{
struct timespec waittime;
waittime.tv_sec = (ms / 1000);
ms = ms % 1000;
waittime.tv_nsec = ms * 1000 * 1000;
nanosleep( &waittime, NULL);
}
void* threadfunc( void* c)
{
int id = (int) c;
int i = 0;
for (i = 0 ; i < 12; ++i) {
printf( "thread %d, iteration %d\n", id, i);
sleep(10);
}
return 0;
}
int main()
{
int i = 4;
for (; i; --i) {
pthread_t* tcb = malloc( sizeof(*tcb));
pthread_create( tcb, NULL, threadfunc, (void*) i);
}
sleep(40);
#ifdef USE_PTHREAD_EXIT
pthread_exit(0);
#endif
return 0;
}
The proper way is to keep track of all of your thread IDs, but you asked for a quick and dirty way, so here it is. Basically:
just keep a total count of running threads,
increment it in the main loop before calling pthread_create,
decrement the thread count as each thread finishes.
Then sleep at the end of the main process until the count returns to 0.
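#include <pthread.h>
#include <unistd.h>
#define NUM_THREADS 4 // illustrative value
int running_threads = 0; // only accessed with running_mutex held
pthread_mutex_t running_mutex = PTHREAD_MUTEX_INITIALIZER;
void *threadStart(void *arg)
{
    // do the thread work
    pthread_mutex_lock(&running_mutex);
    running_threads--;
    pthread_mutex_unlock(&running_mutex);
    return NULL;
}
int main(void)
{
    pthread_t tid;
    int i, remaining;
    for (i = 0; i < NUM_THREADS; i++)
    {
        pthread_mutex_lock(&running_mutex);
        running_threads++;
        pthread_mutex_unlock(&running_mutex);
        // launch thread; undo the count if it could not be created
        if (pthread_create(&tid, NULL, threadStart, NULL) == 0)
            pthread_detach(tid);
        else
        {
            pthread_mutex_lock(&running_mutex);
            running_threads--;
            pthread_mutex_unlock(&running_mutex);
        }
    }
    for (;;)
    {
        // read the count under the same mutex the threads use
        pthread_mutex_lock(&running_mutex);
        remaining = running_threads;
        pthread_mutex_unlock(&running_mutex);
        if (remaining == 0)
            break;
        sleep(1);
    }
    return 0;
}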
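The sleep(1) polling could be replaced by a condition variable, so that the waiting thread is woken as soon as the count reaches zero. A sketch under that assumption (all_done, thread_start and launch_and_wait are hypothetical names; the threads are detached so nobody has to join them):

```c
#include <pthread.h>

static int running_threads = 0;
static pthread_mutex_t running_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t all_done = PTHREAD_COND_INITIALIZER;

static void *thread_start(void *arg)
{
    (void)arg;
    /* ... do the thread work ... */
    pthread_mutex_lock(&running_mutex);
    if (--running_threads == 0)
        pthread_cond_signal(&all_done);   /* wake the waiter, if any */
    pthread_mutex_unlock(&running_mutex);
    return NULL;
}

/* Launches n detached threads, blocks until all of them have finished
   their work, and returns how many were actually launched. */
int launch_and_wait(int n)
{
    pthread_t tid;
    int i, launched = 0;

    for (i = 0; i < n; i++) {
        pthread_mutex_lock(&running_mutex);
        running_threads++;                /* count before the thread can run */
        pthread_mutex_unlock(&running_mutex);
        if (pthread_create(&tid, NULL, thread_start, NULL) == 0) {
            pthread_detach(tid);
            launched++;
        } else {
            pthread_mutex_lock(&running_mutex);
            running_threads--;            /* creation failed: undo the count */
            pthread_mutex_unlock(&running_mutex);
        }
    }
    pthread_mutex_lock(&running_mutex);
    while (running_threads > 0)           /* sleeps until signalled; no polling */
        pthread_cond_wait(&all_done, &running_mutex);
    pthread_mutex_unlock(&running_mutex);
    return launched;
}
```

The count is re-checked in a loop after every wakeup, so an early signal sent while threads are still being launched does no harm.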
If you don't want to keep track of your threads then you can detach the threads so you don't have to care about them, but in order to tell when they are finished you will have to go a bit further.
One trick would be to keep a list (linked list, array, whatever) of the threads' statuses. When a thread starts it sets its status in the array to something like THREAD_STATUS_RUNNING and just before it ends it updates its status to something like THREAD_STATUS_STOPPED. Then when you want to check if all threads have stopped you can just iterate over this array and check all the statuses.
Don't forget though that if you do something like this, you will need to control access to the array so that only one thread can access (read and write) it at a time, so you'll need to use a mutex on it.
You could keep a list of all your thread IDs and then call pthread_join on each one.
Of course you will need a mutex to control access to the thread ID list. You will
also need some kind of list that can be modified while it is being iterated over, maybe a std::set<pthread_t>? The code below is pseudocode:
int main() {
pthread_mutex_lock(&mutex);
void *data;
for(threadId in threadIdList) {
pthread_mutex_unlock(&mutex);
pthread_join(threadId, &data);
pthread_mutex_lock(&mutex);
}
printf("All threads completed.\n");
}
// called by any thread to create another
void CreateThread()
{
pthread_t id;
pthread_mutex_lock(&mutex);
pthread_create(&id, NULL, ThreadInit, &id); // pass the id so the thread can use it to remove itself
threadIdList.add(id);
pthread_mutex_unlock(&mutex);
}
// called by each thread before it dies
void RemoveThread(pthread_t& id)
{
pthread_mutex_lock(&mutex);
threadIdList.remove(id);
pthread_mutex_unlock(&mutex);
}
Thanks all for the great answers! There has been a lot of talk about using memory barriers etc - so I figured I'd post an answer that properly showed them used for this.
#define NUM_THREADS 5
unsigned int thread_count;
void *threadfunc(void *arg) {
printf("Thread %p running\n",arg);
sleep(3);
printf("Thread %p exiting\n",arg);
__sync_fetch_and_sub(&thread_count,1);
return 0L;
}
int main() {
int i;
pthread_t thread[NUM_THREADS];
thread_count=NUM_THREADS;
for (i=0;i<NUM_THREADS;i++) {
pthread_create(&thread[i],0L,threadfunc,&thread[i]);
}
do {
__sync_synchronize();
} while (thread_count);
printf("All threads done\n");
}
Note that the __sync functions are non-standard GCC built-ins. LLVM (Clang) supports them too, but if you're using another compiler, you may have to do something different.
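For compilers without the __sync built-ins, a roughly equivalent sketch can be written with C11 <stdatomic.h>, assuming the compiler provides that header. wait_for_workers is a hypothetical wrapper, and sched_yield() is added so the waiting thread at least gives up its time slice:

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define NUM_THREADS 5

static atomic_uint thread_count;

static void *threadfunc(void *arg)
{
    (void)arg;
    /* ... work ... */
    atomic_fetch_sub(&thread_count, 1);   /* C11 counterpart of __sync_fetch_and_sub */
    return NULL;
}

/* Starts NUM_THREADS detached threads, waits until each has decremented
   the counter, and returns the counter's final value. */
unsigned wait_for_workers(void)
{
    pthread_t thread;
    int i;

    atomic_store(&thread_count, NUM_THREADS);
    for (i = 0; i < NUM_THREADS; i++) {
        if (pthread_create(&thread, NULL, threadfunc, NULL) == 0)
            pthread_detach(thread);
        else
            atomic_fetch_sub(&thread_count, 1);   /* this one never started */
    }
    while (atomic_load(&thread_count) != 0)
        sched_yield();                    /* let other threads use the core */
    return atomic_load(&thread_count);
}
```

The default atomic_load and atomic_fetch_sub are sequentially consistent, so no separate barrier call is needed in the loop.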
Another big thing to note is: Why would you burn an entire core, or waste "half" of a CPU spinning in a tight poll-loop just waiting for others to finish - when you could easily put it to work? The following mod uses the initial thread to run one of the workers, then wait for the others to complete:
thread_count=NUM_THREADS;
for (i=1;i<NUM_THREADS;i++) {
pthread_create(&thread[i],0L,threadfunc,&thread[i]);
}
threadfunc(&thread[0]);
do {
__sync_synchronize();
} while (thread_count);
printf("All threads done\n");
}
Note that we start creating the threads starting at "1" instead of "0", then directly run "thread 0" inline, waiting for all threads to complete after it's done. We pass &thread[0] to it for consistency (even though it's meaningless here), though in reality you'd probably pass your own variables/context.

Resources