I am relatively new to threads and forks. To understand them a bit better I have been writing simple programs. I have written two little programs: one prints a counter from two processes, and the other does the same with two threads.
What I noticed is that the fork version prints the counters interleaved, while the thread version prints one thread's counter and then the other's. So the threads do not seem to run in parallel but behave more serially. Why is that? Am I doing something wrong?
Also, what exactly does pthread_join do? Even when I don't do pthread_join the program runs similarly.
Here is my code for the thread
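#include <pthread.h>
#include <stdio.h>
void * thread1(void *a){
    int i =0;
    for(i=0; i<100; i++)
        printf("Thread 1 %d\n",i);
    return NULL; // a void * function must return a value
}
void * thread2(void *b){
    int i =0;
    for(i=0; i<100; i++)
        printf("Thread 2 %d\n", i);
    return NULL;
}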
int main()
{
pthread_t tid1,tid2;
pthread_create(&tid1,NULL,thread1, NULL);
pthread_create(&tid2,NULL,thread2, NULL);
pthread_join(tid1,NULL);
pthread_join(tid2,NULL);
return 0;
}
And here is my code for fork
int main(void)
{
pid_t childPID;
childPID = fork();
if(childPID >= 0) // fork was successful
{
if(childPID == 0) // child process
{ int i;
for(i=0; i<100;i++)
printf("\n Child Process Counter : %d\n",i);
}
else //Parent process
{
int i;
for(i=0; i<100;i++)
printf("\n Parent Process Counter : %d\n",i);
}
}
else // fork failed
{
printf("\n Fork failed, quitting!!!!!!\n");
return 1;
}
return 0;
}
EDIT:
How can I make the threaded program behave more like the fork program, i.e. so that the counter prints interleave?
You are traveling down a bad road here. The lesson you should be learning is not to try to outthink the OS scheduler. No matter what you do (scheduling policies, priorities, or whatever knobs you turn), you cannot do it reliably.
You have backed your way into discovering the need for synchronization mechanisms - mutexes, semaphores, condition variables, thread barriers, etc. What you want to do is exactly why they exist and what you should use to accomplish your goals.
On your last question, pthread_join reclaims some resources from dead, joinable (i.e. not detached) threads and allows you to inspect any return value from the expired thread. In your program the calls mostly serve as a blocking mechanism. That is, main will block on those calls until the threads expire. Without the pthread_joins your main would end and the process would die, including the threads you created. If you don't want to join the threads and aren't doing anything useful in main, then call pthread_exit in main, as this allows main to exit while the threads continue processing.
First of all, a fork creates a second process while creating a thread creates a "dispatchable unit of work" within the same process.
Getting two different processes to be interleaved is usually a simple matter of letting the OS run. However, within a process you need to know more about how the OS chooses which of several threads to run.
You could probably, artificially, get the output from the threads to be interleaved by calling sleep for different times from each thread. That is, create thread A (code it to output one line, and then sleep for 100) then create thread B (code it to output one line and then sleep for 50, etc.)
I understand wanting to see how threads can run in parallel, similar to processes. But, is this a real requirement or just a "zoo" request?
I am learning the basics of POSIX threads. I want to create a program that prints "Hello World!" 10 times with a delay of a second between each printout. I've used a for loop to print it 10 times, but I am stuck on how to implement the time delay part.
This is my code so far:
#define MAX 10
void* helloFunc(void* tid)
{
printf("Hello World!\n", (int)(intptr_t)tid);
}
int main(int ac, char **argv)
{
pthread_t hej[MAX];
for (int i = 0; i < MAX; i++)
{
pthread_create(&hej[i], NULL, helloFunc, (void*)(intptr_t)i);
pthread_join(hej[i], NULL);
}
pthread_exit(NULL);
return(0);
}
Thanks in advance!
There are two major problems with your code:
First of all you must wait for the threads to finish. You do that by joining them with pthread_join. And for that to work you must save the pthread_t value from each and every thread (for example in an array).
If you don't wait for the threads then the exit call will end the process, and that will also unexpectedly kill and end all threads in the process.
For all threads to run in parallel you should wait in a separate loop after you have created them:
pthread_t hej[MAX];
for (int i = 0; i < MAX; i++)
{
pthread_create(&hej[i], ...);
}
for (int i = 0; i < MAX; i++)
{
pthread_join(hej[i], NULL); // pass the pthread_t itself, not its address
}
The second problem is that you pass a pointer to i to the thread, so tid inside the thread functions will all be the same (and a very large and weird value). To pass a value you must first cast it to intptr_t and then to void *:
pthread_create(..., (void *) (intptr_t) i);
And in the thread function you do the opposite casting:
printf("Hello World %d!\n", (int) (intptr_t) tid);
Note that this is an exception to the rule that one should never pass values as pointers (or the opposite).
Finally for the "delay" bit... On POSIX systems there are many ways to delay execution, or to sleep. The natural and simple solution would be to use sleep(1) which sleeps one second.
The problem is where to do this sleep(1) call. If you do it in the thread functions after the printf then all threads will race to print the message and then all will sleep at the same time.
If you do it in the loop where you create the threads, then the threads won't really run in parallel, but rather in serial: one thread prints its message and exits, then the main thread waits one second before creating the next thread. It makes the threads kind of useless.
As a possible third solution, use the value passed to the thread function as the sleep time, so the thread that is created first (when i == 0) will print immediately, the second thread (when i == 1) will sleep one second, and so on, until the tenth thread is created and sleeps nine seconds before printing the message.
Could be done as:
void* helloFunc(void* tid)
{
int value = (int) (intptr_t) tid;
sleep(value);
printf("Hello World %d!\n", value);
// Must return a value, as the function is declared as such
return NULL;
}
I read that main() is a single thread itself, so when I create 2 threads in my program like this:
#include<stdio.h>
#include<pthread.h>
#include<windows.h>
void* counting(void * arg){
int i = 0;
for(i; i < 50; i++){
printf("counting ... \n");
Sleep(100);
}
}
void* waiting(void * arg){
int i = 0;
for(i; i < 50; i++){
printf("waiting ... \n");
Sleep(100);
}
}
int main(){
pthread_t thread1;
pthread_t thread2;
pthread_create(&thread1, NULL, counting, NULL);
pthread_create(&thread2, NULL, waiting, NULL);
pthread_join(thread1, NULL);
pthread_join(thread2, NULL);
int i = 0;
for(i; i < 50; i++){
printf("maining ... \n");
Sleep(1000);
}
}
Is main really a thread in that case?
In that case, if main sleeps for some time, shouldn't main give the CPU to the other threads?
Is main a thread itself here? I am a bit confused.
Is there a specific order to main thread execution?
pthread_join(thread1, NULL);
pthread_join(thread2, NULL);
You asked the main thread to wait until thread1 terminates and then wait until thread2 terminates, so that's what it does.
I read that main() is a single thread itself
No, you have misunderstood. Every C program has a function named main(). C language semantics of the program start with the initial entry into that function. In that sense, and especially when you supply the parentheses, main(), is a function, not a thread.
However, every process also has a main thread which has a few properties that distinguish it from other threads. That is initially the only thread, so it is that thread that performs the initial entry into the main() function. But it is also that thread that runs all C functions called by main(), and by those functions, etc., so it is not, in general, specific to running only the code appearing directly in the body of main(), if that's what you mean by "main() is a single thread itself".
, so when i create 2 threads in my program like this; [...] Is main really a thread in that case?
There is really a main thread in that case, separate from the two additional threads that it starts.
In that case, if main sleeps for some time, shouldn't main give the CPU to the other threads?
If the main thread slept while either of the other two were alive, then yes, one would expect one or both of the others to get (more) CPU time. And in a sense, that's exactly what happens: the main thread calls pthread_join() on each of the other threads in turn, which causes it to wait (some would say "sleep") until those threads terminate before it proceeds. While it's waiting, it does not contend with the other threads for CPU time, as that's pretty much what "waiting" means. But by the time the main thread reaches the Sleep() call in your program, the other threads have already terminated and been joined, because that's what pthread_join() does. They no longer exist, so naturally they don't run during the Sleep().
Is main a thread itself here?
There is a main thread, yes, and it is the only one in your particular process that executes any of the code in function main(). Nothing gets executed except in some thread or other.
I am confused a bit here. Is there a specific order to main thread execution?
As already described, the main thread is initially the only thread. Many programs never have more than that one. Threads other than the main one are created only by the main thread or by another thread that has already been created. Of course, threads cannot run before they are created, nor, by definition, after they have terminated. Threads execute independently of each other, generally without any predefined order, except as is explicitly established via synchronization objects such as mutexes, via for-purpose functions such as pthread_join(), or via cooperative operations on various I/O objects such as pipes.
main() is not a thread but a function, so here's a clear "no" to your initial claim. However, if you read a few definitions of what a thread is, you will find that it is something that can be scheduled, i.e. an ongoing execution of code. Further, a running program cannot actually do anything without such an "ongoing execution of code", with main() as its first entry point. So, definitely, all code executed by a program is executed by some thread, without exception.
BTW: You can retrieve the thread ID of the current thread. Try running that from main(). It will work and give you a value that distinguishes this call from calls from other threads.
I would need some help with some C code.
Basically I have n processes which execute some code. Once they're almost done, I'd like the "Manager Process" (which is the main function) to send to each of the n processes an int variable, which may be different for every process.
My idea was to install a handler with signal(SIGALRM, handler_function) once all processes have started. When a process is almost done, it calls kill(getpid(), SIGSTOP) in order to wait for the Manager Process.
After SIM_TIME seconds passed, handler_function sends int variable on a Message Queue then uses kill(process_pid, SIGCONT) in order to wake up waiting processes. Those processes, after being woken up should receive that int variable from Message Queue, print it and simply terminate, letting Manager Process take control again.
Here's some code:
/**
* Child Process creation using fork() system call
* Parent Process allocates and initializes necessary variables in shared memory
* Child Process executes Student Process code defined in childProcess function
*/
pid_t runChild(int index, int (*func)(int index))
{
pid_t pid;
pid = fork();
if (pid == -1)
{
printf(RED "Fork ERROR!\n" RESET);
exit(EXIT_FAILURE);
}
else if (pid == 0)
{
int res = func(index);
return getpid();
}
else
{
/*INSIGNIFICANT CODE*/
currentStudent = createStudent(pid);
currentStudent->status = FREE;
students[index] = *currentStudent;
currentGroup = createGroup(index);
addMember(currentStudent, currentGroup);
currentGroup->closed = FALSE;
groups[index] = *currentGroup;
return pid;
}
}
Code executed by each Process
/**
* Student Process Code
* Each Student executes this code
*/
int childProcess(int index)
{
/*NOTICE: showing only relevant part of code*/
printf("Process Index %d has almost done, waiting for manager!\n", index);
/* PROGRAM GETS STUCK HERE!*/
kill(getpid(), SIGSTOP);
/* mex variable is already defined, it's a struct implementing Message Queue message struct*/
receiveMessage(mexId, mex, getpid());
printf(GREEN "Student %d has received variable %d\n" RESET, getpid(), mex->variable);
}
Handler Function:
/**
* Handler function
* Will be launched when SIM_TIME is reached
*/
void end_handler(int sig)
{
if (sig == SIGALRM)
{
usleep(150000);
printf(RED "Time's UP!\n" RESET);
printGroups();
for(int i = 0; i < POP_SIZE; i++){
mex->mtype = childPids[i];
mex->variable = generateInt(18, 30);
sendMessage(mexId, mex);
//childPids is an array containing PIDs of all previously launched processes
kill(childPids[i], SIGCONT);
}
}
}
I hope my code is understandable.
I have an issue though: with the provided code the entire program gets stuck at the kill(getpid(), SIGSTOP) system call.
I also tried to launch ps in terminal and no active processes are detected.
I think handler_function doesn't send kill(childPids[i], SIGCONT) system call for some reason.
Any idea how to solve this problem?
Thank you
You might want to start by reading the manual page for mq_overview (man mq_overview). It describes POSIX message queues, a portable and flexible communication mechanism between processes that supports both synchronous and asynchronous communication.
In your approach, there is a general problem of “how does one process know if another is waiting”. If the process hasn’t stopped itself, the SIGCONT is ignored, and when it subsequently suspends itself, nobody will continue it.
In contrast, message-based communication between the two can be viewed as a little language. For simple exchanges (such as yours), the completeness of the grammar can be readily hand checked. For more complex ones, state machines or even nested state machines can be constructed to analyze their behaviour.
I'm having a tricky time understanding how to alternate control between two processes using semaphores. Here's a contrived example of the process handling code.
int pid = fork();
if (pid) {
int counter = 0;
while (true) {
counter += 1;
printf("P%d = %d", pid, counter);
}
} else {
int counter = 0;
while (true) {
counter += 1;
printf("P%d = %d", pid, counter);
}
}
I was expecting the above code to run in parallel, but it seems like control flow continues instantly for the forked process and only later resumes for the parent process.
This is fundamentally breaking my existing code that uses a semaphore to control which process can act.
int id = get_semaphore(1);
int pid = fork();
if (pid) {
int counter = 0;
while (true) {
sem_wait(id);
counter += 1;
printf("P%d = %d\n", pid, counter);
sem_signal(id);
}
} else {
int counter = 0;
while (true) {
sem_wait(id);
counter += 1;
printf("P%d = %d\n", pid, counter);
sem_signal(id);
}
}
The sem_wait helper just subtracts 1 from the semaphore value and blocks until the result is > 0 (uses semop under the hood).
The sem_signal helper just adds 1 to the semaphore value (uses semop under the hood).
I'd like the code to alternate between the two processes, using sem_wait to block until the other process releases the resources with sem_signal. The desired output would be:
P1 = 0
P0 = 0
P1 = 1
P0 = 1
...
However, because of the initial execution delay between the processes, the child process takes the available semaphore resource, uses it to print a number, then restores it and loops — at which point the resource is available again, so it continues without ever waiting for the other process.
What's the best way to prevent a process from using resources if it released them itself?
it seems like control flow continues instantly for the forked process and only later resumes for the parent process
That is because stream IO buffers the output on stdout until either
the buffer is full
fflush() is called on stdout
a newline (\n) is encountered (when stdout is line-buffered, which is typically the case when it is connected to a terminal)
In your program, each process will fill a buffer before sending its contents to stdout giving the appearance of one process running for a long time, then the other. Terminate the format strings of your printf statements with \n and you'll see behaviour in your first program more like you expect.
I am not sure why your semaphore thing isn't working. I'm not very knowledgeable about System V semaphores, but it seems like a red flag to me that you are getting the semaphore after you have forked. With the more common POSIX semaphores, the semaphore has to be in memory that both processes can see, otherwise it's two semaphores.
Anyway, assuming your get_semaphore() function does the right thing to share the semaphore, there is still a problem because there is no guarantee that, when one process signals the semaphore, the other one will start soon enough for it to grab it again before the first process loops round and grabs it itself.
You need two semaphores, one for the parent and one for the child. Before the print each process should wait on its own semaphore. After the print, each process should signal the other semaphore. Also, one semaphore should be initialised with a count of 1 and the other should be initialised with a count of 0.
Semaphores have two general use cases. One is mutual exclusion and the second is synchronization. What's been done in your code is mutual exclusion. What you actually want is synchronization (alternation) between the parent and child processes.
Let me explain a bit:
Mutual exclusion means that at any time only one process can access a "critical section", which is a piece of code that you want only one process/thread to execute at a time. Critical sections generally contain code that manipulates a shared resource.
Coming to your code, since you have used only a single semaphore, there is no guarantee as to the "order" in which each process is allowed to enter the critical section.
ex: sem_wait(id) from your code can be executed by any process and it's not necessary that the two processes should alternate.
For process synchronization (more specifically alternation), you need to use two semaphores, one for the parent and another for the child.
Sample code:
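int parent_sem = get_semaphore(0); // create both semaphores before forking
int child_sem = get_semaphore(1);  // so that parent and child share them
int pid = fork();
if (pid) {
    int counter = 0;
    while (true) {
        sem_wait(parent_sem);   // parent waits for its own turn
        counter += 1;
        printf("P%d = %d\n", pid, counter);
        sem_signal(child_sem);  // hand the turn to the child
    }
} else {
    int counter = 0;
    while (true) {
        sem_wait(child_sem);    // child goes first (count starts at 1)
        counter += 1;
        printf("P%d = %d\n", pid, counter);
        sem_signal(parent_sem); // hand the turn to the parent
    }
}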
You need to initialize one semaphore (in my case the child's) to 1 and the second one to zero. That way only one of the two processes gets to start while the other goes into a wait. Once the child is done printing, it signals the parent. Now the child's semaphore value is zero, so it waits on sem_wait(child_sem) while the parent that was signaled by the child executes. Next time, the parent signals the child and it executes. This continues in alternating sequence and is a classic synchronization problem.
I need to print 2 messages each one in each thread in C and synchronize them.
The one thread prints One and the second prints Two.
So my code is something like that
void printOne(void* empty){
while(1) printf("One ");
}
void printTwo(void* empty){
while(1) printf("Two\n");
}
int main(){
pthread_t t_1,t_2;
pthread_create(&t_1, NULL,(void *)&printOne,NULL);
pthread_create(&t_2, NULL,(void *)&printTwo,NULL);
pthread_join(t_1,NULL);
pthread_join(t_2,NULL);
exit(0);
}
The problem is that randomly prints One and Two but not always in that sequence. I would like to make it print always Two after One. I got a bit confused with the join command.
Thanks in advance!
You are confusing some basic concepts about synchronization here. The pthread_join() function will not guarantee synchronization in the way you are thinking. The join is used to synchronize with threads after their execution has finished, i.e., after the thread function returns. This way, the calling thread will wait for the specified thread to finish its execution, which is exactly what your code is doing. The main thread is waiting for:
First, t_1 to end
Then, t_2 to end
If t_2 ends before t_1, the main thread will still be blocked by t_1, because this order has to be respected. Of course none of them will finish their execution in your code, since both are stuck in an infinite loop (while(1)).
What you are trying to achieve can be done using many techniques. I'd suggest using semaphores (part of the POSIX API) or a mutex (provided by the pthread library).
Here's an example of how your code can be changed to get some synchronization:
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

sem_t s1, s2; //Declare the semaphores globally, before the functions that use them

void *printOne(void *empty){
    while(1)
    {
        sem_wait(&s1); //wait for semaphore s1
        printf("One ");
        sem_post(&s2); //signal semaphore s2
    }
    return NULL; //not reached
}
void *printTwo(void *empty){
    while(1)
    {
        sem_wait(&s2); //wait for semaphore s2
        printf("Two\n");
        sem_post(&s1); //signal semaphore s1
    }
    return NULL; //not reached
}
int main(){
    pthread_t t_1,t_2;
    sem_init(&s1, 0, 1); //Initialize s1 with 1
    sem_init(&s2, 0, 0); //Initialize s2 with 0
    pthread_create(&t_1, NULL, printOne, NULL); //no casts needed with the right signature
    pthread_create(&t_2, NULL, printTwo, NULL);
    pthread_join(t_1,NULL);
    pthread_join(t_2,NULL);
    return 0;
}
This way, your code guarantees that the messages are printed strictly one after the other:
One Two
One Two
...
Why use threads at all if you want it to be synchronous? Just print them sequentially in main().
Otherwise... I guess you could just run one thread after the other. pthread_join makes the program wait for the thread to finish before continuing.
int main(){
pthread_t t_1,t_2;
pthread_create(&t_1, NULL,(void *)&printOne,NULL);
pthread_join(t_1,NULL);
pthread_create(&t_2, NULL,(void *)&printTwo,NULL);
pthread_join(t_2,NULL);
exit(0);
}
You will have to make a breaking condition in your printOne and printTwo functions if you want the threads to actually finish, though...