Does order of thread creation matter? - C

I am writing a multithreaded program where one thread executes a lot of system calls (like read, write), and the other thread executes normal calls like printf.
Suppose thread A is for normal calls and thread B is for system calls; my main function is like:
int main()
{
    pthread_t thread_A;
    pthread_t thread_B;
    pthread_create(&thread_B, NULL, &system_call_func, NULL);
    pthread_create(&thread_A, NULL, &printf_func, NULL);
    pthread_join(thread_B, NULL);
    pthread_join(thread_A, NULL);
    printf("Last thread to be executed was %c\n", write_last);
    return 0;
}
By this, I found that the thread with the system calls always finishes last. Even if I change the order of thread creation and joining, it is still thread B.
I have two questions: does the order of thread creation/joining matter? And is it because of the system calls that thread B always finishes last?

You're just measuring which thread finishes first, not which one runs first. Assuming they both run in parallel and start at roughly the same time, the one that spends less time working is going to finish first.
If you want to observe the sequence of operations in both, run the program under strace -f, but be aware that the overhead of tracing slows things down a lot and tends to eliminate parallelism in the traced program except when it's doing purely computational tasks with no system calls.

Related

Priority based multithreading?

I have written code for two threads, where one is assigned priority 20 (lower) and the other 10 (higher). Upon executing my code, 70% of the time I get the expected result, i.e. the high_prio thread (priority 10) executes first and then low_prio (priority 20).
Why is my code not able to get 100% correct results in all executions? Is there any conceptual mistake that I am doing?
void *low_prio(){
    Something here;
}
void *high_prio(){
    Something here;
}
int main(){
    Thread with priority 10 calls high_prio;
    Thread with priority 20 calls low_prio;
    return 0;
}
Is there any conceptual mistake that I am doing?
Yes -- you have an incorrect expectation regarding what thread priorities do. Thread priorities are not meant to force one thread to execute before another thread.
In fact, in a scenario where there is no CPU contention (i.e. where there are always at least as many CPU cores available as there are threads that currently want to execute), thread priorities will have no effect at all -- because there would be no benefit to forcing a low-priority thread not to run when there is a CPU core available for it to run on. In this no-contention scenario, all of the threads will get to run simultaneously and continuously for as long as they want to.
The only time thread priorities may make a difference is when there is CPU contention -- i.e. there are more threads that want to run than there are CPU cores available to run them. At that point, the OS's thread-scheduler has to make a decision about which thread will get to run and which thread will have to wait for a while. In this instance, thread priorities can be used to indicate to the scheduler which thread it should prefer to allow to run.
Note that it's even more complicated than that, however -- for example, in your posted program, both of your threads are calling printf() rather a lot, and printf() invokes I/O, which means that the thread may be temporarily put to sleep while the I/O (e.g. to your Terminal window, or to a file if you have redirected stdout to file) completes. And while that thread is sleeping, the thread-scheduler can take advantage of the now-available CPU core to let another thread run, even if that other thread is of lower priority. Later, when the I/O operation completes, your high-priority thread will be re-awoken and re-assigned to a CPU core (possibly "bumping" a low-priority thread off of that core in order to get it).
Note that inconsistent results are normal for multithreaded programs -- threads are inherently non-deterministic, since their execution patterns are determined by the thread-scheduler's decisions, which in turn are determined by lots of factors (e.g. what other programs are running on the computer at the time, the system clock's granularity, etc).

How can pthreads return the fastest result and terminate the slower ones?

I'm currently writing a program where the main thread creates three child threads. These threads run simultaneously, and what I want to do is: once one of the child threads is done, check whether its output is right. If it is, terminate the other two threads; if not, throw away this thread's result and wait for the other two threads' results.
I'm creating the three threads in the main function with pthread_create, but I do not know how to use the join function. If I call pthread_join three times in the main function, it just waits for the threads one by one until all three are done.
My plan is like this:
int return_value;
main(){
    pthread_create(&pid[0], NULL, fun0, NULL);
    pthread_create(&pid[1], NULL, fun1, NULL);
    pthread_create(&pid[2], NULL, fun2, NULL);
}
fun0(){
    ...
    if( check the result is right ){
        return_value = result;
        if (pid[1] is running) pthread_kill( pid[1], SIGTERM );
        if (pid[2] is running) pthread_kill( pid[2], SIGTERM );
    }
}
fun1() ...
fun2() ...
Functions 0, 1, and 2 are similar to each other, and once one of them has the right answer, it kills the other two threads. However, while running the program, once pthread_kill is processed the whole program terminates, not just the one thread. I don't know why.
I also do not know if there are other ways to code this program. Thanks for helping me out with this.
The pthread_kill() function is not designed to terminate threads, just like kill() is not designed to terminate processes. These functions just send signals, and their names are unfortunate byproducts of history. The default action for certain signals (SIGTERM among them) is to terminate the entire process, not just one thread. Using pthread_kill() allows you to select which thread handles a signal, but the resulting action is still process-wide (e.g., terminating the process).
To terminate a thread, use pthread_cancel(). This will normally terminate the thread at the next cancellation point. Cancellation points are listed in the man page for pthread_cancel(): only certain functions, such as write(), sleep(), and pthread_testcancel(), are cancellation points.
However, if you set the cancelability type of the thread (with pthread_setcanceltype()) to PTHREAD_CANCEL_ASYNCHRONOUS, you can cancel the thread at any time. This can be DANGEROUS and you must be very careful. For example, if you cancel a thread in the middle of a malloc() call, you will get all sorts of nasty problems later on.
You will probably find it much easier to either test a shared variable every now and then, or perhaps even to use different processes which you can then just kill() if you don't need them any more. Canceling a thread is tricky.
Summary
Easiest option is to just test a variable in each thread to see if it should be canceled.
If this doesn't work, my next recommendation is to use fork() instead of pthread_create(), after which you can use kill().
If you want to play with fire, use asynchronous pthread_cancel(). This will probably explode in your face. You will have to spend hours of your precious time hunting bugs and trying to figure out how to do cleanup correctly. You will lose sleep and your cat will die from neglect.

thread overhead performance

When programming in C with threads on Linux, I am trying to reduce the thread overhead, basically lowering CPU time (and making the program more efficient).
Now in the program lots of threads are being created and need to do a job before it terminates. Only one thread can do the job at the same time because of mutual exclusion.
I know how long a thread will take to complete a job before it starts
Other threads have to wait while there is a thread doing that job. The way they check if they can do the job is if a condition variable is met.
Waiting threads wait using that condition variable, with this specific code (a, b, c, and d are just arbitrary stuff; this is just an example):
while (a == b || c != d){
    pthread_cond_wait(&open, &mylock);
}
How efficient is this? What's happening in the pthread_cond_wait code? Is it a while loop (behind the scenes) that constantly checks the condition variable?
Also, since I know how long a job will take, is it more efficient if I enforce a shortest-job-first scheduling policy? Or does that not matter, since in any combination of threads doing the job the program will take the same amount of time to finish? In other words, does using shortest job first lower CPU overhead for the threads doing the waiting? Shortest job first seems to lower waiting times.
Solve your problem with a single thread, and then ask us for help identifying the best place for exposing parallelisation if you can't already see an avenue where the least locking is required. The optimal number of threads to use will depend upon the computer you use. It doesn't make much sense to use more than n+1 threads, where n is the number of processors/cores available to your program. To reduce thread creation overhead, it's a good idea to give each thread multiple jobs.
The following is in response to your clarification edit:
Now in the program lots of threads are being created and need to do a job before it terminates. Only one thread can do the job at the same time because of mutual exclusion.
No. At most n+1 threads should be created, as described above. What is it you mean by mutual exclusion? I consider mutual exclusion to be "only one thread includes task x in its work queue". This means that no other threads require locking on task x.
Other threads have to wait while there is a thread doing that job. The way they check if they can do the job is if a condition variable is met.
Give each thread an independent list of tasks to complete. If job x is a prerequisite to job y, then job x and job y would ideally be in the same list so that the thread doesn't have to deal with thread mutex objects on either job. Have you explored this avenue?
while (a == b || c != d){
    pthread_cond_wait(&open, &mylock);
}
How efficient is this? What's happening in the pthread_cond_wait code? Is it a while loop (behind the scenes) that constantly checks the condition variable?
In order to avoid undefined behaviour, mylock must be locked by the current thread before calling pthread_cond_wait, so I presume your code calls pthread_mutex_lock to acquire the mylock lock before this loop is entered.
pthread_mutex_lock blocks the thread until it acquires the lock, which means that one thread at a time can execute the code between the pthread_mutex_lock and pthread_cond_wait (the pre-pthread_cond_wait code).
pthread_cond_wait releases the lock, allowing some other thread to run the code between the pthread_mutex_lock and the pthread_cond_wait. Before pthread_cond_wait returns, it waits until it can acquire the lock again. This step is repeated ad hoc while (a == b || c != d).
pthread_mutex_unlock is later called when the task is complete. Until then, only one thread at a time can execute the code between the pthread_cond_wait and the pthread_mutex_unlock (the post-pthread_cond_wait code). In addition, if one thread is running pre-pthread_cond_wait code then no other thread can be running post-pthread_cond_wait code, and vice versa.
Hence, you might as well be running single-threaded code that stores jobs in a priority queue. At least you wouldn't have the unnecessary and excessive context switches. As I said earlier, "Solve your problem with a single thread". You can't make meaningful statements about how much time an optimisation saves until you have something to measure it against.
Also since I know how long a job a thread will take, is it more efficient that I enforce a scheduling policy about shortest jobs first? Or does that not matter since, in any combination of threads doing the job, the program will take the same amount of time to finish. In other words, does using shortest job first lower CPU overhead for other threads doing the waiting? Since the shortest job first seems to lower waiting times.
If you're going to enforce a scheduling policy, then do it in a single-threaded project. If you believe that concurrency will help you solve your problem quickly, then expose your completed single-threaded project to concurrency and derive tests to verify your beliefs. I suggest exposing concurrency in ways that threads don't have to share work.
Pthread primitives are generally fairly efficient; things that block usually consume no or negligible CPU time while blocking. If you are having performance problems, look elsewhere first.
Don't worry about the scheduling policy. If your application is designed such that only one thread can run at a time, you are losing most of the benefits of being threaded in the first place while imposing all of the costs. (And if you're not imposing all the costs, like locking shared variables because only one thread is running at a time, you're asking for trouble down the road.)

thread handling

Suppose thread A creates a thread B, and after a while thread B crashes. Is there any possibility that control moves back to thread A in C?
Something like exception handling.
No. "Control passes back" doesn't make much sense here, since the threads execute independently anyway -- usually, thread A isn't going to sit around waiting for thread B to finish; it will be doing something else.
Incidentally, threads can, of course, check whether another thread is still running. Check your thread library or the system functions that you are using.
However, that will only work for something one could call a "soft crash"; a lot of crashes screw up a lot more than just the thread doing the bad thing, such as hardware exceptions that kill the entire process, or corrupting memory. So, trying to catch crashes in another thread is going to be a good amount of work with little benefit, if any at all. Better spend that time fixing the crashes.
No. They're separate threads of execution. Once thread A has created and started thread B, both A and B can execute independently.
Of course if thread B crashes the whole process, thread A won't exist any more...
Threads cannot call other threads, only signal them. The 'normal' function/method call/return mechanism is stack-based and each thread has its own stack, (it is very common for several threads to run exactly the same code using different stack auto-variables).
If a thread cannot call another thread, then there is no 'return' from one thread to another either.

Barriers for thread syncing

I'm creating n threads and then starting their execution after a barrier breakdown.
In global data space:
int bkdown = 0;
In main():
pthread_barrier_init(&bar, NULL, n);
for(i = 0; i < n; i++)
{
    pthread_create(&threadIdArray[i], NULL, runner, NULL);
    if(i == n-2) printf("breakdown imminent!\n");
    if(i == n-1) printf("breakdown already occurred!\n");
}
In thread runner function:
void *runner(void *param)
{
    pthread_barrier_wait(&bar);
    if(bkdown == 0){ bkdown = 1; printf("barrier broken down!\n"); }
    ...
    pthread_exit(NULL);
}
Expected order:
breakdown imminent!
barrier broken down!
breakdown already occurred!
Actual order: (tested repeatedly)
breakdown imminent!
breakdown already occurred!
barrier broken down!
Could someone explain why I am not getting the "broken down" message before the "already occurred" message?
The order in which threads are run is dependent on the operating system. Just because you start a thread doesn't mean the OS is going to run it immediately.
If you really want to control the order in which threads are executed, you have to put some kind of synchronization in there (with mutexes or condition variables.)
for(i = 0; i < n; i++)
{
    pthread_create(&threadIdArray[i], NULL, runner, NULL);
    if(i == n-2) printf("breakdown imminent!\n");
    if(i == n-1) printf("breakdown already occurred!\n");
}
Nothing stops this loop from executing until i == n-1. pthread_create() just fires off a thread to be run; it doesn't wait for it to start or end. Thus you're at the mercy of the scheduler, which might decide to continue executing your loop, or switch to one of the newly created threads (or do both, on an SMP system).
You're also initializing the barrier to n, so in any case none of the threads will get past the barrier until you've created all of them.
In addition to the answers of nos and Starkey, you have to take into account that you have another serialization in your code that is often neglected: you are doing IO on the same FILE variable, namely stdout.
The access to that variable is mutexed internally and the order in which your n+1 threads (including your calling thread) get access to that mutex is implementation defined, take it basically as random in your case.
So the order in which you get your printf output is the order in which your threads pass through these wormholes.
You can get the expected order in one of two ways:
Create each thread with a higher priority than the main thread. This will ensure that the new thread runs immediately after creation and waits on the barrier.
Move the "breakdown imminent!\n" print before the pthread_create() and add a sched_yield() call after every pthread_create(). This will schedule the newly created thread for execution.
