The first thing I was told when I started working with pthreads was that you should avoid forced thread cancelation, such as pthread_cancel. Instead, the thread should be notified of cancelation through an inter-thread communication channel.
If we have a really long task to run in the thread, we split this task into small chunks and check the cancelation flag after processing each chunk. Like this:
loop {
process_chunk();
if (check_cancel_flag())
break;
}
But what is the best way to implement this check_cancel_flag() function?
With all my experience in C and Linux, I can remember only these methods:
(If you have only one worker thread) You can use volatile sig_atomic_t as the type of the cancelation flag. Check it in the check_cancel_flag() function and set it to true in the thread's signal handler. Then just call pthread_kill from the main thread.
Use any POD type for the cancelation flag and protect it with a mutex. In this case you pay the overhead of locking very often.
Use a mutex as the cancelation flag. Check it with a pthread_mutex_trylock call. If the main thread releases this mutex, it is time for the worker thread to shut down.
(For C11) Use the <stdatomic.h> atomics, the gcc __atomic built-in functions, or another atomic library to set and check the cancelation flag.
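For the C11 option, a minimal sketch might look like the following. It assumes a compiler that ships <stdatomic.h>. Here process_chunk() just bumps a counter, and request_cancel(), chunks_done and run_cancel_demo() are names made up for illustration, not part of any library.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool cancel_flag;   /* set by the main thread, read by the worker */
static atomic_long chunks_done;   /* hypothetical stand-in for real work */

static bool check_cancel_flag(void)
{
    /* Acquire pairs with the release store in request_cancel(). */
    return atomic_load_explicit(&cancel_flag, memory_order_acquire);
}

static void request_cancel(void)
{
    atomic_store_explicit(&cancel_flag, true, memory_order_release);
}

static void process_chunk(void)
{
    atomic_fetch_add(&chunks_done, 1);
}

static void *worker(void *arg)
{
    (void)arg;
    for (;;) {
        process_chunk();
        if (check_cancel_flag())
            break;
    }
    return NULL;
}

/* Starts a worker, requests cancelation, joins it, and returns
   the number of chunks the worker processed before stopping. */
long run_cancel_demo(void)
{
    pthread_t tid;
    int rc;

    atomic_store(&cancel_flag, false);
    atomic_store(&chunks_done, 0);
    rc = pthread_create(&tid, NULL, worker, NULL);
    assert(rc == 0);
    request_cancel();
    rc = pthread_join(tid, NULL);
    assert(rc == 0);
    (void)rc;
    return atomic_load(&chunks_done);
}
```

Because the worker checks the flag only after a chunk, it always finishes at least one chunk, so run_cancel_demo() returns a value of 1 or more.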
I cannot remember anything else.
The question: how do I choose the correct approach?
Do you know of any benchmarks for this problem?
An alternative is to use a reader-writer lock (pthread_rwlock_t) to protect the flag, as your worker threads need to frequently read it but it is only written once.
As long as the chunk that is processed in between checks of the flag isn't too small, the overhead will be insignificant.
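A rough sketch of the reader-writer lock variant, assuming a single process-wide flag. The names flag_lock, cancel_requested and set_cancel_flag() are invented for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_rwlock_t flag_lock = PTHREAD_RWLOCK_INITIALIZER;
static bool cancel_requested = false;

/* Called often by the workers: takes the lock in shared (read) mode. */
bool check_cancel_flag(void)
{
    bool value;
    int rc = pthread_rwlock_rdlock(&flag_lock);
    assert(rc == 0);
    (void)rc;
    value = cancel_requested;
    pthread_rwlock_unlock(&flag_lock);
    return value;
}

/* Called once by the main thread: takes the lock in exclusive (write) mode. */
void set_cancel_flag(void)
{
    int rc = pthread_rwlock_wrlock(&flag_lock);
    assert(rc == 0);
    (void)rc;
    cancel_requested = true;
    pthread_rwlock_unlock(&flag_lock);
}
```

Several workers can hold the read lock at the same time, so they do not serialize on each other the way they would with a plain mutex.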
In the POSIX thread interface, pthread_join(thread) can be used to block until the specified thread exits.
Is there a similar function that will allow execution to block until any child thread exits?
This would be similar to the wait() UNIX system call, except that it would apply to child threads rather than processes.
I don't think this is directly possible from pthreads per se, but you can work around it fairly easily.
Using the pthreads API, you can use pthread_cond_wait and friends to set up a "condition" and wait on it. When a thread is about to exit, signal the condition to wakeup the waiting thread.
Alternatively, another method is to create a pipe with pipe, and when a thread is going to exit, write to the pipe. Have the main thread waiting on the other end of the pipe with either select, poll, epoll, or your favorite variant thereof. (This also allows you to wait simultaneously on other FDs.)
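The pipe approach could be sketched roughly as follows. It assumes at most 256 threads so that an id fits in one byte; exit_pipe, wait_any() and run_pipe_demo() are hypothetical names, and error handling is kept to a minimum.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <pthread.h>
#include <stdint.h>
#include <unistd.h>

#define NUM_THREADS 3

static int exit_pipe[2];  /* [0] = read end (main), [1] = write end (workers) */

static void *worker(void *arg)
{
    unsigned char id = (unsigned char)(intptr_t)arg;
    /* ... real work would go here ... */
    /* Announce that this thread is about to exit. */
    if (write(exit_pipe[1], &id, 1) != 1)
        return (void *)1;
    return NULL;
}

/* Blocks until some worker announces its exit; returns its id, or -1 on error. */
static int wait_any(void)
{
    struct pollfd pfd = { .fd = exit_pipe[0], .events = POLLIN };
    unsigned char id;

    if (poll(&pfd, 1, -1) != 1)
        return -1;
    if (read(exit_pipe[0], &id, 1) != 1)
        return -1;
    return id;
}

/* Returns the number of threads reaped, or -1 on error. */
int run_pipe_demo(void)
{
    pthread_t tids[NUM_THREADS];
    int reaped = 0;

    if (pipe(exit_pipe) != 0)
        return -1;
    for (intptr_t i = 0; i < NUM_THREADS; i++)
        if (pthread_create(&tids[i], NULL, worker, (void *)i) != 0)
            return -1;
    for (int n = 0; n < NUM_THREADS; n++) {
        int id = wait_any();
        assert(id >= 0 && id < NUM_THREADS);
        /* The byte is written just before the thread returns, so this join is short. */
        if (pthread_join(tids[id], NULL) != 0)
            return -1;
        reaped++;
    }
    close(exit_pipe[0]);
    close(exit_pipe[1]);
    return reaped;
}
```

The same pollfd array could carry other descriptors as well, which is the point made above about waiting on several FDs at once.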
Newer versions of Linux also include "eventfds" for doing the same thing; see man eventfd. Note that this isn't POSIX: it is Linux-only, and it is only available if your kernel is reasonably up-to-date (2.6.22 or later).
I've personally always wondered why this API wasn't designed to treat these things similar to file descriptors. If it were me, they'd be "eventables", and you could select files, threads, timers...
I don't think there's any function in the POSIX thread interface to do this.
You'd need to create your own version of it - e.g. an array of flags (one flag per thread) protected by a mutex and a condition variable; where just before "pthread_exit()" each thread acquires the mutex, sets its flag, then does "pthread_cond_signal()". The main thread waits for the signal, then checks the array of flags to determine which thread/s to join (there may be more than one thread to join by then).
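A minimal sketch of that scheme, assuming a fixed number of threads known up front. The names done, done_mutex, done_cond, wait_any() and run_wait_any_demo() are all made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_THREADS 4

static pthread_mutex_t done_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  done_cond  = PTHREAD_COND_INITIALIZER;
static bool done[NUM_THREADS];   /* one flag per thread, protected by done_mutex */

static void *worker(void *arg)
{
    int id = (int)(intptr_t)arg;
    /* ... real work would go here ... */
    pthread_mutex_lock(&done_mutex);
    done[id] = true;                  /* mark this thread as ready to be joined */
    pthread_cond_signal(&done_cond);
    pthread_mutex_unlock(&done_mutex);
    return NULL;
}

/* Blocks until at least one thread has set its flag; returns that thread's index. */
static int wait_any(void)
{
    int found = -1;

    pthread_mutex_lock(&done_mutex);
    while (found < 0) {
        for (int i = 0; i < NUM_THREADS; i++) {
            if (done[i]) {
                done[i] = false;      /* consume the flag so it is reported once */
                found = i;
                break;
            }
        }
        if (found < 0)
            pthread_cond_wait(&done_cond, &done_mutex);
    }
    pthread_mutex_unlock(&done_mutex);
    return found;
}

/* Returns the number of threads joined, or -1 on error. */
int run_wait_any_demo(void)
{
    pthread_t tids[NUM_THREADS];
    int joined = 0;

    for (intptr_t i = 0; i < NUM_THREADS; i++)
        if (pthread_create(&tids[i], NULL, worker, (void *)i) != 0)
            return -1;
    for (int n = 0; n < NUM_THREADS; n++) {
        int id = wait_any();
        assert(id >= 0 && id < NUM_THREADS);
        if (pthread_join(tids[id], NULL) != 0)
            return -1;
        joined++;
    }
    return joined;
}
```

The flags are scanned before waiting, so a thread that finished before the main thread started waiting is still picked up.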
You need to implement a custom one using pthread condition variables: pthread_cond_wait(), pthread_cond_signal()/pthread_cond_broadcast().
I have this code:
int _break=0;
while(_break==0) {
if(someCondition) {
//...
if(someOtherCondition)_break=1;//exit the loop
//...
}
}
The problem is that if someCondition is false, the loop gets heavy on the CPU. Is there a way to sleep for some milliseconds in the loop so that the cpu will not have a huge load?
Update
What I'm trying to do is a server-client application, without using sockets, just using shared memory, semaphores and system calls. I'm doing this on linux.
someOtherCondition becomes true when the application receives the "kill" signal, while someCondition is true if the message received is valid. If it's not valid, it keeps waiting for a valid message and the while loop becomes a heavy infinite loop (it works but loads the CPU too much). I would like to make it lightweight.
I'm working on Linux (Debian 7).
If you have a single-threaded application, then it won't make any difference whether you suspend the execution or not.
If you have multiple threads running, then you should use a binary semaphore instead of polling a global variable.
This thread should acquire the semaphore at the beginning of each iteration, and one of the other threads should release the semaphore whenever you wish this thread to run.
This method is also known as "producer-consumer".
When a thread attempts to acquire a binary semaphore:
If the semaphore is released, then the calling thread acquires it and continues the execution.
If the semaphore is already acquired, then the calling thread "asks" the OS to block itself, and the OS will unblock it as soon as some other thread releases the semaphore.
The entire procedure is "atomic", i.e., no context-switch between threads can take place while the semaphore code is executed. This is generally achieved by disabling the interrupts. Everything is implemented within the semaphore code, so you need not "worry" about it.
Since you did not specify what OS you're using, I cannot provide any technical details (i.e., code)...
UPDATE:
If you are trying to protect a critical section inside the loop (i.e., if you are accessing some other global variable, which is also being accessed by other threads, and at least one of those threads is changing that global variable), then you should use a Mutex instead of a binary semaphore.
There are two advantages for using a Mutex in this case:
It can be released only by the thread which has acquired it (thus ensuring mutual exclusion).
It can resolve a specific type of deadlocks that occur when a high-priority thread is waiting for a low-priority thread to complete, while a medium-priority thread is preventing the low-priority thread from completing (a.k.a. priority-inversion).
Of course, a Mutex is required only if you really need to ensure mutual exclusion for accessing the data.
UPDATE #2:
Now that you've added some specific details on your system, here is the general scheme:
Step #1 - Before starting your threads:
// Declare a global variable 'sem' (e.g., a POSIX 'sem_t')
// Initialize the global variable 'sem' with 'count = 0' (i.e., as acquired)
Step #2 - In this thread:
// Declare the global variable 'sem' as 'extern'
while(1)
{
sem_wait(&sem); // blocks until the semaphore is released
//...
}
Step #3 - In the Rx ISR (i.e., the code that receives a message):
// Declare the global variable 'sem' as 'extern'
sem_post(&sem); // releases the semaphore and wakes up the waiting thread
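On Linux the scheme above could be sketched with an unnamed POSIX semaphore, as below. This is only an illustration with two threads in one process; NUM_MESSAGES, messages_handled and run_sem_demo() are hypothetical, and the main thread stands in for whatever receives the messages.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

#define NUM_MESSAGES 5

static sem_t sem;             /* count 0 means "acquired": the consumer blocks */
static int messages_handled;  /* written only by the consumer thread */

static void *consumer(void *arg)
{
    (void)arg;
    for (int i = 0; i < NUM_MESSAGES; i++) {
        /* Sleeps without using CPU until the producer posts;
           retries if a signal interrupts the wait. */
        while (sem_wait(&sem) == -1 && errno == EINTR)
            continue;
        messages_handled++;
    }
    return NULL;
}

/* Returns the number of messages the consumer handled, or -1 on error. */
int run_sem_demo(void)
{
    pthread_t tid;

    messages_handled = 0;
    if (sem_init(&sem, 0, 0) != 0)      /* pshared = 0, initial count = 0 */
        return -1;
    if (pthread_create(&tid, NULL, consumer, NULL) != 0)
        return -1;
    for (int i = 0; i < NUM_MESSAGES; i++) {
        int rc = sem_post(&sem);        /* one post per "message" */
        assert(rc == 0);
        (void)rc;
    }
    if (pthread_join(tid, NULL) != 0)
        return -1;
    sem_destroy(&sem);
    return messages_handled;
}
```

For separate processes the semaphore would have to live in the shared memory segment and be created with a non-zero pshared argument. sem_post() is async-signal-safe, so it can also be called from a signal handler.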
You're right: spinning in a loop without any delay will use a fair amount of CPU, and a small time delay will reduce that.
Using Sleep() is the easiest way; on Windows it is declared in the windows.h header.
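On Linux the closest equivalent is nanosleep(). A small helper, sketched here under the assumption that the delay is given in non-negative milliseconds (sleep_ms is a made-up name):

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Sleeps for roughly `ms` milliseconds. If a signal interrupts the sleep,
   it resumes for the remaining time. Returns 0 on success, -1 on error. */
int sleep_ms(long ms)
{
    struct timespec req, rem;

    assert(ms >= 0);
    req.tv_sec  = ms / 1000;
    req.tv_nsec = (ms % 1000) * 1000000L;
    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR)
            return -1;
        req = rem;   /* continue with whatever time was left */
    }
    return 0;
}
```

Calling something like sleep_ms(10) in the branch where someCondition is false would keep the loop from spinning, at the cost of up to 10 ms of extra latency before a new message is noticed.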
Having said that, the most elegant solution would be to thread your code so that the code is only ever run when your condition is true, that way it will truly sleep until you wake it up.
I suggest you look into pthread and mutex. This will allow you to sleep that loop of yours entirely until the condition becomes true.
Hope that helps in some way :)
Suppose I have multiple threads blocking on a call to pthread_mutex_lock(). When the mutex becomes available, does the first thread that called pthread_mutex_lock() get the lock? That is, are calls to pthread_mutex_lock() in FIFO order? If not, what, if any, order are they in? Thanks!
When the mutex becomes available, does the first thread that called pthread_mutex_lock() get the lock?
No. One of the waiting threads gets a lock, but which one gets it is not determined.
FIFO order?
FIFO mutex is rather a pattern already. See Implementing a FIFO mutex in pthreads
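One common way to get that pattern is a ticket lock built on a pthread mutex and a condition variable. The following is only a sketch of that idea, not necessarily what the linked answer does; fifo_mutex_t and its functions are invented names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

typedef struct {
    pthread_mutex_t mutex;        /* protects the two counters below */
    pthread_cond_t  cond;
    unsigned long   next_ticket;  /* next ticket to hand out */
    unsigned long   now_serving;  /* ticket that currently owns the lock */
} fifo_mutex_t;

int fifo_mutex_init(fifo_mutex_t *m)
{
    m->next_ticket = 0;
    m->now_serving = 0;
    if (pthread_mutex_init(&m->mutex, NULL) != 0)
        return -1;
    if (pthread_cond_init(&m->cond, NULL) != 0)
        return -1;
    return 0;
}

void fifo_mutex_lock(fifo_mutex_t *m)
{
    pthread_mutex_lock(&m->mutex);
    unsigned long my_ticket = m->next_ticket++;
    /* Wait until every earlier ticket holder has unlocked. */
    while (my_ticket != m->now_serving)
        pthread_cond_wait(&m->cond, &m->mutex);
    pthread_mutex_unlock(&m->mutex);
}

void fifo_mutex_unlock(fifo_mutex_t *m)
{
    pthread_mutex_lock(&m->mutex);
    assert(m->now_serving != m->next_ticket);  /* must currently be locked */
    m->now_serving++;
    /* Wake everyone; only the holder of the matching ticket proceeds. */
    pthread_cond_broadcast(&m->cond);
    pthread_mutex_unlock(&m->mutex);
}
```

The order is FIFO with respect to when each thread obtained its ticket, i.e. when it got through the short internal critical section, which is itself an ordinary unordered mutex.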
"If there are threads blocked on the mutex object referenced by mutex when pthread_mutex_unlock() is called, resulting in the mutex becoming available, the scheduling policy shall determine which thread shall acquire the mutex."
Aside from that, the answer to your question isn't specified by the POSIX standard. It may be random, or it may be in FIFO or LIFO or any other order, according to the choices made by the implementation.
FIFO ordering is about the least efficient mutex wake order possible. Only a truly awful implementation would use it. The thread that ran most recently may be able to run again without a context switch, and the more recently a thread ran, the more of its data and code will be hot in the cache. Reasonable implementations try to give the mutex to the thread that held it most recently most of the time.
Consider two threads that do this:
1) Acquire a mutex.
2) Adjust some data.
3) Release the mutex.
4) Go to step 1.
Now imagine two threads running this code on a single core CPU. It should be clear that FIFO mutex behavior would result in one "adjust some data" per context switch -- the worst possible outcome.
Of course, reasonable implementations generally do give some nod to fairness. We don't want one thread to make no forward progress. But that hardly justifies a FIFO implementation!
I want to implement a mutex lock.
From my understanding, mutex.lock() should work like
1) check lock owner
2) if lock is owned, put thread in waiting queue
3) suspend this thread until another thread sends a wake-up signal
However, there is nothing like pthread_suspend(), then how do I do suspend?
I found someone suggesting pthread_cond_wait(), but it seems that to use that function I have to set up a pthread_mutex lock first, and it doesn't make sense to use a pthread_mutex inside my own mutex.
Well, if my understanding of mutex is wrong, please correct me.
Thanks.
Mutexes, locks, and wait conditions are all different, distinct things. You need a mutex variable in order to implement both a lock and a wait condition.
A lock is a simple mechanism that prevents more than one thread from executing the same code at once by making all but one thread wait for the lock to become unlocked.
A wait condition is a slightly more complex structure that allows a thread to monitor a condition (usually a boolean flag) and only wake up when the flag has changed favourably.
In both cases, when a thread blocks (i.e. sleeps), the operating system's scheduling primitives automatically take care of descheduling the thread and using the available computing time elsewhere. Thread and task scheduling is not something you would normally have to worry about manually.
You can only make things that are at least as complex as the simplest pieces you have. If the simplest pieces you have are mutexes, then you can't make mutexes from the pieces you have. You can only make things at least as complex as a mutex or more so. If you have any pieces simpler than a mutex, tell us what they are, and we can tell you how to make a mutex out of them.
I suppose, if you want, you can make your own mutex out of pthread mutexes and condition variables. I'm not sure what the point is, but it's trivial to do. As you noted, you can use pthread_cond_wait to wait on your own kind of mutex.
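For what it's worth, a sketch of such a home-made mutex, assuming nothing beyond plain pthreads. my_mutex_t and its functions are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

typedef struct {
    pthread_mutex_t guard;     /* protects `locked`; held only briefly */
    pthread_cond_t  unlocked;  /* signaled whenever `locked` becomes false */
    bool            locked;
} my_mutex_t;

int my_mutex_init(my_mutex_t *m)
{
    m->locked = false;
    if (pthread_mutex_init(&m->guard, NULL) != 0)
        return -1;
    if (pthread_cond_init(&m->unlocked, NULL) != 0)
        return -1;
    return 0;
}

void my_mutex_lock(my_mutex_t *m)
{
    pthread_mutex_lock(&m->guard);
    /* pthread_cond_wait releases `guard` while sleeping, so the
       owner can get in to unlock; this is the "suspend" step. */
    while (m->locked)
        pthread_cond_wait(&m->unlocked, &m->guard);
    m->locked = true;
    pthread_mutex_unlock(&m->guard);
}

void my_mutex_unlock(my_mutex_t *m)
{
    pthread_mutex_lock(&m->guard);
    assert(m->locked);         /* unlocking an unlocked mutex is a bug */
    m->locked = false;
    pthread_cond_signal(&m->unlocked);
    pthread_mutex_unlock(&m->guard);
}
```

The inner pthread mutex is only held for a few instructions, while my_mutex_t itself can be held for as long as the caller likes.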
The reason the pthreads standard gives you a mutex is because it's about the most flexible of the possible synchronization primitives.
mutex.lock() should work like:
1) check lock owner
2) if lock is owned, put thread in waiting queue
3) suspend this thread until THE THREAD THAT OWNS THE LOCK sends a wake up signal. No other thread can release the lock.
These steps should be performed as an atomic operation so that the correct behaviour is followed for all threads acquiring/releasing the mutex, no matter how such calls may be interrupted and reentered from other threads.
'However, there is nothing like pthread_suspend(), then how do I do suspend?' - usually, you don't. The OS kernel provides synchronization primitives that can block threads that should not run. To implement a 'suspend' in user space, you can only spin-wait. That is a good strategy in a few cases (an underloaded multi-core box where the lock is only held for a very short time), but certainly not in all, and it can lead to spectacularly disastrous livelocks across whole clusters of machines.
If you want a mutex, use an OS mutex - that's what any cross-platform lib. will do.
I am developing a user-level thread library as part of a project. I came up with an approach to implementing a mutex and would like to hear your views before going ahead with it. Basically, I need to implement just 3 functions in my library:
mutex_init, mutex_lock and mutex_unlock
I thought my mutex_t structure would look something like
typedef struct
{
int available; //indicates whether the mutex is locked or unlocked
queue listofwaitingthreads;
gtthread_t owningthread;
}mutex_t;
In my mutex_lock function, I will first check in a while loop whether the mutex is available. If it is not, I will yield the processor so that the next thread can execute.
In my mutex_unlock function, I will check if the owner thread is the current thread. If it is, I will set available to 0.
Is this the way to go about it? Also, what about deadlock? Should I take care of those conditions in my user-level library, or should I leave it to the application programmers to write code properly?
This won't work, because you have a race condition. If 2 threads try to take the lock at the same time, both will see available == 0, and both will think they succeeded in taking the mutex.
If you want to do this properly, and without using an already-existing lock, you must use atomic hardware operations such as TAS (test-and-set), CAS (compare-and-swap), etc.
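In C11 a test-and-set operation is available portably as atomic_flag. A sketch of a spinning lock built on it, where sched_yield() stands in for whatever yield function the thread library provides and spin_mutex_t is a made-up type:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_flag held;   /* set while some thread owns the lock */
} spin_mutex_t;

#define SPIN_MUTEX_INITIALIZER { ATOMIC_FLAG_INIT }

/* Returns true if the lock was taken. Test-and-set atomically sets the
   flag and returns its previous value, so exactly one caller sees "false". */
bool spin_mutex_trylock(spin_mutex_t *m)
{
    return !atomic_flag_test_and_set_explicit(&m->held, memory_order_acquire);
}

void spin_mutex_lock(spin_mutex_t *m)
{
    while (!spin_mutex_trylock(m))
        sched_yield();   /* give the owner a chance to run and unlock */
}

void spin_mutex_unlock(spin_mutex_t *m)
{
    atomic_flag_clear_explicit(&m->held, memory_order_release);
}
```

Checking and setting the flag happen in one indivisible step here, which is exactly what the separate "check available, then set it" sequence lacks.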
There are algorithms that give you mutual exclusion without such hardware support, but they make some assumptions that are often false. For more details about this, I highly recommend reading Herlihy and Shavit's The Art of Multiprocessor Programming, chapter 7.
You shouldn't worry about deadlocks at this level. Mutex locks should be simple, and the assumption is that the programmer using them takes care not to cause deadlocks (advanced mutexes can check for self-deadlock, meaning a thread that calls lock twice without calling unlock in between).
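pthreads itself offers such a self-deadlock check through the PTHREAD_MUTEX_ERRORCHECK mutex type. A small sketch of what that looks like from the caller's side (relock_result is a made-up helper name):

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Locks an error-checking mutex twice from the same thread and
   returns the result of the second lock attempt. */
int relock_result(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t mutex;
    int rc;

    rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&mutex, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);

    rc = pthread_mutex_lock(&mutex);
    assert(rc == 0);
    /* Second lock by the owner: reported as an error instead of hanging. */
    rc = pthread_mutex_lock(&mutex);

    pthread_mutex_unlock(&mutex);
    pthread_mutex_destroy(&mutex);
    return rc;
}
```

With this mutex type the second call fails with EDEADLK rather than blocking forever. A user-level library could do the same by comparing the owning thread with the caller inside mutex_lock.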
Not only do you have to use atomic operations to read and modify the flag (as Eran pointed out), you also have to make sure that your queue can handle concurrent accesses. This is not completely trivial; it is something of a chicken-and-egg problem.
But if you really implemented this by spinning, you wouldn't even need such a queue. The order in which threads acquire the lock would then be mostly random, though.
Just yielding would probably not be enough either: it can be quite costly if threads hold the lock for more than a few processor cycles. Consider using nanosleep with a small time value for the wait.
In general, a mutex implementation should look like:
Lock:
while (trylock()==failed) {
atomic_inc(waiter_cnt);
atomic_sleep_if_locked();
atomic_dec(waiter_cnt);
}
Trylock:
return atomic_swap(&lock, 1);
Unlock:
atomic_store(&lock, 0);
if (waiter_cnt) wakeup_sleepers();
Things get more complex if you want recursive mutexes, mutexes that can synchronize their own destruction (i.e. freeing the mutex is safe as soon as you get the lock), etc.
Note that atomic_sleep_if_locked and wakeup_sleepers correspond to FUTEX_WAIT and FUTEX_WAKE ops on Linux. The other atomics are probably CPU instructions, but could be system calls or kernel-assisted userspace function code, as in the case of Linux/ARM and the 0xffff0fc0 atomic compare-and-swap call.
You do not need atomic instructions for a user-level thread library, provided that all the threads are user-level threads multiplexed onto a single kernel thread of the same process. When your process is given a time slice, it runs multiple user-level threads during that slice, but only one at a time and on the same processor. So, as long as the library does not switch threads in the middle of a mutex function (for example, by blocking its preemption timer signal while inside the library), no two threads are going to be in the library function at the same time, and mutual exclusion is guaranteed.