I have a C application that uses pthreads.
There is lock contention between two threads (say A and B): A gets the lock first while B waits for it. Once A is done and releases the lock, B still doesn't get it, and after a while A gets the lock again (A acquires and releases in a loop).
If I attach gdb to my process, pause thread A after it has given up the lock, and manually continue thread B, it then gets the lock and does what is needed.
This does not look like a deadlock to me.
What could be preventing thread B from getting the lock? Any help is greatly appreciated.
Sample Code:
Thread A:
while (1)
{
    pthread_mutex_lock(&lock);
    // Do stuff
    pthread_mutex_unlock(&lock);
    // Do more stuff
}
Thread B:
pthread_mutex_lock(&lock);
// Do some stuff
pthread_mutex_unlock(&lock);
It looks like your algorithm suffers from starvation; you should queue your accesses to the lock. See
pthreads: thread starvation caused by quick re-locking
or
Fair critical section (Linux)
In answer to the comment asking what a mutex (pthread library) is:
A mutex is a lock (from the Pthread library) that guarantees the following
three things:
Atomicity - Locking a mutex is an atomic operation,
meaning that the threads library assures you that if you lock a mutex,
no other thread can succeed in locking that mutex at the same time.
Singularity - If a thread managed to lock a mutex, it is assured that
no other thread will be able to lock the same mutex until the original
thread releases the lock.
Non-Busy Wait - If thread A attempts to lock a mutex that was locked
by thread B, thread A will be suspended (and will not consume any
CPU resources) until the lock is freed by thread B. When thread B
unlocks the mutex, thread A will wake up and continue
execution, having the mutex locked by it.
It does not guarantee fairness.
If you are still interested in a sort of reader/writer fairness, note for pthread_rwlock_rdlock that implementations are allowed to favour writers over readers to avoid writer starvation.
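As a rough illustration of "queueing access to the lock", here is a minimal ticket-lock sketch built on a pthread mutex and condition variable; the fair_lock_t type and the function names are made up for this example, not an existing API:

#include <pthread.h>

typedef struct {
    pthread_mutex_t m;
    pthread_cond_t  c;
    unsigned long next_ticket;   /* next number to hand out */
    unsigned long now_serving;   /* ticket currently allowed in */
} fair_lock_t;

#define FAIR_LOCK_INITIALIZER { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, 0 }

void fair_lock(fair_lock_t *l)
{
    pthread_mutex_lock(&l->m);
    unsigned long my_ticket = l->next_ticket++;   /* join the queue */
    while (l->now_serving != my_ticket)
        pthread_cond_wait(&l->c, &l->m);          /* sleep until it is our turn */
    pthread_mutex_unlock(&l->m);
}

void fair_unlock(fair_lock_t *l)
{
    pthread_mutex_lock(&l->m);
    l->now_serving++;                             /* admit the next ticket holder */
    pthread_cond_broadcast(&l->c);
    pthread_mutex_unlock(&l->m);
}

With this, a thread A that releases and immediately re-acquires has to take a new ticket and line up behind B, so B cannot be starved.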
Another possibility is that your lock was already claimed earlier on thread A, preventing the acquire/release pair from releasing it fully (the lock count for the thread stays too high).
Starvation is another strong possibility, but your question states that "after a while A gets the lock again", indicating more than a few microseconds :), which should make starvation unlikely.
Is it possible that you return from A or use a continue statement, thus keeping the lock?
Related
What happens when you call pthread_cond_broadcast() and multiple threads wake up just to compete for the same mutex lock. One of the threads takes the mutex lock but what happens to the other threads? Do they go back to sleep? Or do they spin until the lock is available again?
When you call pthread_cond_broadcast(), all threads then waiting on the specified condition variable stop doing so. All such threads will have passed (a pointer to) the same mutex to pthread_cond_wait(), else the behavior is undefined. Each thread that was unblocked will (re)acquire that mutex before returning successfully from pthread_cond_wait(). That may require some or even all of them to block, just as if they were all contending for the same mutex under any other circumstances. They do not spin, and they do not require any further interaction with the CV for them to resume, but each one will hold the mutex locked when it returns from pthread_cond_wait(), just as it did when it called that function.
This question concerns the pthread API for Posix systems.
My understanding is that when waiting on a condition variable, or more specifically a pthread_cond_t, the flow goes something like this.
// imagine the mutex is named mutex and the condition variable is named cond
// first we lock the mutex to prevent race conditions
pthread_mutex_lock(&mutex);
// then we wait on the condition variable, releasing the mutex while we wait
pthread_cond_wait(&cond, &mutex);
// after we're done waiting we own the mutex again and have to release it
pthread_mutex_unlock(&mutex);
In this example we stop waiting on the condition variable when some other thread follows a procedure like this.
// lock the mutex to prevent race conditions
pthread_mutex_lock(&mutex);
// signal the condition variable, giving up control of the mutex
pthread_cond_signal(&cond);
My understanding is that if multiple threads are waiting, some kind of scheduling policy will be applied, and whichever thread is unblocked also gets back the associated mutex.
Now what I don't understand is what happens when some thread calls pthread_cond_broadcast(&cond) to wake all of the threads waiting on the condition variable.
Does only one thread get to own the mutex? Do I need to wait in a fundamentally different manner when waiting for a broadcast than when waiting for a signal (i.e. by not calling pthread_mutex_unlock unless I can confirm this thread acquired the mutex)? Or am I wrong in my whole understanding of how the mutex/cond relationship works?
Most importantly, if (as I think is probably the case) pthread_cond_broadcast causes threads to compete for the associated mutex as if they had all tried to lock it, does that mean only one thread will really wake up?
When a thread calls pthread_cond_broadcast while holding the mutex, it keeps holding the mutex, which means that once pthread_cond_broadcast returns, it still owns the mutex.
The other threads will all wake up, try to lock the mutex, then go to sleep to wait for the mutex to become available.
If you call pthread_cond_broadcast while not holding the mutex, then one of the other threads will be able to lock the mutex immediately. All the others will have to wait to lock the mutex.
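For completeness, here is a small sketch of how waiting and broadcasting usually look in practice; the ready flag and the function names are just placeholders for whatever shared state the threads are actually coordinating on:

#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond  = PTHREAD_COND_INITIALIZER;
static bool ready = false;

/* Each waiter returns from pthread_cond_wait() holding the mutex, one at a
 * time, so it just re-checks the predicate and unlocks as usual; waiting for
 * a broadcast needs no special handling compared to waiting for a signal. */
void *waiter(void *arg)
{
    pthread_mutex_lock(&mutex);
    while (!ready)                        /* loop guards against spurious wakeups */
        pthread_cond_wait(&cond, &mutex);
    /* ... use the shared state while still holding the mutex ... */
    pthread_mutex_unlock(&mutex);
    return NULL;
}

/* The broadcaster still owns the mutex when pthread_cond_broadcast() returns;
 * the waiters only start leaving pthread_cond_wait() once it unlocks. */
void wake_everyone(void)
{
    pthread_mutex_lock(&mutex);
    ready = true;
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&mutex);
}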
I got a test question in an interview.
You need to grab both a spin_lock and a mutex in order to do something.
What is the correct order of acquiring them? Why?
I have some thoughts about this but no strong opinion about the answer.
The reason you grab a lock is to protect a critical region against corruption from races, either on SMP or from preemption on a single CPU. So it matters whether your machine is SMP or single-CPU, and it also matters what code sits inside the spin lock and the mutex: is there kmalloc, vmalloc, mem_cache_alloc, alloc_bootmem, a function that touches __user memory, or even usleep?
spin_lock - the simplest lock, in include/asm/spinlock.h. Only one thread can be inside a spin_lock at a time. Any other thread that tries to take the spin_lock spins in place (on the same instruction) until the previous thread frees it; a spinning thread does not go to sleep. So with a spin_lock you can have two or more threads busy at the same time (one working, one spinning), which is pointless on a single-CPU machine but works very well on SMP. The code section inside a spin_lock should be small and fast. If your code has to work on different machines, check CONFIG_SMP and CONFIG_PREEMPT.
mutex - on the other hand, works like a semaphore (include/asm/semaphore.h) whose count is one. Since the count is one, only one thread can be inside the mutex-locked region. Any other thread that tries to take the lock sees the count at zero (because one thread is already inside) and goes onto a wait queue; it is woken up when the mutex is released and the count is one again. A thread inside a mutex-locked region can sleep: it can call memory-allocation functions and access userspace memory.
(SMP) Now imagine you take the spin lock first and the mutex second. Only one thread can take first the spin lock and then the mutex, but the code inside the mutex region now cannot sleep, which is bad, because the mutex sits inside the spin lock.
(SMP) If instead you take the mutex first and the spin lock second, it is the same situation: only one thread can enter the locked region. But between taking the mutex and taking the spin lock the code may sleep, and between spin_unlock and releasing the mutex it may sleep too. The no-sleep region is reduced to just the spin lock, which is good.
TL;DR: Lock the mutex first and then the spin lock.
First, you should try to avoid such situations altogether and be very careful about deadlocks.
Then, you should consider the effects of locking. A mutex may cause a thread to block and sleep, while a spin lock may cause a thread to occupy the processor in a busy-waiting loop. So the general recommendation is to keep critical sections that hold a spin lock short in time, which leads to the following rule of thumb: do not sleep (e.g. by locking a mutex) while holding a spin lock, or you will waste CPU time.
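The answers above are phrased in kernel terms; here is a minimal userspace sketch of the same ordering with the pthread primitives (lock names and the work inside are placeholders):

#include <pthread.h>

static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_spinlock_t spin;   /* assumed initialised with pthread_spin_init() at startup */

void do_both(void)
{
    pthread_mutex_lock(&mtx);      /* may sleep: fine, no spin lock is held yet */
    /* ... work that is allowed to sleep ... */
    pthread_spin_lock(&spin);      /* take the spin lock last, hold it briefly */
    /* ... short, non-sleeping critical section ... */
    pthread_spin_unlock(&spin);
    /* ... more work that may sleep ... */
    pthread_mutex_unlock(&mtx);
}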
Several processes access shared memory, locking it with a mutex and pthread_mutex_lock() for synchronization, and each process can be killed at any moment (in fact I am describing php-fpm with the APC extension, but that doesn't matter).
Will the mutex be unlocked automatically, if the process locked the mutex and then was killed?
Or is there a way to unlock it automatically?
Edit: As it turns out, dying processes and dying threads behave similarly in this situation, and what happens depends on the robust attribute of the mutex.
That depends on the type of mutex. A "robust" mutex will survive the death of the thread/process. See this question: POSIX thread exit/crash/exception-crash while holding mutex
The next thread that attempts to lock it will receive an EOWNERDEAD error code.
Note: Collected information from the comments.
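A sketch of how such a robust, process-shared mutex might be set up and used; init_robust_mutex and lock_robust are illustrative names, and m is assumed to point into the shared memory segment:

#include <pthread.h>
#include <errno.h>

void init_robust_mutex(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);  /* shared across processes */
    pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);     /* survive owner death */
    pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
}

int lock_robust(pthread_mutex_t *m)
{
    int rc = pthread_mutex_lock(m);
    if (rc == EOWNERDEAD) {
        /* previous owner died while holding the lock:
           repair whatever shared state it may have left half-written ... */
        pthread_mutex_consistent(m);   /* ... then mark the mutex usable again */
        rc = 0;
    }
    return rc;
}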
According to this question here, using pthread_spin_lock to protect a critical section is dangerous, because the owning thread might be preempted by the scheduler out of the blue, and other threads contending for that resource would be left spinning.
Suppose I decide to switch from pthread_spin_lock to locks implemented with atomic built-ins and the compare_and_swap idiom: will that improve things, or will I still suffer from the same issue?
Since pthreads appears to offer no way to disable preemption, is there something I can do if I use locks implemented via atomics, or anything I can have a look at?
I am interested in locking a small critical region.
pthread_mutex_lock typically has a fast path which uses an atomic operation to try to acquire the lock. In the event that the lock is not owned, this can be very fast. Only if the lock is already held, does the thread enter the kernel via a system call. The kernel acquires a spin-lock, and then reattempts to acquire the mutex in case it was released since the first attempt. If this attempt fails, the calling thread is added to a wait queue associated with the mutex, and a context switch is performed. The kernel also sets a bit in the mutex to indicate that there is a waiting thread.
pthread_mutex_unlock also has a fast path. If the waiting thread flag is clear, it can simply release the lock. If the flag is set, the thread must enter the kernel via a system call so the waiting thread can be woken. Again, the kernel must acquire a spin lock so that it can manipulate its thread control data structures. In the event that there is no thread waiting after all, the lock can be released by the kernel. If there is a thread waiting, it is made runnable, and ownership of the mutex is transferred without it being released.
There are many subtle race conditions in this little dance, and hopefully it all works properly.
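To make that description concrete, here is a Linux-only sketch of such a fast-path/slow-path mutex (in the spirit of Drepper's "Futexes Are Tricky" design, heavily simplified, and not the actual glibc code); the futex_wait/futex_wake wrappers and the three state values 0/1/2 are this example's own conventions:

#include <stdatomic.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/futex.h>

static void futex_wait(_Atomic uint32_t *addr, uint32_t expected) {
    syscall(SYS_futex, addr, FUTEX_WAIT, expected, NULL, NULL, 0);
}
static void futex_wake(_Atomic uint32_t *addr, int nwake) {
    syscall(SYS_futex, addr, FUTEX_WAKE, nwake, NULL, NULL, 0);
}

/* states: 0 = unlocked, 1 = locked with no waiters, 2 = locked, maybe waiters */
static void mu_lock(_Atomic uint32_t *m)
{
    uint32_t c = 0;
    if (atomic_compare_exchange_strong(m, &c, 1))
        return;                         /* fast path: one atomic op, no syscall */
    if (c != 2)
        c = atomic_exchange(m, 2);      /* mark the lock as contended */
    while (c != 0) {
        futex_wait(m, 2);               /* slow path: sleep in the kernel */
        c = atomic_exchange(m, 2);      /* retry, keeping the "contended" mark */
    }
}

static void mu_unlock(_Atomic uint32_t *m)
{
    /* fast path: if the state was 1 (no waiters), dropping to 0 is enough */
    if (atomic_exchange(m, 0) == 2)
        futex_wake(m, 1);               /* state was 2: ask the kernel to wake one waiter */
}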
Since a thread that attempts to acquire a locked mutex is context switched out, it does not prevent the thread that owns the mutex from running, which gives the owner an opportunity to exit its critical section and release the mutex.
In contrast, a thread that attempts to acquire a locked spin-lock simply spins, consuming CPU cycles. This has the potential of preventing the thread that owns the spin-lock from exiting its critical section and releasing the lock. The spinning thread can be preempted when its timeslice has been consumed, allowing the thread that owns the lock to eventually regain control. Of course, this is not great for performance.
In practice, spin-locks are used where there is no chance that the thread can be preempted while it owns the lock. A kernel may set a per-cpu flag to prevent it from performing a context switch from an interrupt service routine (or it may raise the interrupt priority level to prevent interrupts that can cause context switches, or it may disable interrupts altogether). A user thread can prevent itself from being preempted (by other threads in the same process) by raising its priority. Note that, in a uniprocessor system, preventing the current thread from being preempted eliminates the need for the spin lock. Alternatively, in a multiprocessor system, you can bind threads to cpus (cpu affinity) so that they cannot preempt one another.
All locks ultimately require an atomic primitive (well, all efficient locks do; see here for a counterexample). Mutexes can be inefficient if they are highly contended, causing threads to constantly enter the kernel and be context switched, especially if the critical section is smaller than the kernel overhead. Spin locks can be more efficient, but only if the owner cannot be preempted and the critical section is short. Note that the kernel must still acquire a spin lock when a thread attempts to acquire a locked mutex.
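For reference, the "atomic built-in + compare_and_swap" lock from the question would look roughly like this (type and function names are illustrative); note that it is just a user-space spin lock, so it inherits exactly the preemption problem described above:

#include <stdatomic.h>

typedef struct { atomic_flag held; } cas_lock_t;
#define CAS_LOCK_INITIALIZER { ATOMIC_FLAG_INIT }

static void cas_lock(cas_lock_t *l)
{
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ;   /* busy-wait, burning CPU, until the owner (which may be preempted) releases */
}

static void cas_unlock(cas_lock_t *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}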
Personally, I would use atomic operations for things like shared counter updates, and mutexes for more complex operations. Only after profiling would I consider replacing mutexes with spin locks (and figure out how to deal with preemption). Note that if you intend to use condvars, you have no choice but to use mutexes.
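As an example of that last point, a shared counter is a case where a single atomic operation replaces the lock entirely, so there is no critical section for the scheduler to preempt an owner inside (the hits counter here is just for illustration):

#include <stdatomic.h>

static _Atomic long hits;

void record_hit(void)
{
    atomic_fetch_add_explicit(&hits, 1, memory_order_relaxed);  /* lock-free update */
}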