Linux: How can I find the thread which holds a particular lock?

Linux: How can I find the thread which holds a particular lock? - c

I have a multi-threads program which is running on Linux, sometimes if I run gstack against it, there is a thread was waiting for a lock for a long time(say, 2-3 minutes),
Thread 2 (Thread 0x5e502b90 (LWP 19853)):
0 0x40000410 in __kernel_vsyscall ()
1 0x400157b9 in __lll_lock_wait () from /lib/i686/nosegneg/libpthread.so.0
2 0x40010e1d in _L_lock_981 () from /lib/i686/nosegneg/libpthread.so.0
3 0x40010d3b in pthread_mutex_lock () from /lib/i686/nosegneg/libpthread.so.0
...
I checked the rest of the threads, none of them were taking this lock, however, after a while this thread (LWP 19853) could acquire this lock successfully.
There should exist one thread that had already acquired this lock, but I failed to find it, is there anything I missing?
EDIT:
The definition of the pthread_mutex_t:
typedef union
{
struct __pthread_mutex_s
{
int __lock;
unsigned int __count;
int __owner;
/* KIND must stay at this position in the structure to maintain
binary compatibility. */
int __kind;
unsigned int __nusers;
extension union
{
int __spins;
__pthread_slist_t __list;
};
} __data;
char _size[_SIZEOF_PTHREAD_MUTEX_T];
long int __align;
} pthread_mutex_t;
There is a member "__owner", it is the id of the thread who is holding the mutex now.

2-3 minutes sounds a lot, but if your system is under heavy load, there is no guarantee that your thread wakes up immediately after another one has unlocked the mutex. So there might just be no thread (anymore) that holds the lock in the moment that you are looking at it.
Linux mutex work in two stages. Roughly:
At the first stage there is a atomic CAS operation on an int value to see if the
mutex can be locked immediately.
If this is not possible a futex_wait system call with the address of the same int is passed to the kernel.
An unlock operation then consist in changing the value back to the initial value (usually 0) and doing a futex_wake system call. The kernel then looks if someone registered a futex_wait call on the same address, and revives those threads in the scheduling queue. Which thread the really gets woken up and when depends on different things, in particular the scheduling policy that is enabled. There is no guarantee that threads obtain the locks in the order they placed them.

Mutexes by default don't track the thread that locked them. (Or at least I don't know of such a thing )
There are two ways to debug this kind of problem. One way is to log every lock and unlock. On every thread creation you log the value of the thread id that got created. Right after locking any lock, you log the thread id, and the name of the lock that was locked ( you can use file/line for this, or assign a name to each lock). And you log again right before unlocking any lock.
This is a fine way to do it if your program doesn't have tens of threads or more. After that the logs start to become unmanageable.
The other way is to wrap your lock in a class that stores the thread id in a lock object right after each lock. You might even create a global lock registry that tracks this, that you can print out when you need to.
Something like:
class MyMutex
{
public:
void lock() { mMutex.lock(); mLockingThread = getThreadId(); }
void unlock() { mLockingThread = 0; mMutex.unlock(); }
SystemMutex mMutex;
ThreadId mLockingThread;
};
The key here is - don't implement either of these methods for your release version. Both a global locking log, or a global registry of lock states creates a single global resource that will itself become a resource under lock contention.

The POSIX API doesn't contain a function that does it.
It's also possible that on some platforms, the implementation doesn't allow this.
For example, a lock can use an atomic variable, set to 1 when locked. The thread obtaining it doesn't have to write its ID anywhere, so no function can find it.

For such debugging issues you might two add special logging calls to your program stating when which tread had aquired the lock and when it returned it.
Such log entries then will help you finding which thread aquired the lock last.
Anyway doing so might massivly change the run time behavior of the program and the issue to be debugged won't appear anymore outing itself as sort of a classical heisenbug as seen often in multi-threaded applications.

Related

C semaphore and mutex difference [duplicate]

What is the difference between semaphores and mutex provided by pthread library ?

semaphores have a synchronized counter and mutex's are just binary (true / false).
A semaphore is often used as a definitive mechanism for answering how many elements of a resource are in use -- e.g., an object that represents n worker threads might use a semaphore to count how many worker threads are available.
Truth is you can represent a semaphore by an INT that is synchronized by a mutex.

I am going to talk about Mutex vs Binary-Semaphore. You obviously use mutex to prevent data in one thread from being accessed by another thread at the same time.
(Assume that you have just called lock() and in the process of accessing a data. This means that, you don’t expect any other thread (or another instance of the same thread-code) to access the same data locked by the same mutex. That is, if it is the same thread-code getting executed on a different thread instance, hits the lock, then the lock() should block the control flow.)
This applies to a thread that uses a different thread-code, which is also accessing the same data and which is also locked by the same mutex.
In this case, you are still in the process of accessing the data and you may take, say, another 15 secs to reach the mutex unlock (so that the other thread that is getting blocked in mutex lock would unblock and would allow the control to access the data).
Do you ever allow another thread to just unlock the same mutex, and in turn, allow the thread that is already waiting (blocking) in the mutex lock to unblock and access the data? (Hope you got what I am saying here.)
As per agreed-upon universal definition,
with “mutex” this can’t happen. No other thread can unlock the lock
in your thread
with “binary-semaphore” this can happen. Any other thread can unlock
the lock in your thread
So, if you are very particular about using binary-semaphore instead of mutex, then you should be very careful in “scoping” the locks and unlocks, I mean, that every control-flow that hits every lock should hit an unlock call and also there shouldn’t be any “first unlock”, rather it should be always “first lock”.

The Toilet Example
Mutex:
Is a key to a toilet. One person can have the key - occupy the toilet - at the time. When finished, the person gives (frees) the key to the next person in the queue.
"Mutexes are typically used to serialise access to a section of re-entrant code that cannot be executed concurrently by more than one thread. A mutex object only allows one thread into a controlled section, forcing other threads which attempt to gain access to that section to wait until the first thread has exited from that section."
(A mutex is really a semaphore with value 1.)
Semaphore:
Is the number of free identical toilet keys.
For Example, say we have four toilets with identical locks and keys. The semaphore count - the count of keys - is set to 4 at beginning (all four toilets are free), then the count value is decremented as people are coming in. If all toilets are full, ie. there are no free keys left, the semaphore count is 0. Now, when eq. one person leaves the toilet, semaphore is increased to 1 (one free key), and given to the next person in the queue.
"A semaphore restricts the number of simultaneous users of a shared resource up to a maximum number. Threads can request access to the resource (decrementing the semaphore), and can signal that they have finished using the resource (incrementing the semaphore)."
Source

mutex is used to avoid race condition between multiple threads.
whereas semaphore is used as synchronizing element used across multiple process.
mutex can't be replaced with binary semaphore since, one process waits for semaphore while other process releases semaphore. In case mutex both acquisition and release is handled by same.

The difference between the semaphore and mutex is the difference between mechanism and pattern. The difference is in their purpose (intent)and how they work(behavioral).
The mutex, barrier, pipeline are parallel programming patterns. Mutex is used(intended) to protect a critical section and ensure mutual exclusion. Barrier makes the agents(thread/process) keep waiting for each other.
One of the feature(behavior) of mutex pattern is that only allowed agent(s)(process or thread) can enter a critical section and only that agent(s) can voluntarily get out of that.
There are cases when mutex allows single agent at a time. There are cases where it allows multiple agents(multiple readers) and disallow some other agents(writers).
The semaphore is a mechanism that can be used(intended) to implement different patterns. It is(behavior) generally a flag(possibly protected by mutual exclusion). (One interesting fact is even mutex pattern can be used to implement semaphore).
In popular culture, semaphores are mechanisms provided by kernels, and mutexes are provided by user-space library.
Note, there are misconceptions about semaphores and mutexes. It says that semaphores are used for synchronization. And mutexes has ownership. This is due to popular OS books. But the truth is all the mutexes, semaphores and barriers are used for synchronization. The intent of mutex is not ownership but mutual exclusion. This misconception gave the rise of popular interview question asking the difference of the mutexes and binary-semaphores.
Summary,
intent
mutex, mutual exclusion
semaphore, implement parallel design patterns
behavior
mutex, only the allowed agent(s) enters critical section and only it(they) can exit
semaphore, enter if the flag says go, otherwise wait until someone changes the flag
In design perspective, mutex is more like state-pattern where the algorithm that is selected by the state can change the state. The binary-semaphore is more like strategy-pattern where the external algorithm can change the state and eventually the algorithm/strategy selected to run.

This two articles explain great details about mutex vs semaphores
Also this stack overflow answer tells the similar answer.

Semaphore is more used as flag, for which your really don't need to bring RTOS / OS. Semaphore can be accidentally or deliberately changed by other threads (say due to bad coding).
When you thread use mutex, it owns the resources. No other thread can ever access it, before resource get free.

Mutexes can be applied only to threads in a single process and do not work between processes as do semaphores.

Mutex is like sempaphore with with S=1.
You can control number of concurrent accesses with semaphore but with mutex only one process at a time can access it.
See the implemenation of these two below: (all functions are atomic)
Semaphore:
wait(S) {
while (S <= 0 )
; // busy wait
S--;
}
signal(S) {
S++;
}
Mutex:
acquire() {
while (!available)
; // busy wait
available = false;
}
release() {
available = true;
}

pthreads lock recovery

I am working on a multi-threaded network server application. At the moment, I am having issues with lock recovery. If a thread dies unexpectedly while it is holding a lock, say a mutex, rwlock, spinlock, etc..., is it possible to recover the lock from a different thread without having to go into the lock struct itself and manually disassociate the owner from the lock. I would like to not have to go to this extreme to clear it as this will make the code non-portable. I have attempted to force a lock owner change by doing a pthread_kill on the offending thread and looking at the return code. But even using a mutex type attribute of PTHREAD_MUTEX_ERRORCHECK, I still cannot gain control of the mutex from another thread if the locking thread has quit. This can be a problem if some internal table is being updated when the thread bails out as it will eventually cause the entire server application to halt.
I have used Google extensively and I'm getting conflicting information, even on here. Any suggestions or ideas that I can explore?
This is on FreeBSD 9.3 using clang-llvm compiler.

For mutexes which are shared between processes (PTHREAD_PROCESS_SHARED) you can set them PTHREAD_MUTEX_ROBUST... but you are stuck with the problem that the state protected by the mutex may be invalid -- depending on the application.
For mutexes which are not shared between processes, there is no standard notion of "robustness", because a thread cannot spontaneously die on its own -- a thread will run until either it is cancelled, it exits or the process exits or dies.
You can use:
void pthread_cleanup_push(void (*routine)(void*), void *arg);
void pthread_cleanup_pop(int execute);
to arrange for a mutex to be released if the thread is cancelled or exits while holding the mutex -- something like:
pthread_mutex_lock(&foo) ; // as now
pthread_cleanup_push(pthread_mutex_unlock, &foo) ; // extra step
....
pthread_cleanup_pop(true) ; // replacing the pthread_mutex_unlock()
HOWEVER: you still need to think very carefully about what state the data protected by the mutex is in when the thread is cancelled or exits !!
You may be much better off examining why the thread needs this, and perhaps sort out any error/exception handling to pass the error/exception up and out of the critical section (leaving the critical section cleanly).

While loop heavy on CPU

I have this code:
int _break=0;
while(_break==0) {
if(someCondition) {
//...
if(someOtherCondition)_break=1;//exit the loop
//...
}
}
The problem is that if someCondition is false, the loop gets heavy on the CPU. Is there a way to sleep for some milliseconds in the loop so that the cpu will not have a huge load?
Update
What I'm trying to do is a server-client application, without using sockets, just using shared memory, semaphores and system calls. I'm doing this on linux.
someOtherCondition becomes true when the applications receives the "kill" signal, while someCondition is true if the message received is valid. If it's not valid, it keeps waiting for a valid message and the while loop becomes a heavy infinite loop (it works but loads the CPU too much). I would like to make it lightweight.
I'm working on Linux (Debian 7).

If you have a single-threaded application, then it won't make any difference whether you suspend the execution or not.
If you have multiple threads running, then you should use a binary semaphore instead of polling a global variable.
This thread should acquire the semaphore at the beginning of each iteration, and one of the other threads should release the semaphore whenever you wish this thread to run.
This method is also known as "consumer-producer".
When a thread attempts to acquire a binary semaphore:
If the semaphore is released, then the calling thread acquires it and continues the execution.
If the semaphore is already acquired, then the calling thread "asks" the OS to block itself, and the OS will unblock it as soon as some other thread releases the semaphore.
The entire procedure is "atomic", i.e., no context-switch between threads can take place while the semaphore code is executed. This is generally achieved by disabling the interrupts. Everything is implemented within the semaphore code, so you need not "worry" about it.
Since you did not specify what OS you're using, I cannot provide any technical details (i.e., code)...
UPDATE:
If you are trying to protect a critical section inside the loop (i.e., if you are accessing some other global variable, which is also being accessed by other threads, and at least one of those threads is changing that global variable), then you should use a Mutex instead of a binary semaphore.
There are two advantages for using a Mutex in this case:
It can be released only by the thread which has acquired it (thus ensuring mutual exclusion).
It can resolve a specific type of deadlocks that occur when a high-priority thread is waiting for a low-priority thread to complete, while a medium-priority thread is preventing the low-priority thread from completing (a.k.a. priority-inversion).
Of course, a Mutex is required only if you really need to ensure mutual exclusion for accessing the data.
UPDATE #2:
Now that you've added some specific details on your system, here is the general scheme:
Step #1 - Before starting your threads:
// Declare a global variable 'sem'
// Initialize the global variable 'sem' with 'count = 0' (i.e., as acquired)
Step #2 - In this thread:
// Declare the global variable 'sem' as 'extern'
while(1)
{
semget(&sem);
//...
}
Step #3 - In the Rx ISR:
// Declare the global variable 'sem' as 'extern'
semset(&sem);

Spinning a loop without any delay will use a fair amount of CPU, a small time delay will reduce that you're right.
Using Sleep() is the easiest way, in Windows this is in the windows.h header.
Having said that, the most elegant solution would be to thread your code so that the code is only ever run when your condition is true, that way it will truly sleep until you wake it up.
I suggest you look into pthread and mutex. This will allow you to sleep that loop of yours entirely until the condition becomes true.
Hope that helps in some way :)

Manipulating thread's nice value

I wrote a simple program that implements master/worker scheme where the master is the main thread, and workers are created by it.
The main thread writes something to a shared buffer, and the worker threads read this shared buffer, writing and reading to shared buffer are organized by read/write lock.
Unfortunately, this scheme definitely leads to starvation of main thread, since a single write has to wait on several reads to complete. One possible solution is increasing the priority of the master thread, so if it wants to write something, it will get immediate access to the shared buffer.
According to a great post to a similar issue, I discovered that probably manipulating the priority of a thread under SCHED_OTHER policy is not allowed, what can be changed is the nice value only.
I wrote a procedure to give worker threads lower priority than master thread, but it seems not to work correctly.
void assignWorkerThreadPriority(pthread_t* worker)
{
struct sched_param* worker_sched_param = (struct sched_param*)malloc(sizeof(struct sched_param));
worker_sched_param->sched_priority =0; //any value other than 0 gives error?
int policy = SCHED_OTHER;
pthread_setschedparam(*worker, policy, worker_sched_param);
printf("Result of changing priority is: %d - %s\n", errno, strerror(errno));
}
I have a two-fold question:
How can I set the nice value of a worker threads to avoid main thread starvation.
If not possible, then how can I change the scheduling policy to a one that allows changing the priority.
Edit: I managed to run the program using other policies, such as SCHED_FIFO, all I had to do was running the program as a super user

You cannot avoid problems using a read/write lock when the read and write usage is so even. You need a different method. You need a lock-free message queue or independent work queues or one of many other techniques.
Here is another way to do the job, the way I would do it. The worker can take the buffer away and work on it rather than keeping it shared:
Write thread:
Create work item.
Lock the mutex or CriticalSection protecting the current queue and pointer to queue.
Add work item to queue.
Release the lock.
Optionally signal a condition variable or Event. Another option is for worker threads to check for work on a timer.
Worker thread:
Create a new queue.
Wait for a condition variable or event or other signal, or wait on a timer.
Lock the mutex or CriticalSection protecting the current queue and pointer to queue.
Set the current queue pointer to the new queue.
Release the lock.
Proceed to work on the now private queue.
Delete the queue when all work items complete.
Now write thread creates more work items. When all the worker threads have their own copies of a queue to work on it will be able to write many items in peace.
You can modify this. For example, a worker thread may lock the queue and move a limited number of work items off into its own internal queue instead of taking the whole thing.

pthreads mutex vs semaphore

What is the difference between semaphores and mutex provided by pthread library ?

semaphores have a synchronized counter and mutex's are just binary (true / false).
A semaphore is often used as a definitive mechanism for answering how many elements of a resource are in use -- e.g., an object that represents n worker threads might use a semaphore to count how many worker threads are available.
Truth is you can represent a semaphore by an INT that is synchronized by a mutex.

The Toilet Example
Mutex:
Is a key to a toilet. One person can have the key - occupy the toilet - at the time. When finished, the person gives (frees) the key to the next person in the queue.
"Mutexes are typically used to serialise access to a section of re-entrant code that cannot be executed concurrently by more than one thread. A mutex object only allows one thread into a controlled section, forcing other threads which attempt to gain access to that section to wait until the first thread has exited from that section."
(A mutex is really a semaphore with value 1.)
Semaphore:
Is the number of free identical toilet keys.
For Example, say we have four toilets with identical locks and keys. The semaphore count - the count of keys - is set to 4 at beginning (all four toilets are free), then the count value is decremented as people are coming in. If all toilets are full, ie. there are no free keys left, the semaphore count is 0. Now, when eq. one person leaves the toilet, semaphore is increased to 1 (one free key), and given to the next person in the queue.
"A semaphore restricts the number of simultaneous users of a shared resource up to a maximum number. Threads can request access to the resource (decrementing the semaphore), and can signal that they have finished using the resource (incrementing the semaphore)."
Source

mutex is used to avoid race condition between multiple threads.
whereas semaphore is used as synchronizing element used across multiple process.
mutex can't be replaced with binary semaphore since, one process waits for semaphore while other process releases semaphore. In case mutex both acquisition and release is handled by same.

The difference between the semaphore and mutex is the difference between mechanism and pattern. The difference is in their purpose (intent)and how they work(behavioral).
The mutex, barrier, pipeline are parallel programming patterns. Mutex is used(intended) to protect a critical section and ensure mutual exclusion. Barrier makes the agents(thread/process) keep waiting for each other.
One of the feature(behavior) of mutex pattern is that only allowed agent(s)(process or thread) can enter a critical section and only that agent(s) can voluntarily get out of that.
There are cases when mutex allows single agent at a time. There are cases where it allows multiple agents(multiple readers) and disallow some other agents(writers).
The semaphore is a mechanism that can be used(intended) to implement different patterns. It is(behavior) generally a flag(possibly protected by mutual exclusion). (One interesting fact is even mutex pattern can be used to implement semaphore).
In popular culture, semaphores are mechanisms provided by kernels, and mutexes are provided by user-space library.
Note, there are misconceptions about semaphores and mutexes. It says that semaphores are used for synchronization. And mutexes has ownership. This is due to popular OS books. But the truth is all the mutexes, semaphores and barriers are used for synchronization. The intent of mutex is not ownership but mutual exclusion. This misconception gave the rise of popular interview question asking the difference of the mutexes and binary-semaphores.
Summary,
intent
mutex, mutual exclusion
semaphore, implement parallel design patterns
behavior
mutex, only the allowed agent(s) enters critical section and only it(they) can exit
semaphore, enter if the flag says go, otherwise wait until someone changes the flag
In design perspective, mutex is more like state-pattern where the algorithm that is selected by the state can change the state. The binary-semaphore is more like strategy-pattern where the external algorithm can change the state and eventually the algorithm/strategy selected to run.

This two articles explain great details about mutex vs semaphores
Also this stack overflow answer tells the similar answer.

Semaphore is more used as flag, for which your really don't need to bring RTOS / OS. Semaphore can be accidentally or deliberately changed by other threads (say due to bad coding).
When you thread use mutex, it owns the resources. No other thread can ever access it, before resource get free.

Mutexes can be applied only to threads in a single process and do not work between processes as do semaphores.

Mutex is like sempaphore with with S=1.
You can control number of concurrent accesses with semaphore but with mutex only one process at a time can access it.
See the implemenation of these two below: (all functions are atomic)
Semaphore:
wait(S) {
while (S <= 0 )
; // busy wait
S--;
}
signal(S) {
S++;
}
Mutex:
acquire() {
while (!available)
; // busy wait
available = false;
}
release() {
available = true;
}