I am a little confused trying to implement a very simple mutex (lock) in C. I understand that a mutex is similar to a binary semaphore, except that the mutex also enforces the constraint that the thread that releases the lock must be the same thread that most recently acquired it. What I don't understand is how that ownership is kept track of.
This is what I have so far. Keep in mind that it is not complete yet, and is supposed to be really simple (uniprocessor, no recursion on the mutex, disabling interrupts as the mutual exclusion method, etc.).
struct mutex {
    char *mutexName;
    volatile int inUse;
};
I believe I should add another member variable, e.g. whoIsOwner, but I am kind of confused as to what to store there. I assume it has to be something that can uniquely identify the thread trying to call the lock? Is this correct?
I have a thread structure in place that has a "char *threadName" member variable (along with others), but I'm not sure how I would access this from within the mutex implementation.
Any pointers/hints/ideas would be appreciated.
You could implement the mutex as an atomic integer which is 0 when unlocked, and which takes the value of the locking thread's ID to indicate it's locked. Of course access to the variable has to be atomic, and suitably fenced to prevent reordering (acquire-release fence pairs suffice).
Ultimately you can of course never prevent yourself from shooting yourself in the foot; if you really want you can overwrite the mutex's memory by force from another thread, or something like that. You'll only get the correct behaviour if you use the tools correctly. With that in mind, you might be satisfied with a simple bool for the locking variable.
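As a rough illustration of the "atomic integer holding the owner's ID" idea, here is a minimal sketch using C11 atomics. The names owned_mutex, owned_mutex_trylock and owned_mutex_unlock are hypothetical, and the sketch assumes every thread has a nonzero 32-bit ID that the caller passes in (for example, taken from the question's own thread structure). It only shows the ownership check; blocking or disabling interrupts while waiting is left out.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical mutex type: owner == 0 means unlocked,
 * any other value is the ID of the thread holding the lock. */
typedef struct {
    const char *name;
    _Atomic uint32_t owner;
} owned_mutex;

/* Try to take the lock for thread `self` (assumed nonzero).
 * Returns 1 on success, 0 if some thread already owns it. */
int owned_mutex_trylock(owned_mutex *m, uint32_t self)
{
    uint32_t expected = 0;
    return atomic_compare_exchange_strong_explicit(
        &m->owner, &expected, self,
        memory_order_acquire, memory_order_relaxed);
}

/* Release the lock, but only if `self` is the current owner.
 * Returns 1 on success, 0 if the caller does not own the mutex. */
int owned_mutex_unlock(owned_mutex *m, uint32_t self)
{
    uint32_t expected = self;
    return atomic_compare_exchange_strong_explicit(
        &m->owner, &expected, 0,
        memory_order_release, memory_order_relaxed);
}
```

The compare-and-swap in the unlock path is what enforces ownership: a thread whose ID does not match the stored value cannot clear the field.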
uint32_t semOwner;
If the above field is 0, then it is available. If it is "owned", then let it be set to the ID of the owning task, or thread, or Process ID/Thread ID combo (or some other combination that may suit your system).
Hope this helps.
Related
So from my understanding, a mutex and a binary semaphore are very similar, but I just want to know: what are some specific applications or circumstances where using a mutex is better than a binary semaphore, or vice versa?
One big difference between a mutex and a binary semaphore is that a thread must not unlock a mutex locked by another thread (the locking thread is the mutex's sole owner): a mutex is only meant to be used for critical sections. If one thread needs to wake or signal another, condition variables should be used instead. A semaphore could be used to do that too, though it is a bit unusual. There are some other points about priority inversion and safety you can find here.
Generally speaking—since you did not mention any particular library or programming language—mutex and binary semaphore are very close to the same thing.
A binary semaphore is a specialization of the more general counting semaphore, which was invented way back in the early 1960s. It is a surprisingly versatile thing (see The Little Book of Semaphores), and back in the day, it was imagined that semaphores would be the lowest-level API, built in to many different operating systems to provide the bedrock upon which other, portable synchronization methods and algorithms could be built.
In my personal opinion, if you use something called "mutex" or "lock," then you should use it for one thing only: Use it to prevent threads from interfering with each other when they access shared variables. Whenever you think you want to use a mutex to let one thread send some kind of a signal to some other thread, then that's when you should reach for "semaphore." Even though they both do practically the same thing, using the one with the right name will help other people who read your code to understand what you are doing.
How can I implement a binary semaphore using the POSIX counting semaphore API? I am using an unnamed semaphore and need to limit its count to 1. I believe I can't use a mutex because I need to be able to unlock from another thread.
If you actually want a semaphore that "absorbs" multiple posts without allowing multiple waits to succeed, and especially if you want to be strict about that, POSIX semaphores are not a good underlying primitive to use to implement it. The right set of primitives to implement it on top of is a mutex, a condition variable, and a bool protected by the mutex. When changing the bool from 0 to 1, you signal the condition variable.
With that said, what you're asking for is something of a smell; it inherently has ambiguous orderings. For example, if threads A and B both post the semaphore one after another, and threads X and Y are both just starting to wait, it's possible with your non-counting semaphore that either both waits succeed or that only one does, depending on the order of execution: ABXY or AXBY (or another comparable permutation). Thus, the pattern is likely erroneous unless either there's only one thread that could possibly post at any given time (in which case, why would it post more than once? maybe this is a non-issue) or the ability to post is controlled by holding some sort of lock (again, in which case why would it post more than once?). So if you don't have a design flaw here, it's likely that just using a counting semaphore but not posting it more than once gives the behavior you want.
If that's not the case, then there's probably some other data associated with the semaphore that's not properly synchronized, and you're trying to use the semaphore like a condition variable for it. If that's the case, just put a proper mutex and condition variable around it and use them, and forget the semaphore.
One comment for addressing your specific situation:
I believe I can't use a mutex because I need to be able to unlock from another thread.
This becomes a non-issue if you use a combination of mutex and condition variable, because you don't keep the mutex locked while working. Instead, the fact that the combined system is in-use is part of the state protected by the mutex (e.g. the above-mentioned bool) and any thread that can obtain the mutex can change it (to return it to a released state).
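The mutex, condition variable and bool combination described above could look roughly like the following sketch. The bsem_* names are made up for illustration, error checking on the pthread calls is omitted, and bsem_trywait is an extra convenience that the answer does not mention.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical binary semaphore: `available` is the bool protected by mtx. */
typedef struct {
    pthread_mutex_t mtx;
    pthread_cond_t  cond;
    bool            available;
} bsem_t;

void bsem_init(bsem_t *s, bool available)
{
    pthread_mutex_init(&s->mtx, NULL);
    pthread_cond_init(&s->cond, NULL);
    s->available = available;
}

/* Post: extra posts are absorbed, the state never goes above "1". */
void bsem_post(bsem_t *s)
{
    pthread_mutex_lock(&s->mtx);
    if (!s->available) {
        s->available = true;
        pthread_cond_signal(&s->cond);
    }
    pthread_mutex_unlock(&s->mtx);
}

/* Wait: block until available, then consume it. */
void bsem_wait(bsem_t *s)
{
    pthread_mutex_lock(&s->mtx);
    while (!s->available)               /* loop guards against spurious wakeups */
        pthread_cond_wait(&s->cond, &s->mtx);
    s->available = false;
    pthread_mutex_unlock(&s->mtx);
}

/* Non-blocking variant: returns true if the semaphore was taken. */
bool bsem_trywait(bsem_t *s)
{
    pthread_mutex_lock(&s->mtx);
    bool got = s->available;
    s->available = false;
    pthread_mutex_unlock(&s->mtx);
    return got;
}
```

Because the internal mutex is only held for the few instructions that touch the bool, any thread may call bsem_post, including one that never called bsem_wait.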
In a comment on the question Automatically release mutex on crashes in Unix back in 2010, jilles claimed:
glibc's robust mutexes are so fast because glibc takes dangerous shortcuts. There is no guarantee that the mutex still exists when the kernel marks it as "will cause EOWNERDEAD". If the mutex was destroyed and the memory replaced by a memory mapped file that happens to contain the last owning thread's ID at the right place and the last owning thread terminates just after writing the lock word (but before fully removing the mutex from its list of owned mutexes), the file is corrupted. Solaris and will-be-FreeBSD9 robust mutexes are slower because they do not want to take this risk.
I can't make any sense of the claim, since destroying a mutex is not legal unless it's unlocked (and thus not in any thread's robust list). I also can't find any references searching for such a bug/issue. Was the claim simply erroneous?
The reason I ask and that I'm interested is that this is relevant to the correctness of my own implementation built upon the same Linux robust-mutex primitive.
I think I found the race, and it is indeed very ugly. It goes like this:
Thread A has held the robust mutex and unlocks it. The basic procedure is:
1. Put it in the "pending" slot of the thread's robust list header.
2. Remove it from the linked list of robust mutexes held by the current thread.
3. Unlock the mutex.
4. Clear the "pending" slot of the thread's robust list header.
The problem is that between steps 3 and 4, another thread in the same process could obtain the mutex, then unlock it, and (rightly) believing itself to be the final user of the mutex, destroy and free/munmap it. After that, if any thread in the process creates a shared mapping of a file, device, or shared memory and it happens to get assigned the same address, and the value at that location happens to match the pid of the thread that's still between steps 3 and 4 of unlocking, you have a situation whereby, if the process is killed, the kernel will corrupt the mapped file by setting the high bit of a 32-bit integer it thinks is the mutex owner id.
The solution is to hold a global lock on mmap/munmap between steps 2 and 4 above, exactly the same as in my solution to the barrier issue described in my answer to this question:
Can a correct fail-safe process-shared barrier be implemented on Linux?
The description of the race by FreeBSD pthread developer David Xu: http://lists.freebsd.org/pipermail/svn-src-user/2010-November/003668.html
I don't think the munmap/mmap cycle is strictly required for the race. The piece of shared memory might be put to a different use as well. This is uncommon but valid.
As also mentioned in that message, more "fun" occurs if threads with different privilege access a common robust mutex. Because the node for the list of owned robust mutexes is in the mutex itself, a thread with low privilege may corrupt a high privilege thread's list. This could be exploited easily to make the high privilege thread crash and in rare cases this might allow the high privilege thread's memory to be corrupted. Apparently Linux's robust mutexes are only designed for use by threads with the same privileges. This could have been avoided easily by making the robust list an array fully in the thread's memory instead of a linked list.
If I have two threads and one global variable (one thread constantly loops to read the variable; the other constantly loops to write to it), would anything happen that shouldn't (e.g. exceptions, errors)? If it does, what is a way to prevent this? I was reading about mutex locks and that they allow one thread exclusive access to a variable. Does this mean that only that thread can read and write to it and no other?
Would anything happen that shouldn't?
It depends in part on the type of the variables. If the variable is, say, a string (long array of characters), then if the writer and the reader access it at the same time, it is completely undefined what the reader will see.
This is why mutexes and other coordinating mechanisms are provided by pthreads.
Does this mean that only that thread can read and write to it and no other?
Mutexes ensure that at most one thread that is using the mutex can have permission to proceed. All other threads using the same mutex will be held up until the first thread releases the mutex. Therefore, if the code is written properly, at any time, only one thread will be able to access the variable. If the code is not written properly, then:
one thread might access the variable without checking that it has permission to do so
one thread might acquire the mutex and never release it
one thread might destroy the mutex without notifying the other
None of these is desirable behaviour, but the mere existence of a mutex does not prevent any of these happening.
Nevertheless, your code could reasonably use a mutex carefully and then the access to the global variable would be properly controlled. While it has permission via the mutex, either thread could modify the variable, or just read the variable. Either will be safe from interference by the other thread.
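A minimal sketch of that careful usage, assuming a single global long guarded by one pthread mutex (the names value_lock, shared_value, shared_write and shared_read are invented for illustration, and return values of the pthread calls are ignored for brevity):

```c
#include <pthread.h>

/* One mutex guards one global variable; every access goes through it. */
static pthread_mutex_t value_lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_value;

/* Called by the writer thread. */
void shared_write(long v)
{
    pthread_mutex_lock(&value_lock);
    shared_value = v;
    pthread_mutex_unlock(&value_lock);
}

/* Called by the reader thread; copies the value out while holding the lock. */
long shared_read(void)
{
    pthread_mutex_lock(&value_lock);
    long v = shared_value;
    pthread_mutex_unlock(&value_lock);
    return v;
}
```

The protection only holds as long as no code touches shared_value directly, which is the "written properly" condition from the list above.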
Does this mean that only that thread can read and write to it and no other?
It means that only one thread can read or write the global variable at a time.
The two threads will not race to access the global variable, nor will they ever access it at the same time.
In short, access to the global variable is synchronized.
First: in C/C++, an unsynchronized read/write of a variable does not generate any exceptions or system errors, BUT it can generate application-level errors, mostly because you are unlikely to fully understand how the memory is accessed, and whether the access is atomic, unless you look at the generated assembler. A multi-core CPU is likely to create hard-to-debug race conditions when you access shared memory without synchronization.
Hence
Second: you should always use synchronization, such as mutex locks, when dealing with shared memory. A mutex lock is cheap, so it will not really impact performance if done right. Rule of thumb: hold the lock for as short a time as possible, such as just for the duration of reading/incrementing/writing the shared memory.
However, from your description, it sounds like one of your threads is doing nothing BUT waiting for the shared memory to change state before doing something. That is a bad multi-threaded design which costs unnecessary CPU burn, so
Third: look at using semaphores (sem_init/sem_wait/sem_post) for synchronization between your threads if you are trying to send a "message" from one thread to the other.
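To make the semaphore suggestion concrete, here is a small sketch using the POSIX unnamed semaphore calls sem_init, sem_post and sem_wait. The channel_* names and the single int "message" slot are assumptions made for the example; it also assumes the sender does not send again until the previous message has been received.

```c
#include <errno.h>
#include <semaphore.h>

static sem_t data_ready;   /* counts messages waiting to be picked up */
static int   message;      /* single-slot payload (illustrative only) */

/* Returns 0 on success, -1 on failure (as sem_init does). */
int channel_init(void)
{
    return sem_init(&data_ready, 0, 0);  /* not shared between processes, starts at 0 */
}

/* Sender thread: publish the value, then wake the receiver. */
void channel_send(int v)
{
    message = v;
    sem_post(&data_ready);
}

/* Receiver thread: sleeps in sem_wait instead of polling the variable. */
int channel_receive(void)
{
    while (sem_wait(&data_ready) == -1 && errno == EINTR)
        ;                                /* retry if interrupted by a signal */
    return message;
}
```

The receiver burns no CPU while waiting, which addresses the busy-loop concern raised above.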
As others already said, when communicating between threads through "normal" objects you have to take care of race conditions. Besides mutexes and other lock structures that are relatively heavy weight, the new C standard (C11) provides atomic types and operations that are guaranteed to be race-free. Most modern processors provide instructions for such types and many modern compilers (in particular gcc on linux) already provide their proper interfaces for such operations.
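For a single shared integer, the C11 route might look like the sketch below. It assumes a compiler that ships <stdatomic.h>; the names shared_counter, writer_step and reader_peek are hypothetical.

```c
#include <stdatomic.h>

/* Shared between the writer and the reader; no mutex involved. */
static _Atomic int shared_counter;

/* Writer thread: atomically bump the value. */
void writer_step(void)
{
    atomic_fetch_add_explicit(&shared_counter, 1, memory_order_release);
}

/* Reader thread: atomically fetch the current value. */
int reader_peek(void)
{
    return atomic_load_explicit(&shared_counter, memory_order_acquire);
}
```

This only covers a value that fits in one atomic object; a larger structure such as a string still needs a lock or a more elaborate scheme.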
If the threads truly are only one producer and only one consumer, then (barring compiler bugs)
1) marking the variable as volatile, and
2) making sure that it is correctly aligned, so as to avoid interleaved fetches and stores
will allow you to do this without locking.
I am developing a user-level thread library as part of a project. I came up with an approach to implement a mutex. I would like to see your views before going on with it. Basically, I need to implement just 3 functions in my library:
mutex_init, mutex_lock and mutex_unlock
I thought my mutex_t structure would look something like
typedef struct
{
    int available; // indicates whether the mutex is locked or unlocked
    queue listofwaitingthreads;
    gtthread_t owningthread;
} mutex_t;
In my mutex_lock function, I will first check if the mutex is available in a while loop. If it is not, I will yield the processor for the next thread to execute.
In my mutex_unlock function, I will check if the owner thread is the current thread. If it is, I will set available to 0.
Is this the way to go about it? Also, what about deadlock? Should I take care of those conditions in my user-level library, or should I leave it to the application programmers to write their code properly?
This won't work, because you have a race condition. If 2 threads try to take the lock at the same time, both will see available == 0, and both will think they succeeded in taking the mutex.
If you want to do this properly, and without using an already-existing lock, you must use atomic hardware operations such as test-and-set (TAS) or compare-and-swap (CAS).
There are algorithms that give you mutual exclusion without such hardware support, but they make some assumptions that are often false. For more details about this, I highly recommend reading Herlihy and Shavit's The Art of Multiprocessor Programming, chapter 7.
You shouldn't worry about deadlocks in this level - mutex locks should be simple enough, and there is some assumption that the programmer using them should use care not to cause deadlocks (advanced mutexes can check for self-deadlock, meaning a thread that calls lock twice without calling unlock in the middle).
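As a hedged sketch of the test-and-set approach, the following uses C11's atomic_flag and yields when the lock is busy. The tas_* names are invented, and sched_yield() merely stands in for whatever yield call the user-level thread library provides; owner tracking and a wait queue are left out.

```c
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type built on an atomic test-and-set flag. */
typedef struct {
    atomic_flag held;
} tas_mutex;

#define TAS_MUTEX_INIT { ATOMIC_FLAG_INIT }

/* Test-and-set returns the previous state: false means the lock was free
 * and now belongs to the caller. Checking and taking happen in one step. */
bool tas_trylock(tas_mutex *m)
{
    return !atomic_flag_test_and_set_explicit(&m->held, memory_order_acquire);
}

/* Keep trying, giving up the processor between attempts. */
void tas_lock(tas_mutex *m)
{
    while (!tas_trylock(m))
        sched_yield();   /* placeholder for the library's own yield */
}

void tas_unlock(tas_mutex *m)
{
    atomic_flag_clear_explicit(&m->held, memory_order_release);
}
```

Because the check and the update are a single atomic operation, two threads can no longer both observe the lock as free.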
Not only do you have to use atomic operations to read and modify the flag (as Eran pointed out), you also have to make sure that your queue can handle concurrent access. This is not completely trivial; it is something of a chicken-and-egg problem.
But if you'd really implement this by spinning, you wouldn't even need to have such a queue. The access order to the lock then would be mainly random, though.
Just yielding would probably not be enough either; it can be quite costly if you have threads holding the lock for more than a few processor cycles. Consider using nanosleep with a low time value for the wait.
In general, a mutex implementation should look like:
Lock:
    while (trylock() == failed) {
        atomic_inc(waiter_cnt);
        atomic_sleep_if_locked();
        atomic_dec(waiter_cnt);
    }
Trylock:
    return atomic_swap(&lock, 1);  /* returns the old value: 0 means the lock was free and is now ours */
Unlock:
    atomic_store(&lock, 0);
    if (waiter_cnt) wakeup_sleepers();
Things get more complex if you want recursive mutexes, mutexes that can synchronize their own destruction (i.e. freeing the mutex is safe as soon as you get the lock), etc.
Note that atomic_sleep_if_locked and wakeup_sleepers correspond to FUTEX_WAIT and FUTEX_WAKE ops on Linux. The other atomics are probably CPU instructions, but could be system calls or kernel-assisted userspace function code, as in the case of Linux/ARM and the 0xffff0fc0 atomic compare-and-swap call.
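On Linux, the outline above could be fleshed out roughly as follows. This is only a sketch under several assumptions: it is Linux-specific, it calls the raw futex system call through syscall() because glibc provides no wrapper, it assumes an _Atomic uint32_t has the same layout as the plain 32-bit word the kernel expects, and the fm_* names are made up. It is neither recursive nor robust.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

typedef struct {
    _Atomic uint32_t lock;      /* 0 = free, 1 = held */
    _Atomic uint32_t waiters;   /* threads that are (about to be) asleep */
} futex_mutex;

/* Thin helper around the futex system call. */
static long futex(_Atomic uint32_t *addr, int op, uint32_t val)
{
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

/* Atomic swap: an old value of 0 means we took the lock. */
int fm_trylock(futex_mutex *m)
{
    return atomic_exchange(&m->lock, 1) == 0;
}

void fm_lock(futex_mutex *m)
{
    while (!fm_trylock(m)) {
        atomic_fetch_add(&m->waiters, 1);
        /* FUTEX_WAIT sleeps only if the word still equals 1,
         * so an unlock that slips in first is not missed. */
        futex(&m->lock, FUTEX_WAIT, 1);
        atomic_fetch_sub(&m->waiters, 1);
    }
}

void fm_unlock(futex_mutex *m)
{
    atomic_store(&m->lock, 0);
    if (atomic_load(&m->waiters) > 0)
        futex(&m->lock, FUTEX_WAKE, 1);   /* wake at most one sleeper */
}
```

The uncontended path never enters the kernel: fm_lock succeeds on the first swap, and fm_unlock skips the wake when nobody is registered as waiting.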
You do not need atomic instructions for a user level thread library, because all the threads are going to be user level threads of the same process. So actually when your process is given the time slice to execute, you are running multiple threads during that time slice but on the same processor. So, no two threads are going to be in the library function at the same time. Considering that the functions for mutex are already in the library, mutual exclusion is guaranteed.