Mutexes and thread priorities with regard to scheduling on POSIX systems - c

On POSIX systems (Linux etc.), when multiple threads lock a common mutex, is it always the locking order that is observed, or does thread priority bias scheduling toward higher-priority threads when choosing the next thread to enter the critical section?
Does the standard say anything about this behavior? As far as I can see it only seems to specify the required interface.
Please note, I'm looking for guidance on any POSIX-conformant system (not just Linux), so feel free to describe the behavior of other OSes (QNX, Minix, etc.).

When multiple threads are waiting to lock the same mutex and the mutex becomes available, the highest-priority thread will be unblocked first. If multiple threads have the same priority, which thread is unblocked depends on the scheduling algorithm in use; e.g. under a FIFO policy, the thread that has been waiting the longest will be awakened first.
Thread priorities and synchronisation are quite a thorny area, and you need to be very careful you don't end up with priority inversion and cause deadlock.
Chapter 5.5 of Butenhof's Programming with POSIX Threads deals with realtime scheduling.


Does the pthread API provide synchronization in a multiprocessor environment?

I've just started to study the pthread API. I've been using different books and websites, and judging from what they all report, pthread synchronization functions (e.g. those involving mutexes) work in both uniprocessor and multiprocessor environments. But none of these sources explicitly states it, so I wanted to know whether that's actually the case (of course I believe so, I just wanted to be 100% sure).
So, if two threads running on different CPUs called a lock function (e.g. pthread_mutex_lock()) on the same mutex at the same time, would the execution of this routine be serialized rather than run in parallel? And after the first lock succeeds and the thread invoking it has exclusive access to the critical section, does the lock attempted by the other thread on another CPU cause the latter thread to suspend?
Yes, it does. The POSIX API is described in terms of requirements on implementations - for example, a pthread_mutex_lock() that returns zero or EOWNERDEAD must return with the mutex locked and owned by the calling thread. There's no exception for multiprocessor environments, so conforming implementations in multiprocessor environments must continue to make it work.
So, if two threads running on different CPUs called a lock function (e.g. pthread_mutex_lock()) on the same mutex at the same time, would the execution of this routine be serialized rather than run in parallel?
It's not specified how pthread_mutex_lock() works underneath, but from an application point of view you know that if it doesn't return an error, your thread has acquired the lock.
And after the first lock succeeds and the thread invoking it has exclusive access to the critical section, does the lock attempted by the other thread on another CPU cause the latter thread to suspend?
Yes - the specification for pthread_mutex_lock() says:
If the mutex is already locked by another thread, the calling thread shall block until the mutex becomes available.

What does it mean to POSIX that a thread is "suspended"?

In the course of commentary on a recent question, a subsidiary question arose about at what point a cancellation request for a pthreads thread with cancelability PTHREAD_CANCEL_DEFERRED can be expected to be acted upon. References to the standard and a bit of lawyering ensued. I'm not much concerned specifically about whether I was mistaken in my comments on that question, but I would like to be sure I understand POSIX's provisions correctly.
The most pertinent section of the standard says
Whenever a thread has cancelability enabled and a cancellation request has been made with that thread as the target, and the thread then calls any function that is a cancellation point [...], the cancellation request shall be acted upon before the function returns. If a thread has cancelability enabled and a cancellation request is made with the thread as a target while the thread is suspended at a cancellation point, the thread shall be awakened and the cancellation request shall be acted upon.
What, though, does it mean for a thread to be "suspended"? POSIX explicitly defines the term for processes, but not, as far as I can determine, for threads. On the other hand, POSIX documents thread suspension to be among the behaviors of a handful of functions, including, but not limited to, some of those related to synchronization objects. Should one then conclude that those serve collectively as the relevant definition of the term?
And as this all pertains to the question that spawned this line of inquiry, given that POSIX does not specify thread suspension as part of the behavior of read(), fread(), or any of the general file or stream I/O functions, if a thread is not making progress on account of being blocked on I/O, does that necessarily mean it is "suspended" for the purposes of cancellation?
A suspended thread is one that, as you say, is blocked on a socket read, waiting for a semaphore to become available, etc.
Given that POSIX implementations vary at the tricky edges, and that there is the potential for a thread to be blocked in a function that is not a cancellation point, it might be that relying on cancellation in code that is to be ported might be more trouble than it's worth.
I've never used it; I've always chosen to have code explicitly instruct a thread to terminate (normally via a message down a pipe or queue). This is very easy with a Communicating Sequential Processes or Actor Model system.
That way clean-up can be done under one's own control: freeing memory, etc. as necessary. A cancelled thread will not free its memory by itself, though POSIX does provide cleanup handlers (pthread_cleanup_push()) that run when a thread is cancelled. On the whole I think that application behaviour is more thoroughly controlled if there is only one single way a thread can exit.
==EDIT==
@JohnBollinger,
The language used, "If a thread has cancelability enabled and a cancellation request is made with the thread as a target while the thread is suspended at a cancellation point", could be interpreted as: IF a thread has cancelability enabled AND IF it is cancelled AND IF the implementation suspends blocked threads AND IF the thread is blocked, THEN the thread shall be awakened... In other words, they're leaving it up to the implementer of the POSIX subsystem.
Cygwin's implementation of select() does not (or at least did not) result in the thread being suspended. Instead it spawns a polling thread per file descriptor to test for signalable activity, due to the fundamental lack of anything quite like select() in Windows (it gets close, but no cigar: Win32 select() works only on sockets). Implementations of select() back in the 1980s often worked this way too.
It might be for reasons like this that POSIX is reluctant to clearly define when a thread is suspended. Historically many implementations of select() were like this, making it a minefield for a standards committee to say when a thread might or might not be suspended. Of course the complexities caused by select() would also apply to a process but as POSIX does define a suspended process it does seem odd that they couldn't / didn't extend the definition to threads.
It might be down to how threads are implemented; you can conceivably have a POSIX implementation that doesn't use OS threads (a bit like the early implementations of Ada back in the days when OSes didn't do threads at all), and in such an implementation a blocked thread might not be suspended (in the sense of taking no CPU cycles) at all.
Definition of suspend in the context of threads:
3.107 Condition Variable
A synchronization object which allows a thread to suspend execution, repeatedly, until some associated predicate becomes true. A thread whose execution is suspended on a condition variable is said to be blocked on the condition variable.
From: http://pubs.opengroup.org/onlinepubs/9699919799/
This is not a direct answer, just a definition – too large for a comment. Blocked == suspended.
Note, though, that read() and the other low-level I/O functions are on POSIX's list of required cancellation points: a thread blocked in read() must act on a pending deferred cancellation request. fread() and the rest of stdio are only permitted, not required, to be cancellation points. So blocking in the kernel does not, by itself, put a thread beyond the reach of cancellation.
I don't have a reference for it, but as far as I know, thread suspension in the context of POSIX threads has to do with its synchronization objects (like futexes).

What is the `pthread_mutex_lock()` wake order with multiple threads waiting?

Suppose I have multiple threads blocking on a call to pthread_mutex_lock(). When the mutex becomes available, does the first thread that called pthread_mutex_lock() get the lock? That is, are calls to pthread_mutex_lock() in FIFO order? If not, what, if any, order are they in? Thanks!
When the mutex becomes available, does the first thread that called pthread_mutex_lock() get the lock?
No. One of the waiting threads gets the lock, but which one gets it is not determined.
FIFO order?
A FIFO mutex is rather a pattern in its own right. See Implementing a FIFO mutex in pthreads
"If there are threads blocked on the mutex object referenced by mutex when pthread_mutex_unlock() is called, resulting in the mutex becoming available, the scheduling policy shall determine which thread shall acquire the mutex."
Aside from that, the answer to your question isn't specified by the POSIX standard. It may be random, or it may be in FIFO or LIFO or any other order, according to the choices made by the implementation.
FIFO ordering is about the least efficient mutex wake order possible. Only a truly awful implementation would use it. The thread that ran most recently may be able to run again without a context switch, and the more recently a thread ran, the more of its data and code will be hot in the cache. Reasonable implementations try to give the mutex to the thread that held it most recently, most of the time.
Consider two threads that do this:
Acquire a mutex.
Adjust some data.
Release the mutex.
Go to step 1.
Now imagine two threads running this code on a single core CPU. It should be clear that FIFO mutex behavior would result in one "adjust some data" per context switch -- the worst possible outcome.
Of course, reasonable implementations generally do give some nod to fairness. We don't want one thread to make no forward progress. But that hardly justifies a FIFO implementation!

Why is a pthread mutex considered "slower" than a futex?

Why are POSIX mutexes considered heavier or slower than futexes? Where is the overhead coming from in the pthread mutex type? I've heard that pthread mutexes are based on futexes, and when uncontended, do not make any calls into the kernel. It seems then that a pthread mutex is merely a "wrapper" around a futex.
Is the overhead simply in the function-wrapper call and the need for the mutex function to "setup" the futex (i.e., basically the setup of the stack for the pthread mutex function call)? Or are there some extra memory barrier steps taking place with the pthread mutex?
Futexes were created to improve the performance of pthread mutexes. NPTL uses futexes; LinuxThreads predated futexes, which I think is where the "slower" reputation comes from. NPTL mutexes may have some additional overhead, but it shouldn't be much.
Edit:
The actual overhead basically consists of:
selecting the correct algorithm for the mutex type (normal, recursive, adaptive, error-checking; normal, robust, priority-inheritance, priority-protected), where the code heavily hints to the compiler that we are likely using a normal mutex (so it should convey that to the CPU's branch prediction logic),
and a write of the current owner of the mutex if we manage to take it which should normally be fast, since it resides in the same cache-line as the actual lock which we have just taken, unless the lock is heavily contended and some other CPU accessed the lock between the time we took it and when we attempted to write the owner (this write is unneeded for normal mutexes, but needed for error-checking and recursive mutexes).
So, a few cycles (typical case) to a few cycles + a branch misprediction + an additional cache miss (very worst case).
The short answer to your question is that futexes are known to be implemented about as efficiently as possible, while a pthread mutex may or may not be. At minimum, a pthread mutex has overhead associated with determining the type of mutex and futexes do not. So a futex will almost always be at least as efficient as a pthread mutex, until and unless someone thinks up some structure lighter than a futex and then releases a pthreads implementation that uses that for its default mutex.
Technically speaking pthread mutexes are not slower or faster than futexes. pthread is just a standard API, so whether they are slow or fast depends on the implementation of that API.
Specifically in Linux pthread mutexes are implemented as futexes and are therefore fast. Actually, you don't want to use the futex API itself as it is very hard to use, does not have the appropriate wrapper functions in glibc and requires coding in assembly which would be non portable. Fortunately for us the glibc maintainers already coded all of this for us under the hood of the pthread mutex API.
Now, because most operating systems did not implement futexes, when programmers say "pthread mutex" they usually mean the performance you get from the usual implementation of pthread mutexes, which is slower.
So in practice, on most POSIX-compliant operating systems the pthread mutex is implemented in kernel space and is slower than a futex. On Linux they have the same performance. There could be other operating systems where pthread mutexes are implemented in user space (in the uncontended case) and therefore perform better, but I am only aware of Linux at this point.
Because they stay in userspace as much as possible, which means they require fewer system calls, which is inherently faster because the context switch between user and kernel mode is expensive.
I assume you're talking about kernel threads when you talk about POSIX threads. It's entirely possible to have an entirely userspace implementation of POSIX threads which require no system calls but have other issues of their own.
My understanding is that a futex is halfway between a kernel POSIX thread and a userspace POSIX thread.
On AMD64 a futex is 4 bytes, while a NPTL pthread_mutex_t is 56 bytes! Yes, there is a significant overhead.

Multithreading in C question

Does a mutex guarantee that threads enter the critical section in order of arrival?
That is, suppose thread 2 and thread 3 arrive and wait while thread 1 is in the critical section.
What exactly happens after thread 1 exits the critical section? If thread 2 arrived at the mutex lock before thread 3, will thread 2 be allowed to enter the critical section before thread 3?
Or will a race condition occur?
If it's not guaranteed, how can I solve this? (maybe a queue?)
That sort of behaviour would have to be an implementation detail of your threading library (which you didn't mention). I would guess most threading libraries don't make any such guarantee, though. Unless the waiting threads had different priorities, of course.
Generally threading libraries do not make any such guarantees, because most OS's don't make any such guarantee. The thread wrapper can't (usually) do any better than the native OS thread management operations.
It's up to the operating system. In Windows, there is no guaranteed order that any given thread will be awoken and granted the mutex.
The question you have asked is a classical case of "bounded waiting", and there is a known way of solving it: the Bakery Algorithm.
The basic idea is that you maintain two counts: the current serving number and a global count (by analogy with a bakery that hands out numbered tickets). Whenever a new thread arrives (i.e. waits on the mutex), increment the global count and hand its value to the thread as its ticket. The thread then waits until the current serving number equals its ticket number.
This way we maintain the order such that the thread that arrives first acquires the mutex first.
I am not sure whether the standard libraries implement mutexes this way internally, but it wouldn't be that difficult to implement the Bakery algorithm for your needs.
