I'm using GLib's mutex utilities to handle concurrency. Is it guaranteed that the updated value of a modified variable will be visible to every other thread after the mutex is unlocked?
Do these threads have to acquire a lock on the mutex as well in order to read it safely?
GStaticMutex mutex;
int value;
void init() {
    g_static_mutex_init(&mutex);
    value = 0;
}
void changeValue() {
    g_static_mutex_lock(&mutex);
    value = generateRandomNumber();
    g_static_mutex_unlock(&mutex);
}
You should work by the book, and let the smart people who implemented the mutex worry about visibility and barriers. The book says a mutex should be held both when reading and when writing.
The CPU can rearrange reads, and does this a lot. It helps reduce the penalty of cache misses, because you start to fetch the data a while before it's actually needed.
So if you read a variable after another CPU wrote it and released the lock, the read may actually be performed before these things happen.
The mutex serves as a memory barrier, preventing this problem (and others).
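A minimal sketch of "hold the mutex both when reading and when writing", written against plain pthreads rather than GLib so that it needs no extra library. The names value_lock, change_value, read_value and demo_read_after_write are made up for illustration and do not come from the question.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t value_lock = PTHREAD_MUTEX_INITIALIZER;
static int value = 0;

/* Writer: store the new value while holding the lock. */
void change_value(int new_value) {
    pthread_mutex_lock(&value_lock);
    value = new_value;
    pthread_mutex_unlock(&value_lock);
}

/* Reader: take the same lock, copy the value out, then release. */
int read_value(void) {
    pthread_mutex_lock(&value_lock);
    int copy = value;
    pthread_mutex_unlock(&value_lock);
    return copy;
}

static void *writer_thread(void *arg) {
    change_value(*(int *)arg);
    return NULL;
}

/* Hypothetical demo: write from another thread, then read the result. */
int demo_read_after_write(void) {
    pthread_t t;
    int wanted = 42;
    int rc = pthread_create(&t, NULL, writer_thread, &wanted);
    assert(rc == 0);
    pthread_join(t, NULL);
    return read_value();
}
```

The reader copies the value into a local while it holds the lock and only uses the copy afterwards, so the critical section stays short.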
The mutex object should only be read through the g_static_mutex_* functions. If you want to know if you can acquire the mutex you can use this function:
g_static_mutex_trylock
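As a rough illustration of the "try, and skip the work if the lock is busy" idea, here is a sketch that uses the pthreads counterpart, pthread_mutex_trylock, so that it compiles without the GLib headers. The helper names try_do_work and trylock_demo are assumptions for the example.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Returns 1 if the lock was free and the work was done, 0 if it was busy. */
int try_do_work(int *counter) {
    int rc = pthread_mutex_trylock(&lock);
    if (rc == EBUSY) {
        return 0;               /* someone else holds it: do not block */
    }
    assert(rc == 0);
    ++*counter;
    pthread_mutex_unlock(&lock);
    return 1;
}

/* Hypothetical demo: one attempt with the lock free, one with it held. */
int trylock_demo(void) {
    int counter = 0;
    try_do_work(&counter);      /* lock is free: counter is incremented */
    pthread_mutex_lock(&lock);
    try_do_work(&counter);      /* lock is held: counter is left alone */
    pthread_mutex_unlock(&lock);
    return counter;
}
```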
As for the linkage of the identifier, it follows the same rules as any other C identifier: it depends on the scope in which it is declared and on whether a storage class specifier (e.g., static or extern) is specified.
I guess I found the answer. Gthread is a wrapper around pthread (according to http://redmine.lighttpd.net/boards/3/topics/425) and pthreads seem to implement a memory barrier (http://stackoverflow.com/questions/3208060/does-guarding-a-variable-with-a-pthread-mutex-guarantee-its-also-not-cached)
But I'm uncertain whether it is necessary to use the mutex to read the value.
I use pthread_mutex_t in my program for thread synchronization control.
Do I need to do some finishing works when the pthread_mutex_t is no longer in use?
Or, can I do nothing?
Thank you for your help
You mention that "the pthread_mutex_t is no longer in use".
I assume you mean that you no longer need to use it, ever, in any of your threads.
In this case:
The pthread_mutex_t must be in the unlocked state.
You should call pthread_mutex_destroy.
The requirement for the mutex to be unlocked appears in the documentation for pthread_mutex_destroy:
It shall be safe to destroy an initialized mutex that is unlocked.
Attempting to destroy a locked mutex results in undefined behavior.
(emphasis is mine)
This post contains some more info about the proper usage of pthread_mutex_destroy:
How to safely and correctly destroy a mutex in Linux using pthread_mutex_destroy?
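The two points above (unlock first, then destroy) might look like this in a small sketch. The function name mutex_lifecycle and the variable shared are made up for illustration.

```c
#include <assert.h>
#include <pthread.h>

/* Returns 0 if the mutex was initialized, used and destroyed cleanly. */
int mutex_lifecycle(void) {
    pthread_mutex_t m;
    int shared = 0;

    if (pthread_mutex_init(&m, NULL) != 0) {
        return -1;
    }

    pthread_mutex_lock(&m);
    shared++;
    pthread_mutex_unlock(&m);   /* the mutex must be unlocked ... */

    int rc = pthread_mutex_destroy(&m);   /* ... before it is destroyed */
    assert(shared == 1);
    return rc;
}
```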
I use pthread_mutex_t in my program for thread synchronization
control. Do I need to do some finishing works when the
pthread_mutex_t is no longer in use? Or, can I do nothing?
TL;DR: You do not need to do any cleanup. In some cases you should, and in other cases it's a question of style. And in some cases, the question is moot because it's not possible to recognize that a mutex is no longer in use.
The relevant sense of "no longer in use" here would be that the mutex is not currently locked by any thread (including the one that might perform cleanup), and there is no possibility that any thread will attempt to lock it in the future. For this case, the pthread_mutex_destroy() function is available to release any resources that the mutex may be holding (not including the storage occupied by the mutex object itself). In any other case, destroying the mutex puts your program at risk of exercising undefined behavior.
If a given mutex object has ever been initialized, including via the static initializer, and its lifetime ends at a point when it has not been destroyed since its last initialization, then the end of its lifetime must be assumed to leak resources. But this is consequential only when the mutex's lifetime ends before the end of the program, because all resources belonging to a process are cleaned up by the OS when the process terminates. In particular, it is not consequential in the common case of mutex objects declared at file scope in any translation unit.
Guidance, then:
As a correctness matter, you must ensure that
The lifetime of a mutex object does not end while it is still in use.
No mutex is destroyed while it is still in use or after the end of its lifetime.
As a practical matter, you should avoid consequential resource leaks, as they may ultimately lead to program failure and / or overall system stress from resource exhaustion. In this context, that means using pthread_mutex_destroy() to clean up mutex objects having automatic, allocated, or thread storage duration before those objects' lifetimes end, when that occurs significantly before the end of the program overall.
As a style matter, you might choose to apply a similar discipline to mutex objects having static storage duration -- perhaps only those initialized via pthread_mutex_init(), or perhaps including also those initialized via the static initializer macro. I tend not to worry about these, myself, as there are rarely very many, and they rarely go out of use very much before the program is going to terminate anyway.
As a style matter, you should not make heroic efforts or overly complicate your code to ensure that mutexes are explicitly destroyed when the program is terminating. The OS is going to perform all necessary cleanup anyway, and any cleanup (or other) code that takes a lot of effort to write correctly in the first place is fertile ground for bugs and has high maintenance cost.
Finally, note well that there are cases when you can't even recognize before program termination that a given mutex is no longer in use. For example, consider a program that declares a file-scope mutex used to synchronize the operations of several daemon threads. It may well be that no thread in the system can determine whether all the (other) daemon threads have terminated, so as to know that the mutex is no longer in use, so there is no safe course but to avoid ever destroying it explicitly.
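For the case of a mutex with allocated storage duration, one possible shape is to tie the mutex to the object it protects and destroy it just before the storage is freed. This is only a sketch: the account type and its functions are hypothetical, and the caller is assumed to know that no other thread still uses the object when account_destroy() runs.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical object whose mutex lives and dies with it. */
typedef struct {
    pthread_mutex_t lock;
    long balance;
} account;

account *account_create(long initial) {
    account *a = malloc(sizeof *a);
    if (a == NULL) {
        return NULL;
    }
    if (pthread_mutex_init(&a->lock, NULL) != 0) {
        free(a);
        return NULL;
    }
    a->balance = initial;
    return a;
}

long account_deposit(account *a, long amount) {
    pthread_mutex_lock(&a->lock);
    a->balance += amount;
    long now = a->balance;
    pthread_mutex_unlock(&a->lock);
    return now;
}

/* Caller must guarantee that no other thread still uses `a`. */
int account_destroy(account *a) {
    int rc = pthread_mutex_destroy(&a->lock);  /* release mutex resources */
    free(a);                                   /* then end the lifetime */
    return rc;
}

/* Hypothetical demo: returns the balance after one deposit, -1 on error. */
long account_demo(void) {
    account *a = account_create(100);
    if (a == NULL) {
        return -1;
    }
    long balance = account_deposit(a, 25);
    int rc = account_destroy(a);
    assert(rc == 0);
    return balance;
}
```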
Assume sharedFnc is a function that is used between multiple threads:
void sharedFnc(){
    // do some thread safe work here
}
Which one is the proper way of using a Mutex here?
A)
void sharedFnc(){
    // do some thread safe work here
}
int main(){
    ...
    pthread_mutex_lock(&lock);
    sharedFnc();
    pthread_mutex_unlock(&lock);
    ...
}
Or B)
void sharedFnc(){
    pthread_mutex_lock(&lock);
    // do some thread safe work here
    pthread_mutex_unlock(&lock);
}
int main(){
    ...
    sharedFnc();
    ...
}
Let's consider two extremes:
In the first extreme, you can't even tell what lock you need to acquire until you're inside the function. Maybe the function locates an object and operates on it and the lock is per-object. So how can the caller know what lock to hold?
And maybe the code needs to do some work while holding the lock and some work while not holding the lock. Maybe it needs to release the lock while waiting for something.
In this extreme, the lock must be acquired and released inside the function.
In the opposite extreme, the function might not even have any idea it's used by multiple threads. It may have no idea what lock its data is associated with. Maybe it's called on different data at different times and that data is protected by different locks.
Maybe its caller needs to call several different functions while holding the same lock. Maybe this function reports some information on which the thread will decide to call some other function and it's critical that state not be changed by another thread between those two functions.
In this extreme, the caller must acquire and release the lock.
Between these two extremes, it's a judgment call based on which extreme the situation is closer to. Also, those aren't the only two options available. There are "in-between" options as well.
There's something to be said for this pattern:
// Only call this with `lock` locked.
//
static sometype foofunc_locked(...) {
    ...
}
sometype foofunc(...) {
    pthread_mutex_lock(&lock);
    sometype rVal = foofunc_locked(...);
    pthread_mutex_unlock(&lock);
    return rVal;
}
This separates the responsibility for locking and unlocking the mutex from whatever other responsibilities are embodied by foofunc_locked(...).
One reason you would want to do that is that it's very easy to see whether every possible invocation of foofunc() unlocks the lock before it returns. That might not be the case if the locking and unlocking were mingled with loops, switch statements, nested if statements, returns from the middle, etc.
If the lock is inside the function, you better make damn sure there's no recursion involved, especially no indirect recursion.
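One way to catch accidental re-locking during development is an error-checking mutex, which reports the problem instead of hanging. The sketch below simulates what a recursive call would do; the function name relock_result is made up, and whether to use PTHREAD_MUTEX_ERRORCHECK in production code is a separate judgment call.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Returns what pthread_mutex_lock() reports when the calling thread
   tries to lock an error-checking mutex that it already holds. */
int relock_result(void) {
    pthread_mutexattr_t attr;
    pthread_mutex_t m;

    int rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    rc = pthread_mutex_init(&m, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&m);
    int second = pthread_mutex_lock(&m);   /* what a recursive call would do */
    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return second;                         /* EDEADLK for this mutex type */
}
```

With a default mutex the second lock attempt is not required to fail and may simply block forever, which is the deadlock described above.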
Another problem with the lock being inside the function is loops, where you have two big problems:
Performance. Every cycle you're releasing and reacquiring your locks. That can be expensive, especially in OSes like Linux which don't have light locks like critical sections.
Lock semantics. If there's work to be done inside the loop, but outside your function, you can't acquire the lock once per cycle, because it will deadlock your function. So you have to piecemeal your loop cycle even more, calling your function (acquire-release), then manually acquire the lock, do the extra work, and manually release it before the cycle ends. And you have absolutely no guarantee of what happens between your function releasing it and you acquiring it.
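Both loop problems can be sidestepped by combining this with a "_locked" helper: the loop takes the lock once and calls the helper that assumes the lock is already held. A minimal sketch, with hypothetical names add_locked, add and add_range:

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long total = 0;

/* Only call this with `lock` held. */
static void add_locked(long n) {
    total += n;
}

/* Convenience wrapper for one-off callers. */
void add(long n) {
    pthread_mutex_lock(&lock);
    add_locked(n);
    pthread_mutex_unlock(&lock);
}

/* One acquire/release for the whole loop, so no other thread can
   slip in between iterations. Returns the total after the loop. */
long add_range(long first, long last) {
    pthread_mutex_lock(&lock);
    for (long i = first; i <= last; i++) {
        add_locked(i);
    }
    long result = total;
    pthread_mutex_unlock(&lock);
    return result;
}
```

The trade-off is that the lock is held for the whole loop, so other threads wait longer; for long loops that may or may not be acceptable.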
I've never had the chance to play with the pthreads library before, but I am reviewing some code involving pthread mutexes. I checked the documentation for pthread_mutex_lock and pthread_mutex_init, and my understanding from reading the man pages for both these functions is that I must call pthread_mutex_init before I call pthread_mutex_lock.
However, I asked a couple colleagues, and they think it is okay to call pthread_mutex_lock before calling pthread_mutex_init. The code I'm reviewing also calls pthread_mutex_lock without even calling pthread_mutex_init.
Basically, is it safe and smart to call pthread_mutex_lock before ever calling pthread_mutex_init (if pthread_mutex_init even gets called)?
EDIT: I also see some examples where pthread_mutex_lock is called when pthread_mutex_init is not used, such as this example
EDIT #2: Here is specifically the code I'm reviewing. Please note that the configure function acquires and attaches to some shared memory that does not get initialized. The Java code later on will call lock(), with no other native functions called in-between. Link to code
Mutexes are variables containing state (information) that functions need to do their job. If no information was needed, the routine wouldn't need a variable. Likewise, the routine can't possibly function properly if you feed random garbage to it.
Most platforms do accept a mutex object filled with zero bytes. This is usually what pthread_mutex_init and PTHREAD_MUTEX_INITIALIZER create. As it happens, the C language also guarantees that uninitialized global variables are zeroed out when the program starts. So, it may appear that you don't need to initialize pthread_mutex_t objects, but this is not the case. Things that live on the stack or the heap, in particular, often won't be zeroed.
Calling pthread_mutex_init after pthread_mutex_lock is certain to have undesired consequences. It will overwrite the variable. Potential results:
The mutex gets unlocked.
A race condition with another thread attempting to get the lock, leading to a crash.
Resources leaked in the library or kernel (but will be freed on process termination).
The POSIX standard says:
If mutex does not refer to an initialized mutex object, the behavior
of pthread_mutex_lock(), pthread_mutex_trylock(), and
pthread_mutex_unlock() is undefined.
So you do need to initialise the mutex. This can be done either by a call to pthread_mutex_init() or, if the mutex has static storage duration, by using the static initializer PTHREAD_MUTEX_INITIALIZER. E.g.:
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
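To put the two options side by side, here is a small sketch. The names global_lock and init_both_ways are made up; the heap-allocated mutex stands in for any mutex that does not have static storage duration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Static storage duration: the initializer macro is enough. */
static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;

/* Returns 0 if both mutexes could be locked and unlocked. */
int init_both_ways(void) {
    /* Heap storage: the bytes are indeterminate until initialized. */
    pthread_mutex_t *heap_lock = malloc(sizeof *heap_lock);
    if (heap_lock == NULL) {
        return -1;
    }
    int rc = pthread_mutex_init(heap_lock, NULL);
    if (rc != 0) {
        free(heap_lock);
        return rc;
    }

    rc |= pthread_mutex_lock(&global_lock);
    rc |= pthread_mutex_lock(heap_lock);
    rc |= pthread_mutex_unlock(heap_lock);
    rc |= pthread_mutex_unlock(&global_lock);

    rc |= pthread_mutex_destroy(heap_lock);
    free(heap_lock);
    return rc;
}
```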
Here is the text from the link I posted in a comment:
Mutual exclusion locks (mutexes) prevent multiple threads
from simultaneously executing critical sections of code that
access shared data (that is, mutexes are used to serialize
the execution of threads). All mutexes must be global. A
successful call for a mutex lock by way of mutex_lock()
will cause another thread that is also trying to lock the
same mutex to block until the owner thread unlocks it by way
of mutex_unlock(). Threads within the same process or
within other processes can share mutexes.
Mutexes can synchronize threads within the **same process** or
in ***other processes***. Mutexes can be used to synchronize
threads between processes if the mutexes are allocated in
writable memory and shared among the cooperating processes
(see mmap(2)), and have been initialized for this task.
Initialize
Mutexes are either intra-process or inter-process,
depending upon the argument passed implicitly or explicitly to the initialization of that mutex.
A statically allocated mutex does not need to be explicitly initialized;
by default, a statically allocated mutex is initialized with all zeros and its scope is set to be within the calling process.
For inter-process synchronization, a mutex needs to be allocated
in memory shared between these processes. Since the
memory for such a mutex must be allocated dynamically, the
mutex needs to be explicitly initialized using mutex_init().
Also, for inter-process synchronization,
besides the requirement to be allocated in shared memory,
the mutexes must also use the attribute PTHREAD_PROCESS_SHARED,
otherwise accessing the mutex from another process than its creator results in undefined behaviour
(see this: linux.die.net/man/3/pthread_mutexattr_setpshared):
The process-shared attribute is set to PTHREAD_PROCESS_SHARED to permit a
mutex to be operated upon by any thread that has access to the memory
where the mutex is allocated, even if the mutex is allocated in memory that is shared by multiple processes
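Putting the two requirements together (shared memory plus the PTHREAD_PROCESS_SHARED attribute), a sketch could look like the following. It assumes a Linux-like system where mmap() supports MAP_ANONYMOUS; the shared_state type and shared_counter_demo function are invented for the example, and most error checks are left out for brevity.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <pthread.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

typedef struct {
    pthread_mutex_t lock;
    int counter;
} shared_state;

/* Parent and child each add `n` to a counter that lives in shared
   memory. Returns the final counter, or -1 on failure. */
int shared_counter_demo(int n) {
    shared_state *s = mmap(NULL, sizeof *s, PROT_READ | PROT_WRITE,
                           MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (s == MAP_FAILED) {
        return -1;
    }

    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    int rc = pthread_mutex_init(&s->lock, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);
    s->counter = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(s, sizeof *s);
        return -1;
    }

    for (int i = 0; i < n; i++) {      /* runs in both processes */
        pthread_mutex_lock(&s->lock);
        s->counter++;
        pthread_mutex_unlock(&s->lock);
    }
    if (pid == 0) {
        _exit(0);                      /* the child is done */
    }

    waitpid(pid, NULL, 0);
    int result = s->counter;
    pthread_mutex_destroy(&s->lock);
    munmap(s, sizeof *s);
    return result;
}
```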
I am new to thread programming. I know that mutexes are used to protect access to shared data in a multi-threaded program.
Suppose I have one thread with variable a and a second one with the pointer variable p that holds the address of a. Is the code safe if, in the second thread, I lock a mutex before I modify the value of a using the pointer variable? From my understanding it is safe.
Can you confirm? And also can you provide the reason why it is true or why it is not true?
I am working with c and pthreads.
The general rule when doing multithreading is that shared variables among threads that are read and written need to be accessed serially, which means that you must use some sort of synchronization primitive. Mutexes are a popular choice, but no matter what you end up using, you just need to remember that before reading from or writing to a shared variable, you need to acquire a lock to ensure consistency.
So, as long as every thread in your code agrees to always use the same lock before accessing a given variable, you're all good.
Now, to answer your specific questions:
Is the code safe if, in the second thread, I lock a mutex before I
modify the value of a using the pointer variable?
It depends. How do you read a on the first thread? The first thread needs to lock the mutex too before accessing a in any way. If both threads lock the same mutex before reading or writing the value of a, then it is safe.
It's safe because the region of code between the mutex lock and unlock is exclusive (as long as every thread respects the rule that before doing Y, they need to acquire lock X), since only one thread at a time can have the lock.
As for this comment:
And if the mutex is locked before p is used, then both a and p are
protected? The conclusion being that every memory reference present in
a section where a mutex is locked is protected, even if the memory is
indirectly referenced?
Mutexes don't protect memory regions or references, they protect a region of code. Whatever you do between locking and unlocking is exclusive; that's it. So, if every thread accessing or modifying either of a or p locks the same mutex before and unlocks afterwards, then as a side-effect you have synchronized accesses.
TL;DR Mutexes allow you to write code that never executes in parallel, you get to choose what that code does - a remarkably popular pattern is to access and modify shared variables.
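A small sketch of the situation in the question, assuming one thread uses a by name and the other only has a pointer to it. The names a_lock, increment_via_pointer and pointer_demo are made up for illustration.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t a_lock = PTHREAD_MUTEX_INITIALIZER;

/* Second thread: modifies `a` only through the pointer, under the lock. */
static void *increment_via_pointer(void *arg) {
    int *p = arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&a_lock);
        ++*p;
        pthread_mutex_unlock(&a_lock);
    }
    return NULL;
}

/* First thread: owns `a` and accesses it by name, under the same lock. */
int pointer_demo(void) {
    int a = 0;
    pthread_t t;
    int rc = pthread_create(&t, NULL, increment_via_pointer, &a);
    assert(rc == 0);
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&a_lock);
        a++;
        pthread_mutex_unlock(&a_lock);
    }
    pthread_join(t, NULL);
    return a;   /* safe to read: the other thread has been joined */
}
```

If either loop skipped the lock, the increments could interleave and the final count would no longer be reliable.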
If I have two threads and one global variable (one thread constantly loops to read the variable; the other constantly loops to write to it), would anything happen that shouldn't (e.g. exceptions, errors)? If it does, what is a way to prevent this? I was reading about mutex locks and that they allow exclusive access to a variable to one thread. Does this mean that only that thread can read and write to it and no other?
Would anything happen that shouldn't?
It depends in part on the type of the variables. If the variable is, say, a string (long array of characters), then if the writer and the reader access it at the same time, it is completely undefined what the reader will see.
This is why mutexes and other coordinating mechanisms are provided by pthreads.
Does this mean that only that thread can read and write to it and no other?
Mutexes ensure that at most one thread that is using the mutex can have permission to proceed. All other threads using the same mutex will be held up until the first thread releases the mutex. Therefore, if the code is written properly, at any time, only one thread will be able to access the variable. If the code is not written properly, then:
one thread might access the variable without checking that it has permission to do so
one thread might acquire the mutex and never release it
one thread might destroy the mutex without notifying the other
None of these is desirable behaviour, but the mere existence of a mutex does not prevent any of these happening.
Nevertheless, your code could reasonably use a mutex carefully and then the access to the global variable would be properly controlled. While it has permission via the mutex, either thread could modify the variable, or just read the variable. Either will be safe from interference by the other thread.
Does this mean that only that thread can read and write to it and no other?
It means that only one thread can read or write to the global variable at a time.
The two threads will not race with each other to access the global variable, nor will they access it at the same time.
In short, access to the global variable is synchronized.
First; In C/C++ an unsynchronized read/write of a variable does not generate any exception or system error, BUT it can cause application-level errors, mostly because you are unlikely to fully understand how the memory is accessed, and whether the access is atomic, unless you look at the generated assembler. On a multi-core CPU, accessing shared memory without synchronization can easily create hard-to-debug race conditions.
Hence
Second; You should always use synchronization, such as mutex locks, when dealing with shared memory. A mutex lock is cheap, so it will not really impact performance if done right. Rule of thumb: keep the lock for as short a time as possible, such as just for the duration of reading/incrementing/writing the shared memory.
However, from your description, it sounds like one of your threads is doing nothing BUT waiting for the shared memory to change state before doing something. That is a bad multi-threaded design which costs unnecessary CPU burn, so
Third; Look at using semaphores (sem_init/sem_wait/sem_post) for synchronization between your threads if you are trying to send a "message" from one thread to the other.
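A rough sketch of the semaphore approach, using POSIX unnamed semaphores (sem_init, which is assumed to be available, as it is on Linux). The consumer sleeps in sem_wait() instead of spinning on the variable. The names ready, message, producer and wait_for_message are invented for the example.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

static sem_t ready;      /* counts messages that are ready to be read */
static int message;      /* written before sem_post, read after sem_wait */

static void *producer(void *arg) {
    message = *(int *)arg;
    sem_post(&ready);    /* wake the waiting consumer */
    return NULL;
}

/* Blocks without spinning until the producer has posted.
   Returns the received payload, or -1 on setup failure. */
int wait_for_message(int payload) {
    pthread_t t;
    if (sem_init(&ready, 0, 0) != 0) {
        return -1;
    }
    if (pthread_create(&t, NULL, producer, &payload) != 0) {
        sem_destroy(&ready);
        return -1;
    }

    /* Retry only if the wait was interrupted by a signal. */
    while (sem_wait(&ready) == -1 && errno == EINTR) {
    }
    int received = message;

    pthread_join(t, NULL);
    sem_destroy(&ready);
    return received;
}
```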
As others already said, when communicating between threads through "normal" objects you have to take care of race conditions. Besides mutexes and other lock structures that are relatively heavyweight, the new C standard (C11) provides atomic types and operations that are guaranteed to be race-free. Most modern processors provide instructions for such types, and many modern compilers (in particular gcc on Linux) already provide their own interfaces for such operations.
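For a simple counter, the C11 route could look like the sketch below. It assumes a compiler that ships <stdatomic.h> and uses pthreads only to start the threads; the names hits, count_hits and atomic_demo are made up.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static atomic_int hits = 0;

static void *count_hits(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        atomic_fetch_add(&hits, 1);   /* indivisible read-modify-write */
    }
    return NULL;
}

/* Two threads increment the same counter without any mutex. */
int atomic_demo(void) {
    pthread_t a, b;
    int rc = pthread_create(&a, NULL, count_hits, NULL);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, count_hits, NULL);
    assert(rc == 0);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return atomic_load(&hits);
}
```

This works well for a single value; once several variables have to change together, a mutex is usually the simpler tool.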
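If the threads truly are only one producer and only one consumer, then (barring compiler bugs)
1) marking the variable as volatile, and
2) making sure that it is correctly aligned, so as to avoid interleaved fetches and stores
will allow you to do this without locking on many common platforms. The C standard itself does not promise this: volatile gives neither atomicity nor ordering between threads, so C11 atomics are the portable way to get the same effect.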