Real dangers of 2+ threads writing/reading a variable - c

What are the real dangers of simultaneous read/write to a single variable?
If I use one thread to write a variable and another to read it in a while loop, and there is no danger if the variable is read while being written so that an old value is used, what else is a danger here?
Can a simultaneous read/write cause a thread crash, and what happens at the low level when an exactly simultaneous read/write occurs?

If two threads access a variable without suitable synchronization, and at least one of those accesses is a write, then you have a data race and undefined behaviour.
How undefined behaviour manifests is entirely implementation dependent. On most modern architectures, you won't get a trap or exception or anything from the hardware, and it will read something, or store something. The thing is, it won't necessarily read or write what you expected.
e.g. with two threads incrementing a variable, you can miss counts, as described in my article at devx: http://www.devx.com/cplus/Article/42725
For a single writer and a single reader, the most common outcome will be that reader sees a stale value, but you might also see a partially-updated value if the update requires more than one cycle, or the variable is split across cache lines. What happens then depends on what you do with it --- if it's a pointer and you get a partially updated value then it might not be a valid pointer, and won't point to what you intended it to anyway, and then you might get any kind of corruption or error due to dereferencing an invalid pointer value. This may include formatting your hard disk or other bad consequences if the bad pointer value just happens to point to a memory mapped I/O register....

In general you get unexpected results. Wikipedia defines two distinct race conditions:
A critical race occurs when the order in which internal variables are changed determines the eventual state that the state machine will end up in.
A non-critical race occurs when the order in which internal variables are changed does not alter the eventual state. In other words, a non-critical race occurs when moving to a desired state means that more than one internal state variable must be changed at once, but no matter in what order these internal state variables change, the resultant state will be the same.
So the output will not always get messed up; it depends on the code. It's good practice to always deal with race conditions, both for later code scaling and to prevent possible errors. Nothing is more annoying than not being able to trust your own data.

Two threads reading the same value is no problem at all.
The problem begins when one thread writes a non-atomic variable and another thread reads it. Then the result of the read is undefined, since a thread may be preempted (stopped) at any time. Only operations on atomic variables are guaranteed to be indivisible; in practice, aligned writes to int-sized variables are often atomic at the hardware level, but the language itself does not promise that.
If you have two threads accessing the same data, it is best practice, and usually unavoidable, to use locking (a mutex or semaphore).
hth
Mario

Depends on the platform. For example, on Win32, read and write operations on aligned 32-bit values are atomic: you can't half-read a new value and half-read an old value, and if you write, then when someone comes to read, they get either the full new value or the full old value. That's not true for all values, or all platforms, of course.

Result is undefined.
Consider this code:
int counter = 0;  /* global */

void thread_body(void)
{
    for (int i = 0; i < 10; i++)
    {
        counter = counter + 1;
    }
}
The problem is that if you have N threads, the result can be anything between 10 and N*10.
This is because it might happen that all threads read the same value, increase it, and then write the value+1 back. But you asked whether you can crash the program or hardware.
It depends. In most cases the wrong results are merely useless.
To solve this locking problem you need a mutex or a semaphore.
A mutex is a lock on code. In the case above you would lock the part of the code at the line
counter = counter + 1;
whereas a semaphore is a lock on the variable
counter
Basically they are two tools for solving the same type of problem.
Check for these tools in your threading library.
http://en.wikipedia.org/wiki/Mutual_exclusion

The worst that will happen depends on the implementation. There are so many completely independent implementations of pthreads, running on different systems and hardware, that I doubt anyone knows everything about all of them.
If p isn't a pointer-to-volatile then I think that a compiler for a conforming Posix implementation is allowed to turn:
while (*p == 0) {}
exit(0);
Into a single check of *p followed by an infinite loop that doesn't bother looking at the value of *p at all. In practice, it won't, so it's a question of whether you want to program to the standard, or program to undocumented observed behavior of the implementations you're using. The latter generally works for simple cases, and then you build on the code until you do something complicated enough that it unexpectedly doesn't work.
In practice, on a multi-CPU system that doesn't have coherent memory caches, it could be a very long time before that while loop ever sees a change made from a different CPU, because without memory barriers it might never update its cached view of main memory. But Intel has coherent caches, so most likely you personally won't see any delays long enough to care about. If some poor sucker ever tries to run your code on a more exotic architecture, they may end up having to fix it.
Back to theory, the setup you're describing could cause a crash. Imagine a hypothetical architecture where:
p points to a non-atomic type, like long long on a typical 32-bit architecture.
long long on that system has trap representations, for example because it has a padding bit used as a parity check.
the write to *p is half-complete when the read occurs
the half-write has updated some of the bits of the value, but has not yet updated the parity bit.
Bang, undefined behavior, you read a trap representation. It may be that Posix forbids certain trap representations that the C standard allows, in which case long long might not be a valid example for the type of *p, but I expect you can find a type for which trap representations are permitted.

If the variable being written and read cannot be updated or read atomically, then it is possible for the reader to pick up a corrupt, partially updated value.

You can see a partial update (e.g. you may see a long long variable with half of it coming from the new value and the other half coming from the old value).
You are not guaranteed to see the new value until you use a memory barrier (pthread_mutex_unlock() contains an implicit memory barrier).

Related

Are concurrent unordered writes with fencing to shared memory undefined behavior?

I have heard that it is undefined behavior to read/write the same location in memory concurrently, but I am unsure whether the same is true when there are no clear race conditions involved. I suspect that the C18 standard will state it is undefined behavior on principle, due to the potential to create race conditions, but I am more interested in whether this still counts as undefined behavior at an application level when these instances are surrounded by fencing.
Setup
For context, say we have two threads A and B, set up to operate on the same location in memory. It can be assumed that the shared memory mentioned here is not used or accessible anywhere else.
// Prior to the creation of these threads, the current thread has exclusive ownership of the shared memory
pthread_t a, b;
// Create two threads which operate on the same memory concurrently
pthread_create(&a, NULL, operate_on_shared_memory, NULL);
pthread_create(&b, NULL, operate_on_shared_memory, NULL);
// Join both threads giving the current thread exclusive ownership to shared memory
pthread_join(a, NULL);
pthread_join(b, NULL);
// Read from memory now that the current thread has exclusive ownership
printf("Shared Memory: %d\n", shared_memory);
Write/Write
Each thread then runs operate_on_shared_memory, which mutates the value of shared_memory at the same time in both threads, with the caveat that both threads attempt to set the shared memory to the same unchanging constant. Even if it is a race condition, the race winner should not matter. Does this count as undefined behavior? If so, why?
int shared_memory = 0;

void *operate_on_shared_memory(void *_unused) {
    const int SOME_CONSTANT = 42;
    shared_memory = SOME_CONSTANT;
    return NULL;
}
Optional Branching Write/Write
If the previous version does not count as undefined behavior, then what about this example, which first reads from shared_memory and then writes the constant to a second location in shared memory? The important part here is that even if one or both threads succeed in running the if statement, the outcome should still be the same.
int shared_memory = 0;
int other_shared_memory = 0;

void *operate_on_shared_memory(void *_unused) {
    const int SOME_CONSTANT = 42;
    if (shared_memory != SOME_CONSTANT) {
        other_shared_memory = SOME_CONSTANT;
    }
    shared_memory = SOME_CONSTANT;
    return NULL;
}
If this is undefined behavior, then why? If the only reason is that it introduces a race condition, is there any reason why I shouldn't deem it acceptable for one thread to potentially execute an extra machine instruction? Is it because the CPU or compiler may re-order memory operations? What if I were to put atomic_thread_fence at the start and end of the operate_on_shared_memory?
Context
GCC and Clang don't seem to have any complaints. I used c18 for this test, but I don't mind referring to a later standard if it is easier to reference.
$ gcc --version
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
$ gcc -std=c18 main.c -pthread -g -O3 -Wall
If the value of an object is modified any time near where it is read via non-qualified lvalue, machine code generated by gcc may yield behavior which is inconsistent with any particular value the object held or could have held. Situations where such inconsistencies would occur are likely rare, but I don't think there's any way of judging whether such issues could arise in machine code generated from any particular source, except by inspecting the machine code in question.
For example, (godbolt link https://godbolt.org/z/T3jd6voax) the function:
unsigned test(unsigned short *p)
{
    unsigned short temp = *p;
    unsigned short result = temp - (temp>>15);
    return result;
}
will be processed by gcc 10.2.1, when targeting the Cortex-M0 platform, by code equivalent to:
unsigned test(unsigned short *p)
{
    unsigned short temp1 = *p;
    signed short temp2 = *(signed short*)p;
    unsigned short result = temp1 + (temp2 >> 15);
    return result;
}
Although there is no way the original function could return 65535, the revised function may do so if the value at *p changes from 0 to 65535, or from 65535 to 0, between the two reads.
Many compilers are designed in a way that would inherently guarantee that an unqualified read of any word-size-or-smaller object will always yield a value the object has held or will hold in future, but unfortunately it is rare for compilers to explicitly document such things. The only compilers that wouldn't uphold such a guarantee would be those that process code using some sequence of steps which differs from that specified, but is expected to behave identically, and compiler writers seldom see any reason to enumerate--much less document--all of the transformations that they could perform, but don't.
As long as you don't plan to count every toggle of shared_memory and other_shared_memory, and you don't care if some modifications are skipped or done twice unnecessarily, it should work.
For example, if your code simply monitors/shows another system's activity to end users, it's fine: a mismatch lasting one microsecond isn't a serious issue.
If you plan to sample two inputs precisely and get an accurate array of results, or to do precise computations on the threads' results in shared memory, then this approach is very wrong.
Here, your UB is mostly that you can't guarantee that shared_memory isn't modified between the test and the assignment.
I've numbered two lines in your code:
void *operate_on_shared_memory(void *_unused) {
    const int SOME_CONSTANT = 42;
    /*1*/ if (shared_memory != SOME_CONSTANT) {
        other_shared_memory = SOME_CONSTANT;
    }
    /*2*/ shared_memory = SOME_CONSTANT;
    return NULL;
}
At the line marked 1, if for example you're toggling shared_memory between two values (SOME_CONSTANT and SOME_CONSTANT_2), then since the reads/writes aren't atomic you MAY read something different from the two constants used.
At the line marked 2, it's the same: you can't be sure that you won't be interrupted by another write, and you may end up with a value that is neither SOME_CONSTANT nor SOME_CONSTANT_2 but something else. Think about reading the upper part of one and the lower part of the other.
Also, you can "miss" a true condition at line #1, and therefore miss an update to other_shared_memory, or do it twice because the write at line #2 got messed up - so at the next test of line #1, the value will differ from SOME_CONSTANT and you'll do an unwanted update.
All this depends on several factors, like:
Are your writes/reads atomic anyway, despite not being explicitly atomic?
Can your threads really be interrupted between lines #1 and #2, or are you "protected" (hmm...) by the scheduler/priorities?
Is the shared memory tolerant of multiple concurrent accesses, or will you lock up the chip that controls it if you make such an attempt?
You can't answer? That's why it's undefined behavior...
In your particular situation, it MAY work. Or not. Or it fails on my machine while working on yours.
"Undefined behavior" is usually not properly understood. What it really means is: "You cannot predict nor guarantee what the behavior will be for ALL possible platforms".
Nothing more, nothing less. It's not a guarantee of having problems; it's the absence of a guarantee of NOT having them. It may sound like a subtle difference, but in fact it's a huge one.
By "platform", we mean the tuple built from:
An execution machine, including all currently running software,
An operating system, including its version and installed components,
A compiler chain, including its version and switches,
A build system, including all possible flags passed to compiler chain.
But UB doesn't mean "your program will act randomly"... A given set of CPU instructions will always produce the same result (under the same initial conditions); there is no randomness here. Obviously, they can be the wrong instructions for the problem you wish to solve, but the behavior is reproducible. That's how we hunt bugs, BTW...
So, on a fixed platform, having UB means "you can't predict what will happen", and in no way "you'll face pure randomness". In fact, a LOT of programs even exploit UB, because its effects are known on that particular platform and it's easier/cheaper/faster than doing it the proper way.
Or because, even if it's officially UB, your compiler ends up doing the same thing as the others (e.g. there is UB when downcasting an integer to a smaller signed integer, and char is usually signed... Nearly nobody cares.).
But once your code is written, you'll know what the behavior is: it won't be undefined anymore... For YOUR platform, and ONLY this platform. Update your OS or your compiler, launch another program that can mess with the scheduling, use a faster CPU, and you MUST test again to check whether the behavior is still the same. And that's why UB is annoying: it can work NOW, and cause a tricky bug a bit later.
It's one of the major reasons why industrial software often uses "old" OSes and/or compilers: upgrading them carries a HIGH risk of triggering/causing a bug, because an update corrected what was a real bug, but the project's code exploited this bug (maybe unknowingly!) and the updated software now crashes... Or worse, can destroy some hardware!
We're in 2022, and I still have a project that uses an embedded 2008 Linux, with GCC 3, VS2008, C++98, and WinXP/Qt4 on users' machines. The project is actively maintained - and trust me, it's a pain. But upgrade the software/platform? No way. Better to deal with known bugs than to discover new ones. Cheaper, too.
One of my specialties is porting software, mostly from "old" platforms to new ones (often with 10 years or more between the two). I've faced this kind of thing a LOT of times: it worked on the old platform, it breaks on the new one, and only because UB was exploited back then, and now the behavior (still undefined...) is not the same anymore.
I'm obviously not talking about changing the C/C++ standard, or the machine's endianness, where you need to rewrite code anyway, or dealing with new OS features (like UAC on Windows). I'm talking about "normal" code, compiled without any warnings, that behaves differently now and then. And you can't imagine how frequent this is, since no compiler will warn you about either high-level UB (for example, non-thread-safe functions) or instruction-level UB (a simple cast or alias can fully hide it without ANY warning).

Are race conditions that write the same value safe?

Suppose I have some multi-threaded program using shared memory, in which multiple threads are at random times overwriting the value of some multi-byte variable (e.g. an int or a double), sometimes colliding with each other (a.k.a. a race condition), and reading the value from the same variable at random times too.
Assuming all the threads always write the same value to the memory address (e.g. each thread does x = 1000) - if a thread reads the variable at the exact moment that another thread(s) is/are overwriting it, is the variable guaranteed to have the correct value? or could the memory somehow get overwritten with something random?
That is, if all the threads always write x = 1000, can a thread read x and get something other than 1000?
Assuming all the threads always write the same value to the memory
address (e.g. each thread does x = 1000) - if a thread reads the
variable at the exact moment that another thread(s) is/are overwriting
it, is the variable guaranteed to have the correct value?
The C language specifications expressly decline to make such a guarantee by declaring the behavior of programs containing race conditions to be undefined. And you're right that without synchronization, the situation you describe is a race condition, notwithstanding whether the value being written is the same as the initial contents of the memory.
or could
the memory somehow get overwritten with something random?
The behavior is undefined. In principle, anything could happen, including the read seeing a value that was never stored at the location in question.
Note also that the race is not about any kind of objective simultaneity. Rather, it is about lack of synchronization that would prevent simultaneous access, regardless of whether any simultaneous access actually occurs.
In practice, you would probably find that on some implementations, under at least some circumstances, writes that do not change the contents of memory act as if they did not conflict with each other or with reads that happen after that value was first written to the location, where "happens after" is a technical term that depends in part on synchronization. I do not recommend depending on such behavior, however. Not even if it is documented.
The C Standard allows implementations which can cheaply offer stronger guarantees than what it mandates to do so. Consequently, the Standard bends over backward to avoid requiring implementations to uphold any guarantees whose costs might sometimes exceed their benefits, leaving the question of how and when to uphold guarantees whose benefits would exceed the costs as a Quality of Implementation outside the Standard's jurisdiction.
Although it would seem like it should cost nothing to guarantee that writing an object with the value it already contains would have no effect, upholding such a guarantee would sometimes require foregoing some potentially-useful optimizations. As a simple example, consider the following function
volatile int zz;

unsigned test(unsigned *p, unsigned *q)
{
    unsigned temp;
    *p = 0x1234;
    temp = *q;
    zz = 1;
    do {} while(zz);
    *p = 0x1235;
    return temp;
}
On some platforms, including the original 8088/8086, the most efficient way to process the code may be to replace the last assignment to *p with *p += 1; which could then be processed using an inc instruction. If the code were executed in two threads simultaneously, however, that could cause *p to be left holding the value 0x1236.
In many cases, upholding a guarantee that writing an object with the value it already contains would cost nothing, while treating race conditions involving such writes as benign would eliminate the cost of some synchronization actions that would be rendered unnecessary. Unfortunately, while the Standard allows implementations to offer guarantees beyond what the Standard requires when doing so would be practical and useful, it provides no means of distinguishing implementations that offer such guarantees from those that don't.

why reading a variable modified by other threads can be neither old value nor new value

It has been mentioned by several, for example here c++ what happens when in one thread write and in second read the same object? (is it safe?)
that if two threads are operating on the same variable without atomics and lock, reading the variable can return neither the old value nor the new value.
I don't understand why this can happen, and I cannot find an example of such things happening. I think a load or store is always a single instruction which will not be interrupted, so why can this happen?
For one example, C may be implemented on hardware which supports only 16-bit accesses to memory. In this case, loading or storing a 32-bit integer requires two load or store instructions. A thread performing these two instructions may be interrupted between their executions, and another thread may execute before the first thread is resumed. If that other thread loads, it may load one new part and one old part. If it stores, it may store both parts, and the first thread, when resumed, will see one old part and one new part. And other such mixes are possible.
From a language-lawyer point of view (i.e. in terms of what the C or C++ spec says, without considering any particular hardware the program might be running on), operations are either defined or undefined, and if operations are undefined, then the program is allowed to do literally anything it wants to, because they don't want to constrain the performance of the language by forcing compiler writers to support any particular behavior for operations that the programmer should never allow to happen anyway.
From a practical standpoint, the most likely scenario (on common hardware) where you'd read a value that is neither-old-nor-new would be the "word-tearing" scenario; where (broadly speaking) the other thread has written to part of the variable at the instant your thread reads from it, but not to the other part, so you get half of the old value and half of the new value.
It has been mentioned by several, for example here c++ what happens when in one thread write and in second read the same object? (is it safe?) that if two threads are operating on the same variable without atomics and lock, reading the variable can return neither the old value nor the new value.
Correct. Undefined behavior is undefined.
I don't understand why this can happen and I cannot find an example such things happen, I think load and store is always one single instruction which will not be interrupted, then why can this happen?
Because undefined behavior is undefined. There is no requirement that you be able to think of any way it can go wrong. Do not ever think that because you can't think of some way something can break, that means it can't break.
For example, say there's a function that has an unsynchronized read in it. The compiler could conclude that therefore this function can never be called. If it's the only function that could modify a variable, then the compiler could omit reads to that variable. For example:
int j = 12;
// This is the only code that modifies j
int q = some_variable_another_thread_is_writing;
j = 0;
// other code
if (j != 12) important_function();
Since the only code that modifies j reads a variable another thread is writing, the compiler is free to assume that code will never execute, thus j will always be 12, and thus the test of j and the call to important_function can be optimized out. Ouch.
Here's another example:
if (some_function()) j = 0;
else j = 1;
If the implementation thinks that some_function will almost always return true and can prove some_function cannot access j, it is perfectly legal for it to optimize this to:
j = 0;
if (!some_function()) j++;
This will cause your code to break horribly if other threads mess with j without a lock or j is not a type defined to be atomic.
And do not ever think that some compiler optimization, though legal, will never happen. That has burned people over and over again as compilers get smarter.

Is Setting a Variable Atomic in THESE conditions

I have this situation where I have a state variable, int state, which takes the values 2, 1, or 0,
and an infinite loop:
ret = BoolCompareAndSwap(state, 1, 2);
if (ret) {
    // Change something ...
    state = 0;
}
Would this state setting be atomic?
Assuming that to set a variable you must:
Read it from memory
Change the value
Write the new value back
then if some other thread came and compared the variable, the operation would look atomic, since the actual value doesn't change until it is re-set in memory. Correct?
Strictly speaking, C compilers would still be standard conforming if they wrote the state bitwise. After writing the first few bits, some other thread can read any kind of garbage.
Most compilers do no such thing (with the possible exception of compilers for ancient 4-bit processors or even narrower ...), because it would be a performance loss.
Also, and more practically relevant: if any other thread writes to the state (instead of only reading it), that written value can get lost if you do not protect the described code against race conditions.
As a side note, the described state change (read, modify, write) is never atomic. The question of when that non-atomicity is exploitable, however, is valid, and it is what I tried to answer above.
More generally speaking, thinking through all possible combinations of concurrent access is a valid protection mechanism. It is however extremely costly in many ways (design effort, test effort, risk during maintenance...).
Only if those costs are in total smaller than the intended saving (possibly performance) is it feasible to go that way instead of using an appropriate protection mechanism.

One thread reading and another writing to volatile variable - thread-safe?

In C I have a pointer that is declared volatile and initialized null.
void* volatile pvoid;
Thread 1 is occasionally reading the pointer value to check if it is non-null. Thread 1 will not set the value of the pointer.
Thread 2 will set the value of a pointer just once.
I believe I can get away without using a mutex or condition variable.
Is there any reason thread 1 will read a corrupted value or thread 2 will write a corrupted value?
To make it thread safe, you have to make atomic reads/writes to the variable, it being volatile is not safe in all timing situations. Under Win32 there are the Interlocked functions, under Linux you can build it yourself with assembly if you do not want to use the heavy weight mutexes and conditional variables.
If you are not against GPL then http://www.threadingbuildingblocks.org and its atomic<> template seems promising. The lib is cross platform.
In the case where the value fits in a single register, such as a memory aligned pointer, this is safe. In other cases where it might take more than one instruction to read or write the value, the read thread could get corrupted data. If you are not sure wether the read and write will take a single instruction in all usage scenarios, use atomic reads and writes.
Depends on your compiler, architecture and operating system. POSIX (since this question was tagged pthreads, I'm assuming we're not talking about Windows or some other threading model) and C don't give enough constraints to have a portable answer to this question.
The safe assumption is of course to protect the access to the pointer with a mutex. However based on your description of the problem I wonder if pthread_once wouldn't be a better way to go. Granted there's not enough information in the question to say one way or the other.
Unfortunately, you cannot portably make any assumptions about what is atomic in pure C.
GCC, however, does provide some atomic built-in functions that take care of using the proper instructions for many architectures for you. See Chapter 5.47 of the GCC manual for more information.
Well, this seems fine... The only problem will happen in this case:
Let thread A be your checking thread and B the modifying one.
The thing is that checking for equality is not technically atomic: first the value is copied to a register, then checked. Let's assume thread A has copied the value to a register; now B decides to change the variable, and its value changes. So when control goes back to A, it will report a stale result: it may say the pointer is still null even though it has just been set. This seems harmless in this program but MIGHT cause problems...
Use a mutex. Simple enough, and you can be sure you don't have sync errors!
On most platforms, where a pointer value can be read/written in a single instruction, it's either set or it isn't set yet. It can't be interrupted in the middle and contain a corrupted value. A mutex isn't needed on that kind of platform.
