Compiler related - are these two C codes really identical? - c

In a multi-threaded or RTOS environment, are the two code snippets below identical?
I believe they are not. But is the first one absolutely safe in a multi-threaded environment? Is there a rule that makes the compiler keep 'ga' in a register and not read 'ga' again later in func_a()?
I know I can use a lock, but this is not a question about how to protect the data. It is just a question about compiler behaviour.
// ga is a global variable.
int func_a() {
    int a = ga;
    return a>2 ? a-2 : 2-a;
}
int func_b() {
    return ga>2 ? ga-2 : 2-ga;
}
My intention is to find a standard (not platform-specific) way to read ga only once and assign its value to a local variable 'a'.
'a' can then be used consistently regardless of whether 'ga' has changed.

Both these versions of code have undefined behaviour in the face of multiple threads executing the functions. Certainly different compilers can do different things regarding saving the global variable into registers, or not. What's more, there's no guarantee that assigning to a local variable can be done in an atomic way with respect to threads that are mutating the global variable.

There is no rule in the C standard that requires the compiler to implement those functions differently. For example, when working with registers, the compiler may or may not 'optimize out' the assignment from ga to a. By 'optimize out', I mean: load ga into a register, then use that same register as a for the rest of the computation.
If you want to implement a lock-free data structure:
C99 offers nothing that can help you.
C11 (very recent standard) offers you atomic data types.
If you are using C99, then you either need to:
Use locks (and hence, not lock-free code)
Be ready to write architecture specific code. The least you need to do is use a minimal set of atomic operations, as done in this library that implements lock-free data structures using atomic operations provided by the x86, x86_64, and ARM ISAs.
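For the C11 route, a minimal sketch might look like the following. It assumes ga can be redeclared as atomic_int, which is an assumption and not something the question states. atomic_load performs a single indivisible read, and everything after it works on the private copy.

```c
#include <stdatomic.h>

/* Assumption: ga is redeclared as a C11 atomic; the question used a plain int. */
atomic_int ga = 5;

int func_a(void) {
    /* One indivisible, sequentially consistent read of ga. */
    int a = atomic_load(&ga);
    /* From here on only the private copy is used, so later writes to ga
       by other threads cannot change the result of this call. */
    return a > 2 ? a - 2 : 2 - a;
}
```

This needs a C11 toolchain that provides <stdatomic.h>; with C99 only, the options listed above still apply.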
In an earlier version of this answer, I touched upon a side issue (which has to do with volatile, and which is really not relevant to your real question):
There is one case that can put a restriction on how func_b is implemented, but I am actually going off on a tangent here: If ga is declared as a volatile.
If ga is volatile, then each read of ga must load it from memory afresh. That is, in func_b, ga will be loaded from memory two times: once for the comparison, and once to calculate the return value. The expected use is a case where, for example, ga refers to a memory-mapped I/O port. If the value of ga changes between the two reads, this will be reflected in the return value. However, if you change ga in another thread, don't expect sane or defined behavior.
On the other hand, not having a volatile qualifier does not mean that ga will be read exactly once in func_b. And there is no qualifier that is the 'opposite of volatile'.

The behaviour depends on which compiler you're using; every compiler has its own rules regarding optimisation.

The two snippets are likely going to end up with identical machine code. Neither of them is safe in a multi-thread case.
volatile would force the creation of a temporary variable, but since the copy from "ga" into the volatile variable is not guaranteed to be atomic, this is not thread-safe.
The only safe way to write such code is with guards:
int func_a() {
    mtx_lock(&ga_mutex);
    int a = ga;
    mtx_unlock(&ga_mutex);
    return a>2 ? a-2 : 2-a;
}

Related

Is it good practice to make a global variable always volatile?

I know what volatile means. What I want to ask is: if my variable is global, is it good practice to make it volatile, even if I don't interface with hardware?
Header:
typedef struct
{
    int Value;
} Var_;
extern volatile Var_ myVariable;
Source:
volatile Var_ myVariable;
No. If you’re writing multi-threaded code, you want to use atomic variables, not volatile. For example, many concurrent structures need to be kept consistent, not modified one word at a time.
If no other thread, process or hardware is modifying the variable, you should not use either atomics or volatile. It will just complicate the program, run slower, and disable certain APIs for no reason.
The volatile keyword has historically been used for a few different things (such as telling the compiler not to optimize away a delay loop), but its purpose in C11 is narrow: to specify that a value in memory will change by some means that doesn’t follow the rules of atomics. You need it to write some kinds of device drivers, but it’s discouraged even in other low-level code such as OS kernels.
No, it is not good practice. volatile informs the C implementation (largely the compiler) that an object may be changed by something outside of the C implementation or that accesses to the object within the C implementation may have desired effects outside the C implementation. As long as your global object is only used and modified inside your own program, it has no volatile effects, and declaring it with volatile causes the compiler to suppress optimization and to generate unnecessary accesses to it within your program.

Volatile and its harmful implications

I am an embedded developer and use the volatile keyword when working with I/O ports. My project manager suggested that using volatile is harmful and has a lot of drawbacks, but I find that in most cases volatile is useful in embedded programming. As far as I know, volatile is considered harmful in kernel code because the behaviour of the code becomes unpredictable. Are there any drawbacks to using volatile in embedded systems as well?
No, volatile is not harmful. In any situation. Ever. There is no possible well-formed piece of code that will break with the addition of volatile to an object (and pointers to that object). However, volatile is often poorly understood. The reason the kernel docs state that volatile is to be considered harmful is that people kept using it for synchronization between kernel threads in broken ways. In particular, they used volatile integer variables as though access to them was guaranteed to be atomic, which it isn't.
volatile is also not useless, and particularly if you go bare-metal, you will need it. But, like any other tool, it is important to understand the semantics of volatile before using it.
What volatile is
Access to volatile objects is, in the standard, considered a side-effect in the same way as incrementing or decrementing by ++ and --. In particular, this means that 5.1.2.3 (3), which says
(...) An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects are produced (including any caused by calling a function or accessing a volatile object)
does not apply. The compiler has to chuck out everything it thinks it knows about the value of a volatile variable at every sequence point. (like other side-effects, when access to volatile objects happens is governed by sequence points)
The effect of this is largely the prohibition of certain optimizations. Take, for example, the code
int i;
void foo(void) {
    i = 0;
    while(i == 0) {
        // do stuff that does not touch i
    }
}
The compiler is allowed to make this into an infinite loop that never checks i again because it can deduce that the value of i is not changed in the loop, and thus that i == 0 will never be false. This holds true even if there is another thread or an interrupt handler that could conceivably change i. The compiler does not know about them, and it does not care. It is explicitly allowed to not care.
Contrast this with
int volatile i;
void foo(void) {
    i = 0;
    while(i == 0) { // Note: This is still broken, only a little less so.
        // do stuff that does not touch i
    }
}
Now the compiler has to assume that i can change at any time and cannot do this optimization. This means, of course, that if you deal with interrupt handlers and threads, volatile objects are necessary for synchronisation. They are not, however, sufficient.
What volatile isn't
What volatile does not guarantee is atomic access. This should make intuitive sense if you're used to embedded programming. Consider, if you will, the following piece of code for an 8-bit AVR MCU:
uint32_t volatile i;
ISR(TIMER0_OVF_vect) {
    ++i;
}
void some_function_in_the_main_loop(void) {
    for(;;) {
        do_something_with(i); // This is thoroughly broken.
    }
}
The reason this code is broken is that access to i is not atomic -- cannot be atomic on an 8-bit MCU. In this simple case, for example, the following might happen:
i is 0x0000ffff
do_something_with(i) is about to be called
the high two bytes of i are copied into the parameter slot for this call
at this point, timer 0 overflows and the main loop is interrupted
the ISR changes i. The lower two bytes of i overflow and are now 0. i is now 0x00010000.
the main loop continues, and the lower two bytes of i are copied into the parameter slot
do_something_with is called with 0 as its parameter.
Similar things can happen on PCs and other platforms. If anything, a more complex architecture opens up more ways for this to fail.
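The torn read described in the steps above can be reproduced on any machine with a small simulation. This is only a sketch: shared_i, fake_isr and torn_read are hypothetical names, and the "interrupt" is an ordinary function call placed between the two 16-bit half reads.

```c
#include <stdint.h>

/* Simulated shared counter; on the AVR this would be the volatile uint32_t i. */
uint32_t shared_i = 0x0000ffffu;

/* Stand-in for the timer overflow ISR: it just increments the counter. */
void fake_isr(void) {
    ++shared_i;
}

/* Reads shared_i in two 16-bit halves, with the "interrupt" firing in between. */
uint32_t torn_read(void) {
    uint16_t high = (uint16_t)(shared_i >> 16);      /* high half read first: 0x0000 */
    fake_isr();                                      /* counter becomes 0x00010000   */
    uint16_t low = (uint16_t)(shared_i & 0xffffu);   /* low half read second: 0x0000 */
    return ((uint32_t)high << 16) | low;             /* combined value: 0            */
}
```

The combined value is 0, which the counter never held: it went straight from 0x0000ffff to 0x00010000.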
Takeaway
So no, using volatile is not bad, and you will (often) have to do it in bare-metal code. However, when you do use it, you have to keep in mind that it is not a magic wand, and that you will still have to make sure you don't trip over yourself. In embedded code, there's often a platform-specific way to handle the problem of atomicity; in the case of AVR, for example, the usual crowbar method is to disable interrupts for the duration, as in
uint32_t x;
ATOMIC_BLOCK(ATOMIC_RESTORESTATE) {
    x = i;
}
do_something_with(x);
...where the ATOMIC_BLOCK macro calls cli() (disable interrupts) before and sei() (enable interrupts) afterwards if they were enabled beforehand.
With C11, which is the first C standard that explicitly acknowledges the existence of multithreading, a new family of atomic types and memory fencing operations has been introduced that can be used for inter-thread synchronisation and in many cases makes the use of volatile unnecessary. If you can use those, do it, but it'll likely be some time before they reach all common embedded toolchains. With them, the loop above could be fixed like this:
atomic_int i;
void foo(void) {
    atomic_store(&i, 0);
    while(atomic_load(&i) == 0) {
        // do stuff that does not touch i
    }
}
...in its most basic form. The precise semantics of the more relaxed memory order semantics go way beyond the scope of a SO answer, so I'll stick with the default sequentially consistent stuff here.
If you're interested in it, Gil Hamilton provided a link in the comments to an explanation of a lock-free stack implementation using C11 atomics, although I don't feel it's a terribly good write-up of the memory order semantics themselves. The C11 model does, however, appear to closely mirror the C++11 memory model, of which a useful presentation exists here. If I find a link to a C11-specific write-up, I will put it here later.
volatile is only useful when the so qualified object can change asynchronously. Such changes can happen
if the object is in fact a hardware I/O register or similar that is changed from outside your program
if the object might be changed by a signal handler
if the object is changed between calls to setjmp and longjmp
In all these cases you must declare your object volatile, otherwise your program will not work correctly. (And you might notice that objects shared between different threads are not in the list.)
In all other cases you shouldn't, because you may be missing optimization opportunities. On the other hand, qualifying an object volatile that doesn't fall under the points above will not make your code incorrect.
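The signal-handler case from the list can be sketched as follows. The names got_signal, on_sigint and wait_for_flag are made up for illustration, and the signal is raised synchronously with raise() so that the example is deterministic.

```c
#include <signal.h>

/* volatile sig_atomic_t is the type the C standard designates for a flag
   that is written by a signal handler and read by the main program. */
static volatile sig_atomic_t got_signal = 0;

static void on_sigint(int signo) {
    (void)signo;
    got_signal = 1;
}

/* Installs the handler, raises SIGINT, and returns the flag value (1),
   or -1 if the handler could not be installed. */
int wait_for_flag(void) {
    got_signal = 0;
    if (signal(SIGINT, on_sigint) == SIG_ERR)
        return -1;
    raise(SIGINT); /* the handler runs before raise() returns */
    while (!got_signal) {
        /* Without volatile, the compiler could hoist this read out of the loop. */
    }
    signal(SIGINT, SIG_DFL); /* restore the default disposition */
    return (int)got_signal;
}
```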
Not using volatile where necessary and appropriate is far more likely to be harmful! The solution to any perceived problems with volatile is not to ban its use, because there are a number of cases where it is necessary for safe and correct semantics. Rather the solution is to understand its purpose and its behaviour.
It is essential for any data that may be changed outside of the knowledge of the compiler, such as I/O and dual-ported or DMA memory. It is also necessary for access to memory shared between execution contexts such as threads and interrupt handlers. This is perhaps where the confusion lies: volatile ensures an explicit read of such memory, but it does not enforce atomicity or mutual exclusion. Additional mechanisms are required for that. This does not preclude volatile; it is merely one part of the solution to shared memory access.
See the following articles on the use of volatile (and send them to your project manager too!):
Place volatile accurately by Dan Saks.
Introduction to the volatile keyword by Nigel Jones
Guidelines for handling volatile variables by Colin Walls
Combining C's volatile and const keywords - Michael Barr
Volatile tells the compiler not to optimize anything that has to do with the volatile variable.
Why the "volatile" type class should not be used? - Best article in Kernel doc
https://www.kernel.org/doc/Documentation/volatile-considered-harmful.txt
volatile is a keyword in C that tells the compiler not to optimize accesses to that variable.
Let me give you a simple example:
int temp;
for ( i=0 ;i <5 ; i++ )
{
    temp = 5;
}
What the compiler may do to optimize the code:
int temp;
temp = 5; /* temp is assigned once, before the loop. */
for ( i=0 ;i <5 ; i++ )
{
}
But if we use the volatile keyword, the compiler will not optimize away any access to the temp variable.
volatile int temp;
for ( i=0 ;i <5 ; i++ )
{
    temp = 5;
}
"Volatile Considered Harmful" ---> I don't consider volatile harmful. You use volatile where you don't want the compiler to optimize accesses to a variable.
For example, suppose this piece of code is used by a thermometer company and temp is a variable that holds the temperature of the atmosphere, which can change at any time. If we do not use volatile, the compiler may optimize the accesses and the reported temperature will always be the same.

Alternative to volatile?

I'm using a lot of volatile variables in my embedded firmware, but most of the time there is only one point in a function where I need to be sure the value is recent (at the start). The rest of the function refers to the same variable name, and the value can change in the meantime, producing very unexpected code flow and results. I know this can be solved by using a temporary variable inside the function, but I was looking for a better solution.
Now I was wondering, instead of marking the whole variable as volatile, is there a way I could instruct the compiler (gcc) with a special keyword that I want to read the variable as if it was marked volatile, so I can use that keyword only once at the beginning of the function?
I'm a little confused about the scenario - if it's that you want one particular access to a variable to be treated as volatile, use
dest = *(volatile TYPE *)&src;
where TYPE is the type of src. You may also need
asm volatile ("" ::: "memory");
in carefully controlled locations, to prevent the compiler from moving loads/stores of other memory locations across the volatile read.
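Put together, the two pieces above might be used as in the sketch below. The names sensor_value, READ_ONCE_INT, COMPILER_BARRIER and process_sensor are illustrative assumptions, and the barrier is only defined for GCC-compatible compilers. Note that the volatile cast forces exactly one read but does not make that read atomic.

```c
/* Hypothetical shared variable, deliberately NOT declared volatile. */
int sensor_value = 7;

/* One forced read through a volatile-qualified lvalue (illustrative macro name). */
#define READ_ONCE_INT(x) (*(volatile int *)&(x))

#if defined(__GNUC__)
/* GCC/Clang compiler barrier: emits no instructions, but the compiler will not
   move other memory accesses across it. */
#define COMPILER_BARRIER() __asm__ volatile ("" ::: "memory")
#else
/* Assumption: on other compilers this sketch simply omits the barrier. */
#define COMPILER_BARRIER() ((void)0)
#endif

int process_sensor(void) {
    int snapshot = READ_ONCE_INT(sensor_value); /* the only read of the shared variable */
    COMPILER_BARRIER();
    /* All later uses refer to the private snapshot, so the value stays consistent. */
    return snapshot > 2 ? snapshot - 2 : 2 - snapshot;
}
```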
Also investigate C11's _Atomic types. (I'm not sure if GCC supports these yet.)
If your variable is in memory and your embedded system supports it, you could use memory barriers to make sure the read is not reordered with the other memory accesses around it.

Restrictions on non volatile variables in C

I would like to understand what restrictions, if any, the compiler has with regard to non-volatile variables in C.
I'm not sure if it's true or not, but I've been told that if you have the following code:
int x;
...
void update_x() {
    lock();
    x = x*5+3;
    unlock();
}
You must acquire the lock to read x because, even though the compiler is unlikely to do so, it is technically legal for it to store an intermediate result such as x*5 into x, so a reader might see an intermediate value. So my first question is whether this is indeed the case. If not, why not?
If it is, I have a follow-up question: is there anything that prevents the compiler from using x as temporary storage before or after taking the lock (assuming the compiler can prove that a single thread executing the program will not notice it)?
If not, does that mean that any program that has non-volatile shared variables is technically undefined even if all the accesses are protected by locks?
Thanks,
Ilya
Prior to C11, the answer is No, as the spec doesn't define anything about what multiple threads do, so any program that uses multiple threads where one thread writes an object and another thread reads it is undefined behavior.
With C11, there's actually a memory model that talks about multiple threads and data races, so the answer is Yes, as long as the lock/unlock routines do certain synchronization operations (involving either library functions that do the synchronization or operations on special _Atomic objects).
Since the C11 spec is attempting to codify the behavior of existing implementations (for the most part), it is likely that any code that does what it requires (i.e., using an implementation-provided library for locking, or implementation-provided extensions for atomic operations) will work correctly even on pre-C11 implementations.
Section 5.1.2.4 of the C11 spec ("Multi-threaded executions and data races") covers this.
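As a rough illustration of lock and unlock routines built on operations on _Atomic objects, here is a sketch that uses a C11 atomic_flag as a spinlock. The names lock_x, unlock_x and read_x are hypothetical stand-ins for the question's lock() and unlock(); a real program would more likely use a library mutex.

```c
#include <stdatomic.h>

static atomic_flag x_guard = ATOMIC_FLAG_INIT;
int x = 1; /* plain, non-volatile shared variable */

/* Acquire: spin until the flag was previously clear. */
static void lock_x(void) {
    while (atomic_flag_test_and_set_explicit(&x_guard, memory_order_acquire)) {
        /* busy-wait until the current holder releases the lock */
    }
}

/* Release: makes all writes done under the lock visible to the next acquirer. */
static void unlock_x(void) {
    atomic_flag_clear_explicit(&x_guard, memory_order_release);
}

void update_x(void) {
    lock_x();
    x = x * 5 + 3;
    unlock_x();
}

/* Readers take the same lock, so they never race with update_x. */
int read_x(void) {
    lock_x();
    int value = x;
    unlock_x();
    return value;
}
```

Because every access to x happens between an acquire and a release on the same atomic object, the accesses are ordered by the C11 memory model and there is no data race, even though x itself is neither volatile nor atomic.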

Can an ANSI C compiler remove a delay loop?

Consider a while loop in ANSI C whose only purpose is to delay execution:
unsigned long counter = DELAY_COUNT;
while(counter--);
I've seen this used a lot to enforce delays on embedded systems, where eg. there is no sleep function and timers or interrupts are limited.
My reading of the ANSI C standard is that this can be completely removed by a conforming compiler. It has none of the side effects described in 5.1.2.3:
Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment.
...and this section also says:
An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects are produced (including any caused by calling a function or accessing a volatile object).
Does this imply that the loop could be optimised out? Even if counter were volatile?
Notes:
This is not quite the same as Are compilers allowed to eliminate infinite loops?, because that refers to infinite loops, and questions arise about when a program is allowed to terminate at all. In this case, the program will certainly proceed past this line at some point, optimisation or not.
I know what GCC does (removes the loop for -O1 or higher, unless counter is volatile), but I want to know what the standard dictates.
C standard compliance follows the "as-if" rule, by which the compiler can generate any code that behaves "as if" it was running your actual instructions on the abstract machine. Since not performing any operations has the same observable behaviour "as if" you did perform the loop, it's entirely permissible to not generate code for it.
In other words, the time something takes to compute on a real machine is not part of the "observable" behaviour of your program, it is merely a phenomenon of a particular implementation.
The situation is different for volatile variables, since accessing a volatile counts as an "observable" effect.
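A sketch of the volatile variant follows. The function name busy_wait and the returned iteration count are additions for illustration only. Because every read and write of counter is an access to a volatile object, the compiler has to perform all of them.

```c
/* Busy-waits for `count` iterations and returns how many were executed. */
unsigned long busy_wait(unsigned long count) {
    /* volatile: each access below counts as observable behaviour,
       so the loop cannot be removed under the as-if rule. */
    volatile unsigned long counter = count;
    unsigned long iterations = 0;
    while (counter != 0) {
        counter = counter - 1;
        ++iterations;
    }
    return iterations;
}
```

This only guarantees that the loop is executed; how long it takes still depends on the compiler, the optimisation level and the clock speed.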
Does this imply that the loop could be optimised out?
Yes.
Even if counter were volatile?
No. It would read and write a volatile variable, which has observable behavior, so it must occur.
If the counter is volatile, the compiler cannot legally optimize out the delay loop. Otherwise it can.
Delay loops like this are bad because the time they burn depends on how the compiler generates code for them. Using different optimization options you can achieve different delays, which is hardly what one wants from a delay loop.
For this reason such delay loops should be implemented in assembly language, where the programmer controls the code fully. This typically applies in embedded systems with simple CPUs.
The standard dictates the behavior you see. If you create a dependency tree for DELAY_COUNT, you see that it has a modify-without-use property, which means it can be eliminated. This refers to the non-volatile case. In the volatile case, the compiler cannot use the dependency tree to remove the variable, so the delay remains (volatile means that hardware can change the memory-mapped value or, in some cases, "I really need this, don't throw it away"). In the case you're looking at, labeling it volatile tells the compiler: please don't throw this away, it's here for a reason.
