I saw this in a part that never gets called in a coworker's code:
volatile unsigned char vol_flag = 0;
// ...
while(!vol_flag);
vol_flag is declared in the header file, but is never changed. Am I correct that this will lead to the program hanging in an infinite loop? Is there a way out of it?
Usually code like this indicates that vol_flag is expected to be changed externally at some point. Here, externally may mean a different thread, an interrupt handler, a piece of hardware (in the case of memory-mapped I/O), etc. This loop effectively waits for the external event which changes the flag.
The volatile keyword is a way for the programmer to express the fact that it is not safe to assume what is apparent from the code: namely that the flag is not changed in the loop. Thus, it prevents the compiler from making optimizations which could compromise the intentions behind the code. Instead, the compiler is forced to make a memory reference to fetch the value of the flag.
Note that (unlike in Java) volatile in C/C++ does not establish a happens-before relationship and does not guarantee any ordering or visibility of memory references across volatile accesses. Moreover, it does not ensure atomicity of variable accesses. Thus, it is not a tool for communication between threads. See this for details.
Related
Suppose I have a C file that uses a variable declared extern,
and in the code the variable is modified and then an external function is called.
Will compiler optimization take into account the possibility that the variable can be touched by the function, so that it won't reorder the C code and will make sure the variable is written to memory before the function is called?
You need to be careful of a number of things... but in a basic C application with a single thread, yes... this should be fine.
If, however you are (non-exhaustive):
Using multiple threads
Using shared memory between a number of processes
Running on a low-level system (e.g. AVR / STM32) and handling the variable both in an interrupt handler and under main()
Handling the variable in a signal handler and under main()
Reading memory / registers that are modified by hardware / DMA
Then you'll need to be careful.
The volatile keyword can be useful - it will inform the compiler that "this variable may change while you're not looking".
Even with the volatile keyword though, you may run into the Read-Modify-Write problem...
Knowing about the Read-Modify-Write problem is half the fight... the other half is mitigating it, which can be achieved in a number of ways, such as using Mutual Exclusion / Critical Sections or, if appropriate, copying the data into a local variable before you operate on the value.
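A minimal sketch of the Read-Modify-Write problem, with the interrupt simulated by an ordinary function call so that it runs on a desktop compiler. The names status_flags, fake_isr and set_bit7_unsafely are made up for illustration; on real hardware the interrupt would arrive asynchronously rather than at a fixed point:

```c
#include <stdint.h>

/* Shared between "main" and the "interrupt handler". */
static volatile uint8_t status_flags;

/* Stand-in for a real ISR (hypothetical): sets bit 0. */
static void fake_isr(void)
{
    status_flags |= 0x01u;
}

/* What `status_flags |= 0x80` boils down to on many CPUs: three separate
 * steps. The call to fake_isr() simulates an interrupt landing in between. */
uint8_t set_bit7_unsafely(void)
{
    uint8_t tmp = status_flags;   /* read */
    fake_isr();                   /* "interrupt" fires here */
    tmp |= 0x80u;                 /* modify the stale copy */
    status_flags = tmp;           /* write: the ISR's bit 0 is lost */
    return status_flags;
}
```

Starting from 0, the function returns 0x80 rather than 0x81, even though status_flags is volatile. Making the three steps a critical section is what closes the window.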
Suppose I have the following C code:
/* clock.c */
#include "clock.h"
static volatile uint32_t clock_ticks;
uint32_t get_clock_ticks(void)
{
return clock_ticks;
}
void clock_tick(void)
{
clock_ticks++;
}
Now I am calling clock_tick() (i.e. incrementing the clock_ticks variable) from an interrupt, while calling get_clock_ticks() from the main() function (i.e. outside the interrupt).
My understanding is that clock_ticks should be declared volatile, as otherwise the compiler could optimize its access and make main() think the value has not changed (while it actually was changed by the interrupt).
I wonder if using the get_clock_ticks(void) function there, instead of accessing the variable directly from main() (i.e. not declaring it static), can actually force the compiler to load the variable from memory even if it was not declared volatile.
I ask because someone told me this could be happening. Is it true? Under which conditions? Should I always use volatile anyway, no matter whether I use a "getter" function?
A getter function doesn't help in any way here over using volatile.
Assume the compiler sees you've just fetched the value two lines above and not changed it since then.
If it's a good optimizing compiler, I would expect it to see that the function call has no side effects and simply optimize out the function call.
If get_clock_ticks() were external (i.e. in a separate module), matters would be different (maybe that's what you remember).
Something that can change its value outside normal program flow (e.g. in an ISR) should always be declared volatile.
Don't forget that even if you currently compile the code declaring get_clock_ticks and the code using it as separate modules, perhaps one day you will use link-time or cross-module optimisation. Keep the "volatile" even though you are using a getter function - it will do no harm to the code generation in this case, and makes the code correct.
One thing you have not mentioned is the bit size of the processor. If it is not capable of reading a 32-bit value in a single operation, then your get_clock_ticks() will sometimes fail as the reads are not atomic.
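One workaround that is sometimes used when the read cannot be made atomic is to read the counter twice and retry until both reads agree. The sketch below is not the original poster's code; it assumes ticks arrive far less often than one pass of the loop takes, and disabling interrupts around the read is the more usual alternative:

```c
#include <stdint.h>

static volatile uint32_t clock_ticks;

/* Called from the (here imaginary) timer interrupt. */
void clock_tick(void)
{
    clock_ticks++;
}

/* Reads the counter twice and retries until both reads agree, so a
 * value torn by an interrupt in the middle of a read is never returned. */
uint32_t get_clock_ticks(void)
{
    uint32_t first;
    uint32_t second;

    do {
        first = clock_ticks;
        second = clock_ticks;
    } while (first != second);

    return first;
}
```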
Does the volatile keyword enforce visibility across threads? For example:
volatile int bar;
mutex mut;
void foo()
{
bar = 4;
// (*) Possible other thread changes to `bar`. No instructions here,
// just time that passes.
lock(&mut);
// (1) If 'bar' had _not_ been declared 'volatile', would the compiler
// be allowed to assume 'bar' is '4' here?
//
// (2) If 'bar' _is_ declared 'volatile', the compiler is
// forced to add the necessary instructions such that changes to
// 'bar' that may have occurred during (*) are visible here.
unlock(&mut);
}
Not asking about atomicity or ordering (I'm assuming any sane implementation of lock(mutex) adds the appropriate memory and compiler fences, where appropriate for the architecture) - simply a question of visibility.
Even if you don't tag bar as volatile, the compiler cannot be sure that the value hasn't been modified in the meanwhile since it is a global value.
So it has to read it again (mutex functions are called, it could be any function getting access to bar and change it), volatile or not.
It would be different for a local value, binding, say, on a hardware register that may change independently of the program execution, where the volatile keyword would be required.
(1) If 'bar' had not been declared 'volatile', would the compiler
be allowed to assume 'bar' is '4' here?
The property of volatile is really simple to understand: read it from its memory location every time, without optimising anything in relation to that. Whether or not there are multiple threads, a volatile-qualified variable still holds the same property.
Does it mean volatile enforces "visibility" across threads?
It may do so as a side effect of its property. But that shouldn't be necessary in a multi-threaded program. When a proper synchronisation primitive is used (e.g. a mutex or an atomic variable), a compiler must enforce the visibility of the last stored value in an object across different threads. This is the case in both C11 and POSIX threads. A compiler that supports multi-threaded programs should be able to generate correct code that enforces this without requiring volatile. So, the answer is no; you don't need volatile in multi-threaded programs to make changes to objects (variables) visible.
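A minimal sketch of that point using POSIX threads, with bar as a plain int. The function names set_bar and get_bar are invented for illustration, and the question's lock()/unlock() are assumed to correspond to pthread_mutex_lock()/pthread_mutex_unlock():

```c
#include <pthread.h>

static int bar;   /* plain int, deliberately not volatile */
static pthread_mutex_t mut = PTHREAD_MUTEX_INITIALIZER;

/* Every access to bar happens with the mutex held. */
void set_bar(int value)
{
    pthread_mutex_lock(&mut);
    bar = value;
    pthread_mutex_unlock(&mut);
}

int get_bar(void)
{
    pthread_mutex_lock(&mut);
    int value = bar;   /* sees the last store made under the same mutex */
    pthread_mutex_unlock(&mut);
    return value;
}
```

The visibility guarantee comes from the lock and unlock calls, not from any qualifier on bar; it holds only if every thread that touches bar goes through the mutex.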
I am an embedded developer and use the volatile keyword when working with I/O ports. My project manager suggested that using volatile is harmful and has a lot of drawbacks, but I find that in most cases volatile is useful in embedded programming. As far as I know, volatile is considered harmful in kernel code because the effects of changes to the code become unpredictable. Are there any drawbacks to using volatile in embedded systems as well?
No, volatile is not harmful. In any situation. Ever. There is no possible well-formed piece of code that will break with the addition of volatile to an object (and pointers to that object). However, volatile is often poorly understood. The reason the kernel docs state that volatile is to be considered harmful is that people kept using it for synchronization between kernel threads in broken ways. In particular, they used volatile integer variables as though access to them was guaranteed to be atomic, which it isn't.
volatile is also not useless, and particularly if you go bare-metal, you will need it. But, like any other tool, it is important to understand the semantics of volatile before using it.
What volatile is
Access to volatile objects is, in the standard, considered a side-effect in the same way as incrementing or decrementing by ++ and --. In particular, this means that 5.1.2.3 (3), which says
(...) An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects are produced (including any caused by calling a function or accessing a volatile object)
does not apply. The compiler has to chuck out everything it thinks it knows about the value of a volatile variable at every sequence point. (like other side-effects, when access to volatile objects happens is governed by sequence points)
The effect of this is largely the prohibition of certain optimizations. Take, for example, the code
int i;
void foo(void) {
i = 0;
while(i == 0) {
// do stuff that does not touch i
}
}
The compiler is allowed to make this into an infinite loop that never checks i again because it can deduce that the value of i is not changed in the loop, and thus that i == 0 will never be false. This holds true even if there is another thread or an interrupt handler that could conceivably change i. The compiler does not know about them, and it does not care. It is explicitly allowed to not care.
Contrast this with
int volatile i;
void foo(void) {
i = 0;
while(i == 0) { // Note: This is still broken, only a little less so.
// do stuff that does not touch i
}
}
Now the compiler has to assume that i can change at any time and cannot do this optimization. This means, of course, that if you deal with interrupt handlers and threads, volatile objects are necessary for synchronisation. They are not, however, sufficient.
What volatile isn't
What volatile does not guarantee is atomic access. This should make intuitive sense if you're used to embedded programming. Consider, if you will, the following piece of code for an 8-bit AVR MCU:
#include <stdint.h>
#include <avr/interrupt.h> /* avr-libc: provides the ISR() macro */
uint32_t volatile i;
ISR(TIMER0_OVF_vect) {
++i;
}
void some_function_in_the_main_loop(void) {
for(;;) {
do_something_with(i); // This is thoroughly broken.
}
}
The reason this code is broken is that access to i is not atomic -- cannot be atomic on an 8-bit MCU. In this simple case, for example, the following might happen:
i is 0x0000ffff
do_something_with(i) is about to be called
the high two bytes of i are copied into the parameter slot for this call
at this point, timer 0 overflows and the main loop is interrupted
the ISR changes i. The lower two bytes of i overflow and are now 0. i is now 0x00010000.
the main loop continues, and the lower two bytes of i are copied into the parameter slot
do_something_with is called with 0 as its parameter.
Similar things can happen on PCs and other platforms. If anything, a more complex architecture opens up more ways for this to fail.
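The sequence above can be replayed on any machine with a small simulation. This is only a sketch: the 32-bit counter is modelled as two 16-bit halves, and the interrupt is an ordinary function call (fake_timer_isr, a made-up name) placed exactly between the two partial reads:

```c
#include <stdint.h>

/* The 32-bit counter as a narrow CPU sees it: two halves.
 * Together they start out as 0x0000ffff. */
static volatile uint16_t i_high = 0x0000u;
static volatile uint16_t i_low  = 0xffffu;

/* Stand-in for the timer ISR: ++i with a carry into the high half. */
static void fake_timer_isr(void)
{
    i_low = (uint16_t)(i_low + 1u);
    if (i_low == 0u) {
        i_high = (uint16_t)(i_high + 1u);
    }
}

/* Reads the high half, gets "interrupted", then reads the low half. */
uint32_t torn_read(void)
{
    uint32_t high = i_high;
    fake_timer_isr();
    uint32_t low = i_low;
    return (high << 16) | low;
}

/* The value the counter really holds. */
uint32_t current_value(void)
{
    return ((uint32_t)i_high << 16) | i_low;
}
```

torn_read() yields 0 although the counter only ever held 0x0000ffff and 0x00010000, which is the failure described in the list.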
Takeaway
So no, using volatile is not bad, and you will (often) have to do it in bare-metal code. However, when you do use it, you have to keep in mind that it is not a magic wand, and that you will still have to make sure you don't trip over yourself. In embedded code, there's often a platform-specific way to handle the problem of atomicity; in the case of AVR, for example, the usual crowbar method is to disable interrupts for the duration, as in
uint32_t x;
ATOMIC_BLOCK(ATOMIC_RESTORESTATE) {
x = i;
}
do_something_with(x);
...where the ATOMIC_BLOCK macro calls cli() (disable interrupts) before and sei() (enable interrupts) afterwards if they were enabled beforehand.
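The save, disable, copy, restore pattern behind that macro can be sketched portably. Everything hardware-related here is a stand-in: irq_enabled models the global interrupt enable bit, and irq_save_and_disable() / irq_restore() are hypothetical helpers, not avr-libc functions:

```c
#include <stdbool.h>
#include <stdint.h>

static bool irq_enabled = true;            /* models the interrupt enable bit */
static volatile uint32_t shared_counter;   /* written by an ISR on real hardware */

/* Remembers whether interrupts were on, then turns them off. */
static bool irq_save_and_disable(void)
{
    bool was_enabled = irq_enabled;
    irq_enabled = false;
    return was_enabled;
}

/* Puts the enable bit back the way it was found. */
static void irq_restore(bool was_enabled)
{
    irq_enabled = was_enabled;
}

/* Copies the multi-byte value while no ISR can run. */
uint32_t read_counter_atomically(void)
{
    bool was_enabled = irq_save_and_disable();
    uint32_t copy = shared_counter;
    irq_restore(was_enabled);
    return copy;
}
```

Restoring the previous state, rather than unconditionally re-enabling, keeps the function safe to call from code that already runs with interrupts disabled.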
With C11, which is the first C standard that explicitly acknowledges the existence of multithreading, a new family of atomic types and memory fencing operations have been introduced that can be used for inter-thread synchronisation and in many cases make use of volatile unnecessary. If you can use those, do it, but it'll likely be some time before they reach all common embedded toolchains. With them, the loop above could be fixed like this:
#include <stdatomic.h>
atomic_int i;
void foo(void) {
atomic_store(&i, 0);
while(atomic_load(&i) == 0) {
// do stuff that does not touch i
}
}
...in its most basic form. The precise semantics of the more relaxed memory order semantics go way beyond the scope of a SO answer, so I'll stick with the default sequentially consistent stuff here.
If you're interested in it, Gil Hamilton provided a link in the comments to an explanation of a lock-free stack implementation using C11 atomics, although I don't feel it's a terribly good write-up of the memory order semantics themselves. The C11 model does, however, appear to closely mirror the C++11 memory model, of which a useful presentation exists here. If I find a link to a C11-specific write-up, I will put it here later.
volatile is only useful when the so qualified object can change asynchronously. Such changes can happen
if the object is in fact a hardware I/O register or similar that is changed from outside your program
if the object might be changed by a signal handler
if the object is changed between calls to setjmp and longjmp
In all these cases you must declare your object volatile, otherwise your program will not work correctly. (And you might notice that objects shared between different threads are not in the list.)
In all other cases you shouldn't, because you may be missing optimization opportunities. On the other hand, qualifying an object volatile that doesn't fall under the points above will not make your code incorrect.
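A small sketch of the signal-handler case, using only standard C. The function name raise_and_check is invented for the example; the signal is raised synchronously here so that the outcome is deterministic, whereas a real signal would arrive at an arbitrary time:

```c
#include <signal.h>

/* volatile sig_atomic_t is the type the C standard allows a handler to write. */
static volatile sig_atomic_t got_signal = 0;

static void on_sigint(int signum)
{
    (void)signum;
    got_signal = 1;
}

/* Installs the handler, raises SIGINT, and reports what the flag holds.
 * Returns -1 if the handler could not be installed or raised. */
int raise_and_check(void)
{
    if (signal(SIGINT, on_sigint) == SIG_ERR) {
        return -1;
    }
    if (raise(SIGINT) != 0) {
        return -1;
    }
    return got_signal;   /* re-read from memory because it is volatile */
}
```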
Not using volatile where necessary and appropriate is far more likely to be harmful! The solution to any perceived problems with volatile is not to ban its use, because there are a number of cases where it is necessary for safe and correct semantics. Rather the solution is to understand its purpose and its behaviour.
It is essential for any data that may be changed outside of the knowledge of the compiler, such as I/O and dual-ported or DMA memory. It is also necessary for access to memory shared between execution contexts such as threads and interrupt handlers. This is perhaps where the confusion lies: volatile ensures an explicit read of such memory, but it does not enforce atomicity or mutual exclusion. Additional mechanisms are required for that. That does not preclude volatile; it is merely one part of the solution to shared memory access.
See the following articles on the use of volatile (and send them to your project manager too!):
Place volatile accurately by Dan Saks.
Introduction to the volatile keyword by Nigel Jones
Guidelines for handling volatile variables by Colin Walls
Combining C's volatile and const keywords - Michael Barr
Volatile tells the compiler not to optimize anything that has to do with the volatile variable.
Why the "volatile" type class should not be used? - Best article in Kernel doc
https://www.kernel.org/doc/Documentation/volatile-considered-harmful.txt
volatile is a keyword in C which tells the compiler not to optimize away accesses to that variable.
Let me give you a simple example:
int temp;
for ( i=0 ;i <5 ; i++ )
{
temp = 5;
}
What the compiler may do to optimize the code:
int temp;
temp = 5; /* assigned temp variable before the loop. */
for ( i=0 ;i <5 ; i++ )
{
}
But if we use the volatile keyword, the compiler will not optimize away any access to temp:
volatile int temp;
for ( i=0 ;i <5 ; i++ )
{
temp = 5;
}
"Volatile Considered Harmful": I don't consider volatile harmful. You use volatile where you don't want the compiler to optimize accesses away.
For example, suppose this piece of code is used by a thermometer company and temp is a variable that holds the temperature of the atmosphere, which can change at any time. If we do not use volatile, the compiler may optimize the reads away and the reported temperature will always be the same.
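A sketch of how such a sensor read tends to look in practice, assuming the temperature lives in a hardware register. The register is faked with an ordinary variable so the code runs anywhere; TEMP_REG, fake_temp_register and hardware_sets_temperature are illustrative names, and on a real device the pointer would hold an address taken from the datasheet:

```c
#include <stdint.h>

/* Stand-in for the sensor's data register. */
static uint16_t fake_temp_register;

/* const: the program never writes it. volatile: it can change anyway. */
static const volatile uint16_t *const TEMP_REG = &fake_temp_register;

/* Simulates the hardware updating the register behind the program's back. */
void hardware_sets_temperature(uint16_t raw)
{
    fake_temp_register = raw;
}

/* Each call performs a real load through the volatile pointer. */
uint16_t read_temperature(void)
{
    return *TEMP_REG;
}
```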
Consider a while loop in ANSI C whose only purpose is to delay execution:
unsigned long counter = DELAY_COUNT;
while(counter--);
I've seen this used a lot to enforce delays on embedded systems, where e.g. there is no sleep function and timers or interrupts are limited.
My reading of the ANSI C standard is that this can be completely removed by a conforming compiler. It has none of the side effects described in 5.1.2.3:
Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment.
...and this section also says:
An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects are produced (including any caused by calling a function or accessing a volatile object).
Does this imply that the loop could be optimised out? Even if counter were volatile?
Notes:
This is not quite the same as Are compilers allowed to eliminate infinite loops?, because that refers to infinite loops, and questions arise about when a program is allowed to terminate at all. In this case, the program will certainly proceed past this line at some point, optimisation or not.
I know what GCC does (removes the loop for -O1 or higher, unless counter is volatile), but I want to know what the standard dictates.
C standard compliance follows the "as-if" rule, by which the compiler can generate any code that behaves "as if" it was running your actual instructions on the abstract machine. Since not performing any operations has the same observable behaviour "as if" you did perform the loop, it's entirely permissible to not generate code for it.
In other words, the time something takes to compute on a real machine is not part of the "observable" behaviour of your program, it is merely a phenomenon of a particular implementation.
The situation is different for volatile variables, since accessing a volatile counts as an "observable" effect.
Does this imply that the loop could be optimised out?
Yes.
Even if counter were volatile?
No. It would read and write a volatile variable, which has observable behavior, so it must occur.
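A sketch of the volatile variant, wrapped in a function so it can be tried out. The name busy_wait and the returned pass count are additions for illustration only; how long the loop takes still depends entirely on the compiler and the CPU:

```c
/* Counts a volatile variable down to zero. Every read and write of
 * `counter` is an observable access, so the loop cannot be removed. */
unsigned long busy_wait(unsigned long delay_count)
{
    volatile unsigned long counter = delay_count;
    unsigned long passes = 0;

    while (counter != 0) {
        counter--;
        passes++;
    }
    return passes;   /* number of loop passes actually executed */
}
```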
If the counter is volatile, the compiler cannot legally optimize out the delay loop. Otherwise it can.
Delay loops like this are bad because the time they burn depends on how the compiler generates code for them. Using different optimization options you can achieve different delays, which is hardly what one wants from a delay loop.
For this reason such delay loops should be implemented in assembly language, where the programmer controls the code fully. This typically applies in embedded systems with simple CPUs.
The standard dictates the behavior you see. If you create a dependency tree for counter, you see that it has a modify-without-use property, which means it can be eliminated. This applies to the non-volatile case. In the volatile case the compiler cannot use the dependency tree to remove the variable, so the delay remains (volatile means that hardware can change the memory-mapped value or, in some cases, "I really need this, don't throw it away"). In the case you're looking at, labelling the counter volatile tells the compiler: please don't throw this away, it's here for a reason.