Volatile and its harmful implications

I am an embedded developer and use the volatile keyword when working with I/O ports. My project manager says that using volatile is harmful and has a lot of drawbacks, but I find volatile useful in most cases in embedded programming. As far as I know, volatile is considered harmful in kernel code because the changes to our code become unpredictable. Are there any drawbacks to using volatile in embedded systems as well?

No, volatile is not harmful. In any situation. Ever. There is no possible well-formed piece of code that will break with the addition of volatile to an object (and pointers to that object). However, volatile is often poorly understood. The reason the kernel docs state that volatile is to be considered harmful is that people kept using it for synchronization between kernel threads in broken ways. In particular, they used volatile integer variables as though access to them was guaranteed to be atomic, which it isn't.
volatile is also not useless, and particularly if you go bare-metal, you will need it. But, like any other tool, it is important to understand the semantics of volatile before using it.
What volatile is
Access to volatile objects is, in the standard, considered a side-effect in the same way as incrementing or decrementing by ++ and --. In particular, this means that 5.1.2.3 (3), which says
(...) An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects are produced (including any caused by calling a function or accessing a volatile object)
does not apply. The compiler has to chuck out everything it thinks it knows about the value of a volatile variable at every sequence point. (Like other side effects, the point at which accesses to volatile objects happen is governed by sequence points.)
The effect of this is largely the prohibition of certain optimizations. Take, for example, the code
int i;
void foo(void) {
    i = 0;
    while (i == 0) {
        // do stuff that does not touch i
    }
}
The compiler is allowed to make this into an infinite loop that never checks i again because it can deduce that the value of i is not changed in the loop, and thus that i == 0 will never be false. This holds true even if there is another thread or an interrupt handler that could conceivably change i. The compiler does not know about them, and it does not care. It is explicitly allowed to not care.
Contrast this with
int volatile i;
void foo(void) {
    i = 0;
    while (i == 0) { // Note: This is still broken, only a little less so.
        // do stuff that does not touch i
    }
}
Now the compiler has to assume that i can change at any time and cannot do this optimization. This means, of course, that if you deal with interrupt handlers and threads, volatile objects are necessary for synchronisation. They are not, however, sufficient.
What volatile isn't
What volatile does not guarantee is atomic access. This should make intuitive sense if you're used to embedded programming. Consider, if you will, the following piece of code for an 8-bit AVR MCU:
uint32_t volatile i;
ISR(TIMER0_OVF_vect) {
    ++i;
}
void some_function_in_the_main_loop(void) {
    for (;;) {
        do_something_with(i); // This is thoroughly broken.
    }
}
The reason this code is broken is that access to i is not atomic -- cannot be atomic on an 8-bit MCU. In this simple case, for example, the following might happen:
i is 0x0000ffff
do_something_with(i) is about to be called
the high two bytes of i are copied into the parameter slot for this call
at this point, timer 0 overflows and the main loop is interrupted
the ISR changes i. The lower two bytes of i overflow and are now 0. i is now 0x00010000.
the main loop continues, and the lower two bytes of i are copied into the parameter slot
do_something_with is called with 0 as its parameter.
Similar things can happen on PCs and other platforms. If anything, a more complex architecture opens up more ways for this to fail.
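The sequence above can be replayed deterministically on a host machine. The following is only a sketch: the names counter, fake_timer_isr and torn_read_demo are made up for illustration, and the "interrupt" is an ordinary function call placed between the two half-reads rather than a real ISR.

```c
#include <stdint.h>

/* Simulated shared counter; on a real AVR the timer ISR would update it. */
static volatile uint32_t counter;

/* Hypothetical stand-in for the body of the timer overflow ISR. */
static void fake_timer_isr(void) {
    ++counter;
}

/* Reads the counter in two 16-bit halves, high half first, with the
   simulated interrupt firing in between, as in the steps listed above. */
uint32_t torn_read_demo(void) {
    counter = 0x0000ffffu;
    uint16_t high = (uint16_t)(counter >> 16);     /* high half of 0x0000ffff */
    fake_timer_isr();                              /* counter is now 0x00010000 */
    uint16_t low = (uint16_t)(counter & 0xffffu);  /* low half of 0x00010000 */
    return ((uint32_t)high << 16) | low;           /* neither the old nor the new value */
}
```

The function returns 0, a value the counter never actually held, even though every individual access was a volatile access.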
Takeaway
So no, using volatile is not bad, and you will (often) have to do it in bare-metal code. However, when you do use it, you have to keep in mind that it is not a magic wand, and that you will still have to make sure you don't trip over yourself. In embedded code, there's often a platform-specific way to handle the problem of atomicity; in the case of AVR, for example, the usual crowbar method is to disable interrupts for the duration, as in
uint32_t x;
ATOMIC_BLOCK(ATOMIC_RESTORESTATE) {
    x = i;
}
do_something_with(x);
...where the ATOMIC_BLOCK macro calls cli() (disable interrupts) before and sei() (enable interrupts) afterwards if they were enabled beforehand.
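The save, disable, copy, restore pattern described here can be sketched without AVR headers. Everything in this block is an assumption made for illustration: irq_enabled is a plain variable standing in for the global interrupt flag, and read_ticks_atomically is a hypothetical helper, not part of avr-libc.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simulated global interrupt-enable flag (stands in for the hardware bit). */
bool irq_enabled = true;

/* Multi-byte value that a real ISR would update. */
volatile uint32_t ticks;

/* Copies ticks while "interrupts" are off, then restores the previous state
   instead of unconditionally re-enabling. */
uint32_t read_ticks_atomically(void) {
    bool saved = irq_enabled;  /* remember whether interrupts were on */
    irq_enabled = false;       /* corresponds to cli() */
    uint32_t copy = ticks;     /* the multi-byte read cannot be interrupted here */
    irq_enabled = saved;       /* corresponds to sei() only if they were on before */
    return copy;
}
```

Restoring the saved state matters when the helper is called from a context that already has interrupts disabled: an unconditional re-enable would switch them back on behind the caller's back.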
With C11, which is the first C standard that explicitly acknowledges the existence of multithreading, a new family of atomic types and memory fencing operations has been introduced that can be used for inter-thread synchronisation and in many cases makes the use of volatile unnecessary. If you can use those, do it, but it'll likely be some time before they reach all common embedded toolchains. With them, the loop above could be fixed like this:
#include <stdatomic.h>

atomic_int i;
void foo(void) {
    atomic_store(&i, 0);
    while (atomic_load(&i) == 0) {
        // do stuff that does not touch i
    }
}
...in its most basic form. The precise semantics of the more relaxed memory order semantics go way beyond the scope of a SO answer, so I'll stick with the default sequentially consistent stuff here.
If you're interested in it, Gil Hamilton provided a link in the comments to an explanation of a lock-free stack implementation using C11 atomics, although I don't feel it's a terribly good write-up of the memory order semantics themselves. The C11 model does, however, appear to closely mirror the C++11 memory model, of which a useful presentation exists here. If I find a link to a C11-specific write-up, I will put it here later.
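As a further sketch in the same spirit, the 32-bit counter from the AVR example could be written with a C11 atomic type, assuming a toolchain that ships <stdatomic.h>. The names tick_count, on_tick and ticks_snapshot are invented here. On a small target the implementation may fall back to locks or to disabling interrupts internally; atomic_is_lock_free can be used to check.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Counter shared between two execution contexts. */
static atomic_uint_least32_t tick_count;

/* Would be called from the interrupt handler or the other thread. */
void on_tick(void) {
    atomic_fetch_add(&tick_count, 1);  /* indivisible read-modify-write */
}

/* Returns a consistent copy of the whole counter, never a torn value. */
uint32_t ticks_snapshot(void) {
    return (uint32_t)atomic_load(&tick_count);
}
```

Both operations use the default sequentially consistent ordering, in line with the rest of this answer.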

volatile is only useful when the object so qualified can change asynchronously. Such changes can happen
if the object is in fact a hardware I/O register or similar that changes externally to your program
if the object might be changed by a signal handler
if the object is a local variable that is changed between a call to setjmp and the corresponding longjmp
In all these cases you must declare your object volatile, otherwise your program will not work correctly. (And you might notice that objects shared between different threads are not in the list.)
In all other cases you shouldn't, because you may be missing optimization opportunities. On the other hand, qualifying an object volatile that doesn't fall under the points above will not make your code incorrect.
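The setjmp/longjmp case is the least familiar of the three, so here is a minimal sketch of it. The function names fail and count_attempts are made up for the example.

```c
#include <setjmp.h>

static jmp_buf env;

/* Jumps back to the setjmp call in count_attempts. */
static void fail(void) {
    longjmp(env, 1);
}

int count_attempts(void) {
    /* Modified between setjmp and longjmp, so it has to be volatile;
       otherwise its value after the jump is indeterminate. */
    volatile int attempts = 0;

    if (setjmp(env) != 0) {
        return attempts;  /* reached via longjmp */
    }
    attempts = 1;
    fail();
    return -1;            /* not reached */
}
```

With the qualifier, count_attempts returns 1. Without it, the compiler may keep attempts in a register whose contents are not preserved across the jump.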

Not using volatile where necessary and appropriate is far more likely to be harmful! The solution to any perceived problems with volatile is not to ban its use, because there are a number of cases where it is necessary for safe and correct semantics. Rather the solution is to understand its purpose and its behaviour.
It is essential for any data that may be changed outside of the knowledge of the compiler, such as I/O and dual-ported or DMA memory. It is also necessary for access to memory shared between execution contexts such as threads and interrupt handlers, and this is perhaps where the confusion lies. volatile ensures an explicit read of such memory; it does not enforce atomicity or mutual exclusion. Additional mechanisms are required for that, but they do not make volatile unnecessary: it is merely one part of the solution to shared memory access.
See the following articles on the use of volatile (and send them to your project manager too!):
Place volatile accurately by Dan Saks.
Introduction to the volatile keyword by Nigel Jones
Guidelines for handling volatile variables by Colin Walls
Combining C's volatile and const keywords - Michael Barr

Volatile tells the compiler not to optimize anything that has to do with the volatile variable.
Why the "volatile" type class should not be used? - Best article in Kernel doc
https://www.kernel.org/doc/Documentation/volatile-considered-harmful.txt

volatile is a keyword in C which tells the compiler not to do any kind of optimization on that variable.
Let me give you a simple example:
int temp;
for (i = 0; i < 5; i++)
{
    temp = 5;
}
What the compiler may do to optimize the code:
int temp;
temp = 5; /* assigned temp variable before the loop. */
for (i = 0; i < 5; i++)
{
}
But if we add the volatile keyword, the compiler will not perform any such optimization on the temp variable.
volatile int temp;
for (i = 0; i < 5; i++)
{
    temp = 5;
}
"Volatile Considered Harmful" ---> I don't consider volatile harmful. You use volatile where you don't want any kind of optimization from the compiler.
For example, suppose this code is used by a thermometer company, and temp is a variable holding the atmospheric temperature, which can change at any time. If we do not use volatile, the compiler may optimize the reads away and the program will always see the same temperature.

Related

Is restrict the opposite of volatile?

I can use volatile for something like the following, where the value might be modified by an external function/signal/etc:
volatile int exit = 0;
while (!exit)
{
    /* something */
}
And the compiler/assembly will not cache the value. On the other hand, with the restrict keyword, I can tell the compiler that a variable has no aliases / only referenced once inside the current scope, and the compiler can try and optimize it:
void update_res(int *a, int *b, int *restrict c) {
    *a += *c;
    *b += *c;
}
Is that a correct understanding of the two, that they are basically opposites of each other? volatile says the variable can be modified outside the current scope and restrict says it cannot? What would be an example of the assembly instructions it would emit for the most basic example using these two keywords?

They're not exact opposites of each other. But yes, volatile gives a hard constraint to the optimizer to not optimize away accesses to an object, while restrict is a promise / guarantee to the optimizer about aliasing, so in a broad sense they act in opposite directions in terms of freedom for the optimizer. (And of course usually only matter in optimized builds.)
restrict is totally optional, only allowing extra performance. volatile sig_atomic_t can be "needed" for communication between a signal handler and the main program, or for device drivers. For any other use, _Atomic is usually a better choice. Other than that, volatile is also not needed for correctness of normal code. (_Atomic has a similar effect, especially with current compilers which purposely don't optimize atomics.) Neither volatile nor _Atomic are needed for correctness of single-threaded code without signal handlers, regardless of how complex the series of function calls is, or any amount of globals holding pointers to other variables. The as-if rule already requires compilers to make asm that gives observable results equivalent to stepping through the C abstract machine 1 line at a time. (Memory contents is not an observable result; that's why data races on non-atomic objects are undefined behaviour.)
volatile means that every C variable read (lvalue to rvalue conversion) and write (assignment) must become an asm load and store. In practice yes that means it's safe for things that change asynchronously, like MMIO device addresses, or as a bad way to roll your own _Atomic int with memory_order_relaxed. (When to use volatile with multi threading? - basically never in C11 / C++11.)
volatile says the variable can be modified outside the current scope
It depends what you mean by that. Volatile is far stronger than that, and makes it safe for it to be modified asynchronously while inside the current scope.
It's already safe for a function called from this scope to modify a global exit var; if a function doesn't get inlined, compilers generally have to assume that every global var could have been modified, same for everything possibly reachable from global pointers (escape analysis), or from calling functions in this translation unit that modify file-scoped static variables.
And like I said, you can use it for multi-threading, but don't. C11 _Atomic is standardized and can be used to write code that compiles to the same asm, but with more guarantees about exactly what is and isn't implied. (Especially ordering wrt. other operations.)
They have no equivalent in hand-written asm because there's no optimizer between the source and machine code asm.
In C compiler output, you won't notice a difference if you compile with optimization disabled. (Well maybe a minor difference in expressions that read the same volatile multiple times.)
Compiling with optimization disabled makes bad uninteresting asm, where every object is treated much like volatile to enable consistent debugging. As Multithreading program stuck in optimized mode but runs normally in -O0 shows, the optimizations allowed by making variables plain non-volatile only get done with optimization enabled. See also this Q&A about the same issue on single-core microcontrollers with interrupts.
Why does clang produce inefficient asm with -O0 (for this simple floating point sum)?
How to remove "noise" from GCC/clang assembly output? - pretty sure I linked you this multiple times already.
What would be an example of the assembly instructions it would emit for the most basic example using these two keywords?
Try it yourself on https://godbolt.org/ with gcc10 -O3. You already have a useful test-case for restrict; it should let the compiler load *c once.
Or if you search at all, Ciro Santilli has already analyzed the exact function you're asking about back in 2015, in an answer with over 150 upvotes. I found it by searching on site:stackoverflow.com optimize restrict, as the 3rd hit.
Realistic usage of the C99 'restrict' keyword? shows your exact case, including asm output with/without restrict, and analysis / discussion of that asm.
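For pasting into a compiler explorer, here is a self-contained sketch with both variants side by side. update_plain and update_restrict are names chosen for this example, and qualifying all three parameters with restrict is a choice made here, not something taken from the question.

```c
/* Without restrict the compiler must assume that a, b and c may alias,
   so *c generally has to be reloaded after the store to *a. */
void update_plain(int *a, int *b, int *c) {
    *a += *c;
    *b += *c;
}

/* With restrict the caller promises that the pointers do not alias,
   which allows *c to be loaded once and reused. */
void update_restrict(int *restrict a, int *restrict b, int *restrict c) {
    *a += *c;
    *b += *c;
}
```

The difference is observable from C when the arguments alias: calling update_plain(&x, &y, &x) with x == 1 and y == 10 leaves x == 2 and y == 12, because the second statement sees the updated x. Making the same aliasing call through update_restrict would be undefined behaviour.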

Is it good practice to make a global variable always volatile?

I know what volatile means. What I want to ask is: if my variable is global, is it good practice to make it volatile, even if it doesn't interface with hardware?
Header:
typedef struct
{
    int Value;
} Var_;
extern volatile Var_ myVariable;
Source:
volatile Var_ myVariable;

No. If you’re writing multi-threaded code, you want to use atomic variables, not volatile. For example, many concurrent structures need to be kept consistent, not modified one word at a time.
If no other thread, process or hardware is modifying the variable, you should not use either atomics or volatile. It will just complicate the program, run slower, and disable certain APIs for no reason.
The volatile keyword has historically been used for a few different things (such as telling the compiler not to optimize away a delay loop), but its purpose in C11 is narrow: to specify that a value in memory will change by some means that doesn’t follow the rules of atomics. You need it to write some kinds of device drivers, but it’s discouraged even in other low-level code such as OS kernels.

No, it is not good practice. volatile informs the C implementation (largely the compiler) that an object may be changed by something outside of the C implementation or that accesses to the object within the C implementation may have desired effects outside the C implementation. As long as your global object is only used and modified inside your own program, it has no volatile effects, and declaring it with volatile causes the compiler to suppress optimization and to generate unnecessary accesses to it within your program.

Where all to use volatile keyword in C

I know volatile keyword prevents compiler from optimizing a variable and read it from memory whenever it is read. Apart from memory mapped registers, what are all the situations where we need to use volatile? Given a conforming compiler, do I have to declare test_var as volatile in both scenarios?
1.
In file1.c
int test_var=100;
void func1()
{
    test_var++;
}
In file2.c
extern int test_var;
void func2()
{
    if(test_var==100)
    {
        ....
    }
}
2.
In file1.c
int test_var=100;
void func1()
{
}
In file2.c
extern int test_var;
void func2()
{
    if(test_var==100)
    {
        ....
    }
}

Memory-mapped I/O is the only generic use of volatile in C. *)
With POSIX signals, a volatile can also be used together with the type sig_atomic_t like this:
volatile sig_atomic_t signal_occured = 0;
Neither of your scenarios should require volatile at all. If all you're interested in is a guarantee that the value is updated between different compilation units, see tofro's comment, extern already guarantees that. Particularly, volatile is not the correct tool for thread synchronization in C. It would only introduce bugs, because, as you state it, it does require actual read and write accesses to the variable, but it does not enforce their proper ordering with respect to threads (it's missing memory barriers, google for details).
Note that this is different from some other languages where volatile is designed to work between threads.
In an embedded system, volatile might be good enough for communicating between an ISR (interrupt service routine) and the main program, when combined with a data type that's read/written atomically, just like sig_atomic_t for POSIX signals. Consult the documentation of your compiler for that.
*) The C standard mentions this, along with the use-case of "asynchronously interrupting functions", only in a footnote, because memory-mapped I/O is outside the scope of the language. The language just defines the semantics of volatile in a way that make it suitable for memory-mapped I/O.
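A minimal sketch of the volatile sig_atomic_t pattern, using only standard C signal facilities. The names signal_seen, on_signal and wait_for_signal_demo are made up here, and raise() is used so that the handler runs synchronously and the example terminates on its own.

```c
#include <signal.h>

/* Flag shared between the handler and the main program. */
static volatile sig_atomic_t signal_seen = 0;

/* The handler only sets the flag, which is about all a portable handler may do. */
static void on_signal(int signo) {
    (void)signo;
    signal_seen = 1;
}

/* Installs the handler, delivers SIGINT to itself and polls the flag. */
int wait_for_signal_demo(void) {
    signal_seen = 0;
    if (signal(SIGINT, on_signal) == SIG_ERR) {
        return -1;
    }
    raise(SIGINT);            /* returns only after the handler has run */
    while (!signal_seen) {
        /* volatile forces a fresh read of the flag on every iteration */
    }
    signal(SIGINT, SIG_DFL);  /* put the default disposition back */
    return (int)signal_seen;
}
```

In a real program the signal would arrive asynchronously, and the polling loop is where volatile earns its keep.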

In neither of your examples is volatile necessary.
volatile is necessary:
anywhere a variable may be changed outside of the control of a single thread of execution,
anywhere the variable access is required to occur even when it semantically has no effect.
Case 1 includes:
memory mapped I/O registers,
memory used for DMA transfers,
memory shared between interrupt and/or thread contexts,
memory shared between independent processors (such as dual port RAM)
Case 2 includes:
loop counters used for empty delay loops, where the entire loop may otherwise be optimised away completely and take no time,
Variables written to but never read for observation in a debugger.
The above examples may not be exhaustive, but it is the semantics of volatile that are key; the language only has to perform an explicit access as indicated by the source code.
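The delay-loop item under case 2 can be sketched as follows. busy_delay is a hypothetical helper; how long each iteration takes depends entirely on the target and the compiler, so this is not a calibrated delay.

```c
/* Spins for the given number of iterations. Because n is volatile, every
   increment and every comparison is a real memory access, so the loop
   cannot be removed even though it computes nothing useful. */
unsigned long busy_delay(unsigned long iterations) {
    volatile unsigned long n = 0;
    while (n < iterations) {
        ++n;
    }
    return n;
}
```

With a plain unsigned long counter, an optimizing compiler is free to replace the whole loop with a single assignment.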

Besides extensions such as memory-mapped devices, in standard C volatile has two use cases: interaction with signal handlers and modification of objects across usage of setjmp/longjmp. Both are cases where there is an unusual flow of control that an optimizer may not be aware of.

In C microcontroller applications using interrupts, the volatile keyword is essential in making sure that a value set in an interrupt is saved properly in the interrupt and later has the correct value in the main processing loop. Failure to use volatile is perhaps the single biggest reason that timer-based interrupts or ADC (analog-digital conversion) based interrupts for example will have a corrupted value when flow of control resumes after the processor state is returned post-interrupt. A canonical template from Atmel and GCC:
volatile uint8_t flag = 0;
ISR(TIMER_whatever_interrupt)
{
    flag = 1;
}
while(1) // main loop
{
    if (flag == 1)
    {
        <do something>
        flag = 0;
    }
}
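Without the volatile, the compiler is allowed to read flag once and never check it again, so the code is likely to stop working as soon as optimization is enabled.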

Apart from memory mapped registers, what are all the situations where
we need to use volatile?
If
execution is purely sequential (no threads and no signals delivered asynchronously);
you don't use longjmp;
you don't need to be able to debug a program compiled with optimizations;
you don't use constructs with vaguely specified semantics like floating point operations;
you don't do useless computations (computations where the result is ignored) as in a benchmark loop;
you don't do timings of any pure computations, that is anything that isn't I/O based (I/O based such as timings of accesses of network requests, external database accesses)
then you probably have no need for volatile.

Compiler related - are these two C codes really identical?

In a multi-thread or RTOS environment, are these codes below identical?
I believe they are not. But is the first version absolutely safe in a multi-threaded environment? Is there a rule that makes the compiler keep 'ga' in a register and never read 'ga' again later in func_a()?
I know I can use lock, but this is not a question about how to protect the data. It is just a question about the compiler behaviour.
// ga is a global variable.
int func_a() {
    int a = ga;
    return a>2 ? a-2 : 2-a;
}
int func_b() {
    return ga>2 ? ga-2 : 2-ga;
}
My intention is looking for a standard way (not platform specific) to read ga only once and assign its value to a local variable 'a'.
'a' can then be used consistently regardless of whether 'ga' has changed.

Both these versions of code have undefined behaviour in the face of multiple threads executing the functions. Certainly different compilers can do different things regarding saving the global variable into registers, or not. What's more, there's no guarantee that assigning to a local variable can be done in an atomic way with respect to threads that are mutating the global variable.

There is no rule in the C standard that requires the compiler to implement those functions differently. For example, when working with registers, the compiler may or may not 'optimize out' the assignment from ga to a. (By 'optimize out', I mean: load ga into a register, then use that same register as a for the rest of the computation.)
If you want to implement a lock-free data structure:
C99 offers nothing that can help you.
C11 (very recent standard) offers you atomic data types.
If you are using C99, then you either need to:
Use locks (and hence, not lock-free code)
Be ready to write architecture specific code. The least you need to do is use a minimal set of atomic operations, as done in this library that implements lock-free data structures using atomic operations provided by the x86, x86_64, and ARM ISAs.
In an earlier version of this answer, I touched upon a side issue (which has to do with volatile, and which is really not relevant to your real question):
There is one case that can put a restriction on how func_b is implemented, but I am actually going off on a tangent here: If ga is declared as a volatile.
If ga is volatile, then each read on ga must load ga from memory afresh. i.e. in func_b, ga will be loaded from memory two times. Once for the comparison, and once to calculate the return value. The expected use is, for example say ga refers to a memory mapped I/O port. Then if value of ga changes in between the two reads, this will reflect in the return value. However, if you change ga in another thread, don't expect sane/defined behavior.
On the other hand, not having a volatile qualifier does not mean that ga will be read exactly once in func_b. And there is no qualifier that is the 'opposite of volatile'.
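If C11 is available, the "read ga exactly once into a local" requirement from the question can be sketched with an atomic type. This assumes ga can be redeclared as atomic_int, which the question does not say.

```c
#include <stdatomic.h>

/* Assumption: ga is declared atomic so that concurrent writers are well defined. */
atomic_int ga;

/* Takes a single snapshot of ga and computes |a - 2| from that snapshot only. */
int func_a(void) {
    int a = atomic_load(&ga);  /* exactly one atomic read */
    return a > 2 ? a - 2 : 2 - a;
}
```

After the load, a is an ordinary local that no other thread can see, so the comparison and the subtraction are guaranteed to use the same value.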

The behaviour depends on which compiler you're using; every compiler has its own rules regarding optimisation.

The two snippets are likely going to end up with identical machine code. Neither of them is safe in a multi-thread case.
volatile would force the creation of a temporary variable, but since the copy from "ga" into the volatile variable is not guaranteed to be atomic, this is not thread-safe.
The only safe way to write such code is with guards:
int func_a() {
    mtx_lock(&ga_mutex);
    int a = ga;
    mtx_unlock(&ga_mutex);
    return a>2 ? a-2 : 2-a;
}

How do I know if gcc agrees that something is volatile?

Consider the following:
volatile uint32_t i;
How do I know if gcc did or did not treat i as volatile? It would be declared as such because no nearby code is going to modify it, and modification of it is likely due to some interrupt.
I am not the world's worst assembly programmer, but I play one on TV. Can someone help me to understand how it would differ?
If you take the following stupid code:
#include <stdio.h>
#include <inttypes.h>
volatile uint32_t i;
int main(void)
{
    if (i == 64738)
        return 0;
    else
        return 1;
}
Compile it to object format and disassemble it via objdump, then do the same after removing 'volatile', there is no difference (according to diff). Is the volatile declaration just too close to where its checked or modified or should I just always use some atomic type when declaring something volatile? Do some optimization flags influence this?
Note, my stupid sample does not fully match my question, I realize this. I'm only trying to find out if gcc did or did not treat the variable as volatile, so I'm studying small dumps to try to find the difference.

In some situations, many compilers don't treat volatile the way they should. If you deal with volatiles a lot, see this paper to avoid nasty surprises: Volatiles are Miscompiled, and What to Do about It. It also contains a pretty good description of volatile, backed by quotations from the standard.
To be 100% sure, and for such a simple example, check the assembly output.

Try setting the variable outside a loop and reading it inside the loop. In a non-volatile case, the compiler might (or might not) shove it into a register or make it a compile time constant or something before the loop, since it "knows" it's not going to change, whereas if it's volatile it will read it from the variable space every time through the loop.
Basically, when you declare something as volatile, you're telling the compiler not to make certain optimizations. If it decided not to make those optimizations, you don't know that it didn't do them because it was declared volatile, or just that it decided it needed those registers for something else, or it didn't notice that it could turn it into a compile time constant.

As far as I know, volatile restrains the optimizer. For example, if your code looked like this:
int foo() {
    int x = 0;
    while (x);
    return 42;
}
The "while" loop would be optimized out of the binary.
But if you define 'x' as being volatile (ie, volatile int x;), then the compiler will leave the loop alone.

Your little sample is inadequate to show anything. The difference between a volatile variable and one that isn't is that each load or store in the code has to generate precisely one load or store in the executable for a volatile variable, whereas the compiler is free to optimize away loads or stores of non-volatile variables. If you're getting one load of i in your sample, that's what I'd expect for volatile and non-volatile.
To show a difference, you're going to have to have redundant loads and/or stores. Try something like
int i = 5;
int j = i + 2;
i = 5;
i = 5;
printf("%d %d\n", i, j);
changing i between non-volatile and volatile. You may have to enable some level of optimization to see the difference.
The code there has three stores and two loads of i, which can be optimized away to one store and probably one load if i is not volatile. If i is declared volatile, all stores and loads should show up in the object code in order, no matter what the optimization. If they don't, you've got a compiler bug.
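To make that comparison easy to repeat, the snippet can be wrapped in a function whose qualifier is switched from the command line. The QUALIFIER macro and the function name redundant_accesses are inventions for this sketch, and i is a file-scope variable here rather than a local.

```c
/* Build twice and compare the generated assembly, for example:
     gcc -O2 -S demo.c               (i is volatile)
     gcc -O2 -S -DQUALIFIER= demo.c  (i is a plain int) */
#ifndef QUALIFIER
#define QUALIFIER volatile
#endif

static QUALIFIER int i;

int redundant_accesses(void) {
    i = 5;          /* store 1 */
    int j = i + 2;  /* load 1  */
    i = 5;          /* store 2 */
    i = 5;          /* store 3 */
    return i + j;   /* load 2  */
}
```

The function returns 12 in both builds. What should differ is how many loads and stores of i survive in the assembly.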

It should always treat it as volatile.
The reason the code is the same is that volatile just instructs the compiler to load the variable from memory each time it accesses it. Even with optimization on, the compiler still needs to load i from memory once in the code you've written, because it can't infer the value of i at compile time. If you access it repeatedly, you'll see a difference.

Any modern compiler has multiple stages. One of the fairly easy yet interesting questions is whether the declaration of the variable itself was parsed correctly. This is easy because the C++ name mangling should differ depending on the volatile-ness. Hence, if you compile twice, once with volatile defined away, the symbol tables should differ slightly.

Read the standard before you misquote or downvote. Here's a quote from n2798:
7.1.6.1 The cv-qualifiers
7 Note: volatile is a hint to the implementation to avoid aggressive optimization involving the object because the value of the object might be changed by means undetectable by an implementation. See 1.9 for detailed semantics. In general, the semantics of volatile are intended to be the same in C++ as they are in C.
The keyword volatile acts as a hint, much like the register keyword. However, volatile asks the compiler to keep all its optimizations at bay. This way, it won't keep a copy of the variable in a register or a cache (to optimize speed of access) but rather fetch it from memory every time you request it.
Since there is so much confusion, some more. The C99 standard does in fact say that a volatile-qualified object must be looked up every time it is read and so on, as others have noted. But there is also another section that says that what constitutes a volatile access is implementation-defined. So a compiler which knows the hardware inside out will know, for example, that an automatic volatile-qualified variable whose address is never taken will not be put in a sensitive region of memory, and will almost certainly ignore the hint and optimize it away.
This keyword finds usage in setjmp and longjmp type of error handling. The only thing you have to bear in mind is that: You supply the volatile keyword when you think the variable may change. That is, you could take an ordinary object and manage with a few casts.
Another thing to keep in mind is the definition of what constitutes a volatile access is left by standard to the implementation.
If you really want different assembly, compile with optimization.