Consider the following:
volatile uint32_t i;
How do I know if gcc did or did not treat i as volatile? It would be declared as such because no nearby code is going to modify it, and modification of it is likely due to some interrupt.
I am not the world's worst assembly programmer, but I play one on TV. Can someone help me to understand how it would differ?
If you take the following stupid code:
#include <stdio.h>
#include <inttypes.h>
volatile uint32_t i;
int main(void)
{
if (i == 64738)
return 0;
else
return 1;
}
Compile it to object format and disassemble it via objdump, then do the same after removing 'volatile', there is no difference (according to diff). Is the volatile declaration just too close to where its checked or modified or should I just always use some atomic type when declaring something volatile? Do some optimization flags influence this?
Note, my stupid sample does not fully match my question, I realize this. I'm only trying to find out if gcc did or did not treat the variable as volatile, so I'm studying small dumps to try to find the difference.
Many compilers in some situations don't treat volatile the way they should. See this paper if you deal much with volatiles to avoid nasty surprises: Volatiles are Miscompiled, and What to Do about It. It also contains the pretty good description of the volatile backed with the quotations from the standard.
To be 100% sure, and for such a simple example check out the assembly output.
Try setting the variable outside a loop and reading it inside the loop. In a non-volatile case, the compiler might (or might not) shove it into a register or make it a compile time constant or something before the loop, since it "knows" it's not going to change, whereas if it's volatile it will read it from the variable space every time through the loop.
Basically, when you declare something as volatile, you're telling the compiler not to make certain optimizations. If it decided not to make those optimizations, you don't know that it didn't do them because it was declared volatile, or just that it decided it needed those registers for something else, or it didn't notice that it could turn it into a compile time constant.
As far as I know, volatile helps the optimizer. For example, if your code looked like this:
int foo() {
int x = 0;
while (x);
return 42;
}
The "while" loop would be optimized out of the binary.
But if you define 'x' as being volatile (ie, volatile int x;), then the compiler will leave the loop alone.
Your little sample is inadequate to show anything. The difference between a volatile variable and one that isn't is that each load or store in the code has to generate precisely one load or store in the executable for a volatile variable, whereas the compiler is free to optimize away loads or stores of non-volatile variables. If you're getting one load of i in your sample, that's what I'd expect for volatile and non-volatile.
To show a difference, you're going to have to have redundant loads and/or stores. Try something like
int i = 5;
int j = i + 2;
i = 5;
i = 5;
printf("%d %d\n", i, j);
changing i between non-volatile and volatile. You may have to enable some level of optimization to see the difference.
The code there has three stores and two loads of i, which can be optimized away to one store and probably one load if i is not volatile. If i is declared volatile, all stores and loads should show up in the object code in order, no matter what the optimization. If they don't, you've got a compiler bug.
It should always treat it as volatile.
The reason the code is the same is that volatile just instructs the compiler to load the variable from memory each time it accesses it. Even with optimization on, the compiler still needs to load i from memory once in the code you've written, because it can't infer the value of i at compile time. If you access it repeatedly, you'll see a difference.
Any modern compiler has multiple stages. One of the fairly easy yet interesting questions is whether the declaration of the variable itself was parsed correctly. This is easy because the C++ name mangling should differ depending on the volatile-ness. Hence, if you compile twice, once with volatile defined away, the symbol tables should differ slightly.
Read the standard before you misquote or downvote. Here's a quote from n2798:
7.1.6.1 The cv-qualifiers
7 Note: volatile is a hint to the implementation to avoid aggressive optimization involving the object because the value of the object might be changed by means undetectable by an implementation. See 1.9 for detailed semantics. In general, the semantics of volatile are intended to be the same in C++ as they are in C.
The keyword volatile acts as a hint. Much like the register keyword. However, volatile asks the compiler to keep all its optimizations at bay. This way, it won't keep a copy of the variable in a register or a cache (to optimize speed of access) but rather fetch it from the memory everytime you request for it.
Since there is so much of confusion: some more. The C99 standard does in fact say that a volatile qualified object must be looked up every time it is read and so on as others have noted. But, there is also another section that says that what constitutes a volatile access is implementation defined. So, a compiler, which knows the hardware inside out, will know, for example, when you have an automatic volatile qualified variable and whose address is never taken, that it will not be put in a sensitive region of memory and will almost certainly ignore the hint and optimize it away.
This keyword finds usage in setjmp and longjmp type of error handling. The only thing you have to bear in mind is that: You supply the volatile keyword when you think the variable may change. That is, you could take an ordinary object and manage with a few casts.
Another thing to keep in mind is the definition of what constitutes a volatile access is left by standard to the implementation.
If you really wanted different assembly compile with optimization
Related
I can use volatile for something like the following, where the value might be modified by an external function/signal/etc:
volatile int exit = 0;
while (!exit)
{
/* something */
}
And the compiler/assembly will not cache the value. On the other hand, with the restrict keyword, I can tell the compiler that a variable has no aliases / only referenced once inside the current scope, and the compiler can try and optimize it:
void update_res (int *a , int *b, int * restrict c ) {
* a += * c;
* b += * c;
}
Is that a correct understanding of the two, that they are basically opposites of each other? volatile says the variable can be modified outside the current scope and restrict says it cannot? What would be an example of the assembly instructions it would emit for the most basic example using these two keywords?
They're not exact opposites of each other. But yes, volatile gives a hard constraint to the optimizer to not optimize away accesses to an object, while restrict is a promise / guarantee to the optimizer about aliasing, so in a broad sense they act in opposite directions in terms of freedom for the optimizer. (And of course usually only matter in optimized builds.)
restrict is totally optional, only allowing extra performance. volatile sig_atomic_t can be "needed" for communication between a signal handler and the main program, or for device drivers. For any other use, _Atomic is usually a better choice. Other than that, volatile is also not needed for correctness of normal code. (_Atomic has a similar effect, especially with current compilers which purposely don't optimize atomics.) Neither volatile nor _Atomic are needed for correctness of single-threaded code without signal handlers, regardless of how complex the series of function calls is, or any amount of globals holding pointers to other variables. The as-if rule already requires compilers to make asm that gives observable results equivalent to stepping through the C abstract machine 1 line at a time. (Memory contents is not an observable result; that's why data races on non-atomic objects are undefined behaviour.)
volatile means that every C variable read (lvalue to rvalue conversion) and write (assignment) must become an asm load and store. In practice yes that means it's safe for things that change asynchronously, like MMIO device addresses, or as a bad way to roll your own _Atomic int with memory_order_relaxed. (When to use volatile with multi threading? - basically never in C11 / C++11.)
volatile says the variable can be modified outside the current scope
It depends what you mean by that. Volatile is far stronger than that, and makes it safe for it to be modified asynchronously while inside the current scope.
It's already safe for a function called from this scope to modify a global exit var; if a function doesn't get inlined, compilers generally have to assume that every global var could have been modified, same for everything possibly reachable from global pointers (escape analysis), or from calling functions in this translation unit that modify file-scoped static variables.
And like I said, you can use it for multi-threading, but don't. C11 _Atomic is standardized and can be used to write code that compiles to the same asm, but with more guarantees about exactly what is and isn't implied. (Especially ordering wrt. other operations.)
They have no equivalent in hand-written asm because there's no optimizer between the source and machine code asm.
In C compiler output, you won't notice a difference if you compile with optimization disabled. (Well maybe a minor difference in expressions that read the same volatile multiple times.)
Compiling with optimization disabled makes bad uninteresting asm, where every object is treated much like volatile to enable consistent debugging. As Multithreading program stuck in optimized mode but runs normally in -O0 shows, the optimizations allowed by making variables plain non-volatile only get done with optimization enabled. See also this Q&A about the same issue on single-core microcontrollers with interrupts.
Why does clang produce inefficient asm with -O0 (for this simple floating point sum)?
How to remove "noise" from GCC/clang assembly output? - pretty sure I linked you this multiple times already.
*What would be an example of the assembly instructions it would emit for the most basic example using these two keywords?
Try it yourself on https://godbolt.org/ with gcc10 -O3. You already have a useful test-case for restrict; it should let the compiler load *c once.
Or if you search at all, Ciro Santilli has already analyzed the exact function you're asking about back in 2015, in an answer with over 150 upvotes. I found it by searching on site:stackoverflow.com optimize restrict, as the 3rd hit.
Realistic usage of the C99 'restrict' keyword? shows your exact case, including asm output with/without restrict, and analysis / discussion of that asm.
Say you have (for reasons that are not important here) the following code:
int k = 0;
... /* no change to k can happen here */
if (k) {
do_something();
}
Using the -O2 flag, GCC will not generate any code for it, recognizing that the if test is always false.
I'm wondering if this is a pretty common behaviour across compilers or it is something I should not rely on.
Does anybody knows?
Dead code elimination in this case is trivial to do for any modern optimizing compiler. I would definitely rely on it, given that optimizations are turned on and you are absolutely sure that the compiler can prove that the value is zero at the moment of check.
However, you should be aware that sometimes your code has more potential side effects than you think.
The first source of problems is calling non-inlined functions. Whenever you call a function which is not inlined (i.e. because its definition is located in another translation unit), compiler assumes that all global variables and the whole contents of the heap may change inside this call. Local variables are the lucky exception, because compiler knows that it is illegal to modify them indirectly... unless you save the address of a local variable somewhere. For instance, in this case dead code won't be eliminated:
int function_with_unpredictable_side_effects(const int &x);
void doit() {
int k = 0;
function_with_unpredictable_side_effects(k);
if (k)
printf("Never reached\n");
}
So compiler has to do some work and may fail even for local variables. By the way, I believe the problem which is solved in this case is called escape analysis.
The second source of problems is pointer aliasing: compiler has to take into account that all sort of pointers and references in your code may be equal, so changing something via one pointer may change the contents at the other one. Here is one example:
struct MyArray {
int num;
int arr[100];
};
void doit(int idx) {
MyArray x;
x.num = 0;
x.arr[idx] = 7;
if (x.num)
printf("Never reached\n");
}
Visual C++ compiler does not eliminate the dead code, because it thinks that you may access x.num as x.arr[-1]. It may sound like an awful thing to do to you, but this compiler has been used in gamedev area for years, and such hacks are not uncommon there, so the compiler stays on the safe side. On the other hand, GCC removes the dead code. Maybe it is related to its exploitation of strict pointer aliasing rule.
P.S. The const keywork is never used by optimizer, it is only present in C/C++ language for programmers' convenience.
There is no pretty common behaviour across compilers. But there is a way to explore how different compilers acts with specific part of code.
Compiler explorer will help you to answer on every question about code generation, but of course you must be familiar with assembler language.
Does gcc have an option to disable read/write optimizations for global variables not explicitly defined as volatile?
My team is running out of program memory in our embedded C project, built using gcc. When I enable optimizations to reduce code size, the code no longer works as expected because we have not been using the volatile keyword where we ought to have been. That is, I was able to resolve the presenting problem by declaring a few variables accessed in ISRs volatile. However, I don't have any level of certainty that those are the only variables I need to declare volatile and I just haven't noticed the other bugs yet.
I have heard that "some compilers" have a flag to implicitly declare everything volatile, but that I should resist the temptation because it is a "substitute for thought" (see https://barrgroup.com/Embedded-Systems/How-To/C-Volatile-Keyword).
Yes, but thought is expensive. Feel free to try to talk me out of it in the comments section, but I'm hoping for a quick fix to get the code size down without breaking the application.
You mean something besides -O0?
I expect that it's not too difficult to hack GCC for a quick experiment. This place in grokdeclarator in gcc/c/c-decl.c is probably a good place to unconditionally inject a volatile qualifier for the interesting storage_class values (probably csc_none, csc_extern, csc_static in your case).
/* It's a variable. */
/* An uninitialized decl with `extern' is a reference. */
int extern_ref = !initialized && storage_class == csc_extern;
type = c_build_qualified_type (type, type_quals, orig_qual_type,
orig_qual_indirect);
The experiment should tell you whether this is feasible from a performance/code size point of view and if it is, you might want to submit this as a proper upstream patch.
It's possible to do it by for example redefining all basic types as the same but with volatile specifier. Anyway if all variables in your code will be volatile I expect that size of your application will be larger than before optimizations.
My solution is: enable optimizations for part of the code. If your application got some functional architecture you can start enabling optimizations and test if this functionality works properly. It's much easier than optimize everything and analyze why nothing works.
I am a embedded developer and use volatile keyword when working with I/O ports. But my Project manager suggested using volatile keyword is harmful and has lot of draw backs, But i find in most of the cases volatile is useful in embedded programming, As per my knowledge volatile is harmful in kernel code as the changes to our code will become unpredictable. There are any drawbacks using volatile in Embedded Systems also?
No, volatile is not harmful. In any situation. Ever. There is no possible well-formed piece of code that will break with the addition of volatile to an object (and pointers to that object). However, volatile is often poorly understood. The reason the kernel docs state that volatile is to be considered harmful is that people kept using it for synchronization between kernel threads in broken ways. In particular, they used volatile integer variables as though access to them was guaranteed to be atomic, which it isn't.
volatile is also not useless, and particularly if you go bare-metal, you will need it. But, like any other tool, it is important to understand the semantics of volatile before using it.
What volatile is
Access to volatile objects is, in the standard, considered a side-effect in the same way as incrementing or decrementing by ++ and --. In particular, this means that 5.1.2.3 (3), which says
(...) An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects are produced (including any caused by calling a function or accessing a volatile object)
does not apply. The compiler has to chuck out everything it thinks it knows about the value of a volatile variable at every sequence point. (like other side-effects, when access to volatile objects happens is governed by sequence points)
The effect of this is largely the prohibition of certain optimizations. Take, for example, the code
int i;
void foo(void) {
i = 0;
while(i == 0) {
// do stuff that does not touch i
}
}
The compiler is allowed to make this into an infinite loop that never checks i again because it can deduce that the value of i is not changed in the loop, and thus that i == 0 will never be false. This holds true even if there is another thread or an interrupt handler that could conceivably change i. The compiler does not know about them, and it does not care. It is explicitly allowed to not care.
Contrast this with
int volatile i;
void foo(void) {
i = 0;
while(i == 0) { // Note: This is still broken, only a little less so.
// do stuff that does not touch i
}
}
Now the compiler has to assume that i can change at any time and cannot do this optimization. This means, of course, that if you deal with interrupt handlers and threads, volatile objects are necessary for synchronisation. They are not, however, sufficient.
What volatile isn't
What volatile does not guarantee is atomic access. This should make intuitive sense if you're used to embedded programming. Consider, if you will, the following piece of code for an 8-bit AVR MCU:
uint32_t volatile i;
ISR(TIMER0_OVF_vect) {
++i;
}
void some_function_in_the_main_loop(void) {
for(;;) {
do_something_with(i); // This is thoroughly broken.
}
}
The reason this code is broken is that access to i is not atomic -- cannot be atomic on an 8-bit MCU. In this simple case, for example, the following might happen:
i is 0x0000ffff
do_something_with(i) is about to be called
the high two bytes of i are copied into the parameter slot for this call
at this point, timer 0 overflows and the main loop is interrupted
the ISR changes i. The lower two bytes of i overflow and are now 0. i is now 0x00010000.
the main loop continues, and the lower two bytes of i are copied into the parameter slot
do_something_with is called with 0 as its parameter.
Similar things can happen on PCs and other platforms. If anything, more opportunities it can fail open up with a more complex architecture.
Takeaway
So no, using volatile is not bad, and you will (often) have to do it in bare-metal code. However, when you do use it, you have to keep in mind that it is not a magic wand, and that you will still have to make sure you don't trip over yourself. In embedded code, there's often a platform-specific way to handle the problem of atomicity; in the case of AVR, for example, the usual crowbar method is to disable interrupts for the duration, as in
uint32_t x;
ATOMIC_BLOCK(ATOMIC_RESTORESTATE) {
x = i;
}
do_something_with(x);
...where the ATOMIC_BLOCK macro calls cli() (disable interrupts) before and sei() (enable interrupts) afterwards if they were enabled beforehand.
With C11, which is the first C standard that explicitly acknowledges the existence of multithreading, a new family of atomic types and memory fencing operations have been introduced that can be used for inter-thread synchronisation and in many cases make use of volatile unnecessary. If you can use those, do it, but it'll likely be some time before they reach all common embedded toolchains. With them, the loop above could be fixed like this:
atomic_int i;
void foo(void) {
atomic_store(&i, 0);
while(atomic_load(&i) == 0) {
// do stuff that does not touch i
}
}
...in its most basic form. The precise semantics of the more relaxed memory order semantics go way beyond the scope of a SO answer, so I'll stick with the default sequentially consistent stuff here.
If you're interested in it, Gil Hamilton provided a link in the comments to an explanation of a lock-free stack implementation using C11 atomics, although I don't feel it's a terribly good write-up of the memory order semantics themselves. The C11 model does, however, appear to closely mirror the C++11 memory model, of which a useful presentation exists here. If I find a link to a C11-specific write-up, I will put it here later.
volatile is only useful when the so qualified object can change asynchronously. Such changes can happen
if the object is in fact an hardware IO register or similar that has changes external to your program
if the object might be changed by a signal handler
if the object is changed between calls to setjmp and longjmp
in all these cases you must declare your object volatile, otherwise your program will not work correctly. (And you might notice that objects shared between different threads is not in the list.)
In all other cases you shouldn't, because you may be missing optimization opportunities. On the other hand, qualifying an object volatile that doesn't fall under the points above will not make your code incorrect.
Not using volatile where necessary and appropriate is far more likely to be harmful! The solution to any perceived problems with volatile is not to ban its use, because there are a number of cases where it is necessary for safe and correct semantics. Rather the solution is to understand its purpose and its behaviour.
It is essential for any data that may be changed outside of the knowledge of the compiler, such as I/O and dual-ported or DMA memory. It is also necessary for access to memory shared between execution contexts such as threads and interrupt-handlers; this is where perhaps the confusion lies; it ensures an explicit read of such memory, and does not enforce atomicity or mutual exclusion - additional mechanisms are required for that, but that does not preclude volatile, but it is merely part of the solution to shared memory access.
See the following articles of the use of volatile (and send them to your project manager too!):
Place volatile accurately by Dan Saks.
Introduction to the volatile keyword by Nigel Jones
Guidelines for handling volatile variables by Colin Walls
Combining C's volatile and const keywords - Michael Barr
Volatile tells the compiler not to optimize anything that has to do with the volatile variable.
Why the "volatile" type class should not be used? - Best article in Kernel doc
https://www.kernel.org/doc/Documentation/volatile-considered-harmful.txt
volatile is a keyword in c which tell the compiler not to do any kind of optimization on that variable.
Let me give you a simple example:
int temp;
for ( i=0 ;i <5 ; i++ )
{
temp = 5;
}
what compiler will do to make the code optimized :
int temp;
temp = 5; /* assigned temp variable before the loop. */
for ( i=0 ;i <5 ; i++ )
{
}
But if we mention volatile keyword then compiler will not do any kind of optimization in temp variable.
volatile int temp;
for ( i=0 ;i <5 ; i++ )
{
temp = 5;
}
"Volatile Considered Harmful" ---> I don't consider volatile as harmful. You use volatile where you don't want any kind of optimization from compiler end.
For example consider this piece of code is used by a thermometer company and temp is a variable used to take the temperature of the atmosphere which can change anytime. So if we do not use volatile then compiler will do the optimization and the atmosphere temperature will always be same.
I'm using a lot of volatile variables in my embedded firmware, but most of the time there is only one point in a function where I need to be sure the value is recent (at the start). But the rest of the function is referring to the same variable-name, and the value can be changed in the mean time, producing very unexpected code flow / results. I know this can be solved by using a temporary variable inside the function, but I was looking for a better solution.
Now I was wondering, instead of marking the whole variable as volatile, is there a way I could instruct the compiler (gcc) with a special keyword that I want to read the variable as if it was marked volatile, so I can use that keyword only once at the beginning of the function?
I'm a little confused about the scenario - if it's that you want one particular access to a variable to be treated as volatile, use
dest = *(volatile TYPE *)&src;
where TYPE is the type of src. You may also need
asm volatile ("" ::: "memory");
in carefully controlled locations, to prevent the compiler from moving loads/stores of other memory locations across the volatile read.
Also investigate C11's _Atomic types. (I'm not sure if GCC supports these yet.)
If your variable is in memory and your embedded system supports it you could use memory barriers. To make sure that nothing accesses the memory while you are reading the value out.