Consider the following code:
// In the interrupt handler file:
volatile uint32_t gSampleIndex = 0; // declared 'extern'
void HandleSomeIrq()
{
gSampleIndex++;
}
// In some other file
void Process()
{
uint32_t localSampleIndex = gSampleIndex; // will this be optimized away?
PrevSample = RawSamples[(localSampleIndex + 0) % NUM_RAW_SAMPLE_BUFFERS];
CurrentSample = RawSamples[(localSampleIndex + 1) % NUM_RAW_SAMPLE_BUFFERS];
NextSample = RawSamples[(localSampleIndex + 2) % NUM_RAW_SAMPLE_BUFFERS];
}
My intention is that PrevSample, CurrentSample and NextSample are consistent, even if gSampleIndex is updated during the call to Process().
Will the assignment to the localSampleIndex do the trick, or is there any chance it will be optimized away even though gSampleIndex is volatile?
In principle, volatile is not enough to guarantee that Process only sees consistent values of gSampleIndex. In practice, however, you should not run into any issues if uinit32_t is directly supported by the hardware. The proper solution would be to use atomic accesses.
The problem
Suppose that you are running on a 16-bit architecture, so that the instruction
localSampleIndex = gSampleIndex;
gets compiled into two instructions (loading the upper half, loading the lower half). Then the interrupt might be called between the two instructions, and you'll get half of the old value combined with half of the new value.
The solution
The solution is to access gSampleCounter using atomic operations only. I know of three ways of doing that.
C11 atomics
In C11 (supported since GCC 4.9), you declare your variable as atomic:
#include <stdatomic.h>
atomic_uint gSampleIndex;
You then take care to only ever access the variable using the documented atomic interfaces. In the IRQ handler:
atomic_fetch_add(&gSampleIndex, 1);
and in the Process function:
localSampleIndex = atomic_load(gSampleIndex);
Do not bother with the _explicit variants of the atomic functions unless you're trying to get your program to scale across large numbers of cores.
GCC atomics
Even if your compiler does not support C11 yet, it probably has some support for atomic operations. For example, in GCC you can say:
volatile int gSampleIndex;
...
__atomic_add_fetch(&gSampleIndex, 1, __ATOMIC_SEQ_CST);
...
__atomic_load(&gSampleIndex, &localSampleIndex, __ATOMIC_SEQ_CST);
As above, do not bother with weak consistency unless you're trying to achieve good scaling behaviour.
Implementing atomic operations yourself
Since you're not trying to protect against concurrent access from multiple cores, just race conditions with an interrupt handler, it is possible to implement a consistency protocol using standard C primitives only. Dekker's algorithm is the oldest known such protocol.
In your function you access volatile variable just once (and it's the only volatile one in that function) so you don't need to worry about code reorganization that compiler may do (and volatile prevents). What standard says for these optimizations at ยง5.1.2.3 is:
In the abstract machine, all expressions are evaluated as specified by the semantics. An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects are produced (including any caused by calling a function or accessing a volatile object).
Note last sentence: "...no needed side effects are produced (...accessing a volatile object)".
Simply volatile will prevent any optimization compiler may do around that code. Just to mention few: no instruction reordering respect other volatile variables. no expression removing, no caching, no value propagation across functions.
BTW I doubt any compiler may break your code (with or without volatile). Maybe local stack variable will be elided but value will be stored in a registry (for sure it won't repeatedly access a memory location). What you need volatile for is value visibility.
EDIT
I think some clarification is needed.
Let me safely assume you know what you're doing (you're working with interrupt handlers so this shouldn't be your first C program): CPU word matches your variable type and memory is properly aligned.
Let me also assume your interrupt is not reentrant (some magic cli/sti stuff or whatever your CPU uses for this) unless you're planning some hard-time debugging and tuning.
If these assumptions are satisfied then you don't need atomic operations. Why? Because localSampleIndex = gSampleIndex is atomic (because it's properly aligned, word size matches and it's volatile), with ++gSampleIndex there isn't any race condition (HandleSomeIrq won't be called again while it's still in execution). More than useless they're wrong.
One may think: "OK, I may not need atomic but why I can't use them? Even if such assumption are satisfied this is an *extra* and it'll achieve same goal" . No, it doesn't. Atomic has not same semantic of volatile variables (and seldom volatile is/should be used outside memory mapped I/O and signal handling). Volatile (usually) is useless with atomic (unless a specific architecture says it is) but it has a great difference: visibility. When you update gSampleIndex in HandleSomeIrq standard guarantees that value will be immediately visible to all threads (and devices). with atomic_uint standard guarantees it'll be visible in a reasonable amount of time.
To make it short and clear: volatile and atomic are not the same thing. Atomic operations are useful for concurrency, volatile are useful for lower level stuff (interrupts, devices). If you're still thinking "hey they do *exactly* what I need" please read few useful links picked from comments: cache coherency and a nice reading about atomics.
To summarize:
In your case you may use an atomic variable with a lock (to have both atomic access and value visibility) but no one on this earth would put a lock inside an interrupt handler (unless absolutely definitely doubtless unquestionably needed, and from code you posted it's not your case).
I am making embedded firmware where everything after initialization happens in ISRs. I have variables that are shared between them, and I am wondering in which cases they need to be volatile. I never block, waiting for a change in another ISR.
When can I be certain that actual memory is read or written to when not using volatile? Once every ISR?
Addendum:
This is for ARM Cortex-M0, but this isn't really a question about ISRs as much as it is about compiler optimization, and as such, the platform shouldn't really be important.
The question is entirely answerable, and the answer is simple:
Without volatile you (simplistically) can't assume that actual memory access will ever happen - the compiler is free to conclude that results are either entirely unused (if that is apparent in what it can see), or that they may be safely cached in a register, or computed out of order (as long as visible dependencies are maintained).
You need volatile to tell the compiler that the side effects of access may be important to something the optimizer is unable to analyze, such as an interrupt context or the context of a different "thread".
In effect volatile is how you say to the compiler "I know something you don't, so don't try to be clever here"
Beware that volatile does not guarantee atomicity of read-modify-write, or of even read or write alone where the data type (or its misalignment) requires muti-step access. In those cases, you risk not just getting a stale value, but an entirely erroneous one.
it is already mentioned that actual write to memory/cache is not exactly predictable when using non volatile variables.
but it is also worth mentioning about another aspect where the volatile variable might get cached and might require a forced cache flush to write in to the actual memory ( depends on whether a write-through or a write-back policy is used).
consider another case where the volatile variable is not cached ( placed in non-cacheable area)
but due to the presence of write buffers and bus bridges sometimes it is not predictable when the real write happens to the intended register and it requires a dummy read to ensure that write actually happened to the real register/memory. This is particularly helpful to avoid race conditions in interrupt clearing/masking.
even though compilers are not supposed to be clever around volatile variables.it is free to do some optimizations with respect to volatile sequence points ( optimization across sequence points not permitted, but optimization between sequence points are permitted)
The variables that need volatile are:
1) Share data between ISR and program data or other threads.
Preferable these are flags that indicate block access to various data structures.
// main() code;
disable_interrupts();
if (flag == 0) {
flag = 1;
enable_interrupts();
Manipulate_data();
flag = 0;
} else {
enable_interrupts();
Cope_with_data_unavailable();
}
2) Memory mapped hardware registers.
This "memory" can change at any time due to hardware conditions and the complier needs to know that their values are not necessarily consistent. Without volatile, a naive comapiler would only sample fred once resulting in a potential endless loop.
volatile int *fred = 0x1234; // Hardware reg at address 0x1234;
while (*fred);
I read that in some cases (global variable, or while(variable), etc.) if the variables are not defined as volatile it may cause problems.
Would it cause a problem if I define all variables as volatile?
If something outside of the current scope or any subsequent child scope (think: function calls) can modify the variable you are working in (there's a timer interrupt that will increment your variable, you gave a reference to the var to some other code that might do something in response to an interrupt, etc) then the variable should be declared volatile.
volatile is a hint to the compiler that says, "something else might change this variable." and the compiler's response is, "Oh. OK. I will never trust a copy of this variable I have in a register or on the stack. Every time I need to use this variable I will read it from memory because my copy in a register could be out of date."
Declaring everything volatile will make your code slow down a lot and result in a much larger binary. Instead of doing this the correct answer is to understand what needs to be tagged volatile, why it does, and tagging appropriately.
A volatile variable must have its memory accesses honoured by the compiler.
This means that:
If you read the value of the variable, the compiler is not allowed to reuse a value it already "knows" about. For example, it might be in an external device register and subject to change by the peripheral hardware.
If you write something to the variable, the compiler must write to that memory. It is not allowed to look around and say "well, nothing else uses that variable, so I don't really need to write it". For example, it might again be an external peripheral register (maybe the Tx FIFO of a UART).
Note that volatile is not (always) sufficient for communication between threads (or between main loops and interrupt service routines: https://stackoverflow.com/a/2485177/106092. See also https://www.kernel.org/doc/Documentation/volatile-considered-harmful.txt
Only make things volatile which need to be. If you find yourself having to make things volatile to make them work it comes down to one of two things:
A compiler bug
Your code not being correct
In my experience it's almost always the latter, but you might need to consult a c-lawyer on why!
But in answer to your actual question, if you make everything volatile, the code should still work fine, although you may have performance limitations that you don't need to have!
A variable is said to be volatile if its value can change at any moment independently of the program. It is useful if another program (or thread), or an external event (keyboard, network ...) can modify the variable. It tells the compiler to reread the value of the variable from its original location each time the variable is accessed. It prevents the compiler to optimize memory access. So declaring each variable volatile may slow down the program.
By the way: I know nothing about specificity of AVR programming.
If you need to configure all variable as volatile , then there is some deep rooted design issue with your software. Yes, it will decrease by performance. But how much? We don't know unless you provide the spec of the CPU and its instructions.
I would like to determine if the below code would allow the CPU to fetch the unsafe_variable twice? Let's assume the compiler will not reorder or optimise the code because of the volatile and _ReadWriteBarrier (on VS). Mutex cannot be used here and I only care about the case of the potential double fetch.
I am not an expert in term of CPU design but what I am concerned about regarding a potential double fetch would be: Speculative execution (Performance optimization technique including branch prediction and prefetch techniques), Register and memory location renaming and the use of reorder buffer and store buffers within one or two CPUs? Please let me know if a double fetch here would be at all possible.
int function(void* Data) {
size_t _varSize = ((volatile DATA *)Data)->unsafe_variable;
// unsafe_variable is in some kind of shared memory and can change at any time
_ReadWriteBarrier();
// this does not prevent against CPU optimisations (MemoryBarrier would)
if (_varSize > x * y) { return FALSE;}
size_t size = _varSize - t * q;
function_xy(size);
return TRUE;
}
Accessing volatile memory location counts as a side-effect in the C standard C11 5.1.2.3/2:
"Accessing a volatile object, modifying an object, modifying a file, or calling a function
that does any of those operations are all side effects"
and the compiler is not allowed to generate code that causes additional side-effects. It is explicitly not allowed to generate code that causes additional access of volatile variables (C11 5.1.2.3/6).
But then you are using VC++ so all bets are off. It barely follows any standard. I would advise to use a strictly conforming C compiler instead, if possible.
Nothing in the C language specifies memory system behavior at this level nor anything about threads, so there is no certain answer.
To be certain you would need exact details of the CPU, OS, and compiler.
However, I doubt any modern architecture would fetch twice from memory to serve any of the purposes you mention, provided computation is not interrputed. Any reference after the first would be to cache.
However, if there is a context switch after the unsafe_variable fetch is begun but before _varSize can be used, it's imaginable that the fetch might be restarted when this thread continues.
Why is volatile needed in C? What is it used for? What will it do?
volatile tells the compiler not to optimize anything that has to do with the volatile variable.
There are at least three common reasons to use it, all involving situations where the value of the variable can change without action from the visible code:
When you interface with hardware that changes the value itself
when there's another thread running that also uses the variable
when there's a signal handler that might change the value of the variable.
Let's say you have a little piece of hardware that is mapped into RAM somewhere and that has two addresses: a command port and a data port:
typedef struct
{
int command;
int data;
int isBusy;
} MyHardwareGadget;
Now you want to send some command:
void SendCommand (MyHardwareGadget * gadget, int command, int data)
{
// wait while the gadget is busy:
while (gadget->isbusy)
{
// do nothing here.
}
// set data first:
gadget->data = data;
// writing the command starts the action:
gadget->command = command;
}
Looks easy, but it can fail because the compiler is free to change the order in which data and commands are written. This would cause our little gadget to issue commands with the previous data-value. Also take a look at the wait while busy loop. That one will be optimized out. The compiler will try to be clever, read the value of isBusy just once and then go into an infinite loop. That's not what you want.
The way to get around this is to declare the pointer gadget as volatile. This way the compiler is forced to do what you wrote. It can't remove the memory assignments, it can't cache variables in registers and it can't change the order of assignments either
This is the correct version:
void SendCommand (volatile MyHardwareGadget * gadget, int command, int data)
{
// wait while the gadget is busy:
while (gadget->isBusy)
{
// do nothing here.
}
// set data first:
gadget->data = data;
// writing the command starts the action:
gadget->command = command;
}
volatile in C actually came into existence for the purpose of not caching the values of the variable automatically. It will tell the compiler not to cache the value of this variable. So it will generate code to take the value of the given volatile variable from the main memory every time it encounters it. This mechanism is used because at any time the value can be modified by the OS or any interrupt. So using volatile will help us accessing the value afresh every time.
Another use for volatile is signal handlers. If you have code like this:
int quit = 0;
while (!quit)
{
/* very small loop which is completely visible to the compiler */
}
The compiler is allowed to notice the loop body does not touch the quit variable and convert the loop to a while (true) loop. Even if the quit variable is set on the signal handler for SIGINT and SIGTERM; the compiler has no way to know that.
However, if the quit variable is declared volatile, the compiler is forced to load it every time, because it can be modified elsewhere. This is exactly what you want in this situation.
volatile tells the compiler that your variable may be changed by other means, than the code that is accessing it. e.g., it may be a I/O-mapped memory location. If this is not specified in such cases, some variable accesses can be optimised, e.g., its contents can be held in a register, and the memory location not read back in again.
See this article by Andrei Alexandrescu, "volatile - Multithreaded Programmer's Best Friend"
The volatile keyword was
devised to prevent compiler
optimizations that might render code
incorrect in the presence of certain
asynchronous events. For example, if
you declare a primitive variable as
volatile, the compiler is not
permitted to cache it in a register --
a common optimization that would be
disastrous if that variable were
shared among multiple threads. So the
general rule is, if you have variables
of primitive type that must be shared
among multiple threads, declare those
variables volatile. But you can
actually do a lot more with this
keyword: you can use it to catch code
that is not thread safe, and you can
do so at compile time. This article
shows how it is done; the solution
involves a simple smart pointer that
also makes it easy to serialize
critical sections of code.
The article applies to both C and C++.
Also see the article "C++ and the Perils of Double-Checked Locking" by Scott Meyers and Andrei Alexandrescu:
So when dealing with some memory locations (e.g. memory mapped ports or memory referenced by ISRs [ Interrupt Service Routines ] ), some optimizations must be suspended. volatile exists for specifying special treatment for such locations, specifically: (1) the content of a volatile variable is "unstable" (can change by means unknown to the compiler), (2) all writes to volatile data are "observable" so they must be executed religiously, and (3) all operations on volatile data are executed in the sequence in which they appear in the source code. The first two rules ensure proper reading and writing. The last one allows implementation of I/O protocols that mix input and output. This is informally what C and C++'s volatile guarantees.
My simple explanation is:
In some scenarios, based on the logic or code, the compiler will do optimisation of variables which it thinks do not change. The volatile keyword prevents a variable being optimised.
For example:
bool usb_interface_flag = 0;
while(usb_interface_flag == 0)
{
// execute logic for the scenario where the USB isn't connected
}
From the above code, the compiler may think usb_interface_flag is defined as 0, and that in the while loop it will be zero forever. After optimisation, the compiler will treat it as while(true) all the time, resulting in an infinite loop.
To avoid these kinds of scenarios, we declare the flag as volatile, we are telling to compiler that this value may be changed by an external interface or other module of program, i.e., please don't optimise it. That's the use case for volatile.
A marginal use for volatile is the following. Say you want to compute the numerical derivative of a function f :
double der_f(double x)
{
static const double h = 1e-3;
return (f(x + h) - f(x)) / h;
}
The problem is that x+h-x is generally not equal to h due to roundoff errors. Think about it : when you substract very close numbers, you lose a lot of significant digits which can ruin the computation of the derivative (think 1.00001 - 1). A possible workaround could be
double der_f2(double x)
{
static const double h = 1e-3;
double hh = x + h - x;
return (f(x + hh) - f(x)) / hh;
}
but depending on your platform and compiler switches, the second line of that function may be wiped out by a aggressively optimizing compiler. So you write instead
volatile double hh = x + h;
hh -= x;
to force the compiler to read the memory location containing hh, forfeiting an eventual optimization opportunity.
There are two uses. These are specially used more often in embedded development.
Compiler will not optimise the functions that uses variables that are defined with volatile keyword
Volatile is used to access exact memory locations in RAM, ROM, etc... This is used more often to control memory-mapped devices, access CPU registers and locate specific memory locations.
See examples with assembly listing.
Re: Usage of C "volatile" Keyword in Embedded Development
I'll mention another scenario where volatiles are important.
Suppose you memory-map a file for faster I/O and that file can change behind the scenes (e.g. the file is not on your local hard drive, but is instead served over the network by another computer).
If you access the memory-mapped file's data through pointers to non-volatile objects (at the source code level), then the code generated by the compiler can fetch the same data multiple times without you being aware of it.
If that data happens to change, your program may become using two or more different versions of the data and get into an inconsistent state. This can lead not only to logically incorrect behavior of the program but also to exploitable security holes in it if it processes untrusted files or files from untrusted locations.
If you care about security, and you should, this is an important scenario to consider.
Volatile is also useful, when you want to force the compiler not to optimize a specific code sequence (e.g. for writing a micro-benchmark).
volatile means the storage is likely to change at anytime and be changed but something outside the control of the user program. This means that if you reference the variable, the program should always check the physical address (ie a mapped input fifo), and not use it in a cached way.
In the language designed by Dennis Ritchie, every access to any object, other than automatic objects whose address had not been taken, would behave as though it computed the address of the object and then read or wrote the storage at that address. This made the language very powerful, but severely limited optimization opportunities.
While it might have been possible to add a qualifier that would invite a compiler to assume that a particular object wouldn't be changed in weird ways, such an assumption would be appropriate for the vast majority of objects in C programs, and it would have been impractical to add a qualifier to all the objects for which such assumption would be appropriate. On the other hand, some programs need to use some objects for which such an assumption would not hold. To resolve this issue, the Standard says that compilers may assume that objects which are not declared volatile will not have their value observed or changed in ways that are outside the compiler's control, or would be outside a reasonable compiler's understanding.
Because various platforms may have different ways in which objects could be observed or modified outside a compiler's control, it is appropriate that quality compilers for those platforms should differ in their exact handling of volatile semantics. Unfortunately, because the Standard failed to suggest that quality compilers intended for low-level programming on a platform should handle volatile in a way that will recognize any and all relevant effects of a particular read/write operation on that platform, many compilers fall short of doing so in ways that make it harder to process things like background I/O in a way which is efficient but can't be broken by compiler "optimizations".
In my opinion, you should not expect too much from volatile. To illustrate, look at the example in Nils Pipenbrinck's highly-voted answer.
I would say, his example is not suitable for volatile. volatile is only used to:
prevent the compiler from making useful and desirable optimizations. It is nothing about the thread safe, atomic access or even memory order.
In that example:
void SendCommand (volatile MyHardwareGadget * gadget, int command, int data)
{
// wait while the gadget is busy:
while (gadget->isbusy)
{
// do nothing here.
}
// set data first:
gadget->data = data;
// writing the command starts the action:
gadget->command = command;
}
the gadget->data = data before gadget->command = command only is only guaranteed in compiled code by compiler. At running time, the processor still possibly reorders the data and command assignment, regarding to the processor architecture. The hardware could get the wrong data(suppose gadget is mapped to hardware I/O). The memory barrier is needed between data and command assignment.
In simple terms, it tells the compiler not to do any optimisation on a particular variable. Variables which are mapped to device register are modified indirectly by the device. In this case, volatile must be used.
The Wiki say everything about volatile:
volatile (computer programming)
And the Linux kernel's doc also make a excellent notation about volatile:
Why the "volatile" type class should not be used
A volatile can be changed from outside the compiled code (for example, a program may map a volatile variable to a memory mapped register.) The compiler won't apply certain optimizations to code that handles a volatile variable - for example, it won't load it into a register without writing it to memory. This is important when dealing with hardware registers.
As rightly suggested by many here, the volatile keyword's popular use is to skip the optimisation of the volatile variable.
The best advantage that comes to mind, and worth mentioning after reading about volatile is -- to prevent rolling back of the variable in case of a longjmp. A non-local jump.
What does this mean?
It simply means that the last value will be retained after you do stack unwinding, to return to some previous stack frame; typically in case of some erroneous scenario.
Since it'd be out of scope of this question, I am not going into details of setjmp/longjmp here, but it's worth reading about it; and how the volatility feature can be used to retain the last value.
it does not allows compiler to automatic changing values of variables. a volatile variable is for dynamic use.