code working fine without volatile? - c

Hi i wrote a code recently with following skeleton :
variable;
callback(){
//variable updated here
}
thread_function(){
//variable used
}
main(){
//callback registered
//thread registered
}
I found that whenever the variable is updated at callback, its automatically updated in the thread without declaring the variable as a volatile. well, I am not clear how it is managed. thank you in advance. between, callback() is called from a library being compiled with the code.

Ok, first of all, it is broken. You're just lucky your compiler created code that, by accident, seems to work (or maybe unlucky, because it could hide subtle bugs in corner cases).
I found that whenever the variable is updated at callback, its automatically updated in the thread without declaring the variable as a volatile. well, I am not clear how it is managed.
Starting from here: It is one and the same variable. Threads share the same address space. So, this is what you would naively expect.
volatile is about optimizations. A C compiler is free to do a lot of modifications to your code in order to make it faster, this includes reordering of statements, not reading an accessed variable because the same value is in a register, even "unrolling" loops and a lot more. Normally, the only restriction is that the observable behavior stays the same (google for it, I don't want to write a book here.)
Reading a variable from normal memory does not create any side effects, so it can be legally left out without changing behavior. C (prior to c11) does not know about threads. If a variable was not written to in some part of code, it is assumed to hold the same value as before.
Now, what volatile gives you is telling the compiler this is not normal memory but a location that could change outside of the control of your program (like e.g. a memory mapped I/O register). With volatile, the compiler is obliged to actually fetch the variable for any read operation and to actually store it for any write operation. It's also not allowed to reorder accesses between several volatile variables. But that's all and it is not enough for synchronizing threads. For that, you also need the guarantee that no other memory accesses or code execution is reordered around accesses of your variable, even with multi-processors -> you need a memory barrier.
Note this is different from e.g. java where volatile gives you what you need for thread synchronization. In c, use what the pthreads library gives you: mutexes, semaphores and condition variables.
tl;dr
Your code works correctly by accident. In c, volatile for threads is wrong (and unnecessary), use synchronization primitives as provided by pthreads.

Related

Volatile function argument in C (STM32F4 example) [duplicate]

Why is volatile needed in C? What is it used for? What will it do?
volatile tells the compiler not to optimize anything that has to do with the volatile variable.
There are at least three common reasons to use it, all involving situations where the value of the variable can change without action from the visible code:
When you interface with hardware that changes the value itself
when there's another thread running that also uses the variable
when there's a signal handler that might change the value of the variable.
Let's say you have a little piece of hardware that is mapped into RAM somewhere and that has two addresses: a command port and a data port:
typedef struct
{
int command;
int data;
int isBusy;
} MyHardwareGadget;
Now you want to send some command:
void SendCommand (MyHardwareGadget * gadget, int command, int data)
{
// wait while the gadget is busy:
while (gadget->isbusy)
{
// do nothing here.
}
// set data first:
gadget->data = data;
// writing the command starts the action:
gadget->command = command;
}
Looks easy, but it can fail because the compiler is free to change the order in which data and commands are written. This would cause our little gadget to issue commands with the previous data-value. Also take a look at the wait while busy loop. That one will be optimized out. The compiler will try to be clever, read the value of isBusy just once and then go into an infinite loop. That's not what you want.
The way to get around this is to declare the pointer gadget as volatile. This way the compiler is forced to do what you wrote. It can't remove the memory assignments, it can't cache variables in registers and it can't change the order of assignments either
This is the correct version:
void SendCommand (volatile MyHardwareGadget * gadget, int command, int data)
{
// wait while the gadget is busy:
while (gadget->isBusy)
{
// do nothing here.
}
// set data first:
gadget->data = data;
// writing the command starts the action:
gadget->command = command;
}
volatile in C actually came into existence for the purpose of not caching the values of the variable automatically. It will tell the compiler not to cache the value of this variable. So it will generate code to take the value of the given volatile variable from the main memory every time it encounters it. This mechanism is used because at any time the value can be modified by the OS or any interrupt. So using volatile will help us accessing the value afresh every time.
Another use for volatile is signal handlers. If you have code like this:
int quit = 0;
while (!quit)
{
/* very small loop which is completely visible to the compiler */
}
The compiler is allowed to notice the loop body does not touch the quit variable and convert the loop to a while (true) loop. Even if the quit variable is set on the signal handler for SIGINT and SIGTERM; the compiler has no way to know that.
However, if the quit variable is declared volatile, the compiler is forced to load it every time, because it can be modified elsewhere. This is exactly what you want in this situation.
volatile tells the compiler that your variable may be changed by other means, than the code that is accessing it. e.g., it may be a I/O-mapped memory location. If this is not specified in such cases, some variable accesses can be optimised, e.g., its contents can be held in a register, and the memory location not read back in again.
See this article by Andrei Alexandrescu, "volatile - Multithreaded Programmer's Best Friend"
The volatile keyword was
devised to prevent compiler
optimizations that might render code
incorrect in the presence of certain
asynchronous events. For example, if
you declare a primitive variable as
volatile, the compiler is not
permitted to cache it in a register --
a common optimization that would be
disastrous if that variable were
shared among multiple threads. So the
general rule is, if you have variables
of primitive type that must be shared
among multiple threads, declare those
variables volatile. But you can
actually do a lot more with this
keyword: you can use it to catch code
that is not thread safe, and you can
do so at compile time. This article
shows how it is done; the solution
involves a simple smart pointer that
also makes it easy to serialize
critical sections of code.
The article applies to both C and C++.
Also see the article "C++ and the Perils of Double-Checked Locking" by Scott Meyers and Andrei Alexandrescu:
So when dealing with some memory locations (e.g. memory mapped ports or memory referenced by ISRs [ Interrupt Service Routines ] ), some optimizations must be suspended. volatile exists for specifying special treatment for such locations, specifically: (1) the content of a volatile variable is "unstable" (can change by means unknown to the compiler), (2) all writes to volatile data are "observable" so they must be executed religiously, and (3) all operations on volatile data are executed in the sequence in which they appear in the source code. The first two rules ensure proper reading and writing. The last one allows implementation of I/O protocols that mix input and output. This is informally what C and C++'s volatile guarantees.
My simple explanation is:
In some scenarios, based on the logic or code, the compiler will do optimisation of variables which it thinks do not change. The volatile keyword prevents a variable being optimised.
For example:
bool usb_interface_flag = 0;
while(usb_interface_flag == 0)
{
// execute logic for the scenario where the USB isn't connected
}
From the above code, the compiler may think usb_interface_flag is defined as 0, and that in the while loop it will be zero forever. After optimisation, the compiler will treat it as while(true) all the time, resulting in an infinite loop.
To avoid these kinds of scenarios, we declare the flag as volatile, we are telling to compiler that this value may be changed by an external interface or other module of program, i.e., please don't optimise it. That's the use case for volatile.
A marginal use for volatile is the following. Say you want to compute the numerical derivative of a function f :
double der_f(double x)
{
static const double h = 1e-3;
return (f(x + h) - f(x)) / h;
}
The problem is that x+h-x is generally not equal to h due to roundoff errors. Think about it : when you substract very close numbers, you lose a lot of significant digits which can ruin the computation of the derivative (think 1.00001 - 1). A possible workaround could be
double der_f2(double x)
{
static const double h = 1e-3;
double hh = x + h - x;
return (f(x + hh) - f(x)) / hh;
}
but depending on your platform and compiler switches, the second line of that function may be wiped out by a aggressively optimizing compiler. So you write instead
volatile double hh = x + h;
hh -= x;
to force the compiler to read the memory location containing hh, forfeiting an eventual optimization opportunity.
There are two uses. These are specially used more often in embedded development.
Compiler will not optimise the functions that uses variables that are defined with volatile keyword
Volatile is used to access exact memory locations in RAM, ROM, etc... This is used more often to control memory-mapped devices, access CPU registers and locate specific memory locations.
See examples with assembly listing.
Re: Usage of C "volatile" Keyword in Embedded Development
I'll mention another scenario where volatiles are important.
Suppose you memory-map a file for faster I/O and that file can change behind the scenes (e.g. the file is not on your local hard drive, but is instead served over the network by another computer).
If you access the memory-mapped file's data through pointers to non-volatile objects (at the source code level), then the code generated by the compiler can fetch the same data multiple times without you being aware of it.
If that data happens to change, your program may become using two or more different versions of the data and get into an inconsistent state. This can lead not only to logically incorrect behavior of the program but also to exploitable security holes in it if it processes untrusted files or files from untrusted locations.
If you care about security, and you should, this is an important scenario to consider.
Volatile is also useful, when you want to force the compiler not to optimize a specific code sequence (e.g. for writing a micro-benchmark).
volatile means the storage is likely to change at anytime and be changed but something outside the control of the user program. This means that if you reference the variable, the program should always check the physical address (ie a mapped input fifo), and not use it in a cached way.
In the language designed by Dennis Ritchie, every access to any object, other than automatic objects whose address had not been taken, would behave as though it computed the address of the object and then read or wrote the storage at that address. This made the language very powerful, but severely limited optimization opportunities.
While it might have been possible to add a qualifier that would invite a compiler to assume that a particular object wouldn't be changed in weird ways, such an assumption would be appropriate for the vast majority of objects in C programs, and it would have been impractical to add a qualifier to all the objects for which such assumption would be appropriate. On the other hand, some programs need to use some objects for which such an assumption would not hold. To resolve this issue, the Standard says that compilers may assume that objects which are not declared volatile will not have their value observed or changed in ways that are outside the compiler's control, or would be outside a reasonable compiler's understanding.
Because various platforms may have different ways in which objects could be observed or modified outside a compiler's control, it is appropriate that quality compilers for those platforms should differ in their exact handling of volatile semantics. Unfortunately, because the Standard failed to suggest that quality compilers intended for low-level programming on a platform should handle volatile in a way that will recognize any and all relevant effects of a particular read/write operation on that platform, many compilers fall short of doing so in ways that make it harder to process things like background I/O in a way which is efficient but can't be broken by compiler "optimizations".
In my opinion, you should not expect too much from volatile. To illustrate, look at the example in Nils Pipenbrinck's highly-voted answer.
I would say, his example is not suitable for volatile. volatile is only used to:
prevent the compiler from making useful and desirable optimizations. It is nothing about the thread safe, atomic access or even memory order.
In that example:
void SendCommand (volatile MyHardwareGadget * gadget, int command, int data)
{
// wait while the gadget is busy:
while (gadget->isbusy)
{
// do nothing here.
}
// set data first:
gadget->data = data;
// writing the command starts the action:
gadget->command = command;
}
the gadget->data = data before gadget->command = command only is only guaranteed in compiled code by compiler. At running time, the processor still possibly reorders the data and command assignment, regarding to the processor architecture. The hardware could get the wrong data(suppose gadget is mapped to hardware I/O). The memory barrier is needed between data and command assignment.
In simple terms, it tells the compiler not to do any optimisation on a particular variable. Variables which are mapped to device register are modified indirectly by the device. In this case, volatile must be used.
The Wiki say everything about volatile:
volatile (computer programming)
And the Linux kernel's doc also make a excellent notation about volatile:
Why the "volatile" type class should not be used
A volatile can be changed from outside the compiled code (for example, a program may map a volatile variable to a memory mapped register.) The compiler won't apply certain optimizations to code that handles a volatile variable - for example, it won't load it into a register without writing it to memory. This is important when dealing with hardware registers.
As rightly suggested by many here, the volatile keyword's popular use is to skip the optimisation of the volatile variable.
The best advantage that comes to mind, and worth mentioning after reading about volatile is -- to prevent rolling back of the variable in case of a longjmp. A non-local jump.
What does this mean?
It simply means that the last value will be retained after you do stack unwinding, to return to some previous stack frame; typically in case of some erroneous scenario.
Since it'd be out of scope of this question, I am not going into details of setjmp/longjmp here, but it's worth reading about it; and how the volatility feature can be used to retain the last value.
it does not allows compiler to automatic changing values of variables. a volatile variable is for dynamic use.

Is it bad if all variables are defined as volatile on AVR programming?

I read that in some cases (global variable, or while(variable), etc.) if the variables are not defined as volatile it may cause problems.
Would it cause a problem if I define all variables as volatile?
If something outside of the current scope or any subsequent child scope (think: function calls) can modify the variable you are working in (there's a timer interrupt that will increment your variable, you gave a reference to the var to some other code that might do something in response to an interrupt, etc) then the variable should be declared volatile.
volatile is a hint to the compiler that says, "something else might change this variable." and the compiler's response is, "Oh. OK. I will never trust a copy of this variable I have in a register or on the stack. Every time I need to use this variable I will read it from memory because my copy in a register could be out of date."
Declaring everything volatile will make your code slow down a lot and result in a much larger binary. Instead of doing this the correct answer is to understand what needs to be tagged volatile, why it does, and tagging appropriately.
A volatile variable must have its memory accesses honoured by the compiler.
This means that:
If you read the value of the variable, the compiler is not allowed to reuse a value it already "knows" about. For example, it might be in an external device register and subject to change by the peripheral hardware.
If you write something to the variable, the compiler must write to that memory. It is not allowed to look around and say "well, nothing else uses that variable, so I don't really need to write it". For example, it might again be an external peripheral register (maybe the Tx FIFO of a UART).
Note that volatile is not (always) sufficient for communication between threads (or between main loops and interrupt service routines: https://stackoverflow.com/a/2485177/106092. See also https://www.kernel.org/doc/Documentation/volatile-considered-harmful.txt
Only make things volatile which need to be. If you find yourself having to make things volatile to make them work it comes down to one of two things:
A compiler bug
Your code not being correct
In my experience it's almost always the latter, but you might need to consult a c-lawyer on why!
But in answer to your actual question, if you make everything volatile, the code should still work fine, although you may have performance limitations that you don't need to have!
A variable is said to be volatile if its value can change at any moment independently of the program. It is useful if another program (or thread), or an external event (keyboard, network ...) can modify the variable. It tells the compiler to reread the value of the variable from its original location each time the variable is accessed. It prevents the compiler to optimize memory access. So declaring each variable volatile may slow down the program.
By the way: I know nothing about specificity of AVR programming.
If you need to configure all variable as volatile , then there is some deep rooted design issue with your software. Yes, it will decrease by performance. But how much? We don't know unless you provide the spec of the CPU and its instructions.

One thread reading and another writing to volatile variable - thread-safe?

In C I have a pointer that is declared volatile and initialized null.
void* volatile pvoid;
Thread 1 is occasionally reading the pointer value to check if it is non-null. Thread 1 will not set the value of the pointer.
Thread 2 will set the value of a pointer just once.
I believe I can get away without using a mutex or condition variable.
Is there any reason thread 1 will read a corrupted value or thread 2 will write a corrupted value?
To make it thread safe, you have to make atomic reads/writes to the variable, it being volatile is not safe in all timing situations. Under Win32 there are the Interlocked functions, under Linux you can build it yourself with assembly if you do not want to use the heavy weight mutexes and conditional variables.
If you are not against GPL then http://www.threadingbuildingblocks.org and its atomic<> template seems promising. The lib is cross platform.
In the case where the value fits in a single register, such as a memory aligned pointer, this is safe. In other cases where it might take more than one instruction to read or write the value, the read thread could get corrupted data. If you are not sure wether the read and write will take a single instruction in all usage scenarios, use atomic reads and writes.
Depends on your compiler, architecture and operating system. POSIX (since this question was tagged pthreads Im assuming we're not talking about windows or some other threading model) and C don't give enough constraints to have a portable answer to this question.
The safe assumption is of course to protect the access to the pointer with a mutex. However based on your description of the problem I wonder if pthread_once wouldn't be a better way to go. Granted there's not enough information in the question to say one way or the other.
Unfortunately, you cannot portably make any assumptions about what is atomic in pure C.
GCC, however, does provide some atomic built-in functions that take care of using the proper instructions for many architectures for you. See Chapter 5.47 of the GCC manual for more information.
Well this seems fine.. The only problem will happen in this case
let thread A be your checking thread and B the modifying one..
The thing is that checking for equality is not atomic technically first the values should be copied to registers then checked and then restored. Lets assume that thread A has copied to register, now B decides to change the value , now the value of your variable changes. So when control goes back to A it will say it is not null even though it SHUD be according to when the thread was called. This seems harmless in this program but MIGHT cause problems..
Use a mutex.. simple enuf.. and u can be sure you dont have sync errors!
On most platforms where a pointer value can be read/written in a single instruction, it either set or it isn't set yet. It can't be interrupted in the middle and contain a corrupted value. A mutex isn't needed on that kind of platform.

Why is volatile needed in C?

Why is volatile needed in C? What is it used for? What will it do?
volatile tells the compiler not to optimize anything that has to do with the volatile variable.
There are at least three common reasons to use it, all involving situations where the value of the variable can change without action from the visible code:
When you interface with hardware that changes the value itself
when there's another thread running that also uses the variable
when there's a signal handler that might change the value of the variable.
Let's say you have a little piece of hardware that is mapped into RAM somewhere and that has two addresses: a command port and a data port:
typedef struct
{
int command;
int data;
int isBusy;
} MyHardwareGadget;
Now you want to send some command:
void SendCommand (MyHardwareGadget * gadget, int command, int data)
{
// wait while the gadget is busy:
while (gadget->isbusy)
{
// do nothing here.
}
// set data first:
gadget->data = data;
// writing the command starts the action:
gadget->command = command;
}
Looks easy, but it can fail because the compiler is free to change the order in which data and commands are written. This would cause our little gadget to issue commands with the previous data-value. Also take a look at the wait while busy loop. That one will be optimized out. The compiler will try to be clever, read the value of isBusy just once and then go into an infinite loop. That's not what you want.
The way to get around this is to declare the pointer gadget as volatile. This way the compiler is forced to do what you wrote. It can't remove the memory assignments, it can't cache variables in registers and it can't change the order of assignments either
This is the correct version:
void SendCommand (volatile MyHardwareGadget * gadget, int command, int data)
{
// wait while the gadget is busy:
while (gadget->isBusy)
{
// do nothing here.
}
// set data first:
gadget->data = data;
// writing the command starts the action:
gadget->command = command;
}
volatile in C actually came into existence for the purpose of not caching the values of the variable automatically. It will tell the compiler not to cache the value of this variable. So it will generate code to take the value of the given volatile variable from the main memory every time it encounters it. This mechanism is used because at any time the value can be modified by the OS or any interrupt. So using volatile will help us accessing the value afresh every time.
Another use for volatile is signal handlers. If you have code like this:
int quit = 0;
while (!quit)
{
/* very small loop which is completely visible to the compiler */
}
The compiler is allowed to notice the loop body does not touch the quit variable and convert the loop to a while (true) loop. Even if the quit variable is set on the signal handler for SIGINT and SIGTERM; the compiler has no way to know that.
However, if the quit variable is declared volatile, the compiler is forced to load it every time, because it can be modified elsewhere. This is exactly what you want in this situation.
volatile tells the compiler that your variable may be changed by other means, than the code that is accessing it. e.g., it may be a I/O-mapped memory location. If this is not specified in such cases, some variable accesses can be optimised, e.g., its contents can be held in a register, and the memory location not read back in again.
See this article by Andrei Alexandrescu, "volatile - Multithreaded Programmer's Best Friend"
The volatile keyword was
devised to prevent compiler
optimizations that might render code
incorrect in the presence of certain
asynchronous events. For example, if
you declare a primitive variable as
volatile, the compiler is not
permitted to cache it in a register --
a common optimization that would be
disastrous if that variable were
shared among multiple threads. So the
general rule is, if you have variables
of primitive type that must be shared
among multiple threads, declare those
variables volatile. But you can
actually do a lot more with this
keyword: you can use it to catch code
that is not thread safe, and you can
do so at compile time. This article
shows how it is done; the solution
involves a simple smart pointer that
also makes it easy to serialize
critical sections of code.
The article applies to both C and C++.
Also see the article "C++ and the Perils of Double-Checked Locking" by Scott Meyers and Andrei Alexandrescu:
So when dealing with some memory locations (e.g. memory mapped ports or memory referenced by ISRs [ Interrupt Service Routines ] ), some optimizations must be suspended. volatile exists for specifying special treatment for such locations, specifically: (1) the content of a volatile variable is "unstable" (can change by means unknown to the compiler), (2) all writes to volatile data are "observable" so they must be executed religiously, and (3) all operations on volatile data are executed in the sequence in which they appear in the source code. The first two rules ensure proper reading and writing. The last one allows implementation of I/O protocols that mix input and output. This is informally what C and C++'s volatile guarantees.
My simple explanation is:
In some scenarios, based on the logic or code, the compiler will do optimisation of variables which it thinks do not change. The volatile keyword prevents a variable being optimised.
For example:
bool usb_interface_flag = 0;
while(usb_interface_flag == 0)
{
// execute logic for the scenario where the USB isn't connected
}
From the above code, the compiler may think usb_interface_flag is defined as 0, and that in the while loop it will be zero forever. After optimisation, the compiler will treat it as while(true) all the time, resulting in an infinite loop.
To avoid these kinds of scenarios, we declare the flag as volatile, we are telling to compiler that this value may be changed by an external interface or other module of program, i.e., please don't optimise it. That's the use case for volatile.
A marginal use for volatile is the following. Say you want to compute the numerical derivative of a function f :
double der_f(double x)
{
static const double h = 1e-3;
return (f(x + h) - f(x)) / h;
}
The problem is that x+h-x is generally not equal to h due to roundoff errors. Think about it : when you substract very close numbers, you lose a lot of significant digits which can ruin the computation of the derivative (think 1.00001 - 1). A possible workaround could be
double der_f2(double x)
{
static const double h = 1e-3;
double hh = x + h - x;
return (f(x + hh) - f(x)) / hh;
}
but depending on your platform and compiler switches, the second line of that function may be wiped out by a aggressively optimizing compiler. So you write instead
volatile double hh = x + h;
hh -= x;
to force the compiler to read the memory location containing hh, forfeiting an eventual optimization opportunity.
There are two uses. These are specially used more often in embedded development.
Compiler will not optimise the functions that uses variables that are defined with volatile keyword
Volatile is used to access exact memory locations in RAM, ROM, etc... This is used more often to control memory-mapped devices, access CPU registers and locate specific memory locations.
See examples with assembly listing.
Re: Usage of C "volatile" Keyword in Embedded Development
I'll mention another scenario where volatiles are important.
Suppose you memory-map a file for faster I/O and that file can change behind the scenes (e.g. the file is not on your local hard drive, but is instead served over the network by another computer).
If you access the memory-mapped file's data through pointers to non-volatile objects (at the source code level), then the code generated by the compiler can fetch the same data multiple times without you being aware of it.
If that data happens to change, your program may become using two or more different versions of the data and get into an inconsistent state. This can lead not only to logically incorrect behavior of the program but also to exploitable security holes in it if it processes untrusted files or files from untrusted locations.
If you care about security, and you should, this is an important scenario to consider.
Volatile is also useful, when you want to force the compiler not to optimize a specific code sequence (e.g. for writing a micro-benchmark).
volatile means the storage is likely to change at anytime and be changed but something outside the control of the user program. This means that if you reference the variable, the program should always check the physical address (ie a mapped input fifo), and not use it in a cached way.
In the language designed by Dennis Ritchie, every access to any object, other than automatic objects whose address had not been taken, would behave as though it computed the address of the object and then read or wrote the storage at that address. This made the language very powerful, but severely limited optimization opportunities.
While it might have been possible to add a qualifier that would invite a compiler to assume that a particular object wouldn't be changed in weird ways, such an assumption would be appropriate for the vast majority of objects in C programs, and it would have been impractical to add a qualifier to all the objects for which such assumption would be appropriate. On the other hand, some programs need to use some objects for which such an assumption would not hold. To resolve this issue, the Standard says that compilers may assume that objects which are not declared volatile will not have their value observed or changed in ways that are outside the compiler's control, or would be outside a reasonable compiler's understanding.
Because various platforms may have different ways in which objects could be observed or modified outside a compiler's control, it is appropriate that quality compilers for those platforms should differ in their exact handling of volatile semantics. Unfortunately, because the Standard failed to suggest that quality compilers intended for low-level programming on a platform should handle volatile in a way that will recognize any and all relevant effects of a particular read/write operation on that platform, many compilers fall short of doing so in ways that make it harder to process things like background I/O in a way which is efficient but can't be broken by compiler "optimizations".
In my opinion, you should not expect too much from volatile. To illustrate, look at the example in Nils Pipenbrinck's highly-voted answer.
I would say, his example is not suitable for volatile. volatile is only used to:
prevent the compiler from making useful and desirable optimizations. It is nothing about the thread safe, atomic access or even memory order.
In that example:
void SendCommand (volatile MyHardwareGadget * gadget, int command, int data)
{
// wait while the gadget is busy:
while (gadget->isbusy)
{
// do nothing here.
}
// set data first:
gadget->data = data;
// writing the command starts the action:
gadget->command = command;
}
the gadget->data = data before gadget->command = command only is only guaranteed in compiled code by compiler. At running time, the processor still possibly reorders the data and command assignment, regarding to the processor architecture. The hardware could get the wrong data(suppose gadget is mapped to hardware I/O). The memory barrier is needed between data and command assignment.
In simple terms, it tells the compiler not to do any optimisation on a particular variable. Variables which are mapped to device register are modified indirectly by the device. In this case, volatile must be used.
The Wiki say everything about volatile:
volatile (computer programming)
And the Linux kernel's doc also make a excellent notation about volatile:
Why the "volatile" type class should not be used
A volatile can be changed from outside the compiled code (for example, a program may map a volatile variable to a memory mapped register.) The compiler won't apply certain optimizations to code that handles a volatile variable - for example, it won't load it into a register without writing it to memory. This is important when dealing with hardware registers.
As rightly suggested by many here, the volatile keyword's popular use is to skip the optimisation of the volatile variable.
The best advantage that comes to mind, and worth mentioning after reading about volatile is -- to prevent rolling back of the variable in case of a longjmp. A non-local jump.
What does this mean?
It simply means that the last value will be retained after you do stack unwinding, to return to some previous stack frame; typically in case of some erroneous scenario.
Since it'd be out of scope of this question, I am not going into details of setjmp/longjmp here, but it's worth reading about it; and how the volatility feature can be used to retain the last value.
it does not allows compiler to automatic changing values of variables. a volatile variable is for dynamic use.

Using C/Pthreads: do shared variables need to be volatile?

In the C programming language and Pthreads as the threading library; do variables/structures that are shared between threads need to be declared as volatile? Assuming that they might be protected by a lock or not (barriers perhaps).
Does the pthread POSIX standard have any say about this, is this compiler-dependent or neither?
Edit to add: Thanks for the great answers. But what if you're not using locks; what if you're using barriers for example? Or code that uses primitives such as compare-and-swap to directly and atomically modify a shared variable...
As long as you are using locks to control access to the variable, you do not need volatile on it. In fact, if you're putting volatile on any variable you're probably already wrong.
https://software.intel.com/en-us/blogs/2007/11/30/volatile-almost-useless-for-multi-threaded-programming/
The answer is absolutely, unequivocally, NO. You do not need to use 'volatile' in addition to proper synchronization primitives. Everything that needs to be done are done by these primitives.
The use of 'volatile' is neither necessary nor sufficient. It's not necessary because the proper synchronization primitives are sufficient. It's not sufficient because it only disables some optimizations, not all of the ones that might bite you. For example, it does not guarantee either atomicity or visibility on another CPU.
But unless you use volatile, the compiler is free to cache the shared data in a register for any length of time... if you want your data to be written to be predictably written to actual memory and not just cached in a register by the compiler at its discretion, you will need to mark it as volatile. Alternatively, if you only access the shared data after you have left a function modifying it, you might be fine. But I would suggest not relying on blind luck to make sure that values are written back from registers to memory.
Right, but even if you do use volatile, the CPU is free to cache the shared data in a write posting buffer for any length of time. The set of optimizations that can bite you is not precisely the same as the set of optimizations that 'volatile' disables. So if you use 'volatile', you are relying on blind luck.
On the other hand, if you use sychronization primitives with defined multi-threaded semantics, you are guaranteed that things will work. As a plus, you don't take the huge performance hit of 'volatile'. So why not do things that way?
I think one very important property of volatile is that it makes the variable be written to memory when modified, and reread from memory each time it accessed. The other answers here mix volatile and synchronization, and it is clear from some other answers than this that volatile is NOT a sync primitive (credit where credit is due).
But unless you use volatile, the compiler is free to cache the shared data in a register for any length of time... if you want your data to be written to be predictably written to actual memory and not just cached in a register by the compiler at its discretion, you will need to mark it as volatile. Alternatively, if you only access the shared data after you have left a function modifying it, you might be fine. But I would suggest not relying on blind luck to make sure that values are written back from registers to memory.
Especially on register-rich machines (i.e., not x86), variables can live for quite long periods in registers, and a good compiler can cache even parts of structures or entire structures in registers. So you should use volatile, but for performance, also copy values to local variables for computation and then do an explicit write-back. Essentially, using volatile efficiently means doing a bit of load-store thinking in your C code.
In any case, you positively have to use some kind of OS-level provided sync mechanism to create a correct program.
For an example of the weakness of volatile, see my Decker's algorithm example at http://jakob.engbloms.se/archives/65, which proves pretty well that volatile does not work to synchronize.
There is a widespread notion that the keyword volatile is good for multi-threaded programming.
Hans Boehm points out that there are only three portable uses for volatile:
volatile may be used to mark local variables in the same scope as a setjmp whose value should be preserved across a longjmp. It is unclear what fraction of such uses would be slowed down, since the atomicity and ordering constraints have no effect if there is no way to share the local variable in question. (It is even unclear what fraction of such uses would be slowed down by requiring all variables to be preserved across a longjmp, but that is a separate matter and is not considered here.)
volatile may be used when variables may be "externally modified", but the modification in fact is triggered synchronously by the thread itself, e.g. because the underlying memory is mapped at multiple locations.
A volatile sigatomic_t may be used to communicate with a signal handler in the same thread, in a restricted manner. One could consider weakening the requirements for the sigatomic_t case, but that seems rather counterintuitive.
If you are multi-threading for the sake of speed, slowing down code is definitely not what you want. For multi-threaded programming, there two key issues that volatile is often mistakenly thought to address:
atomicity
memory consistency, i.e. the order of a thread's operations as seen by another thread.
Let's deal with (1) first. Volatile does not guarantee atomic reads or writes. For example, a volatile read or write of a 129-bit structure is not going to be atomic on most modern hardware. A volatile read or write of a 32-bit int is atomic on most modern hardware, but volatile has nothing to do with it. It would likely be atomic without the volatile. The atomicity is at the whim of the compiler. There's nothing in the C or C++ standards that says it has to be atomic.
Now consider issue (2). Sometimes programmers think of volatile as turning off optimization of volatile accesses. That's largely true in practice. But that's only the volatile accesses, not the non-volatile ones. Consider this fragment:
volatile int Ready;
int Message[100];
void foo( int i ) {
Message[i/10] = 42;
Ready = 1;
}
It's trying to do something very reasonable in multi-threaded programming: write a message and then send it to another thread. The other thread will wait until Ready becomes non-zero and then read Message. Try compiling this with "gcc -O2 -S" using gcc 4.0, or icc. Both will do the store to Ready first, so it can be overlapped with the computation of i/10. The reordering is not a compiler bug. It's an aggressive optimizer doing its job.
You might think the solution is to mark all your memory references volatile. That's just plain silly. As the earlier quotes say, it will just slow down your code. Worst yet, it might not fix the problem. Even if the compiler does not reorder the references, the hardware might. In this example, x86 hardware will not reorder it. Neither will an Itanium(TM) processor, because Itanium compilers insert memory fences for volatile stores. That's a clever Itanium extension. But chips like Power(TM) will reorder. What you really need for ordering are memory fences, also called memory barriers. A memory fence prevents reordering of memory operations across the fence, or in some cases, prevents reordering in one direction.Volatile has nothing to do with memory fences.
So what's the solution for multi-threaded programming? Use a library or language extension that implements the atomic and fence semantics. When used as intended, the operations in the library will insert the right fences. Some examples:
POSIX threads
Windows(TM) threads
OpenMP
TBB
Based on article by Arch Robison (Intel)
In my experience, no; you just have to properly mutex yourself when you write to those values, or structure your program such that the threads will stop before they need to access data that depends on another thread's actions. My project, x264, uses this method; threads share an enormous amount of data but the vast majority of it doesn't need mutexes because its either read-only or a thread will wait for the data to become available and finalized before it needs to access it.
Now, if you have many threads that are all heavily interleaved in their operations (they depend on each others' output on a very fine-grained level), this may be a lot harder--in fact, in such a case I'd consider revisiting the threading model to see if it can possibly be done more cleanly with more separation between threads.
NO.
Volatile is only required when reading a memory location that can change independently of the CPU read/write commands. In the situation of threading, the CPU is in full control of read/writes to memory for each thread, therefore the compiler can assume the memory is coherent and optimizes the CPU instructions to reduce unnecessary memory access.
The primary usage for volatile is for accessing memory-mapped I/O. In this case, the underlying device can change the value of a memory location independently from CPU. If you do not use volatile under this condition, the CPU may use a previously cached memory value, instead of reading the newly updated value.
POSIX 7 guarantees that functions such as pthread_lock also synchronize memory
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html#tag_04_11 "4.12 Memory Synchronization" says:
The following functions synchronize memory with respect to other threads:
pthread_barrier_wait()
pthread_cond_broadcast()
pthread_cond_signal()
pthread_cond_timedwait()
pthread_cond_wait()
pthread_create()
pthread_join()
pthread_mutex_lock()
pthread_mutex_timedlock()
pthread_mutex_trylock()
pthread_mutex_unlock()
pthread_spin_lock()
pthread_spin_trylock()
pthread_spin_unlock()
pthread_rwlock_rdlock()
pthread_rwlock_timedrdlock()
pthread_rwlock_timedwrlock()
pthread_rwlock_tryrdlock()
pthread_rwlock_trywrlock()
pthread_rwlock_unlock()
pthread_rwlock_wrlock()
sem_post()
sem_timedwait()
sem_trywait()
sem_wait()
semctl()
semop()
wait()
waitpid()
Therefore if your variable is guarded between pthread_mutex_lock and pthread_mutex_unlock then it does not need further synchronization as you might attempt to provide with volatile.
Related questions:
Does guarding a variable with a pthread mutex guarantee it's also not cached?
Does pthread_mutex_lock contains memory fence instruction?
Volatile would only be useful if you need absolutely no delay between when one thread writes something and another thread reads it. Without some sort of lock, though, you have no idea of when the other thread wrote the data, only that it's the most recent possible value.
For simple values (int and float in their various sizes) a mutex might be overkill if you don't need an explicit synch point. If you don't use a mutex or lock of some sort, you should declare the variable volatile. If you use a mutex you're all set.
For complicated types, you must use a mutex. Operations on them are non-atomic, so you could read a half-changed version without a mutex.
Volatile means that we have to go to memory to get or set this value. If you don't set volatile, the compiled code might store the data in a register for a long time.
What this means is that you should mark variables that you share between threads as volatile so that you don't have situations where one thread starts modifying the value but doesn't write its result before a second thread comes along and tries to read the value.
Volatile is a compiler hint that disables certain optimizations. The output assembly of the compiler might have been safe without it but you should always use it for shared values.
This is especially important if you are NOT using the expensive thread sync objects provided by your system - you might for example have a data structure where you can keep it valid with a series of atomic changes. Many stacks that do not allocate memory are examples of such data structures, because you can add a value to the stack then move the end pointer or remove a value from the stack after moving the end pointer. When implementing such a structure, volatile becomes crucial to ensure that your atomic instructions are actually atomic.
The underlying reason is that the C language semantic is based upon a single-threaded abstract machine. And the compiler is within its own right to transform the program as long as the program's 'observable behaviors' on the abstract machine stay unchanged. It can merge adjacent or overlapping memory accesses, redo a memory access multiple times (upon register spilling for example), or simply discard a memory access, if it thinks the program's behaviors, when executed in a single thread, doesn't change. Therefore as you may suspect, the behaviors do change if the program is actually supposed to be executing in a multi-threaded way.
As Paul Mckenney pointed out in a famous Linux kernel document:
It _must_not_ be assumed that the compiler will do what you want
with memory references that are not protected by READ_ONCE() and
WRITE_ONCE(). Without them, the compiler is within its rights to
do all sorts of "creative" transformations, which are covered in
the COMPILER BARRIER section.
READ_ONCE() and WRITE_ONCE() are defined as volatile casts on referenced variables. Thus:
int y;
int x = READ_ONCE(y);
is equivalent to:
int y;
int x = *(volatile int *)&y;
So, unless you make a 'volatile' access, you are not assured that the access happens exactly once, no matter what synchronization mechanism you are using. Calling an external function (pthread_mutex_lock for example) may force the compiler do memory accesses to global variables. But this happens only when the compiler fails to figure out whether the external function changes these global variables or not. Modern compilers employing sophisticated inter-procedure analysis and link-time optimization make this trick simply useless.
In summary, you should mark variables shared by multiple threads volatile or access them using volatile casts.
As Paul McKenney has also pointed out:
I have seen the glint in their eyes when they discuss optimization techniques that you would not want your children to know about!
But see what happens to C11/C++11.
Some people obviously are assuming that the compiler treats the synchronization calls as memory barriers. "Casey" is assuming there is exactly one CPU.
If the sync primitives are external functions and the symbols in question are visible outside the compilation unit (global names, exported pointer, exported function that may modify them) then the compiler will treat them -- or any other external function call -- as a memory fence with respect to all externally visible objects.
Otherwise, you are on your own. And volatile may be the best tool available for making the compiler produce correct, fast code. It generally won't be portable though, when you need volatile and what it actually does for you depends a lot on the system and compiler.
No.
First, volatile is not necessary. There are numerous other operations that provide guaranteed multithreaded semantics that don't use volatile. These include atomic operations, mutexes, and so on.
Second, volatile is not sufficient. The C standard does not provide any guarantees about multithreaded behavior for variables declared volatile.
So being neither necessary nor sufficient, there's not much point in using it.
One exception would be particular platforms (such as Visual Studio) where it does have documented multithreaded semantics.
Variables that are shared among threads should be declared 'volatile'. This tells the
compiler that when one thread writes to such variables, the write should be to memory
(as opposed to a register).

Resources