C99: Restricted Pointers to Document Thread Safety? - c

This question isn't about the technical usage of restricted, more about the subjective usage. Although I might be mistaken as to how restricted technically works, in which case you should feel free to grill me for basing a question on a false premise.
Here are two examples of how I'm using restricted so far:
If I have a function that takes a pointer to a sequence of immutable chars, I don't say it's restricted, since other people are allowed to access the data via their own pointers at the same time as the function's executing, e.g. from another parallel thread. The data isn't being modified, so no problem.
However, if the function takes a pointer to a sequence of mutable chars that it might modify, I say it's restricted because the data absolutely should not be accessed in anyway from any pointer (bar the argument the function uses, obviously) during the execution of the function due to potentially inconsistent data. It also states the possibility of the data being modified, so the coder knows not to read stale data and that they should use a memory barrier when accessing or whatever...
I don't code much C, so I could easily be wrong about what I'm assuming here. Is this correct usage of restrict? Is it worth doing in this scenario?
I'm also assuming that once the restricted pointer is popped off the stack when the function returns, that the data can then freely be accessed via any other pointer again, and that the restriction only lasts as long as the restricted pointer. I know that this relies on the coder following the rules, since accessing a restricted data via an 'unofficial' pointer is UB.
Have I got all of this right?
EDIT:
I'd just like to make clear that I already know it does absolutely nothing to prevent the users from accessing data via multiple threads, and I also know that C89 has no knowledge of what 'threads' even are.
But given that any context where an argument can be modified via reference, it's clear that it mustn't be accessed as the function is running. This doesn't do anything to enforce thread safety, but it does clearly document that you modify the data through your own pointer during the execution of the function at your own risk.
Even if threading is taken completely out of the equation, you still allow for further optimizations in a scenario where it seems correct to me.
Even so, thanks for all your authoritative answers so far. Do I upvote all the answers that I liked, or just the one that I accept? What if more than one is accepted? Sorry, I'm new here, I'll look through the FAQ more thoroughly now...

restrict has nothing to do with thread safety. In fact, the existing C standards have nothing to say on the topic of threads at all; from the point of view of the spec, there is no such thing as a "thread".
restrict is a way to inform the compiler about aliasing. Pointers often make it hard for the compiler to generate efficient code, because the compiler cannot know at compile time whether two pointers actually refer to the same memory. Toy example:
void foo(int *x, int *y) {
*x = 5;
*y = 7;
printf("%d\n", *x);
}
When the compiler processes this function, it has no idea whether x and y refer to the same memory location. Therefore it does not know whether it will print 5 or 7, and it has to emit code to actually read *x before calling printf.
But if you declare x as int *restrict x, the compiler can prove that this function prints 5, so it can feed a compile-time constant to the printf call.
Many such optimizations become possible with restrict, especially when you are talking about operations on arrays.
But none of this has anything to do with threads. To get multi-treading applications right, you need proper synchronization primitives like mutexes, condition variables, and memory barriers... All of which are specific to your platform, and none of which have anything to do with restrict.
[edit]
To answer your question about using restrict as a form of documentation, I would say "no" to that, also.
You seem to be thinking that you should document when a variable can or cannot be concurrently accessed. But for shared data, the proper discipline is (almost always) to ensure that it is never concurrently accessed.
The relevant documentation for a variable is whether it is shared at all and which mutex protects it. Any code accessing that variable for any reason, even just to read it, needs to hold the mutex. The author should neither know nor care whether some other thread might or might not happen to be accessing the variable concurrently... Because that other thread will be obeying the same discipline, and the mutex guarantees there is no concurrent access.
This discipline is simple, it works, and it scales, which is why it is one of the dominant paradigms for sharing data between threads. (The other is message passing.) If you ever find yourself trying to reason "do I really need to lock the mutex this time?", you are almost certainly doing something wrong. It would be hard to overstate this point.

No, I don't think that this is a good dialect to provide any information about acces from different threads. It is meant as assertions about pointers that a particular peace of code gets for different pointers it handles. Threading is not part of the language, yet. Thread safe acces to data needs much more, restrict is not the right tool. volatile isn't a guarantee, which you sometimes see proposed as well.
mutexes or other lock structures
memory fences that ensure data integrity
atomic operations and types
The upcoming standard C1x is supposed to provide such constructs. C99 isn't.

The canonical example of restrict is memcpy() - which the manpage on OS X 10.6 gives the prototype as:
void* memcpy(void *restrict s1, const void *restrict s2, size_t n);
The source and destination regions in memcpy are not permitted to overlap; therefore by labelling them as restrict this restriction is enforced - the compiler can know that no part of the source array aliases with the destination; can it can do things like read a large chunk of the source and then write it into the destination.
Essentially restrict is about assisting compiler obtimizations by tagging pointers as not aliasing - in and of itself it doeen't help with thread safety - it doesn't automatically cause the object pointed to be to locked while the function is called.
See How to use the restrict qualified in C and the wikipedia article on restrict for a more lengthy discussion.

restrict is a hint to the compiler that the buffer accessed via a pointer is not aliased via another pointer in scope. So if you have a function like:
foo(char *restrict datai, char *restrict dataj)
{
// we've "promised" the compiler that it should not worry about overlap between
// datai and dataj buffers, this gives the compiler the opportunity to generate
// "better" code, potentially mitigating load-store issues, amongst other things
}
To not use restrict is not enough to provide guarded access in a multi-threaded application though. If you have a shared buffer simultaneously accessed by multiple threads via char * parameters in a read/write way you would potentially need to use some kind of lock/mutex etc - the absence of restrict does not imply thread safety.
Hope this helps.

Related

Volatile function argument in C (STM32F4 example) [duplicate]

Why is volatile needed in C? What is it used for? What will it do?
volatile tells the compiler not to optimize anything that has to do with the volatile variable.
There are at least three common reasons to use it, all involving situations where the value of the variable can change without action from the visible code:
When you interface with hardware that changes the value itself
when there's another thread running that also uses the variable
when there's a signal handler that might change the value of the variable.
Let's say you have a little piece of hardware that is mapped into RAM somewhere and that has two addresses: a command port and a data port:
typedef struct
{
int command;
int data;
int isBusy;
} MyHardwareGadget;
Now you want to send some command:
void SendCommand (MyHardwareGadget * gadget, int command, int data)
{
// wait while the gadget is busy:
while (gadget->isbusy)
{
// do nothing here.
}
// set data first:
gadget->data = data;
// writing the command starts the action:
gadget->command = command;
}
Looks easy, but it can fail because the compiler is free to change the order in which data and commands are written. This would cause our little gadget to issue commands with the previous data-value. Also take a look at the wait while busy loop. That one will be optimized out. The compiler will try to be clever, read the value of isBusy just once and then go into an infinite loop. That's not what you want.
The way to get around this is to declare the pointer gadget as volatile. This way the compiler is forced to do what you wrote. It can't remove the memory assignments, it can't cache variables in registers and it can't change the order of assignments either
This is the correct version:
void SendCommand (volatile MyHardwareGadget * gadget, int command, int data)
{
// wait while the gadget is busy:
while (gadget->isBusy)
{
// do nothing here.
}
// set data first:
gadget->data = data;
// writing the command starts the action:
gadget->command = command;
}
volatile in C actually came into existence for the purpose of not caching the values of the variable automatically. It will tell the compiler not to cache the value of this variable. So it will generate code to take the value of the given volatile variable from the main memory every time it encounters it. This mechanism is used because at any time the value can be modified by the OS or any interrupt. So using volatile will help us accessing the value afresh every time.
Another use for volatile is signal handlers. If you have code like this:
int quit = 0;
while (!quit)
{
/* very small loop which is completely visible to the compiler */
}
The compiler is allowed to notice the loop body does not touch the quit variable and convert the loop to a while (true) loop. Even if the quit variable is set on the signal handler for SIGINT and SIGTERM; the compiler has no way to know that.
However, if the quit variable is declared volatile, the compiler is forced to load it every time, because it can be modified elsewhere. This is exactly what you want in this situation.
volatile tells the compiler that your variable may be changed by other means, than the code that is accessing it. e.g., it may be a I/O-mapped memory location. If this is not specified in such cases, some variable accesses can be optimised, e.g., its contents can be held in a register, and the memory location not read back in again.
See this article by Andrei Alexandrescu, "volatile - Multithreaded Programmer's Best Friend"
The volatile keyword was
devised to prevent compiler
optimizations that might render code
incorrect in the presence of certain
asynchronous events. For example, if
you declare a primitive variable as
volatile, the compiler is not
permitted to cache it in a register --
a common optimization that would be
disastrous if that variable were
shared among multiple threads. So the
general rule is, if you have variables
of primitive type that must be shared
among multiple threads, declare those
variables volatile. But you can
actually do a lot more with this
keyword: you can use it to catch code
that is not thread safe, and you can
do so at compile time. This article
shows how it is done; the solution
involves a simple smart pointer that
also makes it easy to serialize
critical sections of code.
The article applies to both C and C++.
Also see the article "C++ and the Perils of Double-Checked Locking" by Scott Meyers and Andrei Alexandrescu:
So when dealing with some memory locations (e.g. memory mapped ports or memory referenced by ISRs [ Interrupt Service Routines ] ), some optimizations must be suspended. volatile exists for specifying special treatment for such locations, specifically: (1) the content of a volatile variable is "unstable" (can change by means unknown to the compiler), (2) all writes to volatile data are "observable" so they must be executed religiously, and (3) all operations on volatile data are executed in the sequence in which they appear in the source code. The first two rules ensure proper reading and writing. The last one allows implementation of I/O protocols that mix input and output. This is informally what C and C++'s volatile guarantees.
My simple explanation is:
In some scenarios, based on the logic or code, the compiler will do optimisation of variables which it thinks do not change. The volatile keyword prevents a variable being optimised.
For example:
bool usb_interface_flag = 0;
while(usb_interface_flag == 0)
{
// execute logic for the scenario where the USB isn't connected
}
From the above code, the compiler may think usb_interface_flag is defined as 0, and that in the while loop it will be zero forever. After optimisation, the compiler will treat it as while(true) all the time, resulting in an infinite loop.
To avoid these kinds of scenarios, we declare the flag as volatile, we are telling to compiler that this value may be changed by an external interface or other module of program, i.e., please don't optimise it. That's the use case for volatile.
A marginal use for volatile is the following. Say you want to compute the numerical derivative of a function f :
double der_f(double x)
{
static const double h = 1e-3;
return (f(x + h) - f(x)) / h;
}
The problem is that x+h-x is generally not equal to h due to roundoff errors. Think about it : when you substract very close numbers, you lose a lot of significant digits which can ruin the computation of the derivative (think 1.00001 - 1). A possible workaround could be
double der_f2(double x)
{
static const double h = 1e-3;
double hh = x + h - x;
return (f(x + hh) - f(x)) / hh;
}
but depending on your platform and compiler switches, the second line of that function may be wiped out by a aggressively optimizing compiler. So you write instead
volatile double hh = x + h;
hh -= x;
to force the compiler to read the memory location containing hh, forfeiting an eventual optimization opportunity.
There are two uses. These are specially used more often in embedded development.
Compiler will not optimise the functions that uses variables that are defined with volatile keyword
Volatile is used to access exact memory locations in RAM, ROM, etc... This is used more often to control memory-mapped devices, access CPU registers and locate specific memory locations.
See examples with assembly listing.
Re: Usage of C "volatile" Keyword in Embedded Development
I'll mention another scenario where volatiles are important.
Suppose you memory-map a file for faster I/O and that file can change behind the scenes (e.g. the file is not on your local hard drive, but is instead served over the network by another computer).
If you access the memory-mapped file's data through pointers to non-volatile objects (at the source code level), then the code generated by the compiler can fetch the same data multiple times without you being aware of it.
If that data happens to change, your program may become using two or more different versions of the data and get into an inconsistent state. This can lead not only to logically incorrect behavior of the program but also to exploitable security holes in it if it processes untrusted files or files from untrusted locations.
If you care about security, and you should, this is an important scenario to consider.
Volatile is also useful, when you want to force the compiler not to optimize a specific code sequence (e.g. for writing a micro-benchmark).
volatile means the storage is likely to change at anytime and be changed but something outside the control of the user program. This means that if you reference the variable, the program should always check the physical address (ie a mapped input fifo), and not use it in a cached way.
In the language designed by Dennis Ritchie, every access to any object, other than automatic objects whose address had not been taken, would behave as though it computed the address of the object and then read or wrote the storage at that address. This made the language very powerful, but severely limited optimization opportunities.
While it might have been possible to add a qualifier that would invite a compiler to assume that a particular object wouldn't be changed in weird ways, such an assumption would be appropriate for the vast majority of objects in C programs, and it would have been impractical to add a qualifier to all the objects for which such assumption would be appropriate. On the other hand, some programs need to use some objects for which such an assumption would not hold. To resolve this issue, the Standard says that compilers may assume that objects which are not declared volatile will not have their value observed or changed in ways that are outside the compiler's control, or would be outside a reasonable compiler's understanding.
Because various platforms may have different ways in which objects could be observed or modified outside a compiler's control, it is appropriate that quality compilers for those platforms should differ in their exact handling of volatile semantics. Unfortunately, because the Standard failed to suggest that quality compilers intended for low-level programming on a platform should handle volatile in a way that will recognize any and all relevant effects of a particular read/write operation on that platform, many compilers fall short of doing so in ways that make it harder to process things like background I/O in a way which is efficient but can't be broken by compiler "optimizations".
In my opinion, you should not expect too much from volatile. To illustrate, look at the example in Nils Pipenbrinck's highly-voted answer.
I would say, his example is not suitable for volatile. volatile is only used to:
prevent the compiler from making useful and desirable optimizations. It is nothing about the thread safe, atomic access or even memory order.
In that example:
void SendCommand (volatile MyHardwareGadget * gadget, int command, int data)
{
// wait while the gadget is busy:
while (gadget->isbusy)
{
// do nothing here.
}
// set data first:
gadget->data = data;
// writing the command starts the action:
gadget->command = command;
}
the gadget->data = data before gadget->command = command only is only guaranteed in compiled code by compiler. At running time, the processor still possibly reorders the data and command assignment, regarding to the processor architecture. The hardware could get the wrong data(suppose gadget is mapped to hardware I/O). The memory barrier is needed between data and command assignment.
In simple terms, it tells the compiler not to do any optimisation on a particular variable. Variables which are mapped to device register are modified indirectly by the device. In this case, volatile must be used.
The Wiki say everything about volatile:
volatile (computer programming)
And the Linux kernel's doc also make a excellent notation about volatile:
Why the "volatile" type class should not be used
A volatile can be changed from outside the compiled code (for example, a program may map a volatile variable to a memory mapped register.) The compiler won't apply certain optimizations to code that handles a volatile variable - for example, it won't load it into a register without writing it to memory. This is important when dealing with hardware registers.
As rightly suggested by many here, the volatile keyword's popular use is to skip the optimisation of the volatile variable.
The best advantage that comes to mind, and worth mentioning after reading about volatile is -- to prevent rolling back of the variable in case of a longjmp. A non-local jump.
What does this mean?
It simply means that the last value will be retained after you do stack unwinding, to return to some previous stack frame; typically in case of some erroneous scenario.
Since it'd be out of scope of this question, I am not going into details of setjmp/longjmp here, but it's worth reading about it; and how the volatility feature can be used to retain the last value.
it does not allows compiler to automatic changing values of variables. a volatile variable is for dynamic use.

Setting Global variables in thread

I need to have a string as a global variable. There is a possibility for multiple threads to set the global variable. Should I have to go for mutex for this? Or will OS handle such actions.
Going for mutex affects the application performance.
I am not concerned about the order of actions happening. I am afraid of the data corruption.
Could somebody let me know about this.
It sounds like you understand all of the concerns. If the global variable can be corrupt you definitely need to lock it in a mutex. This will affect performance, since this part is by definition now going to be synchronous. That being said, you will want to lock the smallest part of the code as necessary, to minimize the time that synchronous code is being called.
What's your global variable? A pointer to the string buffer, or the buffer itself?
On many architectures (including AFAIR 32-bit x86) overwriting a single pointer is atomic.
This example might work:
volatile char **global_var;
void set_var(char *str) {
char *tmp = strdup(str);
global_var = &tmp;
}
You can use Thread-Local Storage for this.
Unfortunately, its not specified in current C99 standart, but possible will be there in C1X. For now, you can use compiler-specific implementations (GCC, ICC and Visual C have it).
As far as the standards are concerned, yes, you must use a mutex. Failure to do so results in undefined behavior. In practice, most machine architectures will have no problem with this. Future versions of the C standard (C1x) will have atomic types which, if used here, would definitely make the assignment without lock safe (albeit possibly using an internal lock, on broken archs that lack real atomics).

Following pointers in a multithreaded environment

If I have some code that looks something like:
typedef struct {
bool some_flag;
pthread_cond_t c;
pthread_mutex_t m;
} foo_t;
// I assume the mutex has already been locked, and will be unlocked
// some time after this function returns. For clarity. Definitely not
// out of laziness ;)
void check_flag(foo_t* f) {
while(f->flag)
pthread_cond_wait(&f->c, &f->m);
}
Is there anything in the C standard preventing an optimizer from rewriting check_flag as:
void check_flag(foo_t* f) {
bool cache = f->flag;
while(cache)
pthread_cond_wait(&f->c, &f->m);
}
In other words, does the generated code have to follow the f pointer every time through the loop, or is the compiler free to pull the dereference out?
If it is free to pull it out, is there any way to prevent this? Do I need to sprinkle a volatile keyword somewhere? It can't be check_flag's parameter because I plan on having other variables in this struct that I don't mind the compiler optimizing like this.
Might I have to resort to:
void check_flag(foo_t* f) {
volatile bool* cache = &f->some_flag;
while(*cache)
pthread_cond_wait(&f->c, &f->m);
}
In the general case, even if multi-threading wasn't involved and your loop looked like:
void check_flag(foo_t* f) {
while(f->flag)
foo(&f->c, &f->m);
}
the compiler would be unable to to cache the f->flag test. That's because the compiler can't know whether or not a function (like foo() above) might change whatever object f is pointing to.
Under special circumstances (foo() is visible to the compiler, and all pointers passed to the check_flag() are known not to be aliased or otherwise modifiable by foo()) the compiler might be able to optimize the check.
However, pthread_cond_wait() must be implemented in a way that would prevent that optimization.
See Does guarding a variable with a pthread mutex guarantee it's also not cached?:
You might also be interested in Steve Jessop's answer to: Can a C/C++ compiler legally cache a variable in a register across a pthread library call?
But how far you want to take the issues raised by Boehm's paper in your own work is up to you. As far as I can tell, if you want to take the stand that pthreads doesn't/can't make the guarantee, then you're in essence taking the stand that pthreads is useless (or at least provides no safety guarantees, which I think by reduction has the same outcome). While this might be true in the strictest sense (as addressed in the paper), it's also probably not a useful answer. I'm not sure what option you'd have other than pthreads on Unix-based platforms.
Normally, you should try to lock the pthread mutex before waiting on the condition object as the pthread_cond_wait call release the mutex (and reacquire it before returning). So, your check_flag function should be rewritten like that to conform to the semantic on the pthread condition.
void check_flag(foo_t* f) {
pthread_mutex_lock(&f->m);
while(f->flag)
pthread_cond_wait(&f->c, &f->m);
pthread_mutex_unlock(&f->m);
}
Concerning the question of whether or not the compiler is allowed to optimize the reading of the flagfield, this answer explains it in more detail than I can.
Basically, the compiler know about the semantic of pthread_cond_wait, pthread_mutex_lock and pthread_mutex_unlock. He know that he can't optimize memory reading in those situation (the call to pthread_cond_wait in this exemple). There is no notion of memory barrier here, just a special knowledge of certain function, and some rule to follow in their presence.
There is another thing protecting you from optimization performed by the processor. Your average processor is capable of reordering memory access (read / write) provided that the semantic is conserved, and it is always doing it (as it allow to increase performance). However, this break when more than one processor can access the same memory address. A memory barrier is just an instruction to the processor telling it that it can move the read / write that were issued before the barrier and execute them after the barrier. It has finish them now.
As written, the compiler is free to cache the result as you describe or even in a more subtle way - by putting it into a register. You can prevent this optimization from taking place by making the variable volatile. But that is not necessarily enough - you should not code it this way! You should use condition variables as prescribed (lock, wait, unlock).
Trying to do work around the library is bad, but it gets worse. Perhaps reading Hans Boehm's paper on the general topic from PLDI 2005 ("Threads Cannot be Implemented as a Library"), or many of his follow-on articles (which lead up to work on a revised C++ memory model) will put the fear of God in you and steer you back to the straight and narrow :).
Volatile is for this purpose. Relying on the compiler to know about pthread coding practices seems a little nuts to me, although; compilers are pretty smart these days. In fact, the compiler probably sees that you are looping to test a variable and won't cache it in a register for that reason, not because it sees you using pthreads. Just use volatile if you really care.
Kind of funny little note. We have a VOLATILE #define that is either "volatile" (when we think the bug can't possibly be our code...) or blank. When we think we have a crash due to the optimizer killing us, we #define it "volatile" which puts volatile in front of almost everything. We then test to see if the problem goes away. So far... the bugs have been the developer and not the compiler! who'd have thought!? We have developed a high performance "non locking" and "non blocking" threading library. We have a test platform that hammers it to the point of thousands of races per second. So fare, we have never detected a problem needing volatile! So far gcc has never cached a shared variable in a register. yah...we are surprised too. We are still waiting for our chance to use volatile!

Why doesn't POSIX mmap return a volatile void*?

Mmap returns a void*, but not a volatile void*. If I'm using mmap to map shared memory, then another process could be writing to that memory, which means two subsequent reads from the same memory location can yield different values -- the exact situation volatile is meant for. So why doesn't it return a volatile void*?
My best guess is that if you have a process that's exclusively writing to the shared memory segment, it doesn't need to look at the shared memory through volatile pointers because it will always have the right understanding of what's present; any optimizations the compiler does to prevent redundant reads won't matter since there is nothing else writing and changing the values under its feet. Or is there some other historical reason? I'm inclined to say returning volatile void* would be a safer default, and those wanting this optimization could then manually cast to void*.
POSIX mmap description: http://opengroup.org/onlinepubs/007908775/xsh/mmap.html
Implementing shared memory is only one small subset of the uses of mmap(). In fact the most common uses are creating private mappings, both anonymous and file-backed. This means that, even if we accepted your contention about requiring a volatile-qualified pointer for shared memory access, such a qualifier would be superfluous in the general case.
Remember that you can always add final qualifiers to a pointer type without casting, but you can't remove them. So, with the current mmap() declaration, you can do both this:
volatile char *foo = mmap(); /* I need volatile */
and this:
char *bar = mmap(); /* But _I_ do not */
With your suggestion, the users in the common case would have to cast the volatile away.
The deeply-held assumption running through many software systems is that most programmers are sequential programmers. This has only recently started to change.
mmap has dozens of uses not related to shared memory. In the event that a programmer is writing a multithreaded program, they must take their own steps to ensure safety. Protecting each variable with a mutex is not the default. Likewise, mmap does not assume that another thread will make contentious accesses to the same shared-memory segment, or even that a segment so mapped will be accessible by another thread.
I'm also unconvinced that marking the return of mmap as volatile will have an effect on this. A programmer would still have to ensure safety in access to the mapped region, no?
Being volatile would only cover a single read (which depending on the architecture might be 32 bit or something else, and thus be quite limiting. Often you'll need to write more than 1 machine word, and you'll anyway have to introduce some sort of locking.
Even if it were volatile, you could easily have 2 processes reading different values from the same memory, all it takes is a 3. process to write to the memory in the nanosecond between the read from the 1. process and the read from the 2. process(unless you can guarantee the 2 processes reading the same memory within almost exact the same clock cycles.
Thus - it's pretty useless for mmap() to try to deal with these things, and is better left up to the programmer how to deal with access to the memory and mark the pointer as volatile where needed - if the memory is shared - you will need to have all partys involved be cooperative and aware of how they can update the memory in relation to eachother - something out of scope of mmap, and something volative will not solve.
I don't think volatile does what you think it does.
Basically, it just tells the compiler not to optimize the variable by storing its value in a register. This forces it to retrieve the value each time you reference it, which is a good idea if another thread (or whatever) could have updated it in the interim.
The function returns a void*, but it's not going to be updated, so calling it volatile is meaningless. Even if you assigned the value to a local volatile void*, nothing would be gained.
The type volatile void * or void * volatile is nonsensical: you cannot dereference a void *, so it doesn't make sense to specify type qualifiers to it.
And, since you anyway need a cast to char * or whatever your data type, then perhaps that is the right place to specify volatility. Thus, the API as defined nicely side-steps the responsibility of marking the memory changable-under-your-feet/volatile.
That said, from a big picture POV, I agree with you: mmap should have a return type stating that the compiler should not cache this range.
It's probably done that way for performance reasons, providing nothing extra by default. If you know that on your particular architecture that writes/reads won't be reordered by the processor you may not need volatile at all (possibly in conjuction with other synchronization). EDIT: this was just an example - there may be a variety of other cases where you know that you don't need to force a reread every time the memory is accessed.
If you need to ensure that all the addresses are read from memory each time they're accessed, const_cast (or C-style cast) volatile onto the return value yourself.

Why is volatile needed in C?

Why is volatile needed in C? What is it used for? What will it do?
volatile tells the compiler not to optimize anything that has to do with the volatile variable.
There are at least three common reasons to use it, all involving situations where the value of the variable can change without action from the visible code:
When you interface with hardware that changes the value itself
when there's another thread running that also uses the variable
when there's a signal handler that might change the value of the variable.
Let's say you have a little piece of hardware that is mapped into RAM somewhere and that has two addresses: a command port and a data port:
typedef struct
{
int command;
int data;
int isBusy;
} MyHardwareGadget;
Now you want to send some command:
void SendCommand (MyHardwareGadget * gadget, int command, int data)
{
// wait while the gadget is busy:
while (gadget->isbusy)
{
// do nothing here.
}
// set data first:
gadget->data = data;
// writing the command starts the action:
gadget->command = command;
}
Looks easy, but it can fail because the compiler is free to change the order in which data and commands are written. This would cause our little gadget to issue commands with the previous data-value. Also take a look at the wait while busy loop. That one will be optimized out. The compiler will try to be clever, read the value of isBusy just once and then go into an infinite loop. That's not what you want.
The way to get around this is to declare the pointer gadget as volatile. This way the compiler is forced to do what you wrote. It can't remove the memory assignments, it can't cache variables in registers and it can't change the order of assignments either
This is the correct version:
void SendCommand (volatile MyHardwareGadget * gadget, int command, int data)
{
// wait while the gadget is busy:
while (gadget->isBusy)
{
// do nothing here.
}
// set data first:
gadget->data = data;
// writing the command starts the action:
gadget->command = command;
}
volatile in C actually came into existence for the purpose of not caching the values of the variable automatically. It will tell the compiler not to cache the value of this variable. So it will generate code to take the value of the given volatile variable from the main memory every time it encounters it. This mechanism is used because at any time the value can be modified by the OS or any interrupt. So using volatile will help us accessing the value afresh every time.
Another use for volatile is signal handlers. If you have code like this:
int quit = 0;
while (!quit)
{
/* very small loop which is completely visible to the compiler */
}
The compiler is allowed to notice the loop body does not touch the quit variable and convert the loop to a while (true) loop. Even if the quit variable is set on the signal handler for SIGINT and SIGTERM; the compiler has no way to know that.
However, if the quit variable is declared volatile, the compiler is forced to load it every time, because it can be modified elsewhere. This is exactly what you want in this situation.
volatile tells the compiler that your variable may be changed by other means, than the code that is accessing it. e.g., it may be a I/O-mapped memory location. If this is not specified in such cases, some variable accesses can be optimised, e.g., its contents can be held in a register, and the memory location not read back in again.
See this article by Andrei Alexandrescu, "volatile - Multithreaded Programmer's Best Friend"
The volatile keyword was
devised to prevent compiler
optimizations that might render code
incorrect in the presence of certain
asynchronous events. For example, if
you declare a primitive variable as
volatile, the compiler is not
permitted to cache it in a register --
a common optimization that would be
disastrous if that variable were
shared among multiple threads. So the
general rule is, if you have variables
of primitive type that must be shared
among multiple threads, declare those
variables volatile. But you can
actually do a lot more with this
keyword: you can use it to catch code
that is not thread safe, and you can
do so at compile time. This article
shows how it is done; the solution
involves a simple smart pointer that
also makes it easy to serialize
critical sections of code.
The article applies to both C and C++.
Also see the article "C++ and the Perils of Double-Checked Locking" by Scott Meyers and Andrei Alexandrescu:
So when dealing with some memory locations (e.g. memory mapped ports or memory referenced by ISRs [ Interrupt Service Routines ] ), some optimizations must be suspended. volatile exists for specifying special treatment for such locations, specifically: (1) the content of a volatile variable is "unstable" (can change by means unknown to the compiler), (2) all writes to volatile data are "observable" so they must be executed religiously, and (3) all operations on volatile data are executed in the sequence in which they appear in the source code. The first two rules ensure proper reading and writing. The last one allows implementation of I/O protocols that mix input and output. This is informally what C and C++'s volatile guarantees.
My simple explanation is:
In some scenarios, based on the logic or code, the compiler will do optimisation of variables which it thinks do not change. The volatile keyword prevents a variable being optimised.
For example:
bool usb_interface_flag = 0;
while(usb_interface_flag == 0)
{
// execute logic for the scenario where the USB isn't connected
}
From the above code, the compiler may think usb_interface_flag is defined as 0, and that in the while loop it will be zero forever. After optimisation, the compiler will treat it as while(true) all the time, resulting in an infinite loop.
To avoid these kinds of scenarios, we declare the flag as volatile, we are telling to compiler that this value may be changed by an external interface or other module of program, i.e., please don't optimise it. That's the use case for volatile.
A marginal use for volatile is the following. Say you want to compute the numerical derivative of a function f :
double der_f(double x)
{
static const double h = 1e-3;
return (f(x + h) - f(x)) / h;
}
The problem is that x+h-x is generally not equal to h due to roundoff errors. Think about it : when you substract very close numbers, you lose a lot of significant digits which can ruin the computation of the derivative (think 1.00001 - 1). A possible workaround could be
double der_f2(double x)
{
static const double h = 1e-3;
double hh = x + h - x;
return (f(x + hh) - f(x)) / hh;
}
but depending on your platform and compiler switches, the second line of that function may be wiped out by a aggressively optimizing compiler. So you write instead
volatile double hh = x + h;
hh -= x;
to force the compiler to read the memory location containing hh, forfeiting an eventual optimization opportunity.
There are two uses. These are specially used more often in embedded development.
Compiler will not optimise the functions that uses variables that are defined with volatile keyword
Volatile is used to access exact memory locations in RAM, ROM, etc... This is used more often to control memory-mapped devices, access CPU registers and locate specific memory locations.
See examples with assembly listing.
Re: Usage of C "volatile" Keyword in Embedded Development
I'll mention another scenario where volatiles are important.
Suppose you memory-map a file for faster I/O and that file can change behind the scenes (e.g. the file is not on your local hard drive, but is instead served over the network by another computer).
If you access the memory-mapped file's data through pointers to non-volatile objects (at the source code level), then the code generated by the compiler can fetch the same data multiple times without you being aware of it.
If that data happens to change, your program may become using two or more different versions of the data and get into an inconsistent state. This can lead not only to logically incorrect behavior of the program but also to exploitable security holes in it if it processes untrusted files or files from untrusted locations.
If you care about security, and you should, this is an important scenario to consider.
Volatile is also useful, when you want to force the compiler not to optimize a specific code sequence (e.g. for writing a micro-benchmark).
volatile means the storage is likely to change at anytime and be changed but something outside the control of the user program. This means that if you reference the variable, the program should always check the physical address (ie a mapped input fifo), and not use it in a cached way.
In the language designed by Dennis Ritchie, every access to any object, other than automatic objects whose address had not been taken, would behave as though it computed the address of the object and then read or wrote the storage at that address. This made the language very powerful, but severely limited optimization opportunities.
While it might have been possible to add a qualifier that would invite a compiler to assume that a particular object wouldn't be changed in weird ways, such an assumption would be appropriate for the vast majority of objects in C programs, and it would have been impractical to add a qualifier to all the objects for which such assumption would be appropriate. On the other hand, some programs need to use some objects for which such an assumption would not hold. To resolve this issue, the Standard says that compilers may assume that objects which are not declared volatile will not have their value observed or changed in ways that are outside the compiler's control, or would be outside a reasonable compiler's understanding.
Because various platforms may have different ways in which objects could be observed or modified outside a compiler's control, it is appropriate that quality compilers for those platforms should differ in their exact handling of volatile semantics. Unfortunately, because the Standard failed to suggest that quality compilers intended for low-level programming on a platform should handle volatile in a way that will recognize any and all relevant effects of a particular read/write operation on that platform, many compilers fall short of doing so in ways that make it harder to process things like background I/O in a way which is efficient but can't be broken by compiler "optimizations".
In my opinion, you should not expect too much from volatile. To illustrate, look at the example in Nils Pipenbrinck's highly-voted answer.
I would say, his example is not suitable for volatile. volatile is only used to:
prevent the compiler from making useful and desirable optimizations. It is nothing about the thread safe, atomic access or even memory order.
In that example:
void SendCommand (volatile MyHardwareGadget * gadget, int command, int data)
{
// wait while the gadget is busy:
while (gadget->isbusy)
{
// do nothing here.
}
// set data first:
gadget->data = data;
// writing the command starts the action:
gadget->command = command;
}
the gadget->data = data before gadget->command = command only is only guaranteed in compiled code by compiler. At running time, the processor still possibly reorders the data and command assignment, regarding to the processor architecture. The hardware could get the wrong data(suppose gadget is mapped to hardware I/O). The memory barrier is needed between data and command assignment.
In simple terms, it tells the compiler not to do any optimisation on a particular variable. Variables which are mapped to device register are modified indirectly by the device. In this case, volatile must be used.
The Wiki say everything about volatile:
volatile (computer programming)
And the Linux kernel's doc also make a excellent notation about volatile:
Why the "volatile" type class should not be used
A volatile can be changed from outside the compiled code (for example, a program may map a volatile variable to a memory mapped register.) The compiler won't apply certain optimizations to code that handles a volatile variable - for example, it won't load it into a register without writing it to memory. This is important when dealing with hardware registers.
As rightly suggested by many here, the volatile keyword's popular use is to skip the optimisation of the volatile variable.
The best advantage that comes to mind, and worth mentioning after reading about volatile is -- to prevent rolling back of the variable in case of a longjmp. A non-local jump.
What does this mean?
It simply means that the last value will be retained after you do stack unwinding, to return to some previous stack frame; typically in case of some erroneous scenario.
Since it'd be out of scope of this question, I am not going into details of setjmp/longjmp here, but it's worth reading about it; and how the volatility feature can be used to retain the last value.
it does not allows compiler to automatic changing values of variables. a volatile variable is for dynamic use.

Resources