Optimization allowed on volatile objects - c

From ISO/IEC 9899:201x section 5.1.2.3 Program execution paragraph 4:
In the abstract machine, all expressions are evaluated as specified by
the semantics. An actual implementation need not evaluate part of an
expression if it can deduce that its value is not used and that no
needed side effects are produced (including any caused by calling a
function or accessing a volatile object).
What exactly is the allowed optimization here regarding the volatile object? can someone give an example of a volatile access that CAN be optimized away?
Since volatiles access are an observable behaviour (described in paragraph 6) it seems that no optimization can take please regarding volatiles, so, I'm curious to know what optimization is allowed in section 4.

Reformatting a little:
An actual implementation need not evaluate part of an expression if:
a) it can deduce that its value is not used; and
b) it can deduce that that no needed side effects are produced (including any
caused by calling a function or accessing a volatile object).
Reversing the logic without changing the meaning:
An actual implementation must evaluate part of an expression if:
a) it can't deduce that its value is not used; or
b) it can't deduce that that no needed side effects are produced (including
any caused by calling a function or accessing a volatile object).
Simplifying to focus on the volatile part:
An actual implementation must evaluate part of an expression if needed
side effects are produced (including accessing a volatile object).

Accesses to volatile objects must be evaluated. The phrase “including any…” modifies “side effects.” It does not modify “if it can deduce…” It has the same meaning as:
An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects (including any caused by calling a function or accessing a volatile object) are produced.
This means “side effects” includes side effects that are caused by accessing a volatile object. In order to decide it cannot evaluate part of an expression, an implementation must deduce that no needed side effects, including any caused by calling a function or accessing a volatile object, are produced.
It does not mean that an implementation can discard evaluation of part of an expression even if that expression includes accesses to a volatile object.

can someone give an example of a volatile access that CAN be optimized
away?
I think that you misinterpreted the text, IMO this paragraph means that
volatile unsigned int bla = whatever();
if (bla < 0) // the code is not evaluated even if a volatile is involved

Adding another example that fits into this in my understanding:
volatile int vol_a;
....
int b = vol_a * 0; // vol_a is not evaluated

In cases where an access to a volatile object would affect system behavior in a way that would be necessary to make a program achieve its purpose, such an access must not be omitted. If the access would have no effect whatsoever on system behavior, then the operation could be "performed" on the abstract machine without having to execute any instructions. It would be rare, however, for a compiler writer to know with certainty that the effect of executing instructions to perform the accesses would be the same as the effect of pretending to do those instructions on the abstract machine while skipping them on the real one.
In the much more common scenario where a compiler writer would have no particular knowledge of any effect that a volatile access might have, but also have no particular reason to believe that such accesses couldn't have effects the compiler writer doesn't know about (e.g. because of hardware which is triggered by operations involving certain addresses), a compiler writer would have to allow for the possibility that such accesses might have "interesting" effects by performing them in the specified sequence, without regard for whether the compiler writer knows of any particular reason that the sequence of operations should matter.

Related

Can volatile variables be read multiple times between sequence points?

I'm making my own C compiler to try to learn as much details as possible about C. I'm now trying to understand exactly how volatile objects work.
What is confusing is that, every read access in the code must strictly be executed (C11, 6.7.3p7):
An object that has volatile-qualified type may be modified in ways unknown to the implementation or have other unknown side effects. Therefore any expression referring to such an object shall be evaluated strictly according to the rules of the abstract machine, as described in 5.1.2.3. Furthermore, at every sequence point the value last stored in the object shall agree with that prescribed by the abstract machine, except as modified by the unknown factors mentioned previously.134) What constitutes an access to an object that has volatile-qualified type is implementation-defined.
Example : in a = volatile_var - volatile_var;, the volatile variable must be read twice and thus the compiler can't optimise to a = 0;
At the same time, the order of evaluation between sequence point is undetermined (C11, 6.5p3):
The grouping of operators and operands is indicated by the syntax. Except as specified later, side effects and value computations of subexpressions are unsequenced.
Example : in b = (c + d) - (e + f) the order in which the additions are evaluated is unspecified as they are unsequenced.
But evaluations of unsequenced objects where this evaluation creates a side effect (with volatile for instance), the behaviour is undefined (C11, 6.5p2):
If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined. If there are multiple allowable orderings of the subexpressions of an expression, the behavior is undefined if such an unsequenced side effect occurs in any of the orderings.
Does this mean the expressions like x = volatile_var - (volatile_var + volatile_var) is undefined ? Should my compiler throw an warning if this occurs ?
I've tried to see what CLANG and GCC do. Neither thow an error nor a warning. The outputed asm shows that the variables are NOT read in the execution order, but left to right instead as show in the asm risc-v asm below :
const int volatile thingy = 0;
int main()
{
int new_thing = thingy - (thingy + thingy);
return new_thing;
}
main:
lui a4,%hi(thingy)
lw a0,%lo(thingy)(a4)
lw a5,%lo(thingy)(a4)
lw a4,%lo(thingy)(a4)
add a5,a5,a4
sub a0,a0,a5
ret
Edit: I am not asking "Why do compilers accept it", I am asking "Is it undefined behavior if we strictly follow the C11 standard". The standard seems to state that it is undefined behaviour, but I need more precision about it to correctly interpret that
Reading the (ISO 9899:2018) standard literally, then it is undefined behavior.
C17 5.1.2.3/2 - definition of side effects:
Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects
C17 6.5/2 - sequencing of operands:
If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined. If there are multiple allowable orderings of the subexpressions of an expression, the behavior is undefined if such an unsequenced side effect occurs in any of the orderings.
Thus when reading the standard literally, volatile_var - volatile_var is definitely undefined behavior. Twice in a row UB actually, since both of the quoted sentences apply.
Please also note that this text changed quite a bit in C11. Previously C99 said, 6.5/2:
Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored.
That is, the behaviour was previously unspecified in C99 (unspecified order of evaluation) but was made undefined by the changes in C11.
That being said, other than re-ordering the evaluation as it pleases, a compiler doesn't really have any reason to do wild and crazy things with this expression since there isn't much that can be optimized, given volatile.
As a quality of implementation, mainstream compilers seem to maintain the previous "merely unspecified" behavior from C99.
Per C11, this is undefined behavior.
Per 5.1.2.3 Program execution, paragraph 2 (bolding mine):
Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects ...
And 6.5 Expressions, paragraph 2 (again, bolding mine):
If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.
Note that, as this is your compiler, you are free to define the behavior should you wish.
As other answers have pointed out, accessing a volatile-qualified variable is a side effect, and side effects are interesting, and having multiple side effects between sequence points is especially interesting, and having multiple side effects that affect the same object between sequence points is undefined.
As an example of how/why it's undefined, consider this (wrong) code for reading a two-byte big-endian value from an input stream ifs:
uint16_t val = (getc(ifs) << 8) | getc(ifs); /* WRONG */
This code imagines (in order to implement big-endianness, that is) that the two getc calls happen in left-to-right order, but of course that's not at all guaranteed, which is why this code is wrong.
Now, one of the things the volatile qualifier is for is input registers. So if you've got a volatile variable
volatile uint8_t inputreg;
and if every time you read it you get the next byte coming in on some device — that is, if merely accessing the variable inputreg is like calling getc() on a stream — then you might write this code:
uint16_t val = (inputreg << 8) | inputreg; /* ALSO WRONG */
and it's just about exactly as wrong as the getc() code above.
The Standard has no terminology more specific than "Undefined Behavior" to describe actions which should be unambiguously defined on some implementations, or even the vast majority of them, but may behave unpredictably on others, based upon Implementation-Defined criteria. If anything, the authors of the Standard go out of their way to avoid saying anything about such behaviors.
The term is also used as a catch-all for situations where a potentially useful optimization might observably affect program behavior in some cases, to ensure that such optimizations will not affect program behavior in any defined situations.
The Standard specifies that the semantics of volatile-qualified accesses are "Implementation Defined", and there are platforms where certain kinds of optimizations involving volatile-qualified accesses might be observable if more than one such access occurs between sequence points. As a simple example, some platforms have read-modify-write operations whose semantics may be observably distinct from doing discrete read, modify, and write operations. If a programmer were to write:
void x(int volatile *dest, int volatile *src)
{
*dest = *src | 1;
}
and the two pointers were equal, the behavior of such a function might depend upon whether a compiler recognized that the pointers were equal and replaced discrete read and write operations with a combined read-modify-write.
To be sure, such distinctions would be unlikely to matter in most cases, and would be especially unlikely to matter in cases where an object is read twice. Nonetheless, the Standard makes no attempt to distinguish situations where such optimizations would actually affect program behavior, much less those where they would affect program behavior in any way that actually mattered, from those where it would be impossible to detect the effects of such optimization. The notion that the phrase "non-portable or erroneous" excludes constructs which would be non-portable but correct on the target platform would lead to an interesting irony that compiler optimizations such as read-modify-write merging would be completely useless on any "correct" programs.
No diagnostic is required for programs with Undefined Behaviour, except where specifically mentioned. So it's not wrong to accept this code.
In general, it's not possible to know whether the same volatile storage is being accessed multiple times between sequence points (consider a function taking two volatile int* parameters, without restrict, as the simplest example where analysis is impossible).
That said, when you are able to detect a problematic situation, users might find it helpful, so I encourage you to work on getting a diagnostic out.
IMO it is legal but very bad.
int new_thing = thingy - (thingy + thingy);
Multiple use of volatile variables in one expression is allowed and no warning is needed. But from the programmer's point of view, it is a very bad line of code.
Does this mean the expressions like x = volatile_var - (volatile_var +
volatile_var) is undefined ? Should my compiler throw an error if this
occurs ?
No as C standard does not say anything how those reads have to be ordered. It is left to the implementations. All known to me implementations do it the easiest way for them like in this example : https://godbolt.org/z/99498141d

Assign volatile to non-volatile sematics and the C standard

volatile int vfoo = 0;
void func()
{
int bar;
do
{
bar = vfoo; // L.7
}while(bar!=1);
return;
}
This code busy-waits for the variable to turn to 1. If on first pass vfoo is not set to 1, will I get stuck inside.
This code compiles without warning.
What does the standard say about this?
vfoo is declared as volatile. Therefore, read to this variable should not be optimized.
However, bar is not volatile qualified. Is the compiler allowed to optimize the write to this bar? .i.e. the compiler would do a read access to vfoo, and is allowed to discard this value and not assign it to bar (at L.7).
If this is a special case where the standard has something to say, can you please include the clause and interpret the standard's lawyer talk?
What the standard has to say about this includes:
5.1.2.3 Program execution
¶2 Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression in general includes both value computations and initiation of side effects. Value computation for an lvalue expression includes determining the identity of the designated object.
¶4 In the abstract machine, all expressions are evaluated as specified by the semantics. An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects are produced (including any caused by calling a function or accessing a volatile object).
¶6 The least requirements on a conforming implementation are:
Accesses to volatile objects are evaluated strictly according to the rules of the abstract machine.
...
The takeaway from ¶2 in particular should be that accessing a volatile object is no different from something like calling printf - it can't be elided because it has a side effect. Imagine your program with bar = vfoo; replaced by bar = printf("hello\n");
volatile variable has to be read on any access. In your code snippet that read cannot be optimized out. The compiler knows that bar might be affected by the side effect. So the condition will be checked correctly.
https://godbolt.org/z/nFd9BB
However, bar is not volatile qualified.
Variable bar is used to hold a value. Do you care about the value stored in it, or do you care about that variable being represented exactly according to the ABI?
Volatile would guarantee you the latter. Your program depends on the former.
Is the compiler allowed to optimize the write to this bar?
Of course. Why would you possibly care whether the value read was really written to a memory location allocated to the variable on the stack?
All you specified was that the value read was tested as an exit condition:
bar = ...
}while(bar!=1);
.i.e. the compiler would do a read access to vfoo, and is allowed to
discard this value and not assign it to bar (at L.7).
Of course not!
The compiler needs to hold the value obtained by the volatile read enough time to be able to compare it to 1. But no more time, as you don't ever use bar again latter.
It may be that a strange CPU as a EQ1 ("equal to 1") flag in the condition register, that is set whenever a value equal to 1 is loaded. Then the compiler would not even store temporarily the read value and just EQ1 condition test.
Under your hypothesis that compilers can discard variable values for all non volatile variables, non volatile objects would have almost no possible uses.

Do the C compiler know when a statement operates on a file and thus has "observable behaviour"?

The C99 standard 5.1.2.3$2 says
Accessing a volatile object, modifying an object, modifying a file, or calling a function
that does any of those operations are all side effects, 12) which are changes in the state of
the execution environment. Evaluation of an expression in general includes both value
computations and initiation of side effects. Value computation for an lvalue expression
includes determining the identity of the designated object.
I guess that in a lot of cases the compiler can't inline and possibly eliminate the functions doing I/O since they live in a different translation unit. And the parameters to functions doing I/O are often pointers, further hindering the optimizer.
But link-time-optimization gives the compiler "more to chew on".
And even though the paragraph I quoted says that "modifying an object" (that's standard-speak for memory) is a side-effect, stores to memory is not automatically treated as a side effect when the optimizer kicks in. Here's an example from John Regehrs Nine Ways to Break your Systems Software using Volatile where the message store is reordered relative to the volatile ready variable.
.
volatile int ready;
int message[100];
void foo (int i) {
message[i/10] = 42;
ready = 1;
}
How do a C compiler determine if a statement operates on a file? In a free-standing embedded environment I declare registers as volatile, thus hindering the compiler from optimizing calls away and swapping order of I/O calls.
Is that the only way to tell the compiler that we're doing I/O? Or do the C standard dictate that these N calls in the standard library do I/O and thus must receive special treatment? But then, what if someone created their own system call wrapper for say read?
As C has no statement dedicated to IO, only function calls can modify files. So if the compiler sees no function call in a sequence of statements, it knows that this sequence has not modified any file.
If only functions from the standard library are called, and if the environment is hosted, the compiler could know what they do and use that to guess what will happen.
But what is really important, is that the compiler only needs to respect side effects. It is perfectly allowed when it does not know, to assume that a function call could involve side effects and act accordingly. It will not be a violation of the standard if no side effects are actually involved, it will just possibly lose a higher optimization.

Does a volatile value-only statement trigger a read access in C?

If I had the folowing declaration:
extern volatile int SOME_REGISTER;
and later on:
void trigger_read_register()
{
SOME_REGISTER;
}
would calling trigger_read_register() issue a read request on SOME_REGISTER ?
According to the C11 spec, accessing a volatile is considered a side effect, and thus the compiler shouldn't optimize the (otherwise useless) access in your example.
So, the answer is that yes, it should read from memory.
See C11 standard (draft), section 5.1.2.3 section 2:
Accessing a volatile object, modifying an object, modifying a file, or
calling a function that does any of those operations are all side
effects, which are changes in the state of the execution
environment. Evaluation of an expression in general includes both
value computations and initiation of side effects. Value computation
for an lvalue expression includes determining the identity of the
designated object.
Further, 4 says:
In the abstract machine, all expressions are evaluated as specified by
the semantics. An actual implementation need not evaluate part of an
expression if it can deduce that its value is not used and that no
needed side effects are produced (including any caused by calling a
function or accessing a volatile object).

What does section 5.1.2.3, paragraph 4 (in n1570.pdf) mean for null operations?

I have been advised many times that accesses to volatile objects can't be optimised away, however it seems to me as though this section, present in the C89, C99 and C11 standards advises otherwise:
... An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects are produced (including any caused by calling a function or accessing a volatile object).
If I understand correctly, this sentence is stating that an actual implementation can optimise away part of an expression, providing these two requirements are met:
"its value is not used", and
"that no needed side effects are produced (including any caused by calling a function or accessing a volatile object)"...
It seems to me that many people are confusing the meaning for "including" with the meaning for "excluding".
Is it possible for a compiler to distinguish between a side effect that's "needed", and a side effect that isn't? If timing is considered a needed side effect, then why are compilers allowed to optimise away null operations like do_nothing(); or int unused_variable = 0;?
If a compiler is able to deduce that a function does nothing (eg. void do_nothing() { }), then is it possible that the compiler might have justification to optimise calls to that function away?
If a compiler is able to deduce that a volatile object isn't mapped to anything crucial (i.e. perhaps it's mapped to /dev/null to form a null operation), then is it possible that the compiler might also have justification to optimise that non-crucial side-effect away?
If a compiler can perform optimisations to eliminate unnecessary code such as calls to do_nothing() in a process called "dead code elimination" (which is quite the common practice), then why can't the compiler also eliminate volatile writes to a null device?
As I understand, either the compiler can optimise away calls to functions or volatile accesses or the compiler can't optimise away either, because of 5.1.2.3p4.
I think the "including any" applies to the "needed side-effects" , whereas you seem to be reading it as applying to "part of an expression".
So the intent was to say:
... An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects are produced .
Examples of needed side-effects include:
Needed side-effects caused by a function which this expression calls
Accesses to volatile variables
Now, the term needed side-effect is not defined by the Standard. Section /4 is not attempting to define it either -- it's trying (and not succeeding very well) to provide examples.
I think the only sensible interpretation is to treat it as meaning observable behaviour which is defined by 5.1.2.3/6. So it would have been a lot simpler to write:
An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no observable behaviour would be caused.
Your questions in the edit are answered by 5.1.2.3/6, sometimes known as the as-if rule, which I'll quote here:
The least requirements on a conforming implementation are:
Accesses to volatile objects are evaluated strictly according to the rules of the abstract machine.
At program termination, all data written into files shall be identical to the result that execution of the program according to the abstract semantics would have produced.
The input and output dynamics of interactive devices shall take place as specified in 7.21.3. The intent of these requirements is that unbuffered or line-buffered output appear as soon as possible, to ensure that prompting messages actually appear prior to a program waiting for input.
This is the observable behaviour of the program.
Answering the specific questions in the edit:
Is it possible for a compiler to distinguish between a side effect that's "needed", and a side effect that isn't? If timing is considered a needed side effect, then why are compilers allowed to optimise away null operations like do_nothing(); or int unused_variable = 0;?
Timing isn't a side-effect. A "needed" side-effect presumably here means one that causes observable behaviour.
If a compiler is able to deduce that a function does nothing (eg. void do_nothing() { }), then is it possible that the compiler might have justification to optimise calls to that function away?
Yes, these can be optimized out because they do not cause observable behaviour.
If a compiler is able to deduce that a volatile object isn't mapped to anything crucial (i.e. perhaps it's mapped to /dev/null to form a null operation), then is it possible that the compiler might also have justification to optimise that non-crucial side-effect away?
No, because accesses to volatile objects are defined as observable behaviour.
If a compiler can perform optimisations to eliminate unnecessary code such as calls to do_nothing() in a process called "dead code elimination" (which is quite the common practice), then why can't the compiler also eliminate volatile writes to a null device?
Because volatile accesses are defined as observable behaviour and empty functions aren't.
I believe this:
(including any caused by calling a function or accessing a volatile
object)
is intended to be read as
(including:
any side-effects caused by calling a function; or
accessing a volatile variable)
This reading makes sense because accessing a volatile variable is a side-effect.

Resources