I'm making my own C compiler to try to learn as many details as possible about C. I'm now trying to understand exactly how volatile objects work.
What is confusing is that every read access in the code must strictly be executed (C11, 6.7.3p7):
An object that has volatile-qualified type may be modified in ways unknown to the implementation or have other unknown side effects. Therefore any expression referring to such an object shall be evaluated strictly according to the rules of the abstract machine, as described in 5.1.2.3. Furthermore, at every sequence point the value last stored in the object shall agree with that prescribed by the abstract machine, except as modified by the unknown factors mentioned previously.134) What constitutes an access to an object that has volatile-qualified type is implementation-defined.
Example : in a = volatile_var - volatile_var;, the volatile variable must be read twice and thus the compiler can't optimise to a = 0;
At the same time, the order of evaluation between sequence points is undetermined (C11, 6.5p3):
The grouping of operators and operands is indicated by the syntax. Except as specified later, side effects and value computations of subexpressions are unsequenced.
Example : in b = (c + d) - (e + f) the order in which the additions are evaluated is unspecified as they are unsequenced.
But when unsequenced evaluations produce side effects on the same object (as volatile accesses do, for instance), the behaviour is undefined (C11, 6.5p2):
If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined. If there are multiple allowable orderings of the subexpressions of an expression, the behavior is undefined if such an unsequenced side effect occurs in any of the orderings.
Does this mean that expressions like x = volatile_var - (volatile_var + volatile_var) are undefined? Should my compiler emit a warning if this occurs?
I've tried to see what Clang and GCC do. Neither emits an error or a warning. The generated asm shows that the variables are NOT read in execution order, but left to right instead, as shown in the RISC-V asm below:
const int volatile thingy = 0;
int main()
{
    int new_thing = thingy - (thingy + thingy);
    return new_thing;
}
main:
lui a4,%hi(thingy)
lw a0,%lo(thingy)(a4)
lw a5,%lo(thingy)(a4)
lw a4,%lo(thingy)(a4)
add a5,a5,a4
sub a0,a0,a5
ret
Edit: I am not asking "Why do compilers accept it", I am asking "Is it undefined behavior if we strictly follow the C11 standard". The standard seems to state that it is undefined behaviour, but I need more precision about it to correctly interpret that
Reading the standard (ISO 9899:2018) literally, it is undefined behavior.
C17 5.1.2.3/2 - definition of side effects:
Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects
C17 6.5/2 - sequencing of operands:
If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined. If there are multiple allowable orderings of the subexpressions of an expression, the behavior is undefined if such an unsequenced side effect occurs in any of the orderings.
Thus when reading the standard literally, volatile_var - volatile_var is definitely undefined behavior. Twice in a row UB actually, since both of the quoted sentences apply.
Please also note that this text changed quite a bit in C11. Previously C99 said, 6.5/2:
Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored.
That is, the behaviour was previously unspecified in C99 (unspecified order of evaluation) but was made undefined by the changes in C11.
That being said, other than re-ordering the evaluation as it pleases, a compiler doesn't really have any reason to do wild and crazy things with this expression since there isn't much that can be optimized, given volatile.
As a quality of implementation, mainstream compilers seem to maintain the previous "merely unspecified" behavior from C99.
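If the literal reading is a concern, one way to sidestep it is to give each volatile read its own statement, so that every access is separated from the next by a sequence point. The following is only a sketch: the helper name combine_sequenced and its pointer parameter are hypothetical and do not come from the question.

```c
/* Hypothetical helper (not from the question): computes
   var - (var + var) with each volatile read in its own full
   declaration, so the three accesses are sequenced one after another. */
int combine_sequenced(const volatile int *var)
{
    int first  = *var;  /* read 1, complete before the next declaration */
    int second = *var;  /* read 2 */
    int third  = *var;  /* read 3 */
    return first - (second + third);
}
```

Written this way, the read order is fixed by the source rather than left to the compiler.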
Per C11, this is undefined behavior.
Per 5.1.2.3 Program execution, paragraph 2 (bolding mine):
Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects ...
And 6.5 Expressions, paragraph 2 (again, bolding mine):
If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.
Note that, as this is your compiler, you are free to define the behavior should you wish.
As other answers have pointed out, accessing a volatile-qualified variable is a side effect, and side effects are interesting, and having multiple side effects between sequence points is especially interesting, and having multiple side effects that affect the same object between sequence points is undefined.
As an example of how/why it's undefined, consider this (wrong) code for reading a two-byte big-endian value from an input stream ifs:
uint16_t val = (getc(ifs) << 8) | getc(ifs); /* WRONG */
This code imagines (in order to implement big-endianness, that is) that the two getc calls happen in left-to-right order, but of course that's not at all guaranteed, which is why this code is wrong.
Now, one of the things the volatile qualifier is for is input registers. So if you've got a volatile variable
volatile uint8_t inputreg;
and if every time you read it you get the next byte coming in on some device — that is, if merely accessing the variable inputreg is like calling getc() on a stream — then you might write this code:
uint16_t val = (inputreg << 8) | inputreg; /* ALSO WRONG */
and it's just about exactly as wrong as the getc() code above.
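A possible repair for both cases is to perform each read in a separate statement, which fixes the order. This is a minimal sketch; the function names read_be16 and read_be16_reg, and the choice to report end-of-file through a return code, are assumptions made for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Reads a two-byte big-endian value from ifs.
   Returns 0 on success, -1 if the stream ends early. */
int read_be16(FILE *ifs, uint16_t *out)
{
    int hi = getc(ifs);          /* first call, finished before the second */
    if (hi == EOF)
        return -1;
    int lo = getc(ifs);          /* second call */
    if (lo == EOF)
        return -1;
    *out = (uint16_t)(((unsigned)hi << 8) | (unsigned)lo);
    return 0;
}

/* Same idea for a volatile input register: two reads, two statements. */
uint16_t read_be16_reg(const volatile uint8_t *inputreg)
{
    uint8_t hi = *inputreg;      /* first access */
    uint8_t lo = *inputreg;      /* second access */
    return (uint16_t)(((unsigned)hi << 8) | lo);
}
```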
The Standard has no terminology more specific than "Undefined Behavior" to describe actions which should be unambiguously defined on some implementations, or even the vast majority of them, but may behave unpredictably on others, based upon Implementation-Defined criteria. If anything, the authors of the Standard go out of their way to avoid saying anything about such behaviors.
The term is also used as a catch-all for situations where a potentially useful optimization might observably affect program behavior in some cases, to ensure that such optimizations will not affect program behavior in any defined situations.
The Standard specifies that the semantics of volatile-qualified accesses are "Implementation Defined", and there are platforms where certain kinds of optimizations involving volatile-qualified accesses might be observable if more than one such access occurs between sequence points. As a simple example, some platforms have read-modify-write operations whose semantics may be observably distinct from doing discrete read, modify, and write operations. If a programmer were to write:
void x(int volatile *dest, int volatile *src)
{
    *dest = *src | 1;
}
and the two pointers were equal, the behavior of such a function might depend upon whether a compiler recognized that the pointers were equal and replaced discrete read and write operations with a combined read-modify-write.
To be sure, such distinctions would be unlikely to matter in most cases, and would be especially unlikely to matter in cases where an object is read twice. Nonetheless, the Standard makes no attempt to distinguish situations where such optimizations would actually affect program behavior, much less those where they would affect program behavior in any way that actually mattered, from those where it would be impossible to detect the effects of such optimization. The notion that the phrase "non-portable or erroneous" excludes constructs which would be non-portable but correct on the target platform would lead to an interesting irony that compiler optimizations such as read-modify-write merging would be completely useless on any "correct" programs.
No diagnostic is required for programs with Undefined Behaviour, except where specifically mentioned. So it's not wrong to accept this code.
In general, it's not possible to know whether the same volatile storage is being accessed multiple times between sequence points (consider a function taking two volatile int* parameters, without restrict, as the simplest example where analysis is impossible).
That said, when you are able to detect a problematic situation, users might find it helpful, so I encourage you to work on getting a diagnostic out.
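To illustrate the two-pointer case, here is a sketch with hypothetical function names. Inside sum_unsequenced a compiler cannot tell whether p and q designate the same object, so it cannot know whether the two accesses collide; sum_sequenced avoids the question by reading in separate statements.

```c
/* The compiler sees only two pointers here. If a caller passes the same
   address twice, both operands access one volatile object with no
   sequence point in between. */
int sum_unsequenced(volatile int *p, volatile int *q)
{
    return *p + *q;
}

/* Each access is in its own declaration, so the order is fixed
   whether or not p and q alias. */
int sum_sequenced(volatile int *p, volatile int *q)
{
    int a = *p;
    int b = *q;
    return a + b;
}
```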
IMO it is legal but very bad.
int new_thing = thingy - (thingy + thingy);
Multiple use of volatile variables in one expression is allowed and no warning is needed. But from the programmer's point of view, it is a very bad line of code.
Does this mean the expressions like x = volatile_var - (volatile_var +
volatile_var) is undefined ? Should my compiler throw an error if this
occurs ?
No, as the C standard does not say anything about how those reads have to be ordered. It is left to the implementation. All implementations known to me do it in the way that is easiest for them, as in this example: https://godbolt.org/z/99498141d
Related
Can compiler optimize comparing volatile variable to itself and assume it will be equal? Or it has to read twice this variable and compare the two values it got?
The way I think about volatile variables is that reading from them or writing to them is treated like an I/O function.
A call to an I/O function can never be optimized out, because it has side effects. Nor can a read or write involving a volatile variable be optimized out.
If you code two calls to the same input function, the compiler has to ensure the input function is actually called twice, since it could give different results. In the same way, you can read from a volatile variable twice, and in between the two reads, someone else could change the value of the variable. So the compiler will always emit the instructions to read it twice, whereas with non-volatile variables it can simply assume they're not modified by anyone outside the program.
Again, I/O function calls can't be reordered, and neither can volatile variable accesses.
C 2018 6.7.3 8 says:
An object that has volatile-qualified type may be modified in ways unknown to the implementation or have other unknown side effects. Therefore any expression referring to [an object with volatile-qualified type] shall be evaluated strictly according to the rules of the abstract machine, as described in 5.1.2.3… What constitutes an access to an object that has volatile-qualified type is implementation-defined.
5.1.2.3 4 says:
… In the abstract machine, all expressions are evaluated as specified by the semantics…
In x == x, where x is a volatile-qualified object, the semantics are that the first x is converted to the value of the object x (by lvalue conversion per 6.3.2.1 2), the second x is converted to its value, and the two values are compared. Converting an object to its value accesses the object, getting the value stored in it.
So the abstract machine accesses x twice. By 6.7.3 8, this evaluation is strict; the actual program must implement the same accesses as the abstract machine, so the value of x must be accessed twice; neither can be optimized away.
The C standard leaves it to the C implementation to define what constitutes an access. But whatever that is, the program must do it twice.
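As a sketch of the double access, the comparison can also be written with the two reads in separate statements, which makes both accesses and their order explicit. The function name reads_agree is an assumption for illustration.

```c
#include <stdbool.h>

/* Reads the volatile object twice and reports whether both reads
   produced the same value. Neither read may be optimized away. */
bool reads_agree(const volatile int *x)
{
    int first  = *x;   /* first access */
    int second = *x;   /* second access; the value may have changed */
    return first == second;
}
```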
From ISO/IEC 9899:201x section 5.1.2.3 Program execution paragraph 4:
In the abstract machine, all expressions are evaluated as specified by
the semantics. An actual implementation need not evaluate part of an
expression if it can deduce that its value is not used and that no
needed side effects are produced (including any caused by calling a
function or accessing a volatile object).
What exactly is the allowed optimization here regarding the volatile object? Can someone give an example of a volatile access that CAN be optimized away?
Since volatile accesses are observable behaviour (described in paragraph 6), it seems that no optimization can take place regarding volatiles, so I'm curious to know what optimization is allowed in paragraph 4.
Reformatting a little:
An actual implementation need not evaluate part of an expression if:
a) it can deduce that its value is not used; and
b) it can deduce that no needed side effects are produced (including any
caused by calling a function or accessing a volatile object).
Reversing the logic without changing the meaning:
An actual implementation must evaluate part of an expression if:
a) it can't deduce that its value is not used; or
b) it can't deduce that no needed side effects are produced (including
any caused by calling a function or accessing a volatile object).
Simplifying to focus on the volatile part:
An actual implementation must evaluate part of an expression if needed
side effects are produced (including accessing a volatile object).
Accesses to volatile objects must be evaluated. The phrase “including any…” modifies “side effects.” It does not modify “if it can deduce…” It has the same meaning as:
An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects (including any caused by calling a function or accessing a volatile object) are produced.
This means “side effects” includes side effects that are caused by accessing a volatile object. In order to decide it need not evaluate part of an expression, an implementation must deduce that no needed side effects, including any caused by calling a function or accessing a volatile object, are produced.
It does not mean that an implementation can discard evaluation of part of an expression even if that expression includes accesses to a volatile object.
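A small sketch of that reading, with a hypothetical function name: the first discarded expression has no side effects and may be dropped, while the second is a volatile access and, on this interpretation, has to be performed even though its value is unused.

```c
/* plain is an ordinary int; reg points to a volatile object. */
int discard_then_read(const volatile int *reg, int plain)
{
    (void)(plain * 2);  /* value unused, no side effects: may be omitted */
    (void)*reg;         /* value unused, but the access is itself a side effect */
    return *reg;        /* a second, separate access */
}
```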
can someone give an example of a volatile access that CAN be optimized
away?
I think that you misinterpreted the text. IMO this paragraph means that:
volatile unsigned int bla = whatever();
if (bla < 0) // the code is not evaluated even if a volatile is involved
Adding another example that fits into this in my understanding:
volatile int vol_a;
....
int b = vol_a * 0; // vol_a is not evaluated
In cases where an access to a volatile object would affect system behavior in a way that would be necessary to make a program achieve its purpose, such an access must not be omitted. If the access would have no effect whatsoever on system behavior, then the operation could be "performed" on the abstract machine without having to execute any instructions. It would be rare, however, for a compiler writer to know with certainty that the effect of executing instructions to perform the accesses would be the same as the effect of pretending to do those instructions on the abstract machine while skipping them on the real one.
In the much more common scenario where a compiler writer would have no particular knowledge of any effect that a volatile access might have, but also have no particular reason to believe that such accesses couldn't have effects the compiler writer doesn't know about (e.g. because of hardware which is triggered by operations involving certain addresses), a compiler writer would have to allow for the possibility that such accesses might have "interesting" effects by performing them in the specified sequence, without regard for whether the compiler writer knows of any particular reason that the sequence of operations should matter.
I was reading through the C11 standard. As per the C11 standard undefined behavior is classified into four different types. The parenthesized numbers refer to the subclause of the C Standard (C11) that identifies the undefined behavior.
Example 1: The program attempts to modify a string literal (6.4.5). This undefined behavior is classified as: Undefined Behavior (information/confirmation needed)
Example 2 : An lvalue does not designate an object when evaluated (6.3.2.1). This undefined behavior is classified as: Critical Undefined Behavior
Example 3: An object has its stored value accessed other than by an lvalue of an allowable type (6.5). This undefined behavior is classified as: Bounded Undefined Behavior
Example 4: The string pointed to by the mode argument in a call to the fopen function does not exactly match one of the specified character sequences (7.21.5.3). This undefined behavior is classified as: Possible Conforming Language Extension
What is the meaning of the classifications? What do these classification convey to the programmer?
I only have access to a draft of the standard, but from what I’m reading, it seems like this classification of undefined behavior isn’t mandated by the standard and only matters from the perspective of compilers and environments that specifically indicate that they want to create C programs that can be more easily analyzed for different classes of errors. (These environments have to define a special symbol __STDC_ANALYZABLE__.)
It seems like the key idea here is an “out of bounds write,” which is defined as a write operation that modifies data that isn’t otherwise allocated as part of an object. For example, if you clobber the bytes of an existing variable accidentally, that’s not an out of bounds write, but if you jumped to a random region of memory and decorated it with your favorite bit pattern you’d be performing an out of bounds write.
A specific behavior is bounded undefined behavior if the result is undefined, but won’t ever do an out of bounds write. In other words, the behavior is undefined, but you won’t jump to a random address not associated with any objects or allocated space and put bytes there. A behavior is critical undefined behavior if you get undefined behavior that cannot promise that it won’t do an out-of-bounds write.
The standard then goes on to talk about what can lead to critical undefined behavior. By default undefined behaviors are bounded undefined behaviors, but there are exceptions for UB that result from memory errors like accessing deallocated memory or using an uninitialized pointer, which have critical undefined behavior. Remember, though, that these classifications only exist and have meaning in the context of implementations of C that choose to specifically separate out these sorts of behaviors. Unless your C environment guarantees it’s analyzable, all undefined behaviors can potentially do absolutely anything!
My guess is that this is intended for environments like building drivers or kernel plugins where you’d like to be able to analyze a piece of code and say “well, if you're going to shoot someone in the foot, it had better be your foot that you’re shooting and not mine!” If you compile a C program with these constraints, the runtime environment can instrument the very few operations that are allowed to be critical undefined behavior and have those operations trap to the OS, and assume that all other undefined behaviors will at most destroy memory that’s specifically associated with the program itself.
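A program can check whether its implementation makes this promise by testing the macro. The sketch below assumes a hypothetical helper name; the macro itself, and the value 1 that signals conformance to Annex L, come from the standard.

```c
/* Returns 1 if the implementation claims conformance to Annex L
   (Analyzability), 0 otherwise. Many compilers do not define
   the macro at all. */
int implementation_is_analyzable(void)
{
#if defined(__STDC_ANALYZABLE__)
    return __STDC_ANALYZABLE__ == 1;
#else
    return 0;
#endif
}
```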
All of these are cases where the behaviour is undefined, i.e. the standard "imposes no requirements". Traditionally, within undefined behaviour and considering one implementation (i.e. C compiler + C standard library), one could see two kinds of undefined behaviour:
constructs for which the behaviour would not be documented, or would be documented to cause a crash, or the behaviour would be erratic,
constructs that the standard left undefined but for which the implementation defines some useful behaviour.
Sometimes these can be controlled by compiler switches. E.g. example 1 almost always causes bad behaviour - a trap, or crash, or modifies a shared value. Earlier versions of GCC allowed one to have modifiable string literals with -fwritable-strings; therefore if that switch was given, the implementation defined the behaviour in that case.
C11 added an optional orthogonal classification: bounded undefined behaviour and critical undefined behaviour. Bounded undefined behaviour is that which does not perform an out-of-bounds store, i.e. it cannot cause values being written in arbitrary locations in memory. Any undefined behaviour that is not bounded undefined behaviour is critical undefined behaviour.
Iff __STDC_ANALYZABLE__ is defined, the implementation will conform to the appendix L, which has this definitive list of critical undefined behaviour:
An object is referred to outside of its lifetime (6.2.4).
A store is performed to an object that has two incompatible declarations (6.2.7).
A pointer is used to call a function whose type is not compatible with the referenced type (6.2.7, 6.3.2.3, 6.5.2.2).
An lvalue does not designate an object when evaluated (6.3.2.1).
The program attempts to modify a string literal (6.4.5).
The operand of the unary * operator has an invalid value (6.5.3.2).
Addition or subtraction of a pointer into, or just beyond, an array object and an integer type produces a result that points just beyond the array object and is used as the operand of a unary * operator that is evaluated (6.5.6).
An attempt is made to modify an object defined with a const-qualified type through use of an lvalue with non-const-qualified type (6.7.3).
An argument to a function or macro defined in the standard library has an invalid value or a type not expected by a function with variable number of arguments (7.1.4).
The longjmp function is called with a jmp_buf argument where the most recent invocation of the setjmp macro in the same invocation of the program with the corresponding jmp_buf argument is nonexistent, or the invocation was from another thread of execution, or the function containing the invocation has terminated execution in the interim, or the invocation was within the scope of an identifier with variably modified type and execution has left that scope in the interim (7.13.2.1).
The value of a pointer that refers to space deallocated by a call to the free or realloc function is used (7.22.3).
A string or wide string utility function accesses an array beyond the end of an object (7.24.1, 7.29.4).
For the bounded undefined behaviour, the standard imposes no requirements other than that an out-of-bounds write is not allowed to happen.
Example 1, modification of a string literal, is also classified as critical undefined behaviour. Example 4 is critical undefined behaviour too - the value is not one expected by the standard library.
For example 4, the standard hints that while the behaviour is undefined in case of mode that is not defined by the standard, there are implementations that might define behaviour for other flags. For example glibc supports many more mode flags, such as c, e, m and x, and allows setting the character encoding of the input with the ,ccs=charset modifier (and putting the stream into wide mode right away).
Some programs are intended solely for use with input that is known to be valid, or at least come from trustworthy sources. Others are not. Certain kinds of optimizations which might be useful when processing only trusted data are stupid and dangerous when used with untrusted data. The authors of Annex L unfortunately wrote it excessively vaguely, but the clear intention is to allow compilers to promise that they won't do certain kinds of "optimizations" that are stupid and dangerous when using data from untrustworthy sources.
Consider the function (assume "int" is 32 bits):
int32_t triplet_may_be_interesting(int32_t a, int32_t b, int32_t c)
{
    return a*b > c;
}
invoked from the context:
#define SCALE_FACTOR 123456
int my_array[20000];
int32_t foo(uint16_t x, uint16_t y)
{
    if (x < 20000)
        my_array[x]++;
    if (triplet_may_be_interesting(x, SCALE_FACTOR, y))
        return examine_triplet(x, SCALE_FACTOR, y);
    else
        return 0;
}
When C89 was written, the most common way a 32-bit compiler would process that code would have been to do a 32-bit multiply and then do a signed comparison with y. A few optimizations are possible, however, especially if a compiler in-lines the function invocation:
On platforms where unsigned compares are faster than signed compares, a compiler could infer that since none of a, b, or c can be negative, the arithmetical value of a*b is non-negative, and it may thus use an unsigned compare instead of a signed comparison. This optimization would be allowable even if __STDC_ANALYZABLE__ is non-zero.
A compiler could likewise infer that if x is non-zero, the arithmetical value of x*123456 will be greater than every possible value of y, and if x is zero, then x*123456 won't be greater than any. It could thus replace the second if condition with simply if (x). This optimization is also allowable even if __STDC_ANALYZABLE__ is non-zero.
A compiler whose authors either intend it for use only with trusted data, or else wrongly believe that cleverness and stupidity are antonyms, could infer that since any value of x larger than 17394 will result in an integer overflow, x may be safely presumed to be 17394 or less. It could thus perform my_array[x]++; unconditionally. A compiler may not define __STDC_ANALYZABLE__ with a non-zero value if it would perform this optimization. It is this latter kind of optimization which Annex L is designed to address. If an implementation can guarantee that the effect of overflow will be limited to yielding a possibly-meaningless value, it may be cheaper and easier for code to deal with the possibility of the value being meaningless than to prevent the overflow. If overflow could instead cause objects to behave as though their values were corrupted by future computations, however, there would be no way a program could deal with things like overflow after the fact, even in cases where the result of the computation would end up being irrelevant.
In this example, if the effect of integer overflow would be limited to yielding a possibly-meaningless value, and if calling examine_triplet() unnecessarily would waste time but would otherwise be harmless, a compiler may be able to usefully optimize triplet_may_be_interesting in ways that would not be possible if it were written to avoid integer overflow at all costs. Aggressive "optimization" will thus result in less efficient code than would be possible with a compiler that instead used its freedom to offer some loose behavioral guarantees.
Annex L would be much more useful if it allowed implementations to offer specific behavioral guarantees (e.g. overflow will yield a possibly-meaningless result, but have no other side-effects). No single set of guarantees would be optimal for all programs, but the amount of text Annex L spent on its impractical proposed trapping mechanism could have been better spent specifying macros to indicate what guarantees various implementations could offer.
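For comparison, this is roughly what the "avoid overflow at all costs" variant could look like. It is only a sketch: the name triplet_may_be_interesting_wide is hypothetical, and widening to 64 bits is one of several ways to rule out overflow.

```c
#include <stdint.h>

/* The product of two 32-bit values always fits in 64 bits, so this
   comparison has defined behaviour for every input, at the cost of a
   wider multiply on targets where that is slower. */
int32_t triplet_may_be_interesting_wide(int32_t a, int32_t b, int32_t c)
{
    return (int64_t)a * (int64_t)b > (int64_t)c;
}
```

With SCALE_FACTOR equal to 123456, the 32-bit product first exceeds INT32_MAX (2147483647) at x = 17395, since 17394 * 123456 = 2147393664 and 17395 * 123456 = 2147517120.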
According to cppreference :
Critical undefined behavior
Critical UB is undefined behavior that might perform a memory write or
a volatile memory read out of bounds of any object. A program that has
critical undefined behavior may be susceptible to security exploits.
Only the following undefined behaviors are critical:
access to an object outside of its lifetime (e.g. through a dangling pointer)
write to an object whose declarations are not compatible
function call through a function pointer whose type is not compatible with the type of the function it points to
lvalue expression is evaluated, but does not designate an object
attempted modification of a string literal
dereferencing an invalid (null, indeterminate, etc) or past-the-end pointer
modification of a const object through a non-const pointer
call to a standard library function or macro with an invalid argument
call to a variadic standard library function with unexpected argument type (e.g. call to printf with an argument of the type that doesn't match its conversion specifier)
longjmp where there is no setjmp up the calling scope, across threads, or from within the scope of a VM type.
any use of the pointer that was deallocated by free or realloc
any string or wide string library function accesses an array out of bounds
Bounded undefined behavior
Bounded UB is undefined behavior that cannot perform an illegal memory
write, although it may trap and may produce or store indeterminate
values.
All undefined behavior not listed as critical is bounded, including
multithreaded data races
use of indeterminate values with automatic storage duration
strict aliasing violations
misaligned object access
signed integer overflow
unsequenced side-effects modify the same scalar or modify and read the same scalar
floating-to-integer or pointer-to-integer conversion overflow
bitwise shift by a negative or too large bit count
integer division by zero
use of a void expression
direct assignment or memcpy of inexactly-overlapped objects
restrict violations
etc.. ALL undefined behavior that's not in the critical list.
"I was reading through the C11 standard. As per the C11 standard undefined behavior is classified into four different types."
I wonder what you were actually reading. The 2011 ISO C standard does not mention these four different classifications of undefined behavior. In fact it's quite explicit in not making any distinctions among different kinds of undefined behavior.
Here's ISO C11 section 4 paragraph 2:
If a "shall" or "shall not" requirement that appears outside of a
constraint or runtime-constraint is violated, the behavior is
undefined. Undefined behavior is otherwise indicated in this
International Standard by the words "undefined behavior" or by the
omission of any explicit definition of behavior. There is no
difference in emphasis among these three; they all describe "behavior
that is undefined".
All the examples you cite are undefined behavior, which, as far as the Standard is concerned, means nothing more or less than:
behavior, upon use of a nonportable or erroneous program construct or
of erroneous data, for which this International Standard imposes no
requirements
If you have some other reference, that discusses different kinds of undefined behavior, please update your question to cite it. Your question would then be about what that document means by its classification system, not (just) about the ISO C standard.
Some of the wording in your question appears similar to some of the information in C11 Annex L, "Analyzability" (which is optional for conforming C11 implementations), but your first example refers to "Undefined Behavior (information/confirmation needed)", and the word "confirmation" appears nowhere in the ISO C standard.
If I had the following declaration:
extern volatile int SOME_REGISTER;
and later on:
void trigger_read_register()
{
    SOME_REGISTER;
}
would calling trigger_read_register() issue a read request on SOME_REGISTER ?
According to the C11 spec, accessing a volatile is considered a side effect, and thus the compiler shouldn't optimize the (otherwise useless) access in your example.
So, the answer is that yes, it should read from memory.
See C11 standard (draft), section 5.1.2.3 paragraph 2:
Accessing a volatile object, modifying an object, modifying a file, or
calling a function that does any of those operations are all side
effects, which are changes in the state of the execution
environment. Evaluation of an expression in general includes both
value computations and initiation of side effects. Value computation
for an lvalue expression includes determining the identity of the
designated object.
Further, paragraph 4 says:
In the abstract machine, all expressions are evaluated as specified by
the semantics. An actual implementation need not evaluate part of an
expression if it can deduce that its value is not used and that no
needed side effects are produced (including any caused by calling a
function or accessing a volatile object).
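To make this concrete, here is a minimal sketch in C. The variable some_register is a hypothetical stand-in (ordinary memory, not real hardware), and the function names are illustrative only. Under the paragraphs quoted above, a compiler is expected to keep the discarded read and both reads in the second function, although what exactly counts as an access remains implementation-defined.

```c
/* Hypothetical stand-in for a device register. On real hardware this would
   typically be an extern declaration placed at the device's address. */
static volatile int some_register = 42;

/* The value is discarded, but reading a volatile object is a side effect,
   so the load is expected to stay in the generated code. */
void trigger_read_register(void)
{
    (void) some_register;
}

/* Each initializer is a separate volatile read, and the two reads are
   sequenced by the end of each declaration. The compiler may not assume
   that first == second and fold the result to 0. */
int read_twice_difference(void)
{
    int first = some_register;
    int second = some_register;
    return first - second;
}
```

With ordinary memory the difference is of course 0 at run time; the point is only that the generated code should contain two loads rather than a constant.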
I work on compilers for a couple of embedded platforms. A user has recently complained about the following behaviour from one of our compilers. Given code like this:
extern volatile int MY_REGISTER;
void Test(void)
{
    (void) (MY_REGISTER = 1);
}
The compiler generates this (in pseudo-assembler):
Test:
move regA, 1
store regA, MY_REGISTER
load regB, MY_REGISTER
That is, it not only writes to MY_REGISTER, but reads it back afterwards. The extra load upset him for performance reasons. I explained that this was because according to the standard "An assignment expression has the value of the left operand after the assignment, [...]".
Strangely, removing the cast-to-void changes the behaviour: the load disappears. The user's happy, but I'm just confused.
So I also checked this out in a couple of versions of GCC (3.3 and 4.4). There, the compiler never generates a load, even if the value is explicitly used, e.g.
int TestTwo(void)
{
    return (MY_REGISTER = 1);
}
This turns into:
TestTwo:
move regA, 1
store regA, MY_REGISTER
move returnValue, 1
return
Does anyone have a view on which is a correct interpretation of the standard? Should the read-back happen at all? Is it correct or useful to add the read only if the value is used or cast to void?
The relevant paragraph in the standard is this:
An assignment operator stores a value
in the object designated by the left
operand. An assignment expression has
the value of the left operand after
the assignment, but is not an lvalue.
The type of an assignment expression
is the type of the left operand unless
the left operand has qualified type,
in which case it is the unqualified
version of the type of the left
operand. The side effect of updating the stored value of the left operand shall
occur between the previous and the next sequence point.
So this clearly distinguishes between "the value of the left operand" and the update of the stored value. Also note that the result is not an lvalue (so the value of the expression carries no reference to the variable) and all qualifiers are lost.
So I read this as GCC doing the right thing when it returns the value that it already knows it has stored.
Edit:
The upcoming standard plans to clarify that by adding a footnote:
The implementation is permitted to
read the object to determine the value
but is not required to, even when the
object has volatile-qualified type.
Edit 2:
Actually there is another paragraph about expression statements that might shed some light on that:
The expression in an expression
statement is evaluated as a void
expression for its side effects. (Footnote: Such as assignments, and function calls which have side effects.)
Since this implies that the effect of returning a value is not wanted for such a statement, this strongly suggests that the value may only be loaded from the variable if the value is used.
In summary, your customer is right to be upset when he sees that the variable is loaded. This behavior might be in accordance with the standard if you stretch the interpretation of it, but it is clearly on the borderline of being acceptable.
Reading back seems to be nearer to the standard (especially considering that reading a volatile variable can result in a different value than the one written), but I'm pretty sure it isn't what is expected by most code using volatile, especially in contexts where reading or writing a volatile variable triggers some other effects.
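A small simulation shows why the two interpretations can give different values. This is only a sketch: sim_status models a hypothetical write-1-to-clear status register in ordinary memory, and all the names here are made up for illustration, not taken from any real device or compiler.

```c
#include <stdint.h>

/* Hypothetical simulated register: writing a 1 to a bit clears that bit. */
static uint32_t sim_status = 0x5u;

static void sim_status_write(uint32_t value)
{
    sim_status &= ~value;
}

static uint32_t sim_status_read(void)
{
    return sim_status;
}

/* Value of "STATUS = v" if the compiler reuses the value it just stored. */
uint32_t assign_reuse(uint32_t v)
{
    sim_status_write(v);
    return v;
}

/* Value of "STATUS = v" if the compiler reads the register back. */
uint32_t assign_readback(uint32_t v)
{
    sim_status_write(v);
    return sim_status_read();
}
```

Starting from 0x5 and writing 0x1, the first version yields 0x1 while the second yields 0x4, and the second also performs an extra device access that the code never asked for explicitly.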
volatile in general isn't very well defined -- "What constitutes an access to an object that
has volatile-qualified type is implementation-defined."
Edit: If I had to write a compiler, I think I wouldn't read back the variable if the value isn't used, and would reread it if it is, but with a warning. Then should a cast to void count as a use?
(void) v;
should surely be one, and considering that, I don't see any reason for
(void) (v = exp);
not to be. But in any case, I'd give a warning explaining how to get the other effect.
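For what it's worth, user code can sidestep the ambiguity by writing the store and the read as separate expressions. The following is a minimal sketch, assuming v is a volatile int in ordinary memory; the helper names are hypothetical.

```c
/* Stand-in for the volatile object under discussion. */
static volatile int v;

/* Store only: the assignment is a full expression statement and its value
   is never used, so no read-back is being requested. */
void store_only(int exp)
{
    v = exp;
}

/* Store followed by an explicitly written read. The read is its own
   expression, so it does not depend on how the compiler interprets the
   value of an assignment to a volatile object. */
int store_then_read(int exp)
{
    v = exp;
    return v;
}
```

A warning on the ambiguous forms could point users to one of these two spellings, depending on which effect they want.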
BTW, if you work on a compiler, you probably have someone in contact with the C committee. Filing a formal defect report will get you a binding interpretation (well, there is the risk of the DR being classified "Not A Defect" without any hint about what they want...)
The language in the standard says nothing about reading the volatile variable, only what the value of the assignment expression is, which a) is defined by C semantics, not by the content of the variable and b) isn't used here, so need not be calculated.