__restrict vis-a-vis a function optimization behavior of popular compilers - c

Consider the following function:
int bar(const int* __restrict x, void g())
{
int result = *x;
g();
result += *x;
return result;
}
Do we need to read twice from x because of the call to g()? Or is the __restriction enough to guarantee the invocation of g() does not access/does not alter the value at address x?
At this link we see the most popular compilers have to say about this (GodBolt; language standard C99, platform AMD64):
clang 7.0: Restriction respected.
GCC 8.3: No restriction.
MSVC 19.16: No restriction.
Is clang rightly optimizing the second read away, or isn't it? I'm asking both for C and C++ here, as the behavior is the same (thanks #PSkocik).
Related information and some notes:
Readers unfamiliar with __restrict (or __restrict__) may want to have a look at: What does the restrict keyword mean in C++?
GCC's documentation page on restricted pointers in C++.
I've opened a bug against GCC on this point.
The fact that x is marked const is not significant here - we get the same behavior if we drop the const and the question stands as is.

I think this is effectively a C question, since C is effectively the language that has restrict, with a formal spec attached to it.
The part of the C standard that governs the use of restrict is 6.7.3.1:
1 Let D be a declaration of an ordinary identifier that provides a
means of designating an object P as a restrict-qualified pointer to
type T.
2 If D appears inside a block and does not have storage class extern,
let B denote the block. If D appears in the list of parameter
declarations of a function definition, let B denote the associated
block. Otherwise, let B denote the block of main (or the block of
whatever function is called at program startup in a freestanding
environment).
3 In what follows, a pointer expression E is said to be based on
object P if (at some sequence point in the execution of B prior to the
evaluation of E) modifying P to point to a copy of the array object
into which it formerly pointed would change the value of E.137) Note
that ''based'' is defined only for expressions with pointer types.
4 During each execution of B, let L be any lvalue that has &L based on
P. If L is used to access the value of the object X that it
designates, and X is also modified (by any means), then the following
requirements apply: T shall not be const-qualified. Every other lvalue
used to access the value of X shall also have its address based on P.
Every access that modifies X shall be considered also to modify P, for
the purposes of this subclause. If P is assigned the value of a
pointer expression E that is based on another restricted pointer
object P2, associated with block B2, then either the execution of B2
shall begin before the execution of B, or the execution of B2 shall
end prior to the assignment. If these requirements are not met, then
the behavior is undefined.
5 Here an execution of B means that portion of the execution of the
program that would correspond to the lifetime of an object with scalar
type and automatic storage duration associated with B.
The way I read it, the execution of g() falls under the execution of the bar's block, so g() is disallowed from modifying *x and clang is right to optimize out the second load (IOW, if *x refers to a non-const global, g() must not modify that global).

Related

Undefined behavior when working with partially initialized struct in C90

Let's consider the following code:
struct M {
unsigned char a;
unsigned char b;
};
void pass_by_value(struct M);
int main() {
struct M m;
m.a = 0;
pass_by_value(m);
return 0;
}
In the function pass_by_value m.b is initialized before used.
However, since m is passed by value the compiler copies it to the stack already.
No variable has storage class register here. a and b are of type unsigned char.
Does that have to be considered UB in C90? (Please note: I am specifically asking for C90)
This question is very similar to Returning a local partially initialized struct from a function and undefined behavior, but actually the other way around.
The C 1990 standard (and the C 199 standard) does not contain the sentence that first appears in C 2011 that makes the behavior of using some uninitialized values undefined.
C 2011 6.3.2.1 2 includes:
… If the lvalue has an incomplete type and does not have array type, the behavior is undefined. If the lvalue designates an object of automatic storage duration that could have been declared with the register storage class (never had its address taken), and that object is uninitialized (not declared with an initializer and no assignment to it has been performed prior to use), the behavior is undefined.
The whole of the corresponding paragraph in C 1990, clause 6.2.2.1, second paragraph, is:
Except when it is the operand of the sizeof operator, the unary & operator, the ++ operator, the -- operator, or the left operand of the . operator or an assignment operator, an lvalue that does not have array type is converted to the value stored in the designated object (and is no longer an lvalue). If the lvalue has qualified type, the value has the unqualified version of the type of the lvalue; otherwise, the value has the type of the lvalue. If the lvalue has an incomplete type and does not have array type, the behavior is undefined.
Therefore, the behavior of the code in the question would seem to be defined, inasmuch that it passes the value stored in the structure.
In the absence of explicit statements in the standard, common practice helps guide interpretation. It is perfectly normal not to initialize all members of a structure yet to expect the structure to represent useful data, and therefore the behavior of using the structure as a value must be defined if at least one of its members is initialized. The equivalent question for C 2011 contains mention (from a C defect report) of the standard struct tm in one of its answers. The struct tm may be used to represent a specific date by filling in all of date fields (year, month, day of month) and possibly the time fields (hour, minute, second, even Daylight Savings Time indication) but leaving the day of week and day of year fields uninitialized.
In defining undefined behavior in 3.16, the 1990 standard does say it is “Behavior, upon use … of indeterminately valued objects, for which this International Standard imposes no requirements.” And 6.5.7 says “… If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate…” However, a structure with automatic storage duration in which one member, but not another, has been initialized is neither fully initialized nor not initialized. Given the intended uses of structures, I would say we should not consider use of the value of a partially initialized structure to be subject to being made undefined by 3.16.
Under C90, if an object held Indeterminate Value, each and every bit could independently be zero or one, regardless of whether or not they would in combination represent a valid bit pattern for the object's type. If an implementation specified the behavior of attempting to read each and every one of the 2ⁿ individual possible bit patterns an object could hold, the behavior of reading an object with Indeterminate Value would be equivalent to reading the value of an arbitrarily chosen bit pattern. If there were any bit patterns for which an implementation did not specify the effect of an attempted read, then the effects of trying to read an object that might hold such bit patterns would be likewise unspecified.
Code generation efficiency could be improved in some cases by specifying the behavior of uninitialized objects more loosely, in a way which would not otherwise be consistent with sequential program execution as specified but would nonetheless meet program requirements. For example, given something like:
struct foo { short dat[16]; } x,y,z;
void test1(int a, int b, int c, int d)
{
struct foo temp;
temp.dat[a] = 1;
temp.dat[b] = 2;
temp.dat[c] = 3;
temp.dat[d] = 4;
x=temp;
y=temp;
}
void test2(int a, int b, int c, int d)
{
test1(a,b,c,d);
z=x;
}
If client code only cares about the values of x and y that correspond to values of temp that were written, efficiency might be improved, while still meeting requirements, if the code were rewritten as:
void test1(int a, int b, int c, int d)
{
x.dat[a] = 1;
y.dat[a] = 1;
x.dat[b] = 2;
y.dat[b] = 1;
x.dat[c] = 3;
y.dat[c] = 1;
x.dat[d] = 4;
y.dat[d] = 1;
}
The fact that the original function test1 doesn't do anything to initialize temp suggests that it won't care about what is yielded by any individual attempt to read it. On the other hand, nothing within the code for test2 would imply that client code wouldn't care about whether all members of x held the same values as corresponding values of y. Thus, such an inference would more likely be dangerous there.
The C Standard makes no attempt to define behavior in situations where an optimization might yield program behavior which, although useful, would be inconsistent with sequential processing of non-optimized code. Instead, the principle that optimizations must never affect any defined behavior is taken to imply that the Standard must characterize as Undefined all actions whose behavior would be visibly affected by optimization, leaving implementor discretion the question of what aspects of behavior should or should not be defined in what circumstances. Ironically, the only time the Standard's laxity with regard to this behavior would allow more efficient code generation outside contrived scenarios would be in cases where implementations treat the behavior as at least loosely defined, and programmers are able to exploit that. If a programmer had to explicitly initialize all elements of temp to avoid having the compiler behave in completely nonsensical fashion, that would eliminate any possibility of optimizing out the unnecessary writes to unused elements of x and y.

Is the C11 formal definition of restrict consistent with implementation?

In trying to answer a recent question (Passing restrict qualified pointers to functions?), I could not find how the C11 standard is consistent with practice.
I'm not trying to call out the standard or anything, most things that look inconsistent I just am not understanding right, but my question is best posed as an argument against the definition used in the standard, so here it is.
It seems to be commonly accepted that a function can take a restrict qualified pointer and both work on it and have its own function calls work on it. For example,
// set C to componentwise sum and D to componentwise difference
void dif_sum(float* restrict C, float* restrict D, size_t n)
{
size_t i = 0;
while(i<n) C[i] = C[i] - D[i],
D[i] += C[i] + D[i],
++i;
}
// set A to componentwise sum of squares of A and B
// set B to componentwise product of A and B
void prod_squdif(float* restrict A, float* restrict B, size_t n)
{
size_t i = 0;
float x;
dif_sum(A,B,n);
while(i<n) x = ( (A[i]*=A[i]) - B[i]*B[i] )/2,
A[i] -= x,
B[i++] = x/2;
}
What seems to be the common understanding is that restrict pointers need to reference independent space within their declaring block. So, prod_sqdif is valid because nothing lexically within its defining block accesses the arrays identified by A or B other than those pointers.
To demonstrate my concern with the standard, here is the standard formal definition of restrict (according to the committee draft, if you have the final version and it is different, let me know!):
6.7.3.1 Formal definition of restrict
1 Let D be a declaration of an ordinary identifier that provides a means of designating an object P as a restrict-qualified pointer to type T.
2 If D appears inside a block and does not have storage class extern, let B denote the block. If D appears in the list of parameter declarations of a function definition, let B denote the associated block. Otherwise, let B denote the block of main (or the block of whatever function is called at program startup in a freestanding environment).
3 In what follows, a pointer expression E is said to be based on object P if (at some sequence point in the execution of B prior to the evaluation of E) modifying P to point to a copy of the array object into which it formerly pointed would change the value of E. Note that ‘‘based’’ is defined only for expressions with pointer types.
4 During each execution of B, let L be any lvalue that has &L based on P. If L is used to access the value of the object X that it designates, and X is also modified (by any means), then the following requirements apply: T shall not be const-qualified. Every other lvalue used to access the value of X shall also have its address based on P. Every access that modifies X shall be considered also to modify P, for the purposes of this subclause. If P is assigned the value of a pointer expression E that is based on another restricted pointer object P2, associated with block B2, then either the execution of B2 shall begin before the execution of B, or the execution of B2 shall end prior to the assignment. If these
requirements are not met, then the behavior is undefined.
5 Here an execution of B means that portion of the execution of the program that would correspond to the lifetime of an object with scalar type and automatic storage duration associated with B.
6 A translator is free to ignore any or all aliasing implications of uses of restrict.
[Examples not included because they are not formally significant.]
Identifying execution of B with expressions lexically contained therein might be seen as supported by the following excerpt from 6.2.4, item 6:
"...Entering an enclosed block or calling a function suspends, but does not end,
execution of the current block..."
However, part 5 of the formal definition of restrict explicitly defines the block B to correspond to the lifetime of an object with automatic storage declared in B (in my example, B is the body of prod_squdif). This clearly overrides any definition of the execution of a block found elsewhere in the standard. The following excerpt from the standard defines lifetime of an object.
6.2.4 Storage durations of objects, item 2
The lifetime of an object is the portion of program execution during which storage is guaranteed to be reserved for it. An object exists, has a constant address, and retains its last-stored value throughout its lifetime. 34) If an object is referred to outside of its lifetime, the behavior is undefined. The value of a pointer becomes indeterminate when the object it points to (or just past) reaches the end of its lifetime.
Then the execution of dif_sum is clearly included in the execution of B. I don't think there is any question there. But then the lvalues in dif_sum that read and modify elements of A and B (via C and D) are clearly not based on A and B (they follow sequence points where A and B could have been repointed to copies of their content without changing the locations identified by the lvalues). This is undefined behavior. Note that what lvalues or sequence points are discussed in item 4 is not restricted; as it is stated, there is no reason to restrict lvalues and sequence points to those lexically corresponding to the block B, and so lvalues and sequence points within a function call play just like they do in the body of the calling function.
On the other hand, the generally accepted use of restrict seems implied by the fact that the formal definition explicitly allows C and D to be assigned the values of A and B. This suggests that some meaningful access to A and B through C and D is allowed. However, as argued above, such access is undefined for any element modified through either the outer or inner function call, and at least read by the inner call. This seems contrary to the apparent intent of allowing the assignment in the first place.
Of course, intent has no formal place in the standard, but it does seem suggestive that the common interpretation of restrict, rather than what seems actually defined, is what is intended.
In summary, interpreting the execution of B as the execution of each statement during the lifetime of B's automatic storage, then function calls can't work with the contents of restrict pointers passed to them.
It seems unavoidable to me that there should be some exception stating reads and writes within functions or sub blocks are not considered, but that at most one assignment within such a sub block (and other sub blocks, recursively) may be based on any particular restrict pointer in the outer block.
I have really gone over the standard, both today and yesterday. I really can't see how the formal definition of restrict could possibly be consistent with the way it seems to be understood and implemented.
EDIT: As has been pointed out, violating the restrict contract results in undefined behavior. My question is not about what happens when the contract is violated. My question can be restated as follows:
How can the formal definition of restrict be consistent with access to array elements through function calls? Does such access, within a calling function, not constitute access not based on the restrict pointer passed to the function?
I am looking for an answer based in the standard, as I agree that restrict pointers should be able to be passed through function calls. It just seems that this is not the consequence of the formal definition in the standard.
EDIT
I think the main problem with communicating my question is related to the definition of "based on". I will try to present my question a little differently here.
The following is an informal tracking of a particular call to prod_squdif. This is not intended as C code, it is just an informal description of the execution of the function's block.
Note that this execution includes the execution of the called function, per item 5 of the formal definition of restrict: "Here an execution of B means that portion of the execution of the program that would correspond to the lifetime of an object with scalar type and automatic storage duration associated with B."
// 1. prod_squdif is called
prod_squdif( (float[1]){2}, (float[1]){1}, 1 )
// 2. dif_sum is called
dif_sum(A,B,n) // assigns C=A and D=B
// 3. while condition is evaluated
0<1 // true
// 4. 1st assignment expression
C[0] = C[0] - D[0] // C[0] == 0
// 5. 2nd assignment expression
D[0] += C[0] + D[0] // D[0] == 1
// 6. increment
++i // i == 1
// 7. test
1<1 // false
// return to calling function
// 8. test
0<1 // true
// 9. 1st assignment expression
x = ( (A[0]*=A[0]) - B[1]*B[1] )/2 // x == -.5
// 10. 2nd assignment expression
A[0] -= -.5 // A[0] == .5
// 11. 3rd assignment expression
B[i++/*evaluates to 0*/] = -.5/2 // B[0] == -.25
// 12. test
1<1 // false
// prod_squdif returns
So, the test for the restrict contract is given by item 4 in the formal definition of restrict: "During each execution of B, let L be any lvalue that has &L based on P. If L is used to access the value of the object X that it designates, and X is also modified (by any means), then the following requirements apply: ... Every other lvalue used to access the value of X shall also have its address based on P..."
Let L be the lvalue on the left of the portion of the execution marked '4' above (C[0]). Is &L based on A? I.e., is C based on A?
See item 3 of the formal definition of restrict: "...a pointer expression E is said to be based on object P if (at some sequence point in the execution of B prior to the evaluation of E) modifying P to point to a copy of the array object into which it formerly pointed would change the value of E...".
Take as a sequence point the end of item 3 above. (At this sequence point) modifying A to point to a coppy of the array object into which it formerly pointed would NOT change the value of C.
Thus C is not based on A. So A[0] is modified by an lvalue not based on A. Since it is also read by an lvalue that is based on A (item 10), this is undefined behavior.
My question is: Is it correct to therefore conclude that my example invokes undefined behavior and thus the formal definition of restrict is not consistent with common implementation?
Suppose we have a function with nested blocks like this:
void foo()
{
T *restrict A = getTptr();
{
T *restrict B = A;
{
#if hypothetical
A = copyT(A);
#endif
useTptr(B + 1);
}
}
}
It would seem that, at the point where useTptr(B + 1) is called, the hypothetical change to A would no longer affect the value of B + 1. However, a different sequence point can be found, such that a change to A does affect the value of B + 1:
void foo()
{
T *restrict A = getTptr();
#if hypothetical
A = copyT(A);
#endif
{
T *restrict B = A;
{
useTptr(B + 1);
}
}
}
and C11 draft standard n1570 6.7.3.1 Formal definition of restrict only demands that there be some such sequence point, not that all sequence points exhibit this behavior.
I'm really not sure exactly what your question is.
It sounds like you're asking:
Q: Gee, will "restrict" still apply if I violate the "restrict"
contract? Like in the "remove_zeroes()" example?
The answer, of course, is "No - it won't".
Here are two links that might clarify the discussion. Please update your post with (a) more explicit question(s):
Realistic usage of the C99 'restrict' keyword?
Is it legal to assign a restricted pointer to another pointer, and use the second pointer to modify the value?
https://en.wikipedia.org/wiki/Restrict

When can a volatile variable be optimized away completely?

Consider this code example:
int main(void)
{
volatile int a;
static volatile int b;
volatile int c;
c = 20;
static volatile int d;
d = 30;
volatile int e = 40;
static volatile int f = 50;
return 0;
}
Without volatile a compiler could optimize away all the variables since they are never read from.
I think a and b can be optimized away since they are completely unused, see unused volatile variable.
I think c and d can not be removed since they are written to, and writes to volatile variables must actually happen. e should be equivalent to c.
GCC does not optimize away f, but it also does not emit any instruction to write to it. 50 is set in a data section. LLVM (clang) removes f completely.
Are these statements true?
A volatile variable can be optimized away if it is never accessed.
Initialization of a static or global variable does not count as access.
Writes to volatile variables (even automatic ones) count as observable behaviour.
C11 (N1570) 5.1.2.3/6:
The least requirements on a conforming implementation are:
— Accesses to volatile objects are evaluated strictly according to the rules of the abstract
machine.
— At program termination, all data written into files shall be identical to the result that
execution of the program according to the abstract semantics would have produced.
— The input and output dynamics of interactive devices shall take place as specified in
7.21.3. The intent of these requirements is that unbuffered or line-buffered output
appear as soon as possible, to ensure that prompting messages actually appear prior to
a program waiting for input.
This is the observable behavior of the program.
The question is: does initialization (e, f) count as an "access"? As pointed out by Sander de Dycker, 6.7.3 says:
What constitutes an access to an object that has volatile-qualified type is implementation-defined.
which means it's up to the compiler whether or not e and f can be optimized away - but this must be documented!
Strictly speaking any volatile variable that is accessed (read or written to) cannot be optimized out according to the C Standard. The Standard says that an access to a volatile object may have unknown side-effects and that accesses to volatile object has to follow the rules of the C abstract machine (where all expressions are evaluated as specified by their semantics).
From the mighty Standard (emphasis mine):
(C11, 6.7.3p7) "An object that has volatile-qualified type may be modified in ways unknown to the
implementation or have other unknown side effects. Therefore any expression referring
to such an object shall be evaluated strictly according to the rules of the abstract machine,
as described in 5.1.2.3."
Following that even a simple initialization of a variable should be considered as an access. Remember the static specifier also causes the object to be initialized (to 0) and thus accessed.
Now compilers are known to behave differently with the volatile qualifier, and I guess a lot of them will just optimize out most of the volatile objects of your example program except the ones with the explicit assignment (=).

Does `const T *restrict` guarantee the object pointed-to isn’t modified?

Consider the following code:
void doesnt_modify(const int *);
int foo(int *n) {
*n = 42;
doesnt_modify(n);
return *n;
}
where the definition of doesnt_modify isn’t visible for the compiler. Thus, it must assume, that doesnt_modify changes the object n points to and must read *n before the return (the last line cannot be replaced by return 42;).
Assume, doesnt_modify doesn’t modify *n. I thought about the following to allow the optimization:
int foo_r(int *n) {
*n = 42;
{ /* New scope is important, I think. */
const int *restrict n_restr = n;
doesnt_modify(n_restr);
return *n_restr;
}
}
This has the drawback that the caller of doesnt_modify has to tell the compiler *n isn’t modified, rather than that the function itself could tell the compiler via its prototype. Simply restrict-qualifying the parameter to doesnt_modify in the declaration doesn’t suffice, cf. “Is top-level volatile or restrict significant [...]?”.
When compiling with gcc -std=c99 -O3 -S (or Clang with the same options), all functions are compiled to equivalent assembly, all re-reading the 42 from *n.
Would a compiler be allowed to do this optimization (replace the last line by return 42;) for foo_r? If not, is there a (portable, if possible) way to tell the compiler doesnt_modify doesn’t modify what its argument points to? Is there a way compilers do understand and make use of?
Does any function have UB (provided doesnt_modify doesn’t modify its argument’s pointee)?
Why I think, restrict could help here (From C11 (n1570) 6.7.3.1 “Formal definition of restrict”, p4 [emph. mine]):
[In this case, B is the inner block of foo_r, P is n_restr, T is const int, and X is the object denoted by *n, I think.]
During each execution of B, let L be any lvalue that has &L based on P. If L is used to access the value of the object X that it designates, and X is also modified (by any means), then the following requirements apply: T shall not be const-qualified. […]
$ clang --version
Ubuntu clang version 3.5.0-4ubuntu2 (tags/RELEASE_350/final) (based on LLVM 3.5.0)
Target: x86_64-pc-linux-gnu
Gcc version is 4.9.2, on an x86 32bit target.
Version 1 seems clearly specified by the formal definition of restrict (C11 6.7.3.1). For the following code:
const int *restrict P = n;
doesnt_modify(P);
return *P;
the symbols used in 6.7.3.1 are:
B - that block of code
P - the variable P
T - the type of *P which is const int
X - the (non-const) int being pointed to by P
L - the lvalue *P is what we're interested in
6.7.3.1/4 (partial):
During each execution of B, let L be any lvalue that has &L based on P. If L is used to access the value of the object X that it designates, and X is also modified (by any means), then the following requirements apply: T shall not be const-qualified
[...]
If these requirements are not met, then the behavior is undefined.
Note that T is const-qualified. Therefore, if X is modified in any way during this block (which includes during the call to a function in that block), the behaviour is undefined.
Therefore the compiler can optimize as if doesnt_modify did not modify X.
Version 2 is a bit more difficult for the compiler. 6.7.6.3/15 says that top-level qualifiers are not considered in prototype compatibility -- although they aren't ignored completely.
So although the prototype says:
void doesnt_modify2(const int *restrict p);
it could still be that the body of the function is declared as void doesnt_modify2(const int *p) and therefore might modify *p.
My conclusion is that if and only if the compiler can see the definition for doesnt_modify2 and confirm that p is declared restrict in the definition's parameter list then it would be able to perform the optimization.
Generally, restrict means that the pointer is not aliased (i.e. only it or a pointer derived from it can be used to access the pointed-to object).
With const, this means that the pointed-to object cannot be modified by well-formed code.
There is, however, nothing to stop the programmer breaking the rules using an explicit type conversion to remove the constness. Then the compiler (having been beaten into submission by the programmer) will permit an attempt to modify the pointed-to object without any complaint. This, strictly speaking, results in undefined behaviour so any result imaginable is then permitted including - possibly - modifying the pointed-to object.
If not, is there a (portable, if possible) way to tell the compiler doesnt_modify doesn’t modify what its argument points to?
No such way.
Compiler optimizers have difficulty optimizing when pointer and reference function parameters are involved. Because the implementation of that function can cast away constness compilers assume that T const* is as bad as T*.
Hence, in your example, after the call doesnt_modify(n) it must reload *n from memory.
See 2013 Keynote: Chandler Carruth: Optimizing the Emergent Structures of C++. It applies to C as well.
Adding restrict keyword here does not change the above.
Simultaneous use of a restrict qualifier on a pointer-type parameter and a const qualifier on its target type would invite a compiler to assume that no region of storage which is accessed during the lifetime of the pointer object via the pointer contained therein or any pointer derived from it, will be modified via any means during that pointer's lifetime. It generally says nothing whatsoever about regions of storage which are not accessed using the pointer in question.
The only situations where const restrict would have implications for an entire object would be those where pointer is declared using array syntax with a static bound. In that situation, behavior would only be defined in cases where the entire array object could be read (without invoking UB). Since reading any part of the array object which changes during function execution would invoke UB, code would be allowed to assume that no portion of the array can be changed in any fashion whatsoever.
Unfortunately, while a compiler that knew that a function's actual definition starts with:
void foo(int const thing[restrict static 1]);
would be entitled to assume that no part of *thing would be changed during the function's execution, even if the object might be one the function could otherwise access via pointer not derived from thing, the fact that a function's prototype includes such qualifiers would not compel its definition to do likewise.

In C, can a const variable be modified via a pointer?

I wrote some thing similar to this in my code
const int x=1;
int *ptr;
ptr = &x;
*ptr = 2;
Does this work on all compilers? Why doesn't the GCC compiler notice that we are changing a constant variable?
const actually doesn't mean "constant". Something that's "constant" in C has a value that's determined at compile time; a literal 42 is an example. The const keyword really means read-only. Consider, for example:
const int r = rand();
The value of r is not determined until program execution time, but the const keyword means that you're not permitted to modify r after it's been initialized.
In your code:
const int x=1;
int *ptr;
ptr = &x;
*ptr = 2;
the assignment ptr = &x; is a constraint violation, meaning that a conforming compiler is required to complain about it; you can't legally assign a const int* (pointer to const int) value to a non-const int* object. If the compiler generates an executable (which it needn't do; it could just reject it), then the behavior is not defined by the C standard.
For example, the generated code might actually store the value 2 in x -- but then a later reference to x might yield the value 1, because the compiler knows that x can't have been modified after its initialization. And it knows that because you told it so, by defining x as const. If you lie to the compiler, the consequences can be arbitrarily bad.
Actually, the worst thing that can happen is that the program behaves as you expect it to; that means you have a bug that's very difficult to detect. (But the diagnostic you should have gotten will have been a large clue.)
Online C 2011 draft:
6.7.3 Type qualifiers
...
6 If an attempt is made to modify an object defined with a const-qualified type through use
of an lvalue with non-const-qualified type, the behavior is undefined. If an attempt is
made to refer to an object defined with a volatile-qualified type through use of an lvalue
with non-volatile-qualified type, the behavior is undefined.133)
133) This applies to those objects that behave as if they were defined with qualified types, even if they are
never actually defined as objects in the program (such as an object at a memory-mapped input/output
address).
Emphasis added.
Since the behavior is left undefined, the compiler is not required to issue a diagnostic, nor is it required to halt translation. This would be difficult to catch in the general case; suppose you had a function like
void foo( int *p ) { *p = ...; }
defined in it's own separate translation unit. During translation, the compiler has no way of knowing if p could be pointing to a const-qualified object or not. If your call is something like
const int x;
foo( &x );
you may get a warning like parameter 1 of 'foo' discards qualifiers or something similarly illuminating.
Also note that the const qualifier doesn't necessarily mean that the associated variable will be stored in read-only memory, so it's possible the above code would "work" (update the value in x) in that you'd successfully update x by doing an end-run around the const semantics. But then you might as well just not declare x to be const.
There is a good discussion of this here:Does the evil cast get trumped by the evil compiler?
I would expect gcc to compile this because:
ptr is allowed to point to x, otherwise reading it would be impossible, although as the comment below says it's not exactly brilliant code and the compiler should complain. Warning options will (I guess) affect whether or not it's actually warned about.
when you write to x, the compiler no longer "knows" that it is writing to a const, all this is in the coders hands in C. However, it does really know, so it may well warn you, depending on the warning options you've selected.
whether it works or not, however, will depend on how the code is compiled, the way const is implemented for the compile options selected and the target CPU and architecture. It may work. Or it may crash. Or you may write to a "random" bit of memory and cause (or not) some freaky effect. It's not a good coding strategy, but that wasn't your question :-)
Bad programmer. No Moon Pie!
If you need to modify a const, copy it to a non-const variable and then work with it. It's const for a reason. Trying to "sneak" around a const can cause serious runtime issues. i.e. the optimizer has likely used the value inline, etc.
const int x=1;
int non_const_x = x;
non_const_x = 2;

Resources