const int a = 100;
int *p = &a;
*p = 99;
printf("value %d", a);
The code above compiles, and I am able to update the value through the pointer, but the code below doesn't output anything.
static const int a = 100;
int *p = &a;
*p = 99;
printf("value %d", a);
Can anyone explain this?
Both code snippets are invalid C. During initialization/assignment, there's a requirement that "the type pointed to by the left has all the qualifiers of the type pointed to by the right" (C17 6.5.16.1). This means we can't assign a const int* to an int*.
"Above code compiles" Well, it doesn't - it does not compile cleanly. See What must a C compiler do when it finds an error?.
I recommend making the compiler reject invalid C outright by using (on gcc, clang, icc) -std=c11 -pedantic-errors.
Since the code is invalid C, it's undefined behavior and why it has a certain behavior is anyone's guess. Speculating about why you get one particular output from one case of undefined behavior to another isn't very meaningful. What is undefined behavior and how does it work? Instead focus on writing valid C code without bugs.
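As a minimal sketch of the qualifier rule (the function name read_only_access is hypothetical, not from the question), the following keeps the pointer's target type const-qualified so that the initialization is valid. The two commented-out lines are the ones a conforming compiler has to diagnose.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads a const object through a correctly
   qualified pointer and returns the value it read. */
int read_only_access(void)
{
    const int a = 100;
    const int *p = &a;   /* OK: the pointed-to type keeps the const qualifier */
    /* int *q = &a;         constraint violation: drops const from the pointee */
    /* *p = 99;             constraint violation: *p is not a modifiable lvalue */
    assert(*p == a);     /* reading through p is always fine */
    printf("value %d\n", *p);
    return *p;
}
```

Calling read_only_access() from main prints the unmodified value, and with -pedantic-errors the commented-out lines would stop the build if they were enabled.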
There are several things going on here:
const does not mean "Put this variable in read-only memory or otherwise guarantee that any attempt to modify it will definitively result in an error message."
What const does mean is "I promise not to try to modify this variable." (But you broke that promise in both code fragments.)
Attempting to modify a const-qualified variable (i.e., breaking your promise) yields undefined behavior, which means that anything can happen, meaning that it might do what you want, or it might give you an error, or it might do what you don't want, or it might do something totally different.
Compilers don't always complain about const violations. (Though a good compiler should really have complained about the ones here.)
Some compilers are selective in their complaints. Sometimes you have to ask the compiler to warn about iffy things you've done.
Some programmers are careless about ignoring warnings. Did your compiler give you any warnings when you compiled this?
The compiler should complain in both cases when you store the address of a const int into p, a pointer to modifiable int.
In the first snippet, a is defined as a local variable with automatic storage: although you define it as const, the processor does not prevent storing a value into it via a pointer. The behavior is undefined, but consistent with your expectations (a is assigned the value 99 and this value is printed, but the compiler could have assumed that the value of a cannot be changed, hence could have passed 100 directly to printf without reading the value of a).
In the second snippet, a has static storage duration, like a global variable, but is only accessible from within the current scope. The compiler can place it in a read-only location, so the attempt to modify its value via the pointer may trap. The program may terminate before evaluating the printf() statement. This is consistent with your observations.
Briefly, in common implementations, modifying a const-qualified static object causes a trap and modifying a const-qualified automatic object does not, because static objects can be placed in protected memory but automatic objects must be kept in writable memory.
In common C implementations, a const-qualified static object is placed in a section of the program data that is marked read-only after it is loaded into memory. Attempting to modify this memory causes the processor to execute a trap, which results in the operating system terminating execution of the program.
In contrast, an object with automatic storage duration (one defined inside a function without static or other storage duration) cannot easily be put in a read-only program section. This is because automatic objects need to be allocated, initialized, and released during program execution, as the functions they are defined in are called and returned. So even though the object may be defined as const for the purposes of the C code, the program needs to be able to modify the memory actually used for it.
To achieve this, common C implementations put automatic objects on the hardware stack, and no attempt is made to mark the memory read-only. Then, if the program mistakenly attempts to modify a const-qualified automatic object, the hardware does not prevent it.
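The difference in storage can be illustrated without any undefined behavior. In this sketch (the names limit and clamp_doubled are made up for illustration), the static const object has a constant initializer and never needs to be written at run time, while the automatic const object is initialized from a run-time value on every call, which is why its memory has to be writable.

```c
#include <assert.h>

/* Static storage duration: initialized once, before the program starts,
   so an implementation may place it in a read-only section. */
static const int limit = 100;

/* Hypothetical function: `doubled` has automatic storage duration and is
   initialized from a run-time value on every call, so the memory behind it
   must be writable even though the C code may not modify it afterwards. */
int clamp_doubled(int x)
{
    const int doubled = x * 2;   /* written here, at run time, on each call */
    assert(limit == 100);        /* limit never changes during execution */
    return doubled > limit ? limit : doubled;
}
```

Where each object actually ends up is an implementation detail; the sketch only shows why the two cases are treated differently.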
The C standard requires that the compiler issue a diagnostic message for the statement int *p = &a;, since it attempts to initialize a pointer to non-const with the address of a const-qualified type. When you ignore that message and execute the program anyway, the behavior is not defined by the C standard.
Also see this answer for explanation of why the program may behave as though a is not changed even after *p = 99; executes without trapping.
6.7.3 Type qualifiers
...
6 If an attempt is made to modify an object defined with a const-qualified type through use
of an lvalue with non-const-qualified type, the behavior is undefined. If an attempt is
made to refer to an object defined with a volatile-qualified type through use of an lvalue
with non-volatile-qualified type, the behavior is undefined. 133)
133) This applies to those objects that behave as if they were defined with qualified types, even if they are
never actually defined as objects in the program (such as an object at a memory-mapped input/output
address).
C 2011 Online Draft
If you declare a as const, you're making a promise to the compiler that the value of a should not change during its lifetime; if you try to assign a new value to a directly the compiler should at least issue a diagnostic. However, by trying to change a through a non-const pointer p you're breaking that promise, but you're doing it in such a way that the compiler can't necessarily detect it.
The resulting behavior is undefined - neither the compiler nor the runtime environment are required to handle the situation in any particular way. The code may work as expected, it may crash outright, it may appear to do nothing, it may corrupt other data. const-ness may be handled in different ways depending on the compiler, the platform, and the code.
The use of static changes how a is stored, and the interaction of static and const is likely what's leading to the different behavior. The static version of a is likely being stored in a different memory segment which may be read-only.
Related
Why are we able to change constant variables using a pointer, but we can't change a constant string index value using a pointer?
For example,
Case1: Changing constant variables using pointers, this works fine.
int main()
{
const int var = 10;
int *ptr = &var;
*ptr = 12;
printf("var = %d\n", var); //12
return 0;
}
Case2: Changing constant string using pointers, this gives compiler error
int main()
{
char * a = "test";//test is in ROM, a is a pointer to its start address in ROM
a[3] = 'M';//error
return 0;
}
Both programs are invalid and have undefined behavior.
According to the C Standard (6.7.3 Type qualifiers)
6 If an attempt is made to modify an object defined with a
const-qualified type through use of an lvalue with non-const-qualified
type, the behavior is undefined.
It seems that the first program produces the expected result only because the variable var has automatic storage duration. That is, the compiler did not place it in read-only memory.
All string literals (although in C, unlike C++, they have non-const array types) have static storage duration and are usually collected by the compiler in a literal pool that is stored in read-only memory.
In any case according to the C Standard (6.4.5 String literals)
7 It is unspecified whether these arrays are distinct provided their
elements have the appropriate values. If the program attempts to
modify such an array, the behavior is undefined.
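If a modifiable string is needed, one common approach is to initialize an array from the literal rather than point at the literal itself. A minimal sketch, where the function name copy_modified is hypothetical:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: builds a modified copy of "test" and stores it
   in out, which must have room for 5 bytes. */
void copy_modified(char out[5])
{
    const char *s = "test";   /* read-only view of the literal; a write
                                 through s is diagnosed by the compiler */
    char a[] = "test";        /* separate, writable array initialized
                                 from the literal */
    a[3] = 'M';               /* defined: modifies the array, not the literal */
    assert(s[3] == 't');      /* the literal itself is untouched */
    memcpy(out, a, sizeof a); /* copies 5 bytes, including the terminator */
}
```

Declaring pointers to string literals as const char * is a convention rather than a requirement in C, but it lets the compiler catch accidental writes.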
There are at least four factors involved in the observations made in the question.
1. Implicitly removing const should generate a warning
Consider this:
int *ptr = &var;
In this statement, &var is a pointer to a const int, but ptr is a pointer to int. This violates the constraints for simple assignments in C 2018 6.5.16.1 (which apply because the rules for initialization in 6.7.9 11 refer to them). In this case, the left operand must have all the qualifiers of the right operand and be of otherwise compatible type.
Because a constraint is violated, a compiler conforming to the C standard is required to issue a diagnostic. You either used a non-conforming compiler to compile this program or you ignored the diagnostic and executed the program anyway.
Since a constraint is violated, the resulting behavior is not defined by the C standard.
An important principle here is that the C standard does not prevent you from breaking some rules. In this case, it merely does not guarantee what will happen.
2. The behavior of attempting to modify const object is not defined by the C standard
In this line:
*ptr = 12;
the program attempts to modify the constant var through a pointer. This violates C 6.7.3 7, which says:
If an attempt is made to modify an object defined with a const-qualified type through use of an lvalue with non-const-qualified type, the behavior is undefined.
(The expression *ptr is an lvalue with non-const-qualified type.)
As above, the C standard does not prevent you from breaking this rule; it merely does not define what will happen.
What happens when you break this rule? It depends on how the compiler treated your program. Several things are common:
If the object is static and const, the compiler might assign it to a read-only location of memory. Then attempting to modify it would cause a memory access violation and crash the program.
If the object is automatic (defined inside a function with default storage), the compiler might assign it to the stack. The stack is both readable and writeable (it has to be writeable because we change the stack frequently, as routines are called and return). Thus, although the object is const, the compiler has no good way to put an automatic object into read-only memory. So it is on the stack and is writeable. Then attempting to modify it succeeds.
The compiler, during optimization, modifies your program in various ways. This can cause hard-to-predict results when you attempt to modify a const object. The optimizer might recognize that the attempt is not defined by the C standard and simply remove the attempt from your program. But other results are possible too.
3. Due to historical language development, string literals are not const-qualified
At the time string literals were introduced to the C language, there was no const qualifier. There was merely a rule (at some point, if not initially), that you were not allowed to modify the elements of string literals.
When const was introduced to the C language, string literals could not be made const because this would cause many programs not to compile, because those programs were using char * to refer to elements in string literals. They were not modifying the string literals, but they were using these old pointer types to refer to them.
So string literals remained non-const-qualified.
4. The rule that modifying string literals is not supported remains
Another feature of string literals is that they may be consolidated. If you use "abcdefghijklmnopqrstuvwxyz" in one place in the program and you use the same string in another place in the program, even in a different translation unit, the compiler and linker are allowed to create just one instance of them in the executable file and in the memory of the loaded program. This feature was important for early programs, because machines had limited space, so combining copies of the same data was valuable.
This permission is in C 2018 6.4.5 7, which says, about string literals:
It is unspecified whether these arrays are distinct provided their elements have the appropriate values.…
That paragraph also gives us the rule that the behavior of attempting to modify string literals is not defined by the C standard:
… If the program attempts to modify such an array, the behavior is undefined.
The first rule of this paragraph is also a reason we need the second rule. If two string literals can be consolidated into one memory location, then a routine that changed what it thought of as its string could inadvertently change the data used by another routine, possibly in an entirely different part of the program written by a different person at a different company.
Thus, due to how the C language developed historically, string literals are not const-qualified, but the C standard does not support modifying them.
What happens when you break this rule? Commonly, string literals are put in a read-only portion of memory. When you attempt to modify them, a likely result is that your program causes a memory access violation and crashes. This is the proximate cause of the behavior you observed: The string was in read-only memory, and modifying it caused a crash, but the var object was on the stack, and modifying it through ptr “worked.” So the results are not a necessary consequence of the rules of C but were consequences of how your compiler behaved.
Declaring a const int and then assigning a value to it through a pointer is undefined behavior. Modifying the variable through a pointer will not raise an exception when it is allocated in writable memory, though (it seems that your compiler is allocating the variable on the stack).
However, when you declare char *str = "test"; the string "test" is usually allocated in a read-only memory section (which seems to be your case), so trying to change it through a[3] = 'M'; probably raises an ACCESS_VIOLATION.
To stress this, though: both cases should be considered undefined behavior.
When is casting away const illegal?
In many cases const is simply an annotation for the benefit of the user which is enforced by the compiler. This question is asking for when casting away const is strictly illegal and what part of the C99 standard forbids it (if any)?
I suspect that it must be illegal in some cases because I've observed that const function pointers in the global scope are sometimes inlined by gcc. If these function pointers could be legally modified later in the program then this optimization would be invalid. Additionally I'm pretty sure I've read an explanation for when casting away const is illegal on StackOverflow but I just can't find it. (I think the answer was that const can only be cast away when a variable was originally not declared as const.)
I've tagged this C99 because I'm hoping this is something specified in the C99 standard but if an answer can explain behavior common to the gcc, clang, and MSVC implementations of C99 then that would also be sufficient for me.
(Obviously this is a very basic question, so I did try and fail to find a duplicate specifically for the C standard. If anyone else can point me to an exact duplicate I'll be happy to close vote. In either case, I think this question title at least helps SEO for this sort of question.)
Casting away constness is never illegal. You can do it freely and legally as much as you want. But attempting to modify the pointed-to object after casting away the constness might lead to undefined behavior (or it might not, depending on the circumstances).
C99 clearly states
6.7.3 Type qualifiers
5 If an attempt is made to modify an object defined with a const-qualified type through use of an lvalue with non-const-qualified type, the behavior is undefined.
(it is 6.7.3/6 in C11).
Note that const in C (and C++) is used in two independent and very different roles: const-qualification of an object itself and const-qualification of an access path to an object.
const int a = 42; // `a` is a const-qualified object
int b = 5; // `b` is not const-qualified
const int *p = &b; // `*p` is a const-qualified access path to `b`
In the latter case the constness can be legally cast away and the resultant non-const access path can be used to modify the object, provided the object itself is not const
*(int *) p = 6; // nothing formally wrong with it
This is the very reason we have the ability to cast away constness in the language.
But the former (constness of the object itself) cannot be overridden
*(int *) &a = 43; // undefined behavior
The reason is pretty trivial - a const object can be physically located in read-only memory. And even if it isn't, the compiler is still free to compile the code under assumption that the object's value never changes.
This means that in a declaration like
const int *const cp = &b;
these two const have quite different meanings. The first (leftmost) one is a "soft" const, which can be overridden if the conditions are right. But the second const is a "hard" const that cannot be overridden.
So, when you say that "const is simply an annotation for the benefit of the user which is enforced by the compiler", it is only true for const-qualifications of access paths. When it comes to const-qualifications of objects themselves, it is generally a lot more than just a purely conceptual compiler-enforced "annotation".
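The legal case can be put into a small function. This is only a sketch, with the hypothetical name modify_via_cast: the object b is not defined const, so writing to it through a pointer whose const qualifier has been cast away is well defined.

```c
#include <assert.h>

/* Hypothetical helper: modifies a non-const object through an access
   path that was const-qualified and then cast back to int *. */
int modify_via_cast(void)
{
    int b = 5;            /* the object itself is not const */
    const int *p = &b;    /* const-qualified access path to b */
    *(int *)p = 6;        /* defined: b was not defined with a const type */
    assert(b == 6);
    return b;
}
```

The same cast applied to the address of an object defined as const int would compile just as readily, but the write would then be undefined behavior.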
Writing to a constant variable using a pointer is giving run time error.
const int i;
int *p;
void main()
{
p = (int*)&i;
*p = 10; // Causes runtime error
}
But in a windows system everything is running from RAM itself.
When I printed the address of const variables and normal variables, I can see that they are in different offsets.
How does the system know that the address being accessed by the pointer is a const one?
Strictly speaking, your code yields undefined behavior according to the C-language standard.
In practice, the linker has probably placed the variable i in a RO section of the executable image.
So the write-operation *p = 10 resulted in a memory-access violation (aka segmentation fault).
How does the system know ....
Ideally, the system does not need to know. For objects with const-qualified type, the allocation will (in general) be in a read-only section, so any attempt to modify (write) will cause an access violation. It's the programmer who should know.
When I printed the address of const variables and normal variables, I can see that they are in different offsets.
Yes, that's likely, because the normal variables reside in read-write memory, whereas const variables will reside in read-only memory.
Please note that there's no syntax (or compilation) error in your code snippet. Only the run-time behavior of the code is undefined.
FYI, quoting C11, chapter §6.7.3/p6
If an attempt is made to modify an object defined with a const-qualified type through use
of an lvalue with non-const-qualified type, the behavior is undefined. [...]
Why is it allowed to change a const variable using a pointer to it with memcpy?
This code:
const int i=5;
int j = 0;
memcpy(&j, &i, sizeof(int));
printf("Source: i = %d, dest: j = %d\n", i,j);
j = 100;
memcpy(&i, &j, sizeof(int));
printf("Source: j = %d, dest: i = %d\n", j,i);
return 0;
compiled with just a warning:
warning: passing argument 1 of ‘memcpy’ discards ‘const’ qualifier
from pointer target type [enabled by default]
But did run just fine, and changed the value of a const variable.
Attempting to modify the value of a const-qualified variable leads to undefined behavior in C. You should not rely on your results, since anything can happen.
C11 (n1570), § 6.7.3 Type qualifiers
If an attempt is made to modify an object defined with a const-qualified type through use
of an lvalue with non-const-qualified type, the behavior is undefined.
Nothing forces the compiler to produce a diagnostic message.
In fact, this qualifier does not have an enormous effect on the machine code. A const-qualified variable does not usually reside in a read-only data segment (obviously not in your implementation, although it could be different on another one).
The compiler can't tell easily what a pointer is pointing to in a given function. It is possible with some static analysis tools, which perform pointer-analysis. However, it is difficult to implement, and it would be stupid to put it in the standard.
The question asks why. Here's why:
This is allowed because once you have a pointer to a memory address, the language does not know what it points to. It could be a variable, part of a struct, the heap or the stack, or anything. So it cannot prevent you from writing to it. Direct memory access is always unsafe and to be avoided if there's another way of doing it.
The const stops you modifying the value of a const with an assignment (or increment etc). These kinds of mutation are the only operations it can guarantee you won't be able to perform on a const.
Another way to look at this is the division of the static context (i.e. at compile time) and the runtime context. When you compile a piece of code which may, for example, make an assignment to a variable, the language can say "that's not allowed, it's const" and that is a compilation error. After this, the code is compiled into an executable and the fact that it is a const is lost. Variable declarations (and the rest of the language) are written as input to the compiler. Once it is compiled, the code isn't relevant. You can do a logical proof in your compiler to say that consts aren't changed. The compiled program runs, and we know at compile time that we have created a program that doesn't break the rules.
When you introduce pointers, you have behaviour that can be defined at run-time. The code that you wrote is now irrelevant, and you can [attempt to] do what you want. The fact that pointers are typed (allowing pointer arithmetic, interpreting the memory at the end of a pointer as a particular type) means that the language gives you some help, but it can't prevent you from doing anything. It can make no guarantees, as you can point a pointer anywhere. The compiler can't stop you breaking the rules at run-time with code that uses pointers.
That said, pointers are the way we get dynamic behaviour and data structures, and are necessary for all but the most trivial code.
(The above is subject to lots of caveats, e.g. code heuristics and more sophisticated static analysis, but is broadly true of a vanilla compiler.)
The reason is that the C language allows any object pointer type to be implicitly converted to/from the type void*. It is designed that way because void pointers are used for generic programming.
So a C compiler is not allowed to stop your code from compiling, even though the program invokes undefined behavior in this case. A good compiler will however give a warning as soon as you implicitly try to cast away a const qualifier.
C++ has "stronger typing" than C, meaning that it would require an explicit cast of the pointer type for this code to compile. This is one flaw of the C language that C++ actually fixed.
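For reference, the standard declares memcpy as void *memcpy(void * restrict s1, const void * restrict s2, size_t n), so only the source parameter accepts a pointer to const. The sketch below (the function name copy_from_const is hypothetical) shows the direction that is always fine and leaves the problematic call commented out.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copies the value of a const object into a
   non-const one and returns the copy. */
int copy_from_const(void)
{
    const int i = 5;
    int j = 0;

    memcpy(&j, &i, sizeof j);       /* fine: the source is const void * */
    /* memcpy(&i, &j, sizeof i);       discards const on argument 1, and the
                                       write to i is undefined behavior */
    assert(j == i);
    return j;
}
```

The commented-out call is the one that produced the "discards 'const' qualifier" warning quoted in the question.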
While 'officially' it's undefined, in reality it's very much defined: you will change the value of the const variable. Which raises the question of why it's const to begin with.
Is accessing a non-const object through a const declaration allowed by the C standard?
E.g. is the following code guaranteed to compile and output 23 and 42 on a standard-conforming platform?
translation unit A:
int a = 23;
void foo(void) { a = 42; }
translation unit B:
#include <stdio.h>
extern volatile const int a;
void foo(void);
int main(void) {
printf("%i\n", a);
foo();
printf("%i\n", a);
return 0;
}
In the ISO/IEC 9899:1999, I just found (6.7.3, paragraph 5):
If an attempt is made to modify an object defined with a const-qualified type through use
of an lvalue with non-const-qualified type, the behavior is undefined.
But in the case above, the object is not defined as const (but just declared).
UPDATE
I finally found it in ISO/IEC 9899:1999.
6.2.7, 2
All declarations that refer to the same object or function shall have compatible type;
otherwise, the behavior is undefined.
6.7.3, 9
For two qualified types to be compatible, both shall have the identically qualified
version of a compatible type; [...]
So, it is undefined behaviour.
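Given the compatibility requirement quoted above, one way to get a read-only view of a without mismatched declarations is to declare it with its real type and read it through a pointer to a const volatile type. The following is a single-file sketch of that idea; the names a_view, read_a and check_view are made up for illustration, and in a real program the two halves would live in separate translation units.

```c
#include <assert.h>

/* Translation unit A (sketched in the same file here): the only definition. */
int a = 23;
void foo(void) { a = 42; }

/* What translation unit B would declare: the same type as the definition. */
extern int a;

/* Read-only, volatile-qualified access path to the non-const object.
   Adding qualifiers in a pointer conversion is always allowed. */
static const volatile int *const a_view = &a;

int read_a(void) { return *a_view; }

/* Hypothetical check: the view tracks changes made through the plain name. */
void check_view(void)
{
    a = 23;
    assert(read_a() == 23);
    foo();
    assert(read_a() == 42);
}
```

Because every declaration of a has type int, the question of compatible qualified types does not arise, and writes through a_view are still rejected by the compiler.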
TU A contains the (only) definition of a. So a really is a non-const object, and it can be accessed as such from a function in A with no problems.
I'm pretty sure that TU B invokes undefined behavior, since its declaration of a doesn't agree with the definition. Best quote I've found so far to support that this is UB is 6.7.5/2:
Each declarator declares one identifier, and asserts that when an
operand of the same form as the declarator appears in an expression,
it designates a function or object with the scope, storage duration,
and type indicated by the declaration specifiers.
[Edit: the questioner has since found the proper reference in the standard, see the question.]
Here, the declaration in B asserts that a has type volatile const int. In fact the object does not have (qualified) type volatile const int, it has (qualified) type int. Violation of semantics is UB.
In practice what will happen is that TU A will be compiled as if a is non-const. TU B will be compiled as if a were a volatile const int, which means it won't cache the value of a at all. Thus, I'd expect it to work provided the linker doesn't notice and object to the mismatched types, because I don't immediately see how TU B could possibly emit code that goes wrong. However, my lack of imagination is not the same as guaranteed behavior.
AFAIK, there's nothing in the standard to say that volatile objects at file scope can't be stored in a completely different memory bank from other objects, that provides different instructions to read them. The implementation would still have to be capable of reading a normal object through, say, a volatile pointer, so suppose for example that the "normal" load instruction works on "special" objects, and it uses that when reading through a pointer to a volatile-qualified type. But if (as an optimization) the implementation emitted the special instruction for special objects, and the special instruction didn't work on normal objects, then boom. And I think that's the programmer's fault, although I confess I only invented this implementation 2 minutes ago so I can't be entirely confident that it conforms.
In the B translation unit, const would only prohibit modifying the a variable within the B translation unit itself.
Modifications of that value from outside (other translation units) will reflect on the value you see in B.
This is more of a linker issue than a language issue. The linker is free to frown upon the differing qualifications of the a symbol (if there is such information in the object files) when merging the compiled translation units.
Note, however, that if it's the other way around (const int a = 23 in A and extern int a in B), you would likely encounter a memory access violation in case of attempting to modify a from B, since a could be placed in a read-only area of the process, usually mapped directly from the .rodata section of the executable.
The declaration that has the initialization is the definition, so your object is indeed not a const qualified object and foo has all the rights to modify it.
In B you are providing access to that object that has the additional const qualification. Since the types (the const qualified version and the non-qualified version) have the same object representation, read access through that identifier is valid.
Your second printf, though, has a problem. Since you didn't qualify your B version of a as volatile you are not guaranteed to see the modification of a. The compiler is allowed to optimize and to reuse the previous value that it might have kept in a register.
Declaring it as const means that the instance is defined as const. You cannot access it from a not-const. Most compilers will not allow it, and the standard says it's not allowed either.
FWIW: In H&S5 is written (Section 4.4.3 Type Qualifiers, page 89):
"When used in a context that requires a value rather than a designator, the qualifiers are eliminated from the type." So the const only has an effect when someone tries to write something into the variable.
In this case, the printf's use a as an rvalue, and the added volatile (unnecessary IMHO) makes the program read the variable anew, so I would say, the program is required to produce the output the OP saw initially, on all platforms/compilers.
I'll look at the Standard, and add it if/when I find anything new.
EDIT: I couldn't find any definite solution to this question in the Standard (I used the latest draft for C1X), since all references to linker behavior concentrate on names being identical. Type qualifiers on external declarations do not seem to be covered.
Maybe we should forward this question to the C Standard Committee.