So I was reading The C Programming Language by Kernighan and Ritchie, and on page 39, Chapter 2: Types, Operators and Expressions,
the authors write:
The const declaration can also be used with array arguments, to indicate that the function does not change that array:
int strlen(const char[]);
The result is implementation-defined if an attempt is made to change a const.
I don't understand what it means. Would appreciate if anyone could simplify what he means by that.
"Implementation defined" simply means that it is up to the implementation what should happen. A difference from "undefined behavior" is that when it is "implementation defined", the behavior needs to be documented. Read more about that here: Undefined, unspecified and implementation-defined behavior
But you can change an object through a pointer to const if you cast away the const, provided the object itself was not defined const. This will print 42:
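#include <stdio.h>

void foo(const int *x)
{
    *(int *)x = 42; /* cast away const; fine because n below is not const */
}
int main(void)
{
    int n = 69;
    foo(&n);
    printf("%d\n", n); /* print the value of n, not its address */
}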
I wrote a related answer about const that you can read here: https://stackoverflow.com/a/62563330/6699433
Declaring function parameters as const indicates that the function should not change the value of those parameters.
Even though C passes arguments by value, a function that receives a pointer can still change the pointed-to value. If the parameter is declared as a pointer to const and the function attempts to modify the pointed-to value directly, the compiler will generate an error.
The following function will change the value pointed to by x:
void foo(int *x)
{
*x = 100;
}
In the following function, by marking the parameter as const, the function will not be able to change the value pointed to by x.
void foo(const int *x)
{
*x = 100; // Compiler generates an error
}
In C, even though it looks like you're passing an array when using the square brackets [], you're actually passing a pointer. So void foo(const int *x) would be the same as void foo(const int x[])
Summary
Kernighan and Ritchie are wrong; attempting to modify const objects is undefined, not implementation-defined.
This rule applies only to objects originally defined with const.
const on function parameters is advisory, not enforced. It is possible, and defined by the C standard, for a function to modify an object pointed to with a const parameter if that object was not defined with const.
Details
The quoted passage is wrong. Attempting to modify an object defined with const has undefined behavior, not implementation-defined behavior. And this applies only to objects defined with const, not to objects passed via const-qualified pointers, if those objects were not originally defined with const.
C 2018 6.7.3 7 says:
If an attempt is made to modify an object defined with a const-qualified type through use of an lvalue with non-const-qualified type, the behavior is undefined.
The same wording appears in C 1990 6.5.3.
“Undefined” means the C standard does not impose any requirements on the behavior (C 2018 3.4.3). This is different from “implementation-defined,” which means the C implementation must document how a choice among possibilities is made (C 2018 3.4.1).
Note that that rule applies only to objects defined with const. 6.7 5 tells us that, for object identifiers, a definition is a declaration that causes storage to be reserved for the object. If we declare int x; inside a function, that will cause storage to be reserved for x, so it is a definition. However, the statement int strlen(const char[]); merely declares a function and its parameter type. The actual parameter is not declared because there is no name for it. If we consider the actual function definition, such as:
int strlen(const char s[])
{
…
}
then this function definition includes a declaration of the parameter s. And it does define s; storage for the parameter itself will be reserved when the function executes. However, this s is only a pointer to some object that the caller passes the address of. So this is not a definition of that object.
So far, we know the rule in 6.7.3 7 tells us that modifying an object defined with const has undefined behavior. Are there any other rules about a function modifying an object it has received through a pointer with const? There are. The left operand of an assignment operator must be modifiable. C 2018 6.5.16 2 says:
An assignment operator shall have a modifiable lvalue as its left operand.
An lvalue qualified with const is not modifiable, per C 2018 6.3.2.1 1. This paragraph is a constraint in the C standard, which means a C implementation is required to diagnose violations. (So, again, this is not implementation-defined behavior. The C implementation must produce a message.) The ++ and -- operators, both pre- and post-, have similar constraints.
So, a function with a parameter const char s[] cannot directly modify *s or s[i], at least not without getting a diagnostic message. However, a program is allowed to remove const with a cast if the object was not originally defined with const. C 2018 6.3.2.3 2 says we can add const:
For any qualifier q, a pointer to a non-q-qualified type may be converted to a pointer to the q-qualified version of the type; the values stored in the original and converted pointers shall compare equal.
and then C 2018 6.3.2.3 7 says that, after we have done that, we can convert the const version back to the original type:
A pointer to an object type may be converted to a pointer to a different object type… when converted back again, the result shall compare equal to the original pointer.
What this means is that if a calling routine has:
int x = 3;
foo(&x);
printf("%d\n", x);
and foo is:
void foo(const int *p)
{
* (int *) p = 4;
}
then this is allowed and defined by the C standard. The function foo removes const and modifies the object it points to, and “4” will be printed.
A lesson here is that const in function parameters is advisory, not enforced by C. It serves two purposes:
const on a function parameter is generally an indication for humans that the function will not modify the pointed-to object through that parameter. (However, there are circumstances, not discussed here, where this indication does not hold.)
The compiler will enforce a rule that the pointed-to object cannot be modified through the const type. This prevents inadvertent errors where a typographical error might result in an unwanted assignment to a const object. However, a function is permitted to explicitly remove const and then attempt to modify the object.
That material appears a bit obsolete.
The strlen standard library function returns size_t nowadays, but anyway:
int strlen(const char[]);, which is the same as int strlen(const char*);, means that
strlen can accept either a char* or a const char* without needing a cast.
If you pass a pointer to a const variable and the function attempts to modify it (by casting away the const, as in void modify(char const *X){ *(char*)X='x'; }), the behavior is undefined (not implementation-defined, at least in newer C versions).
Undefined behavior means you lose all guarantees about the program.
Implementation defined means the code will do something specific (e.g., abort, segfault, do nothing) in a consistent, predictable fashion (given a platform) and that it won't affect the integrity of the rest of the program.
In older simple-minded compilers, a static-lifetime const-variable would be placed in read-only memory if the platform has read-only memory or in writable memory otherwise. Then you either get a segfault on an attempt to modify it or you won't. That would be implementation-defined behavior.
Newer compilers may additionally attempt far-reaching optimizations based on the const annotations, so it's hard to tell what effects such a modification attempt may have. The fact that such a modification attempt is undefined behavior in modern C allows compilers to make such optimizations (i.e., optimizations which yield hardly predictable results if the rule is broken).
const char[] as parameter type of a function is in fact equal to a pointer to const char (type const char *). The array notation was invented for convenience when passing pointer to arrays.
Related posts:
C pointer notation compared to array notation: When passing to function
Passing an array as an argument to a function in C
With declaring const char p_a[] you declare p_a as a pointer to const char.
The const qualifier associated to char tells the compiler that the char object/array pointed to in the caller shouldn't be modified.
Any attempt to modify this object/array with a non-const pointer/lvalue invokes undefined behavior:
"If an attempt is made to modify an object defined with a const-qualified type through use of an lvalue with non-const-qualified type, the behavior is undefined."
Source: C18, 6.7.3/7
and a compiler usually will warn you when doing so.
strlen does not need to and also should not modify a string to which a pointer is passed as argument. For the sake of not giving any chance to accidentally modifying the string in the caller, the pointer is classified as pointer to const char.
The const qualifier adds an extra layer of security.
Related
I have the following piece of code:
void TestFunc(const void * const Var1, const float Var2)
{
*(float*)Var1 = Var2;
}
It looks like I am changing the value of the const object the const pointer points to (thanks sharptooth), which should not be allowed. Fact is, none of the compilers I tried issued a warning. How is this possible?
As others mentioned, the cast removes the 'constness' of the destination as far as the expression is concerned. When you use a cast the compiler treats the expression according to the cast - as long as the cast itself is valid (and C-style casts are pretty much the big hammer). This is why you don't get an error or warning. You're essentially telling the compiler, "be quiet, I know what I'm doing, this is how you should treat things". In fact, casts are probably the #1 way for programmers to get the compiler to stop issuing warnings.
Your assignment expression may or may not be undefined behavior. It is permitted to cast away constness if the object actually pointed to is not const.
However, if the object pointed to is const, then you have undefined behavior.
void TestFunc(const void * const Var1, const float Var2)
{
*(float*)Var1 = Var2;
}
int
main(void)
{
float x = 1.0;
const float y = 2.0;
TestFunc( &x, -1.0); // well defined (if not particularly great style)
TestFunc( &y, -2.0); // undefined behavior
return 0;
}
You're treading dangerous waters...
In general (I'm sure there are exceptions), casting so that expressions treat objects as they really are is supported, well-defined behavior in C/C++.
This particular behavior is covered in the standards mostly by statements that modifying a const object through a cast (or something) that removes the const qualifier is undefined. The inference is that doing the same for a non-const object is not undefined. An example given in the C++ standard makes this clear.
C90 6.5.3 - Type Qualifiers (C99 6.7.3):
If an attempt is made to modify an object defined with a const-qualified type through use
of an lvalue with non-const-qualified type, the behavior is undefined.
C++ 7.1.5.1 The cv-qualifiers
A pointer or reference to a cv-qualified type need not actually point or refer to a cv-qualified object, but it is treated as if it does; a const-qualified access path cannot be used to modify an object even if the object referenced is a non-const object and can be modified through some other access path. [Note: cv-qualifiers are
supported by the type system so that they cannot be subverted without casting (5.2.11). ]
Except that any class member declared mutable (7.1.1) can be modified, any attempt to modify a const
object during its lifetime (3.8) results in undefined behavior.
...
[Example:
...
int i = 2; //not cv-qualified
const int* cip; //pointer to const int
cip = &i; //OK: cv-qualified access path to unqualified
*cip = 4; //ill-formed: attempt to modify through ptr to const
int* ip;
ip = const_cast<int*>(cip); //cast needed to convert const int* to int*
*ip = 4; //defined: *ip points to i, a non-const object
const int* ciq = new const int (3); //initialized as required
int* iq = const_cast<int*>(ciq); //cast required
*iq = 4; //undefined: modifies a const object
You change the object the pointer points to, not the pointer value.
C-style cast acts like a const_cast and removes the const modifier off the pointer. The compiler now has nothing to moan about.
The cast is legal, but the behaviour is undefined if the object pointed to was itself defined as const.
What does the following function signature define in C?!
#include <stdio.h>
#include <string.h>
void f(int a[const volatile static 2])
{
(void)a;
}
int main() {
int b[1];
f(b);
}
https://godbolt.org/z/6qPxaM1vM
I don't get the meaning of const / volatile / static at this place, but it seems to compile, so I guess it has a meaning?
Thanks
This is a mildly useful feature introduced in C99. From C17 6.7.6.3/7:
A declaration of a parameter as “array of type” shall be adjusted to “qualified pointer to type”, where
the type qualifiers (if any) are those specified within the [ and ] of the array type derivation. If the
keyword static also appears within the [ and ] of the array type derivation, then for each call to
the function, the value of the corresponding actual argument shall provide access to the first element
of an array with at least as many elements as specified by the size expression.
In this case, the qualifiers const and volatile mean that the array parameter is adjusted to a qualified pointer, int * const volatile a. Since this qualified type is local to the function, it means very little to the caller: the argument is passed by value after lvalue conversion, so it can still be an unqualified pointer type.
The static is ever so slightly more useful, as it supposedly enables some compile-time size checks by the compiler, though in practice it seems that the mainstream compilers (currently) only manage to check that the pointer is not null. For example f(0) on clang gives:
warning: null passed to a callee that requires a non-null argument [-Wnonnull]
Curiously, f(0) on gcc 11.1 and beyond says:
warning: argument 1 to 'int[static 8]' is null where non-null expected [-Wnonnull]
No idea what the 8 is coming from, I guess it's a typo/minor compiler bug (the decayed pointer is 8 bytes large).
A parameter of array type is automatically adjusted to a pointer type.
Therefore a parameter declared as:
int A[3]
is transformed to:
int *A
However, with the array notation there is no intuitive place to add a qualifier to the variable a itself (as opposed to the data pointed to by a).
Therefore the C standard allows those qualifiers to be placed between the brackets.
Thus:
void f(int a[const volatile restrict 2])
is actually:
void f(int * const volatile restrict a)
The size (2 in above example) is usually ignored. The exception is when static keyword is used. It provides a hint for a compiler that at least elements at addresses a + 0 to a + size - 1 are valid.
This hint in theory should improve optimization by simplifying vectorization. However, AFAIK it is ignored by major compilers.
Your code:
int b[1];
f(b);
is triggering UB because only element at address b + 0 is valid while the hint requires b+0 and b+1 to be valid. Better compilers/sanitizers should detect that.
The static is also useful for self documentation and detection of errors like telling that at least n elements pointed by a pointer must be valid:
void fun(int n, int arr[static n])
or even telling that the pointer is never NULL:
void fun(int ptr[static 1]);
Moreover, the syntax int buf[static n] is a good visual hint that something is actually not an array. It helps to avoid a common error when trying to obtain the size of an "array" with the sizeof buf syntax.
EDIT
As stated in the comments, the word "hint" may be a bit misleading,
because it could be interpreted to mean that violating the "hint" is not
an error, though it may result in some non-optimality (like performance degradation).
Actually, it is rather a requirement, and violating it results in undefined behavior.
#include <stdio.h>
int main() {
const char a[99]="hello-hi";
printf("%s\n",a);
char *p=strtok(a,"-");
printf("%s",a);
return 0;
}
output:
hello-hi
hello
why a is modified here?? I made it const but still why it is modified??
The definition of const is not “the computer will prevent you from modifying the object”. The definition of const in C 2018 6.7.3 7 is:
If an attempt is made to modify an object defined with a const-qualified type through use of an lvalue with non-const-qualified type, the behavior is undefined…
So defining an object with const does not create a promise from the computer to you that the object will not be modified. It is actually a promise in the other direction: It is a promise from you to the computer that you will not attempt to modify the object. This gives the compiler permission to put the object in memory that is marked read-only.
If you violate the promise, the behavior is not defined by the C standard. The object might be in read-only memory, and attempting to modify it will cause a trap and alert you to a bug in your program. Or the object might be in modifiable memory, and attempting to modify it will modify it. Or, with program optimization, other behaviors may occur.
The C standard does give you some help with this. When you pass a const char * to strtok, which expects a char *, the compiler is required to issue a diagnostic message. Pay attention to the warnings and errors the compiler reports and use them to fix your program. Preferably, use a compiler switch to elevate warnings to errors. (-Werror with GCC or Clang, /WX with Microsoft Visual C++.)
1. A const char array does not have to be in read-only memory. Attempting to modify it is undefined behaviour. Anything may happen: a segfault, a modification, virus activation, a bank account transfer, a disk erase, etc.
2. Read the compiler warnings:
<source>:9:20: warning: passing 'const char [99]' to parameter of type 'char *' discards qualifiers [-Wincompatible-pointer-types-discards-qualifiers]
char *p=strtok(a,"-");
3. int main() should be int main(void).
why a is modified here??
strtok modifies the string. Modifying const object results in undefined behaviour. The behaviour of the example program is undefined.
In fact, the const array doesn't convert to pointer to non-const (without cast), so the program is ill-formed.
Other problems:
You don't include the header that declares strtok.
int main() is non-standard in C.
From the C Standard (6.7.3 Type qualifiers)
6 If an attempt is made to modify an object defined with a
const-qualified type through use of an lvalue with non-const-qualified
type, the behavior is undefined.
The compiler should issue at least a warning that the call of strtok discards the qualifier const from the passed argument expression.
In this call of strtok
char *p=strtok(a,"-");
the array a is implicitly converted to pointer of the type const char * to its first element while the corresponding parameter of the function does not have the qualifier const.
From this article.
Another use for declaring a variable as register and const is to inhibit any non-local change of that variable, even through taking its address and then casting the pointer. Even if you think that you yourself would never do this, once you pass a pointer (even with a const attribute) to some other function, you can never be sure that this might be malicious and change the variable under your feet.
I don't understand how we can modify the value of a const variable by a pointer. Isn't it undefined behavior?
const int a = 81;
int *p = (int *)&a;
*p = 42; /* not allowed */
The author's point is that declaring a variable with register storage class prevents you from taking its address, so it can not be passed to a function that might change its value by casting away const.
void bad_func(const int *p) {
int *q = (int *) p; // casting away const
*q = 42; // potential undefined behaviour
}
void my_func() {
int i = 4;
const int j = 5;
register const int k = 6;
bad_func(&i); // ugly but allowed
bad_func(&j); // oops - undefined behaviour invoked
bad_func(&k); // constraint violation; diagnostic required
}
By changing potential UB into a constraint violation, a diagnostic becomes required and the error is (required to be) diagnosed at compile time:
c11
5.1.1.3 Diagnostics
1 - A conforming implementation shall produce at least one diagnostic message [...] if a preprocessing translation unit or translation unit
contains a violation of any syntax rule or constraint, even if the behavior is also explicitly
specified as undefined or implementation-defined.
6.5.3.2 Address and indirection operators
Constraints
1 - The operand of the unary & operator shall be [...] an lvalue that designates an object that [...] is
not declared with the register storage-class specifier.
Note that array-to-pointer decay on a register array object is undefined behaviour that is not required to be diagnosed (6.3.2.1:3).
Note also that taking the address of a register lvalue is allowed in C++, where register is just an optimiser hint (and a deprecated one at that).
Can we modify the value of a const variable?
Yes, You can modify a const variable through various means: Pointer hackery, casts etc...
Do Read next Q!!
Is it valid code to modify the value of a const variable?
No! What that gives you is Undefined Behavior.
Technically, your code example has an Undefined Behavior.
The program is not adhering to the C standard once you modify the const, and hence may give any result.
Note that undefined behavior does not mean that the compiler needs to report the violation as a diagnostic. In this case your code uses pointer hackery to modify a const, and the compiler is not required to provide a diagnostic for it.
The C99 standard 3.4.3 says:
Undefined behavior: behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements.
NOTE Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
Your code compiles, but it has undefined behavior.
The author's point is to use const and register so that the code no longer compiles:
const int a = 81;
int *p = (int *)&a; /* no compile error */
*p = 42; /* UB */
register const int b = 81;
int *q = (int *)&b; /* does not compile */
The code fragment indeed invokes undefined behavior.
I'm not really sure what the author's point is: in order to not let "foreign code" change the value of the variable you make it const so that... UB is invoked instead? How is that preferable? To be frank, it does not make sense.
I think the author is also talking about this case, which is a misunderstanding of const:
int a = 1;
int* const a_ptr = (int* const)&a; //cast not relevant
void function(int* const p){
int* malicious = (int*)p;
*malicious = 2;
}
The variable itself is not constant, but the pointer is. The malicious code can convert to a regular pointer and legally modify the variable below.
I don't understand how we can modify the value of a const variable by a pointer. Isn't it undefined behavior?
Yes, it is undefined behavior:
Quote from C18, 6.7.3/7:
"If an attempt is made to modify an object defined with a const-qualified type through use of an lvalue with non-const-qualified type, the behavior is undefined."
But just because the behavior is undefined does not mean you cannot do it. As far as I know, most of the time when your program contains undefined behavior, the compiler will not warn you, which is a big problem.
Fortunately, in this case, when compiling for example:
#include <stdio.h>
int main(){
const int a = 25;
int *p = &a;
*p = 26;
printf("a = %d",a);
}
the compiler will throw a warning:
initialization discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers] (gcc)
or
warning: initializing 'int *' with an expression of type 'const int *' discards qualifiers [-Wincompatible-pointer-types-discards-qualifiers] (clang)
But despite the fact that the code contains undefined behavior, and you can never be sure what it will print on any execution, the program still compiles (unless you use the -Werror option, of course).
Can we modify the value of a const variable?
So, yes - unfortunately. One can actually modify a const object, but you never ever should do that, neither intentionally nor by accident.
The approach of using the register keyword might be effective because a variable marked register cannot have its address taken. This means you cannot assign its address to a pointer, nor pass it to a function as an argument of the respective pointer type.
Unlike C++, C has no notion of a const_cast. That is, there is no valid way to convert a const-qualified pointer to an unqualified pointer:
void const * p;
void * q = p; // not good
First off: Is this cast actually undefined behaviour?
In any event, GCC warns about this. To make "clean" code that requires a const-cast (i.e. where I can guarantee that I won't mutate the contents, but all I have is a mutable pointer), I have seen the following "conversion" trick:
typedef union constcaster_
{
void * mp;
void const * cp;
} constcaster;
Usage: u.cp = p; q = u.mp;.
What are the C language rules on casting away constness through such a union? My knowledge of C is only very patchy, but I've heard that C is far more lenient about union access than C++, so while I have a bad feeling about this construction, I would like an argument from the standard (C99 I suppose, though if this has changed in C11 it'll be good to know).
It's implementation defined, see C99 6.5.2.3/5:
if the value of a member of a union object is used when the most
recent store to the object was to a different member, the behavior is
implementation-defined.
Update: @AaronMcDaid commented that this might be well-defined after all.
The standard specifies the following in 6.2.5/27:
Similarly, pointers to qualified or unqualified versions of compatible
types shall have the same representation and alignment
requirements.27)
27) The same representation and alignment requirements are meant to
imply interchangeability as arguments to functions, return values from
functions, and members of unions.
And (6.7.2.1/14):
A pointer to a union object, suitably converted, points to each of its
members (or if a member is a bitfield, then to the unit in which it
resides), and vice versa.
One might conclude that, in this particular case, there is only room for exactly one way to access the elements in the union.
My understanding is that the UB can arise only if you try to modify a const-declared object.
So the following code is not UB:
int x = 0;
const int *cp = &x;
int *p = (int*)cp;
*p = 1; /* OK: x is not a const object */
But this is UB:
const int cx = 0;
const int *cp = &cx;
int *p = (int*)cp;
*p = 1; /* UB: cx is const */
The use of a union instead of a cast should not make any difference here.
From the C99 specs (6.7.3 Type qualifiers):
If an attempt is made to modify an object defined with a const-qualified type through use
of an lvalue with non-const-qualified type, the behavior is undefined.
The initialization certainly won't cause UB. The conversion between qualified pointer types is explicitly allowed in §6.3.2.3/2 (n1570 (C11)). It's the use of the content through that pointer afterwards that causes UB (see @rodrigo's answer).
However, you need an explicit cast to convert a const void* to a void*, because the constraint of simple assignment still requires all qualifiers on the RHS to appear on the LHS.
§6.7.9/11: ... The initial value of the object is that of the expression (after conversion); the same type constraints and conversions as for simple assignment apply, taking the type of the scalar to be the unqualified version of its declared type.
§6.5.16.1/1: (Simple Assignment / Constraints)
... both operands are
pointers to qualified or unqualified versions of compatible types, and the type pointed
to by the left has all the qualifiers of the type pointed to by the right;
... one operand is a pointer
to an object type, and the other is a pointer to a qualified or unqualified version of
void, and the type pointed to by the left has all the qualifiers of the type pointed to
by the right;
I don't know why gcc just gives a warning though.
And for the union trick, yes it's not UB, but still the result is probably unspecified.
§6.5.2.3/3 fn 95: If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning"). This might be a trap representation.
§6.2.6.1/7: When a value is stored in a member of an object of union type, the bytes of the object representation that do not correspond to that member but do correspond to other members take unspecified values. (* Note: see also §6.5.2.3/6 for an exception, but it doesn't apply here)
The corresponding sections in n1124 (C99) are
C11 §6.3.2.3/2 = C99 §6.3.2.3/2
C11 §6.7.9/11 = C99 §6.7.8/11
C11 §6.5.16.1/1 = C99 §6.5.16.1/1
C11 §6.5.2.3/3 fn 95 = missing ("type punning" doesn't appear in C99)
C11 §6.2.6.1/7 = C99 §6.2.6.1/7
Don't cast it at all. It's a pointer to const, which means that attempting to modify the data is not allowed and in many implementations will cause the program to crash if the pointer points to unmodifiable memory. Even if you know the memory can be modified, there may be other pointers to it that do not expect it to change, e.g. if it is part of the storage of a logically immutable string.
The warning is there for good reason.
If you need to modify the content of a const pointer, the portable safe way to do it is first to copy the memory it points to and then modify that.