#include <stdio.h>
int main() {
const char a[99]="hello-hi";
printf("%s\n",a);
char *p=strtok(a,"-");
printf("%s",a);
return 0;
}
output:
hello-hi
hello
why a is modified here?? I made it const but still why it is modified??
The definition of const is not “the computer will prevent you from modifying the object”. The definition of const in C 2018 6.7.3 7 is:
If an attempt is made to modify an object defined with a const-qualified type through use of an lvalue with non-const-qualified type, the behavior is undefined…
So defining an object with const does not create a promise from the computer to you that the object will not be modified. It is actually a promise in the other direction: It is a promise from you to the computer that you will not attempt to modify the object. This gives the compiler permission to put the object in memory that is marked read-only.
If you violate the promise, the behavior is not defined by the C standard. The object might be in read-only memory, and attempting to modify it will cause a trap and alert you to a bug in your program. Or the object might be in modifiable memory, and attempting to modify it will modify it. Or, with program optimization, other behaviors may occur.
The C standard does give you some help with this. When you pass a const char * to strtok, which expects a char *, the compiler is required to issue a diagnostic message. Pay attention to the warnings and errors the compiler reports and use them to fix your program. Preferable, use a compiler switch to elevate warnings to errors. (-Werror with GCC or Clang, /WX with Microsoft Visual C++.)
1. const char array does not have to be in read-only memory. Attempt to modify it is an Undefined Behaviour. Anything may happen segfault, modification, virus activation, bank account transfer, disk erase etc etc.
Read the compiler warnings
<source>:9:20: warning: passing 'const char [99]' to parameter of type 'char *' discards qualifiers [-Wincompatible-pointer-types-discards-qualifiers]
char *p=strtok(a,"-");
int main() should be int main(void)
why a is modified here??
strtok modifies the string. Modifying const object results in undefined behaviour. The behaviour of the example program is undefined.
In fact, the const array doesn't convert to pointer to non-const (without cast), so the program is ill-formed.
Other problems:
You don't include the header that declares strtok.
int main() is non-standard in C.
From the C Standard (6.7.3 Type qualifiers)
6 If an attempt is made to modify an object defined with a
const-qualified type through use of an lvalue with non-const-qualified
type, the behavior is undefined.
The compiler should issue at least a warning that the call of strtok discards the qualifier const from the passed argument expression.
In this call of strtok
char *p=strtok(a,"-");
the array a is implicitly converted to pointer of the type const char * to its first element while the corresponding parameter of the function does not have the qualifier const.
Related
So I was reading the book Language C by Kernighan Ritchie and on page 39, Chapter 2: Types, Operators and Expressions
the author writes:
The const declaration can also be used with array arguments, to indicate that the function does not change that array:
int strlen(const char[]);
The result is implementation-defined if an attempt is made to change a const.
I don't understand what it means. Would appreciate if anyone could simplify what he means by that.
"Implementation defined" simply means that it is up to the implementation what should happen. A difference from "undefined behavior" is that when it is "implementation defined", the behavior needs to be documented. Read more about that here: Undefined, unspecified and implementation-defined behavior
But you can change things via a const pointer if you cast it to non-const. This will print 42;
void foo(const int *x)
{
*(int *)x = 42;
}
int main(void)
{
int n = 69;
foo(&n);
printf("%d\n", &n);
}
I wrote a related answer about const that you can read here: https://stackoverflow.com/a/62563330/6699433
Declaring function parameters as const indicates that the function should not change the value of those parameters.
Even though C passes arguments by value, pointed-to values are susceptible to change. By declaring the function parameter as const, if the function attempts to modify the pointed-to value, the compiler will generate an error.
The following function will change the value pointed to by x:
void foo(int *x)
{
*x = 100;
}
In the following function, by marking the parameter as const, the function will not be able to change the value pointed to by x.
void foo(const int *x)
{
*x = 100; // Compiler generates an error
}
In C, even though it looks like you're passing an array when using the square brackets [], you're actually passing a pointer. So void foo(const int *x) would be the same as void foo(const int x[])
Summary
Kernighan and Ritchie are wrong; attempting to modify const objects is undefined, not implementation-defined.
This rules applies only to objects originally defined with const.
const on function parameters is advisory, not enforced. It is possible, and defined by the C standard, for a function to modify an object pointed to with a const parameter if that object was not defined with const.
Details
The quoted passage is wrong. Attempting to modify an object defined with const has undefined behavior, not implementation-defined behavior. And this applies only to objects defined with const, not to objects passed via const-qualified pointers, if those objects were not originally defined with const.
C 2018 6.7.3 7 says:
If an attempt is made to modify an object defined with a const-qualified type through use of an lvalue with non-const-qualified type, the behavior is undefined.
The same wording appears in C 1990 6.5.3.
“Undefined“ means the C standard does not impose any requirements on the behavior (C 2018 3.4.3). This is different from “implementation-defined,” which means the C implementation must document how a choice among possibilities is made (C 2018 3.4.1).
Note that that rule applies only to objects defined with const. 6.7 5 tells us that, for object identifiers, a definition is a declaration that causes storage to be reserved for the object. If we declare int x; inside a function, that will cause storage to be reserved for x, so it is a definition. However, the statement int strlen(const char[]); merely declares a function and its parameter type. The actual parameter is not declared because there is no name for it. If we consider the actual function definition, such as:
int strlen(const char s[])
{
…
}
then this function definition includes a declaration of the parameter s. And it does define s; storage for the parameter itself will be reserved when the function executes. However, this s is only a pointer to some object that the caller passes the address of. So this is not a definition of that object.
So far, we know the rule in 6.7.3 7 tells us that modifying an object defined with const has undefined behavior. Are there any other rules about a function modifying an object it has received through a pointer with const? There are. The left operand of an assignment operator must be modifiable. C 2018 6.5.16 2 says:
An assignment operator shall have a modifiable lvalue as its left operand.
An lvalue qualified with const is not modifiable, per C 2018 6.3.2.1 1. This paragraph is a constraint in the C standard, which means a C implementation is required to diagnose violations. (So, again, this is not implementation-defined behavior. The C implementation must produce a message.) The ++ and -- operators, both pre- and post-, have similar constraints.
So, a function with a parameter const char s[] cannot directly modify *s or s[i], at least not without getting a diagnostic message. However, a program is allowed to remove const in a conversion operator if it was not originally present. C 2018 6.3.2.3 2 says we can add const:
For any qualifier q, a pointer to a non-q-qualified type may be converted to a pointer to the q-qualified version of the type; the values stored in the original and converted pointers shall compare equal.
and then C 2018 6.3.2.3 7 says that, after we have done that, we can convert the const version back to the original type:
A pointer to an object type may be converted to a pointer to a different object type… when converted back again, the result shall compare equal to the original pointer.
What this means is that if a calling routine has:
int x = 3;
foo(&x);
printf("%d\n", x);
and foo is:
void foo(const int *p)
{
* (int *) p = 4;
}
then this is allowed and defined by the C standard. The function foo removes const and modifies the object it points to, and “4” will be printed.
A lesson here is that const in function parameters is advisory, not enforced by C. It serves two purposes:
const on a function parameter is generally an indication for humans that the function will not modify the pointed-to object through that parameter. (However, there are circumstances, not discussed here, where this indication does not hold.)
The compiler will enforce a rule that the pointed-to object cannot be modified through the const type. This prevents inadvertent errors where a typographical error might result in an unwanted assignment to a const object. However, a function is permitted to explicitly remove const and then attempt to modify the object.
That material appears a bit obsolete.
The strlen standard library function returns size_t nowadays, but anyway:
int strlen(const char[]);, which is the same as, int strlen(const char*); means
strlen can accept either a char* or const char* without needing a cast.
If you pass a pointer to a non-const variable and the function attempts to modify it (by casting away the const as in void modify(char const *X){ *(char*)X='x'; }) the behavior is ̶i̶m̶p̶l̶e̶m̶e̶n̶t̶a̶t̶i̶o̶n̶-̶d̶e̶f̶i̶n̶e̶d̶ undefined (it is undefined, not implementation defined in newer C versions).
Undefined behavior means you lose all guarantees about the program.
Implementation defined means the code will do something specific (e.g., abort, segfault, do nothing) in a consistent, predictable fashion (given a platform) and that it that won't affect the integrity of the rest of the program.
In older simple-minded compilers, a static-lifetime const-variable would be placed in read-only memory if the platform has read-only memory or in writable memory otherwise. Then you either get a segfault on an attempt to modify it or you won't. That would be implementation-defined behavior.
Newer compilers may additionally attempt to do far-reaching optimization based on the const annotations, so it's hard to tell what effects an attempt at such a modification attempt may have. The fact that such a modification attempt is undefined behavior in modern C allows compilers to make such optimizations (i.e., optimizations which yield hardly predictable results if the rule is broken).
const char[] as parameter type of a function is in fact equal to a pointer to const char (type const char *). The array notation was invented for convenience when passing pointer to arrays.
Related posts:
C pointer notation compared to array notation: When passing to function
Passing an array as an argument to a function in C
With declaring const char p_a[] you declare p_a as a pointer to const char.
The const qualifier associated to char tells the compiler that the char object/array pointed to in the caller shouldn't be modified.
Any attempt to modify this object/array with a non-const pointer/lvalue invokes undefined behavior:
"If an attempt is made to modify an object defined with a const-qualified type through use of an lvalue with non-const-qualified type, the behavior is undefined."
Source: C18, 6.7.3/7
and a compiler usually will warn you when doing so.
strlen does not need to and also should not modify a string to which a pointer is passed as argument. For the sake of not giving any chance to accidentally modifying the string in the caller, the pointer is classified as pointer to const char.
The const qualifier adds an extra layer of security.
In the C89 standard, I found the following section:
3.2.2.1 Lvalues and function designators
Except when it is the operand of the sizeof operator, the unary & operator, the ++ operator, the -- operator, or the left operand of the . operator or an assignment operator, an lvalue that does not have array type is converted to the value stored in the designated object (and is no longer an lvalue). If the lvalue has qualified type, the value has the unqualified version of the type of the lvalue; otherwise the value has the type of the lvalue. If the lvalue has an incomplete type and does not have array type, the behavior is undefined.
If I read it correctly, it allows us to create an lvalue and applies some operators on it, which compiles and can cause undefined behavior during runtime.
Problem is that, I can't think of an example of "an lvalue with incomplete type" which can pass compiler's semantic check and triggers undefined behavior.
Consider that an lvalue is
An lvalue is an expression (with an object type or an incomplete type other than void) that designates an object.
and that incomplete type is
Types are partitioned into object types (types that describe objects), function types (types that describe functions), and incomplete types (types that describe objects but lack information needed to determine their sizes).
A failed program I tried:
struct i_am_incomplete;
int main(void)
{
struct i_am_incomplete *p;
*(p + 1);
return 0;
}
and got the following error:
error: arithmetic on a pointer to an incomplete type 'struct i_am_incomplete'
*(p + 1);
~ ^
Anyone can think of an example on this ? An example of "an lvalue with incomplete type" which can pass compiler's semantic check and triggers undefined behavior.
UPDATE:
As #algrid said in the answer, I misunderstood undefined behavior, which contains compile error as an option.
Maybe I'm splitting hairs, I still wonder the underlying motivation here to prefer undefined behavior over disallowing an lvalue to have an incomplete type.
I believe this program demonstrates the case:
struct S;
struct S *s, *f();
int main(void)
{
s = f();
if ( 0 )
*s; // here
}
struct S { int x; };
struct S *f() { static struct S y; return &y; }
On the marked line, *s is an lvalue of incomplete type, and it does not fall under any of the "Except..." cases in your quote of 3.2.2.1 (which is 6.3.2.1/2 in the current standard). Therefore it is undefined behaviour.
I tried my program in gcc and clang and they both rejected it with the error that a pointer to incomplete type cannot be dereferenced; but I cannot find anywhere in the Standard which would make that a constraint violation, so I believe the compilers are incorrect to reject the program. Or possibly the standard is defective by omitting such a constraint, which would make sense.
(Since the code is inside an if(0), that means the compiler cannot reject it merely on the basis of it being undefined behaviour).
Some build systems may have been designed in a way would allow code like:
extern struct foo x;
extern use_foo(struct foo x); // Pass by value
...
use_foo(x);
to be processed successfully without the compiler having to know or care
about the actual representation of struct foo [for example, some systems may process pass-by-value by having the caller pass the address of an object and requiring the called function to make a copy if it's going to modify it].
Such a facility may be useful on systems that could support it, and I don't think the authors of the Standard wanted to imply that code which used that feature was "broken", but they also didn't want to mandate that all C implementations support such a feature. Making the behavior undefined would allow implementations to support it when practical, without requiring that they do so.
"Undefined behavior" term includes compilation error as an option. From the C89 standard:
Undefined behavior - behavior, upon use of a nonportable or erroneous program construct, of erroneous data, or of indeterminately-valued objects, for which the Standard imposes no requirements. Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
As you can see "terminating a translation" is ok.
In this case I believe the compilation error you get for you sample code is an example of "undefined behavior" implemented as compile time error.
Sure, array types can be that:
extern double A[];
...
A[0] = 1; // lvalue conversion of A
This has well defined behavior, even if the definition of A is not visible to the compiler. So inside this TU the array type is never completed.
From this article.
Another use for declaring a variable as register and const is to inhibit any non-local change of that variable, even trough taking its address and then casting the pointer. Even if you think that you yourself would never do this, once you pass a pointer (even with a const attribute) to some other function, you can never be sure that this might be malicious and change the variable under your feet.
I don't understand how we can modify the value of a const variable by a pointer. Isn't it undefined behavior?
const int a = 81;
int *p = (int *)&a;
*p = 42; /* not allowed */
The author's point is that declaring a variable with register storage class prevents you from taking its address, so it can not be passed to a function that might change its value by casting away const.
void bad_func(const int *p) {
int *q = (int *) p; // casting away const
*q = 42; // potential undefined behaviour
}
void my_func() {
int i = 4;
const int j = 5;
register const int k = 6;
bad_func(&i); // ugly but allowed
bad_func(&j); // oops - undefined behaviour invoked
bad_func(&k); // constraint violation; diagnostic required
}
By changing potential UB into a constraint violation, a diagnostic becomes required and the error is (required to be) diagnosed at compile time:
c11
5.1.1.3 Diagnostics
1 - A conforming implementation shall produce at least one diagnostic message [...] if a preprocessing translation unit or translation unit
contains a violation of any syntax rule or constraint, even if the behavior is also explicitly
specified as undefined or implementation-defined.
6.5.3.2 Address and indirection operators
Constraints
1 - The operand of the unary & operator shall be [...] an lvalue that designates an object that [...] is
not declared with the register storage-class specifier.
Note that array-to-pointer decay on a register array object is undefined behaviour that is not required to be diagnosed (6.3.2.1:3).
Note also that taking the address of a register lvalue is allowed in C++, where register is just an optimiser hint (and a deprecated one at that).
Can we modify the value of a const variable?
Yes, You can modify a const variable through various means: Pointer hackery, casts etc...
Do Read next Q!!
Is it valid code to modify the value of a const variable?
No! What that gives you is Undefined Behavior.
Technically, your code example has an Undefined Behavior.
The program is not adhering to c standard once you modify the const and hence may give any result.
Note that an Undefined Behavior does not mean that the compiler needs to report the violation as an diagnostic. In this case your code uses pointer hackery to modify a const and the compiler is not needed to provide a diagnostic for it.
The C99 standard 3.4.3 says:
Undefined behavior: behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements.
NOTE Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
Your code compiles, but it has undefined behavior.
The author's point is to use const and register so that the code no longer compiles:
const int a = 81;
int *p = (int *)&a; /* no compile error */
*p = 42; /* UB */
register const int b = 81;
int *q = (int *)&b; /* does not compile */
The code fragment indeed invokes undefined behavior.
I 'm not really sure what the author's point is: in order to not let "foreign code" change the value of the variable you make it const so that... UB is invoked instead? How is that preferable? To be frank, it does not make sense.
I think the author is also talking about this case, which is a misunderstanding of const:
int a = 1;
int* const a_ptr = (int* const)&a; //cast not relevant
int function(int* const p){
int* malicious = (int*)p;
*malicious = 2;
}
The variable itself is not constant, but the pointer is. The malicious code can convert to a regular pointer and legally modify the variable below.
I don't understand how we can modify the value of a const variable by a pointer. Isn't it undefined behavior?
Yes, it is undefined behavior:
Quote from C18, 6.7.3/7:
"If an attempt is made to modify an object defined with a const-qualified type through use of an lvalue with non-const-qualified type, the behavior is undefined."
But just because the behavior is undefined, it does not mean you potentially can not do that. As far as I can think of, it is indeed the case, that the compiler will, most of the times your program contains any kind of undefined behavior, not warn you - which is a big problem.
Fortunately in this case, when compiling f.e.:
#include <stdio.h>
int main(){
const int a = 25;
int *p = &a;
*p = 26;
printf("a = %d",a);
}
the compiler will throw a warning:
initialization discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers] (gcc)
or
warning: initializing 'int *' with an expression of type 'const int *' discards qualifiers [-Wincompatible-pointer-types-discards-qualifiers] (clang)
but despite that the code contains parts which cause undefined behavior and you can never be sure what it will print on any execution, you get that malicious program compiled (without -Werror option of course).
Can we modify the value of a const variable?
So, yes - unfortunately. One can actually modify a const object, but you never ever should do that, neither intentionally nor by accident.
The method to using register keyword might be efficient because the address of a register marked variable can´t have its address taken - means you cannot assign a pointer with the address of the relative variable nor pass it to a function as argument of the respective pointer type.
I am trying to build the M-SIM architecture simulator, but when I run the make utility, gcc reports this error (it is not even a warning)
note: expected 'char *' but argument is of type 'const char *'
Since when this is considered an error. Is there any flags that can bypass this check?
This is an error because passing a const char* argument to a function that takes a char* parameter violates const-correctness; it would allow you to modify a const object, which would defeat the whole purpose of const.
For example, this C program:
#include <stdio.h>
void func(char *s) {
puts(s);
s[0] = 'J';
}
int main(void) {
const char message[] = "Hello";
func(message);
puts(message);
return 0;
}
produces the following compile-time diagnostics from gcc:
c.c: In function ‘main’:
c.c:10:5: warning: passing argument 1 of ‘func’ discards qualifiers from pointer target type
c.c:3:6: note: expected ‘char *’ but argument is of type ‘const char *’
The final message is marked as a "note" because it refers to the (perfectly legal) declaration of func(), explaining that that's the parameter declaration to which the warning refers.
As far as the C standard is concerned, this is a constraint violation, which means that a compiler could treat it as a fatal error. gcc, by default, just warns about it and does an implicit conversion from const char* to char*.
When I run the program, the output is:
Hello
Jello
which shows that, even though I declared message as const, the function was able to modify it.
Since gcc didn't treat this as a fatal error, there's no need to suppress either of the diagnostic messages. It's entirely possible that the code will work anyway (say, if the function doesn't happen to modify anything). But warnings exist for a reason, and you or the maintainers of the M-SIM architecture simulator should probably take a look at this.
(Passing a string literal to func() wouldn't trigger these diagnostics, since C doesn't treat string literals as const. (It does make the behavior of attempting to modify a string literal undefined.) This is for historical reasons. gcc does have an option, -Wwrite-strings, that causes it to treat string literals as const; this actually violates the C standard, but it can be a useful check.)
As I mentioned in a comment, it would be helpful if you'd show us the code that triggers the diagnostics.
I even downloaded and built the M-SIM architecture simulator myself, but I didn't see that particular message.
Pointers to const-qualified types do not implicitly convert to pointers to non-const-qualified types. An explicit conversion via a cast is necessary, for example:
foo((char *)bar)
First in a function call (of a function defined with a prototype), the arguments are converted to the type of the parameters as if by assignment.
You can assign a value of type char * to an object of type const char * but you cannot assign a const char * value to a char * object.
This constraint appears in the constraints of assignment operator:
(C99, 6.5.16.1p1) "One of the following shall hold: [...] - both operands are pointers to qualified or unqualified versions of compatible types, and the type pointed to by the left has all the qualifiers of the type pointed to by the right;"
This constraint permits the first assignment but disallows the second.
Declaring a pointer with type const char * means you won't modify the object pointed to by the pointer. So you can assign the pointer a value of char * type, it just means the object won't be modified through the const char * pointer.
But declaring a pointer of type char * means you could modify the object pointed to by the pointer. It would not make sense to assign it a value of const char *.
Remember that in C, const does not mean constant but rather read-only. The const qualifier put before pointer types means you promise not to modify objects through objects of these pointers types.
Take these steps if you haven't already:
Declare a char pointer.
If necessary, allocate space and copy contents from the constant string. (e.g. by
using strdup())
And substitute the constant char pointer with the new char pointer.
Why is it that I can assign a string to a variable of type int? Eg. the following code compiles correctly:
int main(int argv, char** argc){
int a="Hello World";
printf(a);
}
Also, the program doesn't compile when I assign a string to a variable of a different type, namely double and char.
I suppose that what is actually going on is that the compiler executes int* a = "Hello World"; and when I write double a="Hello World";, it executes that line of code as it is.
Is this correct?
In fact, that assignment is a constraint violation, requiring a diagnostic (possibly just a warning) from any conforming C implementation. The C language standard does not define the behavior of the program.
EDIT : The constraint is in section 6.5.16.1 of the C99 standard, which describes the allowed operands for a simple assignment. The older C90 standard has essentially the same rules. Pre-ANSI C (as described in K&R1, published in 1978) did allow this particular kind of implicit conversion, but it's been invalid since the 1989 version of the language.
What probably happens if it does compile is that
int a="Hello World";
is treated as it if were
int a = (int)"Hello World";
The cast takes a pointer value and converts to int. The meaning of such a conversion is implementation-defined; if char* and int are of different sizes, it can lose information.
Some conversions may be done explicitly, for example between different arithmetic types. This one may not.
Your compiler should complain about this. Crank up the warning levels until it does. (Tell us what compiler you're using, and we can tell you how to do that.)
EDIT :
The printf call:
printf(a);
has undefined behavior in C90, and is a constraint violation, requiring a diagnostic, in C99, because you're calling a variadic function with no visible prototype. If you want to call printf, you must have a
#include <stdio.h>
(In some circumstances, the compiler won't tell you abut this, but it's still incorrect.) And given a visible declaration, since printf's first parameter is of type char* and you're passing it an int.
And if your compiler doesn't complain about
double a="Hello World";
you should get a better compiler. That (probably) tries to convert a pointer value to type double, which doesn't make any sense at all.
"Hello World" is an array of characters ending with char '\0'
When you assign its value to an int a, you assign the address of the first character in the array to a. GCC is trying to be kind with you.
When you print it, then it goes to where a points and prints all the characters until it reaches char '\0'.
It will compile because (on a 32-bit system) int, int *, and char * all correspond to 32-bit registers -- but double is 64-bits and char is 8-bits.
When compiled, I get the following warnings:
[11:40pm][wlynch#wlynch /tmp] gcc -Wall foo.c -o foo
foo.c: In function ‘main’:
foo.c:4: warning: initialization makes integer from pointer without a cast
foo.c:5: warning: passing argument 1 of ‘printf’ makes pointer from integer without a cast
foo.c:5: warning: format not a string literal and no format arguments
foo.c:5: warning: format not a string literal and no format arguments
foo.c:6: warning: control reaches end of non-void function
As you can see from the warnings, "Hello World" is a pointer, and it is being converted to an integer automatically.
The code you've given will not always work correctly though. A pointer is sometimes larger than an int. If it is, you could get a truncated pointer, and then a very odd fault when you attempt to use that value.
It produces a warning on compilers like GCC and clang.
warning: initialization makes integer from pointer without a cast [enabled by default]
The string literal "Hello World" is a const char *, so you are assigning a pointer to an int (i.e. casting the address of the first character in the string as an int value). On my compilers, gcc 4.6.2 and clang-mac-lion, assigning the string to int, unsigned long long, or char all produce warnings, not errors.
This is not behavior to rely on, quite frankly. Not to mention, your printf(a); is also a dangerous use of printf.