Replacing (not modifying) 'const' values, legal? - c

Observe the following code:
const char *str1 = "foo";
printf("1.1: %s\n", str1);
str1 = "bar";
printf("1.2: %s\n\n", str1);
const char *str2[] = { "foo", "bar" };
printf("2.1: %s\n", str2[0]);
str2[0] = "baz";
printf("2.2: %s\n\n", str2[0]);
char *str3 = malloc(4);
strcpy(str3, "foo");
const char *str4[] = { str3, "bar" };
printf("3.1: %s -- %s\n", str4[0], str4[1]);
//str4[0][0] = 'z';
str4[0] = "bar";
str4[1] = "baz";
printf("3.2: %s -- %s\n", str4[0], str4[1]);
free(str3);
Wrapping it in a standard main function and compiling with gcc -Wall -Wextra -o test test.c, this gives no warnings and the output is:
1.1: foo
1.2: bar
2.1: foo
2.2: baz
3.1: foo -- bar
3.2: bar -- baz
When uncommenting the only commented line, gcc complains error: assignment of read-only location ‘*str4[0]’. Of course this is expected, I only added that bit for completeness.
By my understanding all the other assignments are valid because I'm not modifying the already existing read-only memory, I'm replacing it in its entirety with another memory block (which is then also marked read-only). But is this legal throughout all C compilers, or is it still implementation-dependent and therefore can it result in undefined behaviour? It's a little hard to find good information on this because it always seems to be about modifying the existing memory through pointer magic or casting the const away.

In this declaration
const char *str1 = "foo";
the pointer (variable) str1 is not constant. What is const-qualified is the thing it points to: str1 is a non-constant pointer to an object of type const char, here the first character of the string literal.
So you may not change the literal through the pointer, for example
str1[0] = 'g';
But you may change the variable str1 itself, because it is not constant:
str1 = "bar";
To declare the pointer str1 as a constant pointer you should write
const char * const str1 = "foo";
In this case you may not write
str1 = "bar";
Because now the pointer str1 itself is constant.
Note, though, that in C string literals have non-constant character array types (char[N]); nevertheless you may not change a string literal. Any attempt to do so results in undefined behavior.
That is, you may not write
char *str1 = "foo";
str1[0] = 'g';

Related

Ambiguity with const keyword in C

I have some code:
int main() {
const char *string1 = "foo";
char *string2 = "bar";
string1 = string2;
return 0;
}
When I build it, no warnings are raised. However, with
int main() {
const char *string1 = "foo";
char *string2 = "bar";
string2 = string1;
return 0;
}
A warning is raised:
warning: assignment discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
string2 = string1;
^
With my current basic knowledge of C, I would have thought that string1 would not be reassignable since it is const. To my surprise, it was the code in the first snippet that built with no warnings and the second code snippet the opposite.
Any explanations of what is going on? Thanks!
That is perfectly ok to do. The (potential) problem here is that the strings you're pointing at are string literals, so modifying them would invoke undefined behavior.
What you're mixing it up with is probably this:
const char *str1 - str1 is a pointer to const char
char * const str2 - str2 is a const pointer to char
const char * const str3 - str3 is a const pointer to const char
str1 can be reassigned to point at something else, but since it is a pointer to const char, *str1 is not a modifiable l-value, so *str1 = <something> is forbidden. str2 on the other hand cannot be reassigned to point at something else, but *str2 can be assigned to. With str3 you can do neither.
Also, note that const char *str and char const *str are equivalent. To find out what a declaration does, or to find out how to declare something, use this site: https://cdecl.org/
I would have thought that string1 would not be reassignable since it is const.
But it isn't!
It's a non-const pointer, pointing to a const thing (specifically, const chars).
The only reason your code is troublesome, is that you should not implicitly convert a const char* to char*, for fairly obvious reasons — it would defeat the purpose of const-safety to just drop const on a pointee type whenever you liked! That's why you're getting a warning.
Going the other way around, to add constness, as in your first example, is fine.
tl;dr: it's not your assignment; it's what you're assigning.

When to use char a[] over char *p and vice versa?

Lately I've been learning all about the C language, and am confused as to when to use
char a[];
over
char *p;
when it comes to string manipulation. For instance, I can assign a string to them both like so:
char a[] = "Hello World!";
char *p = "Hello World!";
and view/access them both like:
printf("%s\n", a);
printf("%s\n", p);
and manipulate them both like:
printf("%c\n", a[6]);
printf("%c\n", p[6]);
So, what am I missing?
char a[] = "Hello World!";
This allocates a modifiable array just big enough to hold the string literal (including the terminating NUL char), then initializes the array with the contents of the string literal. If it is a local variable, this effectively means a memcpy at runtime every time the local variable is created.
Use this when you need to modify the string, but don't need to make it bigger.
Also, if you have char *ap = a;, when a goes out of scope ap becomes a dangling pointer. Or, same thing, you can't do return a; when a is local to that function, because return value will be dangling pointer to now destroyed local variables of that function.
Note that using exactly this is rare. Usually you don't want an array with contents from string literal. It's much more common to have something like:
char buf[100]; // contents are undefined
snprintf(buf, sizeof buf, "%s/%s.%d", pathString, nameString, counter);
char *p = "Hello World!";
This defines a pointer and initializes it to point to a string literal. Note that string literals are (normally) non-writable, so you really should have this instead:
const char *p = "Hello World!";
Use this when you need pointer to non-modifiable string.
In contrast to a above, if you have const char *p2 = p; or do return p;, these are fine, because the pointer points to the string literal in the program's constant data, which is valid for the whole execution of the program.
The string literals themselves, the text within double quotes, the actual bytes making up the strings, are created at compile time and normally placed with other constant data within the application. A string literal in code then effectively evaluates to the address of this constant data blob.
A char * that points to a string literal refers to read-only data, which must not be modified, while the contents of a char[] can be.
char *str = "hello";
str[0] = 't'; // Undefined behavior: attempts to modify a string literal
Whereas
char str[] = "hello"; str[0] = 't'; // Legal, string becomes tello

Difference between char* and const char* [duplicate]

What's the difference between
char* name
which points to a constant string literal, and
const char* name
char* is a mutable pointer to a mutable character/string.
const char* is a mutable pointer to an immutable character/string. You cannot change the contents of the location(s) through this pointer, and a conforming compiler must issue a diagnostic when you try. For the same reason, implicit conversion from const char * to char * also draws a diagnostic, because it would discard the qualifier.
char* const is an immutable pointer (it cannot point to any other location) but the contents of location at which it points are mutable.
const char* const is an immutable pointer to an immutable character/string.
char *name
You can change which char name points to, and also modify the char it points to.
const char* name
You can change which char name points to, but you cannot modify the char it points to (https://msdn.microsoft.com/en-us/library/vstudio/whkd4k6a(v=vs.100).aspx, see "Examples"). In this case, the const specifier applies to char, not the asterisk.
According to the MSDN page and http://en.cppreference.com/w/cpp/language/declarations, the const before the * is part of the decl-specifier sequence, while the const after * is part of the declarator.
A declaration specifier sequence can be followed by multiple declarators, which is why const char * c1, c2 declares c1 as const char * and c2 as const char.
EDIT:
From the comments, your question seems to be asking about the difference between the two declarations when the pointer points to a string literal.
In that case, you should not modify the char to which name points, as it could result in Undefined Behavior.
String literals may be allocated in read-only memory regions (this is up to the implementation), and a user program should not modify them in any way. Any attempt to do so results in Undefined Behavior.
So the only difference in that case (of usage with string literals) is that the second declaration gives you a slight advantage. Compilers will usually give you a warning in case you attempt to modify the string literal in the second case.
Online Sample Example:
#include <string.h>
int main()
{
char *str1 = "string Literal";
const char *str2 = "string Literal";
char source[] = "Sample string";
strcpy(str1,source); //No warning or error, just Undefined Behavior
strcpy(str2,source); //Compiler issues a warning
return 0;
}
Output:
cc1: warnings being treated as errors
prog.c: In function ‘main’:
prog.c:9: error: passing argument 1 of ‘strcpy’ discards qualifiers from pointer target type
Notice the compiler warns for the second case but not for the first.
char mystring[101] = "My sample string";
const char * constcharp = mystring; // (1)
char const * charconstp = mystring; // (2) the same as (1)
char * const charpconst = mystring; // (3)
constcharp++; // ok
charconstp++; // ok
charpconst++; // compile error
constcharp[3] = '\0'; // compile error
charconstp[3] = '\0'; // compile error
charpconst[3] = '\0'; // ok
// String literals
char * lcharp = "My string literal";
const char * lconstcharp = "My string literal";
lcharp[0] = 'X'; // undefined behavior; typically a segmentation fault (crash) at run time
lconstcharp[0] = 'X'; // compile error
// *not* a string literal
const char astr[101] = "My mutable string";
astr[0] = 'X'; // compile error
((char*)astr)[0] = 'X'; // compiles, but undefined behavior: astr is defined const
In neither case can you modify a string literal, regardless of whether the pointer to that string literal is declared as char * or const char *.
However, the difference is that if the pointer is const char * then the compiler must give a diagnostic if you attempt to modify the pointed-to value, but if the pointer is char * then it need not.
CASE 1:
char *str = "Hello";
str[0] = 'M'; // Compiles (a warning may be issued), but undefined behavior; typically a segmentation fault at run time
The above sets str to point to the literal value "Hello", which is hard-coded in the program's binary image and is typically flagged as read-only in memory. Any change to this string literal is illegal and would typically cause a segmentation fault.
CASE 2:
const char *str = "Hello";
str[0] = 'M'; // Compile-time error
CASE 3:
char str[] = "Hello";
str[0] = 'M'; // legal: str now holds "Mello"
The question is what's the difference between
char *name
which points to a constant string literal, and
const char *cname
I.e. given
char *name = "foo";
and
const char *cname = "foo";
There is not much difference between the two and both can be seen as correct. Due to the long legacy of C code, string literals have a type of char[], not const char[], and there is a lot of older code that likewise accepts char * instead of const char *, even when it does not modify the arguments.
The principal difference between the two in general is that *cname or cname[n] will evaluate to lvalues of type const char, whereas *name or name[n] will evaluate to lvalues of type char, which are modifiable lvalues. A conforming compiler is required to produce a diagnostic message if the target of an assignment is not a modifiable lvalue; it need not produce any warning on assignment to lvalues of type char:
name[0] = 'x'; // no diagnostics *needed*
cname[0] = 'x'; // a conforming compiler *must* produce a diagnostic message
The compiler is not required to stop the compilation in either case; it is enough that it produces a warning for the assignment to cname[0]. The resulting program is not a correct program. The behaviour of the construct is undefined. It may crash, or even worse, it might not crash, and might change the string literal in memory.
Through the first you can modify the pointed-to char (provided it does not point to a string literal); through the second you can't. Read up about const correctness (there are some nice guides about the difference). There is also char * const name, where you can't repoint it.
Actually, char* name is not a pointer to a constant, but a pointer to a variable. You might be talking about this other question.
What is the difference between char * const and const char *?
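I would add here that recent compilers reject this in C++: Visual Studio 2022, for instance, in its default conformance mode does not allow a char* to be initialized with a string literal when compiling as C++. There, char* ptr = "Hello"; is an error whilst const char* ptr = "Hello"; is legal. In C, both still compile.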

Do these statements about pointers have the same effect?

Does this...
char* myString = "hello";
... have the same effect as this?
char actualString[] = "hello";
char* myString = actualString;
No.
char str1[] = "Hello world!"; //char-array on the stack; string can be changed
char* str2 = "Hello world!"; //char-array in the data-segment; it's READ-ONLY
The first example creates an array of size 13*sizeof(char) on the stack and copies the string "Hello world!" into it.
The second example creates a char* on the stack and points it to a location in the data-segment of the executable, which contains the string "Hello world!". This second string is READ-ONLY.
str1[1] = 'u'; //Valid
str2[1] = 'u'; //Invalid - MAY crash program!
No. The first one gives you a pointer to data that must be treated as const (in C the literal's type is char[6], but it must not be modified), and if you change any character via that pointer, it's undefined behavior. The second one copies the characters into an array, which isn't const, so you can change any characters (either directly in the array, or via the pointer) at will with no ill effects.
No. In the first one, you can't modify the string pointed to by myString; in the second one you can.
It isn't the same, because the unnamed array pointed to by myString in the first example is read-only and has static storage duration, whereas the named array in the second example is writeable and has automatic storage duration.
On the other hand, this is closer to being equivalent:
static const char actualString[] = "hello";
char* myString = (char *)actualString;
It's still not quite the same though, because the unnamed arrays created by string literals are not guaranteed to be unique, whereas explicit arrays are. So in the following example:
static const char string_a[] = "hello";
static const char string_b[] = "hello";
const char *ptr_a = string_a;
const char *ptr_b = string_b;
const char *ptr_c = "hello";
const char *ptr_d = "hello";
ptr_a and ptr_b are guaranteed to compare unequal, whereas ptr_c and ptr_d may be either equal or unequal - both are valid.
