C : Using char * const pointer - c

In the following program, p is declared as a constant pointer (the pointer is constant, but the string is not). Still, the program does not work and stops abruptly, saying "untitled2.exe has stopped working".
#include<stdio.h>
#include<stdlib.h>
int main(){
char * const p = "hello";
*p = 'm';
return 0;
}
Why this unexpected behaviour?

Albeit p itself is a pointer to a non-const object, it is pointing to a string literal. A string literal is an object which, although not const-qualified with regards to its type, is immutable.
In other words, p is pointing to an object which is not const, but behaves as if it were.
Read more on ANSI/ISO 9899:1990 (C90), section 6.1.4.

You are getting a Windows error because you are invalidly accessing memory. On other systems you might get a SEGFAULT or SEGV or a Bus error.
*p = 'm';
is trying to change the first letter of the string literal "hello" from 'h' to 'm'.

char * const p = "hello";
defines a constant pointer p and initialises it with the address of the string literal "hello". In C the literal has type char[6] (it is const char[6] only in C++), so no const qualifier is discarded and the compiler accepts the initialisation silently. The array behind the literal must still be treated as read-only: writing to it is undefined behaviour.
Mind that const char * forbids you to modify the contents of the memory being pointed to but lets you change the address, while char * const permits you to modify the contents but fixes the address. There is also a combined version, const char * const.
Although this is valid C code, whether "hello" ends up in writable memory depends on your compiler and OS; the standard leaves it undefined. As a rule of thumb, string literals are placed in a read-only part of the executable. Attempting to write to *p then gives you a memory permission error such as SIGSEGV.
The correct way is to copy the contents of the string to the stack and work there:
char p[] = "hello";
Now you can modify *p because it is located on the stack which is read/write. If you require the same globally then put it into the global scope.
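A minimal sketch of the array approach, for illustration only. The helper name make_mello and its fixed 6-byte output buffer are assumptions made for this example and are not part of the original program.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: builds "mello" in caller-owned storage.
   buf must have room for at least 6 bytes ("hello" plus the '\0'). */
void make_mello(char buf[6])
{
    assert(buf != NULL);

    char p[] = "hello";        /* array initialised from the literal: a writable copy */
    *p = 'm';                  /* fine: modifies the array, not the literal */

    memcpy(buf, p, sizeof p);  /* sizeof p is 6, so the terminator is copied too */
}
```

Calling make_mello(out) with char out[6]; leaves "mello" in out, because the write goes to the local array and never touches the literal itself.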

Related

Why use the const char* form for a string

Take the following two forms of creating a string:
const char* pt1 = "Hello";
char* pt2 = "Goodbye";
What is the use of const in the above? In my understanding, doing:
ptr = "Adios";
Would work for both, since that is changing the address of the pointer, but trying to change a letter in the string would fail for both:
const char* pt1 = "Hello";
compiler error: assignment of read-only location
char* pt2 = "Goodbye";
runtime error: seg fault, trying to change .rodata
Since they produce the same result -- i.e., an error -- is there any advantage in using const when defining a string?
Defining pointers that point to string constants (aka string literals) as const char * allows the compiler to detect an incorrect access if somewhere else in the code you try and modify what pt1 points to as in *pt1 = 'A'; whereas you would just have undefined behavior at runtime if pt1 had type char *, causing a crash on some architectures and less obvious but potentially more damaging side effects on others.
To expand on this subject, there is sometimes a confusion as to the meaning of const for pointer definitions:
const char *pt1 = "Hello"; defines a modifiable pointer pt1 that points to an array of char that cannot be modified through it. Since "Hello" is a string constant, it is the correct type for pt1. pt1 can be modified to point to another string or char, modifiable or not, or be set to NULL.
char *pt2 = "Hello"; defines a modifiable pointer pt2 that points to an array of char that can be modified through it. The C Standard allows this, even though the literal must not be modified, for compatibility with historical code. gcc and clang can warn about such initialisations with the -Wwrite-strings command line option. I strongly recommend using this and many more warnings to avoid common mistakes.
const char * const pt3 = "Hello"; defines a constant pointer pt3 that points to an array of char that cannot be modified through it. pt3 cannot be modified to point to another string or even be set to NULL.
char * const pt4 = "Hello"; defines a constant pointer pt4 that points to an array of char that can be modified through it. pt4 cannot be changed once initialized.
char and const can be placed in any order, but whether const is before or after the * makes a big difference.
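The four forms above can be sketched side by side. This is only an illustration: the function name const_forms_demo is made up, and the pointers are aimed at writable arrays (not literals) so that every write shown is well defined. The lines that the compiler would reject are left as comments.

```c
#include <assert.h>

/* Hypothetical demo: exercises each const placement on writable arrays. */
char const_forms_demo(void)
{
    char buf[] = "Hello";          /* writable storage */
    char other[] = "World";

    const char *pt1 = buf;         /* modifiable pointer, read-only view */
    pt1 = other;                   /* OK: the pointer may be reseated */
    /* *pt1 = 'J'; */              /* rejected: pointee is const-qualified */

    char *pt2 = buf;               /* modifiable pointer, writable pointee */
    *pt2 = 'J';                    /* OK here because buf is an array */
    pt2 = other;                   /* OK */

    const char *const pt3 = buf;   /* neither pointer nor pointee may change */
    /* pt3 = other; */             /* rejected */
    /* *pt3 = 'X'; */              /* rejected */

    char *const pt4 = buf;         /* fixed pointer, writable pointee */
    pt4[1] = 'E';                  /* OK */
    /* pt4 = other; */             /* rejected */

    assert(pt1 == other && pt2 == other);
    return (pt3[0] == 'J' && pt4[1] == 'E') ? pt3[0] : '\0';
}
```

After the two writes buf holds "JEllo", so the function returns 'J'. Had buf been a pointer to the literal "Hello" instead, the writes through pt2 and pt4 would still compile but would be undefined behavior.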
What is the use of const in the above?
const char* pt1 = "Hello";
It simply means you cannot change the data that pt1 is pointing to.
Both
const char* ptr1 = "Hello";
char* pt2 = "Goodbye";
create static storage for the string literal.
So the advantage is that you always get a compile-time error with the first form, whereas with the second it depends on the compiler; some compilers warn automatically.
Why use the const char* form for a string
Use const char *ptr1 when the referenced string should not get modified and allow the compiler to optimize based on that.
This is always the case when assigning with string literals.
Use char *ptr2 when the referenced string might get modified.
The danger of char* pt2 = "Goodbye"; is that later code may attempt to change the data referenced by pt2, which presently points to a string literal.
Why use the const char* form for a string
To notify yourself and other developers that the memory the pointer points to cannot be modified. In this context const is a keyword mostly for the programmer to notify the programmer that the data the pointer points to is const.
basically anything you can do to boil the error up to be caught by the compiler is preferable, right?
Yes. That is the reason why developers push to invent better tools for static code analysis. There are many such tools, and recent versions of the GNU compiler gcc come with a built-in static analyzer. It's also the reason why languages like Rust were invented and are so popular. All these tools try to make as many errors as possible detectable "statically", at compile time.
gcc also has the -Wwrite-strings option, which warns about code like char *str = "str" that assigns a string literal to a non-const pointer.
As KamilCuk pointed out, modifying const char* pt1 = "Hello"; gives you an error already at compile time, when you can just fix the code, recompile, and everything is fine. Modifying char* pt1 = "Hello"; throws the error at runtime, and you do not want all your 1 million users to redownload and reinstall your program (You would have to first buy a better internet connection for that). So, you definitely should use const char*.

Understanding of strlen function - Assignment of const char *s to const char *sc

Below is the implementation of strlen.c as per "The Standard C Library":
size_t strlen(const char *s){
const char *sc;
for(sc = s; *sc != '\0'; ++sc)
;
return (sc-s); }
Is my understanding of the legality of sc = s correct?
sc=s is a legal assignment because since both variables are declared as const, both protect the object that is pointed to by s. In this case, it is legal to change where sc or s both point to but any assignment (or reference?) to *s or sc would be illegal.
I think what you are asking is what the const keyword means. If not please clarify your question.
The way I like to think of it is any const variable can be stored in ROM (Read Only Memory) and variables that are not declared const can be stored in RAM (Random Access Memory). This kind of depends on the kind of computer you are working with so the const data may not actually be stored in ROM but it could be.
So you can do anything you want with the pointer itself but you can not change the data in the memory it points to.
This means you can reference the pointer and pass that around as much as you like. Also you can assign a different value to the pointer.
Say you have this code
const char* foo = "hello";
const char* bar = "world";
It's perfectly legal to do
foo = bar;
Now both point to "world"
It's also legal to do
const char *myPtr = bar;
myPtr = foo;
What you are not allowed to do is change the actual data memory so you are not allowed to do
foo[0] = 'J';
You are correct.
const char * sc declares a pointer to a const char. In essence, it means that sc points to a variable of type char (or in that case, a contiguous array of chars) and that you cannot use sc to modify the pointed variable. See it live here.
Note that sc itself is not a const variable. The const applies to the pointed variable, and not to the pointer. You can thus change the value of the pointer, i.e. the variable to which it points.
Follow this answer to have more insight about the different uses of const and pointers : What is the difference between const int*, const int * const, and int const *?
Is my understanding of the legality of sc = s correct?
Yes, only some detail on the last part needed.
... but any assignment (or reference?) to *s or sc would be illegal.
(I suspect OP means "... or *sc would be illegal.")
Referencing what s or sc points to is OK as in char ch = *sc;
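Attempting to change the value of *s or *sc directly, as in *sc = 'x';, is rejected by the compiler because the pointee is const-qualified. Casting the const away and then writing is undefined behavior (UB) if the underlying object is const or a string literal.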
(See good additional detail by @rici)
With UB, the assignment may work, it might not on Tuesdays, code may crash, etc. It is not defined by C what happens. Certainly, code should not attempt it.
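For reference, a compilable sketch of the same loop. It is renamed my_strlen (a name chosen here so that it does not clash with the library function), and the NULL assertion is an addition for the example rather than something the book's version does.

```c
#include <assert.h>
#include <stddef.h>

/* Sketch of a strlen-style function: walks a second pointer to the terminator. */
size_t my_strlen(const char *s)
{
    const char *sc;

    assert(s != NULL);               /* added check, not in the original */

    for (sc = s; *sc != '\0'; ++sc)
        ;                            /* empty body: the loop only advances sc */

    return (size_t)(sc - s);         /* distance between the two pointers */
}
```

Both s and sc are pointers to const char, so sc = s needs no cast, and reading *sc is fine. Only writing through either pointer is ruled out.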

Pointers and access to memory in c. Be careful [duplicate]

Still learning more C and am a little confused. In my references I find cautions about assigning a pointer that has not been initialized. They go on to give examples. Great answers yesterday by the way from folks helping me with pointers, here:
Precedence, Parentheses, Pointers with iterative array functions
On follow up I briefly asked about the last iteration of the loop and potentially pointing the pointer to a non-existent place (i.e. because of my references cautioning against it). So I went back and looked more and find this:
If you have a pointer
int *pt;
then use it without initializing it (i.e. I take this to mean without a statement like *pt= &myVariable):
*pt = 606;
you could end up with a real bad day depending on where in memory this pointer has been assigned to. The part I'm having trouble with is when working with a string of characters something like this would be ok:
char *str = "Sometimes I feel like I'm going crazy.";
Where the reference says, "Don't worry about where in the memory the string is allocated; it's handled automatically by the compiler". So no need to say initialize *str = &str[0]; or *str = str;. Meaning, the compiler is automatically char str[n]; in the background?
Why is it that this is handled differently? Or, am I completely misunderstanding?
In this case:
char *str = "Sometimes I feel like I'm going crazy.";
You're initializing str to contain the address of the given string literal. You're not actually dereferencing anything at this point.
This is also fine:
char *str;
str = "Sometimes I feel like I'm going crazy.";
Because you're assigning to str and not actually dereferencing it.
This is a problem:
int *pt;
*pt = 606;
Because pt is not initialized and then it is dereferenced.
You also can't do this for the same reason (plus the types don't match):
*pt= &myVariable;
But you can do this:
pt= &myVariable;
After which you can freely use *pt.
When you write sometype *p = something;, it's equivalent to sometype *p; p = something;, not sometype *p; *p = something;. That means when you use a string literal like that, the compiler figures out where to put it and then puts its address there.
The statement
char *str = "Sometimes I feel like I'm going crazy.";
is equivalent to
char *str;
str = "Sometimes I feel like I'm going crazy.";
Simplifying, the string literal can be thought of as:
const char literal[] = "Sometimes I feel like I'm going crazy.";
so the expression
char *str = "Sometimes I feel like I'm going crazy.";
is logically equivalent to:
const char literal[] = "Sometimes I feel like I'm going crazy.";
const char *str = literal;
Of course, literals do not have names.
But you can't dereference the char pointer which does not have allocated memory for the actual object.
/* Wrong */
char *c;
*c = 'a';
/* Wrong - you assign the pointer with the integer value */
char *d = 'a';
/* Correct */
char *d = malloc(1);
*d = 'a';
/* Correct */
char x;
char *e = &x;
*e = 'b';
The last example:
/* Wrong - you assign the pointer with the integer value */
int *p = 666;
/* Wrong you dereference the pointer which references to the not allocated space */
int *r;
*r = 666;
/* Correct */
int *s = malloc(sizeof(*s));
*s = 666;
/* Correct */
int t;
int *u = &t;
*u = 666;
And the last one, something similar to string literals: the compound literals.
/* Correct */
int *z = (int[]){666,567,234};
z[2] = 0;
*z = 5;
/* Correct */
const int *z = (const int[]){666,567,234};
Good job on coming up with that example. It does a good job of showing the difference between declaring a pointer (like char *text;) and assigning to a pointer (like text = "Hello, World!";).
When you write:
char *text = "Hello!";
it is essentially the same as saying:
char *text; /* Note the '*' before text */
text = "Hello!"; /* Note that there's no '*' on this line */
(Just so you know, the first line can also be written as char* text;.)
So why is there no * on the second line? Because text is of type char*, and "Hello!" (an array of char) converts to a char* when used this way. There is no disagreement here.
Also, the following three lines are identical, as far as the compiler is concerned:
char *text = "Hello!";
char* text = "Hello!";
char * text = "Hello!";
The placement of the space before or after the * makes no difference. The second line is arguably easier to read, as it drives the point home that text is a char*. (But be careful! This style can burn you if you declare more than one variable on a line!)
As for:
int *pt;
*pt = 606; /* Unsafe! */
you might say that *pt is an int, and so is 606, but it's more accurate to say that pt (without a *) is a pointer to memory that should contain an int. Whereas *pt (with a *) refers to the int inside the memory that pt (without the *) is pointing to.
And since pt was never initialized, using *pt (either to assign to it or to read from it) is unsafe.
Now, the interesting part about the lines:
int *pt;
*pt = 606; /* Unsafe! */
is that they'll compile (although possibly with a warning). That's because the compiler sees *pt as an int, and 606 as an int as well, so there's no disagreement. However, as written, the pointer pt doesn't point to any valid memory, so assigning to *pt will likely cause a crash, or corrupt data, or usher in the end of the world, etc.
It's important to realize that *pt is not a variable (even though it is often used like one). *pt just refers to the value in the memory whose address is contained in pt. Therefore, whether *pt is safe to use depends on whether pt contains a valid memory address. If pt isn't set to valid memory, then the use of *pt is unsafe.
So now you might be wondering: What's the point of declaring pt as an int* instead of just an int?
It depends on the case, but in many cases, there isn't any point.
When programming in C and C++, I use the advice: If you can get away with declaring a variable without making it a pointer, then you probably shouldn't declare it as a pointer.
Very often programmers use pointers when they don't need to. At the time, they aren't thinking of any other way. In my experience, when it's brought to their attention to not use a pointer, they will often say that it's impossible not to use a pointer. And when I prove them otherwise, they will usually backtrack and say that their code (which uses pointers) is more efficient than the code that doesn't use pointers.
(That's not true for all programmers, though. Some will recognize the appeal and simplicity of replacing a pointer with a non-pointer, and gladly change their code.)
I can't speak for all cases, of course, but C compilers these days are usually smart enough to compile both pointer code and non-pointer code to be practically identical in terms of efficiency. Not only that, but depending on the case, non-pointer code is often more efficient than code that uses pointers.
There are 4 concepts which you have mixed up in your example:
1) declaring a pointer. int *p; or char *str; are declarations of pointers.
2) initializing a pointer at declaration. char *str = "some string"; declares the pointer and initializes it.
3) assigning a value to the pointer. str = "other string"; assigns a value to the pointer. Similarly p = (int*)606; would assign the value of 606 to the pointer. Though, in the first case the value is legal and points to the location of the string in static memory. In the second case you assign an arbitrary address to p. It might or might not be a legal address. So, p = &myint; or p = malloc(sizeof(int)); are better choices.
4) assigning a value to what the pointer points to. *p = 606; assigns the value to the 'pointee'. Now it depends on whether the value of the pointer 'p' is legal or not. If you did not initialize the pointer, it is illegal (unless you are lucky :-)).
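The four concepts can be lined up in one short function. This is just a sketch: the name four_steps is made up, and malloc is used only to show a second way of obtaining a valid address.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical demo: declaration, initialization, pointer assignment,
   and pointee assignment, in that order. Returns 606, or -1 if malloc fails. */
int four_steps(void)
{
    int value = 1;

    int *p;                            /* 1) declaration: p is indeterminate here */
    const char *str = "some string";   /* 2) declaration plus initialization */

    str = "other string";              /* 3) assignment to the pointer itself */
    p = &value;                        /* 3) p now holds a valid address */

    *p = 606;                          /* 4) assignment to the pointee */
    assert(value == 606 && str[0] == 'o');

    int *q = malloc(sizeof *q);        /* another valid pointee, from the heap */
    if (q == NULL)
        return -1;
    *q = *p;                           /* safe: both pointers are valid */

    int result = *q;
    free(q);
    return result;
}
```

Step 4 is only safe because step 3 came first. Swapping the two lines would write through an indeterminate pointer.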
Many good explanations over here. The OP has asked
Why is it that this is handled differently?
It is a fair question, he means why, not how.
Short answer
It is a design decision.
Long answer
When you use a literal in an assignment, the compiler has two options: either it places the literal in the generated assembly instruction (maybe allowing variable length assembly instructions to accommodate different literal byte lengths) or it places the literal somewhere the CPU can reach it (memory, registers...). For ints, it seems a good choice to place them in the assembly instruction, but for strings... almost all strings used in programs (?) are too long to be placed in the assembly instruction. Given that arbitrarily long assembly instructions are bad for general purpose CPUs, the C designers decided to optimize this use case for strings and save the programmer one step by allocating memory for him. This way, the behaviour is consistent across machines.
Counterexample
To see that this need not be the case in other languages, consider Python. There, int constants are actually placed in memory and given an id, always. So, if you try to get the address of two different variables that were assigned the same literal, it will return the same id (since they are referring to the same literal, already placed in memory by the Python loader). It is useful to stress that in Python, the id is equivalent to an address in Python's abstract machine.
Each byte of memory is stored in its own numbered pigeon-hole. That number is the "address" of that byte.
When your program compiles, it builds up a data-table of constants. At run-time these are copied into memory somewhere. So upon execution, in memory is the string (here at the 100,000th byte):
#100000 Sometimes I feel like I'm going crazy.\0
The compiler has generated code, such that when the variable str is created, it is automatically initialised with the address of where that string came to be stored. So in this example's case, str -> 100000. This is where the name pointer comes from, str does not actually contain that string-data, it holds the address of it (i.e. a number), "pointing" to it, saying "that piece of data at this address".
So if str was treated like an integer, it would contain the value 100000.
When you dereference a pointer, like *str = '\0', it's saying: The memory str points at, put this '\0' there.
So when the code defines a pointer, but without any initialisation, it could be pointing anywhere, perhaps even to memory the executable doesn't own (or owns, but can't write to).
For example:
int *pt; // What does 'pt' point at?
It has not been given a valid address. So if the code tries to dereference it, it's just pointing off anywhere in memory, and this gives indeterminate results.
But the case of:
int number = 605;
int *pt = &number;
*pt = 606;
Is perfectly valid, because the compiler has generated some space for the storage of number, and now pt contains the address of that space.
So when we use the address-of operator & on a variable, it gives us the number in memory where the variable's content is stored. So if the variable number happened to be stored at byte 100040:
int number = 605;
printf( "Number is stored at %p\n", (void *)&number );
We would get the output:
Number is stored at 100040
Similarly with character arrays: in most expressions the array name converts to a pointer to its first element, so its value is the address of that element.
// words, words_ptr1, words_ptr2 all end up being the same address
char words[] = "Sometimes I feel like I'm going crazy.";
char *words_ptr1 = &(words[0]);
char *words_ptr2 = words;
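A small sketch of that relationship, using a shorter string so the sizes are easy to check by hand. The function name array_vs_pointer is an assumption for this example. It also shows one place where the array and a pointer to it differ: sizeof.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical demo: an array name yields the address of its first element,
   but the array itself is still a distinct object with its own size. */
size_t array_vs_pointer(void)
{
    char words[] = "Sometimes";        /* 9 characters plus '\0' */
    char *words_ptr1 = &words[0];      /* explicit address of first element */
    char *words_ptr2 = words;          /* array converts to the same address */

    assert(words_ptr1 == words_ptr2);  /* same address both ways */
    assert(*words_ptr2 == 'S');

    /* sizeof words is the whole array (10 bytes); sizeof words_ptr1 would be
       the size of a pointer, which depends on the platform. */
    return sizeof words;
}
```

So the array can be used like a pointer in most expressions, yet it is not one: it cannot be reassigned, and sizeof reports the full array.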
There are answers here with very good and detailed information.
I will post another answer, perhaps targeting more straightly to the OP.
Rephrasing it a bit:
Why is
int *pt;
*pt = 606;
not ok (non working case), and
char *str = "Sometimes I feel like I'm going crazy.";
is ok (working case)?
Consider that:
char *str = "Sometimes I feel like I'm going crazy.";
is equivalent to
char *str;
str = "Sometimes I feel like I'm going crazy.";
The closest "analogous", working case for int is (using a compound literal instead of a string literal)
int *pt = (int[]){ 686, 687 };
or
int *pt;
pt = (int[]){ 686, 687 };
So, the differences with your non-working case are three-fold:
Use pt = ... instead of *pt = ...
Use a compound literal, not a value (by the same token, str = 'a' wouldn't work).
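Compound literals are not always guaranteed to work, since the lifetime of their storage depends on the standard/implementation.
In fact, their use as above may give the compilation error "taking address of temporary array" (for example when the code is compiled as C++, where compound literals are only a compiler extension).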
A string variable can be declared either as an array of characters char txt[] or using a character pointer char* txt. The following illustrates the declaration and initialization of a string:
char* txt = "Hello";
In fact, as illustrated above, txt is a pointer to the first character of the string literal.
Whether we are able to modify (read/write) a string variable or not, depends on how we declared it.
6.4.5 String literals (ISO)
6. It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.
Actually, if we declare a string txt like we previously did, the compiler will place the string literal in a read-only data section such as .rodata (platform dependent) even if txt is not declared as const char*. So we cannot modify it. Actually, we should not even try to modify it. In this case gcc can issue warnings (-Wwrite-strings) or even fail the build when combined with -Werror. It is therefore better to declare such string variables as pointers to const:
const char* txt = "Hello";
On the other hand, we can declare a string variable as an array of characters:
char txt[] = "Hello";
In that case, the compiler will arrange for the array to get initialized from the string literal, so you can modify it.
Note: An array of characters can be used as if it was a pointer to its first character. That's why we can use txt[0] or *txt syntax to access the first character. And we can even explicitly convert an array of characters to a pointer:
char txt[] = "Hello";
char* ptxt = (char*) txt;

Pointer and Memory from Stanford

I am reading article from Stanford CS library http://cslibrary.stanford.edu/102/
Bad Pointer Example
Code with the most common sort of pointer bug will look like the above correct code, but without the middle step where the pointers are assigned pointees. The bad code will compile fine, but at run-time, each dereference with a bad pointer will corrupt memory in some way. The program will crash sooner or later. It is up to the programmer to ensure that each pointer is assigned a pointee before it is used. The following example shows a simple example of the bad code and a drawing of how memory is likely to react...
void BadPointer() {
int* p; // allocate the pointer, but not the pointee
*p = 42; // this dereference is a serious runtime error
}
// What happens at runtime when the bad pointer is dereferenced...
But I remember that char* should be defined like this
char *const name_ptr = "Test";
Given that, does everyone think this char* definition is also a bad pointer?
The line
char *const name_ptr = "Test";
is fine; you're initializing the pointer with the address of the string literal "Test", which is an array of char stored in such a way that the memory for it is allocated at program startup and held until the program terminates.
A quick digression on the const qualifier:
In C, declaration of the form
const T foo = expr;
or
T const foo = expr;
means that foo may not be written to; it's assigned the value of expr when it's created, and that value may not be changed for the rest of foo's lifetime (see note 1 below). With pointer variables, it gets a little more complicated:
const T *p = expr;
T const *p = expr;
both declare p as a non-const pointer to const data; IOW, you can change the value of p (p can point to different objects), but not the value of *p (you cannot change the value of what p points to).
T * const p = expr;
declares p as a const pointer to non-const data; you can change the value of what p points to (*p = ...), but you cannot change p to point to a different object.
const T * const p = expr;
T const * const p = expr;
both declare p as a const pointer to const data; you cannot change either the value of p or what p points to.
In C, string literals such as "Test" are stored as arrays of char, but attempting to modify the contents of a string literal is undefined behavior (depending on the platform, you may get an access violation). For safety's sake, it's usually a good idea to declare pointers to string literals as const char * or char const *, rather than char * const as in the example above.
As far as
void BadPointer() {
int* p; // allocate the pointer, but not the pointee
*p = 42; // this dereference is a serious runtime error
}
is concerned, p is an auto variable, which is not initialized to any particular value; it will contain a random bit string that may or may not correspond to a writable address. Because of this, the behavior of the statement *p = 42; is undefined - you may get an access violation, you may wind up overwriting something important and leave the program in a bad state, or it may appear to "work" with no issues (writing to some random memory area that is accessible and not important).
In general, it's impossible to tell whether a given pointer value is valid or invalid from the pointer value alone (see note 2 below). The one exception is the special pointer value NULL, which is a well-defined "nowhere" that's guaranteed to compare unequal to any valid pointer value. Pointer variables declared at file scope (outside of any function) or with the static qualifier are implicitly initialized to NULL. Non-static, block-scope pointer variables should always be explicitly initialized with either NULL or a valid address. This way you can easily check to see if the pointer has been assigned a valid value:
int *p = NULL;
...
if (p != NULL) // or simply if (p)
{
*p = 42;
}
else
{
// p was not assigned a valid memory location
}
1) Note that, in C, foo is not a compile-time constant; it's a regular run-time variable, you just cannot write to it. You cannot use it in a context that requires a compile-time constant.
2) If you're intimately familiar with your platform's memory model you can make some educated guesses, but even then it's not guaranteed.
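The NULL convention described above can be wrapped in a small helper. This is a sketch only: store_if_valid and store_demo are hypothetical names, and the check can only catch NULL, not other invalid addresses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: writes value through p only if p is not NULL.
   Returns true when the store happened. */
bool store_if_valid(int *p, int value)
{
    if (p == NULL)
        return false;   /* p was never given a pointee */
    *p = value;
    return true;
}

/* Usage sketch: one pointer left at NULL, one aimed at a real object. */
void store_demo(void)
{
    int target = 0;
    int *unset = NULL;        /* explicitly initialised, so the check is meaningful */
    int *ready = &target;

    bool rejected = store_if_valid(unset, 42);
    bool stored = store_if_valid(ready, 42);

    assert(!rejected);
    assert(stored && target == 42);
}
```

The check works only because unset was initialised to NULL. An uninitialised block-scope pointer would hold an indeterminate value and could pass the test while still being invalid.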
In the second case:
char *const name_ptr = "Test";
You are creating a string literal that is typically placed in read-only memory. Therefore you can have a legitimate pointer to it.
In the first case:
void BadPointer() {
int* p; // allocate the pointer, but not the pointee
*p = 42; // this dereference is a serious runtime error
}
you will get undefined behavior (UB).
char *const name_ptr means that name_ptr is a constant pointer to a char (it is the pointer which is constant).
You probably mean const char * name_ptr = "Test"
(name_ptr is a pointer to a character that is constant)
The thing is that "Test" is a string, which is an array of chars, stored somewhere in (probably) constant memory. Since the memory is allocated, then that is fine to initialise the pointer to point at it.
int *p; is an uninitialised pointer. It has some undefined value which might or might not resolve to a sensible memory location - odds are that it won't but you never know. Saying *p = 42; will overwrite that arbitrary memory location with 42, then all bets for your program are off.
In a case like this, it helps to remember that a pointer is nothing more than a normal variable that holds a value - the only "magic" part about it is that value represents a location in memory, and you can dereference that location to access what's stored there.
Imagine a bit of code like this:
void BadPrinter() {
int p;
printf("%d\n", p);
}
What would it print? Who knows? Maybe 0, maybe garbage, maybe the lyrics to "Come Sail Away" by Styx encoded as an integer.
Now we go back to your pointer code:
void BadPointer() {
int* p; // allocate the pointer, but not the pointee
*p = 42; // this dereference is a serious runtime error
}
p is uninitialized in the exact same way - it could contain anything. So when you do *p, you're asking the compiler to give you access to whatever memory is represented by the number contained in p.
So if p happens to contain 0, you're now trying to stuff the value 42 into the memory location 0x0: your program will probably crash. If p happens to contain a location in writable memory, your program will probably continue merrily along, since you will be allowed to store 42 at that location.
Now this case is a little different:
char *const name_ptr = "Test";
Here you're asking the compiler to allocate enough memory space to store the string "Test" and store the location of that memory in name_ptr. Going back to our first example, it would be analogous to:
void GoodPrinter() {
int p = 4;
printf("%d\n", p);
}

Why is this C code involving constant pointer crashing on me? [duplicate]

Here's my code:
char *const p1 = "john";
p1[2] = 'z'; //crashes
printf("%s\n", p1);
I know p1 is a "read-only" variable, but I thought I could still modify the string ("john" ). I appreciate any tips or advice.
You cannot safely modify string literals, even if the pointer doesn't look const. They will often be allocated in read-only memory, hence your crashes - and when they're not in read-only memory modifying them can have unexpected consequences.
If you copy to an array this should work:
char tmp[] = "john";
char *const p1 = tmp;
p1[2] = 'z'; // ok
The keyword const means that a variable of that type should stay constant. You shouldn't change it.
Also even if you declare this string as char *p1 = "john"; it would be constant string literal and changing it would cause undefined behaviour. You should declare it as char p1[] = "john"; in order to achieve behaviour you are looking for.
1) Forget the "const" qualifier. This is WRONG, in C and C++, on ANY platform:
char *p1 = "john";
p1[2] = 'z'; //crashes
Here's why:
Why do I get a segmentation fault when writing to a string initialized with "char *s" but not "char s[]"?
http://c-faq.com/decl/strlitinit.html
2) So how do you get around the access violation?
Simple: you allocate writable memory (instead of writing to a string constant, which is probably allocated in read-only memory):
#define BUFSIZE 80
...
char p1[BUFSIZE];
strcpy (p1, "john");
p1[2] = 'z'; //no problem
3) OK: so then what's the deal with "const"?
You were right about that part. Here's (one of many) discussions about the (subtle) difference between a "const pointer" and a "pointer to a const":
http://www.codeguru.com/cpp/cpp/cpp_mfc/general/article.php/c6967
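As a follow-up to point 2 above, here is a sketch of the copy-then-modify approach with a length check, since a plain strcpy would overflow the buffer if the source were longer than BUFSIZE - 1 characters. The helper name copy_and_patch and its return convention are assumptions made for this example.

```c
#include <assert.h>
#include <string.h>

#define BUFSIZE 80

/* Hypothetical helper: copies src into dst (capacity dstsize bytes) and
   replaces the third character with 'z'.
   Returns 0 on success, -1 if src does not fit. */
int copy_and_patch(char *dst, size_t dstsize, const char *src)
{
    assert(dst != NULL && src != NULL);

    size_t len = strlen(src);
    if (len >= dstsize)
        return -1;               /* no room for the string plus '\0' */

    memcpy(dst, src, len + 1);   /* writable copy, terminator included */

    if (len > 2)
        dst[2] = 'z';            /* safe: dst is ordinary writable memory */
    return 0;
}
```

With char p1[BUFSIZE]; the call copy_and_patch(p1, sizeof p1, "john") leaves "jozn" in p1. The literal "john" is only read, never written.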
Your pointer points to memory that isn't allowed to be changed (the string literal).
Instead, do char p1[100] = "john";
This is because you are not supposed to modify constants: after all, they are called constants for a reason. Modifying a constant is undefined behavior according to the C standard, which often means that your program is going to crash.
Note that it has nothing to do with your pointer being constant: the crash is because what your pointer points to is a string constant.
Here is how to do what you are trying to do legally:
char p1[] = "john";
p1[2] = 'z'; //no longer crashes :)
printf("%s\n", p1);
There are two, unrelated, problems in that code. First, the const is not in the right place.
char * const p = "john";
char const * p = "john";
const char * p = "john";
The latter two are pointers to unmodifiable strings. If you had done this, then the code would not have compiled.
The first option char * const is not really a read-only variable. It means a pointer which points to modifiable data, and that the pointer cannot be changed such that it points to another string. But that's not relevant to your problem.
Your const is not relevant here. You have attempted to modify a string that shouldn't be modified. Literal strings should never be modified, and this is the cause of your crash.
p1 is a constant pointer to non-const data, not a pointer to constant data, so the compiler accepts the write. However, it points to a string literal, which typically resides in a read-only region alongside the program's code, and modern operating systems and processors usually protect such regions against modification.
You should not be surprised that it crashes, but in general the behaviour is undefined.

Resources