Proper way to initialize a string in C - c

I've seen people's code as:
char *str = NULL;
and I've seen this is as well,
char *str;
I'm wonder, what is the proper way of initializing a string? and when are you supposed to initialize a string w/ and w/out NULL?

You're supposed to set it before using it. That's the only rule you have to follow to avoid undefined behaviour. Whether you initialise it at creation time or assign to it just before using it is not relevant.
Personally speaking, I prefer to never have variables set to unknown values myself so I'll usually do the first one unless it's set in close proximity (within a few lines).
In fact, with C99, where you don't have to declare locals at the tops of blocks any more, I'll generally defer creating it until it's needed, at which point it can be initialised as well.
Note that variables are given default values under certain circumstances (for example, if they're static storage duration such as being declared at file level, outside any function).
Local variables do not have this guarantee. So, if your second declaration above (char *str;) is inside a function, it may have rubbish in it and attempting to use it will invoke the afore-mentioned, dreaded, undefined behaviour.
The relevant part of the C99 standard 6.7.8/10:
If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static storage duration is not initialized explicitly, then:
if it has pointer type, it is initialized to a null pointer;
if it has arithmetic type, it is initialized to (positive or unsigned) zero;
if it is an aggregate, every member is initialized (recursively) according to these rules;
if it is a union, the first named member is initialized (recursively) according to these rules.

I'm wonder, what is the proper way of initializing a string?
Well, since the second snippet defines an uninitialized pointer to string, I'd say the first one. :)
In general, if you want to play it safe, it's good to initialize to NULL all pointers; in this way, it's easy to spot problems derived from uninitialized pointers, since dereferencing a NULL pointer will yield a crash (actually, as far as the standard is concerned, it's undefined behavior, but on every machine I've seen it's a crash).
However, you should not confuse a NULL pointer to string with an empty string: a NULL pointer to string means that that pointer points to nothing, while an empty string is a "real", zero-length string (i.e. it contains just a NUL character).
char * str=NULL; /* NULL pointer to string - there's no string, just a pointer */
const char * str2 = ""; /* Pointer to a constant empty string */
char str3[] = "random text to reach 15 characters ;)"; /* String allocated (presumably on the stack) that contains some text */
*str3 = 0; /* str3 is emptied by putting a NUL in first position */

this is a general question about c variables not just char ptrs.
It is considered best practice to initialize a variable at the point of declaration. ie
char *str = NULL;
is a Good Thing. THis way you never have variables with unknown values. For example if later in your code you do
if(str != NULL)
doBar(str);
What will happen. str is in an unknown (and almost certainly not NULL) state
Note that static variables will be initialized to zero / NULL for you. Its not clear from the question if you are asking about locals or statics

Global variables are initialized with default values by a compiler, but local variables must be initialized.

an unitialized pointer should be considered as undefined so to avoid generating errors by using an undefined value it's always better to use
char *str = NULL;
also because
char *str;
this will be just an unallocated pointer to somewhere that will mostly cause problems when used if you forget to allocate it, you will need to allocate it ANYWAY (or copy another pointer).
This means that you can choose:
if you know that you will allocate it shortly after its declaration you can avoid setting it as NULL (this is a sort of rule to thumb)
in any other case, if you want to be sure, just do it. The only real problem occurs if you try to use it without having initialized it.

It depends entirely on how you're going to use it. In the following, it makes more sense not to initialize the variable:
int count;
while ((count = function()) > 0)
{
}

Don't initialise all your pointer variables to NULL on declaration "just in case".
The compiler will warn you if you try to use a pointer variable that has not been initialised, except when you pass it by address to a function (and you usually do that in order to give it a value).
Initialising a pointer to NULL is not the same as initialising it to a sensible value, and initialising it to NULL just disables the compiler's ability to tell you that you haven't initialised it to a sensible value.
Only initialise pointers to NULL on declaration if you get a compiler warning if you don't, or you are passing them by address to a function that expects them to be NULL.
If you can't see both the declaration of a pointer variable and the point at which it is first given a value in the same screen-full, your function is too big.

static const char str[] = "str";
or
static char str[] = "str";

Because free() doesn't do anything if you pass it a NULL value you can simplify your program like this:
char *str = NULL;
if ( somethingorother() )
{
str = malloc ( 100 );
if ( NULL == str )
goto error;
}
...
error:
cleanup();
free ( str );
If for some reason somethingorother() returns 0, if you haven't initialized str you will
free some random address anywhere possibly causing a failure.
I apologize for the use of goto, I know some finds it offensive. :)

Your first snippet is a variable definition with initialization; the second snippet is a variable definition without initialization.
The proper way to initialize a string is to provide an initializer when you define it. Initializing it to NULL or something else depends on what you want to do with it.
Also be aware of what you call "string". C has no such type: usually "string" in a C context means "array of [some number of] char". You have pointers to char in the snippets above.
Assume you have a program that wants the username in argv[1] and copies it to the string "name". When you define the name variable you can keep it uninitialized, or initialize it to NULL (if it's a pointer to char), or initialize with a default name.
int main(int argc, char **argv) {
char name_uninit[100];
char *name_ptr = NULL;
char name_default[100] = "anonymous";
if (argc > 1) {
strcpy(name_uninit, argv[1]); /* beware buffer overflow */
name_ptr = argv[1];
strcpy(name_default, argv[1]); /* beware buffer overflow */
}
/* ... */
/* name_uninit may be unusable (and untestable) if there were no command line parameters */
/* name_ptr may be NULL, but you can test for NULL */
/* name_default is a definite name */
}

By proper you mean bug free? well, it depends on the situation. But there are some rules of thumb I can recommend.
Firstly, note that strings in C are not like strings in other languages.
They are pointers to a block of characters. The end of which is terminated with a 0 byte or NULL terminator. hence null terminated string.
So for example, if you're going to do something like this:
char* str;
gets(str);
or interact with str in any way, then it's a monumental bug. The reason is because as I have just said, in C strings are not strings like other languages. They are just pointers. char* str is the size of a pointer and will always be.
Therefore, what you need to do is allocate some memory to hold a string.
/* this allocates 100 characters for a string
(including the null), remember to free it with free() */
char* str = (char*)malloc(100);
str[0] = 0;
/* so does this, automatically freed when it goes out of scope */
char str[100] = "";
However, sometimes all you need is a pointer.
e.g.
/* This declares the string (not intialized) */
char* str;
/* use the string from earlier and assign the allocated/copied
buffer to our variable */
str = strdup(other_string);
In general, it really depends on how you expect to use the string pointer.
My recommendation is to either use the fixed size array form if you're only going to be using it in the scope of that function and the string is relatively small. Or initialize it to NULL. Then you can explicitly test for NULL string which is useful when it's passed into a function.
Beware that using the array form can also be a problem if you use a function that simply checks for NULL as to where the end of the string is. e.g. strcpy or strcat functions don't care how big your buffer is. Therefore consider using an alternative like BSD's strlcpy & strlcat. Or strcpy_s & strcat_s (windows).
Many functions expect you to pass in a proper address as well. So again, be aware that
char* str = NULL;
strcmp(str, "Hello World");
will crash big time because strcmp doesn't like having NULL passed in.
You have tagged this as C, but if anyone is using C++ and reads this question then switch to using std::string where possible and use the .c_str() member function on the string where you need to interact with an API that requires a standard null terminated c string.

Related

Non static pointer variable declared inside function seems to be preserved even after function returns

This question is not an exact duplicate of this. The most popular answer here says that it was not guaranteed that the memory location would be preserved but by accident it did. Where I got this piece of code from, it says clearly that the string variable will be preserved.
Consider the following code:
#include<stdio.h>
char *getString()
{
char *str = "abc";
return str;
}
int main()
{
printf("%s", getString());
getchar();
return 0;
}
This code compiles and runs without any errors.
The pointer char* str is not defined as a static variable. Yet the variable seems to be preserved even after the function returns.
Please explain this behavior.
I do not understand why this question got a negative vote even though it's not answered yet or marked as duplicate.
char *getString() ... char *str = "abc";
You are making some confusion. getString() creates a variable in the stack, and makes it point to a literal string. That literal can be anywhere, and will stay always there; depending on the compiler, it can be in RAM initialized at startup, or it can be in read-only memory. OK: If it is in RAM, perhaps you can even modify it, but not in normal (legal) ways - we should ignore this and think at the literal as immutable data.
The point above is THE point. It is true that the string "is preserved". But it is not true that an automatic variable is preserved - it can happen, but you should not rely on that, it is an error if you do it. By definition, an automatic variable gets destroyed when the function returns, and be sure that it happens.
That said, I don't see how you can say that the variable is preserved. There is no mechanism in C to peek at local variables (outside the current scope); you can do it with a debugger, especially when the active frame is that of getString(); maybe some debugger lets you to look where you shouldn't, but this is another matter.
EDIT after the comment
Many compilers create auto (local) variables in the stack. When the function returns, the data on the stack remains untouched, because clearing/destroying the stack is made by simply moving the stack pointer elsewhere. So the affirmation "the variable seems to be preserved" is correct. It seems because actually the variable is still there. It seems because there is no legal way to use it. But even in this situation, where the variable is still there but hidden, the compiler can decide to use that room for something else, or an interrupt can arrive and use that same room. In other words, while the auto variable is in scope it is guaranteed to be "preserved", but when it falls out of scope it must be considered gone.
There are situation where one can write code that refers to variables fallen out of scope: these are errors that the compiler should (and sometimes can) detect.
I said that often auto variables go in the stack. This is not always true, it depends on architecture and compiler, but all the rest is true and it is dictated by the language rules.
The address of str may change but the thing it points to (once initialised) will not.
Note that in this simple example you probably won't see changes. If you call getString a few times from different stack depths and show &str you will see what I mean.
char *getString()
{
char *str = "abc";
printf("%p", &str);
return str;
}
int main()
{
printf("%s", getString());
stackTest();
getchar();
return 0;
}
void stackTest()
{
char blob[200];
int x=0;
printf("%s", getString());
}
*(I have not tested this and my unused stack variables might get optimised away depending on your compiler & settings)
C 2011 (draft N1570) clause 6.4.5, paragraph 6 describes how a string literal in source code becomes a static object:
The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence.
Thus, in char *str = "abc";, str is initialized to point to a static array containing the characters a, b, and c, and a null character.
Then return str; returns the value of this pointer.
Finally, printf("%s", getString()); passes the value of the pointer to printf. At this point, the str object is gone (in the C model of computation). It no longer exists. We have the value it was used to hold, but the str object itself is gone. However, the value points to a static object, the array of characters, and printf reads that array and prints it.
So, the title of your question is incorrect. The non-static pointer variable declared inside the function was not preserved. All that was preserved was the static array of characters and its address.

Can we return value which is non pointer to the pointer function

char* xpx(char* src)
{
char result[sizeof(src)];
strcpy(result,src);
return result;
}
There are 2 bugs in the above code.
1) strcpy is passing the src as parameters but it is not legal as str
is a pointer.
I could not able to find the another one. Could you help me?
Look at strcpy docs:
char *strcpy( char *restrict dest, const char *restrict src );
Copies the null-terminated byte string pointed to by src, including
the null terminator, to the character array whose first element is
pointed to by dest. The behavior is undefined if the dest array is
not large enough. [...] (important but not relevant to the question so
omitted part)
So strcpy does take 2 pointers. That's fine. It's not a bug.
To find the bug pay attention to (and think about it, it's logical): dest array must be large enough. What does "large enough" mean here? Well since the functions copies the string from src, including null terminator to dest it means dest must be at least the length of the string in src + 1 for the null terminator. That means strlen(src) + 1.
sizeof(src) is the same as sizeof(int*) which is the size of a char pointer on the platform. The size of the pointer. Not what you want.
The next error is that the functions returns the address of an automatic storage object, aka the result array. This means that the array will cease to exist when the function exits and thus the function returns a pointer to an object that is no longer valid. A solution to this would be to use malloc to allocate the array. Another is to change the signature of xpx to something similar to strcpy where the destination array is supplied.
So summing them you need something along this:
char* result = malloc(strlen(src) + 1);
Another bug (yes, technically it's not a bug, but semantically it's a bug in my opinion) is that src should be of const char* type.
Another source of potential problems and bugs is if the input is not well-behaved, e.g. if src is not null-terminated, but it is debatable how much responsibility this function should carry regarding this.
not legal as str is a pointer
Not sure what you mean by that, but passing pointers to functions is perfectly fine.
As for "bugs" in the function, it is hard to say if there are any, given that you have not provided what is the expected behavior of the function. The function is legal C, although quite useless and dangerous to use (and will trigger warnings from common compilers):
Do not return addresses to local variables, since they do not exist anymore after your return, so you shall not access them.
If you are passing a null-terminated string into a function and you need to get a buffer of its size; then you have to find out its length at runtime using something like strlen() and then allocate the memory on the heap with malloc() (or using VLAs on C99 and later).

When to allocate memory to char *

I am bit confused when to allocate memory to a char * and when to point it to a const string.
Yes, I understand that if I wish to modify the string, I need to allocate it memory.
But in cases when I don't wish to modify the string to which I point and just need to pass the value should I just do the below? What are the disadvantages in the below steps as compared to allocating memory with malloc?
char *str = NULL;
str = "This is a test";
str = "Now I am pointing here";
Let's try again your example with the -Wwrite-strings compiler warning flag, you will see a warning:
warning: initialization discards 'const' qualifier from pointer target type
This is because the type of "This is a test" is const char *, not char *. So you are losing the constness information when you assign the literal address to the pointer.
For historical reasons, compilers will allow you to store string literals which are constants in non-const variables.
This is, however, a bad behavior and I suggest you to use -Wwrite-strings all the time.
If you want to prove it for yourself, try to modify the string:
char *str = "foo";
str[0] = 'a';
This program behavior is undefined but you may see a segmentation fault on many systems.
Running this example with Valgrind, you will see the following:
Process terminating with default action of signal 11 (SIGSEGV)
Bad permissions for mapped region at address 0x4005E4
The problem is that the binary generated by your compiler will store the string literals in a memory location which is read-only. By trying to write in it you cause a segmentation fault.
What is important to understand is that you are dealing here with two different systems:
The C typing system which is something to help you to write correct code and can be easily "muted" (by casting, etc.)
The Kernel memory page permissions which are here to protect your system and which shall always be honored.
Again, for historical reasons, this is a point where 1. and 2. do not agree. Or to be more clear, 1. is much more permissive than 2. (resulting in your program being killed by the kernel).
So don't be fooled by the compiler, the string literals you are declaring are really constant and you cannot do anything about it!
Considering your pointer str read and write is OK.
However, to write correct code, it should be a const char * and not a char *. With the following change, your example is a valid piece of C:
const char *str = "some string";
str = "some other string";
(const char * pointer to a const string)
In this case, the compiler does not emit any warning. What you write and what will be in memory once the code is executed will match.
Note: A const pointer to a const string being const char *const:
const char *const str = "foo";
The rule of thumb is: always be as constant as possible.
If you need to modify the string, use dynamic allocation (malloc() or better, some higher level string manipulation function such as strdup, etc. from the libc), if you don't need to, use a string literal.
If you know that str will always be read-only, why not declare it as such?
char const * str = NULL;
/* OR */
const char * str = NULL;
Well, actually there is one reason why this may be difficult - when you are passing the string to a read-only function that does not declare itself as such. Suppose you are using an external library that declares this function:
int countLettersInString(char c, char * str);
/* returns the number of times `c` occurs in `str`, or -1 if `str` is NULL. */
This function is well-documented and you know that it will not attempt to change the string str - but if you call it with a constant string, your compiler might give you a warning! You know there is nothing dangerous about it, but your compiler does not.
Why? Because as far as the compiler is concerned, maybe this function does try to modify the contents of the string, which would cause your program to crash. Maybe you rely very heavily on this library and there are lots of functions that all behave like this. Then maybe it's easier not to declare the string as const in the first place - but then it's all up to you to make sure you don't try to modify it.
On the other hand, if you are the one writing the countLettersInString function, then simply make sure the compiler knows you won't modify the string by declaring it with const:
int countLettersInString(char c, char const * str);
That way it will accept both constant and non-constant strings without issue.
One disadvantage of using string-literals is that they have length restrictions.
So you should keep in mind from the document ISO/IEC:9899
(emphasis mine)
5.2.4.1 Translation limits
1 The implementation shall be able to translate and execute at least one program that contains at least one instance of every one of the following limits:
[...]
— 4095 characters in a character string literal or wide string literal (after concatenation)
So If your constant text exceeds this count (What some times throughout may be possible, especially if you write a dynamic webserver in C) you are forbidden to use the string literal approach if you want to stay system independent.
There is no problem in your code as long as you are not planing to modify the contents of that string. Also, the memory for such string literals will remain for the full life time of the program. The memory allocated by malloc is read-write, so you can manipulate the contents of that memory.
If you have a string literal that you do not want to modify, what you are doing is ok:
char *str = NULL;
str = "This is a test";
str = "Now I am pointing here";
Here str a pointer has a memory which it points to. In second line you write to that memory "This is a test" and then again in 3 line you write in that memory "Now I am pointing here". This is legal in C.
You may find it a bit contradicting but you can't modify string that is something like this -
str[0]='X' // will give a problem.
However, if you want to be able to modify it, use it as a buffer to hold a line of input and so on, use malloc:
char *str=malloc(BUFSIZE); // BUFSIZE size what you want to allocate
free(str); // freeing memory
Use malloc() when you don't know the amount of memory needed during compile time.
It is legal in C unfortunately, but any attempt to modify the string literal via the pointer will result in undefined behavior.
Say
str[0] = 'Y'; //No compiler error, undefined behavior
It will run fine, but you may get a warning by the compiler, because you are pointing to a constant string.
P.S.: It will run OK only when you are not modifying it. So the only disadvantage of not using malloc is that you won't be able to modify it.

C Duration of strings, constants, compound literals, and why not, the code itself

I didn't remember where I read, that If I pass a string to a function like.
char *string;
string = func ("heyapple!");
char *func (char *string) {
char *p
p = string;
return p;
}
printf ("%s\n", string);
The string pointer continue to be valid because the "heyapple!" is in memory, it IS in the code the I wrote, so it never will be take off, right?
And about constants like 1, 2.10, 'a'?
And compound literals?
like If I do it:
func (1, 'a', "string");
Only the string will be all of my program execution, or the constans will be too?
For example I learned that I can take the address of string doing it
&"string";
Can I take the address of the constants literals? like 1, 2.10, 'a'?
I'm passing theses to functions arguments and it need to have static duration like strings without the word static.
Thanks a lot.
This doesn't make a whole lot of sense.
Values that are not pointers cannot be "freed", they are values, they can't go away.
If I do:
int c = 1;
The variable 'c' is not a pointer, it cannot do anything else than contain an integer value, to be more specific it can't NOT contain an integer value. That's all it does, there are no alternatives.
In practice, the literals will be compiled into the generated machine-code, so that somewhere in the code resulting from the above will be something like
load r0, 1
Or whatever the assembler for the underlying instruction set looks like. The '1' is a part of the instruction encoding, it can't go away.
Make sure you distinguish between values and pointers to memory. Pointers are themselves values, but a special kind of value that contains an address to memory.
With char* hello = "hello";, there are two things happening:
the string "hello" and a null-terminator are written somewhere in memory
a variable named hello contains a value which is the address to that memory
With int i = 0; only one thing happens:
a variable named i contains the value 0
When you pass around variables to functions their values are always copied. This is called pass by value and works fine for primitive types like int, double, etc. With pointers this is tricky because only the address is copied; you have to make sure that the contents of that address remain valid.
Short answer: yes. 1 and 'a' stick around due to pass by value semantics and "hello" sticks around due to string literal allocation.
Stuff like 1, 'a', and "heyapple!" are called literals, and they get stored in the compiled code, and in memory for when they have to be used. If they remain or not in memory for the duration of the program depends on where they are declared in the program, their size, and the compiler's characteristics, but you can generally assume that yes, they are stored somewhere in memory, and that they don't go away.
Note that, depending on the compiler and OS, it may be possible to change the value of literals, inadvertently or purposely. Many systems store literals in read-only areas (CONST sections) of memory to avoid nasty and hard-to-debug accidents.
For literals that fit into a memory word, like ints and chars it doesn't matter how they are stored: one repeats the literal throughout the code and lets the compiler decide how to make it available. For larger literals, like strings and structures, it would be bad practice to repeat, so a reference should be kept.
Note that if you use macros (#define HELLO "Hello!") it is up to the compiler to decide how many copies of the literal to store, because macro expansion is exactly that, a substitution of macros for their expansion that happens before the compiler takes a shot at the source code. If you want to make sure that only one copy exists, then you must write something like:
#define HELLO "Hello!"
char* hello = HELLO;
Which is equivalent to:
char* hello = "Hello!";
Also note that a declaration like:
const char* hello = "Hello!";
Keeps hello immutable, but not necessarily the memory it points to, because of:
char h = (char) hello;
h[3] = 'n';
I don't know if this case is defined in the C reference, but I would not rely on it:
char* hello = "Hello!";
char* hello2 = "Hello!"; // is it the same memory?
It is better to think of literals as unique and constant, and treat them accordingly in the code.
If you do want to modify a copy of a literal, use arrays instead of pointers, so it's guaranteed a different copy of the literal (and not an alias) is used each time:
char hello[] = "Hello!";
Back to your original question, the memory for the literal "heyapple!" will be available (will be referenceable) as long as a reference is kept to it in the running code. Keeping a whole module (a loadable library) in memory because of a literal may have consequences on overall memory use, but that's another concern (you could also force the unloading of the module that defines the literal and get all kind of strange results).
First,it IS in the code the I wrote, so it never will be take off, right? my answer is yes. I recommend you to have a look at the structure of ELF or runtime structure of executable. The position that the string literal stored is implementation dependent, in gcc, string literal is store in the .rdata segment. As the name implies, the .rdata is read-only. In your code
char *p
p = string;
the pointer p now point to an address in a readonly segment, so even after the end of function call, that address is still valid. But if you try to return a pointer point to a local variable then it is dangerous and may cause hard-to-find bugs:
int *func () {
int localVal = 100;
int *ptr = localVal;
return p;
}
int val = func ();
printf ("%d\n", val);
after the execution of func, as the stack space of func is retrieve by the c runtime, the memory address where localVal was stored will no longer guarantee to hold the original localVal value. It can be overidden by operation following the func.
Back to your question title
-
string literal have static duration.
As for "And about constants like 1, 2.10, 'a'?"
my answer is NO, your can't get address of a integer literal using &1. You may be confused by the name 'integer constant', but 1,2.10,'a' is not right value ! They do not identify a memory place,thus, they don't have duration, a variable contain their value can have duration
compound literals, well, I am not sure about this.

C's strtok() and read only string literals

char *strtok(char *s1, const char *s2)
repeated calls to this function break string s1 into "tokens"--that is
the string is broken into substrings,
each terminating with a '\0', where
the '\0' replaces any characters
contained in string s2. The first call
uses the string to be tokenized as s1;
subsequent calls use NULL as the first
argument. A pointer to the beginning
of the current token is returned; NULL
is returned if there are no more
tokens.
Hi,
I have been trying to use strtok just now and found out that if I pass in a char* into s1, I get a segmentation fault. If I pass in a char[], strtok works fine.
Why is this?
I googled around and the reason seems to be something about how char* is read only and char[] is writeable. A more thorough explanation would be much appreciated.
What did you initialize the char * to?
If something like
char *text = "foobar";
then you have a pointer to some read-only characters
For
char text[7] = "foobar";
then you have a seven element array of characters that you can do what you like with.
strtok writes into the string you give it - overwriting the separator character with null and keeping a pointer to the rest of the string.
Hence, if you pass it a read-only string, it will attempt to write to it, and you get a segfault.
Also, becasue strtok keeps a reference to the rest of the string, it's not reeentrant - you can use it only on one string at a time. It's best avoided, really - consider strsep(3) instead - see, for example, here: http://www.rt.com/man/strsep.3.html (although that still writes into the string so has the same read-only/segfault issue)
An important point that's inferred but not stated explicitly:
Based on your question, I'm guessing that you're fairly new to programming in C, so I'd like to explain a little more about your situation. Forgive me if I'm mistaken; C can be hard to learn mostly because of subtle misunderstanding in underlying mechanisms so I like to make things as plain as possible.
As you know, when you write out your C program the compiler pre-creates everything for you based on the syntax. When you declare a variable anywhere in your code, e.g.:
int x = 0;
The compiler reads this line of text and says to itself: OK, I need to replace all occurrences in the current code scope of x with a constant reference to a region of memory I've allocated to hold an integer.
When your program is run, this line leads to a new action: I need to set the region of memory that x references to int value 0.
Note the subtle difference here: the memory location that reference point x holds is constant (and cannot be changed). However, the value that x points can be changed. You do it in your code through assignment, e.g. x = 15;. Also note that the single line of code actually amounts to two separate commands to the compiler.
When you have a statement like:
char *name = "Tom";
The compiler's process is like this: OK, I need to replace all occurrences in the current code scope of name with a constant reference to a region of memory I've allocated to hold a char pointer value. And it does so.
But there's that second step, which amounts to this: I need to create a constant array of characters which holds the values 'T', 'o', 'm', and NULL. Then I need to replace the part of the code which says "Tom" with the memory address of that constant string.
When your program is run, the final step occurs: setting the pointer to char's value (which isn't constant) to the memory address of that automatically created string (which is constant).
So a char * is not read-only. Only a const char * is read-only. But your problem in this case isn't that char *s are read-only, it's that your pointer references a read-only regions of memory.
I bring all this up because understanding this issue is the barrier between you looking at the definition of that function from the library and understanding the issue yourself versus having to ask us. And I've somewhat simplified some of the details in the hopes of making the issue more understandable.
I hope this was helpful. ;)
I blame the C standard.
char *s = "abc";
could have been defined to give the same error as
const char *cs = "abc";
char *s = cs;
on grounds that string literals are unmodifiable. But it wasn't, it was defined to compile. Go figure. [Edit: Mike B has gone figured - "const" didn't exist at all in K&R C. ISO C, plus every version of C and C++ since, has wanted to be backward-compatible. So it has to be valid.]
If it had been defined to give an error, then you couldn't have got as far as the segfault, because strtok's first parameter is char*, so the compiler would have prevented you passing in the pointer generated from the literal.
It may be of interest that there was at one time a plan in C++ for this to be deprecated (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/1996/N0896.asc). But 12 years later I can't persuade either gcc or g++ to give me any kind of warning for assigning a literal to non-const char*, so it isn't all that loudly deprecated.
[Edit: aha: -Wwrite-strings, which isn't included in -Wall or -Wextra]
In brief:
char *s = "HAPPY DAY";
printf("\n %s ", s);
s = "NEW YEAR"; /* Valid */
printf("\n %s ", s);
s[0] = 'c'; /* Invalid */
If you look at your compiler documentation, odds are there is a option you can set to make those strings writable.

Resources