Consider the following code:
char* str = "Hello World";
memcpy(str, "Copy\0", 5);
A segmentation fault occurs during the memcpy. However, using this code:
char str[12];
memcpy(str, "Hello World\0", 12);
memcpy(str, "Copy\0", 5);
The program does not produce a segmentation fault.
Does the problem arise from allocating the memory on the stack versus the data section?
When you use a string literal in gcc the value is placed in read-only memory and cannot be modified. Trying to modify it leads to undefined behaviour. Usually you will get a segmentation fault on Linux when you try to do this.
The second example works because you aren't modifying the string literal; you are modifying a copy of it that is stored in a variable that is not read-only.
char* str = "Hello World";
and
char str[12];
are two very different things. The first allocates a pointer on the stack and an array in read-only memory (often described as part of the "code segment"); the pointer then points at the array. The second allocates the entire array on the stack, and there is no pointer.
I have a global pointer to char arrays defined (on the stack I believe?) as:
char *history[BUFFER_SIZE];
And inside a method I simply want to:
strncpy(history[0], str, length);
and it seg faults. It doesn't make sense to me since:
history[0] = "a string"
doesn't seg fault.
My questions:
Since I am defining the array of char arrays like this, I shouldn't have to do any sort of malloc or initialization, correct?
Why is it seg faulting?
char *history[BUFFER_SIZE]; is an array of char*s that point to nowhere (as a global it has static storage rather than stack storage, so every element starts out as a null pointer). When you try to strncpy to those pointers, you invoke undefined behavior, and you're seeing this manifested as a segfault.
When you write history[0] = "a string", this assigns the char* at history[0], so history[0] no longer points to nowhere; it points to "a string". "a string" is a string literal, stored elsewhere in your program, most likely in a read-only section. history[0] does not actually contain the data "a string"; it simply contains the address of where "a string" resides.
Since I am defining the array of char arrays like this, I shouldn't
have to do any sort of malloc or initialization, correct?
That depends on what you want to do. It's perfectly fine to do history[0] = "a string", just know that trying to modify that string is also undefined behavior, since it is a string literal. If you want to copy the string literal to a section of memory where you can freely modify the copy, you will have to allocate some memory with malloc or similar. But char *history[BUFFER_SIZE]; isn't defining an "array of char arrays", it's defining an array of char pointers.
history is an array of pointers; you cannot copy a whole string into a pointer, which is likely only 32 or 64 bits in size.
You must in fact allocate memory to be associated with the pointer, whether it be on the stack or heap. In the second example you gave, the compiler had already allocated storage for the string literal before its address was assigned to the pointer.
I am sorry, I might be asking a dumb question, but I want to understand: is there any difference between the assignments below? strcpy works in the first case but not in the second case.
char *str1;
*str1 = "Hello";
char *str2 = "World";
strcpy(str1,str2); //Works as expected
char *str1 = "Hello";
char *str2 = "World";
strcpy(str1,str2); //SEGMENTATION FAULT
How does the compiler understand each assignment? Please clarify.
Edit: In the first snippet you wrote *str1 = "Hello", which is equivalent to assigning to str1[0]. That is obviously wrong, because str1 is uninitialized and therefore is an invalid pointer. If we assume that you meant str1 = "Hello", then you are still wrong:
According to the C standard, attempting to modify a string literal results in undefined behavior: literals may be stored in read-only storage (such as .rodata) or combined with other string literals. So both snippets that you provided will yield undefined behavior.
I can only guess that in the second snippet the compiler is storing the string in some read-only storage, while in the first one it doesn't, so it works, but it's not guaranteed.
Sorry, both examples are very wrong and lead to undefined behaviour, that might or might not crash. Let me try to explain why:
str1 is an uninitialized pointer. That means str1 points to some arbitrary place in your memory, and writing through str1 can have arbitrary consequences: for example a crash, or overwriting some data in memory (e.g. other local variables, variables in other functions; everything is possible).
The line *str1 = "Hello"; is also wrong (even if str1 were a valid pointer), as *str1 has type char (not char *): it is the first character of whatever str1 points to. However, you assign it a pointer ("Hello", type char *), which is a type error that your compiler will tell you about.
In the second snippet, str1 is a valid pointer but presumably points to read-only memory (hence the crash). Normally, constant strings are stored in read-only data in the binary. You cannot write to them, but that's exactly what you do in strcpy(str1,str2);.
A more correct example of what you want to achieve might be (with an array on the stack):
#define STR1_LEN 128
char str1[STR1_LEN] = "Hello"; /* array with space for 128 characters */
char *str2 = "World";
strncpy(str1, str2, STR1_LEN);
str1[STR1_LEN - 1] = 0; /* be sure to terminate str1 */
Other option (with dynamically managed memory):
#define STR1_LEN 128
char *str1 = malloc(STR1_LEN); /* allocate dynamic memory for str1 */
char *str2 = "World";
/* we should check here that str1 is not NULL, which would mean 'out of memory' */
strncpy(str1, str2, STR1_LEN);
str1[STR1_LEN - 1] = 0; /* be sure to terminate str1 */
free(str1); /* free the memory for str1 */
str1 = NULL;
EDIT: @chqrlie requested in the comments that the #define should be named STR1_SIZE, not STR1_LEN, presumably to reduce confusion, because it's not the length in characters of the "string" but the size of the buffer allocated. Furthermore, @chqrlie requested not to give examples with the strncpy function. That wasn't really my choice, as the OP used strcpy, which is very dangerous, so I picked the closest function that can be used correctly. But yes, I should probably have added that the use of strcpy, strncpy, and similar functions is not recommended.
There seems to be some confusion here. Both fragments invoke undefined behaviour. Let me explain why:
char *str1; defines a pointer to characters, but it is uninitialized. If this definition occurs in the body of a function, its value is invalid. If this definition occurs at the global level, it is initialized to NULL.
*str1 = "Hello"; is an error: you are assigning a string pointer to the character pointed to by str1. str1 is uninitialized, so it does not point to anything valid, and you cannot assign a pointer to a character. You should have written str1 = "Hello";. Furthermore, the string "Hello" is constant, so the definition of str1 really should be const char *str1;.
char *str2 = "World"; Here you define a pointer to a constant string "World". This statement is correct, but it would be better to define str2 as const char *str2 = "World"; for the same reason as above.
strcpy(str1,str2); //Works as expected NO it does not work at all! str1 does not point to a char array large enough to hold a copy of the string "World" including the final '\0'. Given the circumstances, this code invokes undefined behaviour, which may or may not cause a crash.
You mention the code works as expected: it only does so in appearance. What really happens is this: str1 is uninitialized. If it pointed to an area of memory that cannot be written, writing to it would likely have crashed the program with a segmentation fault. But if it happens to point to an area of memory where you can write, the statement *str1 = "Hello"; will modify the first byte of this area, and then strcpy(str1, "World"); will modify the first 6 bytes at that place. The string pointed to by str1 will then be "World", as expected, but you have overwritten some area of memory that may be used for other purposes. Your program may consequently crash later in unexpected ways: a very hard to find bug! This is definitely undefined behaviour.
The second fragment invokes undefined behaviour for a different reason:
char *str1 = "Hello"; No problem, but should be const.
char *str2 = "World"; OK too, but should also be const.
strcpy(str1,str2); //SEGMENTATION FAULT of course it is invalid: you are trying to overwrite the constant character string "Hello" with the characters from the string "World". It would work if the string constant were stored in modifiable memory, and would cause even greater confusion later in the program as the value of the string constant was changed. Luckily, most modern environments prevent this by storing string constants in read-only memory. Trying to modify said memory causes a segment violation, i.e. you are accessing the data segment of memory in a faulty way.
You should use strcpy() only to copy strings to character arrays you define as char buffer[SOME_SIZE]; or allocate as char *buffer = malloc(SOME_SIZE); with SOME_SIZE large enough to hold what you are trying to copy plus the final '\0'
Both code fragments are wrong, even if "it works" in your first case. Hopefully this is only an academic question! :)
First let's look at *str1 which you are trying to modify.
char *str1;
This declares an uninitialized pointer, that is, a pointer holding some unspecified address in memory. Here the program is simple and nothing important is at stake, but you could have modified very critical data here!
char *str = "Hello";
This declares a pointer which will point to a protected section of memory that even the program itself cannot change during execution; attempting to write there is what triggers the segmentation fault.
To use strcpy(), the first parameter should point to writable memory that is large enough, such as a char array or a buffer allocated with malloc(). In fact, prefer a bounded copy such as strncpy() over strcpy(), but remember that strncpy() does not add the terminating '\0' when the source fills the whole buffer.
I read this on wikipedia
int main(void)
{
char *s = "hello world";
*s = 'H';
}
When the program containing this code is compiled, the string "hello world" is placed in the section of the program executable file marked as read-only; when loaded, the operating system places it with other strings and constant data in a read-only segment of memory. When executed, a variable, s, is set to point to the string's location, and an attempt is made to write an H character through the variable into the memory, causing a segmentation fault.
I don't know why the string is placed in a read-only segment. Could someone please explain this?
String literals are stored in read-only memory, that's just how it works. Your code uses a pointer initialized to point at the memory where a string literal is stored, and thus you can't validly modify that memory.
To get a string in modifiable memory, do this:
char s[] = "hello world";
then you're fine, since now you're just using the constant string to initialize a non-constant array.
There is a big difference between:
char * s = "Hello world";
and
char s[] = "Hello world";
In the first case, s is a pointer to something that you can't change. It's stored in read-only memory (typically, in the code section of your application).
In the latter case, you allocate an array in read-write memory (typically plain RAM), that you can modify.
When you do: char *s = "hello world"; then s is a pointer that points to memory that is in the code part, so you can't change it.
When you do: char s[] = "Hello World"; then s is an array of chars that are on the stack, so you can change it.
If you don't want the string to be changed during the program, it is better to do: char const *s = ....;. Then, when you try to change the string, your program will not crash with a segmentation fault; it will raise a compiler error (which is much better).
First, get a good understanding of pointers. I will give you a short demo:
First let us analyze your code line by line. Let's start from main onwards.
char *s = "Some_string";
First of all, you are declaring a pointer to a char. s now holds the address of a string literal in memory, and C will kick you if you try to change the memory at that address. That's illegal, so you had better declare a character array, assign its address to s, and then make the change through s.
Hope you get it. For further reference and detailed understanding, refer to K. N. King, C Programming: A Modern Approach.
Per the language definition, string literals have to be stored in such a way that their lifetime extends over the lifetime of the program, and that they are visible over the entire program.
Exactly what this means in terms of where the string gets stored is up to the implementation; the language definition does not mandate that string literals are stored in read-only memory, and not all implementations do so. It only says that attempting to modify the contents of a string literal results in undefined behavior, meaning the implementation is free to do whatever it wants.
I wrote the following code:
int main()
{
char *str = "hello";
str[0] = 'H';
printf("%s\n", str);
}
This gives me a segmentation fault, and I can't understand why.
str is a pointer to char, not const char. Even if that's the case, shouldn't it give a compile error like the following program:
int main()
{
const char *str = "hello";
str[0] = 'H';
printf("%s\n", str);
}
It gives an error: assignment of read-only location *str.
EDIT
If my code places the pointer to a read only location, shouldn't I get a compilation error?
You assign a pointer to a constant string (which is stored as part of your program's text segment and is thus not writable memory).
Fix with char str[] = "hello"; this will create a read/write copy of the constant string on your stack.
What you do is a perfectly valid pointer assignment. What the compiler does not know is that in a standard system constant strings are placed in read-only memory. On embedded (or other weird) systems this may be different.
Depending on your system, you could call mprotect to change the VM flags on your pointer's destination to writable. So the compiler allows this code; your OS does not, though.
When you initialize a char * using a literal string, you shouldn't try to modify its contents: the variable is pointing to memory that doesn't belong to you.
You can use:
char str[] = "hello";
str[0] = 'H';
With this code you've declared an array which is initialized with a copy of the literal string's contents, and now you can modify the array.
Your code has undefined behavior at run time. You are attempting to write to a string literal, which is not allowed. Such writes may trigger an error or silently do something unexpected. Your specific C compiler has str point to read-only memory, and attempting to write to that memory leads to a segmentation fault. Even though the pointer is not const, the write is still not allowed.
char *str = "hello";
When you declare str as above, there is no guarantee about which part of memory the string will be stored in. It might be read-only, depending on the implementation, so trying to change it may cause a segmentation fault.
In order to avoid the segmentation fault, declare str as an array of characters instead.
char *str = "hello";
Here the string "hello" is a literal.
String literals are typically stored in read-only memory.
This is the reason you are getting a segmentation fault when you try to change the value in read-only memory.
Declaring str as char* reserves memory for the pointer, but not for the string.
The compiler can put the memory for "hello" anywhere it likes.
You have no guarantee that str[i] is writable, so that's why with some compilers this results in a seg fault.
If you want to make sure that the string is in writable memory, then you have to allocate memory using malloc() or you can use
char str[] = "hello";
#include <stdio.h>
int main() {
char *t = "hello world";
puts(t);
//printf("%s", t);
t = "goodbye world";
puts(t);
}
The memory for t isn't allocated, so why don't I get a segfault when I run it?
t is a pointer, so you are just making t point to another string.
Because string literals are allocated statically in your program memory - you do not need to allocate memory for them explicitly.
Memory is allocated for t; enough memory is allocated for it to hold a pointer (typically, 4 bytes in a 32-bit program, 8 bytes in a 64-bit program).
Further, the initialization for t ensures that the pointer points somewhere:
char *t = "hello world";
String literals are also allocated space, somewhere. Often, that is in the read-only portion of memory, so you should really be using const char *t = "hello world"; and even if you don't use the explicit const, you should not try to modify the string that t points at. But it is the compiler's problem to ensure that t is pointing somewhere valid.
Similarly, after the assignment:
t = "goodbye world";
the variable is pointing at space allocated by the compiler. As long as you don't abuse it (and your code doesn't), this is fine.
What would get you into trouble is something like this:
char *t;
puts(t); // t is uninitialized; undefined behaviour
t = 0; // equivalently, t = NULL;
puts(t); // t contains the null pointer; undefined behaviour
The uninitialized local variable could contain any value; you cannot predict reliably what will happen. On some machines, it may contain a null pointer and cause a crash, but that is not something you can rely on.
A null pointer doesn't point at anything valid, so dereferencing a null pointer leads to undefined behaviour, and very often that undefined behaviour is a crash. (Classically, on DEC VAX machines, you got a zero byte at address zero instead of a crash. That led (in part) to one of Henry Spencer's Ten Commandments "All the world is not a VAX" — and also "Thou shalt not follow the NULL pointer, for chaos and madness await thee at its end.")
So, in your program, memory is allocated for t and t is initialized and assigned to point to (read-only) string constants, so there is no excuse for the program to crash.
t is here a pointer to the first character of an anonymous string, which can be in read-only memory. A good idea is to declare the pointer as pointer to const char :
const char *t = "hello world";
All the memory the compiler needs to allocate for t is 4 bytes on a 32-bit system. Remember that it's just a pointer. In the first couple of lines it's pointing to "hello world", but after that you change it so it points to "goodbye world". The compiler will have allocated enough memory for the strings you have defined, and it gives you their addresses so you can point to them. You don't need to worry about that. Also remember that these strings are static and read-only, which means you can't safely say t[4] = 'b';.