Memcpy on stack memory [duplicate] - c

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
What is the difference between char a[] = “string” and char *p = “string”?
What is the difference between using memcpy on stack memory vs on heap memory?
The following code works on Tru64 but segfaults on LINUX
char * string2 = " ";
(void)memcpy((char *)(string2),(char *)("ALT=---,--"),(size_t)(10));
The second version works on LINUX
char * string2 = malloc(sizeof(char)*12);
(void)memcpy((char *)(string2),(char *)("ALT=---,--"),(size_t)(10));
Can someone explain the segfault on LINUX?

The first example has an Undefined Behavior. And so it might work correctly or not or show any random behavior.
Explanation:
The first example, declares a pointer string2to a string literal. String literals are stored in Implementation defined read only memory locations. A user program is not allowed to modify this memory. Any attempt to do so results in Undefined Behavior.
Reference:
C99 standard 6.4.5/5 "String Literals - Semantics":
In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence; for wide string literals, the array elements have type wchar_t, and are initialized with the sequence of wide characters...
It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.

In the first example, you must differ between the pointer and the actual string contents: Although the pointer (string2) is on the stack, the actual string bytes are not. There is a good change that they are in the constants area which is readonly, hence the segfault.

First of all, the behavior is undefined. From C99, 6.4.5/6:
It is unspecified whether these arrays are distinct provided their
elements have the appropriate values. If the program attempts to
modify such an array, the behavior is undefined.
In practical terms, what happens is that the OS has chosen to load the associated image section in read-only memory; hence the segfault when you try to write.

In the first example you are not in stack memory but in .rodata (read-only data) section. String literals have static storage duration and are not required to be modifiable.
void foo(void)
{
// string array has automatic storage duration
char string[] = " ";
// string2 pointer has automatic storage duration and points to
// a string literal that a static storage duration
char *string2 = " "; // string2 pointer has
}

In the first case the destination for the memcpy is a string literal, as pointed out by others.
in the second case: Don't cast unnecessary. Avoid magic constants. Sizeof(char) == 1.
#include <stdlib.h>
#include <string.h>
char * string2 = malloc(1+strlen("ALT=---,--"));
(void)memcpy(string2, "ALT=---,--"), 1+strlen("ALT=---,--") );
, which is equivalent to:
char * string2 = malloc(1+strlen("ALT=---,--"));
(void)strcpy(string2, "ALT=---,--") );
BTW: In the original, the constant '10' was too small; the terminating nul byte would not be copied, and the string would be unterminated.

You can try this:
char string2[] = " ";
(void)memcpy((char *)string2,(char *)("ALT=---,--"),(size_t)(10));

Change your first example to:
char string2 [] = " ";
(void)memcpy((char *)(string2),(char *)("ALT=---,--"),(size_t)(10));
and then you'll be comparing stack memory versus heap memory.

Related

Question about strings (as a variable from cs50) in C [duplicate]

The following code receives seg fault on line 2:
char *str = "string";
str[0] = 'z'; // could be also written as *str = 'z'
printf("%s\n", str);
While this works perfectly well:
char str[] = "string";
str[0] = 'z';
printf("%s\n", str);
Tested with MSVC and GCC.
See the C FAQ, Question 1.32
Q: What is the difference between these initializations?
char a[] = "string literal";
char *p = "string literal";
My program crashes if I try to assign a new value to p[i].
A: A string literal (the formal term
for a double-quoted string in C
source) can be used in two slightly
different ways:
As the initializer for an array of char, as in the declaration of char a[] , it specifies the initial values
of the characters in that array (and,
if necessary, its size).
Anywhere else, it turns into an unnamed, static array of characters,
and this unnamed array may be stored
in read-only memory, and which
therefore cannot necessarily be
modified. In an expression context,
the array is converted at once to a
pointer, as usual (see section 6), so
the second declaration initializes p
to point to the unnamed array's first
element.
Some compilers have a switch
controlling whether string literals
are writable or not (for compiling old
code), and some may have options to
cause string literals to be formally
treated as arrays of const char (for
better error catching).
Normally, string literals are stored in read-only memory when the program is run. This is to prevent you from accidentally changing a string constant. In your first example, "string" is stored in read-only memory and *str points to the first character. The segfault happens when you try to change the first character to 'z'.
In the second example, the string "string" is copied by the compiler from its read-only home to the str[] array. Then changing the first character is permitted. You can check this by printing the address of each:
printf("%p", str);
Also, printing the size of str in the second example will show you that the compiler has allocated 7 bytes for it:
printf("%d", sizeof(str));
Most of these answers are correct, but just to add a little more clarity...
The "read only memory" that people are referring to is the text segment in ASM terms. It's the same place in memory where the instructions are loaded. This is read-only for obvious reasons like security. When you create a char* initialized to a string, the string data is compiled into the text segment and the program initializes the pointer to point into the text segment. So if you try to change it, kaboom. Segfault.
When written as an array, the compiler places the initialized string data in the data segment instead, which is the same place that your global variables and such live. This memory is mutable, since there are no instructions in the data segment. This time when the compiler initializes the character array (which is still just a char*) it's pointing into the data segment rather than the text segment, which you can safely alter at run-time.
Why do I get a segmentation fault when writing to a string?
C99 N1256 draft
There are two different uses of character string literals:
Initialize char[]:
char c[] = "abc";
This is "more magic", and described at 6.7.8/14 "Initialization":
An array of character type may be initialized by a character string literal, optionally
enclosed in braces. Successive characters of the character string literal (including the
terminating null character if there is room or if the array is of unknown size) initialize the
elements of the array.
So this is just a shortcut for:
char c[] = {'a', 'b', 'c', '\0'};
Like any other regular array, c can be modified.
Everywhere else: it generates an:
unnamed
array of char What is the type of string literals in C and C++?
with static storage
that gives UB if modified
So when you write:
char *c = "abc";
This is similar to:
/* __unnamed is magic because modifying it gives UB. */
static char __unnamed[] = "abc";
char *c = __unnamed;
Note the implicit cast from char[] to char *, which is always legal.
Then if you modify c[0], you also modify __unnamed, which is UB.
This is documented at 6.4.5 "String literals":
5 In translation phase 7, a byte or code of value zero is appended to each multibyte
character sequence that results from a string literal or literals. The multibyte character
sequence is then used to initialize an array of static storage duration and length just
sufficient to contain the sequence. For character string literals, the array elements have
type char, and are initialized with the individual bytes of the multibyte character
sequence [...]
6 It is unspecified whether these arrays are distinct provided their elements have the
appropriate values. If the program attempts to modify such an array, the behavior is
undefined.
6.7.8/32 "Initialization" gives a direct example:
EXAMPLE 8: The declaration
char s[] = "abc", t[3] = "abc";
defines "plain" char array objects s and t whose elements are initialized with character string literals.
This declaration is identical to
char s[] = { 'a', 'b', 'c', '\0' },
t[] = { 'a', 'b', 'c' };
The contents of the arrays are modifiable. On the other hand, the declaration
char *p = "abc";
defines p with type "pointer to char" and initializes it to point to an object with type "array of char" with length 4 whose elements are initialized with a character string literal. If an attempt is made to use p to modify the contents of the array, the behavior is undefined.
GCC 4.8 x86-64 ELF implementation
Program:
#include <stdio.h>
int main(void) {
char *s = "abc";
printf("%s\n", s);
return 0;
}
Compile and decompile:
gcc -ggdb -std=c99 -c main.c
objdump -Sr main.o
Output contains:
char *s = "abc";
8: 48 c7 45 f8 00 00 00 movq $0x0,-0x8(%rbp)
f: 00
c: R_X86_64_32S .rodata
Conclusion: GCC stores char* it in .rodata section, not in .text.
If we do the same for char[]:
char s[] = "abc";
we obtain:
17: c7 45 f0 61 62 63 00 movl $0x636261,-0x10(%rbp)
so it gets stored in the stack (relative to %rbp).
Note however that the default linker script puts .rodata and .text in the same segment, which has execute but no write permission. This can be observed with:
readelf -l a.out
which contains:
Section to Segment mapping:
Segment Sections...
02 .text .rodata
In the first code, "string" is a string constant, and string constants should never be modified because they are often placed into read only memory. "str" is a pointer being used to modify the constant.
In the second code, "string" is an array initializer, sort of short hand for
char str[7] = { 's', 't', 'r', 'i', 'n', 'g', '\0' };
"str" is an array allocated on the stack and can be modified freely.
Because the type of "whatever" in the context of the 1st example is const char * (even if you assign it to a non-const char*), which means you shouldn't try and write to it.
The compiler has enforced this by putting the string in a read-only part of memory, hence writing to it generates a segfault.
char *str = "string";
The above sets str to point to the literal value "string" which is hard-coded in the program's binary image, which is probably flagged as read-only in memory.
So str[0]= is attempting to write to the read-only code of the application. I would guess this is probably compiler dependent though.
To understand this error or problem you should first know difference b/w the pointer and array
so here firstly i have explain you differences b/w them
string array
char strarray[] = "hello";
In memory array is stored in continuous memory cells, stored as [h][e][l][l][o][\0] =>[] is 1 char byte size memory cell ,and this continuous memory cells can be access by name named strarray here.so here string array strarray itself containing all characters of string initialized to it.in this case here "hello"
so we can easily change its memory content by accessing each character by its index value
`strarray[0]='m'` it access character at index 0 which is 'h'in strarray
and its value changed to 'm' so strarray value changed to "mello";
one point to note here that we can change the content of string array by changing character by character but can not initialized other string directly to it like strarray="new string" is invalid
Pointer
As we all know pointer points to memory location in memory ,
uninitialized pointer points to random memory location so and after initialization points to particular memory location
char *ptr = "hello";
here pointer ptr is initialized to string "hello" which is constant string stored in read only memory (ROM) so "hello" can not be changed as it is stored in ROM
and ptr is stored in stack section and pointing to constant string "hello"
so ptr[0]='m' is invalid since you can not access read only memory
But ptr can be initialised to other string value directly since it is just pointer so it can be point to any memory address of variable of its data type
ptr="new string"; is valid
char *str = "string";
allocates a pointer to a string literal, which the compiler is putting in a non-modifiable part of your executable;
char str[] = "string";
allocates and initializes a local array which is modifiable
The C FAQ that #matli linked to mentions it, but no one else here has yet, so for clarification: if a string literal (double-quoted string in your source) is used anywhere other than to initialize a character array (ie: #Mark's second example, which works correctly), that string is stored by the compiler in a special static string table, which is akin to creating a global static variable (read-only, of course) that is essentially anonymous (has no variable "name"). The read-only part is the important part, and is why the #Mark's first code example segfaults.
The
char *str = "string";
line defines a pointer and points it to a literal string. The literal string is not writable so when you do:
str[0] = 'z';
you get a seg fault. On some platforms, the literal might be in writable memory so you won't see a segfault, but it's invalid code (resulting in undefined behavior) regardless.
The line:
char str[] = "string";
allocates an array of characters and copies the literal string into that array, which is fully writable, so the subsequent update is no problem.
String literals like "string" are probably allocated in your executable's address space as read-only data (give or take your compiler). When you go to touch it, it freaks out that you're in its bathing suit area and lets you know with a seg fault.
In your first example, you're getting a pointer to that const data. In your second example, you're initializing an array of 7 characters with a copy of the const data.
// create a string constant like this - will be read only
char *str_p;
str_p = "String constant";
// create an array of characters like this
char *arr_p;
char arr[] = "String in an array";
arr_p = &arr[0];
// now we try to change a character in the array first, this will work
*arr_p = 'E';
// lets try to change the first character of the string contant
*str_p = 'G'; // this will result in a segmentation fault. Comment it out to work.
/*-----------------------------------------------------------------------------
* String constants can't be modified. A segmentation fault is the result,
* because most operating systems will not allow a write
* operation on read only memory.
*-----------------------------------------------------------------------------*/
//print both strings to see if they have changed
printf("%s\n", str_p); //print the string without a variable
printf("%s\n", arr_p); //print the string, which is in an array.
In the first place, str is a pointer that points at "string". The compiler is allowed to put string literals in places in memory that you cannot write to, but can only read. (This really should have triggered a warning, since you're assigning a const char * to a char *. Did you have warnings disabled, or did you just ignore them?)
In the second place, you're creating an array, which is memory that you've got full access to, and initializing it with "string". You're creating a char[7] (six for the letters, one for the terminating '\0'), and you do whatever you like with it.
Assume the strings are,
char a[] = "string literal copied to stack";
char *p = "string literal referenced by p";
In the first case, the literal is to be copied when 'a' comes into scope. Here 'a' is an array defined on stack. It means the string will be created on the stack and its data is copied from code (text) memory, which is typically read-only (this is implementation specific, a compiler can place this read-only program data in read-writable memory also).
In the second case, p is a pointer defined on stack (local scope) and referring a string literal (program data or text) stored else where. Usually modifying such memory is not good practice nor encouraged.
Section 5.5 Character Pointers and Functions of K&R also discusses about this topic:
There is an important difference between these definitions:
char amessage[] = "now is the time"; /* an array */
char *pmessage = "now is the time"; /* a pointer */
amessage is an array, just big enough to hold the sequence of characters and '\0' that initializes it. Individual characters within the array may be changed but amessage will always refer to the same storage. On the other hand, pmessage is a pointer, initialized to point to a string constant; the pointer may subsequently be modified to point elsewhere, but the result is undefined if you try to modify the string contents.
Constant memory
Since string literals are read-only by design, they are stored in the Constant part of memory. Data stored there is immutable, i.e., cannot be changed. Thus, all string literals defined in C code get a read-only memory address here.
Stack memory
The Stack part of memory is where the addresses of local variables live, e.g., variables defined in functions.
As #matli's answer suggests, there are two ways of working with string these constant strings.
1. Pointer to string literal
When we define a pointer to a string literal, we are creating a pointer variable living in Stack memory. It points to the read-only address where the underlying string literal resides.
#include <stdio.h>
int main(void) {
char *s = "hello";
printf("%p\n", &s); // Prints a read-only address, e.g. 0x7ffc8e224620
return 0;
}
If we try to modify s by inserting
s[0] = 'H';
we get a Segmentation fault (core dumped). We are trying to access memory that we shouldn't access. We are attempting to modify the value of a read-only address, 0x7ffc8e224620.
2. Array of chars
For the sake of the example, suppose the string literal "Hello" stored in constant memory has a read-only memory address identical to the one above, 0x7ffc8e224620.
#include <stdio.h>
int main(void) {
// We create an array from a string literal with address 0x7ffc8e224620.
// C initializes an array variable in the stack, let's give it address
// 0x7ffc7a9a9db2.
// C then copies the read-only value from 0x7ffc8e224620 into
// 0x7ffc7a9a9db2 to give us a local copy we can mutate.
char a[] = "hello";
// We can now mutate the local copy
a[0] = 'H';
printf("%p\n", &a); // Prints the Stack address, e.g. 0x7ffc7a9a9db2
printf("%s\n", a); // Prints "Hello"
return 0;
}
Note: When using pointers to string literals as in 1., best practice is to use the const keyword, like const *s = "hello". This is more readable and the compiler will provide better help when it's violated. It will then throw an error like error: assignment of read-only location ‘*s’ instead of the seg fault. Linters in editors will also likely pick up the error before you manually compile the code.
First is one constant string which can't be modified. Second is an array with initialized value, so it can be modified.
Segmentation fault is caused when you try to access the memory which is inaccessible.
char *str is a pointer to a string that is nonmodifiable(the reason for getting segfault).
whereas char str[] is an array and can be modifiable..

Why does strcpy specifically requires an array as first argument? (As opposed to a char* const ptr) [duplicate]

The following code receives seg fault on line 2:
char *str = "string";
str[0] = 'z'; // could be also written as *str = 'z'
printf("%s\n", str);
While this works perfectly well:
char str[] = "string";
str[0] = 'z';
printf("%s\n", str);
Tested with MSVC and GCC.
See the C FAQ, Question 1.32
Q: What is the difference between these initializations?
char a[] = "string literal";
char *p = "string literal";
My program crashes if I try to assign a new value to p[i].
A: A string literal (the formal term
for a double-quoted string in C
source) can be used in two slightly
different ways:
As the initializer for an array of char, as in the declaration of char a[] , it specifies the initial values
of the characters in that array (and,
if necessary, its size).
Anywhere else, it turns into an unnamed, static array of characters,
and this unnamed array may be stored
in read-only memory, and which
therefore cannot necessarily be
modified. In an expression context,
the array is converted at once to a
pointer, as usual (see section 6), so
the second declaration initializes p
to point to the unnamed array's first
element.
Some compilers have a switch
controlling whether string literals
are writable or not (for compiling old
code), and some may have options to
cause string literals to be formally
treated as arrays of const char (for
better error catching).
Normally, string literals are stored in read-only memory when the program is run. This is to prevent you from accidentally changing a string constant. In your first example, "string" is stored in read-only memory and *str points to the first character. The segfault happens when you try to change the first character to 'z'.
In the second example, the string "string" is copied by the compiler from its read-only home to the str[] array. Then changing the first character is permitted. You can check this by printing the address of each:
printf("%p", str);
Also, printing the size of str in the second example will show you that the compiler has allocated 7 bytes for it:
printf("%d", sizeof(str));
Most of these answers are correct, but just to add a little more clarity...
The "read only memory" that people are referring to is the text segment in ASM terms. It's the same place in memory where the instructions are loaded. This is read-only for obvious reasons like security. When you create a char* initialized to a string, the string data is compiled into the text segment and the program initializes the pointer to point into the text segment. So if you try to change it, kaboom. Segfault.
When written as an array, the compiler places the initialized string data in the data segment instead, which is the same place that your global variables and such live. This memory is mutable, since there are no instructions in the data segment. This time when the compiler initializes the character array (which is still just a char*) it's pointing into the data segment rather than the text segment, which you can safely alter at run-time.
Why do I get a segmentation fault when writing to a string?
C99 N1256 draft
There are two different uses of character string literals:
Initialize char[]:
char c[] = "abc";
This is "more magic", and described at 6.7.8/14 "Initialization":
An array of character type may be initialized by a character string literal, optionally
enclosed in braces. Successive characters of the character string literal (including the
terminating null character if there is room or if the array is of unknown size) initialize the
elements of the array.
So this is just a shortcut for:
char c[] = {'a', 'b', 'c', '\0'};
Like any other regular array, c can be modified.
Everywhere else: it generates an:
unnamed
array of char What is the type of string literals in C and C++?
with static storage
that gives UB if modified
So when you write:
char *c = "abc";
This is similar to:
/* __unnamed is magic because modifying it gives UB. */
static char __unnamed[] = "abc";
char *c = __unnamed;
Note the implicit cast from char[] to char *, which is always legal.
Then if you modify c[0], you also modify __unnamed, which is UB.
This is documented at 6.4.5 "String literals":
5 In translation phase 7, a byte or code of value zero is appended to each multibyte
character sequence that results from a string literal or literals. The multibyte character
sequence is then used to initialize an array of static storage duration and length just
sufficient to contain the sequence. For character string literals, the array elements have
type char, and are initialized with the individual bytes of the multibyte character
sequence [...]
6 It is unspecified whether these arrays are distinct provided their elements have the
appropriate values. If the program attempts to modify such an array, the behavior is
undefined.
6.7.8/32 "Initialization" gives a direct example:
EXAMPLE 8: The declaration
char s[] = "abc", t[3] = "abc";
defines "plain" char array objects s and t whose elements are initialized with character string literals.
This declaration is identical to
char s[] = { 'a', 'b', 'c', '\0' },
t[] = { 'a', 'b', 'c' };
The contents of the arrays are modifiable. On the other hand, the declaration
char *p = "abc";
defines p with type "pointer to char" and initializes it to point to an object with type "array of char" with length 4 whose elements are initialized with a character string literal. If an attempt is made to use p to modify the contents of the array, the behavior is undefined.
GCC 4.8 x86-64 ELF implementation
Program:
#include <stdio.h>
int main(void) {
char *s = "abc";
printf("%s\n", s);
return 0;
}
Compile and decompile:
gcc -ggdb -std=c99 -c main.c
objdump -Sr main.o
Output contains:
char *s = "abc";
8: 48 c7 45 f8 00 00 00 movq $0x0,-0x8(%rbp)
f: 00
c: R_X86_64_32S .rodata
Conclusion: GCC stores char* it in .rodata section, not in .text.
If we do the same for char[]:
char s[] = "abc";
we obtain:
17: c7 45 f0 61 62 63 00 movl $0x636261,-0x10(%rbp)
so it gets stored in the stack (relative to %rbp).
Note however that the default linker script puts .rodata and .text in the same segment, which has execute but no write permission. This can be observed with:
readelf -l a.out
which contains:
Section to Segment mapping:
Segment Sections...
02 .text .rodata
In the first code, "string" is a string constant, and string constants should never be modified because they are often placed into read only memory. "str" is a pointer being used to modify the constant.
In the second code, "string" is an array initializer, sort of short hand for
char str[7] = { 's', 't', 'r', 'i', 'n', 'g', '\0' };
"str" is an array allocated on the stack and can be modified freely.
Because the type of "whatever" in the context of the 1st example is const char * (even if you assign it to a non-const char*), which means you shouldn't try and write to it.
The compiler has enforced this by putting the string in a read-only part of memory, hence writing to it generates a segfault.
char *str = "string";
The above sets str to point to the literal value "string" which is hard-coded in the program's binary image, which is probably flagged as read-only in memory.
So str[0]= is attempting to write to the read-only code of the application. I would guess this is probably compiler dependent though.
To understand this error or problem you should first know difference b/w the pointer and array
so here firstly i have explain you differences b/w them
string array
char strarray[] = "hello";
In memory array is stored in continuous memory cells, stored as [h][e][l][l][o][\0] =>[] is 1 char byte size memory cell ,and this continuous memory cells can be access by name named strarray here.so here string array strarray itself containing all characters of string initialized to it.in this case here "hello"
so we can easily change its memory content by accessing each character by its index value
`strarray[0]='m'` it access character at index 0 which is 'h'in strarray
and its value changed to 'm' so strarray value changed to "mello";
one point to note here that we can change the content of string array by changing character by character but can not initialized other string directly to it like strarray="new string" is invalid
Pointer
As we all know pointer points to memory location in memory ,
uninitialized pointer points to random memory location so and after initialization points to particular memory location
char *ptr = "hello";
here pointer ptr is initialized to string "hello" which is constant string stored in read only memory (ROM) so "hello" can not be changed as it is stored in ROM
and ptr is stored in stack section and pointing to constant string "hello"
so ptr[0]='m' is invalid since you can not access read only memory
But ptr can be initialised to other string value directly since it is just pointer so it can be point to any memory address of variable of its data type
ptr="new string"; is valid
char *str = "string";
allocates a pointer to a string literal, which the compiler is putting in a non-modifiable part of your executable;
char str[] = "string";
allocates and initializes a local array which is modifiable
The C FAQ that #matli linked to mentions it, but no one else here has yet, so for clarification: if a string literal (double-quoted string in your source) is used anywhere other than to initialize a character array (ie: #Mark's second example, which works correctly), that string is stored by the compiler in a special static string table, which is akin to creating a global static variable (read-only, of course) that is essentially anonymous (has no variable "name"). The read-only part is the important part, and is why the #Mark's first code example segfaults.
The
char *str = "string";
line defines a pointer and points it to a literal string. The literal string is not writable so when you do:
str[0] = 'z';
you get a seg fault. On some platforms, the literal might be in writable memory so you won't see a segfault, but it's invalid code (resulting in undefined behavior) regardless.
The line:
char str[] = "string";
allocates an array of characters and copies the literal string into that array, which is fully writable, so the subsequent update is no problem.
String literals like "string" are probably allocated in your executable's address space as read-only data (give or take your compiler). When you go to touch it, it freaks out that you're in its bathing suit area and lets you know with a seg fault.
In your first example, you're getting a pointer to that const data. In your second example, you're initializing an array of 7 characters with a copy of the const data.
// create a string constant like this - will be read only
char *str_p;
str_p = "String constant";
// create an array of characters like this
char *arr_p;
char arr[] = "String in an array";
arr_p = &arr[0];
// now we try to change a character in the array first, this will work
*arr_p = 'E';
// lets try to change the first character of the string contant
*str_p = 'G'; // this will result in a segmentation fault. Comment it out to work.
/*-----------------------------------------------------------------------------
* String constants can't be modified. A segmentation fault is the result,
* because most operating systems will not allow a write
* operation on read only memory.
*-----------------------------------------------------------------------------*/
//print both strings to see if they have changed
printf("%s\n", str_p); //print the string without a variable
printf("%s\n", arr_p); //print the string, which is in an array.
In the first place, str is a pointer that points at "string". The compiler is allowed to put string literals in places in memory that you cannot write to, but can only read. (This really should have triggered a warning, since you're assigning a const char * to a char *. Did you have warnings disabled, or did you just ignore them?)
In the second place, you're creating an array, which is memory that you've got full access to, and initializing it with "string". You're creating a char[7] (six for the letters, one for the terminating '\0'), and you do whatever you like with it.
Assume the strings are,
char a[] = "string literal copied to stack";
char *p = "string literal referenced by p";
In the first case, the literal is to be copied when 'a' comes into scope. Here 'a' is an array defined on stack. It means the string will be created on the stack and its data is copied from code (text) memory, which is typically read-only (this is implementation specific, a compiler can place this read-only program data in read-writable memory also).
In the second case, p is a pointer defined on stack (local scope) and referring a string literal (program data or text) stored else where. Usually modifying such memory is not good practice nor encouraged.
Section 5.5 Character Pointers and Functions of K&R also discusses about this topic:
There is an important difference between these definitions:
char amessage[] = "now is the time"; /* an array */
char *pmessage = "now is the time"; /* a pointer */
amessage is an array, just big enough to hold the sequence of characters and '\0' that initializes it. Individual characters within the array may be changed but amessage will always refer to the same storage. On the other hand, pmessage is a pointer, initialized to point to a string constant; the pointer may subsequently be modified to point elsewhere, but the result is undefined if you try to modify the string contents.
Constant memory
Since string literals are read-only by design, they are stored in the Constant part of memory. Data stored there is immutable, i.e., cannot be changed. Thus, all string literals defined in C code get a read-only memory address here.
Stack memory
The Stack part of memory is where the addresses of local variables live, e.g., variables defined in functions.
As #matli's answer suggests, there are two ways of working with string these constant strings.
1. Pointer to string literal
When we define a pointer to a string literal, we are creating a pointer variable living in Stack memory. It points to the read-only address where the underlying string literal resides.
#include <stdio.h>
int main(void) {
char *s = "hello";
printf("%p\n", &s); // Prints a read-only address, e.g. 0x7ffc8e224620
return 0;
}
If we try to modify s by inserting
s[0] = 'H';
we get a Segmentation fault (core dumped). We are trying to access memory that we shouldn't access. We are attempting to modify the value of a read-only address, 0x7ffc8e224620.
2. Array of chars
For the sake of the example, suppose the string literal "Hello" stored in constant memory has a read-only memory address identical to the one above, 0x7ffc8e224620.
#include <stdio.h>
int main(void) {
// We create an array from a string literal with address 0x7ffc8e224620.
// C initializes an array variable in the stack, let's give it address
// 0x7ffc7a9a9db2.
// C then copies the read-only value from 0x7ffc8e224620 into
// 0x7ffc7a9a9db2 to give us a local copy we can mutate.
char a[] = "hello";
// We can now mutate the local copy
a[0] = 'H';
printf("%p\n", &a); // Prints the Stack address, e.g. 0x7ffc7a9a9db2
printf("%s\n", a); // Prints "Hello"
return 0;
}
Note: When using pointers to string literals as in 1., best practice is to use the const keyword, like const *s = "hello". This is more readable and the compiler will provide better help when it's violated. It will then throw an error like error: assignment of read-only location ‘*s’ instead of the seg fault. Linters in editors will also likely pick up the error before you manually compile the code.
First is one constant string which can't be modified. Second is an array with initialized value, so it can be modified.
Segmentation fault is caused when you try to access the memory which is inaccessible.
char *str is a pointer to a string that is nonmodifiable(the reason for getting segfault).
whereas char str[] is an array and can be modifiable..

At what cases are we allowed to change to content of a string in c? [duplicate]

The following code receives seg fault on line 2:
char *str = "string";
str[0] = 'z'; // could be also written as *str = 'z'
printf("%s\n", str);
While this works perfectly well:
char str[] = "string";
str[0] = 'z';
printf("%s\n", str);
Tested with MSVC and GCC.
See the C FAQ, Question 1.32
Q: What is the difference between these initializations?
char a[] = "string literal";
char *p = "string literal";
My program crashes if I try to assign a new value to p[i].
A: A string literal (the formal term
for a double-quoted string in C
source) can be used in two slightly
different ways:
As the initializer for an array of char, as in the declaration of char a[] , it specifies the initial values
of the characters in that array (and,
if necessary, its size).
Anywhere else, it turns into an unnamed, static array of characters,
and this unnamed array may be stored
in read-only memory, and which
therefore cannot necessarily be
modified. In an expression context,
the array is converted at once to a
pointer, as usual (see section 6), so
the second declaration initializes p
to point to the unnamed array's first
element.
Some compilers have a switch
controlling whether string literals
are writable or not (for compiling old
code), and some may have options to
cause string literals to be formally
treated as arrays of const char (for
better error catching).
Normally, string literals are stored in read-only memory when the program is run. This is to prevent you from accidentally changing a string constant. In your first example, "string" is stored in read-only memory and *str points to the first character. The segfault happens when you try to change the first character to 'z'.
In the second example, the string "string" is copied by the compiler from its read-only home to the str[] array. Then changing the first character is permitted. You can check this by printing the address of each:
printf("%p", str);
Also, printing the size of str in the second example will show you that the compiler has allocated 7 bytes for it:
printf("%d", sizeof(str));
Most of these answers are correct, but just to add a little more clarity...
The "read only memory" that people are referring to is the text segment in ASM terms. It's the same place in memory where the instructions are loaded. This is read-only for obvious reasons like security. When you create a char* initialized to a string, the string data is compiled into the text segment and the program initializes the pointer to point into the text segment. So if you try to change it, kaboom. Segfault.
When written as an array, the compiler places the initialized string data in the data segment instead, which is the same place that your global variables and such live. This memory is mutable, since there are no instructions in the data segment. This time when the compiler initializes the character array (which is still just a char*) it's pointing into the data segment rather than the text segment, which you can safely alter at run-time.
Why do I get a segmentation fault when writing to a string?
C99 N1256 draft
There are two different uses of character string literals:
Initialize char[]:
char c[] = "abc";
This is "more magic", and described at 6.7.8/14 "Initialization":
An array of character type may be initialized by a character string literal, optionally
enclosed in braces. Successive characters of the character string literal (including the
terminating null character if there is room or if the array is of unknown size) initialize the
elements of the array.
So this is just a shortcut for:
char c[] = {'a', 'b', 'c', '\0'};
Like any other regular array, c can be modified.
Everywhere else: it generates an:
unnamed
array of char What is the type of string literals in C and C++?
with static storage
that gives UB if modified
So when you write:
char *c = "abc";
This is similar to:
/* __unnamed is magic because modifying it gives UB. */
static char __unnamed[] = "abc";
char *c = __unnamed;
Note the implicit cast from char[] to char *, which is always legal.
Then if you modify c[0], you also modify __unnamed, which is UB.
This is documented at 6.4.5 "String literals":
5 In translation phase 7, a byte or code of value zero is appended to each multibyte
character sequence that results from a string literal or literals. The multibyte character
sequence is then used to initialize an array of static storage duration and length just
sufficient to contain the sequence. For character string literals, the array elements have
type char, and are initialized with the individual bytes of the multibyte character
sequence [...]
6 It is unspecified whether these arrays are distinct provided their elements have the
appropriate values. If the program attempts to modify such an array, the behavior is
undefined.
6.7.8/32 "Initialization" gives a direct example:
EXAMPLE 8: The declaration
char s[] = "abc", t[3] = "abc";
defines "plain" char array objects s and t whose elements are initialized with character string literals.
This declaration is identical to
char s[] = { 'a', 'b', 'c', '\0' },
t[] = { 'a', 'b', 'c' };
The contents of the arrays are modifiable. On the other hand, the declaration
char *p = "abc";
defines p with type "pointer to char" and initializes it to point to an object with type "array of char" with length 4 whose elements are initialized with a character string literal. If an attempt is made to use p to modify the contents of the array, the behavior is undefined.
GCC 4.8 x86-64 ELF implementation
Program:
#include <stdio.h>
int main(void) {
char *s = "abc";
printf("%s\n", s);
return 0;
}
Compile and decompile:
gcc -ggdb -std=c99 -c main.c
objdump -Sr main.o
Output contains:
char *s = "abc";
8: 48 c7 45 f8 00 00 00 movq $0x0,-0x8(%rbp)
f: 00
c: R_X86_64_32S .rodata
Conclusion: GCC stores char* it in .rodata section, not in .text.
If we do the same for char[]:
char s[] = "abc";
we obtain:
17: c7 45 f0 61 62 63 00 movl $0x636261,-0x10(%rbp)
so it gets stored in the stack (relative to %rbp).
Note however that the default linker script puts .rodata and .text in the same segment, which has execute but no write permission. This can be observed with:
readelf -l a.out
which contains:
Section to Segment mapping:
Segment Sections...
02 .text .rodata
In the first code, "string" is a string constant, and string constants should never be modified because they are often placed into read only memory. "str" is a pointer being used to modify the constant.
In the second code, "string" is an array initializer, sort of short hand for
char str[7] = { 's', 't', 'r', 'i', 'n', 'g', '\0' };
"str" is an array allocated on the stack and can be modified freely.
Because the type of "whatever" in the context of the 1st example is const char * (even if you assign it to a non-const char*), which means you shouldn't try and write to it.
The compiler has enforced this by putting the string in a read-only part of memory, hence writing to it generates a segfault.
char *str = "string";
The above sets str to point to the literal value "string" which is hard-coded in the program's binary image, which is probably flagged as read-only in memory.
So str[0]= is attempting to write to the read-only code of the application. I would guess this is probably compiler dependent though.
To understand this error or problem you should first know difference b/w the pointer and array
so here firstly i have explain you differences b/w them
string array
char strarray[] = "hello";
In memory array is stored in continuous memory cells, stored as [h][e][l][l][o][\0] =>[] is 1 char byte size memory cell ,and this continuous memory cells can be access by name named strarray here.so here string array strarray itself containing all characters of string initialized to it.in this case here "hello"
so we can easily change its memory content by accessing each character by its index value
`strarray[0]='m'` it access character at index 0 which is 'h'in strarray
and its value changed to 'm' so strarray value changed to "mello";
one point to note here that we can change the content of string array by changing character by character but can not initialized other string directly to it like strarray="new string" is invalid
Pointer
As we all know pointer points to memory location in memory ,
uninitialized pointer points to random memory location so and after initialization points to particular memory location
char *ptr = "hello";
here pointer ptr is initialized to string "hello" which is constant string stored in read only memory (ROM) so "hello" can not be changed as it is stored in ROM
and ptr is stored in stack section and pointing to constant string "hello"
so ptr[0]='m' is invalid since you can not access read only memory
But ptr can be initialised to other string value directly since it is just pointer so it can be point to any memory address of variable of its data type
ptr="new string"; is valid
char *str = "string";
allocates a pointer to a string literal, which the compiler is putting in a non-modifiable part of your executable;
char str[] = "string";
allocates and initializes a local array which is modifiable
The C FAQ that #matli linked to mentions it, but no one else here has yet, so for clarification: if a string literal (double-quoted string in your source) is used anywhere other than to initialize a character array (ie: #Mark's second example, which works correctly), that string is stored by the compiler in a special static string table, which is akin to creating a global static variable (read-only, of course) that is essentially anonymous (has no variable "name"). The read-only part is the important part, and is why the #Mark's first code example segfaults.
The
char *str = "string";
line defines a pointer and points it to a literal string. The literal string is not writable so when you do:
str[0] = 'z';
you get a seg fault. On some platforms, the literal might be in writable memory so you won't see a segfault, but it's invalid code (resulting in undefined behavior) regardless.
The line:
char str[] = "string";
allocates an array of characters and copies the literal string into that array, which is fully writable, so the subsequent update is no problem.
String literals like "string" are probably allocated in your executable's address space as read-only data (give or take your compiler). When you go to touch it, it freaks out that you're in its bathing suit area and lets you know with a seg fault.
In your first example, you're getting a pointer to that const data. In your second example, you're initializing an array of 7 characters with a copy of the const data.
// create a string constant like this - will be read only
char *str_p;
str_p = "String constant";
// create an array of characters like this
char *arr_p;
char arr[] = "String in an array";
arr_p = &arr[0];
// now we try to change a character in the array first, this will work
*arr_p = 'E';
// lets try to change the first character of the string contant
*str_p = 'G'; // this will result in a segmentation fault. Comment it out to work.
/*-----------------------------------------------------------------------------
* String constants can't be modified. A segmentation fault is the result,
* because most operating systems will not allow a write
* operation on read only memory.
*-----------------------------------------------------------------------------*/
//print both strings to see if they have changed
printf("%s\n", str_p); //print the string without a variable
printf("%s\n", arr_p); //print the string, which is in an array.
In the first place, str is a pointer that points at "string". The compiler is allowed to put string literals in places in memory that you cannot write to, but can only read. (This really should have triggered a warning, since you're assigning a const char * to a char *. Did you have warnings disabled, or did you just ignore them?)
In the second place, you're creating an array, which is memory that you've got full access to, and initializing it with "string". You're creating a char[7] (six for the letters, one for the terminating '\0'), and you do whatever you like with it.
Assume the strings are,
char a[] = "string literal copied to stack";
char *p = "string literal referenced by p";
In the first case, the literal is to be copied when 'a' comes into scope. Here 'a' is an array defined on stack. It means the string will be created on the stack and its data is copied from code (text) memory, which is typically read-only (this is implementation specific, a compiler can place this read-only program data in read-writable memory also).
In the second case, p is a pointer defined on stack (local scope) and referring a string literal (program data or text) stored else where. Usually modifying such memory is not good practice nor encouraged.
Section 5.5 Character Pointers and Functions of K&R also discusses about this topic:
There is an important difference between these definitions:
char amessage[] = "now is the time"; /* an array */
char *pmessage = "now is the time"; /* a pointer */
amessage is an array, just big enough to hold the sequence of characters and '\0' that initializes it. Individual characters within the array may be changed but amessage will always refer to the same storage. On the other hand, pmessage is a pointer, initialized to point to a string constant; the pointer may subsequently be modified to point elsewhere, but the result is undefined if you try to modify the string contents.
Constant memory
Since string literals are read-only by design, they are stored in the Constant part of memory. Data stored there is immutable, i.e., cannot be changed. Thus, all string literals defined in C code get a read-only memory address here.
Stack memory
The Stack part of memory is where the addresses of local variables live, e.g., variables defined in functions.
As #matli's answer suggests, there are two ways of working with string these constant strings.
1. Pointer to string literal
When we define a pointer to a string literal, we are creating a pointer variable living in Stack memory. It points to the read-only address where the underlying string literal resides.
#include <stdio.h>
int main(void) {
char *s = "hello";
printf("%p\n", &s); // Prints a read-only address, e.g. 0x7ffc8e224620
return 0;
}
If we try to modify s by inserting
s[0] = 'H';
we get a Segmentation fault (core dumped). We are trying to access memory that we shouldn't access. We are attempting to modify the value of a read-only address, 0x7ffc8e224620.
2. Array of chars
For the sake of the example, suppose the string literal "Hello" stored in constant memory has a read-only memory address identical to the one above, 0x7ffc8e224620.
#include <stdio.h>
int main(void) {
// We create an array from a string literal with address 0x7ffc8e224620.
// C initializes an array variable in the stack, let's give it address
// 0x7ffc7a9a9db2.
// C then copies the read-only value from 0x7ffc8e224620 into
// 0x7ffc7a9a9db2 to give us a local copy we can mutate.
char a[] = "hello";
// We can now mutate the local copy
a[0] = 'H';
printf("%p\n", &a); // Prints the Stack address, e.g. 0x7ffc7a9a9db2
printf("%s\n", a); // Prints "Hello"
return 0;
}
Note: When using pointers to string literals as in 1., best practice is to use the const keyword, like const *s = "hello". This is more readable and the compiler will provide better help when it's violated. It will then throw an error like error: assignment of read-only location ‘*s’ instead of the seg fault. Linters in editors will also likely pick up the error before you manually compile the code.
First is one constant string which can't be modified. Second is an array with initialized value, so it can be modified.
Segmentation fault is caused when you try to access the memory which is inaccessible.
char *str is a pointer to a string that is nonmodifiable(the reason for getting segfault).
whereas char str[] is an array and can be modifiable..

I want to know why this works without having to bind memory for the string

Hello guys I recently picked up C programming and I am stuck at understanding pointers. As far as I understand to store a value in a pointer you have to bind memory (using malloc) the size of the value you want to store. Given this, the following code should not work as I have not allocated 11 bytes of memory to store my string of size 11 bytes and yet for some reason beyond my comprehension it works perfectly fine.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(){
char *str = NULL;
str = "hello world\0";
printf("filename = %s\n", str);
return 0;
}
In this case
str = "hello world\0";
str points to the address of the first element of an array of chars, initialized with "hello world\0". In other words, str points to a "string literal".
By definition, the array is allocated and the address of the first element has to be "valid".
Quoting C11, chapter §6.4.5, String literals
In translation phase 7, a byte or code of value zero is appended to each multibyte
character sequence that results from a string literal or literals.78) The multibyte character
sequence is then used to initialize an array of static storage duration and length just
sufficient to contain the sequence. For character string literals, the array elements have
type char, and are initialized with the individual bytes of the multibyte character
sequence. [....]
Memory allocation still happens, just not explicitly by you (via memory allocator functions).
That said, the "...\0" at the end is repetitive, as mentioned (in the first statement of the quote) above, by default, the array will be null-terminated.
Using a char variable without malloc is stating that the string you are assigning is read-only. This means that you are creating a pointer to a string constant. "hello world\0" is somewhere in the read-only part of memory and you are just pointing to it.
Now if you want to make changes to the string. Let's say changing the h to H, that would be str[0]='H'. Without malloc that will not be possible to make.
When you declare a string literal in a C program, it is stored in a read-only section of the program code. A statement of the form char *str = "hello"; will assign the address of this string to the char* pointer. However, the string itself (i.e., the characters h, e, l, l and o, plus the \0 string terminator) are still located in read-only memory, so you can't change them at all.
Note that there's no need for you to explicitly add a zero byte terminator to your string declarations. The C compiler will do this for you.
Right. But in this case you are just pointing to a string literal which is placed in the constant memory area. Your pointer is created in the stack area. So you are just pointing to another address. i.e, at the starting address of string literal.
Try using copy the string literal in your pointer variable. Then it will give error because you have not allocated memory. Hope you understand now.
Storage for string literals is set aside at program startup and held until the program exits. This storage may be read-only, and attempting to modify the contents of a string literal results in undefined behavior (it may work, it may crash, it may do something in between).

C string function (strcopy,strcat...,strstr) with arrays and pointers

Whenever I am using one of these functions in dev-C++(I know its old but for some reason still taught at my college.)
strcat,strcpy,strcmp,strchr...//And their variants stricmp...
The first argument for these functions always has to be an array (i.e:
char ch[]="hello";
But it can't be a pointer to a string bc for some reason this causes a crash.
In fact for an example look at both of these codes:
code1:
#include<stdio.h>
#include<string.h>
main()
{char ch[20]="Hello world!";
char *ch2="Hello Galaxy!";
strcat(ch,ch2);
printf("%s",ch);
scanf("%d")//Just to see the output.
}
This code works fine and gives the expected result(Hello World!Hello Galaxy!)
But the inverse code2 crashes.
code2:
#include<stdio.h>
#include<string.h>
main()
{char ch[20]="Hello world!";
char *ch2="Hello Galaxy!";
strcat(ch2,ch);
printf("%s",ch2);
scanf("%d")//Just to see the output.
}
This code crashes and causes a
file.exe has stopped working Error.
This is the same for almost all of the strings functions that takes two arguments.
What is the cause of this problem.
With char *ch2 = "Hello Galaxy!"; you are obtaining a pointer to a string literal. You should never attempt to modify a string literals, as this invokes undefined behaviour (which in your case has manifested as a crash).
With char ch[20] = "Hello World!"; you are initialising an array using the contents of a string literal, so you end up with your own modifiable copy of the string in ch.
Also, note that 20 characters is not enough for Hello World!Hello Galaxy! to fit, and this is also undefined behaviour, and known as overflowing your buffer.
char ch[20] = "Hello world!"
ch is an array of char initialized by the elements of a string literal (and the rest of the array is initialized with 0).
char *ch2="Hello Galaxy!";
ch2 is a pointer to a string literal.
String literals are not required to be modifiable in C. Modifying a string literal is undefined behavior in C.
There are two problems. The first is that your string literal is not long enough to hold the concatenated string "Hello world!Hello Galaxy!". The space allocated is only 13 bytes (12 characters plus the space for the '0' byte that terminates the string). The concatenated string requires 26 bytes (25 chars + 1 null-valued char).
However, this isn't the real problem. The real problem is that you're accessing memory that you should not be, and that the operating system often protects. Most implementations of C provide four areas of storage:
The stack, where variables you declare in a function are allocated
The heap, where calls to malloc/calloc/realloc allocate memory
Global static storage, where non-const global variables (those declared outside of a function) are allocated.
Global constant storage, where all string literals and other global variables declared const are allocated.
The first three areas are, in principle, modifiable. The fourth area is not, and is often stored in memory that the operating system marks as read-only. When you assign the string literal "Hello Galaxy!" tochar* ch2, the variablech2` points into global constant storage.
To give you a better idea, the following code generals a segfault when I run it:
#include <stdio.h>
int main(int argc, char** argv)
{
char* s = "Foo bar baz";
s[0] = 'B';
printf("%s\n",s);
return 0;
}
The segfault occurs in the s[0] = ... line, because I'm accessing storage that the operating system has marked as read-only.
That is about the size of pointer array..overflow problem..
char *ch2="Hello Galaxy!"; when you use this automaticly the size of *ch2 gets 14 with the null character but when you move the ch[] array into the *ch2 , you get an error. you cannot move an array with 20 size into another array with 14 size...
String literals are read-only.This means that if you assign:
char* str="Hello";
You can't pass str as first argument of strcpy and strcat, because this would cause to write over read-only memory.
If instead you declare it this way:
char str2[]="Hello";
Then the str2 array is stored on the stack, and you can change it's values.
You can still pass str to functions like strcmp (which just reads the wto strings and compare them), or as second argument of strcat and strcpy, since this doesn't cause the string to be written.
you got an error because you try to access the code section of your process which is read-only.
that is your string literal present in code and the address of that string literal you use in assignment to your pointer variable.
So you can access the code but you cannot modify this.
every executable file contain some sections
like...
1.text(code of your program as well as string literals present here)
2.data uninitialized
3.data initialized
you can veryfy this by command
size <executable-file-neme>
Also using command
objdump -D <executable-file-neme>

Resources