This question already has answers here:
How are string literals compiled in C?
(2 answers)
"Life-time" of a string literal in C
(9 answers)
Closed 8 years ago.
I have written a simple c code which shows below. In this code snippet I want to verify where the const string abcd stores. I first guess that it should be stored in .data section for read-only. After a test in Debian, however, things is different from what I initial guessed. By checking the assembly code which generated by gcc, I find it is placed in the stack frame of function p. But when I try it later in OSX, the string is stored in .data section again. Now I am confused by this. Is there any standard for the storing of const string?
#include<stdio.h>
char *p()
{
char p[] = "abcd";
return p;
}
int main()
{
char *pp = p();
printf("%s\n",pp);
return 0;
}
UPDATE: rici's answer awaken me. In OSX, the initial literal is stored in .data and then moved into function's stack frame later. Thus, it becomes a local variable for this function. However, gcc in Debian handle this situation is different from OSX. In Debian, gcc directly stored literal in stack instead of moving it from .data. I'm sorry for my carelessness.
in your case, it's located in stack. and returning the pointer to main will cause undefined behavior. but, if you have static char p[] = "abcd"; or char *p = "abcd"; they(the data) are located in .data.
There is a huge difference between:
const char s[] = "abcd";
and
const char* t = "abcd";
The first of these declares s to be an array object initialized from the string "abcd". s will have an address distinct from that of any other object in the program. The character string itself might be a compile-time artifact; the initialization is a copy so the character string does not need to be present at runtime if the compiler can find some other way of performing the initialization (such as a store immediate operation).
The second declaration declares t to be a pointer to a string constant. The string constant now must be present at runtime, because expressions like t+1, which are pointers inside the string, are valid. The language standard does not guarantee that every occurrence of string literals in the program is unique, nor does it guarantee that all occurrence are merged (although good compilers will try to do the second.) It does, however, guarantee that they have static lifetime.
Consequently, this is undefined behaviour, because the lifetime of the array s ends when the function returns:
const char *gimme_a_string() {
const char s[] = "abcd";
return s;
}
However, this is fine:
const char *gimme_a_string() {
const char *s = "abcd";
return s;
}
Also:
const char s[] = "abcd";
const char t[] = "abcd";
printf("%d\n", s == t);
is guaranteed to print 0, while
const char* s = "abcd";
const char* t = "abcd";
printf("%d\n", s == t);
might print either 0 or 1, depending on the implementation. (As written, it will almost certainly print 1. However, if the two declarations are in separate compilation units and lto is not enable, it is likely to print 0.)
Since the array form is initialized with a copy the non-const version is fine:
char s[] = "abcd";
s[3] = 'C';
But the char pointer version must be a const to avoid undefined behaviour.
// Will produce a warning on most compilers with compile option -Wall or equivalent
char* s = "abcd";
// *** UNDEFINED BEHAVIOUR *** Can cause random program breakage
s[3] = 'C';
Technically, the non-const declaration of s is legal (which is why the compiler only warns) because it is the attempt to modify the constant which is UB. But you should always heed compiler warnings; it is better to think of the declaration / initialization as wrong, because it is.
Related
This question already has answers here:
What is the difference between char s[] and char *s?
(14 answers)
Closed 5 years ago.
A string constant in C can be initialized in two ways: using array and a character pointer;
Both can access the string constant and can print it;
Coming to editing part, if I want to edit a string that is initialized using arrays, it is straight forward and we can edit using array individual characters.
If I want to edit a string that is initialized using character pointer, is it impossible to do?
Let us consider the following two programs:
Program #1:
#include<stdio.h>
void str_change(char *);
int main()
{
char str[] = "abcdefghijklm";
printf("%s\n", str);
str_change(str);
printf("%s\n", str);
return 0;
}
void str_change(char *temp)
{
int i = 0;
while (temp[i] != '\0') {
temp[i] = 'n' + temp[i] - 'a';
i++;
}
}
Program #2:
#include<stdio.h>
void str_change(char *);
int main()
{
char *str = "abcdefghijklm";
printf("%s\n", str);
str_change(str);
printf("%s\n", str);
return 0;
}
void str_change(char *temp)
{
int i = 0;
while (temp[i] != '\0') {
temp[i] = 'n' + temp[i] - 'a';
i++;
}
}
I tried the following version of function to program #2, but of no use
void str_change(char *temp)
{
while (*temp != '\0') {
*temp = 'n' + *temp - 'a';
temp++;
}
}
The first program is working pretty well,but segmentation fault for other, So, is it mandatory to pass only the string constants that are initialized using arrays between functions, if editing of string is required?
So it is mandatory to pass only the string constants that are initialized using arrays between functions, if editing of string is requiredc?
Basically, yes, though the real explanation is not these exact words. The following definition creates an array:
char str[] = "abc";
this is not a string literal. The "abc" token is a string literal syntax, not a string literal object. Here, that literal specifies the initial value for the str array. Array objects are modifiable.
char *str = "abc";
Here the "abc" syntax in the source code is an expression denoting a string literal object in the translated program image. It's also a kind of array, with static storage duration (regardless of the storage duration of str). The "abc" syntax evaluates to a pointer to the first character of this array, and the str pointer is initialized with that pointer value.
String literals are not required to support modification; the behavior of attempting to modify a string literal object is undefined behavior.
Even in systems where you don't get a predictable segmentation fault, strange things can happen. For instance:
char *a = "xabc";
char *b = "abc";
b[0] = 'b'; /* b changes to "bbc" */
Suppose the assignment works. It's possible that a will also be changed to "xbbc". A C compiler is allowed to merge the storage of identical literals, or literals which are suffixes of other literals.
It doesn't matter whether or not a and b are close together; this sneaky effect could occur even between distant declarations in different functions, perhaps even in different translation units.
String literals should be considered to be part of the program's image; a program which successfully modifies a string literal is effectively self-modifying code. The reason you get a "segmentation fault" in your environment is precisely because of a safeguard against self-modifying code: the "text" section of the compiled program (which contains the machine code) is located in write-protected pages of virtual memory. And the string literals are placed there together with the machine code (often interspersed among its functions). Attempts to modify a string literal result in write accesses to the text section, which are blocked by the permission bits on the pages.
In another kind of environment, C code might be used to produce a software image which goes into read-only memory: actual ROM chips. The string literals go into the ROM together with the code. Attempting to modify one adds up to attempting to modify ROM. The hardware might have no detection for that. For instance, the instruction might appear to execute, but when the location is read back, the original value is still there, not the new value. Like the segmentation fault, this is within the specification range of "undefined behavior": any behavior is!
String literals are stored in static duration storage which exist for program lifetime and could be read only. Changing content of this literal leads to undefined behavior.
Copy this literal to modificable array and pass it to function.
char array[5];
strcpy(array, "test");
If you are declaring pointer to string literal, make it const so compiler Will warn you if you try to modify it.
const char * ptr = " string literal";
I think, because if you use pointer, you can only read this array, you can't write anything to there with loop, because elements of your array don't situated nearly. (Sorry for my English)
This question already has answers here:
Why do I get a segmentation fault when writing to a "char *s" initialized with a string literal, but not "char s[]"?
(19 answers)
Closed 5 years ago.
I am trying to write code to reverse a string in place (I'm just trying to get better at C programming and pointer manipulation), but I cannot figure out why I am getting a segmentation fault:
#include <string.h>
void reverse(char *s);
int main() {
char* s = "teststring";
reverse(s);
return 0;
}
void reverse(char *s) {
int i, j;
char temp;
for (i=0,j = (strlen(s)-1); i < j; i++, j--) {
temp = *(s+i); //line 1
*(s+i) = *(s+j); //line 2
*(s+j) = temp; //line 3
}
}
It's lines 2 and 3 that are causing the segmentation fault. I understand that there may be better ways to do this, but I am interested in finding out what specifically in my code is causing the segmentation fault.
Update: I have included the calling function as requested.
There's no way to say from just that code. Most likely, you are passing in a pointer that points to invalid memory, non-modifiable memory or some other kind of memory that just can't be processed the way you process it here.
How do you call your function?
Added: You are passing in a pointer to a string literal. String literals are non-modifiable. You can't reverse a string literal.
Pass in a pointer to a modifiable string instead
char s[] = "teststring";
reverse(s);
This has been explained to death here already. "teststring" is a string literal. The string literal itself is a non-modifiable object. In practice compilers might (and will) put it in read-only memory. When you initialize a pointer like that
char *s = "teststring";
the pointer points directly at the beginning of the string literal. Any attempts to modify what s is pointing to are deemed to fail in general case. You can read it, but you can't write into it. For this reason it is highly recommended to point to string literals with pointer-to-const variables only
const char *s = "teststring";
But when you declare your s as
char s[] = "teststring";
you get a completely independent array s located in ordinary modifiable memory, which is just initialized with string literal. This means that that independent modifiable array s will get its initial value copied from the string literal. After that your s array and the string literal continue to exist as completely independent objects. The literal is still non-modifiable, while your s array is modifiable.
Basically, the latter declaration is functionally equivalent to
char s[11];
strcpy(s, "teststring");
You code could be segfaulting for a number of reasons. Here are the ones that come to mind
s is NULL
s points to a const string which is held in read only memory
s is not NULL terminated
I think #2 is the most likely. Can you show us the call site of reverse?
EDIT
Based on your sample #2 is definitely the answer. A string literal in C/C++ is not modifiable. The proper type is actually const char* and not char*. What you need to do is pass a modifiable string into that buffer.
Quick example:
char* pStr = strdup("foobar");
reverse(pStr);
free(pStr);
Are you testing this something like this?
int main() {
char * str = "foobar";
reverse(str);
printf("%s\n", str);
}
This makes str a string literal and you probably won't be able to edit it (segfaults for me). If you define char * str = strdup(foobar) it should work fine (does for me).
Your declaration is completely wrong:
char* s = "teststring";
"teststring" is stored in the code segment, which is read-only, like code. And, s is a pointer to "teststring", at the same time, you're trying to change the value of a read-only memory range. Thus, segmentation fault.
But with:
char s[] = "teststring";
s is initialized with "teststring", which of course is in the code segment, but there is an additional copy operation going on, to the stack in this case.
See Question 1.32 in the C FAQ list:
What is the difference between these initializations?
char a[] = "string literal";
char *p = "string literal";
My program crashes if I try to assign a new value to p[i].
Answer:
A string literal (the formal term for a double-quoted string in C source) can be used in two slightly different ways:
As the initializer for an array of char, as in the declaration of char a[], it specifies the initial values of the characters in that array (and, if necessary, its size).
Anywhere else, it turns into an unnamed, static array of characters, and this unnamed array may be stored in read-only memory, and which therefore cannot necessarily be modified. In an expression context, the array is converted at once to a pointer, as usual (see section 6), so the second declaration initializes p to point to the unnamed array's first element.
Some compilers have a switch controlling whether string literals are writable or not (for compiling old code), and some may have options to cause string literals to be formally treated as arrays of const char (for better error catching).
(emphasis mine)
See also Back to Basics by Joel.
Which compiler and debugger are you using? Using gcc and gdb, I would compile the code with -g flag and then run it in gdb. When it segfaults, I would just do a backtrace (bt command in gdb) and see which is the offending line causing the problem. Additionally, I would just run the code step by step, while "watching" the pointer values in gdb and know where exactly is the problem.
Good luck.
As some of the answers provided above, the string memory is read-only. However, some compilers provide an option to compile with writable strings. E.g. with gcc, 3.x versions supported -fwritable-strings but newer versions don't.
I think strlen can not work since s is not NULL terminated. So the behaviour of your for iteration is not the one you expect.
Since the result of strlen will be superior than s length you will write in memory where you should not be.
In addition s points to a constant strings hold by a read only memory. You can not modify it. Try to init s by using the gets function as it is done in the strlen example
This question already has answers here:
Why do I get a segmentation fault when writing to a "char *s" initialized with a string literal, but not "char s[]"?
(19 answers)
Closed 5 years ago.
I am trying to write code to reverse a string in place (I'm just trying to get better at C programming and pointer manipulation), but I cannot figure out why I am getting a segmentation fault:
#include <string.h>
void reverse(char *s);
int main() {
char* s = "teststring";
reverse(s);
return 0;
}
void reverse(char *s) {
int i, j;
char temp;
for (i=0,j = (strlen(s)-1); i < j; i++, j--) {
temp = *(s+i); //line 1
*(s+i) = *(s+j); //line 2
*(s+j) = temp; //line 3
}
}
It's lines 2 and 3 that are causing the segmentation fault. I understand that there may be better ways to do this, but I am interested in finding out what specifically in my code is causing the segmentation fault.
Update: I have included the calling function as requested.
There's no way to say from just that code. Most likely, you are passing in a pointer that points to invalid memory, non-modifiable memory or some other kind of memory that just can't be processed the way you process it here.
How do you call your function?
Added: You are passing in a pointer to a string literal. String literals are non-modifiable. You can't reverse a string literal.
Pass in a pointer to a modifiable string instead
char s[] = "teststring";
reverse(s);
This has been explained to death here already. "teststring" is a string literal. The string literal itself is a non-modifiable object. In practice compilers might (and will) put it in read-only memory. When you initialize a pointer like that
char *s = "teststring";
the pointer points directly at the beginning of the string literal. Any attempts to modify what s is pointing to are deemed to fail in general case. You can read it, but you can't write into it. For this reason it is highly recommended to point to string literals with pointer-to-const variables only
const char *s = "teststring";
But when you declare your s as
char s[] = "teststring";
you get a completely independent array s located in ordinary modifiable memory, which is just initialized with string literal. This means that that independent modifiable array s will get its initial value copied from the string literal. After that your s array and the string literal continue to exist as completely independent objects. The literal is still non-modifiable, while your s array is modifiable.
Basically, the latter declaration is functionally equivalent to
char s[11];
strcpy(s, "teststring");
You code could be segfaulting for a number of reasons. Here are the ones that come to mind
s is NULL
s points to a const string which is held in read only memory
s is not NULL terminated
I think #2 is the most likely. Can you show us the call site of reverse?
EDIT
Based on your sample #2 is definitely the answer. A string literal in C/C++ is not modifiable. The proper type is actually const char* and not char*. What you need to do is pass a modifiable string into that buffer.
Quick example:
char* pStr = strdup("foobar");
reverse(pStr);
free(pStr);
Are you testing this something like this?
int main() {
char * str = "foobar";
reverse(str);
printf("%s\n", str);
}
This makes str a string literal and you probably won't be able to edit it (segfaults for me). If you define char * str = strdup(foobar) it should work fine (does for me).
Your declaration is completely wrong:
char* s = "teststring";
"teststring" is stored in the code segment, which is read-only, like code. And, s is a pointer to "teststring", at the same time, you're trying to change the value of a read-only memory range. Thus, segmentation fault.
But with:
char s[] = "teststring";
s is initialized with "teststring", which of course is in the code segment, but there is an additional copy operation going on, to the stack in this case.
See Question 1.32 in the C FAQ list:
What is the difference between these initializations?
char a[] = "string literal";
char *p = "string literal";
My program crashes if I try to assign a new value to p[i].
Answer:
A string literal (the formal term for a double-quoted string in C source) can be used in two slightly different ways:
As the initializer for an array of char, as in the declaration of char a[], it specifies the initial values of the characters in that array (and, if necessary, its size).
Anywhere else, it turns into an unnamed, static array of characters, and this unnamed array may be stored in read-only memory, and which therefore cannot necessarily be modified. In an expression context, the array is converted at once to a pointer, as usual (see section 6), so the second declaration initializes p to point to the unnamed array's first element.
Some compilers have a switch controlling whether string literals are writable or not (for compiling old code), and some may have options to cause string literals to be formally treated as arrays of const char (for better error catching).
(emphasis mine)
See also Back to Basics by Joel.
Which compiler and debugger are you using? Using gcc and gdb, I would compile the code with -g flag and then run it in gdb. When it segfaults, I would just do a backtrace (bt command in gdb) and see which is the offending line causing the problem. Additionally, I would just run the code step by step, while "watching" the pointer values in gdb and know where exactly is the problem.
Good luck.
As some of the answers provided above, the string memory is read-only. However, some compilers provide an option to compile with writable strings. E.g. with gcc, 3.x versions supported -fwritable-strings but newer versions don't.
I think strlen can not work since s is not NULL terminated. So the behaviour of your for iteration is not the one you expect.
Since the result of strlen will be superior than s length you will write in memory where you should not be.
In addition s points to a constant strings hold by a read only memory. You can not modify it. Try to init s by using the gets function as it is done in the strlen example
This is a continuation of another question I have.
Consider the following code:
char *hi = "hello";
char *array1[3] =
{
hi,
"world",
"there."
};
It doesn't compile to my surprise (apparently I don't know C syntax as well as I thought) and generates the following error:
error: initializer element is not constant
If I change the char* into char[] it compiles fine:
char hi[] = "hello";
char *array1[3] =
{
hi,
"world",
"there."
};
Can somebody explain to me why?
In the first example (char *hi = "hello";), you are creating a non-const pointer which is initialized to point to the static const string "hello". This pointer could, in theory, point at anything you like.
In the second example (char hi[] = "hello";) you are specifically defining an array, not a pointer, so the address it references is non-modifiable. Note that an array can be thought of as a non-modifiable pointer to a specific block of memory.
Your first example actually compiles without issue in C++ (my compiler, at least).
This question already has answers here:
Why do I get a segmentation fault when writing to a "char *s" initialized with a string literal, but not "char s[]"?
(19 answers)
Closed 5 years ago.
I am trying to write code to reverse a string in place (I'm just trying to get better at C programming and pointer manipulation), but I cannot figure out why I am getting a segmentation fault:
#include <string.h>
void reverse(char *s);
int main() {
char* s = "teststring";
reverse(s);
return 0;
}
void reverse(char *s) {
int i, j;
char temp;
for (i=0,j = (strlen(s)-1); i < j; i++, j--) {
temp = *(s+i); //line 1
*(s+i) = *(s+j); //line 2
*(s+j) = temp; //line 3
}
}
It's lines 2 and 3 that are causing the segmentation fault. I understand that there may be better ways to do this, but I am interested in finding out what specifically in my code is causing the segmentation fault.
Update: I have included the calling function as requested.
There's no way to say from just that code. Most likely, you are passing in a pointer that points to invalid memory, non-modifiable memory or some other kind of memory that just can't be processed the way you process it here.
How do you call your function?
Added: You are passing in a pointer to a string literal. String literals are non-modifiable. You can't reverse a string literal.
Pass in a pointer to a modifiable string instead
char s[] = "teststring";
reverse(s);
This has been explained to death here already. "teststring" is a string literal. The string literal itself is a non-modifiable object. In practice compilers might (and will) put it in read-only memory. When you initialize a pointer like that
char *s = "teststring";
the pointer points directly at the beginning of the string literal. Any attempts to modify what s is pointing to are deemed to fail in general case. You can read it, but you can't write into it. For this reason it is highly recommended to point to string literals with pointer-to-const variables only
const char *s = "teststring";
But when you declare your s as
char s[] = "teststring";
you get a completely independent array s located in ordinary modifiable memory, which is just initialized with string literal. This means that that independent modifiable array s will get its initial value copied from the string literal. After that your s array and the string literal continue to exist as completely independent objects. The literal is still non-modifiable, while your s array is modifiable.
Basically, the latter declaration is functionally equivalent to
char s[11];
strcpy(s, "teststring");
You code could be segfaulting for a number of reasons. Here are the ones that come to mind
s is NULL
s points to a const string which is held in read only memory
s is not NULL terminated
I think #2 is the most likely. Can you show us the call site of reverse?
EDIT
Based on your sample #2 is definitely the answer. A string literal in C/C++ is not modifiable. The proper type is actually const char* and not char*. What you need to do is pass a modifiable string into that buffer.
Quick example:
char* pStr = strdup("foobar");
reverse(pStr);
free(pStr);
Are you testing this something like this?
int main() {
char * str = "foobar";
reverse(str);
printf("%s\n", str);
}
This makes str a string literal and you probably won't be able to edit it (segfaults for me). If you define char * str = strdup(foobar) it should work fine (does for me).
Your declaration is completely wrong:
char* s = "teststring";
"teststring" is stored in the code segment, which is read-only, like code. And, s is a pointer to "teststring", at the same time, you're trying to change the value of a read-only memory range. Thus, segmentation fault.
But with:
char s[] = "teststring";
s is initialized with "teststring", which of course is in the code segment, but there is an additional copy operation going on, to the stack in this case.
See Question 1.32 in the C FAQ list:
What is the difference between these initializations?
char a[] = "string literal";
char *p = "string literal";
My program crashes if I try to assign a new value to p[i].
Answer:
A string literal (the formal term for a double-quoted string in C source) can be used in two slightly different ways:
As the initializer for an array of char, as in the declaration of char a[], it specifies the initial values of the characters in that array (and, if necessary, its size).
Anywhere else, it turns into an unnamed, static array of characters, and this unnamed array may be stored in read-only memory, and which therefore cannot necessarily be modified. In an expression context, the array is converted at once to a pointer, as usual (see section 6), so the second declaration initializes p to point to the unnamed array's first element.
Some compilers have a switch controlling whether string literals are writable or not (for compiling old code), and some may have options to cause string literals to be formally treated as arrays of const char (for better error catching).
(emphasis mine)
See also Back to Basics by Joel.
Which compiler and debugger are you using? Using gcc and gdb, I would compile the code with -g flag and then run it in gdb. When it segfaults, I would just do a backtrace (bt command in gdb) and see which is the offending line causing the problem. Additionally, I would just run the code step by step, while "watching" the pointer values in gdb and know where exactly is the problem.
Good luck.
As some of the answers provided above, the string memory is read-only. However, some compilers provide an option to compile with writable strings. E.g. with gcc, 3.x versions supported -fwritable-strings but newer versions don't.
I think strlen can not work since s is not NULL terminated. So the behaviour of your for iteration is not the one you expect.
Since the result of strlen will be superior than s length you will write in memory where you should not be.
In addition s points to a constant strings hold by a read only memory. You can not modify it. Try to init s by using the gets function as it is done in the strlen example