Modify a string with pointer [duplicate] - c

This question already has answers here:
Increment the first byte of a string by one
(3 answers)
Closed 7 years ago.
These two programs are both meant to replace the character at index 2 of a string with '4':
#include <stdio.h>
int main(int argc, char *argv[]){
    char *s = "hello";
    *(s+2)='4';
    printf( "%s\n",s);
    return 0;
}
When I run this I get a segmentation fault, but not when I run this:
#include <stdio.h>
int main(int argc, char *argv[]){
    char *s = argv[1];
    *(s+2)='4';
    printf( "%s\n",s);
    return 0;
}
I know that there are other methods to do this. What is the difference between the 2 programs?

In your first case, you're facing undefined behaviour by attempting to modify a string literal. A segmentation fault is one of the common side-effects of UB.
In your code,
char *s = "hello";
essentially puts the starting address of the string literal "hello" into s. Now, if you modify the content of *s (or *(s+n), provided n does not go out of bounds), you are actually trying to modify that string literal. String literals are usually stored in read-only memory, so they usually cannot be modified. Quoting from C11, chapter §6.4.5, String literals (emphasis mine):
It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.
However, in your second case, you're doing
char *s = argv[1];
which puts the value of argv[1] into s. Now s points to the string contained in argv[1]. The contents of argv[1] (or argv[n], in general) are not read-only and can be modified. So, using *s (or *(s+n), provided n does not go out of bounds), you can modify the contents.
This case is defined behaviour because, as per C11 §5.1.2.2.1, Program startup:
The parameters argc and argv and the strings pointed to by the argv array shall be modifiable by the program, and retain their last-stored values between program startup and program termination.
So the second case is a special case: the strings in argv[n] are, by the rules of the C standard, modifiable.
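Both quoted passages come with the caveat "provided n does not go out of bounds", which the question's code never checks. A minimal sketch of a bounds-checked write follows; the helper name set_char_at and its return convention are assumptions made for illustration, not part of the original code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: overwrite s[idx] with c only if idx lies inside the
 * existing string. Returns 0 on success, -1 if s is NULL or idx is out of
 * bounds. s must point to modifiable storage (an array, heap memory, or an
 * argv[n] string), never to a string literal. */
int set_char_at(char *s, size_t idx, char c)
{
    if (s == NULL || idx >= strlen(s)) {
        return -1;          /* nothing is written on failure */
    }
    s[idx] = c;             /* same effect as *(s + idx) = c */
    return 0;
}
```

In main, one would first check that argc > 1 and then call set_char_at(argv[1], 2, '4'), so a missing or too-short argument is reported instead of being written past.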

As Sourav said, attempting to modify a string literal invokes undefined behavior. If you did the following instead, it would work fine:
#include <stdio.h>
int main(int argc, char *argv[]){
    char s[] = "hello";   /* a modifiable array initialized from the literal */
    *(s+2)='4';
    printf( "%s\n",s);
    return 0;
}

When you do
char *s = "hello";
you are making s point to the beginning of a string literal, which is non-modifiable. That means you can read what's there but you can't change it, which is why
*(s+2)='4';
is giving you a segmentation fault.
In your second case, you are not pointing your pointer at a string literal, so you can modify what it points to.
In fact, in your second case, you are using argv, which the C standard specifically states is modifiable.

The literal "hello" is typically placed in read-only memory, so it cannot be changed.
Trying to change it is what causes the segmentation fault.
The strings in argv[] do not have this problem: the C standard requires them to be modifiable, so changing a command-line argument in place is allowed as long as you stay within its existing length.
You could use ' char s[] = "hello"; ', as that makes s a modifiable array (on the stack, for a local variable) initialized from the literal.
If you need an independent copy of a command-line argument, get its length, malloc() length+1 bytes, and strcpy() the argument into that buffer.
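The malloc()-plus-strcpy() approach mentioned above can be sketched as follows. The function name copy_string is hypothetical, and the caller is assumed to free() the result.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a freshly malloc()ed, modifiable copy of src,
 * or NULL if src is NULL or the allocation fails. The caller must free() it. */
char *copy_string(const char *src)
{
    if (src == NULL) {
        return NULL;
    }
    size_t len = strlen(src);
    char *copy = malloc(len + 1);   /* +1 for the terminating '\0' */
    if (copy != NULL) {
        strcpy(copy, src);          /* copies the terminator as well */
    }
    return copy;
}
```

The copy lives on the heap, so writing to copy[2] is fine even when src was a string literal.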

Related

How char * pointers works [duplicate]

This question already has answers here:
Why do I get a segmentation fault when writing to a "char *s" initialized with a string literal, but not "char s[]"?
(19 answers)
Closed 5 years ago.
I am trying to write code to reverse a string in place (I'm just trying to get better at C programming and pointer manipulation), but I cannot figure out why I am getting a segmentation fault:
#include <string.h>
void reverse(char *s);
int main() {
    char* s = "teststring";
    reverse(s);
    return 0;
}
void reverse(char *s) {
    int i, j;
    char temp;
    for (i=0,j = (strlen(s)-1); i < j; i++, j--) {
        temp = *(s+i); //line 1
        *(s+i) = *(s+j); //line 2
        *(s+j) = temp; //line 3
    }
}
It's lines 2 and 3 that are causing the segmentation fault. I understand that there may be better ways to do this, but I am interested in finding out what specifically in my code is causing the segmentation fault.
Update: I have included the calling function as requested.
There's no way to say from just that code. Most likely, you are passing in a pointer that points to invalid memory, non-modifiable memory or some other kind of memory that just can't be processed the way you process it here.
How do you call your function?
Added: You are passing in a pointer to a string literal. String literals are non-modifiable. You can't reverse a string literal.
Pass in a pointer to a modifiable string instead
char s[] = "teststring";
reverse(s);
This has been explained to death here already. "teststring" is a string literal. The string literal itself is a non-modifiable object. In practice compilers might (and will) put it in read-only memory. When you initialize a pointer like that
char *s = "teststring";
the pointer points directly at the beginning of the string literal. Any attempt to modify what s points to is doomed to fail in the general case. You can read it, but you can't write into it. For this reason it is highly recommended to point to string literals with pointer-to-const variables only
const char *s = "teststring";
But when you declare your s as
char s[] = "teststring";
you get a completely independent array s located in ordinary modifiable memory, which is just initialized with string literal. This means that that independent modifiable array s will get its initial value copied from the string literal. After that your s array and the string literal continue to exist as completely independent objects. The literal is still non-modifiable, while your s array is modifiable.
Basically, the latter declaration is functionally equivalent to
char s[11];
strcpy(s, "teststring");
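With a modifiable array in hand, the reversal itself can be sketched like this. It is a variant of the question's loop rather than the asker's exact code: the name reverse_in_place is an assumption, and it uses size_t indices so that strlen(s) - 1 is never computed for an empty string.

```c
#include <stddef.h>
#include <string.h>

/* Reverse the string s in place. s must point to modifiable storage,
 * e.g. char s[] = "teststring"; and not to a string literal. */
void reverse_in_place(char *s)
{
    size_t len = strlen(s);
    if (len < 2) {
        return;                 /* empty or one-character string */
    }
    for (size_t i = 0, j = len - 1; i < j; i++, j--) {
        char temp = s[i];
        s[i] = s[j];
        s[j] = temp;
    }
}
```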
Your code could be segfaulting for a number of reasons. Here are the ones that come to mind:
1. s is NULL
2. s points to a const string which is held in read-only memory
3. s is not NUL-terminated
I think #2 is the most likely. Can you show us the call site of reverse?
EDIT
Based on your sample, #2 is definitely the answer. A string literal in C/C++ must not be modified. In C++ its type is an array of const char; in C it is formally an array of plain char for historical reasons, but it should still be pointed to with const char* rather than char*. What you need to do is pass a modifiable string into that function.
Quick example:
char* pStr = strdup("foobar");
reverse(pStr);
free(pStr);
Are you testing this with something like this?
int main() {
    char * str = "foobar";
    reverse(str);
    printf("%s\n", str);
}
This makes str point to a string literal, and you probably won't be able to edit it (it segfaults for me). If you define char * str = strdup("foobar") instead, it should work fine (it does for me).
Your declaration is the problem:
char* s = "teststring";
"teststring" is typically stored in a read-only segment of the program image, much like the code itself. s is a pointer to "teststring", and you're trying to change the contents of that read-only memory range. Thus, segmentation fault.
But with:
char s[] = "teststring";
s is initialized with "teststring". The literal itself still lives in read-only storage, but its contents are copied into the array s, which is on the stack in this case.
See Question 1.32 in the C FAQ list:
What is the difference between these initializations?
char a[] = "string literal";
char *p = "string literal";
My program crashes if I try to assign a new value to p[i].
Answer:
A string literal (the formal term for a double-quoted string in C source) can be used in two slightly different ways:
As the initializer for an array of char, as in the declaration of char a[], it specifies the initial values of the characters in that array (and, if necessary, its size).
Anywhere else, it turns into an unnamed, static array of characters, and this unnamed array may be stored in read-only memory, and which therefore cannot necessarily be modified. In an expression context, the array is converted at once to a pointer, as usual (see section 6), so the second declaration initializes p to point to the unnamed array's first element.
Some compilers have a switch controlling whether string literals are writable or not (for compiling old code), and some may have options to cause string literals to be formally treated as arrays of const char (for better error catching).
(emphasis mine)
See also Back to Basics by Joel.
Which compiler and debugger are you using? Using gcc and gdb, I would compile the code with -g flag and then run it in gdb. When it segfaults, I would just do a backtrace (bt command in gdb) and see which is the offending line causing the problem. Additionally, I would just run the code step by step, while "watching" the pointer values in gdb and know where exactly is the problem.
Good luck.
As some of the answers above explain, the string memory is read-only. However, some compilers provide an option to compile with writable strings. E.g. with gcc, 3.x versions supported -fwritable-strings but newer versions don't.
strlen is not the problem here: a string literal is always NUL-terminated, so the loop bounds are what you expect.
The problem is that s points to a constant string held in read-only memory. You cannot modify it. Initialize s as a modifiable buffer instead, for example a char array filled with fgets (not gets, which is unsafe and was removed in C11).

I don't understand the difference between a char array and a char * string [duplicate]

This question already has answers here:
What is the difference between char s[] and char *s?
(14 answers)
Closed 7 years ago.
I made the following little program:
#include <stdio.h>
void strncpy(char * s, char * t, int n);
int
main()
{
    char string1[]="Learning strings";
    char string2[10];
    strncpy(string2,string1,3);
    printf("string1:%s\nstring2:%s\n",string1,string2);
    return 0;
}
void strncpy(char * s, char * t, int n)
{
    int i;
    for(i=0; i<n && t[i]!=0;i++)
        s[i]=t[i];
    s[i]=0;
}
I was trying to learn the difference between doing something like:
char greeting[]="Hello!";
And
char * farewell="Goodbye!";
And I thought my program would work with either of the two types of 'strings' (is that the correct way of saying it?), but it only works with the first one.
Why does this happen? What's the difference between the two types?
What would I have to do to my program to be able to use strings of the second type?
The statement
char greeting[] = "Hello!";
causes the compiler to work out the size of the string literal "Hello!" (7 characters including the terminating '\0'), create an array of that size, and then copy that string into that array. The resulting array greeting can be modified (e.g. its characters overwritten).
The statement
char * farewell="Goodbye!";
creates a pointer that points at the first character in the string literal "Goodbye!". That string literal cannot be modified without invoking undefined behaviour.
Either greeting or farewell can be passed to any function that does not attempt to modify them. greeting can also be passed to any function which modifies it (as long as only characters greeting[0] through to greeting[6] are modified, and no others). If farewell is modified, the result is undefined behaviour.
Generally speaking, it is better to change the definition of farewell to
const char * farewell="Goodbye!";
which actually reflects its true nature (and will, for example, cause a compilation error if farewell is passed to a function expecting a non-const parameter). The fact that it is possible to define farewell as a non-const pointer while it points at (the first character of) a string literal is a historical anomaly.
And, of course, if you want farewell to be safely modifiable, declare it as an array, not as a pointer.
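To make this concrete, here is a minimal sketch of a copy routine whose signature documents which argument gets modified. The name copy_prefix and the extra dst_size parameter are assumptions added for illustration; they are not in the question's code.

```c
#include <stddef.h>

/* Hypothetical variant of the question's copy routine. The source is
 * const-qualified, so either an array or a pointer to a string literal can be
 * passed as src; dst must be modifiable and dst_size is its capacity. */
size_t copy_prefix(char *dst, size_t dst_size, const char *src, size_t n)
{
    size_t i = 0;
    if (dst_size == 0) {
        return 0;               /* no room even for the terminator */
    }
    while (i < n && i + 1 < dst_size && src[i] != '\0') {
        dst[i] = src[i];
        i++;
    }
    dst[i] = '\0';
    return i;                   /* number of characters copied */
}
```

Passing farewell as src is fine because the function only reads from it. Passing farewell as dst would be the undefined-behaviour case, and with const char * farewell the compiler would diagnose it.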
The string literals "Hello" and "Goodbye" are stored as arrays of char such that they are allocated at program startup and released at program exit, and are visible over the entire program. They may be stored in such a way that they cannot be modified (such as in a read-only data segment). Attempting to modify the contents of a string literal results in undefined behavior, meaning the compiler isn't required to handle the situation in any particular way - it may work the way you want, it may result in a segmentation violation, or it may do something else.
The line
char greeting[] = "Hello";
allocates enough space to hold a copy of the literal and writes the contents of the literal to it. You may modify the contents of this array at will (although you can't store strings longer than "Hello" to it).
The line
char *farewell = "Goodbye";
creates a pointer and writes the address of the string literal "Goodbye" to it. Since this is a pointer to a string literal, we cannot write to the contents of the literal through that pointer.

Segmentation Fault ++operator on char * [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Why does this Seg Fault?
I receive a segmentation fault when using the ++ operator on a char *:
#include<stdio.h>
int main()
{
    char *s = "hello";
    printf("%c ", ++(*s));
    return 0;
}
But if I do the following:
#include<stdio.h>
int main()
{
    char *s = "hello";
    char c = *s;
    printf("%c ", ++c);
    return 0;
}
Then the code runs without problems. What is wrong with the first snippet?
The first code snippet is attempting to modify a character in a string literal as:
++(*s)
is attempting to increment the first character of the string s points to. String literals are (commonly) stored in read-only memory, and an attempt to modify one can cause a segmentation fault (the C standard states: "If the program attempts to modify such an array, the behavior is undefined.").
The second snippet modifies a char variable, which is not read-only, because after:
char c = *s;
c is a copy of the first character in s and c can be safely incremented.
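If the goal is to increment the character inside the string itself rather than a copy, the string has to live in writable storage. A minimal sketch, with the helper name increment_first chosen only for illustration:

```c
#include <stddef.h>

/* Hypothetical helper: increment the first character of a modifiable string
 * and return the new character. Returns '\0' without writing anything if s is
 * NULL or empty. */
char increment_first(char *s)
{
    if (s == NULL || s[0] == '\0') {
        return '\0';
    }
    return ++s[0];      /* same as ++(*s), but on writable storage */
}
```

Called on char s[] = "hello";, this changes the array in place. Called on char *s = "hello";, it would hit the same undefined behaviour as the first snippet.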
In the first case you modify a constant literal, and in the second you modify a variable.
This code:
printf("%c ", ++(*s));
tries to modify a string literal through a pointer to one of its characters. Modifying a string literal is undefined behavior. In practice, string literals are often stored in read-only memory, which is why the attempt shows up as a segmentation fault on your system.
char *s = "hello";
Here s points to a string literal, which must be treated as const.
If you need a modifiable string, use a char array (char s[] = "hello";) or allocate a copy explicitly on the heap.
You are trying to change a string literal in the first case which is not allowed. In the second case you create a new char from the first character of the string literal. You modify the copy of that character and that is why the second case works.
Your code does not have write permission for the segment where the string literal is stored.
