C string assignment gives segmentation fault [duplicate] - c

This question already has answers here:
Why do I get a segmentation fault when writing to a "char *s" initialized with a string literal, but not "char s[]"?
(19 answers)
Closed 7 years ago.
I am new to C and I want to perform this task: declare and initialize a string and then reassign each string element to a new value.
Writing the code in this way:
char *str = "geeksforgeeks\0";
for (int i = 0; str[i] != '\0'; ++i) {
str[i] = 'a';
}
throws a segmentation fault.
But if I write the code in this manner:
char string[] = "geeksforgeeks\0";
char *str = string;
for (int i = 0; str[i] != '\0'; ++i) {
str[i] = 'a';
}
the program behaves correctly.
Also this code:
char str[] = "geeksforgeeks\0";
for (int i = 0; str[i] != '\0'; ++i) {
str[i] = 'a';
}
behaves correctly.
What is the difference between the two? Shouldn't they be equivalent?

char *str = "geeksforgeeks\0";
This string literal is typically placed in read-only* memory, and you can't modify it. The explicit null terminator there is also redundant, because the compiler already appends one.
That is not the case for the array you defined, which is why it works. With an array, the contents of the string literal are copied into the memory where the array resides, and you can modify the contents of that array. So using this
char *str = string;
you point to the first element of the array, which, as mentioned, is modifiable (as are all the other elements of the array).
*String literals are not necessarily stored in read-only memory; that depends on the platform. Either way, you are not allowed to modify them.
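To make the difference concrete, here is a minimal sketch. The helper names fill_with_a and demo_array_copy are made up for illustration and do not appear in the question. The loop is safe as long as it is handed an array, not a pointer to a literal:

```c
#include <stddef.h>
#include <string.h>

/* Overwrite every character before the terminator with 'a'.
   The caller must pass writable storage, never a string literal. */
void fill_with_a(char *s)
{
    for (size_t i = 0; s[i] != '\0'; ++i) {
        s[i] = 'a';
    }
}

/* Fill a private array copy of the literal and return its length. */
size_t demo_array_copy(void)
{
    char text[] = "geeksforgeeks"; /* 14 bytes: 13 letters plus '\0' */
    fill_with_a(text);             /* fine: text is an ordinary array */
    return strlen(text);
}
```

Passing a pointer to a string literal to fill_with_a would still compile, but the behavior would be undefined. As for the redundant terminator: sizeof "geeksforgeeks\0" is 15, while sizeof "geeksforgeeks" is 14.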

If you have:
char *str = "geeksforgeeks\0";
the string is (usually) stored in read-only memory and you get a segmentation fault when you try to modify it. (The \0 is really not needed; you have two null bytes at the end of the string.)
The simplest fix is to use an array instead of a constant string (which is basically what you do in the second working case):
char str[] = "geeksforgeeks";
Note that when you do want a pointer to the string literal itself, you should really declare it as const, since the literal is not modifiable:
const char *str = "geeksforgeeks";

The reason is simple.
In the first example, you have a pointer to a static string literal, and you write through it. That's why you get a segmentation fault.
char *str = "Test";
This is effectively a constant string. In the second example, however, it is a variable that you change.
// You have a variable here
char str_array[] = "Test";
// Now you have a pointer to str_array
char *str = str_array;

You’ve hit on a bit of ugly legacy baggage. When you write the literal "geeksforgeeks\0", the compiler creates an array of characters and turns the literal into a pointer to it. If you later use the string "geeksforgeeks\0" again, the compiler is allowed to point both references to the same array. This only works if you can’t modify the array; otherwise, fputs("geeksforgeeks\0", stdout); would print aeeksforgeeks. (Fortran can top this: on at least one compiler, you could pass the constant 1 by name to a function, set it equal to -1, and all your loops would then run backwards.) On the other hand, the C standard doesn’t say that modifying string literals won’t work, and there’s some old code that did it. It’s undefined behavior.
When you allocate an array to hold the string, you’re creating a unique copy, and that can be modified without causing errors elsewhere.
So why aren’t string literals const char * instead of char *? Early versions of C didn’t have the const keyword, and the standards committee didn’t want to break that much old code. However, you can and should declare pointers to string literals as const char* s = "geeksforgeeks\0"; so the compiler will stop you from shooting yourself in the foot.
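Here is a small sketch of the sharing point. The function names are hypothetical and chosen only for this illustration. Two arrays initialized from the same literal always get their own storage, while two pointers to identical literals may or may not share it:

```c
#include <stdbool.h>

/* True when both pointers refer to the same bytes in memory. */
bool same_storage(const char *a, const char *b)
{
    return a == b;
}

/* Arrays initialized from the same literal are separate objects. */
bool arrays_are_distinct(void)
{
    char first[]  = "geeksforgeeks";
    char second[] = "geeksforgeeks";
    first[0] = 'a'; /* changes only first */
    return !same_storage(first, second) && second[0] == 'g';
}

/* Whether identical literals share storage is unspecified, so this
   may return true or false depending on the compiler and its flags. */
bool literals_are_shared(void)
{
    const char *p = "geeksforgeeks";
    const char *q = "geeksforgeeks";
    return same_storage(p, q);
}
```

Because the pointers are declared const char *, an accidental write such as p[0] = 'a'; is rejected at compile time instead of crashing at run time.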

Related

memcpy array vs pointer example and question [duplicate]

This question already has answers here:
Why do I get a segmentation fault when writing to a "char *s" initialized with a string literal, but not "char s[]"?
(19 answers)
Closed 2 years ago.
This program illustrates my question:
#include <stdio.h>
#include <string.h>
void first_try() // segmentation fault
{
size_t numc = 1;
char *dest = "i "; // this is bad? i need to declare character array first? why?
char *src = "lone wolf c program\n";
memcpy(dest, src, numc);
printf("%s", dest);
}
void second_try() // works
{
size_t numc = 1;
char dest[24] = "i get overwritten";
char *src = "lone wolf c program\n";
memcpy(dest, src, 20);
printf("%s", dest);
}
int main(void)
{
//first_try(); // run-time error
second_try();
}
Why does the first_try() method cause a segmentation fault error?
Context
// feel free to ignore this context
I'm still a c programming newb. I went to https://www.devdocs.io and looked at the api for memcpy().
My instinct was to immediately write first_try(). I don't understand the difference between the dest variables in each function. Don't they both contain valid address values?
I read in a "strings as pointers" blog that "the character array containing the string must already exist". Apparently, writing just char *dest = "string"; compiles but is less useful than writing char buf[] = "string"; with a follow-up pointer that can be passed around: char *dest = buf;. I'd like to understand the 'why' in all of this.
In C, every string literal (like your "i ") is really a non-modifiable array containing the characters of the string, plus the null terminator.
Attempting to modify a literal string leads to undefined behavior.
When you use a pointer for dest
char *dest = "i ";
you make the pointer point to the first element of the three-element array for the string "i ".
When you use it as a destination for memcpy you attempt to modify the contents of this non-modifiable array, leading to undefined behavior and your crash.
This is why you should generally use const when defining such pointers:
const char *dest = "i ";
When you use an array for the destination, it's allocated in a writable part of the memory for your program and you can modify it as you please.
But please make sure the destination array is large enough to hold the full source string, including the null-terminator. Otherwise your memcpy call will write out of bounds of the allocated memory, and you will again have undefined behavior.
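One way to enforce the size requirement is a small checked copy. This is only a sketch, and copy_string is a hypothetical helper, not a standard library function:

```c
#include <stddef.h>
#include <string.h>

/* Copy src, including its terminator, into dest if it fits.
   Returns 0 on success, or -1 if dest is too small (dest is untouched). */
int copy_string(char *dest, size_t dest_size, const char *src)
{
    size_t needed = strlen(src) + 1; /* +1 for the '\0' */
    if (needed > dest_size) {
        return -1;
    }
    memcpy(dest, src, needed);
    return 0;
}
```

With char dest[24] from second_try(), calling copy_string(dest, sizeof dest, src) succeeds because the source needs 21 bytes, including the terminator. A 3-byte destination would be rejected instead of being overrun.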

How to change character in a string? [duplicate]

This question already has answers here:
What is the difference between char s[] and char *s?
(14 answers)
Closed 5 years ago.
I want to have a function that receives a string as a parameter and changes characters of the string:
char *strChanger(char *str).
I tried to implement it like this:
char *strChanger(char *str) {
if(str[0] != '\0') {
str[0] = 'a';
}
return str;
}
In the program it should be called like char *newstr = strChanger("hi");
But when I try to change a character in the string, the program crashes.
I did some experiments and found that:
// Works fine
char str[] = "hi";
str[0] = 'a';
// Crashes
char *str = "hi";
str[0] = 'a';
I don't understand the difference. Why doesn't the second block of code work?
Because modifying a string literal is undefined behavior, which in your case leads to your program crashing.
With char *newstr = strChanger("hi"), you pass the string literal directly and then try to change it.
In fact, char *str = "hi"; simply makes str point to the string literal. More specifically, the string literal is an array that is converted to a pointer to its first element, and that pointer is then assigned to str. You then tried to modify it, which is undefined behavior.
In the first case, you declare a char array and initialize it with the contents of the string literal. That gives you a modifiable copy, so changing it works.
If your platform provides the POSIX function strdup, you can do this
char *newstr = strChanger(strdup("hi"));
But then strChanger needs to check the value it receives, because strdup returns NULL if it fails to allocate memory for the copy. At some point after you are done with the string, you will also have to release the memory with free(newstr).
From standard 6.4.5p7 under string literal
It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.
From 6.7.9p14 under initialization
An array of character type may be initialized by a character string literal or UTF-8 string literal, optionally enclosed in braces. Successive bytes of the string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array.
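If strdup is not available, the same idea can be sketched with malloc. The helper duplicate_string is a hypothetical stand-in for strdup. The NULL check inside strChanger is an addition to the question's code:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for POSIX strdup: returns a heap copy of s,
   or NULL if the allocation fails. The caller must free the result. */
char *duplicate_string(const char *s)
{
    size_t size = strlen(s) + 1; /* include the terminator */
    char *copy = malloc(size);
    if (copy != NULL) {
        memcpy(copy, s, size);
    }
    return copy;
}

/* The question's strChanger with a guard against a NULL argument. */
char *strChanger(char *str)
{
    if (str != NULL && str[0] != '\0') {
        str[0] = 'a';
    }
    return str;
}
```

A call such as char *newstr = strChanger(duplicate_string("hi")); then modifies the heap copy rather than the literal. The caller should check newstr for NULL before using it and call free(newstr) afterwards.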

in C, Is it mandatory to pass character array name only to change characters of string constant? [duplicate]

This question already has answers here:
What is the difference between char s[] and char *s?
(14 answers)
Closed 5 years ago.
A string in C can be initialized in two ways: using an array or using a character pointer.
Both can access the string and print it.
When it comes to editing, if I want to edit a string that is initialized using an array, it is straightforward: we can edit the individual characters of the array.
If I want to edit a string that is initialized using a character pointer, is that impossible?
Let us consider the following two programs:
Program #1:
#include<stdio.h>
void str_change(char *);
int main()
{
char str[] = "abcdefghijklm";
printf("%s\n", str);
str_change(str);
printf("%s\n", str);
return 0;
}
void str_change(char *temp)
{
int i = 0;
while (temp[i] != '\0') {
temp[i] = 'n' + temp[i] - 'a';
i++;
}
}
Program #2:
#include<stdio.h>
void str_change(char *);
int main()
{
char *str = "abcdefghijklm";
printf("%s\n", str);
str_change(str);
printf("%s\n", str);
return 0;
}
void str_change(char *temp)
{
int i = 0;
while (temp[i] != '\0') {
temp[i] = 'n' + temp[i] - 'a';
i++;
}
}
I tried the following version of the function in program #2, but to no avail:
void str_change(char *temp)
{
while (*temp != '\0') {
*temp = 'n' + *temp - 'a';
temp++;
}
}
The first program works fine, but the other one gives a segmentation fault. So, is it mandatory to pass only the string constants that are initialized using arrays between functions, if editing of the string is required?
So is it mandatory to pass only the string constants that are initialized using arrays between functions, if editing of the string is required?
Basically, yes, though the real explanation is not these exact words. The following definition creates an array:
char str[] = "abc";
This is not a string literal object. The "abc" token is string literal syntax, not a string literal object. Here, that literal specifies the initial value for the str array. Array objects are modifiable.
char *str = "abc";
Here the "abc" syntax in the source code is an expression denoting a string literal object in the translated program image. It's also a kind of array, with static storage duration (regardless of the storage duration of str). The "abc" syntax evaluates to a pointer to the first character of this array, and the str pointer is initialized with that pointer value.
String literals are not required to support modification; the behavior of attempting to modify a string literal object is undefined behavior.
Even in systems where you don't get a predictable segmentation fault, strange things can happen. For instance:
char *a = "xabc";
char *b = "abc";
b[0] = 'b'; /* b changes to "bbc" */
Suppose the assignment works. It's possible that a will also be changed to "xbbc". A C compiler is allowed to merge the storage of identical literals, or literals which are suffixes of other literals.
It doesn't matter whether or not a and b are close together; this sneaky effect could occur even between distant declarations in different functions, perhaps even in different translation units.
String literals should be considered to be part of the program's image; a program which successfully modifies a string literal is effectively self-modifying code. The reason you get a "segmentation fault" in your environment is precisely because of a safeguard against self-modifying code: the "text" section of the compiled program (which contains the machine code) is located in write-protected pages of virtual memory. And the string literals are placed there together with the machine code (often interspersed among its functions). Attempts to modify a string literal result in write accesses to the text section, which are blocked by the permission bits on the pages.
In another kind of environment, C code might be used to produce a software image which goes into read-only memory: actual ROM chips. The string literals go into the ROM together with the code. Attempting to modify one adds up to attempting to modify ROM. The hardware might have no detection for that. For instance, the instruction might appear to execute, but when the location is read back, the original value is still there, not the new value. Like the segmentation fault, this is within the specification range of "undefined behavior": any behavior is!
String literals are stored in static-duration storage, which exists for the lifetime of the program and may be read-only. Changing the contents of a literal leads to undefined behavior.
Copy the literal into a modifiable array and pass that array to the function.
char array[5];
strcpy(array, "test");
If you declare a pointer to a string literal, make it const so the compiler will warn you if you try to modify it.
const char * ptr = " string literal";
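Applied to the programs in the question, a minimal sketch could look like this. The wrapper change_copy is a hypothetical name added for illustration, and the code assumes an ASCII-compatible character set, as the question's code does:

```c
#include <string.h>

/* Same transformation as str_change in the question:
   shifts each character by the distance from 'a' to 'n', in place. */
void str_change(char *temp)
{
    for (; *temp != '\0'; ++temp) {
        *temp = (char)('n' + *temp - 'a');
    }
}

/* Copy the literal into a caller-supplied writable buffer, then edit
   the copy. out must have room for at least 14 bytes. */
void change_copy(char *out)
{
    strcpy(out, "abcdefghijklm"); /* writable copy of the literal */
    str_change(out);              /* safe: only the copy is modified */
}
```

A caller can size the buffer directly from the literal, for example char buffer[sizeof "abcdefghijklm"];, so the array is exactly large enough for the characters and the terminator.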
I think it is because, when you use a pointer to a string literal, you can only read the array; you can't write anything to it in a loop, because the literal it points to is not modifiable.

Segmentation Fault while using tolower() on dynamic arrays [duplicate]

This question already has answers here:
Why do I get a segmentation fault when writing to a "char *s" initialized with a string literal, but not "char s[]"?
(19 answers)
Closed 8 years ago.
I put this code on my C compiler (Dev Cpp).
char *str = "SomeTHing";
for(int i = 0; str[i]; i++){
str[i] = tolower(str[i]);
}
This gives a segmentation fault, whereas if I use an array,
char str[10] = "SomeTHing";
the loop works fine. Can anyone tell why is this happening?
char *str = "SomeTHing"; makes str point to a string literal, which is typically stored in read-only memory. Changing its contents in any way is undefined behaviour. On your system that manifests itself as a crash. It's a pity that either (i) your compiler is not warning you about assigning this to a char* rather than a const char*, or (ii) you're ignoring the warning.
char str[10] = "SomeTHing"; allocates the buffer on the stack, including the null terminator. Changing its contents is defined, although you need to keep a null terminator intact if you want to use some of the string library functions like strlen that rely on it.
char *str = "SomeTHing";
will place SomeTHing in a read-only part of memory and make str a pointer to it, so any write to this memory is illegal. Any attempt to modify it causes undefined behaviour.
Now the following case
char str[10] = "SomeTHing";
works because, although the literal string itself may be placed in read-only memory, the string is copied into newly allocated memory on the stack. It will probably be stored within an "initialized data segment" that is loaded from the executable file into writable memory when the program is run.
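As a sketch of the working variant, the loop can be wrapped in a small function. The name lowercase_in_place is made up for this example. The cast to unsigned char is an addition to the question's code: tolower expects a value representable as unsigned char (or EOF), so a plain char that happens to be negative should be converted first.

```c
#include <ctype.h>

/* Lowercase a string in place. s must point to writable storage,
   such as a char array, never to a string literal. */
void lowercase_in_place(char *s)
{
    for (; *s != '\0'; ++s) {
        *s = (char)tolower((unsigned char)*s);
    }
}
```

With char str[10] = "SomeTHing"; the call lowercase_in_place(str); modifies the array's own copy of the characters, and the terminating null byte stays intact.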

Segmentation Fault ++operator on char * [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Why does this Seg Fault?
I receive a segmentation fault when using ++ operator on a char *
#include<stdio.h>
int main()
{
char *s = "hello";
printf("%c ", ++(*s));
return 0;
}
But if I do the following:
#include<stdio.h>
int main()
{
char *s = "hello";
char c = *s;
printf("%c ", ++c);
return 0;
}
Then the code compiles and runs perfectly. What is the problem with the first version?
The first code snippet is attempting to modify a character in a string literal as:
++(*s)
is attempting to increment the first character of the string that s points to. String literals are (commonly) read-only, and an attempt to modify one will cause the segmentation fault (the C standard states: "If the program attempts to modify such an array, the behavior is undefined.").
The second snippet is modifying a char variable, which is not read-only as after:
char c = *s;
c is a copy of the first character in s and c can be safely incremented.
In the first case you modify a constant literal, and in the second you modify a variable.
This code:
printf("%c ", ++(*s));
tries to modify a string literal through a pointer to one of its characters. Modifying string literals is undefined behavior. String literals are often stored in read-only memory, so the write is not permitted, and that is why it manifests itself as a segmentation fault on your system.
char *s = "hello";
This makes s point to a string literal, which should be treated as a const string.
If you need a non-const string, you should copy it into writable storage, such as a char array or memory allocated from the heap.
You are trying to change a string literal in the first case which is not allowed. In the second case you create a new char from the first character of the string literal. You modify the copy of that character and that is why the second case works.
Your code does not have write permission for the segment where the string literal is stored.
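The two situations can be put side by side in a short sketch. Both function names are hypothetical and used only for illustration:

```c
/* Increment a copy of the first character. The string that s points
   to is only read, so passing a string literal is fine. */
char next_after_first(const char *s)
{
    char c = *s; /* c is an independent, writable copy */
    return ++c;
}

/* Increment the first character in place. s must point to writable
   storage, such as a char array, never to a string literal. */
char bump_first(char *s)
{
    return ++(*s);
}
```

Calling next_after_first("hello") only reads the literal. bump_first is safe with char word[] = "hello"; but has undefined behavior if it is given the literal directly.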

Resources