Why does this C program throw a segmentation fault at runtime?

I'm declaring a character array as char* string. I then declare another pointer that refers to the same string, and when I try to change anything in that string, the program throws a segmentation fault at runtime.
#include <stdio.h>
#include <string.h>
#include <ctype.h>
int main(void)
{
char* s = "random";
char* t = s;
*t = 'R'; // ----> Segmentation fault
t[0] = toupper(t[0]); // ----> Segmentation fault
*s = 'R'; // ----> Segmentation fault
s[0] = 'R'; // ----> Segmentation fault
printf("s is : %s , address : %p \n", s,s);
printf("t is : %s , address : %p \n", t,t);
return 0;
}
Even this does not work:
#include <stdio.h>
#include <string.h>
#include <ctype.h>
int main(void)
{
char *p = "random";
*p = 'N'; // ----> Segmentation fault
return 0;
}

I'm declaring a character array as char* string.
This is where your problems begin! Although pointers and arrays have some things in common, syntactically, they are not the same. What you are doing in the line copied below is declaring s as a pointer to a char and initializing that pointer with the address of the string literal you provide.
char* s = "random";
Because it is a string literal, the compiler is allowed (though not obliged) to place that data in read-only memory; thus, when you later attempt to modify the character pointed to by s (or by any other pointer, such as your t, that holds the same address), you get undefined behaviour. Some systems will crash your program ("Segmentation fault"); others may silently let you 'get away' with it. Indeed, you may even get different results from the same code at different times.
To fix this, and to properly declare a character array, use the [] notation:
char a[] = "random";
This will declare a as a (modifiable) array of characters (whose size is determined, in this case, by the initial value you give it - 7, here, with the terminating nul character included); then, the compiler will initialize that array with a copy of the string literal's data. You are then free to use an expression like *a to refer to the first element of that array.
The following short program may be helpful:
#include <stdio.h>
int main()
{
char* s = "random";
*s = 'R'; // Undefined behaviour: could be ignored, could crash, could work!
printf("%s\n", s);
char a[] = "random";
*a = 'R'; // Well-defined behaviour - this will change the first letter of the string.
printf("%s\n", a);
return 0;
}
(You may need to comment-out the lines that use s to get the code to run to the other lines!)

"random" is a string literal. On most platforms it is placed in a protected, read-only memory region (the OS instructs the CPU to prevent write operations to that memory).
's' and 't' both point to this protected read-only region.
Once you try to write to it, the CPU detects the attempt and raises an exception.

You are trying to modify a string literal. That is undefined behaviour, and anything may happen, including a segfault.
The C standard lists this among the cases of undefined behaviour:
—The program attempts to modify a string literal (6.4.5).

Related

Why doesn't the strcat implementation cause a segmentation fault?

I'm trying to learn C and I tried the exercise in the book "The C Programming Language" in which I implemented the strcat() function as below:
char *my_strcat(char *s, const char *t)
{
char *dest = s;
while (*s) s++;
while (*s++ = *t++);
return dest;
}
And I'm calling this like:
int main(int argc, char const *argv[])
{
char x[] = "Hello, ";
char y[] = "World!\n";
my_strcat(x, y);
puts(x);
return 0;
}
My problem is, I don't fully understand my own implementation.
My question is by the time line while (*s) s++; completes, now I have set the address held in s to the memory location that contains \0 which is the last element of array x.
Then in the line while (*s++ = *t++);, I'm setting s to the address of the next block of memory which is outside array x and copy the content of t to this new location. How is it allowed to write the content at the location of t to the location pointed by s when it was not part of the storage I had requested when I initialized x?
If I called my_strcat like below, I get a segmentation fault:
int main(int argc, char const *argv[])
{
char *x = "Hello, ";
char *y = "World!\n";
my_strcat(x, y);
puts(x);
return 0;
}
which kind of makes sense. My understanding is char *x = "foo" and char x[] = "foo" are the same with the difference that in the latter case, storage allocated to x is fixed while the first is not. So, I feel like the segmentation fault should happen in the latter case, rather the former?
Thank you for clarification.

This is undefined behavior. Anything can happen.
The practical reason the first program appears to work is that its arrays are stored on the stack. my_strcat writes past the end of x and overwrites adjacent stack memory, which is undefined behavior that just happens to "work" here.
The second program doesn't "work" because its string literals are typically stored in read-only memory. Writing to a string literal is undefined behavior, which in practice often shows up as a segfault.
Your implementation of strcat is valid; you just need to allocate adequate space for the string that you're appending to.
So, to recap:
How is it allowed to write the content at the location of t to the location pointed by s when it was not part of the storage I had requested when I initialized x?
It's not. This is undefined behavior that just happens to "work".

C String Length using null

I know the C language has dynamic length strings whereby it uses the special character null (represented as 0) to terminate a string - rather than maintaining the length.
I have this simple C code that creates a string with the null character in the fifth index:
#include <stdio.h>
#include <stdlib.h>
int main () {
char * s= "sdfsd\0sfdfsd";
printf("%s",s);
s[5]='3';
printf("%s",s);
return 0;
}
Thus, a print of the string will only output up to the fifth index. Then the code changes the character at the fifth index to a '3'. Given my understanding, I assumed it would print the full string with the 3 instead of the null, as such:
sdfsdsdfsd3sfdfsd
but instead it outputs:
sdfsdsdfsd
Can someone explain this?

This program exhibits undefined behavior because you modify a string literal. char* s = "..." makes s point to memory that must be treated as constant; C++ actually disallows pointing a non-const char* at a string literal, but in C it's still possible, and we have to be careful (see this SO answer for more details and a C99 standards quote).
Change the assignment line to:
char s[] = "sdfsd\0sfdfsd";
This creates an array on the stack and initializes it with a copy of the string. In this case modifying s[5] is valid and you get the result you expect.

String literals must not be modified. The compiler usually puts them into a read-only data section (though this varies by platform), and the effect of attempting to modify a string literal is undefined.
In your code:
char * s= "sdfsd\0sfdfsd";
Here, s is a char pointer pointing to the string "sdfsd\0sfdfsd", which is stored in read-only memory and is therefore effectively immutable.
Here you are trying to modify the content of read-only memory:
s[5]='3';
which leads to undefined behavior.
Instead, you can use char[]:
#include <stdio.h>
int main () {
char a[] = "sdfsd\0sfdfsd";
char * s = a;
printf("%s",s);
s[5]='3';
printf("%s\n",s);
return 0;
}

This operation failed:
s[5]='3';
You're trying to change a string literal, which must be treated as read-only. In my testing the program exited with a segfault:
Segmentation fault (core dumped)
You should store it in an array (or allocated memory) before any attempts to change it:
char s[] = "sdfsd\0sfdfsd";
With the above change, the program works as intended.

#include <stdio.h>
int main(){
char x[10] = "aa\0a";
x[2] = '1';
puts(x);
printf("\n\n\nPress Enter to exit...");
getchar(); /* standard replacement for the non-standard getch() from conio.h */
return 0;
}
Output: aa1a

Can a string pointer in C be directly assigned a string literal?

The following program works fine, and I don't understand why:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
void xyz(char **value)
{
// *value = strdup("abc");
*value = "abc"; // <-- ??????????
}
int main(void)
{
char *s1;
xyz(&s1);
printf("s1 : %s \n", s1);
}
Output :
s1 : abc
My understanding was that I have to use the strdup() function to allocate memory for a string in C when I have not allocated memory for it myself. But in this case the program seems to work fine by just assigning the string value using " ". Can anyone please explain?

String literals don't exist in the ether. They reside in your program's memory and have an address.
Consequently you can assign that address to pointers. The behavior of your program is well defined, and nothing bad will happen, so long as you don't attempt to modify a literal through a pointer.
For that reason, it's best to make the compiler work for you by being const correct. Prefer to mark the pointee type as const whenever possible, and your compiler will object to modification attempts.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
void xyz(char const **value)
{
*value = "abc";
}
int main(void)
{
char const *s1;
xyz(&s1);
printf("s1 : %s \n", s1);
s1[0] = 'a'; // <-- compile error: s1 points to const char
}

Your program works because the string literal "abc" is a character array. When you assign a string literal to a pointer, the array decays to the address of its first element, just as the name of any other array does.
So in your program you pass the address of a char pointer to the function xyz, and in
*value = "abc";
the address of that array's first element is stored in the char pointer. It is worth knowing that the compiler typically places the array in read-only memory. In C the literal's type is an array of plain char, not const char, so an attempt to modify it through a char pointer compiles but is undefined behaviour at run time; you get a compile-time error only if the pointer is declared as pointing to const char.

You can define a string in C with char *str = "some string";. Here str is a pointer that points to the location of the first character of the string.

C: segmentation fault associated with `swprintf`

After calling swprintf to convert a char to a wchar, calling a loop triggers a segmentation fault; the statements in the loop header work fine outside the context of the loop. The swprintf command is successful based on the return value (3, which is the number of characters written to the output buffer up to the terminating null character), a fact that can be confirmed by removing the lines for the loop to prevent the segmentation fault. Nevertheless, a for or while loop called afterwards (or before) produces a segmentation fault. This problem appears to be solved by changing char* string to char string[4], implicating a pointer as the source of the problem; given my use case, however, I would rather use a char pointer than a char array. Oddly though, if the loop is removed concomitant with the change of char* to char string[4], then a segmentation fault ensues. Thus, some undefined behavior is occurring, likely because of a malformed swprintf statement; how can this statement be written properly, and if the statement is malformed, why does it nonetheless produce the expected return value?
#include <stdio.h>
#include <wchar.h>
void main() {
char* string = "ABC";
//char string[4] = "ABC";
wchar_t* string_w;
swprintf(string_w, 4, L"%hs", string);
int retval = swprintf(string_w, 4, L"%hs", string);
printf("string: %ls\nreturn value: %d\n",string_w, retval);
for (int i = 0; i < 2; i++) {
// do nothing
}
}
Note: I am compiling using gcc: (Debian 6.2.1-5) 6.2.1 20161124.

This
wchar_t* string_w;
and this
swprintf(string_w, 4, L"%hs", string);
without making string_w point to valid memory is a recipe for problems. You need to properly initialize string_w before using it, e.g.
wchar_t* string_w = malloc(SOME_PROPER_SIZE);

You need to allocate the space that string_w points to; as it is, it is a pointer that doesn't point anywhere in particular.

String copy(strcpy)

I have the following code.
#include <string.h>
#include <stdio.h>
int main()
{
char * l;
*l = 'c';
*(l+1) = 'g';
*(l+2) = '\0';
char *second;
strcpy(second, l);
printf("string: %s\n", second);
}
When I run it, the output says "Segmentation fault". Any suggestions?
Thanks

l is an uninitialized pointer; you can't dereference it. You should allocate enough space to write its contents (statically (1) or dynamically (2)).
char l[3]; /* (1) */
#include <stdlib.h>
char *l = malloc(3); /* (2) */
It is the same error with strcpy: second is an uninitialized pointer, so you can't write through it.

You will learn to despise the Segmentation Fault error...
It's usually raised when you try to access memory that is not yours. A common cause is accessing an array index that is out of bounds.
char *l just creates a pointer to a char. What you want is a string, which in C is an array of chars. So when you try to write to whatever l is pointing to (which will probably just be a garbage address), you access memory that isn't yours, hence the segmentation fault.

You could get memory with malloc or point the pointer to an already existing variable.
char word[3];
char *l;
l = word;
Now you can do such assignments:
*l = 'c';
*(l+1) = 'g';
*(l+2) = '\0';
but now that you want to copy it to another pointer, this pointer must be pointing to another string or you should allocate memory for it.
char *pointer_to_second;
char second[3];
pointer_to_second = second;
or, if you prefer dynamic memory, replace the 3 lines above with the one below:
char *pointer_to_second = malloc(sizeof(char) * 3);
after that you can do what you wanted:
strcpy(pointer_to_second, l);
But remember: if your compiler follows C89 (ANSI C), you must declare all variables at the beginning of a block, otherwise you will get an error. C99 and later, like C++, allow declarations anywhere in a block.
A segmentation fault can happen when you try to access an element that doesn't belong to your array. For example, if you try this:
printf("The value in position 3 of my pointer is %c\n", *(l + 3));
This is undefined behaviour and may fail, because your array has 3 positions and you are trying to access the 4th one.