This is my simple program:
int main(int argc, const char * argv[])
{
char s [5] = "Hello";
printf("%s", s);
return 0;
}
Hello is a 5 chars length so I define a char s [5] but the output is:
Hellop\370\277_\377
And when I change the char s [5] to char s [], everything works fine. What's the problem with my code?
You're forgetting the null terminator.
C strings are expected to have one \0 byte on the end to mark the end of the string. The reason your output looks strange is because printf, looking for the null terminator, has wandered off into uninitialized memory, which triggers undefined behavior.
In this case, printf appears to somewhat luckily find a null pretty quickly, and terminate normally after printing some garbage. However, this kind of bug will often crash your program with the message segmentation fault. A "seg fault" occurs when the operating system kills your process because it's doing something it's not supposed to, like reading memory that doesn't belong to it.
Try this instead:
char s[] = "Hello"; //s is now 6 characters long.
By not providing a number, the compiler decides how big your array needs to be and copies the "Hello" data into it.
If you need a string you don't want to change, you should declare them this way instead:
const char* s = "Hello";
This way, you've created a pointer that points to static memory, containing the string "Hello", no copy needed.
Related
I was just going through C library functions to see what I can do with them. When I came across the strcpy function the code I wrote resulted in a segmentation fault and I would like to know why. The code I wrote should be printing WorldWorld. If I understood correctly, strcpy(x,y) will copy the contents of y into x.
main() {
char *x = "Hello";
char *y = "World";
printf(strcpy(x,y));
}
If it worked, the code you wrote would print "World", not "WorldWorld". Nothing is appended, strcpy overwrites data only.
Your program crashes because "Hello" and "World" are string constants. It's undefined behavior to attempt to write to a constant, and in your case this manifests as a segmentation fault. You should use char x[] = "Hello"; and char y[] = "World"; instead, which reserve memory on the stack to hold the strings, where they can be overwritten.
There are more problems with your program, though:
First, you should never pass a variable string as the first argument to printf: either use puts, or use printf("%s", string). Passing a variable as a format string prevents compilers that support type-checking printf arguments from doing that verification, and it can transform into a serious vulnerability if users can control it.
Second, you should never use strcpy. Strcpy will happily overrun buffers, which is another major security vulnerability. For instance, if you wrote:
char foo[] = "foo";
strcpy(foo, "this string is waaaaaay too long");
return;
you will cause undefined behavior, your program would crash again, and you're opening the door to other serious vulnerabilities that you can avoid by specifying the size of the destination buffer.
AFAIK, there is actually no standard C function that will decently copy strings, but the least bad one would be strlcpy, which additionally requires a size argument.
I have already written a couple of C programs and consider this awkward to ask. But why do I receive a segmentation fault for the following code being supposed to replace "test" by "aaaa"?
#include <stdio.h>
int main(int argc, char* argv[])
{
char* s = "test\0";
printf("old: %s \n", s);
int x = 0;
while(s[x] != 0)
{
s[x++] = 'a'; // segmentation fault here
}
printf("new: %s \n", s); // expecting aaaa
return 0;
}
This assignment is writing to a string literal, which is stored in a read-only section of your executable when it is loaded in memory.
Also, note that the \0 in the literal is redundant.
One way to fix this (as suggested in comments) without copying the string: declare your variable as an array:
char s[] = "test";
This will cause the function to allocate at least 5 bytes of space for the string on the stack, which is normally writeable memory.
Also, you should generally declare a pointer to a string literal as const char*. This will cause the compiler to complain if you try to write to it, which is good, since the system loader will often mark the memory it points to as read-only.
Answering the OP's question in the comment posted to #antron.
What you need is to allocate a character array, then use strcpy() to initialize it with your text, then overwrite with a-s.
Allocation can be done statically (i.e., char s[10];) but make sure the space is enough to store the length of your init string (including the terminating \0).
Alternatively, you can dynamically allocate memory using malloc() and free it using free(). This enables you to allocate exactly enough space to hold your init string (figure it out in run-time using strlen()).
#include<string.h>
#include<stdio.h>
void main()
{
char *str1="hello";
char *str2="world";
strcat(str2,str1);
printf("%s",str2);
}
If I run this program, I'm getting run time program termination.
Please help me.
If I use this:
char str1[]="hello";
char str2[]="world";
then it is working!
But why
char *str1="hello";
char *str2="world";
this code is not working????
You are learning from a bad book. The main function should be declared as
int main (void);
Declaring it as void invokes undefined behaviour when the application finishes. Well, it doesn't finish yet, but eventually it will.
Get a book about the C language. You would find that
char *srt1="hello";
is compiled as if you wrote
static const char secret_array [6] = { 'h', 'e', 'l', 'l', 'o', 0 };
char* srt1 = (char*) &secret_array [0];
while
char srt1[]="hello";
is compiled as if you wrote
char srt1 [6] = { 'h', 'e', 'l', 'l', 'o', 0 };
Both strcat calls are severe bugs, because the destination of the strcat call doesn't have enough memory to contain the result. The first call is also a bug because you try to modify constant memory. In the first case, the bug leads to a crash, which is a good thing and lucky for you. In the second case the bug isn't immediately detected. Which is bad luck. You can bet that if you use code like this in a program that is shipped to a customer, it will crash if you are lucky, and lead to incorrect results that will cost your customer lots of money and get you sued otherwise.
In your code.
char *srt1="hello";
you created a pointer and pointed it at a constant string. The compiler puts that in a part of memory that is marked as read-only.
So strcat will try to modifying that which cause undefined behaviour.
it has no name and has static storage duration (meaning that it lives for the entire life of the program); and a variable of type pointer-to-char, called p, which is initialized with the location of the first character in that unnamed, read-only array.
see my answer for proper understanding why it working for first and not for second case.
String literals are read-only, you cannot change them.
This:
char *srt2="world";
means srt2 (bad name, btw) is a pointer variable, pointing at memory containing the constant data "world" (and a terminating '\0' character). There's no additional room after the six characters, and you cannot even change the letters.
You need:
char str2[32] = "world";
This, on the other hand, makes str2 be an array of 32 characters, where the first 6 characters are initialized to "world" and a terminating '\0'. It's fine to append to this, since new characters can fit into the existing array, as long as you don't overstep and try to store more than 32 characters (including the terminator).
char *srt1="hello";
char *srt2="world";
when you declare string like this, it is stored in Read-only Memory!
You can't change the variables stored in Read-only memory!
When you do strcat it is trying to modify the string, which is present in read only memory. So it is not allowed here! It is "Undefined".
In your program, str1 and str2 are not supposed to be written to because they're declared as pointer to read-only memory. To see this, do 'objdump -s a.out' and you'll see the following:
Contents of section .rodata:
400640 01000200 68656c6c 6f00776f 726c6400 ....hello.world.
400650 257300 %s.
strcat tries to write to that part of memory, thus causing a segmentation fault.
You can also use an Array. Here is a simple program of mine where I faced a problem like this, but in this problem you have to declare the size of array. Like,
mahe = [10].
#include <stdio.h>
#include <stdlib.h>
int main()
{
char cname[10] = "mahe";
strcat(cname, "Karim");
printf("%s\n", cname);
return 0;
}
I was making a basic program of strings and did this. There is a string in this way:
#include<stdio.h>
int main()
{
char str[7]="network";
printf("%s",str);
return 0;
}
It prints network.In my view, it should not print network. Some garbage value should be printed because '\0' does not end this character array. So how it got printed? There were no warning or errors too.
That's because
char str[7]="network";
is the same as
char str[7]={'n','e','t','w','o','r','k'};
str is a valid char array, but not a string, because it's no null-terminated. So it's undefined behavior to use %s to print it.
Reference: C FAQ: Is char a[3] = "abc"; legal? What does it mean?
char str[7]="network";
This Invokes Undefined behavior.
You did not declared array with enough space
This should be
char str[8]="network";
char str[7]="network";
you did not provide enough space for the string
char str[8]="network";
It's possible that stack pages start off completely zeroed in your system, so the string is actually null-terminated in memory, but not thanks to your code.
Try looking at the program in memory using a debugger, reading your platform documentation or printing out the contents of str[7] to get some clues. Doing so invokes undefined behavior but it's irrelevant when you're trying to figure out what your specific compiler and OS are doing at one given point in time.
I came upon this weird behaviour when dealing with C strings. This is an exercise from the K&R book where I was supposed to write a function that appends one string onto the end of another string. This obviously requires the destination string to have enough memory allocated so that the source string fits. Here is the code:
/* strcat: Copies contents of source at the end of dest */
char *strcat(char *dest, const char* source) {
char *d = dest;
// Move to the end of dest
while (*dest != '\0') {
dest++;
} // *dest is now '\0'
while (*source != '\0') {
*dest++ = *source++;
}
*dest = '\0';
return d;
}
During testing I wrote the following, expecting a segfault to happen while the program is running:
int main() {
char s1[] = "hello";
char s2[] = "eheheheheheh";
printf("%s\n", strcat(s1, s2));
}
As far as I understand s1 gets an array of 6 chars allocated and s2 an array of 13 chars. I thought that when strcat tries to write to s1 at indexes higher than 6 the program would segfault. Instead everything works fine, but the program doesn't exit cleanly, instead it does:
helloeheheheheheh
zsh: abort ./a.out
and exits with code 134, which I think just means abort.
Why am I not getting a segfault (or overwriting s2 if the strings are allocated on the stack)? Where are these strings in memory (the stack, or the heap)?
Thanks for your help.
I thought that when strcat tries to write to s1 at indexes higher than 6 the program would segfault.
Writing outside the bounds of memory you have allocated on the stack is undefined behaviour. Invoking this undefined behaviour usually (but not always) results in a segfault. However, you can't be sure that a segfault will happen.
The wikipedia link explains it quite nicely:
When an instance of undefined behavior occurs, so far as the language specification is concerned anything could happen, maybe nothing at all.
So, in this case, you could get a segfault, the program could abort, or sometimes it could just run fine. Or, anything. There is no way of guaranteeing the result.
Where are these strings in memory (the stack, or the heap)?
Since you've declared them as char [] inside main(), they are arrays that have automatic storage, which for practical purposes means they're on the stack.
Edit 1:
I'm going to try and explain how you might go about discovering the answer for yourself. I'm not sure what actually happens as this is not defined behavior (as others have stated), but you can do some simple debugging to figure out what your compiler is actually doing.
Original Answer
My guess would be that they are both on the stack. You can check this by modifying your code with:
int main() {
char c1 = 'X';
char s1[] = "hello";
char s2[] = "eheheheheheh";
char c2 = '3';
printf("%s\n", strcat(s1, s2));
}
c1 and c2 are going to be on the stack. Knowing that you can check if s1 and s2 are as well.
If the address of c1 is less than s1 and the address of s1 is less than c2 then it is on the stack. Otherwise it is probably in your .bss section (which would be the smart thing to do but would break recursion).
The reason I'm banking on the strings being on the stack is that if you are modifying them in the function, and that function calls itself, then the second call would not have its own copy of the strings and hence would not be valid... However, the compiler still knows that this function isn't recursive and can put the strings in the .bss so I could be wrong.
Assuming my guess that it is on the stack is right, in your code
int main() {
char s1[] = "hello";
char s2[] = "eheheheheheh";
printf("%s\n", strcat(s1, s2));
}
"hello" (with the null terminator) is pushed onto the stack, followed by "eheheheheheh" (with the null terminator).
They are both located one after the other (thanks to plain luck of the order in which you wrote them) forming a single memory block that you can write to (but shouldn't!)... That's why there is no seg fault, you can see this by breaking before printf and looking at the addresses.
s2 == (uintptr_t)s1 + (strlen(s1) + 1) should be true if I'm right.
Modifying your code with
int main() {
char s1[] = "hello";
char c = '3';
char s2[] = "eheheheheheh";
printf("%s\n", strcat(s1, s2));
}
Should see c overwritten if I'm right...
However, if I'm wrong and it is in the .bss section then they could still be adjacent and you would be overwriting them without a seg fault.
If you really want to know, disassemble it:
Unfortunately I only know how to do it on Linux. Try using the nm <binary> > <text file>.txt command or objdump -t <your_binary> > <text file>.sym command to dump all the symbols from your program. The commands should also give you the section in which each symbol resides.
Search the file for the s1 and s2 symbols, if you don't find them it should mean that they are on the stack but we will check that in the next step.
Use the objdump -S your_binary > text_file.S command (make sure you built your binary with debug symbols) and then open the .S file in a text editor.
Again search for the s1 and s2 symbols, (hopefully there aren't any others, I suspect not but I'm not sure).
If you find their definitions followed by a push or sub %esp command, then they are on the stack. If you're unsure about what their definitions mean, post it back here and let us have a look.
There's no seg fault or even an overwrite because it can use the memory of the second string and still function. Even give the correct answer. The abort is a sign that the program realized something was wrong. Try reversing the order in which you declare the strings and try again. It probably won't be as pleasant.
int main() {
char s1[] = "hello";
char s2[] = "eheheheheheh";
printf("%s\n", strcat(s1, s2));
}
instead use:
int main() {
char s1[20] = "hello";
char s2[] = "eheheheheheh";
printf("%s\n", strcat(s1, s2));
}
Here is the reason why your program didn't crash:
Your strings are declared as array (s1[] and s2[]). So they're on the stack. And just so happens that memory for s2[] is right after s1[]. So when strcat() is called, all it does is moving each character in s2[] one byte forward. Stack as stack is readable and writable. So there is no restriction what you'e doing.
But I believe the compiler is free to locate s1[] and s2[] where it see fits so this is just a happy accident.
Now to get your program to crash is relatively easy
Swap s1 and s2 in your call: instead of strcat(s1, s2), do strcat(s2, s1). This should cause stack smashing exception.
Change s1[] and s2[] to *s1 and *s2. This should cause segfault when you're writing to readonly segment.
hmm.... the strings are in stack all right since heap is used only for dynamic allocation of memory and stuff..
segfault is for invalid memory access, but with this array you are just writing stuff which is going out of bound (outside the boundry) for the array , so while writing i dont think you will have a issue .... Since in C its actually left to the programer to ensure things are kept in bound for arrays.
Also while reading if you use pointers - I dont think there will be a issue either since you can just continue to read till where ever you want and using the sum of previous lengths. But if you use functions that are mentioned in string.h they relay on the presence of the null character "\0" to decide where to halt the operation -- hence i think your function worked !!
but the termination could also indicate that any other variable / something that might have been present next to the location of the strings might have got over written with char value .... accessing those might have caused the program to exit !!
hope this helps .... good question by the way !