Why doesn't the strcat implementation cause a segmentation fault?

I'm trying to learn C and I tried the exercise in the book "The C Programming Language" in which I implemented the strcat() function as below:
char *my_strcat(char *s, const char *t)
{
char *dest = s;
while (*s) s++;
while (*s++ = *t++);
return dest;
}
And I'm calling this like:
int main(int argc, char const *argv[])
{
char x[] = "Hello, ";
char y[] = "World!\n";
my_strcat(x, y);
puts(x);
return 0;
}
My problem is that I don't fully understand my own implementation.
By the time the line while (*s) s++; completes, s holds the address of the \0 that is the last element of array x.
Then, in the line while (*s++ = *t++);, the first character of t overwrites that \0, and every later write lands in memory outside array x. How is it allowed to write the content at the location of t to the location pointed by s when it was not part of the storage I had requested when I initialized x?
If I called my_strcat like below, I get a segmentation fault:
int main(int argc, char const *argv[])
{
char *x = "Hello, ";
char *y = "World!\n";
my_strcat(x, y);
puts(x);
return 0;
}
which kind of makes sense. My understanding is that char *x = "foo" and char x[] = "foo" are the same, except that in the latter case the storage allocated to x is fixed while in the former it is not. So I feel like the segmentation fault should happen in the latter case rather than the former?
Thank you for clarification.

This is undefined behavior. Anything can happen.
The real, practical reason the first program appears to work is that x and y are arrays stored on the stack, which is writable. The extra characters overwrite whatever sits next to x on the stack; that is still undefined behavior, but it "just works", which is unfortunate.
The second program doesn't "work" because string literals are typically stored in read-only memory. Any attempt to write to them is undefined behavior and, on most systems, a segfault.
Your implementation of strcat is valid; you just need to allocate adequate space for the string that you're appending to.
So, to recap:
How is it allowed to write the content at the location of t to the location pointed by s when it was not part of the storage I had requested when I initialized x?
It's not. This is undefined behavior that just happens to "work".

Related

What is the issue with this C code which has an error?

This code is meant to copy one string to another through pointers.
The error is: Segmentation fault (core dumped)
#include<stdio.h>
char strcp(char *,char *);
int main()
{
char *p="string",*q;
printf("%s",p);
strcp(p,q);
printf("%s",q);
return 0;
}
char strcp(char *p,char *q)
{
int i;
for(i=0;*(p+i)!='\0';i++)
*(p+i)=*(q+i);
}
char *p="string"...
strcp(p,q);
What p points to is a literal and literals are read-only. Trying to copy anything to it is forbidden (and causes a segmentation fault).
...and q is not initialized, another possible cause of the seg fault.
The problem with this algorithm is an implicit assumption it makes about pointers: char *q is not a string, it is a pointer to character. It can be treated as a string if you allocate space and place a null-terminated sequence of characters into it, but your code does not do it.
You can allocate space to q with malloc, like this:
char *p="string";
char *q=malloc(strlen(p)+1);
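In addition, your version of strcpy tests for the null terminator through the wrong pointer, does not null-terminate the copied string, and is declared to return char without returning anything:
void strcp(char *p, const char *q) // p is the destination, q the source; nothing is returned, so declare it void
{
int i;
for(i=0;*(q+i)!='\0';i++) // <<== Fix this: test the source, not the destination
*(p+i)=*(q+i);
*(p+i) = '\0'; // <<== Add this line
}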
As the other answers have indicated, the problem starts* with char *p="string",*q;.
The literal "string" compiles to roughly the equivalent of:
const char foo[7] = {'s','t','r','i','n','g','\0'};
As you can see further down in your code, you're trying to copy data into what is effectively a const array, which is not allowed.
But this is C: right there in the initialization, the array was implicitly converted to a plain char *p. (Strictly speaking, a string literal in C has type char[], not const char[], but modifying it is still undefined behavior.)
C is not type safe because it's tightly coupled to the actual instructions on the hardware, where types no longer exist, just widths. But that is another topic.
*It's not the only flaw. I tossed in a few explanatory wiki links because the question shows you're a novice programmer. Keep up the work.

Bus error in my own strcat using pointer

I'm doing a pointer version of the strcat function, and this is my code:
void strcat(char *s, char *t);
int main(void) {
char *s = "Hello ";
char *t = "world\n";
strcat(s, t);
return 0;
}
void strcat(char *s, char *t) {
while (*s)
s++;
while ((*s++ = *t++))
;
}
This seems straightforward enough, but it gives bus error when running on my Mac OS 10.10.3. I can't see why...
In your code
char *s = "Hello ";
s points to a string literal, which typically resides in read-only memory. So the problem is twofold:
1. Trying to alter the content of a string literal invokes undefined behaviour.
2. (Even ignoring point 1) the destination does not have enough memory to hold the concatenated result. That is a memory overrun, which is again undefined behaviour.
You should instead use an array (which resides in read-write memory) long enough to hold the final string, so there is no memory overrun.
Suggestion: Don't give user-defined functions the same names as library functions. Use some other name, e.g., my_strcat().
Pseudocode:
#define MAXVAL 512
char s[MAXVAL] = "Hello";
char *t = "world\n"; //are you sure you want the \n at the end?
and then
my_strcat(s, t);
You are appending the text of t after the end of the string that s points to:
char *s = "Hello ";
char *t = "world\n";
But writing through s is undefined behavior, because the compiler may put that text in constant memory; your program crashes, so in your case it evidently did.
You should reserve enough memory for s to point to, either by using malloc or by declaring it as an array:
char s[32] = "Hello ";

char* leads to segfault but char[] doesn't [duplicate]

I think I know the answer to my own question but I would like to have confirmation that I understand this perfectly.
I wrote a function that returns a string. I pass a char* as a parameter, and the function modifies the pointer.
It works fine and here is the code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void get_file_name(char* file_name_out)
{
char file_name[12+1];
char dir_name[50+12+1];
strcpy(file_name, "name.xml");
strcpy(dir_name, "/home/user/foo/bar/");
strcat(dir_name, file_name);
strcpy(file_name_out, dir_name); // Clarity - equivalent to a return
}
int main()
{
char file_name[100];
get_file_name(file_name);
printf(file_name);
return 0;
}
But if I replace char file_name[100]; by char *filename; or char *filename = "";, I get a segmentation fault in strcpy().
I am not sure why ?
My function takes a char* as a parameter and so does strcpy().
As far as I understand, char *filename = ""; creates a read-only string. strcpy() is then trying to write into a read-only variable, which is not allowed so the error makes sense.
But what happens when I write char *filename; ? My guess is that enough space to fit a pointer to a char is allocated on the stack, so I could write only one single character where my file_name_out points. A call to strcpy() would try to write at least 2, hence the error.
It would explain why the following code compiles and yields the expected output:
void foo(char* a, char* b)
{
*a = *b;
}
int main()
{
char a = 'A', b = 'B';
printf("a = %c, b = %c\n", a, b);
foo(&a, &b);
printf("a = %c, b = %c\n", a, b);
return 0;
}
On the other hand, if I use char file_name[100];, I allocate enough room on the stack for 100 characters, so strcpy() can happily write into file_name_out.
Am I right ?
As far as I understand, char *filename = ""; creates a read-only string. strcpy() is then trying to write into a read-only variable, which is not allowed so the error makes sense.
Yes, that's right. It is inherently different from declaring a character array. A character pointer initialized from a string literal points at storage you must treat as read-only; attempting to change the contents of the string leads to UB.
But what happens when I write char *filename; ? My guess is that enough space to fit a pointer to a char is allocated on the stack, so I could write only one single character into my file_name_out variable.
You allocate enough space to store a pointer to a character, and that's it. You can't write to *filename, not even a single character, because you didn't allocate any space for filename to point to. Before you change anything through filename, you must first make it point somewhere valid.
I think the issue here is that
char string[100];
allocates memory to string - which you can access using string as pointer
but
char * string;
does not allocate any memory for string to point to, so you get a seg fault.
To get memory you could use
string = calloc(100,sizeof(char));
for example, but you would need to remember to free the memory at the end with
free(string);
or you would get a memory leak.
Another memory allocation route is malloc.
So in summary
char string[100];
provides roughly the same usable storage as
char * string;
string = calloc(100,sizeof(char));
...
free(string);
although strictly speaking calloc initializes all elements to zero, whereas in the string[100] declaration the array elements are indeterminate unless you use an initializer such as
char string[100] = {0};
If you use malloc instead to grab the memory, the contents are likewise indeterminate.
Another point, made by @PaulRooney, is that char string[100] allocates memory on the stack, whereas calloc uses the heap. For more information about the heap and stack, see this question and answers...
char file_name[100]; creates a contiguous array of 100 chars. file_name itself is an array, not a pointer, but in most expressions it is converted to a char* that points to the first element of the array.
char* file_name; creates a pointer. However, it is not initialized to a particular memory address, and the declaration does not allocate any memory for characters.
char *filename;
Allocates nothing beyond the pointer itself. It's just a pointer holding an indeterminate value (whatever was in that memory previously). Writing through it is undefined behavior; it will often crash because the address probably lies outside the memory your program is allowed to use.
char *filename = "";
Points to a piece of the program's data segment. As you already said, it's read-only, so attempting to change it leads to the segfault.
In your final example you are dealing with single char values, not strings of char values and your function foo treats them as such. So there is no issue with the length of buffers the char* values point to.

string handling in c using pointers

I have written the following program (it is given as an example in one of the best text books). When I compile it in my Ubuntu machine or at http://www.compileonline.com/compile_c_online.php, I get "segmentation fault"
The problem is with while( *p++ = *str2++)
I feel it is a perfectly legal program. Experts, please explain about this error.
PS: I searched the forum, but I found no convincing answer. Some people even answered wrong, stating that *(unary) has higher precedence than ++ (postfix).
#include <stdio.h>
#include <string.h>
int main()
{
char *str1= "Overflow";
char *str2= "Stack";
char *p = str1;
while(*p)
++p;
while( *p++ = *str2++)
;
printf("%s",str1);
return 0;
}
Thanks
str1 and str2 point to string literals. You aren't allowed to modify those. Even if you could, there isn't enough memory allocated for the string to hold the characters you're trying to append. Instead, initialize a sufficiently large char array from a string literal:
char str1[14] = "Overflow";
I feel it is a perfectly legal program.
Unfortunately, it is not. You have multiple, severe bugs.
First of all, you are creating pointers to string literals, char *str1= "Overflow";, and then you try to modify that memory. String literals are typically stored in read-only memory, and attempting to write to them results in undefined behavior (anything can happen).
Then you have while(*p) ++p; which looks for the end of the string, to find out where to append the next one. Even if you rewrite the pointers to string literals into arrays, you don't have enough free memory at that location. You must allocate enough memory to hold both "Overflow" and "Stack", together with the string null termination.
You should change your program to something like this (not tested):
#include <stdio.h>
int main()
{
char str1[20] = "Overflow"; // allocate an array with enough memory to hold everything
char str2[] = "Stack"; // allocate just enough to hold the string "Stack".
char *p1 = str1;
char *p2 = str2;
while(*p1)
++p1;
while(*p1++ = *p2++)
;
printf("%s",str1); // should print "OverflowStack"
return 0;
}
Or of course, you could just #include <string.h> and then strcat(str1, str2).
You get the SIGSEGV because you are writing past the end of the string "Overflow" that str1 points to.
str1 does not have enough memory allocated to accommodate anything beyond "Overflow".

How char[] and char* are different in this case?

When we run this piece of code, it works normally and prints string constant on the screen:
char *someFun(){
char *temp = "string constant";
return temp;
}
int main(){
puts(someFun());
}
But when we run the following similar code, it won't work and print some garbage on screen:
char *someFun1(){
char temp[ ] = "string";
return temp;
}
int main(){
puts(someFun1());
}
What is the reason behind it? Essentially, both functions do similar things (i.e. return a "string"), but still they behave differently. Why is that?
char *temp = "string constant";
The literal "string constant" resides in a read-only segment and has static storage duration: it exists until the program terminates. So a pointer to it remains valid after the function returns.
char temp[ ] = "string";
Here "string" is copied into temp, which resides on the stack. When the function returns, the stack is unwound and the variables in the function's scope cease to exist. You are returning a pointer to an array that no longer exists, hence the garbage. Sometimes you may still see the correct result, but you should not rely on it.
In the first case, the pointer temp points to a global constant storing "string constant". Therefore, the pointer is still valid when you return it.
In the second case, "string" is just a char array on the stack, which dies when you return from the function.
