C - strcpy pointer - c

I want to ask about strcpy. I have a problem here. Here is my code:
char *string1 = "Sentence 1";
char *string2 = "A";
strcpy(string1, string2);
I think there is no problem in my code there. The address of the first character in string1 and string2 are sent to the function strcpy. There should be no problem in this code, right?
Can anybody please help me solve this problem, or explain it to me?
Thank you.

There is a problem -- your pointers are each pointing to memory which you cannot write to; they're pointing to constants which the compiler builds into your application.
You need to allocate space in writable memory (the stack via char string1[<size>]; for example, or the heap via char *string1 = malloc(<size>);). Be sure to replace <size> with the amount of buffer space you need, including at least one extra byte for the terminating null character. If you malloc(), be sure to free() later!
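Both options can be sketched roughly as follows. The helper names copy_into_array and copy_into_heap are hypothetical (they are not from the question), and the size checks are an added precaution rather than something strcpy does for you:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies src into a caller-provided writable array.
   Returns 0 on success, -1 if src plus its '\0' does not fit. */
int copy_into_array(char *dest, size_t dest_size, const char *src)
{
    if (strlen(src) + 1 > dest_size)
        return -1;
    strcpy(dest, src);   /* fine: dest is writable and large enough */
    return 0;
}

/* Hypothetical helper: allocates `size` bytes on the heap and copies src
   into them. Returns NULL if src does not fit or malloc fails.
   The caller must free() the result. */
char *copy_into_heap(size_t size, const char *src)
{
    char *dest;

    if (strlen(src) + 1 > size)
        return NULL;
    dest = malloc(size);
    if (dest == NULL)
        return NULL;
    strcpy(dest, src);
    return dest;
}
```

With char string1[16] = "Sentence 1"; the call copy_into_array(string1, sizeof string1, "A") succeeds, because string1 is now an array in writable memory rather than a pointer to a literal.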

This gives undefined behaviour. The compiler may allow it, due to a quirk of history (string literals aren't const), but you're basically trying to overwrite data which on many platforms you simply cannot modify.

From linux man pages:
char *strcpy(char *dest, const char *src);
The strcpy() function copies the string pointed to by src,
including the terminating null byte ('\0'), to the buffer pointed to
by dest. The strings may not overlap, and the destination string
dest must be large enough to receive the copy.
You have a problem with your dest pointer, since it points to a string literal instead of allocated, modifiable memory. Try defining string1 as char string1[BUFFER_LENGTH]; or allocate it dynamically with malloc().

Related

Do you need to free a string from 'strcpy' if you copy a string generated from malloc?

Say I have some code snippet
char *str = malloc(sizeof(char)*10);
// some code to add content to the string in some way
To create a string of 10 chars. If I then copy str with strcpy from the standard string library into a new variable like so
char *copy;
strcpy(copy, str);
I'm aware I then need to free str using free(str), but is that enough? Or does strcpy also dynamically allocate memory for copy if used on a string created from malloc?
Or does strcpy also dynamically allocate memory for copy
No, strcpy knows nothing about memory, so your code copies a string into an uninitialized pointer pointing at la-la land.
If you want allocation + copy in the same call, there is strdup for that (long available through POSIX, and added to the C standard in C23).
Alternatively just do char *copy = malloc(strlen(str)+1); and then strcpy. Keep in mind to always leave room for the null terminator.
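A minimal sketch of that malloc-then-strcpy approach, assuming a hypothetical helper named duplicate_string (the name is made up for illustration):

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a freshly allocated copy of str,
   or NULL if malloc fails. The caller must free() the result. */
char *duplicate_string(const char *str)
{
    /* +1 leaves room for the null terminator */
    char *copy = malloc(strlen(str) + 1);

    if (copy == NULL)
        return NULL;
    strcpy(copy, str);   /* copy now points to enough writable memory */
    return copy;
}
```

In this sketch both buffers are separate allocations, so the original str and the returned copy each need their own free().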
strcpy does not allocate, so your second snippet is invalid unless copy is initialized to point to a large enough buffer (whether stack- or heap-allocated) or is itself a large enough array.
Side note: if you don't know the exact length of the source string, you need to make sure the target buffer size is not exceeded (e.g. by using strncpy or providing a large enough target buffer). Note that strncpy does not null-terminate the result when it truncates.
I guess documentation should answer your question in detail.
For starters the pointer copy is not initialized
char *copy;
So this call of strcpy
strcpy(copy, str);
results in undefined behavior.
The pointer copy must point to a character array where the string pointed to by the pointer str will be copied.
You need to free what was allocated with malloc, calloc or realloc.
So if the target array pointed to by the pointer copy was dynamically allocated like for example
char *copy = malloc( 10 * sizeof( char ) );
then of course you will need to free the allocated memory when it is not required any more.
This has nothing to do with the function strcpy, which just copies a string from one character array to another character array.

Confusion about strcat with memcpy

Okay, so I have seen a few implementations of the strcat function with memcpy. I understand that it is efficient, since there is no need to allocate. But how do you prevent the contents of the source string from being overwritten by the resulting string?
For example, take:
char *str1 = "Hello";
char *str2 = "World";
str1 = strcat(str1, str2);
How do I ensure that str2 isn't overwritten with the contents of the resulting "HelloWorld" string?
Also, if strings are nothing but char arrays, and arrays are supposed to have a fixed size, then isn't it unsafe to copy more bytes into an array than it can hold without reallocating memory?
It's not about unsafe, it's undefined behavior.
First of all, you're trying to modify a string literal, which inherently invokes UB.
Secondly, regarding the size of the destination buffer, quoting the man page (emphasis mine)
The strcat() function appends the src string to the dest string, overwriting the terminating null byte ('\0') at the end of dest, and then adds a terminating null byte. The strings may not overlap, and the dest string must have enough space for the result. If dest is not large enough, program behavior is unpredictable; [...]
I understand that it is efficient, since there no need to allocate.
That's an incorrect understanding. Neither memcpy nor strcat allocates memory. Both require that you pass pointers that point to sufficient amount of valid memory. If that is not the case, the program is subject to undefined behavior.
Your posted code is subject to undefined behavior for a couple of reasons:
str1 points to a string literal, which is typically stored in a read-only portion of the program.
str1 does not point to enough memory to hold the string "HelloWorld" and the terminating null character.
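One way to avoid both problems is to build the result in a new buffer, so neither source string is written to. The following is only a sketch; concat_new is a hypothetical name and not one of the strcat implementations the question refers to:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated string holding a
   followed by b, or NULL if malloc fails. Neither input is modified.
   The caller must free() the result. */
char *concat_new(const char *a, const char *b)
{
    size_t len_a = strlen(a);
    size_t len_b = strlen(b);
    char *result = malloc(len_a + len_b + 1);   /* +1 for '\0' */

    if (result == NULL)
        return NULL;
    memcpy(result, a, len_a);                   /* a without its '\0' */
    memcpy(result + len_a, b, len_b + 1);       /* b including its '\0' */
    return result;
}
```

Because the destination is a separate allocation sized for both strings, str2 cannot be overwritten and nothing is written past the end of an array.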

Why does this simple C program crash at runtime?

I tried the following simple C program but it crashes at runtime without giving any output. What is wrong here? How can I solve this problem?
#include <stdio.h>
#include <string.h>
int main(void)
{
char *s1="segmentation";
char *s2="fault";
char *s3=strcat(s1,s2);
printf("concatanated string is %s",s3);
}
So this is the aggregated answer for this question:
You should not try to alter a string literal in any way. According to the C standard, altering string literals causes undefined behaviour:
"It is unspecified whether these arrays are distinct provided their
elements have the appropriate values. If the program attempts to
modify such an array, the behavior is undefined."
But let's say, for the sake of discussion, that s1 is not a string literal. You still need a big enough buffer for strcat to work on: strcat finds the terminating null character and starts writing the appended string over it. If your buffer is not big enough, you will write outside the boundaries of your array, again causing undefined behaviour.
Because strcat appends to its first argument,
i.e. the result will be stored in s1, not in s3.
You should allocate more memory for s1.
For example:
char* s1 = malloc(sizeof(char) * (13 + 6)); //length of your 2 strings
strcpy(s1, "segmentation");
char *s2="fault";
strcat(s1,s2);
printf("concatanated string is %s",s1);
Others are focusing on the lack of space in s1 for the concatenation. However, the bigger problem here is that you are trying to modify a string literal, which is undefined behavior. Defining s1 as a char array that has enough space should work:
char s1[20] = "segmentation";
char *s2 = "fault";
strcat(s1,s2);
printf("concatanated string is %s",s1);
char *s1="segmentation";
s1 points to an immutable string, which typically resides in read-only memory. If you look at the strcat definition:
char *strcat(char *dest, const char *src) here
dest -- This is pointer to the destination array, which should contain a C string, and should be large enough to contain the concatenated resulting string.
So when you call char *s3=strcat(s1,s2); you are trying to modify the immutable string, which results in a segmentation fault.
The most problematic thing here is that you declared s1 and s2 as char * and not as const char *. Always use const in such cases: a string initialized this way is read-only memory.
If you want to extend the string in s1, you should not initialize it as you did, but you should allocate the memory for s1 on the stack or in the dynamic memory.
Example for allocating on the stack:
char s1[100] = "segmentation";
Example for allocating in the dynamic memory:
char *s1 = malloc(100 * sizeof(char));
strcpy(s1, "segmentation");
I used here 100 as I assume that this is enough for your string. You should always allocate a number that is at least the length of your string + 1
I found a similar question on comp.lang.c, which also answers it in depth:
the main problem here is that space for the concatenated result is not
properly allocated. C does not provide an automatically-managed string
type. C compilers allocate memory only for objects explicitly
mentioned in the source code (in the case of strings, this includes
character arrays and string literals). The programmer must arrange for
sufficient space for the results of run-time operations such as string
concatenation, typically by declaring arrays, or by calling malloc.
strcat() performs no allocation; the second string is appended to the
first one, in place. The first (destination) string must be writable
and have enough room for the concatenated result. Therefore, one fix
would be to declare the first string as an array:
The original call to strcat in the question actually has two problems:
the string literal pointed to by s1, besides not being big enough for
any concatenated text, is not necessarily writable at all.
look at the definition of strcat()
char *strcat(char *dest, const char *src)
dest -- This is pointer to the destination array, which should contain a C string, and should be large enough to contain the concatenated resulting string.
src -- This is the string to be appended. This should not overlap the destination.
s1 is not large enough to hold the concatenated string, which causes a write beyond its bounds. That is what causes the run-time failure.
Try this:
char *s1="segmentation";
char *s2="fault";
char* s3 = malloc(strlen(s1) + strlen(s2) + 1); /* sizeof(s1) would only give the size of the pointer */
strcpy(s3, s1);
strcat(s3, s2);
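When the destination is a fixed-size array, the capacity can be checked before appending. A minimal sketch, assuming a hypothetical helper named append_checked; it expects dest to already contain a null-terminated string inside a writable buffer of dest_size bytes:

```c
#include <string.h>

/* Hypothetical helper: appends src to dest only if the result,
   including the terminating '\0', fits in dest_size bytes.
   Returns 0 on success, -1 if there is not enough room
   (dest is left unchanged in that case). */
int append_checked(char *dest, size_t dest_size, const char *src)
{
    size_t used = strlen(dest);
    size_t extra = strlen(src);

    if (used + extra + 1 > dest_size)
        return -1;
    memcpy(dest + used, src, extra + 1);   /* overwrites the old '\0' */
    return 0;
}
```

With char s1[20] = "segmentation"; the call append_checked(s1, sizeof s1, "fault") succeeds, since the 17 characters plus the terminator fit in 20 bytes.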

strcpy behaving differently when two pointers are assigned strings in different ways

I am sorry, I might be asking a dumb question, but I want to understand: is there any difference between the assignments below? strcpy works in the first case but not in the second case.
char *str1;
*str1 = "Hello";
char *str2 = "World";
strcpy(str1,str2); //Works as expected
char *str1 = "Hello";
char *str2 = "World";
strcpy(str1,str2); //SEGMENTATION FAULT
How does the compiler understand each assignment? Please clarify.
Edit: In the first snippet you wrote *str1 = "Hello", which is equivalent to assigning to str1[0]. That is obviously wrong, because str1 is uninitialized and therefore an invalid pointer. If we assume that you meant str1 = "Hello", then you are still wrong:
According to the C specification, attempting to modify a string literal results in undefined behavior: literals may be stored in read-only storage (such as .rodata) or combined with other string literals. So both snippets that you provided yield undefined behavior.
I can only guess that in the second snippet the compiler is storing the string in some read-only storage, while in the first one it doesn't, so it works, but it's not guaranteed.
Sorry, both examples are very wrong and lead to undefined behaviour, that might or might not crash. Let me try to explain why:
str1 is an uninitialized pointer. That means str1 points to some arbitrary place in memory, and writing through it can have arbitrary consequences: for example a crash, or overwriting some data in memory (e.g. other local variables or variables in other functions; anything is possible).
The line *str1 = "Hello"; is also wrong (even if str1 were a valid pointer), as *str1 has type char (not char *) and is the first character of whatever str1 points to. However, you assign it a pointer ("Hello", which decays to char *), which is a type error that your compiler will tell you about.
In the second snippet, str1 is a valid pointer but presumably points to read-only memory (hence the crash). Normally, string literals are stored in read-only data in the binary; you cannot write to them, but that is exactly what strcpy(str1,str2); tries to do.
A more correct example of what you want to achieve might be (with an array on the stack):
#define STR1_LEN 128
char str1[STR1_LEN] = "Hello"; /* array with space for 128 characters */
char *str2 = "World";
strncpy(str1, str2, STR1_LEN);
str1[STR1_LEN - 1] = 0; /* be sure to terminate str1 */
Other option (with dynamically managed memory):
#define STR1_LEN 128
char *str1 = malloc(STR1_LEN); /* allocate dynamic memory for str1 */
char *str2 = "World";
/* we should check here that str1 is not NULL, which would mean 'out of memory' */
strncpy(str1, str2, STR1_LEN);
str1[STR1_LEN - 1] = 0; /* be sure to terminate str1 */
free(str1); /* free the memory for str1 */
str1 = NULL;
EDIT: @chqrlie requested in the comments that the #define should be named STR1_SIZE, not STR1_LEN, presumably to reduce confusion, because it is not the length in characters of the "string" but the size of the allocated buffer. Furthermore, @chqrlie requested that I not give examples with the strncpy function. That wasn't really my choice: the OP used strcpy, which is very dangerous, so I picked the closest function that can be used correctly. But yes, I should probably have added that the use of strcpy, strncpy, and similar functions is not recommended.
There seems to be some confusion here. Both fragments invoke undefined behaviour. Let me explain why:
char *str1; defines a pointer to characters, but it is uninitialized. If this definition occurs in the body of a function, its value is invalid. If this definition occurs at the global level, it is initialized to NULL.
*str1 = "Hello"; is an error: you are assigning a string pointer to the character pointed to by str1. str1 is uninitialized, so it does not point to anything valid, and you cannot assign a pointer to a character. You should have written str1 = "Hello";. Furthermore, the string "Hello" is constant, so the definition of str1 really should be const char *str1;.
char *str2 = "World"; Here you define a pointer to a constant string "World". This statement is correct, but it would be better to define str2 as const char *str2 = "World"; for the same reason as above.
strcpy(str1,str2); //Works as expected NO it does not work at all! str1 does not point to a char array large enough to hold a copy of the string "World" including the final '\0'. Given the circumstances, this code invokes undefined behaviour, which may or may not cause a crash.
You mention that the code works as expected, but it only appears to. What really happens is this: str1 is uninitialized. If it pointed to an area of memory that cannot be written, writing to it would likely have crashed the program with a segmentation fault. But if it happens to point to an area of memory where you can write, then the statement *str1 = "Hello"; modifies the first byte of this area, and strcpy(str1, "World"); modifies the first 6 bytes at that place. The string pointed to by str1 will then be "World", as expected, but you have overwritten some area of memory that may be used for other purposes. Your program may consequently crash later in unexpected ways, a very hard bug to find! This is definitely undefined behaviour.
The second fragment invokes undefined behaviour for a different reason:
char *str1 = "Hello"; No problem, but should be const.
char *str2 = "World"; OK too, but should also be const.
strcpy(str1,str2); //SEGMENTATION FAULT of course it is invalid: you are trying to overwrite the constant character string "Hello" with the characters from the string "World". It would work if the string constant was stored in modifiable memory, and would cause even greater confusion later in the program as the value of the string constant was changed. Luckily, most modern environments prevent this by storing string constants in read-only memory. Trying to modify said memory causes a segmentation violation, i.e. you are accessing a memory segment in a way that is not permitted.
You should use strcpy() only to copy strings to character arrays you define as char buffer[SOME_SIZE]; or allocate as char *buffer = malloc(SOME_SIZE); with SOME_SIZE large enough to hold what you are trying to copy plus the final '\0'
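When it is not certain that the source fits, one possible alternative is to let snprintf do the bounded copy. This is a swapped-in technique, not what the answer above uses, and the helper name copy_truncating is hypothetical:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies src into buffer, truncating if needed,
   and always null-terminates when buffer_size > 0.
   Returns 1 if the whole string fit, 0 if it was truncated. */
int copy_truncating(char *buffer, size_t buffer_size, const char *src)
{
    int needed;

    if (buffer_size == 0)
        return 0;
    /* snprintf writes at most buffer_size - 1 characters plus '\0'
       and returns the length the full string would have had. */
    needed = snprintf(buffer, buffer_size, "%s", src);
    return needed >= 0 && (size_t)needed < buffer_size;
}
```

For char buffer[SOME_SIZE]; a call such as copy_truncating(buffer, sizeof buffer, src) never writes past the array, and the return value tells the caller whether anything was cut off.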
Both snippets are wrong, even if "it works" in your first case. Hopefully this is only an academic question! :)
First let's look at *str1 which you are trying to modify.
char *str1;
This declares an uninitialized pointer, that is, a pointer whose value is some unspecified address in memory. Here the program is simple and there is nothing important at stake, but you could have modified very critical data!
char *str = "Hello";
This declares a pointer that points to a protected section of memory that even the program itself cannot change during execution; attempting to write there is what triggers the segmentation fault.
To use strcpy(), the first parameter should point to a writable char array that is large enough, for example one dynamically allocated with malloc(). In fact, don't use strcpy(); learn to use strncpy() instead because it is safer.
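One caveat about strncpy() is worth illustrating: it does not write a terminating '\0' when the source is at least as long as the count, so the caller has to terminate the buffer. A minimal sketch, assuming a hypothetical helper named copy_with_strncpy:

```c
#include <string.h>

/* Hypothetical helper: copies at most dest_size - 1 characters from src
   and then terminates dest explicitly, because strncpy alone leaves
   dest unterminated when src is too long. */
void copy_with_strncpy(char *dest, size_t dest_size, const char *src)
{
    if (dest_size == 0)
        return;
    strncpy(dest, src, dest_size - 1);
    dest[dest_size - 1] = '\0';   /* guarantee termination */
}
```

The result may be silently truncated, so this pattern only suits cases where a shortened string is acceptable.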

string manipulations in C

Following are some basic questions that I have with respect to strings in C.
If string literals are stored in read-only data segment and cannot be changed after initialisation, then what is the difference between the following two initialisations.
char *string = "Hello world";
const char *string = "Hello world";
When we dynamically allocate memory for strings, I see that the following allocation is capable of holding a string of arbitrary length. Though this allocation works, I understand/believe that it is always good practice to allocate the actual size of the string rather than the size of the data type. Please guide me on proper usage of dynamic allocation for strings.
char *str = (char *)malloc(sizeof(char));
scanf("%s",str);
printf("%s\n",str);
1. What is the difference between the following two initialisations?
The difference is whether an error is caught at compile time or only at run time, as others have already said.
char *string = "Hello world"; ---> stored in the read-only data segment and
can't be changed, but if you try to change the value the compiler won't give
any error; the failure only shows up at run time.
const char *string = "Hello world"; ---> This is also stored in the read-only
data segment, but with compile-time checking because it is declared
const. If you try to change the string's contents you will get an
error at compile time, which is far better than a failure at run time.
2. Please guide on proper usage of dynamic allocation for strings.
char *str = (char *)malloc(sizeof(char));
scanf("%s",str);
printf("%s\n",str);
This code may work some of the time, but not always. The problem comes at run time, when you may get a segmentation fault because you are accessing memory that is not owned by your program. You should always be very careful with dynamic memory allocation, as mistakes lead to very dangerous errors at run time.
You should always allocate the correct amount of memory for what you need.
Most errors occur when working with strings. Always keep in mind that there is a '\0' character at the end of the string, and it is your responsibility to allocate memory for it.
Hope this helps.
what is the difference between the following two initialisations.
String literals have type array of char (not const char) for legacy reasons. It's good practice to only point to them via const char*, because modifying them is not allowed.
I see the following allocation is capable enough to hold a string of arbitrary length.
Wrong. This allocation only allocates memory for one character. If you tried to write more than one byte into string, you'd have a buffer overflow.
The proper way to dynamically allocate memory is simply:
char *string = malloc(your_desired_max_length);
Explicit cast is redundant here and sizeof(char) is 1 by definition.
Also: Remember that the string terminator (0) has to fit into the string too.
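Applied to the scanf example from the question, the buffer size and the %s field width need to agree. The sketch below uses sscanf on an in-memory string so it can be tried without typing input; read_word and MAX_WORD_LEN are hypothetical names chosen for illustration:

```c
#include <stdio.h>
#include <stdlib.h>

#define MAX_WORD_LEN 31   /* hypothetical limit chosen for this example */

/* Hypothetical helper: reads one whitespace-delimited word from input
   into a newly allocated buffer. Returns NULL if no word was found or
   malloc fails. The caller must free() the result. */
char *read_word(const char *input)
{
    char *str = malloc(MAX_WORD_LEN + 1);   /* +1 for the terminator */

    if (str == NULL)
        return NULL;
    /* The width 31 must match MAX_WORD_LEN so that sscanf stores
       at most 31 characters plus the '\0'. */
    if (sscanf(input, "%31s", str) != 1) {
        free(str);
        return NULL;
    }
    return str;
}
```

The same width limit works with scanf("%31s", str) when reading from standard input; without a width, %s writes as many characters as the user types.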
In the first case, you're explicitly casting the char* to a const one, meaning you're disallowing changes, at the compiler level, to the characters behind it. In C, it's actually undefined behaviour (at runtime) to try and modify those characters regardless of their const-ness, but a string literal is a char *, not a const char *.
In the second case, I see two problems.
The first is that you should never cast the return value from malloc since it can mask certain errors (especially on systems where pointers and integers are different sizes). Specifically, unless there is an active malloc prototype in place, a compiler may assume that it returns an int rather than the correct void *.
So, if you forget to include stdlib.h, you may experience some funny behaviour that the compiler couldn't warn you about, because you told it with an explicit cast that you knew what you were doing.
C is perfectly capable of implicitly converting between the void * returned from malloc and any other object pointer type.
The second problem is that it only allocates space for one character, which will be the terminating null for a string.
It would be better written as:
char *string = malloc (max_str_size + 1);
(and don't ever multiply by sizeof(char), that's a waste of time - it's always 1).
The difference between the two declarations is that the compiler will produce an error (which is much preferable to a runtime failure) if an attempt to modify the string literal is made via the const char* declared pointer. The following code:
const char* s = "hello"; /* 's' is a pointer to 'const char'. */
*s = 'a';
results in VC2010 emitting the following error:
error C2166: l-value specifies const object
An attempt to modify the string literal made via the char* declared pointer won't be detected until runtime (VC2010 emits no error), the behaviour of which is undefined.
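If a modified version of a literal is needed, one option is to keep the literal behind a const char * and change a writable copy instead. A minimal sketch, assuming a hypothetical helper named replace_first_char:

```c
#include <string.h>

/* Hypothetical helper: copies the read-only string s into the writable
   buffer out and replaces the first character of the copy.
   Returns 0 on success, -1 if s plus its '\0' does not fit in out. */
int replace_first_char(const char *s, char replacement,
                       char *out, size_t out_size)
{
    size_t size = strlen(s) + 1;   /* include the terminator */

    if (size > out_size)
        return -1;
    memcpy(out, s, size);
    if (out[0] != '\0')
        out[0] = replacement;      /* modifies the copy, never the literal */
    return 0;
}
```

Given const char *s = "hello"; and char out[8];, calling replace_first_char(s, 'a', out, sizeof out) leaves the literal untouched and stores "aello" in out.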
When malloc()ing memory for storing of strings you must remember to allocate one extra char for storing the null terminator as all (or nearly all) C string handling functions require the null terminator. For example, to allocate a buffer for storing "hello":
char* p = malloc(6); /* 5 for "hello" and 1 for null terminator. */
sizeof(char) is guaranteed to be 1, so multiplying by it is unnecessary, and it is not necessary to cast the return value of malloc(). When p is no longer required, remember to free() the allocated memory:
free(p);
Difference between the following two initialisations.
First, char *string = "Hello world";
- "Hello world" is stored as a constant string with static storage duration (typically in a read-only data segment, not on the stack), and its address is assigned to the pointer variable 'string'.
"Hello world" is constant. You can't do string[5]='g'; doing this is undefined behaviour and will typically cause a segmentation fault.
The 'string' variable itself, however, is not constant, and you can change its binding:
string= "Some other string"; //this is correct, no segmentation fault
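Second, const char *string = "Hello world";
Again "Hello world" is stored as a constant string, and its address is assigned to the 'string' variable.
Here string[5]='g' is rejected by the compiler, because the pointed-to characters are declared const.
That is the use of the const keyword here: the mistake is caught at compile time instead of at run time.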
Now,
char *string = (char *)malloc(sizeof(char));
The above declaration is like the first one, but this time the memory is allocated dynamically from the heap (not from a string literal), so it is writable. However, it only has room for a single char.
The code:
char *string = (char *)malloc(sizeof(char));
Will not hold a string of arbitrary length. It will allocate a single character and return a pointer to that character. Note that a pointer to a character and a pointer to what you call a string are the same thing.
To allocate space for a string you must do something like this:
char *data="Hello, world";
char *copy=(char*)malloc(strlen(data)+1);
strcpy(copy,data);
You need to tell malloc exactly how many bytes to allocate. The +1 is for the null terminator that needs to go onto the end.
As for string literals being stored in a read-only segment, this is an implementation detail, although it is pretty much always the case. Most C compilers are pretty relaxed about const'ing access to these strings, but attempting to modify them is asking for trouble, so you should always declare them const char * to avoid any issues.
That particular allocation may appear to work, as there's probably plenty of space in the program heap, but it doesn't. You can verify this by allocating two "arbitrary" strings with the proposed method and copying some long enough string to the corresponding addresses with memcpy. In the best case you will see garbage; in the worst case you will get a segmentation fault or an assertion failure from malloc or free.
