Memory confusion for strncpy in C - c

This week one problem was discussed by my colleague regarding memory:
Sample code 1:
int main()
{
#define Str "This is String."
char dest[1];
char buff[10];
strncpy(dest, Str, sizeof(Str));
printf("Dest: %s\n", dest);
printf("Buff: %s\n", buff);
}
Output:
Dest: This is String.
Buff: his is String.
Sample Code 2:
int main()
{
#define Str "This is String."
char dest[1];
//char buff[10];
strncpy(dest, Str, sizeof(Str));
printf("Dest: %s\n", dest);
//printf("Buff: %s\n", buff);
}
Output:
Dest: This is String.
*** stack smashing detected ***: ./test terminated
Aborted (core dumped)
I don't understand why I get that output in case 1, since buff is not even used in strncpy. If I comment out the variable buff, I get "stack smashing detected", but dest is still printed.
Also, why is buff printed as "his is String."?

This is an interesting problem that we all wish to understand at some point or other. The problem that occurs here is known as a "buffer overflow". Its side effects can vary from system to system (this is also referred to as undefined behavior). Just to explain what might be happening in your case, let's assume the variables in your program are laid out in memory in this order: dest (1 byte), buff (10 bytes), then the stack guard and the return address.
Note that this representation is just for understanding and doesn't show the actual representation for any architecture.
After the strncpy call executes, dest holds 'T', and the remaining bytes ("his is String." plus the terminating null character) spill into buff and past its end.
Now when you print buff, you can see that the start address of buff has 'h' in it. printf keeps printing until it finds a null character, which is past the buff memory region. Hence you get "his is String." when you print buff.
However, note that Program 1 doesn't generate a stack smashing error here. Stack guard behavior is system/implementation dependent, so on another system Program 1 may also crash (you can test this by making Str a longer string).
In the case of Program 2, strncpy goes past the stack guard, overwriting the return address from main, and hence you get a crash.
Hope this helps.
P.S. All above description is for understanding and doesn't show any actual system representation.

The C Standard specifies strncpy this way:
7.24.2.4 The strncpy function
Synopsis
#include <string.h>
char *strncpy(char * restrict s1,
const char * restrict s2,
size_t n);
Description
The strncpy function copies not more than n characters (characters that follow a null character are not copied) from the array pointed to by s2 to the array pointed to by s1.
If copying takes place between objects that overlap, the behavior is undefined.
If the array pointed to by s2 is a string that is shorter than n characters, null characters are appended to the copy in the array pointed to by s1, until n characters in all have been written.
Returns
The strncpy function returns the value of s1.
These semantics are widely misunderstood: strncpy is not a safe version of strcpy, the destination array is NOT null terminated if the source string is longer than the n argument.
In your example, this n argument is larger than the size of the destination array: the behavior is undefined because characters are written beyond the end of the destination array.
You can observe this in the first example, as the buff array is positioned by the compiler just after the end of the dest array in automatic storage (aka on the stack) and is overwritten by strncpy. The compiler could use a different layout, so the observed behavior is by no means guaranteed.
My advice is to NEVER USE THIS FUNCTION. An opinion shared by other C experts such as Bruce Dawson: Stop using strncpy already!
You should favor a less error-prone function such as this one:
// Utility function: copy with truncation, return source string length
// truncation occurred if return value >= size argument
size_t bstrcpy(char *dest, size_t size, const char *src) {
    size_t i;
    /* copy the portion that fits */
    for (i = 0; i + 1 < size && src[i] != '\0'; i++) {
        dest[i] = src[i];
    }
    /* null terminate destination unless size == 0 */
    if (i < size) {
        dest[i] = '\0';
    }
    /* compute necessary length to allow truncation detection */
    while (src[i] != '\0') {
        i++;
    }
    return i;
}
You would use it this way in your example:
int main(void) {
#define Str "This is String."
char dest[12];
// the size of the destination array is passed
// after the pointer, just as for `snprintf`
bstrcpy(dest, sizeof dest, Str);
printf("Dest: %s\n", dest);
return 0;
}
Output:
Dest: This is Str

strncpy(dest, Str, sizeof(Str));
Your dest is only one byte, so here you are writing to memory that you are not supposed to, and this invokes undefined behavior. In other words, anything can happen, depending on how the compiler implements these things.
The most probable reason for buff getting written is that the compiler places buff right after dest in memory. So when you write past the boundary of dest, you are writing to buff. When you comment out buff, the overflow hits something else, which leads to the crash.
But as I said before, you may get completely different behavior if a different compiler or even different version of same compiler is used.
Summary: Never do anything that invokes undefined behavior. In strncpy you are supposed to use sizeof(dest), not sizeof(src) and allocate sufficient memory for destination so that data from source is not lost.

The location of the variables on your stack is :-
0. dest
1. buff
12. canary
16. Return address
When buff is present, it protects the canary and return address from damage.
This is undefined behavior (writing more data into dest than fits). The canary has a special random value within it, that is set up when the function starts, and is tested before executing the return instruction. This adds some form of protection to buffer overruns.
As an example of the undefined nature: without a canary, the program might have crashed with "illegal instruction # xxxxxx".
The program may have behaved normally, if the return address was separate from the variable location.
The stack will typically grow in a negative direction on most current CPUs. Also the location of dest vs buff is compiler dependent. It may have switched them round, or if (for example) you took away the second printf, the compiler may have removed the storage for dest, as it may have decided it was not correctly used.

Related

Why strncpy() is not respecting the given size_t n which is 10 in temp2?

This problem is blowing my mind...Can anyone please sort out the problem because i have already wasted hours on this.. ;(
#include <stdio.h>
#include <string.h>
int main(){
char string[] = "Iam pretty much big string.";
char temp1[50];
char temp2[10];
// strcpy() and strncpy()
strcpy(temp1, string);
printf("%s\n", temp1);
strncpy(temp2, temp1, 10);
printf("%s\n", temp2);
return 0;
}
Result
Iam pretty much big string.
Iam prettyIam pretty much big string.
Expected Result:
Iam pretty much big string.
Iam pretty
The strncpy function is respecting the 10 byte limit you're giving it.
It copies the first 10 bytes from temp1 to temp2. None of those 10 bytes is a null byte, and the size of temp2 is 10, so there are no null bytes in temp2. When you then pass temp2 to printf, it reads past the end of the array, invoking undefined behavior.
You would need to set the size given to strncpy to the array size - 1, then manually add the null byte to the end.
strncpy(temp2, temp1, sizeof(temp2)-1);
temp2[sizeof(temp2)-1] = 0;
The address of temp2 is just before the address of temp1 and because you do not copy the final 0, the printf will continue printing after the end of temp2.
As long as you do not insert the 0, the result of printf is undefined.
You invoke Undefined Behavior attempting to print temp2 as temp2 is not nul-terminated. From man strncpy:
"Warning: If there is no null byte among the first n bytes of src,
the string placed in dest will not be null-terminated." (emphasis in
original)
See also C11 Standard - 7.24.2.4 The strncpy function (specifically footnote: 308)
So temp2 is not nul-terminated.
Citation of the appropriate [strncpy] tag on Stack Overflow https://stackoverflow.com/tags/strncpy/info, which may help you to understand what happens exactly:
This function is not recommended to use for any purpose, neither in C nor C++. It was never intended to be a "safe version of strcpy" but is often misused for such purposes. It is in fact considered to be much more dangerous than strcpy, since the null termination mechanism of strncpy is not intuitive and therefore often misunderstood. This is because of the following behavior specified by ISO 9899:2011 7.24.2.4:
char *strncpy(char * restrict s1,
const char * restrict s2,
size_t n);
/--/
3 If the array pointed to by s2 is a string that is shorter than n characters, null characters
are appended to the copy in the array pointed to by s1, until n characters in all have been
written.
A very common mistake is to pass an s2 which is exactly as many characters as the n parameter, in which case s1 will not get null terminated. That is: strncpy(dst, src, strlen(src));
/* MCVE of incorrect use of strncpy */
#include <string.h>
#include <stdio.h>
int main (void)
{
const char* STR = "hello";
char buf[] = "halt and catch fire";
strncpy(buf, STR, strlen(STR));
puts(buf); // prints "helloand catch fire"
return 0;
}
Recommended practice in C is to check the buffer size in advance and then use strcpy(), alternatively memcpy().
Recommended practice in C++ is to use std::string instead.
From the manpage for strncpy():
Warning: If there is no null byte among the first n bytes of src, the string placed in dest will not be null-terminated.
Either your input is shorter than the supplied length, you add the terminating null byte yourself, or it won't be there. printf() expects the string to be properly null terminated, and thus overruns your allocated buffer.
This only goes to show that the n variants of many standard functions are by no means safe. You must read their respective man pages, and specifically look for what they do when the supplied length does not suffice.

Confusion in "strcat function in C assumes the destination string is large enough to hold contents of source string and its own."

So I read that strcat function is to be used carefully as the destination string should be large enough to hold contents of its own and source string. And it was true for the following program that I wrote:
#include <stdio.h>
#include <string.h>
int main(){
char *src, *dest;
printf("Enter Source String : ");
fgets(src, 10, stdin);
printf("Enter destination String : ");
fgets(dest, 20, stdin);
strcat(dest, src);
printf("Concatenated string is %s", dest);
return 0;
}
But not true for the one that I wrote here:
#include <stdio.h>
#include <string.h>
int main(){
char src[11] = "Hello ABC";
char dest[15] = "Hello DEFGIJK";
strcat(dest, src);
printf("concatenated string %s", dest);
getchar();
return 0;
}
This program ends up adding both without considering that destination string is not large enough. Why is it so?
The strcat function has no way of knowing exactly how long the destination buffer is, so it assumes that the buffer passed to it is large enough. If it's not, you invoke undefined behavior by writing past the end of the buffer. That's what's happening in the second piece of code.
The first piece of code is also invalid because both src and dest are uninitialized pointers. When you pass them to fgets, it reads whatever garbage value they contain, treats it as a valid address, then tries to write values to that invalid address. This is also undefined behavior.
One of the things that makes C fast is that it doesn't check to make sure you follow the rules. It just tells you the rules and assumes that you follow them, and if you don't bad things may or may not happen. In your particular case it appeared to work but there's no guarantee of that.
For example, when I ran your second piece of code it also appeared to work. But if I changed it to this:
#include <stdio.h>
#include <string.h>
int main(){
char dest[15] = "Hello DEFGIJK";
strcat(dest, "Hello ABC XXXXXXXXXX");
printf("concatenated string %s", dest);
return 0;
}
The program crashes.
I think your confusion is not actually about the definition of strcat. Your real confusion is that you assumed that the C compiler would enforce all the "rules". That assumption is quite false.
Yes, the first argument to strcat must be a pointer to memory sufficient to store the concatenated result. In both of your programs, that requirement is violated. You may be getting the impression, from the lack of error messages in either program, that perhaps the rule isn't what you thought it was, that somehow it's valid to call strcat even when the first argument is not a pointer to enough memory. But no, that's not the case: calling strcat when there's not enough memory is definitely wrong. The fact that there were no error messages, or that one or both programs appeared to "work", proves nothing.
Here's an analogy. (You may even have had this experience when you were a child.) Suppose your mother tells you not to run across the street, because you might get hit by a car. Suppose you run across the street anyway, and do not get hit by a car. Do you conclude that your mother's advice was incorrect? Is this a valid conclusion?
In summary, what you read was correct: strcat must be used carefully. But let's rephrase that: you must be careful when calling strcat. If you're not careful, all sorts of things can go wrong, without any warning. In fact, many style guides recommend not using functions such as strcat at all, because they're so easy to misuse if you're careless. (Functions such as strcat can be used perfectly safely as long as you're careful -- but of course not all programmers are sufficiently careful.)
The strcat() function is indeed to be used carefully because it doesn't protect you from anything. If the source string isn't null-terminated, the destination string isn't null-terminated, or the destination string doesn't have enough space, strcat will still copy data. Therefore, it is easy to overwrite data you didn't mean to overwrite. It is your responsibility to make sure you have enough space. Using strncat() instead of strcat() can also give you some extra safety, provided you pass it the space remaining in the destination.
Edit Here's an example:
#include <stdio.h>
#include <string.h>
int main()
{
char s1[16] = {0};
char s2[16] = {0};
strcpy(s2, "0123456789abcdefOOPS WAY TOO LONG");
/* ^^^ purposefully copy too much data into s2 */
printf("-%s-\n",s1);
return 0;
}
I never assigned to s1, so the output should ideally be --. However, because of how the compiler happened to arrange s1 and s2 in memory, the output I actually got was -OOPS WAY TOO LONG-. The strcpy(s2,...) overwrote the contents of s1 as well.
On gcc, -Wall or -Wstringop-overflow will help you detect situations like this one, where the compiler knows the size of the source string. However, in general, the compiler can't know how big your data will be. Therefore, you have to write code that makes sure you don't copy more than you have room for.
Both snippets invoke undefined behavior - the first because src and dest are not initialized to point anywhere meaningful, and the second because you are writing past the end of the array.
C does not enforce any kind of bounds checking on array accesses - you won't get an "Index out of range" exception if you try to write past the end of an array. You may get a runtime error if you try to access past a page boundary or clobber something important like the frame pointer, but otherwise you just risk corrupting data in your program.
Yes, you are responsible for making sure the target buffer is large enough for the final string. Otherwise the results are unpredictable.
I'd like to point out what is actually happening in the 2nd program in order to illustrate the problem.
It allocates 15 bytes at the memory location starting at dest and copies 14 bytes into it (including the null terminator):
char dest[15] = "Hello DEFGIJK";
...and 11 bytes at src with 10 bytes copied into it:
char src[11] = "Hello ABC";
The strcat() call then copies 10 bytes (9 chars plus the null terminator) from src into dest, starting right after the 'K' in dest. The resulting string at dest will be 23 bytes long including the null terminator. The problem is, you allocated only 15 bytes at dest, and the memory adjacent to that memory will be overwritten, i.e. corrupted, leading to program instability, wrong results, data corruption, etc.
Note that the strcat() function knows nothing about the amount of memory you've allocated at dest (or src, for that matter). It is up to you to make sure you've allocated enough memory at dest to prevent memory corruption.
By the way, the first program doesn't allocate memory at dest or src at all, so your calls to fgets() are corrupting memory starting at those locations.

The necessity to memset with '\0', in a toy example

I encountered the following example of using memset in tutorialspoint:
#include <stdio.h>
#include <string.h>
int main(){
char src[40];
char dest[100];
memset(dest, '\0', sizeof(dest));
strcpy(src, "This is tutorialspoint.com");
strcpy(dest, src);
printf("Final copied string : %s\n", dest);
return(0);
}
I don't get why the memset line is used, as the compile and result are the same when that line is commented. I would like to ask is that line necessary? or is it a good practice to do so when doing strcpy()? or it is just one random line.
Thanks!
It's not needed in this case, in the sense that it has no effect on the output. It might be needed in some similar cases.
char dest[100];
This defines dest as a local array of 100 chars. Its initial value is garbage. It could have been written as:
char dest[100] = "";
or
char dest[100] = { 0 };
but none of those are necessary because dest is assigned a value before it's used.
strcpy(src, "This is tutorialspoint.com");
strcpy(dest, src);
This copies the string contained in src into the array dest. It copies the 26 characters of "This is tutorialspoint.com" plus 1 additional character, the terminating '\0' that marks the end of the string. The previous contents of the dest array are ignored. (If we were using strcat(), it would matter, because strcat() has to find a '\0' in the destination before it can start copying.)
Without the memset() call, the remaining 73 bytes of dest would be garbage -- but that wouldn't matter, because we never look at anything past the '\0' at dest[26].
If, for some reason, we decided to add something like:
printf("dest[99] = '%c'\n", dest[99]);
to the program, then the memset() would matter. But since the purpose of dest is to hold a string (which is by definition terminated by a '\0' null character), that wouldn't be a sensible thing to do. Perfectly legal, but not sensible.
The posted code could skip the initialization via memset().
One time it really becomes useful is when debugging, when you use the debugger to display the contents of the variable.
Another time to use memset() is when allocating something like an array of pointers, which might not all be set to point to something specific, like more allocated memory.
Then the unused pointers are set to NULL, so they will not cause a crash when passed to free().

Wrong strlen output

I have the following piece of code in C:
char a[55] = "hello";
size_t length = strlen(a);
char b[length];
strncpy(b,a,length);
size_t length2 = strlen(b);
printf("%d\n", length); // output = 5
printf("%d\n", length2); // output = 8
Why is this the case?
It has to be b[length + 1].
strlen does not include the null character in the end of c strings.
You never initialized b to anything. Therefore its contents are undefined. The call to strlen(b) could read beyond the size of b and cause undefined behavior (such as a crash).
b is not initialized: it contains whatever is in your RAM when the program is run.
For the first string a, the length is 5 as it should be "hello" has 5 characters.
For the second string, b you declare it as a string of 5 characters, but you don't initialise it, so it counts the characters until it finds a byte containing the 0 terminator.
UPDATE: the following line was added after I wrote the original answer.
strncpy(b,a,length);
after this addition, the problem is that you declared b of size length, while it should be length + 1 to provision space for the string terminator.
Others have already pointed out that you need to allocate strlen(a)+1 characters for b to be able to hold the whole string.
They've given you a set of parameters to use for strncpy that will (attempt to) cover up the fact that it's not really suitable for the job at hand (or almost any other, truth be told). What you really want is to just use strcpy instead. Also note, however, that as you've allocated it, b is also a local (auto storage class) variable. It's rarely useful to copy a string into a local variable.
Most of the time, if you're copying a string, you need to copy it to dynamically allocated storage -- otherwise, you might as well use the original and skip doing a copy at all. Copying a string into dynamically allocated storage is sufficiently common that many libraries already include a function (typically named strdup) for the purpose. If your library doesn't have that, it's fairly easy to write one of your own:
char *dupstr(char const *input) {
char *ret = malloc(strlen(input)+1);
if (ret)
strcpy(ret, input);
return ret;
}
[Edit: I've named this dupstr because strdup (along with anything else starting with str) is reserved for the implementation.]
Actually, the char array is not terminated by '\0', so strlen has no way to know where it should stop counting the length of the string.
Its prototype is size_t strlen(const char *s); it returns the number of chars in the string up to, but not including, the '\0' (null character).
To avoid this, we have to append a null character (b[length] = '\0'), which also requires b to have length + 1 elements.
Otherwise strlen keeps counting chars until a null character is encountered.

strcat Vs strncat - When should which function be used?

Some static code analyzer tools are suggesting that all strcat usage should be replaced with strncat for safety purpose?
In a program, if we know clearly the size of the target buffer and source buffers, is it still recommended to go for strncat?
Also, given the suggestions by static tools, should strcat be used ever?
Concatenate two strings into a single string.
Prototypes
#include <string.h>
char * strcat(char *restrict s1, const char *restrict s2);
char * strncat(char *restrict s1, const char *restrict s2, size_t n);
DESCRIPTION
The strcat() and strncat() functions append a copy of the null-terminated
string s2 to the end of the null-terminated string s1, then add a terminating '\0'. The string s1 must have sufficient space to hold the
result.
The strncat() function appends not more than n characters from s2, and
then adds a terminating '\0'.
The source and destination strings should not overlap, as the behavior is
undefined.
RETURN VALUES
The `strcat()` and `strncat()` functions return the pointer s1.
SECURITY CONSIDERATIONS
The strcat() function is easily misused in a manner which enables malicious users to arbitrarily change a running program's functionality
through a buffer overflow attack.
Avoid using strcat(). Instead, use strncat() or strlcat() and ensure
that no more characters are copied to the destination buffer than it can
hold.
Note that strncat() can also be problematic. It may be a security concern for a string to be truncated at all. Since the truncated string
will not be as long as the original, it may refer to a completely different resource and usage of the truncated resource could result in very
incorrect behavior. Example:
void
foo(const char *arbitrary_string)
{
char onstack[8] = "";
#if defined(BAD)
/*
* This first strcat is bad behavior. Do not use strcat!
*/
(void)strcat(onstack, arbitrary_string); /* BAD! */
#elif defined(BETTER)
/*
* The following two lines demonstrate better use of
* strncat().
*/
(void)strncat(onstack, arbitrary_string,
sizeof(onstack) - strlen(onstack) - 1);
#elif defined(BEST)
/*
* These lines are even more robust due to testing for
* truncation.
*/
if (strlen(arbitrary_string) + 1 >
sizeof(onstack) - strlen(onstack))
err(1, "onstack would be truncated");
(void)strncat(onstack, arbitrary_string,
sizeof(onstack) - strlen(onstack) - 1);
#endif
}
Example
char dest[20] = "Hello";
char *src = ", World!";
char numbers[] = "12345678";
printf("dest before strcat: \"%s\"\n", dest); // "Hello"
strcat(dest, src);
printf("dest after strcat: \"%s\"\n", dest); // "Hello, World!"
strncat(dest, numbers, 3); // strcat first 3 chars of numbers
printf("dest after strncat: \"%s\"\n", dest); // "Hello, World!123"
If you are absolutely sure about the source buffer's size and that the source buffer contains a null character terminating the string, then you can safely use strcat when the destination buffer is large enough.
I still recommend using strncat and giving it the size of the destination buffer - length of the destination string - 1.
Note: I edited this since comments noted that my previous answer was horribly wrong.
They don't do the same thing so they can't be substituted for one another. Both have different data models.
A string for strcat is a null-terminated string for which you (as the programmer) guarantee that it has enough space.
A string for strncat is a sequence of char that is either terminated at the length you are indicating or by a null terminator if it is supposed to be shorter than that length.
So the use of these functions just depends on the assumptions that you may (or want to) do about your data.
Static tools are generally poor at understanding the circumstances around the use of a function. I bet most of them just warn for every strcat encountered instead of actually looking whether the data passed to the function is deterministic or not. As already mentioned, if you have control over your input data neither function is unsafe.
Though note that strncat() is slightly slower, as it has to check against '\0' termination and a counter, and also explicitly add it to the end. strcat() on the other hand just checks for '\0', and it adds the trailing '\0' to the new string by copying the terminator from the second string along with all the data.
It's very simple: strcat is used to concatenate two strings. For example:
a = "data"
b = "structures"
If you perform
strcat(a, b)
then
a = "datastructures"
(provided a has room for the result).
But if you want to concatenate only a specific number of characters, you can use strncat.
For example, if you want to append only the first two letters of b to a, you write
strncat(a, b, 2)
This appends just the first two characters of b to a, and a becomes
a = "datast"

Resources