Is it bad practice to use memcpy over strncpy or similar? - c

When moving strings between buffers, I frequently treat char* as data buffers, and use none of the string buffer (strncpy, strncat, etc).
Example:
memcpy(target, src, strlen(src) + 1) // after buffer size checks
//vs
strncpy(target, src, target_size - 1);
target[target_size - 1] = 0;
Is this a bad practice?
EDIT: I know the difference, this is not a duplicate, but rather a question of standard practice.

Use of
memcpy(target, src, strlen(src) + 1) // after buffer size checks
potentially involves traversing the string twice -- once in strlen and once in memcpy. You take a small performance hit that you don't if you use strcpy.
If you compute the length of the string for unrelated reasons or have the length of the string from other resources, it's unclear to me whether memcpy is better or worse than strncpy.
If you don't the compute the length of the string for other reasons or don't have the length of the string from other resources, it is better to use strcpy instead of memcpy or strncpy.

If you know the semantics and sizes, it's not an issue. Remember that strncpy will stop at the first null and also fill the following bytes in the destination up to n characters with null bytes if the string is shorter (it will not write a null byte if there wasn't one in the source). strncpy can also give you a little bit of better typechecking, as it expects a char * in all cases, rather than a void *.
Which is more efficient is up for debate, but based on CPU bulk instructions which can copy an entire block of memory in one instruction, memcpy is probably faster, as strncpy would check each copied byte for a NUL character.

Related

Isn't strncpy better than strcpy? [duplicate]

This question already has answers here:
Why should you use strncpy instead of strcpy?
(10 answers)
Closed 5 years ago.
I know there are a lot of answers on the web which say one is better than the other but I have come to this understanding of strncpy() is better in terms of preventing unnecessary crashes in the system because of the missing '\0'. As many people put it, it is quite defensive.
I have already gone through this link: Why should you use strncpy instead of strcpy?
Let's say:
char dest[5];
char src[7] = "Hello";
strncpy(dest, src, sizeof(dest)-1);
With this implementation (the key being sizeof(dest)-1 which always gives room for the '\0' to be appended), even if the destination has a truncated string, isn't it always safer to use strncpy() instead of strcpy()? In other words, I'm trying to understand when would one want to use strcpy() over strncpy()?
It is not better. Contrary to common belief strncpy was not created to replace strcpy or to be a safe version of it. It fails to null terminate the string when the source string is larger than the num parameter, so you have to do it manually. Like this:
strncpy(dst, src, num);
dst[num - 1] = 0;
Which requires 2 lines to achieve 1 simple goal and it is easy for developers to forget the second one.
Besides that, it always writes num characters (appends 0's if necessary) regardless of the sizes of the strings which can be a performance concern when dst is small and num is large.
Using strncat fixes the second problem of writing extra characters but still forces you to write 2 lines:
dst[0];
strncat(dst, src, num);
The function strlcpy fixes both of these problems but it is not a C standard library function so do not use it.
The right choice is snprintf which is guaranteed to null terminate the string and only writes the necessary amount of characters, even if it means truncating the source string.
snprintf(dst, size, "%s", src);
In C, strncpy has the advantage over strcpy that it does not overflow the buffer when the source is larger than the destination.
strncpy has the disadvantage that, when the source is larger than the destination, it does not terminate the buffer with a \0. So every call of strncpy must be followed with a statement terminating the buffer, such as
strncpy(dst, src, n);
dst[n-1]='\0';
User Eugene Sh points out that for defensive programming strlcpy must be used: "strlcpy() takes the full size of the destination buffer and
guarantees NUL-termination."
In C++, use std::string rather than c-style strings unless performance in a particular area is critical, in which case performance test to see if it is more than a negligible difference. You can use std::string::reserve to minimize allocations when you know the size up front.
In my opinion, 85% of bugs are caused by c-style strings. std::string is much safer and is less error prone.
In C, you won't have std::string available.
If you have to use c-style strings, then I'd prefer strncpy, because at least it asks for the size and makes you think about the size, rather than just overflowing.

strcat vs strncat for string literal

I want to append a string literal to destination. I can use strcat or strncat:
strcat(dest, "values");
Or
strncat(dest, "values", sizeof("values") - 1);
strcat has shorter code, it looks neat.
But I wonder about their runtime performance.
Is strncat slightly faster at runtime because there's no need to locate terminator?
Or maybe compilers could do optimization and so there is no difference?
First, both strcat and strncat loks for the null terminator, the difference is that strncat also check for the size of the copied data, and will copy only n bytes.
Second, since strcat does not check for the size of the copied data, and copies until it gets to a null terminator, it might (and will!!) cause a buffer overflow, overriding data that is stored in memory after the buffer you copy to.
Third, your usage of the strncat is not safer, as you limit the copy by the size of the source buffer, not the destination buffer. E.g. to use it correctly you should pass the size of the destination buffer:
strncat(dest, "values", sizeof(dest) -1 );
Fourth, if the size of the source string is bigger than than n of the destination, a null terminator will not be appended, so after the call to strncat you should add it yourself:
strncat(dest, "values", sizeof(dest) -1 );
dest[sizeof(dest) - 1] = '\0';
Last thing, since this is strncat, and it copies to wherever the destination string terminates, the size calculation is slightly more complex and is actually:
strncat(dest, "values", total_size_of_dest_buffer - strlen(dest) - 1 );
I am absolutely sure, that performance is not an issue here.
If you take a look at sources of both functions strcpy() and strncpy() (from glibc) you'll find out that both of them need to iterate over each character of src argument.
Use strcpy() as it's much easier to read and maintain, and is less error-prone.
Also, if there is anything that could optimize this code, I guess any decent compiler will handle that, as it seems to be quite common expression.
Is strncat slightly faster at runtime because there's no need to locate terminator?
char *strncat(char * restrict s1, const char * restrict s2, size_t n);
Unlikely. In general, char *strncat() needs to locate a null character in s2, if it exists before s2[n], as concatenation stop before n characters. Of course, the
null character in s2 of strcpy(s1,s2) needs to be found.
strncat(dest, "values", sizeof("values") - 1); may be slower than strcat(dest, "values"); and is not safer.
Code that does not overrun the array that may truncate:
// Assuming dest is an array
strncat(dest, "values", sizeof dest - strlen(dest) - 1);
An optimization compiler is allowed to "look-inside" well known functions like the ones below and emit code that "knows" the length/size of "values". such a compiler would certainly make equivalent performance code for the 2 below as the resultant functionality is identical - in this case.
strcat(dest, "values");
// or
strncat(dest, "values", sizeof("values") - 1);

snprintf vs. strcpy (etc.) in C

For doing string concatenation, I've been doing basic strcpy, strncpy of char* buffers. Then I learned about the snprintf and friends.
Should I stick with my strcpy, strcpy + \0 termination? Or should I just use snprintf in the future?
For most purposes I doubt the difference between using strncpy and snprintf is measurable.
If there's any formatting involved I tend to stick to only snprintf rather than mixing in strncpy as well.
I find this helps code clarity, and means you can use the following idiom to keep track of where you are in the buffer (thus avoiding creating a Shlemiel the Painter algorithm):
char sBuffer[iBufferSize];
char* pCursor = sBuffer;
pCursor += snprintf(pCursor, sizeof(sBuffer) - (pCursor - sBuffer), "some stuff\n");
for(int i = 0; i < 10; i++)
{
pCursor += snprintf(pCursor, sizeof(sBuffer) - (pCursor - sBuffer), " iter %d\n", i);
}
pCursor += snprintf(pCursor, sizeof(sBuffer) - (pCursor - sBuffer), "into a string\n");
snprintf is more robust if you want to format your string. If you only want to concatenate, use strncpy (don't use strcpy) since it's more efficient.
As others did point out already: Do not use strncpy.
strncpy will not zero terminate in case of truncation.
strncpy will zero-pad the whole buffer if string is shorter than buffer. If buffer is large, this may be a performance drain.
snprintf will (on POSIX platforms) zero-terminate. On Windows, there is only _snprintf, which will not zero-terminate, so take that into account.
Note: when using snprintf, use this form:
snprintf(buffer, sizeof(buffer), "%s", string);
instead of
snprintf(buffer, sizeof(buffer), string);
The latter is insecure and - if string depends on user input - can lead to stack smashes, etc.
sprintf has an extremely useful return value that allows for efficient appending.
Here's the idiom:
char buffer[HUGE] = {0};
char *end_of_string = &buffer[0];
end_of_string += sprintf( /* whatever */ );
end_of_string += sprintf( /* whatever */ );
end_of_string += sprintf( /* whatever */ );
You get the idea. This works because sprintf returns the number of characters it wrote to the buffer, so advancing your buffer by that many positions will leave you pointing to the '\0' at the end of what's been written so far. So when you hand the updated position to the next sprintf, it can start writing new characters right there.
Constrast with strcpy, whose return value is required to be useless. It hands you back the same argument you passed it. So appending with strcpy implies traversing the entire first string looking for the end of it. And then appending again with another strcpy call implies traversing the entire first string, followed by the 2nd string that now lives after it, looking for the '\0'. A third strcpy will re-traverse the strings that have already been written yet again. And so forth.
So for many small appends to a very large buffer, strcpy approches (O^n) where n is the number of appends. Which is terrible.
Plus, as others mentioned, they do different things. sprintf can be used to format numbers, pointer values, etc, into your buffer.
I think there is another difference between strncpy and snprintf.
Think about this:
const int N=1000000;
char arr[N];
strncpy(arr, "abce", N);
Usually, strncpy will set the rest of the destination buffer to '\0'. This will cost lots of CPU time. While when you call snprintf,
snprintf(a, N, "%s", "abce");
it will leave the buffer unchanged.
I don't know why strncpy will do that, but in this case, I will choose snprintf instead of strncpy.
All *printf functions check formatting and expand its corresponding argument, thus it is slower than a simple strcpy/strncpy, which only copy a given number of bytes from linear memory.
My rule of thumb is:
Use snprintf whenever formatting is needed.
Stick to strncpy/memcpy when only need to copy a block of linear memory.
You can use strcpy whenever you know exatcly the size of buffers you're copying. Don't use that if you don't have full control over the buffers size.
strcpy, strncpy, etc. only copies strings from one memory location to another. But, with snprint, you can do more stuff like formatting the string. Copying integers into buffer, etc.
It purely depends on your requirement which one to use. If as per your logic, strcpy & strncpy is already working for you, there is no need to jump to snprintf.
Also, remember to use strncpy for better safety as suggested by others.
The difference between strncpy and snprintf is that strncpy basically lays on you responsibility of terminating string with '\0'. It may terminate dst with '\0' but only if src is short enough.
Typical examples are:
strncpy(dst, src, n);
// if src is longer than n dst will not contain null
// terminated string at this point
dst[n - 1] = '\0';
snprintf(dst, n, "%s", src); // dst will 100% contain null terminated string

Why does strncpy not null terminate?

strncpy() supposedly protects from buffer overflows. But if it prevents an overflow without null terminating, in all likelihood a subsequent string operation is going to overflow. So to protect against this I find myself doing:
strncpy( dest, src, LEN );
dest[LEN - 1] = '\0';
man strncpy gives:
The strncpy() function is similar, except that not more than n bytes of src are copied. Thus, if there is no null byte among the first n bytes of src, the result will not be null-terminated.
Without null terminating something seemingly innocent like:
printf( "FOO: %s\n", dest );
...could crash.
Are there better, safer alternatives to strncpy()?
strncpy() is not intended to be used as a safer strcpy(), it is supposed to be used to insert one string in the middle of another.
All those "safe" string handling functions such as snprintf() and vsnprintf() are fixes that have been added in later standards to mitigate buffer overflow exploits etc.
Wikipedia mentions strncat() as an alternative to writing your own safe strncpy():
*dst = '\0';
strncat(dst, src, LEN);
EDIT
I missed that strncat() exceeds LEN characters when null terminating the string if it is longer or equal to LEN char's.
Anyway, the point of using strncat() instead of any homegrown solution such as memcpy(..., strlen(...))/whatever is that the implementation of strncat() might be target/platform optimized in the library.
Of course you need to check that dst holds at least the nullchar, so the correct use of strncat() would be something like:
if (LEN) {
*dst = '\0'; strncat(dst, src, LEN-1);
}
I also admit that strncpy() is not very useful for copying a substring into another string, if the src is shorter than n char's, the destination string will be truncated.
Originally, the 7th Edition UNIX file system (see DIR(5)) had directory entries that limited file names to 14 bytes; each entry in a directory consisted of 2 bytes for the inode number plus 14 bytes for the name, null padded to 14 characters, but not necessarily null-terminated. It's my belief that strncpy() was designed to work with those directory structures - or, at least, it works perfectly for that structure.
Consider:
A 14 character file name was not null terminated.
If the name was shorter than 14 bytes, it was null padded to full length (14 bytes).
This is exactly what would be achieved by:
strncpy(inode->d_name, filename, 14);
So, strncpy() was ideally fitted to its original niche application. It was only coincidentally about preventing overflows of null-terminated strings.
(Note that null padding up to the length 14 is not a serious overhead - if the length of the buffer is 4 KB and all you want is to safely copy 20 characters into it, then the extra 4075 nulls is serious overkill, and can easily lead to quadratic behaviour if you are repeatedly adding material to a long buffer.)
There are already open source implementations like strlcpy that do safe copying.
http://en.wikipedia.org/wiki/Strlcpy
In the references there are links to the sources.
Strncpy is safer against stack overflow attacks by the user of your program, it doesn't protect you against errors you the programmer do, such as printing a non-null-terminated string, the way you've described.
You can avoid crashing from the problem you've described by limiting the number of chars printed by printf:
char my_string[10];
//other code here
printf("%.9s",my_string); //limit the number of chars to be printed to 9
Some new alternatives are specified in ISO/IEC TR 24731 (Check https://buildsecurityin.us-cert.gov/daisy/bsi/articles/knowledge/coding/317-BSI.html for info). Most of these functions take an additional parameter that specifies the maximum length of the target variable, ensure that all strings are null-terminated, and have names that end in _s (for "safe" ?) to differentiate them from their earlier "unsafe" versions.1
Unfortunately, they're still gaining support and may not be available with your particular tool set. Later versions of Visual Studio will throw warnings if you use the old unsafe functions.
If your tools don't support the new functions, it should be fairly easy to create your own wrappers for the old functions. Here's an example:
errCode_t strncpy_safe(char *sDst, size_t lenDst,
const char *sSrc, size_t count)
{
// No NULLs allowed.
if (sDst == NULL || sSrc == NULL)
return ERR_INVALID_ARGUMENT;
// Validate buffer space.
if (count >= lenDst)
return ERR_BUFFER_OVERFLOW;
// Copy and always null-terminate
memcpy(sDst, sSrc, count);
*(sDst + count) = '\0';
return OK;
}
You can change the function to suit your needs, for example, to always copy as much of the string as possible without overflowing. In fact, the VC++ implementation can do this if you pass _TRUNCATE as the count.
1Of course, you still need to be accurate about the size of the target buffer: if you supply a 3-character buffer but tell strcpy_s() it has space for 25 chars, you're still in trouble.
Use strlcpy(), specified here: http://www.courtesan.com/todd/papers/strlcpy.html
If your libc doesn't have an implementation, then try this one:
size_t strlcpy(char* dst, const char* src, size_t bufsize)
{
size_t srclen =strlen(src);
size_t result =srclen; /* Result is always the length of the src string */
if(bufsize>0)
{
if(srclen>=bufsize)
srclen=bufsize-1;
if(srclen>0)
memcpy(dst,src,srclen);
dst[srclen]='\0';
}
return result;
}
(Written by me in 2004 - dedicated to the public domain.)
Instead of strncpy(), you could use
snprintf(buffer, BUFFER_SIZE, "%s", src);
Here's a one-liner which copies at most size-1 non-null characters from src to dest and adds a null terminator:
static inline void cpystr(char *dest, const char *src, size_t size)
{ if(size) while((*dest++ = --size ? *src++ : 0)); }
strncpy works directly with the string buffers available, if you are working directly with your memory, you MUST now buffer sizes and you could set the '\0' manually.
I believe there is no better alternative in plain C, but its not really that bad if you are as careful as you should be when playing with raw memory.
Without relying on newer extensions, I have done something like this in the past:
/* copy N "visible" chars, adding a null in the position just beyond them */
#define MSTRNCPY( dst, src, len) ( strncpy( (dst), (src), (len)), (dst)[ (len) ] = '\0')
and perhaps even:
/* pull up to size - 1 "visible" characters into a fixed size buffer of known size */
#define MFBCPY( dst, src) MSTRNCPY( (dst), (src), sizeof( dst) - 1)
Why the macros instead of newer "built-in" (?) functions? Because there used to be quite a few different unices, as well as other non-unix (non-windows) environments that I had to port to back when I was doing C on a daily basis.
I have always preferred:
memset(dest, 0, LEN);
strncpy(dest, src, LEN - 1);
to the fix it up afterwards approach, but that is really just a matter of preference.
These functions have evolved more than being designed, so there really is no "why".
You just have to learn "how". Unfortunately the linux man pages at least are
devoid of common use case examples for these functions, and I've noticed lots
of misuse in code I've reviewed. I've made some notes here:
http://www.pixelbeat.org/programming/gcc/string_buffers.html

Why should you use strncpy instead of strcpy?

Edit: I've added the source for the example.
I came across this example:
char source[MAX] = "123456789";
char source1[MAX] = "123456789";
char destination[MAX] = "abcdefg";
char destination1[MAX] = "abcdefg";
char *return_string;
int index = 5;
/* This is how strcpy works */
printf("destination is originally = '%s'\n", destination);
return_string = strcpy(destination, source);
printf("after strcpy, dest becomes '%s'\n\n", destination);
/* This is how strncpy works */
printf( "destination1 is originally = '%s'\n", destination1 );
return_string = strncpy( destination1, source1, index );
printf( "After strncpy, destination1 becomes '%s'\n", destination1 );
Which produced this output:
destination is originally = 'abcdefg'
After strcpy, destination becomes '123456789'
destination1 is originally = 'abcdefg'
After strncpy, destination1 becomes '12345fg'
Which makes me wonder why anyone would want this effect. It looks like it would be confusing. This program makes me think you could basically copy over someone's name (eg. Tom Brokaw) with Tom Bro763.
What are the advantages of using strncpy() over strcpy()?
The strncpy() function was designed with a very particular problem in mind: manipulating strings stored in the manner of original UNIX directory entries. These used a short fixed-sized array (14 bytes), and a nul-terminator was only used if the filename was shorter than the array.
That's what's behind the two oddities of strncpy():
It doesn't put a nul-terminator on the destination if it is completely filled; and
It always completely fills the destination, with nuls if necessary.
For a "safer strcpy()", you are better off using strncat() like so:
if (dest_size > 0)
{
dest[0] = '\0';
strncat(dest, source, dest_size - 1);
}
That will always nul-terminate the result, and won't copy more than necessary.
strncpy combats buffer overflow by requiring you to put a length in it. strcpy depends on a trailing \0, which may not always occur.
Secondly, why you chose to only copy 5 characters on 7 character string is beyond me, but it's producing expected behavior. It's only copying over the first n characters, where n is the third argument.
The n functions are all used as defensive coding against buffer overflows. Please use them in lieu of older functions, such as strcpy.
While I know the intent behind strncpy, it is not really a good function. Avoid both. Raymond Chen explains.
Personally, my conclusion is simply to avoid strncpy and all its friends if you are dealing with null-terminated strings. Despite the "str" in the name, these functions do not produce null-terminated strings. They convert a null-terminated string into a raw character buffer. Using them where a null-terminated string is expected as the second buffer is plain wrong. Not only do you fail to get proper null termination if the source is too long, but if the source is short you get unnecessary null padding.
See also Why is strncpy insecure?
strncpy is NOT safer than strcpy, it just trades one type of bugs with another. In C, when handling C strings, you need to know the size of your buffers, there is no way around it. strncpy was justified for the directory thing mentioned by others, but otherwise, you should never use it:
if you know the length of your string and buffer, why using strncpy ? It is a waste of computing power at best (adding useless 0)
if you don't know the lengths, then you risk silently truncating your strings, which is not much better than a buffer overflow
What you're looking for is the function strlcpy() which does terminate always the string with 0 and initializes the buffer. It also is able to detect overflows. Only problem, it's not (really) portable and is present only on some systems (BSD, Solaris). The problem with this function is that it opens another can of worms as can be seen by the discussions on
http://en.wikipedia.org/wiki/Strlcpy
My personal opinion is that it is vastly more useful than strncpy() and strcpy(). It has better performance and is a good companion to snprintf(). For platforms which do not have it, it is relatively easy to implement.
(for the developement phase of a application I substitute these two function (snprintf() and strlcpy()) with a trapping version which aborts brutally the program on buffer overflows or truncations. This allows to catch quickly the worst offenders. Especially if you work on a codebase from someone else.
EDIT: strlcpy() can be implemented easily:
size_t strlcpy(char *dst, const char *src, size_t dstsize)
{
size_t len = strlen(src);
if(dstsize) {
size_t bl = (len < dstsize-1 ? len : dstsize-1);
((char*)memcpy(dst, src, bl))[bl] = 0;
}
return len;
}
The strncpy() function is the safer one: you have to pass the maximum length the destination buffer can accept. Otherwise it could happen that the source string is not correctly 0 terminated, in which case the strcpy() function could write more characters to destination, corrupting anything which is in the memory after the destination buffer. This is the buffer-overrun problem used in many exploits
Also for POSIX API functions like read() which does not put the terminating 0 in the buffer, but returns the number of bytes read, you will either manually put the 0, or copy it using strncpy().
In your example code, index is actually not an index, but a count - it tells how many characters at most to copy from source to destination. If there is no null byte among the first n bytes of source, the string placed in destination will not be null terminated
strncpy fills the destination up with '\0' for the size of source, eventhough the size of the destination is smaller....
manpage:
If the length of src is less than n, strncpy() pads the remainder of
dest with null bytes.
and not only the remainder...also after this until n characters is
reached. And thus you get an overflow... (see the man page
implementation)
This may be used in many other scenarios, where you need to copy only a portion of your original string to the destination. Using strncpy() you can copy a limited portion of the original string as opposed by strcpy(). I see the code you have put up comes from publib.boulder.ibm.com.
That depends on our requirement.
For windows users
We use strncpy whenever we don't want to copy entire string or we want to copy only n number of characters. But strcpy copies the entire string including terminating null character.
These links will help you more to know about strcpy and strncpy
and where we can use.
about strcpy
about strncpy
the strncpy is a safer version of strcpy as a matter of fact you should never use strcpy because its potential buffer overflow vulnerability which makes you system vulnerable to all sort of attacks

Resources