sprintf_s with a buffer too small - c

The following code causes an error and kills my application. It makes sense as the buffer is only 10 bytes long and the text is 22 bytes long (buffer overflow).
char buffer[10];
int length = sprintf_s( buffer, 10, "1234567890.1234567890." );
How do I catch this error so I can report it instead of crashing my application?
Edit:
After reading the comments below I went with _snprintf_s. If it returns a -1 value then the buffer was not updated.
length = _snprintf_s( buffer, 10, 9, "123456789" );
printf( "1) Length=%d\n", length ); // Length == 9
length = _snprintf_s( buffer, 10, 9, "1234567890.1234567890." );
printf( "2) Length=%d\n", length ); // Length == -1
length = _snprintf_s( buffer, 10, 10, "1234567890.1234567890." );
printf( "3) Length=%d\n", length ); // Crash, it needs room for the NULL char

It's by design. The entire point of sprintf_s, and other functions from the *_s family, is to catch buffer overrun errors and treat them as precondition violations. This means that they're not really meant to be recoverable. This is designed to catch errors only - you shouldn't ever call sprintf_s if you know the string can be too large for a destination buffer. In that case, use strlen first to check and decide whether you need to trim.

Instead of sprintf_s, you could use snprintf (a.k.a _snprintf on windows).
#ifdef WIN32
#define snprintf _snprintf
#endif
char buffer[10];
int length = snprintf( buffer, 10, "1234567890.1234567890." );
// unix snprintf returns length output would actually require;
// windows _snprintf returns actual output length if output fits, else negative
if (length >= sizeof(buffer) || length<0)
{
/* error handling */
}

This works with VC++ and is even safer than using snprintf (and certainly safer than _snprintf):
void TestString(const char* pEvil)
{
char buffer[100];
_snprintf_s(buffer, _TRUNCATE, "Some data: %s\n", pEvil);
}
The _TRUNCATE flag indicates that the string should be truncated. In this form the size of the buffer isn't actually passed in, which (paradoxically!) is what makes it so safe. The compiler uses template magic to infer the buffer size which means it cannot be incorrectly specified (a surprisingly common error). This technique can be applied to create other safe string wrappers, as described in my blog post here:
https://randomascii.wordpress.com/2013/04/03/stop-using-strncpy-already/

From MSDN:
The other main difference between sprintf_s and sprintf is that sprintf_s takes a length parameter specifying the size of the output buffer in characters. If the buffer is too small for the text being printed then the buffer is set to an empty string and the invalid parameter handler is invoked. Unlike snprintf, sprintf_s guarantees that the buffer will be null-terminated (unless the buffer size is zero).
So ideally what you've written should work correctly.

Looks like you're writing on MSVC of some sort?
I think the MSDN docs for sprintf_s says that it assert dies, so I'm not too sure if you can programmatically catch that.
As LBushkin suggested, you're much better off using classes that manage the strings.

See section 6.6.1 of TR24731 which is the ISO C Committee version of the functionality implemented by Microsoft. It provides functions set_constraint_handler(), abort_constraint_handler() and ignore_constraint_handler() functions.
There are comments from Pavel Minaev suggesting that the Microsoft implementation does not adhere to the TR24731 proposal (which is a 'Type 2 Tech Report'), so you may not be able to intervene, or you may have to do something different from what the TR indicates should be done. For that, scrutinize MSDN.

Related

May printf (or fprintf or dprintf) return ("successfully") less (but nonnegative) than the number of "all bytes"?

The manual says that
Upon successful return, these functions [printf, dprintf etc.] return the number of characters printed.
The manual does not mention whethet may this number less (but yet nonnegative) than the length of the "final" (substitutions and formattings done) string. Nor mentions that how to check whether (or achieve that) the string was completely written.
The dprintf function operates on file descriptor. Similarily to the write function, for which the manual does mention that
On success, the number of bytes written is returned (zero indicates nothing was written). It is not an error if this number is smaller than the number of bytes requested;
So if I want to write a string completely then I have to enclose the n = write() in a while-loop. Should I have to do the same in case of dprintf or printf?
My understanding of the documentation is that dprintf would either fail or output all the output. But I agree that it is some gray area (and I might not understand well); I'm guessing that a partial output is some kind of failure (so returns a negative size).
Here is the implementation of musl-libc:
In stdio/dprintf.c the dprintf function just calls vdprintf
But in stdio/vdprintf.c you just have:
static size_t wrap_write(FILE *f, const unsigned char *buf, size_t len)
{
return __stdio_write(f, buf, len);
}
int vdprintf(int fd, const char *restrict fmt, va_list ap)
{
FILE f = {
.fd = fd, .lbf = EOF, .write = wrap_write,
.buf = (void *)fmt, .buf_size = 0,
.lock = -1
};
return vfprintf(&f, fmt, ap);
}
So dprintf is returning a size like vfprintf (and fprintf....) does.
However, if you really are concerned, you'll better use snprintf or asprintf to output into some memory buffer, and explicitly use write(2) on that buffer.
Look into stdio/__stdio_write.c the implementation of __stdio_write (it uses writev(2) with a vector of two data chunks in a loop).
In other words, I would often not really care; but if you really need to be sure that every byte has been written as you expect it (for example if the file descriptor is some HTTP socket), I would suggest to buffer explicitly (e.g. by calling snprintf and/or asprintf) yourself, then use your explicit write(2).
PS. You might check yourself the source code of your particular C standard library providing dprintf; for GNU glibc see notably libio/iovdprintf.c
With stdio, returning the number of partially written bytes doesn't make much sense because stdio functions work with a (more or less) global buffer whose state is unknown to you and gets dragged in from previous calls.
If stdio functions allowed you to work with that, the error return values would need to be more complex as they would not only need to communicate how many characters were or were not outputted, but also whether the failure was before your last input somewhere in the buffer, or in the middle of your last input and if so, how much of the last input got buffered.
The d-functions could theoretically give you the number of partially written characters easy, but POSIX specifies that they should mirror the stdio functions and so they only give you a further unspecified negative value on error.
If you need more control, you can use the lower level functions.
Concerning printf(), it is quite clear.
The printf function returns the number of characters transmitted, or a negative value if an output or encoding error occurred. C11dr ยง7.21.6.3 3
A negative value is returned if an error occurred. In that case 0 or more characters may have printed. The count is unknowable via the standard library.
If the value return is not negative, that is the number sent to stdout.
Since stdout is often buffered, that may not be the number received at the output device on the conclusion of printf(). Follow printf() with a fflush(stdout)
int r1 = printf(....);
int r2 = fflush(stdout);
if (r1 < 0 || r2 != 0) Handle_Failure();
For finest control, "print" to a buffer and use putchar() or various non-standard functions.
My bet is that no. (After looking into the - obfuscated - source of printf.) So any nonnegative return value means that printf was fully succesful (reached the end of the format string, everything was passed to kernel buffers).
But some (authentic) people should confirm it.

How do I use strlen() on a formatted string?

I'd like to write a wrapper function for the mvwprint/mvwchgat ncurses functions which prints the message in the specified window and then changes its attributes.
However, mvwchgat needs to know how many characters it should change - and I have no idea how to tell mvwchgat how long the formatted string is, since a strlen() on, for instance, "abc%d" obviously returns 5, because strlen doesn't know what %d stands for ...
In C99 or C11, you can use a line like this:
length = snprintf(NULL, 0, format_string, args);
From the manual of snprintf (emphasis mine):
The functions snprintf() and vsnprintf() do not write more than size bytes (including the terminating null byte ('\0')). If the output was truncated due to this limit then the return value is the number of characters (excluding the terminating null byte) which would have been written to the final string if enough space had been available. Thus, a return value of size or more means that the output was truncated.
Since we are giving snprintf 0 as the size, then the output is always truncated and the output of snprintf would be the number of characters that would have been written, which is basically the length of the string.
In C89, you don't have snprintf. A workaround is to create a temporary file, or if you are in *nix open /dev/null and write something like this:
FILE *throw_away = fopen("/dev/null", "w"); /* On windows should be "NUL" but I haven't tested */
if (throw_away)
{
fprintf(throw_away, "<format goes here>%n", <args go here>, &length);
fclose(throw_away);
} /* else, try opening a temporary file */
You can't know in advance how long your string will be:
printf("abc%d", 0); //4 chars
printf("abc%d", 111111111);//12 chars
All with the same format string.
The only sure way is to sprintf the text in question into a buffer, strlen(buffer) the result and printf("%s", buffer); the result to screen.
This solution avoids double formatting at the cost of allocating long enough buffer.

Why would one add 1 or 2 to the second argument of snprintf?

What is the role of 1 and 2 in these snprintf functions? Could anyone please explain it
snprintf(argv[arg++], strlen(pbase) + 2 + strlen("ivlpp"), "%s%ccivlpp", pbase, sep);
snprintf(argv[arg++], strlen(defines_path) + 1, "-F\"%s\"", defines_path);
The role of the +2 is to allow for a terminal null and the embedded character from the %c format, so there is exactly the right amount of space for formatting the first string. but (as 6502 points out), the actual string provided is one space shorter than needed because the strlen("ivlpp") doesn't match the civlpp in the format itself. This means that the last character (the second 'p') will be truncated in the output.
The role of the +1 is also to cause snprintf() to truncate the formatted data. The format string contains 4 literal characters, and you need to allow for the terminal null, so the code should allocate strlen(defines)+5. As it is, the snprintf() truncates the data, leaving off 4 characters.
I'm dubious about whether the code really works reliably...the memory allocation is not shown, but will have to be quite complex - or it will have to over-allocate to ensure that there is no danger of buffer overflow.
Since a comment from the OP says:
I don't know the use of snprintf()
int snprintf(char *restrict s, size_t n, const char *restrict format, ...);
The snprintf() function formats data like printf(), but it writes it to a string (the s in the name) instead of to a file. The first n in the name indicates that the function is told exactly how long the string is, and snprintf() therefore ensures that the output data is null terminated (unless the length is 0). It reports how long the string should have been; if the reported value is longer than the value provided, you know the data got truncated.
So, overall, snprintf() is a relatively safe way of formatting strings, provided you use it correctly. The examples in the question do not demonstrate 'using it correctly'.
One gotcha: if you work on MS Windows, be aware that the MSVC implementation of snprintf() does not exactly follow the C99 standard (and it looks a bit as though MS no longer provides snprintf() at all; only various alternatives such as _snprintf()). I forget the exact deviation, but I think it means that the string is not properly null-terminated in all circumstances when it should be longer than the space provided.
With locally defined arrays, you normally use:
nbytes = snprintf(buffer, sizeof(buffer), "format...", ...);
With dynamically allocated memory, you normally use:
nbytes = snprintf(dynbuffer, dynbuffsize, "format...", ...);
In both cases, you check whether nbytes contains a non-negative value less than the size argument; if it does, your data is OK; if the value is equal to or larger, then your data got chopped (and you know how much space you needed to allocate).
The C99 standard says:
The snprintf function returns the number of characters that would have been written
had n been sufficiently large, not counting the terminating null character, or a negative
value if an encoding error occurred. Thus, the null-terminated output has been
completely written if and only if the returned value is nonnegative and less than n.
The programmer whose code you are reading doesn't know how to use snprintf properly. The second argument is the buffer size, so it should almost always look like this:
snprintf(buf, sizeof buf, "..." ...);
The above is for situations where buf is an array, not a pointer. In the latter case you have to pass the buffer size along:
snprintf(buf, bufsize, "...", ...);
Computing the buffer size is unneeded.
By the way, since you tagged the question as qt-related. There is a very nice QString class that you should use instead.
At a first look both seem incorrect.
In the first case the correct computation would be path + sep + name + NUL so 2 would seem ok, but for the name the strlen call is using ilvpp while the formatting code is using instead cilvpp that is one char longer.
In the second case the number of chars added is 4 (-L"") so the number to add should be 5 because of the ending NUL.

Different ways to calculate string length

A comment on one of my answers has left me a little puzzled. When trying to compute how much memory is needed to concat two strings to a new block of memory, it was said that using snprintf was preferred over strlen, as shown below:
size_t length = snprintf(0, 0, "%s%s", str1, str2);
// preferred over:
size_t length = strlen(str1) + strlen(str2);
Can I get some reasoning behind this? What is the advantage, if any, and would one ever see one result differ from the other?
I was the one who said it, and I left out the +1 in my comment which was written quickly and carelessly, so let me explain. My point was merely that you should use the pattern of using the same method to compute the length that will eventually be used to fill the string, rather than using two different methods that could potentially differ in subtle ways.
For example, if you had three strings rather than two, and two or more of them overlapped, it would be possible that strlen(str1)+strlen(str2)+strlen(str3)+1 exceeds SIZE_MAX and wraps past zero, resulting in under-allocation and truncation of the output (if snprintf is used) or extremely dangerous memory corruption (if strcpy and strcat are used).
snprintf will return -1 with errno=EOVERFLOW when the resulting string would be longer than INT_MAX, so you're protected. You do need to check the return value before using it though, and add one for the null terminator.
If you only need to determine how big would be the concatenation of the two strings, I don't see any particular reason to prefer snprintf, since the minimum operations to determine the total length of the two strings is what the two strlen calls do. snprintf will almost surely be slower, because it has to check the parameters and parse the format string besides just walking the two strings counting the characters.
... but... it may be an intelligent move to use snprintf if you are in a scenario where you want to concatenate two strings, and have a static, not too big buffer to handle normal cases, but you can fallback to a dynamically allocated buffer in case of big strings, e.g.:
/* static buffer "big enough" for most cases */
char buffer[256];
/* pointer used in the part where work on the string is actually done */
char * outputStr=buffer;
/* try to concatenate, get the length of the resulting string */
int length = snprintf(buffer, sizeof(buffer), "%s%s", str1, str2);
if(length<0)
{
/* error, panic and death */
}
else if(length>sizeof(buffer)-1)
{
/* buffer wasn't enough, allocate dynamically */
outputStr=malloc(length+1);
if(outputStr==NULL)
{
/* allocation error, death and panic */
}
if(snprintf(outputStr, length, "%s%s", str1, str2)<0)
{
/* error, the world is doomed */
}
}
/* here do whatever you want with outputStr */
if(outputStr!=buffer)
free(outputStr);
One advantage would be that the input strings are only scanned once (inside the snprintf()) instead of twice for the strlen/strcpy solution.
Actually, on rereading this question and the comment on your previous answer, I don't see what the point is in using sprintf() just to calculate the concatenated string length. If you're actually doing the concatenation, my above paragraph applies.
You need to add 1 to the strlen() example. Remember you need to allocate space for nul terminating byte.
So snprintf( ) gives me the size a string would have been. That means I can malloc( ) space for that guy. Hugely useful.
I wanted (but did not find until now) this function of snprintf( ) because I format tons of strings for output later; but I wanted not to have to assign static bufs for the outputs because it's hard to predict how long the outputs will be. So I ended up with a lot of 4096-long char arrays :-(
But now -- using this newly-discovered (to me) snprintf( ) char-counting function, I can malloc( ) output bufs AND sleep at night, both.
Thanks again and apologies to the OP and to Matteo.
EDIT: random, mistaken nonsense removed. Did I say that?
EDIT: Matteo in his comment below is absolutely right and I was absolutely wrong.
From C99:
2 The snprintf function is equivalent to fprintf, except that the output is written into
an array (specified by argument s) rather than to a stream. If n is zero, nothing is written,
and s may be a null pointer. Otherwise, output characters beyond the n-1st are
discarded rather than being written to the array, and a null character is written at the end
of the characters actually written into the array. If copying takes place between objects
that overlap, the behavior is undefined.
Returns
3 The snprintf function returns the number of characters that would have been written
had n been sufficiently large, not counting the terminating null character, or a neg ative
value if an encoding error occurred. Thus, the null-terminated output has been
completely written if and only if the returned value is nonnegative and less than n.
Thank you, Matteo, and I apologize to the OP.
This is great news because it gives a positive answer to a question I'd asked here only a three weeks ago. I can't explain why I didn't read all of the answers, which gave me what I wanted. Awesome!
The "advantage" that I can see here is that strlen(NULL) might cause a segmentation fault, while (at least glibc's) snprintf() handles NULL parameters without failing.
Hence, with glibc-snprintf() you don't need to check whether one of the strings is NULL, although length might be slightly larger than needed, because (at least on my system) printf("%s", NULL); prints "(null)" instead of nothing.
I wouldn't recommend using snprintf() instead of strlen() though. It's just not obvious. A much better solution is a wrapper for strlen() which returns 0 when the argument is NULL:
size_t my_strlen(const char *str)
{
return str ? strlen(str) : 0;
}

Why does strncpy not null terminate?

strncpy() supposedly protects from buffer overflows. But if it prevents an overflow without null terminating, in all likelihood a subsequent string operation is going to overflow. So to protect against this I find myself doing:
strncpy( dest, src, LEN );
dest[LEN - 1] = '\0';
man strncpy gives:
The strncpy() function is similar, except that not more than n bytes of src are copied. Thus, if there is no null byte among the first n bytes of src, the result will not be null-terminated.
Without null terminating something seemingly innocent like:
printf( "FOO: %s\n", dest );
...could crash.
Are there better, safer alternatives to strncpy()?
strncpy() is not intended to be used as a safer strcpy(), it is supposed to be used to insert one string in the middle of another.
All those "safe" string handling functions such as snprintf() and vsnprintf() are fixes that have been added in later standards to mitigate buffer overflow exploits etc.
Wikipedia mentions strncat() as an alternative to writing your own safe strncpy():
*dst = '\0';
strncat(dst, src, LEN);
EDIT
I missed that strncat() exceeds LEN characters when null terminating the string if it is longer or equal to LEN char's.
Anyway, the point of using strncat() instead of any homegrown solution such as memcpy(..., strlen(...))/whatever is that the implementation of strncat() might be target/platform optimized in the library.
Of course you need to check that dst holds at least the nullchar, so the correct use of strncat() would be something like:
if (LEN) {
*dst = '\0'; strncat(dst, src, LEN-1);
}
I also admit that strncpy() is not very useful for copying a substring into another string, if the src is shorter than n char's, the destination string will be truncated.
Originally, the 7th Edition UNIX file system (see DIR(5)) had directory entries that limited file names to 14 bytes; each entry in a directory consisted of 2 bytes for the inode number plus 14 bytes for the name, null padded to 14 characters, but not necessarily null-terminated. It's my belief that strncpy() was designed to work with those directory structures - or, at least, it works perfectly for that structure.
Consider:
A 14 character file name was not null terminated.
If the name was shorter than 14 bytes, it was null padded to full length (14 bytes).
This is exactly what would be achieved by:
strncpy(inode->d_name, filename, 14);
So, strncpy() was ideally fitted to its original niche application. It was only coincidentally about preventing overflows of null-terminated strings.
(Note that null padding up to the length 14 is not a serious overhead - if the length of the buffer is 4 KB and all you want is to safely copy 20 characters into it, then the extra 4075 nulls is serious overkill, and can easily lead to quadratic behaviour if you are repeatedly adding material to a long buffer.)
There are already open source implementations like strlcpy that do safe copying.
http://en.wikipedia.org/wiki/Strlcpy
In the references there are links to the sources.
Strncpy is safer against stack overflow attacks by the user of your program, it doesn't protect you against errors you the programmer do, such as printing a non-null-terminated string, the way you've described.
You can avoid crashing from the problem you've described by limiting the number of chars printed by printf:
char my_string[10];
//other code here
printf("%.9s",my_string); //limit the number of chars to be printed to 9
Some new alternatives are specified in ISO/IEC TR 24731 (Check https://buildsecurityin.us-cert.gov/daisy/bsi/articles/knowledge/coding/317-BSI.html for info). Most of these functions take an additional parameter that specifies the maximum length of the target variable, ensure that all strings are null-terminated, and have names that end in _s (for "safe" ?) to differentiate them from their earlier "unsafe" versions.1
Unfortunately, they're still gaining support and may not be available with your particular tool set. Later versions of Visual Studio will throw warnings if you use the old unsafe functions.
If your tools don't support the new functions, it should be fairly easy to create your own wrappers for the old functions. Here's an example:
errCode_t strncpy_safe(char *sDst, size_t lenDst,
const char *sSrc, size_t count)
{
// No NULLs allowed.
if (sDst == NULL || sSrc == NULL)
return ERR_INVALID_ARGUMENT;
// Validate buffer space.
if (count >= lenDst)
return ERR_BUFFER_OVERFLOW;
// Copy and always null-terminate
memcpy(sDst, sSrc, count);
*(sDst + count) = '\0';
return OK;
}
You can change the function to suit your needs, for example, to always copy as much of the string as possible without overflowing. In fact, the VC++ implementation can do this if you pass _TRUNCATE as the count.
1Of course, you still need to be accurate about the size of the target buffer: if you supply a 3-character buffer but tell strcpy_s() it has space for 25 chars, you're still in trouble.
Use strlcpy(), specified here: http://www.courtesan.com/todd/papers/strlcpy.html
If your libc doesn't have an implementation, then try this one:
size_t strlcpy(char* dst, const char* src, size_t bufsize)
{
size_t srclen =strlen(src);
size_t result =srclen; /* Result is always the length of the src string */
if(bufsize>0)
{
if(srclen>=bufsize)
srclen=bufsize-1;
if(srclen>0)
memcpy(dst,src,srclen);
dst[srclen]='\0';
}
return result;
}
(Written by me in 2004 - dedicated to the public domain.)
Instead of strncpy(), you could use
snprintf(buffer, BUFFER_SIZE, "%s", src);
Here's a one-liner which copies at most size-1 non-null characters from src to dest and adds a null terminator:
static inline void cpystr(char *dest, const char *src, size_t size)
{ if(size) while((*dest++ = --size ? *src++ : 0)); }
strncpy works directly with the string buffers available, if you are working directly with your memory, you MUST now buffer sizes and you could set the '\0' manually.
I believe there is no better alternative in plain C, but its not really that bad if you are as careful as you should be when playing with raw memory.
Without relying on newer extensions, I have done something like this in the past:
/* copy N "visible" chars, adding a null in the position just beyond them */
#define MSTRNCPY( dst, src, len) ( strncpy( (dst), (src), (len)), (dst)[ (len) ] = '\0')
and perhaps even:
/* pull up to size - 1 "visible" characters into a fixed size buffer of known size */
#define MFBCPY( dst, src) MSTRNCPY( (dst), (src), sizeof( dst) - 1)
Why the macros instead of newer "built-in" (?) functions? Because there used to be quite a few different unices, as well as other non-unix (non-windows) environments that I had to port to back when I was doing C on a daily basis.
I have always preferred:
memset(dest, 0, LEN);
strncpy(dest, src, LEN - 1);
to the fix it up afterwards approach, but that is really just a matter of preference.
These functions have evolved more than being designed, so there really is no "why".
You just have to learn "how". Unfortunately the linux man pages at least are
devoid of common use case examples for these functions, and I've noticed lots
of misuse in code I've reviewed. I've made some notes here:
http://www.pixelbeat.org/programming/gcc/string_buffers.html

Resources