Related
Perhaps I'm misinterpreting my results, but:
#include <stdio.h>
int
main(void)
{
char buf[32] = "";
int x;
x = scanf("%31[^\0]", buf);
printf("x = %d, buf=%s", x, buf);
}
$ printf 'foo\n\0bar' | ./a.out
x = 1, buf=foo
Since the string literal "%31[^\0]" contains an embedded null, it seems that it should be treated the same as "%31[^", and the compiler should complain that the [ is unmatched. Indeed, if you replace the string literal, clang gives:
warning: no closing ']' for '%[' in scanf format string [-Wformat]
Why does it work to embed a null character in the string literal passed to scanf?
-- EDIT --
The above is undefined behavior and merely happens to "work".
First of all, Clang totally fails to output any meaningful diagnostics here, whereas GCC knows exactly what is happening - so yet again GCC 1 - 0 Clang.
And as for the format string - well, it doesn't work. The format argument to scanf is a string. The string ends at terminating null, i.e. the format string you're giving to scanf is
scanf("%31[^", buf);
On my computer, compiling the program gives
% gcc scanf.c
scanf.c: In function ‘main’:
scanf.c:8:20: warning: no closing ‘]’ for ‘%[’ format [-Wformat=]
8 | x = scanf("%31[^\0]", buf);
| ^
scanf.c:8:21: warning: embedded ‘\0’ in format [-Wformat-contains-nul]
8 | x = scanf("%31[^\0]", buf);
| ^~
The scanset must have the closing right bracket ], otherwise the conversion specifier is invalid. If conversion specifier is invalid, the behaviour is undefined.
And, on my computer running it,
% printf 'foo\n\0bar' | ./a.out
x = 0, buf=
Q.E.D.
This is a rather strange situation. I think there are a couple of things going on.
First of all, a string in C ends by definition at the first \0. You can always scoff at this rule, for example by writing a string literal with an explicit \0 in the middle of it. When you do, though, the characters after the \0 are mostly invisible. Very few standard library functions are able to see them, because of course just about everything that interprets a C string will stop at the first \0 it finds.
However: the string you pass as the first argument to scanf is typically parsed twice -- and by "parsed" I mean actually interpreted as a scanf format string possibly containing special % sequences. It's always going to be parsed at run time, by the actual copy of scanf in your C run-time library. But it's typically also parsed by the compiler, at compile time, so that the compiler can warn you if the % sequences don't match the actual arguments you call it with. (The run-time library code for scanf, of course, is unable to perform this checking.)
Now, of course, there's a pretty significant potential problem here: what if the parsing performed by the compiler is in some way different than the parsing performed by the actual scanf code in the run-time library? That might lead to confusing results.
And, to my considerable surprise, it looks like the scanf format parsing code in compilers can (and in some cases does) do something special and unexpected. clang doesn't (it doesn't complain about the malformed string at all), but gcc says both "no closing ‘]’ for ‘%[’ format" and "embedded ‘\0’ in format". So it's noticing.
This is possible (though still surprising) because the compiler, at least, can see the whole string literal, and is in a position to notice that the null character is an explicit one inserted by the programmer, not the more usual implicit one appended by the compiler. And indeed the warning "embedded ‘\0’ in format" emitted by gcc proves that gcc, at least, is quite definitely written to accommodate this possibility. (See the footnote below for a bit more on the compiler's ability to "see" the whole string literal.)
But the second question is, why does it (seem to) work at runtime? What is the actual scanf code in the C library doing?
That code, at least, has no way of knowing that the \0 was explicit and that there are "real" characters following it. That code simply must stop at the first \0 that it finds. So it's operating as if the format string was
"%31[^"
That's a malformed format string, of course. The run-time library code isn't required to do anything reasonable. But my copy, like yours, is able to read the full string "foo". What's up with that?
My guess is that after seeing the % and the [ and the ^, and deciding that it's going to scan characters not matching some set, it's perfectly willing to, in effect, infer the missing ], and sail on matching characters from the scanset, which ends up having no excluded characters.
I tested this by trying the variant
x = scanf("%31[^\0o]", buf);
This also matched and printed "foo", not "f".
Obviously things are nothing like guaranteed to work like this, of course. #AnttiHaapala has already posted an answer showing that his C RTL declines to scan "foo" with the malformed scan string at all.
Footnote:
Most of the time, an embedded in \0 in a string truly, prematurely ends it. Most of the time, everything following the \0 is effectively invisible, because at run time, every piece of string interpreting code will stop at the first \0 it finds, with no way to know whether it was one explicitly inserted by the programmer or implicitly appended by the compiler. But as we've seen, the compiler can tell the difference, because the compiler (obviously) can see the entire string literal, exactly as entered by the programmer. Here's proof:
char str1[] = "Hello, world!";
char str2[] = "Hello\0world!";
printf("sizeof(str1) = %zu, strlen(str1) = %zu\n", sizeof(str1), strlen(str1));
printf("sizeof(str2) = %zu, strlen(str2) = %zu\n", sizeof(str2), strlen(str2));
Normally, sizeof on a string literal gives you a number one bigger than strlen. But this code prints:
sizeof(str1) = 14, strlen(str1) = 13
sizeof(str2) = 13, strlen(str2) = 5
Just for fun I also tried:
char str3[5] = "Hello";
This time, though, strlen gave a larger number:
sizeof(str3) = 5, strlen(str3) = 10
I was mildly lucky. str3 has no trailing \0, neither one inserted by me nor appended by the compiler, so strlen sails off the end, and could easily have counted hundreds or thousands of characters before finding a random \0 somewhere in memory, or crashing.
Why can a null character be embedded in a conversion specifier for scanf?
A null character cannot directly be specified as part of a scanset as in "%31[^\0]" as the parsing of the string ends with the first null character.
"%31[^\0]" is parsed by scanf() as if it was "%31[^". As it is an invalid scanf()
specifier, UB will likely follow. A compiler may provide diagnostics on more than what scanf() sees.
A null character can be part of a scanset as in "%31[^\n]". This will read in all characters, including the null character, other than '\n'.
In the unusual case of reading null chracters, to determine the number of characters read scanned, use "%n".
int n = 0;
scanf("%31[^\n]%n", buf, &n);
scanf("%*1[\n]"); // Consume any 1 trailing \n
if (n) {
printf("First part of buf=%s, %d characters read ", buf, n);
}
Code I have:
int main(){
char readChars[3];
puts("Enter the value of the card please:");
scanf(readChars);
printf(readChars);
printf("done");
}
All I see is:
"done"
after I enter some value to terminal and pressing Enter, why?
Edit:
Isn't the prototype for scanf:
int scanf(const char *format, ...);
So I should be able to use it with just one argument?
The actual problem is that you are passing an uninitialized array as the format to scanf().
Also you are invoking scanf() the wrong way try this
if (scanf("%2s", readChars) == 1)
printf("%s\n", readChars);
scanf() as well as printf() use a format string and that's actually the cause for the f in their name.
And yes you are able to use it with just one argument, scanf() scans input according to the format string, the format string uses special values that are matched against the input, if you don't specify at least one then scanf() will only be useful for input validation.
The following was extracted from C11 draft
7.21.6.2 The fscanf function
The format shall be a multibyte character sequence, beginning and ending in its initial shift state. The format is composed of zero or more directives: one or more white-space characters, an ordinary multibyte character (neither % nor a white-space character), or a conversion specification. Each conversion specification is introduced by the character %. After the %, the following appear in sequence:
An optional assignment-suppressing character *.
An optional decimal integer greater than zero that specifies the maximum field width
(in characters).
An optional length modifier that specifies the size of the receiving object.
A conversion specifier character that specifies the type of conversion to be applied.
as you can read above, you need to pass at least one conversion specifier, and in that case the corresponding argument to store the converted value, if you pass the conversion specifier but you don't give an argument for it, the behavior is undefined.
Yes, it is possible to call scanf with just one parameter, and it may even be useful on occasion. But it wouldn't do what you apparently thought it would. (It would just expect the characters in the argument in the input stream and skip them.) You didn't notice because you failed to do due diligence as a programmer. I'll list what you should do:
RTFM. scanf's first parameter is a format string. Plain characters which are not part of conversion sequences and are not whitespace are expected literally in the input. They are read and discarded. If they do not appear, conversion stops there, and the position in the input stream where the unexpected character occured is the start of subsequent reads. In your case probably no character was ever successfully read from the input, but you don't know for sure, because you didn't initialize the format string (see below).
Another interesting detail is scanf's return value which indicates the number items successfully read. I'll discuss that below together with the importance to check return values.
Initialize locals. C doesn't automatically initialize local data for performance reasons (in today's light one would probably enforce user initialization like other languages do, or make auto initialization a default with an opt-out possibility for the few inner loops where it would hurt). Because you didn't initialize readchars, you don't know what's in it, so you don't know what scanf expected in the input stream. On top it probably is nominally undefined behaviour. (But on your PC it shouldn't do anything unexpected.)
Check return values. scanf probably returned 0 in your example. The manual states that scanf returns the number of items successfully read, here 0, i.e. no input conversion took place. This type of undetected failure can be fatal in long sequences of read operations because the following scanfs may read in one-off indexes from a sequence of tokens, or may stall as well (and not update their pointees at all), etc.
Please bear with me -- I do not always read the manual, check return values or (by error) initialize variables for little test programs. But if it doesn't work, it's part of my investigation. And before I ask anybody, let alone the world, I make damn sure that I have done my best to find out what I did wrong, beforehand.
You're not using scanf correctly:
scanf(formatstring, address_of_destination,...)
is the right way to do it.
EDIT:
Isn't the prototype for scanf:
int scanf(const char *format, ...);
So I should be able to use it with just one argument?
No, you should not. Please read documentation on scanf; format is a string specifying what scanf should read, and the ... are the things that scanf should read into.
The first argument to scanf is the format string. What you need is:
scanf("%2s", readChars);
It Should provided Format specifiers in scanf function
char readChars[3];
puts("Enter the value of the card please:");
scanf("%s",readChars);
printf("%s",readChars);
printf("done");
http://www.cplusplus.com/reference/cstdio/scanf/ more info...
In the various cases that a buffer is provided to the standard library's many string functions, is it guaranteed that the buffer will not be modified beyond the null terminator? For example:
char buffer[17] = "abcdefghijklmnop";
sscanf("123", "%16s", buffer);
Is buffer now required to equal "123\0efghijklmnop"?
Another example:
char buffer[10];
fgets(buffer, 10, fp);
If the read line is only 3 characters long, can one be certain that the 6th character is the same as before fgets was called?
The C99 draft standard does not explicitly state what should happen in those cases, but by considering multiple variations, you can show that it must work a certain way so that it meets the specification in all cases.
The standard says:
%s - Matches a sequence of non-white-space characters.252)
If no l length modifier is present, the corresponding argument shall be a
pointer to the initial element of a character array large enough to accept the
sequence and a terminating null character, which will be added automatically.
Here's a pair of examples that show it must work the way you are proposing to meet the standard.
Example A:
char buffer[4] = "abcd";
char buffer2[10]; // Note the this could be placed at what would be buffer+4
sscanf("123 4", "%s %s", buffer, buffer2);
// Result is buffer = "123\0"
// buffer2 = "4\0"
Example B:
char buffer[17] = "abcdefghijklmnop";
char* buffer2 = &buffer[4];
sscanf("123 4", "%s %s", buffer, buffer2);
// Result is buffer = "123\04\0"
Note that the interface of sscanf doesn't provide enough information to really know that these were different. So, if Example B is to work properly, it must not mess with the bytes after the null character in Example A. This is because it must work in both cases according to this bit of spec.
So implicitly it must work as you stated due to the spec.
Similar arguments can be placed for other functions, but I think you can see the idea from this example.
NOTE:
Providing size limits in the format, such as "%16s", could change the behavior. By the specification, it would be functionally acceptable for sscanf to zero out a buffer to its limits before writing the data into the buffer. In practice, most implementations opt for performance, which means they leave the remainder alone.
When the intent of the specification is to do this sort of zeroing out, it is usually explicitly specified. strncpy is an example. If the length of the string is less than the maximum buffer length specified, it will fill the rest of the space with null characters. The fact that this same "string" function could return a non-terminated string as well makes this one of the most common functions for people to roll their own version.
As far as fgets, a similar situation could arise. The only gotcha is that the specification explicitly states that if nothing is read in, the buffer remains untouched. An acceptable functional implementation could sidestep this by checking to see if there is at least one byte to read before zeroing out the buffer.
Each individual byte in the buffer is an object. Unless some part of the function description of sscanf or fgets mentions modifying those bytes, or even implies their values may change e.g. by stating their values become unspecified, then the general rule applies: (emphasis mine)
6.2.4 Storage durations of objects
2 [...] An object exists, has a constant address, and retains its last-stored value throughout its lifetime. [...]
It's this same principle that guarantees that
#include <stdio.h>
int a = 1;
int main() {
printf ("%d\n", a);
printf ("%d\n", a);
}
attempts to print 1 twice. Even though a is global, printf can access global variables, and the description of printf doesn't mention not modifying a.
Neither the description of fgets nor that of sscanf mentions modifying buffers past the bytes that actually were supposed to be written (except in the case of a read error), so those bytes don't get modified.
The standard is somewhat ambiguous on this, but I think a reasonable reading of it is that the answer is: yes, it's not allowed to write more bytes to the buffer than it read+null. On the other hand, a stricter reading/interpretation of the text could conclude that the answer is no, there's no guarantee. Here's what a publicly avaialble draft says about fgets.
char *fgets(char * restrict s, int n, FILE * restrict stream);
The fgets function reads at most one less than the number of characters specified by n from the stream pointed to by stream into the array pointed to by s. No additional characters are read after a new-line character (which is retained) or after end-of-file. A null character is written immediately after the last character read into the array.
The fgets function returns s if successful. If end-of-file is encountered and no characters have been read into the array, the contents of the array remain unchanged and a null pointer is returned. If a read error occurs during the operation, the array contents are indeterminate and a null pointer is returned.
There's a guarantee about how much it is supposed to read from the input, i.e. stop reading at newline or EOF and not read more than n-1 bytes. Although nothing is said explicitly about how much it's allowed to write to the buffer, the common knowledge is that fgets's n parameter is used to prevent buffer overflows. It's a little strange that the standard uses the ambiguous term read, which may not necessarily imply that gets can't write to the buffer more than n bytes, if you want to nitpick on the terminology it uses. But note that the same "read" terminology is used about both issues: the n-limit and the EOF/newline limit. So if you interpret the n-related "read" as a buffer-write limit, then [for consistency] you can/should interpret the other "read" the same way, i.e. not write more than what it read when string is shorter than the buffer.
On the other hand, if you distinguish between the uses of the phrase-verb "read into" (="write") and just "read", then you can't read the committee's text the same way. You are guaranteed that it won't "read into" (="write to") the array more than n bytes, but if the input string is terminated sooner by newline or EOF you're only guaranteed the rest (of the input) won't be "read", but whether that implies in won't be "read into" (="written to") the buffer is unclear under this stricter reading. The crucial issue is keyword is "into", which is elided, so the problem is whether the completion given by me in brackets in the following modified quote is the intended interpretation:
No additional characters are read [into the array] after a new-line character (which is retained) or after end-of-file.
Frankly a single postcondition stated as a formula (and would be pretty short in this case) would have been a lot more helpful than the verbiage I quoted...
I can't be bothered to try and analyze their writeup about the *scanf family, because I suspect it's going to be even more complicated given all the other things that happen in those functions; their writeup for fscanf is about five pages long... But I suspect a similar logic applies.
is it guaranteed that the buffer will not be modified beyond the null
terminator?
No, there's no guarantee.
Is buffer now required to equal "123\0efghijklmnop"?
Yes. But that's only because you've used correct parameters to your string related functions. Should you mess up buffer length, input modifiers to sscanf and such, then you program will compile. But it will most likely fail during runtime.
If the read line is only 3 characters long, can one be certain that the 6th character is the same as before fgets was called?
Yes. Once fgets() figures you have a 3 character input string it stores the input in the provided buffer, and it doesn't care about the reset of provided space at all.
Is buffer now required to equal "123\0efghijklmnop"?
Here buffer is just consists of 123 string guaranteed terminating at NUL.
Yes the memory allocated for array buffer will not get de-allocated, however you are making sure/restricting your string buffer can atmost only have 16 char elements which you can read into it at any point of time. Now depends whether you write just a single char or maximum what buffer can take.
For example:
char buffer[4096] = "abc";`
actually does something below,
memcpy(buffer, "abc", sizeof("abc"));
memset(&buffer[sizeof("abc")], 0, sizeof(buffer)-sizeof("abc"));
The standard insists that if any part of char array is initialized that is all it consists of at any moment until obeying its memory boundary.
There are no any guarantees from standard, which is why the functions sscanf and fgets are recommended to be used (with respect to the size of the buffer) as you show in your question (and using of fgets is considered preferable compared with gets).
However, some standard functions use null-terminator in their work, e.g. strlen (but I suppose you ask about string modification)
EDIT:
In your example
fgets(buffer, 10, fp);
untouching characters after 10-th is guaranteed (content and length of buffer will not be considered by fgets)
EDIT2:
Moreover, when using fgets keep in mind that '\n' will be stored in the buffers. e.g.
"123\n\0fghijklmnop"
instead of expected
"123\0efghijklmnop"
Depends on the function in use (and to a lesser degree its implementation). sscanf will start writing when it encounters its first non-whitespace character, and continue writing until its first whitespace character, where it will add a finishing 0 and return. But a function like strncpy (famously) zeroes out the rest of the buffer.
There is however nothing in the C standard which mandates how these functions behave.
I've seen several usage of fgets (for example, here) that go like this:
char buff[7]="";
(...)
fgets(buff, sizeof(buff), stdin);
The interest being that, if I supply a long input like "aaaaaaaaaaa", fgets will truncate it to "aaaaaa" here, because the 7th character will be used to store '\0'.
However, when doing this:
int i=0;
for (i=0;i<7;i++)
{
buff[i]='a';
}
printf("%s\n",buff);
I will always get 7 'a's, and the program will not crash. But if I try to write 8 'a's, it will.
As I saw it later, the reason for this is that, at least on my system, when I allocate char buff[7] (with or without =""), the 8th byte (counting from 1, not from 0) gets set to 0. From what I guess, things are done like this precisely so that a for loop with 7 writes, followed by a string formatted read, could succeed, whether the last character to be written was '\0' or not, and thus avoiding the need for the programmer to set the last '\0' himself, when writing chars individually.
From this, it follows that in the case of
fgets(buff, sizeof(buff), stdin);
and then providing a too long input, the resulting buffstring will automatically have two '\0' characters, one inside the array, and one right after it that was written by the system.
I have also observed that doing
fgets(buff,(sizeof(buff)+17),stdin);
will still work, and output a very long string, without crashing. From what I guessed, this is because fgets will keep writing until sizeof(buff)+17, and the last char to be written will precisely be a '\0', ensuring that any forthcoming string reading process would terminate properly (although the memory is messed up anyway).
But then, what about fgets(buff, (sizeof(buff)+1),stdin);? this would use up all the space that was rightfully allocated in buff, and then write a '\0' right after it, thus overwriting...the '\0' previously written by the system. In other words, yes, fgets would go out of bounds, but it can be proven that when adding only one to the length of the write, the program will never crash.
So in the end, here comes the question: why does fgets always terminates its write with a '\0', when another '\0', placed by the system right after the array, already exists? why not do like in the one by one for-loop based write, that can access the whole of the array and write anything the programmer wants, without endangering anything?
Thank you very much for your answer!
EDIT: indeed, there is no proof possible, as long as I do not know whether this 8th '\0' that mysteriously appears upon allocation of buff[7], is part of the C standard or not, specifically for string arrays. If not, then...it's just luck that it works :-)
but it can be proven that when adding only one to the length of the write, the program will never crash.
No! You can't prove that! Not in the sense of a mathematical proof. You have only shown that on your system, with your compiler, with those particular compiler settings you used, with particular environment configuration, it might not crash. This is far from a mathematical proof!
In fact the C standard itself, although it guarantees that you can get the address of "one place after the last element of an array", it also states that dereferencing that address (i.e. trying to read or write from that address) is undefined behaviour.
That means that an implementation can do everything in this case. It can even do what you expect with naive reasoning (i.e. work - but it's sheer luck), but it may also crash or it may also format your HD (if your are very, very unlucky). This is especially true when writing system software (e.g. a device driver or a program running on the bare metal), i.e. when there is no OS to shield you from the nastiest consequences of writing bad code!
Edit This should answer the question made in a comment (C99 draft standard):
7.19.7.2 The fgets function
Synopsis
#include <stdio.h>
char *fgets(char * restrict s, int n,
FILE * restrict stream);
Description
The fgets function reads at most one less than the number of characters specified by n
from the stream pointed to by stream into the array pointed to by s. No additional
characters are read after a new-line character (which is retained) or after end-of-file. A
null character is written immediately after the last character read into the array.
Returns
The fgets function returns s if successful. If end-of-file is encountered and no
characters have been read into the array, the contents of the array remain unchanged and a
null pointer is returned. If a read error occurs during the operation, the array contents are
indeterminate and a null pointer is returned.
Edit: Since it seems that the problem lies in a misunderstanding of what a string is, this is the relevant excerpt from the standard (emphasis mine):
7.1.1 Definitions of terms
A string is a contiguous sequence of characters terminated by and including the first null
character. The term multibyte string is sometimes used instead to emphasize special
processing given to multibyte characters contained in the string or to avoid confusion
with a wide string. A pointer to a string is a pointer to its initial (lowest addressed)
character. The length of a string is the number of bytes preceding the null character and
the value of a string is the sequence of the values of the contained characters, in order.
From C11 standard draft:
The fgets function reads at most one less than the number of characters specified by n
from the stream pointed to by stream into the array pointed to by s. No additional
characters are read after a new-line character (which is retained) or after end-of-file. A
null character is written immediately after the last character read into the array.
The fgets function returns s if successful. If end-of-file is encountered and no
characters have been read into the array, the contents of the array remain unchanged and a
null pointer is returned. If a read error occurs during the operation, the array contents are indeterminate and a null pointer is returned.
The behaviour you describe is undefined.
Is snprintf always null terminating the destination buffer?
In other words, is this sufficient:
char dst[10];
snprintf(dst, sizeof (dst), "blah %s", somestr);
or do you have to do like this, if somestr is long enough?
char dst[10];
somestr[sizeof (dst) - 1] = '\0';
snprintf(dst, sizeof (dst) - 1, "blah %s", somestr);
I am interested both in what the standard says and what some popular libc might do which is not standard behavior.
As the other answers establish: It should:
snprintf ... Writes the results to a character string buffer. (...)
will be terminated with a null character, unless buf_size is zero.
So all you have to take care is that you don't pass an zero-size buffer to it, because (obviously) it cannot write a zero to "nowhere".
However, beware that Microsoft's library does not have a function called snprintf but instead historically only had a function called _snprintf (note leading underscore) which does not append a terminating null. Here's the docs (VS 2012, ~~ VS 2013):
http://msdn.microsoft.com/en-us/library/2ts7cx93%28v=vs.110%29.aspx
Return Value
Let len be the length of the formatted data string (not including the
terminating null). len and count are in bytes for _snprintf, wide
characters for _snwprintf.
If len < count, then len characters are stored in buffer, a
null-terminator is appended, and len is returned.
If len = count, then len characters are stored in buffer, no
null-terminator is appended, and len is returned.
If len > count, then count characters are stored in buffer, no
null-terminator is appended, and a negative value is returned.
(...)
Visual Studio 2015 (VC14) apparently introduced the conforming snprintf function, but the legacy one with the leading underscore and the non null-terminating behavior is still there:
The snprintf function truncates the output when len is greater
than or equal to count, by placing a null-terminator at
buffer[count-1]. (...)
For all functions other than snprintf, if len = count, len
characters are stored in buffer, no null-terminator is appended,
(...)
According to snprintf(3) manpage.
The functions snprintf() and vsnprintf() write at most size bytes (including the trailing null byte ('\0')) to str.
So, yes, no need to terminate if size >= 1.
According to the C standard, unless the buffer size is 0, vsnprintf() and snprintf() null terminates its output.
The snprintf() function shall be equivalent to sprintf(), with the addition of the n argument which states the size of the buffer referred to by s. If n is zero, nothing shall be written and s may be a null pointer. Otherwise, output bytes beyond the n-1st shall be discarded instead of being written to the array, and a null byte is written at the end of the bytes actually written into the array.
So, if you need to know how big a buffer to allocate, use a size of zero, and you can then use a null pointer as the destination. Note that I linked to the POSIX pages, but these explicitly say that there is not intended to be any divergence between Standard C and POSIX where they cover the same ground:
The functionality described on this reference page is aligned with the ISO C standard. Any conflict between the requirements described here and the ISO C standard is unintentional. This volume of POSIX.1-2008 defers to the ISO C standard.
Be wary of the Microsoft version of vsnprintf(). It definitely behaves differently from the standard C version when there is not enough space in the buffer (it returns -1 where the standard function returns the required length). It is not entirely clear that the Microsoft version null terminates its output under error conditions, whereas the standard C version does.
However, note that Microsoft has changed the rules (vsnprintf()) since this answer was originally written:
Beginning with the UCRT in Visual Studio 2015 and Windows 10, vsnprintf is no longer identical to _vsnprintf. The vsnprintf function conforms to the C99 standard; _vnsprintf is kept for backward compatibility with older Visual Studio code.
Similar comments apply to snprintf() and sprintf().
Note also the answers to Do you use the TR 24731 safe functions? (see MSDN for the Microsoft version of the vsprintf_s()) and the Mac solution for the safe alternatives to unsafe C standard library functions?
Some older versions of SunOS did weird things with snprintf and might have not NUL-terminated the output and had return values that didn't match what everyone else was doing, but anything that has been released in the past 10 years have been doing what C99 says.
The ambiguity starts from the C Standard itself. Both C99 and C11 have identical description of snprintf function. Here is the description from C99:
7.19.6.5 The snprintf function
Synopsis
1 #include <stdio.h>
int snprintf(char * restrict s, size_t n, const char * restrict format, ...);
Description
2 The snprintf function is equivalent to fprintf, except that the output is written into an array (specified by argument s) rather than to a stream. If n is zero, nothing is written, and s may be a null pointer. Otherwise, output characters beyond the n-1st are
discarded rather than being written to the array, and a null character is written at the end of the characters actually written into the array. If copying takes place between objects that overlap, the behavior is undefined.
Returns
3 The snprintf function returns the number of characters that would have been written had n been sufficiently large, not counting the terminating null character, or a negative value if an encoding error occurred. Thus, the null-terminated output has been completely written if and only if the returned value is nonnegative and less than n.
On the one hand the sentence
Otherwise, output characters beyond the n-1st are discarded rather than being written to the array, and a null character is written at the end of the characters actually written into the array
says that
if (the s points to a 3-character-long array, and) n is 3, then 2 characters will be written, and the characters beyond the 2nd one are discarded; then the null character is written after those 2 (and the null character will be the 3rd character written).
And this I believe answers the original question.
THE ANSWER:
If copying takes place between objects that overlap, the behavior is undefined.
If n is 0 then nothing is written to the output
otherwise, if no encoding errors encountered, the output is ALWAYS null-terminated (regardless of whether the output fits in the output array or not; if not then some characters are discarded such that the output array is never overflown),
otherwise (if encoding errors are encountered) the output can stay non-null-terminated.
On the other hand
The last sentence
Thus, the null-terminated output has been completely written if and only if the returned value is nonnegative and less than n
gives ambiguity (or my English is not good enough). I can interpret this sentence in at least two ways:
1. The output is null-terminated if and only if the returned value is nonnegative and less than n (which means that if the returned value is not less than n, i.e. the output (including the terminating null character) does not fit in the array, then the output is not null-terminated).
2. The output is complete (no characters have been discarded) if and only if the returned value is nonnegative and less than n.
I believe that the interpretation 1 above contradicts THE ANSWER, causes misunderstanding and lengthy discussions. That is why the last sentence describing the snprintf function needs a change in order to remove any ambiguity (which gives grounds for writing a Proposal to the C language Standard).
The example of non-ambiguous wording I believe can be taken from http://en.cppreference.com/w/c/io/fprintf (see 4)), thanks to #"Martin Ba" for the link.
See also the question "snprintf: Are there any C Standard Proposals/plans to change the description of this func?".