What is the underlying difference between printf(s) and printf("%s", s)? - c

The question is plain and simple: s is a string. I got the idea to try printf(s) to see whether it would work, and I got a warning in one case and none in the other.
char* s = "abcdefghij\n";
printf(s);
// Warning raised with gcc -std=c11:
// format not a string literal and no format arguments [-Wformat-security]
// On the other hand, if I use
char* s = "abc %d efg\n";
printf(s, 99);
// I get no warning whatsoever, why is that?
// Update, I've tested this:
char* s = "random %d string\n";
printf(s, 99, 50);
// Results: no warning, output "random 99 string".
So what's the underlying difference between printf(s) and printf("%s", s) and why do I get a warning in just one case?

In the first case, the non-literal format string could perhaps come from user code or user-supplied (run-time) data, in which case it might contain %s or other conversion specifications, for which you've not passed the data. This can lead to all sorts of reading problems (and writing problems if the string includes %n — see printf() or your C library's manual pages).
In the second case, the format string controls the output and it doesn't matter whether any string to be printed contains conversion specifications or not (though the code shown prints an integer, not a string). The compiler (GCC or Clang is used in the question) assumes that because there are arguments after the (non-literal) format string, the programmer knows what they're up to.
The first is a 'format string' vulnerability. You can search for more information on the topic.
GCC knows that most times the single argument printf() with a non-literal format string is an invitation to trouble. You could use puts() or fputs() instead. It is sufficiently dangerous that GCC generates the warnings with the minimum of provocation.
The more general problem of a non-literal format string can also be problematic if you are not careful — but extremely useful assuming you are careful. You have to work harder to get GCC to complain: it requires both -Wformat and -Wformat-nonliteral to get the complaint.
From the comments:
So ignoring the warning, as if I really know what I am doing and there will be no errors, is one or another more efficient to use or are they the same? Considering both space and time.
Of your three printf() statements, given the tight context that the variable s is as assigned immediately above the call, there is no actual problem. But you could use puts(s) if you omitted the newline from the string or fputs(s, stdout) as it is and get the same result, without the overhead of printf() parsing the entire string to find out that it is all simple characters to be printed.
The second printf() statement is also safe as written; the format string matches the data passed. There is no significant difference between that and simply passing the format string as a literal — except that the compiler can do more checking if the format string is a literal. The run-time result is the same.
The third printf() passes more data arguments than the format string needs, but that is benign. It isn't ideal, though. Again, the compiler can check better if the format string is a literal, but the run-time effect is practically the same.
From the printf() specification linked to at the top:
Each of these functions converts, formats, and prints its arguments under control of the format. The format is a character string, beginning and ending in its initial shift state, if any. The format is composed of zero or more directives: ordinary characters, which are simply copied to the output stream, and conversion specifications, each of which shall result in the fetching of zero or more arguments. The results are undefined if there are insufficient arguments for the format. If the format is exhausted while arguments remain, the excess arguments shall be evaluated but are otherwise ignored.
In all these cases, there is no strong indication of why the format string is not a literal. However, one reason for wanting a non-literal format string might be that sometimes you print the floating point numbers in %f notation and sometimes in %e notation, and you need to choose which at run-time. (If it is simply based on value, %g might be appropriate, but there are times when you want the explicit control — always %e or always %f.)

The warning says it all.
First, about the issue itself: as per its signature, the first parameter of printf() is a format string, which can contain format specifiers (conversion specifications). If the string contains a format specifier and the corresponding argument is not supplied, it invokes undefined behavior.
So, a cleaner (or safer) approach (for printing a string that needs no format specification) would be puts(s); over printf(s); (the former does not process s for any conversion specifiers, removing the reason for the possible UB in the latter case). You can choose fputs() if you're worried about the trailing newline that puts() adds automatically.
That said, regarding the warning option, -Wformat-security from the online gcc manual
At present, this warns about calls to printf and scanf functions where the format string is not a string literal and there are no format arguments, as in printf (foo);. This may be a security hole if the format string came from untrusted input and contains %n.
In your first case, there's only one argument supplied to printf(), and it is not a string literal but a variable, which can very well be generated or populated at run time; if it contains unexpected format specifiers, it may invoke UB. The compiler has no way to check for the presence of format specifiers in it. That is the security problem there.
In the second case, an accompanying argument is supplied, so the format string is not the only argument passed to printf() and the compiler does not insist on verifying it. Hence there is no warning.
Update:
Regarding the third one, with more arguments than the supplied format string requires:
printf(s, 99, 50);
quoting from C11, chapter §7.21.6.1
[...] If the format is exhausted while arguments remain, the excess arguments are
evaluated (as always) but are otherwise ignored. [...]
So, passing excess argument is not a problem (from the compiler perspective) at all and it is well defined. NO scope for any warning there.

There are two things in play in your question.
The first is covered succinctly by Jonathan Leffler - the warning you're getting is because the format string isn't a literal and no format arguments follow it.
The other is the mystery of why the compiler doesn't issue a warning that your number of arguments doesn't match the number of specifiers. The short answer is "because it doesn't," but more specifically, printf is a variadic function. It takes any number of arguments after the initial format specification - from 0 on up. The compiler can't check to see if you gave the right amount; that's up to the printf function itself, and leads to the undefined behavior that Joachim mentioned in comments.
EDIT:
I'm going to give further answer to your question, as a means of getting on a small soapbox.
What's the difference between printf(s) and printf("%s", s)? Simple - in the latter, you're using printf as it's declared. "%s" is a const char *, and it will subsequently not generate the warning message.
In your comments to other answers, you mentioned "Ignoring the warning...". Don't do this. Warnings exist for a reason, and should be resolved (otherwise they're just noise, and you'll miss warnings that actually matter among the cruft of all the ones that don't.)
Your issue can be resolved in several ways.
const char* s = "abcdefghij\n";
printf(s);
will resolve the warning, because you're now using a const pointer, and there are none of the dangers that Jonathan mentioned. (You could also declare it as const char* const s, but don't have to. The first const is important, because it then matches the declaration of printf, and because const char* s means that characters pointed to by s can't change, i.e. the string is a literal.)
Or, even simpler, just do:
printf("abcdefghij\n");
This is implicitly a const pointer, and also not a problem.

So what's the underlying difference between printf(s) and printf("%s", s)
"printf(s)" will treat s as a format string. If s contains format specifiers then printf will interpret them and go looking for varargs. Since no varargs actually exist this will likely trigger undefined behaviour.
If an attacker controls "s" then this is likely to be a security hole.
printf("%s",s) will just print what is in the string.
and why do I get a warning in just one case?
Warnings are a balance between catching dangerous stupidity and not creating too much noise.
C programmers are in the habit of using printf and various printf-like functions* as generic print functions even when they don't actually need formatting. In this environment it's easy for someone to make the mistake of writing printf(s) without thinking about where s came from. Since formatting is pretty useless without any data to format, printf(s) has little legitimate use.
printf(format, arguments), on the other hand, indicates that the programmer deliberately intended formatting to take place.
Afaict this warning is not turned on by default in upstream gcc, but some distros are turning it on as part of their efforts to reduce security holes.
* Both standard C functions like sprintf and fprintf and functions in third party libraries.

The underlying reason: printf is declared like:
int printf(const char *fmt, ...) __attribute__ ((format(printf, 1, 2)));
This tells gcc that printf is a function with a printf-style interface where the format string comes first. IMHO it must be literal; I don't think there's a way to tell the good compiler that s is actually a pointer to a literal string it had seen before.
Read more about __attribute__ here.

Related

What is the point of format specifier in C?

What is the point of format specifier in C if we have already set the type of the variable before printf?
For example:
#include <stdio.h>
int main(void)
{
    int a = 7;
    printf("%d", a);
}
Like, it's already stated what a is: it's an integer (int). So what is the point of adding %d to specify that it's an integer?
The answer to this question really only makes sense in the context of C's history.
C is, by now, a pretty old language. Though undoubtedly a "high level language", it is famously low-level as high-level languages go. And its earliest compiler was deliberately and self-consciously small and simple.
In its first incarnation, C did not enforce type safety during function calls. For example, if you called sqrt(144), you got the wrong answer, because sqrt expects an argument of type double, but 144 is an int. It was the programmer's responsibility to call a function with arguments of the correct types: the compiler did not know (did not even attempt to keep track of) the arguments expected by each function, so it did not and could not perform automatic conversions. (A separate program, lint, could check that functions were called with the correct arguments.)
C++ corrected this deficiency, by introducing the function prototype. These were inherited by C in the first ANSI C standard in 1989. However, a function prototype only works for a function that expects a single, fixed argument list, meaning that it can't help for functions that accept a variable number of arguments, the premier example being: printf.
The other thing to remember is that, in C, printf is a more or less ordinary function. ("Ordinary" other than accepting a variable number of arguments, that is.) So the compiler has no direct mechanism to notice the types of the arguments and make that list of types available to printf. printf has no way of knowing, at run time, what types were passed during any given call; it can only rely (it must rely) on the clues provided in the format string. (This is by contrast to languages, many of them, where the print statement is an explicit part of the language parsed by the compiler, meaning that the compiler can do whatever it needs to do in order to treat each argument properly according to its known type.)
So, by the rules of the language (which are constrained by backwards compatibility and the history of the language), the compiler can't do anything special with the arguments in a printf call, other than performing what is called the default argument promotions. So the compiler can't fix things (can't perform the "correct" implicit conversion) if you write something like
int a = 7;
printf("%f", a);
This is, admittedly, an uncomfortable situation. These days, programmers are used to the protections and the implicit promotions provided for by function prototypes. If, these days, you can call
int x = sqrt(144);
and have the right thing happen, why can't you similarly call
printf("%f\n", 144);
Well, you can't, although a good, modern compiler will try to help you out anyway. Although the compiler doesn't have to inspect the format string (because that's printf's job to do, at run time), and the compiler isn't allowed to insert any implicit conversions (other than the default promotions, which don't help here), a compiler can duplicate printf's logic, inspect the format string, and issue strong warnings if the programmer makes a mistake. For example, given
printf("%f\n", 144);
gcc prints "warning: format ‘%f’ expects argument of type ‘double’, but argument 2 has type ‘int’", and clang prints "warning: format specifies type 'double' but the argument has type 'int'".
In my opinion, this is a fine compromise, balancing C's legacy behavior with modern expectations.
what is the point of adding %d to specify that it's an integer?
printf() is a function which receives a variable number of arguments of various type after the format argument. It does not directly know the number nor the type of arguments passed nor received.
The caller knows the argument count and types it gives to printf().
To pass the argument count and type information, the caller uses the format argument to encode the argument count and types. printf() decodes that format to learn the argument count and types. It is very important that the format and the arguments that follow are consistent.
printf() accepts a variable number of arguments. To process those variable arguments, it (via va_start()) needs to know what the last fixed argument is. It (via va_arg()) also needs to know the type of each argument so that it can figure out how much data to read.
The format specifier is also a compact template (or DSL) to express how text and variables should be formatted including field width, alignment, precision, encoding.

Choose the lesser evil incorrect Printf() statements: Fewer parameters vs extra parameters

A. printf("Values: X=%s Y=%s\n", x,y,z);
B. printf("Values: x=%s, Y=%s\n", x);
Both of the above printf() statements are incorrect: one has extra parameters, other has fewer parameters. I would like to choose between the lesser evil with an explanation. Can a modern C compiler help catch such problems? If yes, how does printf() implementor need to assist the compiler?
Both of the above printf() statements are incorrect: one has extra parameters, other has fewer parameters.
The first one is not incorrect according to the C standard. The rules for function calls in general, in C 2018 6.5.2.2, do not make it an error to pass unused arguments for a ... in the function prototype. For printf specifically, C 2018 7.21.6.1 2 (about fprintf, which the specification for printf refers to) says extra arguments are harmless:
… If the format is exhausted while arguments remain, the excess arguments are evaluated (as always) but are otherwise ignored…
Certainly if a programmer writes printf("Values: X=%s. Y=%s.\n", x, y, z);, they might have made a mistake, and a compiler would be reasonable in pointing out this possibility. However, consider code such as:
printf(ComputedFormat, x, y, z);
Here it is reasonable that we wish to print different numbers of values in different circumstances, and the ComputedFormat reflects this. It would be tedious to write code for each case and dispatch to them with a switch statement. It is simpler to write one call and let the computed format determine how many values are printed. So it is not always an error to have more arguments than the conversion specifications use.
I would like to choose between the lesser evil with an explanation.
The behavior of the latter code is not defined by the C standard. C 2018 7.21.6.1 2 also says:
… If there are insufficient arguments for the format, the behavior is undefined…
Thus, no behavior may be relied on from the latter code, unless there is some guarantee from the C implementation.
Can a modern C compiler help catch such problems?
Good modern C compilers have information about the specification of printf and, when the format argument is a string literal, they compare the number and types of the arguments to the conversion specifications in the string.
If yes, how does printf() implementor need to assist the compiler?
The implementor of printf does not need to do anything except conform to the specification of printf in the C standard. The aid described above is performed by the C compiler with reference to the C standard; it does not rely on features of the particular printf implementation.
On some platforms, information about the number of arguments passed is provided to the called routine. On such platforms, a printf implementor could check whether too few arguments are provided and signal an error in some manner.
Eric Postpischil has already made a great answer that uses the most reliable source (the C standard), but I just want to post my own answer about why printf may behave as it does in both cases.
printf is a variadic function which can take a variable number of arguments. The way it knows how many you have passed is solely through the format string; every time it finds a format specifier, it takes the next argument out of the list (and assumes its type from which specifier has been used). Nothing really happens to any extra arguments: since there is no specifier for them, the function will not even try to take them, and they will not be printed. So you may be warned about the extra arguments by the compiler, but the behavior in the first example is well-defined.
The second, on the other hand, is definitely undefined behavior. Since there are not enough arguments to match the number of format specifiers in the string, eventually when it finds the second %s, it will try to take the next variadic argument, but the issue is that you haven't passed any. When this happens for me, it prints some garbage value in place of the format specifier that doesn't look too nice. Anything could happen in undefined behavior though. In this case, the function seems to try to take the next variadic argument from a CPU register / the stack (memory) and fetches some garbage value that happened to be there (though again, anything could happen with undefined behavior).
So in short:
printf("%s\n", "Hello", "World");
        |         |     ^^^^^^^ Ignored
        -----------
and
printf("%s\n");  ?
        |        |
        ----------

Why is printf with a single argument (without conversion specifiers) deprecated?

In a book that I'm reading, it's written that printf with a single argument (without conversion specifiers) is deprecated. It recommends to substitute
printf("Hello World!");
with
puts("Hello World!");
or
printf("%s", "Hello World!");
Can someone tell me why printf("Hello World!"); is wrong? It is written in the book that it contains vulnerabilities. What are these vulnerabilities?
printf("Hello World!"); is IMHO not vulnerable but consider this:
const char *str;
...
printf(str);
If str happens to point to a string containing %s format specifiers, your program will exhibit undefined behaviour (mostly a crash), whereas puts(str) will just display the string as is.
Example:
printf("%s"); //undefined behaviour (mostly crash)
puts("%s"); // displays "%s\n"
printf("Hello world");
is fine and has no security vulnerability.
The problem lies with:
printf(p);
where p is a pointer to an input that is controlled by the user. It is prone to format string attacks: the user can insert conversion specifications to take control of the program, e.g., %x to dump memory or %n to overwrite memory.
Note that puts("Hello world") is not equivalent in behavior to printf("Hello world") but to printf("Hello world\n"). Compilers usually are smart enough to optimize the latter call to replace it with puts.
Further to the other answers, printf("Hello world! I am 50% happy today") is an easy bug to make, potentially causing all manner of nasty memory problems (it's UB!).
It's just simpler, easier and more robust to "require" programmers to be absolutely clear when they want a verbatim string and nothing else.
And that's what printf("%s", "Hello world! I am 50% happy today") gets you. It's entirely foolproof.
(Steve, of course printf("He has %d cherries\n", ncherries) is absolutely not the same thing; in this case, the programmer is not in "verbatim string" mindset; she is in "format string" mindset.)
I'll just add a bit of information regarding the vulnerability part here.
It's said to be vulnerable because of the printf format string vulnerability. In your example, where the string is hardcoded, it's harmless (even if hardcoding strings like this is never fully recommended). But specifying the parameter types is a good habit to get into. Take this example:
If someone puts format specifiers into your printf instead of a regular string (say, if you print the program's stdin), printf will read whatever it can from the stack.
This was (and still is) widely used to exploit programs by exploring the stack to access hidden information or bypass authentication, for example.
Example (C):
#include <stdio.h>
int main(int argc, char *argv[])
{
    printf(argv[argc - 1]); // unsafe: uses the last command-line argument as the format string
    return 0;
}
If I give this program "%08x %08x %08x %08x %08x\n" as input, the call effectively becomes:
printf ("%08x %08x %08x %08x %08x\n");
This instructs the printf-function to retrieve five parameters from the stack and display them as 8-digit padded hexadecimal numbers. So a possible output may look like:
40012980 080628c4 bffff7a4 00000005 08059c04
See this for a more complete explanation and other examples.
This is misguided advice. Yes, if you have a run-time string to print,
printf(str);
is quite dangerous, and you should always use
printf("%s", str);
instead, because in general you can never know whether str might contain a % sign. However, if you have a compile-time constant string, there's nothing whatsoever wrong with
printf("Hello, world!\n");
(Among other things, that is the most classic C program ever, literally from the C programming book of Genesis. So anyone deprecating that usage is being rather heretical, and I for one would be somewhat offended!)
Calling printf with literal format strings is safe and efficient, and there
exist tools to automatically warn you if your invocation of printf with user
provided format strings is unsafe.
The most severe attacks on printf take advantage of the %n format
specifier. In contrast to all other format specifiers, e.g. %d, %n actually
writes a value to a memory address provided in one of the format arguments.
This means that an attacker can overwrite memory and thus potentially take
control of your program. Wikipedia
provides more detail.
If you call printf with a literal format string, an attacker cannot sneak
a %n into your format string, and you are thus safe. In fact,
gcc will change your call to printf into a call to puts, so there literally
isn't any difference (test this by running gcc -O3 -S).
If you call printf with a user provided format string, an attacker can
potentially sneak a %n into your format string, and take control of your
program. Your compiler will usually warn you that this is unsafe, see
-Wformat-security. There are also more advanced tools that ensure that
an invocation of printf is safe even with user provided format strings, and
they might even check that you pass the right number and type of arguments to
printf. For example, for Java there is Google's Error Prone
and the Checker Framework.
A rather nasty aspect of printf is that even on platforms where the stray memory reads could only cause limited (and acceptable) harm, one of the formatting characters, %n, causes the next argument to be interpreted as a pointer to a writable integer, and causes the number of characters output thus far to be stored to the variable identified thereby. I've never used that feature myself, and sometimes I use lightweight printf-style methods which I've written to include only the features I actually use (and don't include that one or anything similar) but feeding standard printf functions strings received from untrustworthy sources may expose security vulnerabilities beyond the ability to read arbitrary storage.
Since no one has mentioned, I'd add a note regarding their performance.
Under normal circumstances, assuming no compiler optimisations are used (i.e. printf() actually calls printf() and not fputs()), I would expect printf() to perform less efficiently, especially for long strings. This is because printf() has to parse the string to check if there are any conversion specifiers.
To confirm this, I have run some tests. The testing is performed on Ubuntu 14.04, with gcc 4.8.4. My machine uses an Intel i5 cpu. The program being tested is as follows:
#include <stdio.h>
int main() {
int count = 10000000;
while(count--) {
// either
printf("qwertyuiopasdfghjklzxcvbnmQWERTYUIOPASDFGHJKLZXCVBNM");
// or
fputs("qwertyuiopasdfghjklzxcvbnmQWERTYUIOPASDFGHJKLZXCVBNM", stdout);
}
fflush(stdout);
return 0;
}
Both are compiled with gcc -Wall -O0. Time is measured using time ./a.out > /dev/null. The following is the result of a typical run (I've run them five times, all results are within 0.002 seconds).
For the printf() variant:
real 0m0.416s
user 0m0.384s
sys 0m0.033s
For the fputs() variant:
real 0m0.297s
user 0m0.265s
sys 0m0.032s
This effect is amplified if you have a very long string.
#include <stdio.h>
#define STR "qwertyuiopasdfghjklzxcvbnmQWERTYUIOPASDFGHJKLZXCVBNM"
#define STR2 STR STR
#define STR4 STR2 STR2
#define STR8 STR4 STR4
#define STR16 STR8 STR8
#define STR32 STR16 STR16
#define STR64 STR32 STR32
#define STR128 STR64 STR64
#define STR256 STR128 STR128
#define STR512 STR256 STR256
#define STR1024 STR512 STR512
int main() {
int count = 10000000;
while(count--) {
// either
printf(STR1024);
// or
fputs(STR1024, stdout);
}
fflush(stdout);
return 0;
}
For the printf() variant (ran three times, real plus/minus 1.5s):
real 0m39.259s
user 0m34.445s
sys 0m4.839s
For the fputs() variant (ran three times, real plus/minus 0.2s):
real 0m12.726s
user 0m8.152s
sys 0m4.581s
Note: After inspecting the assembly generated by gcc, I realised that gcc optimises the fputs() call to an fwrite() call, even with -O0. (The printf() call remains unchanged.) I am not sure whether this will invalidate my test, as the compiler calculates the string length for fwrite() at compile-time.
For gcc it is possible to enable specific warnings for checking printf() and scanf().
The gcc documentation states:
-Wformat is included in -Wall. For more control over some aspects
of format checking, the options -Wformat-y2k,
-Wno-format-extra-args, -Wno-format-zero-length,
-Wformat-nonliteral, -Wformat-security, and -Wformat=2 are
available, but are not included in -Wall.
The -Wformat which is enabled within the -Wall option does not enable several special warnings that help to find these cases:
-Wformat-nonliteral will warn if you do not pass a string literal as the format string.
-Wformat-security will warn if you pass a string that might contain a dangerous construct. It's a subset of -Wformat-nonliteral.
I have to admit that enabling -Wformat-security revealed several bugs we had in our codebase (the logging module, error handling module and XML output module all had some functions that could do undefined things if they were called with % characters in their parameters). For info, our codebase is now around 20 years old, and even though we were aware of these kinds of problems, we were extremely surprised, when we enabled these warnings, by how many of these bugs were still in the codebase.
printf("Hello World\n")
automatically compiles to the equivalent
puts("Hello World")
You can check it by disassembling your executable:
push rbp
mov rbp,rsp
mov edi,str.Helloworld!
call dword imp.puts
mov eax,0x0
pop rbp
ret
using
char *variable;
...
printf(variable)
will lead to security issues, don't ever use printf that way!
So your book is partly right: using printf with a single variable as its only argument should be avoided, but you can still use printf("my string\n"), because it will automatically become puts.
Beside the other well-explained answers with any side-concerns covered, I would like to give a precise and concise answer to the provided question.
Why is printf with a single argument (without conversion specifiers) deprecated?
A printf call with a single argument is not deprecated in general, and it has no vulnerabilities when used properly, as you always should code.
C users all over the world, from beginners to experts, use printf that way to output a simple text phrase to the console.
Furthermore, one has to distinguish whether this one and only argument is a string literal or a pointer to a string, which is valid but not commonly used. For the latter, of course, there can be unwanted output or any kind of undefined behavior when the pointer is not set properly to point to a valid string, but these things can also occur with multiple arguments if the format specifiers do not match the respective arguments.
Of course, it is also not right and proper for the string provided as the one and only argument to contain any format or conversion specifiers, since no conversion is going to happen.
That said, giving a simple string literal like "Hello World!" as only argument without any format specifiers inside that string like you provided it in the question:
printf("Hello World!");
is not deprecated or "bad practice" at all nor has any vulnerabilities.
In fact, many C programmers began to learn C, or even programming in general, with that Hello World program, with this printf statement among the first they ever wrote.
That would not be the case if it were deprecated.
In a book that I'm reading, it's written that printf with a single argument (without conversion specifiers) is deprecated.
Well, then I would focus on the book or the author. If an author really makes such, in my opinion, incorrect assertions, and even teaches them without explicitly explaining why (if those assertions really appear in the book as you describe), I would consider it a bad book. A good book, by contrast, explains why to avoid certain programming methods or functions.
According to what I said above, using printf with only one argument (a string literal) and without any format specifiers is not in any case deprecated or considered "bad practice".
You should ask the author what he meant by that or, even better, ask him to clarify or correct the relevant section in the next edition or printing.

Valid printf() statements in C

Given that:
char *message = "Hello, World";
char *format = "x=%i\n";
int x = 10;
Why is printf (message); invalid (i.e. rejected by compiler for being potentially insecure) and printf (format, x); isn't?
Is format treated as a string literal in this case and message as a format string? If so, why?
Update
I know why printf (message); is rejected. My question is, why is printf (format, x); not rejected too.
I'm using clang. The error message for printf (message); is format string is not string literal (potentially insecure).
It compiles fine under gcc. So it does appear to be compiler specific and to do with how clang sets its warnings.
It is a compiler limitation.
If it is known at compile time that the pointer is pointing at a string literal, then the compiler could check the specifiers and omit the warning.
There is no special reason why you are getting a warning for one and not the other. The standard doesn't specify anything relevant to this issue. It is just how the compiler is implemented. A different one might warn in both cases, or in neither.
You can get a warning in both cases by enabling the -Wformat-nonliteral option, which is not included in either -Wall or -Wextra (but it is in -Weverything).
For whatever reason, this seems like an intentional design decision to emit a security warning only when the non-literal printf statement takes no additional arguments. The source code which emits this warning can be found in lib/Sema/SemaChecking.cpp:
// If there are no arguments specified, warn with -Wformat-security, otherwise
// warn only with -Wformat-nonliteral.
if (Args.size() == firstDataArg)
  Diag(Args[format_idx]->getLocStart(),
       diag::warn_format_nonliteral_noargs)
    << OrigFormatExpr->getSourceRange();
else
  Diag(Args[format_idx]->getLocStart(),
       diag::warn_format_nonliteral)
    << OrigFormatExpr->getSourceRange();
I'd guess that this is for compatibility with existing legacy code, but that's pure speculation.
If you pass arguments to be formatted to printf, it expects you know that the first argument is going to be a format string. If it weren’t a format string, well, what would it do with all those extra arguments? It’s a reasonable inference.
On the other hand, say we specified no data to be formatted beyond the format string itself. Then what if it did have format specifiers in it? It would always be erroneous unless no format specifiers were in it, and so since there is a safe alternative for this situation, it’s warning you.
Clang has -Wformat-security on by default. You can suppress the warning by passing the option -Wno-format-security while compiling. That should let you compile printf(message);.
You can find the description here.
From the gcc docpage:
-Wformat-security:
If -Wformat is specified, also warn about uses of format functions that represent possible security problems. At present, this warns about calls to printf and scanf functions where the format string is not a string literal and there are no format arguments, as in printf (foo);. This may be a security hole if the format string came from untrusted input and contains '%n'. (This is currently a subset of what -Wformat-nonliteral warns about, but in future warnings may be added to -Wformat-security that are not included in -Wformat-nonliteral.)
So, if there are no format arguments, it simply checks that the format string is a string literal. And here's the definition of a string literal. A char * or char[] variable is not itself a string literal, even when it is initialized from one, which is why message triggers the warning. Note that declaring message as a const char[] may let the compiler see its contents and treat it like a literal for this check, though that depends on the compiler.
Coming to the question you raised in the comment, printf("%fHello") will generate a warning if -Wformat is set.
The function prototype for printf is declared as:
int printf (const char * format, ...);
where ... is a variable argument list (i.e. zero or more additional arguments).
format is always treated as a format string. If you want to print message using printf(), you can try printf (format, message); where format equals "%s".

C char array is not a string pattern?

I am having a compilation error on the following code:
printf((char *) buffer);
and the error message that I am getting is:
cc1: format not a string literal and no format arguments...
I suspect there are some libraries that I forgot to install, as I was able to compile and run the code without an error on another machine...
PS: The question arises from the fact that I was able to run the same code on another machine... I suspect a difference in gcc version might cause a problem like this?
Newer GCC versions try to parse the format string passed to printf and similar functions and determine if the argument list correctly matches the format string. It can't do this because you've passed it a buffer for the first argument, which would normally be a format string.
Your code is not incorrect C, it's just a poor usage of C. As others mentioned, you should use "%s" as a format string to print a single string. This protects you from a class of errors that involve percent signs in your string, if you don't control the input. It's a best practice to never pass anything but a string literal as the format argument to the printf or sprintf family of functions.
try
printf("%s", (char*) buffer);
;-)
This warning is generated by gcc if
-Wformat-nonliteral
is set. It's not part of -Wall or -Wextra (at least for version 4.4.0), so just drop it if you want the code to compile warning-free.
This is a warning for your safety, not an error. This new compiler is apparently being more strict about it. I don't think it's actually illegal in C, so the compiler should have an option to disable treating this as an error.
However, you pretty much never want to pass anything other than a string literal as the first argument to printf. Doing so is such a bad idea that the compiler has a special built-in check to warn you about it, and here is why: suppose the non-literal string that you pass as the first argument to printf happens to contain printf formatting characters. printf is then going to try to access second, third, fourth, etc., arguments that you didn't actually pass in, and may well crash your program by trying to do so. If the non-literal first argument is actually user-supplied, then the problem is even worse, since a malicious user could crash your program at will.

Resources