This question already has answers here:
Behaviour of printf when printing a %d without supplying variable name
(6 answers)
Closed 4 years ago.
I stumbled upon this C code in a college test and when I tested it on Dev-C++ 5.11 compiler, it printed random characters. I can't understand how or why this code works. Can someone enlighten me?
int main() {
char a[10] = "%s" ;
printf( a ) ;
}
This question has two parts: the missing quotes and the random characters.
printf() is just a function. You can pass strings and other values to functions as arguments. You don't have to use a literal. You can use both char *a = "something"; printf(a) (passing a variable as an argument) and printf("something") (passing a string literal as an argument).
printf() is also a variadic function. This means it can accept any number of arguments. You can use printf("hello world"), printf("%s", "hello world") and even printf("%s %s", "hello", "world"). Some older compilers don't verify you actually passed the right number of arguments based on the first argument which is the format string. This is why your code compiles even though it's missing an argument. When the program runs the code goes over the format string, sees "%s" and looks for the second argument to print it as a string. Since there is no second argument it basically reads random memory and you get garbage characters.
printf function signature is:
int printf(const char *format, ...);
It expects format string as the first argument and variable number of arguments that are handled and printed based on the format specifiers in the format string. variable a in your question is providing it the format string. Reason for random characters is that the argument for format specifier %s is missing. Following will correctly print a string:
printf( a, "Hello World!" );
A list of format specifiers can be seen here https://en.wikipedia.org/wiki/Printf_format_string
Why does it compile?
Because variadic arguments accepted by printf are processed at run time. Not all compilers do compile time checks for validating arguments against the format string. Even if they do they would at most throw a warning, but still compile the program.
It's using the string "%s" as a format string, and using uninitialized memory as the "data".
The only reason it does "something" is because the compiler was apparently not smart enough to recognize that the format string required one parameter and zero parameters were supplied. Or because compiler warnings were ignored and/or errors were turned off.
Just an FYI for anybody who bumps into this: "Always leave all warnings and errors enabled and fix your code until they're gone" This doesn't guarantee correct behaviour but does make "mysterious" problems less likely.
user_input = "%s%s%s%s%s%s%s%s";
printf("user input is: %s", user_input);
... crash!
The above lines cause an error. I want to write a function which can be used like printf but can sanitize all arguments after the first to make sure they do not contain the % symbol. The function should be used like 'printf' in that it can take any number of arguments and it print outs a string in the same manner. If the other arguments contain the % symbol, I just want that symbol taken out before it is put in the format string.
If this new function were called safe_printf, I would want the behavior to be like this:
user_input = "%s%s%s%s%s%s%s%s";
safe_printf("user input is: %s, user_input);
user input is: ssssssss
It seems like writing a function like this may not be possible, (I can't figure out how to preprocess the char *s in the va_list without knowing how many there are) if that's the case please let me know. Thanks!
The string formatting vulnerability exists if user provided input is passed to the printf(3) function. You can prevent this by not allowing that happen, ever. There is absolutely not reason to have a dynamic format string because you should know ahead of time the required format, and therefore you can put the format string in read only memory through the use of a const char *
Say we are working in C.
if I go ahead and do this:
char *word;
word = "Hello friends";
printf(word);
then XCode tells me that because I'm not using a string literal, that I might have something that is potentially insecure. Does that mean an opening for something to hack my program? If so, how could that happen?
Alternatively, if I do this:
char *word;
word = "Hello friends";
printf("%s", word);
Then XCode raises no flags and I'm fine. What exactly is the difference?
The first argument to printf is not the string you want to print. It's a format string. It can contain formatting instructions which results in the string that is printed once it is combined with further arguments.
Your first example is an uncontrolled format string vulnerability.
The issue is that in the first case if word contains formatting specs (%d, %f, %s, etc.) then printf() will assume those values are on the stack, but in fact they aren't, which could lead to a crash.
Use puts() or fputs() instead if you don't care about formatting.
Hello I'm quite new to C and in a nutshell I was doing the following as part of my assignment in class:
foo (char *var) {
printf(var);
}
I was told that this is bad practice and insecure but did not get much detailed information on this by my tutor. I assume that if the string value of var is controllable by the user it may be used to perform a bufferoverflow? How would I properly harden this code? Do I have to limit the str length or something?
Cheers & Thanks!
You should use:
printf("%s", var);
instead. The way you have it, I could enter %s as my input, and printf would read a random piece of memory as it looked for a string to print. That can cause any amount of unexpected behaviour.
This is UNSAFE because it could lead to a Format String Attack
Well, the first argument of printf is a format string. So the caller of your function could pass:
foo("%d")
and then printf would look for an integer which isn't there, and cause undefined behaviour. One possible fix would be for your function to call:
printf("%s", var);
which would have printf interpret var as a regular string (rather than a format).
Printf has the following signature:
int printf(
const char *format [,
argument]...
);
If the user inputs format characters for example %s all kinds of bad things will happen.
Use puts if you just want to output the string.
What is the use of the %n format specifier in C? Could anyone explain with an example?
Most of these answers explain what %n does (which is to print nothing and to write the number of characters printed thus far to an int variable), but so far no one has really given an example of what use it has. Here is one:
int n;
printf("%s: %nFoo\n", "hello", &n);
printf("%*sBar\n", n, "");
will print:
hello: Foo
Bar
with Foo and Bar aligned. (It's trivial to do that without using %n for this particular example, and in general one always could break up that first printf call:
int n = printf("%s: ", "hello");
printf("Foo\n");
printf("%*sBar\n", n, "");
Whether the slightly added convenience is worth using something esoteric like %n (and possibly introducing errors) is open to debate.)
Nothing printed. The argument must be a pointer to a signed int, where the number of characters written so far is stored.
#include <stdio.h>
int main()
{
int val;
printf("blah %n blah\n", &val);
printf("val = %d\n", val);
return 0;
}
The previous code prints:
blah blah
val = 5
I haven't really seen many practical real world uses of the %n specifier, but I remember that it was used in oldschool printf vulnerabilities with a format string attack quite a while back.
Something that went like this
void authorizeUser( char * username, char * password){
...code here setting authorized to false...
printf(username);
if ( authorized ) {
giveControl(username);
}
}
where a malicious user could take advantage of the username parameter getting passed into printf as the format string and use a combination of %d, %c or w/e to go through the call stack and then modify the variable authorized to a true value.
Yeah it's an esoteric use, but always useful to know when writing a daemon to avoid security holes? :D
From here we see that it stores the number of characters printed so far.
n The argument shall be a pointer to an integer into which is written the number of bytes written to the output so far by this call to one of the fprintf() functions. No argument is converted.
An example usage would be:
int n_chars = 0;
printf("Hello, World%n", &n_chars);
n_chars would then have a value of 12.
So far all the answers are about that %n does, but not why anyone would want it in the first place. I find it's somewhat useful with sprintf/snprintf, when you might need to later break up or modify the resulting string, since the value stored is an array index into the resulting string. This application is a lot more useful, however, with sscanf, especially since functions in the scanf family don't return the number of chars processed but the number of fields.
Another really hackish use is getting a pseudo-log10 for free at the same time while printing a number as part of another operation.
The argument associated with the %n will be treated as an int* and is filled with the number of total characters printed at that point in the printf.
The other day I found myself in a situation where %n would nicely solve my problem. Unlike my earlier answer, in this case, I cannot devise a good alternative.
I have a GUI control that displays some specified text. This control can display part of that text in bold (or in italics, or underlined, etc.), and I can specify which part by specifying starting and ending character indices.
In my case, I am generating the text to the control with snprintf, and I'd like one of the substitutions to be made bold. Finding the starting and ending indices to this substitution is non-trivial because:
The string contains multiple substitutions, and one of the substitutions is arbitrary, user-specified text. This means that doing a textual search for the substitution I care about is potentially ambiguous.
The format string might be localized, and it might use the $ POSIX extension for positional format specifiers. Therefore searching the original format string for the format specifiers themselves is non-trivial.
The localization aspect also means that I cannot easily break up the format string into multiple calls to snprintf.
Therefore the most straightforward way to find the indices around a particular substitution would be to do:
char buf[256];
int start;
int end;
snprintf(buf, sizeof buf,
"blah blah %s %f yada yada %n%s%n yakety yak",
someUserSpecifiedString,
someFloat,
&start, boldString, &end);
control->set_text(buf);
control->set_bold(start, end);
It doesn't print anything. It is used to figure out how many characters got printed before %n appeared in the format string, and output that to the provided int:
#include <stdio.h>
int main(int argc, char* argv[])
{
int resultOfNSpecifier = 0;
_set_printf_count_output(1); /* Required in visual studio */
printf("Some format string%n\n", &resultOfNSpecifier);
printf("Count of chars before the %%n: %d\n", resultOfNSpecifier);
return 0;
}
(Documentation for _set_printf_count_output)
It will store value of number of characters printed so far in that printf() function.
Example:
int a;
printf("Hello World %n \n", &a);
printf("Characters printed so far = %d",a);
The output of this program will be
Hello World
Characters printed so far = 12
Those who want to use %n Format Specifier may want to look at this:
Do Not Use the "%n" Format String Specifier
In C, use of the "%n" format specification in printf() and sprintf()
type functions can change memory values. Inappropriate
design/implementation of these formats can lead to a vulnerability
generated by changes in memory content. Many format vulnerabilities,
particularly those with specifiers other than "%n", lead to
traditional failures such as segmentation fault. The "%n" specifier
has generated more damaging vulnerabilities. The "%n" vulnerabilities
may have secondary impacts, since they can also be a significant
consumer of computing and networking resources because large
guantities of data may have to be transferred to generate the desired
pointer value for the exploit. Avoid using the "%n" format
specifier. Use other means to accomplish your purpose.
Source: link
In my opinion, %n in 1st argument of print function simply record the number of character it prints on the screen before it reach the the %n format code including white spaces and new line character.`
#include <stdio.h>
int main()
{
int i;
printf("%d %f\n%n", 100, 123.23, &i);
printf("%d'th characters printed on the screen before '%%n'", i);
}
output:
100 123.230000
15'th characters printed on the screen before '%n'(with new character).
We can assign the of i in an another way...
As we know the argument of print function:-
int printf(char *control-string, ...);
So, it returns the number the number of characters output. We can assign that return value to i.
#include <stdio.h>
int main()
{
int i;
i = printf("%d %f\n", 100, 123.23);
printf("%d'th characters printed on the screen.", i);
}
%n is C99, works not with VC++.