Printf with no arguments explanation - c

I understand that if printf is given no arguments it outputs an unexpected value.
Example:
#include <stdio.h>
int main() {
int test = 4 * 4;
printf("The answer is: %d\n");
return 0;
}
This prints a seemingly random number. After playing around with different formats such as %p, %x, etc., it never prints 16 (because I didn't pass the variable as an argument). What I'd like to know is: where are these values being taken from? Is it the top of the stack? It's not a new value every time I compile, which is weird; it's like a fixed value.

printf("The answer is: %d\n");
invokes undefined behavior. C requires a conversion specifier to have an associated argument. While it is undefined behavior and anything can happen, on most systems you end up dumping the stack. It's the kind of trick used in format string attacks.

It is called undefined behavior and it is scary (see this answer).
If you want an explanation, you need to dive into implementation-specific details. So study the generated assembly code (e.g. compile with gcc -Wall -Wextra -S -fverbose-asm plus your optimization flags, then look into the generated .s assembly file) and the ABI of your system.

The printf function will go looking for the argument where it expects to find it (typically on the stack), even if you don't supply one. Whatever happens to be there will be used as the integer argument. Most of the time you will get nonsensical data. The value you see depends on your compiler and its settings. On some compilers, you may even get 16 as a result.
For example:
int printf(char*, int d){...}
This would be how printf works (not really, just an example). It doesn't report an error if d is missing; it just looks on the stack for the argument that is supposed to be there and displays whatever it finds.

printf is a variadic function. Most compilers push arguments onto the stack and then call the function, but depending on the machine, operating system, calling convention, number of arguments, etc., other values are also pushed onto the stack, and these might be constant in your function.
printf reads this area of memory and prints it.

Related

Why doesn't linking a function of the wrong type return garbage?

I'm going through the K&R C book and chapter 4.2 says the following:
If atof itself and the call to it in main have inconsistent types in the same source file, the error will be detected by the compiler. But if (as is more likely) atof were compiled separately, the mismatch would not be detected, atof would return a double that main would treat as an int, and meaningless answers would result.
I wanted to see for myself what would happen if I created and inconsistently used a function as they describe:
main.c:
#include <stdio.h>
int floatFunction();
int main() {
int result = floatFunction();
printf("result: %d\n", result);
}
floatFunction.c:
float floatFunction() {
return 10;
}
bash:
$ gcc main.c floatFunction.c -o main
$ ./main
result: 0
$
Regardless of what value I had floatFunction return, I always got 0 as a result. I also tried compiling with -O3, with the same result. I was expecting to see the bit pattern of 10.0 interpreted as an int as a result. Why is this not happening?
Also, why doesn't the compilation process notify me that the types don't match and how can I protect myself from making a mistake like this in real code?
There are different registers used for passing (and returning) floating-point vs. integral values (pointers and integers) on x86 at least.
Why the integer return register consistently held zero while the true return value sat in the floating-point return register, I don't know. Probably just luck.
See for example What are the calling conventions for UNIX & Linux system calls (and user-space functions) on i386 and x86-64 for more about Linux calling conventions.
Anyway, remember that whatever you observe, at the language level it is simply undefined behavior: anything goes.
I was expecting to see the bit pattern of 10.0 interpreted as an int as a result. Why is this not happening?
The most likely reason is that on your platform (which you didn't specify) floating point values are returned in a different register from the one in which integer values are returned.
For example, on x86_64 an integer result is returned in the rax register, while double and float results are returned in the xmm0 register.
Here is an article on various x86 calling conventions.
If you are not on x86_64, you'll need to look up the calling conventions for your platform.
Also, why doesn't the compilation process notify me that the types don't match
The compilation process only sees one source file (compilation unit) at a time, so it can't warn you.
The linker could, but only the AIX linker actually does (of the ones I know).
how can I protect myself from making a mistake like this in real code?
By declaring your functions in a header file and including that header in both compilation units.

Printing statement in c

What if we just give format string in printf statement in c like:
printf("%d, %d, %d",a, b);
What does the third %d give in the output?
I tried it but am not able to understand the output of the code.
The third %d results in undefined behavior because there is no corresponding argument.
From the C Standard (7.21.6.1 The fprintf function)
9 If a conversion specification is invalid, the behavior is
undefined. If any argument is not the correct type for the
corresponding conversion specification, the behavior is undefined.
Note that the name of the standard function is printf, not Printf.
In C, functions that take variadic arguments (i.e. the ... parameter) have no way of knowing beforehand the number or type of the arguments. They must be kept track of in some way. A separate length parameter is one way, but the printf-family functions use the number of format specifiers in the format string to keep track.
If you tell the function "hey, there's a third parameter" when you don't pass one, this is undefined behavior. Anything can happen. It may appear to not print anything. It may read a garbage value from the memory location or the register where it expects to find the value. It may crash.
Reasoning about what might happen when your code invokes undefined behavior is a waste of time. Just make sure your code is free of it.
You've told printf to expect 3 additional int arguments, but only passed 2. It's going to look for that third int argument somewhere, and depending on how printf is implemented on your system, you may get a runtime error, or you may see garbage output, or you may see no output, or something entirely different may happen.
Officially, the behavior is left undefined - neither the compiler nor the runtime environment the program is executing in are required to handle the situation in any particular way. The result isn't guaranteed to be predictable or repeatable.
There is no "should" here - any result is "correct" as far as the language definition is concerned.

In C, why is %s working without giving it a value?

According to my knowledge and some threads like this, if you want to print strings in C you have to do something like this:
printf("%s some text", value);
And the value will be displayed instead of %s.
I wrote this code:
char password[] = "default";
printf("Enter name: \n");
scanf("%s", password);
printf("%s is your password", password); // All good - the print is as expected
But I noticed that I can do the exact same thing without the value part and it will still work:
printf("%s is your password");
So my question is why does the %s placeholder get a value without me giving it one, and how does it know what value to give it?
This is undefined behavior: anything can happen, including something that looks correct. But it is incorrect.
Your compiler can probably tell you about the problem if you enable the right warning options.
The Standard says (emphasis mine):
7.21.6.1 The fprintf function
The fprintf function writes output to the stream pointed to by stream,
under control of the string pointed to by format that specifies how
subsequent arguments are converted for output. If there are
insufficient arguments for the format, the behavior is undefined. If
the format is exhausted while arguments remain, the excess arguments
are evaluated (as always) but are otherwise ignored. The fprintf
function returns when the end of the format string is encountered.
The printf() function uses a C language feature that lets you pass a variable number of arguments to a function. (Technically called 'variadic functions' - https://en.cppreference.com/w/c/variadic - I'll just say 'varargs' for short.)
When a function is called in C, the arguments to the function are pushed onto the stack(*) - but the design of the varargs feature provides no way for the called function to know how many parameters were passed in.
When the printf() function executes, it scans the format string, and the %s tells it to look for a string in the next position in the variable argument list. Since there are no more arguments in the list, the code 'walks off the end of the array' and grabs the next thing it sees in memory. I suspect what's happening is that the next location in memory still has the address of password from your prior call to scanf, and since that address points to a string, and you told printf to print a string, you got lucky, and it worked.
Try putting another function call (for example: printf("%s %s %s\n","X","Y","Z");) in between your call to scanf("%s", password); and printf("%s is your password"); and you will almost certainly see different behavior.
Free Advice: C has a lot of sharp corners and undefined bits, but a good compiler (and static analysis or 'lint' tool) can warn you about a lot of common errors. If you are going to work in C, learn how to crank your compiler warnings to the max, learn what all the errors and warnings mean (as they happen, not all at once!) and force yourself to write C code that compiles without any warnings. It will save you a lot of unnecessary hassle.
(*) generalizing here for simplicity - sometimes arguments can be passed in registers, sometimes things are inlined, blah blah blah.
So, there are a lot of posts saying that you shouldn't do printf("%s is your password");, and that you were just lucky. I guess from your question that you somewhat knew that. But few are telling you the probable reason why you were lucky.
To understand what probably happened, we have to understand how function parameters are passed. The caller of a function must put the parameters in an agreed-upon place for the function to find them. So for parameters 1...N we call these places r1 ... rN. (This kind of agreement is part of something we call a "Function Calling Convention".)
That means that this code:
scanf("%s", password);
printf("%s is your password",password);
may be turned into this pseudo-code by the compiler
r1="%s";
r2=password;
call scanf;
r1="%s is your password";
r2=password;
call printf;
If you now remove the second parameter from the printf call, your pseudo-code will look like this:
r1="%s";
r2=password;
call scanf;
r1="%s is your password";
call printf;
Be aware that after call scanf;, r2 might be unmodified and still be set to password, therefore call printf; "works"
You might think that you have discovered a new way to optimize code, by eliminating one of the r2=password; assignments. This might be true for old "dumb" compilers, but not for modern ones.
Modern compilers will already do this when it is safe. And it is not always safe. Reasons why it might not be safe include that scanf and printf have different calling conventions, that r2 might have been modified behind your back, etc.
To get a better feeling for what the compiler is doing, I recommend looking at the assembly output from your compiler at different optimization levels.
And please, always compile with -Wall. The compiler is often good at telling you when you are doing dumb stuff.

Why is a random integer output when a "more '%' conversions than data arguments" warning occurs in C?

I took the age variable out of the printf() call just to see what happens, then compiled it with make. It only emits warnings about more % conversions than data arguments and about the unused age variable, but no compile error. The executable runs, but every time I run it, it prints a different random integer. I'm wondering what causes this behavior?
#include <stdio.h>
int main(int argc, char *arg[]) {
int age = 10;
int height = 72;
printf("I'm %d years old\n");
printf("I'm %d inches tall\n", height);
return 0;
}
As per the printf() specification, if there are insufficient arguments for the format specifiers, the call invokes undefined behavior.
So, your code
printf("I'm %d years old\n");
which is missing the required argument for %d, invokes UB and is not guaranteed to produce any valid result.
Cross reference, C11 standard, chapter §7.21.6.1
[..] If there are insufficient arguments for the format, the behavior is
undefined. [..]
According to the C Standard (7.21.6.1 The fprintf function - the same is valid for printf)
...If there are insufficient arguments for the format, the behavior is undefined. If the format is exhausted while arguments
remain, the excess arguments are evaluated (as always) but are
otherwise ignored.
On 32-bit x86, printf typically uses the cdecl calling convention, which passes arguments on the stack. If the format string implies that there is one argument, it will be read from the runtime stack, and if you didn't put your number there, that location will probably contain some garbage data. So the value that gets printed is some arbitrary data.
With only one exception I know of, the C Standard imposes no requirements with regard to any action which in some plausible implementations might be usefully trapped. It is not hard to imagine a C compiler passing a variadic function like printf an indication of what arguments it has passed, nor would it be hard to imagine an implementer thinking that it could be useful to have the compiler trigger a trap if code tries to retrieve a variadic parameter of some type when the corresponding argument is of some other type or doesn't exist at all. Because it could be useful to have compilers trap in such cases, and because the behavior of such a trap would be outside the jurisdiction of the Standard, the Standard imposes no requirements about what may or may not happen when a variadic function tries to receive arguments which weren't passed to it.
In practice, rather than letting variadic functions know how many arguments they've received, most compilers simply have conventions which describe a relationship between the location of the non-variadic argument and the locations of subsequent variadic arguments. The generated code won't know whether a function has received e.g. two arguments of type int, but it will know that each such argument, if it exists, will be stored in a certain place. On such a compiler, using excess format specifiers will generally result in the generated code looking at the places where additional arguments would have been stored had they existed. In many cases, this location will have been used for some other purpose and then abandoned, and may hold the last value stored there for that purpose, but there is generally no reason to expect anything in particular about the contents of abandoned memory.

Turbo C++: Why does printf print expected values, when no variables are passed to it?

A question was asked in a multiple choice test: What will be the output of the following program:
#include <stdio.h>
int main(void)
{
int a = 10, b = 5, c = 2;
printf("%d %d %d\n");
return 0;
}
and the choices were various permutations of 10, 5, and 2. For some reason, it works in Turbo C++, which we use in college. However, it doesn't when compiled with gcc (which gives a warning when -Wall is enabled) or clang (which has -Wformat enabled and gives a warning by default) or in Visual C++. The output is, as expected, garbage values. My guess is that it has something to do with the fact that either Turbo C++ is 16-bit, and running on 32-bit Windows XP, or that TCC is terrible when it comes to standards.
The code has undefined behaviour.
In Turbo C++, it just so happens that the three variables live at the exact positions on the stack where the missing printf() argument would be. This results in the undefined behaviour manifesting itself by having the "correct" values printed.
However, you can't reasonably rely on this to be the case. Even the slightest change to your build environment (e.g. different compiler options) could break things in an arbitrarily nasty way.
The answer here is that the program could do anything -- this is undefined behavior. According to printf()'s documentation (emphasis mine):
By default, the arguments are used in the order given, where each '*' and each conversion specifier asks for the next argument (and it is an error if insufficiently many arguments are given).
If your multiple-choice test does not have a choice for "undefined behavior" then it is a flawed test. Under the influence of undefined behavior, any answer on such a multiple-choice test question is technically correct.
It is undefined behaviour, so it could be anything.
Try using
printf("%d %d %d\n", a, b, c);
Reason: local variables are allocated on the stack, and printf in Turbo C++ sees them in the same order in which they were placed on the stack.
Suggestion (from comments):
Understanding why it behaves in a particular way with a particular compiler can be useful in diagnosing problems, but don't make any other use of the information.
What's actually going on is that arguments are normally passed on the call stack. Local variables are also stored on the call stack, and so printf() sees those values, in whatever order the compiler decided to store them there.
This behavior, like many others, is allowed under the umbrella of undefined behavior.
No, it's not related to architecture. It is related to how Turbo C++ handles the stack. Variables a, b, and c are locals and as such are allocated on the stack. printf also expects its values on the stack. Apparently, Turbo C++ does not add anything else to the stack after the locals, so printf is able to take them as parameters. Just coincidence.
