problem with function printf() - c

Here is my program:
#include <stdio.h>
int main()
{
int a=0x09;
int b=0x10;
unsigned long long c=0x123456;
printf("%x %llx\n",a,b,c);//in "%llx", l is lowercase of 'L', not digit 1
return 0;
}
the output was:
9 12345600000010
I want to know:
How is the function printf() executed?
What happens if the number of arguments doesn't equal the number of format specifiers?
Please help me, and use this program as an example in your explanation.

The problem is that your types don't match. This is undefined behavior.
Your second argument b does not match the type of the format. So what's happening is that printf() is reading past the 4 bytes holding b (printf is expecting an 8-byte operand, but b is only 4 bytes). Therefore you're getting junk. The 3rd argument isn't printed at all since your printf() only has 2 format codes.
Since the arguments are usually passed consecutively (and adjacent) in memory, the 4 extra bytes that printf() is reading are actually the lower 4 bytes of c.
So in the end, the second number that's being printed is equal to b + ((c & 0xffffffff) << 32).
But I want to reiterate: this behavior is undefined. It's just that most systems today behave like this.

If the arguments that you pass to printf don't match the format specification then you get undefined behavior. This means that anything can happen and you cannot reason about the results that you happen to see on your specific system.
In your case, %llx requires an argument of type unsigned long long but you supplied an int. This alone causes undefined behaviour.
It is not an error to pass more arguments to printf than there are format specifiers; the excess arguments are evaluated but ignored.

printf() typically advances a pointer to read one argument at a time according to the format. If the format string contains more conversion specifiers than there are arguments, printf() will output data from unknown memory locations. But if there are more arguments than conversion specifiers, no harm is done. A compiler such as gcc can warn you when the conversion specifiers and the arguments don't match.

Related

Use of format specifiers for conversions

I am unable to deduce the internal happenings inside the machine when we print data using format specifiers.
I was trying to understand the concept of signed and unsigned integers and found the following:
unsigned int b=-12;
printf("%d\n",b); //prints -12
printf("%u\n\n",b); //prints 4294967284
I am guessing that b actually stores the binary version of -12 as 11111111111111111111111111110100.
So, since b is unsigned , b technically stores 4294967284.
But still the format specifier %d causes the binary value of b to be printed as its signed version, i.e. -12.
However,
printf("%f\n",2); //prints 0.000000
printf("%f\n",100); //prints 0.000000
printf("%d\n",3.2); //prints 2147483639
printf("%d\n",3.1); //prints 2147483637
I kind of expected the 2 to be printed as 2.00000 and 3.2 to be printed as 3 as per type conversion norms.
Why does this not happen and what exactly takes place at machine level ?
Mismatching format specifier and argument type (like using the floating point specifier "%f" to print an int value) leads to undefined behavior.
Remember that 2 is an integer value, and vararg functions (like printf) don't really know the types of their arguments. The printf function has to rely on the format specifier and assume the argument is of the specified type.
To better understand how you get the results you get, to understand "the internal happenings", we first must make two assumptions:
The system uses 32 bits for the int type
The system uses 64 bits for the double type
Now what happens with
printf("%f\n",2); //prints 0.000000
is that the printf function sees the "%f" specifier and fetches the next argument as a 64-bit double value. Since the int value you provided in the argument list is only 32 bits, half of the bits in the double value will be unknown. The printf function will then print the (invalid) double value. If you're unlucky, some of the unknown bits might make the value a trap representation, which can cause a crash.
Similarly with
printf("%d\n",3.2); //prints 2147483639
the printf function fetches the next argument as a 32-bit int value, losing half of the bits in the 64-bit double value provided as the actual argument. Exactly which 32 bits are copied into the internal int value depends on endianness. On common platforms int has no trap representations, so no crash happens; just an unexpected value is printed.
what exactly takes place at machine level ?
The stdio.h functions are quite far from the machine level. They provide a standardized abstraction layer on top of various OS APIs, whereas "machine level" would refer to the generated assembler. The behavior you experience is mostly related to details of the C language rather than the machine.
On the machine level, there exist no signed numbers; everything is treated as raw binary data. The compiler can turn raw binary data into a signed number by using an instruction that tells the CPU: "use what's stored at this location and treat it as a signed number". Specifically, as a two's complement signed number on all common computers. But this is irrelevant when explaining why your code misbehaves.
The integer constant 12 is of type int. When we write -12 we apply the unary - operator on that. The result is still of type int but now of value -12.
Then you attempt to store this negative number in an unsigned int. This triggers an implicit conversion to unsigned int, which should be carried out according to the C standard:
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or
subtracting one more than the maximum value that can be represented in the new type
until the value is in the range of the new type
The maximum value of a 32-bit unsigned int is 2^32 - 1, which equals 4294967295. "One more than the maximum" gives 2^32 = 4294967296. If we calculate -12 + 4294967296 we get 4294967284. This is in range of an unsigned int and is the result you see later.
Now as it happens, the printf family of functions is very unsafe. If you provide a wrong format specifier which doesn't match the type, they might crash or display the wrong result etc - the program invokes undefined behavior.
So when you use %d or %i, which are reserved for signed int, but pass an unsigned int, anything can happen. "Anything" includes output that looks as if the passed value had been converted to the type named by the format specifier. That's what happened when you used %d.
When you pass values of types completely mismatching the format specifier, the program just prints gibberish though. Because you are still invoking undefined behavior.
I kind of expected the 2 to be printed as 2.00000 and 3.2 to be printed as 3 as per type conversion norms.
The reason why the printf family can't do anything intelligent like assuming that 2 should be converted to 2.0, is because they are variadic (variable argument) functions. Meaning they can take any number of arguments. In order to make that possible, the parameters are essentially passed as raw binary through something called va_list, and all type information is lost. The printf implementation is therefore left with no type information but the format string you gave it. This is why variadic functions are so unsafe to use.
A regular function has more type safety: if you declare void foo (float f) and pass the integer constant 2 (type int), the compiler will implicitly convert the int to float, and perhaps also give a conversion warning.
The behaviors you observe are the result of printf interpreting the bits given to it as the type specified by the format specifier. In particular, at least for your system:
The bits for an int argument and an unsigned argument in the same position within the argument list would be passed in the same place, so when you give printf one and tell it to format the other, it uses the bits you give it as if they were the bits of the other.
The bits for an int argument and a double argument would be passed in different places—possibly a general register for the int argument and a special floating-point register for the double argument, so when you give printf one and tell it to format the other, it does not get the bits for the double to use for the int; it gets completely unrelated bits that were left lying around by previous operations.
Whenever a function is called, values for its arguments must be placed in certain places. These places vary according to the software and hardware used, and they vary by the type and number of arguments. However, for any particular argument type, argument position, and specific software and hardware used, there is a specific place (or combination of places) where the bits of that argument should be stored to be passed to the function. The rules for this are part of the Application Binary Interface (ABI) for the software and hardware being used.
First, let us neglect any compiler optimization or transformation and examine what happens when the compiler implements a function call in source code directly as a function call in assembly language. The compiler will take the arguments you provide for printf and write them to the places designated for those types of arguments. When printf executes, it examines the format string. When it sees a format specifier, it figures out what type of argument it should have, and it looks for the value of that argument in the place for that type of argument.
Now, there are two things that can happen. Say you passed an unsigned but used a format specifier for int, like %d. In every ABI I have seen, an unsigned and an int argument (in the same position within the list of arguments) are passed in the same place. So, when printf looks for the bits for the int it expects, it will get the bits for the unsigned you passed.
Then printf will interpret those bits as if they encoded the value for an int, and it will print the results. In other words, the bits of your unsigned value are reinterpreted as the bits of an int.1
This explains why you see “-12” when you pass the unsigned value 4,294,967,284 to printf to be formatted with %d. When the bits 11111111111111111111111111110100 are interpreted as an unsigned, they represent the value 4,294,967,284. When they are interpreted as an int, they represent the value −12 on your system. (This encoding system is called two’s complement. Other encoding systems include one’s complement and sign-and-magnitude, in which these bits would represent −11 and −2,147,483,636, respectively. Those systems are rare for plain integer types these days.)
That is the first of two things that can happen, and it is common when you pass the wrong type but it is similar to the correct type in size and nature, so it is passed in the same place as the correct type. The second thing that can happen is that the argument you pass is passed in a different place than the argument that is expected. For example, if you pass a double as an argument, it is, in many systems, placed in a separate set of registers for floating-point values. When printf goes looking for an int argument for %d, it will not find the bits of your double at all. Instead, what it finds in the place where it looks for an int argument might be whatever bits happened to be left in a register or memory location from previous operations, or it might be the bits of the next argument in the list of arguments. In any case, this means that the value printf prints for the %d will have nothing to do with the double value you passed, because the bits of the double are not involved in any way; a completely different set of bits is used.
This is also part of the reason the C standard says it does not define the behavior when the wrong argument type is passed for a printf conversion. Once you have messed up the argument list by passing double where an int should have been, all the following arguments may be in the wrong places too. They might be in different registers from where they are expected, or they might be in different stack locations from where they are expected. printf has no way to recover from this mistake.
As stated, all of the above neglects compiler optimization. The rules of C arose out of various needs, such as accommodating the problems above and making C portable to a variety of systems. However, once those rules are written, compilers can take advantage of them to allow optimization. The C standard permits a compiler to make any transformation of a program as long as the changed program has the same behavior as the original program under the rules of the C standard. This permission allows compilers to speed up programs tremendously in some circumstances. But a consequence is that, if your program has behavior not defined by the C standard (and not defined by any other rules the compiler follows), it is allowed to transform your program into anything. Over the years, compilers have grown increasingly aggressive about their optimizations, and they continue to grow. This means, aside from the simple behaviors described above, when you pass incorrect arguments to printf, the compiler is allowed to produce completely different results. Therefore, although you may commonly see the behaviors I describe above, you may not rely on them.
Footnote
1 Note that this is not a conversion. A conversion is an operation whose input is one type and whose output is another type but has the same value (or as nearly the same as is possible, in some sense, as when we convert a double 3.5 to an int 3). In some cases, a conversion does not require any change to the bits—an unsigned 3 and an int 3 use the same bits to represent 3, so the conversion does not change the bits, and the result is the same as a reinterpretation. But they are conceptually different.

Different output with printf

I am trying to print a particularly large value after reading it from the console. When I print it in two different ways, one after assigning the result of strtol to a variable and one using the return value of strtol directly, I get different output! Can someone please explain to me why I am seeing two different outputs?
Input value is: 4294967290
Here is the code snippet.
long int test2 = strtol(argv[1], &string, 10);
printf("the strtol value is %ld\n", test2);
printf("the strtol function value is %ld\n", strtol(argv[1], &string, 10));
Output
the strtol value is -6
the strtol function value is 4294967290
You need to do two things:
Add the line #include <stdlib.h> at the beginning of your program. That will cause strtol to be declared. (You also need #include <stdio.h> in order to declare printf.)
Add -Wall to your compiler flags (if you are using gcc or clang). That will cause the compiler to tell you that you need to declare strtol, and it might even suggest which header to include.
What is going on is that you haven't declared strtol, with the result that the compiler assumes that it returns an int.
Since 4294967290 is 2^32 - 6, it is the 32-bit two's-complement representation of -6. Because the compiler assumes that strtol returns an int, the code it produces only looks at the low-order 32 bits. Since you are assigning the value to a long, the compiler needs to emit code which sign-extends the int. In other words, it takes the low-order 32 bits as though they were a signed integer (which would be -6) and then widens that -6 to a long.
In the second call to printf, the return value of strtol is inserted in the printf argument list without conversion. You tell printf that the argument is a long (by using the l length modifier in %ld), and by luck the entire 64 bits are in the argument list, so printf reads them out as a long and prints the "correct" value.
Needless to say, all this is undefined behaviour, and the actual output you are seeing is in no way guaranteed; it just happens to work that way with your compiler on your architecture. On some other compiler or on some other architecture, things might have been completely different, including the bit-length of int and long. So the above explanation, although possibly interesting, is of no practical value. Had you simply included the correct header and thereby told the compiler the real return type of strtol, you would have gotten the output you expected.

why printf behaves differently when we try to print character as a float and as a hexadecimal?

I tried to print character as a float in printf and got output 0. What is the reason for this.
Also:
char c='z';
printf("%f %X",c,c);
is giving some weird output for hexadecimal while output is correct when I do this:
printf("%X",c);
why is it so?
The printf() function is a variadic function, which means that you can pass a variable number of arguments of unspecified types to it. This also means that the compiler doesn't know what type of arguments the function expects, and so it cannot convert the arguments to the correct types. (Modern compilers can warn you if you get the arguments wrong to printf, if you invoke it with enough warning flags.)
For historical reasons, you can not pass an integer argument of smaller rank than int, or a floating type of smaller rank than double to a variadic function. A float will be converted to double and a char will be converted to int (or unsigned int on bizarre implementations) through a process called the default argument promotions.
When printf parses its parameters (arguments are passed to a function, parameters are what the function receives), it retrieves them using whatever method is appropriate for the type specified by the format string. The "%f" specifier expects a double. The "%X" specifier expects an unsigned int.
If you pass an int and printf tries to retrieve a double, you invoke undefined behaviour.
If you pass an int and printf tries to retrieve an unsigned int, you invoke undefined behaviour.
Undefined behaviour may include (but is not limited to) printing strange values, crashing your program or (the most insidious of them all) doing exactly what you expect.
Source: n1570 (the final public draft of the C11 standard)
You need to use a cast operator like this:
char c = 'z';
printf("%f %X", (float)c, c);
or
printf("%f %X", (double)c, c);
In Xcode, if I do not do this, I get the warning:
Format specifies type 'double' but the argument has type 'char', and the output is 0.000000.
I tried to print character as a float in printf and got output 0. What is the reason for this.
The question is, what value did you expect to see? Why would you expect something other than 0?
The short answer to your question is that the behavior of printf is undefined if the type of the argument doesn't match the conversion specifier. The %f conversion specifier expects its corresponding argument to have type double; if it isn't, all bets are off, and the exact output will vary.
To understand the floating point issue, consider reading: http://en.wikipedia.org/wiki/IEEE_floating_point
As for hexadecimal, let me guess: the output of the working call was 7A?
This is because of encodings. The machine has to represent information in some format, and usually that format entails either giving meanings to certain bits in a number, or having a table mapping symbols to numbers, or both.
Floating-point values are usually represented as a (sign, mantissa, exponent) triplet packed into a 32-bit or 64-bit number. Characters are often represented in an encoding named ASCII, which establishes which number corresponds to each character you type; in ASCII, 'z' is 122, which is 7A in hexadecimal.
This happens because printf, like any function that works with varargs (e.g. int foobar(const char *fmt, ...) {}), tries to interpret each of its arguments as a certain type.
If you say "%f" and then pass c (a char, which is promoted to int), printf will try to read a double.
You can read more in the documentation for va_arg (even if it is written for C++, it still applies).

why i am not getting the expected output?

int main()
{
int x;
float y;
char c;
x = -4443;
y = 24.25;
c = 'M';
printf("\nThe value of integer variable x is %f", (float)x);
printf("\nThe value of float variable y is %d", y);
printf("\nThe value of character variable c is %f\n",c);
return 0;
}
Output:
The value of integer variable x is -4443.000000
The value of float variable y is 0
The value of character variable c is 24.250000
Why am I not getting the expected output?
But when I am using external casting I am getting expected output which is:
The value of integer variable x is -4443.000000
The value of float variable y is 24
The value of character variable c is 77.000000
why i am not getting the expected output ?
Short answer: Because your expectations are wrong.
You're instructing printf to read an integer from where y is, which is wrong. Format specifiers don't tell the compiler to do casts; they only tell printf what type to expect, and printf trusts you to provide the right type.
The behaviour can be due to the fact that, for example, a float argument to printf is promoted to double, which is stored in 8 bytes. For 24.25 the low-order 4 bytes are 0. But an int is stored in 4 bytes. So when you tell printf to read an int from where y is, it may read the 4 low-order bytes (the first ones on a little-endian machine), which are 0, and print 0...
EDIT: As John pointed out in the comments, this is UB, which means that anything can happen:
7.21.6.1/9
If a conversion specification is invalid, the behavior is undefined. If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.
Many computing platforms pass different types of arguments in different ways. On some platforms, floating-point arguments are passed in special floating-point registers. On most platforms, integer arguments are passed in general processor registers. Large arguments, such as structures, are stored somewhere in memory, and a pointer is passed instead (invisibly to the C source code). Once the few registers available for arguments are used, the remaining arguments are typically passed on the stack.
When you call printf, the compiler does not match the arguments you pass to the conversion specifiers in the format string. (Except that a good compiler will check and issue a warning if the types do not match.) In order to operate, the printf routine reads the format string and, when it finds a conversion specifier, it reads data from where the corresponding argument should be. If you specify “%d” but pass a float, the printf routine may read data from a general processor register, but the float value is in a floating-point register. Therefore, the value that is printed will be whatever data happened to be in the general processor register.
Similarly, when you specify “%f” but pass an integer, the printf routine may read from a floating-point register, but the integer value is in a general processor register.
The compiler will not convert printf arguments to the target type and might not warn you about the mismatches. You must match the conversion specifiers in the format string to the argument types.
Bonus: Here are documents describing how arguments are passed to subroutines on one platform (Mac OS X).
You cannot format a char as a float with "%f"; use "%c" or "%d" instead. I find that http://www.cplusplus.com/reference/clibrary/cstdio/printf/ is a good reference.
The format specifiers and the types of the arguments don't match, which I believe causes undefined behavior. printf doesn't do casting for you, so you have to explicitly cast the arguments.

C program : help about variable definition sequence

void main()
{
float x = 8.2;
int r = 6;
printf ( "%f" , r/4);
}
It is clearly odd that I am not explicitly typecasting r (of int type) to float in the printf call. However, if I change the order of the declarations of x and r, declaring r first and then x, I get a different result. Again, I am not using x anywhere in the program. These are the things I meant to be wrong; I want to keep them the way they are. When I execute the first piece of code I get 157286.375011 as the result (a garbage value).
void main()
{
int r = 6;
float x = 8.2;
printf ( "%f" , r/4);
}
And if I execute the code above I get 0.000000 as the result. I know the results can go wrong because I am using %f in the printf when it should have been %d. But my question is why the results change when I change the order of the variable definitions. Shouldn't it be the same whether right or wrong?
Why is this happening?
printf does not have any type checking. It relies on you to do that checking yourself, verifying that all of the types match the formatting specifiers.
If you don't do that, you enter into the realm of undefined behavior, where anything can happen. The printf function is trying to interpret the specified value in terms of the format specifier you used. And if they don't match, boom.
It's nonsense to specify %f for an int, but you already knew that...
The %f conversion specifier takes a double argument but you are passing an int argument. Passing an int argument to the %f conversion specifier is undefined behavior.
In this expression:
r / 4
both operands are of type int and the result is also of type int.
Here is what you want:
printf ("%f", r / 4.0);
When printf grabs the variadic arguments (i.e. the arguments after the char * that tells it what to print), it has to fetch them from wherever they were passed, which on many systems is the stack. double is usually 64 bits (8 bytes) whereas int is usually 32 bits (4 bytes).
Moreover, floating point numbers have an odd internal structure as compared to integers.
Since you're passing an int in place of a double, printf is trying to get 8 bytes off the stack instead of four, and it's trying to interpret the bytes of a int as the bytes of a double.
So not only are you getting 4 bytes of memory containing no one knows what, but you're also interpreting that memory -- that's 4 bytes of int and 4 bytes of random stuff from nowhere -- as if it were a double.
So yeah, weird things are going to happen. When you re-compile (or sometimes even re-run) a program that just wantonly picks things out of memory that it hasn't malloc'd and hasn't stored to, you're going to get unpredictable and wildly changing values.
Don't do it.

Resources