Why does an integer interpreted as a double render zero?

printf("%f", 20); results in the output 0.000000, not 20.000000. I'm guessing this has to do with how an int is represented in memory and how a double is represented in memory. Surprisingly for me, no matter how I alter 20, for example by making the number larger, the output is still 0.000000. Could someone please explain the underlying mechanics of this?

Most probably you are compiling your code on a platform/ABI where, even for varargs functions, arguments are passed in registers, and in particular in different registers for integer and floating point values. x86_64 on Linux/OS X behaves like that.
The caller has an integer to pass, so it puts it into rsi; on the other side, printf expects a floating point value, so it tries to read it from xmm0. No matter how you change your integer argument, printf will be unaffected - it will just print whatever happens to be in xmm0 at the moment of the call.
You can actually check if this is the case by changing your call to:
printf("%f", 20, 123.45);
if it's working as I described, you should see 123.45 printed (the caller here puts 123.45 into xmm0, as it is the first floating point parameter passed to the function; printf behaves as before, but this time finds another value in xmm0).
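Putting the whole experiment into a minimal program (a sketch; the register behavior described above assumes an x86_64 System V platform, and since this is undefined behavior the output may differ on your machine):
#include <stdio.h>
int main(void)
{
    /* Undefined behavior on purpose: %f expects a double, but an int
       is passed. On x86_64 System V the int lands in a general-purpose
       register while printf reads xmm0, so whatever is in xmm0 is printed. */
    printf("%f\n", 20);
    /* Smuggle a value into xmm0: 123.45 is the first floating point
       argument, so the caller places it there. printf, reading xmm0
       for the %f, may well print 123.450000. */
    printf("%f\n", 20, 123.45);
    return 0;
}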

First of all, this is undefined behavior. %f expects an argument of type double. Passing an int is a type mismatch, and hence it invokes undefined behavior (UB).
Quoting C11, chapter §7.21.6.1, fprintf()
f,F
A double argument representing a floating-point number [...]
A float is also allowed: due to the default argument promotions it is promoted to double, which is the expected type there. So either a double or a float is acceptable, but not an int.
...and that's all. UB is, well, UB. You cannot justify anything about code that produces UB.
That said, with proper warning levels enabled (and warnings treated as errors), the code should not compile at all. If you choose to let it compile and produce a binary/assembly code, you can see different code generated for different platforms. One such case is explained in the other answer by Matteo Italia, considering the x86_64 arch on Linux/OS X.
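For instance, with GCC you can turn the format diagnostic into a hard error so the mismatched call is rejected at compile time (a sketch; -Wformat is enabled by -Wall, and -Werror=format promotes that warning to an error):
gcc -Wall -Werror=format test.c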

The problem is that the literal 20 is an int. Your options are either to declare a floating-point variable and pass it here, or to add a cast.
e.g.
printf("%f", (double)20);

Related

Why can't I use %d to print the address of a variable

#include <stdio.h>
int main()
{
    int a = 10;
    printf("address = %d\n ", &a);
    return 0;
}
test.c:11:29: warning: format specifies type 'int' but the argument has type 'int *' [-Wformat]
printf("address = %d\n ", &a);
Output if I use %d:
address = -376933160
Output if I use %p (this is also weird; I think I should get a positive integer instead of this):
address = 0x7ffee07004d8
I know the correct way to do this is %p, but I have seen in YouTube videos that people just use %d and don't get any problems. I wonder if my setup is wrong; I am using VS Code to run the program.
video example
Update
Now I am aware -376933160 is an incorrect value, but I still wonder why it just outputs a random number instead of stopping the execution?
Format specifiers have specific purposes. The %d specifier is for int values (and, via promotion, smaller integer types such as short); long has %ld, and there are more variants like that.
You have a pointer, which holds an address. Although it is a numerical value, it is special in its purpose and should be formatted with %p, the proper way to print pointers.
Also, the size of a pointer varies by architecture, so it may not be the same size as %d expects.
Regarding the different values that were printed to the screen: if you were to print the address of a variable in these two formats in the same execution, you would again get one 'positive' hex value and one 'negative' integer value. But these values are actually the same. Almost.
The integer representation is only the lower 32 bits of the 64-bit value, and it is negative because %d treats it as signed; the pointer representation (since it is an address and sign doesn't matter) is an unsigned hex value and looks positive. Both are the same value in memory, at least in the 32 bits they share. This happens because of the different widths of the types and something called "two's complement". You can further read about it here
Note: The two values you mentioned are not the same. Since you got them in different executions of the program and ASLR was on, the actual address of a changed between executions.
It is important to mention that even though I refer to pointers in this answer as numerical values, it is not correct to regard them as such, as they hold addresses, which are their own type category (thanks #JohnBullinger for clarifying).
Use the correct format specifier to avoid this warning, as it is informing you that you may have mistyped or used the wrong variable, since it is not a regular numerical value you're trying to print, but an address.
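To illustrate the truncation described above without invoking printf's undefined behavior, you can take the low 32 bits of the address explicitly (a sketch; it assumes 64-bit pointers, and the conversion of an out-of-range value to int is implementation-defined):
#include <stdio.h>
#include <stdint.h>
int main(void)
{
    int a = 10;
    /* The portable way: %p with a void * argument. */
    printf("address     = %p\n", (void *)&a);
    /* Keep only the low 32 bits of the address; on a typical 64-bit
       machine this reproduces the 'negative' number %d happened to show. */
    uintptr_t bits = (uintptr_t)&a;
    printf("low 32 bits = %d\n", (int)(uint32_t)bits);
    return 0;
}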
Now I am aware -376933160 is an incorrect value, but still I wonder why it just outputs a random number instead of stopping the execution?
Because it is Undefined Behaviour. It does what it does. No rules.
Using the wrong format specifier is cheating the compiler. Imagine that I want to buy your old car. The price is $5000. I pay you, but not in US$. If I pay £5000, you are the winner. But if I pay 5000 Greek drachmas, you will not be very happy. And your behaviour will be undefined. Maybe you will chase me with a baseball bat in your hand, or maybe you will simply give up and accept your losses.
Same happens with the compiler and the printf function.
For the same reason you can't use %f, or %c, or anything other than %p.
%d expects its corresponding argument to have type int; if the argument doesn't have type int, then the behavior is undefined. You may get reasonable-looking output, or you may not.
On most 64-bit machines sizeof (int *) > sizeof (int). On x86-64, pointers are 64 bits wide while ints are 32 bits wide. If you pass a pointer as the argument for %d, printf will only pull half of the pointer value - the output is gibberish.
Now I am aware -376933160 is an incorrect value, but still I wonder why it just outputs a random number instead of stopping the execution?
Again, you're likely only seeing the lower 32 bits of a 64-bit pointer value.
Undefined behavior does not mean "stop execution" or "print an error" or anything else - it just means that neither the compiler nor the runtime environment are required to handle the situation in any particular way.
The loss of bits and the appearance of the minus sign provoke a warning when you do the conversion directly:
adr.c:7:13: warning: overflow in conversion from 'long int' to 'int' changes value from '140732663858392' to '-529529640' [-Woverflow]
7 | int adr = 0x7ffee07004d8;
0xe07004d8 (the lower 32 bits) is over 2^31, so as a signed int it comes out negative. With 0x7ffe0000000a it would convert to '10'.
Not a random number, and no reason to "stop execution".
Such a "cast" rarely makes sense, especially not on pointers. (Unless you take a pointer to convert it to a long just to play with it).
The compiler will generally issue a warning if there is a conversion between pointer and integer types without an explicit cast, as a protection for the programmer. You can eliminate the warning by casting &a to long long int.
Depending on the system you may be able to print the decimal value if you cast it with %lld:
printf("address = %lld\n ", (long long int)&a);
However, not all systems support ll as a valid length modifier (it requires C99).
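If the toolchain supports C99, <inttypes.h> offers a more portable route than guessing at ll (a sketch; uintptr_t is technically an optional type in the standard, though it is available on all common platforms):
#include <stdio.h>
#include <inttypes.h>
int main(void)
{
    int a = 10;
    /* uintptr_t is an unsigned integer type wide enough for a pointer;
       PRIuPTR expands to the matching printf length/conversion. */
    printf("address = %" PRIuPTR "\n", (uintptr_t)&a);
    return 0;
}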

Use of format specifiers for conversions

I am unable to deduce the internal happenings inside the machine when we print data using format specifiers.
I was trying to understand the concept of signed and unsigned integers and found the following:
unsigned int b=-12;
printf("%d\n",b); //prints -12
printf("%u\n\n",b); //prints 4294967284
I am guessing that b actually stores the binary version of -12 as 11111111111111111111111111110100.
So, since b is unsigned, b technically stores 4294967284.
But still, the format specifier %d causes the binary value of b to be printed as its signed version, i.e. -12.
However,
printf("%f\n",2); //prints 0.000000
printf("%f\n",100); //prints 0.000000
printf("%d\n",3.2); //prints 2147483639
printf("%d\n",3.1); //prints 2147483637
I kind of expected the 2 to be printed as 2.000000 and the 3.2 to be printed as 3, as per type conversion norms.
Why does this not happen and what exactly takes place at machine level ?
Mismatching the format specifier and argument type (like using the floating-point specifier "%f" to print an int value) leads to undefined behavior.
Remember that 2 is an integer value, and variadic functions (like printf) don't really know the types of the arguments. The printf function has to rely on the format specifier and assume the argument is of the specified type.
To better understand how you get the results you get, to understand "the internal happenings", we first must make two assumptions:
The system uses 32 bits for the int type
The system uses 64 bits for the double type
Now what happens with
printf("%f\n",2); //prints 0.000000
is that the printf function sees the "%f" specifier and fetches the next argument as a 64-bit double value. Since the int value you provided in the argument list is only 32 bits, half of the bits in the double value will be unknown. The printf function will then print the (invalid) double value. If you're unlucky, some of the unknown bits might turn the value into a trap representation, which can cause a crash.
Similarly with
printf("%d\n",3.2); //prints 2147483639
the printf function fetches the next argument as a 32-bit int value, losing half of the bits in the 64-bit double value provided as the actual argument. Exactly which 32 bits are copied into the internal int value depends on endianness. Integers don't have trap representations, so no crash happens; just an unexpected value will be printed.
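To make "printf has to rely on the format specifier" concrete, here is a toy variadic function that, like printf, blindly trusts what the caller claims (a hedged sketch, not how any real printf is implemented):
#include <stdio.h>
#include <stdarg.h>
/* Fetches one argument, trusting the caller's claim about its type,
   exactly as printf must trust its format string. */
static void show(const char *claimed_type, ...)
{
    va_list ap;
    va_start(ap, claimed_type);
    if (claimed_type[0] == 'd')               /* caller claims double */
        printf("as double: %f\n", va_arg(ap, double));
    else                                      /* caller claims int */
        printf("as int:    %d\n", va_arg(ap, int));
    va_end(ap);
}
int main(void)
{
    show("int", 2);     /* correct claim: prints 2 */
    show("double", 2);  /* wrong claim: undefined behavior, like printf("%f", 2) */
    return 0;
}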
what exactly takes place at machine level ?
The stdio.h functions are quite far from the machine level. They provide a standardized abstraction layer on top of various OS APIs, whereas "machine level" would refer to the generated assembly. The behavior you experience is mostly related to details of the C language rather than the machine.
On the machine level there are no signed numbers; everything is treated as raw binary data. The compiler can turn raw binary data into a signed number by using an instruction that tells the CPU: "use what's stored at this location and treat it as a signed number". Specifically, as a two's complement signed number on all common computers. But this is irrelevant when explaining why your code misbehaves.
The integer constant 12 is of type int. When we write -12 we apply the unary - operator on that. The result is still of type int but now of value -12.
Then you attempt to store this negative number in an unsigned int. This triggers an implicit conversion to unsigned int, which should be carried out according to the C standard:
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type
The maximum value of a 32-bit unsigned int is 2^32 - 1 = 4294967295. "One more than the maximum" gives 2^32 = 4294967296. If we calculate -12 + 4294967296 we get 4294967284. This is in range of an unsigned int and is the result you see later.
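That arithmetic can be checked directly (a sketch; it assumes a 32-bit unsigned int, as above):
#include <stdio.h>
#include <limits.h>
int main(void)
{
    unsigned int b = -12;      /* converted as -12 + (UINT_MAX + 1) */
    printf("%u\n", UINT_MAX);  /* 4294967295 */
    printf("%u\n", b);         /* 4294967284 */
    printf("%u\n", b + 12u);   /* wraps around to 0 */
    return 0;
}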
Now, as it happens, the printf family of functions is very unsafe. If you provide a wrong format specifier, one which doesn't match the type, they might crash or display the wrong result etc. - the program invokes undefined behavior.
So when you use %d or %i, reserved for signed int, but pass an unsigned int, anything can happen. "Anything" includes printf happening to reinterpret the bits of the passed type as the type named by the format specifier. That's what happened when you used %d.
When you pass values of types completely mismatching the format specifier, though, the program just prints gibberish, because you are still invoking undefined behavior.
I kind of expected the 2 to be printed as 2.00000 and 3.2 to be printed as 3 as per type conversion norms.
The reason why the printf family can't do anything intelligent, like assuming that 2 should be converted to 2.0, is that they are variadic (variable-argument) functions, meaning they can take any number of arguments. To make that possible, the arguments are essentially passed as raw binary through something called a va_list, and all type information is lost. The printf implementation is therefore left with no type information but the format string you gave it. This is why variadic functions are so unsafe to use.
A regular function has more type safety: if you declare void foo (float f) and pass the integer constant 2 (type int), the compiler implicitly converts from int to float, and perhaps also gives a conversion warning.
The behaviors you observe are the result of printf interpreting the bits given to it as the type specified by the format specifier. In particular, at least for your system:
The bits for an int argument and an unsigned argument in the same position within the argument list would be passed in the same place, so when you give printf one and tell it to format the other, it uses the bits you give it as if they were the bits of the other.
The bits for an int argument and a double argument would be passed in different places—possibly a general register for the int argument and a special floating-point register for the double argument, so when you give printf one and tell it to format the other, it does not get the bits for the double to use for the int; it gets completely unrelated bits that were left lying around by previous operations.
Whenever a function is called, values for its arguments must be placed in certain places. These places vary according to the software and hardware used, and they vary by the type and number of arguments. However, for any particular argument type, argument position, and specific software and hardware used, there is a specific place (or combination of places) where the bits of that argument should be stored to be passed to the function. The rules for this are part of the Application Binary Interface (ABI) for the software and hardware being used.
First, let us neglect any compiler optimization or transformation and examine what happens when the compiler implements a function call in source code directly as a function call in assembly language. The compiler will take the arguments you provide for printf and write them to the places designated for those types of arguments. When printf executes, it examines the format string. When it sees a format specifier, it figures out what type of argument it should have, and it looks for the value of that argument in the place for that type of argument.
Now, there are two things that can happen. Say you passed an unsigned but used a format specifier for int, like %d. In every ABI I have seen, an unsigned and an int argument (in the same position within the list of arguments) are passed in the same place. So, when printf looks for the bits of the int it expects, it will get the bits of the unsigned you passed.
Then printf will interpret those bits as if they encoded the value for an int, and it will print the results. In other words, the bits of your unsigned value are reinterpreted as the bits of an int.1
This explains why you see “-12” when you pass the unsigned value 4,294,967,284 to printf to be formatted with %d. When the bits 11111111111111111111111111110100 are interpreted as an unsigned, they represent the value 4,294,967,284. When they are interpreted as an int, they represent the value −12 on your system. (This encoding system is called two’s complement. Other encoding systems include one’s complement and sign-and-magnitude, in which these bits would represent −11 and −2,147,483,636, respectively. Those systems are rare for plain integer types these days.)
That is the first of two things that can happen, and it is common when you pass the wrong type but it is similar to the correct type in size and nature—it is passed in the same place as the wrong type. The second thing that can happen is that the argument you pass is passed in a different place than the argument that is expected. For example, if you pass a double as an argument, it is, in many systems, placed in a separate set of registers for floating-point values. When printf goes looking for an int argument for %d, it will not find the bits of your double at all. Instead, what it finds in the place where it looks for an int argument might be whatever bits happened to be left in a register or memory location from previous operations, or it might be the bits of the next argument in the list of arguments. In any case, this means that the value printf prints for the %d will have nothing to do with the double value you passed, because the bits of the double are not involved in any way—a completely different set of bits is used.
This is also part of the reason the C standard says it does not define the behavior when the wrong argument type is passed for a printf conversion. Once you have messed up the argument list by passing double where an int should have been, all the following arguments may be in the wrong places too. They might be in different registers from where they are expected, or they might be in different stack locations from where they are expected. printf has no way to recover from this mistake.
As stated, all of the above neglects compiler optimization. The rules of C arose out of various needs, such as accommodating the problems above and making C portable to a variety of systems. However, once those rules are written, compilers can take advantage of them to allow optimization. The C standard permits a compiler to make any transformation of a program as long as the changed program has the same behavior as the original program under the rules of the C standard. This permission allows compilers to speed up programs tremendously in some circumstances. But a consequence is that, if your program has behavior not defined by the C standard (and not defined by any other rules the compiler follows), it is allowed to transform your program into anything. Over the years, compilers have grown increasingly aggressive about their optimizations, and they continue to grow. This means, aside from the simple behaviors described above, when you pass incorrect arguments to printf, the compiler is allowed to produce completely different results. Therefore, although you may commonly see the behaviors I describe above, you may not rely on them.
Footnote
1 Note that this is not a conversion. A conversion is an operation whose input is one type and whose output is another type but has the same value (or as nearly the same as is possible, in some sense, as when we convert a double 3.5 to an int 3). In some cases, a conversion does not require any change to the bits—an unsigned 3 and an int 3 use the same bits to represent 3, so the conversion does not change the bits, and the result is the same as a reinterpretation. But they are conceptually different.
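To see a reinterpretation (as opposed to a conversion) without invoking printf's undefined behavior, the bits can be copied explicitly; memcpy is the well-defined way to do this (a sketch assuming 32-bit int and unsigned and a two's complement representation):
#include <stdio.h>
#include <string.h>
int main(void)
{
    unsigned int u = 4294967284u;  /* bits 0xFFFFFFF4 */
    int i;
    /* Copy the object representation of u into i; on a two's
       complement machine these bits read back as -12. */
    memcpy(&i, &u, sizeof i);
    printf("%d\n", i);  /* prints -12 */
    return 0;
}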

Behaviour of printf("%d", <double>) in C [duplicate]

This question already has answers here:
Given the state of the stack and registers, can we predict the outcome of printf's undefined behavior
Why does this code print nonsense values? And if there is sense to them, what is it?
printf("%d\n", 5.0 / 4);
By the way, I know about format specifiers and that I should be using %f instead of %d, but I want to know what C actually does.
Strangely, every time I run the compiled program it prints a different thing. Doesn't it have to be deterministic?
As far as I could observe, this code prints a similar thing:
float c;
printf("%d\n", &c);
are they any related?
And when I tried:
float c;
printf("%d\n%d\n", c, &c);
there is a constant difference of 252 between those two values. 256 - sizeof(float), maybe?
And declaring c as a double makes the difference 0.
Thanks in advance!
UPDATE: Running the same code on different machines yielded different results (the 252 becomes 56; the former is a 64-bit Ubuntu machine and the latter 64-bit OS X).
printf is unusual in that it has no single function signature. In other words, there's no mechanism to automatically take the actual arguments you pass and convert them into the types that are expected. Whatever you try to pass gets passed. If the type you pass does not match the type expected by the corresponding format specifier, you get nonsense.
Furthermore, if the mismatched types you're passing have different sizes from what the format specifiers expect, printf can end up popping random garbage off the stack that has nothing to do with what you thought you passed.
You asked, "doesn't it have to be deterministic?", and the answer there is a very emphatic "NO"! By passing the wrong type of value to printf, you have invoked undefined behavior, which means that anything can happen, and it very definitely does not have to be the same thing twice.
(Even though they're not required to, some compilers peek at the format string and try to check the values you passed against the values that are expected. For example, under gcc, your first code fragment gets the warning "format ‘%d’ expects type ‘int’, but argument 2 has type ‘double’". But I don't know of a compiler that will try to do conversions to "fix" the arguments.)

Why does %d show two different values for *b and *c in the code [b and c point to the same address]

Consider the code below:
#include <stdio.h>
int main(void)
{
    float a = 7.999, *b, *c;
    b = &a; c = b;
    printf("%d-b\n%d-c\n%d-a\n", *b, *c, a);
}
OUTPUT:
-536870912-b
1075838713-c
-536870912-a
I know we are not allowed to use %d instead of %f, but why do *b and *c give two different values?
Both point to the same address; can someone explain?
I want to know the logic behind it
Here is a simplified example of your ordeal:
#include <stdio.h>
int main() {
    float a = 7.999, b = 7.999;
    printf("%d-a\n%d-b\n", a, b);
}
What's happening is that a and b are converted to doubles (8 bytes each) for the call to printf (since it is variadic). Inside the printf function, the given data, 16 bytes, is printed as two 4-byte ints. So you're printing the first and second half of one of the given double's data as ints.
Try changing the printf call to this and you'll see both halves of both doubles as ints:
printf("%d-a1\n%d-a2\n%d-b1\n%d-b2\n",a,b);
I should add that as far as the standard is concerned, this is simply "undefined behavior", so the above interpretation (tested on gcc) would apply only to certain implementations.
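A complete version of that experiment (a sketch; as noted above, this walk-through-the-halves reading holds only on implementations that pass the promoted doubles in one contiguous block, such as stack-based 32-bit x86 ABIs, and it is undefined behavior everywhere):
#include <stdio.h>
int main(void)
{
    float a = 7.999, b = 7.999;
    /* Each float is promoted to a 64-bit double; each %d then consumes
       (typically) 32 bits, so four %d's may walk through both doubles. */
    printf("%d-a1\n%d-a2\n%d-b1\n%d-b2\n", a, b);
    return 0;
}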
There can be any number of reasons.
The most obvious -- your platform passes some integers in integer registers and some floating point numbers in floating point registers, causing printf to look in registers that have never been set to any particular value.
Another possibility -- the variables are different sizes, so printf is looking at data that wasn't set, or was set as part of some other operation.
Because printf takes its parameters through ..., type agreement is essential to ensure the implementation of the function is even looking in the right places for its parameters.
You would have to have deep knowledge of your platform and/or dig into the generated assembly code to know for sure.
Using the wrong conversion specification invokes undefined behavior. You may get either an expected or an unexpected value. Your program may crash, may give different results on different compilers, or show any other unexpected behavior.
C11: 7.21.6 Formatted input/output functions:
If a conversion specification is invalid, the behavior is undefined.282) If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.
// Bad
float a = 7.999, *b, *c;
b = &a; c = b;
printf("%d-b\n%d-c\n%d-a\n", *b, *c, a);
// Good
float a = 7.999, *b, *c;
b = &a; c = b;
printf("%f-b\n%f-c\n%f-a\n", *b, *c, a);
Using the integer format specifier "%d" instead of the correct floating-point specifier "%f" is what alk's answer elliptically describes as "undefined behavior".
You need to use the correct format specifier.

Puzzled about printf output

While using printf with %d as the format specifier and giving a float as an argument, e.g. 2.345, it prints 1546188227. So I understand that it may be due to the conversion of the single-precision floating-point format to the simple decimal format. But when we print 2.000 with the %d format specifier, why does it print only 0?
Please help.
Format specifier %d can only be used with values of type int (and compatible types). Trying to use %d with float or any other types produces undefined behavior. That's the only explanation that truly applies here. From the language point of view the output you see is essentially random. And you are not guaranteed to get any output at all.
If you are still interested in investigating the specific reason for the output you see (however little sense it makes), you'll have to perform a platform-specific investigation, because the actual behavior depends critically on various implementation details. And you are not even mentioning your platform in your post.
In any case, as a side note, note that it is impossible to pass float values as variadic arguments to variadic functions. float values in such cases are always converted to double and passed as double. So in your case it is double values you are attempting to print. Behavior is still undefined though.
Go here, enter 2.345, click "rounded". Observe 64-bit hex value: 4002C28F5C28F5C3. Observe that 1546188227 is 0x5c28f5c3.
Now repeat for 2.00. Observe that 64-bit hex value is 4000000000000000
P.S. When you say that you give a float argument, what you apparently mean is that you give a double argument.
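Those hex values can also be produced in code rather than with an online converter (a sketch; it assumes IEEE 754 doubles and uses memcpy for a well-defined reinterpretation of the bits):
#include <stdio.h>
#include <string.h>
#include <stdint.h>
static void dump(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    printf("%f = 0x%016llx\n", d, (unsigned long long)bits);
}
int main(void)
{
    dump(2.345);  /* 0x4002c28f5c28f5c3; low 32 bits 0x5c28f5c3 = 1546188227 */
    dump(2.000);  /* 0x4000000000000000; low 32 bits are all zero */
    return 0;
}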
Here is what the ISO/IEC 9899:1999 standard §7.19.6 states:
If a conversion specification is invalid, the behavior is undefined.239) If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.
If you're trying to make it print integer values, cast the floats to ints in the call:
printf("ints: %d %d", (int) 2.345, (int) 2.000);
Otherwise, use the floating point format identifier:
printf("floats: %f %f", 2.345, 2.000);
When you use printf with the wrong format specifier for the corresponding argument, the result is undefined behavior. It could be anything and may differ from one implementation to another. Only correct use of format specifiers has defined behavior.
First a small nitpick: The literal 2.345 is actually of the type double, not float, and besides, even a float, such as the literal 2.345f, would be converted to double when used as an argument to a function that takes a variable number of arguments, such as printf.
But what happens here is that the (typically) 64 bits of the double value are sent to printf, and it then interprets (typically) 32 of those bits as an integer value. It just happens that those bits were zero.
According to the standard, this is what is called undefined behavior: The compiler is allowed to do anything at all.
