Dereferencing int but casting to a float prints nothing in C

C noob here trying to follow along with some online lectures. In the professor's example, he shows us that we can read the data stored in an int as a float by doing the following: *(float*)&i. I tried doing this with the following code, but nothing happens. I am testing it here: http://ideone.com/ExmXSW
#include <stdio.h>

int main(void) {
    // your code goes here
    int i = 37;
    printf("%f", *(float*)&i);
    return 0;
}

This causes undefined behaviour:
- Executing *(float *)&i violates the strict aliasing rule.
- The wrong format specifier was used: %i is for an int, but you supplied a float.
When code causes undefined behaviour, anything may happen. A lecture advising you to do this is a rubbish lecture unless it is specifically showing this as an example of what NOT to do. It is incorrect to say "we can read the data stored in an int as float" by this method.
NB. ideone.com is not great for testing because it suppresses a whole lot of compiler error messages, so you may think your code is correct when it in fact is not.
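If what you actually want is to reinterpret an int's bytes as a float, the well-defined way in C is to copy the bytes with memcpy instead of casting the pointer. A minimal sketch, assuming int and float are both 32 bits (true on most platforms):

#include <stdio.h>
#include <string.h>

int main(void) {
    int i = 37;
    float f;
    /* copy the object representation of i into f; unlike *(float *)&i,
       this does not violate strict aliasing */
    memcpy(&f, &i, sizeof f);
    printf("%e\n", f);   /* prints roughly 5.184804e-44 */
    return 0;
}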

What the professor may have wanted to teach you is that if you store an integer in a memory location (represented by 32 bits on most machines), you can read it back as a float (again, 32 bits on most machines), but you will get a different value. This is because an integer is stored as a simple binary number: for example, 0x00000001 equals integer 1, 0x00000002 is integer 2, and so on.
However, the float representation in binary is quite different. It looks like this:
bit  31 30    23 22                    0
      S EEEEEEEE MMMMMMMMMMMMMMMMMMMMMMM
where S is the sign, E is the exponent, and M is the mantissa.
Here is a bit of code that I was working on to help you understand this:
#include <stdio.h>
#include <stdlib.h>   /* for malloc/free */

int main(void) {
    void *x = malloc(sizeof(int));
    int *y = x;
    float *z = x;
    *y = 955555555;
    printf("%f", *z);  /* note: reading *z after writing *y is itself an
                          aliasing violation; shown here for illustration */
    free(x);
    return 0;
}
What I have done in this code is allocate some memory and let variable y interpret it as an integer and variable z interpret it as a floating-point number. Now you can change y and see that z has a totally different value. In this case the output of the program is 0.000117.
You can also change variable z and see the same happen with variable y, because both of them point to the same memory location but interpret it as different types.
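For instance, a sketch going the other way, storing through the float view and reading through the int view (the same size and aliasing caveats apply; the printed value assumes IEEE-754 floats):

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    void *x = malloc(sizeof(int));
    int *y = x;
    float *z = x;
    *z = 1.0f;            /* store 1.0 through the float view */
    printf("%d", *y);     /* read the same bits as an int: prints 1065353216 (0x3F800000) */
    free(x);
    return 0;
}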

You need to use the correct format code. What you're doing is undefined behavior, but it would probably work if you changed the printf to:
printf("%f", *(float*)&i);
so it uses the correct format code. The problem is that modern x64 calling conventions pass the first few arguments in registers, and on x86-64 at least, integer and floating point arguments use completely different sets of registers. So using %i reads a completely different register holding an effectively random value (deterministic only in the sense that you could examine the assembly to figure out what it will be, not something you could guess from looking at the source code).

There is absolutely NO correlation between the bits you would use to store 37 as an int and what those bits would mean when interpreted as a float.[1] Why? Integers (presuming 32-bit for the sake of argument) are stored in memory as a simple binary representation of the number in question (subject to the endianness of the machine), while (32-bit) floating point numbers are stored in IEEE-754 single-precision floating point format, comprised of a single sign bit, an 8-bit exponent (in excess-127 notation) and a 23-bit mantissa (in a normalized "hidden-bit" format).
The only thing the integer and the floating point number have in common is the fact that they both occupy 32 bits of memory. That is the only reason you can, despite violating every tenet of strict aliasing, cast a pointer to the integer's address and attempt to interpret it as a float.
Let's, for the sake of argument, look at what you are doing. When you store 37 as an integer, on a little-endian box, you will have the following in memory:
00000000-00000000-00000000-00100101
When you interpret those 32-bits of memory by casting to float, you are attempting to interpret an IEEE-754 single-precision floating point value in memory of:
s   exponent   mantissa
0   00000000   00000000000000000100101
when in reality, if you were looking at 37.0 in IEEE-754 single-precision floating point format, you would have the following in memory:
s   exponent   mantissa
0   10000100   00101000000000000000000
If you compare what you are trying to look at as a float with what you should be looking at as a float, you should notice your cast results in a floating point representation with a 0 value for the 8-bit exponent and a tiny 23-bit mantissa: a denormal value of 37 x 2^-149, roughly 5.18e-44. Integer 37 interpreted as a float is so small it is virtually non-printable with %f regardless of how many significant digits you specify in the format.
The bottom line is there is no relation between the integer value in memory and what a floating-point number created from those same bits would be (aside from a few computations not relevant here). An integer is an integer and a float is a float. The only thing their memory storage has in common is that on some machines they both occupy 32 bits of memory. While you can abuse a cast and attempt to use the same bits as a floating point value, just be aware there is little to nothing in the way of correlation between the two interpretations.
footnotes:
[1] there is one limited exception: for finite positive floats, interpreting the memory occupied by a floating point value as an integer and incrementing or decrementing it yields the next higher or lower representable floating point value.
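For example, a minimal sketch of that trick, assuming 32-bit IEEE-754 floats and a 32-bit unsigned int (using a union for the reinterpretation):

#include <stdio.h>

int main(void) {
    union { float f; unsigned int u; } v;
    v.f = 37.0f;
    /* for positive, finite floats, incrementing the bit pattern
       yields the next representable float */
    v.u += 1;
    printf("%.9g\n", v.f);   /* prints 37.0000038 */
    return 0;
}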

You wrote that "nothing happens", but the code actually works as expected (with some assumptions about the architecture, such as that ints and floats are both 32-bit). At least it prints something, but the result is probably not what you expected.
On Ideone.com the printed output is 0.000000. The reason is that the integer value 37, reinterpreted as a float, is 5.1848e-44. When using the "%f" format specifier in printf, this extremely small number is rounded to zero. If you change the format string to "%e", the output is 5.184804e-44. Or if you change the value of i to 1078530010, the output is 3.141593, for example.
(NB: the value is actually first converted from float to double, and the double is passed to printf(). The "%f" format specifier also expects a double, not a float, so that works out well.)
There's certainly truth in many of the already posted answers. The code indeed violates the strict aliasing rule, and in general the results are undefined. This is mostly because data types can differ between CPU architectures (different sizes, different endianness, etc.). Also, the compiler is allowed to make certain assumptions and optimize your code accordingly, which can cause the compiled executable to behave differently than intended.
In practice, the intended behavior can be "forced" by using constructs such as volatile pointers, restricting the compiler's ability to optimize the code. Newer versions of the C/C++ standards have even more advanced constructs for this. Generally speaking, though, for a given target architecture whose data sizes and formats you know, the code you posted can work correctly.
However, there is a better solution, and I'm surprised nobody has mentioned it yet. The only (somewhat) portable way to do this without breaking the strict aliasing rule, is using unions. See the following example (and demo at ideone.com):
#include <stdio.h>

int main(void) {
    union {
        int i;
        float f;
    } test;
    test.i = 1078530010;
    printf("%f", test.f);
    return 0;
}
This code also prints 3.141593, as expected.

Related

Can someone explain what maxBit is?

I am trying to understand what maxBit is in the following code and what it represents.
When I print min and max, I get numbers that make no sense to me.
Thank you.
#include <stdio.h>
#include <math.h>

int main() {
    union { double a; size_t b; } u;
    u.a = 12345;
    size_t max = u.b;
    u.a = 6;
    size_t min = u.b;
    int maxBit = floor(log(max - min) / log(2));
    printf("%d", maxBit);
    return 0;
}
This code appears to be using a horrible kludge. I am one of the more welcoming participants here regarding code that uses compiler extensions or other things beyond the C standard, but this code simply does unnecessary things for no apparent good purpose. It relies on size_t being 64 bits. It may be 64 bits in the specific C implementation this was written for, but that is not portable, and implementations where size_t is 64 bits are generally modern ones, which ought to support uint64_t from <stdint.h>; that would be an appropriate type here. So better code would have used uint64_t.
Unless there is some quite surprising motivation for this and other issues in the code, it is low quality, bad code. Do not use it, and regard any code from the same source with skepticism.
That said, the code likely assumes IEEE-754 binary64 is used for double, and max-min gives the difference between the representations of 12345 and 6. log(max-min) / log(2) finds the base-two logarithm of max-min, and the integer portion of that will be the index of the highest bit that changed. For 12345, the exponent field is 1036. For 6, the exponent field is 1025. The difference is 11 (binary 1011), whose highest set bit is bit 3 of the exponent field. The field occupies bits 62 to 52 of the binary64 format, so bit 3 of the exponent field is bit 55 (52+3) of the whole 64-bit representation. So maxBit will be 55. However, there is no apparent significance to this. There is no great value in knowing that bit 55 is the highest bit set in the difference between the representations of 12345 and 6. I am familiar with a variety of IEEE-754 bit-twiddling hacks, and I do not recognize this. I expect nobody can tell you much more about this without context, such as where the code came from or how it is used.
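For what it's worth, a small sketch that extracts those exponent fields directly, assuming IEEE-754 binary64 doubles (and using uint64_t, as recommended above):

#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(void) {
    double d = 12345;
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);                   /* well-defined bit copy */
    printf("%u\n", (unsigned)((bits >> 52) & 0x7FF)); /* exponent field: 1036 */

    d = 6;
    memcpy(&bits, &d, sizeof bits);
    printf("%u\n", (unsigned)((bits >> 52) & 0x7FF)); /* exponent field: 1025 */
    return 0;
}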
From the C17 document, 6.5.2.3 Structure and union members, footnote 97:
If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called “type punning”). This might be a trap representation.
Therefore, when you store u.a = 12345 and then access size_t max = u.b, the bit pattern in the memory of u.a is reinterpreted as a size_t. Since u.a is a double, it is represented in IEEE 754 format.
The values stored in max and min are:
4668012349850910720 (0100000011001000000111001000000000000000000000000000000000000000 -> IEEE 754 encoding of 12345.0)
4618441417868443648 (0100000000011000000000000000000000000000000000000000000000000000 -> IEEE 754 encoding of 6.0)
Then max-min = 49570931982467072, log(max-min)/log(2) = 55.460344, and floor(55.460344) = 55. That is why the output is 55.
PS: The two most common IEEE 754 formats are single precision (32-bit) and double precision (64-bit). Please visit this website on IEEE 754 for more details.

Using incorrect format specifier in printf()

I am trying to solve the following problem:
printf("%d", 1.0f); // Output is 0
I really do not know why this is so. The number 1.0 (32-bit IEEE 754) has the following binary representation:
00111111 10000000 00000000 00000000
If we interpret these bits as an integer, we get:
1 065 353 216
So, sizeof(int) == sizeof(float) == 4 bytes.
I know a float in C will be converted into a double by the compiler, but I used the f suffix for a float constant.
I tried different values and counted out the binary numbers, but I do not know why. That is insanity.
I want to see the 1 065 353 216 in my console.
When you use the incorrect format specifier to printf, you invoke undefined behavior, meaning you can't accurately predict what will happen.
That being said, on typical 64-bit calling conventions floating point arguments are passed to functions in floating point registers, while integer arguments are passed in integer registers or on the stack. So the value you're seeing is whatever happened to be sitting in the location printf reads for %d.
As an example, if I put that line by itself in a main function, it prints a different value every time I run it.
If you want to print the representation of a float, you can use a union:
#include <stdio.h>

int main(void) {
    union {
        float f;
        unsigned int i;
    } u;
    u.f = 1.0f;
    printf("%u\n", u.i);   /* prints 1065353216 */
    return 0;
}

what is meant by 'Most C systems provide for logically infinite floating values'?

Initially I declared variables x and y as type int:
#include <stdio.h>

int main() {
    int x, y = 0;
    x = 1 / y;
    printf("%d", x);
    return 0;
}
Program crashed (for obvious reasons).
Now I declared variables x and y as double:
#include <stdio.h>

int main() {
    double x, y = 0;
    x = 1 / y;
    printf("%d", x);
    return 0;
}
But Output: 0. (why?)
Then I changed %d to %f in printf:
#include <stdio.h>

int main() {
    double x, y = 0;
    x = 1 / y;
    printf("%f", x);
    return 0;
}
Output: 1.#INF00
I don't understand what is happening here.
Please explain me above cases.
Most systems you're likely to come in contact with use IEEE 754 representation for floating point numbers. That representation has ways to store the values +infinity and -infinity.
While strictly speaking, dividing by 0 is undefined behavior, implementations using IEEE 754 extend the language to allow it for floating point types. In that case, dividing a nonzero number by 0 is considered infinity, so your implementation allows it, and 1.#INF00 is how MSVC prints the value for infinity.
Also, using the wrong format specifier for printing as in your second example where you use %d to print a double is undefined behavior. Format specifiers must match the datatype of what is passed in.
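A short sketch demonstrating this on an IEEE-754 system (isinf is standard since C99; the exact string printed for infinity varies by implementation):

#include <stdio.h>
#include <math.h>

int main(void) {
    double y = 0.0;
    double x = 1 / y;             /* +infinity under IEEE 754 */
    printf("%f\n", x);            /* "inf", "1.#INF00", ... depending on the library */
    printf("%d\n", isinf(x));     /* nonzero: x is infinite */
    printf("%f\n", x - 1.0e308);  /* infinity minus a finite value is still infinity */
    return 0;
}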
There are no numbers in a computer. We build computers out of physical parts, and we use physical properties to store and manipulate data.
In various places, a computer has electric charges, electric voltages, magnetic fields, or other physical things that we use to represent data. In doing this, we pick some physical state and call it “0” and some other state and call it “1”. These are merely convenient names. Mathematical numbers like 0 and 1 are abstract entities—they are concepts with no physical existence. The numbers 0 and 1 do not exist in computers.
We bundle these physical states, often in groups of eight, 32, or 64, and then label them in various ways. For example, with the eight bits in the states labeled 00100010, we might call that “34”. It is still not a number. We are merely using a binary notation for the number 34, and that binary notation further designates the state of the pieces of the machine.
The pieces of the machine in the state 00100010 are not inherently the actual number 34 any more than they are a banana or the concept of red.
We design parts of the machine so they can manipulate these states. Somewhere in the machine is an adder that takes as input physical states representing one number and physical states representing another number and creates as output physical states representing the number that is the sum of the two input numbers.
Computers contain many parts like these that create the effect of adding numbers, multiplying numbers, subtracting numbers, and so on. These are just effects created in a machine.
With floating-point numbers, we designate certain of the bit patterns to represent infinity, and we design the floating-point arithmetic unit to behave correspondingly. When we give the floating-point arithmetic adder one input that represents infinity and another input that represents a finite number, it produces an output that represents infinity, because we designed the floating-point arithmetic unit to do that. Similarly, when we give the floating-point divider one input that represents the number one and another input that represents the number zero, it produces as output the bit pattern that represents infinity, again because we designed it to do that.
When you print a floating-point object that has the bit pattern representing infinity using printf("%f", x);, the C implementation prints a string representing infinity. Microsoft’s chosen string for that is “1.#INF00”, which is rather ugly. Some other implementations use “inf”, which is only slightly better.
When you attempt to print a floating-point object using printf("%d", x);, the behavior is not defined by the C standard, because the %d conversion expects to receive an int object, but the x you are passing is a double object. Passing the wrong type of argument can screw up the argument-passing mechanism in a variety of ways, so you will not always get answers that make sense without knowing how the internals of the software work.

data type: float, long conversion in C

I was reading C primer plus, in chapter 3, data type, the author says:
If you take the bit pattern that represents the float number 256.0 and interpret it as a long value, you get 1132462080.
I don't understand how this conversion works. Can someone help me with this? Thanks.
256.0 is 1.0 * 2^8, right?
Now, look at the format (stealing it from @bash.d):
31                             0
|                              |
SEEEEEEEEMMMMMMMMMMMMMMMMMMMMMMM // S - SIGN, E - EXPONENT, M - MANTISSA
The number is positive, so 0 goes into S.
The exponent, 8, goes into EEEEEEEE but before it goes there you need to add 127 to it as required by the format, so 135 goes there.
Now, of 1.0 only what's to the right of the point is actually stored in MMMMMMMMMMMMMMMMMMMMMMM, so 0 goes there. The 1. is implied for most numbers represented in the format and isn't actually stored in the format.
The idea here is that the absolute values of all nonzero numbers can be transformed into
1.0...1.111(1) * 10^(some integer) (all numbers are binary)
or nearly equivalently
1.0...1.999(9) * 2^(some integer) (all numbers are decimal)
and that's what I did at the top of my answer. The transformation is done by repeated division or multiplication of the number by 2 until you get the mantissa in the decimal range [1.0, 2.0) (or [1.0, 10.0) in binary). Since there's always this 1 in a non-zero number, why store it? And so it's not stored and gives you another free M bit.
So you end up with:
(0 << 31) + ((8 + 127) << 23) + 0 = 1132462080
The format is described here.
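A quick sketch confirming that arithmetic, assuming 32-bit IEEE-754 floats and a 32-bit unsigned int:

#include <stdio.h>

int main(void) {
    /* assemble the encoding by hand: sign 0, biased exponent 8 + 127,
       mantissa 0 (the leading 1. is implicit) */
    unsigned int bits = (0u << 31) | ((8u + 127u) << 23) | 0u;
    printf("%u\n", bits);   /* 1132462080 */

    /* cross-check against the actual bit pattern of 256.0f */
    union { float f; unsigned int u; } v;
    v.f = 256.0f;
    printf("%u\n", v.u);    /* also 1132462080 */
    return 0;
}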
What's important from that quote is that integers/longs and floats are stored in different formats in memory, so you cannot simply take a piece of memory that holds a float, call it an int, and get a correct value.
The specifics of how each data type is stored in memory can be found by searching for the IEEE standard, but again that probably isn't the objective of the quote. What it tries to tell you is that floats and integers are stored using different patterns, and you cannot simply use a float number as an int or vice versa.
While integer and long values are usually represented using two's complement, float values have a special encoding, because a raw bit pattern alone cannot express a fractional value; a convention is needed.
A 32-bit float contains a sign bit, a mantissa, and an exponent. Together these determine the value the float has.
See here for an article.
EDIT
So, this is what a float encoded by IEEE 754 looks like (32-bit)
31                             0
|                              |
SEEEEEEEEMMMMMMMMMMMMMMMMMMMMMMM // S - SIGN, E - EXPONENT, M - MANTISSA
I don't know the pattern for 256.0, but the long value will be purely interpreted as
31                             0
|                              |
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB // B - BIT
So there is no "conversion", but a different interpretation.

A small program for understanding unions in C [duplicate]

Suppose I define a union like this:
#include <stdio.h>

int main() {
    union u {
        int i;
        float f;
    };
    union u tst;
    tst.f = 23.45;
    printf("%d\n", tst.i);
    return 0;
}
Can somebody tell me what the memory where tst is stored will look like?
I am trying to understand the output 1102813594 that this program produces.
It depends on the implementation (compiler, OS, etc.) but you can use the debugger to actually see the memory contents if you want.
For example, in my MSVC 2008:
0x00415748 9a 99 bb 41
is the memory contents. The least significant byte comes first (Intel, little-endian machine), so this is 0x41bb999a, or indeed 1102813594.
Generally, however, the integer and float are stored in the same bytes. Depending on how you access the union, you get the integer or floating point interpretation of those bytes. The size of the memory space, again, depends on the implementation, although it's usually the largest of its constituents aligned to some fixed boundary.
Why is the value what it is in your (or my) case? You should read about floating-point number representation for that (look up IEEE 754).
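If you don't have a debugger handy, here is a sketch that dumps the same bytes (output shown for a little-endian machine; inspecting memory through an unsigned char pointer is always permitted by the aliasing rules):

#include <stdio.h>

int main(void) {
    union { int i; float f; } tst;
    tst.f = 23.45;

    const unsigned char *p = (const unsigned char *)&tst;
    for (size_t k = 0; k < sizeof tst; k++)
        printf("%02x ", p[k]);   /* 9a 99 bb 41 on little-endian */
    printf("\n%d\n", tst.i);     /* 1102813594 */
    return 0;
}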
The result depends on the compiler implementation, but for most x86 compilers, float and int are the same size. Wikipedia has a pretty good diagram of the layout of a 32-bit float, http://en.wikipedia.org/wiki/Single_precision_floating-point_format, which can help to explain 1102813594.
If you print out the int as a hex value, it will be easier to figure out.
printf("%x\n", tst.i);
With a union, both members are stored starting at the same memory location. A float is stored in IEEE format (IEEE 754, as pointed out by others). It is a sign-magnitude, normalized floating point format: the mantissa is always between 1.0 and 10.0 in binary (that is, between 1 and 2 in decimal), and the exponent can be anything in range.
You are taking the 4 bytes of that number (you can look up exactly which bits go where in the 32 bits a float occupies) and reading them as an int. So the result basically means nothing and isn't useful as an int, unless you know why you would want to do something like that; usually a float and int combo like this isn't very useful.
And, no, it is not really implementation-defined in practice: nearly every implementation uses the IEEE 754 format, although the C standard itself does not strictly require it.
In a union, the members share the same memory, so we can read the bits of a float value as an integer value.
The floating-point storage format is different from integer storage, and the union lets us observe the difference.
For example:
If I store the integer value 12 (in 32 bits), I can read those same bits back in floating-point format.
The float is stored as a sign (1 bit), an exponent (8 bits), and a significand (23 bits).
I wrote a little program that shows what happens when you preserve the bit pattern of a 32-bit float into a 32-bit integer. It gives you the exact same output you are experiencing:
#include <iostream>

int main()
{
    float f = 23.45;
    int x = *reinterpret_cast<int*>(&f);  // reinterpret the float's bits as an int
    std::cout << x;                       // 1102813594
}
