Why does adding a float and an int result in a float? - c

I'm currently reading "The C Programming Language - 2nd Edition". In the first chapter, it is explained that an operation on a float and an int results in an int. There is this program:
#include <stdio.h>
int main()
{
float fahr, celsius;
int lower, upper, step;
lower = 0;
upper = 300;
step = 20;
fahr = lower;
while (fahr <= upper)
{
celsius = (5.0/9.0) * (fahr-32.0);
printf("%3.0f\t%6.1f\n", fahr, celsius);
fahr = fahr + step;
}
}
When the line fahr = fahr + step is executed, shouldn't fahr become an int? Does it stay a float because it was declared as a float?

Yes. A variable declared as a float stays a float throughout your code. If you perform an operation on an int and a float and store the result in a float variable, you get a float result. The opposite is also true: if you store the result in an int variable, you lose the fractional part of the number.
You can't change a variable's type in C.

If the book said that, it is wrong. Simple as that!
When you add an integer to a float, you get a float. Furthermore, you assigned the result to a float, so it can't be anything else. Objects don't change type.

Since you declare fahr as a float, any value you assign to it will be converted to float.
Any arithmetic operation between an int and a float will have a float result. This is specified as part of the usual arithmetic conversions:
6.3.1.8 Usual arithmetic conversions
1 Many operators that expect operands of arithmetic type cause conversions and yield result
types in a similar way. The purpose is to determine a common real type for the operands
and result. For the specified operands, each operand is converted, without change of type
domain, to a type whose corresponding real type is the common real type. Unless
explicitly stated otherwise, the common real type is also the corresponding real type of
the result, whose type domain is the type domain of the operands if they are the same,
and complex otherwise. This pattern is called the usual arithmetic conversions:
First, if the corresponding real type of either operand is long double, the other
operand is converted, without change of type domain, to a type whose corresponding real type is long double.
Otherwise, if the corresponding real type of either operand is double, the other
operand is converted, without change of type domain, to a type whose
corresponding real type is double.
Otherwise, if the corresponding real type of either operand is float, the other
operand is converted, without change of type domain, to a type whose
corresponding real type is float.62)
Otherwise, the integer promotions are performed on both operands. Then the
following rules are applied to the promoted operands:
If both operands have the same type, then no further conversion is needed.
Otherwise, if both operands have signed integer types or both have unsigned
integer types, the operand with the type of lesser integer conversion rank is
converted to the type of the operand with greater rank.
Otherwise, if the operand that has unsigned integer type has rank greater or
equal to the rank of the type of the other operand, then the operand with
signed integer type is converted to the type of the operand with unsigned
integer type.
Otherwise, if the type of the operand with signed integer type can represent
all of the values of the type of the operand with unsigned integer type, then
the operand with unsigned integer type is converted to the type of the
operand with signed integer type.
Otherwise, both operands are converted to the unsigned integer type
corresponding to the type of the operand with signed integer type.
62) For example, addition of a double _Complex and a float entails just the conversion of the
float operand to double (and yields a double _Complex result).
C 2011 Online Draft
An arithmetic operation between two ints will yield an int result. For example, 1/2 yields 0, 4/3 yields 1, 7/3 yields 2, etc. If you assign the result of an integer division to a float variable, it will be stored as a float, but you don't get the fractional portion of the result. IOW, given code like
float fahr = 4 / 3;
printf( "%f\n", fahr );
your output will be 1.000000, not 1.333333. If you want a floating-point result, at least one of the operands must be a floating-point type:
float fahr = 4 / 3.0f;
printf( "%f\n", fahr );
will output 1.333333.

+1 For reading that book.
C performs the arithmetic in the higher-ranked of the two operand types, so float wins in your case.

Related

Why does dividing an int by a float result in a float?

Why does this happen?
Is this just a C language thing?
I'm following the cs50 course.
#include <stdio.h>
int main(void)
{
int testInt = 5;
printf("%f", testInt / 4.0);
}
Output is 1.250000 -- float value
When an expression is being evaluated the compiler needs to determine the common type of operands of the expression.
So for this expression
testInt / 4.0
(where 4.0 is a floating constant of type double), the range of values of type double is greater than the range of values of type int, so the compiler converts the int operand to double and performs the operation in double. This is safer than converting the double operand to int, which would truncate its value.
Such conversions are called the usual arithmetic conversions and described in the C Standard.
From the C Standard (6.3.1.8 Usual arithmetic conversions)
Otherwise, if the corresponding real type of either operand is double,
the other operand is converted, without change of type domain, to a
type whose corresponding real type is double.
Is this just a C language thing?
The answer is "because that's how the C language defines the operation."
It is common in many languages to promote an integer to a floating point before doing an operation with another floating point value.
If it didn't work this way, there would be many accidental loss-of-precision (or loss-of-information) bugs.
Why does dividing an int by a float result in a float?
In C, with operators like *, /, + and -, when the 2 operands are of different types, a common one is found by converting the lower ranking one to the higher one. (% follows the same rule but accepts only integer operands.)
int ranks lower than float, so the int operand is converted.
The language could have specified int * float differently - perhaps as some some_type mult_int_by_float(int, float) operation. Yet that approach leads to many combinations and still leaves the type of the result unanswered. Promoting the lesser ranked type is simpler.
A language with N types could need N*N different multiply operations. With C's approach of ranking and conversion, it is more like N different multiply operations.
Is this just a C language thing?
Yes. The TL;DR version is that the operand with the narrower/less precise type is converted to the same type as the operand with the wider/more precise type, and the type of the result is the same as that of the operand with the wider/more precise type. Here's the specific set of rules:
6.3.1.8 Usual arithmetic conversions
Many operators that expect operands of arithmetic type cause conversions and yield result
types in a similar way. The purpose is to determine a common real type for the operands
and result. For the specified operands, each operand is converted, without change of type
domain, to a type whose corresponding real type is the common real type. Unless
explicitly stated otherwise, the common real type is also the corresponding real type of
the result, whose type domain is the type domain of the operands if they are the same,
and complex otherwise. This pattern is called the usual arithmetic conversions:
First, if the corresponding real type of either operand is long double, the other
operand is converted, without change of type domain, to a type whose corresponding real type is long double.
Otherwise, if the corresponding real type of either operand is double, the other
operand is converted, without change of type domain, to a type whose
corresponding real type is double.
Otherwise, if the corresponding real type of either operand is float, the other
operand is converted, without change of type domain, to a type whose
corresponding real type is float.62)
Otherwise, the integer promotions are performed on both operands. Then the
following rules are applied to the promoted operands:
If both operands have the same type, then no further conversion is needed.
Otherwise, if both operands have signed integer types or both have unsigned
integer types, the operand with the type of lesser integer conversion rank is
converted to the type of the operand with greater rank.
Otherwise, if the operand that has unsigned integer type has rank greater or
equal to the rank of the type of the other operand, then the operand with
signed integer type is converted to the type of the operand with unsigned
integer type.
Otherwise, if the type of the operand with signed integer type can represent
all of the values of the type of the operand with unsigned integer type, then
the operand with unsigned integer type is converted to the type of the
operand with signed integer type.
Otherwise, both operands are converted to the unsigned integer type
corresponding to the type of the operand with signed integer type.
62) For example, addition of a double _Complex and a float entails just the conversion of the
float operand to double (and yields a double _Complex result).
C 2011 Online Draft
If both operands are integers, then the result is an integer. If either operand is a floating-point type, then the result has a floating-point type.

C arithmetic conversion multiplying unsigned with signed and result in float

int main()
{
printf("Hello World\n");
int x = -10;
unsigned y = 25;
float z = x*y;
printf("x=%d,y=%u,z=%f\n",x,y,z);
return 0;
}
When I run the above code, I get the following output:
Hello World
x=-10,y=25,z=4294967046.000000
My question is:
For the second printf, I would have expected z=(float) ( (unsigned)(-10)*25 ) = (float) (4294967286 x 25) = (float) 107374182150, what am I missing here?
Here's what's happening. As per C11 6.3.1.8 Usual arithmetic conversions (the "otherwise" comes into play here since previous paragraphs discuss what happens when either type is already floating point):
Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
This means your signed value of -10 becomes an unsigned value of 0xffff'fff6, or 4,294,967,286. Multiplying that by 25 gives 107,374,182,150 or 0x18'ffff'ff06 (which is the result you want).
However, at this point, no float calculations have been done, the multiplication is a pure integer calculation and the resultant value will be an integer. And that, combined with the fact your unsigned integers are 32 bits long, means it gets truncated to 0xffff'ff06, or 4,294,967,046.
Then you put that into the float.
To fix this to match your expected results, you should change the expression to force this:
float z = 1.0f * (unsigned)x * y;
This changes the int * unsigned-int calculation into a float * unsigned-int * unsigned-int one. The unsigned cast first ensures x will be converted to the equivalent unsigned value, and the multiplication by 1.0f ensures the multiplications are done in the float arena to avoid integer truncation.
Following on from the correct answer from @paxdiablo, the starting point for the result is that unsigned has a rank equal to the rank of int, e.g.
The rank of any unsigned integer type shall equal the rank of the
corresponding signed integer type, if any. C11 Standard - 6.3.1 Arithmetic operands (p1)
This comes into play with the integer conversion cited in @paxdiablo's answer:
6.3.1.8 Usual arithmetic conversions
Otherwise, the integer promotions are performed on both operands. Then the following rules are applied to the promoted operands:
Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
The problem is that negative values such as -10 are stored (in almost all computers) in two's complement. In two's complement, the value for -10 is formed by taking the bitwise NOT of 10 and adding 1 (so in binary 00001010 becomes 11110110, sign-extended to 32 bits). That is:
11111111111111111111111111110110
The unsigned value of this bit pattern is 4294967286. When multiplied by 25, the product exceeds the range of unsigned, so it is reduced modulo UINT_MAX + 1 (2^32 here) until it fits within the range of unsigned, resulting in 4294967046. See 6.2.5 Types (p9).
What Am I Missing?
The part that is missing is that the result of the unsigned multiplication is converted to float only when it is assigned. The intermediate result of x * y is unsigned. float f = x * y; is just an assignment of that result to a float.
What you want is for the intermediate calculation to be done as a float, so cast one of the operands (not the result) to float, e.g.
float f = (float)x * y;
It does not matter which of the two values is cast to float, the following would be just fine:
float f = x * (float)y;
Now the result will be -250.

What is the standardized way to calculate float with integers?

Do any of you know how this will be calculated in C?
uint8_t samplerate = 200;
uint8_t Result;
Result = 0.5 * samplerate;
Now, the problem is that 0.5 is a float and samplerate an integer.
Result could then be either 0, because 0.5 is converted to an integer and therefore rounded to 0 (Result = 0 * 200 = 0). Or Result could be 100, because the compiler sees 0.5 first and converts samplerate into float (Result = 0.5 * 200 = 100).
Is there a standardized way for the compiler to handle these calculations?
I mean, will the compiler look at the operand on the very left (in this case 0.5) first and convert the other to its type, or will it look at the operand on the very right (samplerate) and convert the other operands to that type?
I know how I could solve this problem, but I am looking for a general answer: is this standardized in C, and how will such expressions be calculated?
When numeric values of various types are combined in an expression, they are subject to the usual arithmetic conversions, a set of rules that dictates which operand should be converted and to what type.
These conversions are spelled out in section 6.3.1.8 of the C standard:
Many operators that expect operands of arithmetic type cause
conversions and yield result types in a similar way. The purpose is
to determine a common real type for the operands and result. For the
specified operands, each operand is converted, without change of type
domain, to a type whose corresponding real type is the
common real type. Unless explicitly stated otherwise, the
common real type is also the corresponding real type of the
result, whose type domain is the type domain of the operands
if they are the same, and complex otherwise. This pattern is
called the usual arithmetic conversions:
First, if the corresponding real type of either operand is long double, the other operand is converted, without change of type domain, to a type whose corresponding real type is long double.
Otherwise, if the corresponding real type of either operand is double, the other operand is converted, without change of type domain, to a type whose corresponding real type is double.
Otherwise, if the corresponding real type of either operand is float, the other operand is converted, without change of type domain, to a type whose corresponding real type is float.
Otherwise, the integer promotions are performed on both operands. Then the following rules are applied to the promoted operands:
If both operands have the same type, then no further conversion is needed.
Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank.
Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.
Otherwise, both operands are converted to the unsigned integer type corresponding to the type of the operand with signed integer type.
Note in particular the paragraph beginning "Otherwise, if the corresponding real type of either operand is double", which is what applies in your case.
The floating point constant 0.5 has type double, so the value of other operand is converted to type double, and the result of the multiplication operator * has type double. This result is then assigned back to a variable of type uint8_t, so the double value is converted to this type for assignment.
So in this case Result will have the value 100.
Yes, of course this is controlled by the standard, there is no uncertainty here.
Basically the integer will be promoted to double (since the type of 0.5 is double, it's not float) and the computation will happen there, then the result will be truncated back down to uint8_t. The compiler will shout at you for the loss of precision, typically. If it does not, add more warning options as required.
Yes, there is a standard. In this case, the integer operand is automatically converted to the floating-point type of the other operand, so your expression will be evaluated as follows:
(0.5: double) * (200: uint8_t) => (0.5: double) * (200.0: double) == (100.0: double)
uint8_t Result = (100.0: double) => (100: uint8_t) // this is an implicit conversion, because Result is of type uint8_t
The floating-point type wins regardless of its size in bytes, so (200: uint8_t) is converted to (200.0: double). This conversion doesn't lose information since double can represent every value of uint8_t.

Conversion to float in uint32_t calculation

I am trying to improve a SWRTC by modifying the definition of a second (long unsigned n_ticks_per_second) by synchronizing time with a server.
#include <stdint.h>
#include <stdio.h>
int main(int argc, char * argv[]){
int32_t total_drift_SEC;
int32_t drift_per_sec_TICK;
uint32_t at_update_posix_time = 1491265740;
uint32_t posix_time = 1491265680;
uint32_t last_update_posix_time = 1491251330;
long unsigned n_ticks_per_sec = 1000;
total_drift_SEC = (posix_time - at_update_posix_time);
drift_per_sec_TICK = ((float) total_drift_SEC) / (at_update_posix_time - last_update_posix_time);
n_ticks_per_sec += drift_per_sec_TICK;
printf("Total drift sec %d\r\n", total_drift_SEC);
printf("Drift per sec in ticks %d\r\n", drift_per_sec_TICK);
printf("n_ticks_per_second %lu\r\n", n_ticks_per_sec);
return 0;
}
What I don't understand is why I need to cast total_drift_SEC to float in order to get a correct result, i.e. to have n_ticks_per_sec equal to 1000 in the end.
The output of this code is:
Total drift sec -60
Drift per sec in ticks 0
n_ticks_per_second 1000
Whereas the output of the code without the cast to float is:
Total drift sec -60
Drift per sec in ticks 298054
n_ticks_per_second 299054
This line
drift_per_sec_TICK = total_drift_SEC / (at_update_posix_time - last_update_posix_time);
divides a 32 bit signed int by a 32 bit unsigned int.
A 32 bit unsigned int has the same rank as a 32 bit signed int, so the "greater or equal" rule quoted below applies.
When doing arithmetic operations the "Usual Arithmetic Conversions" are applied:
From the C11 Standard (draft) 6.3.1.8/1:
if the operand that has unsigned integer type has rank greater or
equal to the rank of the type of the other operand, then the operand with
signed integer type is converted to the type of the operand with unsigned
integer type.
So -60 gets converted to a (32 bit) unsigned int: 4294967236
Here
drift_per_sec_TICK = (float) total_drift_SEC / (at_update_posix_time - last_update_posix_time);
The following applies (from the paragraph of the C Standard as above):
if the corresponding real type of either operand is float, the other
operand is converted, without change of type domain, to a type whose
corresponding real type is float.
To avoid stepping blindly into those traps, always specify -Wconversion when compiling with GCC.
Because in the "integer" version total_drift_SEC is converted to unsigned, so -60 --> 4294967236
4294967236 / 14410 = 298054
Using float the division will calculate:
-60 / 14410 = -0.0042 (approximately), which becomes 0 when it is converted to int32_t for the assignment
Referring to the C standard at page 53
6.3.1.8 Usual arithmetic conversions
1 Many operators that expect operands of arithmetic type cause conversions and yield result
types in a similar way. The purpose is to determine a common real type for the operands
and result. For the specified operands, each operand is converted, without change of type
domain, to a type whose corresponding real type is the common real type. Unless
explicitly stated otherwise, the common real type is also the corresponding real type of
the result, whose type domain is the type domain of the operands if they are the same,
and complex otherwise. This pattern is called the usual arithmetic conversions:
[...]
Otherwise, the integer promotions are performed on both operands. Then the
following rules are applied to the promoted operands:
If both operands have the same type, then no further conversion is needed.
Otherwise, if both operands have signed integer types or both
have unsigned integer types, the operand with the type of lesser
integer conversion rank is converted to the type of the operand with
greater rank.
Otherwise, if the operand that has unsigned integer type has rank
greater or equal to the rank of the type of the other operand, then
the operand with signed integer type is converted to the type of the
operand with unsigned integer type.
Otherwise, if the type of the operand with signed integer type can
represent all of the values of the type of the operand with unsigned
integer type, then the operand with unsigned integer type is
converted to the type of the operand with signed integer type.
Otherwise, both operands are converted to the unsigned integer type
corresponding to the type of the operand with signed integer type.
Emphasis mine

Typecasting integers as doubles to get a double result from division of the typecast integers

I do not understand the underlying reason the output is a double in the following examples. Specifically:
Why does a double divided by an int result in a double?
Why does an int divided by a double result in a double?
#include <stdio.h>
int main(int argc, char **argv)
{
double d;
int a=5,b=2;
d = (double)a/b;
printf("d= %G\n",d); // outputs 2.5
d = a/(double)b;
printf("d= %G\n",d); // outputs 2.5
}
From the C standard, section 6.3.1.8: Usual arithmetic conversions:
First, if the corresponding real type of either operand is long double,
the other operand is converted, without change of type domain, to a
type whose corresponding real type is long double.
Otherwise, if the corresponding real type of either operand is double, the other
operand is converted, without change of type domain, to a type whose
corresponding real type is double.
Otherwise, if the corresponding real type of either operand is float,
the other operand is converted, without change of type domain, to a
type whose corresponding real type is float.
Otherwise, the integer promotions are performed on both operands.
So if one operand to an arithmetic operator is int and the other is double, the standard states that the resulting expression has type double.
The cast has precedence over the division, and an operation between a double and an int will produce a double.
When one of the operands of a division has type double, the other is automatically converted to double for the calculation. In your example this happens because an explicit cast has higher operator precedence than division.
If you want to only cast the result of the division, you can do:
d = (double)(a/b);
This ensures the integer division is performed first and the explicit cast to double is performed second.
For additional context to @dbush's excellent answer, it is important to note that for all arithmetic conversions between differing types, the standard specifies conversion from the smaller of the two types to the larger:
Summarized from C11 - 6.3.1.8 Usual arithmetic conversions:
1st: if real type of either operand is long double, the other is converted to long double
2nd: Otherwise if real type of either operand is double, the other is converted to double
3rd: Otherwise if real type of either operand is float, the other is converted to float
And it goes on to specify how integer promotions are made in similar fashion...

Resources