Mathematic operation between integer and float in a 32-bit system [duplicate] - c

This question already has answers here:
Why does floating-point arithmetic not give exact results when adding decimal fractions?
(31 answers)
Closed 6 years ago.
On a 32-bit system, I found that the operation below always returns the correct value when a < 2^31 but returns seemingly random results when a is larger.
uint64_t a = 14227959735;
uint64_t b = 32768;
float c = 256.0;
uint64_t d = a - b/ c; // d returns 14227959808
I believe the problem here is that the int-to-float conversion causes undefined behavior, but could someone help explain why it gives this particular value?

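The entire calculation is carried out in float, and the result is then converted to a 64-bit integer. This is not undefined behavior; the conversion simply rounds. A float has only a 24-bit significand, so it cannot exactly represent integers above 2^24 unless they are multiples of a large enough power of two. Here a lies between 2^33 and 2^34, where adjacent floats are 1024 apart, so a is rounded to 14227959808, and subtracting 128 is too small a change to move it to a different float.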

Related

Division in C language [duplicate]

This question already has answers here:
How can I force division to be floating point? Division keeps rounding down to 0?
(11 answers)
C - division doesnt work [duplicate]
(2 answers)
Closed 1 year ago.
Hello everyone, I am a newbie in C and have a basic question. When I divide numbers in C like this:
#include<stdio.h>
int main(void)
{
float a = 15/4;
printf("%.2f", a);
}
the division happens, but the answer comes out as 3.00, which is not correct (the remainder is not counted).
But when I write it like this:
#include<stdio.h>
int main(void)
{
float a = 15;
float b = 4;
float res = a/b;
printf("%.2f", res);
}
this method gives me the correct answer. So I want to ask about the reason for the difference between these two programs: why doesn't the first method work, and why does the second one?
In this expression
15/4
both operands have an integer type (more precisely, the type int), so integer arithmetic is performed.
If at least one operand had a floating-point type (float or double), as for example in
15/4.0
or
15/4.0f
then the result would be a floating-point number (of type double in the first expression and of type float in the second).
And in this expression
a/b
both operands have a floating-point type (the type float), so the result is also a floating-point number.
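When you declare your variable as float, the value computed by the expression is automatically converted to float when it is assigned.
For example:
writing float a = 2/4 is like writing float a = (float)(2/4), so the integer division has already happened before the conversion.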
Take care,
Ori

C pow() function printing gigantic numbers? [duplicate]

This question already has answers here:
c++ pow(2,1000) is normaly to big for double, but it's working. why?
(3 answers)
Closed 1 year ago.
sorry if this is a silly question, I am relatively new to C programming. Thus I have probably misunderstood something fundamental about variables/overflows.
I made this simple program and I can't explain the result
#include <stdio.h>
#include <math.h>
int main() {
double number = pow(2, 1000);
printf("%f\n", number);
}
The result of that (compiled with gcc) is a humongous number that should never fit in a variable of type double: 10715086071862673209484250490600018105614048117055336074437503883703510511249361224931983788156958581275946729175531468251871452856923140435984577574698574803934567774824230985421074605062371141877954182153046474983581941267398767559165543946077062914571196477686542167660429831652624386837205668069376.000000
Usually when I try to assign a constant that is too large for its type, the compiler gives me an error.
So my question is: What is this number? Why am I not getting any errors? Is the variable number actually storing this value? Plainly, what's happening?
Doubles are not stored like integers. This page explains how doubles are represented in memory.
A double contains 1 sign bit, 11 exponent bits, and 52 stored significand bits (53 bits of precision, counting the implicit leading bit). Thus you can store numbers up to about 1.8*10^308, so your number can be represented in a double. Large values are generally stored with limited precision, but 2^1000 is a power of two and is therefore stored exactly. When printing it with %f, you get the decimal expansion of the stored value.
In C, a double can hold values with more than 300 decimal digits, but only about 15 to 17 of those digits are significant. A double is typically stored in 8 bytes and covers normalized magnitudes from about 2.2E-308 to 1.8E+308.
Reference: https://www.tutorialspoint.com/cprogramming/c_data_types.htm

Reading double Value PIC18F67K22 [duplicate]

This question already has an answer here:
Rounding issue when using long long on PIC
(1 answer)
Closed 4 years ago.
#include <stdio.h>   // sprintf

char q[150];

void main(void) {
    System_Initialization();
    UART_Init_2();
    while (1) {
        double A = 23.045610;
        sprintf(q, "%f\r\n", A);   // format A as text
        UART_Tx_2(q);              // send the text over the UART
    }
}
When I read the value of A, it gives 23.045410 instead of 23.045610.
Does anyone know why this happens?
I am using a PIC18F67K22 controller and the XC8 compiler.
On the PIC18, the data types float and double are the same, and at most 32 bits long; depending on the XC8 version and options they may be only 24 bits. A 24-bit format cannot hold more than five or so significant decimal digits (a 32-bit float holds about seven). Therefore you can expect some rounding error at the end of the decimals.

Why did my float get truncated? [duplicate]

This question already has answers here:
C integer division and floor
(4 answers)
Closed 7 years ago.
#include <stdio.h>
int main(void)
{
float c =8/5;
printf("The Result: %f", c);
return 0;
}
The answer is 1.000000. Why isn't it 1.600000?
C is interpreting your 8/5 input as integers. With integers, C truncates it down to 1.
Change your code to 8.0/5.0. That way it knows you're working with real numbers, and it will store the result you're looking for.
The expression
8/5
is an all int expression. So, it evaluates to (int )1
The automatic conversion to float happens in the assignment.
If you convert to float before the divide, you will get the answer you seek:
(float )8/5
or just
8.0/5
An integer constant written without a suffix or decimal point, such as 8 or 5, has type int (or a wider integer type if the value does not fit). Because both operands of the division are integers, C produces an integer result. Integers have no fractional part, so the quotient is truncated toward zero. This throws away the remainder of the division, leaving you with 1 instead of 1.6.
Notice this happens even though you store the result in a float. The expression is evaluated using integer types first, and only the already-truncated result is converted to float.
There are at least two ways to fix this:
Cast part of the expression to double or another type that can store fractional parts:
Here 8 is cast to double, so C performs floating-point division on the operands. Note that if you cast (8 / 5) as a whole, integer division is performed before the cast.
foo = (double) 8 / 5
Use a double as one of the operands:
foo = 8.0/5

Not getting expected value when running a program [duplicate]

This question already has answers here:
What decides what datatype that will be used to store the temporary value in?
(3 answers)
Closed 7 years ago.
#include <stdio.h>
int main(void)
{
    int a=10;
    float b;
    printf("the no\n");
    b=((10-1)/12)*50;   /* integer division: 9/12 is 0 */
    printf("b value is %f",b);
    return 0;
}
But when I calculate the value of b on a scientific calculator I get b=40. My question is why it shows b=0 when I run my code.
In the calculation, the expression ((10-1)/12) is a division of an integer by an integer. The numerator evaluates to 9, and since integer division discards the fractional part, 9/12 evaluates to 0. This 0 is then multiplied by 50 to give b = 0.
To make the division floating point, make one of the constants in the division a floating-point constant. There are several ways to do this; one of the shortest is to write the constant with a decimal point and an "f" suffix, e.g. use 10.0f instead of 10, to get ((10.0f-1)/12). The suffix is only valid on a floating constant, so 10F on its own does not compile. This makes the subtraction a floating-point operation, which in turn makes the division and then the multiplication of your original expression floating-point operations as well. This should then give you the expected (float) result.
There will be a loss of data when you try to perform integer division. As already explained,
b=((10-1)/12)*50;
will yield 0 because :
((10-1)/12)*50; = (9/12)*50; = (0)*50; = 0;
To prevent this data loss, you can do the following:
((10-1)/12.0)*50;
Because 12.0 is a floating-point constant (of type double), floating-point division is performed.
