When I use the % operator on float values, I get the error "invalid operands to binary % (have ‘float’ and ‘double’)". I only want to enter integer values, but the numbers are very large (outside the range of the int type), so I use float to work around that. Is there any way to use the % operator on such large integer values?
You can use the fmod function from the standard math library. Its prototype is in the standard header <math.h>.
You're probably better off using long long, which has greater precision than double in most systems.
Note: If your numbers are bigger than a long long can hold, then fmod probably won't behave the way you want it to. In that case, your best bet is a bigint library, such as this one.
The % operator is only defined for integer type operands; you'll need to use the fmod* library functions for floating-point types:
#include <math.h>
double fmod(double x, double y);
float fmodf(float x, float y);
long double fmodl(long double x, long double y);
When I haven't had easy access to fmod or other libraries (for example, doing a quick Arduino sketch), I find that the following works well enough:
float someValue = 0.0;
// later...
// Since someValue = (someValue + 1) % 256 won't work for floats...
someValue += 1.0; // (or whatever increment you want to use)
while (someValue >= 256.0){
someValue -= 256.0;
}
Assume int is 32 bits and long long int is 64 bits.
Yes, the % (modulo) operator does not work with float and double. If you want to do the modulo operation on large numbers, try long long int (64 bits); that might help you.
If the range is still greater than 64 bits, you need to store the data in a string and do the modulo operation algorithmically.
Alternatively, you can switch to a scripting language like Python.
If you want to use an integer, use long long; don't use a format that is non-ideal for your problem if a better format exists.
I am implementing a fractional delay line algorithm.
One of the tasks involved is the decomposition of a floating-point value into its integral and fractional part.
I know there are a lot of posts about this topic on SO and I probably read most of them.
However I haven’t found one post that deals with the specifics of this scenario.
The algorithm must be using 64-bit floating-point values.
Input floating-point values are guaranteed to always be positive. (delay times cannot be negative)
The output integer part has to be represented by an integer datatype.
The integer datatype must have enough bits so that the double-to-integer conversion occurs without the risk of overflowing.
Issues resulting from floating-point values lacking an exact internal representation must be avoided.
(i.e. 9223372036854775809.0 might be internally represented as 9223372036854775808.9999998 and when cast to integer it erroneously becomes 9223372036854775808)
The implementation should work regardless of rounding mode or compiler optimization settings.
So I wrote a function:
double my_modf(double x, int64_t *intPartOut);
As you can see its signature is similar to the modf() function in the C standard library.
The first implementation I came up with is:
double my_modf(double x, int64_t *intPartOut)
{
double y;
double fracPart = modf(x, &y);
*intPartOut = (int64_t)y;
return fracPart;
}
I have also been experimenting with this implementation which - at least on my machine - runs faster than the previous, however I doubt its robustness.
double my_modf(double x, int64_t *intPartOut)
{
int64_t y = (int64_t)x;
*intPartOut = y;
return x - y;
}
...and this is my latest attempt:
double my_modf(double x, int64_t *intPartOut)
{
*intPartOut = llround(x);
return x - floor(x);
}
I can't make up my mind as to which implementation would be best to use, or if there are other implementations that I haven't considered that would better accomplish the following goals.
I am looking for the (1) most robust and (2) most efficient implementation to decompose a floating-point number into its integral and fractional part, keeping into consideration the list of points mentioned above.
Given that the maximum value of the integer part of the floating-point input x is 2^63−1 and that x is non-negative, then both:
double my_modf(double x, int64_t *intPartOut)
{
double y;
double fracPart = modf(x, &y);
*intPartOut = y;
return fracPart;
}
and:
double my_modf(double x, int64_t *intPartOut)
{
int64_t y = x;
*intPartOut = y;
return x - y;
}
will correctly return the integer part in intPartOut and the fractional part in the return value regardless of rounding mode.
GCC 9.2 for x86_64 does a better job optimizing the latter version, and so does Apple Clang 11.0.0.
llround will not return the integer part as desired because it rounds to the nearest integer rather than truncating.
Issues about x containing errors cannot be resolved with the information provided in the question. The routines shown above have no error; they return exactly the integer and fractional parts of their input.
Updated answer after reading your comment below.
If you are already sure the values are within [0, 2^63-1] then a simple cast will be faster than llround() since this function may also check for overflow (on my system, the manual page states so, however the C standard does not require it).
On my machine for example (x86-64 Nehalem) casting is a single instruction (cvttsd2si) and llround() is obviously more than one.
Am I guaranteed to get the right result with a simple cast (truncation) or is it safer to round?
Depends on what you mean with "right". If the value in the double can be correctly represented by an int64_t, then sure you're going to get exactly the same value. However, if the value cannot be precisely represented by the double then truncation is automatically performed when casting. If you want to round the value in a different way that's another story and you'll have to use one of ceil(), floor() or round().
If you also are sure that no values will be +/- Infinity or NaN (and in that case you can use -Ofast), then your second implementation should be the fastest if you want truncation, while the third should be the fastest if you want to floor() the value.
When I use the pow() function, sometimes the results are off by one. For example, this code produces 124, but I know that 5³ should be 125.
#include<stdio.h>
#include<math.h>
int main(){
int i = pow(5, 3);
printf("%d", i);
}
Why is the result wrong?
Your problem is that you are mixing integer variables with floating-point math. My bet is that the result of 5^3 is something like 124.999999 due to rounding, and when it is converted to an integer variable it gets truncated to 124.
There are 3 ways to deal with this:
1. More safely mix floating math and integers
int x=5,y=3,z;
z=floor(pow(x,y)+0.5);
// or
z=round(pow(x,y));
Using this always carries a risk of rounding errors affecting the result, especially for higher exponents.
2. Compute on floating variables only
So replace int with float or double. This is a bit safer than #1, but in some cases it is not usable (it depends on the task), and it may need an occasional floor, ceil, or round along the way to get the wanted result.
3. Use integer math only
This is the safest way (unless you cross the int limit). pow can be computed with integer math relatively easily; see:
Power by squaring for negative exponents
pow(x, y) is most likely implemented as exp(y * log(x)): modern CPUs can evaluate exp and log in a couple of flicks of the wrist.
Although adequate for many scientific applications, when truncating the result to an integer, the result can be off for even trivial arguments. That's what is happening here.
Your best bet is to roll your own version of pow for integer arguments; i.e. find one from a good library. As a starting point, see The most efficient way to implement an integer based power function pow(int, int)
Use Float Data Type
#include <stdio.h>
#include <math.h>
int main()
{
float x=2;
float y=2;
float p= pow(x,y);
printf("%f",p);
return 0;
}
You can use this function instead of pow:
long long int Pow(long long int base, unsigned int exp)
{
if (exp > 0)
return base * Pow(base, exp-1);
return 1;
}
printf("Percent decrease: ");
printf("%.2f", (float)((orgChar-codeChar)/orgChar));
I'm using this statement to print some results to my command console, however, I end up with zero. Putting the equation into another variable doesn't work either.
orgChar = 91 and codeChar = 13, how do I print out this equation?
Integer division yields 0 here, and you only cast the result to float afterwards, so you end up with 0.
Convert either operand to float before the division:
(orgChar-codeChar)/(float)orgChar
As others have mentioned, the subtraction and division are done using integer math before the cast to (float). By that point, the integer division has a truncated result of 0. Instead:
// (float)((orgChar-codeChar)/orgChar)
((float) orgChar - codeChar)/orgChar
// or
(orgChar - codeChar)/ (float) orgChar
As the float argument gets converted to double as part of the "usual argument promotion" of arguments to a variadic function like printf(), might as well do
printf("%.2f", (orgChar-codeChar)/ (double) orgChar);
Casting, in general, should be avoided. Some casts unintentionally narrow the operation. In the first line below, if unsigned is 32-bit and a1 is uint64_t, then a1 is narrowed before the shift and unexpected results may occur. If a1 were a char, it would be converted to unsigned without trouble.
The second method, multiplying by 1u, does not narrow. It ensures that a2*1u is at least the width of an unsigned.
unsigned sh1 = (unsigned) a1 >> b1; // avoid
unsigned sh2 = a2*1u >> b2; // better
So rather than (float) or (double), I recommend the idiom of multiplying by 1.0:
printf("%.2f", (orgChar - codeChar) * 1.0 / orgChar);
You don't need to cast the whole expression. You can simply cast either the numerator or the denominator to get a floating-point result, which can then be printed with 2 decimal places.
For example:
In this code, defining a variable c as float doesn't guarantee that the result is a float. To get the precise result, you need to cast either the numerator or the denominator.
You shouldn't need to cast to float at all. Simply make sure both variables are of type float or double before attempting to print them as floats. This means either declaring the variables as floats, or using the correct function, such as atof () when converting the data to floats (normally this is done when you get the data from the command-line or a file.)
This should work...
#include <stdio.h>
int
main (void)
{
float orgChar = 91;
float codeChar = 13;
printf ("%.2f\n", (orgChar - codeChar) / orgChar);
return 0;
}
I was solving this problem on spoj http://www.spoj.com/problems/ATOMS/. I had to give the integral part of log(m / n) / log(k) as output. I had taken m, n, k as long long. When I was calculating it using long doubles, I was getting a wrong answer, but when I used float, it got accepted.
printf("%lld\n", (long long)(log(m / (long double)n) / log(k)));
This was giving a wrong answer but this:
printf("%lld\n", (long long)((float)log(m / (float)n) / (float)log(k)));
got accepted. So are there situations when float is better than double with respect to precision?
A float is never more accurate than a double since the former must be a subset of the latter, by the C standard:
6.2.5/6: "The set of values of the type float is a subset of the set of values of the type double; the set of values of the type double is a subset of the set of values of the type long double."
Note that the standard does not insist on a particular floating point representation although IEEE754 is particularly common.
It might be better in some cases in terms of calculation time/space performance. One example that is just on the table in front of me - an ARM Cortex-M4F based microcontroller, having a hardware Floating Point Unit (FPU), capable of working with single-precision arithmetic, but not with double precision, which is giving an incredible boost to floating point calculations.
Try this simple code :
#include<stdio.h>
int main(void)
{
float i=3.3;
if(i==3.3)
printf("Equal\n");
else
printf("Not Equal\n");
return 0;
}
Now try the same with double as a datatype of i.
double will always give you more precision than a float.
With double, you encode the number using 64 bits, while you're using only 32 bits with float.
Edit: As Jens mentioned it may not be the case. double will give more precision only if the compiler is using IEEE-754. That's the case of GCC, Clang and MSVC. I haven't yet encountered a compiler which didn't use 32 bits for floats and 64 bits for doubles though...
long long int A = 3289168178315264;
long long int B = 1470960727228416;
double D = sqrt(5);
long long int out = A + B*D;
printf("%lld",out);
This gives result : -2147483648
I am not able to figure out why (it should be a positive result).
Can somebody help?
maybe you have to specify those constants as "long long" literals? e.g. 3289168178315264LL
What compiler/operating system are you using? I ran your code using Visual C++ 2008 Express Edition on Windows XP and IT WORKS - answer: 6578336356630528 (this is a 53-bit number, so it just fits inside a double).
I also tried two variations to see if the order of operations mattered:
long long int out = A;
out+=B*D;
long long int out = B*D;
out+=A;
These both work as well!
Curious.
My guess is that the compiler needs to round the result from "A+B*D" to an integer first, because you're storing the result inside an int field. So basically, you're having a datatype conflict.
Both A and B are still valid numbers for a long long int, which is 8 bytes long. You could even multiply them by 1.000 and still have valid long long int values. In some other languages it's also known as the int64.
A double, however, is also 64 bits, but part of those are used as the exponent. When you multiply a double by an int64, the result is another double. Adding another int64 to it still keeps it a double. Then you're assigning the result to an int64 again without using a specific rounding method. It would not surprise me if the compiler used a 4-bit rounding function for this. I'm even amazed that the compiler doesn't puke and break on that statement!
Anyways, when using large numbers like these, you have to be extra careful when mixing different types.
Your answer (I have to verify this) calculates successfully; however, it causes an overflow into the sign bit, which makes the answer negative. Solution: make all your variables unsigned.
Why:
Numbers are stored as a series of bits in your computer's memory. The first bit in such a series, when set, means that your number is negative. So the calculation works, but overflows into the sign bit.
Recommendation:
If you're working with numbers this big, I recommend you to get a multiprecision arithmetic library. 'T will save you a lot of time and trouble.
The parameter to sqrt should be double.
#include <math.h>
double sqrt( double num );
And we should also explicitly cast the result of B * D to long long.
long long int A = 3289168178315264;
long long int B = 1470960727228416;
double D = sqrt(5.0);
printf("%f\n",D);
long long int out = A + (long long) (B * D);
printf("%lld",out);