Bizarre math error when doing a basic check in C

For a simple program, the assignment was to create a program that accepts a ten digit phone number, and then reads it back to the user.
There were to be controls to ensure that:
The first digit is not 0.
The number entered is exactly ten digits long.
The error check seemed simple; I thought using a while loop to ensure that the range of the number was between 1000000000 and 9999999999 would work out, and according to independent calculations, it seems it should.
while ((MDN - valueCheck < 0) || (MDN > 9999999999)) {
printf("Entered number is not ten digits. Please re-enter.\n");
scanf("%d", &MDN);
}
Both MDN and valueCheck are long long type variables (so that the range can go past 2,147,483,647; IIRC long long was 64-bit), but they seem to still be listed as 32-bit integers, as entering 2147483647 comes out just fine (or any lower phone number works as well), but entering 2,147,483,648 (or anything above) causes it to be displayed as -2147483647.
Related to the above, entering a higher number, not only does the value wrap around the range of the 32-bit integer, but the phone number printed by the printf statement after the loop is always equal to the entered number minus twice the limit of a 32-bit integer.
Is there any simple way to make the program actually work in 64-bit numbers like I wanted it to? The algorithm seems solid, if I can make the math work properly.

Try scanf("%lld", &MDN); instead of scanf("%d", &MDN);
From man scanf:
ll (ell ell)
Indicates that the conversion will be one of dioux or n and the
next pointer is a pointer to a long long int (rather than int).

Q: Is there any simple way to make the program actually work in 64-bit numbers like I wanted it to?
A: Use int64_t and "%" SCNd64 (from <inttypes.h>; SCNx64 would read hexadecimal input instead of decimal).
If you:
want 32-bit integers, use type int32_t.
want 64-bit integers, use type int64_t.
use int, the range is at least -32767 to +32767.
use long, the range is at least -2147483647 to +2147483647.
use long long, the range is at least -9223372036854775807 to +9223372036854775807.
With scanf() use the matching format specifier:
int "%d"
long "%ld"
long long "%lld"
int32_t "%" SCNd32
int64_t "%" SCNd64

You could've used this:
scanf("%lld", &MDN);
but there are issues with both the coding and the approach itself (see below).
I would advise sticking to a string/regexp input-validation approach instead.
Coding issue:
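scanf() in a loop is vulnerable to the unprocessed-input-buffer issue: if the user enters text instead of a number, scanf() fails, the offending characters stay in stdin, and the loop spins forever.
See "How to clear input buffer in C?" for details. The following change should address it:
while (scanf("%lld", &MDN) != 1) {
    int c; /* int, not char, so EOF can be told apart from real characters */
    while ((c = getchar()) != '\n' && c != EOF)
        ; /* discard the rest of the bad line */
    if (c == EOF)
        break; /* no more input, so stop retrying */
}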
Approach issue:
The approach itself is vulnerable to overflow and type-trimming issues.
For example: "-9999999999999999999" will pass all MDN validations (see below).
Also, there could be compiler-specific issues when comparing a long long (MDN) to an unsuffixed constant (0 or 9999999999 instead of 0LL or 9999999999LL).
Entered number is not ten digits. Please re-enter.
-9999999999999999999
"MDN" is -9223372036854775808
"MDN > 9999999999" is false
"MDN - valueCheck < 0" is false (!) because "MDN - valueCheck" is 9223372035854775808

Related

I need to create a decimal to binary program that can receive input of up to 100,000,000 and output the whole answer without displaying rubbish

As you've read, I created a decimal to binary program and it works well, but it cannot handle user input equal to 100,000,000. My solution is to print each character as it goes, but I do not know what the appropriate loop to use is, and I am also not that great with the math so the main formula to be used is unclear to me. Arrays are not allowed. Any advice is appreciated. Thank you.
#include <stdio.h>
unsigned long long int input,inp,rem=0,ans=0,place_value=1,ans;
int main()
{
printf("\nYou have chosen Decimal to Binary and Octal Conversion!\n");
printf("Enter a decimal number:\n");
scanf("%llu", &input);
inp=input;
while(input){
rem=input%2;
input=input/2;
ans=ans+(rem*place_value);
place_value=place_value*10;
}
printf("%llu in Decimal is %llu in Binary Form.\n", inp,ans);
return 0;
}
Edit: I have already read all your answers and I have done my best to understand them. I was able to understand most of what was brought up but some terms or lessons mentioned will require more time from me to learn. I have already submitted my output without solving the 100,000,000 issue but I intend to use the knowledge I have now to create better outputs. I tried asking a friend of mine and he told me he was able to do it using method 2 found here:https://www.wikihow.com/Convert-from-Decimal-to-Binary. Perhaps my instructor simply wanted to teach us how to fully utilize control structures and data types which is why there are so many restrictions. Thank you all for your time and god bless.

So as the comments have explained, the decimal number 100000000 has the 27-bit binary representation 101111101011110000100000000. We can therefore store that in a 32-bit int with no problem. But if we were to try to store the decimal number 101111101011110000100000000, which just happens to look like a binary number, well, that would require 87 bits, so it won't even fit into a 64-bit long long integer.
And the code in this question does try to compute its result, ans, as a decimal number which just happens to look like a binary number. And for that reason this code can't work for numbers larger than 1048575 (assuming a 64-bit unsigned long long int).
And this is one reason that "decimal to binary" conversion (or, for that matter, conversion to any base) should normally not be done to a result variable that's an integer. Normally, the result of such a conversion — to any base — should either be done to a result variable that's a string, or it should be printed out immediately. (The moral here is that the base only matters when a number is printed out for a human to read, which implies either a string, and/or something printed to, say, stdout.)
However, in C a string is of course an array. So asking someone to do base conversion without using arrays is a perverse, pointless exercise.
If you print the digits out immediately, you don't have to store them in an array. But the standard algorithm — repeated division by 2 (or whatever the base is) generates digits in reverse order, from least-significant to most-significant, which ends up being right-to-left, which is the wrong order to just print them out. Conventional convert-to-digits code usually stores the computed digits into an array, and then reverses the array — but if there's a prohibition against using arrays, this strategy is (again pointlessly) denied to us.
The other way to get the digits out in the other order is to use a recursive algorithm, as @chux has demonstrated in his answer.
But just to be perverse in my own way, I'm going to show another way to do it.
Even though it's generally a horrible idea, constructing the digits into an integer, that's in base 10 but looks like it's in base 2, is at least one way to store things up and get the answer back out with the digits in the right order. The only problem is that, as we've seen, the number can get outrageously big, especially for base 2. (The other problem, not that it matters here, is that this approach won't work for bases greater than 10, since there's obviously no way to construct a decimal number that just happens to look like it's in, say, base 16.)
The question is, how can we represent integers that might be as big as 87 bits? And my answer is, we can use what's called "multiple precision arithmetic". For example, if we use a pair of 64-bit unsigned long long int variables, we can theoretically represent numbers up to 128 bits in size, or 340282366920938463463374607431768211455!
Multiple precision arithmetic is an advanced but fascinating and instructive topic. Normally it uses arrays, too, but if we limit ourselves to just two "halves" of our big numbers, and make certain other simplifications, we can do it pretty simply, and achieve something just powerful enough to solve the problem in the question.
So, to repeat, we're going to represent a 128-bit number as a "high half" and a "low half". Actually, to keep things simpler, it's not actually going to be a 128-bit number. Instead, the "high half" is going to be the first 18 digits of a 36-digit decimal number, and the "low half" is going to be the other 18 digits. This will give us the equivalent of only about 120 bits, but it will still be plenty for our purposes.
So how do we do arithmetic on 36-digit numbers represented as "high" and "low" halves? Actually, it ends up being more or less the same way we learned to do pencil-and-paper arithmetic on numbers represented as digits in the first place.
If I have one of these "big" numbers, in its two halves:
high1 low1
and if I have a second one, also in two halves:
high2 low2
and if I want to compute the sum
  high1 low1
+ high2 low2
  -----------
  high3 low3
the way I do it is to add low1 and low2 to get the low half of the sum, low3. If low3 is less than 1000000000000000000 — that is, if it has 18 digits or less — I'm okay, but if it's bigger than that, I have a carry into the next column. And then to get the high half of the sum, high3, I just add high1 plus high2 plus the carry, if any.
Multiplication is harder, but it turns out for this problem we're never going to have to compute a full 36-digit × 36-digit product. We're only ever going to have to multiply one of our big numbers by a small number, like 2 or 10. The problem will look like this:
  high1 low1
×        fac
  -----------
  high3 low3
So, again by the rules of paper-and-pencil arithmetic we learned long ago, low3 is going to be low1 × fac, and high3 is going to be high1 × fac, again with a possible carry.
The next question is how we're going to carry these low and high halves around. As I said, normally we'd use an array, but we can't here. The second choice might be a struct, but you may not have learned about those yet, and if your crazy instructor won't let you use arrays, it seems that using structures might well be out of bounds, also. So we'll just write a few functions that accept high and low halves as separate arguments.
Here's our first function, to add two 36-digit numbers. It's actually pretty simple:
void long_add(unsigned long long int *hi, unsigned long long int *lo,
unsigned long long int addhi, unsigned long long int addlo)
{
*hi += addhi;
*lo += addlo;
}
The way I've written it, it doesn't compute c = a + b; it's more like a += b. That is, it takes addhi and addlo and adds them in to hi and lo, modifying hi and lo in the process. So hi and lo are passed in as pointers, so that the pointed-to values can be modified. The high half is *hi, and we add in the high half of the number to be added in, addhi. And then we do the same thing with the low half. And then — whoops — what about the carry? That's not too hard, but to keep things nice and simple, I'm going to defer it to a separate function. So my final long_add function looks like:
void long_add(unsigned long long int *hi, unsigned long long int *lo,
unsigned long long int addhi, unsigned long long int addlo)
{
*hi += addhi;
*lo += addlo;
check_carry(hi, lo);
}
And then check_carry is simple, too. It looks like this:
void check_carry(unsigned long long int *hi, unsigned long long int *lo)
{
if(*lo >= 1000000000000000000ULL) {
int carry = *lo / 1000000000000000000ULL;
*lo %= 1000000000000000000ULL;
*hi += carry;
}
}
Again, it accepts pointers to lo and hi, so that it can modify them.
The low half is *lo, which is supposed to be at most an 18-digit number, but if it's got 19 (that is, if it's greater than or equal to 1000000000000000000), that means it has overflowed, and we have to do the carry thing. The carry is the extent by which *lo exceeds 18 digits; it's actually just the top 19th (and any greater) digit(s). If you're not super-comfortable with this kind of math, it may not be immediately obvious that taking *lo, and dividing it by that big number (it's literally 1 with eighteen 0's) will give you the top 19th digit, or that using % will give you the low 18 digits, but that's exactly what / and % do, and this is a good way to learn that.
In any case, having computed the carry, we add it in to *hi, and we're done.
So now we're done with addition, and we can tackle multiplication. For our purposes, it's just about as easy:
void long_multiply(unsigned long long int *hi, unsigned long long int *lo,
unsigned int fac)
{
*hi *= fac;
*lo *= fac;
check_carry(hi, lo);
}
It looks eerily similar to the addition case, but it's just what our pencil-and-paper analysis said we were going to have to do. (Again, this is a simplified version.) We can re-use the same check_carry function, and that's why I chose to break it out as a separate function.
With these functions in hand, we can now rewrite the decimal-to-binary program so that it will work with these even bigger numbers:
int main()
{
unsigned int inp, input;
unsigned long long int anslo = 0, anshi = 0;
unsigned long long int place_value_lo = 1, place_value_hi = 0;
printf("Enter a decimal number:\n");
scanf("%u", &input);
inp = input;
while(input){
int rem = input % 2;
input = input / 2;
// ans=ans+(rem*place_value);
unsigned long long int tmplo = place_value_lo;
unsigned long long int tmphi = place_value_hi;
long_multiply(&tmphi, &tmplo, rem);
long_add(&anshi, &anslo, tmphi, tmplo);
// place_value=place_value*10;
long_multiply(&place_value_hi, &place_value_lo, 10);
}
printf("%u in Decimal is ", inp);
if(anshi == 0)
printf("%llu", anslo);
else printf("%llu%018llu", anshi, anslo);
printf(" in Binary Form.\n");
}
This is basically the same program as in the question, with these changes:
The ans and place_value variables have to be greater than 64 bits, so they now exist as _hi and _lo halves.
We're calling our new functions to do addition and multiplication on big numbers.
We need a tmp variable (actually tmphi and tmplo) to hold the intermediate result in what used to be the simple expression ans = ans + (rem * place_value);.
There's no need for the user's input variable to be big, so I've reduced it to a plain unsigned int.
There's also some mild trickiness involved in printing the two halves of the final answer, anshi and anslo, back out. But if you compile and run this program, I think you'll find it now works for any input numbers you can give it. (It should theoretically work for inputs up to 68719476735 or so, which is bigger than will fit in a 32-bit input inp.)
Also, for those still with me, I have to add a few disclaimers. The only reason I could get away with writing long_add and long_multiply functions that looked so small and simple was that they are simple, and work only for "easy" problems, without undue overflow. I chose 18 digits as the maximum for the "high" and "low" halves because a 64-bit unsigned long long int can actually hold numbers up to the equivalent of 19 digits, and that means that I can detect overflow (of up to one digit) simply, with that >= 1000000000000000000ULL test. If any intermediate result ever overflowed by two digits, I'd have been in real trouble. But for simple additions, there's only ever a single-digit carry. And since I'm only ever doing tiny multiplications, I could cheat and assume (that is, get away with) a single-digit carry there, too.
If you're trying to do multiprecision arithmetic in full generality, for multiplication you have to consider partial products that have up to twice as many digits/bits as their inputs. So you either need to use an output type that's twice as wide as the inputs, or you have to split the inputs into halves ("sub-halves"), and work with them individually, basically doing a little 2×2 problem, with various carries, for each "digit".
Another problem with multiplication is that the "obvious" algorithm, the one based on the pencil-and-paper technique everybody learned in elementary school, can be unacceptably inefficient for really big problems, since it's basically O(N^2) in the number of digits.
People who do this stuff for a living have lots of more-sophisticated techniques they've worked out, for things like detecting overflow and for doing multiplication more efficiently.
And then if you want some real fun (or a real nightmare, full of bad flashbacks to elementary school), there's long division...

OP's code suffers from overflow in place_value*10
One way to satisfy the no-array restriction without running into range limits is to use recursion.
Perhaps beyond where OP is now.
#include <stdio.h>
void print_lsbit(unsigned long long x) {
if (x > 1) {
print_lsbit(x / 2); // Print more significant digits first
}
putchar(x % 2 + '0'); // Print the LSBit
}
int main(void) {
printf("\nYou have chosen Decimal to Binary and Octal Conversion!\n");
printf("Enter a decimal number:\n");
//scanf("%llu", &input);
unsigned long long input = 100000000;
printf("%llu in Decimal is ", input);
print_lsbit(input);
printf(" in Binary Form.\n");
return 0;
}
Output
You have chosen Decimal to Binary and Octal Conversion!
Enter a decimal number:
100000000 in Decimal is 101111101011110000100000000 in Binary Form.

scanf("%d", &i) won't read numbers with more than 10 digits

I am trying to read in an integer of arbitrary size using the %d format modifier for scanf(). When the input integer has 10 or fewer digits, this action performs as expected. But, when I try to input a number with 11 or more digits, scanf("%d", &number) behaves strangely. It seems to return negative values. For example, when 12345678901 is read, the value assigned to number is -539222987.
Does anyone have an idea as to why this might be? Maybe there's a limit to the size of an integer read by scanf("%d", &i)?
Thanks!

On most current platforms, int is 32 bits wide, which can only store numbers between -2147483648 and 2147483647. Any number outside those limits can't be represented.
If you want to store larger numbers, you can look into long long, which is generally 64 bits and can store values between -9223372036854775808 and 9223372036854775807.
You can also make your type unsigned so it won't store negative numbers and will devote all of its values to non-negative numbers.
If you want to have absolute control on the number of bits your data type has, you can use <inttypes.h>. int64_t will give you 64-bits of signed values and uint64_t will give you 64-bits of unsigned values.
You can look at the limits of most data types in <limits.h>

An integer type can only store a number as large as the number of bits that are available in that type. The maximum size can be found in <limits.h>. Essentially a string with an arbitrarily large number of digits can't be stored reliably in a fixed size integer type.
If you can guarantee all input will be small enough to fit you will be fine using a built in fixed width type. However if you need to be able to read in an arbitrarily large integer from a string you may want to look into libraries that exist to do that task. Getting arbitrary precision representations right is a bit complicated so I'd suggest using a library such as GMP, in which you may wish to look at this https://gmplib.org/manual/Formatted-Input-Strings.html#Formatted-Input-Strings

What's wrong with my C code? (Prime factors of a big number)

Could you please help me?
I'm a beginner at C and my code doesn't work.
I'm trying to determine largest prime factor of 600851475143 and when I run the code, it just does nothing. Trying with smaller number however works.
long i;
for (i = 600851475143; i > 1; i--)
{
if (600851475143 % i == 0)
{
printf("%d\n", i);
}
};

It's probably a 32-bit system. The number 600851475143 is bigger than 32 bits.
Instead of long i try:
long long i;
And instead of printf("%d\n", i); try:
printf("%lld\n", i);
And use 600851475143LL in place of 600851475143.

First of all, the correct way to print a long is not %d but %ld (d = decimal, ld = long decimal). If long and int have different sizes on your system (which is not unusual), the results would not print correctly to begin with.
Next possible problem is that 600851475143 is more than fits into a 32 bit variable, yet long is only guaranteed to be at least 32 bit. It may be bigger than that, but only 32 bits are guaranteed. So are you sure that long is big enough on your system?
Try
printf("%zu\n", sizeof(long));
if it says 8, everything is fine, if it says only 4, long is not sufficient and you need to use long long instead (and %lld for printf, lld = long long decimal).
Last but not least, you are aware that your loop needs to do 600 billion iterations, aren't you? Even if you have a very fast system with a very fast CPU this will take quite some time to complete. So you will see 600851475143 printed to the screen immediately, but it will take quite some time before your code terminates (or finds another divisor, in case this is not a prime number).
Optionally:
Instead of writing 600851475143, you may write 600851475143LL to let the compiler know you want this number to be of type long long. Since C99 the suffix is not required (an unsuffixed decimal constant automatically gets type long or long long if it does not fit in int), yet some older compilers at least issued a warning for constants too big for int (or for long).

You can start your loop with the greatest integer less than or equal to the square root of your large number. Then you can find factor pairs working down through the loop. Write a separate function to check whether a given number is prime. If the larger factor of the pair is prime, return it. If the larger is not prime, check if the smaller is prime and if so return it.

How to find the largest prime factor of 600851475143?

#include <stdio.h>
main()
{
long n=600851475143;
int i,j,flag;
for(i=2;i<=n/2;i++)
{
flag=1;
if(n%i==0)//finds factors backwards
{
for(j=2;j<=(n/i)/2;j++)//checks if factor is prime
{
if((n/i)%j==0)
flag=0;
}
if(flag==1)
{
printf("%d\n",n/i);//displays largest prime factor and exits
exit(0);
}
}
}
}
The code above works for n = 6008514751. However, it doesn't work for n = 600851475143, even though that number still is within the range of a long.
What can I do to make it work?

One potential problem is that i and j are int, and could overflow for large n (assuming int is narrower than long, which it often is).
Another issue is that for n=600,851,475,143 your program does quite a lot of work (the largest prime factor is 6857). It is not unreasonable to expect it to take a long time to complete.

Use longs in place of ints. Better still, use uint64_t which has been defined since C99 (acknowledge Zaibis). It is a 64 bit unsigned integral type on all platforms. (The code as you have it will overflow on some platforms).
And now we need to get your algorithm working more quickly:
Your test for prime is inefficient; you don't need to iterate over all the even numbers. Just iterate over primes; up to and equal to the square root of the number you're testing (not half way which you currently do).
Where do you get the primes from? Well, call your function recursively. Although in reality I'd be tempted to cache the primes up to, say, 65536.

From ISO/IEC 9899:TC3
5.2.4.2.1 Sizes of integer types
[...]
Their implementation-defined values shall be equal or greater in magnitude(absolute value) to those shown, with the same sign.
[...]
— minimum value for an object of type long int
LONG_MIN -2147483647 // -(2^31 - 1)
— maximum value for an object of type long int
LONG_MAX +2147483647 // 2^31 - 1
EDIT:
Sorry I forgot to add what this should tell you.
The point is long doesn't even need to be able to hold the value you mentioned, as the standard says it has to be able to hold at least 4 Bytes with sign so it could be possible that your machine is just able to hold values up to 2147483647 in a variable of type long.

On a typical 32-bit machine, long ranges from -2,147,483,648 to 2,147,483,647, and on a typical 64-bit machine its range is from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 (NOTE: This is not mandated by the C standard and may vary from one compiler to another).
As the OP said in a comment that he is on a 32-bit system, 600851475143 is out of range: it does not fit in a long.

Try changing n to long long int .. and change i,j to long
EDIT: define n like this :
long long int n = 600851475143LL;
LL - is a suffix to enforce long long type ...

Integer Overflow

I have an unsigned long long that I use to track volume. The volume is incremented by another unsigned long long. Every 5 seconds I print this value out and when the value reaches the 32 bit unsigned maximum the printf gives me a negative value. The code snippet follows:
unsigned long long vol, vold;
char voltemp[10];
vold = 0;
Later...
while (TRUE) {
vol = atoi(voltemp);
vold += vol;
fprintf(fd2, "volume = %llu);
}
What am I doing wrong? This runs under RedHat 4 2.6.9-78.0.5.ELsmp gcc version 3.4.5

Since you say it prints a negative value, there must be something else wrong, apart from your use of atoi instead of strtoull. A %llu format specifier just doesn't print a negative value.
It strongly looks like the problem is the fprintf call. Check that you included stdio.h and that the argument list is indeed what is in the source code.

Well I can't really tell because your code has syntax errors, but here is a guess:
vol = atoi(voltemp);
atoi converts ascii to integer. You might want to try atol but that only gets it to a long, not a long long.
Your C standard library MIGHT have atoll.

You can't use atoi if the number can exceed the bounds of signed int.
EDIT: atoll (which is apparently standard), as suggested, is another good option. Just keep in mind that limits you to signed long long. Actually, the simplest option is strtoull, which is also standard.

Are you sure fprintf can take in a longlong as a parameter, rather than a pointer to it? It looks like it is converting your longlong to an int before passing it in.

I'd guess the problem is that printf is not handling %llu the way you think it is.
It's probably taking only 32 bits off the stack, not 64.
%llu is only standard since C99. maybe your compiler likes %LU better?

For clarification the fprintf statement was copied incorrectly (my mistake, sorry). The fprintf statement should actually read:
fprintf(fd2, "volume = %llu\n", vold);
Also, while admittedly sloppy, the maximum length of the array voltemp is 9 bytes (digits), which is well within the limits of a 32-bit integer.
When I pull this code out of the program it is part of and run it in a test program I get the result I would expect which is puzzling.

If voltemp is ever really big, you'll need to use strtoull, not atoi.
