K&R Exercise 3-4: Negative Numbers Represented In Binary - c

I'm having a hard time understanding this exercise:
In a two's complement number representation, our version of itoa does not
handle the largest negative number, that is, the value of n equal to -(2^(wordsize-1)). Explain why not. Modify it to print that value correctly, regardless of the machine on which it runs.
Here is what the itoa originally looks like:
void reverse(char s[], int n)
{
int toSwap;
int end = n-1;
int begin = 0;
while(begin <= end) // Swap the array in place starting from both ends.
{
toSwap = s[begin];
s[begin] = s[end];
s[end] = toSwap;
--end;
++begin;
}
}
// Converts an integer to a character string.
void itoa(int n, char s[])
{
int i, sign;
if ((sign = n) < 0)
n = -n;
i = 0;
do
{
s[i++] = n % 10 + '0';
} while ((n /= 10) > 0);
if (sign < 0)
s[i++] = '-';
s[i] = '\0';
reverse(s, i);
}
I found this answer, but I don't understand the explanation:
http://www.stevenscs.com/programs/KR/$progs/KR-EX3-04.html
Because the absolute value of the largest negative number a word can hold is greater than that of the largest positive number, the statement early in iota that sets positive a negative number corrupts its value.
Are they saying that negative numbers contain more bits because of the sign than a positive number which has no sign? Why would multiplying by -1 affect how the large negative number is stored?

In two's complement representation, the range of values you can represent with n bits is -2^(n-1) to 2^(n-1)-1. Thus, with 8 bits, you can represent values in the range -128 to 127. That's what's meant by the phrase, "the absolute value of the largest negative number a word can hold is greater than that of the largest positive number."
Illustrating with just 3 bits to make it clearer:
Value Bits
----- ----
0 000
1 001
2 010
3 011
-4 100
-3 101
-2 110
-1 111
With 3 bits, there's no way we can represent a positive 4 in two's complement, so n = -n; won't give us the result we expect [1]. That's why the original itoa implementation above can't deal with INT_MIN.
[1] Behavior on signed integer overflow is undefined, meaning that there's no fixed result.

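The problem is that, if n is the largest negative number, n = -n cannot produce the matching positive value, because a positive number that big is not representable. The negation overflows, which is undefined behavior; on typical two's complement machines n simply stays negative.
A solution can be to hold the positive number in a wider type such as long, provided that type really is larger than int on your machine.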

Related

Long to int truncation problem, truncating exceptions or handling error

Leetcode requires that the input "-91283472332" be converted to int, with an expected result of -2147483648. I use a long to store the result and then return it as int. Why is the returned result -1089159116?
Here's my code:
int myAtoi(char * s){
char *str = s;
long n = 0;
char *flag ;
while(*str){
if( * str =='-' || * str == '+')
{
flag = str++;
continue;
}
if(*str<='9' && *str>='0')
{
n*=10;
n+=(*str++)-48;
continue;
}
if(*str>='A'&&*str<='z')
break;
++str;
}
if(*flag == '-')
{
n-=(2*n);
}
return n;
}
So here's the description
Input: s = "-91283472332"
Output: -2147483648
Explanation:
Step 1:
"-91283472332" (no characters read because there is no leading whitespace)
^
Step 2:
"-91283472332" ('-' is read, so the result should be negative)
^
Step 3:
"-91283472332" ("91283472332" is read in)
^
The parsed integer is -91283472332.
Since -91283472332 is less than the lower bound of the range [-2^31, 2^31 - 1], the final result is clamped to -2^31 = -2147483648.
The value -91283472332 is 0xFFFFFFEABF14C034 in hexadecimal, two's complement.
When it is truncated to 32 bits, the value is 0xBF14C034, which means -1089159116 when interpreted as two's complement.
You should add some conditional branch to return -2147483648 when the value exceeds the limit.
I guess you're doing this problem. Since it requires values outside the range to be clamped to the limits, a.k.a. saturation arithmetic, you'll need to check the value's range like this:
if (n > INT_MAX)
return INT_MAX;
else if (n < INT_MIN)
return INT_MIN;
else
return n;
It's similar to std::clamp(n, INT_MIN, INT_MAX) in C++
You can see that clearly in the requirements (emphasis mine):
If the integer is out of the 32-bit signed integer range [-2^31, 2^31 - 1], then clamp the integer so that it remains in the range. Specifically, integers less than -2^31 should be clamped to -2^31, and integers greater than 2^31 - 1 should be clamped to 2^31 - 1.
Now compare that with the above if blocks
If you cast the value from 64 to 32-bit then it'll reduce the value modulo 2^n:
-91283472332 % 2147483648 = -1089159116,
or in hex: 0xFFFFFFEABF14C034 & 0xFFFFFFFF = 0xBF14C034
Saturation math is common in many areas like digital signal processing or computer graphics
is there a function in C or C++ to do "saturation" on an integer
How to do unsigned saturating addition in C?
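Putting the range check together with the parsing, a saturating conversion might look like the sketch below. It assumes long long is wider than int (true on common platforms but not guaranteed by the standard), and the name atoi_clamped is hypothetical. It stops reading digits once the value is already past the int range, so the accumulator itself cannot overflow on very long inputs.

```c
#include <limits.h>

/* Hypothetical sketch: parse an optional sign and digits, saturating at
   INT_MIN / INT_MAX. Assumes long long is wider than int. */
int atoi_clamped(const char *s)
{
    long long n = 0;
    int negative = 0;

    while (*s == ' ')
        s++;
    if (*s == '-' || *s == '+')
        negative = (*s++ == '-');

    while (*s >= '0' && *s <= '9') {
        n = n * 10 + (*s++ - '0');
        /* Already out of range either way: stop before n can overflow. */
        if (n > (long long)INT_MAX + 1)
            break;
    }
    if (negative)
        n = -n;

    if (n > INT_MAX)
        return INT_MAX;
    if (n < INT_MIN)
        return INT_MIN;
    return (int)n;
}
```

For the input from the question, atoi_clamped("-91283472332") returns INT_MIN instead of a truncated bit pattern.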

Using an unsigned int in a do-while loop

I'm new to coding in c and I've been trying to wrap my head around unsigned integers. This is the code I have:
#include <stdio.h>
int main(void)
{
unsigned int hours;
do
{
printf("Number of hours you spend sleeping a day: ");
scanf(" %u", &hours);
}
while(hours < 0);
printf("\nYour number is %u", hours);
}
However, when I run the code and use (-1) it does not ask the question again like it should and prints out (Your number is 4294967295) instead. If I change unsigned int to a normal int, the code works fine. Is there a way I can change my code to make the unsigned int work?
Appreciate any help!
Is there a way I can change my code to make the unsigned int work?
Various approaches possible.
Read as int and then convert to unsigned.
Given "Number of hours you spend sleeping a day: " implies a small legitimate range about 0 to 24, read as int and convert.
int input;
do {
puts("Number of hours you spend sleeping a day:");
if (scanf("%d", &input) != 1) {
Handle_non_text_input(); // TBD code for non-numeric input like "abc"
}
} while (input < 0 || input > 24);
unsigned hours = input;
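As an alternative to scanf, the same "read as signed, then range-check" idea can be sketched with strtol on a line of text, which makes non-numeric input easy to detect. The helper name parse_hours and the 0..24 range are assumptions taken from the prompt text, not part of any library:

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 and stores the value if text holds an
   integer in 0..24, otherwise returns 0 and leaves *hours untouched. */
int parse_hours(const char *text, unsigned *hours)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(text, &end, 10);

    if (end == text || errno == ERANGE)
        return 0;                 /* no digits, or out of range for long */
    while (*end == ' ' || *end == '\n')
        end++;                    /* allow the trailing newline from fgets */
    if (*end != '\0')
        return 0;                 /* trailing junk such as "12abc" */
    if (value < 0 || value > 24)
        return 0;

    *hours = (unsigned)value;
    return 1;
}
```

A caller would typically read a line with fgets and repeat the prompt while parse_hours returns 0.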
An unsigned int cannot hold negative numbers. In exchange, it can use all of its bits (typically 32) for the magnitude, so its maximum is about twice that of a regular int. When you try to read a negative value into an unsigned int, it is interpreted as a large positive number. Although int and unsigned int usually have the same size, the same bit pattern is interpreted very differently.
I would try the next test:
do {
printf("Number of hours you spend sleeping a day: ");
scanf("%u", &hours);
} while (hours > 24);
Why should it work?
An unsigned int in C is a binary number, typically with 32 bits. That means its max value is 2^32 - 1.
Note that:
2^32 - 1 == 4294967295. That is no coincidence. Negative ints are usually represented using the "Two's complement" method.
A word about that method:
When I use a regular int, its most significant bit is reserved for the sign: 1 if negative, 0 if non-negative. A positive int then holds a 0 in its most significant bit, and 1's and 0's in the remaining positions in the ordinary binary manner.
Negative ints, are represented differently:
Suppose K is a positive number, represented by N bits.
The number (-K) is represented using 1 in the most significant bit, and the POSITIVE NUMBER: (2^(N-1) - K) occupying the N-1 least significant bits.
Example:
Suppose N = 4, K = 7. Binary representation for 7 using 4 bit:
7 = 0111 (The most significant bit is reserved for sign, remember?)
-7 , on the other hand:
-7 = concat(1, 2^(4-1) - 7) == 1001
Another example:
1 = 0001, -1 = 1111.
Note that if we use 32 bits, -1 is 1...1 (altogether we have 32 1's). This is exactly the binary representation of the unsigned int 4294967295. When you use unsigned int, you instruct the compiler to refer to -1 as a positive number. This is where your unexpected "error" comes from.
Now, if you use while(hours > 24), you rule out most of the illegal input. I am not sure, though, whether you rule out all of it. It might be possible to find a negative number that the compiler interprets as a non-negative number in the range [0, 24] when asked to ignore the sign and treat the most significant bit as 'just another bit'.
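The reinterpretation described above can be checked directly. C defines conversion of a negative int to unsigned as adding UINT_MAX + 1, so -K becomes UINT_MAX + 1 - K. The sketch below only covers that int-to-unsigned conversion (what scanf("%u") does with out-of-range text is a separate matter), and the function name as_unsigned is made up for illustration:

```c
#include <limits.h>

/* Hypothetical helper: converting a negative int to unsigned adds
   UINT_MAX + 1, i.e. the value wraps modulo 2^N for an N-bit unsigned. */
unsigned as_unsigned(int value)
{
    return (unsigned)value;
}
```

With a 32-bit unsigned int, as_unsigned(-1) is 4294967295. For a negative int to land in [0, 24] it would have to be at least 2^32 - 24 in magnitude, which no 32-bit int is, so under this conversion the hours > 24 test rejects every negative int.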

Get bits from number string

If I have a number string (char array), one digit is one char, resulting in that the space for a four digit number is 5 bytes, including the null termination.
unsigned char num[] ="1024";
printf("%zu", sizeof(num)); // 5
However, 1024 can be written as
unsigned char binaryNum[2];
binaryNum[0] = 0b00000100;
binaryNum[1] = 0b00000000;
How can the conversion from string to binary be made effectively?
In my program i would work with ≈30 digit numbers, so the space gain would be big.
My goal is to create datapackets to be sent over UDP/TCP.
I would prefer not to use libraries for this task, since the available space the code can take up is small.
EDIT:
Thanks for quick response.
char num = 0b0000 0100 // "4"
--------------------------
char num = 0b0001 1000 // "24"
-----------------------------
char num[2];
num[0] = 0b00000100;
num[1] = 0b00000000;
// num now contains 1024
I would need ≈ 10 bytes to contain my number in binary form. So, if I as suggested parse the digits one by one, starting from the back, how would that build up to the final big binary number?
In general, converting a number in string representation to decimal is easy because each character can be parsed separately. E.g. to convert "1024" to 1024 you can just look at the '4', convert it to 4, multiply by 10, then convert the 2 and add it, multiply by 10, and so on until you have parsed the whole string.
For binary it is not so easy, e.g. you can convert 4 to 100 and 2 to 010 but 42 is not 100 010 or 110 or something like that. So, your best bet is to convert the whole thing to a number and then convert that number to binary using mathematical operations (bit shifts and such). This will work fine for numbers that fit in one of the C++ number types, but if you want to handle arbitrarily large numbers you will need a BigInteger class which seems to be a problem for you since the code has to be small.
From your question I gather that you want to compress the string representation in order to transmit the number over a network, so I am offering a solution that does not strictly convert to binary but will still use fewer bytes than the string representation and is easy to use. It is based on the fact that you can store a number 0..9 in 4 bits, and so you can fit two of those numbers in a byte. Hence you can store an n-digit number in n/2 bytes. The algorithm could be as follows:
Take the last character, '4'
Subtract '0' to get 4 (i.e. an int with value 4).
Strip the last character.
Repeat to get 2
Concatenate into a single byte: digits[0] = (4 << 4) + 2.
Do the same for the next two numbers: digits[1] = (0 << 4) + 1.
Your representation in memory will now look like
4 2 0 1
0100 0010 0000 0001
digits[0] digits[1]
i.e.
digits = { 66, 1 }
This is not quite the binary representation of 1024, but it is shorter and it allows you to easily recover the original number by reversing the algorithm.
You even have 6 values left that you don't use for storing digits (i.e. 1010 and everything above it), which you can use for other things like storing the sign, decimal point, byte order or an end-of-number delimiter.
I trust that you will be able to implement this, should you choose to use it.
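For reference, here is one possible sketch of the two-digits-per-byte idea. It is a variation rather than the exact walkthrough above: it packs from the front of the string, most significant digit first, which is the usual packed-BCD layout. The name pack_bcd and its return convention are assumptions for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: pack a string of decimal digits into BCD, two digits
   per byte, most significant digit first. Returns the number of bytes
   written, or 0 on invalid input or if out is too small. */
size_t pack_bcd(const char *digits, unsigned char *out, size_t out_size)
{
    size_t len = strlen(digits);
    size_t needed = (len + 1) / 2;
    size_t i = 0, o = 0;

    if (len == 0 || needed > out_size)
        return 0;
    for (size_t k = 0; k < len; k++)
        if (digits[k] < '0' || digits[k] > '9')
            return 0;

    /* With an odd digit count, the first byte gets a leading zero nibble. */
    if (len % 2 != 0)
        out[o++] = (unsigned char)(digits[i++] - '0');

    while (i < len) {
        unsigned char high = (unsigned char)(digits[i] - '0');
        unsigned char low = (unsigned char)(digits[i + 1] - '0');
        out[o++] = (unsigned char)((high << 4) | low);
        i += 2;
    }
    return o;
}
```

With this layout "1024" becomes the two bytes 0x10 0x24, and a 30-digit number fits in 15 bytes.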
If I understand your question correctly, you would want to do this:
Convert your string representation into an integer.
Convert the integer into binary representation.
For step 1:
You could loop through the string
Subtract '0' from the char
Multiply by 10^n (depending on the position) and add to a sum.
For step 2 (for int x), in general:
x%2 gives you the least-significant-bit (LSB).
x /= 2 "removes" the LSB.
For example, take x = 6.
x%2 = 0 (LSB), x /= 2 -> x becomes 3
x%2 = 1, x /= 2 -> x becomes 1
x%2 = 1 (MSB), x /= 2 -> x becomes 0.
So we see that (6)decimal == (110)bin.
On to the implementation (for N=2, where N is maximum number of bytes):
int x = 1024;
int n=-1, p=0, p_=0, i=0, ex=1; //you can use smaller types of int for this if you are strict on memory usage
unsigned char num[N] = {0};
for (p=0; p<(N*8); p++,p_++) {
if (p%8 == 0) { n++; p_=0; } //for every 8bits, 1) store the new result in the next element in the array. 2) reset the placing (start at 2^0 again).
for (i=0; i<p_; i++) ex *= 2; //ex = pow(2,p_); without using math.h library
num[n] += ex * (x%2); //add (2^p_ x LSB) to num[n]
x /= 2; // "remove" the last bit to check for the next.
ex = 1; // reset the exponent
}
We can check the result for x = 1024:
for (i=0; i<N; i++)
printf("num[%d] = %d\n", i, num[i]); //num[0] = 0 (0b00000000), num[1] = 4 (0b00000100)
Converting a decimal number of up to 30 digits, represented as a string, into a series of bytes (effectively a base-256 representation) takes up to 13 bytes: the ceiling of 30/log10(256).
Simple algorithm
dest = 0
for each digit of the string (starting with most significant)
dest *= 10
dest += digit
As C code
#include <ctype.h> // isdigit
#include <string.h> // memset
#define STR_DEC_TO_BIN_N 13
unsigned char *str_dec_to_bin(unsigned char dest[STR_DEC_TO_BIN_N], const char *src) {
// dest[] = 0
memset(dest, 0, STR_DEC_TO_BIN_N);
// for each digit ...
while (isdigit((unsigned char) *src)) {
// dest[] = 10*dest[] + *src
// with dest[0] as the most significant digit
int sum = *src++ - '0'; // take the digit and advance to the next character
for (int i = STR_DEC_TO_BIN_N - 1; i >= 0; i--) {
sum += dest[i]*10;
dest[i] = sum % 256;
sum /= 256;
}
// If sum is non-zero, it means dest[] overflowed
if (sum) {
return NULL;
}
}
// If stopped on something other than the null character ....
if (*src) {
return NULL;
}
return dest;
}

Integer range and Overflow

This is a simple program to convert a positive decimal number into binary. I have to report and stop the conversion of numbers that could cause overflow or erroneous results. I found that the size of an integer is 4 bytes, but the program converts correctly only up to 1023.
I am confused about where the number "1023" comes from. Is there a method to calculate it, so that I can predict the correct range if, say, I am programming on another system?
#include<stdio.h>
int main(void)
{
int decimal,binary=0,y,m=1;
scanf("%d",&decimal);
if(decimal<=1023)
{
while(decimal>0)
{
y=decimal%2;
binary=binary+(m*y);
m=m*10;
decimal=decimal/2;
}
printf("\nBinary Equivalent is: %d",binary);
}
else
{printf("Sorry, The Number You've entered exceeds the maximum allowable range for conversion");}
getch();
return 0;
}
1023 is equal to 1024-1 (2^10 - 1), so a number less than or equal to 1023 has at most 10 digits in base 2. Since you are using an int to hold the result, it can store up to 2^31-1 = 2147483647 (31 because one of the 32 bits is used to represent the sign). A number of 1024 or higher needs more than 10 binary digits, and written out as a decimal int those digits exceed 2147483647.
Hope that helps.
Actually, the range of an int is [-2^31, 2^31-1], because it has 4 bytes (i.e., 32 bits). However, if you only want to scanf a non-negative integer, you can declare an unsigned int rather than an int. The range will then be [0, 2^32-1].
The problem is in the temporary variables binary and m that you use. Because 1024 takes 11 divisions to become 0, m has to reach 10,000,000,000. However, the maximum value of an int is 2,147,483,647 (since one bit of the four bytes is used as a sign bit). m will thus overflow, which gives an incorrect result. 1023 or smaller values take 10 divisions or fewer to become 0, so the largest m actually used is 1,000,000,000, which still fits.
You seem to want to take a 4-byte integer and convert it to a binary string of 0s and 1s. The method used in the code fails when the number is > 1023. Here is a conversion that prints the 0s and 1s directly, most significant bit first:
printf("\nBinary Equivalent is: ");
for (int i = 31; i >= 0; i--)
{
printf("%c", ((unsigned)decimal & (1u << i)) ? '1' : '0');
}
printf("\n");
This eliminates much of the code clutter and produces the desired output.
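If the digits are needed as a string rather than printed directly, the same bit-testing approach can fill a buffer and skip the leading zeros. A minimal sketch; the name to_binary_string is hypothetical, and the caller is assumed to supply a buffer of at least sizeof(unsigned) * CHAR_BIT + 1 bytes:

```c
#include <limits.h>

/* Hypothetical helper: write the binary digits of value into buf, most
   significant bit first, without leading zeros. */
void to_binary_string(unsigned value, char *buf)
{
    int bits = (int)(sizeof value * CHAR_BIT);
    int started = 0;
    int len = 0;

    for (int i = bits - 1; i >= 0; i--) {
        int bit = (int)((value >> i) & 1u);
        if (bit)
            started = 1;          /* first 1 bit seen: start emitting */
        if (started)
            buf[len++] = bit ? '1' : '0';
    }
    if (len == 0)
        buf[len++] = '0';         /* value was zero */
    buf[len] = '\0';
}
```

Because the result is built from characters, not packed into a decimal int, the 1023 limit from the question no longer applies.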

I've created a code to convert binary to decimal, but doesn't work with more than 10 bits

I've created a small code to convert binary number to decimal number.
When I enter a binary number of up to 10 bits, the result is correct, but when I use more than 10 bits, the result is wrong.
The algorithm that I used is the following
1 1 0 0 1 0
32 16 8 4 2 1 x
------------------
32+ 16+ 0+ 0+ 2+ 0
The Code:
unsigned long binary, i=0, j=0, result=0, base=1;
unsigned char *binaryStandalone = (unsigned char *)malloc(16);
memset(binaryStandalone, 0, 16);
printf("Enter a binary number: ");
scanf("%u", &binary);
while(binary > 0){
binaryStandalone[i] = binary % 10;
binary = binary / 10;
i++;
}
for(j=0;j<i;j++){
result += (binaryStandalone[j] * 1 << j);
printf("%u = %u\n", j, base << j);
}
printf("The decimal number is: %u\n", result);
free(binaryStandalone);
Now I want to know: why doesn't the code give me the correct result when the binary number has more than 10 bits?
It seems that your platform uses 32 bit for a long int, therefore your binary
variable can hold at most the value 2^32 - 1 = 4294967295, which is sufficient
for 10 digits, but not for eleven.
You could use unsigned long long instead (64 bit would be sufficient for 20 digits), or read the input as a string.
You store the input in an unsigned long, which has the range 0 to 4,294,967,295, so it holds only 10 digits.
Because the long value you're using to store the "binary" value cannot hold more decimal digits than that. You might want to use a string type for input instead.
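Reading the input as a string, as suggested, could look roughly like the sketch below: each character contributes one bit, so the digit-count limit of the decimal-coded approach disappears. The function name parse_binary and its return convention are assumptions for this example.

```c
#include <limits.h>

/* Hypothetical helper: convert a string of '0'/'1' characters to its value.
   Returns 1 on success, 0 on empty input, an invalid character, or a value
   too large for unsigned long long. */
int parse_binary(const char *s, unsigned long long *out)
{
    unsigned long long value = 0;

    if (*s == '\0')
        return 0;
    for (; *s != '\0'; s++) {
        if (*s != '0' && *s != '1')
            return 0;
        if (value > ULLONG_MAX / 2)
            return 0;             /* shifting would lose the top bit */
        value = (value << 1) | (unsigned long long)(*s - '0');
    }
    *out = value;
    return 1;
}
```

The standard library's strtoul(text, &end, 2) does a similar job if linking against it is acceptable.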
