Why is the range of any data type greater on the negative side than on the positive side?
For example, in case of integer:
In Turbo C its range is -32768 to 32767 and for Visual Studio it is -2147483648 to 2147483647.
The same happens to other data types.
Because of how numbers are stored. Signed numbers are stored using something called "two's complement notation".
Remember all variables have a certain amount of bits. If the most significant one of them, the one on the left, is a 0, then the number is non-negative (i.e., positive or zero), and the rest of the bits simply represent the value.
However, if the leftmost bit is a 1, then the number is negative. The real value of the number can be obtained by subtracting 2^n from the whole number represented (as an unsigned quantity, including the leftmost 1), where n is the amount of bits the variable has.
Since only n - 1 bits are left for the actual value (the "mantissa") of the number, the possible combinations are 2^(n - 1). For positive/zero numbers, this is easy: they go from 0, to 2^(n - 1) - 1. That -1 is to account for zero itself -- for instance, if you only had four possible combinations, those combinations would represent 0, 1, 2, and 3 (notice how there's four numbers): it goes from 0 to 4 - 1.
For negative numbers, remember the leftmost bit is 1, so the whole number represented goes between 2^(n - 1) and (2^n) - 1 (parentheses are very important there!). However, as I said, you have to take 2^n away to get the real value of the number. 2^(n - 1) - 2^n is -(2^(n - 1)), and ((2^n) - 1) - 2^n is -1. Therefore, the negative numbers' range is -(2^(n - 1)) to -1.
Put all that together and you get -2^(n - 1) to 2^(n - 1) - 1. As you can see, the upper bound gets a -1 that the lower bound doesn't.
And that's why there's one more negative number than positive.
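As a quick check of that derivation, here is a minimal C sketch (assuming a 16-bit short, as on most platforms) showing both the resulting range and the "subtract 2^n" rule:

#include <limits.h>
#include <stdio.h>

int main(void) {
    /* n = 16, so the range is -2^15 to 2^15 - 1 */
    printf("short: %d to %d\n", SHRT_MIN, SHRT_MAX);   /* -32768 to 32767 */
    /* the bit pattern of -1, read as unsigned, is 2^16 - 1 = 65535,
       and 65535 - 65536 = -1, exactly as described above */
    unsigned short u = (unsigned short)-1;
    printf("bits of -1, read unsigned: %u\n", (unsigned)u);
    return 0;
}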
The minimum range required by C is actually -32767 through 32767, because it has to cater for two's complement, ones' complement and sign/magnitude encoding for negative numbers, all of which the C standard allows. See Annex E, Implementation limits of C11 (and C99) for details on the minimum ranges for data types.
Your question pertains only to the two's complement variant and the reason for that is simple. With 16 bits, you can represent 2^16 (or 65,536) different values and zero has to be one of those. Hence there is an odd number of values left, of which the majority (by one) are negative values:
1 thru 32767 = 32767 values
0 = 1 value
-1 thru -32768 = 32768 values
-----
65536 values
Both ones' complement and sign-magnitude encoding allow for a negative-zero value (as well as positive-zero), meaning that one less bit pattern is available for the non-zero numbers, hence the reduced minimum range you find in the standard.
1 thru 32767 = 32767 values
0 = 1 value
-0 = 1 value
-1 thru -32767 = 32767 values
-----
65536 values
Two's complement is actually a nifty encoding scheme because positive and negative numbers can be added together with the same simple hardware. Other encoding schemes tend to require more elaborate hardware to do the same task.
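To illustrate that point, here is a small sketch showing one and the same addition serving both interpretations of the bits (the casts from uint16_t to int16_t assume the ordinary two's-complement behavior virtually every platform provides):

#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint16_t a = 0xFFFE, b = 0x0003;        /* read as signed: -2 and 3 */
    uint16_t sum = (uint16_t)(a + b);       /* one adder, no sign-specific logic */
    printf("unsigned: %u + %u = %u\n",
           (unsigned)a, (unsigned)b, (unsigned)sum);  /* 65534 + 3 = 1 (mod 65536) */
    printf("signed:   %d + %d = %d\n",
           (int16_t)a, (int16_t)b, (int16_t)sum);     /* -2 + 3 = 1 */
    return 0;
}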
For a fuller explanation of how two's complement works, see the Wikipedia page.
Because the range includes zero. The number of different values an n-bit integer can represent is 2^n. That means a 16-bit integer can represent 65536 different values. If it's an unsigned 16-bit integer, it can represent 0-65535 (inclusive). The convention for signed integers is to represent -32768 to 32767, -2147483648 to 2147483647, etc.
With two's complement, negative numbers are defined as the bitwise NOT plus 1. Since zero occupies one of the bit patterns with the sign bit clear, the positive side ends up holding one fewer value than the negative side.
Ordinarily, due to the use of a two's complement system for storing negative values, the encoding is biased toward the negative: the pattern with only the sign bit set has no positive counterpart.
The range should be: -(2^(n-1)) to (2^(n-1) - 1).
Related
I am learning about data allocation and am a little confused.
If you are looking for the smallest or greatest value that can be stored in a certain number of bits, does it matter what the data type is?
Wouldn't the smallest or biggest number that could be stored in 22 bits be 22 1's, positive or negative? Is the first part of this question a red herring? Wouldn't the smallest value be -4194303?
A 22-bit data element can store any one of 2^22 distinct values. What those values actually mean is a matter of interpretation. That interpretation may be imposed by a compiler or some piece of hardware, or may be under the control of the programmer, and suit some specific application.
A simple interpretation, of course, would be to treat the 22 bits as an unsigned integer, with values from 0 to (2^22)-1. A two's-complement, signed integer is a slightly more sophisticated interpretation of the same bits. Or you (or the compiler, or CPU) could divide the 22 bits up into a mantissa and exponent, and store a range of decimal numbers. The range and precision would depend on how many bits were allocated to the mantissa, and how many to the exponent.
Or you could split the bits up and use some for the numerator and some for the denominator of a fraction. Or, in fact, anything else.
Some of these interpretations of the bits are built into hardware, some are implemented by compilers or libraries, and some are entirely under the programmer's control. Not all programming languages allow the programmer to manipulate individual bits in a natural or efficient way, but some do. Sometimes, using a highly unconventional interpretation of binary data can give significant efficiency gains, but usually at the expense of readability and maintainability.
So, yes, it matters what the data type is.
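To make that concrete, here is a sketch (the helper as_signed22 is my own illustration, not a standard facility) that reads the very same 22-bit pattern under two of those interpretations:

#include <stdint.h>
#include <stdio.h>

/* Interpret the low 22 bits of 'raw' as a two's-complement value by
   manual sign extension: if bit 21 is set, subtract 2^22. */
static int32_t as_signed22(uint32_t raw) {
    raw &= 0x3FFFFF;                        /* keep only 22 bits */
    if (raw & 0x200000)                     /* sign bit (bit 21) set? */
        return (int32_t)raw - 0x400000;     /* subtract 2^22 */
    return (int32_t)raw;
}

int main(void) {
    uint32_t all_ones = 0x3FFFFF;           /* 22 one-bits */
    printf("unsigned view: %u\n", (unsigned)all_ones);     /* 4194303  */
    printf("signed view:   %d\n", as_signed22(all_ones));  /* -1       */
    printf("smallest:      %d\n", as_signed22(0x200000));  /* -2097152 */
    return 0;
}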
There is no law (of humans, logic, or nature) that says bits must represent numbers only in the pattern that one of the bits represents 2^0, another represents 2^1, another represents 2^2, and so on (and the number represented is the sum of those values for the bits that are 1). We have choices about how to use bits to represent numbers, including:
The bits do use that pattern, and so 22 bits can represent any number from 0 to the sum of 2^0 + 2^1 + 2^2 + … + 2^21 = 2^22 − 1 = 4,194,303. The smallest representable value is 0.
The bits mostly use that pattern, but it is modified so that one bit represents −2^21 instead of +2^21. This is called two’s complement, and the smallest value representable is −2^21 = −2,097,152.
The bits represent numbers as described above, except the represented value is divided by 1000. This is called fixed-point. In the first case, the value represented by all bits 1 would be 4194.303, but the smallest representable value would be 0. With a combination of two’s complement and fixed-point scaled by 1/1000, the smallest representable value would be −2097.152.
The bits represent a floating-point number, where one bit represents a sign (+ or −), certain bits represent an exponent and other information, and the remaining bits represent a significand. In common floating-point formats, when all the bits in that exponent-and-other field are 1s and the significand field bits are 0s, the number represents +∞ or −∞, according to the sign bit. In such a format, the smallest representable value is −∞.
As an example, we could designate patterns of bits to represent numbers arbitrarily. We could say that 0000000000000000000000 represents 34, 0000000000000000000001 represents −15, 0000000000000000000010 represents 5, 0000000000000000000011 represents 3+4i, and so on. The smallest representable value would be whichever of those arbitrary values is smallest.
So what the smallest representable value is depends entirely on the type, since the “type” of the data includes the scheme by which the bits represent values.
If the type is a “signed integer type,” there is still some flexibility in the representation. Most modern C implementations (and other programming languages) use the two’s complement scheme described above. But the C standard still allows two other schemes:
One’s complement: If the first bit is 1, the value represented is negative, and its magnitude is given by complementing the remaining bits and interpreting them as binary. Using six bits for an example, 101001 would be negative with the magnitude of binary 10110 = 22, so −22.
Sign-and-magnitude: If the first bit is 1, the value represented is negative, and its magnitude is given by interpreting the remaining bits as binary. Using the same bits, 101001 would be negative with the magnitude of binary 01001 = 9, so −9.
In both one’s complement and sign-and-magnitude, the smallest representable value with 22 bits is −(2^21 − 1) = −2,097,151.
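Here is a sketch decoding the 6-bit example 101001 under all three schemes (the helper functions are illustrative names of mine, not standard facilities):

#include <stdio.h>

/* Decode a 6-bit pattern (in the low bits of 'b') under the three
   signed-integer schemes the C standard historically allowed. */
static int twos_complement6(unsigned b) { return (b & 0x20) ? (int)b - 64 : (int)b; }
static int ones_complement6(unsigned b) { return (b & 0x20) ? -(int)(~b & 0x1F) : (int)b; }
static int sign_magnitude6(unsigned b)  { return (b & 0x20) ? -(int)(b & 0x1F)  : (int)b; }

int main(void) {
    unsigned p = 0x29;   /* the pattern 101001 used above */
    printf("two's complement: %d\n", twos_complement6(p)); /* 41 - 64 = -23 */
    printf("ones' complement: %d\n", ones_complement6(p)); /* -22 */
    printf("sign-magnitude:   %d\n", sign_magnitude6(p));  /* -9  */
    return 0;
}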
To stretch the question further, C defines standard integer types but allows implementations to extend the language. An implementation could define some “signed integer type” with an arbitrary scheme for representing numbers, as long as that scheme included a sign, to make the name correct.
Without going into technical jargon about doing maths with two's complement, I'll try to explain it in easy words.
First you need to raise 2 to the power of the number of bits.
Let's take the example of an 8-bit type:
An unsigned 8-bit integer can store 2^8 = 256 values.
Since values are indexed starting from 0, they range from 0 to 255.
Assuming you want to store signed values, you need to take half (simply divide by 2):
256 / 2 = 128.
Remember we start from zero.
You might be thinking you can store -127 to 127, starting from zero on both sides.
Just know that there is only one zero (there is nothing like +0 or -0),
so you start the positive half with zero: 0 to 127,
which leaves the negative half starting from -1 down to -128.
Hence the range will be -128 to 127.
For a 22-bit signed integer you can do the math:
2 ^ 22 = 4,194,304
4194304 / 2 = 2,097,152
minus 1 on the positive side (to account for zero),
so the range will be -2097152 to 2097151.
To answer your question,
-2097152 would be the smallest number you can store.
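The same math, as a runnable C sketch computing the bounds with shifts:

#include <stdio.h>

int main(void) {
    /* Two's-complement range for an n-bit type: -2^(n-1) to 2^(n-1) - 1 */
    int n = 22;
    long min = -(1L << (n - 1));        /* -2097152 */
    long max =  (1L << (n - 1)) - 1;    /*  2097151 */
    printf("%d-bit range: %ld to %ld\n", n, min, max);
    return 0;
}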
Thanks everyone for the replies. I figured it out with the help of all of your info but I will explain the answer to show exactly what gaps of knowledge I had that lead to my misunderstanding.
The data type does matter in this question, because for signed data types the first bit is used to represent whether the number is positive or negative. Read as sign-and-magnitude, 0111 = 7 and 1111 = -7 (in two's complement, 1111 would be -1).
signed int and unsigned int use the same number of bits, typically 32. Since an unsigned int is unsigned, the first bit isn't used to represent positive or negative, so it can represent a larger number with that extra bit. 1111 converted to an unsigned int is 15, whereas with the signed int it was -7, since the leftmost bit represents the sign: 1 is negative and 0 is positive.
Now to answer "If a C signed integer type is stored in 22 bits, what is the smallest value it can store?":
If you convert the 22 one-bits to decimal you get 1111111111111111111111 = 4194303 (that is, 2^22 - 1).
This is the maximum value an unsigned type could hold. Since our data type is signed, it has to use one less bit for the number's value, because the first bit represents the sign. This gives us -2097152 as the smallest value.
Thanks again, everyone.
I understand that to get the two's complement of an integer we first flip the bits and add one, but I'm having a hard time figuring out Tmax and Tmin.
In an 8-bit machine using two's complement for signed integers, how would I find the maximum and minimum integer values it can hold?
Would Tmax be 01111111 and Tmin be 11111111?
You are close.
The minimum value of a signed integer with n bits is found by making the most significant bit 1 and all the others 0.
The value of this is -2^(n-1).
The maximum value of a signed integer with n bits is found by making the most significant bit 0 and all the others 1.
The value of this is 2^(n-1)-1.
For 8 bits the range is -128...127, i.e., 10000000...01111111.
Read about why this works at Wikipedia.
The bit pattern that would otherwise amount to a negative zero is put to use instead: in two's complement, the binary representation of the smallest signed integer is a one bit for the sign, followed by all zero bits.
If you divide the values in an unsigned type into two groups, one group for negative and another positive, then you'll end up with two zeros (a negative zero and a positive zero). This seems wasteful, so many have decided to give that a value. What value should it have? Well, it:
has a 1 for a sign bit, implying negative;
has a 1 for the most significant bit, implying 2^(width-1) (128, in your example)...
Combining these points, reinterpreting that value as -128 seems to make sense.
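On a two's-complement system you can watch that reinterpretation happen; note the conversion below is implementation-defined by the C standard, though it yields -128 on such systems:

#include <stdio.h>

int main(void) {
    unsigned char bits = 0x80;          /* the pattern 10000000 */
    signed char v = (signed char)bits;  /* implementation-defined conversion,
                                           but -128 on two's-complement systems */
    printf("%d\n", v);                  /* prints -128 */
    return 0;
}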
My textbook provides the following explanation for the two's complement method for signed integers:
We’ll discuss this method as it applies to a 1-byte value. In that context, the values 0 through 127 are represented by the last 7 bits, with the high-order bit set to 0. So far, that’s the same as the sign-magnitude method. Also, if the high-order bit is 1, the value is negative. The difference comes in determining the value of that negative number. Subtract the bit pattern for a negative number from the 9-bit pattern 100000000 (256 as expressed in binary), and the result is the magnitude of the value.
None of this makes any sense to me. Typical processors use octets (8-bit bytes). What does it mean to subtract an 8-bit pattern from the 9-bit pattern 100000000?
Basically, for efficient computation, you want the same operations (addition, subtraction, etc.) to be performed the same way, regardless of the sign. So first, consider the case of unsigned bytes.
With 8 bits, you can represent any value between 0-255. Addition and subtraction work the same way as usual (modulo 256), and everything is fine.
Now, imagine that when you are at 127, incrementing by 1 gives you -128. We're still counting modulo 256, but the top 128 numbers have shifted by 256. Now, let's examine addition:
10 + (-5) = 10 + 251 (unsigned) = 261 = 5 (modulo 256).
Everything works as expected. So, in our new representation, -128 is 127 + 1, which is 01111111 + 1 which is 10000000. -1 will be 11111111. I hope I've helped.
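A small sketch of that modulo-256 arithmetic, using unsigned char as the 8-bit type:

#include <stdio.h>

int main(void) {
    /* 8-bit arithmetic is modulo 256: 251 is the unsigned alias of -5 */
    unsigned char a = 10, b = 251;
    unsigned char sum = (unsigned char)(a + b);  /* 261 mod 256 */
    printf("%u\n", (unsigned)sum);               /* prints 5 */
    return 0;
}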
You have 1 byte (an 8-digit number, with each digit being a 0 or a 1).
2's complement works by looking at the first digit
10010011
^ this one
If it's a 0, then the number is positive and you can continue to convert binary to decimal normally.
If it's 1, then it's negative. Convert it normally (bin(10010011) = 147) and THEN subtract 256 (147 - 256 = -109), and there is your 2's complement number.
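That recipe translates directly into a few lines of C (a sketch, using the same example value):

#include <stdio.h>

int main(void) {
    unsigned value = 0x93;           /* 10010011 = 147 */
    int decoded = (value & 0x80)     /* first digit (bit 7) is 1: negative */
                ? (int)value - 256   /* 147 - 256 = -109 */
                : (int)value;
    printf("%d\n", decoded);         /* prints -109 */
    return 0;
}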
My memory about two's complement says:
"Flip the bits plus one"
Indeed, the following makes a number negative:
#include <stdio.h>
int main(void) {
    int j = 127, i = ~j + 1;   /* flip the bits, then add one */
    printf("%d\n", i);         /* prints -127 */
}
Given the following function:
int boof(int n) {
    return n + ~n + 1;
}
What does this function return? I'm having trouble understanding exactly what is being passed in to it. If I called boof(10), would it convert 10 to base 2, and then do the bitwise operations on the binary number?
This was a question I had on a quiz recently, and I think the answer is supposed to be 0, but I'm not sure how to prove it.
note: I know how each bitwise operator works, I'm more confused on how the input is processed.
Thanks!
When n is an int, n + ~n will always result in an int that has all bits set.
Strictly speaking, the behavior of adding 1 to such an int will depend on the representation of signed numbers on the platform. The C standard supports 3 representations for signed int:
for two's complement machines (the vast majority of systems in use today), the result will be 0, since an int with all bits set is -1.
on a ones' complement machine (which is pretty rare today, I believe), the result will be 1, since an int with all bits set is -0 (negative zero) and -0 + 1 = 1 (unless the implementation traps on negative zero, in which case the behavior is undefined).
on a sign-magnitude machine (are there really any of these still in use?), an int with all bits set is the negative number with the maximum magnitude, so the actual value will depend on the size of an int. In this case, adding 1 to it results in a negative number whose exact value, again, depends on the number of bits used to represent an int.
Note that the above ignores implementations that might trap on some of the bit patterns n + ~n can produce.
Bitwise operations don't change the number into a base-2 representation; all math on the CPU is already done in binary regardless.
What this function does is take n and add to it the two's complement negation of itself (~n + 1 = -n). This cancels the input: anything you put in will yield 0.
Let me explain with 8 bit numbers as this is easier to visualize.
10 is represented in binary as 00001010.
Negative numbers are stored in two's complement (NOTing the number and adding 1)
So the (~n + 1) portion for 10 looks like so:
11110101 + 1 = 11110110
So if we take n + (~n + 1):
00001010 + 11110110 = (1) 00000000 = 0 in 8 bits
Notice that adding these numbers together produces a carry out of the most significant bit; this sets the carry flag, not the overflow flag, and the 8-bit result is 0. (Adding a positive and a negative number together can never overflow.)
See this: The CARRY and OVERFLOW flag in Binary Arithmetic
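Putting this together, here is a runnable sketch of the quiz function (the result of n + ~n + 1 being 0 assumes a two's-complement system, per the first answer above):

#include <stdio.h>

int boof(int n) {
    return n + ~n + 1;   /* ~n + 1 is the two's-complement negation of n */
}

int main(void) {
    /* No base conversion needed: 10 is already stored in binary */
    printf("%d\n", boof(10));    /* prints 0 on two's-complement systems */
    printf("%d\n", boof(-42));   /* also 0 */
    return 0;
}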
Why is the range of a signed character -128 to 127 and not -127 to 128?
That is because of the way two's complement encoding works: 0 is treated as a "positive" number (sign bit off), so the number of available positive values is reduced by one.
In ones' complement encoding (which is not very common nowadays, but in the olden days it was), there are separate values for +0 and -0, and so the range for an 8-bit quantity is -127 to +127.
In 8-bit 2's complement encoding numbers -128 and +128 have the same representation: 10000000. So, the designer of the hardware is presented with an obvious dilemma: how to interpret bit-pattern 10000000. Formally, it will work either way. If they decide to interpret it as +128, the resultant range will be -127..+128. If they decide to interpret it as -128, the resultant range will be -128..+127.
In actual real-life 2's complement representation the latter approach is chosen because it satisfies the following nice convention: all bit-patterns with 1 in higher-order bit represent negative numbers.
It is worth noting though, that language specification does not require 2's-complement implementations to treat the 100...0 bit pattern as a valid value in any signed integer type. E.g. implementations are allowed to restrict 8-bit signed char to -127..+127 range and regard 10000000 as an invalid bit combination (trap representation).
I think an easy way to explain this for the common soul is:
A bit holds a value of 0 or 1: two possibilities.
Two bits hold four possible values: 00, 01, 10, and 11.
Three bits hold eight possible values: 000 to 111.
Thus n bits hold 2^n possible values. Therefore, an 8-bit value has 2^8 = 256 possible values.
For signed numbers, the most significant bit (the first one reading the value from left to right) is the sign bit; that leaves 2^(n-1) possible values for each sign. For an 8-bit signed number, that is 2^7 = 128 possible values per sign. But since the positive side includes zero (0 to 127 is 128 different values, and 128 + 128 = 2^8 = 256), the negative side runs from -1 down to -128, also 128 different values. Where:
10000000 = -128
...
11111111 = -1
00000000 = 0
...
01111111 = 127
#include <limits.h>
#include <stdio.h>

int main(void) {
    /* SCHAR_MIN/SCHAR_MAX describe signed char; plain char may be unsigned */
    printf("range of signed char is %i ... %i\n", SCHAR_MIN, SCHAR_MAX);
    return 0;
}
If you just consider two's complement as arithmetic modulo 256, then the cutoff between positive and negative is purely arbitrary. You could just as well have put it at 63/-192, 254/-1, 130/-125, or anywhere else. However, as a standard signed integer format, two's complement by convention puts the cutoff at 127/-128. This cutoff has one big benefit: the high bit being set corresponds directly to the number being negative.
As for the C language, it leaves the format of signed numbers up to the implementation, but only offers 3 choices of implementation, all of which use a "sign bit": sign/magnitude, ones' complement, and two's complement.
If you look at the ranges of chars and ints, there seems to be one extra number on the negative side. This is because a negative number is always stored as the 2's complement of its binary. For example, let us see how -128 is stored. Firstly, the binary of 128 is calculated (10000000), then its 1's complement is obtained (01111111). A 1's complement is obtained by changing all 0s to 1s and 1s to 0s. Finally, the 2's complement of this number, i.e. 10000000, gets stored. A 2's complement is obtained by adding 1 to the 1's complement. Thus, for -128, 10000000 gets stored. This is an 8-bit number and can easily be accommodated in a char. As against this, +128 cannot be stored in a char, because its binary 010000000 (the leftmost 0 is for the positive sign) is a 9-bit number. However, +127 can be stored, as its binary 01111111 turns out to be an 8-bit number.
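A short sketch to verify that claim by printing the bit pattern actually stored for -128 (reading the bits through unsigned char, which is well defined):

#include <stdio.h>

int main(void) {
    signed char c = -128;
    /* print the stored bit pattern, most significant bit first */
    for (int i = 7; i >= 0; i--)
        putchar(((unsigned char)c >> i) & 1 ? '1' : '0');
    putchar('\n');   /* prints 10000000 */
    return 0;
}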
Step 1:
If you take the 2's complement of any number from 1 up to 127, bit number 8 will always be 1. So let's reserve that info.
Step 2:
If you find -127 by applying the 2's complement to +127, you get "1 0 0 0 0 0 0 1"; and finally, if you subtract 1 from this number, the smallest 8-bit number, -128, is achieved as "1 0 0 0 0 0 0 0".
As a result, if we combine the info we reserved at step 1 with the result of step 2, we come to the conclusion that the most significant bit (bit number 8) in a char must always be 1 for a negative value: the so-called sign bit.