Why does subtracting a negative 8-bit value from the 9-bit pattern 100000000 give the magnitude? - c

My textbook provides the following explanation for the two's complement method for signed integers:
We’ll discuss this method as it applies to a 1-byte value. In that
context, the values 0 through 127 are represented by the last 7 bits,
with the high-order bit set to 0. So far, that’s the same as the
sign-magnitude method. Also, if the high-order bit is 1, the value is
negative. The difference comes in determining the value of that
negative number. Subtract the bit-pattern
for a negative number from the 9-bit pattern 100000000 (256 as
expressed in binary), and
the result is the magnitude of the value.
None of this makes any sense to me. Typical processors use octets (8-bit bytes). What does it mean by subtracting the 8-bit byte from the 9-bit byte?

Basically, for efficient computation, you want to have the same operations (addition, subtraction, etc.) to be performed the same way, regardless of the sign. So first, consider the case of unsigned bytes.
With 8 bits, you can represent any value between 0-255. Addition and subtraction work the same way as usual (modulo 256), and everything is fine.
Now, imagine that when you are at 127, incrementing by 1 gives you -128. We're still counting modulo 256, but the top 128 numbers have shifted by 256. Now, let's examine addition:
10 + (-5) = 10 + 251 (unsigned) = 261 = 5 (modulo 256).
Everything works as expected. So, in our new representation, -128 is 127 + 1, which is 01111111 + 1 which is 10000000. -1 will be 11111111. I hope I've helped.

You have 1 byte (8 digit number with each digit being a 0 or a 1)
2's complement works by looking at the first digit
10010011
^ this one
If it's a 0, then the number is positive and you can continue to convert binary to decimal normally.
If it's 1, then it's negative. Convert it normally (bin(10010011) = 147) and THEN subtract 256 (147 - 256 = -109), and there is your 2's complement number.

My memory about two's complement says:
"Flip the bits plus one"
Indeed, the following makes a number negative:
int i, j= 127;
i = (~j)+1;
printf ("%d\n",i); // prints -127

Related

Binary representation of data types in C

If I have a normal 32 bit int in C, how is the number represented in memory? Specifically, how is the sign stored?
At first I thought it would be sign and magnitude but then I remembered there would be + and - 0, so then I thought it could be stored in two's complement, however when using two's complement I ended up getting a maximum value of 4294967293 with the MSB set to 0 (I got this value by summing up everything from 2^31 to 2^0) which as we all know is wrong.
The C standard doesn't mandate two's complement, but it's by far the most common option (and I think even this is an understatement). The reason you arrive at 4 billion something, not at the correct INT_MAX, is an off-by-one error: you added the 32nd bit as well. 2^31 is the value of the MSB (because the LSB has value 2^0), so the correct value is the sum 2^0 + ... + 2^30 = 2^31 - 1.
Even though this is implementation-defined, most machines use two's complement.
The problem is that your computation is wrong. Summing up 2^0 to 2^31 (which is 2^32 - 1, by the way) means that you use 32 bits. This is not correct: only 31 bits are used for the actual number. The MSB has a special meaning that has to do with the sign (though it is not really 'the sign'). So, you need to sum up from 2^0 to 2^30 (which is 2^31 - 1).

Why does the range of int has a minus 1?

I read that the range of an int is dependent on a byte.
So taking int to be 4 bytes long, that's 4 * 8 bits = 32 bits.
So the range should be: 2 ^ (32-1) = 2 ^ (31)
Why do some people say it's 2^31 - 1 though?
Thanks!
Because the counting starts from 0
And the maximum int is 2,147,483,647, while 2^31 is 2,147,483,648, hence we subtract 1.
Also the loss of 1 bit is for the positive and negative sign
Check this interesting wiki article on Integers:
The most common representation of a positive integer is a string of
bits, using the binary numeral system. The order of the memory bytes
storing the bits varies; see endianness. The width or precision of an
integral type is the number of bits in its representation. An integral
type with n bits can encode 2^n numbers; for example an unsigned type
typically represents the non-negative values 0 through 2^n−1. Other
encodings of integer values to bit patterns are sometimes used, for
example Binary-coded decimal or Gray code, or as printed character
codes such as ASCII.
There are four well-known ways to represent signed numbers in a binary
computing system. The most common is two's complement, which allows a
signed integral type with n bits to represent numbers from −2^(n−1)
through 2^(n−1)−1. Two's complement arithmetic is convenient because
there is a perfect one-to-one correspondence between representations
and values (in particular, no separate +0 and −0), and because
addition, subtraction and multiplication do not need to distinguish
between signed and unsigned types. Other possibilities include offset
binary, sign-magnitude, and ones' complement.
You mean 2^32 - 1, NOT 2^(32-1).
But your question is about why people use 2^31. The loss of a whole bit is if the int is a signed one. You lose the first bit to indicate if the number is positive or negative.
A signed int (32 bit) ranges from -2,147,483,648 to +2,147,483,647.
An unsigned int (32 bit) ranges from 0 to 4,294,967,295 (which is 2^32 - 1).
int is a signed data type.
The first bit represents the sign, followed by bits for the value.
If the sign bit is 0, the value is simply the sum of the powers of 2 for all bits set to 1.
e.g. 0...00101 is 2^0 + 2^2 = 5
If the first bit is 1, the value is -2^32 + the sum of the powers of 2 for all bits set to 1 (including the first bit).
e.g. 1...111100 is -2^32 + 2^31 + 2^30 + ... + 2^2 = -4
All zeros will thus result in zero.
When you do the calculation, you will see that any number in the range from -2^31 to 2^0 + ... + 2^30 = 2^31 - 1 (inclusive) can be created with those 32 bits.
2^(32-1) is not the same as 2^32 - 1 (as 0 is included in the range, we subtract 1)
For your understanding, let us replace 32 with the small number 4:
2^(4-1) = 8
whereas 2^4 - 1 = 16 - 1 = 15.
Hope this helps!
Since an int is 32 bits, it can store a total of 2^32 values. So an int ranges from -2^31 to 2^31-1, giving a total of 2^32 values (2^31 values in the negative range + 2^31 values in the non-negative range, including 0). The first bit (the most significant bit) indicates the sign of the integer. You also need to understand how negative integers are stored: they are stored in 2's complement form, so -9 will be stored as the 2's complement of 9.
So 9 is stored in a 32-bit int as
0000 0000 0000 0000 0000 0000 0000 1001
and -9 will be stored as
1111 1111 1111 1111 1111 1111 1111 0111 (2's complement of 9).
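Again, if some arithmetic operation on an integer exceeds the maximum value (2^31-1), then on typical two's complement hardware it wraps around to the negative values: adding 1 to 2^31-1 gives -2^31. (Strictly speaking, signed overflow is undefined behavior in C, so a program should not rely on this.)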

What does signed and unsigned values mean?

What does signed mean in C? I have this table to show:
This says signed char 128 to +127. 128 is also a positive integer, so how can this be something like +128 to +127? Or do 128 and +127 have different meanings? I am referring to the book Apress Beginning C.
A signed integer can represent negative numbers; unsigned cannot.
Signed integers have undefined behavior if they overflow, while unsigned integers wrap around using modulo.
Note that that table is incorrect. First off, it's missing the - signs (such as -128 to +127). Second, the standard does not guarantee that those types must fall within those ranges.
By default, numerical values in C are signed, which means they can be both negative and positive. Unsigned values on the other hand, don't allow negative numbers.
Because it's all just about memory, in the end all the numerical values are stored in binary. A 32 bit unsigned integer can contain values from all binary 0s to all binary 1s. When it comes to 32 bit signed integer, it means one of its bits (most significant) is a flag, which marks the value to be positive or negative. So, it's the interpretation issue, which tells that value is signed.
Positive signed values are stored the same way as unsigned values, but negative numbers are stored using two's complement method.
If you want to write negative value in binary, first write positive number, next invert all the bits and last add 1. When a negative value in two's complement is added to a positive number of the same magnitude, the result will be 0.
In the example below let's deal with 8-bit numbers, because they are simple to inspect:
positive 95: 01011111
negative 95: 10100000 + 1 = 10100001 [positive 161]
0: 01011111 + 10100001 = 100000000
                         ^
                         |_______ as we're dealing with 8-bit numbers, this 9th bit is dropped,
                                  and the remaining 8 bits are 00000000, which is 0
The table is missing the minuses. The range of signed char is -128 to +127; likewise for the other types on the table.
It was a typo in the book; signed char goes from -128 to 127.
Signed integers are stored using the two's complement representation, in which the first bit is used to indicate the sign.
In C, chars are typically 8-bit integers. This means that a signed char can go from -(2^7) to 2^7 - 1. That's because we use the last 7 bits for the number and the first bit for the sign. 0 means positive and 1 means negative (in two's complement representation).
The biggest positive number is (01111111)b = 2^7 - 1 = 127.
The smallest negative number is (10000000)b = -128
(because the two's complement of 10000000 = 2^7 = 128 is again 10000000: inverting gives 01111111, and adding 1 gives 10000000).
Unsigned chars don't have signs so they can use all the 8 bits. Going from (00000000)b = 0 to (11111111)b = 255.
Signed numbers are those that have either + or - prefixed to them.
E.g. +2 and -6 are signed numbers.
Signed numbers can store both positive and negative values, so their range is split around zero,
i.e. -32768 to 32767 for a 16-bit type.
Unsigned numbers are simply numbers with no sign; they are never negative, and their range for a 16-bit type is from 0 to 65535.
Hope it helps
Signed usually means the number has a + or - symbol in front of it. This means that unsigned int, unsigned shorts, etc cannot be negative.
Nobody mentioned this, but range of int in table is wrong:
it is
-2^(31) to 2^(31)-1
i.e.,
-2,147,483,648 to 2,147,483,647
A signed integer can have both negative and positive values, while an unsigned integer can only have non-negative values.
For signed integers using two's complement, which is most commonly used, the range is (depending on the bit width of the integer):
signed char s -> range -128 to 127
whereas an unsigned char has the range:
unsigned char s -> range 0 to 255
First, your table is wrong: the negative numbers are missing. As for the type char, it can represent 256 possibilities in all, since a char is one byte, i.e. 2^8 values. So you have two alternatives for the range: either from -128 to +127 or from 0 to 255. The first is a signed char, the second an unsigned char. If you are using int, be aware of what kind of platform you are on (16-bit, 32-bit or 64-bit), since the size of int varies accordingly. A char is always one byte, which is 8 bits on practically all current systems.
It means that there will likely be a sign ( a symbol) in front of your value (+12345 || -12345 )

What are the max and min numbers a short type can store in C?

I'm having a hard time grasping data types in C. I'm going through a C book and one of the challenges asks what the maximum and minimum number a short can store.
Using sizeof(short); I can see that a short consumes 2 bytes. That means it's 16 bits, which means two numbers since it takes 8 bits to store the binary representation of a number. For example, 9 would be 00111001 which fills up one byte. So would it not be 0 to 99 for unsigned, and -9 to 9 signed?
I know I'm wrong, but I'm not sure why. It says here the maximum is (-)32,767 for signed, and 65,535 for unsigned.
short int, 2 Bytes, 16 Bits, -32,768 -> +32,767 Range (16kb)
Think in decimal for a second. If you have only 2 digits for a number, that means you can store from 00 to 99 in them. If you have 4 digits, that range becomes 0000 to 9999.
A binary number is similar to decimal, except the digits can be only 0 and 1, instead of 0, 1, 2, 3, ..., 9.
If you have a number like this:
01011101
This is:
0*128 + 1*64 + 0*32 + 1*16 + 1*8 + 1*4 + 0*2 + 1*1 = 93
So as you can see, you can store bigger values than 9 in one byte. In an unsigned 8-bit number, you can actually store values from 00000000 to 11111111, which is 255 in decimal.
In a 2-byte number, this range becomes from 00000000 00000000 to 11111111 11111111 which happens to be 65535.
Your statement "it takes 8 bits to store the binary representation of a number" is like saying "it takes 8 digits to store the decimal representation of a number", which is not correct. For example the number 12345678901234567890 has more than 8 digits. In the same way, you cannot fit all numbers in 8 bits, but only 256 of them. That's why you get 2-byte (short), 4-byte (int) and 8-byte (long long) numbers. In truth, if you need even higher range of numbers, you would need to use a library.
As far as negative numbers are concerned, on a 2's-complement computer they are just a convention to use the upper half of the range as negative values. This means the numbers that have a 1 on the left side are considered negative.
Nevertheless, these numbers are congruent modulo 256 (modulo 2^n for n bits) to the unsigned value that the bit pattern suggests. For example, the number 11111111 is 255 if unsigned and -1 if signed, and the two are congruent modulo 256.
The reference you read is correct. At least, for the usual C implementations where short is 16 bits - that's not actually fixed in the standard.
16 bits can hold 2^16 possible bit patterns, that's 65536 possibilities. Signed shorts are -32768 to 32767, unsigned shorts are 0 to 65535.
These limits are defined in <limits.h> as SHRT_MIN and SHRT_MAX (and USHRT_MAX for unsigned short).
Others have posted pretty good solutions for you, but I don't think they have followed your thinking and explained where you were wrong. I will try.
I can see that a short consumes 2 bytes. That means it's 16 bits,
Up to this point you are correct (though short is not guaranteed to be 2 bytes long like int is not guaranteed to be 4 — the only guaranteed size by standard (if I remember correctly) is char which should always be 1 byte wide).
which means two numbers since it takes 8 bits to store the binary representation of a number.
From here you started to drift a bit. It doesn't really take 8 bits to store a number. Depending on the number, it may take 16, 32, 64 or even more bits to store it. Dividing your 16 bits into 2 is wrong. If not for CPU implementation specifics, we could have had, for example, 2-bit numbers. In that case, those two bits could store values like:
00 - 0 in decimal
01 - 1 in decimal
10 - 2 in decimal
11 - 3 in decimal
To store 4, we need 3 bits. And so the value would "not fit" causing an overflow. Same applies to 16-bit number. For example, say we have unsigned "255" in decimal stored in 16-bits, the binary representation would be 0000000011111111. When you add 1 to that number, it becomes 0000000100000000 (256 in decimal). So if you had only 8 bits, it would overflow and become 0 because the most significant bit would have been discarded.
Now, the maximum unsigned number you can store in 16 bits of memory is 1111111111111111, which is 65535 in decimal. In other words, for unsigned numbers, set all bits to 1 and that will yield the maximum possible value.
For signed numbers, however, the most significant bit represents the sign: 0 for positive and 1 for negative. The most negative value is 1000000000000000, which is -32768 in base 10. The rules for signed binary representation are well described here.
Hope it helps!
The formula for the number of values an unsigned type can represent:
2 ^ (sizeof(type)*CHAR_BIT)
so its range is 0 to 2 ^ (sizeof(type)*CHAR_BIT) - 1 (CHAR_BIT is 8 on typical systems).

Range of signed char

Why the range of signed character is -128 to 127 but not -127 to 128 ?
That is because of the way two's complement encoding works: 0 is treated as a "positive" number (sign bit off), and therefore the number of available positive values is reduced by one.
In ones' complement encoding (which is not very common nowadays, but in the olden days, it was), there were separate values for +0 and -0, and so the range for an 8-bit quantity is -127 to +127.
In 8-bit 2's complement encoding numbers -128 and +128 have the same representation: 10000000. So, the designer of the hardware is presented with an obvious dilemma: how to interpret bit-pattern 10000000. Formally, it will work either way. If they decide to interpret it as +128, the resultant range will be -127..+128. If they decide to interpret it as -128, the resultant range will be -128..+127.
In actual real-life 2's complement representation the latter approach is chosen because it satisfies the following nice convention: all bit-patterns with 1 in higher-order bit represent negative numbers.
It is worth noting though, that language specification does not require 2's-complement implementations to treat the 100...0 bit pattern as a valid value in any signed integer type. E.g. implementations are allowed to restrict 8-bit signed char to -127..+127 range and regard 10000000 as an invalid bit combination (trap representation).
I think an easy way to explain this for the common soul is :
A bit is a value 0 or 1, i.e. 2 possibilities.
Two bits hold two such digits, for four possible values: 00, 01, 10, and 11.
Three bits hold three digits, for a total of eight possible values: 000 to 111.
Thus n bits hold n digits, for a total of 2^n possible values. Therefore, an 8-bit value has 2^8 = 256 possible values.
For signed numbers, the most significant bit (the first one reading the value from left to right) is the sign bit; that leaves a possibility of 2^(n-1) possible values. For an 8-bit signed number, this is 2^7 = 128 possible values for each sign. But since the positive sign includes the zero (0 to 127 = 128 different values, and 128 + 128 = 2^8 = 256), the negative sign includes -1 to... -128 for 128 different values also. Where :
10000000 = -128
...
11111111 = -1
00000000 = 0
...
01111111 = 127
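#include <limits.h>
#include <stdio.h>

int main(void) {
    /* SCHAR_MIN/SCHAR_MAX describe signed char; CHAR_MIN/CHAR_MAX follow plain char, which may be unsigned. */
    printf("range of signed character is %i ... %i\n", SCHAR_MIN, SCHAR_MAX);
    return 0;
}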
If you just consider two's complement as arithmetic modulo 256, then the cutoff between positive and negative is purely arbitrary. You could just as well have put it at 63/-192, 254/-1, 130/-125, or anywhere else. However, as a standard signed integer format, two's complement by convention puts the cutoff at 127/-128. This cutoff has one big benefit: the high bit being set corresponds directly to the number being negative.
As for the C language, it leaves the format of signed numbers up to the implementation, but only offers 3 choices of implementation, all of which use a "sign bit": sign/magnitude, ones complement, and twos complement.
If you look at the ranges of chars and ints, there seems to be one extra number on the negative side. This is because a negative number is always stored as the 2's complement of its binary. For example, let us see how -128 is stored. First, the binary of 128 is calculated (10000000), then its 1's complement is obtained (01111111). A 1's complement is obtained by changing all 0s to 1s and 1s to 0s. Finally, the 2's complement of this number, i.e. 10000000, gets stored. A 2's complement is obtained by adding 1 to the 1's complement. Thus, for -128, 10000000 gets stored. This is an 8-bit number and it can be easily accommodated in a char. In contrast, +128 cannot be stored in a char because its binary 010000000 (the left-most 0 is for the positive sign) is a 9-bit number. However, +127 can be stored, as its binary 01111111 turns out to be an 8-bit number.
Step 1:
If you take the 2's complement of any number from 1 up to 127, bit number 8 will always be 1. So let's keep that in mind.
Step 2:
If you find -127 by applying 2's complement to +127, you get "1 0 0 0 0 0 0 1", and if you then subtract 1 from this number, you reach the smallest 8-bit number, -128, as "1 0 0 0 0 0 0 0".
As a result, combining the observation from step 1 with the result from step 2, we conclude that the most significant bit (bit number 8) of a char is always 1 for negative values: the so-called sign bit.
