Suppose you have the following C code.
unsigned char a = 1;
printf("%d\n", ~a); // prints -2
printf("%d\n", a); // prints 1
I am surprised to see -2 printed as a result of ~1 conversion:
The opposite of 0000 0001 is 1111 1110. That is anything but -2.
What am I missing here?
It is two's complement.
In two's complement representation, if a number x's most significant bit is 1, then the actual value would be −(~x + 1).
For instance,
0b11110000 = -(~0b11110000 + 1) = -(0b00001111 + 1) = -(15 + 1) = -16.
This is a natural representation of negative numbers, because
00000001 = 1
00000000 = 0
11111111 = -1 (wrap around)
11111110 = -2
11111101 = -3 etc.
See http://en.wikipedia.org/wiki/Two%27s_complement for detail.
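As a minimal sketch of the rule above (if the most significant bit is 1, the value is -(~x + 1)), the following helper decodes an 8-bit pattern. The function name decode_twos_complement8 is made up for illustration and is not part of any library.

```c
#include <stdint.h>

/* Interpret an 8-bit pattern as a two's complement value.
   If the top bit is set, the value is -(~x + 1), computed in 8 bits. */
int decode_twos_complement8(uint8_t bits)
{
    if (bits & 0x80u) {
        /* invert, add 1, and keep only the low 8 bits */
        uint8_t magnitude = (uint8_t)(~bits + 1u);
        return -(int)magnitude;
    }
    return (int)bits;
}
```

With this sketch, decode_twos_complement8(0xF0) gives -16 and decode_twos_complement8(0xFE) gives -2, matching the table above.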
BTW, to print an unsigned value, use the %hhu or %hhx format. See http://www.ideone.com/YafE3.
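%d stands for a signed decimal int, not an unsigned value. On top of that, the operand of ~ undergoes integer promotion: a is converted to int before its bits are inverted, so ~a is the int value ~1, whose bit pattern 1111...1110 reads as -2 in two's complement. So your bit pattern, even though it started out in an unsigned variable, is interpreted as a signed number.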
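A small sketch of the difference between inverting after promotion and truncating back to one byte may help here. The two helper names are hypothetical, and the second result assumes an 8-bit unsigned char.

```c
/* ~a is evaluated after a has been promoted to int, so the result is an int. */
int complement_as_int(unsigned char a)
{
    return ~a;
}

/* Casting back to unsigned char keeps only the low byte of the inverted value. */
unsigned char complement_as_uchar(unsigned char a)
{
    return (unsigned char)~a;
}
```

For a = 1, the first helper returns -2 (what %d showed in the question) and the second returns 254, which is the 1111 1110 the question expected.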
See this Wikipedia entry on signed number representations for an understanding of the bit values. In particular see Two's complement.
One (mildly humorous) way to think of signed maths is to recognize that the most significant bit really represents an infinite number of bits above it. So in a 16-bit signed number, the most significant bit is 32768+65536+131072+262144+... etc., which is 32768*(1+2+4+8+...). Using the standard formula for a geometric series, (1 + X + X^2 + X^3 + ...) = 1/(1-X), with X = 2, one discovers that (1+2+4+8+...) is -1, so the sum of all those bits is -32768.
My textbook provides the following explanation for the two's complement method for signed integers:
We’ll discuss this method as it applies to a 1-byte value. In that
context, the values 0 through 127 are represented by the last 7 bits,
with the high-order bit set to 0. So far, that’s the same as the
sign-magnitude method. Also, if the high-order bit is 1, the value is
negative. The difference comes in determining the value of that
negative number. Subtract the bit-pattern
for a negative number from the 9-bit pattern 100000000 (256 as
expressed in binary), and
the result is the magnitude of the value.
None of this makes any sense to me. Typical processors use octets (8-bit bytes). What does it mean by subtracting the 8-bit byte from the 9-bit byte?
Basically, for efficient computation, you want to have the same operations (addition, subtraction, etc.) to be performed the same way, regardless of the sign. So first, consider the case of unsigned bytes.
With 8 bits, you can represent any value between 0-255. Addition and subtraction work the same way as usual (modulo 256), and everything is fine.
Now, imagine that when you are at 127, incrementing by 1 gives you -128. We're still counting modulo 256, but the top 128 numbers have shifted by 256. Now, let's examine addition:
10 + (-5) = 10 + 251 (unsigned) = 261 = 5 (modulo 256).
Everything works as expected. So, in our new representation, -128 is 127 + 1, which is 01111111 + 1 which is 10000000. -1 will be 11111111. I hope I've helped.
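The modulo-256 view can be sketched in C as follows. Both function names are invented for this example; uint8_t is used so the byte width is explicit.

```c
#include <stdint.h>

/* Add two bytes; the cast keeps only the low 8 bits, i.e. the sum modulo 256. */
uint8_t add_mod256(uint8_t x, uint8_t y)
{
    return (uint8_t)(x + y);
}

/* Read a byte in the "shifted" way: the top 128 patterns stand for -128..-1. */
int read_as_signed(uint8_t bits)
{
    return bits >= 128 ? (int)bits - 256 : (int)bits;
}
```

Here add_mod256(10, 251) is 5, read_as_signed(251) is -5, and read_as_signed(add_mod256(127, 1)) is -128, so the same adder serves both interpretations.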
You have 1 byte (8 digit number with each digit being a 0 or a 1)
2's complement works by looking at the first digit
10010011
^ this one
If it's a 0, then the number is positive and you can continue to convert binary to decimal normally.
If it's 1, then it's negative. Convert it normally (bin(10010011) = 147) and THEN subtract 256 (147 - 256 = -109), and there is your 2's complement number.
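The "convert normally, then subtract 256" recipe could look like this in C. This is only a sketch: parse_signed_byte is a hypothetical name, and it assumes the input is a string of exactly eight '0'/'1' characters.

```c
/* Convert an 8-character binary string to its two's complement value.
   Assumes exactly eight characters, each '0' or '1'. */
int parse_signed_byte(const char *digits)
{
    int value = 0;
    for (int i = 0; i < 8 && digits[i] != '\0'; i++)
        value = value * 2 + (digits[i] == '1');   /* plain binary conversion */
    /* a leading 1 means negative: subtract 256 from the unsigned reading */
    return (digits[0] == '1') ? value - 256 : value;
}
```

For the pattern above, parse_signed_byte("10010011") yields 147 - 256 = -109.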
My memory about two's complement says:
"Flip the bits plus one"
Indeed, the following makes a number negative:
int i, j= 127;
i = (~j)+1;
printf ("%d\n",i); // prints -127
I have a simple program with a short variable declaration:
short int v=0XFFFD;
printf("v = %d\n",v);
printf("v = %o\n",v);
printf("v = %X\n",v);
The result is:
v = -3 ; v = 37777777775 ; v = FFFFFFFD
I don't understand how to calculate these values. I know that a short variable can hold values between -32768 and 32767, and the value 0XFFFD causes an overflow, but I don't know how to calculate the exact value, which is -3 in this case.
Also, if my declaration is v=0XFFFD why the output v=%X is FFFFFFFD?
First of all, a short can be as narrow as 16 bits (which is probably the case with your compiler). That means 65533 cannot be represented; converting it to short is implementation-defined, and on typical two's complement machines it wraps to -3 (as short int is a signed short integer). But you already knew that.
Secondly when sent as an argument to printf the short int is converted to int automatically, but as v contains -3 that's the value that is sent to printf.
Thirdly, the %o and %X conversions expect an unsigned int, which is not quite what you've supplied. In theory this is undefined behavior, but in practice it is quite predictable: the bit pattern for -3 is interpreted as an unsigned integer instead, which on machines with a 32-bit int happens to be 0xFFFFFFFD.
If short is 2 bytes on this machine, then 0x8000 is the bit pattern of the most negative value this type holds, which is -32768. Because of how things are designed, the following numbers are represented by the corresponding following bit patterns, i.e.:
most negative value = 0x8000 = -32768
0x8001 = most negative value + 1 = -32767
0xFFFF = most negative value + 0x7FFF = -1
0xFFFE = most negative value + 0x7FFE = 0xFFFF - 1 = -2
0xFFFD = most negative value + 0x7FFD = 0xFFFF - 2 = 0xFFFE - 1 = -3
The number 0xFFFD does not cause an overflow, since -3 is perfectly within the range of -32768 through 32767.
Any short variable is a signed two's complement and the result of -3(decimal) is calculated like any two's complement value. So to find out how to calculate the result of -3(decimal) please take a look on the many tutorials on youtube or in any pertinent textbooks.
Your compiler seems to perform an implicit type conversion of your short variable to an int value before it prints the number as an int. To do that it has to carry out a so-called sign extension, since it has to make a 32-bit signed two's complement value out of a 16-bit one. Just as any decimal number is preceded by an infinite number of zeros (for example 34 = 0034), a negative two's complement number is preceded by an infinite number of 1s. So the compiler copies the most significant bit to the left, making 0xFFFFFFFD out of 0xFFFD and 37777777775 (oct) out of 177775 (oct), i.e. (00)1 111 111 111 111 101 (bin).
I hope that information helped you.
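To make the sign extension visible, here is a hedged sketch using fixed-width types, so the 16-bit and 32-bit widths are explicit rather than assumed from short and int. Both function names are made up for illustration.

```c
#include <stdint.h>

/* The 16-bit pattern actually stored for v. */
uint16_t stored_pattern16(int16_t v)
{
    return (uint16_t)v;
}

/* The 32-bit pattern after v has been widened (sign-extended) to 32 bits. */
uint32_t sign_extended_pattern32(int16_t v)
{
    return (uint32_t)(int32_t)v;
}
```

For v = -3 the first helper gives 0xFFFD and the second gives 0xFFFFFFFD. Passing (unsigned)stored_pattern16(v) to %X is one way to print just the 16-bit pattern.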
but I don't know how to calculate the exact value,which is -3 in this case.
The most common way of representing integer numbers in computers is 2's complement.
If you have an n-bit integer, you get the decimal value like this:
Decimal value = $-b_{n-1} 2^{n-1} + b_{n-2} 2^{n-2} + \dots + b_1 2^1 + b_0 2^0$, where $b_i$ is the value of bit number $i$.
In your case you have a 16 bit variable. The hexadecimal representation is 0xfffd which in binary is 1111.1111.1111.1101
Inserting into the formula you'll get:
Decimal value = $-1 \cdot 2^{15} + 1 \cdot 2^{14} + \dots + 1 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0 = -3$
See https://en.wikipedia.org/wiki/Two%27s_complement for more about the subject.
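The weighted-sum formula translates fairly directly into a loop. The following is a sketch only; twos_complement_value is a hypothetical helper, and it assumes 1 <= n <= 32.

```c
#include <stdint.h>

/* Evaluate the low n bits of `bits` as an n-bit two's complement number:
   bits 0..n-2 carry weight +2^i, bit n-1 carries weight -2^(n-1).
   Assumes 1 <= n <= 32. */
long long twos_complement_value(uint32_t bits, int n)
{
    long long value = 0;
    for (int i = 0; i < n - 1; i++) {
        if ((bits >> i) & 1u)
            value += 1LL << i;          /* positive weights */
    }
    if ((bits >> (n - 1)) & 1u)
        value -= 1LL << (n - 1);        /* negative weight of the top bit */
    return value;
}
```

Calling twos_complement_value(0xFFFD, 16) reproduces the -3 computed by hand above.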
I read about twos complement on wikipedia and on stack overflow, this is what I understood but I'm not sure if it's correct
signed int
the leftmost bit is interpreted as $-2^{31}$ and this is how we can have negative numbers
unsigned int
the leftmost bit is interpreted as $+2^{31}$ and this is how we achieve large positive numbers
update
What will the compiler see when we store 3 vs -3?
I thought 3 is always 00000000000000000000000000000011
and -3 is always 11111111111111111111111111111101
example for 3 vs -3 in C:
unsigned int x = -3;
int y = 3;
printf("%d %d\n", x, y); // -3 3
printf("%u %u\n", x, y); // 4294967293 3
printf("%x %x\n", x, y); // fffffffd 3
Two's complement is a way to represent negative integers in binary.
First of all, here are the standard 32-bit integer ranges:
Signed = -(2 ^ 31) to ((2 ^ 31) - 1)
Unsigned = 0 to ((2 ^ 32) - 1)
In two's complement, a negative is represented by inverting the bits of its positive equivalent and adding 1:
10 which is 00001010 becomes -10 which is 11110110 (if the numbers were 8-bit integers).
Also, the binary representation is only important if you plan on using bitwise operators.
If you're doing basic arithmetic, then this is unimportant.
The only other case that may give unexpected results is taking the absolute value of the most negative signed value, -(2^31): its magnitude 2^31 does not fit in a signed 32-bit integer, so the result typically stays negative.
Your problem does not have to do with the representation, but the type.
A negative number stored in an unsigned integer has the same bit pattern; the difference is that it reads as a very large number, since the value must be positive and the top bit counts as an ordinary value bit rather than as a sign.
You should also realize that ((2^32) - 5) is the exact same thing as -5 if the value is unsigned, etc.
Therefore, the following holds true:
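unsigned int x = UINT_MAX - 4; /* 2^32 - 5 for a 32-bit unsigned int; UINT_MAX is from <limits.h> */
unsigned int y = -5; /* -5 is converted modulo 2^32 */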
if (x == y) {
printf("Negative values wrap around in unsigned integers on underflow.");
}
else {
printf( "Unsigned integer underflow is undefined!" );
}
The numbers don't change, just the interpretation of the numbers. For most two's complement processors, add and subtract do the same math, but set a carry / borrow status assuming the numbers are unsigned, and an overflow status assuming the numbers are signed. For multiply and divide, the result may be different between signed and unsigned numbers (if one or both numbers are negative), so there are separate signed and unsigned versions of multiply and divide.
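The carry versus overflow distinction can be modelled in software. The sketch below imitates an 8-bit adder; the struct and function names are invented for illustration and do not correspond to any real CPU interface.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint8_t sum;      /* low 8 bits of the result, same for both readings */
    bool carry;       /* set when the unsigned result does not fit in 8 bits */
    bool overflow;    /* set when the signed result does not fit in 8 bits */
} add8_flags;

add8_flags add8(uint8_t a, uint8_t b)
{
    unsigned int wide = (unsigned int)a + b;
    add8_flags r;
    r.sum = (uint8_t)wide;
    r.carry = wide > 0xFFu;
    /* signed overflow: both operands have the same sign bit and the
       sum's sign bit differs from it */
    r.overflow = ((a ^ r.sum) & (b ^ r.sum) & 0x80u) != 0;
    return r;
}
```

For example, add8(127, 1) sets overflow but not carry, while add8(255, 1) sets carry but not overflow; the sum bits are computed the same way in both cases.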
For 32-bit integers, for both signed and unsigned numbers, the n-th bit is always interpreted as $+2^n$.
For signed numbers with the 31st bit set, the result is adjusted by $-2^{32}$.
Example:
1111 1111 1111 1111 1111 1111 1111 1111 (base 2) as unsigned int is interpreted as $2^{31} + 2^{30} + \dots + 2^1 + 2^0$. The interpretation of this as a signed int would be the same MINUS $2^{32}$, i.e. $2^{31} + 2^{30} + \dots + 2^1 + 2^0 - 2^{32} = -1$.
(Well, it can be said that for signed numbers with the 31st bit set, this bit is interpreted as $-2^{31}$ instead of $+2^{31}$, like you said in the question. I find this way a little less clear.)
Your representation of 3 and -3 is correct: 3 = 0x00000003, $-3 + 2^{32}$ = 0xFFFFFFFD.
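The "same value minus 2^32" reading can be written down as a short sketch. The name signed_reading32 is hypothetical; a 64-bit intermediate is used so that 2^32 itself is representable.

```c
#include <stdint.h>

/* Read a 32-bit pattern as signed: take the unsigned value and,
   if bit 31 is set, subtract 2^32. */
int64_t signed_reading32(uint32_t pattern)
{
    int64_t value = pattern;              /* unsigned interpretation */
    if (pattern & 0x80000000u)
        value -= (int64_t)1 << 32;        /* adjust by -2^32 */
    return value;
}
```

signed_reading32(0xFFFFFFFD) gives -3 and signed_reading32(0x00000003) gives 3, in line with the representations above.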
Yes, you are correct, allow me to explain a bit further for clarification purposes.
The difference between int and unsigned int is how the bits are interpreted. The machine processes unsigned and signed values the same way for addition and subtraction; no extra bits are added for the sign, the most significant bit simply carries a negative weight in the signed reading. Two's complement notation is very readable when dealing with related subjects.
Example:
The two's complement of the number 5 (0101) in 4 bits is 1011: invert to 1010, then add 1.
In C++, which data type to use depends on the situation. You should use unsigned values when functions or operators return those values. ALUs handle signed and unsigned variables very similarly.
The exact rules for writing a 32-bit value in two's complement are as follows:
If the number is positive, write it in plain binary, up to 2^(32-1) - 1
If it is 0, use all zeroes
For negatives, write the magnitude, flip all the 1's and 0's, then add 1.
Example 2 (The beauty of Two's complement):
-2 + 2 = 0 is displayed as 1110 + 0010, and that is 10000. Discarding the carry out of the fourth bit, we have our result as 0000.
I am given this code to convert a signed integer into two's complement but I don't understand how it really works, especially if the input is negative.
void convertB2T( int32_t num) {
uint8_t bInt[32];
int32_t Mask = 0x01;
for (int position = 0; position < NUM_BITS; position++) {
bInt[position] = ( num & Mask) ? 1 : 0;
Mask = Mask << 1;
}
}
So my questions are:
num is an integer, Mask is a hex, so how does num & Mask work? Does C just convert num to binary representation and do the bitwise and? Also the output of Mask is an integer correct? So if this output is non-zero, it is treated as TRUE and if zero, FALSE, right?
How does this work if num is negative? I tried running the code and actually did not get a correct answer (all higher level bits are 1's).
This program basically extracts each bit of the number and puts it in a vector. So every bit becomes a vector element. It has nothing to do with two's complement conversion (although the resulting bit-vector will be in two's complement, as the internal representation of numbers is in two's complement).
The computer has no idea what hex means. Every value is stored in binary, because binary is the only thing the computer understands. So the "integer" and the hex values are both stored in binary (the hex literal is also just an integer). The bitwise operators are applied to these binary representations.
In order to understand what is happening with the result when num is negative, you need to understand that the result is basically the two's complement representation of num and you need to know how the two's complement representation works. Wikipedia is a good starting point.
To answer your questions
1. Yes, num is an integer written in decimal format and mask is also an integer written in hex format.
Yes, the C compiler treats num and mask by their binary equivalents.
Say
num = 24; // binary value on 32 bit machine is 00000000000000000000000000011000
mask = 0x01; // binary value on 32 bit machine is 00000000000000000000000000000001
Yes, the compiler now performs the bitwise & on the equivalent binary values.
Yes, if the output is nonzero, it is treated as true.
2. If a number is negative, it is represented in 2's complement form.
Basically your code is just storing the binary equivalent of the number into an array; it does not perform any conversion to two's complement itself.
If the MSB is 1, the number is negative. For a negative number, for example
num = -24; // start from the binary value of 24
00000000000000000000000000011000 -> apply 1's complement + 1 to this binary value
11111111111111111111111111100111 -> 1's complement
+00000000000000000000000000000001 -> add 1
--------------------------------
11111111111111111111111111101000 -> -24 representation
--------------------------------
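For comparison, here is a hedged variant of the bit-extraction loop that works on an unsigned copy of the number, which sidesteps shifting a signed mask into the sign bit. The function name to_bits and the choice to define NUM_BITS as 32 are assumptions made for this sketch, not part of the code in the question.

```c
#include <stdint.h>

#define NUM_BITS 32   /* assumed width, matching int32_t */

/* Store bit i of num's two's complement pattern in out[i] (out[0] is the LSB). */
void to_bits(int32_t num, uint8_t out[NUM_BITS])
{
    uint32_t pattern = (uint32_t)num;   /* same bits, unsigned view */
    for (int position = 0; position < NUM_BITS; position++)
        out[position] = (uint8_t)((pattern >> position) & 1u);
}
```

For num = -24 the low byte comes out as 11101000 and every higher bit is 1, which is the expected two's complement pattern rather than an error.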
From C traps and pitfalls
If a and b are two integer variables, known to be non-negative then to
test whether a+b might overflow use:
if ((int) ((unsigned) a + (unsigned) b) < 0 )
complain();
I didn't get that how comparing the sum of both integers with zero will let you know that there is an overflow?
The code you saw for testing for overflow is just bogus.
For signed integers, you must test like this:
if ((a^b) < 0) overflow=0; /* opposite signs can't overflow */
else if (a>0) overflow=(b>INT_MAX-a);
else overflow=(b<INT_MIN-a);
Note that the cases can be simplified a lot if one of the two numbers is a constant.
For unsigned integers, you can test like this:
overflow = (a+b<a);
This is possible because unsigned arithmetic is defined to wrap, unlike signed arithmetic which invokes undefined behavior on overflow.
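Wrapped up as functions, the two checks might look like the sketch below. The function names are hypothetical; the signed check compares against INT_MAX and INT_MIN before adding, so the overflowing addition itself is never performed.

```c
#include <limits.h>
#include <stdbool.h>

/* True if a + b would overflow int. Nothing is added, so no undefined behavior. */
bool add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;
    if (b < 0)
        return a < INT_MIN - b;
    return false;
}

/* True if a + b wraps around in unsigned arithmetic (which is well defined). */
bool uadd_would_wrap(unsigned int a, unsigned int b)
{
    return a + b < a;
}
```

For instance, add_would_overflow(INT_MAX, 1) is true while add_would_overflow(INT_MAX, INT_MIN) is false, since operands of opposite sign cannot overflow.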
When an overflow occurs, the sum exceeds the representable range (let's say that of a 32-bit int):
-2,147,483,648 <= sum <= 2,147,483,647
So when the sum overflows, it wraps around and goes back to the beginning:
2,147,483,647 + 1 = -2,147,483,648
If the sum is negative and you know the two numbers are positive, then the sum overflowed.
If a and b are known to be non-negative integers, the expression (int) ((unsigned) a + (unsigned) b) will indeed yield a negative number on overflow.
Let's assume a 4-bit system (max positive signed integer is 7 and max unsigned integer is 15) with the following values:
a = 6
b = 4
a + b = 10 (overflow if performed with integers)
While if we do the addition using the unsigned conversion, we will have:
(int)((unsigned)a + (unsigned)b) = (int)((unsigned)(10)) = -6
To understand why, we can quickly check the binary addition:
a = 0110 ; b = 0100 - first bit is the sign bit for signed int.
0110 +
0100
------
1010
For unsigned int, 1010 = 10. While the same representation in signed int means -6.
So the result of the operation is indeed < 0.
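The same idea can be sketched without the final cast back to int, whose result for out-of-range values is implementation-defined. The function name below is made up, and the sketch assumes, as the book does, that both inputs are non-negative.

```c
#include <limits.h>
#include <stdbool.h>

/* For non-negative a and b: true if a + b does not fit in int.
   The addition is done in unsigned arithmetic, where it cannot overflow
   for two non-negative int values, and the result is compared to INT_MAX. */
bool nonneg_sum_overflows(int a, int b)
{
    unsigned int sum = (unsigned int)a + (unsigned int)b;
    return sum > (unsigned int)INT_MAX;
}
```

A sum above INT_MAX is exactly a sum whose top bit is set, which is what the "< 0" test in the book detects after conversion.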
If the integers are unsigned and you're assuming IA32, you can do some inline assembly to check the value of the CF flag. The asm can be trimmed a bit, I know.
int of(unsigned int a, unsigned int b)
{
unsigned int c;
__asm__("addl %2,%1\n"
"pushfl \n"
"popl %0\n"
:"=r"(c), "+r"(a)
:"r"(b)
:"cc");
return(c&1);
}
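If inline assembly is not desired, GCC and Clang offer the __builtin_add_overflow extension, which reports the same condition. The sketch below assumes one of those compilers; the wrapper name of_builtin is made up for this example.

```c
/* Returns 1 if a + b does not fit in unsigned int (the add carries out), else 0.
   __builtin_add_overflow is a GCC/Clang extension, not standard C. */
int of_builtin(unsigned int a, unsigned int b)
{
    unsigned int sum;   /* receives the wrapped result; unused here */
    return __builtin_add_overflow(a, b, &sum) ? 1 : 0;
}
```

With this wrapper, of_builtin(~0u, 1u) returns 1 and of_builtin(1u, 2u) returns 0.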
There are some good explanations on this page.
Here's the simple way from that page that I like:
Do the addition normally, then check the result (e.g. for an unsigned a, if (a+23<23) overflow).
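As we know, the addition of 2 numbers might overflow.
So for that, the following adder-style identity is sometimes suggested.
Adder Concept
Suppose we have 2 numbers "a" and "b".
a + b == (a^b) + ((a&b) << 1);
Here a^b is the sum without carries and a&b marks the positions that generate a carry. Note that the shifted carry term can itself overflow, so this identity alone does not avoid overflow; the overflow-free variant is the average, (a&b) + ((a^b) >> 1).
And this is patented by Samsung.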
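A hedged note on the formula: the identity that holds in general is a + b == (a ^ b) + 2*(a & b), where a ^ b is the carry-less sum and a & b marks the carries. The sketch below shows an adder built from it and the related overflow-free average; both function names are made up for illustration.

```c
/* Add two unsigned values using only XOR, AND and shifts.
   Like ordinary unsigned addition, the result wraps on overflow. */
unsigned int add_with_xor_and(unsigned int a, unsigned int b)
{
    while (b != 0) {
        unsigned int carry = (a & b) << 1;  /* carries move one place left */
        a ^= b;                             /* sum without carries */
        b = carry;
    }
    return a;
}

/* Floor of (a + b) / 2 computed without ever forming a + b,
   so no intermediate value can overflow. */
unsigned int average_no_overflow(unsigned int a, unsigned int b)
{
    return (a & b) + ((a ^ b) >> 1);
}
```

For example, add_with_xor_and(6, 4) is 10, and average_no_overflow(6, 4) is 5. The adder does not prevent overflow by itself; only the average form stays within range for all inputs.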
Assuming two's complement representation and 8-bit integers, the most significant bit holds the sign (1 for negative and 0 for positive). Since we know the integers are non-negative, the most significant bit is 0 for both. If adding the unsigned representations of these numbers produces a 1 in the most significant bit, the addition has overflowed the signed range. To check whether an unsigned integer has a 1 in its most significant bit, either check whether it exceeds the maximum signed value, or convert it to a signed integer, which will then be negative (because the most significant bit is 1).
Example with 8-bit signed integers (range -128 to 127):
two's complement of 127 = 0111 1111
two's complement of 1 = 0000 0001
unsigned 127 = 0111 1111
unsigned 1 = 0000 0001
unsigned sum = 1000 0000
The sum is 128, which is not an overflow for an unsigned integer but is an overflow for a signed integer; the most significant bit gives it away.