Everything I look up just tells me how to do complement operations/calculations in C.
I want to know what representation C uses internally and how it handles overflow.
C allows three representations for signed integers
(https://port70.net/~nsz/c/c11/n1570.html#6.2.6.2p2):
- the corresponding value with sign bit 0 is negated (sign and magnitude);
- the sign bit has the value -(2^M) (two's complement);
- the sign bit has the value -(2^M - 1) (ones' complement).
Two's complement is most common.
Unsigned overflow wraps around: results are reduced modulo 2^N, where N is the number of bits in the unsigned type.
Signed overflow causes undefined behavior. I.e., it is assumed not to happen, and if you do make it happen, no guarantees can be made about your program's behavior.
Overflow in signed atomics is an exception: it is well defined and two's complement is mandated there: https://port70.net/~nsz/c/c11/n1570.html#7.17.7.5p3.
Whether C uses a one's complement, two's complement, or sign/magnitude representation for negative integers is implementation defined. That is, each compiler gets to choose, typically based on the processor it's generating code for.
So when you write
x = -x
the compiler might generate code equivalent to
x = ~x; /* one's complement */
or
x = ~x + 1; /* two's complement */
or
x ^= 0x80000000; /* sign/magnitude, assuming 32 bits */
Most of the time you don't have to worry about this, of course. (Also most of the time — these days it's a safe bet to say all of the time — you're working on a machine that uses two's complement, as it's the overwhelming favorite.)
Since it's implementation defined, the documentation is supposed to tell you. But I suppose you could always determine it empirically with a scrap of code:
#include <stdio.h>
#include <limits.h>

int main()
{
    int x = 1;
    int negativex = -x;

    if(negativex == ~x)
        printf("one's complement\n");
    else if(negativex == ~x + 1)
        printf("two's complement\n");
    /* build the sign-bit mask with an unsigned shift; shifting a 1
       into the sign bit of a signed int is undefined behavior */
    else if(negativex == (x ^ (int)(1u << (sizeof(x) * CHAR_BIT - 1))))
        printf("sign/magnitude\n");
    else
        printf("what the heck kind of machine are you on??\n");
}
You asked about overflow. For unsigned integers, overflow is defined as "wrapping around" in the obvious way (that is, it's performed modulo 2^N, where N is the number of bits). But for signed integers, overflow is formally undefined: theoretically there can be machines where signed integer overflow generates an error, more or less like dividing by 0.
(On ordinary two's complement machines, of course, signed integer arithmetic quietly wraps around in the obvious way also, since the whole point of two's complement is that wraparound overflow makes it work.)
Addendum: Although the C Standards so far have, as mentioned, allowed for all three possibilities, two's complement is such the overwhelming favorite these days that, from what I hear, the next revision of the C Standard is going to require/guarantee it.
In short, the two's complement of a number in C is its one's complement plus one. The one's complement of a binary number is formed by flipping every bit (1 to 0 and 0 to 1); adding one to that result gives the two's complement.
Related
I've done some quick tests (in an online debugger) suggesting that a signed int to unsigned int cast in C does not change the bit values.
What I want to know is whether this is guaranteed by a C standard, or just common (but not guaranteed) behaviour?
Conversion from signed int to unsigned int does not change the bit representation on two's-complement C implementations, which are by far the most common. It does change the bit representation of negative numbers (including possible negative zeroes) on one's-complement or sign-and-magnitude systems.
This is because the cast (unsigned int)a is not defined to preserve the bits; rather, the result is the positive remainder of dividing a by UINT_MAX + 1. Or, as the C standard (C11 6.3.1.3p2) puts it,
the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
The two’s complement representation for negative numbers is the most commonly used representation for signed numbers exactly because it has this property of negative value n mapping to the same bit pattern as the mathematical value n + UINT_MAX + 1 – it makes it possible to use the same machine instruction for signed and unsigned addition, and the negative numbers will work because of wraparound.
Casting from a signed to an unsigned integer is required to generate the correct arithmetic result (the same number), modulo the size of the unsigned integer, so to speak. That is, after
int i = anything;
unsigned int u = (unsigned int)i;
and on a machine with 32-bit ints, the requirement is that u is equal to i, modulo 2^32.
(We could also try to say that u receives the value i % 0x100000000, except it turns out that's not quite right, because the C rules say that when you divide a negative integer by a positive integer, you get a quotient rounded towards 0 and a negative remainder, which isn't the kind of modulus we want here.)
If i is 0 or positive, it's not hard to see that u will have the same bit pattern.
If i is negative, and if you're on a 2's complement machine, it turns out the result is also guaranteed to have the same bit pattern. (I'd love to present a nice proof of that result here, but I don't have time just now to try to construct it.)
The vast majority of today's machines use 2's complement. But if you were on a 1's complement or sign/magnitude machine, I'm pretty sure the bit patterns would not always be the same.
So, bottom line, the sameness of the bit patterns is not guaranteed by the C Standard, but arises due to a combination of the C Standard's requirements, and the particulars of 2's complement arithmetic.
I have recently been studying the one’s complement system of representing numbers and from what I understand there are two variants of the number 0. There is a negative zero (-0) and a positive zero (+0).
My question is, on a one’s complement architecture, how exactly is this anomaly treated in C? Does C make a distinction between -0 and +0 or are both of these forms simply treated as zero.
If it is the case that both +0 and -0 return TRUE when tested for zero, then I am wondering how the following sample code would work that calculates the number of set bits in an integer if we enter -0 as its input.
int bitcount(int x)
{
    int b;
    for (b = 0; x != 0; b++)
        x &= (x - 1);
    return b;
}
Since -0, in one’s complement, has all of its bits set to 1, -0 should produce the largest possible count of set bits; however, it appears that this code would fail the loop test condition x != 0, never enter the loop, and give an incorrect result.
Would it be possible somehow in C, in a one's complement architecture, to make the loop condition sensitive to positive zeros as in: x != +0 Also, if I would subtract 1 from +0, would I get -0, or -1. In other words, does +0 - 1 = -0 in a one’s complement architecture?
All in all, not to go too far off course in this discussion, I am just wondering how C treats the peculiarities of the number 0 in a one’s complement architecture.
It is implementation-defined whether, on a ones-complement architecture, the value "with sign bit and all value bits 1" is a "trap representation" or a normal value. If it is a trap representation, any attempt to do anything with it, or even create it in the first place, provokes undefined behavior. If it is a normal value, it is a "negative zero", and there is an explicit list of operations that are allowed to produce it:
If the implementation supports negative zeros, they shall be generated only by:
- the &, |, ^, ~, <<, and >> operators with operands that produce such a value;
- the +, -, *, /, and % operators where one operand is a negative zero and the result is zero;
- compound assignment operators based on the above cases.
It is unspecified whether these cases actually generate a negative zero or a normal zero, and whether a negative zero becomes a normal zero when stored in an object.
(C11/N1570, section 6.2.6.2 paragraph 3.)
It also appears to be unspecified (by omission) whether negative zero compares equal to normal zero. Similar rules apply to sign-and-magnitude architectures.
So, what this boils down to is, the behavior of your example code is implementation-defined, and the implementation may not define it helpfully. You would need to consult the compiler and architecture manuals for this hypothetical ones-complement machine to figure out whether it does what you want it to do.
However, the entire question is moot, because nobody has manufactured a non-twos-complement CPU in at least 25 years. One hopes that a future revision of the C standard will cease to allow for the possibility; it would simplify many things.
To answer your question, there are 2 possibilities to consider:
if the bit pattern with all bits set is a trap representation (which is explicitly allowed by the C Standard), passing such a value to the function has undefined behavior.
if this bit pattern is allowed, it is the one's complement representation of negative zero, which should compare equal to 0. In this case, the function as written has defined behavior and returns 0, since the initial loop test is false.
The result would be different if the function was written this way:
int bitcount32(int x) {
    // naive implementation assuming 31 value bits
    int count = 0, b;
    for (b = 0; b < 31; b++) {
        if (x & (1 << b))
            count++;
    }
    return count;
}
On this one's complement architecture, bitcount32(~0) would evaluate to 31:
(x & (1 << b)), with x having that all-bits-set pattern and b in a range where 1 << b is defined, evaluates to the nonzero value 1 << b for every tested bit, so each bit position is counted as set regardless of what value the pattern represents.
Note however that the posted implementation has undefined behavior for an argument of INT_MIN as x-1 causes a signed arithmetic overflow. It is highly advisable to always use unsigned types for bitwise and shift operations.
I'm new here but I was just wondering what ~0 would be if in the two's complement system, signed integer.
Is it -1 because the flipping of 0 is 1, but if it's signed the answer would be -1? Or is the answer just 1 and just because I'm working with signed numbers doesn't mean that it would be -1?
0 is a signed int (if you wanted it to be unsigned, you'd write 0U) and therefore ~0 is also a signed int.
If your machine uses a 2's-complement representation, then that will have the value -1. The vast majority of machines -- possibly all the machines you will ever see in your career -- are 2's-complement, but technically speaking, ~0 may invoke undefined behaviour if you use it on a machine which uses 1's-complement representation of signed integers and which also prohibits negative zeros.
Even if it may not matter, it's a good idea to get into the habit of only using unsigned integer types with bitwise operators.
Remember that the bitwise operators perform "the integer promotions" on their operands, which means that signed and unsigned short and char are automatically promoted to int -- not unsigned int (unless it happens that short is the same width as int) -- so an explicit cast to unsigned may be necessary.
~0 is not the two's complement of zero. It is the bit inversion of 0, which is the same as the one's complement.
If you want the two's complement in C, you will need -0 (note the minus sign)
And, -0 would just be 0.
Proof (in eight bits):
zero:             0b00000000
one's complement: 0b11111111
add one:          0b00000001 (ignoring the carry out)
                  ----------
two's complement: 0b00000000
given the following function:
int boof(int n) {
    return n + ~n + 1;
}
What does this function return? I'm having trouble understanding exactly what is being passed in to it. If I called boof(10), would it convert 10 to base 2, and then do the bitwise operations on the binary number?
This was a question I had on a quiz recently, and I think the answer is supposed to be 0, but I'm not sure how to prove it.
note: I know how each bitwise operator works, I'm more confused on how the input is processed.
Thanks!
When n is an int, n + ~n will always result in an int that has all bits set.
Strictly speaking, the behavior of adding 1 to such an int will depend on the representation of signed numbers on the platform. The C standard supports 3 representations for signed int:
for Two's Complement machines (the vast majority of systems in use today), the result will be 0 since an int with all bits set is -1.
on a One's Complement machine (which are pretty rare today, I believe), the result will be 1, since an int with all bits set is negative zero (-0), whose value is 0; alternatively, that bit pattern may be a trap representation, in which case the behavior is undefined.
a Signed-magnitude machine (are there really any of these still in use?), an int with all bits set is a negative number with the maximum magnitude (so the actual value will depend on the size of an int). In this case adding 1 to it will result in a negative number (the exact value, again depends on the number of bits that are used to represent an int).
Note that the above ignores that it might be possible for some implementations to trap with various bit configurations that might be possible with n + ~n.
Bitwise operations will not change the underlying representation of the number to base 2 - all math on the CPU is done using binary operations regardless.
What this function does is take n and then add it to the two's complement negative representation of itself. This essentially negates the input. Anything you put in will equal 0.
Let me explain with 8 bit numbers as this is easier to visualize.
10 is represented in binary as 00001010.
Negative numbers are stored in two's complement (NOTing the number and adding 1)
So the (~n + 1) portion for 10 looks like so:
11110101 + 1 = 11110110
So if we take n + ~n+1:
00001010 + 11110110 = 0
Notice if we add these numbers together we get a carry out of the high bit, which sets the carry flag (not the overflow flag) and is simply discarded, resulting in 0. (Adding a negative and a positive number together can never overflow!)
See: The CARRY and OVERFLOW flag in Binary Arithmetic
When performing bitwise subtraction using two's complement, how does one know when the overflow should be ignored? Several websites I read stated that the overflow is simply ignored, but that does not always work -- the overflow is necessary for problems like -35 - 37, as an extra digit is needed to express the answer of -72.
EDIT: Here's an example, using the above equation.
35 to binary -> 100011, find two's complement to make it negative: 011101
37 to binary -> 100101, find two's complement to make it negative: 011011
Perform addition of above terms (binary equivalent of -35 - 37):
011101
011011
------
111000
Take two's complement to convert back to positive: 001000
The above is what many websites (including academic ones) say the answer should be, as you ignore overflow. This is clearly incorrect, however.
An overflow happens when the result cannot be represented in the target data type. The value -72 can be represented in a char, which is a signed 8-bit quantity... there is no overflow in your example. Perhaps you are thinking about a borrow while doing bitwise subtraction... when you subtract a '1' from a '0' you need to borrow from the next higher order bit position. You cannot ignore borrows when doing subtraction.
-35 decimal is 11011101 in two's complement 8-bit
+37 decimal is 00100101 in two's complement 8-bit
Going right to left, from least significant to most significant bit, you can subtract each bit of +37 from the corresponding bit of -35 until you get to bit 5 (counting starts at bit 0 on the right). At bit position 5 you need to subtract '1' from '0', so you need to borrow from bit position 6 (the next higher-order bit) of -35, which happens to be a '1' prior to the borrow. The result looks like this:
-35 decimal is 11011101 in two's complement 8-bit
+37 decimal is 00100101 in two's complement 8-bit
--------
-72 decimal is 10111000 in two's complement 8-bit
The result is negative, and your result in 8-bit two's complement has the high order bit set (bit 7)... which is negative, so there is no overflow.
Update: I think I see where the confusion is, and I claim that the answer at "Adding and subtracting two's complement" is wrong when it says you can discard the carry (treating it as an indication of overflow). In that answer they do subtraction by converting the second operand to negative using two's complement and then adding. That's fine, but a carry doesn't represent overflow in that case. If you add two positive numbers in N bits (numbered 0 to N-1), and you consider this unsigned arithmetic with range 0 to (2^N)-1, then a carry out of bit position N-1 does mean you have overflow: the sum of two positive numbers (interpreted as unsigned to maximize the range of representable positive numbers) should not generate a carry out of the highest-order bit (bit N-1). So when adding two positive numbers you identify overflow by saying:
there must be no carry out of bit N-1 when you interpret them as unsigned and
the result in bit N-1 must be zero when interpreted as signed (two's complement)
Note, however, that processors don't distinguish between signed and unsigned addition/subtraction... they set the overflow flag to indicate that if you are interpreting your data as signed then the result could not be represented (is wrong).
Here is a very detailed explanation of carry and overflow flag. The takeaway from that article is this
In unsigned arithmetic, watch the carry flag to detect errors.
In unsigned arithmetic, the overflow flag tells you nothing interesting.
In signed arithmetic, watch the overflow flag to detect errors.
In signed arithmetic, the carry flag tells you nothing interesting.
This is consistent with the definition of arithmetic overflow in Wikipedia which says
Most computers distinguish between two kinds of overflow conditions. A carry occurs when the result of an addition or subtraction, considering the operands and result as unsigned numbers, does not fit in the result. Therefore, it is useful to check the carry flag after adding or subtracting numbers that are interpreted as unsigned values. An overflow proper occurs when the result does not have the sign that one would predict from the signs of the operands (e.g. a negative result when adding two positive numbers). Therefore, it is useful to check the overflow flag after adding or subtracting numbers that are represented in two's complement form (i.e. they are considered signed numbers).