Does C define the binary representation of integers, e.g. one's complement, two's complement, etc., or is this representation processor (computer or something else) dependent?
Example of code written in C:
short a = -5;
Where do I need to look to know whether a is two's complement 1111 1111 1111 1011 or sign-and-magnitude 1000 0000 0000 0101?
C supports the following three representations for signed integers:
2's complement (the most common, you're rather unlikely to see others in practice)
1's complement
sign-and-magnitude
C also allows there to be some padding (non-value) bits in the representation, which is also something very uncommon in practice.
C does not define whether integers should be stored in memory as big endian or little endian or in some other byte order.
If you want to find out exactly how integers are represented on a specific platform, you need to examine the underlying memory. There is also a quick check using <limits.h>: if -INT_MAX == INT_MIN + 1, you have a two's complement representation; otherwise it is uncertain which of the three it is.
I think it's safe to assume these days that there are no padding bits and the representation is 2's complement.
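As a rough illustration, here is a minimal sketch of my own (the variable names and output format are not from the question) that applies the <limits.h> check above and then dumps the bytes of a short, so you can see both the negative-number encoding and the byte order on your particular platform:

#include <stdio.h>
#include <limits.h>

int main(void)
{
    /* Two's complement is the only representation in which INT_MIN
       is one larger in magnitude than INT_MAX. */
    if (-INT_MAX == INT_MIN + 1)
        printf("signed int looks like two's complement\n");
    else
        printf("ones' complement, sign-and-magnitude, or trap at the minimum\n");

    /* Dump the raw bytes of a short to see the actual encoding and
       the byte order used by this platform. */
    short a = -5;
    const unsigned char *p = (const unsigned char *)&a;
    for (size_t i = 0; i < sizeof a; i++)
        printf("byte %zu: %02X\n", i, p[i]);

    return 0;
}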
It is platform dependent, just like little/big endian byte order.
The sizes of the integer types are also platform dependent.
The representation of an integer is something platform(processor)-dependent. See Endianness.
The binary representation of an integer is platform dependent. On typical platforms, negative numbers are stored as two's complement. For example:
int a = -5;
int b = 5;
printf("%d %d", a, b);
printf("\n%u %u", a, b);
will display,
-5 5
4294967291 5
A signed short -5 is 0xFFFB. The sign + absolute value representation would not make sense here.
Addition and subtraction don't care about the signed/unsigned distinction. If you add 0xFFFB and 0x0005 you get 0x0000. If you try 0x8005 + 0x0005 you get 0x800A, which according to your sign-and-magnitude hypothesis would be -10, which is nonsense. The representation is called two's complement, but really it is just -x = 2^16 - x (mod 2^16) for a 16-bit short.
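As a small illustration of that modular view (a sketch of my own, assuming the usual 16-bit unsigned short), adding the bit pattern of -5 to 5 wraps around to zero:

#include <stdio.h>

int main(void)
{
    unsigned short neg5 = 0xFFFB;  /* bit pattern of (short)-5 on a two's complement machine */
    unsigned short five = 0x0005;
    unsigned short sum  = (unsigned short)(neg5 + five);  /* arithmetic modulo 2^16 */
    printf("0xFFFB + 0x0005 = 0x%04X\n", (unsigned)sum);  /* prints 0x0000 */
    return 0;
}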
Everything I look up just tells me how to do complement operations/calculations in C.
I want to know what representation C uses internally and how it handles overflow.
C allows three representations for signed integers
(https://port70.net/~nsz/c/c11/n1570.html#6.2.6.2p2):
the corresponding value with sign bit 0 is negated (sign and magnitude);
the sign bit has the value -(2^M) (two's complement);
the sign bit has the value -(2^M - 1) (ones' complement).
Two's complement is most common.
Unsigned overflow wraps around: arithmetic on an unsigned type is performed modulo one more than the maximum value of that type.
Signed overflow causes undefined behavior. I.e., it is assumed not to happen, and if you do make it happen, no guarantees can be made about your program's behavior.
Overflow in signed atomics is an exception: it is well defined and two's complement is mandated there: https://port70.net/~nsz/c/c11/n1570.html#7.17.7.5p3.
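A tiny sketch of the difference (variable names are mine): the unsigned case is guaranteed to wrap, while the commented-out signed case would be undefined behavior:

#include <stdio.h>
#include <limits.h>

int main(void)
{
    unsigned int u = UINT_MAX;
    u = u + 1;             /* well defined: wraps modulo UINT_MAX + 1 */
    printf("%u\n", u);     /* prints 0 */

    /* By contrast, the following would be undefined behavior:
       int s = INT_MAX;
       s = s + 1;          no guarantees can be made about the program */
    return 0;
}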
Whether C uses a one's complement, two's complement, or sign/magnitude representation for negative integers is implementation defined. That is, each compiler gets to choose, typically based on the processor it's generating code for.
So when you write
x = -x
the compiler might generate code equivalent to
x = ~x; /* one's complement */
or
x = ~x + 1; /* two's complement */
or
x ^= 0x80000000; /* sign/magnitude, assuming 32 bits */
Most of the time you don't have to worry about this, of course. (Also most of the time — these days it's a safe bet to say all of the time — you're working on a machine that uses two's complement, as it's the overwhelming favorite.)
Since it's implementation defined, the documentation is supposed to tell you. But I suppose you could always determine it empirically with a scrap of code:
#include <stdio.h>
#include <limits.h>
int main()
{
    int x = 1;
    int negativex = -x;

    if(negativex == ~x)
        printf("one's complement\n");
    else if(negativex == ~x + 1)
        printf("two's complement\n");
    else if(negativex == (x ^ (1 << (sizeof(x) * CHAR_BIT - 1))))
        printf("sign/magnitude\n");
    else
        printf("what the heck kind of machine are you on??\n");
}
You asked about overflow. For unsigned integers, overflow is defined as "wrapping around" in the obvious way (that is, it's performed modulo 2^N, where N is the number of bits). But for signed integers, overflow is formally undefined: theoretically there can be machines where signed integer overflow generates an error, more or less like dividing by 0.
(On ordinary two's complement machines, of course, signed integer arithmetic quietly wraps around in the obvious way also, since the whole point of two's complement is that wraparound overflow makes it work.)
Addendum: Although the C Standards so far have, as mentioned, allowed for all three possibilities, two's complement is such the overwhelming favorite these days that, from what I hear, the next revision of the C Standard is going to require/guarantee it.
In short, the two's complement of a binary number is its ones' complement plus one.
The ones' complement of a binary number is created by flipping every bit (turning each 1 into 0 and each 0 into 1); adding one to that result gives the two's complement.
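A quick sketch of that recipe (variable names are mine; this assumes the machine itself uses two's complement, as essentially all current ones do), showing that flipping the bits and adding one reproduces the negated value:

#include <stdio.h>

int main(void)
{
    short x = 5;
    short ones = (short)~x;          /* ones' complement: flip every bit  -> 0xFFFA */
    short twos = (short)(ones + 1);  /* two's complement: add one         -> 0xFFFB */
    printf("~x     = 0x%04hX\n", (unsigned short)ones);
    printf("~x + 1 = 0x%04hX\n", (unsigned short)twos);
    printf("-x     = 0x%04hX\n", (unsigned short)-x);   /* same bit pattern as ~x + 1 */
    return 0;
}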
I'm trying to convert my 16 bit integer to two's complement if it's negative.
At the moment, I'm using the One's complement operator. I figure I can use that, and then add 1 to the binary value to convert it to two's complement. However, I'm unable to do x = ~a + 1 because that just yields the integer value + 1.
If my process is correct, how can I add 1 to the binary integer? If not, what is the most appropriate way to convert a 16 bit integer to 2's complement in Objective-C?
Two's complement is a representation of signed numbers, not a value or an operation. On nearly all modern computers two's complement is the standard representation of signed integers. (And it is one of three representations of signed integers allowed by the standard.)
Therefore, if you have any signed integral number, it is already represented in two's complement. I think you want "the" two's complement of a positive number. That is simply the negated value, if your machine uses two's complement for signed numbers.
int positiveValue = 5;
int twosComplement = -positiveValue;
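To actually see the 16-bit pattern (which seems to be what the question is after), it is enough to look at the negated value through an unsigned 16-bit type. A small sketch of my own, assuming a two's complement machine:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    int16_t positiveValue = 5;
    int16_t negated = -positiveValue;    /* the two's complement of 5 */
    uint16_t bits = (uint16_t)negated;   /* reinterpret as the raw 16-bit pattern */
    printf("0x%04X\n", (unsigned)bits);  /* prints 0xFFFB */
    return 0;
}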
I have C code in which I do the following.
int nPosVal = +0xFFFF; // + Added for ease of understanding
int nNegVal = -0xFFFF; // - Added for valid reason
Now when I try
printf ("%d %d", nPosVal >> 1, nNegVal >> 1);
I get
32767 -32768
Is this expected?
I am able to think something like
65535 >> 1 = (int) 32767.5 = 32767
-65535 >> 1 = (int) -32767.5 = -32768
That is, -32767.5 is rounded off to -32768.
Is this understanding correct?
It looks like your implementation is probably doing an arithmetic bit shift with two's complement numbers. In this system, it shifts all of the bits to the right and then fills in the vacated upper bits with copies of whatever the most significant (sign) bit was. So for your example, treating int as 32 bits here:
nPosVal = 00000000000000001111111111111111
nNegVal = 11111111111111110000000000000001
After the shift, you've got:
nPosVal = 00000000000000000111111111111111
nNegVal = 11111111111111111000000000000000
If you convert this back to decimal, you get 32767 and -32768 respectively.
Effectively, a right shift rounds towards negative infinity.
Edit: According to Section 6.5.7 of the latest draft standard, this behavior on negative numbers is implementation-defined:
The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed type and a nonnegative value, the value of the result is the integral part of the quotient of E1 / 2^E2. If E1 has a signed type and a negative value, the resulting value is implementation-defined.
Their stated rationale for this: The C89 Committee affirmed the freedom in implementation granted by K&R in not requiring the signed right shift operation to sign extend, since such a requirement might slow down fast code and since the usefulness of sign extended shifts is marginal. (Shifting a negative two's complement integer arithmetically right one place is not the same as dividing by two!)
So it's implementation dependent in theory. In practice, I've never seen an implementation not do an arithmetic shift right when the left operand is signed.
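A small sketch of my own illustrating that parenthetical remark: on a typical two's complement machine with an arithmetic right shift, shifting a negative odd number right rounds towards negative infinity, while integer division rounds towards zero:

#include <stdio.h>

int main(void)
{
    int n = -5;
    printf("%d >> 1 = %d\n", n, n >> 1);  /* typically -3 (rounds towards -infinity) */
    printf("%d / 2  = %d\n", n, n / 2);   /* -2 (C division rounds towards zero)     */
    return 0;
}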
No, you don't get fractional numbers like 0.5 when working with integers. The results can be easily explained when you look at the binary representations of the two numbers:
65535: 00000000000000001111111111111111
-65535: 11111111111111110000000000000001
Bit shifting to the right one bit, and extending at the left (note that this is implementation dependent, thanks Trent):
65535 >> 1: 00000000000000000111111111111111
-65535 >> 1: 11111111111111111000000000000000
Convert back to decimal:
65535 >> 1 = 32767
-65535 >> 1 = -32768
The C specification does not say whether the sign bit is copied into the vacated high bit (sign extension) or not. It is implementation-defined.
When you right-shift, the least-significant-bit is discarded.
0xFFFF = 0 1111 1111 1111 1111, which right-shifts to give 0 0111 1111 1111 1111 = 0x7FFF
-0xFFFF = 1 0000 0000 0000 0001 (2s complement), which right-shifts to 1 1000 0000 0000 0000 = -0x8000
A-1: Yes. 0xffff >> 1 is 0x7fff or 32767. I'm not sure what -0xffff does. That's peculiar.
A-2: Shifting is not the same thing as dividing. It is bit shifting—a primitive binary operation. That it sometimes can be used for some types of division is convenient, but not always the same.
Beneath the C level, machines have a CPU core which is entirely integer or scalar. Although these days every desktop CPU has an FPU, this was not always the case and even today embedded systems are made with no floating point instructions.
Today's programming paradigms and CPU designs and languages date from the era where the FPU might not even exist.
So, CPU instructions implement fixed point operations, generally treated as purely integer ops. Only if a program declares items of float or double will any fractions exist. (Well, you can use the CPU ops for "fixed point" with fractions but that is now and always was quite rare.)
Regardless of what was required by a language standard committee years ago, all reasonable machines propagate the sign bit on right shifts of signed numbers. Right shifts of unsigned values shift in zeroes on the left. The bits shifted out on the right are dropped on the floor.
To further your understanding you will need to investigate "twos-complement arithmetic".
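As a small illustration of that distinction (my own sketch, assuming 32-bit int and the usual arithmetic shift for signed values), the same bit pattern shifts differently depending on whether the type is signed or unsigned:

#include <stdio.h>

int main(void)
{
    int          s = -65535;        /* bit pattern 0xFFFF0001 on a 32-bit two's complement machine */
    unsigned int u = 0xFFFF0001u;   /* the same bit pattern, but unsigned */

    printf("signed   >> 1: 0x%08X\n", (unsigned)(s >> 1));  /* sign bit copied in: 0xFFFF8000 */
    printf("unsigned >> 1: 0x%08X\n", u >> 1);               /* zero shifted in:   0x7FFF8000 */
    return 0;
}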
Consider these definitions:
int x=5;
int y=-5;
unsigned int z=5;
How are they stored in memory? Can anybody explain the bit representation of these in memory?
Can int x=5 and int y=-5 have same bit representation in memory?
ISO C states what the differences are.
The int data type is signed and has a required range of at least -32767 through 32767 inclusive. The actual limits are given in limits.h as INT_MIN and INT_MAX respectively.
An unsigned int has a required range of at least 0 through 65535 inclusive, with the actual maximum value given by UINT_MAX in that same header file.
Beyond that, the standard does not mandate two's complement notation for encoding the values; that's just one of the possibilities. The three allowed representations would encode 5 and -5 as follows (using 16-bit data types):
     | two's complement    | ones' complement    | sign/magnitude      |
     +---------------------+---------------------+---------------------+
   5 | 0000 0000 0000 0101 | 0000 0000 0000 0101 | 0000 0000 0000 0101 |
  -5 | 1111 1111 1111 1011 | 1111 1111 1111 1010 | 1000 0000 0000 0101 |
     +---------------------+---------------------+---------------------+
In two's complement, you get a negative of a number by inverting all bits then adding 1.
In ones' complement, you get a negative of a number by inverting all bits.
In sign/magnitude, the top bit is the sign so you just invert that to get the negative.
Note that positive values have the same encoding for all representations, only the negative values are different.
Note further that, for unsigned values, you do not need to use one of the bits for a sign. That means you get more range on the positive side (at the cost of no negative encodings, of course).
And no, 5 and -5 cannot have the same encoding regardless of which representation you use. Otherwise, there'd be no way to tell the difference.
As an aside, there are currently moves underway, in both C and C++ standards, to nominate two's complement as the only encoding for negative integers.
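For the concrete variables in the question, here is a minimal sketch of my own that prints the bit patterns on whatever machine you run it on. Note that converting to unsigned shows the mathematical value modulo 2^N, which coincides with the stored pattern on a two's complement machine:

#include <stdio.h>
#include <limits.h>

static void print_bits(unsigned int v)
{
    /* print from the most significant bit down to bit 0 */
    for (int i = (int)(sizeof v * CHAR_BIT) - 1; i >= 0; i--)
        putchar(((v >> i) & 1u) ? '1' : '0');
    putchar('\n');
}

int main(void)
{
    int x = 5;
    int y = -5;
    unsigned int z = 5;

    print_bits((unsigned int)x);  /* 000...00101 */
    print_bits((unsigned int)y);  /* 111...11011 -- the two's complement pattern on typical machines */
    print_bits(z);                /* 000...00101 -- same pattern as x */
    return 0;
}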
Because it is all just memory, in the end all numerical values are stored in binary.
A 32 bit unsigned integer can hold values from all bits 0 to all bits 1.
For a 32 bit signed integer, one of its bits (the most significant) indicates whether the value is positive or negative.
The C standard specifies that unsigned numbers are stored in binary (with optional padding bits). Signed numbers can be stored in one of three formats: sign and magnitude, two's complement, or ones' complement. Interestingly, that rules out certain other representations like Excess-n or Base -2.
However, most machines and compilers store signed numbers in 2's complement.
int is normally 16 or 32 bits. The standard says that int should be whatever size is most natural/efficient for the underlying processor; anything at least as wide as short and no wider than long is allowed by the standard.
On some machines and OSs, however, history has caused int not to be the best size for the current generation of hardware.
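To see what sizes and limits the implementation you are using actually picked, a little sketch of my own that prints them from <limits.h>:

#include <stdio.h>
#include <limits.h>

int main(void)
{
    printf("CHAR_BIT      = %d\n", CHAR_BIT);
    printf("sizeof(short) = %zu, range %d..%d\n", sizeof(short), SHRT_MIN, SHRT_MAX);
    printf("sizeof(int)   = %zu, range %d..%d\n", sizeof(int), INT_MIN, INT_MAX);
    printf("sizeof(long)  = %zu, range %ld..%ld\n", sizeof(long), LONG_MIN, LONG_MAX);
    return 0;
}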
Here is a very nice link which explains the storage of signed and unsigned int in C:
http://answers.yahoo.com/question/index?qid=20090516032239AAzcX1O
Taken from the above article:
"process called two's complement is used to transform positive numbers into negative numbers. The side effect of this is that the most significant bit is used to tell the computer if the number is positive or negative. If the most significant bit is a 1, then the number is negative. If it's 0, the number is positive."
Assuming int is a 16 bit integer (which depends on the C implementation; most are 32 bit nowadays), the bit representations differ as follows:
5 = 0000000000000101
-5 = 1111111111111011
If the binary pattern 1111111111111011 were interpreted as an unsigned value, it would be decimal 65531.
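A quick way to check that number (a sketch of my own, assuming a two's complement machine) is to reinterpret -5 through an unsigned 16-bit type:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    int16_t  s = -5;
    uint16_t u = (uint16_t)s;      /* same 16 bits, read as unsigned */
    printf("%u\n", (unsigned)u);   /* prints 65531 */
    return 0;
}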