I have C code in which I do the following.
int nPosVal = +0xFFFF; // + Added for ease of understanding
int nNegVal = -0xFFFF; // - Added for valid reason
Now when I try
printf ("%d %d", nPosVal >> 1, nNegVal >> 1);
I get
32767 -32768
Is this expected?
I am able to think something like
65535 >> 1 = (int) 32767.5 = 32767
-65535 >> 1 = (int) -32767.5 = -32768
That is, -32767.5 is rounded off to -32768.
Is this understanding correct?
It looks like your implementation is probably doing an arithmetic bit shift with two's complement numbers. In this system, it shifts all of the bits to the right and then fills in the upper bits with a copy of whatever the last bit was. So for your example, treating int as 32-bits here:
nPosVal = 00000000000000001111111111111111
nNegVal = 11111111111111110000000000000001
After the shift, you've got:
nPosVal = 00000000000000000111111111111111
nNegVal = 11111111111111111000000000000000
If you convert this back to decimal, you get 32767 and -32768 respectively.
Effectively, a right shift rounds towards negative infinity.
Edit: According to the Section 6.5.7 of the latest draft standard, this behavior on negative numbers is implementation dependent:The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed type and a nonnegative value, the value of the result is the integral part of the quotient of E1 / 2E2. If E1 has a signed type and a negative value, the resulting value is implementation-defined.
Their stated rational for this: The C89 Committee affirmed the freedom in implementation granted by K&R in not requiring the signed right shift operation to sign extend, since such a requirement might slow down fast code and since the usefulness of sign extended shifts is marginal. (Shifting a negative two’s complement
integer arithmetically right one place is not the same as dividing by two!)
So it's implementation dependent in theory. In practice, I've never seen an implementation not do an arithmetic shift right when the left operand is signed.
No, you don't get fractional numbers like 0.5 when working with integers. The results can be easily explained when you look at the binary representations of the two numbers:
65535: 00000000000000001111111111111111
-65535: 11111111111111110000000000000001
Bit shifting to the right one bit, and extending at the left (note that this is implementation dependant, thanks Trent):
65535 >> 1: 00000000000000000111111111111111
-65535 >> 1: 11111111111111111000000000000000
Convert back to decimal:
65535 >> 1 = 32767
-65535 >> 1 = -32768
The C specification does not specify if the sign bit is shifted over or not. It is implementation dependent.
When you right-shift, the least-significant-bit is discarded.
0xFFFF = 0 1111 1111 1111 1111, which right-shifts to give 0 0111 1111 1111 1111 = 0x7FFF
-0xFFFF = 1 0000 0000 0000 0001 (2s complement), which right-shifts to 1 1000 0000 0000 0000 = -0x8000
A-1: Yes. 0xffff >> 1 is 0x7fff or 32767. I'm not sure what -0xffff does. That's peculiar.
A-2: Shifting is not the same thing as dividing. It is bit shifting—a primitive binary operation. That it sometimes can be used for some types of division is convenient, but not always the same.
Beneath the C level, machines have a CPU core which is entirely integer or scalar. Although these days every desktop CPU has an FPU, this was not always the case and even today embedded systems are made with no floating point instructions.
Today's programming paradigms and CPU designs and languages date from the era where the FPU might not even exist.
So, CPU instructions implement fixed point operations, generally treated as purely integer ops. Only if a program declares items of float or double will any fractions exist. (Well, you can use the CPU ops for "fixed point" with fractions but that is now and always was quite rare.)
Regardless of what was required by a language standard committee years ago, all reasonable machines propagate the sign bit on right shifts of signed numbers. Right shifts of unsigned values shift in zeroes on the left. The bits shifted out on the right are dropped on the floor.
To further your understanding you will need to investigate "twos-complement arithmetic".
Related
I'm right-shifting -109 by 5 bits, and I expect -3, because
-109 = -1101101 (binary)
shift right by 5 bits
-1101101 >>5 = -11 (binary) = -3
But, I am getting -4 instead.
Could someone explain what's wrong?
Code I used:
int16_t a = -109;
int16_t b = a >> 5;
printf("%d %d\n", a,b);
I used GCC on linux, and clang on osx, same result.
The thing is you are not considering negative numbers representation correctly. With right shifting, the type of shift (arithmetic or logical) depends on the type of the value being shifted. If you cast your value to an unsigned value, you might get what you are expecting:
int16_t b = ((unsigned int)a) >> 5;
You are using -109 (16 bits) in your example. 109 in bits is:
00000000 01101101
If you take's 109 2's complement you get:
11111111 10010011
Then, you are right shifting by 5 the number 11111111 10010011:
__int16_t a = -109;
__int16_t b = a >> 5; // arithmetic shifting
__int16_t c = ((__uint16_t)a) >> 5; // logical shifting
printf("%d %d %d\n", a,b,c);
Will yield:
-109 -4 2044
The result of right shifting a negative value is implementation defined behavior, from the C99 draft standard section 6.5.7 Bitwise shift operators paragraph 5 which says (emphasis mine going forward):
The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type
or if E1 has a signed type and a nonnegative value, the value of the result is the integral
part of the quotient of E1 / 2E2. If E1 has a signed type and a negative value, the
resulting value is implementation-defined.
If we look at gcc C Implementation-defined behavior documents under the Integers section it says:
The results of some bitwise operations on signed integers (C90 6.3, C99 and C11 6.5).
Bitwise operators act on the representation of the value including both the sign and value bits, where the sign bit is considered immediately above the highest-value value bit. Signed ‘>>’ acts on negative numbers by sign extension.
That's pretty clear what's happening, when representing signed integers, negative integers have a property which is, sign extension, and the left most significant bit is the sign bit.
So, 1000 ... 0000 (32 bit) is the biggest negative number that you can represent, with 32 bits.
Because of this, when you have a negative number and you shift right, a thing called sign extension happens, which means that the left most significant bit is extended, in simple terms it means that, for a number like -109 this is what happens:
Before shifting you have (16bit):
1111 1111 1001 0011
Then you shift 5 bits right (after the pipe are the discarded bits):
1XXX X111 1111 1100 | 1 0011
The X's are the new spaces that appear in your integer bit representation, that due to the sign extension, are filled with 1's, which give you:
1111 1111 1111 1100 | 1 0011
So by shifting: -109 >> 5, you get -4 (1111 .... 1100) and not -3.
Confirming results with the 1's complement:
+3 = 0... 0000 0011
-3 = ~(0... 0000 0011) + 1 = 1... 1111 1100 + 1 = 1... 1111 1101
+4 = 0... 0000 0100
-4 = ~(0... 0000 0100) + 1 = 1... 1111 1011 + 1 = 1... 1111 1100
Note: Remember that the 1's complement is just like the 2's complement with the diference that you first must negate the bits of positive number and only then sum +1.
Pablo's answer is essentially correct, but there are two small bits (no pun intended!) that may help you see what's going on.
C (like pretty much every other language) uses what's called two's complement, which is simply a different way of representing negative numbers (it's used to avoid the problems that come up with other ways of handling negative numbers in binary with a fixed number of digits). There is a conversion process to turn a positive number in two's complement (which looks just like any other number in binary - except that the furthest most left bit must be 0 in a positive number; it's basically the sign place-holder) is reasonably simple computationally:
Take your number
00000000 01101101 (It has 0s padding it to the left because it's 16 bits. If it was long, it'd be padded with more zeros, etc.)
Flip the bits
11111111 10010010
Add one.
11111111 10010011.
This is the two's complement number that Pablo was referring to. It's how C holds -109, bitwise.
When you logically shift it to the right by five bits you would APPEAR to get
00000111 11111100.
This number is most definitely not -4. (It doesn't have a 1 in the first bit, so it's not negative, and it's way too large to be 4 in magnitude.) Why is C giving you negative 4 then?
The reason is basically that the ISO implementation for C doesn't specify how a given compiler needs to treat bit-shifting in negative numbers. GCC does what's called sign extension: the idea is to basically pad the left bits with 1s (if the initial number was negative before shifting), or 0s (if the initial number was positive before shifting).
So instead of the 5 zeros that happened in the above bit-shift, you instead get:
11111111 11111100. That number is in fact negative 4! (Which is what you were consistently getting as a result.)
To see that that is in fact -4, you can just convert it back to a positive number using the two's complement method again:
00000000 00000011 (bits flipped)
00000000 00000100 (add one).
That's four alright, so your original number (11111111 11111100) was -4.
For GCC 32 bits, -1 >> 1 returns me FFFFFFFF, but I thought after 2's complement, I will get
0111 1111 ... 1111 which should be 7fff ffff. did i miss something?
Under most implementations, that operator does an arithmetic shift for signed types, so it preserves the sign bit (which is the leftmost bit), in this case 1.
As #Clifford correctly pointed out, the language standard leaves the implementation of >> up to the implementor.
See the Wikipedia article for details.
For E1 >> E2, if E1 is negative, then the behavior is implementation-defined, which means different compilers could use different strategies to implement it.
Apparently GCC choose arithmetic shift, as pointed out by #merlin2011
This question already has answers here:
Arithmetic bit-shift on a signed integer
(6 answers)
Closed 9 years ago.
so, lets say I have a signed integer (couple of examples):
-1101363339 = 10111110 01011010 10000111 01110101 in binary.
-2147463094 = 10000000 00000000 01010000 01001010 in binary.
-20552 = 11111111 11111111 10101111 10111000 in binary.
now: -1101363339 >> 31 for example, should equal 1 right? but on my computer, I am getting -1. Regardless of what negative integer I pick if x = negative number, x >> 31 = -1. why? clearly in binary it should be 1.
Per C99 6.5.7 Bitwise shift operators:
If E1 has a signed type and a negative value, the resulting value is implementation-defined.
where E1 is the left-hand side of the shift expression. So it depends on your compiler what you'll get.
In most languages when you shift to the right it does an arithmetic shift, meaning it preserves the most significant bit. Therefore in your case you have all 1's in binary, which is -1 in decimal. If you use an unsigned int you will get the result you are looking for.
Per C 2011 6.5.7 Bitwise shift operators:
The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type
or if E1 has a signed type and a nonnegative value, the value of the result is the integral
part of the quotient of E1/ 2E2. If E1 has a signed type and a negative value, the
resulting value is implementation-defined.
Basically, the right-shift of a negative signed integer is implementation defined but most implementations choose to do it as an arithmetic shift.
The behavior you are seeing is called an arithmetic shift which is when right shifting extends the sign bit. This means that the MSBs will carry the same value as the original sign bit. In other words, a negative number will always be negative after a left shift operation.
Note that this behavior is implementation defined and cannot be guaranteed with a different compiler.
What you are seeing is an arithmetic shift, in contrast to the bitwise shift you were expecting; i.e., the compiler, instead of "brutally" shifting the bits, is propagating the sign bit, thus dividing by 2N.
When talking about unsigned ints and positive ints, a right shift is a very simple operation - the bits are shifted to the right by one place (inserting 0 on the left), regardless of their meaning. In such cases, the operation is equivalent to dividing by 2N (and actually the C standard defines it like that).
The distinction comes up when talking about negative numbers. Several negative numbers representation exist, although currently for integers most commonly 2's complement representation is used.
The problem of a "brutal" bitwise shift here is, for starters, that one of the bits is used in some way to express the sign; thus, shifting the binary digits regardless of the negative integers representation can give unexpected results.
For example, commonly in 2's representation the most significant bit is 1 for negative numbers, 0 for positive numbers; applying a bitwise shift (with zeroes inserted to the left) to a negative number would (between other things) make it positive, not resulting in the (usually expected) division by 2N
So, arithmetic shift is introduced; negative numbers represented in 2's complement have an interesting property: the division by 2N behavior of the shift is preserved if, instead of inserting zeroes from the left, you insert bits that have the same value of the original sign bit.
In this way, signed divisions by 2N can be performed with just a bit of extra logic in the shift, without having to resort to a fully-fledged division routine.
Now, is arithmetic shift guaranteed for signed integers? In some languages yes1, but in C it's not like that - the behavior of the shift operators when dealing with negative integers is left as an implementation-defined detail.
As often happens, this is due to different hardware support for the operation; C is used on vastly different platforms, and, especially in the past, there was quite a difference in the "cost" of operations depending on the platform.
For example, if the processor does not provide an arithmetic right shift instruction, the compiler would be mandated to emit a much slower DIV instruction of some kind, which could be a problem in an inner loop on slower processors. For these reasons, the C standard leaves it up to the implementor to do the most appropriate thing for the current platform.
In your case, your implementation probably chose arithmetic shift because you are running on an x86 processor, that uses 2's complement arithmetic and provides both bitwise and arithmetic shift as single CPU instructions.
Actually, languages like Java even have separated arithmetic and bitwise shift operators - this is mainly due to the fact that they do not have unsigned types to e.g. store bitfields.
I have C code in which I do the following.
int nPosVal = +0xFFFF; // + Added for ease of understanding
int nNegVal = -0xFFFF; // - Added for valid reason
Now when I try
printf ("%d %d", nPosVal >> 1, nNegVal >> 1);
I get
32767 -32768
Is this expected?
I am able to think something like
65535 >> 1 = (int) 32767.5 = 32767
-65535 >> 1 = (int) -32767.5 = -32768
That is, -32767.5 is rounded off to -32768.
Is this understanding correct?
It looks like your implementation is probably doing an arithmetic bit shift with two's complement numbers. In this system, it shifts all of the bits to the right and then fills in the upper bits with a copy of whatever the last bit was. So for your example, treating int as 32-bits here:
nPosVal = 00000000000000001111111111111111
nNegVal = 11111111111111110000000000000001
After the shift, you've got:
nPosVal = 00000000000000000111111111111111
nNegVal = 11111111111111111000000000000000
If you convert this back to decimal, you get 32767 and -32768 respectively.
Effectively, a right shift rounds towards negative infinity.
Edit: According to the Section 6.5.7 of the latest draft standard, this behavior on negative numbers is implementation dependent:The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed type and a nonnegative value, the value of the result is the integral part of the quotient of E1 / 2E2. If E1 has a signed type and a negative value, the resulting value is implementation-defined.
Their stated rational for this: The C89 Committee affirmed the freedom in implementation granted by K&R in not requiring the signed right shift operation to sign extend, since such a requirement might slow down fast code and since the usefulness of sign extended shifts is marginal. (Shifting a negative two’s complement
integer arithmetically right one place is not the same as dividing by two!)
So it's implementation dependent in theory. In practice, I've never seen an implementation not do an arithmetic shift right when the left operand is signed.
No, you don't get fractional numbers like 0.5 when working with integers. The results can be easily explained when you look at the binary representations of the two numbers:
65535: 00000000000000001111111111111111
-65535: 11111111111111110000000000000001
Bit shifting to the right one bit, and extending at the left (note that this is implementation dependant, thanks Trent):
65535 >> 1: 00000000000000000111111111111111
-65535 >> 1: 11111111111111111000000000000000
Convert back to decimal:
65535 >> 1 = 32767
-65535 >> 1 = -32768
The C specification does not specify if the sign bit is shifted over or not. It is implementation dependent.
When you right-shift, the least-significant-bit is discarded.
0xFFFF = 0 1111 1111 1111 1111, which right-shifts to give 0 0111 1111 1111 1111 = 0x7FFF
-0xFFFF = 1 0000 0000 0000 0001 (2s complement), which right-shifts to 1 1000 0000 0000 0000 = -0x8000
A-1: Yes. 0xffff >> 1 is 0x7fff or 32767. I'm not sure what -0xffff does. That's peculiar.
A-2: Shifting is not the same thing as dividing. It is bit shifting—a primitive binary operation. That it sometimes can be used for some types of division is convenient, but not always the same.
Beneath the C level, machines have a CPU core which is entirely integer or scalar. Although these days every desktop CPU has an FPU, this was not always the case and even today embedded systems are made with no floating point instructions.
Today's programming paradigms and CPU designs and languages date from the era where the FPU might not even exist.
So, CPU instructions implement fixed point operations, generally treated as purely integer ops. Only if a program declares items of float or double will any fractions exist. (Well, you can use the CPU ops for "fixed point" with fractions but that is now and always was quite rare.)
Regardless of what was required by a language standard committee years ago, all reasonable machines propagate the sign bit on right shifts of signed numbers. Right shifts of unsigned values shift in zeroes on the left. The bits shifted out on the right are dropped on the floor.
To further your understanding you will need to investigate "twos-complement arithmetic".
Why does (-1 >> 1) result in -1? I'm working in C, though I don't think that should matter.
I can not figure out what I'm missing...
Here is an example of a C program that does the calc:
#include <stdio.h>
int main()
{
int num1 = -1;
int num2 = (num1 >> 1);
printf( "num1=%d", num1 );
printf( "\nnum2=%d", num2 );
return 0;
}
Because signed integers are represented in two's complement notation.
-1 will be 11111111 (if it was an 8 bit number).
-1 >> 1 evidently sign extends so that it remains 11111111. This behaviour depends on the compiler, but for Microsoft, when shifting a signed number right (>>) the sign bit is copied, while shifting an unsigned number right causes a 0 to be put in the leftmost bit.
An Arithmetic right shift will preserve the sign when shifting a signed number:
11111111 (-1) will stay 11111111 (-1)
In contrast, a Logical right shift won't preserve the sign:
11111111 (-1) will become 01111111 (127)
Your code clearly does an arithmetic shift, so the sign bit (MSB) is repeated. What the operator (>>) does depends on the implementation details of the platform you're using. In most cases, it's an arithmetic shift.
Also, note that 11111111 can have two different meanings depending on the representation. This also affects they way they'll be shifted.
If unsigned, 11111111 represents 255. Shifting it to the right won't preserve the sign since the MSB is not a sign bit.
If signed, 11111111 represents -1. Arithmetically shifting it to the right will preserve the sign.
Bit shifting a negative number is implementation-behavior in C. The results will depend on your platform, and theoretically could be completely nonsensical. From the C99 standard (6.5.7.5):
The result of E1 >> E2 is E1
right-shifted E2 bit positions. If E1
has an unsigned type or if E1 has a
signed type and a nonnegative value,
the value of the result is the
integral part of the quotient of E1 /
2^E2 . If E1 has a signed type and a
negative value, the resulting value is
implementation-defined.
The reason this happens is most likely because your compiler uses the x86 SAR (Shift Arithmetic Right) instruction to implement >>. This means sign extension will occur - the most significant bit will be replicated into the new MSB once the value is shifted over. From the intel manuals:
The shift arithmetic right (SAR) and
shift logical right (SHR) instructions
shift the bits of the destination
operand to the right (toward less
significant bit locations). For each
shift count, the least significant bit
of the destination operand is shifted
into the CF flag, and the most
significant bit is either set or
cleared depending on the instruction
type. The SHR instruction clears the
most significant bit (see Figure 7-8
in the Intel® 64 and IA-32
Architectures Software Developer’s
Manual, Volume 1); the SAR instruction
sets or clears the most significant
bit to correspond to the sign (most
significant bit) of the original value
in the destination operand. In effect,
the SAR instruction fills the empty
bit position’s shifted value with the
sign of the unshifted value (see
Figure 7-9 in the Intel® 64 and IA-32
Architectures Software Developer’s
Manual, Volume 1).
When you right shift and the leftmost bit is a 1, some platforms/compilers will bring in a 0 and some will preserve the 1 and make the new leftmost bit a 1. This preserves the sign of the number so a negative number stays negative and is called sign extension.
You'll see the difference if you try ((unsigned) -1) >> 1, which will do an unsigned right shift and so will always shift in a 0 bit.
sign extension.