Are the shift operators (<<, >>) arithmetic or logical in C? - c

In C, are the shift operators (<<, >>) arithmetic or logical?

When shifting left, there is no difference between arithmetic and logical shift. When shifting right, the type of shift depends on the type of the value being shifted.
(As background for those readers unfamiliar with the difference, a "logical" right shift by 1 bit shifts all the bits to the right and fills in the leftmost bit with a 0. An "arithmetic" shift leaves the original value in the leftmost bit. The difference becomes important when dealing with negative numbers.)
When shifting an unsigned value, the >> operator in C is a logical shift. When shifting a signed value, the >> operator is usually an arithmetic shift, although for negative values the standard leaves the result implementation-defined.
For example, assuming a 32 bit machine:
signed int x1 = 5;
assert((x1 >> 1) == 2);
signed int x2 = -5;
assert((x2 >> 1) == -3);
unsigned int x3 = (unsigned int)-5;
assert((x3 >> 1) == 0x7FFFFFFD);

According to K&R 2nd edition the results are implementation-dependent for right shifts of signed values.
Wikipedia says that C/C++ 'usually' implements an arithmetic shift on signed values.
Basically you need to either test your compiler or not rely on it. My VS2008 help for the current MS C++ compiler says that their compiler does an arithmetic shift.

TL;DR
Let i and n be the left and right operands respectively of a shift operator, and let T be the type of i after integer promotion. Assuming n is in [0, sizeof(T) * CHAR_BIT) (the behaviour is undefined otherwise), we have these cases:
| Direction | Type | Value (i) | Result |
| ---------- | -------- | --------- | ------------------------ |
| Right (>>) | unsigned | ≥ 0 | −∞ ← (i ÷ 2ⁿ) |
| Right | signed | ≥ 0 | −∞ ← (i ÷ 2ⁿ) |
| Right | signed | < 0 | Implementation-defined† |
| Left (<<) | unsigned | ≥ 0 | (i * 2ⁿ) % (T_MAX + 1) |
| Left | signed | ≥ 0 | (i * 2ⁿ) ‡ |
| Left | signed | < 0 | Undefined |
† most compilers implement this as arithmetic shift
‡ undefined if value overflows the result type T; promoted type of i
Shifting
First is the difference between logical and arithmetic shifts from a mathematical viewpoint, without worrying about data type size. A logical shift always fills the vacated bits with zeros, while an arithmetic shift fills them with zeros only for a left shift; for a right shift it copies the MSB, thereby preserving the sign of the operand (assuming a two's complement encoding for negative values).
In other words, a logical shift looks at the shifted operand as just a stream of bits and moves them, without bothering about the sign of the resulting value. An arithmetic shift looks at it as a (signed) number and preserves the sign as shifts are made.
A left arithmetic shift of a number X by n is equivalent to multiplying X by 2ⁿ and is thus equivalent to logical left shift; a logical shift would also give the same result since MSB anyway falls off the end and there's nothing to preserve.
A right arithmetic shift of a number X by n is equivalent to integer division of X by 2ⁿ ONLY if X is non-negative! Integer division is nothing but mathematical division followed by rounding towards 0 (trunc).
For negative numbers, represented by two's complement encoding, shifting right by n bits has the effect of mathematically dividing it by 2ⁿ and rounding towards −∞ (floor); thus right shifting is different for non-negative and negative values.
for X ≥ 0, X >> n = X / 2ⁿ = trunc(X ÷ 2ⁿ)
for X < 0, X >> n = floor(X ÷ 2ⁿ)
where ÷ is mathematical division, / is integer division. Let's look at an example:
(37)₁₀ = (100101)₂
37 ÷ 2 = 18.5
37 / 2 = 18 (rounding 18.5 towards 0) = (10010)₂ [result of arithmetic right shift]
(-37)₁₀ = (11011011)₂ (considering a two's complement, 8-bit representation)
-37 ÷ 2 = -18.5
-37 / 2 = -18 (rounding -18.5 towards 0) = (11101110)₂ [NOT the result of arithmetic right shift]
-37 >> 1 = -19 (rounding -18.5 towards −∞) = (11101101)₂ [result of arithmetic right shift]
As Guy Steele pointed out, this discrepancy has led to bugs in more than one compiler. Here non-negative (math) can be mapped to unsigned and signed non-negative values (C); both are treated the same and right-shifting them is done by integer division.
So logical and arithmetic are equivalent in left-shifting and for non-negative values in right shifting; it's in right shifting of negative values that they differ.
Operand and Result Types
Standard C99 §6.5.7:
Each of the operands shall have integer types.
The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behaviour is undefined.
short E1 = 1, E2 = 3;
int R = E1 << E2;
In the above snippet, both operands become int (due to integer promotion); if E2 was negative or E2 ≥ sizeof(int) * CHAR_BIT then the operation is undefined. This is because shifting more than the available bits is surely going to overflow. Had R been declared as short, the int result of the shift operation would be implicitly converted to short; a narrowing conversion, which may lead to implementation-defined behaviour if the value is not representable in the destination type.
Left Shift
The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and non-negative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behaviour is undefined.
As left shifts are the same for both, the vacated bits are simply filled with zeros. It then states that for both unsigned and signed types it's an arithmetic shift. I'm interpreting it as arithmetic shift since logical shifts don't bother about the value represented by the bits, it just looks at it as a stream of bits; but the standard talks not in terms of bits, but by defining it in terms of the value obtained by the product of E1 with 2^E2.
The caveat here is that for signed types the value should be non-negative and the resulting value should be representable in the result type. Otherwise the operation is undefined. The result type would be the type of the E1 after applying integral promotion and not the destination (the variable which is going to hold the result) type. The resulting value is implicitly converted to the destination type; if it is not representable in that type, then the conversion is implementation-defined (C99 §6.3.1.3/3).
If E1 is a signed type with a negative value then the behaviour of left shifting is undefined. This is an easy route to undefined behaviour which may easily get overlooked.
Right Shift
The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed type and a non-negative value, the value of the result is the integral part of the quotient of E1 / 2^E2. If E1 has a signed type and a negative value, the resulting value is implementation-defined.
Right shifts of unsigned and signed non-negative values are pretty straightforward; the vacant bits are filled with zeros. For signed negative values the result of right shifting is implementation-defined. That said, most implementations like GCC and Visual C++ implement right-shifting as arithmetic shifting by preserving the sign bit.
Conclusion
Unlike Java, which has a special operator >>> for logical shifting apart from the usual >> and <<, C and C++ have only arithmetic shifting with some areas left undefined and implementation-defined. The reason I deem them as arithmetic is due to the standard wording the operation mathematically rather than treating the shifted operand as a stream of bits; this is perhaps the reason why it leaves those areas un/implementation-defined instead of just defining all cases as logical shifts.

In terms of the type of shift you get, the important thing is the type of the value that you're shifting. A classic source of bugs is when you shift a literal to, say, mask off bits. For example, if you wanted to drop the left-most bit of an unsigned integer, then you might try this as your mask:
~0 >> 1
Unfortunately, this will get you into trouble because the mask will have all of its bits set because the value being shifted (~0) is signed, thus an arithmetic shift is performed. Instead, you'd want to force a logical shift by explicitly declaring the value as unsigned, i.e. by doing something like this:
~0U >> 1;

Here are functions to guarantee logical right shift and arithmetic right shift of an int in C:
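int logicalRightShift(int x, int n) {
    /* Shift as unsigned so that zeros enter from the left. Converting the
       result back to int is implementation-defined if it exceeds INT_MAX. */
    return (unsigned)x >> n;
}
int arithmeticRightShift(int x, int n) {
    if (x < 0 && n > 0)
        /* OR in the top n bits in case the compiler's >> shifted in zeros */
        return x >> n | ~(~0U >> n);
    else
        return x >> n;
}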

When you do
- left shift by 1 you multiply by 2
- right shift by 1 you divide by 2
x = 5
x >> 1
x = 2 ( x=5/2)
x = 5
x << 1
x = 10 (x=5*2)

Well, I looked it up on wikipedia, and they have this to say:
C, however, has only one right shift operator, >>. Many C compilers choose which right shift to perform depending on what type of integer is being shifted; often signed integers are shifted using the arithmetic shift, and unsigned integers are shifted using the logical shift.
So it sounds like it depends on your compiler. Also in that article, note that left shift is the same for arithmetic and logical. I would recommend doing a simple test with some signed and unsigned numbers on the border case (high bit set of course) and see what the result is on your compiler. I would also recommend avoiding depending on it being one or the other, since the C standard does not pin this down, at least if it is reasonable and possible to avoid such dependence.

Left shift <<
This one is fairly easy. The shift operators are always bit-wise operations, so they cannot be used with double or float operands. Whenever we left shift, a zero is always added at the least significant bit (LSB).
But in right shift >> we have to follow one additional rule, called "sign bit copy" (for signed values, on implementations that use an arithmetic shift): if the most significant bit (MSB) was set before the shift, it is set again after the shift, and if it was clear, it stays clear. This rule does not apply to a left shift.
The most important example of right shift: if you keep right-shifting any negative number, the value eventually reaches -1, and after that shifting -1 any number of times leaves the value the same. Please check.

gcc will typically use logical shifts on unsigned variables and for left-shifts on signed variables. The arithmetic right shift is the truly important one because it will sign extend the variable.
gcc will use this when applicable, as other compilers are likely to do.

GCC does:
- for negative values: arithmetic shift
- for non-negative values: logical shift

According to many C compilers:
<< is an arithmetic left shift or bitwise left shift.
>> is an arithmetic right shift or bitwise right shift.

Related

Question about using bitshift operators on int

I am a beginner so dont get mad at my simple question but suppose I have an int variable
lets say a, and I do a<<3 will that be equal to a*2^3 = a*8 as I read that bitshift operators multiply the variable with 2^x.
Am I correct or I am misreading this situation??
Thanks!
Yes, with some exceptions.
Consider 1234₁₀ × 10³ = 1234000₁₀. We added zeroes to the right of the decimal representation of the number by multiplying it by a power of ten.
Similarly, we can add zeroes to the right of the binary representation of the number by multiplying it by a power of two. For example, 1011₂ × 2³ = 1011000₂.
The exceptions fall into two categories.
Exceptions due to overflow
If the left operand has an unsigned type, and the result is too large for that type, the most significant bits will be dropped.
For example, in an environment with a 32-bit unsigned int type, 80000001₁₆ × 2¹ = 100000002₁₆, but 0x80000001u << 1 produces the chopped result 0x00000002u.
If the left operand has a signed type, and the result is too large for that type, the behaviour is undefined.
For example, in an environment with a 32-bit int type, the behaviour of 1 << 31 is undefined.
Exceptions due to weird operands
If the value of the right operand is negative, the behaviour is undefined.
For example, the behaviour of 1 << -1 is undefined.
If the value of the left operand is negative, the behaviour is undefined.
For example, the behaviour of -1 << 1 is undefined.
If the value of the right operand is greater than or equal to the width of the promoted left operand, the behaviour is undefined.
For example, in an environment with a 32-bit unsigned int type, the behaviour of 1u << 32 is undefined.
C17, on the semantics of <<:
The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
Left shift simply means shifting the set binary values some number of places to the left. For example, 3 has a binary value of 11, so doing 3<<3 means we shift these set bits (1) 3 places to the left. So 11 becomes 11000, which is equal to 24.
You are correct! By bitshifting to the left you multiply your integer value by 2 for each position shifted.
Visualising it, you do the following:
Let's say a is an 8-bit integer and is equal to 1.
That means that the binary code will look like: 00000001
Now if we bit shift 3 times to the left, that binary code becomes 00001000, which is 8.

Do I have correct understanding of Undefined Behavior for shift operators in C?

I want to make sure I understand exactly under what circumstances the << and >> operators in C produce Undefined Behavior. This is my current understanding:
Let...:
x_t be the type of x after integer promotion
N be the bitwidth of x after integer promotion
M be the number of 0s to the left of the most-significant 1 bit in the representation of x after integer promotion
x << y is UB if any of the following:
x < 0 (even if y == 0)
y < 0
y >= N
x_t is a signed type and y >= M
x >> y is UB if any of the following:
y < 0
y >= N
...and is implementation defined if:
x < 0
If I have this understanding correct, it would imply the following:
unsigned short x = 1;
x << 31;
This would be undefined behavior in the case where int is 32 bits and short is 16 (because x would be promoted to int, and the left shift by 31 would put the 1 bit into position 31), but it would be defined behavior in the case where int and short are both 32 bits (because x would be promoted to an unsigned int and 31 < 32).
Yes.
I find your definition of M a little weak. Specifically, it wasn't clear to me if you were including the sign bit.
But yes, the interpretation is correct.
The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
y < 0 ⇒ UB
y >= N ⇒ UB
The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
This paragraph is poorly worded.
There's no doubt that the behaviour of E1 << E2 isn't defined when E1 × 2^E2 isn't representable.
For x << y, x_t is a signed type and y >= M ⇒ UB[1]
But what about when "E1 has a signed type and nonnegative value" is false? 3 << 2 is clearly not undefined behaviour, so that means neither the "then" nor the "otherwise" clause applies when this is false, so that means the spec is silent on the behaviour of -3 << 2. It's literally behaviour that's not defined by the spec. So,
For x << y, x < 0 ⇒ UB
The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed type and a nonnegative value, the value of the result is the integral part of the quotient of E1 / 2^E2. If E1 has a signed type and a negative value, the resulting value is implementation-defined.
For x >> y, x < 0 ⇒ Implementation defined
We need to consider not just two's-complement, but ones' complement and sign-magnitude in assessing the validity of this interpretation.
Do I have correct understanding of Undefined Behavior for shift operators in C?
Yes, although the M part is a little vague.
The "any of the following" lists give some examples, but they do not exhaust the whole space of possible behaviors. INT_MAX << 1 is also UB. The rule is x * 2**y <= X_T_MAX, where X_T_MAX is the maximum representable value in type x_t. That is a 2D plane of allowed numbers.
What you have written here is far more cumbersome to comprehend than the actual standard... the TL;DR of C17 6.5.7 can be summarized as:
UB: Don't left shift variables containing negative values.
UB: Don't left shift data into the sign bit of a signed operand (the type obtained after promotion).
UB: Don't shift by a negative or too large shift count.
Right-shifting variables containing negative values gives implementation-defined behavior in the form of either logical or arithmetic shift. Non-portable.
Done, that's it. No need to make things more complicated.
The golden rule is: never perform bitwise arithmetic on signed types ever. Abide by it and you'll avoid a number of well-known bugs.
Under C89, the behavior of left-shifting an N-bit integer left by 0..N-1 bits was unambiguously defined for all possible signed or unsigned values of the integer, except on platforms where signed and unsigned types had padding bits in different places. On non-two's-complement platforms, however, behaving in the mandated fashion may have been less useful than e.g. processing << as a "multiply by power of two" operator, and more expensive than allowing compilers to select in arbitrary fashion from among platform-specific interpretations (e.g. sometimes processing x<<1 as (x+x) and sometimes processing it by actually shifting x left by one bit).
Because there was no reason to imagine that implementations for two's-complement platforms would deviate from the C89 behavior, even if allowed to do so, and because people who worked with other platforms would be better placed than the Committee to weigh the pros and cons of handling the construct in precisely-predictable or somewhat-unpredictable fashion, the C99 Committee opted to waive jurisdiction over the behavior of left-shifting negative numbers. The Committee classified left-shifts of negative numbers as Undefined Behavior because they never imagined that its characterization of actions as "non-portable or erroneous" would be twisted to imply that the Committee judged such actions "non-portable, and therefore erroneous".

Bitshift Causing Overflow When It Shouldn't

When I am bitshifting the max positive 2's complement, shouldn't shifting it 31 bits make an effective 0 because it starts as 0111 1111, etc.
I have tried decreasing the shift but I am assuming it's just the computer reading it incorrectly.
#include <stdio.h>

int main(void)
{
    int x = 0x7FFFFFFF;
    int nx = ~x;
    int ob = nx >> 31;
    int ans = ob & nx;
    printf("%d", ans);
    return 0;
}
I am expecting ob to be 0 but it turns out as the twos complement minimum. I am using this to create a bang without actually using !.
If you were shifting the max positive two's complement number, it would end up as zero.
But you are not shifting that number:
int x = 0x7FFFFFFF;
int nx = ~x; // 0x80000000 (assuming 32-bit int).
You are bit-shifting the largest (in magnitude) negative number.
And, as per the standards document C11 6.5.7 Bitwise shift operators /5 (my emphasis):
The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed type and a nonnegative value, the value of the result is the integral part of the quotient of E1 / 2^E2 . If E1 has a signed type and a negative value, the resulting value is implementation-defined.
Your implementation seems to preserve the sign bit, which is why you end up with the non-zero negative value.
As an aside, if you want a ! operator, you can just use:
output = (input == 0);
See afore-mentioned standard, 6.5.3.3 Unary arithmetic operators /5 (again, my emphasis) where it explicitly calls out the equivalence:
The result of the logical negation operator ! is 0 if the value of its operand compares unequal to 0, 1 if the value of its operand compares equal to 0. The result has type int. The expression !E is equivalent to (0==E).

can't shift negative numbers to the right in c

I am going through 'The C language by K&R'. Right now I am doing the bitwise section. I am having a hard time in understanding the following code.
int mask = ~0 >> n;
I was playing on using this to mask n left side of another binary like this.
0000 1111
1010 0101 // random number
My problem is that when I print mask it is still -1, assuming n is 4. I thought shifting ~0, which is -1, would give 15 (0000 1111).
thanks for the answers
Performing a right shift on a negative value yields an implementation defined value. Most hosted implementations will shift in 1 bits on the left, as you've seen in your case, however that doesn't necessarily have to be the case.
Unsigned types as well as positive values of signed types always shift in 0 bits on the left when shifting right. So you can get the desired behavior by using unsigned values:
unsigned int mask = ~0u >> n;
This behavior is documented in section 6.5.7 of the C standard:
5 The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed type and a nonnegative value, the value of the result is the integral part of the quotient of E1 / 2^E2. If E1 has a signed type and a negative value, the resulting value is implementation-defined.
Right-shifting negative signed integers is an implementation-defined behavior, which is usually (but not always) filling the left with ones instead of zeros. That's why no matter how many bits you've shifted, it's always -1, as the left is always filled by ones.
When you shift unsigned integers, the left will always be filled by zeros. So you can do this:
unsigned int mask = ~0U >> n;  // note the U suffix
You should also note that int is typically 2 or 4 bytes, meaning if you want to get 15, you need to right-shift 12 or 28 bits instead of only 4. You can use a char instead:
unsigned char mask = ~0U;
mask >>= 4;
In C, and many other languages, >> is (usually) an arithmetic right shift when performed on signed variables (like int). This means that the new bit shifted in from the left is a copy of the previous most-significant bit (MSB). This has the effect of preserving the sign of a two's complement negative number (and in this case the value).
This is in contrast to a logical right shift, where the MSB is always replaced with a zero bit. This is applied when your variable is unsigned (e.g. unsigned int).
From Wikipedia:
The >> operator in C and C++ is not necessarily an arithmetic shift. Usually it is only an arithmetic shift if used with a signed integer type on its left-hand side. If it is used on an unsigned integer type instead, it will be a logical shift.
In your case, if you plan to be working at a bit level (i.e. using masks, etc.) I would strongly recommend two things:
Use unsigned values.
Use types with specific sizes from <stdint.h> like uint32_t

Why does right shifting negative numbers in C bring 1 on the left-most bits? [duplicate]

The book "C The Complete Reference" by Herbert Schildt says that "(In the case of a signed, negative integer, a right shift will cause a 1 to be brought in so that the sign bit is preserved.)"
What's the point of preserving the sign bit?
Moreover, I think that the book is referring to the case when negative numbers are represented using a sign bit and not using two's complement. But still even in that case the reasoning doesn't seem to make any sense.
The Schildt book is widely acknowledged to be exceptionally poor.
In fact, C doesn't guarantee that a 1 will be shifted in when you right-shift a negative signed number; the result of right-shifting a negative value is implementation-defined.
However, if right-shift of a negative number is defined to shift in 1s to the highest bit positions, then on a 2s complement representation it will behave as an arithmetic shift - the result of right-shifting by N will be the same as dividing by 2^N, rounding toward negative infinity.
The statement is sweeping and inaccurate, like many a statement by Mr Schildt. Many people recommend throwing his books away. (Amongst other places, see The Annotated Annotated C Standard, and ACCU Reviews — do an author search on Schildt; see also the Definitive List of C Books on Stack Overflow).
It is implementation defined whether right shifting a negative (necessarily signed) integer shifts zeros or ones into the high order bits. The underlying CPUs (for instance, ARM; see also this class) often have two different underlying instructions — ASR or arithmetic shift right and LSR or logical shift right, of which ASR preserves the sign bit and LSR does not. The compiler writer is allowed to choose either, and may do so for reasons of compatibility, speed or whimsy.
ISO/IEC 9899:2011 §6.5.7 Bitwise shift operators
¶5 The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed type and a nonnegative value, the value of the result is the integral part of the quotient of E1 / 2^E2. If E1 has a signed type and a negative value, the resulting value is implementation-defined.
The point is that the C >> (right shift) operator preserves[1] the sign for a (signed) int.
For example:
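#include <stdio.h>

int main(void) {
    int a = -8;
    unsigned int b = 0xFFEEDDCC;
    /* %X expects an unsigned argument, so the signed values are cast */
    printf("%d (0x%X) >> 1 = %d (0x%X)\n", a, (unsigned)a, a >> 1, (unsigned)(a >> 1));
    printf("%u (0x%X) >> 1 = %u (0x%X)\n", b, b, b >> 1, b >> 1);
    return 0;
}
Output (with a 32-bit int and an arithmetic shift for signed values):
-8 (0xFFFFFFF8) >> 1 = -4 (0xFFFFFFFC) [sign preserved, MSB = 1]
4293844428 (0xFFEEDDCC) >> 1 = 2146922214 (0x7FF76EE6) [MSB = 0]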
If it didn't preserve the sign, the result would make absolutely no sense. You would take a small negative number, and by shifting right one (dividing by two), you would end up with a large positive number instead.
[1] This is implementation-defined, but from my experience, most compilers choose an arithmetic (sign-preserving) shift instruction.
In the case of a signed, negative integer, a right shift will cause a 1 to be brought in so that the sign bit is preserved
Not necessarily. See the C standard C11 6.5.7:
The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed type and a nonnegative value, the value of the result is the integral part of the quotient of E1 / 2^E2. If E1 has a signed type and a negative value, the resulting value is implementation-defined.
This means that the compiler is free to shift in whatever it likes (0 or 1), as long as it documents it.

Resources