I am a beginner so dont get mad at my simple question but suppose I have an int variable
lets say a, and I do a<<3 will that be equal to a*2^3 = a*8 as I read that bitshift operators multiply the variable with 2^x.
Am I correct or I am misreading this situation??
Thanks!
Yes, with some exceptions.
Consider 123410 × 103 = 123400010. We added zeroes to the right of the decimal representation of the number by multiplying it by a power of ten.
Similarly, we can add zeroes to the right of the binary representation of the number by multiplying it by a power of two. For example, 10112 × 23 = 10110002.
The exceptions fall into two categories.
Exceptions due to overflow
If the left operand has an unsigned type, and the result is too large for that type, the most significant bits will be dropped.
For example, in an environment with a 32-bit unsigned int type, 8000000116 × 21 = 1000000216, but 0x80000001u << 1 produces the chopped result 0x0000002u.
If the left operand has an signed type, and the result is too large for that type, the behaviour is undefined.
For example, in an environment with a 32-bit int type, the behaviour of 1 << 31 is undefined.
Exceptions due to weird operands
If the value of the right operand is negative, the behaviour is undefined.
For example, the behaviour of 1 << -1 is undefined.
If the value of the left operand is negative, the behaviour is undefined.
For example, the behaviour of -1 << 1 is undefined.
If the value of the right operand is greater than or equal to the width of the promoted left operand, the behaviour is undefined.
For example, in an environment with a 32-bit unsigned int type, the behaviour of 1u << 32 is undefined.
C17, on the semantics of <<:
The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
Left shift simply means to shift the set binary values to some places to the left. for example, 3 has a binary value of 11 so doing 3<<3 means we left shift these set bits (1) to 3 places to the left. so 11 becomes 11000 which is equal to 24.
You are correct! By bitshifting to the left you multiply your integer value with 2.
Visualising it, you do the following:
Let's say a is an 8-bit integer and is equal to 1.
That means that the binary code will look like: 00000001
Now if we bit shift 3 times to the left, that binary code becomes 00001000, which is 8.
Related
I have the following function in C:
int lrot32(int a, int n)
{
printf("%X SHR %d = %X\n",a, 32-n, (a >> (32-n)));
return ((a << n) | (a >> (32-n)));
}
When I pass as arguments lrot32(0x8F5AEB9C, 0xB) I get the following:
8F5AEB9C shr 21 = FFFFFC7A
However, the result should be 47A. What am I doing wrong?
Thank you for your time
int is a signed integer type. C11 6.5.7p4-5 says the following:
4 The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. [...] If E1 has a signed type and nonnegative value, and E1 x 2E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
5 The result of E1 >> E2 is E1 right-shifted E2 bit positions. [...] if E1 has a signed type and a nonnegative value, the value of the result is the integral part of the quotient of E1 / 2E2 . If E1 has a signed type and a negative value, the resulting value is implementation-defined.
Thus in the case of <<, if the shifted value is negative, or the positive value after shift is not representable in the result type (here: int), the behaviour is undefined; in the case of >>, if the value is negative the result is implementation defined.
Thus in either case, you'd get results that at least depend on the implementation, and in the case of left-shift, worse, possibly on the optimization level and such. A strictly conforming program cannot rely on any particular behaviour.
If however you want to target a particular compiler, then check its manuals on what the behaviour - if any is specified - would be. For example GCC says:
The results of some bitwise operations on signed integers (C90 6.3,
C99 and C11 6.5).
Bitwise operators act on the representation of the value including
both the sign and value bits, where the sign bit is considered
immediately above the highest-value value bit. Signed ‘>>’ acts on
negative numbers by sign extension. [*]
As an extension to the C language, GCC does not use the latitude given
in C99 and C11 only to treat certain aspects of signed ‘<<’ as
undefined. However, -fsanitize=shift (and -fsanitize=undefined) will
diagnose such cases. They are also diagnosed where constant
expressions are required.
[*] sign extension here means that the sign bit - which is 1 for negative integers, is repeated by the shift amountwhen right-shift is executed - this is why you see those Fs in the result.
Furthermore GCC always requires 2's complement representation, so if you would always use GCC, no matter which architecture you're targeting, this is the behaviour you'd see. Also, in the future someone might use another compiler for your code, thus causing other behaviour there.
Perhaps you'd want to use unsigned integers - unsigned int or rather, if a certain width is expected, then for example uint32_t, as the shifts are always well-defined for it, and would seem to match your expectations.
Another thing to note is that not all shift amounts are allowed. C11 6.5.7 p3:
[...]If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
Thus if you ever shift an unsigned integer having width of 32 bits by 32 - left or right, the behaviour is undefined. This should be kept in mind. Even if the compiler wouldn't do anything wacky, some processor architectures do act as if shift by 32 would then shift all bits away - others behave as if the shift amount was 0.
Is something like
uint32_t foo = 1;
foo = foo << 33;
undefined behaviour in C?
The shift is undefined behaviour in fact. See the standard (C11, final draft N1570, 6.5.7p3):
"... If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.".
Rationale: Shift operations can behave quite different on different CPU architectures if the shift count is >= the width of the argument. This way the standard allows the compiler to generate the fastest code without caring about border-effects.
Note: that things are different if int is wider than 33 bits (e.g. 64 bits). Reason are the integer promotions which first convert the uint32_t to int, so the shift is executed with the (then larger) int value. This leaves the back-conversion to uint32_t of the assignment, See 6.3.1.3, paragraph 1, 2 for this case. However, on most modern systems, int is not larger than 32 bits.
This is (apparently) undefined behaviour. From the C standard section 6.5.7 (of WG14/N1256 Committee Draft — Septermber 7, 2007 ISO/IEC 9899:TC3 - effectively the C99 standard):
The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
Re 3, this suggests it is undefined behaviour as the value of the right operand is greater or equal to the width of the promoted left operand.
Re 4, As the shift is unsigned, the first sentence applies: If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. This would (but for 3) suggest result is therefore zero.
I believe 3 will take precedence over 4, so it is (after all) undefined.
Olaf's answer shows the same is true under C11.
I am trying to understand the behavior of bitwise operators on signed and unsigned types. As per the ISO/IEC document, following are my understandings.
Left shift Operator
The result of E1 << E2, is E1 left-shifted E2 bit positions
The vacated bits on the account of left shift will be filled by zeros.
E1 as signed non-negative: E1 << E2 will result to E1 multiplied by 2 power of E2, if the value is representable by the result type.
Q1: What about signed negatives?
Q2: I am not able to understand on what's meant by "reduced modulo" in the following context. "If E1 has an unsigned type, the value of the result is E1 × 2E2 , reduced modulo
one more than the maximum value representable in the result type".
Right Shift Operator
The result of E1 >> E2 is E1 right-shifted E2 bit positions.
E1 as signed non-negative/unsigned:The value of the result is the integral part of the quotient of E1 / 2E2
Q3: For signed negative integers I see, some books defining that the vacant positions will be filled with 1.Please elaborate more on the use of right shift operator on signed negative int.
Q1: The behaviour of the left shift operator on negative values of signed integer types is undefined, as is the behaviour for positive values of signed integer types when the result E1 * 2^E2 is not representable in the type.
That is explicitly mentioned in section 6.5.7, paragraph 4 and 5 of the standard (n1570 draft):
4 The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2^E2 , reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
Q2: The reduction modulo one more than the maximum value representable in the unsigned integer type means that the bits that are shifted out on the left are simply discarded.
Mathematically, if the maximal value of the unsigned type is 2^n - 1 (and it is always of that form), the result of shifting E1 left by E2 bits is the value V in the range from 0 to 2^n - 1 such that the difference
(E1 * 2^E2 - V)
is divisible by 2^n, that is, it's the remainder you get when dividing E1 * 2^E2 by 2^n.
Q3: The behaviour upon shifting right negative values of signed integer types is implementation-defined. The most common behaviour (at least on two's complement machines) is an arithmetic shift, that is, the result is the quotient rounded down (towards negative infinity).
5 The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed type and a nonnegative value, the value of the result is the integral part of the quotient of E1 / 2^E2 . If E1 has a signed type and a negative value, the resulting value is implementation-defined.
Re: Q1
If E1 is negative, the behavior is undefined.
Re: Q2
Unsigned arithmetic is "cyclic", that is, it wraps around so UINT_MAX + 1 is 0 again. It's as if every calculation was done modulo UINT_MAX+1. Another way to think about it is that excess bits that don't fit in on the left are simply dropped.
Re: Q3
If E1 is negative, the result is implementation-defined. That is, it depends on your machine/compiler/options, but the behavior has to be documented ("defined") somewhere, ususally the compiler manual. Two popular choices are to fill the incoming bits on the left with 1 (arithmetic shift) or with 0 (logical shift).
If you really want to understand the bitwise shift operators. Look at these simple rules:
1) In left shift, E1 << E2, all the vacant bits on right side will be filled by zeroes, does not matter if the number is signed or unsigned, always zeroes will be shifted in.
2) In left shift, E1 >> E2, all the vacant bits on left side, will be 0 if number is positive, and will be 1 if number is negative.Keep in mind that an unsigned number is never negative. Also some implementations might also fill them with 0 on some machines even if number is negative, so never rely on this.
All other scenarios could be explained by these two simple rules.
Now if you want to know the value of result after shifting , just write the bit representation of number and shift manually on paper and enter bits at vacant spaces using these two rules. Then you would be able to understand better how bit shifting works.
For example lets take int i = 7;
i<<2
now i = 0000 0000 0000 0000 0000 0000 0000 0111
perform two left shits according to rule 1 would be:
0000 0000 0000 0000 0000 0000 0001 1100
which will give it value of 28 ( E1 * 2E2 = 7 *2*2 = 28), which is same as representated by bit pattern.
Now let me explain the "reduced modulo" thing, well if you are shifting a big number, than the bits on left hand side will be lost, "reduced modulo" compensates for it, So if your resultant value if greater than the maximum value that the data type could hold, the bits on left will be lost , and then:
result = (E1*2*E2) % ( maximumValue + 1).
For various other cases, just keep the above rules in mind , and you are good :)
Q2: "value reduced modulo X" means "value mod X" in math and can be written as "value % X" in C. That part just explains how integer overflows work.
few days back I gave Microsoft GD online exam for internship there. I have always studied that the left shift of negative number is an undefined behavior but that paper had almost 7 questions out of 30 related to shift operators and approx 5 questions were ther which involved shifting negative numbers to left and they had no option saying "undefined behavior". I was shocked to see that . So, my question is that has this C standard changed ? Is this defined now ?
sample question :
printf("%d",-1<<10);
I marked its answer as -1024 by the logic 2^10*-1
I even ran this on gcc and it gave me o/p as -1024 nly (when I returned home.)
The rules haven't changed. It's still technically undefined.
Quoting from the C standard (Section 6.5.7, paragraph 4, of n1548):
The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with
zeros. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is
the resulting value; otherwise, the behavior is undefined.
It clearly says if E1 is not unsigned or signed with nonnegative value, the behavior is undefined.
In the shift operrators >> or <<:
operands should be of integral type or subject to integral promotion.
If right operand is negative or greater than the bits in left operand then Undefined behaviour
if left operand is negative with >> then it's implementation defiend and with << undefined behaviour
Quoting from K&R Appendix-A
The shift operators << and >> group left-to-right. For both operators,
each operand must be integral, and is subject to integral the promotions.
The type of the result is that of the promoted left operand.
The result is undefined if the right operand is negative,
or greater than or equal to the number of bits in the left expression's type.
shift-expression:
additive-expression
shift-expression << additive-expression
shift-expression >> additive-expression
The value of E1<<E2 is E1 (interpreted as a bit pattern) left-shifted E2 bits;
in the absence of overflow, this is equivalent to multiplication by 2. The value
of E1>>E2 is E1 right-shifted E2 bit positions. The right shift is equivalent to division
by 2. if E1 is unsigned or it has a non-negative value; otherwise the result is
implementation-defined.
Perhaps you were thinking that the right operand is negative?
http://msdn.microsoft.com/en-us/library/336xbhcz(v=vs.80).aspx
In C, are the shift operators (<<, >>) arithmetic or logical?
When shifting left, there is no difference between arithmetic and logical shift. When shifting right, the type of shift depends on the type of the value being shifted.
(As background for those readers unfamiliar with the difference, a "logical" right shift by 1 bit shifts all the bits to the right and fills in the leftmost bit with a 0. An "arithmetic" shift leaves the original value in the leftmost bit. The difference becomes important when dealing with negative numbers.)
When shifting an unsigned value, the >> operator in C is a logical shift. When shifting a signed value, the >> operator is an arithmetic shift.
For example, assuming a 32 bit machine:
signed int x1 = 5;
assert((x1 >> 1) == 2);
signed int x2 = -5;
assert((x2 >> 1) == -3);
unsigned int x3 = (unsigned int)-5;
assert((x3 >> 1) == 0x7FFFFFFD);
According to K&R 2nd edition the results are implementation-dependent for right shifts of signed values.
Wikipedia says that C/C++ 'usually' implements an arithmetic shift on signed values.
Basically you need to either test your compiler or not rely on it. My VS2008 help for the current MS C++ compiler says that their compiler does an arithmetic shift.
TL;DR
Consider i and n to be the left and right operands respectively of a shift operator; the type of i, after integer promotion, be T. Assuming n to be in [0, sizeof(i) * CHAR_BIT) — undefined otherwise — we've these cases:
| Direction | Type | Value (i) | Result |
| ---------- | -------- | --------- | ------------------------ |
| Right (>>) | unsigned | ≥ 0 | −∞ ← (i ÷ 2ⁿ) |
| Right | signed | ≥ 0 | −∞ ← (i ÷ 2ⁿ) |
| Right | signed | < 0 | Implementation-defined† |
| Left (<<) | unsigned | ≥ 0 | (i * 2ⁿ) % (T_MAX + 1) |
| Left | signed | ≥ 0 | (i * 2ⁿ) ‡ |
| Left | signed | < 0 | Undefined |
† most compilers implement this as arithmetic shift
‡ undefined if value overflows the result type T; promoted type of i
Shifting
First is the difference between logical and arithmetic shifts from a mathematical viewpoint, without worrying about data type size. Logical shifts always fills discarded bits with zeros while arithmetic shift fills it with zeros only for left shift, but for right shift it copies the MSB thereby preserving the sign of the operand (assuming a two's complement encoding for negative values).
In other words, logical shift looks at the shifted operand as just a stream of bits and move them, without bothering about the sign of the resulting value. Arithmetic shift looks at it as a (signed) number and preserves the sign as shifts are made.
A left arithmetic shift of a number X by n is equivalent to multiplying X by 2n and is thus equivalent to logical left shift; a logical shift would also give the same result since MSB anyway falls off the end and there's nothing to preserve.
A right arithmetic shift of a number X by n is equivalent to integer division of X by 2n ONLY if X is non-negative! Integer division is nothing but mathematical division and round towards 0 (trunc).
For negative numbers, represented by two's complement encoding, shifting right by n bits has the effect of mathematically dividing it by 2n and rounding towards −∞ (floor); thus right shifting is different for non-negative and negative values.
for X ≥ 0, X >> n = X / 2n = trunc(X ÷ 2n)
for X < 0, X >> n = floor(X ÷ 2n)
where ÷ is mathematical division, / is integer division. Let's look at an example:
37)10 = 100101)2
37 ÷ 2 = 18.5
37 / 2 = 18 (rounding 18.5 towards 0) = 10010)2 [result of arithmetic right shift]
-37)10 = 11011011)2 (considering a two's complement, 8-bit representation)
-37 ÷ 2 = -18.5
-37 / 2 = -18 (rounding 18.5 towards 0) = 11101110)2 [NOT the result of arithmetic right shift]
-37 >> 1 = -19 (rounding 18.5 towards −∞) = 11101101)2 [result of arithmetic right shift]
As Guy Steele pointed out, this discrepancy has led to bugs in more than one compiler. Here non-negative (math) can be mapped to unsigned and signed non-negative values (C); both are treated the same and right-shifting them is done by integer division.
So logical and arithmetic are equivalent in left-shifting and for non-negative values in right shifting; it's in right shifting of negative values that they differ.
Operand and Result Types
Standard C99 §6.5.7:
Each of the operands shall have integer types.
The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behaviour is undefined.
short E1 = 1, E2 = 3;
int R = E1 << E2;
In the above snippet, both operands become int (due to integer promotion); if E2 was negative or E2 ≥ sizeof(int) * CHAR_BIT then the operation is undefined. This is because shifting more than the available bits is surely going to overflow. Had R been declared as short, the int result of the shift operation would be implicitly converted to short; a narrowing conversion, which may lead to implementation-defined behaviour if the value is not representable in the destination type.
Left Shift
The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1×2E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and non-negative value, and E1×2E2 is representable in the result type, then that is the resulting value; otherwise, the behaviour is undefined.
As left shifts are the same for both, the vacated bits are simply filled with zeros. It then states that for both unsigned and signed types it's an arithmetic shift. I'm interpreting it as arithmetic shift since logical shifts don't bother about the value represented by the bits, it just looks at it as a stream of bits; but the standard talks not in terms of bits, but by defining it in terms of the value obtained by the product of E1 with 2E2.
The caveat here is that for signed types the value should be non-negative and the resulting value should be representable in the result type. Otherwise the operation is undefined. The result type would be the type of the E1 after applying integral promotion and not the destination (the variable which is going to hold the result) type. The resulting value is implicitly converted to the destination type; if it is not representable in that type, then the conversion is implementation-defined (C99 §6.3.1.3/3).
If E1 is a signed type with a negative value then the behaviour of left shifting is undefined. This is an easy route to undefined behaviour which may easily get overlooked.
Right Shift
The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed type and a non-negative value, the value of the result is the integral part of the quotient of E1/2E2. If E1 has a signed type and a negative value, the resulting value is implementation-defined.
Right shift for unsigned and signed non-negative values are pretty straight forward; the vacant bits are filled with zeros. For signed negative values the result of right shifting is implementation-defined. That said, most implementations like GCC and Visual C++ implement right-shifting as arithmetic shifting by preserving the sign bit.
Conclusion
Unlike Java, which has a special operator >>> for logical shifting apart from the usual >> and <<, C and C++ have only arithmetic shifting with some areas left undefined and implementation-defined. The reason I deem them as arithmetic is due to the standard wording the operation mathematically rather than treating the shifted operand as a stream of bits; this is perhaps the reason why it leaves those areas un/implementation-defined instead of just defining all cases as logical shifts.
In terms of the type of shift you get, the important thing is the type of the value that you're shifting. A classic source of bugs is when you shift a literal to, say, mask off bits. For example, if you wanted to drop the left-most bit of an unsigned integer, then you might try this as your mask:
~0 >> 1
Unfortunately, this will get you into trouble because the mask will have all of its bits set because the value being shifted (~0) is signed, thus an arithmetic shift is performed. Instead, you'd want to force a logical shift by explicitly declaring the value as unsigned, i.e. by doing something like this:
~0U >> 1;
Here are functions to guarantee logical right shift and arithmetic right shift of an int in C:
int logicalRightShift(int x, int n) {
return (unsigned)x >> n;
}
int arithmeticRightShift(int x, int n) {
if (x < 0 && n > 0)
return x >> n | ~(~0U >> n);
else
return x >> n;
}
When you do
- left shift by 1 you multiply by 2
- right shift by 1 you divide by 2
x = 5
x >> 1
x = 2 ( x=5/2)
x = 5
x << 1
x = 10 (x=5*2)
Well, I looked it up on wikipedia, and they have this to say:
C, however, has only one right shift
operator, >>. Many C compilers choose
which right shift to perform depending
on what type of integer is being
shifted; often signed integers are
shifted using the arithmetic shift,
and unsigned integers are shifted
using the logical shift.
So it sounds like it depends on your compiler. Also in that article, note that left shift is the same for arithmetic and logical. I would recommend doing a simple test with some signed and unsigned numbers on the border case (high bit set of course) and see what the result is on your compiler. I would also recommend avoiding depending on it being one or the other since it seems C has no standard, at least if it is reasonable and possible to avoid such dependence.
Left shift <<
This is somehow easy and whenever you use the shift operator, it is always a bit-wise operation, so we can't use it with a double and float operation. Whenever we left shift one zero, it is always added to the least significant bit (LSB).
But in right shift >> we have to follow one additional rule and that rule is called "sign bit copy". Meaning of "sign bit copy" is if the most significant bit (MSB) is set then after a right shift again the MSB will be set if it was reset then it is again reset, means if the previous value was zero then after shifting again, the bit is zero if the previous bit was one then after the shift it is again one. This rule is not applicable for a left shift.
The most important example on right shift if you shift any negative number to right shift, then after some shifting the value finally reach to zero and then after this if shift this -1 any number of times the value will remain same. Please check.
gcc will typically use logical shifts on unsigned variables and for left-shifts on signed variables. The arithmetic right shift is the truly important one because it will sign extend the variable.
gcc will will use this when applicable, as other compilers are likely to do.
GCC does
for -ve - > Arithmetic Shift
For +ve -> Logical Shift
According to many c compilers:
<< is an arithmetic left shift or bitwise left shift.
>> is an arithmetic right shiftor bitwise right shift.