I'm slinging some C code and I need to bitshift a 32 bit int left 32 bits. When I run this code with the parameter n = 0, the shifting doesn't happen.
int x = 0xFFFFFFFF;
int y = x << (32 - n);
Why doesn't this work?
Shift at your own peril. Per the standard, what you want to do is undefined behavior.
C99 §6.5.7
3 - The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
In other words, if you try to shift a 32bit value by anything more than 31 bits, or a negative number, you're results are undefined.
According to section 3.3.7 Bitwise shift operators in the draft of C89 (?) standard:
If the value of the right operand is negative or is greater than or equal to the width in bits of the promoted left operand, the behavior is undefined.
Assuming int is 32-bit on the system that you are compiling the code in, when n is 0, you are shifting 32 bits. According to the statement above, your code results in undefined behavior.
Related
I am a beginner so dont get mad at my simple question but suppose I have an int variable
lets say a, and I do a<<3 will that be equal to a*2^3 = a*8 as I read that bitshift operators multiply the variable with 2^x.
Am I correct or I am misreading this situation??
Thanks!
Yes, with some exceptions.
Consider 123410 × 103 = 123400010. We added zeroes to the right of the decimal representation of the number by multiplying it by a power of ten.
Similarly, we can add zeroes to the right of the binary representation of the number by multiplying it by a power of two. For example, 10112 × 23 = 10110002.
The exceptions fall into two categories.
Exceptions due to overflow
If the left operand has an unsigned type, and the result is too large for that type, the most significant bits will be dropped.
For example, in an environment with a 32-bit unsigned int type, 8000000116 × 21 = 1000000216, but 0x80000001u << 1 produces the chopped result 0x0000002u.
If the left operand has an signed type, and the result is too large for that type, the behaviour is undefined.
For example, in an environment with a 32-bit int type, the behaviour of 1 << 31 is undefined.
Exceptions due to weird operands
If the value of the right operand is negative, the behaviour is undefined.
For example, the behaviour of 1 << -1 is undefined.
If the value of the left operand is negative, the behaviour is undefined.
For example, the behaviour of -1 << 1 is undefined.
If the value of the right operand is greater than or equal to the width of the promoted left operand, the behaviour is undefined.
For example, in an environment with a 32-bit unsigned int type, the behaviour of 1u << 32 is undefined.
C17, on the semantics of <<:
The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
Left shift simply means to shift the set binary values to some places to the left. for example, 3 has a binary value of 11 so doing 3<<3 means we left shift these set bits (1) to 3 places to the left. so 11 becomes 11000 which is equal to 24.
You are correct! By bitshifting to the left you multiply your integer value with 2.
Visualising it, you do the following:
Let's say a is an 8-bit integer and is equal to 1.
That means that the binary code will look like: 00000001
Now if we bit shift 3 times to the left, that binary code becomes 00001000, which is 8.
I just checked the C++ standard. It seems the following code should NOT be undefined behavior:
unsigned int val = 0x0FFFFFFF;
unsigned int res = val >> 34; // res should be 0 by C++ standard,
// but GCC gives warning and res is 67108863
And from the standard:
The value of E1 >> E2 is E1 right-shifted E2 bit positions. If E1
has an unsigned type or if E1 has a signed type and a non-negative
value, the value of the result is the integral part of the quotient of
E1/2^E2. If E1 has a signed type and a negative value, the resulting
value is implementation-defined.
According to the standard, since 34 is NOT an negative number, the variable res will be 0.
GCC gives the following warning for the code snippet, and res is 67108863:
warning: right shift count >= width of type
I also checked the assembly code emitted by GCC. It just calls SHRL, and the Intel instruction document for SHRL, the res is not ZERO.
So does that mean GCC doesn't implement the standard behavior on Intel platform?
The draft C++ standard in section 5.8 Shift operators in paragraph 1 says(emphasis mine):
The type of the result is that of the promoted left operand. The behavior is undefined if the right operand is negative, or greater than or equal to the length in bits of the promoted left operand.
So if unsigned int is 32 bits or less then this is undefined which is exactly the warning that gcc is giving you.
To explain exactly what happens:
The compiler will load 34 into a register, and then your constant in another register, and perform a right shift operation with those two registers. The x86 processor performs a "shiftcount % bits" on the shift value, meaning that you get a right-shift by 2.
And since 0x0FFFFFFF (268435455 decimal) divided by 4 = 67108863, that's the result you see.
If you had a different processor, for example a PowerPC (I think), it may well give you zero.
I am going through 'The C language by K&R'. Right now I am doing the bitwise section. I am having a hard time in understanding the following code.
int mask = ~0 >> n;
I was playing on using this to mask n left side of another binary like this.
0000 1111
1010 0101 // random number
My problem is that when I print var mask it still negative -1. Assuming n is 4. I thought shifting ~0 which is -1 will be 15 (0000 1111).
thanks for the answers
Performing a right shift on a negative value yields an implementation defined value. Most hosted implementations will shift in 1 bits on the left, as you've seen in your case, however that doesn't necessarily have to be the case.
Unsigned types as well as positive values of signed types always shift in 0 bits on the left when shifting right. So you can get the desired behavior by using unsigned values:
unsigned int mask = ~0u >> n;
This behavior is documented in section 6.5.7 of the C standard:
5 The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed type and a nonnegative
value, the value of the result is the integral part of the quotient
of E1 / 2E2 .If E1 has a signed type and a negative value, the
resulting value is implementation-defined.
Right-shifting negative signed integers is an implementation-defined behavior, which is usually (but not always) filling the left with ones instead of zeros. That's why no matter how many bits you've shifted, it's always -1, as the left is always filled by ones.
When you shift unsigned integers, the left will always be filled by zeros. So you can do this:
unsigned int mask = ~0U >> n;
^
You should also note that int is typically 2 or 4 bytes, meaning if you want to get 15, you need to right-shift 12 or 28 bits instead of only 4. You can use a char instead:
unsigned char mask = ~0U;
mask >>= 4;
In C, and many other languages, >> is (usually) an arithmetic right shift when performed on signed variables (like int). This means that the new bit shifted in from the left is a copy of the previous most-significant bit (MSB). This has the effect of preserving the sign of a two's compliment negative number (and in this case the value).
This is in contrast to a logical right shift, where the MSB is always replaced with a zero bit. This is applied when your variable is unsigned (e.g. unsigned int).
From Wikipeda:
The >> operator in C and C++ is not necessarily an arithmetic shift. Usually it is only an arithmetic shift if used with a signed integer type on its left-hand side. If it is used on an unsigned integer type instead, it will be a logical shift.
In your case, if you plan to be working at a bit level (i.e. using masks, etc.) I would strongly recommend two things:
Use unsigned values.
Use types with specific sizes from <stdint.h> like uint32_t
I have the following function in C:
int lrot32(int a, int n)
{
printf("%X SHR %d = %X\n",a, 32-n, (a >> (32-n)));
return ((a << n) | (a >> (32-n)));
}
When I pass as arguments lrot32(0x8F5AEB9C, 0xB) I get the following:
8F5AEB9C shr 21 = FFFFFC7A
However, the result should be 47A. What am I doing wrong?
Thank you for your time
int is a signed integer type. C11 6.5.7p4-5 says the following:
4 The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. [...] If E1 has a signed type and nonnegative value, and E1 x 2E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
5 The result of E1 >> E2 is E1 right-shifted E2 bit positions. [...] if E1 has a signed type and a nonnegative value, the value of the result is the integral part of the quotient of E1 / 2E2 . If E1 has a signed type and a negative value, the resulting value is implementation-defined.
Thus in the case of <<, if the shifted value is negative, or the positive value after shift is not representable in the result type (here: int), the behaviour is undefined; in the case of >>, if the value is negative the result is implementation defined.
Thus in either case, you'd get results that at least depend on the implementation, and in the case of left-shift, worse, possibly on the optimization level and such. A strictly conforming program cannot rely on any particular behaviour.
If however you want to target a particular compiler, then check its manuals on what the behaviour - if any is specified - would be. For example GCC says:
The results of some bitwise operations on signed integers (C90 6.3,
C99 and C11 6.5).
Bitwise operators act on the representation of the value including
both the sign and value bits, where the sign bit is considered
immediately above the highest-value value bit. Signed ‘>>’ acts on
negative numbers by sign extension. [*]
As an extension to the C language, GCC does not use the latitude given
in C99 and C11 only to treat certain aspects of signed ‘<<’ as
undefined. However, -fsanitize=shift (and -fsanitize=undefined) will
diagnose such cases. They are also diagnosed where constant
expressions are required.
[*] sign extension here means that the sign bit - which is 1 for negative integers, is repeated by the shift amountwhen right-shift is executed - this is why you see those Fs in the result.
Furthermore GCC always requires 2's complement representation, so if you would always use GCC, no matter which architecture you're targeting, this is the behaviour you'd see. Also, in the future someone might use another compiler for your code, thus causing other behaviour there.
Perhaps you'd want to use unsigned integers - unsigned int or rather, if a certain width is expected, then for example uint32_t, as the shifts are always well-defined for it, and would seem to match your expectations.
Another thing to note is that not all shift amounts are allowed. C11 6.5.7 p3:
[...]If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
Thus if you ever shift an unsigned integer having width of 32 bits by 32 - left or right, the behaviour is undefined. This should be kept in mind. Even if the compiler wouldn't do anything wacky, some processor architectures do act as if shift by 32 would then shift all bits away - others behave as if the shift amount was 0.
Is something like
uint32_t foo = 1;
foo = foo << 33;
undefined behaviour in C?
The shift is undefined behaviour in fact. See the standard (C11, final draft N1570, 6.5.7p3):
"... If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.".
Rationale: Shift operations can behave quite different on different CPU architectures if the shift count is >= the width of the argument. This way the standard allows the compiler to generate the fastest code without caring about border-effects.
Note: that things are different if int is wider than 33 bits (e.g. 64 bits). Reason are the integer promotions which first convert the uint32_t to int, so the shift is executed with the (then larger) int value. This leaves the back-conversion to uint32_t of the assignment, See 6.3.1.3, paragraph 1, 2 for this case. However, on most modern systems, int is not larger than 32 bits.
This is (apparently) undefined behaviour. From the C standard section 6.5.7 (of WG14/N1256 Committee Draft — Septermber 7, 2007 ISO/IEC 9899:TC3 - effectively the C99 standard):
The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
Re 3, this suggests it is undefined behaviour as the value of the right operand is greater or equal to the width of the promoted left operand.
Re 4, As the shift is unsigned, the first sentence applies: If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. This would (but for 3) suggest result is therefore zero.
I believe 3 will take precedence over 4, so it is (after all) undefined.
Olaf's answer shows the same is true under C11.