I just checked the C++ standard. It seems the following code should NOT be undefined behavior:
unsigned int val = 0x0FFFFFFF;
unsigned int res = val >> 34; // res should be 0 by C++ standard,
// but GCC gives warning and res is 67108863
And from the standard:
The value of E1 >> E2 is E1 right-shifted E2 bit positions. If E1
has an unsigned type or if E1 has a signed type and a non-negative
value, the value of the result is the integral part of the quotient of
E1/2^E2. If E1 has a signed type and a negative value, the resulting
value is implementation-defined.
According to the standard, since 34 is NOT a negative number, the variable res will be 0.
GCC gives the following warning for the code snippet, and res is 67108863:
warning: right shift count >= width of type
I also checked the assembly code emitted by GCC. It just uses SHRL, and according to the Intel documentation for SHRL, res will not be zero.
So does that mean GCC doesn't implement the standard behavior on Intel platform?
The draft C++ standard, in section 5.8 Shift operators, paragraph 1, says (emphasis mine):
The type of the result is that of the promoted left operand. The behavior is undefined if the right operand is negative, or greater than or equal to the length in bits of the promoted left operand.
So if unsigned int is 32 bits or less, then this is undefined, which is exactly what the warning GCC gives you is telling you.
To explain exactly what happens:
The compiler will load 34 into a register, and then your constant in another register, and perform a right shift operation with those two registers. The x86 processor performs a "shiftcount % bits" on the shift value, meaning that you get a right-shift by 2.
And since 0x0FFFFFFF (268435455 decimal) divided by 4 = 67108863, that's the result you see.
If you had a different processor, for example a PowerPC (I think), it may well give you zero.
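If you want to see that masking in action, here is a minimal sketch (assuming a 32-bit unsigned int; the % 32 below mimics what the x86 shr instruction does with its count operand, it is not something the C++ standard promises):

#include <stdio.h>

int main(void)
{
    unsigned int val = 0x0FFFFFFF;

    /* The hardware reduces the count modulo 32, so a requested shift of 34
       effectively becomes a shift of 2.  Writing val >> 34 directly is
       still undefined behaviour; this only reproduces the observed result. */
    unsigned int masked = val >> (34 % 32);  /* same as val >> 2 */

    printf("%u\n", masked);  /* prints 67108863 */
    return 0;
}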
Related
According to the answer to this question:
The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
Which seems to imply that 1 << 31 is undefined.
However GCC doesn't issue a warning if I use 1 << 31.
It does issue one for 1 << 32.
So which is it? Am I misunderstanding the standard?
Does GCC have its own interpretation?
No: 1 << 31 has undefined behavior if the type int has only 31 value bits.
1U << 31 is OK and evaluates to 0x80000000 if type unsigned int has 32 value bits.
On a system where bytes have 8 bits, sizeof(int) == 4 means int has at most 31 value bits, so shifting 1 by 31 places is undefined. Conversely, on a system where CHAR_BIT > 8, it may be OK to write 1 << 31.
GCC might issue a warning if you raise the warning level: try gcc -Wall -Wextra -W -Werror. Clang does issue a warning with the same options.
To address Michaël Roy's comments, 1 << 31 does not evaluate to INT_MIN reliably. It might give this value on your system, but the Standard does not guarantee it, in fact the Standard describes this as undefined behavior, so not only can you not rely on it, you should avoid it to avoid spurious bugs. The optimizers routinely take advantage of potential undefined behavior to remove code and break the programmers' assumptions.
For example, the following code might compile to a simple return 1;:
int check_shift(int i) {
    if ((1 << i) > 0)
        return 1;
    else
        return 0;
}
None of the compilers supported by Godbolt's compiler explorer do, but doing so would not break conformity.
The reason GCC doesn't warn about this is that 1 << 31 was valid (but implementation-defined) in C90, and is valid (but implementation-defined) even in modern C++. C90 defined << as a bit shift, and followed that by saying that for unsigned types the result was that of a multiplication, but said no such thing for signed types, which implicitly made it valid and left it covered by the general wording that bitwise operators have implementation-defined aspects for signed types. C++ nowadays defines << as multiplication in the corresponding unsigned type, with the result converted back to the signed type, which is implementation-defined as well.
C99 and C11 did make this invalid (saying the behaviour is undefined), but compilers are permitted to accept it as an extension. For compatibility with existing code, and to share code between the C and C++ frontends, GCC continues to do so, with one exception: you can use -fsanitize=undefined to get detected undefined behaviour to abort your program at run-time, and this one does handle 1 << 31, but only when compiling as C99 or C11.
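A minimal sketch of exercising that (assuming a reasonably recent GCC; the file name is made up, and -fsanitize=shift is the specific sub-option mentioned below):

/* shift_ub.c - compile with: gcc -std=c11 -fsanitize=shift shift_ub.c
   At run-time the sanitizer reports the out-of-range and overflowing shifts. */
#include <stdio.h>

int main(void)
{
    volatile int n = 31;   /* volatile keeps the counts out of constant folding */
    volatile int m = 32;

    int a = 1 << n;   /* UB in C99/C11: 1 * 2^31 is not representable in int */
    int b = 1 << m;   /* UB: shift count >= width of int */

    printf("%d %d\n", a, b);   /* whatever values happen to result */
    return 0;
}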
It does invoke undefined behaviour, as explained by the other answers/comments. As to why GCC doesn't emit a diagnostic:
There are actually two things that can lead to undefined behaviour for a left-shift (both from [6.5.7]):
If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
Evidently GCC detects the first one (because it's trivial to do so), but not the latter.
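A small illustration of the difference (a sketch assuming a 32-bit int; the warning text is the one GCC prints by default):

#include <stdio.h>

int main(void)
{
    int a = 1 << 32;  /* first kind of UB: count >= width of int;
                         GCC warns: "left shift count >= width of type" */
    int b = 1 << 31;  /* second kind of UB on 32-bit int: 1 * 2^31 is not
                         representable in int; GCC is silent by default */
    printf("%d %d\n", a, b);
    return 0;
}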
I have the following function in C:
#include <stdio.h>

int lrot32(int a, int n)
{
    printf("%X SHR %d = %X\n", a, 32 - n, (a >> (32 - n)));
    return ((a << n) | (a >> (32 - n)));
}
When I pass as arguments lrot32(0x8F5AEB9C, 0xB) I get the following:
8F5AEB9C SHR 21 = FFFFFC7A
However, the result should be 47A. What am I doing wrong?
Thank you for your time
int is a signed integer type. C11 6.5.7p4-5 says the following:
4 The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. [...] If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
5 The result of E1 >> E2 is E1 right-shifted E2 bit positions. [...] If E1 has a signed type and a nonnegative value, the value of the result is the integral part of the quotient of E1 / 2^E2. If E1 has a signed type and a negative value, the resulting value is implementation-defined.
Thus in the case of <<, if the shifted value is negative, or the positive value after shift is not representable in the result type (here: int), the behaviour is undefined; in the case of >>, if the value is negative the result is implementation defined.
Thus in either case, you'd get results that at least depend on the implementation, and in the case of left-shift, worse, possibly on the optimization level and such. A strictly conforming program cannot rely on any particular behaviour.
If however you want to target a particular compiler, then check its manuals on what the behaviour - if any is specified - would be. For example GCC says:
The results of some bitwise operations on signed integers (C90 6.3,
C99 and C11 6.5).
Bitwise operators act on the representation of the value including
both the sign and value bits, where the sign bit is considered
immediately above the highest-value value bit. Signed ‘>>’ acts on
negative numbers by sign extension. [*]
As an extension to the C language, GCC does not use the latitude given
in C99 and C11 only to treat certain aspects of signed ‘<<’ as
undefined. However, -fsanitize=shift (and -fsanitize=undefined) will
diagnose such cases. They are also diagnosed where constant
expressions are required.
[*] Sign extension here means that the sign bit - which is 1 for negative integers - is repeated by the shift amount when a right shift is executed; this is why you see those Fs in the result.
Furthermore, GCC always requires two's complement representation, so if you always use GCC, this is the behaviour you'd see no matter which architecture you're targeting. Note, though, that in the future someone might use another compiler for your code and get different behaviour there.
Perhaps you'd want to use unsigned integers - unsigned int or, if a certain width is expected, for example uint32_t - as the shifts are always well-defined for them, and that would seem to match your expectations.
Another thing to note is that not all shift amounts are allowed. C11 6.5.7 p3:
[...]If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
Thus if you ever shift an unsigned integer having width of 32 bits by 32 - left or right, the behaviour is undefined. This should be kept in mind. Even if the compiler wouldn't do anything wacky, some processor architectures do act as if shift by 32 would then shift all bits away - others behave as if the shift amount was 0.
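For instance, a rotate written with those points in mind might look like this (a sketch, assuming exactly 32-bit operands; the & 31 masks keep both shift counts inside the defined 0..31 range, including when n is 0):

#include <stdint.h>

/* Rotate a 32-bit value left by n bits.  Using an unsigned type avoids the
   sign-extension problem, and masking the complementary count with & 31
   avoids the undefined shift by 32 when n is 0. */
static uint32_t lrot32(uint32_t a, unsigned n)
{
    n &= 31;
    return (a << n) | (a >> ((32 - n) & 31));
}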
Is something like
uint32_t foo = 1;
foo = foo << 33;
undefined behaviour in C?
The shift is in fact undefined behaviour. See the standard (C11, final draft N1570, 6.5.7p3):
"... If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.".
Rationale: shift operations can behave quite differently on different CPU architectures if the shift count is >= the width of the operand. This way the standard allows the compiler to generate the fastest code without having to care about such edge cases.
Note that things are different if int is wider than 33 bits (e.g. 64 bits). The reason is the integer promotions, which first convert the uint32_t to int, so the shift is executed on the (then wider) int value. That leaves the conversion back to uint32_t on assignment; see 6.3.1.3, paragraphs 1 and 2, for this case. However, on most modern systems, int is not wider than 32 bits.
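To illustrate the width point explicitly (a sketch; the cast to uint64_t stands in for an int wider than 33 bits): performing the shift in a wider unsigned type is well-defined, and converting the result back to uint32_t then reduces it modulo 2^32:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint32_t foo = 1;

    /* foo << 33 on a typical 32-bit-int platform is undefined.  Shifting in
       uint64_t is well-defined (33 < 64); converting the result back to
       uint32_t reduces it modulo 2^32, giving 0. */
    uint32_t bar = (uint32_t)((uint64_t)foo << 33);

    printf("%u\n", bar);   /* prints 0 */
    return 0;
}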
This is (apparently) undefined behaviour. From the C standard, section 6.5.7 (of WG14/N1256 Committee Draft, September 7, 2007, ISO/IEC 9899:TC3 - effectively the C99 standard):
The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
Re 3, this suggests it is undefined behaviour, as the value of the right operand is greater than or equal to the width of the promoted left operand.
Re 4, As the shift is unsigned, the first sentence applies: If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. This would (but for 3) suggest result is therefore zero.
I believe 3 will take precedence over 4, so it is (after all) undefined.
Olaf's answer shows the same is true under C11.
Edit: As pointed out below I missed the first part of the ANSI C standard:
"If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined." The errors (or rather lack of errors / difference in errors) are due to the particular compiler I was using.
I've come across something a bit strange, and I hope that someone can shed some light on my ignorance here. The necessary sample code is as follows:
#include <stdio.h>

int main(void)
{
    unsigned a, b;
    int w, x, y;

    a = 0x00000001;
    b = 0x00000020;
    w = 31;
    x = 32;
    y = 33;

    a << w;     /* No error */
    a << x;     /* No error */
    a << y;     /* No error */

    a << 31;    /* No error */
    a << 32;    /* Error */
    a << 33;    /* Error */

    a << 31U;   /* No error */
    a << 32U;   /* Error */
    a << 33U;   /* Error */

    a << w + 1; /* No error */
    a << b;     /* No error */

    return 0;
}
My question is this: why is it that an error is returned for a raw number, but not for any of the variables? They, I think, should be treated the same. According to the C11 standard
The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
The right side, since the left operand is an unsigned type, should be 2^E2 reduced modulo one more than the maximum value representable in the result type.... That sentence isn't entirely clear to me, but in practice it seems that it is E1 << (E2 % 32), despite the fact that 32 is not the maximum value representable in the result type. Regardless, it is not undefined for the C11 standard, yet the error
left shift count >= width of type [enabled by default]
shows up when trying to compile. I cannot deduce why it is that some shifts of more than 31 compile without complaint (e.g. x = 33; a << x) while the literal versions do not.
I am using the GCC compiler on 64-Bit Fedora.
Thanks in advance.
-Will
My question is this: why is it that an error is returned for a raw number, but not for any of the variables?
Because absence of compiler warnings is not a guarantee of good program behavior. The compiler would be right to emit a warning for a << x, but it does not have to.
They, I think, should be treated the same
The compiler is doing you a favor when it warns for a << 33. It is not doing you any favor when it doesn't warn for a << y, but the compiler does not have to do you any favor.
If you want to be certain that your program does not contain undefined behavior, you cannot rely on the absence of compiler warnings, but you can use a sound static analyzer. If a sound static analyzer for undefined behavior does not detect any in your program, then you can conclude that it does not produce any (modulo the conditions of use that would be documented for the analyzer in question). For instance:
$ frama-c -val t.c
...
t.c:13:[kernel] warning: invalid RHS operand for shift. assert 0 ≤ x < 32;
in practice it seems that it is E1 << (E2 % 32)
The reason you are seeing this is that this is the behavior implemented by the shift instructions in x86_64's instruction set.
However, shifting by a negative number or by a number larger than the width of the type is undefined behavior. It works differently on other architectures, and even some compiler for your architecture may compute it at compile-time (as part of the constant propagation phase) with rules that differ from the one you have noticed. Do not rely on the result being E1 << (E2%32) any more than you would rely on memory still containing the correct results after being free()d.
The right side, since the left operand is an unsigned type, should be 2^E2 reduced modulo one more than the maximum value representable in the result type.... That sentence isn't entirely clear to me, but in practice it seems that it is E1 << (E2 % 32), despite the fact that 32 is not the maximum value representable in the result type.
That's not the correct interpretation. It's the result that is modulo 2^32, not E2. That sentence is describing how bits shifted off the left side are discarded. As a result, any E2 greater than or equal to the number of bits in an int would be zero, if it were allowed. Since shifts greater than or equal to that number of bits are undefined behavior, the compiler is doing you the favor of producing an error at compile-time, rather than leaving it until runtime for strange and incorrect things to happen.
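A quick illustration of that reading (a sketch, assuming a 32-bit unsigned int): with a valid shift count, bits pushed past the top are simply discarded, because the result is reduced modulo UINT_MAX + 1:

#include <stdio.h>

int main(void)
{
    unsigned int x = 0x80000000u;  /* high bit set (assuming 32-bit unsigned int) */
    unsigned int y = x << 1;       /* 0x100000000 reduced mod 2^32 -> 0 */

    printf("%u\n", y);             /* prints 0 */
    return 0;
}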
For an n-bit type, shifting is only defined for shift counts x with 0 <= x <= n-1, where x is the number of bits to shift by.
Here, in your case, unsigned int is 32 bits wide, so the valid shift counts range from 0 to 31. You are trying to shift by an amount that is at least the width of the type, and that's why it gives you the error.
modulo one more than the maximum value representable in the result type....
means that the value E1 * 2^E2 is reduced mod (UINT_MAX+1) for unsigned int. This has nothing at all to do with your hypothesis about E2.
Regardless, it is not undefined for the C11 standard,
You forgot to read the paragraph before the one you quoted:
If the value of the right operand is negative or is
greater than or equal to the width of the promoted left operand, the behavior is undefined.
All the shifts of 32 or more cause undefined behaviour. The compiler is not required to issue a warning about this, but it's being nice to you in some of the cases.
I'm slinging some C code and I need to bit-shift a 32-bit int left by 32 bits. When I run this code with the parameter n = 0, the shifting doesn't happen.
int x = 0xFFFFFFFF;
int y = x << (32 - n);
Why doesn't this work?
Shift at your own peril. Per the standard, what you want to do is undefined behavior.
C99 §6.5.7
3 - The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
In other words, if you try to shift a 32-bit value by anything more than 31 bits, or by a negative number, your results are undefined.
According to section 3.3.7 Bitwise shift operators in the draft of C89 (?) standard:
If the value of the right operand is negative or is greater than or equal to the width in bits of the promoted left operand, the behavior is undefined.
Assuming int is 32-bit on the system that you are compiling the code in, when n is 0, you are shifting 32 bits. According to the statement above, your code results in undefined behavior.
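One way around it (a sketch, assuming 32-bit values and that a count of 0 should produce 0; the helper name is made up for illustration, so adjust it to your intent) is to use an unsigned type and special-case n == 0 so the shift count never reaches 32:

#include <stdint.h>

/* Shift x left by (32 - n) bits without ever performing a shift by 32.
   The n == 0 case is handled explicitly and assumed to mean "all bits
   shifted out", i.e. a result of 0. */
static uint32_t shl_32_minus_n(uint32_t x, unsigned n)
{
    return (n == 0) ? 0 : (x << (32u - n));
}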