Why does (1 >> 0x80000000) == 1? - c

The number 1, right shifted by anything greater than 0, should be 0, correct? Yet I can type in this very simple program which prints 1.
#include <stdio.h>
int main()
{
int b = 0x80000000;
int a = 1 >> b;
printf("%d\n", a);
}
Tested with gcc on linux.

6.5.7 Bitwise shift operators:
If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
The compiler is at license to do anything, obviously, but the most common behaviors are to optimize the expression (and anything that depends on it) away entirely, or simply let the underlying hardware do whatever it does for out-of-range shifts. Many hardware platforms (including x86 and ARM) mask some number of low-order bits to use as a shift-amount. The actual hardware instruction will give the result you are observing on either of those platforms, because the shift amount is masked to zero. So in your case the compiler might have optimized away the shift, or it might be simply letting the hardware do whatever it does. Inspect the assembly if you want to know which.

According to the standard, shifting by more than the number of bits in the type can result in undefined behavior, so we cannot blame the compiler for that.
The motivation probably lies in the "border" meaning of 0x80000000, which sits on the boundary between the maximum positive value and the negative range (it is "negative" in that its highest bit is set), and in checks that the compiled program skips to avoid wasting time verifying "impossible" things (do you really want the processor to shift bits 2 billion times?).

It's very probably not attempting to shift by some large number of bits.
INT_MAX on your system is probably 2**31-1, or 0x7fffffff (I'm using ** to denote exponentiation). If that's the case, then in the declaration:
int b = 0x80000000;
(which was missing a semicolon in the question; please copy-and-paste your exact code) the constant 0x80000000 is of type unsigned int, not int. The value is implicitly converted to int. Since the result is outside the bounds of int, the result is implementation-defined (or, in C99, may raise an implementation-defined signal, but I don't know of any implementation that does that).
The most common way this is done is to reinterpret the bits of the unsigned value as a 2's-complement signed value. The result in this case is -2**31, or -2147483648.
So the behavior isn't undefined because you're shifting by a value that equals or exceeds the width of type int; it's undefined because you're shifting by a (very large) negative value.
Not that it matters, of course; undefined is undefined.
NOTE: The above assumes that int is 32 bits on your system. If int is wider than 32 bits, then most of it doesn't apply (but the behavior is still undefined).
If you really wanted to attempt to shift by 0x80000000 bits, you could do it like this:
unsigned long b = 0x80000000;
unsigned long a = 1 >> b; // *still* undefined
unsigned long is guaranteed to be big enough to hold the value 0x80000000, so you avoid part of the problem.
Of course, the behavior of the shift is just as undefined as it was in your original code, since 0x80000000 is greater than or equal to the width of unsigned long. (Unless your compiler has a really big unsigned long type, but no real-world compiler does that.)
The only way to avoid undefined behavior is not to do what you're trying to do.
It's possible, but vanishingly unlikely, that your original code's behavior is not undefined. That can only happen if the implementation-defined conversion of 0x80000000 from unsigned int to int yields a value in the range 0 .. 31. If int is smaller than 32 bits, the conversion is likely to yield 0.

Reading this may help you:
expression1 >> expression2
The >> operator masks expression2 to avoid shifting expression1 by too much.
That's because if the shift amount exceeded the number of bits in the data type of expression1, all the original bits would be shifted away, giving a trivial result.
Now, to ensure that each shift leaves at least one of the original bits, the shift operators use the following formula to calculate the actual shift amount:
mask expression2 (using the bitwise AND operator) with one less than the number of bits in expression1.
Example
var x : byte = 15;
// A byte stores 8 bits.
// The bits stored in x are 00001111
var y : byte = x >> 10;
// Actual shift is 10 & (8-1) = 2
// The bits stored in y are 00000011
// The value of y is 3
print(y); // Prints 3
That "8-1" is because x is 8 bits wide, so the shift amount is masked with 7 (binary 111). This keeps the effective shift below the width of the type, so at least one of the original bits can survive. Note, however, that C itself does not guarantee this masking; in C an out-of-range shift count is undefined behavior.

Related

In C, what happens if we left shift the bits out of range and again right shift the values in the same operation

In the case of unsigned short, I shifted 383 left by 11 positions and, in the same expression, shifted it right by 15 positions; I expected the value to be 1, but it was 27. But when I used the two shift operations in separate statements (first the left shift, then the right), the output was 1.
Here is some sample code:
unsigned short seed = 383;
printf("size of short: %zu\n",sizeof(short));
unsigned short seedout,seed1,seed2,seed3,seedout1;
seed1 = (seed<<11);
seed2 = (seed1>>15);
seed3 = ((seed<<11)>>15);
printf("seed1 :%d\t seed2: %d\t seed3: %d\n",seed1,seed2,seed3);
and its output was :
size of short: 2
seed1 :63488 seed2: 1 seed3: 23
seedout1: 8 seedout :382
Process returned 0 (0x0) execution time : 0.154 s
For clarity, you compare
unsigned short seed1 = (seed<<11);
unsigned short seed2 = (seed1>>15);
on one hand and
unsigned short seed3 = ((seed<<11)>>15);
on the other hand.
The first one takes the result of the shift operation, stores it in an unsigned short variable (which apparently is 16 bit on your platform) and shifts this result right again.
The second one shifts the result immediately.
The reason why these differ, i.e. why the bits shifted out to the left are retained in the second case, is the following:
Although seed is unsigned short, seed<<11 is signed int. Thus, these bits are not cut off as it is the case when storing the result, but they are kept in the intermediate signed int. Only the assignment to seed1 makes the value unsigned short, which leads to a clipping of the bits.
In other words: your second example is merely equivalent to
int seed1 = (seed<<11); // instead of short
unsigned short seed2 = (seed1>>15);
Regarding left shifting, type signedness & implicit promotion:
Whenever something is left shifted into the sign bit of a signed integer type, we invoke undefined behavior. Similarly, we also invoke undefined behavior when left-shifting a negative value.
Therefore we must always ensure that the left operand of << is unsigned. And here's the problem, unsigned short is a small integer type, so it is subject to implicit type promotion whenever used in an expression. Shift operators always integer promote the left operand:
C17 6.5.7:
The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand.
(This makes shifts in particular a special case, since they don't care about the type of the right operand but only look at the left one.)
So in case of a 16 bit system, you'll run into the case where unsigned short gets promoted to unsigned int, because a 16 bit int cannot hold all values of a 16 bit unsigned short. And that's fine, it's not a dangerous conversion.
On a 32 bit system however, the unsigned short gets promoted to int which is signed. Should you left shift a value like 0x8000 (MSB set) by 16 bits or more, you end up shifting data into the sign bit of the promoted int, which is a subtle and possibly severe bug. For example, this prints "oops" on my Windows computer:
#include <stdio.h>
int main (void)
{
unsigned short x=0x8000;
if((x<<16) < 0) // undefined behavior
puts("oops");
}
But the compiler could just as well have assumed that a left shift of x can never result in a value less than x and removed the whole check during optimization.
We need to be sure that we never end up with a signed type by accident! Meaning we must know how implicit type promotion works in C.
As for left-shifting unsigned int or larger unsigned types, that's perfectly well-defined as long as we don't shift further than the width of the (promoted) type itself (more than 31 bits on a 32 bit system). Any bits shifted out will be discarded, and if you right shift, it will always be a logical shift where zeroes are shifted in from the right.
To answer the actual question:
Your unsigned short is integer-promoted to an int on a 32 bit system. This allows shifting beyond the 16 bits of an unsigned short, but if you discard those extra bits by saving the result in an unsigned short, you end up with this:
383 = 0x17F
0x17f << 11 = 0xBF800
0xBF800 truncated to 16 bits = 0xF800 = 63488
0xF800 >> 15 = 0x1
However, if you skip the intermediate truncation to 16 bits, you have this instead:
0xBF800 >> 15 = 0x17 = 23
But again, this is only by luck since this time we didn't end up shifting data into the sign bit.
Another example, when executing this code, you might expect to get either the value 0 or the value 32768:
unsigned short x=32768;
printf("%d", x<<16>>16);
But it prints -32768 on my 2's complement PC. The x<<16 invokes undefined behavior, and the >>16 then apparently sign extended the result.
These kind of subtle shift bugs are common, particularly in embedded systems. A frightening amount of all C programs out there are written by people who didn't know about implicit promotions.
I shifted 383 by 11 positions towards left and again in the same instruction shifted it by 15 positions right, I expected the value to be 1 but it was 27
Simple math, you've shifted it 4 bits to the right, which is equivalent to dividing by 16.
Divide 383 by 16 and you get 23 (integer division, of course); the "27" in your description appears to be a typo, since the output you show further down says 23.
Note that the "simply shifted it 4 bits" part holds because:
You've used an unsigned operand, which means that you did not "drag 1s" when shifting right
The shift-left operation is performed in 32-bit int arithmetic on your platform (because of integer promotion), so no data was lost during that part.
BTW, with regard to the 2nd bullet above - when you do this in parts and store the intermediate result into an unsigned short, you do indeed lose data and get a different result.
In other words, when doing seed<<11, the compiler uses 32-bit operations, and when storing it into seed1, only the low-order 16 bits of the previous result are preserved.

shift count greater than width of type

I have a function that takes an int data_length and does the following:
unsigned char *message = (unsigned char*)malloc(65535 * sizeof(char));
message[2] = (unsigned char)((data_length >> 56) & 255);
I'm getting the following:
warning: right shift count >= width of type [-Wshift-count-overflow]
message[2] = (unsigned char)((data_length >> 56) & 255);
The program works as expected, but how can I remove the compiler warning (without disabling it)?
Similar questions didn't seem to use a variable as the data to be inserted so it seemed the solution was to cast them to int or such.
Shifting by an amount greater than or equal to the bit width of the type in question is not allowed by the standard, and doing so invokes undefined behavior.
This is detailed in section 6.5.7p3 of the C standard regarding bitwise shift operators.
The integer promotions are performed on each of the operands. The
type of the result is that of the promoted left operand. If
the value of the right operand is negative or is greater than
or equal to the width of the promoted left operand, the behavior is
undefined.
If the program appears to be working, it is by luck. You could make an unrelated change to your program or simply build it on a different machine, and suddenly things will stop working.
If the size of data_length is 32 bits or less, then shifting right by 56 is too big. You can only shift by 0 - 31.
The problem is simple. You're using data_length as int when it should be unsigned, as negative lengths hardly make sense. Also, to be able to shift right by 56 bits, the value must be at least 57 bits wide; otherwise the behaviour is undefined.
In practice processors are known to do wildly different things. In one, shifting a 32-bit value right by 32 bits will clear the variable. In another, the value is shifted by 0 bits (32 % 32!). And then in some, perhaps the processor considers it invalid opcode and the OS kills the process.
Simple solution: declare uint64_t data_length.
If you really have limited yourself to 32-bit datatypes, then you can just assign 0 to these bytes that signify the most significant bytes. Or just cast to uint64_t or unsigned long long before the shift.

Weird behavior of right shift in C (sometimes arithmetic, sometimes logical)

GCC version 5.4.0
Ubuntu 16.04
I have noticed some weird behavior with the right shift in C when I store a value in variable or not.
This code snippet is printing 0xf0000000, the expected behavior
int main() {
int x = 0x80000000;
printf("%x", x >> 3);
}
These following two code snippets are printing 0x10000000, which is very weird in my opinion, it is performing logical shifts on a negative number
1.
int main() {
int x = 0x80000000 >> 3;
printf("%x", x);
}
2.
int main() {
printf("%x", (0x80000000 >> 3));
}
Any insight would be really appreciated. I do not know if it a specific issue with my personal computer, in which case it can't be replicated, or if it is just a behavior in C.
Quoting from https://en.cppreference.com/w/c/language/integer_constant, for an hexadecimal integer constant without any suffix
The type of the integer constant is the first type in which the value can fit, from the list of types which depends on which numeric base and which integer-suffix was used.
int
unsigned int
long int
unsigned long int
long long int (since C99)
unsigned long long int (since C99)
Also, later
There are no negative integer constants. Expressions such as -1 apply the unary minus operator to the value represented by the constant, which may involve implicit type conversions.
So, if an int has 32 bits on your machine, 0x80000000 has the type unsigned int, as it can't fit in an int and can't be negative.
The statement
int x = 0x80000000;
converts the unsigned int to an int in an implementation-defined way, but the statement
int x = 0x80000000 >> 3;
performs the right shift on the unsigned int before converting the result to an int, so the results you see are different.
EDIT
Also, as M.M noted, the format specifier %x requires an unsigned integer argument and passing an int instead causes undefined behavior.
Right shift of a negative integer has implementation-defined behavior. So when right-shifting a negative number, you can't "expect" anything in particular.
So it is just as it is in your implementation. It is not weird.
6.5.7/5 [...] If E1 has a signed type and a negative value, the resulting value is implementation- defined.
It may also invoke undefined behavior:
6.5.7/4 [...] If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
As noted by @P__J__, right shift of a negative value is implementation-defined, so you should not rely on it being consistent across platforms.
As for your specific test, which is on a single platform (possibly 32-bit Intel or another platform that uses two's complement 32-bit representation of integers), but still shows a different behavior:
GCC folds operations on literal constants at compile time. The statement x = 0x80000000 >> 3 is not compiled into code that does a right shift at run time; instead, the compiler evaluates the constant expression and folds it into x = 0x10000000. For the compiler, the literal 0x80000000 is NOT a negative number: it has type unsigned int and value 2^31, so the shift is an unsigned (logical) shift.
On the other hand, x = 0x80000000 will store the value 2^31 into x, but 32-bit storage cannot represent the positive integer 2^31 that you gave as a literal - the value is beyond the range representable by a 32-bit two's complement signed integer. The high-order bit ends up in the sign bit, so the stored value is negative (the out-of-range conversion is implementation-defined, and you may not even get a warning). Then, when you use x >> 3, the operation is performed at run time (not by the compiler) with 32-bit arithmetic - and it sees a negative number.

Why does shifting a variable by more than its width in bits zero it out?

This question is inspired by other questions from StackOverflow. Today, while browsing StackOverflow, I've come across an issue of bitshifting a variable by a value k that is >= the width of that variable in bits. This means shifting a 32-bit int by 32 or more bit positions.
Left shift an integer by 32 bits
Unexpected C/C++ bitwise shift operators outcome
From these questions, it is obvious that if we attempt to shift a number by k bits, where k >= the bit width w of the variable, only the least significant log2(w) bits of the shift amount are taken. For a 32-bit int, the least significant 5 bits are masked off and used as the shift amount.
So in general, if w = width of the variable in bits,
x >> k becomes x >> (k % w)
For an int, this is x >> (k % 32).
The count is masked to five bits, which limits the count range to 0 to 31.
So I've written a small program to observe the behavior that should theoretically be produced. In the comments I've noted the resulting shift amount mod 32.
#include <stdio.h>
#include <stdlib.h>
#define PRINT_INT_HEX(x) printf("%s\t%#.8x\n", #x, x);
int main(void)
{
printf("==============================\n");
printf("Testing x << k, x >> k, where k >= w\n");
int lval = 0xFEDCBA98 << 32;
//int lval = 0xFEDCBA98 << 0;
int aval = 0xFEDCBA89 >> 36;
//int aval = 0xFEDCBA89 >> 4;
unsigned uval = 0xFEDCBA89 >> 40;
//unsigned uval = 0xFEDCBA89 >> 8;
PRINT_INT_HEX(lval)
PRINT_INT_HEX(aval)
PRINT_INT_HEX(uval)
putchar('\n');
return EXIT_SUCCESS;
}
And the output does not match the expected behavior of the shift instructions!
==============================
Testing x << k, x >> k, where k >= w
lval 00000000
aval 00000000
uval 00000000
=====================================================================
Actually, I was a bit confused by Java. In C/C++, shifting an int by a number of bits greater than or equal to the bit width may happen to be reduced to k % w, but this is not guaranteed by the C standard. There is no rule that says this kind of behavior must happen all the time. It is undefined behavior.
However, this is the case in Java. This is a rule of the Java programming language.
Bitshift operators description in Java language specification
Weird result of Java Integer left shift
java : shift distance for int restricted to 31 bits
The linked questions specifically state that shifting by an amount greater than the bit width of the type being shifted invokes undefined behavior, which the standard defines as "behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements"
When you invoke undefined behavior, anything can happen. The program may crash, it may output strange results, or it may appear to work properly. Also, how undefined behavior manifests itself can change if you use a different compiler or different optimization settings on the same compiler.
The C standard states the following regarding the bitwise shift operators in section 6.5.7p3:
The integer promotions are performed on each of the operands. The
type of the result is that of the promoted left operand. If
the value of the right operand is negative or is greater than
or equal to the width of the promoted left operand, the behavior is
undefined.
In this case it's possible that the compiler could reduce the amount to shift modulo the bit width, as you suggested, or it could treat it as mathematically shifting by that amount resulting in all bits being 0. Either is a valid result because the standard does not specify the behavior.
One reason for the undefined-ness is that the 8086, the original x86, did not mask any bits off the shift count. It instead literally performed the shifts, using one clock tick per position.
Intel then realized that allowing 255+ clock ticks for a shift instruction perhaps wasn't such a good idea. They probably considered maximum interrupt response time, for example.
From my old 80286-manual:
To reduce the maximum execution time, the iAPX 286 does not allow shift counts greater than 31. If a shift count greater than 31 is attempted, only the bottom five bits of the shift count are used. The iAPX 86 uses all 8 bits of the shift count.
That gave you different results for the exact same program on a PC/XT and a PC/AT.
So what should the language standard say?
Java solved this by not depending on the underlying hardware. C instead chose to leave the effect undefined.

regarding left shift and right shift operator

void times(unsigned short int time)
{
hours=time>>11;
minutes=((time<<5)>>10);
}
Take the input time to be 24446
The output values are
hours = 11
minutes = 763
The expected values are
hours = 11
minutes = 59
What internal processing is going on in this code?
Binary of 24446 is 0101111101111110
Time>>11 gives 01011 which means 11.
((Time<<5)>>10) gives 111011 which means 59.
But what else is happening here?
What else is going on here?
If time is unsigned short, there is an important difference between
minutes=((time<<5)>>10);
and
unsigned short temp = time << 5;
minutes = temp >> 10;
In both expressions, time << 5 is computed as an int, because of integer promotion rules. [Notes 1 and 2].
In the first expression, this int result is then right-shifted by 10. In the second expression, the assignment to unsigned short temp narrows the result to a short, which is then right-shifted by 10.
So in the second expression, high-order bits are removed (by the cast to unsigned short), while in the first expression they won't be removed if int is wider than short.
There is another important caveat with the first expression. Since the integer promotions might change an unsigned short into an int, the intermediate value might be signed, in which case overflow would be possible if the left shift were large enough. (In this case, it isn't.) The right shift might then be applied to a negative number, the result is "implementation-defined"; many implementations define the behaviour of right-shift of a negative number as sign-extending the number. This can also lead to surprises.
Notes:
Assuming that int is wider than short. If unsigned int and unsigned short are the same width, no conversion will happen and you won't see the difference you describe. The "integer promotions" are described in §6.3.1.1/2 of the C standard (using the C11 draft):
If an int can represent all values of the original type (as restricted by the width, for a
bit-field), the value is converted to an int; otherwise, it is converted to an unsigned
int. These are called the integer promotions. All other types are unchanged by the
integer promotions.
Integer promotion rules effectively make it impossible to do any arithmetic computation directly with a type smaller than int, although compilers may use the as-if rule to use sub-word opcodes. In that case, they have to produce the same result as would have been produced with the promoted values; that's easy for unsigned addition and multiplication, but trickier for shifts.
The bitshift operators are an exception to the semantics of arithmetic operations. For most arithmetic operations, the C standard requires that "the usual arithmetic conversions" be applied before performing the operation. The usual arithmetic conversions, among other things, guarantee that the two operands have the same type, which will also be the type of the result. For bitshifts, the standard only requires that integer promotions be performed on both operands, and that the result will have the type of the left operand. That's because the shift operators are not symmetric. For almost all architectures, there is no valid right operand for a shift which will not fit in an unsigned char, and there is obviously no need for the types or even the signedness of the left and right operands to be the same.
In any event, as with all arithmetic operators, the integer promotions (at least) are going to be performed. So you will not see intermediate results narrower than an int.
This piece of code assumes that int is 16 bits wide, so that left-shifting time by 5 would discard the top 5 bits. Since you're most likely working with a 32/64-bit int, the bits are not discarded, and for a 16-bit value in time:
time >> 5 == (time << 5) >> 10
Try this:
minutes = (time >> 5) & 0x3F;
or
minutes = (time & 0x07FF) >> 5;
or
Declare time as unsigned short and cast the result back to unsigned short after every shift operation, since the arithmetic is done in 32/64 bits.
24446 in binary is: 0101 1111 0111 1110
Bits 0-4 - unknown
Bits 5-10 - minutes
Bits 11-15 - hours
It seems that the size of 'int' for the platform you are working on is 32 bits.
As far as the processing is concerned, assume that:
The first statement divides "time" by 2^11 (2048).
The second statement multiplies "time" by 2^5 (32) and then divides the whole thing by 2^10 (1024).
The answer to your question ends here.
If you add what the time value actually contains (number of seconds/milliseconds/hours or something else), then you may get more help.
Edit:
As @egur pointed out, you might be porting your code from a 16-bit to a 32/64-bit platform.
A widely accepted C coding style to make the code portable is something like below:
Make a Typedef.h file and include it in every other C file,
//Typedef.h
typedef unsigned int U16;
typedef signed int S16;
typedef unsigned char U8;
typedef signed char S8;
:
:
//END
Use U16,U8 etc. while declaring variables.
Now when you move to a larger-word processor, say 32 bit, change your Typedef.h to
//Typedef.h
typedef unsigned int U32;
typedef signed int S32;
typedef unsigned short U16;
typedef signed short S16;
No need to change anything in the rest of the code.
Edit 2:
After seeing your edit:
((Time<<5)>>10) gives 111011 (which means 59) only on 16-bit processors.
For 32 bit processors
((Time<<5)>>10) gives 0000 0010 1111 1011 which means 763.
