Here are two pieces of code that appear to do the same thing, but they do not. When run, they produce different output, and tracing them is confusing because the first one appears to be machine dependent.
Please read both:
Code 1:--
unsigned char c=(((~0 << 3) >> 4) << 1);
printf("%d", c);
Output:-- 254
Code 2:--
unsigned char c=(~0 << 3);
c >>= 4;
c <<= 1;
printf("%d", c);
Output:-- 30
The outputs of the two snippets are different.
It is not only the 1st code that is confusing: any code that chains several bitwise shift operators in a single expression seems to give unexpected results.
The 2nd code behaves correctly.
Please run this code on your machine and verify the output above
AND / OR
explain why these outputs are not the same.
OR
Otherwise, the lesson seems to be that we should not chain multiple bitwise shift operators in one expression.
Thanks
~0 << 3 is always a bug, neither example is correct.
0 is of type int which is signed.
~0 will convert the binary contents to all ones: 0xFF...FF.
When you left shift data into the sign bit of a signed integer, you invoke undefined behavior. Same thing if you left shift a negative integer.
Conclusion: neither example has deterministic output and both can crash or print garbage.
First, ~0 << 3 invokes undefined behavior because ~0 is a signed integer value with all bits set to 1 and you subsequently left shift into the sign bit.
Changing this to ~0u << 3 prevents UB but prints the same result, so the question is why.
So first we have this:
~0u
Which has type unsigned int. This is at least 16 bits wide; taking the 16-bit case for illustration, the value is:
0xffff
Then this:
`~0u << 3`
Gives you:
0xfff8
Then this:
((~0u << 3) >> 4)
Gives you:
0x0fff
And this:
(((~0u << 3) >> 4) << 1)
Gives you:
0x1ffe
Assigning this value to an unsigned char effectively trims it down to the low-order byte:
0xfe
So it prints 254.
Now in the second case you start with this:
unsigned char c = (~0u << 3);
From above, this assigns 0xfff8 to c, which gets truncated to 0xf8. Then >> 4 gives you 0x0f and << 1 gives you 0x1e, which is 30.
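The difference between the two cases can be reproduced without undefined behavior by using ~0u throughout. The following is only a sketch; the function names are made up for illustration and are not from the question.

```c
/* One expression: every shift is done in unsigned int,
   and the value is cut down to 8 bits only once, at the end. */
unsigned char shift_in_one_expression(void)
{
    return (unsigned char)(((~0u << 3) >> 4) << 1);
}

/* Three statements: the value is cut down to 8 bits after the first
   step, so the right shift brings in zeros instead of ones. */
unsigned char shift_step_by_step(void)
{
    unsigned char c = (unsigned char)(~0u << 3); /* low byte: 0xf8 */
    c >>= 4;                                     /* 0x0f */
    c <<= 1;                                     /* 0x1e */
    return c;
}
```

Printing both return values with %d should give 254 and 30, matching the outputs in the question.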
I compiled (with x86-64 gcc 9.1) these two lines:
int main() {
unsigned char e=(~0 << 1);
unsigned char d=(((~0 << 3) >> 4) << 1);
}
And I got the following assembly output:
main:
push rbp
mov rbp, rsp
mov BYTE PTR [rbp-1], -2
mov BYTE PTR [rbp-2], -2
mov eax, 0
pop rbp
ret
As you can see, both lines are compiled to the same kind of instruction, a store of the constant -2 (0xFE as a byte, i.e. 254), e.g. mov BYTE PTR [rbp-1], -2. So it seems the compiler evaluates your first code at compile time.
Thanks to Thomas Jager for his answer (given in a comment on the question).
The solution is simple.
In the 1st code, the bit manipulation is performed on operands of type int, which is signed (the trace below shows only the low 8 bits).
Because of this, the two's complement bit pattern, including the sign bit, carries through every shift. Only at the end is the result converted to a non-negative value for assignment to the unsigned char variable c. Hence the result is 254.
The question is to explain why the output of the two codes is different.
We all know the 2nd code works as expected.
Hence I explain only why Code 1 gives a different result.
1st Code : -
unsigned char c=(((~0 << 3) >> 4) << 1);
printf("%d", c);
The trace of the 1st code is as follows : -
Step 1: ~0 -----> -1 ----(binary form)----> 11111111 with sign bit 1 (means negative)
Step 2: (sign bit 1)11111111 << 3 -----shifting to left----> (sign bit 1)11111000
Step 3 ***: (sign bit 1)11111000 >> 4 ----shifting to right-----> (sign bit 1)11111111
*[*** - The leftmost bits of the result are 1 because of sign extension.
Right shifting a negative number keeps the sign bit at 1 and fills the
vacated leftmost bits with copies of it. This is what typical compilers do;
the C standard leaves the result implementation-defined. ]*
Step 4: (sign bit 1)11111111 << 1 ---shifting to left---> (sign bit 1)11111110
Step 5: the two's complement number (sign bit 1)11111110, i.e. -2, is converted
to unsigned char, which keeps only the low 8 bits.
Step 6: Result : 11111110 ---decimal equivalent---> 254
I am only explaining his answer.
Thanks to all for the effort put into answering this question.
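The final conversion in the trace can be checked on its own, without any shifts. This is a minimal sketch assuming the usual 8-bit char; the helper name is hypothetical.

```c
/* Converting an int to unsigned char reduces the value modulo 256
   (assuming an 8-bit char), so only the low 8 bits survive. */
unsigned char to_unsigned_char(int value)
{
    return (unsigned char)value;
}
```

With this helper, both -2 and 0x1ffe come out as 254, which is why the signed and the unsigned explanation arrive at the same printed value.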
I am developing a simple C app on a CentOS Linux machine my university owns, and I am getting very strange, inconsistent behavior with the << operator.
Basically I am attempting to shift 0xffffffff left by a variable shiftNum, which is derived from a variable n:
int shiftNum = (32 + (~n + 1));
int shiftedBits = (0xffffffff << shiftNum);
This has the effect of shifting 0xffffffff left 32-n times and works as expected. However when n = 0 and shiftNum = 32 I get some very strange behaviour. Instead of getting the expected 0x00000000 I get 0xffffffff.
For example, this code:
int n = 0;
int shiftNum = (32 + (~n + 1));
int shiftedBits = (0xffffffff << shiftNum );
printf("n: %d\n",n);
printf("shiftNum: 0x%08x\n",shiftNum);
printf("shiftedBits: 0x%08x\n",shiftedBits);
int thirtyTwo = 32;
printf("ThirtyTwo: 0x%08x\n",thirtyTwo);
printf("Test: 0x%08x\n", (0xffffffff << thirtyTwo));
Outputs:
n: 0
shiftNum: 0x00000020
shiftedBits: 0xffffffff
ThirtyTwo: 0x00000020
Test: 0x00000000
I honestly have no idea what is going on. Some crazy low-level something, I suspect. Even more strangely, the operation (0xffffffff << (shiftNum -1)) << 1 outputs 0x00000000.
Does anyone have any clue what's going on?
If you invoke undefined behaviour, the results are unspecified and anything is valid.
When n is 0, 32 + (~n + 1) is 32 (on a two's complement CPU). If the left operand 0xffffffff is 32 bits wide (sizeof(unsigned int) == 4, or more precisely sizeof(unsigned int) * CHAR_BIT == 32), then you are only allowed to shift by values 0..31; anything else is undefined behaviour.
ISO/IEC 9899:2011 §6.5.7 Bitwise shift operators:
If the value of the right operand is negative or is
greater than or equal to the width of the promoted left operand, the behavior is undefined.
The result, therefore, is correct — even if you get a different answer each time you run the code, or recompile the program, or anything else.
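One way to stay inside the permitted range 0..31 is to treat the n = 0 case separately. The following is only a sketch of that idea; the helper name is made up, and it assumes n is never greater than 32.

```c
#include <stdint.h>

/* Hypothetical helper: a 32-bit mask with the top n bits set, 0 <= n <= 32.
   n == 0 is handled separately because it would need a shift by 32,
   which is undefined for a 32-bit operand. */
uint32_t high_bits_mask(unsigned n)
{
    if (n == 0)
        return 0;
    return UINT32_C(0xffffffff) << (32 - n);
}
```

For n between 1 and 32 the shift count is 31 down to 0, so the shift is always defined.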
I'm debugging some code and came across some behavior I cannot explain.
I am trying to shift the number -1 to the left 32 times to produce a zero in this particular case.
int n = 0;
int negOne = ~0;
int negativeN = ( (~n) + 1 );
int toShift = (32 + negativeN); //32 - n
/*HELP!!! These produce two different answers*/
printf("%d << %d = %d \n",negOne, toShift, negOne << toShift);
printf("-1 << 32 = %d \n", -1 << 32) ;
Here is what the console outputs:
-1 << 32 = -1
-1 << 32 = 0
I am not sure why the left shift is behaving differently in each of these cases.
It's undefined behavior because your shift count is at least the number of bits in an int, which means that the result can't be predicted.
When you shift a number by a count equal to or greater than its width in bits, the result is not predictable. It is simply undefined behavior.
If you compile your program with warnings enabled, you will get a warning for this shift.
I'm studying C programming language and its bit operators.
I wrote the code below and expected both lines to print the same result.
But in reality they do not.
#include <stdio.h>
#define N 0
int main() {
int n = 0;
printf("%d\n", ~0x00 + (0x01 << (0x20 + (~n + 1))));
printf("%d\n", ~0x00 + (0x01 << (0x20 + (~N + 1))));
return 0;
}
I assumed that the machine represents numbers in 32-bit two's complement.
Both lines should print -1, i.e. all bits set to 1, but the first one prints 0 and the second one prints -1.
I think the two lines are exactly the same code except for using a variable versus a constant.
I used gcc with option -m32 on Mac of i5 CPU.
What's wrong with it?
Thanks.
The short answer
You're evaluating the same expression in two different ways—once at runtime on an x86, and once at compile time. (And I assume you've disabled optimizations when you compile, see below.)
The long answer
Looking at the disassembled executable I notice the following: the argument to the first printf() is computed at runtime:
movl $0x0,-0x10(%ebp)
mov -0x10(%ebp),%ecx ; ecx = 0 (int n)
mov $0x20,%edx ; edx = 32
sub %ecx,%edx ; edx = 32-0 = 32
mov %edx,%ecx ; ecx = 32
mov $0x1,%edx ; edx = 1
shl %cl,%edx ; edx = 1 << (32 & 31) = 1 << 0 = 1
add $0xffffffff,%edx ; edx = -1 + 1 = 0
The shift is performed by an x86 SHL instruction with %cl as its count operand. As per the Intel manual: "The destination operand can be a register or a memory location. The count operand can be an immediate value or register CL. The count is masked to five bits, which limits the count range to 0 to 31. A special opcode encoding is provided for a count of 1."
For the above code, that means that you're shifting by 0, thus leaving the 1 in place after the shift instruction.
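The masking described in the manual can be imitated in portable C. This is only a sketch of the hardware behavior for 32-bit operands, with a made-up function name; it is not something the C language itself guarantees for an out-of-range shift.

```c
#include <stdint.h>

/* Sketch of what the 32-bit x86 SHL instruction does with its count:
   only the low five bits are used, so a count of 32 behaves like 0. */
uint32_t shl_like_x86(uint32_t value, unsigned count)
{
    return value << (count & 31u);
}
```

Here shl_like_x86(1, 32) returns 1, which is the value the runtime path ends up with before -1 is added.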
In contrast, the argument to the second printf() is essentially a constant expression that is computed by the compiler, and the compiler does not mask the shift amount. It therefore performs a "correct" shift of a 32-bit value: 1<<32 = 0. It then adds -1 to that, and you see 0+(-1) = -1 as a result.
This also explains why you see only one warning: left shift count >= width of type and not two, as the warning stems from the compiler evaluating the shift of a 32b value by 32 bits. The compiler did not issue any warning regarding the runtime shift.
Reduced test case
The following is a reduction of your example to its essentials:
#define N 0
int n = 0;
printf("%d %d\n", 1<<(32-N) /* compiler */, 1<<(32-n) /* runtime */);
which prints 0 1 demonstrating the different results of the shift.
A word of caution
Note that the above example works only in -O0 compiled code, where you don't have the compiler optimize (evaluate and fold) constant expressions at compile time. If you take the reduced test case and compile it with -O3 then you get the same and correct results 0 0 from this optimized code:
movl $0x0,0x8(%esp)
movl $0x0,0x4(%esp)
I would think that if you change the compiler options for your test, you will see the same changed behavior.
Note: There seems to be a code-gen bug in gcc-4.2.1 (and others?) where the runtime result is just off (0 8027) due to a broken optimization.
A simplified example
unsigned n32 = 32;
printf("%d\n", (int) sizeof(int)); // 4
printf("%d\n", (0x01 << n32)); // 1
printf("%d\n", (0x01 << 32)); // 0
You get UB in (0x01 << n32) because the shift count is >= the width of int. (It looks like only the 5 least significant bits of n32 took part in the shift, hence a shift by 0.)
You also get UB in (0x01 << 32) for the same reason. (It looks like the compiler performed the math with more bits.) The result of this UB could just as well have been the same as above.
I want to calculate 2^n - 1 for a 64-bit integer value.
What I currently do is this
for(i=0; i<n; i++) r|=1<<i;
and I wonder if there is more elegant way to do it.
The line is in an inner loop, so I need it to be fast.
I thought of
r=(1ULL<<n)-1;
but it doesn't work for n=64, because << is only defined
for values of n up to 63.
EDIT:
Thanks for all your answers and comments.
Here is a little table with the solutions that I tried and liked best.
Second column is time in seconds of my (completely unscientific) benchmark.
r=N2MINUSONE_LUT[n];                  3.9   lookup table = fastest, answer by aviraldg
r =n?~0ull>>(64 - n):0ull;            5.9   fastest without LUT, comment by Christoph
r=(1ULL<<n)-1;                        5.9   Obvious but WRONG!
r =(n==64)?-1:(1ULL<<n)-1;            7.0   Short, clear and quite fast, answer by Gabe
r=((1ULL<<(n/2))<<((n+1)/2))-1;       8.2   Nice, w/o spec. case, answer by drawnonward
r=(1ULL<<n-1)+((1ULL<<n-1)-1);        9.2   Nice, w/o spec. case, answer by David Lively
r=pow(2, n)-1;                       99.0   Just for comparison
for(i=0; i<n; i++) r|=1<<i;         123.7   My original solution = lame
I accepted
r =n?~0ull>>(64 - n):0ull;
as answer because it's in my opinion the most elegant solution.
It was Christoph who came up with it at first, but unfortunately he only posted it in a
comment. Jens Gustedt added a really nice rationale, so I accept his answer instead. Because I liked Aviral Dasgupta's lookup table solution it got 50 reputation points via a bounty.
Use a lookup table. (Generated by your present code.) This is ideal, since the number of values is small, and you know the results already.
/* lookup table: n -> 2^n-1 -- do not touch */
const static uint64_t N2MINUSONE_LUT[] = {
0x0,
0x1,
0x3,
0x7,
0xf,
0x1f,
0x3f,
0x7f,
0xff,
0x1ff,
0x3ff,
0x7ff,
0xfff,
0x1fff,
0x3fff,
0x7fff,
0xffff,
0x1ffff,
0x3ffff,
0x7ffff,
0xfffff,
0x1fffff,
0x3fffff,
0x7fffff,
0xffffff,
0x1ffffff,
0x3ffffff,
0x7ffffff,
0xfffffff,
0x1fffffff,
0x3fffffff,
0x7fffffff,
0xffffffff,
0x1ffffffff,
0x3ffffffff,
0x7ffffffff,
0xfffffffff,
0x1fffffffff,
0x3fffffffff,
0x7fffffffff,
0xffffffffff,
0x1ffffffffff,
0x3ffffffffff,
0x7ffffffffff,
0xfffffffffff,
0x1fffffffffff,
0x3fffffffffff,
0x7fffffffffff,
0xffffffffffff,
0x1ffffffffffff,
0x3ffffffffffff,
0x7ffffffffffff,
0xfffffffffffff,
0x1fffffffffffff,
0x3fffffffffffff,
0x7fffffffffffff,
0xffffffffffffff,
0x1ffffffffffffff,
0x3ffffffffffffff,
0x7ffffffffffffff,
0xfffffffffffffff,
0x1fffffffffffffff,
0x3fffffffffffffff,
0x7fffffffffffffff,
0xffffffffffffffff,
};
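If typing the table by hand feels error-prone, it could also be filled in once at startup. This is a sketch under that assumption; the function name is hypothetical, and the loop appends one bit at a time so it never shifts by 64.

```c
#include <stdint.h>

/* Fills lut[n] with 2^n - 1 for n = 0..64.
   Each step shifts by exactly one bit, which is always defined. */
void fill_n2minusone_lut(uint64_t lut[65])
{
    uint64_t r = 0;
    lut[0] = 0;
    for (unsigned n = 1; n <= 64; n++) {
        r = (r << 1) | 1u;   /* append one more 1 bit */
        lut[n] = r;
    }
}
```

The resulting entries should match the hard-coded table above, from 0x0 up to 0xffffffffffffffff.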
How about a simple r = (n == 64) ? -1 : (1ULL<<n)-1;?
If you want to get the max value just before overflow with a given number of bits, try
r=(1ULL << n-1)+((1ULL<<n-1)-1);
By splitting the shift into two parts (in this case, two 63 bit shifts, since 2^64=2*2^63), subtracting 1 and then adding the two results together, you should be able to do the calculation without overflowing the 64 bit data type.
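Written as a function with explicit parentheses, the idea might look like the sketch below. The name is made up, and it assumes n is between 1 and 64: for n = 0 the expression would need a shift by -1, which is undefined.

```c
#include <stdint.h>

/* 2^n - 1 for 1 <= n <= 64, built from two halves of size 2^(n-1).
   n == 0 is excluded: it would need a shift by -1, which is undefined. */
uint64_t n2minusone_split(unsigned n)
{
    uint64_t half = UINT64_C(1) << (n - 1);   /* shift count is at most 63 */
    return half + (half - 1);
}
```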
if (n > 64 || n < 0)
return undefined...
if (n == 64)
return 0xFFFFFFFFFFFFFFFFULL;
return (1ULL << n) - 1;
I like aviraldg's answer best.
Just to get rid of the `ULL' stuff etc. in C99 I would do
static inline uint64_t n2minusone(unsigned n) {
return n ? (~(uint64_t)0) >> (64u - n) : 0;
}
To see that this is valid:
a uint64_t is guaranteed to have a width of exactly 64 bits
the bit negation of that `zero of type uint64_t' thus has exactly 64 one bits
right shift of an unsigned value is guaranteed to be a logical shift, so everything is filled with zeros from the left
a shift by a value equal to or greater than the width is undefined, so yes, you have to do at least one conditional to be sure of your result
an inline function (or alternatively a cast to uint64_t if you prefer) makes this type safe; an unsigned long long may well be a 128-bit wide value in the future
a static inline function should be seamlessly inlined in the caller without any overhead
The only problem is that your expression isn't defined for n=64? Then special-case that one value.
(n == 64 ? 0ULL : (1ULL << n)) - 1ULL
Mathematically, 1 << 64 truncated to a 64-bit integer is 0 (C itself leaves a shift by 64 undefined), so there is no need to compute anything for n > 63; shifting should be fast enough
r = n < 64 ? (1ULL << n) - 1 : 0;
But if you are doing this to find the max value an N-bit unsigned integer can have, change the 0 into the known maximum, treating n == 64 as a special case (and you cannot give a result for n > 64 on hardware with 64-bit integers unless you use a multiprecision/bignum library).
Another approach with bit tricks
~-(1ULL << (n-1) ) | (1ULL << (n-1))
check if it can be simplified... of course, n>0
EDIT
Tests I've done
__attribute__((regparm(0))) unsigned int calcn(int n)
{
register unsigned int res;
asm(
" cmpl $32, %%eax\n"
" jg mmno\n"
" movl $1, %%ebx\n" // ebx = 1
" subl $1, %%eax\n" // eax = n - 1
" movb %%al, %%cl\n" // because of only possible shll reg mode
" shll %%cl, %%ebx\n" // ebx = ebx << eax
" movl %%ebx, %%eax\n" // eax = ebx
" negl %%ebx\n" // -ebx
" notl %%ebx\n" // ~-ebx
" orl %%ebx, %%eax\n" // ~-ebx | ebx
" jmp mmyes\n"
"mmno:\n"
" xor %%eax, %%eax\n"
"mmyes:\n"
:
"=eax" (res):
"eax" (n):
"ebx", "ecx", "cc"
);
return res;
}
#define BMASK(X) (~-(1ULL << ((X)-1) ) | (1ULL << ((X)-1)))
int main()
{
int n = 32; //...
printf("%08X\n", (unsigned)BMASK(n)); // BMASK is unsigned long long, %X expects unsigned int
printf("%08X %d %08X\n", calcn(n), n&31, (unsigned)BMASK(n&31));
return 0;
}
Output with n = 32 is -1 and -1, while n = 52 yields "-1" and 0xFFFFF; incidentally, 52&31 = 20, and of course n = 20 gives 0xFFFFF...
EDIT2: now the asm code produces 0 for n > 32 (since I am on a 32-bit machine), but at this point the a ? b : 0 solution with BMASK is clearer, and I doubt the asm solution is much faster (if speed is such a big concern, the table idea could be the fastest).
Since you've asked for an elegant way to do it:
const uint64_t MAX_UINT64 = 0xffffffffffffffffULL;
#define N2MINUSONE(n) ((MAX_UINT64>>(64-(n))))
I hate it that (a) n << 64 is undefined and (b) on the popular Intel hardware shifting by word size is a no-op.
You have three ways to go here:
Lookup table. I recommend against this because of the memory traffic, plus you will write a lot of code to maintain the memory traffic.
Conditional branch. Check if n is equal to the word size (8 * sizeof(unsigned long long)), if so, return ~(unsigned long long)0, otherwise shift and subtract as usual.
Try to get clever with arithmetic. For example, in real numbers 2^n = 2^(n-1) + 2^(n-1), and you can exploit this identity to make sure you never use a power equal to the word size. But you had better be very sure that n is never zero, because if it is, this identity cannot be expressed in the integers, and shifting left by -1 is likely to bite you in the ass.
I personally would go with the conditional branch—it is the hardest to screw up, manifestly handles all reasonable cases of n, and with modern hardware the likelihood of a branch misprediction is small. Here's what I do in my real code:
/* What makes things hellish is that C does not define the effects of
a 64-bit shift on a 64-bit value, and the Intel hardware computes
shifts mod 64, so that a 64-bit shift has the same effect as a
0-bit shift. The obvious workaround is to define new shift functions
that can shift by 64 bits. */
static inline uint64_t shl(uint64_t word, unsigned bits) {
assert(bits <= 64);
if (bits == 64)
return 0;
else
return word << bits;
}
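With such a shift function, the original 2^n - 1 problem reduces to a one-liner. The sketch below repeats the helper so that it compiles on its own; n2minusone_via_shl is a made-up name, not part of the answer above.

```c
#include <assert.h>
#include <stdint.h>

/* Same shift helper as above, repeated so the sketch compiles on its own. */
static inline uint64_t shl(uint64_t word, unsigned bits) {
    assert(bits <= 64);
    return bits == 64 ? 0 : word << bits;
}

/* 2^n - 1 for 0 <= n <= 64; for n == 64 the subtraction
   wraps 0 around to all ones. */
static inline uint64_t n2minusone_via_shl(unsigned n) {
    return shl(1, n) - 1;
}
```

Unsigned arithmetic wraps modulo 2^64, so the n == 64 case needs no extra branch beyond the one inside shl.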
I think the issue you are seeing is caused because (1<<n)-1 is evaluated as (1<<(n%64))-1 on some chips. Especially if n is or can be optimized as a constant.
Given that, there are many minor variations you can do. For example:
((1ULL<<(n/2))<<((n+1)/2))-1;
You will have to measure to see if that is faster than special casing 64:
(n<64)?(1ULL<<n)-1:~0ULL;
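Before measuring, it may be worth checking that the two variants agree for every n from 0 to 64. Here is a sketch with made-up function names wrapping the two expressions:

```c
#include <stdint.h>

/* Split the shift into parts of n/2 and (n+1)/2 bits;
   for n <= 64 each part is at most 32, so both shifts are defined. */
uint64_t n2minusone_two_shifts(unsigned n)
{
    return ((UINT64_C(1) << (n / 2)) << ((n + 1) / 2)) - 1;
}

/* Special-case version for comparison. */
uint64_t n2minusone_conditional(unsigned n)
{
    return (n < 64) ? (UINT64_C(1) << n) - 1 : ~UINT64_C(0);
}
```

For n = 64 the two-shift version computes (1 << 32) << 32, which wraps to 0 in 64 bits, and subtracting 1 then gives all ones.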
It is true that in C each bit-shifting operation has to shift by fewer bits than there are bits in the operand (otherwise, the behavior is undefined). However, nobody prohibits you from doing the shift in two consecutive steps
r = ((1ULL << (n - 1)) << 1) - 1;
I.e. shift by n - 1 bits first and then make an extra 1 bit shift. In this case, of course, you have to handle n == 0 situation in a special way, if that is a valid input in your case.
In any case, it is better than your for cycle. The latter is basically the same idea but taken to the extreme for some reason.
Ub = universe in bits = lg(U):
high(v) = v >> (Ub / 2)
low(v) = v & ((~0) >> (Ub - Ub / 2)) // Deal with overflow and with Ub even or odd
You can exploit integer division inaccuracy and use the modulo of the exponent to ensure you always shift in the range [0, (sizeof(uintmax_t) * CHAR_BIT) - 1] to create a universal pow2i function for integers of the largest supported native word size, however, this can easily be tweaked to support arbitrary word sizes.
I honestly don't get why this isn't just the implementation in hardware for bit shift overflows.
#include <limits.h>
#include <stdint.h>
static inline uintmax_t pow2i(uintmax_t exponent) {
#define WORD_BITS ( sizeof(uintmax_t) * CHAR_BIT )
/* a nonzero quotient means 2^exponent does not fit, so the result wraps to 0 */
return (exponent / WORD_BITS) ? 0 : ((uintmax_t) 1) << (exponent % WORD_BITS);
#undef WORD_BITS
}
From there, you can calculate pow2i(n) - 1.