I am working with a mix of C90 and C99 (I cannot fully use C99 for reasons I'd rather not discuss, because they aren't good for my blood pressure and would endanger the life of the person preventing us from moving our code base into the current millennium). Still, I am going to quote the C99 standard.
I have code that is roughly like this, when condensed to the bare minimum (test.c):
#include <stdio.h>
unsigned int foo(unsigned int n)
{
unsigned int x, y;
n = n - 264;
x = (n >> 2) + 1;
y = 1U << (x + 2U);
return y;
}
int main(void)
{
printf("%u\n", foo(384));
return 0;
}
Of course the value passed to foo() can conceivably be bigger than the value given here. Still 384 is the lowest value that will trigger the Clang static analyzer (3.4 compiled from the release tag) to spit a warning:
$ clang -cc1 -triple x86_64-unknown-linux-gnu -analyze -analyzer-checker=core -internal-isystem /usr/local/include -internal-isystem $HOME/bin/LLVM/bin/../lib/clang/3.4/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -O0 -x c test.c
test.c:8:9: warning: The result of the '<<' expression is undefined
y = 1U << (x + 2U);
~~~^~~~~~~~~~~
1 warning generated.
Now going through the lines one by one:
// n == 384
n = n - 264; // n := 384 - 264
// n == 120
x = (n >> 2) + 1; // x := (120 div 4) + 1
// x == 31
y = 1U << (x + 2U); // y := 1 << 33
So, all right, it pushes all the meaningful bits out of the integer, and from my understanding of the following (from the C99 standard), this should simply give me zero:
6.5.7 Bitwise shift operators
...
4
The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned
type, the value of the result is E1 × 2^E2, reduced modulo one more
than the maximum value representable in the result type. If E1 has a
signed type and nonnegative value, and E1 × 2^E2 is representable in
the result type, then that is the resulting value; otherwise, the
behavior is undefined.
From how I read this, undefined behavior can only ever occur if signed values are involved. However, I took care that all of the values are unsigned, and even made that explicit on the literals.
Am I wrong or is the Clang static analyzer overly zealous?
The original incarnation of this code is from Jonathan Bennett's JB01 implementation (version 1.40a) in C++.
In the C99 standard, right before your quoted part:
3
The integer promotions are performed on each of the operands. The type of the result is
that of the promoted left operand. If the value of the right operand is negative or is
greater than or equal to the width of the promoted left operand, the behavior is undefined.
On most machines today, unsigned int is 32 bits wide, so a left shift by 33 is undefined behavior.
The same section also says, in paragraph 3 (6.5.7/3), just before the part you quoted:
If the value of the right operand is negative or is
greater than or equal to the width of the promoted left operand, the behavior is undefined.
Thus, clang is doing a fine job since the behavior is indeed undefined once you shift more bits than the promoted left operand can hold.
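For what it's worth, one way to keep the shift in foo() defined is to check the count against the width of unsigned int before shifting. This is only a minimal sketch: the helper name shl_or_zero and the choice to return 0 for an oversized count are my own assumptions, not something the standard or the original code prescribes.

```c
#include <limits.h>

/* Hypothetical helper: shifts only when the count is below the width of
 * unsigned int; otherwise returns 0, the value the asker expected.
 * Assumes unsigned int has no padding bits, so its width is
 * sizeof(unsigned int) * CHAR_BIT. */
static unsigned int shl_or_zero(unsigned int value, unsigned int count)
{
    const unsigned int width = (unsigned int)(sizeof(unsigned int) * CHAR_BIT);
    return (count < width) ? (value << count) : 0U;
}

/* foo() from the question, with the guarded shift substituted in. */
static unsigned int foo(unsigned int n)
{
    unsigned int x;
    n = n - 264;
    x = (n >> 2) + 1;
    return shl_or_zero(1U, x + 2U);
}
```

With a 32-bit unsigned int, foo(384) now returns 0 instead of invoking undefined behavior. Note that foo(380) already produces a shift count of 32, which is out of range as well.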
The code below, when compiled, throws a warning caused by line 9:
warning: shift count >= width of type [-Wshift-count-overflow]
However, line 8 does not produce a similar warning, even though k == 32 (I believe). Why is that? I am using gcc.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int bit_shift(unsigned x, int i){
int k = i * 8;
unsigned n = x << k; /* line 8 */
unsigned m = x << 32; /* line 9 */
return 0;
}
int main(){
bit_shift(0x12345678, 4);
return 0;
}
The value of k in bit_shift is dependent on the parameter i. And because bit_shift is not declared static it is possible that it could be called from other translation units (read: other source files).
So it can't determine at compile time that this shift will always be a problem. That is in contrast to the line unsigned m = x << 32; which always shifts by an invalid amount.
Line 8 does not produce a warning because k is not known at compile time, not because the shift is well defined.
Paragraph 4 of the C standard (N2716, 6.5.7 Bitwise shift operators) says:
The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
However, paragraph 3 of the same section makes the behavior undefined whenever the right operand is negative or is greater than or equal to the width of the promoted left operand, so shifting a 32-bit unsigned int by 32 or more bits is still undefined behavior.
I am currently trying to extract some bits from an address called addr with a 32 bit mask called mask into another variable called result as follows
int addr = 7;
int x = 0;
uint32_t mask = 0xFFFFFFFF;
result = addr & (mask >> (32 - x));
I am expecting result to be 0 when x = 0, and online bit-shift calculators confirm this. However, in C code, result is 1. Why is that?
You're performing an illegal bitshift.
Shifting by a value greater than or equal to the width in bits of the (promoted) left operand results in undefined behavior. This is documented in section 6.5.7p3 of the C standard:
The integer promotions are performed on each of the operands. The type
of the result is that of the promoted left operand. If the value of
the right operand is negative or is greater than or equal to the width
of the promoted left operand, the behavior is undefined.
This means you need to check the value of x, and if it is 0 then just use 0 for the bitmask.
int x = 0;
uint32_t mask = 0xFFFFFFFF;
...
if (x == 0) {
result = 0;
} else {
result = addr & (mask >> (32 - x));
}
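The same check can be wrapped in a small helper so that the shift count never reaches 32. This is a sketch under assumptions: the name low_bits_mask is hypothetical, and clamping requests above 32 bits is my own choice rather than anything from the question.

```c
#include <stdint.h>

/* Hypothetical helper: returns a mask with the low `bits` bits set.
 * bits == 0 would require shifting by 32, which is undefined for a
 * 32-bit operand, so it is handled separately. Values above 32 are
 * treated as 32 (an assumption). */
static uint32_t low_bits_mask(unsigned int bits)
{
    if (bits == 0U) {
        return 0U;
    }
    if (bits > 32U) {
        bits = 32U;
    }
    return UINT32_C(0xFFFFFFFF) >> (32U - bits);
}
```

The expression from the question then becomes result = addr & low_bits_mask(x), which yields 0 when x is 0.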
From the C standard (6.5.7 Bitwise shift operators)
3 The integer promotions are performed on each of the operands. The
type of the result is that of the promoted left operand. If the value
of the right operand is negative or is greater than or equal to the
width of the promoted left operand, the behavior is undefined
I'm writing a simple code in C (only using bit-wise operators) that takes a pointer to an unsigned integer x and flips the bit at the nth position n in the binary notation of the integer. The function is declared as follows:
int flip_bit (unsigned * x, unsigned n);
It is assumed that n is between 0 and 31.
In one of the steps, I perform a shift-right operation, but the results are not what I expect. For instance, if I do 0x80000000 >> 30, I get 0xfffffffe as a result; in binary notation these are 1000 0000 ... 0000 and 1111 1111 ... 1110, respectively. (The expected result is 0000 0000 ... 0010.)
I am unsure of how or where I am making the mistake. Any help would be appreciated. Thanks.
Edit 1: Below is the code.
#include <stdio.h>
#define INTSIZE 31
void flip_bit(unsigned * x,
unsigned n) {
int a, b, c, d, e, f, g, h, i, j, k, l, m, p, q;
// save bits on the left of n and insert a zero at the end
a = * x >> n + 1;
b = a << 1;
// save bits on the right of n
c = * x << INTSIZE - (n - 1);
d = c >> INTSIZE - (n - 1);
// shift the bits to the left (back in their positions)
// combine all bits
e = d << n;
f = b | e;
// Isolating the nth bit in its position
g = * x >> n;
h = g << INTSIZE;
// THIS LINE BELOW IS THE ONE CAUSING TROUBLE.
i = h >> INTSIZE - n;
// flipping all bits and removing the 1s surrounding
// the nth bit (0 or 1)
j = ~i;
k = j >> n;
l = k << INTSIZE;
p = l >> INTSIZE - n;
// combining the value missing nth bit and
// the one with the flipped one
q = f | p;
* x = q;
}
I'm getting the unusual behavior when I run flip_bit(0x0000004e,0). The line for the shift-right operation in question has comments in uppercase above it.
There is probably a shorter way to do this (without using a thousand variables), but that's what I have now.
Edit 2: The problem was that I declared the variables as int (instead of unsigned). Nevertheless, that's a terrible way to solve the question. @old_timer suggested returning *x ^ (1u << n), which is much better.
The issue here is that you're performing a right shift on a signed int.
From section 6.5.7 of the C standard:
5 The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed type and a nonnegative
value, the value of the result is the integral part of the quotient of
E1 / 2^E2. If E1 has a signed type and a negative value, the
resulting value is implementation-defined.
The last sentence of that paragraph is what's happening in your case. Each of your intermediate variables is of type int. Assuming your system uses 2's complement representation for negative numbers, any int value with the high bit set is interpreted as a negative value.
The most common implementation-defined behavior you'll see in this case (and this is in fact what gcc and MSVC both do) is that if the high bit is set on a signed value, then a 1 is shifted in on a right shift. This preserves the sign of the value and makes x >> n equal to x / 2^n rounded toward negative infinity for signed values.
You can fix this by changing all of your intermediate variables to unsigned. That way, they match the type of *x and you won't get 1s pushed on to the left.
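To illustrate, the zero-filling shift can be had from a signed value by converting it to unsigned first. This is a minimal sketch; the function name logical_shr is hypothetical, and the example values in the note below assume a 32-bit int.

```c
/* Hypothetical helper: shifts the bit pattern of v right by n positions,
 * always filling with zeros. Converting to unsigned is well defined
 * (reduction modulo 2^N), and right-shifting an unsigned value never
 * copies a sign bit. n must be smaller than the width of unsigned int. */
static unsigned int logical_shr(int v, unsigned int n)
{
    return (unsigned int)v >> n;
}
```

With a 32-bit int, shifting the most negative value right by 30 this way gives 2 rather than 0xfffffffe.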
As for your method of flipping a bit, there is a much simpler way of doing so. You can instead use the ^ operator, which is the bitwise exclusive OR operator.
From section 6.5.11 of the C standard:
4 The result of the ^ operator is the bitwise exclusive OR (XOR) of the
operands (that is, each bit in the result is set if and only if
exactly one of the corresponding bits in the converted operands is
set).
For example:
  0010      1000
^ 1100    ^ 1101
------    ------
  1110      0101
Note that you can use this to create a bitmask, then use that bitmask to flip the bits in the other operand.
So if you want to flip bit n, take the value 1, left shift it by n to move that bit to the desired location then XOR that value with your target value to flip that bit:
void flip_bit(unsigned * x, unsigned n) {
*x = *x ^ (1u << n);
}
You can also use the ^= operator in this case which XORs the right operand to the left and assigns the result to the left:
*x ^= (1u << n);
Also note the u suffix on the integer constant. That causes the type of the constant to be unsigned which helps to avoid the implementation defined behavior you experienced.
#include <stdio.h>
int main ( void )
{
unsigned int x;
int y;
x=0x80000000;
x>>=30;
printf("0x%08X\n",x);
y=0x80000000;
y>>=30;
printf("0x%08X\n",y);
return(0);
}
Output with gcc on Linux Mint:
0x00000002
0xFFFFFFFE
or what about this
#include <stdio.h>
int main ( void )
{
unsigned int x;
x=0x12345678;
x^=1<<30;
printf("0x%08X\n",x);
}
output
0x52345678
My program is written below:
void main() {
int n =0;
printf("%x", (~0 << (32+ (~n +1) )));
}
As n = 0, ~n = 0xffffffff == -1, so ~n + 1 is equal to 0.
When I execute this program, I get 0xffffffff, which is incorrect as (~0 << 32 ) outputs 0.
When I replace (~n +1) with 0, it outputs 0.
Any help is very much appreciated.
You're shifting a 32-bit wide value by 32 bits.
The result is undefined and could equal mushroom lasagna for all you know.
[C99: 6.5.7/3]: The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
Any further analysis, then, is folly.
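If the goal is a mask whose top n bits are set for any n from 0 to 32, one option (a sketch, not the only fix) is to do the shift in a 64-bit type, where a count of 32 is still in range, and then truncate. The name high_bits_mask is hypothetical, and the code assumes uint32_t and uint64_t from <stdint.h> are available.

```c
#include <stdint.h>

/* Hypothetical helper: returns a 32-bit mask with the top n bits set,
 * for n in 0..32. The shift count (32 - n) can be 32, which is undefined
 * for a 32-bit operand but well defined for a 64-bit one. */
static uint32_t high_bits_mask(unsigned int n)
{
    uint64_t wide = UINT64_C(0xFFFFFFFF);
    if (n > 32U) {
        n = 32U; /* clamp out-of-range requests (an assumption) */
    }
    /* Shift in 64 bits, then keep only the low 32 bits. */
    return (uint32_t)(wide << (32U - n));
}
```

For n = 0 this returns 0, which is the result the question expected from ~0 << 32.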
int pcount_r (unsigned x) {
if(x==0)
return 0;
else
return ((x & 1) + pcount_r(x >> 1));
}
Just wondering why the input argument is unsigned.
best regards!
It is implementation-defined what E1 >> E2 produces when E1 has a signed type and negative value (C99 6.5.7:5). On the other hand, E1 >> E2 is unambiguously defined by the standard when E1 has an unsigned type. Accepting and operating on an unsigned integer is a way to make the function as portable as possible.
Incidentally, it is usual to use unsigned types for bit-twiddling.
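As an illustration of why the unsigned parameter helps, here is an iterative sketch of the same bit count. The name pcount is my own; the logic mirrors pcount_r from the question.

```c
/* Iterative variant of pcount_r: counts the 1 bits in x.
 * Because x is unsigned, each right shift fills with zeros, so x is
 * guaranteed to reach 0 and the loop terminates. */
static int pcount(unsigned int x)
{
    int count = 0;
    while (x != 0U) {
        count += (int)(x & 1U);
        x >>= 1;
    }
    return count;
}
```

Passing a negative int converts it to unsigned modulo 2^N, so pcount(-1) counts every bit of an unsigned int.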
If the number is signed, then right-shifting will typically copy the sign bit (the most significant bit), effectively giving negative numbers an infinite number of 1 bits.
int pcount_r(int x) {
if (x == 0)
return 0;
else if (x < 0)
return sizeof(int)*8 - pcount_r(~x);
else
return (x & 1) + pcount_r(x >> 1);
}
The problem is that C (unlike Java, which has both >> and >>>) has only one right-shift operator. CPUs have two different types of right shift, signed and unsigned. For example, on x86 the SAR instruction does an arithmetic (signed) shift right, and SHR does a logical (unsigned) shift right. Since C only has one shift-right operator (>>), it cannot expose both, and which one is used for a negative signed operand is implementation-defined. If the parameter were a signed int and the compiler implemented >> with an arithmetic shift (SAR), a negative argument would keep its sign bit forever, x would never reach 0, and the recursion would not terminate.