2^32 - 1 not part of uint32_t? - c

Here is the program whose compilation output makes me cry:
#include <inttypes.h>
int main()
{
uint32_t limit = (1 << 32) - 1; // 2^32 - 1 right?
}
and here is the compilation output:
~/workspace/CCode$ gcc uint32.c
uint32.c: In function ‘main’:
uint32.c:5:29: warning: left shift count >= width of type [-Wshift-count-overflow]
uint32_t limit = (1 << 32) - 1; // 2^32 - 1 right?
I thought that (1 << 32) - 1 equals 2^32 - 1, and that unsigned 32-bit integers range from 0 to 2^32 - 1. Isn't that the case? Where did I go wrong?

The warning is correct: the highest bit in a 32-bit number is bit 31 (0-indexed), so the largest shift that avoids overflow is 1 << 30 (30 rather than 31 because of the sign bit). Even though you subtract 1 afterwards, the intermediate result of 1 << 32 must be computed first, and it is computed in an int (which in this case happens to be 32 bits). Hence you get the warning.
If you really need to get the max of the 32 bit unsigned int you should do it the neat way:
#include <stdint.h>
uint32_t limit = UINT32_MAX;
Or better yet, if you are in C++, use the limits header:
#include <limits>
auto limit = std::numeric_limits<uint32_t>::max();

You have two errors:
1 is of type int, so you are computing the initial value as an int, not as a uint32_t.
As the warning says, shift operators must have their shift argument be less than the width of the type. 1 << 32 is undefined behavior if int is 32 bits or less. (uint32_t)1 << 32 would be undefined as well.
(also, note that 1 << 31 would be undefined behavior as well, if int is 32 bits, because of overflow)
Since unsigned arithmetic is done modulo 2^32 anyway, an easier way to do this is just
uint32_t x = -1;
uint32_t y = (uint32_t)0 - 1; // this way avoids compiler warnings
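A minimal check that these spellings all agree (a throwaway test program, not from the original answers):
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
int main(void)
{
uint32_t a = UINT32_MAX; // the named constant
uint32_t b = -1; // -1 converted modulo 2^32 (may warn)
uint32_t c = (uint32_t)0 - 1; // same value, without the warning
printf("%" PRIu32 " %" PRIu32 " %" PRIu32 "\n", a, b, c);
return 0; // prints 4294967295 three times
}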

The compiler is using int internally in your example when calculating the target constant. Imagine that the compiler had no optimization available and had to generate assembly for your shift: the shift count 32 would be too big for the 32-bit int shift instruction.
Also, if you want all bits set, use ~0

Related

Does bit-shifting in C only work on blocks of 32-bits

I've been experimenting with C again after a while of not coding, and I have come across something I don't understand regarding bit shifting.
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
void main()
{
uint64_t x = 0;
uint64_t testBin = 0b11110000;
x = 1 << testBin;
printf("testBin is %"PRIu64"\nx is %"PRIu64"\n", testBin, x);
//uint64_t y = 240%32;
//printf("%"PRIu64 "\n",y);
}
In the above code, x returns 65536, indicating that after bit shifting 240 places the 1 is now sat in position 17 of a 32-bit register, whereas I'd expect it to be at position 49 of a 64-bit register.
I tried the same with unsigned long long types, that did the same thing.
I've tried compiling both with and without the -m64 argument; both behave the same.
In your setup the constant 1 is a 32 bit integer. Thus the expression 1 << testBin operates on 32 bits. You need to use a 64 bit constant to have the expression operate on 64 bits, e.g.:
x = (uint64_t)1 << testBin;
This does not change the fact that shifting by 240 bits is formally undefined behavior (even though it will probably give the expected result anyway). If testBin is set to 48, the result will be well-defined. Hence the following should be preferred:
x = (uint64_t)1 << (testBin % 64);
It happens because of the default type of the constant 1: it is int, not long long int. You need to use the ULL suffix:
x = 1ULL << testBin;
P.S. If you want to shift by 240 bits and your integer type is narrower than that (unless your implementation supports some giant integers), it is undefined behaviour.
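Putting both fixes together, a corrected sketch of the program from the question (the 0b literal is a GCC extension, kept from the original):
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
int main(void)
{
uint64_t testBin = 0b11110000; // 240
uint64_t x = (uint64_t)1 << (testBin % 64); // 64-bit constant, shift count reduced mod 64
printf("testBin is %"PRIu64"\nx is %"PRIu64"\n", testBin, x);
return 0; // 240 % 64 == 48, so x prints as 281474976710656, i.e. 1 << 48
}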

Convert Raw 14 bit Two's Complement to Signed 16 bit Integer

I am doing some work in embedded C with an accelerometer that returns data as a 14 bit 2's complement number. I am storing this result directly into a uint16_t. Later in my code I am trying to convert this "raw" form of the data into a signed integer to represent / work with in the rest of my code.
I am having trouble getting the compiler to understand what I am trying to do. In the following code I'm checking if the 14th bit is set (meaning the number is negative) and then I want to invert the bits and add 1 to get the magnitude of the number.
int16_t fxls8471qr1_convert_raw_accel_to_mag(uint16_t raw, enum fxls8471qr1_fs_range range) {
int16_t raw_signed;
if(raw & _14BIT_SIGN_MASK) {
// Convert 14 bit 2's complement to 16 bit 2's complement
raw |= (1 << 15) | (1 << 14); // 2's complement extension
raw_signed = -(~raw + 1);
}
else {
raw_signed = raw;
}
uint16_t divisor;
if(range == FXLS8471QR1_FS_RANGE_2G) {
divisor = FS_DIV_2G;
}
else if(range == FXLS8471QR1_FS_RANGE_4G) {
divisor = FS_DIV_4G;
}
else {
divisor = FS_DIV_8G;
}
return ((int32_t)raw_signed * RAW_SCALE_FACTOR) / divisor;
}
This code unfortunately doesn't work. The disassembly shows me that for some reason the compiler is optimizing out my statement raw_signed = -(~raw + 1); How do I achieve the result I desire?
The math works out on paper, but I feel like for some reason the compiler is fighting with me :(.
Converting the 14 bit 2's complement value to 16 bit signed, while maintaining the value, is simply a matter of:
int16_t accel = (int16_t)(raw << 2) / 4 ;
The left-shift pushes the sign bit into the 16-bit sign-bit position, and the divide by four restores the magnitude while maintaining its sign. The divide avoids the implementation-defined behaviour of a right-shift of a negative value, but will normally compile to a single arithmetic-shift-right instruction on instruction sets that allow it. The cast is necessary because raw << 2 is an int expression, and unless int is 16 bits, the divide would simply restore the original value.
It would be simpler however to just shift the accelerometer data left by two bits and treat it as if the sensor was 16 bit in the first place. Normalising everything to 16 bit has the benefit that the code needs no change if you use a sensor with any number of bits up-to 16. The magnitude will simply be four times greater, and the least significant two bits will be zero - no information is gained or lost, and the scaling is arbitrary in any case.
int16_t accel = raw << 2 ;
In both cases, if you want the unsigned magnitude then that is simply:
int32_t mag = (int32_t)labs( (int)accel ) ;
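A quick check of the shift-and-divide conversion on the edge values (a throwaway test; 0x2000 and 0x3FFF are -8192 and -1 in 14-bit two's complement):
#include <stdio.h>
#include <stdint.h>
int main(void)
{
uint16_t samples[] = { 0x0000, 0x1FFF, 0x2000, 0x3FFF };
for (int i = 0; i < 4; i++) {
// push the sign bit to bit 15, then scale back down;
// the int16_t cast is implementation-defined for out-of-range values,
// but behaves as two's complement on practically all platforms
int16_t accel = (int16_t)(samples[i] << 2) / 4;
printf("0x%04X -> %d\n", (unsigned)samples[i], accel);
}
return 0; // prints 0, 8191, -8192, -1
}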
I would do simple arithmetic instead. The result is 14-bit signed, which is represented as a number from 0 to 2^14 - 1. Test if the number is 2^13 or above (signifying a negative) and then subtract 2^14.
int16_t fxls8471qr1_convert_raw_accel_to_mag(uint16_t raw, enum fxls8471qr1_fs_range range)
{
int16_t raw_signed = raw;
if(raw_signed >= 1 << 13) {
raw_signed -= 1 << 14;
}
uint16_t divisor;
if(range == FXLS8471QR1_FS_RANGE_2G) {
divisor = FS_DIV_2G;
}
else if(range == FXLS8471QR1_FS_RANGE_4G) {
divisor = FS_DIV_4G;
}
else {
divisor = FS_DIV_8G;
}
return ((int32_t)raw_signed * RAW_SCALE_FACTOR) / divisor;
}
Please check my arithmetic. (Do I have 13 and 14 correct? The check: bit 13 is the sign bit of a 14-bit value, so anything >= 2^13 = 8192 is negative, and subtracting 2^14 = 16384 maps [8192, 16383] onto [-8192, -1].)
Supposing that int in your particular C implementation is 16 bits wide, the expression (1 << 15), which you use in mangling raw, produces undefined behavior. In that case, the compiler is free to generate code to do pretty much anything -- or nothing -- if the branch of the conditional is taken wherein that expression is evaluated.
Also if int is 16 bits wide, then the expression -(~raw + 1) and all intermediate values will have type unsigned int == uint16_t. This is a result of "the usual arithmetic conversions", given that (16-bit) int cannot represent all values of type uint16_t. The result will have the high bit set and therefore be outside the range representable by type int, so assigning it to an lvalue of type int produces implementation-defined behavior. You'd have to consult your documentation to determine whether the behavior it defines is what you expected and wanted.
If you instead perform a 14-bit sign conversion, forcing the higher-order bits off ((~raw + 1) & 0x3fff) then the result -- the inverse of the desired negative value -- is representable by a 16-bit signed int, so an explicit conversion to int16_t is well-defined and preserves the (positive) value. The result you want is the inverse of that, which you can obtain simply by negating it. Overall:
raw_signed = -(int16_t)((~raw + 1) & 0x3fff);
Of course, if int were wider than 16 bits in your environment then I see no reason why your original code would not work as expected. That would not invalidate the expression above, however, which produces consistently-defined behavior regardless of the size of default int.
Assuming when code reaches return ((int32_t)raw_signed ..., it has a value in the [-8192 ... +8191] range:
If RAW_SCALE_FACTOR is a multiple of 4 then a little savings can be had.
So rather than
int16_t raw_signed = raw << 2;
raw_signed >>= 2;
instead
int16_t fxls8471qr1_convert_raw_accel_to_mag(uint16_t raw,enum fxls8471qr1_fs_range range){
int16_t raw_signed = raw << 2;
uint16_t divisor;
...
// return ((int32_t)raw_signed * RAW_SCALE_FACTOR) / divisor;
return ((int32_t)raw_signed * (RAW_SCALE_FACTOR/4)) / divisor;
}
To convert the 14-bit two's-complement value into a signed value, you can flip the sign bit and subtract the offset: flipping bit 13 adds 2^13 when the bit was clear and removes 2^13 when it was set, so the subtraction leaves non-negative values unchanged and maps negative ones to raw - 2^14:
int16_t raw_signed = (raw ^ 1 << 13) - (1 << 13);

find ones position in 64 bit number

I'm trying to find the position of two 1's in a 64 bit number. In this case the ones are at the 0th and 63rd position. The code here returns 0 and 32, which is only half right. Why does this not work?
#include<stdio.h>
void main()
{
unsigned long long number=576460752303423489;
int i;
for (i=0; i<64; i++)
{
if ((number & (1 << i))==1)
{
printf("%d ",i);
}
}
}
There are two bugs on the line
if ((number & (1 << i))==1)
which should read
if (number & (1ull << i))
Changing 1 to 1ull means that the left shift is done on a value of type unsigned long long rather than int, and therefore the bitmask can actually reach positions 32 through 63. Removing the comparison to 1 is because the result of number & mask (where mask has only one bit set) is either mask or 0, and mask is only equal to 1 when i is 0.
However, when I make that change, the output for me is 0 59, which still isn't what you expected. The remaining problem is that 576460752303423489 (decimal) = 0800 0000 0000 0001 (hexadecimal). 0 59 is the correct output for that number. The number you wanted is 9223372036854775809 (decimal) = 8000 0000 0000 0001 (hex).
Incidentally, main is required to return int, not void, and needs an explicit return 0; as its last action (unless you are doing something more sophisticated with the return code). Yes, C99 lets you omit that. Do it anyway.
Because (1 << i) is a 32-bit int value on the platform you are compiling and running on. This then gets sign-extended to 64 bits for the & operation with the number value, resulting in bit 31 being duplicated into bits 32 through 63.
Also, you are comparing the result of the & to 1, which isn't correct. If the bit is set the result will be nonzero, but it won't (in general) be 1.
Shifting a 32-bit int by 32 or more is undefined.
Also, your input number is incorrect. The bits set are at positions 0 and 59 (or 1 and 60 if you prefer to count starting at 1).
The fix is to use (1ull << i), or otherwise to right-shift the original value and & it with 1 (instead of left-shifting 1). And of course if you do left-shift 1 and & it with the original value, the result won't be 1 (except for bit 0), so you need to compare != 0 rather than == 1.
#include<stdio.h>
int main()
{
unsigned long long number = 576460752303423489;
int i;
for (i=0; i<64; i++)
{
if ((number & (1ULL << i))) //here
{
printf("%d ",i);
}
}
}
First, use 1ULL to represent an unsigned long long constant. Second, in the if statement, what you mean is not to compare with 1; that would only be true for the rightmost bit.
Output: 0 59
It's correct because 576460752303423489 is equal to 0x800000000000001
The problem could have been avoided in the first place by shifting the value right instead of shifting a literal 1 left:
if ((variable >> other_variable) & 1)
...
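Applied to the loop from the question, that approach looks like this (a sketch, using the corrected constant 2^63 + 1 identified above):
#include <stdio.h>
int main(void)
{
unsigned long long number = 9223372036854775809ULL; // 0x8000000000000001: bits 0 and 63
for (int i = 0; i < 64; i++) {
if ((number >> i) & 1) // shift the value right; no 64-bit constant needed
printf("%d ", i);
}
printf("\n");
return 0; // prints: 0 63
}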
I know this question is old and has multiple correct answers, and mine should really be a comment, but it is a bit too long for one. I advise you to encapsulate the bit-checking logic in a macro, and not to use the number 64 directly, but rather to calculate it. Take a look here for a quite comprehensive source of bit-manipulation hacks.
#include<stdio.h>
#include<limits.h>
#define CHECK_BIT(var,pos) ((var) & (1ULL<<(pos)))
int main(void)
{
unsigned long long number=576460752303423489;
int pos=sizeof(unsigned long long)*CHAR_BIT;
while((pos--)>0) { // pos runs 63..0 inside the loop
if(CHECK_BIT(number,pos))
printf("%d ",pos);
}
return(0);
}
Rather than resorting to bit manipulation, one can use compiler facilities to perform bit analysis tasks in the most efficient manner (using only a single CPU instruction in many cases).
For example, gcc and clang provide those handy routines:
__builtin_popcountll() - number of bits set in the 64b value
__builtin_clzll() - number of leading zeroes in the 64b value
__builtin_ctzll() - number of trailing zeroes in the 64b value
__builtin_ffsll() - one plus the index of the least significant set bit in the 64b value (0 if the value is zero)
Other compilers have similar mechanisms.
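For instance, the positions of all set bits can be printed with __builtin_ctzll by repeatedly clearing the lowest set bit (a GCC/clang sketch; note __builtin_ctzll is undefined for a zero argument, which the loop condition rules out):
#include <stdio.h>
int main(void)
{
unsigned long long number = 576460752303423489ULL; // bits 0 and 59
while (number != 0) {
printf("%d ", __builtin_ctzll(number)); // index of the lowest set bit
number &= number - 1; // clear the lowest set bit
}
printf("\n");
return 0; // prints: 0 59
}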

How to detect in C whether your machine is 32-bits

So I am revising for an exam and I got stuck in this problem:
2.67 ◆◆
You are given the task of writing a procedure int_size_is_32() that yields 1
when run on a machine for which an int is 32 bits, and yields 0 otherwise. You are
not allowed to use the sizeof operator. Here is a first attempt:
1 /* The following code does not run properly on some machines */
2 int bad_int_size_is_32() {
3 /* Set most significant bit (msb) of 32-bit machine */
4 int set_msb = 1 << 31;
5 /* Shift past msb of 32-bit word */
6 int beyond_msb = 1 << 32;
7
8 /* set_msb is nonzero when word size >= 32
9 beyond_msb is zero when word size <= 32 */
10 return set_msb && !beyond_msb;
11 }
When compiled and run on a 32-bit Sun SPARC, however, this procedure returns 0. The following compiler message gives us an indication of the problem: warning: left shift count >= width of type
A. In what way does our code fail to comply with the C standard?
B. Modify the code to run properly on any machine for which data type int is
at least 32 bits.
C. Modify the code to run properly on any machine for which data type int is
at least 16 bits.
__________ MY ANSWERS:
A: When we shift by 31 in line 4, we overflow, because according to the unsigned integer standard, the maximum unsigned integer we can represent is 2^31 - 1
B: In line 4 1<<30
C: In line 4 1<<14 and in line 6 1<<16
Am I right? And if not, why? Thank you!
__________ Second tentative answer:
B: In line 4 (1<<31)>>1 and in line 6: int beyond_msb = set_msb+1; I think I might be right this time :)
A: When we shift by 31 in line 4, we overflow, because according to the unsigned integer standard, the maximum unsigned integer we can represent is 2^31 - 1
The error is on line 6, not line 4. The compiler message explains exactly why: shifting by a number of bits greater than or equal to the width of the type is undefined behavior.
B: In line 4 1<<30
C: In line 4 1<<14 and in line 6 1<<16
Both of those changes will cause the error to not appear, but will also make the function give incorrect results. You will need to understand how the function works (and how it doesn't work) before you fix it.
First, shifting by 30 will not create any overflow, as the maximum you can shift by is the word size w - 1.
So when w = 32 you can shift by up to 31.
Overflow occurs when you shift by 32 bits, as the LSB would then move to the 33rd bit, which is out of bounds.
So the problem is in line 6, not line 4.
For B:
0xffffffff + 1
If int is 32 bits this will give 0, otherwise some nonzero number.
There is absolutely no way to test the size of signed types in C at runtime. This is because overflow is undefined behavior; you cannot tell if overflow has happened. If you use unsigned int, you can just count how many times you can double a value that starts at 1 before the result becomes zero.
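A sketch of that runtime test (it measures the width of unsigned int, which on common platforms equals the width of int):
int int_size_is_32(void)
{
unsigned int x = 1; // unsigned, so the wrap-around is well-defined
int count = 0;
while (x != 0) {
x <<= 1; // doubling; wraps to 0 after 'width' shifts
count++;
}
return count == 32;
}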
If you want to do the test at compile-time instead of runtime, this will work:
struct { int x:N; };
where N is replaced by successively larger values. The compiler is required to accept the program as long as N is no larger than the width of int, and reject it with a diagnostic/error when N is larger.
You should be able to comply with the C standard by breaking the left shifts up.
B -
Replace Line 6 with
int beyond_msb = (1 << 31) << 1;
C -
Replace Line 4 with
int set_msb = ((1 << 15) << 15) << 1 ;
Replace Line 6 with
int beyond_msb = ((1 << 15) << 15) << 2;
Also, as an extension to the question, the following should satisfy both B and C, and remain free of runtime errors: shift left one bit at a time until the value wraps back around to zero.
int int_size_is_32() {
//initialise our test variable; unsigned, so the wrap-around is well-defined
unsigned int x = 1;
//count for checking purposes
int count = 0;
//keep shifting left 1 bit until the 1-bit has been pushed off the left of the value type space
while ( x != 0 ) {
x <<= 1; //shift left
count++;
}
return (count == 32);
}

C macro to create a bit mask -- possible? And have I found a GCC bug?

I am somewhat curious about creating a macro to generate a bit mask for a device register, up to 64 bits, such that BIT_MASK(31) produces 0xffffffff.
However, several C examples do not work as expected: I get 0x7fffffff instead. It is as if the compiler assumed I want signed output, not unsigned. So I tried 32, and noticed that the value wraps back around to 0. This is because the C standard states that if the shift count is greater than or equal to the number of bits in the operand being shifted, the result is undefined. That makes sense.
But, given the following program, bits2.c:
#include <stdio.h>
#include <stdlib.h> /* for atoi */
#define BIT_MASK(foo) ((unsigned int)(1 << foo) - 1)
int main()
{
unsigned int foo;
char *s = "32";
foo = atoi(s);
printf("%d %.8x\n", foo, BIT_MASK(foo));
foo = 32;
printf("%d %.8x\n", foo, BIT_MASK(foo));
return (0);
}
If I compile with gcc -O2 bits2.c -o bits2, and run it on a Linux/x86_64 machine, I get the following:
32 00000000
32 ffffffff
If I take the same code and compile it on a Linux/MIPS (big-endian) machine, I get this:
32 00000000
32 00000000
On the x86_64 machine, if I use gcc -O0 bits2.c -o bits2, then I get:
32 00000000
32 00000000
If I tweak BIT_MASK to ((unsigned int)(1UL << foo) - 1), then the output is 32 00000000 for both forms, regardless of gcc's optimization level.
So it appears that on x86_64, gcc is optimizing something incorrectly OR the undefined nature of left-shifting 32 bits on a 32-bit number is being determined by the hardware of each platform.
Given all of the above, is it possible to programmatically create a C macro that creates a bit mask from either a single bit or a range of bits?
I.e.:
BIT_MASK(6) = 0x40
BIT_FIELD_MASK(8, 12) = 0x1f00
Assume BIT_MASK and BIT_FIELD_MASK operate from a 0-index (0-31). BIT_FIELD_MASK is to create a mask from a bit range, i.e., 8:12.
Here is a version of the macro which will work for arbitrary positive inputs. (Negative inputs still invoke undefined behavior...)
#include <limits.h>
/* A mask with x least-significant bits set, possibly 0 or >=32 */
#define BIT_MASK(x) \
(((x) >= sizeof(unsigned) * CHAR_BIT) ? \
(unsigned) -1 : (1U << (x)) - 1)
Of course, this is a somewhat dangerous macro as it evaluates its argument twice. This is a good opportunity to use a static inline if you use GCC or target C99 in general.
static inline unsigned bit_mask(int x)
{
return (x >= sizeof(unsigned) * CHAR_BIT) ?
(unsigned) -1 : (1U << x) - 1;
}
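Usage, assuming the static inline definition above is in scope (values shown for a 32-bit unsigned int):
#include <stdio.h>
int main(void)
{
printf("%#x\n", bit_mask(0)); // 0
printf("%#x\n", bit_mask(6)); // 0x3f: six low bits set
printf("%#x\n", bit_mask(32)); // 0xffffffff: full width, no undefined shift
return 0;
}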
As Mysticial noted, shifting a 32-bit integer by 32 bits or more is undefined behavior, and what the hardware actually does varies. Here are three different hardware implementations of shifting:
On x86, only examine the low 5 bits of the shift amount, so x << 32 == x.
On PowerPC, only examine the low 6 bits of the shift amount, so x << 32 == 0 but x << 64 == x.
On Cell SPUs, examine all bits, so x << y == 0 for all y >= 32.
However, compilers are free to do whatever they want if you shift a 32-bit operand 32 bits or more, and they are even free to behave inconsistently (or make demons fly out your nose).
Implementing BIT_FIELD_MASK:
This will set bit a through bit b (inclusive), as long as 0 <= a <= 31 and 0 <= b <= 31.
#define BIT_MASK(a, b) (((unsigned) -1 >> (31 - (b))) & ~((1U << (a)) - 1))
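A quick check against the values from the question (a sketch; the macro is the one just above, renamed BIT_FIELD_MASK to match the question, and it assumes a 32-bit unsigned int):
#include <stdio.h>
#define BIT_FIELD_MASK(a, b) (((unsigned) -1 >> (31 - (b))) & ~((1U << (a)) - 1))
int main(void)
{
printf("%#x\n", BIT_FIELD_MASK(6, 6)); // 0x40: single-bit mask for bit 6
printf("%#x\n", BIT_FIELD_MASK(8, 12)); // 0x1f00: bits 8 through 12
return 0;
}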
Shifting by more than or equal to the size of the integer type is undefined behavior.
So no, it's not a GCC bug.
In this case, the literal 1 is of type int, which is 32 bits on both systems that you used. So shifting by 32 invokes this undefined behavior.
In the first case, the compiler is not able to resolve the shift amount to 32, so it likely just issues the normal shift instruction (which on x86 uses only the bottom 5 bits of the count). So you get:
(unsigned int)(1 << 0) - 1
which is zero.
In the second case, GCC is able to resolve the shift-amount to 32. Since it is undefined behavior, it (apparently) just replaces the entire result with 0:
(unsigned int)(0) - 1
so you get ffffffff.
So this is a case of where GCC is using undefined behavior as an opportunity to optimize.
(Though personally, I'd prefer that it emits a warning instead.)
Related: Why does integer overflow on x86 with GCC cause an infinite loop?
Assuming you have a working mask for n bits, e.g.
// set the first n bits to 1, rest to 0
#define BITMASK1(n) ((1ULL << (n)) - 1ULL)
you can make a range bitmask by shifting again:
// set bits [k+1, n] to 1, rest to 0
#define BITMASK(n, k) ((BITMASK1(n) >> (k)) << (k))
The type of the result is unsigned long long int in any case.
As discussed, BITMASK1 is UB unless n is small. The general version requires a conditional and evaluates the argument twice:
#define BITMASK1(n) (((n) < sizeof(1ULL) * CHAR_BIT ? (1ULL << (n)) : 0) - 1ULL)
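For instance, repeating the two definitions for a self-contained check (illustrative values):
#include <stdio.h>
#include <limits.h>
#define BITMASK1(n) (((n) < sizeof(1ULL) * CHAR_BIT ? (1ULL << (n)) : 0) - 1ULL)
#define BITMASK(n, k) ((BITMASK1(n) >> (k)) << (k))
int main(void)
{
printf("%#llx\n", BITMASK1(8)); // 0xff
printf("%#llx\n", BITMASK(8, 4)); // 0xf0: the low 4 bits cleared
printf("%#llx\n", BITMASK1(64)); // 0xffffffffffffffff, with no undefined shift
return 0;
}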
#define BIT_MASK(foo) ((~ 0ULL) >> (64-foo))
I'm a bit paranoid about this. I think this assumes that unsigned long long is exactly 64 bits. But it's a start and it works up to 64 bits.
Maybe this is correct:
#define BIT_MASK(foo) ((~ 0ULL) >> (sizeof(0ULL)*8-foo))
A "traditional" formula (1ul<<n)-1 has different behavior on different compilers/processors for n=8*sizeof(1ul). Most commonly it overflows for n=32. Any added conditionals will evaluate n multiple times. Going 64-bits (1ull<<n)-1 is an option, but problem migrates to n=64.
My go-to formula is:
#define BIT_MASK(n) (~( ((~0ull) << ((n)-1)) << 1 ))
It does not overflow for n=64 and evaluates n only once.
As a downside, it will compile to two shift instructions if n is a variable. Also, n cannot be 0 (the result would be compiler/processor-specific), but that is a rare need in all the uses I have(*), and can be dealt with by adding a guarding "if" statement only where necessary (and even better, an "assert" to check both upper and lower boundaries).
(*) - usually data comes from a file or pipe, and size is in bytes. If size is zero, then there's no data, so code should do nothing anyway.
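A quick test harness for this formula (a sketch; remember n must be at least 1):
#include <stdio.h>
#define BIT_MASK(n) (~( ((~0ull) << ((n)-1)) << 1 ))
int main(void)
{
printf("%#llx\n", BIT_MASK(1)); // 0x1
printf("%#llx\n", BIT_MASK(32)); // 0xffffffff
printf("%#llx\n", BIT_MASK(64)); // 0xffffffffffffffff
return 0;
}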
What about:
#define BIT_MASK(n) (~(((~0ULL) >> (n)) << (n)))
This works on systems of any endianness; doing -1 to invert all bits doesn't work on big-endian systems.
Since you need to avoid shifting by as many bits as there are in the type (whether that's unsigned long or unsigned long long), you have to be more devious in the masking when dealing with the full width of the type. One way is to sneak up on it:
#define BIT_MASK(n) (((n) == CHAR_BIT * sizeof(unsigned long long)) ? \
((((1ULL << (n-1)) - 1) << 1) | 1) : \
((1ULL << (n )) - 1))
For a constant n such as 64, the compiler evaluates the expression and generates only the case that is used. For a runtime variable n, this fails just as badly as before if n is greater than the number of bits in unsigned long long (or is negative), but works OK without overflow for values of n in the range 0..(CHAR_BIT * sizeof(unsigned long long)).
Note that CHAR_BIT is defined in <limits.h>.
iva2k's answer avoids branching and is correct when the length is 64 bits. Building on that, you can also do this:
#define BIT_MASK(length) (~(((unsigned long long) -2) << ((length) - 1)))
gcc would generate exactly the same code anyway, though.
