When an int is cast to a short and truncated, how is the new value determined? - c

Can someone clarify what happens when an integer is cast to a short in C? I'm using a Raspberry Pi, so I know that an int is 32 bits and a short is 16 bits.
Let's say I use the following C code for example:
int x = 0x1248642;
short sx = (short)x;
int y = sx;
I get that x would be truncated, but can someone explain how exactly? Are shifts used? How exactly is a number truncated from 32 bits to 16 bits?

According to the ISO C standard, when you convert an integer to a signed type, and the value is outside the range of the target type, the result is implementation-defined. (Or an implementation-defined signal can be raised, but I don't know of any compilers that do this.)
In practice, the most common behavior is that the high-order bits are discarded. So assuming int is 32 bits and short is 16 bits, converting the value 0x1248642 will probably yield a bit pattern that looks like 0x8642. And assuming a two's-complement representation for signed types (which is used on almost all systems), the high-order bit is the sign bit, so the numeric value of the result will be -31166.
int y = sx;
This also involves an implicit conversion, from short to int. Since the range of int is guaranteed to cover at least the entire range of short, the value is unchanged. (Since, in your example, the value of sx happens to be negative, this change of representation is likely to involve sign extension, propagating the 1 sign bit to all 16 high-order bits of the result.)
As I indicated, none of these details are required by the language standard. If you really want to truncate values to a narrower type, it's probably best to use unsigned types (which have language-specified wraparound behavior) and perhaps explicit masking operations, like this:
unsigned int x = 0x1248642;
unsigned short sx = x & 0xFFFF;
If you have a 32-bit quantity that you want to shove into a 16-bit variable, the first thing you should do is decide how you want your code to behave if the value doesn't fit. Once you've decided that, you can figure out how to write C code that does what you want. Sometimes truncation happens to be what you want, in which case your task is going to be easy, especially if you're using unsigned types. Sometimes an out-of-range value is an error, in which case you need to check for it and decide how to handle the error. Sometimes you might want the value to saturate, rather than truncate, so you'll need to write code to do that.
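For instance, a saturating conversion could be sketched like this (to_short_saturating is a hypothetical helper, not a standard function):

```c
#include <limits.h>

/* Hypothetical helper: clamp an int into the range of short
   instead of truncating it. */
static short to_short_saturating(int v)
{
    if (v > SHRT_MAX)
        return SHRT_MAX;
    if (v < SHRT_MIN)
        return SHRT_MIN;
    return (short)v; /* in range: the value is preserved */
}
```

With the question's value 0x1248642, this returns SHRT_MAX (32767) rather than -31166.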
Knowing how conversions work in C is important, but if you start with that question you just might be approaching your problem from the wrong direction.

The 32 bit value is truncated to 16 bits in the same way a 32cm long banana bread would be cut if you jam it into a 16cm long pan. Half of it would fit in and still be a banana bread, and the rest will be "gone".

Truncation happens in CPU registers. These have different sizes: 8/16/32/64 bits. Now, you can imagine a register like:
<--rax----------------------------------------------------------------> (64-bit)
<--eax----------------------------> (32-bit)
<--ax-----------> (16-bit)
<--ah--> <--al--> (8-bit high & low)
01100011 01100001 01110010 01110010 01111001 00100000 01101111 01101110
x is first given the 32 bit value 0x1248642. In memory*, it'll look like:
-------------------------------------
|   01   |   24   |   86   |   42   |
-------------------------------------
  31..24   23..16   15..8     7..0
Now, the compiler loads x in a register. From it, it can simply load the least significant 16 bits (namely, ax) and store them into sx.
*Endianness is not taken into account for the sake of simplicity

Simply put, the high 16 bits are cut off from the integer. Your short therefore becomes 0x8642, which interpreted as a signed 16-bit value is the negative number -31166.

Perhaps let the code speak for itself:
#include <stdio.h>
#define BYTETOBINARYPATTERN "%d%d%d%d%d%d%d%d"
#define BYTETOBINARY(byte) \
((byte) & 0x80 ? 1 : 0), \
((byte) & 0x40 ? 1 : 0), \
((byte) & 0x20 ? 1 : 0), \
((byte) & 0x10 ? 1 : 0), \
((byte) & 0x08 ? 1 : 0), \
((byte) & 0x04 ? 1 : 0), \
((byte) & 0x02 ? 1 : 0), \
((byte) & 0x01 ? 1 : 0)
int main()
{
int x = 0x1248642;
short sx = (short) x;
int y = sx;
printf("%d\n", x);
printf("%hu\n", sx);
printf("%d\n", y);
printf("x: "BYTETOBINARYPATTERN" "BYTETOBINARYPATTERN" "BYTETOBINARYPATTERN" "BYTETOBINARYPATTERN"\n",
BYTETOBINARY(x>>24), BYTETOBINARY(x>>16), BYTETOBINARY(x>>8), BYTETOBINARY(x));
printf("sx: "BYTETOBINARYPATTERN" "BYTETOBINARYPATTERN"\n",
BYTETOBINARY(y>>8), BYTETOBINARY(y));
printf("y: "BYTETOBINARYPATTERN" "BYTETOBINARYPATTERN" "BYTETOBINARYPATTERN" "BYTETOBINARYPATTERN"\n",
BYTETOBINARY(y>>24), BYTETOBINARY(y>>16), BYTETOBINARY(y>>8), BYTETOBINARY(y));
return 0;
}
Output:
19170882
34370
-31166
x: 00000001 00100100 10000110 01000010
sx: 10000110 01000010
y: 11111111 11111111 10000110 01000010
As you can see, int -> short yields the lower 16 bits, as expected.
Casting short to int yields the short sign-extended to 32 bits, with the 16 high bits copied from the sign bit. Unlike the narrowing conversion, this direction is fully defined by the standard: every short value fits in an int, so the value is preserved. On a two's-complement machine that is implemented as sign extension, which is why the high 16 bits are all 1s here (not rubbish read from memory).
If you want the low 16 bits as a non-negative value instead, you can mask:
int y = 0x0000FFFF & sx;
Obviously you won't get back the lost bits, but this guarantees that the high bits are zeroed; note that the result (34370) then differs from the sign-extended value (-31166).
Note: Binary macro adapted from this answer.

sx value will be the same as 2 least significant bytes of x, in this case it will be 0x8642 which (if interpreted as 16 bit signed integer) gives -31166 in decimal.

Related

What does this code do? There are so many weird things

int n_b ( char *addr , int i ) {
char char_in_chain = addr [ i / 8 ] ;
return char_in_chain >> i%8 & 0x1;
}
Like, what is that: "i%8 & 0x1"?
Edit: Note that 0x1 is the hexadecimal notation for 1. Also note that :
0x1 = 0x01 = 0x000001 = 0x0...01
i%8 means i modulo 8, ie the rest in the Euclidean division of i by 8.
& 0x1 is a bitwise AND: each result bit is 1 only if the corresponding bit is set in both operands.
Example: 0b1101 & 0b1001 = 0b1001
Note that any number & 0x1 is either 0 or 1.
Example: 0b11111111 & 0b00000001 is 0b1, and 0b11111110 & 0b00000001 is 0b0
Essentially, it is testing the last bit of the number, which is the bit determining parity.
Final edit:
I got the precedence wrong, thanks to the comments for pointing it out. Here is the real precedence.
First, we compute i%8.
The result could be 0, 1, 2, 3, 4, 5, 6, 7.
Then, we shift the char right by that result, which is at most 7. That means the (i % 8)-th bit is now the least significant bit.
Then, we check if the original i % 8 bit is set (equals one) or not. If it is, return 1. Else, return 0.
This function returns the value of a specific bit in a char array as the integer 0 or 1.
addr is the pointer to the first char.
i is the index to the bit. 8 bits are commonly stored in a char.
First, the char at the correct offset is fetched:
char char_in_chain = addr [ i / 8 ] ;
i / 8 divides i by 8, ignoring the remainder. For example, any value in the range from 24 to 31 gives 3 as the result.
This result is used as the index to the char in the array.
Next and finally, the bit is obtained and returned:
return char_in_chain >> i%8 & 0x1;
Let's just look at the expression char_in_chain >> i%8 & 0x1.
It is confusing, because it does not show which operation is done in what sequence. Therefore, I duplicate it with appropriate parentheses: (char_in_chain >> (i % 8)) & 0x1. The rules (operation precedence) are given by the C standard.
First, the remainder of the division of i by 8 is calculated. This is used to right-shift the obtained char_in_chain. Now the interesting bit is in the least significant bit. Finally, this bit is "masked" with the binary AND operator and the second operand 0x1. BTW, there is no need to mark this constant as hex.
Example:
The array contains the bytes 0x5A, 0x23, and 0x42. The index of the bit to retrieve is 13.
i as given as argument is 13.
i / 8 gives 13 / 8 = 1, remainder ignored.
addr[1] returns 0x23, which is stored in char_in_chain.
i % 8 gives 5 (13 / 8 = 1, remainder 5).
0x23 is binary 0b00100011, and right-shifted by 5 gives 0b00000001.
0b00000001 ANDed with 0b00000001 gives 0b00000001.
The value returned is 1.
Note: If more is not clear, feel free to comment.
What the various operators do is explained by any C book, so I won't address that here. To instead analyse the code step by step...
The function and types used:
int as return type is an indication of the programmer being inexperienced at writing hardware-related code. We should always avoid signed types for such purposes. An experienced programmer would have used an unsigned type, like for example uint8_t. (Or in this specific case maybe even bool, depending on what the data is supposed to represent.)
n_b is a rubbish name; we should never give an identifier such a nondescript name. get_bit or similar would have been better.
char* is, again, an indication of the programmer being inexperienced. char is particularly problematic when dealing with raw data, since we can't even know whether it is signed or unsigned; that depends on the compiler used. Had the raw data contained a value of 0x80 or larger and char been signed, we would have gotten a negative value. And right-shifting a negative value is also problematic, since that behavior is implementation-defined.
char* is proof of the programmer lacking the fundamental knowledge of const correctness. The function does not modify this parameter so it should have been const qualified. Good code would use const uint8_t* addr.
int i is not really incorrect, the signedness doesn't really matter. But good programming practice would have used an unsigned type or even size_t.
With types unsloppified and corrected, the function might look like this:
#include <stdint.h>
uint8_t get_bit (const uint8_t* addr, size_t i ) {
uint8_t char_in_chain = addr [ i / 8 ] ;
return char_in_chain >> i%8 & 0x1;
}
This is still somewhat problematic, because the average C programmer might not remember the precedence of >> vs % vs & off the top of their head. It happens to be % over >> over &, but let's make the code a bit more readable still by making the precedence explicit: (char_in_chain >> (i%8)) & 0x1.
Then I would question if the local variable really adds anything to readability. Not really, we might as well write:
uint8_t get_bit (const uint8_t* addr, size_t i ) {
return ((addr[i/8]) >> (i%8)) & 0x1;
}
As for what this code actually does: this happens to be a common design pattern for how to access a specific bit in a raw bit-field.
Any bit-field in C may be accessed as an array of bytes.
Bit number n in that bit-field, will be found at byte n/8.
Inside that byte, the bit will be located at n%8.
Bit masking in C is most readably done as data & (1u << bit). This can be rewritten in the somewhat equivalent but less readable form (data >> bit) & 1u, where the masked bit ends up in the LSB.
For example, let's assume we have 64 bits of raw data. Bits are always enumerated from 0 to 63 and bytes (just like any C array) from index 0. We want to access bit 33. Then 33/8 = 4 by integer division, so byte[4]. Within that byte, bit 33 is found at 33%8 = 1. So we can obtain the value of bit 33 with ordinary bit masking: byte[33/8] & (1u << (33%8)). Or, similarly, (byte[33/8] >> (33%8)) & 1u.
An alternative, more readable version of it all:
bool is_bit_set (const uint8_t* data, size_t bit)
{
uint8_t byte = data [bit / 8u];
size_t mask = 1u << (bit % 8u);
return (byte & mask) != 0u;
}
(Strictly speaking we could as well do return byte & mask; since a boolean type is used, but it doesn't hurt to be explicit.)

Does bit-shifting in C only work on blocks of 32-bits

I've been experimenting with C again after a while of not coding, and I have come across something I don't understand regarding bit shifting.
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
int main(void)
{
uint64_t x = 0;
uint64_t testBin = 0b11110000;
x = 1 << testBin;
printf("testBin is %"PRIu64"\nx is %"PRIu64"\n", testBin, x);
//uint64_t y = 240%32;
//printf("%"PRIu64 "\n",y);
}
In the above code, x returns 65536, indicating that after bit shifting 240 places the 1 is now sat in position 17 of a 32-bit register, whereas I'd expect it to be at position 49 of a 64-bit register.
I tried the same with unsigned long long types, that did the same thing.
I've tried compiling both with and without the -m64 argument; the result is the same.
In your setup the constant 1 is a 32 bit integer. Thus the expression 1 << testBin operates on 32 bits. You need to use a 64 bit constant to have the expression operate on 64 bits, e.g.:
x = (uint64_t)1 << testBin;
This does not change the fact that shifting by 240 bits is formally undefined behavior (even though it will probably give the expected result anyway). If testBin is set to 48, the result will be well-defined. Hence the following should be preferred:
x = (uint64_t)1 << (testBin % 64);
It happens because of the default type of the integer constant 1: it is int (not long long int). You need to use the ULL suffix:
x = 1ULL << testBin;
P.S. If you want to shift by 240 bits and your integer type is narrower than that (unless your implementation supports some giant integers), it is still undefined behaviour.

Converting 32 bit number to four 8bit numbers

I am trying to convert the input from a device (always integer between 1 and 600000) to four 8-bit integers.
For example,
If the input is 32700, I want 188 127 00 00.
I achieved this by using:
32700 % 256
32700 / 256
The above works till 32700. From 32800 onward, I start getting incorrect conversions.
I am totally new to this and would like some help to understand how this can be done properly.
Major edit following clarifications:
Given that someone has already mentioned the shift-and-mask approach (which is undeniably the right one), I'll give another approach, which, to be pedantic, is not portable, machine-dependent, and possibly exhibits undefined behavior. It is nevertheless a good learning exercise, IMO.
For various reasons, your computer represents integers as groups of 8-bit values (called bytes); note that, although extremely common, this is not always the case (see CHAR_BIT). For this reason, values that are represented using more than 8 bits use multiple bytes (hence those using a number of bits which is a multiple of 8). For a 32-bit value, you use 4 bytes and, in memory, those bytes always follow each other.
We call a pointer a value containing the address in memory of another value. In that context, a byte is defined as the smallest (in terms of bit count) value that can be referred to by a pointer. For example, your 32-bit value, covering 4 bytes, will have 4 "addressable" cells (one per byte) and its address is defined as the first of those addresses:
|==================|
| MEMORY | ADDRESS |
|========|=========|
| ... | x-1 | <== Pointer to byte before
|--------|---------|
| BYTE 0 | x | <== Pointer to first byte (also pointer to 32-bit value)
|--------|---------|
| BYTE 1 | x+1 | <== Pointer to second byte
|--------|---------|
| BYTE 2 | x+2 | <== Pointer to third byte
|--------|---------|
| BYTE 3 | x+3 | <== Pointer to fourth byte
|--------|---------|
| ... | x+4 | <== Pointer to byte after
|==================|
So what you want to do (split the 32-bit word into 8-bits word) has already been done by your computer, as it is imposed onto it by its processor and/or memory architecture. To reap the benefits of this almost-coincidence, we are going to find where your 32-bit value is stored and read its memory byte-by-byte (instead of 32 bits at a time).
As all serious SO answers seem to do so, let me cite the Standard (ISO/IEC 9899:2018, 6.2.5-20) to define the last thing I need (emphasis mine):
Any number of derived types can be constructed from the object and function types, as follows:
An array type describes a contiguously allocated nonempty set of objects with a particular member object type, called the element type. [...] Array types are characterized by their element type and by the number of elements in the array. [...]
[...]
So, as elements in an array are defined to be contiguous, a 32-bit value in memory, on a machine with 8-bit bytes, really is nothing more, in its machine representation, than an array of 4 bytes!
Given a 32-bit signed value:
int32_t value;
its address is given by &value. Meanwhile, an array of 4 8-bit bytes may be represented by:
uint8_t arr[4];
notice that I use the unsigned variant because those bytes don't really represent a number per se so interpreting them as "signed" would not make sense. Now, a pointer-to-array-of-4-uint8_t is defined as:
uint8_t (*ptr)[4];
and if I assign the address of our 32-bit value to such an array, I will be able to index each byte individually, which means that I will be reading the byte directly, avoiding any pesky shifting-and-masking operations!
uint8_t (*bytes)[4] = (void *) &value;
I need to cast the pointer (the "(void *)") because the compiler would otherwise complain: &value's type is "pointer-to-int32_t" while I'm assigning it to a "pointer-to-array-of-4-uint8_t", and this type mismatch is caught by the compiler and pedantically warned against by the Standard. That is a first warning that what we're doing is not ideal!
Finally, we can access each byte individually by reading it directly from memory through indexing: (*bytes)[n] reads the n-th byte of value!
To put it all together, given a send_can(uint8_t) function:
for (size_t i = 0; i < sizeof(*bytes); i++)
send_can((*bytes)[i]);
and, for testing purpose, we define:
void send_can(uint8_t b)
{
printf("%hhu\n", b);
}
which prints, on my machine, when value is 32700:
188
127
0
0
Lastly, this shows yet another reason why this method is platform-dependent: the order in which the bytes of the 32-bit word are stored isn't always what you would expect from a theoretical discussion of binary representation, i.e.:
byte 0 contains bits 31-24
byte 1 contains bits 23-16
byte 2 contains bits 15-8
byte 3 contains bits 7-0
actually, AFAIK, the C Language permits any of the 24 possibilities for ordering those 4 bytes (this is called endianness). Meanwhile, shifting and masking will always get you the n-th "logical" byte.
It really depends on how your architecture stores an int. For example
8 or 16 bit system short=16, int=16, long=32
32 bit system, short=16, int=32, long=32
64 bit system, short=16, int=32, long=64
This is not a hard and fast rule - you need to check your architecture first. There is also a long long but some compilers do not recognize it and the size varies according to architecture.
Some compilers have uint8_t etc defined so you can actually specify how many bits your number is instead of worrying about ints and longs.
Having said that you wish to convert a number into 4 8 bit ints. You could have something like
unsigned long x = 600000UL; // you need UL to indicate it is unsigned long
unsigned int b1 = (unsigned int)(x & 0xff);
unsigned int b2 = (unsigned int)(x >> 8) & 0xff;
unsigned int b3 = (unsigned int)(x >> 16) & 0xff;
unsigned int b4 = (unsigned int)(x >> 24);
Using shifts is generally as fast as or faster than multiplication, division or mod. Which byte comes first depends on the byte order you wish to achieve: you could reverse the assignments, using b1 with the formula for b4, etc.
You could do some bit masking.
600000 is 0x927C0
600000 / (256 * 256) gets you the 9, no masking yet.
(600000 & (255 * 256)) >> 8 gets you the 0x27 == 39. The 8-bit-shifted mask of 8 set bits (255 * 256 == 0xFF00) selects the second byte, and the right shift by 8 bits, the >> 8, moves it down (which would also be possible as another / 256).
600000 % 256 gets you the 0xC0 == 192 as you did it. Masking would be 600000 & 255.
I ended up doing this:
unsigned char bytes[4];
unsigned long n;
n = (unsigned long) sensore1 * 100;
bytes[0] = n & 0xFF;
bytes[1] = (n >> 8) & 0xFF;
bytes[2] = (n >> 16) & 0xFF;
bytes[3] = (n >> 24) & 0xFF;
CAN_WRITE(0x7FD,8,01,sizeof(n),bytes[0],bytes[1],bytes[2],bytes[3],07,255);
I have been in a similar kind of situation while packing and unpacking huge custom packets of data to be transmitted/received, I suggest you try below approach:
typedef union
{
uint32_t u4_input;
uint8_t u1_byte_arr[4];
}UN_COMMON_32BIT_TO_4X8BIT_CONVERTER;
UN_COMMON_32BIT_TO_4X8BIT_CONVERTER un_t_mode_reg;
un_t_mode_reg.u4_input = input;/*your 32 bit input*/
// 1st byte = un_t_mode_reg.u1_byte_arr[0];
// 2nd byte = un_t_mode_reg.u1_byte_arr[1];
// 3rd byte = un_t_mode_reg.u1_byte_arr[2];
// 4th byte = un_t_mode_reg.u1_byte_arr[3];
The largest positive value you can store in a 16-bit signed int is 32767. If you force a number bigger than that, you'll get a negative number as a result, hence unexpected values returned by % and /.
Use either unsigned 16-bit int for a range up to 65535 or a 32-bit integer type.

How to create mask with least significat bits set to 1 in C

Can someone please explain this function to me?
A mask with the least significant n bits set to 1.
Ex:
n = 6 --> 0x2F, n = 17 --> 0x1FFFF // I don't get these at all, especially how n = 6 --> 0x2F
Also, what is a mask?
The usual way is to take a 1, and shift it left n bits. That will give you something like: 00100000. Then subtract one from that, which will clear the bit that's set, and set all the less significant bits, so in this case we'd get: 00011111.
A mask is normally used with bitwise operations, especially and. You'd use the mask above to get the 5 least significant bits by themselves, isolated from anything else that might be present. This is especially common when dealing with hardware that will often have a single hardware register containing bits representing a number of entirely separate, unrelated quantities and/or flags.
A mask is a common term for an integer value that is bit-wise ANDed, ORed, XORed, etc with another integer value.
For example, if you want to extract the 8 least significant digits of an int variable, you do variable & 0xFF. 0xFF is a mask.
Likewise if you want to set bits 0 and 8, you do variable | 0x101, where 0x101 is a mask.
Or if you want to invert the same bits, you do variable ^ 0x101, where 0x101 is a mask.
To generate a mask for your case you should exploit the simple mathematical fact that if you add 1 to your mask (the mask having all its least significant bits set to 1 and the rest to 0), you get a value that is a power of 2.
So, if you generate the closest power of 2, then you can subtract 1 from it to get the mask.
Positive powers of 2 are easily generated with the left shift << operator in C.
Hence, 1 << n yields 2^n. In binary, that's a 1 followed by n zeros.
(1 << n) - 1 will produce a mask with n lowest bits set to 1.
Now, you need to watch out for overflows in left shifts. In C (and in C++) you can't legally shift a variable left by as many bit positions as the variable has, so if ints are 32-bit, 1<<32 results in undefined behavior. Signed integer overflows should also be avoided, so you should use unsigned values, e.g. 1u << 31.
For both correctness and performance, the best way to accomplish this has changed since this question was asked back in 2012 due to the advent of BMI instructions in modern x86 processors, specifically BLSMSK.
Here's a good way of approaching this problem, while retaining backwards compatibility with older processors.
This method is correct, whereas the current top answers produce undefined behavior in edge cases.
Clang and GCC, when allowed to optimize using BMI instructions, will condense gen_mask() to just two ops. With supporting hardware, be sure to add compiler flags for BMI instructions:
-mbmi -mbmi2
#include <inttypes.h>
#include <stdio.h>
uint64_t gen_mask(const uint_fast8_t msb) {
const uint64_t src = (uint64_t)1 << msb;
return (src - 1) ^ src;
}
int main() {
uint_fast8_t msb;
for (msb = 0; msb < 64; ++msb) {
printf("%016" PRIx64 "\n", gen_mask(msb));
}
return 0;
}
First, for those who only want the code to create the mask:
uint64_t bits = 6;
uint64_t mask = ((uint64_t)1 << bits) - 1;
// Results in 0b111111 (or 0x3F)
Thanks to @Benni, who asked about using bits = 64. If you need the code to support this value as well, you can use:
uint64_t bits = 6;
uint64_t mask = (bits < 64)
    ? ((uint64_t)1 << bits) - 1
    : (uint64_t)0 - 1;
For those who want to know what a mask is:
A mask is usually a name for a value that we use to manipulate other values using bitwise operations such as AND, OR, XOR, etc.
Short masks are usually written in binary, where we can explicitly see all the bits that are set to 1.
Longer masks are usually written in hexadecimal, which is really easy to read once you get the hang of it.
You can read more about bitwise operations in C here.
I believe your first example should be 0x3f.
0x3f is hexadecimal notation for the number 63 which is 111111 in binary, so that last 6 bits (the least significant 6 bits) are set to 1.
The following little C program will calculate the correct mask:
#include <stdio.h>
int mask_for_n_bits(int n)
{
int mask = 0;
for (int i = 0; i < n; ++i)
mask |= 1 << i;
return mask;
}
int main (int argc, char const *argv[])
{
printf("6: 0x%x\n17: 0x%x\n", mask_for_n_bits(6), mask_for_n_bits(17));
return 0;
}
0x2F is 0010 1111 in binary - this should be 0x3f, which is 0011 1111 in binary and which has the 6 least-significant bits set.
Similarly, 0x1FFFF is 0001 1111 1111 1111 1111 in binary, which has the 17 least-significant bits set.
A "mask" is a value that is intended to be combined with another value using a bitwise operator like &, | or ^ to individually set, unset, flip or leave unchanged the bits in that other value.
For example, if you combine the mask 0x3F with some value n using the & operator, the result will have zeroes in all but the 6 least significant bits, and those 6 bits will be copied unchanged from the value n.
In the case of an & mask, a binary 0 in the mask means "unconditionally set the result bit to 0" and a 1 means "set the result bit to the input value bit". For an | mask, a 0 in the mask sets the result bit to the input bit and a 1 unconditionally sets the result bit to 1; and for an ^ mask, a 0 sets the result bit to the input bit and a 1 sets the result bit to the complement of the input bit.

C macro to create a bit mask -- possible? And have I found a GCC bug?

I am somewhat curious about creating a macro to generate a bit mask for a device register, up to 64bits. Such that BIT_MASK(31) produces 0xffffffff.
However, several C examples do not work as I thought: I get 0x7fffffff instead. It is as if the compiler assumes I want signed output, not unsigned. So I tried 32, and noticed that the value wraps back around to 0. This is because the C standard states that if the shift amount is greater than or equal to the number of bits in the operand being shifted, the result is undefined. That makes sense.
But, given the following program, bits2.c:
#include <stdio.h>
#include <stdlib.h> /* for atoi */
#define BIT_MASK(foo) ((unsigned int)(1 << foo) - 1)
int main()
{
unsigned int foo;
char *s = "32";
foo = atoi(s);
printf("%d %.8x\n", foo, BIT_MASK(foo));
foo = 32;
printf("%d %.8x\n", foo, BIT_MASK(foo));
return (0);
}
If I compile with gcc -O2 bits2.c -o bits2, and run it on a Linux/x86_64 machine, I get the following:
32 00000000
32 ffffffff
If I take the same code and compile it on a Linux/MIPS (big-endian) machine, I get this:
32 00000000
32 00000000
On the x86_64 machine, if I use gcc -O0 bits2.c -o bits2, then I get:
32 00000000
32 00000000
If I tweak BIT_MASK to ((unsigned int)(1UL << foo) - 1), then the output is 32 00000000 for both forms, regardless of gcc's optimization level.
So it appears that on x86_64, gcc is optimizing something incorrectly OR the undefined nature of left-shifting 32 bits on a 32-bit number is being determined by the hardware of each platform.
Given all of the above, is it possible to programmatically create a C macro that creates a bit mask from either a single bit or a range of bits?
I.e.:
BIT_MASK(6) = 0x40
BIT_FIELD_MASK(8, 12) = 0x1f00
Assume BIT_MASK and BIT_FIELD_MASK operate from a 0-index (0-31). BIT_FIELD_MASK is to create a mask from a bit range, i.e., 8:12.
Here is a version of the macro which will work for arbitrary positive inputs. (Negative inputs still invoke undefined behavior...)
#include <limits.h>
/* A mask with x least-significant bits set, possibly 0 or >=32 */
#define BIT_MASK(x) \
    (((x) >= sizeof(unsigned) * CHAR_BIT) ? \
     (unsigned) -1 : (1U << (x)) - 1)
Of course, this is a somewhat dangerous macro as it evaluates its argument twice. This is a good opportunity to use a static inline if you use GCC or target C99 in general.
static inline unsigned bit_mask(int x)
{
return (x >= sizeof(unsigned) * CHAR_BIT) ?
(unsigned) -1 : (1U << x) - 1;
}
As Mysticial noted, shifting a 32-bit integer by 32 bits or more is undefined behavior in C; what the hardware then actually does differs between processors. Here are three different hardware implementations of shifting:
On x86, only examine the low 5 bits of the shift amount, so x << 32 == x.
On PowerPC, only examine the low 6 bits of the shift amount, so x << 32 == 0 but x << 64 == x.
On Cell SPUs, examine all bits, so x << y == 0 for all y >= 32.
However, compilers are free to do whatever they want if you shift a 32-bit operand 32 bits or more, and they are even free to behave inconsistently (or make demons fly out your nose).
Implementing BIT_FIELD_MASK:
This will set bit a through bit b (inclusive), as long as 0 <= a <= 31 and 0 <= b <= 31.
#define BIT_MASK(a, b) (((unsigned) -1 >> (31 - (b))) & ~((1U << (a)) - 1))
Shifting by more than or equal to the size of the integer type is undefined behavior.
So no, it's not a GCC bug.
In this case, the literal 1 is of type int which is 32-bits in both systems that you used. So shifting by 32 will invoke this undefined behavior.
In the first case, the compiler is not able to resolve the shift-amount to 32. So it likely just issues the normal shift-instruction. (which in x86 uses only the bottom 5-bits) So you get:
(unsigned int)(1 << 0) - 1
which is zero.
In the second case, GCC is able to resolve the shift-amount to 32. Since it is undefined behavior, it (apparently) just replaces the entire result with 0:
(unsigned int)(0) - 1
so you get ffffffff.
So this is a case of where GCC is using undefined behavior as an opportunity to optimize.
(Though personally, I'd prefer that it emits a warning instead.)
Related: Why does integer overflow on x86 with GCC cause an infinite loop?
Assuming you have a working mask for n bits, e.g.
// set the first n bits to 1, rest to 0
#define BITMASK1(n) ((1ULL << (n)) - 1ULL)
you can make a range bitmask by shifting again:
// set bits [k+1, n] to 1, rest to 0
#define BITMASK(n, k) ((BITMASK1(n) >> (k)) << (k))
The type of the result is unsigned long long int in any case.
As discussed, BITMASK1 is UB unless n is small. The general version requires a conditional and evaluates the argument twice:
#define BITMASK1(n) (((n) < sizeof(1ULL) * CHAR_BIT ? (1ULL << (n)) : 0) - 1ULL)
#define BIT_MASK(foo) ((~ 0ULL) >> (64-foo))
I'm a bit paranoid about this. I think this assumes that unsigned long long is exactly 64 bits. But it's a start and it works up to 64 bits.
Maybe this is correct:
#define BIT_MASK(foo) ((~0ULL) >> (sizeof(0ULL)*8 - foo))
A "traditional" formula (1ul<<n)-1 has different behavior on different compilers/processors for n=8*sizeof(1ul). Most commonly it overflows for n=32. Any added conditionals will evaluate n multiple times. Going 64-bits (1ull<<n)-1 is an option, but problem migrates to n=64.
My go-to formula is:
#define BIT_MASK(n) (~( ((~0ull) << ((n)-1)) << 1 ))
It does not overflow for n=64 and evaluates n only once.
As downside it will compile to 2 LSH instructions if n is a variable. Also n cannot be 0 (result will be compiler/processor-specific), but it is a rare possibility for all uses that I have(*) and can be dealt with by adding a guarding "if" statement only where necessary (and even better an "assert" to check both upper and lower boundaries).
(*) - usually data comes from a file or pipe, and size is in bytes. If size is zero, then there's no data, so code should do nothing anyway.
What about:
#define BIT_MASK(n) (~(((~0ULL) >> (n)) << (n)))
This also handles n = 0 correctly (yielding an empty mask of 0), though like the other shift-based versions it must not be used with n equal to the full width of the type. (Endianness is irrelevant here: the value of a bitwise expression such as ~0ULL or -1 does not depend on byte order, so it works the same on big-endian and little-endian systems.)
Since you need to avoid shifting by as many bits as there are in the type (whether that's unsigned long or unsigned long long), you have to be more devious in the masking when dealing with the full width of the type. One way is to sneak up on it:
#define BIT_MASK(n) (((n) == CHAR_BIT * sizeof(unsigned long long)) ? \
((((1ULL << (n-1)) - 1) << 1) | 1) : \
((1ULL << (n )) - 1))
For a constant n such as 64, the compiler evaluates the expression and generates only the case that is used. For a runtime variable n, this fails just as badly as before if n is greater than the number of bits in unsigned long long (or is negative), but works OK without overflow for values of n in the range 0..(CHAR_BIT * sizeof(unsigned long long)).
Note that CHAR_BIT is defined in <limits.h>.
@iva2k's answer avoids branching and is correct when the length is 64 bits. Building on that, you can also do this:
#define BIT_MASK(length) (~(((unsigned long long) -2) << ((length) - 1)))
gcc generates exactly the same code either way, though.
