Calculate bitmask from a given index in a 16-bit architecture - c

I have a function that accepts an index variable of type unsigned long (this type cannot be changed).
void func(unsigned long index);
I need to convert it to a bitmask such that for index 0 the bitmask will be 1, for index 1 bitmask will be 2, for 2 it will be 4 and so on.
I have done the following:
mask = 1 << index;
The problem is that I'm working on a 16-bit architecture, therefore unsigned long variables are 32 bits wide, which messes up this variable
(the lowest 16 bits give me the correct value for mask, but the highest 16 bits add extra information which messes this up).
I.e. instead of getting: mask = 0000000000000001 (16 bits)
I'm getting: xxxxxxxxxxxxxxxx0000000000000001 (32 bits)
Is there another way to calculate this bitmask?
Would appreciate help.
Thank you.

You have the correct approach. However, the problem with your implementation is that the type of 1 in the 1 << index expression is int, whose width is implementation-defined; on your 16-bit platform int is only 16 bits, so the shift misbehaves for larger index values. Since you are looking for an unsigned long result, use ((unsigned long)1) instead:
unsigned long mask = ((unsigned long)1) << index;
If your platform supports stdint.h and you need a mask of some specific width, use uint32_t instead:
uint32_t mask = UINT32_C(1) << index;

Your basic code is correct, although I notice you didn't specify the type of mask.
If the caller passes a value greater than 15 into index, what are you going to do? It sounds like you have to make the most of a bad situation. Depending on the context you could simply return from func, you could assert, or you could proceed with a mask of zero.
This brings us back to the question of the type of mask. I would define it as unsigned short, uint16_t or similar, depending on your environment. But other than that, your first attempt was basically correct. It's just a question of error handling.
uint16_t shift = index & 15;
uint16_t mask = 1u << shift;
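Putting the two answers together, a minimal sketch (the assert is just one possible policy; choose whichever error handling fits your context):
#include <assert.h>
#include <stdint.h>

void func(unsigned long index)
{
    assert(index < 16);                 /* reject out-of-range indices; or return, or clamp */
    uint16_t mask = 1u << (index & 15); /* the & 15 keeps the shift in range regardless */
    /* ... use mask ... */
}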

Related

What does this code do? There are so many weird things

int n_b ( char *addr , int i ) {
    char char_in_chain = addr [ i / 8 ] ;
    return char_in_chain >> i%8 & 0x1;
}
Like what is that: "i%8 & 0x1"?
Edit: Note that 0x1 is the hexadecimal notation for 1. Also note that :
0x1 = 0x01 = 0x000001 = 0x0...01
i%8 means i modulo 8, ie the rest in the Euclidean division of i by 8.
& 0x1 is a bitwise AND; it takes the binary representation of the number and ANDs it bit by bit with 1 (the number is already stored in binary, this is just so you understand).
Example: 0b1101 & 0b1001 = 0b1001
Note that any number & 0x1 is either 0 or 1.
Example: 0b11111111 & 0b00000001 is 0b1 and 0b11111110 & 0b00000001 is 0b0
Essentially, it is testing the last bit of the number, which is the bit determining parity.
Final edit:
I got the precedence wrong, thanks to the comments for pointing it out. Here is the real precedence.
First, we compute i%8.
The result could be 0, 1, 2, 3, 4, 5, 6, 7.
Then, we right-shift the char by the result, which is at most 7. That means the (i % 8)-th bit is now the least significant bit.
Then, we check whether that original (i % 8)-th bit is set (equals one) or not. If it is, return 1. Else, return 0.
This function returns the value of a specific bit in a char array as the integer 0 or 1.
addr is the pointer to the first char.
i is the index to the bit. 8 bits are commonly stored in a char.
First, the char at the correct offset is fetched:
char char_in_chain = addr [ i / 8 ] ;
i / 8 divides i by 8, ignoring the remainder. For example, any value in the range from 24 to 31 gives 3 as the result.
This result is used as the index to the char in the array.
Next and finally, the bit is obtained and returned:
return char_in_chain >> i%8 & 0x1;
Let's just look at the expression char_in_chain >> i%8 & 0x1.
It is confusing, because it does not show which operation is done in what sequence. Therefore, I duplicate it with appropriate parentheses: (char_in_chain >> (i % 8)) & 0x1. The rules (operation precedence) are given by the C standard.
First, the remainder of the division of i by 8 is calculated. This is used to right-shift the obtained char_in_chain. Now the interesting bit is in the least significant bit. Finally, this bit is "masked" with the binary AND operator and the second operand 0x1. BTW, there is no need to mark this constant as hex.
Example:
The array contains the bytes 0x5A, 0x23, and 0x42. The index of the bit to retrieve is 13.
i as given as argument is 13.
i / 8 gives 13 / 8 = 1, remainder ignored.
addr[1] returns 0x23, which is stored in char_in_chain.
i % 8 gives 5 (13 / 8 = 1, remainder 5).
0x23 is binary 0b00100011, and right-shifted by 5 gives 0b00000001.
0b00000001 ANDed with 0b00000001 gives 0b00000001.
The value returned is 1.
Note: if anything more is unclear, feel free to comment.
What the various operators do is explained by any C book, so I won't address that here. To instead analyse the code step by step...
The function and types used:
int as return type is an indication of the programmer being inexperienced at writing hardware-related code. We should always avoid signed types for such purposes. An experienced programmer would have used an unsigned type, like for example uint8_t. (Or in this specific case maybe even bool, depending on what the data is supposed to represent.)
n_b is a rubbish name, we should obviously never give an identifier such a nondescript name. get_bit or similar would have been a better name.
char* is, again, an indication of the programmer being inexperienced. char is particularly problematic when dealing with raw data, since we can't even know if it is signed or unsigned; that depends on which compiler is used. Had the raw data contained a value of 0x80 or larger and char been signed, we would have gotten a negative value. And right-shifting a negative value is also problematic, since that behavior too is compiler-specific.
char* is proof of the programmer lacking the fundamental knowledge of const correctness. The function does not modify this parameter, so it should have been const qualified. Good code would use const uint8_t* addr.
int i is not really incorrect; the signedness doesn't matter much here. But good programming practice would have been to use an unsigned type, or even size_t.
With types unsloppified and corrected, the function might look like this:
#include <stddef.h>
#include <stdint.h>

uint8_t get_bit (const uint8_t* addr, size_t i)
{
    uint8_t char_in_chain = addr[i / 8];
    return char_in_chain >> i%8 & 0x1;
}
This is still somewhat problematic, because the average C programmer might not remember the precedence of >> vs % vs & off the top of their head. It happens to be % over >> over &, but let's make the code a bit more readable still by making the precedence explicit: (char_in_chain >> (i%8)) & 0x1.
Then I would question if the local variable really adds anything to readability. Not really, we might as well write:
uint8_t get_bit (const uint8_t* addr, size_t i)
{
    return ((addr[i/8]) >> (i%8)) & 0x1;
}
As for what this code actually does: this happens to be a common design pattern for how to access a specific bit in a raw bit-field.
Any bit-field in C may be accessed as an array of bytes.
Bit number n in that bit-field, will be found at byte n/8.
Inside that byte, the bit will be located at n%8.
Bit masking in C is most readably done as data & (1u << bit). That can be obfuscated into the somewhat equivalent but less readable (data >> bit) & 1u, where the masked bit ends up in the LSB.
For example, let's assume we have 64 bits of raw data. Bits are always enumerated from 0 to 63 and bytes (just like any C array) from index 0. We want to access bit 33. Then 33/8 with integer division = 4, so byte[4]. Inside that byte, bit 33 will be found at position 33%8 = 1. So we can obtain the value of bit 33 with ordinary bit masking: byte[33/8] & (1u << (33%8)). Or similarly: (byte[33/8] >> (33%8)) & 1u.
An alternative, more readable version of it all:
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

bool is_bit_set (const uint8_t* data, size_t bit)
{
    uint8_t byte = data[bit / 8u];
    size_t mask = 1u << (bit % 8u);
    return (byte & mask) != 0u;
}
(Strictly speaking we could as well do return byte & mask; since a boolean type is used, but it doesn't hurt to be explicit.)
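For instance, reusing the three-byte array from the earlier worked example, the function could be called like this (assuming the includes and is_bit_set definition above):
#include <stdio.h>

int main (void)
{
    const uint8_t data[] = { 0x5A, 0x23, 0x42 };
    /* bit 13 lives in data[13/8] = data[1] = 0x23, at position 13%8 = 5 */
    printf("bit 13 is %s\n", is_bit_set(data, 13) ? "set" : "clear"); /* prints "set" */
    return 0;
}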

Is there a difference between a bit mask and a bit array?

I have heard the two terms used interchangeably. Is there a difference?
For example,
unsigned char chessboard : 64; /* Bit mask */
unsigned char chessboard_2 [64]; /* Bit array */
Bit Mask
A bit mask is a binary value that's used to refer to specific bits in an integer value when using bitwise operators. For instance, you might have:
unsigned int low3 = 0x7;
This is a bit mask with the low order 3 bits set. You can then use it to extract a part of a value:
unsigned int value = 030071;
unsigned int value_low3 = value & low3; // result is 01
or to update part of value:
unsigned int newvalue = (value & ~low3) | 5; // result is 030075
Bit Array
A bit array is an unsigned integer, or an array of unsigned integers, that's used to hold a sequence of boolean flags, where each value occupies a separate bit of the integer(s). If you have lots of boolean values to store, this saves a lot of memory compared to having each of them in a separate array element.
However, there's a tradeoff: in order to access a specific flag, you need to use masking and shifting.
If your bit array is small enough to fit in a single integer, you might declare:
uint32_t bitarray;
Then to access a specific element of it, you use:
bitvalue = (bitarray >> bitnum) & 0x1;
and to set an element:
bitarray |= (1u << bitnum);
and to clear an element:
bitarray &= ~(1u << bitnum);
If the bit array needs multiple words, you declare an array. You get the array index by dividing the bit number by the number of bits in each array element, then use the remainder to determine the bit number within that word and use the above expressions.
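For example, a minimal sketch of the multi-word case (the 32-bit word size and the function names are assumptions for illustration):
#include <stdint.h>

#define BITS_PER_WORD 32u

/* read bit 'bitnum' from a multi-word bit array */
int get_bit(const uint32_t *bitarray, unsigned bitnum)
{
    return (bitarray[bitnum / BITS_PER_WORD] >> (bitnum % BITS_PER_WORD)) & 0x1;
}

/* set bit 'bitnum' in a multi-word bit array */
void set_bit(uint32_t *bitarray, unsigned bitnum)
{
    bitarray[bitnum / BITS_PER_WORD] |= UINT32_C(1) << (bitnum % BITS_PER_WORD);
}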
Neither of them is a bitmask. The first is the declaration of a bit-field, which is only valid as a struct member, and the second is an array of 64 unsigned chars.
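For reference, a bit-field declaration is only legal inside a struct, for example (an illustrative sketch, not code from the question):
struct square {
    unsigned int occupied : 1; /* a 1-bit flag packed into the struct */
    unsigned int color    : 1;
};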

Setting and clearing bits in C

I have some trouble with a C program that does some bit manipulation. In the program, I use an unsigned long long int variable to represent a 64 bit map, each bit representing a position on the map. I need to be able to update these bits (positions) i.e. setting or clearing a bit.
To clear and set a bit, I do (0 is the least significant position):
map &= ~(1 << pos); // clear bit in position 'pos'
map |= (1 << pos); // set bit in position 'pos'
The problem is that when I perform these operations, all the bits in the map which are to the left of pos get set to 0 (while I want only the bit in position pos to change).
What am I doing wrong?
The problem is that those shifts are done using the type int, which on practically all modern 64-bit systems is still 32 bits wide. You need to use the same type as map, i.e. unsigned long long:
1ull << pos
Note the ull which tells the compiler that the 1 is not an int but an unsigned long long.
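So the corrected operations from the question would read (a direct application of the ull fix above):
map &= ~(1ULL << pos); // clear bit in position 'pos'
map |= (1ULL << pos);  // set bit in position 'pos'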

C: how to build up a binary integer

I have some logic that I would like to store as an integer. I have 30 "positions" that can be either yes or no and I would like to represent this as an integer. As I am looping through these positions what would be the easiest way to store this information as an integer?
You can use a 32-bit unsigned integer:
uint32_t flags = 0;
flags |= UINT32_C(1) << x; // set x'th bit from right
flags &= ~(UINT32_C(1) << x); // unset x'th bit from right
if (flags & UINT32_C(1) << x) // test x'th bit from right
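For example, the loop over your 30 positions could look like this minimal sketch (the answers array and the function name are illustrative assumptions, not part of the question):
#include <stdbool.h>
#include <stdint.h>

uint32_t pack_answers(const bool answers[30])
{
    uint32_t flags = 0;
    for (unsigned x = 0; x < 30; x++) {
        if (answers[x])
            flags |= UINT32_C(1) << x; /* record a "yes" at position x */
    }
    return flags;
}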
struct {
    unsigned int flag0:1;
    unsigned int flag1:1;
    ...
    unsigned int flag31:1;
} myFlags;
Using :x in the definition of an integer struct member declares a bit-field with x bits assigned.
You can access each struct member as usual, but the values can only be what fits in the given number of bits (in my example, either 1 or 0, because only 1 bit is available), and the compiler will enforce it. The struct will (probably, depending on the compiler settings) be packed into the total number of integers needed to represent the total bits.
Another option would be using an int and the bitwise operators & and | to access specific bits. In this case you have to make sure yourself that setting one bit won't affect another, and that there are no overflows, etc.
#define POSITION_A 1
#define POSITION_B 2
unsigned int position = 0;
// set a position
position |= POSITION_A;
// clear a position
position &= ~POSITION_A;
Yes, as WTP's comment says, you could save all your data in one unsigned int (uint32_t) and access it with AND (&), OR (|), and NOT (~).
If saving storage is not a primary concern, however, I recommend not using this compact technique.
You may need to expand your code to support more than two kinds of answers (yes/no), such as yes/no/maybe.
You may end up with more than 30 questions, which would not fit into one unsigned int.
If I were you, I'd use an array of small integers (short or char) to store the values. It wastes some storage, but it is much easier to read and much easier to extend with new features.
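A minimal sketch of that array-based alternative (the enum values and names are illustrative assumptions):
enum answer { NO, YES, MAYBE }; /* easy to extend beyond yes/no */

int main(void)
{
    unsigned char answers[30] = { NO }; /* one byte per question */
    answers[7] = YES;                   /* record an answer */
    return answers[7] == YES;           /* read it back */
}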

Bit Shifting, Masking or a Bit Field Struct?

I'm new to working with bits. I'm trying to work with an existing protocol, which can send three different types of messages.
Type 1 is a 16-bit structure:
struct digital
{
    unsigned int type:2;
    unsigned int highlow:1;
    unsigned int sig1:5;
    unsigned int :1;
    unsigned int sig2:7;
};
The first two bits (type, in my struct above) are always 1 0. The third bit, highlow, determines whether the signal is on or off, and sig1 + sig2 together define the 12-bit index of the signal. This index is split across the two bytes by a 0, which is always in bit 7.
Type 2 is a 32-bit structure. It has a 2-bit type, a 10-bit index and a 16-bit value, interspersed with 0's at positions 27, 23, 15 & 7. A bit-field struct representation would look something like this:
struct analog
{
    unsigned int type:2;
    unsigned int val1:2;
    unsigned int :1;
    unsigned int sig1:3;
    unsigned int :1;
    unsigned int sig2:7;
    unsigned int :1;
    unsigned int val2:7;
    unsigned int :1;
    unsigned int val3:7;
};
sig1 & sig2 together form the 10-bit index. val1 + val2 + val3 together form the 16-bit value of the signal at the 10-bit index.
If I understand how to work with the first two structs, I think I can figure out the third.
My question is, is there a way to assign a single value and have the program work out the bits that need to go into val1, val2 and val3?
I've read about bit shifting, bit-field structs and padding with 0's. The struct seems like the way to go, but I'm not sure how to implement it. None of the examples of bit-packing that I've seen have values that are split the way these are. Ultimately, I'd like to be able to create an analog struct, assign an index (i = 252) and a value (v = 32768) and be done with it.
If someone could suggest the appropriate method or provide a link to a similar sample, I'd greatly appreciate it. If it matters, this code will be incorporated into a larger Objective-C app.
Thanks.
Brad
You can do it with a series of shifts, ands, and ors. I have done the 10-bit index part for Type 2:
unsigned int i = 252;
unsigned int raw = ((i << 16) & 0x7f0000) | ((i << 17) & 0x7000000);
Essentially, this code shifts one copy of the 10 bits of interest in i to the range 16 - 25, then ANDs it with the bitmask 0x7f0000 so that only the low 7 bits (now in bits 16 - 22) survive. It shifts another copy to the range 17 - 26, then ANDs it with the bitmask 0x7000000 so that only the high 3 bits (now in bits 24 - 26) survive. Then it ORs the two values together, leaving the separator zero at bit 23 and creating your desired zero-separated value.
I'm not absolutely sure that I counted the bitmasks correctly, but I hope you've got the idea. Just shift, and-mask, and or-merge.
Edit: Method 2:
struct analog a;
a.sig2 = i & 0x7f;        // low 7 bits of the index
a.sig1 = (i >> 7) & 0x7;  // high 3 bits of the index
On second thought method 2 is more readable, so you should probably use this.
You don't have to do this field by field, though; this is where the union keyword comes in: by referring to the same bits under a different name, you can set all the bits at once, or pick them apart one named field at a time.
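A minimal sketch of that union idea, assuming the struct analog definition from the question (bit-field layout is implementation-defined, as the next answer warns, so treat this as illustrative only):
#include <stdint.h>

union analog_word {
    struct analog fields; /* the named bit-fields from above */
    uint32_t raw;         /* the same 32 bits viewed as one integer */
};

void demo(void)
{
    union analog_word w;
    w.raw = 0;                  /* clear every field in one assignment */
    w.fields.sig2 = 252 & 0x7f; /* or set an individual field by name */
}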
You shouldn't use C structure bit-fields for this, because the physical layout of bit-fields is implementation-defined. While you could figure out what your compiler is doing and get your layout to match the underlying data, the code may break if you switch to a different compiler or even update your compiler.
I know it's a pain, but do the bit manipulation yourself.
