Is bit masking comparable to "accessing an array" in bits? - c

For all the definitions I've seen of bit masking, they all just dive right into how to bit mask, use bitwise, etc. without explaining a use case for any of it. Is the purpose of updating all the bits you want to keep and all the bits you want to clear to "access an array" in bits?

Is the purpose of updating all the bits you want to keep and all the bits you want to clear to "access an array" in bits?
I will say the answer is no.
When you access an array of int you'll do:
int_array[index] = 42; // Write access
int x = int_array[42]; // Read access
If you want to write similar functions to read/write a specific bit in e.g. an unsigned int in a "array like fashion" it could look like:
unsigned a = 0;
set_bit(a, 4); // Set bit number 4
unsigned x = get_bit(a, 4); // Get bit number 4
The implementation of set_bit and get_bit will require (among other things) some bitwise mask operation.
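As a minimal sketch of what such helpers could look like: the pointer-taking signature of set_bit and the extra clear_bit helper are assumptions made for illustration (the calls above pass the variable directly, which would require macros instead). The caller is assumed to keep pos below the bit width of unsigned.

```c
/* Set bit number pos (0 = least significant) in *word. */
static void set_bit(unsigned *word, unsigned pos)
{
    *word |= 1u << pos;          /* OR with a one-bit mask */
}

/* Clear bit number pos in *word (hypothetical companion helper). */
static void clear_bit(unsigned *word, unsigned pos)
{
    *word &= ~(1u << pos);       /* AND with the inverted mask */
}

/* Return 1 if bit number pos of word is set, otherwise 0. */
static unsigned get_bit(unsigned word, unsigned pos)
{
    return (word >> pos) & 1u;   /* shift the bit down, mask the rest away */
}
```

With these definitions the calls become set_bit(&a, 4) and get_bit(a, 4).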
So yes - to access bits in an "array like fashion" you'll need masking but...
There are many other uses of bit level masking.
Example:
int buffer[64];
unsigned index = 0;
void add_to_cyclic_buffer(int n)
{
buffer[index] = n;
++index;
index &= 0x3f; // Masking by 0x3f ensures index is always in the range 0..63
}
Example:
unsigned a = some_func();
a |= 1; // Make sure a is odd
a &= ~1; // Make sure a is even
Example:
unsigned a = some_func();
a &= ~0xf; // Make sure a is a multiple of 16
These are just a few examples of using "masking" that have nothing to do with accessing bits as an array. Many other examples can be made.
So to conclude:
Masking can be used to write functions that access bits in an array like fashion but masking is used for many other things as well.

So there are 3 (or 4) main uses.
One, as you say, is where you use the word as a set of true/false flags, where each flag is just indexed in a symmetric manner. I use 'word' here to be the piece of discrete memory that you are accessing in a single operation. So a byte holds 8 bit values, and a 'long long' holds 64 bits. With a bit more effort an array of words can be used as an array of more packed flags.
A second is where you are doing some manipulation of the value, but still consider the word to hold one value. There are many tricks like setting or clearing bottom bits to ensure alignment, or clearing top bits to get a modulus, shifting to divide or multiply by powers of 2.
A third use is where you want to pack lots of smaller-ranged values into a word. Each of the values is a particular meaning in context. This may either be because you need to communicate with a device that has defined this as the protocol, or because you need to create so many objects that the saving in space in each object outweighs the increase in code size and code speed cost (though that might be contrasted with the increased cache misses causing slowdown if the object were bigger).
As a distinction the fourth case is where these fields are distinct 1-bit flags that have specific meanings in the context of the code. Data objects tend to collect a number of such flags, and it is simply more convenient sometimes to store them as bits in a single location, than to use separate bytes for each flag. Generally testing a particular fixed indexed bit, or a fixed masked bit is no more expensive in code size or speed than testing the whole byte, though writing can be more complex. The storage savings are clear, so often programmers will declare an enumeration of bit masks by default when faced with creating a number of flags in a structure, or when writing a function.
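The third use (several small values packed into one word) can be sketched as follows. The 5/6/5-bit layout and all names here are assumptions chosen for illustration, not something defined in the answer above.

```c
#include <stdint.h>

/* Hypothetical layout: bits 15..11 red, bits 10..5 green, bits 4..0 blue. */
#define RED_SHIFT   11
#define GREEN_SHIFT 5
#define BLUE_SHIFT  0
#define RED_MASK    0x1Fu
#define GREEN_MASK  0x3Fu
#define BLUE_MASK   0x1Fu

/* Pack three small values into one 16-bit word. */
static uint16_t pack_rgb565(unsigned r, unsigned g, unsigned b)
{
    return (uint16_t)(((r & RED_MASK)   << RED_SHIFT) |
                      ((g & GREEN_MASK) << GREEN_SHIFT) |
                      ((b & BLUE_MASK)  << BLUE_SHIFT));
}

/* Extract the green field: shift it down, then mask off the neighbours. */
static unsigned get_green(uint16_t pixel)
{
    return (pixel >> GREEN_SHIFT) & GREEN_MASK;
}

/* Replace only the green field: clear its bits, then OR in the new value. */
static uint16_t set_green(uint16_t pixel, unsigned g)
{
    pixel &= (uint16_t)~(GREEN_MASK << GREEN_SHIFT);
    pixel |= (uint16_t)((g & GREEN_MASK) << GREEN_SHIFT);
    return pixel;
}
```

Reading a field costs a shift and an AND; writing costs a clear and an OR, which is the "writing can be more complex" trade-off mentioned above.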

Related

Operating Rightmost/Leftmost n-Bits, Not All the Bits of A Integer Type Data Variable

In a programming-task, I have to add a smaller integer in variable B (data type int)
to a larger integer (20 decimal integer) in variable A (data type long long int),
then compare A with variable C which is also as large integer (data type long long int) as A.
What I realized, since I add a smaller B to A,
I don't need to check all the digits of A when I compare that with C, in other words, we don't need to check all the bits of A and C.
Given that I know, how many bits from the right I need to check, say n-bits,
is there a way/technique to check only those specific n-bits from the right (not all the bits of A, C) to make the program faster in c programming language?
Because for comparing all the bits take more time, and since I am working with large number, the program becomes slower.
Every time I search in the google, bit-masking appears which uses all the bits of A, C, that doesn't do what I am asking for, so probably I am not using correct terminology, please help.
Addition:
Initial comments of this post made me think there is no way but i found the following -
Bit Manipulation by University of Colorado Boulder
(#cuboulder, after 7:45)
...the bit band region is accessed via a bit band alias, each bit in a
supported bit band region has its own unique address and we can access
that bit using a pointer to its bit band alias location, the least
significant bit in an alias location can be set or cleared and that
will be mapped to the bit in the corresponding data or peripheral
memory, unfortunately this will not help you if you need to write to
multiple bit locations in memory dependent operations only allow a
single bit to be cleared or set...
Is the above what I am asking for? If yes, then
where can I find the details as a beginner?
Updated question:
Is there a way/technique to check only those specific n-bits from the right (not all the bits of A, C) to make the program faster in c programming language (or any other language) that makes the program faster?
Your assumption that comparing fewer bits is faster might be true in some cases but is probably not true in most cases.
I'm only familiar with x86 CPUs. An x86-64 processor has 64-bit-wide registers. These can be accessed as 64-bit registers, but their lower bits can also be accessed as 32-, 16- and 8-bit registers. There are processor instructions which work with the 64-, 32-, 16- or 8-bit part of the registers. Comparing 8 bits is one instruction, but so is comparing 64 bits.
If the 32-bit comparison were faster than the 64-bit comparison, you could gain some speed. But it seems there is no speed difference on current processor generations. (Check out the "cmp" instruction with the link to uops.info from @harold.)
If your long long data type is actually bigger than the word size of your processor, then it's a different story. E.g. if your long long is 64 bits but you are on a 32-bit processor, then the values do not fit in one register and you would need multiple instructions. So if you know that comparing only the lower 32 bits would be enough, this could save some time.
Also note that comparing only e.g. 20 bits would actually take more time than comparing 32 bits. You would have to mask off the 12 highest bits in addition to doing the 32-bit comparison. So you would need a bitwise AND instruction as well as a comparison.
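To make the extra work concrete, here is a minimal sketch of comparing only the lowest n bits of two values. The function name is hypothetical, and unsigned long long is assumed to be 64 bits wide.

```c
#include <stdbool.h>

/* True when the lowest n bits of a and c are identical. */
static bool low_bits_equal(unsigned long long a, unsigned long long c,
                           unsigned n)
{
    /* Mask with the lowest n bits set. n >= 64 is handled separately
       because shifting by the full width is undefined behaviour. */
    unsigned long long mask = (n >= 64) ? ~0ULL : ((1ULL << n) - 1);

    /* XOR marks the differing bits; AND keeps only the ones we care about. */
    return ((a ^ c) & mask) == 0;
}
```

Compared with a plain a == c, this adds a mask construction, an XOR and an AND, so it is more work per comparison, not less.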
As you see, this is very processor-specific, and you are at the processor's opcode level. As @RawkFist wrote in his comment, you could try to get the C compiler to create such instructions, but that does not automatically mean that this is even faster.
All of this is only relevant if these operations are executed a lot. I'm not sure what you are doing. If e.g. you add many values B to A and compare them to C each time it might be faster to start with C, subtract the B values from it and compare with 0. Because the compare-operation works internally like a subtraction. So instead of an add and a compare instruction a single subtraction would be enough within the loop. But modern CPUs and compilers are very smart and optimize a lot. So maybe the compiler automatically performs such or similar optimizations.
Try this question.
Is there a way/technique to check only those specific n-bits from the right (not all the bits of A, C) to make the program faster in c programming language (or any other language) that makes the program faster?
Yes - when A + B != C. We can short-cut the comparison once a difference is found: from least to most significant.
No - when A + B == C. All bits need comparison.
Now back to OP's original question
Is there a way/technique to check only those specific n-bits from the right (not all the bits of A, C) to make the program faster in c programming language (or any other language) that makes the program faster?
No. In order to do so, we need to out-think the compiler. A well enabled compiler itself will notice any "tricks" available for long long + (signed char)int == long long and emit efficient code.
Yet what about really long compares? How about a custom uint1000000 for A and C?
For long compares of a custom type, a quick compare can be had.
First, select a fast working type. unsigned is a prime candidate.
typedef unsigned ufast;
Now define the wide integer.
#include <limits.h>
#include <stdbool.h>
#define UINT1000000_N (1000000/(sizeof(ufast) * CHAR_BIT))
typedef struct {
// Least significant first
ufast digit[UINT1000000_N];
} uint1000000;
Perform the addition and compare one "digit" at a time.
bool uint1000000_fast_offset_compare(const uint1000000 *A, unsigned B,
const uint1000000 *C) {
ufast carry = B;
for (unsigned i = 0; i < UINT1000000_N; i++) {
ufast sum = A->digit[i] + carry;
if (sum != C->digit[i]) {
return false;
}
carry = sum < A->digit[i];
}
return true;
}

Efficient tiny boolean matrix multiplication

I have some unsigned 16 bit integer s which I'd like to map to an unsigned 32 bit integer r in such a way that each flipped bit in s flips at most one (given) bit in r -- simply a mapping between 0..16 and 0..32 that is. So we can see this as a matrix equation
Ps = r
where P is a 32 x 16 boolean matrix, s is a 16 x 1 boolean vector and r is 32 x 1 boolean vector. I have a gut feeling there exists some super simple hack that I'm missing. Important note: the target machine is a 16 bit mcu!
Here's the best I can do:
static u16 P[32] = someArrayOrWhatever();
u32 FsiPermutationHack(u16 s) {
u32 r = 0; /* must start at zero, since bits are only ever OR-ed in */
for (u16 i = 0; i < 32; i++)
{
r |= ((u32)((P[i] & s) > 0) << i);
}
return r;
}
The rationale is this: the i:th bit of r is 1 if and only if (P[i] & s) != 0x0000. I am too stupid to disassemble stuff, but I am guessing this would be like ~100 instructions IF we didn't have to do that stupid u32 cast. But then again, perhaps the compiler auto-splits the loop in two for us in which case it's looking pretty good for us.
Apologies for the tangent, just thought I'd share my attempted solution -- do you have a better one?
Inasmuch as you say,
I am guessing this would be like ~100 instructions IF we didn't have
to do that stupid u32 cast. But then again, perhaps the compiler
auto-splits the loop in two for us in which case it's looking pretty
good for us.
and
I have a gut feeling there exists some super simple hack that I'm missing
, I will interpret you to be asking how to minimize the use of 32-bit arithmetic in this code intended for a 16-bit processor.
You really ought to learn how to disassemble and check the compiled result to see whether the compiler does automatically split the loop as you hypothesize, but supposing that it does not, I don't see why you couldn't do the same manually:
static u16 P[32]; /* value assigned elsewhere */
u32 FsiPermutationHack(u16 s) {
u16 *P_hi = P + 16;
u16 r_lo = 0;
u16 r_hi = 0;
for (u16 i = 0; i < 16; i++) {
r_lo |= (u16)((P[i] & s) != 0) << i;
r_hi |= (u16)((P_hi[i] & s) != 0) << i;
}
return ((u32) r_hi << 16) + r_lo;
}
That supposes u16 and u32 to be unsigned 16-bit and 32-bit (respectively) integers with no padding bits.
Note also that the idea that performing arithmetic with type u16 instead of u32 should be an improvement assumes that type u32 has a higher integer promotion rank than unsigned int. Roughly speaking, that comes down to the implementation's unsigned int being a 16-bit type. That's entirely plausible for an implementation for a 16-bit processor. On a system whose int and unsigned int are instead 32-bit types, however, all narrower integer arithmetic arguments would be promoted to 32 bits anyway.
Update:
As far as the possibility of a better alternative algorithm, I observe that each bit of the result is computed from a different element of array P, that the whole value of each element is used, and that the element size is the same as the target machine's native word size. There seems then no scope for performing fewer 16-bit bitwise AND operations than there are array elements (but see below).
If we accept that each array element must be processed separately, then the provided implementation does a pretty good job of approaching it efficiently:
It performs only 16-bit computations until the time comes to assemble the final result;
It computes both the upper and lower halves of the result in the same loop, thus incurring only 16 iterations' worth of loop overhead instead of 32;
It largely removes the extra indexing arithmetic that would otherwise have been required, by creating P_hi for accessing the upper half of the array.
It would be possible to manually unroll the loop to possibly save a few more cycles, but that's the kind of optimization that you absolutely should rely on your compiler to perform for you.
As far as "bit twiddling hacks", the only scope I see for anything of that nature would be processing adjacent pairs of 16-bit array elements as 32-bit unsigned integers. That would allow performing one 32-bit bitwise AND in place of each two 16-bit ANDs. That would be coupled with two 32-bit comparisons (vs. two 16-bit comparisons in the above code). The 16-bit shift and bitwise OR operations of the above approach could be retained. Aside from that having formally undefined behavior as a result of violating the strict aliasing rule, that would involve 32-bit arithmetic, which presumably is about half as fast as 16-bit arithmetic on your 16-bit machine. Performance is better measured than predicted, but I don't see any reason to expect a significant win from that approach.

Efficiently implementing arrays of ternary data types in C

I need to implement "big" arrays (~1800 elements) of a ternary datatype as runtime-efficiently as possible in C for cryptographic research. I thought of the following:
Using an array of any-sized integers, using 2 Bits to represent one element each
So I'd have
typedef uint32_t block;
const int blocksize = sizeof(block)<<3;
block dataArray[3]; // 3*32 bit => 48 Elements
uint8_t getElementAt(block *data, int position)
{
position = position * 2;
return (data[position/blocksize] >> (position % blocksize)) & 3;
}
returning me 0..2 which i can map to my three values.
Using an array of uint8_t, addressing the elements directly.
uint8_t data[48];
Sure, that needs at least four times more RAM but addressing and setting might be more efficient - is it?
Are there any other good possibilities I'm missing or special caveats in any of the two solutions?
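For comparison with the byte array, the first option also needs a writer. A minimal sketch follows; set_element_at and the renamed get_element_at are illustrative names, and a 32-bit block type is assumed as in the question.

```c
#include <stdint.h>

typedef uint32_t block;
enum { BLOCK_BITS = 32, BITS_PER_ELEM = 2 };

/* Read the 2-bit element at the given position. */
static uint8_t get_element_at(const block *data, unsigned position)
{
    unsigned bit = position * BITS_PER_ELEM;
    return (uint8_t)((data[bit / BLOCK_BITS] >> (bit % BLOCK_BITS)) & 3u);
}

/* Overwrite the 2-bit element at the given position with value (0..3). */
static void set_element_at(block *data, unsigned position, uint8_t value)
{
    unsigned bit = position * BITS_PER_ELEM;
    block *word = &data[bit / BLOCK_BITS];
    unsigned shift = bit % BLOCK_BITS;

    *word &= ~((block)3u << shift);          /* clear the old element */
    *word |= (block)(value & 3u) << shift;   /* insert the new one    */
}
```

A write is a read-modify-write of the containing word, whereas the uint8_t array needs a single store, which is where the speed difference would come from if there is one.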
The answer depends on how big the arrays will get, and how you want to optimize. I sketch some scenarios:
Runtime, small arrays.
Simply use unsigned long arr[N]. Reading only on machine word boundaries is the fastest, but uses a lot of memory. When the memory usage gets too big you actually do not want to do this, because cache performance outweighs the aligned reads.
Runtime, big arrays.
Use unsigned char arr[N]. This will give you fast reads/writes at a decent speed.
Good memory usage, mediocre speed.
Use unsigned long arr[N] and store each trit in two bits, unpacking using shifts and masks.
Better memory usage, slow.
Use unsigned long arr[N], and store floor(CHAR_BIT * sizeof(long) * log(2) / log(3)) numbers, by storing digits in base-3. You can pack 20 trits in 32 bits using this method.
Best memory usage, horrendous.
Store all the numbers as digits in one base-3 number, using a bignum implementation.
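The base-3 variant can be sketched as below, assuming a 32-bit word and trit values 0..2; the function names are hypothetical. Twenty trits fit because 3^20 = 3486784401 is smaller than 2^32.

```c
#include <stdint.h>

enum { TRITS_PER_WORD = 20 };

/* Pack 20 trits (each 0..2) into one word; trits[0] is the least
   significant base-3 digit. */
static uint32_t pack_trits(const uint8_t trits[TRITS_PER_WORD])
{
    uint32_t word = 0;
    for (int i = TRITS_PER_WORD - 1; i >= 0; i--)
        word = word * 3u + trits[i];
    return word;
}

/* Extract trit number index (0..19) by repeated division. */
static uint8_t get_trit(uint32_t word, unsigned index)
{
    for (unsigned i = 0; i < index; i++)
        word /= 3u;
    return (uint8_t)(word % 3u);
}
```

Every access costs divisions rather than a shift and a mask, which is why this variant is labelled slow above.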

When to use bit-fields in C

On the question 'why do we need to use bit-fields?', searching on Google I found that bit fields are used for flags.
Now I am curious,
Is it the only way bit-fields are used practically?
Do we need to use bit fields to save space?
A way of defining bit field from the book:
struct {
unsigned int is_keyword : 1;
unsigned int is_extern : 1;
unsigned int is_static : 1;
} flags;
Why do we use int?
How much space is occupied?
I am confused why we are using int, but not short or something smaller than an int.
As I understand only 1 bit is occupied in memory, but not the whole unsigned int value. Is it correct?
A quite good resource is Bit Fields in C.
The basic reason is to reduce the used size. For example, if you write:
struct {
unsigned int is_keyword;
unsigned int is_extern;
unsigned int is_static;
} flags;
You will use at least 3 * sizeof(unsigned int) bytes (12 bytes with a typical 4-byte unsigned int) to represent three small flags that should only need three bits.
So if you write:
struct {
unsigned int is_keyword : 1;
unsigned int is_extern : 1;
unsigned int is_static : 1;
} flags;
This typically uses up the same space as one unsigned int, so 4 bytes on most current platforms. You can then put 32 one-bit fields into the struct before it needs more space.
This is sort of equivalent to the classical home brew bit field:
#define IS_KEYWORD 0x01
#define IS_EXTERN 0x02
#define IS_STATIC 0x04
unsigned int flags;
But the bit field syntax is cleaner. Compare:
if (flags.is_keyword)
against:
if (flags & IS_KEYWORD)
And it is obviously less error-prone.
Now I am curious, [are flags] the only way bitfields are used practically?
No, flags are not the only way bitfields are used. They can also be used to store values larger than one bit, although flags are more common. For instance:
typedef enum {
NORTH = 0,
EAST = 1,
SOUTH = 2,
WEST = 3
} directionValues;
struct {
unsigned int alice_dir : 2;
unsigned int bob_dir : 2;
} directions;
Do we need to use bitfields to save space?
Bitfields do save space. They also allow an easier way to set values that aren't byte-aligned. Rather than bit-shifting and using bitwise operations, we can use the same syntax as setting fields in a struct. This improves readability. With a bitfield, you could write
directions.alice_dir = WEST;
directions.bob_dir = SOUTH;
However, to store multiple independent values in the space of one int (or other type) without bitfields, you would need to write something like:
#define ALICE_OFFSET 0
#define BOB_OFFSET 2
directions &= ~(3<<ALICE_OFFSET); // clear Alice's bits
directions |= WEST<<ALICE_OFFSET; // set Alice's bits to WEST
directions &= ~(3<<BOB_OFFSET); // clear Bob's bits
directions |= SOUTH<<BOB_OFFSET; // set Bob's bits to SOUTH
The improved readability of bitfields is arguably more important than saving a few bytes here and there.
Why do we use int? How much space is occupied?
The space of an entire int is occupied. We use int because in many cases, it doesn't really matter. If, for a single value, you use 4 bytes instead of 1 or 2, your user probably won't notice. For some platforms, size does matter more, and you can use other data types which take up less space (char, short, uint8_t, etc.).
As I understand only 1 bit is occupied in memory, but not the whole unsigned int value. Is it correct?
No, that is not correct. The entire unsigned int will exist, even if you're only using 8 of its bits.
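A quick way to check this on a given compiler is to print the sizes. This is only a sketch; the struct tags and the function name are made up for illustration, and the exact numbers are implementation-defined.

```c
#include <stdio.h>

struct flags_plain {
    unsigned int is_keyword;
    unsigned int is_extern;
    unsigned int is_static;
};

struct flags_bits {
    unsigned int is_keyword : 1;
    unsigned int is_extern  : 1;
    unsigned int is_static  : 1;
};

/* Print how much storage each variant occupies on this implementation. */
static void report_sizes(void)
{
    printf("plain: %zu bytes\n", sizeof(struct flags_plain));
    printf("bits:  %zu bytes\n", sizeof(struct flags_bits));
}
```

With a 4-byte unsigned int, compilers commonly report 12 bytes for the plain struct and 4 for the bit-field one: the three bits still occupy a whole storage unit.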
Another place where bitfields are common is hardware registers. If you have a 32 bit register where each bit has a certain meaning, you can elegantly describe it with a bitfield.
Such a bitfield is inherently platform-specific. Portability does not matter in this case.
We use bit fields mostly (though not exclusively) for flag structures - bytes or words (or possibly larger things) in which we try to pack tiny (often 2-state) pieces of (often related) information.
In these scenarios, bit fields are used because they correctly model the problem we're solving: what we're dealing with is not really an 8-bit (or 16-bit or 24-bit or 32-bit) number, but rather a collection of 8 (or 16 or 24 or 32) related, but distinct pieces of information.
The problems we solve using bit fields are problems where "packing" the information tightly has measurable benefits and/or "unpacking" the information doesn't have a penalty. For example, if you're exposing 1 byte through 8 pins and the bits from each pin go through their own bus that's already printed on the board so that it leads exactly where it's supposed to, then a bit field is ideal. The benefit in "packing" the data is that it can be sent in one go (which is useful if the frequency of the bus is limited and our operation relies on frequency of its execution), and the penalty of "unpacking" the data is non-existent (or existent but worth it).
On the other hand, we don't use bit fields for booleans in other cases like normal program flow control, because of the way computer architectures usually work. Most common CPUs don't like fetching one bit from memory - they like to fetch bytes or integers. They also don't like to process bits - their instructions often operate on larger things like integers, words, memory addresses, etc.
So, when you try to operate on bits, it's up to you or the compiler (depending on what language you're writing in) to write out additional operations that perform bit masking and strip the structure of everything but the information you actually want to operate on. If there are no benefits in "packing" the information (and in most cases, there aren't), then using bit fields for booleans would only introduce overhead and noise in your code.
To answer the original question »When to use bit-fields in C?« … according to the book "Write Portable Code" by Brian Hook (ISBN 1-59327-056-9, I read the German edition ISBN 3-937514-19-8) and to personal experience:
Never use the bitfield idiom of the C language, but do it by yourself.
A lot of implementation details are compiler-specific, especially in combination with unions and things are not guaranteed over different compilers and different endianness. If there's only a tiny chance your code has to be portable and will be compiled for different architectures and/or with different compilers, don't use it.
We had this case when porting code from a little-endian microcontroller with some proprietary compiler to another big-endian microcontroller with GCC, and it was not fun. :-/
This is how I have used flags (host byte order ;-) ) since then:
# define SOME_FLAG (1 << 0)
# define SOME_OTHER_FLAG (1 << 1)
# define AND_ANOTHER_FLAG (1 << 2)
/* test flag */
if ( someint & SOME_FLAG ) {
/* do this */
}
/* set flag */
someint |= SOME_FLAG;
/* clear flag */
someint &= ~SOME_FLAG;
No need for a union with the int type and some bitfield struct then. If you read lots of embedded code those test, set, and clear patterns will become common, and you spot them easily in your code.
Why do we need to use bit-fields?
When you want to store some data which can be stored in less than one byte, those kind of data can be coupled in a structure using bit fields.
In the embedded world, when different bits of a 32-bit register word have different meanings, you can also use bit fields to make them more readable.
I found that bit fields are used for flags. Now I am curious, is it the only way bit-fields are used practically?
No, this is not the only way. You can use them in other ways too.
Do we need to use bit fields to save space?
Yes.
As I understand only 1 bit is occupied in memory, but not the whole unsigned int value. Is it correct?
No. Memory can only be occupied in multiples of bytes.
Bit fields can be used for saving memory space (but using bit fields for this purpose is rare). It is used where there is a memory constraint, e.g., while programming in embedded systems.
But this should be used only if extremely required because we cannot have the address of a bit field, so address operator & cannot be used with them.
A good usage would be to implement a chunk to translate to—and from—Base64 or any unaligned data structure.
struct base64enc {
unsigned int e1:6;
unsigned int e2:6;
unsigned int e3:6;
unsigned int e4:6;
}; // I don't know if declaring a 4-byte array will have the same effect.
struct base64dec {
unsigned char d1;
unsigned char d2;
unsigned char d3;
};
union base64chunk {
struct base64enc enc;
struct base64dec dec;
};
union base64chunk b64c;
// You can assign three characters to b64c.dec, and get four 0-63 codes from b64c.enc instantly
// (the exact bit order of the fields is implementation-defined).
This example is a bit naive, since Base64 must also consider null-termination (i.e. a string which has not a length l so that l % 3 is 0). But works as a sample of accessing unaligned data structures.
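Because the bit order inside a bit-field struct is implementation-defined, a portable version of the same split is usually written with shifts and masks. A minimal sketch, with a hypothetical function name:

```c
#include <stdint.h>

/* Split three input bytes into four 6-bit Base64 indices (0..63),
   most significant group first. */
static void split_base64_chunk(const uint8_t in[3], uint8_t out[4])
{
    /* Concatenate the three bytes into one 24-bit value. */
    uint32_t bits = ((uint32_t)in[0] << 16) | ((uint32_t)in[1] << 8) | in[2];

    out[0] = (bits >> 18) & 0x3F;
    out[1] = (bits >> 12) & 0x3F;
    out[2] = (bits >> 6)  & 0x3F;
    out[3] =  bits        & 0x3F;
}
```

For the bytes 0x4D 0x61 0x6E ("Man" in ASCII) this yields the indices 19, 22, 5, 46, which correspond to "TWFu" in the standard Base64 alphabet.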
Another example: Using this feature to break a TCP packet header into its components (or other network protocol packet header you want to discuss), although it is a more advanced and less end-user example. In general: this is useful regarding PC internals, OS, drivers, and encoding systems.
Another example: analyzing a float number.
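/* Bit-field order is implementation-defined: this order matches compilers
that allocate from the most significant bit; reverse the three fields for
compilers that allocate from the least significant bit. */
struct FP32_parts {
unsigned int sign:1;
unsigned int exponent:8;
unsigned int mantissa:23;
};
union FP32_t {
struct FP32_parts parts;
float number;
};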
(Disclaimer: Don't know the file name / type name where this is applied, but in C this is declared in a header; Don't know how can this be done for 64-bit floating-point numbers since the mantissa must have 52 bits and—in a 32 bit target—ints have 32 bits).
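A layout-independent way to get at the same three parts is to copy the float's bits into an integer and use shifts and masks. This sketch assumes float is a 32-bit IEEE 754 binary32 value with the same byte order as uint32_t; the struct and function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

struct fp32_fields {
    unsigned sign;      /* 1 bit   */
    unsigned exponent;  /* 8 bits, biased by 127 */
    uint32_t mantissa;  /* 23 bits, without the implicit leading 1 */
};

static struct fp32_fields split_float(float number)
{
    uint32_t bits;
    struct fp32_fields f;

    /* memcpy avoids the aliasing problems of pointer casts. */
    memcpy(&bits, &number, sizeof bits);

    f.sign     = bits >> 31;
    f.exponent = (bits >> 23) & 0xFFu;
    f.mantissa = bits & 0x7FFFFFu;
    return f;
}
```

The same approach works for a 64-bit double by copying into a uint64_t and using 1, 11 and 52 bits, which sidesteps the concern about a 52-bit bit field.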
Conclusion: As the concept and these examples show, this is a rarely used feature because it's mostly for internal purposes, and not for day-by-day software.
To answer the parts of the question no one else answered:
Ints, not Shorts
The reason to use ints rather than shorts, etc. is that in most cases no space will be saved by doing so.
Modern computers have a 32 or 64 bit architecture and that 32 or 64 bits will be needed even if you use a smaller storage type such as a short.
The smaller types are only useful for saving memory if you can pack them together (for example a short array may use less memory than an int array as the shorts can be packed together tighter in the array). For most cases, when using bitfields, this is not the case.
Other uses
Bitfields are most commonly used for flags, but there are other things they are used for. For example, one way to represent a chess board used in a lot of chess algorithms is to use a 64 bit integer to represent the board (8*8 squares) and set flags in that integer to give the position of all the white pawns. Another integer shows all the black pawns, etc.
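A minimal sketch of that board representation follows. The square numbering (bit = rank * 8 + file, both counted from 0) and all names are assumptions for illustration; real engines use various conventions.

```c
#include <stdint.h>

typedef uint64_t bitboard;

/* One-bit mask for a square; file 0..7 is a..h, rank 0..7 is 1..8. */
static bitboard square_mask(unsigned file, unsigned rank)
{
    return (bitboard)1 << (rank * 8u + file);
}

/* Count the set bits, i.e. the number of pieces on the board. */
static int count_pieces(bitboard b)
{
    int n = 0;
    while (b) {
        b &= b - 1;   /* clears the lowest set bit */
        n++;
    }
    return n;
}

/* Move a piece: clear the source square, set the destination square. */
static bitboard move_piece(bitboard b, bitboard from, bitboard to)
{
    return (b & ~from) | to;
}
```

With this numbering the white pawns' starting position is (bitboard)0xFF << 8, and advancing the e-pawn two squares is move_piece(pawns, square_mask(4, 1), square_mask(4, 3)).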
You can use them to expand the number of unsigned types that wrap. Ordinarily you would have only widths of 8, 16, 32, 64..., but with bit-fields you can have any width.
#include <stdio.h>
struct a
{
unsigned int b : 3 ;
} ;
int main( void )
{
struct a w = { 0 } ;
while( 1 )
{
printf("%u\n" , (unsigned)w.b++ ) ; /* counts 0..7, then wraps back to 0 */
getchar() ;
}
}
To use memory space efficiently, we can use bit fields.
As far as I know, in real-world programming we can, if required, simply use Booleans instead of declaring integers and then making bit fields of them.
If they are also values we use often, not only do we save space, we can also gain performance since we do not need to pollute the caches.
However, caching is also the danger in using bit fields since concurrent reads and writes to different bits will cause a data race and updates to completely separate bits might overwrite new values with old values...
Bitfields are much more compact and that is an advantage.
But don't forget packed structures are slower than normal structures. They are also more difficult to construct since the programmer must define the number of bits to use for each field. This is a disadvantage.
Why do we use int? How much space is occupied?
One answer to this question that I haven't seen mentioned in any of the other answers, is that the C standard guarantees support for int. Specifically:
A bit-field shall have a type that is a qualified or unqualified version of _Bool, signed int, unsigned int, or some other implementation defined type.
It is common for compilers to allow additional bit-field types, but not required. If you're really concerned about portability, int is the best choice.
Nowadays, microcontrollers (MCUs) have peripherals, such as I/O ports, ADCs, DACs, onboard the chip along with the processor.
Before MCUs became available with the needed peripherals, we would access some of our hardware by connecting to the buffered address and data buses of the microprocessor. A pointer would be set to the memory address of the device and if the device saw its address along with the R/W signal and maybe a chip select, it would be accessed.
Oftentimes we would want to access individual or small groups of bits on the device.
In our project, we used this to extract a page table entry and page directory entry from a given memory address:
union VADDRESS {
struct {
ULONG64 BlockOffset : 16;
ULONG64 PteIndex : 14;
ULONG64 PdeIndex : 14;
ULONG64 ReservedMBZ : (64 - (16 + 14 + 14));
};
ULONG64 AsULONG64;
};
Now suppose, we have an address:
union VADDRESS tempAddress;
tempAddress.AsULONG64 = 0x1234567887654321;
Now we can access PTE and PDE from this address:
cout << tempAddress.PteIndex;
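The same extraction can be written with shifts and masks, which does not depend on how the compiler lays out bit fields. This sketch assumes the fields are allocated starting from the least significant bit (BlockOffset in bits 0..15, PteIndex in bits 16..29, PdeIndex in bits 30..43); the function names are hypothetical.

```c
#include <stdint.h>

/* Bits 0..15 */
static unsigned block_offset(uint64_t address)
{
    return (unsigned)(address & 0xFFFFu);
}

/* Bits 16..29 (14 bits) */
static unsigned pte_index(uint64_t address)
{
    return (unsigned)((address >> 16) & 0x3FFFu);
}

/* Bits 30..43 (14 bits) */
static unsigned pde_index(uint64_t address)
{
    return (unsigned)((address >> 30) & 0x3FFFu);
}
```

For the address 0x1234567887654321 this gives a block offset of 0x4321, a PTE index of 0x0765 and a PDE index of 0x19E2.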

Bit field manipulation disadvantages

I was reading this link
http://dec.bournemouth.ac.uk/staff/awatson/micro/articles/9907feat2.htm
I could not understand this following statements from the link, Please help me understand about this.
The programmer just writes some macros that shift or mask the
appropriate bits to get what is desired. However, if the data involves
longer binary encoded records, the C API runs into a problem. I have,
over the years, seen many lengthy, complex binary records described
with the short or long integer bit-field definition facilities. C
limits these bit fields to subfields of integer-defined variables,
which implies two limitations: first of all, that bit fields may be no
wider, in bits, than the underlying variable; and secondly, that no
bit field should overlap the underlying variable boundaries. Complex
records are usually composed of several contiguous long integers
populated with bit-subfield definitions.
ANSI-compliant compilers are free to impose these size and alignment
restrictions and to specify, in an implementation-dependent but
predictable way, how bit fields are packed into the underlying machine
word structure. Structure memory alignment often isn’t portable, but
bit field memory is even less so.
What I have understood from these statements is that macros can be used to mask bits or to shift them left or right. But I had this doubt: why do they use macros? I thought that by defining it in macros, portability can be established irrespective of a 16-bit or 32-bit OS. Is that true? I also could not understand the two disadvantages mentioned in the above statement: 1. bit fields may be no wider 2. no bit field should overlap the underlying variable boundaries
and the line,
Complex records are usually composed of several contiguous long integers
populated with bit-subfield definitions.
1.bit fields may be no wider
Let's say you want a bitfield that is 200 bits long.
struct my_struct {
int my_field:200; /* Illegal! No integer type has 200 bits --> compile error! */
} v;
2.no bit field should overlap the underlying variable boundaries
Let's say you want two 30 bit bitfields and that the compiler uses a 32 bit integer as the underlying variable.
struct my_struct {
unsigned int my_field1:30;
unsigned int my_field2:30; /* Without padding this field will overlap a 32-bit boundary */
} v;
Usually, the compiler will add padding automatically, generating a struct with the following layout:
struct my_struct {
unsigned int my_field1:30;
unsigned int :2; /* padding added by the compiler */
unsigned int my_field2:30; /* now starts at the next 32-bit boundary */
unsigned int :2; /* padding added by the compiler */
} v;

Resources