I am working on a project that involves programming 32-bit ARM microcontrollers. As in much embedded software work, setting and clearing bits is an essential and quite repetitive task. A masking strategy works well on smaller micros, but when working with 32-bit microcontrollers it is not really practical to write out a mask each time we need to set or clear a single bit.
Writing functions to handle this could be a solution; however, a function occupies memory, which is not ideal in my case.
Is there any better alternative for handling bit setting/clearing when working with 32-bit micros?
In C or C++, you would typically define macros for bit masks and combine them as desired.
/* widget.h */
#define WIDGET_FOO 0x00000001u
#define WIDGET_BAR 0x00000002u
/* widget_driver.c */
static volatile uint32_t *widget_control_register = (volatile uint32_t*)0x12346578;
int widget_init (void) {
    *widget_control_register |= WIDGET_FOO;
    if (*widget_control_register & WIDGET_BAR) log(LOG_DEBUG, "widget: bar is set");
    return 0;
}
If you want to define the bit masks from the bit positions rather than as absolute values, define constants based on a shift operation (if your compiler doesn't optimize these constants, it's hopeless).
#define WIDGET_FOO (1u << 0)
#define WIDGET_BAR (1u << 1)
You can define macros to set bits:
/* widget.h */
#define WIDGET_CONTROL_REGISTER_ADDRESS ((volatile uint32_t*)0x12346578)
#define SET_WIDGET_BITS(m) (*WIDGET_CONTROL_REGISTER_ADDRESS |= (m))
#define CLEAR_WIDGET_BITS(m) (*WIDGET_CONTROL_REGISTER_ADDRESS &= ~(uint32_t)(m))
You can define functions rather than macros. This has the advantage of added type verification during compilation. If you declare the function as static inline (or even just static) in a header, a good compiler will inline the function everywhere, so using a function in your source code won't cost any code memory (assuming that the generated code for the function body is smaller than a function call, which should be the case for a function that merely sets some bits in a register).
/* widget.h */
#define WIDGET_CONTROL_REGISTER_ADDRESS ((volatile uint32_t*)0x12346578)
static inline void set_widget_bits(uint32_t m) {
    *WIDGET_CONTROL_REGISTER_ADDRESS |= m;
}
static inline void clear_widget_bits(uint32_t m) {
    *WIDGET_CONTROL_REGISTER_ADDRESS &= ~m;
}
The other common idiom for registers providing access to individual bits or groups of bits is to define a struct containing bitfields for each register of your device. This can get tricky, and it is dependent on the C compiler implementation. But it can also be clearer than macros.
A simple device with a one-byte data register, a control register, and a status register could look like this:
typedef struct {
unsigned char data;
unsigned char txrdy:1;
unsigned char rxrdy:1;
unsigned char reserved:2;
unsigned char mode:4;
} COMCHANNEL;
#define CHANNEL_A (*(COMCHANNEL *)0x10000100)
// ...
void sendbyte(unsigned char b) {
while (!CHANNEL_A.txrdy) /*spin*/;
CHANNEL_A.data = b;
}
unsigned char readbyte(void) {
while (!CHANNEL_A.rxrdy) /*spin*/;
return CHANNEL_A.data;
}
Access to the mode field is just CHANNEL_A.mode = 3;, which is a lot clearer than writing something like *CHANNEL_A_MODE = (*CHANNEL_A_MODE &~ CHANNEL_A_MODE_MASK) | (3 << CHANNEL_A_MODE_SHIFT);. Of course, the latter ugly expression would usually be (mostly) covered over by macros.
In my experience, once you have established a style for describing your peripheral registers you are best served by following that style over the whole project. The consistency will have huge benefits for future code maintenance, and over the lifetime of a project that factor is likely more important than the relatively small detail of whether you adopted the struct-and-bitfields or the macro style.
If you are coding for a target which has already set a style in its manufacturer provided header files and customary compiler toolchain, adopting that style for your own custom hardware and low level code may be best as it will provide the best match between manufacturer documentation and your coding style.
But if you have the luxury of establishing the style for your development at the outset, your compiler platform is well enough documented to permit you to reliably describe device registers with bitfields, and you expect to use the same compiler for the lifetime of the product, then that is often a good way to go.
You can actually have it both ways too. It isn't that unusual to wrap the bitfield declarations inside a union that describes the physical registers, allowing their values to be easily manipulated all bits at once. (I know I've seen a variation of this where conditional compilation was used to provide two versions of the bitfields, one for each bit order, and a common header file used toolchain-specific definitions to decide which to select.)
typedef struct {
unsigned char data;
union {
struct {
unsigned char txrdy:1;
unsigned char rxrdy:1;
unsigned char reserved:2;
unsigned char mode:4;
} bits;
unsigned char status;
};
} COMCHANNEL;
// ...
#define CHANNEL_A_STATUS_TXRDY 0x01
#define CHANNEL_A_STATUS_RXRDY 0x02
#define CHANNEL_A_MODE_MASK 0xf0
#define CHANNEL_A_MODE_SHIFT 4
// ...
#define CHANNEL_A (*(COMCHANNEL *)0x10000100)
// ...
void sendbyte(unsigned char b) {
while (!CHANNEL_A.bits.txrdy) /*spin*/;
CHANNEL_A.data = b;
}
unsigned char readbyte(void) {
while (!CHANNEL_A.bits.rxrdy) /*spin*/;
return CHANNEL_A.data;
}
Assuming your compiler understands the anonymous union, you can simply refer to CHANNEL_A.status to get the whole byte, or CHANNEL_A.bits.mode to refer to just the mode field.
There are some things to watch for if you go this route. First, you have to have a good understanding of structure packing as defined in your platform. The related issue is the order in which bit fields are allocated across their storage, which can vary. I've assumed that the low order bit is assigned first in my examples here.
There may also be hardware implementation issues to worry about. If a particular register must always be read and written 32 bits at a time, but you have it described as a bunch of small bit fields, the compiler might generate code that violates that rule and accesses only a single byte of the register. There is usually a trick available to prevent this, but it will be highly platform dependent. In this case, using macros with a fixed sized register will be less likely to cause a strange interaction with your hardware device.
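For example, one highly platform-dependent version of that trick (an illustrative sketch only; the layout and address below are made up) is to give every bit-field a 32-bit base type and route all hardware accesses through a full-width raw member of a union, so that each access to the device is a whole-word load or store:
#include <stdint.h>
typedef union {
    struct {
        uint32_t txrdy:1;
        uint32_t rxrdy:1;
        uint32_t reserved:2;
        uint32_t mode:4;
        uint32_t unused:24;
    } bits;
    uint32_t word;
} CTRLREG;
#define CTRL (*(volatile CTRLREG *)0x10000100)  /* hypothetical register address */
void set_mode(unsigned mode) {
    CTRLREG tmp;
    tmp.word = CTRL.word;   /* one explicit 32-bit read of the hardware register */
    tmp.bits.mode = mode;   /* modify the local copy; no hardware access here */
    CTRL.word = tmp.word;   /* one explicit 32-bit write back */
}
Whether the compiler honors this still has to be verified against its documentation and the generated code.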
These issues are very compiler vendor dependent. Even without changing compiler vendors, #pragma settings, command line options, or more likely optimization level choices can all affect memory layout, padding, and memory access patterns. As a side effect, they will likely lock your project to a single specific compiler toolchain, unless heroic efforts are used to create register definition header files that use conditional compilation to describe the registers differently for different compilers. And even then, you are probably well served to include at least one regression test that verifies your assumptions so that any upgrades to the toolchain (or well-intentioned tweaks to the optimization level) will cause any issues to get caught before they are mysterious bugs in code that "has worked for years".
The good news is that the sorts of deep embedded projects where this technique makes sense are already subject to a number of toolchain lock in forces, and this burden may not be a burden at all. Even if your product development team moves to a new compiler for the next product, it is often critical that firmware for a particular product be maintained with the very same toolchain over its lifetime.
If you use the Cortex M3 you can use bit-banding
Bit-banding maps a complete word of memory onto a single bit in the bit-band region. For example, writing to one of the alias words will set or clear the corresponding bit in the bitband region.
This allows every individual bit in the bit-banding region to be directly accessible from a word-aligned address using a single LDR instruction. It also allows individual bits to be toggled from C without performing a read-modify-write sequence of instructions.
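For illustration, here is a minimal sketch of how the peripheral bit-band alias address is computed on a Cortex-M3 (the register address and bit number below are made-up placeholders):
#include <stdint.h>
/* Cortex-M3 peripheral bit-band region and its alias region */
#define BITBAND_PERIPH_BASE   0x40000000u
#define BITBAND_PERIPH_ALIAS  0x42000000u
/* Each bit in the bit-band region maps to one word in the alias region:
   alias = alias_base + (byte_offset * 32) + (bit_number * 4) */
#define BITBAND_PERIPH(addr, bit) \
    ((volatile uint32_t *)(BITBAND_PERIPH_ALIAS + \
        (((uint32_t)(addr) - BITBAND_PERIPH_BASE) * 32u) + ((bit) * 4u)))
#define WIDGET_REG_ADDR 0x40010000u   /* hypothetical peripheral register */
void example(void) {
    *BITBAND_PERIPH(WIDGET_REG_ADDR, 9) = 1;   /* set bit 9 with a single store */
    *BITBAND_PERIPH(WIDGET_REG_ADDR, 9) = 0;   /* clear bit 9, no read-modify-write */
}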
If you have C++ available, and there's a decent compiler available, then something like QFlags is a good idea. It gives you a type-safe interface to bit flags.
It is likely to produce better code than using bitfields in structures, since the bitfields can only be changed one at a time and will likely translate to at least one load/modify/store per each changed bitfield. With a QFlags-like approach, you can get one load/modify/store per each or-assign or and-assign statement. Note that the use of QFlags doesn't require the inclusion of the entire Qt framework. It's a stand-alone header file (after minor tweaks).
At the driver level setting and clearing bits with masks is very common and sometimes the only way. Besides, it's an extremely quick operation; only a few instructions. It may be worthwhile to set up a function that can clear or set certain bits for readability and also reusability.
It's not clear what type of registers you are setting and clearing bits in but in general there are two cases you have to worry about in embedded systems:
Setting and clearing bits in a read/write register
If you want to change a single bit (or a handful of bits) in a read and write register you will first have to read the register, set or clear the appropriate bit using masks and whatever else to get the correct behavior, and then write back to the same register. That way you don't change the other bits.
Writing to separate Set and Clear registers (common in ARM micros)
Sometimes there are separate Set and Clear registers. You can write just a single bit to a clear register and it will clear that bit. For instance, if there is a register where you want to clear bit 9, just write (1<<9) to the clear register. You don't have to worry about modifying the other bits. Similarly for the set register.
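A short sketch of both cases, with made-up register addresses (your device's datasheet defines the real ones):
#include <stdint.h>
#define RW_REG    (*(volatile uint32_t *)0x40001000u)  /* hypothetical read/write register */
#define SET_REG   (*(volatile uint32_t *)0x40001004u)  /* hypothetical "set" register */
#define CLEAR_REG (*(volatile uint32_t *)0x40001008u)  /* hypothetical "clear" register */
void example(void) {
    /* Case 1: a read/write register needs a read-modify-write */
    RW_REG |= (1u << 9);      /* set bit 9, leave the other bits alone */
    RW_REG &= ~(1u << 9);     /* clear bit 9, leave the other bits alone */
    /* Case 2: separate set/clear registers, just write the mask */
    SET_REG = (1u << 9);      /* sets bit 9 of the underlying register */
    CLEAR_REG = (1u << 9);    /* clears bit 9, other bits untouched */
}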
You can set and clear bits with a function that takes up as much memory as doing it with a mask:
#define SET_BIT(variableName, bitNumber) ((variableName) |= (0x00000001u << (bitNumber)))
#define CLR_BIT(variableName, bitNumber) ((variableName) &= ~(0x00000001u << (bitNumber)))
int myVariable = 12;
SET_BIT(myVariable, 0); // myVariable now equals 13
CLR_BIT(myVariable, 2); // myVariable now equals 9
These macros will produce exactly the same assembler instructions as a mask.
Alternatively, you could do this:
#define BIT(n) (0x00000001u << (n))
#define NOT_BIT(n) (~(0x00000001u << (n)))
int myVariable = 12;
myVariable |= BIT(4); //myVariable now equals 28
myVariable &= NOT_BIT(3); //myVariable now equals 20
myVariable |= BIT(5) |
BIT(6) |
BIT(7) |
BIT(8); //myVariable now equals 500
As a beginner C programmer, I am wondering what the best easy-to-read and easy-to-understand solution would be for setting control bits in a device. Are there any standards? Any example code to mimic? Google didn't give a reliable answer.
For example, I have a control block map:
The first way I see would be to simply set the needed bits. It requires a bunch of explanatory comments and doesn't seem very professional.
DMA_base_ptr[DMA_CONTROL_OFFS] = 0b10001100;
The second way I see is to create a bit field. I'm not sure if this is the one I should stick to, since I have never encountered it being used in this way (unlike the first option I mentioned).
struct DMA_control_block_struct
{
unsigned int BYTE:1;
unsigned int HW:1;
// etc
} DMA_control_block_struct;
Is one of the options better than the other one? Are there any options I just don't see?
Any advice would be highly appreciated
The problem with bit fields is that the C standard does not dictate that the order in which they are defined is the same as the order that they are implemented. So you may not be setting the bits you think you are.
Section 6.7.2.1p11 of the C standard states:
An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified.
As an example, look at the definition of struct iphdr, which represents an IP header, from the /usr/include/netinet/ip.h file on Linux:
struct iphdr
{
#if __BYTE_ORDER == __LITTLE_ENDIAN
unsigned int ihl:4;
unsigned int version:4;
#elif __BYTE_ORDER == __BIG_ENDIAN
unsigned int version:4;
unsigned int ihl:4;
#else
# error "Please fix <bits/endian.h>"
#endif
u_int8_t tos;
...
You can see here that the bitfields are placed in a different order depending on the implementation. You also shouldn't copy this endianness check into your own code, because the behavior is system dependent; it is acceptable in this file only because it is part of the system. Other systems may implement it in different ways.
So don't use a bitfield.
The best way to do this is to set the required bits. However, it would make sense to define named constants for each bit and to perform a bitwise OR of the constants you want to set. For example:
const uint8_t BIT_BYTE = 0x1;
const uint8_t BIT_HW = 0x2;
const uint8_t BIT_WORD = 0x4;
const uint8_t BIT_GO = 0x8;
const uint8_t BIT_I_EN = 0x10;
const uint8_t BIT_REEN = 0x20;
const uint8_t BIT_WEEN = 0x40;
const uint8_t BIT_LEEN = 0x80;
DMA_base_ptr[DMA_CONTROL_OFFS] = BIT_LEEN | BIT_GO | BIT_WORD;
Other answers have already covered most of the stuff, but it might be worthwhile to mention that even if you can't use the non-standard 0b syntax, you can use shifts to move the 1 bit into position by bit number, i.e.:
#define DMA_BYTE (1U << 0)
#define DMA_HW (1U << 1)
#define DMA_WORD (1U << 2)
#define DMA_GO (1U << 3)
// …
Note how the last number matches the "bit number" column in the documentation.
The usage for setting and clearing bits doesn't change:
#define DMA_CONTROL_REG DMA_base_ptr[DMA_CONTROL_OFFS]
DMA_CONTROL_REG |= DMA_HW | DMA_WORD; // set HW and WORD
DMA_CONTROL_REG &= ~(DMA_BYTE | DMA_GO); // clear BYTE and GO
The old-school C way is to define a bunch of bits:
#define WORD 0x04
#define GO 0x08
#define I_EN 0x10
#define LEEN 0x80
Then your initialization becomes
DMA_base_ptr[DMA_CONTROL_OFFS] = WORD | GO | LEEN;
You can set individual bits using |:
DMA_base_ptr[DMA_CONTROL_OFFS] |= I_EN;
You can clear individual bits using & and ~:
DMA_base_ptr[DMA_CONTROL_OFFS] &= ~GO;
You can test individual bits using &:
if(DMA_base_ptr[DMA_CONTROL_OFFS] & WORD) ...
Definitely don't use bitfields, though. They have their uses, but not when an external specification defines that the bits are in certain places, as I assume is the case here.
See also questions 20.7 and 2.26 in the C FAQ list.
The layout of bit fields is not standardized: how they are mapped and how bit operations on them behave depend on the compiler. Binary literals such as 0b0000 are not standard either. The usual way is to define hexadecimal values for each bit. For example:
#define BYTE (0x01)
#define HW (0x02)
/*etc*/
When you want to set bits, you can use:
DMA_base_ptr[DMA_CONTROL_OFFS] |= HW;
Or you can clear bits with:
DMA_base_ptr[DMA_CONTROL_OFFS] &= ~HW;
Modern C compilers handle trivial inline functions just fine – without overhead. I’d make all of the abstractions functions, so that the user doesn’t need to manipulate any bits or integers, and is unlikely to abuse the implementation details.
You can of course use constants and not functions for implementation details, but the API should be functions. This also allows using macros instead of functions if you’re using an ancient compiler.
For example:
#include <stdbool.h>
#include <stdint.h>
typedef union DmaBase {
volatile uint8_t u8[32];
} DmaBase;
static inline DmaBase *const dma1__base(void) { return (void*)0x12340000; }
// instead of DMA_CONTROL_OFFS
static inline volatile uint8_t *dma_CONTROL(DmaBase *base) { return &(base->u8[12]); }
// instead of constants etc
static inline uint8_t dma__BYTE(void) { return 0x01; }
static inline bool dma_BYTE(DmaBase *base) { return *dma_CONTROL(base) & dma__BYTE(); }
static inline void dma_set_BYTE(DmaBase *base, bool val) {
    if (val) *dma_CONTROL(base) |= dma__BYTE();
    else *dma_CONTROL(base) &= ~dma__BYTE();
}
static inline bool dma1_BYTE(void) { return dma_BYTE(dma1__base()); }
static inline void dma1_set_BYTE(bool val) { dma_set_BYTE(dma1__base(), val); }
Such code should be machine generated: I use gsl (of 0mq fame) to generate those based on a template and some XML input listing the details of the registers.
You could use bit-fields, despite what all the fear-mongers here have been saying. You would just need to know how the compiler(s) and system ABI(s) you intend your code to work with define the "implementation defined" aspects of bit-fields. Don't be scared off by pedants putting words like "implementation defined" in bold.
However, what others so far seem to have missed are the various ways in which memory-mapped hardware devices can behave counter-intuitively when dealing with a higher-level language like C and the optimization features such languages offer. For example, every read or write of a hardware register may have side effects, sometimes even if no bits are changed by the write. Meanwhile the optimizer may make it difficult to tell when the generated code is actually reading or writing the address of the register, and even when the C object describing the register is carefully qualified as volatile, great care is required to control when I/O occurs.
Perhaps you will need to use some specific technique defined by your compiler and system in order to properly manipulate memory-mapped hardware devices. This is the case for many embedded systems. In some cases compiler and system vendors will indeed use bit-fields, just as Linux does in some cases. I would suggest reading your compiler manual first.
The bit description table you quote appears to be for the control register of the Intel Avalon DMA controller core. The "read/write/clear" column gives a hint as to how a particular bit behaves when it is read or written. The status register for that device has an example of a bit where writing a zero will clear a bit value, but it may not read back the same value as was written -- i.e. writing the register may have a side effect in the device, depending on the value of the DONE bit. Interestingly, they document the SOFTWARERESET bit as "RW", but then describe the procedure as writing a 1 to it twice to trigger the reset, and they also warn that "Executing a DMA software reset when a DMA transfer is active may result in permanent bus lockup (until the next system reset)." The SOFTWARERESET bit should therefore not be written except as a last resort. Managing a reset in C would take some careful coding no matter how you describe the register.
As for standards, well ISO/IEC have produced a "technical report" known as "ISO/IEC TR 18037", with the subtitle "Extensions to support embedded processors". It discusses a number of the issues related to using C to manage hardware addressing and device I/O, and specifically for the kinds of bit-mapped registers you mention in your question it documents a number of macros and techniques available through an include file they call <iohw.h>. If your compiler provides such a header file, then you might be able to use these macros.
There are draft copies of TR 18037 available, the latest being TR 18037(2007), though it makes for rather dry reading. However, it does contain an example implementation of <iohw.h>.
Perhaps a good example of a real-world <iohw.h> implementation is in QNX. The QNX documentation offers a decent overview (and an example, though I would strongly suggest using enums for integer values, never macros): QNX <iohw.h>
You should make sure to initialize the bits to a known default value when you declare the variable that stores them. In C, when you declare a variable you are just reserving a block of memory at an address; the size of the block is based on its type. If you don't initialize the variable you can encounter undefined or unexpected behavior, since its value will be affected by whatever state that block of memory was in before you declared it. By initializing the variable to a default value, you clear the block of its existing state and put it in a known state.
As far as readability goes, you should use a bit field to store the values of the bits. A bit field lets you store the bit values in a struct, which makes things easier to organize since you can use dot notation. Also, as a best practice, you should comment the declaration of the bit field to explain what the different fields are used for. I hope this answers your question. Good luck with your C programming!
Just wondering what the best practice is regarding I²C register maps in C, or rather what other people often use/prefer.
Up to this point, I have usually done lots of defines, one for every register and one for all the bits, masks, shifts etc.
However, lately I've seen some drivers use (possibly packed) structs instead of defines. I think these were Linux kernel modules.
Anyway, they would do something like:
struct i2c_sensor_fuu_registers {
uint8_t id;
uint16_t big_register;
uint8_t another_register;
...
} __attribute__((packed));
Then they'd use offsetof (or a macro) to get the i2c register and use sizeof for the number of bytes to read.
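A minimal sketch of that idiom (i2c_read() and the 0x48 device address here are hypothetical stand-ins for whatever bus API you actually use):
#include <stddef.h>
#include <stdint.h>
struct i2c_sensor_fuu_registers {
    uint8_t  id;
    uint16_t big_register;
    uint8_t  another_register;
} __attribute__((packed));
/* assumed bus helper: read 'len' bytes starting at register offset 'reg' */
extern int i2c_read(uint8_t dev_addr, uint8_t reg, void *buf, size_t len);
int read_big_register(uint16_t *out) {
    return i2c_read(0x48, /* hypothetical device address */
                    offsetof(struct i2c_sensor_fuu_registers, big_register),
                    out,
                    sizeof(((struct i2c_sensor_fuu_registers *)0)->big_register));
}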
I find that both approaches have their merit:
struct approach:
(+) Register offsets are all logically contained inside a struct instead of having to spell each register out in a define.
(+) Entry sizes are explicitly stated using a data type of appropriate size.
(-) This doesn't account for bit fields which are widely used
(-) This doesn't account for register maps that aren't byte mapped (e.g. LM75), where one reads 2 bytes from offset n+0x00, yet n+0x01 is another register, not the high/low byte of register n+0x00
(-) This doesn't account for large gaps in address space (e.g. registers at 0x00, 0x01, 0x80, 0xAA, no in-betweens...) and (I think?) relies on compiler optimization to get rid of the struct.
define approach:
(+) Each of the registers along with its bits is usually defined in a block, making finding the right symbol easy and relying on a naming convention.
(+) Transparent/unaware of address space gaps.
(-) Each of the registers has to be defined individually, even when there are no gaps
(-) Because defines tend to be global, the names are usually very long, somewhat littering the source code with dozens of long symbol names.
(-) Sizes of data to read are usually either hard-coded magic numbers or (end - start + 1) style computations with possibly long symbol names.
(o) Transparent/unaware of data size vs. address in map.
Basically, I'm looking for a smarter way to handle these cases. I often find myself typing lots and lots of agonizingly long symbol names for each and every register and each bit and possibly masks and shifts (latter two depending on data type) as well, just to end up using just a few of them (but hating to redefine missing symbols later on, which is why I type all in one session).
Still, I notice that sizes of bytes to read/write are mostly magic numbers and usually reading the datasheet and source code side-by-side is required to understand even the most basic interaction.
I wonder how other people handle these kinds of situations? I found some examples online where people also arduously typed every single register, bit etc. in a big header, but nothing quite definitive... However, neither of the two options above seems too smart at this point :(
WARNING: The method described here uses bitfields, whose arrangement in memory is implementation specific. If you do this, make sure you know how your compiler works in this regard.
As you point out, there are advantages and disadvantages to each method. I like a hybrid approach. You can define register offsets, but then use a struct for the contents and a union to specify the bits or the entire register. Inside the union, use the correct size variable for the size of the register (as you mentioned sometimes they're not byte addressable). You don't need quite as many defines, and you're less likely to mess up bit shifts and don't need masks. For example:
typedef unsigned char u8;
typedef unsigned short u16;
#define CTL_REG_ADDR 0x1234
typedef union {
struct {
u16 not_used:10; //top 10 bits unused
u16 foo_bits:3; //a multibit register
u16 bar_bit:1; //just one bit
u16 baz_bits:2; //2 more bits
} fields;
u16 raw;
} CTL_REG_DATA;
#define STATUS_REG_ADDR 0x58
typedef union {
struct {
u8 bar_bits:4; //upper nibble
u8 baz_bits:4; //lower nibble
} fields;
u8 raw;
} STATUS_REG_DATA;
//use them like the following
u16 readregister(u16);
void writeregister(u16,u16);
CTL_REG_DATA reg;
STATUS_REG_DATA rd;
rd.raw = readregister(STATUS_REG_ADDR);
if (rd.fields.bar_bits) {
reg.raw = 0xffff; //set every bit
reg.fields.bar_bit = 0; //but clear this one bit
writeregister(CTL_REG_ADDR, reg.raw);
}
In my ideal world, the hardware designer would supply a header file compatible with C++, C, and ASM. One that was auto-generated based on the actual hardware registers. One that defined every register and bit/field via both #defines (for ASM) and typedef'd structures (for C and C++). One that indicated the access properties of every bit and field (read-only, write-only, write-clear, etc.). One that included comments defining the use and purpose of each register and its bits/fields. It would also need to account for target endianness and compiler, to make sure any registers and bitfields were ordered correctly.
I got as close to this ideal as I could at a previous job. I wrote a script that would parse a register description file (of a format I defined) and auto-generate a full header (structures and #defines) as well as a function to dump all the readable registers for debugging purposes. I've seen similar approaches at other companies, but none that took it to that extent.
I'll point out that if you use a typedef struct to define your register layout then you can easily account for large register gaps in the definition. e.g. Just add a "reserved[80]" or "unused[94]" or "unimplemented[2044]" or "gap[42]" array element to define the gap. You'll always use the struct definition as a pointer to the hardware base address anyway, so it won't take up the actual size of the struct anywhere in memory.
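For example (names and addresses invented for illustration), a gap between offsets 0x08 and 0x7F can be absorbed like this:
#include <stdint.h>
typedef struct {
    volatile uint32_t ctrl;            /* offset 0x00 */
    volatile uint32_t status;          /* offset 0x04 */
    uint8_t           reserved[0x78];  /* offsets 0x08..0x7F, not implemented */
    volatile uint32_t data;            /* offset 0x80 */
} DEVICE_REGS;
#define DEVICE ((volatile DEVICE_REGS *)0x40001000u)  /* hypothetical base address */
/* DEVICE->data now compiles to an access at 0x40001080; the struct itself
   costs no RAM because only the pointer is ever used. */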
Hope that helps.
I have a couple of questions that are all inter-related. Basically, in the algorithm I am implementing a word w is defined as four bytes, so it can be contained whole in a uint32_t.
However, during the operation of the algorithm I often need to access the various parts of the word. Now, I can do this in two ways:
uint32_t w = 0x11223344;
uint8_t a = (w & 0xff000000) >> 24;
uint8_t b = (w & 0x00ff0000) >> 16;
uint8_t c = (w & 0x0000ff00) >> 8;
uint8_t d = (w & 0x000000ff);
However, part of me thinks that isn't particularly efficient. I thought a better way would be to use union representation like so:
typedef union
{
struct
{
uint8_t d;
uint8_t c;
uint8_t b;
uint8_t a;
};
uint32_t n;
} word32;
Using this method I can write word32 w = { .n = 0x11223344 }; and then access the various parts as I require (w.a == 0x11 on a little-endian machine).
However, at this stage I come up against endianness issues, namely, in big endian systems my struct is defined incorrectly so I need to re-order the word prior to it being passed in.
This I can do without too much difficulty. My question is, then, is the first part (various bitwise ands and shifts) efficient compared to the implementation using a union? Is there any difference between the two generally? Which way should I go on a modern, x86_64 processor? Is endianness just a red herring here?
I could inspect the assembly output of course, but my knowledge of compilers is not brilliant. I would have thought a union would be more efficient as it would essentially convert to memory offsets, like so:
mov eax, [r9+8]
Would a compiler realise that is what happening in the bit-shift case above?
If it matters, I'm using C99, specifically my compiler is clang (llvm).
Thanks in advance.
If you need AES, why not use an existing implementation? This can be particularly beneficial on modern Intel processors with hardware support for AES.
The union trick can slow things down due to store-to-load-forwarding (STLF) failures. These may happen, depending on the processor model, if you write data to memory and read it back soon afterwards at a different size (e.g. a 32-bit store followed by an 8-bit load).
Such a thing is hard to tell without being able to inspect the real use of these operations in your code:
The shift version will probably do better if you happen to have all your variables in registers anyhow and then do intensive computations on them. Usually compilers (including clang) are relatively clever at issuing instructions for partial words and the like.
The union version would perhaps be more efficient if you had to load your bytes from memory most of the time.
In any case I would abstract the access operation into a macro, so that you can modify it easily once you have working code.
For my personal taste I would go for the shift version, since it is conceptually simpler, and only go for the union when I'd see that at the end the produced assembler doesn't look satisfactory.
I would guess using a union may be more efficient. Of course, the compiler may be able to optimize the shifts into byte loads since they are known during compilation -- in which case both schemes will yield identical code.
Another option (also byte order dependent) is to cast the word to a byte array and access the bytes directly. I.e., something like the following
uint8_t b = ((uint8_t*)&w)[n];
I'm not sure you will see any difference on a real modern 32/64 bit processor, though.
EDIT: It seems like clang produces identical code in both cases.
Given that accessing bits using shift and masking is a common operation I'd expect compilers to be quite smart about it especially if you're using constant shift count and mask.
An option would be to use macros for bit set/get so that you can pick the best strategy at configure time if on a specific platform a compiler happens to be on the dumb side (and wisely chosen names for the macros can also make the code more clear and self explaining).
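A rough sketch of what that configure-time switch might look like (USE_UNION_ACCESS is a hypothetical build flag, and the union variant assumes a little-endian byte layout):
#include <stdint.h>
#ifdef USE_UNION_ACCESS
typedef union { uint32_t n; uint8_t b[4]; } word32;
/* byte i of w via the union; b[0] is the least significant byte on little-endian */
#define GET_BYTE(w, i) (((word32){ .n = (w) }).b[(i)])
#else
/* byte i of w via shift and mask; i = 0 is the least significant byte */
#define GET_BYTE(w, i) ((uint8_t)((w) >> (8u * (i))))
#endif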
I'm working on an embedded project (PowerPC target, Freescale Metrowerks Codewarrior compiler) where the registers are memory-mapped and defined in nice bitfields to make twiddling the individual bit flags easy.
At the moment, we are using this feature to clear interrupt flags and control data transfer. Although I haven't noticed any bugs yet, I was curious if this is safe. Is there some way to safely use bit fields, or do I need to wrap each in DISABLE_INTERRUPTS ... ENABLE_INTERRUPTS?
To clarify: the header supplied with the micro has fields like
union {
vuint16_t R;
struct {
vuint16_t MTM:1; /* message buffer transmission mode */
vuint16_t CHNLA:1; /* channel assignment */
vuint16_t CHNLB:1; /* channel assignment */
vuint16_t CCFE:1; /* cycle counter filter enable */
vuint16_t CCFMSK:6; /* cycle counter filter mask */
vuint16_t CCFVAL:6; /* cycle counter filter value */
} B;
} MBCCFR;
I assume setting a bit in a bitfield is not atomic. Is this a correct assumption? What kind of code does the compiler actually generate for bitfields? Performing the mask myself using the R (raw) field might make it easier to remember that the operation is not atomic (it is easy to forget that an assignment like CAN_A.IMASK1.B.BUF00M = 1 isn't atomic).
Your advice is appreciated.
Atomicity depends on the target and the compiler. AVR-GCC, for example, tries to detect bit access and emit bit set or clear instructions if possible. Check the assembler output to be sure...
EDIT: Here is a resource for atomic instructions on PowerPC directly from the horse's mouth:
http://www.ibm.com/developerworks/library/pa-atom/
It is correct to assume that setting bitfields is not atomic. The C standard isn't particularly clear on how bitfields should be implemented, and different compilers handle them in different ways.
If you really only care about your target architecture and compiler, disassemble some object code.
Generally, your code will achieve the desired result but be much less efficient than code using macros and shifts. That said, it's probably more readable to use your bit fields if you don't care about performance here.
You could always write a setter wrapper function for the bits that is atomic, if you're concerned about future coders (including yourself) being confused.
Yes, your assumption is correct, in the sense that you may not assume atomicity. On a specific platform you might get it as an extra, but you can't rely on it in any case.
Basically the compiler performs the masking and shifting for you. It might be able to take advantage of corner cases or special instructions. If you are interested in efficiency, look at the assembler that your compiler produces for it; it is usually quite instructive. As a rule of thumb I'd say that modern compilers produce code that is about as efficient as a medium programming effort would be. Really deep bit twiddling for your specific compiler could perhaps gain you some cycles.
I think that using bitfields to model hardware registers is not a good idea.
So much about how bitfields are handled by a compiler is implementation-defined (including how fields that span byte or word boundaries are handled, endianness issues, and exactly how getting, setting and clearing bits is implemented). See C/C++: Force Bit Field Order and Alignment
To verify that register accesses are being handled how you might expect or need them to be handled, you would have to carefully study the compiler docs and/or look at the emitted code. I suppose that if the headers supplied with the microprocessor toolset use them, you can assume that most of my concerns are taken care of. However, I'd guess that atomic access isn't necessarily guaranteed...
I think it's best to handle these type of bit-level accesses of hardware registers using functions (or macros, if you must) that perform explicit read/modify/write operations with the bit mask that you need, if that's what your processor requires.
Those functions could be modified for architectures that support atomic bit-level accesses (such as the ARM Cortex M3's "bit-banding" addressing). I don't know if the PowerPC supports anything like this - the M3 is the only processor I've dealt with that supports it in a general fashion. And even the M3's bit-banding only supports 1-bit accesses; if you're dealing with a field that's 6 bits wide, you have to go back to the read/modify/write scenario.
It totally depends on the architecture and compiler whether the bitfield operations are atomic or not. My personal experience tells: don't use bitfields if you don't have to.
I'm pretty sure that on powerpc this is not atomic, but if your target is a single core system then you can just:
void update_reg_from_isr(unsigned * reg_addr, unsigned set, unsigned clear, unsigned toggle) {
unsigned reg = *reg_addr;
reg |= set;
reg &= ~clear;
reg ^= toggle;
*reg_addr = reg;
}
void update_reg(unsigned * reg_addr, unsigned set, unsigned clear, unsigned toggle) {
interrupts_block();
update_reg_from_isr(reg_addr, set, clear, toggle);
interrupts_enable();
}
I don't remember if powerpc's interrupt handlers are interruptible, but if they are then you should just use the second version always.
If your target is a multiprocessor system then you should make locks (spinlocks, which disable interrupts on the local processor and then wait for any other processors to finish with the lock) that protect access to things like hardware registers, and acquire the needed locks before you access the register, and then release the locks immediately after you have finished updating the register (or registers).
I read once how to implement locks on PowerPC -- it involved telling the processor to watch the memory bus for a certain address while you did some operations and then checking back at the end of those operations to see if the watched address had been written to by another core. If it hadn't, your operation was successful; if it had, you had to redo the operation. This was in a document written for compiler, library, and OS developers. I don't remember where I found it (probably somewhere on IBM.com), but a little hunting should turn it up. It probably also has info on how to do atomic bit twiddling.
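For ordinary shared memory (as opposed to memory-mapped I/O), compilers now expose that reservation mechanism through atomic builtins, so you rarely have to hand-code the loop. A minimal sketch using the GCC/Clang builtins:
#include <stdint.h>
/* On PowerPC these typically compile to the lwarx/stwcx. retry loop described
   above; other targets use their own atomic read-modify-write instructions. */
static inline void atomic_set_bits(uint32_t *word, uint32_t mask) {
    __atomic_fetch_or(word, mask, __ATOMIC_SEQ_CST);
}
static inline void atomic_clear_bits(uint32_t *word, uint32_t mask) {
    __atomic_fetch_and(word, ~mask, __ATOMIC_SEQ_CST);
}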
The classic problem of testing and setting individual bits in an integer in C is perhaps one of the most common intermediate-level programming skills. You set and test with simple bitmasks such as
unsigned int mask = 1<<11;
if (value & mask) {....} // Test for the bit
value |= mask; // set the bit
value &= ~mask; // clear the bit
An interesting blog post argues that this is error prone, difficult to maintain, and poor practice. The C language itself provides bit level access which is typesafe and portable:
typedef unsigned int boolean_t;
#define FALSE 0
#define TRUE !FALSE
typedef union {
struct {
boolean_t user:1;
boolean_t zero:1;
boolean_t force:1;
int :28; /* unused */
boolean_t compat:1; /* bit 31 */
};
int raw;
} flags_t;
int
create_object(flags_t flags)
{
boolean_t is_compat = flags.compat;
if (is_compat)
flags.force = FALSE;
if (flags.force) {
[...]
}
[...]
}
But this makes me cringe.
The interesting argument my coworker and I had about this is still unresolved. Both styles work, and I maintain the classic bitmask method is easy, safe, and clear. My coworker agrees it's common and easy, but the bitfield union method is worth the extra few lines to make it portable and safer.
Are there any more arguments for either side? In particular, is there some possible failure, perhaps with endianness, that the bitmask method may miss but where the structure method is safe?
Bitfields are not quite as portable as you think, as "C gives no guarantee of the ordering of fields within machine words" (The C book)
Ignoring that, used correctly, either method is safe. Both methods also allow symbolic access to integral variables. You can argue that the bitfield method is easier to write, but it also means more code to review.
If the issue is that setting and clearing bits is error prone, then the right thing to do is to write functions or macros to make sure you do it right.
// off the top of my head
#define SET_BIT(val, bitIndex) ((val) |= (1u << (bitIndex)))
#define CLEAR_BIT(val, bitIndex) ((val) &= ~(1u << (bitIndex)))
#define TOGGLE_BIT(val, bitIndex) ((val) ^= (1u << (bitIndex)))
#define BIT_IS_SET(val, bitIndex) ((val) & (1u << (bitIndex)))
Which makes your code readable if you don't mind that val has to be an lvalue except for BIT_IS_SET. If that doesn't make you happy, then you take out assignment, parenthesize it and use it as val = SET_BIT(val, someIndex); which will be equivalent.
Really, the answer is to consider decoupling what you want to do from how you want to do it.
Bitfields are great and easy to read, but unfortunately the C language does not specify the layout of bitfields in memory, which means they are essentially useless for dealing with packed data in on-disk formats or binary wire protocols. If you ask me, this decision was a design error in C—Ritchie could have picked an order and stuck with it.
You have to think about this from the perspective of a writer -- know your audience. So there are a couple of "audiences" to consider.
First there's the classic C programmer, who has bitmasked their whole life and could do it in their sleep.
Second there's the newb, who has no idea what all this |, & stuff is. They were programming php at their last job and now they work for you. (I say this as a newb who does php)
If you write to satisfy the first audience (that is, bitmask-all-day-long), you'll make them very happy, and they'll be able to maintain the code blindfolded. However, the newb will likely need to overcome a large learning curve before they are able to maintain your code. They will need to learn about binary operators, how you use these operations to set/clear bits, etc. You're almost certainly going to have bugs introduced by the newb as he/she learns all the tricks required to get this to work.
On the other hand, if you write to satisfy the second audience, the newbs will have an easier time maintaining the code. They'll have an easier time grokking
flags.force = 0;
than
flags &= 0xFFFFFFFE;
and the first audience will just get grumpy, but it's hard to imagine they wouldn't be able to grok and maintain the new syntax. It's just much harder to screw up. There won't be new bugs, because the newb will more easily maintain the code. You'll just get lectures about how "back in my day you needed a steady hand and a magnetized needle to set bits... we didn't even HAVE bitmasks!" (thanks XKCD).
So I would strongly recommend using the fields over the bitmasks to newb-safe your code.
The union usage relies on unspecified behavior according to the ANSI C standard, and thus should not be used (or at least should not be considered portable).
From the ISO/IEC 9899:1999 (C99) standard:
Annex J - Portability Issues:
1 The following are unspecified:
— The value of padding bytes when storing values in structures or unions (6.2.6.1).
— The value of a union member other than the last one stored into (6.2.6.1).
6.2.6.1 - Language Concepts - Representation of Types - General:
6 When a value is stored in an object of structure or union type, including in a member object, the bytes of the object representation that correspond to any padding bytes take unspecified values.[42] The value of a structure or union object is never a trap representation, even though the value of a member of the structure or union object may be a trap representation.
7 When a value is stored in a member of an object of union type, the bytes of the object representation that do not correspond to that member but do correspond to other members take unspecified values.
So, if you want to keep the bitfield ↔ integer correspondence and keep portability, I strongly suggest you use the bitmasking method, which, contrary to the linked blog post, is not poor practice.
What is it about the bitfield approach that makes you cringe?
Both techniques have their place, and the only decision is which one to use:
For simple "one-off" bit fiddling, I use the bitwise operators directly.
For anything more complex - eg hardware register maps, the bitfield approach wins hands down.
Bitfields are more succinct to use (at the expense of slightly more verbosity to write).
Bitfields are more robust (what size is "int", anyway?).
Bitfields are usually just as fast as bitwise operators.
Bitfields are very powerful when you have a mix of single- and multiple-bit fields, and extracting the multiple-bit field involves loads of manual shifts.
Bitfields are effectively self-documenting. By defining the structure and therefore naming the elements, I know what it's meant to do.
Bitfields also seamlessly handle structures bigger than a single int.
With bitwise operators, typical (bad) practice is a slew of #defines for the bit masks.
The only caveat with bitfields is to make sure the compiler has really packed the object into the size you wanted. I can't remember if this is defined by the standard, so an assert(sizeof(myStruct) == N) is a useful check.
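With C11 that check can even be moved to compile time (the struct name and expected size below are placeholders):
#include <assert.h>
/* fails the build if the compiler did not pack the register map as expected */
static_assert(sizeof(myStruct) == 4, "unexpected bit-field packing");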
The blog post you are referring to mentions the raw union field as an alternative access method for bitfields.
The purposes the blog post author used raw for are OK; however, if you plan to use it for anything else (e.g. serialisation of bit fields, setting/checking individual bits), disaster is just waiting for you around the corner. The ordering of bits in memory is architecture dependent and memory padding rules vary from compiler to compiler (see Wikipedia), so the exact position of each bitfield may differ; in other words, you can never be sure which bit of raw each bitfield corresponds to.
However, if you don't plan to mix them, you had better take raw out and you will be safe.
Well, you can't go wrong with structure mapping: since both fields are accessible, they can be used interchangeably.
One benefit for bit fields is that you can easily aggregate options:
mask = USER|FORCE|ZERO|COMPAT;
vs
flags.user = true;
flags.force = true;
flags.zero = true;
flags.compat = true;
In some environments such as dealing with protocol options it can get quite old having to individually set options or use multiple parameters to ferry intermediate states to effect a final outcome.
But sometimes setting flag.blah and having the list pop up in your IDE is great, especially if you're like me and can't remember the name of the flag you want to set without constantly referencing the list.
I personally will sometimes shy away from declaring boolean types because at some point I'll end up with the mistaken impression that the field I just toggled was not dependent (Think multi-thread concurrency) on the r/w status of other "seemingly" unrelated fields which happen to share the same 32-bit word.
My vote is that it depends on the context of the situation and in some cases both approaches may work out great.
Either way, bitfields have been used in GNU software for decades and it hasn't done them any harm. I like them as parameters to functions.
I would argue that bitfields are conventional as opposed to structs. Everyone knows how to AND the values to set various options off and the compiler boils this down to very efficient bitwise operations on the CPU.
Providing you use the masks and tests in the correct way, the abstractions the compiler provide should make it robust, simple, readable and clean.
When I need a set of on/off switches, Im going to continue using them in C.
In C++, just use std::bitset<N>.
It is error-prone, yes. I've seen lots of errors in this kind of code, mainly because some people feel that they should mess with it and the business logic in a totally disorganized way, creating maintenance nightmares. They think "real" programmers can write value |= mask; , value &= ~mask; or even worse things at any place, and that's just ok. Even better if there's some increment operator around, a couple of memcpy's, pointer casts and whatever obscure and error-prone syntax happens to come to their mind at that time. Of course there's no need to be consistent and you can flip bits in two or three different ways, distributed randomly.
My advice would be:
Encapsulate this ---- in a class, with methods such as SetBit(...) and ClearBit(...). (If you don't have classes in C, in a module.) While you're at it, you can document all their behaviour.
Unit test that class or module.
Your first method is preferable, IMHO. Why obfuscate the issue? Bit fiddling is a really basic thing. C did it right. Endianess doesn't matter. The only thing the union solution does is name things. 11 might be mysterious, but #defined to a meaningful name or enum'ed should suffice.
Programmers who can't handle fundamentals like "|&^~" are probably in the wrong line of work.
I nearly always use the bitwise operations with a bit mask, either directly or as a macro. e.g.
#define ASSERT_GPS_RESET() { P1OUT &= ~GPS_RESET ; }
Incidentally, your union definition in the original question would not work on my processor/compiler combination. The int type is only 16 bits wide and the bitfield definitions are 32. To make it slightly more portable you would have to define a new 32-bit type that you could then map to the required base type on each target architecture as part of the porting exercise. In my case
typedef unsigned long int uint32_t;
and in the original example
typedef unsigned int uint32_t;
typedef union {
struct {
boolean_t user:1;
boolean_t zero:1;
boolean_t force:1;
int :28; /* unused */
boolean_t compat:1; /* bit 31 */
};
uint32_t raw;
} flags_t;
The overlaid int should also be made unsigned.
Well, I suppose that's one way of doing it, but I would always prefer to keep it simple.
Once you're used to it, using masks is straightforward, unambiguous and portable.
Bitfields are straightforward, but they are not portable without having to do additional work.
If you ever have to write MISRA-compliant code, the MISRA guidelines frown on bitfields, unions, and many, many other aspects of C, in order to avoid undefined or implementation-dependent behaviour.
When I google for "C operators", the first three pages are:
Operators in C and C++
http://h30097.www3.hp.com/docs/base_doc/DOCUMENTATION/V40F_HTML/AQTLTBTE/DOCU_059.HTM
http://www.cs.mun.ca/~michael/c/op.html
...so I think that argument about people new to the language is a little silly.
Generally, the one that is easier to read and understand is the one that is also easier to maintain. If you have co-workers that are new to C, the "safer" approach will probably be the easier one for them to understand.
Bitfields are great, except that the bit manipulation operations are not atomic, and can thus lead to problems in multi-threaded application.
For example one could assume that a macro:
#define SET_BIT(val, bitIndex) val |= (1 << bitIndex)
defines an atomic operation, since |= is one statement. But the ordinary code generated by a compiler will not try to make |= atomic.
So if multiple threads execute different set-bit operations, one of the updates can be lost, since both threads may execute:
thread 1          thread 2
LOAD field        LOAD field
OR mask1          OR mask2
STORE field       STORE field
The result can be field' = field OR mask1 OR mask2 (intended), or it can be field' = field OR mask1 (not intended), or it can be field' = field OR mask2 (not intended).
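One way to avoid that lost update, sketched here with C11 atomics (the variable name is illustrative), is to make the whole read-modify-write a single atomic operation:
#include <stdatomic.h>
atomic_uint field;   /* shared between threads */
void set_bit_atomic(unsigned bitIndex) {
    atomic_fetch_or(&field, 1u << bitIndex);   /* the OR happens as one atomic step */
}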
I'm not adding much to what's already been said, except to emphasize two points:
The compiler is free to arrange bits within a bitfield any way it wants. This means that if you're trying to manipulate bits in a microcontroller register, or if you want to send the bits to another processor (or even the same processor with a different compiler), you MUST use bitmasks.
On the other hand, if you're trying to create a compact representation of bits and small integers for use within a single processor, bitfields are easier to maintain and thus less error prone, and -- with most compilers -- are at least as efficient as manually masking and shifting.