Is it ok to use bit-fields in embedded system firmware?

I see that some embedded system Firmware books/articles suggest not to use C's structure bit field as it isn't portable. I know that the order and padding is implementation defined, but is it always not portable to use bit fields?
I mean, suppose for example I define a configuration structure for an 8-bit microcontroller driver like this:
typedef struct {
    int channel_name : 3; /* 7 possible channels */
    int Enable : 1;       /* if 1 enable, otherwise disable */
    int Mode;
} conf_t;
I don't understand how the implementation-defined behavior can raise a portability issue in such a case. Can anyone explain?

Here are some of the portability issues that are likely to occur:
Byte padding issues. What will be the size of this struct?
Endianness and the bit order issues caused by it. Will channel_name get allocated in the 3 MSB or the 3 LSB?
Different behavior from different compilers when you declare an int bitfield of size 1. What goes into that bit, the sign bit or data? In a bitfield (and only there), compilers may treat int either as signed or unsigned, and in case of signed, they may behave differently in regards of the sign bit.
Behavior in terms of bit/byte padding upon mixing different types in the same bitfield.
Then there's a bunch of other things that are poorly-defined as well, but less likely to cause actual problems in reality.
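One way to sidestep the layout questions above is to keep the configuration in a plain uint8_t and choose the bit positions yourself. The following is only a sketch: the layout (channel in bits 0..2, enable in bit 3) and all the names are assumptions made up for illustration, not something taken from the question's driver.

```c
#include <stdint.h>

/* Hypothetical layout: bits 0..2 = channel, bit 3 = enable. */
#define CONF_CHANNEL_MASK  0x07u
#define CONF_ENABLE_BIT    (1u << 3)

/* Build a configuration byte from a channel number and an enable flag. */
static inline uint8_t conf_make(uint8_t channel, int enable)
{
    uint8_t c = (uint8_t)(channel & CONF_CHANNEL_MASK);
    if (enable) {
        c |= (uint8_t)CONF_ENABLE_BIT;
    }
    return c;
}

/* Extract the channel number (0..7) from a configuration byte. */
static inline uint8_t conf_channel(uint8_t conf)
{
    return (uint8_t)(conf & CONF_CHANNEL_MASK);
}

/* Returns 1 if the enable bit is set, 0 otherwise. */
static inline int conf_enabled(uint8_t conf)
{
    return (conf & CONF_ENABLE_BIT) != 0;
}
```

With this approach the size of the object, the position of each field and the signedness of each field are all fixed by the source code rather than by the compiler.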

Device registers often have side effects; for example, reading a status register might clear a detected condition. You have no way of controlling how many accesses the compiler might make in a given expression. Likewise, when you update a structure with bitfields, the compiler is free to make multiple writes to the storage location, which could have a dramatic effect.
Even if you have sorted out your compiler what advantage are you gaining with this? Does it really make the code more readable, or just shorter? Often the latter implies the former, but there are limits.
The numbering of bits in a bitfield normally follows the byte ordering of the machine; so { int x:1; } would be the least significant bit on an Intel machine, but the most significant bit on a Motorola machine. In contrast, (1 << 0) is the least significant bit on all machines. [ I once had to go through an 8kloc video capture driver stuffed with bitfields to move it to another architecture ].
The casual notion that *p is sufficient to read a register with an appropriate bus protocol is a long dead notion, and should stay there. x = io_readb(device) is inherently self documenting; or even better : if (io_readb(device, &x) != 0) { panic("device failed"); }.
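To make the point about controlling accesses concrete, here is a rough sketch of a read-modify-write done through explicit accessor functions, so that exactly one read and one write appear in the source. The names reg_read8, reg_write8 and reg_update8 are hypothetical. The bodies are plain volatile accesses only so that the sketch runs on a host; on a real target they would be replaced by whatever the bus actually requires.

```c
#include <stdint.h>

/* Read one byte from a register: a single volatile load.
   (Stand-in body; a real port may need barriers or special instructions.) */
static inline uint8_t reg_read8(const volatile uint8_t *reg)
{
    return *reg;
}

/* Write one byte to a register: a single volatile store. */
static inline void reg_write8(volatile uint8_t *reg, uint8_t value)
{
    *reg = value;
}

/* Read-modify-write with one visible read and one visible write. */
static inline void reg_update8(volatile uint8_t *reg,
                               uint8_t clear_mask, uint8_t set_mask)
{
    uint8_t v = reg_read8(reg);
    v = (uint8_t)((v & (uint8_t)~clear_mask) | set_mask);
    reg_write8(reg, v);
}
```

Updating two fields of a register then costs one call to reg_update8, whereas two separate bitfield assignments leave the number of bus accesses up to the compiler.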


Comparison uint8_t vs uint16_t while declaring a counter

Suppose I have a counter which counts from 0 to 100. Is there an advantage to declaring the counter variable as uint16_t instead of uint8_t?
Obviously if I use uint8_t I could save some space. On a processor with natural wordsize of 16 bits access times would be the same for both I guess. I couldn't think why I would use a uint16_t if uint8_t can cover the range.
Using a wider type than necessary can allow the compiler to avoid having to mask the higher bits.
Suppose you were working on a 16 bit architecture, then using uint16_t could be more efficient, however if you used uint16_t instead of uint8_t on a 32 bit architecture then you would still have the mask instructions but just masking a different number of bits.
The most efficient type to use in a cross-platform portable way is just plain int or unsigned int, which will always be the correct type to avoid the need for masking instructions, and will always be able to hold numbers up to 100.
If you are in a MISRA or similar regulated environment that forbids the use of native types, then the correct standard-compliant type to use is uint_fast8_t. This guarantees to be the fastest unsigned integer type that has at least 8 bits.
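As a small illustration of the uint_fast8_t suggestion, here is a sketch of a 0 to 100 counter. The function name sum_up_to is invented for this example, and the only thing it relies on is that uint_fast8_t can hold at least 0..255.

```c
#include <stdint.h>

/* Sum the integers 0..limit using a uint_fast8_t counter.
   limit must be below 255 so the counter can step past it without wrapping
   on a platform where uint_fast8_t is only 8 bits wide. */
static unsigned sum_up_to(uint_fast8_t limit)
{
    unsigned total = 0;
    for (uint_fast8_t i = 0; i <= limit; i++) {
        total += i;
    }
    return total;
}
```

The compiler is free to make the counter 8, 16, 32 or 64 bits wide, whichever it considers fastest on the target, and the source stays the same.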
However, all of this is nonsense really. Your primary goal in writing code should be to make it readable, not to make it as fast as possible. Penny-pinching instructions like this makes code convoluted and more likely to have bugs. Also because it is harder to read, the bugs are less likely to be found during code review.
You should only try to optimize like this once the code is finished and you have tested it and found the particular part which is the bottleneck. Masking a loop counter is very unlikely to be the bottleneck in any real code.
Obviously if I use uint8_t I could save some space.
Actually, that's not necessarily obvious! A loop index variable is likely to end up in a register, and if it does there's no memory to be saved. Also, since the definition of the C language says that much arithmetic takes place using type int, it's possible that using a variable smaller than int might actually end up costing you space in terms of extra code emitted by the compiler to convert back and forth between int and your smaller variable. So while it could save you some space, it's not at all guaranteed that it will — and, in any case, the actual savings are going to be almost imperceptibly small in the grand scheme of things.
If you have an array of some number of integers in the range 0-100, using uint8_t is a fine idea if you want to save space. For an individual variable, on the other hand, the arguments are pretty different.
In general, I'd say that there are two reasons not to use type uint8_t (or, equivalently, char or unsigned char) as a loop index:
1. It's not going to save much data space (if at all), and it might cost code size and/or speed.
2. If the loop runs over exactly 256 elements (yours didn't, but I'm speaking more generally here), you may have introduced a bug (which you'll discover soon enough): your loop may run forever.
The interviewer was probably expecting #1 as an answer. It's not a guaranteed answer — under plenty of circumstances, using the smaller type won't cost you anything, and evidently there are microprocessors where it can actually save something — but as a general rule, I agree that using an 8-bit type as a loop index is, well, silly. And whether or not you agree, it's certainly an issue to be aware of, so I think it's a fair interview question.
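The 256-element trap from the second reason can be demonstrated with a small sketch. The helper below is hypothetical and carries an artificial iteration cap so that the runaway case still terminates.

```c
#include <stdint.h>

/* Counts how many iterations run before `i < n` becomes false,
   giving up after `cap` iterations so a runaway loop still ends. */
static unsigned long iterations_with_u8_index(unsigned n, unsigned long cap)
{
    unsigned long count = 0;
    for (uint8_t i = 0; i < n; i++) {   /* i wraps from 255 back to 0 */
        if (++count >= cap) {
            break;
        }
    }
    return count;
}
```

For n = 100 the loop stops after 100 iterations. For n = 256 the index can never reach 256, so without the cap the loop would not terminate.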
See also this question, which discusses the same sorts of issues.
The interview question doesn't make much sense from a platform-generic point of view. If we look at code such as this:
for(uint8_t i=0; i<n; i++)
array[i] = x;
Then the expression i<n will get carried out on type int or larger because of implicit promotion. Though the compiler may optimize it to use a smaller type if it doesn't affect the result.
As for array[i], the compiler is likely to use a type corresponding to whatever address size the system is using.
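The implicit promotion mentioned above is easy to observe. This is just a sketch with made-up function names: the arithmetic on two uint8_t operands is carried out in int, and the result only becomes 8 bits wide again if it is converted back.

```c
#include <stdint.h>

/* Adds two uint8_t values; the addition itself is performed in int. */
static int add_promoted(uint8_t a, uint8_t b)
{
    return a + b;            /* 200 + 100 gives 300, no wrap-around */
}

/* Same addition, but the result is converted back to uint8_t. */
static uint8_t add_truncated(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b); /* 300 is reduced modulo 256 to 44 */
}
```

The conversion back to uint8_t is where a compiler may have to emit a masking instruction if it cannot prove the value already fits.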
What the interviewer was fishing for is likely that uint32_t on a 32-bitter tends to generate faster code in some situations. For those cases you can use uint_fast8_t, but more likely the compiler will perform the optimization regardless.
The only optimization uint8_t blocks the compiler from doing, is to allocate a larger variable than 8 bits on the stack. It doesn't however block the compiler from optimizing out the variable entirely and using a register instead. Such as for example storing it in an index register with the same width as the address bus.
Example with gcc x86_64: The disassembly is pretty painful to read, but the compiler just picked CPU registers to store anything regardless of the type of i, giving identical machine code between uint8_t and uint16_t. I would have been surprised if it didn't.
On a processor with natural wordsize of 16 bits access times would be the same for both I guess.
Yes, this is true for all mainstream 16-bitters. Some might even manage faster code if given 8 bits instead of 16. Some exotic systems like DSPs exist, but in the case of, let's say, a DSP where 1 byte = 16 bits, the compiler doesn't even provide you with uint8_t to begin with, since it is an optional type. One generally doesn't bother with portability to wildly exotic systems, since doing so is a waste of everyone's time and money.
The correct answer: it is senseless to do manual optimization without a specific system in mind. uint8_t is perfectly fine to use for generic, portable code.

Should I use bit-fields for mapping incoming serial data?

We have data coming in over serial (Bluetooth), which maps to a particular structure. Some parts of the structure are sub-byte size, so the "obvious" solution is to map the incoming data to a bit-field. What I can't work out is whether the bit-endianness of the machine or compiler will affect it (which is difficult to test), and whether I should just abandon the bit-fields altogether.
For example, we have a piece of data which is 1.5 bytes, so we used the struct:
struct {
uint8_t data1; // lsb
uint8_t data2:4; // msb
uint8_t reserved:4;
} Data;
The reserved bits are always 1
So for example, if the incoming data is 0xD2,0xF4, the value is 0x04D2, or 1234.
The struct we have used is always working on the systems we have tested on, but we need it to be as portable as possible.
My questions are:
Will data1 always represent the correct value as expected regardless of endianness (I assume yes, and that the hardware/software interface should always handle that correctly for a single, whole byte - if 0xD2 is sent, 0xD2 should be received)?
Could data2 and reserved be the wrong way around, with data2 representing the upper 4 bits instead of the lower 4 bits?
If yes:
Is the bit endianness (generally) dependent on the byte endianness, or can they differ entirely?
Is the bit-endianness determined by the hardware or the compiler? It seems all linux systems on Intel are the same - is that true for ARM as well? (If we can say we can support all Intel and ARM linux builds, we should be OK)
Is there a simple way to determine in the compiler which way around it is, and reverse the bit-field entries if needed?
Although bit-fields are the neatest way, code-wise, to map the incoming data, I suppose I am just wondering if it's a lot safer to just abandon them, and use something like:
struct {
uint8_t data1; // lsb (0xFF)
uint8_t data2; // msb (0x0F) & reserved (0xF0)
} Data;
Data d;
int value = ((d.data2 & 0x0F) << 8) | d.data1;
The reason we have not just done this in the first place is because a number of the data fields are less than 1 byte, rather than more than 1 - meaning that generally with a bit-field we don't have to do any masking and shifting, so the post-processing is simpler.
Should I use bit-fields for mapping incoming serial data?
No. Bit-fields have a lot of implementation-defined behaviour that makes using them a nightmare.
Will data1 always represent the correct value as expected regardless of endianness.
Yes, but that is because uint8_t is the smallest possible addressable unit: a byte. For larger data types you need to take care of the byte endianness.
Could data2 and reserved be the wrong way around, with data2 representing the upper 4 bits instead of the lower 4 bits?
Yes. They could also be in different bytes. Also, the compiler doesn't have to support uint8_t for bit-fields, even if it supports the type otherwise.
Is the bit endianness (generally) dependent on the byte endianness, or can they differ entirely?
The least significant bit will always be in the least significant byte, but it's impossible to determine in C where in the byte the bit will be.
Bit shifting operators give reliable abstraction of the order that is good enough: For data type uint8_t the (1u << 0) is always the least significant and (1u << 7) the most significant bit, for all compilers and for all architectures.
Bit-fields on the other hand are so poorly defined that you cannot determine the order of bits by the order of your defined fields.
Is the bit-endianness determined by the hardware or the compiler?
Compiler dictates how datatypes map to actual bits, but hardware heavily influences it. For bit-fields, two different compilers for the same hardware can put fields in different order.
Is there a simple way to determine in the compiler which way around it is, and reverse the bit-field entries if needed?
Not really. It depends on your compiler how to do it, if it's possible at all.
Although bit-fields are the neatest way, code-wise, to map the incoming data, I suppose I am just wondering if it's a lot safer to just abandon them, and use something like:
Definitely abandon bit-fields, but I would also recommend abandoning structures altogether for this purpose, because:
You need to use compiler extensions or manual work to handle byte order.
You need to use compiler extensions to disable padding to avoid gaps due to alignment restrictions. This affects member access performance on some systems.
You cannot have variable width or optional fields.
It's very easy to have strict aliasing violations if you are unaware of those issues. If you define byte array for the data frame and cast that to pointer to structure and then dereference that, you have problems in many cases.
Instead I recommend doing it manually. Define a byte array and then write each field into it manually, breaking the fields apart using bit shifting and masking when necessary. You can write simple reusable conversion functions for the basic data types.
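For the 1.5-byte value in the question, such conversion functions might look roughly like this. It is a sketch, not a drop-in implementation: the function names are invented, and the wire format is assumed to be exactly what the question describes (first byte is the low 8 bits, second byte carries the high 4 bits in its low nibble, with the reserved upper nibble always set).

```c
#include <stdint.h>

/* Assumed wire format: frame[0] = low 8 bits of the value,
   frame[1] = high 4 bits in its low nibble, upper nibble reserved (all ones). */
static uint16_t frame_decode(const uint8_t frame[2])
{
    return (uint16_t)(((uint16_t)(frame[1] & 0x0Fu) << 8) | frame[0]);
}

/* Inverse of frame_decode; values above 0x0FFF are truncated to 12 bits. */
static void frame_encode(uint16_t value, uint8_t frame[2])
{
    frame[0] = (uint8_t)(value & 0xFFu);
    frame[1] = (uint8_t)(0xF0u | ((value >> 8) & 0x0Fu));
}
```

Decoding the bytes 0xD2, 0xF4 from the question gives 0x04D2, i.e. 1234, on any compiler and any byte order, because only shifts and masks on byte values are involved.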

Are there reasons to avoid bit-field structure members?

I long knew there are bit-fields in C and occasionally I use them for defining densely packed structs:
typedef struct Message_s {
unsigned int flag : 1;
unsigned int channel : 4;
unsigned int signal : 11;
} Message;
When I read open source code, I instead often find bit-masks and bit-shifting operations used to store and retrieve such information in hand-rolled bit-fields. This is so common that I do not think the authors were unaware of the bit-field syntax, so I wonder if there are reasons to roll your own bit-fields via bit-masks and shifting operations instead of relying on the compiler to generate code for getting and setting such bit-fields.
Why do other programmers use hand-coded bit manipulations instead of bitfields to pack multiple fields into a single word?
This answer is opinion based as the question is quite open:
Many programmers are unaware of the availability of bitfields or unsure about their portability and precise semantics. Some even distrust the compiler's ability to produce correct code. They prefer to write explicit code that they understand.
As commented by Cornstalks, this attitude is rooted in real life experience as explained in this article.
Bitfield's actual memory layout is implementation defined: if the memory layout must follow a precise specification, bitfields should not be used and hand-coded bit manipulations may be required.
The handling of signed values in signed typed bitfields is implementation defined. If signed values are packed into a range of bits, it may be more reliable to hand-code the access functions.
Are there reasons to avoid bitfield-structs?
bitfield-structs come with some limitations:
Bit fields result in non-portable code. Also, the bit field length has a high dependency on word size.
Reading (using scanf()) and using pointers on bit fields is not possible due to non-addressability.
Bit fields are used to pack more variables into a smaller data space, but cause the compiler to generate additional code to manipulate these variables. This results in an increase in both space as well as time complexities.
The sizeof() operator cannot be applied to the bit fields, since sizeof() yields the result in bytes and not in bits.
So whether you should use them or not depends. Read more in Why bit endianness is an issue in bitfields?
PS: When to use bit-fields in C?
There is no reason for it. Bitfields are useful and convenient. They are in common use in embedded projects. Some architectures (like ARM) even have special instructions to manipulate bitfields.
Just compare the code (and write the rest of the function foo1)
In many cases, it is useful to be able to address individual groups of bits within a word, or to operate on a word as a unit. The Standard presently does not provide any practical and portable way to achieve such functionality. If code is written to use bitfields and it later becomes necessary to access multiple groups as a word, there would be no nice way to accommodate that without reworking all the code using the bit fields or disabling type-based aliasing optimizations, using type punning, and hoping everything gets laid out as expected.
Using shifts and masks may be inelegant, but until C provides a means of treating an explicitly-designated sequence of bits within one lvalue as another lvalue, it is often the best way to ensure that code will be adaptable to meet needs.
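For comparison, here is roughly what the Message struct from the question looks like when done with shifts and masks. The bit positions are an assumption chosen for this sketch (flag in bit 0, channel in bits 1..4, signal in bits 5..15), and the names are invented.

```c
#include <stdint.h>

/* Assumed layout: bit 0 = flag, bits 1..4 = channel, bits 5..15 = signal. */
#define MSG_FLAG_SHIFT     0
#define MSG_CHANNEL_SHIFT  1
#define MSG_SIGNAL_SHIFT   5
#define MSG_FLAG_MASK      0x0001u
#define MSG_CHANNEL_MASK   0x000Fu
#define MSG_SIGNAL_MASK    0x07FFu

/* Pack the three fields into one 16-bit word; oversized inputs are masked. */
static uint16_t msg_pack(unsigned flag, unsigned channel, unsigned signal)
{
    return (uint16_t)(((flag    & MSG_FLAG_MASK)    << MSG_FLAG_SHIFT) |
                      ((channel & MSG_CHANNEL_MASK) << MSG_CHANNEL_SHIFT) |
                      ((signal  & MSG_SIGNAL_MASK)  << MSG_SIGNAL_SHIFT));
}

/* Field accessors: shift the field down, then mask off everything else. */
static unsigned msg_flag(uint16_t m)    { return (m >> MSG_FLAG_SHIFT) & MSG_FLAG_MASK; }
static unsigned msg_channel(uint16_t m) { return (m >> MSG_CHANNEL_SHIFT) & MSG_CHANNEL_MASK; }
static unsigned msg_signal(uint16_t m)  { return (m >> MSG_SIGNAL_SHIFT) & MSG_SIGNAL_MASK; }
```

Because the packed message is an ordinary uint16_t, it can be handled as a whole word (compared, copied, serialized) and as individual fields without any type punning.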

What is the difference between copying unsigned int 2 times and unsigned long 1 time in 64 bit systems?

What is the difference between
*(unsigned*)d = *(unsigned*)s;
d+=4; s+=4;
*(unsigned*)d = *(unsigned*)s;
d+=4; s+=4;
and
*(unsigned long*)d = *(unsigned long*)s;
d+=8; s+=8;
on 64bit systems?
Provided that nothing unpleasant happens in respect of padding bits or strict aliasing rules, and assuming the sizes of the types are as you expect, and provided that the memory regions don't overlap, and are correctly aligned, then they each copy 8 bytes from one place to another.
Of course, aside from the practical effect there may be a difference in performance and/or code size.
If you're seeing something break, then look at the actual code emitted, that might tell you what has gone wrong. Unless you have a lot of optimization switched on, and maybe even with optimization, I don't immediately see why those wouldn't be equivalent with AMD64, Ubuntu, and gcc.
Things I've mentioned that could go wrong:
padding bits - doesn't apply to GCC, but the standard permits unsigned and unsigned long to have padding bits, and if so then there could be bit patterns which are trap representations of one or both, which could explode as soon as you dereference.
strict aliasing - unlikely to affect what that code does, but could affect the code you use to check the result. For example, if s and d are the result of casting pointers-to-double to uint8_t*, and you look at the resulting double, then in one or both cases you might not see the effects of the change because you have an illegal type-pun.
sizes of the types - shouldn't apply here since 64 bit linux is LP64, but obviously if sizeof(long) == 4 then the two aren't equivalent. long is 32 bits on 64bit Windows systems, just not 64bit Linux ones.
overlap - if d == s + 4, then the two code snippets have different effect. Because of this, you won't see the first optimized to become the second unless the compiler knows that d and s point to entirely different places (and that's what C99 restrict is for).
alignment - I can't remember what the alignment requirements are for x86-64: for x86 you can get away with an unaligned read/write, it's just slower. In general, if s or d is correctly aligned for int but not long then there's a difference. (Edit: apparently you can enable or disable hardware exceptions for unaligned access on x86-64).
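The overlap case is the easiest of these to reproduce. The sketch below uses fixed-width temporaries and memcpy instead of the pointer casts from the question (so that it stays clear of the aliasing and alignment concerns), and the function names are made up. What it models is only the order of loads and stores.

```c
#include <stdint.h>
#include <string.h>

/* Two 4-byte steps: the second step reads what the first may have written. */
static void copy_2x4(unsigned char *d, const unsigned char *s)
{
    uint32_t tmp;
    memcpy(&tmp, s, 4);
    memcpy(d, &tmp, 4);
    memcpy(&tmp, s + 4, 4);
    memcpy(d + 4, &tmp, 4);
}

/* One 8-byte step: all eight source bytes are loaded before any store. */
static void copy_1x8(unsigned char *d, const unsigned char *s)
{
    uint64_t tmp;
    memcpy(&tmp, s, 8);
    memcpy(d, &tmp, 8);
}
```

With a buffer holding the bytes 0, 1, 2, ... and d == s + 4, the two-step version leaves 0 in byte 8 (it re-copies bytes it has just overwritten), while the one-step version leaves 4 there.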
If you need to copy exactly eight bytes, why not use memcpy()?
memcpy(d, s, 8);
Using GCC, it will emit inline code instead of calling the library function, so it should be as fast as your hand-written memory copy.
As an added bonus, your code will work on ILP32 systems, LP64 (most 64-bit Unix) and LLP64 (win64), and even on systems with strict alignment requirements.
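Wrapped up as helpers, the memcpy approach might look like this. The names copy8 and load_u64 are invented for this sketch.

```c
#include <stdint.h>
#include <string.h>

/* Copy exactly eight bytes; the two buffers must not overlap. */
static void copy8(void *dst, const void *src)
{
    memcpy(dst, src, 8);
}

/* Read a 64-bit value from a possibly unaligned address. */
static uint64_t load_u64(const void *src)
{
    uint64_t v;
    memcpy(&v, src, sizeof v);
    return v;
}
```

Neither helper cares whether the pointers are aligned or what type the bytes were originally stored as, which is what makes it safe where the casts in the question are not.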
If performance is not critical, you should probably just use the memcpy() as in another answer.
If this code occurs soon after a write to *s, match the types; if this code occurs soon before a read from *d, match the types. This will ensure store-to-load forwarding (moving the data from the store directly to the load, without waiting for the store to write the data back into the data cache) will work on as many CPUs as possible. Store-to-load forwarding almost always works if the addresses and sizes of the store and load match and are aligned, and may work more often depending on CPU. If store-to-load forwarding fails, the penalty tends to be in the order of 10 clock cycles.
If you can avoid a store-to-load forwarding problem by adding additional shift/and/or operations, this is often faster.
If you use C's type system more effectively and avoid casts, many store-to-load forwarding problems will be avoided.
Try casting as (unsigned long long*)

Smart typedefs

I've always used typedef in embedded programming to avoid common mistakes:
int8_t - 8 bit signed integer
int16_t - 16 bit signed integer
int32_t - 32 bit signed integer
uint8_t - 8 bit unsigned integer
uint16_t - 16 bit unsigned integer
uint32_t - 32 bit unsigned integer
The recent embedded muse (issue 177, not on the website yet) introduced me to the idea that it's useful to have some performance specific typedefs. This standard suggests having typedefs that indicate you want the fastest type that has a minimum size.
For instance, one might declare a variable using int_fast16_t, but it would actually be implemented as an int32_t on a 32 bit processor, or int64_t on a 64 bit processor as those would be the fastest types of at least 16 bits on those platforms. On an 8 bit processor it would be int16_t to meet the minimum size requirement.
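To see what a given toolchain actually picks, a quick check along these lines can help. It is only a sketch, and the function name is made up. The sizes printed depend entirely on the compiler and target.

```c
#include <stdint.h>
#include <stdio.h>

/* Print the width the implementation chose for each 16-bit flavour. */
static void print_int16_flavours(void)
{
    printf("int16_t:       %zu bytes\n", sizeof(int16_t));
    printf("int_least16_t: %zu bytes\n", sizeof(int_least16_t));
    printf("int_fast16_t:  %zu bytes\n", sizeof(int_fast16_t));
}
```

The only thing the standard promises is that int_fast16_t has at least 16 bits; whether it ends up as 16, 32 or 64 bits is the implementation's call.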
Having never seen this usage before I wanted to know
Have you seen this in any projects, embedded or otherwise?
Any possible reasons to avoid this sort of optimization in typedefs?
For instance, one might declare a variable using int_fast16_t, but it would actually be implemented as an int32_t on a 32 bit processor, or int64_t on a 64 bit processor as those would be the fastest types of at least 16 bits on those platforms
That's what int is for, isn't it? Are you likely to encounter an 8-bit CPU any time soon, where that wouldn't suffice?
How many unique datatypes are you able to remember?
Does it provide so much additional benefit that it's worth effectively doubling the number of types to consider whenever I create a simple integer variable?
I'm having a hard time even imagining the possibility that it might be used consistently.
Someone is going to write a function which returns an int_fast16_t, and then someone else is going to come along and store that value into an int16_t.
Which means that in the obscure case where the fast variants are actually beneficial, it may change the behavior of your code. It may even cause compiler errors or warnings.
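If values do have to cross from the fast type into the exact-width one, an explicit range check makes the narrowing visible. A minimal sketch, with a made-up function name:

```c
#include <stdint.h>
#include <stdbool.h>

/* Store a "fast" value into an exact-width variable only when it fits.
   Returns false (and leaves *out untouched) if the value is out of range. */
static bool store_int16_checked(int_fast16_t value, int16_t *out)
{
    if (value < INT16_MIN || value > INT16_MAX) {
        return false;
    }
    *out = (int16_t)value;
    return true;
}
```

On a platform where int_fast16_t is exactly 16 bits the check can never fail. On one where it is wider, it catches exactly the values that would otherwise be silently changed by the conversion.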
Check out stdint.h from C99.
The main reason I would avoid this typedef is that it allows the type to lie to the user. Take int16_t vs int_fast16_t. Both type names encode the size of the value into the name. This is not an uncommon practice in C/C++. I personally use the size specific typedefs to avoid confusion for myself and other people reading my code. Much of our code has to run on both 32 and 64 bit platforms and many people don't know the various sizing rules between the platforms. Types like int32_t eliminate the ambiguity.
If I had not read the 4th paragraph of your question and instead just saw the type name, I would have assumed it was some scenario specific way of having a fast 16 bit value. And I obviously would have been wrong :(. For me it would violate the "don't surprise people" rule of programming.
Perhaps if it had another distinguishing verb, letter, acronym in the name it would be less likely to confuse users. Maybe int_fast16min_t ?
When I am looking at int_fast16_t and I am not sure about the native width of the CPU it will run on, things can get complicated, for example with the ~ operator.
int_fast16_t i = 10;
int16_t j = 10;
if (~i != ~j) {
// scary !!!
}
Somehow, I would like to willfully use 32 bit or 64 bit based on the native width of the processor.
I'm actually not much of a fan of this sort of thing.
I've seen this done many times (in fact, we even have these typedefs at my current place of employment)... For the most part, I doubt their true usefulness... It strikes me as change for change's sake... (and yes, I know the sizes of some of the built-ins can vary)...
I commonly use size_t, it happens to be the fastest address size, a tradition I picked up in embedding. And it never caused any issues or confusion in embedded circles, but it actually began causing me problems when I began working on 64bit systems.
