standardising bit fields using unions (making them portable) - c

The page shown is from
"Programming Embedded Systems"
by Michael Barr, Anthony Massa,
published by O'Reilly,
ISBN: 0-596-00983-6, page 204.
I am asking you to give more details and explanations on this, like:
Does this means that the bit fields are going to be portable across all the compilers?
For (different) architectures does this work for bit fields with sizes more than one byte or (considering the endianness difference which I don't think that using this method will overcome this problem)?
For (same) architectures does this work for bit fields with sizes more than one byte?
If they are standardised across all compilers as the book says, can we specify how are they going to be aligned?
Q.1.2 if the bit fields are just one byte so the endianness problem won't affect it, right? So will the bit fields be portable across all the compilers on different architectures and endiannesses?

does this means that the bit fields are going to be portable across all the compilers
No, the cited text about unions somehow making bit-fields portable is strange. union does not add any further portability guarantees what-so-ever. There are many aspects of bit-fields that make them completely non-portable, because they are very poorly specified by the standard. Some examples here.
For example, using uint8_t or a char type for a bit-field is not covered by the standard. The book fails to mention this even though it makes such a non-standard example.
for (different) architectures does this work for bit fields with sizes more than one byte
for (same) architectures does this work for bit fields with sizes more than one byte
No, there are no guarantees at all.
if they are standardised across all compilers as the book says ,can we specify how are they going to be aligned?
They aren't, the book is misleading. My advise is to stop reading at "bitfields are not portable" then forget that you ever heard about bit-fields. It is a 100% superfluous feature anyway. Instead, use bitwise operators.

That is a very disturbing text, I think I would toss the whole book.
Please go read the spec, you dont have to pay for it there are older ones available and draft versions (which of course are not the final) versions available, but for that couple of decades they are more common that different.
If I could add more upvotes to Lundin's answer I would, not worth creating new users with email addresses just to do that...
I have/will possibly spark an argument on this, but...The spec does say that if you define some number of (non-zero sized) bitfields in a row in a struct or union they will get packed, and there is a special zero sized one that is used to break that so that you can declare a bunch of bitfields and group them without having to make some other struct.
Perhaps it says they will be aligned, but I would never assume, that. I know for a fact the same compiler will treat endians different and pack them on opposite ends (top bit down or bottom bit up). But there is no reason to assume that any compiler follows any convention other than packing, and I would assume although perhaps it is also subject to interpretation, that they are packed in the order defined once you figure out where they start. So I wouldnt assume that 6 bits worth of declaration are aligned either way, could be up to six different alignents within a byte assuming a byte is the size of the unit, if the size of the unit is 32 or 64 bits then I am not going to bother counting the combinations, it is more than one and that is all that matters. I know for a fact from gcc that when the 32 to 64 bit x86 transition happened, that caused problems with code making assumptions on where those bits landed.
I personally wouldnt even assume that the bits are in the declared order when they pack them together...Popular compilers tend to do that, but the spec does not say more than they are packed...what does that mean. If I have a 1 then an 8 then a 1 then an 8 then a 6 bit I would hope the compiler alignes the 8 bit ones on a byte boundary and then moves the two ones near the 6, if I were to ever use a bitfield which I dont...
The prime contention here, is to me the spec is very clear that the initial items in more than one declaration in a union only uses the same memory if the order and size are the same they are compatible types. A one bit unsigned it is not the same as a 32 bit unsigned int, so they are NOT compatible types IMO. The spec goes further to state that, for bitfields the types have to be the same type and size, so for a bitfield to share the same memory in a union, you need two structures with the same initial bitfield items to be the same type and size, and only those items are per spec going to share memory, what happens with the rest of the bits is a different story, per the spec. so your example from my reading of the spec does nothing to say that the 8 bit char (using a made up non-spec declaration) and 8 declared bits of bit field are in no way expected to line up with each other and share the same memory. Just because a compiler chooses to do that in some version does not mean you can assume that, the union in particular does not make that code portable or more portable in anyway, in fact perhaps worse as now you not only have a bitfield issue across compilers or versions, but you now have union issues across compilers or versions.
As a general rule NEVER use a structure across a compile domain (with or without bitfields)(this includes unions), so never read a file into a structure, you have crossed a compile domain, never point structures at hardware registers, you have crossed a compile domain, never point structures at memory, dont point a structure at a char array that contains an ethernet packet and use the struct and/or bitfields to pick apart the ip header. Yes, these rules are widely used and abused and are a ticking time bomb. The primary reason the time bomb only goes off rarely is that the code keeps using the same compiler, and or a couple of vary popular compilers that currently have the same implementation. but struct pointing in general fails very very often, bitfield failures are just a side effect of that, and perhaps because of the horrible text in your book, unions are starting to show up a lot making the time bomb now nuclear instead of conventional.
So if you want to use a struct or a union or a bitfield and have the code actually work without maintenance. Then stay within the same compile domain (one program compiled at the same time with the same compiler and settings), pass structures defined as structures across functions, do not point at memory or other arrays, etc. for unions never access across individually defined items, if a single variable only use that variable until finished completely with it assume it is now trash if you use a struct or other variable in that union. With bitfields each variable is a standalone item independent of the other variables next to it, you are just ATTEMPTING to save memory by using them but you are actually wasting a lot of code overhead, performance, code space by using them in the first place. Keep to that and your code is far more likely to work without maintenance, across compilers. Now if you want to use this as job security and have your code fail to build or function every minor or major release of a compiler, then do those things above, point structs across a compile domain, point bitfields at hardware registers, etc. Other than your boss noting that you write horrible code that breaks often when some other employees dont, you will have to keep maintaining that code on a regular basis for the life of that product.
All the compiler does with your bitfield is generate masks and shifts, if you write those masks and shifts yourself, MASSIVELY more portable, you may still have endian issues (which can actually at times be easily solved in portable endian-less code) but you wont be completely pointing at the wrong thing using masking and shifting. it simply works, it does not produce more code, does not produce slower code, if you really need make macros for everything, using a macro to isolate a field within a "unit" is far more portable than using a bitfield.
Forget you ever read about bitfields or heard about them, never ever use them again. The need for them died decades ago. Unions somewhat fall into the same category as well, they do actually save memory but you have to be careful to share that memory properly.
And I would toss that book as well, if they dont understand this simple topic what else do those authors not understand. As with most folks this may be confusing a popular compilers interpretation with reality.

There is a little 'bit' confusion with the concepts of bit-fields and endians -
Suppose you have an MCU of 32bit - it means that internal memory of device have to be multiplied with a size of 32bits.
Now as you might know, the way each MCU stores the memory is LSB or MSB which is Big Endian and Little Endian respectively,
see here for illustrations: Endians figure
As can be seen, same data: 0x12345678 (32 bit value) is stored differently internally.
When you are reading and writing memory using a 32 bit pointer (trivial) case - the MCU will convert it for you (it doesn't makes any difference between the endians) - the problem arose when you are dealing with a one by one byte manipulation or when exporting (or importing) from other MCU / memory peripheral also suing 8 bit / 1 byte manipulations.
Bit field will be aligned to the Byte, Word and to the Long Word types (as seen) so it can be miss-interpreted when porting to other target.
Hence, the answer your questions:
If it is only one byte that you dividing into bits it will be ported nicely.
if you define a multi-bytes union it will make you goes in trouble.
Answered at the introduction of this answer
See answer no. 1
See the figure I have attached for illustrations
Right in general

1, 2: Not quite: it always depends on the platform (endianess) and the types you are using.
3: Yes they will always land on the same spot in memory.
4: Which alignment do you mean — the memory alignment or the field alignment?

Related

Are there reasons to avoid bit-field structure members?

I long knew there are bit-fields in C and occasionally I use them for defining densely packed structs:
typedef struct Message_s {
unsigned int flag : 1;
unsigned int channel : 4;
unsigned int signal : 11;
} Message;
When I read open source code, I instead often find bit-masks and bit-shifting operations to store and retrieve such information in hand-rolled bit-fields. This is so common that I do not think the authors were not aware of the bit-field syntax, so I wonder if there are reasons to roll bit-fields via bit-masks and shifting operations your own instead of relying on the compiler to generate code for getting and setting such bit-fields.
Why other programmers use hand-coded bit manipulations instead of bitfields to pack multiple fields into a single word?
This answer is opinion based as the question is quite open:
Many programmers are unaware of the availability of bitfields or unsure about their portability and precise semantics. Some even distrust the compiler's ability to produce correct code. They prefer to write explicit code that they understand.
As commented by Cornstalks, this attitude is rooted in real life experience as explained in this article.
Bitfield's actual memory layout is implementation defined: if the memory layout must follow a precise specification, bitfields should not be used and hand-coded bit manipulations may be required.
The handing of signed values in signed typed bitfields is implementation defined. If signed values are packed into a range of bits, it may be more reliable to hand-code the access functions.
Are there reasons to avoid bitfield-structs?
bitfield-structs come with some limitations:
Bit fields result in non-portable code. Also, the bit field length has a high dependency on word size.
Reading (using scanf()) and using pointers on bit fields is not possible due to non-addressability.
Bit fields are used to pack more variables into a smaller data space, but cause the compiler to generate additional code to manipulate these variables. This results in an increase in both space as well as time complexities.
The sizeof() operator cannot be applied to the bit fields, since sizeof() yields the result in bytes and not in bits.
Source
So whether you should use them or not depends. Read more in Why bit endianness is an issue in bitfields?
PS: When to use bit-fields in C?
There is no reason for it. Bitfields are useful and convenient. They are in the common use in the embedded projects. Some architectures (like ARM) have even special instructions to manipulate bitfields.
Just compare the code (and write the rest of the function foo1)
https://godbolt.org/g/72b3vY
In many cases, it is useful to be able to address individual groups of bits within a word, or to operate on a word as a unit. The Standard presently does not provide
any practical and portable way to achieve such functionality. If code is written to use bitfields and it later becomes necessary to access multiple groups as a word, there would be no nice way to accommodate that without reworking all the code using the bit fields or disabling type-based aliasing optimizations, using type punning, and hoping everything gets laid out as expected.
Using shifts and masks may be inelegant, but until C provides a means of treating an explicitly-designated sequence of bits within one lvalue as another lvalue, it is often the best way to ensure that code will be adaptable to meet needs.

The use of __int8_t on some small values on 32bit machine is useless optimization effort?

I have some values on my functions that are < 10, normally. So,if I use __int8_t instead of int to store this values is useless effort of optimization?
Not only can this be a useless optimization, but you may cause integer misalignment (i.e., integers not aligned to 4-byte or 8-byte boundaries depending on the architecture) which may actually slow performance. To get around this, most compilers try to align integers on most architectures, so you may not be saving any space at all, because the compiler adds padding (e.g., for stack based variables and struct members) to properly align larger variables or struct members that follow your 8-bit integer. Since your question looked like it was regarding a local variable in a function and assuming that there was only one such variable and that the function was not recursive, the optimization is most likely unnecessary and you should just use the platform native integer size (i.e., int).
The optimization you mentioned can be worthwhile if you have very many instances of a particular type and you want to consume less memory, by using 8-bit fields instead of say 64-bit fields for small integers in a struct. If this is your intention, be sure to use the correct pragmas and/or compiler switches for your platform, so that structs get properly packed to the smallest size. Another instance where using a smaller integer size can come in handy is when arrays, especially large ones, are involved. Finally, keep in mind that specifying the number of bits in the type is non-portable, as the leading double underscore in its name indicated.
I hazard to mention this, since it appears to only be tangentially related to your question, but you can go even smaller than 8-bit ints in structs, for even smaller ranges of values than 0-255. In this case, bit fields become an option - the epitomy of micro-optimization. Don't use bit fields unless you have actually measured performance in the time and space domains and are sure that there are significant savings to be had. To disuade you further, I present some of the pitfalls of using them in the following paragraph.
Bit fields add more complexity to the development process by forcing you to manually order struct members to pack data into as few bytes as possible - a task that is not only arduous, but also often leads to a completely unintuitive struct member variable ordering. Changes, additions and deletions of member variables often tend to cascade to the members that follow them causing a maintenance problem down the road. To get around this problem, developers often stick padding and alignment bits into the struct, complicating the process further for all but the original developer of the code. Commenting such code stops being a nicety and often becomes a necessity. There is also the problem of developers forgetting that they are dealing with a bit field within the code and treating it as if it had the entire domain of its underlying type, leading to no end of silent and sometimes difficult to debug under/overflows.
Given the above guidelines, profile your application to see whether the optimizations just discussed make sense for your particular compiler(s) and platform(s).
It won't save time, but if you have a million of these things, it will save 3mb of space (and that may save time).

Questions about memory alignement in structures and portability of the sizeof operator

I have a question about structure padding and memory alignment optimizations regarding structures in C language. I am sending a structure over the network, I know that, for run-time optimizations purposes, the memory inside a structure is not contiguous. I've run some tests on my local computer and indeed, sizeof(my_structure) was different than the sum of all my structure members. I ran some research to find out two things :
First, the sizeof() operator retrieves the padded size of the structure (i.e the real size that would be stored in memory).
When specifying __attribute__((__packed__)) in the declaration of the structure this optimization is disabled by the compiler, so sizeof(my_structure) will be exactly the same as the sum of the fields of my structure.
That being said, i am wondering if the sizeof operator was getting the padded size on every compilers implementation and on every architecture, in other words, is it always safe to copy a structure with memcpy for example using the sizeof operator such as :
memcpy(struct_dest, struct_src, sizeof(struct_src));
I am also wondering what is the real purpose of __attribute__((__packed__)), is it used to send a less important amount the data on a network when submitting a structure or is it, in fact, used to avoid some unspecified and platform-dependant sizeof operator behaviour ?
Thanks by advance.
Different compilers on different architectures can and do use different padding. So for wire transmission it is not uncommon to pack structs to achieve a consistent binary layout. This can then cater for the code at each end of the wire running on different architecture.
However you also need to make sure that your data types are the same size if you use this approach. For example, on 64 bit systems, long is 4 bytes on Windows and 8 bytes almost everywhere else. And you also need to deal with endianness issues. The standard is to transmit over the wire in network byte order. In practice you would be better using a dedicated serialization library rather than trying to reinvent solutions to all these issues.
I am sending a structure over the network
Stop there. Perhaps some would disagree with me on this (in practice you do see a lot of projects doing this), but struct is a way of laying out things in memory - it's not a serialization mechanism. By using this tool for the job, you're already tying yourself to a bunch of non-portable assumptions.
Sure, you may be able to fake it with things like structure padding pragmas and attributes, but - can you really? Even with those non-portable mechanisms you never know what quirks might show up. I recall working in a code base where "packed" structures were used, then suddenly taking it to a platform where access had to be word aligned... even though it was nominally the same compiler (thus supported the same proprietary extensions) it produced binaries which crashed. Any pain you get from this path is probably deserved, and I would say only take it if you can be 100% sure it will only run in a given compiler and environment, and that will never change. I'd say the safer bet is to write a proper serialization mechanism that doesn't allow writing structures around across process boundaries.
Is it always safe to copy a structure with memcpy for example using the sizeof operator
Yes, it is and that is the purpose of providing the sizeof operator.
Usually __attribute__((__packed__)) is used not for size considerations but when you want want to to make sure of the layout of a structure is exactly as you want it to be.
For ex:
If a structure is to be used to match hardware or be sent on a wire then it needs to have the exact same layout without any padding.This is because different architectures usually implement different kinds & amounts of padding and alignment and the only way to ensure common ground is to remove padding out out of the picture by using packing.

Reasons to use (or not) stdint

I already know that stdint is used to when you need specific variable sizes for portability between platforms. I don't really have such an issue for now, but what are the cons and pros of using it besides the already shown fact above?
Looking for this on stackoverflow and others sites, I found 2 links that treats about the theme:
codealias.info - this one talks about the portability of the stdint.
stackoverflow - this one is more specific about uint8_t.
These two links are great specially if one is looking to know more about the main reason of this header - portability. But for me, what I like most about it is that I think uint8_t is cleaner than unsigned char (for storing an RBG channel value for example), int32_t looks more meaningful than simply int, etc.
So, my question is, exactly what are the cons and pros of using stdint besides the portability? Should I use it just in some specifics parts of my code, or everywhere? if everywhere, how can I use functions like atoi(), strtok(), etc. with it?
Thanks!
Pros
Using well-defined types makes the code far easier and safer to port, as you won't get any surprises when for example one machine interprets int as 16-bit and another as 32-bit. With stdint.h, what you type is what you get.
Using int etc also makes it hard to detect dangerous type promotions.
Another advantage is that by using int8_t instead of char, you know that you always get a signed 8 bit variable. char can be signed or unsigned, it is implementation-defined behavior and varies between compilers. Therefore, the default char is plain dangerous to use in code that should be portable.
If you want to give the compiler hints of that a variable should be optimized, you can use the uint_fastx_t which tells the compiler to use the fastest possible integer type, at least as large as 'x'. Most of the time this doesn't matter, the compiler is smart enough to make optimizations on type sizes no matter what you have typed in. Between sequence points, the compiler can implicitly change the type to another one than specified, as long as it doesn't affect the result.
Cons
None.
Reference: MISRA-C:2004 rule 6.3."typedefs that indicate size and signedness shall be used in place of the basic types".
EDIT : Removed incorrect example.
The only reason to use uint8_t rather than unsigned char (aside from aesthetic preference) is if you want to document that your program requires char to be exactly 8 bits. uint8_t exists if and only if CHAR_BIT==8, per the requirements of the C standard.
The rest of the intX_t and uintX_t types are useful in the following situations:
reading/writing disk/network (but then you also have to use endian conversion functions)
when you want unsigned wraparound behavior at an exact cutoff (but this can be done more portably with the & operator).
when you're controlling the exact layout of a struct because you need to ensure no padding exists (e.g. for memcmp or hashing purposes).
On the other hand, the uint_least8_t, etc. types are useful anywhere that you want to avoid using wastefully large or slow types but need to ensure that you can store values of a certain magnitude. For example, while long long is at least 64 bits, it might be 128-bit on some machines, and using it when what you need is just a type that can store 64 bit numbers would be very wasteful on such machines. int_least64_t solves the problem.
I would avoid using the [u]int_fastX_t types entirely since they've sometimes changed on a given machine (breaking the ABI) and since the definitions are usually wrong. For instance, on x86_64, the 64-bit integer type is considered the "fast" one for 16-, 32-, and 64-bit values, but while addition, subtraction, and multiplication are exactly the same speed whether you use 32-bit or 64-bit values, division is almost surely slower with larger-than-necessary types, and even if they were the same speed, you're using twice the memory for no benefit.
Finally, note that the arguments some answers have made about the inefficiency of using int32_t for a counter when it's not the native integer size are technically mostly correct, but it's irrelevant to correct code. Unless you're counting some small number of things where the maximum count is under your control, or some external (not in your program's memory) thing where the count might be astronomical, the correct type for a count is almost always size_t. This is why all the standard C functions use size_t for counts. Don't consider using anything else unless you have a very good reason.
cons
The primary reason the C language does not specify the size of int or long, etc. is for computational efficiency. Each architecture has a natural, most-efficient size, and the designers specifically empowered and intended the compiler implementor to use the natural native data size data for speed and code size efficiency.
In years past, communication with other machines was not a primary concern—most programs were local to the machine—so the predictability of each data type's size was of little concern.
Insisting that a particular architecture use a particular size int to count with is a really bad idea, even though it would seem to make other things easier.
In a way, thanks to XML and its brethren, data type size again is no longer much of a concern. Shipping machine-specific binary structures from machine to machine is again the exception rather than the rule.
I use stdint types for one reason only, when the data I hold in memory shall go on disk/network/descriptor in binary form. You only have to fight the little-endian/big-endian issue but that's relatively easy to overcome.
The obvious reason not to use stdint is when the code is size-independent, in maths terms everything that works over the rational integers. It would produce ugly code duplicates if you provided a uint*_t version of, say, qsort() for every expansion of *.
I use my own types in that case, derived from size_t when I'm lazy or the largest supported unsigned integer on the platform when I'm not.
Edit, because I ran into this issue earlier:
I think it's noteworthy that at least uint8_t, uint32_t and uint64_t are broken in Solaris 2.5.1.
So for maximum portability I still suggest avoiding stdint.h (at least for the next few years).

#defined bitflags and enums - peaceful coexistence in "c"

I have just discovered the joy of bitflags. I have several questions related to "best-practices" regarding the use of bitflags in C. I learned everything from various examples I found on the web but still have questions.
In order to save space, I am using a single 32bit integer field in a struct (A->flag) to represent several different sets of boolean properties. In all, 20 different bits are #defined. Some of these are truly presence/absence flags (STORAGE-INTERNAL vs. STORAGE-EXTERNAL). Others have more than two values (e.g. mutually exclusive set of formats: FORMAT-A, FORMAT-B, FORMAT-C). I have defined macros for setting specific bits (and simultaneously turning off mutually exclusive bits). I have also defined macros for testing if specific combination of bits are set in the flag.
However, what is lost in the above approach is the specific grouping of flags that is best captured by enums. For writing functions, I would like to use enums (e.g., STORAGE-TYPE and FORMAT-TYPE), so that function definitions look nice. I expect to use enums only for passing parameters and #defined macros for setting and testing flags.
(a) How do I define flag (A->flag) as a 32 bit integer in a portable fashion (across 32 bit / 64 bit platforms)?
(b) Should I worry about potential size differences in how A->flag vs. #defined constants vs. enums are stored?
(c) Am I making things unnecessarily complicated, meaning should I just stick to using #defined constants for passing parameters as ordinary ints? What else should I worry about in all this?
I apologize for the poorly articulated question. It reflects my ignorance about potential issues.
There is a C99 header that was intended to solve that exact problem (a) but for some reason Microsoft doesn't implement it. Fortunately, you can get <stdint.h> for Microsoft Windows here. Every other platform will already have it. The 32-bit int types are uint32_t and int32_t. These also come in 8, 16, and 64- bit flavors.
So, that takes care of (a).
(b) and (c) are kind of the same question. We do make assumptions whenever we develop something. You assume that C will be available. You assume that <stdint.h> can be found somewhere. You could always assume that int was at least 16 bits and now a >= 32 bit assumption is fairly reasonable.
In general, you should try to write conforming programs that don't depend on layout, but they will make assumptions about word length. You should worry about performance at the algorithm level, that is, am I writing something that is quadratic, polynomial, exponential?
You should not worry about performance at the operation level until (a) you notice a performance lag, and (b) you have profiled your program. You need to get your job done without bogging down worrying about individual operations. :-)
Oh, I should add that you particularly don't need to worry about low level operation performance when you are writing the program in C in the first place. C is the close-to-the-metal go-as-fast-as-possible language. We routinely write stuff in php, python, ruby, or lisp because we want a powerful language and the CPU's are so fast these days that we can get away with an entire interpreter, never mind a not-perfect choice of bit-twiddle-word-length ops. :-)
You can use bit-fields and let the compiler do the bit twiddling. For example:
struct PropertySet {
unsigned internal_storage : 1;
unsigned format : 4;
};
int main(void) {
struct PropertySet x;
struct PropertySet y[10]; /* array of structures containing bit-fields */
if (x.internal_storage) x.format |= 2;
if (y[2].internal_storage) y[2].format |= 2;
return 0;
}
Edited to add array of structures
As others have said, your problem (a) is resolvable by using <stdint.h> and either uint32_t or uint_least32_t (if you want to worry about Burroughs mainframes which have 36-bit words). Note that MSVC does not support C99, but #DigitalRoss shows where you can obtain a suitable header to use with MSVC.
Your problem (b) is not an issue; C will type convert safely for you if it is necessary, but it probably isn't even necessary.
The area of most concern is (c) and in particular the format sub-field. There, 3 values are valid. You can handle this by allocating 3 bits and requiring that the 3-bit field is one of the values 1, 2, or 4 (any other value is invalid because of too many or too few bits set). Or you could allocate a 2-bit number, and specify that either 0 or 3 (or, if you really want to, one of 1 or 2) is invalid. The first approach uses one more bit (not currently a problem since you're only using 20 of 32 bits) but is a pure bitflag approach.
When writing function calls, there is no particular problem writing:
some_function(FORMAT_A | STORAGE_INTERNAL, ...);
This will work whether FORMAT_A is a #define or an enum (as long as you specify the enum value correctly). The called code should check whether the caller had a lapse in concentration and wrote:
some_function(FORMAT_A | FORMAT_B, ...);
But that is an internal check for the module to worry about, not a check for users of the module to worry about.
If people are going to be switching bits in the flags member around a lot, the macros for setting and unsetting the format field might be beneficial. Some might argue that any pure boolean fields barely need it, though (and I'd sympathize). It might be best to treat the flags member as opaque and provide 'functions' (or macros) to get or set all the fields. The less people can get wrong, the less will go wrong.
Consider whether using bit-fields works for you. My experience is that they lead to big code and not necessarily very efficient code; YMMV.
Hmmm...nothing very definitive here, so far.
I would use enums for everything because those are guaranteed to be visible in a debugger where #define values are not.
I would probably not provide macros to get or set bits, but I'm a cruel person at times.
I would provide guidance on how to set the format part of the flags field, and might provide a macro to do that.
Like this, perhaps:
enum { ..., FORMAT_A = 0x0010, FORMAT_B = 0x0020, FORMAT_C = 0x0040, ... };
enum { FORMAT_MASK = FORMAT_A | FORMAT_B | FORMAT_C };
#define SET_FORMAT(flag, newval) (((flag) & ~FORMAT_MASK) | (newval))
#define GET_FORMAT(flag) ((flag) & FORMAT_MASK)
SET_FORMAT is safe if used accurately but horrid if abused. One advantage of the macros is that you could replace them with a function that validated things thoroughly if necessary; this works well if people use the macros consistently.
For question a, if you are using C99 (you probably are using it), you can use the uint32_t predefined type (or, if it is not predefined, it can be found in the stdint.h header file).
Regarding (c): if your enumerations are defined correctly you should be able to pass them as arguments without a problem. A few things to consider:
enumeration storage is often
compiler specific, so depending on
what kind of development you are
doing (you don't mention if it's
Windows vs. Linux vs. embedded vs.
embedded Linux :) ) you may want to
visit compiler options for enum
storage to make sure there are no
issues there. I generally agree with
the above consensus that the
compiler should cast your
enumerations appropriately - but
it's something to be aware of.
in the case that you are doing
embedded work, many static quality
checking programs such as PC Lint
will "bark" if you start getting too
clever with enums, #defines, and
bitfields. If you are doing
development that will need to pass
through any quality gates, this
might be something to keep in mind.
In fact, some automotive standards
(such as MISRA-C) get downright
irritable if you try to get trig
with bitfields.
"I have just discovered the joy of
bitflags." I agree with you - I find
them very useful.
I added comments to each answer above. I think I have some clarity. It seems enums are cleaner as it shows up in debugger and keeps fields separate. macros can be used for setting and getting values.
I have also read that enums are stored as small integers - which as I understand it, is not a problem with the boolean tests as these would be peroformed starting at the right most bits. But, can enums be used to store large integers (1 << 21)??
thanks again to you all. I have already learned more than I did two days ago!!
~Russ

Resources