I am in the early stages of framing out a new project.
I defined a function with a return type of "bool".
I got this output from PC-Lint:
Including file sockets.h (hdr)
bool sock_close(uint8_t socket_id);
^
"LINT: sockets.h (52, 1) Note 970: Use of modifier or type '_Bool' outside of a typedef [MISRA 2012 Directive 4.6, advisory]"
I went ahead and defined this in another header to shut lint up:
typedef bool bool_t;
Then I started wondering why I had to do that and why it changed anything. I turned to MISRA 2012 Dir 4.6. It is mostly concerned with the width of primitive types like short, int, and long, and with whether they are signed.
The standard does not give any amplification, rationale, exception, or example for bool.
bool is explicitly defined as _Bool in stdbool.h in C99. So does this criterion really apply to bool?
I thought _Bool was explicitly always the "smallest standard unsigned integer type large enough to store the values 0 and 1" according to section 6.2.5 of C99. So we know bool is unsigned. Is it then just that _Bool is not fixed-width and is subject to being promoted somehow? Because the rationale would seem to contradict that notion:
Adherence to this guideline does not guarantee portability because the size of the int type may determine whether or not an expression is subject to integer promotion.
How does just putting typedef bool bool_t; change anything, given that I do nothing to indicate the width or the signedness in doing so? The width of bool_t will just be platform-dependent too. Is there a better way to redefine bool?
A type must not be defined with a specific length unless the implemented type is actually of that length
so typedef bool bool8_t; should be totally illegal.
Is Gimpel wrong in their interpretation of Directive 4.6 or are they spot on?
Use of modifier or type '_Bool' outside of a typedef [MISRA 2012 Directive 4.6, advisory]
That's nonsense; directive 4.6 is only concerned with using the types from stdint.h rather than int, short, etc. The directive is about the basic numerical types. bool has nothing to do with that directive whatsoever, as it is not a numerical type.
For reasons unknown, the MISRA-C:2012 examples use a weird type called bool_t, which isn't standard. But MISRA by no means enforces this type to be used anywhere; in particular, it is not required by directive 4.6, which doesn't even mention booleans. MISRA does not discourage the use of bool or _Bool anywhere.
Is Gimpel wrong in their interpretation of Directive 4.6
Yes, their tool is giving incorrect diagnostics.
In addition, you may have to configure the tool (if possible) to tell it which bool type is used. Section 5.3.2 mentions that you might have to do so if not using _Bool, implying that all static analysers must understand _Bool. But even if the bool type is correctly configured, Dir 4.6 has nothing to do with it.
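For what it's worth, here is a minimal sketch of the kind of definition the MISRA-C:2012 examples seem to assume; this is only an illustration, not something any directive requires, and sock_close is taken from the question above:
#include <stdbool.h>
#include <stdint.h>

/* bool_t mirrors the non-standard helper type seen in the MISRA-C:2012
 * examples; nothing in Dir 4.6 requires it, it merely silences tools
 * that misapply the directive to _Bool. */
typedef bool bool_t;

bool_t sock_close(uint8_t socket_id);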
A potential concern with Boolean types is that a lot of code prior to C99 used a single-byte type to hold true/false values, and a fair amount of it may have used the name "bool". Attempting to store any multiple of 256 into most such types would be regarded as storing zero, while storing a non-zero multiple of 256 into a C99 "bool" would yield 1. If a piece of code which uses a C99 "bool" is ported into a codebase that uses a typedef'ed byte, the result could very easily malfunction (it's somewhat less likely that code written for a typedef'ed byte would rely upon any particular behavior when storing a value other than 0 or 1).
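To illustrate the difference, here is a small sketch; legacy_bool is a made-up name standing in for a pre-C99 home-brewed boolean, and an 8-bit unsigned char is assumed:
#include <stdio.h>
#include <stdbool.h>

typedef unsigned char legacy_bool; /* hypothetical pre-C99 home-brewed boolean */

int main(void) {
    legacy_bool lb = (legacy_bool)256; /* truncates modulo 256: stores 0 */
    bool cb = 256;                     /* C99 _Bool: any nonzero value becomes 1 */
    printf("legacy: %d, C99: %d\n", (int)lb, (int)cb); /* prints "legacy: 0, C99: 1" */
    return 0;
}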
Related
The C11 spec on enums1 states that enumeration constants must have type int (1440-1441):
1440 The expression that defines the value of an enumeration constant shall be an integer constant expression that has a value representable as an int.
1441 The identifiers in an enumerator list are declared as constants that have type int and may appear wherever such are permitted.107)
However, it indicates that the backing type of the enum can be either a signed int, an unsigned int, or a char, so long as it fits the range of constants in the enum (1447-1448):
1447 Each enumerated type shall be compatible with char, a signed integer type, or an unsigned integer type.
1448 The choice of type is implementation-defined,108) but shall be capable of representing the values of all the members of the enumeration.
This seems to indicate that only the compiler can know the width of an enum type, which is fine until you consider an array of enum types as part of a dynamically linked library.
Say you had a function:
enum my_enum return_fifth(enum my_enum lst[]) {
    return lst[5];
}
This would be fine when linked statically, because the compiler knows the size of a my_enum, but any other C code linking to it may not.
So, how is it possible for one C library to dynamically link to another C library and know how the compiler decided to implement the enums? (Or do most modern compilers just stick with int/uint and forgo using chars altogether?)
1 Okay, I know this website is not quite the C11 standard, whereas this one is a bit closer: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf
The C standard doesn't say anything about dynamic libraries or even static libraries; these concepts don't exist in the standard. This is in the implementation-defined behavior domain.
But, as you said, nothing prevents a compiler from using a different type for an enumeration, which means one compiler can use one type and another compiler a different type.
This would be fine when linked to statically
In fact, no. Let's say that A is a compiler that uses char and B is a compiler that uses int, and let's say these types are not the same size. You compile a static library with compiler A, and you statically link this library into a program compiled by B. The linking is static, and still B can't know that A didn't use B's type for the enum.
So, how is it possible for one C library to dynamically link to another C library
Well, as I said, this is not possible for a static library, and for the same reason, it is not possible for a dynamic library.
Or do most modern compilers just stick with int/uint and forgo using chars altogether?
Major compilers generally coordinate so as to use the same rules within the same environment, yes. But nothing in C guarantees this behavior. (On the compatibility problem: a lot of people refer to "the C ABI", despite the fact that it doesn't exist in the standard.)
So the best advice is to compile your dynamic library with the same compiler and options used to compile your main program; checking your compiler's documentation is a big plus as well.
The complete definition of the enum needs to be visible at the time it is used. Then the compiler will know what the size will be.
Forward declarations of enum types are not allowed.
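Putting those two answers together, here is a sketch of what "visible complete definition" means in practice; my_enum and return_fifth come from the question, while the header name and constant names are made up:
/* my_enum.h -- hypothetical shared header: both the library and every
 * client must see this complete definition, and both sides must be
 * built by toolchains that agree on the enum's representation (the ABI). */
enum my_enum { E_A, E_B, E_C, E_D, E_E, E_F };

enum my_enum return_fifth(enum my_enum lst[]);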
Suppose I have
typedef struct {
    unsigned short bar : 1;
} foo_bf;

typedef union {
    unsigned short val;
    foo_bf bf;
} foo_t;
How do I correctly assign a value to this bit-field from a type such as uint16_t?
uint16_t myValue = 1;
foo_t foo;
foo.bf.bar = myValue;
Running PC-Lint, this turns into a MISRA error:
Expression assigned to a narrower or different essential type.
I tried to limit the number of used bits without any success.
foo.bf.bar = (myValue & 0x1U);
Is there any chance to get this MISRA compliant if I have to use a uint16_t value as the origin?
MISRA-C's essential type model isn't really applicable to bit-fields. The terms narrower and wider refer to the size in bytes (see 8.10.2). So it isn't obvious if a static analyser should warn here or not, since the rules for essential type do not address bit-fields.
EDIT: I was wrong here, see the answer by Andrew. Appendix D.4 tells how to translate a bit-field type to the matching essential type category.
However, using bit-fields in a MISRA-C application is a bad idea. Bit-fields are very poorly specified by the standard, and therefore non-deterministic and unreliable. Also, MISRA-C 6.1 requires that you document how your compiler supports bit-fields with uint16_t, as that is not one of the standard integer types allowed for bit-fields.
But the real deal-breaker here is Directive 1.1, which requires that all implementation-defined behavior is documented and understood. For a MISRA-C implementation, I once actually tried to document all implementation-defined aspects of bit-fields. Soon I found myself writing a whole essay, because there are so many problems with them. See this for the top of the iceberg.
The work-around for not having to write such a "bit-field behavior book" is to unconditionally ban the use of bit-fields entirely in your coding standard. They are a 100% superfluous feature anyway. Use bit-wise operators instead.
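As an illustration of that advice, here is a sketch of the same 1-bit flag done with bit-wise operators instead of a bit-field; the mask, shift, and function names are made up:
#include <stdint.h>

#define BAR_SHIFT 0u
#define BAR_MASK  ((uint16_t)0x0001u)

/* Update the "bar" bit inside a plain uint16_t register image. */
static uint16_t set_bar(uint16_t reg, uint16_t value)
{
    reg = (uint16_t)(reg & (uint16_t)~BAR_MASK);                   /* clear the bit */
    reg = (uint16_t)(reg | ((value << BAR_SHIFT) & BAR_MASK));     /* write the new bit */
    return reg;
}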
Appendix D.4 of MISRA C:2012 is usefully titled "The essential types of bit fields"
For a bit-field which is implemented with an essentially Boolean type, it is essentially Boolean
For a bit-field which is implemented with a signed type, it is the Signed Type of Lowest Rank which is able to represent the bit field
For a bit-field which is implemented with a unsigned type, it is the Unsigned Type of Lowest Rank which is able to represent the bit field
The Unsigned Type of Lowest Rank of a single-bit unsigned integer would be uint8_t (aka unsigned char) - assuming that the tool does not interpret a single bit as being boolean...
Beyond observing that this looks like a mis-diagnosis by PC-Lint, a workaround that avoids any possibility of doubt would be to cast:
foo.bf.bar = (uint8_t)myValue;
As an aside MISRA C:2012 Rule 6.1 gives guidance on the use of types other than signed/unsigned int for bit-fields...
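For completeness, a sketch of a declaration that stays within Rule 6.1's default set of types, swapping unsigned short for unsigned int, which per the answer above needs no extra documentation (whether the resulting storage unit still matches your hardware layout is a separate question; the type name is made up):
typedef struct {
    unsigned int bar : 1; /* unsigned int is an always-acceptable bit-field type */
} foo_bf_misra;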
Any time I needed a Boolean type, I was told to either create one or, better yet, use stdbool.h.
Since stdbool.h simply defines bool as _Bool (#define bool _Bool), is there a reason to use the header instead of just using the type _Bool? Is it just for the additional macros (#define true 1 and #define false 0)?
The obvious type to add into the language was bool. But unfortunately, plenty of code was written that included bool in other shapes and forms. Recall that support for a boolean type was added only in C99.
So the C language committee had no choice but to pull out a reserved identifier for it (_Bool). But, since the obvious choice of type name is still the same, stdbool.h was added to allow users the obvious name. That way, if your code didn't have a home-brewed bool, you could use the built in one.
So do indeed use stdbool.h if you aren't bound to some existing home-brewed bool. It will be the standard type, with all the benefits that type brings in.
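A trivial sketch of the recommended usage:
#include <stdbool.h>
#include <stdio.h>

int main(void) {
    bool ready = true; /* bool, true and false all come from stdbool.h */
    if (ready) {
        puts("ready");
    }
    return 0;
}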
The common practice has always been to use bool, but when the type was officially introduced into the standard in C99, they didn't want to break the "roll-your-own" implementations. So they made the type _Bool as kind of a hack around the unofficial bools. Now there's no type name collision. Anyway, the point is: use bool unless a legacy codebase breaks because of it.
They are the same: bool is an alias for _Bool.
Before C99 we didn't have this type. (Earlier, usage was limited to an integer type with 0 as false and 1 as true.)
You don't have to use it. You can even #undef bool (though it is recommended not to do so). But including stdbool.h, with bool as the alias of _Bool, is good, because then if bool someday becomes reserved your code already complies.1
1. You can use bool in other ways, but it is better not to. stdbool.h was introduced with the plan of gradually making it standard, after which an even stricter rule would apply: bool couldn't be used for anything else and would be reserved as a keyword.
I am using a C library provided to me already compiled. I have limited information on the compiler, version, options, etc., used when compiling the library. The library interface uses enum both in structures that are passed and directly as passed parameters.
The question is: how can I assure or establish that when I compile code to use the provided library, that my compiler will use the same size for those enums? If it does not, the structures won't line up, and the parameter passing may be messed up, e.g. long vs. int.
My concern stems from the C99 standard, which states that the enum type:
shall be compatible with char, a signed integer type, or an unsigned
integer type. The choice of type is implementation-defined, but shall
be capable of representing the values of all the members of the
enumeration.
As far as I can tell, so long as the largest value fits, the compiler can pick any type it darn well pleases, effectively on a whim, potentially varying not only between compilers, but different versions of the same compiler and/or compiler options. It could pick 1, 2, 4, or 8-byte representations, resulting in potential incompatibilities in both structures and parameter passing. (It could also pick signed or unsigned, but I don't see a mechanism for that being a problem in this context.)
Am I missing something here? If I am not missing something, does this mean that enum should never be used in an API?
Update:
Yes, I was missing something. While the language specification doesn't help here, as noted by @Barmar the Application Binary Interface (ABI) does. Or if it doesn't, then the ABI is deficient. The ABI for my system indeed specifies that an enum must be a signed four-byte integer. If a compiler does not obey that, then it is a bug. Given a complete ABI and compliant compilers, enum can be used safely in an API.
APIs that use enum are depending on the assumption that the compiler will be consistent, i.e. given the same enum declaration, it will always choose the same underlying type.
While the language standard doesn't specifically require this, it would be quite perverse for a compiler to do anything else.
Furthermore, all compilers for a particular OS need to be consistent with the OS's ABI. Otherwise, you would have far more problems, such as the library using 64-bit int while the caller uses 32-bit int. Ideally, the ABI should constrain the representation of enums, to ensure compatibility.
More generally, the language specification only ensures compatibility between programs compiled with the same implementation. The ABI ensures compatibility between programs compiled with different implementations.
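If your toolchain supports C11, one defensive measure is to fail the build when the compiler disagrees with the ABI. A sketch, where my_enum and its constants are hypothetical and the 4-byte requirement comes from the update above:
#include <assert.h> /* C11: provides the static_assert convenience macro */

enum my_enum { EV_FIRST = 0, EV_LAST = 100 }; /* hypothetical API enum */

static_assert(sizeof(enum my_enum) == 4,
              "ABI requires enum my_enum to be a four-byte integer");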
From the question:
The ABI for my system indeed specifies that an enum must be a signed four-byte integer. If a compiler does not obey that, then it is a bug.
I'm surprised about that. I suspect that in reality your compiler will select a 64-bit (8-byte) size for your enum if you define an enumerated constant with a value larger than 2^32.
On my platforms (MinGW gcc 4.6.2 targeting x86 and gcc 4.4 on Linux targeting x86_64), the following code says that I get both 4- and 8-byte enums:
#include <stdio.h>

enum { a } foo;
enum { b = 0x123456789 } bar;

int main(void) {
    printf("%lu\n", (unsigned long)sizeof(foo));
    printf("%lu\n", (unsigned long)sizeof(bar));
    return 0;
}
I compiled with -Wall -std=c99 switches.
I guess you could say that this is a compiler bug. But the alternatives of removing support for enumerated constants larger than 2^32 or always using 8-byte enums both seem undesirable.
Given that these common versions of GCC don't provide a fixed size enum, I think the only safe action in general is to not use enums in APIs.
Further notes for GCC
Compiling with "-pedantic" causes the following warnings to be generated:
main.c:4:8: warning: integer constant is too large for 'long' type [-Wlong-long]
main.c:4:12: warning: ISO C restricts enumerator values to range of 'int' [-pedantic]
The behavior can be tailored via the -fshort-enums and -fno-short-enums switches.
Results with Visual Studio
Compiling the above code with VS 2008 x86 causes the following warnings:
warning C4341: 'b' : signed value is out of range for enum constant
warning C4309: 'initializing' : truncation of constant value
And with VS 2013 x86 and x64, just:
warning C4309: 'initializing' : truncation of constant value
I am working with an embedded device with 32K of memory, writing in plain C using IAR EWARM v6.30.
To make the code more readable I would like to define some enum types, for example something like
enum { RIGHT_BUTTON, CENTER_BUTTON, LEFT_BUTTON };
instead of using the values 0, 1, 2, but I am afraid it will take additional memory, which is already scarce.
So I have 2 questions:
1) Can I force an enum to be of short or byte type instead of int?
2) What is the exact memory footprint of defining an enum type?
In fully compliant ISO C the size and type of an enum constant is that of signed int. Some embedded systems compilers deliberately do not comply with that as an optimisation or extension.
In ISO C++ "The underlying type of an enumeration is an integral type that can represent all the enumerator values defined in the enumeration.", so a compiler is free to use the smallest possible type, and most do, but are not obliged to do so.
In your case (IAR EWARM), the manual makes clear that the compiler by default chooses the smallest type capable of holding the enum's values. No option is required; in fact, you'd need to use --enum_is_int to force compliant behaviour. Other compilers may behave differently or have different extensions, pragmas or options to control this. Such things will normally be defined in the documentation.
If you really need to keep the data size down to a char, then you can always use a set of #define constant values to represent the enum states, and only ever use these values in your assignments and tests.
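A sketch of that approach, reusing the button names from the question:
#define RIGHT_BUTTON  0u
#define CENTER_BUTTON 1u
#define LEFT_BUTTON   2u

unsigned char pressed = CENTER_BUTTON; /* exactly one byte of storage */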
For a conforming compiler, an enumerated constant is always of type int (equivalently, signed int). But such constants aren't typically stored in memory, so their type probably won't have much effect on memory requirements.
A declared object of the enumerated type is of the enumerated type itself, which is compatible with char or with some signed or unsigned integer type. The choice of type is implementation-defined (i.e., the compiler gets to choose, but it must document how it makes the choice); the only requirement is that the type has to be capable of storing the values of all the constants.
It's admittedly odd that the constants are of type int rather than the enumerated type, but that's how the language is defined (the reasons are historical, and C++ has different rules).
For example, given:
enum foo { x, y, z };
enum foo obj;
obj = z;
the expression z is of type int and has the value 2 (just like the decimal constant 2), but the object obj is of type enum foo and may be as small as one byte, depending on the compiler. The assignment obj = z; involves an implicit conversion from int to enum foo (that conversion may or may not require additional code).
Some compilers may provide some non-standard way to specify the type to be chosen for an enumerated type. Some may even violate the standard in some way. Consult your compiler's documentation, print out the value of sizeof (enum foo), and, if necessary, examine the generated code.
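A quick way to do that check, as a sketch using the enum foo from above (%zu requires C99):
#include <stdio.h>

enum foo { x, y, z };

int main(void) {
    printf("sizeof(enum foo) = %zu\n", sizeof(enum foo));
    return 0;
}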
It's likely that your compiler will make reasonable decisions within the constraints imposed by the language. For a compiler targeted at memory-poor embedded systems, it's particularly likely that the compiler will either choose a small type, or will let you specify one. Consult your compiler's documentation.
As Ian's answer suggests, if you want to control memory usage yourself, you can use char or unsigned char objects. You can still use an enum definition to define the constants, though. For example:
enum { x, y, z }; // No tag, so you can't declare objects of this type
typedef unsigned char foo; // a foo object is guaranteed to be 1 byte
foo obj = z;
Reference: section 6.7.2.2 of the C standard. The link is to a 1.7-megabyte PDF of a recent draft of the 2011 ISO C standard; this particular section hasn't changed significantly since 1989.
An ANSI C compiler will always use an int to represent variables of an enum type.
http://en.wikipedia.org/wiki/Enumerated_type#C_and_syntactically_similar_languages
One option would be to use the enum to define the values, but cast to char when the values are actually used:
char value = (char)RIGHT_BUTTON;