How to avoid pedantic warnings while using hexadecimal in an enum? - C

I have an enum like this
typedef enum {
FIRST,
SECOND,
THIRD = 0X80000001,
FOURTH,
FIFTH,
} STATUS;
I am getting a pedantic warning since I am compiling my files with the option -Wpedantic:
warning: ISO C restricts enumerator values to range of 'int' [-Wpedantic]
I found that it occurs because the hex value 0X80000001 exceeds the range of a signed 32-bit int. My purpose is to have continuous hex values as the status codes in the enum, without this warning.
I cannot use macros, since that would defeat the purpose of having the enum in the first place. What code change will avoid this warning?

Enumeration constants are guaranteed to have type (signed) int, so their values must be representable as int. Apparently your system uses 32-bit int, so an unsigned hex literal larger than 0x7FFFFFFF will not fit.
So the warning is not just "pedantic": it hints at a possibly severe bug. Note that -pedantic in GCC does not mean "be picky and give me unimportant warnings" but rather "ensure that my code actually follows the C standard".
It appears that you want a list of bit masks or hardware addresses, or some other hardware-related values. enum is unsuitable for such tasks, because in hardware-related programming you rarely ever want signed types, but always unsigned ones.
If you must have a safe and portable program, then there is no elegant way to do this. C is a language with a lot of flaws, the way enum is defined by the standard is one of them.
One work-around is to use some sort of "poor man's enum", such as:
typedef uint32_t STATUS;
#define THIRD 0X80000001
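Spelled out for all five status codes from the question, a self-contained version might look like this (a sketch; the u suffixes keep the constants explicitly unsigned):
#include <stdint.h>

typedef uint32_t STATUS;

#define FIRST 0x00000000u
#define SECOND 0x00000001u
#define THIRD 0x80000001u
#define FOURTH 0x80000002u
#define FIFTH 0x80000003u

STATUS s = THIRD; /* no -Wpedantic warning: these are plain unsigned constants */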
If you must also have the increased type safety of an enum, then you could possibly use a struct:
typedef struct
{
uint32_t value;
} STATUS;
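If you go this route, the constants can be written as C99 compound literals. A minimal sketch (STATUS_THIRD and report are made-up names for illustration):
#include <stdint.h>

typedef struct
{
uint32_t value;
} STATUS;

#define STATUS_THIRD ((STATUS){ .value = 0x80000001u })

void report(STATUS s); /* calling report(0x80000001u) is now a constraint violation */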
Or alternatively, just declare an array of constants and use an enum to define the array index. Probably the cleanest solution, though it adds a little overhead:
typedef enum {
FIRST,
SECOND,
THIRD,
FOURTH,
FIFTH,
STATUS_N
} STATUS;
const uint32_t STATUS_DATA [STATUS_N] =
{
0,
1,
0X80000001,
0X80000002,
0X80000003
};
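A status value is then looked up through the enum index (a short usage sketch):
STATUS s = THIRD;
uint32_t value = STATUS_DATA[s]; /* 0x80000001 */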

Related

How to force enum size with GCC? [duplicate]

Already read through this related question, but was looking for something a little more specific.
Is there a way to tell your compiler specifically how wide you want your enum to be?
If so, how do you do it? I know how to specify it in C#; is it similarly done in C?
Would it even be worth doing? When the enum value is passed to a function, will it be passed as an int-sized value regardless?
I believe there is a flag if you are using GCC.
-fshort-enums
Is there a way to tell your compiler specifically how wide you want your enum to be?
In the general case, no. Not in standard C.
Would it even be worth doing?
It depends on the context. If you are talking about passing parameters to functions, then no, it is not worth doing (see below). If it is about saving memory when building aggregates from enum types, then it might be worth doing. However, in C you can simply use a suitably-sized integer type instead of enum type in aggregates. In C (as opposed to C++) enum types and integer types are almost always interchangeable.
When the enum value is passed to a function, will it be passed as an int-sized value regardless?
Many (most) compilers these days pass all parameters as values of natural word size for the given hardware platform. For example, on a 64-bit platform many compilers will pass all parameters as 64-bit values, regardless of their actual size, even if type int has 32 bits in it on that platform (so, it is not generally passed as "int-sized" value on such a platform). For this reason, it makes no sense to try to optimize enum sizes for parameter passing purposes.
You can force it to be at least a certain size by defining an appropriate value. For example, if you want your enum to be stored as the same size as an int, even though all the values would fit in a char, you can do something like this:
#include <limits.h>

typedef enum {
firstValue = 1,
secondValue = 2,
Internal_ForceMyEnumIntSize = INT_MAX
} MyEnum;
Note, however, that the behavior can be dependent on the implementation.
As you note, passing such a value to a function will cause it to be expanded to an int anyway, but if you are using your type in an array or a struct, then the size will matter. If you really care about element sizes, use types like int8_t, int32_t, etc.
Even if you are writing strict C code, the results are going to be compiler dependent. Employing the strategies from this thread, I got some interesting results...
enum_size.c
#include <stdio.h>
enum __attribute__((__packed__)) PackedFlags {
PACKED = 0b00000001,
};
enum UnpackedFlags {
UNPACKED = 0b00000001,
};
int main (int argc, char * argv[]) {
printf("packed:\t\t%lu\n", sizeof(PACKED));
printf("unpacked:\t%lu\n", sizeof(UNPACKED));
return 0;
}
$ gcc enum_size.c
$ ./a.out
packed: 4
unpacked: 4
$ gcc enum_size.c -fshort-enums
$ ./a.out
packed: 4
unpacked: 4
$ g++ enum_size.c
$ ./a.out
packed: 1
unpacked: 4
$ g++ enum_size.c -fshort-enums
$ ./a.out
packed: 1
unpacked: 1
In my example above, I did not realize any benefit from the __attribute__((__packed__)) modifier until I started using the C++ compiler.
EDIT:
@technosaurus's suspicion was correct.
By checking sizeof(enum PackedFlags) instead of sizeof(PACKED) I see the results I had expected:
printf("packed:\t\t%zu\n", sizeof(enum PackedFlags));
printf("unpacked:\t%zu\n", sizeof(enum UnpackedFlags));
I now see the expected results from gcc:
$ gcc enum_size.c
$ ./a.out
packed: 1
unpacked: 4
$ gcc enum_size.c -fshort-enums
$ ./a.out
packed: 1
unpacked: 1
There is also another way if the enum is part of a structure:
#include <limits.h>

enum whatever { a, b, c, d };

struct something {
char :0;
enum whatever field:CHAR_BIT;
char :0;
};
The :0; can be omitted if the enum field is surrounded by normal fields. If there's another bit-field before it, the :0 forces byte alignment, so the field following it starts at the next byte.
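A quick way to check the effect (a sketch; note that bit-fields of enum type are themselves implementation-defined, though GCC accepts them):
#include <limits.h>
#include <stdio.h>

enum whatever { a, b, c, d };

struct something {
char :0;
enum whatever field:CHAR_BIT;
char :0;
};

int main(void)
{
printf("%zu\n", sizeof(struct something)); /* typically prints 1 */
return 0;
}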
In some circumstances, this may be helpful:
#include <stdint.h>

typedef uint8_t command_t;
enum command_enum
{
CMD_IDENT = 0x00, //!< Identify command
CMD_SCENE_0 = 0x10, //!< Recall Scene 0 command
CMD_SCENE_1 = 0x11, //!< Recall Scene 1 command
CMD_SCENE_2 = 0x12, //!< Recall Scene 2 command
};
/* cmdVariable is 8 bits (1 byte) in size */
command_t cmdVariable = CMD_IDENT;
On one hand, type command_t has size 1 (8 bits) and can be used for variables and function parameter types.
On the other hand, you can still use the enum values for assignment; they are of type int by default, but the compiler converts them immediately when they are assigned to a variable of type command_t.
Also, if you do something unsafe like defining and using CMD_16bit = 0xFFFF, the compiler will warn you with the following message:
warning: large integer implicitly truncated to unsigned type [-Woverflow]
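For example, this assignment is enough to trigger it (a sketch; the exact wording varies between GCC versions):
command_t bad = 0xFFFF; /* -Woverflow: 65535 does not fit in 8 bits */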
As @Nyx0uf says, GCC has a flag which you can set:
-fshort-enums
Allocate to an enum type only as many bytes as it needs for the declared range of possible values. Specifically, the enum type is equivalent to the smallest integer type that has enough room.
Warning: the -fshort-enums switch causes GCC to generate code that is not binary compatible with code generated without that switch. Use it to conform to a non-default application binary interface.
Source: https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html
Additional great reading for general insight: https://www.embedded.fm/blog/2016/6/28/how-big-is-an-enum.
Adding an enum entry called ARM_EXCEPTION_MAKE_ENUM_32_BIT with a value equal to 0xffffffff, which is the equivalent of UINT32_MAX from stdint.h, forces this particular Arm_symbolic_exception_name enum to have an integer type of uint32_t. That is the sole purpose of the ARM_EXCEPTION_MAKE_ENUM_32_BIT entry! It works because uint32_t is the smallest integer type which can contain all of the enum values in this enum - namely 0 through 8, inclusive, as well as 0xffffffff (decimal 2^32-1 = 4294967295).
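The pattern looks roughly like this (a sketch; the first enumerator name is made up, and note that a value above INT_MAX relies on a compiler extension in pre-C23 C - exactly what -Wpedantic flags in the question at the top):
enum Arm_symbolic_exception_name {
ARM_EXCEPTION_RESET = 0,
/* ... entries with values 1 through 8 ... */
ARM_EXCEPTION_MAKE_ENUM_32_BIT = 0xffffffff /* forces a uint32_t-sized enum */
};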
Right now I can't answer your first two questions, because I am trying to find a good way to do this myself. Maybe I will edit this if I find a strategy that I like. It isn't intuitive though.
But I want to point something out that hasn't been mentioned so far, and to do so I will answer the third question like so:
It is "worth doing" when writing a C API that will be called from languages that aren't C. Anything that directly links to the C code will need to correctly understand the memory layout of all structs, parameter lists, etc in the C code's API. Unfortunately, C types like int, or worst yet, enums, are fairly unpredictably sized (changes by compiler, platform, etc), so knowing the memory layout of anything containing an enum can be dodgy unless your other programming language's compiler is also the C compiler AND it has some in-language mechanism to exploit that knowledge. It is much easier to write problem-free bindings to C libraries when the API uses predictably-sized C types like uint8_t, uint16_t, uint32_t, uint64_t, void*, uintptr_t, etc, and structs/unions composed of those predictably-sized types.
So I would care about enum sizing when it matters for program correctness, such as when memory layout and alignment issues are possible. But I wouldn't worry about it so much for optimization, not unless you have some niche situation that amplifies the opportunity cost (ex: a large array/list of enum-typed values on a memory constrained system like a small MCU).
Unfortunately, situations like what I'm mentioning are not helped by something like -fshort-enums, because this feature is vendor-specific and less predictable (e.g. another system would have to "guess" enum size by approximating GCC's algorithm for -fshort-enums enum sizing). If anything, it would allow people to compile C code in a way that would break common assumptions made by bindings in other languages (or other C code that wasn't compiled with the same option), with the expected result being memory corruption as parameters or struct members get written to, or read from, the wrong locations in memory.
As of C23, this is finally possible in standard C:
You can put a colon and an integer type after the enum keyword (or after the name tag, if it's named) to specify the enum's fixed underlying type, which sets the size and range of the enum type.
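Applied to an example from earlier in this thread, that would look like this (a sketch; needs a C23 compiler, e.g. recent GCC with -std=c23):
#include <stdint.h>

enum command : uint8_t {
CMD_IDENT = 0x00,
CMD_SCENE_0 = 0x10
};

/* sizeof(enum command) == 1 is now guaranteed by the standard */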
Would it even be worth doing? When the enum value is passed to a function, will it be passed as an int-sized value regardless?
On x86_64, the type of an integer does not influence whether it is passed in a register or not (as long as it fits in a single register). The size of data on the heap, however, is very significant for cache performance.
It depends on the values assigned to the enumerators.
For example, if a value greater than 2^32-1 is stored, the size allocated for the overall enum grows to the next size.
Storing the value 0xFFFFFFFFFFFF in an enum variable gives a truncation warning when compiled for a 32-bit target, whereas a 64-bit compilation succeeds and the allocated size is 8 bytes.

Work on an array of signed int as if it contained unsigned values

I've inherited some old code that assumes that an int can store values from -2^31 to 2^31-1, that overflow just wraps around, and that the sign bit is the high-order bit. In other words, that code should have used uint32_t, except that it wasn't. I would like to fix this code to use uint32_t.
The difficulty is that the code is distributed as source code and I'm not allowed to change the external interface. I have a function that works on an array of int. What it does internally is its own business, but int is exposed in the interface. In a nutshell, the interface is:
struct data {
int a[10];
};
void frobnicate(struct data *param);
I'd like to change int a[10] to uint32_t a[10], but I'm not allowed to modify the definition of struct data.
I can make the code work on uint32_t or unsigned internally:
struct internal_data {
unsigned a[10];
};
void frobnicate(struct data *param) {
struct internal_data *internal = (struct internal_data *)param;
// ... work with internal ...
}
However this is not actually correct C since it's casting between pointers to different types.
Is there a way I can add compile-time guards so that, for the rare people for whom int isn't “old-school” 32-bit, the code doesn't build? If int is less than 32 bits, the code has never worked anyway. For the vast majority of users, the code should build, and in a way that tells the compiler not to do “weird” things with overflowing int calculations.
I distribute the source code and people may use it with whatever compiler they choose, so compiler-specific tricks are not relevant.
I'm at least going to add
#if INT_MIN + 1 != -0x7fffffff
#error "This code only works with 32-bit two's complement int"
#endif
With this guard, what can go wrong with the cast above? Is there a reliable way of manipulating the int array as if its elements were unsigned, without copying the array?
In summary:
I can't change the function prototype. It references an array of int.
The code should manipulate the array (not a copy of the array) as an array of unsigned.
The code should build on platforms where it worked before (at least with sufficiently friendly compilers) and should not build on platforms where it can't work.
I have no control over which compiler is used and with which settings.
However this is not actually correct C since it's casting between pointers to different types.
Indeed, you cannot do such casts, because the two structure types are not compatible. You could however use a work-around such as this:
typedef union
{
struct data d;
uint32_t array[10];
} internal_t;
...
void frobnicate(struct data *param) {
internal_t* internal = (internal_t*)param;
...
Another option if you can change the original struct declaration but not its member names, is to use C11 anonymous union:
struct data {
union {
int a[10];
uint32_t u32[10];
};
};
This means that user code accessing foo.a won't break. But you'd need C11 or newer.
Alternatively, you could use a uint32_t* to access the int[10] directly. This is also well-defined, since uint32_t in this case is the unsigned equivalent of the effective type int.
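In code, that direct access could look like this (a sketch; it presumes the 32-bit guard discussed in the question, so that uint32_t really is the unsigned counterpart of int):
#include <stdint.h>

struct data { int a[10]; };

void frobnicate(struct data *param)
{
uint32_t *a = (uint32_t *)param->a; /* access via the unsigned variant of the effective type */
a[0] += 0x80000000u; /* unsigned wrap-around instead of signed overflow */
}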
Is there a way I can add compile-time guards so that, for the rare people for whom int isn't “old-school” 32-bit, the code doesn't build?
The obvious choice is static_assert(sizeof(int) == 4, "int is not 32 bits"); but again this requires C11. If backwards compatibility with older C is needed, you can invent some dirty "poor man's static assert":
#define stat_assert(expr) typedef char stat_assert_type[(expr) ? 1 : -1];
#if INT_MIN != -0x80000000
Depending on how picky you are, this isn't 100% portable. int could in theory be wider than 32 bits, but portability to such fictional systems probably isn't desired either.
If you don't want to drag limits.h around, note that casts are not allowed inside #if directives, so the check has to be done with the compile-time assert instead:
stat_assert((unsigned int)-1 == 0xFFFFFFFF)
This form also avoids hidden implicit promotion gotchas - note that in ordinary (non-preprocessor) code on a system with 32-bit int and long, -0x80000000 is 100% equivalent to 0x80000000, because the constant takes an unsigned type there.

Partial bytewise access to C enum

Setting:
I define an enum in C99:
enum MY_ENUM {TEST_ENUM_ITEM1, TEST_ENUM_ITEM2, TEST_ENUM_ITEM_MAX};
I ensure with compile time asserts that TEST_ENUM_ITEM_MAX does not exceed UINT16_MAX. I assume little endian as byte order.
I have a serialize-into-buffer function with following parameters:
PutIntoBuffer(uint8_t* src, uint32_t count);
I serialize a variable holding a value into a buffer. For this task I access the variable holding the enum like this:
enum MY_ENUM testVar = TEST_ENUM_ITEM1;
PutIntoBuffer((uint8_t*) &testVar, sizeof(uint16_t));
Question: Is it legitimate to access the enum (which is an int) in this way? Does the C standard guarantee the intended behaviour?
It is legitimate as in "it will work if int is 16 bits". It does not violate any pointer aliasing rules either, as long as you use a character type like uint8_t. (De-serializing is another story though.)
However, the code is not portable. In case int is 32 bits, the enumeration constants will turn 32 bits too, as may the enum variable itself. Then the code becomes endianness-dependent and you might end up reading garbage. Checking TEST_ENUM_ITEM_MAX against UINT16_MAX doesn't solve this.
The proper way to serialize an enum is to use a pre-generated read-only look-up table which is guaranteed to be 8 bits, like this:
#include <stdint.h>
enum MY_ENUM {TEST_ENUM_ITEM1, TEST_ENUM_ITEM2, TEST_ENUM_ITEM_MAX};
static const uint8_t MY_ENUM8 [] =
{
[TEST_ENUM_ITEM1] = TEST_ENUM_ITEM1,
[TEST_ENUM_ITEM2] = TEST_ENUM_ITEM2,
};
int main (void)
{
_Static_assert(sizeof(MY_ENUM8)==TEST_ENUM_ITEM_MAX, "Something went wrong");
}
The designated initializer syntax improves the integrity of the data, should the enum be updated during maintenance. Similarly, the static assert will ensure that the list contains the right number of items.
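Serializing through the table then always emits exactly one byte per value, regardless of sizeof(int) (a short sketch reusing PutIntoBuffer from the question):
enum MY_ENUM testVar = TEST_ENUM_ITEM1;
uint8_t byte = MY_ENUM8[testVar];
PutIntoBuffer(&byte, sizeof byte);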

What makes a better constant in C, a macro or an enum?

I am confused about when to use macros or enums. Both can be used as constants, but what is the difference between them and what is the advantage of either one? Is it somehow related to compiler level or not?
In terms of readability, enumerations make better constants than macros, because related values are grouped together. In addition, enum defines a new type, so the readers of your program would have easier time figuring out what can be passed to the corresponding parameter.
Compare
#define UNKNOWN 0
#define SUNDAY 1
#define MONDAY 2
#define TUESDAY 3
...
#define SATURDAY 7
to
typedef enum {
UNKNOWN,
SUNDAY,
MONDAY,
TUESDAY,
...
SATURDAY,
} Weekday;
It is much easier to read code like this
void calendar_set_weekday(Weekday wd);
than this
void calendar_set_weekday(int wd);
because you know which constants it is OK to pass.
A macro is a preprocessor thing, and the compiled code has no idea about the identifiers you create. They have been already replaced by the preprocessor before the code hits the compiler. An enum is a compile time entity, and the compiled code retains full information about the symbol, which is available in the debugger (and other tools).
Prefer enums (when you can).
In C, it is best to use enums for actual enumerations: when some variable can hold one of multiple values which can be given names. One advantage of enums is that the compiler can perform some checks beyond what the language requires, like that a switch statement on the enum type is not missing one of the cases. The enum identifiers also propagate into the debugging information. In a debugger, you can see the identifier name as the value of an enum variable, rather than just the numeric value.
Enumerations can be used just for the side effect of creating symbolic constants of integral type. For instance:
enum { buffer_size = 4096 }; /* we don't care about the type */
this practice is not that widespread, though. For one thing, buffer_size will be used as an integer and not as an enumerated type. A debugger will not render 4096 as buffer_size, because that value won't be represented as the enumerated type. If you declare some char array[buffer_size]; then sizeof array will not show up as buffer_size. In this situation, the enumeration constant disappears at compile time, so it might as well be a macro. And there are disadvantages, like not being able to control its exact type. (There might be some small advantage in some situation where the output of the preprocessing stages of translation is being captured as text. A macro will have turned into 4096, whereas buffer_size will stay as buffer_size.)
A preprocessor symbol lets us do this:
#define buffer_size 4096L /* buffer_size is a long int */
Note that various values from C's <limits.h>, like UINT_MAX, are preprocessor symbols and not enum symbols, with good reason: those identifiers need to have a precisely determined type. Another advantage of a preprocessor symbol is that we can test for its presence, or even make decisions based on its value:
#if ULONG_MAX > UINT_MAX
/* unsigned long is wider than unsigned int */
#endif
Of course we can test enumerated constants also, but not in such a way that we can change global declarations based on the result.
Enumerations are also ill suited for bitmasks:
enum modem_control { mc_dsr = 0x1, mc_dtr = 0x2, mc_rts = 0x4, ... }
it just doesn't make sense because when the values are combined with a bitwise OR, they produce a value which is outside of the type. Such code causes a headache, too, if it is ever ported to C++, which has (somewhat more) type-safe enumerations.
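A two-line illustration of the porting headache (a sketch using the constants above): C accepts the assignment silently, while C++ rejects it unless you add a cast:
enum modem_control mc = mc_dsr | mc_rts; /* value 0x5 names no enumerator */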
Note there are some differences between macros and enums, and either of these properties may make them (un)suitable as a particular constant.
enums are signed (compatible with int). In any context where an unsigned type is required (think especially bitwise operations!), enums are out.
if long long is wider than int, big constants won't fit in an enum.
The size of an enum is (usually) sizeof(int). For arrays of small values (up to say, CHAR_MAX) you might want a char foo[] rather than an enum foo[] array.
enums are integral numbers. You can't have enum funny_number { PI=3.14, E=2.71 }.
enums are a C89 feature; K&R compilers (admittedly ancient) don't understand them.
If a macro is implemented properly (i.e. it does not suffer from associativity issues when substituted), then there's not much difference in applicability between macro and enum constants in situations where both are applicable, i.e. in situations where you need signed integer constants specifically.
However, in general case macros provide much more flexible functionality. Enums impose a specific type onto your constants: they will have type int (or, possibly, larger signed integer type), and they will always be signed. With macros you can use constant syntax, suffixes and/or explicit type conversions to produce a constant of any type.
Enums work best when you have a group of tightly associated sequential integer constants. They work especially well when you don't care about the actual values of the constants at all, i.e. when you only care about them having some well-behaved unique values. In all other cases macros are a better choice (or basically the only choice).
As a practical matter, there is little difference. They are equally usable as constants in your programs. Some may prefer one or the other for stylistic reasons, but I can't think of any technical reason to prefer one over the other.
One difference is that macros allow you to control the integral type of related constants. But an enum will use an int.
#define X 100L
enum { Y = 100L };
printf("%ld\n", X);
printf("%d\n", Y); /* Y has int type */
enum has an advantage: block scope:
{ enum { E = 12 }; }
{ enum { E = 13 }; }
With macros there is a need to #undef.

