Setting:
I define an enum in C99:
enum MY_ENUM {TEST_ENUM_ITEM1, TEST_ENUM_ITEM2, TEST_ENUM_ITEM_MAX};
I ensure with compile-time asserts that TEST_ENUM_ITEM_MAX does not exceed UINT16_MAX. I assume little-endian byte order.
I have a serialize-into-buffer function with the following parameters:
PutIntoBuffer(uint8_t* src, uint32_t count);
I serialize a variable holding a value into a buffer. For this task I access the variable holding the enum like this:
enum MY_ENUM testVar = TEST_ENUM_ITEM1;
PutIntoBuffer((uint8_t*) &testVar, sizeof(uint16_t));
Question: Is it legitimate to access the enum (which is an int) in this way? Does the C standard guarantee the intended behaviour?
It is legitimate as in "it will work if int is 16 bits". It does not violate any pointer aliasing rules either, as long as you use a character type like uint8_t. (De-serializing is another story though.)
However, the code is not portable. In case int is 32 bits, the enumeration constants will turn 32 bits too, as may the enum variable itself. The code then becomes endianness-dependent and you might end up reading garbage. Checking TEST_ENUM_ITEM_MAX against UINT16_MAX doesn't solve this.
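For illustration, here is what happens with the question's code when int (and therefore the enum variable) is 32 bits wide (a sketch):
enum MY_ENUM testVar = TEST_ENUM_ITEM2;      /* value 1 */
/* Little endian: memory holds 01 00 00 00. The first two bytes are 01 00,
   which deserializes back to 1, so it happens to work.
   Big endian:    memory holds 00 00 00 01. The first two bytes are 00 00,
   which deserializes back to 0, i.e. garbage. */
PutIntoBuffer((uint8_t*) &testVar, sizeof(uint16_t));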
The proper way to serialize an enum is to use a pre-generated read-only look-up table which is guaranteed to be 8 bits, like this:
#include <stdint.h>
enum MY_ENUM {TEST_ENUM_ITEM1, TEST_ENUM_ITEM2, TEST_ENUM_ITEM_MAX};
static const uint8_t MY_ENUM8 [] =
{
    [TEST_ENUM_ITEM1] = TEST_ENUM_ITEM1,
    [TEST_ENUM_ITEM2] = TEST_ENUM_ITEM2,
};

int main (void)
{
    _Static_assert(sizeof(MY_ENUM8)==TEST_ENUM_ITEM_MAX, "Something went wrong");
}
The designated initializer syntax improves the integrity of the data, should the enum be updated during maintenance. Similarly, the static assert will ensure that the list contains the right number of items.
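Serialization then goes through the table, so exactly one byte per value ends up in the buffer regardless of sizeof(int). A minimal sketch, reusing PutIntoBuffer from the question:
enum MY_ENUM testVar = TEST_ENUM_ITEM1;
uint8_t byte = MY_ENUM8[testVar]; /* guaranteed to be 8 bits */
PutIntoBuffer(&byte, 1);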
Related
I've inherited some old code that assumes that an int can store values from -2^31 to 2^31-1, that overflow just wraps around, and that the sign bit is the high-order bit. In other words, that code should have used uint32_t, except that it wasn't. I would like to fix this code to use uint32_t.
The difficulty is that the code is distributed as source code and I'm not allowed to change the external interface. I have a function that works on an array of int. What it does internally is its own business, but int is exposed in the interface. In a nutshell, the interface is:
struct data {
    int a[10];
};
void frobnicate(struct data *param);
I'd like to change int a[10] to uint32_t a[10], but I'm not allowed to modify the definition of struct data.
I can make the code work on uint32_t or unsigned internally:
struct internal_data {
    unsigned a[10];
};

void frobnicate(struct data *param) {
    struct internal_data *internal = (struct internal_data *)param;
    // ... work with internal ...
}
However this is not actually correct C since it's casting between pointers to different types.
Is there a way I can add compile-time guards so that, for the rare people for whom int isn't “old-school” 32-bit, the code doesn't build? If int is less than 32 bits, the code has never worked anyway. For the vast majority of users, the code should build, and in a way that tells the compiler not to do “weird” things with overflowing int calculations.
I distribute the source code and people may use it with whatever compiler they choose, so compiler-specific tricks are not relevant.
I'm at least going to add
#if INT_MIN + 1 != -0x7fffffff
#error "This code only works with 32-bit two's complement int"
#endif
With this guard, what can go wrong with the cast above? Is there a reliable way of manipulating the int array as if its elements were unsigned, without copying the array?
In summary:
I can't change the function prototype. It references an array of int.
The code should manipulate the array (not a copy of the array) as an array of unsigned.
The code should build on platforms where it worked before (at least with sufficiently friendly compilers) and should not build on platforms where it can't work.
I have no control over which compiler is used and with which settings.
However this is not actually correct C since it's casting between pointers to different types.
Indeed, you cannot do such casts, because the two structure types are not compatible. You could however use a work-around such as this:
typedef union
{
    struct data data;
    uint32_t array[10];
} internal_t;
...
void frobnicate(struct data *param) {
    internal_t* internal = (internal_t*)param;
    ...
Another option, if you can change the original struct declaration but not its member names, is to use a C11 anonymous union:
struct data {
    union {
        int a[10];
        uint32_t u32[10];
    };
};
This means that user code accessing foo.a won't break. But you'd need C11 or newer.
Alternatively, you could use a uint32_t* to access the int[10] directly. This is also well-defined, since uint32_t in this case is the unsigned equivalent of the effective type int.
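In code, that alternative would look something like this (a sketch, assuming the compile-time guard above so that int is 32-bit two's complement):
void frobnicate(struct data *param) {
    uint32_t *a = (uint32_t *)param->a;   /* unsigned type corresponding to int */
    int i;
    for (i = 0; i < 10; i++)
        a[i] += 1u;                       /* wrap-around is well-defined on unsigned */
}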
Is there a way I can add compile-time guards so that, for the rare people for whom int isn't “old-school” 32-bit, the code doesn't build?
The obvious one is static_assert(sizeof(int) == 4, "int is not 32 bits"); but again this requires C11. If backwards compatibility with older C is needed, you can invent some dirty "poor man's static assert":
#define stat_assert(expr) typedef int dummy_t [(expr) ? 1 : -1];
As for the #if guard itself, you could write it as
#if INT_MIN != -0x80000000
Depending on how picky you are, this isn't 100% portable. int could in theory be 64 bits, but portability to such fictional systems probably isn't desired either.
If you don't want to drag limits.h around, you can do the check at language level instead, since casts are not allowed in #if directives:
stat_assert((unsigned int)-1 == 0xFFFFFFFF)
This form also avoids hidden implicit promotion surprises - note that in ordinary C code on a system with 32-bit int, -0x80000000 is an unsigned int with the value 0x80000000, not INT_MIN, because the literal 0x80000000 does not fit in int.
I saw a few questions here about how to force an enum to 8 or 16 bits. The common answer was that it can be done in C++11 or higher. Unfortunately, I don't have that luxury.
So, here is what I'm thinking. I only need the enum to be 8 bits when it's in a struct whose size I want to minimize. So:
Option A:
typedef enum { A, B, C, MAX = 0xFF } my_enum;
struct my_compact_struct
{
    my_enum field1 : 8; // Forcing field to be 8 bits
    uint8_t something;
    uint16_t something_else;
};
I think most or all optimizers should be smart enough to handle the 8-bit bit-field efficiently.
Option B:
Not use an enum. Use typedef and constants instead.
typedef uint8_t my_type;
static const my_type A = 0;
static const my_type B = 1;
static const my_type C = 2;
struct my_compact_struct
{
    my_type field1;
    uint8_t something;
    uint16_t something_else;
};
Option A is currently implemented and seems to be working, but since I want to do (now and in the future) what's correct now and not just what's working, I was wondering if option B is clearly better.
If your specific values in an enum can fit into a smaller type than an int, then a C implementation is free to choose the underlying type of the enum to be a smaller type than an int (but the type of the enum constants in this case will be int). But there is no way you can force a C compiler to use a type smaller than an int. So with this in mind and the fact that an int is at least 16 bits, you're out of luck.
But enums in C are little more than debugging aids. Just use a uint8_t type if your compiler has it:
static const uint8_t something = /*some value*/;
If not then use a char and hope that CHAR_BIT is 8.
Option B would be best. You'd be defining the type in question to be a known size, and the const values you define will also be the correct size.
While you would lose out on the implicit numbering of an enum, the explicit sizing of the field and its values makes up for it.
I ran into the same problem and solved it by using __attribute__((packed)). Without the packed attribute the data is aligned to 4 bytes and sizeof(my_compact_struct) == 8; with it, sizeof(my_compact_struct) == 4.
typedef enum __attribute__((packed)) {
    A = 0x01,
    B = 0x10,
    C = 0xFF // Max value, or any other value lower than this
} my_enum;

struct __attribute__((packed)) my_compact_struct
{
    my_enum field1 : 8; // Forcing field to be 8 bits
    uint8_t something;
    uint16_t something_else;
};
You can use enum, or even typedef enum, like a grouped #define. Do not actually define a structure member, data stream, or global variable with an enum type. Rather, define all storage members with fixed-width types like uint8_t, and just assign the enum literal constants to them, as if you had used #define. As with a malformed #define, bad things can happen if the literal value doesn't fit, so either way you need to pay attention. If you use any kind of lint tool, this design style will raise some messages which you will need to tailor. In a debugger or hardware simulator, the enum can still provide useful display reference information. Temporary variables are the exception to how global definitions are treated: for function parameters or automatic variables, and only then, define them with the enum type. In this context int is going to be the most efficient word size as well as the standard behavior of enum; there is no error possible, nor any hyper-optimizing you can do.
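A sketch of that style, with made-up names purely for illustration:
#include <stdint.h>
typedef enum { MODE_IDLE, MODE_RUN, MODE_FAULT } run_mode_t; /* grouped "defines" */
/* Storage: fixed-width types only, never the enum type. */
typedef struct {
    uint8_t  mode;      /* holds one of the run_mode_t constants */
    uint16_t counter;
} device_state_t;
/* Parameters and locals: the enum type is fine; int-sized and efficient. */
static void set_mode(device_state_t *dev, run_mode_t mode)
{
    dev->mode = (uint8_t)mode;
}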
I have an enum like this
typedef enum {
    FIRST,
    SECOND,
    THIRD = 0X80000001,
    FOURTH,
    FIFTH,
} STATUS;
I am getting a pedantic warning since I am compiling my files with the option -Wpedantic:
warning: ISO C restricts enumerator values to range of 'int' [-Wpedantic]
I found that it occurs because the hex value 0X80000001 exceeds the range of a signed int. My purpose is to have continuous hex values as the status in the enum without this warning.
I cannot use the macros since this will defy the purpose of having the enums in the first place. What code change will avoid this warning?
Enumeration constants are guaranteed to be of the same size as (signed) int. Apparently your system uses 32 bit int, so an unsigned hex literal larger than 0x7FFFFFFF will not fit.
So the warning is not just "pedantic", it hints of a possibly severe bug. Note that -pedantic in GCC does not mean "be picky and give me unimportant warnings" but rather "ensure that my code actually follows the C standard".
It appears that you want to do a list of bit masks or hardware addresses, or some other hardware-related programming. enum is unsuitable for such tasks, because in hardware-related programming, you rarely ever want to use signed types, but always unsigned ones.
If you must have a safe and portable program, then there is no elegant way to do this. C is a language with a lot of flaws, the way enum is defined by the standard is one of them.
One work-around is to use some sort of "poor man's enum", such as:
typedef uint32_t STATUS;
#define THIRD 0X80000001
If you must also have the increased type safety of an enum, then you could possibly use a struct:
typedef struct
{
    uint32_t value;
} STATUS;
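Comparisons and assignments then have to go through the member, which is where the extra type safety comes from. A sketch, with a made-up constant name:
static const STATUS STATUS_THIRD = { 0x80000001u };
STATUS s = STATUS_THIRD;
if (s.value == STATUS_THIRD.value) { /* ... */ }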
Or alternatively, just declare an array of constants and use an enum to define the array index. Probably the cleanest solution but takes a little bit of extra overhead:
typedef enum {
    FIRST,
    SECOND,
    THIRD,
    FOURTH,
    FIFTH,
    STATUS_N
} STATUS;

const uint32_t STATUS_DATA [STATUS_N] =
{
    0,
    1,
    0X80000001,
    0X80000002,
    0X80000003
};
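Usage then goes through the look-up table. A sketch, where write_hw_register is a made-up placeholder for whatever actually consumes the value:
STATUS s = THIRD;
uint32_t raw = STATUS_DATA[s]; /* 0x80000001 */
write_hw_register(raw);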
Does using typedef enum { VALUE_1 = 0x00, ... } typeName; have any more overhead in C (specifically, compiling using AVR-GCC for an AVR MCU) than doing typedef unsigned char typeName; and then just defining each value with #define VALUE_1 0x00?
My specific application is status codes that can be returned and checked by functions. It seems neater to me to use the typedef enum style, but I wanted to be sure that it wasn't going to add any significant overhead to the compiled application.
I would assume no, but I wasn't really sure. I tried to look for similar questions but most of them pertained to C++ and got more specific answers to C++.
An enum declaration creates an enumerated type. Such a type is compatible with (and therefore has the same size and representation as) some predefined integer type, but the compiler chooses which one.
But the enumeration constants are always of type int. (This differs from C++, where the constants are of the enumerated type.)
So typedef unsigned char ... vs. typedef enum ... will likely change the size and representation of the type, which can matter if you define objects of the type or functions that return the type, but the constants VALUE_1 et al will be of type int either way.
It's probably best to use the enum type; that way the compiler can decide what representation is best. Your alternative of specifying unsigned char will minimize storage, but depending on the platform it might actually slow down access to objects relative to, say, using something compatible with int.
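A small test along those lines (a sketch; the actual sizes are implementation-specific, particularly on AVR):
#include <stdio.h>
typedef unsigned char small_t;                     /* always 1 byte */
typedef enum { VALUE_1 = 0x00, VALUE_2 } status_t; /* size chosen by the compiler */
int main(void)
{
    printf("sizeof(small_t)  = %zu\n", sizeof(small_t));
    printf("sizeof(status_t) = %zu\n", sizeof(status_t));
    printf("sizeof(VALUE_1)  = %zu\n", sizeof VALUE_1); /* the constant is an int either way */
    return 0;
}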
Incidentally, the typedef isn't strictly necessary. If you prefer, you can use a tag:
enum typeName { Value_1 = 0x00, ... };
But then you have to refer to the type as enum typeName rather than just typeName. The advantage of typedef is that it lets you give the type a name that's just a single identifier.
I am getting error
error: aggregate value used where an integer was expected
on compiling this code:
#include <stdio.h>

typedef unsigned long U32;

typedef struct hello_s
{
    U32 a:8;
    U32 b:24;
} hello_t;

int main()
{
    hello_t str;
    U32 var;

    str.a = 0xAA;
    str.b = 0xAAA;
    var = (U32)str;
    printf("var : %lX\n", var);
    return 0;
}
Can someone please explain what the error means, and what I am doing wrong.
EDIT: I understand this is a stupid thing to do. What I wanted to know was why the compiler is complaining about this. Why can't it just assign the first 32 bits to the integer?
var = (U32)str;
Because str is an object of a structure type, and you cannot convert structure objects to objects of arithmetic types. C does not let you perform this kind of conversion.
If you want to access your structure object as an integer you can create a union of your structure and a U32.
Note that the common construct var = *(U32 *)&str; is undefined behavior in C. It violates aliasing and alignment rules.
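A minimal sketch of the union approach, reusing hello_t, U32 and var from the question:
typedef union
{
    hello_t bits;
    U32     value;
} hello_u;
hello_u u;
u.bits.a = 0xAA;
u.bits.b = 0xAAA;
var = u.value; /* reading through a different union member is allowed in C */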
Well, I think one should not misread the C99 spec as saying that this conversion is against the language standard.
The standard only says that the result of the conversion may not be portable across different machines/architectures.
As long as you have coded the program to work on a particular architecture, it's fine.
For example, I group selected members of a data structure DataStr_t that uniquely identify the object (i.e., the key), packed into another struct, say, DataStrKey_t. I make sure that sizeof(DataStrKey_t) is equal to sizeof(uint64) and for all practical purposes use it as a uint64, as it is easy to handle.
I also do the below operation often:
memcmp(&var_uint64, &var_DataStructKey, sizeof(uint64));
If you read or write the object using the key on the same machine, the value resulting from the conversion is predictable and consistent in its bit-wise layout.
However, if you move "only this data" to a different machine (one that didn't actually write the data) and try to read it there, things may break.
Here is your program, slightly modified for more explanation and successful compilation:
As long as LINE_A and LINE_B are executed on the same machine, the result is always predictable.
But if you write (var_uint64, var_DataStructKey) to a file, read it back on a different machine, and then execute LINE_B on those populated values, the comparison "may" fail.
#include <stdio.h>
#include <string.h>

typedef unsigned long U32;

typedef struct hello_s
{
    U32 a:8;
    U32 b:24;
} hello_t;

int main()
{
    hello_t str;
    U32 var;

    str.a = 0xAA;
    str.b = 0xAAA;
    var = *(U32 *)(&str);                      //LINE_A
    if(0 == memcmp(&var, &str, sizeof(U32)))   //LINE_B
        printf("var : %lu\n", var);
    return 0;
}
I guess my answer is too late, but I have attempted to explain.
And what do you expect that cast to result in, exactly? You could always just cast its address to a pointer to int and dereference it... but are you sure you can safely do so (no, you can't)? Is structure member alignment going to bite you someday (the answer is "probably, yes, it depends")?
Also, from the C99 standard:
C99 §6.7.2.1, paragraph 10: "The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined."
That is not always a stupid thing to do. In my case, I have a struct that I need to send over a network connection. The data must be sent over an SPI bus in byte form, so the struct must be accessed one byte at a time. I used the defines below to access each byte. You must be aware of the byte ordering of your platform to do this correctly. Also, you must make sure your structs are __PACKED (see also: C/C++: Force Bit Field Order and Alignment), so the compiler does not insert any padding or alignment bytes. This will also not work if any of the bit members fall across byte boundaries (at least with the Microchip XC16 compiler it does not).
typedef unsigned char byte;
#define STRUCT_LB(x) ((byte *)(&x))[0]
#define STRUCT_HB(x) ((byte *)(&x))[1]
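Usage might then look like this (a sketch; uint16_t stands in for any packed two-byte object, and spi_send_byte is a made-up placeholder for the actual transmit routine):
uint16_t frame = 0xAABB;
spi_send_byte(STRUCT_LB(frame)); /* 0xBB on a little-endian target */
spi_send_byte(STRUCT_HB(frame)); /* 0xAA on a little-endian target */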
A nicer way to do this is to define your struct as a union of a bitfield and a byte array like so:
typedef unsigned char byte;
typedef struct {
    union {
        struct __PACKED {
            byte array[2];
        } bytes;
        struct __PACKED {
            byte b0: 1;
            byte b1: 1;
            byte b2: 1;
            byte b3: 1;
            byte b4: 1;
            byte other: 3;
            byte more: 6;
            byte stuff: 2;
        } fields;
    };
} MyData;
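Access then goes through the named members. A sketch, assuming C11 anonymous unions and that __PACKED expands to the compiler's packing attribute; spi_send_byte is again a made-up placeholder:
MyData d = {0};
d.fields.b0 = 1;
d.fields.more = 0x2A;
spi_send_byte(d.bytes.array[0]); /* exact bit placement is implementation-defined */
spi_send_byte(d.bytes.array[1]);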
Not all type casts are allowed in C. Per this manual, only the following cases are legal:
Convert an integer to any pointer type.
Convert a pointer to any integer type.
Convert a pointer to an object to a pointer to another object.
Convert a pointer to a function to a pointer to another function.
Convert a null pointer between pointer types (either object or function).
Hence, casting a struct to an integer is obviously not a legal conversion.
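For illustration, a sketch contrasting a few of the listed conversions with the illegal one from the question (names made up):
#include <stdint.h>
struct s { int x; };
void conversions(void)
{
    struct s  obj  = { 42 };
    void     *p    = &obj;         /* pointer to one object type -> pointer to another */
    intptr_t  addr = (intptr_t)p;  /* pointer -> integer */
    int      *q    = (int *)addr;  /* integer -> pointer */
    /* int n = (int)obj; */        /* struct -> integer: not a legal conversion */
    (void)q;
}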