Platform-independent storage of C bitfields

I use C bitfields to store data in memory. For archival purposes, this data has to be written to a file (and later combined with data from another machine). Saving the bitfields directly to the file seems like a bad idea, since the arrangement of the data is implementation-specific.
For this reason I wrote some functions to "serialize" these bitfields and save them in a unified format:
/* uint16 is an unsigned integer type that is 16 bits wide */
typedef struct {
uint16 a : 1;
/* ... just examples ... */
uint16 z : 13;
} data;
void save_data(FILE* fp, data d) {
uint16 tmp;
tmp = d.a;
fwrite(&tmp,sizeof(uint16),1,fp);
/* ... */
tmp = d.z;
fwrite(&tmp,sizeof(uint16),1,fp);
}
While this works, it does not extend well: adding more members to data requires adding them to the save routine as well.
Is there any trick to automatically convert bitfield data to a unified format without needing to adapt the routine/macro when changing the bitfield data?

Here is one method. I cannot recommend it, but it's out there and it sort of works, so why not look at it. This incarnation is still platform-dependent but you can easily switch to a platform-independent, possibly human-readable format. Error handling is omitted for brevity.
// uglymacro.h
#if defined(DEFINE_STRUCT)
#define BEGINSTRUCT(struct_tag) typedef struct struct_tag {
#define ENDSTRUCT(struct_typedef) } struct_typedef;
#define BITFIELD(name,type,bit) type name : bit;
#define FIELD(name,type) type name;
#define ARRAYFIELD(name,type,size) type name[size];
#elif defined(DEFINE_SAVE)
#define BEGINSTRUCT(struct_tag) void save_##struct_tag(FILE* fp, \
struct struct_tag* p_a) {
#define ENDSTRUCT(struct_typedef) }
#define BITFIELD(name,type,bit) { type tmp; tmp = p_a->name; \
fwrite (&tmp, sizeof(type), 1, fp); }
#define FIELD(name,type) { fwrite (&p_a->name, sizeof(p_a->name), 1, fp); }
#define ARRAYFIELD(name,type,size) { fwrite (p_a->name, sizeof(p_a->name[0]), size, fp); }
#elif defined(DEFINE_READ)
#define BEGINSTRUCT(struct_tag) void read_##struct_tag(FILE* fp, \
struct struct_tag* p_a) {
#define ENDSTRUCT(struct_typedef) }
#define BITFIELD(name,type,bit) { type tmp; fread (&tmp, sizeof(type), 1, fp); \
p_a->name = tmp; }
#define FIELD(name,type) { fread (&p_a->name, sizeof(p_a->name), 1, fp); }
#define ARRAYFIELD(name,type,size) { fread (p_a->name, sizeof(p_a->name[0]), size, fp); }
#else
#error "Must define either DEFINE_STRUCT or DEFINE_SAVE or DEFINE_READ"
#endif
#undef DEFINE_STRUCT
#undef DEFINE_READ
#undef DEFINE_SAVE
#undef BEGINSTRUCT
#undef ENDSTRUCT
#undef FIELD
#undef BITFIELD
#undef ARRAYFIELD
Your struct definition looks like this:
// mystruct_def.h
BEGINSTRUCT(mystruct)
BITFIELD(a,int,1)
FIELD(b,int)
ARRAYFIELD(c,int,10)
ENDSTRUCT(mystruct)
You use it like this:
// in mystruct.h file
#define DEFINE_STRUCT
#include "uglymacro.h"
#include "mystruct_def.h"
// in mystruct.c file
#include "mystruct.h"
#define DEFINE_READ
#include "mystruct_def.h"
#define DEFINE_SAVE
#include "mystruct_def.h"
Frankly, by modern standards this method is ugly. I used something similar about 20 years ago and it was ugly back then.
Another alternative is using a more humane code-generation facility instead of the C preprocessor.

If you are willing to invest a bit you can use tools like P99 for "statement unrolling":
// in header
#define MY_BITFIELDS a, z
#define WRITE_IT(X) fwrite(&(unsigned){ d.X }, sizeof(unsigned), 1, fp)
#define WRITE_ALL(...) P99_SEP(WRITE_IT, __VA_ARGS__)
// in your function
WRITE_ALL(MY_BITFIELDS);
BTW, never use int for bitfields if you can avoid it. The semantics of a set of bits are much better matched by unsigned.
With a bit more macro coding you could even use something like
#define MY_BITFIELDS (a, 1), (z, 11)
to produce the struct declaration and the write part.
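For readers who want to see the "one list produces both the struct and the write part" idea without P99, here is a minimal sketch using a plain X-macro list. The DATA_FIELDS list, the put_u16/get_u16 helpers, and the on-disk layout (two bytes per field, least significant byte first) are assumptions made for illustration; it also assumes no field is wider than 16 bits.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical field list: X(name, width_in_bits). Only this list is edited
   when a field is added or removed. */
#define DATA_FIELDS(X) \
    X(a, 1)            \
    X(z, 13)

/* Expansion 1: the bitfield struct itself. */
typedef struct {
#define AS_MEMBER(name, bits) unsigned name : bits;
    DATA_FIELDS(AS_MEMBER)
#undef AS_MEMBER
} data;

/* Write one value as two bytes, least significant byte first, so the file
   does not depend on the host's byte order or bitfield layout. */
static int put_u16(FILE *fp, unsigned v)
{
    unsigned char buf[2];
    assert(v <= 0xFFFFu); /* assumption: fields are at most 16 bits wide */
    buf[0] = (unsigned char)(v & 0xFFu);
    buf[1] = (unsigned char)((v >> 8) & 0xFFu);
    return fwrite(buf, 1, sizeof buf, fp) == sizeof buf;
}

/* Read back one value written by put_u16. */
static int get_u16(FILE *fp, unsigned *v)
{
    unsigned char buf[2];
    if (fread(buf, 1, sizeof buf, fp) != sizeof buf)
        return 0;
    *v = (unsigned)buf[0] | ((unsigned)buf[1] << 8);
    return 1;
}

/* Expansion 2: save every field in list order. Returns 1 on success. */
static int save_data(FILE *fp, const data *d)
{
#define AS_SAVE(name, bits) if (!put_u16(fp, d->name)) return 0;
    DATA_FIELDS(AS_SAVE)
#undef AS_SAVE
    return 1;
}

/* Expansion 3: load every field in the same order. Returns 1 on success. */
static int load_data(FILE *fp, data *d)
{
    unsigned tmp;
#define AS_LOAD(name, bits) if (!get_u16(fp, &tmp)) return 0; d->name = tmp;
    DATA_FIELDS(AS_LOAD)
#undef AS_LOAD
    return 1;
}
```

Adding a field then means adding one X(...) line; the struct, save_data and load_data follow automatically. The file format still changes when the list changes, so archived files would need a version marker in practice.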

Why not use a human readable text format?
typedef struct {
int a : 1;
/* ... just examples ... */
int z : 13;
} data;
void save_data(FILE* fp, data d) {
fprintf( fp, "a:%d\n", d.a );
fprintf( fp, "b:%d\n", d.b );
...
fprintf( fp, "z:%d\n", d.z );
}
The advantage of this technique is that somebody using any different language could quickly write a parser to load your data on any machine, any architecture.
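To show how little is needed on the reading side, here is a minimal sketch of a matching loader for the "name:value" lines. The function names save_text and load_text, the use of unsigned bitfields, and the policy of silently skipping unknown or malformed lines are assumptions for illustration, not part of the answer above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    unsigned a : 1;
    unsigned z : 13;
} data;

/* Write one "name:value" line per field. */
static void save_text(FILE *fp, const data *d)
{
    fprintf(fp, "a:%u\n", (unsigned)d->a);
    fprintf(fp, "z:%u\n", (unsigned)d->z);
}

/* Read "name:value" lines until end of file.
   Returns the number of known fields that were found; unknown keys and
   malformed lines are skipped. */
static int load_text(FILE *fp, data *d)
{
    char line[64];
    char key[16];
    unsigned value;
    int found = 0;

    assert(fp != NULL && d != NULL);
    while (fgets(line, sizeof line, fp) != NULL) {
        /* key: up to 15 characters before the colon, then an unsigned value */
        if (sscanf(line, "%15[^:]:%u", key, &value) != 2)
            continue;
        if (strcmp(key, "a") == 0) {
            d->a = value;
            ++found;
        } else if (strcmp(key, "z") == 0) {
            d->z = value;
            ++found;
        }
    }
    return found;
}
```

Because fields are matched by name, the order of lines in the file does not matter, and files written by a newer version with extra fields can still be read.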

Related

C array size depending on variable value known at compilation time

I am working on a library to read Serial data from an electric counter. The counter can transmit up to 105 different tags depending on your contract. Because this library could be on lightweight and not powerful embedded systems (typically arduino or esp8266), I am trying to reduce the memory usage to the bare minimum.
To read each tag, my library has a buffer whose length equals the longest possible tag value (ranging from 2 bytes to 94 bytes). I would like my library to automatically set the buffer size to the correct value, knowing that the user can indicate which tags they want with #define directives in their main file.
main.cpp
#define BASE
#define HCHP
#define OPTARIF
#include "LinkyTIC.h"
#include <SoftwareSerial.h>
SoftwareSerial LinkySerial(13, 15);
LinkyTIC linky(LinkySerial);
void setup() {
}
void loop() {
if(linky.read()){
int base = linky.GetBASE();
int hchp = linky.GetHCHP();
Serial.println(base);
Serial.println(hchp);
}
}
What I tried :
In my header file, I used a chain of #ifdef to keep the maximum length, but I get the error "the value of variable "max_length" cannot be used as a constant". I tried to cast the value to a constant but it didn't work either.
header.h
uint8_t max_length = 8;
#ifdef ADCO
uint8_t max_length = max_length > 12 ? max_length : 12;
#endif
#ifdef OPTARIF
uint8_t max_length = max_length > 4 ? max_length : 4;
#endif
#ifdef ISOUSC
uint8_t max_length = max_length > 2 ? max_length : 2;
#endif
#ifdef BASE
uint8_t max_length = max_length > 9 ? max_length : 9;
#endif
...
class LinkyTIC {
public:
// unrelated stuff
private:
// unrelated stuff
char _buffer_tag[8]; // buffer for the tag name
>>>> char _buffer_date[max_length]; // buffer for the (optional) tag date. Must be the same length as value because we can't know in advance if it s going to be a date or a value
^^^^^^^^^^ the value of variable "max_length" cannot be used as a constant
>>>> char _buffer_value[max_length]; // buffer for the tag value
^^^^^^^^^^ the value of variable "max_length" cannot be used as a constant
char _buffer_checksum[1]; // buffer for the tag checksum
char _checksum;
char* _buffers[4] = {_buffer_tag, _buffer_date, _buffer_value, _buffer_checksum};
uint8_t _buffer_reference_index;
uint8_t _buffer_index;
...
}
The value of max_length can thus be known at compile time, but I can't figure out a way to create my array using its value.
What would be the best way to achieve what I want to do ? Because flash is also an issue, I would rather avoid using std::vector.
There are a number of options using preprocessor macros and #if statements, depending on what your requirements are, which you have not clearly stated. One example is:
#define MaximumLength 8
#if defined ADCO
#if MaximumLength < 12
#undef MaximumLength
#define MaximumLength 12
#endif
#endif
#if defined OPTARIF
#if MaximumLength < 4
#undef MaximumLength
#define MaximumLength 4
#endif
#endif
#if defined ISOUSC
#if MaximumLength < 2
#undef MaximumLength
#define MaximumLength 2
#endif
#endif
#if defined BASE
#if MaximumLength < 9
#undef MaximumLength
#define MaximumLength 9
#endif
#endif
uint8_t max_length = MaximumLength; // If this variable is still desired.
…
char _buffer_date[MaximumLength];
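A variant of the same preprocessor idea, sketched under some assumptions: each tag gets a length macro that is 0 when the tag is not selected, and a MAX2 helper macro folds them into one constant expression. The names LEN_*, LEN_DEFAULT, MAX2, MAX_LENGTH and struct tag_buffers are hypothetical; the lengths are the ones quoted in the question. This should be usable as an array size in both C and C++.

```c
#include <assert.h>

/* Tag selection; in the question the user does this before including the
   library header. BASE and OPTARIF are just the example chosen here. */
#define BASE
#define OPTARIF

/* Each tag contributes its value length when selected, 0 otherwise. */
#ifdef ADCO
#define LEN_ADCO 12
#else
#define LEN_ADCO 0
#endif

#ifdef OPTARIF
#define LEN_OPTARIF 4
#else
#define LEN_OPTARIF 0
#endif

#ifdef ISOUSC
#define LEN_ISOUSC 2
#else
#define LEN_ISOUSC 0
#endif

#ifdef BASE
#define LEN_BASE 9
#else
#define LEN_BASE 0
#endif

#define LEN_DEFAULT 8

/* Maximum of two integer constant expressions; the result is itself an
   integer constant expression, so it is valid as an array size. */
#define MAX2(a, b) ((a) > (b) ? (a) : (b))

#define MAX_LENGTH \
    MAX2(MAX2(LEN_DEFAULT, LEN_ADCO), \
         MAX2(LEN_OPTARIF, MAX2(LEN_ISOUSC, LEN_BASE)))

/* Compile-time sanity check (C11). */
static_assert(MAX_LENGTH >= LEN_DEFAULT, "buffer must hold the default length");

struct tag_buffers {
    char date[MAX_LENGTH];
    char value[MAX_LENGTH];
};
```

Compared with the chain of #if/#undef blocks, adding a tag here costs one #ifdef block plus one more MAX2 term, and no macro is ever redefined.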
First of all, you need to declare your variable either const or constexpr.
Your code already has an error, since you can't redefine a variable with the same name. I'm not sure what you intended to do, but something like this should work:
#include <cstdint>
#include <iostream>
#include <algorithm>
// Uncomment line(s) below to check behavior
//#define ADCO
//#define OPTARIF
//#define ISOUSC
//#define BASE
const uint8_t def_max_length = 8;
#ifdef ADCO
const uint8_t max_length = std::max<uint8_t>(def_max_length, 12);
#elif defined(OPTARIF)
const uint8_t max_length = std::max<uint8_t>(def_max_length, 4);
#elif defined(ISOUSC)
const uint8_t max_length = std::max<uint8_t>(def_max_length, 2);
#elif defined(BASE)
const uint8_t max_length = std::max<uint8_t>(def_max_length, 9);
#else
const uint8_t max_length = def_max_length;
#endif
char buffer[max_length];
int main()
{
std::cout << std::size(buffer) << std::endl;
}
If you need to check all defines independently, then you can define one variable for each define, something like:
#ifdef ADCO
const uint8_t max_length_adco = 12;
#else
const uint8_t max_length_adco = 0;
#endif
And then just get maximum of all these variables:
const uint8_t max_length = std::max({def_max_length, max_length_adco, max_length_optarif, max_length_isousc, max_length_base});

How to initialize an array of structures with variable index

I have structure like below
typedef struct
{
int a;
int b;
int c;
} my_struct;
and in another file I have declared a variable of this my_struct type, like below.
my_struct strct_arr[MAX];
Where MAX is a macro which is a configurable value that is a multiple of 18 (18 or 36 or 54 and so on.. it may go up to 18*n times).
I have to initialize every element with {0xff,0,0}. How can I initialize the whole array my_struct strct_arr[MAX]; with these values without using any kind of loop?
I am expecting the output as below:
my_struct strct_arr[MAX]={
{0xff,0,0},
{0xff,0,0},
{0xff,0,0},
{0xff,0,0},
…
};
But how can I initialize it without knowing the value of MAX?
There is a GCC extension for this. Try this:
#define MAX 18
my_struct strct_arr[MAX]={ [0 ... (MAX - 1)] = {0xff,0,0}};
Check https://gcc.gnu.org/onlinedocs/gcc-4.2.1/gcc/Designated-Inits.html
Yes, this is possible using the C preprocessor!
#include <stdio.h>
#include <boost/preprocessor/repetition/repeat.hpp>
#define INITS(z, n, t) { 0xFF, 0, 0 },
#define REP(item, n) BOOST_PP_REPEAT(n, INITS, item)
#define MAX 123
typedef struct { int a,b,c; } my_struct;
my_struct ms[] = { REP(, MAX) };
int main()
{
// Check it worked
printf("%d\n", (int)(sizeof ms / sizeof *ms));
}
Note: boost is a package of C++ stuff, however the boost/preprocessor just uses the preprocessor features which are common to both languages. If your implementation doesn't allow this #include by default, you can find a copy of repeat.hpp from the boost source code.
Also, BOOST_PP_REPEAT defaults to a max of 256. If your MAX is bigger than this, you can edit repeat.hpp to allow bigger values, it should be obvious what to do from there.
Note: this post describes a system for recursive macro that would not require the same sort of implementation as repeat.hpp uses, but I haven't been able to get it to work.
Credit: this post
Well, there is no direct and immediate syntax in standard C to specify an initializer that would do what you want. If you wanted to initialize the whole thing with zeros, then = { 0 } would work regardless of size, but that 0xff makes it a completely different story. The GCC compiler supports a non-standard extension that works in such cases (see Sanket Parmar's answer for details), but alas it is not standard.
There's also a non-standard memcpy hack that is sometimes used to fill memory regions with repetitive patterns. In your case it would look as follows
my_struct strct_arr[MAX] = { { 0xff, 0, 0 } };
memcpy(strct_arr + 1, strct_arr, sizeof strct_arr - sizeof *strct_arr);
But this is a hack, since it relies on memcpy doing its copying in byte-by-byte fashion and in strictly left-to-right direction (i.e. from smaller memory addresses to larger ones). However, that's not guaranteed by the language specification. If you want to "legalize" this trick, you have to write your own version of my_memcpy that works in that way specifically (byte-by-byte, left-to-right) and use it instead. Of course, this is formally a cyclic solution that is not based entirely on initializer syntax.
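A minimal sketch of the "legalized" version described above, assuming a hand-written copy routine is acceptable. The names forward_copy and fill_pattern are hypothetical. Because the copy is guaranteed to go byte by byte from low to high addresses, each byte of the first element has already been written before it is read again for the next one.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int a;
    int b;
    int c;
} my_struct;

/* Copy n bytes one at a time, strictly from low to high addresses.
   Unlike memcpy, overlapping regions with dst > src are allowed here and
   deliberately propagate the leading bytes forward. */
static void forward_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;
}

/* Set every element of arr[0..count-1] to value by writing the first
   element and letting forward_copy replicate it through the array. */
static void fill_pattern(my_struct *arr, size_t count, my_struct value)
{
    assert(arr != NULL);
    if (count == 0)
        return;
    arr[0] = value;
    forward_copy(arr + 1, arr, (count - 1) * sizeof *arr);
}
```

This is still a loop at run time, just hidden in a helper, so it does not give a pure initializer; it only removes the reliance on unspecified memcpy behaviour.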
Paraphrasing Jonathan Leffler's solution:
struct my_struct { char c; int a; int b; };
#define MAX 135
#define INIT_X_1 { 0xff, 0, 0 }
#define INIT_X_2 INIT_X_1, INIT_X_1
#define INIT_X_4 INIT_X_2, INIT_X_2
#define INIT_X_8 INIT_X_4, INIT_X_4
#define INIT_X_16 INIT_X_8, INIT_X_8
#define INIT_X_32 INIT_X_16, INIT_X_16
#define INIT_X_64 INIT_X_32, INIT_X_32
#define INIT_X_128 INIT_X_64, INIT_X_64
struct my_struct strct_arr[MAX] =
{
#if (MAX & 1)
INIT_X_1,
#endif
#if (MAX & 2)
INIT_X_2,
#endif
#if (MAX & 4)
INIT_X_4,
#endif
#if (MAX & 8)
INIT_X_8,
#endif
#if (MAX & 16)
INIT_X_16,
#endif
#if (MAX & 32)
INIT_X_32,
#endif
#if (MAX & 64)
INIT_X_64,
#endif
#if (MAX & 128)
INIT_X_128,
#endif
};
Just for sake of variety, since you know the array will be a multiple of 18, you could use something like this:
#define INIT_X_1 { 0xff, 0, 0 }
#define INIT_X_3 INIT_X_1, INIT_X_1, INIT_X_1
#define INIT_X_9 INIT_X_3, INIT_X_3, INIT_X_3
#define INIT_X_18 INIT_X_9, INIT_X_9
my_struct strct_arr[MAX] =
{
INIT_X_18,
#if MAX > 18
INIT_X_18,
#if MAX > 36
INIT_X_18,
#endif
#endif
};
This will work without needing C99 support (it would even work with pre-standard C), GCC extensions, or Boost Preprocessor library. In every other respect, the other solutions are better.

Real-world use of X-Macros

I just learned of X-Macros. What real-world uses of X-Macros have you seen? When are they the right tool for the job?
I discovered X-macros a couple of years ago when I started making use of function pointers in my code. I am an embedded programmer and I use state machines frequently. Often I would write code like this:
/* declare an enumeration of state codes */
enum{ STATE0, STATE1, STATE2, ... , STATEX, NUM_STATES};
/* declare a table of function pointers */
p_func_t jumptable[NUM_STATES] = {func0, func1, func2, ... , funcX};
The problem was that I considered it very error prone to have to maintain the ordering of my function pointer table such that it matched the ordering of my enumeration of states.
A friend of mine introduced me to X-macros and it was like a light-bulb went off in my head. Seriously, where have you been all my life x-macros!
So now I define the following table:
#define STATE_TABLE \
ENTRY(STATE0, func0) \
ENTRY(STATE1, func1) \
ENTRY(STATE2, func2) \
...
ENTRY(STATEX, funcX) \
And I can use it as follows:
enum
{
#define ENTRY(a,b) a,
STATE_TABLE
#undef ENTRY
NUM_STATES
};
and
p_func_t jumptable[NUM_STATES] =
{
#define ENTRY(a,b) b,
STATE_TABLE
#undef ENTRY
};
As a bonus, I can also have the pre-processor build my function prototypes as follows:
#define ENTRY(a,b) static void b(void);
STATE_TABLE
#undef ENTRY
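Putting the three expansions together, a complete small example might look like the sketch below. The state names, the handler functions, the last_state variable and the extra state_names table are made up for illustration; only the ENTRY/STATE_TABLE pattern comes from the text above.

```c
#include <assert.h>
#include <string.h>

typedef void (*p_func_t)(void);

/* The single source of truth: one line per state. */
#define STATE_TABLE            \
    ENTRY(STATE_IDLE, on_idle) \
    ENTRY(STATE_RUN,  on_run)  \
    ENTRY(STATE_STOP, on_stop)

/* Expansion 1: the enumeration of state codes. */
enum state {
#define ENTRY(a, b) a,
    STATE_TABLE
#undef ENTRY
    NUM_STATES
};

/* Expansion 2: prototypes for every handler. */
#define ENTRY(a, b) static void b(void);
STATE_TABLE
#undef ENTRY

/* Expansion 3: the jump table, guaranteed to be in enum order. */
static const p_func_t jumptable[NUM_STATES] = {
#define ENTRY(a, b) b,
    STATE_TABLE
#undef ENTRY
};

/* Expansion 4 (an extra): printable names, handy for logging. */
static const char *const state_names[NUM_STATES] = {
#define ENTRY(a, b) #a,
    STATE_TABLE
#undef ENTRY
};

/* Dummy handlers that just record which state ran. */
static int last_state = -1;
static void on_idle(void) { last_state = STATE_IDLE; }
static void on_run(void)  { last_state = STATE_RUN; }
static void on_stop(void) { last_state = STATE_STOP; }

/* Call the handler that belongs to state s. */
static void dispatch(enum state s)
{
    assert((int)s >= 0 && s < NUM_STATES);
    jumptable[s]();
}
```

Reordering or inserting a line in STATE_TABLE changes the enum, the prototypes, the jump table and the names together, which is exactly the maintenance problem described above.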
Another usage is to declare and initialize registers
#define IO_ADDRESS_OFFSET (0x8000)
#define REGISTER_TABLE\
ENTRY(reg0, IO_ADDRESS_OFFSET + 0, 0x11)\
ENTRY(reg1, IO_ADDRESS_OFFSET + 1, 0x55)\
ENTRY(reg2, IO_ADDRESS_OFFSET + 2, 0x1b)\
...
ENTRY(regX, IO_ADDRESS_OFFSET + X, 0x33)\
/* declare the registers (where _at_ is a compiler specific directive) */
#define ENTRY(a, b, c) volatile uint8_t a _at_ b;
REGISTER_TABLE
#undef ENTRY
/* initialize registers */
#define ENTRY(a, b, c) a = c;
REGISTER_TABLE
#undef ENTRY
My favourite usage, however, is in communication handlers.
First I create a comms table, containing each command name and code:
#define COMMAND_TABLE \
ENTRY(RESERVED, reserved, 0x00) \
ENTRY(COMMAND1, command1, 0x01) \
ENTRY(COMMAND2, command2, 0x02) \
...
ENTRY(COMMANDX, commandX, 0x0X) \
I have both the uppercase and lowercase names in the table, because the upper case will be used for enums and the lowercase for function names.
Then I also define structs for each command to define what each command looks like:
typedef struct {...}command1_cmd_t;
typedef struct {...}command2_cmd_t;
etc.
Likewise I define structs for each command response:
typedef struct {...}command1_resp_t;
typedef struct {...}command2_resp_t;
etc.
Then I can define my command code enumeration:
enum
{
#define ENTRY(a,b,c) a##_CMD = c,
COMMAND_TABLE
#undef ENTRY
};
I can define my command length enumeration:
enum
{
#define ENTRY(a,b,c) a##_CMD_LENGTH = sizeof(b##_cmd_t),
COMMAND_TABLE
#undef ENTRY
};
I can define my response length enumeration:
enum
{
#define ENTRY(a,b,c) a##_RESP_LENGTH = sizeof(b##_resp_t),
COMMAND_TABLE
#undef ENTRY
};
I can determine how many commands there are as follows:
typedef struct
{
#define ENTRY(a,b,c) uint8_t b;
COMMAND_TABLE
#undef ENTRY
} offset_struct_t;
#define NUMBER_OF_COMMANDS sizeof(offset_struct_t)
NOTE: I never actually instantiate the offset_struct_t, I just use it as a way for the compiler to generate for me my number of commands definition.
Then I can generate my table of function pointers as follows:
p_func_t jump_table[NUMBER_OF_COMMANDS] =
{
#define ENTRY(a,b,c) process_##b,
COMMAND_TABLE
#undef ENTRY
};
And my function prototypes:
#define ENTRY(a,b,c) void process_##b(void);
COMMAND_TABLE
#undef ENTRY
Now lastly for the coolest use ever, I can have the compiler calculate how big my transmit buffer should be.
/* reminder the sizeof a union is the size of its largest member */
typedef union
{
#define ENTRY(a,b,c) uint8_t b##_buf[sizeof(b##_cmd_t)];
COMMAND_TABLE
#undef ENTRY
} tx_buf_t;
Again this union is like my offset struct, it is not instantiated, instead I can use the sizeof operator to declare my transmit buffer size.
uint8_t tx_buf[sizeof(tx_buf_t)];
Now my transmit buffer tx_buf is the optimal size and as I add commands to this comms handler, my buffer will always be the optimal size. Cool!
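The buffer-sizing trick can be tried in isolation with the sketch below. The three command structs and their layouts are invented for illustration (the answer elides the real ones); the COMMAND_TABLE/ENTRY pattern, the length enumeration and the union are as described above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical command layouts, one struct per command. */
typedef struct { uint8_t id; } ping_cmd_t;
typedef struct { uint8_t id; uint8_t addr[4]; } read_cmd_t;
typedef struct { uint8_t id; uint8_t payload[31]; } write_cmd_t;

#define COMMAND_TABLE         \
    ENTRY(PING,  ping,  0x01) \
    ENTRY(READ,  read,  0x02) \
    ENTRY(WRITE, write, 0x03)

/* Command lengths as enum constants (note the comma after each entry). */
enum {
#define ENTRY(a, b, c) a##_CMD_LENGTH = sizeof(b##_cmd_t),
    COMMAND_TABLE
#undef ENTRY
};

/* Never instantiated: a union is as large as its largest member, so its
   size is the size of the largest command. */
typedef union {
#define ENTRY(a, b, c) uint8_t b##_buf[sizeof(b##_cmd_t)];
    COMMAND_TABLE
#undef ENTRY
} tx_buf_t;

/* Compile-time check (C11) that the largest example command fits. */
static_assert(sizeof(tx_buf_t) >= sizeof(write_cmd_t), "tx buffer too small");

/* The real transmit buffer, sized by the compiler. */
static uint8_t tx_buf[sizeof(tx_buf_t)];
```

If a larger command struct is added to the table later, tx_buf grows on the next build without any manual bookkeeping.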
One other use is to create offset tables:
Since memory is often a constraint on embedded systems, I don't want to use 512 bytes for my jump table (2 bytes per pointer x 256 possible commands) when it is a sparse array. Instead I will have a table of 8-bit offsets for each possible command. This offset is then used to index into my actual jump table, which now only needs to be NUM_COMMANDS * sizeof(pointer). In my case, with 10 commands defined, my jump table is 20 bytes long and I have an offset table that is 256 bytes long, for a total of 276 bytes instead of 512 bytes. I then call my functions like so:
jump_table[offset_table[command]]();
instead of
jump_table[command]();
I can create an offset table like so:
/* every offset defaults to 0; each valid command gets its own offset
   (designated initializers, C99 or later) */
static uint8_t offset_table[256] = {
#define ENTRY(a,b,c) [c] = offsetof(offset_struct_t, b),
COMMAND_TABLE
#undef ENTRY
};
where offsetof is a standard library macro defined in "stddef.h"
As a side benefit, there is a very easy way to determine if a command code is supported or not:
bool command_is_valid(uint8_t command)
{
/* return false if not valid, or true (non 0) if valid */
return offset_table[command];
}
This is also why in my COMMAND_TABLE I reserved command byte 0. I can create one function called "process_reserved()" which will be called if any invalid command byte is used to index into my offset table.
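The offset-table dispatch can be sketched end to end as follows. The command names and codes, the last_handled variable and the handle function are assumptions for illustration; the offset table is built with C99 designated initializers, which keeps it a compile-time constant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*p_func_t)(void);

/* Sparse command codes; entry 0 is the reserved catch-all. */
#define COMMAND_TABLE               \
    ENTRY(RESERVED, reserved, 0x00) \
    ENTRY(PING,     ping,     0x10) \
    ENTRY(RESET,    reset,    0x2A)

/* Never instantiated: one byte per command, so each member's offset is the
   command's index and the struct's size is the number of commands. */
typedef struct {
#define ENTRY(a, b, c) uint8_t b;
    COMMAND_TABLE
#undef ENTRY
} offset_struct_t;

#define NUMBER_OF_COMMANDS sizeof(offset_struct_t)

/* Offsets must fit in the uint8_t entries of the offset table (C11 check). */
static_assert(sizeof(offset_struct_t) <= 256, "too many commands");

#define ENTRY(a, b, c) static void process_##b(void);
COMMAND_TABLE
#undef ENTRY

/* Dense jump table: one pointer per defined command. */
static const p_func_t jump_table[NUMBER_OF_COMMANDS] = {
#define ENTRY(a, b, c) process_##b,
    COMMAND_TABLE
#undef ENTRY
};

/* Sparse offset table: unlisted command codes stay 0, i.e. "reserved". */
static const uint8_t offset_table[256] = {
#define ENTRY(a, b, c) [c] = offsetof(offset_struct_t, b),
    COMMAND_TABLE
#undef ENTRY
};

/* Dummy handlers that record which command code was processed. */
static int last_handled = -1;
static void process_reserved(void) { last_handled = 0x00; }
static void process_ping(void)     { last_handled = 0x10; }
static void process_reset(void)    { last_handled = 0x2A; }

/* Non-zero offset means the command code is defined. */
static int command_is_valid(uint8_t command)
{
    return offset_table[command] != 0;
}

/* Invalid codes map to offset 0 and therefore to process_reserved. */
static void handle(uint8_t command)
{
    jump_table[offset_table[command]]();
}
```

With three commands this uses 256 bytes for offsets plus three pointers for the jump table, instead of 256 pointers.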
X-Macros are essentially parameterized templates. So they are the right tool for the job if you need several similar things in several guises. They allow you to create an abstract form and instantiate it according to different rules.
I use X-macros to output enum values as strings. And since encountering it, I strongly prefer this form which takes a "user" macro to apply to each element. Multiple file inclusion is just far more painful to work with.
/* x-macro constructors for error and type
enums and string tables */
#define AS_BARE(a) a ,
#define AS_STR(a) #a ,
#define ERRORS(_) \
_(noerror) \
_(dictfull) _(dictstackoverflow) _(dictstackunderflow) \
_(execstackoverflow) _(execstackunderflow) _(limitcheck) \
_(VMerror)
enum err { ERRORS(AS_BARE) };
char *errorname[] = { ERRORS(AS_STR) };
/* puts(errorname[(enum err)limitcheck]); */
I'm also using them for function dispatch based on object type. Again by hijacking the same macro I used to create the enum values.
#define TYPES(_) \
_(invalid) \
_(null) \
_(mark) \
_(integer) \
_(real) \
_(array) \
_(dict) \
_(save) \
_(name) \
_(string) \
/*enddef TYPES */
#define AS_TYPE(_) _ ## type ,
enum { TYPES(AS_TYPE) };
Using the macro guarantees that all my array indices will match the associated enum values, because they construct their various forms using the bare tokens from the macro definition (the TYPES macro).
typedef void evalfunc(context *ctx);
void evalquit(context *ctx) { ++ctx->quit; }
void evalpop(context *ctx) { (void)pop(ctx->lo, adrent(ctx->lo, OS)); }
void evalpush(context *ctx) {
push(ctx->lo, adrent(ctx->lo, OS),
pop(ctx->lo, adrent(ctx->lo, ES)));
}
evalfunc *evalinvalid = evalquit;
evalfunc *evalmark = evalpop;
evalfunc *evalnull = evalpop;
evalfunc *evalinteger = evalpush;
evalfunc *evalreal = evalpush;
evalfunc *evalsave = evalpush;
evalfunc *evaldict = evalpush;
evalfunc *evalstring = evalpush;
evalfunc *evalname = evalpush;
evalfunc *evaltype[stringtype/*last type in enum*/+1];
#define AS_EVALINIT(_) evaltype[_ ## type] = eval ## _ ;
void initevaltype(void) {
TYPES(AS_EVALINIT)
}
void eval(context *ctx) {
unsigned ades = adrent(ctx->lo, ES);
object t = top(ctx->lo, ades, 0);
if ( isx(t) ) /* if executable */
evaltype[type(t)](ctx); /* <--- the payoff is this line here! */
else
evalpush(ctx);
}
Using X-macros this way actually helps the compiler to give helpful error messages. I omitted the evalarray function from the above because it would distract from my point. But if you attempt to compile the above code (commenting-out the other function calls, and providing a dummy typedef for context, of course), the compiler would complain about a missing function. For each new type I add, I am reminded to add a handler when I recompile this module. So the X-macro helps to guarantee that parallel structures remain intact even as the project grows.
Edit:
This answer has raised my reputation 50%. So here's a little more. The following is a negative example, answering the question: when not to use X-Macros?
This example shows the packing of arbitrary code fragments into the X-"record". I eventually abandoned this branch of the project and did not use this strategy in later designs (and not for want of trying). It became unwieldy, somehow. Indeed the macro is named X6 because at one point there were 6 arguments, but I got tired of changing the macro name.
/* Object types */
/* "'X'" macros for Object type definitions, declarations and initializers */
// a b c d
// enum, string, union member, printf d
#define OBJECT_TYPES \
X6( nulltype, "null", int dummy , ("<null>")) \
X6( marktype, "mark", int dummy2 , ("<mark>")) \
X6( integertype, "integer", int i, ("%d",o.i)) \
X6( booleantype, "boolean", bool b, (o.b?"true":"false")) \
X6( realtype, "real", float f, ("%f",o.f)) \
X6( nametype, "name", int n, ("%s%s", \
(o.flags & Fxflag)?"":"/", names[o.n])) \
X6( stringtype, "string", char *s, ("%s",o.s)) \
X6( filetype, "file", FILE *file, ("<file %p>",(void *)o.file)) \
X6( arraytype, "array", Object *a, ("<array %u>",o.length)) \
X6( dicttype, "dict", struct s_pair *d, ("<dict %u>",o.length)) \
X6(operatortype, "operator", void (*o)(), ("<op>"))
#define X6(a, b, c, d) #a,
char *typestring[] = { OBJECT_TYPES };
#undef X6
// the Object type
//forward reference so s_object can contain s_objects
typedef struct s_object Object;
// the s_object structure:
// a bit convoluted, but it boils down to four members:
// type, flags, length, and payload (union of type-specific data)
// the first named union member is integer, so a simple literal object
// can be created on the fly:
// Object o = {integertype,0,0,4028}; //create an int object, value: 4028
// Object nl = {nulltype,0,0,0};
struct s_object {
#define X6(a, b, c, d) a,
enum e_type { OBJECT_TYPES } type;
#undef X6
unsigned int flags;
#define Fread 1
#define Fwrite 2
#define Fexec 4
#define Fxflag 8
size_t length; //for lint, was: unsigned int
#define X6(a, b, c, d) c;
union { OBJECT_TYPES };
#undef X6
};
One big problem was the printf format strings. While it looks cool, it's just hocus pocus. Since it's only used in one function, overuse of the macro actually separated information that should be together; and it makes the function unreadable by itself. The obfuscation is doubly unfortunate in a debugging function like this one.
//print the object using the type's format specifier from the macro
//used by O_equal (ps: =) and O_equalequal (ps: ==)
void printobject(Object o) {
switch (o.type) {
#define X6(a, b, c, d) \
case a: printf d; break;
OBJECT_TYPES
#undef X6
}
}
So don't get carried away. Like I did.
Some real-world uses of X-Macros by popular and large projects:
Java HotSpot
In the Oracle HotSpot Virtual Machine for the Java® Programming Language, there is the file globals.hpp, which uses the RUNTIME_FLAGS in that way.
See the source code:
JDK 7
JDK 8
JDK 9
Chromium
The list of network errors in net_error_list.h is a long, long list of macro expansions of this form:
NET_ERROR(IO_PENDING, -1)
It is used by net_errors.h from the same directory:
enum Error {
OK = 0,
#define NET_ERROR(label, value) ERR_ ## label = value,
#include "net/base/net_error_list.h"
#undef NET_ERROR
};
The result of this preprocessor magic is:
enum Error {
OK = 0,
ERR_IO_PENDING = -1,
};
What I don't like about this particular use is that the name of the constant is created dynamically by adding the ERR_ prefix. In this example, NET_ERROR(IO_PENDING, -1) defines the constant ERR_IO_PENDING.
Using a simple text search for ERR_IO_PENDING, it is not possible to see where this constant is defined. Instead, to find the definition, one has to search for IO_PENDING. This makes the code hard to navigate and therefore adds to the obfuscation of the whole code base.
I like to use X macros for creating 'rich enumerations' which support iterating the enum values as well as getting the string representation for each enum value:
#define MOUSE_BUTTONS \
X(LeftButton, 1) \
X(MiddleButton, 2) \
X(RightButton, 4)
struct MouseButton {
enum Value {
None = 0
#define X(name, value) ,name = value
MOUSE_BUTTONS
#undef X
};
static const int *values() {
static const int a[] = {
None,
#define X(name, value) name,
MOUSE_BUTTONS
#undef X
-1
};
return a;
}
static const char *valueAsString( Value v ) {
#define X(name, value) static const char str_##name[] = #name;
MOUSE_BUTTONS
#undef X
switch ( v ) {
case None: return "None";
#define X(name, value) case name: return str_##name;
MOUSE_BUTTONS
#undef X
}
return 0;
}
};
This not only defines a MouseButton::Value enum, it also lets me do things like
// Print names of all supported mouse buttons
for ( const int *mb = MouseButton::values(); *mb != -1; ++mb ) {
std::cout << MouseButton::valueAsString( (MouseButton::Value)*mb ) << "\n";
}
I use a pretty massive X-macro to load contents of INI-file into a configuration struct, amongst other things revolving around that struct.
This is what my "configuration.def" -file looks like:
#define NMB_DUMMY(...) X(__VA_ARGS__)
#define NMB_INT_DEFS \
TEXT("long int") , long , , , GetLongValue , _ttol , NMB_SECT , SetLongValue ,
#define NMB_STR_DEFS NMB_STR_DEFS__(TEXT("string"))
#define NMB_PATH_DEFS NMB_STR_DEFS__(TEXT("path"))
#define NMB_STR_DEFS__(ATYPE) \
ATYPE , basic_string<TCHAR>* , new basic_string<TCHAR>\
, delete , GetValue , , NMB_SECT , SetValue , *
/* X-macro starts here */
#define NMB_SECT "server"
NMB_DUMMY(ip,TEXT("Slave IP."),TEXT("10.11.180.102"),NMB_STR_DEFS)
NMB_DUMMY(port,TEXT("Slave portti."),TEXT("502"),NMB_STR_DEFS)
NMB_DUMMY(slaveid,TEXT("Slave protocol ID."),0xff,NMB_INT_DEFS)
.
. /* And so on for about 40 items. */
It's a bit confusing, I admit. It quickly became clear that I don't actually want to write all those type declarations after every field macro. (Don't worry, there's a big comment to explain everything, which I omitted for brevity.)
And this is how I declare the configuration struct:
typedef struct {
#define X(ID,DESC,DEFVAL,ATYPE,TYPE,...) TYPE ID;
#include "configuration.def"
#undef X
basic_string<TCHAR>* ini_path; //Where all the other stuff gets read.
long verbosity; //Used only by console writing functions.
} Config;
Then, in the code, firstly the default values are read into the configuration struct:
#define X(ID,DESC,DEFVAL,ATYPE,TYPE,CONSTRUCTOR,DESTRUCTOR,GETTER,STRCONV,SECT,SETTER,...) \
conf->ID = CONSTRUCTOR(DEFVAL);
#include "configuration.def"
#undef X
Then, the INI is read into the configuration struct as follows, using library SimpleIni:
#define X(ID,DESC,DEFVAL,ATYPE,TYPE,CONSTRUCTOR,DESTRUCTOR,GETTER,STRCONV,SECT,SETTER,DEREF...)\
DESTRUCTOR (conf->ID);\
conf->ID = CONSTRUCTOR( ini.GETTER(TEXT(SECT),TEXT(#ID),DEFVAL,FALSE) );\
LOG3A(<< left << setw(13) << TEXT(#ID) << TEXT(": ") << left << setw(30)\
<< DEREF conf->ID << TEXT(" (") << DEFVAL << TEXT(").") );
#include "configuration.def"
#undef X
Overrides from command-line flags, which are formatted with the same names (in GNU long form), are applied as follows, using the library SimpleOpt:
enum optflags {
#define X(ID,...) ID,
#include "configuration.def"
#undef X
};
CSimpleOpt::SOption sopt[] = {
#define X(ID,DESC,DEFVAL,ATYPE,TYPE,...) {ID,TEXT("--") #ID TEXT("="), SO_REQ_CMB},
#include "configuration.def"
#undef X
SO_END_OF_OPTIONS
};
CSimpleOpt ops(argc,argv,sopt,SO_O_NOERR);
while(ops.Next()){
switch(ops.OptionId()){
#define X(ID,DESC,DEFVAL,ATYPE,TYPE,CONSTRUCTOR,DESTRUCTOR,GETTER,STRCONV,SECT,...) \
case ID:\
DESTRUCTOR (conf->ID);\
conf->ID = STRCONV( CONSTRUCTOR ( ops.OptionArg() ) );\
LOG3A(<< TEXT("Omitted ")<<left<<setw(13)<<TEXT(#ID)<<TEXT(" : ")<<conf->ID<<TEXT(" ."));\
break;
#include "configuration.def"
#undef X
}
}
And so on. I also use the same macro to print the --help flag output and a sample default INI file; configuration.def is included 8 times in my program. "Square peg into a round hole", maybe; how would an actually competent programmer proceed with this? Lots and lots of loops and string processing?
https://github.com/whunmr/DataEx
I am using the following xmacros to generate a C++ class, with serialize and deserialize functionality built in.
#define __FIELDS_OF_DataWithNested(_) \
_(1, a, int ) \
_(2, x, DataX) \
_(3, b, int ) \
_(4, c, char ) \
_(5, d, __array(char, 3)) \
_(6, e, string) \
_(7, f, bool)
DEF_DATA(DataWithNested);
Usage:
TEST_F(t, DataWithNested_should_able_to_encode_struct_with_nested_struct) {
DataWithNested xn;
xn.a = 0xCAFEBABE;
xn.x.a = 0x12345678;
xn.x.b = 0x11223344;
xn.b = 0xDEADBEEF;
xn.c = 0x45;
memcpy(&xn.d, "XYZ", strlen("XYZ"));
char buf_with_zero[] = {0x11, 0x22, 0x00, 0x00, 0x33};
xn.e = string(buf_with_zero, sizeof(buf_with_zero));
xn.f = true;
__encode(DataWithNested, xn, buf_);
char expected[] = { 0x01, 0x04, 0x00, 0xBE, 0xBA, 0xFE, 0xCA,
0x02, 0x0E, 0x00 /*T and L of nested X*/,
0x01, 0x04, 0x00, 0x78, 0x56, 0x34, 0x12,
0x02, 0x04, 0x00, 0x44, 0x33, 0x22, 0x11,
0x03, 0x04, 0x00, 0xEF, 0xBE, 0xAD, 0xDE,
0x04, 0x01, 0x00, 0x45,
0x05, 0x03, 0x00, 'X', 'Y', 'Z',
0x06, 0x05, 0x00, 0x11, 0x22, 0x00, 0x00, 0x33,
0x07, 0x01, 0x00, 0x01};
EXPECT_TRUE(ArraysMatch(expected, buf_));
}
Also, another example is in https://github.com/whunmr/msgrpc.
Chromium has an interesting variation of an X-macro at dom_code_data.inc. Except it's not just a macro, but an entirely separate file.
This file is intended for keyboard input mapping between different platforms' scancodes, USB HID codes, and string-like names.
The file contains code like:
DOM_CODE_DECLARATION {
// USB evdev XKB Win Mac Code
DOM_CODE(0x000000, 0x0000, 0x0000, 0x0000, 0xffff, NULL, NONE), // Invalid
...
};
Each macro invocation actually passes in 7 arguments, and the macro can choose which arguments to use and which to ignore. One usage is to map between OS keycodes and platform-independent scancodes and DOM strings. Different macros are used on different OSes to pick the keycodes appropriate for that OS.
// Table of USB codes (equivalent to DomCode values), native scan codes,
// and DOM Level 3 |code| strings.
#if defined(OS_WIN)
#define DOM_CODE(usb, evdev, xkb, win, mac, code, id) \
{ usb, win, code }
#elif defined(OS_LINUX)
#define DOM_CODE(usb, evdev, xkb, win, mac, code, id) \
{ usb, xkb, code }
#elif defined(OS_MACOSX)
#define DOM_CODE(usb, evdev, xkb, win, mac, code, id) \
{ usb, mac, code }
#elif defined(OS_ANDROID)
#define DOM_CODE(usb, evdev, xkb, win, mac, code, id) \
{ usb, evdev, code }
#else
#define DOM_CODE(usb, evdev, xkb, win, mac, code, id) \
{ usb, 0, code }
#endif
#define DOM_CODE_DECLARATION const KeycodeMapEntry usb_keycode_map[] =
#include "ui/events/keycodes/dom/dom_code_data.inc"
#undef DOM_CODE
#undef DOM_CODE_DECLARATION
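A reduced, self-contained sketch of the same column-picking idea follows. It uses a list macro instead of a separate .inc file so that it fits in one block. KEY_TABLE, KeyEntry, the PICK_* macros and native_for are hypothetical names, and the table values are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table: KEY(usb, win, mac, name). Values are illustrative. */
#define KEY_TABLE(KEY)                    \
    KEY(0x070004, 0x001e, 0x0000, "KeyA") \
    KEY(0x070005, 0x0030, 0x000b, "KeyB")

typedef struct {
    unsigned long usb;
    unsigned native;
    const char *code;
} KeyEntry;

/* Each expansion keeps the columns it needs and ignores the rest. */
#define PICK_WIN(usb, win, mac, code) { usb, win, code },
static const KeyEntry win_map[] = { KEY_TABLE(PICK_WIN) };
#undef PICK_WIN

#define PICK_MAC(usb, win, mac, code) { usb, mac, code },
static const KeyEntry mac_map[] = { KEY_TABLE(PICK_MAC) };
#undef PICK_MAC

/* Look up the native code for a USB code; returns 0 if not found. */
static unsigned native_for(const KeyEntry *map, size_t n, unsigned long usb)
{
    size_t i;
    assert(map != NULL);
    for (i = 0; i < n; i++)
        if (map[i].usb == usb)
            return map[i].native;
    return 0;
}
```

Both tables come from the same rows, so adding a key in one place updates every mapping.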
My humble example:
One of the steps in speeding up the FFmpeg HEVC decoder was to hardcode a matrix consisting of only three rows of small integer coefficients, which is used in several places:
https://github.com/aliakseis/FFmpegPlayer/commit/53a28b61cd98e1dda6d04251b713d39122c021d2#diff-8c65aa37510be2621e7b5a550a33c445b4c85607a789c9b483c2e78cdffcd65bL607

Designing an API with compile-time option to remove first parameter to most functions and use a global

I'm trying to design a portable API in ANSI C89/ISO C90 to access a wireless networking device on a serial interface. The library will have multiple network layers, and various versions need to run on embedded devices as small as an 8-bit micro with 32K of code and 2K of data, on up to embedded devices with a megabyte or more of code and data.
In most cases, the target processor will have a single network interface and I'll want to use a single global structure with all state information for that device. I don't want to pass a pointer to that structure through the network layers.
In a few cases (e.g., device with more resources that needs to live on two networks) I will interface to multiple devices, each with their own global state, and will need to pass a pointer to that state (or an index to a state array) through the layers.
I came up with two possible solutions, but neither one is particularly pretty. Keep in mind that the full driver will potentially be 20,000 lines or more, cover multiple files, and contain hundreds of functions.
The first solution requires a macro that discards the first parameter for every function that needs to access the global state:
// network.h
typedef struct dev_t {
int var;
long othervar;
char name[20];
} dev_t;
#ifdef IF_MULTI
#define foo_function( x, a, b, c) _foo_function( x, a, b, c)
#define bar_function( x) _bar_function( x)
#else
extern dev_t DEV;
#define IFACE (&DEV)
#define foo_function( x, a, b, c) _foo_function( a, b, c)
#define bar_function( x) _bar_function( )
#endif
int bar_function( dev_t *IFACE);
int foo_function( dev_t *IFACE, int a, long b, char *c);
// network.c
#ifndef IF_MULTI
dev_t DEV;
#endif
int bar_function( dev_t *IFACE)
{
memset( IFACE, 0, sizeof *IFACE);
return 0;
}
int foo_function( dev_t *IFACE, int a, long b, char *c)
{
bar_function( IFACE);
IFACE->var = a;
IFACE->othervar = b;
strcpy( IFACE->name, c);
return 0;
}
The second solution defines macros to use in the function declarations:
// network.h
typedef struct dev_t {
int var;
long othervar;
char name[20];
} dev_t;
#ifdef IF_MULTI
#define DEV_PARAM_ONLY dev_t *IFACE
#define DEV_PARAM DEV_PARAM_ONLY,
#else
extern dev_t DEV;
#define IFACE (&DEV)
#define DEV_PARAM_ONLY void
#define DEV_PARAM
#endif
int bar_function( DEV_PARAM_ONLY);
// I don't like the missing comma between DEV_PARAM and arg2...
int foo_function( DEV_PARAM int a, long b, char *c);
// network.c
#ifndef IF_MULTI
dev_t DEV;
#endif
int bar_function( DEV_PARAM_ONLY)
{
memset( IFACE, 0, sizeof *IFACE);
return 0;
}
int foo_function( DEV_PARAM int a, long b, char *c)
{
bar_function( IFACE);
IFACE->var = a;
IFACE->othervar = b;
strcpy( IFACE->name, c);
return 0;
}
The C code to access either method remains the same:
// multi.c - example of multiple interfaces
#define IF_MULTI
#include "network.h"
dev_t if0, if1;
int main()
{
foo_function( &if0, -1, 3.1415926, "public");
foo_function( &if1, 42, 3.1415926, "private");
return 0;
}
// single.c - example of a single interface
#include "network.h"
int main()
{
foo_function( 11, 1.0, "network");
return 0;
}
Is there a cleaner method that I haven't figured out? I lean toward the second since it should be easier to maintain, and it's clearer that there's some macro magic in the parameters to the function. Also, the first method requires prefixing the function names with "_" when I want to use them as function pointers.
I really do want to remove the parameter in the "single interface" case to eliminate unnecessary code to push the parameter onto the stack, and to allow the function to access the first "real" parameter in a register instead of loading it from the stack. And, if at all possible, I don't want to have to maintain two separate codebases.
Thoughts? Ideas? Examples of something similar in existing code?
(Note that using C++ isn't an option, since some of the planned targets don't have a C++ compiler available.)
I like your second solution. I just prefer declaring every function twice rather than have that PARAM macro in the public header. I much prefer to put macro hijinks in the hidden C file.
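// common header
#ifdef IF_MULTI
int foo_func1(dev_t* iface, int a);
int foo_func2(dev_t* iface, int a, int b);
int foo_func3(dev_t* iface);
#else
int foo_func1(int a);
int foo_func2(int a, int b);
int foo_func3(void);
#endif
// your C file
// ("if" is a C keyword, so the parameter is named iface)
#ifdef IF_MULTI
#define IF_PARM_ONLY dev_t* iface
#define IF_PARM IF_PARM_ONLY,
#define GET_IF() (iface)
#else
dev_t global_if;
#define IF_PARM_ONLY void
#define IF_PARM
#define GET_IF() (&global_if)
#endif
int foo_func1(IF_PARM int a)
{
GET_IF()->x = a;
return GET_IF()->status;
}
int foo_func2(IF_PARM int a, int b) { /* ... */ return 0; }
// IF_PARM would leave a trailing comma here, so use IF_PARM_ONLY
int foo_func3(IF_PARM_ONLY) { /* ... */ return 0; }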
Here's a solution that won't work if you have threads (or switch interfaces on re-entrance or something like that), but it is a clean interface, and it might work for you.
You could have your single instance functions using a global DEV, and have your multi interface functions set this global and call their single instance counterparts.
For example:
dev_t *DEV;
int foo_function(int x, int y)
{
/* DEV->whatever; */
return DEV->status;
}
int foo_function_multi(dev_t *IFACE, int x, int y)
{
DEV = IFACE;
return foo_function(x, y);
}
Another option is to use variadic args, and pass and fetch an extra arg (which contains the interface to use) #ifdef MULTI, but that's horrible because you lose your type safety, and would prevent passing the arg in a register which you possibly care quite a bit about on your platform. Also, all functions with variadic args must have at least one named argument, and your question is all about avoiding arguments! But anyway:
#include <stdarg.h>
#ifndef MULTI
dev_t *DEV;
#endif
int foo(int x, int y, ...)
{
#ifdef MULTI
dev_t *DEV;
va_list args;
va_start(args, y);
DEV = va_arg(args, dev_t *);
va_end(args);
#endif
/* DEV->whatever */
return DEV->status;
}
// call from single
int quux()
{
int status = foo(23, 17);
return status;
}
// call from multi
int quux()
{
int status = foo(23, 17, &if0);
return status;
}
Personally I prefer your first solution :-)
This will work on gcc:
#ifdef TOMSAPI_SMALL
#define TOMSAPI_ARGS( dev, ...) (__VA_ARGS__)
#else // ! TOMSAPI_SMALL
#define TOMSAPI_ARGS( dev, ...) (dev, ## __VA_ARGS__)
#endif // TOMSAPI_SMALL
#ifdef TOMSAPI_SMALL
#define TOMSAPI_DECLARE_DEVP(local_dev_ptr) device_t * local_dev_ptr = &global_dev; NULL
// The trailing NULL is to make the compiler make you put a ; after calling the macro,
// but without allowing something that would mess up the declaration if you forget the ;
// You can't use the do{...}while(0) trick for a variable declaration.
#else // ! TOMSAPI_SMALL
#define TOMSAPI_DECLARE_DEVP(local_dev_ptr) device_t * local_dev_ptr = arg_dev; NULL
#endif // TOMSAPI_SMALL
and then
int tomsapi_init TOMSAPI_ARGS(device_t *arg_dev, void * arg_for_illustration_purposes ) {
TOMSAPI_DECLARE_DEVP( my_dev );
my_dev->stuff = arg_for_illustration_purposes;
return 0;
}
Using this method you would have to ensure that all of your API functions use the same name for the device pointer, but all of your function definitions and declarations would look like they take the full number of arguments. If that is not important to you, you could do:
#ifdef TOMSAPI_SMALL
#define TOMSAPI_ARGS(...) (__VA_ARGS__)
#else // ! TOMSAPI_SMALL
#define TOMSAPI_ARGS(...) (device_t *dev, ## __VA_ARGS__)
#endif // TOMSAPI_SMALL
#ifdef TOMSAPI_SMALL
#define TOMSAPI_DECLARE_DEVP() device_t * dev = &global_dev; NULL
#else // ! TOMSAPI_SMALL
#define TOMSAPI_DECLARE_DEVP() NULL
#endif // TOMSAPI_SMALL
and then
int tomsapi_init TOMSAPI_ARGS(void * arg_for_illustration_purposes ) {
TOMSAPI_DECLARE_DEVP();
dev->stuff = arg_for_illustration_purposes;
return 0;
}
But this ends up looking like dev is never declared to someone reading your code.
All of that being said, you may find that on a small single-device platform, using a global device struct ends up costing more than passing the pointer around, because of the number of times the address of the struct has to be reloaded. This is more likely if your API is stacked (some of your functions call others of your functions and pass them the dev pointer), uses a lot of tail recursion, and/or your platform passes most arguments in registers rather than on the stack.
EDIT:
I just realized that this method can be a problem for API functions that take no additional arguments, even if you use the ## operator, when your compiler forces you to write int foo(void) for functions that take no arguments.
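One possible way around the no-argument case is a second macro that expands to (void) in the small build. The sketch below assumes the single-device build (TOMSAPI_SMALL is defined at the top purely for illustration). TOMSAPI_NOARGS, TOMSAPI_DEVP, tomsapi_reset and the device_t layout are hypothetical names, not part of the macros above.

```c
#include <assert.h>
#include <stddef.h>

#define TOMSAPI_SMALL 1  /* assumption: single-device build */

typedef struct device_t { void *stuff; } device_t;  /* hypothetical layout */

#ifdef TOMSAPI_SMALL
static device_t global_dev;
/* No device parameter: the parameter list becomes (void). */
#define TOMSAPI_NOARGS(dev) (void)
#define TOMSAPI_DEVP(arg)   (&global_dev)
#else
/* The device pointer is the only parameter. */
#define TOMSAPI_NOARGS(dev) (dev)
#define TOMSAPI_DEVP(arg)   (arg)
#endif

int tomsapi_reset TOMSAPI_NOARGS(device_t *arg_dev)
{
    device_t *my_dev = TOMSAPI_DEVP(arg_dev);
    assert(my_dev != NULL);
    my_dev->stuff = NULL;
    return 0;
}
```

The cost is that every function has to pick the right macro, which is the same trade-off as the DEV_PARAM_ONLY / DEV_PARAM pair in the question.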

Define array and symbolic indices at same time

I'm trying to think of a clever way (in C) to create an array of strings, along with symbolic names (enum or #define) for the array indices, in one construct for easy maintenance. Something like:
const char *strings[] = {
M(STR_YES, "yes"),
M(STR_NO, "no"),
M(STR_MAYBE, "maybe")
};
where the result would be equivalent to:
const char *strings[] = {"yes", "no", "maybe"};
enum indices {STR_YES, STR_NO, STR_MAYBE};
(or #define STR_YES 0, etc)
but I'm drawing a blank for how to construct the M macro in this case.
Any clever ideas?
A technique used in the clang compiler source is to create .def files that contain a list like this. Such a file is written like a C file and can easily be maintained without touching the other code files that use it. For example:
#ifndef KEYWORD
#define KEYWORD(X)
#endif
#ifndef LAST_KEYWORD
#define LAST_KEYWORD(X) KEYWORD(X)
#endif
KEYWORD(return)
KEYWORD(switch)
KEYWORD(while)
....
LAST_KEYWORD(if)
#undef KEYWORD
#undef LAST_KEYWORD
The file is then included like this:
/* some code */
#define KEYWORD(X) #X,
#define LAST_KEYWORD(X) #X
const char *strings[] = {
#include "keywords.def"
};
#define KEYWORD(X) kw_##X,
#define LAST_KEYWORD(X) kw_##X
enum {
#include "keywords.def"
};
In your case, you could do something similar. If you can live with STR_yes, STR_no, ... as enumerator names, you could use the same approach as above. Otherwise, pass the macro two things: one lowercase name and one uppercase name. Then you can stringize whichever one you want, as above.
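A minimal sketch of the two-argument variant, applied to the strings from the question. To stay self-contained it uses a list macro in place of a separate .def file. STRING_LIST, AS_ENUM, AS_TEXT, STR_COUNT and string_for are hypothetical names.

```c
#include <assert.h>

/* Single source of truth: X(enumerator, text) */
#define STRING_LIST(X)    \
    X(STR_YES,   "yes")   \
    X(STR_NO,    "no")    \
    X(STR_MAYBE, "maybe")

/* Expansion 1: the enumerators. STR_COUNT absorbs the trailing comma. */
#define AS_ENUM(id, text) id,
enum indices { STRING_LIST(AS_ENUM) STR_COUNT };
#undef AS_ENUM

/* Expansion 2: the strings, in the same order. */
#define AS_TEXT(id, text) text,
static const char *strings[] = { STRING_LIST(AS_TEXT) };
#undef AS_TEXT

/* Bounds-checked lookup. */
static const char *string_for(enum indices i)
{
    assert((int)i >= 0 && i < STR_COUNT);
    return strings[i];
}
```

Because both expansions walk the same list, strings[STR_MAYBE] is "maybe" by construction, and STR_COUNT gives the number of entries.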
This is a good place to use code generation. Use a language like perl, php or whatever to generate your .h file.
It is not required to put this into specific .def files; using only the preprocessor is perfectly possible. I usually define a list named ...LIST where each element is contained within ...LIST_ELEMENT. Depending on what I will use the list for I will either just separate with a comma for all but the last entry (simplest), or in the general case make it possible to select the separator individually on each usage. Example:
#include <string.h>
#define DIRECTION_LIST \
DIRECTION_LIST_ELEMENT( up, DIRECTION_LIST_SEPARATOR ) \
DIRECTION_LIST_ELEMENT( down, DIRECTION_LIST_SEPARATOR ) \
DIRECTION_LIST_ELEMENT( right, DIRECTION_LIST_SEPARATOR ) \
DIRECTION_LIST_ELEMENT( left, NO_COMMA )
#define COMMA ,
#define NO_COMMA /**/
#define DIRECTION_LIST_ELEMENT(elem, sep) elem sep
#define DIRECTION_LIST_SEPARATOR COMMA
typedef enum {
DIRECTION_LIST
} direction_t;
#undef DIRECTION_LIST_ELEMENT
#undef DIRECTION_LIST_SEPARATOR
#define DIRECTION_LIST_ELEMENT(elem, sep) void (*move_ ## elem)(struct object_s * object);
#define DIRECTION_LIST_SEPARATOR NO_COMMA
typedef struct object_s {
char *name;
// ...
DIRECTION_LIST
} object_t;
#undef DIRECTION_LIST_ELEMENT
#undef DIRECTION_LIST_SEPARATOR
static void move(object_t *object_p, const char * direction_string)
{
if (0) {
}
#define DIRECTION_LIST_SEPARATOR NO_COMMA
#define DIRECTION_LIST_ELEMENT(elem, sep) \
else if (strcmp(direction_string, #elem) == 0) { \
object_p->move_ ## elem(object_p); \
}
DIRECTION_LIST
#undef DIRECTION_LIST_ELEMENT
#undef DIRECTION_LIST_SEPARATOR
}

Resources