Asking the C preprocessor to generate code discriminating on an argument?

I've met my match. I thought I could do this, but like Captain Ahab, I don't know when to call it quits. If all else fails, I'll run a Python script to generate the code, but I'm hoping there's someone out there who is as obsessed with C preprocessor macros as I am...
I'm using GCC. Given a definition like this (or any syntactic changes that would simplify things):
// M(_slot_name, _type, _arg, _accessor)
#define DEFINE_SLOTS(M) \
M(ENABLE_A, bool, ENABLE_A_BITPOS, registry_enable_a) \
M(SLEEP_TIME, uint32_t, sleep_time, registry_sleep_time) \
M(ENABLE_B, bool, ENABLE_B_BITPOS, registry_enable_b) \
M(WAKE_TIME, uint32_t, wake_time, registry_wake_time)
I want the C preprocessor to expand the above into three different things. The first one is easy (generating an enum for each _slot_name). What I'm getting tripped up on is using the _type field to conditionally generate different output.
1. An enum for each slot_name:
typedef enum {
ENABLE_A,
SLEEP_TIME,
ENABLE_B,
WAKE_TIME,
} slot_id_t;
(This one is easy - I know how to do it...)
2. An enum that only includes the slots with a bool type:
typedef enum {
ENABLE_A_BITPOS,
ENABLE_B_BITPOS,
} bitpos_t;
3. Functions whose bodies differ by the _type field:
bool registry_enable_a(void) { return foo(ENABLE_A_BITPOS); }
uint32_t registry_sleep_time(void) { return bar->sleep_time; }
bool registry_enable_b(void) { return foo(ENABLE_B_BITPOS); }
uint32_t registry_wake_time(void) { return bar->wake_time; }

Provided that the _type values are all limited to elements of a known-in-advance set and that they contain only single-token identifiers,* you can engage token pasting and some supplementary macros to achieve what you describe. Example:
// M(_slot_name, _type, _arg, _accessor)
#define DEFINE_SLOTS(M) \
M(ENABLE_A, bool, ENABLE_A_BITPOS, registry_enable_a) \
M(SLEEP_TIME, uint32_t, sleep_time, registry_sleep_time) \
M(ENABLE_B, bool, ENABLE_B_BITPOS, registry_enable_b) \
M(WAKE_TIME, uint32_t, wake_time, registry_wake_time)
#define SLOT_ENUM_VALUE(_slot_name, ...) _slot_name,
#define bool_BP_ENUM_VALUE(_arg) _arg,
#define uint32_t_BP_ENUM_VALUE(_arg) /* nothing */
#define BITPOS_ENUM_VALUE(_1, _type, _arg, _4) _type ## _BP_ENUM_VALUE(_arg)
#define bool_SLOT_FUNCTION(_slot_name, _type, _arg, _accessor) \
_type _accessor(void) { return foo(_arg); }
#define uint32_t_SLOT_FUNCTION(_slot_name, _type, _arg, _accessor) \
_type _accessor(void) { return bar->_arg; }
#define SLOT_FUNCTION(_slot_name, _type, _arg, _accessor) \
_type ## _SLOT_FUNCTION(_slot_name, _type, _arg, _accessor)
////////
typedef enum {
DEFINE_SLOTS(SLOT_ENUM_VALUE)
} slot_id_t;
typedef enum {
DEFINE_SLOTS(BITPOS_ENUM_VALUE)
} bitpos_t;
DEFINE_SLOTS(SLOT_FUNCTION)
The key here is that macro expansion does not provide for any conditional logic per se, but it does provide for treating data as code. After all, that's what the X-macro approach you were already using does. Combining that with token pasting can get you a lot more (wholly deterministic) variety in your macro expansions.
*You can use typedefs to provide single-token identifiers where necessary.

Related

C MacroMagic - Struct Definition

I'm looking for a way to define a struct where the user may enable or disable individual struct members, as in this example (pseudo-code):
#define DEF_STRUCT_1(NAME,VAL1,VAL2) \
struct my_struct_t \
{ \
#if(NAME == TRUE) \
bool name; \
#endif \
#if(VAL1 == TRUE) \
bool val1; \
#endif \
#if(VAL2 == TRUE) \
bool val2; \
#endif \
} instance1
void main() {
DEF_STRUCT_1(TRUE,FALSE,TRUE);
instance1.name = true;
//instance1.val1 = false; // error, unavailable
instance1.val2 = false;
}
I'm not sure how useful this is, but the following should do what you ask:
#define CONDITIONAL_TRUE(code) code
#define CONDITIONAL_FALSE(code)
#define DEF_STRUCT_1(NAME,VAL1,VAL2) \
struct my_struct_t \
{ \
CONDITIONAL_##NAME(bool name;) \
CONDITIONAL_##VAL1(bool val1;) \
CONDITIONAL_##VAL2(bool val2;) \
} instance1
int main() {
DEF_STRUCT_1(TRUE,FALSE,TRUE);
instance1.name = true;
//instance1.val1 = false; // error, unavailable
instance1.val2 = false;
}
All the TRUE/FALSE parameters would have to be available at compile-time. And if you want more than one version of these parameters to be used in the same program, you should make the struct name a parameter as well.
Since you say that this is intended for a library, it isn't clear how you're planning for the library code to be able to access this struct, since it would need to know which members are available. This significantly reduces the usefulness of this method.
A more common method used by libraries is to have a config.h file, editable by the library user, with definitions such as #define USE_NAME_MEMBER 1. Then you can make a normal struct definition with #if directives:
//in mylibrary.h:
#include <mylibrary_config.h>
struct my_struct_t {
#if USE_NAME_MEMBER
bool name;
#endif
// ...
};
Then you would also put #if directives around any library code that accesses the name member.
Given that the struct needs to be generated differently at compile-time, given some conditions, you will be facing the problem that all code using the struct will need to be modified accordingly. Compiler switches (#ifdef FOO .... #endif) tend to scale badly with increased complexity. If there is a large number of struct members, all the needed compiler switches will make a horrible, unmaintainable mess out of the program.
There is a well-known design pattern known as "X macros", that can be used to centralize maintenance in programs to one single place, as well as allowing compile-time iteration of all items involved. They make the code hard to read too, and therefore they are a bit of a last resort. But they are a bit of de facto standard and their ugliness doesn't scale with complexity, so they are preferred over some compiler switch madness. It goes like this:
#define INSTANCE_LIST \
/* name, type */ \
X(name, bool) \
X(val1, bool) \
X(val2, bool)
typedef struct
{
#define X(name, type) type name;
INSTANCE_LIST
#undef X
} instance_t;
This code gets pre-processed into:
typedef struct
{
bool name;
bool val1;
bool val2;
} instance_t;
The only part that needs to be maintained is the "INSTANCE_LIST". By commenting out a line in the list, that struct member will go away. This means that all code using the struct has to be generated from the same list. For example, let's add code to the same example that lists the init value of each member and then sets them:
#include <stdbool.h>
#include <stdio.h>
#define INSTANCE_LIST \
/* name, type, init */ \
X(name, bool, true) \
X(val1, bool, false) \
X(val2, bool, false)
typedef struct
{
#define X(name, type, init) type name;
INSTANCE_LIST
#undef X
} instance_t;
int main (void)
{
instance_t inst;
#define X(name, type, init) inst.name = init;
INSTANCE_LIST
#undef X
printf("%d ", inst.name);
printf("%d ", inst.val1);
printf("%d ", inst.val2);
}
Very flexible and maintainable - you can easily add more struct members without changing any other macro than the list. But as mentioned, the down-side is that the code looks quite cryptic, especially to those who aren't used to this design pattern.

Making x macro work with do while 0

I am trying to make an X macro work with do while (0) because checkpatch.pl complains about it. However, that breaks the logic. Does anyone have suggestions for making it work?
#define X_TYPES do { \
X(BABA, "baba") \
X(INVALID, "invalid") \
} while (0)
#define X(type, name) type,
enum x_type {
X_TYPES
};
#undef X
#define X(type, name) name,
const char *x_name[] = {
X_TYPES
};
#undef X
int main()
{
return 0;
}
You cannot wrap the X macro with do while(0), because it will not be syntactically correct for the enumeration use cases that it is intended for.
Seek an exception mechanism in your checker script, or place the X macro into a separate source file that is excluded from being checked.

Weird macro expansion

I want to simulate classes in C and hide the implementation behind macros, but I got unexpected macro expansion behaviour.
#define decl_class struct class { void *base ## ;
#define end_class } ## ; typedef struct class class ## ;
#define decl_methods struct class ## Methods {
#define method(returnType, methodName, ...) returnType (*methodName)(struct class *self, __VA_ARGS__) ## ;
#define end_methods } ## ;
#define class Integer
decl_class
int value;
decl_methods
method(int, getValue)
end_methods
end_class
#undef class
#define class Double
decl_class
double value;
decl_methods
method(double, getValue)
end_methods
end_class
#undef class
The compiler says that I am declaring struct classMethods twice (class should have been replaced by the name of the class). This means that "class" doesn't get replaced when I want it to. Is it even possible to do so?
Your first problem is that
#define end_methods } ## ;
is a syntax error (if the macro is expanded), because the result of the token paste is not a single valid token. You should have gotten error messages like
error: pasting "}" and ";" does not give a valid preprocessing token
Your second problem is that token pastes are executed before nested macro expansion. That means your macro
#define decl_methods struct class ## Methods {
is effectively the same as if you had written
#define decl_methods struct classMethods {
To get it to do what you want, class must be a formal parameter to a function-like macro:
#define decl_class(class) struct class {
#define end_class(class) }; typedef struct class class;
#define decl_methods(class) struct class ## Methods {
#define end_methods(class) };
#define method(class, returnType, methodName, ...) \
returnType (*methodName)(struct class *self, __VA_ARGS__);
and then
decl_class(Double)
double value;
decl_methods(Double)
method(Double, double, get_value);
end_methods(Double)
end_class(Double)
I suppose you could avoid having to repeat the name of the class in every macro invocation by having an additional set of macros that stick the class pseudo-argument in there, but (for reasons too tedious to get into here; read the "Argument Prescan" section of the GNU CPP manual very carefully) you will need two layers of nested expansion to get the effect you want:
#define decl_class__(class_) struct class_ {
#define decl_class_(class_) decl_class__(class_)
#define decl_class decl_class_(class)
#define decl_methods__(class_) struct class_ ## Methods {
#define decl_methods_(class_) decl_methods__(class_)
#define decl_methods decl_methods_(class)
/* etc */
This is technically only required when the innermost macro needs to use ## (or #) but if you're seriously going to use these macros in a real program, you should do it uniformly for all of them otherwise you'll be tearing your hair out six months later.
And after you get past all of that you will discover that your method macro doesn't work right for zero-argument methods, e.g.
#define class Integer
method(int, getValue)
either throws an error because, in standard C, ... in a macro parameter list must receive at least one argument, or it expands to a syntactically invalid declaration,
int (*getValue)(struct Integer *self, );
The only way to work around this one is to use a GNU extension:
#define method__(class_, returnType, methodName, ...) \
returnType (*methodName)(struct class_ *self, ##__VA_ARGS__);
In GNU extended C, ## in between , and __VA_ARGS__ has the special effect of causing the comma to be deleted when the ... received no arguments. (This extension was proposed for standardization about 15 years ago, but the committee wasn't interested.)
At this point I invite you to reconsider the possibility of just using C++ instead.

Combining _Generic macros

I am delighted by C11's _Generic mechanism - switching on type is something I miss from C++. It is however proving difficult to compose.
For an example, given functions:
bool write_int(int);
bool write_foo(foo);
bool write_bar(bar);
// bool write_unknown is not implemented
I can then write
#define write(X) _Generic((X), \
int : write_int, \
foo: write_foo, \
bar: write_bar, \
default: write_unknown)(X)
and, provided I don't try to use &write or pass it to a function, I can call write(obj) and, provided obj is an instance of one of those types, all is well.
However, in general foo and bar are entirely unrelated to each other. They are defined in different headers, rarely (but occasionally) used together in a single source file. Where then should the macro expanding to the _Generic be written?
At present, I am accumulating header files called things like write.h, equal.h, copy.h, move.h each of which contains a set of function prototypes and a single _Generic. This is workable, but not brilliant. I don't like the requirement to collect together a list of every type in the program in a single place.
I would like to be able to define type foo in a header file, along with the function write_foo, and somehow have the client code able to call the 'function' write. Default looks like a vector through which this could be achieved.
The closest match I can find on this site is c11 generic adding types which has a partial solution, but it's not quite enough for me to see how to combine the various macros.
Let's say that, somewhere in a header file that defines write_bar, we have an existing macro definition:
#define write(x) _Generic((x), bar: write_bar, default: some_magic_here)(x)
Or we could omit the trailing (x)
#define write_impl(x) _Generic((x), bar: write_bar, default: some_magic_here)
Further down in this header, I would like a version of write() that handles either foo or bar. I think it needs to call the existing macro in its default case, but I don't believe the preprocessor is able to rename the existing write macro. If it were able to, the following could work:
#ifndef WRITE_3
#define WRITE_3(x) write(x)
#undef write
#define write(x) _Generic((x), foo: write_foo, default: WRITE_3)(x)
#endif
Having just typed that out I can sort-of see a path forward:
// In bar.h
#ifndef WRITE_1
#define WRITE_1(x) _Generic((x), bar: write_bar)
#elif !defined(WRITE_2)
#define WRITE_2(x) _Generic((x), bar: write_bar)
#elif !defined(WRITE_3)
#define WRITE_3(x) _Generic((x), bar: write_bar)
#endif
// In foo.h
#ifndef WRITE_1
#define WRITE_1(x) _Generic((x), foo: write_foo)
#elif !defined(WRITE_2)
#define WRITE_2(x) _Generic((x), foo: write_foo)
#elif !defined(WRITE_3)
#define WRITE_3(x) _Generic((x), foo: write_foo)
#endif
// In write.h, which unfortunately needs to be included after the other two
// but happily they can be included in either order
#ifdef WRITE_2
#define write(x) WRITE_1(x) WRITE_2(x) (x)
#elif
// etc
#endif
This doesn't actually work though, since I can't find a way to make WRITE_N(x) expand to nothing when x doesn't match the argument list. I see the error
controlling expression type 'struct foo' not compatible with any generic association type
Or
expected expression // attempting to present an empty default clause
I believe to distribute the write() definition between several files | macros I need to work around either of the above. A _Generic clause which reduces to nothing in the default case would work, as would one which reduces to nothing if none of the types match.
Getting yet more hackish, if the functions take a pointer to a struct instead of an instance of one, and I provide write_void(void*x) {(void)x;} as the default option, then the code does compile and run. However, expanding write as
write(x) => write_void(x); write_foo(x); write_void(x);
is clearly pretty bad in itself, plus I don't really want to pass everything by pointer.
So - can anyone see a way to define a single _Generic 'function' incrementally, i.e. without starting with a list of all types it will map over? Thank you.
The need for type-generic functions across multiple, unrelated files suggests that the program design is poor.
Either those files are related and should share a common parent ("abstract base class") where the type-generic macros and function declarations can then be stated.
Or they are unrelated, but share some common method for whatever reason, in which case you need to invent a common, generic abstraction layer interface which they can then implement. You should always consider the program design on a system level the first thing you do.
This answer does not use _Generic, but proposes a different program design entirely.
To take the example from a comment, with bool equal(T lhs, T rhs). That's the latter of the above two cases, a common interface shared by multiple modules. The first thing to observe is that this is a functor, a function which can be used in turn by generic algorithms such as search/sort algorithms. The C standard suggests how functors should preferably be written:
int compare (const void* p1, const void* p2)
This is the format used by the standard functions bsearch and qsort. Unless you have good reasons, you shouldn't deviate from that format, because if you stick to it, you get searching and sorting for free. This form also has the advantage of doing the lesser, greater and equal checks all in the same function.
The classic C way to implement a common interface for such a function in C would be a header containing this macro:
Interface header:
#define compare(type, x, y) (compare_ ## type(x, y))
Module that implements the header:
// int.c
int compare_int (const void* p1, const void* p2)
{
int a = *(const int*)p1;
int b = *(const int*)p2;
return (a > b) - (a < b); // avoids the overflow that a - b can cause
}
Caller:
if( compare(int, a, b) == 0 )
{
// equal
}
This has the advantage of abstraction: the interface header file doesn't need to know all the types used. The disadvantage is that there is no type safety whatsoever.
(But this is C, you'll never get 100% type safety through the compiler. Use static analysis if it is a big concern.)
With C11 you can improve type safety somewhat by introducing a _Generic macro. There's a big problem with that though: that macro has to know about all existing types in advance, so you can't put it in an abstract interface header. Rather, it should not be in a common header because then you'll create a tight coupling between every single, unrelated module using that header. You could make such a macro in the calling application, not to define an interface, but to ensure type safety.
What you could do instead, is to enforce an interface through inheritance of an abstract base class:
// interface.h
typedef int compare_t (const void* p1, const void* p2);
typedef struct data_t data_t; // incomplete type
typedef struct
{
compare_t* compare;
data_t* data;
} interface_t;
The module that inherits the interface sets the compare function pointer to point at the specific comparison function, upon object creation. data is private to the module and could be anything. Suppose we create a module called "xy" that inherits the above interface:
//xy.c
#include <assert.h>
#include <stdlib.h>
#include "interface.h"
struct data_t
{
int x;
int y;
};
static int compare_xy (const void* p1, const void* p2)
{
// compare an xy object in some meaningful way
return 0; // placeholder so that the function returns a value
}
void xy_create (interface_t* inter, int x, int y)
{
inter->data = malloc(sizeof(data_t));
assert(inter->data != NULL);
inter->compare = compare_xy;
inter->data->x = x;
inter->data->y = y;
}
A caller can then work with the generic interface_t and call the compare member. We've achieved polymorphism, as the type-specific compare function will then get called.
Based loosely on Leushenko's answer to multiparameter generics I have come up with the following horrible solution. It requires that the arguments will be passed by pointer, and the boilerplate involved is pretty bad. It does compile and run though, in a fashion which allows functions to return a value.
// foo.h
#ifndef FOO
#define FOO
#include <stdio.h>
#include <stdbool.h>
struct foo
{
int a;
};
static inline int write_foo(struct foo* f)
{
(void)f;
return printf("Writing foo\n");
}
#if !defined(WRITE_1)
#define WRITE_1
#define WRITE_PRED_1(x) _Generic((x), struct foo * : true, default : false)
#define WRITE_CALL_1(x) \
_Generic((x), struct foo * \
: write_foo((struct foo*)x), default \
: write_foo((struct foo*)0))
#elif !defined(WRITE_2)
#define WRITE_2
#define WRITE_PRED_2(x) _Generic((x), struct foo * : true, default : false)
#define WRITE_CALL_2(x) \
_Generic((x), struct foo * \
: write_foo((struct foo*)x), default \
: write_foo((struct foo*)0))
#elif !defined(WRITE_3)
#define WRITE_3
#define WRITE_PRED_3(x) _Generic((x), struct foo * : true, default : false)
#define WRITE_CALL_3(x) \
_Generic((x), struct foo * \
: write_foo((struct foo*)x), default \
: write_foo((struct foo*)0))
#endif
#endif
// bar.h
#ifndef BAR
#define BAR
#include <stdio.h>
#include <stdbool.h>
struct bar
{
int a;
};
static inline int write_bar(struct bar* b)
{
(void)b;
return printf("Writing bar\n");
}
#if !defined(WRITE_1)
#define WRITE_1
#define WRITE_PRED_1(x) _Generic((x), struct bar * : true, default : false)
#define WRITE_CALL_1(x) \
_Generic((x), struct bar * \
: write_bar((struct bar*)x), default \
: write_bar((struct bar*)0))
#elif !defined(WRITE_2)
#define WRITE_2
#define WRITE_PRED_2(x) _Generic((x), struct bar * : true, default : false)
#define WRITE_CALL_2(x) \
_Generic((x), struct bar * \
: write_bar((struct bar*)x), default \
: write_bar((struct bar*)0))
#elif !defined(WRITE_3)
#define WRITE_3
#define WRITE_PRED_3(x) _Generic((x), struct bar * : true, default : false)
#define WRITE_CALL_3(x) \
_Generic((x), struct bar * \
: write_bar((struct bar*)x), default \
: write_bar((struct bar*)0))
#endif
#endif
// write.h
#ifndef WRITE
#define WRITE
#if defined(WRITE_3)
#define write(x) \
WRITE_PRED_1(x) ? WRITE_CALL_1(x) : WRITE_PRED_2(x) ? WRITE_CALL_2(x) \
: WRITE_CALL_3(x)
#elif defined(WRITE_2)
#define write(x) WRITE_PRED_1(x) ? WRITE_CALL_1(x) : WRITE_CALL_2(x)
#elif defined(WRITE_1)
#define write(x) WRITE_CALL_1(x)
#else
#error "Write not defined"
#endif
#endif
// main.c
#include "foo.h"
#include "bar.h"
#include "write.h"
int main()
{
struct foo f;
struct bar b;
int fi = write(&f);
int bi = write(&b);
return fi + bi;
}
I really hope there's a better way than this.

Define array and symbolic indices at same time

I'm trying to think of a clever way (in C) to create an array of strings, along with symbolic names (enum or #define) for the array indices, in one construct for easy maintenance. Something like:
const char *strings[] = {
M(STR_YES, "yes"),
M(STR_NO, "no"),
M(STR_MAYBE, "maybe")
};
where the result would be equivalent to:
const char *strings[] = {"yes", "no", "maybe"};
enum indices {STR_YES, STR_NO, STR_MAYBE};
(or #define STR_YES 0, etc)
but I'm drawing a blank for how to construct the M macro in this case.
Any clever ideas?
A technique used in the Clang compiler source is to create .def files that contain a list like this. Such a file is written like a C file and can easily be maintained without touching the other source files that use it. For example:
#ifndef KEYWORD
#define KEYWORD(X)
#endif
#ifndef LAST_KEYWORD
#define LAST_KEYWORD(X) KEYWORD(X)
#endif
KEYWORD(return)
KEYWORD(switch)
KEYWORD(while)
....
LAST_KEYWORD(if)
#undef KEYWORD
#undef LAST_KEYWORD
The code that uses the list then includes the file like this:
/* some code */
#define KEYWORD(X) #X,
#define LAST_KEYWORD(X) #X
const char *strings[] = {
#include "keywords.def"
};
#define KEYWORD(X) kw_##X,
#define LAST_KEYWORD(X) kw_##X
enum {
#include "keywords.def"
};
In your case, you could do something similar. If you can live with STR_yes, STR_no, ... as enumerator names, you could use the same approach as above. Otherwise, just pass the macro two things: one lowercase name and one uppercase name. Then you could stringize whichever one you want, as above.
This is a good place to use code generation. Use a language like perl, php or whatever to generate your .h file.
It is not required to put this into specific .def files; using only the preprocessor is perfectly possible. I usually define a list named ...LIST where each element is contained within ...LIST_ELEMENT. Depending on what I will use the list for I will either just separate with a comma for all but the last entry (simplest), or in the general case make it possible to select the separator individually on each usage. Example:
#include <string.h>
#define DIRECTION_LIST \
DIRECTION_LIST_ELEMENT( up, DIRECTION_LIST_SEPARATOR ) \
DIRECTION_LIST_ELEMENT( down, DIRECTION_LIST_SEPARATOR ) \
DIRECTION_LIST_ELEMENT( right, DIRECTION_LIST_SEPARATOR ) \
DIRECTION_LIST_ELEMENT( left, NO_COMMA )
#define COMMA ,
#define NO_COMMA /**/
#define DIRECTION_LIST_ELEMENT(elem, sep) elem sep
#define DIRECTION_LIST_SEPARATOR COMMA
typedef enum {
DIRECTION_LIST
} direction_t;
#undef DIRECTION_LIST_ELEMENT
#undef DIRECTION_LIST_SEPARATOR
#define DIRECTION_LIST_ELEMENT(elem, sep) void (*move_ ## elem)(struct object_s * object);
#define DIRECTION_LIST_SEPARATOR NO_COMMA
typedef struct object_s {
char *name;
// ...
DIRECTION_LIST
} object_t;
#undef DIRECTION_LIST_ELEMENT
#undef DIRECTION_LIST_SEPARATOR
static void move(object_t *object_p, const char * direction_string)
{
if (0) {
}
#define DIRECTION_LIST_SEPARATOR NO_COMMA
#define DIRECTION_LIST_ELEMENT(elem, sep) \
else if (strcmp(direction_string, #elem) == 0) { \
object_p->move_ ## elem(object_p); \
}
DIRECTION_LIST
#undef DIRECTION_LIST_ELEMENT
#undef DIRECTION_LIST_SEPARATOR
}
