Identifying and avoiding macros

I have a combination of libraries that are suffering from namespace pollution due to the use of macros and macro names in function prototypes of the same name. Specifically, I am using my GUI library Eagle and my friend's networking library Nilorea.
How do I obtain a list of all the function-like macros included in my code? I looked at MSVS and Code::Blocks and can't find a way to list them. I can also use grep, but that is only effective on a directory or an easily obtained list of files.
I read about this (Macro and function with same name) and discovered that I can wrap a function name in parentheses, as in (name), in a declaration to prevent macro expansion, but that only solves the cases where I am using a function with the same name as the macro. MinGW is using macro names for functions I cannot change.
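A minimal sketch of the parenthesis trick described above, assuming a hypothetical function and macro that share the name square (neither comes from Eagle, Nilorea, or MinGW). A function-like macro is expanded only when its name is directly followed by an opening parenthesis, so writing (square) in the declaration and at the call site reaches the real function:

```c
/* Hypothetical function-like macro that shadows a function of the same
   name. It deliberately computes a different result so the two paths
   can be told apart. */
#define square(x) ((x) * (x) + 1000)

/* The parentheses around the name stop the macro from expanding here,
   so this defines an ordinary function called square. */
int (square)(int x)
{
    return x * x;
}

/* Calls the real function: (square) is not followed by '(' as a token. */
int call_function(int x)
{
    return (square)(x);
}

/* Calls the macro: square is directly followed by '('. */
int call_macro(int x)
{
    return square(x);
}
```

As noted above, this only helps where you control the declaration and the call site; it does nothing for third-party headers that already use the name.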
I avoided some namespace collision by separating out bad headers like windows.h and defining NOGDI to avoid some others, but I can't seem to get rid of them all.
I know I can #undef a symbol if I need to and I've had limited success with this, along with changing the order of inclusion of headers, but I'm still having problems.
So what options do I have to scrub my code of these obnoxious macros?
I looked at the CPP (C preprocessor) program's help and it doesn't show any abilities to list the macros in a given header.
Do I need to write my own custom solution? What about Clang?
For an example of the collision between Nilorea and MinGW-W64, see here:
This is a macro in Nilorea that collides with MinGW:
/*! Free Handler to get errors */
#define Free( ptr ) \
if ( ptr )\
{\
free( ptr );\
ptr = NULL;\
}\
else\
{\
n_log( LOG_DEBUG , "Free( %s ) already done or NULL at line %d of %s \n", #ptr , __LINE__ , __FILE__ );\
}
The compiler error it generates is misleading, but it points to c:\mingw\i686-w64-mingw32\include\objidbase.h and oaidl.h in these function prototypes:
#if defined(__cplusplus) && !defined(CINTERFACE)
MIDL_INTERFACE("00000002-0000-0000-c000-000000000046")
IMalloc : public IUnknown
{
virtual void * STDMETHODCALLTYPE Alloc(
SIZE_T cb) = 0;
virtual void * STDMETHODCALLTYPE Realloc(
void *pv,
SIZE_T cb) = 0;
virtual void STDMETHODCALLTYPE Free(
void *pv) = 0;
virtual SIZE_T STDMETHODCALLTYPE GetSize(
void *pv) = 0;
virtual int STDMETHODCALLTYPE DidAlloc(
void *pv) = 0;
virtual void STDMETHODCALLTYPE HeapMinimize(
) = 0;
};

Related

C Macros function definition syntax question

I've been looking through a program called hickit, and at one point (count.c, in the function that starts at line 105) it calls a macro function (kavl_insert) from the Klib library as follows:
static void hk_count_nei2_core(int32_t n_pairs, struct cnt_nei2_aux *a, int r1, int r2)
{
struct cnt_nei2_aux *root = 0;
int32_t i, j, left;
unsigned cl;
left = 0;
kavl_insert(nei2, &root, &a[0], 0);
...
Looking at the Klib library (more specifically, in kavl.h), this function (I think) is defined as follows:
#define __KAVL_INSERT(suf, __scope, __type, __head, __cmp) \
__scope __type *kavl_insert_##suf(__type **root_, __type *x, unsigned *cnt_) { \
Later on in the kavl.h file there is this standalone line (line 322):
#define kavl_insert(suf, proot, x, cnt) kavl_insert_##suf(proot, x, cnt)
I don't have much technical knowledge of C (I just learned parts as they became relevant), and I'm wondering how this works. The casing is different, and there is the "__" prefix in the #define line. How does this work?

The first __KAVL_INSERT macro is used to declare functions which all start with the same prefix (kavl_insert_) and end with the specified suffix (parameter suf).
So, when you see this:
__KAVL_INSERT(foo, static, int, null, null)
preprocessor will replace it with a function with the appropriate name, scope, and parameter types:
static int *kavl_insert_foo(int **root_, int *x, unsigned *cnt_) { \
/* actual function body ... */ \
/* with lots of trailing backslashes ... */ \
/* because it's the only way to create ... */ \
/* a multiline macro in C */ \
}
The lowercase kavl_insert macro, on the other hand:
kavl_insert(foo, &something, &whatever, 0);
simply expands to the actual function call, i.e. it's equivalent to calling the function defined above:
kavl_insert_foo(&something, &whatever, 0);
The idea behind this kind of macro is usually to create generic, type-safe data structures in C using the preprocessor, as the klib library does for its various generic data structures.
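The two-macro pattern described above can be sketched in a few lines. This is not klib code: DEFINE_MAX, max_of, and the generated max_<suffix> functions are hypothetical names chosen to mirror the roles of __KAVL_INSERT and kavl_insert.

```c
/* Generator macro (plays the role of __KAVL_INSERT): defines a function
   named max_<suf> that operates on the given type. */
#define DEFINE_MAX(suf, type)                   \
    static type max_##suf(type a, type b) {     \
        return a > b ? a : b;                   \
    }

/* Call-site macro (plays the role of kavl_insert): pastes the suffix
   back onto the prefix to pick the matching generated function. */
#define max_of(suf, a, b) max_##suf(a, b)

/* Instantiate the generator once per type. */
DEFINE_MAX(int, int)
DEFINE_MAX(dbl, double)

/* max_of(int, 3, 7) expands to max_int(3, 7). */
int demo_int(void)
{
    return max_of(int, 3, 7);
}

/* max_of(dbl, 2.5, 1.5) expands to max_dbl(2.5, 1.5). */
double demo_dbl(void)
{
    return max_of(dbl, 2.5, 1.5);
}
```

Because each instantiation is a real function with real parameter types, the compiler type-checks every call, which is the main advantage over a void-pointer based container.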

Broken multi line macro

This multi line macro from the Nilorea library fails to compile when I include it in my C++ project. It is marked as extern "C".
Tried GodBolt, and the GCC 8.1 compiler barfs on the if statement in the following code : https://godbolt.org/z/Lq_7aT
#define Free( __ptr )\
if ( __ptr )\
{\
free( __ptr );\
__ptr = NULL;\
}
int* i = 0;
Free(i);
It should compile. Is this a matter of the standard in use?
I edited the question with a bad compilable example.

The Godbolt code fails to compile because:
1. You are calling the code outside a function.
2. You are attempting to assign to literal 0.
3. You fail to include the necessary headers.
In addition, as noted in the comments, double underscore in identifiers is reserved for the implementation. The compiler doesn’t diagnose this but it’s illegal anyway.
When fixing these three issues, it works:
#include <stdlib.h>
#define Free(ptr) \
if (ptr) \
{ \
free(ptr); \
ptr = NULL; \
}
int main(void) {
int *px = NULL;
Free(px);
}
(I’ve also fixed the atrocious, inconsistent spacing.)
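One further hardening that the answer above does not mention, offered as a sketch rather than as the Nilorea definition: a statement macro that expands to a bare if block stops compiling when it is used as the body of an if that has an else, because the semicolon after the macro call ends the statement early. Wrapping the body in do { } while (0) makes the macro behave like a single statement. The name FREE_AND_NULL and the helper release_if are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical replacement for the Free macro. free(NULL) is a no-op,
   so no explicit NULL check is needed before calling free. */
#define FREE_AND_NULL(ptr) \
    do {                   \
        free(ptr);         \
        (ptr) = NULL;      \
    } while (0)

/* Allocates an int, then either frees it through the macro or writes
   to it. Returns 1 if the pointer was reset to NULL, 0 if it was left
   alone, and -1 if the allocation failed. */
int release_if(int should_free)
{
    int is_null;
    int *p = malloc(sizeof *p);

    if (p == NULL)
        return -1;

    if (should_free)
        FREE_AND_NULL(p);   /* one statement, so the else below is legal */
    else
        *p = 42;

    is_null = (p == NULL);
    free(p);
    return is_null;
}
```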

Why doesn't ANSI C have namespaces?

Having namespaces seems like no-brainer for most languages. But as far as I can tell, ANSI C doesn't support it. Why not? Any plans to include it in a future standard?

For completeness there are several ways to achieve the "benefits" you might get from namespaces, in C.
One of my favorite methods is using a structure to house a bunch of method pointers which are the interface to your library.
You then use an extern instance of this structure, which you initialize inside your library to point to all your functions. This lets you keep names simple inside your library without stepping on the client's namespace (other than the extern variable at global scope: one variable versus possibly hundreds of methods).
There is some additional maintenance involved but I feel that it is minimal.
Here is an example:
/* interface.h */
struct library {
const int some_value;
void (*method1)(void);
void (*method2)(int);
/* ... */
};
extern const struct library Library;
/* end interface.h */
/* interface.c */
#include "interface.h"
void method1(void)
{
...
}
void method2(int arg)
{
...
}
const struct library Library = {
.method1 = method1,
.method2 = method2,
.some_value = 36
};
/* end interface.c */
/* client code */
#include <stdio.h>
#include "interface.h"
int main(void)
{
Library.method1();
Library.method2(5);
printf("%d\n", Library.some_value);
return 0;
}
/* end client code */
The use of . syntax creates a strong association over the classic Library_function(), Library_some_value method. There are some limitations however, for one you can't use macros as functions.

C does have namespaces. One for structure tags, and one for other types. Consider the following definition:
struct foo
{
int a;
};
typedef struct bar
{
int a;
} foo;
The first one has tag foo, and the latter is made into type foo with a typedef. Still, no name clash happens. This is because structure tags and types (built-in types and typedef'ed types) live in separate namespaces.
What C doesn't allow is creating new namespaces at will. C was standardized before this was deemed important in a language, and adding namespaces would also threaten backwards compatibility, because it requires name mangling to work right. I think this can be attributed to technicalities, not philosophy.
EDIT:
JeremyP fortunately corrected me and mentioned the namespaces I missed. There are namespaces for labels and for struct/union members as well.
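To illustrate the four name spaces mentioned in this answer (tags, members, labels, and ordinary identifiers), here is a small sketch in which the same spelling, node, is used in each of them without conflict. The function name count_down is hypothetical.

```c
/* "node" as a struct tag and, inside it, as a member name. */
struct node {
    int node;
};

/* Counts down from start to zero using the name "node" for a variable
   and a label as well, and returns the final counter value. */
int count_down(int start)
{
    struct node node;   /* ordinary identifier, distinct from the tag */

    node.node = start;  /* variable.member */
node:                   /* label with the same spelling */
    if (node.node > 0) {
        node.node--;
        goto node;
    }
    return node.node;
}
```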

C has namespaces. The syntax is namespace_name. You can even nest them as in general_specific_name. And if you want to be able to access names without writing out the namespace name every time, include the relevant preprocessor macros in a header file, e.g.
#define myfunction mylib_myfunction
This is a lot cleaner than name mangling and the other atrocities certain languages commit to deliver namespaces.
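A minimal sketch of the prefix convention with optional short aliases, as described in this answer. The names mylib_add, mylib_scale, and the aliases are hypothetical. Note that the aliases are ordinary macros, so they rewrite every later use of add and scale in the translation unit, which is the same kind of pollution discussed at the top of this page.

```c
/* Every public name carries the library prefix. */
static int mylib_add(int a, int b)
{
    return a + b;
}

static int mylib_scale(int x)
{
    return 10 * x;
}

/* Optional short aliases that a client could enable in its own code. */
#define add   mylib_add
#define scale mylib_scale

/* Uses the short names; the preprocessor maps them to the prefixed ones. */
int demo_alias(void)
{
    return scale(add(1, 2));
}
```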

Historically, C compilers don't mangle names (they do on Windows, but the mangling for the cdecl calling convention consists of only adding an underscore prefix).
This makes it easy to use C libraries from other languages (including assembler) and is one of the reasons why you often see extern "C" wrappers for C++ APIs.

Just historical reasons. Nobody thought of having something like a namespace at that time. Also, they were really trying to keep the language simple. They may add it in the future.

Not an answer, but not a comment. C doesn't provide a way to define namespace explicitly. It has variable scope. For example:
int i=10;
struct ex {
int i;
};
void foo() {
int i=0;
}
void bar() {
int i=5;
foo();
printf("my i=%d\n", i);
}
void foobar() {
foo();
bar();
printf("my i=%d\n", i);
}
You can use qualified names for variables and functions:
mylib.h
void mylib_init();
void mylib_sayhello();
The only difference from namespaces is that you cannot write using and cannot import from mylib.

ANSI C was invented before namespaces were.

Because people who want to add this capability to C have not gotten together and organized to put some pressure on compiler author teams and on ISO bodies.

C doesn't support namespaces like C++. The implementation of C++ namespaces mangle the names. The approach outlined below allows you to get the benefit of namespaces in C++ while having names that are not mangled. I realize that the nature of the question is why doesn't C support namespaces (and a trivial answer would be that it doesn't because it wasn't implemented :)). I just thought that it might help someone to see how I've implemented the functionality of templates and namespaces.
I wrote up a tutorial on how to get the advantage of namespaces and/or templates using C.
Namespaces and templates in C
Namespaces and templates in C (using Linked Lists)
For the basic namespace, one can simply prefix the namespace name as a convention.
namespace MY_OBJECT {
struct HANDLE;
HANDLE *init();
void destroy(HANDLE * & h);
void do_something(HANDLE *h, ... );
}
can be written as
struct MY_OBJECT_HANDLE;
struct MY_OBJECT_HANDLE *my_object_init(void);
void my_object_destroy(struct MY_OBJECT_HANDLE **h);
void my_object_do_something(struct MY_OBJECT_HANDLE *h, ...);
A second approach that I have needed that uses the concept of namespacing and templates is to use the macro concatenation and include. For example, I can create a
template<typename T> T multiply(T x, T y) { return x * y; }
using template files as follows
multiply-template.h
_multiply_type_ _multiply_(multiply)( _multiply_type_ x, _multiply_type_ y);
multiply-template.c
_multiply_type_ _multiply_(multiply)( _multiply_type_ x, _multiply_type_ y) {
return x*y;
}
We can now define int_multiply as follows. In this example, I'll create a int_multiply.h/.c file.
int_multiply.h
#ifndef _INT_MULTIPLY_H
#define _INT_MULTIPLY_H
#ifdef _multiply_
#undef _multiply_
#endif
#define _multiply_(NAME) int ## _ ## NAME
#ifdef _multiply_type_
#undef _multiply_type_
#endif
#define _multiply_type_ int
#include "multiply-template.h"
#endif
int_multiply.c
#include "int_multiply.h"
#include "multiply-template.c"
At the end of all of this, you will have a function and header file for:
int int_multiply( int x, int y ) { return x * y; }
I created a much more detailed tutorial on the links provided which show how it works with linked lists. Hopefully this helps someone!

You can. As in another answer, define function pointers in a struct.
However, declare it in your header file, mark it static const, and initialize it with the corresponding functions.
With -O1 or higher, the calls will be optimized into normal function calls.
eg:
void myfunc(void);
static const struct {
void(*myfunc)(void);
} mylib = {
.myfunc = myfunc
};
Take advantage of the #include statement so you do not need to define all functions in one single header.
Do not add header guards as you are including it more than once.
eg:
header1.h
#ifdef LIB_FUNC_DECL
void func1(void);
#elif defined(LIB_STRUCT_DECL)
struct {
void(*func)(void);
} submodule1;
#else
.submodule1.func = func1,
#endif
mylib.h
#define LIB_FUNC_DECL
#include "header1.h"
#undef LIB_FUNC_DECL
#define LIB_STRUCT_DECL
static const struct {
#include "header1.h"
#undef LIB_STRUCT_DECL
} mylib = {
#include "header1.h"
};

Can macros be used to simulate C++ templated functions?

I have a C program in which I need to create a whole family of functions which have the same signatures and bodies, and differ only in their types. What I would like to do is define a macro which generates all of those functions for me, as otherwise I will spend a long time copying and modifying the original functions. As an example, one of the functions I need to generate looks like this:
int copy_key__sint_(void *key, void **args, int argc, void **out) {
if ((*out = malloc(sizeof(int))) == NULL) {
return 1;
}
**((_int_ **) out) = *((_int_ *) key);
return 0;
}
The idea is that I could call a macro, GENERATE_FUNCTIONS("int", "sint") or something like this, and have it generate this function. The parts wrapped in underscores (_sint_ and _int_, italicized in the original post) are what need to be plugged in.
Is this possible?

I don't understand the example function that you are giving very well, but using macros for the task is relatively easy. You just wouldn't pass strings to the macro as arguments, but tokens:
#define DECLARE_MY_COPY_FUNCTION(TYPE, SUFFIX) \
int copy_function_ ## SUFFIX(unsigned count, TYPE* arg)
#define DEFINE_MY_COPY_FUNCTION(TYPE, SUFFIX) \
int copy_function_ ## SUFFIX(unsigned count, TYPE* arg) { \
/* do something with TYPE */ \
return whatever; \
}
You may then use this to declare the functions in a header file
DECLARE_MY_COPY_FUNCTION(unsigned, toto);
DECLARE_MY_COPY_FUNCTION(double, hui);
and define them in a .c file:
DEFINE_MY_COPY_FUNCTION(unsigned, toto);
DEFINE_MY_COPY_FUNCTION(double, hui);
In this version as stated here you might get warnings on superfluous `;'. But you can get rid of them by adding dummy declarations in the macros like this
#define DEFINE_MY_COPY_FUNCTION(TYPE, SUFFIX) \
int copy_function_ ## SUFFIX(unsigned count, TYPE* arg) { \
/* do something with TYPE */ \
return whatever; \
} \
enum { dummy_enum_for_copy_function_ ## SUFFIX }

Try something like this (I just tested the compilation, but not the result in an executed program):
#include <stdlib.h>
#define COPY_KEY(type, name) \
int name(void *key, void **args, int argc, void **out) { \
if ((*out = malloc(sizeof(type))) == NULL) { \
return 1; \
} \
**((type **) out) = *((type *) key); \
return 0; \
}
COPY_KEY(int, copy_key_sint)
For more on the subject of generic programming in C, read this blog, which contains a few examples, and also this book, which contains interesting solutions to the problem for basic data structures and algorithms.

That should work. To create copy_key_sint, use copy_key_ ## sint.
If you can't get this to work with CPP, then write a small C program which generates a C source file.
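A sketch of how the token-pasting suggestion could look for the function in the question. GENERATE_COPY_KEY is a hypothetical name, and the sketch assumes the intended behavior is to return 1 when the allocation fails and 0 on success.

```c
#include <stdlib.h>

/* Defines copy_key_<SUFFIX>, which allocates storage for one TYPE,
   copies *key into it, and stores the new pointer in *out.
   Returns 0 on success and 1 if malloc fails. */
#define GENERATE_COPY_KEY(TYPE, SUFFIX)                         \
    int copy_key_##SUFFIX(void *key, void **args, int argc,     \
                          void **out)                           \
    {                                                           \
        (void)args; /* unused in this sketch */                 \
        (void)argc; /* unused in this sketch */                 \
        if ((*out = malloc(sizeof(TYPE))) == NULL) {            \
            return 1;                                           \
        }                                                       \
        *(TYPE *)*out = *(TYPE *)key;                           \
        return 0;                                               \
    }

/* One line per member of the function family. */
GENERATE_COPY_KEY(int, sint)
GENERATE_COPY_KEY(double, dbl)
```

The arguments are passed as bare tokens (int, sint), not as string literals, because the ## operator pastes tokens and cannot unquote a string.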

Wouldn't a macro which just takes sizeof(*key) and calls a single function that uses memcpy be a lot cleaner (less preprocessor abuse and code bloat) than making a new function for each type just so it can do a native assignment rather than memcpy?
My view is that the whole problem is your attempt to apply C++ thinking to C. C has memcpy for a very good reason.
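The memcpy alternative suggested here might look like the sketch below. The names copy_key_bytes and COPY_KEY are hypothetical. One assumption differs from the question: the macro needs a typed pointer, since sizeof cannot be applied through a void pointer, so it only helps at call sites where the key's type is still known.

```c
#include <stdlib.h>
#include <string.h>

/* A single function for every key type: allocates size bytes, copies
   the key into them, and stores the new pointer in *out.
   Returns 0 on success and 1 if malloc fails. */
int copy_key_bytes(const void *key, size_t size, void **out)
{
    if ((*out = malloc(size)) == NULL)
        return 1;
    memcpy(*out, key, size);
    return 0;
}

/* Wrapper macro that derives the size from the pointer's own type. */
#define COPY_KEY(keyptr, out) \
    copy_key_bytes((keyptr), sizeof *(keyptr), (out))
```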

Get a pointer to the current function in C (gcc)?

Is there a magic variable in gcc holding a pointer to the current function?
I would like to have a kind of table containing for each function pointer a set of information.
I know there's a __func__ variable containing the name of the current function as a string but not as a function pointer.
This is not to call the function then but just to be used as an index.
EDIT
Basically, what I would like to do is run nested functions just before the execution of the current function (and also capture the return to perform some things).
Basically, this is like __cyg_profile_func_enter and __cyg_profile_func_exit (the instrumentation functions)... But the problem is that these instrumentation functions are global and not function-dedicated.
EDIT
In the linux kernel, you can use unsigned long kallsyms_lookup_name(const char *name) from include/linux/kallsyms.h ... Note that the CONFIG_KALLSYMS option must be activated.

void f() {
void (*fpointer)() = &f;
}

Here's a trick that gets the address of the caller, it can probably be cleaned up a bit.
Relies on a GCC extension for getting a label's value.
#include <stdio.h>
#define MKLABEL2(x) label ## x
#define MKLABEL(x) MKLABEL2(x)
#define CALLFOO do { MKLABEL(__LINE__): foo(&&MKLABEL(__LINE__));} while(0)
void foo(void *addr)
{
printf("Caller address %p\n", addr);
}
int main(int argc, char **argv)
{
CALLFOO;
return 0;
}

#define FUNC_ADDR (dlsym(dlopen(NULL, RTLD_NOW), __func__))
And compile your program like
gcc -rdynamic -o foo foo.c -ldl

I think you could build your table using strings (the function names) as keys, then look up by comparing with the __func__ builtin variable.
To enforce having a valid function name, you could use a macro that gets the function pointer, does some dummy operation with it (e.g. assigning it to a compatible function type temporary variable) to check that it's indeed a valid function identifier, and then stringifies (with #) the function name before being used as a key.
UPDATE:
What I mean is something like:
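#include <string.h>
#define MAX_FUNC_NAME_LENGTH 64 //example value
#define N_FUNCS 2 //example value
typedef struct {
char func_name[MAX_FUNC_NAME_LENGTH];
//rest of the info here
} func_info;
func_info table[N_FUNCS];
void foo(int arg);
void bar(int arg);
//GCC statement expression: the compiler diagnoses the assignment unless f is a void(int) function
#define CHECK_AND_GET_FUNC_NAME(f) ({void (*tmp)(int); tmp = f; (void)tmp; #f;})
void fill_it()
{
int i = -1;
strcpy(table[++i].func_name, CHECK_AND_GET_FUNC_NAME(foo));
strcpy(table[++i].func_name, CHECK_AND_GET_FUNC_NAME(bar));
//fill the rest
}
void lookup(const char *name) {
int i = -1;
//assumes name is present in the table
while(strcmp(name, table[++i].func_name));
//now i is the index of your entry, do whatever you need
}
void foo(int arg) {
lookup(__func__);
//do something
}
void bar(int arg) {
lookup(__func__);
//do something
}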
(the code might need some fixes, I haven't tried to compile it, it's just to illustrate the idea)

I also had the problem that I needed the current function's address when I created a macro template coroutine abstraction that people can use like modern coroutine language features (await and async). It compensates for a missing RTOS when there is a central loop which schedules different asynchronous functions as (cooperative) tasks. Turning interrupt handlers into asynchronous functions even causes race conditions like in a preemptive multi-tasking system.
I noticed that I need to know the caller function's address for the final return address of a coroutine (which is not return address of the initial call of course). Only asynchronous functions need to know their own address so that they can pass it as hidden first argument in an AWAIT() macro. Since instrumenting the code with a macro solution is as simple as just defining the function it suffices to have an async-keyword-like macro.
This is a solution with GCC extensions:
#define _VARGS(...) _VARGS0(__VA_ARGS__)
#define _VARGS0(...) ,##__VA_ARGS__
typedef union async_arg async_arg_t;
union async_arg {
void (*caller)(void*);
void *retval;
};
#define ASYNC(FUNCNAME, FUNCARGS, ...) \
void FUNCNAME (async_arg_t _arg _VARGS FUNCARGS) \
GENERATOR( \
void (*const THIS)(void*) = (void*) &FUNCNAME;\
static void (*CALLER)(void*), \
CALLER = _arg.caller; \
__VA_ARGS__ \
)
#define GENERATOR(INIT,...) { \
__label__ _entry, _start, _end; \
static void *_state = (void*)0; \
INIT; \
_entry:; \
if (_state - &&_start <= &&_end - &&_start) \
goto *_state; \
_state = &&_start; \
_start:; \
__VA_ARGS__; \
_end: _state = &&_entry; \
}
#define AWAIT(FUNCNAME,...) ({ \
__label__ _next; \
_state = &&_next; \
return FUNCNAME((async_arg_t)THIS,##__VA_ARGS__);\
_next: _arg.retval; \
})
#define _I(...) __VA_ARGS__
#define IF(COND,THEN) _IF(_I(COND),_I(THEN))
#define _IF(COND,THEN) _IF0(_VARGS(COND),_I(THEN))
#define _IF0(A,B) _IF1(A,_I(B),)
#define _IF1(A,B,C,...) C
#define IFNOT(COND,ELSE) _IFNOT(_I(COND),_I(ELSE))
#define _IFNOT(COND,ELSE) _IFNOT0(_VARGS(COND),_I(ELSE))
#define _IFNOT0(A,B) _IFNOT1(A,,_I(B))
#define _IFNOT1(A,B,C,...) C
#define IF_ELSE(COND,THEN,ELSE) IF(_I(COND),_I(THEN))IFNOT(_I(COND),_I(ELSE))
#define WAIT(...) ({ \
__label__ _next; \
_state = &&_next; \
IF_ELSE(_I(__VA_ARGS__), \
static __typeof__(__VA_ARGS__) _value;\
_value = (__VA_ARGS__); \
return; \
_next: _value; \
, return; _next:;) \
})
#define YIELD(...) do { \
__label__ _next; \
_state = &&_next; \
return IF(_I(__VA_ARGS__),(__VA_ARGS__));\
_next:; \
} while(0)
#define RETURN(VALUE) do { \
_state = &&_entry; \
if (CALLER != 0) \
CALLER((void*)(VALUE +0));\
return; \
} while(0)
#define ASYNCALL(FUNC, ...) FUNC ((void*)0,__VA_ARGS__)
I know, a more portable (and maybe secure) solution would use the switch-case statement instead of label addresses but I think, gotos are more efficient than switch-case-statements. It also has the advantage that you can use the macros within any other control structures easily and break will have no unexpected effects.
You can use it like this:
#include <stdint.h>
int spi_start_transfer(uint16_t, void *, uint16_t, void(*)());
#define SPI_ADDR_PRESSURE 0x24
ASYNC(spi_read_pressure, (void* dest, uint16_t num),
void (*callback)(void) = (void*)THIS; //see here! THIS == &spi_read_pressure
int status = WAIT(spi_start_transfer(SPI_ADDR_PRESSURE,dest,num,callback));
RETURN(status);
)
int my_gen() GENERATOR(static int i,
while(1) {
for(i=0; i<5; i++)
YIELD(i);
}
)
extern volatile int a;
ASYNC(task_read, (uint16_t threshold),
while(1) {
static uint16_t pressure;
int status = (int)AWAIT(spi_read_pressure, &pressure, sizeof pressure);
if (pressure > threshold) {
a = my_gen();
}
}
)
You must use AWAIT to call asynchronous functions for return value and ASYNCALL without return value. AWAIT can only be called by ASYNC-functions. You can use WAIT with or without value. WAIT results in the expression which was given as argument, which is returned AFTER the function is resumed. WAIT can be used in ASYNC-functions only. Keeping the argument with WAIT wastes one new piece of static memory for each WAIT() call with argument though so it is recommended to use WAIT() without argument. It could be improved, if all WAIT calls would use the same single static variable for the entire function.
It is only a very simple version of a coroutine abstraction. This implementation cannot have nested or intertwined calls of the same function because all static variables comprise one static stack frame.
If you want to solve this problem, you also need to distinguish resuming an old and starting a new function call. You can add details like a stack-frame queue at the function start in the ASYNC macro. Create a custom struct for each function's stack frame (which also can be done within the macro and an additional macro argument). This custom stack frame type is loaded from a queue when entering the macro, is stored back when exiting it or is removed when the call finishes.
You could use a stack frame index as alternative argument in the async_arg_t union. When the argument is an address, it starts a new call or when given a stack frame index it resumes an old call. The stack frame index or continuation must be passed as user-defined argument to the callback that resumes the coroutine.

If you went for C++ the following information might help you:
Objects are typed, functors are functions wrapped as objects, RTTI allows the identification of type at runtime.
Functors carry a runtime overhead with them, and if this is a problem for you I would suggest hard-coding the knowledge using code generation or leveraging an OO hierarchy of functors.

No, the function is not aware of itself. You will have to build the table you are talking about yourself, and then if you want a function to be aware of itself you will have to pass the index into the global table (or the pointer of the function) as a parameter.
Note: if you want to do this you should have a consistent naming scheme of the parameter.
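A sketch of the table-plus-parameter approach described in this answer, under the assumption that every instrumented function takes its own table index as a parameter named self. The struct func_info, the calls field, and the task functions are hypothetical placeholders.

```c
#include <stddef.h>

/* Hypothetical per-function record. */
struct func_info {
    void (*fn)(size_t self); /* the function this entry describes */
    int calls;               /* example bookkeeping field */
};

static void task_a(size_t self);
static void task_b(size_t self);

/* The global table, built by hand. */
static struct func_info table[] = {
    { task_a, 0 },
    { task_b, 0 },
};

/* Each function learns which entry is its own only through the
   consistently named self parameter. */
static void task_a(size_t self)
{
    table[self].calls++;
}

static void task_b(size_t self)
{
    table[self].calls += 2;
}

/* Invokes entry i, passing it its own index, and returns the updated
   bookkeeping value for that entry. */
int run_entry(size_t i)
{
    table[i].fn(i);
    return table[i].calls;
}
```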

If you want to do this in a 'generic' way, then you should use the facilities you already mention (__cyg_profile_func*) since that is what they are designed for. Anything else will have to be as ad hoc as your profile.
Honestly, doing things the generic way (with a filter) is probably less error prone than any new method that you will insert on-the-fly.

You can capture this information with setjmp(). Since it saves enough information to return to your current function, it must include that information in the provided jmp_buf.
This structure is highly nonportable, but you mention GCC explicitly so that's probably not a blocking issue. See this GCC/x86 example to get an idea how it roughly works.

If you want to do code generation, I would recommend GSLGen from Imatix. It uses XML to structure a model of your code and then a simple PHP-like top-down generation language to spit out the code. It has been used to generate C code.
I have personally been toying around with Lua to generate code.

static const char * const cookie = __FUNCTION__;
__FUNCTION__ will be stored in the text segment of your binary, and the pointer will always be unique and valid.

Another option, if portability is not an issue, would be to tweak the GCC source-code... any volunteers?!

If all you need is a unique identifier for each function, then at the start of every function, put this:
static const void * const cookie = &cookie;
The value of cookie is then guaranteed to be a value uniquely identifying that function.
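A short sketch of the cookie idea, using two hypothetical functions that each return their own identifier. Each cookie is a distinct static object, so the two addresses differ, and the same function returns the same address on every call.

```c
/* The static object points at itself; its address identifies the
   enclosing function. */
const void *id_of_first(void)
{
    static const void *const cookie = &cookie;
    return cookie;
}

/* Same declaration, but a separate object with a separate address. */
const void *id_of_second(void)
{
    static const void *const cookie = &cookie;
    return cookie;
}
```

The returned value works as a key for a lookup table, but unlike a function pointer it cannot be used to call the function.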