I want to implement unit testing using the -Wl,--wrap trick; however, this doesn't work for functions within the same file. One solution is to alias the function (after it has been defined) to the wrapped name, as suggested here: https://stackoverflow.com/a/11758777/526568
I came up with the following macro to avoid having to manually define __wrap_foo:
#define UNIT_TEST_SYMBOL(x) \
typeof(x) __wrap_##x __attribute__((weak, alias(#x)))
void foo(void) {
/* function body */
}
UNIT_TEST_SYMBOL(foo);
#define foo __wrap_foo
void bar(void) {
foo();
}
I then compile with -Wl,--wrap=foo.
Is it possible to avoid having to manually define foo to __wrap_foo? Can this be somehow part of the UNIT_TEST_SYMBOL?
I'm trying to understand the code here:
Simple Interpreter
But I'm having trouble understanding the defines here:
...
#define MK_CMD(x) void cmd_ ## x (arg_t*)
...
#define CMD(func, params, help) {#func, cmd_ ## func, params, help}
...
How does it work?
You use this kind of macro when you are "lazy". Sometimes you have a bunch of
functions that are almost identical and differ only very slightly. Instead
of writing the same code over and over, you can use a macro to save
keystrokes. Without a macro, if you find a bug in one function, the others have to be fixed at the
same place again. A macro solves that problem: if you
fix the bug in the macro, you fix it for all functions at the same time.
The ## in a macro is the concatenation operator; it merges tokens when the macro
is expanded. A useful place for this is:
#define MK_CMD(x) void cmd_ ## x (arg_t*)
//Functions definitions
MK_CMD(prompt);
MK_CMD(load);
MK_CMD(disp);
MK_CMD(add);
MK_CMD(mul);
MK_CMD(sqrt);
MK_CMD(exit);
MK_CMD(help);
This will expand to
void cmd_prompt(arg_t*);
void cmd_load(arg_t*);
void cmd_disp(arg_t*);
...
This declares the functions for the compiler, so that it knows that there is a
function called cmd_prompt that takes a pointer to arg_t as an argument,
a function called cmd_load that does the same, and so on.
Let's say you later realized that the cmd_* functions need a second argument,
an int, then you don't have to manually change all function prototypes, you
only have to change the macro to
#define MK_CMD(x) void cmd_ ## x (arg_t*,int)
and all other functions will have that parameter. See, a feature for the "lazy"
programmer.
The other macro also falls into this category; this time it creates the
brace-enclosed initializer for each array element (an initializer list,
like int arr[3] = {1, 2, 3};).
Once again, the "lazy" programmer may not want to write the curly braces all over
the place, and to increase readability writes:
#define CMD(func, params, help) {#func, cmd_ ## func, params, help}
#define CMDS 8
cmd_t dsp_table[CMDS] ={
CMD(prompt,"s","Select the prompt for input"),
CMD(load,"cf","Load into register float"),
CMD(disp,"c","Display register"),
CMD(add,"ff","Add two numbers"),
CMD(mul,"ff","Multiply two numbers"),
CMD(sqrt,"f","Take the square root of number"),
CMD(exit,"","Exits the interpreter"),
CMD(help,"","Display this help")};
which expands to:
cmd_t dsp_table[8] = {
{"prompt", cmd_prompt, "s", "Select the prompt for input"},
{"load", cmd_load, "cf", "Load into register float"},
...
};
I put "lazy" in quotes because I don't necessarily mean it as a negative thing.
These macro features can be dead useful when used properly and can save you a
lot of time. I have used them in the past for a library that encapsulates reading and setting
values through something like a union, but more complex. The code looks like this:
#define sensor_set_value_typed(gtype, type_short_name, c_type)\
int sensor_set_value_ ## type_short_name(sensor *sens, c_type val)\
{\
gtype t_val;\
gtype_id type_id;\
if(sens == NULL)\
return 0;\
...\
gtype_init(&t_val);\
gtype_set_type(&t_val, gtype);\
gtype_set_value(&t_val, &val);\
return complicated_api_set_value(sens, &t_val);\
}
I removed many parts of the code and renamed some of the variables and functions because this code is not open source; I just want to illustrate
the idea behind the macros without revealing all that happens behind the scenes. The algorithm is the
same for 99% of the code; only the gtype information is different, and
these functions can be used as wrappers around the clunkier type-encapsulating
library. Without the macro, I would have to copy and paste a lot and change one line in each of those
functions. If I found an error in one of the wrappers, I would have to fix the same
error at the same place in all wrappers. With the macro I can do this:
sensor_set_value_typed(GTYPE_BOOL, bool, bool);
sensor_set_value_typed(GTYPE_I8, i8, int8_t);
sensor_set_value_typed(GTYPE_U8, u8, uint8_t);
sensor_set_value_typed(GTYPE_I16, i16, int16_t);
sensor_set_value_typed(GTYPE_U16, u16, uint16_t);
sensor_set_value_typed(GTYPE_I32, i32, int32_t);
sensor_set_value_typed(GTYPE_U32, u32, uint32_t);
sensor_set_value_typed(GTYPE_I64, i64, int64_t);
sensor_set_value_typed(GTYPE_U64, u64, uint64_t);
sensor_set_value_typed(GTYPE_FLOAT, float, float);
sensor_set_value_typed(GTYPE_LONG, long, long);
sensor_set_value_typed(GTYPE_DOUBLE, double, double);
and now I have 12 wrappers of the basic library in 12 lines. Now the user can use the
wrapper
int32_t val = get_value_from_real_sensor();
sensor_set_value_i32(sensor, val);
instead of using the complicated underlying library. The GTYPE_BOOL,
GTYPE_I8, etc are values defined in an enum which describes the basic types.
## is the preprocessor token-pasting operator, also known as macro concatenation.
It joins two tokens into a single token during macro expansion.
For instance, with #define type i##nt, type expands to int. It's a way to get the preprocessor to recognize arguments that need to be concatenated to adjacent tokens.
#define MK_CMD(x) void cmd_ ## x (arg_t*) is used by the author to declare a function prototype.
When invoked, MK_CMD(hello) expands to void cmd_hello (arg_t*).
If instead the #define were declared as #define MK_CMD(x) void cmd_x (arg_t*), it would only ever emit void cmd_x (arg_t*) instead of replacing x with the macro argument, because cmd_x is a single token.
I have a function dangerous(GEN x) which is called frequently in my code, where GEN is a typedef. For debugging purposes, I would like to add checkSafe to all instances of this function, something like
#ifdef DEBUG
#define dangerous(x) GEN __x = (x); if(checkSafe(__x)) dangerous(__x)
#endif
but I'm concerned that this might not work as intended. What's the right way to do something like this? The function is used too often to instrument each use individually and it is not desirable to check outside debug mode (for various reasons).
Things to be aware of / careful about:
Using a macro and a function with the same name at the same time. While it can produce valid C, you'll have to 1) take extra precautions to avoid unwanted expansion (either always define the function before the macro, or enclose the function name in parentheses at definition time) and 2) double check that every use of the function also includes your instrumenting code.
Solution: rename the original function into something like _dangerous.
Using the macro in various situations:
in an if with a single statement: if (foo) dangerous(x);
around an else from the parent if: if (foo) dangerous(x); else bar();
when leaking variables into the parent namespace can break things: GEN __x = 5; dangerous(__x);.
Solution: enclose the macro in a construct like do { ... } while(0).
You must take into account any side effects at copy time, like resource allocation or CPU intensive operations (since GEN is a typedef, this is likely not a concern).
Lastly, you may also want to complain when checkSafe fails, e.g. by logging an error message, or even aborting the program.
Putting the above together, you would instrument the function like this:
#ifdef DEBUG
#define dangerous(x) do { \
GEN __x = (x); \
if (checkSafe(__x)) \
_dangerous(__x); \
else \
complainAbout(__x); \
} while(0)
#else
#define dangerous _dangerous
#endif
One more case: dangerous() returns a value (e.g. int) that you want to use. The do { ... } while(0) macro is a statement, so it cannot yield a value.
Solution: Define a function to instrument your original function and pass the return value up:
#ifdef DEBUG
static inline int dangerous(GEN x) {
if (checkSafe(x))
return _dangerous(x);
complainAbout(x);
return ERROR_CODE;
}
#else
#define dangerous _dangerous
#endif
I was studying the Linux wireless subsystem code and noticed this code (in ieee80211_rx_handlers):
It first defines the macro:
#define CALL_RXH(rxh) \
do { \
res = rxh(rx); \
if (res != RX_CONTINUE) \
goto rxh_next; \
} while (0);
Then the macro is used to call a series of functions:
CALL_RXH(ieee80211_rx_h_check_more_data)
CALL_RXH(ieee80211_rx_h_uapsd_and_pspoll)
CALL_RXH(ieee80211_rx_h_sta_process)
CALL_RXH(ieee80211_rx_h_decrypt)
CALL_RXH(ieee80211_rx_h_defragment)
CALL_RXH(ieee80211_rx_h_michael_mic_verify)
My question is, why not just call the functions directly like:
ieee80211_rx_h_check_more_data(rx);
ieee80211_rx_h_uapsd_and_pspoll(rx);
...
Is it just for the sake of outlining the code for easy reading?
Each use of the macro expands into the if check and goto, not just a single function call.
The if tests differ only by which function is called to produce the condition. Because the code would otherwise be repetitive, they used a macro to generate the boilerplate.
They could perhaps have interspersed calls res = xyz( rx ); with a macro expanding to the if … goto part, and then the macro would not take any parameter. How much gets encapsulated into the macro is a matter of code factoring style.
A do {} while(0) macro can safely be used as the single statement of a condition block.
#define FUNC1() doing A; doing B;
#define FUNC2() do { doing A; doing B; } while(0)
We can use FUNC2() as the unbraced body of an if like this:
if (true)
FUNC2();
But FUNC1() can only be used safely inside braces, because otherwise only doing A would be guarded by the if:
if (true) {
FUNC1()
}
Is there a magic variable in GCC holding a pointer to the current function?
I would like to have a kind of table containing for each function pointer a set of information.
I know there's a __func__ variable containing the name of the current function as a string but not as a function pointer.
This is not to call the function then but just to be used as an index.
EDIT
Basically, what I would like to do is run nested functions just before the execution of the current function (and also capture the return to perform some things).
Basically, this is like __cyg_profile_func_enter and __cyg_profile_func_exit (the instrumentation functions)... But the problem is that these instrumentation functions are global and not function-dedicated.
EDIT
In the linux kernel, you can use unsigned long kallsyms_lookup_name(const char *name) from include/linux/kallsyms.h ... Note that the CONFIG_KALLSYMS option must be activated.
void f() {
void (*fpointer)() = &f;
}
Here's a trick that gets the address of the caller, it can probably be cleaned up a bit.
Relies on a GCC extension for getting a label's value.
#include <stdio.h>
#define MKLABEL2(x) label ## x
#define MKLABEL(x) MKLABEL2(x)
#define CALLFOO do { MKLABEL(__LINE__): foo(&&MKLABEL(__LINE__));} while(0)
void foo(void *addr)
{
printf("Caller address %p\n", addr);
}
int main(int argc, char **argv)
{
CALLFOO;
return 0;
}
#include <dlfcn.h>
#define FUNC_ADDR (dlsym(dlopen(NULL, RTLD_NOW), __func__))
And compile your program like
gcc -rdynamic -o foo foo.c -ldl
I think you could build your table using strings (the function names) as keys, then look up by comparing with the __func__ builtin variable.
To enforce having a valid function name, you could use a macro that gets the function pointer, does some dummy operation with it (e.g. assigning it to a compatible function type temporary variable) to check that it's indeed a valid function identifier, and then stringifies (with #) the function name before being used as a key.
UPDATE:
What I mean is something like:
#include <string.h>
typedef struct {
char func_name[MAX_FUNC_NAME_LENGTH];
//rest of the info here
} func_info;
func_info table[N_FUNCS];
void foo(int arg);
void bar(int arg);
//assigning f to tmp checks that f is a function of the expected type;
//the statement expression ({ ... }) is a GCC extension
#define CHECK_AND_GET_FUNC_NAME(f) ({void (*tmp)(int); tmp = f; (void)tmp; #f;})
void fill_it()
{
int i = -1;
strcpy(table[++i].func_name, CHECK_AND_GET_FUNC_NAME(foo));
strcpy(table[++i].func_name, CHECK_AND_GET_FUNC_NAME(bar));
//fill the rest
}
void lookup(const char *name) {
int i = -1;
while(strcmp(name, table[++i].func_name));
//now i indexes your entry, do whatever you need
}
void foo(int arg) {
lookup(__func__);
//do something
}
void bar(int arg) {
lookup(__func__);
//do something
}
(the code might need some fixes, I haven't tried to compile it, it's just to illustrate the idea)
I also had the problem that I needed the current function's address when I created a macro template coroutine abstraction that people can use like modern coroutine language features (await and async). It compensates for a missing RTOS when there is a central loop which schedules different asynchronous functions as (cooperative) tasks. Turning interrupt handlers into asynchronous functions even causes race conditions like in a preemptive multi-tasking system.
I noticed that I need to know the caller function's address for the final return address of a coroutine (which is not return address of the initial call of course). Only asynchronous functions need to know their own address so that they can pass it as hidden first argument in an AWAIT() macro. Since instrumenting the code with a macro solution is as simple as just defining the function it suffices to have an async-keyword-like macro.
This is a solution with GCC extensions:
#define _VARGS(...) _VARGS0(__VA_ARGS__)
#define _VARGS0(...) ,##__VA_ARGS__
typedef union async_arg async_arg_t;
union async_arg {
void (*caller)(void*);
void *retval;
};
#define ASYNC(FUNCNAME, FUNCARGS, ...) \
void FUNCNAME (async_arg_t _arg _VARGS FUNCARGS) \
GENERATOR( \
void (*const THIS)(void*) = (void*) &FUNCNAME;\
static void (*CALLER)(void*), \
CALLER = _arg.caller; \
__VA_ARGS__ \
)
#define GENERATOR(INIT,...) { \
__label__ _entry, _start, _end; \
static void *_state = (void*)0; \
INIT; \
_entry:; \
if (_state - &&_start <= &&_end - &&_start) \
goto *_state; \
_state = &&_start; \
_start:; \
__VA_ARGS__; \
_end: _state = &&_entry; \
}
#define AWAIT(FUNCNAME,...) ({ \
__label__ _next; \
_state = &&_next; \
return FUNCNAME((async_arg_t)THIS,##__VA_ARGS__);\
_next: _arg.retval; \
})
#define _I(...) __VA_ARGS__
#define IF(COND,THEN) _IF(_I(COND),_I(THEN))
#define _IF(COND,THEN) _IF0(_VARGS(COND),_I(THEN))
#define _IF0(A,B) _IF1(A,_I(B),)
#define _IF1(A,B,C,...) C
#define IFNOT(COND,ELSE) _IFNOT(_I(COND),_I(ELSE))
#define _IFNOT(COND,ELSE) _IFNOT0(_VARGS(COND),_I(ELSE))
#define _IFNOT0(A,B) _IFNOT1(A,,_I(B))
#define _IFNOT1(A,B,C,...) C
#define IF_ELSE(COND,THEN,ELSE) IF(_I(COND),_I(THEN))IFNOT(_I(COND),_I(ELSE))
#define WAIT(...) ({ \
__label__ _next; \
_state = &&_next; \
IF_ELSE(_I(__VA_ARGS__), \
static __typeof__(__VA_ARGS__) _value;\
_value = (__VA_ARGS__); \
return; \
_next: _value; \
, return; _next:;) \
})
#define YIELD(...) do { \
__label__ _next; \
_state = &&_next; \
return IF(_I(__VA_ARGS__),(__VA_ARGS__));\
_next:; \
} while(0)
#define RETURN(VALUE) do { \
_state = &&_entry; \
if (CALLER != 0) \
CALLER((void*)(VALUE +0));\
return; \
} while(0)
#define ASYNCALL(FUNC, ...) FUNC ((void*)0,__VA_ARGS__)
I know that a more portable (and maybe more secure) solution would use a switch-case statement instead of label addresses, but I think gotos are more efficient than switch-case statements. The label approach also has the advantage that you can easily use the macros within any other control structure, and break will have no unexpected effects.
You can use it like this:
#include <stdint.h>
int spi_start_transfer(uint16_t, void *, uint16_t, void(*)());
#define SPI_ADDR_PRESSURE 0x24
ASYNC(spi_read_pressure, (void* dest, uint16_t num),
void (*callback)(void) = (void*)THIS; //see here! THIS == &spi_read_pressure
int status = WAIT(spi_start_transfer(SPI_ADDR_PRESSURE,dest,num,callback));
RETURN(status);
)
int my_gen() GENERATOR(static int i,
while(1) {
for(i=0; i<5; i++)
YIELD(i);
}
)
extern volatile int a;
ASYNC(task_read, (uint16_t threshold),
while(1) {
static uint16_t pressure;
int status = (int)AWAIT(spi_read_pressure, &pressure, sizeof pressure);
if (pressure > threshold) {
a = my_gen();
}
}
)
You must use AWAIT to call asynchronous functions when you need their return value, and ASYNCALL when you do not. AWAIT can only be called from ASYNC functions. You can use WAIT with or without a value. WAIT evaluates to the expression given as its argument, which is returned AFTER the function is resumed. WAIT can be used in ASYNC functions only. Keeping the argument costs one new piece of static memory for each WAIT() call with an argument, though, so WAIT() without an argument is recommended. This could be improved if all WAIT calls in a function shared a single static variable.
This is only a very simple version of a coroutine abstraction. The implementation cannot handle nested or intertwined calls of the same function, because all static variables together form one static stack frame.
If you want to solve this problem, you also need to distinguish resuming an old and starting a new function call. You can add details like a stack-frame queue at the function start in the ASYNC macro. Create a custom struct for each function's stack frame (which also can be done within the macro and an additional macro argument). This custom stack frame type is loaded from a queue when entering the macro, is stored back when exiting it or is removed when the call finishes.
You could use a stack frame index as alternative argument in the async_arg_t union. When the argument is an address, it starts a new call or when given a stack frame index it resumes an old call. The stack frame index or continuation must be passed as user-defined argument to the callback that resumes the coroutine.
If you went for C++ the following information might help you:
Objects are typed, functors are functions wrapped as objects, RTTI allows the identification of type at runtime.
Functors carry a runtime overhead with them, and if this is a problem for you I would suggest hard-coding the knowledge using code generation or leveraging an OO hierarchy of functors.
No, the function is not aware of itself. You will have to build the table you are talking about yourself, and then if you want a function to be aware of itself you will have to pass the index into the global table (or the pointer of the function) as a parameter.
Note: if you want to do this you should have a consistent naming scheme of the parameter.
If you want to do this in a 'generic' way, then you should use the facilities you already mention (__cyg_profile_func*) since that is what they are designed for. Anything else will have to be as ad hoc as your profile.
Honestly, doing things the generic way (with a filter) is probably less error prone than any new method that you will insert on-the-fly.
You can capture this information with setjmp(). Since it saves enough information to return to your current function, it must include that information in the provided jmp_buf.
This structure is highly nonportable, but you mention GCC explicitly so that's probably not a blocking issue. See this GCC/x86 example to get an idea how it roughly works.
If you want to do code generation I would recommend GSLGen from Imatix. It uses XML to structure a model of your code and then a simple PHP-like top-down generation language to spit out the code; it has been used to generate C code.
I have personally been toying around with Lua to generate code.
static const char * const cookie = __FUNCTION__;
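The string behind __FUNCTION__ is stored in a read-only section of your binary, so the pointer stays valid for the lifetime of the program. It identifies the function uniquely as long as no two functions share the same name; identical strings (for example from static functions with the same name in different files) may be merged into one.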
Another option, if portability is not an issue, would be to tweak the GCC source-code... any volunteers?!
If all you need is a unique identifier for each function, then at the start of every function, put this:
static const void * const cookie = &cookie;
The value of cookie is then guaranteed to be a value uniquely identifying that function.