Get a pointer to the current function in C (gcc)? - c

is there a magic variable in gcc holding a pointer to the current function ?
I would like to have a kind of table containing for each function pointer a set of information.
I know there's a __func__ variable containing the name of the current function as a string but not as a function pointer.
This is not to call the function then but just to be used as an index.
EDIT
Basically what i would like to do is being able to run nested functions just before the execution of the current function (and also capturing the return to perform some things.)
Basically, this is like __cyg_profile_func_enter and __cyg_profile_func_exit (the instrumentation functions)... But the problem is that these instrumentation functions are global and not function-dedicated.
EDIT
In the linux kernel, you can use unsigned long kallsyms_lookup_name(const char *name) from include/linux/kallsyms.h ... Note that the CONFIG_KALLSYMS option must be activated.

void f() {
void (*fpointer)() = &f;
}

Here's a trick that gets the address of the caller, it can probably be cleaned up a bit.
Relies on a GCC extension for getting a label's value.
#include <stdio.h>
#define MKLABEL2(x) label ## x
#define MKLABEL(x) MKLABEL2(x)
#define CALLFOO do { MKLABEL(__LINE__): foo(&&MKLABEL(__LINE__));} while(0)
void foo(void *addr)
{
printf("Caller address %p\n", addr);
}
int main(int argc, char **argv)
{
CALLFOO;
return 0;
}

#define FUNC_ADDR (dlsym(dlopen(NULL, RTLD_NOW), __func__))
And compile your program like
gcc -rdynamic -o foo foo.c -ldl

I think you could build your table using strings (the function names) as keys, then look up by comparing with the __func__ builtin variable.
To enforce having a valid function name, you could use a macro that gets the function pointer, does some dummy operation with it (e.g. assigning it to a compatible function type temporary variable) to check that it's indeed a valid function identifier, and then stringifies (with #) the function name before being used as a key.
UPDATE:
What I mean is something like:
typedef struct {
char[MAX_FUNC_NAME_LENGTH] func_name;
//rest of the info here
} func_info;
func_info table[N_FUNCS];
#define CHECK_AND_GET_FUNC_NAME(f) ({void (*tmp)(int); tmp = f; #f})
void fill_it()
{
int i = -1;
strcpy(table[++i].func_name, CHECK_AND_GET_FUNC_NAME(foo));
strcpy(table[++i].func_name, CHECK_AND_GET_FUNC_NAME(bar));
//fill the rest
}
void lookup(char *name) {
int i = -1;
while(strcmp(name, table[++i]));
//now i points to your entry, do whatever you need
}
void foo(int arg) {
lookup(__func__);
//do something
}
void bar(int arg) {
lookup(__func__);
//do something
}
(the code might need some fixes, I haven't tried to compile it, it's just to illustrate the idea)

I also had the problem that I needed the current function's address when I created a macro template coroutine abstraction that people can use like modern coroutine language features (await and async). It compensates for a missing RTOS when there is a central loop which schedules different asynchronous functions as (cooperative) tasks. Turning interrupt handlers into asynchronous functions even causes race conditions like in a preemptive multi-tasking system.
I noticed that I need to know the caller function's address for the final return address of a coroutine (which is not return address of the initial call of course). Only asynchronous functions need to know their own address so that they can pass it as hidden first argument in an AWAIT() macro. Since instrumenting the code with a macro solution is as simple as just defining the function it suffices to have an async-keyword-like macro.
This is a solution with GCC extensions:
#define _VARGS(...) _VARGS0(__VA_ARGS__)
#define _VARGS0(...) ,##__VA_ARGS__
typedef union async_arg async_arg_t;
union async_arg {
void (*caller)(void*);
void *retval;
};
#define ASYNC(FUNCNAME, FUNCARGS, ...) \
void FUNCNAME (async_arg_t _arg _VARGS FUNCARGS) \
GENERATOR( \
void (*const THIS)(void*) = (void*) &FUNCNAME;\
static void (*CALLER)(void*), \
CALLER = _arg.caller; \
__VA_ARGS__ \
)
#define GENERATOR(INIT,...) { \
__label__ _entry, _start, _end; \
static void *_state = (void*)0; \
INIT; \
_entry:; \
if (_state - &&_start <= &&_end - &&_start) \
goto *_state; \
_state = &&_start; \
_start:; \
__VA_ARGS__; \
_end: _state = &&_entry; \
}
#define AWAIT(FUNCNAME,...) ({ \
__label__ _next; \
_state = &&_next; \
return FUNCNAME((async_arg_t)THIS,##__VA_ARGS__);\
_next: _arg.retval; \
})
#define _I(...) __VA_ARGS__
#define IF(COND,THEN) _IF(_I(COND),_I(THEN))
#define _IF(COND,THEN) _IF0(_VARGS(COND),_I(THEN))
#define _IF0(A,B) _IF1(A,_I(B),)
#define _IF1(A,B,C,...) C
#define IFNOT(COND,ELSE) _IFNOT(_I(COND),_I(ELSE))
#define _IFNOT(COND,ELSE) _IFNOT0(_VARGS(COND),_I(ELSE))
#define _IFNOT0(A,B) _IFNOT1(A,,_I(B))
#define _IFNOT1(A,B,C,...) C
#define IF_ELSE(COND,THEN,ELSE) IF(_I(COND),_I(THEN))IFNOT(_I(COND),_I(ELSE))
#define WAIT(...) ({ \
__label__ _next; \
_state = &&_next; \
IF_ELSE(_I(__VA_ARGS__), \
static __typeof__(__VA_ARGS__) _value;\
_value = (__VA_ARGS__); \
return; \
_next: _value; \
, return; _next:;) \
})
#define YIELD(...) do { \
__label__ _next; \
_state = &&_next; \
return IF(_I(__VA_ARGS__),(__VA_ARGS__));\
_next:; \
} while(0)
#define RETURN(VALUE) do { \
_state = &&_entry; \
if (CALLER != 0) \
CALLER((void*)(VALUE +0));\
return; \
} while(0)
#define ASYNCALL(FUNC, ...) FUNC ((void*)0,__VA_ARGS__)
I know, a more portable (and maybe secure) solution would use the switch-case statement instead of label addresses but I think, gotos are more efficient than switch-case-statements. It also has the advantage that you can use the macros within any other control structures easily and break will have no unexpected effects.
You can use it like this:
#include <stdint.h>
int spi_start_transfer(uint16_t, void *, uint16_t, void(*)());
#define SPI_ADDR_PRESSURE 0x24
ASYNC(spi_read_pressure, (void* dest, uint16_t num),
void (*callback)(void) = (void*)THIS; //see here! THIS == &spi_read_pressure
int status = WAIT(spi_start_transfer(SPI_ADDR_PRESSURE,dest,num,callback));
RETURN(status);
)
int my_gen() GENERATOR(static int i,
while(1) {
for(i=0; i<5; i++)
YIELD(i);
}
)
extern volatile int a;
ASYNC(task_read, (uint16_t threshold),
while(1) {
static uint16_t pressure;
int status = (int)AWAIT(spi_read_pressure, &pressure, sizeof pressure);
if (pressure > threshold) {
a = my_gen();
}
}
)
You must use AWAIT to call asynchronous functions for return value and ASYNCALL without return value. AWAIT can only be called by ASYNC-functions. You can use WAIT with or without value. WAIT results in the expression which was given as argument, which is returned AFTER the function is resumed. WAIT can be used in ASYNC-functions only. Keeping the argument with WAIT wastes one new piece of static memory for each WAIT() call with argument though so it is recommended to use WAIT() without argument. It could be improved, if all WAIT calls would use the same single static variable for the entire function.
It is only a very simple version of a coroutine abstraction. This implementation cannot have nested or intertwinned calls of the same function because all static variables comprise one static stack frame.
If you want to solve this problem, you also need to distinguish resuming an old and starting a new function call. You can add details like a stack-frame queue at the function start in the ASYNC macro. Create a custom struct for each function's stack frame (which also can be done within the macro and an additional macro argument). This custom stack frame type is loaded from a queue when entering the macro, is stored back when exiting it or is removed when the call finishes.
You could use a stack frame index as alternative argument in the async_arg_t union. When the argument is an address, it starts a new call or when given a stack frame index it resumes an old call. The stack frame index or continuation must be passed as user-defined argument to the callback that resumes the coroutine.

If you went for C++ the following information might help you:
Objects are typed, functors are functions wrapped as objects, RTTI allows the identification of type at runtime.
Functors carry a runtime overhead with them, and if this is a problem for you I would suggest hard-coding the knowledge using code-generation or leveraging a OO-heirarchy of functors.

No, the function is not aware of itself. You will have to build the table you are talking about yourself, and then if you want a function to be aware of itself you will have to pass the index into the global table (or the pointer of the function) as a parameter.
Note: if you want to do this you should have a consistent naming scheme of the parameter.

If you want to do this in a 'generic' way, then you should use the facilities you already mention (__cyg_profile_func*) since that is what they are designed for. Anything else will have to be as ad hoc as your profile.
Honestly, doing things the generic way (with a filter) is probably less error prone than any new method that you will insert on-the-fly.

You can capture this information with setjmp(). Since it saves enough information to return to your current function, it must include that information in the provided jmp_buf.
This structure is highly nonportable, but you mention GCC explicitly so that's probably not a blocking issue. See this GCC/x86 example to get an idea how it roughly works.

If you want to do code generation I would recomend GSLGen from Imatix. It uses XML to structure a model of your code and then a simple PHP like top-down generation language to spit out the code -- it has been used to generate C code.
I have personally been toying arround with lua to generate code.

static const char * const cookie = __FUNCTION__;
__FUNCTION__ will be stored at the text segment at your binary and a pointer will always be unique and valid.

Another option, if portability is not an issue, would be to tweak the GCC source-code... any volunteers?!

If all you need is a unique identifier for each function, then at the start of every function, put this:
static const void * const cookie = &cookie;
The value of cookie is then guaranteed to be a value uniquely identifying that function.

Related

A way around defining new variables in function calls

I have a function dangerous(GEN x) which is called frequently in my code, where GEN is a typedef. For debugging purposes, I would like to add checkSafe to all instances of this function, something like
#ifdef DEBUG
#define dangerous(x) GEN __x = (x); if(checkSafe(__x)) dangerous(__x)
#endif
but I'm concerned that this might not work as intended. What's the right way to do something like this? The function is used too often to instrument each use individually and it is not desirable to check outside debug mode (for various reasons).
Things to be aware of / careful about:
Using a macro and a function with the same name at the same time. While it can produce valid C, you'll have to 1) take extra precautions to avoid unwanted expansion (either always define the function before the macro, or enclose the function name in parentheses at definition time) and 2) double check that every use of the function also includes your instrumenting code.
Solution: rename the original function into something like _dangerous.
Using the macro in various situations:
in an if with a single statement: if (foo) dangerous(x);
around an else from the parent if: if (foo) dangerous(x); else bar();
when leaking variables into the parent namespace can break things: GEN __x = 5; dangerous(__x);.
Solution: enclose the macro in a construct like do { ... } while(0).
You must take into account any side effects at copy time, like resource allocation or CPU intensive operations (since GEN is a typedef, this is likely not a concern).
Lastly, you may also want to complain when checkSafe fails, e.g. by logging an error message, or even aborting the program.
Putting the above together, you would instrument the function like this:
#ifdef DEBUG
#define dangerous(x) do { \
GEN __x = (x); \
if (checkSafe(__x)) \
_dangerous(__x); \
else \
complainAbout(__x); \
} while(0)
#else
#define dangerous _dangerous
#endif
If dangerous() returns a value (e.g. int) that you want to use.
Solution: Define a function to instrument your original function and pass the return value up:
#ifdef DEBUG
static inline int dangerous(GEN x) {
if (checkSafe(x))
return _dangerous(x);
complainAbout(x);
return ERROR_CODE;
}
#else
#define dangerous _dangerous
#endif

Function prototype or not?

I found the following in the original linux kernel code. (link)
static inline _syscall0(int,fork)
static inline _syscall0(int,pause)
static inline _syscall0(int,setup)
static inline _syscall0(int,sync)
It can't be a function call since static inline is prefixed. It can't be a prototype since functions can't be overloaded in C. And there's no semicolon in the end. What is this?
Also, it is preceded by this comment (if you could explain this too)
/*
* we need this inline - forking from kernel space will result
* in NO COPY ON WRITE (!!!), until an execve is executed. This
* is no problem, but for the stack. This is handled by not letting
* main() use the stack at all after fork(). Thus, no function
* calls - which means inline code for fork too, as otherwise we
* would use the stack upon exit from 'fork()'.
*
* Actually only pause and fork are needed inline, so that there
* won't be any messing with the stack from main(), but we define
* some others too.
*/
Because _syscall0 is a macro (in unistd.h):
#define _syscall0(type,name) \
type name(void) \
{ \
type __res; \
__asm__ volatile ("int $0x80" \
: "=a" (__res) \
: "0" (__NR_##name)); \
if (__res >= 0) \
return __res; \
errno = -__res; \
return -1; \
}
A simplified example:
#define call(type,name) \
type name(void) \
{return 0;}
static inline call(int,func)
int main(void)
{
return func();
}
Expands to:
static inline int func(void) {return 0;}
int main(void)
{
return func();
}
What the comments are trying to explain is that, for normal implementation of fork() in the user space, copy-on-write technique is used. According to which the parent and child, both processes, share the parent's address space. The data in this case is set as read-only, as soon as any one of them tries to write into it a copy is made and they no longer share the resources.
In kernel space a single master kernel page table is present and it is shared among all the processes in the kernel space. If we try to create a copy of the master kernel page table weird errors may arise, as it is shared between many processes and many user level processes also refer to it at some point. Also it may be that the changes processes make to the master kernel page table be visible to other processes. May be that's why no copy on write is implemented here.
On encountering execve a new process is created and hence there are no changes done to the master kernel page table. It may be that while calling the system calls if functions are used then they may cause changes in the kernel stack, so to prevent that inline functions are used.

Why use macro to call functions

I was studying the Linux wireless subsystem code and noticed this code (in ieee80211_rx_handlers):
It first defines the macro:
#define CALL_RXH(rxh) \
do { \
res = rxh(rx); \
if (res != RX_CONTINUE) \
goto rxh_next; \
} while (0);
Then the macro is used to call a series of functions:
CALL_RXH(ieee80211_rx_h_check_more_data)
CALL_RXH(ieee80211_rx_h_uapsd_and_pspoll)
CALL_RXH(ieee80211_rx_h_sta_process)
CALL_RXH(ieee80211_rx_h_decrypt)
CALL_RXH(ieee80211_rx_h_defragment)
CALL_RXH(ieee80211_rx_h_michael_mic_verify)
My question is, why not just call the functions directly like:
ieee80211_rx_h_check_more_data(rx);
ieee80211_rx_h_uapsd_and_pspoll(rx);
...
Is it just for the sake of outlining the code for easy reading?
Each use of the macro expands into the if check and goto, not just a single function call.
The if tests differ only by which function is called to produce the condition. Because the code would otherwise be repetitive, they used a macro to generate the boilerplate.
They could perhaps have interspersed calls res = xyz( rx ); with a macro expanding to the if … goto part, and then the macro would not take any parameter. How much gets encapsulated into the macro is a matter of code factoring style.
The do {} while(0) Macro could be easily used in condition block.
#define FUNC1() doing A; dong B;
#define FUNC2() do { doing A; doing B; } while(0)
We could use FUNC2() in if condition code block like this:
if (true)
FUNC2();
But FUNC1() could only be used like this:
if (true) {
FUNC1()
}

How is this generic function is used

I am going through code where some write has been made to some register. Now they made it as a generic function so that write to different register has to go through the same function:
#define RGS(x) \
static inline void write_##x(u8 val) \
{ \
}
#define REGW(x) RGS(x)
write_wdc(val);
Now I want to know when the call to write_wdc is made, how it is replaced by these macros.
This doesn't show the macro actually being used, in order for the final line (the call) to work, there also has to be something like:
REGW(wdc)
somewhere in the code, to use the macro. The above will be replaced by the preprocessor with:
RGS(wdc)
Which in turn will be replaced with
static inline void write_wdc(u8 val) { }
I assume the body of the function is missing from your macro declaration too, I would expect something like x = val; in there to actually make the write happen.
This uses the ## preprocessor operator to "glue" the words together.

Can macros be used to simulate C++ templated functions?

I have a C program in which I need to create a whole family of functions which have the same signatures and bodies, and differ only in their types. What I would like to do is define a macro which generates all of those functions for me, as otherwise I will spend a long time copying and modifying the original functions. As an example, one of the functions I need to generate looks like this:
int copy_key__sint_(void *key, void **args, int argc, void **out {
if ((*out = malloc(sizeof(int))) {
return 1;
}
**((_int_ **) out) = *((_int_ *) key);
return 0;
}
The idea is that I could call a macro, GENERATE_FUNCTIONS("int", "sint") or something like this, and have it generate this function. The italicized parts are what need to be plugged in.
Is this possible?
I don't understand the example function that you are giving very well, but using macros for the task is relatively easy. Just you wouldn't give strings to the macro as arguments but tokens:
#define DECLARE_MY_COPY_FUNCTION(TYPE, SUFFIX) \
int copy_function_ ## SUFFIX(unsigned count, TYPE* arg)
#define DEFINE_MY_COPY_FUNCTION(TYPE, SUFFIX) \
int copy_function_ ## SUFFIX(unsigned count, TYPE* arg) { \
/* do something with TYPE */ \
return whatever; \
}
You may then use this to declare the functions in a header file
DECLARE_MY_COPY_FUNCTION(unsigned, toto);
DECLARE_MY_COPY_FUNCTION(double, hui);
and define them in a .c file:
DEFINE_MY_COPY_FUNCTION(unsigned, toto);
DEFINE_MY_COPY_FUNCTION(double, hui);
In this version as stated here you might get warnings on superfluous `;'. But you can get rid of them by adding dummy declarations in the macros like this
#define DEFINE_MY_COPY_FUNCTION(TYPE, SUFFIX) \
int copy_function_ ## SUFFIX(unsigned count, TYPE* arg) { \
/* do something with TYPE */ \
return whatever; \
} \
enum { dummy_enum_for_copy_function_ ## SUFFIX }
Try something like this (I just tested the compilation, but not the result in an executed program):
#include "memory.h"
#define COPY_KEY(type, name) \
type name(void *key, void **args, int argc, void **out) { \
if (*out = malloc(sizeof(type))) { \
return 1; \
} \
**((type **) out) = *((type *) key); \
return 0; \
} \
COPY_KEY(int, copy_key_sint)
For more on the subject of generic programming in C, read this blog wich contains a few examples and also this book which contains interesting solutions to the problem for basic data structures and algorithm.
That should work. To create copy_key_sint, use copy_key_ ## sint.
If you can't get this to work with CPP, then write a small C program which generates a C source file.
Wouldn't a macro which just takes sizeof(*key) and calls a single function that uses memcpy be a lot cleaner (less preprocessor abuse and code bloat) than making a new function for each type just so it can do a native assignment rather than memcpy?
My view is that the whole problem is your attempt to apply C++ thinking to C. C has memcpy for a very good reason.

Resources