In a C program, is it possible to reset all global variables to default values? - c

I have a legacy C Linux application that I need to reuse. This application uses a lot of global variables. I want to reuse this application's main method and invoke it in a loop. I have found that when I call the main method (renamed to callableMain) in a loop, the application's behavior is not consistent, because the values of global variables set in the previous iteration affect the program flow in the new iteration.
What I would like to do is reset all the global variables to their default values before the execution of each new iteration.
For example, the original program looks like this:
OriginalMain.C
#include <stdio.h>
int global = 3; /* This is the global variable. */
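void doSomething(){
global++; /* Reference to global variable in a function. */
}
void doNothing(){
/* Placeholder so the example compiles; the original post does not show this function. */
}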
// I want to rename this main method to callableMain() and
// invoke it in a loop
int main(void){
if(global==3) {
printf(" All Is Well \n");
doSomething() ;
}
else{
printf(" Noooo\n");
doNothing() ;
}
return 0;
}
I want to change this program as follows:
I changed the above file to rename main() to callableMain().
My new main method is as follows:
int main(){
for(int i=0;i<20;i++){
callableMain();
// this is where I need to reset the values of the global variables
// otherwise the execution flow changes
}
}
Is it possible to reset all the global variables to the values they had before main() was invoked?
The short answer is that there is no magical API call that resets global variables. The initial values of the global variables would have to be saved and then restored.

I would invoke it as a subprocess, modifying its input and output as needed. Let the operating system do the dirty work for you.
The idea is to isolate the legacy program from your new program by relegating it to its own process. Then you have a clean separation between the two. Also, the legacy program is reset to a clean state every time you run it.
First, modify the program so that it reads the input data from a file, and writes its output in a machine-readable format to another file, with the files being given on the command line.
You can then create named pipes (using the mkfifo call) and invoke the legacy program using system, passing it the named pipes on the command line. Then you feed it its input and read back its output.
I am not an expert on these matters; there is probably a better way of doing the IPC. Others here have mentioned fork. However, the basic idea of separating out the legacy code and invoking it as a subprocess is probably the best approach here.
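As a rough sketch of the subprocess idea, the helper below runs a command and captures its standard output. It uses popen() rather than the mkfifo/system pairing described above, simply because it is shorter; the function name run_legacy and the buffer handling are illustrative assumptions, not part of the original program.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: run `cmd` through the shell and copy up to
 * cap - 1 bytes of its standard output into `out` (always NUL-terminated).
 * Returns the status reported by pclose(), or -1 on failure. */
int run_legacy(const char *cmd, char *out, size_t cap)
{
    if (cap == 0)
        return -1;

    FILE *p = popen(cmd, "r");
    if (p == NULL)
        return -1;

    size_t n = fread(out, 1, cap - 1, p);
    out[n] = '\0';

    /* Drain anything that did not fit so the child can finish writing. */
    char scratch[256];
    while (fread(scratch, 1, sizeof scratch, p) > 0) {
        /* discard */
    }

    return pclose(p);
}
```

Each call starts a fresh process, so the legacy program's globals begin at their initial values every time, for example run_legacy("./legacy input.txt", buf, sizeof buf) with a hypothetical binary name.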

fork() early?
You could fork(2) at some early point when you think the globals are in a good state, and then have the child wait on a pipe or something for some work to do. This would require writing any changed state or at least the results back to the parent process but would decouple your worker from your primary control process.
In fact, it might make sense to fork() at least twice, once to set up a worker controller and save the initialized (but not too initialized :-) global state, and then have this worker controller fork() again for each loop you need run.
A simpler variation might be to just modify the code so that the process can start in a "worker mode", and then use fork() or system() to start the application at the top, but with an argument that puts it into worker mode.
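A minimal sketch of the fork-per-iteration idea, assuming the legacy entry point has been renamed to callableMain() as in the question. The body of callableMain() here is only a stand-in, and run_isolated() is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int global = 3;  /* stands in for the legacy program's global state */

/* Stand-in for the renamed legacy main(): it changes global state. */
int callableMain(void)
{
    if (global == 3) {
        global++;   /* only the calling process sees this change */
        return 0;
    }
    return 1;       /* would indicate that stale state leaked in */
}

/* Run callableMain() in a child process so that its writes to globals
 * never reach the parent. Returns the child's exit status, or -1. */
int run_isolated(void)
{
    fflush(NULL);               /* do not duplicate buffered output */
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        int rc = callableMain();
        fflush(NULL);           /* keep whatever the child printed */
        _exit(rc);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling run_isolated() in the loop instead of callableMain() gives every iteration a fresh copy of the parent's globals; any results have to come back through the exit status, a pipe, or a file.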

There is a way to do this on certain platforms / compilers: you'd basically be performing the same initialization your compiler performs before calling main().
I have done this for a TI DSP. In that case I had the globals mapped to a specific section of memory, and there were linker directives available that declared variables pointing to the start and end of this section (so you can memset() the whole area to zero before starting initialization). Then, the compiler provided a list of records, each of which consisted of an address, a data length and the actual data to be copied into the address location. So you'd just loop through the records and memcpy() into the target address to initialize all globals.
Very compiler specific, so hopefully the compiler you're using allows you to do something similar.
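The record-driven initialization described above could be sketched roughly like this. The record layout and the names init_record and apply_init_records are hypothetical; a real toolchain defines its own record format and section symbols.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record layout: where to copy, how much, and from where. */
struct init_record {
    void       *dest;  /* address of the global to initialize */
    size_t      len;   /* number of bytes to copy */
    const void *src;   /* initial data stored by the toolchain */
};

/* Zero the whole globals region, then replay every initialization record. */
void apply_init_records(void *region, size_t region_size,
                        const struct init_record *recs, size_t count)
{
    memset(region, 0, region_size);
    for (size_t i = 0; i < count; i++)
        memcpy(recs[i].dest, recs[i].src, recs[i].len);
}
```

Globals without a record end up zeroed, which matches how zero-initialized data is normally treated at startup.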

In short, no. What I would do in this instance is create definitions (constants, if you will) for the default values, and then use those to reset the global variables.
Basically
#define VAR1_DEFAULT 10
int var1 = VAR1_DEFAULT;
etc... basic C right?
You can then go ahead and wrap the reinitialization in a handy function =)

I think you must change the way you see the problem.
Declare all the variables used by callableMain() inside callableMain()'s body, so they are not global anymore and are destroyed after the function is executed and created once again with the default values when you call callableMain() on the next iteration.
EDIT:
Ok, here's what you could do if you have the source code for callableMain(): at the beginning of the function, add a check for whether this is the first time the function is being called. Inside this check, copy the values of all global variables used into another set of static variables (name them as you like). Then, at the start of every call, copy those saved values back into the globals.
This way you preserve the initial values of all the global variables and start from them on every iteration of callableMain(). Does that make sense to you?
#include <stdbool.h>
void callableMain(void)
{
/* the saved copies live at function scope so they outlive the first call */
static bool first_iter = true;
static int my_global_var1;
static float my_global_var2;
..
if (first_iter)
{
first_iter = false;
my_global_var1 = global_var1;
my_global_var2 = global_var2;
..
}
/* restore the defaults before doing any work */
global_var1 = my_global_var1;
global_var2 = my_global_var2;
..
// perform the original operations on global_var1 and global_var2,
// which now hold their default values again.
}

for (int i = 0; i < 20; i++) {
int saved_var1 = global_var1;
char saved_var2 = global_var2;
double saved_var3 = global_var3;
callableMain();
global_var1 = saved_var1;
global_var2 = saved_var2;
global_var3 = saved_var3;
}
Or maybe you can find out where the global variables start and memcpy them. But I would always cringe when starting a loop ...
for (int i = 0; i < 20; i++) {
static unsigned char global_copy[SIZEOFGLOBALDATA];
memcpy(global_copy, STARTOFGLOBALDATA, SIZEOFGLOBALDATA);
callableMain();
memcpy(STARTOFGLOBALDATA, global_copy, SIZEOFGLOBALDATA);
}

If you don't want to refactor the code and encapsulate these global variables, I think the best you can do is define a reset function and then call it within the loop.
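One way such a reset function might look, assuming the globals can at least be gathered into a single struct. The names app_state, APP_STATE_DEFAULTS and reset_globals are made up for this sketch, and callableMain() is only a stand-in for the legacy entry point.

```c
/* Hypothetical grouping of the program's globals into one struct. */
struct app_state {
    int global;
    int mode;
};

/* The defaults are written once and used both for the initial
 * definition and for every later reset. */
#define APP_STATE_DEFAULTS { 3, 0 }

static const struct app_state app_defaults = APP_STATE_DEFAULTS;
struct app_state app = APP_STATE_DEFAULTS;

/* Put every global back to its default value. */
void reset_globals(void)
{
    app = app_defaults;
}

/* Stand-in for the legacy entry point: it mutates the globals. */
int callableMain(void)
{
    if (app.global == 3) {
        app.global++;
        return 0;
    }
    return 1;
}
```

The loop then calls reset_globals() before each callableMain(). Keeping the defaults in one macro avoids the two copies drifting apart.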

Assuming we are dealing with ELF on Linux, the following function resets the variables:
// these extern variables come from glibc
// https://github.com/ysbaddaden/gc/blob/master/include/config.h
extern char __data_start[];
extern char __bss_start[];
extern char _end[];
#define DATA_START ((char *)&__data_start)
#define DATA_END ((char *)&__bss_start)
#define BSS_START ((char *)&__bss_start)
#define BSS_END ((char *)&_end)
/// first call saves globals, subsequent calls restore
void reset_static_data();
// variable for quick check
static int pepa = 42;
// writes to memory between global variables are reported as buffer overflows by asan
ATTRIBUTE_NO_SANITIZE_ADDRESS
void reset_static_data()
{
// global variable, ok to leak it
static char * x;
size_t s = BSS_END - DATA_START;
// memcpy is always sanitized, so access memory as chars in a loop
if (x == NULL) { // store current static variables
x = (char *) malloc(s);
for (size_t i = 0; i < s; i++) {
*(x+i) = *(DATA_START + i);
}
} else { // restore previously saved static variables
for (size_t i = 0; i < s; i++) {
*(DATA_START + i) = *(x+i);
}
}
// quick check, see that pepa does not grow in stderr output
fprintf(stderr, "pepa: %d\n", pepa++);
}
The general approach is based on the answer in How to get the data and bss address space in run time (In Unix C program); see the linked ysbaddaden/gc GitHub repo for a macOS version of the macros.
To test the above code, just call it a few times and note that the incremented global variable pepa still keeps the value of 42.
reset_static_data();
reset_static_data();
reset_static_data();
Saving current state of the globals is convenient in that it does not require rerunning __attribute__((constructor)) functions which would be necessary if I set everything in .bss to zero (which is easy) and everything in .data to the initial values (which is not so easy). For example, if you load libpython3.so in your program, it does do run-time initialization which is lost by zeroing .bss. Calling into Python then crashes.
Sanitizers
Writing into areas of memory immediately before or after a static variable will trigger buffer-overflow warning from Address Sanitizer. To prevent this, use the ATTRIBUTE_NO_SANITIZE_ADDRESS macro the way the code above does. The macro is defined in sanitizer/asan_interface.h.
Code coverage
Code coverage counters are implemented as global variables. Therefore, resetting globals will cause coverage information to be forgotten. To solve this, always dump the coverage-to-date before restoring the globals. There does not seem to be a macro to detect whether code coverage is enabled in the compiler, so use your build system (CMake, ...) to define a suitable macro yourself, such as QD_COVERAGE below.
// The __gcov_dump function writes the coverage counters to gcda files
// and the __gcov_reset function resets them to zero.
// The interface is defined at https://github.com/gcc-mirror/gcc/blob/7501eec65c60701f72621d04eeb5342bad2fe4fb/libgcc/libgcov-interface.c
extern "C" void __gcov_reset();
extern "C" void __gcov_dump();
void flush_coverage() {
#if defined(QD_COVERAGE)
__gcov_dump();
__gcov_reset();
#endif
}

Related

How do you avoid using global variables in inherently stateful programs?

I am currently writing a small game in C and feel like I can't get away from global variables.
For example I am storing the player position as a global variable because it's needed in other files. I have set myself some rules to keep the code clean.
Only use a global variable in the file it's defined in, if possible
Never directly change the value of a global from another file (reading from another file using extern is okay)
So for example graphics settings would be stored as file scope variables in graphics.c. If code in other files wants to change the graphics settings they would have to do so through a function in graphics.c like graphics_setFOV(float fov).
Do you think those rules are sufficient for avoiding global variable hell in the long term?
How bad are file scope variables?
Is it okay to read variables from other files using extern?
Typically, this kind of problem is handled by passing around a shared context:
graphics_api.h
#ifndef GRAPHICS_API
#define GRAPHICS_API
typedef void *HANDLE;
HANDLE init_graphics(void);
void destroy_graphics(HANDLE handle);
void use_graphics(HANDLE handle);
#endif
graphics.c
#include <stdio.h>
#include <stdlib.h>
#include "graphics_api.h"
typedef struct {
int width;
int height;
} CONTEXT;
HANDLE init_graphics(void) {
CONTEXT *result = malloc(sizeof(CONTEXT));
if (result) {
result->width = 640;
result->height = 480;
}
return (HANDLE) result;
}
void destroy_graphics(HANDLE handle) {
CONTEXT *context = (CONTEXT *) handle;
if (context) {
free(context);
}
}
void use_graphics(HANDLE handle) {
CONTEXT *context = (CONTEXT *) handle;
if (context) {
printf("width = %5d\n", context->width);
printf("height = %5d\n", context->height);
}
}
main.c
#include <stdio.h>
#include "graphics_api.h"
int main(void) {
HANDLE handle = init_graphics();
if (handle) {
use_graphics(handle);
destroy_graphics(handle);
}
return 0;
}
Output
width = 640
height = 480
Hiding the details of the context by using a void pointer prevents the user from changing the data contained within the memory to which it points.
How do you avoid using global variables in inherently stateful programs?
By passing arguments...
// state.h
/// state object:
struct state {
int some_value;
};
/// Initializes state
/// @return zero on success
int state_init(struct state *s);
/// Destroys state
/// @return zero on success
int state_fini(struct state *s);
/// Does some operation with state
/// @return zero on success
int state_set_value(struct state *s, int new_value);
/// Retrieves some value from state
/// @return zero on success
int state_get_value(struct state *s, int *value);
// state.c
#include "state.h"
int state_init(struct state *s) {
s->some_value = -1;
return 0;
}
int state_fini(struct state *s) {
// add free() etc. if needed here
// call fini of other objects here
return 0;
}
int state_set_value(struct state *s, int value) {
if (value < 0) {
return -1; // ERROR - invalid argument
// you may return EINVAL here
}
s->some_value = value;
return 0; // success
}
int state_get_value(struct state *s, int *value) {
if (s->some_value < 0) { // value not set yet
return -1;
}
*value = s->some_value;
return 0;
}
// main.c
#include "state.h"
#include <stdlib.h>
#include <stdio.h>
int main() {
struct state state; // local variable
int err = state_init(&state);
if (err) abort();
int value;
err = state_get_value(&state, &value);
if (err != 0) {
printf("Getting value errored: %d\n", err);
}
err = state_set_value(&state, 50);
if (err) abort();
err = state_get_value(&state, &value);
if (err) abort();
printf("Current value is: %d\n", value);
err = state_fini(&state);
if (err) abort();
}
The only case where global variables (preferably just a single pointer to some stack variable anyway) have to be used is signal handlers. The standard way is to only increment a single global variable of type volatile sig_atomic_t inside the signal handler and do nothing else, then execute all signal-handling logic from the normal flow in the rest of the code by checking the value of that variable. On POSIX systems, the other asynchronous notifications from the kernel, such as those set up with timer_create, take a sigevent structure and can pass an argument to the notification function through the members of union sigval.
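A minimal sketch of that signal-handler pattern on a POSIX system. It sets a flag rather than incrementing a counter, and the names signal_pending, install_handler and take_pending_signal are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* The one global the handler is allowed to touch. */
static volatile sig_atomic_t signal_pending = 0;

/* The handler only records that a signal arrived. */
static void on_signal(int signo)
{
    (void)signo;
    signal_pending = 1;
}

/* Install the handler for `signo`; returns 0 on success. */
int install_handler(int signo)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Called from the normal program flow: returns 1 once per recorded
 * signal and clears the flag, otherwise returns 0. */
int take_pending_signal(void)
{
    if (!signal_pending)
        return 0;
    signal_pending = 0;
    return 1;
}
```

The main loop would poll take_pending_signal() and do the real work there, outside the handler.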
Do you think those rules are sufficient for avoiding global variable hell in the long term?
Subjectively: no. I believe that a potentially uneducated programmer has too much freedom in creating global variables given the first rule. In complex programs I would use a hard rule: do not use global variables. If, after researching all other ways and exhausting all other possibilities, you have to use a global variable, make sure global variables leave the smallest possible memory footprint.
In simple short programs I wouldn't care much.
How bad are file scope variables?
This is opinion based - there are good cases where projects use many global variables. I believe that topic is exhausted in "Are global variables bad?" and numerous other internet resources.
Is it okay to read variables from other files using extern?
Yes, it's ok.
There are no "hard rules" and each project has its own rules. I also recommend reading the c2 wiki page "Global Variables Are Bad".
The first thing you have to ask yourself is: Just why did the programming world come to loathe global variables? Obviously, as you noted, the way to model a global state is essentially a global (set of) variable(s). So what's the problem with that?
The Problem
All parts of the program have access to that state. The whole program becomes tightly coupled. Global variables violate the prime directive in programming, divide and conquer. Once all functions operate on the same data you can as well do away with the functions: They are no longer logical separations of concern but degrade to a notational convenience to avoid large files.
Write access is worse than read access: You'll have a hard time finding out just why on earth the state is unexpected at a certain point; the change can have happened anywhere. It is tempting to take shortcuts: "Ah, we can make the state change right here instead of passing a computation result back up three layers to the caller; that makes the code much smaller."
Even read access can be used to cheat and e.g. change behavior of some deep-down code depending on some global information: "Ah, we can skip rendering, there is no display yet!" A decision which should not be made in the rendering code but at top level. What if top level renders to a file!?
This creates both a debugging and a development/maintenance nightmare. If every piece of the code potentially relies on the presence and semantics of certain variables — and can change them! — it becomes exponentially harder to debug or change the program. The code agglomerating around the global data is like a cast, or perhaps a Boa Constrictor, which starts to immobilize and strangle your program.
Such programming can be avoided with (self-)discipline, but imagine a large project with many teams! It's much better to "physically" prevent access. Not coincidentally all programming languages after C, even if they are otherwise fundamentally different, come with improved modularization.
So what can we do?
The solution is indeed to pass parameters to functions, as KamilCuk said; but each function should only get the information it legitimately needs. Of course it is best if the access is read-only and the result is a return value: Pure functions cannot change state at all and thus perfectly separate concerns.
But simply passing a pointer to the global state around does not cut the mustard: That's only a thinly veiled global variable.
Instead, the state should be separated into sub-states. Only top-level functions (which typically do not do much themselves but mostly delegate) have access to the overall state and hand sub-states to the functions they call. Third-tier functions get sub-sub states, etc. The corresponding implementation in C is a nested struct; pointers to the members — const whenever possible — are passed to functions which therefore cannot see, let alone alter, the rest of the global state. Separation of concerns is thus guaranteed.
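The nested-struct layout described above might be sketched as follows. The struct and function names (game, player, graphics, player_move and so on) are invented for the example.

```c
/* Sub-states: each module only ever sees its own struct. */
struct player {
    int x;
    int y;
};

struct graphics {
    int width;
    int height;
};

/* The overall state is just the sub-states nested together. */
struct game {
    struct player   player;
    struct graphics graphics;
};

/* Needs write access, but only to the player sub-state. */
void player_move(struct player *p, int dx, int dy)
{
    p->x += dx;
    p->y += dy;
}

/* Read-only access to the graphics sub-state, result as return value. */
int graphics_pixel_count(const struct graphics *g)
{
    return g->width * g->height;
}

/* A top-level function sees everything and mostly delegates. */
int game_step(struct game *g)
{
    player_move(&g->player, 1, 0);
    return graphics_pixel_count(&g->graphics);
}
```

Because player_move() never receives a pointer to the whole struct game, it cannot read or change the graphics settings even by accident.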

How can I use global variables shared between two C functions which are called in Modelica?

I have two function definitions in C which share some global variables. I want to call these functions in Modelica, but I do not know how to correctly keep the value of the global variable between the two function calls.
file.c
/*Global variable definition*/
int* global_test1;
void FirstFunc (const int* init_value){
memcpy(global_test1, init_value, 2*sizeof(int));
}
void SecondFunc(int* some_output_variable){
memcpy(some_output_variable, global_test1, 2*sizeof(int));
}
calling_FirstFunc.mo
function calling_FirstFunc
input Integer[2,1] init_value = [3;3];
external "C" FirstFunc(init_value);
end;
calling_SecondFunc.mo
function calling_SecondFunc
output Integer[2,1] output_var;
external "C" SecondFunc(output_var);
end;
model.mo
model Calling_TwoFuncs
Integer[2,1] input_var = [3;5];
Integer[2,1] output_var;
equation
calling_FirstFunc(input_var);
when time>5.0 then
output_var = calling_SecondFunc();
end when;
end Calling_TwoFuncs;
Your sample code should almost work correctly. The C-functions will keep their state and work fine if (and only if) they are called in the order First, Second. You also need to allocate memory for global_test1... But this order is not guaranteed in the code. I suggest using external objects instead; then you can create multiple instances of the same model and not have a global state in the C-code (because you can malloc memory and return this for the constructor call; the First call). Note that you should probably pass the size of the vector to the constructor in order to be more general.
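On the C side, the external-object suggestion could look roughly like the sketch below: a constructor that allocates the state and returns it as an opaque pointer, an accessor, and a destructor. The names TestState_create, TestState_read and TestState_destroy are hypothetical, and the Modelica declarations that would bind to them are not shown.

```c
#include <stdlib.h>
#include <string.h>

/* State that used to live in the global global_test1. */
typedef struct {
    int   *data;
    size_t n;
} TestState;

/* Constructor: copies n initial values; returns NULL on failure. */
void *TestState_create(const int *init_value, size_t n)
{
    TestState *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->data = malloc(n * sizeof *s->data);
    if (s->data == NULL) {
        free(s);
        return NULL;
    }
    memcpy(s->data, init_value, n * sizeof *s->data);
    s->n = n;
    return s;
}

/* Copies the stored values into `out`, which must hold n ints. */
void TestState_read(void *handle, int *out)
{
    const TestState *s = handle;
    memcpy(out, s->data, s->n * sizeof *s->data);
}

/* Destructor: releases everything the constructor allocated. */
void TestState_destroy(void *handle)
{
    TestState *s = handle;
    if (s != NULL) {
        free(s->data);
        free(s);
    }
}
```

Since the state travels in the handle, several model instances can each own their own copy, and reading before the constructor has run becomes impossible by construction.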

How to prevent re-initializing pthread_rwlock_t

I'm declaring a static global array of pthread_rwlock_t,
e.g. static pthread_rwlock_t cm[255];
Inside the constructor I want to initialize one of the 255 locks (I keep track with a static counter).
Now I'm confused:
1) I don't want to re-initialize a lock, that is bad!
I thought re-initializing should return some error code, but it doesn't:
#include<stdio.h>
#include <pthread.h>
static pthread_rwlock_t cm[2];
int main()
{
int ret;
ret = pthread_rwlock_init(&cm[0], NULL);
ret = pthread_rwlock_wrlock(&cm[0]);
printf("Ret: %d\n", ret);
ret = pthread_rwlock_init(&cm[0], NULL);
printf("Ret: %d\n", ret);
ret = pthread_rwlock_wrlock(&cm[0]);
printf("Ret: %d\n", ret);
}
Result:
Ret: 0
Ret: 0
Ret: 0
Can anyone help, 1) If this is possible, then how? 2) If not what should be alternative approach?
EDIT 1:
I'm updating from comments/answers I got:
Instead, just put the rwlocks inside the objects they protect.
So I have n objects in use and would need that many pthread locks, so the disadvantage is memory. Hence I'm trying to improve on that part with a global array of locks. Picking 256 to get a good distribution.
It's undefined behavior to call pthread_rwlock_init (or analogously any of the pthread primitive init functions) more than once on the same object, and logically there's no way it would make sense to do so anyway since (as you've demonstrated) the object is already in use. You said in the comments on 2501's answer that you can't use pthread_once, but this makes no sense. If you're able to call pthread_rwlock_init, you can instead just call pthread_once using an init function which performs the call to pthread_rwlock_init.
However I really think you're experiencing an XY problem. There is no sense in maintaining a "global pool" of rwlocks and handing them out dynamically in constructors. Instead, just put the rwlocks inside the objects they protect. If you really want to hand them out from a global pool like you're doing, you need to keep track of which ones have been handed out independently of the job of initializing them, and have the task of initializing them after obtaining one, and destroying one before giving it back to the pool, be handled by the constructor/destructor for the object using them.
If you need static initialization, use PTHREAD_RWLOCK_INITIALIZER on your array.
static pthread_rwlock_t cm[2] = { PTHREAD_RWLOCK_INITIALIZER ,
PTHREAD_RWLOCK_INITIALIZER} ;
This is equivalent to calling pthread_rwlock_init() on every element with the attr parameter specified as NULL, except that no error checking is performed.

Arduino EthernetServer read() only works when Serial is initialized and read characters are printed

I have an Arduino project where I read data from a webserver.
I have an EthernetClient that reads the data character by character in a callback function.
My working code looks like (only the relevant parts):
void setup() {
Serial.begin(9600);
...
}
void loop() {
char* processedData = processData(callback); // this is in a external lib
}
boolean callback(char* buffer, int& i) {
...
if (client.available()) {
char c = client.read();
buffer[i++] = c;
Serial.print(c);
}
...
}
This works without any problems (reading and processing the data), but when I remove Serial.begin(9600); and Serial.print(c); it stops working and I don't know why? The only thing changed is that the char c is not printed. What could be the problem?
A common reason why callback functions change their behavior when seemingly unrelated code is altered is optimizer-related bugs.
Many embedded compilers fail to understand that a callback function (or an interrupt service routine) will ever be called in the program. They see no explicit call to that function and then assume it is never called.
When the compiler has made such an assumption, it will optimize away accesses to variables that are changed by the callback function, because it fails to see that the variable is changed by the program between the point of initialization and the point of access.
// Bad practice example:
int x;
void main (void)
{
x=5;
...
if(x == 0) /* this whole if statement will get optimized away,
the compiler assumes that x has never been changed. */
{
do_stuff();
}
}
void callback (void)
{
x = 0;
}
When this bug strikes, it is nearly impossible to find, and it can cause any kind of weird symptom.
The solution is to always declare all file scope ("global") variables shared between main() and an interrupt/callback/thread as volatile. This makes it impossible for the compiler to make incorrect optimizer assumptions.
(Please note that the volatile keyword cannot be used to achieve synchronization nor does it guarantee any memory barriers. This answer is not in the slightest related to such issues!)
A guess: Because without the serial driver started, there is no data to process, and therefore your callback is not hit.
What were you hoping the serial callback to be doing in the absence of data?
Providing more information about Client and processData may help.

How can I check that all my init functions have been called?

I am writing a large C program for embedded use. Every module in this program has an init() function (like a constructor) to set up its static variables.
The problem is that I have to remember to call all of these init functions from main(). I also have to remember to put them back if I have commented them out for some reason.
Is there anything clever I can do to make sure that all of these functions are getting called? Something along the lines of putting a macro in each init function that, when you call a check_inited() function later, sends a warning to STDOUT if not all the functions are called.
I could increment a counter, but I'd have to maintain the correct number of init functions somewhere and that is also prone to error.
Thoughts?
The following is the solution I decided on, with input from several people in this thread
My goal is to make sure that all my init functions are actually being called. I want to do this without maintaining lists or counts of modules across several files. I can't call them automatically as Nick D suggested because they need to be called in a certain order.
To accomplish this, a macro included in every module uses the gcc constructor attribute to add the init function name to a global list.
A call placed in the body of the init function updates the global list to make a note that the function was actually called.
Finally, a check function is called in main() after all of the inits are done.
Notes:
I chose to copy the strings into an array. This is not strictly necessary because the function names passed will always be static strings in normal usage. If memory was short you could just store a pointer to the string that was passed in.
My reusable library of utility functions is called "nx_lib". Thus all the 'nxl' designations.
This isn't the most efficient code in the world, but it's only called at boot time so that doesn't matter for me.
There are two lines of code that need to be added to each module. If either is omitted, the check function will let you know.
You might be able to make the constructor function static, which would avoid the need to give it a name that is unique across the project.
This code is only lightly tested and it's really late, so please check carefully before trusting it.
Thank you to:
pierr who introduced me to the constructor attribute.
Nick D for demonstrating the ## preprocessor trick and giving me the framework.
tod frye for a clever linker-based approach that will work with many compilers.
Everyone else for helping out and sharing useful tidbits.
nx_lib_public.h
This is the relevant fragment of my library header file
#define NX_FUNC_RUN_CHECK_NAME_SIZE 20
typedef struct _nxl_function_element{
char func[NX_FUNC_RUN_CHECK_NAME_SIZE];
BOOL called;
} nxl_function_element;
void nxl_func_run_check_add(char *func_name);
BOOL nxl_func_run_check(void);
void nxl_func_run_check_hit(char *func_name);
#define NXL_FUNC_RUN_CHECK_ADD(function_name) \
void cons_ ## function_name() __attribute__((constructor)); \
void cons_ ## function_name() { nxl_func_run_check_add(#function_name); }
nxl_func_run_check.c
This is the library code that is called to add function names and check them later.
#define MAX_CHECKED_FUNCTIONS 100
static nxl_function_element m_functions[MAX_CHECKED_FUNCTIONS];
static int m_func_cnt = 0;
// call automatically before main runs to register a function name.
void nxl_func_run_check_add(char *func_name)
{
// fail and complain if no more room.
if (m_func_cnt >= MAX_CHECKED_FUNCTIONS) {
print ("nxl_func_run_check_add failed, out of space\r\n");
return;
}
strncpy (m_functions[m_func_cnt].func, func_name,
NX_FUNC_RUN_CHECK_NAME_SIZE);
m_functions[m_func_cnt].func[NX_FUNC_RUN_CHECK_NAME_SIZE-1] = 0;
m_functions[m_func_cnt++].called = FALSE;
}
// call from inside the init function
void nxl_func_run_check_hit(char *func_name)
{
int i;
for (i=0; i< m_func_cnt; i++) {
if (! strncmp(m_functions[i].func, func_name,
NX_FUNC_RUN_CHECK_NAME_SIZE)) {
m_functions[i].called = TRUE;
return;
}
}
print("nxl_func_run_check_hit(): error, unregistered function was hit\r\n");
}
// checks that all registered functions were called
BOOL nxl_func_run_check(void) {
int i;
BOOL success=TRUE;
for (i=0; i< m_func_cnt; i++) {
if (m_functions[i].called == FALSE) {
success = FALSE;
xil_printf("nxl_func_run_check error: %s() not called\r\n",
m_functions[i].func);
}
}
return success;
}
solo.c
This is an example of a module that needs initialization
#include "nx_lib_public.h"
NXL_FUNC_RUN_CHECK_ADD(solo_init)
void solo_init(void)
{
nxl_func_run_check_hit((char *) __func__);
/* do module initialization here */
}
You can use gcc's extension __attribute__((constructor)) if gcc is ok for your project.
#include <stdio.h>
void func1() __attribute__((constructor));
void func2() __attribute__((constructor));
void func1()
{
printf("%s\n",__func__);
}
void func2()
{
printf("%s\n",__func__);
}
int main()
{
printf("main\n");
return 0;
}
//the output
func2
func1
main
I don't know how ugly the following looks but I post it anyway :-)
(The basic idea is to register function pointers, like what atexit function does.
Of course atexit implementation is different)
In the main module we can have something like this:
typedef int (*function_t)(void);
static function_t vfunctions[100]; // we can store max 100 function pointers
static int vcnt = 0; // count the registered function pointers
int add2init(function_t f)
{
// todo: error checks
vfunctions[vcnt++] = f;
return 0;
}
...
int main(void) {
...
// iterate vfunctions[] and call the functions
...
}
... and in some other module:
typedef int (*function_t)(void);
extern int add2init(function_t f);
#define M_add2init(function_name) static int int_ ## function_name = add2init(function_name)
int foo(void)
{
printf("foo\n");
return 0;
}
M_add2init(foo); // <--- register foo function
Why not write a post-processing script to do the checking for you? Then run that script as part of your build process... Or better yet, make it one of your tests. You are writing tests, right? :)
For example, say each of your modules has a header file, modX.h, and the signature of your init() function is "void init()"...
Have your script grep through all your .h files and create a list of module names that need to be init()ed. Then have the script check that init() is indeed called on each module in main().
If your single module represents "class" entity and has instance constructor, you can use following construction:
static inline void init(void) { ... }
static int initialized = 0;
#define INIT if (__predict_false(!initialized)) { init(); initialized = 1; }
struct Foo *
foo_create(void)
{
INIT;
...
}
where "__predict_false" is your compiler's branch prediction hint. When the first object is created, the module is auto-initialized (only once).
Splint (and probably other Lint variants) can give a warning about functions that are defined but not called.
It's interesting that most compilers will warn you about unused variables, but not unused functions.
Larger running time is not a problem
You can conceivably implement a kind of "state-machine" for each module, wherein the actions of a function depend on the state the module is in. This state can be set to BEFORE_INIT or INITIALIZED.
For example, let's say we have module A with functions foo and bar.
The actual logic of the functions (i.e., what they actually do) would be declared like so:
void foo_logic();
void bar_logic();
Or whatever the signature is.
Then, the actual functions of the module (i.e., the actual function declared foo()) will perform a run-time check of the condition of the module, and decide what to do:
void foo() {
if (module_state == BEFORE_INIT) {
handle_not_initialized_error();
}
foo_logic();
}
This logic is repeated for all functions.
A few things to note:
This will obviously incur a huge penalty performance-wise, so is probably not a good idea (I posted anyway because you said runtime is not a problem).
This is not a real state-machine, since there are only two states which are checked using a basic if, without some kind of smart general logic.
This kind of "design-pattern" works great when you're using separate threads/tasks, and the functions you're calling are actually called using some kind of IPC.
A state machine can be nicely implemented in C++, might be worth reading up on it. The same kind of idea can conceivably be coded in C with arrays of function pointers, but it's almost certainly not worth your time.
You can do something along these lines with a linker section. Whenever you define an init function, place a pointer to it in a linker section reserved for init function pointers. Then you can at least find out how many init functions have been compiled.
And if it does not matter in what order the init functions are called, and they all have the same prototype, you can just call them all in a loop from main.
The exact details elude my memory, but it works something like this:
In the module file...
// this is the syntax in GCC
void moduleInit(void);
initFuncPtr thisInit __attribute__((section(".myinits"))) = &moduleInit;
void moduleInit(void)
{
// do init here
}
This places a pointer to the module init function in the .myinits section, but leaves the code in the .code section. So the .myinits section is nothing but pointers. You can think of this as a variable-length array that module files can add to.
Then you can access the section start and end addresses from main, and go from there.
If the init functions all have the same prototype, you can just iterate over this section, calling them all.
This, in effect, is creating your own static constructor system in C.
If you are doing a large project and your linker is not at least this fully featured, you may have a problem...
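A sketch of how the section walk might look with GCC or Clang and a GNU-style linker on an ELF target. It assumes the linker provides __start_myinits and __stop_myinits, which GNU ld does for sections whose names are valid C identifiers, so the section is named myinits without a leading dot here. REGISTER_INIT, run_all_inits and the two module functions are invented for the example.

```c
#include <stddef.h>

typedef void (*init_fn)(void);

/* Place a pointer to `fn` in the "myinits" section. */
#define REGISTER_INIT(fn) \
    static init_fn init_ptr_##fn \
        __attribute__((used, section("myinits"))) = fn

/* Section bounds; assumed to be supplied by the linker. */
extern init_fn __start_myinits[];
extern init_fn __stop_myinits[];

static int init_trace;  /* lets the example observe which inits ran */

static void mod_a_init(void) { init_trace += 1; }
static void mod_b_init(void) { init_trace += 10; }

REGISTER_INIT(mod_a_init);
REGISTER_INIT(mod_b_init);

/* Call every registered init function; returns how many were called. */
size_t run_all_inits(void)
{
    size_t count = 0;
    for (init_fn *p = __start_myinits; p < __stop_myinits; p++) {
        (*p)();
        count++;
    }
    return count;
}
```

The order of the pointers within the section follows link order, so this only suits init functions that do not depend on each other. Linking with section garbage collection may need extra care to keep the entries.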
Can I put up an answer to my question?
My idea was to have each function add its name to a global list of functions, like Nick D's solution.
Then I would run through the symbol table produced by -gstab, and look for any functions named init_* that had not been called.
This is an embedded app so I have the elf image handy in flash memory.
However I don't like this idea because it means I always have to include debugging info in the binary.
