Stack overflow for huge file with many variables in different scopes - c

I have an autogenerated file that creates structs and does some calculations with them.
Each struct has its dedicated scope.
typedef struct
{
uint16_t a;
uint16_t b;
} Addition_t;
uint8_t StructsOverflow(void)
{
{ // use new scope to declare same variable multiple times
Addition_t x = {.a = 5, .b=6};
if (x.a == x.b)
return 1;
}
{
Addition_t x = {.a = 3, .b=6};
if (x.a == x.b)
return 1;
}
{
Addition_t x = {.a = 3, .b=5};
if (x.a == x.b)
return 1;
}
// and so on
// here other structs are created in the same fashion as above
return 0;
}
For a huge number of lines (about 100,000 structs), running the .exe stops with a stack overflow: Exception thrown at 0x00007FF7F2C8B6C8 in EnergyPredictionMain.exe: 0xC00000FD: Stack overflow (parameters: 0x0000000000000001, 0x0000001815603000).
I'm using the MSVC 2019 compiler and cppvsdbg for debugging.
Why is there a stack overflow? In my understanding the variables are destroyed at the end of their scope, so only the memory of one struct should be in use at a time.

Why? Because, in a Debug build (IIRC), MSVC doesn't deallocate local variables when they go out of scope in this way. In a Release build, it will probably work.
But what's really broken here, IMO, is whatever it is that autogenerates that file. Would it be practical to change it to generate 100,000 separate sub-functions, each initialising (and then processing) one struct? Then invoke each of them in turn from an (also auto-generated) 'master' function.
If you can do that, it should provide a robust and future-proof fix.
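A minimal sketch of what the regenerated file could look like, using the three structs from the question. The names Check_0 to Check_2 and the function-pointer table are assumptions made for illustration; a generator could equally emit a long sequence of direct calls in the master function.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct
{
    uint16_t a;
    uint16_t b;
} Addition_t;

/* One small generated function per struct (hypothetical names).
   Each has its own tiny stack frame, which is released on return. */
static uint8_t Check_0(void)
{
    Addition_t x = {.a = 5, .b = 6};
    return x.a == x.b;
}

static uint8_t Check_1(void)
{
    Addition_t x = {.a = 3, .b = 6};
    return x.a == x.b;
}

static uint8_t Check_2(void)
{
    Addition_t x = {.a = 3, .b = 5};
    return x.a == x.b;
}

/* Generated table of all checks; static storage, not stack. */
typedef uint8_t (*CheckFn)(void);
static const CheckFn checks[] = { Check_0, Check_1, Check_2 };

/* The 'master' function: calls each check in turn. */
uint8_t StructsOverflow(void)
{
    for (size_t i = 0; i < sizeof checks / sizeof checks[0]; i++) {
        if (checks[i]())
            return 1;
    }
    return 0;
}
```

With this layout the stack depth no longer depends on how many structs are generated, regardless of how the compiler lays out block-scoped locals.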

Related

How do you avoid using global variables in inherently stateful programs?

I am currently writing a small game in C and feel like I can't get away from global variables.
For example I am storing the player position as a global variable because it's needed in other files. I have set myself some rules to keep the code clean.
Only use a global variable in the file it's defined in, if possible
Never directly change the value of a global from another file (reading from another file using extern is okay)
So for example graphics settings would be stored as file scope variables in graphics.c. If code in other files wants to change the graphics settings they would have to do so through a function in graphics.c like graphics_setFOV(float fov).
Do you think those rules are sufficient for avoiding global variable hell in the long term?
How bad are file scope variables?
Is it okay to read variables from other files using extern?
Typically, this kind of problem is handled by passing around a shared context:
graphics_api.h
#ifndef GRAPHICS_API
#define GRAPHICS_API
typedef void *HANDLE;
HANDLE init_graphics(void);
void destroy_graphics(HANDLE handle);
void use_graphics(HANDLE handle);
#endif
graphics.c
#include <stdio.h>
#include <stdlib.h>
#include "graphics_api.h"
typedef struct {
int width;
int height;
} CONTEXT;
HANDLE init_graphics(void) {
CONTEXT *result = malloc(sizeof(CONTEXT));
if (result) {
result->width = 640;
result->height = 480;
}
return (HANDLE) result;
}
void destroy_graphics(HANDLE handle) {
CONTEXT *context = (CONTEXT *) handle;
if (context) {
free(context);
}
}
void use_graphics(HANDLE handle) {
CONTEXT *context = (CONTEXT *) handle;
if (context) {
printf("width = %5d\n", context->width);
printf("height = %5d\n", context->height);
}
}
main.c
#include <stdio.h>
#include "graphics_api.h"
int main(void) {
HANDLE handle = init_graphics();
if (handle) {
use_graphics(handle);
destroy_graphics(handle);
}
return 0;
}
Output
width = 640
height = 480
Hiding the details of the context by using a void pointer prevents the user from changing the data contained within the memory to which it points.
How do you avoid using global variables in inherently stateful programs?
By passing arguments...
// state.h
/// state object:
struct state {
int some_value;
};
/// Initializes state
/// @return zero on success
int state_init(struct state *s);
/// Destroys state
/// @return zero on success
int state_fini(struct state *s);
/// Sets the value stored in state
/// @return zero on success
int state_set_value(struct state *s, int new_value);
/// Retrieves the value stored in state
/// @return zero on success
int state_get_value(struct state *s, int *value);
// state.c
#include "state.h"
int state_init(struct state *s) {
s->some_value = -1;
return 0;
}
int state_fini(struct state *s) {
// add free() etc. if needed here
// call fini of other objects here
return 0;
}
int state_set_value(struct state *s, int value) {
if (value < 0) {
return -1; // ERROR - invalid argument
// you may return EINVAL here
}
s->some_value = value;
return 0; // success
}
int state_get_value(struct state *s, int *value) {
if (s->some_value < 0) { // value not set yet
return -1;
}
*value = s->some_value;
return 0;
}
// main.c
#include "state.h"
#include <stdlib.h>
#include <stdio.h>
int main() {
struct state state; // local variable
int err = state_init(&state);
if (err) abort();
int value;
err = state_get_value(&state, &value);
if (err != 0) {
printf("Getting value errored: %d\n", err);
}
err = state_set_value(&state, 50);
if (err) abort();
err = state_get_value(&state, &value);
if (err) abort();
printf("Current value is: %d\n", value);
err = state_fini(&state);
if (err) abort();
}
The only case where global variables have to be used (preferably just a single pointer to some stack variable anyway) is signal handlers. The standard approach is to only increment a single global variable of type sig_atomic_t inside the signal handler and do nothing else, then run all signal-handling logic from the normal flow of the program by checking the value of that variable. On POSIX systems, the other asynchronous notifications from the kernel that take a sigevent structure, such as timer_create, can pass an argument to the notification function through the members of union sigval.
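The signal-handler pattern described above could be sketched as follows. The helper names install_handler and take_pending_signals are hypothetical, and the counter is assumed not to wrap around between polls.

```c
#include <signal.h>

/* The one unavoidable global: written by the handler, read by normal code. */
static volatile sig_atomic_t signal_count = 0;

/* The handler only bumps the counter and does nothing else. */
static void on_signal(int signum)
{
    (void) signum;
    signal_count++;
}

/* Hypothetical helper: installs the handler for SIGINT, returns 0 on success. */
int install_handler(void)
{
    return signal(SIGINT, on_signal) == SIG_ERR ? -1 : 0;
}

/* Called from the normal program flow (e.g. once per main-loop iteration).
   Returns how many signals arrived since the previous call. */
int take_pending_signals(void)
{
    static sig_atomic_t handled = 0;   /* touched only by normal code */
    sig_atomic_t now = signal_count;   /* single read of the shared counter */
    int pending = (int) (now - handled);
    handled = now;
    return pending;
}
```

All real work (logging, cleanup, shutting down) then happens in ordinary code whenever take_pending_signals() returns a non-zero value.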
Do you think those rules are sufficient for avoiding global variable hell in the long term?
Subjectively: no. I believe that the first rule gives a potentially inexperienced programmer too much freedom to create global variables. In complex programs I would use a hard rule: do not use global variables. If, after all other possibilities have been researched and exhausted, you still have to use a global variable, make sure it leaves the smallest possible memory footprint.
In simple short programs I wouldn't care much.
How bad are file scope variables?
This is opinion-based; there are good cases where projects use many global variables. I believe the topic is exhausted in "Are global variables bad?" and numerous other internet resources.
Is it okay to read variables from other files using extern?
Yes, it's ok.
There are no "hard rules" and each project has its own rules. I also recommend reading the c2 wiki page "Global Variables Are Bad".
The first thing you have to ask yourself is: Just why did the programming world come to loathe global variables? Obviously, as you noted, the way to model a global state is essentially a global (set of) variable(s). So what's the problem with that?
The Problem
All parts of the program have access to that state. The whole program becomes tightly coupled. Global variables violate the prime directive in programming, divide and conquer. Once all functions operate on the same data you can as well do away with the functions: They are no longer logical separations of concern but degrade to a notational convenience to avoid large files.
Write access is worse than read access: You'll have a hard time finding out just why on earth the state is unexpected at a certain point; the change can have happened anywhere. It is tempting to take shortcuts: "Ah, we can make the state change right here instead of passing a computation result back up three layers to the caller; that makes the code much smaller."
Even read access can be used to cheat and e.g. change behavior of some deep-down code depending on some global information: "Ah, we can skip rendering, there is no display yet!" A decision which should not be made in the rendering code but at top level. What if top level renders to a file!?
This creates both a debugging and a development/maintenance nightmare. If every piece of the code potentially relies on the presence and semantics of certain variables — and can change them! — it becomes exponentially harder to debug or change the program. The code agglomerating around the global data is like a cast, or perhaps a Boa Constrictor, which starts to immobilize and strangle your program.
Such programming can be avoided with (self-)discipline, but imagine a large project with many teams! It's much better to "physically" prevent access. Not coincidentally all programming languages after C, even if they are otherwise fundamentally different, come with improved modularization.
So what can we do?
The solution is indeed to pass parameters to functions, as KamilCuk said; but each function should only get the information it legitimately needs. Of course it is best if the access is read-only and the result is a return value: Pure functions cannot change state at all and thus perfectly separate concerns.
But simply passing a pointer to the global state around does not cut the mustard: That's only a thinly veiled global variable.
Instead, the state should be separated into sub-states. Only top-level functions (which typically do not do much themselves but mostly delegate) have access to the overall state and hand sub-states to the functions they call. Third-tier functions get sub-sub states, etc. The corresponding implementation in C is a nested struct; pointers to the members — const whenever possible — are passed to functions which therefore cannot see, let alone alter, the rest of the global state. Separation of concerns is thus guaranteed.
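A minimal sketch of the nested-struct idea, assuming a game with a player position and graphics settings as in the question. All type and function names here are made up for illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Sub-states (hypothetical layout). */
struct player   { float x, y; };
struct graphics { int width, height; float fov; };

/* The overall state is just a nesting of the sub-states. */
struct game_state {
    struct player   player;
    struct graphics graphics;
};

/* May modify the player, but cannot even see the graphics settings. */
static void player_move(struct player *p, float dx, float dy)
{
    p->x += dx;
    p->y += dy;
}

/* Gets read-only access to exactly the two sub-states it needs. */
static int render_describe(const struct graphics *g, const struct player *p,
                           char *buf, size_t n)
{
    return snprintf(buf, n, "%dx%d player at (%.1f, %.1f)",
                    g->width, g->height, p->x, p->y);
}

/* Only the top-level function sees the whole state; it mostly delegates. */
void game_step(struct game_state *s, char *buf, size_t n)
{
    player_move(&s->player, 1.0f, 0.5f);
    render_describe(&s->graphics, &s->player, buf, n);
}
```

Because render_describe receives const pointers, an accidental write to the player or the graphics settings from rendering code fails to compile instead of becoming a hard-to-find state change.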

How to avoid globals in EEPROM structs for system settings?

I'm struggling with getting system settings from EEPROM and trying to avoid having them as global variables and wondered what the prevailing wisdom is and if there's an accepted practice and / or elegant solution.
I'm getting system settings stored in an EEPROM via structures with some error checking and the sizeof operator in main.c along the lines of:
// EEPROM data structures
typedef struct system_tag
{
uint8_t buzzer_volume;
uint8_t led_brightness;
uint8_t data_field_3;
} system_t;
typedef struct counters_tag
{
uint16_t counter_1;
uint16_t counter_2;
uint16_t counter_3;
} counters_t;
typedef struct eeprom_tag
{
system_t system_data;
uint8_t system_crc;
counters_t counters;
uint8_t counters_crc;
} eeprom_t;
// Default values
static system_t system_data =
{
.buzzer_volume = 50,
.led_brightness = 50,
.data_field_3 = 30
};
static counters_t counter =
{
.counter_1 = 0,
.counter_2 = 0,
.counter_3 = 0
};
// Get system settings data from the EEPROM
if (EEPROM_check_ok(EEPROM_BASE_ADDRESS, sizeof(system_t)))
{
eeprom_read_block(&system_data, (uint16_t *) EEPROM_BASE_ADDRESS, sizeof(system_t));
}
if (EEPROM_check_ok((EEPROM_BASE_ADDRESS + offsetof(eeprom_t, counters)), sizeof(counters_t)))
{
eeprom_read_block(&counter, (uint16_t *) (EEPROM_BASE_ADDRESS + offsetof(eeprom_t, counters)), sizeof(counters_t));
}
I'm then using the system settings data at the moment to set other variables in different modules. E.g. in another file, buzzer.c, I have a module static variable (in an effort to avoid globals) with accessor functions to try and give some encapsulation:
// Current volume setting of the buzzer
static uint8_t volume = 50;
void BUZZER_volume_set(uint8_t new_volume)
{
volume = new_volume;
}
uint8_t BUZZER_volume_get(void)
{
return (volume);
}
The problem I feel is I've now got unnecessary duplication of data, as when I pass the buzzer_volume from the system data to set the static volume variable in the buzzer module things could get out of synchronisation. Having the system settings as globals would be easy, but I know this is frowned upon.
Is there a more elegant way of doing this without using globals and still having some encapsulation?
Any suggestions would be gratefully received.
General advice on avoiding globals (and why you need to do so) is given in Jack Ganssle's excellent article "A Pox on Globals". Essential reading.
One solution is simply to have accessor functions in main.c (or better a separate nvdata.c, to protect it from direct access by anything).
Rather than relying on a single initialisation function being called before any access to the data, I would suggest an "initialise on first use" semantic thus:
const system_t* getSystemData()
{
static bool initialised = false ;
if( !initialised )
{
eeprom_read_block( &system_data,
(uint16_t*)EEPROM_BASE_ADDRESS,
sizeof(system_t) ) ;
initialised = true ;
}
return &system_data ;
}
void setSystemData( const system_t* new_system_data )
{
system_data = *new_system_data ;
eeprom_write_block( &system_data,
(uint16_t*)EEPROM_BASE_ADDRESS,
sizeof(system_t));
}
Then in buzzer.c:
uint8_t BUZZER_volume_get(void)
{
return getSystemData()->buzzer_volume ;
}
void BUZZER_volume_set( uint8_t new_volume )
{
system_t new_system_data = *getSystemData() ;
new_system_data.buzzer_volume = new_volume ;
setSystemData( &new_system_data ) ;
}
There are some issues with this: for example, if your structures are large, updating a single member can be expensive because the whole structure is copied and written back. That can be resolved, and it may not be an issue in your application at all.
Another issue is the writing back to the EEPROM on every change - that may cause unnecessary thrashing of the EEPROM and stall your program for significant periods if you have several sequential changes to the same structure. In that case a simple method is to have a separate commit operation:
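static bool system_data_commit_pending = false ;
void setSystemData( const system_t* new_system_data )
{
system_data = *new_system_data ;
system_data_commit_pending = true ;
}
void commitSystemData()
{
if( system_data_commit_pending )
{
eeprom_write_block( &system_data,
(uint16_t*)EEPROM_BASE_ADDRESS,
sizeof(system_t));
// clear the flag so an unchanged structure is not written again
system_data_commit_pending = false ;
}
}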
where you commit the data only when necessary or safe to do so - such as on a controlled shutdown or explicitly selected UI "save settings" operation for example.
A more sophisticated method is to set a timer on change and have the commit function called when the timer expires, each "set" would restart the timer, so the commit would only occur in "quiet" periods. This method is especially suited to a multi-threaded solution.
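The timer-based variant could look roughly like the sketch below. It assumes a periodic tick (systemDataTick, a hypothetical name) driven by a timer interrupt or the main loop, uses a made-up COMMIT_DELAY_TICKS quiet period, and replaces the real EEPROM write with a stand-in variable so the logic can be followed on a PC.

```c
#include <stdbool.h>
#include <stdint.h>

#define COMMIT_DELAY_TICKS 100u   /* hypothetical quiet period, in ticks */

typedef struct system_tag
{
    uint8_t buzzer_volume;
    uint8_t led_brightness;
    uint8_t data_field_3;
} system_t;

static system_t system_data;
static system_t fake_eeprom;            /* stand-in for the real EEPROM */
static unsigned eeprom_writes = 0;      /* counts simulated writes */
static bool     commit_pending = false;
static uint32_t ticks_until_commit = 0;

/* Every "set" stores the data in RAM and restarts the commit timer. */
void setSystemData(const system_t *new_system_data)
{
    system_data = *new_system_data;
    commit_pending = true;
    ticks_until_commit = COMMIT_DELAY_TICKS;
}

/* Call once per timer tick; writes only after a full quiet period. */
void systemDataTick(void)
{
    if (commit_pending && --ticks_until_commit == 0) {
        fake_eeprom = system_data;   /* real code: eeprom_write_block(...) */
        eeprom_writes++;
        commit_pending = false;
    }
}
```

A burst of changes therefore costs a single EEPROM write, issued COMMIT_DELAY_TICKS ticks after the last change. In a multi-threaded or interrupt-driven system the shared variables would additionally need protection against concurrent access.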

Expose C functions/variables via Lua metatable without large if/else block?

I want to create a global interface for a struct instance that is accessible in Lua. For example, I would create a global instance of a metatable called window as main_window and then do things like this from Lua:
main_window.color = {1, 2, 3}
main_window.position.x = 64
main_window.show(true)
In an attempt to do this, I used the code from this answer as a base since it's the closest thing I could find. I ended up with an API like this
lua_create_window_type(L);
lua_expose_window(L, main_window);
lua_setglobal(L, "main_window");
...
static int lua_window_index(lua_State* L)
{
struct window_state** w = luaL_checkudata(L, 1, "window");
const char* index = luaL_checkstring(L, 2);
if (strcmp(index, "x") == 0) {
lua_pushnumber(L, (*w)->x);
} else if (strcmp(index, "show") == 0) {
lua_pushcfunction(L, lua_window_show);
} else {
...
}
return 1;
}
static int lua_window_newindex(lua_State* L)
{
struct window_state** w = luaL_checkudata(L, 1, "window");
const char* index = luaL_checkstring(L, 2);
if (strcmp(index, "x") == 0) {
(*w)->x = luaL_checkinteger(L, 3);
} else {
...
}
return 0;
}
I am inevitably going to end up with tens or hundreds of functions and variables I want to be accessible. Using this template I would have to manually create an if strcmp == 0 else if branch for every single one, and I'd have to duplicate the entire block to allow assignment. I also don't want functions near the end to be comparatively slow to call because of the number of string comparisons. Overall this does not seem like a "maintainable" solution, nor one the Lua authors would have intended.
When I only needed functions all I had to do was push a standard global table and whatever functions I needed, but trying to allow direct variable access like a native Lua table makes this more difficult (before you ask why not just use functions, I've tried and only having "getter/setter" function access from Lua is very painful and ugly for my uses).
Suggestions for more maintainable alternatives to duplicated if/else blocks?

Can gcc/clang optimize initialization computing?

I recently wrote a parser generator tool that takes a BNF grammar (as a string) and a set of actions (as a function pointer array) and outputs a parser (= a state automaton, allocated on the heap). I then use another function to run that parser on my input data and generate an abstract syntax tree.
In the initial parser generation, there are quite a lot of steps, and I was wondering if gcc or clang are able to optimize this, given constant inputs to the parser generation function (and never using the pointer values, only dereferencing them)? Is it possible to run the function at compile time, and embed the result (aka the allocated memory) in the executable?
(obviously, that would be using link time optimization, since the compiler would need to be able to check that the whole function does indeed have the same result with the same parameters)
What you could do in this case is have code that generates code.
Have your initial parser generator as a separate piece of code that runs independently. The output of this code would be a header file containing a set of variable definitions initialized to the proper values. You then use this file in your main code.
As an example, suppose you have a program that needs to know the number of bits that are set in a given byte. You could do this manually whenever you need:
int count_bits(uint8_t b)
{
int count = 0;
while (b) {
count += b & 1;
b >>= 1;
}
return count;
}
Or you can generate the table in a separate program:
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
FILE *header = fopen("bitcount.h", "w");
if (!header) {
perror("fopen failed");
exit(1);
}
fprintf(header, "int bit_counts[256] = {\n");
unsigned v;
for (v=0; v<256; v++) {
uint8_t b = v;
int count = 0; // reset the counter for every byte value
while (b) {
count += b & 1;
b >>= 1;
}
fprintf(header, " %d,\n", count);
}
fprintf(header, "};\n");
fclose(header);
return 0;
}
This creates a file called bitcount.h that looks like this:
int bit_counts[256] = {
0,
1,
1,
2,
...
8,
};
You can then include that file in your "real" code.

Any jump with return instruction in C?

I have multiple locations in my code where I want to be able to jump to one specific location and return to where I was before.
A function call provides that control flow but is not an option for me, as I want the code I branch to to access a number of variables, and passing all of them as arguments to the function call wouldn't be practical or efficient.
And the goto statement is only built to take a label, i.e. expected to be a one-way ticket.
Currently I am achieving what I need with the following:
void *return_addr;
int x,y;
...
return_addr=&&RETURN_0;
goto SOMEWHERE;
RETURN_0:
...
x+=1;
...
return_addr=&&RETURN_1;
goto SOMEWHERE;
RETURN_1:
...
SOMEWHERE:
y=x;
...
goto *return_addr;
Is there something more elegant and less cumbersome?
Is there something more elegant and less cumbersome?
You are obviously using GCC, as the computed goto statement is a GCC extension. With GCC we can use a nested function and access local variables without needing to pass them as arguments:
{
int x, y;
void SOMEWHERE()
{
y = x;
//...
}
//...
SOMEWHERE();
//...
x += 1;
//...
SOMEWHERE();
//...
}
Let's have the variables collected in a structure:
struct data_t {
int a;
int b;
/* and so on */
int x;
int y;
};
Let's have the repeated code defined in a function:
void func(struct data_t* data) {
data->y = data->x;
/* and so on */
}
Let's have the function used:
struct data_t data = {1, 2, ..., 24, 25};
func(&data);
data.x += 1;
func(&data);
/* and so on */
C has setjmp() / longjmp(), which can support what you describe. Do not use them. Even more, however, do not rely on your current approach, which is not standard C, and which is terribly poor form.
What you describe is what functions are for. If you have a lot of data that you must share between the caller and callee, then either
record them in file-scope variables so that both functions can access them directly, or
create one or more complex data types (presumably structs) with which to hold and organize the data, and give the callee access by passing a pointer to such a struct.
A state machine can be written like this:
typedef enum { start, stop, state1, ... } state;
state s = start;
while (s != stop) {
switch (s) {
case start:
do_stuff; // lots of code
// computed goto
s = cond ? state23 : state45;
break;
...
Need a call stack?
state stack[42]; int sp=0;
...
do_stuff;
stack[sp++] = state33;
s = state45; // call
break;
case state33:
case state45:
do_processing; // some code
s = stack[--sp]; // ret
break;
You should only do this after you benchmark your time-critical code sections and find that the normal function call mechanism is indeed the bottleneck.
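For reference, a complete and compilable version of the state machine with a return stack might look like the sketch below. It mirrors the x/y example from the question; the state names and the run_machine function are invented for illustration.

```c
/* States of the machine (hypothetical names). ST_SHARED plays the role
   of the SOMEWHERE label from the question. */
typedef enum {
    ST_START, ST_AFTER_FIRST, ST_AFTER_SECOND, ST_SHARED, ST_STOP
} state;

/* Runs the machine and returns the final y. *calls_out receives how many
   times the shared block was executed. */
int run_machine(int x_initial, int *calls_out)
{
    state stack[4];          /* return-address stack */
    int sp = 0;
    int x = x_initial, y = 0, calls = 0;
    state s = ST_START;

    while (s != ST_STOP) {
        switch (s) {
        case ST_START:
            stack[sp++] = ST_AFTER_FIRST;   /* where to continue afterwards */
            s = ST_SHARED;                  /* "call" */
            break;
        case ST_AFTER_FIRST:
            x += 1;
            stack[sp++] = ST_AFTER_SECOND;
            s = ST_SHARED;                  /* second "call" */
            break;
        case ST_AFTER_SECOND:
            s = ST_STOP;
            break;
        case ST_SHARED:
            y = x;                          /* shared code sees all locals */
            calls++;
            s = stack[--sp];                /* "return" */
            break;
        case ST_STOP:
            break;
        }
    }
    *calls_out = calls;
    return y;
}
```

Starting from x = 5, the shared block runs twice and the function returns 6. Everything stays in one function and in standard C, at the cost of one switch dispatch per transition.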
