I'm writing a "threaded interpreter" using computed goto. How do I initialize the address lookup table so that it is visible from different functions without additional runtime cost?
A label address is only visible inside the function that contains the label. A static lookup table is initialized by the compiler in the data section, with no runtime cost on each call, but its name is also visible only inside that function. I want another function to have access to it, for example to cache addresses and save lookups in the main interpreter code. I can take a pointer to the table and store it somewhere, but that store will happen every time the function is called, and it will be called frequently. Yes, it's only one mov, but is there another way?
#include <stdio.h>
static void** table_ptr;
// how do i declare static variable and init it later once?
// Tried this. Generates runtime assigns at each call. Not unexpected
// static void** jumps_addr;
int main()
{
// labels are visible only inside the function
// generates runtime assigns at each call
// jumps_addr = (void* [10]){
// this initializes it in static data section, but name is only visible inside this function
static void* jumps_addr[10] = {
[1] = &&operation_print,
};
// want another way instead of this
table_ptr = jumps_addr;
// not optimize this
volatile int opcode = 1;
goto *jumps_addr[opcode];
return 0;
operation_print:;
printf("hello\n");
return 0;
}
void do_some_preprocessing_work(void){
// want access to jumps_addr table here
// without having to store it somewhere
// [do something with table_ptr]
// this is to prevent optimization to explore what compiler does on godbolt.org
// because it will optimize away table_ptr entirely if not used
volatile int i = 1;
i += table_ptr[i] != 0;
//actual code here will store label addrs into opcode struct to save table lookup at runtime
}
This might sound unorthodox, but how about not using any functions at all, only goto?
Like so:
#include <stdio.h>
int main()
{
volatile int opcode;
static void* jumps_addr[10] = {
[0] = &&do_some_preprocessing_work,
[1] = &&operation_print
};
opcode = 0;
goto *jumps_addr[opcode];
return 1;
operation_print:
printf("hello\n");
return 0;
do_some_preprocessing_work:
printf("jumps_addr[%i]\n", ++opcode);
goto *jumps_addr[opcode];
return 1;
}
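If keeping separate functions matters, another arrangement that is sometimes used is to let the interpreter function itself hand out its table on a special "setup" call. This is only a sketch under assumptions: the names run, resolve and the OP_* opcodes are made up for illustration, it relies on the GCC labels-as-values extension, and it does cost one extra branch per call to run (not per executed instruction).

```c
#include <stddef.h>

enum { OP_HALT, OP_INC, OP_COUNT };   /* hypothetical opcodes */

/* Hypothetical entry point: called with code == NULL it only returns the
   label table; otherwise it executes a program of pre-resolved labels. */
static void **run(void **code, int *acc)
{
    static void *jumps_addr[OP_COUNT] = {
        [OP_HALT] = &&op_halt,
        [OP_INC]  = &&op_inc,
    };

    if (code == NULL)
        return jumps_addr;       /* setup call: expose the table */

    goto **code++;               /* jump straight to the cached address */

op_inc:
    ++*acc;
    goto **code++;

op_halt:
    return jumps_addr;
}

/* Preprocessing step: translate opcodes into label addresses once. */
static void resolve(const unsigned char *ops, size_t n, void **out)
{
    void **table = run(NULL, NULL);
    for (size_t i = 0; i < n; ++i)
        out[i] = table[ops[i]];
}
```

The preprocessing function calls run(NULL, NULL) once, fills an array of label addresses with resolve, and the interpreter then dispatches with goto **code++ without touching the table again.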
I am currently writing a small game in C and feel like I can't get away from global variables.
For example I am storing the player position as a global variable because it's needed in other files. I have set myself some rules to keep the code clean.
Only use a global variable in the file it's defined in, if possible
Never directly change the value of a global from another file (reading from another file using extern is okay)
So for example graphics settings would be stored as file scope variables in graphics.c. If code in other files wants to change the graphics settings they would have to do so through a function in graphics.c like graphics_setFOV(float fov).
Do you think those rules are sufficient for avoiding global variable hell in the long term?
How bad are file scope variables?
Is it okay to read variables from other files using extern?
Typically, this kind of problem is handled by passing around a shared context:
graphics_api.h
#ifndef GRAPHICS_API
#define GRAPHICS_API
typedef void *HANDLE;
HANDLE init_graphics(void);
void destroy_graphics(HANDLE handle);
void use_graphics(HANDLE handle);
#endif
graphics.c
#include <stdio.h>
#include <stdlib.h>
#include "graphics_api.h"
typedef struct {
int width;
int height;
} CONTEXT;
HANDLE init_graphics(void) {
CONTEXT *result = malloc(sizeof(CONTEXT));
if (result) {
result->width = 640;
result->height = 480;
}
return (HANDLE) result;
}
void destroy_graphics(HANDLE handle) {
CONTEXT *context = (CONTEXT *) handle;
if (context) {
free(context);
}
}
void use_graphics(HANDLE handle) {
CONTEXT *context = (CONTEXT *) handle;
if (context) {
printf("width = %5d\n", context->width);
printf("height = %5d\n", context->height);
}
}
main.c
#include <stdio.h>
#include "graphics_api.h"
int main(void) {
HANDLE handle = init_graphics();
if (handle) {
use_graphics(handle);
destroy_graphics(handle);
}
return 0;
}
Output
width = 640
height = 480
Hiding the details of the context by using a void pointer prevents the user from changing the data contained within the memory to which it points.
How do you avoid using global variables in inherently stateful programs?
By passing arguments...
// state.h
/// state object:
struct state {
int some_value;
};
/// Initializes state
/// @return zero on success
int state_init(struct state *s);
/// Destroys state
/// @return zero on success
int state_fini(struct state *s);
/// Sets the value stored in state
/// @return zero on success
int state_set_value(struct state *s, int new_value);
/// Retrieves the value stored in state
/// @return zero on success
int state_get_value(struct state *s, int *value);
// state.c
#include "state.h"
int state_init(struct state *s) {
s->some_value = -1;
return 0;
}
int state_fini(struct state *s) {
// add free() etc. if needed here
// call fini of other objects here
return 0;
}
int state_set_value(struct state *s, int value) {
if (value < 0) {
return -1; // ERROR - invalid argument
// you may return EINVAL here
}
s->some_value = value;
return 0; // success
}
int state_get_value(struct state *s, int *value) {
if (s->some_value < 0) { // value not set yet
return -1;
}
*value = s->some_value;
return 0;
}
// main.c
#include "state.h"
#include <stdlib.h>
#include <stdio.h>
int main() {
struct state state; // local variable
int err = state_init(&state);
if (err) abort();
int value;
err = state_get_value(&state, &value);
if (err != 0) {
printf("Getting value errored: %d\n", err);
}
err = state_set_value(&state, 50);
if (err) abort();
err = state_get_value(&state, &value);
if (err) abort();
printf("Current value is: %d\n", value);
err = state_fini(&state);
if (err) abort();
}
The only case where global variables have to be used (preferably just a single pointer to some stack variable anyway) is signal handlers. The standard way is to only increment a single global variable of type volatile sig_atomic_t inside the signal handler and do nothing else, then execute all signal-handling logic from the normal flow of the rest of the code by checking the value of that variable. (On POSIX systems, all other asynchronous notifications from the kernel that take a sigevent structure, like timer_create, can pass arguments to the notified function through the members of union sigval.)
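A minimal sketch of that signal-handler pattern, assuming the helper names install_handler and signal_arrived (they are made up for illustration): the handler only bumps a counter, and the normal code compares the counter with the last value it has seen.

```c
#include <signal.h>

/* The one global: written by the handler, read by normal code. */
static volatile sig_atomic_t signal_count = 0;

static void on_signal(int signo)
{
    (void)signo;
    ++signal_count;          /* nothing else happens in the handler */
}

/* Hypothetical helper: installs the handler, returns 0 on success. */
static int install_handler(int signo)
{
    return signal(signo, on_signal) == SIG_ERR ? -1 : 0;
}

/* Hypothetical helper polled from the main loop: returns 1 if at least
   one signal arrived since the previous call with the same last_seen. */
static int signal_arrived(sig_atomic_t *last_seen)
{
    sig_atomic_t now = signal_count;
    if (now == *last_seen)
        return 0;
    *last_seen = now;
    return 1;
}
```

The caller keeps last_seen as a local variable, so the counter stays the only piece of global state.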
Do you think those rules are sufficient for avoiding global variable hell in the long term?
Subjectively: no. I believe the first rule gives a potentially inexperienced programmer too much freedom to create global variables. In complex programs I would use a hard rule: do not use global variables. If, after researching all other approaches, every other possibility has been exhausted and you have to use a global variable, make sure the global variables have the smallest possible memory footprint.
In simple short programs I wouldn't care much.
How bad are file scope variables?
This is opinion based; there are good cases where projects use many global variables. I believe that topic is exhausted in "Are global variables bad?" and numerous other internet resources.
Is it okay to read variables from other files using extern?
Yes, it's ok.
There are no "hard rules" and each project has its own rules. I also recommend reading the c2 wiki page "Global Variables Are Bad".
The first thing you have to ask yourself is: Just why did the programming world come to loathe global variables? Obviously, as you noted, the way to model a global state is essentially a global (set of) variable(s). So what's the problem with that?
The Problem
All parts of the program have access to that state. The whole program becomes tightly coupled. Global variables violate the prime directive in programming, divide and conquer. Once all functions operate on the same data you can as well do away with the functions: They are no longer logical separations of concern but degrade to a notational convenience to avoid large files.
Write access is worse than read access: You'll have a hard time finding out just why on earth the state is unexpected at a certain point; the change can have happened anywhere. It is tempting to take shortcuts: "Ah, we can make the state change right here instead of passing a computation result back up three layers to the caller; that makes the code much smaller."
Even read access can be used to cheat and e.g. change behavior of some deep-down code depending on some global information: "Ah, we can skip rendering, there is no display yet!" A decision which should not be made in the rendering code but at top level. What if top level renders to a file!?
This creates both a debugging and a development/maintenance nightmare. If every piece of the code potentially relies on the presence and semantics of certain variables — and can change them! — it becomes exponentially harder to debug or change the program. The code agglomerating around the global data is like a cast, or perhaps a Boa Constrictor, which starts to immobilize and strangle your program.
Such programming can be avoided with (self-)discipline, but imagine a large project with many teams! It's much better to "physically" prevent access. Not coincidentally all programming languages after C, even if they are otherwise fundamentally different, come with improved modularization.
So what can we do?
The solution is indeed to pass parameters to functions, as KamilCuk said; but each function should only get the information it legitimately needs. Of course it is best if the access is read-only and the result is a return value: Pure functions cannot change state at all and thus perfectly separate concerns.
But simply passing a pointer to the global state around does not cut the mustard: That's only a thinly veiled global variable.
Instead, the state should be separated into sub-states. Only top-level functions (which typically do not do much themselves but mostly delegate) have access to the overall state and hand sub-states to the functions they call. Third-tier functions get sub-sub states, etc. The corresponding implementation in C is a nested struct; pointers to the members — const whenever possible — are passed to functions which therefore cannot see, let alone alter, the rest of the global state. Separation of concerns is thus guaranteed.
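As a rough illustration of such sub-states, here is a sketch with made-up struct and function names (player, graphics, game_tick are assumptions, loosely borrowed from the question's game example). Only the top-level function sees the whole state; the others get a pointer to one member, const where they only read.

```c
struct player   { float x, y; };
struct graphics { int width, height; float fov; };

/* The overall state is just a struct of sub-states. */
struct game {
    struct player   player;
    struct graphics graphics;
};

/* Can change the player, cannot even see the graphics settings. */
static void player_move(struct player *p, float dx, float dy)
{
    p->x += dx;
    p->y += dy;
}

/* Read-only access to one sub-state; the result is a return value. */
static int graphics_pixel_count(const struct graphics *g)
{
    return g->width * g->height;
}

/* Top-level function: delegates and hands out sub-states. */
static int game_tick(struct game *g)
{
    player_move(&g->player, 1.0f, 0.0f);
    return graphics_pixel_count(&g->graphics);
}
```

The struct game object itself can be a local variable in main, so nothing here needs file scope.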
I'm a beginner with C and have a simple question:
I have a function myfunction() which is called periodically every 100 ms.
Within this function I have to call another function, but only once, at the beginning of the first call of myfunction(), and not periodically.
void myfunction() // function is called periodically every 100 ms
{
...
mySubfunction(); // this function has to be called only once, in the first call of myfunction(), and then skipped each time after that.
} ...
How can I do this in C?
Use static? Something along the lines of
void myfunction() // function is called periodically every 100 ms
{
static int once = 1;
if (once) {
mySubfunction();
once = 0;
}
}
The variable once in the example will be initialized only once and will retain its value between invocations because of the static keyword.
Be aware of implications in multithreaded environment, see this question.
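If several threads can call myfunction() at the same time, one portable option is a C11 atomic_flag instead of the plain int. This is only a sketch; the sub_calls counter is an assumption added to make the effect observable and is not part of the question.

```c
#include <stdatomic.h>

static int sub_calls = 0;                  /* only here to observe the effect */

static void mySubfunction(void)
{
    ++sub_calls;
}

void myfunction(void)   /* function is called periodically every 100 ms */
{
    static atomic_flag done = ATOMIC_FLAG_INIT;

    /* test_and_set returns the previous value, so exactly one caller
       ever sees "clear" and runs the one-time work. */
    if (!atomic_flag_test_and_set(&done)) {
        mySubfunction();
    }
    /* periodic work goes here */
}
```

Note that this only guarantees that mySubfunction() is entered once. A second thread may get past the check while the first one is still inside mySubfunction(); if the periodic work depends on its result, some waiting mechanism is needed on top.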
You can have something like
static int flag = 1;
void myfunction() // function is called periodically every 100 ms
{
if(flag)
{
mySubfunction();
flag = 0;
}
}
...
At first glance the task is very simple; the code can be:
void func()
{
static LONG first = TRUE;
if (_InterlockedExchange(&first, FALSE))
{
subfunc();
}
// some code
}
This gives a 100% guarantee that subfunc() will be called once and only once, even if several threads call your func() concurrently.
But what if "// some code" depends on the result of subfunc()? Then the task is no longer trivial: some synchronization is needed, and that depends on the OS or compiler. Windows, starting with Vista, addresses this problem with the function InitOnceExecuteOnce; read "Using One-Time Initialization".
If your subfunc() has no in or out parameters, the code can be very simple:
BOOL CALLBACK InitOnceCallback(PINIT_ONCE /*InitOnce*/, PVOID /*Parameter*/,PVOID* /*Context*/)
{
subfunc();
return TRUE;
}
void func()
{
static INIT_ONCE once = RTL_RUN_ONCE_INIT;
if (InitOnceExecuteOnce(&once, InitOnceCallback, 0, 0))
{
// somecode
}
// error init
}
Also, some modern compilers handle one-time static initialization correctly, for example the latest versions of CL (when compiling as C++, since C requires constant initializers for statics). With it, the code can be:
void func()
{
static char tag = (subfunc(), 0);
// some code
}
Here CL internally calls special functions implemented in the CRT, _Init_thread_header and _Init_thread_footer; their implementation can be found in the CRT source code, in thread_safe_statics.cpp.
This may be more advanced than you're looking for, but you could use function pointers and change which function gets called.
// Function declarations
void mySubfunction(void);
void myNormalfunction(void);
// Function pointer, which can be changed at run time.
static void (*myfunction)(void) = mySubfunction;
void mySubfunction(void)
{
// Do the sub-function stuff.
// Change the function pointer to the normal function.
myfunction = myNormalfunction;
// Do the normal function stuff (if necessary on the first call).
myNormalfunction();
}
void myNormalfunction(void)
{
// Etc.
}
int main(void)
{
int x;
for(x = 0; x < 3; x++)
{
// Call myfunction as you usually would.
myfunction();
}
return 0;
}
I would like to create a wrapper for c functions, so that I can convert a function call of the form ret = function(arg1,arg2,arg3); into the form /*void*/ function_wrapper(/*void*/);. That is similar to function objects in C++ and boost bind.
Is this possible? how can I do it?
Update:
To explain in more details what I am looking for:
We start with this function:
int f(int i){
//do stuff
return somevalue;
}
Obviously, it is called like this:
// do stuff
int x = 0;
ret = f(0);
// do more stuff.
I would like to do some magic that will wrap the function into void function(void)
struct function_object fo;
fo.function_pointer = &f;
fo.add_arg(x, int);
fo.set_ret_pointer(&ret);
fo.call();
Note: I saw that there was a vote for closing this question and marking it as unclear. Please do not do that. I have a legitimate need to get this question answered. If you need explanation, ask and I will be glad to elaborate.
I came up with better code that might allow you to do what you want. First I'll explain how it works, then show the code, and then explain why I still don't think it's a good idea to use it (though the code might open doors for improvements that address those issues).
Functionality:
Before you start using the "function objects", you have to call an initialization function (FUNCTIONOBJ_initialize();), which will initialize the mutexes on every data structure used in the library.
After initializing, every time you want to call one of those "function objects", without using the parameters, you will have to set it up first. This is done by creating a FUNCTIONOBJ_handler_t pointer and calling get_function_handler(). This will search for a free FUNCTIONOBJ_handler data structure that can be used at the moment.
If none is found (all FUNCTIONOBJ_handler data structures are busy, being used by some function call) NULL is returned.
If get_function_handler() does find a FUNCTIONOBJ_handler data structure it will try to lock the FUNCTIONOBJ_id_holder data structure, that holds the ID of the FUNCTIONOBJ_handler of the function about to be called.
If FUNCTIONOBJ_id_holder is locked already, get_function_handler() will hang until it's unlocked by the thread using it.
Once FUNCTIONOBJ_id_holder is locked, the ID of the grabbed FUNCTIONOBJ_handler is written to it and the FUNCTIONOBJ_handler pointer is returned by get_function_handler.
With the pointer in hand, the user can set the pointer to the arguments and the return variable with set_args_pointer and set_return_pointer, which both take a void * as arguments.
Finally, you can call the function you want. It has to:
1 - Grab the FUNCTIONOBJ_handler ID from the FUNCTIONOBJ_id_holder data structure and use it to get a pointer to the FUNCTIONOBJ_handler itself.
2 - Use the FUNCTIONOBJ_handler to access the arguments.
3 - Return by using one of the return functions (in the example we have ret_int, which will return an integer and unlock the FUNCTIONOBJ_handler)
Below is a simplified mind map describing a bit of what is going on:
Finally, the code:
funcobj.h:
#include <stdio.h>
#include <pthread.h>
#define MAX_SIMULTANEOUS_CALLS 1024
typedef struct {
//Current ID about to be called
int current_id;
//Mutex
pthread_mutex_t id_holder_mutex;
} FUNCTIONOBJ_id_holder_t;
typedef struct {
//Attributes
void *arguments;
void *return_pointer;
//Mutex
pthread_mutex_t handler_mutex;
} FUNCTIONOBJ_handler_t;
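//Declared here, defined once in funcobj.c, so that including this header from
//several .c files does not produce multiple definitions
extern FUNCTIONOBJ_handler_t FUNCTIONOBJ_handler[MAX_SIMULTANEOUS_CALLS];
extern FUNCTIONOBJ_id_holder_t FUNCTIONOBJ_id_holder;
void set_return_pointer(FUNCTIONOBJ_handler_t *this, void *pointer);
void set_args_pointer(FUNCTIONOBJ_handler_t *this, void *pointer);
void ret_int(FUNCTIONOBJ_handler_t *this, int return_value);
void FUNCTIONOBJ_initialize(void);
FUNCTIONOBJ_handler_t *get_function_handler(void);
funcobj.c:
#include "funcobj.h"
//Definitions of the global data structures declared in funcobj.h
FUNCTIONOBJ_handler_t FUNCTIONOBJ_handler[MAX_SIMULTANEOUS_CALLS];
FUNCTIONOBJ_id_holder_t FUNCTIONOBJ_id_holder;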
void set_return_pointer(FUNCTIONOBJ_handler_t *this, void *pointer){
this->return_pointer = pointer;
}
void set_args_pointer(FUNCTIONOBJ_handler_t *this, void *pointer){
this->arguments = pointer;
}
void ret_int(FUNCTIONOBJ_handler_t *this, int return_value){
if(this->return_pointer){
*((int *) (this->return_pointer)) = return_value;
}
pthread_mutex_unlock(&(this->handler_mutex));
}
void FUNCTIONOBJ_initialize(void){
for(int i = 0; i < MAX_SIMULTANEOUS_CALLS; ++i){
pthread_mutex_init(&FUNCTIONOBJ_handler[i].handler_mutex, NULL);
}
pthread_mutex_init(&FUNCTIONOBJ_id_holder.id_holder_mutex, NULL);
}
FUNCTIONOBJ_handler_t *get_function_handler(void){
int i = 0;
//Check the bound first so the array is never indexed past its end
while((i < MAX_SIMULTANEOUS_CALLS) && (0 != pthread_mutex_trylock(&FUNCTIONOBJ_handler[i].handler_mutex))){
++i;
}
if(i >= MAX_SIMULTANEOUS_CALLS){
return NULL;
}
//Sets the ID holder to hold this ID until the function is called
pthread_mutex_lock(&FUNCTIONOBJ_id_holder.id_holder_mutex);
FUNCTIONOBJ_id_holder.current_id = i;
return &FUNCTIONOBJ_handler[i];
}
main.c:
#include "funcobj.h"
#include <string.h>
//Function:
void print(void){
//First the function must grab the handler that contains all its attributes:
//The FUNCTIONOBJ_id_holder is mutex locked, so we can just access its value and
//then free the lock:
FUNCTIONOBJ_handler_t *this = &FUNCTIONOBJ_handler[FUNCTIONOBJ_id_holder.current_id];
//We dont need the id_holder anymore, free it!
pthread_mutex_unlock(&FUNCTIONOBJ_id_holder.id_holder_mutex);
//Do whatever the function has to do
printf("%s\n", (char *) this->arguments);
//Return the value to the pointed variable using the function that returns an int
ret_int(this, 0);
}
void *thread_entry_point(void *data){
int id = (int)(long) data;
char string[100];
snprintf(string, 100, "Thread %d", id);
int return_val;
FUNCTIONOBJ_handler_t *this;
for(int i = 0; i < 200; ++i){
do {
this = get_function_handler();
} while(NULL == this);
set_args_pointer(this, string);
set_return_pointer(this, &return_val);
print();
}
return NULL;
}
int main(int argc, char **argv){
//Initialize global data structures (set up mutexes)
FUNCTIONOBJ_initialize();
//testing with 20 threads
pthread_t thread_id[20];
for(int i = 0; i < 20; ++i){
pthread_create(&thread_id[i], NULL, &thread_entry_point, (void *)(long) i);
}
for(int i = 0; i < 20; ++i){
pthread_join(thread_id[i], NULL);
}
return 0;
}
To compile: gcc -o program main.c funcobj.c -lpthread
Reasons to avoid it:
By using this, you are limiting the number of "function objects" that can be running simultaneously. That's because we need to use global data structures to hold the information required by the functions (arguments and return pointer).
You will be seriously slowing down the program when using multiple threads if those use "function objects" frequently: even though many functions can run at the same time, only a single function object can be set up at a time. So at least for the fraction of time it takes the program to set up the function and actually call it, all other threads trying to run a function will hang, waiting for the data structure to be unlocked.
You still have to write some non-intuitive code at the beginning and end of each function you want to work without arguments (grabbing the FUNCTIONOBJ_handler structure, unlocking the FUNCTIONOBJ_id_holder structure, accessing arguments through the pointer you grabbed and returning values with non-built-in functions). This increases the chances of bugs drastically if care is not taken, especially some nasty ones:
Increases the chances of deadlocks. If you forget to unlock one of the data structures in any point of your code, you might end up with a program that works fine at some moments, but randomly freeze completely at others (because all function calls without arguments will be hanging waiting for the lock to be freed). That is a risk that happens on multithreaded programs anyways, but by using this you are increasing the amount of code that requires locks unnecessarily (for style purposes).
Complicates the use of recursive functions: every time you call the function object you'll have to go through the setup phase (even when inside another function object). Also, if you call the recursive function enough times to fill all FUNCTIONOBJ_handler structures, the program will deadlock.
Amongst other reasons I might not notice at the moment :p
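For comparison, if it is acceptable for the "call" step to take the function object itself as its one argument (so it is not a true void f(void)), a plain struct is enough and no global state or locking is needed. A sketch under that assumption; function_object_call is a made-up name, the fields follow the question's pseudocode, and f is just a stand-in body.

```c
/* Hypothetical function object for functions of the form int f(int). */
struct function_object {
    int (*function_pointer)(int);
    int   arg;
    int  *ret_pointer;    /* may be NULL if the result is not needed */
};

/* Calls the bound function with the stored argument and stores the
   result where ret_pointer points. */
static void function_object_call(const struct function_object *fo)
{
    int result = fo->function_pointer(fo->arg);
    if (fo->ret_pointer)
        *fo->ret_pointer = result;
}

/* Stand-in for the question's f(); the body is an assumption. */
static int f(int i)
{
    return i * 2;
}
```

Each thread or recursion level can own its struct function_object on the stack, which avoids the limits listed above, at the price of passing one pointer around.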
I have multiple locations in my code where I want to be able to jump to one specific location and return to where I was before.
A function call provides that control flow but is not an option for me, as I want the code I branch to to access a number of variables, and passing all of them as arguments to the function call wouldn't be practical or efficient.
And the goto statement is only built to take a label, i.e. expected to be a one-way ticket.
Currently I am achieving what I need with the following:
void *return_addr;
int x,y;
...
return_addr=&&RETURN_0;
goto SOMEWHERE;
RETURN_0:
...
x+=1;
...
return_addr=&&RETURN_1;
goto SOMEWHERE;
RETURN_1:
...
SOMEWHERE:
y=x;
...
goto *return_addr;
Is there something more elegant and less cumbersome?
Is there something more elegant and less cumbersome?
You are obviously using GCC, as the computed goto statement is a GCC extension. With GCC we can use a nested function and access local variables without needing to pass them as arguments:
{
int x, y;
void SOMEWHERE()
{
y = x;
//...
}
//...
SOMEWHERE();
//...
x += 1;
//...
SOMEWHERE();
//...
}
Let's have the variables collected in a structure:
struct data_t {
int a;
int b;
/* and so on */
int x;
int y;
};
Let's have the repeated code defined in a function:
void func(struct data_t* data) {
data->y = data->x;
/* and so on */
}
Let's have the function used:
struct data_t data = {1, 2, ..., 24, 25};
func(&data);
data.x += 1;
func(&data);
/* and so on */
C has setjmp() / longjmp(), which can support what you describe. Do not use them. Even more, however, do not rely on your current approach, which is not standard C, and which is terribly poor form.
What you describe is what functions are for. If you have a lot of data that you must share between the caller and callee, then either
record them in file-scope variables so that both functions can access them directly, or
create one or more complex data types (presumably structs) with which to hold and organize the data, and give the callee access by passing a pointer to such a struct.
A state machine can be written like this:
typedef enum { start, stop, state1, ... } state;
state s = start;
while (s != stop) {
switch (s) {
case start:
do_stuff; // lots of code
// computed goto
s = cond ? state23 : state45;
break;
...
Need a call stack?
state stack[42]; int sp=0;
...
do_stuff;
stack[sp++] = state33;
s = state45; // call
break;
case state33:
case state45:
do_processing; // some code
s = stack[--sp]; // ret
break;
You should only do this after you benchmark your time-critical code sections and find that the normal function call mechanism is indeed the bottleneck.
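To make the fragments above concrete, here is a small complete version of the same idea applied to the question's x/y example. The state names and the run_machine function are made up for this sketch; ST_SUB plays the role of SOMEWHERE.

```c
enum state { ST_START, ST_AFTER_FIRST, ST_AFTER_SECOND, ST_SUB, ST_STOP };

/* Runs the machine and returns the final value of y. */
static int run_machine(void)
{
    enum state stack[8];          /* explicit "return address" stack */
    int sp = 0;
    int x = 0, y = 0;
    enum state s = ST_START;

    while (s != ST_STOP) {
        switch (s) {
        case ST_START:
            stack[sp++] = ST_AFTER_FIRST;   /* where to resume */
            s = ST_SUB;                     /* "call" */
            break;
        case ST_AFTER_FIRST:
            x += 1;
            stack[sp++] = ST_AFTER_SECOND;
            s = ST_SUB;                     /* "call" again */
            break;
        case ST_AFTER_SECOND:
            s = ST_STOP;
            break;
        case ST_SUB:
            y = x;                          /* shared code sees the locals */
            s = stack[--sp];                /* "return" */
            break;
        case ST_STOP:
            break;
        }
    }
    return y;
}
```

The shared code runs twice, once before and once after x is incremented, and both times it reaches x and y directly because everything lives in one function.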
Suppose there is a library function (which I cannot modify) that accepts a callback (function pointer) as its argument, which will be called at some point in the future. My question: is there a way to store extra data along with the function pointer, so that when the callback is called, the extra data can be retrieved? The program is in C.
For example:
// callback's type, no argument
typedef void (*callback_t)();
// the library function
void regist_callback(callback_t cb);
// store data with the function pointer
callback_t store_data(callback_t cb, int data);
// retrieve data within the callback
int retrieve_data();
void my_callback() {
int a;
a = retrieve_data();
// do something with a ...
}
int my_func(...) {
// some variables that i want to pass to my_callback
int a;
// ... regist_callback may be called multiple times
regist_callback(store_data(my_callback, a));
// ...
}
The problem is that callback_t accepts no arguments. My idea is to generate a small piece of asm code each time and pass it to regist_callback. When it is called, it can find the real callback and its data, store the data on the stack (or in some unused register), and then jump to the real callback; inside the callback, the data can be found.
pseudocode:
typedef struct {
// some asm code knows the following is the real callback
char trampoline_code[X];
callback_t real_callback;
int data;
} func_ptr_t;
callback_t store_data(callback_t cb, int data) {
// ... malloc a func_ptr_t
func_ptr_t * fpt = malloc(...);
// fill the trampoline_code; this differs per machine and
// per calling convention
// ...
fpt->real_callback = cb;
fpt->data = data;
return (callback_t)fpt;
}
int retrieve_data() {
// ... some asm code to retrieve data on stack (or some register)
// and return
}
Is it reasonable? Is there any previous work done for such problem?
Unfortunately you're likely to be prohibited from executing your trampoline in more and more systems as time goes on, as executing data is a pretty common way of exploiting security vulnerabilities.
I'd start by reporting the bug to the author of the library. Everybody should know better than to offer a callback interface with no private data parameter.
Having such a limitation would make me think twice about whether or not the library is reentrant. I would suggest ensuring you can only have one call outstanding at a time, and storing the callback parameter in a global variable.
If you believe that the library is fit for use, then you could extend this by writing n different callback trampolines, each referring to their own global data, and wrap that up in some management API.
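A sketch of what such a set of n trampolines could look like, with n fixed at 3. Several things here are assumptions for illustration: the real callback takes the data as an int parameter (data_callback_t) instead of calling retrieve_data(), release_data is a made-up helper for freeing a slot, my_callback just records what it received, and nothing is thread-safe.

```c
#include <stddef.h>

typedef void (*callback_t)(void);           /* what the library accepts */
typedef void (*data_callback_t)(int data);  /* assumed shape of the real callback */

#define MAX_SLOTS 3

/* One global slot per trampoline. */
static struct {
    data_callback_t fn;
    int data;
    int used;
} slots[MAX_SLOTS];

static void dispatch(int i) { slots[i].fn(slots[i].data); }

/* Each trampoline is an ordinary function hard-wired to its own slot. */
static void trampoline0(void) { dispatch(0); }
static void trampoline1(void) { dispatch(1); }
static void trampoline2(void) { dispatch(2); }

static const callback_t trampolines[MAX_SLOTS] = {
    trampoline0, trampoline1, trampoline2
};

/* Binds fn and data to a free slot; returns NULL if all are in use. */
static callback_t store_data(data_callback_t fn, int data)
{
    for (int i = 0; i < MAX_SLOTS; ++i) {
        if (!slots[i].used) {
            slots[i].fn = fn;
            slots[i].data = data;
            slots[i].used = 1;
            return trampolines[i];
        }
    }
    return NULL;
}

/* Makes the slot behind cb available again. */
static void release_data(callback_t cb)
{
    for (int i = 0; i < MAX_SLOTS; ++i)
        if (trampolines[i] == cb)
            slots[i].used = 0;
}

/* Example callback: records the data it was called with. */
static int last_seen;
static void my_callback(int a) { last_seen = a; }
```

The value returned by store_data is what would be passed to regist_callback. No code is generated at run time, so this keeps working on systems that refuse to execute data, but the number of callbacks outstanding at once is capped at MAX_SLOTS.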