Is this an appropriate use of void pointers? - c

This question is about the appropriateness of using void pointers in a particular implementation.
I have a relatively simple program that consists of an infinite loop. On each loop, the program iterates over a fixed range of constant values and calls a function on each value. The particular function which is called can be one of three available and is specified at run time by an argument. Before the infinite loop starts, there is a condition block which sets a function pointer to a function based on the supplied argument. This way the condition logic only has to be run once and not on every iteration in every loop.
This I have implemented and it works well, but I want to keep state between each call to the function. My proposal is to store state in a struct and pass that struct when calling the function on each of the values. The problem is that each function requires a different struct to store a different set of values of its state and the prototype of all three functions must be compatible (for the function pointer). I intend to solve this by using a void pointer in the prototypes of the three functions, thus maintaining compatible prototypes but allowing me to pass a different struct to each function.
The question is: is my proposal an appropriate use of void pointers, or is it introducing too much runtime dynamism, so that I should rethink my approach?
Note: It is not possible to use static variables in each of the three functions as the structs also need to be available in the infinite loop as there is also some processing to be done before and after the range of values is iterated.

As long as you are careful to keep your calls type-correct, this is a fairly C-idiomatic way to accomplish what you describe.
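For what it's worth, here is a minimal sketch of the design described in the question, assuming two made-up state structs (`sum_state`, `range_state`) and a hypothetical `run_pass` helper standing in for one iteration of the infinite loop. None of these names come from the question; they only illustrate the shape.

```c
#include <stddef.h>

/* Hypothetical per-function state; the names are illustrative only. */
struct sum_state   { long total; };
struct range_state { int min; int max; };

/* All handlers share one prototype so a single function pointer fits them. */
typedef void (*handler_fn)(int value, void *state);

void sum_handler(int value, void *state)
{
    struct sum_state *s = state;   /* void * converts implicitly in C */
    s->total += value;
}

void range_handler(int value, void *state)
{
    struct range_state *r = state;
    if (value < r->min) r->min = value;
    if (value > r->max) r->max = value;
}

/* One pass over the fixed range of values; the caller owns the state. */
void run_pass(handler_fn fn, void *state, const int *values, size_t n)
{
    for (size_t i = 0; i < n; i++)
        fn(values[i], state);
}

/* The handler is chosen once; the state survives between passes. */
long demo_sum(void)
{
    static const int values[] = { 1, 2, 3, 4 };
    struct sum_state s = { 0 };
    handler_fn fn = sum_handler;   /* in the real program: picked from argv */

    run_pass(fn, &s, values, 4);
    run_pass(fn, &s, values, 4);
    return s.total;
}
```

The one thing the compiler cannot check is that `fn` and the struct passed as `state` belong together, so it helps to set both in the same branch of the start-up condition block.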

You could gain some measure of type safety by using a union:
typedef struct {
int a;
char *b;
} s1;
typedef struct {
double d;
int *e;
} s2;
typedef union {
s1 s1;
s2 s2;
} ocd;
typedef int (*daemon_function)(ocd *);
Then all your functions could be of type daemon_function but take different arguments through ocd.s1 or ocd.s2. I'd tend to call all that a bunch of pointless busy-work though. A simple void* would work just as well.
You could also include a magic number at the front of your structures and then the functions could check type safety by looking at the magic number and seeing if it was the right one:
#define MAGIC 0x4d475600L
typedef struct {
long magic;
/* ... */
} whatever;
And then:
int f(void *p) {
whatever *w = (whatever *)p;
if(w->magic != MAGIC) {
/* complain and go boom! */
}
/* ... */
}
I did the magic number trick all the time back in my Motif programming days, you pass around a lot of void* pointers in Motif/Xt/X11 development.

Void pointers are a way to tell the C type system that you want it to stop doing its job and trust you not to mess up. This is an appropriate use of a void *; the only issue is that you lose the type checking that your compiler performs, so you can create some very bizarre and hard-to-diagnose bugs. If you are sure that you know what you are doing (you sound like you do) and you have checked every line of your code several times for logical errors, then you should be fine.

void * is quite idiomatic in C. Personally I use it extensively, but whenever I do, I tend to use tagged structures for safety, i.e. I put a unique type ID at the beginning of each structure to identify it.

Generally it is OK.
I really prefer using the void * contexts but it looks like you want to avoid it.
Since you already have some code that parses the argument and chooses the function, you can just select the function in a switch and call it explicitly for each iteration.

Related

Using different struct definitions to simulate public and private fields in C

I have been writing C for a decent amount of time, and obviously am aware that C does not have any support for explicit private and public fields within structs. However, I believe I have found a relatively clean method of implementing this without the use of any macros or voodoo, and I am looking to gain more insight into possible issues I may have overlooked.
The folder structure isn't all that important here but I'll list it anyway because it gives clarity as to the import names (and is also what CLion generates for me).
- example-project
- cmake-build-debug
- example-lib-name
- include
- example-lib-name
- example-header-file.h
- src
- example-lib-name
- example-source-file.c
- CMakeLists.txt
- CMakeLists.txt
- main.c
Let's say that example-header-file.h contains:
typedef struct ExampleStruct {
int data;
} ExampleStruct;
ExampleStruct* new_example_struct(int, double);
which just contains a definition for a struct and a function that returns a pointer to an ExampleStruct.
Obviously, now if I import ExampleStruct into another file, such as main.c, I will be able to create and return a pointer to an ExampleStruct by calling
ExampleStruct* new_struct = new_example_struct(<int>, <double>);,
and will be able to access the data property like: new_struct->data.
However, what if I also want private properties in this struct. For example, if I am creating a data structure, I don't want it to be easy to modify the internals of it. I.e. if I've implemented a vector struct with a length property that describes the current number of elements in the vector, I wouldn't want for people to just be able to change that value easily.
So, back to our example struct, let's assume we also want a double field in the struct, that describes some part of internal state that we want to make 'private'.
In our implementation file (example-source-file.c), let's say we have the following code:
#include <stdlib.h>
#include <stdbool.h>
typedef struct ExampleStruct {
int data;
double val;
} ExampleStruct;
ExampleStruct* new_example_struct(int data, double val) {
ExampleStruct* example_struct = malloc(sizeof(ExampleStruct));
example_struct->data=data;
example_struct->val=val;
return example_struct;
}
double get_val(ExampleStruct* e) {
return e->val;
}
This file simply implements that constructor method for getting a new pointer to an ExampleStruct that was defined in the header file. However, this file also defines its own version of ExampleStruct, that has a new member field not present in the header file's definition: double val, as well as a getter which gets that value. Now, if I import the same header file into main.c, which contains:
#include <stdio.h>
#include "example-lib-name/example-header-file.h"
int main() {
printf("Hello, World!\n");
ExampleStruct* test = new_example_struct(6, 7.2);
printf("%d\n", test->data); // <-- THIS WORKS
double x = get_val(test); // <-- THIS AND THE LINE BELOW ALSO WORK
printf("%f\n", x); //
// printf("%f\n", test->val); <-- WOULD THROW ERROR `val not present on struct!`
return 0;
}
I tested this a couple times with some different fields and have come to the conclusion that modifying this 'private' field, val, or even accessing it without the getter, would be very difficult without using pointer arithmetic dark magic, and that is the whole point.
Some things I see that may be cause for concern:
This may make code less readable in the eyes of some, but my IDE has arrow buttons that take me to and from the definition and the implementation, and even without that, a one line comment would provide more than enough documentation to point someone in the direction of where the file is.
Questions I'd like answers on:
Are there significant performance penalties I may suffer as a result of writing code this way?
Am I overlooking something that may make this whole ordeal pointless, i.e. is there a simpler way to do this or is this explicitly discouraged, and if so, what are the objective reasons behind it.
Aside: I am not trying to make C into C++, and generally favor the way C does things, but sometimes I really want some encapsulation of data.
Am I overlooking something that may make this whole ordeal pointless, i.e. is there a simpler way to do this or is this explicitly discouraged, and if so, what are the objective reasons behind it.
Yes: your approach produces undefined behavior.
C requires that
All declarations that refer to the same object or function shall have compatible type; otherwise, the behavior is undefined.
(C17 6.2.7/2)
and that
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
a type compatible with the effective type of the object,
a qualified version of a type compatible with the effective type of the object,
[...]
an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a
subaggregate or contained union), or
a character type.
(C17 6.5/7, a.k.a. the "Strict Aliasing Rule")
Your two definitions of struct ExampleStruct define incompatible types because they specify different numbers of members (see C17 6.2.7/1 for more details on structure type compatibility). You will definitely have problems if you pass instances by value between functions that rely on different ones of these incompatible definitions. You will have trouble if you construct arrays of them, whether dynamically, automatically, or statically, and attempt to use those across boundaries between TUs using one definition and those using another. You may have problems even if you do none of the above, because the compiler may behave unexpectedly, especially when optimizing. DO NOT DO THIS.
Other alternatives:
Opaque pointers. This means you do not provide any definition of struct ExampleStruct in those TUs where you want to hide any of its members. That does not prevent declaring and using pointers to such a structure, but it does prevent accessing any members, declaring new instances, or passing or receiving instances by value. Where member access is needed from TUs that do not have the structure definition, it would need to be mediated by accessor functions.
Just don't access the "private" members. Do not document them in the public documentation, and if you like, explicitly mark them (in code comments, for example) as reserved. This approach will be familiar to many C programmers, as it is used a lot for structures declared in POSIX system headers.
As long as the public has a complete definition for ExampleStruct, it can make code like:
ExampleStruct a = *new_example_struct(42, 1.234);
Then the below will certainly fail.
printf("%g\n", get_val(&a));
I recommend instead creating an opaque pointer and providing public accessor functions for the info in .data and .val.
Think of how we use FILE. FILE *f = fopen(...) and then fread(..., f), fseek(f, ...), ftell(f) and eventually fclose(f). I suggest this model instead. (Even if in some implementations FILE* is not opaque.)
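A minimal sketch of that opaque-pointer layout follows, shown as a single file with comments marking what would go in the header and what would go in the source file. `new_example_struct` and `get_val` are the names from the question; `get_data` and `free_example_struct` are hypothetical additions for the sketch.

```c
#include <stdlib.h>

/* ---- example-header-file.h (public): no struct body here ---- */
typedef struct ExampleStruct ExampleStruct;   /* incomplete (opaque) type */

ExampleStruct *new_example_struct(int data, double val);
int    get_data(const ExampleStruct *e);      /* hypothetical accessor */
double get_val(const ExampleStruct *e);
void   free_example_struct(ExampleStruct *e); /* hypothetical destructor */

/* ---- example-source-file.c (private): the only full definition ---- */
struct ExampleStruct {
    int data;
    double val;
};

ExampleStruct *new_example_struct(int data, double val)
{
    ExampleStruct *e = malloc(sizeof *e);
    if (e == NULL)
        return NULL;           /* caller must check for allocation failure */
    e->data = data;
    e->val = val;
    return e;
}

int get_data(const ExampleStruct *e) { return e->data; }
double get_val(const ExampleStruct *e) { return e->val; }
void free_example_struct(ExampleStruct *e) { free(e); }
```

With only the header visible, code such as `test->data` or `ExampleStruct a = *p;` no longer compiles, because the type is incomplete there. Every translation unit that does see a definition sees the same one, so the incompatible-types problem goes away.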
Are there significant performance penalties I may suffer as a result of writing code this way?
Probably:
Heap allocation is expensive, and - today - usually not optimized away even when that is theoretically possible.
Dereferencing a pointer for member access is expensive; although this might get optimized away with link-time-optimization... if you're lucky.
i.e. is there a simpler way to do this
Well, you could use a slack array of the same size as your private fields, and then you wouldn't need to go through pointers all the time:
#include <stddef.h> /* max_align_t */
#define EXAMPLE_STRUCT_PRIVATE_DATA_SIZE sizeof(double)
typedef struct ExampleStruct {
int data;
_Alignas(max_align_t) unsigned char private_data[EXAMPLE_STRUCT_PRIVATE_DATA_SIZE];
} ExampleStruct;
This is basically a type-erasure of the private data without hiding the fact that it exists. Now, it's true that someone can overwrite the contents of this array, but it's kind of useless to do it intentionally when you "don't know" what the data means. Also, the private data in the "real" definition will need to have the same, maximal, _Alignas() as well (if you want the private data not to need to use _Alignas(), you will need to use the real alignment quantum for the type-erased version).
The above is C11. You can sort of do about the same thing by typedef'ing max_align_t yourself, then using an array of max_align_t elements for private data, with an appropriate length to cover the actual size of the private data.
An example of the use of such an approach can be found in CUDA's driver API:
Parameters for copying a 3D array: CUDA_MEMCPY3D vs
Parameters for copying a 3D array between two GPU devices: CUDA_MEMCPY3D_peer
The first structure has a pair of reserved void* fields, hiding the fact that it's really the second structure. They could have used an unsigned char array, but it so happens that the private fields are pointer-sized, and void* is also kind of opaque.
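As a rough sketch of how the implementation side could use such a reserved byte array: the version below does not keep a second "real" struct definition. It copies the private value in and out with `memcpy`, a swapped-in variation that sidesteps the question of whether two definitions are compatible. `example_init` and `example_val` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

#define EXAMPLE_STRUCT_PRIVATE_DATA_SIZE sizeof(double)

/* Public view: the private bytes are visible but not interpreted. */
typedef struct ExampleStruct {
    int data;
    _Alignas(max_align_t) unsigned char private_data[EXAMPLE_STRUCT_PRIVATE_DATA_SIZE];
} ExampleStruct;

/* Compile-time check that the reserved space is large enough. */
_Static_assert(sizeof(double) <= EXAMPLE_STRUCT_PRIVATE_DATA_SIZE,
               "private_data too small");

/* Implementation side: store the private double into the byte array. */
void example_init(ExampleStruct *e, int data, double val)
{
    e->data = data;
    memcpy(e->private_data, &val, sizeof val);
}

/* Implementation side: read the private double back out. */
double example_val(const ExampleStruct *e)
{
    double val;
    memcpy(&val, e->private_data, sizeof val);
    return val;
}
```

Because the struct is complete in the public header, callers can still declare instances on the stack or in arrays, which the opaque-pointer approach does not allow.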
This causes undefined behaviour, as detailed in the other answers. The usual way around this is to make a nested struct.
In example.h, one defines the public-facing elements. struct example is not meant to be instantiated; in a sense, it is abstract. Only pointers that are obtained from one of its constructors (in this case, the only one) are valid.
struct example { int data; };
struct example *new_example(int, double);
double example_val(struct example *e);
and in example.c, instead of re-defining struct example, one has a nested struct private_example. (Such that they are related by composite aggregation.)
#include <stddef.h> /* offsetof */
#include <stdlib.h>
#include "example.h"
struct private_example {
struct example public;
double val;
};
struct example *new_example(int data, double val) {
struct private_example *const example = malloc(sizeof *example);
if(!example) return 0;
example->public.data = data;
example->val = val;
return &example->public;
}
/** This is a poor version of `container_of`. */
static struct private_example *example_upcast(struct example *example) {
return (struct private_example *)(void *)
((char *)example - offsetof(struct private_example, public));
}
double example_val(struct example *e) {
return example_upcast(e)->val;
}
Then one can use the object as in main.c. This is used frequently in Linux kernel code for container abstraction. Note that offsetof(struct private_example, public) is zero, ergo example_upcast does nothing and a cast is sufficient: ((struct private_example *)e)->val. If one builds structures in a way that always allows casting, one is limited to single inheritance.

Avoid global variables when using a predefined number of parameters in a function

I'm programming a command shell as an assignment where I need to run a series of commands.
I implemented a couple of structs to make it easier to extend (as the project is going to get bigger during the semester).
commands are defined with three values:
(tokens is an array of all the flags that are passed to the command and also the values for things like "mkdir dir")
struct cmd {
char *cmd_name;
int (*cmd_fun)(char *tokens[], int ntokens);
char *cmd_help;
};
and then collected in an array of all of them
struct cmd cmds[] ={
{"command1", command1, "description of command1"},
{"command2", command2, "description of command2"},
{NULL, NULL, NULL}
};
So far this works great. I have a function that reads the user input and compares it with the cmd_name then executing the cmd_fun.
The issue here is that I have a couple of commands in this struct that need a list to function, more specifically I have a history of commands, and I can't add it as a parameter because cmd_fun is defined with only tokens[] and ntokens.
I first got around this with a global variable that creates the list, but I now want to separate things into different files and having the list in my commands.c seems messy, plus my professor complained a bit about the implementation.
Is there any way I could pass this list more elegantly?
I cannot picture the case in which you need to avoid globals in a function and signal that fact through a special parameter list. Why not just write two functions and call one or the other as needed?
You are talking about different ways to implement the storage of an algorithm (which is something the compiler must know at compilation time), and yet you are planning to implement a single function in two different ways, based on the argument list you pass to it, which you would detect at runtime?
Why not write two functions, one that uses globals and one that takes an argument list through a different interface, and call one or the other depending on your needs? You cannot call a function in different ways at runtime, because each call has to be written out at compile time; the whole problem is static in nature and cannot be moved to a runtime decision.
The simplest way to take a decision at runtime to call an implementation of one algorithm or another is to use function pointers.... but in that case, the argument lists cannot be different, just the pointer is dynamically assigned and so, a different call is made because of the pointer contents at the time of the call.
void f1(int, int, int);
void f2(int, int, int);
void (*ptr)(int, int, int);
int main()
{
if (something) {
ptr = f1;
} else {
ptr = f2;
}
/* .... later, much later */
ptr(34, 25, 18); /* will call f1 or f2 depending on 'something' */
}

How to avoid globals in this case (embedded C)

I'm still learning C to be used in microprocessors. In the beginning I used lots of globals. Now I'm trying to avoid them as much as I can, but for me it's not always clear how to do this.
For example a battery monitor, in this case there are 4 functions that need to read or modify a variable.
I have these functions all using the variable LowVoltage.
void Check_Voltage(){
/* checks current voltage against LowVoltage */
}
void Menu_Voltage(){
/* a menu on the LCD screen to set the value of LowVoltage */
}
void Save_LowVoltage(){
/* runs after the settings menu is finished to save LowVoltage to EEPROM */
}
void Load_LowVoltage(){
/* reads EEPROM and sets LowVoltage at startup */
}
Check_Voltage() and Save_LowVoltage() need to read LowVoltage.
Load_LowVoltage() needs to write LowVoltage.
Menu_Voltage() needs to read and write LowVoltage.
How can I make this work without making LowVoltage global??
Would I need to make another function to read or write LowVoltage?
Something like this:
unsigned int Low_Voltage(short Get, unsigned int Value){
static unsigned int LowVoltage;
if(!Get) LowVoltage= Value;
return LowVoltage; /* a value is returned on every path, also after a set */
}
Or are there better ways to do this? I guess there must be :)
I've been reading about structures lately, but to be honest I don't fully understand them and I'm not even sure it would help me in cases like this?
There are several choices to sharing a variable among functions:
Allocate your variable in static memory - this is pretty much what your code does. Your three choices there are function-static, translation unit-static, and global.
Pass a pointer to variable as function parameter - This choice requires passing the pointer around in some form
Use thread-local storage with clever initialization - This choice is not usually available when you work with microcontrollers; I list it here for completeness.
In your case, I think that using a translation unit-static variable would be appropriate. Put implementations of the four functions into a single C file, and declare LowVoltage at the top as a static variable:
static unsigned int LowVoltage;
This simple but efficient encapsulation mechanism gives you all benefits of having a global variable, without the drawbacks of having a global variable:
All functions inside the C module "see" this variable, and can freely manipulate it
No other functions outside the C module can access this variable. They can declare their own LowVoltage variable, giving it an entirely different meaning.
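A minimal sketch of such a module, assuming voltages are plain unsigned numbers (say millivolts) and using a file-local variable as a stand-in for the EEPROM. The `fake_eeprom` variable and the parameters on `Menu_Voltage` and `Check_Voltage` are assumptions made to keep the sketch self-contained; the real functions would talk to the hardware instead.

```c
/* battery.c: everything that touches LowVoltage lives in this one file. */

static unsigned int LowVoltage;          /* visible only inside this file */

/* Stand-in for real EEPROM access; a hypothetical placeholder. */
static unsigned int fake_eeprom = 3300;

/* Reads the stored threshold at startup. */
void Load_LowVoltage(void) { LowVoltage = fake_eeprom; }

/* Writes the current threshold back after the settings menu is done. */
void Save_LowVoltage(void) { fake_eeprom = LowVoltage; }

/* The real menu would read keys from the LCD UI; here the new value is
   simply passed in to keep the sketch short. */
void Menu_Voltage(unsigned int new_threshold) { LowVoltage = new_threshold; }

/* Returns 1 when the measured voltage is below the threshold, else 0. */
int Check_Voltage(unsigned int measured) { return measured < LowVoltage; }
```

Other files only see the four function prototypes (through a header), so nothing outside battery.c can read or change the threshold except by calling them.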
Two solutions I can think of
Make the function signatures like this
unsigned int Load_LowVoltage(unsigned int lowVoltage);
and then pass LowVoltage and assign to it the return value, like this
LowVoltage = Load_LowVoltage(LowVoltage);
Modify LowVoltage inside the function and pass a pointer to your original LowVoltage like this
void Load_LowVoltage(unsigned int *lowVoltage)
{
*lowVoltage = modifiedValue;
}
then you can use it like this
Load_LowVoltage(&LowVoltage);
I think the second solution is cleaner, and since you are in an environment where resources are limited, it's also better in that sense. But both are easy to implement and work equally well.
You could create a struct holding all battery parameters, for example:
typedef struct {
int low_voltage;
int voltage;
int capacity;
int temperature;
/* ... */
} BatteryData;
At the start of your program you allocate memory for it and initialise members to some starting values:
BatteryData *battery = malloc(sizeof(BatteryData));
battery->low_voltage = 0;
...
Then you pass pointer to the whole struct to functions that set or read individual values, for example:
void Load_LowVoltage(BatteryData *battery){
//reads EEPROM and sets LowVoltage at startup
int eeprom_val = get_low_voltage_from_eeprom();
battery->low_voltage = eeprom_val;
}
Free the structure when not needed:
free(battery);

How to check if a void* pointer can be safely cast to something else?

Let's say I have this function, which is part of some gui toolkit:
typedef struct _My_Struct My_Struct;
/* struct ... */
void paint_handler( void* data )
{
if ( IS_MY_STRUCT(data) ) /* <-- can I do something like this? */
{
My_Struct* str = (My_Struct*) data;
}
}
/* in main() */
My_Struct s;
signal_connect( SIGNAL_PAINT, &paint_handler, (void*) &s ); /* sent s as a void* */
Since the paint_handler will also be called by the GUI toolkit's main loop with other arguments, I cannot always be sure that the parameter I am receiving will always be a pointer to s.
Can I do something like IS_MY_STRUCT in the paint_handler function to check that the parameter I am receiving can be safely cast back to My_Struct* ?
Your void pointer loses all its type information, so by that alone, you cannot check if it can be cast safely. It's up to the programmer to know if a void* can be cast safely to a type.
Unfortunately there is no function to check what the pointer was before it appears in that context (void).
The one solution I can think of is if you place an int _struct_id as the first member of all of your structs. This id member can then be safely checked regardless of the type but this will fail if you pass pointers that don't implement this member (or int, char, ... pointers).
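A small sketch of that idea, assuming every struct that can reach the handler starts with an `int` tag. The member is called `struct_id` here, and the tag values and the `Other_Struct` type are made up for illustration. Unlike the handler in the question, this one returns a value so the effect is visible.

```c
/* Hypothetical type tags; every struct passed as void * starts with one. */
enum struct_id { ID_MY_STRUCT = 1, ID_OTHER_STRUCT = 2 };

typedef struct My_Struct {
    int struct_id;      /* must be the first member */
    int width;
} My_Struct;

typedef struct Other_Struct {
    int struct_id;      /* same first member, different payload */
    double ratio;
} Other_Struct;

/* Reads only the leading int, which every tagged struct shares. */
#define IS_MY_STRUCT(p) (*(const int *)(p) == ID_MY_STRUCT)

/* Returns the width for a My_Struct, or -1 for any other tagged struct. */
int paint_handler(void *data)
{
    if (IS_MY_STRUCT(data)) {
        My_Struct *str = data;
        return str->width;
    }
    return -1;
}
```

This works because a pointer to a struct can be converted to a pointer to its first member. It still gives no protection against pointers to untagged data, as noted above.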
The best you could do would be to look at what data points to to see if it has telltale signs of being what you want, although a) it wouldn't be anywhere close to a guarantee and b) might be dangerous, as you don't know how big the thing data actually points to is. I suppose it isn't any more dangerous than just casting it and using it, but (as has been suggested) a redesign would be better.
If you are creating the type that is being used, you could include as part of the type some kind of identifying information that would help you rule out some void pointers as not being of the type you are looking for. While you would run the chance that some random area of memory would contain the same data or signature as what you are looking for, at least you would know when something was not the type you were looking for.
This approach would require that the struct be initialized in such a way that the signature members, used to determine whether the memory area is valid, are set to the signature values.
An example:
typedef struct {
ULONG ulSignature1;
// .. data elements that you want to have
ULONG ulSignature2;
} MySignedStruct;
#define MYSIGNEDSTRUCT_01 0x1F2E3D4C
#define MYSIGNEDSTRUCT_02 0xF1E2D3C4
#define IS_MY_STRUCT(sAdr) ( (((MySignedStruct *)(sAdr))->ulSignature1 == MYSIGNEDSTRUCT_01 ) && (((MySignedStruct *)(sAdr))->ulSignature2 == MYSIGNEDSTRUCT_02))
This is kind of a rough approach however it can help. Naturally using a macro like IS_MY_STRUCT() where the argument is used twice can be problematic if the argument has a side effect so you would have to be careful of something like IS_MY_STRUCT(xStruct++) where xStruct is a pointer to a MySignedStruct.
There really isn't in C. void pointers are typeless, and should only ever be cast when you truly know what they point to.
Perhaps you should instead reconsider your design; rewrite your code so that no inspection is necessary. This is the same reason Google disallows RTTI in its style guide.
I know the question is 3 years old but here I go,
How about using a simple global enum to distinguish where the function is called from? Then you can switch on it to decide what type to cast the void pointer to.

Passing more parameters in C function pointers

Let's say I'm creating a chess program. I have a function
void foreachMove( void (*action)(chess_move*), chess_game* game);
which will call the function pointer action on each valid move. This is all well and good, but what if I need to pass more parameters to the action function? For example:
chess_move getNextMove(chess_game* game, int depth){
//for each valid move, determine how good the move is
foreachMove(moveHandler, game);
}
void moveHandler(chess_move* move){
//uh oh, now I need the variables "game" and "depth" from the above function
}
Redefining the function pointer is not the optimal solution. The foreachMove function is versatile and many different places in the code reference it. It doesn't make sense for each one of those references to have to update their function to include parameters that they don't need.
How can I pass extra parameters to a function that I'm calling through a pointer?
Ah, if only C supported closures...
Antonio is right; if you need to pass extra parameters, you'll need to redefine your function pointer to accept the additional arguments. If you don't know exactly what parameters you'll need, then you have at least three choices:
Have the last argument in your prototype be a void*. This gives you flexibility of passing in anything else that you need, but it definitely isn't type-safe.
Use variadic parameters (...). Given my lack of experience with variadic parameters in C, I'm not sure if you can use this with a function pointer, but this gives even more flexibility than the first solution, albeit still with the lack of type safety.
Upgrade to C++ and use function objects.
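The first option could look roughly like this. The `chess_move` and `chess_game` definitions are stand-ins so the sketch compiles on its own, and `search_context` and `countWeighted` are hypothetical names. Note that this does change the signature of `foreachMove`, so existing callers would pass NULL as the extra argument.

```c
#include <stddef.h>

/* Hypothetical minimal types so the sketch compiles on its own. */
typedef struct { int from, to; } chess_move;
typedef struct { chess_move moves[4]; size_t count; } chess_game;

/* The callback receives an extra void * that foreachMove passes through. */
void foreachMove(void (*action)(chess_move *, void *), chess_game *game, void *context)
{
    for (size_t i = 0; i < game->count; i++)
        action(&game->moves[i], context);
}

/* Everything this particular callback needs, bundled by the caller. */
struct search_context {
    chess_game *game;
    int depth;
    int visited;
};

void moveHandler(chess_move *move, void *context)
{
    struct search_context *ctx = context;
    (void)move;                   /* a real handler would score the move */
    ctx->visited += ctx->depth;   /* placeholder work using the context */
}

/* Adds depth once per move, just to show the context arriving intact. */
int countWeighted(chess_game *game, int depth)
{
    struct search_context ctx = { game, depth, 0 };
    foreachMove(moveHandler, game, &ctx);
    return ctx.visited;
}
```

The context lives on the caller's stack for the duration of the call, so no allocation or global is needed.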
You'd probably need to redefine the function pointer to take additional arguments.
void foreachMove( void (*action)(chess_move*, int), chess_game* game )
If you're willing to use some C++, you can use a "function object":
struct MoveHandler {
chess_game *game;
int depth;
MoveHandler(chess_game *g, int d): game(g), depth(d) {}
void operator () (chess_move*) {
// now you can use the game and the depth
}
};
and turn your foreachMove into a template:
template <typename T>
void foreachMove(T action, chess_game* game);
and you can call it like this:
chess_move getNextMove(chess_game* game, int depth){
//for each valid move, determine how good the move is
foreachMove(MoveHandler(game, depth), game);
}
but it won't disrupt your other uses of foreachMove.
If I'm reading this right, what I'd suggest is to make your function take a pointer to a struct as an argument. Then, your struct can have "game" and "depth" when it needs them, and just leave them set to 0 or Null when you don't need them.
What is going on in that function? Do you have a conditional that says,
if (depth > -1) //some default
{
//do something
}
Does the function always REQUIRE "game" and "depth"? Then, they should always be arguments, and that can go into your prototypes.
Are you indicating that the function only sometimes requires "game" and "depth"? Well, maybe make two functions and use each one when you need to.
But, having a structure as the argument is probably the easiest thing.
I'd suggest using an array of void*, with the last entry always NULL.
say you need 3 parameters you could do this:
void MoveHandler (void** DataArray)
{
// data1 is always chess_move (NULL when absent)
chess_move *data1 = (chess_move*)DataArray[0];
// data2 is always float
float *data2 = (float*)DataArray[1];
// data3 is always char
char *data3 = (char*)DataArray[2];
//etc
}
void foreachMove( void (*action)(void**), chess_game* game);
and then
chess_move getNextMove(chess_game* game, int depth){
//for each valid move, determine how good the move is
chess_move move;
void* data[4];
data[0] = &move;
float f1;
char c1;
data[1] = &f1;
data[2] = &c1;
data[3] = NULL;
foreachMove(MoveHandler, game);
}
If all the parameters are the same type then you can avoid the void* array and just send a NULL-terminated array of whatever type you need.
+1 to Antonio. You need to change your function pointer declaration to accept additional parameters.
Also, please don't start passing around void pointers or (especially) arrays of void pointers. That's just asking for trouble. If you start passing void pointers, you're going to also have to pass some kind of message to indicate what the pointer type is (or types are). This technique is rarely appropriate.
If your parameters are always the same, just add them to your function pointer arguments (or possibly pack them into a struct and use that as the argument if there are a lot of parameters). If your parameters change, then consider using multiple function pointers for the multiple call scenarios instead of passing void pointers.
If your parameters change, I would change the function pointer declaration to use the "..." technique to set up a variable number of arguments. It could help readability and also save you from having to make a change for each parameter you want to pass to the function. It is definitely a lot safer than passing void around.
http://publications.gbdirect.co.uk/c_book/chapter9/stdarg.html
Just an FYI, about the example code in the link: some places they have “n args” and others it is “n_args” with the underscore. They should all have the underscore. I thought the syntax looked a little funny until I realized they had dropped the underscore in some places.
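For reference, a function pointer can have a variadic type. Here is a small sketch using `<stdarg.h>`, with made-up names (`variadic_fn`, `sum_args`, `call_with_three`) and a leading count argument so the callee knows how many values follow.

```c
#include <stdarg.h>

/* A variadic callback type: a fixed count followed by that many ints. */
typedef int (*variadic_fn)(int n_args, ...);

/* Sums the n_args ints that follow the count. */
int sum_args(int n_args, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, n_args);
    for (int i = 0; i < n_args; i++)
        total += va_arg(ap, int);   /* the caller must really pass ints */
    va_end(ap);
    return total;
}

/* Calls through the pointer with three extra arguments. */
int call_with_three(variadic_fn fn)
{
    return fn(3, 10, 20, 12);
}
```

The callee has to be told the number and types of the extra arguments some other way (here, the count), since the compiler does not check them against what `va_arg` reads.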
Use a typedef for the function pointer. See my answer for this question
Another option would be to modify the chess_move structure instead of the function prototype. The structure is presumably defined in only one place already. Add the members to the structure, and fill the structure with appropriate data before any call which uses it.

Resources