How to simply access data in a union inside of a struct? - c

Here is a type I declared:
(I declared t_sphere, t_cylinder and t_triangle too)
typedef struct s_intersection{
double t1;
double t2;
int id;
union {
t_sphere sph;
t_cylinder cyl;
t_triangle tri;
} u;
} t_intersection;
When I use that intersection structure in some code, is there a handy way to refer to the member inside my union ?
e.g. Let's say I want to write a function that acts differently according to the type of geometric_figure it contains. Will I have to do it like this ?
if (geometric_figure_id == SPHERE_ID)
// I know I will have to refer to p->u with p->u.sph...
else if(geometric_figure_id == CYLINDER_ID)
// I know I will have to refer to p->u with p->u.cyl...
else if (geometric_figure_id == TRIANGLE_ID)
// I know I will have to refer to p->u with p->u.tri...
What if I had 10 different geometric_figures types inside my union ?
This feels very heavy.
Do you have a more elegant solution ?

Let's say I want to write a function that acts differently according to the type of geometric_figure it contains. Will I have to do it like this ?
It sounds like an awareness of other languages' support for runtime polymorphic function dispatch may be part of the context for your question. If so, then it's important to recognize that what you're dispatching on here is not the type of any object, as you typically would in (say) C++ or Java, but rather the value of an integer.
If you want to follow different control paths for different runtime values of an integer, then there is no alternative to writing a flow-control statement -- generally an if / else if / else or a switch -- that directs control appropriately.*
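For concreteness, here is a minimal sketch of that kind of switch. The shape definitions, the member names r and h, and the enum are hypothetical stand-ins, not the asker's real types. Using an enum for the tag lets a compiler that supports -Wswitch warn when a case is missing.

```c
/* Hypothetical stand-ins for the asker's shape types. */
typedef struct { double r; } t_sphere;
typedef struct { double r, h; } t_cylinder;

typedef enum { SPHERE_ID, CYLINDER_ID } t_shape_id;

typedef struct {
    t_shape_id id;          /* tag: says which union member is valid */
    union {
        t_sphere sph;
        t_cylinder cyl;
    } u;
} t_shape;

#define PI 3.14159265358979323846

/* One switch per operation; each case touches only its own member. */
double shape_volume(const t_shape *p)
{
    switch (p->id) {
    case SPHERE_ID:
        return 4.0 * PI * p->u.sph.r * p->u.sph.r * p->u.sph.r / 3.0;
    case CYLINDER_ID:
        return PI * p->u.cyl.r * p->u.cyl.r * p->u.cyl.h;
    }
    return -1.0;            /* unknown tag */
}
```

With ten shapes the switch grows to ten cases, but every case stays a one-liner and all the dispatch for one operation lives in one place.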
Now, if you were dispatching on type then C does offer type-generic expressions. For the most part, these make sense to use only inside a macro:
#define VOLUME(x) _Generic((x), \
t_sphere: 4 * PI * (x).r * (x).r * (x).r / 3, \
t_cylinder: PI * (x).r * (x).r * (x).h, \
t_cube: (x).edge * (x).edge * (x).edge \
)
In contexts other than a macro, you know the type involved already, so a type-generic expression gains you nothing worth having.
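One caveat when trying a macro like VOLUME: every association in a _Generic selection must be a valid expression for the type actually passed, even though only the selected one is evaluated. An inline branch such as (x).h is therefore rejected when x has no member h. A sketch that sidesteps this by selecting a function and then calling it, assuming hypothetical shape types and helper functions (none of these names come from the question):

```c
#define PI 3.14159265358979323846

/* Hypothetical shape types; the member names are assumptions. */
typedef struct { double r; } t_sphere;
typedef struct { double r, h; } t_cylinder;
typedef struct { double edge; } t_cube;

double sphere_volume(t_sphere s)     { return 4.0 * PI * s.r * s.r * s.r / 3.0; }
double cylinder_volume(t_cylinder c) { return PI * c.r * c.r * c.h; }
double cube_volume(t_cube c)         { return c.edge * c.edge * c.edge; }

/* _Generic picks a function by the static type of x; the call is applied after. */
#define VOLUME(x) _Generic((x),      \
    t_sphere:   sphere_volume,       \
    t_cylinder: cylinder_volume,     \
    t_cube:     cube_volume          \
)(x)
```

Note that this dispatches on the compile-time type of the argument, so it still cannot replace the runtime check on id inside t_intersection.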
Depending on how much macro magic you want to apply, there are ways to avoid writing out long if / else if / else statements or long switch statements by hand. Type-generic macros would likely play a role in something like that. But such a course of action is difficult to implement well. You're more likely to end up with a confusing maintenance nightmare of complex macro stacks than with something that would compare favorably to manually-written switch statements.
*Or maybe a complex expression with nested use of the ternary operator could be taken as fulfilling that need, but using such an expression is not realistic if it needs to be written and maintained by hand.

Do you have a more elegant solution ?
Saying "more elegant" makes it opinion based.
So rather than a "more elegant" solution, I'll call this an alternative solution.
To avoid the many nested if-statements (or a big switch-statement), you can add a function pointer. So when you make an instance of t_intersection, you also set the function pointer to point to the function needed for the specific sub-type of t_intersection.
Here is an example based on the OP's code, with a few modifications to make it simpler.
#include <stdio.h>
#define SPHERE_ID 0
#define CYLINDER_ID 1
double sph_calculation(void* p)
{
puts("sph_calculation");
int n = *(int*)p; // Cast back to correct type
double res = 1.0/n;
return res;
}
double cyl_calculation(void* p)
{
puts("cyl_calculation");
float f = *(float*)p; // Cast back to correct type
double res = 1.0/f;
return res;
}
typedef struct s_intersection{
double t1;
double t2;
int id;
double (*calculation)(void*); // Function pointer
union {
int sph;
float cyl;
} u;
} t_intersection;
int main(void)
{
t_intersection m[2];
m[0].id = SPHERE_ID;
m[0].calculation = sph_calculation; // Set function to be called
m[0].u.sph = 2;
m[1].id = CYLINDER_ID;
m[1].calculation = cyl_calculation; // Set function to be called
m[1].u.cyl = 3.0;
// Do the calculation for all types without a need for nested if-statements
for (int i = 0; i < 2; ++i)
{
double x = m[i].calculation(&m[i].u);
printf("%f\n", x);
}
return 0;
}
Output:
sph_calculation
0.500000
cyl_calculation
0.333333

Related

How to construct a C function with void pointer parameters and conditionally cast them to other types at runtime?

I'm trying to create a function where parameters are passed as void pointers, and including a parameter setting the data type the void pointers will be cast to, so that the function may be used on different types. Something like the following, which does not work:
void test_function(int use_type, void * value, void * array) {
// Set types to the parameters based on 'use_type'
if (use_type == 0) { // Int type
int * valueT = (int *) value;
int * arrayT = (int *) array;
} else if (use_type == 1) { // Double type
double * valueT = (double *) value;
double * arrayT = (double *) array;
}
// Main code of the program, setting an array item, regardless of type
arrayT[0] = *valueT;
}
There are two problems with the above code: the properly typed valueT and arrayT are scoped in the conditional blocks and not visible to the main part of the code. Moving their declarations out of the blocks isn't viable in the given structure of the code though, as they would then need different names for int and double versions, defeating the whole idea of what I'm trying to achieve. The other problem is that valueT and arrayT are local to the function. What I really want is to set the parameter array: array[0] = *value.
It appears that what I'm trying to do isn't possible in C... Is there a way that this could be done?
EDIT:
The assignment to array line is there to demonstrate what I want to do, there is a lot more code in that part. There will also be a number of other types besides int and double. Moving the assignment line into the blocks would mean too much code duplication.
You're trying to implement polymorphism in C. Down this path lies madness, unmaintainable code, and new programming languages.
Instead, I strongly recommend refactoring your code to use a better method of working with mixed data. union or struct or pointers or any of the solutions here. This will be less work in the long run and result in faster and more maintainable code.
Or you can switch to C++ and use templates.
Or you can use somebody else's implementation like GLib's GArray. This is a system of clever macros and functions to allow easy access to any type of data in an array. It's Open Source so you can examine its implementation, a mix of macros and clever functions. It has many features like automatic resizing and reference counting. And it is very mature and well tested.
A GArray remembers its type, so it isn't necessary to keep telling it.
GArray *ints = g_array_new(FALSE, FALSE, sizeof(int));
GArray *doubles = g_array_new(FALSE, FALSE, sizeof(double));
int val1 = 23;
double val2 = 42.23;
g_array_append_val(ints, val1);
g_array_append_val(doubles, val2);
The underlying plain C array can be accessed as the data field of the GArray struct. It's typed gchar * so it must be recast.
double *doubles_array = (double *)doubles->data;
printf("%f", doubles_array[0]);
If we continue down your path, the uncertainty about the type infects every "generic" function and you wind up writing parallel implementations anyway.
For example, let's write a function that adds the elements at two indexes together. Something which should be simple.
First, let's do it conventionally.
int add_int(int *array, size_t idx1, size_t idx2) {
return array[idx1] + array[idx2];
}
double add_double(double *array, size_t idx1, size_t idx2) {
return array[idx1] + array[idx2];
}
int main() {
int ints[] = {5, 10, 15, 20};
int value = add_int(ints, 1, 2);
printf("%d\n", value);
}
Taking advantage of token concatenation, we can put a clever macro in front of that to choose the correct function for us.
#define add(a, t, i1, i2) (add_ ## t(a, i1, i2))
int main() {
int ints[] = {5, 10, 15, 20};
int value = add(ints, int, 1, 2);
printf("%d\n", value);
}
The macro is clever, but probably not worth the extra complexity. So long as you're consistent about the naming, the programmer can choose between the _int and _double forms themselves. But it's there if you like.
Now let's see it with "one" function.
// Using an enum gives us some type safety and code clarity.
enum Types { _int, _double };
void *add(void * array, enum Types type, size_t idx1, size_t idx2) {
// Using an enum on a switch, with -Wswitch, will warn us if we miss a type.
switch(type) {
case _int : {
int *sum = malloc(sizeof(int));
*sum = (int *){array}[idx1] + (int *){array}[idx2];
return sum;
};
case _double : {
double *sum = malloc(sizeof(double));
*sum = (double *){array}[idx1] + (double *){array}[idx2];
return sum;
};
};
}
int main() {
int ints[] = {5, 10, 15, 20};
int value = *(int *)add((void *)ints, _int, 1, 2);
printf("%d\n", value);
}
Here we see the infection. We need a return value, but we don't know the type, so we have to return a void pointer. That means we need to allocate memory of the correct type. And we need to access the array with the correct type, more redundancy, more typecasting. And then the caller has to mess with a bunch of typecasting.
What a mess.
We can clean up some of the redundancy with macros.
#define get_idx(a,t,i) ((t *){a}[i])
#define make_var(t) ((t *)malloc(sizeof(t)))
void *add(void * array, enum Types type, size_t idx1, size_t idx2) {
switch(type) {
case _int : {
int *sum = make_var(int);
*sum = get_idx(array, int, idx1) + get_idx(array, int, idx2);
return sum;
};
case _double : {
double *sum = make_var(double);
*sum = get_idx(array, double, idx1) + get_idx(array, double, idx2);
return sum;
};
};
}
You can probably reduce the redundancy with even more macros, like Patrick's answer, but boy is this rapidly turning into macro hell. At a certain point you're no longer coding in C so much as in a rapidly expanding custom language implemented with stacks of macros.
Clifford's very clever idea of using sizes rather than types will not work here. In order to actually do anything with the values we need to know their types.
Once again, I cannot express strongly enough how big of a tar pit polymorphism in C is.
Instead of passing a type identifier, it is sufficient and simpler to pass the size of the object:
void test_function( size_t sizeof_type, void* value, void* array )
{
size_t element_index = 0 ; // for example
memcpy( (char*)array + element_index * sizeof_type, value, sizeof_type ) ;
}
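A slightly more general sketch of the same idea, with the element index as a parameter. The name set_element and its argument order are hypothetical, chosen only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Copy one element of elem_size bytes from *value into array[index].
   The caller supplies the size, so the function never needs the type. */
void set_element(void *array, size_t index, const void *value, size_t elem_size)
{
    memcpy((char *)array + index * elem_size, value, elem_size);
}
```

A caller would write something like set_element(doubles, 2, &d, sizeof d). This only works for operations that treat elements as opaque bytes (copying, moving, swapping); anything arithmetic still needs the real type.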
In order to remain type-agnostic and maintain the flexibility of usage you appear to want, you'll need to move your "main code" into a macro and call it for each case:
typedef enum {
USE_TYPE_INT = 0,
USE_TYPE_DOUBLE = 1,
// ...
} USE_TYPE;
void test_function(USE_TYPE use_type, void * value, void * array) {
#define TEST_FUNCTION_T(type) do { \
type * valueT = value; \
type * arrayT = array; \
/* Main code of the program */ \
arrayT[0] = *valueT; \
/* ... */ \
} while(0)
// Set types to the parameters based on 'use_type'
switch (use_type) {
case USE_TYPE_INT:
TEST_FUNCTION_T(int);
break;
case USE_TYPE_DOUBLE:
TEST_FUNCTION_T(double);
break;
// ...
}
#undef TEST_FUNCTION_T
}
Note that, while you only define the TEST_FUNCTION_T macro once, each usage will result in a duplicate code block differing only by the type pasted into the macro call when the program is compiled.
The direct answer to your question is to do the assignment and dereferencing inside the block in which the pointers are valid:
void test_function(int use_type, void * value, void * array) {
// Set types to the parameters based on 'use_type'
if (use_type == 0) { // Int type
int * valueT = value, *arrayT = array; //the casts in C are unnecessary
arrayT[0] = *valueT;
} else if (use_type == 1) { // Double type
double * valueT = value, *arrayT = array;
arrayT[0] = *valueT;
}
}
but you should probably be doing this inline, without any type<->int translation:
(type*){array}[0] = *(type*){value} //could make it DRY with a macro
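The "DRY with a macro" remark could be sketched as follows. SET_FIRST and set_first_double are hypothetical names, and the sketch uses ordinary casts instead of compound literals, which should behave the same way for void pointers.

```c
/* Hypothetical helper: store *value into array[0], both viewed as `type`.
   Each macro argument is evaluated exactly once. */
#define SET_FIRST(type, array, value) \
    (((type *)(array))[0] = *(const type *)(value))

/* Example of a typed entry point built on the macro. */
void set_first_double(void *array, const void *value)
{
    SET_FIRST(double, array, value);
}
```

The type is still named at the call site, so this removes the repetition but not the need to know the type.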

Is it possible to "typedef"(of sorts) a function prototype?

I have multiple functions that are similar to each other - they take in the same arguments, and return the same type:
double mathFunction_1(const double *values, const size_t array_length);
I already use typedef'd pointers to those functions, as I store them as an array to easily use any number of them on the same data, map them etc.:
typedef double (* MathFunction_ptr )(const double *, const size_t);
double proxy(MathFunction_ptr mathfun_ptr, const double *values, const size_t array_length);
What I want to achieve, is a similar ease-of-use with declaring and defining the functions, as I already have with using pointers to them.
Thus, I was thinking about using a similar typedef to make it easier for me to write the actual functions. I tried doing it like this:
// declaration
typedef double MathFunction (const double *values, const size_t array_length);
MathFunction mathFunction_2;
The following approach works partially. It lets me "save a few keystrokes" in the declaration, however the definition has to be fully typed out.
double mathFunction_2(const double *values, const size_t array_length)
{
// ...
}
What I found by searching more for this issue is this: Can a function prototype typedef be used in function definitions?
However it doesn't provide many alternatives, and only reaffirms that what I tried to do in my other experiments is forbidden according to the Standard. The only alternative it provides is using
#define FUNCTION(name) double name(const double* values, size_t array_length)
which sounds clunky to me (as I'm wary and skeptical of using the preprocessor).
What are the alternatives to what I'm trying to do?
Two other approaches I tried that don't work (and, as I just read, are forbidden and absolutely wrong according to the C standard 6.9.1):
1. This approach doesn't work, as it means that I'm telling it to define a variable mathFunction_2 (I believe that variable is treated as a pointer, though I don't understand this well enough yet) like a function:
MathFunction mathFunction_2
{
// ...
}
2. This approach doesn't work, as it means I'm telling it to create a function which returns a function (unacceptable in the C language):
MathFunction mathFunction_2()
{
// ...
}
You could use a typedef for the signature (see also this):
typedef double MathFunction_ty (const double *, const size_t);
and then declare several functions of the same signature:
MathFunction_ty func1, func2;
or declare some function pointer using that:
MathFunction_ty* funptr;
etc... All this in C11, read n1570.
however the definition has to be fully typed out.
Of course, since you need to give a name to each formal parameter (and such names are not part of the type of the function) in the function's definition. Therefore
double func1(const double*p, const size_t s) {
return (double)s * p[0];
}
and
double func2(const double*arr, const size_t ix) {
return arr[ix];
}
have the same type (the one denoted by MathFunction_ty above), even if their formal parameters (or formal arguments) are named differently.
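Putting the pieces together, a small sketch of what the function-type typedef does and does not save. The function names sum_values and max_value and the table math_table are made up for illustration.

```c
#include <stddef.h>

/* A function *type* (not a pointer type) naming the shared signature. */
typedef double MathFunction(const double *values, size_t array_length);

/* Declarations can use the typedef... */
MathFunction sum_values, max_value;

/* ...but each definition must spell out the parameter list. */
double sum_values(const double *values, size_t array_length)
{
    double total = 0.0;
    for (size_t i = 0; i < array_length; ++i)
        total += values[i];
    return total;
}

double max_value(const double *v, size_t n)   /* different names, same type */
{
    double best = v[0];                       /* assumes n >= 1 */
    for (size_t i = 1; i < n; ++i)
        if (v[i] > best)
            best = v[i];
    return best;
}

/* A table of pointers to functions of that type. */
MathFunction *const math_table[] = { sum_values, max_value };
```

A side benefit of declaring through the typedef: if a definition's signature drifts from MathFunction, the compiler reports conflicting types instead of silently accepting it.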
You might abuse the preprocessor and have an ugly macro to shorten the definition of such functions:
// ugly code:
#define DEFINE_MATH_FUNCTION(Fname,Arg1,Arg2) \
double Fname (const double *Arg1, const size_t Arg2)
DEFINE_MATH_FUNCTION(func1,p,s) { return (double)s * p[0]; }
I find such code confusing and unreadable. I don't recommend coding like that, even if it is certainly possible. But sometimes I do code something similar (for other reasons).
(BTW, imagine if C required every first formal argument to be named $1, every second formal argument to be named $2, etc...; IMHO that would make a much less readable programming language; so a formal parameter's name matters to the human reader, even if systematic names would make the compiler's life simpler)
Read also about λ-calculus, anonymous functions (C doesn't have them but C++ has lambda expressions), closures (they are not C functions, because they have closed values so mix code with data; C++ has std::function-s), callbacks (a necessary convention to "mimic" closures)... Read SICP, it will improve your thinking about C or C++. Look also into that answer.
Unfortunately in C I don't believe there is any way to do what you're asking without using preprocessor macros, and personally at least I agree with your assessment that they are clunky and to be avoided (though this is a matter of opinion and open to debate).
In C++ you could potentially take advantage of auto parameters in lambdas.
The example function signatures you show here really aren't complicated and I wouldn't worry about the perceived duplication. If the signatures were much more complicated, I would view this as a "code smell" that your design could be improved, and I'd focus my efforts there rather than on syntactic methods to shorten the declaration. That just isn't the case here.
Yes, you can. Indeed, that's the purpose of the typedef declaration, to use a type identifier to declare a type of variable. The only thing is that when you use such a declaration in a header file:
typedef int (*callback_ptr)(int, double, char *);
and then you declare something like:
callback_ptr function_to_callback;
it's not clear that you are declaring a function pointer, nor what the number and types of its parameters are; despite this, everything is correct.
Finally, a practical note: when you are unsure about something like this, it is usually far quicker to try a small example with the compiler. If the compiler does what you want without any complaint, you are most probably correct.
#include <stdio.h>
#include <math.h>
typedef double (*ptr_to_mathematical_function)(double);
extern double find_zero(ptr_to_mathematical_function f, double aprox_a, double aprox_b, double epsilon);
int main()
{
#define P(exp) printf(#exp " ==> %lg\n", exp)
P(find_zero(cos, 1.4, 1.6, 0.000001));
P(find_zero(sin, 3.0, 3.2, 0.000001));
P(find_zero(log, 0.9, 1.5, 0.000001));
}
double find_zero(
ptr_to_mathematical_function f,
double a, double b, double eps)
{
double f_a = f(a), f_b = f(b);
double x = a, f_x = f_a;
do {
x = (a*f_b - b*f_a) / (f_b - f_a);
f_x = f(x);
if (fabs(x - a) < fabs(x - b)) {
b = x; f_b = f_x;
} else {
a = x; f_a = f_x;
}
} while(fabs(a-b) >= eps);
return x;
}
As for the second, and main, part of your question: the only way to shorten the definitions is with macros (see how the P macro above repeats the printf(3) calls with similar, but not identical, parameter lists). For the function definitions, the same idea looks like this:
#define MY_EXPECTED_PROTOTYPE(name) double name(double x)
and then, in the definitions, just use:
MY_EXPECTED_PROTOTYPE(my_sin) {
return sin(x);
}
MY_EXPECTED_PROTOTYPE(my_cos) {
return cos(x);
}
MY_EXPECTED_PROTOTYPE(my_tan) {
return tan(x);
}
...
that will expand to:
double my_sin(double x) {
...
double my_cos(double x) {
...
double my_tan(double x) {
...
you can even use it in the header file, like:
MY_EXPECTED_PROTOTYPE(my_sin);
MY_EXPECTED_PROTOTYPE(my_cos);
MY_EXPECTED_PROTOTYPE(my_tan);
As has been pointed out in other answers, other languages (C++) support this and much more, but I think that is out of scope here.

Is it possible to write a function which returns a pointer to a function different from the function in its argument?

I recently stumbled on this curious thought while working on some C code.
I have written a function which returns a double and takes in as argument the pointer to a function and a certain number of parameters, namely
double template_1(double (*)(double),...);
this function correctly identifies a certain property of a real function
double f(double );
represented as a pointer in template_1, in order to make template_1 valid for every real function I might plug in.
Now I had to write another function, let it be:
double derivative(double (*)(double),double);
double derivative(double (*f)(double),double x){
double epsilon = ...;
return ( f(x+epsilon)-f(x-epsilon) )/(2.0*epsilon);
}
again with f in the argument to make it work for every f.
My question is: since I would like to use derivative in template_1 without modifying it, is it possible to write a function which takes derivative and spits out something that has the form of double (*)(double ) ?
My idea was to define typedef double (*real_function)(double);
and then to define
real_function g(double (*derivative)(double (*)(double),double ) )
which I'd like to spit out something like: double derivative_2(double x); so that I could define something like g(derivative) = double (*h)( double); directly in template_1's argument.
Unfortunately, I don't have the faintest idea of how to make this work, or even whether it can work.
There are a couple of ways to do anonymous functions in C. As the comments said, they aren't portable. But depending on the use case you may find this useful: Anonymous functions using GCC statement expressions
A couple of people seem to have had similar issues; I'm not sure how portable these are, but they may be useful resources:
https://github.com/graphitemaster/lambdapp
https://github.com/Leushenko/C99-Lambda
Basically, if there's a way to architect your program in a way that doesn't require anonymous functions, then do it that way. If you have no other option, then I would give one of these a shot.
Warning: I am a C++ developer with little C knowledge so everything that follows is likely unidiomatic C.
As KerrekSB said, you would need to carry some state with your function. This is not possible with raw functions but you can define a struct that carries the state and add a function that works with this struct. This obviously has the drawback of losing the nice function call syntax. I whipped up an example:
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
typedef double (*raw_fptr)(double);
struct real_function;
typedef double (*evaluate_function)(struct real_function*, double);
struct real_function {
evaluate_function evaluate;
};
typedef struct real_function real_function;
double evaluate(real_function *f, double x) {
if(f) {
return f->evaluate(f, x);
}
return NAN;
}
struct raw_real_function {
real_function real_function_base;
raw_fptr raw_function;
};
typedef struct raw_real_function raw_real_function;
double evaluate_raw_real_function(real_function *f_base, double x) {
if(f_base) {
raw_real_function *f = (raw_real_function*)f_base;
return f->raw_function(x);
}
return NAN;
}
raw_real_function make_raw_real_function(raw_fptr function) {
raw_real_function result;
result.raw_function = function;
result.real_function_base.evaluate = evaluate_raw_real_function;
return result;
}
struct derive_real_function {
real_function real_function_base;
real_function *function_to_derive;
};
typedef struct derive_real_function derive_real_function;
double derive(real_function *f_base, double x) {
derive_real_function *f = (derive_real_function*)f_base;
double epsilon = 1e-3;
double upper = evaluate(f->function_to_derive, x+epsilon);
double lower = evaluate(f->function_to_derive, x-epsilon);
double result = (upper - lower)/(2.0*epsilon);
return result;
}
derive_real_function make_derivative(real_function * function_to_derive) {
derive_real_function result;
result.real_function_base.evaluate = derive;
result.function_to_derive = function_to_derive;
return result;
}
double x_cubed(double x) {
return x * x * x;
}
int main(int argc, char **argv) {
raw_real_function x_cubed_wrapped = make_raw_real_function(x_cubed);
derive_real_function derived = make_derivative(&x_cubed_wrapped.real_function_base);
derive_real_function derived_twice = make_derivative(&derived.real_function_base);
if (argc < 2) { // argv[1] is required
fprintf(stderr, "usage: %s x\n", argv[0]);
return 1;
}
double x = atof(argv[1]);
double derivative = evaluate(&derived.real_function_base, x);
double second_derivative = evaluate(&derived_twice.real_function_base, x);
printf("derivative of x^3 at %f = %f\n", x, derivative);
printf("second derivative of x^3 at %f = %f\n", x, second_derivative);
return 0;
}
See (a slight variation, due to input limitations) running here.
How does it work? I faked some inheritance with the structs real_function, raw_real_function and derive_real_function to generate virtual function calls. The struct real_function serves as the container of a virtual function table consisting of only the entry evaluate. This function pointer points to the "derived" structs' relevant evaluate function:
raw_real_function instances point evaluate to evaluate_raw_real_function (as initialized in make_raw_real_function). derive_real_function instances point evaluate to derive (as initialized in make_derivative).
When calling evaluate on the real_function_base member, it will call the associated evaluation function, which casts the real_function* to its associated struct pointer and does what is needed with that information.
Since everything is just a real_function*, we can chain them at will but need to convert "normal" functions into the real_function format, that's what make_raw_real_function does.
If you have a function my_fancy_function:
double my_fancy_function (double x) { return sin(x) + cos(x); }
Then, you can use a helper macro that creates the derived function for you.
#define DEFINE_DERIVATIVE_OF(FUNC) \
double derivative_of_ ## FUNC (double x) { \
return derivative(FUNC, x); \
}
DEFINE_DERIVATIVE_OF(my_fancy_function)
You then pass this newly defined function to your template.
template_1(derivative_of_my_fancy_function, x, y, z);
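A self-contained sketch of the whole chain. The step size 1e-6 is an assumption (the question leaves epsilon unspecified), and apply_at is a hypothetical stand-in for template_1, whose real parameter list is not shown.

```c
/* Central-difference derivative; the step size here is an assumption. */
double derivative(double (*f)(double), double x)
{
    const double epsilon = 1e-6;
    return (f(x + epsilon) - f(x - epsilon)) / (2.0 * epsilon);
}

/* Generates a plain double(double) wrapper around derivative(FUNC, x). */
#define DEFINE_DERIVATIVE_OF(FUNC)              \
    double derivative_of_ ## FUNC (double x) {  \
        return derivative(FUNC, x);             \
    }

double cube(double x) { return x * x * x; }
DEFINE_DERIVATIVE_OF(cube)   /* defines derivative_of_cube */

/* Hypothetical stand-in for template_1: accepts any double (*)(double). */
double apply_at(double (*f)(double), double x)
{
    return f(x);
}
```

The limitation is that each wrapper is fixed at compile time: the macro has to be invoked once, at file scope, for every function you want to differentiate, so it cannot wrap a function pointer that is only known at run time.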

Casting Structs With Void Pointers into Structs With Typed Pointers

Short version:
Suppose I have two structs:
struct charPtrWithLen
{
size_t len;
char * charPtr;
};
struct voidPtrWithLen
{
size_t len;
void * voidPtr;
};
Is there a way to cast voidPtrWithLen into charPtrWithLen and vice-versa, or even better, implicitly convert one into the other, much the same way that a char * and a void * can be readily cast and implicitly converted between each other?
Put another way:
I am trying to write all my C so that all pointers to arrays bring their size information with them. I am also trying to write generic functions using void pointers where applicable to keep operations which are essentially identical, well, identical. I am looking for a way to pass the typed-pointer-containing 'sized-array' structs into the generic functions taking void-pointer-containing 'sized-array' arguments.
Long version, with involved example:
So, void pointers are wonderfully flexible, so I can do this:
int foo(void * ptr, size_t dataLen);
/* ... */
char * c;
size_t c_n;
/* ... */
foo(c, c_n);
/* ... */
int * i;
size_t i_n;
/* ... */
foo(i, i_n);
But since the pattern of "pointer to arbitrary length array, plus size there-of" is so common, suppose at some point I get tired of specifying my various functions in terms of pairs of arguments, pointer and length, and instead I start to code with such pairs encapsulated in a struct instead:
typedef struct
{
size_t v_n;
void * v;
}
pointerWithSize;
/* ... */
int foo(pointerWithSize);
So far so good. I can always assign my "char * c" or "int * i" into the pointerWithSize's "void * v" with minimal difficulty. But when you do this long enough, using the same pattern, you run into the following problem: Soon enough you have a bunch of general functions which work with the data agnostically, and are thus happy to take void pointers, for example things like:
pointerWithSize combinePointersWithSize(pointerWithSize p1, pointerWithSize p2);
int readFromStream(FILE * readFromHere, pointerWithSize * readIntoHere);
But you also end up with functions which are inherently intended for specific data types:
size_t countOccurancesOfChar(pointerWithSize str, char c);
int summate(pointerWithSize integers);
And then you end up with the annoyance of having to do casts inside the latter category of functions. E.g. you end up with stuff like this:
/* This inside countOccurancesOfChar */
if(((char * )str.v)[i] == c) {
/* ..or this inside summate: */
sum += ((int * )integers.v)[i];
So you get to a point where you have a lot of functions which operate specifically on "strings with size", and in all of those cases, you don't want to have to muck around with void pointers. So instead, in those cases you start doing stuff like this:
typedef struct
{
size_t v_n;
char * v;
}
stringWithSize;
/* ... */
size_t countOccurancesOfChar(stringWithSize str, char c);
int parseFormatting(stringWithSize str, struct someFormat_t foo);
Which is great, because now all the string related code doesn't need to be cluttered with casts. BUT, now I can't use my wonderful generic function combinePointersWithSize to concatenate my strings contained within the stringWithSize, in a way that's as syntactically clean, as I could if I was still writing my functions in terms of two separate arguments for each pointer-and-size pair.
To finish up the illustration:
pointerWithSize combinePointersWithSize(pointerWithSize p1, pointerWithSize p2);
void * combineAlternative(void * p1, size_t p_n1, void * p2, size_t p_n2);
/* ... */
stringWithSize a, b, c;
/* ... */
/* This doesn't work, incompatible types: */
c = combinePointersWithSize(a, b);
/* But this works, because char * can be passed into void * parameter. */
c.v_n = a.v_n + b.v_n;
c.v = combineAlternative(a.v, a.v_n, b.v, b.v_n); /* Works fine. */
Possible Solutions I've Considered:
1: Don't write my functions with those structs as arguments, instead write them with individual pair arguments. But this is a big part of what I want to avoid in the first place - I like the 'cleanness' and clarity of intent that having a size_t and a pointer bundled in one struct represents.
2: Do something like this:
stringWithSize a, b, c;
/* ... */
pointerWithSize d;
d = combinePointersWithSize((pointerWithSize){.v=a.v, .v_n=a.v_n}, (pointerWithSize){.v=b.v, .v_n=b.v_n});
/* and then do either this: */
c.v = d.v;
c.v_n = d.v_n;
foo(c);
/* ..or this: */
foo((stringWithSize){.v=d.v, .v_n=d.v_n});
..but I think most would agree that this is as bad as, or worse than, the original problem of casting within the library functions. On the surface it looks worse, because it offloads the casting burden to the client code instead of library code, which can hopefully be fairly stable after being implemented and tested. On the other hand, if you did keep every function defined in terms of the void * containing pointerWithSize, you could end up forcing casts similar to the ones you're doing inside your own functions elsewhere in client code, and worse, you lose the advantage of the compiler yelling at you, because now the code is carrying everything within the same pointerWithSize struct.
I'm also concerned about how many compilers out there have the ability to optimize the first of the two variants of this solution away (where 'd' serves as merely a temporary result holder).
3: Union-of-pointers. Instead of my prior pointerWithSize example, I would do:
typedef union
{
void * asVoid; /* member names cannot be keywords such as void */
char * asChar;
int * asInt;
/* ...and so on... */
}
rainbowPointer;
typedef struct
{
size_t v_n;
rainbowPointer v;
}
pointerWithSize;
At first glance this is almost good enough. However, I very frequently end up wanting to store arrays of some struct which is specific to the program I'm working on inside this "pointer with size" construct, and in those cases, a predefined union of pointer types would be useless to me, I'd still be right back at this problem.
4: I could write wrapper functions for each permuted pointer type. I could EVEN write function-like macros to define each of these pointer-with-size struct types, which would in the same swoop generate the wrapper functions. For example:
#define pointerWithSizeDef(T, name) \
typedef struct \
{ \
size_t v_n; \
T * v; \
} \
name; \
int foo_ ## name (name p1) \
{ \
/* generic function code defined in macro */ \
/* Or something like this: */ \
return foo((pointerWithSize){.v=p1.v, .v_n=p1.v_n}); \
}
/* Then, stuff like this: */
pointerWithSizeDef(char, stringWithSize)
My intuition is that sooner or later this method would become unwieldy.
5: If there is a mechanism with no performance impact, but which is unappealing otherwise, I could write my generic functions as function-like macros, which in turn invoke the underlying actual function:
int foo_actual(void * v, size_t v_n);
#define foo(p) \
foo_actual((p).v, (p).v_n)
..or even something like this, to replace casting syntax:
#define castToPointerWithSize(p) \
((pointerWithSize){.v=(p).v, .v_n=(p).v_n})
/* ... */
stringWithSize a;
foo(castToPointerWithSize(a));
But as these examples for possible solution #5 show, I can't think of a way to do this that wouldn't quickly become a problem (e.g. if someone placed a function call returning a pointerWithSize in place of 'p' in the above examples, the function would run twice, and that wouldn't be at all obvious from the code).
So I don't think any of the solutions I've thought of are really sufficient for my use case. I'm hoping some of you know of some C syntax or mechanism I could take advantage of here to make it easy to cast/"cast" between two structs which are identical save for the pointer type of one of their members.
Firstly, any kind of "actual" casting isn't going to be allowed per the letter of the standard, because C makes no guarantee at all that all pointers have the same format. A cast from some arbitrary pointer type to a void pointer is allowed to involve a conversion of representation (that gets reversed when you cast it back in order to access the data), including possibly to a different size of pointer or a pointer existing in a separate address space. So a simple reinterpretation of a bit pattern to change pointer type is not safe; void*'s bit pattern isn't guaranteed to mean anything in particular, and the bit patterns of other types aren't guaranteed to be related in any particular way. (How many systems actually take advantage of this, I have no idea.)
Since the explicit conversion between void* and other types has to exist somewhere, using whole-value conversion is probably the safest idea. What you could do is define a macro to quickly and easily generate "cast functions" for you, e.g.:
#define GEN_CAST(NAME, FROM_TYPE, TO_TYPE) \
static inline TO_TYPE NAME(FROM_TYPE from) { \
return (TO_TYPE){ .v=from.v, .v_n=from.v_n }; \
}
GEN_CAST(s_to_v, stringWithSize, pointerWithSize)
GEN_CAST(v_to_s, pointerWithSize, stringWithSize)
...that you can then use in place of the cast operator in expressions:
stringWithSize a, b, c;
pointerWithSize d;
d = combinePointersWithSize(s_to_v(a), s_to_v(b));
foo(v_to_s(d));
A good compiler should recognise that on common platforms the conversion function is an identity operation, and remove it entirely.
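As a hedged, self-contained sketch of the approach above: the two struct shapes below are assumed from the question (a `size_t v_n` plus a pointer `v`), and `sum_bytes` is a hypothetical consumer added only for illustration. Because each conversion is a real function rather than a function-like macro, its argument is evaluated exactly once.

```c
#include <assert.h>
#include <stddef.h>

/* Struct shapes assumed from the question: same fields, different pointer type. */
typedef struct { size_t v_n; void *v; } pointerWithSize;
typedef struct { size_t v_n; char *v; } stringWithSize;

/* Generates a conversion function that copies member by member, so the
 * compiler performs the void* <-> char* conversion on the pointer itself. */
#define GEN_CAST(NAME, FROM_TYPE, TO_TYPE)                      \
    static inline TO_TYPE NAME(FROM_TYPE from) {                \
        return (TO_TYPE){ .v = from.v, .v_n = from.v_n };       \
    }

GEN_CAST(s_to_v, stringWithSize, pointerWithSize)
GEN_CAST(v_to_s, pointerWithSize, stringWithSize)

/* Hypothetical generic consumer: sums the bytes it is handed. */
static inline unsigned sum_bytes(pointerWithSize p) {
    assert(p.v != NULL || p.v_n == 0);
    const unsigned char *bytes = p.v;
    unsigned total = 0;
    for (size_t i = 0; i < p.v_n; ++i)
        total += bytes[i];
    return total;
}
```

A call such as `sum_bytes(s_to_v(make_string()))` would run the hypothetical `make_string()` only once, which is the property the macro-based variants in the question could not offer.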
You should be able to cast one to another by converting one to a pointer, casting it to a pointer of the other type, and dereferencing it. This will work in reverse too.
struct charPtrWithLen
{
size_t len;
char * charPtr;
};
struct voidPtrWithLen
{
size_t len;
void * voidPtr;
};
int main() {
struct charPtrWithLen cpwl = {.len = 6, .charPtr = "Hello"};
struct voidPtrWithLen vpwl = *(struct voidPtrWithLen *)&cpwl;
return 0;
}
Note this will only work as long as the struct layout is the same for both structs.
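One caveat about the snippet above: the two struct types are not compatible types, so reading a `struct charPtrWithLen` object through a `struct voidPtrWithLen *` conflicts with C's effective-type (strict aliasing) rules even when the layouts match. A hedged alternative sketch is to copy the bytes with `memcpy` and let the compiler check the layout assumption. The helper name `to_void_ptr_with_len` is made up for illustration. The byte copy is only meaningful here because C guarantees that `void *` and pointers to character types share a representation; for other pointer types a member-by-member copy would be the safer choice.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct charPtrWithLen { size_t len; char *charPtr; };
struct voidPtrWithLen { size_t len; void *voidPtr; };

/* Compile-time checks that the two layouts really do match. */
_Static_assert(sizeof(struct charPtrWithLen) == sizeof(struct voidPtrWithLen),
               "struct sizes differ");
_Static_assert(offsetof(struct charPtrWithLen, charPtr) ==
               offsetof(struct voidPtrWithLen, voidPtr),
               "pointer members sit at different offsets");

/* Copies the bytes instead of dereferencing a cast pointer. */
static inline struct voidPtrWithLen to_void_ptr_with_len(struct charPtrWithLen in)
{
    struct voidPtrWithLen out;
    memcpy(&out, &in, sizeof out);
    return out;
}
```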

Dynamically creating functions in C

How can I dynamically create a function in C?
I try to summarize my C problem as follows:
I have a matrix and I want to be able to use some function to generate its elements.
function has no arguments
Hence I define the following:
typedef double(function)(unsigned int,unsigned int);
/* writes f(x,y) to each element x,y of the matrix*/
void apply(double ** matrix, function * f);
Now I need to generate constant functions within the code. I thought about creating a nested function and returning its pointer, but GCC manual (which allows nested functions) says:
"If you try to call the nested function through its address after the
containing function has exited, all hell will break loose."
which I would kind of expect from this code...
function * createConstantFunction(const double value){
double function(unsigned int,unsigned int){
return value;
}
return &function;
}
So how can I get it to work?
Thanks!
C is a compiled language. You can't create code at run-time "in C"; there is no specific C support to emit instructions to memory and so on. You can of course try just allocating memory, making sure it's executable, and emit raw machine code there. Then call it from C using a suitable function pointer.
You won't get any help from the language itself though, this is just like generating code and calling it in BASIC on an old 8-bit machine.
You are probably familiar with a programming language that supports closures, aren't you?
Unfortunately, C itself does not support closures like that.
If you insist on closures, you can find libraries that simulate them in C, but most of those libraries are complex and machine-dependent.
Alternatively, you can adopt the C-style closure, provided you can change the signature of double ()(unsigned,unsigned);.
In C, a function itself has no data (or context) other than its parameters and the static variables it can access.
So you must pass the context yourself. Here is an example using an extra parameter:
// first, add one extra parameter in the signature of function.
typedef double(function)(double extra, unsigned int,unsigned int);
// second, add one extra parameter in the signature of apply
void apply(double* matrix,unsigned width,unsigned height, function* f, double extra)
{
for (unsigned y=0; y< height; ++y)
for (unsigned x=0; x< width; ++x)
matrix[ y*width + x ] = f(extra, x, y);
// apply passes extra on to f, in the argument order the typedef declares
}
// third, in constant_function, we could get the context: double extra, and return it
double constant_function(double value, unsigned x,unsigned y) { return value; }
void test(void)
{
double* matrix = get_a_matrix();
// fourth, passing the extra parameter to apply
apply(matrix, w, h, &constant_function, 1212.0);
// the matrix will be filled with 1212.0
}
Is a double extra enough? Yes, but only in this case.
What should we do if more context is required?
In C, the general-purpose parameter is void*; we can pass any context through one void* parameter by passing the address of the context.
Here is another example:
typedef double (function)(void* context, int, int );
void apply(double* matrix, int width,int height,function* f,void* context)
{
for (int y=0; y< height; ++y)
for (int x=0; x< width; ++x)
matrix[ y*width + x ] = f(context, x, y); // passing the context
}
double constant_function(void* context,int x,int y)
{
// this function uses an extra double parameter,
// and context points to its address
double* d = context;
return *d;
}
void test(void)
{
double* matrix = get_a_matrix();
double context = 326.0;
// fill matrix with 326.0
apply( matrix, w, h, &constant_function, &context);
}
A (function, context) pair such as &constant_function, &context is the C-style closure.
Each function (F) that needs a closure must take one context parameter, which is passed on to the closure as its context.
And the caller of F must supply a correct (f, c) pair.
If you can change the signature of function to fit the C-style closure, your code will be simple and machine-independent.
If you can't (because function and apply were not written by you), try to persuade their author to change the code.
If that fails, you have no choice but to use a closure library.
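To illustrate the case where more context is required, here is a minimal, self-contained sketch of the (function, context) pair described above. The names `element_fn`, `struct plane`, `plane_at` and `constant_at` are hypothetical and chosen only for this example; the matrix is assumed to be a flat row-major array, as in the snippets above.

```c
#include <assert.h>
#include <stddef.h>

/* Callback type: the context pointer comes first, then the coordinates. */
typedef double (*element_fn)(void *context, unsigned x, unsigned y);

/* Writes f(context, x, y) into every element of a row-major matrix. */
static void apply(double *matrix, unsigned width, unsigned height,
                  element_fn f, void *context)
{
    assert(matrix != NULL && f != NULL);
    for (unsigned y = 0; y < height; ++y)
        for (unsigned x = 0; x < width; ++x)
            matrix[(size_t)y * width + x] = f(context, x, y);
}

/* Hypothetical context holding more than one value. */
struct plane { double base; double dx; double dy; };

/* Evaluates base + dx*x + dy*y using the values stored in the context. */
static double plane_at(void *context, unsigned x, unsigned y)
{
    const struct plane *p = context;
    return p->base + p->dx * x + p->dy * y;
}

/* A constant function is the special case where the context is one double. */
static double constant_at(void *context, unsigned x, unsigned y)
{
    (void)x; (void)y;
    return *(const double *)context;
}
```

The context object must outlive the call to `apply`; here that is trivially true because the caller owns it, which is exactly the bookkeeping a language with built-in closures would do for you.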
Since you want to generate a function that follows a simple recipe,
this shouldn't be too tricky to do with some inline assembly and
a block of executable/writable memory.
This approach feels a bit hacky so I wouldn't recommend it in production code. Due to the use of inline assembly this solution works only on Intel x86-64 / AMD64, and will need to be translated to work with other architectures.
You might prefer this to other JIT-based solutions as it does not depend on any external library.
If you would like a longer explanation of how the below code works,
leave a comment and I'll add it.
For security reasons, the code page should be marked PROT_READ|PROT_EXEC after a function is generated (see mprotect).
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <sys/mman.h>
int snippet_processor(char *buffer, double value, int action);
enum snippet_actions {
S_CALC_SIZE,
S_COPY,
};
typedef double (*callback_t) (unsigned int, unsigned int);
int main(int argc, char **argv) {
unsigned int pagesize = 4096;
char *codepage = 0;
int snipsz = 0;
callback_t f;
/* allocate some readable, writable and executable memory */
codepage = mmap(codepage,
pagesize,
PROT_READ | PROT_WRITE | PROT_EXEC,
MAP_ANONYMOUS | MAP_PRIVATE,
-1, /* no backing file; some systems require -1 with MAP_ANONYMOUS */
0);
assert(codepage != MAP_FAILED);
// generate one function at `codepage` and call it
snipsz += snippet_processor(codepage, 12.55, S_COPY);
f = (callback_t) (codepage);
printf("result :: %f\n", f(1, 2));
/* ensure the next code address is 8-byte aligned
* - add 7 so that any size that is not already a multiple of 8
* carries over into the next multiple of 8.
* If it doesn't carry over then it was already 8-byte aligned.
* - Next, clear the low three bits left over from the addition
* by masking with the negative of the alignment value
* (see how 2's complement works).
*/
codepage += (snipsz + 7) & -8;
// generate another function at `codepage` and call it
snipsz += snippet_processor(codepage, 16.1234, S_COPY);
f = (callback_t) (codepage);
printf("result :: %f\n", f(1, 2));
}
int snippet_processor(char *buffer, double value, int action) {
static void *snip_start = NULL;
static void *snip_end = NULL;
static void *double_start = NULL;
static int double_offset_start = 0;
static int size;
char *i, *j;
int sz;
char *func_start;
func_start = buffer;
if (snip_start == NULL) {
asm volatile(
// Don't actually execute the dynamic code snippet upon entry
"jmp .snippet_end\n"
/* BEGIN snippet */
".snippet_begin:\n"
"movq .value_start(%%rip), %%rax\n"
"movd %%rax, %%xmm0\n"
"ret\n"
/* this is where we store the value returned by this function */
".value_start:\n"
".double 1.34\n"
".snippet_end:\n"
/* END snippet */
"leaq .snippet_begin(%%rip), %0\n"
"leaq .snippet_end(%%rip), %1\n"
"leaq .value_start(%%rip), %2\n"
:
"=r"(snip_start),
"=r"(snip_end),
"=r"(double_start)
);
double_offset_start = (double_start - snip_start);
size = (snip_end - snip_start);
}
if (action == S_COPY) {
/* copy the snippet value */
i = snip_start;
while (i != snip_end) *(buffer++) = *(i++);
/* copy the float value */
sz = sizeof(double);
i = func_start + double_offset_start;
j = (char *) &value;
while (sz--) *(i++) = *(j++);
}
return size;
}
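For the mprotect step mentioned before the listing above, here is a minimal sketch, assuming a POSIX system. The helper names `alloc_code_page` and `seal_code_page` are hypothetical. The idea is to map the page writable but not executable, fill it, and only then switch it to read and execute, so that it is never writable and executable at the same time. Some hardened systems may refuse the permission change, so the return value should be checked.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc in strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Maps `size` bytes of private memory that is writable but not yet
 * executable. Returns NULL on failure. */
static void *alloc_code_page(size_t size)
{
    void *page = mmap(NULL, size, PROT_READ | PROT_WRITE,
                      MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
    return page == MAP_FAILED ? NULL : page;
}

/* Drops write permission and adds execute permission.
 * Returns 0 on success and -1 on failure, like mprotect itself. */
static int seal_code_page(void *page, size_t size)
{
    return mprotect(page, size, PROT_READ | PROT_EXEC);
}
```

In the listing above, this would mean generating every snippet first and calling the sealing step once before invoking any of the generated functions.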
Using FFCALL, which handles the platform-specific trickery to make this work:
#include <stdio.h>
#include <stdarg.h>
#include <callback.h>
static double internalDoubleFunction(const double value, ...) {
return value;
}
double (*constDoubleFunction(const double value))() {
return alloc_callback(&internalDoubleFunction, value);
}
int main(void) {
double (*fn)(unsigned int, unsigned int) = constDoubleFunction(5.0);
printf("%g\n", (*fn)(3, 4));
free_callback(fn);
return 0;
}
(Untested since I don't have FFCALL currently installed, but I remember that it works something like this.)
One way of doing it would be to write a standard C file with the set of functions you want, compile it via gcc, and then load it as a dynamic library to get pointers to the functions.
Ultimately, it probably would be better if you were able to specify your functions without having to define them on-the-fly (like via having a generic template function that takes arguments that define its specific behavior).
If you want to write code on the fly for execution, nanojit might be a good way to go.
In your code above, you're trying to create a closure. C doesn't support that. There are some heinous ways to fake it, but out of the box you're not going to be able to runtime bind a variable into your function.
As unwind already mentioned, "creating code at runtime" is not supported by the language and will be a lot of work.
I haven't used it myself, but one of my co-workers swears by Lua, an "embedded language". There is a Lua C API which will (theoretically, at least) allow you to perform dynamic (scripted) operations.
Of course, the downside would be that the end user may need some sort of training in Lua.
It may be a dumb question, but why does the function have to be generated within your application? Similarly what advantage does the end-user get from generating the function themselves (as opposed to selecting from one or more predefined functions that you provide)?
This mechanism is called reflection: code that modifies its own behavior at runtime. Java provides a reflection API for this job.
But I think this support is not available in C.
The Sun web site says:
Reflection is powerful, but should not be used indiscriminately. If it is possible to perform an operation without using reflection, then it is preferable to avoid using it. The following concerns should be kept in mind when accessing code via reflection.
Drawbacks of Reflection
Performance Overhead: Because reflection involves types that are dynamically resolved, certain Java virtual machine optimizations can not be performed. Consequently, reflective operations have slower performance than their non-reflective counterparts, and should be avoided in sections of code which are called frequently in performance-sensitive applications.
Security Restrictions: Reflection requires a runtime permission which may not be present when running under a security manager. This is in an important consideration for code which has to run in a restricted security context, such as in an Applet.
Exposure of Internals: Since reflection allows code to perform operations that would be illegal in non-reflective code, such as accessing private fields and methods, the use of reflection can result in unexpected side-effects, which may render code dysfunctional and may destroy portability. Reflective code breaks abstractions and therefore may change behavior with upgrades of the platform.
It looks like you're coming from another language where you commonly use this type of code. C doesn't support it, and although you could certainly cook up something to dynamically generate code, it is very likely that this isn't worth the effort.
What you need to do instead is add an extra parameter to the function that references the matrix it is supposed to work on. This is most likely what a language supporting dynamic functions would do internally anyway.
If you really need to dynamically create the functions, maybe an embedded C interpreter could help. I've just googled for "embedded C interpreter" and got Ch as a result:
http://www.softintegration.com/
Never heard of it, so I don't know anything about it, but it seems to be worth a look.

Resources