I have a question about C compiler optimization and when/how loops in inline functions are unrolled.
I am developing a numerical code which does something like the example below. Basically, my_for() would compute some kind of stencil and call op() to do something with the data in my_type *arg for each i. Here, my_func() wraps my_for(), creating the argument and passing a function pointer to my_op(), whose job it is to modify the ith double for each of the (arg->n) double arrays arg->dest[j].
typedef struct my_type {
int const n;
double *dest[16];
double const *src[16];
} my_type;
static inline void my_for( void (*op)(my_type *,int), my_type *arg, int N ) {
int i;
for( i=0; i<N; ++i )
op( arg, i );
}
static inline void my_op( my_type *arg, int i ) {
int j;
int const n = arg->n;
for( j=0; j<n; ++j )
arg->dest[j][i] += arg->src[j][i];
}
void my_func( double *dest0, double *dest1, double const *src0, double const *src1, int N ) {
my_type Arg = {
.n = 2,
.dest = { dest0, dest1 },
.src = { src0, src1 }
};
my_for( &my_op, &Arg, N );
}
This works fine. The functions are inlining as they should and the code is (almost) as efficient as having written everything inline in a single function and unrolled the j loop, without any sort of my_type Arg.
Here’s the confusion: if I set int const n = 2; rather than int const n = arg->n; in my_op(), then the code becomes as fast as the unrolled single-function version. So, the question is: why? If everything is being inlined into my_func(), why doesn’t the compiler see that I am literally defining Arg.n = 2? Furthermore, there is no improvement when I explicitly make the bound on the j loop arg->n, which should look just like the speedier int const n = 2; after inlining. I also tried using my_type const everywhere to really signal this const-ness to the compiler, but it just doesn't want to unroll the loop.
In my numerical code, this amounts to about a 15% performance hit. If it matters, there, n=4 and these j loops appear in a couple of conditional branches in an op().
I am compiling with icc (ICC) 12.1.5 20120612. I tried #pragma unroll. Here are my compiler options (did I miss any good ones?):
-O3 -ipo -static -unroll-aggressive -fp-model precise -fp-model source -openmp -std=gnu99 -Wall -Wextra -Wno-unused -Winline -pedantic
Thanks!
Well, obviously the compiler isn't 'smart' enough to propagate the n constant and unroll the for loop. Actually it plays it safe since arg->n can change between instantiation and usage.
In order to have consistent performance across compiler generations and squeeze the maximum out of your code, do the unrolling by hand.
What people like myself do in these situations (performance is king) is rely on macros.
Macros will 'inline' in debug builds (useful) and can be templated (to a point) using macro parameters. Macro parameters which are compile time constants are guaranteed to remain this way.
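A minimal sketch of that macro approach, applied to the loop from the question. The names DEFINE_ADD_OP, add_op_2, add_op_4 and my_func2 are hypothetical and not from the original code. The only point is that the trip count n is a literal at each expansion, so the j loop has a constant bound no matter how well the compiler propagates struct members.

```c
#include <assert.h>

/* Hypothetical macro: stamps out one function per compile-time width n. */
#define DEFINE_ADD_OP(n)                                              \
    static inline void add_op_##n(double *const dest[],               \
                                  double const *const src[], int i)   \
    {                                                                 \
        int j;                                                        \
        for (j = 0; j < (n); ++j) /* bound is a literal here */       \
            dest[j][i] += src[j][i];                                  \
    }

DEFINE_ADD_OP(2)
DEFINE_ADD_OP(4)

/* Wrapper for the n = 2 case from the question. */
void my_func2(double *dest0, double *dest1,
              double const *src0, double const *src1, int N)
{
    double *const dest[2] = { dest0, dest1 };
    double const *const src[2] = { src0, src1 };
    int i;

    assert(N >= 0);
    for (i = 0; i < N; ++i)
        add_op_2(dest, src, i);
}
```

Whether this actually recovers the 15% depends on the compiler and flags, so it is worth measuring rather than assuming.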
It's faster because the program does not have to allocate storage for the variable or load it from memory.
As long as a const value is initialized from a compile-time constant, the compiler can treat it like #define constant 2, but with type checking: the value is simply substituted during compilation.
Could you please choose one of the two tags (C or C++)? It's confusing, because the languages treat const values differently: C treats them like normal variables whose value just can't be changed, while in C++ they may or may not have memory assigned depending on the context (memory is assigned if you take their address or if their value has to be computed at run time).
Source: "Thinking in C++". No exact quote.
I'm new to C and C++, and I've read that at least in C++ it's preferable to use std::array or std::vector when using vectors and arrays, especially when passing these into a function.
In my research I found the following, which makes sense. I suppose using std::vector would fix the problem of indexing outside of the variable's scope.
void foo(int arr[10]) { arr[9] = 0; }
void bar() {
int data[] = {1, 2};
foo(data);
}
The above code is wrong but the compiler thinks everything is fine and
issues no warning about the buffer overrun.
Instead use std::array or std::vector, which have consistent value
semantics and lack any 'special' behavior that produces errors like
the above.
(answer from bames53, thanks btw!)
What I want to code is
float foo(int X, int Y, int l){
// X and Y are arrays of length l
float z[l];
for (int i = 0; i < l; i ++){
z[i] = X[i]+Y[i];
}
return z;
}
int bar(){
int l = 100;
int X[l];
int Y[l];
float z[l];
z = foo(X,Y,l);
return 0;
}
I want this to be coded in C, so my question is: is there a std::vector construct for C? I couldn't find anything on that.
Thanks in advance, also please excuse my coding (I'm green as grass in C and C++)
Standard C has nothing like std::vector or other container structures. All you get is built-in arrays and malloc.
I suppose using std::vector would fix the problem of indexing outside of the variable's scope.
You might think so, but you'd be wrong: Indexing outside of the bounds of a std::vector is just as bad as with a built-in array. The operator[] of std::vector doesn't do any bounds checking either (or at least it is not guaranteed to). If you want your index operations checked, you need to use arr.at(i) instead of arr[i].
Also note that code like
float z[l];
...
return z;
is wrong because there are no array values in C (or C++, for that matter). When you try to get the value of an array, you actually get a pointer to its first element. But that first element (and all other elements, and the whole array) is destroyed when the function returns, so this is a classic dangling-pointer bug (the automatic-storage counterpart of use-after-free): the caller gets a pointer to an object that doesn't exist anymore.
The customary C solution is to have the caller deal with memory allocation and pass an output parameter that the function just writes to:
void foo(float *z, const int *X, const int *Y, int l){
// X and Y are arrays of length l
for (int i = 0; i < l; i ++){
z[i] = X[i]+Y[i];
}
}
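A minimal sketch of the calling side, for the case where the result has to outlive the caller's own stack frame. The helper make_sum is hypothetical; foo is repeated only so the block is self-contained. If the result is only needed locally, a plain local array passed as z works just as well.

```c
#include <assert.h>
#include <stdlib.h>

/* Same shape as foo above: the caller owns z, the function only fills it. */
static void foo(float *z, const int *X, const int *Y, int l)
{
    for (int i = 0; i < l; i++)
        z[i] = (float)(X[i] + Y[i]);
}

/* Hypothetical caller: allocates the output, calls foo, returns the buffer.
 * Returns NULL if the allocation fails; otherwise the caller of make_sum
 * is responsible for free()ing the result. */
float *make_sum(const int *X, const int *Y, int l)
{
    assert(l > 0);
    float *z = malloc((size_t)l * sizeof *z);
    if (z != NULL)
        foo(z, X, Y, l);
    return z;
}
```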
That said, there are some libraries that provide dynamic data structures for C, but they necessarily look and feel very different from C++ and std::vector (e.g. I know about GLib).
Your question might be a sensitive one for some programmers.
Carrying the constructs of one language over into another is often frowned upon, because different languages are built on different design decisions.
C++ and C share a large common subset, to the point that C code can be compiled as C++ without many modifications. However, once you learn to master C++, you will realize that a lot of strange things happen because of how C works.
Back to the point: C++ has a standard library with containers such as std::vector. These containers make use of several C++ features that aren't available in C:
RAII (a destructor runs when the instance goes out of scope), which prevents leaking the allocated memory
Templates, which give type safety so that doubles, floats, classes ... are not mixed
Function overloading, which allows different signatures for the same function (like erase)
Member functions
None of these exist in C, so a similar structure needs several adaptations before it behaves almost the same.
In my experience, most C projects have their own generic version of data structures, often based on void*. It often looks similar to this:
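#include <stdlib.h>

typedef struct Vector
{
    void **data;    /* array of owned element pointers */
    long size;
    long capacity;
} Vector;

Vector *CreateVector(void)
{
    /* calloc zero-fills: data == NULL, size == capacity == 0 */
    return calloc(1, sizeof(Vector));
}

void DestroyVector(Vector *v)
{
    if (!v)
        return;
    /* free each owned element, then the pointer array, then the struct */
    for (long i = 0; i < v->size; ++i)
        free(v->data[i]);
    free(v->data);
    free(v);
}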
// ...
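As an illustration of what typically hides behind the "// ..." in such a hand-rolled container, here is a minimal sketch of an append operation. PtrVector and PtrVectorPush are hypothetical names that mirror the Vector struct above; the doubling growth policy and the initial capacity of 4 are assumptions, not something the answer specifies.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable array of pointers, mirroring the struct above. */
typedef struct PtrVector {
    void **data;
    size_t size;
    size_t capacity;
} PtrVector;

/* Appends item; returns 0 on success, -1 if memory could not be obtained. */
int PtrVectorPush(PtrVector *v, void *item)
{
    assert(v != NULL);
    if (v->size == v->capacity) {
        /* grow geometrically so that repeated pushes stay cheap on average */
        size_t new_cap = v->capacity ? v->capacity * 2 : 4;
        void **p = realloc(v->data, new_cap * sizeof *p);
        if (p == NULL)
            return -1;          /* v is left unchanged */
        v->data = p;
        v->capacity = new_cap;
    }
    v->data[v->size++] = item;
    return 0;
}
```

Unlike std::vector, nothing here knows the element type, so every read needs a cast at the call site.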
Alternatively, you could mix C and C++.
vector.h
typedef struct Vector
{
    void *cppVector;    /* opaque handle to a std::vector on the C++ side */
} Vector;
#ifdef __cplusplus
extern "C" {
#endif
Vector CreateVector(void);
void DestroyVector(Vector v);
#ifdef __cplusplus
}
#endif
vectorimplementation.cpp
#include <cstdlib>
#include <memory>
#include <vector>
#include "vector.h"
struct CDataFree
{
    void operator()(void *ptr) const { std::free(ptr); }
};
using CData = std::unique_ptr<void, CDataFree>;
Vector CreateVector()
{
    Vector v;
    v.cppVector = static_cast<void*>(std::make_unique<std::vector<CData>>().release());
    return v;
}
void DestroyVector(Vector v)
{
    auto cppV = static_cast<std::vector<CData>*>(v.cppVector);
    // Re-wrapping in a unique_ptr deletes the vector (and frees its elements) at scope exit.
    auto freeAsUniquePtr = std::unique_ptr<std::vector<CData>>(cppV);
}
// ...
The closest equivalent of std::array in C is probably a preprocessor macro definition like
#define ARRAY(type,name,length) \
type name[(length)]
This kind of initialization works
int arr[3][4] = { {1,2,3,4}, {1,2,3,4}, {1,2,3,4} } ;
but this one here doesn't
const size_t row_size = 3;
const size_t col_size = 4;
int arr[row_size][col_size] = { {1,2,3,4},{1,2,3,4},{1,2,3,4}};
This code is in C, but after changing the file extension to a C++ one and recompiling, it works fine. Why does it behave this way?
This is a limitation of C, where macros are traditionally used to work around it. In C++, a const integer initialized with a constant is itself a compile-time constant, so the compiler substitutes the value and g++ accepts the code. You are probably compiling it with gcc (I got the same error with gcc, which is as expected).
Actually, having a const variable doesn't mean it is constant at compile time. For example:
void f(int x) {
const int y = x;
int m[y]; // should that work?
}
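It won't work in good old C++03, as the compiler cannot determine y at compile time. C99 allows this through variable-length arrays (which GCC also accepts in C++ as an extension), but a variable-length array cannot be given an initializer list like yours, and that is why the C build rejects your code. It seems all you want is a plain compile-time constant. In C++, a const integer initialized with a constant expression already is one (static is optional here):
static const size_t row_size = 3;
From then on, you can use it at compile time. In C, even a static const object is not a constant expression, so there you need an enum constant or a #define for the array size.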
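For the C side of the question, a minimal sketch of the usual workaround: enumeration constants are integer constant expressions in C, so an array sized with them is an ordinary array and may be initialized. ROW_SIZE, COL_SIZE and arr_sum are hypothetical names used only for this illustration.

```c
#include <assert.h>

/* Enumeration constants are integer constant expressions in C,
 * unlike const-qualified variables. */
enum { ROW_SIZE = 3, COL_SIZE = 4 };

/* An ordinary (non-variable-length) array, so an initializer is allowed. */
static const int arr[ROW_SIZE][COL_SIZE] = {
    {1, 2, 3, 4}, {1, 2, 3, 4}, {1, 2, 3, 4}
};

/* Hypothetical helper: sum of all elements of arr. */
int arr_sum(void)
{
    int sum = 0;
    for (int r = 0; r < ROW_SIZE; ++r)
        for (int c = 0; c < COL_SIZE; ++c)
            sum += arr[r][c];
    assert(sum > 0);
    return sum;
}
```

A #define works the same way; the enum form has the small advantage of being visible to a debugger.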
Is there any way that I can discover the type of a variable automatically in C, either through some mechanism within the program itself, or--more likely--through a pre-compilation script that uses the compiler's passes up to the point where it has parsed the variables and assigned them their types? I'm looking for general suggestions about this. Below is more background about what I need and why.
I would like to change the semantics of the OpenMP reduction clause. At this point, it seems easiest simply to replace the clause in the source code (through a script) with a call to a function, and then I can define the function to implement the reduction semantics I want. For instance, my script would convert this
#pragma omp parallel for reduction(+:x)
into this:
my_reduction(PLUS, &x, sizeof(x));
#pragma omp parallel for
where, earlier, I have (say)
enum reduction_op {PLUS, MINUS, TIMES, AND,
OR, BIT_AND, BIT_OR, BIT_XOR, /* ... */};
And my_reduction has signature
void my_reduction(enum reduction_op op, void * var, size_t size);
Among other things, my_reduction would have to apply the addition operation to the reduction variable as the programmer had originally intended. But my function cannot know how to do this correctly. In particular, although it knows the kind of operation (PLUS), the location of the original variable (var), and the size of the variable's type, it does not know the variable's type itself. In particular, it does not know whether var has an integral or floating-point type. From a low-level POV, the addition operation for those two classes of types is completely different.
If only the nonstandard operator typeof, which GCC supports, would work the way sizeof works--returning some sort of type variable--I could solve this problem easily. But typeof is not really like sizeof: it can only be used, apparently, in l-value declarations.
Now, the compiler obviously does know the type of x before it finishes generating the executable code. This leads me to wonder whether I can somehow leverage GCC's parser, just to get x's type and pass it to my script, and then run GCC again, all the way, to compile my altered source code. It would then be simple enough to declare
enum var_type { INT8, UINT8, INT16, UINT16, /* ,..., */ FLOAT, DOUBLE};
void my_reduction(enum reduction_op op, void * var, enum var_type vtype);
And my_reduction can cast appropriately before dereferencing and applying the operator.
As you can see, I am trying to create a kind of "dispatching" mechanism in C. Why not just use C++ overloading? Because my project constrains me to work with legacy source code written in C. I can alter the code automatically with a script, but I cannot rewrite it into a different language.
Thanks!
C11 _Generic
Not a direct solution, but it does let you achieve the desired result if you are patient enough to spell out every type, as in:
#include <assert.h>
#include <string.h>
#define typename(x) _Generic((x), \
int: "int", \
float: "float", \
default: "other")
int main(void) {
int i;
float f;
void* v;
assert(strcmp(typename(i), "int") == 0);
assert(strcmp(typename(f), "float") == 0);
assert(strcmp(typename(v), "other") == 0);
}
Compile and run with:
gcc -std=c11 a.c
./a.out
A good starting point with tons of types can be found in this answer.
Tested in Ubuntu 17.10, GCC 7.2.0. GCC only added support in 4.9.
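The same idea can produce the type tag the question asks for instead of a string. The following is a sketch only: the VT_ enumerators, the VAR_TYPE and APPLY_AS macros, and the extra val parameter of my_reduction are assumptions made for this illustration and are not part of the question's code.

```c
#include <assert.h>
#include <stddef.h>

enum reduction_op { PLUS, TIMES };
enum var_type { VT_INT, VT_FLOAT, VT_DOUBLE, VT_OTHER };

/* Hypothetical macro: maps the static type of x to a run-time tag. */
#define VAR_TYPE(x) _Generic((x), \
    int:     VT_INT,              \
    float:   VT_FLOAT,            \
    double:  VT_DOUBLE,           \
    default: VT_OTHER)

/* Applies "*acc op= *val" after casting both pointers to T. */
#define APPLY_AS(T, op, acc, val)                               \
    ((op) == PLUS ? (void)(*(T *)(acc) += *(const T *)(val))    \
                  : (void)(*(T *)(acc) *= *(const T *)(val)))

/* Returns 0 on success, -1 for an operator or type it does not handle. */
int my_reduction(enum reduction_op op, void *acc, const void *val,
                 enum var_type vtype)
{
    assert(acc != NULL && val != NULL);
    if (op != PLUS && op != TIMES)
        return -1;
    switch (vtype) {
    case VT_INT:    APPLY_AS(int, op, acc, val);    return 0;
    case VT_FLOAT:  APPLY_AS(float, op, acc, val);  return 0;
    case VT_DOUBLE: APPLY_AS(double, op, acc, val); return 0;
    default:        return -1;
    }
}
```

A source-rewriting script would then only have to emit my_reduction(PLUS, &x, &y, VAR_TYPE(x)) and let the compiler pick the tag.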
You can use the sizeof operator to guess the type. Let the variable of unknown type be var;
then
if(sizeof(var)==sizeof(char))
printf("char");
else if(sizeof(var)==sizeof(int))
printf("int");
else if(sizeof(var)==sizeof(double))
printf("double");
Though this breaks down whenever two or more primitive types have the same size.
C doesn't really have a way to perform this at pre-compile time, unless you write a flood of macros. I would not recommend the flood of macros approach, it would basically go like this:
void int_reduction (enum reduction_op op, void * var, size_t size);
#define reduction(type,op,var,size) type##_reduction(op, var, size)
...
reduction(int, PLUS, &x, sizeof(x)); // function call
Note that this is very bad practice and should only be used as last resort when maintaining poorly written legacy code, if even then. There is no type safety or other such guarantees with this approach.
A safer approach is to explicitly call int_reduction() from the caller, or to call a generic function which decides the type at run time:
void reduction (enum var_type type, enum reduction_op op, void * var, size_t size)
{
switch(type)
{
case INT_TYPE:
int_reduction(op, var, size);
break;
...
}
}
If int_reduction is inlined and various other optimizations are done, this runtime evaluation isn't necessarily that much slower than the obfuscated macros, but it is far safer.
GCC provides the typeof extension. It is not standard, but common enough (several other compilers, e.g. clang/llvm, have it).
You could perhaps consider customizing GCC for your purposes, either with a plugin or with an extension written in MELT (a domain-specific language for extending GCC). However, this requires understanding some of GCC's internal representations (Gimple, Tree), which are complex (so it will take you days of work at least).
But types are a compile-only thing in C. They are not reified.
In general it is not possible to identify what kind of data is in a given byte or sequence of bytes. For example, the 0 byte could be an empty string or the integer 0. The bit pattern for 99 could be that number, or the letter 'c'.
The following is a bit of hackery to turn an arbitrary sequence of bytes into a printable value. It works in most cases (but not for numbers that could also be characters). It is for the lcc compiler under Windows 7, with 32-bit ints, longs and 64-bit doubles.
char* OclAnyToString(void* x)
{ char* ss = (char*) x;
int ind = 0;
int* ix = (int*) x;
long* lx = (long*) x;
double* dx = (double*) x;
char* sbufi = (char*) calloc(21, sizeof(char));
char* sbufl = (char*) calloc(21, sizeof(char));
char* sbufd = (char*) calloc(21, sizeof(char));
if (ss[0] == '\0')
{ sprintf(sbufi, "%d", *ix);
sprintf(sbufd, "%f", *dx);
if (strcmp(sbufi,"0") == 0 &&
strcmp(sbufd,"0.000000") == 0)
{ return "0"; }
else if (strcmp(sbufd,"0.000000") != 0)
{ return sbufd; }
else
{ return sbufi; }
}
while (isprint(ss[ind]) && 0 < ss[ind] && ss[ind] < 128 && ind < 1024)
{ /* printf("%d\n", ss[ind]); */
ind++;
}
if (ss[ind] == '\0')
{ return (char*) x; }
sprintf(sbufi, "%d", *ix);
sprintf(sbufl, "%ld", *lx);
sprintf(sbufd, "%f", *dx);
if (strcmp(sbufd,"0.000000") != 0)
{ free(sbufi);
free(sbufl);
return sbufd;
}
if (strcmp(sbufi,sbufl) == 0)
{ free(sbufd);
free(sbufl);
return sbufi;
}
else
{ free(sbufd);
free(sbufi);
return sbufl;
}
}
Having this code:
typedef volatile int COUNT;
COUNT functionOne( COUNT *number );
int functionTwo( int *number );
I can't get rid of some warnings..
I get this warning 1 at functionOne prototype
[Warning] type qualifiers ignored on
function return type
and I get this warning 2, wherever I call functionTwo with a COUNT pointer argument instead of an int pointer
[Warning] cast discards qualifiers
from pointer target type
Obviously variables/pointers can't be "cast" to volatile/non-volatile, but does every argument have to be declared volatile too? If so, how can I use any library function that is already defined for non-volatile variables?
EDIT: Using gcc -std=c99 -pedantic -Wall -Wshadow -Wpointer-arith -Wcast-qual -Wextra -Wstrict-prototypes -Wmissing-prototypes …
EDIT: Following Jukka Suomela's advice, here is a code sample for warning 2:
typedef volatile int COUNT;
static int functionTwo(int *number) {
return *number + 1;
}
int main(void) {
COUNT count= 10;
count = functionTwo(&count);
return 0;
}
The volatile keyword was designed to be applied to objects that represent storage and not to functions. Returning a volatile int from a function does not make much sense. The return value of a function will not be optimized away (with the possible exception of inlined functions, but that's another case altogether...), and no external actor will be modifying it. When a function returns, it passes a copy of the return value to the calling function. A copy of a volatile object is not itself volatile. Therefore, attempting to return a volatile int will result in a copy, casting it down to a non-volatile int, which is what is triggering your compiler messages. Returning a volatile int* might be useful, but not a volatile int.
Passing an object by value into a function makes a copy of the object, thus using a volatile int as a function parameter necessarily involves a conversion that ignores a qualifier. Passing a volatile by address is perfectly reasonable, but not by value.
According to the C standard, what constitutes an access to a volatile-qualified object is implementation-defined, so YMMV.
Are you using volatile in this way to try to defeat some sort of compiler optimization? If so, there is probably a better way to do it.
Edit:
Taking into account the updates to your question, it appears that you may be able to approach this in a different way. If you are trying to defeat compiler optimizations, why not take the direct approach and simply tell the compiler not to optimize some things? You can use #pragma GCC optimize or __attribute__((optimize)) to give specific optimization parameters for a function. For example, __attribute__((optimize(0))) should disable all optimizations for a given function. That way, you can keep your data types non-volatile and avoid the type problems you are having. If disabling all optimizations is a bit too much, you can also turn individual optimization options on or off with that attribute/pragma.
Edit:
I was able to compile the following code without any warnings or errors:
static int functionTwo(int *number) {
return *number + 1;
}
typedef union {
int i;
volatile int v;
} fancy_int;
int main(void) {
fancy_int count;
count.v = 10;
count.v = functionTwo(&count.i);
return 0;
}
This "technique" (really a hack) probably has some kind of odd side effects, so test it thoroughly before production use. It's most likely no different from directly casting the address to (int*), but it doesn't trigger any warnings.
It's possible that I am way off base here but volatile isn't something normally associated with stack memory region. Therefore I'm not sure if the following prototype really makes much sense.
volatile int functionOne(volatile int number);
I'm not sure how a returned integer can be volatile. What's going to cause the value of EAX to change? The same applies to the integer. Once the value is pushed onto the stack so that it can be passed as a parameter what's going to change its value?
I don't understand why you'd want to have the volatile qualifier on a function return type. The variable that you assign the function's return value to should be typed as a volatile instead.
Try making these changes:
typedef int COUNT_TYPE;
typedef volatile COUNT_TYPE COUNT;
COUNT_TYPE functionOne( COUNT number );
COUNT_TYPE functionTwo( COUNT_TYPE number );
And when calling functionTwo(), explicitly cast the argument:
functionTwo( (COUNT_TYPE)arg );
HTH,
Ashish.
If I compile
typedef volatile int COUNT;
static int functionTwo(int number) {
return number + 1;
}
int main(void) {
COUNT count = 10;
count = functionTwo(count);
return 0;
}
using
gcc -std=c99 -pedantic -Wall -Wshadow -Wpointer-arith -Wcast-qual \
-Wextra -Wstrict-prototypes -Wmissing-prototypes foo.c
I don't get any warnings. I tried gcc 4.0, 4.2, 4.3, and 4.4. Your warningTwo sounds like you are passing pointers, not values, and that's another story...
EDIT:
Your latest example should be written like this; again, no warnings:
typedef volatile int COUNT;
static int functionTwo(COUNT *number) { return *number + 1; }
int main(void) { COUNT count = 10; count = functionTwo(&count); return 0; }
EDIT:
If you can't change functionTwo:
typedef volatile int COUNT;
static int functionTwo(int *number) { return *number + 1; }
int main(void) {
COUNT count= 10;
int countcopy = count;
count = functionTwo(&countcopy);
return 0;
}
Note that any access to a volatile variable is "special". In the first version with functionTwo(COUNT *number), functionTwo knows how to access it properly. In the second version with countcopy, the main function knows how to access it properly when assigning countcopy = count.
It's possible that those who wrote it wanted to be sure that all the operations are atomic, and declared all int variables as volatile (is it a multithreaded application with poor synchronization?), so all the ints in the code are declared as volatile "for consistency".
Or maybe by declaring the function type as volatile they expect to stop the optimizations of the repeated calls for pure functions? An increment of a static variable inside the function would solve it.
However, try to guess their original intention, because this just does not make any sense.
How can I dynamically create a function in C?
I try to summarize my C problem as follows:
I have a matrix and I want to be able to use some function to generate its elements.
function has no arguments
Hence I define the following:
typedef double(function)(unsigned int,unsigned int);
/* writes f(x,y) to each element x,y of the matrix*/
void apply(double ** matrix, function * f);
Now I need to generate constant functions within the code. I thought about creating a nested function and returning its pointer, but GCC manual (which allows nested functions) says:
"If you try to call the nested function through its address after the
containing function has exited, all hell will break loose."
which I would kind of expect from this code...
function * createConstantFunction(const double value){
double function(unsigned int,unsigned int){
return value;
}
return &function;
}
So how can I get it to work?
Thanks!
C is a compiled language. You can't create code at run-time "in C"; there is no specific C support to emit instructions to memory and so on. You can of course try just allocating memory, making sure it's executable, and emit raw machine code there. Then call it from C using a suitable function pointer.
You won't get any help from the language itself though, this is just like generating code and calling it in BASIC on an old 8-bit machine.
You are probably familiar with some programming language that supports closures, aren't you?
Unfortunately, C itself does not support closures like that.
If you insist on closures, you can find libraries that simulate them in C, but most of those libraries are complex and machine-dependent.
Alternatively, you can settle for a C-style closure, provided you are able to change the signature double (unsigned, unsigned).
In C, a function has no data (or context) of its own apart from its parameters and the static variables it can access.
So you must pass the context yourself. Here is an example using an extra parameter:
// first, add one extra parameter in the signature of function.
typedef double(function)(double extra, unsigned int,unsigned int);
// second, add one extra parameter in the signature of apply
void apply(double* matrix,unsigned width,unsigned height, function* f, double extra)
{
for (unsigned y=0; y< height; ++y)
for (unsigned x=0; x< width; ++x)
matrix[ y*width + x ] = f(extra, x, y);
// apply passes extra through to f (extra comes first, as in the typedef)
}
// third, in constant_function, we could get the context: double extra, and return it
double constant_function(double value, unsigned x,unsigned y) { return value; }
void test(void)
{
double* matrix = get_a_matrix();
// fourth, passing the extra parameter to apply
apply(matrix, w, h, &constant_function, 1212.0);
// the matrix will be filled with 1212.0
}
Is a double extra enough? Yes, but only in this case.
What should we do if more context is required?
In C, the general-purpose parameter type is void*: we can pass any context through a single void* parameter by passing the address of the context.
Here is another example:
typedef double (function)(void* context, int, int );
void apply(double* matrix, int width,int height,function* f,void* context)
{
for (int y=0; y< height; ++y)
for (int x=0; x< width; ++x)
matrix[ y*width + x ] = f(context, x, y); // passing the context
}
double constant_function(void* context,int x,int y)
{
// this function uses an extra double parameter
// and context points to its address
double* d = context;
return *d;
}
void test(void)
{
double* matrix = get_a_matrix();
double context = 326.0;
// fill matrix with 326.0
apply( matrix, w, h, &constant_function, &context);
}
A (function, context) pair like &constant_function, &context is the C-style closure.
Each function F that accepts a closure must take a context parameter, which it passes back to the closure's function when calling it.
And the caller of F must supply a matching (f, c) pair.
If you can change the signature of function to fit the C-style closure, your code will be simple and machine-independent.
If you can't (because function and apply were not written by you), try to persuade their author to change the code.
If that fails, you have no choice but to use a closure library.
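Putting the pieces together, here is a minimal self-contained sketch of the (function, context) pair as a single value. The struct closure type and this version of createConstantFunction are hypothetical; they only illustrate how the question's nested-function attempt can be expressed without code generation.

```c
#include <assert.h>
#include <stddef.h>

/* Callback type: the context pointer comes first, then the coordinates. */
typedef double function(void *context, unsigned x, unsigned y);

/* A "closure" is just the function plus the context it should receive. */
struct closure {
    function *fn;
    void *context;
};

/* Writes c.fn(c.context, x, y) to every element of a row-major matrix. */
void apply(double *matrix, unsigned width, unsigned height, struct closure c)
{
    assert(matrix != NULL && c.fn != NULL);
    for (unsigned y = 0; y < height; ++y)
        for (unsigned x = 0; x < width; ++x)
            matrix[(size_t)y * width + x] = c.fn(c.context, x, y);
}

/* Ignores x and y and returns the double that context points to. */
double constant_function(void *context, unsigned x, unsigned y)
{
    (void)x;
    (void)y;
    return *(const double *)context;
}

/* Stand-in for the question's createConstantFunction. The pointed-to value
 * must outlive every call made through the returned closure. */
struct closure createConstantFunction(double *value)
{
    struct closure c = { constant_function, value };
    return c;
}
```

The lifetime rule in the last comment is the price of this approach: C will not keep the captured value alive for you.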
Since you want to generate a function that follows a simple recipe,
this shouldn't be too tricky to do with some inline assembly and
a block of executable/writable memory.
This approach feels a bit hacky so I wouldn't recommend it in production code. Due to the use of inline assembly this solution works only on Intel x86-64 / AMD64, and will need to be translated to work with other architectures.
You might prefer this to other JIT-based solutions as it does not depend on any external library.
If you would like a longer explanation of how the below code works,
leave a comment and I'll add it.
For security reasons, the code page should be marked PROT_READ|PROT_EXEC after a function is generated (see mprotect).
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <sys/mman.h>
int snippet_processor(char *buffer, double value, int action);
enum snippet_actions {
S_CALC_SIZE,
S_COPY,
};
typedef double (*callback_t) (unsigned int, unsigned int);
int main(int argc, char **argv) {
unsigned int pagesize = 4096;
char *codepage = 0;
int snipsz = 0;
callback_t f;
/* allocate some readable, writable and executable memory */
codepage = mmap(codepage,
pagesize,
PROT_READ | PROT_WRITE | PROT_EXEC,
MAP_ANONYMOUS | MAP_PRIVATE,
-1, /* fd: -1 is the portable choice with MAP_ANONYMOUS */
0);
// generate one function at `codepage` and call it
snipsz += snippet_processor(codepage, 12.55, S_COPY);
f = (callback_t) (codepage);
printf("result :: %f\n", f(1, 2));
/* round the next code address up to a multiple of 8 bytes
* - add 7 so that any size that is not already a multiple of 8
* carries into the next 8-byte block;
* if it was already a multiple of 8, nothing carries.
* - next, clear the low three bits by masking with -8
* (in two's complement, -8 is ...11111000).
*/
codepage += (snipsz + 7) & -8;
// generate another function at `codepage` and call it
snipsz += snippet_processor(codepage, 16.1234, S_COPY);
f = (callback_t) (codepage);
printf("result :: %f\n", f(1, 2));
}
int snippet_processor(char *buffer, double value, int action) {
static void *snip_start = NULL;
static void *snip_end = NULL;
static void *double_start = NULL;
static int double_offset_start = 0;
static int size;
char *i, *j;
int sz;
char *func_start;
func_start = buffer;
if (snip_start == NULL) {
asm volatile(
// Don't actually execute the dynamic code snippet upon entry
"jmp .snippet_end\n"
/* BEGIN snippet */
".snippet_begin:\n"
"movq .value_start(%%rip), %%rax\n"
"movd %%rax, %%xmm0\n"
"ret\n"
/* this is where we store the value returned by this function */
".value_start:\n"
".double 1.34\n"
".snippet_end:\n"
/* END snippet */
"leaq .snippet_begin(%%rip), %0\n"
"leaq .snippet_end(%%rip), %1\n"
"leaq .value_start(%%rip), %2\n"
:
"=r"(snip_start),
"=r"(snip_end),
"=r"(double_start)
);
double_offset_start = (double_start - snip_start);
size = (snip_end - snip_start);
}
if (action == S_COPY) {
/* copy the snippet value */
i = snip_start;
while (i != snip_end) *(buffer++) = *(i++);
/* copy the float value */
sz = sizeof(double);
i = func_start + double_offset_start;
j = (char *) &value;
while (sz--) *(i++) = *(j++);
}
return size;
}
Using FFCALL, which handles the platform-specific trickery to make this work:
#include <stdio.h>
#include <stdarg.h>
#include <callback.h>
static double internalDoubleFunction(const double value, ...) {
return value;
}
double (*constDoubleFunction(const double value))() {
return alloc_callback(&internalDoubleFunction, value);
}
main() {
double (*fn)(unsigned int, unsigned int) = constDoubleFunction(5.0);
printf("%g\n", (*fn)(3, 4));
free_callback(fn);
return 0;
}
(Untested since I don't have FFCALL currently installed, but I remember that it works something like this.)
One way of doing this would be to write a standard C file with the set of functions you want, compile it via gcc, and then load it as a dynamic library to get pointers to the functions.
Ultimately, it probably would be better if you were able to specify your functions without having to define them on-the-fly (like via having a generic template function that takes arguments that define its specific behavior).
If you want to write code on the fly for execution, nanojit might be a good way to go.
In your code above, you're trying to create a closure. C doesn't support that. There are some heinous ways to fake it, but out of the box you're not going to be able to runtime bind a variable into your function.
As unwind already mentioned, "creating code at runtime" is not supported by the language and will be a lot of work.
I haven't used it myself, but one of my co-workers swears by Lua, an "embedded language". There is a Lua C API which will (theoretically, at least) allow you to perform dynamic (scripted) operations.
Of course, the downside would be that the end user may need some sort of training in Lua.
It may be a dumb question, but why does the function have to be generated within your application? Similarly what advantage does the end-user get from generating the function themselves (as opposed to selecting from one or more predefined functions that you provide)?
This mechanism is called reflection, where code inspects or modifies its own behavior at run time. Java provides a reflection API for this job.
But I don't think this support is available in C.
The Sun web site says:
Reflection is powerful, but should not be used indiscriminately. If it is possible to perform an operation without using reflection, then it is preferable to avoid using it. The following concerns should be kept in mind when accessing code via reflection.
Drawbacks of Reflection
Performance Overhead
Because reflection involves types that are dynamically resolved, certain Java virtual machine optimizations can not be performed. Consequently, reflective operations have slower performance than their non-reflective counterparts, and should be avoided in sections of code which are called frequently in performance-sensitive applications.
Security Restrictions
Reflection requires a runtime permission which may not be present when running under a security manager. This is in an important consideration for code which has to run in a restricted security context, such as in an Applet.
Exposure of Internals
Since reflection allows code to perform operations that would be illegal in non-reflective code, such as accessing private fields and methods, the use of reflection can result in unexpected side-effects, which may render code dysfunctional and may destroy portability. Reflective code breaks abstractions and therefore may change behavior with upgrades of the platform.
It looks like you're coming from another language where you commonly use this type of code. C doesn't support it, and although you could certainly cook up something to dynamically generate code, it is very likely that this isn't worth the effort.
What you need to do instead is add an extra parameter to the function that references the matrix it is supposed to work on. This is most likely what a language supporting dynamic functions would do internally anyway.
If you really need to dynamically create the functions, maybe an embedded C interpreter could help. I've just googled for "embedded C interpreter" and got Ch as a result:
http://www.softintegration.com/
Never heard of it, so I don't know anything about it, but it seems to be worth a look.