Prevent Frama-C's slicing plugin from changing input code - c

Given a C file, I want to compute the backward slice for some criteria and compare the slice to the original code. Because I don't want to implement a slicing program from scratch, I have been trying out Frama-C, which seems to help with this task.
However, my problem is that Frama-C's slicing plugin changes the preprocessed input code, which makes it harder to identify which lines of the original also appear in the slice.
Example:
Input file test1.c:
double func1(double param) {
return 2+param;
}
int main() {
int a=3;
double c=4.0;
double d=10.0;
if(a<c)
c=(double)a/4.0;
double res = func1(c);
return 0;
}
Preprocessed file (yielded by frama-c test1.c -print -ocode test1_norm.c):
/* Generated by Frama-C */
double func1(double param)
{
double __retres;
__retres = (double)2 + param;
return __retres;
}
int main(void)
{
int __retres;
int a;
double c;
double d;
double res;
a = 3;
c = 4.0;
d = 10.0;
if ((double)a < c) c = (double)a / 4.0;
res = func1(c);
__retres = 0;
return __retres;
}
Slice (yielded by frama-c -slice-calls func1 test1.c -then-on 'Slicing export' -print):
/* Generated by Frama-C */
double func1_slice_1(double param)
{
double __retres;
__retres = (double)2 + param;
return __retres;
}
void main(void)
{
int a;
double c;
double res;
a = 3;
c = 4.0;
c = (double)a / 4.0;
res = func1_slice_1(c);
return;
}
Note that the signature of main differs and that the name of func1 was changed to func1_slice_1.
Is there a way to suppress that behaviour in order to make the slice and the (preprocessed) original more easily comparable (in terms of a computable diff)?

First, to clarify a simpler question that you don't need answered but that someone searching for the same keywords might: you cannot have the sliced program printed as a selection of the lines of the original program. Most of the differences between the two correspond to lost information; if the information were still there, it would be used to print the most closely resembling program possible.
What you can do is print Frama-C's representation of the original program, which you are already doing with frama-c test1.c -print -ocode test1_norm.c.
To solve your problem of func1 being renamed to func1_slice_1, you can try playing with option -slicing-level 0:
$ frama-c -slicing-level 0 -slice-calls func1 test1.c -then-on 'Slicing export' -print
...
/* Generated by Frama-C */
double func1(double param)
{
double __retres;
__retres = (double)2 + param;
return __retres;
}
void main(void)
{
int a;
double c;
double res;
a = 3;
c = 4.0;
c = (double)a / 4.0;
res = func1(c);
return;
}
I think this will prevent the slicer from slicing inside func1 at all. The help says:
-slicing-level <n> set the default level of slicing used to propagate to the
calls
0 : don't slice the called functions
1 : don't slice the called functions but propagate the
marks anyway
2 : try to use existing slices, create at most one
3 : most precise slices

Related

pass struct of arrays into function

I am trying to pass a struct of 2D arrays and to do calculations on them.
typedef struct{
float X[80][2];
float Y[80][2];
float Z[80][2];
int T[80][2];
int K[80];
} STATS;
void MovingAverage(STATS *stat_array, int last_stat) {
//Average = Average(Prev) + (ValueToAverage/n) - (Average(Prev)/n)
stat_array->X[last_stat][0] = stat_array->X[last_stat][0] +
(stat_array->X[last_stat][1] / stat_array->T[last_stat][0]) -
(stat_array->X[last_stat][0] / stat_array->T[last_stat][0]);
}
calling the function:
MovingAverage(*stat_array, last_stat);
My question is:
How do I access X, Y and Z in a generic way inside the MovingAverage function?
Edit:
void MovingAverage(STATS *stat_array, int last_stat, char *array_idx) {
//Average = Average(Prev) + (ValueToAverage/n) - (Average(Prev)/n)
stat_array->array_idx[last_stat][0] =
stat_array->array_idx[last_stat][0] +
(stat_array->array_idx[last_stat][1] /
stat_array->T[last_stat][0]) -
(stat_array->array_idx[last_stat][0] /
stat_array->T[last_stat][0]);
}
I know it won't work; it is just to demonstrate what I am after.
Somebody here (not me) could probably come up with some preprocessor magic to do what you're asking, but that is a solution I would not pursue. I consider it bad practice, since macros can quickly get hairy and tough to debug. C cannot select a struct member through a name held in a variable, if that makes sense. During the build, one of the first things that runs is the preprocessor, which resolves all your macros and then passes the resulting source code to the compiler. The compiler is not going to do any text substitution for you; it works on the source code it is given.
To achieve what you want, write a function that operates on the array type you want, and call that function once for each of your arrays. I'd change your MovingAverage function to something like this:
void MovingAverage(float arr[80][2], const int T[80][2], int last_stat)
{
arr[last_stat][0] = ... // whatever calculation you want to do here
}
int main(void)
{
STATS stat_array;
int last_stat;
// .. initialize stat_array and last_stat
// now call MovingAverage with each of your 3 arrays
MovingAverage(stat_array.X, stat_array.T, last_stat);
MovingAverage(stat_array.Y, stat_array.T, last_stat);
MovingAverage(stat_array.Z, stat_array.T, last_stat);
...
return 0;
}

What is this madness?

I've never seen anything like this; I can't seem to wrap my head around it. What does this code even do? It looks super fancy, and I'm pretty sure this stuff is not described anywhere in my C book. :(
union u;
typedef union u (*funcptr)();
union u {
funcptr f;
int i;
};
typedef union u $;
int main() {
int printf(const char *, ...);
$ fact =
($){.f = ({
$ lambda($ n) {
return ($){.i = n.i == 0 ? 1 : n.i * fact.f(($){.i = n.i - 1}).i};
}
lambda;
})};
$ make_adder = ($){.f = ({
$ lambda($ n) {
return ($){.f = ({
$ lambda($ x) {
return ($){.i = n.i + x.i};
}
lambda;
})};
}
lambda;
})};
$ add1 = make_adder.f(($){.i = 1});
$ mul3 = ($){.f = ({
$ lambda($ n) { return ($){.i = n.i * 3}; }
lambda;
})};
$ compose = ($){
.f = ({
$ lambda($ f, $ g) {
return ($){.f = ({
$ lambda($ n) {
return ($){.i = f.f(($){.i = g.f(($){.i = n.i}).i}).i};
}
lambda;
})};
}
lambda;
})};
$ mul3add1 = compose.f(mul3, add1);
printf("%d\n", fact.f(($){.i = 5}).i);
printf("%d\n", mul3.f(($){.i = add1.f(($){.i = 10}).i}).i);
printf("%d\n", mul3add1.f(($){.i = 10}).i);
return 0;
}
This example primarily builds on two GCC extensions: nested functions, and statement expressions.
The nested function extension allows you to define a function within the body of another function. Regular block scoping rules apply, so the nested function has access to the local variables of the outer function when it is called:
int outer(int x) {
int inner(int y) {
return x + y;
}
return inner(6);
}
...
int z = outer(4); // z == 10
The statement expression extension allows you to wrap up a C block statement (any code you would normally be able to place within braces: variable declarations, for loops, etc.) for use in a value-producing context. It looks like a block statement in parentheses:
int foo(int x) {
return 5 + ({
int y = 0;
while (y < 10) ++y;
x + y;
});
}
...
int z = foo(6); // z == 21
The last statement in the wrapped block provides the value. So it works pretty much like you might imagine an inlined function body.
These two extensions used in combination let you define a function body with access to the variables of the surrounding scope, and use it immediately in an expression, creating a kind of basic lambda expression. Since a statement expression can contain any statement, and a nested function definition is a statement, and a function's name is a value, a statement expression can define a function and immediately return a pointer to that function to the surrounding expression:
int foo(int x) {
int (*f)(int) = ({ // statement expression
int nested(int y) { // statement 1: function definition
return x + y;
}
nested; // statement 2 (value-producing): function name
}); // f == nested
return f(6); // return nested(6) == return x + 6
}
The code in the example is dressing this up further by using the dollar sign as a shortened identifier for a return type (another GCC extension, much less important to the functionality of the example). lambda in the example isn't a keyword or macro (but the dollar is supposed to make it look like one), it's just the name of the function (reused several times) being defined within the statement expression's scope. C's rules of scope nesting mean it's perfectly OK to reuse the same name within a deeper scope (nested "lambdas"), especially when there's no expectation of the body code using the name for any other purpose (lambdas are normally anonymous, so the functions aren't expected to "know" that they're actually called lambda).
If you read the GCC documentation for nested functions, you'll see that this technique is quite limited, though. Nested functions expire when the lifetime of their containing frame ends. That means they can't be returned, and they can't really be stored usefully. They can be passed up by pointer into other functions called from the containing frame that expect a normal function pointer, so they are fairly useful still. But they don't have anywhere near the flexibility of true lambdas, which take ownership (shared or total depends on the language) of the variables they close over, and can be passed in all directions as true values or stored for later use by a completely unrelated part of the program. The syntax is also fairly ungainly, even if you wrap it up in a lot of helper macros.
C will most likely be getting true lambdas in the next version of the language, currently called C2x. You can read more about the proposed form here - it doesn't really look much like this (it copies the anonymous function syntax and semantics found in Objective-C). The functions created this way have lifetimes that can exceed their creating scope; the function bodies are true expressions, without the need for a statement-containing hack; and the functions themselves are truly anonymous, no intermediate names like lambda required.
A C2x version of the above example will most likely look something like this:
#include <stdio.h>
int main(void) {
typedef int (^ F)(int);
__block F fact; // needs to be mutable - block can't copy-capture
// its own variable before initializing it
fact = ^(int n) {
return n == 0 ? 1 : n * fact(n - 1);
};
F (^ make_adder)(int) = ^(int n) {
return _Closure_copy(^(int x) { return n + x; });
};
F add1 = make_adder(1);
F mul3 = ^(int n) { return n * 3; };
F (^ compose)(F, F) = ^(F f, F g) {
return _Closure_copy(^(int n) { return f(g(n)); });
};
F mul3add1 = compose(mul3, add1);
printf("%d\n", fact(5));
printf("%d\n", mul3(add1(10)));
printf("%d\n", mul3add1(10));
_Closure_free(add1);
_Closure_free(mul3add1);
return 0;
}
Much simpler without all that union stuff.
(You can compile and run this modified example in Clang right now - use the -fblocks flag to enable the lambda extension, add #include <Block.h> to the top of the file, and replace _Closure_copy and _Closure_free with Block_copy and Block_release respectively.)

Passing operator as a parameter in C99

I want to pass an operator as a parameter in C99.
My solution is this:
int add(int l, int r)
{
return l + r;
}
int sub(int l, int r)
{
return l - r;
}
// ... long list of operator functions
int perform(int (*f)(int, int), int left, int right)
{
return f(left, right);
}
int main(void)
{
int a = perform(&add, 3, 2);
}
Is there some other way to do it? I don't want to write a function for every operator.
It could look like this:
int a = perform(something_cool_here, 3, 2);
You could use switch/case, for example:
int perform(char op,int a,int b)
{
switch (op)
{
case '+': return a+b;
case '-': return a-b;
default: return 0;
}
}
But you would still have to write some code for each operator; you don't get anything for free in C.
You can use X Macros. By defining a single macro that contains a table of repeated values in a redefinable macro, you can just redefine the internal macro for the current task and insert a single macro to handle the whole set.
Here is a compact way to do it with single operand floating point builtins as an example. The process is similar for other types.
//add name of each function you want to use here:
#define UNARYFPBUILTINS \
$(acos) $(acosh) $(asin) $(asinh) $(atan) $(atanh) $(cbrt) $(ceil) \
$(cos) $(erf) $(erfc) $(exp) $(exp10) $(exp2) $(expm1) $(fabs) \
$(floor) $(gamma) $(j0) $(j1) $(lgamma) $(log) $(log10) $(log1p) \
$(log2) $(logb) $(pow10) $(round) $(signbit) $(significand) \
$(sin) $(sqrt) $(tan) $(tgamma) $(trunc) $(y0) $(y1)
//now define the $(x) macro for our current use case - defining enums
#define $(x) UFPOP_##x,
enum ufp_enum{ UNARYFPBUILTINS };
#undef $ //undefine the $(x) macro so we can reuse it
//feel free to remove the __builtin_## ... it's just an optimization
double op(enum ufp_enum op, double f){
switch(op){ //now we can use the same macros for our cases
#define $(x) case UFPOP_##x : f = __builtin_##x(f);break;
UNARYFPBUILTINS
#undef $
}
return f;
}
You can continue using it for other stuff as well
///////////EXTRA STUFF/////////
//unused - may be good for mapping the enums to strings
//#define $(x) #x,
//const char *ufp_strings[] = { UNARYFPBUILTINS };
//#undef $
//this uses float instead of double, so adds the ##f to each function
float opf(enum ufp_enum op, float f){
switch(op){
#define $(x) case UFPOP_##x : f = __builtin_##x##f(f);break;
UNARYFPBUILTINS
#undef $
}
return f;
}
//you could do the same thing for long double here
Edit: Note that support for $ in macro names is implementation-dependent. You can call the macro whatever you like.
Edit2: Here is an example with multiple parameters to do arithmetic operators. This one uses computed gotos (a GCC extension) instead of a switch, in case your compiler handles one better than the other.
#define IOPS $(SUB,-) $(MUL,*) $(DIV,/) $(MOD,%) $(ADD,+) $(AND,&) $(OR,|) \
$(XOR,^) $(SR,>>) $(SL,<<)
enum iops_enum {
#define $(x,op) IOPSENUM_##x,
IOPS
IOPSENUM_COUNT
#undef $
};
int opi(int a, enum iops_enum b, int c){
static const char array[] = { //you may get better results with short or int
#define $(x,op) &&x - &&ADD,
IOPS
#undef $
};
if (b >= IOPSENUM_COUNT) return a;
goto *(&&ADD + array[b]);
//else should give a warning here.
#define $(x,op) x: return a op c;
IOPS
#undef $
}

Garbage storage, programming in C

I'm new to programming in C. I have a main program of 781 lines that is out of control because garbage values end up stored in my arrays. A short part of the main code is shown below, where it calls a subroutine called diff_conv_intermedia1.
diff_conv_intermedia1(&factorteta,&N,ID,DIFF,X1_intermedia,Y1_intermedia,X1C_intermedia,Y1C_intermedia,CU1_intermedia,CV1_intermedia,AW1_intermedia,AE1_intermedia,AS1_intermedia,AN1_intermedia,AP1_intermedia,Q1_intermedia,FXI1,FYI1,FI_intermedia1,1,2,1,1);
int q,w;
for(q=1;q<(*factorteta_Ptr)*2+1;q++)
{
for(w=1;w<(*N_Ptr)+1;w++)
{
printf("%lf\n",AP1_intermedia[q][w]);
}
}
The subroutine is shown below. When I print the results inside the subroutine, everything is OK, but when I print them outside the subroutine, in the main code, arrays such as AP1_intermedia contain garbage. I don't know what could be wrong. I use the same procedure with other subroutines and don't get any errors.
int diff_conv_intermedia1(int *factorteta_Ptr,
int *N_Ptr,
int ID,
double DIFF,
double X[(*factorteta_Ptr)*2+1][*N_Ptr+1],
double Y[(*factorteta_Ptr)*2+1][*N_Ptr+1],
double XC[(*factorteta_Ptr)*2+2][*N_Ptr+2],
double YC[(*factorteta_Ptr)*2+2][*N_Ptr+2],
double CU[(*factorteta_Ptr)*2+1][*N_Ptr+1],
double CV[(*factorteta_Ptr)*2+1][*N_Ptr+1],
double AW[(*factorteta_Ptr)*2+1][*N_Ptr+1],
double AE[(*factorteta_Ptr)*2+1][*N_Ptr+1],
double AS[(*factorteta_Ptr)*2+1][*N_Ptr+1],
double AN[(*factorteta_Ptr)*2+1][*N_Ptr+1],
double AP[(*factorteta_Ptr)*2+1][*N_Ptr+1],
double Q[(*factorteta_Ptr)*2+1][*N_Ptr+1],
double FX[(*factorteta_Ptr)*2+1][*N_Ptr+1],
double FY[(*factorteta_Ptr)*2+1][*N_Ptr+1],
double FI[(*factorteta_Ptr)*2+1][*N_Ptr+1],
int WBC,int EBC,int SBC,int NBC)
{
int i,j;
double value,* valuePtr;
double AED, AWD, AND, ASD;
double AEC, AWC, ANC, ASC;
valuePtr = &value;
// Diffusive coefficients
for(i=1;i<(*factorteta_Ptr)*2+1;i++)
{
for(j=1;j<*N_Ptr+1;j++)
{
AWD = -DIFF*(Y[i][j-1]-Y[i-1][j-1])/(XC[i][j]-XC[i][j-1]);
AED = -DIFF*(Y[i][j]-Y[i-1][j])/(XC[i][j+1]-XC[i][j]);
AND = -DIFF*(X[i][j]-X[i][j-1])/(YC[i+1][j]-YC[i][j]);
ASD = -DIFF*(X[i-1][j]-X[i-1][j-1])/(YC[i][j]-YC[i-1][j]);
// Convection term
if(ID==2)
{
max1_or_min2(CU[i][j-1],1,&value);
AWC=-*valuePtr;
max1_or_min2(CU[i][j],2,&value);
AEC=*valuePtr;
max1_or_min2(CV[i-1][j],1,&value);
ASC=-*valuePtr;
max1_or_min2(CV[i][j],2,&value);
ANC=*valuePtr;
}
if(ID==1)
{
AWC =-CU[i][j-1]*(1.0-FX[i][j-1]);
AEC =CU[i][j]*FX[i][j];
ASC =-CV[i-1][j]*(1.0-FY[i-1][j]);
ANC =CV[i][j]*FY[i][j];
}
// Set Coefficients matrix
AW[i][j] = AWD+AWC;
AE[i][j] = AED+AEC;
AS[i][j] = ASD+ASC;
AN[i][j] = AND+ANC;
AP[i][j] = -(AE[i][j]+AW[i][j]+AN[i][j]+AS[i][j]);
Q[i][j] = 0.0;
}
}
// West Boundary - Inlet B.C
for(i=1;i<(*factorteta_Ptr)*2+1;i++)
{
if(WBC==1) Q[i][1] = Q[i][1]-AW[i][1]*FI[i][0];
if(WBC==2) AP[i][1] = AP[i][1] + AW[i][1];
AW[i][1] = 0.0;
// East Boundary - (1)Dirichlet (2)ZERO-GRAD Outflow B.C
if(EBC==1) Q[i][*N_Ptr] = Q[i][*N_Ptr] - AE[i][*N_Ptr]*FI[i][*N_Ptr+1];
if(EBC==2) AP[i][*N_Ptr] = AP[i][*N_Ptr] + AE[i][*N_Ptr];
AE[i][*N_Ptr] = 0.0;
}
// South Boundary - (1)Dirichlet (2)ZERO-GRAD
for(j=1;j<*N_Ptr+1;j++)
{
if(SBC==1) Q[1][j] = Q[1][j] - AS[1][j]*FI[0][j];
if(SBC==2) AP[1][j] = AP[1][j] + AS[1][j];
AS[1][j] = 0.0;
// North Boundary - (1)Dirichlet (2)ZERO-GRAD
if(NBC==1) Q[(*factorteta_Ptr)*2][j] = Q[(*factorteta_Ptr)*2][j] - AN[(*factorteta_Ptr)*2][j]*FI[(*factorteta_Ptr)*2+1][j];
if(NBC==2) AP[(*factorteta_Ptr)*2][j] = AP[(*factorteta_Ptr)*2][j] + AN[(*factorteta_Ptr)*2][j];
AN[(*factorteta_Ptr)*2][j] = 0.0;
}
// Print
int l,k;
for(l=1;l<(*factorteta_Ptr)*2+1;l++)
{
for(k=1;k<*N_Ptr+1;k++)
{
printf("%lf %lf %lf %lf\n",AP[l][k],AS[l][k],AN[l][k],FI[l][k]);
}
}
return 0;
}
If anybody wants, I can send all the code, but it is quite extensive.
In your function declaration:
double AP[(*factorteta_Ptr)*2+1][*N_Ptr+1]
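I don't quite think this is doing what you think it is doing. A parameter declared like this is a variable-length array parameter, and C adjusts every array parameter to a pointer, here double (*AP)[*N_Ptr+1]. So the function does not get its own copy of the array: it writes straight into whatever storage the caller passed, and nothing is thrown away on return. What the declaration does control is the row stride the function uses for indexing. If AP1_intermedia is declared in main() with a different number of columns than *N_Ptr+1, or if N changes between the declaration and the call, then AP[i][j] inside the function and AP1_intermedia[i][j] in main() refer to different elements, which would look exactly like garbage in main(). I would check that every array is declared in main() with the same dimensions that the function signature uses. Also note that FI[(*factorteta_Ptr)*2+1][j] and FI[i][*N_Ptr+1] index one past the bounds declared for FI in the signature. Failing that, you could hack things up even worse than this function and make the arrays global so that you can inspect them from anywhere.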

When is CAMLparamX required?

I am writing an interface to a C-library using external declarations in OCaml. I used ctypes for testing but it involved a 100% overhead for fast calls (measured by a core_bench micro benchmark).
The functions look like this:
/* external _create_var : float -> int -> int -> int -> _npnum = "ocaml_tnp_number_create_var" ;; */
value ocaml_tnp_number_create_var(value v, value nr, value p, value o) {
//CAMLparam4(v, nr, p, o);
const int params = Int_val(p);
const int order = Int_val(o);
const int number = Int_val(nr);
const double value = Double_val(v);
return CTYPES_FROM_PTR(tnp_number_create_variable(value, number, params, order));
}
/* external _delete : _npnum -> unit = "ocaml_tnp_number_delete" ;; */
value ocaml_tnp_number_delete(value num) {
//CAMLparam1(num);
struct tnp_number* n = CTYPES_TO_PTR(num);
tnp_number_delete(n);
return Val_unit;
}
I borrowed the CTYPES_* macros, so I am basically moving pointers around as Int64 values.
#define CTYPES_FROM_PTR(P) caml_copy_int64((intptr_t)P)
#define CTYPES_TO_PTR(I64) ((void *)Int64_val(I64))
#define CTYPES_PTR_PLUS(I64, I) caml_copy_int64(Int64_val(I64) + I)
AFAIK, those values are represented as boxes which are tagged as "custom", which should be left untouched by the GC.
Do I need to uncomment the CAMLparamX macros to notify the GC about my usage or is it legal to omit them?
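According to the comment in byterun/memory.h, your function must start with a CAMLparamN macro listing all of its value parameters. A function that starts with CAMLparamN must then return through CAMLreturn (or CAMLreturn0 for void functions) instead of a plain return.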
