Variable names in extending Maple code generation - c

I want to extend the Maple CodeGeneration[C] by a handler for the piecewise function (no idea why it is not included).
To this end I did:
with(CodeGeneration):
with(LanguageDefinition):
LanguageDefinition:-Define("NewC", extend="C",
AddFunction("piecewise", anything::numeric,
proc()
local i;
Printer:-Print("if(",_passed[1],"){",_passed[2],"}");
for i from 3 to _npassed-2 by 2 do
Printer:-Print("else if(",_passed[i],"){",_passed[i+1],"}");
end do;
Printer:-Print("else{",_passed[_npassed],"}");
end proc,
numeric=double)
);
Note that I am using if/else statements rather than case statements on purpose.
Here is an example code to translate:
myp:=proc(x::numeric)
piecewise(x>1,1*x,x>2,2*x,x>3,3*x,0);
end proc:
Translate(myp, language="NewC");
The output is
void myp (double x)
{
if(0.1e1 < x){x}else if(0.2e1 < x){0.2e1 * x}else if(0.3e1 < x){0.3e1 * x}else{0};
;
}
For a valid C-routine I obviously need to replace the curly brackets like
{x}
by something like
{result=x;}
and analogously for the other branches. I could do this by hand by modifying the strings in the AddFunction statement above. But then the variable name result is not known to the code generator, so it will not be declared, and its value will not be returned as needed to match the routine myp. The same problem arises in any more complicated procedure in which the result of piecewise is assigned to some other variable or used in computations. So how do I treat this properly in the CodeGeneration routines? That is, how can I get a valid variable name, etc.?

How about something like this?
restart:
with(CodeGeneration):
with(LanguageDefinition):
LanguageDefinition:-Define("NewC", extend="C",
AddFunction("piecewise", anything::numeric,
proc()
local i;
Printer:-Print("( (",_passed[1],") ? ",_passed[2]);
for i from 3 to _npassed-2 by 2 do
Printer:-Print(" : (",_passed[i],") ? ",_passed[i+1]);
end do;
Printer:-Print(" : ",_passed[_npassed]," ) ");
end proc,
numeric=double)
);
myp:=proc(x::numeric) local result::numeric;
result := piecewise(x>3,3*x,x>2,2*x,x>1,1*x,0);
end proc:
Translate(myp, language="NewC");
double myp (double x)
{
double result;
result = ( (0.3e1 < x) ? 0.3e1 * x : (0.2e1 < x) ? 0.2e1 * x : (0.1e1 < x) ? x : 0 ) ;
return(result);
}
[edited, to add the material below]
It turns out that CodeGeneration[C] does handle piecewise, but only if the optimize option is supplied. (I'll submit a bug report, that it should be handled by default.)
restart:
with(CodeGeneration):
with(LanguageDefinition):
myp:=proc(x::numeric) local result::numeric;
result:=piecewise(x>3,3*x,x>2,2*x,x>1,1*x,0);
end proc;
myp := proc(x::numeric)
local result::numeric;
result := piecewise(3 < x, 3*x, 2 < x, 2*x, 1 < x, x, 0)
end proc;
Translate(myp, language="C", optimize);
double myp (double x)
{
double result;
double s1;
if (0.3e1 < x)
s1 = 0.3e1 * x;
else if (0.2e1 < x)
s1 = 0.2e1 * x;
else if (0.1e1 < x)
s1 = x;
else
s1 = 0.0e0;
result = s1;
return(result);
}
As you can see, piecewise is handled above by translation to a separate if(){..} block, with assignment to an introduced temporary variable. That temporary is subsequently used wherever the piecewise call existed in the Maple procedure. And the temporary is declared. Nice and automatic. So that might suffice for your use of piecewise.
You also asked how you could both introduce and declare such temporary variables in your own extensions (if I understand you rightly). Continuing on in the same Maple session from above, here are some ideas along those lines. An unassigned global name is generated. The myp procedure is put into inert form, to which the new local variable is added. And then that altered inert form is turned back into an actual procedure. As an illustration, I used a modified version of your original extension to handle piecewise. This all produces something close to acceptable. The only snag is that the assignment statement,
result = temporary_variable;
is out of place! It lies before the piecewise translation block. I don't yet see how to repair that in the method.
LanguageDefinition:-Define("NewC", extend="C",
AddFunction("piecewise", anything::numeric,
proc()
global T;
local i, t;
t:=convert(T,string);
Printer:-Print(t,";\n");
Printer:-Print(" if (",_passed[1],
")\n { ",t," = ",_passed[2],"; }\n");
for i from 3 to _npassed-2 by 2 do
Printer:-Print(" else if (",_passed[i],")\n { ",
t," = ",_passed[i+1],"; }\n");
end do;
Printer:-Print(" else { ",t," = ",_passed[_npassed],"; }");
end proc,
numeric=double)
):
T:=`tools/genglobal`('s'):
newmyp := FromInert(subsindets(ToInert(eval(myp)),'specfunc(anything,_Inert_LOCALSEQ)',
z->_Inert_LOCALSEQ(op(z),
_Inert_DCOLON(_Inert_NAME(convert(T,string)),
_Inert_NAME("numeric",
_Inert_ATTRIBUTE(_Inert_NAME("protected",
_Inert_ATTRIBUTE(_Inert_NAME("protected")
))))))));
newmyp := proc(x::numeric)
local result::numeric, s::numeric;
result := piecewise(3 < x, 3*x, 2 < x, 2*x, 1 < x, x, 0)
end proc;
Translate(newmyp, language="NewC");
double newmyp (double x)
{
double result;
double s;
result = s;
if (0.3e1 < x)
{ s = 0.3e1 * x; }
else if (0.2e1 < x)
{ s = 0.2e1 * x; }
else if (0.1e1 < x)
{ s = x; }
else { s = 0; };
return(result);
}
If you rerun the last three statements above (from the assignment to T, through to the Translate call) then you should see a new temp variable used, such as s0. And then s1 if repeated yet again. And so on.
Perhaps this will give you some more ideas to work with. Cheers.

Related

What is this madness?

I've never seen anything like this; I can't seem to wrap my head around it. What does this code even do? It looks super fancy, and I'm pretty sure this stuff is not described anywhere in my C book. :(
union u;
typedef union u (*funcptr)();
union u {
funcptr f;
int i;
};
typedef union u $;
int main() {
int printf(const char *, ...);
$ fact =
($){.f = ({
$ lambda($ n) {
return ($){.i = n.i == 0 ? 1 : n.i * fact.f(($){.i = n.i - 1}).i};
}
lambda;
})};
$ make_adder = ($){.f = ({
$ lambda($ n) {
return ($){.f = ({
$ lambda($ x) {
return ($){.i = n.i + x.i};
}
lambda;
})};
}
lambda;
})};
$ add1 = make_adder.f(($){.i = 1});
$ mul3 = ($){.f = ({
$ lambda($ n) { return ($){.i = n.i * 3}; }
lambda;
})};
$ compose = ($){
.f = ({
$ lambda($ f, $ g) {
return ($){.f = ({
$ lambda($ n) {
return ($){.i = f.f(($){.i = g.f(($){.i = n.i}).i}).i};
}
lambda;
})};
}
lambda;
})};
$ mul3add1 = compose.f(mul3, add1);
printf("%d\n", fact.f(($){.i = 5}).i);
printf("%d\n", mul3.f(($){.i = add1.f(($){.i = 10}).i}).i);
printf("%d\n", mul3add1.f(($){.i = 10}).i);
return 0;
}
This example primarily builds on two GCC extensions: nested functions, and statement expressions.
The nested function extension allows you to define a function within the body of another function. Regular block scoping rules apply, so the nested function has access to the local variables of the outer function when it is called:
int outer(int x) {
int inner(int y) {
return x + y;
}
return inner(6);
}
...
int z = outer(4); // z == 10
The statement expression extension allows you to wrap up a C block statement (any code you would normally be able to place within braces: variable declarations, for loops, etc.) for use in a value-producing context. It looks like a block statement in parentheses:
int foo(int x) {
return 5 + ({
int y = 0;
while (y < 10) ++y;
x + y;
});
}
...
int z = foo(6); // z == 21
The last statement in the wrapped block provides the value. So it works pretty much like you might imagine an inlined function body.
These two extensions used in combination let you define a function body with access to the variables of the surrounding scope, and use it immediately in an expression, creating a kind of basic lambda expression. Since a statement expression can contain any statement, and a nested function definition is a statement, and a function's name is a value, a statement expression can define a function and immediately return a pointer to that function to the surrounding expression:
int foo(int x) {
int (*f)(int) = ({ // statement expression
int nested(int y) { // statement 1: function definition
return x + y;
}
nested; // statement 2 (value-producing): function name
}); // f == nested
return f(6); // return nested(6) == return x + 6
}
The code in the example is dressing this up further by using the dollar sign as a shortened identifier for a return type (another GCC extension, much less important to the functionality of the example). lambda in the example isn't a keyword or macro (but the dollar is supposed to make it look like one), it's just the name of the function (reused several times) being defined within the statement expression's scope. C's rules of scope nesting mean it's perfectly OK to reuse the same name within a deeper scope (nested "lambdas"), especially when there's no expectation of the body code using the name for any other purpose (lambdas are normally anonymous, so the functions aren't expected to "know" that they're actually called lambda).
If you read the GCC documentation for nested functions, you'll see that this technique is quite limited, though. Nested functions expire when the lifetime of their containing frame ends. That means they can't be returned, and they can't really be stored usefully. They can be passed up by pointer into other functions called from the containing frame that expect a normal function pointer, so they are fairly useful still. But they don't have anywhere near the flexibility of true lambdas, which take ownership (shared or total depends on the language) of the variables they close over, and can be passed in all directions as true values or stored for later use by a completely unrelated part of the program. The syntax is also fairly ungainly, even if you wrap it up in a lot of helper macros.
C will most likely be getting true lambdas in the next version of the language, currently called C2x. You can read more about the proposed form here - it doesn't really look much like this (it copies the anonymous function syntax and semantics found in Objective-C). The functions created this way have lifetimes that can exceed their creating scope; the function bodies are true expressions, without the need for a statement-containing hack; and the functions themselves are truly anonymous, no intermediate names like lambda required.
A C2x version of the above example will most likely look something like this:
#include <stdio.h>
int main(void) {
typedef int (^ F)(int);
__block F fact; // needs to be mutable - block can't copy-capture
// its own variable before initializing it
fact = ^(int n) {
return n == 0 ? 1 : n * fact(n - 1);
};
F (^ make_adder)(int) = ^(int n) {
return _Closure_copy(^(int x) { return n + x; });
};
F add1 = make_adder(1);
F mul3 = ^(int n) { return n * 3; };
F (^ compose)(F, F) = ^(F f, F g) {
return _Closure_copy(^(int n) { return f(g(n)); });
};
F mul3add1 = compose(mul3, add1);
printf("%d\n", fact(5));
printf("%d\n", mul3(add1(10)));
printf("%d\n", mul3add1(10));
_Closure_free(add1);
_Closure_free(mul3add1);
return 0;
}
Much simpler without all that union stuff.
(You can compile and run this modified example in Clang right now - use the -fblocks flag to enable the lambda extension, add #include <Block.h> to the top of the file, and replace _Closure_copy and _Closure_free with Block_copy and Block_release respectively.)

Prevent Frama-C's slicing plugin from changing input code

Given a C file, I want to compute the backward slice for some criteria and compare the slice to the original code. Because I don't want to implement a slicing program from scratch, I've already tried to get used to Frama-C, which seems to help with this task.
However, my problem is that Frama-C's slicing plugin changes the preprocessed input code, which makes it harder to identify which lines of the original also appear in the slice.
Example:
Input file test1.c:
double func1(double param) {
return 2+param;
}
int main() {
int a=3;
double c=4.0;
double d=10.0;
if(a<c)
c=(double)a/4.0;
double res = func1(c);
return 0;
}
Preprocessed file (yielded by frama-c test1.c -print -ocode test1_norm.c):
/* Generated by Frama-C */
double func1(double param)
{
double __retres;
__retres = (double)2 + param;
return __retres;
}
int main(void)
{
int __retres;
int a;
double c;
double d;
double res;
a = 3;
c = 4.0;
d = 10.0;
if ((double)a < c) c = (double)a / 4.0;
res = func1(c);
__retres = 0;
return __retres;
}
Slice (yielded by frama-c -slice-calls func1 test1.c -then-on 'Slicing export' -print):
/* Generated by Frama-C */
double func1_slice_1(double param)
{
double __retres;
__retres = (double)2 + param;
return __retres;
}
void main(void)
{
int a;
double c;
double res;
a = 3;
c = 4.0;
c = (double)a / 4.0;
res = func1_slice_1(c);
return;
}
Note that the signature of main differs and that the name of func1 was changed to func1_slice_1.
Is there a way to suppress that behaviour in order to make the slice and the (preprocessed) original more easily comparable (in terms of a computable diff)?
First, to clarify a simpler question that you don't need answered but that someone searching for the same keywords might: you cannot have the sliced program printed as a selection of the lines of the original program. Most of the differences between the two correspond to lost information; if the information were there, it would be used to print the program that most resembles the original.
What you can do is print Frama-C's representation of the original program, which you are already doing with frama-c test1.c -print -ocode test1_norm.c.
To solve your problem of func1 being renamed to func1_slice_1, you can try playing with option -slicing-level 0:
$ frama-c -slicing-level 0 -slice-calls func1 test1.c -then-on 'Slicing export' -print
...
/* Generated by Frama-C */
double func1(double param)
{
double __retres;
__retres = (double)2 + param;
return __retres;
}
void main(void)
{
int a;
double c;
double res;
a = 3;
c = 4.0;
c = (double)a / 4.0;
res = func1(c);
return;
}
I think this will prevent the slicer from slicing inside func1 at all. The help says:
-slicing-level <n>  set the default level of slicing used to propagate to the calls
    0 : don't slice the called functions
    1 : don't slice the called functions but propagate the marks anyway
    2 : try to use existing slices, create at most one
    3 : most precise slices

Garbage storage, programming in C

I'm new to programming in C. I have a main program of 781 lines that is out of control because garbage values end up stored in vectors. A short part of the main code is shown below, where it calls a subroutine called diff_conv_intermedia1.
diff_conv_intermedia1(&factorteta,&N,ID,DIFF,X1_intermedia,Y1_intermedia,X1C_intermedia,Y1C_intermedia,CU1_intermedia,CV1_intermedia,AW1_intermedia,AE1_intermedia,AS1_intermedia,AN1_intermedia,AP1_intermedia,Q1_intermedia,FXI1,FYI1,FI_intermedia1,1,2,1,1);
int q,w;
for(q=1;q<(*factorteta_Ptr)*2+1;q++)
{
for(w=1;w<(*N_Ptr)+1;w++)
{
printf("%lf\n",AP1_intermedia[q][w]);
}
}
The subroutine is shown below. When I print the results inside the subroutine, everything is OK, but when I print the results outside the subroutine, in the main code, garbage is stored in vectors such as AP1_intermedia. I don't know what could be wrong. I use the same procedure with other subroutines and I don't get any errors.
int diff_conv_intermedia1(int *factorteta_Ptr,
int *N_Ptr,
int ID,
double DIFF,
double X[(*factorteta_Ptr)*2+1][*N_Ptr+1],
double Y[(*factorteta_Ptr)*2+1][*N_Ptr+1],
double XC[(*factorteta_Ptr)*2+2][*N_Ptr+2],
double YC[(*factorteta_Ptr)*2+2][*N_Ptr+2],
double CU[(*factorteta_Ptr)*2+1][*N_Ptr+1],
double CV[(*factorteta_Ptr)*2+1][*N_Ptr+1],
double AW[(*factorteta_Ptr)*2+1][*N_Ptr+1],
double AE[(*factorteta_Ptr)*2+1][*N_Ptr+1],
double AS[(*factorteta_Ptr)*2+1][*N_Ptr+1],
double AN[(*factorteta_Ptr)*2+1][*N_Ptr+1],
double AP[(*factorteta_Ptr)*2+1][*N_Ptr+1],
double Q[(*factorteta_Ptr)*2+1][*N_Ptr+1],
double FX[(*factorteta_Ptr)*2+1][*N_Ptr+1],
double FY[(*factorteta_Ptr)*2+1][*N_Ptr+1],
double FI[(*factorteta_Ptr)*2+1][*N_Ptr+1],
int WBC,int EBC,int SBC,int NBC)
{
int i,j;
double value,* valuePtr;
double AED, AWD, AND, ASD;
double AEC, AWC, ANC, ASC;
valuePtr = &value;
// Diffusive coefficients
for(i=1;i<(*factorteta_Ptr)*2+1;i++)
{
for(j=1;j<*N_Ptr+1;j++)
{
AWD = -DIFF*(Y[i][j-1]-Y[i-1][j-1])/(XC[i][j]-XC[i][j-1]);
AED = -DIFF*(Y[i][j]-Y[i-1][j])/(XC[i][j+1]-XC[i][j]);
AND = -DIFF*(X[i][j]-X[i][j-1])/(YC[i+1][j]-YC[i][j]);
ASD = -DIFF*(X[i-1][j]-X[i-1][j-1])/(YC[i][j]-YC[i-1][j]);
// Convection term
if(ID==2)
{
max1_or_min2(CU[i][j-1],1,&value);
AWC=-*valuePtr;
max1_or_min2(CU[i][j],2,&value);
AEC=*valuePtr;
max1_or_min2(CV[i-1][j],1,&value);
ASC=-*valuePtr;
max1_or_min2(CV[i][j],2,&value);
ANC=*valuePtr;
}
if(ID==1)
{
AWC =-CU[i][j-1]*(1.0-FX[i][j-1]);
AEC =CU[i][j]*FX[i][j];
ASC =-CV[i-1][j]*(1.0-FY[i-1][j]);
ANC =CV[i][j]*FY[i][j];
}
// Set Coefficients matrix
AW[i][j] = AWD+AWC;
AE[i][j] = AED+AEC;
AS[i][j] = ASD+ASC;
AN[i][j] = AND+ANC;
AP[i][j] = -(AE[i][j]+AW[i][j]+AN[i][j]+AS[i][j]);
Q[i][j] = 0.0;
}
}
// West Boundary - Inlet B.C
for(i=1;i<(*factorteta_Ptr)*2+1;i++)
{
if(WBC==1) Q[i][1] = Q[i][1]-AW[i][1]*FI[i][0];
if(WBC==2) AP[i][1] = AP[i][1] + AW[i][1];
AW[i][1] = 0.0;
// East Boundary - (1)Dirichlet (2)ZERO-GRAD Outflow B.C
if(EBC==1) Q[i][*N_Ptr] = Q[i][*N_Ptr] - AE[i][*N_Ptr]*FI[i][*N_Ptr+1];
if(EBC==2) AP[i][*N_Ptr] = AP[i][*N_Ptr] + AE[i][*N_Ptr];
AE[i][*N_Ptr] = 0.0;
}
// South Boundary - (1)Dirichlet (2)ZERO-GRAD
for(j=1;j<*N_Ptr+1;j++)
{
if(SBC==1) Q[1][j] = Q[1][j] - AS[1][j]*FI[0][j];
if(SBC==2) AP[1][j] = AP[1][j] + AS[1][j];
AS[1][j] = 0.0;
// North Boundary - (1)Dirichlet (2)ZERO-GRAD
if(NBC==1) Q[(*factorteta_Ptr)*2][j] = Q[(*factorteta_Ptr)*2][j] - AN[(*factorteta_Ptr)*2][j]*FI[(*factorteta_Ptr)*2+1][j];
if(NBC==2) AP[(*factorteta_Ptr)*2][j] = AP[(*factorteta_Ptr)*2][j] + AN[(*factorteta_Ptr)*2][j];
AN[(*factorteta_Ptr)*2][j] = 0.0;
}
// Print
int l,k;
for(l=1;l<(*factorteta_Ptr)*2+1;l++)
{
for(k=1;k<*N_Ptr+1;k++)
{
printf("%lf %lf %lf %lf\n",AP[l][k],AS[l][k],AN[l][k],FI[l][k]);
}
}
return 0;
}
If anybody wants, I can send all the code, but it has many extensions.
In your function declaration:
double AP[(*factorteta_Ptr)*2+1][*N_Ptr+1]
I don't think this is doing quite what you think it is doing. This declares a variable-length array parameter whose dimensions are computed from the other parameters. Like every array parameter in C, it is adjusted to a pointer to the first row of the array you pass in, so the function does not get its own copy: the values it writes go into the caller's array and are not thrown away on return. What matters is that the array in main() is declared with exactly the same column count, because the compiler uses that count to locate each row. If the dimensions in main() differ from the ones in the parameter list (for example N+2 columns instead of N+1), the function and main() address different memory for the same [i][j], and main() prints what looks like garbage. So I would check the declarations of AP1_intermedia and the other arrays in main() against the parameter list, and look at how this call differs from the subroutines that work. Making the arrays global so that you can see them anywhere would also work, but that hacks things up even worse than this function.

`gdb` gives different results for `ret` and natural return

I have the following function:
edge** graph_out_edges(graph* g, int src) {
int i;
int num_edges = 0;
edge** es = (edge**) malloc(sizeof(edge*) * g->num_edges);
for (i = 0; i < g->num_edges; i++) {
if (src == g->edges[i]->src) {
es[num_edges++] = g->edges[i];
}
}
es[num_edges] = NULL;
return es;
}
I add a breakpoint to the function using b graph_out_edges, run the program using r, and then continue (c) twice (I get a segfault if I continue again). I then n through the function until it moves to the command just after the call to the function
edge** new = graph_out_edges(g, min->dest);
p new[0] and p new[1] give valid edges (the members are populated), and p new[2] gives 0x0, as expected. I then type r to restart the program, again continuing twice, but this time I then type ret (confirming I want to return), type n to execute the assignment, but now when I type p new[0] I get
Cannot access memory at address 0x2
(Just for clarity, p new now says $10 = (edge**) 0x2)
Any suggestions on why there is this discrepancy between the return value when "nexting" through the function manually and forcing a return?
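If the value of es is correct inside the function, then the value of new should be correct after a normal return.
I would suggest the following:
First, check es just before the return statement;
Then, compare the value of new after the call with es.
Note also that forcing a return with ret (gdb's return command) discards the current frame without running the rest of the function. Unless you give it an expression to use as the return value, no return value is set, so new can end up holding whatever was left in the return register.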

Tool for static loop termination detection in C

Is there any "almost-usable" static analysis tool for C (or C-like) programs that can automatically infer loop termination, at least for very simple programs?
I looked around a bit and found several research articles, a few prototypes, and even some tools (such as Frama-C) that try to infer some termination properties from an annotated source code, but I was expecting to find at least one simple tool that you could just give it a C program and it would output: loop #N terminates/does not terminate/unknown.
(I know this is undecidable in the general case, but for some classes of loops semi-algorithms are possible).
I'd also be interested in tools that work for imperative languages other than C, such as Java.
Edit: just an update to my question, I found LoopFrog, built on top of goto-cc, that seems to be in the direction of what I was looking for, however I still didn't have time to actually understand what its output means precisely. Should it be the answer to my question, I'll post an update here.
I don't know if you have read these two blog posts (1, 2), but one of the “simple” tools you are looking for could be a script that, in parallel, launches Frama-C's value analysis twice: once as an ordinary sound abstract interpreter (able to infer that the end of a program is unreachable), and once with its option -obviously-terminates (in which case it can infer that all executions of a program terminate). In both cases you might want to use a timeout. For the analysis with option -obviously-terminates, the timeout is mandatory, because the analysis fails to terminate if the analysed program does not itself terminate.
According to these blog posts I wrote, you should be able to diagnose the following examples, not all of which are entirely trivial:
char x, y;
main()
{
x = input();
y = input();
while (x>0 && y>0)
{
if (input() == 1)
{
x = x - 1;
y = input();
}
else
y = y - 1;
}
}
Terminates
char x, y;
main()
{
x = input();
y = input();
while (x>0 && y>0)
{
// Frama_C_dump_each();
if (input())
{
x = x - 1;
y = x;
}
else
{
x = y - 2;
y = x + 1;
}
}
}
Terminates
char x;
main(){
x = input();
while (x > 0)
{
Frama_C_dump_each();
if (x > 11)
x = x - 12;
else
x = x + 1;
}
}
Terminates
unsigned char u;
int main(){
while (u * u != 17)
{
u = u + 1;
}
return u;
}
Does not terminate.
However, the option -obviously-terminates involved in these examples was not originally designed for this use (it was more of an optimisation for the analysis of a certain kind of program). I did not realise that in some rare cases, when this option was set, the analysis could terminate without the analysed program itself terminating. If you are willing to recompile from sources, this issue would be fixed by setting variable obviously_terminates to true (instead of false) in file state_set.ml. If you have reasons to think that this script is not the solution you are looking for, then don't bother: the issue seems rare, as I said. I only noticed it while trying to determine whether programs terminated in a more competitive setting.

Resources