I am probably misunderstanding some basic concept of how LLVM & passes work; anyhow, here is my question:
I am currently working on a pass where I extend the runOnModule (https://llvm.org/doxygen/classllvm_1_1ModulePass.html) function. I would like to run LoopSimplify on the IR first, but I do not seem to understand how to do that. There is a run(Function &F, FunctionAnalysisManager &AM) function as described on https://llvm.org/doxygen/classllvm_1_1LoopSimplifyPass.html, and as far as I understand it, I can call it on every function in my module. But for that I need an instance of that class (LoopSimplifyPass) to call it on, which I do not know where to get, and also some FunctionAnalysisManager. What are they for and what do they need to look like? It is not like I can just feed it some empty constructs, right?
I want to do this for the following guarantee:
"Loop pre-header insertion guarantees that there is a single, non-critical
entry edge from outside of the loop to the loop header. This simplifies a
number of analyses and transformations, such as LICM." as described in https://llvm.org/doxygen/LoopSimplify_8h_source.html.
While I support the suggestions to integrate your pass with the pass manager, there is nonetheless a way to force LoopSimplify to run by making your pass require it. This is also done in many of the LLVM-provided passes, such as Scalar/LoopVersioningLICM.cpp:
// This header includes LoopSimplifyID as an extern
#include "llvm/Transforms/Utils.h"
...
void YourPass::getAnalysisUsage(AnalysisUsage& AU) const {
AU.addRequiredID(LoopSimplifyID);
}
Doing so will force the pass to be run prior to your pass; there is no need to invoke it yourself. However, if you need to interface with this or another pass, you can request its analysis:
getAnalysis<LoopSimplifyPass>(F); // where F is a Function&
I am working on a C project in Visual Studio, and I have two sets of functions; let's call them SET_1 and SET_2.
I wonder if there is a way to ensure that a function from SET_1 calls only functions from SET_1 and not functions from SET_2.
The simplest solution would be to split the project in two, but I want to avoid any major refactoring. I could probably add some runtime checks, but I want to avoid that too…
So I am wondering: is there something like SAL annotations that I can use to enforce this isolation at compile time?
Here is an example of what I want:
#define SET_1 ...
#define SET_2 ...
SET_1
void Fct1()
{
// ...
}
SET_1
void Fct2()
{
Fct1(); // Ok, both functions have SET_1 tag
}
SET_2
void Fct3()
{
Fct1(); // Compile error, Fct1 has a different tag
}
I don’t want to write some kind of code parser to manually enforce this rule.
I have multiple files, and a single file can contain functions from both sets. The functions don't have any common characteristic; I need to manually specify the set for each function.
The solution can be at compile time or at build time. I just want to make sure that a function from SET_1 will not be called from SET_2.
I can modify the code, and I know that the right solution would be to refactor the project, but I am curious whether there is another solution.
For example, if the code were in C++, I could put all functions from SET_1 inside one namespace and those from SET_2 inside another. But this would not work if we had a class with member functions in different sets.
I am studying and debugging a piece of software. There are thousands of functions in it, and I plan to add a printf() at the entry and exit point of each function. Doing this by hand would take a lot of time.
Is there a tool or script to do this?
I could use '__cyg_profile_func_enter', but it only gives me the function's address, so I would have to run another script to map it to a function name. I would also like to get the values of the function's input parameters.
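For reference, here is a minimal sketch of how the __cyg_profile_func_enter/__cyg_profile_func_exit hooks mentioned above can be used with GCC's -finstrument-functions flag; it only prints raw addresses, which is exactly the limitation described (names can be recovered offline, e.g. with addr2line, but parameter values are not available through this mechanism):
// hooks.c -- the rest of the program is compiled with -finstrument-functions
#include <stdio.h>

// The hooks themselves must not be instrumented, or they would recurse.
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site) {
    printf("enter %p (called from %p)\n", this_fn, call_site);
}

__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *this_fn, void *call_site) {
    printf("exit  %p (called from %p)\n", this_fn, call_site);
}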
You should give AOP (Aspect-Oriented Programming) a try. Personally I've only tried it with Java and Spring AOP, but there's an API for C too: AspectC (https://sites.google.com/a/gapp.msrg.utoronto.ca/aspectc/home). From what I've seen, it's not the only one.
From what I've read about this library, you can add a pointcut before compiling with AspectC:
// before means it's a before function aspect
// call means it's processed when a function is called
// args(...) means it applies to any function with any arguments
// this->funcName is the name of the function handled by AspectC
before(): call(args(...)) {
printf("Entering %s\n", this->funcName);
}
(I have not tried this myself; it is extracted from the reference page https://sites.google.com/a/gapp.msrg.utoronto.ca/aspectc/tutorial)
This is only a basic overview of what can be done, and you still have to deal with the compilation step (documented in the page linked above), but it looks like it could help you. Maybe give it a try with a simple POC.
Say I have an external library that computes the optima, say minima, of a given function. Say its headers give me a function
double *minimizer(ObjFun f);
where the headers define
typedef double (*ObjFun)(double x[]);
and "minimizer" returns the minimum of the function f of, say, a two-dimensional vector x.
Now, I want to use this to minimize a parameterized function. I don't know how to express this in code exactly, but say I am minimizing quadratic forms (just a silly example; I know these have closed-form minima):
double quadraticForm(double x[]) {
    return x[0]*x[0]*q11 + 2*x[0]*x[1]*q12 + x[1]*x[1]*q22;
}
which is parameterized by the constants (q11, q12, q22). I want to write code where the user can input (q11, q12, q22) at runtime, so that I can generate a function to hand to the library as a callback and return the optimum.
What is the recommended way to do this in C?
I am rusty with C, so I am asking about both feasibility and best practices. Really, I am trying to solve this using C/Cython code. I have been using Python bindings to the library so far, and with "inner functions" it was obvious how to do this in Python:
def getFunction(q11, q12, q22):
    def f(x):
        return x[0]*x[0]*q11 + 2*x[0]*x[1]*q12 + x[1]*x[1]*q22
    return f
# now submit getFunction(<user params>) to the library
I am trying to figure out the C construct so that I can be better informed in creating a Cython equivalent.
The header defines the prototype of a function which can be used as a callback. I am assuming that you can't/won't change that header.
If your function has additional parameters, they cannot be filled in by the call.
Your function therefore cannot be called as the callback without risking undefined behaviour or bogus values in those parameters.
The function therefore cannot be given as the callback; not with additional parameters.
The above means you need to drop the idea of "parameterizing" your function.
Your actual goal is to somehow allow the constants/coefficients to be changed during runtime.
Find a different way of doing that. Think of "dynamic configuration" instead of "parameterizing".
I.e. the function does not always expect those values at each call. It just has access to them.
(This suggests the configuration values are less often changed than the function is called, but does not require it.)
How:
I can only think of one simple way, and it is pretty ugly and vulnerable (e.g. to race conditions, concurrent access, reentrancy; you name it, it will hurt you...):
Introduce a set of global variables, or better, one struct variable, for readability. (See the recommendation below for "file-global" instead of "global".)
Set them at runtime to the desired values, using a separate function.
Initialise them to meaningful defaults, in case they never get written.
Read them at the start of the minimizing callback function.
Recommendation: Have everything (the minimizing function, the configuration variable and the function which sets the configuration at runtime) in one code file and make the configuration variable(s) static (i.e. restrict access to that one code file).
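A minimal sketch of that file-global configuration approach, assuming the ObjFun/minimizer declarations from the question; the struct layout, setter name, and default values are illustrative:
/* file-global configuration: set once at runtime, read by the callback */
static struct {
    double q11, q12, q22;
} config = { 1.0, 0.0, 1.0 };   /* meaningful defaults */

/* separate function that sets the configuration at runtime */
void setQuadraticFormConfig(double q11, double q12, double q22) {
    config.q11 = q11;
    config.q12 = q12;
    config.q22 = q22;
}

/* matches the library's ObjFun typedef: double (*)(double x[]) */
double quadraticForm(double x[]) {
    return x[0]*x[0]*config.q11
         + 2*x[0]*x[1]*config.q12
         + x[1]*x[1]*config.q22;
}

/* usage, assuming the library provides double *minimizer(ObjFun f):
 *   setQuadraticFormConfig(userQ11, userQ12, userQ22);
 *   double *xmin = minimizer(quadraticForm);
 */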
Note:
The answer is only the analysis that (and why) you should not try to pass extra parameters.
The proposed method is not considered part of the answer; it is more simple than it is good.
I invite more holistic answers which propose a safer implementation.
I have come across function pointers. I now understand how they work, but I am not quite sure in what situations they should be used. After some searching on Google and Stack Overflow, I came to know that they are used in two cases:
when a callback mechanism is used
to store an array of functions, to call dynamically
In these cases too, why don't we call the function directly? In the callback mechanism, as a particular event occurs, the callback pointer is assigned to that function (address) and then it is called. Can't we call the function directly rather than using the function pointer? Can someone tell me what the exact usage of function pointers is, and in what situations?
Take a look at functions needing a callback, like
bsearch or qsort for the comparator, signal for the handler, or others.
Also, how would you implement other openly extensible mechanisms, like C++-style virtual dispatch (a vtable with function pointers and other things)?
In short, function pointers are used to make a function generic by letting parts of its behavior be user-defined.
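As a concrete illustration of such a callback, here is a minimal sketch of passing a comparator to qsort; the generic sort calls user-defined code through the function pointer:
#include <stdio.h>
#include <stdlib.h>

/* user-defined comparison, called back by qsort through a function pointer */
static int compareInts(const void *a, const void *b) {
    int lhs = *(const int *)a;
    int rhs = *(const int *)b;
    return (lhs > rhs) - (lhs < rhs);
}

int main(void) {
    int values[] = { 42, 7, 19, 3 };
    qsort(values, 4, sizeof values[0], compareInts);  /* comparator passed as a pointer */
    for (int i = 0; i < 4; i++)
        printf("%d\n", values[i]);
    return 0;
}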
One of the situations where function pointers are useful is when you are implementing callback functions.
For example, a server that I've been implementing in C with libevent accepts messages from clients and determines what to do with them. Instead of defining hundreds of switch-case blocks, I store the function pointer of the function to be called in a hash table, so each message can be mapped directly to its respective function.
Event handling in the libevent API (read about event_new()) also demonstrates the usefulness of having function pointers in APIs: users can define their own behaviour for a given situation and need not modify the master function's code, which creates flexibility while maintaining a certain level of abstraction. This design is also widely used in kernel APIs.
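A much-simplified sketch of that dispatch idea, with a plain array standing in for the hash table; the message types and handler names are invented for illustration:
#include <stdio.h>

typedef void (*MessageHandler)(const char *payload);

static void handleLogin(const char *payload)  { printf("login: %s\n", payload); }
static void handleLogout(const char *payload) { printf("logout: %s\n", payload); }

/* the message code indexes directly into a table of handlers */
enum { MSG_LOGIN, MSG_LOGOUT, MSG_COUNT };
static const MessageHandler handlers[MSG_COUNT] = {
    [MSG_LOGIN]  = handleLogin,
    [MSG_LOGOUT] = handleLogout,
};

static void dispatch(int msgType, const char *payload) {
    if (msgType >= 0 && msgType < MSG_COUNT && handlers[msgType])
        handlers[msgType](payload);   /* no switch-case needed */
}

int main(void) {
    dispatch(MSG_LOGIN, "alice");
    dispatch(MSG_LOGOUT, "alice");
    return 0;
}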
You said:
In the call back Mechanism also, as particular events occur, callback pointer is assigned to that function(Address).
Callback functions are registered at a very different place than where the callback functions are called.
A simple example:
In a GUI, the place where you register a function to be called when a button is pressed is your top-level application setup. The place where the function gets called is the implementation of the button. They need to remain separate so that the user of the button has the freedom to decide what happens when it is pressed.
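A toy sketch of that separation, with no real GUI toolkit; the Button struct and the function names are invented for illustration:
#include <stdio.h>

/* the "button": it stores whatever callback was registered and calls it later */
typedef struct {
    void (*onClick)(void);
} Button;

/* the button's implementation: it invokes the callback without knowing what it does */
static void buttonPress(Button *b) {
    if (b->onClick)
        b->onClick();
}

/* the application's own reaction, unknown to the button's author */
static void saveDocument(void) {
    printf("document saved\n");
}

int main(void) {
    Button saveButton = { saveDocument };  /* registration: top-level application setup */
    buttonPress(&saveButton);              /* invocation: inside the button code */
    return 0;
}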
In general, you need a function pointer when the pointer needs to be stored to be used at a future time.
In the case of a callback situation, including interrupt-driven code, a sequence of callbacks or interrupts may occur for a single logical process. Say you have a set of functions like step1(), step2(), ..., which perform some process, and a common callback is used to step through the sequence. The initial call sets the callback to step1(); when step1() is called, it changes the function pointer to step2() and initiates the next step. When that step completes, step2() is called, and it can set the function pointer to step3(), and so on, for however many steps it takes to perform the sequence. I've mostly used this method for interrupt-driven code.
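A minimal sketch of that self-advancing sequence; the interrupt/event source is faked with a simple loop, and all names are illustrative:
#include <stdio.h>

static void step1(void);
static void step2(void);
static void step3(void);

/* the single callback slot that the event/interrupt source invokes */
static void (*onComplete)(void) = step1;

static void step1(void) { printf("step 1\n"); onComplete = step2; }
static void step2(void) { printf("step 2\n"); onComplete = step3; }
static void step3(void) { printf("step 3\n"); onComplete = NULL; }  /* sequence finished */

int main(void) {
    /* stand-in for "an interrupt fires and calls the current callback" */
    while (onComplete)
        onComplete();
    return 0;
}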
Sometimes I use function pointers just to make (as I see it) the code more legible, and easier to change. But this is a matter of taste, there is no one 'correct' way. It's possible that the function pointer code will be slower, but probably only slightly and of course as far as performance goes it's always a matter of measuring, and usually more a matter of choosing better algorithms than of micro-optimisation.
One example is when you have two functions with identical, long argument lists, and sometimes you want to call one and sometimes the other. You could write
if ( condition)
{ one( /* long argument list */);
}
else
{ other( /* long argument list */);
}
or you could write
(condition ? one : other)(/* long argument list */);
I prefer the second as there is only one instance of the long argument list, and so it's easier to get right, and to change.
Another case is implementing state machines; one could write
switch( state)
{ case STATE0: state = state0_fun( input); break;
// etc
}
or
typedef int (*state_f)(void *);
state_f statefs[] = { state0_fun /* etc */ };
state = statefs[state](input);
Again I find the second form more maintainable, but maybe that's just me.
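To make the table-driven form concrete, here is a minimal, self-contained sketch; the states, inputs, and state functions are invented for illustration:
#include <stdio.h>

enum { STATE0, STATE1, STATE_DONE };

/* each state function consumes the input and returns the next state */
typedef int (*state_f)(void *input);

static int state0_fun(void *input) { printf("state0 saw %s\n", (char *)input); return STATE1; }
static int state1_fun(void *input) { printf("state1 saw %s\n", (char *)input); return STATE_DONE; }

static state_f statefs[] = { state0_fun, state1_fun };

int main(void) {
    int state = STATE0;
    char *inputs[] = { "a", "b" };
    for (int i = 0; state != STATE_DONE; i++)
        state = statefs[state](inputs[i]);  /* dispatch through the table */
    return 0;
}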
I only wanted to allow one instance of my C extension class to be made, so I wanted to include the Singleton module.
void Init_mousetest() {
    VALUE mouseclass = rb_define_class("MyMouse", rb_cObject);
    rb_require("singleton");
    VALUE singletonmodule = rb_const_get(rb_cObject, rb_intern("Singleton"));
    rb_include_module(mouseclass, singletonmodule);
    rb_funcall(singletonmodule, rb_intern("included"), 1, mouseclass);
    /* ^ Why do I need this line here? */
    rb_define_method(mouseclass, "run", method_run, 0);
    rb_define_method(mouseclass, "spawn", method_spawn, 0);
    rb_define_method(mouseclass, "stop", method_stop, 0);
}
As I understand it, what that line does is the same as Singleton.included(MyMouse), but if I try to invoke that, I get
irb(main):006:0> Singleton.included(MyMouse)
NoMethodError: private method `included' called for Singleton:Module
from (irb):6
from C:/Ruby19/bin/irb:12:in `<main>'
Why does rb_include_module behave differently than I would expect? Also, any tangential discussion, explanations, or related articles are appreciated. I'm a Ruby beginner.
Also, it seems like I could have just kept my extension as simple as possible and hacked some kind of interface later on to ensure I only allow one instance, or just put my mouse-related methods into a module... Does any of that make sense?
According to http://www.groupsrv.com/computers/about105620.html, rb_include_module() is actually just Module#append_features.
Apparently Module#include calls both Module#append_features and Module#included, so in our C code we must also call included, since clearly something important happens there.