Enforce function isolation at compile time

I am working on a C project in Visual Studio, and I have two sets of functions; let's call them SET_1 and SET_2.
I wonder if there is a way to ensure that a function from SET_1 calls only functions from SET_1 and not functions from SET_2.
The simplest solution would be to split the project in two, but I want to avoid any major refactoring. I could probably add some runtime checks, but I want to avoid that too.
So, I am wondering if there is something like SAL annotations that I can use to enforce this isolation at compile time?
Here is an example of what I want:
#define SET_1 ...
#define SET_2 ...
SET_1
void Fct1()
{
// ...
}
SET_1
void Fct2()
{
Fct1(); // Ok, both functions have SET_1 tag
}
SET_2
void Fct3()
{
Fct1(); // Compile error, Fct1 has a different tag
}
I don’t want to write some kind of code parser to manually enforce this rule.
I have multiple files and a file contains functions from both sets. The functions don’t have any common characteristic, I manually need to specify the set for each function.
The solution can be at compile time or at build time. I just want to make sure that a function from set1 will not be called from set2
I can modify the code and I know that the right solution will be to refactor the project, but I am curious if there is another solution.
For example, if the code was in C++, I could include all functions from set1 inside a namespace and those from set2 inside another namespace. But this will not work if we have a class with function members in different sets.

Related

LLVM Loop Simplify Pass

I am probably misunderstanding some basic concept of how LLVM and its passes work; anyhow, here is my question:
I am currently working on a pass where I extend the runOnModule function (https://llvm.org/doxygen/classllvm_1_1ModulePass.html). I would like to run LoopSimplify on the IR first, but I do not understand how to do that. There is a run(Function &F, FunctionAnalysisManager &AM) function, described at https://llvm.org/doxygen/classllvm_1_1LoopSimplifyPass.html, and as far as I understand I can call it on every function in my module. But for that I need an instance of that class (LoopSimplify) to call it on, which I do not know where to get, and also a FunctionAnalysisManager. What are they for, and what do they need to look like? Presumably I cannot just feed it some empty constructs, right?
I want to do this for the following guarantee:
"Loop pre-header insertion guarantees that there is a single, non-critical
entry edge from outside of the loop to the loop header. This simplifies a
number of analyses and transformations, such as LICM." as described in https://llvm.org/doxygen/LoopSimplify_8h_source.html.
While I support the directions to integrate your pass into using the pass manager, nonetheless, there is a way to force LoopSimplify to run by making your pass require it. This is also used in many of the LLVM provided passes, such as Scalar/LoopVersioningLICM.cpp
// This header includes LoopSimplifyID as an extern
#include "llvm/Transforms/Utils.h"
...
void YourPass::getAnalysisUsage(AnalysisUsage& AU) const {
AU.addRequiredID(LoopSimplifyID);
}
Doing so will force LoopSimplify to run before your pass, with no need to invoke it yourself. However, if you need to interface with this or another pass, you can request its analysis:
getAnalysis<LoopSimplifyPass>(F); // Where F is a function&

How can I parametrize a callback function that I submit to an external library

Say I have an external library that computes the optima, say minima, of a given function. Say its headers give me a function
double *minimizer(ObjFun f);
where the headers define
typedef double (*ObjFun)(double x[]);
and "minimizer" returns the minima of the function f of, say, a two dimensional vector x.
Now, I want to use this to minimize a parameterized function. I don't know how to express this in code exactly, but say if I am minimizing quadratic forms (just a silly example, I know these have closed form minima)
double quadraticForm(double x[]) {
return x[0]*x[0]*q11 + 2*x[0]*x[1]*q12 + x[1]*x[1]*q22;
}
which is parameterized by the constants (q11, q12, q22). I want to write code where the user can input (q11, q12, q22) at runtime, I can generate a function to give to the library as a callback, and return the optima.
What is the recommended way to do this in C?
I am rusty with C, so asking about both feasibility and best practices. Really I am trying to solve this using C/Cython code. I was using python bindings to the library so far and using "inner functions" it was really obvious how to do this in python:
def getFunction(q11, q12, q22):
    def f(x):
        return x[0]*x[0]*q11 + 2*x[0]*x[1]*q12 + x[1]*x[1]*q22
    return f
# now submit getFunction(<user params>) to the library
I am trying to figure out the C construct so that I can be better informed in creating a Cython equivalent.
The header defines the prototype of a function which can be used as a callback. I am assuming that you can't/won't change that header.
If your function has more parameters, they cannot be filled by the call.
Your function therefore cannot be called as a callback, to avoid undefined behaviour or bogus values in parameters.
The function therefore cannot be given as a callback; not with additional parameters.
Above means you need to drop the idea of "parameterizing" your function.
Your actual goal is to somehow allow the constants/coefficients to be changed during runtime.
Find a different way of doing that. Think of "dynamic configuration" instead of "parameterizing".
I.e. the function does not always expect those values at each call. It just has access to them.
(This suggests the configuration values are less often changed than the function is called, but does not require it.)
How:
I can only think of one simple way, and it is pretty ugly and vulnerable (e.g. to race conditions, concurrent access, reentrancy; you name it, it will hurt you ...):
Introduce a set of global variables, or better one struct-variable, for readability. (See recommendation below for "file-global" instead of "global".)
Set them at runtime to the desired values, using a separate function.
Initialise them to meaningful defaults, in case they never get written.
Read them at the start of the minimizing callback function.
Recommendation: Have everything (the minimizing function, the configuration variable and the function which sets the configuration at runtime) in one code file and make the configuration variable(s) static (i.e. restrict access to them to this code file).
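The steps above can be sketched roughly as follows. Only the ObjFun callback shape comes from the question; the names QuadConfig, g_cfg and set_quad_config, as well as the default coefficients, are made up for illustration, and the caveats about concurrency and reentrancy still apply.

```c
#include <assert.h>
#include <stddef.h>

/* Callback type as described for the library header in the question. */
typedef double (*ObjFun)(double x[]);

/* Hypothetical configuration struct holding the coefficients. */
typedef struct {
    double q11, q12, q22;
} QuadConfig;

/* File-local (static) configuration with meaningful defaults,
 * in case the setter is never called. */
static QuadConfig g_cfg = { 1.0, 0.0, 1.0 };

/* Hypothetical setter: call it at runtime before handing the
 * callback to the library. Not safe against concurrent use. */
void set_quad_config(double q11, double q12, double q22)
{
    g_cfg.q11 = q11;
    g_cfg.q12 = q12;
    g_cfg.q22 = q22;
}

/* Matches ObjFun: takes only x and reads the configuration
 * once at the start of the call. */
double quadraticForm(double x[])
{
    assert(x != NULL);
    const QuadConfig cfg = g_cfg;  /* snapshot for this call */
    return x[0]*x[0]*cfg.q11 + 2*x[0]*x[1]*cfg.q12 + x[1]*x[1]*cfg.q22;
}
```

A caller would run set_quad_config with the user's input and then pass quadraticForm to the library's minimizer; the callback signature itself never changes.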
Note:
The answer is only the analysis of why you should not try parameters.
The proposed method is not considered part of the answer; it is more simple than good.
I invite more holistic answers, which propose safer implementation.

Which of these functions is more testable in C?

I write code in C. I have been striving to write more testable code but I am a little
confused on deciding between writing pure functions that are really good for testing
but require smaller functions and hurt readability in my opinion and writing functions
that do modify some internal state.
For example (all state variables are declared static and hence are "private" to my module):
Which of these is more testable in your opinion:
int outer_API_bar()
{
// Modify internal state
internal_foo();
}
int internal_foo()
{
// Do stuff
if (internal_state_variable)
{
// Do some more stuff
internal_state_variable = false;
}
}
OR
int outer_API_bar()
{
// Modify internal state
internal_foo(internal_state_variable);
// This could be another function if repeated many
// times in the module
if (internal_state_variable)
{
internal_state_variable = false;
}
}
int internal_foo(bool arg)
{
// Do stuff
if (arg)
{
// Do some more stuff
}
}
Although the second implementation is more testable with respect to internal_foo, as it has no side effects, it makes bar uglier and requires smaller functions, which make it hard for the reader to follow even small snippets because they have to constantly shift attention between functions.
Which one do you think is better? Compare this to writing OOP code: the private functions most of the time use internal state and are not pure. Testing is done by setting up internal state on a mock object instance and testing the private function. I am a little confused about whether to use internal state directly or to pass it in to private functions for the sake of "testability".
Whenever writing automated tests, ideally we want to focus on testing the specification of that unit of code, not the implementation (otherwise we create fragile tests that will break whenever we modify the implementation). Therefore, what happens internally in the object should not be of concern to the test.
For this example, I would look to build a test that:
Executes the test by calling outer_API_bar.
Asserts the correct behavior of the call using other publicly accessible functions and/or state (there must be some way of doing this: if the only side effect of calling outer_API_bar were internal to this unit of code, then calling it could not affect your wider application in any way and would essentially be useless).
This way, you are able to keep the fact that you use functions like internal_foo, and variables like internal_state_variable as implementation details, which you can freely change when refactoring your code (i.e. to make it more readable) without having to change your tests.
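A minimal sketch of such a black-box test might look like this. The functions module_request and module_processed_count and the counter processed_count are hypothetical additions, standing in for whatever public functions the real module exposes; the test never touches internal_foo or internal_state_variable directly.

```c
#include <assert.h>
#include <stdbool.h>

/* Private state: static, so invisible outside this file. */
static bool internal_state_variable = false;
static int processed_count = 0;   /* hypothetical observable effect */

/* Private helper; free to change during refactoring. */
static void internal_foo(void)
{
    if (internal_state_variable) {
        processed_count++;        /* placeholder for "do some more stuff" */
        internal_state_variable = false;
    }
}

/* Public API. module_request and module_processed_count are
 * made-up names for illustration. */
void module_request(void)        { internal_state_variable = true; }
int  module_processed_count(void) { return processed_count; }

int outer_API_bar(void)
{
    internal_foo();
    return 0;
}

/* Black-box test: exercises only the public functions. */
void test_outer_API_bar(void)
{
    outer_API_bar();
    assert(module_processed_count() == 0);  /* nothing requested yet */

    module_request();
    outer_API_bar();
    assert(module_processed_count() == 1);  /* request handled once */

    outer_API_bar();
    assert(module_processed_count() == 1);  /* no repeat without a new request */
}
```

If internal_foo is later inlined or split up, this test keeps passing as long as the public behavior stays the same.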
NOTE: This suggestion is based on my own personal preference for only testing public functions, and not private ones. You will find much debate on this topic where some people pose good arguments for testing private functions being a valid thing to do.
To answer your question very specifically pure functions are waaaaay more 'testable' than any other kind of abstraction. The more pure functions you can include, the more testable your code would be. As you rightly mention, this can come at the cost of readability, and I am sure there are other trade offs to consider. My suggestion would be to aim for more pure functions and look for other techniques that would allow you to compensate on the readability side of things.
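To illustrate why pure functions are easy to test, here is a small sketch. The extra int parameter, the int return value and the arithmetic are placeholders, not part of the original snippets; the point is only that the test needs no setup and no access to module state.

```c
#include <assert.h>
#include <stdbool.h>

/* Pure: the result depends only on the arguments,
 * and nothing outside the function is modified. */
int internal_foo(int input, bool arg)
{
    int result = input * 2;   /* placeholder for "do stuff" */
    if (arg) {
        result += 1;          /* placeholder for "do some more stuff" */
    }
    return result;
}

/* The test is a list of input/output pairs; call order does not matter. */
void test_internal_foo(void)
{
    assert(internal_foo(3, false) == 6);
    assert(internal_foo(3, true) == 7);
    assert(internal_foo(3, true) == 7);  /* same input, same output */
}
```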
Both snippets are testable via mocks. The second one, however, has the advantage that you can also check the argument of internal_foo(bool arg) for an expected value of true or false when the mock for internal_foo() is invoked. In my opinion, that would make for a more meaningful test.
Depending on the rest of the code that we don't know, testing without mocks may be more difficult.

method vs function vs procedure vs class?

I know the basics of methods, procedures, functions and classes, but I always get confused when trying to differentiate them in the context of object-oriented programming. Can anybody explain the differences with simple examples?
A class, in current, conventional OOP, is a collection of data (member variables) bound together with the functions/procedures that work on that data (member functions or methods). The class has no relationship to the other three terms aside from the fact that it "contains" (more properly "is associated with") the latter.
The other three terms ... well, it depends.
A function is a collection of computing statements. So is a procedure. In some very anal retentive languages, though, a function returns a value and a procedure doesn't. In such languages procedures are generally used for their side effects (like I/O) while functions are used for calculations and tend to avoid side effects. (This is the usage I tend to favour. Yes, I am that anal retentive.)
Most languages are not that anal retentive, however, and as a result people will use the terms "function" and "procedure" interchangeably, preferring one to the other based on their background. (Modula-* programmers will tend to use "procedure" while C/C++/Java/whatever will tend to use "function", for example.)
A method is just jargon for a function (or procedure) bound to a class. Indeed not all OOP languages use the term "method". In a typical (but not universal!) implementation, methods have an implied first parameter (called things like this or self or the like) for accessing the containing class. This is not, as I said, universal. Some languages make that first parameter explicit (and thus allow to be named anything you'd like) while in still others there's no magic first parameter at all.
Edited to add this example:
The following untested and uncompiled C++-like code should show you what kind of things are involved.
class MyClass
{
int memberVariable;
void setMemberVariableProcedure(int v)
{
memberVariable = v;
}
int getMemberVariableFunction()
{
return memberVariable;
}
};
void plainOldProcedure(int stuff)
{
cout << stuff;
}
int plainOldFunction(int stuff)
{
return 2 * stuff;
}
In this code setMemberVariableProcedure and getMemberVariableFunction are both methods.
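For comparison, C has no methods, so the "implied first parameter" mentioned above has to be written out by hand. The following sketch mirrors the class with a struct and two plain functions; the names MyClass_setMemberVariable and MyClass_getMemberVariable are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* The data that a class would bundle with its methods. */
typedef struct {
    int memberVariable;
} MyClass;

/* "Procedure" style: called for its side effect on *self, returns nothing.
 * The explicit self pointer plays the role of this/self. */
void MyClass_setMemberVariable(MyClass *self, int v)
{
    assert(self != NULL);
    self->memberVariable = v;
}

/* "Function" style: returns a value and does not modify the object. */
int MyClass_getMemberVariable(const MyClass *self)
{
    assert(self != NULL);
    return self->memberVariable;
}
```

A call such as MyClass_setMemberVariable(&obj, 42) corresponds to obj.setMemberVariableProcedure(42) in the C++-like code, with the object passed explicitly.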
Procedures, functions and methods are generally alike: they hold some processing statements.
The only differences I can think of between these three are the places where they are used.
I mean, 'methods' are generally functions defined inside a class, where several kinds of access rights such as public, protected and private can be defined.
"Procedures" are also functions, but they generally represent a series of functions which need to be carried out upon the completion of one function or in parallel with another.
Classes are collections of related attributes and methods. Attributes define the object of the class, whereas methods are the actions done by or on the class.
Hope, this was helpful
Function, method and procedure are homogeneous and each of them is a subroutine that performs some calculations.
A subroutine is:
a method when used in Object-Oriented Programming (OOP). A method can return nothing (void) or something and/or it can change data outside of the subroutine or method.
a procedure when it does not return anything but it can change data outside of the subroutine, think of a SQL stored procedure. Not considering output parameters!
a function when it returns something (its calculated result) without changing data outside of the subroutine or function. This is the way how SQL functions work.
After all, they are all a piece of re-usable code that does something, e.g. return data, calculate or manipulate data.
There is no real difference among them.
Method : no return type like void
Function : which have return type

GCC function attributes vs caching

I have one costly function that gets called many times and there is a very limited set of possible values for the parameter.
Function return code depends only on arguments so the obvious way to speed things up is to keep a static cache within the function for possible arguments and corresponding return codes, so for every combination of the parameters, the costly operation will be performed only once.
I always use this approach in such situations and it works fine but it just occurred to me that GCC function attributes const or pure probably can help me with this.
Does anybody have experience with this? How GCC uses pure and const attributes - only at compile time or at runtime as well?
Can I rely on GCC to be smart enough to call a function, declared as
int foo(int) __attribute__ ((pure))
just once for the same parameter value, or there is no guarantee whatsoever and I better stick to caching approach?
EDIT: My question is not about caching/memoization/lookup tables, but GCC function attributes.
I think you are confusing the GCC pure attribute with memoization.
The GCC pure attribute allows the compiler to reduce the number of times the function is called in certain circumstances (such as loop unrolling). However, it makes no guarantee that it will do so; it does so only if it thinks it's appropriate.
What you appear to be looking for is memoization of your function. Memoization is an optimization where calculations for the same input should not be repeated. Instead the previous result should be returned. The GCC pure attribute does not make a function work in this way. You would have to hand implement this.
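A hand-written memoization along those lines could look like the sketch below. The function costly, the bound MAX_ARG and the call counter are assumptions made for illustration; the static cache is also not thread-safe as written.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_ARG 16            /* assumed size of the small argument domain */

static int costly_calls = 0;  /* instrumentation for the example only */

/* Stand-in for the expensive computation. */
static int costly(int n)
{
    costly_calls++;
    return n * n + 1;
}

/* Memoized wrapper: the costly computation runs at most once per argument. */
int foo_memoized(int n)
{
    static bool known[MAX_ARG];   /* zero-initialized: nothing cached yet */
    static int  cache[MAX_ARG];

    assert(n >= 0 && n < MAX_ARG);
    if (!known[n]) {
        cache[n] = costly(n);
        known[n] = true;
    }
    return cache[n];
}

/* Lets a caller observe how often the costly function really ran. */
int costly_call_count(void)
{
    return costly_calls;
}
```

Unlike the pure attribute, this gives a guarantee that does not depend on optimization level or on what the compiler can see at the call site.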
I have one costly function that gets called many times and there is very limited set of possible values for the parameter.
Why not use a static constant map then (the arguments can be hashed to generate a key, with the return code as the value)?
This sounds like it might be solved with a template function. If all of the parameters and return values are known at compile time, you could perhaps generate a template instance of the function for each possible parameter. Essentially you'd be calling a different instance of the function for each possible parameter. Not sure it would be any easier than the static cache you've already implemented, but it might be worth exploring.
Check out template metaprogramming. The concepts are similar to 'memoization', suggested by JaredPar, even using the same introductory example of a factorial function. It might be appropriate to say that these kinds of templates are compile-time implementations of memoization.
I don't like to reopen old threads, but there was a particularly offensive comment here:
"templates are for dealing with different types, rather than different values of the same type"
Now, take a simple template factorial implementation:
template<int n> struct Factorial {
static const int value = n * Factorial<n-1>::value;
};
template<> struct Factorial<0> {
static const int value = 1;
};
The template parameter here is an integer, not a typename.
