Explicitly initialize a function pointer [duplicate] - c

What is a lambda expression in C++11? When would I use one? What class of problem do they solve that wasn't possible prior to their introduction?
A few examples, and use cases would be useful.

The problem
C++ includes useful generic functions like std::for_each and std::transform, which can be very handy. Unfortunately they can also be quite cumbersome to use, particularly if the functor you would like to apply is unique to the particular function.
#include <algorithm>
#include <vector>
namespace {
struct f {
void operator()(int) {
// do something
}
};
}
void func(std::vector<int>& v) {
f f;
std::for_each(v.begin(), v.end(), f);
}
If you only use f once and in that specific place it seems overkill to be writing a whole class just to do something trivial and one off.
In C++03 you might be tempted to write something like the following, to keep the functor local:
void func2(std::vector<int>& v) {
struct {
void operator()(int) {
// do something
}
} f;
std::for_each(v.begin(), v.end(), f);
}
However, this is not allowed: in C++03 a local type cannot be used as a template argument, so f cannot be passed to a template function.
The new solution
C++11 introduces lambdas, which allow you to write an inline, anonymous functor to replace the struct f. For small, simple examples this can be cleaner to read (it keeps everything in one place) and potentially simpler to maintain, for example in the simplest form:
void func3(std::vector<int>& v) {
std::for_each(v.begin(), v.end(), [](int) { /* do something here*/ });
}
Lambda functions are just syntactic sugar for anonymous functors.
Return types
In simple cases the return type of the lambda is deduced for you, e.g.:
void func4(std::vector<double>& v) {
std::transform(v.begin(), v.end(), v.begin(),
[](double d) { return d < 0.00001 ? 0 : d; }
);
}
however when you start to write more complex lambdas you will quickly encounter cases where the return type cannot be deduced by the compiler, e.g.:
void func4(std::vector<double>& v) {
std::transform(v.begin(), v.end(), v.begin(),
[](double d) {
if (d < 0.0001) {
return 0;
} else {
return d;
}
});
}
To resolve this you are allowed to explicitly specify a return type for a lambda function, using -> T:
void func4(std::vector<double>& v) {
std::transform(v.begin(), v.end(), v.begin(),
[](double d) -> double {
if (d < 0.0001) {
return 0;
} else {
return d;
}
});
}
"Capturing" variables
So far the lambdas have used nothing other than the arguments passed to them, but a lambda can also use other variables from the enclosing scope. To access them you use the capture clause (the [] of the expression), which has so far been left empty in these examples, e.g.:
void func5(std::vector<double>& v, const double& epsilon) {
std::transform(v.begin(), v.end(), v.begin(),
[epsilon](double d) -> double {
if (d < epsilon) {
return 0;
} else {
return d;
}
});
}
You can capture by both reference and value, which you can specify using & and = respectively:
[&epsilon, zeta] captures epsilon by reference and zeta by value
[&] captures all variables used in the lambda by reference
[=] captures all variables used in the lambda by value
[&, epsilon] captures all variables used in the lambda by reference but captures epsilon by value
[=, &epsilon] captures all variables used in the lambda by value but captures epsilon by reference
The generated operator() is const by default, which means that variables captured by value are const when you access them inside the body. As far as the by-value captures are concerned, each call with the same input therefore produces the same result. You can, however, mark the lambda as mutable to request that the generated operator() is not const.
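As a minimal sketch of the point about const and mutable (the function name mutable_capture_demo is made up for this illustration), the following shows that a by-value capture can only be modified inside a mutable lambda, and that the modification applies to the closure's own copy rather than to the original variable:

```cpp
#include <utility>

// Hypothetical helper for illustration: returns
// {value of the original variable, value returned by the third call}.
std::pair<int, int> mutable_capture_demo() {
    int counter = 0;

    // Without `mutable`, `++counter` would not compile, because the
    // generated operator() is const and `counter` is captured by value.
    auto next = [counter]() mutable { return ++counter; };

    next();              // the closure's copy becomes 1
    next();              // the closure's copy becomes 2
    int third = next();  // the closure's copy becomes 3

    // The original variable is untouched: only the copy stored inside
    // the closure object was modified.
    return std::make_pair(counter, third);
}
```

Because the state lives in the closure object, copying the closure also copies the current counter value.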

What is a lambda function?
The C++ concept of a lambda function originates in the lambda calculus and functional programming. A lambda is an unnamed function that is useful (in actual programming, not theory) for short snippets of code that are impossible to reuse and are not worth naming.
In C++ a lambda function is defined like this
[]() { } // barebone lambda
or in all its glory
[]() mutable -> T { } // T is the return type, still lacking throw()
[] is the capture list, () the argument list and {} the function body.
The capture list
The capture list defines what from the outside of the lambda should be available inside the function body and how.
It can be either:
1. a value: [x]
2. a reference: [&x]
3. any variable currently in scope by reference: [&]
4. same as 3, but by value: [=]
You can mix any of the above in a comma separated list [x, &y].
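The difference between the two kinds of capture is easiest to see when the variables change after the lambda has been created. This is a small sketch under that assumption; capture_kinds_demo is a name invented for the example:

```cpp
#include <utility>

// Hypothetical demo: shows when each kind of capture reads the variable.
std::pair<int, int> capture_kinds_demo() {
    int x = 1;
    int y = 1;

    // x is copied at the point where the lambda is created;
    // y is only referred to.
    auto read_both = [x, &y]() { return std::make_pair(x, y); };

    x = 10;  // not seen by the lambda: it holds its own copy of the old x
    y = 10;  // seen by the lambda: it refers to the original y

    return read_both();  // yields {1, 10}
}
```

A reference capture must not outlive the variable it refers to, which is why by-value capture is usually the safer choice for lambdas that are stored and called later.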
The argument list
The argument list is the same as in any other C++ function.
The function body
The code that will be executed when the lambda is actually called.
Return type deduction
If a lambda has only one return statement, the return type can be omitted and has the implicit type of decltype(return_statement).
Mutable
If a lambda is marked mutable (e.g. []() mutable { }) it is allowed to mutate the values that have been captured by value.
Use cases
The library defined by the ISO standard benefits heavily from lambdas and raises the usability several bars as now users don't have to clutter their code with small functors in some accessible scope.
C++14
In C++14 lambdas have been extended by various proposals.
Initialized Lambda Captures
An element of the capture list can now be initialized with =. This allows renaming of variables and to capture by moving. An example taken from the standard:
int x = 4;
auto y = [&r = x, x = x+1]()->int {
r += 2;
return x+2;
}(); // Updates ::x to 6, and initializes y to 7.
and one taken from Wikipedia showing how to capture with std::move:
auto ptr = std::make_unique<int>(10); // See below for std::make_unique
auto lambda = [ptr = std::move(ptr)] {return *ptr;};
Generic Lambdas
Lambdas can now be generic (auto would be equivalent to T here if T were a type template argument somewhere in the surrounding scope):
auto lambda = [](auto x, auto y) {return x + y;};
Improved Return Type Deduction
C++14 allows deduced return types for every function and does not restrict it to functions of the form return expression;. This is also extended to lambdas.
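A possible illustration of the relaxed deduction, assuming a C++14 (or later) compiler; clamp_small is a hypothetical wrapper used only to make the lambda callable from outside:

```cpp
// Requires C++14 or later. In C++11 this lambda would need an explicit
// `-> double`, because its body is more than a single return statement.
double clamp_small(double d) {
    auto clamp = [](double value) {
        if (value < 0.0001) {
            return 0.0;  // double, not int: all return statements must agree
        }
        return value;    // double
    };
    return clamp(d);
}
```

Note that deduction still fails if the return statements disagree, for example `return 0;` in one branch and `return value;` in the other.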

Lambda expressions are typically used to encapsulate algorithms so that they can be passed to another function. However, it is possible to execute a lambda immediately upon definition:
[&](){ ...your code... }(); // immediately executed lambda expression
is functionally equivalent to
{ ...your code... } // simple code block
This makes lambda expressions a powerful tool for refactoring complex functions. You start by wrapping a code section in a lambda function as shown above. The process of explicit parameterization can then be performed gradually with intermediate testing after each step. Once you have the code-block fully parameterized (as demonstrated by the removal of the &), you can move the code to an external location and make it a normal function.
Similarly, you can use lambda expressions to initialize variables based on the result of an algorithm...
int a = []( int b ){ int r=1; while (b>0) r*=b--; return r; }(5); // 5!
As a way of partitioning your program logic, you might even find it useful to pass a lambda expression as an argument to another lambda expression...
[&]( std::function<void()> algorithm ) // wrapper section
{
...your wrapper code...
algorithm();
...your wrapper code...
}
([&]() // algorithm section
{
...your algorithm code...
});
Lambda expressions also let you create named nested functions, which can be a convenient way of avoiding duplicate logic. Using named lambdas also tends to be a little easier on the eyes (compared to anonymous inline lambdas) when passing a non-trivial function as a parameter to another function. Note: don't forget the semicolon after the closing curly brace.
auto algorithm = [&]( double x, double m, double b ) -> double
{
return m*x+b;
};
int a=algorithm(1,2,3), b=algorithm(4,5,6);
If subsequent profiling reveals significant initialization overhead for the function object, you might choose to rewrite this as a normal function.

Answers
Q: What is a lambda expression in C++11?
A: Under the hood, it is an object of an autogenerated class with an overloaded operator() const. Such an object is called a closure and is created by the compiler.
This closure concept is close to the bind concept from C++11,
but lambdas typically generate better code, and calls through closures allow full inlining.
Q: When would I use one?
A: To define "simple and small logic" and ask the compiler to generate the class described in the previous answer. You give the compiler the expressions that you want inside operator(); everything else the compiler generates for you.
Q: What class of problem do they solve that wasn't possible prior to their introduction?
A: It is a kind of syntactic sugar, like operator overloading instead of functions for custom add and subtract operations. But it saves even more lines of unneeded code that would otherwise wrap 1-3 lines of real logic in a class, etc. Some engineers think that the fewer lines there are, the less chance there is of making errors in them (I think so too).
Example of usage
auto x = [](int arg1){printf("%i", arg1); }; // empty capture list, so it converts to a function pointer
void(*f)(int) = x;
f(1);
x(1);
Extras about lambdas, not covered by the question. Skip this section if you're not interested.
1. Captured values. What you can capture
1.1. You can refer to a variable with static storage duration inside a lambda without capturing it; such variables are always accessible.
1.2. You can capture values "by value". In that case the captured variables are copied into the function object (closure).
[captureVar1,captureVar2](int arg1){}
1.3. You can capture by reference. In this context & means a reference, not a pointer.
[&captureVar1,&captureVar2](int arg1){}
1.4. There is a notation to capture all non-static variables by value, or by reference:
[=](int arg1){} // capture all non-static variables by value
[&](int arg1){} // capture all non-static variables by reference
1.5. There is a notation to capture all non-static variables by value or by reference, and then override that for specific variables.
Examples:
Capture all non-static variables by value, but capture Param2 by reference
[=,&Param2](int arg1){}
Capture all non-static variables by reference, but capture Param2 by value
[&,Param2](int arg1){}
2. Return type deduction
2.1. The return type of a lambda can be deduced if the lambda body is a single return statement. Otherwise you can specify it explicitly.
[=](int arg1)->trailing_return_type{return trailing_return_type();}
If the lambda has more than one statement, then in C++11 the return type must be specified via a trailing return type.
Also, similar syntax can be applied to auto functions and member functions.
3. Captured values. What you cannot capture
3.1. You can capture only local variables, not member variables of the object. To use members inside a lambda, capture this instead.
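A short sketch of the member-variable restriction; the Scaler class and its members are hypothetical and exist only for this illustration:

```cpp
// Hypothetical class used only for illustration.
class Scaler {
public:
    explicit Scaler(int factor) : factor_(factor) {}

    int apply(int value) const {
        // `[factor_]` would not compile: factor_ is a data member, not a
        // local variable. Capturing `this` makes the members accessible.
        auto scale = [this](int v) { return v * factor_; };
        return scale(value);
    }

private:
    int factor_;
};
```

Since this is a pointer, the lambda sees the object's current state and must not be called after the object has been destroyed.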
4. Conversions
4.1. A lambda is not a function pointer and it is not an anonymous function, but capture-less lambdas can be implicitly converted to a function pointer.
P.S.
More information about the lambda grammar can be found in the Working draft for Programming Language C++ #337, 2012-01-16, 5.1.2 Lambda Expressions, p. 88.
In C++14 an extra feature named "init capture" was added. It allows arbitrary declarations of closure data members:
auto toFloat = [](int value) { return float(value);};
auto interpolate = [min = toFloat(0), max = toFloat(255)](int value)->float { return (value - min) / (max - min);};

A lambda function is an anonymous function that you create in-line. It can capture variables as some have explained, (e.g. http://www.stroustrup.com/C++11FAQ.html#lambda) but there are some limitations. For example, if there's a callback interface like this,
void apply(void (*f)(int)) {
f(10);
f(20);
f(30);
}
you can write a function on the spot to use it like the one passed to apply below:
int col=0;
void output() {
apply([](int data) {
cout << data << ((++col % 10) ? ' ' : '\n');
});
}
But you can't do this:
void output(int n) {
int col=0;
apply([&col,n](int data) {
cout << data << ((++col % 10) ? ' ' : '\n');
});
}
because in C++11 a lambda with captures cannot be converted to a plain function pointer. If you want to use captures, you have to rely on the library and
#include <functional>
(or some other standard library header, such as algorithm, that pulls it in indirectly) and then work with std::function instead of passing normal functions as parameters, like this:
#include <functional>
void apply(std::function<void(int)> f) {
f(10);
f(20);
f(30);
}
void output(int width) {
int col = 0; // must be initialized before the lambda increments it
apply([width,&col](int data) {
cout << data << ((++col % width) ? ' ' : '\n');
});
}

One of the best explanations of lambda expressions is given by the author of C++, Bjarne Stroustrup, in his book The C++ Programming Language, chapter 11 (ISBN-13: 978-0321563842):
What is a lambda expression?
A lambda expression, sometimes also referred to as a lambda
function or (strictly speaking incorrectly, but colloquially) as a
lambda, is a simplified notation for defining and using an anonymous function object. Instead of defining a named class with an operator(), later making an object of that class, and finally
invoking it, we can use a shorthand.
When would I use one?
This is particularly useful when we want to pass an operation as an
argument to an algorithm. In the context of graphical user interfaces
(and elsewhere), such operations are often referred to as callbacks.
What class of problem do they solve that wasn't possible prior to their introduction?
Here I guess everything done with a lambda expression can also be done without one, but with much more code and much greater complexity. Lambda expressions are a way of optimizing your code and of making it more attractive. As said by Stroustrup:
effective ways of optimizing
Some examples
via lambda expression
void print_modulo(const vector<int>& v, ostream& os, int m) // output v[i] to os if v[i]%m==0
{
for_each(begin(v),end(v),
[&os,m](int x) {
if (x%m==0) os << x << '\n';
});
}
or via a function object
class Modulo_print {
ostream& os; // members to hold the capture list
int m;
public:
Modulo_print(ostream& s, int mm) :os(s), m(mm) {}
void operator()(int x) const
{
if (x%m==0) os << x << '\n';
}
};
or even
void print_modulo(const vector<int>& v, ostream& os, int m)
// output v[i] to os if v[i]%m==0
{
class Modulo_print {
ostream& os; // members to hold the capture list
int m;
public:
Modulo_print (ostream& s, int mm) :os(s), m(mm) {}
void operator()(int x) const
{
if (x%m==0) os << x << '\n';
}
};
for_each(begin(v),end(v),Modulo_print{os,m});
}
If you need to, you can name the lambda expression as shown below:
void print_modulo(const vector<int>& v, ostream& os, int m)
// output v[i] to os if v[i]%m==0
{
auto Modulo_print = [&os,m] (int x) { if (x%m==0) os << x << '\n'; };
for_each(begin(v),end(v),Modulo_print);
}
Or consider another simple example
void TestFunctions::simpleLambda() {
bool sensitive = true;
std::vector<int> v = std::vector<int>({1,33,3,4,5,6,7});
sort(v.begin(),v.end(),
[sensitive](int x, int y) {
printf("\n%i\n", x < y);
return sensitive ? x < y : abs(x) < abs(y);
});
printf("sorted");
for_each(v.begin(), v.end(),
[](int x) {
printf("x - %i;", x);
}
);
}
which produces the following output:
0
1
0
1
0
1
0
1
0
1
0 sortedx - 1;x - 3;x - 4;x - 5;x - 6;x - 7;x - 33;
[] - this is the capture list, or lambda introducer: if a lambda requires no access to its local environment, we can use this empty form.
Quote from book:
The first character of a lambda expression is always [. A lambda introducer can take various forms:
• []: an empty capture list. This implies that no local names from the surrounding context can be used in the lambda body. For such lambda expressions, data is obtained from arguments or from nonlocal variables.
• [&]: implicitly capture by reference. All local names can be used. All local variables are accessed by reference.
• [=]: implicitly capture by value. All local names can be used. All names refer to copies of the local variables taken at the point of call of the lambda expression.
• [capture-list]: explicit capture; the capture-list is the list of names of local variables to be captured (i.e., stored in the object) by reference or by value. Variables with names preceded by & are captured by reference. Other variables are captured by value. A capture list can also contain this and names followed by ... as elements.
• [&, capture-list]: implicitly capture by reference all local variables with names not mentioned in the list. The capture list can contain this. Listed names cannot be preceded by &. Variables named in the capture list are captured by value.
• [=, capture-list]: implicitly capture by value all local variables with names not mentioned in the list. The capture list cannot contain this. The listed names must be preceded by &. Variables named in the capture list are captured by reference.
Note that a local name preceded by & is always captured by reference and a local name not preceded by & is always captured by value. Only capture by reference allows modification of variables in the calling environment.

Lambdas in C++ are treated as functions that are available "on the go".
Yes, it is literally on the go: you define it, you use it, and when the parent function's scope ends the lambda is gone.
C++ introduced lambdas in C++11, and everyone started using them in every possible place.
Examples and a description of what a lambda is can be found at https://en.cppreference.com/w/cpp/language/lambda
I will describe what is not covered there but is essential for every C++ programmer to know.
Lambdas are not meant to be used everywhere, and not every function can be replaced with a lambda. A lambda is also not always the fastest option compared to a normal function, because the closure object it creates can add some overhead.
It will certainly help reduce the number of lines in some cases.
It is mainly useful for a section of code that is called one or more times within the same function and is not needed anywhere else, so that creating a standalone function for it is not justified.
Below is a basic example of a lambda and what happens in the background.
User code:
int main()
{
// Lambda & auto
int member=10;
auto endGame = [=](int a, int b){ return a+b+member;};
endGame(4,5);
return 0;
}
How the compiler expands it:
int main()
{
int member = 10;
class __lambda_6_18
{
int member;
public:
inline /*constexpr */ int operator()(int a, int b) const
{
return a + b + member;
}
public: __lambda_6_18(int _member)
: member{_member}
{}
};
__lambda_6_18 endGame = __lambda_6_18{member};
endGame.operator()(4, 5);
return 0;
}
So, as you can see, this is the kind of overhead a lambda adds when you use it.
It is therefore not a good idea to use them everywhere.
Use them in the places where they are applicable.

Well, one practical use I've found is reducing boilerplate code. For example:
void process_z_vec(vector<int>& vec)
{
auto print_2d = [](const vector<int>& board, int bsize)
{
for(int i = 0; i<bsize; i++)
{
for(int j=0; j<bsize; j++)
{
cout << board[bsize*i+j] << " ";
}
cout << "\n";
}
};
// Do sth with the vec.
print_2d(vec,x_size);
// Do sth else with the vec.
print_2d(vec,y_size);
//...
}
Without a lambda, you may need to write separate code for the different bsize cases. Of course you could create a function, but what if you want to limit its use to the scope of the sole user function? The nature of a lambda fulfills this requirement, and I use it for that case.

C++11 introduced lambda expressions to allow us to write an inline function that can be used for short snippets of code.
[ capture clause ] (parameters) -> return-type
{
definition of method
}
Generally the return type of a lambda expression is deduced by the compiler itself, so we don't need to specify it explicitly and the -> return-type part can be omitted. In some more complex cases, such as conditional statements, the compiler can't work out the return type and we need to specify it.
// C++ program to demonstrate lambda expression in C++
#include <bits/stdc++.h>
using namespace std;
// Function to print vector
void printVector(vector<int> v)
{
// lambda expression to print vector
for_each(v.begin(), v.end(), [](int i)
{
std::cout << i << " ";
});
cout << endl;
}
int main()
{
vector<int> v {4, 1, 3, 5, 2, 3, 1, 7};
printVector(v);
// below snippet find first number greater than 4
// find_if searches for an element for which
// function(third argument) returns true
vector<int>:: iterator p = find_if(v.begin(), v.end(), [](int i)
{
return i > 4;
});
cout << "First number greater than 4 is : " << *p << endl;
// sort the vector; the lambda orders the elements in
// non-increasing order. The compiler can deduce the return type as
// bool, but it is shown here just for explanation
sort(v.begin(), v.end(), [](const int& a, const int& b) -> bool
{
return a > b;
});
printVector(v);
// function to count numbers greater than or equal to 5
int count_5 = count_if(v.begin(), v.end(), [](int a)
{
return (a >= 5);
});
cout << "The number of elements greater than or equal to 5 is : "
<< count_5 << endl;
// function for removing duplicate element (after sorting all
// duplicate comes together)
p = unique(v.begin(), v.end(), [](int a, int b)
{
return a == b;
});
// resizing vector to make size equal to total different number
v.resize(distance(v.begin(), p));
printVector(v);
// accumulate combines the elements of the container using the
// function provided as the fourth argument
int arr[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
int f = accumulate(arr, arr + 10, 1, [](int i, int j)
{
return i * j;
});
cout << "Factorial of 10 is : " << f << endl;
// We can also access function by storing this into variable
auto square = [](int i)
{
return i * i;
};
cout << "Square of 5 is : " << square(5) << endl;
}
Output
4 1 3 5 2 3 1 7
First number greater than 4 is : 5
7 5 4 3 3 2 1 1
The number of elements greater than or equal to 5 is : 2
7 5 4 3 2 1
Factorial of 10 is : 3628800
Square of 5 is : 25
A lambda expression can have more power than an ordinary function by having access to variables from the enclosing scope. We can capture external variables from the enclosing scope in three ways:
Capture by reference
Capture by value
Capture by both (mixed capture)
The syntax used for capturing variables:
[&] : capture all external variables by reference
[=] : capture all external variables by value
[a, &b] : capture a by value and b by reference
A lambda with an empty capture clause [ ] can access only those variables that are local to it.
#include <bits/stdc++.h>
using namespace std;
int main()
{
vector<int> v1 = {3, 1, 7, 9};
vector<int> v2 = {10, 2, 7, 16, 9};
// access v1 and v2 by reference
auto pushinto = [&] (int m)
{
v1.push_back(m);
v2.push_back(m);
};
// it pushes 20 in both v1 and v2
pushinto(20);
// access v1 by copy
[v1]()
{
for (auto p = v1.begin(); p != v1.end(); p++)
{
cout << *p << " ";
}
};
int N = 5;
// below snippet find first number greater than N
// [N] denotes, can access only N by value
vector<int>:: iterator p = find_if(v1.begin(), v1.end(), [N](int i)
{
return i > N;
});
cout << "First number greater than 5 is : " << *p << endl;
// function to count numbers greater than or equal to N
// [=] denotes, can access all variable
int count_N = count_if(v1.begin(), v1.end(), [=](int a)
{
return (a >= N);
});
cout << "The number of elements greater than or equal to 5 is : "
<< count_N << endl;
}
Output:
First number greater than 5 is : 7
The number of elements greater than or equal to 5 is : 3

One problem it solves: Code simpler than lambda for a call in constructor that uses an output parameter function for initializing a const member
You can initialize a const member of your class with a call to a function that returns its result through an output parameter.
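One way this could look is sketched below. The function query_width and the Terminal class are hypothetical stand-ins for an API that reports its result through an output parameter; the lambda is invoked immediately inside the member initializer list:

```cpp
// Hypothetical C-style API that reports its result through an output parameter.
void query_width(int* out) { *out = 80; }

class Terminal {
public:
    Terminal()
        // The lambda is invoked immediately; its return value initializes
        // the const member, which cannot be assigned in the constructor body.
        : width_([] {
              int w = 0;
              query_width(&w);
              return w;
          }()) {}

    int width() const { return width_; }

private:
    const int width_;
};
```

A named private static helper function would work equally well; the lambda simply keeps the logic next to the member it initializes.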

Related

How to test the return value from a function against multiple values without storing into a variable

How could I achieve something like this ...
int main(void)
{
if (f(x) == (a || b))
{
puts("Success");
}
return (0);
}
This would print Success if the return of f(x) is equal to a or b.
I know it is possible to store it in a variable but my question is:
"Could something like this be done by calling the f(x) function only once without using a variable?"
Edit 1: I'm not allowed to use the switch statement for this assignment
Edit 2: Could I set a range with only one expression like this?
if ( 2 < f(x) < 5)
Would this be valid (return type is int)?
how to test for multiple return values from a function called once without storing into a variable (?)
Not really, but with some restrictions let us abuse C and assume a, b and f() return a character (see footnote 1).
Form a character array made up of a and b and search it using memchr(). Inspired by @David C. Rankin. (It does not store the result of f() in a variable, but does call a function.)
int main(void) {
// v-------------v compound literal
if (memchr((char [2]){a,b}, f(x), 2)) {
puts("Success");
}
return 0;
}
I see OP added "return type is int" - Oh well.
if ( 2 < f(x) < 5) is valid code, but it does not do what OP wants.
It is like if ( (2 < f(x)) < 5) which compares f(x) with 2 and results in 0 or 1, which is always less than 5.
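The evaluation order described above can be spelled out step by step. This is only an illustrative sketch, written in C++ like the other added examples on this page, and the helper names chained and in_range are made up:

```cpp
// What `2 < v < 5` actually computes: the first comparison yields 0 or 1,
// and both of those are less than 5, so the result is always true.
bool chained(int v) {
    int first = (2 < v);  // 0 or 1
    return first < 5;     // always true
}

// What was intended: v strictly between 2 and 5.
bool in_range(int v) { return 2 < v && v < 5; }
```

Note that in_range names its argument twice, so calling it as in_range(f(x)) still evaluates f(x) only once, at the cost of passing the value through a function parameter.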
Tough crowd tonight, so how about the below. Needs a bit of extension math for int overflow, but is close.
abs(2*f(x) - (a+b)) == abs(a-b)
Footnote 1: Not serious code suggestions for production code - use a temporary.
This can obviously be done using a switch statement. Another way would be to call a function that returns true or false and takes the first function's value as input. Other options are a jump table, or even a > comparison or bit checking with binary operators, depending on the values of a and b (bit checking is very common for testing multiple bit flags at once).
But really you shouldn't care about using or not using a variable in such cases. Current compilers are quite good at keeping temporary variables like that in registers.
EDIT: given the constraints, the most likely solution is some bit manipulation, but it fully depends on the values of a, b, c, etc. The common way is to use powers of two as the values to check. Then you can check a set of values in only one operation.
Example: a = 1, b = 2, c = 4
if (f(x) & (1+2+4)) {...}
checks if we have a or b or c or a superposition of these values.
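A small sketch of the power-of-two idea, with hypothetical flag names. It also shows the stricter test that is needed if combinations such as a|b should not count as a match:

```cpp
// Hypothetical flag values, each a distinct power of two.
enum : unsigned { FLAG_A = 1u, FLAG_B = 2u, FLAG_C = 4u };

// True if `value` has at least one of the bits A, B or C set
// (combinations of the flags also match).
bool has_any_flag(unsigned value) {
    return (value & (FLAG_A | FLAG_B | FLAG_C)) != 0;
}

// True only if `value` is exactly one of A, B or C:
// non-zero, nothing outside the mask, and a single bit set.
bool is_exactly_one_flag(unsigned value) {
    const unsigned mask = FLAG_A | FLAG_B | FLAG_C;
    return value != 0 && (value & ~mask) == 0 && (value & (value - 1)) == 0;
}
```

The first test uses the function's return value once in a single expression; the second names it several times, so in practice it brings back the need for a variable or a function parameter.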
The C language does not have such constructs. You need to save the result of the function and/or both a and b.
Of course you can:
int compare(int a, int b, int f)
{
if(a == f || b == f) { puts("Success"); return 0;}
return -1;
}
int f(int x)
{
return x * x;
}
int main()
{
compare(5,8,f(3));
}
but of course it saves all the values as the function's parameters.

How to do static code logical analysis with AST tree or other tool?

void f1(char *s)
{
s[20] = 0;
}
void f2()
{
char a[10];
if (x + y == 2) {
f1(a);
}
}
Cppcheck will report this message:
Array 'a[10]' index 20 out of bounds
How could Cppcheck get the connection between ‘a’ in f2 and ‘s’ in f1?
I have built an AST, but it only supplies information about each symbol and gives me little information about the logical relationships between symbols.
How could a computer know that ‘a’ in f2 and ‘s’ in f1 are the same thing?
As far as I know, we have to take many situations into consideration, such as:
void f1(char *s)
{
char str_arry[30];
s= str_arry;
s[20] = 0;
}
In this case 's' and 'a' are not the same things.
I don't know how exactly Cppcheck works but I'll tell you how to solve this problem in general. There are two main approaches to the analysis of interrelated functions.
In the first case, when an analyzer meets a function call it starts analyzing the function's body, taking into account the values of the actual arguments passed to it. Naturally, this happens only if it is known which values are passed to the function. This can mean an exact value, a range, a set of values, a null/non-null pointer, etc. The complexity of the information passed depends on the sophistication of the analyzer. For example, it can start analyzing the function body knowing that two of the passed pointers refer to the same array.
It's an excellent, accurate approach, but it has a serious problem: analyzers based on this concept are very slow. They have to analyze function bodies with different sets of input data over and over again. The functions in turn call other functions, and so on. At some point the "inside" analysis has to be stopped, which in practice makes this approach less accurate and excellent than it might seem in theory.
There's a second approach. It's based on automatic function annotations. While analyzing a function, the analyzer gathers information on how its arguments are used and which values they must not take. Let's consider the simple example that I gave in the article called 'Technologies used in the PVS-Studio code analyzer for finding bugs and potential vulnerabilities'.
int Div(int X)
{
return 10 / X;
}
void Foo()
{
for (int i = 0; i < 5; ++i)
Div(i);
}
An analyzer recognizes that the X variable is used in the Div function as a divisor. Based on that, a special annotation for the Div function is created automatically. Then the analyzer takes into account the fact that a range of values [0..4] is passed to the function as the X argument, and concludes that a division by zero will occur.
This approach is cruder and less accurate than the first one, but it is very fast and makes it possible to establish strong correlations between a large number of functions with no loss of performance.
It can be much more complicated in practice. For example, the PVS-Studio analyzer uses the second approach as the main one but not always. Sometimes when dealing with template functions we analyze them once more (the first approach). In other words, we use a combined approach to maintain the balance between the depth and speed of analysis.
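The annotation idea from the Div example can be reduced to a toy model. Everything in this sketch (ParamAnnotation, ValueRange, may_divide_by_zero) is hypothetical and is not how PVS-Studio or any other real analyzer is implemented; it only shows that the check at the call site consults a stored summary instead of re-analyzing the callee:

```cpp
// Summary recorded once, when the callee's body is analyzed.
struct ParamAnnotation {
    bool used_as_divisor;  // e.g. true for X in `return 10 / X;`
};

// Closed integer range [lo, hi] of values an argument may take at a call site.
struct ValueRange {
    int lo;
    int hi;
};

// At a call site only the annotation is consulted; the callee body is not
// analyzed again. Reports a possible division by zero if 0 lies in the range.
bool may_divide_by_zero(const ParamAnnotation& param, const ValueRange& arg) {
    return param.used_as_divisor && arg.lo <= 0 && 0 <= arg.hi;
}
```

For the loop in Foo, the argument range would be {0, 4} and the annotation for X would have used_as_divisor set, so the check reports a problem.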
In order to analyze the possible sources of some value, it's a good idea to turn all variables into immutables by introducing a new symbol whenever the original is changed and using the new symbol for all following occurrences (the original symbol won't be used after the point where it was re-assigned in the original code).
Consider the following code:
// control flow block 1
int i = 1;
if (some_condition()) {
// control flow block 2
i = 2;
}
// control flow block 3
int j = i;
With the control flow graph
[1]
| \ <- if (some_condition())
| [2]
| / <- join of control flow after the if block ends
[3]
You could write a list of all symbols that are alive (have a value that is used anywhere later in the control flow graph) at the entry and exit point of a block in the control flow graph:
[1] entry: nothing; exit: i
[2] entry: nothing; exit: i
[3] entry: i; exit: i, j (I assume i, j are re-used after the end of this example)
Notice that the entry list of [2] is empty, since i is never read and always written within block [2]. The problem with this representation is that i is in the exit list of all blocks, but it has different possible values for each block.
So, let's introduce the immutable symbols in pseudo-code:
// control flow block 1
i = 1;
if (some_condition()) {
// control flow block 2
i_1 = 2;
}
// control flow block 3
// join-logic of predecessor [1] and [2]
i_2 = one_of(i, i_1);
j = i_2;
Now every variable is coupled to exactly one assignment, its first and only one. This means a dependency graph can be constructed by analyzing the symbols that are involved in each assignment:
i -> i_2
i_1 -> i_2
i_2 -> j
Now in case there is any constraint on the allowed value of j, a static checker could require that all predecessors of j (namely i_2, in turn originating from i and i_1), satisfy this requirement.
In case of function calls, the dependency graph would contain an edge from every calling argument to the corresponding parameter in the function definition.
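Once the edges are known, finding everything a symbol depends on is a plain graph traversal. The following is a minimal sketch under that assumption; the Edge alias and the origins_of function are invented for this illustration and do not describe how any particular checker stores its graph:

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// An edge points from a source symbol to the symbol assigned from it,
// e.g. {"i", "i_2"} for `i_2 = one_of(i, i_1)`.
using Edge = std::pair<std::string, std::string>;

// Returns every symbol that `target` transitively depends on.
std::set<std::string> origins_of(const std::vector<Edge>& edges,
                                 const std::string& target) {
    // Reverse adjacency: symbol -> its direct predecessors.
    std::map<std::string, std::vector<std::string>> preds;
    for (const Edge& e : edges) {
        preds[e.second].push_back(e.first);
    }

    std::set<std::string> seen;
    std::vector<std::string> stack{target};
    while (!stack.empty()) {
        std::string current = stack.back();
        stack.pop_back();
        for (const std::string& p : preds[current]) {
            if (seen.insert(p).second) {  // first visit only
                stack.push_back(p);
            }
        }
    }
    return seen;
}
```

With the edges i -> i_2, i_1 -> i_2 and i_2 -> j from the example, the origins of j are i, i_1 and i_2, so a constraint on j has to hold for all three.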
Applying this to your example is straightforward if we only focus on the array variable and ignore changes to the array content (I'm not quite sure to what extent a static checker would track the content of individual array items in order to find danger down the road):
Example 1:
void f1(char *s)
{
s[20] = 0;
}
void f2()
{
char a[10];
if (x + y == 2) {
f1(a);
}
}
Transforms to
f1(s)
{
s[20] = 0;
}
f2()
{
a = char[10];
if (x + y == 2) {
call f1(a);
}
}
With dependency graph including the passed arguments via function call
a -> s
So it's immediately clear that a has to be considered for the static analysis of the safety of s[20].
Example 2:
void f1(char *s)
{
char str_arry[30];
s= str_arry;
s[20] = 0;
}
Transforms to
f1(s)
{
// control flow block 1
str_arry = char[30];
s_1 = str_arry;
s_1[20] = 0;
}
With dependency graph
str_arry -> s_1
So it's immediately clear that the only value to be considered for the static analysis of the safety of s_1[20] is str_arry.
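To make the two examples concrete, here is a toy sketch of the check itself, assuming the only fact tracked per symbol is the size of the array it refers to. struct buf_info, propagate and index_is_safe are hypothetical names; this is not how Cppcheck is implemented:

```c
#include <assert.h>
#include <stddef.h>

/* What a toy checker might remember about a pointer-like symbol. */
struct buf_info {
    size_t size;   /* number of elements in the array the symbol refers to */
};

/* Follow a dependency edge "src -> dst": dst now refers to the same
 * storage as src, so it inherits src's size. */
static void propagate(const struct buf_info *src, struct buf_info *dst)
{
    assert(src != NULL && dst != NULL);
    dst->size = src->size;
}

/* Return 1 if accessing element `index` stays inside the buffer. */
static int index_is_safe(const struct buf_info *sym, size_t index)
{
    return index < sym->size;
}
```

With a of size 10 propagated to s, index_is_safe(&s, 20) reports the access in Example 1 as unsafe. With str_arry of size 30 propagated to s_1, the same index in Example 2 is accepted.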
How could Cppcheck get the connection between ‘a’ in f2 and ‘s’ in f1?
They are definitely not the same. One of the following can happen:
You pass a to the function, and Cppcheck continues to remember the size of a, even though you access it through the formal parameter s.
You have to keep in mind that static analysis tools and compilers work differently, with different purposes in mind. Static analysis tools were created EXACTLY for the purpose of catching things like the ones you presented in your question.
In your second example you have:
s= str_arry;
which removes the connection between s and a.

Applying an expression depending on a condition in a loop

I guess this question surely has been asked several times, but I couldn't find anything.
I have a function that depends on an argument (caseset) which can be of two different kinds. Depending on its kind, I need to perform one operation or another inside a loop. Since the kind of the object is known at the beginning, it seems inefficient and inelegant to evaluate an if statement on every iteration. Ideally, I'd choose the right expression once, above the loop, and apply it each time. Here is some code to give an idea of what I'm after.
SEXP doSomething(SEXP anObject, SEXP caseset, SEXP isMat) {
/*
* here anObject is an external pointer to a C structure,
* caseset is either a character matrix or a data.frame made of character columns.
*/
int i,j,nrow,ncol;
int isMatrix = LOGICAL(isMat)[0];
const char *field;
/*
* Determine the number of rows and columns in each case
*/
if (isMatrix) {
ncol = length(VECTOR_ELT(getAttrib(caseset,R_DimNamesSymbol),1));
nrow = length(caseset)/ncol;
} else {
ncol = length(caseset);
nrow = length(VECTOR_ELT(caseset,0));
}
for (i=0;i<nrow;i++) {
for (j=0;j<ncol;j++) {
if (isMatrix) {
field = CHAR(STRING_ELT(caseset,j*nrow+i));
} else {
field = CHAR(STRING_ELT(VECTOR_ELT(caseset,j),i));
}
/*
* Do stuff involving field and anObject
*/
}
}
return result;
}
I'm writing a C function callable from R, and I'm passing R objects (the SEXP types). The caseset object can be either a matrix or a data.frame. I'm processing one row at a time, and since the two objects store their elements very differently, getting the (i,j) value of the table requires different indexing. Note the if condition evaluated on every iteration (for any given call of doSomething it always has the same result). The rest of the function is pretty long.
I can certainly:
move the if condition outside the loop and rewrite two identical blocks of code (except for one line) depending on the value of isMatrix;
write two almost identical functions and "dispatch" the right one depending on the nature of caseset.
However, both of the above options look inelegant to me. I'd prefer something that lets me apply the right line in the loop without checking the condition each time and without writing the code twice.
C is not exactly well known for elegance. Other languages might allow you to use some sort of iterator perhaps. Checking isMatrix twice is not bad. But of course you might need to check more times or maybe support more types.
Consider using two internal functions based on isMat:
SEXP doSomething(SEXP anObject, SEXP caseset, SEXP isMat) {
/*
* here anObject is an external pointer to a C structure,
* caseset is either a character matrix or a data.frame made of character columns.
*/
return LOGICAL(isMat)[0] ? doSomethingMatrix(anObject,caseset) : doSomethingFrame(anObject,caseset);
}
static SEXP doSomethingMatrix(SEXP anObject, SEXP caseset) {
int i,j,nrow,ncol;
const char *field;
ncol = length(VECTOR_ELT(getAttrib(caseset,R_DimNamesSymbol),1));
nrow = length(caseset)/ncol;
for (i=0;i<nrow;i++) {
for (j=0;j<ncol;j++) {
field = CHAR(STRING_ELT(caseset,j*nrow+i));
// Share the long processing code between the two functions
doStuffField(anObject,field);
}
}
return result;
}
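A related option keeps a single copy of the loop and picks an accessor function once, before the loop starts. The sketch below is a self-contained analogue in plain C, not R API code: struct table, field_getter and first_chars are hypothetical stand-ins for caseset and the real processing, and the per-field work is reduced to copying the first character.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for caseset: either one column-major block of
 * strings (like an R matrix) or an array of columns (like a data.frame). */
struct table {
    const char **flat;       /* matrix layout: (i,j) lives at flat[j*nrow+i] */
    const char ***columns;   /* frame layout:  (i,j) lives at columns[j][i]  */
    size_t nrow, ncol;
};

typedef const char *(*field_getter)(const struct table *, size_t, size_t);

static const char *get_from_matrix(const struct table *t, size_t i, size_t j)
{
    return t->flat[j * t->nrow + i];
}

static const char *get_from_frame(const struct table *t, size_t i, size_t j)
{
    return t->columns[j][i];
}

/* Visit all fields row by row and store the first character of each in
 * out, which must have room for nrow*ncol + 1 bytes. */
static void first_chars(const struct table *t, int is_matrix, char *out)
{
    /* The only place where the kind of table is tested. */
    field_getter get = is_matrix ? get_from_matrix : get_from_frame;
    size_t k = 0;

    assert(t != NULL && out != NULL);
    for (size_t i = 0; i < t->nrow; i++) {
        for (size_t j = 0; j < t->ncol; j++) {
            const char *field = get(t, i, j);
            out[k++] = field[0];   /* placeholder for the real work on field */
        }
    }
    out[k] = '\0';
}
```

Whether this is faster than the original if is not guaranteed: an indirect call per element can cost as much as a well-predicted branch, so the gain is mostly in readability.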
You could use an array of function pointers.
For example, to program a calculator, instead of:
if (c == '+')
return (a + b);
else if (c == '-')
return (a - b);
...
You could do something like:
char op[] = {'+', '-', '/', '*', '%', /* whatever you want, */ '\0'};
int i;
for (i = 0; op[i] && op[i] != c; i++)
; /* empty body: the loop only finds the index of c */
if (op[i])
return (my_function_ptr[i](a, b));
And it would call the function number 'i' in the array.
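Here is a complete version of that idea as a sketch. The names binary_op, op_chars, op_funcs and calculate are illustrative, and only four operators are filled in:

```c
#include <assert.h>
#include <stddef.h>

typedef int (*binary_op)(int, int);

static int op_add(int a, int b) { return a + b; }
static int op_sub(int a, int b) { return a - b; }
static int op_mul(int a, int b) { return a * b; }
static int op_div(int a, int b) { return a / b; }   /* caller must ensure b != 0 */

/* Operator characters and their handlers, kept in matching order.
 * The '\0' entry terminates the search. */
static const char op_chars[] = { '+', '-', '*', '/', '\0' };
static const binary_op op_funcs[] = { op_add, op_sub, op_mul, op_div };

/* Apply operator c to a and b. Returns 1 and stores the value in *result
 * on success, or returns 0 if c is not a known operator. */
static int calculate(char c, int a, int b, int *result)
{
    assert(result != NULL);
    for (size_t i = 0; op_chars[i] != '\0'; i++) {
        if (op_chars[i] == c) {
            *result = op_funcs[i](a, b);
            return 1;
        }
    }
    return 0;
}
```

Adding an operator then means adding one character and one function to the two tables, with no change to calculate itself.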

Is it possible to use a for loop to change a variable name in C?

This is a generic question, so there is no actual code that I am trying to troubleshoot. But what I want to know is, can I use a for loop to change the name of a variable in C? For instance, if I have part1, part2, part3, part..., as my variable names; is there a way to attach it to my loop counter so that it will increment with each passing? I toyed around with some things, nothing seemed to work.
In C, you can't 'change the name of the loop variable' but your loop variable does not have to be determined at compile time as a single variable.
For instance, there is no reason in C why you can't do this:
int i[10];
int j;
j = /* something */;
for (i[j] = 0 ; i[j] < 123 ; i[j]++)
{
...
}
or even supply a pointer
void
somefunc(int *i)
{
/* (*i)++ increments the pointed-to int; *i++ would advance the pointer instead */
for (*i = 0; *i < 10; (*i)++)
{
...
}
}
It's not obvious why you want to do this, which means it's hard to post more useful examples, but here's an example that uses recursion to iterate a definable number of levels deep and pass the innermost function all the counter variables:
void
recurse (int levels, int level, int max, int *counters)
{
if (level < levels)
{
for (counters[level] = 0;
counters[level] < max;
counters[level]++)
{
recurse (levels, level+1, max, counters);
}
return;
}
/* compute something using counters[0] .. counters[levels-1] */
/* each of which will have a value 0 .. max-1 */
}
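As a usage sketch, the same function can be driven as below. The "compute something" step is replaced by a counter so the effect is visible; visits, MAX_LEVELS and count_iterations are hypothetical names added for the illustration:

```c
#include <assert.h>

#define MAX_LEVELS 8   /* arbitrary upper bound chosen for this sketch */

/* Stand-in for "compute something": count how often the innermost
 * level is reached. */
static long visits;

static void recurse(int levels, int level, int max, int *counters)
{
    if (level < levels) {
        for (counters[level] = 0; counters[level] < max; counters[level]++)
            recurse(levels, level + 1, max, counters);
        return;
    }
    visits++;   /* counters[0] .. counters[levels-1] are all valid here */
}

/* Run `levels` nested loops of `max` iterations each and return how many
 * times the innermost body executed. */
static long count_iterations(int levels, int max)
{
    int counters[MAX_LEVELS];

    assert(levels >= 0 && levels <= MAX_LEVELS);
    visits = 0;
    recurse(levels, 0, max, counters);
    return visits;
}
```

The innermost body runs max to the power of levels times, so count_iterations(2, 3) behaves like two nested loops of three iterations each.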
Also note that in C, there is really no such thing as a loop variable. In a for statement, the form is:
for ( A ; B ; C ) BODY
Expression A gets evaluated once at the start. Expression B is evaluated prior to each execution of BODY and the loop statement will terminate (and not execute BODY) if it evaluates to 0. Expression C is evaluated after each execution of BODY. So you can if you like write:
int a;
int b = /* something */;
int c = /* something */;
for ( a=0; b<5 ; c++ ) { ... }
though it will not usually be a good idea.
The answer is, as #user2682768 correctly remarked, an array. I am not sure whether you are aware of that and consciously do not want to use an array for some reason; your little experience doesn't give me enough information. If so, please bear with me.
But you'll recognize the structural similarity between part1, part2, part3... and part[1], part[2], part[3]. The difference is that the subscript of an array is variable and can be changed programmatically, while the subscript part of a variable name cannot because it is burned in at compile time. (Using macros introduces a meta compiling stage which lets you programmatically change the source before actually compiling it, but that's a different matter.)
So let's compare code. Say you want to store the square of a value in a variable whose name has the value as a suffix. You would like to do something like
int square1, square2, square3;
int i;
for(i=1; i<=3; i++)
{
square/i/ = i*i; /* /i/ to be replaced by suffix "i". */
}
With arrays, that changes to
int square[4];
int i;
for(i=1; i<=3; i++)
{
/* the (value of) i is now used as an index in the array.*/
square[i] = i*i;
}
Your idea to change the variable name programmatically implies that all variables have the same type (because they would have to work in the same piece of code, like in my example). This requirement makes them ideally suited for array elements which all have to be of the same type. If that is too restrictive, you need to do something fancier, like using unions (but how do you know what's in it at any given moment? It's almost as if you had different variables to begin with), void pointers to untyped storage or C++ with templates.
In C you cannot append an expression that expands to a number to a variable name and use it as a sort of suffix to access different variables whose names begin in the same way.
The closest you can get is to "emulate" this behaviour using a switch construct, but there wouldn't be much point in trying to do this.
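For completeness, such a switch-based emulation might look like the sketch below. part_by_number and fill_parts are hypothetical helpers, and the values stored are arbitrary:

```c
#include <assert.h>
#include <stddef.h>

static int part1, part2, part3;

/* Map a runtime number to the variable with that suffix.
 * Returns NULL if no such variable exists. */
static int *part_by_number(int n)
{
    switch (n) {
    case 1:  return &part1;
    case 2:  return &part2;
    case 3:  return &part3;
    default: return NULL;
    }
}

/* Store i*i in "part<i>" for i = 1..3. */
static void fill_parts(void)
{
    for (int i = 1; i <= 3; i++) {
        int *p = part_by_number(i);
        assert(p != NULL);
        *p = i * i;
    }
}
```

Every new variable needs a new case, which is exactly the bookkeeping an array does for you.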
What you asked for is more suited to scripting languages.

Dynamic if then statements for simple arithmetic and resetting a counter

I write a lot of code like this
int x = 0;
//in some loop
x++;
if (x > 10) {
x = 0;
//call some function
}
How could I extract this into a function, or a macro, where I can combine the decrementing (or incrementing), the value to exceed (or go under), the reset value (0 in this case) and the function call?
EDIT:
I'd love to do something like (pseudocode)
cycle_counter(x, ++, <10, 0, myfunc);
#define SATURATE_ADD(x, max, inc, reset) ((x) = (x) > (max) ? (reset) : (x) + (inc))
and in your example:
int x = 0;
SATURATE_ADD(x, 10, 1, 0);
And if you want to call some function, just add a function parameter that you will call in your macro.
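A sketch of such a macro with the function parameter added, following the increment-then-test order of the code in the question. CYCLE_COUNTER, on_wrap and run_cycles are made-up names:

```c
#include <assert.h>

/* Step x by inc; once it passes max, reset it and call fn().
 * The arguments are evaluated more than once, so do not pass
 * expressions with side effects. */
#define CYCLE_COUNTER(x, inc, max, reset, fn) \
    do {                                      \
        (x) += (inc);                         \
        if ((x) > (max)) {                    \
            (x) = (reset);                    \
            (fn)();                           \
        }                                     \
    } while (0)

/* Example callback: just counts how often it was called. */
static int calls;
static void on_wrap(void) { calls++; }

/* Run the counter for `steps` iterations and return the number of
 * callback invocations. */
static int run_cycles(int steps)
{
    int x = 0;

    assert(steps >= 0);
    calls = 0;
    for (int i = 0; i < steps; i++)
        CYCLE_COUNTER(x, 1, 10, 0, on_wrap);
    return calls;
}
```

The do { ... } while (0) wrapper lets the macro be used like a single statement, including as the body of a loop or an if without braces.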
I tried to write a function, similar to the one in your pseudocode, that solves the problem:
void cycle_counter(int *px, char operatorr, int max, int init, void (*p)())
{
*px=init;
//in some loop
operatorr?(*px)++:(*px)--; //if the operator is 0 we use -- otherwise ++
((*px)>max)?((*px)=0,p()):(void)0; // This line replaces the if statement in your
// code and the callback; (void)0 keeps both branches of type void
}
Here px is a pointer to x, so the function can modify the caller's counter.
operatorr acts as a boolean: 0 selects --, any other value selects ++.
max is the integer to compare against; in your example it is 10.
init is the initial value of the integer pointed to by px.
Finally, p is a pointer to the function to call.
To call this function from main, you would write
cycle_counter(px, 1, 10, 0, myfunc);
where px is
int x;
int *px=&x;
Some cases are not handled, but I've just tried to cover the difficult part, which is the pointer to function. The remaining part (comparisons other than >, such as >=) is easy: you can make the ternary operator even more complicated to handle all the cases, but you will probably end up with a lovely line that could be considered obfuscated ;)
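If readability matters more than brevity, one alternative is to keep the counter state in a small struct so that it is not reinitialised on every call. This is only a sketch under that assumption; struct cycle_counter, cycle_counter_tick and note_wrap are hypothetical names, and it handles only the strict comparisons > (counting up) and < (counting down):

```c
#include <assert.h>
#include <stddef.h>

struct cycle_counter {
    int value;               /* current count */
    int step;                /* +1 to count up, -1 to count down */
    int limit;               /* value to exceed (or go under) */
    int reset;               /* value restored after a wrap */
    void (*on_wrap)(void);   /* called on each wrap, may be NULL */
};

/* Example callback: counts how many wraps happened. */
static int wraps;
static void note_wrap(void) { wraps++; }

/* Advance the counter by one step. Returns 1 if it wrapped, else 0. */
static int cycle_counter_tick(struct cycle_counter *c)
{
    int passed;

    assert(c != NULL && c->step != 0);
    c->value += c->step;
    passed = c->step > 0 ? c->value > c->limit : c->value < c->limit;
    if (!passed)
        return 0;
    c->value = c->reset;
    if (c->on_wrap != NULL)
        c->on_wrap();
    return 1;
}
```

A counter initialised as { 0, 1, 10, 0, note_wrap } reproduces the loop from the question: the callback fires on the eleventh tick and the value goes back to 0.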

Resources