Any good use for nested function in C? - c

I read somewhere that nested functions are permissible in C (at least the GNU compiler allows it). Consider the following code:
/* nestedfunc.c */
#include <stdlib.h> /* for atoi(3) */
#include <stdio.h>
int F (int q)
{
int G (int r)
{
return (q + r);
}
return (G (5));
}
int main (int argc, const char* argv[])
{
int q = 0;
if (argc > 1)
{
q = atoi (argv[1]);
}
printf ("%d\n", F (q));
return 0;
}
Compiling and running:
gcc -o nestedfunc -O2 -s -Wall nestedfunc.c
me#mybox:~/college/c++/other stuff$ ./nestedfunc 8
13
me#mybox:~/college/c++/other stuff$
I've also read that some other programming languages support these. My question is this: What useful purpose does a nested function have? Thanks in advance.

Nested functions can access the outer function's locals. Somewhat like closures, you can take a pointer to a nested function and pass this pointer to other functions, and the nested function will have access to the current invocation's locals (bad things happen if this invocation has already returned). The C runtime system is not designed for this: a function pointer is generally just a pointer to the first instruction of the function, so a pointer to a nested function can only be implemented by writing some code (a trampoline) onto the stack and passing a pointer to that. This is a bad idea from a security perspective, because it requires the stack to be executable.
If you want to use nested functions, use a language with a runtime system with proper support for them. To achieve a similar result in C, put "the outer function's locals" in a context structure and pass this along manually.

Nested functions provide encapsulation through lexical scope.
In your example, G() can only be called by F() and by other functions defined within F().

In general, a nested function is usually a helper function, which is only used inside one other function. It's sort of like a static function (with file scope) but even more localised (only function scope).
Nested functions are not standard C, so there's little reason to use them in that language.

In other programming languages (Python and Ruby, for example), functions are first-class objects. You have closures, which are a powerful abstraction. In Python you can do this:
def curry(func):
    from inspect import getfullargspec
    args = getfullargspec(func)
    num_args = len(args[0])
    def new_func(list_args, *args):
        l = len(list_args) + len(args)
        nl = list_args + list(args)
        if l > num_args:
            raise TypeError("Too many arguments to function")
        elif l == num_args:
            return func(*nl)
        else:
            return lambda *new_args: new_func(nl, *new_args)
    return lambda *args: new_func([], *args)
This is a curry decorator: it takes a function and returns a curried version of it.

Related

How to find all memory accesses (global,local) by each function in a given C code?

Given a C code and a variable in the C code (global or a local variable of a function), is there a way to find the functions which uses this variable? This should also show the accesses to the variable by a function if it is also accessed through a pointer.
Tried to extract info using LLVM IR but seems difficult.
int a = 2;
int array1[] = {1,2,3};
int function1(int c, int d) {
return c + d;
}
int function2 (int arg1[], int * p1, int *p2) {
int a;
return arg1[2]+ (*p1) +a + (*p2);
}
int main() {
int e =2, f=3,g;
g = function1(e,f);
int array2[] = {1,2,3,4};
g = function2(array1,&e,array2);
return 0;
}
Variables and the functions which use them:
globals:
a - none,
array1 - function2, main
local variables :
function2:a - function2,
main:e - main, function2,
main:f - main,
main:g - main,
main:array2 - main,function2
is there a way to find the functions which uses this variable
Your best shot is to use an IDE; most of them can trace references to global variables.
Alternatively, you can use a static analysis tool such as cxref (the one matching https://linux.die.net/man/1/cxref). I used it a long time ago, and it was useful. There is also a documentation tool with the same name, which might work.
As a last resort, if you have no other choice, comment out the variable declaration and try building the code. The compiler will raise an error on every reference that no longer resolves. (Minor exception: a locally scoped variable that hides a global definition will not raise an error.)
show the accesses to the variable by a function if it is also accessed
through a pointer.
This is extremely hard (impossible for real programs) with static analysis. Usually, this is done at runtime. Some debuggers (e.g. gdb watch) allow you to identify when a variable is being modified (including via pointers). With hardware support it is also possible to set 'read watch' in gdb. See gdb rwatch, and Can I set a breakpoint on 'memory access' in GDB?

Call a function without argument, although it needs one [K&R-C]

It's K&R-C and here is the code: http://v6shell.org/history/if.c
Look at the main-Method. There is this line "if(exp())".
But the function exp is declared as: exp(s). So it needs an argument.
Why does this work? And why would you do this?
Ultimately, it is a bug in the Unix V6 shell support command, if.
The code for the function is:
exp(s) {
int p1;
p1 = e1();
if (eq(nxtarg(), "-o")) return(p1 | exp());
ap--;
return(p1);
}
The function parameter s, implicitly of type int (as is the function itself), is actually unused in the function, so the bug is the presence of s, not the absence of an argument in the calls to exp(). All the calls with zero actual arguments are correct.
If you look at the definition:
exp(s) {
int p1;
p1 = e1();
if (eq(nxtarg(), "-o")) return(p1 | exp());
ap--;
return(p1);
}
s is not used
K&R C does not check function calls at compile time: without a prototype, the compiler verifies neither the number nor the types of the arguments. Everything is just a series of bytes.
Why does C not do any of those checks? From what I hear, it's because computers were fairly weak during the first few years of C. Doing those checks would require multiple passes over the source code, which multiplies compile time by the number of passes. So the compiler does a single pass and takes every name as is, which is also why function overloading is not supported.
So if the definition did make use of s in some way, you would most likely get some horrible runtime error, or garbage output on the console.
This is because in C, when a function is declared without a prototype,
the compiler is not able to perform compile-time checking of argument types and arity when the function is applied to some arguments. This can cause problems.
Calling an undeclared function is poor style in C (See this) and illegal in C++.
For example:
#include<stdio.h>
void f(); /* no prototype: arguments are not checked (pre-C23) */
int main()
{
f(); /* Poor style in C, invalid in C++*/
getchar();
return 0;
}
void f(int x)
{
printf("%d", x);
}
This program compiles as pre-C23 C, but the call has undefined behavior because f reads an argument that was never passed, so it shouldn't be used. See this Wiki link for more.

Closure in C - Does this work?

I am starting to learn functional programming and wanted to see if I could get away with closures in C. In order to reproduce first example from Wikipedia - Closures I coded up the following code:
#include <stdio.h>
void closure (int(** f)(int), int *x) {
int fcn(int y) {
return *x + y;
};
*f = fcn;
}
int main()
{
int x = 1;
int(* f)(int);
closure(&f, &x);
printf("%d", f(2));
return 0;
}
It was compiled (gcc 4.8.2. on Ubuntu 14.04.) and it works, it prints out 3. Since the lack of my expertise in C (only basic courses on college), my question is, is there something seriously wrong with this code? I was taught function definitions should be global and I was never expecting this to work...
Edit:
And why is it, when I change the main function like this:
int main()
{
int x = 1;
int(* f)(int);
closure(&f, &x);
printf("%d", f(2));
printf("%d", f(3)); // the only difference
return 0;
}
I get Segmentation fault?
Your code works because gcc has a language extension that supports nested functions.
However, in standard C, you cannot define a function within another function.
Gnu Nested Functions
If you try to call the nested function through its address after the containing function exits, all hell breaks loose. If you try to call it after a containing scope level exits, and if it refers to some of the variables that are no longer in scope, you may be lucky, but it's not wise to take the risk. If, however, the nested function does not refer to anything that has gone out of scope, you should be safe.
First of all, your program invokes undefined behaviour. That's because the function fcn defined in the function closure is local to it. fcn no longer exists once closure returns. So the printf call
printf("%d", f(2));
invokes undefined behaviour by calling f(2) because f points to the function fcn which is not in scope.
The C language does not have closures, because functions in C are not first-class objects: a function itself cannot be passed to another function or returned from one. What actually gets passed or returned is a pointer to it.
Please note that there is a difference between nested functions and closure. What you see is a GCC extension for nested function. It is not part of the C standard.

Benefits of pure function

Today I was reading about pure functions and got confused about their use:
A function is said to be pure if it returns same set of values for same set of inputs and does not have any observable side effects.
e.g. strlen() is a pure function while rand() is an impure one.
#include <stdio.h>
__attribute__ ((pure)) int fun(int i)
{
return i*i;
}
int main()
{
int i=10;
printf("%d",fun(i));//outputs 100
return 0;
}
http://ideone.com/33XJU
The above program behaves in the same way as in the absence of pure declaration.
What are the benefits of declaring a function as pure[if there is no change in output]?
pure lets the compiler know that it can make certain optimisations about the function: imagine a bit of code like
for (int i = 0; i < 1000; i++)
{
printf("%d", fun(10));
}
With a pure function, the compiler can know that it needs to evaluate fun(10) once and once only, rather than 1000 times. For a complex function, that's a big win.
When you say a function is 'pure' you are guaranteeing that it has no externally visible side-effects (and as a comment says, if you lie, bad things can happen). Knowing that a function is 'pure' has benefits for the compiler, which can use this knowledge to do certain optimizations.
Here is what the GCC documentation says about the pure attribute:
pure
Many functions have no effects except the return value and their return
value depends only on the parameters and/or global variables.
Such a function can be subject to common subexpression elimination and
loop optimization just as an arithmetic operator would be. These
functions should be declared with the attribute pure. For example,
int square (int) __attribute__ ((pure));
Philip's answer already shows how knowing a function is 'pure' can help with loop optimizations.
Here is one for common sub-expression elimination (given foo is pure):
a = foo (99) * x + y;
b = foo (99) * x + z;
Can become:
_tmp = foo (99) * x;
a = _tmp + y;
b = _tmp + z;
In addition to possible run-time benefits, a pure function is much easier to reason about when reading code. Furthermore, it's much easier to test a pure function since you know that the return value only depends on the values of the parameters.
A non-pure function
int foo(int x, int y) // possible side-effects
is like an extension of a pure function
int bar(int x, int y) // guaranteed no side-effects
in which you have, besides the explicit function arguments x, y,
the rest of the universe (or anything your computer can communicate with) as an implicit potential input. Likewise, besides the explicit integer return value, anything your computer can write to is implicitly part of the return value.
It should be clear why it is much easier to reason about a pure function than a non-pure one.
Just as an add-on, I would like to mention that C++11 codifies things somewhat using the constexpr keyword. Example:
#include <iostream>
#include <cstring>
constexpr unsigned static_strlen(const char * str, unsigned offset = 0) {
return (*str == '\0') ? offset : static_strlen(str + 1, offset + 1);
}
constexpr const char * str = "asdfjkl;";
constexpr unsigned len = static_strlen(str); //MUST be evaluated at compile time
//so, for example, this: int arr[len]; is legal, as len is a constant.
int main() {
std::cout << len << std::endl << std::strlen(str) << std::endl;
return 0;
}
The restrictions on the usage of constexpr make it so that the function is provably pure. This way, the compiler can more aggressively optimize (just make sure you use tail recursion, please!) and evaluate the function at compile time instead of run time.
So, to answer your question: if you're using C++ (I know you said C, but they are related), writing a pure function in the correct style allows the compiler to do all sorts of cool things with the function :-)
In general, pure functions have three advantages over impure functions that the compiler can exploit:
Caching
Let's say that you have a pure function f that is called 100000 times with the same arguments. Since it is deterministic and depends only on its parameters, the compiler can calculate its value once and reuse it wherever necessary.
Parallelism
Pure functions don't read or write any shared memory, and can therefore run in separate threads without any unexpected consequences.
Passing By Reference
A function f(struct t) gets its argument t by value. If f is declared pure, the compiler can instead pass t by reference, which is cheaper, while still guaranteeing that the value of t will not change.
In addition to the compile-time considerations, pure functions can be tested fairly easily: just call them.
No need to construct objects or mock connections to DBs / file system.

dlsym/dlopen with runtime arguments

I am trying to do something like the following
enum types {None, Bool, Short, Char, Integer, Double, Long, Ptr};
int main(int argc, char ** args) {
enum types params[10] = {0};
void* triangle = dlopen("./foo.so", RTLD_LAZY);
void * fun = dlsym(triangle, args[1]);
<<pseudo code>>
}
Where pseudo code is something like
fun = {}
for param in params:
if param == None:
fun += void
if param == Bool:
fun += Boolean
if param == Integer:
fun += int
...
returnVal = fun.pop()
funSignature = returnval + " " + funName + "(" + Riffle(fun, ",") + ")"
exec funSignature
Thank you
Actually, you can do nearly all you want. In C language (unlike C++, for example), the functions in shared objects are referenced merely by their names. So, to find--and, what is most important, to call--the proper function, you don't need its full signature. You only need its name! It's both an advantage and disadvantage --but that's the nature of a language you chose.
Let me demonstrate, how it works.
#include <dlfcn.h>
typedef void* (*arbitrary)();
// do not mix this with typedef void* (*arbitrary)(void); !!!
int main()
{
arbitrary my_function;
// Introduce already loaded functions to runtime linker's space
void* handle = dlopen(0,RTLD_NOW|RTLD_GLOBAL);
// Load the function to our pointer, which doesn't know how many arguments there should be
*(void**)(&my_function) = dlsym(handle,"something");
// Call something via my_function
(void) my_function("I accept a string and an integer!\n",(int)(2*2));
return 0;
}
In fact, you can call any function that way. However, there's one drawback. You actually need to know the return type of your function at compile time. By default, if you omit void* in that typedef, int is assumed as the return type--and, yes, that is valid C89 (implicit int was removed in C99). The thing is that the compiler needs to know the size of the return type to operate the stack properly.
You can work around it with tricks, for example by pre-declaring several function types with different return types in advance and then selecting which one you are actually going to call. But the easier solution is to require functions in your plugin to always return void* or int, with the actual result being returned via pointers given as arguments.
What you must ensure is that you always call the function with the exact number and types of arguments it's supposed to accept. Pay closer attention to difference between different integer types (your best option would be to explicitly cast arguments to them).
Several commenters reported that the code above is not guaranteed to work for variadic functions (such as printf).
What dlsym() returns is normally a function pointer - disguised as a void *. (If you ask it for the name of a global variable, it will return you a pointer to that global variable, too.)
You then invoke that function just as you might using any other pointer to function:
int (*fun)(int, char *) = (int (*)(int, char *))dlsym(triangle, "function");
(*fun)(1, "abc"); /* Old school - pre-C89 standard, but explicit */
fun(1, "abc"); /* New school - C89/C99 standard, but implicit */
I'm old school; I prefer the explicit notation so that the reader knows that 'fun' is a pointer to a function without needing to see its declaration. With the new school notation, you have to remember to look for a variable 'fun' before trying to find a function called 'fun()'.
Note that you cannot build the function call dynamically as you are doing - or, not in general. To do that requires a lot more work. You have to know ahead of time what the function pointer expects in the way of arguments and what it returns and how to interpret it all.
Systems that manage more dynamic function calls, such as Perl, have special rules about how functions are called and arguments are passed and do not call (arguably cannot call) functions with arbitrary signatures. They can only call functions with signatures that are known about in advance. One mechanism (not used by Perl) is to push the arguments onto a stack, and then call a function that knows how to collect values off the stack. But even if that called function manipulates those values and then calls an arbitrary other function, that called function provides the correct calling sequence for the arbitrary other function.
Reflection in C is hard - very hard. It is not undoable - but it requires infrastructure to support it and discipline to use it, and it can only call functions that support the infrastructure's rules.
The Proper Solution
Assuming you're writing the shared libraries; the best solution I've found to this problem is strictly defining and controlling what functions are dynamically linked by:
Setting all symbols hidden
for example clang -dynamiclib Person.c -fvisibility=hidden -o libPerson.dylib when compiling with clang
Then using __attribute__((visibility("default"))) and extern "C" to selectively unhide and include functions
Profit! You know what the function's signature is. You wrote it!
I found this in Apple's Dynamic Library Design Guidelines. These docs also include other solutions to the problem; the above was just my favorite.
The Answer to your Question
As stated in previous answers, C functions, and C++ functions declared extern "C", aren't mangled, so their symbols simply don't include the function signature. If you're compiling C++ without extern "C", however, functions are mangled, so you could demangle them to get the full function's signature (with a tool like demangler.com or a C++ library). See here for more details on what mangling is.
Generally speaking it's best to use the first option if you're trying to import functions with dlopen.

Resources