Technically, can all functions be a void function?

Technically, can all functions be a void function? - c

For example:
int f1() {
return 3;
}
void f2(int *num) {
*num = 3;
}
int n1, n2;
n1 = f1();
f2(&n2);
With f1, we can return a value and do "variable=f1()"
But the same can be done with a void function that updates the value of that variable given its address without having to do "variable=f1()".
So, does this mean that we can actually just use void functions for everything? Or is there something that a void function cannot do to replace another int function/(type) function?

The main problem with making everything a void function (which in some people's lexicon is called a "routine") is that you can't chain them easily:
f(g(x))
becomes, if you really want to chain it:
int gout;
f((g(x, &gout), gout))
Which is painful.

Yes you could use void return types for everything and rely exclusively on returning via modified parameters. In fact, you could avoid using functions entirely and put everything in your main method.
As with any other feature of the language, return values give you particular advantages, and its up to you to decide if you want them. Here are some advantages of return values off the top of my head:
Returned values can be assigned to const variables, which can make your code easier to reason about
Certain types of optimisation can be applied by the compiler for returned values (this is more applicable to C++ RVO but may also apply to C's structs; I'm not sure)
Code which uses returned values is often easier to read, especially when the functions are mathematical (e.g. imagine having to declare all the temporaries manually for a large mathematical operation using sin/cos/etc. if they required the output to be via parameters). Compare:
double x = A*sin(a) + B*cos(b);
with
double tmpA, tmpB;
sin(&tmpA, a);
cos(&tmpB, b);
double x = A * tmpA + B * tmpB;
or to use a similar structure as John Zwinck suggested in his answer:
double tmpA, tmpB;
double x = A * (sin(&tmpA, a), tmpA) + B * (cos(&tmpB, b), tmpB);
It is guaranteed that the value will be set no matter what happens inside the function, as this is enforced by the compiler (except some very special cases such as longjumps)
You do not need to worry about checking if the assigned value is used or not; you can return the value and if the requester doesn't need it, they can ignore it (compare this to needing NULL-checks everywhere in your alternative method)
Of course there are also disadvantages:
You only get a single return value, so if your function logically returns multiple types of data (and they can't logically be combined into a single struct), returning via parameters may be better
Large objects may introduce performance penalties due to the need to copy them (which is why RVO was introduced in C++, which makes this much less of an issue)

So, does this mean that we can actually just use void functions for everything?
Indeed. And as it turn out, doing so is a fairly common coding style. But rather than void, such styles usually state that the return value should always be reserved for error codes.
In practice, you usually won't be able to stick to such a style consistently. There are a some special cases where not using the return value becomes inconvenient.
For example when writing callback functions of the kind used by standard C generic functions bsearch or qsort. The expect a callback of the format
int compare (const void *p1, const void *p2);
where the function returns less than zero, more than zero or zero. Design-wise it is important to keep the parameters passed as read-only, you wouldn't want your generic search algorithm to suddenly start modifying the searched contents. So while there is no reason in theory why these kind of functions couldn't be of void return type too, in practice it would make the code uglier and harder to read.

Of course you could; but that does not make it a good idea.
It may not always be convenient or lead to easy to comprehended code. A function returning void cannot be used directly as an operand in an expression. For example while you could write:
if( f1() == 3 )
{
...
}
for f2() you would have to write:
f2( &answer ) ;
if( answer )
{
...
}
Another issue is one of access control - by passing a pointer to the function you are giving that function indirect access to the caller's data, which is fine so long as the function is well behaved and does not overrun. A pointer may refer to a single object or an array of objects - the function taking that pointer has to impose appropriate rules, so it is intrinsically less safe.

Related

Casting function pointers with different return types

This question is about using function pointers, which are not precisely compatible, but which I hope I can use nevertheless as long as my code relies only on the compatible parts. Let's start with some code to get the idea:
typedef void (*funcp)(int* val);
static void myFuncA(int* val) {
*val *= 2;
return;
}
static int myFuncB(int* val) {
*val *= 2;
return *val;
}
int main(void) {
funcp f = NULL;
int v = 2;
f = myFuncA;
f(&v);
// now v is 4
// explicit cast so the compiler will not complain
f = (funcp)myFuncB;
f(&v);
// now v is 8
return 0;
}
While the arguments of myFuncA and myFuncB are identical and fully compatible, the return values are not and are thus just ignored by the calling code. I tried the above code and it works correctly using GCC.
What I learned so far from here and here is that the functions are incompatible by definition of the standard and may cause undefined behavior. My intuition, however, tells me that my code example will still work correctly, since it does not rely in any way on the incompatible parts (the return value). However, in the answers to this question a possible corruption of the stack has been mentioned.
So my question is: Is my example above valid C code so it will always work as intended (without any side effects), or does it depend on the compiler?
EDIT:
I want to do this in order to use a "more powerful" function with a "les powerful" interface. In my example funcp is the interface but I would like to provide additional functionality like myFuncB for optional use.

Agreed it is undefined behaviour, don't do that!
Yes the code functions, i.e. it doesn't fall over, but the value you assign after returning void is undefined.
In a very old version of "C" the return type was unspecified and int and void functions could be 'safely' intermixed. The integer value being returned in the designated accumulator register. I remember writing code using this 'feature'!
For almost anything else you might return the results are likely to be fatal.
Going forward a few years, floating-point return values are often returned using the fp coprocessor (we are still in the 80s) register, so you can't mix int and float return types, because the state of the coprocessor would be confused if the caller does not strip off the value, or strips off a value that was never placed there and causes an fp exception. Worse, if you build with fp emulation, then the fp value may be returned on the stack as described next. Also on 32-bit builds it is possible to pass 64bit objects (on 16 bit builds you can have 32 bit objects) which would be returned either using multiple registers or on the stack. If they are on the stack and allocated the wrong size, then some local stomping will occur,
Now, c supports struct return types and return value copy optimisations. All bets are off if you don't match the types correctly.
Also some function models have the caller allocate stack space for the parameters for the call, but the function itself releases the stack. Disagreement between caller and implementation on on the number or types of parameters and return values would be fatal.
By default C function names are exported and linked undecorated - just the function name defines the symbol, so different modules of your program could have different views about function signatures, which conflict when you link, and potentially generate very interesting runtime errors.
In c++ the function names are highly decorated primarily to allow overloading, but also it helps to avoid signature mismatches. This helps with keeping arguments in step, but actually, ( as noted by #Jens ) the return type is not encoded into the decorated name, primarily because the return type isn't used (wasn't, but I think occasionally can now influence) for overload resolution.

Yes, this is an undefined behaviour and you should never rely on undefined behaviour if want to write portable code.
Function with different return values can have different calling conventions. Your example will probably work for small return types, but when returning large structs (e.g. larger than 32 bits) some compilers will generate code where struct is returned in a temporary memory area which should be cleaned up the the caller.

When to return value from function, and when to use out parameter? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 5 years ago.
Improve this question
I'm learning C and programming in general, and I don't know when to return a value and when to use void.
Is there any rule to apply when to use one over the another ?
Is there any difference between this two cases? I know that first case is
working with a local copy of int (n) , and second with original value.
#include <stdio.h>
int case_one(int n)
{
return n + 2;
}
void case_two(int *n)
{
*n = *n + 2;
}
int main(int argc, char *argv[])
{
int n = 5;
n = case_one(n);
printf("%i\n", n);
n = 5;
case_two(&n);
printf("%i\n", n);
return 0;
}

There is one more reason to use out param instead of return value - error handling. Usually return value (int) of the function call in C represents success of the operation. Error represented by not 0 value.
Example:
#include <stdio.h>
int extract_ip(const char *str, int out[4]) {
return -1;
}
int main() {
int out[4];
int rv = extract_ip("test", out);
if (rv != 0) {
printf("Error :%d", rv);
};
}
This approach used in POSIX socket API for example.

It very much depends on what you want to do, but basically You should use the former unless You have a good reason to use the latter.
Think of the implications of the choices. First, let's think about the way we provide input to function. It is quite often that you provide explicitly constant, compile-time constant or temporary data as input:
foo(1);
const int a = 2;
foo(a);
int x = 5;
int y = 5;
foo(x + y);
In all of the above cases the source of initial value is not a viable location for storing the result.
Next, let's think about how we may want to store the result. Foremost we may often want to use the result elsewhere. It may be inconvenient to use the same variable to store and then pass input, and to store output. But furthermore, often we would like to use the result immediately. We invoke the function as a part of larger expression:
x = foo(1) + foo(2);
Rewriting the preceding line in a manner that would use a pointer would require much unnecessary code - that is time and complication that we certainly don't want, when it's not really buying anything.
So when do we actually want to use a pointer? C functions are pass-by-value. Whenever we pass anything, a copy is created. We can then work on that copy and upon returning it, it requires copying again. If we do know that all we want to do is manipulate certain data set in place, we can provide a pointer as a handler and all that is copied is several bytes that store address.
So the former, prevalent way to create functions is flexible and leads to concise usage. The latter is useful for manipulation of objects in place.
Obviously sometimes our input actually is an address, and that's a trivial case for using pointers as function parameters.

I'm learning C and programming in general, and i don't know when to return a value and when to use void.
There is no definitive rule, and it is a matter of opinion. Notice that you might re-code your case_one as the following:
// we take the convention that the first argument would be ...
// a pointer to the "result"
void case_one_proc(int *pres, int n) {
*pres = n+2;
}
then a code like
int i = j+3; /// could be any arbitrary expression initializing i
int r = case_one(i);
is equivalent to
int i = j+3; // the same expression initializing i
int r;
case_one_proc(&r, i); //we pass the address of the "result" as first argument
Hence, you can guess that you might replace any whole C program with an equivalent program having only void returning functions (that is, only procedures). Of course, you may have to introduce supplementary variables like r above.
So you see that you might even avoid any value returning function. However, that would not be convenient (for human developers to code and to read other code) and might not be efficient.
(actually you could even make a complex C program which transforms the text of any C program -given by their several translation units- into another equivalent C program without any value returning function)
Notice that at the most elementary machine code level (at least on real processors like x86 and ARM), everything are instructions, and expressions don't exist anymore! And you favorite C compiler is transforming your program (and every C program in practice) into such machine code.
If you want more theory about such whole-program transformations, read about A-normal forms and about continuations (and CPS transformations)
Is there any rule to apply when to use one over the another ?
The rule is to be pragmatic, and favor first the readability and understandability of your source code. As a rule of thumb, any pure function implementing a mathematical function (like your case_one, which mathematically is a translation) is better coded as returning some result. Conversely, any program function which has mostly side effects is often coded as returning void. For cases in between, use your common sense, and look at existing practice -their source code- in existing free software projects (e.g. on github). Often a side effecting procedure might return some error code (or some success flag). Read documentation of printf & scanf for good examples.
Notice also that many ABIs and calling conventions are passing the result (and some arguments) in processor registers (and that is faster than passing thru memory).
Notice also that C has only call by value.
Some programming languages have only functions returning one value (sometimes ignored, or uninteresting), and have just expressions, without any notion of statement or instruction. Read about functional programming. An excellent example of a programming language with only value returning functions is Scheme, and SICP is an excellent introductory book (freely available) to programming that I strongly recommend.

The first approach is preferred, where you return a value. The second can be used when multiple values are computed by the function.
For example, the function strtol has this prototype:
long strtol(const char *s, char **endp, int base);
strtol attempts to interpret the initial portion of the string pointed to by s as the representation of a long integer expressed in base base. It returns the converted value with a return statement, and stores a pointer to the character that follows the number in s into *endp. Note however that this standard function should have returned 3 values: the converted value, the updated pointer and a success indicator.
There are other ways to return multiple values:
returning a structure.
updating a structure to which you receive a pointer.
C offers some flexibility. Sometimes different methods are equivalent and choosing one is mostly a matter of conventions, personal style or local practice, but for the example you give, the first option is definitely preferred.

Pass by const pointer vs. pass by value for built in types. Efficiency

I almost only find C++ posts and not C, when I try to find an answer for this.
For built in types as int, char etc. is there any performance difference between pass-by-value and by const pointers?
Is it still good programming practice to use the const keyword when passing-by-value?
int PassByValue(int value)
{
return value / 2;
}
int ConstPointer(const int * value)
{
return (*value) / 2;
}

Pass by const pointer is never faster than by value as long as the value is less or equal the size of a pointer (sizeof). It also is more annoying and sometimes even wrong (stack variables).

In general the pass by value should be faster. In fact the value might have been already in the registers, in which case would not be necessary to access the cache. However if the function code is compiled together with the caller code, it is possible that the compiler will optimize anyway.

Passing built-in types like int, char by pointer does not result in better performance results.
Using const keyword passing-by-value does not matter since original value won't be changed.

Why use a callback instead of a normal function call?

I'm trying to understand callbacks, and do get the idea, but do not understand why it is really needed.
Specifically, what added benefit does it provide over a normal function call? I'm looking at the accepted answer here : What is a "callback" in C and how are they implemented?
I have redone the same thing below, just without using function pointers. How is that different from this?
void populate_array(int *array, size_t arraySize)
{
for (size_t i=0; i<arraySize; i++)
array[i] = getNextRandomValue();
}
int getNextRandomValue(void)
{
return rand();
}
int main(void)
{
int myarray[10];
populate_array(myarray, 10);
...
}
Is it only beneficial if lower-layer software needs to call a function that was defined at a higher-layer?

This implementation is different in that the only way it knows to populate the array is hard-coded: getNextRandomValue is the way it populates the array, and it is part of the library that provides the populate_array functionality. Essentially, the two functions are baked together into a single whole (the fancy way of saying the same thing is to say that they are "tightly coupled").
With the original implementation I can do all of the things below without changing a line in the populate_array:
int getNextRandomValue(void) {
return rand();
}
int getNextEvenValue(void) {
static int even = 2;
return (even += 2);
}
int getNextNegativeValue(void) {
static int neg = -1;
return neg--;
}
int getNextValueFromUser(void) {
int val;
scanf("%d", &val);
return val;
}
populate_array(myarray, 10, getNextRandomValue);
populate_array(myarray, 10, getNextEvenValue);
populate_array(myarray, 10, getNextNegativeValue);
populate_array(myarray, 10, getNextValueFromUser);
With your implementation I would have to write a separate populate_array for each of these cases. If the way of populating an array that I need is not part of the library, I am on the hook for implementing the whole populate_array, not only its value-producing generator function. This may not be that bad in this case, but in cases where callbacks really shine (threads, asynchronous communications) reimplementing the original without callbacks would be prohibitive.

In your example, getNextRandomValue must be known at compile time. This means that you cannot reuse the function with different definitions of getNextRandomValue, neither let other users link to your code and specify their version of getNextRandomValue. Note that this is not a callback, rather a "generic" function: think of getNextRandomValue as a placeholder for a function.
Callbacks are a special use of "generic" functions: they are called upon completion of an asynchronous task, to notify the user of the operation results (or progress).
I don't think that function pointers only suit low-to-high-level communication. Function pointers, as said, allow you to build generic code in C. A quick example, qsort(): order sorting needs a function that, given a and b, tells whether a>b, a<b or a==b, so that a and b can be placed in the right order. Well, qsort let you provide your own definition for order, thus offering a general, reusable piece of code.

To refine other answers here: Your function populate_array has a bad name. It should be named populate_array_with_random_values.

As said in the answer you linked to "there is no callback in c". There only is a method to pass pointers to function to a function.
Which is useful in general not only for lower-layer software that needs to call a function on a higher-level though this might be often how it is used.
Passing function pointers as parameters is an elegant way to write code. In your example if you want to have two variations on how to get the next random value you already see how it is much easier to pass the function as a parameter rather then trying to pass a parameter that then selects the next function to call using a switch.

Benefits of pure function

Today i was reading about pure function, got confused with its use:
A function is said to be pure if it returns same set of values for same set of inputs and does not have any observable side effects.
e.g. strlen() is a pure function while rand() is an impure one.
__attribute__ ((pure)) int fun(int i)
{
return i*i;
}
int main()
{
int i=10;
printf("%d",fun(i));//outputs 100
return 0;
}
http://ideone.com/33XJU
The above program behaves in the same way as in the absence of pure declaration.
What are the benefits of declaring a function as pure[if there is no change in output]?

pure lets the compiler know that it can make certain optimisations about the function: imagine a bit of code like
for (int i = 0; i < 1000; i++)
{
printf("%d", fun(10));
}
With a pure function, the compiler can know that it needs to evaluate fun(10) once and once only, rather than 1000 times. For a complex function, that's a big win.

When you say a function is 'pure' you are guaranteeing that it has no externally visible side-effects (and as a comment says, if you lie, bad things can happen). Knowing that a function is 'pure' has benefits for the compiler, which can use this knowledge to do certain optimizations.
Here is what the GCC documentation says about the pure attribute:
pure
Many functions have no effects except the return value and their return
value depends only on the parameters and/or global variables.
Such a function can be subject to common subexpression elimination and
loop optimization just as an arithmetic operator would be. These
functions should be declared with the attribute pure. For example,
int square (int) __attribute__ ((pure));
Philip's answer already shows how knowing a function is 'pure' can help with loop optimizations.
Here is one for common sub-expression elimination (given foo is pure):
a = foo (99) * x + y;
b = foo (99) * x + z;
Can become:
_tmp = foo (99) * x;
a = _tmp + y;
b = _tmp + z;

In addition to possible run-time benefits, a pure function is much easier to reason about when reading code. Furthermore, it's much easier to test a pure function since you know that the return value only depends on the values of the parameters.

A non-pure function
int foo(int x, int y) // possible side-effects
is like an extension of a pure function
int bar(int x, int y) // guaranteed no side-effects
in which you have, besides the explicit function arguments x, y,
the rest of the universe (or anything your computer can communicate with) as an implicit potential input. Likewise, besides the explicit integer return value, anything your computer can write to is implicitly part of the return value.
It should be clear why it is much easier to reason about a pure function than a non-pure one.

Just as an add-on, I would like to mention that C++11 codifies things somewhat using the constexpr keyword. Example:
#include <iostream>
#include <cstring>
constexpr unsigned static_strlen(const char * str, unsigned offset = 0) {
return (*str == '\0') ? offset : static_strlen(str + 1, offset + 1);
}
constexpr const char * str = "asdfjkl;";
constexpr unsigned len = static_strlen(str); //MUST be evaluated at compile time
//so, for example, this: int arr[len]; is legal, as len is a constant.
int main() {
std::cout << len << std::endl << std::strlen(str) << std::endl;
return 0;
}
The restrictions on the usage of constexpr make it so that the function is provably pure. This way, the compiler can more aggressively optimize (just make sure you use tail recursion, please!) and evaluate the function at compile time instead of run time.
So, to answer your question, is that if you're using C++ (I know you said C, but they are related), writing a pure function in the correct style allows the compiler to do all sorts of cool things with the function :-)

In general, Pure functions has 3 advantages over impure functions that the compiler can take advantage of:
Caching
Lets say that you have pure function f that is being called 100000 times, since it is deterministic and depends only on its parameters, the compiler can calculate its value once and use it when necessary
Parallelism
Pure functions don't read or write to any shared memory, and therefore can run in separate threads without any unexpected consequence
Passing By Reference
A function f(struct t) gets its argument t by value, and on the other hand, the compiler can pass t by reference to f if it is declared as pure while guaranteeing that the value of t will not change and have performance gains
In addition to the compile time considerations, pure functions can be tested fairly easy: just call them.
No need to construct objects or mock connections to DBs / file system.