I am asking this question in the context of the C language, though it applies really to any language supporting pointers or pass-by-reference functionality.
I come from a Java background, but have written enough low-level code (C and C++) to have observed this interesting phenomenon. Supposing we have some object X (not using "object" here in the strictest OOP sense of the word) that we want to fill with information by way of some other function, it seems there are two approaches to doing so:
Returning an instance of that object's type and assigning it, e.g. if X has type T, then we would have:
T func(){...}
X = func();
Passing in a pointer / reference to the object and modifying it inside the function, and returning either void or some other value (in C, for instance, a lot of functions return an int corresponding to the success/failure of the operation). An example of this here is:
int func(T* x){... *x = ...; ...}
func(&X);
My question is: what situations make one method better than the other? Are they equivalent approaches to accomplishing the same outcome? What are the restrictions of each?
Thanks!
There is a reason that you should always consider using the second method, rather than the first. If you look at the return values for the entirety of the C standard library, you'll notice that there's almost always an element of error handling involved in them. For example, you have to check the return value of the following functions before you assume they've succeeded:
calloc, malloc and realloc
getchar
fopen
scanf and family
strtok
There are other non-standard functions that follow this pattern:
pthread_create, etc.
socket, connect, etc.
open, read, write, etc.
Generally speaking, a return value conveys the number of items successfully read/written/converted, or a flat-out boolean success/fail value. In practice you'll almost always need such a return value, unless you're going to exit(EXIT_FAILURE); on any error (in which case I would rather not use your modules, because they give me no opportunity to clean up within my own code).
There are functions that don't use this pattern in the standard C library, because they use no resources (e.g. allocations or files) and so there's no chance of any error. If your function is a basic translation function (e.g. like toupper, tolower and friends which translate single character values), for example, then you don't need a return value for error handling because there are no errors. I think you'll find this scenario quite rare indeed, but if that is your scenario, by all means use the first option!
In summary, you should always strongly consider using option 2, reserving the return value for a similar use, for the sake of consistency with the rest of the world, and because you might later decide that you need the return value for communicating errors or the number of items processed.
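To make the pattern concrete, here is a minimal sketch of option 2; the type and function names are invented for illustration. The return value is reserved for error reporting, and the result comes back through the pointer:

#include <stddef.h>

typedef struct { double x, y; } point_t;   /* hypothetical payload type */

int make_point(point_t *out, double x, double y)
{
    if (out == NULL)
        return -1;    /* failure: nowhere to store the result */
    out->x = x;
    out->y = y;
    return 0;         /* success */
}

/* usage:
 *     point_t p;
 *     if (make_point(&p, 1.0, 2.0) != 0) { ... handle the error ... }
 */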
Method (1) returns the object by value, which requires that the object be copied; passing objects by value into functions has the same cost. Method (2) passes only a pointer. When you're dealing with a primitive, (1) is just fine, but when you're dealing with an object, a struct, or an array, that's just wasted space and time.
In Java and many other languages, objects are always handled through references: behind the scenes, only a pointer is copied. This means that even though the syntax looks like (1), it actually works like (2).
I think I see what you're asking.
These two approaches are very different.
The question you have to ask yourself whenever you are trying to decide which approach to take is:
Which class should have the responsibility?
If you pass in a reference to the object, you decouple the creation of the object and hand it to the caller, which makes the functionality more reusable: you can create a util class in which all of the functions are stateless; they take an object, manipulate the input, and return it.
The other approach is more like an API: you are requesting an operation.
For example, if you receive an array of bytes and want to convert them to a string, you would probably choose the first approach.
And if you want to do some operation on a DB, you would choose the second one.
Whenever you have more than one function of the first kind covering the same area, you would encapsulate them into a util class; the same applies to the second kind, which you would encapsulate into an API.
In method 2, we call x an output parameter. This is actually a very common design used in a lot of places; think of the various standard C functions that populate a text buffer, like snprintf.
This has the benefit of being fairly space-efficient, since you won't be copying structs/arrays/data onto the stack and returning brand new instances.
A really, really convenient quality of method 2 is that you can essentially have any number of "return values." You "return" data through the output parameters, but you can also return a success/error indicator from the function.
A good example of method 2 being used effectively is the standard C function strtol. This function converts a string to a long (basically, parses a number from a string). One of the parameters is a char **. When calling the function, you declare char * endptr locally, and pass in &endptr.
The function will return either:
the converted value if it was successful,
0 if no conversion could be performed, or
LONG_MIN or LONG_MAX if the value was out of range (with errno set to ERANGE),
as well as setting endptr to point to the first character that wasn't part of the number.
This is great for error reporting if your program depends on user input, because you can check for failure in so many ways and report different errors for each.
If *endptr isn't the null terminator after the call to strtol, then you know the conversion stopped early, so the user entered something that isn't a pure integer, and you can print straight away the character that the conversion failed on if you'd like.
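Here is a minimal sketch of that usage (the wrapper name parse_long is invented); it distinguishes "no digits", "trailing junk", and "out of range":

#include <errno.h>
#include <stdlib.h>

/* parse a base-10 long from s; returns 0 on success, -1 on any failure */
int parse_long(const char *s, long *out)
{
    char *endptr;
    errno = 0;                          /* clear so ERANGE can be detected */
    long val = strtol(s, &endptr, 10);
    if (endptr == s)     return -1;     /* no digits were found at all */
    if (*endptr != '\0') return -1;     /* stopped on a non-digit, e.g. "12x" */
    if (errno == ERANGE) return -1;     /* value clamped to LONG_MIN/LONG_MAX */
    *out = val;
    return 0;
}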
Like Thom points out, Java makes implementing method 2 simpler by simulating pass-by-reference behavior, which is just pointers behind the scenes without the pointer syntax in the source code.
To answer your question: I think C lends itself well to the second method. Functions like realloc are there to give you more space when you need it. However, there isn't much stopping you from using the first method.
Maybe you're trying to implement some kind of immutable object. The first method will be the choice there. But in general, I opt for the second.
(Assuming we are talking about returning only one value from the function.)
In general, the first method is used when type T is relatively small. It is definitely preferable with scalar types. It can be used with larger types. What is considered "small enough" for these purposes depends on the platform and the expected performance impact. (The latter is caused by the fact that the returned object is copied.)
The second method is used when the object is relatively large, since this method does not perform any copying. And with non-copyable types, like arrays, you have no choice but to use the second method.
Of course, when performance is not an issue, the first method can be easily used to return large objects.
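The standard library itself uses the first method where the type is small; for example, div returns its two results packed in a small struct by value:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    div_t r = div(17, 5);   /* div_t is tiny, so returning it by value is cheap */
    printf("quotient=%d remainder=%d\n", r.quot, r.rem);
    return 0;
}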
An interesting matter is optimization opportunities available to C compiler. In C++ language compilers are allowed to perform Return Value Optimizations (RVO, NRVO), which effectively turn the first method into the second one "under the hood" in situations when the second method offers better performance. To facilitate such optimizations C++ language relaxes some address-identity requirements imposed on the involved objects. AFAIK, C does not offer such relaxations, thus preventing (or at least impeding) any attempts at RVO/NRVO.
Short answer: take method 2 if you don't have a compelling reason to take method 1.
Long answer: In the world of C++ and its derived languages (Java, C#), exceptions help a lot. In the C world, there is not very much you can do. The following is a sample API taken from the CUDA library, which is a library I like and consider well designed:
cudaError_t cudaMalloc (void **devPtr, size_t size);
compare this API with malloc:
void *malloc(size_t size);
In old C interfaces, there are many such examples:
int open(const char *pathname, int flags);
FILE *fopen(const char *path, const char *mode);
I would argue to the end of the world that the interface CUDA provides is much more obvious and leads to proper results.
There is another set of interfaces where the space of valid return values actually overlaps with the error codes, so the designers of those interfaces scratched their heads and came up with ideas that are not brilliant at all, say:
ssize_t read(int fd, void *buf, size_t count);
An everyday operation like reading a file's contents is restricted by the definition of ssize_t: since the return value has to encode an error code too, it must admit negative numbers. On a 32-bit system, the maximum of ssize_t is 2G, which very much limits the number of bytes you can read from your file in one call.
If your error designator is encoded inside the function's return value, I bet 10 out of 10 programmers won't try to check it, though they really know they should; they just don't, or don't remember, because the form is not obvious.
Another reason is that human beings are very lazy and not good at dealing with if's. The documentation of these functions will describe that:
if the return value is NULL then ... blah.
if the return value is 0 then ... blah.
Yuck.
In the first form, things change. How do you judge whether the value has been returned successfully? No NULL or 0 any more. You have to use SUCCESS, FAILURE1, FAILURE2, or something similar. This interface forces users to write safer code and makes the code much more robust.
With these macros, or enums, it's much easier for programmers to learn about the effects of the API and the causes of the different failures, too. And with all these advantages, there is actually no extra runtime overhead either.
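As an illustration of that style, here is a minimal sketch with invented names; the enum gives every failure an obvious, greppable name, and the result travels through the output parameter:

#include <stdlib.h>

typedef enum {
    MYLIB_SUCCESS = 0,
    MYLIB_ERR_BADARG,
    MYLIB_ERR_NOMEM
} mylib_status_t;

mylib_status_t mylib_buffer_create(void **out, size_t size)
{
    if (out == NULL || size == 0)
        return MYLIB_ERR_BADARG;    /* caller error, reported explicitly */
    *out = malloc(size);
    if (*out == NULL)
        return MYLIB_ERR_NOMEM;     /* allocation failure */
    return MYLIB_SUCCESS;
}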
I will try to explain :)
Let's say you have to load a giant rocket onto a semi truck.
Method 1)
The truck driver leaves the truck in a parking lot and wanders off, so you are stuck putting the load onto a forklift or some kind of trailer to bring it to the truck.
Method 2)
The truck driver backs the truck up right next to the rocket, and you just need to push it in.
That is the difference between those two :). What it boils down to in programming is:
Method 1)
The caller reserves an address to receive the return value; how the called function produces that value is not the caller's concern. The called function reserves its own storage for its calculations, stores the value there, and returns it; the caller then says "oh, thank you" and copies it into the address it reserved earlier.
Method 2)
The caller says "Hey, I will help you: I will give you the address that I have reserved; store whatever you calculate directly in it." This way you save not only memory but also time.
And I think the second is better, and here is why:
Let's say you have a struct with 1000 ints inside it. Method 1 would be pointless: the data has to live in two places, 2 x 1000 x 32 bits of memory, which is 64,000 bits, and you have to copy every value into the first location and then again into the second. So if each copy takes 1 millisecond, the two rounds of copying take 2 seconds, whereas if you hand over an address you only have to store each value once.
They are equivalent to me, but not in their implementation.
#include <stdio.h>
#include <stdlib.h>

int func(int a, int b) {
    return a + b;
}

int funn(int *x) {
    *x = 1;
    return 777;
}

int main(void) {
    int sx, *dx;

    /* 'static' case */
    sx = func(4, 6);          /* looks legit */
    funn(&sx);                /* looks wrong in this case */

    /* 'dynamic' case */
    dx = malloc(sizeof(int));
    if (dx) {
        *dx = func(4, 6);     /* looks wrong in this case */
        sx = funn(dx);        /* looks legit */
        free(dx);
    }
    return 0;
}
With the 'static' approach, I am more comfortable using your first method, because I don't want to mess with the dynamic part (with pointers).
But with the 'dynamic' approach, I'll use your second method, because it is made for it.
So they are equivalent but not the same: the second approach is clearly made for pointers, and so for the dynamic part.
And so this is far clearer ->
int main(void) {
    int sx, *dx;

    sx = func(4, 6);
    dx = malloc(sizeof(int));
    if (dx) {
        sx = funn(dx);
        free(dx);
    }
    return 0;
}
than ->
int main(void) {
    int sx, *dx;

    funn(&sx);
    dx = malloc(sizeof(int));
    if (dx) {
        *dx = func(4, 6);
        free(dx);
    }
    return 0;
}
What is the best way of using Tcl_GetVar2 in C?
axes = (Axes*) malloc(sizeof(Axes));
axes->Xorientation = Tcl_GetVar2(interp, "axes", "XOrientation", TCL_GLOBAL_ONLY);
with axes as the structure
typedef struct {
    double XvaleurMin;
    double XvaleurMax;
    double XRatio;
    char *Xorientation;
    double YvaleurMin;
    double YvaleurMax;
    double YRatio;
    char *Yorientation;
} Axes;
This example gets errors at execution time when parsing a data file with missing data.
For example, a segmentation fault when doing:
printf("axes->Yorientation %s\n",axes->Yorientation );
Thanks
While the Tcl_GetVar2 API function does return a char *, you need to remember that Tcl retains ownership of that pointer: it can and will deallocate it at a time of its choosing, possibly on the next API call you make. Also, you really ought to treat it as const char *; don't modify that string either (there are some historical reasons for us not making that const). If you want to keep it around, you need to copy it immediately and that will entail some work with buffer management, with there being many ways to achieve it.
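A minimal sketch of that copying, assuming a POSIX strdup is available. Note the NULL check: Tcl_GetVar2 returns NULL when the variable doesn't exist, which would explain the segmentation fault with missing data:

#include <string.h>

const char *val = Tcl_GetVar2(interp, "axes", "XOrientation", TCL_GLOBAL_ONLY);
if (val == NULL) {
    /* the variable was missing: report it instead of storing NULL */
} else {
    axes->Xorientation = strdup(val);   /* a private copy we own and must free() */
}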
Or alternatively you might be able to use Tcl_GetVar2Ex, which returns a Tcl_Obj *, a Tcl native value reference. With those, you can keep them around (as long as you remember to Tcl_IncrRefCount them when you do so and Tcl_DecrRefCount them once you no longer need it) and you can then get the char * (though still not something you should modify) from that value reference at any time using Tcl_GetString or Tcl_GetStringFromObj (the latter also reports the number of bytes in the string).
If you are really interested in parsing the value to mean some sort of string chosen from a small set — the sort of thing I'd expect when parsing an “orientation” attribute of some kind — I strongly suggest looking at Tcl_GetIndexFromObj. That's explicitly designed for that sort of task, handling a great many aspects for you that are normally an annoying amount of fiddly work to get right for yourself.
Is there any difference in functionality between these two sections of code? The main difference is the use of 'static' in the first example to save the value of x each time function1 is called. But the second example removes the need for 'static' altogether by passing the value of i from main to function1 on each iteration of the for loop. They both have exactly the same output. Is there any hidden advantage to using one way over the other?
Note: the first example is a simplified version of a piece of code I've seen. Just wondering why this was used rather than the alternative.
First example:
void function1()
{
    static int x = 0;
    printf("function1 has now been called %d times\n", ++x);
}

int main(void)
{
    for (int i = 0; i < 5; i++)
        function1();
    return 0;
}
Second example:
void function1(int i)
{
    printf("function1 has now been called %d times\n", ++i);
}

int main(void)
{
    int i;
    for (i = 0; i < 5; i++)
        function1(i);
    return 0;
}
I'd appreciate any shared knowledge.
As people stated in the comments, there are pros/cons to each approach. Which one you choose depends on the situation you are in, and what tradeoffs you are willing to make. Below are a few ideas to get you rolling.
Static variable approach
void function1(void)
{
    static int x = 0;
    printf("function1 has now been called %d times\n", ++x);
}
Pros:
Lower resource usage: You aren't passing x on the stack, so you use less memory, if memory is at a premium. Additionally, you save a few instructions moving the argument onto the stack. The address is also fixed, so you don't need to store it or manipulate it in code.
Good locality: As a static, x remains minimally scoped to its purpose, meaning the code is easier to understand and debug. If the scope of x were increased to the entire file, it would be a lot harder to understand (who is modifying x, how can it change, etc.).
More flexible architecturally (simple interface): There really isn't a wrong way to use this function — just call it. Don't need to worry about validating inputs. If you need to move it around, just drop in the header and you are good to go.
Cons:
Less flexible functionally: If you need to change the value of x, for example to reset it, you don't have a way to do that. You would need to alter your design somehow (make x a global, add some reset parameter to the function, etc.) to make it work. Requirements change all the time, and a design that needs only minimal change to do what you now want is a hallmark of good design.
Harder to test: Tying into the point above, how would you unit test this function? If you wanted to test some low numbers and some high numbers (typical boundary tests), you would need to iterate through the entire space, which may not be feasible if the function takes a long time to run.
Argument approach
void function1(int x)
{
    printf("function1 has now been called %d times\n", x);
}

void caller(int *x)   /* could get x from anywhere; */
{                     /* showing it as a pointer from outside here */
    *x = *x + 1;
    function1(*x);
}
Pros:
More flexible functionally: If your design requirements change, it's relatively simple to change. If you need to change the sequence or have special conditions, like repeat the first value or skip a particular value, that's really easy to do from the calling code. You've separated out things so you can bolt on a new driver and don't need to touch this function anymore. Things are more modular.
More testable: The function can be tested much easier, granting more confidence that it actually works. You can do boundary testing, test inputs you are worried about, or recreate a failure scenario with ease.
Easier to understand: Functions in this format have the ability to be easier to understand. If a function always produces the same output for a given set of inputs and has no side effects, it is said to be pure. The example given is not pure, because it does IO (printing to the screen). However, generally speaking, a pure function is easier to reason about because it doesn't hold any internal state, so the only things you really care about are the inputs. Just like 1 + 1 = 2, a pure function has this same simplifying property.
(Potentially) more performant: In the case of pure functions, compilers can take advantage of the function's referential transparency (a fancy term meaning you can replace add(1,1) with 2) and safely cache results from previous calls. Why do the same work again if you've already done it? If calling the function was particularly expensive and it sat in a tight loop that called it with similar arguments, you've just saved tons of cycles. Again, this function is not pure, but even if it were, you wouldn't get any performance boost, since it is a sequential counter. Any benefit would be seen when the counter wraps, or the function is called with the same arguments again.
Cons:
More resource usage: If you are squeezed for memory, you use a little more stack space as you pass the variable over. You also use more instructions keeping track of its address and moving it over.
Easier to screw up: If you only support numbers 1-10, someone is going to pass it 0 or -1. Now you need to decide how to handle that.
(Potentially) less performant: You can also bog down your code handling cases that should never happen in the name of defensive programming (which is a good thing!). But generally speaking, defensive programming is not built for speed. Speed comes from carefully thought out assumptions. If you are guaranteed that your input falls in a certain range, you can keep things moving as fast as possible without pesky sanity checks peppering your pipeline. The flexibility you gain from exposing this interface comes at a performance cost.
Less flexible architecturally (more cruft): If you call this function from a lot of places, now you need to string along this parameter to feed to it. If you are in a particularly deep call stack, there can be 20 or more functions which pass this argument along. And if the design changes and you need to move this function call from one place to another, you have the pleasure of ripping out the argument from the existing call stack, changing all the callers to conform to the new signature, inserting the argument into the new call stack, and changing all its callers to conform to the new signature! Your other option is to leave the old call stack alone, which leads to harder maintainability and a higher "huh" factor when someone peruses the extra baggage from the days of yore.
I notice that the standard C library contains several string functions that don't check the input parameters (whether they're NULL), like strcmp:
int strcmp(const char *s1, const char *s2)
{
    for ( ; *s1 == *s2; s1++, s2++)
        if (*s1 == '\0')
            return 0;
    return ((*(unsigned char *)s1 < *(unsigned char *)s2) ? -1 : +1);
}
And many others do no such validation either. Is this good practice?
In other libraries, I have seen every single parameter checked, like this:
int create_something(int interval, int mode, func_t cb, void *arg, int id)
{
    if (interval == 0) return err_code_1;
    if (!valid(mode))  return err_code_2;
    if (cb == NULL)    return err_code_3;
    if (arg == NULL)   return err_code_4;
    if (id == 0)       return err_code_5;
    // ...
}
Which one is better? When you design an API, would you check all parameters to make it function well or just let it go crash?
I'd like to argue that not checking pointers for NULL in library functions that expect valid pointers is actually better practice than to do error returns or silently ignoring them.
NULL is not the only invalid pointer. There are billions of other pointer values that are actually incorrect; why should we give preferential treatment to just one value?
Error returns are often ignored, misunderstood or mismanaged. Forgetting to check one error return could lead to a misbehaving program. I'd like to argue that a program that silently misbehaves is worse than a program that doesn't work at all. Incorrect results can be worse than no results.
Failing early and hard eases debugging. This is the biggest reason. An end user of a program doesn't want the program to crash, but as a programmer I'm the end user of a library and I actually want it to crash. Crashing makes it evident that there's a bug I need to fix and the faster we hit the bug and the closer the crash is to the source of the bug, the faster and easier I can find it and fix it. A NULL pointer dereference is one of the most trivial bugs to catch, debug and fix. It's much easier than trawling through gigabytes of logs to spot one line that says "create_something had a null pointer".
With error returns, what if the caller catches that error, returns an error itself (in your example that would be err_create_something_failed) and its caller returns another error (err_caller_of_create_something_failed)? Then you have an error return 3 functions away, that might not even indicate what actually went wrong. And even if it manages to indicate what actually went wrong (by having a whole framework for error handling that records exactly where the error happened through the whole chain of callers) the only thing you can do with it is to look up the error value in some table and from that conclude that there was a NULL pointer in create_something. It's a lot of pain when instead you could just have opened a debugger and seen exactly where the assumption was violated and what exact chain of function calls lead to that problem.
In the same spirit you can use assert to validate other function arguments to cause early and easy to debug failures. Crash on the assert and you have the full correct call chain that leads to the problem. I just wouldn't use asserts to check pointers because it's pointless (at least on an operating system with memory management) and makes things slower while giving you the same behavior (minus the printed message).
You can use assert.h to check your parameters:
assert(pointer != NULL);
That will make the program fail in debug mode if pointer == NULL, but there will be no check at all in release builds, so you can check everything you want with no performance hit.
Anyway, if a function requires parameters within a certain range, checking them is a waste of resources; it is the user of your API who should do the checks.
But it is up to you how you want to design the API. There is no correct way on that matter: if a function expects a number between 1 and 5 and the user passes a 6, you can perform a check or simply specify that the function will have undefined behaviour.
There is no universally correct way to perform argument validation. In general, you should use assert when you can to validate arguments, but assert is usually disabled in non-debug builds and might not always be appropriate.
There are several things to consider that can vary from case to case, such as:
Do you expect your function to be called a lot? Is performance critical? If a caller will be invoking your function many, many times in a tight loop, then validating arguments can be expensive. This is especially bad for inline functions and if the runtime cost of the validation checks dwarfs the runtime cost of the rest of your function.
Are the checks easy for the caller to perform? If the checks are non-trivial, then it's less error-prone to do validation in the function itself than forcing the extra work on the callers. Note that in some cases, callers might not even be able to perform proper validation themselves (for example, if there's a possibility of a race condition in checking the argument's validity).
Is your function well documented? Does it clearly describe its preconditions, specifying what valid values for its arguments are? If so, then you usually should consider it the caller's responsibility to pass valid arguments.
Is your function self-documenting? Is it obvious to callers what valid arguments are?
Should passing a bad argument be a logic error or a runtime error? That is, should it be considered a programmer's mistake? Is it likely that the argument could come directly from user input? You should consider how you expect callers to use your function. If assertions are enabled, should a bad argument be fatal and terminate the program?
Who are your function's users? Is your function going to be used internally (where you might have some expectation for the competence of other programmers using it), or is it going to be exposed to the general public? If the latter, what failure mode will minimize the amount of technical support that you need to provide? (The stance I took with my dropt library is that I relied on assertions to validate arguments to internal functions and reported error codes for public functions.)
I notice that the standard c library contains several string functions that don't check the input parameter(whether it's NULL), like strcmp:
The string handling functions of the standard C library require that "... pointer arguments on such a call shall still have valid values ..." (C11 §7.24.1 2).
NULL is not a valid pointer to a string and there is no requirement on the function's part to check pointer validity, so no NULL check.
C's performance does come at a price.
When you design an API, would you check all parameters to make it function well or just let it go crash?
This enters design philosophy. Consider a simpler example. Should the API test the input parameters first? It depends on coding goals.
int add(int a, int b) {
return a + b;
}
// return 1 on failure
int add_safe(int *sum, int a, int b) {
if (a >= 0) {
if (b > INT_MAX - a) return 1; // Overflow
} else {
if (b < INT_MIN - a) return 1; // Underflow
} if (sum) { // NULL check
*sum = a + b;
}
return 0;
}
When in doubt, create an API that does nominal checking. Use strong checking of inputs that may originate from a user (human) or another process. If you want heavy pedantic checking, C is not an efficient target language choice.
With many an API I have made, NULL is a valid input value for a pointer, and the code adjusts its functionality based on that, as above. A NULL check is made, but it is not an error check.
Hello Friends,
How can I use an array of function pointers?
If we look at the link above, it tells us how function pointers work.
Now the question is: why should I choose a function pointer?
Why can't I use a function call directly?
What benefits will I get from function pointers?
e.g.:
enum{a, b, c};
void A();
void B();
void C();
void (*fun[3])() = {A, B, C};
Now when I need to call a function I am doing like,
fun[a](); // This calls function A
// And so on...
The same can be done with direct function calls: when I need to call function A, B, or C, I can directly write:
A();
B();
or
C();
Then why function pointer?
There are many reasons to use a function pointer, in particular for doing things generically in C.
The main place you'll see them being used is as an argument to a function. For example, the function bsearch uses a comparison function, passed as a function pointer, to compare items while it searches sorted data:
void *bsearch(const void *key, const void *base,
size_t nmemb, size_t size,
int (*compar)(const void *, const void *));
That allows bsearch to be generic and search any type of data, since only the comparison function has to know the type of the data.
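For instance, here is a self-contained sketch of a comparison callback for an array of ints:

#include <stdio.h>
#include <stdlib.h>

/* bsearch knows nothing about int; this callback supplies that knowledge */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

int main(void)
{
    int sorted[] = {2, 3, 5, 7, 11, 13};
    int key = 7;
    int *hit = bsearch(&key, sorted, sizeof sorted / sizeof sorted[0],
                       sizeof sorted[0], cmp_int);
    printf("%s\n", hit ? "found" : "not found");
    return 0;
}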
Another one has to do with avoiding repeated checks, i.e.:

void makePizza(void) { }
void bakeCake(void) { }

int iWantPizza = 1;
...
if (iWantPizza)
    makePizza();
else
    bakeCake();

/* want to call the same function again... */
if (iWantPizza)
    makePizza();
else
    bakeCake();
...

/* alternative */
void (*makeFood)(void) = iWantPizza ? makePizza : bakeCake;

makeFood();
/* want to call the same function again... */
makeFood();
Now the question is why should I choose function pointers?
What are the benefits I will get with function pointers?
You use function pointers when you need to implement an asynchronous mechanism.
You need a function to be called asynchronously when something happens.
How will you know which function to call?
The address of every function is unique, so you need to use and store the function's address.
Where do you store this function address?
In a function pointer.
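A minimal sketch of that idea, with invented names: one piece of code stores the address now, another calls through it later when the event fires:

typedef void (*event_handler_t)(int event_code);

static event_handler_t registered_handler = 0;

void register_handler(event_handler_t h)
{
    registered_handler = h;        /* remember which function to call later */
}

void on_event(int code)            /* invoked later, when something happens */
{
    if (registered_handler)
        registered_handler(code);  /* call whatever was registered */
}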
For the example that you showed, one more thing can be done.
Let's say there are a bunch of functions that need to run for some device operation.
The simple way is to write all the function calls in another master function and call that master function.
Another way is to put all the function names in an array initializer and call each one through a function pointer in a loop, as in the sketch below. That looks smart. I'm not sure how it helps you in a better way, but I have seen this in Linux kernel code.
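A sketch of that table-driven style, with invented step names; adding a bring-up step is just adding one line to the array:

#include <stdio.h>

static void init_clock(void)  { puts("clock ready"); }
static void init_memory(void) { puts("memory ready"); }
static void init_io(void)     { puts("io ready"); }

static void (*init_steps[])(void) = { init_clock, init_memory, init_io };

int main(void)
{
    for (size_t i = 0; i < sizeof init_steps / sizeof init_steps[0]; i++)
        init_steps[i]();   /* run every step through its pointer */
    return 0;
}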
I agree with all the answers here. Apart from that, I have some judgements of my own on when to use function pointers.
Let's take an example of some complex math calculation (like printing Fibonacci numbers, integration, Fourier transforms, etc...).
You have a function FX (which does that complicated math calculation, or anything else) that you use many times in your program. This function is used for many different jobs.
After using your program for a few months, you find out that for some jobs you can improve the function, while for others the current one is best.
What will you do? Write a new function, then go and change the function name at every call site?
Every time you find something better, you are going to do the same.
Instead, use a different function pointer for each kind of job. At the initial stage, all the pointers can point to one function. When you discover a better function for some job, just divert the pointer and you are done, as sketched below.
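A sketch of that diversion, using Fibonacci as a stand-in calculation; callers always go through the pointer, so swapping implementations is a one-line change:

/* two interchangeable implementations of the same job */
static long fib_slow(int n) { return n < 2 ? n : fib_slow(n - 1) + fib_slow(n - 2); }

static long fib_fast(int n)
{
    long a = 0, b = 1;
    while (n-- > 0) { long t = a + b; a = b; b = t; }
    return a;
}

static long (*fib)(int) = fib_slow;   /* later, when you find something better: */
                                      /* fib = fib_fast;  callers are unchanged */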
Take another scenario.
Here you have a really big codebase, like a mobile phone OS (not fully open, but shipped half compiled).
You need to add a Bluetooth driver to it for particular hardware.
Now, adding it or leaving it out is an option the OS makes available.
You may need to turn Bluetooth on/off from many places.
So what the OS does is define a function pointer that turns Bluetooth on and use it wherever it is needed. The OS code is already compiled, so you cannot add your code into it. But what you can do is write your function and make that pointer point to it.
This is close to what I have seen under the Android OS (not exactly, but near it).
In my experience, function pointers are mainly used to pass a function as a parameter to another function.
Looking at your code, they can also be used in arrays, so you can just loop through the entire array (which could hold hundreds of function pointers) and execute them all.
Function pointers are pointers (like variable pointers) which hold the address of a function. They actually call the underlying function when you invoke them, just as a direct function call does.
The main advantage of a function pointer is that you can pass it to another function as a parameter, for example...
What is the difference between Function Pointer vs Function Call?
It's like the difference between asking the compiler to "tell me the address of the National Gallery (I might want to go there later and I want to be ready to do it)", rather than "take me to the National Gallery right now (but I won't be paying attention to how you get me there so don't expect me to know later on)". Crucially, if you ask for the address/pointer you can write it down in some place like "next Sunday afternoon's big trip"... you don't even have to remember that it is the National Gallery you'll be going to - it can be a pleasant surprise when you get there - but you immediately know your Sunday's entertainment's all sorted.
What benefits will I get with function pointer?
Well, as above, at the time you set the function pointer you need to make a decision about where you'll call later, but then you can forget about all the reasons for making that decision and just know that later destination's all ready for use. At the time when you're actually doing stuff... "my Sunday routine: sleep in to 10, eat a big breakfast, go back to bed, have a shower, if I've got plenty of money then go on my Sunday afternoon big trip, meet friends for dinner"... the earlier decision just kicks in to get you to the gallery. Crucially, you can keep using your familiar Sunday schedule and start "pointing" the "next Sunday afternoon's big trip" address/pointer at new places as they catch your eye, even if they didn't exist when your general schedule was formed.
You see this post-facto flexibility to change the destination dispatched to at one step in an old routine illustrated well by AusCBloke's mention of bsearch. qsort is another classic example - it knows the big picture logic of efficiently sorting arrays of arbitrary things, but has to be told where to go to compare two of the things you're actually using it for.
These are good examples, particularly the first in Urvish's answer above. I have wondered the same thing, and I think the answer is purely design. In my mind, they give the same result: you can point to a function and get a+b, or you can just call a function directly and get a+b, and the examples on SO are usually small and trivial, for illustration. But if you had a 10k-line C program and had to change something in fifty places because you made one change, you'd probably pick up pretty quickly on why you'd want to use function pointers.
It also makes you appreciate the design of OOP languages and the philosophy behind OOD.
My program is written in C for Linux, and has many functions with different patterns for return values:
1) one or two return n on success and -1 on failure.
2) some return 0 on success and -1 on failure.
3) some return 1 on success and 0 on failure (I generally spurn using a boolean type).
4) pointers return 0 on failure (I generally spurn using NULL).
My confusion arises over the first three -- functions that return pointers always return 0 on failure, that's easy.
The first option usually involves functions which return a length which may only be positive.
The second option is usually involved with command-line processing functions, but I'm unsure of its correctness; perhaps better values would be EXIT_SUCCESS and EXIT_FAILURE?
The third option is intended for functions which are convenient and natural to be called within conditions, and I usually emulate a boolean type here, by using int values 1 and 0.
Despite all this seeming reasonably sensible, I still find areas where it's not clear or obvious which style to use when I create a function, or which style is in use when I want to call one.
So how can I add clarity to my approach when deciding upon return types here?
So how can I add clarity to my approach when deciding upon return types here?
Pick one pattern per return type and stick with it, or you'll drive yourself crazy. Model your pattern on the conventions that have long been established for the platform:
If you are making lots of system calls, then any integer-returning function should return -1 on failure.
If you are not making system calls, you are free to follow the convention of the C control structures that nonzero means success and zero means failure. (I don't know why you dislike bool.)
If a function returns a pointer, failure should be indicated by returning NULL.
If a function returns a floating-point number, failure should be indicated by returning a NaN.
If a function returns a full range of signed and unsigned integers, you probably should not be coding success or failure in the return value.
Testing of return values is a bane to C programmers. If failure is rare and you can write a central handler, consider using an exception macro package that can indicate failures using longjmp.
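As a taste of what such a macro package can look like, here is a deliberately minimal (single-level, not nestable or thread-safe) sketch built on setjmp/longjmp:

#include <setjmp.h>
#include <stdio.h>

static jmp_buf err_env;

#define TRY         if (setjmp(err_env) == 0)
#define CATCH       else
#define THROW(code) longjmp(err_env, (code))

static int twice_positive(int x)
{
    if (x < 0)
        THROW(1);   /* fail hard: unwind straight to the handler */
    return x * 2;
}

int main(void)
{
    TRY {
        printf("%d\n", twice_positive(21));
        printf("%d\n", twice_positive(-1));   /* jumps to CATCH */
    } CATCH {
        fprintf(stderr, "caught a failure\n");
    }
    return 0;
}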
Why don't you use the method used by the C standard library? Oh, wait...
Not an actual answer to your question, but some random comments you might find interesting:
it's normally obvious when to use case (1), but it gets ugly when unsigned types are involved - return (size_t)-1 still works, but it ain't pretty
if you're using C99, there's nothing wrong with using _Bool; imo, it's a lot cleaner than just using an int
I use return NULL instead of return 0 in pointer contexts (personal preference), but I rarely check for it as I find it more natural to just treat the pointer as a boolean; a common case would look like this:
struct foo *foo = create_foo();
if(!foo) /* handle error */;
I try to avoid case (2); using EXIT_SUCCESS and EXIT_FAILURE might be feasible, but imo this approach only makes sense if there are more than two possible outcomes and you'll have to use an enum anyway
for more complicated programs, it might make sense to implement your own error handling scheme; there are some fairly advanced implementations using setjmp()/longjmp() around, but I prefer something errno-like with different variables for different types of errors
One condition I can think of where your methodology above can fail is a function that can return any value, including -1: say, a function to add two signed numbers.
In that case, testing for -1 will surely be a bad idea.
If something fails, it would be better to set the global error flag the C standard provides in the form of errno, and use that to handle the error.
(The C++ standard library provides exceptions, which take away much of the hard work of error handling.)
For deterministic yes/no responses that can't fail, using a more specific (bool) return type can help maintain consistency. Going further, for higher-level interfaces one may want to think about returning or updating a system-specific messaging/result detail structure.
My preference for 0 to always be a success is based on the following ideas:
Zero enables some basic classing for organizing failures by negative vs. positive values, such as total failure vs. conditional success. I don't recommend this in general, as it tends to be a bit too shallow to be useful and might lead to dangerous behavioral assumptions.
When success is zero, one can make a bunch of independent calls and check for group success in a single condition later, simply by summing the return codes of the group:
rc = 0;
rc += func1();
rc += func2();
rc += func3();
if (rc == 0)
    /* success! */;
Most importantly, zero in my experience seems to be a consistent indication of success when working with standard libraries and third-party systems.
So how can I add clarity to my approach when deciding upon return types here?
Just the fact that you're thinking about this goes a long way. If you come up with one or two rules (or even more if they make sense; like you mention, you might want to handle returned pointers differently than other things), I think you'll be better off than many shops.
I personally like to have 0 returned to signal failure and non-zero to indicate success, but I don't have a strong need to hold to this. I can understand the philosophy that might want to reverse that sense so that you can return different reasons for the failure.
The most important thing is to have guidelines that get followed. Even nicer is to have guidelines that have a documented rationale (I believe that with rationales people are more likely to follow the guidelines). Like I said, just the fact that you're thinking about these things puts you ahead of many others.
That is a matter of preference, but what I have noticed is inconsistency. Consider this, using a pre-C99 compiler:
#define SUCCESS 1
#define ERROR 0
then any function that returns an int should return either one or the other, to minimize confusion, and stick to that religiously. Again, depending on and taking into account the development team, stick to their standard.
In pre-C99 compilers, an int of zero is false, and anything non-zero is true. If your compiler supports C99, use <stdbool.h> and its bool type.
The big advantage of C is that you can use your personal style, but where team effort is required, stick to the team's standard that has been laid out and follow it religiously; even after you leave that job, another programmer will be thankful to you.
And keep consistent.
Hope this helps,
Best regards,
Tom.
Much of the C standard library uses the strategy of returning true (or 1) on success and false (or 0) on failure, and storing the result in a passed-in location. More specific error codes than "it failed" are stored in the special variable errno.
Something like int add(int *result, int a, int b), which stores a+b in *result and returns 1 (or returns 0 and sets errno to a suitable value if, e.g., a+b happens to be larger than INT_MAX), as sketched below.
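A minimal sketch of that add, using ERANGE as the errno value for overflow:

#include <errno.h>
#include <limits.h>

/* returns 1 and stores a+b in *result, or returns 0 and sets errno */
int add(int *result, int a, int b)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        errno = ERANGE;   /* the specific reason lives in errno */
        return 0;
    }
    *result = a + b;
    return 1;
}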