I'm relatively new to C, and found it intriguing that both of the following calls to the function pointer compile and work fine. One with and one without dereferencing the function pointer before calling it.
#include <stdio.h>
#include <stdlib.h>
void func() {
puts("I'm a func");
}
int main(void) {
void (*f)() = func;
f();
(*f)();
return EXIT_SUCCESS;
}
I think I understand that (*f)() is the "official" way to call a function pointer, but why does simply calling f() work? Is that syntactic sugar of recent C versions?
This is a piece of syntactic/semantic sugar that has, AFAIK, worked since the very earliest versions of C. It makes sense if you think of functions as pointers to code.
The only special rule needed to make function pointers work this way is that indirecting a function pointer gives the same pointer back (because you can't manipulate code in standard C anyway): when f is a function pointer, then f == (*f) == (**f), etc.
(Aside: watch out with declaration such as void (*f)(). An empty argument list denotes an old-style, pre-C89 function declaration that matches on the return type only. Prefer void (*f)(void) for type safety.)
A function call expression is always of the form "function pointer", "round parenthesis", "arguments", "round parenthesis". In order for you not to have to spell out (&printf)("Hello World\n") every time1, there is a separate rule by which an expression which denotes a function decays to the respective function pointer.
Since a function pointer can be dereferenced to give an expression that denotes a function again, this will again decay, so you can keep dereferencing and there'll be a lot of decay:
(&f)(); // no decay (decay does not happen when the expression
// is the operand of &)
f(); // one decay
(*f)(); // two decays
(**f)(); // three decays
1) Early Perl has function syntax like that.
The C 2011 standard says in clause 6.3.2.1, paragraph 4:
Except when it is the operand of the sizeof operator, the _Alignof operator, or the unary & operator, a function designator with type ‘‘function returning type’’ is converted to an expression that has type ‘‘pointer to function returning type’’.
This means that, if f designates a function, then, in a call such as f(), f is automatically converted to &f, resulting in (&f)(). This is actually the “proper” way to call a function, because the function-call expression requires a pointer to a function.
Now consider what happens in *f. The f is automatically converted to &f, so we have *&f. In this expression, the result of the * operator is a function, f; it just reverse the operation performed by &. So *&f is the same as f. We can repeat this indefinitely: **f is automatically converted to **&f, which is *f, which is automatically converted to *&f, which is f, which is automatically converted to &f.
In C you can call your function like:
f();
(*f)();
(**f)();
(********f)();
(*****************************f)();
all are valid. In C, dereferencing or taking the address of a function just evaluates to a pointer to that function, and dereferencing a function pointer just evaluates back to the function pointer. C is designed in such a way that both function name identifier as well as variable holding function's pointer mean the same: address to CODE memory. And it allows to jump to that memory by using call () syntax either on an identifier or variable.
And the last but but least, standard says that:
C11: 6.3.2.1:
4 A function designator is an expression that has function type. Except when it is the
operand of the sizeof operator, the _Alignof operator,65) or the unary & operator, a
function designator with type ‘‘function returning type’’ is converted to an expression that has type ‘‘pointer to function returning type’’.
Related
This question already has answers here:
Why do function pointer definitions work with any number of ampersands '&' or asterisks '*'?
(5 answers)
Closed 4 years ago.
In the book, "Beginning C from Novice to Professional", the author does not use the address of operator when assigning a function to a function pointer. I typed in the code on my compiler both with and without the address of operator and it compiled and performed as expected both times. Why is this and which way would be preferred in an enterprise/business setting?
int sum(int, int);
int main(void)
{
...
int (*pfun)(int, int);
pfun = ∑
pfun = sum;
...
}
int sum(int x, int y)
{
return x + y;
}
This is a peculiarity of functions in C. The C standard says the following (C11 3.4.1p4):
A function designator is an expression that has function type. Except when it is the operand of the sizeof operator, the _Alignof operator, 65) or the unary & operator, a function designator with type ''function returning type'' is converted to an expression that has type ''pointer to function returning type''.
I.e. sum that is a function designator is in any expression context, except when preceded by & or the said 2 operators is converted to a pointer to function. Of course in the expression &sum, the result is a pointer to a function. And ISO C does not allow sizeof or _Alignof be applied to a function, so in any expression that compiles, a function designator is either implicitly, or in the case of address-of operator, explicitly converted to a pointer to a function.
Even the function call operator () requires that its operand be a pointer to function, hence you can call pfun without dereferencing: pfun(1, 2); and in sum(1, 2) sum is first converted to a pointer to a function, and then the function call operator is applied to this pointer.
There are coding conventions that say that a call through a function pointer should use the dereference operator *, i.e. (*pfun)(1, 2), and likewise that the assignment be written as pfun = ∑.
As such, writing (*pfun)(1, 2) would not make it clearer that it is a pointer as the same syntax would equally work for a function designator, i.e. (*sum)(1, 2); in the latter, sum is first converted to a pointer to a function since it is an operand to *; then the dereference converts the pointer to function to a function designator again, and then since it is an operand to a function call operator, it is converted to a function pointer again.
Lastly, beware that pfun being an object of function pointer type, &pfun would actually get the address of the pointer variable, which is almost never what you wanted.
Why and how does dereferencing a function pointer just "do nothing"?
This is what I am talking about:
#include<stdio.h>
void hello() { printf("hello"); }
int main(void) {
(*****hello)();
}
From a comment over here:
function pointers dereference just
fine, but the resulting function
designator will be immediately
converted back to a function pointer
And from an answer here:
Dereferencing (in way you think) a
function's pointer means: accessing a
CODE memory as it would be a DATA
memory.
Function pointer isn't suppose to be
dereferenced in that way. Instead, it
is called.
I would use a name "dereference" side
by side with "call". It's OK.
Anyway: C is designed in such a way
that both function name identifier as
well as variable holding function's
pointer mean the same: address to CODE
memory. And it allows to jump to that
memory by using call () syntax either
on an identifier or variable.
How exactly does dereferencing of a function pointer work?
It's not quite the right question. For C, at least, the right question is
What happens to a function value in an rvalue context?
(An rvalue context is anywhere a name or other reference appears where it should be used as a value, rather than a location — basically anywhere except on the left-hand side of an assignment. The name itself comes from the right-hand side of an assignment.)
OK, so what happens to a function value in an rvalue context? It is immediately and implicitly converted to a pointer to the original function value. If you dereference that pointer with *, you get the same function value back again, which is immediately and implicitly converted into a pointer. And you can do this as many times as you like.
Two similar experiments you can try:
What happens if you dereference a function pointer in an lvalue context—the left-hand side of an assignment. (The answer will be about what you expect, if you keep in mind that functions are immutable.)
An array value is also converted to a pointer in an lvalue context, but it is converted to a pointer to the element type, not to a pointer to the array. Dereferencing it will therefore give you an element, not an array, and the madness you show doesn't occur.
Hope this helps.
P.S. As to why a function value is implicitly converted to a pointer, the answer is that for those of us who use function pointers, it's a great convenience not to have to use &'s everywhere. There's a dual convenience as well: a function pointer in call position is automatically converted to a function value, so you don't have to write * to call through a function pointer.
P.P.S. Unlike C functions, C++ functions can be overloaded, and I'm not qualified to comment on how the semantics works in C++.
C++03 §4.3/1:
An lvalue of function type T can be converted to an rvalue of type “pointer to T.” The result is a pointer to the function.
If you attempt an invalid operation on a function reference, such as the unary * operator, the first thing the language tries is a standard conversion. It's just like converting an int when adding it to a float. Using * on a function reference causes the language to take its pointer instead, which in your example, is square 1.
Another case where this applies is when assigning a function pointer.
void f() {
void (*recurse)() = f; // "f" is a reference; implicitly convert to ptr.
recurse(); // call operator is defined for pointers
}
Note that this doesn't work the other way.
void f() {
void (&recurse)() = &f; // "&f" is a pointer; ERROR can't convert to ref.
recurse(); // OK - call operator is *separately* defined for references
}
Function reference variables are nice because they (in theory, I've never tested) hint to the compiler that an indirect branch may be unnecessary, if initialized in an enclosing scope.
In C99, dereferencing a function pointer yields a function designator. §6.3.2.1/4:
A function designator is an expression that has function type. Except when it is the operand of the sizeof operator or the unary & operator, a function designator with type ‘‘function returning type’’ is converted to an expression that has type ‘‘pointer to function returning type’’.
This is more like Norman's answer, but notably C99 has no concept of rvalues.
It happens with a few implicit conversions. Indeed, per the C standard:
ISO/IEC 2011, section 6.3.2.1 Lvalues, arrays, and function designators, paragraph 4
A function designator is an expression that has function type. Except when it is the operand of the sizeof operator or the unary & operator, a function designator with type “function returning type” is converted to an expression that has type “pointer to function returning type”.
Consider the following code:
void func(void);
int main(void)
{
void (*ptr)(void) = func;
return 0;
}
Here, the function designator func has the type “function returning void” but is immediately converted to an expression that has type “pointer to function returning void”. However, if you write
void (*ptr)(void) = &func;
then the function designator func has the type “function returning void” but the unary & operator explicitly take the address of that function, eventually yielding the type “pointer to function returning void”.
This is mentioned in the C standard:
ISO/IEC 2011, section 6.5.3.2 Address and indirection operators, paragraph 3
The unary & operator yields the address of its operand. If the operand has type “type”, the result has type “pointer to type”.
In particular, dereferencing a function pointer is redundant. Per the C standard:
ISO/IEC 2011, section 6.5.2.2 Function calls, paragraph 1
The expression that denotes the called function shall have type “pointer to function returning void” or returning a complete object type other than an array type. Most often, this is the result of converting an identifier that is a function designator.
ISO/IEC 2011, section 6.5.3.2 Address and indirection operators, paragraph 4
The unary * operator denotes indirection. If the operand points to a function, the result is a function designator.
So when you write
ptr();
the function call is evaluated with no implicit conversion because ptr is already a pointer to function. If you explicitly dereference it with
(*ptr)();
then the dereferencing yields the type “function returning void” which is immediately converted back to the type “pointer to function returning void” and the function call occurs. When writing an expression composed of x unary * indirection operators such as
(****ptr)();
then you just repeat the implicit conversions x times.
It does make sense that calling functions involves function pointers. Before executing a function, a program pushes all of the parameters for the function onto the stack in the reverse order that they are documented. Then the program issues a call instruction indicating which function it wishes to start. The call instruction does two things:
First it pushes the address of the next instruction, which is the return address, onto the stack.
Then, it modifies the instruction pointer %eip to point to the start of the function.
Since calling a function does involve modifying an instruction pointer, which is a memory address, it makes sense that the compiler implicitly converts a function designator to a pointer to function.
Even though it may seems unrigorous to have these implicit conversions, it can be useful in C (unlike C++ which have namespaces)
to take advantage of the namespace defined by a struct identifier to encapsulate variables.
Consider the following code:
void create_person(void);
void update_person(void);
void delete_person(void);
struct Person {
void (*create)(void);
void (*update)(void);
void (*delete)(void);
};
static struct Person person = {
.create = &create_person,
.update = &update_person,
.delete = &delete_person,
};
int main(void)
{
person.create();
person.update();
person.delete();
return 0;
}
It is possible to hide the implementation of the library in other translation units and to choose to only expose the struct encapsulating the pointers to functions, to use them in place of the actual function designators.
Put yourself in the shoes of the compiler writer. A function pointer has a well defined meaning, it is a pointer to a blob of bytes that represent machine code.
What do you do when the programmer dereferences a function pointer? Do you take the first (or 8) bytes of the machine code and reinterpret that as a pointer? Odds are about 2 billion to one that this won't work. Do you declare UB? Plenty of that going around already. Or do you just ignore the attempt? You know the answer.
How exactly does dereferencing of a function pointer work?
Two steps. The first step is at compile time, the second at runtime.
In step one, the compiler sees it has a pointer and a context in which that pointer is dereferenced (such as (*pFoo)() ) so it generates code for that situation, code that will be used in step 2.
In step 2, at runtime the code is executed. The pointer contains some bytes indicating which function should be executed next. These bytes are somehow loaded into the CPU. A common case is a CPU with an explicit CALL [register] instruction. On such systems, a function pointer can be simply the address of a function in memory, and the derefencing code does nothing more than loading that address into a register followed by a CALL [register] instruction.
Say for this simple code:
int foo(void);
int (*p)(void);
p = foo;
p = &foo;
int a = p();
int b = (*p)();
In the above example, Line 3&4 are both valid, Line 5&6 are also both valid.
In fact if you try to play with variable pointers like that when you play with function pointers, your program will surely die. So why is it valid for function pointers?
A function (more precisely, a function designator) is converted to a pointer to a function (except when it is the operand of sizeof, _Alignof, or the unary &). So foo and &foo are the same thing because foo is automatically converted to &foo.
Similarly, p is a pointer to a function, so *p is the function, so it is automatically converted to the address of the function.
In fact, because *p is converted back to a pointer right away, you can apply * to it again, as in int b = (**p)();. And you can do that again and again and again: int b = (************p)();.
From the 2011 C standard (committee draft N1570), 6.3.2.1 4:
A function designator is an expression that has function type. Except when it is the operand of the sizeof operator, the _Alignof operator, or the unary & operator, a function designator with type “function returning type” is converted to an expression that has type “pointer to function returning type”.
This is just a consequence of the fact that C developers wanted to make it easy to use both functions like foo and pointers-to-functions like p. If you had to write foo() for one and (*p)() for the other, that would be a bit of a nuisance. Or at least more typing. Or, if you had to write (&foo)() for one and p() for the other, that is again more typing. So they just said whenever you write a function, we will automatically make it the pointer, and you can call it without typing any more.
I suppose they could have said the function call operator can accept either a function or a pointer to a function, but they chose the automatic conversion instead.
Corollary: If you do not know whether something is a function, or a pointer to a function, or a pointer to a pointer to a function, etc., just slap a hundred asterisks in front of it, and the compiler will stop when it gets to the function.
According to §6.5.2.2/1 of the C11 Standard, in a function call:
The expression that denotes the called function shall have type pointer to function returning void or returning a complete object type other than an array type.
Now, in most expressions a function designator is converted to a function pointer (§6.3.2.1/4):
A function designator is an expression that has function type. Except when it is the operand of the sizeof operator, the _Alignof operator, or the unary & operator, a function designator with type ''function returning type'' is converted to an expression that has type ''pointer to function returning type''.
So, in a function call such as int a = p();, the function designator p is converted to a function pointer, which is needed in the expression before the postfix operator () for the function call.
But, according to §6.5.3.2/4 about the indirection operator:
If the operand points to a function, the result is a function designator
So, in a function call such as int b = (*p)();, the function designator p is converted to a function pointer, and the result of the indirection operator acting on this function pointer is a function designator, which itself is converted to the function pointer needed in the postfix expression for the function call.
By §6.5.2.2/1 above, foo is converted to a function pointer in the expression p = foo. But, in the expression p = &foo, the function designator is not converted to a function pointer. Here, §6.5.3.2/3 states that:
The unary & operator yields the address of its operand. If the operand has type ''type'', the result has type ''pointer to type''.... Otherwise, the result is a pointer to the object or function designated by its operand.
So the expression &foo evaluates to a pointer to the function designated by foo, as expected.
Why and how does dereferencing a function pointer just "do nothing"?
This is what I am talking about:
#include<stdio.h>
void hello() { printf("hello"); }
int main(void) {
(*****hello)();
}
From a comment over here:
function pointers dereference just
fine, but the resulting function
designator will be immediately
converted back to a function pointer
And from an answer here:
Dereferencing (in way you think) a
function's pointer means: accessing a
CODE memory as it would be a DATA
memory.
Function pointer isn't suppose to be
dereferenced in that way. Instead, it
is called.
I would use a name "dereference" side
by side with "call". It's OK.
Anyway: C is designed in such a way
that both function name identifier as
well as variable holding function's
pointer mean the same: address to CODE
memory. And it allows to jump to that
memory by using call () syntax either
on an identifier or variable.
How exactly does dereferencing of a function pointer work?
It's not quite the right question. For C, at least, the right question is
What happens to a function value in an rvalue context?
(An rvalue context is anywhere a name or other reference appears where it should be used as a value, rather than a location — basically anywhere except on the left-hand side of an assignment. The name itself comes from the right-hand side of an assignment.)
OK, so what happens to a function value in an rvalue context? It is immediately and implicitly converted to a pointer to the original function value. If you dereference that pointer with *, you get the same function value back again, which is immediately and implicitly converted into a pointer. And you can do this as many times as you like.
Two similar experiments you can try:
What happens if you dereference a function pointer in an lvalue context—the left-hand side of an assignment. (The answer will be about what you expect, if you keep in mind that functions are immutable.)
An array value is also converted to a pointer in an lvalue context, but it is converted to a pointer to the element type, not to a pointer to the array. Dereferencing it will therefore give you an element, not an array, and the madness you show doesn't occur.
Hope this helps.
P.S. As to why a function value is implicitly converted to a pointer, the answer is that for those of us who use function pointers, it's a great convenience not to have to use &'s everywhere. There's a dual convenience as well: a function pointer in call position is automatically converted to a function value, so you don't have to write * to call through a function pointer.
P.P.S. Unlike C functions, C++ functions can be overloaded, and I'm not qualified to comment on how the semantics works in C++.
C++03 §4.3/1:
An lvalue of function type T can be converted to an rvalue of type “pointer to T.” The result is a pointer to the function.
If you attempt an invalid operation on a function reference, such as the unary * operator, the first thing the language tries is a standard conversion. It's just like converting an int when adding it to a float. Using * on a function reference causes the language to take its pointer instead, which in your example, is square 1.
Another case where this applies is when assigning a function pointer.
void f() {
void (*recurse)() = f; // "f" is a reference; implicitly convert to ptr.
recurse(); // call operator is defined for pointers
}
Note that this doesn't work the other way.
void f() {
void (&recurse)() = &f; // "&f" is a pointer; ERROR can't convert to ref.
recurse(); // OK - call operator is *separately* defined for references
}
Function reference variables are nice because they (in theory, I've never tested) hint to the compiler that an indirect branch may be unnecessary, if initialized in an enclosing scope.
In C99, dereferencing a function pointer yields a function designator. §6.3.2.1/4:
A function designator is an expression that has function type. Except when it is the operand of the sizeof operator or the unary & operator, a function designator with type ‘‘function returning type’’ is converted to an expression that has type ‘‘pointer to function returning type’’.
This is more like Norman's answer, but notably C99 has no concept of rvalues.
It happens with a few implicit conversions. Indeed, per the C standard:
ISO/IEC 2011, section 6.3.2.1 Lvalues, arrays, and function designators, paragraph 4
A function designator is an expression that has function type. Except when it is the operand of the sizeof operator or the unary & operator, a function designator with type “function returning type” is converted to an expression that has type “pointer to function returning type”.
Consider the following code:
void func(void);
int main(void)
{
void (*ptr)(void) = func;
return 0;
}
Here, the function designator func has the type “function returning void” but is immediately converted to an expression that has type “pointer to function returning void”. However, if you write
void (*ptr)(void) = &func;
then the function designator func has the type “function returning void” but the unary & operator explicitly take the address of that function, eventually yielding the type “pointer to function returning void”.
This is mentioned in the C standard:
ISO/IEC 2011, section 6.5.3.2 Address and indirection operators, paragraph 3
The unary & operator yields the address of its operand. If the operand has type “type”, the result has type “pointer to type”.
In particular, dereferencing a function pointer is redundant. Per the C standard:
ISO/IEC 2011, section 6.5.2.2 Function calls, paragraph 1
The expression that denotes the called function shall have type “pointer to function returning void” or returning a complete object type other than an array type. Most often, this is the result of converting an identifier that is a function designator.
ISO/IEC 2011, section 6.5.3.2 Address and indirection operators, paragraph 4
The unary * operator denotes indirection. If the operand points to a function, the result is a function designator.
So when you write
ptr();
the function call is evaluated with no implicit conversion because ptr is already a pointer to function. If you explicitly dereference it with
(*ptr)();
then the dereferencing yields the type “function returning void” which is immediately converted back to the type “pointer to function returning void” and the function call occurs. When writing an expression composed of x unary * indirection operators such as
(****ptr)();
then you just repeat the implicit conversions x times.
It does make sense that calling functions involves function pointers. Before executing a function, a program pushes all of the parameters for the function onto the stack in the reverse order that they are documented. Then the program issues a call instruction indicating which function it wishes to start. The call instruction does two things:
First it pushes the address of the next instruction, which is the return address, onto the stack.
Then, it modifies the instruction pointer %eip to point to the start of the function.
Since calling a function does involve modifying an instruction pointer, which is a memory address, it makes sense that the compiler implicitly converts a function designator to a pointer to function.
Even though it may seems unrigorous to have these implicit conversions, it can be useful in C (unlike C++ which have namespaces)
to take advantage of the namespace defined by a struct identifier to encapsulate variables.
Consider the following code:
void create_person(void);
void update_person(void);
void delete_person(void);
struct Person {
void (*create)(void);
void (*update)(void);
void (*delete)(void);
};
static struct Person person = {
.create = &create_person,
.update = &update_person,
.delete = &delete_person,
};
int main(void)
{
person.create();
person.update();
person.delete();
return 0;
}
It is possible to hide the implementation of the library in other translation units and to choose to only expose the struct encapsulating the pointers to functions, to use them in place of the actual function designators.
Put yourself in the shoes of the compiler writer. A function pointer has a well defined meaning, it is a pointer to a blob of bytes that represent machine code.
What do you do when the programmer dereferences a function pointer? Do you take the first (or 8) bytes of the machine code and reinterpret that as a pointer? Odds are about 2 billion to one that this won't work. Do you declare UB? Plenty of that going around already. Or do you just ignore the attempt? You know the answer.
How exactly does dereferencing of a function pointer work?
Two steps. The first step is at compile time, the second at runtime.
In step one, the compiler sees it has a pointer and a context in which that pointer is dereferenced (such as (*pFoo)() ) so it generates code for that situation, code that will be used in step 2.
In step 2, at runtime the code is executed. The pointer contains some bytes indicating which function should be executed next. These bytes are somehow loaded into the CPU. A common case is a CPU with an explicit CALL [register] instruction. On such systems, a function pointer can be simply the address of a function in memory, and the derefencing code does nothing more than loading that address into a register followed by a CALL [register] instruction.
Suppose there is a function pointer:
void func(float a1, float a2) {
}
void (*fptr)(float, float) = &func;
Is there any difference between these two lines (both compile and work on my compiler)?
(*fptr)(1,2);
fptr(1,2);
I suppose that the second version is just a shortcut of the first one, but want to ensure myself. And more important is it a standard behavior?
They do the same thing.
The prefix of a function call is always an expression of pointer-to-function type.
An expression of pointer type, such as the name of a declared function. is implicitly converted to a pointer to that function, unless it's the operand of a unary "&" (which yields the function's address anyway) or of sizeof (which is illegal rather than yielding the size of a pointer).
The consequence of this rule is that all of these:
&func
func
*func
**func
are equivalent. They all evaluate to the address of the function, and they can all (if suitably parenthesized) be used as the prefix of a function call.
Yes, it is standard behavior and will work in all C compilers.
Function calls are actually always made through a pointer to function:
6.5.22 "Function calls":
The expression that denotes the called function (footnote 77) shall have type pointer to function returning void or returning an object type other than an array type
As footnote 77 indicates, a 'normal' function identifier is actually converted to a pointer-to-function when a function call is expression is used:
Footnote 77:
Most often, this is the result of converting an identifier that is a function designator.
Interestingly, dereferencing a function pointer (or a function designator converted to a function pointer) yields a function designator for the function, which will be converted back to a function pointer for the function call expression. So you can dereference the function pointer (or a normal function name), but don't have to.
6.5.3.2/4 "Address and indirection operators":
The unary * operator denotes indirection. If the operand points to a function, the result is a function designator
So as far as the standard is concerned, for your examples:
(*fptr)(1,2);
fptr(1,2);
The second isn't really a shorthand - it's what the compiler expects. Actually, the first example is really just a more verbose way of saying the same thing as the second. But some people might prefer the first form as it makes it more clear that a pointer is being used. There should be absolutely no difference in the generated code.
Another consequence of the standard's wording is that you can just as well dereference a normal function name:
func(1,2);
(*func)(1,2); // equivalent to the above
Not that there's really any defensible point to doing that that I can think of.