Related
Why do most C programmers name variables like this:
int *myVariable;
rather than like this:
int* myVariable;
Both are valid. It seems to me that the asterisk is a part of the type, not a part of the variable name. Can anyone explain this logic?
They are EXACTLY equivalent.
However, in
int *myVariable, myVariable2;
It seems obvious that myVariable has type int*, while myVariable2 has type int.
In
int* myVariable, myVariable2;
it may seem obvious that both are of type int*, but that is not correct as myVariable2 has type int.
Therefore, the first programming style is more intuitive.
If you look at it another way, *myVariable is of type int, which makes some sense.
Something nobody has mentioned here so far is that this asterisk is actually the "dereference operator" in C.
*a = 10;
The line above doesn't mean I want to assign 10 to a, it means I want to assign 10 to whatever memory location a points to. And I have never seen anyone writing
* a = 10;
have you? So the dereference operator is pretty much always written without a space. This is probably to distinguish it from a multiplication broken across multiple lines:
x = a * b * c * d
* e * f * g;
Here *e would be misleading, wouldn't it?
Okay, now what does the following line actually mean:
int *a;
Most people would say:
It means that a is a pointer to an int value.
This is technically correct, most people like to see/read it that way and that is the way how modern C standards would define it (note that language C itself predates all the ANSI and ISO standards). But it's not the only way to look at it. You can also read this line as follows:
The dereferenced value of a is of type int.
So in fact the asterisk in this declaration can also be seen as a dereference operator, which also explains its placement. And that a is a pointer is not really declared at all, it's implicit by the fact, that the only thing you can actually dereference is a pointer.
The C standard only defines two meanings to the * operator:
indirection operator
multiplication operator
And indirection is just a single meaning, there is no extra meaning for declaring a pointer, there is just indirection, which is what the dereference operation does, it performs an indirect access, so also within a statement like int *a; this is an indirect access (* means indirect access) and thus the second statement above is much closer to the standard than the first one is.
Because the * in that line binds more closely to the variable than to the type:
int* varA, varB; // This is misleading
As #Lundin points out below, const adds even more subtleties to think about. You can entirely sidestep this by declaring one variable per line, which is never ambiguous:
int* varA;
int varB;
The balance between clear code and concise code is hard to strike — a dozen redundant lines of int a; isn't good either. Still, I default to one declaration per line and worry about combining code later.
I'm going to go out on a limb here and say that there is a straight answer to this question, both for variable declarations and for parameter and return types, which is that the asterisk should go next to the name: int *myVariable;. To appreciate why, look at how you declare other types of symbol in C:
int my_function(int arg); for a function;
float my_array[3] for an array.
The general pattern, referred to as declaration follows use, is that the type of a symbol is split up into the part before the name, and the parts around the name, and these parts around the name mimic the syntax you would use to get a value of the type on the left:
int a_return_value = my_function(729);
float an_element = my_array[2];
and: int copy_of_value = *myVariable;.
C++ throws a spanner in the works with references, because the syntax at the point where you use references is identical to that of value types, so you could argue that C++ takes a different approach to C. On the other hand, C++ retains the same behaviour of C in the case of pointers, so references really stand as the odd one out in this respect.
A great guru once said "Read it the way of the compiler, you must."
http://www.drdobbs.com/conversationsa-midsummer-nights-madness/184403835
Granted this was on the topic of const placement, but the same rule applies here.
The compiler reads it as:
int (*a);
not as:
(int*) a;
If you get into the habit of placing the star next to the variable, it will make your declarations easier to read. It also avoids eyesores such as:
int* a[10];
-- Edit --
To explain exactly what I mean when I say it's parsed as int (*a), that means that * binds more tightly to a than it does to int, in very much the manner that in the expression 4 + 3 * 7 3 binds more tightly to 7 than it does to 4 due to the higher precedence of *.
With apologies for the ascii art, a synopsis of the A.S.T. for parsing int *a looks roughly like this:
Declaration
/ \
/ \
Declaration- Init-
Secifiers Declarator-
| List
| |
| ...
"int" |
Declarator
/ \
/ ...
Pointer \
| Identifier
| |
"*" |
"a"
As is clearly shown, * binds more tightly to a since their common ancestor is Declarator, while you need to go all the way up the tree to Declaration to find a common ancestor that involves the int.
That's just a matter of preference.
When you read the code, distinguishing between variables and pointers is easier in the second case, but it may lead to confusion when you are putting both variables and pointers of a common type in a single line (which itself is often discouraged by project guidelines, because decreases readability).
I prefer to declare pointers with their corresponding sign next to type name, e.g.
int* pMyPointer;
People who prefer int* x; are trying to force their code into a fictional world where the type is on the left and the identifier (name) is on the right.
I say "fictional" because:
In C and C++, in the general case, the declared identifier is surrounded by the type information.
That may sound crazy, but you know it to be true. Here are some examples:
int main(int argc, char *argv[]) means "main is a function that takes an int and an array of pointers to char and returns an int." In other words, most of the type information is on the right. Some people think function declarations don't count because they're somehow "special." OK, let's try a variable.
void (*fn)(int) means fn is a pointer to a function that takes an int and returns nothing.
int a[10] declares 'a' as an array of 10 ints.
pixel bitmap[height][width].
Clearly, I've cherry-picked examples that have a lot of type info on the right to make my point. There are lots of declarations where most--if not all--of the type is on the left, like struct { int x; int y; } center.
This declaration syntax grew out of K&R's desire to have declarations reflect the usage. Reading simple declarations is intuitive, and reading more complex ones can be mastered by learning the right-left-right rule (sometimes call the spiral rule or just the right-left rule).
C is simple enough that many C programmers embrace this style and write simple declarations as int *p.
In C++, the syntax got a little more complex (with classes, references, templates, enum classes), and, as a reaction to that complexity, you'll see more effort into separating the type from the identifier in many declarations. In other words, you might see see more of int* p-style declarations if you check out a large swath of C++ code.
In either language, you can always have the type on the left side of variable declarations by (1) never declaring multiple variables in the same statement, and (2) making use of typedefs (or alias declarations, which, ironically, put the alias identifiers to the left of types). For example:
typedef int array_of_10_ints[10];
array_of_10_ints a;
A lot of the arguments in this topic are plain subjective and the argument about "the star binds to the variable name" is naive. Here's a few arguments that aren't just opinions:
The forgotten pointer type qualifiers
Formally, the "star" neither belongs to the type nor to the variable name, it is part of its own grammatical item named pointer. The formal C syntax (ISO 9899:2018) is:
(6.7) declaration:
declaration-specifiers init-declarator-listopt ;
Where declaration-specifiers contains the type (and storage), and the init-declarator-list contains the pointer and the variable name. Which we see if we dissect this declarator list syntax further:
(6.7.6) declarator:
pointeropt direct-declarator
...
(6.7.6) pointer:
* type-qualifier-listopt
* type-qualifier-listopt pointer
Where a declarator is the whole declaration, a direct-declarator is the identifier (variable name), and a pointer is the star followed by an optional type qualifier list belonging to the pointer itself.
What makes the various style arguments about "the star belongs to the variable" inconsistent, is that they have forgotten about these pointer type qualifiers. int* const x, int *const x or int*const x?
Consider int *const a, b;, what are the types of a and b? Not so obvious that "the star belongs to the variable" any longer. Rather, one would start to ponder where the const belongs to.
You can definitely make a sound argument that the star belongs to the pointer type qualifier, but not much beyond that.
The type qualifier list for the pointer can cause problems for those using the int *a style. Those who use pointers inside a typedef (which we shouldn't, very bad practice!) and think "the star belongs to the variable name" tend to write this very subtle bug:
/*** bad code, don't do this ***/
typedef int *bad_idea_t;
...
void func (const bad_idea_t *foo);
This compiles cleanly. Now you might think the code is made const correct. Not so! This code is accidentally a faked const correctness.
The type of foo is actually int*const* - the outer most pointer was made read-only, not the pointed at data. So inside this function we can do **foo = n; and it will change the variable value in the caller.
This is because in the expression const bad_idea_t *foo, the * does not belong to the variable name here! In pseudo code, this parameter declaration is to be read as const (bad_idea_t *) foo and not as (const bad_idea_t) *foo. The star belongs to the hidden pointer type in this case - the type is a pointer and a const-qualified pointer is written as *const.
But then the root of the problem in the above example is the practice of hiding pointers behind a typedef and not the * style.
Regarding declaration of multiple variables on a single line
Declaring multiple variables on a single line is widely recognized as bad practice1). CERT-C sums it up nicely as:
DCL04-C. Do not declare more than one variable per declaration
Just reading the English, then common sense agrees that a declaration should be one declaration.
And it doesn't matter if the variables are pointers or not. Declaring each variable on a single line makes the code clearer in almost every case.
So the argument about the programmer getting confused over int* a, b is bad. The root of the problem is the use of multiple declarators, not the placement of the *. Regardless of style, you should be writing this instead:
int* a; // or int *a
int b;
Another sound but subjective argument would be that given int* a the type of a is without question int* and so the star belongs with the type qualifier.
But basically my conclusion is that many of the arguments posted here are just subjective and naive. You can't really make a valid argument for either style - it is truly a matter of subjective personal preference.
1) CERT-C DCL04-C.
Because it makes more sense when you have declarations like:
int *a, *b;
For declaring multiple pointers in one line, I prefer int* a, * b; which more intuitively declares "a" as a pointer to an integer, and doesn't mix styles when likewise declaring "b." Like someone said, I wouldn't declare two different types in the same statement anyway.
When you initialize and assign a variable in one statement, e.g.
int *a = xyz;
you assign the value of xyz to a, not to *a. This makes
int* a = xyz;
a more consistent notation.
Why do most C programmers name variables like this:
int *myVariable;
rather than like this:
int* myVariable;
Both are valid. It seems to me that the asterisk is a part of the type, not a part of the variable name. Can anyone explain this logic?
They are EXACTLY equivalent.
However, in
int *myVariable, myVariable2;
It seems obvious that myVariable has type int*, while myVariable2 has type int.
In
int* myVariable, myVariable2;
it may seem obvious that both are of type int*, but that is not correct as myVariable2 has type int.
Therefore, the first programming style is more intuitive.
If you look at it another way, *myVariable is of type int, which makes some sense.
Something nobody has mentioned here so far is that this asterisk is actually the "dereference operator" in C.
*a = 10;
The line above doesn't mean I want to assign 10 to a, it means I want to assign 10 to whatever memory location a points to. And I have never seen anyone writing
* a = 10;
have you? So the dereference operator is pretty much always written without a space. This is probably to distinguish it from a multiplication broken across multiple lines:
x = a * b * c * d
* e * f * g;
Here *e would be misleading, wouldn't it?
Okay, now what does the following line actually mean:
int *a;
Most people would say:
It means that a is a pointer to an int value.
This is technically correct, most people like to see/read it that way and that is the way how modern C standards would define it (note that language C itself predates all the ANSI and ISO standards). But it's not the only way to look at it. You can also read this line as follows:
The dereferenced value of a is of type int.
So in fact the asterisk in this declaration can also be seen as a dereference operator, which also explains its placement. And that a is a pointer is not really declared at all, it's implicit by the fact, that the only thing you can actually dereference is a pointer.
The C standard only defines two meanings to the * operator:
indirection operator
multiplication operator
And indirection is just a single meaning, there is no extra meaning for declaring a pointer, there is just indirection, which is what the dereference operation does, it performs an indirect access, so also within a statement like int *a; this is an indirect access (* means indirect access) and thus the second statement above is much closer to the standard than the first one is.
Because the * in that line binds more closely to the variable than to the type:
int* varA, varB; // This is misleading
As #Lundin points out below, const adds even more subtleties to think about. You can entirely sidestep this by declaring one variable per line, which is never ambiguous:
int* varA;
int varB;
The balance between clear code and concise code is hard to strike — a dozen redundant lines of int a; isn't good either. Still, I default to one declaration per line and worry about combining code later.
I'm going to go out on a limb here and say that there is a straight answer to this question, both for variable declarations and for parameter and return types, which is that the asterisk should go next to the name: int *myVariable;. To appreciate why, look at how you declare other types of symbol in C:
int my_function(int arg); for a function;
float my_array[3] for an array.
The general pattern, referred to as declaration follows use, is that the type of a symbol is split up into the part before the name, and the parts around the name, and these parts around the name mimic the syntax you would use to get a value of the type on the left:
int a_return_value = my_function(729);
float an_element = my_array[2];
and: int copy_of_value = *myVariable;.
C++ throws a spanner in the works with references, because the syntax at the point where you use references is identical to that of value types, so you could argue that C++ takes a different approach to C. On the other hand, C++ retains the same behaviour of C in the case of pointers, so references really stand as the odd one out in this respect.
A great guru once said "Read it the way of the compiler, you must."
http://www.drdobbs.com/conversationsa-midsummer-nights-madness/184403835
Granted this was on the topic of const placement, but the same rule applies here.
The compiler reads it as:
int (*a);
not as:
(int*) a;
If you get into the habit of placing the star next to the variable, it will make your declarations easier to read. It also avoids eyesores such as:
int* a[10];
-- Edit --
To explain exactly what I mean when I say it's parsed as int (*a), that means that * binds more tightly to a than it does to int, in very much the manner that in the expression 4 + 3 * 7 3 binds more tightly to 7 than it does to 4 due to the higher precedence of *.
With apologies for the ascii art, a synopsis of the A.S.T. for parsing int *a looks roughly like this:
Declaration
/ \
/ \
Declaration- Init-
Secifiers Declarator-
| List
| |
| ...
"int" |
Declarator
/ \
/ ...
Pointer \
| Identifier
| |
"*" |
"a"
As is clearly shown, * binds more tightly to a since their common ancestor is Declarator, while you need to go all the way up the tree to Declaration to find a common ancestor that involves the int.
That's just a matter of preference.
When you read the code, distinguishing between variables and pointers is easier in the second case, but it may lead to confusion when you are putting both variables and pointers of a common type in a single line (which itself is often discouraged by project guidelines, because decreases readability).
I prefer to declare pointers with their corresponding sign next to type name, e.g.
int* pMyPointer;
People who prefer int* x; are trying to force their code into a fictional world where the type is on the left and the identifier (name) is on the right.
I say "fictional" because:
In C and C++, in the general case, the declared identifier is surrounded by the type information.
That may sound crazy, but you know it to be true. Here are some examples:
int main(int argc, char *argv[]) means "main is a function that takes an int and an array of pointers to char and returns an int." In other words, most of the type information is on the right. Some people think function declarations don't count because they're somehow "special." OK, let's try a variable.
void (*fn)(int) means fn is a pointer to a function that takes an int and returns nothing.
int a[10] declares 'a' as an array of 10 ints.
pixel bitmap[height][width].
Clearly, I've cherry-picked examples that have a lot of type info on the right to make my point. There are lots of declarations where most--if not all--of the type is on the left, like struct { int x; int y; } center.
This declaration syntax grew out of K&R's desire to have declarations reflect the usage. Reading simple declarations is intuitive, and reading more complex ones can be mastered by learning the right-left-right rule (sometimes call the spiral rule or just the right-left rule).
C is simple enough that many C programmers embrace this style and write simple declarations as int *p.
In C++, the syntax got a little more complex (with classes, references, templates, enum classes), and, as a reaction to that complexity, you'll see more effort into separating the type from the identifier in many declarations. In other words, you might see see more of int* p-style declarations if you check out a large swath of C++ code.
In either language, you can always have the type on the left side of variable declarations by (1) never declaring multiple variables in the same statement, and (2) making use of typedefs (or alias declarations, which, ironically, put the alias identifiers to the left of types). For example:
typedef int array_of_10_ints[10];
array_of_10_ints a;
A lot of the arguments in this topic are plain subjective and the argument about "the star binds to the variable name" is naive. Here's a few arguments that aren't just opinions:
The forgotten pointer type qualifiers
Formally, the "star" neither belongs to the type nor to the variable name, it is part of its own grammatical item named pointer. The formal C syntax (ISO 9899:2018) is:
(6.7) declaration:
declaration-specifiers init-declarator-listopt ;
Where declaration-specifiers contains the type (and storage), and the init-declarator-list contains the pointer and the variable name. Which we see if we dissect this declarator list syntax further:
(6.7.6) declarator:
pointeropt direct-declarator
...
(6.7.6) pointer:
* type-qualifier-listopt
* type-qualifier-listopt pointer
Where a declarator is the whole declaration, a direct-declarator is the identifier (variable name), and a pointer is the star followed by an optional type qualifier list belonging to the pointer itself.
What makes the various style arguments about "the star belongs to the variable" inconsistent, is that they have forgotten about these pointer type qualifiers. int* const x, int *const x or int*const x?
Consider int *const a, b;, what are the types of a and b? Not so obvious that "the star belongs to the variable" any longer. Rather, one would start to ponder where the const belongs to.
You can definitely make a sound argument that the star belongs to the pointer type qualifier, but not much beyond that.
The type qualifier list for the pointer can cause problems for those using the int *a style. Those who use pointers inside a typedef (which we shouldn't, very bad practice!) and think "the star belongs to the variable name" tend to write this very subtle bug:
/*** bad code, don't do this ***/
typedef int *bad_idea_t;
...
void func (const bad_idea_t *foo);
This compiles cleanly. Now you might think the code is made const correct. Not so! This code is accidentally a faked const correctness.
The type of foo is actually int*const* - the outer most pointer was made read-only, not the pointed at data. So inside this function we can do **foo = n; and it will change the variable value in the caller.
This is because in the expression const bad_idea_t *foo, the * does not belong to the variable name here! In pseudo code, this parameter declaration is to be read as const (bad_idea_t *) foo and not as (const bad_idea_t) *foo. The star belongs to the hidden pointer type in this case - the type is a pointer and a const-qualified pointer is written as *const.
But then the root of the problem in the above example is the practice of hiding pointers behind a typedef and not the * style.
Regarding declaration of multiple variables on a single line
Declaring multiple variables on a single line is widely recognized as bad practice1). CERT-C sums it up nicely as:
DCL04-C. Do not declare more than one variable per declaration
Just reading the English, then common sense agrees that a declaration should be one declaration.
And it doesn't matter if the variables are pointers or not. Declaring each variable on a single line makes the code clearer in almost every case.
So the argument about the programmer getting confused over int* a, b is bad. The root of the problem is the use of multiple declarators, not the placement of the *. Regardless of style, you should be writing this instead:
int* a; // or int *a
int b;
Another sound but subjective argument would be that given int* a the type of a is without question int* and so the star belongs with the type qualifier.
But basically my conclusion is that many of the arguments posted here are just subjective and naive. You can't really make a valid argument for either style - it is truly a matter of subjective personal preference.
1) CERT-C DCL04-C.
Because it makes more sense when you have declarations like:
int *a, *b;
For declaring multiple pointers in one line, I prefer int* a, * b; which more intuitively declares "a" as a pointer to an integer, and doesn't mix styles when likewise declaring "b." Like someone said, I wouldn't declare two different types in the same statement anyway.
When you initialize and assign a variable in one statement, e.g.
int *a = xyz;
you assign the value of xyz to a, not to *a. This makes
int* a = xyz;
a more consistent notation.
void fun1()
{
puts("In fun1\n");
}
void fun2(void)
{
puts("In fun2\n");
}
main()
{
fun1();
fun1(5);
fun2();
}`
It is working properly but i want to catch the parameters value that i've pass, but in this format only.
I am not sure if I can comprehend what you mean by "but in this format only."
If you meant you don't have the liberty to change void fun1() to void fun1(<some suitable data type>), then you can't achieve what you are trying to.
One of the most basic and common ways to pass arguments to fun1() is by changing
void fun1()
to
void fun1(int a) //int for example
I have used int as the data type of the argument, it can be any of the following:
int
unsigned int
char
unsigned char
short
unsigned short
long
unsigned long
and so on..
There are many ways to use arguments to a function. You can possibly pick any basic standard reference book for C programming and see the chapter that deals with functions.
For a beginner, reading and more importantly understanding the C standard can be very hard, but its always the way going ahead. I'll provide you with the link to refer the C11 standard, which is the latest one.
Referring to C11 standard, section 6.5.2.2 Function calls,
If the expression that denotes the called function has a type that includes a prototype, the number of arguments shall agree with the number of parameters. Each argument shall have a type such that its value may be assigned to an object with the unqualified version of the type of its corresponding parameter.
and
An argument may be an expression of any complete object type. In preparing for the call to a function, the arguments are evaluated, and each parameter is assigned the value of the corresponding argument
In short: You can't.
A little more detailed: A function prototype with just empty argument list means the function might be called with any number of arguments. You shouldn't use this to begin with, it's obsolete because it is unsafe practice, just declare a complete prototype with all arguments.
That said, pre-standard C had a mechanism to completely dynamically fetch arguments: varargs.h. DO NOT USE THIS, it is nowadays replaced by stdarg.h which requires at least one declared argument.
I'm reading this C BNF grammar. I have the following questions:
Is correct which it's <declarator> job to parse this syntax: id(int a, int b) (in <direct-declarator>) and so on to arrays in parameters of a function prototype/definition, etc;
In <function-definition>, why is <declarator> followed by a {<declaration>}* ?
from what I understood, it could make valid a type name or storage class followed by a function header like id(int a, int b) int. But I'm sure it isn't valid in C. What am I missing?
Yes, <declarator> in that grammar is the name of the object being declared, plus its arguments or array size (and also the pointer qualifiers of its type). <declarator> does not include the base type (return type for a function; element type for an array).
Note that there are two alternatives in <direct-declarator>, both of which seem relevant to functions:
<direct-declarator> ( <parameter-type-list> )
<direct-declarator> ( {<identifier>}* )
The first of these is what we normally think of as a function declaration, where the parameters are types or types-with-parameter-name. The second one is just a list of identifiers. (It should, I think, be a comma-separated list of identifiers.) The second case is the old-style "K&R" function definition syntax, which you may never have seen before, and should immediately forget about after reading this answer because -- while C compilers still accept it -- it has been deprecated for not just years, but decades. So don't use it. For historical completeness, here's how it looked:
int foo(n, p)
int n;
char p;
{
/ Body of the function */
}
Let's say I have a function that accepts a void (*)(void*) function pointer for use as a callback:
void do_stuff(void (*callback_fp)(void*), void* callback_arg);
Now, if I have a function like this:
void my_callback_function(struct my_struct* arg);
Can I do this safely?
do_stuff((void (*)(void*)) &my_callback_function, NULL);
I've looked at this question and I've looked at some C standards which say you can cast to 'compatible function pointers', but I cannot find a definition of what 'compatible function pointer' means.
As far as the C standard is concerned, if you cast a function pointer to a function pointer of a different type and then call that, it is undefined behavior. See Annex J.2 (informative):
The behavior is undefined in the following circumstances:
A pointer is used to call a function whose type is not compatible with the pointed-to
type (6.3.2.3).
Section 6.3.2.3, paragraph 8 reads:
A pointer to a function of one type may be converted to a pointer to a function of another
type and back again; the result shall compare equal to the original pointer. If a converted
pointer is used to call a function whose type is not compatible with the pointed-to type,
the behavior is undefined.
So in other words, you can cast a function pointer to a different function pointer type, cast it back again, and call it, and things will work.
The definition of compatible is somewhat complicated. It can be found in section 6.7.5.3, paragraph 15:
For two function types to be compatible, both shall specify compatible return types127.
Moreover, the parameter type lists, if both are present, shall agree in the number of
parameters and in use of the ellipsis terminator; corresponding parameters shall have
compatible types. If one type has a parameter type list and the other type is specified by a
function declarator that is not part of a function definition and that contains an empty
identifier list, the parameter list shall not have an ellipsis terminator and the type of each
parameter shall be compatible with the type that results from the application of the
default argument promotions. If one type has a parameter type list and the other type is
specified by a function definition that contains a (possibly empty) identifier list, both shall
agree in the number of parameters, and the type of each prototype parameter shall be
compatible with the type that results from the application of the default argument
promotions to the type of the corresponding identifier. (In the determination of type
compatibility and of a composite type, each parameter declared with function or array
type is taken as having the adjusted type and each parameter declared with qualified type
is taken as having the unqualified version of its declared type.)
127) If both function types are ‘‘old style’’, parameter types are not compared.
The rules for determining whether two types are compatible are described in section 6.2.7, and I won't quote them here since they're rather lengthy, but you can read them on the draft of the C99 standard (PDF).
The relevant rule here is in section 6.7.5.1, paragraph 2:
For two pointer types to be compatible, both shall be identically qualified and both shall be pointers to compatible types.
Hence, since a void* is not compatible with a struct my_struct*, a function pointer of type void (*)(void*) is not compatible with a function pointer of type void (*)(struct my_struct*), so this casting of function pointers is technically undefined behavior.
In practice, though, you can safely get away with casting function pointers in some cases. In the x86 calling convention, arguments are pushed on the stack, and all pointers are the same size (4 bytes in x86 or 8 bytes in x86_64). Calling a function pointer boils down to pushing the arguments on the stack and doing an indirect jump to the function pointer target, and there's obviously no notion of types at the machine code level.
Things you definitely can't do:
Cast between function pointers of different calling conventions. You will mess up the stack and at best, crash, at worst, succeed silently with a huge gaping security hole. In Windows programming, you often pass function pointers around. Win32 expects all callback functions to use the stdcall calling convention (which the macros CALLBACK, PASCAL, and WINAPI all expand to). If you pass a function pointer that uses the standard C calling convention (cdecl), badness will result.
In C++, cast between class member function pointers and regular function pointers. This often trips up C++ newbies. Class member functions have a hidden this parameter, and if you cast a member function to a regular function, there's no this object to use, and again, much badness will result.
Another bad idea that might sometimes work but is also undefined behavior:
Casting between function pointers and regular pointers (e.g. casting a void (*)(void) to a void*). Function pointers aren't necessarily the same size as regular pointers, since on some architectures they might contain extra contextual information. This will probably work ok on x86, but remember that it's undefined behavior.
I asked about this exact same issue regarding some code in GLib recently. (GLib is a core library for the GNOME project and written in C.) I was told the entire slots'n'signals framework depends upon it.
Throughout the code, there are numerous instances of casting from type (1) to (2):
typedef int (*CompareFunc) (const void *a,
const void *b)
typedef int (*CompareDataFunc) (const void *b,
const void *b,
void *user_data)
It is common to chain-thru with calls like this:
int stuff_equal (GStuff *a,
GStuff *b,
CompareFunc compare_func)
{
return stuff_equal_with_data(a, b, (CompareDataFunc) compare_func, NULL);
}
int stuff_equal_with_data (GStuff *a,
GStuff *b,
CompareDataFunc compare_func,
void *user_data)
{
int result;
/* do some work here */
result = compare_func (data1, data2, user_data);
return result;
}
See for yourself here in g_array_sort(): http://git.gnome.org/browse/glib/tree/glib/garray.c
The answers above are detailed and likely correct -- if you sit on the standards committee. Adam and Johannes deserve credit for their well-researched responses. However, out in the wild, you will find this code works just fine. Controversial? Yes. Consider this: GLib compiles/works/tests on a large number of platforms (Linux/Solaris/Windows/OS X) with a wide variety of compilers/linkers/kernel loaders (GCC/CLang/MSVC). Standards be damned, I guess.
I spent some time thinking about these answers. Here is my conclusion:
If you are writing a callback library, this might be OK. Caveat emptor -- use at your own risk.
Else, don't do it.
Thinking deeper after writing this response, I would not be surprised if the code for C compilers uses this same trick. And since (most/all?) modern C compilers are bootstrapped, this would imply the trick is safe.
A more important question to research: Can someone find a platform/compiler/linker/loader where this trick does not work? Major brownie points for that one. I bet there are some embedded processors/systems that don't like it. However, for desktop computing (and probably mobile/tablet), this trick probably still works.
The point really isn't whether you can. The trivial solution is
void my_callback_function(struct my_struct* arg);
void my_callback_helper(void* pv)
{
my_callback_function((struct my_struct*)pv);
}
do_stuff(&my_callback_helper);
A good compiler will only generate code for my_callback_helper if it's really needed, in which case you'd be glad it did.
You have a compatible function type if the return type and parameter types are compatible - basically (it's more complicated in reality :)). Compatibility is the same as "same type" just more lax to allow to have different types but still have some form of saying "these types are almost the same". In C89, for example, two structs were compatible if they were otherwise identical but just their name was different. C99 seem to have changed that. Quoting from the c rationale document (highly recommended reading, btw!):
Structure, union, or enumeration type declarations in two different translation units do not formally declare the same type, even if the text of these declarations come from the same include file, since the translation units are themselves disjoint. The Standard thus specifies additional compatibility rules for such types, so that if two such declarations are sufficiently similar they are compatible.
That said - yeah strictly this is undefined behavior, because your do_stuff function or someone else will call your function with a function pointer having void* as parameter, but your function has an incompatible parameter. But nevertheless, i expect all compilers to compile and run it without moaning. But you can do cleaner by having another function taking a void* (and registering that as callback function) which will just call your actual function then.
As C code compiles to instruction which do not care at all about pointer types, it's quite fine to use the code you mention. You'd run into problems when you'd run do_stuff with your callback function and pointer to something else then my_struct structure as argument.
I hope I can make it clearer by showing what would not work:
int my_number = 14;
do_stuff((void (*)(void*)) &my_callback_function, &my_number);
// my_callback_function will try to access int as struct my_struct
// and go nuts
or...
void another_callback_function(struct my_struct* arg, int arg2) { something }
do_stuff((void (*)(void*)) &another_callback_function, NULL);
// another_callback_function will look for non-existing second argument
// on the stack and go nuts
Basically, you can cast pointers to whatever you like, as long as the data continue to make sense at run-time.
Well, unless I understood the question wrong, you can just cast a function pointer this way.
void print_data(void *data)
{
// ...
}
((void (*)(char *)) &print_data)("hello");
A cleaner way would be to create a function typedef.
typedef void(*t_print_str)(char *);
((t_print_str) &print_data)("hello");
If you think about the way function calls work in C/C++, they push certain items on the stack, jump to the new code location, execute, then pop the stack on return. If your function pointers describe functions with the same return type and the same number/size of arguments, you should be okay.
Thus, I think you should be able to do so safely.
Void pointers are compatible with other types of pointer. It's the backbone of how malloc and the mem functions (memcpy, memcmp) work. Typically, in C (Rather than C++) NULL is a macro defined as ((void *)0).
Look at 6.3.2.3 (Item 1) in C99:
A pointer to void may be converted to or from a pointer to any incomplete or object type