char* func_name VS char *func_name (what are the differences) [duplicate] - c

I've recently decided that I just have to finally learn C/C++, and there is one thing I do not really understand about pointers or more precisely, their definition.
How about these examples:
int* test;
int *test;
int * test;
int* test,test2;
int *test,test2;
int * test,test2;
Now, to my understanding, the first three cases are all doing the same: Test is not an int, but a pointer to one.
The second set of examples is a bit more tricky. In case 4, both test and test2 will be pointers to an int, whereas in case 5, only test is a pointer, whereas test2 is a "real" int. What about case 6? Same as case 5?

4, 5, and 6 are the same thing, only test is a pointer. If you want two pointers, you should use:
int *test, *test2;
Or, even better (to make everything clear):
int* test;
int* test2;

White space around asterisks have no significance. All three mean the same thing:
int* test;
int *test;
int * test;
The "int *var1, var2" is an evil syntax that is just meant to confuse people and should be avoided. It expands to:
int *var1;
int var2;

Many coding guidelines recommend that you only declare one variable per line. This avoids any confusion of the sort you had before asking this question. Most C++ programmers I've worked with seem to stick to this.
A bit of an aside I know, but something I found useful is to read declarations backwards.
int* test; // test is a pointer to an int
This starts to work very well, especially when you start declaring const pointers and it gets tricky to know whether it's the pointer that's const, or whether its the thing the pointer is pointing at that is const.
int* const test; // test is a const pointer to an int
int const * test; // test is a pointer to a const int ... but many people write this as
const int * test; // test is a pointer to an int that's const

Use the "Clockwise Spiral Rule" to help parse C/C++ declarations;
There are three simple steps to follow:
Starting with the unknown element, move in a spiral/clockwise
direction; when encountering the following elements replace them with
the corresponding english statements:
[X] or []: Array X size of... or Array undefined size of...
(type1, type2): function passing type1 and type2 returning...
*: pointer(s) to...
Keep doing this in a spiral/clockwise direction until all tokens have been covered.
Always resolve anything in parenthesis first!
Also, declarations should be in separate statements when possible (which is true the vast majority of times).

There are three pieces to this puzzle.
The first piece is that whitespace in C and C++ is normally not significant beyond separating adjacent tokens that are otherwise indistinguishable.
During the preprocessing stage, the source text is broken up into a sequence of tokens - identifiers, punctuators, numeric literals, string literals, etc. That sequence of tokens is later analyzed for syntax and meaning. The tokenizer is "greedy" and will build the longest valid token that's possible. If you write something like
inttest;
the tokenizer only sees two tokens - the identifier inttest followed by the punctuator ;. It doesn't recognize int as a separate keyword at this stage (that happens later in the process). So, for the line to be read as a declaration of an integer named test, we have to use whitespace to separate the identifier tokens:
int test;
The * character is not part of any identifier; it's a separate token (punctuator) on its own. So if you write
int*test;
the compiler sees 4 separate tokens - int, *, test, and ;. Thus, whitespace is not significant in pointer declarations, and all of
int *test;
int* test;
int*test;
int * test;
are interpreted the same way.
The second piece to the puzzle is how declarations actually work in C and C++1. Declarations are broken up into two main pieces - a sequence of declaration specifiers (storage class specifiers, type specifiers, type qualifiers, etc.) followed by a comma-separated list of (possibly initialized) declarators. In the declaration
unsigned long int a[10]={0}, *p=NULL, f(void);
the declaration specifiers are unsigned long int and the declarators are a[10]={0}, *p=NULL, and f(void). The declarator introduces the name of the thing being declared (a, p, and f) along with information about that thing's array-ness, pointer-ness, and function-ness. A declarator may also have an associated initializer.
The type of a is "10-element array of unsigned long int". That type is fully specified by the combination of the declaration specifiers and the declarator, and the initial value is specified with the initializer ={0}. Similarly, the type of p is "pointer to unsigned long int", and again that type is specified by the combination of the declaration specifiers and the declarator, and is initialized to NULL. And the type of f is "function returning unsigned long int" by the same reasoning.
This is key - there is no "pointer-to" type specifier, just like there is no "array-of" type specifier, just like there is no "function-returning" type specifier. We can't declare an array as
int[10] a;
because the operand of the [] operator is a, not int. Similarly, in the declaration
int* p;
the operand of * is p, not int. But because the indirection operator is unary and whitespace is not significant, the compiler won't complain if we write it this way. However, it is always interpreted as int (*p);.
Therefore, if you write
int* p, q;
the operand of * is p, so it will be interpreted as
int (*p), q;
Thus, all of
int *test1, test2;
int* test1, test2;
int * test1, test2;
do the same thing - in all three cases, test1 is the operand of * and thus has type "pointer to int", while test2 has type int.
Declarators can get arbitrarily complex. You can have arrays of pointers:
T *a[N];
you can have pointers to arrays:
T (*a)[N];
you can have functions returning pointers:
T *f(void);
you can have pointers to functions:
T (*f)(void);
you can have arrays of pointers to functions:
T (*a[N])(void);
you can have functions returning pointers to arrays:
T (*f(void))[N];
you can have functions returning pointers to arrays of pointers to functions returning pointers to T:
T *(*(*f(void))[N])(void); // yes, it's eye-stabby. Welcome to C and C++.
and then you have signal:
void (*signal(int, void (*)(int)))(int);
which reads as
signal -- signal
signal( ) -- is a function taking
signal( ) -- unnamed parameter
signal(int ) -- is an int
signal(int, ) -- unnamed parameter
signal(int, (*) ) -- is a pointer to
signal(int, (*)( )) -- a function taking
signal(int, (*)( )) -- unnamed parameter
signal(int, (*)(int)) -- is an int
signal(int, void (*)(int)) -- returning void
(*signal(int, void (*)(int))) -- returning a pointer to
(*signal(int, void (*)(int)))( ) -- a function taking
(*signal(int, void (*)(int)))( ) -- unnamed parameter
(*signal(int, void (*)(int)))(int) -- is an int
void (*signal(int, void (*)(int)))(int); -- returning void
and this just barely scratches the surface of what's possible. But notice that array-ness, pointer-ness, and function-ness are always part of the declarator, not the type specifier.
One thing to watch out for - const can modify both the pointer type and the pointed-to type:
const int *p;
int const *p;
Both of the above declare p as a pointer to a const int object. You can write a new value to p setting it to point to a different object:
const int x = 1;
const int y = 2;
const int *p = &x;
p = &y;
but you cannot write to the pointed-to object:
*p = 3; // constraint violation, the pointed-to object is const
However,
int * const p;
declares p as a const pointer to a non-const int; you can write to the thing p points to
int x = 1;
int y = 2;
int * const p = &x;
*p = 3;
but you can't set p to point to a different object:
p = &y; // constraint violation, p is const
Which brings us to the third piece of the puzzle - why declarations are structured this way.
The intent is that the structure of a declaration should closely mirror the structure of an expression in the code ("declaration mimics use"). For example, let's suppose we have an array of pointers to int named ap, and we want to access the int value pointed to by the i'th element. We would access that value as follows:
printf( "%d", *ap[i] );
The expression *ap[i] has type int; thus, the declaration of ap is written as
int *ap[N]; // ap is an array of pointer to int, fully specified by the combination
// of the type specifier and declarator
The declarator *ap[N] has the same structure as the expression *ap[i]. The operators * and [] behave the same way in a declaration that they do in an expression - [] has higher precedence than unary *, so the operand of * is ap[N] (it's parsed as *(ap[N])).
As another example, suppose we have a pointer to an array of int named pa and we want to access the value of the i'th element. We'd write that as
printf( "%d", (*pa)[i] );
The type of the expression (*pa)[i] is int, so the declaration is written as
int (*pa)[N];
Again, the same rules of precedence and associativity apply. In this case, we don't want to dereference the i'th element of pa, we want to access the i'th element of what pa points to, so we have to explicitly group the * operator with pa.
The *, [] and () operators are all part of the expression in the code, so they are all part of the declarator in the declaration. The declarator tells you how to use the object in an expression. If you have a declaration like int *p;, that tells you that the expression *p in your code will yield an int value. By extension, it tells you that the expression p yields a value of type "pointer to int", or int *.
So, what about things like cast and sizeof expressions, where we use things like (int *) or sizeof (int [10]) or things like that? How do I read something like
void foo( int *, int (*)[10] );
There's no declarator, aren't the * and [] operators modifying the type directly?
Well, no - there is still a declarator, just with an empty identifier (known as an abstract declarator). If we represent an empty identifier with the symbol λ, then we can read those things as (int *λ), sizeof (int λ[10]), and
void foo( int *λ, int (*λ)[10] );
and they behave exactly like any other declaration. int *[10] represents an array of 10 pointers, while int (*)[10] represents a pointer to an array.
And now the opinionated portion of this answer. I am not fond of the C++ convention of declaring simple pointers as
T* p;
and consider it bad practice for the following reasons:
It's not consistent with the syntax;
It introduces confusion (as evidenced by this question, all the duplicates to this question, questions about the meaning of T* p, q;, all the duplicates to those questions, etc.);
It's not internally consistent - declaring an array of pointers as T* a[N] is asymmetrical with use (unless you're in the habit of writing * a[i]);
It cannot be applied to pointer-to-array or pointer-to-function types (unless you create a typedef just so you can apply the T* p convention cleanly, which...no);
The reason for doing so - "it emphasizes the pointer-ness of the object" - is spurious. It cannot be applied to array or function types, and I would think those qualities are just as important to emphasize.
In the end, it just indicates confused thinking about how the two languages' type systems work.
There are good reasons to declare items separately; working around a bad practice (T* p, q;) isn't one of them. If you write your declarators correctly (T *p, q;) you are less likely to cause confusion.
I consider it akin to deliberately writing all your simple for loops as
i = 0;
for( ; i < N; )
{
...
i++;
}
Syntactically valid, but confusing, and the intent is likely to be misinterpreted. However, the T* p; convention is entrenched in the C++ community, and I use it in my own C++ code because consistency across the code base is a good thing, but it makes me itch every time I do it.
I will be using C terminology - the C++ terminology is a little different, but the concepts are largely the same.

As others mentioned, 4, 5, and 6 are the same. Often, people use these examples to make the argument that the * belongs with the variable instead of the type. While it's an issue of style, there is some debate as to whether you should think of and write it this way:
int* x; // "x is a pointer to int"
or this way:
int *x; // "*x is an int"
FWIW I'm in the first camp, but the reason others make the argument for the second form is that it (mostly) solves this particular problem:
int* x,y; // "x is a pointer to int, y is an int"
which is potentially misleading; instead you would write either
int *x,y; // it's a little clearer what is going on here
or if you really want two pointers,
int *x, *y; // two pointers
Personally, I say keep it to one variable per line, then it doesn't matter which style you prefer.

#include <type_traits>
std::add_pointer<int>::type test, test2;

In 4, 5 and 6, test is always a pointer and test2 is not a pointer. White space is (almost) never significant in C++.

The rationale in C is that you declare the variables the way you use them. For example
char *a[100];
says that *a[42] will be a char. And a[42] a char pointer. And thus a is an array of char pointers.
This because the original compiler writers wanted to use the same parser for expressions and declarations. (Not a very sensible reason for a langage design choice)

I would say that the initial convention was to put the star on the pointer name side (right side of the declaration
in the c programming language by Dennis M. Ritchie the stars are on the right side of the declaration.
by looking at the linux source code at https://github.com/torvalds/linux/blob/master/init/main.c
we can see that the star is also on the right side.
You can follow the same rules, but it's not a big deal if you put stars on the type side.
Remember that consistency is important, so always but the star on the same side regardless of which side you have choose.

In my opinion, the answer is BOTH, depending on the situation.
Generally, IMO, it is better to put the asterisk next to the pointer name, rather than the type. Compare e.g.:
int *pointer1, *pointer2; // Fully consistent, two pointers
int* pointer1, pointer2; // Inconsistent -- because only the first one is a pointer, the second one is an int variable
// The second case is unexpected, and thus prone to errors
Why is the second case inconsistent? Because e.g. int x,y; declares two variables of the same type but the type is mentioned only once in the declaration. This creates a precedent and expected behavior. And int* pointer1, pointer2; is inconsistent with that because it declares pointer1 as a pointer, but pointer2 is an integer variable. Clearly prone to errors and, thus, should be avoided (by putting the asterisk next to the pointer name, rather than the type).
However, there are some exceptions where you might not be able to put the asterisk next to an object name (and where it matters where you put it) without getting undesired outcome — for example:
MyClass *volatile MyObjName
void test (const char *const p) // const value pointed to by a const pointer
Finally, in some cases, it might be arguably clearer to put the asterisk next to the type name, e.g.:
void* ClassName::getItemPtr () {return &item;} // Clear at first sight

The pointer is a modifier to the type. It's best to read them right to left in order to better understand how the asterisk modifies the type. 'int *' can be read as "pointer to int'. In multiple declarations you must specify that each variable is a pointer or it will be created as a standard variable.
1,2 and 3) Test is of type (int *). Whitespace doesn't matter.
4,5 and 6) Test is of type (int *). Test2 is of type int. Again whitespace is inconsequential.

I have always preferred to declare pointers like this:
int* i;
I read this to say "i is of type int-pointer". You can get away with this interpretation if you only declare one variable per declaration.
It is an uncomfortable truth, however, that this reading is wrong. The C Programming Language, 2nd Ed. (p. 94) explains the opposite paradigm, which is the one used in the C standards:
The declaration of the pointer ip,
int *ip;
is intended as a mnemonic; it says that the expression *ip is an
int. The syntax of the declaration for a variable mimics the syntax
of expressions in which the variable might appear. This reasoning
applies to function declarations as well. For example,
double *dp, atof(char *);
says that in an expression *dp and atof(s) have values of type
double, and that the argument of atof is a pointer to char.
So, by the reasoning of the C language, when you declare
int* test, test2;
you are not declaring two variables of type int*, you are introducing two expressions that evaluate to an int type, with no attachment to the allocation of an int in memory.
A compiler is perfectly happy to accept the following:
int *ip, i;
i = *ip;
because in the C paradigm, the compiler is only expected to keep track of the type of *ip and i. The programmer is expected to keep track of the meaning of *ip and i. In this case, ip is uninitialized, so it is the programmer's responsibility to point it at something meaningful before dereferencing it.

A good rule of thumb, a lot of people seem to grasp these concepts by: In C++ a lot of semantic meaning is derived by the left-binding of keywords or identifiers.
Take for example:
int const bla;
The const applies to the "int" word. The same is with pointers' asterisks, they apply to the keyword left of them. And the actual variable name? Yup, that's declared by what's left of it.

Related

Can someone explain pointer in C language for me? [duplicate]

I've recently decided that I just have to finally learn C/C++, and there is one thing I do not really understand about pointers or more precisely, their definition.
How about these examples:
int* test;
int *test;
int * test;
int* test,test2;
int *test,test2;
int * test,test2;
Now, to my understanding, the first three cases are all doing the same: Test is not an int, but a pointer to one.
The second set of examples is a bit more tricky. In case 4, both test and test2 will be pointers to an int, whereas in case 5, only test is a pointer, whereas test2 is a "real" int. What about case 6? Same as case 5?
4, 5, and 6 are the same thing, only test is a pointer. If you want two pointers, you should use:
int *test, *test2;
Or, even better (to make everything clear):
int* test;
int* test2;
White space around asterisks have no significance. All three mean the same thing:
int* test;
int *test;
int * test;
The "int *var1, var2" is an evil syntax that is just meant to confuse people and should be avoided. It expands to:
int *var1;
int var2;
Many coding guidelines recommend that you only declare one variable per line. This avoids any confusion of the sort you had before asking this question. Most C++ programmers I've worked with seem to stick to this.
A bit of an aside I know, but something I found useful is to read declarations backwards.
int* test; // test is a pointer to an int
This starts to work very well, especially when you start declaring const pointers and it gets tricky to know whether it's the pointer that's const, or whether its the thing the pointer is pointing at that is const.
int* const test; // test is a const pointer to an int
int const * test; // test is a pointer to a const int ... but many people write this as
const int * test; // test is a pointer to an int that's const
Use the "Clockwise Spiral Rule" to help parse C/C++ declarations;
There are three simple steps to follow:
Starting with the unknown element, move in a spiral/clockwise
direction; when encountering the following elements replace them with
the corresponding english statements:
[X] or []: Array X size of... or Array undefined size of...
(type1, type2): function passing type1 and type2 returning...
*: pointer(s) to...
Keep doing this in a spiral/clockwise direction until all tokens have been covered.
Always resolve anything in parenthesis first!
Also, declarations should be in separate statements when possible (which is true the vast majority of times).
There are three pieces to this puzzle.
The first piece is that whitespace in C and C++ is normally not significant beyond separating adjacent tokens that are otherwise indistinguishable.
During the preprocessing stage, the source text is broken up into a sequence of tokens - identifiers, punctuators, numeric literals, string literals, etc. That sequence of tokens is later analyzed for syntax and meaning. The tokenizer is "greedy" and will build the longest valid token that's possible. If you write something like
inttest;
the tokenizer only sees two tokens - the identifier inttest followed by the punctuator ;. It doesn't recognize int as a separate keyword at this stage (that happens later in the process). So, for the line to be read as a declaration of an integer named test, we have to use whitespace to separate the identifier tokens:
int test;
The * character is not part of any identifier; it's a separate token (punctuator) on its own. So if you write
int*test;
the compiler sees 4 separate tokens - int, *, test, and ;. Thus, whitespace is not significant in pointer declarations, and all of
int *test;
int* test;
int*test;
int * test;
are interpreted the same way.
The second piece to the puzzle is how declarations actually work in C and C++1. Declarations are broken up into two main pieces - a sequence of declaration specifiers (storage class specifiers, type specifiers, type qualifiers, etc.) followed by a comma-separated list of (possibly initialized) declarators. In the declaration
unsigned long int a[10]={0}, *p=NULL, f(void);
the declaration specifiers are unsigned long int and the declarators are a[10]={0}, *p=NULL, and f(void). The declarator introduces the name of the thing being declared (a, p, and f) along with information about that thing's array-ness, pointer-ness, and function-ness. A declarator may also have an associated initializer.
The type of a is "10-element array of unsigned long int". That type is fully specified by the combination of the declaration specifiers and the declarator, and the initial value is specified with the initializer ={0}. Similarly, the type of p is "pointer to unsigned long int", and again that type is specified by the combination of the declaration specifiers and the declarator, and is initialized to NULL. And the type of f is "function returning unsigned long int" by the same reasoning.
This is key - there is no "pointer-to" type specifier, just like there is no "array-of" type specifier, just like there is no "function-returning" type specifier. We can't declare an array as
int[10] a;
because the operand of the [] operator is a, not int. Similarly, in the declaration
int* p;
the operand of * is p, not int. But because the indirection operator is unary and whitespace is not significant, the compiler won't complain if we write it this way. However, it is always interpreted as int (*p);.
Therefore, if you write
int* p, q;
the operand of * is p, so it will be interpreted as
int (*p), q;
Thus, all of
int *test1, test2;
int* test1, test2;
int * test1, test2;
do the same thing - in all three cases, test1 is the operand of * and thus has type "pointer to int", while test2 has type int.
Declarators can get arbitrarily complex. You can have arrays of pointers:
T *a[N];
you can have pointers to arrays:
T (*a)[N];
you can have functions returning pointers:
T *f(void);
you can have pointers to functions:
T (*f)(void);
you can have arrays of pointers to functions:
T (*a[N])(void);
you can have functions returning pointers to arrays:
T (*f(void))[N];
you can have functions returning pointers to arrays of pointers to functions returning pointers to T:
T *(*(*f(void))[N])(void); // yes, it's eye-stabby. Welcome to C and C++.
and then you have signal:
void (*signal(int, void (*)(int)))(int);
which reads as
signal -- signal
signal( ) -- is a function taking
signal( ) -- unnamed parameter
signal(int ) -- is an int
signal(int, ) -- unnamed parameter
signal(int, (*) ) -- is a pointer to
signal(int, (*)( )) -- a function taking
signal(int, (*)( )) -- unnamed parameter
signal(int, (*)(int)) -- is an int
signal(int, void (*)(int)) -- returning void
(*signal(int, void (*)(int))) -- returning a pointer to
(*signal(int, void (*)(int)))( ) -- a function taking
(*signal(int, void (*)(int)))( ) -- unnamed parameter
(*signal(int, void (*)(int)))(int) -- is an int
void (*signal(int, void (*)(int)))(int); -- returning void
and this just barely scratches the surface of what's possible. But notice that array-ness, pointer-ness, and function-ness are always part of the declarator, not the type specifier.
One thing to watch out for - const can modify both the pointer type and the pointed-to type:
const int *p;
int const *p;
Both of the above declare p as a pointer to a const int object. You can write a new value to p setting it to point to a different object:
const int x = 1;
const int y = 2;
const int *p = &x;
p = &y;
but you cannot write to the pointed-to object:
*p = 3; // constraint violation, the pointed-to object is const
However,
int * const p;
declares p as a const pointer to a non-const int; you can write to the thing p points to
int x = 1;
int y = 2;
int * const p = &x;
*p = 3;
but you can't set p to point to a different object:
p = &y; // constraint violation, p is const
Which brings us to the third piece of the puzzle - why declarations are structured this way.
The intent is that the structure of a declaration should closely mirror the structure of an expression in the code ("declaration mimics use"). For example, let's suppose we have an array of pointers to int named ap, and we want to access the int value pointed to by the i'th element. We would access that value as follows:
printf( "%d", *ap[i] );
The expression *ap[i] has type int; thus, the declaration of ap is written as
int *ap[N]; // ap is an array of pointer to int, fully specified by the combination
// of the type specifier and declarator
The declarator *ap[N] has the same structure as the expression *ap[i]. The operators * and [] behave the same way in a declaration that they do in an expression - [] has higher precedence than unary *, so the operand of * is ap[N] (it's parsed as *(ap[N])).
As another example, suppose we have a pointer to an array of int named pa and we want to access the value of the i'th element. We'd write that as
printf( "%d", (*pa)[i] );
The type of the expression (*pa)[i] is int, so the declaration is written as
int (*pa)[N];
Again, the same rules of precedence and associativity apply. In this case, we don't want to dereference the i'th element of pa, we want to access the i'th element of what pa points to, so we have to explicitly group the * operator with pa.
The *, [] and () operators are all part of the expression in the code, so they are all part of the declarator in the declaration. The declarator tells you how to use the object in an expression. If you have a declaration like int *p;, that tells you that the expression *p in your code will yield an int value. By extension, it tells you that the expression p yields a value of type "pointer to int", or int *.
So, what about things like cast and sizeof expressions, where we use things like (int *) or sizeof (int [10]) or things like that? How do I read something like
void foo( int *, int (*)[10] );
There's no declarator, aren't the * and [] operators modifying the type directly?
Well, no - there is still a declarator, just with an empty identifier (known as an abstract declarator). If we represent an empty identifier with the symbol λ, then we can read those things as (int *λ), sizeof (int λ[10]), and
void foo( int *λ, int (*λ)[10] );
and they behave exactly like any other declaration. int *[10] represents an array of 10 pointers, while int (*)[10] represents a pointer to an array.
And now the opinionated portion of this answer. I am not fond of the C++ convention of declaring simple pointers as
T* p;
and consider it bad practice for the following reasons:
It's not consistent with the syntax;
It introduces confusion (as evidenced by this question, all the duplicates to this question, questions about the meaning of T* p, q;, all the duplicates to those questions, etc.);
It's not internally consistent - declaring an array of pointers as T* a[N] is asymmetrical with use (unless you're in the habit of writing * a[i]);
It cannot be applied to pointer-to-array or pointer-to-function types (unless you create a typedef just so you can apply the T* p convention cleanly, which...no);
The reason for doing so - "it emphasizes the pointer-ness of the object" - is spurious. It cannot be applied to array or function types, and I would think those qualities are just as important to emphasize.
In the end, it just indicates confused thinking about how the two languages' type systems work.
There are good reasons to declare items separately; working around a bad practice (T* p, q;) isn't one of them. If you write your declarators correctly (T *p, q;) you are less likely to cause confusion.
I consider it akin to deliberately writing all your simple for loops as
i = 0;
for( ; i < N; )
{
...
i++;
}
Syntactically valid, but confusing, and the intent is likely to be misinterpreted. However, the T* p; convention is entrenched in the C++ community, and I use it in my own C++ code because consistency across the code base is a good thing, but it makes me itch every time I do it.
I will be using C terminology - the C++ terminology is a little different, but the concepts are largely the same.
As others mentioned, 4, 5, and 6 are the same. Often, people use these examples to make the argument that the * belongs with the variable instead of the type. While it's an issue of style, there is some debate as to whether you should think of and write it this way:
int* x; // "x is a pointer to int"
or this way:
int *x; // "*x is an int"
FWIW I'm in the first camp, but the reason others make the argument for the second form is that it (mostly) solves this particular problem:
int* x,y; // "x is a pointer to int, y is an int"
which is potentially misleading; instead you would write either
int *x,y; // it's a little clearer what is going on here
or if you really want two pointers,
int *x, *y; // two pointers
Personally, I say keep it to one variable per line, then it doesn't matter which style you prefer.
#include <type_traits>
std::add_pointer<int>::type test, test2;
In 4, 5 and 6, test is always a pointer and test2 is not a pointer. White space is (almost) never significant in C++.
The rationale in C is that you declare the variables the way you use them. For example
char *a[100];
says that *a[42] will be a char. And a[42] a char pointer. And thus a is an array of char pointers.
This because the original compiler writers wanted to use the same parser for expressions and declarations. (Not a very sensible reason for a langage design choice)
I would say that the initial convention was to put the star on the pointer name side (right side of the declaration
in the c programming language by Dennis M. Ritchie the stars are on the right side of the declaration.
by looking at the linux source code at https://github.com/torvalds/linux/blob/master/init/main.c
we can see that the star is also on the right side.
You can follow the same rules, but it's not a big deal if you put stars on the type side.
Remember that consistency is important, so always but the star on the same side regardless of which side you have choose.
In my opinion, the answer is BOTH, depending on the situation.
Generally, IMO, it is better to put the asterisk next to the pointer name, rather than the type. Compare e.g.:
int *pointer1, *pointer2; // Fully consistent, two pointers
int* pointer1, pointer2; // Inconsistent -- because only the first one is a pointer, the second one is an int variable
// The second case is unexpected, and thus prone to errors
Why is the second case inconsistent? Because e.g. int x,y; declares two variables of the same type but the type is mentioned only once in the declaration. This creates a precedent and expected behavior. And int* pointer1, pointer2; is inconsistent with that because it declares pointer1 as a pointer, but pointer2 is an integer variable. Clearly prone to errors and, thus, should be avoided (by putting the asterisk next to the pointer name, rather than the type).
However, there are some exceptions where you might not be able to put the asterisk next to an object name (and where it matters where you put it) without getting undesired outcome — for example:
MyClass *volatile MyObjName
void test (const char *const p) // const value pointed to by a const pointer
Finally, in some cases, it might be arguably clearer to put the asterisk next to the type name, e.g.:
void* ClassName::getItemPtr () {return &item;} // Clear at first sight
The pointer is a modifier to the type. It's best to read them right to left in order to better understand how the asterisk modifies the type. 'int *' can be read as "pointer to int'. In multiple declarations you must specify that each variable is a pointer or it will be created as a standard variable.
1,2 and 3) Test is of type (int *). Whitespace doesn't matter.
4,5 and 6) Test is of type (int *). Test2 is of type int. Again whitespace is inconsequential.
I have always preferred to declare pointers like this:
int* i;
I read this to say "i is of type int-pointer". You can get away with this interpretation if you only declare one variable per declaration.
It is an uncomfortable truth, however, that this reading is wrong. The C Programming Language, 2nd Ed. (p. 94) explains the opposite paradigm, which is the one used in the C standards:
The declaration of the pointer ip,
int *ip;
is intended as a mnemonic; it says that the expression *ip is an
int. The syntax of the declaration for a variable mimics the syntax
of expressions in which the variable might appear. This reasoning
applies to function declarations as well. For example,
double *dp, atof(char *);
says that in an expression *dp and atof(s) have values of type
double, and that the argument of atof is a pointer to char.
So, by the reasoning of the C language, when you declare
int* test, test2;
you are not declaring two variables of type int*, you are introducing two expressions that evaluate to an int type, with no attachment to the allocation of an int in memory.
A compiler is perfectly happy to accept the following:
int *ip, i;
i = *ip;
because in the C paradigm, the compiler is only expected to keep track of the type of *ip and i. The programmer is expected to keep track of the meaning of *ip and i. In this case, ip is uninitialized, so it is the programmer's responsibility to point it at something meaningful before dereferencing it.
A good rule of thumb, a lot of people seem to grasp these concepts by: In C++ a lot of semantic meaning is derived by the left-binding of keywords or identifiers.
Take for example:
int const bla;
The const applies to the "int" word. The same is with pointers' asterisks, they apply to the keyword left of them. And the actual variable name? Yup, that's declared by what's left of it.

const and typedef of arrays in C

In C, it's possible to typedef an array, using this construction :
typedef int table_t[N];
Here, table_t is now defined as an array of N int. Any variable declared such as table_t t; will now behave as a normal array of int.
The point of such construction is to be used as an argument type in a function, such as :
int doSomething(table_t t);
A relatively equivalent function prototype could have been :
int doSomething(int* t);
The merit of the first construction is that it enforces N as the size of the table. In many circumstances, it's safer to enforce this property, rather than relying on the programmer to properly figure out this condition.
Now it's all good, except that, in order to guarantee that the content of table will not be modified, it's necessary to use the const qualifier.
The following statement is relatively simple to understand :
int doSomething(const int* t);
Now, doSomething guarantee that it will not modify the content of the table passed as a pointer.
Now, what about this almost equivalent construction ? :
int doSomething(const table_t t);
What is const here ? the content of the table, or the pointer to the table ?
If it's the pointer which is const, is there another way (C90 compatible) to retain the ability to define the size of the table and to tell that its content will be const ?
Note that it's also necessary sometimes to modify the content of the table, so the const property cannot be embedded into the typedef definition.
[Edit] Thanks for the excellent answers received so far.
To summarize :
The initial assumption of typedef enforcing size N was completely wrong. It basically behaves the same as a normal pointer.
The const property will also behave the same as if it was a pointer (in stark contrast with a typedef to a pointer type, as underlined by #random below)
To enforce a size (which was not the initial question, but end up being quite important now...), see Jonathan's answer
First, you are mistaken, the function prototypes
int doSomething(table_t t);
int doSomething(int* t);
are exactly equivalent. For function parameters, the first array dimension is always rewritten as a pointer. So there is no guarantee for the size of the array that is received.
const-qualification on arrays always applies to the base type of the array, so the two declarations
const table_t a;
int const a[N];
are equivalent, and for functions parameters we have
int doSomething(const table_t t);
int doSomething(int const* t);
The content of the table will be constant. Easily checked with this code.
#include<stdio.h>
typedef int table_t[3];
void doSomething(const table_t t)
{
t++; //No error, it's a non-const pointer.
t[1]=3; //Error, it's a pointer to const.
}
int main()
{
table_t t={1,2,3};
printf("%d %d %d %ld",t[0],t[1],t[2],sizeof(t));
t[1]=5;
doSomething(t);
return 0;
}
Array types and pointer types are not 100% equivalent, even in this context where you do ultimately get a pointer type for the function parameter. Your mistake is in assuming that const would have acted the same way if it were a pointer type.
To expand on ARBY's example:
typedef int table_t[3];
typedef int *pointer_t;
void doSomething(const table_t t)
{
t++; //No error, it's a non-const pointer.
t[1]=3; //Error, it's a pointer to const.
}
void doSomethingElse(const pointer_t t)
{
t++; //Error, it's a const pointer.
t[1]=3; //No error, it's pointer to plain int
}
It does act similarly to const int *, but const pointer_t is instead equivalent to int * const.
(Also, disclaimer, user-defined names ending with _t are not allowed by POSIX, they're reserved for future expansion)
The merit of the first construction is that it enforces N as the size of the table.
I'm not sure what you mean here. In what contexts would it "enforce" it? If you declare a function as
int doSomething(table_t t);
array size will not be enforced. I order to enforce the size, you'd have to go a different route
int doSomething(table_t *t); // equivalent to 'int (*t)[N]'
What is const here ?
As for const... When const is applied to array type it "drops down" all the way to array elements. This means that const table_t is an array of constant ints, i.e. it is equivalent to const int [N] type. The end result of this is that the array becomes non-modifiable. In function parameter declaration context const table_t will be converted into const int *.
However, note one peculiar detail that is not immediately obvious in this case: the array type itself remains non-const-qualified. It is the individual elements that become const. In fact, it is impossible to const-qualify the array type itself in C. Any attempts to do so will make const-qualification to "sift down" to individual elements.
This peculiarity leads to rather unpleasant consequences in array const-correctness. For example, this code will not compile in C
table_t t;
const table_t *pt = &t;
even though it looks quite innocently from the const-correctness point of view and will compile for any non-array object type. C++ language updated its const-correctness rules to resolve this issue, while C continues to stick to its old ways.
The standard 6.7.6.3 says:
A declaration of a parameter as ‘‘array of type’’ shall be adjusted to ‘‘qualified pointer to type’’
Meaning that when you declare a function parameter as a const int array type, it decays into a pointer to const int (first element in array). Equivalent to const int* in this case.
Also note that because of the above mentioned rule, the array size specified adds no additional type safety! This is one big flaw in the C language, but that's how it is.
Still, it is good practice to declare the array with fixed width like you have, because static analysers or clever compilers may produce a diagnostic about different types.

Difference between int* p and int *p declaration [duplicate]

This question already has answers here:
Placement of the asterisk in pointer declarations
(14 answers)
Closed 2 years ago.
What is the difference between an int* p and an int *p declaration?
There is no difference.
It's a matter of notation, not semantics. The second is less misleading, because
int *a, b;
is clearly declaring an int* and an int, whereas
int* a, b;
looks as if it's declaring two pointers, when it's really doing the same thing as above.
int* p
widely used by C++ programmers
int* p, q wrongly implies that both p and q are pointers (leading to a preference for declaring this on two lines, which also improves readability when there are assignments, and makes it easier to quickly cut/paste or comment specific lines/variables)
int* p visually separates the type from the identifier
*p then unambiguously indicates a dereference (assuming you put spaces around your binary operator* ala 2 * 3)
in C++ ...&x is clearly taking an address while ...& x must be declaring a reference variable, and ... & ... is the bitwise-AND operator
int *p
widely used by C programmers
int *p, q clearly reflects p being a pointer and q not being.
int *p visually confuses the type with the identifier
visually indistinguishable from a pointer dereference (for better or worse)
Similarly for types appearing in function declarations...
int* f(), g(); // declares int g();
int *h(), (*i)(); // i is pointer to function returning int
int *const*m(), n(); // m returns pointer to (const-pointer to int)
// n returns int
...but at least function arguments can't get so hairy - the type specification starts afresh after each comma separators.
Summarily, int *p is better if your coding style / code-base utilises multiple declarations on a single line of source code, otherwise int* p offers a clearer separation of type and the following identifier.
For all that, people's preferences are largely based on what they're used to.
They are the same. The first one considers p as a int * type, and the second one considers *p as an int.
The second one tells you how C declarations simply work: declare as how would be used.
The Internet is flooded with complex approaches to parse declarations like "clockwise/spiral rule" and "the right rule". People makes the simple one complex: why do we need to read these types out loud? We only need to know how to use it!
By the way, the first one sometimes make you mad. Consider this:
int* p, q;
You would like p and q both becomes int *, but unfortunately not.
If you write int *p, *q, things will be simpler.
Anyway you can try this if you insists on approach one.
typedef int* intp;
intp p, q;

In C, what is the correct syntax for declaring pointers?

I vaguely recall seeing this before in an answer to another question, but searching has failed to yield the answer.
I can't recall what is the proper way to declare variables that are pointers. Is it:
Type* instance;
Or:
Type *instance;
Although I know both will compile in most cases, I believe there are some examples where it is significant, possibly related to declaring multiple variables of the same type on the same line, and so one makes more sense than the other.
It is simply a matter of how you like to read it.
The reason that some people put it like this:
Type *instance;
Is because it says that only instance is a pointer. Because if you have a list of variables:
int* a, b, c;
Only a is a pointer, so it's easier like so
int *a, b, c, *d;
Where both a and d are pointers. It actually makes no difference, it's just about readability.
Other people like having the * next to the type, because (among other reasons) they consider it a "pointer to an integer" and think the * belongs with the type, not the variable.
Personally, I always do
Type *instance;
But it really is up to you, and your company/school code style guidelines.
Those two are the same. However, if you do multiple declarations, the former can trap you.
int* pi, j;
declares an int pointer (pi) and an int (j).
It's an accident of C syntax that you can write it either way; however, it is always parsed as
Type (*pointer);
that is, the * is always bound to the declarator, not the type specifier. If you wrote
Type* pointer1, pointer2;
it would be parsed as
Type (*pointer1), pointer2;
so only pointer1 would be declared as a pointer type.
C follows a "declaration mimics use" paradigm; the structure of a declaration should mimic the equivalent expression used in the code as much as possible. For example, if you have an array of pointers to int named arr, and you want to retrieve the integer value pointed to by the i'th element in the array, you would subscript the array and dereference the result, which is written as *arr[i] (parsed as *(arr[i])). The type of the expression *arr[i] is int, so the declaration for arr is written as
int *arr[N];
Which way is right? Depends on who you ask. Old C farts like me favor T *p because it reflects what's actually happening in the grammar. Many C++ programmers favor T* p because it emphasizes the type of p, which feels more natural in many circumstances (having written my own container types, I can see the point, although it still feels wrong to me).
If you want to declare multiple pointers, you can either declare them all explicitly, such as:
T *p1, *p2, *p3;
or you can create a typedef:
typedef T *tptr;
tptr p1, p2, p3;
although I personally don't recommend it; hiding the pointer-ness of something behind a typedef can bite you if you aren't careful.
Those two are exactly the same.
However, for the purposes of declaring multiple variables, as in below:
int * p1, p2;
where p1 is of type pointer to int, but p2 is of type int. If you wished to declare p2 as a pointer to int, you have to write:
int *p1, *p2;
I prefer the following style:
Type *pointer;
Why? Because it is consistent with the mindset the creators of the language tried to establish:
"The syntax of the declaration for a variable mimics the syntax of expressions in which the variable might appear."
(The C Programming Language by Brian W. Kernighan & Dennis M. Ritchie, ansi c version, page 94)
It's easy to write FORTRAN in any language, but You always should write [programming language X] in [programming language X].
Curiously, this rule doesn't apply if the type is deduced in C++0x.
int a;
decltype(&a) i, j;
Both i and j are int*.
All of these will compile, are legal and are equivalent:
Type* instance;
Type * instance;
Type *instance;
Pick a style and be consistent.
As a side note, I think it helps to understand the motivation behind the C declaration syntax, which is meant to mimic how the variable could be used. Some examples below:
char *x means that if you do *x, you get a char.
int f(int x) means that if you do e.g. f(0), you get an int.
const char *foo[3] means that if you do e.g. *foo[0] (which is the same as *(foo[0])), you get a const char. This implies that foo must be an array (of size 3 in this case) of pointers to const char.
unsigned int (*foo)[3] means that if you do e.g. (*foo)[0], you get an unsigned int. This implies that foo must be a pointer to an array of unsigned int (of size 3 in this case).
Loosely, the general syntax is hence [what you get] [how you get it]. The trick is that this can be extended to [what you get] [how you get it], [how you get it], ... to declare multiple things at once. Hence, the following three declarations --
int *p;
int f(int x);
int a[3];
-- can be combined into a single line as follows (and yes, this even works for function declarations. They are not special in this regard):
int *p, f(int x), a[3];

Pointer syntax in C: why does * only apply to the first variable?

The following declaration in C:
int* a, b;
will declare a as type int* and b as type int. I'm well aware of this trap, but what I want to know is why it works this way. Why doesn't it also declare b as int*, as most people would intuitively expect? In other words, why does * apply to the variable name, rather than the type?
Sure you could write it this way to be more consistent with how it actually works:
int *a, b;
However, I and everyone I've spoken to think in terms of a is of type "pointer to int", rather than a is a pointer to some data and the type of that data is "int".
Was this simply a bad decision by the designers of C or is there some good reason why it's parsed this way? I'm sure the question has been answered before, but I can't seem to find it using the search.
C declarations were written this way so that "declaration mirrors use". This is why you declare arrays like this:
int a[10];
Were you to instead have the rule you propose, where it is always
type identifier, identifier, identifier, ... ;
...then arrays would logically have to be declared like this:
int[10] a;
which is fine, but doesn't mirror how you use a. Note that this holds for functions, too - we declare functions like this:
void foo(int a, char *b);
rather than
void(int a, char* b) foo;
In general, the "declaration mirrors use" rule means that you only have to remember one set of associativity rules, which apply to both operators like *, [] and () when you're using the value, and the corresponding tokens in declarators like *, [] and ().
After some further thought, I think it's also worth pointing out that spelling "pointer to int" as "int*" is only a consequence of "declaration mirrors use" anyway. If you were going to use another style of declaration, it would probably make more sense to spell "pointer to int" as "&int", or something completely different like "#int".
There's a web page on The Development of the C Language that says, "The syntax of these declarations reflects the observation that i, *pi, and **ppi all yield an int type when used in an expression." Search for that sentence on the page to find the relevant section that talks about this question.
There may be an additional historical reason, but I've always understood it this way:
One declaration, one type.
If a, b, c, and d must be the same type here:
int a, b, c, d;
Then everything on the line must an integer as well.
int a, *b, **c, ***d;
The 4 integers:
a
*b
**c
***d
It may be related to operator precedence, as well, or it may have been at some point in the past.
I assume it is related to the full declaration syntax for type modifiers:
int x[20], y;
int (*fp)(), z;
In these examples, it feels much more obvious that the modifiers are only affecting one of the declarations. One guess is that once K&R decided to design modifiers this way, it felt "correct" to have modifiers only affect one declaration.
On a side note, I would recommend just limiting yourself to one variable per declaration:
int *x;
int y;
The * modifies the variable name, not the type specifier. This is mostly because of the way the * is parsed. Take these statements:
char* x;
char *x;
Those statements are equivalent. The * operator needs to be between the type specifier and the variable name (it is treated like an infix operator), but it can go on either side of the space. Given this, the declaration
int* a, b;
would not make b a pointer, because there is no * adjacent to it. The * only operates on the objects on either side of it.
Also, think about it this way: when you write the declaration int x;, you are indicating that x is an integer. If y is a pointer to an integer, then *y is an integer. When you write int *y;, you are indicating that *y is an integer (which is what you want). In the statement char a, *b, ***c;, you are indicating that the variable a, the dereferenced value of b, and the triply-dereferenced value of c are all of type char. Declaring variables in this way makes the usage of the star operator (nearly) consistent with dereferencing.
I agree that it would almost make more sense for it to be the other way around. To avoid this trap, I made myself a rule always to declare pointers on a line by themselves.
Consider the declaration:
int *a[10];
int (*b)[10];
The first is an array of ten pointers to integers, the second is a pointer to an array of ten integers.
Now, if the * was attached to the type declaration, it wouldn't be syntatically valid to put a parenthesis between them. So you'd have to find another way to differentiate between the two forms.
Because if the statement
int* a, b;
were to declare b as a pointer too, then you would have no way to declare
int* a;
int b;
on a single line.
On the other hand, you can do
int*a, *b;
to get what you want.
Think about it like that: the way it is now it is still the most concise and yet unique way to do it. That's what C is mostly about :)

Resources