Equivalent C declarations - c

Are
int (*x)[10];
and
int x[10];
equivalent?
According to the "Clockwise Spiral" rule, they parse to different C declarations.
For the click-weary:
The ``Clockwise/Spiral Rule'' By David
Anderson
There is a technique known as the
``Clockwise/Spiral Rule'' which
enables any C programmer to parse in
their head any C declaration!
There are three simple steps to follow:
1. Starting with the unknown element, move in a spiral/clockwise direction;
when encountering the following elements replace them with the
corresponding English statements:
[X] or []
=> Array X size of... or Array undefined size of...
(type1, type2)
=> function passing type1 and type2 returning...
*
=> pointer(s) to...
2. Keep doing this in a spiral/clockwise direction until all tokens have been covered.
3. Always resolve anything in parentheses first!

Follow this simple process when reading declarations:
Start at the variable name (or
innermost construct if no identifier
is present). Look right without jumping
over a right parenthesis; say what you
see. Look left again without jumping
over a parenthesis; say what you see.
Jump out a level of parentheses if
any. Look right; say what you see.
Look left; say what you see. Continue
in this manner until you say the
variable type or return type.
So:
int (*x)[10];
x is a pointer to an array of 10 ints
int x[10];
x is an array of 10 ints
int *x[10];
x is an array of 10 pointers to ints

They are not equal. In the first case x is a pointer to an array of 10 integers; in the second case x is an array of 10 integers.
The two types are different. You can see they're not the same thing by checking sizeof in the two cases.
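A minimal sketch of that sizeof check. The helper names are made up for this illustration, and the actual numbers depend on the platform; the only fixed relationships are the ones noted in the comments.

```c
#include <stddef.h>

/* Hypothetical helpers, named only for this illustration. */

/* Size of an object declared as: int x[10]; */
size_t size_of_array(void)
{
    int x[10];
    return sizeof x;            /* always 10 * sizeof(int) */
}

/* Size of an object declared as: int (*x)[10]; */
size_t size_of_pointer_to_array(void)
{
    int (*x)[10];
    return sizeof x;            /* size of one pointer */
}

/* Size of what that pointer points at. */
size_t size_of_pointee(void)
{
    int (*x)[10];
    return sizeof *x;           /* sizeof does not evaluate *x here */
}
```

On a typical 64-bit platform with 4-byte int, the first and third helpers would give 40 and the second 8, but those exact values are an assumption about the platform.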

I tend to follow The Precedence Rule for Understanding C Declarations, which is given very nicely in the book Expert C Programming - Deep C Secrets by Peter van der Linden:
A - Declarations are read by starting with the name and then reading in
precedence order.
B - The precedence, from high to low, is:
B.1 parentheses grouping together parts of a declaration
B.2 the postfix operators:
parentheses () indicating a function, and
square brackets [] indicating an array.
B.3 the prefix operator: the asterisk denoting "pointer to".
C - If a const and/or volatile keyword is next to a type specifier (e.g. int,
long, etc.) it applies to the type specifier.
Otherwise the const and/or volatile keyword
applies to the pointer asterisk on its immediate left.

For me, it's easier to remember the rule as: absent any explicit grouping, () and [] bind before *. Thus, for a declaration like
T *a[N];
the [] bind before the *, so a is an N-element array of pointer. Breaking it down in steps:
a          -- a
a[N]       -- is an N-element array
*a[N]      -- of pointer
T *a[N]    -- to T.
For a declaration like
T (*a)[N];
the parens force the * to bind before the [], so
a          -- a
(*a)       -- is a pointer
(*a)[N]    -- to an N-element array
T (*a)[N]  -- of T
It's still the clockwise/spiral rule, just expressed in a more compact manner.
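To make the two readings concrete, here is a small sketch with T taken to be int and N taken to be 3. The function names are invented for the example.

```c
/* Hypothetical helpers for illustration only. */

/* T *a[N]: a is an array of 3 pointers to int. */
int sum_via_array_of_pointers(void)
{
    int x = 1, y = 2, z = 3;
    int *a[3] = { &x, &y, &z };     /* each element is an int * */
    return *a[0] + *a[1] + *a[2];   /* a[i] binds first, then * */
}

/* T (*a)[N]: a is a pointer to an array of 3 int. */
int sum_via_pointer_to_array(void)
{
    int arr[3] = { 1, 2, 3 };
    int (*a)[3] = &arr;             /* points at the whole array */
    return (*a)[0] + (*a)[1] + (*a)[2];
}
```

Note how the expressions used to reach an int mirror the declarations: *a[i] in the first case, (*a)[i] in the second.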

No. The first one declares a pointer to an array of 10 ints and the second one declares an array of 10 ints.

Related

How to understand the syntax of C multidimensional arrays?

My intuition when I see this int array_name[x][y]; is an array of y arrays i.e. array_name[x] is one element out of y such elements. But turns out it's not so [in fact it's the opposite(?) ]
The guides/tutorials seem hellbent on bringing in matrices to explain this, which makes it specific to 2D arrays. I'm looking to understand a general array_name[w][x] ... [n] syntax.
Note: fine, syntax is syntax, and this is how C defines it, okay. Then, is it true that array_name[w][x] ... [n] is simply an array of w elements each of which is array_name[x] ... [n]? But even this is not entirely correct because int a[][3] = {1,2,3,4,5,6,7}; is valid even though the RHS contains a number of elements not divisible by 3.
int x[5][3];
declares x as an array with 5 elements. Each of these elements is an array with 3 int. You're correct so far.
But you should compile with -Wall -Wextra. Look here:
k.c:2:16: warning: missing braces around initializer [-Wmissing-braces]
2 | int a[][3] = {1,2,3,4,5,6,7};
| ^
| { }{ }{}
It's valid to initialize it this way, but the more proper way of initializing it is:
int a[][3] = {{1,2,3},{4,5,6},{7,0,0}};
This is much more readable. The zeros are not needed. If you initialize one single element, all other elements will be zeroed.
One more thing is that you can go out of bounds without actually going out of bounds with multi dimensional arrays. DO NOTE THAT EVEN IF THIS IS LIKELY TO WORK, IT'S UNDEFINED BEHAVIOR, SO DON'T DO IT!
a[1][4] == (*(a+1))[4] == *(*(a+1)+4)
This is because [] is simply syntactic sugar for pointer arithmetic. So if you have declared T x[5][3]; for some type T, then x[1][1] refers to the same element that x[0][4] would.
But as I said, it's UB.
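The layout behind that remark can be checked without the out-of-bounds access. The sketch below (helper name invented, T assumed to be int) only takes addresses of elements that exist and reports how many ints each one sits from the start of the array.

```c
#include <stddef.h>

/* Hypothetical helper: position of x[i][j], counted in ints from the
   start of x. Requires i < 5 and j < 3, so nothing is accessed or
   addressed out of bounds. */
size_t flat_index(size_t i, size_t j)
{
    static int x[5][3];
    const char *base = (const char *)x;
    const char *elem = (const char *)&x[i][j];
    return (size_t)(elem - base) / sizeof(int);
}
```

flat_index(1, 1) is 1 * 3 + 1, which is the same position that the (undefined) expression x[0][4] would be computed to reach.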
The way to read a multidimensional array declaration like
int arr[N][M];
is as an N-element array of M-element arrays of int. Each arr[i] has type int [M].
We can get there using substitution. Let's start with a simple object declaration:
T a;
a is an instance of something which we call T. Now replace T with the array type R [N]:
R a[N];
Important thing to note - since the [] operator is postfix in both expressions and declarations, when we substitute T with R [N] the [N] goes to the rightmost side of the declarator a, giving us R a[N]; the importance of this will be clear on the next round of substitution.
So now a is an N-element array of something, and that something is type R. Now we replace R with another array type, int [M]:
int a[N][M];
Again, since the [] operator is postfix, we need to add it to the rightmost side of the declarator a[N] when doing the substitution, giving us a[N][M]. a is still an N-element array of something, it's just now that something is "M-element array of int". Hence, a is an N-element array of M-element arrays of int.
But even this is not entirely correct because int a[][3] = {1,2,3,4,5,6,7}; is valid even though the RHS contains a number of elements not divisible by 3.
That's covered here:
6.7.9 Initialization
...
21 If there are fewer initializers in a brace-enclosed list than there are elements or members
of an aggregate, or fewer characters in a string literal used to initialize an array of known
size than there are elements in the array, the remainder of the aggregate shall be
initialized implicitly the same as objects that have static storage duration.
C 2011 Online Draft
So that initializer is interpreted as:
int a[][3] = {{1,2,3},{4,5,6},{7,0,0}};
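A quick way to confirm that interpretation is to ask the compiler how many rows it deduced. This is a sketch with invented helper names; it keeps the unbraced initializer from the question, so a compiler may still emit the missing-braces warning.

```c
#include <stddef.h>

/* Seven initializers, three per row: the compiler deduces three rows
   and zero-fills the rest of the last row. */
int a[][3] = { 1, 2, 3, 4, 5, 6, 7 };

/* Number of rows the compiler deduced for the first dimension. */
size_t row_count(void) { return sizeof a / sizeof a[0]; }

/* Sum of the last row, which holds 7 followed by implicit zeros. */
int last_row_sum(void) { return a[2][0] + a[2][1] + a[2][2]; }
```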

C declarator understanding

I am reading about declarators in C99 ISO Standard and I am struggling to understand the following passage:
5 If, in the declaration “T D1”, D1 has the form
    identifier
then the type specified for ident is T.
6 If, in the declaration “T D1”, D1 has the form
    ( D )
then ident has the type specified by the declaration “T D”. Thus, a declarator in parentheses is identical to the unparenthesized declarator, but the binding of complicated declarators may be altered by parentheses.
You omitted an important previous paragraph:
4 In the following subclauses, consider a declaration
    T D1
where T contains the declaration specifiers that specify a type T (such as int) and D1 is a declarator that contains an identifier ident. The type specified for the identifier ident in the various forms of declarator is described inductively using this notation.
So, when we get to paragraphs 5 and 6, we know the declaration we are considering contains within it some identifier which we label ident. E.g., in int foo(void), ident is foo.
Paragraph 5 says that if the declaration “T D1” is just “T ident”, it declares ident to be of type T.
Paragraph 6 says that if the declaration “T D1” is just “T (ident)”, it also declares ident to be of type T.
These are just establishing the base cases for a recursive specification of declaration. Clause 6.7.5.1 goes on to say that if the declaration “T D1” is “T * some-qualifiers D” and the same declaration without the * and the qualifiers, “T D”, would declare ident to be “some-derived-type T” (like “array of T” or “pointer to T”), then the declaration with the * and the qualifiers declares ident to be “some-derived-type some-qualifiers pointer to T”.
For example, int x[3] declares x to be “array of 3 int”, so this rule in 6.7.5.1 tells us that “int * const x[3]” declares x to be “array of 3 const pointer to int”; it takes the “array of 3” that must have been derived previously and appends “const pointer to” to it.
Similarly, clauses 6.7.5.2 and 6.7.5.3 tell us to append array and function types to declarators with brackets (for subscripts) and postfix parentheses.
In the quoted definitions, T is a type (e.g. int or double or struct foo or any typedef name), and D1 is a declarator.
Declarators may be arbitrarily complex, but are constructed from a small number of simple steps. They mention that it is defined inductively, which is another way of saying it has a recursive definition.
The simplest declarator is just an identifier name, such as x or foo. So T could be int and D1 could be x.
It then goes on to say that a declarator may be parenthesized, e.g. (x) or ((x)) etc. These are degenerate cases, and are both equivalent to just x, but there are times when parentheses are needed to produce the desired grouping. For example, the declarators *x[10] and (*x)[10] mean quite different things. The former is equivalent to *(x[10]) and is an array of pointers, while the latter is a pointer to an array.
There is more to it (arrays, pointers, functions, etc.), but this covers the portion referenced in the question.
In long int *ident[4]; (array (4) of pointer to long int) long int is the specifier list, *ident[4] is the declarator.
You can put the declarator in parentheses without changing semantics:
long int (*ident[4]);, but in a more complex declaration such as
long int (*ident[4])[5]; (array (4) of pointer to array (5) of long int), the parentheses affect binding as without them, long int *ident[4][5]; would be interpreted as array (4) of array (5) of pointer to long int.
It works this way because the declarator part can recursively encompass another declarator and so on.
From the old C manual (it got a bit more indirect in later standardized Cs, but the basic principle is the same):
declarator:
identifier
* declarator
declarator ( )
declarator [ constant-expression opt ]
( declarator )
To put it in another way, C declarators are a way to prefix pointer-to/function-returning/array-of to some specifier list (e.g., long int) and these can be combined to create a chain.
Normally, in the creation of that chain, the suffixes (()=function returning, []=array of) bind tighter than the */pointer-to prefix. You can use parentheses to override this and force * (or several of them) to bind now without it getting overpowered by a []/() suffix.
E.g.:
long int *ident[4][5]; //array (4) of array (5) of pointer to long int
//the suffixes win over the `*` prefix
long int (*ident[4])[5]; //array (4) of pointer to array (5) of long int
//the parens force `pointer to` right after `array (4)`
//without letting the `[]` suffix overpower the pointer-to/`*` declarator prefix
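The two declarations can be compared directly in code. In this sketch the object names grid and rows and the helper function are invented, and the compile-time checks assume a C11 compiler for _Static_assert.

```c
#include <stddef.h>

long int *grid[4][5];      /* array (4) of array (5) of pointer to long int */
long int (*rows[4])[5];    /* array (4) of pointer to array (5) of long int */

/* Compile-time checks (C11): element counts follow the spoken types. */
_Static_assert(sizeof grid / sizeof(long int *) == 20, "4 * 5 pointers");
_Static_assert(sizeof rows / sizeof rows[0] == 4, "4 pointers");
_Static_assert(sizeof *rows[0] == 5 * sizeof(long int), "each points at 5 longs");

/* Hypothetical helper: follow rows[1] to an array of 5 long int. */
long int read_through_rows(void)
{
    static long int data[5] = { 10, 20, 30, 40, 50 };
    rows[1] = &data;           /* &data has type long int (*)[5] */
    return (*rows[1])[2];      /* dereference first, then index */
}
```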
P.S.:
C disallows arrays of functions and functions returning functions or arrays. These are both nicely expressible in the grammar, but C's semantic check will want you to put a pointer-to link in there to prevent these.
cdecl.org may be very helpful if you want to play with these, and perhaps even more so because it doesn't do said semantic check, therefore allowing you even things like int foo[]()(); (array of function returning function returning int), which nicely demonstrate how the grammar works, even though a C compiler would reject them.
...binding of complicated declarators
This is a very nice hint in the specs.
The rule itself is really hard to analyze because of its recursiveness. The relation between declaration and declarator, and the parts of each, also matter.
The results are:
() and [] are the innermost direct-declarator parts, declaring (directly by symbol) functions and array names to the left
* declares a name as pointer on the right
(...) is needed for...complicated cases, to change the default association.
The grouping parens lead you on your way from inside (identifier) to outside (type specifier on the left, say "int").
In the end it is all about the pointer symbol * and what it refers to. The reformulated (brackets mean optional, not array here!) syntax is:
declarator: [* [qual]] direct-declarator
direct-declarator: (declarator)
foo() is a DD (direct declarator)
*foo is a declarator. ("indirect" by deduction)
*foo() is *(foo()). foo stays a function, because () and [] bind strongest. The * is part of the return type.
(*foo)() makes foo a pointer. One to a function.
BTW this also explains why, in a list of declarators such as
int const * a, b
both are declared in terms of const int, but only a is a pointer.
The const belongs to int and the star only to a. This makes it clearer:
const int x, *pi
But this already is borderline obfuscation. Like modern poetry. Good for certain occasions.
Even without parens there is a slight U-turn in parsing. But this is natural:
 3  2 0 1
int *foo()
This standard situation (and similar ones) had to be simple. Also the famous multidim arrays like int a[10][10][10].
 3   1 0  2
int (*foo)()
Here the parens force "foo" to be what is on the left side (a pointer).
Complicated Declarations have their own chapter in K&R C book.
This is sort of the simplest complicated declaration:
int (*(*foo)[])()
It debugs to abstract type:
int (*(*)[])()
With (*) replaced:
int (*F[])()
The missing array size gives compiler warning - "assuming one element".
As abstract type:
int (*[])()
But:
int *G[]()
--> error: declaration of 'G' as array of functions
Yes, you can, even recursively, but with the * indirection and parens. This makes an onion of parens with the identifier in the middle, stars on the left and [] and () on the right.
The C11 spec has this monster. The ... declares variadic args:
int (*fpfi(int (*)(long), int))(int, ...)
With all params removed:
int (*fpfi())()
Simply a function returning a pointer. One to a function returning int.
But the first param of fpfi is a function itself - a pointer to function with return type and its own params:
int (*)(long)
Non-abstractly:
int (*foo)(long)
A pointer to a function that "converts" a long to int, formally.
That is the param. Only. The return value is also a function pointer, and the pointed-to function's params and return type are outermost. Dropping the whole inner function thing (int (*)(long), int):
int (*pfi)(int, ...)
Or more generic/incomplete:
int (*pfi)()
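To see the declaration in use, here is a sketch that gives fpfi a body. Only the declarator shape comes from the standard's example; the body, the helper functions sum_ints and long_to_int, the typedef pfi_t and the demo function are all invented for illustration.

```c
#include <stdarg.h>

/* Hypothetical variadic target: adds up `count` int arguments. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;
    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}

/* Hypothetical callback matching int (*)(long). */
int long_to_int(long v) { return (int)v; }

/* Same shape as the declaration above: a function taking
   (int (*)(long), int) and returning a pointer to int (int, ...).
   The body is made up for this sketch. */
int (*fpfi(int (*cb)(long), int n))(int, ...)
{
    (void)cb(n);        /* call the callback so the parameter is used */
    return sum_ints;
}

/* The returned function pointer type, spelled with a typedef. */
typedef int (*pfi_t)(int, ...);

int demo(void)
{
    pfi_t f = fpfi(long_to_int, 7);
    return f(3, 1, 2, 3);   /* calls sum_ints(3, 1, 2, 3) */
}
```

With the typedef, the same function could be declared as pfi_t fpfi(int (*)(long), int); which is usually easier to read than the nested form.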
"T (D)" Onion Rule
So this onion-game repeats itself. Inside-out and right-left-right-left between [], () and *. Syntax is not the problem, but semantics.

Bracket order in multidimensional arrays

int data[3][5];
is a 3-element array of 5-element arrays.
Why? Intuitively, if int[3] is a 3-element array, then int[3][5] should be a 5-element array of 3-element arrays.
The intuition should come from the indexing convention - since it is an array of arrays, first index is selecting the element which is an array, the second index is selecting the element of the selected array. That is:
data[2][4] will select element number 4 of the array number 2 (mind the zero-basing).
Now the definition of such an array seems to be a bit counter-intuitive as you noted, but apparently it is this way just to be consistent with indexing syntax, otherwise it will be much more confusing.
C doesn't always work in an intuitive way because of things like the spiral rule, though maybe you're mis-applying it here.
As with any language, you need to accept the syntax for what it is, not what you think it is, or you'll constantly be fighting with the language on a semantic level.
Tools like cdecl explain it as:
declare data as array 3 of array 5 of int
This falls out of C's concept of declarators. The pointer-ness, array-ness, or function-ness of a declaration is specified in the declarator, while the type-ness is specified with a type specifier [1]:
int *p; // *p is the declarator
double arr[N][M]; // arr[N][M] is the declarator
char *foo( int x ); // *foo( int x ) is the declarator
This allows you to create arbitrarily complex types in a compact manner:
int *(*foo(void))[M][N];
foo is a function taking no parameters, returning a pointer to an M-element array of N-element arrays of pointer to int.
Thus, the actual type of an object or function is specified through the combination of the type specifier (and any qualifiers) and the declarator.
Unfortunately, "compact" is just another way of saying "eye-stabby". Declarations like that can be hard to read and understand. It does mean that things like multi-dimensional array declarations read kind of "backwards":
+---------------------------------+
|                                 |
v                                 |
type arr -> array-of -> array-of -+
      ^
      |
  start here
But, if you work it through, it does make sense. Let's start with some arbitrary type T. We declare an array of T as
T arr[N];
Thus, arr is an N-element array of T. Now we replace T with an array type R [M]. This gives us
R arr[N][M];
arr is still an N-element array of something, and that something is R [M], which is why we write arr[N][M] instead of arr[M][N].
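The substitution can be spelled out with a typedef. In this sketch R is assumed to be int [M], the values of N and M are arbitrary, and the object and function names are invented; the compile-time checks assume C11.

```c
#include <stddef.h>

enum { N = 2, M = 3 };  /* arbitrary sizes for the example */

typedef int R[M];       /* R is "M-element array of int" */

R   via_typedef[N];     /* N-element array of R */
int direct[N][M];       /* the same type written out */

/* No cast is needed here, since via_typedef has type int [N][M]. */
int (*p)[N][M] = &via_typedef;

_Static_assert(sizeof via_typedef == sizeof direct, "same size");
_Static_assert(sizeof direct[0] == M * sizeof(int), "each element has M ints");

/* Number of elements in the outer array. */
size_t outer_length(void) { return sizeof direct / sizeof direct[0]; }
```

outer_length() yields N, not M, which matches reading the leftmost bracket as the outermost array.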
[1] There are also storage-class specifiers, type qualifiers, etc., but we won't go into those here.

Why are C's arrays first dimension ignored by the compiler as a function parameter?

I know that in C if we were to write:
void myFunction(int x[30]) // 30 is ignored by the compiler
void myFunction(int x[][30]) // 30 is not ignored here but if I put say '40' in the first dimension
// it would be ignored.
Why is it that the first dimension is ignored by the compiler?
void myFunction(int x[30])
is equivalent to
void myFunction(int *x)
i.e., when an array is used as a function parameter, the compiler treats the array name as a pointer to the first element of the array. In this case the length of the first dimension is of no use.
This means you have to pass the size of the array to the function explicitly.
In the context of a function parameter declaration, both T a[] and T a[N] are interpreted as T *a; that is, all three declare a as a pointer to T. This goes along with the fact that, unless it is the operand of the sizeof or the unary & operator, an expression of type "N-element array of T" will be converted to an expression of type "pointer to T" and its value will be the address of the first element of the array.
It's not that the dimension is being ignored, it's that it's not meaningful in this context.
Since the C function does not check whether an array reference is in bounds, and since it does not allocate any space for it, the dimension has no use there. It only calculates an offset from the pointer (start of the array) and it already knows how to do that (based on the size of int).
When you specify more than one dimension, it needs to know that dimension only so it can calculate the proper offset for an array reference.
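Both points can be observed in a short sketch. All names here are invented for the example, and the arrays passed in are sized to match the parameter declarations.

```c
#include <stddef.h>

/* Hypothetical functions, named for this sketch only. */

/* Declared with a size, but the parameter type is adjusted to int *. */
int first_element(int x[30]) { return x[0]; }

/* This initialization needs no cast because the adjusted type of
   first_element is int (int *). */
int (*as_pointer_version)(int *) = first_element;

/* Here the parameter is int (*)[30]: the 30 stays in the type because
   it determines how far x + 1 moves. */
size_t row_stride(int x[][30])
{
    return (size_t)((char *)(x + 1) - (char *)x);   /* bytes per row */
}

int demo_first(void)
{
    static int big[30] = { 7 };
    return as_pointer_version(big);
}

size_t demo_stride(void)
{
    static int table[2][30];
    return row_stride(table);
}
```

demo_stride() gives 30 * sizeof(int), which is the offset calculation the second dimension is needed for.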
It is not ignored/useless according to the language. It may be ignored by the compiler.
If inside myFunction, you write:
... x[29] ...
you get a valid program.
If you write
... x[30] ...
your program has undefined behavior if the array that was passed in has only 30 elements. The compiler may or may not check for this.
The fact that compiler can't always check everything is the price one pays for having a language as close to the machine as C is.

C programming: arrays and pointers [duplicate]

If I define:
int tab[4];
tab is a pointer, because if I display tab:
printf("%d", tab);
the code above will display the address to the first element in memory.
That's why I was wondering why we don't define an array like the following:
int *tab[4];
as tab is a pointer.
Thank you for any help!
tab is a pointer
No, tab is an array. An int[4] to be specific. But when you pass it as an argument to a function (and in many other contexts) the array is converted to a pointer to its first element. You can see the difference between arrays and pointers for example when you call sizeof array vs. sizeof pointer, when you try to assign to an array (that won't compile), and more.
int *tab[4];
declares an array of four pointers to int. I don't see how that is related to the confusion between arrays and pointers.
tab is not a pointer; it's an array of 4 integers. When passed to a function, it decays into a pointer to the first element:
int tab[4];
And this is another array but it holds 4 integer pointers:
int *tab[4];
Finally, for the sake of completeness, this is a pointer to an array of 4 integers, if you dereference this you get an array of 4 integers:
int (*tab)[4];
You are not completely wrong, meaning that your statement is wrong but you are not that far from the truth.
Arrays and pointers in C share the same arithmetic, but the main difference is that arrays are containers, whereas pointers are like any other atomic variable: their purpose is to store a memory address and provide information about the type of the pointed-to value.
I suggest reading something about pointer arithmetic:
Pointer Arithmetic
http://www.learncpp.com/cpp-tutorial/68-pointers-arrays-and-pointer-arithmetic/
Considering the Steve Jessop comment I would like to add a snippet that can introduce you to the simple and effective world of the pointer arithmetic:
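#include <stdio.h>

int main(void)
{
    int arr[10] = {10,11,12,13,14,15,16,17,18,19};
    int pos = 3;
    /* arr[pos] is defined as *(arr + pos) */
    printf("Arithmetic part 1 %d\n", arr[pos]);
    /* pos[arr] is *(pos + arr), which is the same element */
    printf("Arithmetic part 2 %d\n", pos[arr]);
    return 0;
}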
Arrays can behave like pointers, and even look like pointers in your case; you can apply exactly the same kind of arithmetic, but they are not pointers.
int *tab[4];
This definition means that the tab array contains pointers to int, not ints.
From C standard
Coding Guidelines
The implicit conversion of array objects to a
pointer to their first element is a great inconvenience in trying to
formulate stronger type checking for arrays in C. Inexperienced, in
the C language, developers sometimes equate arrays and pointers much
more closely than permitted by this requirement (which applies to uses
in expressions, not declarations). For instance, in:
file_1.c
extern int *a;
file_2.c
extern int a[10];
the two declarations of a are sometimes incorrectly assumed by
developers to be compatible. It is difficult to see what guideline
recommendation would overcome incorrect developer assumptions (or poor
training). If the guideline recommendation specifying a single point
of declaration is followed, this problem will not occur. Unlike the
function designator usage, developers are familiar with the fact that
objects having an array type are implicitly converted to
a pointer to their first element. Whether applying a unary & operator
to an operand having an array type provides readers with a helpful
visual cue or causes them to wonder about the intent of the author
(“what is that redundant operator doing there?”) is not known.
Example
static double a[5];
void f(double b[5])
{
double (*p)[5] = &a;
double **q = &b; /* This looks suspicious, */
p = &b; /* and so does this. */
q = &a;
}
If the array object has register storage class, the behavior is undefined
Under most circumstances, an expression of array type will be converted ("decay") to an expression of pointer type, and the value of the expression will be the address of the first element in the array. The exceptions to this rule are when the array expression is an operand of the sizeof, _Alignof, or unary & operators, or is a string literal being used to initialize another array in a declaration.
int tab[4];
defines tab as a 4-element array of int. In the statement
printf("%d", tab); // which *should* be printf("%p", (void*) tab);
the expression tab is converted from type "4-element array of int" to "pointer to int".
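The sizeof and unary & exceptions can be made visible by measuring how far "plus one" moves in each case. This is a sketch with invented function names, using the same tab declaration as above.

```c
#include <stddef.h>

int tab[4];

/* sizeof: no decay, so this is the size of the whole array. */
size_t whole_array_size(void) { return sizeof tab; }

/* Plain use: tab decays to int *, so + 1 advances by one int. */
size_t step_after_decay(void)
{
    return (size_t)((char *)(tab + 1) - (char *)tab);
}

/* Unary &: no decay, &tab is int (*)[4], so + 1 advances by the
   whole array. */
size_t step_of_address(void)
{
    return (size_t)((char *)(&tab + 1) - (char *)&tab);
}
```

tab and &tab compare equal as addresses, but the different step sizes show that their types differ.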

Resources