Confusion about dereference operator ("*") in C - c

As far as I know, the dereference operator * returns the value stored at the address held in the pointer. What I'm confused by is the behavior when the operator is used with a pointer to an array. For example,
int a[4][2];
Then a is internally converted to a pointer to the first element of an array of 4 elements, each of which is an array of 2 ints. Which value does *a return, then? I'm really confused!

The type of a is int[4][2], so the type of *a (or equivalently a[0]) is int[2].
It is not the same as a[0][0]. If you do this:
int a[4][2];
printf("%d\n",*a);
The compiler will tell you this:
warning: format ‘%d’ expects type ‘int’, but argument 2 has type ‘int *’
Since an array (in this case one of type int [2]) is being passed to a function, it decays to a pointer to the first element in this context.
If on the other hand you had **a, that is equivalent to a[0][0] and has type int.

This:
int a[4][2];
defines a as an array of 4 elements, each of which is an array of 2 int elements. (A 2-dimensional array is nothing more or less than an array of arrays.)
An array expression is, in most contexts, implicitly converted to a pointer to the array object's initial (zeroth) element. (Note the assumption that there is an array object; that has caused some angst, but it's not relevant here.)
The cases where an array expression is not converted to a pointer are:
When it's the operand of sizeof;
When it's the operand of unary &; and
When it's a string literal in an initializer used to initialize an array object.
(Compiler-specific extensions like gcc's typeof might create more exceptions.)
So in the expression *a, the subexpression a (which is of type int[4][2]) is implicitly converted to a pointer of type int(*)[2] (pointer to array of 2 ints). Applying unary * dereferences that pointer, giving us an expression of type int[2].
But we're not quite done yet. *a is also an expression of array type, which means that, depending on how it's used, it will probably be converted again to a pointer, this time of type int*.
If we write sizeof *a, the subexpression a is converted from int[4][2] to int(*)[2], but the subexpression *a is not converted from int[2] to int*, so the expression yields the size of the type int[2].
If we write **a, the conversion does occur. *a is of type int[2], which is converted to int*; dereferencing that yields an expression of type int.
Note that despite the fact that we can legally refer to **a, using two pointer dereference operations, there are no pointer objects. a is an array object, consisting entirely of 8 int objects. The implicit conversions yield pointer values.
The implicit array-to-pointer conversion rules are in N1570 section 6.3.2.1 paragraph 3. (That paragraph incorrectly gives _Alignof as a fourth exception, but _Alignof cannot be applied to an expression. The published C11 standard corrected the error.)
Recommended reading: Section 6 of the comp.lang.c FAQ.

Related

Can you prove why casting is important when I dereference a void pointer?

Why is it necessary to cast when I dereference a void pointer?
I have this example:
int x;
void* px = &x;
*px = 9;
Can you prove why this doesn't work?
By definition, a void pointer points to an I'm-not-sure-what-type-of-object.
By definition, when you use the unary * operator to access the object pointed to by a pointer, you must know (well, the compiler must know) what the type of the object is.
So we have just proved that we cannot directly dereference a void pointer using *; we must always explicitly cast the void pointer to some actual object pointer type first.
Now, in many people's minds, the "obvious" answer to "what type does/should a 'generic' pointer point to?" is "char". And, once upon a time, before the void type had been invented, character pointers were routinely used as "generic" pointers. So some compilers (including, notably, gcc) extend things a bit and let you do more (such as pointer arithmetic) with a void pointer than the standard requires.
So that might explain how code like that in your question might be able to "work". (In your case, though, since the pointed-to type was actually int, not char, if it "worked" it was only because you were on a little-endian machine.)
...And with that said, I find that the code in your question does not work for me, not even under gcc. It first gives me a non-fatal warning:
warning: dereferencing ‘void *’ pointer
But then it changes its mind and decides this is an error instead:
error: invalid use of void expression
A second compiler I tried said something similar:
error: incomplete type 'void' is not assignable
Addendum: To say a little more about why the pointed-to type is required when you dereference a pointer:
When you access a pointer using *, the compiler is going to emit code to fetch from (or maybe store to) the pointed-to location. But the compiler is going to have to emit code that accesses a certain number of bytes, and in many cases it may matter how those byte(s) are interpreted. Both the number and the interpretation of the bytes are determined by the type (that's what types are for), which is precisely why an actual, non-void type is required.
One of the best ways I know of appreciating this requirement is to consider code like
*p + 1
or, even better
*p += 1
If p points to a char, the compiler is probably going to emit some kind of an addb ("add byte") instruction.
If p points to an int, the compiler is going to emit an ordinary add instruction.
If p points to a float or double, the compiler is going to emit a floating-point addition instruction. And so on.
But if p is a void *, the compiler has no idea what to do. It complains (in the form of an error message) not just because the C standard says you can't dereference a void pointer, but more importantly, because the compiler simply doesn't know what to do with your code.
In short:
The target of an assignment expression must be a modifiable lvalue, which cannot be a void expression. This is because the void type does not represent any values - it denotes an absence of a value. You cannot create an object of type void.
If the expression px has type void *, then the expression *px has type void. Attempting to assign to *px is a constraint violation and the compiler is required to yell at you for it.
If you want to assign a new value to x through px, then you have to cast px to an int * before dereferencing:
*((int *)px) = 5;
Chapter and verse:
6.2.5 Types
...
19 The void type comprises an empty set of values; it is an incomplete object type that
cannot be completed.
...
6.3.2.1 Lvalues, arrays, and function designators
1 An lvalue is an expression (with an object type other than void) that potentially
designates an object; if an lvalue does not designate an object when it is evaluated, the
behavior is undefined. When an object is said to have a particular type, the type is
specified by the lvalue used to designate the object. A modifiable lvalue is an lvalue that
does not have array type, does not have an incomplete type, does not have a const-qualified type, and if it is a structure or union, does not have any member (including,
recursively, any member or element of all contained aggregates or unions) with a const-qualified type.
...
6.3.2.2 void
1 The (nonexistent) value of a void expression (an expression that has type void) shall not
be used in any way, and implicit or explicit conversions (except to void) shall not be
applied to such an expression. If an expression of any other type is evaluated as a void
expression, its value or designator is discarded. (A void expression is evaluated for its
side effects.)
...
6.3.2.3 Pointers
1 A pointer to void may be converted to or from a pointer to any object type. A pointer to
any object type may be converted to a pointer to void and back again; the result shall
compare equal to the original pointer.
...
6.5.3.2 Address and indirection operators
...
4 The unary * operator denotes indirection. If the operand points to a function, the result is
a function designator; if it points to an object, the result is an lvalue designating the
object. If the operand has type ‘‘pointer to type’’, the result has type ‘‘type’’. If an
invalid value has been assigned to the pointer, the behavior of the unary * operator is
undefined.
...
6.5.16 Assignment operators
...
Constraints
2 An assignment operator shall have a modifiable lvalue as its left operand.
More specifically, dereferencing a void pointer violates the wording of 6.5.3.2 Address and indirection operators, paragraph 4:
The unary * operator denotes indirection. If the operand points to a function, the result is a function designator; if it points to an object, the result is an lvalue designating the object. If the operand has type ''pointer to type'', the result has type ''type''. If an invalid value has been assigned to the pointer, the behavior of the unary * operator is undefined.
Since a pointer to void has no "type" - it can't be dereferenced. Note that this is beyond undefined behavior - it is a violation of the C language standard.
It probably doesn't work because it violates a rule in the ISO C standard which requires a diagnostic, and (I'm guessing) your compiler is treating that as a fatal situation.
According to ISO C99, as well as the C11 Draft (n1548), the only constraint on the use of the * dereferencing operator is "[t]he operand of the unary * operator shall have pointer type." [6.5.3.2¶2, n1548] The code we have here meets that constraint, and has no syntax error. Therefore no diagnostic is required for the use of the * operator.
However, what is the meaning of the * operator applied to a void * pointer?
"The unary * operator denotes indirection. If the operand points to a function, the result is a function designator; if it points to an object, the result is an lvalue designating the object. If the operand has type ‘‘pointer to type’’, the result has type ‘‘type’’." [6.5.3.2¶4, n1548]
The type void is neither a function nor an object type, so the middle sentence, which talks about producing a function or object designator, is not applicable to our case. The last sentence quoted above is applicable; it gives a requirement that an expression which dereferences a void * has void type.
Thus *px = 9; runs aground because it's assigning an int value to a void expression. An assignment requires an lvalue expression of object type; void is not an object type and the expression is certainly not an lvalue. The exact wording of the constraint is: "An assignment operator shall have a modifiable lvalue as its left operand." [6.5.16¶2, n1548] Violation of this constraint requires a diagnostic.
It appears from my perhaps naive reading of the standard that the expression *px as such is valid; only no attempt must be made to extract a result from it, or use it as the target of an assignment. If that is true, it could be used as an expression statement whose value is discarded: if (foo()) { *px; }, and it could be redundantly cast to void also: (void) *px. These apparently pointless situations might be somehow exploited by, or at least arise in, certain kinds of macros.
For instance, if we want to be sure that the argument of some macro is a pointer we can take advantage of the constraint that * requires a pointer operand:
#define MAC(NUM, PTR) ( ... (void) *(PTR) ...)
I.e. somewhere in the macro we dereference the pointer and throw away the result, which will diagnose if PTR isn't a pointer. It looks like ISO C allows this usage even if PTR is a void *, which is arguably useful.

How sizeof works when an array name is passed

Why is sizeof(array_name) the size of the array and sizeof(&a[0]) the size of a pointer, even though, when an array name is passed to a function, what is passed is the location of the beginning of the array?
In most expressions, when an array is used, it is automatically converted to a pointer to its first element.
There is a special rule for sizeof: When an array is an operand of sizeof, it is not automatically converted to a pointer. Therefore, sizeof array_name gives the size of the array, not the size of a pointer.
This rule also applies to the unary & operator: &array_name is the address of the array, not the address of a pointer.
Also, if an array is a string literal used to initialize an array, it is not converted to a pointer. The string literal is used to initialize the array.
The rule for this is C 2018 6.3.2.1 3:
Except when it is the operand of the sizeof operator, or the unary & operator, or is a string literal used to initialize an array, an expression that has type "array of type" is converted to an expression with type "pointer to type" that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.
Array parameters to functions don't exist. They turn into pointers (only the outermost dimension does, if the array is multidimensional, so int X[3][4] becomes int (*X)[4]). It's just syntactic sugar inherited from the B language. void f(int *X) == void f(int X[]);
(Same for function params to functions. void g(void X(void)) == void g(void (*X)(void))).

If int A[5] is declared, then A is a pointer to A[0], which means A is just a pointer. Then how come sizeof(A) gives 20?

Suppose int A[5] is declared; then the variable A will be a pointer to A[0]. That means A is just a pointer that stores the base address of the array A[5]. Then how come sizeof(A) gives 20 as the answer?
A isn't a pointer to the first element. It is an int[5], or a five element array of ints (the size is part of the type). It can decay into a pointer to the first element when you do stuff like pass it to a function taking a pointer.
Contrary to what you might have heard, arrays are not the same as pointers.
In most contexts, the name of an array will decay into a pointer to the first element, such as being passed to a function or as the subject of pointer arithmetic.
One of the cases where this decay does not happen is when the array is the subject of the sizeof operator. In that case the operator returns the full size of the array in bytes.
This is detailed in section 6.3.2.1p3 of the C standard:
Except when it is the operand of the sizeof operator, the _Alignof
operator, or the unary & operator, or is a string literal used to
initialize an array, an expression that has type "array of type" is
converted to an expression with type "pointer to type" that points
to the initial element of the array object and is not an lvalue. If
the array object has register storage class, the behavior is
undefined.
Section 6.5.3.4p4, which details the sizeof operator, additionally states:
When sizeof is applied to an operand that has type char, unsigned char, or signed char, (or a qualified version thereof) the result is
1. When applied to an operand that has array type, the result is the total number of bytes in the array. When applied to an operand
that has structure or union type, the result is the total number of
bytes in such an object, including internal and trailing padding.
If you had something like this:
int A[5];
int *B;
B = A;
printf("sizeof(B)=%zu\n", sizeof(B));
You would get the size of an int * on your system, most likely either 4 or 8.

casting pointer to array into pointer

Consider the following C code:
int arr[2] = {0, 0};
int *ptr = (int*)&arr;
ptr[0] = 5;
printf("%d\n", arr[0]);
Now, it is clear that the code prints 5 on common compilers. However, can somebody find the relevant sections in the C standard that specifies that the code does in fact work? Or is the code undefined behaviour?
What I'm essentially asking is why &arr when cast to void * is the same as arr when cast to void *? Because I believe the code is equivalent to:
int arr[2] = {0, 0};
int *ptr = (int*)(void*)&arr;
ptr[0] = 5;
printf("%d\n", arr[0]);
I invented the example while thinking about the question here: Pointer-to-array overlapping end of array ...but this is clearly a distinct question.
For unions and structures, cf. ISO 9899:2011§6.7.2.1/16f:
16 The size of a union is sufficient to contain the largest of its members. The value of at most one of the members can be stored in a union object at any time. A pointer to a union object, suitably converted, points to each of its members (or if a member is a bit-field, then to the unit in which it resides), and vice versa.
17 Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.
For array types, the situation is a bit more complex. First, observe what an array is, from ISO 9899:2011§6.2.5/20:
An array type describes a contiguously allocated nonempty set of objects with a particular member object type, called the element type. The element type shall be complete whenever the array type is specified. Array types are characterized by their element type and by the number of elements in the array. An array type is said to be derived from its element type, and if its element type is T, the array type is sometimes
called “array of T”. The construction of an array type from an element type is called “array type derivation”.
The wording “contiguously allocated” implies that there is no padding between array members. This notion is affirmed by footnote 109:
Two objects may be adjacent in memory because they are adjacent elements of a larger array or adjacent members of a structure with no padding between them, or because the implementation chose to place them so, even though they are unrelated. If prior invalid pointer operations (such as accesses outside array bounds) produced undefined behavior, subsequent comparisons also produce undefined behavior.
The use of the sizeof operator in §6.5.3.4, Example 2 expresses the intent that there is also no padding before or after arrays:
EXAMPLE 2
Another use of the sizeof operator is to compute the number of elements
in an array:
sizeof array / sizeof array[0]
I therefore conclude that a pointer to an array, converted to a pointer to the element type of that array, points to the first element in the array. Furthermore, observe what the definition of equality says about pointers (§6.5.9/6f.):
6 Two pointers compare equal if and only if both are null pointers, both are pointers to the same object (including a pointer to an object and a subobject at its beginning) or function, both are pointers to one past the last element of the same array object, or one is a pointer
to one past the end of one array object and the other is a pointer to the start of a different array object that happens to immediately follow the first array object in the address space.109)
7 For the purposes of these operators, a pointer to an object that is not an element of an array behaves the same as a pointer to the first element of an array of length one with the type of the object as its element type.
Since the first element of an array is “a subobject at its beginning,” a pointer to the first element of an array and a pointer to an array compare equal.
Here is a slightly refactored version of your code for easier reference:
int arr[2] = { 0, 0 };
int *p1 = &arr[0];
int *p2 = (int *)&arr;
with the question being: Is p1 == p2 true, or unspecified, or UB?
Firstly: I think that it is intended by the authors of C's abstract memory model that p1 == p2 is true; and if the Standard doesn't actually spell it out then it would be a defect in the Standard.
Moving on; the only relevant piece of text seems to be C11 6.3.2.3/7 (irrelevant text excised):
A pointer to an object type may be converted to a pointer to a different object type. [...] When converted back again, the result shall compare equal to the original pointer.
When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object.
It doesn't specifically say what the result of the first conversion is. Ideally it should say ...and the pointer points to the same address, but it doesn't.
However, I argue that it is implied that the pointer must point to the same address after the conversion. Here is an illustrative example:
void *v1 = malloc( sizeof(int) );
int *i1 = (int *)v1;
If we do not accept "and the pointer points to the same address" then i1 might not actually point into the malloc'd space, which would be ridiculous.
My conclusion is that we should read 6.3.2.3/7 as saying that the pointer cast does not change the address being pointed to. The part about using pointers to character type seems to back this up.
Therefore, since p1 and p2 have the same type and point to the same address, they compare equal.
To answer directly:
Can somebody find the relevant sections in the C standard that specifies that the code does in fact work?
6.3.2.1 Lvalues, arrays, and function designators, paragraph 1
6.3.2.3 Pointers, paragraphs 1, 5 and 6
6.5.3.2 Address and indirection operators, paragraph 3
Or is the code undefined behaviour?
The code you posted is not undefined, but it "might" be compiler/implementation specific (per section 6.3.2.3 p5/6)
What I'm essentially asking is why &arr when casted into void * is the same as arr when casted into void *?
This would imply asking why int *ptr = (int*)(void*)&arr gives the same results as int *ptr = (int*)(void*)arr;, but per your code posted, you're actually asking why int *ptr = (int*)(void*)&arr gives the same as int *ptr = (int*)&arr.
Either way I'll expand on what your code is actually doing to help clarify:
Per 6.3.2.1p3:
Except when it is the operand of the sizeof operator, the _Alignof operator, or the unary & operator, or is a string literal used to initialize an array, an expression that has type ‘‘array of type’’ is converted to an expression with type ‘‘pointer to type’’ that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.
and per 6.5.3.2p3:
The unary & operator yields the address of its operand. If the operand has type ‘‘type’’, the result has type ‘‘pointer to type’’.
So in your first declaration
int arr[2] = {0, 0};
arr is defined as an array of 2 elements of type int, both initialized to 0. Then, per 6.3.2.1p3, it "decays" into a pointer to its first element wherever it is used in an expression (except when it is the operand of sizeof or unary &, as in sizeof(arr) or &arr).
So in your next line, you could simply just do the following:
int *ptr = arr; or int *ptr = &*arr; or int *ptr = &arr[0];
and ptr is now a pointer to an int type that points to the first element of the array arr (i.e. &arr[0]).
Instead you declare it as such:
int *ptr = (int*)&arr;
Let's break this down into its parts:
&arr -> triggers the exception to 6.3.2.1p3 so instead of getting &arr[0], you get the address to arr which is an int(*)[2] type (not an int* type), so you are not getting a pointer to an int, you are getting a pointer to an int array
(int*)&arr, (i.e. the cast to int*) -> per 6.5.3.2p3, &arr takes the address of the variable arr returning a pointer to the type of it, so simply saying int* ptr = &arr will give a warning of "incompatible pointer types" (since ptr is of type int* and &arr is of type int(*)[2]) which is why you need to cast to an int*.
Further per 6.3.2.3p1: "a pointer to void may be converted to or from a pointer to any object type. A pointer to any object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer".
So, your declaration of int* ptr = (int*)(void*)&arr; would produce the same results as int* ptr = (int*)&arr; because of the types you are using and converting to/from. Also as a note: ptr[0] = 5; is the same as *ptr = 5;, and ptr[1] = 5; is the same as *(ptr + 1) = 5;. (*++ptr = 5; stores to the same element, but it also changes ptr.)
Some of the references:
6.3.2.1 Lvalues, arrays, and function designators
1. An lvalue is an expression (with an object type other than void) that potentially designates an object (*see note); if an lvalue does not designate an object when it is evaluated, the behavior is undefined. When an object is said to have a particular type, the type is specified by the lvalue used to designate the object. A modifiable lvalue is an lvalue that does not have array type, does not have an incomplete type, does not have a const-qualified type, and if it is a structure or union, does not have any member (including, recursively, any member or element of all contained aggregates or unions) with a const-qualified type.
*The name ‘‘lvalue’’ comes originally from the assignment expression E1 = E2, in which the left operand E1 is required to be a (modifiable) lvalue. It is perhaps better considered as representing an object ‘‘locator value’’. What is sometimes called ‘‘rvalue’’ is in this International Standard described as the ‘‘value of an expression’’. An obvious example of an lvalue is an identifier of an object. As a further example, if E is a unary expression that is a pointer to an object, *E is an lvalue that designates the object to which E points.
2. Except when it is the operand of the sizeof operator, the _Alignof operator, the unary & operator, the ++ operator, the -- operator, or the left operand of the . operator or an assignment operator, an lvalue that does not have array type is converted to the value stored in the designated object (and is no longer an lvalue); this is called lvalue conversion. If the lvalue has qualified type, the value has the unqualified version of the type of the lvalue; additionally, if the lvalue has atomic type, the value has the non-atomic version of the type of the lvalue; otherwise, the value has the type of the lvalue. If the lvalue has an incomplete type and does not have array type, the behavior is undefined. If the lvalue designates an object of automatic storage duration that could have been declared with the register storage class (never had its address taken), and that object is uninitialized (not declared with an initializer and no assignment to it has been performed prior to use), the behavior is undefined.
3. Except when it is the operand of the sizeof operator, the _Alignof operator, or the unary & operator, or is a string literal used to initialize an array, an expression that has type ‘‘array of type’’ is converted to an expression with type ‘‘pointer to type’’ that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.
6.3.2.3 Pointers
1. A pointer to void may be converted to or from a pointer to any object type. A pointer to any object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer.
5. An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation (the mapping functions for converting a pointer to an integer or an integer to a pointer are intended to be consistent with the addressing structure of the execution environment).
6. Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined. The result need not be in the range of values of any integer type.
6.5.3.2 Address and indirection operators
1. The operand of the unary & operator shall be either a function designator, the result of a [] or unary * operator, or an lvalue that designates an object that is not a bit-field and is not declared with the register storage-class specifier.
3. The unary & operator yields the address of its operand. If the operand has type ‘‘type’’, the result has type ‘‘pointer to type’’. If the operand is the result of a unary * operator, neither that operator nor the & operator is evaluated and the result is as if both were omitted, except that the constraints on the operators still apply and the result is not an lvalue. Similarly, if the operand is the result of a [] operator, neither the & operator nor the unary * that is implied by the [] is evaluated and the result is as if the & operator were removed and the [] operator were changed to a + operator. Otherwise, the result is a pointer to the object or function designated by its operand.
4. The unary * operator denotes indirection. If the operand points to a function, the result is a function designator; if it points to an object, the result is an lvalue designating the object. If the operand has type ‘‘pointer to type’’, the result has type ‘‘type’’. If an invalid value has been assigned to the pointer, the behavior of the unary * operator is undefined (*see note).
*Thus, &*E is equivalent to E (even if E is a null pointer), and &(E1[E2]) to ((E1)+(E2)). It is always true that if E is a function designator or an lvalue that is a valid operand of the unary & operator, *&E is a function designator or an lvalue equal to E. If *P is an lvalue and T is the name of an object pointer type, *(T)P is an lvalue that has a type compatible with that to which T points. Among the invalid values for dereferencing a pointer by the unary * operator are a null pointer, an address inappropriately aligned for the type of object pointed to, and the address of an object after the end of its lifetime.
6.5.4 Cast operators
5. Preceding an expression by a parenthesized type name converts the value of the expression to the named type. This construction is called a cast (a cast does not yield an lvalue; thus, a cast to a qualified type has the same effect as a cast to the unqualified version of the type). A cast that specifies no conversion has no effect on the type or value of an expression.
6. If the value of the expression is represented with greater range or precision than required by the type named by the cast (6.3.1.8), then the cast specifies a conversion even if the type of the expression is the same as the named type and removes any extra range and precision.
6.5.16.1 Simple assignment
2. In simple assignment (=), the value of the right operand is converted to the type of the assignment expression and replaces the value stored in the object designated by the left operand.
6.7.6.2 Array declarators
1. In addition to optional type qualifiers and the keyword static, the [ and ] may delimit an expression or *. If they delimit an expression (which specifies the size of an array), the expression shall have an integer type. If the expression is a constant expression, it shall have a value greater than zero. The element type shall not be an incomplete or function type. The optional type qualifiers and the keyword static shall appear only in a declaration of a function parameter with an array type, and then only in the outermost array type derivation.
3. If, in the declaration ‘‘T D1’’, D1 has one of the forms:
D[ type-qualifier-list_opt assignment-expression_opt ]
D[ static type-qualifier-list_opt assignment-expression ]
D[ type-qualifier-list static assignment-expression ]
D[ type-qualifier-list_opt * ]
and the type specified for ident in the declaration ‘‘T D’’ is ‘‘derived-declarator-type-list T’’, then the type specified for ident is ‘‘derived-declarator-type-list array of T’’.142) (See 6.7.6.3 for the meaning of the optional type qualifiers and the keyword static.)
4. If the size is not present, the array type is an incomplete type. If the size is * instead of being an expression, the array type is a variable length array type of unspecified size, which can only be used in declarations or type names with function prototype scope;143) such arrays are nonetheless complete types. If the size is an integer constant expression and the element type has a known constant size, the array type is not a variable length array type; otherwise, the array type is a variable length array type. (Variable length arrays are a conditional feature that implementations need not support; see 6.10.8.3.)
5. If the size is an expression that is not an integer constant expression: if it occurs in a declaration at function prototype scope, it is treated as if it were replaced by *; otherwise, each time it is evaluated it shall have a value greater than zero. The size of each instance of a variable length array type does not change during its lifetime. Where a size expression is part of the operand of a sizeof operator and changing the value of the size expression would not affect the result of the operator, it is unspecified whether or not the size expression is evaluated.
6. For two array types to be compatible, both shall have compatible element types, and if both size specifiers are present, and are integer constant expressions, then both size specifiers shall have the same constant value. If the two array types are used in a context which requires them to be compatible, it is undefined behavior if the two size specifiers evaluate to unequal values.
P.S. As a side note, given the following code:
#include <stdio.h>
int main(int argc, char** argv)
{
    int arr[2] = {10, 20};
    X
    Y
    printf("%d,%d\n", arr[0], arr[1]);
    return 0;
}
where X was one of the following:
int *ptr = (int*)(void*)&arr;
int *ptr = (int*)&arr;
int *ptr = &arr[0];
and Y was one of the following:
ptr[0] = 15;
*ptr = 15;
When compiled on OpenBSD with gcc version 4.2.1 20070719 and providing the -S flag, the assembler output for all files was exactly the same.

What does the address of a, which is an array, return?

I thought when you try to get the address of an array, it returns the address of the first element it holds.
int *j;
int a[5]={1,5,4,7,8};
Now j=&a[0]; works perfectly fine.
Even j=a; does the same thing.
But when I do j=&a; it throws an error saying: cannot convert 'int (*)[5]' to 'int*' in assignment
Why does it happen? &a should be the first element of the array a, so it should give &a[0].
But instead it throws an error. Can somebody explain why?
The C standard says the following regarding how arrays are used in expressions (taken from C99 6.3.2.1/3 "Lvalues, arrays, and function designators"):
Except when it is the operand of the sizeof operator or the unary &
operator, or is a string literal used to initialize an array, an
expression that has type ‘‘array of type’’ is converted to an
expression with type ‘‘pointer to type’’ that points to the initial
element of the array object
This is commonly known as "arrays decay to pointers".
So the sub-expression a in the following larger expressions evaluates to a pointer to int:
j=&a[0]
j=a
In the simpler expression, j=a, that pointer is simply assigned to j.
In the more complex expression, j=&a[0], the 'index' operator [] is applied to the pointer (which is an operation equivalent to *(a + 0)) and the 'address-of' operator is applied to that, resulting in another pointer to int that gets assigned to j.
In the expression j=&a, the address-of operator is applied directly to the array name, and we hit one of the exceptions in the above quoted clause: "Except when it is the operand of ... the unary & operator".
Now when we look at what the standard says about the unary & (address-of) operator (C99 6.5.3.2/3 "Address and indirection operators"):
The unary & operator yields the address of its operand. If the
operand has type "type", the result has type "pointer to type".
Since a has type "array of 5 int" (int [5]), the result of applying & to it directly has type "pointer to array of 5 int" (int (*)[5]), which is not assignable to int*.
The types of a and &a are not the same even though both expressions evaluate to the same address, i.e., the base address of the array a.
j = a;
The array name a here gets converted to a pointer to its first element.
Try to see what values you get via these statements to understand where the difference lies:
printf("%p\n", (void *)(a+1));
printf("%p\n", (void *)(&a+1));
C is a strongly typed language. An assignment such as j=a; is allowed only if j and a have the same type or the compiler can implicitly convert a to the type of j. In your case, the type of j is int * while the type of &a is int (*)[5]. There is no implicit conversion from int (*)[5] to int *, and the compiler is telling you exactly that.
a is an array of 5 ints. A pointer to a is a pointer to an array of five integers, or int (*)[5]. This is not compatible with int * because of pointer arithmetic: if you increment a variable of type int *, the address in the variable increases by 4 (assuming 4-byte integers), so that it points to the next integer. If you increment a variable that points to an array of 5 integers, the address in the variable increases by 20 (again assuming 4-byte integers), so that it points to the next array of five integers.
Perhaps what's confusing is that the value given by a and &a is the same, as you said. The value is the same but the type is different, and the difference is most obvious when you do arithmetic on the pointers.
I hope that helps.

Resources