I have the following code:
#include <stdio.h>
int main(void) {
int array[0];
printf("%d", array);
return 0;
}
As we know, an array always points to its first item. There are no items in this example, yet the code still prints some memory address. What does it point to?
An array of size 0 is considered a constraint violation. So having such an array and attempting to use it triggers undefined behavior.
Section 6.7.6.2p1 of the C standard regarding constraints on Array Declarators states:
In addition to optional type qualifiers and the keyword static, the [ and ] may delimit an expression or *. If they delimit an expression (which specifies the size of an array), the expression shall have an integer type. If the expression is a constant expression, it shall have a value greater than zero. The element type shall not be an incomplete or function type. The optional type qualifiers and the keyword static shall appear only in a declaration of a function parameter with an array type, and then only in the outermost array type derivation.
GCC accepts a zero-length array as an extension, which is why the code above compiles there. The extension is mainly useful when the array is the last member of a struct, where it serves as an older, alternate way of specifying a flexible array member. Standard C allows a flexible array member if the array size is omitted.
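As an illustration of the standard form, here is a minimal sketch of a struct with a flexible array member. The names int_buffer, int_buffer_new, and int_buffer_get are hypothetical and only serve the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record type: a length followed by a flexible array member. */
struct int_buffer {
    size_t len;
    int data[];   /* size omitted; a flexible array member must come last */
};

/* Allocate a buffer with room for n ints and fill it with 0..n-1.
   Returns NULL on allocation failure; the caller frees the result. */
struct int_buffer *int_buffer_new(size_t n)
{
    struct int_buffer *b = malloc(sizeof *b + n * sizeof b->data[0]);
    if (b == NULL)
        return NULL;
    b->len = n;
    for (size_t i = 0; i < n; i++)
        b->data[i] = (int)i;
    return b;
}

/* Bounds-checked read of element i. */
int int_buffer_get(const struct int_buffer *b, size_t i)
{
    assert(b != NULL && i < b->len);
    return b->data[i];
}
```

The storage for data comes from the extra bytes requested from malloc, so the struct itself never needs a zero-length array.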
Option 1: use #define
#define kSize 5
int arr1[kSize] = {1,2,3,4,5};
--> OK.
Option 2: use an enum
enum { eSize = 5 };
int arr2[eSize] = {1,2,3,4,5};
--> OK.
But, a const int cannot be used:
const int cSize=5;
int arr3[cSize] = {1,2,3,4,5};
--> FAIL.
why?
A variable with the const qualifier does not qualify as an integer constant expression. This makes the array a variable length array (VLA) which cannot be initialized.
Section 6.7.6.2p4 of the C standard describing Array Declarators states:
If the size is not present, the array type is an incomplete type. If
the size is * instead of being an expression, the array type
is a variable length array type of unspecified size, which can
only be used in declarations or type names with function prototype
scope; such arrays are nonetheless complete types. If the size is an
integer constant expression and the element type has a known
constant size, the array type is not a variable length array
type; otherwise, the array type is a variable length array
type. (Variable length arrays are a conditional feature that
implementations need not support; see 6.10.8.3.)
So for an array to not be a variable length array its size must be an integer constant expression. This is defined in section 6.6p6:
An integer constant expression shall have integer type and shall
only have operands that are integer constants, enumeration
constants, character constants, sizeof expressions whose
results are integer constants, _Alignof expressions, and
floating constants that are the immediate operands of casts. Cast
operators in an integer constant expression shall only convert
arithmetic types to integer types, except as part of an
operand to the sizeof or _Alignof operator.
A #define definition is replaced by the preprocessor before the compilation phase, so in your first case kSize is exactly the same as the constant 5. The above passage also states that an enumeration constant qualifies as an integer constant expression, so this makes your second case OK. The third case uses a const-qualified variable, which is not included above in the definition of an integer constant expression, so this makes it a variable length array.
Section 6.7.9p3 then dictates what can be initialized:
The type of the entity to be initialized shall be an array of unknown size or a complete object type that is not a variable length array type.
And as stated above a VLA cannot be initialized.
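To make the difference concrete, here is a small sketch that assumes a compiler with VLA support. The function names sum_fixed and sum_vla are hypothetical, and cSize is taken as a parameter only so that the example is easy to call.

```c
#include <assert.h>

enum { eSize = 5 };   /* enumeration constant: an integer constant expression */

/* The size is a constant expression, so the array may have an initializer. */
int sum_fixed(void)
{
    int arr2[eSize] = {1, 2, 3, 4, 5};
    int total = 0;
    for (int i = 0; i < eSize; i++)
        total += arr2[i];
    return total;
}

/* cSize is a const-qualified object, not a constant expression, so arr3 is
   a VLA: it takes no initializer list and is filled by assignment instead. */
int sum_vla(const int cSize)
{
    assert(cSize > 0);   /* a VLA size must be greater than zero */
    int arr3[cSize];
    int total = 0;
    for (int i = 0; i < cSize; i++) {
        arr3[i] = i + 1;
        total += arr3[i];
    }
    return total;
}
```

If the array really needs an initializer, sticking to the #define or enum form is the simpler choice.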
In some programs involving 2d array, written in C, I noted that row size is not mentioned and the compiler is also not throwing any error regarding this. But when I tried this by mentioning the row size but not the column size, the compiler throws an error.
Eg:
int arr[][5]; // correct
int arr[5][]; //compiler throws error
What's the reason?
We can define a 2-D array in C as:
A [][n];
where n is some constant
We must include the number of columns in the array because this specifies the size of each row. The two dimensional array can be viewed as an array of rows. Once the compiler knows the size of a row in the array (which is defined by the value in the second square bracket, n here), it is able to correctly determine the beginning of each row.
In other words, it is needed to compute the relative offset of the item you're actually accessing.
We have offset = (row*colwidth + col)
The offsets are computed by the compiler using the size of the row, which happens to be the number/count of the columns.
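The offset formula above can be checked against a real array. This is only a sketch: ROWS, COLS, and both function names are made up for the example, and the offset is measured in elements rather than bytes.

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 3, COLS = 5 };   /* hypothetical dimensions */

/* offset = row * colwidth + col, counted in elements. */
size_t computed_index(size_t row, size_t col)
{
    return row * COLS + col;
}

/* The same index, measured from the actual layout of an int[ROWS][COLS]. */
size_t measured_index(size_t row, size_t col)
{
    static int a[ROWS][COLS];
    assert(row < ROWS && col < COLS);
    const char *base = (const char *)a;
    const char *elem = (const char *)&a[row][col];
    return (size_t)(elem - base) / sizeof(int);
}
```

Both functions agree for every valid (row, col), and neither could be written without knowing COLS, which is why the column count cannot be omitted.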
6.7.6.2 Array declarators
Constraints
1 In addition to optional type qualifiers and the keyword static, the [ and ] may delimit
an expression or *. If they delimit an expression (which specifies the size of an array), the
expression shall have an integer type. If the expression is a constant expression, it shall
have a value greater than zero. The element type shall not be an incomplete or function
type. The optional type qualifiers and the keyword static shall appear only in a
declaration of a function parameter with an array type, and then only in the outermost
array type derivation.
...
Semantics
...
4 If the size is not present, the array type is an incomplete type...
C 2011 Online Draft
Emphasis added. Given an array declaration
T a[];
the type of a is incomplete - it's "unknown size array of T". However, per the constraint above, T itself must be a complete type. If T is an array type, its size must be known, a la R [N]:
R a[][N]; // a is an unknown-size array of N-element arrays of R
This is why the compiler accepts
int arr[][5];
since, while we don't yet know how many elements will be in arr, we know how big each of those elements will be (5 * sizeof (int)). Note that arr must be given a size before it can actually be used. The converse,
int arr[5][];
says that arr is a 5-element array of unknown-size arrays of int. We know how many elements we need, but we don't know how big those elements are going to be.
Now, why does C make this restriction? I can't provide an authoritative answer for that, but I suspect it has to do with the relationship between array and pointer operations in C. Remember that the expression a[i] is defined as *(a + i) - that is, take the address a and offset i elements (not bytes!!) from that address and dereference the result. That only works if the size of the element type is known.
It should be possible to model an array of N elements of unknown size, but I suspect that such a model is cumbersome enough that it's more trouble to implement than it's worth.
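Here is a short sketch of how int arr[][5] gets its missing size from an initializer. The initializer values and the helper names row_count, row_bytes, and at are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Row count omitted: the initializer supplies it. The column count is required. */
static const int arr[][5] = {
    {1, 2, 3, 4, 5},
    {6, 7, 8, 9, 10},
    {11, 12},            /* remaining elements of this row are zero */
};

/* Number of rows the compiler deduced from the initializer. */
size_t row_count(void) { return sizeof arr / sizeof arr[0]; }

/* Size of one element of arr, i.e. one row of 5 ints. */
size_t row_bytes(void) { return sizeof arr[0]; }

/* Bounds-checked element access. */
int at(size_t r, size_t c)
{
    assert(r < row_count() && c < 5);
    return arr[r][c];
}
```

The element type int [5] is complete from the start; only the number of such elements is filled in later.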
How does char s[] act as a pointer while it looks like an array declaration?
#include<stdio.h>
void test(char s[]);
int main()
{
char s[10]="test";
char a[]=s;
test(s);
char p[]=s;
return 0;
}
void test(char s[])
{
printf(s);
}
In the context of a function parameter declaration (and only in that context), T a[N] and T a[] are the same as T *a; they declare a as a pointer to T, not an array of T.
Chapter and verse:
6.7.6.3 Function declarators (including prototypes)
...
7 A declaration of a parameter as ‘‘array of type’’ shall be adjusted to ‘‘qualified pointer to
type’’, where the type qualifiers (if any) are those specified within the [ and ] of the
array type derivation. If the keyword static also appears within the [ and ] of the
array type derivation, then for each call to the function, the value of the corresponding
actual argument shall provide access to the first element of an array with at least as many
elements as specified by the size expression.
Anywhere else, T a[]; declares a as an array with an as-yet-unspecified size. At this point the declaration is incomplete, and a cannot be used anywhere until a size has been specified, either by specifying it explicitly:
T a[N];
or using an initializer:
T a[] = { /* comma-separated list of initial values */ }
Chapter and verse, again:
6.7.6.2 Array declarators
...
3 If, in the declaration ‘‘T D1’’, D1 has one of the forms:
D[ type-qualifier-list_opt assignment-expression_opt ]
D[ static type-qualifier-list_opt assignment-expression ]
D[ type-qualifier-list static assignment-expression ]
D[ type-qualifier-list_opt * ]
and the type specified for ident in the declaration ‘‘T D’’ is ‘‘derived-declarator-type-list
T’’, then the type specified for ident is ‘‘derived-declarator-type-list array of T’’.142)
(See 6.7.6.3 for the meaning of the optional type qualifiers and the keyword static.)
4 If the size is not present, the array type is an incomplete type. If the size is * instead of
being an expression, the array type is a variable length array type of unspecified size,
which can only be used in declarations or type names with function prototype scope;143)
such arrays are nonetheless complete types. If the size is an integer constant expression
and the element type has a known constant size, the array type is not a variable length
array type; otherwise, the array type is a variable length array type. (Variable length
arrays are a conditional feature that implementations need not support; see 6.10.8.3.)
142) When several ‘‘array of’’ specifications are adjacent, a multidimensional array is declared.
143) Thus, * can be used only in function declarations that are not definitions (see 6.7.6.3).
...
6.7.9 Initialization
...
22 If an array of unknown size is initialized, its size is determined by the largest indexed
element with an explicit initializer. The array type is completed at the end of its
initializer list.
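A quick sketch of the "largest indexed element" rule from 6.7.9p22, using designated initializers. The array names and values are arbitrary.

```c
#include <assert.h>
#include <stddef.h>

static const int plain[]  = {10, 20, 30};         /* 3 elements */
static const int sparse[] = {[9] = 1, [2] = 7};   /* largest index is 9, so 10 elements */

size_t plain_count(void)  { return sizeof plain / sizeof plain[0]; }
size_t sparse_count(void) { return sizeof sparse / sizeof sparse[0]; }

/* Bounds-checked read; elements without an explicit initializer are zero. */
int sparse_at(size_t i)
{
    assert(i < sparse_count());
    return sparse[i];
}
```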
So, why are arrays as function parameters treated differently than arrays as regular objects? This is why:
6.3.2.1 Lvalues, arrays, and function designators
...
3 Except when it is the operand of the sizeof operator, the _Alignof operator, or the
unary & operator, or is a string literal used to initialize an array, an expression that has
type ‘‘array of type’’ is converted to an expression with type ‘‘pointer to type’’ that points
to the initial element of the array object and is not an lvalue. If the array object has
register storage class, the behavior is undefined.
Under most circumstances, an expression of type "N-element array of T" is converted ("decays") to an expression of type "pointer to T". If you pass an array expression as an argument to a function, like so:
int foo[10];
...
bar( foo );
what the function bar actually receives is a pointer to int, not a 10-element array of int, so the prototype for bar can be written
void bar( int *f );
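The effect is easy to observe with sizeof. In this sketch, bar_size is a hypothetical variant of bar that reports what it sees, and the other two helpers exist only for the comparison.

```c
#include <assert.h>
#include <stddef.h>

/* The callee receives a pointer, so sizeof f is the size of an int *. */
size_t bar_size(int *f)
{
    assert(f != NULL);
    return sizeof f;
}

/* In the caller, foo is the operand of sizeof, so it does not decay. */
size_t caller_sees(void)
{
    int foo[10] = {0};
    return sizeof foo;
}

/* The decayed array expression and &foo[0] are the same pointer value. */
int same_pointer(void)
{
    int foo[10] = {0};
    int *decayed = foo;
    return decayed == &foo[0];
}
```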
"But why..." I hear you starting to ask. I'm getting to it, really.
C was derived from an earlier language called B (go figure), and in B the following things were true:
Pointer objects were declared using empty bracket notation - auto p[];. This syntax was kept for pointer function parameter declarations in C.
Array declarations set aside memory not only for the array elements, but also for an explicit pointer to the first element, which was bound to the array variable (i.e., auto v[10] would set aside 11 memory cells, 10 for the array contents, and the remaining one to store an offset to the first element).
The array subscript operation a[i] was defined as *(a + i) -- you'd offset i elements from the base address stored in the variable a and dereference the result (B was a "typeless" language, so all scalar objects were the same size).
For various reasons, Ritchie got rid of the explicit pointer to the first element of the array, but kept the definition of the subscript operation. So in C, when the compiler sees an array expression that isn't the operand of the sizeof or unary & operators, it replaces that expression with a pointer expression that evaluates to the address of the first element of the array; that way the *(a + i) operation still works the way it did in B, without actually setting aside any storage for that pointer value. However, it means arrays lose their "array-ness" in most circumstances, which will bite you in the ass if you aren't careful.
You can't assign to arrays, only copy to them or initialize them with a valid initializer (and another array is not a valid initializer).
And when you declare a function like
void test(char s[]);
it's actually the same as declaring it
void test(char *s);
What happens when you call a function taking an "array" is that the array decays to a pointer to the first element, and it's that pointer that is passed to the function.
So the call
test(s);
is the same as
test(&s[0]);
Regarding the function declaration, declaring a function taking an array of arrays is not the same as declaring a function taking a pointer to a pointer. See e.g. this old answer of mine as an explanation of why.
So if you want a function taking an array of arrays, like
void func2(char a[][X]);
it's not the same as
void func2(char **a);
Instead it's the same as
void func2(char (*a)[X]);
For more "dimensions" it doesn't change anything, e.g.
void func3(char a[][X][Y]);
is the same as
void func3(char (*a)[X][Y]);
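One way to convince yourself that the two spellings are the same type is to declare a function both ways in one file. This is a sketch: X is given an arbitrary value and count_char is a made-up function.

```c
#include <assert.h>
#include <stddef.h>

enum { X = 4 };   /* hypothetical inner dimension */

/* Both declarations name the same function type, so they can coexist. */
int count_char(char a[][X], size_t rows, char c);
int count_char(char (*a)[X], size_t rows, char c);

/* Count occurrences of c in a rows-by-X array of char. */
int count_char(char (*a)[X], size_t rows, char c)
{
    assert(a != NULL);
    int n = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t i = 0; i < X; i++)
            if (a[r][i] == c)
                n++;
    return n;
}
```

Adding a third declaration with char **a would be rejected as a conflicting type, which is the point made above.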
char a[] is an array and not a pointer, so the initialization is invalid.
s is an array, too, but in the context of an expression it evaluates to a pointer to its first element.
char a[] is only valid if you have a following initializer list. It has to be an array of characters. As a special case for strings, C allows you to type an array of characters as "str", rather than {'s','t','r','\0'}. Either initializer is fine.
In the code char a[]=s;, s is an array type and not a valid initializer, so the code will not compile.
void test(char s[]) is another special case, because arrays passed as parameters always get replaced by the compiler with a pointer to the first element. Don't confuse this with array initialization, even though the syntax is similar.
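If the goal of char a[]=s; was to get a copy of s, the usual approach is to declare the destination with an explicit size and copy into it. A minimal sketch, where copy_string is a hypothetical helper rather than a standard function:

```c
#include <assert.h>
#include <string.h>

/* Copy the NUL-terminated string src into dst, which has room for
   dstsize bytes. Returns 0 on success, -1 if src does not fit. */
int copy_string(char *dst, size_t dstsize, const char *src)
{
    assert(dst != NULL && src != NULL);
    size_t need = strlen(src) + 1;   /* include the terminating '\0' */
    if (need > dstsize)
        return -1;
    memcpy(dst, src, need);
    return 0;
}
```

With char s[10]="test"; and char a[sizeof s];, the call copy_string(a, sizeof a, s) leaves a holding its own copy of the string.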
A very simple program in C:
#include <stdio.h>
#include <stdlib.h>
void process(int array[static 5]){
int i;
for(i=0; i<5; i++)
printf("%d ", array[i]);
printf("\n");
}
int main(){
process((int[]){1,2,3});
process(NULL);
return 0;
}
I compile it: gcc -std=c99 -Wall -o demo demo.c
It does compile and when I run it, it crashes (quite predictable).
Why? What is the purpose of the static keyword in an array parameter (and what is this construct called, by the way)?
The static there is an indication (a hint — but not more than a hint) to the optimizer that it may assume there is a minimum of the appropriate number (in the example, 5) elements in the array (and therefore that the array pointer is not null too). It is also a directive to the programmer using the function that they must pass a big enough array to the function to avoid undefined behaviour.
ISO/IEC 9899:2011
§6.7.6.2 Array declarators
Constraints
¶1 In addition to optional type qualifiers and the keyword static, the [ and ] may delimit
an expression or *. If they delimit an expression (which specifies the size of an array), the
expression shall have an integer type. If the expression is a constant expression, it shall
have a value greater than zero. The element type shall not be an incomplete or function
type. The optional type qualifiers and the keyword static shall appear only in a
declaration of a function parameter with an array type, and then only in the outermost
array type derivation.
§6.7.6.3 Function declarators (including prototypes)
¶7 A declaration of a parameter as "array of type" shall be adjusted to "qualified pointer to
type", where the type qualifiers (if any) are those specified within the [ and ] of the
array type derivation. If the keyword static also appears within the [ and ] of the
array type derivation, then for each call to the function, the value of the corresponding
actual argument shall provide access to the first element of an array with at least as many
elements as specified by the size expression.
Your code crashes because you pass a null pointer to a function that is guaranteed a pointer to the start of an array of at least 5 elements. You are invoking undefined behaviour, and a crash is an eminently sensible way of dealing with your mistake.
It is more subtle when you pass an array of 3 integers to a function that's guaranteed an array of 5 integers; again, you invoke undefined behaviour and the results are unpredictable. A crash is relatively unlikely; spurious results are very probable.
In effect, the static in this context has two separate jobs — it defines two separate contracts:
It tells the user of the function that they must provide an array of at least 5 elements (and if they do not, they will invoke undefined behaviour).
It tells the optimizer that it may assume a non-null pointer to an array of at least 5 elements and it may optimize accordingly.
If the user of the function violates the requirements of the function, all hell may break loose ('nasal demons' etc; generally, undefined behaviour).
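For comparison, here is a sketch of calls that honour the contract. The function names are hypothetical; the parameter declaration is the same [static 5] form as in the question.

```c
#include <assert.h>

/* [static 5]: callers promise a non-null pointer to at least 5 ints. */
int sum_first_five(const int array[static 5])
{
    int total = 0;
    for (int i = 0; i < 5; i++)
        total += array[i];
    return total;
}

/* Passing a larger array is fine: "at least 5" is satisfied by 7. */
int sum_of_larger_array(void)
{
    int seven[7] = {1, 2, 3, 4, 5, 6, 7};
    assert(sizeof seven / sizeof seven[0] >= 5);   /* caller-side check of the contract */
    return sum_first_five(seven);
}
```

Whether a compiler diagnoses a too-small or null argument is up to the implementation, so the caller should not rely on a warning.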
Your code is correct (and actually recommended); see C99 (N1124/N1256), clause 6.7.5.3p7 (Jonathan's answer quotes the full text):
If the keyword static also appears within the [ and ] of the array
type derivation, then for each call to the function, the value of the
corresponding actual argument shall provide access to the first
element of an array with at least as many elements as specified by the
size expression.
The error is in how you call the function: you pass an array that holds only 3 elements to a function that requires at least five (via the [static 5]), which invokes undefined behaviour (a crash, in your case).
An array type decays to a pointer type when it is passed to a function
That means in
int func(int x[*p])
*p should not be evaluated as the declaration is equivalent to int func(int *x)
Does the same hold for pointer to arrays?
Here is the code
int *p=0;
void func(int (*ptr)[*p]) //A
{
// code
}
int main()
{
int arr[5][5];
func(arr);
}
Is evaluation of *p at //A guaranteed by the Standard?
I tried with and without optimization on g++4.6. With optimizations enabled I don't get segfault. On clang the code is not giving any segfault even without any optimizations.
From the C99 Standard Section 6.7.5.2 Array declarators, paragraph 1:
In addition to optional type qualifiers and the keyword static, the [
and ] may delimit an expression or *. If they delimit an expression
(which specifies the size of an array), the expression shall have an
integer type. If the expression is a constant expression, it shall
have a value greater than zero. The element type shall not be an
incomplete or function type. The optional type qualifiers and the
keyword static shall appear only in a declaration of a function
parameter with an array type, and then only in the outermost array
type derivation.
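Since p is a null pointer, evaluating *p dereferences a null pointer, which is undefined behavior in itself. The "greater than zero" constraint quoted above applies only to constant expressions; for a size that is not a constant expression, paragraph 5 of the same section requires that it have a value greater than zero each time it is evaluated. Either way, the behavior is undefined if *p is evaluated.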
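For contrast, here is a sketch of a pointer-to-array parameter whose size expression is valid. It assumes a compiler that supports variably modified types, and row_length is a made-up name.

```c
#include <assert.h>
#include <stddef.h>

/* ptr points to an array of n ints. The size expression n belongs to the
   pointed-to array type, so it is not discarded the way the outermost
   array size of a parameter is. */
size_t row_length(int n, int (*ptr)[n])
{
    assert(n > 0 && ptr != NULL);
    return sizeof *ptr / sizeof (*ptr)[0];
}
```

Called as row_length(5, arr) with int arr[5][5], the function sees rows of 5 ints because n is positive and matches the actual row size.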