Can sizeof safely be used on an array that has been declared without an explicit size specified inside the square brackets, but which gets initialised in the declaration?
Consider the following code:
unsigned int arr[] = { 1, 2, 3 };
size_t bytes = sizeof arr;
If compiled on macOS with clang-800.0.42.1 without any special compiler flags, this yields the expected result of 12.
But does the C standard (or any C standard, if they differ on this) guarantee this to be the case? Or do I have to declare it like unsigned int arr[3] in order for it to be "sane"?
Yes, the standard guarantees that the array element count will be equal to the number of elements in the array initializer in case no size is specified. See
C11 standard draft 6.7.9p22 and 6.7.9p25:
If an array of unknown size is initialized, its size is determined by
the largest indexed element with an explicit initializer. The array
type is completed at the end of its initializer list.
EXAMPLE 2 The declaration
int x[] = { 1, 3, 5 };
defines and initializes x as a one-dimensional array object that has three elements, as no size was specified and there are three initializers.
unsigned int arr[] = { 1, 2, 3 }; actually defines a complete array. The size of the array is known in this compilation unit and is n*sizeof(type) where n is the number of elements in the initialization list (here 3) and type is the underlying object type (here unsigned int).
That means that sizeof(arr) is well defined in the same scope as arr and has the expected value.
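As a small sketch of the rule quoted above (the names sparse, COUNT_OF, arr_count and sparse_count are made up for illustration, and static_assert assumes a C11 or later compiler), the inferred size can be checked at compile time, including the "largest indexed element" case with a designator:

```c
#include <assert.h>
#include <stddef.h>

/* Size inferred from three initializers: the type is unsigned int[3]. */
unsigned int arr[] = { 1, 2, 3 };

/* A designator sets the largest index, so this is int[6] (indices 0..5). */
int sparse[] = { [5] = 42 };

/* Element-count idiom; only valid where the array type is complete. */
#define COUNT_OF(a) (sizeof (a) / sizeof (a)[0])

static_assert(sizeof arr == 3 * sizeof(unsigned int), "three elements");
static_assert(COUNT_OF(sparse) == 6, "six elements");

size_t arr_count(void)    { return COUNT_OF(arr); }
size_t sparse_count(void) { return COUNT_OF(sparse); }
```

Both checks pass at compile time, so nothing here depends on a particular compiler's behaviour at run time.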
What would be completely different would be extern int arr[];. That would be a simple declaration that an array of that name will be provided in another compilation unit, but the compiler has no way to know its size. In that case using sizeof(arr) will be an error.
Another example of mere declaration is
void func(int arr[]) {
...
}
Here again the compiler only knows that the function will receive an int array, but again cannot know its size. But here the compiler generates a pointer that will receive the address of the array and sizeof(arr) is defined but is the size of that pointer and not the size of the original array.
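Because the size is lost at the call boundary, the usual workaround is to pass the element count next to the pointer. A minimal sketch, with sum_ints and sum_demo as hypothetical names:

```c
#include <assert.h>
#include <stddef.h>

/* `const int arr[]` is adjusted to `const int *arr`, so sizeof arr here
   would be the size of a pointer; the length has to be passed in. */
long sum_ints(const int arr[], size_t count)
{
    assert(arr != NULL);
    long total = 0;
    for (size_t i = 0; i < count; i++)
        total += arr[i];
    return total;
}

long sum_demo(void)
{
    int values[] = { 10, 20, 30, 40 };
    /* sizeof works here because `values` has a complete array type. */
    return sum_ints(values, sizeof values / sizeof values[0]);
}
```

The count is computed in the caller, where the array type is still known, and never inside the function that only sees a pointer.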
I am a bit confused about array declaration in C. I know that it's possible to do this:
int a[20]; // Reserved space for 20 int array
int b[] = {32, 431, 10, 42}; // Length in square brackets is auto-calculated
int *c = calloc(15, sizeof(int)); // Created a pointer to the dynamic int array
But is it possible to do this?:
int my_array[sizeof(int) * 5];
Is this valid code, or must an array length be a constant expression (in ANSI C)?
This declaration
int my_array[sizeof(int) * 5];
does not declare a variable length array because the expression sizeof(int) * 5 is an integer constant expression. So even if your compiler does not support variable length arrays, you may use such a declaration.
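One way to convince yourself of this is to put the declaration at file scope, where a variable length array is not permitted at all. This is only a sketch; my_array_count is a hypothetical helper and static_assert assumes C11 or later:

```c
#include <assert.h>
#include <stddef.h>

/* File-scope arrays cannot be VLAs, so this compiles only because
   sizeof(int) * 5 is an integer constant expression. */
int my_array[sizeof(int) * 5];

static_assert(sizeof my_array == sizeof(int) * 5 * sizeof(int),
              "size is fixed at compile time");

size_t my_array_count(void)
{
    return sizeof my_array / sizeof my_array[0];
}
```

Note that the array has sizeof(int) * 5 elements (20 where int is 4 bytes), not 5; the expression gives the element count, not the byte count.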
From the C Standard (6.6 Constant expressions)
6 An integer constant expression shall have integer type and shall
only have operands that are integer constants, enumeration constants,
character constants, sizeof expressions whose results are integer
constants, and floating constants that are the immediate operands of
casts. Cast operators in an integer constant expression shall only
convert arithmetic types to integer types, except as part of an
operand to the sizeof operator.
and (6.7.6.2 Array declarators)
4 If the size is not present, the array type is an incomplete type. If
the size is * instead of being an expression, the array type is a
variable length array type of unspecified size, which can only be used
in declarations or type names with function prototype scope; such
arrays are nonetheless complete types. If the size is an integer
constant expression and the element type has a known constant size,
the array type is not a variable length array type; otherwise, the
array type is a variable length array type. (Variable length arrays
are a conditional feature that implementations need not support; see
6.10.8.3.)
A declaration of a variable length array can look like
const int n = 5;
int my_array[sizeof(int) * n];
The support of variable length arrays is optional in C11 and higher.
(This answer answers the question in the title, “Can array length in declaration be non-constant?” The example given in the body, int my_array[sizeof(int) * 5]; does not have a non-constant length.)
Variable length arrays are optional in the current C standard, 2018, meaning a C implementation may choose to support them or not. They were mandatory in the 1999 C standard and made optional in the 2011 standard.
Variable length arrays can be declared only inside functions or in their parameters, not at file scope, and they cannot have static or thread storage duration.
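Since support is optional from C11 onwards, portable code can test the standard macro __STDC_NO_VLA__ and fall back to malloc. The following is a sketch under that assumption; sum_first is a hypothetical function:

```c
#include <assert.h>
#include <stdlib.h>

/* Returns 0 + 1 + ... + (n - 1) using a scratch buffer of n ints. */
long sum_first(size_t n)
{
    assert(n > 0);
#ifndef __STDC_NO_VLA__
    int scratch[n];                     /* VLA: block scope, automatic storage */
    int *buf = scratch;
#else
    int *buf = malloc(n * sizeof *buf); /* fallback when VLAs are unavailable */
    if (buf == NULL)
        return -1;
#endif
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        buf[i] = (int)i;
        total += buf[i];
    }
#ifdef __STDC_NO_VLA__
    free(buf);
#endif
    return total;
}
```

The VLA branch needs no cleanup because the array disappears when the function returns; the malloc branch has to free the buffer explicitly.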
The expression sizeof(int) * 5 used in the example in your question, int my_array[sizeof(int) * 5];, is a constant expression. So although it does not illustrate your primary question well, it is a legal C array declaration.
VLA support was mandatory only in C99. Since C11 it is optional, so a given compiler may or may not provide it.
So, if your compiler supports VLAs, the following are examples:
char string[100] = {0};
scanf("%99s", string);
int VLAarray1[strlen(string)+1];//per question in comments about functions to size array.
memset(VLAarray1, 0, sizeof(VLAarray1));//see Note below for initialization
int arrayLen = 0;
scanf("%d", &arrayLen);
int VLAarray2[arrayLen];
memset(VLAarray2, 0, sizeof(VLAarray2));//see Note below for initialization
int nonVLAarray[100] = {0};//initialization during declaration of nonVLA
Note: a VLA cannot be given an initializer in its declaration (C23 relaxes this only for the empty initializer {}). As with all variables, though, it is a good idea to initialize it in subsequent statements by explicitly assigning values to its entire region of memory.
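Put together as a compilable function, the declare-then-memset pattern from the note might look like this (a sketch; zeroed_count is a made-up name and the function assumes the compiler supports VLAs):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Clears a VLA of n ints and counts how many elements are zero. */
size_t zeroed_count(size_t n)
{
    assert(n > 0);
    int vla[n];                   /* no initializer here */
    memset(vla, 0, sizeof vla);   /* sizeof vla is n * sizeof(int), found at run time */

    size_t zeros = 0;
    for (size_t i = 0; i < n; i++)
        if (vla[i] == 0)
            zeros++;
    return zeros;
}
```

Using sizeof vla rather than repeating n * sizeof(int) keeps the memset call correct if the element type changes later.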
Passing VLAs as function arguments is not included within the scope of your question, but should it be of interest, there is a good discussion on that topic here.
#include <stdio.h>
int func()
{
int a = 3, b = 4;
int c = a * b;
return c;
}
int main()
{
const int N = 10;
int arr[N];
printf("size = %zu\n", sizeof(arr));
int x = 10;
const int SIZE = x;
int buf[SIZE];
printf("size = %zu\n", sizeof(buf));
const int FN = func();
int buf2[FN];
printf("size = %zu\n", sizeof(buf2));
return 0;
}
ubuntu 20 5.4.0-42-generic
gcc 9.3.0
compile:
gcc const_create_arr.c -Wall
This shows no warnings.
output:
size = 40
size = 40
size = 48
The output is correct.
The last one, FN, is initialised by func(). The return value of func() has to be computed at run time, but an array definition should give the compiler the true length of the array so that it can allocate space. So I expected the last one to fail to compile, yet it seems to work. I want to know how this works: has my gcc optimised it and computed the return value of func() at compile time?
In all three cases you're creating a variable length array. For an array to not be a VLA the size needs to be an integer constant expression, and a variable with the const qualifier (no matter how it's initialized) does not qualify as one.
The definition of a VLA can be found in section 6.7.6.2p4 of the C standard regarding array declarators:
If the size is not present, the array type is an incomplete type. If the size is * instead of being an expression, the array type is a variable length array type of unspecified size, which can only be used in declarations or type names with function prototype scope; such arrays are nonetheless complete types. If the size is an integer constant expression and the element type has a known constant size, the array type is not a variable length array type; otherwise, the array type is a variable length array type.
And the definition of an integer constant expression is given in section 6.6p6:
An integer constant expression shall have integer type and shall only have operands that are integer constants, enumeration constants, character constants, sizeof expressions whose results are integer constants, _Alignof expressions, and floating constants that are the immediate operands of casts. Cast operators in an integer constant expression shall only convert arithmetic types to integer types, except as part of an operand to the sizeof or _Alignof operator.
There is nothing in this definition that qualifies a const qualified variable as an integer constant expression, so the sizeof operator in each of the three cases is being evaluated at runtime because the arrays are VLAs.
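The difference can be seen in a short sketch (vla_bytes, fixed_bytes and FIXED_LEN are illustrative names; static_assert assumes C11 or later). An enumeration constant gives an ordinary array whose size can be checked at compile time, while a const-qualified variable gives a VLA whose size is only known when the function runs:

```c
#include <assert.h>
#include <stddef.h>

enum { FIXED_LEN = 10 };   /* an integer constant expression */

/* len is const-qualified but still a variable, so buf is a VLA and
   sizeof buf is evaluated when the function runs. */
size_t vla_bytes(int n)
{
    assert(n > 0);
    const int len = n;
    int buf[len];
    return sizeof buf;
}

size_t fixed_bytes(void)
{
    int buf[FIXED_LEN];    /* ordinary array */
    static_assert(sizeof buf == FIXED_LEN * sizeof(int),
                  "known at compile time");
    return sizeof buf;
}
```

Trying the same static_assert on the VLA in vla_bytes would be rejected, because sizeof buf is not a constant expression there.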
C has variable length arrays, so int x[y] is valid C provided y is defined in advance. It doesn't matter if y is const or not, it just needs to be > 0 to make any sense, as well as small enough that you don't use up the entire stack.
In this case y is 12, so you get a size of 48 (12 * sizeof(int), where sizeof(int) is 4). This gets computed at run time, not in advance as you might expect for something trivial like sizeof(int).
Standard C++ does not have variable length arrays, although some compilers accept them there as an extension.
This is what I write:
const int MAX=100;
int main (){
int notas [MAX]={0};
The compiler says the following:
[Error] variable-sized object may not be initialized
[Warning] excess elements in array initializer
When I write MAX with #define MAX 100, it works. But I don't understand what's wrong with doing it the first way.
In this case
const int MAX=100;
does not create a compile time constant, so the array is treated as VLA. By definition, VLAs can not be initialised, hence the error.
On the other hand, #define MAX 100 is a pre-processor macro, and based on the textual replacement property, it results in a compile time constant value of 100, then the array is not a VLA and can be initialized as per the initialization rules.
This
const int MAX=100;
int main (){
int notas [MAX]={0};
is a declaration of a variable length array, the size of which is determined at run time, because the variable MAX is not a compile-time constant in C. Such arrays may not be initialized in declarations.
From the C Standard (6.7.9 Initialization)
3 The type of the entity to be initialized shall be an array of
unknown size or a complete object type that is not a variable length
array type.
So you could write for example
const int MAX=100;
int main (){
int notas [MAX];
memset( notas, 0, MAX * sizeof( int ) );
Otherwise you could use a compile time constant like
enum { MAX=100 };
int main (){
int notas [MAX]={0};
Despite the const in the declaration
const int MAX = 100;
MAX is not a constant expression (i.e., something whose value is known at compile time). Its value isn't known until run time, so the declaration of notas is treated as a variable-length array declaration, and a VLA declaration may not have an initializer (nor may a VLA be declared at file scope, nor may it be a member of a struct or union type).
With the preprocessor macro
#define MAX 100
all instances of the symbol MAX are replaced with the literal 100 after preprocessing, so it's effectively the same as writing
int notas[100] = {0};
which is why using the preprocessor macro works.
What are the differences between (in C)
int * a
int [] a;
Where suppose we did int * a = malloc(...)
Isn't the second one also a pointer?
As it stands right now, the second is simply a syntax error. The closest you could do would be int a[]; Even that, however, isn't allowed for a local variable definition without an initializer: the brackets need to contain a size with a strictly positive value.
For a function parameter, int a[] would be allowed, and so would something like int a[3]. In this case, the two are precisely equivalent--when you define a function parameter with a type array of T, it is adjusted to a type pointer to T.
You can also do an extern declaration:
extern int *a;
extern int b[];
In this case, the two are actually both syntactically valid, but the results are different--you're declaring that a has type pointer to int, while b has type array of int.
If you evaluate the name of an array in an expression, it will usually yield the address of the first element of that array (though there are a few exceptions, such as when used as the argument to sizeof). The array itself doesn't have that type though--what you're looking at is at least similar to an implicit conversion, somewhat similar to 2 + 10.0, converting the 2 to a double before doing the addition--2 itself is an int, but in this expression, it's silently converted to double.
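A short sketch of that conversion, using file-scope objects a and b chosen to mirror the declarations above (decays_to_first_element is a hypothetical helper; static_assert assumes C11 or later):

```c
#include <assert.h>
#include <stddef.h>

int b[4] = { 1, 2, 3, 4 };   /* object of type int[4] */
int *a = b;                  /* b converts to &b[0]; a has type int * */

/* sizeof is one of the exceptions: no conversion happens for b. */
static_assert(sizeof b == 4 * sizeof(int), "sizeof sees the whole array");
static_assert(sizeof a == sizeof(int *), "sizeof sees only the pointer");

/* Returns 1 when the converted array name equals the address of element 0. */
int decays_to_first_element(void)
{
    return a == &b[0] && *(b + 2) == b[2];
}
```

So a and b can be indexed the same way, yet they remain objects of different types and sizes.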
I think you mean the difference between the following declarations
int a[N];
and
int *a = malloc( N * sizeof( int ) );
where N is some integer value.
The first one declares an array with static or automatic storage duration. The second one does two things. Its initializing expression dynamically allocates memory that can hold an array of N elements of type int, and the address of that memory is assigned to the pointer a.
The allocated memory should be freed by the user when it is not needed any more.
So in the first declaration there is declared an object of type int[N] while in the second declaration there is declared an object of type int *. Correspondingly the size of the first object is equal to sizeof( int[N] ) that is equivalent to N * sizeof( int ) while the size of the second object is equal to sizeof( int * ).
Pointers do not carry any information about whether they point to a single object or to the first object of an array. So the user has to keep track of the number of elements of the allocated array, whereas for an array the number can be obtained with the expression sizeof( a ) / sizeof( *a ).
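One common way to keep track of the count is to store it beside the pointer. This is just a sketch; struct int_buffer, make_buffer and fixed_count are hypothetical names, not part of any library:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* A pointer carries no length, so the count is stored next to it. */
struct int_buffer {
    int   *data;
    size_t count;
};

/* Allocates count zeroed ints; count is 0 if the allocation fails. */
struct int_buffer make_buffer(size_t count)
{
    struct int_buffer buf;
    buf.data  = calloc(count, sizeof *buf.data);
    buf.count = (buf.data != NULL) ? count : 0;
    return buf;
}

/* For a real array the count can be recovered from the type. */
size_t fixed_count(void)
{
    int a[8];
    return sizeof a / sizeof *a;
}
```

The caller remains responsible for passing buf.data to free() once the buffer is no longer needed.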
Can anyone explain what a variably modified type is?
If we have an array a[n] and n is not known at compile time then a is a VLA. Given an array b[c][d] where c and d are not known until runtime implies b is a VLA, right?
In my book they have said that a variably modified type contains a VLA.
That's it; nothing more.
How do I create a pointer to a variably modified type?
A variably modified type is a VLA (variable length array) type, or a type derived from one, such as a pointer to a VLA. There's a similar type in a structure with a flexible array member, but I don't plan to discuss flexible array members further.
The key point about a VLA is that the dimension of an array is not known until run-time. Classically, in C89 and before the standard, all dimensions of an array except the first had to be a known constant value at compile time (and the first dimension could be specified as int a[] or int b[][SIZE] or int c[][SIZE1][SIZE2] where the sizes are constants).
void some_function(int n)
{
int a[n];
int c = n+1;
int d = n+2;
int b[c][d];
another_function(n, a, c, d, b);
...
}
void another_function(int n, int a[n], int c, int d, int b[c][d])
{
...
}
Both a and b are variable length arrays. Prior to C99, you could not have written some_function() like that; the size of the arrays would have to be known at compile time as compile-time constants. Similarly, the notation for another_function() would not have been legal before C99.
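A compilable variant of the same idea might look like the sketch below. The names sum_2d and sum_2d_demo are made up, and the code assumes a compiler with VLA support:

```c
#include <assert.h>

/* The dimensions must be declared before the array parameter that uses them. */
long sum_2d(int rows, int cols, int m[rows][cols])
{
    long total = 0;
    for (int r = 0; r < rows; r++)
        for (int c = 0; c < cols; c++)
            total += m[r][c];
    return total;
}

long sum_2d_demo(int n)
{
    assert(n > 0);
    int b[n][n + 1];   /* VLA: cannot take an initializer, so fill it in a loop */
    for (int r = 0; r < n; r++)
        for (int c = 0; c < n + 1; c++)
            b[r][c] = 1;
    return sum_2d(n, n + 1, b);
}
```

With n equal to 3 the array has 3 rows of 4 elements, so the sum of the ones is 12.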
You could, and still can (for reasons of backwards compatibility, if nothing else) write a moderate simulation of another_function():
enum { FIXED_SIZE = 32 };
void yet_another_function(int a[], int n, int b[][FIXED_SIZE], int c)
{
...
}
This isn't a perfect simulation because the FIXED_SIZE is a fixed size, but the pure C99 VLA code has a variable dimension there. Old code would often, therefore, use a FIXED_SIZE that was large enough for the worst case.
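Inside another_function(), the names a and b are really pointers: a is adjusted to int *, and b is adjusted to int (*)[d], which is a pointer to a variable length array and therefore a variably modified type.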
Otherwise, you do it the same as for a fixed size array:
int z[FIXED_SIZE];
int (*z_pointer)[FIXED_SIZE] = &z;
int v[n];
int (*v_pointer)[n] = &v;
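A pointer to a VLA is also handy for heap allocation, since the pointed-to row type carries the run-time column count. A minimal sketch, with last_cell as a hypothetical function and VLA support assumed:

```c
#include <assert.h>
#include <stdlib.h>

/* Builds a rows x cols matrix on the heap through a pointer to a VLA row,
   fills m[r][c] = r * cols + c, and returns the last element (-1 on failure). */
int last_cell(size_t rows, size_t cols)
{
    assert(rows > 0 && cols > 0);
    int (*m)[cols] = malloc(rows * sizeof *m);  /* sizeof *m is cols * sizeof(int) */
    if (m == NULL)
        return -1;

    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            m[r][c] = (int)(r * cols + c);

    int last = m[rows - 1][cols - 1];
    free(m);
    return last;
}
```

Only the pointer m lives on the stack here; the matrix itself is one contiguous heap block that still supports ordinary m[r][c] indexing.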
VLA == Variable Length Array
Variable Length Arrays were introduced in the C99 spec to allow for things like this:
int someArraySize = 10; // in practice, a value computed at run time
int myArray[someArraySize];
A Variably Modified type is the type of a Variable Length Array, or a type built from one. Thus, a Variably Modified type CONTAINS a VLA. In the case of your example of b[c][d], where c and d are not known until run time, b has a Variably Modified type that happens to be a Variable Length multi-dimensional array. b[c][d] is a variable length array of variable length arrays, which is quite a mouthful.
Here is a great source I found that describes these VLAs and the Variably Modified type with examples:
http://gustedt.wordpress.com/2011/01/09/dont-be-afraid-of-variably-modified-types/
A VMT is often used to allocate heap blocks of the VMT's size. A pointer to a VMT is not itself a VLA.
#include <stdlib.h>
int main( const int argc, char * const argv[argc])
{
typedef char * VMT [argc] ;
VMT * vmt_ptr = malloc(sizeof(VMT));
(*vmt_ptr)[0] = argv[0] ; // parentheses needed: [] binds tighter than *
free(vmt_ptr);
return 42;
}
Some people prefer to call them "VLA on the heap". For some people that defeats the purpose of VLAs. For them, VLA is a small(er) array on the stack.
{ // VMT is a type of VLA
VMT VLA ;
VLA[0] = argv[0] ;
}
No mem leak here. But then some people are wondering what the fuss is all about.
{
typedef char * VMT [argc] ;
VMT * vmt_ptr = alloca(sizeof(VMT));
(*vmt_ptr)[0] = argv[0] ; // parentheses needed: [] binds tighter than *
}
They use alloca together with a VMT, effectively creating pointers to a VMT that refer to blocks allocated on the stack.
There are valid use cases for all three snippets. I hope they also show what VMTs are.
Mandatory Godbolt: https://godbolt.org/z/zGe4K5hez