Can anyone explain what a variably modified type is?
If we have an array a[n] and n is not known at compile time then a is a VLA. Given an array b[c][d] where c and d are not known until runtime implies b is a VLA, right?
In my book they have said that a variably modified type contains a VLA.
That's it; nothing more.
How do I create a pointer to a variably modified type?
A variably-modified type is a VLA (variable length array). There's a similar type in a structure with a flexible array member, but I don't plan to discuss flexible array members further.
The key point about a VLA is that the dimension of an array is not known until run-time. Classically, in C89 and before the standard, all dimensions of an array except the first had to be a known constant value at compile time (and the first dimension could be specified as int a[] or int b[][SIZE] or int c[][SIZE1][SIZE2] where the sizes are constants).
void some_function(int n)
{
int a[n];
int c = n+1;
int d = n+2;
int b[c][d];
another_function(n, a, c, d, b);
...
}
void another_function(int n, int a[n], int c, int d, int b[c][d])
{
...
}
Both a and b are variable length arrays. Prior to C99, you could not have written some_function() like that; the size of the arrays would have to be known at compile time as compile-time constants. Similarly, the notation for another_function() would not have been legal before C99.
You could, and still can (for reasons of backwards compatibility, if nothing else) write a moderate simulation of another_function():
enum { FIXED_SIZE = 32 };
void yet_another_function(int a[], int n, int b[][FIXED_SIZE], int c)
{
...
}
This isn't a perfect simulation because the FIXED_SIZE is a fixed size, but the pure C99 VLA code has a variable dimension there. Old code would often, therefore, use a FIXED_SIZE that was large enough for the worst case.
Inside another_function(), the names a and b are basically pointers to variably modified types.
Otherwise, you do it the same as for a fixed size array:
int z[FIXED_SIZE];
int (*z_pointer)[FIXED_SIZE] = &z;
int v[n];
int (*v_pointer)[n] = &v;
VLA == Variable Length Array
Variable Length Arrays were introduced in the C99 spec to allow for things like this:
int someArraySize;
int myArray[someArraySize];
Variably Modified type is the type of a Variable Length Array. Thus, a Variably Modified type CONTAINS a VLA. In the case of your example of b[c][d] where c and d are not known until run time, b is a Variably Modified type that happens to be a Variable Length multi-dimensional array. b[c][d] is a variable length array of variable length arrays-- phew, what a mouthful.
Here is a great source I found that describes these VLAs and the Variably Modified type with examples:
http://gustedt.wordpress.com/2011/01/09/dont-be-afraid-of-variably-modified-types/
VMT is a type usually used to allocate heap blocks of VMT size. The pointer to VMT is not VLA.
#include <stdlib.h>
int main( const int argc, char * const argv[argc])
{
typedef char * VMT [argc] ;
VMT * vmt_ptr = malloc(sizeof(VMT));
* vmt_ptr[0] = argv[0] ;
free(vmt_ptr);
return 42;
}
Some people prefer to call them "VLA on the heap". For some people that defeats the purpose of VLAs. For them, VLA is a small(er) array on the stack.
{ // VMT is a type of VLA
VMT VLA ;
VLA[0] = argv[0] ;
}
No mem leak here. But then some people are wondering what the fuss is all about.
{
typedef char * VMT [argc] ;
VMT * vmt_ptr = alloca(sizeof(VMT));
* vmt_ptr[0] = argv[0] ;
}
They are using alloca, using the VMTs in the process. Effectively creating VMT pointers to blocks allocated on stack space.
There are valid use cases for all three snippets. I hope also showing what are the VMT's.
Mandatory Godbolt: https://godbolt.org/z/zGe4K5hez
Related
Can sizeof safely be used on an array that has been declared without an explicit size specified inside the square brackets, but which gets initialised in the declaration?
Consider the following code:
unsigned int arr[] = { 1, 2, 3 };
size_t bytes = sizeof arr;
If compiled on macOS with clang-800.0.42.1 without any special compiler flags, this yields the expected result of 12.
But does the C standard (or any C standard, if they differ on this) guarantee this to be the case? Or do I have to declare it like unsigned int arr[3] in order for it to be "sane"?
Yes, the standard guarantees that the array element count will be equal to the number of elements in the array initializer in case no size is specified. See
C11 standard draft 6.7.9p22 and 6.7.9p25:
If an array of unknown size is initialized, its size is determined by
the largest indexed element with an explicit initializer. The array
type is completed at the end of its initializer list.
EXAMPLE 2 The declaration
int x[] = { 1, 3, 5 };
defines and initializes x as a one-dimensional array object that has three elements, as no size was specified and there are three initializers.
unsigned int arr[] = { 1, 2, 3 }; actually defines a complete array. The size of the array is known in this compilation unit and is n*sizeof(type) where n is the number of elements in the initialization list (here 3) and type is the underlying object type (here unsigned int).
That means that sizeof(arr) is defined in same scope as arr and has the expected value.
What would be completely different would be extern int arr[];. That would be a simple declaration that an array of that name will be provided in another compilation unit, but the compiler has no way to know its size. In that case using sizeof(arr) will be an error.
Another example of mere declaration is
void func(int arr[]) {
...
}
Here again the compiler only knows that the function will receive an int array, but again cannot know its size. But here the compiler generates a pointer that will receive the address of the array and sizeof(arr) is defined but is the size of that pointer and not the size of the original array.
Someone made an argument saying that in modern C, we should always pass arrays to functions through an array pointer, since array pointers have strong typing. Example:
void func (size_t n, int (*arr)[n]);
...
int array [3];
func(3, &array);
This sounded like it could potentially be a good idea to prevent all kinds of type-related and array-out-of-bounds bugs. But then it occurred to me I don't know how to apply const correctness to this.
If I do void func (size_t n, const int (*arr)[n]) then it is const correct. But then I can no longer pass the array, because of incompatible pointer types. int (*)[3] versus const int (*)[3]. The qualifier belongs to the pointed-at data and not to the pointer itself.
An explicit cast in the caller would ruin the whole idea of increased type safety.
How do I apply const correctness to array pointers passed as parameters? Is it at all possible?
EDIT
Just as info, someone said that the idea of passing arrays by pointer like this probably originates from MISRA C++:2008 5-2-12. See for example PRQA's high integrity C++ standard.
There is no way to do it except for the cast. This is significant drawback of the idea to pass arrays in this way.
Here is a similar thread where the C rules are compared to the C++ rules. We could conclude from this comparison that the C rules are not so well designed, because your use case is valid but C doesn't allow the implicit conversion. Another such example is conversion of T ** to T const * const *; this is safe but is not allowed by C.
Note that since n is not a constant expression, then int n, int (*arr)[n] does not have any added type safety compared to int n, int *arr. You still know the length (n), and it is still silent undefined behaviour to access out of bounds, and silent undefined behaviour to pass an array that is not actually length n.
This technique has more value in the case of passing non-VLA arrays , when the compiler must report if you pass a pointer to an array of the wrong length.
C standard says that (section: §6.7.3/9):
If the specification of an array type includes any type qualifiers, the element type is so- qualified, not the array type.[...]
Therefore, in case of const int (*arr)[n], const is applied to the elements of the array instead of array arr itself. arr is of type pointer to array[n] of const int while you are passing a parameter of type pointer to array[n] of int. Both types are incompatible.
How do I apply const correctness to array pointers passed as parameters? Is it at all possible?
It's not possible. There is no way to do this in standard C without using explicit cast.
But, GCC allow this as an extension:
In GNU C, pointers to arrays with qualifiers work similar to pointers to other qualified types. For example, a value of type int (*)[5] can be used to initialize a variable of type const int (*)[5]. These types are incompatible in ISO C because the const qualifier is formally attached to the element type of the array and not the array itself.
extern void
transpose (int N, int M, double out[M][N], const double in[N][M]);
double x[3][2];
double y[2][3];
...
transpose(3, 2, y, x);
Further reading: Pointer to array with const qualifier in C & C++
OP describes a function func() that has the following signature.
void func(size_t n, const int (*arr)[n])
OP wants to call it passing various arrays
#define SZ(a) (sizeof(a)/sizeof(a[0]))
int array1[3];
func(SZ(array1), &array1); // problem
const int array2[3] = {1, 2, 3};
func(SZ(array2), &array2);
How do I apply const correctness to array pointers passed as parameters?
With C11, use _Generic to do the casting as needed. The cast only occurs when the input is of the acceptable non-const type, thus maintaining type safety. This is "how" to do it. OP may consider it "bloated" as it is akin to this. This approach simplifies the macro/function call to only 1 parameter.
void func(size_t n, const int (*arr)[n]) {
printf("sz:%zu (*arr)[0]:%d\n", n, (*arr)[0]);
}
#define funcCC(x) func(sizeof(*x)/sizeof((*x)[0]), \
_Generic(x, \
const int(*)[sizeof(*x)/sizeof((*x)[0])] : x, \
int(*)[sizeof(*x)/sizeof((*x)[0])] : (const int(*)[sizeof(*x)/sizeof((*x)[0])])x \
))
int main(void) {
#define SZ(a) (sizeof(a)/sizeof(a[0]))
int array1[3];
array1[0] = 42;
// func(SZ(array1), &array1);
const int array2[4] = {1, 2, 3, 4};
func(SZ(array2), &array2);
// Notice only 1 parameter to the macro/function call
funcCC(&array1);
funcCC(&array2);
return 0;
}
Output
sz:4 (*arr)[0]:1
sz:3 (*arr)[0]:42
sz:4 (*arr)[0]:1
Alternatively code could use
#define funcCC2(x) func(sizeof(x)/sizeof((x)[0]), \
_Generic(&x, \
const int(*)[sizeof(x)/sizeof((x)[0])] : &x, \
int(*)[sizeof(x)/sizeof((x)[0])] : (const int(*)[sizeof(x)/sizeof((x)[0])])&x \
))
funcCC2(array1);
funcCC2(array2);
I cannot understand why doing this is wrong:
const int n = 5;
int x[n] = { 1,1,3,4,5 };
even though n is already a const value.
While doing this seems to be right for the GNU compiler:
const int n = 5;
int x[n]; /*without initialization*/
I'm aware of VLA feature of C99 and I think it's related to what's going on but
I just need some clarification of what's happening in the background.
The key thing to remember is that const and "constant" mean two quite different things.
The const keyword really means "read-only". A constant is a numeric literal, such as 42 or 1.5 (or an enumeration or character constant). A constant expression is a particular kind of expression that can be evaluated at compile time, such as 2 + 2.
So given a declaration:
const int n = 5;
the expression n refers to the value of the object, and it's not treated as a constant expression. A typical compiler will optimize a reference to n, replacing it by the same code it would use for a literal 5, but that's not required -- and the rules for whether an expression is constant are determined by the language, not by the cleverness of the current compiler.
An example of the difference between const (read-only) and constant (evaluated at compile time) is:
const size_t now = time(NULL);
The const keyword means you're not allowed to modify the value of now after its initialization, but the value of time(NULL) clearly cannot be computed until run time.
So this:
const int n = 5;
int x[n];
is no more valid in C than it would be without the const keyword.
The language could (and IMHO probably should) evaluate n as a constant expression; it just isn't defined that way. (C++ does have such a rule; see the C++ standard or a decent reference for the gory details.)
If you want a named constant with the value 5, the most common way is to define a macro:
#define N 5
int x[N];
Another approach is to define an enumeration constant:
enum { n = 5 };
int x[n];
Enumeration constants are constant expressions, and are always of type int (which means this method won't work for types other than int). And it's arguably an abuse of the enum mechanism.
Starting with the 1999 standard, an array can be defined with a non-constant size; this is a VLA, or variable-length array. Such arrays are permitted only at block scope, and may not have initializers (since the compiler is unable to check that the initializer has the correct number of elements).
But given your original code:
const int n = 5;
int x[n] = { 1,1,3,4,5 };
you can let the compiler infer the length from the initializer:
int x[] = { 1,1,3,4,5 };
And you can then compute the length from the array's size:
const int x_len = sizeof x / sizeof x[0];
Why int x[n] is wrong where n is a const value?
n is not a constant. const only promise that n is a 'read-only' variable that shouldn't be modified during the program execution.
Note that in c, unlike c++, const qualified variables are not constant. Therefore, the array declared is a variable length array.
You can't use initializer list to initialize variable length arrays.
C11-§6.7.9/3:
The type of the entity to be initialized shall be an array of unknown size or a complete object type that is not a variable length array type.
You can use #define or enum to make n a constant
#define n 5
int x[n] = { 1,1,3,4,5 };
If you are exhaustively initialising an array, then it is easier, safer and more maintainable to let the compiler infer the array size:
int x[] = { 1,1,3,4,5 };
const int n = sizeof(x) / sizeof(*x) ;
Then to change the array size you only have to change the number of initialisers, rather than change the size and the initialiser list to match. Especially useful when there are many initialisers.
Even though n is a const, you cannot use it to define the size of an array unless you wish to create a VLA. However, you cannot use an initializer list to initialize a VLA.
Use a macro to create a fixed size array.
#define ARRAY_SIZE 5
int x[ARRAY_SIZE] = { 1,1,3,4,5 };
Is your code semantically different from myfunc() here:
void myfunc(const int n) {
int x[n] = { 1,1,3,4,5 };
printf("%d\n", x[n-1]);
*( (int *) &n) = 17; // Somewhat less "constant" than hoped...
return ;
}
int main(){
myfunc(4);
myfunc(5);
myfunc(6); // Haven't actually tested this. Boom? Maybe just printf(noise)?
return 0;
}
Is n really all that constant? How much space do you think the compiler should allocated for x[] (since it is the compiler's job to do so)?
As others have pointed out, the cv-qualifier const does not mean "a value that is constant during compilation and for all times after". It means "a value that the local code is not supposed to change (although it can)".
in C, I believe the following program is valid: casting a pointer to an allocated memory buffer to an array like this:
#include <stdio.h>
#include <stdlib.h>
#define ARRSIZE 4
int *getPointer(int num){
return malloc(sizeof(int) * num);
}
int main(){
int *pointer = getPointer(ARRSIZE);
int (*arrPointer)[ARRSIZE] = (int(*)[ARRSIZE])pointer;
printf("%d\n", sizeof(*arrPointer) / sizeof((*arrPointer)[0]));
return 0;
}
(this outputs 4).
However, is it safe, in C99, to do this using VLAs?
int arrSize = 4;
int *pointer = getPointer(arrSize);
int (*arrPointer)[arrSize] = (int(*)[arrSize])pointer;
printf("%d\n", sizeof(*arrPointer) / sizeof((*arrPointer)[0]));
return 0;
(also outputs 4).
Is this legit, according to the C99 standard?
It'd be quite strange if it is legit, since this would mean that VLAs effectively enable dynamic type creation, for example, types of the kind type(*)[variable].
Yes, this is legit, and yes, the variably-modified type system is extremely useful. You can use natural array syntax to access a contiguous 2-D array both of whose dimensions were not known until runtime.
It could be called syntactic sugar as there's nothing you can do with these types that you couldn't do without them, but it makes for clean code (in my opinion).
I would say it is valid. The Final version of the C99 standard (cited on Wikipedia) says in paragraph 7.5.2 - Array declarators alinea 5 :
If the size is an expression that is not an integer constant expression: ...
each time it is evaluated it shall have a value greater than zero. The size of each instance
of a variable length array type does not change during its lifetime.
It even explicitely says that it can be used in a sizeof provided the size never changes : Where a size
expression is part of the operand of a sizeof operator and changing the value of the
size expression would not affect the result of the operator, it is unspecified whether or not
the size expression is evaluated.
But the standard also says that this is only allowed in a block scope declarator or a function prototype : An ordinary identifier (as defined in 6.2.3) that has a variably modified type shall have
either block scope and no linkage or function prototype scope. If an identifier is declared
to be an object with static storage duration, it shall not have a variable length array type.
And an example later explains that it cannot be used for member fields, even in a block scope :
...
void fvla(int m, int C[m][m]); // valid: VLA with prototype scope
void fvla(int m, int C[m][m]) // valid: adjusted to auto pointer to VLA
{
typedef int VLA[m][m]; // valid: block scope typedef VLA
struct tag {
int (*y)[n]; // invalid: y not ordinary identifier
int z[n]; // invalid: z not ordinary identifier
};
...
I think that it is because the former is an array of pointers to char and the latter is a pointer to an array of chars, and we need to properly specify the size of the object being pointed to for our function definition. In the former;
function(char * p_array[])
the size of the object being pointed to is already included (its a pointer to char), but the latter
function(char (*p_array)[])
needs the size of the array p_array points to as part of p_array's definition?
I'm at the stage where I've been thinking about this for too long and have just confused myself, someone please let me know if my reasoning is correct.
Both are valid in C but not C++. You would ordinarily be correct:
char *x[]; // array of pointers to char
char (*y)[]; // pointer to array of char
However, the arrays decay to pointers if they appear as function parameters. So they become:
char **x; // Changes to pointer to array of pointer to char
char (*y)[]; // No decay, since it's NOT an array, it's a pointer to an array
In an array type in C, one of the sizes is permitted to be unspecified. This must be the leftmost one (whoops, I said rightmost at first). So,
int valid_array[][5]; // Ok
int invalid_array[5][]; // Wrong
(You can chain them... but we seldom have reason to do so...)
int (*convoluted_array[][5])[][10];
There is a catch, and the catch is that an array type with [] in it is an incomplete type. You can pass around a pointer to an incomplete type but certain operations will not work, as they need a complete type. For example, this will not work:
void func(int (*x)[])
{
x[2][5] = 900; // Error
}
This is an error because in order to find the address of x[2], the compiler needs to know how big x[0] and x[1] are. But x[0] and x[1] have type int [] -- an incomplete type with no information about how big it is. This becomes clearer if you imagine what the "un-decayed" version of the type would be, which is int x[][] -- obviously invalid C. If you want to pass a two-dimensional array around in C, you have a few options:
Pass a one-dimensional array with a size parameter.
void func(int n, int x[])
{
x[2*n + 5] = 900;
}
Use an array of pointers to rows. This is somewhat clunky if you have genuine 2D data.
void func(int *x[])
{
x[2][5] = 900;
}
Use a fixed size.
void func(int x[][5])
{
x[2][5] = 900;
}
Use a variable length array (C99 only, so it probably doesn't work with Microsoft compilers).
// There's some funny syntax if you want 'x' before 'width'
void func(int n, int x[][n])
{
x[2][5] = 900;
}
This is a frequent problem area even for C veterans. Many languages lack intrinsic "out-of-the-box" support for real, variable size, multidimensional arrays (C++, Java, Python) although a few languages do have it (Common Lisp, Haskell, Fortran). You'll see a lot of code that uses arrays of arrays or that calculates array offsets manually.
NOTE:
The below answer was added when the Q was tagged C++, and it answers from a C++ perspective. With tagged changed to only C, both the mentioned samples are valid in C.
Yes, Your reasoning is correct.
If you try compiling the error given by compiler is:
parameter ‘p_array’ includes pointer to array of unknown bound ‘char []’
In C++ array sizes need to be fixed at compile time. C++ standard forbids Variable Lenght Array's(VLA) as well. Some compilers support that as an extension but that is non standard conforming.
Those two declarations are very different. In a function parameter declaration, a declarator of [] directly applied to the parameter name is completely equivalent to a *, so your first declaration is exactly the same in all respects as this:
function(char **p_array);
However, this does not apply recursively to parameter types. Your second parameter has type char (*)[], which is a pointer to an array of unknown size - it is a pointer to an incomplete type. You can happily declare variables with this type - the following is a valid variable declaration:
char (*p_array)[];
Just like a pointer to any other incomplete type, you cannot perform any pointer arithmetic on this variable (or your function parameter) - that's where you error arises. Note that the [] operator is specified as a[i] being identical to *(a+i), so that operator cannot be applied to your pointer. You can, of course, happily use it as a pointer, so this is valid:
void function(char (*p_array)[])
{
printf("p_array = %p\n", (void *)p_array);
}
This type is also compatible with a pointer to any other fixed-size array of char, so you can also do this:
void function(char (*p_array)[])
{
char (*p_a_10)[10] = p_array;
puts(*p_a_10);
}
...and even this:
void function(char (*p_array)[])
{
puts(*p_array);
}
(though there is precious little point in doing so: you might as well just declare the parameter with type char *).
Note that although *p_array is allowed, p_array[0] is not.
Because,
(1) function(char * p_array[])
is equivalent to char **p_array; i.e. a double pointer which is valid.
(2) function(char (*p_array)[])
You are right, that p_array is pointer to char array. But that needs to be of fixed size in the case when it appears as function argument. You need to provide the size and that will also become valid.