Const correctness for array pointers? - c

Someone made an argument saying that in modern C, we should always pass arrays to functions through an array pointer, since array pointers have strong typing. Example:
void func (size_t n, int (*arr)[n]);
...
int array [3];
func(3, &array);
This sounded like it could potentially be a good idea to prevent all kinds of type-related and array-out-of-bounds bugs. But then it occurred to me I don't know how to apply const correctness to this.
If I do void func (size_t n, const int (*arr)[n]) then it is const correct. But then I can no longer pass the array, because of incompatible pointer types. int (*)[3] versus const int (*)[3]. The qualifier belongs to the pointed-at data and not to the pointer itself.
An explicit cast in the caller would ruin the whole idea of increased type safety.
How do I apply const correctness to array pointers passed as parameters? Is it at all possible?
EDIT
Just as info, someone said that the idea of passing arrays by pointer like this probably originates from MISRA C++:2008 5-2-12. See for example PRQA's high integrity C++ standard.

There is no way to do it except for the cast. This is a significant drawback of the idea of passing arrays in this way.
Here is a similar thread where the C rules are compared to the C++ rules. We could conclude from this comparison that the C rules are not so well designed, because your use case is valid but C doesn't allow the implicit conversion. Another such example is conversion of T ** to T const * const *; this is safe but is not allowed by C.
Note that since n is not a constant expression, then int n, int (*arr)[n] does not have any added type safety compared to int n, int *arr. You still know the length (n), and it is still silent undefined behaviour to access out of bounds, and silent undefined behaviour to pass an array that is not actually length n.
This technique has more value in the case of passing non-VLA arrays, when the compiler must report if you pass a pointer to an array of the wrong length.
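As a minimal sketch of the non-VLA case, assuming a hypothetical helper named sum3 (not from the question), a fixed bound in the parameter type lets the compiler reject a pointer to an array of a different length:

```c
#include <stddef.h>

/* Hypothetical helper: accepts only a pointer to an array of exactly 3 ints. */
static int sum3(int (*arr)[3])
{
    int total = 0;
    /* sizeof *arr is the size of the whole int[3], so the count is a constant 3. */
    for (size_t i = 0; i < sizeof *arr / sizeof (*arr)[0]; i++)
        total += (*arr)[i];
    return total;
}

/*
 * int ok[3]    = {1, 2, 3};      sum3(&ok);     accepted, yields 6
 * int wrong[4] = {1, 2, 3, 4};   sum3(&wrong);  diagnosed: int (*)[4] vs int (*)[3]
 */
```

The mismatched call is left in a comment because it is a constraint violation and should not compile cleanly.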

The C standard says (§6.7.3/9):
If the specification of an array type includes any type qualifiers, the element type is so-qualified, not the array type.[...]
Therefore, in the case of const int (*arr)[n], const is applied to the elements of the array instead of the array type itself. arr is of type pointer to array[n] of const int, while you are passing an argument of type pointer to array[n] of int. The two types are incompatible.
How do I apply const correctness to array pointers passed as parameters? Is it at all possible?
It's not possible. There is no way to do this in standard C without using explicit cast.
However, GCC allows this as an extension:
In GNU C, pointers to arrays with qualifiers work similar to pointers to other qualified types. For example, a value of type int (*)[5] can be used to initialize a variable of type const int (*)[5]. These types are incompatible in ISO C because the const qualifier is formally attached to the element type of the array and not the array itself.
extern void
transpose (int N, int M, double out[M][N], const double in[N][M]);
double x[3][2];
double y[2][3];
...
transpose(3, 2, y, x);
Further reading: Pointer to array with const qualifier in C & C++
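For strictly ISO C code, the explicit cast mentioned above can at least be confined to one place. The following is only a sketch under that assumption: first_elem and first_elem_nc are hypothetical names, and the VLA parameter type requires a compiler that supports variably modified types.

```c
#include <stddef.h>

/* A read-only function with the question's parameter list. */
static int first_elem(size_t n, const int (*arr)[n])
{
    return (*arr)[0];
}

/* Hypothetical wrapper for non-const arrays: the one explicit cast lives here,
   so callers never have to write it themselves. */
static int first_elem_nc(size_t n, int (*arr)[n])
{
    return first_elem(n, (const int (*)[n])arr);
}
```

Callers with an int array use first_elem_nc(3, &array), and callers with a const int array use first_elem directly. The cost is one extra function per const-correct function.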

OP describes a function func() that has the following signature.
void func(size_t n, const int (*arr)[n])
OP wants to call it passing various arrays
#define SZ(a) (sizeof(a)/sizeof(a[0]))
int array1[3];
func(SZ(array1), &array1); // problem
const int array2[3] = {1, 2, 3};
func(SZ(array2), &array2);
How do I apply const correctness to array pointers passed as parameters?
With C11, use _Generic to do the casting as needed. The cast only occurs when the input is of the acceptable non-const type, thus maintaining type safety. This is "how" to do it. OP may consider it "bloated" as it is akin to this. This approach simplifies the macro/function call to only 1 parameter.
#include <stdio.h>
void func(size_t n, const int (*arr)[n]) {
printf("sz:%zu (*arr)[0]:%d\n", n, (*arr)[0]);
}
#define funcCC(x) func(sizeof(*x)/sizeof((*x)[0]), \
_Generic(x, \
const int(*)[sizeof(*x)/sizeof((*x)[0])] : x, \
int(*)[sizeof(*x)/sizeof((*x)[0])] : (const int(*)[sizeof(*x)/sizeof((*x)[0])])x \
))
int main(void) {
#define SZ(a) (sizeof(a)/sizeof(a[0]))
int array1[3];
array1[0] = 42;
// func(SZ(array1), &array1);
const int array2[4] = {1, 2, 3, 4};
func(SZ(array2), &array2);
// Notice only 1 parameter to the macro/function call
funcCC(&array1);
funcCC(&array2);
return 0;
}
Output
sz:4 (*arr)[0]:1
sz:3 (*arr)[0]:42
sz:4 (*arr)[0]:1
Alternatively code could use
#define funcCC2(x) func(sizeof(x)/sizeof((x)[0]), \
_Generic(&x, \
const int(*)[sizeof(x)/sizeof((x)[0])] : &x, \
int(*)[sizeof(x)/sizeof((x)[0])] : (const int(*)[sizeof(x)/sizeof((x)[0])])&x \
))
funcCC2(array1);
funcCC2(array2);

Related

char* func_name VS char *func_name (what are the differences) [duplicate]

I've recently decided that I just have to finally learn C/C++, and there is one thing I do not really understand about pointers or more precisely, their definition.
How about these examples:
int* test;
int *test;
int * test;
int* test,test2;
int *test,test2;
int * test,test2;
Now, to my understanding, the first three cases are all doing the same: Test is not an int, but a pointer to one.
The second set of examples is a bit more tricky. In case 4, both test and test2 will be pointers to an int, whereas in case 5, only test is a pointer, whereas test2 is a "real" int. What about case 6? Same as case 5?
Cases 4, 5, and 6 are the same thing: only test is a pointer. If you want two pointers, you should use:
int *test, *test2;
Or, even better (to make everything clear):
int* test;
int* test2;
Whitespace around asterisks has no significance. All three mean the same thing:
int* test;
int *test;
int * test;
The "int *var1, var2" is an evil syntax that is just meant to confuse people and should be avoided. It expands to:
int *var1;
int var2;
Many coding guidelines recommend that you only declare one variable per line. This avoids any confusion of the sort you had before asking this question. Most C++ programmers I've worked with seem to stick to this.
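If you want the compiler to confirm this rather than take it on trust, a C11 _Generic selection can report the type of each name. This is only an illustrative sketch: IS_INT_POINTER and the two helper functions are made-up names.

```c
/* Evaluates to 1 when x has type int *, and 0 for any other type (C11 _Generic). */
#define IS_INT_POINTER(x) _Generic((x), int *: 1, default: 0)

/* Both helpers use the declaration from case 4: int* test, test2; */
static int first_is_pointer(void)
{
    int* test = 0, test2 = 0;
    (void)test2;
    return IS_INT_POINTER(test);    /* test is int * */
}

static int second_is_pointer(void)
{
    int* test = 0, test2 = 0;
    (void)test;
    return IS_INT_POINTER(test2);   /* test2 is plain int */
}
```

first_is_pointer() yields 1 and second_is_pointer() yields 0, whichever side of the whitespace the asterisk sits on.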
A bit of an aside I know, but something I found useful is to read declarations backwards.
int* test; // test is a pointer to an int
This starts to work very well, especially when you start declaring const pointers and it gets tricky to know whether it's the pointer that's const, or whether it's the thing the pointer is pointing at that is const.
int* const test; // test is a const pointer to an int
int const * test; // test is a pointer to a const int ... but many people write this as
const int * test; // test is a pointer to an int that's const
Use the "Clockwise Spiral Rule" to help parse C/C++ declarations;
There are three simple steps to follow:
Starting with the unknown element, move in a spiral/clockwise
direction; when encountering the following elements replace them with
the corresponding english statements:
[X] or []: Array X size of... or Array undefined size of...
(type1, type2): function passing type1 and type2 returning...
*: pointer(s) to...
Keep doing this in a spiral/clockwise direction until all tokens have been covered.
Always resolve anything in parenthesis first!
Also, declarations should be in separate statements when possible (which is true the vast majority of times).
There are three pieces to this puzzle.
The first piece is that whitespace in C and C++ is normally not significant beyond separating adjacent tokens that are otherwise indistinguishable.
During the preprocessing stage, the source text is broken up into a sequence of tokens - identifiers, punctuators, numeric literals, string literals, etc. That sequence of tokens is later analyzed for syntax and meaning. The tokenizer is "greedy" and will build the longest valid token that's possible. If you write something like
inttest;
the tokenizer only sees two tokens - the identifier inttest followed by the punctuator ;. It doesn't recognize int as a separate keyword at this stage (that happens later in the process). So, for the line to be read as a declaration of an integer named test, we have to use whitespace to separate the identifier tokens:
int test;
The * character is not part of any identifier; it's a separate token (punctuator) on its own. So if you write
int*test;
the compiler sees 4 separate tokens - int, *, test, and ;. Thus, whitespace is not significant in pointer declarations, and all of
int *test;
int* test;
int*test;
int * test;
are interpreted the same way.
The second piece to the puzzle is how declarations actually work in C and C++ [1]. Declarations are broken up into two main pieces - a sequence of declaration specifiers (storage class specifiers, type specifiers, type qualifiers, etc.) followed by a comma-separated list of (possibly initialized) declarators. In the declaration
unsigned long int a[10]={0}, *p=NULL, f(void);
the declaration specifiers are unsigned long int and the declarators are a[10]={0}, *p=NULL, and f(void). The declarator introduces the name of the thing being declared (a, p, and f) along with information about that thing's array-ness, pointer-ness, and function-ness. A declarator may also have an associated initializer.
The type of a is "10-element array of unsigned long int". That type is fully specified by the combination of the declaration specifiers and the declarator, and the initial value is specified with the initializer ={0}. Similarly, the type of p is "pointer to unsigned long int", and again that type is specified by the combination of the declaration specifiers and the declarator, and is initialized to NULL. And the type of f is "function returning unsigned long int" by the same reasoning.
This is key - there is no "pointer-to" type specifier, just like there is no "array-of" type specifier, just like there is no "function-returning" type specifier. We can't declare an array as
int[10] a;
because the operand of the [] operator is a, not int. Similarly, in the declaration
int* p;
the operand of * is p, not int. But because the indirection operator is unary and whitespace is not significant, the compiler won't complain if we write it this way. However, it is always interpreted as int (*p);.
Therefore, if you write
int* p, q;
the operand of * is p, so it will be interpreted as
int (*p), q;
Thus, all of
int *test1, test2;
int* test1, test2;
int * test1, test2;
do the same thing - in all three cases, test1 is the operand of * and thus has type "pointer to int", while test2 has type int.
Declarators can get arbitrarily complex. You can have arrays of pointers:
T *a[N];
you can have pointers to arrays:
T (*a)[N];
you can have functions returning pointers:
T *f(void);
you can have pointers to functions:
T (*f)(void);
you can have arrays of pointers to functions:
T (*a[N])(void);
you can have functions returning pointers to arrays:
T (*f(void))[N];
you can have functions returning pointers to arrays of pointers to functions returning pointers to T:
T *(*(*f(void))[N])(void); // yes, it's eye-stabby. Welcome to C and C++.
and then you have signal:
void (*signal(int, void (*)(int)))(int);
which reads as
       signal                             -- signal
       signal(                  )         -- is a function taking
       signal(                  )         -- unnamed parameter
       signal(int               )         -- is an int
       signal(int,              )         -- unnamed parameter
       signal(int,      (*)     )         -- is a pointer to
       signal(int,      (*)(   ))         -- a function taking
       signal(int,      (*)(   ))         -- unnamed parameter
       signal(int,      (*)(int))         -- is an int
       signal(int, void (*)(int))         -- returning void
     (*signal(int, void (*)(int)))        -- returning a pointer to
     (*signal(int, void (*)(int)))(   )   -- a function taking
     (*signal(int, void (*)(int)))(   )   -- unnamed parameter
     (*signal(int, void (*)(int)))(int)   -- is an int
void (*signal(int, void (*)(int)))(int);  -- returning void
and this just barely scratches the surface of what's possible. But notice that array-ness, pointer-ness, and function-ness are always part of the declarator, not the type specifier.
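To see that the signal-shaped declarator really is "a function returning a pointer to a function", here is a small sketch that writes the same shape twice, once in full and once through a typedef. All names (handler_fn, install, install_td, noop) are hypothetical, and nothing here touches the real signal function.

```c
/* handler_fn is a hypothetical typedef name for
   "pointer to function taking int and returning void". */
typedef void (*handler_fn)(int);

static handler_fn current_handler;   /* starts out as a null pointer */

/* Written out in full, with the same declarator shape as signal. */
static void (*install(int signo, void (*handler)(int)))(int)
{
    void (*previous)(int) = current_handler;
    (void)signo;                      /* unused in this sketch */
    current_handler = handler;
    return previous;
}

/* The same function type, spelled with the typedef. */
static handler_fn install_td(int signo, handler_fn handler)
{
    return install(signo, handler);
}

static void noop(int signo) { (void)signo; }
```

The first call to install returns a null pointer (nothing was installed before), and the next call returns whatever the previous call installed.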
One thing to watch out for - const can modify both the pointer type and the pointed-to type:
const int *p;
int const *p;
Both of the above declare p as a pointer to a const int object. You can write a new value to p setting it to point to a different object:
const int x = 1;
const int y = 2;
const int *p = &x;
p = &y;
but you cannot write to the pointed-to object:
*p = 3; // constraint violation, the pointed-to object is const
However,
int * const p;
declares p as a const pointer to a non-const int; you can write to the thing p points to
int x = 1;
int y = 2;
int * const p = &x;
*p = 3;
but you can't set p to point to a different object:
p = &y; // constraint violation, p is const
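Put together as something that compiles, the two cases might look like the sketch below. The function names sum_two and store are made up for illustration; only the permitted operations are used, since the forbidden ones are constraint violations.

```c
/* p points to const int: p itself may be re-aimed,
   but *p cannot be written through p. */
static int sum_two(const int *p, const int *q)
{
    int total = *p;
    p = q;          /* allowed: the pointer is not const */
    total += *p;
    return total;
}

/* p is a const pointer to int: *p may be written, but p cannot be re-aimed. */
static void store(int *const p, int value)
{
    *p = value;     /* allowed: the pointed-to int is not const */
}
```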
Which brings us to the third piece of the puzzle - why declarations are structured this way.
The intent is that the structure of a declaration should closely mirror the structure of an expression in the code ("declaration mimics use"). For example, let's suppose we have an array of pointers to int named ap, and we want to access the int value pointed to by the i'th element. We would access that value as follows:
printf( "%d", *ap[i] );
The expression *ap[i] has type int; thus, the declaration of ap is written as
int *ap[N]; // ap is an array of pointer to int, fully specified by the combination
// of the type specifier and declarator
The declarator *ap[N] has the same structure as the expression *ap[i]. The operators * and [] behave the same way in a declaration that they do in an expression - [] has higher precedence than unary *, so the operand of * is ap[N] (it's parsed as *(ap[N])).
As another example, suppose we have a pointer to an array of int named pa and we want to access the value of the i'th element. We'd write that as
printf( "%d", (*pa)[i] );
The type of the expression (*pa)[i] is int, so the declaration is written as
int (*pa)[N];
Again, the same rules of precedence and associativity apply. In this case, we don't want to dereference the i'th element of pa, we want to access the i'th element of what pa points to, so we have to explicitly group the * operator with pa.
The *, [] and () operators are all part of the expression in the code, so they are all part of the declarator in the declaration. The declarator tells you how to use the object in an expression. If you have a declaration like int *p;, that tells you that the expression *p in your code will yield an int value. By extension, it tells you that the expression p yields a value of type "pointer to int", or int *.
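The two examples above can be sketched side by side. The helper names and the values are arbitrary; the point is only that each declarator has the same shape as the expression that reads the int.

```c
enum { N = 3 };

/* ap is an array of N pointers to int; *ap[i] is an int. */
static int second_via_array_of_pointers(void)
{
    int a = 10, b = 20, c = 30;
    int *ap[N] = { &a, &b, &c };
    return *ap[1];
}

/* pa is a pointer to an array of N ints; (*pa)[i] is an int. */
static int second_via_pointer_to_array(void)
{
    int arr[N] = { 10, 20, 30 };
    int (*pa)[N] = &arr;
    return (*pa)[1];
}
```

Both functions return 20.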
So, what about things like cast and sizeof expressions, where we use things like (int *) or sizeof (int [10]) or things like that? How do I read something like
void foo( int *, int (*)[10] );
There's no declarator, aren't the * and [] operators modifying the type directly?
Well, no - there is still a declarator, just with an empty identifier (known as an abstract declarator). If we represent an empty identifier with the symbol λ, then we can read those things as (int *λ), sizeof (int λ[10]), and
void foo( int *λ, int (*λ)[10] );
and they behave exactly like any other declaration. int *[10] represents an array of 10 pointers, while int (*)[10] represents a pointer to an array.
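A quick way to convince yourself of the difference is to let sizeof count elements through each abstract declarator. This is a sketch with made-up function names:

```c
#include <stddef.h>

/* int *[10] is an array of 10 pointers, so it holds 10 elements of type int *. */
static size_t elems_in_array_of_pointers(void)
{
    return sizeof(int *[10]) / sizeof(int *);
}

/* int (*)[10] is one pointer; the array it points at holds 10 ints.
   sizeof does not evaluate its operand, so the null pointer is never dereferenced. */
static size_t elems_behind_pointer_to_array(void)
{
    return sizeof *(int (*)[10])0 / sizeof(int);
}
```

Both functions return 10, but the first measures the array type itself while the second has to dereference the pointer type to reach the array.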
And now the opinionated portion of this answer. I am not fond of the C++ convention of declaring simple pointers as
T* p;
and consider it bad practice for the following reasons:
It's not consistent with the syntax;
It introduces confusion (as evidenced by this question, all the duplicates to this question, questions about the meaning of T* p, q;, all the duplicates to those questions, etc.);
It's not internally consistent - declaring an array of pointers as T* a[N] is asymmetrical with use (unless you're in the habit of writing * a[i]);
It cannot be applied to pointer-to-array or pointer-to-function types (unless you create a typedef just so you can apply the T* p convention cleanly, which...no);
The reason for doing so - "it emphasizes the pointer-ness of the object" - is spurious. It cannot be applied to array or function types, and I would think those qualities are just as important to emphasize.
In the end, it just indicates confused thinking about how the two languages' type systems work.
There are good reasons to declare items separately; working around a bad practice (T* p, q;) isn't one of them. If you write your declarators correctly (T *p, q;) you are less likely to cause confusion.
I consider it akin to deliberately writing all your simple for loops as
i = 0;
for( ; i < N; )
{
...
i++;
}
Syntactically valid, but confusing, and the intent is likely to be misinterpreted. However, the T* p; convention is entrenched in the C++ community, and I use it in my own C++ code because consistency across the code base is a good thing, but it makes me itch every time I do it.
[1] I will be using C terminology - the C++ terminology is a little different, but the concepts are largely the same.
As others mentioned, 4, 5, and 6 are the same. Often, people use these examples to make the argument that the * belongs with the variable instead of the type. While it's an issue of style, there is some debate as to whether you should think of and write it this way:
int* x; // "x is a pointer to int"
or this way:
int *x; // "*x is an int"
FWIW I'm in the first camp, but the reason others make the argument for the second form is that it (mostly) solves this particular problem:
int* x,y; // "x is a pointer to int, y is an int"
which is potentially misleading; instead you would write either
int *x,y; // it's a little clearer what is going on here
or if you really want two pointers,
int *x, *y; // two pointers
Personally, I say keep it to one variable per line, then it doesn't matter which style you prefer.
#include <type_traits>
std::add_pointer<int>::type test, test2;
In 4, 5 and 6, test is always a pointer and test2 is not a pointer. White space is (almost) never significant in C++.
The rationale in C is that you declare the variables the way you use them. For example
char *a[100];
says that *a[42] will be a char. And a[42] a char pointer. And thus a is an array of char pointers.
This is because the original compiler writers wanted to use the same parser for expressions and declarations. (Not a very sensible reason for a language design choice.)
I would say that the initial convention was to put the star on the pointer name side (the right side of the declaration).
In The C Programming Language by Brian W. Kernighan and Dennis M. Ritchie, the stars are on the right side of the declaration.
Looking at the Linux source code at https://github.com/torvalds/linux/blob/master/init/main.c,
we can see that the star is also on the right side.
You can follow the same rules, but it's not a big deal if you put stars on the type side.
Remember that consistency is important, so always put the star on the same side, regardless of which side you choose.
In my opinion, the answer is BOTH, depending on the situation.
Generally, IMO, it is better to put the asterisk next to the pointer name, rather than the type. Compare e.g.:
int *pointer1, *pointer2; // Fully consistent, two pointers
int* pointer1, pointer2; // Inconsistent -- because only the first one is a pointer, the second one is an int variable
// The second case is unexpected, and thus prone to errors
Why is the second case inconsistent? Because e.g. int x,y; declares two variables of the same type but the type is mentioned only once in the declaration. This creates a precedent and expected behavior. And int* pointer1, pointer2; is inconsistent with that because it declares pointer1 as a pointer, but pointer2 is an integer variable. Clearly prone to errors and, thus, should be avoided (by putting the asterisk next to the pointer name, rather than the type).
However, there are some exceptions where you might not be able to put the asterisk next to an object name (and where it matters where you put it) without getting undesired outcome — for example:
MyClass *volatile MyObjName
void test (const char *const p) // const value pointed to by a const pointer
Finally, in some cases, it might be arguably clearer to put the asterisk next to the type name, e.g.:
void* ClassName::getItemPtr () {return &item;} // Clear at first sight
The pointer is a modifier to the type. It's best to read them right to left in order to better understand how the asterisk modifies the type. 'int *' can be read as "pointer to int". In multiple declarations you must specify that each variable is a pointer or it will be created as a standard variable.
1,2 and 3) Test is of type (int *). Whitespace doesn't matter.
4,5 and 6) Test is of type (int *). Test2 is of type int. Again whitespace is inconsequential.
I have always preferred to declare pointers like this:
int* i;
I read this to say "i is of type int-pointer". You can get away with this interpretation if you only declare one variable per declaration.
It is an uncomfortable truth, however, that this reading is wrong. The C Programming Language, 2nd Ed. (p. 94) explains the opposite paradigm, which is the one used in the C standards:
The declaration of the pointer ip,
int *ip;
is intended as a mnemonic; it says that the expression *ip is an
int. The syntax of the declaration for a variable mimics the syntax
of expressions in which the variable might appear. This reasoning
applies to function declarations as well. For example,
double *dp, atof(char *);
says that in an expression *dp and atof(s) have values of type
double, and that the argument of atof is a pointer to char.
So, by the reasoning of the C language, when you declare
int* test, test2;
you are not declaring two variables of type int*, you are introducing two expressions that evaluate to an int type, with no attachment to the allocation of an int in memory.
A compiler is perfectly happy to accept the following:
int *ip, i;
i = *ip;
because in the C paradigm, the compiler is only expected to keep track of the type of *ip and i. The programmer is expected to keep track of the meaning of *ip and i. In this case, ip is uninitialized, so it is the programmer's responsibility to point it at something meaningful before dereferencing it.
A good rule of thumb that helps a lot of people grasp these concepts: in C++, a lot of semantic meaning is derived from the left-binding of keywords or identifiers.
Take for example:
int const bla;
The const applies to the "int" word. The same is with pointers' asterisks, they apply to the keyword left of them. And the actual variable name? Yup, that's declared by what's left of it.

Aren't a[][] and (*a)[] equivalent as function parameters?

Function prototype
void foo(int n, int a[][]);
gives error about incomplete type while
void foo(int n, int (*a)[]);
compiles. As per the decay rule int a[][] is equivalent to int (*a)[] in this case and therefore int (*a)[] should also give an error about an incomplete type but GCC seems to accept it. Is there anything I am missing?
This might be a GCC bug but I didn't find anything related to it.
No, they are not equivalent as function parameters. They are not equivalent in exactly the same way as parameter declarations in foo and bar
struct S;
void foo(struct S* s); // OK
void bar(struct S a[]); // ERROR: incomplete type is not allowed
are not equivalent.
C does not allow incomplete types as array elements (see C 1999 6.7.5.2/1: "[...] The element type shall not be an incomplete or function type. [...]") and this restriction applies to array parameter declarations the same way as it applies to any other array declarations. Even though parameters of array type will be later implicitly adjusted to pointer type, C simply provides no special treatment for array declarations in function parameter lists. In other words, array parameter declarations are checked for validity before the aforementioned adjustment.
Your int a[][] is the same thing: an attempt to declare an array with elements of type int [], which is an incomplete type. Meanwhile, int (*a)[] is perfectly legal - there's nothing unusual about pointers to incomplete types.
As a side note, C++ "fixed" this issue, allowing arrays of incomplete type in parameter declarations. However, the original C++ still prohibits int a[][] parameters, int (&a)[] parameters and even int (*a)[] parameters. This was supposedly fixed/allowed later in C++17 (http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#393)
An incomplete type is allowed in contexts where the size doesn't need to be known.
With this declaration:
int a[][]
It is invalid even as a function parameter because the size of one array dimension is needed to know how to perform pointer arithmetic on the second dimension.
This however is valid:
int (*a)[];
Because the size of the array doesn't need to be known in order to use a pointer to it.
Section 6.2.7 of the C standard gives an example of a declaration like this:
5 EXAMPLE Given the following two file scope declarations:
int f(int (*)(), double (*)[3]);
int f(int (*)(char *), double (*)[]);
The resulting composite type for the function is:
int f(int (*)(char *), double (*)[3]);
This example shows a declaration of type double (*)[3] that is compatible with a declaration of type double (*)[]
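Because of that compatibility, one function taking int (*)[] can be handed pointers to arrays of different lengths. A minimal sketch, with first_of as a hypothetical name:

```c
/* a points to an int array of unspecified length; only element 0 is read here. */
static int first_of(int (*a)[])
{
    /* *a has the incomplete type int[], which still converts to int * for indexing. */
    return (*a)[0];
}
```

Both first_of(&two) with int two[2] and first_of(&three) with int three[3] are accepted without a cast. The function has no way of knowing the length, so anything beyond element 0 needs the size passed separately.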
You can't however directly use this like a 2D array because of the missing size. Here are some examples to illustrate. If you attempt to do this:
void foo(int n, int (*a)[])
{
int i,j;
for (i=0;i<n;i++) {
for (j=0;j<n;j++) {
printf("a[%d][%d]=%d\n",i,j,a[i][j]);
}
}
}
The compiler (as expected) tells you this:
error: invalid use of array with unspecified bounds
printf("a[%d][%d]=%d\n",i,j,a[i][j]);
^
You can get around this by taking advantage of the fact that an array, even of indeterminate size, decays to a pointer in most contexts:
#include <stdio.h>
void foo(int n, int (*a)[])
{
int i,j;
for (i=0;i<n;i++) {
// dereference "a", type is int[], which decays to int *
// now manually add "n" ints times the row
int *b = *a + (n*i);
for (j=0;j<n;j++) {
printf("a[%d][%d]=%d\n",i,j,b[j]);
}
}
}
int main()
{
int a[2][2] = { {4,5},{6,7} };
foo(2,a);
return 0;
}
This compiles clean with the following output:
a[0][0]=4
a[0][1]=5
a[1][0]=6
a[1][1]=7
Even outside of a function, the int (*)[] syntax can be used:
#include <stdio.h>
int main()
{
int a[2][2] = { {4,5},{6,7} };
int i,j,n=2;
int (*aa)[];
// "a" decays from int[2][2] to int (*)[2], assigned to int (*)[]
aa = a;
for (i=0;i<n;i++) {
int *b = *aa + (n*i);
for (j=0;j<n;j++) {
printf("a[%d][%d]=%d\n",i,j,b[j]);
}
}
return 0;
}
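Where variably modified parameter types are available (C99, optional in C11), an alternative to the manual arithmetic is to name the row length in the parameter type. This is a sketch of that variation, not part of the code above, and element_at is a hypothetical name:

```c
/* n is declared first so that it can size the rows of a. */
static int element_at(int n, int (*a)[n], int i, int j)
{
    /* a[i] is an int[n], so the compiler scales the row index by n itself. */
    return a[i][j];
}
```

With int a[2][2] = { {4,5},{6,7} }, element_at(2, a, 1, 0) reads 6 using ordinary two-dimensional indexing.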
EDIT
Having read through all the relevant parts of the standard, C11 6.7.6.2 and 6.7.6.3, I believe this is a compiler bug/non-conformance. It apparently boils down to the text that the committee sneaked into the middle of a paragraph concerning array delimiters. 6.7.6.2/1, emphasis mine:
In addition to optional type qualifiers and the keyword static, the [
and ] may delimit an expression or *. If they delimit an expression
(which specifies the size of an array), the expression shall have an
integer type. If the expression is a constant expression, it shall
have a value greater than zero. The element type shall not be an
incomplete or function type. The optional type qualifiers and the
keyword static shall appear only in a declaration of a function
parameter with an array type, and then only in the outermost array
type derivation.
Now this is of course very poorly written, basically it says
"peripheral feature of little interest, peripheral feature of little
interest, peripheral feature of little interest, OUT OF THE BLUE HERE COMES SOME ARRAY ELEMENT TYPE SPECIFICATION NOT RELATED TO THE REST OF THIS PARAGRAPH, peripheral feature of little interest, peripheral feature
of little interest,...."
So it is easy to misunderstand; it fooled me.
Meaning that int a[][] is always incorrect no matter where it is declared, since an array cannot be an array of incomplete type.
However, my original answer below raises some valid concerns regarding whether array decay should be done before or after the compiler decides if the type is incomplete or not.
Given the specific case void foo(int n, int a[][]); only, this is a function declaration. It is not a definition.
C11 6.7.6.3/12
If the function declarator is not part of a definition of that
function, parameters may have incomplete type
So first of all, parameters are allowed to have incomplete type in the function declaration. The standard is clear. Which is why code like this compiles just fine:
struct s; // incomplete type
void foo(int n, struct s a); // just fine, incomplete type is allowed in the declaration
Furthermore:
C11 6.7.6.3/4
After adjustment, the parameters in a parameter type list in a function declarator that is part of a definition of that function shall not have incomplete type.
After adjustment is very important here.
Meaning that after adjusting int a[][] to int (*a)[], the parameter shall not have incomplete type. It does not, it is a pointer to incomplete type, which is always allowed and perfectly fine.
The compiler is not allowed to first evaluate int a[][] as an incomplete array of incomplete arrays, and then later adjust it (if it found that the type was not incomplete). This would directly violate 6.7.6.3/4.

const and typedef of arrays in C

In C, it's possible to typedef an array, using this construction:
typedef int table_t[N];
Here, table_t is now defined as an array of N int. Any variable declared such as table_t t; will now behave as a normal array of int.
The point of such a construction is to be used as an argument type in a function, such as:
int doSomething(table_t t);
A relatively equivalent function prototype could have been :
int doSomething(int* t);
The merit of the first construction is that it enforces N as the size of the table. In many circumstances, it's safer to enforce this property, rather than relying on the programmer to properly figure out this condition.
Now it's all good, except that, in order to guarantee that the content of table will not be modified, it's necessary to use the const qualifier.
The following statement is relatively simple to understand:
int doSomething(const int* t);
Now, doSomething guarantees that it will not modify the content of the table passed as a pointer.
Now, what about this almost equivalent construction?
int doSomething(const table_t t);
What is const here? The content of the table, or the pointer to the table?
If it's the pointer which is const, is there another way (C90 compatible) to retain the ability to define the size of the table and to tell that its content will be const ?
Note that it's also necessary sometimes to modify the content of the table, so the const property cannot be embedded into the typedef definition.
[Edit] Thanks for the excellent answers received so far.
To summarize:
The initial assumption of typedef enforcing size N was completely wrong. It basically behaves the same as a normal pointer.
The const property will also behave the same as if it was a pointer (in stark contrast with a typedef to a pointer type, as underlined by #random below)
To enforce a size (which was not the initial question, but ends up being quite important now...), see Jonathan's answer
First, you are mistaken, the function prototypes
int doSomething(table_t t);
int doSomething(int* t);
are exactly equivalent. For function parameters, the first array dimension is always rewritten as a pointer. So there is no guarantee for the size of the array that is received.
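One way to check the equivalence is to declare a function with the array typedef and define it with a pointer. If the two parameter types differed, the compiler would report conflicting types. A small sketch, with table3 and head_of as hypothetical names:

```c
typedef int table3[3];   /* hypothetical name; same shape as table_t with N == 3 */

/* Declared with the array typedef... */
int head_of(table3 t);

/* ...and defined with a pointer. The two agree because both parameters
   have type int * after adjustment. */
int head_of(int *t)
{
    return t[0];
}
```

Calling head_of with a five-element int array also compiles without complaint, which shows that the 3 in the typedef is not enforced.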
const-qualification on arrays always applies to the base type of the array, so the two declarations
const table_t a;
int const a[N];
are equivalent, and for functions parameters we have
int doSomething(const table_t t);
int doSomething(int const* t);
The content of the table will be constant. Easily checked with this code.
#include<stdio.h>
typedef int table_t[3];
void doSomething(const table_t t)
{
t++; //No error, it's a non-const pointer.
t[1]=3; //Error, it's a pointer to const.
}
int main()
{
table_t t={1,2,3};
printf("%d %d %d %ld",t[0],t[1],t[2],(long)sizeof(t));
t[1]=5;
doSomething(t);
return 0;
}
Array types and pointer types are not 100% equivalent, even in this context where you do ultimately get a pointer type for the function parameter. Your mistake is in assuming that const would have acted the same way if it were a pointer type.
To expand on ARBY's example:
typedef int table_t[3];
typedef int *pointer_t;
void doSomething(const table_t t)
{
t++; //No error, it's a non-const pointer.
t[1]=3; //Error, it's a pointer to const.
}
void doSomethingElse(const pointer_t t)
{
t++; //Error, it's a const pointer.
t[1]=3; //No error, it's pointer to plain int
}
As a parameter, const table_t does act like const int *, but const pointer_t is instead equivalent to int * const.
(Also, a disclaimer: POSIX reserves names ending in _t for future expansion, so user code should avoid defining them.)
The merit of the first construction is that it enforces N as the size of the table.
I'm not sure what you mean here. In what contexts would it "enforce" it? If you declare a function as
int doSomething(table_t t);
array size will not be enforced. In order to enforce the size, you'd have to go a different route
int doSomething(table_t *t); // equivalent to 'int (*t)[N]'
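A minimal sketch of that pointer-to-array route, assuming N is 3; the function name sum_table is made up for illustration and does not come from the question:

```c
#include <assert.h>
#include <stddef.h>

#define N 3
typedef int table_t[N];

/* Takes a pointer to an array of exactly N ints. Because *t has array type
   (not pointer type), sizeof *t is the size of the whole array. */
int sum_table(table_t *t)
{
    int total = 0;
    size_t i;

    assert(t != NULL);
    for (i = 0; i < sizeof *t / sizeof (*t)[0]; i++)
        total += (*t)[i];
    return total;
}
```

A caller writes sum_table(&t) for a table_t t. Passing the address of an int[4] instead is an incompatible pointer type, which the compiler has to diagnose.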
What is const here ?
As for const... When const is applied to array type it "drops down" all the way to array elements. This means that const table_t is an array of constant ints, i.e. it is equivalent to const int [N] type. The end result of this is that the array becomes non-modifiable. In function parameter declaration context const table_t will be converted into const int *.
However, note one peculiar detail that is not immediately obvious in this case: the array type itself remains non-const-qualified. It is the individual elements that become const. In fact, it is impossible to const-qualify the array type itself in C. Any attempt to do so makes the const-qualification "sift down" to the individual elements.
This peculiarity leads to rather unpleasant consequences in array const-correctness. For example, this code will not compile in C
table_t t;
const table_t *pt = &t;
even though it looks quite innocent from the const-correctness point of view and will compile for any non-array object type. The C++ language updated its const-correctness rules to resolve this issue, while C continues to stick to its old ways.
The standard 6.7.6.3 says:
A declaration of a parameter as ‘‘array of type’’ shall be adjusted to ‘‘qualified pointer to type’’
Meaning that when you declare a function parameter as a const int array type, it decays into a pointer to const int (first element in array). Equivalent to const int* in this case.
Also note that because of the above mentioned rule, the array size specified adds no additional type safety! This is one big flaw in the C language, but that's how it is.
Still, it is good practice to declare the array with fixed width like you have, because static analysers or clever compilers may produce a diagnostic about different types.
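To illustrate that the declared size is not enforced, here is a small sketch (both function names are hypothetical) in which a five-element array is passed to a parameter declared with size 3 and the language requires no diagnostic:

```c
#include <assert.h>
#include <stddef.h>

/* The [3] is documentation only: the parameter is adjusted to const int *. */
int sum_first_three(const int t[3])
{
    assert(t != NULL);
    return t[0] + t[1] + t[2];
}

/* Passes an array of 5 ints where "3" was declared; this is accepted. */
int demo_size_not_checked(void)
{
    int five[5] = {1, 2, 3, 4, 5};
    return sum_first_three(five);
}
```

Some compilers and static analysers may still warn in the opposite case, when the array passed is visibly smaller than the declared size.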

Why is function(char * array[]) a valid function definition but not function(char (*array)[]) in C?

I think that it is because the former is an array of pointers to char and the latter is a pointer to an array of chars, and we need to properly specify the size of the object being pointed to for our function definition. In the former,
function(char * p_array[])
the size of the object being pointed to is already included (it's a pointer to char), but the latter
function(char (*p_array)[])
needs the size of the array p_array points to as part of p_array's definition?
I'm at the stage where I've been thinking about this for too long and have just confused myself, someone please let me know if my reasoning is correct.
Both are valid in C but not C++. You would ordinarily be correct:
char *x[]; // array of pointers to char
char (*y)[]; // pointer to array of char
However, the arrays decay to pointers if they appear as function parameters. So they become:
char **x; // Changes to pointer to pointer to char
char (*y)[]; // No decay, since it's NOT an array, it's a pointer to an array
In an array type in C, one of the sizes is permitted to be unspecified. This must be the leftmost one (whoops, I said rightmost at first). So,
int valid_array[][5]; // Ok
int invalid_array[5][]; // Wrong
(You can chain them... but we seldom have reason to do so...)
int (*convoluted_array[][5])[][10];
There is a catch, and the catch is that an array type with [] in it is an incomplete type. You can pass around a pointer to an incomplete type but certain operations will not work, as they need a complete type. For example, this will not work:
void func(int (*x)[])
{
x[2][5] = 900; // Error
}
This is an error because in order to find the address of x[2], the compiler needs to know how big x[0] and x[1] are. But x[0] and x[1] have type int [] -- an incomplete type with no information about how big it is. This becomes clearer if you imagine what the "un-decayed" version of the type would be, which is int x[][] -- obviously invalid C. If you want to pass a two-dimensional array around in C, you have a few options:
Pass a one-dimensional array with a size parameter.
void func(int n, int x[])
{
x[2*n + 5] = 900;
}
Use an array of pointers to rows. This is somewhat clunky if you have genuine 2D data.
void func(int *x[])
{
x[2][5] = 900;
}
Use a fixed size.
void func(int x[][5])
{
x[2][4] = 900; // column index must be less than 5
}
Use a variable length array (C99; optional since C11, and not supported by Microsoft's compiler).
// There's some funny syntax if you want 'x' before 'width'
void func(int n, int x[][n])
{
x[2][5] = 900;
}
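As a rough sketch of the variable length array option with both dimensions passed in (assuming a compiler with VLA support; the name fill_grid is invented for this example):

```c
#include <assert.h>
#include <stddef.h>

/* rows and cols come first so they are in scope for the array type.
   x is adjusted to int (*)[cols], so x[i][j] uses cols as the row stride. */
void fill_grid(size_t rows, size_t cols, int x[rows][cols])
{
    assert(x != NULL);
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            x[i][j] = (int)(i * cols + j);   /* row-major position */
}
```

Called as fill_grid(3, 6, g) for an int g[3][6], this writes to the same elements that manual offset arithmetic on a flat array would reach, without the caller or callee computing offsets by hand.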
This is a frequent problem area even for C veterans. Many languages lack intrinsic "out-of-the-box" support for real, variable size, multidimensional arrays (C++, Java, Python) although a few languages do have it (Common Lisp, Haskell, Fortran). You'll see a lot of code that uses arrays of arrays or that calculates array offsets manually.
NOTE:
The answer below was added when the question was tagged C++, and it answers from a C++ perspective. With the tag changed to C only, both of the mentioned samples are valid in C.
Yes, Your reasoning is correct.
If you try compiling it, the error given by the compiler is:
parameter ‘p_array’ includes pointer to array of unknown bound ‘char []’
In C++, array sizes need to be fixed at compile time. The C++ standard forbids variable length arrays (VLAs) as well. Some compilers support them as an extension, but that is not standard-conforming.
Those two declarations are very different. In a function parameter declaration, a declarator of [] directly applied to the parameter name is completely equivalent to a *, so your first declaration is exactly the same in all respects as this:
function(char **p_array);
However, this does not apply recursively to parameter types. Your second parameter has type char (*)[], which is a pointer to an array of unknown size - it is a pointer to an incomplete type. You can happily declare variables with this type - the following is a valid variable declaration:
char (*p_array)[];
Just like a pointer to any other incomplete type, you cannot perform any pointer arithmetic on this variable (or your function parameter) - that's where your error arises. Note that the [] operator is specified as a[i] being identical to *(a+i), so that operator cannot be applied to your pointer. You can, of course, happily use it as a pointer, so this is valid:
void function(char (*p_array)[])
{
printf("p_array = %p\n", (void *)p_array);
}
This type is also compatible with a pointer to any other fixed-size array of char, so you can also do this:
void function(char (*p_array)[])
{
char (*p_a_10)[10] = p_array;
puts(*p_a_10);
}
...and even this:
void function(char (*p_array)[])
{
puts(*p_array);
}
(though there is precious little point in doing so: you might as well just declare the parameter with type char *).
Note that although *p_array is allowed, p_array[0] is not.
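A short sketch of that distinction, valid as C but not as C++ (the function names length_of and first_char are hypothetical): dereferencing first gives an array that converts to char *, which can then be indexed freely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* *p has type char[], which converts to char *, so it can be handed to
   string functions even though p itself cannot be indexed. */
size_t length_of(char (*p)[])
{
    assert(p != NULL);
    return strlen(*p);
}

char first_char(char (*p)[])
{
    assert(p != NULL);
    return (*p)[0];   /* fine; p[0] would need the size of *p */
}
```

For a char buf[10] holding a string, both functions can be called with &buf, since char (*)[10] is compatible with char (*)[].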
Because,
(1) function(char * p_array[])
is equivalent to char **p_array; i.e. a double pointer which is valid.
(2) function(char (*p_array)[])
You are right that p_array is a pointer to a char array. In C++, though, that array needs a fixed size when the pointer appears as a function parameter; provide the size and the declaration becomes valid there as well. In C the unsized form is already accepted.

Strange warning in a C function const multidimensional-array argument

I'm getting some strange warnings about this code:
typedef double mat4[4][4];
void mprod4(mat4 r, const mat4 a, const mat4 b)
{
/* yes, function is empty */
}
int main()
{
mat4 mr, ma, mb;
mprod4(mr, ma, mb);
}
gcc output as follows:
$ gcc -o test test.c
test.c: In function 'main':
test.c:13: warning: passing argument 2 of 'mprod4' from incompatible pointer type
test.c:4: note: expected 'const double (*)[4]' but argument is of type 'double (*)[4]'
test.c:13: warning: passing argument 3 of 'mprod4' from incompatible pointer type
test.c:4: note: expected 'const double (*)[4]' but argument is of type 'double (*)[4]'
If I define the function as:
void mprod4(mat4 r, mat4 a, mat4 b)
{
}
Or defining matrices in main as:
mat4 mr;
const mat4 ma;
const mat4 mb;
Or call the function in main as:
mprod4(mr, (const double(*)[4])ma, (const double(*)[4])mb);
Or even defining mat4 as:
typedef double mat4[16];
Makes the warning go away. What is happening here? Am I doing something invalid?
The gcc version is 4.4.3, if relevant.
I also posted on gcc bugzilla: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47143
My current workaround is making ugly macros that cast stuff for me:
#ifndef _NO_UGLY_MATRIX_MACROS
#define mprod4(r, a, b) mprod4(r, (const double(*)[4])(a), (const double(*)[4])(b))
#endif
Answer from Joseph S. Myers on gcc bugzilla:
Not a bug. The function parameters are of type "pointer to array[4] of const double" because const on an array type applies to the element type, recursively, and then the outermost array type, only, of a parameter of array type decays to a pointer, and the arguments passed are of type "pointer to array[4] of double" after array-to-pointer decay, and the only case where qualifiers are permitted to be added in assignment, argument passing etc. is qualifiers on the immediate pointer target, not those nested more deeply.
Sounds pretty confusing to me, like the function expects:
pointer to array[4] of const doubles
and we are passing
pointer to const array[4] of doubles
instead.
Or would it be the inverse? The warnings suggest that the function expects a:
const double (*)[4]
which seems to me more like a
pointer to const array[4] of doubles
I'm really confused with this answer. Could somebody who understands what he said clarify and exemplify?
I believe the problem is the constraints specified in C99 6.5.16.1(1), which seem to prohibit mixing qualifications in assignments, except for pointers for which an inclusive-qualifier exception is defined. The problem is that with indirect pointers, you end up passing a pointer to one thing to a pointer to another. The assignment isn't valid because, if it was, you could fool it into modifying a const-qualified object with the following code:
const char **cpp;
char *p;
const char c = 'A';
cpp = &p; // constraint violation
*cpp = &c; // valid
*p = 0; // valid by itself, but would clobber c
It might seem reasonable that cpp, which promises not to modify any chars, might be assigned a pointer to an object pointing at non-qualified chars. After all, that's allowed for single-indirect pointers, which is why, e.g., you can pass a mutable object to the second parameter of strcpy(3), the first parameter to strchr(3), and many other parameters that are declared with const.
But with the indirect pointer, at the next level, assignment from a qualified pointer is allowed, and now a perfectly unqualified pointer assignment will clobber a qualified object.
I don't immediately see how a 2-D array could lead to this situation, but in any case it hits the same constraint in the standard.
Since in your case, you aren't actually tricking it into clobbering a const, the right thing for your code would seem to be inserting the cast.
Update: OK guys, as it happens this issue is in the C faq, and this entire discussion has also taken place several times on the gcc bug list and on the gcc mailing list.
Gcc bug list: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20230.
C FAQ: it's question 11.10: http://c-faq.com/ansi/constmismatch.html
The lesson: you can pass a T *x when const T *x is expected, by explicit exception, but T *x and const T *x are still distinct types, so you can't pass a pointer to either one to a pointer to the other.
To explain what Joseph said: the function is expecting a pointer to array[4] of const double to be passed in, but you're passing in a pointer to array[4] of double. These types are not compatible, so you get a diagnostic (a warning, in gcc's case). They look like they should be compatible, but they're not.
For the purposes of passing parameters to functions (or for variable assignments), you can always convert an X to a const X, or a pointer to X to a pointer to const X for any type X. For example:
int x1 = 0;
const int x2 = x1; // ok
int *x3 = &x1;
const int *x4 = x3; // ok: convert "pointer to int" to "pointer to const int"
int **x5 = &x3;
const int **x6 = x5; // ERROR: see DigitalRoss's answer
int *const *x7 = x5; // ok: convert "pointer to (pointer to int)" to
// "pointer to const (pointer to int)"
You're only allowed to add qualifiers (that is, the const, volatile, and restrict qualifiers) to the first level of pointers. You can't add them to higher levels of pointers because, as DigitalRoss mentioned, doing so would allow you to accidentally violate const-correctness. This is what Joseph means by "the only case where qualifiers are permitted to be added in assignment, argument passing etc. is qualifiers on the immediate pointer target, not those nested more deeply."
So, bringing us back to Joseph's response, you can't convert a pointer to array[4] of double to a pointer to array[4] of const double because there is no type X such that you're converting from pointer to X to pointer to const X.
If you try using array[4] of double for X, you'd see that the conversion would yield pointer to const array[4] of double, which is a different type from the one the function expects. Moreover, no such type exists in C: you can have an array of a const type, but there is no such thing as a const array.
Hence, there's no way to perfectly solve your problem. You'll have to either add casts to all of your function calls (either manually or via a macro or helper function), rewrite your functions to not take const parameters (bad since it doesn't let you pass in const matrices), or change the mat4 type to be either a 1-dimensional array or a structure, as user502515 suggested.
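As one possible shape for the helper-function variant, here is a sketch that keeps the single required cast in one place. The names const_rows4, as_const_mat4 and trace4 are assumptions made for this example, not part of the question's code:

```c
#include <assert.h>
#include <stddef.h>

typedef double mat4[4][4];
typedef const double (*const_rows4)[4];   /* pointer to array[4] of const double */

/* Hypothetical helper: centralises the one cast that C requires here. */
static inline const_rows4 as_const_mat4(double (*m)[4])
{
    return (const_rows4)m;
}

/* Reads the matrix through the const-qualified view only. */
double trace4(const mat4 m)
{
    assert(m != NULL);
    return m[0][0] + m[1][1] + m[2][2] + m[3][3];
}
```

Unlike a casting macro, the helper still type-checks its argument: anything that is not a double (*)[4] is rejected before the cast is applied.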
A practical solution is to wrap the matrix in a struct. This avoids the adjustment of double[4][4] to the somewhat awkward double (*)[4], and const works intuitively, while the same amount of memory is used:
struct mat4 {
double m[4][4];
};
void myfunc(struct mat4 *r, const struct mat4 *a, const struct mat4 *b)
{
}
int main(void)
{
struct mat4 mr, ma, mb;
myfunc(&mr, &ma, &mb);
}
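To show the struct approach doing real work, here is a sketch of a 4x4 product; the name mat4_mul and the no-aliasing requirement are choices made for this example. Non-const matrices can be passed for the const parameters without any cast:

```c
#include <assert.h>
#include <stddef.h>

struct mat4 {
    double m[4][4];
};

/* r = a * b. r must not alias a or b, because r is written while they are read. */
void mat4_mul(struct mat4 *r, const struct mat4 *a, const struct mat4 *b)
{
    assert(r != NULL && a != NULL && b != NULL);
    assert(r != a && r != b);
    for (int i = 0; i < 4; i++) {
        for (int j = 0; j < 4; j++) {
            double sum = 0.0;
            for (int k = 0; k < 4; k++)
                sum += a->m[i][k] * b->m[k][j];
            r->m[i][j] = sum;
        }
    }
}
```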
I think in C99, you can do this, but I'm not sure it will help:
void mprod4(double mr[4][4], double ma[const 4][4], double mb[const 4][4])
{
}
I haven't got a C99 compiler handy but I remember reading something in the C99 specification regarding qualifiers within the [] for arrays as arguments. You can also put static in there (e.g. ma[static 4]) but of course that means something else.
Edit
Here it is, section 6.7.5.3 paragraph 7.
A declaration of a parameter as “array of type” shall be adjusted to “qualified pointer to type”, where the type qualifiers (if any) are those specified within the [ and ] of the array type derivation. If the keyword static also appears within the [ and ] of the array type derivation, then for each call to the function, the value of the corresponding actual argument shall provide access to the first element of an array with at least as many elements as specified by the size expression.
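A small sketch of the static form described in that paragraph (the function name sum_prefix is hypothetical). Note that static only promises a minimum element count; constness of the elements still has to be written on the element type:

```c
#include <assert.h>
#include <stddef.h>

/* [static 4] tells the compiler that callers provide at least 4 elements.
   The const on the element type is what makes the elements read-only. */
double sum_prefix(size_t n, const double v[static 4])
{
    assert(n <= 4);   /* only 4 elements are promised by the declaration */
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}
```

Passing a null pointer or a shorter array breaks that promise and is undefined behaviour; some compilers warn when they can see the violation at the call site.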
Here's a problem (IMHO): double[4][4] in a function signature.
You know it's a double[4][4], but the compiler sees double(*)[4] in the function parameter list, which notably has no constraint on the outer array size. It turns your 2D array of 4 by 4 objects into a pointer to a 1D array of 4 objects, and the pointer can be validly indexed as if it were an array of 4 objects.
I would pass all mat4 objects by pointer:
void mprod4(mat4 *r, const mat4 *a, const mat4 *b);
// and, if you don't want to clutter every call site
#define mprod4(r, a, b) (mprod4)(&(r), (const mat4 *)&(a), (const mat4 *)&(b))
This will (I believe) ensure const correctness and array size correctness. It may make mprod4 a bit harder to write, and still involves some hairy casts, but it'll (IMHO) be worth it (especially after the macro above):
// optional helper that hides the dereference
#define idx(m, i, j) ((*(m))[i][j])
void mprod4(mat4 *r, const mat4 *a, const mat4 *b)
{
// all indexing of the matrices must be done after dereference
for(int i = 0; i < 4; i++) for(int j = 0; j < 4; j++)
{
(*r)[i][j] = (*a)[i][j] * (*b)[i][j];
// or, equivalently, with the helper macro:
// idx(r, i, j) = idx(a, i, j) * idx(b, i, j);
}
}
It may look a bit bad when you write it, but I think it'll be cleaner type-wise. (Maybe I've been thinking C++ too much...)
Compiler is just being anal.
You're passing an argument that is essentially a non-const pointer, and the function is declared to accept a const pointer as an argument. These two are, in fact, incompatible. It is not a real problem because the compiler is still supposed to work as long as you can assign the value of the first type to the variable of the second type. Hence a warning but not an error.
EDIT: it looks like gcc does not complain about other non-const to const conversions, e.g. passing char* where a const char* is expected. In this case, I'm inclined to agree that Joseph Myers from Bugzilla is correct.

Resources