Functions with multi-dimensional arrays as formal parameters - arrays

I am trying to figure out why it is that the signature of functions with multi-dimensional arrays as formal parameters have the first dimension as unsized, while the others are not. Actually the answer to the second part of the aforementioned statement is clear: without passing the dimension information, the compiler will not know how the multi-dimensional array will be organized in memory. What bothers me is the inconsistency.
Why not require all dimensions to be explicitly specified (including the first dimension)?
I have come up with a theory and I want to see if that is correct or not.
The most common usage of an array is a 1D array. Given that the array name is a pointer to the first element of the array, Dennis Ritchie wanted to have the exact same signature for 1D arrays, whether the array syntax was used, or the pointer syntax was used.
With the pointer syntax it was impossible to know how many elements to process in the function, without specifying the size information as well. So Dennis Ritchie forced the same signature for array syntax also by having the array be unsized (and even have the compiler completely ignore the size information for the first dimension, if it is provided). In other words you will have one formal parameter be a pointer, or an unsized array, and the second formal parameter the array size.

When you pass an array it decays to a pointer to the first element. This is a for convenience. Without it, passing a string literal to a function like this:
void foo(const char* str) {}
would be cumbersome:
foo("Hello world"); // the const char[12] decays into ...
foo(&"Hello world"[0]); // what would have to be written like this without the decay
If you want all dimensions to be specified, just take the address of the array and you'll get a pointer to a single element - with all dimensions specified.
Example. This function takes a pointer to a int[2][10]:
void f2d2(int (*)[2][10]) {}
int a2d[2][10];
f2d2(&a2d); // and you call it like this
In C++ you can prevent the array decaying into a pointer to the first element by taking the array by reference.
void f2d2(int (&)[2][10]) {}
int a2d[2][10];
f2d2(a2d); // no decay

There are two quotes from the C Standard that makes using arrays more clear.
The first one is (6.3.2.1 Lvalues, arrays, and function designators)
3 Except when it is the operand of the sizeof operator or the unary &
operator, or is a string literal used to initialize an array, an
expression that has type ‘‘array of type’’ is converted to an
expression with type ‘‘pointer to type’’ that points to the initial
element of the array object and is not an lvalue. If the array object
has register storage class, the behavior is undefined.
And the second one is (6.7.6.3 Function declarators (including prototypes))
7 A declaration of a parameter as ‘‘array of type’’ shall be adjusted
to ‘‘qualified pointer to type’’, where the type qualifiers (if any)
are those specified within the [ and ] of the array type derivation.
If the keyword static also appears within the [ and ] of the array
type derivation, then for each call to the function, the value of the
corresponding actual argument shall provide access to the first
element of an array with at least as many elements as specified by the
size expression.
What does this mean relative to function declaration?
If you declared a function like for example
void f( int a[100] );
then the compiler will adjust the function parameter the following way
void f( int *a );
So for example these function declarations are equivalent
void f( int a[100] );
void f( int a[10] );
void f( int a[1] );
void f( int a[] );
void f( int *a );
and declare the same one function. You nay even include all these declarations in your program though the compiler can issue a message that there are redundant declarations.
Within the function the variable a has the the pointer type int *.
On the other hand, you may call the function passing arrays of different sizes. Arrays designators will be implicitly converted by the compiler to pointers to their first element. You even may pass a scalar object through a pointer to it.
So these calls of the function are all correct
int a[100];
f( a );
int a[10];
f( a );
int a[1];
f( a );
int a;
f( &a );
As a result the function has no information what array was used as an argument. So you need to declare the second function parameter that will specify the size of the passed array (if the function does not rely on a sentinel value present in the array)
For example
void f( int a[], size_t n );
If you have a multidimensional array like this
T a[N1][N2][N3]...[Nn];
then a pointer to its first element will have the type
T ( *a )[N2][N3]...[Nn];
So a function declared like
void f( T a[N1][N2][N3]...[Nn] );
is equivalent to
void f( T ( *a )[N2][N3]...[Nn] );
If there N2, N3,..Nn are integer constant expressions then the array is not a variable length array. Otherwise it is a variable length array and the function may be declared like (when the declaration os not a part of the function definition)
void f( T ( *a )[*][*]...[*] );
For such a function there is also a problem of determining of sizes of sub-arrays. So you need to declare parameters that will specify the sizes.
For example
void f( size_t n1, size_t n2, size_t n3, ..., size_t nn, T a[][n2][n3]...[nn] );
As for C++ then variable length arrays is not a standard C++ feature. Also you can declare a function parameter as having a referenced type.
For example
void f( int ( &a )[10] );
within the function in this case a is not a pointer. It denotes an array and the expression sizeof( a ) will yield the size of the whole array instead of the size of pointer.

It's not possible to pass arrays by value in C. You can only pass a pointer, and then in the function, index from the pointer to access the contents of the array.
Accordingly; in a function declaration, if you specify parameter with array type, that parameter is adjusted to pointer type before any further analysis. For example:
You write: void f( int x[][5] );
The compiler sees: void f( int (*x)[5] );
The language would be equally functional if the first form were outlawed, i.e. you had to define the parameter with the pointer version. It is just syntactic sugar -- which as it turns out, causes a lot of confusion to newbies but that's another story.
So to answer your question -- all array dimensions after adjustment must be specified , because otherwise the compiler doesn't know how to index the memory when accessing the array via the pointer argument passed. And it should now be clear that the faux first "dimension" is irrelevant because it is syntactic sugar that is immediately discarded.

Related

Why pass by reference works also without '&'? [duplicate]

What is array to pointer decay? Is there any relation to array pointers?
It's said that arrays "decay" into pointers. A C++ array declared as int numbers [5] cannot be re-pointed, i.e. you can't say numbers = 0x5a5aff23. More importantly the term decay signifies loss of type and dimension; numbers decay into int* by losing the dimension information (count 5) and the type is not int [5] any more. Look here for cases where the decay doesn't happen.
If you're passing an array by value, what you're really doing is copying a pointer - a pointer to the array's first element is copied to the parameter (whose type should also be a pointer the array element's type). This works due to array's decaying nature; once decayed, sizeof no longer gives the complete array's size, because it essentially becomes a pointer. This is why it's preferred (among other reasons) to pass by reference or pointer.
Three ways to pass in an array1:
void by_value(const T* array) // const T array[] means the same
void by_pointer(const T (*array)[U])
void by_reference(const T (&array)[U])
The last two will give proper sizeof info, while the first one won't since the array argument has decayed to be assigned to the parameter.
1 The constant U should be known at compile-time.
Arrays are basically the same as pointers in C/C++, but not quite. Once you convert an array:
const int a[] = { 2, 3, 5, 7, 11 };
into a pointer (which works without casting, and therefore can happen unexpectedly in some cases):
const int* p = a;
you lose the ability of the sizeof operator to count elements in the array:
assert( sizeof(p) != sizeof(a) ); // sizes are not equal
This lost ability is referred to as "decay".
For more details, check out this article about array decay.
Here's what the standard says (C99 6.3.2.1/3 - Other operands - Lvalues, arrays, and function designators):
Except when it is the operand of the sizeof operator or the unary & operator, or is a
string literal used to initialize an array, an expression that has type ‘‘array of type’’ is
converted to an expression with type ‘‘pointer to type’’ that points to the initial element of
the array object and is not an lvalue.
This means that pretty much anytime the array name is used in an expression, it is automatically converted to a pointer to the 1st item in the array.
Note that function names act in a similar way, but function pointers are used far less and in a much more specialized way that it doesn't cause nearly as much confusion as the automatic conversion of array names to pointers.
The C++ standard (4.2 Array-to-pointer conversion) loosens the conversion requirement to (emphasis mine):
An lvalue or rvalue of type “array of N T” or “array of unknown bound of T” can be converted to an rvalue
of type “pointer to T.”
So the conversion doesn't have to happen like it pretty much always does in C (this lets functions overload or templates match on the array type).
This is also why in C you should avoid using array parameters in function prototypes/definitions (in my opinion - I'm not sure if there's any general agreement). They cause confusion and are a fiction anyway - use pointer parameters and the confusion might not go away entirely, but at least the parameter declaration isn't lying.
"Decay" refers to the implicit conversion of an expression from an array type to a pointer type. In most contexts, when the compiler sees an array expression it converts the type of the expression from "N-element array of T" to "pointer to T" and sets the value of the expression to the address of the first element of the array. The exceptions to this rule are when an array is an operand of either the sizeof or & operators, or the array is a string literal being used as an initializer in a declaration.
Assume the following code:
char a[80];
strcpy(a, "This is a test");
The expression a is of type "80-element array of char" and the expression "This is a test" is of type "15-element array of char" (in C; in C++ string literals are arrays of const char). However, in the call to strcpy(), neither expression is an operand of sizeof or &, so their types are implicitly converted to "pointer to char", and their values are set to the address of the first element in each. What strcpy() receives are not arrays, but pointers, as seen in its prototype:
char *strcpy(char *dest, const char *src);
This is not the same thing as an array pointer. For example:
char a[80];
char *ptr_to_first_element = a;
char (*ptr_to_array)[80] = &a;
Both ptr_to_first_element and ptr_to_array have the same value; the base address of a. However, they are different types and are treated differently, as shown below:
a[i] == ptr_to_first_element[i] == (*ptr_to_array)[i] != *ptr_to_array[i] != ptr_to_array[i]
Remember that the expression a[i] is interpreted as *(a+i) (which only works if the array type is converted to a pointer type), so both a[i] and ptr_to_first_element[i] work the same. The expression (*ptr_to_array)[i] is interpreted as *(*a+i). The expressions *ptr_to_array[i] and ptr_to_array[i] may lead to compiler warnings or errors depending on the context; they'll definitely do the wrong thing if you're expecting them to evaluate to a[i].
sizeof a == sizeof *ptr_to_array == 80
Again, when an array is an operand of sizeof, it's not converted to a pointer type.
sizeof *ptr_to_first_element == sizeof (char) == 1
sizeof ptr_to_first_element == sizeof (char *) == whatever the pointer size
is on your platform
ptr_to_first_element is a simple pointer to char.
Arrays, in C, have no value.
Wherever the value of an object is expected but the object is an array, the address of its first element is used instead, with type pointer to (type of array elements).
In a function, all parameters are passed by value (arrays are no exception). When you pass an array in a function it "decays into a pointer" (sic); when you compare an array to something else, again it "decays into a pointer" (sic); ...
void foo(int arr[]);
Function foo expects the value of an array. But, in C, arrays have no value! So foo gets instead the address of the first element of the array.
int arr[5];
int *ip = &(arr[1]);
if (arr == ip) { /* something; */ }
In the comparison above, arr has no value, so it becomes a pointer. It becomes a pointer to int. That pointer can be compared with the variable ip.
In the array indexing syntax you are used to seeing, again, the arr is 'decayed to a pointer'
arr[42];
/* same as *(arr + 42); */
/* same as *(&(arr[0]) + 42); */
The only times an array doesn't decay into a pointer are when it is the operand of the sizeof operator, or the & operator (the 'address of' operator), or as a string literal used to initialize a character array.
It's when array rots and is being pointed at ;-)
Actually, it's just that if you want to pass an array somewhere, but the pointer is passed instead (because who the hell would pass the whole array for you), people say that poor array decayed to pointer.
Array decaying means that, when an array is passed as a parameter to a function, it's treated identically to ("decays to") a pointer.
void do_something(int *array) {
// We don't know how big array is here, because it's decayed to a pointer.
printf("%i\n", sizeof(array)); // always prints 4 on a 32-bit machine
}
int main (int argc, char **argv) {
int a[10];
int b[20];
int *c;
printf("%zu\n", sizeof(a)); //prints 40 on a 32-bit machine
printf("%zu\n", sizeof(b)); //prints 80 on a 32-bit machine
printf("%zu\n", sizeof(c)); //prints 4 on a 32-bit machine
do_something(a);
do_something(b);
do_something(c);
}
There are two complications or exceptions to the above.
First, when dealing with multidimensional arrays in C and C++, only the first dimension is lost. This is because arrays are layed out contiguously in memory, so the compiler must know all but the first dimension to be able to calculate offsets into that block of memory.
void do_something(int array[][10])
{
// We don't know how big the first dimension is.
}
int main(int argc, char *argv[]) {
int a[5][10];
int b[20][10];
do_something(a);
do_something(b);
return 0;
}
Second, in C++, you can use templates to deduce the size of arrays. Microsoft uses this for the C++ versions of Secure CRT functions like strcpy_s, and you can use a similar trick to reliably get the number of elements in an array.
tl;dr: When you use an array you've defined, you'll actually be using a pointer to its first element.
Thus:
When you write arr[idx] you're really just saying *(arr + idx).
functions never really take arrays as parameters, only pointers - either directly, when you specify an array parameter, or indirectly, if you pass a reference to an array.
Sort-of exceptions to this rule:
You can pass fixed-length arrays to functions within a struct.
sizeof() gives the size taken up by the array, not the size of a pointer.
Try this code
void f(double a[10]) {
printf("in function: %d", sizeof(a));
printf("pointer size: %d\n", sizeof(double *));
}
int main() {
double a[10];
printf("in main: %d", sizeof(a));
f(a);
}
and you will see that the size of the array inside the function is not equal to the size of the array in main, but it is equal to the size of a pointer.
You probably heard that "arrays are pointers", but, this is not exactly true (the sizeof inside main prints the correct size). However, when passed, the array decays to pointer. That is, regardless of what the syntax shows, you actually pass a pointer, and the function actually receives a pointer.
In this case, the definition void f(double a[10] is implicitly transformed by the compiler to void f(double *a). You could have equivalently declared the function argument directly as *a. You could have even written a[100] or a[1], instead of a[10], since it is never actually compiled that way (however, you shouldn't do it obviously, it would confuse the reader).
Arrays are automatically passed by pointer in C. The rationale behind it can only be speculated.
int a[5], int *a and int (*a)[5] are all glorified addresses meaning that the compiler treats arithmetic and deference operators on them differently depending on the type, so when they refer to the same address they are not treated the same by the compiler. int a[5] is different to the other 2 in that the address is implicit and does not manifest on the stack or the executable as part of the array itself, it is only used by the compiler to resolve certain arithmetic operations, like taking its address or pointer arithmetic. int a[5] is therefore an array as well as an implicit address, but as soon as you talk about the address itself and place it on the stack, the address itself is no longer an array, and can only be a pointer to an array or a decayed array i.e. a pointer to the first member of the array.
For instance, on int (*a)[5], the first dereference on a will produce an int * (so the same address, just a different type, and note not int a[5]), and pointer arithmetic on a i.e. a+1 or *(a+1) will be in terms of the size of an array of 5 ints (which is the data type it points to), and the second dereference will produce the int. On int a[5] however, the first dereference will produce the int and the pointer arithmetic will be in terms of the size of an int.
To a function, you can only pass int * and int (*)[5], and the function casts it to whatever the parameter type is, so within the function you have a choice whether to treat an address that is being passed as a decayed array or a pointer to an array (where the function has to specify the size of the array being passed). If you pass a to a function and a is defined int a[5], then as a resolves to an address, you are passing an address, and an address can only be a pointer type. In the function, the parameter it accesses is then an address on the stack or in a register, which can only be a pointer type and not an array type -- this is because it's an actual address on the stack and is therefore clearly not the array itself.
You lose the size of the array because the type of the parameter, being an address, is a pointer and not an array, which does not have an array size, as can be seen when using sizeof, which works on the type of the value being passed to it. The parameter type int a[5] instead of int *a is allowed but is treated as int * instead of disallowing it outright, though it should be disallowed, because it is misleading, because it makes you think that the size information can be used, but you can only do this by casting it to int (*a)[5], and of course, the function has to specify the size of the array because there is no way to pass the size of the array because the size of the array needs to be a compile-time constant.
I might be so bold to think there are four (4) ways to pass an array as the function argument. Also here is the short but working code for your perusal.
#include <iostream>
#include <string>
#include <vector>
#include <cassert>
using namespace std;
// test data
// notice native array init with no copy aka "="
// not possible in C
const char* specimen[]{ __TIME__, __DATE__, __TIMESTAMP__ };
// ONE
// simple, dangerous and useless
template<typename T>
void as_pointer(const T* array) {
// a pointer
assert(array != nullptr);
} ;
// TWO
// for above const T array[] means the same
// but and also , minimum array size indication might be given too
// this also does not stop the array decay into T *
// thus size information is lost
template<typename T>
void by_value_no_size(const T array[0xFF]) {
// decayed to a pointer
assert( array != nullptr );
}
// THREE
// size information is preserved
// but pointer is asked for
template<typename T, size_t N>
void pointer_to_array(const T (*array)[N])
{
// dealing with native pointer
assert( array != nullptr );
}
// FOUR
// no C equivalent
// array by reference
// size is preserved
template<typename T, size_t N>
void reference_to_array(const T (&array)[N])
{
// array is not a pointer here
// it is (almost) a container
// most of the std:: lib algorithms
// do work on array reference, for example
// range for requires std::begin() and std::end()
// on the type passed as range to iterate over
for (auto && elem : array )
{
cout << endl << elem ;
}
}
int main()
{
// ONE
as_pointer(specimen);
// TWO
by_value_no_size(specimen);
// THREE
pointer_to_array(&specimen);
// FOUR
reference_to_array( specimen ) ;
}
I might also think this shows the superiority of C++ vs C. At least in reference (pun intended) of passing an array by reference.
Of course there are extremely strict projects with no heap allocation, no exceptions and no std:: lib. C++ native array handling is mission critical language feature, one might say.

passing multi-dimensional arrays as function arguments

As we know, when passing arrays as function arguments, only the first dimension's size can be empty, the others must be specified.
void my_function(int arr[5][10][15]); // OKAY!
void my_function(int arr[][10][15]); // OKAY!
void my_function(int arr[][][15]); // WRONG!!!
void my_function(int arr[][][]); // WRONG!!!
What is logic behind this? Could someone explain the main reason?
Passing arrays to a function is an illusion: arrays instantly decay to a pointer when you use them. That is, with this example:
int foo[10];
foo[1];
the way this is interpreted by the compiler, in foo[1], foo is first transformed to a pointer to element 0 of foo, and then subscripting is the same as *(pointer_to_foo + 1).
However, this is only possible with the first dimension of an array. Suppose this:
int foo[5][5];
This puts 25 integer elements contiguously in memory. It is not the same as int** foo, and not convertible to int** foo, because int** foo represents some number of pointers to some number of pointers, and they don't have to be contiguous in memory.
While it's legal to use an array type as a function parameter, the compiler builds it identically to if you had specified a pointer to the array's element type:
int foo(int bar[10]); // identical to `int foo(int* bar)`
But what happens if you pass a multi-dimensional array?
int foo(int bar[5][5]); // identical to what?
Only the first dimension of the array can undergo decay. The equivalent signature for int foo here would be:
int foo(int (*bar)[5]); // identical to `int foo(int bar[5][5])`
where int (*bar)[5] is a pointer to an array of 5 integers. Stacking up more dimensions doesn't change the idea, only the first dimension will decay, and the other dimensions need to have a known size.
In other words, you can skip the first dimension because the compiler doesn't care about its size, as it instantly decays. However, subsequent dimensions do not decay at the call site, so you need to know their size.
Arrays in C and C++ are stored contiguously in memory. In the case of a multidimensional array, this means you have an array of arrays.
When you traverse an array, you don't necessarily need to know how many elements are in the array. You do however need to know the size of each element of the array so you can jump to the correct element.
That's why int arr[][][15] is incorrect. It says that arr is an array of unknown size elements are of type int [][15]. But this means you don't know how big each array element is, so you don't know where each array element it in memory.
Starting with two bits of standardese:
6.3.2.1 Lvalues, arrays, and function designators
...
3 Except when it is the operand of the sizeof operator, the _Alignof operator, or the
unary & operator, or is a string literal used to initialize an array, an expression that has
type ‘‘array of type’’ is converted to an expression with type ‘‘pointer to type’’ that points
to the initial element of the array object and is not an lvalue. If the array object has
register storage class, the behavior is undefined.
...
6.7.6.3 Function declarators (including prototypes)
...
7 A declaration of a parameter as ‘‘array of type’’ shall be adjusted to ‘‘qualified pointer to
type’’, where the type qualifiers (if any) are those specified within the [ and ] of the
array type derivation. If the keyword static also appears within the [ and ] of the
array type derivation, then for each call to the function, the value of the corresponding
actual argument shall provide access to the first element of an array with at least as many
elements as specified by the size expression.
C 2011 online draft
Lovely. What does all that mean?
Let's start with the declaration of an array of some arbitrary type T:
T arr[N];
The type of the expression arr is "N-element array of T" (T [N]). Unless that expression is the operand of the unary &, sizeof, or _Alignof operators as specified in 6.3.2.1/3 above, then the type of the expression is converted ("decays") to "pointer to T" (T *) and the value of the expression is the address of the first element - i.e., &arr[0].
When we pass arr as a parameter to a function, as in
foo( arr );
what foo actually receives is a pointer, not an array, and you would write the function prototype as
void foo( T *arr )
Or...
As a notational convenience, C allows you to declare the formal parameter using array notation:
void foo( T arr[N] )
or
void foo( T arr[] )
In this case, both T arr[N] and T arr[] are "adjusted" to T *arr as per 6.7.6.3/7 above, and all three forms declare arr as a pointer (this is only true for function parameter declarations).
That's easy enough to see for 1D arrays. But what about multidimensional arrays?
Let's replace T with an array type A [M]. Our declaration now becomes
A arr[N][M]; // we're creating N M-element arrays.
The type of the expression arr is "N-element array of M-element arrays of A" (A [M][N]). By the rule above, this "decays" to type pointer to *M-element array* ofA" (A (*)[M]`). So when we call
foo( arr );
the corresponding prototype isvoid foo( A (*arr)[M] )1
which can also be written as
void foo( A arr[N][M] )
or
void foo( A arr[][M] );
Since T arr[N] and T arr[] are adjusted to T *arr, then A arr[N][M] and A arr[][M] are adjusted to A (*arr)[M].
Let's replace A with another array type, R [O]:
R arr[N][M][O];
The type of the expression arr decays to R (*)[M][O], so the prototype of foo can be written as
void foo( R (*arr)[M][O] )
or
void foo( R arr[N][M][O] )
or
void foo( R arr[][M][O] )
Are you starting to see the pattern yet?
When a multi-dimensional array expression "decays" to a pointer expression, only the first (leftmost) dimension is "lost", so in a function prototype declaration, only the leftmost dimension my be left blank.
Since the subscript [] operator has a higher precedence than the unary * operator, A *arr[M] would be interpreted as "M-element array of pointers to A", which is not what we want. We have to explicitly group the * operator with the identifier to properly declare it as a pointer to an array.

Array as parameter doesn't maintain size [duplicate]

What is array to pointer decay? Is there any relation to array pointers?
It's said that arrays "decay" into pointers. A C++ array declared as int numbers [5] cannot be re-pointed, i.e. you can't say numbers = 0x5a5aff23. More importantly the term decay signifies loss of type and dimension; numbers decay into int* by losing the dimension information (count 5) and the type is not int [5] any more. Look here for cases where the decay doesn't happen.
If you're passing an array by value, what you're really doing is copying a pointer - a pointer to the array's first element is copied to the parameter (whose type should also be a pointer the array element's type). This works due to array's decaying nature; once decayed, sizeof no longer gives the complete array's size, because it essentially becomes a pointer. This is why it's preferred (among other reasons) to pass by reference or pointer.
Three ways to pass in an array1:
void by_value(const T* array) // const T array[] means the same
void by_pointer(const T (*array)[U])
void by_reference(const T (&array)[U])
The last two will give proper sizeof info, while the first one won't since the array argument has decayed to be assigned to the parameter.
1 The constant U should be known at compile-time.
Arrays are basically the same as pointers in C/C++, but not quite. Once you convert an array:
const int a[] = { 2, 3, 5, 7, 11 };
into a pointer (which works without casting, and therefore can happen unexpectedly in some cases):
const int* p = a;
you lose the ability of the sizeof operator to count elements in the array:
assert( sizeof(p) != sizeof(a) ); // sizes are not equal
This lost ability is referred to as "decay".
For more details, check out this article about array decay.
Here's what the standard says (C99 6.3.2.1/3 - Other operands - Lvalues, arrays, and function designators):
Except when it is the operand of the sizeof operator or the unary & operator, or is a
string literal used to initialize an array, an expression that has type ‘‘array of type’’ is
converted to an expression with type ‘‘pointer to type’’ that points to the initial element of
the array object and is not an lvalue.
This means that pretty much anytime the array name is used in an expression, it is automatically converted to a pointer to the 1st item in the array.
Note that function names act in a similar way, but function pointers are used far less and in a much more specialized way that it doesn't cause nearly as much confusion as the automatic conversion of array names to pointers.
The C++ standard (4.2 Array-to-pointer conversion) loosens the conversion requirement to (emphasis mine):
An lvalue or rvalue of type “array of N T” or “array of unknown bound of T” can be converted to an rvalue
of type “pointer to T.”
So the conversion doesn't have to happen like it pretty much always does in C (this lets functions overload or templates match on the array type).
This is also why in C you should avoid using array parameters in function prototypes/definitions (in my opinion - I'm not sure if there's any general agreement). They cause confusion and are a fiction anyway - use pointer parameters and the confusion might not go away entirely, but at least the parameter declaration isn't lying.
"Decay" refers to the implicit conversion of an expression from an array type to a pointer type. In most contexts, when the compiler sees an array expression it converts the type of the expression from "N-element array of T" to "pointer to T" and sets the value of the expression to the address of the first element of the array. The exceptions to this rule are when an array is an operand of either the sizeof or & operators, or the array is a string literal being used as an initializer in a declaration.
Assume the following code:
char a[80];
strcpy(a, "This is a test");
The expression a is of type "80-element array of char" and the expression "This is a test" is of type "15-element array of char" (in C; in C++ string literals are arrays of const char). However, in the call to strcpy(), neither expression is an operand of sizeof or &, so their types are implicitly converted to "pointer to char", and their values are set to the address of the first element in each. What strcpy() receives are not arrays, but pointers, as seen in its prototype:
char *strcpy(char *dest, const char *src);
This is not the same thing as an array pointer. For example:
char a[80];
char *ptr_to_first_element = a;
char (*ptr_to_array)[80] = &a;
Both ptr_to_first_element and ptr_to_array have the same value; the base address of a. However, they are different types and are treated differently, as shown below:
a[i] == ptr_to_first_element[i] == (*ptr_to_array)[i] != *ptr_to_array[i] != ptr_to_array[i]
Remember that the expression a[i] is interpreted as *(a+i) (which only works if the array type is converted to a pointer type), so both a[i] and ptr_to_first_element[i] work the same. The expression (*ptr_to_array)[i] is interpreted as *(*a+i). The expressions *ptr_to_array[i] and ptr_to_array[i] may lead to compiler warnings or errors depending on the context; they'll definitely do the wrong thing if you're expecting them to evaluate to a[i].
sizeof a == sizeof *ptr_to_array == 80
Again, when an array is an operand of sizeof, it's not converted to a pointer type.
sizeof *ptr_to_first_element == sizeof (char) == 1
sizeof ptr_to_first_element == sizeof (char *) == whatever the pointer size
is on your platform
ptr_to_first_element is a simple pointer to char.
Arrays, in C, have no value.
Wherever the value of an object is expected but the object is an array, the address of its first element is used instead, with type pointer to (type of array elements).
In a function, all parameters are passed by value (arrays are no exception). When you pass an array in a function it "decays into a pointer" (sic); when you compare an array to something else, again it "decays into a pointer" (sic); ...
void foo(int arr[]);
Function foo expects the value of an array. But, in C, arrays have no value! So foo gets instead the address of the first element of the array.
int arr[5];
int *ip = &(arr[1]);
if (arr == ip) { /* something; */ }
In the comparison above, arr has no value, so it becomes a pointer. It becomes a pointer to int. That pointer can be compared with the variable ip.
In the array indexing syntax you are used to seeing, again, the arr is 'decayed to a pointer'
arr[42];
/* same as *(arr + 42); */
/* same as *(&(arr[0]) + 42); */
The only times an array doesn't decay into a pointer are when it is the operand of the sizeof operator, or the & operator (the 'address of' operator), or as a string literal used to initialize a character array.
It's when array rots and is being pointed at ;-)
Actually, it's just that if you want to pass an array somewhere, but the pointer is passed instead (because who the hell would pass the whole array for you), people say that poor array decayed to pointer.
Array decaying means that, when an array is passed as a parameter to a function, it's treated identically to ("decays to") a pointer.
void do_something(int *array) {
// We don't know how big array is here, because it's decayed to a pointer.
printf("%i\n", sizeof(array)); // always prints 4 on a 32-bit machine
}
int main (int argc, char **argv) {
int a[10];
int b[20];
int *c;
printf("%zu\n", sizeof(a)); //prints 40 on a 32-bit machine
printf("%zu\n", sizeof(b)); //prints 80 on a 32-bit machine
printf("%zu\n", sizeof(c)); //prints 4 on a 32-bit machine
do_something(a);
do_something(b);
do_something(c);
}
There are two complications or exceptions to the above.
First, when dealing with multidimensional arrays in C and C++, only the first dimension is lost. This is because arrays are layed out contiguously in memory, so the compiler must know all but the first dimension to be able to calculate offsets into that block of memory.
void do_something(int array[][10])
{
// We don't know how big the first dimension is.
}
int main(int argc, char *argv[]) {
int a[5][10];
int b[20][10];
do_something(a);
do_something(b);
return 0;
}
Second, in C++, you can use templates to deduce the size of arrays. Microsoft uses this for the C++ versions of Secure CRT functions like strcpy_s, and you can use a similar trick to reliably get the number of elements in an array.
tl;dr: When you use an array you've defined, you'll actually be using a pointer to its first element.
Thus:
When you write arr[idx] you're really just saying *(arr + idx).
functions never really take arrays as parameters, only pointers - either directly, when you specify an array parameter, or indirectly, if you pass a reference to an array.
Sort-of exceptions to this rule:
You can pass fixed-length arrays to functions within a struct.
sizeof() gives the size taken up by the array, not the size of a pointer.
Try this code
void f(double a[10]) {
printf("in function: %d", sizeof(a));
printf("pointer size: %d\n", sizeof(double *));
}
int main() {
double a[10];
printf("in main: %d", sizeof(a));
f(a);
}
and you will see that the size of the array inside the function is not equal to the size of the array in main, but it is equal to the size of a pointer.
You probably heard that "arrays are pointers", but, this is not exactly true (the sizeof inside main prints the correct size). However, when passed, the array decays to pointer. That is, regardless of what the syntax shows, you actually pass a pointer, and the function actually receives a pointer.
In this case, the definition void f(double a[10] is implicitly transformed by the compiler to void f(double *a). You could have equivalently declared the function argument directly as *a. You could have even written a[100] or a[1], instead of a[10], since it is never actually compiled that way (however, you shouldn't do it obviously, it would confuse the reader).
Arrays are automatically passed by pointer in C. The rationale behind it can only be speculated.
int a[5], int *a and int (*a)[5] are all glorified addresses meaning that the compiler treats arithmetic and deference operators on them differently depending on the type, so when they refer to the same address they are not treated the same by the compiler. int a[5] is different to the other 2 in that the address is implicit and does not manifest on the stack or the executable as part of the array itself, it is only used by the compiler to resolve certain arithmetic operations, like taking its address or pointer arithmetic. int a[5] is therefore an array as well as an implicit address, but as soon as you talk about the address itself and place it on the stack, the address itself is no longer an array, and can only be a pointer to an array or a decayed array i.e. a pointer to the first member of the array.
For instance, on int (*a)[5], the first dereference on a will produce an int * (so the same address, just a different type, and note not int a[5]), and pointer arithmetic on a i.e. a+1 or *(a+1) will be in terms of the size of an array of 5 ints (which is the data type it points to), and the second dereference will produce the int. On int a[5] however, the first dereference will produce the int and the pointer arithmetic will be in terms of the size of an int.
To a function, you can only pass int * and int (*)[5], and the function casts it to whatever the parameter type is, so within the function you have a choice whether to treat an address that is being passed as a decayed array or a pointer to an array (where the function has to specify the size of the array being passed). If you pass a to a function and a is defined int a[5], then as a resolves to an address, you are passing an address, and an address can only be a pointer type. In the function, the parameter it accesses is then an address on the stack or in a register, which can only be a pointer type and not an array type -- this is because it's an actual address on the stack and is therefore clearly not the array itself.
You lose the size of the array because the type of the parameter, being an address, is a pointer and not an array, which does not have an array size, as can be seen when using sizeof, which works on the type of the value being passed to it. The parameter type int a[5] instead of int *a is allowed but is treated as int * instead of disallowing it outright, though it should be disallowed, because it is misleading, because it makes you think that the size information can be used, but you can only do this by casting it to int (*a)[5], and of course, the function has to specify the size of the array because there is no way to pass the size of the array because the size of the array needs to be a compile-time constant.
I might be so bold to think there are four (4) ways to pass an array as the function argument. Also here is the short but working code for your perusal.
#include <iostream>
#include <string>
#include <vector>
#include <cassert>
using namespace std;
// test data
// notice native array init with no copy aka "="
// not possible in C
const char* specimen[]{ __TIME__, __DATE__, __TIMESTAMP__ };
// ONE
// simple, dangerous and useless
template<typename T>
void as_pointer(const T* array) {
// a pointer
assert(array != nullptr);
} ;
// TWO
// for above const T array[] means the same
// but and also , minimum array size indication might be given too
// this also does not stop the array decay into T *
// thus size information is lost
template<typename T>
void by_value_no_size(const T array[0xFF]) {
// decayed to a pointer
assert( array != nullptr );
}
// THREE
// size information is preserved
// but pointer is asked for
template<typename T, size_t N>
void pointer_to_array(const T (*array)[N])
{
// dealing with native pointer
assert( array != nullptr );
}
// FOUR
// no C equivalent
// array by reference
// size is preserved
template<typename T, size_t N>
void reference_to_array(const T (&array)[N])
{
// array is not a pointer here
// it is (almost) a container
// most of the std:: lib algorithms
// do work on array reference, for example
// range for requires std::begin() and std::end()
// on the type passed as range to iterate over
for (auto && elem : array )
{
cout << endl << elem ;
}
}
int main()
{
// ONE
as_pointer(specimen);
// TWO
by_value_no_size(specimen);
// THREE
pointer_to_array(&specimen);
// FOUR
reference_to_array( specimen ) ;
}
I might also think this shows the superiority of C++ vs C. At least in reference (pun intended) of passing an array by reference.
Of course there are extremely strict projects with no heap allocation, no exceptions and no std:: lib. C++ native array handling is mission critical language feature, one might say.

How does an unsized array declaration act as a pointer?

How does char s[] act as a pointer while it looks like an array declaration?
#include<stdio.h>
void test(char s[]);
int main()
{
char s[10]="test";
char a[]=s;
test(s);
char p[]=s;
return 0;
}
void test(char s[])
{
printf(s);
}
In the context of a function parameter declaration (and only in that context), T a[N] and T a[] are the same as T *a; they declare a as a pointer to T, not an array of T.
Chapter and verse:
6.7.6.3 Function declarators (including prototypes)
...
7 A declaration of a parameter as ‘‘array of type’’ shall be adjusted to ‘‘qualified pointer to
type’’, where the type qualifiers (if any) are those specified within the [ and ] of the
array type derivation. If the keyword static also appears within the [ and ] of the
array type derivation, then for each call to the function, the value of the corresponding
actual argument shall provide access to the first element of an array with at least as many
elements as specified by the size expression.
Anywhere else, T a[]; declares a as an array with an as-yet-unspecified size. At this point the declaration is incomplete, and a cannot be used anywhere until a size has been specified, either by specifying it explicitly:
T a[N];
or using an initializer:
T a[] = { /* comma-separated list of initial values */ }
Chapter and verse, again:
6.7.6.2 Array declarators
...
3 If, in the declaration ‘‘T D1’’, D1 has one of the forms: D[ type-qualifier-listopt assignment-expressionopt ]
D[ static type-qualifier-listopt assignment-expression ]
D[ type-qualifier-list static assignment-expression ]
D[ type-qualifier-listopt * ]
and the type specified for ident in the declaration ‘‘T D’’ is ‘‘derived-declarator-type-list
T’’, then the type specified for ident is ‘‘derived-declarator-type-list array of T’’.142)
(See 6.7.6.3 for the meaning of the optional type qualifiers and the keyword static.)
4 If the size is not present, the array type is an incomplete type. If the size is * instead of
being an expression, the array type is a variable length array type of unspecified size,
which can only be used in declarations or type names with function prototype scope;143)
such arrays are nonetheless complete types. If the size is an integer constant expression
and the element type has a known constant size, the array type is not a variable length
array type; otherwise, the array type is a variable length array type. (Variable length
arrays are a conditional feature that implementations need not support; see 6.10.8.3.)
142) When several ‘‘array of’’ specifications are adjacent, a multidimensional array is declared.
143) Thus, * can be used only in function declarations that are not definitions (see 6.7.6.3).
...
6.7.9 Initialization
...
22 If an array of unknown size is initialized, its size is determined by the largest indexed
element with an explicit initializer. The array type is completed at the end of its
initializer list.
So, why are arrays as function parameters treated differently than arrays as regular objects? This is why:
6.3.2.1 Lvalues, arrays, and function designators
...
3 Except when it is the operand of the sizeof operator, the _Alignof operator, or the
unary & operator, or is a string literal used to initialize an array, an expression that has
type ‘‘array of type’’ is converted to an expression with type ‘‘pointer to type’’ that points
to the initial element of the array object and is not an lvalue. If the array object has
register storage class, the behavior is undefined.
Under most circumstances, an expression of type "N-element array of T" is converted ("decays") to an expression of type "pointer to T". If you pass an array expression as an argument to a function, like so:
int foo[10];
...
bar( foo );
what the function bar actually receives is a pointer to int, not a 10-element array of int, so the prototype for bar can be written
void bar( int *f );
"But why..." I hear you starting to ask. I'm getting to it, really.
C was derived from an earlier language called B (go figure), and in B the following things were true:
Pointer objects were declared using empty bracket notation - auto p[];. This syntax was kept for pointer function parameter declarations in C.
Array declarations set aside memory not only for the array elements, but also for an explicit pointer to the first element, which was bound to the array variable (i.e., auto v[10] would set aside 11 memory cells, 10 for the array contents, and the remaining one to store an offset to the first element).
The array subscript operation a[i] was defined as *(a + i) -- you'd offset i elements from the base address stored in the variable a and dereference the result (B was a "typeless" language, so all scalar objects were the same size).
For various reasons, Ritchie got rid of the explicit pointer to the first element of the array, but kept the definition of the subscript operation. So in C, when the compiler sees an array expression that isn't the operand of the sizeof or unary & operators, it replaces that expression with a pointer expression that evaluates to the address of the first element of the array; that way the *(a + i) operation still works the way it did in B, without actually setting aside any storage for that pointer value. However, it means arrays lose their "array-ness" in most circumstances, which will bite you in the ass if you aren't careful.
You can't assign to arrays, only copy to them or initialize them with a valid initializer (and another array is not a valid initializer).
And when you declare a function like
void test(char s[]);
it's actually the same as declaring it
void test(char *s);
What happens when you call a function taking an "array" is that the array decays to a pointer to the first element, and it's that pointer that is passed to the function.
So the call
test(s);
is the same as
test(&s[0]);
Regarding the function declaration, declaring a function taking an array of arrays is not the same as declaring a function taking a pointer to a pointer. See e.g. this old answer of mine as an explanation of why.
So if you want a function taking an array of arrays, like
void func2(char a[][X]);
it's not the same as
void func2(char **a);
Instead it's the same as
void func2(char (*a)[X]);
For more "dimensions" it doesn't change anything, e.g.
void func3(char a[][X][Y]);
is the same as
void func3(char (*a)[X][Y]);
char a[] is an array and not a pointer, the inittialization is invalid.
s is an array, too, but in the context of an expression it evaluates to a pointer to its first element.
char a[] is only valid if you have a following initializer list. It has to be an array of characters. As a special case for strings, C allows you to type an array of characters as "str", rather than {'s','t','r','\0'}. Either initializer is fine.
In the code char a[]=s;, s is an array type and not a valid initializer, so the code will not compile.
void test(char s[]) is another special case, because arrays passed as parameters always get replaced by the compiler with a pointer to the first element. Don't confuse this with array initialization, even though the syntax is similar.

Correct C syntax for "address of" character pointer array?

I need to pass into a function the address of a character pointer array, what would be the correct function declaration/prototype syntax for that?
I tried:
void myFunc(char &*a[]);
But get an expected ; , or ) before & error.
I also tried:
void myFunc(char **a);
Since the pointer I would be passing in is indeed a pointer to a pointer, but I get yet another error. This time the error has to do with: expected char **, but got char *[]
Or something like that. I have since attempted other solutions so I not remember exactly the error.
Any suggestions?
Assuming you have an array declared as
char *a[N];
and you want to pass a pointer to the array, as in
foo( &a );
then the prototype for foo needs to be
void foo( char *(*aptr)[N] );
Note that in this case, the size of the array must be declared; a pointer to an N-element array is a different type from a pointer to an M-element array.
Normally, you don't want to do this; instead, you would normally just pass the array expression like so:
foo ( a );
and the corresponding prototype would be:
void foo ( char **aptr );
Except when it is the operand of thesizeof or unary & operators, or is a string literal being used to initialize another array in a declaration, an expression of type "N-element array of T" will be converted ("decay") to an expression of type "pointer toT", and the value of the expression will be the address of the first element of the array.
Updated Answer For Modified Problem Statement
Given what you have said in comments, there is no need to pass a pointer to an array. You can simply pass a pointer to the first element of the array. Such a pointer suffices because the remaining elements of the array are obviously located after the first element.
To write a function that sets pointers in an array of pointers to char, do this:
void MyFunction(int NumberToSet, char *Pointers[])
{
for (int i = 0; i < NumberToSet; ++i)
{
Pointers[i] = SomeString;
}
}
In the above, SomeString must have type “pointer to char”. This could be a string, such as "Hello", or an array of char (which is automatically converted to a pointer to char), or some identifier x that has been declared as char *x (and has been initialized or assigned), for example.
To use this function, call it like this:
char *MyArrayOfPointers[SomeNumber];
MyFunction(NumberOfPointersIWantToSet, MyArrayOfPointers);
Original Answer
In most cases, to pass an array of pointers to char to a function, it suffices to pass the address of the first element. In this case, you would use either of these (they are equivalent):
void myFunc(char **a)
void myFunc(char *a[])
If you truly want to pass the address of the array, you would use:
void myFunc(char *(*a)[])
In this case, the type of a is incomplete, since the dimension is missing. Depending on what you intend to do with a, you may need to provide the dimension in the declaration.
When calling myFunc and passing it some array declared as char *array[N];, you would pass it, in the former case, as myFunc(array) and, in the latter case, as myFunc(&array).
try this as a function definition void myFunc(char *a[]) or void myFunc(char **a) then use it this way :
char *arr[20];
myFunc(arr);
Ok you are almost on the right path. void myFunc(char *a[]);
Example
void fun(char *a[]){
printf("%s",*a); //for accessing the next element do a+1
}
int main(void) {
char *x[3];
x[0]="abcd";
fun(x); // here you are passing the address first array element
return 0;
DEMO
Declaration
void myFunc(char &*a[]);
is not a valid C syntax.
To pass the address of character pointer arrays, use this instead
void myFunc(char *(*a)[]);
*(*a)[] in the above function declares a as pointer to array of pointers to chars. Must note that a has an incompatible type. A suffix is needed in [] to make it complete.
First of all, C is in general a "pass by reference" language. Some data items such as integers, floats, and single characters can be passed by value. But, arrays of those same data types are ALWAYS passed by reference.
Thus, when you say "I need to pass into a function the address of a character pointer array" then simply declare an array of character pointers in your function prototype as:
void myFunc(char *a[]);
Thus, char * declares a char pointer and a[] defines an array of them. To check this declaration refer to: http://www.cdecl.org/ which parses this expression as a "declare a as array of pointer to char".
The technical point is that the * binds with char rather than with a[].
So, C will pass a pointer to the data structure that you have declared. A discussion on this topic could delve into double pointers but for this question such a discussion is probably off topic.

Resources