C programming: arrays and pointers [duplicate] - c

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Is array name a pointer in C?
If I define:
int tab[4];
tab is a pointer, because if I display tab:
printf("%d", tab);
the code above will display the address to the first element in memory.
That's why i was wondering why we don't define an array like the following:
int *tab[4];
as tab is a pointer.
Thank you for any help!

tab is a pointer
No, tab is an array. An int[4] to be specific. But when you pass it as an argument to a function (and in many other contexts) the array is converted to a pointer to its first element. You can see the difference between arrays and pointers for example when you call sizeof array vs. sizeof pointer, when you try to assign to an array (that won't compile), and more.
int *tab[4];
declares an array of four pointers to int. I don't see how that is related to the confusion between arrays and pointers.

tab is not a pointer it's an array of 4 integers when passed to a function it decays into a pointer to the first element:
int tab[4];
And this is another array but it holds 4 integer pointers:
int *tab[4];
Finally, for the sake of completeness, this is a pointer to an array of 4 integers, if you dereference this you get an array of 4 integers:
int (*tab)[4];

You are not completely wrong, meaning that your statement is wrong but you are not that far from the truth.
Arrays and pointers under C share the same arithmetic but the main difference is that arrays are containers and pointers are just like any other atomic variable and their purpose is to store a memory address and provide informations about the type of the pointed value.
I suggest to read something about pointer arithmetic
Pointer Arithmetic
http://www.learncpp.com/cpp-tutorial/68-pointers-arrays-and-pointer-arithmetic/
Considering the Steve Jessop comment I would like to add a snippet that can introduce you to the simple and effective world of the pointer arithmetic:
#include <stdio.h>
int main()
{
int arr[10] = {10,11,12,13,14,15,16,17,18,19};
int pos = 3;
printf("Arithmetic part 1 %d\n",arr[pos]);
printf("Arithmetic part 2 %d\n",pos[arr]);
return(0);
}
arrays can behave like pointers, even look like pointers in your case, you can apply the same exact kind of arithmetic by they are not pointers.

int *tab[4];
this deffinition means that the tab array contains pointers of int and not int
From C standard
Coding Guidelines
The implicit conversion of array objects to a
pointer to their first element is a great inconvenience in trying to
formulate stronger type checking for arrays in C. Inexperienced, in
the C language, developers sometimes equate arrays and a pointers much
more closely than permitted by this requirement (which applies to uses
in expressions, not declarations). For instance, in:
file_1.c
extern int *a;
file_2.c
extern int a[10];
the two declarations of a are sometimes incorrectly assumed by
developers to be compatible. It is difficult to see what guideline
recommendation would overcome incorrect developer assumptions (or poor
training). If the guideline recommendation specifying a single point
of declaration is followed, this problem will not 419.1 identifier
declared in one file occur. Unlike the function designator usage,
developers are familiar with the fact that objects having an array
function designator converted to typetype are implicitly converted to
a pointer to their first element. Whether applying a unary & operator
to an operand having an array type provides readers with a helpful
visual cue or causes them to wonder about the intent of the author
(“what is that redundant operator doing there?”) is not known.
Example
static double a[5];
void f(double b[5])
{
double (*p)[5] = &a;
double **q = &b; /* This looks suspicious, */
p = &b; /* and so does this. */
q = &a;
}
If the array object has register storage class, the behavior is undefined

Under most circumstances, an expression of array type will be converted ("decay") to an expression of pointer type, and the value of the expression will be the address of the first element in the array. The exceptions to this rule are when the array expression is an operand of the sizeof, _Alignof, or unary & operators, or is a string literal being used to initialize another array in a declaration.
int tab[4];
defines tab as a 4-element array if int. In the statement
printf("%d", tab); // which *should* be printf("%p", (void*) tab);
the expression tab is converted from type "4-element array of int" to "pointer to int".

Related

Why pass by reference works also without '&'? [duplicate]

What is array to pointer decay? Is there any relation to array pointers?
It's said that arrays "decay" into pointers. A C++ array declared as int numbers [5] cannot be re-pointed, i.e. you can't say numbers = 0x5a5aff23. More importantly the term decay signifies loss of type and dimension; numbers decay into int* by losing the dimension information (count 5) and the type is not int [5] any more. Look here for cases where the decay doesn't happen.
If you're passing an array by value, what you're really doing is copying a pointer - a pointer to the array's first element is copied to the parameter (whose type should also be a pointer the array element's type). This works due to array's decaying nature; once decayed, sizeof no longer gives the complete array's size, because it essentially becomes a pointer. This is why it's preferred (among other reasons) to pass by reference or pointer.
Three ways to pass in an array1:
void by_value(const T* array) // const T array[] means the same
void by_pointer(const T (*array)[U])
void by_reference(const T (&array)[U])
The last two will give proper sizeof info, while the first one won't since the array argument has decayed to be assigned to the parameter.
1 The constant U should be known at compile-time.
Arrays are basically the same as pointers in C/C++, but not quite. Once you convert an array:
const int a[] = { 2, 3, 5, 7, 11 };
into a pointer (which works without casting, and therefore can happen unexpectedly in some cases):
const int* p = a;
you lose the ability of the sizeof operator to count elements in the array:
assert( sizeof(p) != sizeof(a) ); // sizes are not equal
This lost ability is referred to as "decay".
For more details, check out this article about array decay.
Here's what the standard says (C99 6.3.2.1/3 - Other operands - Lvalues, arrays, and function designators):
Except when it is the operand of the sizeof operator or the unary & operator, or is a
string literal used to initialize an array, an expression that has type ‘‘array of type’’ is
converted to an expression with type ‘‘pointer to type’’ that points to the initial element of
the array object and is not an lvalue.
This means that pretty much anytime the array name is used in an expression, it is automatically converted to a pointer to the 1st item in the array.
Note that function names act in a similar way, but function pointers are used far less and in a much more specialized way that it doesn't cause nearly as much confusion as the automatic conversion of array names to pointers.
The C++ standard (4.2 Array-to-pointer conversion) loosens the conversion requirement to (emphasis mine):
An lvalue or rvalue of type “array of N T” or “array of unknown bound of T” can be converted to an rvalue
of type “pointer to T.”
So the conversion doesn't have to happen like it pretty much always does in C (this lets functions overload or templates match on the array type).
This is also why in C you should avoid using array parameters in function prototypes/definitions (in my opinion - I'm not sure if there's any general agreement). They cause confusion and are a fiction anyway - use pointer parameters and the confusion might not go away entirely, but at least the parameter declaration isn't lying.
"Decay" refers to the implicit conversion of an expression from an array type to a pointer type. In most contexts, when the compiler sees an array expression it converts the type of the expression from "N-element array of T" to "pointer to T" and sets the value of the expression to the address of the first element of the array. The exceptions to this rule are when an array is an operand of either the sizeof or & operators, or the array is a string literal being used as an initializer in a declaration.
Assume the following code:
char a[80];
strcpy(a, "This is a test");
The expression a is of type "80-element array of char" and the expression "This is a test" is of type "15-element array of char" (in C; in C++ string literals are arrays of const char). However, in the call to strcpy(), neither expression is an operand of sizeof or &, so their types are implicitly converted to "pointer to char", and their values are set to the address of the first element in each. What strcpy() receives are not arrays, but pointers, as seen in its prototype:
char *strcpy(char *dest, const char *src);
This is not the same thing as an array pointer. For example:
char a[80];
char *ptr_to_first_element = a;
char (*ptr_to_array)[80] = &a;
Both ptr_to_first_element and ptr_to_array have the same value; the base address of a. However, they are different types and are treated differently, as shown below:
a[i] == ptr_to_first_element[i] == (*ptr_to_array)[i] != *ptr_to_array[i] != ptr_to_array[i]
Remember that the expression a[i] is interpreted as *(a+i) (which only works if the array type is converted to a pointer type), so both a[i] and ptr_to_first_element[i] work the same. The expression (*ptr_to_array)[i] is interpreted as *(*a+i). The expressions *ptr_to_array[i] and ptr_to_array[i] may lead to compiler warnings or errors depending on the context; they'll definitely do the wrong thing if you're expecting them to evaluate to a[i].
sizeof a == sizeof *ptr_to_array == 80
Again, when an array is an operand of sizeof, it's not converted to a pointer type.
sizeof *ptr_to_first_element == sizeof (char) == 1
sizeof ptr_to_first_element == sizeof (char *) == whatever the pointer size
is on your platform
ptr_to_first_element is a simple pointer to char.
Arrays, in C, have no value.
Wherever the value of an object is expected but the object is an array, the address of its first element is used instead, with type pointer to (type of array elements).
In a function, all parameters are passed by value (arrays are no exception). When you pass an array in a function it "decays into a pointer" (sic); when you compare an array to something else, again it "decays into a pointer" (sic); ...
void foo(int arr[]);
Function foo expects the value of an array. But, in C, arrays have no value! So foo gets instead the address of the first element of the array.
int arr[5];
int *ip = &(arr[1]);
if (arr == ip) { /* something; */ }
In the comparison above, arr has no value, so it becomes a pointer. It becomes a pointer to int. That pointer can be compared with the variable ip.
In the array indexing syntax you are used to seeing, again, the arr is 'decayed to a pointer'
arr[42];
/* same as *(arr + 42); */
/* same as *(&(arr[0]) + 42); */
The only times an array doesn't decay into a pointer are when it is the operand of the sizeof operator, or the & operator (the 'address of' operator), or as a string literal used to initialize a character array.
It's when array rots and is being pointed at ;-)
Actually, it's just that if you want to pass an array somewhere, but the pointer is passed instead (because who the hell would pass the whole array for you), people say that poor array decayed to pointer.
Array decaying means that, when an array is passed as a parameter to a function, it's treated identically to ("decays to") a pointer.
void do_something(int *array) {
// We don't know how big array is here, because it's decayed to a pointer.
printf("%i\n", sizeof(array)); // always prints 4 on a 32-bit machine
}
int main (int argc, char **argv) {
int a[10];
int b[20];
int *c;
printf("%zu\n", sizeof(a)); //prints 40 on a 32-bit machine
printf("%zu\n", sizeof(b)); //prints 80 on a 32-bit machine
printf("%zu\n", sizeof(c)); //prints 4 on a 32-bit machine
do_something(a);
do_something(b);
do_something(c);
}
There are two complications or exceptions to the above.
First, when dealing with multidimensional arrays in C and C++, only the first dimension is lost. This is because arrays are layed out contiguously in memory, so the compiler must know all but the first dimension to be able to calculate offsets into that block of memory.
void do_something(int array[][10])
{
// We don't know how big the first dimension is.
}
int main(int argc, char *argv[]) {
int a[5][10];
int b[20][10];
do_something(a);
do_something(b);
return 0;
}
Second, in C++, you can use templates to deduce the size of arrays. Microsoft uses this for the C++ versions of Secure CRT functions like strcpy_s, and you can use a similar trick to reliably get the number of elements in an array.
tl;dr: When you use an array you've defined, you'll actually be using a pointer to its first element.
Thus:
When you write arr[idx] you're really just saying *(arr + idx).
functions never really take arrays as parameters, only pointers - either directly, when you specify an array parameter, or indirectly, if you pass a reference to an array.
Sort-of exceptions to this rule:
You can pass fixed-length arrays to functions within a struct.
sizeof() gives the size taken up by the array, not the size of a pointer.
Try this code
void f(double a[10]) {
printf("in function: %d", sizeof(a));
printf("pointer size: %d\n", sizeof(double *));
}
int main() {
double a[10];
printf("in main: %d", sizeof(a));
f(a);
}
and you will see that the size of the array inside the function is not equal to the size of the array in main, but it is equal to the size of a pointer.
You probably heard that "arrays are pointers", but, this is not exactly true (the sizeof inside main prints the correct size). However, when passed, the array decays to pointer. That is, regardless of what the syntax shows, you actually pass a pointer, and the function actually receives a pointer.
In this case, the definition void f(double a[10] is implicitly transformed by the compiler to void f(double *a). You could have equivalently declared the function argument directly as *a. You could have even written a[100] or a[1], instead of a[10], since it is never actually compiled that way (however, you shouldn't do it obviously, it would confuse the reader).
Arrays are automatically passed by pointer in C. The rationale behind it can only be speculated.
int a[5], int *a and int (*a)[5] are all glorified addresses meaning that the compiler treats arithmetic and deference operators on them differently depending on the type, so when they refer to the same address they are not treated the same by the compiler. int a[5] is different to the other 2 in that the address is implicit and does not manifest on the stack or the executable as part of the array itself, it is only used by the compiler to resolve certain arithmetic operations, like taking its address or pointer arithmetic. int a[5] is therefore an array as well as an implicit address, but as soon as you talk about the address itself and place it on the stack, the address itself is no longer an array, and can only be a pointer to an array or a decayed array i.e. a pointer to the first member of the array.
For instance, on int (*a)[5], the first dereference on a will produce an int * (so the same address, just a different type, and note not int a[5]), and pointer arithmetic on a i.e. a+1 or *(a+1) will be in terms of the size of an array of 5 ints (which is the data type it points to), and the second dereference will produce the int. On int a[5] however, the first dereference will produce the int and the pointer arithmetic will be in terms of the size of an int.
To a function, you can only pass int * and int (*)[5], and the function casts it to whatever the parameter type is, so within the function you have a choice whether to treat an address that is being passed as a decayed array or a pointer to an array (where the function has to specify the size of the array being passed). If you pass a to a function and a is defined int a[5], then as a resolves to an address, you are passing an address, and an address can only be a pointer type. In the function, the parameter it accesses is then an address on the stack or in a register, which can only be a pointer type and not an array type -- this is because it's an actual address on the stack and is therefore clearly not the array itself.
You lose the size of the array because the type of the parameter, being an address, is a pointer and not an array, which does not have an array size, as can be seen when using sizeof, which works on the type of the value being passed to it. The parameter type int a[5] instead of int *a is allowed but is treated as int * instead of disallowing it outright, though it should be disallowed, because it is misleading, because it makes you think that the size information can be used, but you can only do this by casting it to int (*a)[5], and of course, the function has to specify the size of the array because there is no way to pass the size of the array because the size of the array needs to be a compile-time constant.
I might be so bold to think there are four (4) ways to pass an array as the function argument. Also here is the short but working code for your perusal.
#include <iostream>
#include <string>
#include <vector>
#include <cassert>
using namespace std;
// test data
// notice native array init with no copy aka "="
// not possible in C
const char* specimen[]{ __TIME__, __DATE__, __TIMESTAMP__ };
// ONE
// simple, dangerous and useless
template<typename T>
void as_pointer(const T* array) {
// a pointer
assert(array != nullptr);
} ;
// TWO
// for above const T array[] means the same
// but and also , minimum array size indication might be given too
// this also does not stop the array decay into T *
// thus size information is lost
template<typename T>
void by_value_no_size(const T array[0xFF]) {
// decayed to a pointer
assert( array != nullptr );
}
// THREE
// size information is preserved
// but pointer is asked for
template<typename T, size_t N>
void pointer_to_array(const T (*array)[N])
{
// dealing with native pointer
assert( array != nullptr );
}
// FOUR
// no C equivalent
// array by reference
// size is preserved
template<typename T, size_t N>
void reference_to_array(const T (&array)[N])
{
// array is not a pointer here
// it is (almost) a container
// most of the std:: lib algorithms
// do work on array reference, for example
// range for requires std::begin() and std::end()
// on the type passed as range to iterate over
for (auto && elem : array )
{
cout << endl << elem ;
}
}
int main()
{
// ONE
as_pointer(specimen);
// TWO
by_value_no_size(specimen);
// THREE
pointer_to_array(&specimen);
// FOUR
reference_to_array( specimen ) ;
}
I might also think this shows the superiority of C++ vs C. At least in reference (pun intended) of passing an array by reference.
Of course there are extremely strict projects with no heap allocation, no exceptions and no std:: lib. C++ native array handling is mission critical language feature, one might say.

Is it UB to compare (for equality) a void pointer with a typed pointer in C?

I have a typed pointer, typed, that was initialized using pointer arithmetic to point to an object within an array. I also have a function that takes two pointer args, the 1st typed the same as the afore-mentioned pointer and the 2nd is void * (see myfunc() in the code below).
If I pass typed as the 1st argument and another pointer typed the same as typed as the 2nd argument, and then compare these for equality within the function, is that Undefined Behavior?
#include <stdio.h>
typedef struct S {int i; float f;} s;
void myfunc(s * a, void * b)
{
if (a == b) // <-------------------------------- is this UB?
printf("the same\n");
}
int main()
{
s myarray[] = {{7, 7.0}, {3, 3.0}};
s * typed = myarray + 1;
myfunc(typed, &(myarray[0]));
return 0;
}
Update: Ok, so I come back a day after posting my question above and there are two great answers (thanks both to #SouravGhosh and #dbush). One came in earlier than the other by less than a minute (!) but from the looks of the comments on the 1st one, the answer was initially wrong and only corrected after the 2nd answer was posted. Which one do I accept? Is there a protocol for accepting one answer over the other in this case?
No, this is not undefined behaviour. This is allowed and explicitly defined in the spec for equality operator constraints. Quoting C11, chapter 6.5.9
one operand is a pointer to an object type and the other is a pointer to a qualified or unqualified version of void;
and from paragraph 5 of same chapter
[...] If one operand is a pointer to an object type and the other is a pointer to a qualified or unqualified version of void, the former is converted to the type of the latter.
This comparison is well defined.
When a void * is compared against another pointer type via ==, the other pointer is converted to void *.
Also, Section 6.5.9p6 of the C standard says the following regarding pointer comparisons with ==:
Two pointers compare equal if and only if both are null pointers, both
are pointers to the same object (including a pointer to an object and
a subobject at its beginning) or function,both are pointers to one
past the last element of the same array object, or one is a pointer to
one past the end of one array object and the other is a pointer to the
start of a different array object that happens to immediately
follow the first array object in the address space.
There is no mention here of undefined behavior.

array incompatible pointer type [duplicate]

This question already has answers here:
Initialization from incompatible pointer type warning when assigning to a pointer
(5 answers)
Closed 4 years ago.
can anyone please let me know what's the difference b/w
int vector[]={10,20,30}
int *p = vector
and
int *p= &vector
I know by mentioning array name, we get it's base address . The second statement is giving me warning
initialisation from incompatible pointer type
Why warning, both statement give array base address.
When used in an expression, an array name will in most cases decay to a pointer to its first element.
So this:
int *p = vector;
Is equivalent to:
int *p = &vector[0];
The type of &vector[0] is int *, so the type is compatible.
One of the few times an array does not decay is when it is the object of the address-of operator &. Then &array is the address of the array and has type int (*)[3], i.e. a pointer to an array of size 3. This is not compatible with int *, hence the error.
Although the address of the array and the address of its first element have the same value, their types are different.
&vector is of type int (*)[3] i.e. a pointer to an array, whereas vector without & would decay to int * pointing to the first element.
int (*)[3], even though pointing at the same address is incompatible with int *, hence your program has a constraint violation, which makes your program an invalid program, and a compiler must issue a diagnostics message.
The C standard explicitly mentions in a [footnote (C11 footnote 9) that a compiler is allowed to successfully compile an invalid program:
9) The intent is that an implementation should identify the nature of, and where possible localize, each violation. Of course, an implementation is free to produce any number of diagnostics as long as a valid program is still correctly translated. It may also successfully translate an invalid program.
I.e. a compiler is allowed to do whatever it pleases with the given source, provided that a valid program is correctly translated. Hence many compilers will look these kinds of things through the fingers with default settings and provide a translation that has more or less expected - or equally unexpected - results, and you will get only a warning.
Expressions in C have both values and types.
Given int vector[] = {10, 20, 30};, vector is an array. C has a rule that, when an array is used in an expression outside of certain places1, it is automatically converted to be a pointer to its first element. Then its value is effectively the address of the start of the array2, and its type is “pointer to int”.
The expression &vector takes the address of the array. This is different from the address of its first element. Largely, they both have the same value. Both the array and its first element start at the same place in memory. But they have different types. The type of &vector is “pointer to array of 3 int”.
C has rules about types, and you cannot automatically use one type where another is expected. Sometimes types are converted automatically, as when a narrower integer is converted to a wider integer. But, generally in places where the type is important to the meaning of the software, there are no automatic conversions (or limited conversions). If you try to assign a pointer to an array to a pointer to an int, a good compiler will warn you you are doing something improper.
When a pointer to one type is assigned to a pointer to a different type, it might be because the programmer has made a mistake. This is why the compiler warns you.
Additionally, the same value with different types may behave differently. Because array is a pointer to its first element, a + 1 is a pointer to the second element. But, since &array is a pointer to the array, &array + 1 is a pointer to the end of the array (where the next array after it would start, if there were one).
Footnote
1 An array is not automatically converted when it is the operand of sizeof or the operand of unary & or is a string literal used to initialize an array.
2 The array starts in the same place in memory that its first element starts, of course. So they have the same virtual address. So they have the same value in that sense. However, there are some technical issues about C pointers that mean two pointers to the same place may not be exactly the “same” in certain senses. In this answer, I will not get into details of that. We can treat pointers to the same place as the same in this discussion.

Why can arrays be assigned directly?

Consider this code snippet:
void foo(int a[], int b[]){
static_assert(sizeof(a) == sizeof(int*));
static_assert(sizeof(b) == sizeof(int*));
b = a;
printf("%d", b[1]);
assert(a == b); // This also works!
}
int a[3] = {[1] = 2}, b[1];
foo(a, b);
Output (no compilation error):
2
I can't get the point why b = a is valid. Even though arrays may decay to pointers, shouldn't they decay to const pointers (T * const)?
They can't.
Arrays cannot be assigned to. There are no arrays in the foo function. The syntax int a[] in a function parameter list means to declare that a has type "pointer to int". The behaviour is exactly the same as if the code were void foo(int *a, int *b). (C11 6.7.6.3/7)
It is valid to assign one pointer to another. The result is that both pointers point to the same location.
Even though arrays may decay to pointers, shouldn't they decay to const pointers (T * const)?
The pointer that results from array "decay" is an rvalue. The const qualifier is only meaningful for lvalues (C11 6.7.3/4). (The term "decay" refers to conversion of the argument, not the adjustment of the parameter).
Quoting C11, chapter §6.7.6.3, Function declarators (including prototypes)
A declaration of a parameter as ‘‘array of type’’ shall be adjusted to ‘‘qualified pointer to
type’’, where the type qualifiers (if any) are those specified within the [ and ] of the
array type derivation. [...]
So, a and b are actually pointers, not arrays.
There's no assignment to any array type happennning here, hence there's no problem with the code.
Yes, it would have made sense for array parameters declared with [] to be adjusted to const-qualified pointers. However, const did not exist when this behavior was established.
When the C language was being developed, it made sense to pass an array by passing its address, or, more specifically, the address of the first element. You certainly did not want to copy the entire array to pass it. Passing the address was an easy way to make the array known to the called function. (The semantics for the reference types we see in C++ had not been invented yet.) To make that easy for programmers, so that they could write foo(ArrayA, ArrayB) instead of foo(&Array[0], &ArrayB[0]), the mechanism of converting an array to a pointer to its first element was invented. (Per M.M. and The Development of the C Language by Dennis M. Ritchie, this notation for parameters already existed in C’s predecessor language, B.)
That is fine, you have hidden the conversion. But that is only where the function is called. In the called routine, the programmer who is thinking about passing an array is going to write void foo(int ArrayA[], int ArrayB[]). But since we are actually passing pointers, not arrays, these need to be changed to int *ArrayA and int *ArrayB. So the notion that parameters declared as arrays are automatically adjusted to pointers was created.
As you observe, this leaves the programmer able to assign values to the parameters, which changes the apparent base address of the array. It would have made sense for a parameter declared as int ArrayA[] to be adjusted to int * const ArrayA, so that the value of the parameter ArrayA could not be changed. Then it would act more like an array, whose address also cannot be changed, so this better fits the goal of pretending to pass arrays even though we are passing addresses.
However, at the time, const did not exist, so this was not possible, and nobody thought of inventing const at that time (or at least did work on it enough to get it adopted into the language).
Now there is a large amount of source code in the world that works with the non-const adjustment. Changing the specification of the C language now would cause problems with the existing code.

Can I use arrays as a function parameter in C99?

The C99 standard says the following in 6.7.5.3/7:
A declaration of a parameter as ‘‘array of type’’ shall be adjusted to ‘‘qualified pointer to
type’’, where the type qualifiers (if any) are those specified within the [ and ] of the
array type derivation.
Which I understand as:
void foo(int * arr) {} // valid
void foo(int arr[]) {} // invalid
However, gcc 4.7.3 will happily accept both function definitions, even when compiled with gcc -Wall -Werror -std=c99 -pedantic-errors. Since I am not a C expert, I am unsure if maybe I misinterpreted what the standard is saying.
I also noticed that
size_t foo(int arr[]) { return sizeof(arr); }
will always return sizeof(int *) instead of the array size, which firms my belief that int arr[] is handled as int * and gcc is just trying to make me feel more comfortable.
Can someone shed some light on this issue? Just for reference, this question arose from this comment.
Some context:
First of all, remember that when an expression of type "N-element array of T" appears in a context where it isn't the operand of the sizeof or unary & operator, or isn't a string literal being used to initialize another array in a declaration, it will be converted to an expression of type "pointer to T" and its value will be the address of the first element in the array.
That means when you pass an array argument to a function, the function will receive a pointer value as a parameter; the array expression is converted to a pointer type before the function is called.
That's all well and good, but why is arr[] allowed as a pointer declaration? I can't say that this is the reason for sure, but I suspect it's a holdover from the B language, from which C was derived. In fact, pretty much everything hinky or unintuitive about arrays in C is a holdover from B.
B was a "typeless" language; you didn't have different types for floats, integers, text, whatever. Everything was stored as fixed-size words, or "cells", and memory was treated as a linear array of cells. When you declared an array in B, as in
auto arr[10];
the compiler would set aside 10 cells for the array, and then set aside an additional 11th cell that would store an offset to the first element of the array, and that additional cell would be bound to the variable arr. As in C, array indexing in B was computed as *(arr + i); you'd take the value stored in arr, add an offset i, and dereference the result. Ritchie retained most of these semantics, with the huge exception of no longer setting aside storage for the pointer to the first element of the array; instead, that pointer value would be computed from the array expression itself when the code was translated. This is why array expressions are converted to pointer types, why &arr and arr give the same value, if different types (the address of the array and the address of the first element of the array are the same) and why an array expression cannot be the target of an assignment (there's nothing to assign to; no storage has been set aside for a variable independent of the array elements).
Now here's the fun bit; in B, you'd declare a "pointer" as
auto ptr[];
This had the effect of allocating the cell to store the offset to the first element of the array and binding it to ptr, but ptr didn't point anywhere in particular; you could assign it to point to various locations. I suspect that notation was held over for a couple of reasons:
Most of the guys who worked on the initial version of C were familiar with it;
It sort of emphasizes that the parameter represents an array in the caller;
Personally, I would have preferred that Ritchie had used * to designate pointers everywhere, but he didn't (or, alternately, use [] to designate a pointer in all contexts, not just a function parameter declaration). I will normally recommend that everyone use * notation for function parameters instead of [], simply because it more accurately conveys the type of the parameter, but I can understand why people would prefer the second notation.
Both your valid and invalid declarations are internally equivalent, i.e., the compiler converts the latter to the former.
What your function sees is the pointer to the first element of the array.
PS. The alternative would be to push the whole array on the stack, which would be grossly inefficient from both time and space viewpoints.

Resources