In the following bit of code, pointer values and pointer addresses differ as expected.
But array values and addresses don't!
How can this be?
Output
my_array = 0022FF00
&my_array = 0022FF00
pointer_to_array = 0022FF00
&pointer_to_array = 0022FEFC
#include <stdio.h>
int main()
{
char my_array[100] = "some cool string";
printf("my_array = %p\n", my_array);
printf("&my_array = %p\n", &my_array);
char *pointer_to_array = my_array;
printf("pointer_to_array = %p\n", pointer_to_array);
printf("&pointer_to_array = %p\n", &pointer_to_array);
printf("Press ENTER to continue...\n");
getchar();
return 0;
}
The name of an array usually evaluates to the address of the first element of the array, so array and &array have the same value (but different types, so array+1 and &array+1 will not be equal if the array is more than 1 element long).
There are two exceptions to this: when the array name is an operand of sizeof or unary & (address-of), the name refers to the array object itself. Thus sizeof array gives you the size in bytes of the entire array, not the size of a pointer.
For an array defined as T array[size], the converted name has type T *. When/if you add 1 to it, you get to the next element in the array.
&array evaluates to the same address, but given the same definition, it creates a pointer of the type T(*)[size] -- i.e., it's a pointer to an array, not to a single element. If you increment this pointer, it'll add the size of the entire array, not the size of a single element. For example, with code like this:
char array[16];
printf("%p\t%p", (void*)&array, (void*)(&array+1));
We can expect the second pointer to be 16 greater than the first (because it's an array of 16 chars). Since %p typically prints pointers in hexadecimal, it might look something like:
0x12341000 0x12341010
That's because the array name (my_array) is different from a pointer to array. It is an alias to the address of an array, and its address is defined as the address of the array itself.
The pointer is a normal C variable on the stack, however. Thus, you can take its address and get a different value from the address it holds inside.
In C, when you use the name of an array in an expression (including passing it to a function), unless it is the operand of the address-of (&) operator or the sizeof operator, it decays to a pointer to its first element.
That is, in most contexts array is equivalent to &array[0] in both type and value.
In your example, my_array has type char[100] which decays to a char* when you pass it to printf.
&my_array has type char (*)[100] (pointer to array of 100 char). As it is the operand to &, this is one of the cases in which my_array doesn't immediately decay to a pointer to its first element.
The pointer to the array has the same address value as a pointer to the first element of the array as an array object is just a contiguous sequence of its elements, but a pointer to an array has a different type to a pointer to an element of that array. This is important when you do pointer arithmetic on the two types of pointer.
pointer_to_array has type char * - initialized to point at the first element of the array as that is what my_array decays to in the initializer expression - and &pointer_to_array has type char ** (pointer to a pointer to a char).
Of these: my_array (after decay to char*), &my_array and pointer_to_array all point directly at either the array or the first element of the array and so have the same address value.
The reason why my_array and &my_array result in the same address can be easily understood when you look at the memory layout of an array.
Let's say you have an array of 10 characters (instead of the 100 in your code).
char my_array[10];
Memory for my_array looks something like:
+---+---+---+---+---+---+---+---+---+---+
|   |   |   |   |   |   |   |   |   |   |
+---+---+---+---+---+---+---+---+---+---+
^
|
Address of my_array.
In C/C++, an array decays to the pointer to the first element in an expression such as
printf("my_array = %p\n", my_array);
If you examine where the first element of the array lies you will see that its address is the same as the address of the array:
my_array[0]
|
v
+---+---+---+---+---+---+---+---+---+---+
|   |   |   |   |   |   |   |   |   |   |
+---+---+---+---+---+---+---+---+---+---+
^
|
Address of my_array[0].
In the B programming language, which was the immediate predecessor to C, pointers and integers were freely interchangeable. The system would behave as though all of memory was a giant array. Each variable name had either a global or stack-relative address associated with it; for each variable name, the only things the compiler had to keep track of were whether it was a global or local variable, and its address relative to the first global or local variable.
A global declaration like i; [there was no need to specify a type, since everything was an integer/pointer] would be processed by the compiler as: address_of_i = next_global++; memory[address_of_i] = 0; and a statement like i++ would be processed as: memory[address_of_i] = memory[address_of_i]+1;.
A declaration like arr[10]; would be processed as address_of_arr = next_global; memory[next_global] = next_global; next_global += 10;. Note that as soon as that declaration was processed, the compiler could immediately forget about arr being an array. A statement like arr[i]=6; would be processed as memory[memory[address_of_arr] + memory[address_of_i]] = 6;. The compiler wouldn't care whether arr represented an array and i an integer, or vice versa. Indeed, it wouldn't care if they were both arrays or both integers; it would perfectly happily generate the code as described, without regard for whether the resulting behavior would likely be useful.
One of the goals of the C programming language was to be largely compatible with B. In B, the name of an array [called a "vector" in the terminology of B] identified a variable holding a pointer which was initially assigned to point to the first element of an allocation of the given size, so if that name appeared in the argument list for a function, the function would receive a pointer to the vector. Even though C added "real" array types, whose name was rigidly associated with the address of the allocation rather than a pointer variable that would initially point to the allocation, having arrays decompose to pointers made code which declared a C-type array behave identically to B code which declared a vector and then never modified the variable holding its address.
Actually, &my_array and my_array both yield the base address.
If you want to see the difference, then instead of using
printf("my_array = %p\n", my_array);
printf("my_array = %p\n", &my_array);
use
printf("my_array = %s\n", my_array);
printf("my_array = %p\n", my_array);
Is an array's name a pointer in C?
If not, what is the difference between an array's name and a pointer variable?
An array is an array and a pointer is a pointer, but in most cases array names are converted to pointers. A term often used is that they decay to pointers.
Here is an array:
int a[7];
a contains space for seven integers, and you can put a value in one of them with an assignment, like this:
a[3] = 9;
Here is a pointer:
int *p;
p doesn't contain any spaces for integers, but it can point to a space for an integer. We can, for example, set it to point to one of the places in the array a, such as the first one:
p = &a[0];
What can be confusing is that you can also write this:
p = a;
This does not copy the contents of the array a into the pointer p (whatever that would mean). Instead, the array name a is converted to a pointer to its first element. So that assignment does the same as the previous one.
Now you can use p in a similar way to an array:
p[3] = 17;
The reason that this works is that the array subscript operator in C, [ ], is defined in terms of pointers. x[y] means: start with the pointer x, step y elements forward from what the pointer points to, and then take whatever is there. Using pointer arithmetic syntax, x[y] can also be written as *(x+y).
For this to work with a normal array, such as our a, the name a in a[3] must first be converted to a pointer (to the first element in a). Then we step 3 elements forward, and take whatever is there. In other words: take the element at position 3 in the array. (Which is the fourth element in the array, since the first one is numbered 0.)
So, in summary, array names in a C program are (in most cases) converted to pointers. One exception is when we use the sizeof operator on an array. If a was converted to a pointer in this context, sizeof a would give the size of a pointer and not of the actual array, which would be rather useless, so in that case a means the array itself.
When an array is used as a value, its name represents the address of the first element.
When an array is not used as a value its name represents the whole array.
int arr[7];
/* arr used as value */
foo(arr);
int x = *(arr + 1); /* same as arr[1] */
/* arr not used as value */
size_t bytes = sizeof arr;
void *q = &arr; /* void pointers are compatible with pointers to any object */
If an expression of array type (such as the array name) appears in a larger expression and it isn't the operand of either the & or sizeof operators, then the type of the array expression is converted from "N-element array of T" to "pointer to T", and the value of the expression is the address of the first element in the array.
In short, the array name is not a pointer, but in most contexts it is treated as though it were a pointer.
Edit
Answering the question in the comment:
If I use sizeof, do i count the size of only the elements of the array? Then the array “head” also takes up space with the information about length and a pointer (and this means that it takes more space, than a normal pointer would)?
When you create an array, the only space that's allocated is the space for the elements themselves; no storage is materialized for a separate pointer or any metadata. Given
char a[10];
what you get in memory is
   +---+
a: |   | a[0]
   +---+
   |   | a[1]
   +---+
   |   | a[2]
   +---+
    ...
   +---+
   |   | a[9]
   +---+
The expression a refers to the entire array, but there's no object a separate from the array elements themselves. Thus, sizeof a gives you the size (in bytes) of the entire array. The expression &a gives you the address of the array, which is the same as the address of the first element. The difference between &a and &a[0] is the type of the result [1] - char (*)[10] in the first case and char * in the second.
Where things get weird is when you want to access individual elements - the expression a[i] is defined as the result of *(a + i) - given an address value a, offset i elements (not bytes) from that address and dereference the result.
The problem is that a isn't a pointer or an address - it's the entire array object. Hence the rule in C: whenever the compiler sees an expression of array type (such as a, which has type char [10]) and that expression isn't the operand of the sizeof or unary & operators, the type of that expression is converted ("decays") to a pointer type (char *), and the value of the expression is the address of the first element of the array. Therefore, the expression a has the same type and value as the expression &a[0] (and by extension, the expression *a has the same type and value as the expression a[0]).
C was derived from an earlier language called B, and in B a was a separate pointer object from the array elements a[0], a[1], etc. Ritchie wanted to keep B's array semantics, but he didn't want to mess with storing the separate pointer object. So he got rid of it. Instead, the compiler will convert array expressions to pointer expressions during translation as necessary.
Remember that I said arrays don't store any metadata about their size. As soon as that array expression "decays" to a pointer, all you have is a pointer to a single element. That element may be the first of a sequence of elements, or it may be a single object. There's no way to know based on the pointer itself.
When you pass an array expression to a function, all the function receives is a pointer to the first element - it has no idea how big the array is (this is why the gets function was such a menace and was eventually removed from the library). For the function to know how many elements the array has, you must either use a sentinel value (such as the 0 terminator in C strings) or you must pass the number of elements as a separate parameter.
[1] Which may affect how the address value is interpreted, depending on the machine.
An array declared like this
int a[10];
allocates memory for 10 ints. You can't modify a but you can do pointer arithmetic with a.
A pointer like this allocates memory for just the pointer p:
int *p;
It doesn't allocate any ints. You can modify it:
p = a;
and use array subscripts as you can with a:
p[2] = 5;
a[2] = 5; // same
*(p+2) = 5; // same effect
*(a+2) = 5; // same effect
The array name by itself yields a memory location, so you can treat the array name like a pointer:
int a[7];
a[0] = 1976;
a[1] = 1984;
printf("memory location of a: %p", a);
printf("value at memory location %p is %d", a, *a);
And the other nifty stuff you can do to a pointer (e.g. adding/subtracting an offset), you can also do to an array:
printf("value at memory location %p is %d", a + 1, *(a + 1));
Language-wise, if C didn't expose the array as some sort of "pointer" (pedantically, it's just a memory location: it cannot point to an arbitrary location in memory, nor can it be controlled by the programmer), we would always need to write this:
printf("value at memory location %p is %d", &a[1], a[1]);
I think this example sheds some light on the issue:
#include <stdio.h>
int main()
{
int a[3] = {9, 10, 11};
int **b = &a;
printf("a == &a: %d\n", a == b);
return 0;
}
It compiles fine (with 2 warnings) in gcc 4.9.2, and prints the following:
a == &a: 1
oops :-)
So, the conclusion is no: the array is not a pointer, and it is not stored in memory (not even read-only memory) as a pointer, even though it looks like it is, since you can obtain its address with the & operator. But - oops - that operator does not do what you might expect :-). Either way, you've been warned:
p.c: In function ‘main’:
pp.c:6:12: warning: initialization from incompatible pointer type
int **b = &a;
^
p.c:8:28: warning: comparison of distinct pointer types lacks a cast
printf("a == &a: %d\n", a == b);
C++ refuses any such attempts with errors in compile-time.
Edit:
This is what I meant to demonstrate:
#include <stdio.h>
int main()
{
int a[3] = {9, 10, 11};
void *c = a;
void *b = &a;
void *d = &c;
printf("a == &a: %d\n", a == b);
printf("c == &c: %d\n", c == d);
return 0;
}
Even though c and a "point" to the same memory, you can obtain address of the c pointer, but you cannot obtain the address of the a pointer.
The following example provides a concrete difference between an array name and a pointer. Let's say that you want to represent a 1D line with some given maximum dimension. You could do it either with an array or a pointer:
#include <stdio.h>
#include <stdlib.h>
typedef struct {
int length;
int line_as_array[1000];
int* line_as_pointer;
} Line;
Now let's look at the behavior of the following code:
void do_something_with_line(Line line) {
line.line_as_pointer[0] = 0;
line.line_as_array[0] = 0;
}
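int main(void) {
Line my_line;
my_line.length = 20;
my_line.line_as_pointer = calloc(my_line.length, sizeof(int));
if (my_line.line_as_pointer == NULL) return 1; /* allocation failed */
my_line.line_as_pointer[0] = 10;
my_line.line_as_array[0] = 10;
do_something_with_line(my_line); /* passes a copy of the whole struct */
printf("%d %d\n", my_line.line_as_pointer[0], my_line.line_as_array[0]);
free(my_line.line_as_pointer);
return 0;
}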
This code will output:
0 10
That is because the struct was copied in the call to do_something_with_line, so:
The pointer line_as_pointer in the copy still contains the same address, so the write goes to the caller's allocation.
The array line_as_array was copied to a new address, and that copy does not outlive the function call.
So while arrays are not passed by value when you pass them directly to functions, they are passed by value (i.e. copied) when you wrap them in structs, which highlights a major difference in behavior compared to the implementation using pointers.
The array name behaves like a pointer and points to the first element of the array. Example:
int a[]={1,2,3};
printf("%p\n",a); //result is similar to 0x7fff6fe40bc0
printf("%p\n",&a[0]); //result is similar to 0x7fff6fe40bc0
Both print statements will give exactly the same output on a given machine. On my system it gave:
0x7fff6fe40bc0
I have the following C program:
#include <stdio.h>
int main(){
int a[2][2] = {1, 2, 3, 4};
printf("a:%p, &a:%p, *a:%p \n", a, &a, *a);
printf("a[0]:%p, &a[0]:%p \n", a[0], &a[0]);
printf("&a[0][0]:%p \n", &a[0][0]);
return 0;
}
It gives the following output:
a:0028FEAC, &a:0028FEAC, *a:0028FEAC
a[0]:0028FEAC, &a[0]:0028FEAC
&a[0][0]:0028FEAC
I am not able to understand why are &a, a, *a - all identical. The same for a[0], &a[0] and &a[0][0].
EDIT:
Thanks to the answers, I've understood the reason why these values are coming out to be equal. This line from the book by Kernighan & Ritchie turned out to be the key to my question:
the name of an array is a synonym for the location of the initial element.
So, by this, we get
a = &a[0], and
a[0] = &a[0][0] (considering a as an array of arrays)
Intuitively, now the reason is clear behind the output. But, considering how pointers are implemented in C, I can't understand how a and &a are equal. I am assuming that there is a variable a in memory which points to the array(and the starting address of this array-memory-block would be the value of this variable a).
But, when we do &a, doesn't that mean taking the address of the memory location where the variable a was stored? Why are these values equal then?
They're not identical pointers. They're pointers of distinct types that all point to the same memory location. Same value (sort of), different types.
A 2-dimensional array in C is nothing more or less than an array of arrays.
The object a is of type int[2][2], or 2-element array of 2-element array of int.
Any expression of array type is, in most but not all contexts, implicitly converted to ("decays" to) a pointer to the array object's first element. So the expression a, unless it's the operand of unary & or sizeof, is of type int(*)[2], and is equivalent to &a[0] (or &(a[0]) if that's clearer). It becomes a pointer to row 0 of the 2-dimensional array. It's important to remember that this is a pointer value (or equivalently an address), not a pointer object; there is no pointer object here unless you explicitly create one.
So looking at the several expressions you asked about:
&a is the address of the entire array object; it's a pointer expression of type int(*)[2][2].
a is the name of the array. As discussed above, it "decays" to a pointer to the first element (row) of the array object. It's a pointer expression of type int(*)[2].
*a dereferences the pointer expression a. Since a (after it decays) is a pointer to an array of 2 ints, *a is an array of 2 ints. Since that's an array type, it decays (in most but not all contexts) to a pointer to the first element of the array object. So it's of type int*. *a is equivalent to &a[0][0].
&a[0] is the address of the first (0th) row of the array object. It's of type int(*)[2]. a[0] is an array object; it doesn't decay to a pointer because it's the direct operand of unary &.
&a[0][0] is the address of element 0 of row 0 of the array object. It's of type int*.
All of these pointer expressions refer to the same location in memory. That location is the beginning of the array object a; it's also the beginning of the array object a[0] and of the int object a[0][0].
The correct way to print a pointer value is to use the "%p" format and to convert the pointer value to void*:
printf("&a = %p\n", (void*)&a);
printf("a = %p\n", (void*)a);
printf("*a = %p\n", (void*)*a);
/* and so forth */
This conversion to void* yields a "raw" address that specifies only a location in memory, not what type of object is at that location. So if you have multiple pointers of different types that point to objects that begin at the same memory location, converting them all to void* yields the same value.
(I've glossed over the inner workings of the [] indexing operator. The expression x[y] is by definition equivalent to *(x+y), where x is a pointer (possibly the result of the implicit conversion of an array) and y is an integer. Or vice versa, but that's ugly; arr[0] and 0[arr] are equivalent, but that's useful only if you're writing deliberately obfuscated code. If we account for that equivalence, it takes a paragraph or so to describe what a[0][0] means, and this answer is probably already too long.)
For the sake of completeness the three contexts in which an expression of array type is not implicitly converted to a pointer to the array's first element are:
When it's the operand of unary &, so &arr yields the address of the entire array object;
When it's the operand of sizeof, so sizeof arr yields the size in bytes of the array object, not the size of a pointer; and
When it's a string literal in an initializer used to initialize an array (sub-)object, so char s[6] = "hello"; copies the array value into s rather than nonsensically initializing an array object with a pointer value. This last exception doesn't apply to the code you're asking about.
(The N1570 draft of the 2011 ISO C standard states that _Alignof is a fourth exception; this is incorrect, since _Alignof can only be applied to a parenthesized type name, not to an expression. The error is corrected in the final C11 standard.)
Recommended reading: Section 6 of the comp.lang.c FAQ.
Because all expressions are pointing to the beginning of the array:
a = {{a00, a01}, {a10, a11}}
a points to the array, just because it is an array, so a == &a[0]
and &a[0][0] is positioned at the first cell of the 2D array.
+------------------------------+
| a[0][0] <-- a[0] <-- a | // <--&a, a,*a, &a[0],&a[0][0]
|_a[0][1]_ |
| a[1][0] <-- a[1] |
| a[1][1] |
+------------------------------+
It is printing out the same values because they all are pointing to the same location.
Having said that,
&a[i][i] is of type int * which is a pointer to an integer.
a and &a[0] have the type int(*)[2] which indicates a pointer to an array of 2 ints.
&a has the type of int(*)[2][2] which indicates a pointer to a 2-D array or a pointer to an array of two elements in which each element is an array of 2-ints.
So, all of them are of different type and behave differently if you start doing pointer arithmetic on them.
(&a[0][0] + 1) points to the next integer element in the 2-D array, i.e. to a[0][1]
&a[0] + 1 points to the next array of integers i.e. to a[1][0]
&a + 1 points to the next 2-D array which is non-existent in this case, but would be a[2][0] if present.
You know that a is the address of the first element of your array and according to the C standard, a[X] is equal to *(a + X).
So:
&a[0] == a because &a[0] is the same as &(*(a + 0)) = &(*a) = a.
&a[0][0] has the same value as a because &a[0][0] is the same as &(*(*(a + 0) + 0)) = &(**a) = *a, and *a points to the same location as a (only the type differs).
A 2D array in C is treated as a 1D array whose elements are 1D arrays (the rows).
For example, a 4x3 array of T (where "T" is some data type) may
be declared by: T a[4][3], and described by the following
scheme:
+-----+-----+-----+
a == a[0] ---> | a00 | a01 | a02 |
+-----+-----+-----+
+-----+-----+-----+
a[1] ---> | a10 | a11 | a12 |
+-----+-----+-----+
+-----+-----+-----+
a[2] ---> | a20 | a21 | a22 |
+-----+-----+-----+
+-----+-----+-----+
a[3] ---> | a30 | a31 | a32 |
+-----+-----+-----+
Also the array elements are stored in memory row after row.
Reading the declaration T a[4][3] from the name outward: a[4] says that a is an array of 4 elements, and the T and the [3] around it say that each of those elements is an array of 3 elements of type T. Hence we have an array of 4 arrays of 3 elements each.
Now it is clear that a points to the first element (a[0]) of a. Likewise, &a[0] gives the address of the first element of a, which is the 0th row (a00 | a01 | a02), and &a[0][0] gives the address of the first element of that row (a00). &a gives the address of the whole 2D array a[4][3]. *a is the row a[0], which decays to a pointer to a[0][0].
Note that a is not a pointer to a[0][0]; instead it is a pointer to a[0].
Hence
G1: a and &a[0] are equivalent.
G2: *a, a[0] and &a[0][0] are equivalent.
G3: &a (gives the address of the 2D array a[4][3]).
But groups G1, G2 and G3 are not identical, although they all give the same address (for the reasons explained above).
This also means that in C arrays have no overhead. In some other languages the structure of arrays is
&a --> overhead
more overhead
&a[0] --> element 0
element 1
element 2
...
and &a != &a[0]
Intuitively, now the reason is clear behind the output. But, considering how pointers are implemented in C, I can't understand how a and &a are equal. I am assuming that there is a variable a in memory which points to the array(and the starting address of this array-memory-block would be the value of this variable a).
Well, no. There is no such thing as an address stored anywhere in memory. There is only memory allocated for the raw data, and that's it. What happens is, when you use a naked a, it immediately decays into a pointer to the first element, giving the impression that the 'value' of a were the address, but the only value of a is the raw array storage.
As a matter of fact, a and &a are different, but only in type, not in value. Let's make it a bit easier by using 1D arrays to clarify this point:
#include <assert.h>
#include <stdbool.h>

bool foo(int (*a)[2]) { //a function expecting a pointer to an array of two elements
    return (*a)[0] == (*a)[1]; //a pointer to an array needs to be dereferenced to access its elements
}
bool bar(int (*a)[3]); //a function expecting a pointer to an array of three elements
bool baz(int *a) { //a function expecting a pointer to an integer, which is typically used to access arrays.
    return a[0] == a[1]; //this uses pointer arithmetic to access the elements
}
int main(void) {
    int z[2] = {0, 0}; //initialized so that foo and baz read defined values
    assert((void*)z == (void*)&z); //the value of both is the address of the first element.
    foo(&z); //This works, we pass a pointer to an array of two elements.
    //bar(&z); //Error, bar expects a pointer to an array of three elements.
    //baz(&z); //Error, baz expects a pointer to an int
    //foo(z); //Error, foo expects a pointer to an array
    //bar(z); //Error, bar expects a pointer to an array
    baz(z); //Ok, the name of an array easily decays into a pointer to its first element.
    return 0;
}
As you see, a and &a behave very differently, even though they share the same value.
I know that an array decays to a pointer, such that if one declared
char things[8];
and then later on used things somewhere else, things is a pointer to the first element in the array.
Also, from my understanding, if one declares
char moreThings[8][8];
then moreThings is not of type pointer to char but of type "array of pointers to char," because the decay only occurs once.
When moreThings is passed to a function (say with prototype void doThings(char thingsGoHere[8][8])), what is actually going on with the stack?
If moreThings is not of pointer type, then is this really still a pass-by-reference? I guess I always thought that moreThings still represented the base address of the multidimensional array. What if doThings took input thingsGoHere and itself passed it to another function?
Is the rule pretty much that unless one specifies an array input as const then the array will always be modifiable?
I know that the type checking stuff only happens at compile time, but I'm still confused about what technically counts as a pass by reference (i.e. is it only when arguments of type pointer are passed, or would array of pointers be a pass-by-reference as well?)
Sorry to be a little all over the place with this question, but because of my difficulty in understanding this it is hard to articulate a precise inquiry.
You got it slightly wrong: moreThings also decays to a pointer to the first element, but since it is an array of an array of chars, the first element is an "array of 8 chars". So the decayed pointer is of this type:
char (*p)[8] = moreThings;
The value of the pointer is of course the same as the value of &moreThings[0][0], i.e. of the first element of the first element, and also the same as that of &moreThings, but the type is a different one in each case.
Here's an example of char a[N][3]:
+===============================+===============================+====
|+---------+---------+---------+|+---------+---------+---------+|
|| a[0][0] | a[0][1] | a[0][2] ||| a[1][0] | a[1][1] | a[1][2] || ...
|+---------+---------+---------+|+---------+---------+---------+| ...
|             a[0]              |             a[1]              |
+===============================+===============================+====
                                a
^^^
||+-- &a[0][0]
|+---- &a[0]
+------ &a
&a: address of the entire array of arrays of chars, which is a char[N][3]
&a[0], same as a: address of the first element, which is itself a char[3]
&a[0][0]: address of the first element of the first element, which is a char
This demonstrates that different objects may have the same address, but if two objects have the same address and the same type, then they are the same object.
"ARRAY ADDRESS AND POINTERS TO MULTIDIMENSIONAL ARRAYS"
Let's start with a 1-D array:
The declaration char a[8]; creates an array of 8 elements.
Here a evaluates to the address of the first element, not the address of the array.
char* ptr = a; is a correct expression, as ptr is a pointer to char and can hold the address of the first element.
But the expression ptr = &a is wrong, because ptr can't hold the address of an array.
&a means the address of the array. The values of a and &a are the same, but semantically they are different: one is the address of a char, the other is the address of an array of 8 chars.
char (*ptr2)[8]; Here ptr2 is a pointer to an array of 8 chars, and this time
ptr2 = &a is a valid expression.
The data type of &a is char(*)[8], and the type of a is char[8], which simply decays into char* in most operations, e.g. char* ptr = a;
To understand better read: Difference between char *str and char str[] and how both stores in memory?
Second case,
The declaration char aa[8][8]; creates a 2-D array of size 8x8.
Any 2-D array can also be viewed as a 1-D array in which each element is a 1-D array.
aa is the address of the first element, which is an array of 8 chars. The expression ptr2 = aa is valid and correct.
If we declare as follows:
char (*ptr3)[8][8];
ptr3 = &aa; //is a correct expression
Similarly,
moreThings in your declaration char moreThings[8][8]; evaluates to the address of the first element, which is a char array of 8 elements.
To understand better read: Difference between char* str[] and char str[][] and how both stores in memory?
It would be interesting to know:
moreThings is the address of an array of 8 chars.
*moreThings is the address of the first element, that is, &moreThings[0][0].
&moreThings is the address of the 2-D array of 8 x 8.
The address values of all three are the same, but semantically they are all different.
**moreThings is the value of the first element, that is, moreThings[0][0].
To understand better read: Difference between &str and str, when str is declared as char str[10]?
Furthermore,
void doThings(char thingsGoHere[8][8]) is nothing but void doThings(char (*thingsGoHere)[8]) and thus accepts any array that is two dimensional with the second dimension being 8.
About types of variables in C and C++ (I would like to add this to the answer):
Nothing is passed by reference in C; it's a C++ concept. If the term is used for C, the author is talking about pointer variables.
C supports pass by address and pass by value.
C++ supports pass by address, pass by value and also pass by reference.
Read: pointer variables and reference variables
At the end,
The name of an array is a constant identifier, not a variable.
Nicely explained by Kerrek,
In addition to that, we can prove it by the following example:
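#include <stdio.h>
int main (void)
{
    int a[10][10];
    /* %p expects a void *, so each pointer is converted explicitly */
    printf (".. %p %p\n", (void *)&a, (void *)(&a+1));
    printf (".. %p %p\n", (void *)&a[0], (void *)(&a[0]+1));
    printf (".. %p %p\n", (void *)&a[0][0], (void *)(&a[0][0]+1));
    return 0;
}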
The output is:
.. 0x7fff6ae2ca5c 0x7fff6ae2cbec = 400 bytes difference
.. 0x7fff6ae2ca5c 0x7fff6ae2ca84 = 40 bytes difference
.. 0x7fff6ae2ca5c 0x7fff6ae2ca60 = 4 bytes difference.
&a + 1 -> Moves the pointer by the size of the whole array, i.e. 400 bytes.
&a[0] + 1 -> Moves the pointer by the size of one row, i.e. 40 bytes.
&a[0][0] + 1 -> Moves the pointer by the size of one element, i.e. 4 bytes.
[ assuming int is 4 bytes ]
Hope this helps. :)