This question already has answers here:
how do arrays work internally in c/c++
(4 answers)
Closed 4 years ago.
I have a question about how C arrays are stored in memory. But I'm having trouble formulating the question, so here's my best try to put it into words. I have trouble with English. Let's say we have a three dimensional array:
int foo[2][3][4];
Elements can be accessed using either array or pointer notation:
foo[i][j][k]
*(*(*(foo + i) + j) + k)
We could think of the array as a pointer to a pointer to a pointer to an int, or, for example, a pointer to a 2 dimensional array like (*a)[2][3].
The problem in my thinking is this: I would have thought that in order to 'extract' values in the array, we'd only have to dereference the top level of the array (i.e. [i]) once, the second level (i.e. [j]) twice, and the third level (i.e. [k]) three times. But actually we always have to dereference three times to get to any value. Why is this? Or is this really the case?
I try to imagine the array structure in memory.
Apologies for my poor way to express this.
Your array of arrays of arrays foo is arranged like this in memory:
+--------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+----------+
| foo[0][0][0] | foo[0][0][1] | foo[0][0][2] | foo[0][0][3] | foo[0][1][0] | foo[0][1][1] | foo[0][1][2] | foo[0][1][3] | ... etc. |
+--------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+----------+
foo[0][0][0] will be at the lowest memory location, foo[1][2][3] will be at the highest.
And an important note: An array is not a pointer. It can decay to a pointer to its first element, but please don't "think of" an array as a pointer.
Another important note: The pointers &foo, &foo[0], &foo[0][0] and &foo[0][0][0] are all pointing to the same location, but they are all different types which makes them semantically different:
&foo is of the type int (*)[2][3][4]
&foo[0] is of the type int (*)[3][4]
&foo[0][0] is of the type int (*)[4]
And &foo[0][0][0] is of the type int *
Lastly, a note about array-to-pointer decay: it only happens one level at a time. That means foo decays to &foo[0], foo[0] decays to &foo[0][0], and foo[0][0] decays to &foo[0][0][0].
An array is not just a storage location (memory address); it is an object with a type and a layout.
So in your example, foo is an array of 2 elements, each of which is an array of 3 arrays of 4 ints. *foo is an array of 3 arrays of 4 ints, and **foo is an array of 4 ints.
Even though each level of dereferencing gives the same memory address, they're different because they have different types, and thus the data at the same location should be interpreted differently.
Related
This question already has answers here:
Is an array name a pointer?
(8 answers)
Closed 5 years ago.
#include <stdio.h>

int main(void)
{
    int array[10] = {1,2,3,4,5,6,7};
    printf("%p\n",array);
}
Here, the system would allocate memory on the stack equivalent to 10 integers for the array. However, I don't think any extra memory is allocated for the variable array itself; I presume array is a mnemonic for human understanding and coding purposes. If that is the case, how does printf() in the statement printf("%p\n",array); accept it as though it were a pointer variable?
This confusion becomes more evident as the dimension(s) of the array keeps increasing.
int main()
{
int matrix[2][4] = {{11,22,33,99},{44,55,66,110}};
printf("%p\n", matrix);
printf("%p\n", matrix+1);
printf("%p\n", *(matrix+1));
}
The output of one execution of the program was:
0x7ffd9ba44d10
0x7ffd9ba44d20
0x7ffd9ba44d20
So both matrix+1 and *(matrix+1) print the same virtual memory address. I understand why matrix+1 has the address it displays, but I don't understand why *(matrix+1) gives the same address even after the indirection!
Well, arrays are not pointers. In most cases (the exceptions are the operands of sizeof, the unary & operator, _Alignof, etc.) an array is converted to a pointer to its first element (this is called array decay).
So here matrix is converted (decay) into pointer to first element - which is int (*)[4] when passed to printf.
Now dissect them one by one. matrix+1 points to the second element of the 2D array, which is the second int[4] row (that's why the two addresses are sizeof(int)*4 bytes apart).
In the third case they are the same because matrix+1 is of type int (*)[4], and when you dereference it you get an int[4]. That array in turn decays to an int * pointing to its first element, which lives at the same address as the row itself.
There is one thing to keep in mind: a pointer has two things
Its value
Its type.
Two pointers may have the same value but their type may be different. Here also you saw that.
It (decaying) is mentioned in standard 6.3.2.1p3:-
Except when it is the operand of the sizeof operator, the _Alignof operator, or the unary & operator, or is a string literal used to initialize an array, an expression that has type 'array of type' is converted to an expression with type 'pointer to type' that points to the initial element of the array object and is not an lvalue.
Also, you are printing the address the wrong way: %p expects a void *, so this is one of the cases where you would use a cast.
printf("%p",(void*)matrix);
To make things a bit more clear:
matrix is basically an object of type int[2][4], which in the cases you have shown decays to int (*)[4]. You might wonder what makes matrix+1 point to the second element of the array. The thing is that pointer arithmetic is dictated by the type the pointer points to. Here matrix has decayed into a pointer to its first element (int (*)[4]), so adding 1 moves it by the size of 4 ints.
This question already has answers here:
Why can't we use double pointer to represent two dimensional arrays?
(6 answers)
Closed 6 years ago.
int main()
{
int matrix[2][4] = {{11,22,33,99},{44,55,66,110}};
int **ptr = (int**)matrix;
printf("%d%d",**matrix,*ptr);
}
But when a 2-d array is passed as a parameter it is converted into (*matrix)[4] ..
What type does the compiler store this array as? Is it stored as a 2-d array, a double pointer, or a pointer to an array? If it is stored as an array, how is it interpreted differently in different situations like the ones above? Please help me understand.
Is 2d array a double pointer?
No. This line of your program is incorrect:
int **ptr = (int**)matrix;
This answer deals with the same topic
If you want a concrete picture of how multidimensional arrays are implemented:
The rules for multidimensional arrays are not different from those for ordinary arrays, just substitute the "inner" array type as element type. The array items are stored in memory directly after each other:
matrix: 11 22 33 99 44 55 66 110
        -----------               the first element of matrix
                    ------------  the second element of matrix
Therefore, to address element matrix[x][y], you take the base address of matrix + x*4 + y (4 is the inner array size).
When arrays are passed to functions, they decay to pointers to their first element. As you noticed, this would be int (*)[4]. The 4 in the type would then tell the compiler the size of the inner type, which is why it works. When doing pointer arithmetic on a similar pointer, the compiler adds multiples of the element size, so for matrix_ptr[x][y], you get matrix_ptr + x*4 + y, which is exactly the same as above.
The cast ptr=(int**)matrix is therefore incorrect. For one, *ptr would mean a pointer value stored at the address of matrix, but there isn't any. Secondly, there isn't a pointer to matrix[1] anywhere in the memory of the program.
Note: the calculations in this post assume sizeof(int)==1, to avoid unnecessary complexity.
No. A multidimensional array is a single block of memory. The size of the block is the product of the dimensions multiplied by the size of the type of the elements, and indexing in each pair of brackets offsets into the array by the product of the dimensions for the remaining dimensions. So..
int arr[5][3][2];
is an array that holds 30 ints. arr[0][0][0] gives the first, arr[1][0][0] gives the seventh (offsets by 3 * 2). arr[0][1][0] gives the third (offsets by 2).
The pointers the array decays to will depend on the level; arr decays to a pointer to a 3x2 int array, arr[0] decays to a pointer to a 2 element int array, and arr[0][0] decays to a pointer to int.
However, you can also have an array of pointers, and treat it as a multidimensional array -- but it requires some extra setup, because you have to set each pointer to its array. Additionally, you lose the information about the sizes of the arrays within the array (sizeof would give the size of the pointer). On the other hand, you gain the ability to have differently sized sub-arrays and to change where the pointers point, which is useful if they need to be resized or rearranged. An array of pointers like this can be indexed like a multidimensional array, even though it's allocated and arranged differently and sizeof won't always behave the same way with it. A statically allocated example of this setup would be:
int *arr[3];
int aa[2] = { 10, 11 },
ab[2] = { 12, 13 },
ac[2] = { 14, 15 };
arr[0] = aa;
arr[1] = ab;
arr[2] = ac;
After the above, arr[1][0] is 12. But instead of giving the int found at 1 * 2 * sizeof(int) bytes past the start address of the array arr, it gives the int found at 0 * sizeof(int) bytes past the address pointed to by arr[1]. Also, sizeof(arr[0]) is equivalent to sizeof(int *) instead of sizeof(int) * 2.
In C, there's nothing special you need to know to understand multi-dimensional arrays. They work exactly the same way as if they were never specifically mentioned. All you need to know is that you can create an array of any type, including an array.
So when you see:
int matrix[2][4];
Just think, "matrix is an array of 2 things -- those things are arrays of 4 integers". All the normal rules for arrays apply. For example, matrix can easily decay into a pointer to its first member, just like any other array, which in this case is an array of four integers. (Which can, of course, itself decay.)
If you can use the stack for that data (small volume) then you usually define the matrix:
int matrix[X][Y]
When you want to allocate it on the heap (large volume), then you usually define a:
int** matrix = NULL;
and then allocate the two dimensions with malloc/calloc.
An int** set up this way is indexed with the same matrix[i][j] syntax, but it is not a 2D array: it is an array of row pointers, and a real int matrix[X][Y] cannot be cast to int**. In both cases
**matrix == matrix[0][0] is true
This question already has answers here:
In C, are arrays pointers or used as pointers?
(6 answers)
Closed 8 years ago.
New to C. When I declare the following array:
char arr [3] = {'a', 'b', 'c'};
What does just the following represent?
arr
What's the difference between the following? What does each one represent specifically?
arr and &arr
What happens when you pass in the starting point of an array as a parameter to a function that accepts a 1D array? Are the values copied over to a new chunk of memory?
I understand that in C, an array is simply an allocated chunk of blocks of memory that are grouped together? Is that right? I'm trying to wrap my head around arrays in C, but I'm kind of confused. Please provide as much insight as you can.
arr &arr and &arr[0]
char arr[3] = {'1','2','3'};
-------------
| 1 | 2 | 3 |
-------------
|
|
arr, &arr and &arr[0]
So there is no difference when it comes to what these contain.
All have the starting address of the array.
In C, an array is a contiguous collection of elements of the same data type.
As mentioned what they contain is same but make a note of the below points.
&arr gives the address let's say 0x1000.
arr also gives 0x1000.
Now incrementing the pointer arr will give you the address of the next element of the array which is &arr[1]
`arr+1` != `&arr+1`
With &arr+1 the value &arr is incremented by the size of the whole array (in this case, 3 bytes); it does not point to the next element of the array the way arr+1 does.
So they are both different types and should be used with this in mind.
PS: Array name is not a modifiable lvalue.
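arr, &arr and &arr[0] all give the address of the first value in your array, that is, the address of arr[0].
In an expression they all yield pointer values, though not of the same type: arr and &arr[0] give a char *, while &arr gives a char (*)[3].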
As for your question:
What happens when you pass in the starting point of an array as a
parameter to a function that accepts a 1D array? Are the values copied
over to a new chunk of memory?
the answer is that the values are not copied. Only the address of the first element is passed, so whatever changes the function makes through it are reflected in the original array.
I'm a bit confused about pointer arrays and I just wanna make sure I'm right.
When I write int *arr it is just a pointer to an int variable, not an array yet. It is only when I initialize it (say with malloc) that it becomes an array. Am I right so far?
Also I have another question: we were given (in school) a little function that is supposed to return an array of grades, with the first cell being the average. The function was deliberately wrong: what they did was to set
int *grades = getAllGrades();
And then they decreased the pointer by one for the average 'cell'
*(grades - 1) = getAverage();
return *(grades - 1)
I know this is wrong because the returned value is not an array, I just don't know how to explain it. When I set a pointer, how does the machine/compiler know if I want just a pointer or an array?
(If I'm not clear, it's because I'm trying to ask about something that is still vague for me, my apologies)
how does the machine/compiler know if I want just a pointer or an
array?
It doesn't, and it never will. Suppose you
int *a = malloc(3 * sizeof(int));
You just allocated 12 bytes (assuming int is 4 bytes). But malloc only sees 12. Is that 1 big object or lots of little ones? It doesn't know. The only one who actually knows is you ;)
Now about your particular example,
int *grades = getAllGrades();
At this point, as you said, there's nothing to say whether grades points to an array. But you know it points to an array, and that's what's important. Or, maybe you know it doesn't point to an array. The key is you have to know what getAllGrades does, to know if it's returning an array or a pointer to 1 thing.
*(grades - 1) = getAverage();
return *(grades - 1)
This is not necessarily wrong, but it does look kind of sketchy. If it is an array, you would expect grades[0] == *(grades + 0) to be the first element, so grades[-1] == *(grades - 1) looks like it would be before the first element. Again, it's not necessarily wrong; maybe in getAllGrades they did:
int* getAllGrades() {
int *grades = malloc(sizeof(int) * 10);
return grades + 1;
}
ie they scooched the start up by 1. It's been known to happen (look in Numerical Recipes in C) but it's kind of odd.
Arrays are not pointers. Pointers are not arrays.
Perhaps it would be clearer to say that array objects are not pointer objects, and vice versa.
When you declare int *arr, arr is a pointer object. That's all it is; it cannot be, and never will be, an array.
When you execute arr = malloc(10 * sizeof *arr);, (if malloc() doesn't fail, which you should always check), arr now points to an int object. That object happens to be the first element of a 10-element array of int (the one created by the malloc call). Note that there is such a thing as a pointer to an array, but this isn't it.
Arrays, in a very real sense, are not first-class types in C. You can create and manipulate array objects as you can with any other type of objects, but you'll rarely deal with array values directly. Instead, you'll deal with the elements of an array object indirectly, via pointers to those elements. And in the case of the arr declaration above, you can perform arithmetic on the pointer to the first element to obtain pointers to the other elements (and you have to have some other mechanism to remember how many elements there are).
Any expression of array type, in most contexts, is implicitly converted to a pointer to the array's first element (the exceptions are: the operand of a unary & operator, the operand of the sizeof operator, and a string literal in an initializer used to initialize an array (sub)object). That's the rule that makes it seem as if arrays and pointers are interchangeable.
The array indexing operator [] is actually defined to work on pointers, not arrays. a[b] is simply another way of writing *((a)+(b)). If a happens to be the name of an array object, it's first converted to a pointer, as I describe above.
I highly recommend reading section 6 of the comp.lang.c FAQ. (The link is to the front page, not directly to section 6, because I like to encourage people to browse the whole thing.)
I mentioned that there are array pointers. Given int foo[10];, &foo[0] is a pointer to an int, but &foo is a pointer to the entire array. Both point to the same location in memory, but they're of different types, and they behave quite differently under pointer arithmetic.
When I write int *arr it is just a pointer to an int variable, not
an array yet. It is only that I initialize it (say with malloc) that
it becomes an array. Am I right so far?
Well, yes and no :). arr is a pointer to (or the address of) some block of memory. Until arr is initialized it probably points to an invalid, or non-sense memory address. So it may be confusing to think of arr as a pointer to an int variable. For example, before the malloc, you can't store an integer in the location that it is pointing to.
Also, it may be easier to understand if you say that after the malloc, arr points to an array, it does not "become" an array. Before the malloc, arr points to some random non-sense location.
When I set a pointer, how does the machine/compiler know if I want
just a pointer or an array?
If you set a pointer (e.g. arr = <something>) you are just changing where the pointer points. That may be what you want. If you don't want to change where arr points but you want to change the values stored in the memory where it is pointing, you have to do it one element at a time (e.g. with a for loop that iterates over each element in the array).
You are right that, for memory reached through a pointer, whether it is treated as a single object or as an array is only a convention. Here's a picture of what memory must look like when the getAllGrades() function returns:
| secret malloc() stuff |
+-----------------------+
| average value |
+-----------------------+ ----\
grades* points here ---> | grade at index 0 | | by convention
+-----------------------+ | this stuff is
| grade at index 1 | | called grades[]
+-----------------------+ |
| grade at index 2 | |
+-----------------------+ .
| ... | .
Now, indexing through a pointer works exactly like indexing an array. So, when the compiler sees *(grades - 1) it first subtracts 1 from the grades pointer. This is pointer arithmetic, so it knows to go one whole int block upwards, and points at the average value. Then it can operate on this value, for example to set the average with *(grades - 1) = getAverage().
An aside on array indexing: Array indexing gets compiled exactly like pointer arithmetic. For example, grades[2] gets compiled down to *(grades + 2) which does pointer arithmetic to move down 2 blocks to the memory address marked "grade at index 2" in my picture. This means you could change *(grades - 1) = getAverage() to grades[-1] = getAverage() and it would work exactly the same.
If you wanted to experiment you could do (-1)[grades] which compiles down to *(-1 + grades) and works as well, but that's stupid so don't do that :)