Why application of indirection to a two-dimensional array gives a pointer? - c

After reading some posts on this site, I realized that an array in C isn't just a constant pointer, as I originally thought, but is a distinct type in its own right; in most cases, though, an array "decays" to a constant pointer to its first element. This new information raised a question. Suppose we have a two-dimensional array A[10][10]. Why is the result of the expression *A a pointer to the first element of the array? I thought that in this expression A decays to a constant pointer to the first element of the array, A[0][0], and that applying indirection should then give the value of A[0][0]; in fact it still gives the address of the first element of the array. Clearly something is wrong with my logic or with my understanding of arrays and pointers, so where do I go wrong?
Thanks in advance.

The first element of A is A[0], not A[0][0].
This is because A is an array of ten things. C does not have two-dimensional arrays as a primary type. An array with multiple dimensions is derived or constructed as multiple layers of arrays. To the compiler, the resulting type is still just an array, whose elements happen to be further arrays.
Thus, in *A:
A is converted to a pointer to its first element. That pointer is &A[0], so *A becomes *&A[0].
* and & cancel, so the expression becomes A[0].
A[0] is an array of ten elements, so it is converted to a pointer to its first element, &A[0][0].

*A, or A[0], is itself an array of 10 elements, and in most expressions an array is represented by a pointer to its first element. However, A[10][10] (let's say an array of ints) is effectively a block of memory holding 100 ints: the 10 of the first row followed by the 10 of the second row, and so on. If the expression *A or A[0] yielded an int instead of a pointer to that row, it would be impossible to use the expression A[0][0], right?
However, because such a multidimensional array is a single block of memory, it's also possible to cast it to a pointer and then access it with an expression of this kind:
((int *)A)[iRow * 10 + iCol];
Which is equivalent to the expression :
A[iRow][iCol];
But this is only possible for a 2D array declared this way:
#include <stdio.h>

int main(void)
{
    int A[10][10] = { 0 };   /* one contiguous block of 100 ints */
    A[9][9] = 9999;
    printf("==> %d\n", ((int *)A)[9 * 10 + 9]); //==> 9999
    return 0;
}
It is not possible if the memory may consist of separate blocks of bytes (typically requiring several calls to malloc), as with declarations of this kind:
int * A[10]; // or
int ** A;

A decays to a constant pointer to the first element of the array
A[0][0]
No, it does not. Why?
The C standard specifies that pointer[integer] is equivalent to *(pointer + integer), so *A is equivalent to *(A + 0), which is A[0]. A[0] does not give you the element A[0][0]; it gives the one-dimensional array that is the first row, which decays to a pointer to the first element of that row.

Related

what does a[0] indicate(mean) in multidimensional array?

I've just started studying the C language with a book and am getting a bit confused by the part where it discusses pointers and arrays. If there is a multidimensional array (I'll treat it as two-dimensional to be specific) called a[NUM_ROWS][NUM_COLS], what does a[0] mean?
The part I was studying had a part concerning "processing the rows of a multidimensional array" and it had example where
p = &a[i][0] ;
could be written as
p = a[i];
and the book said a[i] is a pointer to the first element in row i.
Then there was a part about "using the name of a multidimensional array as a pointer" where in the case of int a[NUM_ROWS][NUM_COLS], the array name a is not a pointer to a[0][0] but a pointer to a[0].
Does a[0] have the same meaning as the a[i] in the first part? I am a bit confused because, in the part about "using the name of a multidimensional array as a pointer", the book says the array name a is a pointer to an integer array of length NUM_COLS (and a has type int (*)[NUM_COLS]).
I was wondering whether a[0] indicates the integer array of length NUM_COLS or a pointer to the first element in row 0. (Or is it the same thing? Maybe I am confused because I am new to the concept.)
P.S. the book is chapter 12.4 of C programming(KNK)
In general, the name of an array decays to a pointer to its first element. A multidimensional array is basically just an array of arrays, so when you have int a[NUM_ROW][NUM_COL], a[i] is the "name" of the i'th row.
So by the above rule, a[i] decays to a pointer to the first element of that row, which is a[i][0]. To create a pointer we put & before the expression, so that's &a[i][0].
And a decays to a pointer to the first element of the 2-dimensional array. Each element of the main array is a row, not an individual integer, so a is equivalent to &a[0], not &a[0][0].
The memory locations of a[0] and a[0][0] are the same; the difference is in the type of the expression. The type of a[0][0] is int, but the type of a[0] is int[NUM_COL], which will decay to int * in many contexts. This is easiest to see by using the sizeof operator (its result has type size_t, so it is printed with %zu):
printf("size of a = %zu, size of a[0] = %zu, size of a[0][0] = %zu\n", sizeof a, sizeof a[0], sizeof a[0][0]);
If NUM_ROW = 5 and NUM_COL = 10, this will probably print:
size of a = 200, size of a[0] = 40, size of a[0][0] = 4
Let's get this out of the way: Arrays and pointers are not one and the same. Array type is a different type. As an example, if you have a pointer int* ptr, ++ptr is perfectly valid (though it might not point to something valid), but if you have an array like int a[3], you may not increment it. But one constraint arrays have is, you may not pass arrays to functions and functions may not return array type. But what happens when you try? What happens is your array is implicitly converted to a pointer to its first element. That is where the confusion comes from: Arrays are converted to a pointer to their first element when you need a pointer pointing them. Therefore, ptr = a would mean ptr is now pointing to the first element of a.
Now let's assume we have this:
int arr[3][3] = {{0, 1, 2}, {3, 4, 5}, {6, 7, 8}};
What is this exactly? It is an array of arrays. arr[i] refers to one of the inner arrays, though it usually decays to a pointer.

Printf and Array

I was asked this question as a class exercise:
int A[] = {1,3,5,7,9,0,2,4,6};
printf("%d\n", *(A+A[1]-*A));
I couldn't figure it out on paper, so went ahead to compiling a simple program and tested it and found that printf("%d",*A) always gives me 1 for the output.
But I still do not understand why this is the case, hence it would be great if someone can explain this.
A is treated like a pointer to the first element of array of integers.
A[1] is the value of the second element of that array (the one at index 1, since indexes are 0-based), which is 3.
*A is the value to which A points, which is the zeroth element of the array, so 1.
So
A[1] - *A == 3 - 1 == 2
Now we have
*(A + 2)
That's where pointer arithmetic kicks in. Since A is a pointer to integer, A+2 points to the second (0-based) item in that array and *(A+2) gets its value.
So answer is 5.
Also please note for future reference that pointer to an integer and array of integers are somewhat different things in C, but for the purposes of this discussion they are the same thing.
Break it down into its constituent parts:
A by itself is the memory address of the array, which is also equivalent to &A[0], the memory address of the first element of the array.
A[1] is the value stored in the second element of the array, which is 3.
*A dereferences the memory address of the array, which is equivalent to A[0], the value stored in the first element of the array, which is 1.
So, do some substitutions:
*(A+A[1]-*A)
= *(A+(A[1])-(A[0]))
= *(A+3-1)
= *(A+2)
The notation *(Array+index) is the same as the notation Array[index]. Under the hood, they both take the starting address of the array, increment it by the number of bytes of the array element type (in this case, int) multiplied by the index, and then dereference the resulting address. So *(A+2) is the same as A[2], which is 5.
Arrays used in expressions are automatically converted into pointers pointing at the first elements of the arrays except for some exceptions such as operands of sizeof or unary & operators.
E1[E2] is defined to be equivalent to *((E1) + (E2))
The + and - operators applied to a pointer move the pointer forward and backward.
In this case, *A is equivalent to *(A + 0), which is equivalent to A[0] and it will give you the first element of the array.
The expression *(A+A[1]-*A) will
Get the pointer to the first element, which points at 1, via A
Move the pointer A[1] (3) elements ahead via +A[1], so the pointer now points at 7
Move the pointer to *A (1) element before what is pointed via -*A, so the pointer now points at 5
Dereference the pointer via the unary * operator, so the expression is evaluated to 5
When used in an expression, an array variable in C is converted to a pointer to the initial memory location of the array. So if you dereference the array, you will always get the value at the first position.
If you add 1 to that pointer, as in *(A+1), you will get the second position.
You can get any position from the array using the same method:
*(A) is the first position
*(A+1) is the second position
*(A+2) is the third position
and so on...
If you declare the int array as int* A, allocate the memory, and assign the values yourself, it is usually easier to visualize how this works.

C: if a[i] is equivalent to *(a+i). What is the equivalent of a[j][i]?

I'm very confused about this question in C.
if a[i] is equivalent to *(a+i). What is the equivalent of a[j][i]?
I know the (a+i) is incrementing the memory address of the first element of the array by the value of i and then using the * operator to dereference that address to obtain the value. However, I am confused about multidimensional arrays. In memory, the values are stored just like a single dimensional array but I don't understand how I can increment the memory address by using the variable i or j like in the single dimensional array example.
for some reason printing *a in single dimensional array will print the first element of the array whereas *a in a multidimensional array will print a random number. Why is this so?
Any help is greatly appreciated.
if a[i] is equivalent to *(a+i). What is the equivalent of a[j][i]?
a[j][i]
is equivalent to
*(*(a+ j) + i)
To see why, start from what you already know:
a[j] = *(a+j) -------------------------> res1
Now apply the same rule to the second subscript:
a[j][i] = *(a[j]+i) --------------------------> res2
Substituting res1 into res2 gives:
a[j][i] = *(*(a+j) + i) ----------------------> res3
The literal answer to the question is simple: a[i] is defined to be always identical to *(a+i) by the standard, and therefore, a[j][i] is guaranteed to be always identical to *(*(a+j)+i). However, that by itself does not help us to understand what is going on; it just transforms one compound expression to another.
a[j][i] (and by extension, *(*(a+j)+i)) does very different things depending on the type of a. This is because, depending on the types, there may be implicit array-to-pointer conversions that are not apparent.
In C, a value of array type T[x] is implicitly converted to an rvalue of pointer type T* in many contexts, some of which include the left side of the subscript operator, as well as an operand in addition. So if you do either a[i] or *(a+i), and a is an expression of array type, in both cases it is converted to a pointer to its first element (like &a[0]) and it's the pointer that participates in the operation. Thus you can see how *(a+i) makes sense.
If a had type T[x][y], it would be a "true" multidimensional array, which is a C array whose elements are themselves C arrays (of a certain compile-time-constant size). In this case, if you consider *(*(a+j)+i), what is happening is 1) a is converted to a pointer to its first element (which is a pointer to an array, of type T(*)[y]), 2) that pointer is incremented and dereferenced, producing a value of array type (the jth subarray of a), 3) that array is then converted to a pointer to its first element (a pointer of type T*), which is then 4) incremented and dereferenced. This finally produces the ith element of the jth element of a, what you usually think of as a[j][i].
However, a could also have type, say, T**. This is usually used to implement "fake" multidimensional arrays, which is an array of pointers, which then in turn point to the first element of some array. This allows you to have "rows" that can have different sizes (thus the multidimensional array need not be "rectangular"), and sizes not fixed at compile time. The "rows", as well as the main pointer array, do not have to be stored contiguously. In this case, if you consider *(*(a+j)+i), what is happening is 1) a is incremented and dereferenced, producing a value of pointer type (the jth element of a), 2) that pointer is then incremented and dereferenced. This finally produces the ith element of the array referred to by the jth element of the main pointer array. Note that in this case there are no implicit array-to-pointer conversions.
Generally, arrays follow pointer concepts. For a 1D array, one dereference is enough to get a value. For a multidimensional array, we have to dereference twice for a 2D array, three times for a 3D array, and so on.
For example, a[j][i] = *(*(a+j)+i)
A multidimensional array in C is contiguous. The following:
int a[4][5];
consists of 4 int[5]s laid out next to each other in memory.
An array of pointers:
int *a[4];
is jagged. Each pointer can point to (the first element of) a separate array of a different length.
a[i][j] is equivalent to *(*(a+i)+j). See the C11 standard, section 6.5.2.1:
The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2)))
Thus, a[i][j] is equivalent to (*(a+i))[j], which is equivalent to *(*(a+i)+j).
This equivalence exists because in most contexts, expressions of array type decay to pointers to their first element (C11 standard, 6.3.2.1). a[i][j] is interpreted as the following:
a is an array of arrays, so it decays to a pointer to a[0], the first subarray.
a+i is a pointer to the ith subarray of a.
a[i] is equivalent to *(a+i), dereferencing a pointer to the ith subarray of a. Since this is an expression of array type, it decays to a pointer to a[i][0].
a[i][j] is equivalent to *(*(a+i)+j), dereferencing a pointer to the jth element of the ith subarray of a.
Note that pointers to arrays are different from pointers to their first element. a+i is a pointer to an array; it is not an expression of array type, and it does not decay, whether to a pointer to a pointer or to any other type.
and for some reason printing *a in single dimensional array will print the first element of the array whereas *a in a multidimensional array will print a random number. Why is this so?
For a two-dimensional array you need to dereference twice:
printf("\nvalue : %d",**a);
to print the first element of the array.

Doesn't a 2D array decay to pointer to pointer

Up till now I was pretty much sure that
int arr[4][5];
Then arr will decay to pointer to pointer.
But this link proves me wrong.
I am not sure how I came to think of arr as a pointer to pointer, but it seemed pretty obvious to me: arr[i] would be a pointer, and hence arr should be a pointer to pointer.
Am I missing something?
Yep you are missing out on a lot :)
To avoid another wall of text I'll link to an answer I wrote earlier today explaining multi-dimensional arrays.
With that in mind, arr is a 1-D array with 4 elements, each of which is an array of 5 ints.
When used in an expression other than &arr or sizeof arr, this decays to &arr[0]. But what is &arr[0]? It is a pointer, and importantly, an rvalue.
Since &arr[0] is a pointer, it can't decay further. (Arrays decay, pointers don't). Furthermore, it's an rvalue. Even if it could decay into a pointer, where would that pointer point? You can't point at an rvalue. (What is &(x+y) ? )
Another way of looking at it is to remember that int arr[4][5]; is a contiguous block of 20 ints, grouped into 4 lots of 5 within the compiler's mind, but with no special marking in memory at runtime.
If there were "double decay" then what would the int ** point to? It must point to an int * by definition. But where in memory is that int * ? There are certainly not a bunch of pointers hanging around in memory just in case this situation occurs.
A simple rule is:
A reference to an object of type array-of-T which appears in an expression decays (with three exceptions) into a pointer to its first element; the type of the resultant pointer is pointer-to-T.
When you deal with a 1D array, the array name converts to a pointer to its first element when passed to a function.
A 2D array can be thought of as an array of arrays. In the case of int arr[4][5];, you can think of arr as the name of an array of rows; when passed to a function, it converts to a pointer to the first element of arr. Since the first element of arr is itself an array, that pointer points to row 0 and has type pointer to array of 5 ints, and arr + i points to the ith row.
A 2-dim array can be implemented as an array of pointers (which, in a sense, is a pointer to a pointer: a pointer to the first element, itself a pointer, in the array). When you specify the first index (i.e., arr[x]), it indexes into the array of pointers and gives the pointer to the x-th row. The second index (i.e., arr[x][y]) gives the y-th int in that row.
In the case of a statically declared array (as in your example), the actual storage is allocated as a single block: in your example, a single contiguous block of 20 integers (80 bytes, on most platforms). In this case, there IS no array of pointers; the compiler just does the appropriate arithmetic to address the correct element of the array. Specifically, arr[x][y] addresses the same element as *((int *)arr + x * 5 + y). This automatically adjusted arithmetic relies on the compiler knowing the row length: if you pass the array to a function as a plain int *, the dimension information is lost (just as the length is lost for a 1-dim array), and you have to do the array-indexing calculations explicitly.
To avoid this, do NOT declare the array as static, but as an array of pointers, with each pointer pointed to a 1-dim array, such as in this example:
int arr0[5];
int arr1[5];
int arr2[5];
int arr3[5];
int* arr[4] = { arr0, arr1, arr2, arr3 };
Then, when you pass arr to a function, you can address it as a 2-dim array within the function as well.

Problems with 2 D arrays

I wrote the following code in C:
#include<stdio.h>
int main()
{
int a[10][10]={1};
//------------------------
printf("%d\n",&a);
printf("%d\n",a);
printf("%d\n",*a);
//-------------------------
printf("%d",**a);
return 0;
}
With the above 3 printf statements I got the same value. On my machine it's 2686384. But with the last statement I got 1.
Isn't it something going wrong? These statements mean:
The address of a is 2686384
The value stored in a is 2686384
the value that is stored at address of variable pointed by a (i.e. at 2686384) is 2686384.
This means a must be something like a variable pointing towards itself...
Then why is the output of *(*a) 1? Why isn't it evaluated as *(*a)=*(2686384)=2686384?
#include<stdio.h>
int main()
{
// a[row][col]
int a[2][2]={ {9, 2}, {3, 4} };
// In C, a multidimensional array is an array of arrays stored in one
// contiguous block; the syntax lets us access it with two subscripts.
//------------------------
printf("&a = %p\n", (void *)&a);
printf("a = %p\n", (void *)a);
printf("*a = %p\n", (void *)*a);
//-------------------------
// Something that may be confusing here:
// since we access array values through 2 dimensions,
// we need 2 stars (asterisks), right? Right.
// So, to be consistent with this approach,
// even if we are asking for the first value,
// we have to use the 2-dimensional (we have a 2D array)
// access syntax: 2 stars.
printf("**a = %d\n", **a ); // this says a[0][0] or *(*(a+0)+0)
printf("**(a+1) = %d\n", **(a+1) ); // a[1][0] or *(*(a+1)+0)
printf("*(*(a+1)+1) = %d\n", *(*(a+1)+1) ); // a[1][1] or *(*(a+1)+1)
// a[1] is the second row (an int[2]);
// &a[1] is a pointer to that row, so it is printed with %p.
printf("&a[1] = %p\n", (void *)&a[1]);
// When we add an int to a pointer (e.g. a+1),
// we are really adding the length of the type
// the pointer points to; here we go to the next element of the array.
// In C, you can manipulate array variables practically like pointers.
// Example: littleFunction(int arr[]) accepts pointers to int, and vice versa,
// littleFunction(int *arr) accepts an array of int.
int b = 8;
printf("b = %d\n", *&b);
return 0;
}
An expression consisting of the name of an array can decay to a pointer to the first element of the array. So even though a has type int[10][10], it can decay to int(*)[10].
Now, this decay happens in the expression *a. Consequently the expression has type int[10]. Repeating the same logic, this again decays to int*, and so **a is an int, which is moreover the first element of the first element of the array a, i.e. 1.
The other three print statements print out the address of, respectively, the array, the first element of the array, and the first element of the first element of the array (which are of course all the same address, just different types).
First, a word on arrays...
Except when it is the operand of the sizeof, _Alignof, or unary & operators, or is a string literal being used to initialize another array in a declaration, an expression of type "N-element array of T" will be converted ("decay") to an expression of type "pointer to T", and the value of the expression will be the address of the first element in the array.
The expression &a has type "pointer to 10-element array of 10-element array of int", or int (*)[10][10]. The expression a has type "10-element array of 10-element array of int", which by the rule above decays to "pointer to 10-element array of int", or int (*)[10]. And finally, the expression *a (which is equivalent to a[0]) has type "10-element array of int", which again by the rule above decays to "pointer to int".
All three expressions have the same value because the address of an array and the address of its first element are the same: &a[0][0] == a[0] == *a == a == &a. However, the types of the expressions are different, which matters when doing pointer arithmetic. For example, if I have the following declarations:
int (*ap0)[10][10] = &a;
int (*ap1)[10] = a;
int *ip = *a;
then ap0++ would advance ap0 to point to the next 10x10 array of int, ap1++ would advance ap1 to point to the next 10-element array of int (or a[1]), and ip++ would advance ip to point to the next int (&a[0][1]).
**a is equivalent to *a[0], which is equivalent to a[0][0]. This is the first element of the first row of a; it has type int and the value 1 (note that only a[0][0] is initialized to 1; all remaining elements are initialized to 0).
Note that you should use %p to print out pointer values:
printf("&a = %p\n", &a);
printf(" a = %p\n", a);
printf("*a = %p\n", *a);
First of all, if you want to print out pointer values, use %p - if you're on a 64 bit machine int almost certainly is smaller than a pointer.
**a dereferences twice what is effectively an int (*)[10] (a pointer to an array of 10 ints, not an int**), so you end up with the first element of the first sub-array: 1.
If you define a as T a[10] (where T is some typedef), then a simple unadorned a means the address of the start of the array, the same as &a[0]. They both have type T*.
&a is also the address of the start of the array, but it has type T (*)[10] (pointer to array of 10 T), not T**.
Things become trickier in the presence of multi-dimensional arrays. To see what is happening, it is easier to break things down into smaller chunks using typedefs. So, you effectively wrote
typedef int array10[10];
array10 a[10];
[Exercise to reader: What is the type of a? (it is not int**)]
**a correctly evaluates to the first int in the array a.
From C99 Std
Consider the array object defined by the declaration
int x[3][5];
Here x is a 3 × 5 array of ints; more precisely, x is an array of three element objects, each of which is an array of five ints. In the expression x[i], which is equivalent to (*((x)+(i))), x is first converted to a pointer to the initial array of five ints. Then i is adjusted according to the type of x, which conceptually entails multiplying i by the size of the object to which the pointer points, namely an array of five int objects. The results are added and indirection is applied to yield an array of five ints. When used in the expression x[i][j], that array is in turn converted to a pointer to the first of the ints, so x[i][j] yields an int.
so,
The initial array is x[0], whose first element is x[0][0].
x, &x and *x all evaluate to the address of x[0][0], each with a different type.
No, there's nothing wrong with your code. It's just the way you are thinking about it... The more I think about it, the more I realize how hard this is to explain, so before I go into this, keep these points in mind:
arrays are not pointers, don't think of them that way, they are different types.
the [] is an operator. It's a shift and dereference operator, so when I write printf("%d",array[3]); I am shifting and dereferencing
So an array (lets think about 1 dimension to start) is somewhere in memory:
int arr[10] = {1};
//Some where in memory---> 0x80001f23
[1][0][0][0][0][0][0][0][0][0]
So if I say:
*arr; //this gives the value 1
Why? because it's the same as arr[0] it gives us the value at the address which is the start of the array. This implies that:
arr; // this is the address of the start of the array
So what does this give us?
&arr; //this will give us the address of the array.
//which IS the address of the start of the array
//this is where arrays and pointers really show some difference
So arr == &arr;. The "job" of an array is to hold data, the array will not "point" to anything else, because it's holding its own data. Period. A pointer on the other hand has the job to point to something else:
int *z; //the pointer holds the address of someone else's values
z = arr; //the pointer holds the address of the array
z != &z; //the pointer's address is a unique value telling us where the pointer resides
//the pointer's value is the address of the array
EDIT:
One more way to think about this:
int b; //this is integer type
&b; //this is the address of the int b, right?
int c[10]; //this is the array of ints (the size 10 is just for illustration)
&c; //this would be the address of the array, right?
So that's pretty understandable how about this:
*c; //that's the first element in the array
What does that line of code tell you? If I dereference c, then I get an int. That means just plain c is an address. Since it's the start of the array, it's the address of the array, thus:
c == &c;
