I was going through the passage pertaining to the access of individual elements of a 2D array using pointers.
It suggests the following mechanism to access the ith row's jth element of the 2D array arr[5][5]:
*(*(arr+i)+j)
From my cursory understanding of pointers, I am given to understand that the name of the array yields the address of the 0th element of the 0th row, and that any integral increment to the array will yield the base address of the next row. All's fine till this juncture, I suppose.
However, what I fail to understand is the relevance of the indirection (*) operator within the following snippet:
*(arr+i)
How is the indirection operator in this case relevant? Since the name of the array itself yields the base address of the 0th row, adding any integral number to it entails it to point to the base element of the next row. In that case, the following snippet yields the address of the ith row:
(arr+i)
And the addition of j warrants the pointer to point to the jth element of the said row.
However, in the following snippet:
*(arr+i)
Wouldn't the addition of the indirection operator cause it to yield the ith element of the row, as opposed to the address of the base element of the ith row?
Should not the following be the code to access the ith row's jth element?
*((arr+i)+j)
In the aforementioned case, incrementing arr i times will entail the code fragment (arr+i) to point to the base address of the ith row, and then the addition of j will entail the address of the ith row's jth element, and then the indirection operator (*) would yield the element that the specific address holds, wouldn't it?
Is my reasoning satisfactory?
From my cursory understanding of pointers, I am given to understand
that the name of the array yields the address of the 0th element of
the 0th row, and that any integral increment to the array will yield
the base address of the next row. All's fine till this juncture, I
suppose.
It is not exactly how you think.
An array designator used in an expression is implicitly converted (with rare exceptions) to a pointer to its first element.
If you have a two-dimensional array like for example
T arr[M][N];
(where T is some type specifier) then the array designator arr is converted to a pointer of the type T ( * )[N] that points to the first element of the array arr[0] having the type T[N].
Of course the value of the pointer is equal to the value of the address of the first element arr[0][0] of the type T.
So let's consider the expression
*(*(arr+i)+j)
In this expression the array designator arr is converted to a pointer to the first element of the array. That is, the expression arr yields the value of the expression &arr[0]. The expression arr + i yields the value &arr[i]. Dereferencing that pointer, as in *( arr + i ), gives the one-dimensional array arr[i]. Used in the expression *( arr + i ) + j, that array is in turn converted to a pointer of type T * to its first element, equivalent to &arr[i][0]. Adding j then gives a pointer to the element arr[i][j], that is, &arr[i][j]. Dereferencing this pointer, as in
*(*(arr+i)+j)
yields the element arr[i][j].
The ith/jth element should be accessed using
arr[i][j]
but if for some reason you had to do it the hard way it would be
*((type *)arr + i * 5 + j)
where the 5 is the number of columns in the array. The problem with
*((arr+i)+j)
is arr is of the wrong type and things break down. In theory, the following should work but I never tried anything like it:
*((type *)(arr+i)+j)
But from here we are in a position to discuss
*(*(arr+i)+j)
again; I have never tried this, only the rectangular form at the top. But we should be able to understand it. The type of arr is type[5][5], which converts to a pointer to a row, type (*)[5]; thus the result of *(arr + i) is the ith row, of type type[5], which in turn converts to type *; thus we should expect *(*(arr+i)+j) to access the jth element of the ith row. Without the inner dereference we would end up with the (i+j)th row instead.
Note that arr must be declared correctly or it's going to interpret it as pointer to pointer rather than pointer to array and do the wrong thing.
*((arr+i)+j)
The parentheses around (arr+i) are mathematically unnecessary, but they are not enough to switch dimensions from rows to elements. i is for rows, j for elements, and it is a good idea to group them, but it takes *(...).
With *(arr + i + j) it seems obvious that i and j are on the same level.
I was asked what the output of the following code is:
int a[5] = { 1, 3, 5, 7, 9 };
int *p = (int *)(&a + 1);
printf("%d, %d", *(a + 1), *(p - 1));
1) 3, 9
2) Error
3) 3, 1
4) 2, 1
The answer is No. 1.
It is easy to get *(a+1) is 3.
But how about int *p = (int *)(&a + 1); and *(p - 1) ?
The answer to this could be either "1) 3,9" or "2) Error" (or more specifically undefined behavior) depending on how you read the C standard.
First, let's take this:
&a + 1
The & operator takes the address of the array a giving us an expression of type int(*)[5] i.e. a pointer to an array of int of size 5. Adding 1 to this treats the pointer as pointing to the first element of an array of int [5], with the resulting pointer pointing to just after a.
Also, even though &a points to a singular object (in this case an array of type int [5]) we can still add 1 to this address. This is valid because 1) a pointer to a singular object can be treated as a pointer to the first element of an array of size 1, and 2) a pointer may point to one element past the end of an array.
Section 6.5.6p7 of the C standard states the following regarding treating a pointer to an object as a pointer to the first element of an array of size 1:
For the purposes of these operators, a pointer to an object
that is not an element of an array behaves the same as a pointer
to the first element of an array of length one with the type of the
object as its element type.
And section 6.5.6p8 says the following regarding allowing a pointer to point to just past the end of an array:
When an expression that has integer type is added to or
subtracted from a pointer, the result has the type of the pointer
operand. If the pointer operand points to an element of an array
object, and the array is large enough, the result points to an element
offset from the original element such that the difference of the
subscripts of the resulting and original array elements equals the
integer expression. In other words, if the expression P points to the
i-th element of an array object, the expressions (P)+N
(equivalently, N+(P)) and (P)-N (where N has the value n) point to,
respectively, the i+n-th and i−n-th elements of the array object,
provided they exist. Moreover, if the expression P points to the
last element of an array object, the expression (P)+1 points one past
the last element of the array object, and if the expression Q
points one past the last element of an array object, the
expression (Q)-1 points to the last element of the array
object. If both the pointer operand and the result point to
elements of the same array object, or one past the last
element of the array object, the evaluation shall not produce an
overflow; otherwise, the behavior is undefined. If the result points
one past the last element of the array object, it shall not be used as
the operand of a unary * operator that is evaluated.
Now comes the questionable part, which is the cast:
(int *)(&a + 1)
This converts the pointer of type int(*)[5] to type int *. The intent here is to change the pointer which points to the end of the 1-element array of int [5] to the end of the 5-element array of int.
However the C standard isn't clear on whether this conversion and the subsequent operation on the result is allowed. It does allow conversion from one object type to another and back, assuming the pointer is properly aligned. While the alignment shouldn't be an issue, using this pointer is iffy.
So this pointer is assigned to p:
int *p = (int *)(&a + 1);
Which is then used as follows:
*(p - 1)
If we assume that p validly points to one element past the end of the array a, subtracting 1 from it results in a pointer to the last element of the array. The * operator then dereferences this pointer to the last element, yielding the value 9.
So if we assume that (int *)(&a + 1) results in a valid pointer, then the answer is 1) 3,9 otherwise the answer is 2) Error.
In the line
int *p = (int *)(&a + 1);
note that &a is being written, not a. This is important.
If simply a had been written, then the array would have decayed to a pointer to the first element, i.e. to &a[0]. However, since the expression &a was used instead, the result of this expression has the same value as if a or &a[0] had been used, but the type is different: The type is a pointer to an array of 5 int elements, instead of a pointer to a single int element.
According to the rules on pointer arithmetic, incrementing a pointer by 1 will increase the memory address by the size of the object that it is pointing to. Since the pointer is not pointing to a single element, but to an array of 5 elements, the memory address will be incremented by 5 * sizeof(int). Therefore, after incrementing the pointer, the value of (but not type of) the pointer will be equivalent to &a[5], i.e. one past the end of the array.
After casting this pointer to int * and assigning the result to p, the expression p is fully equivalent to &a[5] (both in value and in type).
Therefore, the expression *(p - 1) is equivalent to *(&a[5] - 1), which is equivalent to *(&a[4]), or simply a[4].
This:
&a + 1;
is taking the address of a, an array, and adding 1, which adds the size of one a, i.e. 5 integers. Then the indexing "backs down", one integer, ending up in the final element of a.
Normally whenever arrays are used in expressions, they "decay" into a pointer to the first element. There are a few exceptions to this rule and one such exception is the & operator.
&a therefore yields a pointer to the array of type int (*)[5]. Then &a + 1 is pointer arithmetic on such a type, meaning the pointer address is increased by the size of one int [5]. We end up pointing just beyond the array, but C actually allows us to do that as long as we don't de-reference that location.
Then the pointer is forced a type conversion to (int *) which we can do too - C allows pretty much any manner of wild pointer conversions as long as we don't de-reference or cause misalignment etc.
p - 1 does pointer arithmetic on type int and the actual type of data in the array is also int, so we are allowed to de-reference that location. We end up at the last item of the array.
After reading some posts on this site, I realized that an array in C isn't just a constant pointer as I originally thought, but is itself a distinct type, although in most cases an array "decays" to a constant pointer to the first element of the array. Because of this new information, a question arose in my mind. Suppose we have a two-dimensional array A[10][10]. Why is the result of the expression *A a pointer to the first element of the array? I thought that in this expression, A decays to a constant pointer to the first element of the array, A[0][0], and then applying the indirection should give us the value of A[0][0], but in fact it still gives us the address of the first element of the array. Certainly, something is wrong with my logic or my understanding of arrays or pointers, so where do I get it wrong?
Thanks in advance.
The first element of A is A[0], not A[0][0].
This is because A is an array of ten things. C does not have two-dimensional arrays as a primary type. An array with multiple dimensions is derived or constructed as multiple layers of arrays. To the compiler, the resulting type is still just an array, whose elements happen to be further arrays.
Thus, in *A:
A is converted to a pointer to its first element. That pointer is &A[0], so *A becomes *&A[0].
* and & cancel, so the expression becomes A[0].
A[0] is an array of ten elements, so it is converted to a pointer to its first element, &A[0][0].
*A, or A[0], is itself an array of 10 elements, and an array used in an expression is normally converted to a pointer to its first element. However, A[10][10] (say, an array of ints) is effectively a block of memory holding 100 ints: the 10 of the first row, followed by the 10 of the second row, and so on. If the expression *A or A[0] yielded an int instead of a pointer to that row, it would be impossible to use the expression A[0][0], right?
However, because such multidimensional array is a single block of memory, it's also possible to cast it to a pointer and then access it with an expression of this kind :
((int *)A)[iRow * 10 + iCol];
Which is equivalent to the expression :
A[iRow][iCol];
But this is only possible for a 2D array declared this way:
#include <stdio.h>

int main(void)
{
    int A[10][10] = { 0 };
    A[9][9] = 9999;
    /* View the contiguous block as a flat run of 100 ints. */
    printf("==> %d\n", ((int *)A)[9 * 10 + 9]); //==> 9999
    return 0;
}
It is not if the memory is potentially made of separate blocks of bytes (probably requiring several calls to malloc) as with this kind of expressions :
int * A[10]; // or
int ** A;
A decays to a constant pointer to the first element of the array
A[0][0]
No, it does not. Why?
The C standard specifies that pointer[integer] is identical to *(pointer + integer), so *A is equivalent to *(A + 0), which is A[0]. A[0] will not give you the element A[0][0], only the one-dimensional array, which decays to a pointer to the first element of the first row of this array.
I was asked this question as a class exercise:
int A[] = {1,3,5,7,9,0,2,4,6};
printf("%d\n", *(A+A[1]-*A));
I couldn't figure it out on paper, so went ahead to compiling a simple program and tested it and found that printf("%d",*A) always gives me 1 for the output.
But I still do not understand why this is the case, hence it would be great if someone can explain this.
A is treated like a pointer to the first element of array of integers.
A[1] is the value of the second element of that array, which is 3 (indexes are 0-based, so index 1 is the second element).
*A is the value to which A points, which is the zeroth element of the array, so 1.
So
A[1] - *A == 3 - 1 == 2
Now we have
*(A + 2)
That's where pointer arithmetic kicks in. Since A is a pointer to integer, A+2 points to the second (0-based) item in that array and *(A+2) gets its value.
So answer is 5.
Also please note for future reference that pointer to an integer and array of integers are somewhat different things in C, but for the purposes of this discussion they are the same thing.
Break it down into its constituent parts:
A by itself, used in an expression, is converted to the memory address of the first element of the array, i.e. &A[0].
A[1] is the value stored in the second element of the array, which is 3.
*A dereferences that address, which is equivalent to A[0], the value stored in the first element of the array, which is 1.
So, do some substitutions:
*(A+A[1]-*A)
= *(A+(A[1])-(A[0]))
= *(A+3-1)
= *(A+2)
The notation *(Array+index) is the same as the notation Array[index]. Under the hood, they both take the starting address of the array, increment it by the number of bytes of the array element type (in this case, int) multiplied by the index, and then dereference the resulting address. So *(A+2) is the same as A[2], which is 5.
Arrays used in expressions are automatically converted into pointers pointing at the first elements of the arrays except for some exceptions such as operands of sizeof or unary & operators.
E1[E2] is defined to be equivalent to *((E1) + (E2))
The + and - operators applied to a pointer move it forward and backward by that many elements.
In this case, *A is equivalent to *(A + 0), which is equivalent to A[0] and it will give you the first element of the array.
The expression *(A+A[1]-*A) will
Get the pointer to the first element, which points at 1, via A
Move the pointer to A[1] (3) elements ahead via +A[1], so the pointer now points at 7
Move the pointer to *A (1) element before what is pointed via -*A, so the pointer now points at 5
Dereference the pointer via the unary * operator, so the expression is evaluated to 5
An array name used in an expression in C is, in most contexts, converted to a pointer to the array's first element. So if you dereference the array, you will always get the value in the first position.
If you add 1 before dereferencing, as in *(A+1), you will get the second position.
You can get any position from the array using the same method:
*(A) is the first position
*(A+1) is the second position
*(A+2) is the third position
and so on...
If you declare the int array as int* A, allocate the memory, and assign the values yourself, it is usually easier to visualize how this works.
I'm very confused about this question in C.
if a[i] is equivalent to *(a+i). What is the equivalent of a[j][i]?
I know the (a+i) is incrementing the memory address of the first element of the array by the value of i and then using the * operator to dereference that address to obtain the value. However, I am confused about multidimensional arrays. In memory, the values are stored just like a single dimensional array but I don't understand how I can increment the memory address by using the variable i or j like in the single dimensional array example.
for some reason printing *a in single dimensional array will print the first element of the array whereas *a in a multidimensional array will print a random number. Why is this so?
Any help is greatly appreciated.
if a[i] is equivalent to *(a+i). What is the equivalent of a[j][i]?
a[j][i]
is similar to
*(*(a+ j) + i)
If you want to know why,
let's see.
You already know that
a[j]=*(a+j) -------------------------> res 1
Now
a[j][i] = *(a[j]+i); --------------------------> res2
After that replace the res1 in res2. So it become
a[j][i] = *(*(a+ j) + i) ----------------------> res3
The literal answer to the question is simple: a[i] is defined to be always identical to *(a+i) by the standard, and therefore, a[j][i] is guaranteed to be always identical to *(*(a+j)+i). However, that by itself does not help us to understand what is going on; it just transforms one compound expression to another.
a[j][i] (and by extension, *(*(a+j)+i)) does very different things depending on the type of a. This is because, depending on the types, there may be implicit array-to-pointer conversions that are not apparent.
In C, a value of array type T[x] is implicitly converted to an rvalue of pointer type T* in many contexts, some of which include the left side of the subscript operator, as well as an operand in addition. So if you do either a[i] or *(a+i), and a is an expression of array type, in both cases it is converted to a pointer to its first element (like &a[0]) and it's the pointer that participates in the operation. Thus you can see how *(a+i) makes sense.
If a had type T[x][y], it would be a "true" multidimensional array, which is a C array whose elements are themselves C arrays (of a certain compile-time-constant size). In this case, if you consider *(*(a+j)+i), what is happening is 1) a is converted to a pointer to its first element (which is a pointer to an array, of type T(*)[y]), 2) that pointer is incremented and dereferenced, producing a value of array type (the jth subarray of a), 3) that array is then converted to a pointer to its first element (a pointer of type T*), which is then 4) incremented and dereferenced. This finally produces the ith element of the jth element of a, what you usually think of as a[j][i].
However, a could also have type, say, T**. This is usually used to implement "fake" multidimensional arrays, which is an array of pointers, which then in turn point to the first element of some array. This allows you to have "rows" that can have different sizes (thus the multidimensional array need not be "rectangular"), and sizes not fixed at compile time. The "rows", as well as the main pointer array, do not have to be stored contiguously. In this case, if you consider *(*(a+j)+i), what is happening is 1) a is incremented and dereferenced, producing a value of pointer type (the jth element of a), 2) that pointer is then incremented and dereferenced. This finally produces the ith element of the array referred to by the jth element of the main pointer array. Note that in this case there are no implicit array-to-pointer conversions.
Generally, arrays follow pointer concepts. For a 1D array, one dereference is enough to get a value. But for a multidimensional array we have to dereference twice for a 2D array, three times for a 3D array, and so on.
So a[j][i]=*(*(a+j)+i)
A multidimensional array in C is contiguous. The following:
int a[4][5];
consists of 4 int[5]s laid out next to each other in memory.
An array of pointers:
int *a[4];
is jagged. Each pointer can point to (the first element of) a separate array of a different length.
a[i][j] is equivalent to *(*(a+i)+j). See the C11 standard, section 6.5.2.1:
The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2)))
Thus, a[i][j] is equivalent to (*(a+i))[j], which is equivalent to *(*(a+i)+j).
This equivalence exists because in most contexts, expressions of array type decay to pointers to their first element (C11 standard, 6.3.2.1). a[i][j] is interpreted as the following:
a is an array of arrays, so it decays to a pointer to a[0], the first subarray.
a+i is a pointer to the ith subarray of a.
a[i] is equivalent to *(a+i), dereferencing a pointer to the ith subarray of a. Since this is an expression of array type, it decays to a pointer to a[i][0].
a[i][j] is equivalent to *(*(a+i)+j), dereferencing a pointer to the jth element of the ith subarray of a.
Note that pointers to arrays are different from pointers to their first element. a+i is a pointer to an array; it is not an expression of array type, and it does not decay, whether to a pointer to a pointer or to any other type.
and for some reason printing *a in single dimensional array will print the first element of the array whereas *a in a multidimensional array will print a random number. Why is this so?
For a two-dimensional array you need to dereference twice:
printf("\nvalue : %d",**a);
to print the first element of the array.
I came across a pointer comparison program in C. What I didn't understand are these two statements.
j=&arr[4];
k=(arr+4);
The first statement stores the address of the fifth element, but this is the first time I have seen the syntax of the second statement. Can anybody explain the second statement to me?
Also, after executing the program, j and k are equal, so they are pointing to the same location.
k=(arr+4);
means k will point 4 elements past the start of arr, after arr has decayed into a pointer to index 0.
An array name decays to a pointer to its element at index 0, so adding 4 makes it point to the 5th element.
It's the infamous pointer arithmetic! The statement simply assigns the address of the element at the address pointed to by arr and an offset of 4 elements to the right. arr + 4 is pointing to the address of arr[4].
This is simply pointer arithmetic, mixed with C's indexing<->pointer dereference equivalence.
The former means that the expression arr + 4 causes arr (the name of an array) to decay into simply a pointer to the array's first element. In other words, arr == &arr[0] is true.
The latter is this equivalency, for any pointer a and integer i:
a[i] === *(a + i)
This means that the first expression, the assignment to j, can be read as j = &(*(arr + 4)), which makes it (pretty) clear that it's just taking the address of the element with index 4, just as the k line is doing.
This code uses a simple case of pointer arithmetic. It assigns the address of the array plus 4 elements (so the address of the 5th element) to the pointer k.
Every arr[4] expression is expanded to *(arr+4) by the compiler itself, so &arr[4] is the same as (arr+4).
These two are equivalent and can be used interchangeably.
Both are ways of getting a pointer value.
First, arr[x] yields the contents of element x+1 of the array, and you can get its address with the & operator.
Second is known as pointer arithmetic and yields the address held by the arr pointer plus x positions, i.e. the address of element x+1.
It's basic pointer arithmetic. k is a pointer, and arr is converted to a pointer to the first element of the array (a pointer to arr[0]). So by adding 4 to arr, you move the pointer on 4 elements. Therefore k=(arr+4) means k points to arr[4], which would be the fifth element, and the same as j.