I was asked this question as a class exercise:
int A[] = {1,3,5,7,9,0,2,4,6};
printf("%d\n", *(A+A[1]-*A));
I couldn't figure it out on paper, so I compiled a simple test program and found that printf("%d",*A) always gives me 1 for the output.
But I still do not understand why this is the case, hence it would be great if someone can explain this.
A is treated like a pointer to the first element of array of integers.
A[1] is the value of the element at index 1 of that array (the second element, since indexes are 0-based), which is 3.
*A is the value to which A points, which is the zeroth element of the array, so 1.
So
A[1] - *A == 3 - 1 == 2
Now we have
*(A + 2)
That's where pointer arithmetic kicks in. Since A is a pointer to integer, A+2 points to the second (0-based) item in that array and *(A+2) gets its value.
So answer is 5.
Also please note for future reference that pointer to an integer and array of integers are somewhat different things in C, but for the purposes of this discussion they are the same thing.
Break it down into its constituent parts:
A by itself is the memory address of the array, which is also equivalent to &A[0], the memory address of the first element of the array.
A[1] is the value stored in the second element of the array, which is 3.
*A dereferences the memory address of the array, which is equivalent to A[0], the value stored in the first element of the array, which is 1.
So, do some substitutions:
*(A+A[1]-*A)
= *(A+(A[1])-(A[0]))
= *(A+3-1)
= *(A+2)
The notation *(Array+index) is the same as the notation Array[index]. Under the hood, they both take the starting address of the array, increment it by the number of bytes of the array element type (in this case, int) multiplied by the index, and then dereference the resulting address. So *(A+2) is the same as A[2], which is 5.
Arrays used in expressions are automatically converted into pointers pointing at the first elements of the arrays except for some exceptions such as operands of sizeof or unary & operators.
E1[E2] is defined to be equivalent to *((E1) + (E2))
The + and - operators applied to a pointer move the pointer forward and backward.
In this case, *A is equivalent to *(A + 0), which is equivalent to A[0] and it will give you the first element of the array.
The expression *(A+A[1]-*A) will
Get the pointer to the first element, which points at 1, via A
Move the pointer A[1] (3) elements ahead via +A[1], so the pointer now points at 7
Move the pointer *A (1) element back via -*A, so the pointer now points at 5
Dereference the pointer via the unary * operator, so the expression is evaluated to 5
In most expressions, an array variable in C is converted to a pointer to the array's initial memory location. So if you dereference the array, you will always get the value at the first position.
If you add 1 to the array before dereferencing, as in *(A+1), you will get the second position.
You can get any position from the array using the same method:
*(A) is the first position
*(A+1) is the second position
*(A+2) is the third position
and so on...
If you declare the int array as int* A, allocate the memory and assign the values yourself, it is usually easier to visualize how this works.
I was asked what the output of the following code is:
int a[5] = { 1, 3, 5, 7, 9 };
int *p = (int *)(&a + 1);
printf("%d, %d", *(a + 1), *(p - 1));
1) 3, 9
2) Error
3) 3, 1
4) 2, 1
The answer is No. 1.
It is easy to see that *(a+1) is 3.
But how about int *p = (int *)(&a + 1); and *(p - 1) ?
The answer to this could be either "1) 3,9" or "2) Error" (or more specifically undefined behavior) depending on how you read the C standard.
First, let's take this:
&a + 1
The & operator takes the address of the array a giving us an expression of type int(*)[5] i.e. a pointer to an array of int of size 5. Adding 1 to this treats the pointer as pointing to the first element of an array of int [5], with the resulting pointer pointing to just after a.
Also, even though &a points to a singular object (in this case an array of type int [5]) we can still add 1 to this address. This is valid because 1) a pointer to a singular object can be treated as a pointer to the first element of an array of size 1, and 2) a pointer may point to one element past the end of an array.
Section 6.5.6p7 of the C standard states the following regarding treating a pointer to an object as a pointer to the first element of an array of size 1:
For the purposes of these operators, a pointer to an object
that is not an element of an array behaves the same as a pointer
to the first element of an array of length one with the type of the
object as its element type.
And section 6.5.6p8 says the following regarding allowing a pointer to point to just past the end of an array:
When an expression that has integer type is added to or
subtracted from a pointer, the result has the type of the pointer
operand. If the pointer operand points to an element of an array
object, and the array is large enough, the result points to an element
offset from the original element such that the difference of the
subscripts of the resulting and original array elements equals the
integer expression. In other words, if the expression P points to the
i-th element of an array object, the expressions (P)+N
(equivalently, N+(P)) and (P)-N (where N has the value n) point to,
respectively, the i+n-th and i−n-th elements of the array object,
provided they exist. Moreover, if the expression P points to the
last element of an array object, the expression (P)+1 points one past
the last element of the array object, and if the expression Q
points one past the last element of an array object, the
expression (Q)-1 points to the last element of the array
object. If both the pointer operand and the result point to
elements of the same array object, or one past the last
element of the array object, the evaluation shall not produce an
overflow; otherwise, the behavior is undefined. If the result points
one past the last element of the array object, it shall not be used as
the operand of a unary * operator that is evaluated.
Now comes the questionable part, which is the cast:
(int *)(&a + 1)
This converts the pointer of type int(*)[5] to type int *. The intent here is to change the pointer which points to the end of the 1-element array of int [5] to the end of the 5-element array of int.
However the C standard isn't clear on whether this conversion and the subsequent operation on the result is allowed. It does allow conversion from one object type to another and back, assuming the pointer is properly aligned. While the alignment shouldn't be an issue, using this pointer is iffy.
So this pointer is assigned to p:
int *p = (int *)(&a + 1)
Which is then used as follows:
*(p - 1)
If we assume that p validly points to one element past the end of the array a, subtracting 1 from it results in a pointer to the last element of the array. The * operator then dereferences this pointer to the last element, yielding the value 9.
So if we assume that (int *)(&a + 1) results in a valid pointer, then the answer is 1) 3,9 otherwise the answer is 2) Error.
In the line
int *p = (int *)(&a + 1);
note that &a is being written, not a. This is important.
If simply a had been written, then the array would have decayed to a pointer to the first element, i.e. to &a[0]. However, since the expression &a was used instead, the result of this expression has the same value as if a or &a[0] had been used, but the type is different: The type is a pointer to an array of 5 int elements, instead of a pointer to a single int element.
According to the rules on pointer arithmetic, incrementing a pointer by 1 will increase the memory address by the size of the object that it is pointing to. Since the pointer is not pointing to a single element, but to an array of 5 elements, the memory address will be incremented by 5 * sizeof(int). Therefore, after incrementing the pointer, the value of (but not type of) the pointer will be equivalent to &a[5], i.e. one past the end of the array.
After casting this pointer to int * and assigning the result to p, the expression p is fully equivalent to &a[5] (both in value and in type).
Therefore, the expression *(p - 1) is equivalent to *(&a[5] - 1), which is equivalent to *(&a[4]), or simply a[4].
This:
&a + 1;
is taking the address of a, an array, and adding 1, which adds the size of one a, i.e. 5 integers. Then the indexing "backs down", one integer, ending up in the final element of a.
Normally whenever arrays are used in expressions, they "decay" into a pointer to the first element. There are a few exceptions to this rule and one such exception is the & operator.
&a therefore yields a pointer to the array of type int (*)[5]. Then &a + 1 is pointer arithmetic on such a type, meaning the pointer address is increased by the size of one int [5]. We end up pointing just beyond the array, but C actually allows us to do that as long as we don't de-reference that location.
Then the pointer is converted to int * with a cast, which we can do too - C allows pretty much any manner of wild pointer conversions as long as we don't de-reference or cause misalignment etc.
p - 1 does pointer arithmetic on type int and the actual type of data in the array is also int, so we are allowed to de-reference that location. We end up at the last item of the array.
After reading some posts on this site, I realized that an array in C isn't just a constant pointer as I originally thought, but is itself a distinct type, although in most cases an array "decays" to a constant pointer to the first element of the array. Because of this new information, a question arose in my mind. Suppose we have a two-dimensional array A[10][10]. Why is the result of the expression *A a pointer to the first element of the array? I thought that in this expression, A decays to a constant pointer to the first element of the array, A[0][0], and then the application of the indirection should give us the value of A[0][0], but in fact it still gives us the address of the first element of the array. Certainly, something is wrong with my logic or my understanding of arrays or pointers, so where do I get it wrong?
Thanks in advance.
The first element of A is A[0], not A[0][0].
This is because A is an array of ten things. C does not have two-dimensional arrays as a primary type. An array with multiple dimensions is derived or constructed as multiple layers of arrays. To the compiler, the resulting type is still just an array, whose elements happen to be further arrays.
Thus, in *A:
A is converted to a pointer to its first element. That pointer is &A[0], so *A becomes *&A[0].
* and & cancel, so the expression becomes A[0].
A[0] is an array of ten elements, so it is converted to a pointer to its first element, &A[0][0].
*A, or A[0], is itself an array of 10 elements, and an array is in most expressions represented by a pointer to its first element. However, A[10][10] (let's say an array of ints) is effectively a block of memory holding 100 ints: the 10 of the first row followed by the 10 of the second row, and so on. If the expression *A or A[0] returned an int instead of a pointer to that row, it would be impossible to use the expression A[0][0], right?
However, because such multidimensional array is a single block of memory, it's also possible to cast it to a pointer and then access it with an expression of this kind :
((int *)A)[iRow * 10 + iCol];
Which is equivalent to the expression :
A[iRow][iCol];
But this is only possible for a 2D array declared this way:
#include <stdio.h>

int main(void)
{
    int A[10][10] = { 0 };
    A[9][9] = 9999;
    printf("==> %d\n", ((int *)A)[9 * 10 + 9]); //==> 9999
    return 0;
}
It is not possible if the memory may be made of separate blocks of bytes (probably requiring several calls to malloc), as with declarations of this kind:
int * A[10]; // or
int ** A;
A decays to a constant pointer to the first element of the array
A[0][0]
No, it does not. Why?
The C standard specifies that *(pointer + integer) is equivalent to pointer[integer], so *A is equivalent to *(A + 0), which is A[0]. A[0] does not give you the element A[0][0]; it is a one-dimensional array, which decays to a pointer to the first element of the first row of the array.
I'm very confused about this question in C.
if a[i] is equivalent to *(a+i). What is the equivalent of a[j][i]?
I know the (a+i) is incrementing the memory address of the first element of the array by the value of i and then using the * operator to dereference that address to obtain the value. However, I am confused about multidimensional arrays. In memory, the values are stored just like a single dimensional array but I don't understand how I can increment the memory address by using the variable i or j like in the single dimensional array example.
for some reason printing *a in single dimensional array will print the first element of the array whereas *a in a multidimensional array will print a random number. Why is this so?
Any help is greatly appreciated.
if a[i] is equivalent to *(a+i). What is the equivalent of a[j][i]?
a[j][i]
is equivalent to
*(*(a+ j) + i)
If you want to know why, let's see.
You already know that
a[j] = *(a+j)               (1)
Applying the same rule to the second subscript gives
a[j][i] = *(a[j]+i)         (2)
Substituting (1) into (2), it becomes
a[j][i] = *(*(a+j)+i)       (3)
The literal answer to the question is simple: a[i] is defined to be always identical to *(a+i) by the standard, and therefore, a[j][i] is guaranteed to be always identical to *(*(a+j)+i). However, that by itself does not help us to understand what is going on; it just transforms one compound expression to another.
a[j][i] (and by extension, *(*(a+j)+i)) does very different things depending on the type of a. This is because, depending on the types, there may be implicit array-to-pointer conversions that are not apparent.
In C, a value of array type T[x] is implicitly converted to an rvalue of pointer type T* in many contexts, some of which include the left side of the subscript operator, as well as an operand in addition. So if you do either a[i] or *(a+i), and a is an expression of array type, in both cases it is converted to a pointer to its first element (like &a[0]) and it's the pointer that participates in the operation. Thus you can see how *(a+i) makes sense.
If a had type T[x][y], it would be a "true" multidimensional array, which is a C array whose elements are themselves C arrays (of a certain compile-time-constant size). In this case, if you consider *(*(a+j)+i), what is happening is 1) a is converted to a pointer to its first element (which is a pointer to an array, of type T(*)[y]), 2) that pointer is incremented and dereferenced, producing a value of array type (the jth subarray of a), 3) that array is then converted to a pointer to its first element (a pointer of type T*), which is then 4) incremented and dereferenced. This finally produces the ith element of the jth element of a, what you usually think of as a[j][i].
However, a could also have type, say, T**. This is usually used to implement "fake" multidimensional arrays, which is an array of pointers, which then in turn point to the first element of some array. This allows you to have "rows" that can have different sizes (thus the multidimensional array need not be "rectangular"), and sizes not fixed at compile time. The "rows", as well as the main pointer array, do not have to be stored contiguously. In this case, if you consider *(*(a+j)+i), what is happening is 1) a is incremented and dereferenced, producing a value of pointer type (the jth element of a), 2) that pointer is then incremented and dereferenced. This finally produces the ith element of the array referred to by the jth element of the main pointer array. Note that in this case there are no implicit array-to-pointer conversions.
In general, arrays follow pointer rules. For a 1D array, one dereference is enough to get a value. For a multidimensional array, you have to dereference once per dimension to get a value: twice for a 2D array, three times for a 3D array, and so on.
For example, a[j][i] = *(*(a+j)+i).
A multidimensional array in C is contiguous. The following:
int a[4][5];
consists of 4 int[5]s laid out next to each other in memory.
An array of pointers:
int *a[4];
is jagged. Each pointer can point to (the first element of) a separate array of a different length.
a[i][j] is equivalent to *(*(a+i)+j). See the C11 standard, section 6.5.2.1:
The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2)))
Thus, a[i][j] is equivalent to (*(a+i))[j], which is equivalent to *(*(a+i)+j).
This equivalence exists because in most contexts, expressions of array type decay to pointers to their first element (C11 standard, 6.3.2.1). a[i][j] is interpreted as the following:
a is an array of arrays, so it decays to a pointer to a[0], the first subarray.
a+i is a pointer to the ith subarray of a.
a[i] is equivalent to *(a+i), dereferencing a pointer to the ith subarray of a. Since this is an expression of array type, it decays to a pointer to a[i][0].
a[i][j] is equivalent to *(*(a+i)+j), dereferencing a pointer to the jth element of the ith subarray of a.
Note that pointers to arrays are different from pointers to their first element. a+i is a pointer to an array; it is not an expression of array type, and it does not decay, whether to a pointer to a pointer or to any other type.
and for some reason printing *a in single dimensional array will print the first element of the array whereas *a in a multidimensional array will print a random number. Why is this so?
For a two-dimensional array you need to dereference twice:
printf("\nvalue : %d",**a);
to print the first element of the array.
I came across a pointer comparison program in C. What I didn't understand are these two statements.
j=&arr[4];
k=(arr+4);
The first statement stores the address of the fifth element; the second uses a syntax I had not seen before. Can anybody explain the second statement? Also,
after executing the program, j and k are equal, so they are pointing to the same location.
k=(arr+4);
means k will point 4 elements past the start of arr, after arr has decayed into a pointer to index 0.
The array name decays to a pointer to its element at index 0. Adding 4 means it will point to the 5th element.
It's the infamous pointer arithmetic! The statement simply assigns the address of the element that lies 4 elements to the right of the one arr points to. arr + 4 is pointing to the address of arr[4].
This is simply pointer arithmetic, mixed with C's indexing<->pointer dereference equivalence.
The former means that the expression arr + 4 causes arr (the name of an array) to decay into simply a pointer to the array's first element. In other words, arr == &arr[0] is true.
The latter is this equivalency, for any pointer a and integer i:
a[i] === *(a + i)
This means that the first expression, the assignment to j, can be read as j = &(*(arr + 4)), which makes it (pretty) clear that it's just taking the address of the element with index 4, just as the k line is doing.
This code uses a simple case of pointer arithmetic. It assigns the address of the array plus 4 elements (so the address of the 5th element) to the pointer k.
Every arr[4] expression is expanded to *(arr+4) by the compiler itself.
These two are equivalent and can be used interchangeably.
Both are ways of getting a pointer value.
First, arr[x] yields the contents of element x+1 of the array, and you can get its address with the & operator.
The second is known as pointer arithmetic and yields the address held in arr plus x positions, that is, the address of element x+1.
It's basic pointer arithmetic. k is a pointer, and arr decays to a pointer to the first element of the array (a pointer to arr[0]). So by adding 4 to arr, you move the pointer on by 4 elements. Therefore k=(arr+4) means k points to arr[4], which is the fifth element, and the same as j.
In this program all three addresses which I mention refer to the first element of the array but why don't I get the value of the first element of the array when I dereference them?
#include <stdio.h>

int main(void)
{
    int a[5] = {1,2,3,4,5};
    printf("address a = %p\n",(void *)a);
    printf("address of a[0] = %p\n",(void *)&a[0]);
    printf("address of first element = %p\n",(void *)&a);
    printf("value of first element of the array a =%d\n",*(a));
    printf("first element =%d\n",*(&a[0]));
    printf("a[0] = %p\n",(void *)*(&a));//this print statement again prints the address of a[0]
    return 0;
}
I get address of the first element of the array a for the first 3 print statements and when I dereference all the 3 I get values only for the fourth and fifth print statements and not for the sixth print statement (which is accompanied with a comment).
Things to remember:
1. Name of the array is the address of its first element
So, as the array name is a, then, printing a would give you the address of a[0] (which is also the address of the array too) i.e. you will get the values of &a[0] (same as a) and &a to be the same
Now, you are aware that a and &a[0] refer to the first element, you can dereference the first element in 3 ways:-
*a
*(&a[0])
a[0] - Note that internally, this gets transformed into: *(a+0)
Things to remember:
2. Adding an integer to a pointer takes the pointer to the next element
Here, &a is the address of the whole array. Although the value of &a is the same as &a[0] and a, it is a pointer to the array, not a pointer to the first element.
So, if you add 1 to &a i.e. &a + 1, you'll go beyond this array.
Similarly, as &a[0] and a are pointers to the first element, adding 1 to them will give you the next element of the array (if more than one item is defined in the array), i.e. (a+1) and &a[0] + 1 point to the element after the first one. Now, for dereferencing them, you can use:
*(a+1)
*(&a[0] +1)
a[1] - Note that internally, this gets transformed into: *(a+1)
Adding more information to remove the following doubt:
If, as this answer states, the name of the array were the address of its first element, then &a would be the address of the address of the first element.
The answer to this doubt is both No and Yes.
No because there is nothing like address of the address.
For understanding yes, consider the following situation:
Imagine that you have 10 boxes of chocolates and each box contains 5 chocolates (fitted in a line inside the box) and that the boxes are lined up.
Ok, enough chocolates to explain.
Here, boxes represent arrays of chocolates. Thus, we have with us 5 boxes of 5 chocolates each. The declaration for that would be:
Translating it to C, just assume that a is an array with 5 numbers.
Now, if I ask you to tell me the location of the first box, then, you will refer to it as &a. If I ask you to get me the location of second box, then, you'll refer to it as &a +1.
If I ask you to get me the location of first chocolate in the first box, then you'll refer to it as &a[0] or (a+0) or a.
If I ask you to get me the location of second chocolate in the first box, then you'll refer to it as &a[1] or (a+1) or a+1. Note: In (a+1), as a is the name of the array, it is the address of the first element, which is an integer. So, increasing a by 1, means the address of the second element.
If I ask you to get me the location of the second box of chocolates, then, you'll refer to it as (&a+1)
If I ask you to get me the location of the first chocolate in the second box of chocolates, then, you'll refer to it as *(&a+1) or *((&a+1) + 0)
If I ask you to get me the location of the third chocolate in the second box of chocolates, then, you'll refer to it as (*(&a+1))+2
To answer just your question, by definition the operators * and & are such that they cancel out. Taking the address of a variable and then dereferencing gives you back the variable. Here this is an array, a. In most contexts arrays "decay" to pointers, so what you then see again is the address of the first element.
The C standard specifies that an expression that has type “array of type” is converted to type “pointer to type” and points to the initial element of the array, except when the expression is the operand of & (or three other exceptions, noted below but not relevant to this question). Here is how this applies to your examples:
a is an array of int. It is converted to pointer to int. The value is the address of the initial element of a.
In &a[0], a[0] is processed first. a is again converted to pointer to int. Then the subscript operator is applied, which produces an lvalue for the initial element of the array. Finally, & takes the address of this lvalue, so the value is the address of the initial element of a.
In &a, a is the operand of &, so a is not converted to a pointer to int. It remains an array, and &a takes its address. The result has type “pointer to array of int”, and its value is the address of the array, which equals the address of the initial element.
For completeness: The relevant rule in the C standard is 6.3.2.1 paragraph 3. The exceptions are:
The array expression is the operand of &.
The array expression is the operand of sizeof.
The array expression is the operand of _Alignof.
The array expression is a string literal used to initialize an array.
The last means that in char s[] = "abc";, "abc" is not converted to a pointer to the initial element; it remains an array, which is used to initialize s.
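In most expressions a is converted to a pointer, and a[0] is *(a+0). But when you write *(&a), the * and & cancel and you get the array a back, which again becomes the address of its first element, so you aren't dereferencing down to an int.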