Can't understand why these four notations are same? [duplicate] - c

This question already has answers here:
With arrays, why is it the case that a[5] == 5[a]?
(20 answers)
Closed 5 years ago.
Array declaration:
int arr [ ]={34, 65, 23, 75, 76, 33};
Four notations: (consider i=0)
arr[i]
and
*(arr+i)
and
*(i+arr)
and
i[arr]

Let's take a look at how your array is laid out in memory:
low address high address
| |
v v
+----+----+----+----+----+----+
| 34 | 65 | 23 | 75 | 76 | 33 |
+----+----+----+----+----+----+
^ ^ ^ ^
| | | ...etc
| | |
| | arr[2]
| |
| arr[1]
|
arr[0]
That the first element is arr[0] and the second is arr[1] is pretty clear; that's what everybody learns. What is less clear is that the compiler actually translates an expression such as arr[i] to *(arr + i).
What *(arr + i) does is first get a pointer to the first element, then do pointer arithmetic to get a pointer to the wanted element at index i, and then dereference the pointer to get its value.
Due to the commutative property of addition, the expression *(arr + i) is equal to *(i + arr), which due to the above-mentioned translation is equal to i[arr].
The equivalence of arr[i] and *(arr + i) is also what's behind the decay of an array to a pointer to its first element.
The pointer to the array's first element is &arr[0]. We know that arr[0] is equal to *(arr + 0), which means &arr[0] has to be equal to &*(arr + 0). Adding zero is a no-op, which leaves the expression &*(arr). Parentheses around a single term with no operator can also be removed, leaving &*arr. Lastly, the address-of and dereference operators are each other's opposites and cancel out, leaving simply arr. So &arr[0] is equal to arr.

Each element in the array has a position in memory, and the positions are sequential. An array in C is not itself a pointer, but in most expressions its name is converted to a pointer to the first element of the array.
arr[i] => Gets the value at position i in the array. It is the same as *(arr + i).
*(arr+i) => Gets the value stored at the address obtained by advancing the pointer arr by i elements.
*(i+arr) => Is the same as *(arr+i). The sum is commutative.
i[arr] => Is the same as *(i+arr). It's another way of writing it.

They are the same because the C language specification says so. Read n1570

The notation a[i] is syntactic sugar for *(a+i).
The first is closer to the mathematical notation people are used to, while the second spells out the address computation the machine performs.
On the other hand *(a+i)=*(i+a)=i[a] because adding a pointer and an integer is commutative.

These are the same because of how the array subscript operator [] is defined.
From section 6.5.2.1 of the C standard:
2 A postfix expression followed by an expression in square brackets []
is a subscripted designation of an element of an array object. The
definition of the subscript operator [] is that E1[E2] is
identical to (*((E1)+(E2))). Because of the conversion rules that
apply to the binary + operator, if E1 is an array object
(equivalently, a pointer to the initial element of an array object)
and E2 is an integer, E1[E2] designates the E2-th element of
E1 (counting from zero).
The expression arr[i] in your example is of the form E1[E2]. Because the standard states that this is the same as *(E1+E2) that means that arr[i] is the same as *(arr + i).
Because of the commutative property of addition, *(arr + i) is the same as *(i + arr). Applying the equivalence rule above to this expression gives i[arr].
So in short, those 4 expressions are equivalent because of how the standard defines array subscripting and because of the commutative property of addition.

It works because an array name in C (i.e. arr in your example) is converted, in most expressions, to a pointer to the beginning of the array's memory locations. A pointer is a value which represents the address of a specific memory location. When you put a '*' in front of a pointer, it means "give me the data in that memory location".
So, if arr is a pointer to the beginning of the array, *(arr) or *(arr + 0) is the data in the 0th index of the array, and *(arr + 1) is the data in the 1st index, and so on.
An expression which looks like A[B] essentially gets translated into something like *(A+B). So, arr[0] = *(arr + 0) and arr[i] = *(arr+i), etc.
And because A+B = B+A, the two are interchangeable. Meaning *(arr+i) = *(i+arr).
And because arr[i] = *(arr+i) and *(arr+i) = *(i+arr), it should make sense that arr[i] = i[arr].

Related

dereferencing 2D array using arithmetic

#include <stdio.h>
int main(void)
{
short arr[3][2]={3,5,11,14,17,20};
printf("%d %d",*(arr+1)[1],**(arr+2));
return 0;
}
Hi. As I understand the code above, *(arr+1)[1] should be equivalent to *(*(arr+sizeof(1D array)*1)+sizeof(short)*1), i.e. arr[1][1], which is 14. But the program prints arr[2][0]. Can someone please explain how dereferencing the array the second time adds sizeof(1D array), i.e. *(*(arr+sizeof(1D array)*1)+sizeof(1D array)*1), giving arr[2][0]?
From the C Standard (6.5.2.1 Array subscripting)
2 A postfix expression followed by an expression in square brackets []
is a subscripted designation of an element of an array object. The
definition of the subscript operator [] is that E1[E2] is identical to
(*((E1)+(E2))). Because of the conversion rules that apply to the
binary + operator, if E1 is an array object (equivalently, a pointer
to the initial element of an array object) and E2 is an integer,
E1[E2] designates the E2-th element of E1 (counting from zero).
So the expression
*(arr+1)[1]
can be rewritten like
* ( *( arr + 1 + 1 ) )
that is the same as
*( *( arr + 2 ) )
arr + 2 points to the third "row" of the array. Dereferencing this pointer expression yields the "row" itself, of type short[2], which when used in the expression *( arr[2] ) is converted to a pointer to its first element. So the expression is equivalent to arr[2][0] and yields the value 17.
Thus these two expressions
*(arr+1)[1],
and
**(arr+2)
are equivalent to each other.
Note: pay attention to that there is a typo in your code
printf("%d %d",*(arr+1)[1],**(arr+2);
You need one more parenthesis
printf("%d %d",*(arr+1)[1],**(arr+2) );

What is '-1[p]' when p points to an array (of int) index? [duplicate]

This question already has answers here:
With arrays, why is it the case that a[5] == 5[a]?
(20 answers)
Closed 3 years ago.
Today I stumbled over a C riddle that got a new surprise for me.
I didn't think that -1[p] in the example below would compile, but it did. In fact, x ends up to be -3.
int x;
int array[] = {1, 2, 3};
int *p = &array[1];
x = -1[p];
I searched the internet for something like -1[pointer] but couldn't find anything. Okay, it is difficult to enter the correct search query, I admit. Who knows why -1[p] compiles and x becomes -3?
I'm the person that made this "riddle" (see my Twitter post)
So! What's up with -1[p]?
ISO C actually defines [] to be symmetrical, meaning x[y] is the same as y[x], where x and y are both expressions.
Naively, we could jump to the conclusion that -1[p] is therefore p[-1] and so x = 1.
However, -1 is actually the unary minus operator applied to the constant 1, and unary minus has lower precedence than [].
So, -1[p] is -(p[1]), which yields -3.
This can lead to funky looking snippets like this one, too:
sizeof(char)["abc"] /* yields 'b' */
First thing to figure out is the precedence. Namely [] has higher precedence than unary operators, so -1[p] is equal to -(1[p]), not (-1)[p]. So we're taking the result of 1[p] and negating it.
x[y] is equal to *(x+y), so 1[p] is equal to *(1+p), which is equal to *(p+1), which is equal to p[1].
So we're taking the element one after where p points, so the third element of array, i.e. 3, and then negating it, which gives us -3.
According to the C Standard (6.5.2 Postfix operators) the subscript operator is defined the following way
postfix-expression [ expression ]
So before the square brackets there shall be a postfix expression.
In this expression statement
x = -1[p];
there is the postfix expression 1 (which is at the same time a primary expression), the postfix expression 1[p] (the subscript operator), and the unary operator -. Take into account that when the compiler splits a program into tokens, integer constants are tokens by themselves, without the minus sign; the minus is a separate token.
So the statement can be rewritten like
x = -( 1[p] );
because a postfix expression has higher precedence than a unary expression.
Let's consider at first the postfix sub-expression 1[p]
According to the C Standard (6.5.2.1 Array subscripting)
2 A postfix expression followed by an expression in square brackets []
is a subscripted designation of an element of an array object. The
definition of the subscript operator [] is that E1[E2] is identical to
(*((E1)+(E2))). Because of the conversion rules that apply to the
binary + operator, if E1 is an array object (equivalently, a pointer
to the initial element of an array object) and E2 is an integer,
E1[E2] designates the E2-th element of E1 (counting from zero).
So this sub-expression evaluates like *( ( 1 ) + ( p ) ) and is the same as *( ( p ) + ( 1 ) ).
Thus the above statement
x = -1[p];
is equivalent to
x = -p[1];
and will yield -3, because the pointer p points to the second element of the array due to the statement
int *p = &array[1];
and then the expression p[1] yields the value of the element after the second element of the array. Then the unary operator - is applied.
This
int array[] = {1, 2, 3};
looks like
array[0] array[1] array[2]
--------------------------
| 1 | 2 | 3 |
--------------------------
0x100 0x104 0x108 <-- lets assume 0x100 is base address of array
array
Next when you do like
int *p = &array[1];
the integer pointer p points to address of array[1] i.e 0x104. It looks like
array[0] array[1] array[2]
--------------------------
| 1 | 2 | 3 |
--------------------------
0x100 0x104 0x108 <-- lets assume 0x100 is base address of array
|
p holds 0x104
And when you do like
x = -1[p]
-1[p] is equivalent to -(1[p]), i.e. -(p[1]). It looks like
-(p[1]) ==> -(*(p + 1)) /* p points to array[1], i.e. 0x104; + 1 advances by sizeof(int) = 4 bytes */
==> -(*(0x104 + 4))
==> -(*(0x108)) ==> value at 0x108 is 3
==> x becomes -3
What happens here is really interesting.
p[n] means *(p+n). That's why you see -3: p points to array[1], which is 2, and -1[p] is interpreted as -(*(p+1)), which is -3.

Achieve the output in one statement

I was given this question by my school teacher. I was supposed to add in one statement in the C code and achieve this desired output.
I have tried but i am stuck. I think the main idea of this question is to establish the relationship between the int x[] and the y[] string as i increases from 0 to 6.
The code is below:
#include <stdio.h>
int main(){
int i, x[] = {-5,10,-10,-2,23,-20};
char y[20] = "goodbye";
char * p = y;
for (i=0;i<6;i++){
*(p + i) = //Fill in the one line statement here
}
y[6] = '\0';
printf("%s\n",p); //should print out "byebye"
}
As you can see, the ASCII value of the character b is 5 less than that of g, and similarly y is 10 greater than o, and so on. So it will be (this meets the criterion of using x):
*(p+i) = (char)(*(p+i)+x[i]);
Yes, one thing mentioned by rici is very important: *(p+i) is nothing other than p[i]. In fact p[i] is much cleaner to use, and underneath it is still calculated as *(p+i).
From standard 6.5.2.1p2 C11 N1570
A postfix expression followed by an expression in square brackets [] is a subscripted designation of an element of an array object. The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))). Because of the conversion rules that apply to the binary + operator, if E1 is an array object (equivalently, a pointer to the initial element of an array object) and E2 is an integer, E1[E2] designates the E2-th element of E1 (counting from zero).
The standard mentions this also. That being said, it would be as simple as
p[i]+=x[i];
Other ideas that came to mind while solving it.
The first thing I thought of (this establishes no relation between x and y):
*(p + i) = "byebye"[i];
String literals are basically arrays; the literal decays into a pointer to its first element, and then we do *(decayed pointer + i). This eventually assigns the characters of "byebye" to the char array y.
Or something like this:- (too many hardcoded values - this does relate x and y)
*(p+i) = *(y+4+i%3);
Using the modulus operation you can manipulate your loop to assign byebye to the 6 char values in p.
This works because you are starting from y[4] which is 'b'.
The 6 in the for loop is your next hint. You need to iterate through bye twice. bye has 3 characters.
This gives you:
*(p + i) = y[4+(i%3)];

How do I understand the "add" (+) operator when used in a C array expression?

I've been facing difficulties to understand the fourth line of code after the first curly brace,
#include<stdio.h>
int main()
{
int arr[] = {10,20,36,72,45,36};
int *j,*k;
j = &arr[4];
k = (arr+4);
if(j==k)
printf("The two pointers are pointing at the same location");
else
printf("The two pointers are not pointing at the same location");
}
I just wanted to know what the fourth line of code after the first curly brace i.e. k = (arr+4); does?
Since k is a pointer, I expected it to be assigned something with the address-of operator (&). If the address-of operator isn't there, what does k = (arr+4) do?
For any array or pointer arr and index i, the expression arr[i] is exactly equal to *(arr + i).
Now considering that arrays naturally can decay to pointers to their first element, arr + i is a pointer to element i.
Except when it is the operand of & or sizeof, an array converts to a pointer to its first element. In your case arr converts to a pointer to int that points to the first element. arr + 1 points to the second element, arr + 2 points to the third element, etc. arr + 1 means advancing the address held by arr by sizeof(int) bytes.
This is a simplified diagram of the first 2 elements of the array. Let's assume int is 4 bytes long.
| first element | second element |
-------------------------------------------------
| | | | | | | | |
| | | | | | | | |
-------------------------------------------------
0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07
arr will contain 0x00
arr + 1 will contain 0x04
*arr means take the value from address 0x00. It's equivalent to *(arr+0), *(0+arr), arr[0] and 0[arr]. Since arr converts to int*, a four-byte value is read.
With int *k = arr; k will contain the same address as arr.
k = (arr + 4) will contain the address of the 5th element.
j = &arr[4]; will also store the address of the 5th element
As k is defined as a pointer, it can be used to store an address as its value and so point to a location.
Here (arr+4) yields the address of arr advanced by 4 elements.
That is always the 5th element, arr[4]; only the byte offset, 4 * sizeof(int), depends on how many bytes the system uses to store an int (16 bytes if int takes 4 bytes).
Something that you should be aware of (C Standards#6.3.2.1p3):
Except when it is the operand of the sizeof operator, the _Alignof operator, or the unary & operator, or is a string literal used to initialize an array, an expression that has type ''array of type'' is converted to an expression with type ''pointer to type'' that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.
The statement:
int arr[] = {10,20,36,72,45,36};
arr is an array of int.
The expression arr[i], can also be written as:
*(arr+i)
So, &arr[i] can be written as:
&(*(arr+i))
The operator & is used to get the address and the operator * is used for dereferencing. These operators cancel the effect of each other when used one after another. Hence, &(*(arr+i)) is equivalent to arr+i.
I just wanted to know what the fourth line of code after the first curly brace i.e. k = (arr+4); does?
In the statement:
k = (arr+4);
none of the operators - sizeof, _Alignof and unary & is used. So,
arr will convert to a pointer to type int. That means arr+4 gives the address of the element four positions past the object (i.e. int) pointed to by arr, which is nothing but &arr[4].
&arr[4] --> &(*(arr+4)) --> (arr+4)
You need to understand pointer arithmetic here.
arr[i] is interpreted as *(arr + i),
where * is the 'dereferencing' or 'value at' operator and arr represents the address of the first element of the array, which is also where the array itself starts. This literally means,
valueat(starting address of arr + i)
Now suppose the address of arr is 100. What gets added to that address is i elements of the array's element type, not the raw value of i, that is,
valueat(100 + i elements of type int)
Now according to your code, here you are assigning the address of the element at index 4 (the fifth element) to pointer j,
j = &arr[4];
j = &(valueat(100+ 4 elements of type int));
j = &(valueat(100+ 16)); ->
j = &(valueat(116)); ->
j = &(45) that is j = 116
Now when you do
k = (arr+4);
k = (starting address of arr + 4 elements of type int);
k = (100 + 16); that is k = 116, and that is why the output,
The two pointers are pointing at the same location
Hope this helps.

String as an array index

In 3["XoePhoenix"], array index is of type array of characters. Can we do this in C? Isn't it true that an array index must be an integer?
What does 3["XeoPhoenix"] mean?
3["XoePhoenix"] is the same as "XoePhoenix"[3], so it will evaluate to the char 'P'.
The array syntax in C is not more than a different way of writing *( x + y ), where x and y are the sub expressions before and inside the brackets. Due to the commutativity of the addition these sub expressions can be exchanged without changing the meaning of the expression.
So 3["XeoPhoenix"] is compiled as *( 3 + "XeoPhoenix" ) where the string decays to a pointer and 3 is added to this pointer which in turn results in a pointer to the 4th char in the string. The * dereferences this pointer and so this expression evaluates to 'P'.
"XeoPhoenix"[ 3 ] would be compiled as *( "XeoPhoenix" + 3 ) and you can see that would lead to the same result.
3["XeoPhoenix"] is equivalent to "XeoPhoenix"[3] and would evaluate to the 4th character i.e 'P'.
In general a[i] and i[a] are equivalent.
a[i] = *(a + i) = *(i + a) = i[a]
In C, arrays are very simple data structures: consecutive blocks of memory. Indices therefore need to be integers, as they are nothing more than offsets to addresses in memory. In 3["XoePhoenix"] the integer 3 is still the index; it is simply written outside the brackets.
