String as an array index - c

In 3["XoePhoenix"], array index is of type array of characters. Can we do this in C? Isn't it true that an array index must be an integer?
What does 3["XeoPhoenix"] mean?

3["XoePhoenix"] is the same as "XoePhoenix"[3], so it will evaluate to the char 'P'.
The array syntax in C is not more than a different way of writing *( x + y ), where x and y are the sub expressions before and inside the brackets. Due to the commutativity of the addition these sub expressions can be exchanged without changing the meaning of the expression.
So 3["XeoPhoenix"] is compiled as *( 3 + "XeoPhoenix" ) where the string decays to a pointer and 3 is added to this pointer which in turn results in a pointer to the 4th char in the string. The * dereferences this pointer and so this expression evaluates to 'P'.
"XeoPhoenix"[ 3 ] would be compiled as *( "XeoPhoenix" + 3 ) and you can see that would lead to the same result.

3["XeoPhoenix"] is equivalent to "XeoPhoenix"[3] and would evaluate to the 4th character i.e 'P'.
In general a[i] and i[a] are equivalent.
a[i] = *(a + i) = *(i + a) = i[a]

In C, arrays are very simple data structures with consecutive blocks of memory. They therefore need to be integers as these indices are nothing more than offsets to addresses in memory.

Related

dereferencing 2D array using arithmetic

int main(void)
{
short arr[3][2]={3,5,11,14,17,20};
printf("%d %d",*(arr+1)[1],**(arr+2));
return 0;
}
Hi. In above code as per my understanding ,*(arr+1)[1] is equivalent to *(*(arr+sizeof(1D array)*1)+sizeof(short)*1)=>arr[1][1] i.e 14. But the program output is arr[2][0]. can someone please explain how dereferencing the array second time adds sizeof(1Darray) i.e *(*(arr+sizeof(1D array)*1)+sizeof(1D array)*1)=>arr[2][0]
From the C Standard (6.5.2.1 Array subscripting)
2 A postfix expression followed by an expression in square brackets []
is a subscripted designation of an element of an array object. The
definition of the subscript operator [] is that E1[E2] is identical to
(*((E1)+(E2))). Because of the conversion rules that apply to the
binary + operator, if E1 is an array object (equivalently, a pointer
to the initial element of an array object) and E2 is an integer,
E1[E2] designates the E2-th element of E1 (counting from zero).
So the expression
*(arr+1)[1]
can be rewritten like
* ( *( arr + 1 + 1 ) )
that is the same as
*( *( arr + 2 ) )
arr + 2 points to the third "row" of the array. Dereferencing the pointer expression you will get the "row" itself of the type short[2] that used in the expression *( arr[2] ) is converted to pointer to its first element. So the expression equivalent to arr[2][0] yields the value 17.
Thus these two expressions
*(arr+1)[1],
and
**(arr+2)
are equivalent each other.
Note: pay attention to that there is a typo in your code
printf("%d %d",*(arr+1)[1],**(arr+2);
You need one more parenthesis
printf("%d %d",*(arr+1)[1],**(arr+2) );

Achieve the output in one statement

I was given this question by my school teacher. I was supposed to add in one statement in the C code and achieve this desired output.
I have tried but i am stuck. I think the main idea of this question is to establish the relationship between the int x[] and the y[] string as i increases from 0 to 6.
The code is below:
#include <stdio.h>
int main(){
int i, x[] = {-5,10,-10,-2,23,-20};
char y[20] = "goodbye";
char * p = y;
for (i=0;i<6;i++){
*(p + i) = //Fill in the one line statement here
}
y[6] = '\0';
printf("%s\n",p); //should print out "byebye"
}
As you can see the ascii value of the characters b is from 5 lesser than g and similarly for y it is 10 greater than o..so it will be (This meets the criteria of using x) (solution utilizing the values of x)
*(p+i) = (char)(*(p+i)+x[i]);
Yes one thing that is mentioned by rici is very important. *(p+i) is nothing other than p[i] - in fact it is much leaner to use and underneath it is still being calculated as *(p+i).
From standard 6.5.2.1p2 C11 N1570
A postfix expression followed by an expression in square brackets [] is a subscripted designation of an element of an array object. The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))). Because of the conversion rules that apply to the binary + operator, if E1 is an array object (equivalently, a pointer to the initial element of an array object) and E2 is an integer, E1[E2] designates the E2-th element of E1 (counting from zero).
The standard mentions this also. Being said this it would be as simple as
p[i]+=x[i];
Thoughts that came to my mind while solving.
It would be (things that came to my mind when I saw it very first time - this is establishing no relation between x and y).
*(p + i) = "byebye"[i];
String literals are basically arrays and it decays into pointer to the first element of it and then we do this *(decayed pointer + i). This will eventually assign the characters of "byebye" to the char array y.
Or something like this:- (too many hardcoded values - this does relate x and y)
*(p+i) = *(y+4+i%3);
Using a the modulus operation you can manipulate your loop to assign byebye to the 6 *char values in p.
This works because you are starting from y[4] which is 'b'.
The 6 in the for loop is your next hint. You need to iterate through bye twice. bye has 3 characters.
This gives you:
*(p + i) = y[4+(i%3)];

How do I understand the "add" (+) operator when used in a C array expression?

I've been facing difficulties to understand the fourth line of code after the first curly brace,
#include<stdio.h>
int main()
{
int arr[] = {10,20,36,72,45,36};
int *j,*k;
j = &arr[4];
k = (arr+4);
if(j==k)
printf("The two pointers are pointing at the same location");
else
printf("The two pointers are not pointing at the same location");
}
I just wanted to know what the fourth line of code after the first curly brace i.e. k = (arr+4); does?
Since k was a pointer it was supposed to point at something that had an "address of operator" ? I can still understand that if it doesn't have the "address of operator" then whatever does the part of the code k = (arr+4) do?
For any array or pointer arr and index i, the expression arr[i] is exactly equal to *(arr + i).
Now considering that arrays naturally can decay to pointers to their first element, arr + i is a pointer to element i.
Without & or the sizeof operator an array converts to a pointer to the first element of the array.In your case array will convert to a pointer to int and will point to the first element.arr + 1 will point to the second element,arr + 2 will point to the third element etc..arr+1 means increment arr with sizeof(int).
This is simplified diagram of the first 2 elements of the array.Let's assume int is 4 bytes long.
| first element | second element |
-------------------------------------------------
| | | | | | | | |
| | | | | | | | |
-------------------------------------------------
0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07
arr will contain 0x00
arr + 1 will contain 0x04
*arr will mean take the value from adress 0x00.It's equivalent to *(arr+0), *(0+arr), arr[0] and 0[arr].Since arr is of type int* it will take a four bytes long value.
With int* k = array k will contain the same address with array.
k = (arr + 4) will contain the address of the 5th element.
j = &arr[4]; will also store the address of the 5th element
As k is defined as pointer, it can be use to store address as value and can point to a location.
Here (arr+4) will return address of arr and plus 4.
If int takes 4 bytes then it will point to 2nd element in arr, so it depends on system(32bit/64 bit), that how much it takes to store int.
Something that you should aware of (C Standards#6.3.2.1p3):
Except when it is the operand of the sizeof operator, the _Alignof operator, or the unary & operator, or is a string literal used to initialize an array, an expression that has type ''array of type'' is converted to an expression with type ''pointer to type'' that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.
The statement:
int arr[] = {10,20,36,72,45,36};
arr is an array of int.
The expression arr[i], can also be written as:
*(arr+i)
So, &a[i] can be written as:
&(*(arr+i))
The operator & is used to get the address and the operator * is used for dereferencing. These operators cancel the effect of each other when used one after another. Hence, &(*(arr+i)) is equivalent to arr+i.
I just wanted to know what the fourth line of code after the first curly brace i.e. k = (arr+4); does?
In the statement:
k = (arr+4);
none of the operators - sizeof, _Alignof and unary & is used. So,
arr will convert to a pointer to type int. That means, arr+4 will give the address of the four element past the object (i.e. int) pointed to by arr, which is nothing but &a[4].
&arr[4] --> &(*(arr+4)) --> (arr+4)
You need to understand pointer arithmetic here.
arr[i] is interpreted as *(arr + i),
where * is 'dereferencing' or 'value at' operator and arr always represents the address of first element of the array i.e. the address of array itself, which literally means,
valueat(starting address of arr + i)
Now suppose address of arr is 100 and you are adding i elements of type arr in address of array and not the value of i, that is,
valueat(100 + i)
Now according to your code, here you are assigning address of 4th element to pointer j ,
j = &arr[4];
j = &(valueat(100+ 4 elements of type int));
j = &(valueat(100+ 16)); ->
j = &(valueat(116)); ->
j = &(45) that is j = 116
Now when you do
k = (arr+4);
k = (starting address of arr + 4 elements of type int);
k = (100 + 16); that is k = 116, and that is why the output,
The two pointers are pointing at the same location
Hope this helps.

Can't understand why these four notations are same? [duplicate]

This question already has answers here:
With arrays, why is it the case that a[5] == 5[a]?
(20 answers)
Closed 5 years ago.
Array declaration:
int arr [ ]={34, 65, 23, 75, 76, 33};
Four notations: (consider i=0)
arr[i]
and
*(arr+i)
and
*(i+arr)
and
i[arr]
Lets take a look at how your array is laid out in memory:
low address high address
| |
v v
+----+----+----+----+----+----+
| 34 | 65 | 23 | 75 | 76 | 33 |
+----+----+----+----+----+----+
^ ^ ^ ^
| | | ...etc
| | |
| | arr[2]
| |
| arr[1]
|
arr[0]
That the first elements is arr[0], the second arr[1] is pretty clear, that's what everybody learns. What is less clear is that the compiler actually translates an expression such as arr[i] to *(arr + i).
What *(arr + i) does is first get a pointer to the first element, then do pointer arithmetic to get a pointer to the wanted element at index i, and then dereference the pointer to get its value.
Due to the commutative property of addition, the expression *(arr + i) is equal to *(i + arr) which due to the above mentioned translation is equal to i[arr].
The equivalence of arr[i] and *(arr + i) is also what's behind the decay of an array to a pointer to its first element.
The pointer to the arrays first element would be &arr[0]. Now we know that arr[0] should be equal to *(arr + 0) which means &arr[0] has to be equal to &*(arr + 0). Adding zero to anything is a no-op, so leading to the expression &*(arr). Parentheses with only one term and no operator can also be removed, leaving &*arr. And lastly the address-of and dereference operator are each other opposites and cancel out each other, leaving us with simply arr. So &arr[0] is equal to arr.
Each element in the array, have a position in memory. The positions in the arrays are sequential. The arrays in C are pointers and always point the first direction on memory for the collection (first element of the array).
arr[i] => Gets value of "i-position" in the array. It is the same that arr[i] = *(arr + i)
*(arr+i) => Gets value that is in memory by adding the position in memory that point arr and i value.
*(i+arr) => Is the same that *(arr+i). The sum is commutative.
i[arr] => Is the same that *(i+arr). It's another way of representing.
They are the same because the C language specification says so. Read n1570
The notation a[i] is syntactic sugar for *(a+i).
The first one is mathematical syntax (symbolics closer of what human brain is educated with) while the second one corresponds directly to one assembler instruction.
On the other hand *(a+i)=*(i+a)=i[a] because the arithmetic of pointers is commutative.
These are the same because of how the array subscript operator [] is defined.
From sectino 6.5.2.1 of the C standard:
2 A postfix expression followed by an expression in square brackets []
is a subscripted designation of an element of an array object. The
definition of the subscript operator [] is that E1[E2] is
identical to (*((E1)+(E2))). Because of the conversion rules that
apply to the binary + operator, if E1 is an array object
(equivalently, a pointer to the initial element of an array object)
and E2 is an integer, E1[E2] designates the E2-th element of
E1 (counting from zero).
The expression arr[i] in your example is of the form E1[E2]. Because the standard states that this is the same as *(E1+E2) that means that arr[i] is the same as *(arr + i).
Because of the commutative property of addition, *(arr + i) is the same as *(i + arr). Applying the equivalence rule above to this expression gives i[arr].
So in short, those 4 expressions are equivalent because of how the standard defines array subscripting and because of the commutative property of addition.
It works because an array variable in C (i.e. arr in your example) is just a pointer to the beginning of an array of memory locations. A pointer is number which represents the address of a specific memory location. When you put and '*' in front of a pointer, it means "give me the data in that memory location".
So, if arr is a pointer to the beginning of the array, *(arr) or *(arr + 0) is the data in the 0th index of the array, and *(arr + 1) is the data in the 1st index, and so on.
An expression which looks like A[B] essentially gets translated into something like *(A+B). So, arr[0] = *(arr + 0) and arr[i] = *(arr+i), etc.
And because A+B = B+A, the two are interchangeable. Meaning *(arr+i) = *(i+arr).
And because arr[i] = *(arr+i) and *(arr+i) = *(i+arr), it should make sense that arr[i] = i[arr].

Multidimensional array and addressing

I have an issue with the multidimensional arrays. Maybe the solution is much easier.
int arr[2][2]; //multidimensional array
My simple question is: why the
arr[0][2] and arr[1][0]
or
arr[1][2] and arr[2][0]
are on the same address in my case?
I checked this problem in Linux and Windows environment. And the issue is the same. I have checked tutorials and other sources, but no answer.
The pointer &arr[0][2] is the one-past-the-end pointer of the array arr[0]. This is the same address as that of the first element of the next array, arr[1], which is &arr[1][0], because arrays are laid out contiguously in memory.
arr[2][0] is a bit tricker: arr[2] is not a valid access, but &arr[2] is the one-past-the-end pointer of the array arr. But since that pointer cannot be dereferenced, it doesn't make sense to talk about arr[2][0]. arr doesn't have a third element.
C stores multi-dimensional arrays in what is called row-major order. In that configuration, all the data for a single row is stored in consecutive memory:
arr[2][2] -> r0c0, r0c1, r1c0, r1c2
The alternative would be column-major order, which places the columns consecutively.
Since you have specified the length of the row (number of cols) as 2, it follows that accessing column 2 (the third column) will compute an address that "wraps around" to the next row.
The math looks like:
&(arr[row][col])
= arr # base address
+ row * ncols * sizeof(element)
+ col * sizeof(element)
= arr + sizeof(element) * (row * ncols + col)
In your case, arr[0][2] is arr + (0*2 + 2) * sizeof(int), while arr[1][0] is arr + (1*2 + 0)*sizeof(int).
You can do similar math for the other variations.
Array indexing is identical to pointer arithmetic (actually, the array name first is converted ("decays") to a pointer to the first element before the []-operator is applied):
arr[r][c] <=> *(arr + r * INNER_LENGTH + c)
Your array has two entries per dimension. In C indexes start from 0, so for each dimension valid indexes are 0 and 1 (i.e. total_entries - 1). Which makes three of your expressions suspective in the first place:
arr[0][2] // [outer dimension/index][inner dimension/index]
arr[1][2]
arr[2][0]
We have these cases:
Both indexes are valid: no problem.
Only the address is taken, the element is not accessed and
the outer index is valid and the inner (see below) index equals the length of the inner dimension: comparison and certain address arithmetic is allowed (other constraints apply!).
the outer index equals the length of the outer dimension, and the inner index is 0: The same.
Anything else: the address is invalid and any usage (take address, dereference, etc.) invokes undefined behaviour.
What exactly goes on in memory might become a bit more clear if we use different lengths for the dimensions and have a look how the data is stored:
int arr[3][2];
This is an "array of 3 arrays of 2 int elements". The leftmost dimension is called the "outer", the rightmost the "inner" dimension, because of the memory layout:
arr[0][0] // row 0, column 0
arr[0][1] // row 0, column 1
arr[1][0] // ...
arr[1][1]
arr[2][0]
arr[2][1]
Using the formula above, &arr[0][2] (arr + 0 * 2 + 2) will yield the same as &arr[1][0] (arr + 1 * 2 + 0), etc. Note, however, while the addresses are identical, the first version must not be dereferenced and the compiler may generate incorrect code, etc.
Array indexing in C is similar to adding the value of the index to the address of the first element.
In the multidimensional array that you describe, you have 2 elements on each dimension: 0 and 1. When you introduce a number larger than that, you're referencing an element outside that dimension. Technically, this is an array out of bounds error.
The addresses break down like this:
arr[0][0] - &arr[0] + 0
arr[0][1] - &arr[0] + 1
arr[1][0] - &arr[0] + 2
arr[1][0] - &arr[0] + 3
When you write arr[0][2], you're referencing address &arr[0] + 2, which is the same as arr[1][0]. It all just pointer math, so you can work it out pretty easily once you know how it works.
You can look in your two dimensional array as a long one dimensional array:
[00][01][10][11]
With the pointers arithmetic, another representation of this long one dimensional array is:
[00][01][02][03]
So looking in cell [10] is exactly the same as looking into a cell [20] in pointer arithmetic point of view.

Resources