How to understand this pointer to a data structure in C? - c

I have this code, I'm trying to figure out what the second line of code does.
static int table [][4]= {{1,2,3,4},{2,3,4,5},{3,4,5,6}};
int valore = *(*(table+2)+1);
printf("%d",valore);
I have a basic knowledge of pointers in C. Can you please explain what the second line of code does?

Your table is simply a 2D array of integers. In C a 2D array is really an "array of arrays". Your table has the dimensions of static int table[3][4]; (3 rows x 4 cols), so it is an array of 3 integer arrays with 4 elements each. Since it is an array, all values will be sequential in memory. You can think of the memory layout as follows.
         +---+---+---+---+
table[0] | 1 | 2 | 3 | 4 |
         +---+---+---+---+
table[1] | 2 | 3 | 4 | 5 |
         +---+---+---+---+
table[2] | 3 | 4 | 5 | 6 |
         +---+---+---+---+
An array is converted to a pointer on access (except in 4 limited circumstances, not relevant here; see C11 Standard - 6.3.2.1 Other Operands - Lvalues, arrays, and function designators (p3) for details).
You are introduced to "pointer notation" in the question. You can access any element of an array using "array indexes" or "pointer notation". In pointer notation *(a + b) is equivalent to a[b] in array index notation. You have:
*(*(table+2)+1)
If you take it piece by piece, *(table + 2) is simply table[2]. Next, *(table[2] + 1) is simply table[2][1]. So with either notation you are accessing the 2nd value in the 3rd row (which is simply 4).
Look things over and let me know if you have further questions.

table is an array of 3 arrays of 4 int.
When an array is used in an expression, it is converted to a pointer to its first element, except when:
It is the operand of sizeof.
It is the operand of unary &.
It is a string literal used to initialize an array.
So, in *(*(table+2)+1), table is converted to a pointer to its first element, producing &table[0]. Then we have:
*(*(&table[0]+2)+1)
Next, we have the addition &table[0] + 2. This uses pointer arithmetic. Adding an integer to a pointer (into an array) moves the pointer backward or forward by a number of elements. So &table[0] + 2 produces a pointer to table[2], which is &table[2]. Then we have:
*(*(&table[2])+1)
The inner parentheses are no longer needed, so we have:
*(*&table[2]+1)
Then * &table[2] is the thing that &table[2] points to, which means it is table[2]:
*(table[2] + 1)
Since table is an array of 3 arrays of 4 int, table[2] is an array of 4 int. Since it is an array, it is converted to a pointer to its first element, producing &table[2][0]:
*(&table[2][0] + 1)
Now we have pointer arithmetic again. &table[2][0] is a pointer to element 0 of the array table[2], so adding 1 produces a pointer to element 1, &table[2][1]:
*(&table[2][1])
Again we have parentheses that are no longer needed:
*&table[2][1]
And, finally, * &table[2][1] is the thing that &table[2][1] points to, so it is just:
table[2][1]

Related

Storage order for multidimensional arrays in C

With a C compiler, are array elements stored in column-major order or row-major order, or is it compiler dependent?
int arr[2][3]={1,2,3,4,5,6};
int array[3][2]={1,2,3,4,5,6};
on printing arr and array output:
arr:
1 2 3
4 5 6
array:
1 2
3 4
5 6
It seems it always prefers row-major order?
Row major order is mandated by the standard.
6.5.2.1p3:
Successive subscript operators designate an element of a
multidimensional array object. If E is an n-dimensional array (n >= 2)
with dimensions i x j x . . . x k, then E (used as other than an
lvalue) is converted to a pointer to an (n - 1)-dimensional array with
dimensions j x . . . x k. If the unary * operator is applied to this
pointer explicitly, or implicitly as a result of subscripting, the
result is the referenced (n - 1)-dimensional array, which itself is
converted into a pointer if used as other than an lvalue. It follows
from this that arrays are stored in row-major order (last subscript
varies fastest).
(Emphasis mine)
You printed the array. The output is in whatever order that you printed the array elements. So what you see has nothing to do with the order in which array elements are stored in memory.
int arr[2][3] means that you have an array of two elements, and each element is an int[3]. Objects are always stored consecutively, so the first int[3] is stored in consecutive memory, followed by the second int[3]. And that is the same for any C implementation.

Can't understand why these four notations are same? [duplicate]

This question already has answers here:
With arrays, why is it the case that a[5] == 5[a]?
Array declaration:
int arr [ ]={34, 65, 23, 75, 76, 33};
Four notations: (consider i=0)
arr[i]
and
*(arr+i)
and
*(i+arr)
and
i[arr]
Let's take a look at how your array is laid out in memory:
low address                high address
|                             |
v                             v
+----+----+----+----+----+----+
| 34 | 65 | 23 | 75 | 76 | 33 |
+----+----+----+----+----+----+
^    ^    ^    ^
|    |    |    ...etc
|    |    |
|    |    arr[2]
|    |
|    arr[1]
|
arr[0]
That the first element is arr[0] and the second is arr[1] is pretty clear; that's what everybody learns. What is less clear is that the compiler actually translates an expression such as arr[i] to *(arr + i).
What *(arr + i) does is first get a pointer to the first element, then do pointer arithmetic to get a pointer to the wanted element at index i, and then dereference the pointer to get its value.
Due to the commutative property of addition, the expression *(arr + i) is equal to *(i + arr) which due to the above mentioned translation is equal to i[arr].
The equivalence of arr[i] and *(arr + i) is also what's behind the decay of an array to a pointer to its first element.
The pointer to the array's first element would be &arr[0]. Now we know that arr[0] should be equal to *(arr + 0), which means &arr[0] has to be equal to &*(arr + 0). Adding zero to anything is a no-op, which leads to the expression &*(arr). Parentheses with only one term and no operator can also be removed, leaving &*arr. And lastly, the address-of and dereference operators are each other's opposites and cancel each other out, leaving us with simply arr. So &arr[0] is equal to arr.
Each element in the array has a position in memory, and the positions are sequential. An array in C is not itself a pointer, but in most expressions it is converted to a pointer to the first element of the array.
arr[i] => Gets the value at position i in the array. It is the same as *(arr + i).
*(arr+i) => Gets the value stored at the address obtained by advancing the pointer arr by i elements.
*(i+arr) => The same as *(arr+i), because addition is commutative.
i[arr] => The same as *(i+arr). It is just another way of writing it.
They are the same because the C language specification says so. Read n1570
The notation a[i] is syntactic sugar for *(a+i).
The first is closer to mathematical notation (the symbols people are taught to read), while the second is closer to how the address is actually computed at the machine level.
On the other hand, *(a+i) = *(i+a) = i[a] because adding a pointer and an integer is commutative.
These are the same because of how the array subscript operator [] is defined.
From section 6.5.2.1 of the C standard:
2 A postfix expression followed by an expression in square brackets []
is a subscripted designation of an element of an array object. The
definition of the subscript operator [] is that E1[E2] is
identical to (*((E1)+(E2))). Because of the conversion rules that
apply to the binary + operator, if E1 is an array object
(equivalently, a pointer to the initial element of an array object)
and E2 is an integer, E1[E2] designates the E2-th element of
E1 (counting from zero).
The expression arr[i] in your example is of the form E1[E2]. Because the standard states that this is the same as *(E1+E2) that means that arr[i] is the same as *(arr + i).
Because of the commutative property of addition, *(arr + i) is the same as *(i + arr). Applying the equivalence rule above to this expression gives i[arr].
So in short, those 4 expressions are equivalent because of how the standard defines array subscripting and because of the commutative property of addition.
It works because, in an expression, an array variable in C (i.e. arr in your example) is converted to a pointer to the beginning of an array of memory locations. A pointer is a number which represents the address of a specific memory location. When you put a '*' in front of a pointer, it means "give me the data in that memory location".
So, if arr is a pointer to the beginning of the array, *(arr) or *(arr + 0) is the data in the 0th index of the array, and *(arr + 1) is the data in the 1st index, and so on.
An expression which looks like A[B] essentially gets translated into something like *(A+B). So, arr[0] = *(arr + 0) and arr[i] = *(arr+i), etc.
And because A+B = B+A, the two are interchangeable. Meaning *(arr+i) = *(i+arr).
And because arr[i] = *(arr+i) and *(arr+i) = *(i+arr), it should make sense that arr[i] = i[arr].

Why does m[1] - m[0] return 3 where m is a 3x3 matrix?

This is my code:
int m[][3] = {
{ 0 , 1 , 2 },
{ 10, 11, 12 },
{ 20, 21, 22 }
};
printf("%d %d\n", m[1] - m[0], m[1][0] - m[0][0]);
And why does
m[1] - m[0]
return 3? I know why the second expression would return 10 but the 1st one doesn't seem logical to me.
In your code:
m[1] - m[0]
denotes a pointer subtraction, which gives you the difference between the two pointers in units of the pointed-to type. In this case the two pointers are 3 elements apart, so the result is 3.
To quote C11 standard, chapter §6.5.6
When two pointers are subtracted, both shall point to elements of the same array object,
or one past the last element of the array object; the result is the difference of the
subscripts of the two array elements. [...]
and
[...] In other words, if the expressions P and Q point to, respectively, the i-th and j-th elements of
an array object, the expression (P)-(Q) has the value i−j provided the value fits in an object of type ptrdiff_t. [....]
To help visualize this, consider the following example.
Here, s is a two-dimensional array, defined as s[4][2]. Assuming each element of the array occupies 2 bytes, follow the elements (indexes) and the corresponding (arbitrary) memory locations. This gives a better understanding of how the array elements are actually contiguous in memory.
So, as per this representation, s[0] and s[1] are separated by two elements, s[0][0] and s[0][1]. Hence, s[1] - s[0] will produce a result of 2.
Because the "difference" between m[1] and m[0] is three elements.
It might be easier to understand if you look at it like this
m[0]                          m[1]                          m[2]
|                             |                             |
v                             v                             v
+---------+---------+---------+---------+---------+---------+---------+---------+---------+
| m[0][0] | m[0][1] | m[0][2] | m[1][0] | m[1][1] | m[1][2] | m[2][0] | m[2][1] | m[2][2] |
+---------+---------+---------+---------+---------+---------+---------+---------+---------+
The difference between m[1] and m[0] is the three elements m[0][0], m[0][1] and m[0][2].

Does nth index of n sized C array contain size of it?

I've written a C program for showing the values of an array using pointer. Here's the code :
#include <stdio.h>
int main()
{
    int a[] = {1, 1, 1, 1, 1};
    int *ptr = a;
    for (int i = 0 ; i < 5; i++)
        printf("%d ", *ptr++);
    printf("%d", *ptr);
}
As you can see, after the loop terminates, the pointer holds the address of a location just outside the array. Since that location is not initialized, the last output should be a garbage value. But every time it shows 5, which is the size of the array. So I thought the memory address right after the array contains the size of the array. But this does not happen with an array of double.
Output for int array : 1 1 1 1 1 5
Output for double array : 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 0.000000
Will anyone explain the output?
This is just whatever value happens to be in memory at that address. You should never access memory that does not belong to an object you declared or allocated (in this case you are accessing a 6th element, but you have declared just 5). This may lead to a segmentation fault in some cases.
C does not store any array metadata (including array length) anywhere, either at the beginning or the end of the array. In the case of the integer array, the most likely explanation for the output is that the memory used by the variable i immediately follows the last element of the array, like so:
   +---+
a: | 1 | a[0]
   +---+
   | 1 | a[1]
   +---+
   | 1 | a[2]
   +---+
   | 1 | a[3]
   +---+
   | 1 | a[4]
   +---+
i: | 5 | a[5]
   +---+
However, you cannot rely on this behavior being consistent, as you saw with changing the array type to double.
Attempting to read the value contained in the element one past the end of the array results in undefined behavior. Chapter and (truncated) verse:
6.5.6 Additive operators
...
8 When an expression that has integer type is added to or subtracted from a pointer, the
result has the type of the pointer operand...If the result points one past the last element of the array object, it
shall not be used as the operand of a unary * operator that is evaluated.
For giggles, I compiled your code on my system at work, and I get the following output:
1 1 1 1 1 0
This really is just an artifact of how the compiler lays objects out in memory for this particular program.
What you do invokes Undefined Behavior.
It's simply a coincidence, and it is probably just the value of i; print the address of i and check. But be careful, it will not always be that way. Just declare a new variable in the program and it might change.
In the case of double it doesn't work because the address after the array no longer matches the address of i. That is what I mean when I say be careful.

Why is a two-dimensional array not equal to a one-dimensional array?

int a[6] = {1,2,3,4,5,6};
memory layout for a
value  1    2    3    4    5    6
addr   2002 2006 2010 2014 2018 2022
int b[2][3] = {1,2,3,4,5,6};
memory layout for b
value  1    2    3    4    5    6
addr   2002 2006 2010 2014 2018 2022
Both a and b have the same layout.
Why is the address of a[1] 2006 while the address of b[1] is 2014? Arrays are stored contiguously, so why are they different? What do the brackets [][] in an array declaration mean, given that memory consists of addresses, not rows and columns?
The answer lies in the type of what results. Almost whenever an array is mentioned in C, it decays into a pointer to its first element. Now for your two cases:
The type of a is int [6], which decays into a pointer to int (int *) before the pointer arithmetic implied by a[1] is done. The expression a[1] is precisely equivalent to *(a + 1). This pointer addition advances the pointer by one int, because that is what the pointer points at.
The type of b is int [2][3], which decays into a pointer to an array, int (*)[3]. The size of the array that the pointer points at is three integers. As such, *(b + 1) advances the pointer by three integers.
For the second array, the memory layout is actually
+---------+---------+---------+---------+---------+---------+
| b[0][0] | b[0][1] | b[0][2] | b[1][0] | b[1][1] | b[1][2] |
+---------+---------+---------+---------+---------+---------+
Let's make this clear once and for all:
First, let's imagine that both a and b are mapped into the exact same memory region. That is, both a[0] and b[0][0] are stored in contiguous memory positions starting at the same address.
With this in mind, note that a[1] and b[1] are not the same memory location. Why? Because a is an array of integers, and b is an array of arrays of integers. Each position in b is an array of 3 integers; each position in a is an integer.
Thus, a[1] is not at the same memory address as b[1], because b[1] is 3*sizeof(b[0][0]) bytes away from b[0][0], and a[1] is sizeof(a[0]) bytes away from a[0]. Thus, the offsets are different, even though the arrays' layout is the same in memory. Your confusion around a[1] and b[1] comes from the fact that the index is the same but means different things.
a[1] is equivalent to *(a+1), and b[1] is equivalent to *(b+1). The thing is, a+1 and b+1 scale 1 by different amounts (again, by sizeof(a[0]) and 3*sizeof(b[0][0]), respectively). That's why the addresses are different.
In particular, &b[1] is the same address as &a[3].

Resources