double pointer output explanation - c

Could you explain how the output is -4? I think ++pp; is UB, but I'm not sure. An explanation would really help my understanding. Could the output differ between big-endian and little-endian machines?
#include <stdio.h>
int a[] = { -1, -2, -3, -4 };
int b[] = { 0, 1, 2, 3 };
int main(void)
{
int *p[] = { a, b };
int **pp = p;
printf("a=%p, b=%p, p=%p, pp=%p\n", (void*)a, (void*)b, (void*)p, (void*)pp);
++pp;
printf("p=%p, pp=%p *pp=%p\n", (void*)p, (void*)pp, (void*)*pp);
++*pp;
printf("p=%p, pp=%p *pp=%p\n", (void*)p, (void*)pp, (void*)*pp);
++**pp;
printf("%d\n", (++**pp)[a]);
}
My output:
a=0x107121040, b=0x107121050, p=0x7ffee8adfad0, pp=0x7ffee8adfad0
p=0x7ffee8adfad0, pp=0x7ffee8adfad8 *pp=0x107121050
p=0x7ffee8adfad0, pp=0x7ffee8adfad8 *pp=0x107121054
-4

When you use the name of an array (in most contexts), it decays to a pointer to its first element. That means that int* p = a; and int* p = &a[0]; are exactly the same.
So to understand what happens in this case, just walk through step by step. At the point of your first printf call, things look like this:
pp p a
+-------+ +------+ +----+----+----+----+
| +---------> +--------> -1 | -2 | -3 | -4 |
+-------+ | | +----+----+----+----+
| |
+------+ b
| | +----+----+----+----+
| +---------> 0 | 1 | 2 | 3 |
| | +----+----+----+----+
+------+
pp points to the first element of p, which is a pointer to the first element of a.
Now, when you increment pp, it changes to point to the second element of p, which is a pointer to the first element of b:
pp p a
+-------+ +------+ +----+----+----+----+
| + | | +--------> -1 | -2 | -3 | -4 |
+---|---+ | | +----+----+----+----+
| | |
| +------+ b
| | | +----+----+----+----+
+---------> +---------> 0 | 1 | 2 | 3 |
| | +----+----+----+----+
+------+
You then increment *pp. Since *pp is a pointer to the first element of b, that pointer is incremented to point to the second element of b:
pp p a
+-------+ +------+ +----+----+----+----+
| + | | +--------> -1 | -2 | -3 | -4 |
+---|---+ | | +----+----+----+----+
| | |
| +------+ b
| | | +----+----+----+----+
+---------> | | 0 | 1 | 2 | 3 |
| + | +----+-^--+----+----+
+---|--+ |
+---------------+
Then you increment **pp. At this point pp is a pointer to the second element of p, so *pp is a pointer to the second element of b. That means **pp names the second element of b. You increment that from 1 to 2:
pp p a
+-------+ +------+ +----+----+----+----+
| + | | +--------> -1 | -2 | -3 | -4 |
+---|---+ | | +----+----+----+----+
| | |
| +------+ b
| | | +----+----+----+----+
+---------> | | 0 | 2 | 2 | 3 |
| + | +----+-^--+----+----+
+---|--+ |
+---------------+
Now, let's dissect (++**pp)[a]. ++**pp is the same as before, so the second element of b gets incremented to 3.
Now, for any pointer ptr and integer n, ptr[n] is the same as *(ptr + n). Since addition is commutative, ptr + n is the same as n + ptr. That means ptr[n] is the same as n[ptr].
Putting these together, that means that (++**pp)[a] is the same as 3[a], which is the same as a[3]. a[3] is -4, hence your result.

Remember the definition of the subscript operator [], e.g. as defined in this online C standard draft:
6.5.2.1 Array subscripting
2) ... The definition of the subscript operator [] is that E1[E2] is
identical to (*((E1)+(E2))). ...
It says that E1[E2] is identical to (*((E1)+(E2))). It then becomes clear that (++**pp)[a] is the same as *((++**pp)+(a)), which in turn is the same as *((a)+(++**pp)), which reads as a[(++**pp)]. At that point the value of ++**pp is 3, and a[3] is -4.

It's easiest to understand this if you express all the array names in expressions as their decayed values. arrayName as a pointer becomes &arrayName[0]. So after all the initializations, you have:
a[0] = -1, a[1] = -2, a[2] = -3, a[3] = -4
b[0] = 0, b[1] = 1, b[2] = 2, b[3] = 3
p[0] = &a[0], p[1] = &b[0]
pp = &p[0]
Incrementing a pointer makes it point to the next array element, so after ++pp we now have
pp = &p[1]
++*pp dereferences pp, so it's equivalent to ++p[1], so now we have
p[1] = &b[1]
++**pp dereferences this twice, so it's equivalent to ++b[1], so now we have
b[1] = 2
Finally, we have the really confusing expression (++**pp)[a]. ++**pp again increments b[1], so its value is now 3, and that value replaces that expression, so it's equivalent to 3[a]. This might look like nonsense (3 isn't an array, how can you index it?), but it turns out that in C, x[y] == y[x] because of the way indexing is defined in terms of pointer arithmetic. So 3[a] is the same as a[3], and the last line prints -4.

Related

Beginner C arrays and incrementation

Excuse the amateurism but I'm really struggling to understand the basic incrementing mechanisms. Are the comments correct?
#include <stdio.h>
int main(void)
{
int a[5]={1,2,3,4,5};
int i,j,m;
i = ++a[1]; // the value of a[1] is 3. i=3
j = ++a[1]; /* because of the previous line a[1]=3
and now a[1]=4? but not in the line defining i? */
m = a[i++]; /* i retained the value of 3 even though the value of a[1] has changed
so finally i++ which is incremented in printf()? */
printf("%d, %d, %d", i,j,m);
}
I could be answering my own question but I have fooled myself quite a few times learning C so far.
i = ++a[1] will increment the value of a[1] to 3 and the result of ++a[1] which is 3 will be assigned to i.
j = ++a[1]; will increment the value of a[1] to 4 and the result of ++a[1] which is 4 will be assigned to j.
m = a[i++]; will assign the value of a[3] (as i is 3 by now), which is 4, to m, and then i will be incremented by 1. Now i becomes 4.
The thing to remember with the ++ and -- operators is that the expression has a result and a side effect. The result of ++i is the original value of i plus 1. The side effect of ++i is to add 1 to the value stored in i.
So, if i is originally 0, then in the expression
j = ++i
j gets the result of 0 + 1 (the original value of i plus 1). As a side effect, 1 is added to the value currently stored in i. So after this expression is evaluated, both i and j contain 1.
The postfix version of ++ is slightly different; the result of i++ is the original value of i, but the side effect is the same - 1 is added to the value stored in i. So, if i is originally 0, then
j = i++;
j gets the original value of i (0), and 1 is added to the value stored in i. After this expression, j is 0 and i is 1.
Important - the exact order in which the assignment to j and the side effect to i are executed is not specified. i does not have to be updated before j is assigned, and vice versa. Because of this, certain combinations of ++ and -- (including but not limited to i = i++, i++ * i++, a[i++] = i, and a[i] = i++) will result in undefined behavior; the result will vary, unpredictably, depending on platform, optimization, and surrounding code.
So, let's imagine your objects are laid out in memory like so:
+---+
a: | 1 | a[0]
+---+
| 2 | a[1]
+---+
| 3 | a[2]
+---+
| 4 | a[3]
+---+
| 5 | a[4]
+---+
i: | ? |
+---+
j: | ? |
+---+
m: | ? |
+---+
First we evaluate
i = ++a[1];
The result of ++a[1] is the original value of a[1] plus 1 - in this case, 3. The side effect is to update the value in a[1]. After this statement, your objects now look like this:
+---+
a: | 1 | a[0]
+---+
| 3 | a[1]
+---+
| 3 | a[2]
+---+
| 4 | a[3]
+---+
| 5 | a[4]
+---+
i: | 3 |
+---+
j: | ? |
+---+
m: | ? |
+---+
Now we execute
j = ++a[1];
Same deal - j gets the value of a[1] plus 1, and the side effect is to update a[1]. After evaluation, we have
+---+
a: | 1 | a[0]
+---+
| 4 | a[1]
+---+
| 3 | a[2]
+---+
| 4 | a[3]
+---+
| 5 | a[4]
+---+
i: | 3 |
+---+
j: | 4 |
+---+
m: | ? |
+---+
Finally, we have
m = a[i++];
The result of i++ is 3, so m gets the value stored in a[3]. The side effect is to add 1 to the value stored in i. Now, our objects look like
+---+
a: | 1 | a[0]
+---+
| 4 | a[1]
+---+
| 3 | a[2]
+---+
| 4 | a[3]
+---+
| 5 | a[4]
+---+
i: | 4 |
+---+
j: | 4 |
+---+
m: | 4 |
+---+

About using pointers for 2D array definition in C

When we're defining a 2D array as in:
int *a[5];
Which dimension does the "5" define? The first or the second?
It's not a "2D" array. It's a 1-dimensional array of pointers to int. As such the array size designates that it has space for 5 pointers. Each individual pointer can point to the first element of a buffer with different size.
A "true 2D array" is the colloquial "array of arrays" int a[M][N]. Here the expression a[i] evaluates to the array of N integers, at position i.
Each a[i] points to a single int, which may be the first element in a sequence of int objects, like so:
a[0] a[1] a[2] a[3] a[4]
+----+----+----+----+----+
| | | | | |
+----+----+----+----+----+
| | | | |
| | | ... ...
| | +-------------------------------+
| +-------------------+ |
+-------+ | |
| | |
v v v
+---+ +---+ +---+
a[0][0] | | a[1][0] | | a[2][0] | |
+---+ +---+ +---+
a[0][1] | | a[1][1] | | a[2][1] | |
+---+ +---+ +---+
... ... ...
Thus, each a[i] can represent a "row" in your structure. You can dynamically allocate each "row" as
a[i] = malloc( sizeof *a[i] * row_length_for_i );
or you can set it to point to an existing array:
int foo[] = { 1, 2, 3 };
int bar[] = { 5, 6, 7, 8, 9 };
...
a[0] = foo;
a[1] = bar;
As shown in the example above, each "row" may have a different length.
I keep putting scare quotes around "row" because what you have is not a true 2D array - it's not a contiguous sequence of elements. The object immediately following a[0][N-1] will most likely not be a[1][0]. What you have is a sequence of pointers, each of which may point to the first element of a sequence of int, or to a single int, or to nothing at all.

How memory is allocated when we use malloc to create 2-dimensional array?

I want to create an integer array[5][10] using malloc(). The difference between memory address of array[0] and array[1] is showing 8. Why?
#include <stdio.h>
#include <stdlib.h>
int main() {
int *b[5];
for (int loop = 0; loop < 5; loop++)
b[loop] = (int*)malloc(10 * sizeof(int));
printf("b=%u \n", b);
printf("(b+1)=%u \n", (b + 1));
printf("(b+2)=%u \n", (b + 2));
}
The output is:
b=2151122304
(b+1)=2151122312
(b+2)=2151122320
The difference between memory address of array[0] and array[1] is showing 8. Why?
That's because sizeof of a pointer on your platform is 8.
BTW, use of %u to print a pointer leads to undefined behavior. Use %p instead, with the argument cast to void *.
printf("(b+1)=%p \n", (void *)(b + 1));
printf("(b+2)=%p \n", (void *)(b + 2));
Difference between array of pointers and a 2D array
When you use:
int *b[5];
The memory used for b is:
&b[0] &b[1] &b[2]
| | |
v v v
+--------+--------+--------+
| b[0] | b[1] | b[2] |
+--------+--------+--------+
(b+1) is the same as &b[1]
(b+2) is the same as &b[2]
Hence, the difference between (b+2) and (b+1) is the size of a pointer.
When you use:
int b[5][10];
The memory used for b is:
&b[0][0] &b[1][0] &b[2][0]
| | |
v v v
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+ ...
| | | | | | | | | | | | | | | | | | | | | ...
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+ ...
(b+1) is the same as &b[1]. The value of that pointer is the same as the value of &b[1][0], even though they are pointers to different types.
(b+2) is the same as &b[2]. The value of that pointer is the same as the value of &b[2][0].
Hence, the difference between (b+2) and (b+1) is the size of 10 ints.
First, with int *b[5] you are not creating a two dimensional array, but an array of pointers.
The elements of the array b are pointers. Each occupies the size of a pointer, which depends on your architecture. On a 64-bit architecture it will probably occupy 64 bits (8 bytes). You can check that by printing sizeof(int*) or sizeof(b[0]).
Memory allocation will look like
b
+-----+
| | +------+------+-----------+-----+-----+-----+-----+
| b[0]+--------------> | | | | | | | |
| | +------+------+-----------+-----+-----+-----+-----+
+-----+
| | +------+------+-----------+-----+-----+-----+-----+
| b[1]+--------------> | | |....... | | | | |
| | +------+------+-----------+-----+-----+-----+-----+
+-----+
| | +------+------+-----------+-----+-----+-----+-----+
| b[2]+--------------> | | | ...... | | | | |
| | +------+------+-----------+-----+-----+-----+-----+
+-----+
| | +------+------+-----------+-----+-----+-----+-----+
| b[3]+--------------> | | | ...... | | | | |
| | +------+------+-----------+-----+-----+-----+-----+
+-----+
| | +------+------+-----------+-----+-----+-----+-----+
| b[4]+--------------> | | | ...... | | | | |
| | +------+------+-----------+-----+-----+-----+-----+
+-----+
After decay, b points to b[0], and b + 1 gives the address of b[1]. A pointer on your machine is 8 bytes, which is why the addresses differ by 8.
Besides this, do not cast the return value of malloc:
b[loop]=malloc(10*sizeof(int));
and use %p, with a cast to void *, to print pointers:
printf("b=%p \n",(void *)b);
printf("(b+1)=%p \n",(void *)(b+1));
printf("(b+2)=%p \n",(void *)(b+2));
What you've declared is not technically a two dimensional array but an array of pointers to int, each of which points to an array of int. The reason array[0] and array[1] are 8 bytes apart is because you have an array of pointers, and pointers on your system are 8 bytes.
When you allocate each individual 1 dimensional array, they don't necessarily exist next to each other in memory. If on the other hand you declared int b[5][10], you would have 10 * 5 = 50 contiguous integers arranged in 5 rows of 10.

3d arrays in c-programming

#include <stdio.h>
int main()
{
int a [2][3][2]={{{1,2},{3,4},{5,6}},{{5,8},{9,10},{11,12}}};
printf("%d\n%d\n%d\n",a[1]-a[0],a[1][0]-a[0][0],a[1][0][0]-a[0][0][0]);
return 0;
}
The output is 3 6 4. Can anyone explain to me the reason for this? How come a[1]-a[0]=3 and a[1][0]-a[0][0]=6 and how a[] and a[][] interprets in a 3-dimensional array?
It might help if you understand how an array like yours is laid out in memory:
+------------+ Low address +---------+ Low address +------+
| a[0][0][0] | | a[0][0] | | a[0] |
| a[0][0][1] | | | | |
| a[0][1][0] | | a[0][1] | | |
| a[0][1][1] | | | | |
| a[0][2][0] | | a[0][2] | | |
| a[0][2][1] | | | | |
| a[1][0][0] | | a[1][0] | | a[1] |
| a[1][0][1] | | | | |
| a[1][1][0] | | a[1][1] | | |
| a[1][1][1] | | | | |
| a[1][2][0] | | a[1][2] | | |
| a[1][2][1] | | | | |
+------------+ High address +---------+ High address +------+
Then it helps to know that a pointer difference is measured in units of the pointed-to type. In a[1] - a[0], both operands have the array type int[3][2] and decay to int (*)[2], so the result counts int[2] objects: there are three of them between a[0] and a[1].
Likewise, a[0][0] and a[1][0] have type int[2] and decay to int *, so the difference counts int objects: there are six of them between a[0][0] and a[1][0].
To elaborate a little: Between a[0] and a[1] you have a[0][0], a[0][1] and a[0][2]. Three entries.
Between a[0][0] and a[1][0] you have a[0][0][0], a[0][0][1], a[0][1][0], a[0][1][1], a[0][2][0] and a[0][2][1]. Six entries.
As addresses, a[1] and a[1][0] have the same value, and so do a[0] and a[0][0].
But the types are different.
a[1][0] and a[0][0] decay to int *; from a[0][0] to a[1][0] there are 6 ints.
a[1] and a[0] decay to pointers to int[2]; from a[0] to a[1] there are 3 {x, y} pairs.
a[1][0][0] and a[0][0][0] are int, a[1][0][0]-a[0][0][0] = 5 - 1 = 4.
In C, a multi-dimensional array is conceptually an array whose elements are also arrays. So if you do:
int array[2][3];
Conceptually you end up with:
array[0] => [0, 1, 2]
array[1] => [0, 1, 2]
int array[2][3][2];
...will give you a structure like:
array[0] => [0] => [1, 2]
[1] => [3, 4]
[2] => [5, 6]
array[1] => [0] => [5, 8]
[1] => [9, 10]
[2] => [11, 12]
a[1]-a[0] gives the difference in units of the pointed-to type. a[0] and a[1] decay to pointers to int[2], and there are three such units between them. The second part works similarly:
a[1][0]-a[0][0]=6
because there are 6 int elements between a[0][0] and a[1][0].

Multidimensional arrays allocated through calloc

I have a question about how memory is allocated when I calloc. I had a look at this question, but it doesn't address how memory is allocated in the case of a dynamically allocated two dimensional array.
I was wondering if there was a difference in the memory representation between the following three ways of dynamically allocating a 2D array.
Type 1:
double **array1;
int ii;
array1 = calloc(10, sizeof(double *));
for(ii = 0; ii < 10; ii++) {
array1[ii] = calloc(10, sizeof(double));
}
// Then access array elements like array1[ii][jj]
Type 2:
double **array1;
int ii;
array1 = calloc(10 * 10, sizeof(double *));
// Then access array elements like array1[ii + 10*jj]
Type 3:
double **array1;
int ii;
array1 = malloc(10 * 10 * sizeof(double *));
// Then access array elements like array1[ii + 10*jj]
From what I understand of calloc and malloc, the difference between the last two is that calloc will zero all the elements of the array, whereas malloc will not. But are the first two ways of defining the array equivalent in memory?
Are the first two ways of defining the array equivalent in memory?
Not quite. In the second type the elements are contiguous, because they come from a single allocation, while in the first type this is not guaranteed.
Type 1: in-memory representation will look like this:
+---+---+---+---+---+---+---+---+---+---+
double| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+---+---+---+---+---+---+---+---+---+---+
^
|------------------------------------
. . . . . . . . | // ten rows of doubles
-
+---+---+---+---+---+---+---+---+---+--|+
double| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0||
+---+---+---+---+---+---+---+---+---+--|+
^ . . . -
| ^ ^ ^ . . . . . |
| | | | ^ ^ ^ ^ ^ |
+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+
array1[ii]| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | // each cell points to ten doubles
+---+---+---+---+---+---+---+---+---+---+
^
|
|
+-|-+
array1| | |
+---+
Type 2: in-memory representation will look like this:
+---+---+---+---+---+---+---+---+---+---+ +---+
double| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 |
+---+---+---+---+---+---+---+---+---+---+ +---+
^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^
| | | | | | | | | | |
| | | | | | | | | | |
+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+ +-|-+
array1[ii]| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... |99 | // each cell points to one double
+---+---+---+---+---+---+---+---+---+---+ +---+
^
|
|
+-|-+
array1| | |
+---+
Simple Example
#include<stdio.h>
#include<stdlib.h>
int **d ;
int sum();
//----------------------------------------------
int main(){
d = (int **)calloc(3,sizeof(int*));
printf("\n%d",sum());
}
//-----------------------------------------------
int sum(){
int s = 0;
for(int i = 0; i < 3; i++)
d[i] = (int *) calloc (3,sizeof(int));
for(int i = 0; i < 3; i++){
for(int j = 0; j < 3; j++){
d[i][j] = i+j;
s += d[i][j];
printf("\n array[%d][%d]-> %d",i,j,d[i][j]);
}
}
return s;
}
In the first way, you allocate 10 pointers to double, and 100 doubles. In the second way you allocate 100 pointers to double. The other difference is that in the second way, you allocate one big block of memory, so that all the elements of your array are in the same block. In the first way, each "row" of your array is in a different block than the others.
However, in the second way your array should be a double* instead of a double**, and the element size should be sizeof(double): as written, the block holds pointers to double, not doubles.
In case 1, you create:
array1[0] -> [memory area of 10]
array1[1] -> [memory area of 10] ...
array1[N] -> [memory area of 10] ...
Note: You cannot assume that the memory areas are contiguous; there might be gaps between them.
In case 2, you create:
array1 -> [memory area of 100]
Case 3 is the same as case 2, except that the memory is not initialized. The difference between case 1 and cases 2 and 3 is that in the first case you really have a 2D memory structure. For example, if you want to swap rows 1 and 2, you can just swap the pointers:
help = array1[1];
array1[1] = array1[2];
array1[2] = help;
But if you want to do the same in cases 2 and 3, you need to do a real memcpy. Which to use? It depends on what you are doing.
The first way uses a bit more memory: for a 1000x10 array, the first version uses 1000*8 + 1000*10*8 bytes (on a 64-bit system), while cases 2 and 3 use only 1000*10*8.

Resources