Beginner C arrays and incrementation

Excuse the amateurism but I'm really struggling to understand the basic incrementing mechanisms. Are the comments correct?
#include <stdio.h>
main()
{
int a[5]={1,2,3,4,5};
int i,j,m;
i = ++a[1]; // the value of a[1] is 3. i=3
j = ++a[1]; /* because of the previous line a[1]=3
and now a[1]=4? but not in the line defining i? */
m = a[i++]; /* i retained the value of 3 even though the value of a[1] has changed
so finally i++ which is incremented in printf()? */
printf("%d, %d, %d", i,j,m);
}
I could be answering my own question but I have fooled myself quite a few times learning C so far.

i = ++a[1] will increment the value of a[1] to 3 and the result of ++a[1] which is 3 will be assigned to i.
j = ++a[1]; will increment the value of a[1] to 4 and the result of ++a[1] which is 4 will be assigned to j.
m = a[i++]; will assign the value of a[3] (as i is 3 by now) to m, which is 4, and i will be incremented by 1. Now i becomes 4.

The thing to remember with the ++ and -- operators is that the expression has a result and a side effect. The result of ++i is the original value of i plus 1. The side effect of ++i is to add 1 to the value stored in i.
So, if i is originally 0, then in the expression
j = ++i
j gets the result of 0 + 1 (the original value of i plus 1). As a side effect, 1 is added to the value currently stored in i. So after this expression is evaluated, both i and j contain 1.
The postfix version of ++ is slightly different; the result of i++ is the original value of i, but the side effect is the same - 1 is added to the value stored in i. So, if i is originally 0, then
j = i++;
j gets the original value of i (0), and 1 is added to the value stored in i. After this expression, j is 0 and i is 1.
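As a minimal sketch of the result/side-effect split described above, the two helpers below (prefix_result and postfix_result are hypothetical names, not from the question) return the result of the expression while the caller can inspect the side effect through the pointer:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: result of the prefix form is the new value. */
int prefix_result(int *i)
{
    assert(i != NULL);
    return ++*i;        /* side effect: *i grows by 1; result: value after the increment */
}

/* Hypothetical helper: result of the postfix form is the old value. */
int postfix_result(int *i)
{
    assert(i != NULL);
    return (*i)++;      /* side effect: *i grows by 1; result: value before the increment */
}
```

Starting from i == 0, prefix_result(&i) yields 1 and postfix_result(&i) yields 0; in both cases i ends up as 1.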
Important - the exact order in which the assignment to j and the side effect to i are executed is not specified. i does not have to be updated before j is assigned, and vice versa. Because of this, certain combinations of ++ and -- (including but not limited to i = i++, i++ * i++, a[i++] = i, and a[i] = i++) will result in undefined behavior; the result will vary, unpredictably, depending on platform, optimization, and surrounding code.
So, let's imagine your objects are laid out in memory like so:
+---+
a: | 1 | a[0]
+---+
| 2 | a[1]
+---+
| 3 | a[2]
+---+
| 4 | a[3]
+---+
| 5 | a[4]
+---+
i: | ? |
+---+
j: | ? |
+---+
m: | ? |
+---+
First we evaluate
i = ++a[1];
The result of ++a[1] is the original value of a[1] plus 1 - in this case, 3. The side effect is to update the value in a[1]. After this statement, your objects now look like this:
+---+
a: | 1 | a[0]
+---+
| 3 | a[1]
+---+
| 3 | a[2]
+---+
| 4 | a[3]
+---+
| 5 | a[4]
+---+
i: | 3 |
+---+
j: | ? |
+---+
m: | ? |
+---+
Now we execute
j = ++a[1];
Same deal - j gets the value of a[1] plus 1, and the side effect is to update a[1]. After evaluation, we have
+---+
a: | 1 | a[0]
+---+
| 4 | a[1]
+---+
| 3 | a[2]
+---+
| 4 | a[3]
+---+
| 5 | a[4]
+---+
i: | 3 |
+---+
j: | 4 |
+---+
m: | ? |
+---+
Finally, we have
m = a[i++];
The result of i++ is 3, so m gets the value stored in a[3]. The side effect is to add 1 to the value stored in i. Now, our objects look like
+---+
a: | 1 | a[0]
+---+
| 4 | a[1]
+---+
| 3 | a[2]
+---+
| 4 | a[3]
+---+
| 5 | a[4]
+---+
i: | 4 |
+---+
j: | 4 |
+---+
m: | 4 |
+---+
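The walkthrough above can be replayed as a small sketch. The struct trace and the function run_increments are names made up for this illustration; the asserts simply restate the intermediate states from the diagrams:

```c
#include <assert.h>

/* Hypothetical record of the final state, for illustration only. */
struct trace { int i, j, m, a1; };

struct trace run_increments(void)
{
    int a[5] = {1, 2, 3, 4, 5};
    struct trace t;

    t.i = ++a[1];       /* a[1]: 2 -> 3, result 3 goes to i */
    assert(t.i == 3 && a[1] == 3);

    t.j = ++a[1];       /* a[1]: 3 -> 4, result 4 goes to j; i is untouched */
    assert(t.j == 4 && t.i == 3);

    t.m = a[t.i++];     /* result of i++ is 3, so m = a[3]; then i becomes 4 */
    t.a1 = a[1];
    return t;
}
```

Printing t.i, t.j and t.m after the call should give 4, 4, 4, which matches the last diagram.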

Related

double pointer output explanation

Could you explain how the output is -4? I think ++pp; is UB, but I'm not sure. An explanation would really help my understanding. Could the output differ between big-endian and little-endian machines?
#include <stdio.h>
int a[] = { -1, -2, -3, -4 };
int b[] = { 0, 1, 2, 3 };
int main(void)
{
int *p[] = { a, b };
int **pp = p;
printf("a=%p, b=%p, p=%p, pp=%p\n", (void*)a, (void*)b, (void*)p, (void*)pp);
++pp;
printf("p=%p, pp=%p *pp=%p\n", (void*)p, (void*)pp, (void*)*pp);
++*pp;
printf("p=%p, pp=%p *pp=%p\n", (void*)p, (void*)pp, (void*)*pp);
++**pp;
printf("%d\n", (++**pp)[a]);
}
My output:
a=0x107121040, b=0x107121050, p=0x7ffee8adfad0, pp=0x7ffee8adfad0
p=0x7ffee8adfad0, pp=0x7ffee8adfad8 *pp=0x107121050
p=0x7ffee8adfad0, pp=0x7ffee8adfad8 *pp=0x107121054
-4
When you use the name of an array (in most contexts), it decays to a pointer to its first element. That means that int* p = a; and int* p = &a[0]; are exactly the same.
So to understand what happens in this case, just walk through step by step. At the point of your first printf call, things look like this:
pp p a
+-------+ +------+ +----+----+----+----+
| +---------> +--------> -1 | -2 | -3 | -4 |
+-------+ | | +----+----+----+----+
| |
+------+ b
| | +----+----+----+----+
| +---------> 0 | 1 | 2 | 3 |
| | +----+----+----+----+
+------+
pp points to the first element of p, which is a pointer to the first element of a.
Now, when you increment pp, it changes to point to the second element of p, which is a pointer to the first element of b:
pp p a
+-------+ +------+ +----+----+----+----+
| + | | +--------> -1 | -2 | -3 | -4 |
+---|---+ | | +----+----+----+----+
| | |
| +------+ b
| | | +----+----+----+----+
+---------> +---------> 0 | 1 | 2 | 3 |
| | +----+----+----+----+
+------+
You then increment *pp. Since *pp is a pointer to the first element of b, that pointer is incremented to point to the second element of b:
pp p a
+-------+ +------+ +----+----+----+----+
| + | | +--------> -1 | -2 | -3 | -4 |
+---|---+ | | +----+----+----+----+
| | |
| +------+ b
| | | +----+----+----+----+
+---------> | | 0 | 1 | 2 | 3 |
| + | +----+-^--+----+----+
+---|--+ |
+---------------+
Then you increment **pp. At this point pp is a pointer to the second element of p, so *pp is a pointer to the second element of b. That means **pp names the second element of b. You increment that from 1 to 2:
pp p a
+-------+ +------+ +----+----+----+----+
| + | | +--------> -1 | -2 | -3 | -4 |
+---|---+ | | +----+----+----+----+
| | |
| +------+ b
| | | +----+----+----+----+
+---------> | | 0 | 2 | 2 | 3 |
| + | +----+-^--+----+----+
+---|--+ |
+---------------+
Now, let's dissect (++**pp)[a]. ++**pp is the same as before, so the second element of b gets incremented to 3.
Now, for any pointer ptr and integer n, ptr[n] is the same as *(ptr + n). Since addition is commutative, ptr + n is the same as n + ptr. That means ptr[n] is the same as n[ptr].
Putting these together, that means that (++**pp)[a] is the same as 3[a], which is the same as a[3]. a[3] is -4, hence your result.
Remember the definition of the subscript operator [], e.g. as defined in this online C standard draft:
6.5.2.1 Array subscripting
2) ... The definition of the subscript operator [] is that E1[E2] is
identical to (*((E1)+(E2))). ...
It says that E1[E2] is identical to (*((E1)+(E2))). Then it becomes clear that (++**pp)[a] is the same as *((++**pp)+(a)), which again is the same as *((a)+(++**pp)), which consequently reads as a[(++**pp)]. The value of ++**pp is 3 at that point, and a[3] is -4.
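A small sketch of that equivalence, assuming a helper name (subscript_forms_agree) invented for this example. It compares the three spellings of the same element:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 when arr[n], *(arr + n) and n[arr] agree. */
int subscript_forms_agree(const int *arr, int n)
{
    assert(arr != NULL && n >= 0);
    /* E1[E2] is defined as (*((E1)+(E2))), and + is commutative. */
    return arr[n] == *(arr + n) && arr[n] == n[arr];
}
```

With int a[] = { -1, -2, -3, -4 }, both a[3] and 3[a] evaluate to -4, so the helper returns 1 for n == 3.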
It's easiest to understand this if you express all the array names in expressions as their decayed values. arrayName as a pointer becomes &arrayName[0]. So after all the initializations, you have:
a[0] = -1, a[1] = -2, a[2] = -3, a[3] = -4
b[0] = 0, b[1] = 1, b[2] = 2, b[3] = 3
p[0] = &a[0], p[1] = &b[0]
pp = &p[0]
Incrementing a pointer makes it point to the next array element, so after ++pp we now have
pp = &p[1]
++*pp dereferences pp, so it's equivalent to ++p[1], so now we have
p[1] = &b[1]
++**pp dereferences this twice, so it's equivalent to ++b[1], so now we have
b[1] = 2
Finally, we have the really confusing expression (++**pp)[a]. ++**pp again increments b[1], so its value is now 3, and that value replaces that expression, so it's equivalent to 3[a]. This might look like nonsense (3 isn't an array, how can you index it?), but it turns out that in C, x[y] == y[x] because of the way indexing is defined in terms of pointer arithmetic. So 3[a] is the same as a[3], and the last line prints -4.
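The steps can also be checked mechanically. This is only a sketch: trace_double_pointer is a made-up name, the arrays are made local so the function is self-contained, and each assert restates one line of the explanation above:

```c
#include <assert.h>

/* Hypothetical replay of the question's statements; returns what printf would print. */
int trace_double_pointer(void)
{
    int a[] = { -1, -2, -3, -4 };
    int b[] = { 0, 1, 2, 3 };
    int *p[] = { a, b };
    int **pp = p;

    ++pp;                   /* pp = &p[1]; p has two elements, so this is valid */
    assert(pp == &p[1]);

    ++*pp;                  /* same as ++p[1]: p[1] = &b[1] */
    assert(p[1] == &b[1]);

    ++**pp;                 /* same as ++b[1]: b[1] = 2 */
    assert(b[1] == 2);

    return (++**pp)[a];     /* b[1] = 3, and 3[a] is a[3] */
}
```

The function returns -4, the same value the original program prints.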

How come the following program output is 5, not 4? Could anyone explain?

I came upon a program which outputs 5. I don't know how. Please explain.
#include <stdio.h>
int main(void) {
int t[10] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 }, *p = t;
p += 2;
p += p[-1];
printf("\n%d",*p);
return 0;
}
I expect the output to be 4.
The pointer moves from t[0] to t[2] here (p += 2;). In the next statement, p += p[-1], I believe the pointer first moves to t[1], whose value is 2, and is then increased by 2. So I expected the output to be 4.
But the actual output is 5. Can anyone please explain?
p = t; // p = &t[0]
p += 2; // p = &t[2]
p += p[-1]; // p += 2; // p = &t[4]
At first, the pointer p points to the beginning of the array t. So it should be something like
p--
|
v
------------------------------------------
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
------------------------------------------
Now by
p += 2
p is incremented according to pointer arithmetic, so p is now pointing to 3.
p----------
|
v
------------------------------------------
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
------------------------------------------
p[-1] is the same as *(p-1), i.e. the value at the address p-1. This value is 2.
------ p[-1] or *(p-1)
|
|
------|-----------------------------------
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
------------------------------------------
After adding 2 to the current value of p, p would now be pointing to 5.
p------------------
|
v
------------------------------------------
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
------------------------------------------
So, when you print the value of *p, 5 is output.
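To see where p ends up without drawing the picture, one can compute its index as p - t. A minimal sketch, with final_index being a name invented here:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical replay of the question's pointer arithmetic; returns the index p ends on. */
ptrdiff_t final_index(void)
{
    int t[10] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
    int *p = t;         /* p == &t[0] */

    p += 2;             /* p == &t[2] */
    p += p[-1];         /* p[-1] reads t[1] (value 2) without moving p; then p == &t[4] */

    assert(*p == 5);
    return p - t;       /* distance from the start of t, counted in int elements */
}
```

The key point is that p[-1] only reads the neighbouring element; p itself is changed only by the += operators.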

About using pointers for 2D array definition in C

When we're defining a 2D array as in:
int *a[5];
Which dimension does the "5" define? The first or the second?
It's not a "2D" array. It's a 1-dimensional array of pointers to int. As such the array size designates that it has space for 5 pointers. Each individual pointer can point to the first element of a buffer with different size.
A "true 2D array" is the colloquial "array of arrays" int a[M][N]. Here the expression a[i] evaluates to the array of N integers, at position i.
Each a[i] points to a single int, which may be the first element in a sequence of int objects, like so:
a[0] a[1] a[2] a[3] a[4]
+----+----+----+----+----+
| | | | | |
+----+----+----+----+----+
| | | | |
| | | ... ...
| | +-------------------------------+
| +-------------------+ |
+-------+ | |
| | |
v v v
+---+ +---+ +---+
a[0][0] | | a[1][0] | | a[2][0] | |
+---+ +---+ +---+
a[0][1] | | a[1][1] | | a[2][1] | |
+---+ +---+ +---+
... ... ...
Thus, each a[i] can represent a "row" in your structure. You can dynamically allocate each "row" as
a[i] = malloc( sizeof *a[i] * row_length_for_i );
or you can set it to point to an existing array:
int foo[] = { 1, 2, 3 };
int bar[] = { 5, 6, 7, 8, 9 };
...
a[0] = foo;
a[1] = bar;
As shown in the example above, each "row" may have a different length.
I keep putting scare quotes around "row" because what you have is not a true 2D array - it's not a contiguous sequence of elements. The object immediately following a[0][N-1] will most likely not be a[1][0]. What you have is a sequence of pointers, each of which may point to the first element of a sequence of int, or to a single int, or to nothing at all.
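Because the "rows" can differ in length, code that walks such a structure has to carry the lengths separately. A hedged sketch, where sum_jagged and its lengths parameter are assumptions made for this example:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sums every int reachable through an array of row pointers.
   lengths[r] is the number of ints that rows[r] points to. */
long sum_jagged(int *const rows[], const size_t lengths[], size_t nrows)
{
    long total = 0;

    assert(rows != NULL && lengths != NULL);
    for (size_t r = 0; r < nrows; r++) {
        if (rows[r] == NULL)
            continue;               /* a slot may point to nothing at all */
        for (size_t c = 0; c < lengths[r]; c++)
            total += rows[r][c];    /* rows[r][c] is *(rows[r] + c) */
    }
    return total;
}
```

With a[0] = foo and a[1] = bar from the example above, lengths {3, 5, 0, 0, 0}, and the remaining pointers null, the sum is 41.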

3d arrays in c-programming

#include <stdio.h>
int main()
{
int a [2][3][2]={{{1,2},{3,4},{5,6}},{{5,8},{9,10},{11,12}}};
printf("%d\n%d\n%d\n",a[1]-a[0],a[1][0]-a[0][0],a[1][0][0]-a[0][0][0]);
return 0;
}
The output is 3 6 4. Can anyone explain to me the reason for this? How come a[1]-a[0]=3 and a[1][0]-a[0][0]=6, and how are a[] and a[][] interpreted in a 3-dimensional array?
It might help if you understand how an array like yours is laid out in memory:
+------------+ Low address +---------+ Low address +------+
| a[0][0][0] | | a[0][0] | | a[0] |
| a[0][0][1] | | | | |
| a[0][1][0] | | a[0][1] | | |
| a[0][1][1] | | | | |
| a[0][2][0] | | a[0][2] | | |
| a[0][2][1] | | | | |
| a[1][0][0] | | a[1][0] | | a[1] |
| a[1][0][1] | | | | |
| a[1][1][0] | | a[1][1] | | |
| a[1][1][1] | | | | |
| a[1][2][0] | | a[1][2] | | |
| a[1][2][1] | | | | |
+------------+ High address +---------+ High address +------+
Then it helps to know that the difference between two pointers is counted in units of the pointed-to type. a[0] and a[1] both have type int[3][2], which decays to a pointer to int[2], and there are three int[2] units between a[0] and a[1].
The same goes for a[0][0] and a[1][0]: their type is int[2], which decays to a pointer to int, and there are six int units between a[0][0] and a[1][0].
To elaborate a little: Between a[0] and a[1] you have a[0][0], a[0][1] and a[0][2]. Three entries.
Between a[0][0] and a[1][0] you have a[0][0][0], a[0][0][1], a[0][1][0], a[0][1][1], a[0][2][0] and a[0][2][1]. Six entries.
a[1] and a[1][0] refer to the same address, and so do a[0] and a[0][0].
But the types are different.
a[1][0] and a[0][0] decay to int *, and from a[0][0] to a[1][0] there are 6 int objects.
a[1] and a[0] decay to int (*)[2], and from a[0] to a[1] there are 3 pairs {x, y}.
a[1][0][0] and a[0][0][0] are plain int values, so a[1][0][0]-a[0][0][0] = 5 - 1 = 4.
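Here is a sketch that mirrors the three expressions from the question. The function names are invented for this example. Note that a pointer difference has type ptrdiff_t, so %td is the matching printf conversion rather than %d:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

static int a[2][3][2] = {{{1,2},{3,4},{5,6}},{{5,8},{9,10},{11,12}}};

/* a[i] has type int[3][2] and decays to int (*)[2]: counted in int[2] units. */
ptrdiff_t outer_diff(void) { return a[1] - a[0]; }

/* a[i][j] has type int[2] and decays to int *: counted in int units. */
ptrdiff_t inner_diff(void) { return a[1][0] - a[0][0]; }

/* a[i][j][k] is an int: this is ordinary integer subtraction, 5 - 1. */
int value_diff(void) { return a[1][0][0] - a[0][0][0]; }

void print_diffs(void)
{
    /* Each int[2] unit spans two ints, so the first difference is half the second. */
    assert(outer_diff() * 2 == inner_diff());
    printf("%td\n%td\n%d\n", outer_diff(), inner_diff(), value_diff());
}
```

Calling print_diffs() prints the same three numbers as the question's program: 3, 6 and 4.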
In C, a multi-dimensional array is conceptually an array whose elements are also arrays. So if you do:
int array[2][3]; Conceptually you end up with:
array[0] => [0, 1, 2]
array[1] => [0, 1, 2]
int array[2][3][2]; ...will give you a structure like:
array[0] => [0] => [1, 2]
[1] => [3, 4]
[2] => [5, 6]
array[1] => [0] => [5, 8]
[1] => [9, 10]
[2] => [11, 12]
a[1]-a[0] gives the distance between the two in units of the pointed-to type. a[0] and a[1] decay to pointers to int[2], and there are three such units between them. Similarly for the second part,
a[1][0]-a[0][0]=6
because there are 6 int elements between a[0][0] and a[1][0].

Multidimensional arrays allocated through calloc

I have a question about how memory is allocated when I calloc. I had a look at this question, but it doesn't address how memory is allocated in the case of a dynamically allocated two dimensional array.
I was wondering if there was a difference in the memory representation between the following three ways of dynamically allocating a 2D array.
Type 1:
double **array1;
int ii;
array1 = calloc(10, sizeof(double *));
for(ii = 0; ii < 10; ii++) {
array1[ii] = calloc(10, sizeof(double));
}
// Then access array elements like array1[ii][jj]
Type 2:
double **array1;
int ii;
array1 = calloc(10 * 10, sizeof(double *));
// Then access array elements like array1[ii + 10*jj]
Type 3:
double **array1;
int ii;
array1 = malloc(10 * 10 * sizeof(double *)); // malloc takes a single size argument
// Then access array elements like array1[ii + 10*jj]
From what I understand of calloc and malloc, the difference between the last two is that calloc will zero all the elements of the array, whereas malloc will not. But are the first two ways of defining the array equivalent in memory?
Are the first two ways of defining the array equivalent in memory?
Not quite. In the second type the storage is a single allocation, so it is contiguous, while in the first type each row is allocated separately and the rows are not guaranteed to be adjacent.
Type 1: in-memory representation will look like this:
+---+---+---+---+---+---+---+---+---+---+
double| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+---+---+---+---+---+---+---+---+---+---+
^
|------------------------------------
. . . . . . . . | // ten rows of doubles
-
+---+---+---+---+---+---+---+---+---+--|+
double| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0||
+---+---+---+---+---+---+---+---+---+--|+
^ . . . -
| ^ ^ ^ . . . . . |
| | | | ^ ^ ^ ^ ^ |
+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+
array1[ii]| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | // each cell points to ten doubles
+---+---+---+---+---+---+---+---+---+---+
^
|
|
+-|-+
array1| | |
+---+
Type 2: in-memory representation will look like this:
+---+---+---+---+---+---+---+---+---+---+ +---+
double| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 |
+---+---+---+---+---+---+---+---+---+---+ +---+
^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^
| | | | | | | | | | |
| | | | | | | | | | |
+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+ +-|-+
array1[ii]| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... |99 | // each cell points to one double
+---+---+---+---+---+---+---+---+---+---+ +---+
^
|
|
+-|-+
array1| | |
+---+
Simple Example
#include<stdio.h>
#include<stdlib.h>
int **d ;
int sum();
//----------------------------------------------
int main(){
d = (int **)calloc(3,sizeof(int*));
printf("\n%d",sum());
}
//-----------------------------------------------
int sum(){
int s = 0;
for(int i = 0; i < 3; i++)
d[i] = (int *) calloc (3,sizeof(int));
for(int i = 0; i < 3; i++){
for(int j = 0; j < 3; j++){
d[i][j] = i+j;
s += d[i][j];
printf("\n array[%d][%d]-> %d",i,j,d[i][j]);
}
}
return s;
}
In the first way, you allocate 10 pointers to double and 100 doubles. In the second way you allocate 100 pointers to double. The other difference is that in the second way you allocate one big block of memory, so all the elements of your array are in the same block. In the first way, each "row" of your array is in a different block from the others.
However, in the second way your array should be a double* rather than a double**, allocated with sizeof(double), because as written the block holds pointers to double, not doubles.
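A minimal sketch of the single-block variant with the element type corrected to double. The names alloc_grid and flat_index are hypothetical, and the row-major formula row * cols + col is an assumption of this example (the question indexes as ii + 10*jj, which is the same idea with the roles of the two indices swapped):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: one zeroed block holding rows * cols doubles.
   Returns NULL on failure; the caller releases it with free(). */
double *alloc_grid(size_t rows, size_t cols)
{
    return calloc(rows * cols, sizeof(double));
}

/* Hypothetical helper: maps a (row, col) pair to an offset in the flat block. */
size_t flat_index(size_t row, size_t col, size_t cols)
{
    assert(col < cols);
    return row * cols + col;
}
```

An element is then accessed as grid[flat_index(ii, jj, 10)] instead of array1[ii][jj], and a single free(grid) releases everything.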
In case 1, you create:
array1[0] -> [memory area of 10]
array1[1] -> [memory area of 10] ...
array1[N] -> [memory area of 10] ...
Note: You cannot assume that the memory areas are contiguous; there might be gaps between them.
In case 2, you create:
array1 -> [memory area of 100]
Case 3 is the same as case 2, except that it does not initialize the memory. The difference between case 1 and cases 2 and 3 is that in the first case you really have a 2D memory structure. For example, if you want to swap rows 1 and 2, you can just swap the pointers:
help = array1[1]
array1[1] = array1[2]
array1[2] = help
But if you want to do the same in cases 2 and 3, you need to do a real memcpy. Which one to use? It depends on what you are doing.
The first way uses a bit more memory: for a 1000x10 array, the first version uses 1000*8 + 1000*10*8 bytes (on a 64-bit system), while cases 2 and 3 use only 1000*10*8.
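Both points can be sketched in a few lines. The function names below are invented for this example, and the byte counts ignore any bookkeeping overhead the allocator adds per block:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: swaps two rows of a pointer-per-row array without copying data. */
void swap_rows(double **rows, size_t r1, size_t r2)
{
    assert(rows != NULL);
    double *help = rows[r1];
    rows[r1] = rows[r2];
    rows[r2] = help;
}

/* Payload bytes for case 1: the pointer table plus every row. */
size_t bytes_pointer_rows(size_t rows, size_t cols)
{
    return rows * sizeof(double *) + rows * cols * sizeof(double);
}

/* Payload bytes for a single flat block of doubles. */
size_t bytes_flat(size_t rows, size_t cols)
{
    return rows * cols * sizeof(double);
}
```

For 1000 rows of 10 doubles, the two byte counts differ by exactly 1000 * sizeof(double *), which is the 1000*8 term quoted above on a typical 64-bit system.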

Resources