Accessing 2D array elements by array names and pointers in C - c

I am a little confused by 2D arrays, especially by the formula a[i][j] = *(*(a+i)+j)
Before asking my question, I would like to mention how I think about the symbols '*' and '&'. I think '&' is an operator which takes a "variable" as operand and gives the "address of that variable", and '*' takes the "address of a variable" as operand and gives the "variable" as output, so
1.*(address)---->>(gives variable)
2.&(variable)---->>(gives address)
(Please tell me if this concept is wrong)
Now suppose there is a 2D array 'a' as follows:
a[3][2]={{1,2,3},{4,5,6},{7,8,9},{10,11,12}}
Now I want to access last element of array block i.e a[3][2] by using that formula.
1st Doubt
So by formula:
a[3][2]=*(*(a+3)+2) // 1
I have read that a+3 gives address of first element of 4th row i.e &a[3][0].
But I have seen people saying that writing a is equivalent to &a[0][0]. So substituting in equation (1)
a[3][2]=*(*(&a[0][0]+3)+2)
So adding 3 to &a[0][0] means giving the address of the block a[1][0] (going three blocks forward from a[0][0]). So here our (a+3) has pointed us to &a[1][0] and not to &a[3][0].
2nd Doubt
Suppose now evaluating (a+3) really gives me address of a[3][0] (which is correct). So equation (1) now becomes
a[3][2]=*(*(&a[3][0])+2)
Using my concept
*(address of variable)---->>(gives variable)
So *(&a[3][0]) = a[3][0]. So a[3][0] should be a variable storing value 10. Now we then have a[3][2]=*(10+2)=*(12). But the '*' operator needs an address as input and we are giving it an r-value which is not an address, so this should give an error.
I know there are a lot of mistakes in my concepts, but I am a beginner and just started C as my first topic in the field of programming, so please help me out.

I think '&' is a operator which takes variable as operand and gives address of that variable…
This is not quite correct. We ought to clarify what a “variable” is. What is often called a variable is, in C, an identifier and an object. The identifier is the text string we use as the name. For example, in int xyz;, “xyz” is the identifier. The object is the region of memory used to represent the value. So, in int xyz;, the object is a few bytes (often four) the compiler reserves somewhere in memory.
The & operator gives the address of the object (or function) to which it is applied. Note that it does not need to be applied to a variable, just to any object (or function). So, instead of a named object, it can be applied to some computed thing (as in &a[i+4]) or to a string literal (&"abc") or compound literal (& (int []) {3, 4, 5}).
… and '*' reversibly takes address of variable as an operand and gives variable as output,
The * takes a pointer to an object (or function, not further discussed here) and produces the object (specifically, an lvalue that designates the object). The object does not have to be the object of a named variable; it can be an array element or a dynamically allocated object or something else.
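As a small illustration of the two operators (a sketch only; the function name address_and_deref_demo and the variables in it are made up for this example), & can be applied to a computed element as well as to a named object, and * yields an lvalue that can be assigned through:

```c
#include <assert.h>

/* Hypothetical helper: returns 1 if the writes through the pointers landed. */
int address_and_deref_demo(void)
{
    int xyz = 7;
    int arr[8] = {0};
    int i = 1;

    int *p = &xyz;          /* & applied to a named object */
    int *q = &arr[i + 4];   /* & applied to a computed element, arr[5] */

    *p = 10;                /* *p is an lvalue designating xyz */
    *q = 20;                /* *q is an lvalue designating arr[5] */

    assert(*&xyz == xyz);   /* taking the address and dereferencing cancel out */

    return xyz == 10 && arr[5] == 20;
}
```

Calling address_and_deref_demo() from main should return 1, since both assignments go through the pointers to the original objects.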
Now suppose there is a 2D array 'a' as follows:
a[3][2]={{1,2,3},{4,5,6},{7,8,9},{10,11,12}}
That is not a correct array definition, because it has no element type and because it says the array dimensions are 3 and 2, but the list of initializers shows the dimensions should be 4 and 3. Let’s suppose we correct it to:
int a[4][3] = { {1, 2, 3}, {4, 5, 6}, {7, 8, 9}, {10, 11, 12} };
I have read that a+3 gives address of first element of 4th row i.e &a[3][0].
Not quite. In the expression a+3, a designates an array. That array is automatically converted to a pointer to its first element, so it is equivalent to &a[0]. Note that a[0] is an array of 3 int (it is a subarray of a), so the type of this expression is pointer to an array of 3 int. When we add 3 to this, the compiler counts 3 subarrays, so a+3 points to subarray number 3 (starting the numbering from 0). Thus a+3 is equivalent to &a[3].
&a[3] is the address of subarray number 3. This is not the same as &a[3][0], which is the address of element number 0 of subarray number 3. Although they, in effect, point to the same place in memory, they have different types, and the compiler treats them differently.
But I have seen people saying that writing a is equivalent to &a[0][0].
That is incorrect. a is equivalent to &a[0]—it is a pointer to the first element of a. The first element of a is itself an array; it is a[0], not a[0][0]. Although &a[0] and &a[0][0] may in effect point to the same place in memory, they have different types, and the compiler will treat them differently.
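One way to see the difference in types is to measure how far each pointer moves when 1 is added to it. The following is a sketch under the assumption of the corrected int a[4][3] definition; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes covered by adding 1 to a, which converts to &a[0] (type int (*)[3]). */
size_t row_pointer_step(void)
{
    int a[4][3] = { {1, 2, 3}, {4, 5, 6}, {7, 8, 9}, {10, 11, 12} };
    int (*row)[3] = a;
    return (size_t)((char *)(row + 1) - (char *)row);
}

/* Bytes covered by adding 1 to &a[0][0] (type int *). */
size_t element_pointer_step(void)
{
    int a[4][3] = { {1, 2, 3}, {4, 5, 6}, {7, 8, 9}, {10, 11, 12} };
    int *elem = &a[0][0];
    assert((void *)elem == (void *)a);   /* same location, different types */
    return (size_t)((char *)(elem + 1) - (char *)elem);
}
```

The first function should return 3 * sizeof(int) (one whole subarray) and the second sizeof(int) (one element), even though both pointers start at the same place in memory.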
So substituting in equation (1)
Since a is not equivalent to &a[0][0], the latter cannot be substituted for the former.
Let’s go back to the formula you mentioned:
a[i][j] = *(*(a+i)+j)
This is correct. Recall that a is automatically converted to &a[0]. Then &a[0]+i counts i subarrays, and the result of the addition is equal to &a[i]. Then, in *(a+i), we apply the * operator. This changes &a[i] to *&a[i]. Since &a[i] points to a[i], *&a[i] is a[i].
Now, we have figured out that *(a+i) becomes a[i], and we want to figure out what *(a+i)+j is. In effect, we are asking what a[i]+j is. So we have to figure out what happens to a[i] in this expression.
Recall that a[i] is a subarray of a. So it is itself an array. When used in an expression, an array is automatically converted to the address of its first element (except when used as the operand of sizeof or unary &). So a[i] is converted to &a[i][0]. Then we add j, producing &a[i][0] + j. Since &a[i][0] is a pointer to an int, the compiler counts j int and produces a pointer to a[i][j]. That is, the result of *(a+i)+j is &a[i][j]. Then applying * produces *&a[i][j], which is a[i][j].
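The steps of this derivation can be written out one at a time. This is only a sketch: element_via_pointers is a hypothetical helper, and its parameter type int (*a)[3] is what an array with rows of 3 int converts to when passed to a function.

```c
#include <assert.h>

/* Hypothetical helper: reads a[i][j] using only pointer arithmetic. */
int element_via_pointers(int (*a)[3], int i, int j)
{
    int *row = *(a + i);       /* *(a+i) is a[i], which converts to &a[i][0] */
    int *elem = row + j;       /* &a[i][0] + j is &a[i][j] */
    assert(elem == &a[i][j]);
    return *elem;              /* *&a[i][j] is a[i][j] */
}
```

With int a[4][3] = { {1, 2, 3}, {4, 5, 6}, {7, 8, 9}, {10, 11, 12} };, the call element_via_pointers(a, 3, 2) should give 12, the same as a[3][2].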

Related

C: Ampersand operator applied to an array vs pointer

Let's say we have
int foo[4] = {1, 2, 3, 4};
foo would then be a pointer to the first element of the array.
We can do:
printf("%p", foo); // OUTPUT is some address 0xffffcc00
Now if we do:
printf("%p", &foo); // OUTPUT is the same address 0xffffcc00
Looking online, I see this particular syntax for &<array-name> takes the address of the entire array. This explains the two same values above, because the starting element's address is the same as the whole array's address.
But in general, my understanding is that & should take address of whatever is on its right. (i.e. in this case, it "should" have taken the address of the pointer to the first element of the array.)
So why isn't &foo taking the address of the pointer foo?
If this is an exception to the C language, is this the only exception, or are there other cases like this?
There is a common misconception that pointers and arrays are the same thing. They are not, but they are related.
Your first example works:
printf("%p", foo);
Because in most contexts an array decays to a pointer to the first element of the array.
One of the situations where this is not the case is when an array is the operand of the address-of operator &. This gives the address of the entire array, not just the first element (even though the values are the same).
This is detailed in section 6.3.2.1p3 of the C standard:
Except when it is the operand of the sizeof operator, the
_Alignof operator, or the unary & operator, or is a string
literal used to initialize an array, an expression that has
type "array of type" is converted to an expression with type
"pointer to type" that points to the initial element of the
array object and is not an lvalue. If the array object has
register storage class, the behavior is undefined.
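The sizeof exception in that paragraph can be observed directly. The sketch below uses hypothetical function names: the array keeps its full size while it is still an array, but only a pointer arrives inside a callee.

```c
#include <assert.h>
#include <stddef.h>

/* Inside the callee only a pointer is available, so sizeof measures a pointer. */
size_t size_seen_by_callee(int *p)
{
    return sizeof p;
}

/* Hypothetical helper: sizeof applied where foo is still an array. */
size_t size_of_whole_array(void)
{
    int foo[4] = {1, 2, 3, 4};
    assert((void *)foo == (void *)&foo);   /* same address value */
    assert(sizeof *&foo == sizeof foo);    /* &foo points to the whole array */
    return sizeof foo;                     /* no conversion: 4 * sizeof(int) */
}
```

size_of_whole_array() should return 4 * sizeof(int), while size_seen_by_callee(foo) returns sizeof(int *), whatever that is on the platform.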
Basically, pointers give an alternative, convenient way to access array elements.
In low-level programming, every variable has an address in memory,
and an array name is a variable name just like a scalar variable's.
Unlike a scalar variable, an array variable holds
a sequence of elements allocated at consecutive memory addresses.
So we can reach all the elements starting from the address of the first one.
FYI, the array name, the address of the first element and the address of the array all have the same value, although not the same type:
(foo == &foo[0], and (void *) foo == (void *) &foo)
So when you access elements through foo, it initially points to the first element (&foo[0] + 0), and by adding i to it you can access all the elements one by one.
Hope this answer helps. :)
I prefer the explanation that & simply means "the address-of operator". Arrays are not pointers! But an array can automatically "decay" into a pointer: into &foo[0], in your case. &foo, however, is a pointer to an array of four integers, because of its type. To see this for yourself, compare ptype foo with ptype &foo under GDB.

Printf and Array

I was asked this question as a class exercise:
int A[] = {1,3,5,7,9,0,2,4,6};
printf("%d\n", *(A+A[1]-*A));
I couldn't figure it out on paper, so went ahead to compiling a simple program and tested it and found that printf("%d",*A) always gives me 1 for the output.
But I still do not understand why this is the case, hence it would be great if someone can explain this.
A is treated like a pointer to the first element of array of integers.
A[1] is the value of the second element of that array, which is 3 (indexes are 0-based)
*A is the value to which A points, which is the zeroth element of the array, so 1.
So
A[1] - *A == 3 - 1 == 2
Now we have
*(A + 2)
That's where pointer arithmetic kicks in. Since A is a pointer to integer, A+2 points to the second (0-based) item in that array and *(A+2) gets its value.
So answer is 5.
Also please note for future reference that pointer to an integer and array of integers are somewhat different things in C, but for the purposes of this discussion they are the same thing.
Break it down into its constituent parts:
A by itself is the memory address of the array, which is also equivalent to &A[0], the memory address of the first element of the array.
A[1] is the value stored in the second element of the array, which is 3.
*A dereferences the memory address of the array, which is equivalent to A[0], the value stored in the first element of the array, which is 1.
So, do some substitutions:
*(A+A[1]-*A)
= *(A+(A[1])-(A[0]))
= *(A+3-1)
= *(A+2)
The notation *(Array+index) is the same as the notation Array[index]. Under the hood, they both take the starting address of the array, increment it by the number of bytes of the array element type (in this case, int) multiplied by the index, and then dereference the resulting address. So *(A+2) is the same as A[2], which is 5.
Arrays used in expressions are automatically converted into pointers pointing at the first elements of the arrays except for some exceptions such as operands of sizeof or unary & operators.
E1[E2] is defined to be equivalent to *((E1) + (E2))
The + and - operators applied to pointers move the pointer forward and backward.
In this case, *A is equivalent to *(A + 0), which is equivalent to A[0] and it will give you the first element of the array.
The expression *(A+A[1]-*A) will
Get the pointer to the first element, which points at 1, via A
Move the pointer to A[1] (3) elements ahead via +A[1], so the pointer now points at 7
Move the pointer to *A (1) element before what is pointed via -*A, so the pointer now points at 5
Dereference the pointer via the unary * operator, so the expression is evaluated to 5
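The same steps can be checked in code. This is a minimal sketch; evaluate_exercise is a hypothetical name, and the intermediate variables exist only to make each step visible.

```c
#include <assert.h>

/* Evaluates the exercise expression step by step. */
int evaluate_exercise(void)
{
    int A[] = {1, 3, 5, 7, 9, 0, 2, 4, 6};
    int offset = A[1] - *A;           /* 3 - 1 == 2 */
    int *p = A + offset;              /* points at A[2] */
    assert(p == &A[2]);
    assert(*(A + A[1] - *A) == *p);   /* the original expression */
    return *p;
}
```

The function returns 5, matching the value that printf("%d\n", *(A+A[1]-*A)); prints.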
An array name in C, when used in an expression like this, is converted to a pointer to the initial memory location of the array. So if you dereference the array, you will always get the value at the first position.
If you add 1 to the original array value, as in *(A+1), you will get the second position.
You can get any position from the array using the same method:
*(A) is the first position
*(A+1) is the second position
*(A+2) is the third position
and so on...
If you declare the int array as int* A and allocate the memory and attribute the values, it is usually easier to visualize how this works.

Pointer and array 'a' and '&a' giving same output? [duplicate]

I am having a tough time understanding the type and use of the name of the array in C. It might seem a long post but please bear with me.
I understand that the following statement declares a to be of type int [] i.e array of integers.
int a[30];
While a also points to the first element of the array and things like *(a+2) are valid. Thus, making a look like a pointer to an integer. But actually the types int [] and int* are different; the former is an array type and the latter is a pointer to an integer.
Also a variable of type int [] gets converted into a variable of type int* when passing it to functions; as in C arrays are passed by reference (with the exception of the sizeof operator).
Here comes the point which makes me dangle. Have a look at the following piece of code:
int main()
{
    int (*p)[3];
    int a[3] = { 5, 4, 6 };
    p = &a;
    printf("a:%d\t&a:%d\n",a,&a);
    printf("%d",*(*p + 2));
}
OUTPUT:
a:2686720 &a:2686720
6
So, how does the above code work? I have two questions:
a and &a have the same values. Why?
What exactly does int (*p)[3]; do? It declares a pointer to an array, I know this. But how is a pointer to an array different from the pointer to the first element of the array and name of the array?
Can anyone clarify things up? I am having a hell of a lot of confusions.
I know that I should use %p as a placeholder instead of using %d for printing the value of pointer variables. As using the integer placeholder might print truncated addresses. But I just want to keep things simple.
Other answers already explained the issue. I am trying to explain it with some diagram. Hope this will help.
When you declare an array
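int a[3] = {5, 4, 6};
the memory arrangement looks like this (figure not reproduced): the three int elements 5, 4 and 6 sit next to each other in one block, which for illustration starts at address 0x100.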
Now answering your question:
a and &a have the same values. How?
As you already know, a is of array type, and the array name a becomes a pointer to the first element of array a (after decay), i.e. it points to the address 0x100. Note that 0x100 is also the starting address of the memory block (array a). And you should know that, in general, the address of the first byte is said to be the address of the variable. That is, if a variable is 100 bytes long, then its address is equal to the address of its first byte.
&a is the address of the entire memory block, i.e. it is the address of array a (figure not reproduced).
Now you can understand why a and &a both have same address value although both are of different type.
What exactly does int (*p)[3]; do? It declares a pointer to an array, I know this. But how is a pointer to an array different from the pointer to the first element of the array and the name of the array?
As described above, a pointer to an array points to the whole memory block, while a pointer to an array element points to a single element inside it.
When you assign &a to p, then p points to the entire array having starting address 0x100.
NOTE: Regarding to the line
... as in C arrays are passed by references (with exception of sizeof function).
In C, arguments are passed by value. No pass by reference in C. When an ordinary variable is passed to a function, its value is copied; any changes to corresponding parameter do not affect the variable.
Arrays are also passed by value, but difference is that the array name decays to pointer to first element and this pointer assigned to the parameter (here, pointer value is copied) of the function; the array itself isn't copied.
In contrast to ordinary variable, an array used as an argument is not protected against any change, since no copy is made of the array itself, instead copy of pointer to first element is made.
You should also note that sizeof is not a function and array name does not act as an argument in this case. sizeof is an operator and array name serves as an operand. Same holds true when array name is an operand of the unary & operator.
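A short sketch of what "the pointer is copied, the array is not" means in practice. The names overwrite_first and pass_array_demo are invented for this illustration.

```c
#include <assert.h>

/* Receives a copy of a pointer to the first element, not a copy of the array. */
void overwrite_first(int *p, int value)
{
    p[0] = value;   /* modifies the caller's array */
    p = 0;          /* only changes the local copy of the pointer */
    (void)p;
}

/* Hypothetical driver: returns the first element after the call. */
int pass_array_demo(void)
{
    int a[3] = {5, 4, 6};
    overwrite_first(a, 42);           /* a converts to &a[0] */
    assert(a[1] == 4 && a[2] == 6);   /* other elements are untouched */
    return a[0];
}
```

pass_array_demo() should return 42: the write through the parameter reached the caller's array, while resetting the parameter itself had no effect outside the function.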
a and &a have the same values. How?
They have the same value but different types. An array object has no padding before its first element, so the address of the array and the address of the first element of the array are the same.
That is:
(void *) a == (void *) &a
What exactly does int (*p)[3]; do? It declares a pointer to an array, I know this. But how is a pointer to an array different from the pointer to the first element of the array and the name of the array?
These are two different pointer types. Take for example, pointer arithmetic:
a + 1 /* address of the second element of the array */
&a + 1 /* address one past the last element of the array */
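To make the two results concrete, the following sketch measures both steps in bytes. The helper names are hypothetical; the array is the one from the question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bytes skipped by (a + 1), which has type int *. */
size_t step_of_a(void)
{
    int a[3] = {5, 4, 6};
    return (size_t)((char *)(a + 1) - (char *)a);
}

/* Hypothetical helper: bytes skipped by (&a + 1), which has type int (*)[3]. */
size_t step_of_address_of_a(void)
{
    int a[3] = {5, 4, 6};
    int (*p)[3] = &a;
    assert(*(*p + 2) == 6);   /* same access as in the question */
    return (size_t)((char *)(p + 1) - (char *)p);
}
```

step_of_a() should give sizeof(int) and step_of_address_of_a() should give 3 * sizeof(int), which is the whole array.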
EDIT: due to popular demand I added below some information about conversion of arrays.
With three exceptions, in an expression an object of type array of T is converted to a value of type pointer to T pointing to the first element of the array. The exceptions are if the object is the operand of sizeof or & unary operator or if the object is a string literal initializing an array.
For example this statement:
printf("a:%d\t&a:%d\n", a, &a);
is actually equivalent to:
printf("a:%d\t&a:%d\n", &a[0], &a);
Also please note that the d conversion specifier can only be used to print a signed integer; to print a pointer value you have to use the p specifier (and the argument must be void *). So to do things correctly use:
printf("a:%p\t&a:%p\n", (void *) a, (void *) &a);
or, equivalently:
printf("a:%p\t&a:%p\n", (void *) &a[0], (void *) &a);
a corresponds to a pointer to the 0th element of the array, and &a gives the starting address of the array, which is the same location.
a --> pointer to the starting element of array a[]; its type says nothing about how many elements follow.
&a --> address of the whole array a[]; it has the same value as the first element's location, but its type carries the size of the entire array.
Similarly, the other elements' locations are (a+1), (a+2) and so on up to the end of the array.
Hence, you got such a result.
int (*p)[3] is a pointer to an array. Had it been int *p[3], it would have meant something entirely different: an array of pointers.
A pointer to an array refers to the whole array as a single object.
In this case, that pointer is p.
The pointer to the first element of the array, i.e. a, on the other hand,
only refers to the first element. You have to use pointer arithmetic
to reach the next elements: the second element is at a+1,
the third element at a+2 and so on. Note that you add 1 and not
sizeof(int); the compiler scales the offset by the element size
because it is an integer array.
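The difference between int (*p)[3] and int *p[3] can be sketched as follows. The function name and the variable q are made up for this example.

```c
#include <assert.h>

/* Hypothetical helper contrasting the two declarations. */
int pointer_to_array_vs_array_of_pointers(void)
{
    int a[3] = {5, 4, 6};

    int (*p)[3] = &a;                      /* one pointer to an array of 3 int */
    int *q[3] = { &a[0], &a[1], &a[2] };   /* an array of 3 pointers to int */

    assert(sizeof q == 3 * sizeof(int *)); /* q holds three pointers */
    assert(sizeof *p == 3 * sizeof(int));  /* *p is the whole array a */

    /* Element 2 reached through each declaration. */
    return (*p)[2] == 6 && *q[2] == 6;
}
```

The function should return 1: both declarations can reach the element 6, but p is a single pointer while q is three separate pointers.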
In answer to question 1, this is simply an aspect of the C language as designed, unlike most other modern languages C/C++ allows direct manipulation of addresses in memory and has built in facilities to 'understand' that. There are many articles online that explain this better than I could in this small space. Here is one and I am sure there are many others: http://www.cprogramming.com/tutorial/c/lesson8.html
From C99 Standard n1124 6.3.2.1 p3
Except when it is the operand of the sizeof operator or the unary &
operator, or is a string literal used to initialize an array, an
expression that has type ‘‘array of type’’ is converted to an
expression with type ‘‘pointer to type’’ that points to the initial
element of the array object and is not an lvalue. If the array object
has register storage class, the behavior is undefined.
a and &a have the same value because an array starts at the same address as its first element. In most expressions the name of the array (a in this case) is converted to the address of its first element, while &a is the address of the array itself; the two addresses are equal and differ only in type. The conversion is something the compiler handles for you.

2D Array indexing - undefined behavior?

I've recently got into some pieces of code doing some questionable 2D arrays indexing operations. Considering as an example the following code sample:
int a[5][5];
a[0][20] = 3;
a[-2][15] = 4;
a[5][-3] = 5;
Are the indexing operations above subject to undefined behavior?
It's undefined behavior, and here's why.
Multidimensional array access can be broken down into a series of single-dimensional array accesses. In other words, the expression a[i][j] can be thought of as (a[i])[j]. Quoting C11 §6.5.2.1/2:
The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))).
This means the above is identical to *(*(a + i) + j). Following C11 §6.5.6/8 regarding addition of an integer and pointer (emphasis mine):
If both the pointer
operand and the result point to elements of the same array object, or one past the last
element of the array object, the evaluation shall not produce an overflow; otherwise, the
behavior is undefined.
In other words, if a[i] is not a valid index, the behavior is immediately undefined, even if "intuitively" a[i][j] seems in-bounds.
So, in the first case, a[0] is valid, but the following [20] is not, because the type of a[0] is int[5]. Therefore, index 20 is out of bounds.
In the second case, a[-2] is already out-of-bounds, thus already UB.
In the last case, however, the expression a[5] points to one past the last element of the array, which is valid as per §6.5.6/8:
... if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object ...
However, later in that same paragraph:
If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.
So, while a[5] is a valid pointer, dereferencing it will cause undefined behavior, which is caused by the final [-3] indexing (which is also out of bounds, therefore UB).
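One defensive pattern that follows from this reasoning is to check each index against its own dimension before subscripting. This is only a sketch: checked_store, ROWS and COLS are names invented here, not something from the question's code.

```c
#include <assert.h>
#include <stdbool.h>

#define ROWS 5
#define COLS 5

/* Hypothetical helper: stores value only if both indices are in range.
   Each index is checked against its own dimension, so [0][20] is
   rejected even though 20 < ROWS * COLS. */
bool checked_store(int a[ROWS][COLS], int i, int j, int value)
{
    assert(a != 0);   /* caller must pass a real array */
    if (i < 0 || i >= ROWS || j < 0 || j >= COLS)
        return false;
    a[i][j] = value;
    return true;
}
```

With this helper, all three accesses from the question (indices 0 and 20, -2 and 15, 5 and -3) are refused instead of invoking undefined behavior.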
Array indexing with a negative index is undefined behaviour when the result falls outside the array. Unfortunately, a[-3] is the same as *(a - 3), and most compilers accept it without warning: the C language allows you to add negative integers to pointers, but only while the result still points into the same array (or one past its end). Of course, this is not even checked at runtime.
Also, there are some issues to be aware of when declaring arrays as opposed to pointers. You can leave just the first dimension unspecified, and no more, as in this function parameter:
int a[][3][2]; /* as a function parameter, this is an alias of int (*a)[3][2]; */
(indeed, as a parameter the above is a pointer declaration, not an array; just print sizeof a)
or
int a[4][3][2]; /* array of 24 integers, size is 24*sizeof(int) */
When you do this, the way the offset is evaluated is different for arrays than for pointers, so be careful. In the case of arrays, int a[I][J][K];
&a[i][j][k]
is placed at
(char *)a + i*(sizeof(int)*J*K) + j*(sizeof(int)*K) + k*(sizeof(int))
but when you declare
int ***a;
then a[i][j][k] is the same as:
*(*(*(a+i)+j)+k), meaning you have to read the value of pointer a, add (sizeof(int **))*i to it, dereference the result, then add (sizeof (int *))*j to that value, dereference again, and add (sizeof(int))*k to that value to get the exact address of the data.
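For the true array case, the row-major offset can be checked against what the compiler actually does. This is a sketch with assumed dimensions I = 4, J = 3, K = 2 and hypothetical function names; offsets are measured in bytes from the start of the array.

```c
#include <assert.h>
#include <stddef.h>

enum { I = 4, J = 3, K = 2 };   /* assumed dimensions for illustration */

/* Byte offset of a[i][j][k] from the start of a, measured directly. */
size_t measured_offset(int i, int j, int k)
{
    static int a[I][J][K];
    assert(i >= 0 && i < I && j >= 0 && j < J && k >= 0 && k < K);
    return (size_t)((char *)&a[i][j][k] - (char *)a);
}

/* The same offset computed from the row-major formula. */
size_t formula_offset(int i, int j, int k)
{
    return (size_t)i * (sizeof(int) * J * K)
         + (size_t)j * (sizeof(int) * K)
         + (size_t)k * sizeof(int);
}
```

For any in-range indices the two functions should agree, for example measured_offset(3, 2, 1) == formula_offset(3, 2, 1). No such single formula applies to int ***a, where each level is a separate pointer that has to be loaded from memory.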

why don't i get the value of first element of the array?

In this program all three addresses which I mention refer to the first element of the array but why don't I get the value of the first element of the array when I dereference them?
int main()
{
    int a[5] = {1,2,3,4,5};
    printf("address a = %d\n",a);
    printf("address of a[0] = %d\n",&a[0]);
    printf("address of first element = %d\n",&a);
    printf("value of first element of the array a =%d\n",*(a));
    printf("first element =%d\n",*(&a[0]));
    printf("a[0] = %d\n",*(&a));//this print statement again prints the address of a[0]
    return 0;
}
I get address of the first element of the array a for the first 3 print statements and when I dereference all the 3 I get values only for the fourth and fifth print statements and not for the sixth print statement (which is accompanied with a comment).
Things to remember:
1. Name of the array is the address of its first element
So, as the array name is a, then, printing a would give you the address of a[0] (which is also the address of the array too) i.e. you will get the values of &a[0] (same as a) and &a to be the same
Now, you are aware that a and &a[0] refer to the first element, you can dereference the first element in 3 ways:-
*a
*(&a[0])
a[0] - Note that internally, this gets transformed into: *(a+0)
Things to remember:
2. Adding an integer to a pointer takes the pointer to the next element
Here, &a points to the whole array. Although the value of &a is the same as that of &a[0] and a, it is a pointer to the array, not a pointer to the first element.
So, if you add 1 to &a, i.e. &a + 1, you'll go beyond this array.
Similarly, as &a[0] and a are pointers to the first element, adding 1 to them will give you the next element of the array (if the array has more than one element), i.e. (a+1) and &a[0] + 1 point to the element after the first one. Now, to dereference them, you can use:
*(a+1)
*(&a[0] +1)
a[1] - Note that internally, this gets transformed into: *(a+1)
Adding more information to remove the following doubt:
If, as this answer states, the name of the array were the address of its first element, then &a would be the address of the address of the first element.
The answer to this doubt is both No and Yes.
No because there is nothing like address of the address.
For understanding yes, consider the following situation:
Imagine that you have 10 boxes of chocolates and each box contains 5 chocolates (fitted in a line inside the box) and that the boxes are lined up.
Ok, enough chocolates to explain.
Here, boxes represent arrays of chocolates. Thus, we have with us 5 boxes of 5 chocolates each. The declaration for that would be:
Translating it to C, just assume that a is an array with 5 numbers.
Now, if I ask you to tell me the location of the first box, you will refer to it as &a. If I ask you for the location of the second box, you'll refer to it as &a+1.
If I ask you to get me the location of the first chocolate in the first box, then you'll refer to it as &a[0] or (a+0) or a.
If I ask you to get me the location of the second chocolate in the first box, then you'll refer to it as &a[1] or (a+1). Note: In (a+1), as a is the name of the array, it is the address of the first element, which is an integer. So, increasing a by 1 gives the address of the second element.
If I ask you to get me the location of the second box of chocolates, then you'll refer to it as (&a+1).
If I ask you to get me the location of the first chocolate in the second box of chocolates, then you'll refer to it as *(&a+1) or (*(&a+1))+0.
If I ask you to get me the location of the third chocolate in the second box of chocolates, then, you'll refer to it as (*(&a+1))+2
To answer just your question, by definition the operators * and & are such that they cancel out. Taking the address of a variable and then dereferencing gives you back the variable. Here this is an array, a. In most contexts arrays "decay" to pointers, so what you then see again is the address of the first element.
The C standard specifies that an expression that has type “array of type” is converted to type “pointer to type” and points to the initial element of the array, except when the expression is the operand of & (or three other exceptions, noted below but not relevant to this question). Here is how this applies to your examples:
a is an array of int. It is converted to pointer to int. The value is the address of the initial element of a.
In &a[0], a[0] is processed first. a is again converted to pointer to int. Then the subscript operator is applied, which produces an lvalue for the initial element of the array. Finally, & takes the address of this lvalue, so the value is the address of the initial element of a.
In &a, a is the operand of &, so a is not converted to a pointer to int. It remains an array, and &a takes its address. The result has type “pointer to array of int”, and its value is the address of the array, which equals the address of the initial element.
For completeness: The relevant rule in the C standard is 6.3.2.1 paragraph 3. The exceptions are:
The array expression is the operand of &.
The array expression is the operand of sizeof.
The array expression is the operand of _Alignof.
The array expression is a string literal used to initialize an array.
The last means that in char s[] = "abc";, "abc" is not converted to a pointer to the initial element; it remains an array, which is used to initialize s.
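Applied to the sixth printf in the question, these rules explain why *(&a) still shows an address. The sketch below uses a hypothetical function name and the question's array.

```c
#include <assert.h>

/* Hypothetical helper: shows why *(&a) does not yield the first element. */
int first_element_via_address_of_array(void)
{
    int a[5] = {1, 2, 3, 4, 5};

    int (*pa)[5] = &a;   /* pointer to the whole array */
    int *first = *pa;    /* *pa is the array a, which converts to &a[0] */

    assert(first == &a[0]);
    assert(sizeof *pa == sizeof a);   /* *pa is still the whole array */

    return **pa;         /* a second * reaches a[0]; (*pa)[0] is equivalent */
}
```

The function returns 1, the first element. A single * applied to &a only gets back the array, so one more dereference (or a subscript) is needed to reach the value.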
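In an expression, a is converted to a pointer, and a[0] is *(a+0). When you write *(&a), the * and & cancel out and you get the array a back, which is again converted to the address of its first element, so you are not dereferencing down to the value.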

Resources