Related
I wanted to test if I could change the constant pointer that points to the first element of an array in C. While testing I got some strange output that I don't understand:
//Constant pointer to pointer to constant value
void test(int const * * const a) {
//printf("%d", **a); //Program crashes (2)
(*a)++;
}
int main()
{
int a[5] = { 1,2,3,4,5 };
test(&a);
printf("%d", *a); //Prints 5 as output (1)
return 0;
}
I expected the compiler to give an error when I try to compile (*a)++ but instead I can run the code, but when I try to print the element I get a strange value (1).
Then I wanted to print out the value (2) of the first element of the array. When I tried this, the program crashes.
By doing &a you are making a pointer to an array (int (*)[]).
Then when this pointer to array is passed to the test function, it's converted to a pointer to a pointer(int **);
Then (*a)++; is UB.
1. So why 5?
On modern implementation of C like GCC, pointer to a array has the same numerical value as the beginning of the array, so is the address value when the array decays to a pointer: they all are the beginning address of the array.
So, in test, int **a points to the beginning of the array, (*a)++ deferences the pointer as int * and increment the pointer by 1 int element, which is usually implemented as adding the sizeof(int) to the numerical value of the pointer.
Then, 1+sizeof(int) gives you 5.
2. Why it crashed in the second case?
Assuming you are using a 32bit x86 machine, or some machine whose pointer type has the same size as int type, then *a equal to 1. Then further dereferencing the pointer at a memory address at 1 usually gives you a segfault.
The program crashes at the printf because test assumes that when it dereferences a the resulting object is a pointer. If it were one and contained a valid address, the second dereferencing would yield an int object. Alas, a contains the address of the array, which is numerically the address of its first element. The 4 or 8 bytes there are considered an address (because test thinks that *a is a pointer) and the code then tries to access the memory at address 1 in order to print the int value assumed at that address. The address is invalid though, so the program crashes.
Now that we have established that the program considers the data at the beginning of the array a pointer to int, we know what (*a)++ does: it increments the value there by sizeof(int) so that the "pointer" point to the next int "element". I guess an int on your machine is 4 bytes long because 1+4=5, which is printed.
This code is illegal in C, you should get a compiler diagnostic. (If not, turn up your warning level). The results of running any executable produced are meaningless.
The code is illegal because int (*)[5] does not implicitly convert to int const **.
the constant pointer that points to the first element of an array in C
There is no such thing. You misunderstand what arrays are. Arrays are a series of contiguous elements. int a[5] is like int a; except that there are 5 ints instead of 1.
int a; and int a[1]; cause identical memory layout. The only difference is the syntax used to access that memory.
int s[4][2] = {
{1234, 56},
{1212, 33},
{1434, 80},
{1312, 78}
};
int (*p)[1];
p = s[0];
printf("%d\n", *(*(p + 0))); // 1234
printf("%d\n", *(s[0] + 0)); // 1234
printf("%u\n", p); // 1256433(address of s[0][0])
printf("%u\n", *p); // 1256433(address of s[0][0])
Can anyone explain why doing *(*(p + 0)) prints 1234, and doing *(s[0] + 0) also prints 1234, when p = s[0] and also why does p and *p gives the same result?
Thanking you in anticipation.
This is the way arrays work in C -- arrays are not first class types, in that you can't do anything with them other than declaring them and getting their size. In any other context, when you use an expression with type array (of anything) it is silently converted into a pointer to the array's first element. This is often referred to as an array "decaying" into a pointer.
So lets look at your statements one by one:
p = s[0];
Here, s has array type (it's an int[4][2] -- a 2D int array), so its silently converted into a pointer to its first element (an int (*)[2], pointing at the word containing 1234). You then index this with [0] which adds 0 * sizeof(int [2]) bytes to the pointer, and then dereferences it, giving you an int [2] (1D array of 2 ints). Since this is an array, its silently converted into a pointer to its first element (an int * pointing at 1234). Note that this is the same pointer as before the index, just the pointed at type is different.
You then assign this int * to p, which was declared as int (*)[1]. Since C allows assigning any pointer to any other pointer (even if the pointed at types are different), this works, but any reasonable compiler will give you a type mismatch warning.
p now points at the word containing 1234 (the same place the pointer you get from s points at)
printf("%d\n", *(*(p+0)));
This first adds 0*sizeof(int[1]) to p and dereferences it, giving an array (int[1]) that immediately decays to a pointer to its first element (an int * still pointing at the same place). THAT pointer is then dereferenced, giving the int value 1234 which is printed.
printf("%d\n", *(s[0]+0));
We have s[0] again which via the multiple decay and dereference process noted in the description of the first line, becomes an int * pointing at 1234. We add 0*sizeof(int) to it, and then dereference, giving the integer 1234.
printf("%u\n", p);
p is a pointer, so the address of the pointer is simply printed.
printf("%u\n",*p)
p is dereferenced, giving an int [1] (1D integer array) which decays into a pointer to its first element. That pointer is then printed.
s[0]points to a location in memory. That memory location happens to be the starting point of int s[4][2]. When you make the assignment p = s[0], p and p+0 also point to s[0]. So when you print any one of these with a "%d" specifier, you will get the value stored at that location which happens to be `1234'. If you would like to verify the address is the same for all of these, use a format specifier "%p" instead of "%d".
EDIT to address OP comment question...
Here is an example using your own int **s:
First, C uses pointers. Only pointers. No arrays. The [] notation gives the appearance of arrays, but any variable that is created using the [] notation (eg. int s[4][2]) is resolved into a simple pointer (eg. int **s). Also, a pointer to a pointer is still just a pointer.
int a[8]={0}; (or int *a then malloced)
will look the same in memory as will:
int a[2][4]; ( or in **a=0; then malloced)
The statment:
s[row][col] = 1;
creates the same object code as
*(*(s + row) + col) = 1;
It is also true that
s[row] == *(s + row)
Since s[row] resolves to a pointer, then so does *(s + row)
It follows that s[0] == *(s + 0) == *s
If these three are equal, whatever value is held at this address will be displayed when printing it.
It follows that in your code: given that you have assigned p = s[0]; and s[0] == *s
*(*(p + 0)) == *(s[0] + 0) == *s[0] == **s
printf("%d\n", >>>fill in any one<<<); //will result in 1234
Note, in the following printf statements, your comment indicates addresses were printed. But because you used the unsigned int format specifier "%u",
Consider p == s[0]; which is a pointer to the first location of s. Note that either s[0][0] or **s would give you the value held at the first location of s, but s[0] is the _address_ of the first memory location of s. Therefore, since p is a pointer, pointing to the address at s[0], the following will give you the address of p, or s[0] (both same):
printf("%p\n", *p); // 1256433(address of s[0][0])
As for *p, p was created as int (*p)[1]; and pointer array of 1 element. an array is resolved into a pointer, so again, in the following you will get the address pointing to s[0]:
printf("%u\n", **p);
In summary, both p and *p are pointers. Both will result in giving address when printed.
Edit 2 Answer to your question: So my question is what is the difference between a simple pointer and a pointer to an array?
Look toward the bottom of this tutorial download a pdf. It may explain it better...
But in short, C Does not implement arrays in the same way other languages do. In C, an array of any data type always resolves into a pointer. int a[10]; is just really int *a;, with memory set aside for space to hold 10 integers consecutively. In memory it would look like:
a[0] a[9]
|0|0|0|0|0|0|0|0|0|0| (if all were initialized to zero)
Likewise you would be tempted to think of float b[2][2][2]; as a 3 dimensional array: 2x2x2, it is not. It is really a place in memory, starting at b[0] that has room for 8 floating point numbers. Look at the illustrations HERE.
int stud[5][2] = {{1,2},{3,4},{5,6},{7,8},{9,8}};
printf("%u %u",*(stud+1),stud+1);
printf("%u, %u", &stud,stud);
Why this statement prints similar values, stud[1] or *(stud+1) is actually an array hence must get the base address i.e &stud[0][0], but stud itself is a pointer to an array of array. Also the third statement prints identical values.
Your observations are correct concerning the expressions are all address-results. But the types of those addresses per the standard are different. Your phrase "but stud itself is a pointer to an array of array". is not accurate. stud is an array of arrays. Pointers are not arrays. After decades of trying to come up with a solid vernacular that describes how it works, and refusing steadfastly to walk the "decay" plank (a word that appears exactly one times in the C standard and even there it is used as a verb-footnote), the best I could come up with is this:
Pointers are not arrays. A pointer holds an address. An array is an address.
Each expression is shown below Given int stud[5][2];
stud int (*)[2]
stud+1 int (*)[2]
*(stud+1) int *
&stud int (*)[5][2]
Remembering that, per the standard, the expressive value of an array is the address of its first element, and pointer-to-element-type is the type of said-address. In both outputs each pair of expressions have equivalent addresses, but they're different types. This is verifiable with some expansion of the original code:
#include <stdio.h>
int main()
{
int stud[5][2] = {{1,2},{3,4},{5,6},{7,8},{9,8}};
printf("%p %p\n", *(stud+1), stud+1);
printf("%p %p\n", &stud,stud);
int (*p1)[2] = stud+1; // OK
// int (*p2)[2] = *(stud+1); // incompatible types
int *p3 = *(stud+1); // OK
int (*p4)[5][2] = &stud; // OK
return 0;
}
int stud[5][2] = {{1,2},{3,4},{5,6},{7,8},{9,8}};
The above statement defined stud to be an array of 5 elements where each element is of type int[2], i.e., an array of 2 integers. It also initializes the array with an initializer list.
Now, in the expression stud + 1, the array stud decays into a pointer to its first element. Therefore, stud + 1 evaluates to &stud[1] and is of type int (*)[2], i.e., a pointer to an array of 2 integers . *(stud + 1) is then *(&stud[1]), i.e., stud[1]. stud[1] is again an array type, i.e., int[2], so it again decays to a pointer to its first element, i.e., &stud[1][0] (which is the base address of second element of the array stud[1]) in the printf call.
Please note that stud + 1 and *(stud + 1) evaluate to the same address but they are not the same type.
Similarly, &stud and stud decay to the same address but they are different types. stud is of type int[5][2] where as &stud is of type int (*)[5][2].
Why this statement prints similar values, stud[1] or *(stud+1) is actually an array hence must get the base address i.e &stud[0][0], but
stud itself is a pointer to an array of array.
You are wrong here. The base address of stud[1] or *(stud + 1) is &stud[1][0] and not &stud[0][0]. Also, stud is not a pointer but an array type. It decays to a pointer to its first element in some cases like here but it does mean it is a pointer.
Also, you should use %p conversion specifier for printing addresses.
Without using any decaying syntax it may be clearer (these are the same addresses as your code; the first line is in the opposite order; and my parentheses are redundant but hopefully it improves clarity of this example):
printf( "%p %p\n", &(stud[1]), &(stud[1][0]) );
printf( "%p %p\n", &(stud), &(stud[0]) );
In both cases the first address on the line matches the second because the first element of an array lives at the same address as the array. Arrays can't have initial padding, and in C the address of an object is the address of its first byte.
The first element of stud is stud[0], and the first element of stud[1] is stud[1][0].
Since all of those values you are trying to display are all pointers you should use %p instead of %u. If you do that you will see that the addresses pointed to:
printf("%p, %p", &stud,stud);
are different than:
printf("%p %p",*(stud+1),stud+1);
because as you said stud is a pointer to an array of array.
Lets analyze the program
int stud[5][2] = {{1,2},{3,4},{5,6},{7,8},{9,8}};
Now address will be like this (assuming 2 byte integer). Brackets denote corresponding elements in array.
1 element of 2-D array ---> 4001(1) 4003(2)
2 element of 2-D array ---> 4005(3) 4007(4)
3 element of 2-D array ---> 4009(5) 4011(6)
4 element of 2-D array ---> 4013(7) 4015(8)
5 element of 2-D array ---> 4017(9) 4019(8)
We know that arr[i] gives the ith element of array. So when we say stud[0] we expect 0th element of array stud[5][2].
We can assume 2-d array as collection of 1-d array. So with statement like printf("%u",stud[0]) we exptect 0th element to get printed and what is 0th element for this array. It is one dimensional array. We know that just mentioning 1-D array gives its base address. Hence printf would print base address of 0th 1-D array and so on.
With this information we can analyze your problem.
Remember stud is 2-D array. stud is treated as pointer to zeroth element of 2-D array. So (stud + 1) would give address of 2nd element of 2-D array. And thus printing (stud+1) would print address of 2nd element of stud array. What is it. It will be 4005 from above addresses.
Now lets see why *(stud +1) also gives the same value.
Now we know that *(stud +1) is equivalent to stud[1]. From above we know stud[1] would print base address of 2nd 1-D array. What is 1-d array at 2nd position it is (3,4) with address (4005,4007). So what is it base address. It is 4005. Thus *(stud+1) also prints 4005.
Now you say stud[0] and &stud[0] print the same value.
From above stud[0] is 1-d array and printing it gives its base address. Now so &stud[0] should give address of 1-D array which is same as its base address. Thus they print the same address.
Similar explanation will hold for other cases.
I wrote the following code in C:
#include<stdio.h>
int main()
{
int a[10][10]={1};
//------------------------
printf("%d\n",&a);
printf("%d\n",a);
printf("%d\n",*a);
//-------------------------
printf("%d",**a);
return 0;
}
With the above 3 printf statements I got the same value. On my machine it's 2686384. But with the last statement I got 1.
Isn't it something going wrong? These statements mean:
The address of a is 2686384
The value stored in a is 2686384
the value that is stored at address of variable pointed by a (i.e. at 2686384) is 2686384.
This means a must be something like a variable pointing towards itself...
Then why is the output of *(*a) 1? Why isn't it evaluated as *(*a)=*(2686384)=2686384?
#include<stdio.h>
int main()
{
// a[row][col]
int a[2][2]={ {9, 2}, {3, 4} };
// in C, multidimensional arrays are really one dimensional, but
// syntax alows us to access it as a two dimensional (like here).
//------------------------
printf("&a = %d\n",&a);
printf("a = %d\n",a);
printf("*a = %d\n",*a);
//-------------------------
// Thing to have in mind here, that may be confusing is:
// since we can access array values through 2 dimensions,
// we need 2 stars(asterisk), right? Right.
// So as a consistency in this aproach,
// even if we are asking for first value,
// we have to use 2 dimensional (we have a 2D array)
// access syntax - 2 stars.
printf("**a = %d\n", **a ); // this says a[0][0] or *(*(a+0)+0)
printf("**(a+1) = %d\n", **(a+1) ); // a[1][0] or *(*(a+1)+0)
printf("*(*(a+1)+1) = %d\n", *(*(a+1)+1) ); // a[1][1] or *(*(a+1)+1)
// a[1] gives us the value on that position,
// since that value is pointer, &a[i] returns a pointer value
printf("&a[1] = %d\n", &a[1]);
// When we add int to a pointer (eg. a+1),
// really we are adding the lenth of a type
// to which pointer is directing - here we go to the next element in an array.
// In C, you can manipulate array variables practically like pointers.
// Example: littleFunction(int [] arr) accepts pointers to int, and it works vice versa,
// littleFunction(int* arr) accepts array of int.
int b = 8;
printf("b = %d\n", *&b);
return 0;
}
An expression consisting the the name of an array can decay to a pointer to the first element of the array. So even though a has type int[10][10], it can decay to int(*)[10].
Now, this decay happens in the expression *a. Consequently the expression has type int[10]. Repeating the same logic, this again decays to int*, and so **a is an int, which is moreover the first element of the first element of the array a, i.e. 1.
The other three print statements print out the address of, respectively, the array, the first element of the array, and the first element of the first element of the array (which are of course all the same address, just different types).
First, a word on arrays...
Except when it is the operand0 of the sizeof, _Alignof, or unary & operators, or is a string literal being used to initialize another array in a declaration, an expression of type "N-element array of T" will be converted ("decay") to an expression of type "pointer to T", and the value of the expression will be the address of the first element in the array.
The expression &a has type "pointer to 10-element array of 10-element array of int", or int (*)[10][10]. The expression a has type "10-element array of 10-element array of int", which by the rule above decays to "pointer to 10-element array of int", or int (*)[10]. And finally, the expression *a (which is equivalent to a[0]) has type "10-element array of int", which again by the rule above decays to "pointer to int".
All three expressions have the same value because the address of an array and the address of its first element are the same: &a[0][0] == a[0] == *a == a == &a. However, the types of the expressions are different, which matters when doing pointer arithmetic. For example, if I have the following declarations:
int (*ap0)[10][10] = &a;
int (*ap1)[10] = a;
int *ip = *a;
then ap0++ would advance ap0 to point to the next 10x10 array of int, ap1++ would advance ap1 to pointer to the next 10-element array of int (or a[1]), and ip++ would advance ip to point to the next int (&a[0][1]).
**a is equivalent to *a[0] which is equivalent to a[0][0]. which is the value of the first element of a and has type int and the value 1 (note that only a[0][0] is initialized to 1; all remaining elements are initialized to 0).
Note that you should use %p to print out pointer values:
printf("&a = %p\n", &a);
printf(" a = %p\n", a);
printf("*a = %p\n", *a);
First of all, if you want to print out pointer values, use %p - if you're on a 64 bit machine int almost certainly is smaller than a pointer.
**a is double dereferencing what's effectively a int**, so you end up with what the first element of the first sub-array is: 1.
If you define a as T a[10] (where T is some typedef), then a simple unadorned a means the address of the start of the array, the same as &a[0]. They both have type T*.
&a is also the address of the start of the array, but it has type T**.
Things become trickier in the presence of multi-dimensional arrays. To see what is happening, it is easier to break things down into smaller chunks using typedefs. So, you effectively wrote
typedef int array10[10];
array10 a[10];
[Exercise to reader: What is the type of a? (it is not int**)]
**a correctly evaluates to the first int in the array a.
From C99 Std
Consider the array object defined by the declaration
int x[3][5];
Here x is a 3 × 5 array of ints; more precisely, x is an array of three element objects, each of which is an array of five ints. In the expression x[i], which is equivalent to (*((x)+(i))), x is first converted to a pointer to the initial array of five ints. Then i is adjusted according to the type of x, which conceptually entails multiplying i by the size of the object to which the pointer points, namely an array of five int objects. The results are added and indirection is applied to yield an array of five ints. When used in the expression x[i][j], that array is in turn converted to a pointer to the first of the ints, so x[i][j] yields an int.
so,
Initial array will be x[0][0] only.
all x, &x and *x will be pointing to x[0][0].
No, there's nothing wrong with your code. Just they way you are thinking about it... The more I think about it the harder I realize this is to explain, so before I go in to this, keep these points in mind:
arrays are not pointers, don't think of them that way, they are different types.
the [] is an operator. It's a shift and deference operator, so when I write printf("%d",array[3]); I am shifting and deferencing
So an array (lets think about 1 dimension to start) is somewhere in memory:
int arr[10] = {1};
//Some where in memory---> 0x80001f23
[1][1][1][1][1][1][1][1][1][1]
So if I say:
*arr; //this gives the value 1
Why? because it's the same as arr[0] it gives us the value at the address which is the start of the array. This implies that:
arr; // this is the address of the start of the array
So what does this give us?
&arr; //this will give us the address of the array.
//which IS the address of the start of the array
//this is where arrays and pointers really show some difference
So arr == &arr;. The "job" of an array is to hold data, the array will not "point" to anything else, because it's holding its own data. Period. A pointer on the other hand has the job to point to something else:
int *z; //the pointer holds the address of someone else's values
z = arr; //the pointer holds the address of the array
z != &z; //the pointer's address is a unique value telling us where the pointer resides
//the pointer's value is the address of the array
EDIT:
One more way to think about this:
int b; //this is integer type
&b; //this is the address of the int b, right?
int c[]; //this is the array of ints
&c; //this would be the address of the array, right?
So that's pretty understandable how about this:
*c; //that's the first element in the array
What does that line of code tell you? if I deference c, then I get an int. That means just plain c is an address. Since it's the start of the array it's the address of the array, thus:
c == &c;
#include<stdio.h>
main()
{ int x[3][5]={{1,2,10,4,5},{6,7,1,9,10},{11,12,13,14,15}};
printf("%d\n",x);
printf("%d\n",*x); }
Here first printf will print the address of first element.
So why not the second printf prints the value at the address x i.e the first value.
To print the value I need to write **x.
For pointers, x[0] is the same as *x. It follows from this that *x[0] is the same as **x.
In *x[0]:
x is a int[3][5], which gets converted to int(*)[5] when used in expression. So x[0] is lvalue of type int[5] (the first 5-element "row"), which gets once again converted to int*, and dereferenced to its first element.
*x is evaluated along the same lines, except the first dereference is done with an asterisk (as opposed to indexing), and there is no second dereference, so we end up with lvalue of type int[5], which is passed to printf as a pointer to its first element.
Arrays, when used as arguments to functions, decay into pointers to the first element of the array. That being said, the type of object that x decays into is a pointer to the first sub-array, which is a pointer to an array of int, or basically int (*)[5]. When you call printf("%d\n",*x), you are not feeding an integer value to printf, but rather a pointer to the first sub-array of x. Since that sub-array will also decay to a pointer to the first sub-array's element, you can do **x to dereference that subsequent pointer and get at the first element of the first sub-array of x. This is effectively the same thing as *x[0], which by operator precedence will index into the first sub-array of x, and then dereference the pointer to the first sub-array's element that the first sub-array will decay into.
Because of type of *x is 'pointer to array of 5 ints'. So, you need one more dereference to get the first element
PS:
#include <typeinfo>
#include <iostream>
typedef int arr[5]; // can't compile if put arr[4] here
void foo(arr& x)
{
}
int main()
{
int x[3][5]={{1,2,10,4,5},{6,7,1,9,10},{11,12,13,14,15}};
std::cout << typeid(*x).name() << std::endl;// output: int [5]
foo(x[0]);
return 0;
}
Think about a 2-d array as an array of pointers, with each element in the array pointing to the first element in another array. When you dereference x, you get the value that is in the memory location pointed to by x... a pointer to the first int in an array of ints. When you dereference that pointer, you will get the first element.