I just saw this code snippet Q4 here and was wondering if I understood this correctly.
#include <stdio.h>
int main(void)
{
int a[5] = { 1, 2, 3, 4, 5 };
int *ptr = (int*)(&a + 1);
printf("%d %d\n", *(a + 1), *(ptr - 1));
return 0;
}
Here's my explanation:
int a[5] = { 1, 2, 3, 4, 5 }; => a points to the first element of the array. In other words: a contains the address of the first element of the array.
int *ptr = (int*)(&a + 1); => Here &a will be a double pointer and point to the whole array. I visualize it like this: int b[1][5] = {1, 2, 3, 4, 5};, here b points to a row of a 2D array. &a + 1 should point to the next array of integers in the memory (non-existent) [kind of like, b + 1 points to the second (non-existent) row of a 2D array with 1 row]. We cast it as int *, so this should probably point to the first element of the next array (non-existent) in memory.
*(a + 1) => This one's easy. It just points to the second element of the array.
*(ptr - 1) => This one's tricky, and my explanation is probably flawed for this one. As ptr is an int *, this should point to int previous to that pointed by ptr. ptr points to the non-existent second array in memory. So, ptr - 1 should probably point to the last element of the first array (a[4]).
Here &a will be a double pointer.
No. It is a pointer to an array; in this example, its type is int (*)[5]. See "C pointer to array/array of pointers disambiguation".
So when you increment a pointer to an array, it steps over the whole array and points just past its end.
In this example, the result is cast and assigned to an int pointer. When that int pointer is decremented, it moves back by sizeof(int) bytes and points at the last element, so 5 is printed.
Your statement is essentially correct, and you probably understand it better than most professionals. But since you are seeking a critique, here is the long answer. Arrays and pointers in C are different types; this is one of the most subtle details in C. I remember one of my favorite professors saying once that the people who made the language later regretted making this so subtle and often confusing.
It is true that in many cases an array of a type and a pointer to a type can be treated the same way, because in most expressions an array is converted to a pointer to its first element. But they are truly different types.
When you take the address of an array, &a, you have a pointer to an array. When you say (a + 1) you have a pointer to an int; when you just say a you have an array (not a pointer). a[1] is exactly the same as typing *(a + 1); in fact you could type 1[a] and it would be exactly the same as the previous two. When you pass an array to a function, you are not really passing an array, you are passing a pointer: void Fn(int b[]) and void Fn(int *b) are the exact same function signature, and if you take sizeof b within the function, in both cases you will get the size of a pointer.
Pointer arithmetic is tricky: it always offsets by the size, in bytes, of the object the pointer points to. Whenever you use the address-of operator, you get a pointer to the type you applied it to.
So for what's going on in your example above:
&a is a pointer to an array, and so when you add one to it, it is offset by the size of that array (5 * sizeof(int)).
When you cast to int*, the cast retains the value of the pointer, but now its type is pointer to int. You then store it in ptr, a variable of type pointer to int.
a is an array, not a pointer. When you write a + 1, the array is first converted to a pointer to its first element, and the addition then yields a pointer to the second element, which has the type stored in the array, int. Dereferencing it with * gives you the int pointed to.
ptr is a pointer to int, and it points one past the end of the array. (It is legal, by the way, to point one past the end of an array; it's just not legal to dereference this pointer.) When you subtract 1 from it, you end up with a pointer to the last int in the array, which you can dereference. (Your explanation of visualizing int b[1][5] = {1, 2, 3, 4, 5}; is something I've not heard before, and while I can't honestly say if this is technically correct, I will say this is how it works and I think this is a great way to think of it; I will likely do so in the back of my mind from now on.)
Types will get very tricky in C, and also in C++. The best is yet to come.
As per your explanation, you understand pointers to arrays correctly. Using the statement
int *ptr = (int*)(&a + 1);
you make ptr point to the address just past the memory occupied by the whole array a[], so you can reach the array elements through ptr by decrementing it.
I am a beginner in C and have recently come across the fact that an array name is a pointer to the address of the first element of an array. That is a pretty understandable concept, since a pointer is a variable that holds a memory address.
int x[4];
printf("%p", (void *)x); // x is a pointer
What I am having problems understanding is the following code:
int x[4], *ptr;
ptr = x;
This is simple enough, but the second line, ptr = x, points to the pointer x. Would this not make ptr a pointer to a pointer, meaning I would need to declare ptr as int **ptr? If my understanding is not mistaken, x stores the address of x[0], so doing ptr = x makes ptr point to another pointer, x, making it essentially a pointer to a pointer.
An array is not a pointer.
But first, let’s look at the assignment. Suppose you have:
int SomeInt;
int *a = &SomeInt;
At this point a points to SomeInt. Then we do:
int *b;
b = a;
What does this do? It sets b equal to a. It does not make b point to a. After this assignment, b has the same value as a, so b points to the same place a points to; it points to SomeInt. It does not point to a.
Similarly, your ptr = x; does not make ptr point to x. It makes ptr equal to the value of the expression x.
However, in this case, that value is not the array, because there is an automatic conversion occurring. We will discuss that below.
Getting back to arrays not being pointers, after int x[4];, x is the name of the array of four int. If you print the size of the array, with printf("%zu\n", sizeof x);, you will get 16 on systems where int is four bytes, because the size of the array is 16 bytes. This is because x is the array, so sizeof x is the size of the array.
However, when you write ptr = x;, it does not assign the array to ptr. It assigns a pointer to the first element to ptr. How does that work?
Because early C did not have any support for working with whole arrays, such as assigning one to another, you had to work with them only through pointers. So, to set a pointer to point to the first element of an array, you would have to write ptr = &x[0];. To make this easier, the language was designed to allow you to write ptr = x; instead. When you use an array in this way, it is automatically converted to a pointer to its first element, as if you had written &array[0] instead of array.
That automatic conversion occurs whenever an array is used in an expression except when it is the operand of sizeof, is the operand of unary &, or is a string literal used to initialize an array.
I'm new to C programming and currently learning about arrays and strings. I'm quite confused about this topic. Coming to my question:
Since the name of an array (for example, a[] = {20, 44, 4, 8}) decays into a pointer constant in an expression, the compiler shows an error whenever I try pointer arithmetic such as a = a + 1. But when I write the same thing in a printf() call, it shows the address of the first element rather than an error. Why?
In an expression such as *(a+1)=2, first (a+1) is evaluated and then * dereferences it. My question is: if a is a pointer constant, how can it point to any other memory location in the array, and how is this expression perfectly legal?
I tried to search for this but couldn't find an accurate answer.
Although an array name evaluates to a pointer in some expressions, your a = a+1 assignment tries to assign to an array, which is not allowed.
On the other hand, the expression a+1 is allowed, and it evaluates to another pointer. When you pass this value to printf, the function happily prints it. Do not forget to cast the result to void* when you print:
printf("%p\n", (void*)(a+1));
if a is a pointer constant then how it can point to any other memory location in an array and how is *(a+1) expression perfectly legal?
For the same reason that 2+3, a combination of two constants, produces a value that is neither a 2 nor a 3. In your example, the expression a+1 does not modify a. Instead, the expression uses it as a "starting point", computes a different value (which happens to be of type pointer), and leaves a unchanged.
The name of the array a is not quite the same as a pointer constant. It merely
acts like a pointer constant in some circumstances. In other circumstances it will
act quite differently; for example, sizeof(a) may have a much larger value
than sizeof(b) where b is truly a pointer.
This code is legal:
int a[] = {20,44,4,8};
int *b;
b = a;
b = b + 1;
because a is enough like a pointer that you can set b to point to the same
address but, unlike a, b really is a pointer and it can be modified.
The last line of code could just as well be:
b = a + 1;
because the right-hand side here is not trying to modify a; it is merely using
the address of the first element of a to compute a new address.
The expression *(a + 1) is effectively another way of writing a[1].
You know what will happen when you write a[1] = 2, right?
It will change what is stored in the second element of a.
(The first element is always a[0] whether you do anything with it or not.)
Storing a new value in a[1] doesn't change the location of the array a.
When an array decays into a pointer, the resulting value is an rvalue: a value that cannot be assigned to.
So int[4] behaves like int*const, a constant pointer to int.
Q1:
Types in expression a = a + 1 are:
int[4] = int[4] + int
If we focus on addition first, array decays to pointer:
int[4] = int*const + int
int[4] = int*const // After addition
But now there is a problem:
int*const = int*const
In memory, a is an array of 4 ints and nothing more. There is no place where you could store an address of type int*, so the compiler reports an error.
Q2:
Types in expression *(a+1)=2 are:
*(int[4] + int) = int
Again, array decays to pointer and addition happens:
*(int*const + int) = int
*(int*const) = int // int* is now equal to &a[1]
Dereferencing an int*const is legal. While the pointer is constant, the value it points to is not:
int = int // Ok, equal types
Types are now perfectly compatible.
int s[4][2] = {
{1234, 56},
{1212, 33},
{1434, 80},
{1312, 78}
};
int (*p)[1];
p = s[0];
printf("%d\n", *(*(p + 0))); // 1234
printf("%d\n", *(s[0] + 0)); // 1234
printf("%u\n", p); // 1256433(address of s[0][0])
printf("%u\n", *p); // 1256433(address of s[0][0])
Can anyone explain why doing *(*(p + 0)) prints 1234, and doing *(s[0] + 0) also prints 1234, when p = s[0], and also why p and *p give the same result?
Thanking you in anticipation.
This is the way arrays work in C -- arrays are not first class types, in that you can't do anything with them other than declaring them, taking their address, and getting their size. In any other context, when you use an expression with type array (of anything) it is silently converted into a pointer to the array's first element. This is often referred to as an array "decaying" into a pointer.
So let's look at your statements one by one:
p = s[0];
Here, s has array type (it's an int[4][2] -- a 2D int array), so it's silently converted into a pointer to its first element (an int (*)[2], pointing at the word containing 1234). You then index this with [0], which adds 0 * sizeof(int [2]) bytes to the pointer and then dereferences it, giving you an int [2] (1D array of 2 ints). Since this is an array, it's silently converted into a pointer to its first element (an int * pointing at 1234). Note that this is the same address as before the index; only the pointed-at type is different.
You then assign this int * to p, which was declared as int (*)[1]. The pointed-at types are different, so strictly this assignment needs a cast; in practice most C compilers accept it, and any reasonable compiler will give you a type mismatch warning.
p now points at the word containing 1234 (the same place the pointer you get from s points at)
printf("%d\n", *(*(p+0)));
This first adds 0*sizeof(int[1]) to p and dereferences it, giving an array (int[1]) that immediately decays to a pointer to its first element (an int * still pointing at the same place). THAT pointer is then dereferenced, giving the int value 1234 which is printed.
printf("%d\n", *(s[0]+0));
We have s[0] again which via the multiple decay and dereference process noted in the description of the first line, becomes an int * pointing at 1234. We add 0*sizeof(int) to it, and then dereference, giving the integer 1234.
printf("%u\n", p);
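p is a pointer, so the address stored in the pointer is printed. (Strictly, passing a pointer for %u is undefined behavior; %p with a cast to void * is the correct way to print it.)
printf("%u\n", *p);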
p is dereferenced, giving an int [1] (1D integer array) which decays into a pointer to its first element. That pointer is then printed.
s[0] points to a location in memory. That memory location happens to be the starting point of int s[4][2]. When you make the assignment p = s[0], p and p+0 also point to that location. So when you dereference any one of these and print the result with a "%d" specifier, you will get the value stored at that location, which happens to be 1234. If you would like to verify that the address is the same for all of these, use the format specifier "%p" instead of "%d".
EDIT to address OP comment question...
Here is an example using your own int **s:
First, in most expressions C works with arrays through pointers. The [] notation is defined in terms of pointer arithmetic, and a variable created using the [] notation (e.g. int s[4][2]) is converted to a pointer to its first element when it is used in an expression. Note that for int s[4][2] that pointer has type int (*)[2], not int **; the variable itself is still an array, as sizeof s shows.
int a[8] = {0}; (or int *a pointing at malloc'ed space for 8 ints)
will look the same in memory as:
int a[2][4]; (but not as int **a, which needs a separate block of row pointers)
The statement:
s[row][col] = 1;
creates the same object code as
*(*(s + row) + col) = 1;
It is also true that
s[row] == *(s + row)
Since s[row] resolves to a pointer, then so does *(s + row)
It follows that s[0] == *(s + 0) == *s
If these three are equal, whatever value is held at this address will be displayed when printing it.
It follows that in your code: given that you have assigned p = s[0]; and s[0] == *s
*(*(p + 0)) == *(s[0] + 0) == *s[0] == **s
printf("%d\n", >>>fill in any one<<<); //will result in 1234
Note, in the following printf statements, your comment indicates addresses were printed. But because you used the unsigned int format specifier "%u", the output is not guaranteed to be the full address; use "%p" to print pointers.
Consider p == s[0], which is a pointer to the first location of s. Note that either s[0][0] or **s would give you the value held at the first location of s, but s[0] is the _address_ of the first memory location of s. Therefore, since p is a pointer holding the address s[0], the following will print the address held in p, which is the same as s[0]:
printf("%p\n", (void *)p); // address of s[0][0]
As for *p, p was created as int (*p)[1];, a pointer to an array of 1 element. Dereferencing it gives that array, which is converted to a pointer to its first element, so again, in the following you will get the address of s[0][0]:
printf("%p\n", (void *)*p);
In summary, both p and *p are printed as pointers, and both give the same address.
Edit 2 Answer to your question: So my question is what is the difference between a simple pointer and a pointer to an array?
Look toward the bottom of this tutorial (downloadable as a PDF). It may explain it better...
But in short, C does not implement arrays in the same way other languages do. In C, an array of any data type is converted to a pointer to its first element in most expressions. int a[10]; sets aside memory to hold 10 integers consecutively, and the name a then behaves much like an int * to the first of them (although a itself is not a pointer variable). In memory it would look like:
a[0] a[9]
|0|0|0|0|0|0|0|0|0|0| (if all were initialized to zero)
Likewise, you would be tempted to think of float b[2][2][2]; as a 3 dimensional array, 2x2x2, with some special layout. It is not: it is really a place in memory, starting at b[0], that has room for 8 floating point numbers stored one after another. Look at the illustrations HERE.
I am a bit confused about pointers to arrays versus normal pointers, and how to access elements through each.
I have tried this...
int *ptr1, i;
int (*ptr2)[3];
int myArray[3] = {1, 1, 1};
int myArray1[5] = {1, 1, 1, 1, 1};
ptr1 = myArray;
ptr2 = myArray1;// compiles fine even though myArray1 contains 5 elements
// and ptr2 is pointing to array of 3 elements.
printf("%d",ptr2[3]); // prints some garbage.
Why this statement is printing garbage? What is the correct statement?
Can anyone explain?
We can also declare pointer to array as
int (*ptr)[]; // ptr is a pointer to array.
When you do
ptr2 = myArray1;
the compiler will give you a warning message saying that the types are not compatible.
In some contexts, arrays decay into pointers. The warning appears because, when an array decays, the resulting type is a pointer to its element type. In this case, when you do
ptr1 = myArray;
myArray decays into int *.
But when you do,
ptr2 = myArray1;
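myArray1 decays into a pointer of type int *, but the type of ptr2 is int (*)[3].
To avoid the warning, take the address of the whole array, and make sure the array size in the declaration of ptr2 matches (here, int (*ptr2)[5]):
ptr2 = &myArray1; // &myArray1 is the address of the array; its type is int (*)[5]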
Why this statement is printing garbage? What is the correct statement? Can anyone explain?
printf("%d",ptr2[3]);// prints some garbage.
Yes, but why? Let's see the correct statement first (note that the index should be less than 3):
printf("myArray1[%d] = %d\n", 2, (*ptr2)[2]);
We have to use (*ptr2)[i] to print the array element. This is because ptr2 by itself is the address of the whole array (not the address of some int); by dereferencing it, (*ptr2), we get the array, which in turn is converted to the address of its 0th element and can be indexed.
The pointer ptr2 is a pointer to an int array of size three.
ptr2 = myArray1; // you may get a warning for this
should be:
ptr2 = &myArray1; // use & operator
And
printf("%d", ptr2[3]);
should be:
printf("%d", (*ptr2)[2]); // index can't be 3 for an array of size three
// ^^^^^^^
Notice that the parentheses around *ptr2 are needed because the precedence of the [] operator is higher than that of the * dereference operator (whereas if you use a pointer to int you don't need parentheses, as in the code above).
Read "Inconsistency in using pointer to an array and address of an array directly", where I have explained both ways to access array elements: (1) using a pointer to int and (2) using a pointer to an array.
Whereas ptr1 = myArray; is just fine, and you can simply access array elements using ptr1[i] (note that i should be 0 to 2).
int (*ptr2)[3];
This declares ptr2 as a pointer to an array of three ints. In other words, ptr2 holds an address, and the object at that address is a whole int[3] array; (*ptr2) is that array.
So when you write ptr2 = myArray1 you are storing the address of the first element of myArray1 in ptr2, with a mismatched type (int * instead of int (*)[3]).
So when you go to print the value using ptr2[3] you are actually taking the address held in ptr2, advanced by 3*sizeof(*ptr2) bytes (three whole int[3] arrays), and passing the array found there, which decays to a pointer, to printf. Which is why it looks like garbage: it's some memory address.
What should be in the print statement is (*ptr2)[i], with i from 0 to 2, which means "Go to the array that ptr2 points to. Next, advance i*sizeof(int) bytes from its start and get the value stored at that address."
Arrays in C are nothing more than a contiguous region of memory. Which is why, as far as accessing elements goes, saying:
int *x = malloc(3*sizeof(int));
is much the same as:
int x[3];
Since [] is just pointer arithmetic followed by a dereference, if we wanted to access the second element of either array, we could write:
int value = x[1];
or
int value = *(x + 1); // *x would be index 0; x + 1 already advances by sizeof(int) bytes
Two of the major differences are that the [] notation is a bit easier to read but, apart from C99 variable-length arrays, lengths must be determined at compile time. Whereas, with the *x/malloc() notation, we can dynamically create an array of arbitrary length.
a[1][2] is expanded by the compiler like this: *( *(a+1) +2 ). So if a is declared as int **a,
the foregoing expression should be explained like this:
Get the address of a from the symbol table. Note it is a pointer to a pointer.
Now we add 1 to it, so it points to somewhere next to where a points.
Then we dereference it. I think this is undefined behavior, for we don't know if a+1 is valid and we access it arbitrarily.
OK, if we are lucky enough that we successfully get the value *(a+1), we add 2 to it.
At this step, we dereference (*(a+1) +2 ). Will we be lucky now?
I read this in Expert C Programming in Chapter 10. Is this correct?
New answer, after edited question:
For a[1][2] to be valid, given that a is defined as int **a;, both of these must be true:
a must point at the first of two sequential int * objects;
The second of those int * objects must point at the first of three sequential int objects.
The simplest way to arrange this is:
int x[3];
int *y[2] = { 0, x };
int **a = y;
Original answer:
If the expression a[1][2] is valid, then there are many distinct possibilities for the type of a (even neglecting qualifiers like const):
type **a; (pointer to pointer to type)
type *a[n]; (array of n pointers to type)
type (*a)[n]; (pointer to array of n type)
type a[m][n]; (array of m arrays of n type)
Precisely how the expression is evaluated depends on which of these types a actually has.
First a + 1 is calculated. If a is itself a pointer (either case 1 or case 3), then the value of a is directly loaded. If a is an array (case 2 or case 4), then the address of the first element of a is loaded (which is identical to the address of a itself).
This pointer is now offset by 1 object of the type that it points to. In case 1 and case 2, it would be offset by 1 "pointer to type" object; in case 3 and case 4, it would be offset by 1 "array of n type" object, which is the same as offsetting by n type objects.
The calculated (offset) pointer is now dereferenced. In cases 1 and 2, the result has type "pointer to type", in cases 3 and 4 the result has type "array of n type".
Next *(a + 1) + 2 is calculated. As in the first case, if *(a + 1) is a pointer, then the value is used directly (this time, cases 1 and 2). If *(a + 1) is an array (cases 3 and 4), then the address of the first element of that array is taken.
The resulting pointer (which, at this point, always has type "pointer to type") is now offset by 2 type objects. The final offset pointer is now dereferenced, and the type object is retrieved.
Let's say the definition of a looks something like this:
int a[2][2] = {{1, 2}, {3, 4}};
Here's what the storage that the symbol a names looks like:
[ 1 ][ 2 ][ 3 ][ 4 ]
In C, when you perform arithmetic on a pointer, the actual amount by which the pointer value is incremented or decremented is based on the size of the type it points to. The type contained in the first dimension of a is int[2], so when we ask C to calculate the pointer value (a + 1), it takes the location named by a and increments it by the size of int[2], which results in a pointer referring to the row that starts at the integer value [3]. When you dereference this pointer you get that row, an int[2], which is converted to a pointer to the int 3. Adding 2 to it gives a pointer one past the end of the row, and dereferencing that is out of bounds, so a[1][2] makes no sense for this array.
So now let's say the array contains pointers:
char const * one = "one",
           * two = "two",
           * three = "three",
           * four = "four";
char const * a[2][2] = {{one, two}, {three, four}};
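Add 1 to a and then dereference it, and you get the row {three, four}, which is converted to a pointer to its first element, the char pointer referring to the string "three". Dereference once more and add two, and you'll get a pointer referring to the now shorter string "ree". Dereference that, and you get the char value 'r'. Note that this is a[1][0][2], not a[1][2]: a[1][2] itself indexes past the end of the two-element row, so if it appears to work it is only by sheer luck that you avoided a memory protection fault.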