This question already has answers here:
One-dimensional access to a multidimensional array: is it well-defined behaviour?
My book (Pointers on C by Kenneth Reek) says that the following is illegal although it works fine.
int arr[5][5];
int *p=&arr[2][2];
p=p+3; // As array is stored in row major form I think this
//should make p point to arr[3][0]
The book says leaving one row to the next row is illegal. But I cannot understand why.
The reason that the book says it's illegal is because pointer arithmetic is guaranteed to work only on pointers to elements in the same array, or one past the end.
arr is an array of 5 elements, in which each element is an array of 5 integers. Thus, theoretically, if you want to have pointers to array elements in arr[i], you can only do pointer arithmetic that yields pointers in the range &arr[i][0..4] or arr[i]+5 keeping i constant.
For example, imagine arr was a one-dimensional array of 5 integers. Then a pointer p could only point to one of &arr[0..4] or to arr+5 (one past the end). The same rule applies to multi-dimensional arrays.
With int arr[5][5];, you can only do pointer arithmetic such that you always have a pointer that is in the range &arr[i][0..4] or arr[i]+5 - that's what the rules say. It just may be confusing because these are arrays inside arrays, but the rule is the same no matter what. Conceptually, arr[0] and arr[1] are different arrays, and even though you know they are contiguous in memory, it is illegal to do pointer arithmetic between elements of arr[0] and arr[1]. Remember that conceptually, each element in arr[i] is a different array.
In your example, however, p+3 points exactly one past the end of arr[2], so forming it looks valid to me; only dereferencing it would not be. That makes it a poor choice of example. Had the author chosen p+4, the example would be correct.
Either way, I have never had any problems with flattening multidimensional arrays in C using similar methods.
Also see this question, it has got other useful information: One-dimensional access to a multidimensional array: well-defined C?
I gelled on this for awhile, and I'll try my best to explain where I think he's coming from, though without reading the book, it will be at-best-conjecture.
First, technically, the increment you propose (or he proposed) isn't illegal; dereferencing it is. The standard allows you to advance a pointer to one-past the last element of the array sequence from which it is being sourced for valuation, but not for dereference. Change it to p = p + 4 and both are illegal.
That aside, the linear footprint of the array notwithstanding, ar[2] has a type, and it is int[5]. If you don't believe that, consider the following, all of which is correctly typed:
int ar[5][5];
int (*sub)[5] = ar+2; // sub points to the 3rd row
int *col = *sub + 2; // col points to the 3rd column of the 3rd row
int *p = col + 3; // p points one past the 5th (last) column of the 3rd row
Whether this lands on ar[3][0] isn't relevant. You're exceeding the declared magnitude of the dimension participating in the pointer-math. The result cannot legally be dereferenced, and were it larger than a 3-offset, nor could it even be legally evaluated.
Remember, the array being addressed is ar[2]; not just ar, and said-same is declared to be size=5. That it is buttressed up against two other arrays of the same ilk isn't relevant to the addressing currently being done. I believe Christoph's answer to the question proposed as a duplicate should have been the one selected for outright-solution. In particular, the reference to C99 §6.5.6, p8 which, though wordy, appears below with:
When an expression that has integer type is added to or subtracted
from a pointer, the result has the type of the pointer operand. If the
pointer operand points to an element of an array object, and the array
is large enough, the result points to an element offset from the
original element such that the difference of the subscripts of the
resulting and original array elements equals the integer expression.
In other words, if the expression P points to the i-th element of an
array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N
(where N has the value n) point to, respectively, the i+n-th and
i−n-th elements of the array object, provided they exist. Moreover, if
the expression P points to the last element of an array object, the
expression (P)+1 points one past the last element of the array object,
and if the expression Q points one past the last element of an array
object, the expression (Q)-1 points to the last element of the array
object. If both the pointer operand and the result point to elements
of the same array object, or one past the last element of the array
object, the evaluation shall not produce an overflow; otherwise, the
behavior is undefined. If the result points one past the last element
of the array object, it shall not be used as the operand of a unary *
operator that is evaluated.
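Sorry for the spam, but the closing sentences of that paragraph (the result must stay within the same array object or one past its end, and a one-past-the-end pointer must not be dereferenced) are what I believe is relevant to your question. By addressing as you are, you're leaving the array being addressed, and as such walking into UB. In short, it works (usually), but it isn't legal.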
Yes. It is illegal in C. In fact, by doing so you are lying to your compiler. p points to the element arr[2][2] (and has type pointer to int), i.e., the 3rd element of the third row. The statement p=p+3; increments p to &arr[2][5], which has the same address as &arr[3][0].
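But this assumption fails whenever the rows are not actually contiguous, for example when each row is allocated separately and the allocator rounds every request up (to a power of 2, 2^n, or to some other multiple, depending on the architecture). In that case the distance from the start of one row to the start of the next is larger than the row itself.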
See a test program in which the memory allocated is 5 allocations of 10 integers. On some machines, memory allocations are a multiple of 16 bytes, so the 40 bytes requested is rounded up to 48 bytes per allocation:
#include <stdio.h>
#include <stdlib.h>
extern void print_numbers(int *num_ptr, int n, int m);
extern void print_numbers2(int **nums, int n, int m);
int main(void)
{
    int **nums;
    int n = 5;
    int m = 10;
    int count = 0;
    // Allocate rows
    nums = (int **)malloc(n * sizeof(int *));
    // Allocate columns for each row
    for (int i = 0; i < n; i++)
    {
        nums[i] = (int *)malloc(m * sizeof(int));
        printf("%2d: %p\n", i, (void *)nums[i]);
    }
    // Populate table
    for (int i = 0; i < n; i++)
        for (int j = 0; j < m; j++)
            nums[i][j] = ++count;
    // Print table
    puts("print_numbers:");
    print_numbers(&nums[0][0], n, m);
    puts("print_numbers2:");
    print_numbers2(nums, n, m);
    return 0;
}
void print_numbers(int *nums_ptr, int n, int m)
{
    int (*nums)[m] = (int (*)[m])nums_ptr;
    for (int i = 0; i < n; i++)
    {
        printf("%2d: %p\n", i, (void *)nums[i]);
        for (int j = 0; j < m; j++)
        {
            printf("%3d", nums[i][j]);
        }
        printf("\n");
    }
}
void print_numbers2(int **nums, int n, int m)
{
    for (int i = 0; i < n; i++)
    {
        printf("%2d: %p\n", i, (void *)nums[i]);
        for (int j = 0; j < m; j++)
            printf("%3d", nums[i][j]);
        printf("\n");
    }
}
Sample output on Mac OS X 10.8.5; GCC 4.8.1:
0: 0x7f83a0403a50
1: 0x7f83a0403a80
2: 0x7f83a0403ab0
3: 0x7f83a0403ae0
4: 0x7f83a0403b10
print_numbers:
0: 0x7f83a0403a50
1 2 3 4 5 6 7 8 9 10
1: 0x7f83a0403a78
0 0 11 12 13 14 15 16 17 18
2: 0x7f83a0403aa0
19 20 0 0 21 22 23 24 25 26
3: 0x7f83a0403ac8
27 28 29 30 0 0 31 32 33 34
4: 0x7f83a0403af0
35 36 37 38 39 40 0 0 41 42
print_numbers2:
0: 0x7f83a0403a50
1 2 3 4 5 6 7 8 9 10
1: 0x7f83a0403a80
11 12 13 14 15 16 17 18 19 20
2: 0x7f83a0403ab0
21 22 23 24 25 26 27 28 29 30
3: 0x7f83a0403ae0
31 32 33 34 35 36 37 38 39 40
4: 0x7f83a0403b10
41 42 43 44 45 46 47 48 49 50
Sample output on Win7; GCC 4.8.1:
When we subtract a pointer from another pointer the difference is not equal to how many bytes they are apart but equal to how many integers (if pointing to integers) they are apart. Why so?
The idea is that you're pointing to blocks of memory
+----+----+----+----+----+----+
| 06 | 07 | 08 | 09 | 10 | 11 | mem
+----+----+----+----+----+----+
| 18 | 24 | 17 | 53 | -7 | 14 | data
+----+----+----+----+----+----+
If you have int* p = &(array[5]) then *p will be 14. Going p=p-3 would make *p be 17.
So if you have int* p = &(array[5]) and int *q = &(array[3]), then p-q should be 2, because the pointers point to memory locations that are 2 blocks apart.
When dealing with raw memory (arrays, lists, maps, etc) draw lots of boxes! It really helps!
Because everything in pointer-land is about offsets. When you say:
int array[10];
array[7] = 42;
What you're actually saying in the second line is:
*( &array[0] + 7 ) = 42;
Literally translated as:
* = "what's at"
(
& = "the address of"
array[0] = "the first slot in array"
plus 7
)
set that thing to 42
And if we can add 7 to make the offset point to the right place, we need to be able to have the opposite in place, otherwise we don't have symmetry in our math. If:
&array[0] + 7 == &array[7]
Then, for sanity and symmetry:
&array[7] - &array[0] == 7
So that the answer is the same even on platforms where integers are different lengths.
Say you have an array of 10 integers:
int intArray[10] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
Then you take a pointer to intArray:
int *p = intArray;
Then you increment p:
p++;
What you would expect, because p starts at intArray[0], is for the incremented value of p to be intArray[1]. That's why pointer arithmetic works like that. See the code here.
"When you subtract two pointers, as long as they point into the same array, the result is the number of elements separating them"
Check for more here.
The way pointer subtraction behaves is consistent with the behaviour of pointer addition. It means that p1 + (p2 - p1) == p2 (where p1 and p2 are pointers into the same array).
Pointer addition (adding an integer to a pointer) behaves in a similar way: p1 + 1 gives you the address of the next item in the array, rather than the next byte in the array - which would be a fairly useless and unsafe thing to do.
The language could have been designed so that pointers are added and subtracted the same way as integers, but it would have meant writing pointer arithmetic differently, and having to take into account the size of the type pointed to:
p2 = p1 + n * sizeof(*p1) instead of p2 = p1 + n
n = (p2 - p1) / sizeof(*p1) instead of n = p2 - p1
So the result would be code that is longer, and harder to read, and easier to make mistakes in.
When applying arithmetic operations on pointers of a specific type, you always want the resulting pointer to point to a "valid" (meaning the right step size) memory-address relative to the original starting-point. That is a very comfortable way of accessing data in memory independently from the underlying architecture.
If you want to use a different "step-size" you can always cast the pointer to the desired type:
int a = 5;
int* pointer_int = &a;
double* pointer_double = (double*)pointer_int; /* totally useless in that case, but it works */
@fahad Pointer arithmetic goes by the size of the data type being pointed to. So when your pointer is of type int *, you should expect pointer arithmetic in steps of sizeof(int) (commonly 4 bytes). Likewise, for a char pointer all operations on the pointer will be in terms of 1 byte.
#include<stdio.h>
int main(){
    int a1[]={6,7,8,18,34,67};
    int a2[]={23,56,28,24};
    int a3[]={-12,27,-31};
    int *y[]={a1,a2,a3};
    int **a= y;
    printf("%d\n",a[0][2]);
    printf("%d\n",*a[2]);
    printf("%d\n",*(++a[0]));
    printf("%d\n",*(++a)[0]);
    printf("%d\n",a[-1][1]);
    return 0;
}
When I run the above code output is 8,-12,7,23,8. But if i change the last 3 lines to
printf("%d\n",*(++a[2]));
printf("%d\n",*(++a)[1]);
printf("%d\n",a[-1][1]);
output is 8,-12,27,27,7. I'm unable to understand last printf statement. How does a[-1][something] is calculated ? And according to me *(++a)[1] should print 56 instead of 27 !
Pointers and array bases are in fact addresses in virtual memory. In C, new addresses can be calculated from them. Since the compiler knows the size of the type the pointer points to (e.g. an int is commonly 4 bytes), pointer +/- 1 means the address +/- that size.
The operator * means to get the value stored in the specified address.
Another trick here is operator precedence: the postfix [] binds more tightly than the prefix ++ and the unary *, so ++a[2] means ++(a[2]), and *(++a)[1] means *((++a)[1]).
If you understand what I mean above, your problem should be resolved.
according to me *(++a)[1] should print 56 instead of 27 !
++a increments a to the next int *, so where it pointed to y[0] (equal to a1), it now points to y[1] (equal to a2). Then [1] in turn designates the next int * after y[1], i.e. y[2], which is equal to a3+1 (due to the preceding ++a[2]). Lastly, * designates the int which y[2] points to, i.e. a3[1], which is 27.
I can't wrap my head around the idea of an array of pointers. The problem is that I'm trying to iterate through a list of pointers (or at least get the second value from the pointer array). I understand that an integer is 4 bytes long (assuming I'm on 32-bit). What I'm trying to do is take the first address, which points to a[0], and add 4 bytes to it, which in my opinion should result in a[1]. However, this behaves as if I were just adding the value to the index, i.e. f[0] + 4 -> f[4].
And I don't quite understand why.
#include <stdio.h>
int main(void)
{
    int a[6] = {10,2,3,4,20, 42};
    int *f[6];
    size_t n = sizeof(a)/sizeof(a[0]);
    for(size_t i = 0; i < n; i++) f[i] = &a[i];
    for(size_t i = 0; i < n; i++) printf("Current pointer points to %i\n", *(*f+i));
    /* f[0] + sizeof(int) advances by sizeof(int) elements, not bytes */
    printf("The value is %i\n", *(f[0]+sizeof(int)));
    return 0;
}
Pointer arithmetic takes into account the size of the type being pointed to, not the size of the pointer itself.
f[0] + 4 will multiply 4 by the size of the integer type.
Here's an online disassembler: https://godbolt.org/.
When I type the code f[0] + 4, the disassembly appears as
add QWORD PTR [rbp-8], 16
Meaning it has multiplied the 4 by 4 (32-bit = 4 bytes) to make 16.
An array is a contiguous chunk of memory; in most expressions its name is converted to a pointer to its first element. int a[6] = {10,2,3,4,20, 42}; actually creates a chunk holding [0x0000000A, 0x00000002, 0x00000003, 0x00000004, 0x00000014, 0x0000002A], and a refers to where that chunk starts.
Using an index a[n] basically means go to the position of a (start of the array), then advance by n*sizeof(int) bytes.
a[0] means Go to position of a, then don't jump
a[1] means Go to position of a, then jump 1 time the size of an integer
a[2] means Go to position of a, then jump 2 times the size of an integer
supposing a is at the address 0xF00D0000, and sizeof(int) is 4:
a[0] // the int at address 0xF00D0000
a[1] // the int at address 0xF00D0004
a[2] // the int at address 0xF00D0008
a[32] // would be at address 0xF00D0080 (address arithmetic only; out of bounds for a 6-element array)
I hope this makes sense.
int main()
{
int (*x)[5]; //pointer to an array of integers
int y[6] = {1,2,3,4,5,6}; //array of integers
int *z; //pointer to integer
z = y;
for(int i=0;i<6;i++)
printf("%d ",z[i]);
x = y;
for(int i=0;i<6;i++)
printf("%d ",(*x)[i]);
return 0;
}
Both the above printfs print numbers 1 through 6.
If both "pointer to array of integers" and "pointer to integer" can do the same thing, do they have the same internal representation?
EDIT: This code does give warnings when compiled as pointed out by the answers below, however it does print the values correctly both the time on my x86_64 machine using gcc
Firstly, your code will not compile. The array has type int[6] (6 elements), while the pointer has type int (*)[5]. You can't make this pointer to point to that array because the types are different.
Secondly, when you initialize (assign to) such a pointer, you have to use the & on the array: x = &y, not just a plain x = y as in your code.
I assume that you simply typed the code up, instead of copy-pasting the real code.
Thirdly, about the internal representation. Generally, in practice, you should expect all data pointers to use the same internal representation. Moreover, after the above assignments (if written correctly), the pointers will have the same numerical value. The difference between int (*)[5] and int * exists only on the conceptual level, i.e. at the level of the language: the types are different. It has some consequences. For example, if you increment your z it will jump to the next member of the array, but if you increment x, it will jump over the whole array etc. So, these pointers do not really "do the same thing".
The short answer: There is a difference, but your example is flawed.
The long answer:
The difference is that int* points to an int type, but int (*x)[6] points to an array of 6 ints. Actually in your example,
x = y;
is undefined** behavior, you know these are of two different types, but in C you do what you want. I'll just use a pointer to an array of six ints.
Take this modified example:
int (*x)[6]; //pointer to an array of integers
int y[6] = {1,2,3,4,5,6}; //array of integers
int *z; //pointer to integer
int i;
z = y;
for(i = 0;i<6;i++)
printf("%d ",z[i]);
x = y; // should be x = &y but leave it for now!
for(i = 0;i<6;i++)
printf("%d ",x[i]); // note: x[i] not (*x)[i]
First,
1 2 3 4 5 6
Would be printed. Then, we get to x[0]. x[0] is nothing but an array of 6 ints. An array in C decays to the address of its first element when it is passed to a function. So, the address of y would be printed (through %d, which is not a valid way to print a pointer), then the address of the next array in the next iteration. For example, on my machine:
1 2 3 4 5 6 109247792 109247816 109247840 109247864 109247888 109247912
As you can see, the difference between consecutive addresses is nothing but:
sizeof(int[6]) // 24 on my machine!
In summary, these are two different pointer types.
** I think it is undefined behavior, please feel free to correct my post if it is wrong.
Hope this code helps:
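#include <stdio.h>
int main(void) {
    int arr[5] = {4,5,6,7,8};
    int (*pa)[5] = &arr;
    int *pi = arr;
    for(int i = 0; i< 5; i++) {
        printf("\n%d %d", arr[i], (*pa)[i]);
    }
    /* %p with a void * argument is the portable way to print a pointer */
    printf("\n%p -- %p", (void *)pi, (void *)pa);
    pi++;
    pa++;
    printf("\n%p -- %p", (void *)pi, (void *)pa);
    return 0;
}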
prints the following:
4 4
5 5
6 6
7 7
8 8
0x5fb0be70 -- 0x5fb0be70
0x5fb0be74 -- 0x5fb0be84
UPDATE:
You can notice that pointer to integer incremented by 4 bytes (size of 32 bit integer) whereas pointer to array of integer incremented by 20 bytes (size of int arr[5] i.e. size of 5 int of 32 bit each). This demonstrates the difference.
To answer your question from the title, from the comp.lang.c FAQ: Since array references decay into pointers, if arr is an array, what's the difference between arr and &arr?
However, the code you've posted has other issues (you're assigning y, not &y to x, and y is a 6-element array, but *x is a 5-element array; both of these should generate compilation warnings).
Who knows - this code exhibits undefined behavior:
printf("%d ",(*x)[i]);
Hope this code helps.
#include <stdio.h>
#include <stdlib.h>
#define MAXCOL 4
#define MAXROW 3
int main()
{
int i,j,k=1;
int (*q)[MAXCOL]; //pointer to an array of integers
/* As malloc is type casted to "int(*)[MAXCOL]" and every
element (as in *q) is 16 bytes long (I assume 4 bytes int),
in all 3*16=48 bytes will be allocated */
q=(int(*)[MAXCOL])malloc(MAXROW*sizeof(*q));
for(i=0; i<MAXROW; i++)
for(j=0;j<MAXCOL;j++)
q[i][j]=k++;
for(i=0;i<MAXROW;i++){
for(j=0;j<MAXCOL;j++)
printf(" %2d ", q[i][j]);
printf("\n");
}
}
#include<stdio.h>
int main(void)
{
int (*x)[6]; //pointer to an array of integers
int y[6] = {11,22,33,44,55,66}; //array of integers
int *z; //pointer to integer
int i;
z = y;
for(i = 0;i<6;i++)
printf("%d ",z[i]);
printf("\n");
x = &y;
for(int j = 0;j<6;j++)
printf("%d ",*(x[0]+j));
return 0;
}
//OUTPUT::
11 22 33 44 55 66
11 22 33 44 55 66
A pointer to an array is best suited to multi-dimensional arrays, but the example above uses a one-dimensional array. So in the second for loop we use *(x[0]+j) to print each value. Here, x[0] means the 0th array (row).
And if we try to print the values using printf("%d ",*x[i]);
the first value is 11 and the rest are garbage, because x[1] and beyond refer to rows that do not exist (accessing them is undefined behavior).
One should understand how (*x)[i] is evaluated. By definition, it is equivalent to
*((*x)+i), which is nothing but the ith element of the array to which x is pointing. This is also a way to have a pointer pointing into a 2d array: it points to one row, so the number of rows is not part of the pointer's type.
For example:
int arr[][2]={{1,2},{3,4}};
int (*x)[2];
x=arr; /* Now x points to the first row of the 2d array arr.*/
Here x points to a row of a 2d array that has 2 integers in every row, and the array elements are stored contiguously. So (*x)[0] is arr[0][0] (which is 1), (*x)[1] is arr[0][1] (which is 2) and so on. (*(x+1))[0] is arr[1][0] (3 in this case), (*(x+1))[1] is arr[1][1] (4 in this case) and so on.
Now, a 1d array could be treated as nothing but a 2d array having only one row with as many columns.
int y[6] = {1,2,3,4,5,6};
int (*x)[6];
x = &y;
This means x is a pointer to an array having 6 integers. So (*x)[i] which is equivalent to *((*x)+i) will print ith index value of y.