subtracting two pointers(arrays), (C language) [duplicate] - c

int vector[] = { 28, 41, 7 };
int *p0 = vector;
int *p1 = vector + 1;
int *p2 = vector + 2;
I know result of
printf("%p, %p, %p\n", p0, p1, p2);
is, for example, 100, 104, 108,
but why is the result of
printf("p2-p0: %td\n", p2 - p0);
printf("p2-p1: %td\n", p2 - p1);
printf("p0-p1: %td\n", p0 - p1);
2, 1, -1
and not 8, 4, -4?

When you subtract two pointers (they must have the same type and point into the same array, otherwise the result is meaningless), the result is the difference of the element indexes, not the difference of the addresses:
type * p1 = ...;
type * p2 = ...;
(p1 - p2) == (((char *) p1) - ((char *) p2)) / sizeof(type)
The same applies to vector + n: it gives the address of the element at index n, not ((char *) vector) + n. So
type * p = ...;
int n = ...;
((char *) (p + n)) == (((char *) p) + n * sizeof(type))

Related

Passing Two Dimension array to a function as a single pointer

I need to pass a two-dimensional array to a function as a single pointer. There are different approaches, but due to some constraints (code generation), I want to pass a single pointer only. I have macros which contain the size of each dimension. I implemented it the following way, but I am not sure whether it will also work for N dimensions.
#define size_1D 3
#define size_2D 3
void fun(int *arr)
{
int i,total_size = size_1D* size_2D;
for(i = 0; i < total_size ; i++)
{
int value = arr[i];
}
}
int main()
{
int arr[size_1D][size_2D] = {{1,2,7},{8,4,9}};
fun(&arr[0][0]);
}
Is there any pitfall in the approach above?
void fun(int (*arr)[3]);
or exactly equivalent, but maybe more readable:
void fun(int arr[][3]);
In main, arr is a two-dimensional array with 3 rows and 3 columns. When it is passed to a function it decays to a pointer to its first row, which has the type "pointer to an array of 3 ints". So the function needs a parameter of that type, and you can then access the data normally, using arr[a][b].
#define size_1D 3
#define size_2D 3
void fun(int arr[][3])
{
for(int i = 0; i < size_1D ; i++) {
for(int j = 0; j < size_2D ; j++) {
int value = arr[i][j];
}
}
}
int main()
{
int arr[size_1D][size_2D] = {{1,2,7},{8,4,9}};
fun(arr);
}
You can also pass the sizes as arguments and use a variable length array declaration in the function parameter list. The compiler then does the index arithmetic for you.
#include <stdlib.h>
void fun(size_t xmax, size_t ymax, int arr[xmax][ymax]);
// is equivalent to
void fun(size_t xmax, size_t ymax, int arr[][ymax]);
// is equivalent to
void fun(size_t xmax, size_t ymax, int (*arr)[ymax]);
void fun(size_t xmax, size_t ymax, int arr[xmax][ymax])
{
for(size_t i = 0; i < xmax ; i++) {
for(size_t j = 0; j < ymax ; j++) {
int value = arr[i][j];
}
}
}
int main()
{
int arr[3][4] = {{1,2,7},{8,4,9}};
fun(3, 4, arr);
}
Edit:
We know that the array subscript operator is exactly identical to dereferencing the sum of its operands:
a[b] <=> *(a + b)
From pointer arithmetic we know that:
type *pnt;
int a;
pnt + a = (typeof(pnt))(void*)((uintptr_t)(void*)pnt + a * sizeof(*pnt))
pnt + a = (type*)(void*)((uintptr_t)(void*)pnt + a * sizeof(type))
And that an array, converted to a pointer, has the same value as a pointer to its first element:
type pnt[A];
assert((uintptr_t)pnt == (uintptr_t)&pnt[0]);
assert((uintptr_t)pnt == (uintptr_t)&*(pnt + 0));
assert((uintptr_t)pnt == (uintptr_t)&*pnt);
So:
int arr[A][B];
then:
arr[x][y]
is equivalent to (ignore warnings, kind-of pseudocode):
*(*(arr + x) + y)
*( *(int[A][B])( (uintptr_t)arr + x * sizeof(int[B]) ) + y )
// ---- x * sizeof(int[B]) = x * B * sizeof(int)
*( *(int[A][B])( (uintptr_t)arr + x * B * sizeof(int) ) + y )
// ---- C11 6.5.2.1p3
*( (int[B])( (uintptr_t)arr + x * B * sizeof(int) ) + y )
*(int[B])( (uintptr_t)( (uintptr_t)arr + x * B * sizeof(int) ) + y * sizeof(int) )
// ---- *(int[B])( ... ) = (int)dereference( ... ) = *(int*)( ... )
// ---- drop the parentheses - conversion from size_t to uintptr_t should be safe
*(int*)( (uintptr_t)arr + x * B * sizeof(int) + y * sizeof(int) )
*(int*)( (uintptr_t)arr + ( x * B + y ) * sizeof(int) )
*(int*)( (uintptr_t)( &*arr ) + ( x * B + y ) * sizeof(int) )
// ---- (uintptr_t)arr = (uintptr_t)&arr[0][0]
*(int*)( (uintptr_t)( &*(*(arr + 0) + 0) ) + ( x * B + y ) * sizeof(int) )
*(int*)( (uintptr_t)( &arr[0][0] ) + ( x * B + y ) * sizeof(int) )
*(int*)( (uintptr_t)&arr[0][0] + ( x * B + y ) * sizeof(int) )
// ---- decayed typeof(&arr[0][0]) = int*
*( &arr[0][0] + ( x * B + y ) )
(&arr[0][0])[x * B + y]
So:
arr[x][y] == (&arr[0][0])[x * B + y]
arr[x][y] == (&arr[0][0])[x * sizeof(*arr)/sizeof(**arr) + y]
On a sane architecture, where sizeof(uintptr_t) == sizeof(size_t) == sizeof(int*) == sizeof(int**) and so on, and where accessing data through an int* pointer is no different from accessing it through an int(*)[B] pointer, you should be safe treating the array as one-dimensional through a pointer to its first element, as the operations should be equivalent ("safe" except for out-of-bounds accesses, which are never safe).
Note that, strictly speaking, this is undefined behavior according to the C standard and will not work on all architectures. Example: there could be an architecture where data of type int[A] is stored in a different memory bank than int[A][B] data (by hardware, by design). The type of the pointer then tells the compiler which data bank to choose, so accessing the same data through a pointer with the same value but a different type leads to UB, as the compiler chooses a different data bank to access the data.

C Simple Arrays & Pointers

int a[] = {10, 15, 20, 25};
int b[] = {50, 60, 70, 80, 90};
int *x[] = {a, b};
int *y[] = {a + 2, b + 3};
int *p;
int *q;
int **r;
p = a;
q = y[1];
r = &q;
*p = &p[3] - y[0];
r[0][1] = **r - y[0][1];
What are the contents of a and b at the end?
I figured out that *p is a[0], and &p[3] - y[0] is just 3 - 2, so a[0] = 3 - 2 = 1. Therefore, a[] = {1, 10, 15, 20} (correct me if I am wrong), but b[] is where I get lost. I have no idea how the last line of the code works. No idea on what r[0][1] refers to, so getting the contents for b[] is confusing. P.S. this is for C.
Remember that the identity *(p + k) == p[k] (or p + k == &p[k]) means that you can always rewrite dereferencing as indexing and vice versa, so if an expression is confusing you can try a different form and see if it makes more sense.
I personally find indexing easier to reason about:
Since r = &q, both r[0] and *r are the same as q:
q[1] = *q - y[0][1];
or
q[1] = q[0] - y[0][1];
Since q is y[1], this gives:
y[1][1] = y[1][0] - y[0][1];
y[0] is a + 2 and y[1] is b + 3:
(b + 3)[1] = (b + 3)[0] - (a + 2)[1];
which is
*(b + 3 + 1) = *(b + 3 + 0) - *(a + 2 + 1);
which is
*(b + 4) = *(b + 3) - *(a + 3);
which is
b[4] = b[3] - a[3];
that is,
b[4] = 80 - 25;
The line int **r; declares a pointer to an int *. In other words, r is a pointer to a pointer to an int. If you recall that the syntax x[y] is equivalent to *(x + y), you might get an idea for what r[0][1] does.
r[0][1] --> *((*(r + 0)) + 1)
Keeping in mind that r[0][1] is on the LHS of the assignment operator, you are storing to that memory location.

Need help in understanding the solution to the C exercise involving pointers

I need help in understanding how we got the values in the table below for Loc3 and Loc4.
When I made a table on my own, I arrived at totally different entries for those columns.
Thank you!
int x = 42; /* x is at address 100 */
int y = 13; /* y is at address 104 */
int *p; /* p is at address 108 */
int **p2; /* p2 is at address 112 */
/* Location 1 */
p = &y;
p2 = &p;
/* Location 2 */
*p2 = &x;
**p2 = 11;
/* Location 3 */
*p = 12;
/* Location 4 */
For instance, x at loc3 becomes 11 because you set **p2 to 11, which modifies the value at that memory location. (double star is a pointer to a pointer). Ampersand gets the address.
To elaborate:
*p2 = &x;
**p2 = 11;
In Loc2 you set p2 = &p, which means p2 is now pointing to the address of p, which is 108.
But now in Loc3, you set what p2 is pointing to to the address of x. In other words, since p2 was pointing to the address of p, now you're saying that p should instead point to the address of x (which is also why p becomes 100).
Then **p2 modifies the value at that address of x to be 11 (through p), hence loc3's x value becomes 11.
Location 1:
int x = 42;
int y = 13;
int *p;
int **p2;
Neither p nor p2 points anywhere yet; both are uninitialized.
Location 2:
p = &y;
p points to y.
p2 = &p;
p2 points to p.
No changes to x or y.
Location 3:
*p2 = &x;
Since p2 points to p, dereferencing p2 and assigning a value to it changes p to point to x. Same as doing p = &x.
**p2 = 11;
Dereference once to get to p, dereference again to get to x, and assign 11 to it. Same as doing x = 11 or *p = 11.
No change to y or p2.
Location 4:
*p = 12;
Dereference p to get to x and assign 12 to it. No change to y or p2 or p.
When you start out, p and p2 are uninitialized and contain indeterminate values, hence the ?? in both entries under Loc 1.
p = &y;
assigns the location of y (104) to p.
p2 = &p;
assigns the location of p (108) to p2. Note that the type of the expression &p is int **, which matches the type of the variable p2. So after these two statements, all of the following are true:
p2 == &p == 108 // all expressions have type int **
*p2 == p == &y == 104 // all expressions have type int *
**p2 == *p == y == 13 // all expressions have type int
x == 42
Next we execute
*p2 = &x;
From above we see that *p2 is equivalent to p, so this statement assigns the address of x (100) to p, so now we have
p2 == &p == 108
*p2 == p == &x == 100
**p2 == *p == x == 42
y == 13
Next we execute
**p2 = 11;
**p2 is equivalent to *p, which is equivalent to x, so we wind up assigning the value 11 to x:
p2 == &p == 108
*p2 == p == &x == 100
**p2 == *p == x == 11
y == 13
Finally we have
*p = 12;
*p is equivalent to x, so we're assigning the value 12 to x, leaving us with:
p2 == &p == 108
*p2 == p == &x == 100
**p2 == *p == x == 12
y == 13

Segmentation fault in a recursive function

I am implementing Strassen's matrix multiplication algorithm as a part of an assignment. I have coded it correctly but I don't know why it is giving segmentation fault.
I have called strassen() as strassen(0,n,0,n); in main. n is a number given by user which is power of two and it is the maximum size of the matrix (2D Array).
It is not giving segfault for n=4 but for n=8,16,32, it is giving segfaults.
Code is as given below.
void strassen(int p, int q, int r, int s)
{
int p1,p2,p3,p4,p5,p6,p7;
if(((q-p) == 2)&&((s-r) == 2))
{
p1 = ((a[p][r] + a[p+1][r+1])*(b[p][r] + b[p+1][r+1]));
p2 = ((a[p+1][r] + a[p+1][r+1])*b[p][r]);
p3 = (a[p][r]*(b[p][r+1] - b[p+1][r+1]));
p4 = (a[p+1][r+1]*(b[p+1][r] - b[p][r]));
p5 = ((a[p][r] + a[p][r+1])*b[p+1][r+1]);
p6 = ((a[p+1][r] - a[p][r])*(b[p][r] +b[p][r+1]));
p7 = ((a[p][r+1] - a[p+1][r+1])*(b[p+1][r] + b[p+1][r+1]));
c[p][r] = p1 + p4 - p5 + p7;
c[p][r+1] = p3 + p5;
c[p+1][r] = p2 + p4;
c[p+1][r+1] = p1 + p3 - p2 + p6;
}
else
{
strassen(p, q/2, r, s/2);
strassen(p, q/2, s/2, s);
strassen(q/2, q, r, s/2);
strassen(q/2, q, s/2, s);
}
}
Some of the calls in your else block recurse infinitely (at least the second and the fourth; I didn't check the others). This can easily be shown with pen and paper:
e.g.
strassen(p, q/2, s/2, s), starting from 0, 8, 0, 8, will yield at each iteration:
1) 0, 4, 4, 8
2) 0, 2, 4, 8
3) 0, 1, 4, 8
4) 0, 0, 4, 8
5) 0, 0, 4, 8
...
and since none of those results pass your
if(((q-p) == 2)&&((s-r) == 2))
test, the function will run (and I suspect branch, as the 4th function has the same problem...) until the end of the stack is hit, causing a Segmentation Fault.
Anyway, if what you are trying to do in the else block is to recursively bisect the matrix, a better attempt would be something like:
strassen(p, (q+p)/2, r, (r+s)/2);
strassen(p, (q+p)/2, (r+s)/2, s);
strassen((q+p)/2,q, (r+s)/2, s);
strassen((q+p)/2,q, r, (r+s)/2);
(keep in mind that I didn't check this code, though)
void strassen(int p, int q, int r, int s)
{
int p1,p2,p3,p4,p5,p6,p7;
if(q-p == 2 && s-r == 2)
{
p1 = (a[p][r] + a[p+1][r+1]) * (b[p][r] + b[p+1][r+1]);
p2 = (a[p+1][r] + a[p+1][r+1]) * b[p][r];
p3 = a[p][r] * (b[p][r+1] - b[p+1][r+1]);
p4 = a[p+1][r+1] * (b[p+1][r] - b[p][r]);
p5 = (a[p][r] + a[p][r+1]) * b[p+1][r+1];
p6 = (a[p+1][r] - a[p][r]) * (b[p][r] +b[p][r+1] );
p7 = (a[p][r+1] - a[p+1][r+1]) * (b[p+1][r] + b[p+1][r+1]);
c[p][r] = p1 + p4 - p5 + p7;
c[p][r+1] = p3 + p5;
c[p+1][r] = p2 + p4;
c[p+1][r+1] = p1 + p3 - p2 + p6;
}
else
{
if (q/2-p >= 2 && s/2-r >= 2) strassen(p, q/2, r, s/2);
if (q/2-p >= 2 && s-s/2 >= 2) strassen(p, q/2, s/2, s);
if (q-q/2 >= 2 && s/2-r >= 2) strassen(q/2, q, r, s/2);
if (q-q/2 >= 2 && s-s/2 >= 2) strassen(q/2, q, s/2, s);
}
}
But an easier recursion stopper would be at the beginning of the function, like:
{
int p1,p2,p3,p4,p5,p6,p7;
if(q-p < 2 || s-r < 2) return;
if(q-p == 2 && s-r == 2)
{ ...

How does the int and char pointer affect my print out here?

Here is the code. I followed it easily up to the 4th printout, but I don't understand the 5th:
why is it "5: a[0] = 200, a[1] = 128144, a[2] = 256, a[3] = 302"?
I have marked the line I don't understand with a comment in the code. I look forward to your response.
#include <stdio.h>
#include <stdlib.h>
void
f(void)
{
int a[4];
int *b = malloc(16);
int *c = 0;
int i;
printf("1: a = %p, b = %p, c = %p\n", a, b, c);
c = a;
for (i = 0; i < 4; i++)
a[i] = 100 + i;
c[0] = 200;
printf("2: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
a[0], a[1], a[2], a[3]);
c[1] = 300;
*(c + 2) = 301;
3[c] = 302;
printf("3: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
a[0], a[1], a[2], a[3]);
c = c + 1;
*c = 400;
printf("4: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
a[0], a[1], a[2], a[3]);
//I DONT UNDERSTAND WHAT THIS LINE BELOW DOES
c = (int *) ((char *) c + 1);
*c = 500;
printf("5: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
a[0], a[1], a[2], a[3]);
b = (int *) a + 1;
c = (int *) ((char *) a + 1);
printf("6: a = %p, b = %p, c = %p\n", a, b, c);
}
int
main(int ac, char **av)
{
f();
return 0;
}
output:
1: a = 0x7fff65fdcb90, b = 0x1065007e0, c = 0x0
2: a[0] = 200, a[1] = 101, a[2] = 102, a[3] = 103
3: a[0] = 200, a[1] = 300, a[2] = 301, a[3] = 302
4: a[0] = 200, a[1] = 400, a[2] = 301, a[3] = 302
5: a[0] = 200, a[1] = 128144, a[2] = 256, a[3] = 302
6: a = 0x7fff65fdcb90, b = 0x7fff65fdcb94, c = 0x7fff65fdcb91
Let's start with the basics.
c is a pointer to int that points into an array of ints.
Let this be a:
[00000000][00000000][00000000][00000000]
Every two digits is a byte, and we assume that sizeof(int) is 4 in our example, so every element in a has 4 bytes, or 8 digits.
Now, c is a pointer to the first element in a.
Let's have a look at the expression in question:
c = (int *) ((char *) c + 1);
Obviously, c is changed here, but what exactly happens is:
c is cast from int* to char*
the result of the cast is incremented, resulting in sizeof(char) being added to c. Since sizeof(char) is 1, c is incremented by 1 and points to the second byte of an element in a.
the result is cast back to int* and assigned to c. This explicit cast is required, because C does not implicitly convert a char* to an int*.
So, ignoring all the other code, we start from this:
a : [00000000][00000000]...
^
c -|
And go to this:
a : [00000000][00000000]...
^
c ---|
As Daniel pointed out below, if c is not correctly aligned for a pointer of type int*, you get undefined behaviour, which should be avoided.
c is a pointer-to-int, so normally c+1 refers to the address which is sizeof(int) further along in memory - usually 4 bytes on a 32-bit system.
But you cast c to char* - that is, pointer-to-char. Now, char is only 1 byte long, so (char *)c + 1 refers to the memory location 1 byte further on than c; which is in the middle of the int at c.
You then cast the result back to an int* and write 500 into it. So what you're doing is (probably) writing the 4-byte representation of 500 over the last 3 bytes of a[1] and the 1st byte of a[2]. Exactly what effect that will have depends on the endianness of your system, but that's basically what's going on.
