I have a question about pointer-to-pointer arithmetic in C.
If we do
int ** ptr = 0x0;
printf("%p",ptr+=1);
The output will be ptr+(# of bytes needed for storing a pointer, in my case 8).
Now if we declare a matrix:
int A[100][50];
A[0] is a pointer of pointer.
A[0]+1 will now point to A[0]+(# of bytes needed for storing an integer, in my case 4).
Why are 8 bytes "normally" added, but only 4 here?
A[0]+1 will point to A[0][1], so it is useful, but how does it work?
Thank you!
Consider this program, run on a 64-bit machine (a Mac running macOS Mojave 10.14.6, with GCC 9.2.0 to be precise):
#include <stdio.h>
int main(void)
{
int A[100][50];
printf("Size of void * = %zu and size of int = %zu\n", sizeof(void *), sizeof(int));
printf("Given 'int A[100][50];\n");
printf("Size of A = %zu\n", sizeof(A));
printf("Size of A[0] = %zu\n", sizeof(A[0]));
printf("Size of A[0][0] = %zu\n", sizeof(A[0][0]));
putchar('\n');
printf("Address of A[0] = %p\n", (void *)A[0]);
printf("Address of A[0] + 0 = %p\n", (void *)(A[0] + 0));
printf("Address of A[0] + 1 = %p\n", (void *)(A[0] + 1));
printf("Difference = %td\n", (char *)(A[0] + 1) - (char *)(A[0] + 0));
putchar('\n');
printf("Address of &A[0] = %p\n", (void *)&A[0]);
printf("Address of &A[0] + 0 = %p\n", (void *)(&A[0] + 0));
printf("Address of &A[0] + 1 = %p\n", (void *)(&A[0] + 1));
printf("Difference = %td\n", (char *)(&A[0] + 1) - (char *)(&A[0] + 0));
return 0;
}
The output is:
Size of void * = 8 and size of int = 4
Given 'int A[100][50];
Size of A = 20000
Size of A[0] = 200
Size of A[0][0] = 4
Address of A[0] = 0x7ffee5b005e0
Address of A[0] + 0 = 0x7ffee5b005e0
Address of A[0] + 1 = 0x7ffee5b005e4
Difference = 4
Address of &A[0] = 0x7ffee5b005e0
Address of &A[0] + 0 = 0x7ffee5b005e0
Address of &A[0] + 1 = 0x7ffee5b006a8
Difference = 200
Therefore, it is possible to deduce that A[0] is an array of 50 int — it is not a 'pointer of pointer'. Nevertheless, when used in an expression such as A[0] + 1, it 'decays' into a 'pointer to int' (pointer to the type of the element of the array), and hence A[0] + 1 is one integer's worth further through the array.
The last block of output shows that the address of an array has a different type: &A[0] is an int (*)[50] (pointer to an array of 50 int), so &A[0] + 1 advances by one whole row, which is 200 bytes here.
I ran into this question in an OS course. Here is the code from the 6.828 (Operating System) online course. It is meant to let learners practice pointers in the C programming language.
#include <stdio.h>
#include <stdlib.h>
void
f(void)
{
int a[4];
int *b = malloc(16);
int *c;
int i;
printf("1: a = %p, b = %p, c = %p\n", a, b, c);
c = a;
for (i = 0; i < 4; i++)
a[i] = 100 + i;
c[0] = 200;
printf("2: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
a[0], a[1], a[2], a[3]);
c[1] = 300;
*(c + 2) = 301;
3[c] = 302;
printf("3: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
a[0], a[1], a[2], a[3]);
c = c + 1;
*c = 400;
printf("4: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
a[0], a[1], a[2], a[3]);
c = (int *) ((char *) c + 1);
*c = 500;
printf("5: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
a[0], a[1], a[2], a[3]);
b = (int *) a + 1;
c = (int *) ((char *) a + 1);
printf("6: a = %p, b = %p, c = %p\n", a, b, c);
}
int
main(int ac, char **av)
{
f();
return 0;
}
I copied it to a file and compiled it with gcc, then got this output:
$ ./pointer
1: a = 0x7ffd3cd02c90, b = 0x55b745ec72a0, c = 0x7ffd3cd03079
2: a[0] = 200, a[1] = 101, a[2] = 102, a[3] = 103
3: a[0] = 200, a[1] = 300, a[2] = 301, a[3] = 302
4: a[0] = 200, a[1] = 400, a[2] = 301, a[3] = 302
5: a[0] = 200, a[1] = 128144, a[2] = 256, a[3] = 302
6: a = 0x7ffd3cd02c90, b = 0x7ffd3cd02c94, c = 0x7ffd3cd02c91
I can easily understand output lines 1, 2, 3, and 4, but line 5 is hard for me to understand. Specifically, why are a[1] = 128144 and a[2] = 256?
It seems this output is the result of
c = (int *) ((char *) c + 1);
*c = 500;
I have trouble understanding what the code c = (int *) ((char *) c + 1) does.
c is a pointer, declared as int *c. Before the fifth line of output, c points to the second element of array a, because of c = a and c = c + 1. So what do (char *) c and ((char *) c + 1) mean, and then (int *) ((char *) c + 1)?
Although this is undefined behavior per the standard, it has a clear meaning in "ancient C", and it clearly works that way on the machine/compiler you're working with.
First, it casts c to a (char *), which means that pointer arithmetic will work in units of sizeof(char) (i.e. one byte) instead of sizeof(int). Then it adds one byte. Then it converts the result back to (int *). The result is an int pointer that now refers to an address one byte higher than it used to. Since c was pointing at a[1] beforehand, afterwards *c = 500 will write to the last three bytes of a[1] and the first byte of a[2].
On many machines (but not x86) this is an outright illegal thing to do. An unaligned access like that would simply crash your program. The C standard goes further and says that that code is allowed to do anything: when the compiler sees it, it can generate code that crashes, does nothing, writes to a completely unrelated bit of memory, or causes a small gnome to pop out of the side of your monitor and hit you with a mallet. However, sometimes the easiest thing to do in the case of UB is also the straightforward obvious thing, and this is one of those cases.
Your course material is trying to show you something about how numbers are stored in memory, and how the same bytes can be interpreted in different ways depending on what you tell the CPU. You should take it in that spirit, and not as a guide to writing decent C.
At the first output, c is uninitialized, so it holds an indeterminate address.
After c = a;, c points to a, so when you change c[0], c[1], *(c + 2), or 3[c], the elements of a change accordingly.
At the following line:
c = c + 1;
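c now points to a[1], and its address is 0x7ffd3cd02c94.
Now to the line you are asking about: c = (int *) ((char *) c + 1); does the following:
Converts c to a char pointer, which still points to the same address, 0x7ffd3cd02c94.
Increments that pointer by 1, so the address becomes 0x7ffd3cd02c95.
Converts the new address back to int * and assigns it to c.
Before that statement, *c covers the bytes at 0x7ffd3cd02c94 through 0x7ffd3cd02c97. Afterwards it covers 0x7ffd3cd02c95 through 0x7ffd3cd02c98, so *c = 500 overwrites the upper three bytes of a[1] and the lowest byte of a[2]. On a little-endian machine 500 is stored as the bytes F4 01 00 00, so a[1] changes from 0x00000190 (400) to 0x0001F490 (128144) and a[2] changes from 0x0000012D (301) to 0x00000100 (256). That is the reason for the values on output line 5: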
[![enter image description here][1]][1]
Now it is clear why the value changed as you observed.
NOTE: This is correct for a little-endian system. On a big-endian system the result would be different. And on some embedded platforms that do not allow unaligned access, you would get an exception at that line.
[1]: https://i.stack.imgur.com/eU0Tb.png
This is a result of undefined behavior. The undefined behavior comes from c = (int *) ((char *) c + 1): the resulting pointer is not correctly aligned for an int, and *c = 500 then stores through it. Compiling with -Wall -pedantic -std=c99 is still a good idea, but the compiler is not required to diagnose this.
To answer your question about (char *) c and ((char *) c + 1).
(char *) c: c has type int * (pointer to int). The cast produces a char * that holds the same address, so any arithmetic on it is done in bytes rather than in units of sizeof(int). At this point c holds the address of the second element of a, that is, &a[1].
((char *) c + 1): this is the address one byte past &a[1], which is in the middle of a[1]. It is not the same as c + 1, which would be &a[2]. Casting it back with (int *) gives an int pointer to that odd address.
I'm trying to better understand memcpy. Here's an example I was experimenting with:
int arr[] = {10, 20, 30, 40};
int dest[] = {1, 2, 3, 4};
void *ptr = &dest;
printf("Before copy: %d, %d, %d, %d\n", *(int*)ptr, *(int*)ptr + 1, *(int*)ptr + 2, *(int*)ptr + 3);
memcpy(dest, arr, 3*sizeof(int));
printf("After copy: %d, %d, %d, %d\n", dest[0], dest[1], dest[2], dest[3]);
printf("After copy: %d, %d, %d, %d\n", *(int*)ptr, *(int*)ptr + 1, *(int*)ptr + 2, *(int*)ptr + 3);
How am I getting different results from last two print statements? The first one behaves the way I expect, but the second one doesn't.
You're getting confused by the first printf only because dest is initialized with consecutive integers. Try
int dest[] = { 4, 72, 0, -5 };
instead.
Your real problem is operator precedence: *a + b parses as (*a) + b, not *(a + b) (the latter being equivalent to a[b]).
By the way, I'm not convinced
void *ptr = &dest;
*(int *)ptr
is legal. The standard says any (object) pointer can be converted to void * and back without loss of information, but here you're converting from type A to void * to type B (where A != B).
Specifically: &dest has type int (*)[4] (pointer to array of 4 ints), not int *. To fix this, do
void *ptr = dest;
instead. Or just int *ptr = dest;, then you don't even need to cast.
When you print the values:
printf("Before copy: %d, %d, %d, %d\n", *(int*)ptr, *(int*)ptr + 1, *(int*)ptr + 2, *(int*)ptr + 3);
You're not printing what you think you are. The expression *(int*)ptr + 1 takes ptr, converts it to an int *, then dereferences that pointer, which gives you the value of the first element, then adds 1 to that element's value. It does not add to the pointer value because the dereference operator * has higher precedence than the addition operator +.
You need to add parentheses to get the behavior you want:
printf("Before copy: %d, %d, %d, %d\n", *(int *)ptr, *((int *)ptr + 1), *((int *)ptr + 2), *((int *)ptr + 3));
I'm running this program:
#include<stdio.h>
void main(){
int num = 1025;
int *poinTer = &num;
char *pointChar = poinTer+1;
*pointChar = 'A';
printf("Size of Integer: %d\n", sizeof(int));
printf("Address: %d, Value: %d\n", poinTer, *poinTer);
printf("Address: %d, Value: %c\n", poinTer+1, *(poinTer+1));
printf("Address: %d, Value: %c\n", pointChar, *pointChar);
}
*pointChar and *(poinTer+1) should output same result but the output that I'm getting is different. *pointChar is not outputting any value:
Size of Integer: 4
Address: 1704004844, Value: 1025
Address: 1704004848, Value: A
Address: 1704004673, Value:
What's happening here?
When you perform + 1 on a pointer, it does not necessarily increase the memory address by 1. It increases it by sizeof(*ptr).
In this case, poinTer + 1 is equivalent to (char*)poinTer + sizeof(int). This actually makes dealing with arrays much easier.
The good old fashioned ptr[i] is syntactic sugar for *(ptr + i). So, if you have an array of 10 integers, ptr[4] refers to the 5th element rather than to the location 4 bytes away from the base address (an int is usually 4 bytes, so the 5th element starts 16 bytes in).
So what you've actually done is:
Created an int (num) on the stack and gave it the value 1025.
Created an int * (poinTer) on the stack and assigned it the memory address of num.
Incremented the pointer by sizeof(int) (which unintentionally points to a different memory address), then converted it to a char * and assigned it to a new pointer.
Assigned the value 65 ('A') to the byte at this new memory address.
This is probably what you wanted to do:
#include <stdio.h>
int main(void){
int num = 1025;
int *poinTer = &num;
char *pointChar = (char*)poinTer + 1;  /* one byte, not one int, past the start of num */
*pointChar = 'A';
printf("Size of Integer: %zu\n", sizeof(int));
printf("Address: %p, Value: %d\n", (void*)poinTer, *poinTer);
printf("Address: %p, Value: %c\n", (void*)((char*)poinTer + 1), *((char*)poinTer+1));
printf("Address: %p, Value: %c\n", (void*)pointChar, *pointChar);
return 0;
}
In a homework project, I have to subtract the address of one pointer from another.
Here is a piece of code I tried to write to subtract the heap of void* type, from a given metadata address. It's wrong somewhere.
metadata_t* getBuddy(metadata_t* ptr)
{
metadata_t* offset = ptr - (char)heap;
int h = (char)heap;
#ifdef DEBUG
printf("ptr : %p\n", ptr);
printf("heap : %p\n", heap);
printf("offset: %p\n", offset);
printf("char : %d\n", h);
#endif
return NULL;
}
Here is the output I get:
ptr : 0x7fe7b3802440
heap : 0x7fe7b3802200
offset: 0x7fe7b3802440
char : 0
Here is the output I EXPECTED:
ptr : 0x7fe7b3802440
heap : 0x7fe7b3802200
offset: 0x000000000240
char : 0x7fe7b3802200
Questions:
1) Why would the char output be zero? (Is this not what I am doing: casting the pointer to single bytes, and then storing it in an int?)
2) If this is not how you properly do the pointer arithmetic, how else would you accomplish the offset?
Edits:
1) Heap is defined as a int*, I think. This is the given piece of code that returns its value.
#define HEAP_SIZE 0x2000
void *my_sbrk(int increment) {
static char *fake_heap = NULL;
static int current_top_of_heap = 0;
void *ret_val;
if(fake_heap == NULL){
if((fake_heap = calloc(HEAP_SIZE, 1)) == NULL) {
return (void*)-1;
}
}
ret_val=current_top_of_heap+fake_heap;
if ((current_top_of_heap + increment > HEAP_SIZE)
|| (current_top_of_heap+increment < 0)) {
errno=ENOMEM;
return (void*)-1;
}
current_top_of_heap += increment;
return ret_val;
}
Pointer arithmetic only makes sense for a specific type. In this example, the int type is size 4 but the pointer subtraction is only 1.
#include <stdio.h>
int array[2];
int *a, *b;
int main(void){
a = &array [0];
b = &array [1];
printf ("Int size = %zu\n", sizeof(int));
printf ("Pointer difference = %td\n", b-a);
return 0;
}
Program output:
Int size = 4
Pointer difference = 1
Pointer arithmetic doesn't support the operation (pointer + pointer). (Pointer + integer) and (pointer - integer) give a pointer; (pointer - pointer) is allowed only between pointers of the same type into the same array, and gives an integer of type ptrdiff_t counted in elements.
To get the offset in bytes, cast both pointers to char * before subtracting. The resulting value is an integer, not a pointer.
Example:
ptrdiff_t offset = (char *)ptr - (char *)heap;   /* needs <stddef.h> */
printf("ptr : %p\n", (void *)ptr);
printf("heap : %p\n", heap);
printf("offset: %td\n", offset);
Also, the value of heap is far too large to be stored in a single byte. Casting it to char keeps only the lowest byte here, which is 0x00 for 0x7fe7b3802200, and that is why the cast gives zero.
I am a Java programmer and have recently been playing with C for fun. I am now learning about addresses and pointers, which are a little confusing for me. Here is my question. See the two blocks of code below.
void withinArray(int * a, int size, int * ptr) {
int x;
printf("ptr is %d\n", ptr);
printf("a is %d\n", a);
printf("difference in pointers is: %d\n", ptr - a);
x = ptr - intArray;
printf("x is %d\n", x);
}
void doubleSize() {
double doubArray[10];
double * doubPtr1;
double * doubPtr2;
doubPtr1 = doubArray;
doubPtr2= doubArray+1;
int p2 = doubPtr2;
int p1 = doubPtr1;
printf("p2-p1 is %d\n", p2-p1);
printf("doubPtr2-doubPtr1 is %d\n", doubPtr2-doubPtr1);
}
int main(void)
{
int a[10];
int *intarray = a;
int *p = intarray + 9;
printf(withinArray(a, 10, p));
return 0;
}
I am wondering why, in the function withinArray(), we can directly get the x value, which is 9, while in the other function we have to convert doubPtr to int first before we can get the difference between the pointers as an int.
From my understanding, in doubleSize(), doubPtr2-doubPtr1 = 1 means the difference between the pointer addresses in memory is 1. But why doesn't withinArray() need to do the same?
A difference of 1 between two pointers means that the pointers point to adjacent units of memory of the size of the objects pointed at.
Thus, given:
int i[2];
int *ip0 = &i[0];
int *ip1 = &i[1];
double d[2];
double *dp0 = &d[0];
double *dp1 = &d[1];
we could safely write:
assert((ip1 - ip0) == (dp1 - dp0));
assert(ip1 - ip0 == 1);
assert(dp1 - dp0 == 1);
However, you could also safely write:
assert((char *)ip1 - (char *)ip0 == sizeof(int));
assert((char *)dp1 - (char *)dp0 == sizeof(double));
and usually you would find that it is safe to write:
assert(sizeof(double) != sizeof(int));
though that is not guaranteed by the standard.
Also, as Filipe Gonçalves correctly points out in his comment, the difference between two pointers is formally only defined if the pointers are of the same type and point to two elements of the same array, or point to one element beyond the end of the array. Note that standard C demands that given:
int array[100];
it is safe to generate the address int *ip = &array[100];, even though it is not safe to either read from or write to the location pointed at by ip. The value stored in ip can be used in comparisons.
You also formally cannot subtract two void * values because there is no size for the type void (which is why my example used casts to char *, not void *). Beware: GCC will not object to the subtraction of two void * values unless you include -pedantic in the options.
Do you know why the value of doubPtr2 - doubPtr1 (in my second method) is different from x = ptr - a (in my first method)?
Assuming that intArray is meant to be a, then this code:
#include <stdio.h>
static void withinArray(int *a, int *ptr)
{
int x;
printf("ptr is %p\n", (void *)ptr);
printf("a is %p\n", (void *)a);
printf("difference in pointers is: %td\n", ptr - a);
x = ptr - a;
printf("x is %d\n", x);
}
static void doubleSize(void)
{
double doubArray[10];
double *doubPtr1 = doubArray;
double *doubPtr2 = doubArray+1;
int p2 = doubPtr2;
int p1 = doubPtr1;
printf("p1 = 0x%.8X\n", p1);
printf("p2 = 0x%.8X\n", p2);
printf("p2-p1 is %d\n", p2-p1);
printf("doubPtr1 = %p\n", (void *)doubPtr1);
printf("doubPtr1 = %p\n", (void *)doubPtr2);
printf("doubPtr2-doubPtr1 is %td\n", doubPtr2-doubPtr1);
}
int main(void)
{
int a[10];
int *intarray = a;
int *p = intarray + 9;
withinArray(a, p);
doubleSize();
return 0;
}
compiles with warnings that I would ordinarily fix (change the type of p1 and p2 to uintptr_t, include <inttypes.h>, and format using "p1 = 0x%.8" PRIXPTR "\n" as the format string), and it generates the output:
ptr is 0x7fff5c5684a4
a is 0x7fff5c568480
difference in pointers is: 9
x is 9
p1 = 0x5C5684B0
p2 = 0x5C5684B8
p2-p1 is 8
doubPtr1 = 0x7fff5c5684b0
doubPtr1 = 0x7fff5c5684b8
doubPtr2-doubPtr1 is 1
Fixed code generates:
ptr is 0x7fff5594f4a4
a is 0x7fff5594f480
difference in pointers is: 9
x is 9
p1 = 0x7FFF5594F4B0
p2 = 0x7FFF5594F4B8
p2-p1 is 8
doubPtr1 = 0x7fff5594f4b0
doubPtr1 = 0x7fff5594f4b8
doubPtr2-doubPtr1 is 1
(The difference is in the number of hex digits printed for p1 and p2.)
I assume that your puzzlement is about why the int code prints 9 rather than, say, 36, whereas the double code prints 8 instead of 1.
The answer is that when you subtract two pointers, the result is given in units of the size of the objects pointed at (which I seem to remember saying in my opening sentence).
When you execute doubPtr2-doubPtr1, the distance returned is in units of the number of double values between the two addresses.
However, the conversion to integer loses the type information, so you effectively have the char * (or void *) addresses of the two pointers in the integer, and the byte addresses are indeed 8 apart.
If we make two symmetrical routines, the information is clearer:
#include <stdio.h>
#include <inttypes.h>
static void intSize(void)
{
int intArray[10];
int *intPtr1 = intArray;
int *intPtr2 = intArray+1;
uintptr_t p2 = (uintptr_t)intPtr2;
uintptr_t p1 = (uintptr_t)intPtr1;
printf("p1 = 0x%.8" PRIXPTR "\n", p1);
printf("p2 = 0x%.8" PRIXPTR "\n", p2);
printf("p2-p1 is %" PRIuPTR "\n", p2-p1);
printf("intPtr1 = %p\n", (void *)intPtr1);
printf("intPtr1 = %p\n", (void *)intPtr2);
printf("intPtr2-intPtr1 is %td\n", intPtr2-intPtr1);
}
static void doubleSize(void)
{
double doubArray[10];
double *doubPtr1 = doubArray;
double *doubPtr2 = doubArray+1;
uintptr_t p2 = (uintptr_t)doubPtr2;
uintptr_t p1 = (uintptr_t)doubPtr1;
printf("p1 = 0x%.8" PRIXPTR "\n", p1);
printf("p2 = 0x%.8" PRIXPTR "\n", p2);
printf("p2-p1 is %" PRIuPTR "\n", p2-p1);
printf("doubPtr1 = %p\n", (void *)doubPtr1);
printf("doubPtr1 = %p\n", (void *)doubPtr2);
printf("doubPtr2-doubPtr1 is %td\n", doubPtr2-doubPtr1);
}
int main(void)
{
doubleSize();
intSize();
return 0;
}
Output:
p1 = 0x7FFF5C93D4B0
p2 = 0x7FFF5C93D4B8
p2-p1 is 8
doubPtr1 = 0x7fff5c93d4b0
doubPtr1 = 0x7fff5c93d4b8
doubPtr2-doubPtr1 is 1
p1 = 0x7FFF5C93D4B0
p2 = 0x7FFF5C93D4B4
p2-p1 is 4
intPtr1 = 0x7fff5c93d4b0
intPtr1 = 0x7fff5c93d4b4
intPtr2-intPtr1 is 1
Remember Polya's advice in How to Solve It:
Try to treat symmetrically what is symmetrical and do not destroy wantonly any natural symmetry.