I have a piece of code written in C where some pointer arithmetic is performed. I would like to know how the output comes to be this?
#include <stdio.h>
int main()
{
char arr[] = "gookmforgookm";
char *ptr1 = arr;
char *ptr2 = ptr1 + 3;
printf ("ptr2 - ptr1 = %d\n", ptr2 - ptr1);
printf ("(int*)ptr2 - (int*) ptr1 = %d", (int*)ptr2 - (int*)ptr1);
getchar();
return 0;
}
Output is below:
ptr2 - ptr1 = 3
(int*)ptr2 - (int*) ptr1 = 0
Strictly speaking, you're invoking undefined behaviour and any result that the program produces is OK according to the C standard.
However, you're probably on a machine where sizeof(int) == 4 (as opposed to say, 2). Since there are 4 bytes to an integer, two addresses which are 3 bytes apart are part of the same integer, so the difference between the addresses is 0 * sizeof(int). You might find a different answer if you chose ptr1 = arr + 1;, or you might not. But that's the beauty of undefined behaviour - it would be 'right' either way.
After the subtraction you need to divide the result in the size of the pointed type.
(int*)ptr2 - (int*)ptr1 == (0x1000003 - 0x1000000) / sizeof(int)
(int*)ptr2 - (int*)ptr1 == (0x1000003 - 0x1000000) / 4 == 0
ptr1 and ptr2 are both char * type, that means one byte one pointer.
char *ptr2 = ptr1 + 3;
so
ptr2 - ptr1 = 3
Next, you cast both pointer to type int *, int type need 4 byte, so both pointer aim at the same int, both pointer have the same value through the memory align, you get the 0 result.
When you subtract two pointers, as long as they point into the same array, the result is the number of elements separating them.
Pointer Subtraction and Comparison
The memory addresses of the elements of the same array are always sequential. i.e
if the memory adress of myarray[0] is:
0x4000000
then the memory address of myarray[2] will definitely be
0x4000002
So when you store the address of arr into ptr1 assume it to be x
, and then when you make the address of ptr2, three units higher than ptr1, it will be x+3. So when you subtract ptr1 from ptr2 the answer will be:
(x+3) - x = 3
Hence the answer.
In the second printf() statement, if you want it to display the same result as above (3), you have to convert the pointer to int and not int*.
char *myvar; // given contents somewhere
int addr = (int)myvar; // addr now = the char pointer
So In your case:
printf ("(int)ptr2 - (int) ptr1 = %d", (int)ptr2 - (int)ptr1);
Related
Code
short **p = (short **)malloc(sizeof(short *));
*p = malloc(sizeof(short));
**p = 10;
printf("**p = %d", **p);
Output
**p = 10
In this code, a multiple pointer **p is declared and *p is used without any declaration(maybe it's by **p).
What does *p mean in my case? Sorry for very simple question.
I saw C standard and stack overflow, but I couldn't find out something.
For any array or pointer p and index i, the expression p[i] is exactly equal to *(p + i) (where * is the unary dereference operator, the result of it on a pointer is the value that the pointer is pointing to).
So if we have p[0] that's then exactly equal to *(p + 0), which is equal to *(p) which is equal to *p. Going backwards from that, *p is equal to p[0].
So
*p = malloc(sizeof(short));
is equal to
p[0] = malloc(sizeof(short));
And
**p = 10;
is equal to
p[0][0] = 10;
(**p is equal to *(*(p + 0) + 0) which is equal to *(p[0] + 0) which is then equal to p[0][0])
It's important to note that the asterisk * can mean different things in different contexts.
It can be used when declaring a variable, and then it means "declare as pointer":
int *p; // Declare p as a pointer to an int value
It can be used to dereference a pointer, to get the value the pointer is pointing to:
*p = 0; // Equal to p[0] = 0
And it can be used as the multiplication operator:
r = a * b; // Multiply the values in a and b, store the resulting value in r
short **p = (short **)malloc(sizeof(short *));
This line declares a pointer to a pointer p. Additionally the value of p is set to the return value from malloc. It is equivalent to
short **p;
p = (short **)malloc(sizeof(short *));
The second line
*p = malloc(sizeof(short));
Here *p is the value of p. *p is of type pointer. *p is set to the return value of malloc. It is equivalent to
p[0] = malloc(sizeof(short));
The third line
**p = 10;
**p is the value of the value of p. It is of type short. It is equivalent to
p[0][0] = 10
In effect what the code above does is to allocate a 2D array of short, then allocate memory for the first row, and then set the element p[0][0] to 10.
As a general comment on your code, you should not use typecast in malloc. See Do I cast the result of malloc?
What does *p mean when **p is already declared?
short **p = (short **)malloc(sizeof(short *));
(better written as)
short **p = malloc (sizeof *p);
Declares the pointer-to-pointer-to short p and allocates storage for a signle pointer with malloc and assigns the beginning address for that block of memory to p. See: In C, there is no need to cast the return of malloc, it is unnecessary. See: Do I cast the result of malloc?
*p = malloc(sizeof(short));
(equivalent to)
p[0] = malloc (sizeof *p[0]);
Allocates storage for a single short and assigns the starting address for that block of memory to p[0].
**p = 10;
(equivalent to)
*p[0] = 10;
(or)
p[0][0] = 10;
Assigns the value 10 to the dereference pointer *p[0] (or **p or p[0][0]) updating the value at that memory address to 10.
printf("**p = %d", **p);
Prints the value stored in the block of memory pointed to by p[0] (the value accessed by dereferencing the pointer as *p[0] or **p)
The way to keep this straight in your head, is p is a single pointer of type pointer-to-pointer-to short. There are 2-level of indirection (e.g. pointer-to-pointer). To remove one level of indirection, you use the unary * operator, e.g.
*p /* has type pointer-to short */
or the [..] also acts as a dereference such that:
p[0] /* also has type pointer-to short */
You still have a pointer-to so you must remove one more level of indirection to refernce the value stored at the memory location pointed to by the pointer. (e.g. the pointer holds the address where the short is stored as its value). So you need:
**p /* has type short */
and
*p[0] /* also has type short */
as would
p[0][0] /* also has type short */
The other piece to keep straight is the type controls pointer-arithmetic. So p++ adds 8-bytes to the pointer-to-ponter address so it now points to the next pointer. If you do short *q = (*p)++; (or short *q = p[0]++, adds 2-bytes to the address for the pointer-to-short, soqnow points to the nextshortin the block of memory beginning at*p(orp[0]`). (there is no 2nd short because you only allocated 1 -- but you get the point)
Let me know if you have further questions.
Let me put it in different way,
consider an example,
int x;
int *y = &x;
int **z = &y;
x = 10;
Which simplifies to this,
Note: Only for illustration purpose I have chosen address of x,y,z as 0x1000,0x2000,0x3000 respectively.
What does *p mean in my case?
In short the snippetshort **p = (short **)malloc(sizeof(short *)); is dynamically allocating a pointer to a pointer of type short i.e same asy in my example.
If I run the following on OS X:
int main (void)
{
int* n; // initialise(declare) pointer
*n = 20; // the value in address pointed to by n is 20
printf("n: %i, n&: %i\n", n, &n);
return 0;
}
I get:
n: 1592302512, n&: 1592302480
Why the differing values?
Why do pointer and &pointer have different values?
The expression &n yields the address of n itself, while n evaluates to the value of the pointer, i.e. the address of the thing it points to.
But note that you have undefined behaviour First of all, because you are de-referencing an uninitialized pointer. You need to make n point somewhere you can write to.
For example,
int* n;
int i = 42;
n = &i;
// now you can de-reference n
*n = 20;
Second, you have the wrong printf specifier for &n. You need %p:
printf("n: %i, &n: %p\n", n, &n);
int* n declares a variable called n which is a pointer to an integer.
&n returns the address of the variable n, which would be a pointer to a pointer-to-integer.
Let's say we have the following code:
int a = 20; // declare an integer a whose value 20
int* n = &a; // declare a pointer n whose value is the address of a
int** p = &n; // declare a pointer p whose value is the address of n
In this case we would have the following:
variable name | value | address in memory
a | 20 | 1592302512
n | 1592302512 | 1592302480
p | 1592302480 | who knows?
In your code
int* n; //initialization is not done
*n = 20;
invokes undefined behavior. You're trying to de-reference (write into) uninitialized memory. You have to allocate memory to n before de-referencing.
Apart form that part,
n is of type int *
&n will be of type int **
So, they are different and supposed to have different values.
That said, you should use %p format specifier with printf() to print the pointers.
Just as an alternative, let me spell this out a different way.
char *ptr;
char c='A';
ptr = &c;
In this code, here's what's happening and what values are found when we qualify ptr in different ways.
ptr itself contains the address in memory where the char c variable is located.
*ptr dereferences the pointer, returning the actual value of the variable c. In this case, a capital A.
&ptr will give you the address of the memory location that ptr represents. In other words, if you needed to know where the pointer itself was located rather than what the address is of the thing that it points to, this is how you get it.
Can anyone explain the reason behind the second output? Also what is the difference between solving using int pointers and char pointers?
The second answer is coming out to be 0.
int main()
{
char arr[] = "geeksforgeeks";
char *ptr1 = arr;
char *ptr2 = ptr1 + 3;
printf ("ptr2 - ptr1 = %d\n", ptr2 - ptr1);
printf ("(int*)ptr2 - (int*) ptr1 = %d", (int*)ptr2 - (int*)ptr1);
getchar();
return 0;
}
Pointers of some type T point to objects of type T.
For example
int a[] = { 1, 2 };
int *p = a;
If you increase a pointer as for example
++p;
or
p = p + 1;
(take into account that these statements are equivalent) it will point to the next object of type T that follows the current object. So the value of the pointer will be increased by sizeof( T ) that to provide that the poiner indeed will point to the next element.
In the example above sizeof( int ) is (usually) equal to 4. So the value of the pointer will be increased by 4.
If you write
int a[] = { 1, 2 };
int *p = &a[0]; // the same as int *p = a;
int *q = &a[1];
then expression q - p will be equal 1 but the difference between the values stored in p and q will ve equal to sizeof( int ). p points to the first element of the array and q points to the second element of the array. It is so-called pointer arithmetic.
As for your result with subtracting int pointers then the behaviour is undefined. According to the C++ Standard
Unless both pointers point to elements of the same array object, or one
past the last element of the array object, the behavior is undefined
In your case int pointers do not point to elements of the same array. That they would point to the elements of the same array at least the difference of their values shall be equal to sizeof( int )
It's happen because char size is 1-byte, when int is 32Bit (4byte) variable
Edit keep pointer in the original array and ensure correct alignement to avoid undefined behaviour (see comment of Matt McNabb)
Because 3 < sizeof(int).
In pointer arithmetic, (int *) ptr2 - (int *) ptr1 gives real_addr_of_ptr2 - real_addr_of_ptr1) / sizeof(int) = 3 / 4. As it is integer division => 0 - this is not specified by C++ but is current implementation.
If you use : char *ptr2 = ptr1 + 8;, you will get : (int*)ptr2 - (int*) ptr1 = 2
As there are more than 8 characters in array, it can work, provided the original array is correctly aligned. To be coherent with the specs, it should have been declared :
union {
char arr[] = "geeksforgeeks";
int iarr[];
} uarr;
char *ptr1 = uarr.arr;
I am a total beginner to C so please, work with my ignorance. Why does a normal pointer
int* ptr = &a; has two spaces in memory (one for the pointer variable and one for the value it points to) and an array pointer int a[] = {5}; only has one memory space (if I print out
printf("\n%p\n", a) I get the same address as if I printed out: printf("\n%p\n", &a).
The question is, shouldn't there be a memory space for the pointer variable a and one for its value which points to the first array element? It does it with the regular pointer int* ptr = &a;
It's a little unclear from your question (and assuming no compiler optimization), but if you first declare a variable and then a pointer to that variable,
int a = 4;
int *p = &a;
then you have two different variables, it makes sense that there are two memory slots. You might change p to point to something else, and still want to refer to a later
int a = 4;
int b = 5;
int *p = &a; // p points to a
// ...
p = &b; // now p points to b
a = 6; // but you can still use a
The array declaration just allocates memory on the stack. If you wanted to do the same with a pointer, on the heap, you would use something like malloc or calloc (or new in c++)
int *p = (int*)malloc(1 * sizeof(int));
*p = 4;
but of course remember to free it later (delete in c++)
free(p);
p = 0;
The main misunderstanding here is that &a return not pointer to pointer as it expected that's because in C language there some difference between [] and * (Explanation here: Difference between [] and *)
If you try to &a if a was an pointer (e.g. int *a) then you obtain a new memory place but when your use a static array (i.e. int a[]) then it return address of the first array element. I'll also try to clarify this by mean of the next code block.
#include <stdio.h>
int main(int argc, char *argv[])
{
// for cycles
int k;
printf("That is a pointer case:\n");
// Allocate memory for 4 bytes (one int is four bytes on x86 platform,
// can be differ for microcontroller e.g.)
int c = 0xDEADBEEF;
unsigned char *b = (unsigned char*) &c;
printf("Value c: %p\n", c);
printf("Pointer to c: %p\n", &c);
printf("Pointer b (eq. to c): %p\n", b);
// Reverse order (little-endian in case of x86)
for (k = 0; k < 4; k++)
printf("b[%d] = 0x%02X\n", k, b[k]);
// MAIN DIFFERENCE HERE: (see below)
unsigned char **p_b = &b;
// And now if we use & one more we obtain pointer to the pointer
// 0xDEADBEEF <-- b <-- &p_b
// This pointer different then b itself
printf("Pointer to the pointer b: %p\n", p_b);
printf("\nOther case, now we use array that defined by []:\n");
int a[] = {5,1};
int *ptr = &a;
// 'a' is array but physically it also pointer to location
// logically it's treat differ other then real pointer
printf("'a' is array: %x\n", a);
// MAIN DIFFERENCE HERE: we obtain not a pointer to pointer
printf("Pointer to 'a' result also 'a'%x\n", &a);
printf("Same as 'a': %x\n", ptr);
printf("Access to memory that 'a' pointes to: \n%x\n", *a);
return 0;
}
This is very simple. In first case,
int* ptr = &a;
you have one variable a already declared and hence present in memory. Now you declare another variable ptr (to hold the address, in C variables which hold address of another variable are called pointers), which again requires memory in the same way as a required.
In second case,
int a[] = {5};
You just declare one variable (which will hold a collection of ints), hence memory is allocated accordingly for a[].
In this expression, int* p = &a; p has only one memory location, of the WORD size of your CPU, most probably, and it is to store the address (memory location) of another variable.
When you do *p you are dereferencing p, which means you are getting the value of what p points to. In this particular case that would be the value of a. a has its own location in memory, and p only points to it, but does not itself store as content.
When you have an array, like int a[] = {5};, you have a series (or one) of memory locations, and they are filled with values. These are actual locations.
Arrays in C can decay to a pointer, so when you printf like you did with your array, you get the same address, whether you do a or &a. This is because of array to pointer decay.
a is still the same location, and is only that location. &a actually returns a pointer to a, but that pointer sits else where in memory. If you did int* b = &a; then b here would not have the same location as a, however, it would point to a.
ptr is a variable containing a memory address. You can assign various memory addresses to ptr. a is a constant representing a fixed memory address of the first element of the array. As such you can do:
ptr = a;
but not
a = ptr;
Pointers point to an area in memory. Pointers to int point to an area large enough to hold a value of int type.
If you have an array of int and make a pointer point to the array first element
int array[42];
int *p = array;
the pointer still points to a space wide enough for an int.
On the other hand, if you make a different pointer point to the whole array, this new pointer points to a larger area that starts at the same address
int (*q)[42]; // q is a pointer to an array of 42 ints
q = &array;
the address of both p and q is the same, but they point to differently sized areas.
#include<stdio.h>
int main()
{
char arr[] = "somestring";
char *ptr1 = arr;
char *ptr2 = ptr1 + 3;
printf("ptr2 - ptr1 = %ld\n", ptr2 - ptr1);
printf("(int*)ptr2 - (int*) ptr1 = %ld", (int*)ptr2 - (int*)ptr1);
return 0;
}
I understand
ptr2 - ptr1
gives 3 but cannot figure out why second printf prints 0.
It's because when you substract two pointers, you get the distance between the pointer in number of elements, not in bytes.
(char*)ptr2-(char*)ptr1 // distance is 3*sizeof(char), ie 3
(int*)ptr2-(int*)ptr1 // distance is 0.75*sizeof(int), rounded to 0
EDIT: I was wrong by saying that the cast forces the pointer to be aligned
If you want to check the distance between addresses don't use (int *) or (void *), ptrdiff_t is a type able to represent the result of any valid pointer subtraction operation.
#include <stdio.h>
#include <stddef.h>
int main(void)
{
char arr[] = "somestring";
char *ptr1 = arr;
char *ptr2 = ptr1 + 3;
ptrdiff_t diff = ptr2 - ptr1;
printf ("ptr2 - ptr1 = %td\n", diff);
return 0;
}
EDIT: As pointed out by #chux, use "%td" character for ptrdiff_t.
Casting a char pointer with int* would make it aligned to the 4bytes (considering int is 4 bytes here). Though ptr1 and ptr2 are 3 bytes away, casting them to int*, results in the same address -- hence the result.
This is because sizeof(int) == 4
Each char takes 1 byte. Your array of chars looks like this in memory:
[s][o][m][e][s][t][r][i][n][g][0]
When you have an array of ints, each int occupies four bytes. storing '1' and '2' conceptually looks more like this:
[0][0][0][1][0][0][0][2]
Ints must therefore be aligned to 4-byte boundaries. Your compiler is aliasing the address to the lowest integer boundary. You'll note that if you use 4 instead of 3 this works as you expected.
The reason you have to perform a subtraction to get it to do it (just passing the casted pointers to printf doesn't do it) is because printf is not strictly typed, i.e. the %ld format does not contain the information that the parameter is an int pointer.