subtracting two addresses giving wrong output - c

int main()
{
int x = 4;
int *p = &x;
int *k = p++;
int r = p - k;
printf("%d %d %d", p,k,p-k);
getch();
}
Output:
2752116 2752112 1
Why not 4?
And also I can't use p+k or any other operator except - (subtraction).

First of all, you MUST use the correct argument type for the supplied format specifier; supplying a mismatched argument type causes undefined behavior.
You must use the %p format specifier and cast the argument to void * to print an address (pointer).
To print the result of a pointer subtraction, you should use %td, as the result is of type ptrdiff_t.
That said, regarding the result 1 for the subtraction, pointer arithmetic honors the data type. Quoting C11, chapter §6.5.6, (emphasis mine)
When two pointers are subtracted, both shall point to elements of the same array object,
or one past the last element of the array object; the result is the difference of the
subscripts of the two array elements. The size of the result is implementation-defined,
and its type (a signed integer type) is ptrdiff_t defined in the <stddef.h> header. [....] if the expressions P and Q point to, respectively, the i-th and j-th elements of
an array object, the expression (P)-(Q) has the value i−j provided the value fits in an object of type ptrdiff_t. [....]
So, in your case, p and k are one element apart, i.e., |i-j| == 1, hence the result.
Finally, you cannot add (or multiply or divide) two pointers, because that is meaningless. Pointers are memory locations, and adding two memory locations has no sensible interpretation. Only subtraction makes sense, to find the relative distance between two elements of the same array.
Related Constraints, from C11, chapter §6.5.6, additive operators,
For addition, either both operands shall have arithmetic type, or one operand shall be a
pointer to a complete object type and the other shall have integer type. (Incrementing is
equivalent to adding 1.)

What you are getting is the difference between the subscripts of two elements.
C11-6.5.6p9:
When two pointers are subtracted, both shall point to elements of the same array object, or one past the last element of the array object; the result is the difference of the subscripts of the two array elements.
Also note that the statement
printf("%d %d %d", p,k,p-k);
should be
printf("%p %p %td\n", (void*)p,(void*)k, p-k);

If your variable is a pointer, then every calculation on it is scaled by the size of the type it points to.
For example:
//Let's assume char is 1 byte and int is 4 bytes long:
// sizeof(*cp) == 1, sizeof(*ip) == 4.
//(Illustration only: the addresses are made up, so this is not strictly portable C.)
char *cp = (char *)10; //Char itself is 1 byte
int *ip = (int *)10;
cp++; //Increase pointer, let us point to the next char location
ip++; //Increase pointer, let us point to the next int location
printf("Char: %p\r\n", (void *)cp); //Address 11 (typically shown in hex, e.g. 0xb)
printf("Int: %p\r\n", (void *)ip); //Address 14 (typically shown in hex, e.g. 0xe)
The first case yields address 11 while the second yields 14. That's because the next char element is 1 byte further on, while the next int element is 4 bytes further on.
If you have 2 pointers of the same type (e.g. int *, as in your code) and one points to 14 and the other to 10, there is room for exactly 1 int between them, so subtracting gives you 1.
If you want to get 4 as the result, cast the pointers to char * before subtracting. sizeof(char) is always 1, so there are 4 elements between addresses 10 and 14 and you will get 4.
Hope it helps.

First of all, adding 2 pointers is not defined, so if you use the + operator on them, you will get a compile error.
Second, the output is correct: if you subtract 2 pointers, the result shows how many boxes of that type are between the pointers, not the number of bytes.
You say :
int* p1 = &x;
int* p2 = p1++;
So between p1 and p2 there are 4 bytes (assuming a 4-byte int). They are both of type int *, so only 1 box of int is between them.

Related

Why does this expression come out to 4 in C?

So this expression comes out to 4:
int a[] = {1,2,3,4,5}, i= 3, b,c,d;
int *p = &i, *q = a;
char *format = "\n%d\n%d\n%d\n%d\n%d";
printf("%ld",(long unsigned)(q+1) - (long unsigned)q);
I have to explain it in my homework and I have no idea why it's coming out to that value. I see (long unsigned) casting q+1, and then we subtract the value of whatever q is pointing at as a long unsigned and I assumed we would be left with 1. Why is this not the case?
Because q is a pointer the expression q+1 employs pointer arithmetic. This means that q+1 points to one element after q, not one byte after q.
The type of q is int *, meaning it points to an int. The size of an int on your platform is most likely 4 bytes, so adding 1 to an int * actually adds 4 to the raw pointer value so that it points to the next int in the array.
Try printing the parts of the expression and it becomes a bit clearer what is going on.
printf("%p\n",(void *)(q+1));
printf("%p\n",(void *)q);
printf("%lu\n",(long unsigned)(q+1));
printf("%lu\n",(long unsigned)q);
It becomes clearer that q is a pointer to the zeroth element of a, and q+1 is a pointer to the next element of a. ints are 4 bytes on my machine (and presumably on yours), so the two addresses are four bytes apart. Casting the pointers to unsigned values does not change their numeric value on my machine, so printing the difference between the two gives 4.
0x7fff70c3d1a4
0x7fff70c3d1a0
140735085269412
140735085269408
It's because sizeof(int) is 4.
This is an esoteric corner of C that is usually best avoided.
(If it doesn't make sense yet, add some temporary variables).
BTW, the printf format string is incorrect. But that's not why it's outputting 4.

Subtracting two pointers giving unexpected result

#include <stdio.h>
int main() {
int *p = 100;
int *q = 92;
printf("%d\n", p - q); //prints 2
}
Shouldn't the output of above program be 8?
Instead I get 2.
Undefined behavior aside, this is the behavior that you get with pointer arithmetic: when it is legal to subtract pointers, their difference represents the number of data items between the pointers. In the case of int, which on your system uses four bytes, the difference between pointers that are eight bytes apart is (8 / 4), which works out to 2.
Here is a version that has no undefined behavior:
int data[10];
int *p = &data[2];
int *q = &data[0];
// The difference between two pointers computed as pointer difference
ptrdiff_t pdiff = p - q;
intptr_t ip = (intptr_t)((void*)p);
intptr_t iq = (intptr_t)((void*)q);
// The difference between two pointers computed as integer difference
int idiff = ip - iq;
printf("%td %d\n", pdiff, idiff);
This
int *p = 100;
int *q = 92;
is already invalid C. In C you cannot initialize pointers with arbitrary integer values. There's no implicit integer-to-pointer conversion in the language, aside from conversion from null-pointer constant 0. If you need to force a specific integer value into a pointer for some reason, you have to use an explicit cast (e.g. int *p = (int *) 100;).
Even if your code somehow compiles, its behavior is not defined by the C language, which means that there's no "should be" answer here.
Your code is undefined behavior.
You cannot simply subtract two "arbitrary" pointers. Quoting C11, chapter §6.5.6/P9
When two pointers are subtracted, both shall point to elements of the same array object,
or one past the last element of the array object; the result is the difference of the
subscripts of the two array elements. The size of the result is implementation-defined,
and its type (a signed integer type) is ptrdiff_t defined in the <stddef.h> header. [....]
Also, as mentioned above, if you correctly subtract two pointers, the result would be of type ptrdiff_t and you should use %td to print the result.
That being said, the initialization
int *p = 100;
looks quite wrong itself! To clarify, it does not store a value of 100 in the memory location pointed to by p (question: where does it point?). It attempts to set the pointer variable itself to an integer value of 100, which seems to be a constraint violation in itself.
According to the standard (N1570)
When two pointers are subtracted, both shall point to elements of
the same array object, or one past the last element of the array
object; the result is the difference of the subscripts of the two
array elements.
These are integer pointers, sizeof(int) is 4. Pointer arithmetic is done in units of the size of the thing pointed to. Therefore the "raw" difference in bytes is divided by 4. Also, the result is a ptrdiff_t so %d is unlikely to cut it.
But please note, what you are doing is technically undefined behaviour as Sourav points out. It works in the most common environments almost by accident. However, if p and q point into the same array, the behaviour is defined.
int a[100];
int *p = a + 23;
int *q = a + 25;
printf("0x%" PRIXPTR "\n", (uintptr_t)a); // some number
printf("0x%" PRIXPTR "\n", (uintptr_t)p); // some number + 92
printf("0x%" PRIXPTR "\n", (uintptr_t)q); // some number + 100
printf("%td\n", q - p); // 2

Why does pointer subtraction in C yield an integer?

Why if I subtract from a pointer another pointer (integer pointers) without typecasting the result will be 1 and not 4 bytes (like it is when I typecast to int both pointers). Example :
int a , b , *p , *q;
p = &b;
q = p + 1; // q = &a;
printf("%d",q - p); // The result will be one .
printf("%d",(int)q - (int)p); // The result will be 4(bytes). The memory address of b minus The memory address of a.
According to the C Standard (6.5.6 Additive operators)
9 When two pointers are subtracted, both shall point to elements of
the same array object, or one past the last element of the array
object; the result is the difference of the subscripts of the two
array elements....
If the two pointers pointed to elements of the same array then as it is said in the quote from the Standard
the result is the difference of the subscripts of the two array
elements
That is you would get the number of elements of the array between these two pointers. It is the result of the so-called pointer arithmetic.
If you subtract addresses stored in the pointers as integer values then you will get the number that corresponds to the arithmetic subtract operation.
Why if I subtract from a pointer another pointer (integer pointers) without typecasting the result will be 1 and not 4 bytes
That's the whole point of a pointer having a pointed-to data type. It's probably easier to look at it in an array context like the one below. The point is that, regardless of the underlying data type (here long or double), you can use pointer arithmetic to navigate the array without caring about the exact size of its elements. In other words, (pointer + 1) means point to the next element, regardless of the type.
long l[] = { 10e4, 10e5, 10e6 };
long *pl = l + 1; // point to the 2nd element in the "long" array.
double d[] = { 10e7, 10e8, 10e9 };
double *pd = d + 2; // point to the 3rd element in the "double" array.
Also note in your code:
int a , b , *p , *q;
p = &b;
q = p + 1; // q = &a; <--- NO this is wrong.
The fact that a and b are declared next to each other does not mean that a and b are allocated next to each other in the memory. So q is pointing to the memory address next to that of b - but what is in that address is undefined.
Because the ptrdiff_t from pointer subtraction is calculated relative to the size of the elements pointed to. It's a lot more convenient that way; for one, it tells you how many times you can increment one pointer before you reach the other pointer.
Where you have
int a , b , *p , *q;
The compiler can put a and b anywhere. They don't have to even be near each other. Also, when you subtract two int pointers, the result is sized in terms of int, not bytes.
C is not assembly language. So pointers are not just plain integers -- pointers are special guys that know how to point to other things.
It's fundamental to the way pointers and pointer arithmetic work in C that they can point to successive elements of an array. So if we write
int a[10];
int *p1 = &a[4];
int *p2 = &a[3];
then p1 - p2 will be 1. The result is 1 because the "distance" between a[3] and a[4] is one int. The result is 1 because 4 - 3 = 1. The result is not 4 (as you might have thought it would be if you know that ints are 32 bits on your machine) because we're not interested in doing assembly language programming or working with machine addresses; we're doing higher-level language programming with an array, and we're thinking in those terms.
(But, yes, at the machine address level, the way p2 - p1 is computed is typically as (<raw address value in p2> - <raw address value in p1>) / sizeof(int).)

typecasting a pointer to an int .

I can't understand the output of this program.
What I get from it is that, first of all, the pointers p, q, r, s were pointing to null.
Then there has been a typecast. But how the heck did the output come out as 1 4 4 8? I might be very wrong in my thinking, so please correct me if I am wrong.
int main()
{
int a, b, c, d;
char* p = (char*)0;
int *q = (int *)0;
float* r = (float*)0;
double* s = (double*)0;
a = (int)(p + 1);
b = (int)(q + 1);
c = (int)(r + 1);
d = (int)(s + 1);
printf("%d %d %d %d\n", a, b, c, d);
_getch();
return 0;
}
Pointer arithmetic, in this case adding an integer value to a pointer value, advances the pointer value in units of the type it points to. If you have a pointer to an 8-byte type, adding 1 to that pointer will advance the pointer by 8 bytes.
Pointer arithmetic is valid only if both the original pointer and the result of the addition point to elements of the same array object, or just past the end of it.
The way the C standard describes this is (N1570 6.5.6 paragraph 8):
When an expression that has integer type is added to or subtracted
from a pointer, the result has the type of the pointer operand. If the
pointer operand points to an element of an array object, and the array
is large enough, the result points to an element offset from the
original element such that the difference of the subscripts of the
resulting and original array elements equals the integer expression.
[...]
If both the pointer operand and the result point to elements of the
same array object, or one past the last element of the array object,
the evaluation shall not produce an overflow; otherwise, the behavior
is undefined. If the result points one past the last element of the
array object, it shall not be used as the operand of a unary *
operator that is evaluated.
A pointer just past the end of an array is valid, but you can't dereference it. A single non-array object is treated as a 1-element array.
Your program has undefined behavior. You add 1 to a null pointer. Since the null pointer doesn't point to any object, pointer arithmetic on it is undefined.
But compilers aren't required to detect undefined behavior, and your program will probably treat a null pointer just like any valid pointer value, and perform arithmetic on it in the same way. So if the null pointer points to address 0 (this is not guaranteed, BTW, but it's very common), then adding 1 to it will probably give you a pointer to address N, where N is the size in bytes of the type it points to.
You then convert the resulting pointer to int (which is at best implementation-defined, will lose information if pointers are bigger than int, and may yield a trap representation) and you print the int value. The result, on most systems, will probably show you the sizes of char, int, float, and double, which are commonly 1, 4, 4, and 8 bytes, respectively.
Your program's behavior is undefined, but the way it actually behaves on your system is typical and unsurprising.
Here's a program that doesn't have undefined behavior that illustrates the same point:
#include <stdio.h>
int main(void) {
char c;
int i;
float f;
double d;
char *p = &c;
int *q = &i;
float *r = &f;
double *s = &d;
printf("char: %p --> %p\n", (void*)p, (void*)(p + 1));
printf("int: %p --> %p\n", (void*)q, (void*)(q + 1));
printf("float: %p --> %p\n", (void*)r, (void*)(r + 1));
printf("double: %p --> %p\n", (void*)s, (void*)(s + 1));
return 0;
}
and the output on my system:
char: 0x7fffa67dc84f --> 0x7fffa67dc850
int: 0x7fffa67dc850 --> 0x7fffa67dc854
float: 0x7fffa67dc854 --> 0x7fffa67dc858
double: 0x7fffa67dc858 --> 0x7fffa67dc860
The output is not as clear as your program's output, but if you examine the results closely you can see that adding 1 to a char* advances it by 1 byte, an int* or float* by 4 bytes, and a double* by 8 bytes. (Other than char, which by definition has a size of 1 byte, these may vary on some systems.)
Note that the output of the "%p" format is implementation-defined, and may or may not reflect the kind of arithmetic relationship you might expect. I've worked on systems (Cray vector computers) where incrementing a char* pointer would actually update a byte offset stored in the high-order 3 bits of the 64-bit word. On such a system, the output of my program (and of yours) would be much more difficult to interpret unless you know the low-level details of how the machine and compiler work.
But for most purposes, you don't need to know those low-level details. What's important is that pointer arithmetic works as it's described in the C standard. Knowing how it's done on the bit level can be useful for debugging (that's pretty much what %p is for), but is not necessary to writing correct code.
Adding 1 to a pointer advances the pointer to the next address appropriate for the pointer's type.
When the (null)pointers+1 are recast to int, you are effectively printing the size of each of the types being pointed to by the pointers.
printf("%zu %zu %zu %zu\n", sizeof(char), sizeof(int), sizeof(float), sizeof(double) );
does pretty much the same thing. If you want to increment each pointer by only 1 BYTE, you'll need to cast them to (char *) before incrementing them, so the compiler knows to step in units of one byte.
Search for information about pointer arithmetic to learn more.
You're casting the pointers to a primitive data type rather than dereferencing them with the * (indirection) operator to reach a value. For instance, in (int)(p + 1), p, a pointer to char, is first advanced to the next char address in memory (0x1 in this case), and then this 0x1 is converted to an int, which is why the first value printed is 1.
The output you get is related to the size of each of the relevant types. When you do pointer arithmetic as such, it increases the value of the pointer by the added value times the base type size. This occurs to facilitate proper array access.
Because the size of char, int, float, and double are 1, 4, 4, and 8 respectively on your machine, those are reflected when you add 1 to each of the associated pointers.
Edit:
Removed the alternate code which I thought did not exhibit undefined behavior, which in fact did.

C function pointer behavior, array/array pointer behavior

Consider the following code:
#include <stdio.h>
int ret_five() {
return 5;
}
int main() {
int x[5] = {1,2,3,4,5};
int (*p)();
p = &ret_five;
printf("%d\n", p()); // 1
p = ret_five;
printf("%d\n", p()); // 2
printf("%d\n", sizeof ret_five); // 3
printf("%d\n", sizeof &ret_five); // 4
printf("%d\n", (*p)()); // 5
printf("%d\n", (****p)()); // 6
printf("%p\n", p); // 7 // edited: replaced %d with %p
printf("%p\n", *p); // 8 // same here and in (8), (10)
printf("%p\n", **p); // 9
printf("%p\n", *******p); // 10
printf("%p\n", x); // 11
printf("%p\n", &x); // 12
return 0;
}
My questions are:
Lines (1) and (2) print the same result. Do ret_five and &ret_five have the same data type? It seems like no, because lines (3) and (4) print different results.
From a syntactical point of view, it seems to me that line (5) should be the right way to call the function that p points to, but of course lines (1) and (2) print 5 just fine. Is there a technical reason for this, or was it a design decision made because the calls in (1) and (2) look cleaner? Or something else?
Line (5) makes perfect sense to me (because p is a function pointer, its dereferenced value is the function, we call the function, it returns 5, we print 5). I was very surprised to find that (6) prints 5 as well! Why is this?
Similarly, lines (7)--(10) all print the same value, namely &ret_five. Why does (10) work?
Lines (11) and (12) print the same value, namely the address where the first element of x lives in memory. Line (12) makes sense, but I don't quite understand exactly what is technically happening in line (11). Does x automatically get cast or interpreted as an int* in this context?
To get the location in memory where x is stored, I typically do &x[0], but it seems like &x works just fine as well, and because x is an array and not a pointer, it seems like in fact &x may be the more canonical way of getting this memory address. Is there a reason to prefer one to the other?
In general, are there best-practices in the above situations? For example, if p = ret_five; and p = &ret_five really do the exact same thing, is there a reason to prefer one to the other?
And, if the two assignments in question 7 really do the exact same thing, why, in a language that is otherwise so rigid, was this laxity built-in?
Do ret_five and &ret_five have the same data type?
ret_five is a function designator and &ret_five is a function pointer. In an expression, ret_five is converted to a function pointer whose value and type are the same as those of &ret_five.
printf("%d\n", sizeof ret_five); // 3
printf("%d\n", sizeof &ret_five); // 4
sizeof &ret_five is correct. And it yields the size of a function pointer of type int (*)().
sizeof ret_five is invalid C code and it is accepted in gcc as a GNU extension.
printf("%d\n", p); // 7
printf("%d\n", *p); // 8
printf("%d\n", **p); // 9
printf("%d\n", *******p); // 10
If p is a function pointer, p, *p or *****p are equivalent in C.
printf("%p\n", x); // 11
printf("%p\n", &x); // 12
x is an array of 5 int elements. In an expression (with a few exceptions, such as when it is the operand of the & operator), it is converted to a pointer to its first element (the type of x after conversion is int *).
&x is a pointer to an array of 5 int elements (the type of &x is int (*)[5]).
A function designator is an expression that has function type. Except when it is the
operand of the sizeof operator or the unary & operator, a function designator with
type »function returning type« is converted to an expression that has type »pointer to
function returning type«.
ret_five and &ret_five both evaluate to the same function pointers. sizeof ret_five is a constraint violation and your compiler should output a diagnostic. So, ret_five is a function designator that is in all (but two (see above)) situations converted to a pointer to said function, *ret_five is again a function designator, which is AGAIN converted to a pointer to said function if you use it in any context except the two above, so **ret_five is again a function designator, and so on. Printing such a pointer with %d is undefined behavior since %d is for ints.
Both p = ret_five and p = &ret_five are valid C and store the same pointer. Writing p = ret_five without the & is the more common style today.
Except when it is the operand of the sizeof operator or the unary & operator, or is a
string literal used to initialize an array, an expression that has type »array of type« is
converted to an expression with type »pointer to type« that points to the initial element of
the array object and is not an lvalue.
x and &x have the same numerical value (both give the address where the array starts) but different types. x evaluates to a pointer to int, but &x evaluates to a pointer to an array of five ints.
When using %p specifier you need to cast the argument to void * as it expects void * type argument.
7.21.6 Formatted input/output functions:
p The argument shall be a pointer to void. The value of the pointer is
converted to a sequence of printing characters, in an implementation-defined
manner.
printf("%p\n", (void *) &x);
Lines (11) and (12) print the same value, namely the address where the first element of x lives in memory. Line (12) makes sense, but I don't quite understand exactly what is technically happening in line (11). Does x automatically get cast or interpreted as an int* in this context?
Yes, it does. The array name x decays to a pointer to the first element of array x. It gives you the location of the first element and has type int *. &x is the address of the entire array x; it always prints the starting address of the array and is of type int (*)[5]. Since the address of the first element of the array is the same as the starting address of array x, you get the same value.
To get the location in memory where x is stored, I typically do &x[0], but it seems like &x works just fine as well, and because x is an array and not a pointer, it seems like in fact &x may be the more canonical way of getting this memory address. Is there a reason to prefer one to the other?
Answer is similar to previous one. &x[0] is the address of first element while &x is the address of entire array.

Resources