Subtracting two pointers giving unexpected result - c

#include <stdio.h>
int main() {
int *p = 100;
int *q = 92;
printf("%d\n", p - q); //prints 2
}
Shouldn't the output of above program be 8?
Instead I get 2.

Undefined behavior aside, this is the behavior that you get with pointer arithmetic: when it is legal to subtract pointers, their difference represents the number of data items between the pointers. With int, which occupies four bytes on your system, the difference between pointers that are eight bytes apart is 8 / 4, which works out to 2.
Here is a version that has no undefined behavior:
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
int main(void) {
int data[10];
int *p = &data[2];
int *q = &data[0];
// The difference between two pointers computed as pointer difference
ptrdiff_t pdiff = p - q;
intptr_t ip = (intptr_t)((void*)p);
intptr_t iq = (intptr_t)((void*)q);
// The difference between two pointers computed as integer difference
int idiff = (int)(ip - iq);
printf("%td %d\n", pdiff, idiff);
return 0;
}

This
int *p = 100;
int *q = 92;
is already invalid C. In C you cannot initialize pointers with arbitrary integer values. There's no implicit integer-to-pointer conversion in the language, aside from conversion from null-pointer constant 0. If you need to force a specific integer value into a pointer for some reason, you have to use an explicit cast (e.g. int *p = (int *) 100;).
Even if your code somehow compiles, its behavior is not defined by the C language, which means that there's no "should be" answer here.

Your code is undefined behavior.
You cannot simply subtract two "arbitrary" pointers. Quoting C11, chapter §6.5.6/P9
When two pointers are subtracted, both shall point to elements of the same array object,
or one past the last element of the array object; the result is the difference of the
subscripts of the two array elements. The size of the result is implementation-defined,
and its type (a signed integer type) is ptrdiff_t defined in the <stddef.h> header. [....]
Also, as mentioned above, if you correctly subtract two pointers, the result would be of type ptrdiff_t and you should use %td to print the result.
That being said, the initialization
int *p = 100;
looks quite wrong in itself. To clarify, it does not store the value 100 in the memory location pointed to by p (and where would p point?). It attempts to set the pointer variable itself to the integer value 100, which is a constraint violation in itself.

According to the standard (N1570)
When two pointers are subtracted, both shall point to elements of
the same array object, or one past the last element of the array
object; the result is the difference of the subscripts of the two
array elements.

These are integer pointers, sizeof(int) is 4. Pointer arithmetic is done in units of the size of the thing pointed to. Therefore the "raw" difference in bytes is divided by 4. Also, the result is a ptrdiff_t so %d is unlikely to cut it.
But please note, what you are doing is technically undefined behaviour as Sourav points out. It works in the most common environments almost by accident. However, if p and q point into the same array, the behaviour is defined.
int a[100];
int *p = a + 23;
int *q = a + 25;
printf("0x%" PRIXPTR "\n", (uintptr_t)a); // some number
printf("0x%" PRIXPTR "\n", (uintptr_t)p); // some number + 92
printf("0x%" PRIXPTR "\n", (uintptr_t)q); // some number + 100
printf("%td\n", q - p); // 2

Related

Simple implementation of sizeof in C

I came across one simple (maybe over simplified) implementation of the sizeof operator in C, which goes as follows:
#include <stdio.h>
#define mySizeof(type) ((char*)(&type + 1) - (char*)(&type))
int main() {
char x;
int y;
double z;
printf("mySizeof(char) is : %ld\n", mySizeof(x));
printf("mySizeof(int) is : %ld\n", mySizeof(y));
printf("mySizeof(double) is : %ld\n", mySizeof(z));
}
Note: Please ignore whether this simple function can work in all cases; that's not the purpose of this post (though it works for the three cases defined in the program).
My question is: How does it work? (Especially the char* casting part.)
I did some investigations as follows:
#include <stdio.h>
#define Address(x) (&x)
#define NextAddress(x) (&x + 1)
int main() {
int n = 1;
printf("address is : %lld\n", Address(n));
printf("next address is : %lld\n", NextAddress(n));
printf("size is %lld\n", NextAddress(n) - Address(n));
return 0;
}
The above sample program outputs:
address is : 140721498241924
next address is : 140721498241928
size is 1
I can see the addresses of &x and &x + 1. Notice that the difference is 4, which means 4 bytes, since the variable is int type. But, when I do the subtraction operation, the result is 1.
What you have to remember here is that pointer arithmetic is performed in units of the size of the pointed-to type.
So, if p is a pointer to the first element of an int array, then *p refers to that first element and the result of the p + 1 operation will be the address resulting from adding the size of an int to the address in p; thus, *(p + 1) will refer to the second element of the array, as it should.
In your mySizeof macro, the &type + 1 expression will yield the result of adding the size of the relevant type to the address of type; so, in order for the subsequent subtraction of &type to yield the size in bytes, we cast the pointers to char*, so that the subtraction will be performed in base units of the size of a char … which is guaranteed by the C Standard to be 1 byte.
Pointers carry the information about their type. If you have a pointer to a 4-byte value such as int, and add 1 to it, you get a pointer to the next int, not a pointer to the second byte of the original int. Similarly for subtraction.
If you want to obtain the item size in bytes, it's necessary to force pointers to point to byte-like items. Hence the typecast to char*.
See also Pointer Arithmetic
Your implementation of sizeof works for most objects, although you should modify it as follows:
The misnamed macro argument type (which cannot be a type) should be parenthesized in the expansion to avoid operator precedence issues.
The expression has type ptrdiff_t; it should be cast to size_t.
The printf format for size_t is %zu. Note that %ld is incorrect for ptrdiff_t; you should use %td for that.
Here is a modified version:
#include <stdio.h>
#define mySizeof(obj) ((size_t)((char *)(&(obj) + 1) - (char *)&(obj)))
int main() {
char x;
int y;
double z;
printf("mySizeof(char) is : %zu\n", mySizeof(x));
printf("mySizeof(int) is : %zu\n", mySizeof(y));
printf("mySizeof(double) is : %zu\n", mySizeof(z));
return 0;
}
How it works:
Valid pointers can point to an element of an array or to the element just past the last element of the array. Objects that are not arrays are treated as arrays of 1 element for this purpose.
So if obj is a valid lvalue, &(obj) + 1 is a valid pointer just past the end of obj in memory, and casting it to (char *) is valid.
Similarly, (char *)&(obj) is a valid pointer to the beginning of the object, and the only iffy operation here is the subtraction of 2 valid pointers that cannot be considered to point to the same array of char.
The C standard makes a special case of character type pointers to allow the representation of objects to be accessed as individual bytes. So (char *)(&(obj) + 1) - (char *)&(obj) effectively evaluates to the number of bytes in the representation of obj.
Note these limitations for this implementation of sizeof:
it does not work for types as in mySizeof(int)
the argument must be an object: mySizeof(1) does not work, nor mySizeof(x + 1)
the object may be struct or an array: char foo[3]; mySizeof(foo) but not a string literal: mySizeof("abc") nor a compound literal: mySizeof((char[2]){'a','b'})

subtracting two addresses giving wrong output

int main()
{
int x = 4;
int *p = &x;
int *k = p++;
int r = p - k;
printf("%d %d %d", p,k,p-k);
getch();
}
Output:
2752116 2752112 1
Why not 4?
And also I can't use p+k or any other operator except - (subtraction).
First of all, you MUST use the correct argument type for the supplied format specifier; supplying a mismatched argument type causes undefined behavior.
You must use the %p format specifier and cast the argument to void * to print an address (a pointer).
To print the result of a pointer subtraction, you should use %td, as the result is of type ptrdiff_t.
That said, regarding the result 1 for the subtraction, pointer arithmetic honors the data type. Quoting C11, chapter §6.5.6, (emphasis mine)
When two pointers are subtracted, both shall point to elements of the same array object,
or one past the last element of the array object; the result is the difference of the
subscripts of the two array elements. The size of the result is implementation-defined,
and its type (a signed integer type) is ptrdiff_t defined in the <stddef.h> header. [....] if the expressions P and Q point to, respectively, the i-th and j-th elements of
an array object, the expression (P)-(Q) has the value i−j provided the value fits in an object of type ptrdiff_t. [....]
So, in your case, p and k are one element apart, i.e., i − j == 1, hence the result.
Finally, you cannot add (or multiply or divide) two pointers, because that is meaningless. Pointers are memory locations, and logically you cannot make sense of adding two memory locations. Only subtraction makes sense, to find the relative distance between two array elements.
Related Constraints, from C11, chapter §6.5.6, additive operators,
For addition, either both operands shall have arithmetic type, or one operand shall be a
pointer to a complete object type and the other shall have integer type. (Incrementing is
equivalent to adding 1.)
What you are getting is the difference between the subscripts of two elements.
C11-6.5.6p9:
When two pointers are subtracted, both shall point to elements of the same array object, or one past the last element of the array object; the result is the difference of the subscripts of the two array elements.
Also note that the statement
printf("%d %d %d", p,k,p-k);
should be
printf("%p %p %td\n", (void*)p,(void*)k, p-k);
If your variable is a pointer, every calculation on it is scaled by the size of the pointed-to type.
For example:
//Let's assume char is 1 byte and int is 4 bytes long,
//so sizeof(*cp) = 1 and sizeof(*ip) = 4.
char *cp = (char *)10; //Char itself is 1 byte
int *ip = (int *)10;
cp++; //Increase pointer, let us point to the next char location
ip++; //Increase pointer, let us point to the next int location
printf("Char: %p\r\n", (void *)cp); //Address 11 (%p usually shows it in hex: 0xb)
printf("Int: %p\r\n", (void *)ip); //Address 14 (%p usually shows it in hex: 0xe)
The first case gives address 11 while the second gives 14. That's because the next char element is 1 byte further on, while the next int element is 4 bytes further on.
If you have 2 pointers of the same type (e.g. int *, as in your code) and one points to 14 and the other to 10, there is room for exactly 1 int between them, so subtracting them gives you 1.
If you want the result 4, cast the pointers to char * before subtracting: sizeof(char) is always 1, so there are 4 elements between addresses 10 and 14 and you get 4.
Hope it helps.
First of all, adding 2 pointers is not defined, so if you use the + operator on them you will get a compile error.
Second, the output is correct: when you subtract 2 pointers, the result is how many boxes of that type lie between the pointers, not the number of bytes.
You say :
int* p1 = &x;
int* p2 = p1++;
So between p1 and p2 there are 4 bytes. They are both of type int *, so there is only 1 box of int between them.

Distance between arbitrary pointers

I am trying to print the distance between two pointers, but I have found that sometimes the code doesn't work well.
#include <stdio.h>
#include <math.h>
/**
* Print the distance between 2 pointers
*/
void distance(int * a0, int * a1){
size_t difference = (size_t) a1 - (size_t) a0;
printf("distance between %p & %p: %u\n" ,a0, a1, abs((int) difference));
}
Trying this works perfectly!!
int main(void){
int x = 100;
int y = 3000;
distance(&x, &y);
return 0;
}
printing (example):
distance between 0028ff18 & 0028ff14: 4
But it starts going wrong with this code:
int main(void){
int x = 100;
int p = 1500;
int y = 3000;
distance(&x, &y);
p = p + 2; // remove unused warning
// &p
return 0;
}
printing (example):
distance between 0028ff18 & 0028ff14: 4
It should print 8, because an integer now separates these values!
But if I uncomment //&p, it works again.
It is as if the variable p does not exist until its memory address is used.
I'm using gcc 4.9.3 on windows 7 (64 bits)
p is not used in your program and is likely to be optimized out. Anyway this is implementation details and the compiler is free to even change the order of the x and y objects in memory.
it is as if the variable p not exist until its memory address is used
gcc will optimize away any variables that do not affect the behavior of the program.
Secondly, you cannot assume that variables are laid out in any particular order in memory, and the "distance" between any two may be completely meaningless. The only time you can rely on the distance between two pointers to have any meaning is when they are both pointing to elements within the same array object.
Pointer arithmetic is valid only between pointers to elements of the same array, or one past the end. Going against that is undefined behavior, so the results are unpredictable. In this case, it seems you have found the answer yourself: gcc is optimizing p away, but in any case, you can't assume any specific order of variables in memory, let alone do pointer arithmetic on them.
Also, this:
size_t difference = (size_t) a1 - (size_t) a0;
Should be:
ptrdiff_t difference = a1 - a0;
The correct type for the difference between two pointers is ptrdiff_t (defined in stddef.h). You shouldn't cast the pointers to size_t because pointer values are not necessarily representable in a size_t (if you want a numeric type to convert pointers to, use uintptr_t or intptr_t).
The %p format specifier expects a void *, so you should cast the pointers, and the correct format specifier for ptrdiff_t is %td:
printf("distance between %p & %p: %td\n", (void *) a0, (void *) a1, difference);

typecasting a pointer to an int

I can't understand the output of this program.
What I understand is that, first of all, the pointers p, q, r, s were pointing to null.
Then there has been a typecast. But how the heck did the output come out as 1 4 4 8? I might be very wrong in my thinking, so please correct me if I am.
int main()
{
int a, b, c, d;
char* p = (char*)0;
int *q = (int *)0;
float* r = (float*)0;
double* s = (double*)0;
a = (int)(p + 1);
b = (int)(q + 1);
c = (int)(r + 1);
d = (int)(s + 1);
printf("%d %d %d %d\n", a, b, c, d);
_getch();
return 0;
}
Pointer arithmetic, in this case adding an integer value to a pointer value, advances the pointer value in units of the type it points to. If you have a pointer to an 8-byte type, adding 1 to that pointer will advance the pointer by 8 bytes.
Pointer arithmetic is valid only if both the original pointer and the result of the addition point to elements of the same array object, or just past the end of it.
The way the C standard describes this is (N1570 6.5.6 paragraph 8):
When an expression that has integer type is added to or subtracted
from a pointer, the result has the type of the pointer operand. If the
pointer operand points to an element of an array object, and the array
is large enough, the result points to an element offset from the
original element such that the difference of the subscripts of the
resulting and original array elements equals the integer expression.
[...]
If both the pointer operand and the result point to elements of the
same array object, or one past the last element of the array object,
the evaluation shall not produce an overflow; otherwise, the behavior
is undefined. If the result points one past the last element of the
array object, it shall not be used as the operand of a unary *
operator that is evaluated.
A pointer just past the end of an array is valid, but you can't dereference it. A single non-array object is treated as a 1-element array.
Your program has undefined behavior. You add 1 to a null pointer. Since the null pointer doesn't point to any object, pointer arithmetic on it is undefined.
But compilers aren't required to detect undefined behavior, and your program will probably treat a null pointer just like any valid pointer value, and perform arithmetic on it in the same way. So if the null pointer points to address 0 (this is not guaranteed, BTW, but it's very common), then adding 1 to it will probably give you a pointer to address N, where N is the size in bytes of the type it points to.
You then convert the resulting pointer to int (which is at best implementation-defined, will lose information if pointers are bigger than int, and may yield a trap representation) and you print the int value. The result, on most systems, will probably show you the sizes of char, int, float, and double, which are commonly 1, 4, 4, and 8 bytes, respectively.
Your program's behavior is undefined, but the way it actually behaves on your system is typical and unsurprising.
Here's a program that doesn't have undefined behavior that illustrates the same point:
#include <stdio.h>
int main(void) {
char c;
int i;
float f;
double d;
char *p = &c;
int *q = &i;
float *r = &f;
double *s = &d;
printf("char: %p --> %p\n", (void*)p, (void*)(p + 1));
printf("int: %p --> %p\n", (void*)q, (void*)(q + 1));
printf("float: %p --> %p\n", (void*)r, (void*)(r + 1));
printf("double: %p --> %p\n", (void*)s, (void*)(s + 1));
return 0;
}
and the output on my system:
char: 0x7fffa67dc84f --> 0x7fffa67dc850
int: 0x7fffa67dc850 --> 0x7fffa67dc854
float: 0x7fffa67dc854 --> 0x7fffa67dc858
double: 0x7fffa67dc858 --> 0x7fffa67dc860
The output is not as clear as your program's output, but if you examine the results closely you can see that adding 1 to a char* advances it by 1 byte, an int* or float* by 4 bytes, and a double* by 8 bytes. (Other than char, which by definition has a size of 1 byte, these may vary on some systems.)
Note that the output of the "%p" format is implementation-defined, and may or may not reflect the kind of arithmetic relationship you might expect. I've worked on systems (Cray vector computers) where incrementing a char* pointer would actually update a byte offset stored in the high-order 3 bits of the 64-bit word. On such a system, the output of my program (and of yours) would be much more difficult to interpret unless you know the low-level details of how the machine and compiler work.
But for most purposes, you don't need to know those low-level details. What's important is that pointer arithmetic works as it's described in the C standard. Knowing how it's done on the bit level can be useful for debugging (that's pretty much what %p is for), but is not necessary to writing correct code.
Adding 1 to a pointer advances the pointer to the next address appropriate for the pointer's type.
When the (null) pointers + 1 are converted to int, you are effectively printing the size of each of the types pointed to.
printf("%zu %zu %zu %zu\n", sizeof(char), sizeof(int), sizeof(float), sizeof(double));
does pretty much the same thing. If you want to increment each pointer by only 1 BYTE, you'll need to cast them to (char *) before incrementing them, to let the compiler know that the step size is one byte.
Search for information about pointer arithmetic to learn more.
You're casting the pointers to primitive data types, rather than casting them to pointers and then using the * (indirection) operator to reach the value they point to. For instance, (int)(p + 1) means that p, a null pointer to char, is first advanced to the next address in memory (0x1 in this case), and then this 0x1 is cast to an int. This totally makes sense.
The output you get is related to the size of each of the relevant types. When you do pointer arithmetic as such, it increases the value of the pointer by the added value times the base type size. This occurs to facilitate proper array access.
Because the size of char, int, float, and double are 1, 4, 4, and 8 respectively on your machine, those are reflected when you add 1 to each of the associated pointers.
Edit:
Removed the alternate code which I thought did not exhibit undefined behavior, which in fact did.

adding two number using pointers

I found this code in the internet for adding two numbers using pointers.
couldn't understand how it is working? Any help would be appreciated.
#include <stdio.h>
#include <conio.h>
int main()
{
int a,b,sum;
char *p;
printf("Enter 2 values : ");
scanf("%d%d",&a,&b);
p = (char *)a; // Using pointers
sum = (int)&p[b];
printf("sum = %d",sum);
getch();
return 0;
}
The following line interprets the value in a as an address:
p = (char *)a;
&p[b] is the address of the b-th element of the array starting at p. So, as each element of the array has a size of 1, it's a char pointer pointing at address p+b. As p contains a, that is the address a+b.
Finally, the following line converts back the pointer to an int:
sum = (int)&p[b];
But needless to say: it's a weird construct.
Additional remarks:
Please note that there are limitations, according to the C++ standard:
5.2.10/5: A value of integral type (...) can be explicitly converted to a pointer.
5.2.10/4: A pointer can be explicitly converted to any integral type large enough to hold it.
So better verify that sizeof(int) >= sizeof(char*).
Finally, although this addition will work on most implementations, this is not a guaranteed behaviour on all CPU architectures, because the mapping function between integers and pointers is implementation-defined:
A pointer converted to an integer of sufficient size (if any such
exists on the implementation) and back to the same pointer type will
have its original value; mappings between pointers and integers are
otherwise implementation-defined.
First a is converted to a pointer with the same value. It doesn't point to anything really, it's just the same value.
The expression p[b] will add b to p and refer to the value at that position.
Then the address of the p[b] element is taken and converted to an integer.
As commented, it is valid, but horrible code - just a party trick.
p = (char *)a;
p takes the value of a entered as a supposed address.
sum = (int)&p[b];
the address of the bth element of a char array is at p + b.
Since p == a (numerically), the correct sum is obtained.
To take a worked example, enter 46 and 11.
p = (char *)a; // p = 46
sum = (int)&p[b]; // the address of p[b] = 46 + 11 = 57
Note: nowhere is *p or p[b] written or read, and size does not matter - except for the char array, where pointer arithmetic is in units of 1.

Resources