Why does pointer subtraction in C yield an integer? - c

Why is it that when I subtract one pointer from another (both int pointers) without casting, the result is 1 and not 4 bytes (as it is when I cast both pointers to int)? Example:
int a , b , *p , *q;
p = &b;
q = p + 1; // q = &a;
printf("%d",q - p); // The result will be one .
printf("%d",(int)q - (int)p); // The result will be 4(bytes). The memory address of b minus The memory address of a.

According to the C Standard (6.5.6 Additive operators)
9 When two pointers are subtracted, both shall point to elements of
the same array object, or one past the last element of the array
object; the result is the difference of the subscripts of the two
array elements....
If the two pointers pointed to elements of the same array then as it is said in the quote from the Standard
the result is the difference of the subscripts of the two array
elements
That is you would get the number of elements of the array between these two pointers. It is the result of the so-called pointer arithmetic.
If you instead subtract the addresses stored in the pointers as integer values, you get the result of ordinary integer subtraction, that is, the distance between the two addresses in bytes.

Why if I subtract from a pointer another pointer (integer pointers) without typecasting the result will be 1 and not 4 bytes
That's the whole point of the data type that a pointer points to. It's probably easier to see in an array context like the one below. Regardless of the underlying data type (here long or double), you can use pointer arithmetic to navigate the array without caring about the exact size of its elements. In other words, (pointer + 1) means "point to the next element", regardless of the type.
long l[] = { 10e4, 10e5, 10e6 };
long *pl = l + 1; // point to the 2nd element in the "long" array.
double d[] = { 10e7, 10e8, 10e9 };
double *pd = d + 2; // point to the 3rd element in the "double" array.
Also note in your code:
int a , b , *p , *q;
p = &b;
q = p + 1; // q = &a; <--- NO this is wrong.
The fact that a and b are declared next to each other does not mean that they are allocated next to each other in memory. So q points to the address just past b, but what is stored at that address is not specified, and dereferencing q is undefined behaviour.

Because the ptrdiff_t from pointer subtraction is calculated relative to the size of the elements pointed to. It's a lot more convenient that way; for one, it tells you how many times you can increment one pointer before you reach the other pointer.

Where you have
int a , b , *p , *q;
The compiler can put a and b anywhere. They don't even have to be near each other. Also, when you subtract two int pointers, the result is measured in units of int, not bytes.

C is not assembly language. So pointers are not just plain integers -- pointers are special guys that know how to point to other things.
It's fundamental to the way pointers and pointer arithmetic work in C that they can point to successive elements of an array. So if we write
int a[10];
int *p1 = &a[4];
int *p2 = &a[3];
then p1 - p2 will be 1. The result is 1 because the "distance" between a[3] and a[4] is one int. The result is 1 because 4 - 3 = 1. The result is not 4 (as you might have thought it would be if you know that ints are 32 bits on your machine) because we're not interested in doing assembly language programming or working with machine addresses; we're doing higher-level language programming with an array, and we're thinking in those terms.
(But, yes, at the machine address level, the way p1 - p2 is computed is typically as (<raw address value in p1> - <raw address value in p2>) / sizeof(int).)

Related

Adding the integer to hexadecimal address and How is the pointers calculation done in C? [duplicate]

#include<stdio.h>
int main(void){
int *ptr,a,b;
a = ptr;
b = ptr + 1;
printf("the vale of a,b is %x and %x respectively",a,b);
int c,d;
c = 0xff;
d = c + 1;
printf("the value of c d are %x and %x respectively",c,d);
return 0;
}
The output is
the vale of a,b is 57550c90 and 57550c94 respectively
the value of c d are ff and 100 respectively
It turns out that ptr + 1 actually adds 4 to the address. Why does it behave this way?
Because pointers are designed to be compatible with arrays:
*(pointer + offset)
is equivalent to
pointer[offset]
So pointer arithmetic doesn't work in terms of bytes, but in terms of blocks of sizeof(pointer base type) bytes.
Consider what a pointer is... it's a memory address. Every byte in memory has an address. So, if you have an int that's 4 bytes and its address is 1000, 1001 is actually the 2nd byte of that int and 1002 is the third byte and 1003 is the fourth. Since the size of an int might vary from compiler to compiler, it is imperative that when you increment your pointer you don't get the address of some middle point in the int. So, the job of figuring out how many bytes to skip, based on your data type, is handled for you and you can just use whatever value you get and not worry about it.
As Basile Starynkvitch points out, this amount will vary depending on the sizeof property of the data member pointed to. It's very easy to forget that even though addresses are sequential, the pointers of your objects need to take into account the actual memory space required to house those objects.
Pointer arithmetic is a tricky subject. Adding to a pointer moves it to a later pointed-to element, so the address is incremented by the sizeof of the pointed-to element.
Short answer
The address held by the pointer will be incremented by sizeof(T), where T is the type pointed to. So for an int, the pointer will be incremented by sizeof(int).
Why?
Well first and foremost, the standard requires it. The reason this behaviour is useful (other than for compatibility with C) is because when you have a data structure which uses contiguous memory, like an array or an std::vector, you can move to the next item in the array by simply adding one to the pointer. If you want to move to the nth item in the container, you just add n.
Being able to write firstAddress + 2 is far simpler than firstAddress + (sizeof(T) * 2), and helps prevent bugs arising from developers assuming sizeof(int) is 4 (it might not be) and writing code like firstAddress + (4 * 2).
In fact, when you say myArray[4], you're saying *(myArray + 4). This is the reason that array indices start at 0; you just add 0 to get the first element (i.e. myArray points to the first element of the array) and n to get the element at index n.
What if I want to move one byte at a time?
sizeof(char) is guaranteed to be 1, so you can use a char* if you really want to move one byte at a time.
A pointer is used to point to a specific byte of memory marking where an object has been allocated (technically it can point anywhere, but that's how it's used). When you do pointer arithmetic, it operates based on the size of the objects pointed to. In your case, it's a pointer to int, which is 4 bytes on your platform.
Consider a pointer p. The expression p+n points to the same address as (unsigned char *)p + n * sizeof *p (because sizeof(unsigned char) == 1).
Try this :
#include <stdio.h>
#define N 3
int
main(void)
{
int i;
int *p = &i;
printf("%p\n", (void *)p);
printf("%p\n", (void *)(p + N));
printf("%p\n", (void *)((unsigned char *)p + N * sizeof *p));
return 0;
}

Meaning of int a[10]; int *p = a+9;

I am currently trying to understand pointers in C but I am having a hard time understanding this code:
int a[10];
int *p = a+9;
while ( p > a )
*p-- = (int)(p-a);
I understand the code to some degree. I can see that an array with 10 integer elements is created then a pointer variable to type int is declared. (But I don't understand what a+9 means: does this change the value of the array?).
It would be very helpful if someone could explain this step by step, since I am new to pointers in C.
When used in an expression (1), the name of an array in C 'decays' to a pointer to its first element. Thus, in the expression a + 9, the a is equivalent to an int* variable that has the value of &a[0].
Also, pointer arithmetic works in units of the pointed-to type; so, adding 9 to &a[0] means that you get the address of a[9] – the last element of the array. So, overall, the p = a + 9 expression assigns the address of the array's last element to the p pointer (but it does not change anything in that array).
The subsequent while loop, however, does change the values of the array's elements, setting each to the value of its position (the result of the p - a expression) and decrementing the address in p by the size of an int. (Well, that's what it's probably intended to do; but, as mentioned in the comments, the use of such "unsequenced operations" – i.e. the use of p-- and p - a in the same statement – is actually undefined behaviour because, in this case, the C Standard does not dictate which of those two expressions should be evaluated first.)
To avoid that undefined behaviour, the code should be written to use an explicit intermediate, like this:
int main()
{
int a[10];
int* p = a + 9;
while (p > a) {
int n = (int)(p - a); // Get the value FIRST ...
*p-- = n; // ... only THEN assign it
}
return 0;
}
(1) There are two exceptions: when the array name is used as the operand of the sizeof operator or of the unary & (address-of) operator.
int a[10];
This declares an array on e.g. the stack. a represents the starting address of the array. The declaration tells the compiler that a will hold 10 integers. C assumes you know what you are doing so it is up to you to keep yourself in that range.
int *p = a+9;
p is declared as a pointer, which works like a real-life street address. When you add an offset to a, the offset is added to the address a. The compiler converts an offset such as +5 to bytes (+5*sizeof(int)) so you don't need to think about that. Your pointer p now points inside the array at offset 9, which is the last int in the array a, since indices start at 0 in C.
while( p > a )
The condition says: do this while the address held in p is greater than the address where a starts.
*p-- = (int)(p-a);
Here the value that p points to is overwritten with a crude (1) subtraction between the current p and the starting address a, before the pointer p is decremented.
(1) Undefined Behavior

Why does this expression come out to 4 in C?

So this expression comes out to 4:
int a[] = {1,2,3,4,5}, i= 3, b,c,d;
int *p = &i, *q = a;
char *format = "\n%d\n%d\n%d\n%d\n%d";
printf("%ld",(long unsigned)(q+1) - (long unsigned)q);
I have to explain it in my homework and I have no idea why it's coming out to that value. I see (long unsigned) casting q+1, and then we subtract the value of whatever q is pointing at as a long unsigned and I assumed we would be left with 1. Why is this not the case?
Because q is a pointer the expression q+1 employs pointer arithmetic. This means that q+1 points to one element after q, not one byte after q.
The type of q is int *, meaning it points to an int. The size of an int on your platform is most likely 4 bytes, so adding 1 to an int * actually adds 4 to the raw pointer value so that it points to the next int in the array.
Try printing the parts of the expression and it becomes a bit clearer what is going on.
printf("%p\n",(void *)(q+1));
printf("%p\n",(void *)q);
printf("%lu\n",(long unsigned)(q+1));
printf("%lu\n",(long unsigned)q);
It becomes clearer that q points to the zeroth element of a, and q+1 points to the next element of a. ints are 4 bytes on my machine (and presumably on yours), so the two addresses are four bytes apart. Casting the pointers to unsigned values does not change them on my machine, so printing the difference between the two gives a value of 4.
0x7fff70c3d1a4
0x7fff70c3d1a0
140735085269412
140735085269408
It's because sizeof(int) is 4.
This is an esoteric corner of C that is usually best avoided.
(If it doesn't make sense yet, add some temporary variables).
BTW, the printf format string is incorrect (it uses %ld for an unsigned long value, where %lu would match). But that's not why it's outputting 4.

adding two number using pointers

I found this code on the internet for adding two numbers using pointers.
I couldn't understand how it works. Any help would be appreciated.
#include <stdio.h>
#include <conio.h>
int main()
{
int a,b,sum;
char *p;
printf("Enter 2 values : ");
scanf("%d%d",&a,&b);
p = (char *)a; // Using pointers
sum = (int)&p[b];
printf("sum = %d",sum);
getch();
return 0;
}
The following line interprets the value in a as an address:
p = (char *)a;
&p[b] is the address of the b-th element of the array starting at p. So, as each element of the array has a size of 1, it's a char pointer pointing at address p+b. As p contains a, that address is a+b.
Finally, the following line converts back the pointer to an int:
sum = (int)&p[b];
But needless to say: it's a weird construct.
Additional remarks:
Please note that there are limitations, according to the C++ standard:
5.2.10/5: A value of integral type (...) can be explicitly converted to a pointer.
5.2.10/4: A pointer can be explicitly converted to any integral type large enough to hold it.
So better verify that sizeof(int) >= sizeof(char*).
Finally, although this addition will work on most implementations, this is not a guaranteed behaviour on all CPU architectures, because the mapping function between integers and pointers is implementation-defined:
A pointer converted to an integer of sufficient size (if any such
exists on the implementation) and back to the same pointer type will
have its original value; mappings between pointers and integers are
otherwise implementation-defined.
First a is converted to a pointer with the same value. It doesn't point to anything really, it's just the same value.
The expression p[b] will add b to p and refer to the value at that position.
Then the address of the p[b] element is taken and converted to an integer.
As commented, it is valid, but horrible code - just a party trick.
p = (char *)a;
p takes the value of a entered as a supposed address.
sum = (int)&p[b];
the address of the bth element of a char array is at p + b.
Since p == a (numerically), the correct sum is obtained.
To take a worked example, enter 46 and 11.
p = (char *)a; // p = 46
sum = (int)&p[b]; // the address of p[b] = 46 + 11 = 57
Note: nowhere is *p or p[b] written or read, and size does not matter - except for the char array, where pointer arithmetic is in units of 1.
