Typecasting a pointer to an int - C

I can't understand the output of this program.
What I understand is that, first of all, the pointers p, q, r, and s are null pointers.
Then there is a typecast. But how did the output come out as 1 4 4 8? I might be very wrong in my thinking, so please correct me if I am.
int main()
{
int a, b, c, d;
char* p = (char*)0;
int *q = (int *)0;
float* r = (float*)0;
double* s = (double*)0;
a = (int)(p + 1);
b = (int)(q + 1);
c = (int)(r + 1);
d = (int)(s + 1);
printf("%d %d %d %d\n", a, b, c, d);
_getch();
return 0;
}

Pointer arithmetic, in this case adding an integer value to a pointer value, advances the pointer value in units of the type it points to. If you have a pointer to an 8-byte type, adding 1 to that pointer will advance the pointer by 8 bytes.
Pointer arithmetic is valid only if both the original pointer and the result of the addition point to elements of the same array object, or just past the end of it.
The way the C standard describes this is (N1570 6.5.6 paragraph 8):
When an expression that has integer type is added to or subtracted
from a pointer, the result has the type of the pointer operand. If the
pointer operand points to an element of an array object, and the array
is large enough, the result points to an element offset from the
original element such that the difference of the subscripts of the
resulting and original array elements equals the integer expression.
[...]
If both the pointer operand and the result point to elements of the
same array object, or one past the last element of the array object,
the evaluation shall not produce an overflow; otherwise, the behavior
is undefined. If the result points one past the last element of the
array object, it shall not be used as the operand of a unary *
operator that is evaluated.
A pointer just past the end of an array is valid, but you can't dereference it. A single non-array object is treated as a 1-element array.
Your program has undefined behavior. You add 1 to a null pointer. Since the null pointer doesn't point to any object, pointer arithmetic on it is undefined.
But compilers aren't required to detect undefined behavior, and your program will probably treat a null pointer just like any valid pointer value, and perform arithmetic on it in the same way. So if the null pointer points to address 0 (this is not guaranteed, BTW, but it's very common), then adding 1 to it will probably give you a pointer to address N, where N is the size in bytes of the type it points to.
You then convert the resulting pointer to int (which is at best implementation-defined, will lose information if pointers are bigger than int, and may yield a trap representation) and you print the int value. The result, on most systems, will probably show you the sizes of char, int, float, and double, which are commonly 1, 4, 4, and 8 bytes, respectively.
Your program's behavior is undefined, but the way it actually behaves on your system is typical and unsurprising.
Here's a program that doesn't have undefined behavior that illustrates the same point:
#include <stdio.h>
int main(void) {
char c;
int i;
float f;
double d;
char *p = &c;
int *q = &i;
float *r = &f;
double *s = &d;
printf("char: %p --> %p\n", (void*)p, (void*)(p + 1));
printf("int: %p --> %p\n", (void*)q, (void*)(q + 1));
printf("float: %p --> %p\n", (void*)r, (void*)(r + 1));
printf("double: %p --> %p\n", (void*)s, (void*)(s + 1));
return 0;
}
and the output on my system:
char: 0x7fffa67dc84f --> 0x7fffa67dc850
int: 0x7fffa67dc850 --> 0x7fffa67dc854
float: 0x7fffa67dc854 --> 0x7fffa67dc858
double: 0x7fffa67dc858 --> 0x7fffa67dc860
The output is not as clear as your program's output, but if you examine the results closely you can see that adding 1 to a char* advances it by 1 byte, an int* or float* by 4 bytes, and a double* by 8 bytes. (Other than char, which by definition has a size of 1 byte, these may vary on some systems.)
Note that the output of the "%p" format is implementation-defined, and may or may not reflect the kind of arithmetic relationship you might expect. I've worked on systems (Cray vector computers) where incrementing a char* pointer would actually update a byte offset stored in the high-order 3 bits of the 64-bit word. On such a system, the output of my program (and of yours) would be much more difficult to interpret unless you know the low-level details of how the machine and compiler work.
But for most purposes, you don't need to know those low-level details. What's important is that pointer arithmetic works as it's described in the C standard. Knowing how it's done on the bit level can be useful for debugging (that's pretty much what %p is for), but is not necessary to writing correct code.

Adding 1 to a pointer advances the pointer to the next address appropriate for the pointer's type.
When the (null) pointers + 1 are cast to int, you are effectively printing the size of each of the types pointed to by the pointers.
printf("%zu %zu %zu %zu\n", sizeof(char), sizeof(int), sizeof(float), sizeof(double));
does pretty much the same thing. If you want to increment each pointer by only 1 byte, you'll need to cast it to (char *) before incrementing, so the compiler knows to step in bytes.
Search for information about pointer arithmetic to learn more.

You're typecasting the pointers to primitive data types rather than casting them to pointers and then using the * (indirection) operator to read the value they point to. For instance, (int)(p + 1) means that p, a null pointer to char, is first advanced to the next address in memory (0x1 in this case), and then this 0x1 is cast to an int. This totally makes sense.

The output you get is related to the size of each of the relevant types. When you do pointer arithmetic as such, it increases the value of the pointer by the added value times the base type size. This occurs to facilitate proper array access.
Because the size of char, int, float, and double are 1, 4, 4, and 8 respectively on your machine, those are reflected when you add 1 to each of the associated pointers.
Edit:
Removed the alternate code which I thought did not exhibit undefined behavior, which in fact did.

Related

Memory address and content

I have written
int a;
printf("addr = %p and content = %x\n", (void*)(&a), *(&a));
printf("addr = %p and content = %x\n", (void*)(&a)+1, *(&a)+1);
What I see in the output is
addr = 0x7fffffffde3a and content = 55554810
addr = 0x7fffffffde3b and content = 5555
I expect to see one byte in each address. However, I don't see such thing. Why?
First of all, pointer arithmetic and the dereference operator honor the data type.
Remember, pointer arithmetic that generates a pointer one past the last element of an array is valid, but attempting to dereference that pointer is undefined behavior.
Attempting to dereference a pointer that points to an invalid memory location is also undefined behavior.
That said,
Quoting C11,
The unary * operator denotes indirection. [...] If the operand has type ‘‘pointer to type’’, the result has type ‘‘type’’.
So, in your case, *(&a) is the same as a, which is of type int, and the format specifier prints the integer value stored in a.
If you want to see the byte-by-byte value, you need to cast the pointer (address of a) to a char * and then, dereference the pointer to see value stored in each byte.
So, (void*)(&a)+1 should be changed to (char*)(&a)+1 to point to the next byte of memory.
If you want to print bytes, then play with bytes:
use unsigned char* or uint8_t* for pointers
use %hhx to tell printf that input value is character length.
Example:
int a = 0x12345678;
printf("addr = %p and content = %hhx\n", (void*)&a, *(uint8_t*)&a);
printf("addr = %p and content = %hhx\n", (void*)((uint8_t*)&a+1), *((uint8_t*)&a+1));
Result in my little-endian env:
addr = 0xbedbacac and content = 78
addr = 0xbedbacad and content = 56
And don't forget to use *((uint8_t*)&a+1) instead of *(uint8_t*)&a+1. In the example the latter would return 79 (78+1).
You've used the formatting directive %x. %x expects an unsigned int. The type of what you pass as the corresponding argument to printf does not change that. printf will read an unsigned int value from its arguments, no matter what you pass to it. If you pass something that isn't compatible with what printf reads, the behavior of your program is undefined.
You happen to pass an int. An int is not an unsigned int, but it's close enough that when you read an int expecting an unsigned int, the value remains the same if the integer is positive or zero. Negative integers are mapped to a large positive value (UINT_MAX + 1 + a on machines using two's complement representation, which is almost all machines).
If you pass a char (signed or not) to printf when it expects an int (signed or not), the behavior is well-defined due to another feature of the C language, which is promotions. Values of integer types that are smaller than int (i.e. char and short) are converted to int¹. So the following program snippet is well-defined and prints the value of a byte at the address p (assuming that … is replaced by a valid pointer value):
unsigned char *p = …;
printf("addr = %p and content = %x\n", (void*)p, *p);
*(&a) is the same thing as a. To see just one byte of a, you could cast the pointer &a to the type unsigned char *.
printf("addr = %p and content = %x\n", (void*)&a, *(unsigned char *)(&a));
This will print one byte of the representation of a in memory. Note that the representation depends on the machine's endianness.
Your code snippet doesn't initialize a, so the first call to printf prints whatever garbage happens to be at the location of a in memory at this time. This assumes that int does not have any trap representation, which is the case on virtually all C implementations.
The second call to printf tries to print *(&a)+1. This is just a+1. The output you get is surprising: are you sure you didn't actually run the program with *(&a+1)? This seems to be what you wanted to explore. With *(&a+1), the behavior of your program would be undefined, because this looks one int past a, and a is not in an array of two or more ints. In practice, you'd be likely to get whatever was just below a on the stack, but that's not something you can count on.
If you wanted to see the byte value at an address which is 1 past the start of a in memory, you'd need to cast the pointer to a byte pointer first. When you add an integer n to a pointer, this doesn't add n to the address stored in the pointer; it advances the pointer by n elements of the pointed-to type. This is only useful when the pointer points to a value inside an array; then p + n points n array elements past p. In fact, p[n] is equivalent to *(p+n). If you want to add 1 to the address, then you need to obtain a byte pointer, i.e. a pointer to unsigned char. Contrast:
int a[2] = {0x12345678, 0x9abcdef0};
printf("addr = %p and content = %x\n", (void*)((unsigned char*)a + 1), *((unsigned char*)a + 1));
printf("addr = %p and content = %x\n", (void*)(a + 1), (unsigned)*(a + 1));
This is well-defined (but with an implementation-defined value since it depends on the platform's endianness) provided that int consists of at least two bytes (which is not strictly mandatory in C, but is the case everywhere except in a few embedded systems).
(void*)(&a) + 1 is not standard C, because void doesn't have a size so it doesn't make sense to move a pointer to void one void element further. However some implementations such as GCC treat void* as byte pointers, just like unsigned char *, so adding an integer to a void* adds this integer to the address stored in the pointer.
¹ Or to unsigned int if the smaller type doesn't fit in a signed int, e.g. on a platform where short and int have the same size.

Difference between type casting (char*) and (size_t)

I was trying to find the difference of two pointers by subtraction, but one is int * and other is char *. As a result it gave me an error, as I expected, because of incompatible pointer type.
int main() {
char * ca="test";
int *ia=malloc(12);
*ia=45;
printf("add char * =%p, add int = %p \n", ca, ia);
printf("add ca-va * =%p\n", ca-ia);
return(0);
}
test3.c:22:35: error: invalid operands to binary - (have ‘char *’
and ‘int *’)
However, when I type cast int* to size_t I was successfully able to subtract the address. Can some explain what exactly size_t did here?
int main() {
char * ca="test";
int *ia=malloc(12);
*ia=45;
printf("add char * =%p, add int = %p \n", ca, ia);
printf("add ca-va * =%p\n", (ca-(size_t)ia));
return(0);
}
You have 2 problems here:
The difference between two pointer values is counted in units of the data type the pointers point to. This cannot work if you have two different data types.
Pointer arithmetics is only allowed within the same data object. You may only subtract pointers that point to the same array or to one block of dynamically allocated memory.
This is not the case in your code.
Subtracting pointers that do not match those criteria doesn't make much sense anyway.
The compiler is right to complain.
This is just pointer arithmetic.
For some pointer ptr and integer offset, ptr - offset means the address offset elements before ptr. Note that this is elements (whatever the pointer points to), not bytes. You can also use addition here. ptr[i] is shorthand for *(ptr + i).
For two pointers of the same type (e.g. both char*), ptr1 - ptr2 means the number of elements between the 2 pointers, e.g. if ptr1 - ptr2 == 5, then ptr2 + 5 == ptr1.
For two pointers of different types (e.g. char* and int*) ptr1 - ptr2 doesn't make any sense.
In your first piece of code the error occurs because you're trying to subtract pointers of different types. The second piece of code works because your cast is causing it to use the ptr - offset version. But this is certainly not what you actually want, because a pointer was converted to an offset and the result is a pointer.
What you probably want is something that Paul Hankin mentioned in a comment:
intptr_t pc = (intptr_t)ca;
intptr_t pa = (intptr_t)ia;
printf("add ca-va = %" PRIdPTR "\n", pc - pa);
This converts the pointers into integer types capable of holding an address and then does the subtraction. You will need to #include <inttypes.h> to get PRIdPTR (inttypes.h internally includes stdint.h which provides intptr_t).
size_t is an integer type. When a pointer is converted to an integer type, the result is implementation-defined (if it can fit in the destination type; otherwise the behavior is not defined by the C standard).
Per a non-normative note in the C standard, “The mapping functions for converting a pointer to an integer or an integer to a pointer are intended to be consistent with the addressing structure of the execution environment.” On machines with simple memory address schemes, the result of converting a pointer to an integer is typically the memory address. The remainder of this answer will assume we have such a C implementation.
Thus, if ca points to an array of char at address 9678, and ia points to some allocated memory at 4444, the result of converting ia to size_t would be 4444. Then, when 4444 is subtracted from ca, we are not subtracting two pointers but rather are subtracting an integer from a pointer. In general, the behavior of this is not defined by the C standard, because you are only allowed to add and subtract integers to pointers within the bounds of one array, and 4444 is far outside of ca in this example. However, what the compiler may do is simply convert the integer to the size of the pointed-to elements and then subtract the result from the address. Since ca points to char, and the size of char is one byte, converting 4444 to the size of 4444 char elements is simply 4444 bytes. Then 9678−4444 is 5234, so the result is a pointer that points to address 5234.
When you need to convert a pointer to an integer, there is a better type for this, uintptr_t, defined in the <stdint.h> header. (Comments have pointed out intptr_t, but you should use the unsigned version unless there is a specific reason to use the signed version.) Then, if you convert both pointers to uintptr_t, as with (uintptr_t) ca - (uintptr_t) ia, you will avoid the problem of the first pointer possibly pointing to some type whose size is not one byte. The result on machines with flat memory address spaces will typically be the difference between the two addresses.
Since implementation-defined and undefined behavior are involved here, this is not something you can rely on, and you should not manipulate pointers this way in normal code.

Array of Pointers and address spacing

#include<stdio.h>
int main()
{
int a[4]={1,2,3,4};
int *p[4]={a,a+1,a+2,a+3};
printf("%u %u %u\n",p,(p+1),(p+2));
}
And the output is:
937449104 937449112 937449120
On line 3:
This will store the addresses of a[] and its address spacing is 4 values apart as expected.
On line 6:
But when I print the addresses of the elements in p, shouldn't even their address differ by 4 since they are ints as well.
But the output gives us address spacing of 8.
Note that p is an array of int* (pointers), not an array of integers. So on a 64-bit system, it's perfectly normal for pointers to have a spacing of 8, or more precisely, sizeof(int*).
int *p[4]={a,a+1,a+2,a+3};
^
printf("%u %u %u\n",p,(p+1),(p+2));
^ ^ ^
When you write p+1 (use p in pointer arithmetics), the array p decays to a pointer, so the type for p+1 is int**, which should be a pointer to a pointer to int. You'll observe 4 (or sizeof(int)) if you dereference p, getting its content:
printf("%u %u %u\n",*p,*(p+1),*(p+2));
^ ^ ^
which is equivalent to:
printf("%u %u %u\n",a,(a+1),(a+2));
^ ^ ^
By the way, your compiler should have warned you about the wrong format specifier; these are the correct statements:
printf("%p %p %p\n", (void*)*p, (void*)*(p+1), (void*)*(p+2));
printf("%p %p %p\n", (void*)a, (void*)(a+1), (void*)(a+2));
Note that %p is the correct specifier if you want to print the address of pointers.
The increment of a pointer in pointer arithmetic is dictated by what the pointer points to. Here, in p+1, p converts ("decays") into a pointer to the first element of the array; that decayed pointer is of type int**, because the first element is of type int*. So p+1 moves by the size of what p points to, which is sizeof(int*). On your system sizeof(int*) is 8. (You most likely have a 64-bit system, where a pointer takes 8 bytes, or 64 bits.)
At this point it should be clear why it gave the spacing of 8 in the second case.
Now check what happens in the first case. a converts ("decays") into a pointer to the first element; that decayed pointer is of type int*, the first element being of type int. So a+1 moves by the size of what it points to, which is sizeof(int).
The printed values would differ by 4 on a system where sizeof(int*) == 4, which is certainly not the case on your system. Understanding the declarations resolves half of the confusion: the first is an array of 4 int, but the second is an array of 4 int*. When in doubt about declarations, you can check with cdecl.
The correct way to print pointers is to use %p format specifier with explicit void* casting. Example: printf("%p\n",(void*)p); Also, you can check the size of int variable and int* variables like this (and doing this will make you aware of the size of the int and int* in your system).
printf("sizeof(int)=%zu sizeof(int*)=%zu\n", sizeof(int), sizeof(int*));
%zu is used because the sizeof operator yields a value of type size_t.
Cutting a long story short, the takeaways are:
Pointer arithmetic is dependent on the object it points to.
For the same arithmetic operation p+1, the result will be different depending on the type p points to.
Correctly printing pointer variables and size_t values.

Subtracting two pointers giving unexpected result

#include <stdio.h>
int main() {
int *p = 100;
int *q = 92;
printf("%d\n", p - q); //prints 2
}
Shouldn't the output of above program be 8?
Instead I get 2.
Undefined behavior aside, this is the behavior that you get with pointer arithmetic: when it is legal to subtract pointers, their difference represents the number of data items between the pointers. In case of int which on your system uses four bytes per int, the difference between pointers that are eight-bytes apart is (8 / 4), which works out to 2.
Here is a version that has no undefined behavior:
int data[10];
int *p = &data[2];
int *q = &data[0];
// The difference between two pointers computed as pointer difference
ptrdiff_t pdiff = p - q;
intptr_t ip = (intptr_t)((void*)p);
intptr_t iq = (intptr_t)((void*)q);
// The difference between two pointers computed as integer difference
int idiff = ip - iq;
printf("%td %d\n", pdiff, idiff);
Demo.
This
int *p = 100;
int *q = 92;
is already invalid C. In C you cannot initialize pointers with arbitrary integer values. There's no implicit integer-to-pointer conversion in the language, aside from conversion from null-pointer constant 0. If you need to force a specific integer value into a pointer for some reason, you have to use an explicit cast (e.g. int *p = (int *) 100;).
Even if your code somehow compiles, its behavior is not defined by the C language, which means that there's no "should be" answer here.
Your code is undefined behavior.
You cannot simply subtract two "arbitrary" pointers. Quoting C11, chapter §6.5.6/P9
When two pointers are subtracted, both shall point to elements of the same array object,
or one past the last element of the array object; the result is the difference of the
subscripts of the two array elements. The size of the result is implementation-defined,
and its type (a signed integer type) is ptrdiff_t defined in the <stddef.h> header. [....]
Also, as mentioned above, if you correctly subtract two pointers, the result would be of type ptrdiff_t and you should use %td to print the result.
That being said, the initialization
int *p = 100;
looks quite wrong in itself. To clarify, it does not store the value 100 in the memory location that p points to (question: where does it point?). It attempts to set the pointer variable itself to the integer value 100, which is a constraint violation.
According to the standard (N1570)
When two pointers are subtracted, both shall point to elements of
the same array object, or one past the last element of the array
object; the result is the difference of the subscripts of the two
array elements.
These are integer pointers, sizeof(int) is 4. Pointer arithmetic is done in units of the size of the thing pointed to. Therefore the "raw" difference in bytes is divided by 4. Also, the result is a ptrdiff_t so %d is unlikely to cut it.
But please note, what you are doing is technically undefined behaviour as Sourav points out. It works in the most common environments almost by accident. However, if p and q point into the same array, the behaviour is defined.
int a[100];
int *p = a + 23;
int *q = a + 25;
printf("0x%" PRIXPTR "\n", (uintptr_t)a); // some number
printf("0x%" PRIXPTR "\n", (uintptr_t)p); // some number + 92
printf("0x%" PRIXPTR "\n", (uintptr_t)q); // some number + 100
printf("%td\n", q - p); // 2

subtracting two addresses giving wrong output

int main()
{
int x = 4;
int *p = &x;
int *k = p++;
int r = p - k;
printf("%d %d %d", p,k,p-k);
getch();
}
Output:
2752116 2752112 1
Why not 4?
And also I can't use p+k or any other operator except - (subtraction).
First of all, you MUST use correct argument type for the supplied format specifier, supplying mismatched type of arguments causes undefined behavior.
You must use %p format specifier and cast the argument to void * to print address (pointers)
To print the result of a pointer subtraction, you should use %td, as the result is of type ptrdiff_t.
That said, regarding the result 1 for the subtraction, pointer arithmetic honors the data type. Quoting C11, chapter §6.5.6, (emphasis mine)
When two pointers are subtracted, both shall point to elements of the same array object,
or one past the last element of the array object; the result is the difference of the
subscripts of the two array elements. The size of the result is implementation-defined,
and its type (a signed integer type) is ptrdiff_t defined in the <stddef.h> header. [....] if the expressions P and Q point to, respectively, the i-th and j-th elements of
an array object, the expression (P)-(Q) has the value i−j provided the value fits in an object of type ptrdiff_t. [....]
So, in your case, the indexes for p and k are one element apart, i.e., |i-j| == 1, hence the result.
Finally, you cannot add (or multiply or divide) two pointers, because, that is meaningless. Pointers are memory locations and logically you cannot make sense of adding two memory locations. Only subtracting makes sense, to find the related distance between two array members/elements.
Related Constraints, from C11, chapter §6.5.6, additive operators,
For addition, either both operands shall have arithmetic type, or one operand shall be a
pointer to a complete object type and the other shall have integer type. (Incrementing is
equivalent to adding 1.)
What you are getting is the difference between the subscripts of two elements.
C11-6.5.6p9:
When two pointers are subtracted, both shall point to elements of the same array object, or one past the last element of the array object; the result is the difference of the subscripts of the two array elements.
Also note that the statement
printf("%d %d %d", p,k,p-k);
should be
printf("%p %p %td\n", (void*)p,(void*)k, p-k);
If your variable is of pointer type, then every calculation on that pointer is scaled by the size of the type it points to.
For example:
// Let's assume char is 1 byte and int is 4 bytes long.
// sizeof(*cp) = 1, sizeof(*ip) = 4
char *cp = (char *)10; //Char itself is 1 byte
int *ip = (int *)10;
cp++; //Increase pointer, let us point to the next char location
ip++; //Increase pointer, let us point to the next int location
printf("Char: %p\r\n", (void *)cp); //Address 11, typically printed in hex as 0xb
printf("Int: %p\r\n", (void *)ip); //Address 14, typically printed in hex as 0xe
The first case shows address 11 while the second shows 14. That's because the next char element is 1 byte further on, while the next int element is 4 bytes further on.
If you have 2 pointers of the same type (e.g. int *, as in your code) and one points to address 14 and the other to 10, there is room for exactly 1 int between them, so subtracting gives you 1.
If you want to get 4 as your result, cast the pointers to char * before the calculation: sizeof(char) is always 1, which means there are 4 elements between addresses 10 and 14, and you will get 4.
Hope it helps.
First of all, adding 2 pointers is not defined, so if you use the + operator on them you will get a compile error.
Second, the output is correct: if you subtract 2 pointers, the result shows how many boxes of that type are between the pointers, not the number of bytes.
You say :
int* p1 = &x;
int* p2 = p1++;
So between p1 and p2 there are 4 bytes. They are both of type int *, so only 1 box of int is between them.
