#include<stdio.h>
int main()
{
char arr[] = "somestring";
char *ptr1 = arr;
char *ptr2 = ptr1 + 3;
printf("ptr2 - ptr1 = %ld\n", ptr2 - ptr1);
printf("(int*)ptr2 - (int*) ptr1 = %ld", (int*)ptr2 - (int*)ptr1);
return 0;
}
I understand
ptr2 - ptr1
gives 3 but cannot figure out why second printf prints 0.
It's because when you substract two pointers, you get the distance between the pointer in number of elements, not in bytes.
(char*)ptr2-(char*)ptr1 // distance is 3*sizeof(char), ie 3
(int*)ptr2-(int*)ptr1 // distance is 0.75*sizeof(int), rounded to 0
EDIT: I was wrong by saying that the cast forces the pointer to be aligned
If you want to check the distance between addresses don't use (int *) or (void *), ptrdiff_t is a type able to represent the result of any valid pointer subtraction operation.
#include <stdio.h>
#include <stddef.h>
int main(void)
{
char arr[] = "somestring";
char *ptr1 = arr;
char *ptr2 = ptr1 + 3;
ptrdiff_t diff = ptr2 - ptr1;
printf ("ptr2 - ptr1 = %td\n", diff);
return 0;
}
EDIT: As pointed out by #chux, use "%td" character for ptrdiff_t.
Casting a char pointer with int* would make it aligned to the 4bytes (considering int is 4 bytes here). Though ptr1 and ptr2 are 3 bytes away, casting them to int*, results in the same address -- hence the result.
This is because sizeof(int) == 4
Each char takes 1 byte. Your array of chars looks like this in memory:
[s][o][m][e][s][t][r][i][n][g][0]
When you have an array of ints, each int occupies four bytes. storing '1' and '2' conceptually looks more like this:
[0][0][0][1][0][0][0][2]
Ints must therefore be aligned to 4-byte boundaries. Your compiler is aliasing the address to the lowest integer boundary. You'll note that if you use 4 instead of 3 this works as you expected.
The reason you have to perform a subtraction to get it to do it (just passing the casted pointers to printf doesn't do it) is because printf is not strictly typed, i.e. the %ld format does not contain the information that the parameter is an int pointer.
Related
void main() {
char var = 10;
char *ptr = &var;
printf("Pointer address before increment:%p\n", ptr);
printf("sizeof(ptr):%d\n", sizeof(ptr));
ptr++;
printf("Pointer address after increment:%p\n", ptr);
printf("sizeof(ptr):%d\n", sizeof(ptr));
}
Output:
Pointer address before increment:0x7fffb997144f
sizeof(ptr):8
Pointer address after increment:0x7fffb9971450
sizeof(ptr):8
Why does the char pointer increment by one byte only? Its size is 8, right?
When ptr is declared, the compiler allocates 8 bytes to it. When we increment it by 1, it increments based on the datatype. Why? How does this works?
For starters to output values of the type size_t you need to use the conversion specifier zu instead of d
printf("sizeof(ptr):%zu\n",sizeof(ptr));
^^^
Incremented pointer points to the memory after the object it points to. That is the value of a pointer of the type T * is incremented by the value sizeof( T ).
Consider for example accessing array elements.
T a[N];
The expression a[i] evaluates like +( a + i ). So the value of the pointer is incremented by i * sizeof( T ).
This is called the pointer arithmetic,
As for the size of objects of the type char then according to the C Standard sizeof( char ) is always equal to 1.
Also consider the following demonstration program.
#include <stdio.h>
int main( void )
{
char s[] = "Hello";
for ( const char *p = s; *p != '\0'; ++p )
{
putchar( *p );
}
putchar( '\n' );
}
Its output is
Hello
If the pointer p was incremented by the value sizeof( char * ) then you could not output the array using the pointer.
For pointer arithmetics, what matters is not the sizeof the pointer, but the size of the type it points to:
T a[10]; T *p = a; defines a pointer p to an array of objects of type T.
p contains the memory address of the first element of a.
The next element's address is sizeof(T) bytes farther, so incrementing the pointer p by 1 increments the memory address by sizeof(*p).
Here is a modified version:
#include <stdio.h>
int main() {
char char_array[2] = "a";
char *char_ptr = char_array;
printf("sizeof(char_ptr): %zu\n", sizeof(char_ptr));
printf("char_ptr before increment: %p\n", (void *)char_ptr);
printf("sizeof(*char_ptr): %zu\n", sizeof(*char_ptr));
char_ptr++;
printf("char_ptr after increment: %p\n", (void *)char_ptr);
int int_array[2] = { 1, 2 };
int *int_ptr = int_array;
printf("\nsizeof(int_ptr): %zu\n", sizeof(int_ptr));
printf("int_ptr before increment: %p\n", (void *)int_ptr);
printf("sizeof(*int_ptr): %zu\n", sizeof(*int_ptr));
int_ptr++;
printf("int_ptr after increment: %p\n", (void *)int_ptr);
return 0;
}
Output (64 bits):
sizeof(char_ptr): 8
char_ptr before increment: 0x7fff52c1f7ce
sizeof(*char_ptr): 1
char_ptr after increment: 0x7fff52c1f7cf
sizeof(int_ptr): 8
int_ptr before increment: 0x7fff52c1f7c0
sizeof(*int_ptr): 4
int_ptr after increment: 0x7fff52c1f7c4
Output (32 bits):
sizeof(char_ptr): 4
char_ptr before increment: 0xbffc492e
sizeof(*char_ptr): 1
char_ptr after increment: 0xbffc492f
sizeof(int_ptr): 4
int_ptr before increment: 0xbffc4930
sizeof(*int_ptr): 4
int_ptr after increment: 0xbffc4934
That's because a charactor occupies 1 byte, while a pointer to charactor is a pointer, not a charactor or something else, that means it stored a memory adress, so it occupies 8 bytes. Also all kinds of pointer is 8 bytes on your machine.
What are pointers?
Briefly the pointers are a special types of variables for holding memory addresses and its bit wide or byte size that the compilers allocate may vary depending on the platform. If you size a pointer using the sizeof operator in an 8-bit architecture for example, in most devices you will get 2 bytes, that is, a pointer can hold an address value up to 64kB. This is logically big enough to hold address value for devices that only has up to 64kB of ROM and up to a few kBs of RAM.
Since the architecture in which you compile your code is a 64-bit architecture, the system registers are 64-bits wide and it's wide enough to hold very large address values (2 * 10^64). Since its big enough it will naturally size 64-bits / 8 bytes.
As said the pointers are special objects, so you cannot assign kinds of constants like
int* ptr = 0x7fffb997144f; // Compilers will not allow this
Compilers will not allow this because the pointers are not regular kind of variables. So in order to assign a constant address value you must cast it as pointer like in the following example:
int* ptr = (int*) 0x7fffb997144f; // Compilers will allow this
How the pointers are incremented
It depends on the byte size of the type to which a pointer points. That is;
1 for char types
2 for short int types
4 for long types
8 for long long types
Some type sizes like of int and float types may vary depending on the platform.
Now that we know about pointers and how it is incremented let's make an example for each type above. Let's assume that the start address for each example is 0x7fffb997144f.
For char types
char vars[5];
char* ptr = vars; // 0x7fffb997144f
ptr++; // 0x7fffb9971450
ptr++; // 0x7fffb9971451
For short int types
short int vars[5];
short int* ptr = vars; // 0x7fffb997144f
ptr++; // 0x7fffb9971451
ptr++; // 0x7fffb9971453
For long types
long vars[5];
long* ptr = vars; // 0x7fffb997144f
ptr++; // 0x7fffb9971454
ptr++; // 0x7fffb9971458
For long long types
long long vars[5];
long long* ptr = vars; // 0x7fffb997144f
ptr++; // 0x7fffb9971458
ptr++; // 0x7fffb997145f
I am learning C. As I went through pointers there I noticed some strange behavior which I can't get it. When casting character pointer to integer pointer, integer pointer holds some weird value, no where reasonably related to char or char ascii code. But while printing casted variable with '%c', it prints correct char value.
Printing with '%d' gives some unknown numbers.
printf("%d", *pt); // prints as unknown integer value that too changes for every run
But while printing as '%c' then
printf("%c", *pt); // prints correct casted char value
Whole Program:
int main() {
char f = 'a';
int *pt = (int *)&f;
printf("%d\n", *pt);
printf("%c\n", *pt);
return 0;
}
Please explain how char to int pointer casting works and explain the output value.
Edit:
If I make the below changes to the program, then output will be as expected. Please explain this too.
#include <stdio.h>
int main() {
char f = 'a';
int *pt = (int *)&f;
printf("%d\n", *pt);
printf("%c\n", *pt);
int val = (int)f;
printf("%d\n", val);
printf("%c", val);
return 0;
}
Output:
97
a
97
a
Please explain this behavior too.
For what the C language specifies, this is just plain undefined behavior. You have a char sized region of memory from which you are reading an int; the result is undefined.
As for what is likely happening: The C runtime ends up dumping some random garbage on the stack before main is even executed. char f = 'a'; happens to rewrite one byte of the garbage to a known value, but the padding to align pt means the remaining bytes are never rewritten at all, and have "whatever the runtime left behind" in them. So when you read an int out, on a little endian system, the low byte equals the value of 'a', but the high bytes are whatever garbage happens to be left in the padding space.
As for why %c works, since the low byte is still the same, and %c only examines the low byte of the int provided, all the garbage is ignored, and things happen to work as expected. This only works on a little endian machine though; on a big endian machine, it would be the high byte initialized to 'a', but the low byte (garbage) would be printed by %c.
You have define f as a char. This allocates typically 1 byte of storage in most of the hardware. You take the address of f, cast it to (int *) and assign it to an int * variable, pt. Size of integer depends on the underlying hardware - it could be 2 or 4 or even more. When you assign address of f to pt, the address that gets assigned to pt depends on factors such as int size and the alignment requirements. That is why when you print *pt, you see a garbage value. Actually, the ASCII value of 'a' is contained in the garbage, the position of which depends on the int size, endianness of the hardware, etc. If you print *pt with %x, you will see 61 in the output (61 hex is 97 in decimal).
<#include <stdio.h>
int main()
{
//type casting in pointers
int a = 500; //value is assgned
int *p; //pointer p
p = &a; //stores the address in the pointer
printf("p=%d\n*p=%d", p, *p);
printf("\np+1=%d\n*(p+1)=%d", p + 1, *(p + 1));
char *p0;
p0 = (char *)p;
printf("\n\np0=%d\n*p0=%d", p0, *p0);
return 0;
}
Can anyone explain the reason behind the second output? Also what is the difference between solving using int pointers and char pointers?
The second answer is coming out to be 0.
int main()
{
char arr[] = "geeksforgeeks";
char *ptr1 = arr;
char *ptr2 = ptr1 + 3;
printf ("ptr2 - ptr1 = %d\n", ptr2 - ptr1);
printf ("(int*)ptr2 - (int*) ptr1 = %d", (int*)ptr2 - (int*)ptr1);
getchar();
return 0;
}
Pointers of some type T point to objects of type T.
For example
int a[] = { 1, 2 };
int *p = a;
If you increase a pointer as for example
++p;
or
p = p + 1;
(take into account that these statements are equivalent) it will point to the next object of type T that follows the current object. So the value of the pointer will be increased by sizeof( T ) that to provide that the poiner indeed will point to the next element.
In the example above sizeof( int ) is (usually) equal to 4. So the value of the pointer will be increased by 4.
If you write
int a[] = { 1, 2 };
int *p = &a[0]; // the same as int *p = a;
int *q = &a[1];
then expression q - p will be equal 1 but the difference between the values stored in p and q will ve equal to sizeof( int ). p points to the first element of the array and q points to the second element of the array. It is so-called pointer arithmetic.
As for your result with subtracting int pointers then the behaviour is undefined. According to the C++ Standard
Unless both pointers point to elements of the same array object, or one
past the last element of the array object, the behavior is undefined
In your case int pointers do not point to elements of the same array. That they would point to the elements of the same array at least the difference of their values shall be equal to sizeof( int )
It's happen because char size is 1-byte, when int is 32Bit (4byte) variable
Edit keep pointer in the original array and ensure correct alignement to avoid undefined behaviour (see comment of Matt McNabb)
Because 3 < sizeof(int).
In pointer arithmetic, (int *) ptr2 - (int *) ptr1 gives real_addr_of_ptr2 - real_addr_of_ptr1) / sizeof(int) = 3 / 4. As it is integer division => 0 - this is not specified by C++ but is current implementation.
If you use : char *ptr2 = ptr1 + 8;, you will get : (int*)ptr2 - (int*) ptr1 = 2
As there are more than 8 characters in array, it can work, provided the original array is correctly aligned. To be coherent with the specs, it should have been declared :
union {
char arr[] = "geeksforgeeks";
int iarr[];
} uarr;
char *ptr1 = uarr.arr;
i have this array where i am trying to access its elements by incrementing ptr, as suggested here Trying to find different methods of accessing array elements?...i must be doing something stupid...please help me!
#include <stdio.h>
#include <stdlib.h>
int main()
{
int i;
char *p1 = "Cnversions";
char *p2 = "Divided";
char *p3 = "Plain";
char *p4 = "Solid";
char *arr[3];
arr[0] = p1;
arr[1] = p2;
arr[2] = p3;
arr[3] = p4;
for(i=0;i<=3;i++)
{
printf("string at arr[%d] is: %s\n",i,*arr);
arr++;
}
return 0;
}
An array like arr is located at a specific spot in memory, so it makes no sense to increment arr (what does it mean to increment an array?)
Instead, you will need to create a pointer to the start of the array and increment the pointer:
char **ptr = arr;
for(i=0; i<4; i++) {
printf("arr[%d] = %s\n", i, *ptr);
ptr++;
}
(Note also that you need to make arr four elements big, i.e. char *arr[4], to accommodate the four string pointers you put in it.)
Remember that although we tend to think of double pointers as arrays, all the compiler sees them as are pointers. When you increment a pointer, it adds a value to the address of the pointer equal to the size of the data type it's pointing at. For example:
int *p;
p = (int *) malloc(sizeof(int));
p is pointing to an int, so the size of p's pointed data is (typically) 4 bytes. This means when you increment p, it will be pointing at a location 4 bytes greater than where it was before.
The type of arr is a char**, meaning that it's a pointer to a pointer to a char. Pointers on most machines these days are 8 bytes. So when you incrementing arr, you are effectively telling the computer to set arr to point to an address 8 bytes higher than what it was at before. In the case of arr, this is an illegal address, so you'll get some kind of crash.
I have a piece of code written in C where some pointer arithmetic is performed. I would like to know how the output comes to be this?
#include <stdio.h>
int main()
{
char arr[] = "gookmforgookm";
char *ptr1 = arr;
char *ptr2 = ptr1 + 3;
printf ("ptr2 - ptr1 = %d\n", ptr2 - ptr1);
printf ("(int*)ptr2 - (int*) ptr1 = %d", (int*)ptr2 - (int*)ptr1);
getchar();
return 0;
}
Output is below:
ptr2 - ptr1 = 3
(int*)ptr2 - (int*) ptr1 = 0
Strictly speaking, you're invoking undefined behaviour and any result that the program produces is OK according to the C standard.
However, you're probably on a machine where sizeof(int) == 4 (as opposed to say, 2). Since there are 4 bytes to an integer, two addresses which are 3 bytes apart are part of the same integer, so the difference between the addresses is 0 * sizeof(int). You might find a different answer if you chose ptr1 = arr + 1;, or you might not. But that's the beauty of undefined behaviour - it would be 'right' either way.
After the subtraction you need to divide the result in the size of the pointed type.
(int*)ptr2 - (int*)ptr1 == (0x1000003 - 0x1000000) / sizeof(int)
(int*)ptr2 - (int*)ptr1 == (0x1000003 - 0x1000000) / 4 == 0
ptr1 and ptr2 are both char * type, that means one byte one pointer.
char *ptr2 = ptr1 + 3;
so
ptr2 - ptr1 = 3
Next, you cast both pointer to type int *, int type need 4 byte, so both pointer aim at the same int, both pointer have the same value through the memory align, you get the 0 result.
When you subtract two pointers, as long as they point into the same array, the result is the number of elements separating them.
Pointer Subtraction and Comparison
The memory addresses of the elements of the same array are always sequential. i.e
if the memory adress of myarray[0] is:
0x4000000
then the memory address of myarray[2] will definitely be
0x4000002
So when you store the address of arr into ptr1 assume it to be x
, and then when you make the address of ptr2, three units higher than ptr1, it will be x+3. So when you subtract ptr1 from ptr2 the answer will be:
(x+3) - x = 3
Hence the answer.
In the second printf() statement, if you want it to display the same result as above (3), you have to convert the pointer to int and not int*.
char *myvar; // given contents somewhere
int addr = (int)myvar; // addr now = the char pointer
So In your case:
printf ("(int)ptr2 - (int) ptr1 = %d", (int)ptr2 - (int)ptr1);