How does casting a char pointer to an int pointer work? - c

I am learning C. While working through pointers I noticed some strange behavior that I can't explain. When I cast a character pointer to an integer pointer, dereferencing the integer pointer yields a weird value, nowhere near the char or its ASCII code. But when I print the dereferenced value with '%c', it prints the correct char value.
Printing with '%d' gives some unknown number.
printf("%d", *pt); // prints an unknown integer value that also changes on every run
But when printing with '%c':
printf("%c", *pt); // prints correct casted char value
Whole Program:
int main() {
char f = 'a';
int *pt = (int *)&f;
printf("%d\n", *pt);
printf("%c\n", *pt);
return 0;
}
Please explain how char to int pointer casting works and explain the output value.
Edit:
If I make the below changes to the program, then output will be as expected. Please explain this too.
#include <stdio.h>
int main() {
char f = 'a';
int *pt = (int *)&f;
printf("%d\n", *pt);
printf("%c\n", *pt);
int val = (int)f;
printf("%d\n", val);
printf("%c", val);
return 0;
}
Output:
97
a
97
a
Please explain this behavior too.

As far as the C language is concerned, this is just plain undefined behavior. You have a char-sized region of memory from which you are reading an int; the result is undefined.
As for what is likely happening: The C runtime ends up dumping some random garbage on the stack before main is even executed. char f = 'a'; happens to rewrite one byte of the garbage to a known value, but the padding to align pt means the remaining bytes are never rewritten at all, and have "whatever the runtime left behind" in them. So when you read an int out, on a little endian system, the low byte equals the value of 'a', but the high bytes are whatever garbage happens to be left in the padding space.
As for why %c works, since the low byte is still the same, and %c only examines the low byte of the int provided, all the garbage is ignored, and things happen to work as expected. This only works on a little endian machine though; on a big endian machine, it would be the high byte initialized to 'a', but the low byte (garbage) would be printed by %c.

You have defined f as a char, which occupies 1 byte of storage. You take the address of f, cast it to (int *), and assign it to an int * variable, pt. The size of an int depends on the platform: it could be 2 or 4 bytes or even more. The cast does not change the address, but dereferencing pt reads sizeof(int) bytes starting there, so the bytes after f (whatever happens to be in memory) become part of the value. The address of f may also not satisfy the alignment requirements of int. That is why you see a garbage value when you print *pt. The ASCII value of 'a' is in fact contained in the garbage; its position depends on the int size, the endianness of the hardware, etc. If you print *pt with %x, you will typically see 61 in the output (61 hex is 97 in decimal).

#include <stdio.h>
int main()
{
//type casting in pointers
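int a = 500; //value is assigned
int *p; //pointer p
p = &a; //stores the address in the pointer
printf("p=%p\n*p=%d", (void *)p, *p); //%p is the specifier for pointers
printf("\np+1=%p", (void *)(p + 1)); //p+1 may be computed, but dereferencing it is undefined
char *p0;
p0 = (char *)p; //same address, now viewed one byte at a time
printf("\n\np0=%p\n*p0=%d", (void *)p0, *p0);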
return 0;
}

Related

Cast pointer type in C

#include <stdio.h>
int main () {
char c = 'A';
int *int_ptr;
double *double_ptr;
*int_ptr = *(int *)&c;
*double_ptr = *(double *)&c;
printf("Original char = %c \n", c);
printf("Integer pointer = %d \n", *int_ptr);
printf("Double pointer = %f\n", *double_ptr);
return 0;
}
The question is: why can't I assign through double_ptr using this code? It causes a segmentation fault, but it works fine for the integer.
As I understand it, char is 1 byte long, int is 4 bytes long, and double is 8 bytes long.
By using expression *(double *)&c I expect the following:
& – Get the memory address of c.
(double *) – pretend that this is a pointer to double.
*() – get the actual value and assign it to double var.
Your code has Undefined Behaviour. Therefore anything could happen.
The UB is because you are casting a char which is one byte to types that are 4 and 8 bytes, which means you are (potentially) accessing memory out of bounds, or with the wrong alignment.
Whether any of this will "work" or "not work" on any particular system is not very relevant, because the code is erroneous.
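In your program, casting the address of a char to int * or double * and then dereferencing reads extra bytes beyond the char, which is undefined behavior. Note also that int_ptr and double_ptr are never initialized, so writing through them is undefined behavior in its own right.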

Why does the C program give me this output when printing a pointer?

#include <stdio.h>
int main()
{
int i;
int buf[10];
char *p ;
p = 4;
printf("%d",p);
return 0;
}
Output:
4
How come it is 4? I was expecting some address value. Can you please help me understand it?
This is undefined behavior, because %d expects an int, but you are passing a pointer.
The reason why you see this output is that pointers have enough capacity to store small integer numbers, such as 4. If by coincidence the pointer size on your system matches the size of an integer, printf would find a representation that it expects at the location where it expects it, so it would print the numeric value of your pointer.
The proper way to print your pointer would be with the %p format specifier, and a cast:
printf("%p", (void*)p);
I was expecting some address value.
You would get an address value if you had assigned p some address. For example, if you did this
char buf[10];
char *p = &buf[3];
printf("%p", (void*)p);
you would see the address of buf's element at index 3.

Why does following C code print 45 in case of int 45 and 36 in case of STRING and ASCII value of CHAR?

struct s{
int a;
char c;
};
int main()
{
struct s b = {5,'a'};
char *p =(char *)&b;
*p = 45;
printf("%d " , b.a);
return 0;
}
If *p is changed to any character, it prints the ASCII value of that character; if *p is changed to a string ("xyz"), it prints 36. Why is this happening?
Can you give the memory map of struct s and *p?
As I understand it, the memory of struct s (call it z) is (****)(*), assuming 4 bytes for int. After initialization it becomes (00000000 00000000 00000000 00000101)(ASCII of char a), and p points to the starting address of z. Since p is a char pointer, it stores an ASCII value at a single byte location, i.e. each * holds one char. When we then write 45, I expected z to become (45 0 0 5)(ASCII of char a). But that is not what happens. Why?
When you write to the struct through a char * pointer, you store 45 in the first byte of the struct. If you are on a Little-Endian implementation, you will write to the low end of b.a. If you are on a Big-Endian implementation, you will write to the high end of b.a.
Here is a visualization of what typically happens to the structure on an implementation with 16-bit ints, before and after the assignment *p=45. Note that the struct is padded to a multiple of sizeof(int).
Little-Endian
Before: a [05][00] (int)5
        c [61]
          [  ]
After:  a [2d][00] (int)45
        c [61]
          [  ]
Big-Endian
Before: a [00][05] (int)5
        c [61]
          [  ]
After:  a [2d][05] (int)11525
        c [61]
          [  ]
With larger ints, there are more ways to order the bytes, but you are exceedingly unlikely to encounter any other than the two above in real life.
However, the next line invokes undefined behaviour for two reasons:
printf("%d " , b.a);
1. You are modifying a part of b.a through a pointer of a different type. This may give b.a a "trap representation", and reading a value containing a trap representation causes undefined behaviour. (And no, you are not likely to ever encounter a trap representation (in an integer type) in real life.)
2. You are calling a variadic function without a function declaration. Variadic functions typically have unusual ways of passing arguments, so the compiler has to know about it. The usual fix is to #include <stdio.h>.
Undefined behaviour means that anything could happen, such as printing the wrong value, crashing your program or (the worst of them all) doing exactly what you expect.
In little endian, your struct looks like:
00000101 00000000 00000000 00000000 01100001
So p points to the byte holding the 5 and overwrites it. At the printf, the 4 little-endian bytes read back as 45.
If you tried it on a big-endian machine (with a 4-byte int), the result would be 754974725, because p points to the MSB side of the int.
A simple test program to find out whether you are on little or big endian:
#include <stdio.h>
int main()
{
int a = 0x12345678;
unsigned char *c = (unsigned char*)(&a);
if (*c == 0x78)
printf("little-endian\n");
else
printf("big-endian\n");
return 0;
}
The C standard guarantees that the address of the first member of a structure is the address of the structure. That is, in your case,
int* p =(int*)&b;
is a safe cast. But you cannot assume where the char member sits relative to the address of the structure. The standard does not guarantee that successive members are contiguous in memory: the compiler may insert gaps (called padding) between members to suit the target's alignment requirements.
So what you're doing is essentially undefined.
Because, this
*p = 45;
Changes the value of what p points to to 45. And you made p point to b.
If you want a pointer as the structure member instead of an array of char,
try this:
#include<stdio.h>
struct s{
int a;
char *c;
};
int main()
{
struct s b = {5, "a"};
printf("%d %s", b.a, b.c);
return 0;
}
Or try this, using a pointer to the structure:
#include<stdio.h>
struct s{
int a;
char c[2]; // one character plus the terminating '\0', so that %s is valid
};
int main()
{
struct s *p;
struct s b = {5, 'a'};
p = &b;
printf("%d %s", p->a, p->c);
return 0;
}
char *p = (char*) &b; - in this line, p points to the beginning of the struct b as a char pointer.
*p = 45; writes 45 to the memory of b, which can also be accessed through b.a.
When you call printf("%d ", b.a); you print the 45 stored in the stack memory belonging to struct b (on a little-endian machine).
Try debugging it yourself and you'll see it in the watch window.

Why does second printf print 0

#include<stdio.h>
int main()
{
char arr[] = "somestring";
char *ptr1 = arr;
char *ptr2 = ptr1 + 3;
printf("ptr2 - ptr1 = %ld\n", ptr2 - ptr1);
printf("(int*)ptr2 - (int*) ptr1 = %ld", (int*)ptr2 - (int*)ptr1);
return 0;
}
I understand
ptr2 - ptr1
gives 3 but cannot figure out why second printf prints 0.
It's because when you subtract two pointers, you get the distance between the pointers in number of elements, not in bytes.
(char*)ptr2-(char*)ptr1 // distance is 3*sizeof(char), ie 3
(int*)ptr2-(int*)ptr1 // distance is 0.75*sizeof(int), rounded to 0
EDIT: I was wrong by saying that the cast forces the pointer to be aligned
If you want to check the distance between addresses, don't cast to (int *) or (void *). Subtract the char pointers directly and store the result in a ptrdiff_t, a type able to represent the result of any valid pointer subtraction.
#include <stdio.h>
#include <stddef.h>
int main(void)
{
char arr[] = "somestring";
char *ptr1 = arr;
char *ptr2 = ptr1 + 3;
ptrdiff_t diff = ptr2 - ptr1;
printf ("ptr2 - ptr1 = %td\n", diff);
return 0;
}
EDIT: As pointed out by @chux, use the "%td" conversion specification for ptrdiff_t.
Casting a char pointer to int* does not move it to a 4-byte boundary; what changes is the unit of the subtraction. Though ptr1 and ptr2 are 3 bytes apart, as int* they are less than one int apart (considering int is 4 bytes here), so the difference truncates to 0 -- hence the result.
This is because sizeof(int) == 4
Each char takes 1 byte. Your array of chars looks like this in memory:
[s][o][m][e][s][t][r][i][n][g][0]
When you have an array of ints, each int occupies four bytes. storing '1' and '2' conceptually looks more like this:
[0][0][0][1][0][0][0][2]
Pointer arithmetic on int* therefore works in units of 4 bytes. The 3-byte distance is less than one whole int, so the subtraction yields 0. You'll note that if you use 4 instead of 3, you get 1, as you might expect.
The reason you have to perform a subtraction to get it to do it (just passing the casted pointers to printf doesn't do it) is because printf is not strictly typed, i.e. the %ld format does not contain the information that the parameter is an int pointer.

why the address difference is getting printed (according to the pointer type)

#include<stdio.h>
int main()
{
int *p=0;
char *ch=0;
p++;
ch++;
printf ("%d and %d\n",p,ch);
return 0;
}
Output:
4 and 1
I know that incrementing a char pointer adds 1 to the address it points to.
I know that incrementing a pointer to int adds 4 to the address it points to in gcc.
I know that dereferencing a pointer is done by using * with the pointer.
Queries:
1. Why is this not giving any garbage value for p and ch, as both are pointers and have not been assigned any address?
2. Why is this giving me the address difference that the respective pointer has obtained while incrementing, or is this undefined behavior?
3. Why is the output 4 and 1?
Please explain.
I have compiled this code on gcc-4.3.4.
Its a C code.
I am sorry if this comes out to be a copy of some question as I was not able to find any such question on stackoverflow.
1. Why is this not giving any garbage value for p and ch as both are pointers and has not assigned any address;
You did assign addresses here: int *p = 0 and char *ch = 0. p contains the address 0x00000000 and ch contains the address 0x00000000.
2. Why is this giving me the address difference that the respective pointer has obtained while incrementing, or is this a undefined behavior.
char *ch = 0; means that ch contains the address 0. Incrementing the address using ++ will increment the value by sizeof(char), i.e. 1. Similarly for the int pointer: p contains the address 0, and using the ++ operator increments the value by sizeof(int), which seems to be 4 on your machine (note that this is not guaranteed; the size of int depends on the platform).
3. Why this output is 4 1 ? here
Because at first, p contained 0, then incremented by sizeof(type_of(p)) = sizeof(int) = 4 on your machine and ch incremented by sizeof(type_of(ch)) = sizeof(char) = 1.
First, your code is printing pointers as integers. While this is probably what you're trying to do, it is not defined behavior, and it is entirely unportable to platforms where the size of a pointer (in bytes) is not the same as the size of an int. If you want to print pointer values, use %p instead.
To answer your questions: you are assigning values to both pointers: 0, which is synonymous with NULL.
Second, the reason you're getting 4 1 is due to the size of an int vs the size of a char on your platform. The char is going to be 1. On your platform, an int is 4 bytes wide. When incrementing a pointer, the compiler will automatically move the address it references by the byte count of the underlying type it represents.
#include<stdio.h>
int main()
{
int *p=0; // int is 4 bytes on your platform
char *ch=0; // char is 1 byte
p++; // increments the address in p by 4
ch++; // increments the address in ch by 1
printf ("%d and %d\n",p,ch);
return 0;
}
EDIT: To get similar results with a supported print statement, do this instead:
#include<stdio.h>
int main()
{
int *p=0;
char *ch=0;
p++;
ch++;
printf ("%p and %p\n",(void *)p,(void *)ch);
return 0;
}
Output (on my Mac) is:
0x4 and 0x1
As per my knowledge, I have added the answers to your questions inline:
#include<stdio.h>
int main()
{
int x=0,*p=0;
char c = 'A', *ch=0;
x++;
// You have initialized this to 0, so incrementing adds 4 (int pointer)
// Remember, the address '4' means nothing here
p++;
// You have initialized this to 0, so incrementing adds 1 (char pointer)
// Remember, the address '1' means nothing here
ch++;
// You are now printing the values of the pointers itself
// This does not make any sense. If you are using pointers, you would want to know what is being referenced
printf ("%d , %d and %d\n",x,p,ch);
// This will FAIL
// Because, you are now trying to print the data pointed by the pointers
// Note the use of '*'. This is called de-referencing
printf ("%d , %d and %d\n", x, *p, *ch);
// Let p point to x, de-referencing will now work
p = &x;
printf ("%d , %d\n", x, *p); // 1, 1
// Let ch point to c, de-referencing will now work
ch = &c;
printf ("%c , %c\n", c, *ch); // 'A', 'A'
return 0;
}
Hope this helps.

Resources