What Actually this line of code `ptr=(char *)&a;` does? - c

I have the code :
#include<stdio.h>
void main(){
int i;
float a=5.2;
char *ptr;
ptr=(char *)&a;
for(i=0;i<=3;i++)
printf("%d ",*ptr++);
}
I got the output as 102 102 -90 64. I couldn't predict how it came, I get confused with this line ptr=(char *)&a;. Can anyone explain me what it does? And as like other variables the code *ptr++ increments? Or there is any other rule for pointers with this case.
I'm a novice in C, so explain the answer in simple terms. Thanks in advance.

This is called a cast. In C, a cast lets you convert or reinterpret a value from one type to another. When you take the address of the float, you get a float*; casting that to a char* gives you a pointer referring to the same location in memory, but pretending that what lives there is char data rather than float data.
sizeof(float) is 4, so printing four bytes starting from that location gives you the bytes that make up the float, according to IEEE-754 single-precision format. Some of the bytes have their high bits set, so when interpreted as signed char and then converted to int for display, they appear as negative values on account of their two's-complement representation.
The expression *ptr++ is equivalent to *(ptr++), which first increments ptr then dereferences its previous value; you can think of it as simultaneously dereferencing and advancing ptr.

That line casts the address of a, denoted &a, to a char*, i.e. a pointer to characters/bytes. The printf loop then prints the values of the four constituent bytes of a in decimal.
(Btw., it would have been neater if the loop had been
for (i=0; i<sizeof(a); i++)
printf("%d ", ptr[i]);
)

The line ptr=(char *)&a; casts the address of the float variable to a pointer of type char. Thus you are now interpreting the 4 bytes of which the float consists of as single bytes, which values you print with your for loop.
The statement *ptr++ post-increments the pointer after reading its value, which means, you read the value pointed to (the single bytes of the float) and then advance the pointer by the offset of one byte.

&a gets the address of a, that is, it yields a pointer of type float *. This pointer of type float * is then cast to a pointer of type char *.
Because sizeof(char) == 1, a can now be seen through ptr as a sequence of bytes.
This is useful when you want to abstract away the type of a variable and treat it as a finite sequence of bytes, especially applicable in serialisation and hashing.

Related

Why does this expression come out to 4 in C?

So this expression comes out to 4:
int a[] = {1,2,3,4,5}, i= 3, b,c,d;
int *p = &i, *q = a;
char *format = "\n%d\n%d\n%d\n%d\n%d";
printf("%ld",(long unsigned)(q+1) - (long unsigned)q);
I have to explain it in my homework and I have no idea why it's coming out to that value. I see (long unsigned) casting q+1, and then we subtract the value of whatever q is pointing at as a long unsigned and I assumed we would be left with 1. Why is this not the case?
Because q is a pointer the expression q+1 employs pointer arithmetic. This means that q+1 points to one element after q, not one byte after q.
The type of q is int *, meaning it points to an int. The size of an int on your platform is most likely 4 bytes, so adding 1 to a int * actually adds 4 to the raw pointer value so that it points to the next int in the array.
Try printing the parts of the expression and it becomes a bit clearer what is going on.
printf("%p\n",(q+1));
printf("%p\n",q);
printf("%ld\n",(long unsigned)(q+1));
printf("%ld\n",(long unsigned)q);
It becomes more clear that q is a pointer pointing to the zeroth element of a, and q+1 is a pointer pointing to the next element of a. Int's are 4 bytes on my machine (and presumably on your machine), so they are four bytes apart. Casting the pointers to unsigned values has no effect on my machine, so printing out the difference between the two gives a value of 4.
0x7fff70c3d1a4
0x7fff70c3d1a0
140735085269412
140735085269408
It's because sizeof(int) is 4.
This is an esoteric corner of C that is usually best avoided.
(If it doesn't make sense yet, add some temporary variables).
BTW, the printf format string is incorrect. But that's not why it's outputting 4.

casting int pointer to char pointer

I've read several posts about casting int pointers to char pointers but i'm still confused on one thing.
I understand that integers take up four bytes of memory (on most 32 bit machines?) and characters take up on byte of memory. By casting a integer pointer to a char pointer, will they both contain the same address? Does the cast operation change the value of what the char pointer points to? ie, it only points to the first 8 bits of an integers and not all 32 bits ? I'm confused as to what actually changes when I cast an int pointer to char pointer.
By casting a integer pointer to a char pointer, will they both contain the same address?
Both pointers would point to the same location in memory.
Does the cast operation change the value of what the char pointer points to?
No, it changes the default interpretation of what the pointer points to.
When you read from an int pointer in an expression *myIntPtr you get back the content of the location interpreted as a multi-byte value of type int. When you read from a char pointer in an expression *myCharPtr, you get back the content of the location interpreted as a single-byte value of type char.
Another consequence of casting a pointer is in pointer arithmetic. When you have two int pointers pointing into the same array, subtracting one from the other produces the difference in ints, for example
int a[20] = {0};
int *p = &a[3];
int *q = &a[13];
ptrdiff_t diff1 = q - p; // This is 10
If you cast p and q to char, you would get the distance in terms of chars, not in terms of ints:
char *x = (char*)p;
char *y = (char*)q;
ptrdiff_t diff2 = y - x; // This is 10 times sizeof(int)
Demo.
The int pointer points to a list of integers in memory. They may be 16, 32, or possibly 64 bits, and they may be big-endian or little endian. By casting the pointer to a char pointer, you reinterpret those bits as characters. So, assuming 16 bit big-endian ints, if we point to an array of two integers, 0x4142 0x4300, the pointer is reinterpreted as pointing to the string "abc" (0x41 is 'a', and the last byte is nul). However if integers are little endian, the same data would be reinterpreted as the string "ba".
Now for practical purposes you are unlikely to want to reinterpret integers as ascii strings. However its often useful to reinterpret as unsigned chars, and thus just a stream of raw bytes.
Casting a pointer just changes how it is interpreted; no change to its value or the data it points to occurs. Using it may change the data it points to, just as using the original may change the data it points to; how it changes that data may differ (which is likely the point of doing the casting in the first place).
A pointer is a particular variable that stores the memory address where another variable begins. Doesnt matter if the variable is a int or a char, if the first bit has the same position in the memory, then a pointer to that variable will look the same.
the difference is when you operate on that pointer. If your pointer variable is p and it's a int pointer, then p++ will increase the address that it contains of 4 bytes.
if your pointer is p and it's a char pointer, then p++ will increase the address that it contains of 1 byte.
this code example will help you understand:
int main(){
int* pi;
int i;
char* pc;
char c;
pi = &i;
pc = &c;
printf("%p\n", pi); // 0x7fff5f72c984
pi++;
printf("%p\n", pi); // 0x7fff5f72c988
printf("%p\n", pc); // 0x7fff5f72c977
pc++;
printf("%p\n", pc); // 0x7fff5f72c978
}

Type casting char pointer to integer pointer

So I saw a few example on how the endianness of an architecture could be found. Let's say we have an integer pointer that points to an int data type. And let's say the int value is 0x010A0B12. In a little endian architecture, the least significant byte, i.e, 12, will be stored in the lowest memory address, right? So the lowest byte in a 4-byte integer will be 12.
Now, on to the check. If we declare a char pointer p, and type cast the integer pointer to a char * and store it in p, and print the dereferenced value of p, we will get a clue on the endianness of the architecture. If it's 12, we're little endian; 01 signifies big endian. This sounds really neat...
int a = 0x010A0B12;
int *i = &a;
char *p = (char*)i;
printf("%d",*p); // prints the decimal equivalent of 12h!
Couple of questions here, really. Since pointers are strongly typed, shouldn't a character pointer strictly point to a char data type? And what's up with printing with %d? Shouldn't we rather print with %c, for character?
Since pointers are strongly typed, shouldn't a character pointer strictly point to a char data type?
C has a rule that any pointer can be safely converted to char* and to void*. Converting an int* to char*, therefore, is allowed, and it is also portable. The pointer would be pointing to the initial byte of your int's internal representation.
Shouldn't we rather print with %c, for character?
Another thing is in play here: variable-length argument list of printf. When you pass a char to an untyped parameter of printf, the default conversion applies: char gets converted to int. That is why %d format takes the number just fine, and prints it out as you expect.
You could use %c too. The code that processes %c specifier reads the argument as an int, and then converts it to a char. 0x12 is a special character, though, so you would not see a uniform printout for it.
Since pointers are strongly typed, shouldn't a character pointer strictly point to a char data type?
This is kind of undefined behavior - but such that most sane implementations will do what you mean. So most people would say ok to it.
And what's up with printing with %d?
Format %d expects argument of type int, and the actual arg of type char is promoted to int by usual C rules. So this is ok again. You probably don't want to use %c since the content of byte pointed by p may be any byte, not always a valid text character.

typecasting a pointer to an int .

I can't understand the output of this program .
What I get of it is , that , first of all , the pointers p, q ,r ,s were pointing towards null .
Then , there has been a typecasting . But how the heck , did the output come as 1 4 4 8 . I might be very wrong in my thoughts . So , please correct me if I am wrong .
int main()
{
int a, b, c, d;
char* p = (char*)0;
int *q = (int *)0;
float* r = (float*)0;
double* s = (double*)0;
a = (int)(p + 1);
b = (int)(q + 1);
c = (int)(r + 1);
d = (int)(s + 1);
printf("%d %d %d %d\n", a, b, c, d);
_getch();
return 0;
}
Pointer arithmetic, in this case adding an integer value to a pointer value, advances the pointer value in units of the type it points to. If you have a pointer to an 8-byte type, adding 1 to that pointer will advance the pointer by 8 bytes.
Pointer arithmetic is valid only if both the original pointer and the result of the addition point to elements of the same array object, or just past the end of it.
The way the C standard describes this is (N1570 6.5.6 paragraph 8):
When an expression that has integer type is added to or subtracted
from a pointer, the result has the type of the pointer operand. If the
pointer operand points to an element of an array object, and the array
is large enough, the result points to an element offset from the
original element such that the difference of the subscripts of the
resulting and original array elements equals the integer expression.
[...]
If both the pointer operand and the result point to elements of the
same array object, or one past the last element of the array object,
the evaluation shall not produce an overflow; otherwise, the behavior
is undefined. If the result points one past the last element of the
array object, it shall not be used as the operand of a unary *
operator that is evaluated.
A pointer just past the end of an array is valid, but you can't dereference it. A single non-array object is treated as a 1-element array.
Your program has undefined behavior. You add 1 to a null pointer. Since the null pointer doesn't point to any object, pointer arithmetic on it is undefined.
But compilers aren't required to detect undefined behavior, and your program will probably treat a null pointer just like any valid pointer value, and perform arithmetic on it in the same way. So if the null pointer points to address 0 (this is not guaranteed, BTW, but it's very common), then adding 1 to it will probably give you a pointer to address N, where N is the size in bytes of the type it points to.
You then convert the resulting pointer to int (which is at best implementation-defined, will lose information if pointers are bigger than int, and may yield a trap representation) and you print the int value. The result, on most systems, will probably show you the sizes of char, int, float, and double, which are commonly 1, 4, 4, and 8 bytes, respectively.
Your program's behavior is undefined, but the way it actually behaves on your system is typical and unsurprising.
Here's a program that doesn't have undefined behavior that illustrates the same point:
#include <stdio.h>
int main(void) {
char c;
int i;
float f;
double d;
char *p = &c;
int *q = &i;
float *r = &f;
double *s = &d;
printf("char: %p --> %p\n", (void*)p, (void*)(p + 1));
printf("int: %p --> %p\n", (void*)q, (void*)(q + 1));
printf("float: %p --> %p\n", (void*)r, (void*)(r + 1));
printf("double: %p --> %p\n", (void*)s, (void*)(s + 1));
return 0;
}
and the output on my system:
char: 0x7fffa67dc84f --> 0x7fffa67dc850
int: 0x7fffa67dc850 --> 0x7fffa67dc854
float: 0x7fffa67dc854 --> 0x7fffa67dc858
double: 0x7fffa67dc858 --> 0x7fffa67dc860
The output is not as clear as your program's output, but if you examine the results closely you can see that adding 1 to a char* advances it by 1 byte, an int* or float* by 4 bytes, and a double* by 8 bytes. (Other than char, which by definition has a size of 1 bytes, these may vary on some systems.)
Note that the output of the "%p" format is implementation-defined, and may or may not reflect the kind of arithmetic relationship you might expect. I've worked on systems (Cray vector computers) where incrementing a char* pointer would actually update a byte offset stored in the high-order 3 bits of the 64-bit word. On such a system, the output of my program (and of yours) would be much more difficult to interpret unless you know the low-level details of how the machine and compiler work.
But for most purposes, you don't need to know those low-level details. What's important is that pointer arithmetic works as it's described in the C standard. Knowing how it's done on the bit level can be useful for debugging (that's pretty much what %p is for), but is not necessary to writing correct code.
Adding 1 to a pointer advances the pointer to the next address appropriate for the pointer's type.
When the (null)pointers+1 are recast to int, you are effectively printing the size of each of the types being pointed to by the pointers.
printf("%d %d %d %d\n", sizeof(char), sizeof(int), sizeof(float), sizeof(double) );
does pretty much the same thing. If you want to increment each pointer by only 1 BYTE, you'll need to cast them to (char *) before incrementing them to let the compiler know
Search for information about pointer arithmetic to learn more.
You're typecasting the pointers to primitive datatypes rather type casting them to pointers themselves and then using * (indirection) operator to indirect to that variable value. For instance, (int)(p + 1); means p; a pointer to constant, is first incremented to next address inside memory (0x1), in this case. and than this 0x1 is typecasted to an int. This totally makes sense.
The output you get is related to the size of each of the relevant types. When you do pointer arithmetic as such, it increases the value of the pointer by the added value times the base type size. This occurs to facilitate proper array access.
Because the size of char, int, float, and double are 1, 4, 4, and 8 respectively on your machine, those are reflected when you add 1 to each of the associated pointers.
Edit:
Removed the alternate code which I thought did not exhibit undefined behavior, which in fact did.

Why does my homespun sizeof operator need a char* cast?

Below is the program to find the size of a structure without using sizeof operator:
struct MyStruct
{
int i;
int j;
};
int main()
{
struct MyStruct *p=0;
int size = ((char*)(p+1))-((char*)p);
printf("\nSIZE : [%d]\nSIZE : [%d]\n", size);
return 0;
}
Why is typecasting to char * required?
If I don't use the char* pointer, the output is 1 - why?
Because pointer arithmetic works in units of the type pointed to. For example:
int* p_num = malloc(10 * sizeof(int));
int* p_num2 = p_num + 5;
Here, p_num2 does not point five bytes beyond p_num, it points five integers beyond p_num. If on your machine an integer is four bytes wide, the address stored in p_num2 will be twenty bytes beyond that stored in p_num. The reason for this is mainly so that pointers can be indexed like arrays. p_num[5] is exactly equivalent to *(p_num + 5), so it wouldn't make sense for pointer arithmetic to always work in bytes, otherwise p_num[5] would give you some data that started in the middle of the second integer, rather than giving you the sixth integer as you would expect.
In order to move a specific number of bytes beyond a pointer, you need to cast the pointer to point to a type that is guaranteed to be exactly 1 byte wide (a char).
Also, you have an error here:
printf("\nSIZE : [%d]\nSIZE : [%d]\n", size);
You have two format specifiers but only one argument after the format string.
If I don't use the char* pointer, the output is 1 - WHY?
Because operator- obeys the same pointer arithmetic rules that operator+ does. You incremented the sizeof(MyStruct) when you added one to the pointer, but without the cast you are dividing the byte difference by sizeof(MyStruct) in the operator- for pointers.
Why not use the built in sizeof() operator?
Because you want the size of your struct in bytes. And pointer arithmetics implicitly uses type sizes.
int* p;
p + 5; // this is implicitly p + 5 * sizeof(int)
By casting to char* you circumvent this behavior.
Pointer arithmetic is defined in terms of the size of the type of the pointer. This is what allows (for example) the equivalence between pointer arithmetic and array subscripting -- *(ptr+n) is equivalent to ptr[n]. When you subtract two pointers, you get the difference as the number of items they're pointing at. The cast to pointer to char means that it tells you the number of chars between those addresses. Since C makes char and byte essentially equivalent (i.e. a byte is the storage necessary for one char) that's also the number of bytes occupied by the first item.

Resources