Type casting the character pointer

I am from a Java background. I am learning C, and I came across a code snippet for type conversion from int to char.
int a=5;
int *p;
p=&a;
char *a0;
a0=(char* )p;
My question is: why do we use (char *)p instead of (char)p?
We are only casting the 4-byte memory (int) to 1 byte (char), and not the value related to it.

You need to think of pointers as variables that contain addresses. Their sole purpose is to show you where to look in memory.
so consider this:
int a = 65;
void* addr = &a;
Now 'addr' contains the address of the memory where 'a' is located.
What you do with it is up to you.
Here I decided to "see" that part of the memory as an ASCII character, which you can print to display the character 'A':
char* car_A = (char*)addr;
putchar(*car_A); // print: A (ASCII code for 'A' is 65)
if instead you decide to do what you suggested:
char* a0 = (char)addr;
The right-hand side of the assignment, (char)addr, converts the pointer 'addr' (likely 4 or 8 bytes) to a char (1 byte), discarding most of the address.
The truncated result is then stored in the pointer 'a0' as if it were an address.
If you don't see why this doesn't make sense, let me clarify with a concrete example.
Say the address of 'a' is 0x002F4A0E (assuming pointers are stored in 4 bytes). Then
'*(int*)addr' is equal to 65
'addr' is equal to 0x002F4A0E
When you cast it with (char)addr, the result is 0x0E.
So the line
char* a0 = (char)addr;
becomes
char* a0 = 0x0E;
So 'a0' will end up pointing to the address 0x0000000E, and we don't know what is at that location.
I hope this clarifies your problem.

First of all, p is not necessarily 4 bytes since it's architecture-dependent. Second, p is a pointer to an integer, a0 is a pointer to a character, not a character. You're taking a pointer pointing to an integer and casting it to a pointer to a character. There are few good reasons to do this. You could also cast the value to a character, but I can't imagine any reason for doing this either.

Pointers do not provide information about whether they point to a single object or to the first object of an array.
Consider
int *p;
int a[5] = { 1, 2, 3, 4, 5 };
int x = 1;
p = a;
p = &x;
So given only the value in the pointer p, you cannot tell whether it is the address of the first element of the array a or the address of the single object x.
It is your responsibility to interpret the address correctly.
In this expression-statement
a0=(char* )p;
the address of the extent of memory pointed to by the pointer p and occupied by an object of the type int (it is unknown whether it is a single object or the first object of an array) is interpreted as an address of an extent of memory occupied by an object of the type char. Whether it is a single object of the type char or the first object of a character array with the size equal to sizeof( int ) depends on your intention that is how you are going to deal with the pointer.

Related

Difference between char *pp and (char*) p?

I am having a problem with an exercise in which I have to explain how pointers work in C.
Can you explain the difference between char *pp and (char*) p, and the output, to me?
#include <stdio.h>
#include <stdlib.h>
/*
*
*/
int main(int argc, char** argv) {
int n=260, *p=&n;
printf("n=%d\n", n);
char *pp=(char*)p;
*pp=0;
printf("n=%d\n",n);
return (EXIT_SUCCESS);
}
n=260
n=256
I'm so sorry for the mistake I've done! Hope you guys can help me.

Your question is a basic one, but it is one that every new C programmer wrestles with, and it is fundamental to understanding C: understanding pointers. While they are easy to understand once you understand them, getting to that point can be frustrating because of the way the information is presented in many books or tutorials.
Pointer Basics
A pointer is simply a normal variable that holds the address of something else as its value. In other words, a pointer points to the address where something else can be found. Where you normally think of a variable holding an immediate value, such as int n = 260;, a pointer (e.g. int *p = &n;) would simply hold the address where 260 is stored in memory.
If you need to access the value stored at the memory address pointed to by p, you dereference p using the unary '*' operator, (e.g. int j = *p; will initialize j = 260).
If you want to obtain a variable's address in memory, you use the & (address of) operator. If you need to pass a variable as a pointer, you simply provide the address of the variable as a parameter.
Since p points to the address where 260 is stored, if you change the value at that address (e.g. *p = 41;), 41 is now stored at the address where 260 was before. Since p points to the address of n and you have changed the value at that address, n now equals 41. However, j resides in another memory location and its value was set before you changed the value at the address of n, so the value of j remains 260.
Pointer Arithmetic
Pointer arithmetic works the same way regardless of the type of object pointed to, because the type of the pointer controls the arithmetic. For example, with a char * pointer, pointer+1 points to the next byte (next char); with an int * pointer (normal 4-byte integer), pointer+1 points to the next int, at an offset 4 bytes after pointer. (So a pointer is just a pointer, and the arithmetic is automatically handled by the type.)
In your case you create a second pointer of a different type, char *pp = (char*)p;. The pointer pp now also holds the address of n, but it is interpreted as type char on access instead of type int.
The C standard restricts access to a value stored at an address through a pointer of a different type: see C11 Standard - §6.5 Expressions (p6,7) (known as the strict-aliasing rule). There are exceptions to the rule. One exception (the last point) is that any value may be accessed through a pointer of character type.
What Happens to the Value of n In Your Case?
When you assign:
*pp = 0;
you are storing the single byte 0 (00000000 in binary) at the memory location held by pp. Here is where endianness (little-endian, big-endian) comes into play. Recall that on little-endian computers (just about all x86 and x86_64 IBM-PC clone type boxes), values are stored in memory with the least-significant byte first (big-endian stores values with the most-significant byte first). So your original value of n (100000100 in binary) is stored in memory on a little-endian box as
n (little endian) : 00000100-00000001-00000000-00000000 (260)
^
|
p (type int)
The character pointer pp is assigned the address held by p, so both p and pp hold the same address (the difference being that one is a pointer to int and the other a pointer to char):
n (little endian) : 00000100-00000001-00000000-00000000 (260)
^
|
p (type int)
pp (type char)
When you dereference pp (e.g. *pp) and assign the value zero (e.g. *pp = 0;), you overwrite the first byte of n in memory with zero. After the assignment, you now have:
n (little endian) : 00000000-00000001-00000000-00000000 (256)
^
|
p (type int)
pp (type char)
This is the binary value 100000000 (256, or hex 0x0100), which is what your code outputs for the value of n. Ask yourself this: if the computer you were using was big-endian, what would the resulting value have been?
Let me know if you have any further questions.

char *pp declares the variable pp as a pointer to char - pp will store the address of a char object.
(char *)p is a cast expression - it means “treat the value of p as a char *”.
p was declared as an int * - it stores the address of an int object (in this case, the address of n). The problem is that the char * and int * types are not compatible - you can’t assign one to the other directly. You have to use a cast to convert the value to the right type.
Pointers to different types are themselves different types, and do not have to have the same size or representation. The one exception is the void * type - it was introduced specifically to be a “generic” pointer type, and you don’t need to explicitly cast when assigning between void * and other pointer types.

Can I dereference the address of an integer pointer?

I am trying to figure out all the possible ways I could fill in int pointer k considering the following givens:
int i = 40;
int *p = &i;
int *k = ___;
So far I came up with "&i" and "p". However, is it possible to fill in the blank with "*&p" or "&*p"?
My understanding of "*&p" is that it is dereferencing the address of an integer pointer. Which to me means if printed out would output the content of p, which is &i. Or is that not possible when initializing an int pointer? Or is it even possible at all anytime?
I understand "&*p" as the memory address of the integer *p points to. This one I am really unsure about also.
If anyone has any recommendations or suggestions I will greatly appreciate it! Really trying to understand pointers better.

Pointer Basics
A pointer is simply a normal variable that holds the address of something else as its value. In other words, a pointer points to the address where something else can be found. Where you normally think of a variable holding an immediate value, such as int i = 40;, a pointer (e.g. int *p = &i;) would simply hold the address where 40 is stored in memory.
If you need the value stored at the memory address p points to, you dereference p using the unary '*' operator (e.g. int j = *p; will initialize j = 40).
Since p points to the address where 40 is stored, if you change the value at that address (e.g. *p = 41;), 41 is now stored at the address where 40 was before. Since p points to the address of i and you have changed the value at that address, i now equals 41. However, j resides in another memory location and its value was set before you changed the value at the address of i, so the value of j remains 40.
If you want to create a second pointer (e.g. int *k;), you are just creating another variable that holds an address as its value. If you want k to reference the same address held by p as its value, you simply initialize k the same way you would initialize any other variable, by assigning its value when it is declared, e.g. int *k = p; (which is the same as assigning k = p; at some point after the declaration).
Pointer Arithmetic
Pointer arithmetic works the same way regardless of the type of object pointed to, because the type of the pointer controls the arithmetic. For example, with a char * pointer, pointer+1 points to the next byte (next char); with an int * pointer (normal 4-byte integer), pointer+1 points to the next int, at an offset 4 bytes after pointer. (So a pointer is just a pointer, and the arithmetic is automatically handled by the type.)
Chaining & and * Together
The operators available to take the address of an object and to dereference pointers are the unary '&' (address of) operator and the unary '*' (dereference) operator. '&', in taking the address of an object, adds one level of indirection. '*', in dereferencing a pointer to get the value (or thing) pointed to by the pointer, removes one level of indirection. So, as @KamilCuk explained by example in his comment, it does not matter how many times you apply one after the other: one simply adds and the other removes a level of indirection, making all but the final operator superfluous.
(note: when dealing with an array of pointers, the postfix [..] operator used to obtain the pointer at an index of the array also acts to dereference the array of pointers, removing one level of indirection)
Your Options
Given your declarations:
int i = 40;
int *p = &i;
int *k = ___;
and the pointer summary above, you have two options, both are equivalent. You can either initialize the pointer k with the address of i directly, e.g.
int *k = &i;
or you can initialize k by assigning the address held by p, e.g.
int *k = p;
Either way, k now holds, as its value, the memory location for i where 40 is currently stored.

I am a little bit unsure what you're trying to do but,
int* p = &i;
now, saying &*p is really just like saying p since this gives you the address.
Just that p is much clearer.

The rule is (quoting C11 standard footnote 102) that for any pointer E
&*E is equivalent to E
You can have as many &*&*&*... in front of any pointer type variable that is on the right side of =.
With the &*&*&* sequence below I denote: zero or more &* sequences. I've put a space after it so it's, like, somehow visible. So: we can assign pointer k to the address of i:
int *k = &*&*&* &i;
and assign k to the same value as p has:
int *k = &*&*&* p;
We can also take the address of pointer p, i.e. &p; it has type int **, so it is a pointer to a pointer to int. And then we can dereference that address, i.e. *&p. It will always be equal to p.
int *k = &*&*&* *&p;
is it possible to fill in the blank with "*&p" or "&*p"?
Yes, both are correct. *&p first takes the address of the variable p and then dereferences it, as I said above. *&variable should always be equal to the value of variable. The second, &*p, is equal to p.
My understanding of "*&p" is that it is dereferencing the address of an integer pointer. Which to me means if printed out would output the content of p, which is &i. Or is that not possible when initializing an int pointer? Or is it even possible at all anytime?
Yes and yes. It is possible, anytime, with any type. The &* is possible with complete types only.
Side note: It gets really funny with functions. The dereference operator * is ignored in front of a function or a function pointer. This is just a rule in C. See e.g. this question. You can have an arbitrarily long sequence of * and & in front of a function or a function pointer as long as there are no && sequences in it. It gets ridiculous:
void func(void);
void (*funcptr)(void) = ***&***********&*&*&*&****func;
void (*funcptr2)(void) = ***&***&***&***&***&*******&******&**funcptr;
Both funcptr and funcptr2 are assigned the same value and both point to function func.

casting int pointer to char pointer

I've read several posts about casting int pointers to char pointers but i'm still confused on one thing.
I understand that integers take up four bytes of memory (on most 32 bit machines?) and characters take up one byte of memory. By casting a integer pointer to a char pointer, will they both contain the same address? Does the cast operation change the value of what the char pointer points to? I.e., does it only point to the first 8 bits of an integer and not all 32 bits? I'm confused as to what actually changes when I cast an int pointer to a char pointer.

By casting a integer pointer to a char pointer, will they both contain the same address?
Both pointers would point to the same location in memory.
Does the cast operation change the value of what the char pointer points to?
No, it changes the default interpretation of what the pointer points to.
When you read from an int pointer in an expression *myIntPtr you get back the content of the location interpreted as a multi-byte value of type int. When you read from a char pointer in an expression *myCharPtr, you get back the content of the location interpreted as a single-byte value of type char.
Another consequence of casting a pointer is in pointer arithmetic. When you have two int pointers pointing into the same array, subtracting one from the other produces the difference in ints, for example
int a[20] = {0};
int *p = &a[3];
int *q = &a[13];
ptrdiff_t diff1 = q - p; // This is 10
If you cast p and q to char *, you get the distance in terms of chars, not in terms of ints:
char *x = (char*)p;
char *y = (char*)q;
ptrdiff_t diff2 = y - x; // This is 10 times sizeof(int)

The int pointer points to a sequence of integers in memory. They may be 16, 32, or possibly 64 bits, and they may be big-endian or little-endian. By casting the pointer to a char pointer, you reinterpret those bits as characters. So, assuming 16-bit big-endian ints, if we point to an array of two integers, 0x4142 0x4300, the pointer is reinterpreted as pointing to the string "ABC" (0x41 is 'A', and the last byte is nul). However, if the integers are little-endian, the same data would be reinterpreted as the string "BA".
Now, for practical purposes you are unlikely to want to reinterpret integers as ASCII strings. However, it's often useful to reinterpret them as unsigned chars, and thus just as a stream of raw bytes.

Casting a pointer just changes how it is interpreted; no change to its value or the data it points to occurs. Using it may change the data it points to, just as using the original may change the data it points to; how it changes that data may differ (which is likely the point of doing the casting in the first place).

A pointer is a variable that stores the memory address where another variable begins. It doesn't matter whether that variable is an int or a char: if its first byte is at the same position in memory, then a pointer to it will hold the same address.
The difference is when you operate on that pointer. If your pointer variable is p and it's an int pointer, then p++ will increase the address it contains by sizeof(int) bytes (typically 4).
If your pointer is p and it's a char pointer, then p++ will increase the address it contains by 1 byte.
This code example will help you understand:
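#include <stdio.h>

int main(void){
    int* pi;
    int i;
    char* pc;
    char c;
    pi = &i;
    pc = &c;
    // %p expects a void*, hence the casts; the addresses shown are sample output
    printf("%p\n", (void*)pi); // e.g. 0x7fff5f72c984
    pi++;                      // advances by sizeof(int) bytes
    printf("%p\n", (void*)pi); // e.g. 0x7fff5f72c988
    printf("%p\n", (void*)pc); // e.g. 0x7fff5f72c977
    pc++;                      // advances by 1 byte
    printf("%p\n", (void*)pc); // e.g. 0x7fff5f72c978
    return 0;
}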

In C, why can't an integer value be assigned to an int* the same way a string value can be assigned to a char*?

I've been looking through the site but haven't found an answer to this one yet.
It is easiest (for me at least) to explain this question with an example.
I don't understand why this is valid:
#include <stdio.h>
int main(int argc, char* argv[])
{
char *mystr = "hello";
}
But this produces a compiler warning ("initialization makes pointer from integer without a cast"):
#include <stdio.h>
int main(int argc, char* argv[])
{
int *myint = 5;
}
My understanding of the first program is that it creates a variable called mystr of type pointer-to-char, the value of which is the address of the first char ('h') of the string literal "hello". In other words, with this initialization you not only get the pointer, but also define the object ("hello" in this case) which the pointer points to.
Why, then, does int *myint = 5; seemingly not achieve something analogous to this, i.e. create a variable called myint of type pointer-to-int, the value of which is the address of the value '5'? Why doesn't this initialization both give me the pointer and also define the object which the pointer points to?

In fact, you can do so using a compound literal, a feature added to the language by the 1999 ISO C standard.
A string literal is of type char[N], where N is the length of the string plus 1. Like any array expression, it's implicitly converted, in most but not all contexts, to a pointer to the array's first element. So this:
char *mystr = "hello";
assigns to the pointer mystr the address of the initial element of an array whose contents are "hello" (followed by a terminating '\0' null character).
Incidentally, it's safer to write:
const char *mystr = "hello";
There are no such implicit conversions for integers -- but you can do this:
int *ptr = &(int){42};
(int){42} is a compound literal, which creates an anonymous int object initialized to 42; & takes the address of that object.
But be careful: The array created by a string literal always has static storage duration, but the object created by a compound literal can have either static or automatic storage duration, depending on where it appears. That means that if the value of ptr is returned from a function, the object with the value 42 will cease to exist while the pointer still points to it.
As for:
int *myint = 5;
that attempts to assign the value 5 to an object of type int*. (Strictly speaking it's an initialization rather than an assignment, but the effect is the same). Since there's no implicit conversion from int to int* (other than the special case of 0 being treated as a null pointer constant), this is invalid.

When you do char* mystr = "foo";, the compiler will create the string "foo" in a special read-only portion of your executable, and effectively rewrite the statement as char* mystr = address_of_that_string;
The same is not implemented for any other type, including integers. int* myint = 5; will set myint to point to address 5.

I'll split my answer into two parts.
1st, why char* str = "hello"; is valid:
char* str declares space for a pointer (a number that represents a memory address on the current architecture).
When you write "hello", the compiler sets aside 6 bytes of static storage (usually in a read-only data section, not on the stack)
(don't forget the null terminator), let's say at addresses 0x1000 - 0x1005.
str = "hello" assigns the start address of those 6 bytes (0x1000) to str.
So what we have is:
1. str, which takes 4 bytes in memory on a 32-bit machine and holds the number 0x1000 (it points to the first char only!)
2. 6 bytes 'h' 'e' 'l' 'l' 'o' '\0'
2nd, why int* ptr = 0x105A4DD9; isn't valid:
Well, this is not entirely true!
As said before, a pointer is a number that represents an address,
so why can't I assign that number?
It is not common, because mostly you obtain the addresses of existing data rather than entering an address manually.
But you can if you need to!
Because it isn't something that is commonly done,
the compiler wants to make sure you are doing it on purpose and not by mistake, so it forces you to cast your data:
int* ptr = (int*)0x105A4DD9;
(used mostly for memory-mapped hardware resources)
Hope this clears things up.
Cheers

"In C, why can't an integer value be assigned to an int* the same way a string value can be assigned to a char*?"
Because it's not even a similar situation, let alone "the same way".
A string literal is an array of chars which – being an array – can be implicitly converted to a pointer to its first element. Said pointer is a char *.
But an int is neither a pointer in itself, nor an array, nor anything else implicitly convertible to a pointer. These two scenarios just don't have anything in common.

The problem is that you are trying to assign the address 5 to the pointer. Here you are not dereferencing the pointer, you are declaring it as a pointer and initializing it to the value 5 (as an address which surely is not what you intend to do). You could do the following.
#include <stdio.h>
int main(int argc, char* argv[])
{
int *myint, b;
b = 5;
myint = &b;
}

Dereferencing and typecasting

I've constructed the following sections of code to help myself understand pointer dereferencing and typecasting in C.
char a = 'a';
char * b = &a;
int i = (int) *b;
For the above, I understand that on the 3rd line, I've dereferenced b and got 'a' and (int) will typecast the value of 'a' to its corresponding value of 97 which is stored into i. But for this section of code:
char a = 'a';
char * b = &a;
int i = *(int *)b;
This results in i being some arbitrary large number like 792351. I'm assuming this is a memory address but my question is why? When I typecast b to an integer pointer, does this actually cause b to point to a different area in memory? What is going on?
EDIT: If the above doesn't work, then why would something like this work:
char a = 'a';
void * b = &a;
char c = *(char *)b;
This correctly assigns 'a' to c.

Your int is larger than your char - you get the 'a' value + some random data following it in memory.
E.g, assuming this layout in memory:
'a'
0xFF
0xFF
0xFF
Your char * and int * both point to the 'a'. When you dereference the char *, you get only the first byte, the 'a'. When you dereference the int * (assuming your int is 32-bit) you get the 'a' and the 3 bytes of uninitialized data following it.
EDIT: In response to updated question:
In char c = *(char *)b;, b still points at the 'a' value. You cast it to a char *, and then dereference it, getting the char pointed to by a char *

The last line you're concerned about does a very bad thing. First, it treats b as an int* whereas b is a char*. That is, the memory pointed to by b is assumed to be 4 bytes (typically) instead of 1 byte. So when you dereference it, it goes to the 1 byte pointed to by the actual b, takes the following 3 bytes too, treats those 4 bytes as a single int, and gives you the result. That's why it's garbage.
In general, casting one pointer type to another pointer type must be done with great caution.

You're casting a char pointer to an int pointer. Characters are (usually) stored as 8 bits. ints, on the other hand, are typically 32 bits (even on most 64-bit systems). So if you look at the other 24 bits of memory next to the 8 bits that b points to, you'll get a bunch of extra bits that weren't initialized. Even the position of *b in i is architecture dependent. Writing i with its most significant byte on the left:
big-endian: 0110 0001|**** ****|**** ****|**** ****
little-endian: **** ****|**** ****|**** ****|0110 0001
When you read the stored character through the int pointer, all the asterisks become relevant.

Since a char is 1 Byte long, and an int 4, when you read an int from the address of a single character, you're reading the character and 3 more bytes. The content of these bytes is just whatever happens to lie in memory (pointers, the value of b) and could even be unallocated (resulting in a segmentation fault).

When you typecast it to (int *), it will refer to a total of 4 bytes (the size of int) in memory.

In the second case, you're treating the same address as if it pointed to an int. Officially, the result is simply undefined behavior.
Realistically, what happens is that whatever happens to be in the four bytes starting at that address gets interpreted as an int. (That is 4 bytes assuming a 32-bit int; if your implementation has, for example, a 64-bit int, it'll be 8 bytes.)