I more or less understand the idea of casting, but I still have the following questions about what happens when we cast a variable from one type to another:
Does casting change the actual data type generally? Like eg. if we have char *name ="james" and we cast the char pointer ==> int *name = (int *) name
Do the types of all fields (members) also change in case of a structure data type in C? I.e. if we have a struct student {int id, char *name} and there is a pointer to an instance of struct student and type cast it to another type, do the fields also change?
From another answer here,
casting (also called type coercion) does not change the data - the
underlying bits - but it changes the type, i.e., how those bits are
interpreted.
In other words, a cast tells the compiler:
"You would think the expression has this type, but I am telling you to use the expression as if it had this different type"
At this point the compiler says "Ok" and compiles code to do what you asked for with the cast. Take this example (probably there are better ones):
int a,b;
int *p;
a=256; // does not fit in a single byte
p=&a; // a pointer to a
b=*p; // <- now b is 256, the value of a
b=*(char *) p; // <- now b is 0 (probably)
The two assignments to "b" are different. In the first case the compiler knows that p points to an int, so it copies sizeof(int) bytes (typically 4) from a into b.
The second assignment, with the cast, tells the compiler to act as if p were not a pointer to an int, but a pointer to a char. The compiler obeys, and takes only one byte from the value of a (pointed to by p).
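To see which byte that single-byte read actually returns on a given machine, here is a minimal sketch. The function names are made up for illustration, and it uses unsigned char rather than plain char so the byte value is never negative.

```c
/* Returns the byte stored at the lowest address of an int object.
   Reading any object through an unsigned char pointer is allowed. */
unsigned char first_byte_of_int(int value)
{
    const unsigned char *bytes = (const unsigned char *)&value;
    return bytes[0];
}

/* Returns 1 if the least significant byte is stored first
   (little-endian), 0 otherwise. */
int is_little_endian(void)
{
    return first_byte_of_int(1) == 1;
}
```

On a little-endian machine first_byte_of_int(256) is 0, which is why the example above says "probably".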
Does casting change the actual data type generally? Like eg. if we have char *name ="james" and we cast the char pointer ==> int *name = (int *) name
Casting a pointer does not change the memory the pointer points to.
When a pointer, say p, is dereferenced, as with *p, and used to get a value from memory, the type of the pointer determines how the compiler interprets the memory. If p is a char *, then using *p for its value loads one byte from memory and interprets it as a char value. If p is an int *, then using *p for its value loads as many bytes from memory as an int uses and interprets them as an int value. If p is an int * and we use * (char *) p for its value, then (char *) p is a char *, so * (char *) p loads one byte from memory and interprets it as a char value.
Conversely, when using *p to store a value to memory, the value will be encoded according to the rules for the *p type, and the resulting bytes will be written to memory.
The C standard has rules about which types may be used to access memory that has been established to contain data of another type. If those rules are not followed, the behavior of the program is not defined by the C standard. Using an int * converted from a pointer to a char in an array of char is not one of the defined uses. Nominally, it asks the compiler to interpret the bytes of the array as if they encoded an int value, but, because the code is not following the rules of the C standard, the compiler might or might not do that.
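If the goal really is to interpret some bytes as an int, copying them with memcpy avoids the aliasing problem, because no object is accessed through a pointer of the wrong type. This is only a sketch with hypothetical helper names, and it assumes the buffer holds at least sizeof(int) bytes.

```c
#include <string.h>

/* Copies the object representation of value into out[0..sizeof(int)-1]. */
void int_to_bytes(int value, unsigned char *out)
{
    memcpy(out, &value, sizeof value);
}

/* Rebuilds an int from sizeof(int) bytes.
   The buffer does not need to be aligned for int. */
int int_from_bytes(const unsigned char *bytes)
{
    int value;
    memcpy(&value, bytes, sizeof value);
    return value;
}
```

Whether the resulting int is meaningful still depends on what the bytes were, but the access itself is defined.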
Do the types of all fields (members) also change in case of a structure data type in C? I.e. if we have a struct student {int id, char *name} and there is a pointer to an instance of struct student and type cast it to another type, do the fields also change?
Casting a pointer does not change the memory the pointer points to. If you cast a pointer to one structure type to a pointer to another structure type and attempt to access memory with it, it might not work as you desire.
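One struct pointer cast that is guaranteed to work is converting a pointer to a struct into a pointer to its first member (C 2011 6.7.2.1). The sketch below uses the struct student from the question, written with valid member syntax. The function name is made up for illustration.

```c
struct student {
    int id;
    char *name;
};

/* A pointer to a struct, suitably converted, points to its first member.
   The members themselves are untouched: id is still an int, name is still
   a char *. */
int id_via_first_member(struct student *s)
{
    int *first = (int *)s;
    return *first;
}
```

Casting to a pointer to some unrelated type and dereferencing it gives no such guarantee.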
If I have a pointer to an array of char*s, in other words, a char** named p1, what would I get if I do (char*)p1? I’m guessing there would be some loss of precision. What information would I lose, and what would p1 now be pointing to? Thanks!
If you had asked about converting an int ** to an int *, the answer would be different. Let’s consider that first, because I suspect it is more representative of the question you intended to ask, and the char * case is more complicated because there is a special purpose involved in that.
Suppose you have several int: int a, b, c, d;. You can make an array of pointers to them: int *p[] = { &a, &b, &c, &d };. You can also make a pointer to one of these pointers: int **q = &p[1];. Now q points to p[1], which contains the address of b.
When you write *q, the compiler knows q points to a pointer to an int, so it knows *q points to an int. If you write **q, the compiler, knowing that *q points to an int, will get *q from memory and use that as an address to get an int.
What happens if you convert q to an int * and try to use it, as in printf("%d\n", * (int *) q);? When you convert q to an int * you are (falsely) telling the compiler to treat it as a pointer to an int. Then, * (int *) q tells the compiler to go to that address and get an int.
This is invalid code—its behavior is not defined by the C standard. Specifically, it violates C 2018 6.5 7, which says that an object shall be accessed only by an lvalue expression that has a correct type—either a type compatible with that of the actual object or certain other cases, none of which apply here. At the place q points, there is a pointer to an int, but you tried to access it as if it were an int, and that is not allowed.
Now let’s consider the char ** to char * case. As before, you might take some char **q and convert it to char *. Now, you are telling the compiler to go to the place q points, where there is a pointer to a char, and to access that memory location as if there were a char there.
C has special rules for this case. You are allowed to examine the bytes that make up objects by accessing them through a char *. So, if you convert a char ** to char * and use it, as in * (char *) q, the result will be the first (lowest addressed) byte that makes up the pointer there. You can even look at the rest of the bytes, using code like this:
char *t = (char *) q;
printf("%d\n", t[0]);
printf("%d\n", t[1]);
printf("%d\n", t[2]);
printf("%d\n", t[3]);
This code will show you the decimal values for the first four bytes that make up the pointer to char that is at the location specified by q.
In summary, converting a char ** to char * will allow you to examine the bytes that represent a char *. However, in general, you should not convert pointers of one indirection level to pointers of another indirection level.
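The byte-inspection idea can be generalized so that it does not hard-code four bytes. The following is a sketch, not a definitive implementation: format_object_bytes is a hypothetical helper, it uses unsigned char so values print as 00 to ff, and it assumes 8-bit bytes.

```c
#include <stddef.h>
#include <stdio.h>

/* Writes the bytes of any object into out as space-separated hex.
   Returns the number of bytes formatted, which is less than size
   if out is too small. Assumes 8-bit bytes. */
size_t format_object_bytes(const void *obj, size_t size,
                           char *out, size_t out_size)
{
    const unsigned char *bytes = obj;
    size_t i, used = 0;

    if (out_size == 0)
        return 0;
    out[0] = '\0';
    for (i = 0; i < size; i++) {
        /* each step writes at most 3 characters plus the terminator */
        if (out_size - used < 4)
            break;
        used += (size_t)snprintf(out + used, out_size - used, "%s%02x",
                                 i ? " " : "", (unsigned)bytes[i]);
    }
    return i;
}
```

For a char **q, calling format_object_bytes(q, sizeof *q, buf, sizeof buf) formats every byte of the char * that q points to.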
I find pointers make much more sense when I think of them as memory addresses rather than some abstract high level thing.
So a char** is a memory address that points to a memory address that points to a character.
0x0020 -> 0x0010 -> 0x0041 'A'
When you cast you change the interpretation of the data, not the actual data. So the char* is
0x0020 -> 0x0010 (unprintable)
This is almost certainly not useful. Interpreting this random data as a null terminated string would be potentially disastrous.
Suppose I have the variable declaration char **p. Does that mean that p is a pointer to a char pointer or that p is a pointer to a pointer of some type that points to a char?
There is a subtle difference between these two chains of pointers.
In other words, what I am trying to ask is given a char pointer to a pointer char **p, *p can obviously be a pointer to a char *, but could it also point to some other pointer type like void * which in turn points to a char?
The type of *p is always char *. It cannot be a void* that happens to be pointing to a char.
Pointer types are derived from some other type - an object type, a function type, or an incomplete type. The type from which the pointer is derived is called its referenced type (C99, 6.2.5.20).
The referenced type of char** is char*, meaning that dereferencing a char** expression yields a char*.
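If you want the compiler to confirm this, C11's _Generic selects on the static type of an expression. This is a small sketch assuming a C11 compiler. The macro and function names are made up for illustration.

```c
/* Evaluates to 1 when expr has type char *, and to 0 for any other type.
   The operand of _Generic is not evaluated, only its type is inspected. */
#define IS_CHAR_POINTER(expr) _Generic((expr), char *: 1, default: 0)

int deref_of_char_pp_is_char_pointer(void)
{
    char **p = 0;
    return IS_CHAR_POINTER(*p);   /* *p is never actually read */
}

int void_pointer_is_char_pointer(void)
{
    void *v = 0;
    return IS_CHAR_POINTER(v);    /* void * is a different type */
}
```

The first function returns 1 and the second returns 0, regardless of what the pointers hold at run time.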
A pointer contains an address. The C compiler uses the pointed-to type, such as char in a definition such as char *pC;, to know how to access the data at the address contained in the pointer variable pC. So to C all addresses are pretty much the same, at least on mainstream computer architectures, and the type just tells the C compiler how many bytes of memory to access, and how to interpret them, when dereferencing the pointer or dereferencing the pointer pointed to by a variable.
So a definition such as char **p; tells the compiler that the variable p contains the address of a memory location, which is accessed by reading the number of bytes of a pointer, that points to another address, which is accessed by reading the number of bytes of a pointer, and that the address pointed to contains the address of a char.
And remember that with the C programming language you can use a cast to persuade the compiler to accept almost anything.
And a void * pointer variable is by definition capable of holding a pointer to any data type.
However, it is your responsibility to make sure that what you are doing actually makes sense. If a void * is involved, it is assumed that it really contains the address of a char. When the variable char **p; is dereferenced, as in char aChar = **p;, it is up to the programmer to ensure that p contains a valid address and that the memory location it points to, *p, also contains a valid address. Or, if you are doing something like char aStr[128]; strcpy (aStr, *p); then *p must contain the address of a zero-terminated string of characters.
And to some extent it depends on the C compiler. Some are more accepting than others. Some will issue warnings and some may issue errors and it probably also depends heavily on the compiler options selected for the compile.
Doing a test compile with Visual Studio 2017 Community Edition I can do the following:
char aStr[] = "this is a string";
void *p = aStr; // perfectly fine
char *pc = aStr; // perfectly fine
char **pp = &p; // warning C4133: 'initializing': incompatible types - from 'void **' to 'char **'
char **pp2 = (char **)&p; // perfectly fine since we are casting the pointer
char **pp3 = &aStr; // warning C4047: 'initializing': 'char **' differs in levels of indirection from 'char (*)[17]'
By the way, the last definition, char **pp3 = &aStr; really should be an error since if you dereference pp3 you do not get a valid pointer to a string.
However using the debugger to look at pp, it points to a valid pointer to a string and I can see the text of aStr.
In the C programming language, char **p is read as: p is a pointer to a pointer to a char. That means p can hold the address of another char pointer.
For example:-
char c = 'A';
char *b = &c;// here b is a char pointer so it can hold the address of a char variable which is c
char **p = &b;// p is a pointer of pointer to a char so here it can hold the address of a pointer of char i.e. b.
printf("%c", **p);// correct and will print A
Here p is not a void pointer, but you can also store the address of b in a void **. Note that char ** does not convert to void ** implicitly, so a cast is needed:
void **p = (void **)&b;
But to get the value of char c we have to cast it back, as below:
printf("%c", **p);// Not correct: **p has type void, so this does not compile
printf("%c", **((char **)p));//correct, first we type cast it to a pointer of pointer to char and dereference it to get the value stored in c
If you use the right-left rule on the declaration you’d see that:
you start with identifier p (P)
p has nothing on the right
you go left to the first * pointer (*P)
you go left again and you see * pointer (**P)
then you go left again and see char (char **P)
all together you can make the conclusion:
P is a pointer to a pointer to char.
Which is a fancy way of saying P is a double pointer to char.
So I'm learning about C pointers, and I'm a little confused. Pointers just point to specific memory address.
sizeof(char*), sizeof(int*), sizeof(double*) all output 8. So they all take 8 bytes to store a memory address.
However, if I try to compile something like this:
int main(void)
{
char letter = 'A';
int *a = &letter;
printf("letter: %c\n", *a);
}
I get a warning from the compiler (gcc):
warning: initialization from incompatible pointer type [-Wincompatible-pointer-types]
int *a = &letter;
However, char *a = &letter;, doesn't result in a warning.
Why does the type of the pointer matter, if it's 8 bytes long anyway? Why does declaring a pointer with a type different from the type of the data it points to result in a warning?
The issue isn't about the size of the pointer - it's about the type of the pointee.
If you have a pointer to an int, the pointer takes up some number of bytes (seems like you have a 64-bit system, where that pointer takes up eight bytes). However, if you dereference that pointer to read or write what it points to, because the type of the pointer is int*, the read or write will try to manipulate sizeof(int) bytes at the target, and it will try to manipulate them as though they're an int.
If you have a single object of type char, which by definition has size 1, and you try to read or write it through a pointer to int, where an int (on many systems) has size 4, then reading through the pointer will pull back some garbage data along with the char, and writing through the pointer will clobber the memory around that char with unrelated values.
Additionally, C has a rule called the strict aliasing rule that says that you are not allowed to read or write through a pointer of a type that doesn't match the type of what's being pointed at (unless the pointer is of type char *, signed char*, or unsigned char*). Breaking strict aliasing can mess up all sorts of compiler optimizations and lead to code that doesn't behave as expected.
So in short, the size of the pointer really isn't the issue here. It's the semantics about what happens when you try to read or write what's being pointed at.
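If what you actually want is the character's value as an int, read it through a pointer of the right type and let the value be converted, instead of converting the pointer. This is a small sketch with made-up function names. sizeof does not evaluate its operand, so the null pointers below are never dereferenced.

```c
#include <stddef.h>

/* Reads exactly one byte through a char * and converts the value to int. */
int letter_as_int(const char *letter)
{
    return *letter;
}

/* Number of bytes an access through each pointer type would touch. */
size_t bytes_read_through_char_pointer(void)
{
    char *p = 0;
    return sizeof *p;   /* always 1 */
}

size_t bytes_read_through_int_pointer(void)
{
    int *p = 0;
    return sizeof *p;   /* sizeof(int), commonly 4 */
}
```

Both pointers may well be 8 bytes wide, but the accesses made through them are different sizes.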
Think about what you're going to do with the pointer.
int n = 42;
char *p = &n; // BAD
If this compiles (a compiler can reject it outright rather than printing a non-fatal warning), you have a pointer that points to the memory occupied by the int object n. How are you going to get the value of that object? *p gives you a char result, most likely the first byte of n -- which may be the high-order byte or the low-order byte.
Pointer types depend on the type of object they point to so that you can access that object.
(Also, don't make assumptions based on the behavior of your particular implementation. 32-bit systems have 4-byte pointers, and the language doesn't guarantee that all pointers are the same size.)
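It's not about the length in bytes but about the type of what you're pointing to. char and int are completely different types. Moreover, sizeof(char) is 1, while sizeof(int *) is 8 on your system: the size of a pointer says nothing about the size of the object it points to.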
I don't understand what kind of property the mystery member is below:
typedef struct _myobject
{
long number;
void *mystery;
} t_myobject;
What kind of member is this void member? How much memory does that take up? Where can I get more information about what that accomplishes (for instance, why would one use a void member?)
EDIT-- updated title to say void*
A void* variable is a "generic" pointer to an address in memory.
The field mystery itself consumes sizeof(void*) bytes in memory, which is typically either 4 or 8, depending on your system (on the size of your virtual memory address space, to be more accurate). However, it may point to some other object which consumes a different amount of memory.
A few usage examples:
int var;
char arr[10];
t_myobject obj;
obj.mystery = &var;
obj.mystery = arr;
obj.mystery = malloc(100);
Your struct declaration says void *, and your question says void. A void pointer member is a pointer to any kind of data, which is cast to the correct pointer type according to conditions known at run-time.
A void member is an "incomplete type" error.
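One common way to handle the "conditions known at run-time" part is to store a tag next to the void * that records what it really points to. The sketch below is an assumption about how one might do that, not something taken from the question. The enum, struct and function names are all hypothetical.

```c
#include <string.h>

enum payload_kind { PAYLOAD_INT, PAYLOAD_STRING };

struct tagged_object {
    enum payload_kind kind;  /* records what payload really points to */
    void *payload;
};

/* Returns the int value, or the string length, depending on the tag. */
long payload_value_or_length(const struct tagged_object *obj)
{
    switch (obj->kind) {
    case PAYLOAD_INT:
        return *(const int *)obj->payload;
    case PAYLOAD_STRING:
        return (long)strlen((const char *)obj->payload);
    }
    return -1;  /* unknown tag */
}
```

The cast is only correct if the tag is kept in sync with the pointer. The compiler cannot check that for you.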
A variable of type void * can hold the address of any object. Assignment to this variable can be done directly, but before dereferencing, it must be cast to the actual pointer type. This is required to tell the compiler how many bytes of memory to access, and how to interpret them, when dereferencing. The data type is what determines the size of a variable.
int a = 10;
char b = 'c';
void *c = NULL;
c = &a;
printf("int is %d\n", *((int*)c));
c = &b;
printf("char is %c\n", *((char*)c));
In the above example, the void pointer variable c first stores the address of the int variable a. So when c is dereferenced, it is cast to int *. This tells the compiler to access sizeof(int) bytes (typically 4) to get the value. In the second printf it is cast to char *, which tells the compiler to access one byte (the size of a char) to get the value.
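Because a void * carries no size information, generic code that takes void * usually receives the size as a separate argument. Here is a minimal sketch of that pattern. swap_objects is a hypothetical name, and it assumes both objects really are size bytes long and do not overlap.

```c
#include <stddef.h>

/* Swaps two objects of the same size, one byte at a time.
   The void * parameters accept the address of any object type. */
void swap_objects(void *a, void *b, size_t size)
{
    unsigned char *pa = a;
    unsigned char *pb = b;
    size_t i;

    for (i = 0; i < size; i++) {
        unsigned char tmp = pa[i];
        pa[i] = pb[i];
        pb[i] = tmp;
    }
}
```

A call looks like swap_objects(&x, &y, sizeof x); the same function works for int, char, or a struct.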
Wrong question header. The member is 'void *', not 'void'.
A pointer to anything, rather than nothing.
In C, I am having a structure like this
typedef struct
{
char *msg1;
char *msg2;
.
.
char *msgN;
}some_struct;
some_struct struct1;
some_struct *pstruct1 = &struct1;
I want to keep a pointer or a variable which, when incremented or decremented, gives the next/previous member variable of this structure. I do not want to use an array of char * since the structure is already designed like this.
I tried using the union and structure combination, but I don't know how to write code for that.
Thought iterator may help but this is C.
Any suggestions ?
You can't do that, safely. You can take a chance that the adjacent character pointers are really adjacent (with no padding) as if they were in an array, but you can't be sure so that's pretty much straight into the undefined behavior minefield.
You can abstract it to an index, and do something like:
char * get_pointer(some_struct *p, int index)
{
if(index == 0)
return p->msg1;
if(index == 1)
return p->msg2;
/* and so on */
return NULL;
}
Then you get to work with an index which you can increment/decrement freely, and just call get_pointer() to map it to a message pointer when needed.
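A variation on the same index idea is a table of member offsets built with offsetof, which avoids the chain of if statements and makes no assumption about padding. This is only a sketch: example_struct with three members stands in for the real structure, and member_at is a made-up name.

```c
#include <stddef.h>

/* Stand-in for the structure in the question, with three members. */
typedef struct {
    char *msg1;
    char *msg2;
    char *msg3;
} example_struct;

static const size_t member_offsets[] = {
    offsetof(example_struct, msg1),
    offsetof(example_struct, msg2),
    offsetof(example_struct, msg3),
};

/* Returns the address of the index-th member, or NULL if out of range. */
char **member_at(example_struct *s, size_t index)
{
    if (index >= sizeof member_offsets / sizeof member_offsets[0])
        return NULL;
    return (char **)((char *)s + member_offsets[index]);
}
```

The index can be incremented and decremented freely, and *member_at(&s, i) reads or writes the corresponding message pointer.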
You can do this using strict C, but you need to take certain precautions to ensure compliance with the standard. I will explain these below, but the precautions you need to take are:
(0) Ensure there is no padding by including this declaration:
extern int CompileTimeAssert[
sizeof(some_struct) == NumberOfMembers * sizeof(char *) ? 1 : -1];
(1) Initialize the pointer from the address of the structure, not the address of a member:
char **p = (char **) (char *) &struct1;
(I suspect the above is not necessary, but I would have to insert more reasoning from the C standard.)
(2) Increment the pointer in the following way, instead of using ++ or adding one:
p = (char **) ((char *) p + sizeof(char *));
Here are explanations.
The declaration in (0) acts as a compile-time assertion. If there is no padding in the struct, then the size of the struct equals the number of members multiplied by the size of a member. Then the ternary operator evaluates to 1, the declaration is valid, and the compiler proceeds. If there is padding, the sizes are not equal, the ternary operator evaluates to -1, and the declaration is invalid because an array cannot have a negative size. Then the compiler reports an error and terminates.
Thus, a program containing this declaration will compile only if the struct does not have padding. Additionally, the declaration will not consume any space (it only declares an array that is never defined), and it may be repeated with other expressions (that evaluate to an array size of 1 if their condition is true), so different assertions may be tested with the same array name.
Items (1) and (2) deal with the problem that pointer arithmetic is normally guaranteed to work only within arrays (including a notional sentinel element at the end) (per C 2011 6.5.6 8). However, the C standard makes special guarantees for character types, in C 2011 6.3.2.3 7. A pointer to the struct may be converted to a pointer to a character type, and it will yield a pointer to the lowest addressed byte of the struct. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object.
In (1), we know from C 2011 6.3.2.3 7, that (char *) &struct1 is a pointer to the first byte of struct1. When converted to (char **), it must be a pointer to the first member of struct1 (in particular thanks to C 2011 6.5.9 6, which guarantees that equal pointers point to the same object, even if they have different types).
Finally, (2) works around the fact that array arithmetic is not directly guaranteed to work on our pointer. That is, p++ would be incrementing a pointer that is not strictly in an array, so the arithmetic is not guaranteed by 6.5.6 8. So we convert it to a char *, for which increments are guaranteed to work by 6.3.2.3 7, we advance it by sizeof(char *) bytes (four on an implementation with 4-byte pointers), and we convert it back to char **. This must yield a pointer to the next member, since there is no padding.
One might claim that adding the size of char * (say 4) is not the same as four increments of one char, but certainly the intent of the standard is to allow one to address the bytes of an object in a reasonable way. However, if you want to avoid even this criticism, you can change + sizeof(char *) to be +1+1+1+1 (on implementations where the size is 4) or +1+1+1+1+1+1+1+1 (where it is 8).
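With a C11 compiler, the padding check in (0) can also be written with _Static_assert, which gives a readable error message. The sketch below puts steps (0) to (2) together for a hypothetical three-member struct. three_messages, first_member and next_member are made-up names, not part of the question.

```c
/* Hypothetical stand-in for the structure in the question. */
typedef struct {
    char *msg1;
    char *msg2;
    char *msg3;
} three_messages;

/* (0) Refuse to compile if the struct contains padding. */
_Static_assert(sizeof(three_messages) == 3 * sizeof(char *),
               "three_messages must not contain padding");

/* (1) Start from the address of the structure, not of a member. */
char **first_member(three_messages *s)
{
    return (char **)(char *)s;
}

/* (2) Advance by the size of one member using char * arithmetic. */
char **next_member(char **p)
{
    return (char **)((char *)p + sizeof(char *));
}
```

The caller still has to stop after the last member. Nothing here checks the upper bound.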
Take the address of the first member and store it to char **:
char **first = &struct1.msg1;
char **last = &struct1.msg1 + sizeof(some_struct) / sizeof(char *) - 1;
char **ptr = first; /* *ptr is struct1.msg1 */
++ptr; /* now *ptr is struct1.msg2 */
This assumes that the structure only contains char * members.