Why are pointer type casts necessary? - c

I cannot understand why pointer type casts are necessary, given that pointers just point to an address and their type is important only when it comes to pointer arithmetic.
That is to say, if I encounter the following code snippet:
int a = 5; then both
char*b = (char*)&a; and int*c = (int*)&a
point to the very same memory location.
Indeed, when executing char*b = (char*)&a part of the memory contents may be lost, but this is because b has type char*, which can access only sizeof(char) bytes of memory, and this could be done implicitly.

The pointer type is important when you dereference it, since it indicates how many bytes should be read or stored.
It's also important when you perform pointer arithmetic, since this is done in units of the size that the pointer points to.
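As a rough illustration of the arithmetic point, the sketch below measures how far "p + 1" moves for the same address held as an int * and as a char *. The function names int_ptr_step and char_ptr_step are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes that "p + 1" advances when p has type int *. */
size_t int_ptr_step(const int *p)
{
    assert(p != NULL);
    return (size_t)((const char *)(p + 1) - (const char *)p);
}

/* Bytes that "p + 1" advances when the same address is held in a char *. */
size_t char_ptr_step(const int *p)
{
    assert(p != NULL);
    const char *bytes = (const char *)p;
    return (size_t)((bytes + 1) - bytes);
}
```

Given int a = 5;, int_ptr_step(&a) yields sizeof(int) while char_ptr_step(&a) yields 1, even though both start from the same address.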

The type of the pointer is most important when you dereference the pointer.
The excerpt
char *b = (int*)&a;
is wrong because int * cannot be assigned to char *. You meant char *b = (char *)&a;. The C standard says you need an explicit cast because the types are not compatible. This is the "yeah yeah, I know what I am doing".
The excerpt
int*c = (int*)&a;
is right, but &a is already a pointer to an int, so the cast will do no conversion, therefore you can write it as int *c = &a;.

as long as pointers point to an address and their type is important only when it comes to pointer arithmetic.
No, not only for pointer arithmetic. It's also important for accessing the data the pointer points to; accessing it the wrong way, as in your example, leads to incorrect results.
when executing char*b = (int*)&a part of the memory contents may be lost, but this is due to the type of b is char* which can store only sizeof(char) bytes of memory,
First of all, that is the wrong thing to do (unless we really need to extract 1 byte out of 4 bytes).
Second, pointer casts are mostly used with void* (void pointers) when passing data to a function that can handle many different types; the best example is qsort.
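To make the qsort remark concrete, here is a minimal sketch. The names compare_ints and sort_ints are invented for illustration; only qsort itself comes from the standard library.

```c
#include <assert.h>
#include <stdlib.h>

/* qsort hands each element over as const void *; convert back to the real type. */
static int compare_ints(const void *lhs, const void *rhs)
{
    const int *a = lhs;   /* void * converts implicitly in C, no cast required */
    const int *b = rhs;
    return (*a > *b) - (*a < *b);   /* avoids the overflow risk of *a - *b */
}

/* Sorts count ints in ascending order. */
void sort_ints(int *values, size_t count)
{
    assert(values != NULL || count == 0);
    if (count > 0)
        qsort(values, count, sizeof values[0], compare_ints);
}
```

qsort knows nothing about int; it only moves blocks of sizeof values[0] bytes and leaves the interpretation to the comparison function.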

Related

c fixed memory pool implementation

I'm trying to implement a fixed memory pool (first-fit free list)
my struct is:
struct mempool {
void* data;
int offset;
};
data is divided into 8-byte blocks: 4 bytes pointing to the next offset and 4 bytes of data. offset points to the first free block. I'm trying to understand why accessing the first block is done by:
int* address = (int*)((int)&pool->data + pool->offset);
especially the (int)&pool->data part. Isn't pool->data already a pointer? Why do I need its address to perform arithmetic and shift to the offset?
I'm trying to understand why accessing the first block is done by:
int* address = (int*)((int)&pool->data + pool->offset);
especially the (int)&pool->data part. Isn't pool->data already a
pointer?
Yes, pool->data is a pointer. And one can obtain the address of a pointer, so there's nothing inherently wrong with that. The result in this case has type void **.
Moreover, given that data is the first member of struct mempool, pool itself points to the same address (pool is evidently a pointer to the structure, since it is used with ->). Although it has a different type (struct mempool *), that's probably mooted by the fact that the code performs a conversion to type int.
why do I need its address to perform arithmetic and shift to
the offset?
The effect is to compute an address relative to the location of the data pointer itself, rather than relative to its target. Furthermore, the cast to type int suggests that the offset is measured in bytes. That aspect of it is a bit unsafe, however, because it is not guaranteed that type int is large enough to support round-trip conversion from pointer to int to pointer.
This all seems consistent with your characterization of the pool having metadata adjacent to the data, but it remains unclear what purpose the data pointers serve.
Overall, I'm not convinced of the correctness or efficacy of the minimal code presented. If it does in fact serve the purpose for which it is intended, relative to the structure definition presented, then this variation should do so better and more clearly:
int* address = (int*)((char *)pool + pool->offset);
That avoids the question of an integer type that can represent pointers (although there is one in the form of intptr_t). The cast to char * accounts for the fact that pointer arithmetic is performed in units the size of the pointed-to type, and the offset seems to be expressed in one-byte units.
Your code does not seem correct. You are adding pool->offset to the address of the pool->data field rather than to the address stored in the pool->data field. I would suggest fixing it like this:
int* address = (int *)pool->data + pool->offset;
in case your offset is in 4-byte chunks, or like this:
int* address = (int *)((char *)pool->data + pool->offset);
in case your offset is in bytes.
pool->data + pool->offset wouldn't be possible because you can't do pointer arithmetic on void pointers; that isn't valid C. Pointer arithmetic also assumes that the pointed-to memory is an array.
&pool->data gives the address of the pointer itself, which happens to be the address of the struct. Its type is void**. Arithmetic on that would step in units of sizeof(void*) rather than bytes, so it is of no use here either.
Therefore the naive, bad solution here is to cast the pointer to an int and then do simple addition. That isn't reliable either, because int is not guaranteed to be able to hold the contents of a pointer. uintptr_t should have been used instead of int.
And finally, accessing that chunk of memory through an int* and then dereferencing it is only possible if what's stored there is already regarded as type int by the compiler. If not, it invokes undefined behavior (see "What is the strict aliasing rule?").
Summary: this is quite questionable code and there are many better ways to implement it.
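One possible shape for the byte-offset variant suggested above, as a sketch only. It assumes offset counts bytes from the memory that pool->data points to, which the question does not confirm; block_at_offset is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

struct mempool {
    void *data;   /* start of the pool's storage */
    int offset;   /* ASSUMED: byte offset of the first free block */
};

/* Address of the block at the pool's current offset. Arithmetic is done on
   char * so that the offset is applied in bytes; void * cannot be used here. */
void *block_at_offset(const struct mempool *pool)
{
    assert(pool != NULL && pool->data != NULL && pool->offset >= 0);
    return (char *)pool->data + pool->offset;
}
```

Returning void * leaves it to the caller to decide what type actually lives in the block, which sidesteps the aliasing concern until the memory is really accessed.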

Why are C pointers recommended to be the type of the data they're pointing to, if they're 8 bytes large regardless?

So I'm learning about C pointers, and I'm a little confused. Pointers just point to a specific memory address.
sizeof(char*), sizeof(int*), sizeof(double*) all output 8. So they all take 8 bytes to store a memory address.
However, if I try to compile something like this:
int main(void)
{
char letter = 'A';
int *a = &letter;
printf("letter: %c\n", *a);
}
I get a warning from the compiler (gcc):
warning: initialization from incompatible pointer type [-Wincompatible-pointer-types]
int *a = &letter;
However, char *a = &letter;, doesn't result in a warning.
Why does the type of the pointer matter, if it's 8 bytes long anyway? Why does declaring a pointer with a type different from the type of data it's pointing to result in a warning?
The issue isn't about the size of the pointer - it's about the type of the pointee.
If you have a pointer to an int, the pointer takes up some number of bytes (seems like you have a 64-bit system, where that pointer takes up eight bytes). However, if you dereference that pointer to read or write what it points to, because the type of the pointer is int*, the read or write will try to manipulate sizeof(int) bytes at the target, and it will try to manipulate them as though they're an int.
If you have a single object of type char, which by definition has size 1, and you try to read or write it through a pointer of type int *, whose target (on many systems) has size 4, then reading through the pointer will pull back some garbage data along with the char, and writing through the pointer will clobber the memory around that char with unrelated values.
Additionally, C has a rule called the strict aliasing rule that says that you are not allowed to read or write through a pointer of a type that doesn't match the type of what's being pointed at (unless the pointer is of type char *, signed char*, or unsigned char*). Breaking strict aliasing can mess up all sorts of compiler optimizations and lead to code that doesn't behave as expected.
So in short, the size of the pointer really isn't the issue here. It's the semantics about what happens when you try to read or write what's being pointed at.
Think about what you're going to do with the pointer.
int n = 42;
char *p = &n; // BAD
If this compiles (a compiler can reject it outright rather than printing a non-fatal warning), you have a pointer that points to the memory occupied by the int object n. How are you going to get the value of that object? *p gives you a char result, most likely the first byte of n -- which may be the high-order byte or the low-order byte.
Pointer types depend on the type of object they point to so that you can access that object.
(Also, don't make assumptions based on the behavior of your particular implementation. 32-bit systems have 4-byte pointers, and the language doesn't guarantee that all pointers are the same size.)
It's not about the length in bytes but about the type of what you're pointing to. char and int are completely different types. Moreover, sizeof(char) is 1, while sizeof(int *) is 8 on your system.
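For cases where the bytes of an object really do need to be examined or reinterpreted, a cautious sketch of two approaches that stay within the aliasing rules: reading through unsigned char *, and copying with memcpy. The helper names byte_of_int and int_from_bytes are made up for this example.

```c
#include <assert.h>
#include <string.h>

/* Reading an object's bytes through unsigned char * is always permitted. */
unsigned char byte_of_int(const int *value, size_t index)
{
    assert(value != NULL && index < sizeof *value);
    const unsigned char *bytes = (const unsigned char *)value;
    return bytes[index];
}

/* Going the other way, memcpy builds an int from raw bytes without ever
   dereferencing a pointer of the wrong type. bytes must hold sizeof(int) bytes. */
int int_from_bytes(const unsigned char *bytes)
{
    assert(bytes != NULL);
    int result;
    memcpy(&result, bytes, sizeof result);
    return result;
}
```

Which byte byte_of_int(&n, 0) returns depends on the platform's byte order, so the result should not be assumed to be the low-order byte.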

What does this do exactly? [*(char*)p1]

I am not used to pointers because I started learning Pascal in high school and now I am upgrading myself to C. Could you explain what I should think when I see something like this: [*(char*)p1]? Don't be shy about writing me quite a few lines :)
Thank you.
P.S. p1 is a const void *. To be more accurate.
Assuming that [*(char*)p1] is an array designator, (char*) is used to cast p1 to make p1 char * type. Then * is used to dereference it to use value at the address (p1 points to) as index to some array.
void *p1; // pointer to void, or generic pointer; might be used when you want to be flexible about the data type
(char*)p1; // typecast to a char pointer; you address the memory location pointed to by p1 as char
*(char*)p1; // the value at the location pointed to by p1
Hope this helps.
[*(char*)p1] is somewhat incomplete; there needs to be a variable name in front of it for the array subscript to make sense, for example foo[*(char*)p1].
In that case, it means:
convert p1 to pointer-to-char
dereference this pointer (giving a char value)
use this value as index to look up in an array
Note that using a char as index will make most compilers unhappy and cause it to emit a warning. That is because most often when a char is used as an index, it happens by error, not by intent, and also because it is implementation-defined whether char is signed or unsigned (so it is inherently non-portable, and you may end up indexing out of bounds by accident, if you assume the wrong one).
(char*)p1 is a typecast. This means, for this statement, we're treating p1 as a pointer to a character.
*(char*)p1 dereferences p1, interpreting it as a char type due to the typecast (char*). A pointer points to a memory location, and dereferencing that pointer returns the value at the location in memory p1 points to.
[*(char*)p1] ... Array access? is this a snippet from something larger?
This void* to someOtherType* conversion is common in C code, because malloc, used to allocate memory dynamically, returns void*.
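A small sketch of the convert, dereference, index pattern described above. It uses unsigned char rather than plain char so that the index can never be negative; count_byte and the counts table are hypothetical and not taken from the question.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Increments the counter for the byte that p1 points to.
   UCHAR_MAX + 1 slots cover every possible unsigned char index. */
void count_byte(unsigned counts[UCHAR_MAX + 1], const void *p1)
{
    assert(counts != NULL && p1 != NULL);
    /* Convert p1 to a byte pointer, dereference it, use the value as index. */
    counts[*(const unsigned char *)p1]++;
}
```

With char c = 'A'; and a zero-initialized table, two calls to count_byte(counts, &c) leave counts[(unsigned char)'A'] equal to 2.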

free a cast pointer

I want to know if it is ok to free() a pointer cast to another type.
For instance if I do this:
char *p = malloc (sizeof (int));
int *q = (int *)p;
free (q);
I get no warning on gcc (-Wall).
On Linux, the man page for free says it is illegal to call free on a pointer that was not returned by malloc(), calloc() or realloc(). But what happens if the pointer was cast to another type in between?
I ask this because I read that the C standard does not require different pointer types (e.g. int* and char*) to have the same size, and I fail to understand how this is possible since they both need to be convertible to a void* in order to call the malloc/free functions.
Is the above code legal?
It's probably safe, but it's not absolutely guaranteed to be safe.
On most modern systems, all pointers (at least all object pointers) have the same representation, and converting from one pointer type to another just reinterprets the bits that make up the representation. But the C standard doesn't guarantee this.
char *p = malloc (sizeof (int));
This gives you a char* pointer to sizeof (int) bytes of data (assuming malloc() succeeds.)
int *q = (int *)p;
This converts the char* pointer to an int* pointer. Since int is bigger than char, an int* pointer could require less information to indicate what it points to. For example, on a word-oriented machine, an int* might just point to a word, while a char* has to contain a word pointer and an offset that indicates which byte within the word it points to. (I've actually worked on a system, the Cray T90, that worked like this.) So a conversion from char* to int* can actually lose information.
free (q);
Since free() takes an argument of type void*, the argument q is implicitly converted from int* to void*. There is no guarantee in the language standard that converting a char* pointer to int*, and then converting the result to void*, gives you the same result as converting a char* directly to a void*.
On the other hand, since malloc() always returns a pointer that's correctly aligned to point to any type, even on a system where int* and char* have different representations, it's unlikely to cause problems in this particular case.
So your code is practically certain to work correctly on any system you're likely to be using, and very very likely to work correctly even on exotic systems you've probably never seen.
Still, I advise writing code that you can easily demonstrate is correct, by saving the original pointer value (of type char*) and passing it to free(). If it takes several paragraphs of text to demonstrate that your code is almost certainly safe, simplifying your assumptions is likely to save you effort in the long run. If something else goes wrong in your program (trust me, something will), it's good to have one less possible source of error to worry about.
A bigger potential problem with your code is that you don't check whether malloc() succeeded. You don't do anything that would fail if it doesn't (both the conversion and the free() call are ok with null pointers), but if you refer to the memory you allocated you could be in trouble.
UPDATE:
You asked whether your code is legal; you didn't ask whether it's the best way to do what you're doing.
malloc() returns a void* result, which can be implicitly converted to any pointer-to-object type by an assignment. free() takes a void* argument; any pointer-to-object type argument that you pass to it will be implicitly converted to void*. This round-trip conversion (void* to something_else* to void*) is safe. Unless you're doing some kind of type-punning (interpreting the same chunk of data as two different types), there's no need for any casts.
Rather than:
char *p = malloc (sizeof (int));
int *q = (int *)p;
free (q);
you can just write:
int *p = malloc(sizeof *p);
...
free(p);
Note the use of sizeof *p in the argument to malloc(). This gives you the size of whatever p points to without having to refer to its type explicitly. It avoids the problem of accidentally using the wrong type:
double *oops = malloc(sizeof (int));
which the compiler likely won't warn you about.
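Put together, the cast-free idiom might look like the sketch below. make_int and take_int are illustrative names only, and the NULL check reflects the earlier point that malloc() can fail.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocates one int; sizeof *p follows p's type, and no cast is needed. */
int *make_int(int initial)
{
    int *p = malloc(sizeof *p);
    if (p != NULL)          /* malloc can fail, so check before writing */
        *p = initial;
    return p;
}

/* Reads the stored value, then frees it; int * converts to void * implicitly. */
int take_int(int *p)
{
    assert(p != NULL);
    int value = *p;
    free(p);
    return value;
}
```

If the type of p later changes to, say, double *, the sizeof *p expression follows along without any other edit.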
Yes, it's legal. free() takes a void pointer (void*), so the type doesn't matter. As long as the pointer passed to it was returned by malloc/realloc/calloc, it's valid.
Yes the pointer is not changed, the cast is merely how the compiler interprets the bunch of bits.
edit: The malloc call returns an address in memory, i.e. a 32-bit (or 64-bit) number.
The cast only tells the compiler how to interpret the value stored at that address (is it a float, an integer, a string, etc.) and, when you do arithmetic on the address, how big a unit it should step in.
The code is legal, but as written it is not necessary: nothing is ever stored through the pointer, so the snippet has no need to allocate space or subsequently free it.

Another C pointer Question

The following code :
int *a;
*a = 5;
will most likely result in a segmentation fault and I know why.
The following code :
int a;
*a = 5;
won't even compile.
(gcc says : invalid type argument of unary *).
Now, a pointer is simply an integer, which is used
for storing an address.
So, why should it be a problem if I say :
*a = 5;
Ideally, this should also result in a segmentation fault.
A pointer is not an integer. C has data types to
a) prevent certain programming errors, and
b) improve portability of programs
On some systems, pointers may not be integers, because they really consist of two integers (segment and offset). On other systems, the "int" type cannot be used to represent pointers because an int is 32 bits and a pointer is 64 bits. For these reasons, C disallows using ints directly as pointers. If you want to use an integral type that is large enough to hold a pointer, use intptr_t.
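A sketch of the integer round trip, assuming the implementation provides the optional type uintptr_t (the unsigned counterpart of intptr_t). round_trip is a made-up name. The conversion goes through void * because that is the round trip the standard describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Converts a pointer to an integer and back again.
   ASSUMPTION: uintptr_t exists on this implementation (it is optional). */
int *round_trip(int *p)
{
    assert(p != NULL);
    uintptr_t bits = (uintptr_t)(void *)p;   /* pointer -> integer */
    return (int *)(void *)bits;              /* integer -> pointer */
}
```

Replacing uintptr_t with int in this sketch is exactly the mistake described above: on a system with 64-bit pointers and 32-bit int, the upper half of the address would be lost.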
When you say
int a;
*a = 5;
you are trying to make the compiler dereference something that is not a pointer. Sure, you could cast it to a pointer and then dereference it, like so,
*((int*)a) = 5;
.. and that tells the compiler that you really, really want to do that. BUT -- It's kind of a risky thing to do. Why? Well, in your example, for instance, you never actually initialized the value of a, so when you use it as a pointer, you are going to have whatever value is already at the location being used for a. Since it looks like it is a local variable, that will be an un-init'd location in the function's stack frame, and could be anything. In essence, you would be trying to write the value 5 to some undetermined location; not really a wise thing to do!
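For contrast, a minimal sketch of the well-defined version, where the pointer is given a real int to point at before it is dereferenced. store_five is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Writes 5 through the pointer; the caller must supply a valid target. */
void store_five(int *target)
{
    assert(target != NULL);
    *target = 5;
}
```

Called as int a; store_five(&a);, the write lands in a itself rather than at some undetermined location.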
It's said to illustrate that pointers merely store addresses, and that addresses may be thought of as numbers, much like integers. But usually addresses have a structure (like page number, offset within page, etc.).
You should not take that literally. An integer literally stores a number, which you can add, subtract, etc., but which you cannot use as a pointer. An integer is an integer, and a pointer is a pointer. They serve different purposes.
Sometimes, a cast from a pointer to an integer may be necessary (for whatever purpose, maybe in an OS kernel to do some address arithmetic). Then you may cast the pointer to such an integer type, having first figured out whether your compiler guarantees correct sizes and preserves values. But if you want to dereference, you have to cast back to a pointer type.
You never actually assign "a" in the first case.
int* a;  // never assigned, so it points nowhere in particular
*a = 5;  // BAD. What is 'a' exactly?
int a;   // never assigned either, but some int anyway
*a = 5;  // 'a' is not a pointer!
If you wish to use the integer as a pointer, you'll have to cast it first. Pointers may be integers, but conceptually they serve different purposes.
The operator * is a unary operator which is not defined for the integer data type. That's why the statement
*a = 5;
won't compile.
Also, an integer and a pointer are not the same thing. They are often the same size in memory (4 bytes on typical 32-bit systems), but that is not guaranteed; on most 64-bit systems an int is 4 bytes and a pointer is 8.
int* a — is a pointer to int. It points nowhere, you haven't initialized it. Please, read any book about C before asking such questions.
