I want to know if it is ok to free() a pointer cast to another type.
For instance if I do this:
char *p = malloc (sizeof (int));
int *q = (int *)p;
free (q);
I get no warning on gcc (-Wall).
On Linux, the man page for free says it is illegal to call free on a pointer that was not returned by malloc(), calloc() or realloc(). But what happens if the pointer was cast to another type in between?
I ask this because I read that the C standard does not require different pointer types (e.g. int* and char*) to have the same size, and I fail to understand how this is possible since they both need to be convertible to a void* in order to call the malloc/free functions.
Is the above code legal?
It's probably safe, but it's not absolutely guaranteed to be safe.
On most modern systems, all pointers (at least all object pointers) have the same representation, and converting from one pointer type to another just reinterprets the bits that make up the representation. But the C standard doesn't guarantee this.
char *p = malloc (sizeof (int));
This gives you a char* pointer to sizeof (int) bytes of data (assuming malloc() succeeds.)
int *q = (int *)p;
This converts the char* pointer to an int* pointer. Since int is bigger than char, an int* pointer could require less information to indicate what it points to. For example, on a word-oriented machine, an int* might just point to a word, while a char* has to contain a word pointer and an offset that indicates which byte within the word it points to. (I've actually worked on a system, the Cray T90, that worked like this.) So a conversion from char* to int* can actually lose information.
free (q);
Since free() takes an argument of type void*, the argument q is implicitly converted from int* to void*. There is no guarantee in the language standard that converting a char* pointer to int*, and then converting the result to void*, gives you the same result as converting a char* directly to a void*.
On the other hand, since malloc() always returns a pointer that's correctly aligned to point to any type, even on a system where int* and char* have different representations, it's unlikely to cause problems in this particular case.
So your code is practically certain to work correctly on any system you're likely to be using, and very very likely to work correctly even on exotic systems you've probably never seen.
Still, I advise writing code that you can easily demonstrate is correct, by saving the original pointer value (of type char*) and passing it to free(). If it takes several paragraphs of text to demonstrate that your code is almost certainly safe, simplifying your assumptions is likely to save you effort in the long run. If something else goes wrong in your program (trust me, something will), it's good to have one less possible source of error to worry about.
A bigger potential problem with your code is that you don't check whether malloc() succeeded. You don't do anything that would fail if it doesn't (both the conversion and the free() call are ok with null pointers), but if you refer to the memory you allocated you could be in trouble.
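A minimal sketch of that advice, assuming a hypothetical helper named store_and_read_int (it is not part of the question's code): keep the pointer exactly as malloc() returned it, check it for NULL before touching the memory, and pass that same pointer to free().

```c
#include <stdlib.h>

/* Hypothetical helper: stores value in freshly allocated memory, reads it
   back into *out, and frees the block. Returns 0 on success, -1 if
   malloc() failed. */
int store_and_read_int(int value, int *out)
{
    char *p = malloc(sizeof (int));   /* original pointer, kept for free() */
    if (p == NULL)
        return -1;

    int *q = (int *)p;                /* malloc() memory is suitably aligned for int */
    *q = value;
    *out = *q;

    free(p);                          /* free the pointer malloc() actually returned */
    return 0;
}
```

With this shape there is nothing to argue about: the value given to free() is the one malloc() produced.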
UPDATE:
You asked whether your code is legal; you didn't ask whether it's the best way to do what you're doing.
malloc() returns a void* result, which can be implicitly converted to any pointer-to-object type by an assignment. free() takes a void* argument; any pointer-to-object type argument that you pass to it will be implicitly converted to void*. This round-trip conversion (void* to something_else* to void*) is safe. Unless you're doing some kind of type-punning (interpreting the same chunk of data as two different types), there's no need for any casts.
Rather than:
char *p = malloc (sizeof (int));
int *q = (int *)p;
free (q);
you can just write:
int *p = malloc(sizeof *p);
...
free(p);
Note the use of sizeof *p in the argument to malloc(). This gives you the size of whatever p points to without having to refer to its type explicitly. It avoids the problem of accidentally using the wrong type:
double *oops = malloc(sizeof (int));
which the compiler likely won't warn you about.
Yes, it's legal. free() takes a void pointer (void*), so the type doesn't matter. As long as the pointer passed to it was returned by malloc/realloc/calloc it's valid.
Yes the pointer is not changed, the cast is merely how the compiler interprets the bunch of bits.
edit: The malloc call returns an address in memory, i.e. a 32 (or 64) bit number.
The cast only tells the compiler how to interpret the value stored at that address (is it a float, an integer, a string, etc.) and, when you do arithmetic on the address, how big a unit it should step in.
The code is legal, however it is not necessary. Since pointers only point to the address where data is stored, there is no need to allocate space, or subsequently free it.
I cannot understand why are pointer type casts necessary, as long as pointers point to an address and their type is important only when it comes to pointer arithmetic.
That is to say, if I encounter the next code snippet:
int a = 5; then both
char*b = (char*)&a; and int*c = (int*)&a
point to the very same memory location.
Indeed, when executing char*b = (char*)&a part of the memory contents may be lost, but this is because b has type char*, which can store only sizeof(char) bytes of memory, and this could be done implicitly.
The pointer type is important when you dereference it, since it indicates how many bytes should be read or stored.
It's also important when you perform pointer arithmetic, since this is done in units of the size that the pointer points to.
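Both points can be checked with a small sketch. The helper names below are made up for illustration; the only assumption is that p points into an array, so that p + 1 is a valid pointer.

```c
#include <stddef.h>

/* Hypothetical helper: number of bytes between p and p + 1 for an int pointer. */
size_t int_stride(int *p)
{
    return (size_t)((char *)(p + 1) - (char *)p);
}

/* Hypothetical helper: number of bytes between p and p + 1 for a char pointer. */
size_t char_stride(char *p)
{
    return (size_t)((p + 1) - p);
}

/* Dereferencing through a character pointer reads only the first byte
   of the int object, not the whole int. */
unsigned char first_byte(const int *p)
{
    return *(const unsigned char *)p;
}
```

The address stored in the pointer is the same in each case; only the pointed-to type decides how far + 1 moves and how many bytes a dereference reads.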
The type of the pointer is most important when you dereference the pointer.
The excerpt
char *b = (int*)&a;
is wrong because an int * cannot be assigned to a char * without a cast. You meant char *b = (char *)&a;. The C standard says you need an explicit cast because the types are not compatible. The cast is your way of saying "yeah yeah, I know what I am doing".
The excerpt
int*c = (int*)&a;
is right, but &a is already a pointer to an int, so the cast will do no conversion, therefore you can write it as int *c = &a;.
as long as pointers point to an address and their type is important only when it comes to pointer arithmetic.
No, not only for pointer arithmetic. It's also important for accessing the data the pointer points to; accessing it the wrong way, as in your example, leads to improper results.
when executing char*b = (int*)&a part of the memory contents may be lost, but this is due to the type of b is char* which can store only sizeof(char) bytes of memory,
First of all, that is the wrong thing to do (unless we really need to extract 1 byte out of 4 bytes).
Second, pointer casts are mostly used with void* (void pointers), when passing data to a function that can handle many different types. The best example is qsort.
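To make the qsort case concrete, here is a short sketch. The names compare_ints and sort_ints are illustrative only; qsort() itself is the standard library function from <stdlib.h>.

```c
#include <stdlib.h>

/* Comparator for qsort(): the arguments arrive as const void * and are
   converted back to the real element type before being dereferenced. */
static int compare_ints(const void *lhs, const void *rhs)
{
    int a = *(const int *)lhs;
    int b = *(const int *)rhs;
    return (a > b) - (a < b);   /* avoids the possible overflow of a - b */
}

/* Hypothetical wrapper: sorts n ints in ascending order. */
void sort_ints(int *values, size_t n)
{
    qsort(values, n, sizeof values[0], compare_ints);
}
```

qsort() knows nothing about int; the cast inside the comparator is where the element type comes back in.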
I'm making a logging utility that tracks allocations and deallocations inside a library I made.
My program didn't crash but I'm still skeptical about my approach.
void my_free(struct my_type *heap)
{
    if (!heap)
        logger("Fatal error: %s", "Null pointer parameter");
    // I leave this here on purpose for the application to crash
    // in case heap == NULL
    free(heap->buffer);
    void *address = heap;
    free(heap);
    logger("Heap at %p successfully deallocated", address);
    // since heap->buffer is always allocated/deallocated together with
    // heap I only need to track down the heap address
}
Is there any problem doing it this way?
Can I store the address numerically? Like in an unsigned integer? What is the default type?
The best practice for managing pointers to objects that may be freed is to convert each pointer to uintptr_t while it is still valid (points to an object before it is freed) and to use those uintptr_t values for printing and other purposes.
Per C 2018 6.2.4 2, “The value of a pointer becomes indeterminate when the object it points to (or just past) reaches the end of its lifetime.” A primary reason for this rule is to permit C implementations in which the bytes in the pointer do not contain the direct memory address but contain information that is used to look up address information in various data structures. When an object is freed, although the bytes in the pointer do not change, the data in those structures may change, and then attempting to use the pointer can fail in various ways. Notably, attempting to print the value can fail because the former address is no longer available in the data structures. However, since that rule exists, even less exotic C implementations may take advantage of it for optimization, so a C compiler can implement free(x); printf("%p", (void *) x); equivalently to free(x); printf("%p", (void *) NULL);, for example.
To work around this, you can save a pointer as a uintptr_t value and use that uintptr_t value for further use, including printing:
#include <inttypes.h>
#include <stdint.h>
...
uintptr_t ux = (uintptr_t) (void *) x;
free(x);
…
printf("%" PRIxPTR, ux);
Note that, to be strictly conforming, we first convert the pointer to void * and then to uintptr_t. This is simply because the C standard does not explicitly specify the behavior of converting any pointer to uintptr_t but does specify the behavior of converting a pointer to an object to void * and of converting a void * to uintptr_t.
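Put together as a complete function, the pattern might look like the sketch below. The name free_and_describe is hypothetical, and the sketch assumes the implementation provides uintptr_t, which the standard makes optional.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: frees block and writes its former address, as hex
   text, into out. The address is captured as a uintptr_t before free(),
   so the pointer itself is never used after the call. Returns whatever
   snprintf() returns. */
int free_and_describe(void *block, char *out, size_t out_size)
{
    uintptr_t saved = (uintptr_t)block;   /* block is already a void * */
    free(block);
    return snprintf(out, out_size, "%" PRIxPTR, saved);
}
```

A logging wrapper can then pass the text in out to its logger without touching the freed pointer again.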
Your code is fine.
It doesn't matter whether the pointer is valid [1] or not if all you want to do is print the value of the pointer itself. After all, you can print the value of NULL, which by definition is an invalid pointer value, using %p with no problem.
Obviously, if you try to do something with *heap or heap[i] after the memory has been deallocated, you run into undefined behavior and anything can happen, but you can examine or print out heap (or address) all you want with no issue.
[1] Where "valid" means "pointing to an object within that object's lifetime".
Per the C standard:
The conversion specifiers and their meanings are:
...
p
The argument shall be a pointer to void. The value of the pointer is converted to a sequence of printing characters, in an implementation-defined manner.
So, to be fully conforming to the C standard, you need to cast heap to void *:
logger("Heap at %p successfully deallocated", ( void * ) address);
The fact that you have used free() on the pointer doesn't mean you can't print the pointer's value on most implementations (See comments on other answers for why that may be true) - it just means you can't dereference the pointer.
void *address = &heap; -->> void *address = heap;
In your code you get the address of the local function parameter, not its value.
Because of my limited rating, I cannot yet comment on posts by others, so I am using an "answer" even though this is a comment rather than an answer.
The answer/ comment above by Eric Postpischil needs to be peer reviewed.
I don't agree or understand his reasoning.
" ... When an object is freed, although the bytes in the pointer do not change, the data in those structures may change, and then attempting to use the pointer can fail in various ways. Notably, attempting to print the value can fail because the .."
Note the prototype of the standard C free() call: void free(void *ptr). At the place of invocation, the ptr value will stay the same before and after the free call - it is passed by value.
To "use" the pointer will/may fail in various ways if by use we mean dereference it after the free() call. Printing the value of the pointer does not dereference it.
Also printing the (value of) the pointer with a call to printf using %p shall be fine even if pointer value representation is implementation defined.
"...However, since that rule exists, even less exotic C implementations may take advantage of it for optimization, so a C compiler can implement free(x); printf("%p", (void *) x); equivalently to free(x); printf("%p", (void *) NULL);, for example. ..."
Even if it (the compiler) does so, I cannot easily see what this would optimize: a call to a special minimized printf with a value 0 for the pointer?
Also consider something like:
extern void my_free(void* ptr); //defined somewhere..
my_free( myptr );
printf("%p", myptr );
It would have nothing to optimize as you suggest here, and it(compiler) could deduce nothing about the value of myptr.
So no, I cannot agree with your interpretation of the C standard at this stage.
(my)EDIT:
Unless, with some 'exotic' implementations, a memory pointer is implemented as a struct and is cast to/from the "user's" void*. Then the user's heap value would be invalid or null when trying to print it.
But then every C call that accepts void* as a pointer to a memory would need to differentiate which void* points to heap which doesn't - say memcpy, memset , - and this then gets quite busy ...
So I'm a bit confused on how to make a function that will return a pointer to an array of ints in C. I understand that you cannot do:
int* myFunction() {
int myInt[aDefinedSize];
return myInt; }
because this is returning a pointer to a local variable.
So, I thought about this:
int* myFunction(){
int* myInt = (int) malloc(aDefinedSize * sizeof(int));
return myInt; }
This gives the error: warning cast from pointer to integer of different size
This implies to use this, which works:
int* myFunction(){
int* myInt = (int*) malloc(aDefinedSize * sizeof(int));
return myInt; }
What I'm confused by though is this:
the (int*) before the malloc was explained to me to do this: it tells the compiler what the datatype of the memory being allocated is. This is then used when, for example, you are stepping through the array and the compiler needs to know how many bytes to increment by.
So, if this explanation I was given is correct, isn't memory being allocated for aDefinedSize number of pointers to ints, not actually ints? Thus, isn't myInt a pointer to an array of pointers to ints?
Some help in understanding this would be wonderful. Thanks!!
So, if this explanation I was given is correct, isn't memory being allocated for aDefinedSize number of pointers to ints, not actually ints?
No, you asked malloc for aDefinedSize * sizeof(int) bytes, not aDefinedSize * sizeof(int *) bytes. That's the size of memory you get; the type depends on the pointer used to access the memory.
Thus, isn't myInt a pointer to an array of pointers to ints?
No, since you defined it as a int *, a pointer-to-an-int.
Of course the pointer has no knowledge of how large the allocated memory area is, but only points at the first int that fits there. It's up to you as programmer to keep track of the size.
Note that you shouldn't use that explicit typecast. malloc returns a void *, that can be silently assigned to any pointer, as in here:
int* myInt = malloc(aDefinedSize * sizeof(int));
Arithmetic on the pointer works in strides of the pointed-to type, i.e. with int *p, p[3] is the same as *(p+3), which means roughly "go to p, go forward three times sizeof(int) in bytes, and access that location".
int **q would be a pointer-to-a-pointer-to-an-int, and might point to an array of pointers.
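As a rough illustration of the stride rule, the sketch below fills a malloc'ed block through *(p + i). The function name make_multiples_of_ten is invented for this example.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates n ints and fills them with 0, 10, 20, ...
   Returns NULL if n is 0 or the allocation fails; the caller must free()
   the result. */
int *make_multiples_of_ten(size_t n)
{
    if (n == 0)
        return NULL;

    int *p = malloc(n * sizeof *p);   /* n ints, not n pointers */
    if (p == NULL)
        return NULL;

    for (size_t i = 0; i < n; i++)
        *(p + i) = (int)i * 10;       /* same as p[i] = (int)i * 10 */
    return p;
}
```

After a successful call with n = 4, p[3] and *(p + 3) name the same int, located 3 * sizeof(int) bytes past the start of the block.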
malloc allocates an array of bytes and returns a void* pointing to the first byte, or NULL if the allocation failed.
To treat this array as an array of a different data type, the pointer must be converted to a pointer to that data type.
In C, void* converts implicitly to any object pointer type, so no explicit cast is required:
int* allocateIntArray(unsigned number_of_elements) {
int* int_array = malloc(number_of_elements * sizeof(int)); // <--- no cast is required here.
return int_array;
}
Arrays in C
In C, you want to remember that an array is just an address in memory, plus a length and an object type. When you pass it as an argument to a function or a return value from a function, the length gets forgotten and it’s treated interchangeably with the address of the first element. This has led to a lot of security bugs in programs that either read or write past the end of a buffer.
The name of an array automatically converts to the address of its first element in most contexts, so you can for example pass either arrays or pointers to memmove(), but there are a few exceptions where the fact it also has a length matters. The sizeof() operator on an array is the number of bytes in the array, but sizeof() a pointer is the size of a pointer variable. So if we declare int a[SIZE];, sizeof(a) is the same as sizeof(int)*(size_t)(SIZE), whereas sizeof(&a[0]) is the same as sizeof(int*). Another important one is that the compiler can often tell at compile time if an array access is out of bounds, whereas it does not know which accesses to a pointer are safe.
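The sizeof difference is easy to see in a small sketch. SIZE and both function names are placeholders chosen for this example.

```c
#include <stddef.h>

#define SIZE 8

/* A parameter that receives an array is really a pointer, so sizeof
   reports the size of a pointer, not of the caller's array. */
size_t size_seen_by_callee(int *a)
{
    return sizeof a;
}

/* Here the array is still an array, so sizeof reports its full size. */
size_t size_of_real_array(void)
{
    int a[SIZE];
    return sizeof a;
}
```

This is why functions that take a pointer to an array's first element usually also take an explicit length argument.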
How to Return an Array
If you want to return a pointer to the same, static array, and it’s fine that you’ll get the same array each time you call the function, you can do this:
#define ARRAY_SIZE 32U
int* get_static_array(void)
{
static int the_array[ARRAY_SIZE];
return the_array;
}
You must not call free() on a static array.
If you want to create a dynamic array, you can do something like this, although it is a contrived example:
#include <stdlib.h>
int* make_dynamic_array(size_t n)
// Returns an array that you must free with free().
{
return calloc( n, sizeof(int) );
}
The dynamic array must be freed with free() when you no longer need it, or the program will leak memory.
Practical Advice
For anything that simple, you would actually write:
int * const p = calloc( n, sizeof(int) );
Unless for some reason the array pointer would change, such as:
int* p = calloc( n, sizeof(int) );
/* ... */
p = realloc( p, new_size );
I would recommend calloc() over malloc() as a general rule, because it initializes the block of memory to zeroes, and malloc() leaves the contents unspecified. That means, if you have a bug where you read uninitialized memory, using calloc() will always give you predictable, reproducible results, and using malloc() could give you different undefined behavior each time. In particular, if you allocate a pointer and then dereference it on an implementation where 0 is a trap value for pointers (like typical desktop CPUs), a pointer created by calloc() will always give you a segfault immediately, while a garbage pointer created by malloc() might appear to work, but corrupt any part of memory. That kind of bug is a lot harder to track down. It’s also easier to see in the debugger that memory is or is not zeroed out than whether an arbitrary value is valid or garbage.
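A small sketch of the predictable-contents point, using two made-up helpers. It relies on the fact that all-bits-zero represents the value 0 for integer types; the same is not guaranteed to be a null pointer on every implementation.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates n ints with calloc(), so every element
   starts out as 0. Returns NULL on failure; the caller must free() it. */
int *make_zeroed_ints(size_t n)
{
    return calloc(n, sizeof(int));
}

/* Hypothetical helper: returns 1 if all n ints in a are zero, else 0. */
int all_zero(const int *a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (a[i] != 0)
            return 0;
    return 1;
}
```

Running the same check on a block from malloc() would read indeterminate values, which is exactly the unpredictability described above.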
Further Discussion
In the comments, one person objects to some of the terminology I used. In particular, C++ offers a few different kinds of ways to return a reference to an array that preserve more information about its type, for example:
#include <array>
#include <cstdlib>
using std::size_t;
constexpr size_t size = 16U;
using int_array = int[size];
int_array& get_static_array()
{
static int the_array[size];
return the_array;
}
std::array<int, size>& get_static_std_array()
{
static std::array<int, size> the_array;
return the_array;
}
So, one commenter (if I understand correctly) objects that the phrase “return an array” should only refer to this kind of function. I use the phrase more broadly than that, but I hope that clarifies what happens when you return the_array; in C. You get back a pointer. The relevance to you is that you lose the information about the size of the array, which makes it very easy to write security bugs in C that read or write past the block of memory allocated for an array.
There was also some kind of objection that I shouldn’t have told you that using calloc() instead of malloc() to dynamically allocate structures and arrays that contain pointers will make almost all modern CPUs segfault if you dereference those pointers before you initialize them. For the record: this is not true of absolutely all CPUs, so it’s not portable behavior. Some CPUs will not trap. Some old mainframes will trap on a special pointer value other than zero. However, it’s come in very handy when I’ve coded on a desktop or workstation. Even if you’re running on one of the exceptions, at least your pointers will have the same value each time, which should make the bug more reproducible, and when you debug and look at the pointer, it will be immediately obvious that it’s zero, whereas it will not be immediately obvious that a pointer is garbage.
I am not used to pointers because I started learning Pascal in high school and now I am moving on to C. Could you explain what I should think when I see something like this: [*(char*)p1]? Don't be shy about writing quite a few lines :)
Thank you.
P.S. p1 is a const void *. To be more accurate.
Assuming that [*(char*)p1] is an array designator, (char*) is used to cast p1 to make p1 char * type. Then * is used to dereference it to use value at the address (p1 points to) as index to some array.
void *p1;// pointer to void or generic pointer; might be used when you want to be flexible about the data type
(char*)p1; //typecast to a char pointer; you address the memory location pointed to by p1 as char
*(char*)p1; //the value at the location pointed to by p1.
Hope this helps.
[*(char*)p1] is somewhat incomplete; there needs to be a variable name in front of it for the array subscripting to make sense, for example foo[*(char*)p1].
In that case, it means:
convert p1 to pointer-to-char
dereference this pointer (giving a char value)
use this value as index to look up in an array
Note that using a char as index will make most compilers unhappy and cause them to emit a warning. That is because most often when a char is used as an index, it happens by error, not by intent, and also because it is implementation-defined whether char is signed or unsigned (so it is inherently non-portable, and you may end up indexing out of bounds by accident, if you assume the wrong one).
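One common way around the signedness problem is to read the byte as unsigned char before using it as an index. The sketch below assumes a hypothetical lookup table with UCHAR_MAX + 1 entries; lookup_by_char is not from the original snippet.

```c
#include <limits.h>

/* Hypothetical lookup: returns table[c], where c is the character that
   p1 points to. table must have UCHAR_MAX + 1 elements. */
int lookup_by_char(const int table[], const void *p1)
{
    /* Going through unsigned char keeps the index in 0..UCHAR_MAX even
       on implementations where plain char is signed. */
    unsigned char index = *(const unsigned char *)p1;
    return table[index];
}
```

The shape is the same as foo[*(char*)p1]: convert the void pointer, dereference it, and use the result as a subscript.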
(char*)p1 is a typecast. This means, for this statement, we're treating p1 as a pointer to a character.
*(char*)p1 dereferences p1, interpreting it as a char type due to the typecast (char*). A pointer points to a memory location, and dereferencing that pointer returns the value at the location in memory p1 points to.
[*(char*)p1] ... Array access? is this a snippet from something larger?
This void* to someOtherType* conversion is common in C code, because malloc, used to allocate memory dynamically, returns void*.
Here is a little snippet of code from Wikipedia's article on malloc():
int *ptr;
ptr = malloc(10 * sizeof (*ptr)); // Without a cast
ptr = (int*)malloc(10 * sizeof (int)); // With a cast
I was wondering if someone could help me understand what is going on here. So, from what I know, it seems like this is what's happening:
1) initialize an integer pointer that points to NULL. It is a pointer so its size is 4-bytes. Dereferencing this pointer will return the value NULL.
2) Since C allows for this type of automatic casting, it is safe not to include a cast-to-int-pointer. I am having trouble deciphering what exactly is being fed into the malloc function though (and why). It seems like we are getting the size of the dereferenced value of ptr. But isn't this NULL? So the size of NULL is 0, right? And why are we multiplying by 10??
3) The last line is just the same thing as above, except that a cast is explicitly declared. (cast from void pointer to int pointer).
I'm assuming we're talking about C here. The answer is different for C++.
1) is entirely off. ptr is a pointer to an int, that's all. It's uninitialized, so its value is indeterminate. Dereferencing it is undefined behaviour -- you will most certainly not get 0 out! The pointer also will most likely not point to 0. The size of ptr is sizeof(ptr), or sizeof(int*); nothing else. (At best you know that this is no larger than sizeof(void*).)
2/3) In C, never cast the result of malloc: int * p = malloc(sizeof(int) * 10);. The code allocates enough memory for 10 integers, i.e. 10 times the size of a single integer; the return value of the call is a pointer to that memory.
The first line declares a pointer to an integer, but doesn't initialize it -- so it points at some random piece of memory, probably invalid. The size of ptr is whatever size pointers to int are, likely either 4 or 8 bytes. The size of what it points at, which you'd get by dereferencing it when it points somewhere valid, is whatever size an int has.
The second line allocates enough memory for 10 ints from the heap, then assigns it to ptr. No cast is used, but the void * returned by malloc() is automatically converted to whatever type of pointer is needed when assigned. The sizeof (*ptr) gives the size of the dereferenced ptr, i.e. the size of what ptr points to (an int). For sizeof, it doesn't matter whether ptr actually points to a valid memory, just what the type would be.
The third line is just like the second, but with two changes: It explicitly casts the void * return from malloc() to an int *, to match the type of ptr; and it uses sizeof with the type name int rather than an expression of that type, like *ptr. The explicit cast is not necessary, and some people strongly oppose its use, but in the end it comes down to preference.
After either of the malloc()s ptr should point to a valid location on the heap and can be dereferenced safely, as long as malloc was successful.
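The sizeof point can be sketched as follows; alloc_ints is a name made up for this example. Because the operand of sizeof is not evaluated here, ptr is never dereferenced even though it has no valid value yet.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates room for count ints. sizeof *ptr is
   resolved from the type of ptr at compile time, so nothing is read
   through the uninitialized pointer. */
int *alloc_ints(size_t count)
{
    int *ptr;
    ptr = malloc(count * sizeof *ptr);   /* may be NULL on failure */
    return ptr;
}
```

The caller should still check the result for NULL before using it, and free() it when done.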
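For line 2, malloc() is allocating enough memory to hold 10 ints, because sizeof (*ptr) is the size of an int, not of a pointer.
malloc() is a general-purpose function that returns void*, so its result is converted to whatever pointer type you actually want to use, in the above example pointer to int. In C that conversion happens implicitly on assignment.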