Does realloc mutate its arguments - c

Does realloc mutate its first argument?
Is mutating the first argument dependent on the implementation?
Is there a reason it should not be const? As a counter example memcpy makes its src argument const.
The ISO C standard, section 7.20.3 Memory management functions, does not specify this. Neither does the Linux man page for realloc.
#include <stdio.h>
#include <stdlib.h>
int main(void) {
    int* list = NULL;
    void* mem;
    mem = realloc(list, 64);
    printf("Address of `list`: %p\n", (void*)list);
    list = mem;
    printf("Address of `list`: %p\n", (void*)list);
    mem = realloc(list, 0);
    printf("Address of `list`: %p\n", (void*)list);
    // free(list); // Double free
    list = mem;
    printf("Address of `list`: %p\n", (void*)list);
}
When I run the above code on my Debian laptop:
The first printf is null.
The second printf has an address.
The third printf has the same address as the second.
In accordance with the spec, trying to free the address results in a double free error.
The fourth printf is null.

The function does not change the original pointer because it works on a copy of the pointer. That is, the pointer is passed by value, not by reference.
Consider the following program:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    int *p = malloc( sizeof( int ) );
    *p = 10;
    printf( "Before p = %p\n", ( void * )p );
    char *q = realloc( p, 2 * sizeof( int ) );
    printf( "After p = %p\n", ( void * )p );
    free( q );
    return 0;
}
Its output is
Before p = 0x5644bcfde260
After p = 0x5644bcfde260
As you see the pointer p was not changed.
However the new pointer q can have the same value as pointer p had before the call of realloc.
From the C Standard (7.22.3.5 The realloc function)
4 The realloc function returns a pointer to the new object (which
may have the same value as a pointer to the old object), or a null
pointer if the new object could not be allocated.
Of course, if you write
p = realloc( p, 2 * sizeof( int ) );
instead of
char *q = realloc( p, 2 * sizeof( int ) );
then in general the new value of the pointer p can differ from its old value (though, according to the quote, it can also be the same). In particular, if the function is unable to reallocate the memory, it returns a null pointer. In that case a memory leak occurs, provided the initial value of p was not NULL, because the address of the previously allocated memory is lost.
The old memory is not deallocated if a new memory extent cannot be
allocated, because the function needs to copy the old content to the
new extent of memory.
From the C Standard (7.22.3.5 The realloc function)
If memory for the new object cannot be allocated, the old object is
not deallocated and its value is unchanged.
Note that this call
mem = realloc(list, 0);
does not necessarily return NULL.
From the C Standard (7.22.3 Memory management functions)
If the size of the space requested is zero, the behavior is
implementation-defined: either a null pointer is returned, or the
behavior is as if the size were some nonzero value, except that the
returned pointer shall not be used to access an object.
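Since a zero-size request is implementation-defined (and C17 already lists it as an obsolescent feature), one way to stay portable is never to pass 0 to realloc() at all. Here is a minimal sketch of that idea. The helper name resize_block and its return convention are my own assumptions, not part of any library:

```c
#include <stdlib.h>

/* Hypothetical helper, not part of the standard library.
 * Resizes *block to new_size bytes. A new_size of 0 releases the block
 * with free() instead of relying on realloc(ptr, 0).
 * Returns 0 on success, -1 if the allocation failed; on failure *block
 * is left untouched and is still valid. */
int resize_block(void **block, size_t new_size)
{
    if (new_size == 0) {
        free(*block);          /* free(NULL) is a no-op */
        *block = NULL;
        return 0;
    }

    void *tmp = realloc(*block, new_size);
    if (tmp == NULL)
        return -1;             /* old block is still allocated */

    *block = tmp;
    return 0;
}
```

With this, a call such as resize_block(&mem, 0) does the same thing on every implementation: the block is freed and mem becomes NULL.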

First of all, formally, realloc frees the memory pointed to by its first argument after allocating a new object and copying the contents. As such, semantically it's absolutely correct that the pointed-to type not be const qualified. In limited cases, the new object's address may be the same as the old object's address, but a correct program largely can't even see this (comparing against the old pointer is undefined behavior), much less depend on it.
Secondly, I think you're confusing the const-ness of the argument type and the pointed-to type. const on argument types makes no sense whatsoever (and is ignored by the language, except in the implementation of the called function where it makes the local variable receiving the argument constant) since arguments are always values, not references to some object in the caller. Of course realloc can't change the value of the caller's pointer variable you pass to it. However, due to any use of invalid pointers being undefined behavior, your program can (because UB allows anything) exhibit behavior as if the caller's copy had been modified. For example, comparing it for equality with the new pointer may give inconsistent results. The const on memcpy's src makes a pointer-to-const type, not a const type.
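To make the distinction concrete, here is a small sketch of the two places const can appear. The function names set_first and sum are made up for illustration:

```c
#include <stddef.h>

/* const on the parameter itself: the function cannot reassign its local
 * copy p, but it may still write through it. To the caller this is the
 * same as declaring  void set_first(int *p, int value). */
void set_first(int *const p, int value)
{
    p[0] = value;          /* allowed: the pointed-to int is not const */
    /* p = NULL; */        /* would not compile: p itself is const */
}

/* const on the pointed-to type (what memcpy does for src): the function
 * promises not to modify the caller's data through p. */
int sum(const int *p, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += p[i];
    /* p[0] = 0; */        /* would not compile: *p is const */
    return total;
}
```

Only the second form says anything about the caller's data. realloc cannot make that promise, because it may release the block its argument points to.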

realloc() can free the memory that its argument points to, if it can't reuse the same memory. I think this is considered to be like a mutation (since it effectively destroys it completely).
Semantically, realloc() is equivalent to:
void *realloc(void *ptr, size_t size) {
    void *result = malloc(size);
    if (result && ptr) {
        memcpy(result, ptr, min(size, _allocation_size(ptr)));
        free(ptr);
    }
    return result;
}
where _allocation_size() is some internal function of the C runtime that determines the size of a dynamic memory allocation, and min() stands for the smaller of its two arguments (it is not a standard C function).
Since the argument to free() is not declared const void *, neither is the first argument to realloc().

I'm not entirely sure what you mean by "Does realloc mutate its first argument?".
It certainly doesn't change the value of the pointer in the caller -- no C function can do that.
But does it alter the value of the pointed-to memory? That's a trickier question.
As far as the programmer is concerned, you hand realloc a pointer to M bytes, and it returns you a (possibly different) pointer to N bytes.
If it hands you back the same pointer (meaning that it was able to do the reallocation "in place"), and if N ≥ M, it definitely does not touch the M former bytes.
If it hands you back the same pointer but N < M (that is, if you reallocated the region smaller), you're no longer allowed to access or even ask about the bytes beyond the first N, so it's particularly hard to say whether they were modified. (But in fact, they might well have been modified, in the process of marking them unused and available for future allocation.)
Finally, if realloc hands you back a different pointer, the M former bytes are "gone" -- again, you're no longer allowed to access them, so it's hard to say if they were modified, but they probably were, because all of them are now available for future allocation.
But in any case: the pointer you hand to realloc is a pointer into the heap, and realloc definitely alters the heap as it does its work, so yes, I think it's safe to say that realloc mutates its first argument, which therefore should not be declared const. (Even in the first case I discussed, where realloc "definitely did not touch the M former bytes", it probably did still adjust some nearby data structures, to record the new allocation.)
And, finally, if by "mutate" you mean the sort of thing that C++ programs are allowed to do when member variables are declared mutable -- that is, a change happens behind the scenes to some data structure referenced by a pointer that was otherwise qualified const -- well, yes, that's not too far off from what realloc does. If realloc's first argument were const, and if the modifications realloc did perform were to data structures qualified as mutable, then I suppose this would work -- but only if we were talking about C++.
But of course we're not talking about C++; we're talking about C, which doesn't even have the mutable qualifier.
(I'd say memcpy isn't a counterexample, because it doesn't do anything that even remotely smells like writing to any data structures associated with its second argument.)

Does realloc mutate its first argument?
If you mean changing the value of the variable passed as the parameter, the answer is no. This isn't specific to realloc(); it follows from the way the language handles parameters. C makes a private copy of each argument, typically on the stack or in a register, before passing it to the function. Any change to that copy is therefore local and is lost when the function returns. Formally, C passes all arguments by value (arrays look like an exception, but what is actually passed is a pointer to the first element, again by value). I'll come back to this point below.
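A short sketch of the difference, using two made-up functions (reset_copy and reset_through_address are illustrative names only):

```c
#include <stddef.h>

/* Receives a copy of the caller's pointer; assigning to it changes
 * only the local copy. */
void reset_copy(int *p)
{
    p = NULL;
    (void)p;               /* silence "set but not used" warnings */
}

/* Receives the address of the caller's pointer, so it can change the
 * caller's variable by writing through pp. */
void reset_through_address(int **pp)
{
    *pp = NULL;
}
```

realloc has the first kind of signature, so whatever it does internally, the caller's variable keeps the value it had before the call.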
Is mutating the first argument dependent on the implementation?
Of course not. As said above, this depends on the language.
Is there a reason it should not be const? As a counter example memcpy
makes its src argument const.
Of course there is a reason.
Forget about void * memcpy ( void * destination, const void * source, size_t num ), which has no connection to void* realloc (void* ptr, size_t size). Consider instead that dynamic memory management depends on the specific implementation, but essentially all allocation routines are based on memory pools, normally divided into small chunks, from which the memory blocks returned to our programs are carved. We can imagine that when we ask to shrink a block, the system removes some chunks and gives back a smaller block that incidentally remains at the same address. But if we ask for an extension and the chunks following our block are already assigned, the block cannot be extended in place.
On an embedded 8-bit microcontroller it may happen that the current memory block cannot be extended, but another memory area is large enough for the purpose. In that case the implementation can copy the data of the former block to the new one and return it, but then the block has a different address in memory.
But malloc() and its relatives must be universal, independent of the machine on which they are implemented, from 8-bit embedded applications to 64-bit desktops with gigabytes of available memory and virtual memory support. For this reason the standard must provide a definition that fits all cases.
The second point is how to report the result, pass or fail, of the reallocation. If a reference to the memory block pointer had been used (i.e. passing &ptr) and NULL were stored through it on failure, the original pointer would be lost! To preserve it, the user would have to copy the pointer before calling realloc(), a procedure that is unnatural and prone to errors.
For this reason the standard library approaches the problem from a different side: formally, the reallocation always returns a freshly allocated memory block into which the data of the former block has been copied. The programmer only has to check the result before using it (see the code example below).
The standard is very clear in its definition of the function, as already mentioned in other answers; for the sake of completeness I report it below. From ISO/IEC 9899:2017 §7.22.3.5 The realloc function:
The realloc function deallocates the old object pointed to by ptr and
returns a pointer to a new object that has the size specified by size.
The contents of the new object shall be the same as that of the old
object prior to deallocation, up to the lesser of the new and old
sizes.
Any bytes in the new object beyond the size of the old object have
indeterminate values.
If ptr is a null pointer, the realloc function behaves like the malloc
function for the specified size. Otherwise, if ptr does not match a
pointer earlier returned by a memory management function, or if the
space has been deallocated by a call to the free or realloc function,
the behavior is undefined.
If size is nonzero and memory for the new object is not allocated, the
old object is not deallocated.
If size is zero and memory for the new object is not allocated, it is
implementation-defined whether the old object is deallocated. If the
old object is not deallocated, its value shall be unchanged.
The realloc function returns a pointer to the new object (which may
have the same value as a pointer to the old object), or a null pointer
if the new object has not been allocated.
Because you don't know whether realloc() returns a new object, the former one, or even NULL in case of error, you should treat realloc() as always returning a new object. Hence the code:
int* list = NULL;
void* mem;
mem = realloc(list, 64);
printf("Address of `list`: %p\n", list);
is wrong for at least two reasons:
If realloc() returns a new object and frees the old memory, the variable list contains an invalid pointer. Moreover, realloc() could fail and return NULL, in which case the former block is still valid.
You can't expect list to change in any way when it is passed to a function as an argument. list will of course retain its former value, which is NULL.
Passing a null pointer to realloc() is standard compliant, because the standard explicitly says that in this case the behavior is the same as malloc(). Passing a zero size, however, is implementation-defined, which means the former block will be deallocated by some implementations but not by others. On your machine we can deduce that the implementation deallocates the block, given the double free error you got and the null pointer returned by realloc(). Note also that when a zero size is passed to realloc(), a returned NULL does not necessarily mean that a failure occurred; the call may have succeeded. As a result you cannot reliably tell success from failure in this case. This is an ambiguity of the function (at least to my knowledge; comments are welcome).
The golden rules to follow when using realloc() are basically these:
Keep in mind that the object returned from the function is always a new object, and you have to save it.
Because realloc() can fail and return a null pointer, never use code like the wrong approach below: if the call fails, the old object pointer is overwritten and you lose the possibility to recover the data or free the former object. Always use a temporary variable to check the return value.
Example code:
void *p = malloc(SIZE);
/* Wrong approach: the pointer is overwritten unconditionally */
p = realloc(p, 2*SIZE);
/* Correct approach */
void *pTmp = realloc(p, 2*SIZE);
if (NULL == pTmp)
{
    // Error handling code
}
else
{
    p = pTmp; // assign value
}
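The same temporary-variable rule applied to a growable array might look like the sketch below. The int_vec type, the doubling policy and the initial capacity of 4 are all assumptions made for illustration, not something the standard prescribes:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable array of int; the names are illustrative. */
struct int_vec {
    int   *data;
    size_t len;
    size_t cap;
};

/* Appends value, doubling the capacity when the array is full.
 * Returns 0 on success, -1 on failure; on failure v is unchanged and
 * v->data is still valid and still owned by v. */
int int_vec_push(struct int_vec *v, int value)
{
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 4;
        if (new_cap < v->cap || new_cap > SIZE_MAX / sizeof *v->data)
            return -1;                 /* size computation would overflow */

        int *tmp = realloc(v->data, new_cap * sizeof *v->data);
        if (tmp == NULL)
            return -1;                 /* keep the old block */

        v->data = tmp;                 /* only now replace the pointer */
        v->cap  = new_cap;
    }
    v->data[v->len++] = value;
    return 0;
}
```

A zero-initialised struct int_vec is a valid empty array here, because realloc() with a null pointer behaves like malloc(). The caller eventually releases the storage with free(v.data).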
Now you may ask why, on many machines with virtual memory management, such as desktops and smartphones, realloc() often returns an unchanged address. The point is that, thanks to virtual memory management, additional non-contiguous physical memory chunks can be added to the virtual memory chain. The virtual memory descriptors are then manipulated to map consecutive virtual addresses to each physical chunk, so that the user sees a flat, contiguous virtual address space.

Related

Freeing dynamically allocated int that needs to be returned but cannot be freed in main in c

As my long title says: I am trying to return a pointer in C that has been dynamically allocated. I know I have to free it, but I do not know how to myself. My search has shown that it can only be freed in main, but I cannot leave it up to the user to free the int.
My code looks like this right now,
int *toInt(BigInt *p)
{
    int *integer = NULL;
    integer = calloc(1, sizeof(int));
    // do some stuff here to make integer become an int from a passed
    // struct array of integers
    return integer;
}
I've tried just making a temp variable, setting it to the integer, then freeing integer and returning the temp, but that hasn't worked. There must be a way to do this without freeing in main?
Program design-wise, you should always let the "module" (translation unit) that did the allocation be responsible for freeing the memory. Expecting some other module or the caller to free() memory is indeed bad design.
Unfortunately C does not have constructors/destructors (nor "RAII"), so this has to be handled with a separate function call. Conceptually you should design the program like this:
#include "my_type.h"
int main()
{
    my_type* mt = my_type_alloc();
    ...
    my_type_free(mt);
}
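The module side of that design could look roughly like the sketch below. Only the names my_type, my_type_alloc and my_type_free come from the snippet above; the struct member is an assumption added so the sketch compiles:

```c
#include <stdlib.h>

/* What a hypothetical my_type.h might declare. The member is invented
 * for illustration only. */
typedef struct my_type {
    int value;
} my_type;

/* Allocates a zero-initialised object; returns NULL on failure. */
my_type *my_type_alloc(void)
{
    return calloc(1, sizeof(my_type));
}

/* Releases an object obtained from my_type_alloc(); NULL is accepted. */
void my_type_free(my_type *mt)
{
    free(mt);
}
```

The caller never calls malloc() or free() directly, so the module stays free to change how the object is stored.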
As for your specific case, there is no need for dynamic allocation. Simply leave allocation to the caller instead, and use a dedicated error type for reporting errors:
err_t toInt (const BigInt* p, int* integer)
{
    if(bad_things())
        return ERROR;
    *integer = p->stuff();
    return OK;
}
Where err_t is some custom error-handling type (likely enum).
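A compilable version of the same idea is sketched below. The question does not show the real BigInt definition, so the layout used here (non-negative decimal digits, most significant first) and the enum values of err_t are pure assumptions:

```c
#include <limits.h>
#include <stddef.h>

typedef enum { OK, ERROR } err_t;      /* assumed error type */

/* Hypothetical layout: decimal digits, most significant first. */
typedef struct {
    const int *digits;
    size_t     count;
} BigInt;

/* Converts *p to int and stores the result through integer.
 * Returns ERROR (leaving *integer untouched) for bad input or if the
 * value does not fit in an int. */
err_t toInt(const BigInt *p, int *integer)
{
    if (p == NULL || integer == NULL)
        return ERROR;

    int result = 0;
    for (size_t i = 0; i < p->count; i++) {
        int d = p->digits[i];
        if (d < 0 || d > 9)
            return ERROR;                  /* not a decimal digit */
        if (result > (INT_MAX - d) / 10)
            return ERROR;                  /* result * 10 + d would overflow */
        result = result * 10 + d;
    }

    *integer = result;
    return OK;
}
```

The caller declares a plain int, passes its address, and checks the return value. Nothing is allocated, so nothing has to be freed.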
Your particular code gains nothing useful from dynamic allocation, as @unwind already observed. You can save yourself considerable trouble by just avoiding it.
In a more general sense, you should imagine that with each block of allocated memory is associated an implicit obligation to free. There is no physical or electronic representation of that obligation, but you can imagine it as a virtual chit associated at any given time with at most one copy of the pointer to the space during the lifetime of the allocation. You can transfer the obligation between copies of the pointer value at will. If the pointer value with the obligation is ever lost through going out of scope or being modified then you have a leak, at least in principle; if you free the space via a copy of the pointer that does not at that time hold the obligation to free, then you have a (possibly virtual) double free.
I know I have to free it, but I do not know how to myself
A function that allocates memory and returns a copy of the pointer to it without making any other copies, such as your example, should be assumed to associate the obligation to free with the returned pointer value. It cannot free the allocated space itself, because that space must remain allocated after the function returns (else the returned pointer is worse than useless). If the obligation to free were not transferred to the returned pointer then a (virtual) memory leak would occur when the function's local variables go out of scope at its end, leaving no extant copy of the pointer having obligation to free.
I cannot leave it up to the user to free the int.
If you mean you cannot leave it up to the caller, then you are mistaken. Of course you can leave it up to the caller. If in fact the function allocates space and returns a pointer to it as you describe, then it must transfer the obligation to free to the caller along with the returned copy of the pointer to the allocated space. That's exactly what the calloc() function does in the first place. Other functions do similar, such as POSIX's strdup().
Because there is no physical or electronic representation of obligation to free, it is essential that your functions document any such obligations placed on the caller.
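As an illustration of documenting such an obligation, here is a small sketch. The function copy_ints is invented for this example:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of the first n ints of src, or NULL if
 * n is 0 or the allocation fails.
 *
 * Ownership: the obligation to free is transferred to the caller, who
 * must eventually pass the returned pointer to free(). */
int *copy_ints(const int *src, size_t n)
{
    if (n == 0 || n > SIZE_MAX / sizeof *src)
        return NULL;

    int *copy = malloc(n * sizeof *copy);
    if (copy != NULL)
        memcpy(copy, src, n * sizeof *copy);
    return copy;
}
```

The comment is the only place where the transfer of the obligation is recorded, which is why it matters.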
Just stop treating it as a pointer; there's no need for one when all you have is a single int.
Return it directly, and there will be no memory management issues since it's automatically allocated:
int toInt(const BigInt *p)
{
    int x;
    x = 0; /* do some stuff with p to compute x */
    return x;
}
The caller can just do
const int my_x = toInt(myBigInt);
and my_x will be automatically cleaned away when it goes out of scope.

clang - Undefined behavior in realloc aliasing [duplicate]

When you free memory, what happens to pointers that point into that memory? Do they become invalid immediately? What happens if they later become valid again?
Certainly, the usual case of a pointer going invalid then becoming "valid" again would be some other object getting allocated into what happens to be the memory that was used before, and if you use the pointer to access memory, that's obviously undefined behavior. Dangling pointer memory overwrite lesson 1, pretty much.
But what if the memory becomes valid again for the same allocation? There's only one Standard way for that to happen: realloc(). If you have a pointer to somewhere within a malloc()'d memory block at offset > 1, then use realloc() to shrink the block to less than your offset, your pointer becomes invalid, obviously. If you then use realloc() again grow the block back to at least cover the object type pointed to by the dangling pointer, and in neither case did realloc() move the memory block, is the dangling pointer valid again?
This is such a corner case that I don't really know how to interpret the C or C++ standards to figure it out. The program below demonstrates it.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
    static const char s_message[] = "hello there";
    static const char s_kitty[] = "kitty";
    char *string = malloc(sizeof(s_message));
    if (!string)
    {
        fprintf(stderr, "malloc failed\n");
        return 1;
    }
    memcpy(string, s_message, sizeof(s_message));
    printf("%p %s\n", string, string);
    char *overwrite = string + 6;
    *overwrite = '\0';
    printf("%p %s\n", string, string);
    string[4] = '\0';
    char *new_string = realloc(string, 5);
    if (new_string != string)
    {
        fprintf(stderr, "realloc #1 failed or moved the string\n");
        free(new_string ? new_string : string);
        return 1;
    }
    string = new_string;
    printf("%p %s\n", string, string);
    new_string = realloc(string, 6 + sizeof(s_kitty));
    if (new_string != string)
    {
        fprintf(stderr, "realloc #2 failed or moved the string\n");
        free(new_string ? new_string : string);
        return 1;
    }
    // Is this defined behavior, even though at one point,
    // "overwrite" was a dangling pointer?
    memcpy(overwrite, s_kitty, sizeof(s_kitty));
    string[4] = s_message[4];
    printf("%p %s\n", string, string);
    free(string);
    return 0;
}
When you free memory, what happens to pointers that point into that memory? Do they become invalid immediately?
Yes, definitely. From section 6.2.4 of the C standard:
The lifetime of an object is the portion of program execution during which storage is
guaranteed to be reserved for it. An object exists, has a constant address, and retains
its last-stored value throughout its lifetime. If an object is referred to outside of its
lifetime, the behavior is undefined. The value of a pointer becomes indeterminate when
the object it points to (or just past) reaches the end of its lifetime.
And from section 7.22.3.5:
The realloc function deallocates the old object pointed to by ptr and returns a
pointer to a new object that has the size specified by size. The contents of the new
object shall be the same as that of the old object prior to deallocation, up to the lesser of
the new and old sizes. Any bytes in the new object beyond the size of the old object have
indeterminate values.
Note the reference to old object and new object ... by the standard, what you get back from realloc is a different object than what you had before; it's no different from doing a free and then a malloc, and there is no guarantee that the two objects have the same address, even if the new size is <= the old size ... and in real implementations they often won't because objects of different sizes are drawn from different free lists.
What happens if they later become valid again?
There's no such animal. Validity isn't some event that takes place, it's an abstract condition placed by the C standard. Your pointers might happen to work in some implementation, but all bets are off once you free the memory they point into.
But what if the memory becomes valid again for the same allocation? There's only one Standard way for that to happen: realloc()
Sorry, no, the C Standard does not contain any language to that effect.
If you then use realloc() again grow the block back to at least cover the object type pointed to by the dangling pointer, and in neither case did realloc() move the memory block
You can't know whether it will ... the standard does not guarantee any such thing. And notably, when you realloc to a smaller size, most implementations modify the memory immediately following the shortened block; reallocing back to the original size will have some garbage in the added part, it won't be what it was before it was shrunk. In some implementations, some block sizes are kept on lists for that block size; reallocating to a different size will give you totally different memory. And in a program with multiple threads, any freed memory can be allocated in a different thread between the two reallocs, in which case the realloc for a larger size will be forced to move the object to a different location.
is the dangling pointer valid again?
See above; invalid is invalid; there's no going back.
This is such a corner case that I don't really know how to interpret the C or C++ standards to figure it out.
It's not any sort of corner case and I don't know what you're seeing in the standard, which is quite clear that freed memory has indeterminate content and that the values of any pointers to or into it are also indeterminate, and makes no claim that they are magically restored by a later realloc.
Note that modern optimizing compilers are written to know about undefined behavior and take advantage of it. As soon as you realloc string, overwrite is invalid, and the compiler is free to trash it ... e.g., it might be in a register that the compiler reallocates for temporaries or parameter passing. Whether or not any compiler actually does this, it can, precisely because the standard is quite clear about pointers into objects becoming invalid when the object's lifetime ends.
If you then use realloc() again grow the block back to at least cover the object type pointed to by the dangling pointer, and in neither case did realloc() move the memory block, is the dangling pointer valid again?
No. Unless realloc() returns a null pointer, the call terminates the lifetime of the allocated object, implying that all pointers pointing into it become invalid. If realloc() succeeds, it returns the address of a new object.
Of course, it just might happen that it's the same address as the old one. In that case, using an invalid pointer to the old object to access the new one will generally work in non-optimizing implementations of the C language.
It would still be undefined behaviour, though, and might actually fail with aggressively optimizing compilers.
The C language is unsound, and it's generally up to the programmer to uphold its invariants. Failing to do so will break the implicit contract with the compiler and may result in incorrect code being generated.
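One way to stay within the rules is to remember a position as an offset rather than as a pointer, and to rebuild the pointer from whatever realloc() returns. A minimal sketch follows; replace_tail is a made-up helper that assumes offset does not exceed the length of the existing string:

```c
#include <stdlib.h>
#include <string.h>

/* Replaces the tail of a heap-allocated string, starting at offset, with
 * tail, resizing the block to fit. The position is carried as an offset,
 * which stays meaningful across realloc; a pointer into the old block
 * would not. Returns the (possibly moved) string, or NULL on failure,
 * in which case the original block is untouched and still valid. */
char *replace_tail(char *str, size_t offset, const char *tail)
{
    size_t tail_size = strlen(tail) + 1;      /* include the '\0' */

    char *tmp = realloc(str, offset + tail_size);
    if (tmp == NULL)
        return NULL;

    char *pos = tmp + offset;                 /* derived from the new object */
    memcpy(pos, tail, tail_size);
    return tmp;
}
```

This works whether or not the block moves, because no pointer derived from the old object is used after the call.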
It depends on your definition of "valid". You've perfectly described the situation. If you want to consider that "valid", then it's valid. If you don't want to consider that "valid", then it's invalid.

realloc() dangling pointers and undefined behavior

When you free memory, what happens to pointers that point into that memory? Do they become invalid immediately? What happens if they later become valid again?
Certainly, the usual case of a pointer going invalid then becoming "valid" again would be some other object getting allocated into what happens to be the memory that was used before, and if you use the pointer to access memory, that's obviously undefined behavior. Dangling pointer memory overwrite lesson 1, pretty much.
But what if the memory becomes valid again for the same allocation? There's only one Standard way for that to happen: realloc(). If you have a pointer to somewhere within a malloc()'d memory block at offset > 1, then use realloc() to shrink the block to less than your offset, your pointer becomes invalid, obviously. If you then use realloc() again grow the block back to at least cover the object type pointed to by the dangling pointer, and in neither case did realloc() move the memory block, is the dangling pointer valid again?
This is such a corner case that I don't really know how to interpret the C or C++ standards to figure it out. The below is a program that shows it.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
static const char s_message[] = "hello there";
static const char s_kitty[] = "kitty";
char *string = malloc(sizeof(s_message));
if (!string)
{
fprintf(stderr, "malloc failed\n");
return 1;
}
memcpy(string, s_message, sizeof(s_message));
printf("%p %s\n", string, string);
char *overwrite = string + 6;
*overwrite = '\0';
printf("%p %s\n", string, string);
string[4] = '\0';
char *new_string = realloc(string, 5);
if (new_string != string)
{
fprintf(stderr, "realloc #1 failed or moved the string\n");
free(new_string ? new_string : string);
return 1;
}
string = new_string;
printf("%p %s\n", string, string);
new_string = realloc(string, 6 + sizeof(s_kitty));
if (new_string != string)
{
fprintf(stderr, "realloc #2 failed or moved the string\n");
free(new_string ? new_string : string);
return 1;
}
// Is this defined behavior, even though at one point,
// "overwrite" was a dangling pointer?
memcpy(overwrite, s_kitty, sizeof(s_kitty));
string[4] = s_message[4];
printf("%p %s\n", string, string);
free(string);
return 0;
}
When you free memory, what happens to pointers that point into that memory? Do they become invalid immediately?
Yes, definitely. From section 6.2.4 of the C standard:
The lifetime of an object is the portion of program execution during which storage is
guaranteed to be reserved for it. An object exists, has a constant address, and retains
its last-stored value throughout its lifetime. If an object is referred to outside of its
lifetime, the behavior is undefined. The value of a pointer becomes indeterminate when
the object it points to (or just past) reaches the end of its lifetime.
And from section 7.22.3.5:
The realloc function deallocates the old object pointed to by ptr and returns a
pointer to a new object that has the size specified by size. The contents of the new
object shall be the same as that of the old object prior to deallocation, up to the lesser of
the new and old sizes. Any bytes in the new object beyond the size of the old object have
indeterminate values.
Note the reference to old object and new object ... by the standard, what you get back from realloc is a different object than what you had before; it's no different from doing a free and then a malloc, and there is no guarantee that the two objects have the same address, even if the new size is <= the old size ... and in real implementations they often won't because objects of different sizes are drawn from different free lists.
What happens if they later become valid again?
There's no such animal. Validity isn't some event that takes place, it's an abstract condition placed by the C standard. Your pointers might happen to work in some implementation, but all bets are off once you free the memory they point into.
But what if the memory becomes valid again for the same allocation? There's only one Standard way for that to happen: realloc()
Sorry, no, the C Standard does not contain any language to that effect.
If you then use realloc() again grow the block back to at least cover the object type pointed to by the dangling pointer, and in neither case did realloc() move the memory block
You can't know whether it will ... the standard does not guarantee any such thing. And notably, when you realloc to a smaller size, most implementations modify the memory immediately following the shortened block; reallocing back to the original size will have some garbage in the added part, it won't be what it was before it was shrunk. In some implementations, some block sizes are kept on lists for that block size; reallocating to a different size will give you totally different memory. And in a program with multiple threads, any freed memory can be allocated in a different thread between the two reallocs, in which case the realloc for a larger size will be forced to move the object to a different location.
is the dangling pointer valid again?
See above; invalid is invalid; there's no going back.
This is such a corner case that I don't really know how to interpret the C or C++ standards to figure it out.
It's not any sort of corner case and I don't know what you're seeing in the standard, which is quite clear that freed memory has indeteterminate content and that the values of any pointers to or into it are also indeterminate, and makes no claim that they are magically restored by a later realloc.
Note that modern optimizing compilers are written to know about undefined behavior and take advantage of it. As soon as you realloc string, overwrite is invalid, and the compiler is free to trash it ... e.g., it might be in a register that the compiler reallocates for temporaries or parameter passing. Whether any compiler does this, it can, precisely because the standard is quite clear about pointers into objects becoming invalid when the object's lifetime ends.
If you then use realloc() again to grow the block back to at least cover the object type pointed to by the dangling pointer, and in neither case did realloc() move the memory block, is the dangling pointer valid again?
No. Unless realloc() returns a null pointer, the call terminates the lifetime of the allocated object, implying that all pointers pointing into it become invalid. If realloc() succeeds, it returns the address of a new object.
Of course, it just might happen that it's the same address as the old one. In that case, using an invalid pointer to the old object to access the new one will generally work in non-optimizing implementations of the C language.
It would still be undefined behaviour, though, and might actually fail with aggressively optimizing compilers.
The C language is unsound, and it's generally up to the programmer to uphold its invariants. Failing to do so will break the implicit contract with the compiler and may result in incorrect code being generated.
It depends on your definition of "valid". You've perfectly described the situation. If you want to consider that "valid", then it's valid. If you don't want to consider that "valid", then it's invalid.

C Language: Why do dynamically-allocated objects return a pointer, while statically-allocated objects give you a choice?

This is actually a much more concise, much clearer question than the one I had asked here before (for anyone who cares): C Language: Why does malloc() return a pointer, and not the value? (Sorry for those who initially think I'm spamming... I hope it's not construed as the same question since I think the way I phrased it there made it unintentionally misleading)
-> Basically what I'm trying to ask is: Why does a C programmer need a pointer to a dynamically-allocated variable/object? (whatever the difference is between variable/object...)
If a C programmer has the option of creating just 'int x' or just 'int *x' (both statically allocated), then why can't he also have the option to JUST initialize his dynamically-allocated variable/object as a variable (and NOT returning a pointer through malloc())?
*If there are some obscure ways to do what I explained above, then, well, why does malloc() seem the way that most textbooks go about dynamic-allocation?
Note: in the following, byte refers to sizeof(char)
Well, for one, malloc returns a void *. It simply can't return a value: that wouldn't be feasible with C's lack of generics. In C, the compiler must know the size of every object at compile time; since the size of the memory being allocated will not be known until run time, then a type that could represent any value must be returned. Since void * can represent any pointer, it is the best choice.
malloc also cannot initialize the block: it has no knowledge of what's being allocated. This is in contrast with C++'s operator new, which does both the allocation and the initialization, as well as being type safe (it still returns a pointer instead of a reference, probably for historical reasons).
Also, malloc allocates a block of memory of a specific size, then returns a pointer to that memory (that's what malloc stands for: memory allocation). You're getting a pointer because that's what you get: an uninitialized block of raw memory. When you do, say, malloc(sizeof(int)), you're not creating an int, you're allocating sizeof(int) bytes and getting the address of those bytes. You can then decide to use that block as an int, but you could also technically use that as an array of sizeof(int) chars.
The various alternatives (calloc, realloc) work roughly the same way (calloc is easier to use when dealing with arrays, and zero-fills the data, while realloc is useful when you need to resize a block of memory).
Suppose you create an integer array in a function and want to return it. Said array is a local variable to the function. You can't return a pointer to a local variable.
However, if you use malloc, you create an object on the heap whose lifetime extends beyond the function body. You can return a pointer to that. You just have to free it later or you will have a memory leak.
It's because objects allocated with malloc() don't have names, so the only way to reference that object in code is to use a pointer to it.
When you say int x;, that creates an object with the name x, and it is referenceable through that name. When I want to set x to 10, I can just use x = 10;.
I can also set a pointer variable to point to that object with int *p = &x;, and then I can alternatively set the value of x using *p = 10;. Note that this time we can talk about x without specifically naming it (beyond the point where we acquire the reference to it).
When I say malloc(sizeof(int)), that creates an object that has no name. I cannot directly set the value of that object by name, since it just doesn't have one. However, I can set it by using a pointer variable that points at it, since that method doesn't require naming the object: int *p = malloc(sizeof(int)); followed by *p = 10;.
You might now ask: "So, why can't I tell malloc to give the object a name?" - something like malloc(sizeof(int), "x"). The answer to this is twofold:
Firstly, C just doesn't allow variable names to be introduced at runtime. It's just a basic restriction of the language;
Secondly, given the first restriction the name would have to be fixed at compile-time: if this is the case, C already has syntax that does what you want: int x;.
You are thinking about things wrong. It is not that int x is statically allocated and malloc(sizeof(int)) is dynamic. Both are allocated dynamically. That is, they are both allocated at execution time. There is no space reserved for them at the time you compile. The size may be static in one case and dynamic in the other, but the allocation is always dynamic.
Rather, it is that int x allocates the memory on the stack and malloc(sizeof(int)) allocates the memory on the heap. Memory on the heap requires that you have a pointer in order to access it. Memory on the stack can be referenced directly or with a pointer. Usually you do it directly, but sometimes you want to iterate over it with pointer arithmetic or pass it to a function that needs a pointer to it.
Everything works using pointers. "int x" is just a convenience - someone, somewhere got tired of juggling memory addresses and that's how programming languages with human-readable variable names were born.
Dynamic allocation is... dynamic. You don't have to know how much space you are going to need when the program runs - before the program runs. You choose when to do it and when to undo it. It may fail. It's hard to handle all this using the simple syntax of static allocation.
C was designed with simplicity in mind and compiler simplicity is a part of this. That's why you're exposed to the quirks of the underlying implementations. All systems have storage for statically-sized, local, temporary variables (registers, stack); this is what static allocation uses. Most systems have storage for dynamic, custom-lifetime objects and system calls to manage them; this is what dynamic allocation uses and exposes.
There is a way to do what you're asking and it's called C++. There, "MyInt x = 42;" is a function call or two.
I think your question comes down to this:
If a C programmer has the option of creating just int x or just int *x (both statically allocated)
The first statement allocates memory for an integer. Depending upon the placement of the statement, it might allocate the memory on the stack of a currently executing function or it might allocate memory in the .data or .bss sections of the program (if it is a global variable or static variable, at either file scope or function scope).
The second statement allocates memory for a pointer to an integer -- it hasn't actually allocated memory for the integer itself. If you tried to assign a value using the pointer *x=1, you would either receive a very quick SIGSEGV segmentation violation or corrupt some random piece of memory. C doesn't pre-zero memory allocated on the stack:
$ cat stack.c
#include <stdio.h>
int main(int argc, char *argv[]) {
int i;
int j;
int k;
int *l;
int *m;
int *n;
printf("i: %d\n", i);
printf("j: %d\n", j);
printf("k: %d\n", k);
printf("l: %p\n", l);
printf("m: %p\n", m);
printf("n: %p\n", n);
return 0;
}
$ make stack
cc stack.c -o stack
$ ./stack
i: 0
j: 0
k: 32767
l: 0x400410
m: (nil)
n: 0x4005a0
l and n point to something in memory -- but those values are just garbage, and probably don't belong to the address space of the executable. If we store anything into those pointers, the program would probably die. It might corrupt unrelated structures, though, if they are mapped into the program's address space.
m at least is a NULL pointer -- if you tried to write to it, the program would certainly die on modern hardware.
None of those three pointers actually point to an integer yet. The memory for those integers doesn't exist. The memory for the pointers does exist -- and is initially filled with garbage values, in this case.
The Wikipedia article on L-values -- mostly too obtuse to fully recommend -- makes one point that represented a pretty significant hurdle for me when I was first learning C: In languages with assignable variables it becomes necessary to distinguish between the R-value (or contents) and the L-value (or location) of a variable.
For example, you can write:
int a;
a = 3;
This stores the integer value 3 into whatever memory was allocated to store the contents of variable a.
If you later write:
int b;
b = a;
This takes the value stored in the memory referenced by a and stores it into the memory location allocated for b.
The same operations with pointers might look like this:
int *ap;
ap=malloc(sizeof(int)); /* sizeof needs parentheses around a type name */
*ap=3;
The first ap= assignment stores a memory location into the ap pointer. Now ap actually points at some memory. The second assignment, *ap=, stores a value into that memory location. It doesn't update the ap pointer at all; it reads the value stored in the variable named ap to find the memory location for the assignment.
When you later use the pointer, you can choose which of the two values associated with the pointer to use: either the actual contents of the pointer or the value pointed to by the pointer:
int *bp;
bp = ap; /* bp points to the same memory cell as ap */
/* or, alternatively: */
int *bp;
bp = malloc(sizeof(int));
*bp = *ap; /* bp points to new memory and we copy
the value pointed to by ap into the
memory pointed to by bp */
I found assembly far easier than C for years because I found the difference between foo = malloc(); and *foo = value; confusing. I hope I found what was confusing you and seriously hope I didn't make it worse.
Perhaps you misunderstand the difference between declaring 'int x' and 'int *x'. The first allocates storage for an int value; the second doesn't - it just allocates storage for the pointer.
If you were to "dynamically allocate" a variable, there would be no point in the dynamic allocation anyway (unless you then took its address, which would of course yield a pointer) - you may as well declare it statically. Think about how the code would look - why would you bother with:
int *x = malloc(sizeof(int)); *x = 0;
When you can just do:
int x = 0;

Is there an alternative way to free dynamically allocated memory in C - not using the free() function?

I am studying for a test, and I was wondering if any of these are equivalent to free(ptr):
malloc(NULL);
calloc(ptr);
realloc(NULL, ptr);
calloc(ptr, 0);
realloc(ptr, 0);
From what I understand, none of these will work because the free() function actually tells C that the memory after ptr is available again for it to use. Sorry that this is kind of a noob question, but help would be appreciated.
Actually, the last of those is equivalent to a call to free(). Read the specification of realloc() very carefully, and you will find it can allocate data anew, or change the size of an allocation (which, especially if the new size is larger than the old, might move the data around), and it can release memory too. In fact, you don't need the other functions; they can all be written in terms of realloc(). Not that anyone in their right mind would do so...but it could be done.
See Steve Maguire's "Writing Solid Code" for a complete dissection of the perils of the malloc() family of functions. See the ACCU web site for a complete dissection of the perils of reading "Writing Solid Code". I'm not convinced it is as bad as the reviews make it out to be - though its complete lack of a treatment of const does date it (back to the early 90s, when C89 was still new and not widely implemented in full).
D McKee's notes about MacOS X 10.5 (BSD) are interesting...
The C99 standard says:
7.20.3.3 The malloc function
Synopsis
#include <stdlib.h>
void *malloc(size_t size);
Description
The malloc function allocates space for an object whose size is specified by size and
whose value is indeterminate.
Returns
The malloc function returns either a null pointer or a pointer to the allocated space.
7.20.3.4 The realloc function
Synopsis
#include <stdlib.h>
void *realloc(void *ptr, size_t size);
Description
The realloc function deallocates the old object pointed to by ptr and returns a
pointer to a new object that has the size specified by size. The contents of the new
object shall be the same as that of the old object prior to deallocation, up to the lesser of the new and old sizes. Any bytes in the new object beyond the size of the old object have indeterminate values.
If ptr is a null pointer, the realloc function behaves like the malloc function for the
specified size. Otherwise, if ptr does not match a pointer earlier returned by the
calloc, malloc, or realloc function, or if the space has been deallocated by a call
to the free or realloc function, the behavior is undefined. If memory for the new
object cannot be allocated, the old object is not deallocated and its value is unchanged.
Returns
The realloc function returns a pointer to the new object (which may have the same
value as a pointer to the old object), or a null pointer if the new object could not be
allocated.
Apart from editorial changes because of extra headers and functions, the ISO/IEC 9899:2011 standard says the same as C99, but in section 7.22.3 instead of 7.20.3.
The Solaris 10 (SPARC) man page for realloc says:
The realloc() function changes the size of the block pointed to by ptr to size bytes and returns a pointer to the (possibly moved) block. The contents will be unchanged up to the lesser of the new and old sizes. If the new size of the block requires movement of the block, the space for the previous instantiation of the block is freed. If the new size is larger, the contents of the newly allocated portion of the block are unspecified. If ptr is NULL, realloc() behaves like malloc() for the specified size. If size is 0 and ptr is not a null pointer, the space pointed to is freed.
That's a pretty explicit 'it works like free()' statement.
However, that MacOS X 10.5 or BSD says anything different reaffirms the "No-one in their right mind" part of my first paragraph.
There is, of course, the C99 Rationale...It says:
7.20.3 Memory management functions
The treatment of null pointers and zero-length allocation requests in the definition of these
functions was in part guided by a desire to support this paradigm:
OBJ * p; // pointer to a variable list of OBJs
/* initial allocation */
p = (OBJ *) calloc(0, sizeof(OBJ));
/* ... */
/* reallocations until size settles */
while(1) {
p = (OBJ *) realloc((void *)p, c * sizeof(OBJ));
/* change value of c or break out of loop */
}
This coding style, not necessarily endorsed by the Committee, is reported to be in widespread
use.
Some implementations have returned non-null values for allocation requests of zero bytes.
Although this strategy has the theoretical advantage of distinguishing between “nothing” and “zero” (an unallocated pointer vs. a pointer to zero-length space), it has the more compelling
theoretical disadvantage of requiring the concept of a zero-length object. Since such objects
cannot be declared, the only way they could come into existence would be through such
allocation requests.
The C89 Committee decided not to accept the idea of zero-length objects. The allocation
functions may therefore return a null pointer for an allocation request of zero bytes. Note that this treatment does not preclude the paradigm outlined above.
QUIET CHANGE IN C89
A program which relies on size-zero allocation requests returning a non-null pointer
will behave differently.
[...]
7.20.3.4 The realloc function
A null first argument is permissible. If the first argument is not null, and the second argument is 0, then the call frees the memory pointed to by the first argument, and a null argument may be
returned; C99 is consistent with the policy of not allowing zero-sized objects.
A new feature of C99: the realloc function was changed to make it clear that the pointed-to
object is deallocated, a new object is allocated, and the content of the new object is the same as
that of the old object up to the lesser of the two sizes. C89 attempted to specify that the new object was the same object as the old object but might have a different address. This conflicts
with other parts of the Standard that assume that the address of an object is constant during its
lifetime. Also, implementations that support an actual allocation when the size is zero do not
necessarily return a null pointer for this case. C89 appeared to require a null return value, and
the Committee felt that this was too restrictive.
Thomas Padron-McCarthy observed:
C89 explicitly says: "If size is zero and ptr is not a null pointer, the object it points to is freed." So they seem to have removed that sentence in C99?
Yes, they have removed that sentence because it is subsumed by the opening sentence:
The realloc function deallocates the old object pointed to by ptr
There's no wriggle room there; the old object is deallocated. If the requested size is zero, then you get back whatever malloc(0) might return, which is often (usually) a null pointer but might be a non-null pointer that can also be returned to free() but which cannot legitimately be dereferenced.
realloc(ptr, 0);
is equivalent to free(ptr); (although I wouldn't recommend its use as such!)
Also: these two calls are equivalent to each other (but not to free):
realloc(NULL,size)
malloc(size)
The last one--realloc(ptr, 0)--comes close. It will free any allocated block and replace it with a minimal allocation (says my Mac OS X 10.5 manpage). Check your local manpage to see what it does on your system.
That is, if ptr pointed at a substantial object, you'll get back most of its memory.
The man page on Debian Lenny agrees with Mitch and Jonathan...does BSD really diverge from Linux on this?
From the offending man page:
The realloc() function tries to change the size of the allocation pointed
to by ptr to size, and returns ptr. [...]
If size is zero and ptr is not NULL, a new,
minimum sized object is allocated and the original object is freed.
The Linux and Solaris man pages are very clear, as is the '89 standard: realloc(ptr,0) works like free(ptr). The Mac OS manpage above and the standard as quoted by Jonathan are less clear, and seem to leave room to break the equivalence.
I've been wondering why the difference: the "act like free" interpretation seems very natural to me. Both of the implementations I have access to include some environment-variable-driven tunability, but the BSD version accepts many more options. Some examples:
MallocGuardEdges: If set, add a guard page before and after each large block.
MallocDoNotProtectPrelude: If set, do not add a guard page before large blocks, even if the MallocGuardEdges environment variable is set.
MallocDoNotProtectPostlude: If set, do not add a guard page after large blocks, even if the MallocGuardEdges environment variable is set.
and
MallocPreScribble: If set, fill memory that has been allocated with 0xaa bytes. This increases the likelihood that a program making assumptions about the contents of freshly allocated memory will fail.
MallocScribble: If set, fill memory that has been deallocated with 0x55 bytes. This increases the likelihood that a program will fail due to accessing memory that is no longer allocated.
Possibly the "minimum sized object" is nothing (i.e. equivalent to free) in the normal modes, but something with some of the guards in place. Take that for what it's worth.

Resources