realloc in C — exact behaviour and uses - c

Reading some literature, I was able to grasp that realloc takes in a void pointer and a size variable and re-allocates the memory of the block the void pointer points to.
What will happen if realloc is called on an integer pointer (int *)
with a size of character? And vice versa.
What can be a possible application of this? (An example would definitely help.)

The realloc() function is the all-in-one memory management system.
If called with a null pointer and a non-zero size, it allocates memory.
If called with a valid pointer and a zero size, it frees memory.
If called with a valid pointer and a non-zero size, it changes the size of the allocated memory.
If you call realloc() with an invalid pointer — one which was not obtained from malloc(), calloc() or realloc() — then you get undefined behaviour.
You could pass realloc() an integer pointer to an allocated space of sizeof(char) bytes (1 byte), but you'd be in danger of invoking undefined behaviour. The problem is not with realloc(); it is with the code that was given an unusable integer pointer. Since only 1 byte was allocated but sizeof(int) is greater than 1 (on essentially all systems; there could be exceptions, but not for someone asking this question), there is no safe way to use that pointer except by passing it to free() or realloc().
Given:
int *pointer = malloc(sizeof(char));
you cannot do *pointer = 0; because there isn't enough space allocated (formally) for it to write to. You cannot do int x = *pointer; because there isn't enough space allocated (formally) for it to read from. The word 'formally' is there because in practice, the memory allocators allocate a minimum size chunk, which is often 8 or 16 bytes, so there actually is space after the one byte. However, your are stepping outside the bounds of what the standard guarantees, and it is possible to conceive of memory allocators that would hand you exactly one byte. So, don't risk it. An integer pointer to a single byte of allocated memory is unusable except as an argument to the memory allocation functions.
The first argument to realloc() is a void *. Since you're going to have a prototype in scope (#include <stdlib.h>), the compiler will convert the int * to a void * (if there's anything to do for such a cast), and as long as the space pointed at was allocated, everything will be fine; realloc() will change the allocation size, possibly returning the same pointer or possibly returning a different pointer, or it will release the space if the new size is zero bytes.

There is a vitally important requirement to non-NULL pointers that you pass to realloc: they must themselves come from a call to malloc, calloc or realloc, otherwise the behavior is undefined.
If you allocate a chunk of memory sufficient to store an int and then realloc for a char, you will always get back the same pointer, because sizeof(char) is less than or equal to the sizeof(int):
int* intPtr = malloc(sizeof(int));
int* otherPtr = realloc(intPtr, sizeof(char));
// intPtr == otherPtr
If you try it the other way around, you will almost certainly get back the same pointer as well, because memory allocators rarely, if ever, parcel the memory to chunks smaller than sizeof(int). However, the result is implementation-dependent, so theoretically you may get back a different address.
As far as the utility of any of the above exercises goes, it is not useful: realloc has been designed with the intention to help you manage variable-sized arrays, simplifying the code for growing the size of such arrays, and potentially reducing the number of allocations and copying. I do not see a reason for realloc-ing a scalar.

The type of the first parameter of realloc is void *, as you said yourself. So the function argument that you pass is converted to a void pointer, which is an implicit and safe conversion.
It's the same as if you called a function with a long int parameter with an int argument, essentially.

Realloc takes the size in bytes. If you do
int* a= malloc(sizeof(int));
and then
a=realloc(a,1);
of course a will now not be big enough for an int type and writing an int in it will give you odd behavior.

As stated by dasblinkenlight, realloc doesn't make sense for a scalar.
In your example:
int * a = malloc(sizeof(int));
int * b = realloc(a,sizeof(char));
Would result in a == b, simply because sizeof(char) < sizeof(int) and there is no reason for moving the data to a new location.
The difference is, that after the realloc you are writing your int to unallocated space, as you decreased the allocated space using realloc. Here this is only of theoretical relevance. Because of alignment it is very unlikely, that the os is reusing the freed memory. But you shouldn't rely on that. This is undefined behavior.
It could become relevant, if you resize the space of a long int to an int. This depends on your architecture.
Writing to unallocated space is like laying a time bomb and not knowing when it will explode. You should only read from / write to allocated space.

Related

Does sizeof returns the amount of memory allocated?

I read that:
sizeof operator returns the size of the data type, not the amount of memory allocated to the variable.
Isn't the amount of memory allocated depends on the size of the data type? I mean that sizeof will return 4 (architecture-dependent) when I pass int to it.
Am I missing something here?
sizeof returns the number of bytes that a variable or stack allocated array occupies.
Examples:
sizeof(char)=1 (in most configurations)
But sizeof(char*)=8 (depending on the platform)
If you dynamically allocate memory with malloc, you will receive a pointer to that block of memory. If use the sizeof on it, you will just get the size of the pointer.
However, sizeof() a stack allocated array like when you write int a[10] is the size of the allocated memory (so 4*10)
The size of the pointer doesn't depend on the size of the datatype it represents. (On 32 bit platforms, a pointer is 32bit)
The text you quote is technically incorrect. sizeof variable_name does return the size of memory that the variable called variable_name occupies.
The text makes a common mistake of conflating a pointer with the memory it points to. Those are two separate things. If a pointer points to an allocated block, then that block is not allocated to the pointer. (Nor are the contents of the block stored in the pointer -- another common mistake).
The allocation exists in its own right, the pointer variable exists elsewhere, and the pointer variable points to the allocation. The pointer variable could be changed to point elsewhere without disturbing the allocation.
sizeof returns the number of bytes
The sizeof operator yields the size (in bytes) of its operand, which may be an expression or the parenthesized name of a type.
but the size of each byte is not guaranteed to be 8. So you don't obtain directly the amount of memory allocated.
A byte is composed of a contiguous sequence of bits, the number of which is implementation-defined
anyway you can deduce the amount of memory allocated using the CHAR_BIT constant, which contains the number of bit is a byte.
"Memory allocation" in C typically refers to explicit allocation (i.e: on the heap - malloc() and friends), or implicit allocation (i.e: on the stack).
As you've defined, sizeof() returns the size of the data type:
sizeof(char) - a single char
sizeof(void *) - an void pointer
If you call malloc(sizeof(int)), you're requesting "enough memory to hold the data for an int", which may be 4 bytes on your system... you may find that more memory than you requested is allocated (though this will typically be hidden from you, see canaries).
Additionally, if you call int *x = malloc(1024), and sizeof(*x), you might get 4, because an int happens to be 4 bytes... even though the memory you've allocated is 1 KiB. If you were to incorrectly call sizeof(x), then you'll get the size of a pointer returned, not the size of the type it points to. Neither of these (sizeof(*x) or sizeof(x)) will return 1024.

(int*) when dynamically allocating array of ints in c

So I'm a bit confused on how to make a function that will return a pointer to an array of ints in C. I understand that you cannot do:
int* myFunction() {
int myInt[aDefinedSize];
return myInt; }
because this is returning a pointer to a local variable.
So, I thought about this:
int* myFunction(){
int* myInt = (int) malloc(aDefinedSize * sizeof(int));
return myInt; }
This gives the error: warning cast from pointer to integer of different size
This implies to use this, which works:
int* myFunction(){
int* myInt = (int*) malloc(aDefinedSize * sizeof(int));
return myInt; }
What I'm confused by though is this:
the (int*) before the malloc was explained to me to do this: it tells the compiler what the datatype of the memory being allocated is. This is then used when, for example, you are stepping through the array and the compiler needs to know how many bytes to increment by.
So, if this explanation I was given is correct, isn't memory being allocated for aDefinedSize number of pointers to ints, not actually ints? Thus, isnt myInt a pointer to an array of pointers to ints?
Some help in understanding this would be wonderful. Thanks!!
So, if this explanation I was given is correct, isn't memory being allocated for aDefinedSize number of pointers to ints, not actually ints?
No, you asked malloc for aDefinedSize * sizeof(int) bytes, not
aDefinedSize * sizeof(int *) bytes. That's the size of memory you get, the type depends on the pointer used to access the memory.
Thus, isnt myInt a pointer to an array of pointers to ints?
No, since you defined it as a int *, a pointer-to-an-int.
Of course the pointer has no knowledge of how large the allocated memory are is, but only points at the first int that fits there. It's up to you as programmer to keep track of the size.
Note that you shouldn't use that explicit typecast. malloc returns a void *, that can be silently assigned to any pointer, as in here:
int* myInt = malloc(aDefinedSize * sizeof(int));
Arithmetic on the pointer works in strides of the pointed-to type, i.e. with int *p, p[3] is the same as *(p+3), which means roughly "go to p, go forward three times sizeof(int) in bytes, and access that location".
int **q would be a pointer-to-a-pointer-to-an-int, and might point to an array of pointers.
malloc allocates an array of bytes and returns void* pointing to the first byte. Or NULL if the allocation failed.
To treat this array as an array of a different data type, the pointer must be cast to that data type.
In C, void* implicitly casts to any data pointer type, so no explicit cast is required:
int* allocateIntArray(unsigned number_of_elements) {
int* int_array = malloc(number_of_elements * sizeof(int)); // <--- no cast is required here.
return int_array;
}
Arrays in C
In C, you want to remember that an array is just an address in memory, plus a length and an object type. When you pass it as an argument to a function or a return value from a function, the length gets forgotten and it’s treated interchangeably with the address of the first element. This has led to a lot of security bugs in programs that either read or write past the end of a buffer.
The name of an array automatically converts to the address of its first element in most contexts, so you can for example pass either arrays or pointers to memmove(), but there are a few exceptions where the fact it also has a length matters. The sizeof() operator on an array is the number of bytes in the array, but sizeof() a pointer is the size of a pointer variable. So if we declare int a[SIZE];, sizeof(a) is the same as sizeof(int)*(size_t)(SIZE), whereas sizeof(&a[0]) is the same as sizeof(int*). Another important one is that the compiler can often tell at compile time if an array access is out of bounds, whereas it does not know which accesses to a pointer are safe.
How to Return an Array
If you want to return a pointer to the same, static array, and it’s fine that you’ll get the same array each time you call the function, you can do this:
#define ARRAY_SIZE 32U
int* get_static_array(void)
{
static int the_array[ARRAY_SIZE];
return the_array;
}
You must not call free() on a static array.
If you want to create a dynamic array, you can do something like this, although it is a contrived example:
#include <stdlib.h>
int* make_dynamic_array(size_t n)
// Returns an array that you must free with free().
{
return calloc( n, sizeof(int) );
}
The dynamic array must be freed with free() when you no longer need it, or the program will leak memory.
Practical Advice
For anything that simple, you would actually write:
int * const p = calloc( n, sizeof(int) );
Unless for some reason the array pointer would change, such as:
int* p = calloc( n, sizeof(int) );
/* ... */
p = realloc( p, new_size );
I would recommend calloc() over malloc() as a general rule, because it initializes the block of memory to zeroes, and malloc() leaves the contents unspecified. That means, if you have a bug where you read uninitialized memory, using calloc() will always give you predictable, reproducible results, and using malloc() could give you different undefined behavior each time. In particular, if you allocate a pointer and then dereference it on an implementation where 0 is a trap value for pointers (like typical desktop CPUs), a pointer created by calloc() will always give you a segfault immediately, while a garbage pointer created by malloc() might appear to work, but corrupt any part of memory. That kind of bug is a lot harder to track down. It’s also easier to see in the debugger that memory is or is not zeroed out than whether an arbitrary value is valid or garbage.
Further Discussion
In the comments, one person objects to some of the terminology I used. In particular, C++ offers a few different kinds of ways to return a reference to an array that preserve more information about its type, for example:
#include <array>
#include <cstdlib>
using std::size_t;
constexpr size_t size = 16U;
using int_array = int[size];
int_array& get_static_array()
{
static int the_array[size];
return the_array;
}
std::array<int, size>& get_static_std_array()
{
static std::array<int, size> the_array;
return the_array;
}
So, one commenter (if I understand correctly) objects that the phrase “return an array” should only refer to this kind of function. I use the phrase more broadly than that, but I hope that clarifies what happens when you return the_array; in C. You get back a pointer. The relevance to you is that you lose the information about the size of the array, which makes it very easy to write security bugs in C that read or write past the block of memory allocated for an array.
There was also some kind of objection that I shouldn’t have told you that using calloc() instead of malloc() to dynamically allocate structures and arrays that contain pointers will make almost all modern CPUs segfault if you dereference those pointers before you initialize them. For the record: this is not true of absolutely all CPUs, so it’s not portable behavior. Some CPUs will not trap. Some old mainframes will trap on a special pointer value other than zero. However, it’s come in very handy when I’ve coded on a desktop or workstation. Even if you’re running on one of the exceptions, at least your pointers will have the same value each time, which should make the bug more reproducible, and when you debug and look at the pointer, it will be immediately obvious that it’s zero, whereas it will not be immediately obvious that a pointer is garbage.

How to limit the size allocated memory?

While studying pointers I found that if I allocate memory for a 2 digit int I can give that int a higher value. Can someone explain me why it works like that and how to work around it?
Here is the code:
int *p;
p = (int *) malloc(2*sizeof(int));
p = 123456;
First of all, Please see this discussion on why not to cast the return value of malloc() and family in C..
That said, here, you're just overwriting the pointer returned by malloc(). Don't do that. It will cause memory leak. Also, if you try to dereference this pointer later, you may face undefined behavior, as there is no guarantee that this pointer points to a valid memory location. Accessing invalid memory leads to UB.
Finally, to address
I allocate memory for a 2 digit int [...]
let me tell you, you are allocating memory to hold two integers, not a two digit integer. Also, to store the integer values into the memory area pointed by the pointer, you need to dereference the pointer, like *p and store (assign) the value there, like *p=12345;.
malloc() allocates memory of the size of bytes passed as it's argument. Check the sizeof(int) (and 2*sizeof(int), if you want) to make it more clear.
Regarding this, quoting C11, chapter
void *malloc(size_t size);
The malloc function allocates space for an object whose size is specified by size [...]
So, here, malloc() returns a pointer with the size of two integers (i.e., capable of holding two integer values), not two digits or bytes.
Also, it's always a good practice to check the success of malloc() before using the returned pointer to avoid the possible UB by dereferencing NULL in case of malloc() failure.
if i allocate memory for a 2 digit int i can give that int a higher value.
No, you cannot. You have allocated memory for two ints, with full int range. Standard guarantees at least five digits (215-1), but on most modern systems you are safe with eight digits (231-1).
Back to your question, there is no way in C to ensure that your program does not write past the end of the memory that it allocates. It is the responsibility of your program to not do that, including any checks it must perform in order to catch potentially unsafe access.
This is called undefined behavior. The system may crash when you access memory past the end of the allocated area, but it does not have to.
allocate memory for a 2 digit int
malloc(2*sizeof(int)) allocates memory for two integers, not for a single two-digit integer.
you do not need the cast in the first place
You are allocated memory for two integers - typically 32 bits each
You are giving a pointer the value 123456
Perhaps reread the book

Compatibility of data size while using malloc()

I recently studied about malloc() in C with declaration as follows:
void *malloc(size_t size)
where size_t is unsigned int and size defines the no. of bytes to be reserved.
Question is that on my system float values occupy 4bytes of memory. So if i make memory pointer(of float type) using malloc of 2bytes,
float *p;
p = (float *)malloc(2);
then how come it does not give any error? Because what i think is that float data required 4 bytes so if i issue only 2 bytes to it then it may lead to some data loss.
or is it that i m understanding malloc() incorrrectly?
In the example you give, if you only allocate 2 bytes for a float * and then attempt to write to that location by dereferencing the pointer, you'll be writing to memory that hasn't been allocated. This results in undefined behavior. That means it might work, it might core dump, or it might behave in unpredictable ways.
If you want to allocate memory for one or more floats, you would do it like this:
// allocates space for an array of 5 floats
// don't cast the result of malloc
int arrayLen = 5;
float *f = malloc(sizeof(float) * arrayLen);
You're encountering an implementation-specific result of the requirements of the C Standard:
7.22.3 Memory management functions
The order and contiguity of storage allocated by successive calls to
the aligned_alloc , calloc , malloc , and realloc functions
is unspecified. The pointer returned if the allocation succeeds is
suitably aligned so that it may be assigned to a pointer to any type
of object with a fundamental alignment requirement and then used to
access such an object or an array of such objects in the space
allocated (until the space is explicitly deallocated).
In order to provide storage "suitably aligned so that it may be assigned to a pointer to any type of object with a fundamental alignment requirement", an implementation has to return memory from malloc() et al at specific offsets that are multiples of the most restrictive alignment requirement for the system. That's usually something like 8 or 16 bytes.
Given that each and every block returned has to be aligned that way, most implementations internally create blocks of memory in multiples of the alignment requirement.
So if your system has an 8-byte alignment requirement, your malloc() implementation is likely to actually give you an 8-byte block of memory even though you requested two bytes. Likewise, ask for 19 bytes and you'll likely get something like 24 in reality.
It's still undefined behavior to go beyond what you asked for, though. And undefined behavior does unfortunately include "works just fine".
This can be problematic if you try to use that pointer -actually the pointer is fine, it's the allocated memory it points to that's too small-, the reason the compiler doesn't recognize it as an error is the fact that the pointer isn't "aware" of what it points to , actually pointers are variables that contain memory addresses, so basically they're just a number, and in most cases ( as user694733 pointed out) the size of the pointer is the same whether it points to a short or a float.
what the compiler sees is a cast from (void*) to (float*) and to the compiler it's a totally valid cast.
Your question has actually nothing to do with the malloc, but rather to the data casting. In this case you have casted the bytes located starting from the address returned by malloc to float. Thus, if you later say *p = 0.0f; you will actually write 4 bytes to the mentioned memory area, but only 2 bytes are ligal to you to use since you have allocated only 2 bytes. Therefore, your code will compile and run with memory corruption (this either will result in crash, or in an unexpected runtime behavior later)

size of memory allocated by malloc

I am assigning a new memory chunk to a pointer, but apparently the size of the chunk is not the one which I pass as a parameter to malloc
char *q="tre";
printf("q in main %zu\n", sizeof(q));
q = (char*)malloc(6);
printf("q in main %zu\n", sizeof(q));
Outputs
8
8
The pointer however does point to a new memory chunk.
How is this possible?
sizeof returns size of pointer, in your case it is (char*), it will not give the memory allocated by the malloc. Keep the memory size in separate variable for later use.
char *q;
printf("%zu\n", sizeof(q));
sizeof(q) refers to the size of the pointer, not the amount of memory it points to.
What you are obtaining is the size of the variable q as a pointer type. In general all pointers will have the same size in your program.
Since 8 bytes are 64 bits, it seems you are doing 64-bit applications. :)
sizeof(q) returns the size of the pointer q which will on a 64 bit machine be 8 bytes, not the size of the memory block allocated at that pointer. sizeof is a compile time not a runtime operation.
I'm not clear what you want to do here, but if you want to allocate enough memory for a string at location s, then you want to malloc(strlen(s)+1) (+1 for the terminating NULL).
Perhaps you want to get the size of malloc()ed block. There is not a portable way to do this to my knowledge, but malloc_usable_size nearly does it on glibc. From the man page:
malloc_usable_size() returns the number of bytes available in the dynamically allocated buffer ptr, which may be greater than the requested size (but is guaranteed to be at least as large, if the request was successful). Typically, you should store the requested allocation size rather than use this function.
Note the last sentence.

Resources