malloc function memory management - c

The malloc() function allocates a single block of memory (say 20 bytes typecasted to int), so how can it be used as an array of int blocks, as with the calloc() function? Shouldn't it be used to store just one int value in the whole 20 bytes (20*8 bits)?

(say 20 bytes typecasted to int)
No, the returned memory is given as a pointer to void, an incomplete type.
We assign the returned pointer to a variable of pointer to some type, and we can use that variable to access the memory.
Quoting C11, chapter §7.22.3, Memory management functions
[....] The
pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to
a pointer to any type of object with a fundamental alignment requirement and then used
to access such an object or an array of such objects in the space allocated (until the space
is explicitly deallocated). [...] The pointer returned points to the start (lowest byte address) of the
allocated space. [....]
Since the allocated memory is contiguous, pointer arithmetic works on it just as it does on arrays, whose elements are also placed in contiguous memory.
One point to clarify: a pointer is not an array.
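As a minimal sketch of the point above, the helper below (make_sequence is a hypothetical name, not from the question) treats one malloc'd block as an array of int. It only illustrates that *(p + i) and p[i] reach the same element inside the block.

```c
#include <stdlib.h>

/* Hypothetical helper: allocate room for `count` ints and fill them
   with 0, 1, 2, ... through pointer arithmetic. */
int *make_sequence(size_t count)
{
    int *p = malloc(count * sizeof *p);   /* void * converts implicitly */
    if (p == NULL)
        return NULL;                      /* allocation failed */

    for (size_t i = 0; i < count; i++)
        *(p + i) = (int)i;                /* identical to p[i] = (int)i */

    return p;                             /* caller must free() the block */
}
```

A caller would write something like int *s = make_sequence(5); and then read s[0] through s[4] before calling free(s).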

There's an abstract concept in C formally known as effective type, meaning the actual type of the data stored in memory. This is something the compiler keeps track of internally.
Most objects in C have such an effective type at the point when the variable is declared, for example if we type int a; then the effective type of what's stored in a is int.
Meaning it is legal to do evil things like this:
int a;
double* d = (double*)&a;
*(int*)d = 1;
This works because the effective type of the actual memory remains an int, even though we pointed at it with a wildly incompatible type. As long as we access it with the same type as the effective type, all is well. If we access the data using the wrong type, very bad things will happen, such as program crashes or dormant bugs.
But when we call malloc family of functions, we only tell them to reserve n number of bytes, with no type specified. This memory is guaranteed to be allocated in adjacent memory cells, but nothing else. The only difference between malloc and calloc is that the latter sets all values in this raw memory to zero. Neither function knows anything about types or arrays.
The returned chunk of raw memory has no effective type. Not until the point when we access it, then it gets the effective type which corresponds to the type used for the access.
So just as in the previous example, it doesn't matter which type of pointer we set to point at the data. It doesn't matter if we write int* i = malloc(n); or bananas_t* b = malloc(n);, because the pointed-at memory does not yet have a type. It does not get one until at the point where we access it for the first time.
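A small sketch of that last point, assuming a hypothetical function name effective_type_demo: the block has no type when malloc returns it, and the first store through an int lvalue is what makes it an int object.

```c
#include <stdlib.h>

/* Hypothetical demo: returns 1 if the int written into freshly allocated
   memory reads back unchanged, 0 otherwise, -1 if malloc fails. */
int effective_type_demo(int value)
{
    void *raw = malloc(sizeof(int));  /* raw bytes, no effective type yet */
    if (raw == NULL)
        return -1;

    int *ip = raw;                    /* choosing a pointer type changes nothing */
    *ip = value;                      /* this store makes the effective type int */

    int ok = (*ip == value);          /* reading it back as int is well defined */
    free(raw);
    return ok;
}
```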

There is nothing special about memory returned from malloc compared to memory returned from calloc, other than the fact that the bytes of the memory block returned by calloc are initialized to 0. Memory returned by malloc does not have to be used for a single object but may also be used for an array.
This means that the following are equivalent:
int *p1 = malloc(3 * sizeof(int));
p1[0] = 1;
p1[1] = 2;
p1[2] = 3;
...
int *p2 = calloc(3, sizeof(int));
p2[0] = 1;
p2[1] = 2;
p2[2] = 3;
Both will return 3 * sizeof(int) bytes of memory which can be used as an array of int of size 3.
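To make the relationship concrete, here is a rough sketch of a calloc-like function built on malloc and memset. The name zeroed_alloc is made up for illustration, and real calloc implementations may work differently (for example by obtaining already-zeroed pages).

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: behaves roughly like calloc(count, size). */
void *zeroed_alloc(size_t count, size_t size)
{
    /* Refuse requests whose total byte count would overflow size_t. */
    if (size != 0 && count > SIZE_MAX / size)
        return NULL;

    size_t total = count * size;
    void *p = malloc(total);
    if (p != NULL)
        memset(p, 0, total);   /* the only step malloc alone does not do */
    return p;
}
```

With this helper, int *p2 = zeroed_alloc(3, sizeof(int)); gives three int elements that all read as 0.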

What malloc returns back to you is just a pointer to the starting memory address where the contiguous block of memory was allocated.
The size of the contiguous block of memory that you allocated using malloc depends on the argument you passed into malloc function. http://www.cplusplus.com/reference/cstdlib/malloc/
If you want to store int values, declare the pointer that receives the result as a pointer to int.
Example:
int *p; // pointer to int
size_t size = 20;
p = (int *) malloc(size); // p receives the address of the block
After this, the programmer can access int values through p (20 / sizeof(int) of them, i.e. 5 when an int is 4 bytes).
The only difference between calloc and malloc is that calloc initializes every byte of the block to zero.
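The sketch below turns the byte count into an element count, which is the step the question was missing. fill_block is a hypothetical name, and the "5 elements" figure assumes a 4-byte int.

```c
#include <stdlib.h>

/* Hypothetical demo: fills a 20-byte block with as many ints as fit.
   Returns the number of ints stored, or 0 if malloc fails. */
size_t fill_block(void)
{
    size_t size  = 20;
    size_t count = size / sizeof(int);   /* 5 when int is 4 bytes */

    int *p = malloc(size);
    if (p == NULL)
        return 0;

    for (size_t i = 0; i < count; i++)
        p[i] = (int)(i * 10);            /* each slot holds its own int */

    free(p);
    return count;
}
```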

Related

malloc function return pointer to an array right?

I have a doubt. When malloc() function returns a pointer, is it a pointer to a linear block of memory (similar to array)? Or is it something else?
I would like to know the structure of that memory.
I would like to know the structure of that memory.
There is no structure. Just the memory chunk
(similar to array?)
the memory chunk is exactly the same as array.
The only difference is that you refer to that chunk through a pointer.
You can't use the sizeof operator to get the size of that memory chunk.
The address of the pointer is not the same as the address of the memory chunk (whereas the address of an array always gives the address of its first element).
A pointer only points to a single object of the pointed-to type (although that object may be an aggregate type) or function. That object may be the first object in a larger sequence like an array, but you can't know that from the pointer itself. Given code like
char x;
char *p1 = &x;
char *p2 = malloc( sizeof *p2 * 10 );
There's no way to know from the pointers themselves that p1 points to a single standalone object while p2 points to the first in a sequence of objects. You have to keep track of that information separately.
This is true for pointers to aggregate types like
char a[20];
char (*p3)[20] = &a;
char (*p4)[20] = malloc( sizeof *p4 * 10 );
Same deal as above - both p3 and p4 point to a single 20-element array of char. In p4's case, it's pointing to the first in a sequence of 20-element arrays of char, but again you can't know that from the value of p4 itself.
Note that malloc and calloc don't operate in terms of objects, they operate in terms of bytes - you tell them how many bytes of memory you want to reserve, but they have no idea what type of object or sequence of objects is going to occupy that memory. They also need some way to keep track of what's been allocated, so many implementations will reserve some extra memory on each allocation for bookkeeping purposes.
malloc function return pointer to an array right?
size_t n = ...;
void *p = malloc(n);
When the returned pointer is a null pointer, the allocation failed and the pointer should not be dereferenced. It does not need to be free'd.
if (p == NULL) Handle_Failure();
A successful void *malloc(size_t n) call does return a pointer, a void * that can be assigned to any object pointer type. A cast is not needed. p is not a pointer to an array, just a void *. The allocated memory can be the destination of a copy of any object including arrays.
my_type *p = malloc(n);
Use the pointer to store the contents of an array, int, or any object. As long as the initial allocation was big enough, it does not matter. The pointer meets the alignment requirements for all object types. When done, free it exactly once. Then do not use the value in the pointer. Yet p can be re-assigned.
int foo(const struct abc *x) {
    struct abc *p = malloc(sizeof *p);
    if (p == NULL) {
        return 1;
    }
    *p = *x;
    bar(p);
    free(p);
    return 0;
}
Allocation of 0 bytes is a special case and can be done. Should malloc() return a null pointer or not, the pointer should not be dereferenced. No *p.
Once code gets p, code does not have a portable way to get the size of memory allocated. Code should keep track of the original n as needed. malloc() uses a size_t argument.
Note: free(NULL) is OK
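Since there is no portable way to ask for the size later, one common approach is to carry the size next to the pointer. The following is only a sketch; struct tracked_buf, tracked_alloc and tracked_free are hypothetical names.

```c
#include <stdlib.h>

/* Hypothetical wrapper pairing a pointer with the size that was requested. */
struct tracked_buf {
    void  *data;
    size_t size;   /* bytes requested; 0 if the allocation failed */
};

struct tracked_buf tracked_alloc(size_t n)
{
    struct tracked_buf b = { malloc(n), 0 };
    if (b.data != NULL)
        b.size = n;
    return b;
}

void tracked_free(struct tracked_buf *b)
{
    free(b->data);     /* free(NULL) is a no-op, so no check is needed */
    b->data = NULL;    /* avoid using the stale pointer afterwards */
    b->size = 0;
}
```

Because tracked_free resets data to NULL, calling it a second time on the same struct is harmless.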
When malloc(); function returns a pointer is it a pointer to a linear blocks of memory
Yes, and it's a void pointer.
(similar to array).
No. It's not an array, or in other words, it's not the pointer to the first element of an array of size passed to malloc(). It's a memory region/ block, virtually contiguous, but the returned pointer (or the variable storing the return value) will not have properties of an array.
would like to know the structure of that memory
If you only want to use the returned memory location, you need not bother. You can just store the returned pointer to a variable of a pointer to complete type, and use that variable to perform operations (read from/ write to) that memory location.
It does not return pointer to an array.
malloc(size_t n) returns a pointer to a contiguous memory location specified by the parameter n.
Note that it is opposed to calloc(size_t nmem, size_t size) which initializes each bit to zero while malloc() leaves the value as indeterminate as suggested in comments.
Default value of malloc
Why malloc may be indeterminate from SO
More about calloc and malloc from SO

Does sizeof returns the amount of memory allocated?

I read that:
sizeof operator returns the size of the data type, not the amount of memory allocated to the variable.
Isn't the amount of memory allocated depends on the size of the data type? I mean that sizeof will return 4 (architecture-dependent) when I pass int to it.
Am I missing something here?
sizeof returns the number of bytes that a variable or stack allocated array occupies.
Examples:
sizeof(char)=1 (always, by definition)
But sizeof(char*)=8 (depending on the platform)
If you dynamically allocate memory with malloc, you will receive a pointer to that block of memory. If you use sizeof on it, you will just get the size of the pointer.
However, sizeof applied to a stack-allocated array, such as int a[10], gives the size of the whole array (so 10 * sizeof(int), e.g. 40 when an int is 4 bytes).
The size of the pointer doesn't depend on the size of the datatype it points to. (On typical 32-bit platforms, a pointer is 32 bits.)
The text you quote is technically incorrect. sizeof variable_name does return the size of memory that the variable called variable_name occupies.
The text makes a common mistake of conflating a pointer with the memory it points to. Those are two separate things. If a pointer points to an allocated block, then that block is not allocated to the pointer. (Nor are the contents of the block stored in the pointer -- another common mistake).
The allocation exists in its own right, the pointer variable exists elsewhere, and the pointer variable points to the allocation. The pointer variable could be changed to point elsewhere without disturbing the allocation.
sizeof returns the number of bytes
The sizeof operator yields the size (in bytes) of its operand, which may be an expression or the parenthesized name of a type.
However, a byte is not guaranteed to be 8 bits, so sizeof does not directly give you the number of bits occupied.
A byte is composed of a contiguous sequence of bits, the number of which is implementation-defined
Anyway, you can deduce the number of bits using the CHAR_BIT constant (from <limits.h>), which holds the number of bits in a byte.
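A minimal sketch of that calculation; bits_in is a hypothetical helper name.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: converts a size in bytes (as yielded by sizeof)
   into a size in bits, using the implementation's CHAR_BIT. */
size_t bits_in(size_t bytes)
{
    return bytes * CHAR_BIT;
}
```

For example, bits_in(sizeof(int)) gives 32 on a platform with 4-byte int and 8-bit bytes, but the result is implementation-dependent.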
"Memory allocation" in C typically refers to explicit allocation (i.e: on the heap - malloc() and friends), or implicit allocation (i.e: on the stack).
As you've defined, sizeof() returns the size of the data type:
sizeof(char) - a single char
sizeof(void *) - a void pointer
If you call malloc(sizeof(int)), you're requesting "enough memory to hold the data for an int", which may be 4 bytes on your system... you may find that more memory than you requested is allocated (though this will typically be hidden from you, see canaries).
Additionally, if you call int *x = malloc(1024), and sizeof(*x), you might get 4, because an int happens to be 4 bytes... even though the memory you've allocated is 1 KiB. If you were to incorrectly call sizeof(x), then you'll get the size of a pointer returned, not the size of the type it points to. Neither of these (sizeof(*x) or sizeof(x)) will return 1024.
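The following sketch collects the sizes discussed above in one place. struct size_report and report_sizes are hypothetical names; the actual numbers depend on the platform, so none are asserted in the comments.

```c
#include <stdlib.h>

struct size_report {
    size_t requested;   /* what we asked malloc for */
    size_t of_pointer;  /* sizeof x: size of the pointer itself */
    size_t of_pointee;  /* sizeof *x: size of one int */
    size_t of_array;    /* sizeof a, for int a[10]: the whole array */
};

struct size_report report_sizes(void)
{
    size_t n = 1024;
    int *x = malloc(n);
    int a[10];

    /* sizeof does not evaluate its operand here, so this is safe
       even if malloc returned NULL. */
    struct size_report r = { n, sizeof x, sizeof *x, sizeof a };

    free(x);
    return r;
}
```

Only the requested field holds 1024; it is there because the program stored it, not because sizeof could recover it.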

How does malloc() know you want to use the block of memory it supplies as an array?

If malloc() returns a pointer to a single block of memory, how can it be used to store multiple values contiguously and allow access to each one using the subscript operator, acting as a pointer to an array?
If I were to try and change the "second element" of an integer by subscripting its address, it would cause undefined behaviour. As malloc() returns the pointer to a single block of memory, shouldn't the pointer it returns refer to the entire block, and thus subscripting it should access the garbage value next to it in memory?
Furthermore, the allocated memory can also be used to store a single value, but only up to the size of the type the pointer is cast to, not to that of the allocated block of memory.
Is all this something to do with the type the pointer is cast to after being returned? Could someone point me in the right direction?
I think your misunderstanding is here:
As malloc() returns the pointer to a single block of memory, shouldn't the pointer it returns refer to the entire block, and thus subscripting it should access the garbage value next to it in memory?
Indeed if you do p = malloc(n) and p has type "pointer to some type of size n", then p[1] is an out-of-bounds array access. However, normally when you do p = malloc(n) to allocate an array, the type of p is not a pointer to the array (of size n), but a pointer to the first element of the array. That is, instead of
char (*p)[500] = malloc(500);
you do:
char *p = malloc(500);
and in this case p[1] is perfectly valid. Note that with the first, unusual, form, you could still do (*p)[1] or p[0][1] and have it be valid.
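As a sketch of the two views of the same block, the hypothetical function below (second_byte_demo is an illustrative name) writes through the char * form and reads through the pointer-to-array form.

```c
#include <stdlib.h>

/* Hypothetical demo: returns 1 if both pointer forms address the same
   second byte of the block, 0 if not, -1 if malloc fails. */
int second_byte_demo(void)
{
    void *raw = malloc(500);
    if (raw == NULL)
        return -1;

    char *p = raw;            /* pointer to the first element */
    char (*q)[500] = raw;     /* pointer to the whole 500-char array */

    p[1] = 'a';               /* valid: second char of the block */

    int same = ((*q)[1] == 'a')
            && (q[0][1] == 'a')
            && (&p[1] == &(*q)[1]);

    /* q[1] would be a second 500-char array, which was never allocated. */
    free(raw);
    return same;
}
```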
But be careful, if you use malloc several times it will return memory allocated in different parts of heap. So you can't move around from one array to another.

C: pointer to malloc'ed heap location plus 4

I'm trying to implement a new malloc that stores the size at the front of the malloc'ed region, and then returns a pointer to the incremented location (what comes after the stored unsigned int).
void* malloc_new(unsigned size) {
    void* result = malloc(size + sizeof(unsigned));
    ((unsigned*)result)[0] = size;
    result += sizeof(unsigned);
    return result;
}
I'm having doubts regarding whether the
result += sizeof(unsigned);
line is correct (does what I want).
Say the original address in the heap for the malloc is X, and the size of unsigned is 4, I want the 'result' pointer to point to X + 4, right? Meaning that the memory location in the stack that stores the 'result' pointer should contain (the original heap address location + 4).
result += sizeof(unsigned); should give you at least a warning (pointer arithmetic on void * is not permitted in standard C, although some compilers accept it as an extension).
unsigned *result = malloc(size + sizeof size);
result[0] = size;
return result + 1;
should be the easier way.
Please note that the returned memory is not well aligned for all possible datatypes. You will run into troubles if you are using this memory for double or other 64bit datatypes. You should use an 8 byte datatype uint64_t for storing the size, then the memory block afterwards is well aligned.
In addition to the problems noted in other answers with performing pointer arithmetic on void * pointers, you're also likely violating one of the restrictions the C standard places on memory returned from functions such as malloc().
7.22.3 Memory management functions, paragraph 1 of the C standard states:
The order and contiguity of storage allocated by successive calls to the aligned_alloc, calloc, malloc, and realloc functions is unspecified. The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object with a fundamental alignment requirement and then used to access such an object or an array of such objects in the space allocated (until the space is explicitly deallocated). The lifetime of an allocated object extends from the allocation until the deallocation. Each such allocation shall yield a pointer to an object disjoint from any other object. The pointer returned points to the start (lowest byte address) of the allocated space. If the space cannot be allocated, a null pointer is returned. If the size of the space requested is zero, the behavior is implementation-defined: either a null pointer is returned, or the behavior is as if the size were some nonzero value, except that the returned pointer shall not be used to access an object.
Note the sentence about the returned pointer being suitably aligned for any type of object with a fundamental alignment requirement.
Unless your system has a fundamental alignment that's only four bytes (8 or 16 is much more typical), you are violating that restriction, and will invoke undefined behavior per 6.3.2.3 Pointers, paragraph 7 for any object type with a fundamental alignment requirement larger than four bytes:
... If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined. ...
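One possible way to keep the returned pointer suitably aligned is to pad the size header to the strictest fundamental alignment. The sketch below assumes C11 (max_align_t) and takes a size_t rather than unsigned; union header, malloc_new_size and free_new are hypothetical names added for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Header whose size and alignment match the strictest fundamental
   alignment, so the bytes right after it are aligned for any type. */
union header {
    size_t      size;
    max_align_t align;   /* never read; present only for its alignment */
};

void *malloc_new(size_t size)
{
    if (size > SIZE_MAX - sizeof(union header))
        return NULL;                       /* total would overflow */

    union header *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;

    h->size = size;
    return h + 1;                          /* first byte after the header */
}

/* Hypothetical companion: recover the stored size from a user pointer. */
size_t malloc_new_size(const void *p)
{
    return ((const union header *)p - 1)->size;
}

/* Hypothetical companion: free a block obtained from malloc_new. */
void free_new(void *p)
{
    if (p != NULL)
        free((union header *)p - 1);
}
```

Blocks from malloc_new must be released with free_new, not free, because the pointer handed to the caller is not the one malloc returned.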
How void pointer arithmetic is happening in GCC
C does not allow pointer arithmetic with void * pointer type.
GNU C allows it by considering the size of void is 1.
The result is void* so result += sizeof(unsigned); just happens to work on compatible compilers.
You can refactor your function into:
void *malloc_new(unsigned size) {
    void *result = malloc(size + sizeof(unsigned));
    if (result == NULL)
        return NULL;
    ((unsigned*)result)[0] = size;
    result = (char*)result + sizeof(unsigned);
    return result;
}
Side note, before void type and void* generic pointer existed in the C language, programmers used char* to represent a generic pointer.
You can do void* arithmetic if you cast the type first to char*, for instance. And then cast back to void*. To get better alignment, use 64 bit type for the size, e.g. uint64_t.
#define W_REF_VOID_PTR(ptr,offset) \
((void*)((char*) (ptr) + (offset)))
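A short usage sketch for the macro; element_at is a hypothetical helper that computes the address of an element in an array whose element size is only known at run time.

```c
#include <stddef.h>

#define W_REF_VOID_PTR(ptr, offset) \
    ((void *)((char *)(ptr) + (offset)))

/* Hypothetical helper: address of element `i` in an array of
   `elem_size`-byte items starting at `base`. */
void *element_at(void *base, size_t elem_size, size_t i)
{
    return W_REF_VOID_PTR(base, elem_size * i);
}
```

Given int a[4], element_at(a, sizeof a[0], 2) yields the same address as &a[2].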

Is this explicit heap or stack dynamic?

Given the following code snippet in C:
int* x;
x = (int *) malloc(40);
We know that this is an explicit heap dynamic allocation.
If I change the code to this though:
int* x = (int *) malloc(40);
is it still an explicit heap dynamic? My friend thinks it's a stack dynamic, but I think it's an explicit heap dynamic because we're allocating memory from the heap.
Explicit heap dynamic is defined as variables that are allocated and deallocated by explicit run-time instructions written by the programmer. Wouldn't that imply that any malloc/calloc call would be explicit heap?
Edit: I spoke to my professor and she clarified some stuff for me.
When we declare something like
char * str = (char *) malloc(15);
We say that str is of data type char pointer, and has a stack dynamic storage binding. However, when we are referring to the object referenced by str, we say that it is explicit heap dynamic.
First off, if you are going to store the results of malloc in a variable, then declare it properly as a pointer, not an int.
The problem with storing the malloc results in an int, is that you could possibly truncate the pointer and blow up when you dereference it later. For instance if you ran that expression on a 64 bit system, the malloc would come back with a 8 byte pointer. But since you are assigning it to a 4 byte int, it gets truncated. Not good.
Your code should be like this:
int* x = (int*)malloc(bla);
Anyway, the int pointer x itself is stored on the stack. But don't confuse the two: x itself is a pointer on the stack, but it points to memory allocated on the heap.
Note:
32 bit applications (usually) have 4 byte pointers.
64 bit applications (usually) have 8 byte pointers.
Yes, malloc allocates requested memory on the heap. Heap memory is used for dynamic data structures that grow and shrink.
From standard 6.3.2.3
Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined. The result need not be in the range of values of any integer type.
This explains why the type of the variable should be int* rather than int. Also, the cast is not needed: the void* to int* conversion is done implicitly here.
Also, why do you think that declaring a variable with or without an initializer would affect the storage duration of the memory allocated by malloc and its friends? It does not.
Interestingly, x has automatic storage duration (usually realized using a stack), but the memory it points to (yes, the type of x would be int*) has allocated storage duration. Heap and stack are not specified or even mentioned by the C standard; most implementations realize allocated storage duration using a heap.
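The difference between the two storage durations can be sketched as follows; make_counter is a hypothetical function name.

```c
#include <stdlib.h>

/* Hypothetical demo: the pointer variable x is automatic and disappears
   when the function returns, but the object it points to has allocated
   storage duration and lives until free() is called. */
int *make_counter(int start)
{
    int *x = malloc(sizeof *x);   /* x: automatic; *x: allocated */
    if (x != NULL)
        *x = start;
    return x;                     /* only the pointer value is copied out */
}
```

The caller can keep using the returned pointer after make_counter has returned, and is responsible for passing it to free.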
