I'm just reading about malloc() in C.
The Wikipedia article provides an example, but it just allocates enough memory for an array of 10 ints, in comparison with int array[10]. Not very useful.
When would you decide to use malloc() over letting C handle the memory for you?
Dynamic data structures (lists, trees, etc.) use malloc to allocate their nodes on the heap. For example:
/* A singly-linked list node, holding data and pointer to next node */
struct slnode_t
{
struct slnode_t* next;
int data;
};
typedef struct slnode_t slnode;
/* Allocate a new node with the given data and next pointer */
slnode* sl_new_node(int data, slnode* next)
{
    slnode* node = malloc(sizeof *node);
    if (node == NULL)
        return NULL; /* allocation failed */
    node->data = data;
    node->next = next;
    return node;
}
/* Insert the given data at the front of the list specified by a
** pointer to the head node
*/
void sl_insert_front(slnode** head, int data)
{
    slnode* node = sl_new_node(data, *head);
    if (node != NULL) /* leave the list untouched if allocation failed */
        *head = node;
}
Consider how new data is added to the list with sl_insert_front. You need to create a node that will hold the data and the pointer to the next node in the list. Where are you going to create it?
Maybe on the stack! - NO - where will that stack space be allocated? In which function? What happens to it when the function exits?
Maybe in static memory! - NO - you'll then have to know in advance how many list nodes you have because static memory is pre-allocated when the program loads.
On the heap? YES - because there you have all the required flexibility.
malloc is used in C to allocate memory on the heap: memory space that can grow and shrink dynamically at runtime, and the ownership of which is completely under the programmer's control. There are many more examples where this is useful, but the one I'm showing here is a representative one. Eventually, in complex C programs you'll find that most of the program's data is on the heap, accessible through pointers. A correct program always knows which pointer "owns" the data and will carefully clean up the allocated memory when it's no longer needed.
What if you don't know the size of the array when you write your program?
As an example, imagine you want to load an image. At first you don't know its size, so you have to read the size from the file, allocate a buffer of that size, and then read the file into that buffer. Obviously you could not have used a statically sized array.
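The read-the-size-then-allocate pattern described above might look roughly like this. It is only a sketch: the file layout (a size_t byte count followed by the payload) and the name load_payload are assumptions made for illustration, not a real image format.

```c
#include <stdio.h>
#include <stdlib.h>

/* Assumed layout for this sketch: a size_t byte count written with
 * fwrite(), followed by that many payload bytes.
 * load_payload is a hypothetical helper name. */
unsigned char *load_payload(FILE *fp, size_t *len_out)
{
    size_t len;
    unsigned char *buf;

    /* The size is only known once the header has been read. */
    if (fread(&len, sizeof len, 1, fp) != 1)
        return NULL;
    if (len == 0)
        return NULL;            /* nothing to load */

    buf = malloc(len);          /* buffer sized at run time */
    if (buf == NULL)
        return NULL;

    if (fread(buf, 1, len, fp) != len) {
        free(buf);              /* short read: don't leak the buffer */
        return NULL;
    }

    *len_out = len;
    return buf;                 /* caller owns the buffer and must free() it */
}
```

The caller is expected to free() the returned buffer once it is done with the data.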
EDIT:
Another point: with dynamic allocation, memory comes from the heap, while local (automatic) arrays live on the stack. This matters when you are programming for an embedded device, because the stack can be very limited in size compared to the heap.
I recommend that you google Stack and Heap.
int* heapArray = (int*)malloc(10 * sizeof(int));
int stackArray[10];
Both are very similar in the way you access the data. They are very different in the way the data is stored behind the scenes. The heapArray is allocated on the heap and is only deallocated when the application dies, or when free(heapArray) is called. The stackArray is allocated on the stack and is deallocated when the stack unwinds.
In the example you described int array[10] goes away when you leave your stack frame. If you would like the used memory to persist beyond local scope you have to use malloc();
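To illustrate memory that persists beyond the local scope, here is a small sketch of a function that builds a string on the heap and hands it back to its caller. The name make_greeting is made up for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a heap-allocated "Hello, <name>!" string, or NULL on failure.
 * make_greeting is a hypothetical name used only for this sketch. */
char *make_greeting(const char *name)
{
    static const char prefix[] = "Hello, ";
    /* +2 for the trailing '!' and the terminating '\0' */
    size_t len = strlen(prefix) + strlen(name) + 2;
    char *s = malloc(len);

    if (s == NULL)
        return NULL;

    strcpy(s, prefix);
    strcat(s, name);
    strcat(s, "!");
    return s;   /* still valid after this function returns */
}
```

Had the string been built in a local char array instead, the returned pointer would refer to storage that no longer exists once the function returns. With malloc() the caller receives valid memory and takes over the duty to free() it.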
Although you can do variable length arrays as of C99, there's still no decent substitute for the more dynamic data structures. A classic example is the linked list. To get an arbitrary size, you use malloc to allocate each node so that you can insert and delete without massive memory copying, as would be the case with a variable length array.
For example, an arbitrarily sized stack using a simple linked list:
#include <stdio.h>
#include <stdlib.h>
typedef struct sNode {
int payLoad;
struct sNode *next;
} tNode;
void stkPush (tNode **stk, int val) {
tNode *newNode = malloc (sizeof (tNode));
if (newNode == NULL) return;
newNode->payLoad = val;
newNode->next = *stk;
*stk = newNode;
}
int stkPop (tNode **stk) {
tNode *oldNode;
int val;
if (*stk == NULL) return 0;
oldNode = *stk;
*stk = oldNode->next;
val = oldNode->payLoad;
free (oldNode);
return val;
}
int main (void) {
tNode *top = NULL;
stkPush (&top, 42);
printf ("%d\n", stkPop (&top));
return 0;
}
Now, it's possible to do this with variable length arrays but, like writing an operating system in COBOL, there are better ways to do it.
malloc() is used whenever:
You need dynamic memory allocation
If you need to create an array of size n, where n is calculated during program execution, and the array has to outlive the current function or may be too large for the stack, you need malloc() (or its relatives calloc() and realloc()).
You need to allocate memory in heap
Variables defined in a function live only until that function returns. So if you need data that is independent of the call stack, it must either be passed or returned as a function parameter (which is not always suitable) or stored on the heap. The way to store data on the heap is malloc() and its relatives calloc() and realloc(). Variable-length arrays exist, but they are allocated on the stack.
Related
Having the following,
struct node{
int value;
struct node *next;
};
typedef struct node Node;
typedef struct node *pNode;
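Node newNode(){
    Node n;
    n.value = 5;
    n.next = NULL; /* don't return an indeterminate pointer */
    return n;
}
pNode newpNode(){
    pNode pn = (pNode) malloc(sizeof(Node));
    if (pn == NULL)
        return NULL; /* allocation failed */
    pn->value = 6;
    pn->next = NULL;
    return pn;
}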
I read somewhere that if memory deallocation is to be done by the caller function, I should use newpNode(), and otherwise use newNode(), but that does not help me understand quite enough.
Can you give me some concrete examples of when should I use newNode() and when newpNode()?
Edit: forgot pn inside newpNode()
In this simple example, there is no strong need to use one over the other.
When you call newNode(), space for the returned Node is reserved on the call stack. You can assign the result to a variable to keep it around (the returned value is copied into your local variable):
Node n = newNode();
However, as a Node gets more complicated, you will run into problems. For example, if a Node holds pointers to other data, only the pointers are copied, not the data they point to; if that data was local to newNode(), it is gone once newNode() returns.
Also, as the memory required for Node gets larger (i.e. more fields), these calls take more and more stack space and copying. This can limit things such as recursion, or just hurt general efficiency.
To deal with these limitations, you allocate memory on the heap in newpNode(). This always returns a pointer, regardless of the size of Node. However, you have to make sure you explicitly free this memory later, or you will have a memory leak.
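As a rough illustration of the difference in cleanup responsibility, the sketch below uses both styles side by side. The caller function sum_of_both is hypothetical, and the NULL checks are additions that the question's code does not have.

```c
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};
typedef struct node Node;

/* By value: the caller receives a copy, so there is nothing to free. */
Node newNode(void)
{
    Node n;
    n.value = 5;
    n.next = NULL;
    return n;
}

/* On the heap: the caller receives a pointer and must free() it. */
Node *newpNode(void)
{
    Node *pn = malloc(sizeof *pn);
    if (pn == NULL)
        return NULL;
    pn->value = 6;
    pn->next = NULL;
    return pn;
}

/* Hypothetical caller: returns the sum of both values, or -1 if the
 * heap allocation failed. */
int sum_of_both(void)
{
    Node local = newNode();     /* lives until sum_of_both returns */
    Node *heap = newpNode();    /* lives until free() is called */
    int sum;

    if (heap == NULL)
        return -1;
    sum = local.value + heap->value;
    free(heap);                 /* the caller's responsibility */
    return sum;
}
```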
I have an unsigned char* head that points to a certain address in memory, and now I have to create an instance of a struct I've declared (via typedef), starting at the location of that pointer... I am confused about how to do that!
Here is the declaration of the typedef
typedef struct {
struct block *next;
struct block *prev;
int size;
unsigned char *buffer;
} block;
My assignment involves implementing malloc, so I can't use malloc. A block is part of a free_list, which contains all chunks of free memory that I have in my program heap. Hence the previous and next pointers, which point to the previous and next free blocks of memory.
Head points to the start of the free_list. When I have to split, say, the first block of free memory to satisfy a malloc() request that needs less space than that free block has, I need to move my head and create a new block struct there.
Hope this makes sense. If not, the assignment looks something like this
Your struct has no tag, so you need to give it one in order for it to point to itself:
typedef struct block {
struct block *next;
struct block *prev;
int size;
unsigned char *buffer;
} block;
If you're using C99 you can initialise the memory at head directly, if necessary, without declaring a temporary struct block:
*(block *)head = (block){NULL, NULL, 0, NULL};
You now have a struct block at the address head, as long as you cast it properly.
e.g.
((block *)head)->size = 5;
Or you assign a cast pointer to it:
block *p = (block *)head;
p->size = 5;
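Putting these pieces together, a helper that writes a block header at head could be sketched as follows. The name place_block is hypothetical, and the sketch assumes head is suitably aligned for a block and has room for the header plus size payload bytes.

```c
#include <stddef.h>

typedef struct block {
    struct block *next;
    struct block *prev;
    int size;
    unsigned char *buffer;
} block;

/* Writes a block header at `head` and returns it as a typed pointer.
 * Assumes `head` is suitably aligned for a block and points to at least
 * sizeof(block) + size bytes. place_block is a hypothetical helper. */
block *place_block(unsigned char *head, int size)
{
    block *b = (block *)head;

    b->next = NULL;
    b->prev = NULL;
    b->size = size;
    b->buffer = head + sizeof(block);   /* payload starts right after the header */
    return b;
}
```

No allocation takes place here: the struct simply lives in memory you already control.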
unsigned char* head = /* whatever you have assuming that it has a sufficient size. */;
/* Create a block in memory */
block* b = (block*)malloc(sizeof(block));
/*
* modify data in b here as you wish.
*/
b->next = 0;
b->prev = 0;
/* etc... */
/* copy b to head */
memcpy(head, b, sizeof(block));
/* free block */
free(b);
The above assumes that head has enough space to store an instance of block.
What it does is create a block, and copy the memory to the position of head, then free the allocated block.
From comments:
head points to the start of a place in memory where I can overwrite data...You may assume that I have enough space!
Then to obtain a properly typed pointer:
struct block *p = (struct block *)head;
and to have a copy of the block:
struct block b = *(struct block *)head;
The operating system will provide an API call to allocate blocks of memory that your malloc can carve up and provide to callers. In Linux/unix look at sbrk. In Windows look at the Win32 heap API. Your records will point into this block. Making sure no two allocated sections of the block overlap is the job of your allocator code.
It looks like your records are implementing a free list. So how are you going to allocate list nodes when you don't have an allocator (yet)? The usual solution is to do it in the free blocks themselves. So a free block has the structure:
typedef struct free_block {
struct free_block *next, *prev;
size_t size;
unsigned char buffer[1];
} FREE_BLOCK;
Now this data structure actually lies at the start of a free block. Its buffer has only 1 byte in the declaration, but the actual buffer is size bytes. Initially you'd have something like:
static FREE_BLOCK *free_list;
/* at start-up, e.g. in an init function: */
free_list = sbrk(ARENA_SIZE);
free_list->next = free_list->prev = free_list;
free_list->size = ARENA_SIZE - offsetof(FREE_BLOCK, buffer);
This places the whole arena on the free list as a single block. Your allocator will search free_list to find a block that's big enough, carve out the piece it needs, put the remaining small block (if any) back on the free list. For freeing, it will add the freed block to the list and coalesce adjacent blocks.
Simple free list allocators differ in how they choose the free block to allocate from: first fit, rotating first fit, best fit, worst fit, etc. In practice rotating first fit seems to work as well as or better than any of the others.
Incidentally, all of the common algorithms implemented with free lists don't need double links. Single ones will do.
Since this is an academic assignment, it should be fine to just call malloc (instead of an operating system API) to establish the big block (often called the "arena") your allocator will manage. You could also declare a big array of bytes.
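A much simplified sketch of such an allocator is shown below. It is one possible arrangement, not the assignment's required design: it manages a static array instead of calling sbrk, uses a singly linked free list with plain first fit, measures sizes in header-sized units (a layout borrowed from the classic K&R allocator) to keep payloads aligned, and leaves out coalescing. The names my_alloc, my_free and ARENA_UNITS are invented for this example, and max_align_t needs C11.

```c
#include <stddef.h>

#define ARENA_UNITS 256   /* arena size, in header-sized units */

/* Header stored at the start of every block. The union with max_align_t
 * makes the header size a multiple of the strictest alignment. */
typedef union header {
    struct {
        union header *next;   /* next block on the free list */
        size_t units;         /* block size in units, header included */
    } s;
    max_align_t align;
} header;

static header arena[ARENA_UNITS];
static header *free_list = NULL;
static int initialized = 0;

void *my_alloc(size_t nbytes)
{
    header *prev, *p;
    size_t units;

    if (nbytes == 0 || nbytes > sizeof arena)
        return NULL;

    /* payload rounded up to whole units, plus one unit for the header */
    units = (nbytes + sizeof(header) - 1) / sizeof(header) + 1;

    if (!initialized) {       /* whole arena starts as one free block */
        arena[0].s.next = NULL;
        arena[0].s.units = ARENA_UNITS;
        free_list = arena;
        initialized = 1;
    }

    /* first fit: take the first free block that is large enough */
    for (prev = NULL, p = free_list; p != NULL; prev = p, p = p->s.next) {
        if (p->s.units < units)
            continue;
        if (p->s.units == units) {
            /* exact fit: unlink the block from the free list */
            if (prev != NULL)
                prev->s.next = p->s.next;
            else
                free_list = p->s.next;
        } else {
            /* split: the front stays free, the tail is handed out */
            p->s.units -= units;
            p += p->s.units;
            p->s.units = units;
        }
        return (void *)(p + 1);   /* payload follows the header */
    }
    return NULL;                  /* no block was big enough */
}

void my_free(void *ptr)
{
    header *b;

    if (ptr == NULL)
        return;
    b = (header *)ptr - 1;        /* step back to the block's header */
    b->s.next = free_list;        /* push on the front; no coalescing here */
    free_list = b;
}
```

Because freed blocks are never merged with their neighbours, this version fragments quickly. Coalescing would require keeping the free list sorted by address or adding boundary tags.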
I wanted to create a generic Linked List in C. Following is the structure of the node:
typedef struct node {
void *value;
int size; // n bytes
int index; // index of the node
struct node *next;
} Node;
And my delete_node function is as following. The search function sends a pointer to the Node I want to delete.
Node *search_list(Node *list, void *data, int n_bytes);
int delete_node(Node *list, Node *to_be_deleted); // returns 1 on success
Inside the delete_node function I want to free up the memory pointed by void *value and then free up the memory allocated for the Node itself.
free(to_be_deleted->value); // Would this work??
free(to_be_deleted);
Since it is a void pointer, we don't know how many bytes the object it points to occupies. How can we free that memory?
Sorry if this is a stupid question.
The memory allocator keeps track of how large memory allocations are on its own -- there's no need to tell free() how much memory to free.
As such, as far as freeing is concerned, you don't need size or n_bytes.
free(to_be_deleted->value); // Would this work??
Straightforward answer: yes, this will work.
Look at the declarations of free() and malloc():
void free(void *ptr); // free takes a void * argument, so it will work
void *malloc(size_t size);
With malloc we have to pass the size, i.e. how many bytes we want to allocate.
With free we just pass the pointer, and however many bytes were allocated for that pointer on the heap are freed.
Yes, what you wrote should work. The reason is that malloc (which is a library call) creates metadata that is used to determine which parts of memory are free and which ones are taken. When you call free(), you are actually only modifying this metadata such that subsequent calls to malloc know that this memory can be re-used (note that most implementations will not zero the existing data).
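A minimal sketch of a generic node that is created and destroyed this way could look like the following. The helper names node_create and node_destroy are made up, the node is reduced to the fields needed here, and n_bytes is assumed to be greater than zero.

```c
#include <stdlib.h>
#include <string.h>

typedef struct node {
    void *value;
    struct node *next;
} Node;

/* Copies n_bytes from data into a freshly allocated node.
 * Returns NULL if either allocation fails. Assumes n_bytes > 0. */
Node *node_create(const void *data, size_t n_bytes)
{
    Node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;

    n->value = malloc(n_bytes);
    if (n->value == NULL) {
        free(n);
        return NULL;
    }
    memcpy(n->value, data, n_bytes);
    n->next = NULL;
    return n;
}

/* Frees the payload first, then the node itself. */
void node_destroy(Node *n)
{
    if (n == NULL)
        return;
    free(n->value);   /* no size argument: the allocator remembers it */
    free(n);
}
```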
I'm studying Data Structures, and I'm not getting why stacks and queues need to be declared like:
struct stack *Stack;
(forget about the struct syntax)
I mean, why is it always declared as a pointer?
They are not always declared like that!
In general, declaring a variable as a pointer is useful for later allocating it dynamically. This can be due to a couple of reasons:
The variable is too big for the program stack
You want to return that variable from a function
In your case, let's think of two different implementations of stack:
struct stack
{
void *stuff[10000];
int size;
};
This is a terrible implementation, but assuming you have one like this, then you'd most probably not want to put it on the program stack.
Alternatively, if you have:
struct stack
{
void **stuff;
int size;
int mem_size;
};
You dynamically change the size of stuff anyway, so there is absolutely no harm in declaring a variable of type struct stack on the program stack, i.e. like this:
struct stack stack;
Unless you want to return it from a function. For example:
struct stack *make_stack(int initial_size)
{
struct stack *s;
s = malloc(sizeof(*s));
if (s == NULL)
goto exit_no_mem;
if (initial_size == 0)
initial_size = 1;
s->stuff = malloc(initial_size * sizeof(*s->stuff));
if (s->stuff == NULL)
goto exit_no_stuff_mem;
s->size = 0;
s->mem_size = initial_size;
return s;
exit_no_stuff_mem:
free(s);
exit_no_mem:
return NULL;
}
Personally, though, I would have declared the function like this:
int make_stack(struct stack *s, int initial_size);
and allocate the struct stack on the program stack.
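One way that second signature might be implemented is sketched below. The return convention (0 on success, -1 on failure) and the companion helpers push_item and free_stack are assumptions added for illustration.

```c
#include <stdlib.h>

struct stack
{
    void **stuff;
    int size;
    int mem_size;
};

/* Initialises a caller-provided stack.
 * Returns 0 on success, -1 on failure (an assumed convention). */
int make_stack(struct stack *s, int initial_size)
{
    if (initial_size <= 0)
        initial_size = 1;
    s->stuff = malloc(initial_size * sizeof(*s->stuff));
    if (s->stuff == NULL)
        return -1;
    s->size = 0;
    s->mem_size = initial_size;
    return 0;
}

/* Hypothetical companion: pushes an item, doubling the storage when full. */
int push_item(struct stack *s, void *item)
{
    if (s->size == s->mem_size) {
        int new_cap = s->mem_size * 2;
        void **grown = realloc(s->stuff, new_cap * sizeof(*grown));
        if (grown == NULL)
            return -1;          /* the old storage is still valid */
        s->stuff = grown;
        s->mem_size = new_cap;
    }
    s->stuff[s->size++] = item;
    return 0;
}

/* Hypothetical companion: releases the storage but not the struct itself. */
void free_stack(struct stack *s)
{
    free(s->stuff);
    s->stuff = NULL;
    s->size = 0;
    s->mem_size = 0;
}
```

The struct stack itself can then be an ordinary local variable; only the growing stuff array lives on the heap.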
It depends on how your stack structure is defined (not just the layout of the struct, but the operations that manipulate it as well).
It's entirely possible to define a stack as a simple array and index, such as
struct stack_ {
T data[N]; // for some type T and size N
size_t stackptr; // Nobody caught that error, so it never existed, right? ;-)
} stack;
stack.stackptr = N; // stack grows towards 0
// push operation
if (stack.stackptr)
stack.data[--stack.stackptr] = some_data();
else
// overflow
// pop operation
if (stack.stackptr < N)
x = stack.data[stack.stackptr++];
else
// underflow
However, fixed-sized arrays are limiting. One easy method of implementing a stack is to use a list structure:
struct stack_elem {
T data;
struct stack_elem *next;
};
The idea is that the head of the list is the top of the stack. Pushing an item onto the stack adds an element at the head of the list; popping an item removes that element from the head of the list:
int push(struct stack_elem **stack, T data)
{
struct stack_elem *s = malloc(sizeof *s);
if (s)
{
s->data = data; // new element gets data
s->next = *stack; // set new element to point to current stack head
*stack = s; // new element becomes new stack head
}
return s != NULL;
}
int pop(struct stack_elem **stack, T *data)
{
int stackempty = (*stack == NULL);
if (!stackempty)
{
struct stack_elem *s = *stack; // retrieve the current stack head
*stack = (*stack)->next; // set stack head to point to next element
*data = s->data; // get the data
free(s); // deallocate the element
}
return !stackempty; // nonzero on success, 0 if the stack was empty
}
int main(void)
{
struct stack_elem *mystack = NULL; // stack is initially empty
T value;
...
if (!push(&mystack, some_data()))
// handle overflow
...
if (!pop(&mystack, &value))
// handle underflow
...
}
Since push and pop need to be able to write new pointer values to mystack, we need to pass a pointer to it, hence the double indirection for stack in push and pop.
No, they don't have to be declared as pointers.
One can as well allocate stacks and queues as global variables:
struct myHash { int key; int next_idx; int data[4]; } mainTable[65536];
struct myHash duplicates[65536*10];
int stack[16384];
myHash also includes a linked list for duplicate entries using indices.
But as stated in the comments, if one has to add more elements to the structures than was initially planned, then pointers come in handy.
An additional reason to declare structures as pointers is that, through a pointer, one can typically access the complete structure as a whole, any individual element of the structure, or some subset of the elements. That makes the syntax more versatile. Also, when the structure is passed as a parameter to some external function, a pointer is usually the practical choice, since it avoids copying the structure and lets the function modify the original.
There's really no need to implement stacks and queues with pointers; others have already stated this fact clearly. Look at John Bode's answer for how a stack can be perfectly implemented using arrays. The thing is that modelling certain data structures (such as stacks, queues and linked lists) using pointers allows you to program them in a very efficient way in terms of both execution speed and memory consumption.
Usually an underlying array for holding a data structure is a very good implementation choice if your use cases require frequent random access to an element in the structure, given its positional index (this is FAST with an array). However, growing the structure past its initial capacity can be expensive AND you waste memory on the unused elements in the array. Insertion and deletion operations can also be very expensive, since you may need to rearrange elements to either compact the structure or make space for the new elements.
Since a queue and a stack don't have this random-access requirement and you don't insert or delete elements in the middle of them, it is a better implementation choice to dynamically allocate each individual element "on the fly", requesting memory when a new element is required (this is what malloc does), and freeing it when an element is deleted. This is fast and will consume no more memory than is actually needed by your data structure.
As already pointed out, it depends on how big the struct is.
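For the queue case, allocating each element on the fly might look like the sketch below. The type and function names are made up for illustration; each function returns 1 on success and 0 on failure.

```c
#include <stdlib.h>

struct qnode {
    int data;
    struct qnode *next;
};

struct queue {
    struct qnode *head;   /* next element to leave the queue */
    struct qnode *tail;   /* most recently added element */
};

/* Adds an element at the tail; memory is requested only now. */
int enqueue(struct queue *q, int data)
{
    struct qnode *n = malloc(sizeof *n);
    if (n == NULL)
        return 0;
    n->data = data;
    n->next = NULL;
    if (q->tail != NULL)
        q->tail->next = n;
    else
        q->head = n;          /* the queue was empty */
    q->tail = n;
    return 1;
}

/* Removes the element at the head and frees its node right away. */
int dequeue(struct queue *q, int *data)
{
    struct qnode *n = q->head;
    if (n == NULL)
        return 0;             /* the queue is empty */
    *data = n->data;
    q->head = n->next;
    if (q->head == NULL)
        q->tail = NULL;       /* the queue became empty */
    free(n);
    return 1;
}
```

An empty queue is simply a struct queue with both pointers set to NULL.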
Another reason is encapsulation. The stack implementation might not expose the definition of struct stack in its header file. This hides the implementation detail from the user, with the downside that free store allocation is required.
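A sketch of this opaque-pointer style follows, with the header part and the implementation part shown in one listing for brevity. The file name stack.h and all function names are hypothetical.

```c
#include <stdlib.h>

/* ---- what a header (say, a hypothetical stack.h) would expose ---- */
struct stack;                                  /* incomplete type: layout hidden */
struct stack *stack_create(void);
int  stack_push(struct stack *s, int value);   /* 1 on success, 0 on failure */
int  stack_pop(struct stack *s, int *value);   /* 1 on success, 0 if empty */
void stack_destroy(struct stack *s);

/* ---- what the implementation file would keep private ---- */
struct stack {
    int *items;
    size_t count;
    size_t capacity;
};

struct stack *stack_create(void)
{
    /* Users cannot declare a struct stack themselves, since they do not
     * know its size, so it has to come from the free store. */
    struct stack *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->items = NULL;
    s->count = 0;
    s->capacity = 0;
    return s;
}

int stack_push(struct stack *s, int value)
{
    if (s->count == s->capacity) {
        size_t cap = s->capacity ? s->capacity * 2 : 4;
        int *grown = realloc(s->items, cap * sizeof *grown);
        if (grown == NULL)
            return 0;
        s->items = grown;
        s->capacity = cap;
    }
    s->items[s->count++] = value;
    return 1;
}

int stack_pop(struct stack *s, int *value)
{
    if (s->count == 0)
        return 0;
    *value = s->items[--s->count];
    return 1;
}

void stack_destroy(struct stack *s)
{
    if (s == NULL)
        return;
    free(s->items);
    free(s);
}
```

User code only ever holds a struct stack *, so the fields can change later without touching any caller.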
If I have a snippet of my program like this:
struct Node *node;
while(...){
node = malloc(100);
//do stuff with node
}
This means that every time I loop through the while loop I newly allocate 100 bytes that is pointed to by the node pointer right?
If this is true, then how do I free up all the memory that I have made with all the loops if I only have a pointer left pointing to the last malloc that happened?
Thanks!
Please allocate exactly the size you need: malloc(sizeof *node); -- if you move to a 64-bit platform that doubles the size of all your members, your old 96-byte structure might take 192 bytes in the new environment.
If you don't have any pointers to any of the struct Nodes you have created, then I don't think you should be allocating them with malloc(3) in the first place. malloc(3) is best if your application requires the data to persist outside the calling scope of the current function. I expect that you could re-write your function like this:
struct Node node;
while(...){
//do stuff with node
}
or
while(...){
struct Node node;
//do stuff with node
}
depending if you want access to the last node (the first version) or not (the second version).
Of course, if you actually need those structures outside this piece of code, then you need to store references to them somewhere. Add them to a global list keeping track of struct Node objects, or add each one to the next pointer of the previous struct Node, or add each one to a corresponding struct User that refers to them, whatever is best for your application.
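As one possible way to keep such references, the sketch below chains every node allocated in the loop to the one allocated before it, so that a single head pointer is enough to free them all later. The struct layout and the names build_nodes and free_nodes are assumptions made for this example.

```c
#include <stdlib.h>

struct Node {
    int id;
    struct Node *next;
};

/* Allocates up to `count` nodes in a loop. Each new node points at the
 * one allocated before it, so no allocation is ever lost. */
struct Node *build_nodes(int count)
{
    struct Node *head = NULL;
    int i;

    for (i = 0; i < count; i++) {
        struct Node *node = malloc(sizeof *node);
        if (node == NULL)
            break;            /* keep the nodes allocated so far */
        node->id = i;
        node->next = head;    /* remember the previous allocation */
        head = node;
    }
    return head;
}

/* Frees every node reachable from head and returns how many were released. */
int free_nodes(struct Node *head)
{
    int freed = 0;

    while (head != NULL) {
        struct Node *next = head->next;   /* read before freeing */
        free(head);
        head = next;
        freed++;
    }
    return freed;
}
```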
If you set node = NULL before the loop and then use free(node) before node = malloc(100) you should be OK. You will also need to do a free(node) after the loop exits. But then again, it all depends on what "//do stuff with node" actually does. As others have pointed out, malloc(100) is not a good idea. What I would use is malloc(sizeof(*node)). That way, if the type of node changes, you don't have to change the malloc line.
If you don't need the malloc'ed space at the end of one iteration anymore, you should free it right away.
To keep track of the allocated nodes you could save them in a dynamically growing list:
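#include <stdlib.h>
int main(void) {
    int i;
    void *node;
    int ptr_len = 0;
    void **ptrs = NULL;
    for (i = 0; i < 10; i++) {
        void **grown;
        node = malloc(100);
        if (node == NULL)
            break;
        /* use a temporary so ptrs is not lost if realloc fails */
        grown = realloc(ptrs, sizeof(void*) * (ptr_len + 1));
        if (grown == NULL) {
            free(node);
            break;
        }
        ptrs = grown;
        ptrs[ptr_len++] = node;
        /* code */
    }
    for (i = 0; i < ptr_len; i++) {
        free(ptrs[i]);
    }
    free(ptrs);
    return 0;
}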
Note: You should probably re-think your algorithm if you need to employ such methods!
Otherwise see sarnold's answer.
then how do I free up all the memory that I have made with all the loops if I only have a pointer left pointing to the last malloc that happened?
You can't. You just created a giant memory leak.
You have to keep track of every chunk of memory you malloc() and free() it when you're done using it.
You cannot. You need to store all the pointers in order to free the memory. Only if you are saving those pointers somewhere can you free the memory.