Examples or documentation for custom storage allocator in c?


I allocate a big region of memory, let's say x of 1000 bytes.
// I am using c language and all of this is just pseudo code(function prototypes mostly) so far.
pointer = malloc( size(1000 units) ); // this pointer points to region of memory we created.
now we select this region by a pointer and allocate memory inside it to smaller blocks like
void *allocate_from_region( size_of_block1(300) ); //1000-300=700 (left free)
void *allocate_from_region( size_of_block2(100) ); //700-100 =600
void *allocate_from_region( size_of_block3(300) ); //600-300 =300
void *allocate_from_region( size_of_block4(100) ); //300-100 =200
void *allocate_from_region( size_of_block5(150) ); //200-150 =50
// here we almost finished space we have in region (only 50 is left free in region)
boolean free_from_region(pointer_to_block2); //free 100 more
//total free = 100+50 but are not contiguous in memory
void *allocate_from_region( size_of_block6(150) ); // this one will fail and give NULL, as it can't find 150 contiguous units of memory in the region.
boolean free_from_region(pointer_to_block3); // this frees 300 more, so total free = 100+300+50, but contiguous free is 100+300 (from blocks 2 and 3)
void *allocate_from_region( size_of_block6(150) ); // this time it is successful
Are there any examples about how to manage memory like this?
So far I have only done examples where I allocate blocks next to each other in a region of memory and stop once I run out of memory inside the region.
But how do I search for blocks which are free inside the region and then check whether enough contiguous memory is available?
I am sure there must be some documentation or examples in C that show how to do it.

Sure. What you are proposing is more-or-less exactly what some malloc implementations do. They maintain a "free list". Initially the single large block is on this list. When you make a request, the algorithm to allocate n bytes is:
Search the free list to find a block at B of size m >= n.
Remove B from the free list.
Return the block from B+n through B+m-1 (size m-n) to the free list (unless m-n==0)
Return a pointer to B
To free a block at B of size n, we must put it back on the free list. However this isn't the end. We must also "coalesce" it with adjacent free blocks, if any, either above or below or both. This is the algorithm.
Let p = B; m = n; // pointer to base and size of block to be freed
If there is a block of size x on the free list and at the address B + n,
remove it, set m=m+x. // coalescing block above
If there is a block of size y on the free list and at address B - y,
remove it and set p=B-y; m=m+y; // coalescing block below
Return block at p of size m to the free list.
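The allocate and free steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not how any particular malloc is written: the free list is a small array of (offset, length) spans sorted by address, kept outside the region so that the sizes match the question's arithmetic, and the caller passes the block size to region_free. The names region, span, region_alloc, region_free and the MAX_SPANS cap are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define REGION_SIZE 1000
#define MAX_SPANS   32              /* assumed cap on free-list entries */

struct span { size_t off, len; };   /* one free block: bytes [off, off+len) */

struct region {
    unsigned char mem[REGION_SIZE];
    struct span   free_list[MAX_SPANS];   /* kept sorted by offset */
    size_t        nfree;
};

static void remove_span(struct region *r, size_t i) {
    for (size_t j = i; j + 1 < r->nfree; j++)
        r->free_list[j] = r->free_list[j + 1];
    r->nfree--;
}

void region_init(struct region *r) {
    r->free_list[0].off = 0;              /* initially one big free block */
    r->free_list[0].len = REGION_SIZE;
    r->nfree = 1;
}

/* First fit: carve n bytes off the front of the first span with len >= n. */
void *region_alloc(struct region *r, size_t n) {
    if (n == 0) return NULL;
    for (size_t i = 0; i < r->nfree; i++) {
        struct span *s = &r->free_list[i];
        if (s->len < n) continue;
        void *p = r->mem + s->off;
        s->off += n;                      /* remainder of size m-n stays free */
        s->len -= n;
        if (s->len == 0) remove_span(r, i);   /* exact fit: m-n == 0 */
        return p;
    }
    return NULL;                          /* no contiguous span is big enough */
}

/* Free n bytes at p, coalescing with free neighbours. Returns 1 on success,
   0 if the span table is full. p must point into r->mem. */
int region_free(struct region *r, void *p, size_t n) {
    size_t off = (size_t)((unsigned char *)p - r->mem);
    assert(n > 0 && n <= REGION_SIZE && off <= REGION_SIZE - n);
    size_t i = 0;                         /* index of first span above the block */
    while (i < r->nfree && r->free_list[i].off < off) i++;
    struct span *lo = i > 0 ? &r->free_list[i - 1] : NULL;
    struct span *hi = i < r->nfree ? &r->free_list[i] : NULL;
    int below = lo && lo->off + lo->len == off;
    int above = hi && off + n == hi->off;
    if (below && above) {                 /* merge all three into the lower span */
        lo->len += n + hi->len;
        remove_span(r, i);
    } else if (below) {
        lo->len += n;
    } else if (above) {
        hi->off = off;
        hi->len += n;
    } else {                              /* no free neighbour: insert a new span */
        if (r->nfree == MAX_SPANS) return 0;
        for (size_t j = r->nfree; j > i; j--)
            r->free_list[j] = r->free_list[j - 1];
        r->free_list[i].off = off;
        r->free_list[i].len = n;
        r->nfree++;
    }
    return 1;
}
```

Running the question's sequence (300, 100, 300, 100, 150, free block 2, try 150, free block 3, try 150) against this sketch gives NULL for the first attempt at block 6 and a valid pointer for the second. Note that the returned pointers are not aligned for arbitrary types; a real allocator would round sizes up.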
The remaining question is how to set up the free list to make it quick to find blocks of the right size during allocation and to find adjacent blocks for coalescing during free operations. The simplest way is a singly linked list. But there are many possible alternatives that can yield better speed, usually at some cost of additional space for data structures.
Additionally there is the choice of which block to allocate when more than one is big enough. The usual choices are "first fit" and "best fit". For first fit, just take the first one discovered. Often the best technique is (rather than starting at the lowest addresses every time) to remember the free block after the one just allocated and use this as a starting point for the next search. This is called "rotating first fit."
For best fit, traverse as many blocks as necessary to find the one that most closely matches the size requested.
If allocations are random, first fit actually performs a bit better than best fit in terms of memory fragmentation. Fragmentation is the bane of all non-compacting allocators.
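To make the difference between the two policies concrete, here is a small sketch that picks a block from an array of free-block sizes. The function names first_fit and best_fit and the array representation are assumptions for illustration; a real allocator would walk its own free-list structure instead.

```c
#include <assert.h>
#include <stddef.h>

/* Index of the first block with size >= n, or -1 if none fits. */
int first_fit(const size_t *sizes, int count, size_t n) {
    assert(sizes != NULL && count >= 0);
    for (int i = 0; i < count; i++)
        if (sizes[i] >= n)
            return i;
    return -1;
}

/* Index of the smallest block with size >= n, or -1 if none fits. */
int best_fit(const size_t *sizes, int count, size_t n) {
    assert(sizes != NULL && count >= 0);
    int best = -1;
    for (int i = 0; i < count; i++)
        if (sizes[i] >= n && (best < 0 || sizes[i] < sizes[best]))
            best = i;
    return best;
}
```

With free blocks of sizes 400, 120, 100 and 50 and a request for 100 bytes, first fit stops at the 400-byte block while best fit scans the whole list and picks the 100-byte one.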

Related

What happens with the memory when decrease array with realloc?

I was wondering what happens with the memory when you shrink your array by one element with realloc. According to everything that I've read about realloc, I suppose the pointer still points at the same place in memory (there's no need for the function to seek another block of memory, as that one is available and sufficient); tell me if I'm wrong. My question is: is the removed piece of the array freed (as with free()), or do the values stay untouched while that piece of memory becomes available for future calls to malloc, calloc, etc.?
EDIT:
I have one more question. Does this function work properly? It should delete an element of the array by overwriting it with the next element and repeating this over the rest of the array, so that the last element becomes the same as the one before it, and then the last one is removed. PicCounter is the number of pictures already uploaded to the program. Check this out:
int DeletePicture(struct Picture **tab, int *PicCounter)
{
    int PicToDelete;
    printf("Enter the number of pic to delete ");
    scanf("%d", &PicToDelete);
    /* Shift every later element down by one; the bound must be
       *PicCounter - 1 so that the last element is moved as well. */
    for (int i = PicToDelete - 1; i < (*PicCounter) - 1; i++)
    {
        (*tab)[i] = (*tab)[i + 1];
    }
    struct Picture *temp;
    temp = realloc(*tab, ((*PicCounter) - 1) * sizeof(*temp));
    if (temp != NULL)
    {
        *tab = temp;
        //That doesn't delete the element, because in main I can still print it
        //like e.g. tab[lastelement].
        (*PicCounter)--;
        printf("Picture has been deleted\n");
        return 0;
    }
    else
    {
        printf("Memory reallocation error\n");
        return 1;
    }
}

Regarding void *realloc(void *ptr, size_t size), the C standard says in C 2018 7.22.3.5 paragraphs 2 and 3:
The realloc function deallocates the old object pointed to by ptr and returns a pointer to a new object that has the size specified by size. The contents of the new object shall be the same as that of the old object prior to deallocation, up to the lesser of the new and old sizes. Any bytes in the new object beyond the size of the old object have indeterminate values.
If ptr is a null pointer, the realloc function behaves like the malloc function for the specified size. Otherwise, if ptr does not match a pointer earlier returned by a memory management function, or if the space has been deallocated by a call to the free or realloc function, the behavior is undefined. If size is nonzero and memory for the new object is not allocated, the old object is not deallocated. If size is zero and memory for the new object is not allocated, it is implementation-defined whether the old object is deallocated. If the old object is not deallocated, its value shall be unchanged.
What this means when you ask to reduce the size of a previously allocated object is:
The returned pointer might or might not be the same as the original pointer. (See discussion below.)
The C standard permits the portion of memory that is released to be reused for other allocations. Whether or not it is reused is up to the C implementation.
Whether the values in the released portion of memory are immediately overwritten or not is not specified by the C standard. Certainly the user of realloc may not rely on any behavior regarding that memory.
Discussion
When a memory allocation is reduced, it certainly seems “easy” for the memory allocation routines to simply return the same pointer while remembering that the released memory is free. However, memory allocation systems are fairly complex, so other factors may be involved. For example, hypothetically:
To support many small allocations without much overhead, a memory allocation system might create a pool of memory for one-to-four byte allocations, another pool for five-to-eight, another pool for nine-to-16, and a general pool for larger sizes. For the larger sizes, it might remember each allocation individually, customizing its size and managing them all with various data structures. For the smaller sizes, it might keep little more than a bitmap for each, with each bit indicating whether or not its corresponding four-byte (or eight or 16) region is allocated. In such a system, if you release eight bytes of a 16-byte allocation, the memory allocation software might move the data to something in the eight-byte pool.
In any memory allocation system, if you release just a few bytes at the end of an allocation, it might not be enough bytes to take advantage of—the data structures required to track the few bytes you released might be bigger than the few bytes. So it is not worthwhile to make them available for reuse. The memory allocation system just keeps them with the block, although it may remember the data in the block is actually a bit smaller than the space reserved for it.
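Because the returned pointer might differ from the original, a shrinking realloc is usually written through a temporary pointer. The following is a minimal sketch of that pattern; the helper name shrink_ints is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Shrink *arr from old_count to new_count ints.
   Returns 0 on success; returns 1 and leaves *arr untouched on failure. */
int shrink_ints(int **arr, size_t old_count, size_t new_count) {
    assert(arr != NULL && new_count > 0 && new_count <= old_count);
    int *tmp = realloc(*arr, new_count * sizeof **arr);
    if (tmp == NULL)
        return 1;       /* the old block is still valid and unchanged */
    *arr = tmp;         /* may or may not be the same address as before */
    return 0;
}
```

After a successful call, only the first new_count elements may be accessed. Reading the element that was cut off is undefined behavior, even if it happens to print the old value on a given system.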

malloc storing its metadata

Surprisingly, both programs gave the same difference between the two pointers, even though the data types were different.
How exactly malloc stores its metadata is what I was trying to find out with this little experiment.
Program 1 :
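#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char *i, *j;
    i = malloc(sizeof(char));
    j = malloc(sizeof(char));
    /* %x with a pointer argument is undefined; print via uintptr_t instead */
    printf("%" PRIxPTR "\n", (uintptr_t)i);
    printf("%" PRIxPTR "\n", (uintptr_t)j);
    return 0;
}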
Output :
710010
710030
Program 2 :
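#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int *i, *j;
    i = malloc(sizeof(int));
    j = malloc(sizeof(int));
    /* %x with a pointer argument is undefined; print via uintptr_t instead */
    printf("%" PRIxPTR "\n", (uintptr_t)i);
    printf("%" PRIxPTR "\n", (uintptr_t)j);
    return 0;
}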
Output :
16b8010
16b8030
What i had in mind before this program :
| meta data of i | memory space of i | meta data of j | memory space of j |
but the results don't support the theory....

malloc "rounds up" allocations to a convenient size set at compile time for the library. This causes subsequent allocations and deallocations to fragment memory less than if allocations were created to exactly match requests.
Where malloc stores its metadata is not actually why the values for both are 0x20 "apart". But you can read up on one method of implementing malloc (and friends) here; see especially slides 16 and 28.
Imagine the case of a string manipulation program, where lots of different-sized allocations occur in "random" order. Tiny "left over" chunks would quickly develop, leaving totally useless bytes of memory spread out between the used chunks. malloc prevents this by satisfying all memory requests in multiples of some minimum size (apparently 0x20 in this case). (OK, technically, if you request 0x1E bytes, there will be 2 bytes of "wasted" space left over and unused after your request, since malloc allocates 0x20 bytes instead of 0x1E. BUT there will never be a 2-byte fragment left over, which is really good because the metadata for malloc is definitely bigger than 2 bytes, so there would be no way to even keep track of those bytes.)
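The rounding described here can be sketched in one line of arithmetic. The function name round_up is hypothetical, and the 0x20 granule is only the value observed in the question's output; real allocators typically also add room for a header and enforce a minimum chunk size, so this shows the rounding idea only.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of granule (granule must be a power of two). */
size_t round_up(size_t n, size_t granule) {
    assert(granule != 0 && (granule & (granule - 1)) == 0);
    return (n + granule - 1) & ~(granule - 1);
}
```

With a granule of 0x20, requests of 1 byte, 0x1E bytes and 0x20 bytes all consume 0x20 bytes, while a request of 0x21 bytes consumes 0x40.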

Rather than allocating from a compiled-in fixed-size array, malloc will request space from the operating system as needed. Since other activities in the program may also request space without calling this allocator, the space that malloc manages may not be contiguous. Thus its free storage is kept as a list of free blocks. Each block contains a size, a pointer to the next block, and the space itself. The blocks are kept in order of increasing storage address, and the last block (highest address) points to the first.
When a request is made, the free list is scanned until a big-enough block is found. This algorithm is called first fit, by contrast with best fit, which looks for the smallest block that will satisfy the request. If the block is exactly the size requested it is unlinked from the list and returned to the user. If the block is too big, it is split, and the proper amount is returned to the user while the residue remains on the free list. If no big-enough block is found, another large chunk is obtained from the operating system and linked into the free list.

malloc normally uses a pool of memory and "meta data" is held in the pool not "in between" the chunks of memory allocated.

Freeing portions of dynamically allocated blocks?

I was curious whether there exists a dynamic memory allocation system that allows the programmer to free part of an allocated block.
For example:
char* a = malloc (40);
//b points to the split second half of the block, or to NULL if it's beyond the end
//a points to a area of 10 bytes
b = partial_free (a+10, /*size*/ 10)
Thoughts on why this is wise/unwise/difficult? Ways to do this?
Seems to me like it could be useful.
Thanks!
=====edit=====
after some research, it seems that the bootmem allocator for the linux kernel allows something similar to this operation with the bootmem_free call. So, I'm curious -- why is it that the bootmem allocator allows this, but ANSI C does not?

No, there is no such function which allows partial freeing of memory.
You could however use realloc() to resize memory.
From the c standard:
7.22.3.5 The realloc function
#include <stdlib.h>
void *realloc(void *ptr, size_t size);
The realloc function deallocates the old object pointed to by ptr and returns a pointer to a new object that has the size specified by size. The contents of the new object shall be the same as that of the old object prior to deallocation, up to the lesser of the new and old sizes. Any bytes in the new object beyond the size of the old object have indeterminate values.

There is no ready-made function for this, but doing this isn't impossible. Firstly, there is realloc() . realloc takes a pointer to a block of memory and resizes the allocation to the size specified.
Now, if you have allocated some memory:
char * tmp = malloc(2048);
and you intend to give back 1 K of it, you may do:
tmp = realloc(tmp, 2048-1024);
Note that realloc always keeps the leading bytes, so this releases the last 1 K of the block, not the first. The other problem is that you cannot be certain that tmp will remain unchanged, since the function might just allocate a new block elsewhere, copy the data, and free the entire original 2 K.
Now I'm not sure about the exact implementation of realloc, but from what I understand, the code:
myptr = realloc(myptr, x - y);
may behave as if it mallocs a new memory buffer of size x-y, copies the bytes that fit using memcpy, and finally frees the originally allocated memory.
This may create some potential problems. For example, the newly reallocated memory may be located at a different address, so any older pointers into the block may become invalid, resulting in undefined behavior, segmentation faults and general debugging hell. So I would try to avoid resorting to this.
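Since realloc keeps the leading bytes of a block, releasing the first part of an allocation means moving the wanted bytes to the front before shrinking. A minimal sketch, with the hypothetical helper name drop_front:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Discard the first `drop` bytes of a `size`-byte heap block.
   Returns the block holding the remaining size - drop bytes; it may have
   moved. If realloc fails, the original block (data already shifted to the
   front, block not shrunk) is returned instead. */
void *drop_front(void *block, size_t size, size_t drop) {
    assert(block != NULL && drop < size);
    unsigned char *p = block;
    memmove(p, p + drop, size - drop);      /* slide the tail down to the start */
    void *tmp = realloc(p, size - drop);    /* then release the unused end */
    return tmp != NULL ? tmp : p;
}
```

The caller must replace every saved pointer into the old block with one derived from the returned pointer, which is exactly the invalidation problem described above.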

Firstly, I cannot think of any situation where you would be likely to need such a thing (when there exists realloc to increase/decrease the memory as mentioned in the answers).
I would like to add another thing. In the implementations of the malloc subsystem that I have seen (which I admit is not a lot), malloc and free depend on what are called prefix byte(s). Whatever address malloc returns to you, the malloc subsystem internally allocates some additional byte(s) of memory just before that address to store sanity-check information, which includes the number of allocated bytes and possibly which allocation policy you use (if your OS supports multiple memory allocation policies), etc. When you call free on an address, the malloc subsystem peeks back into the prefix bytes to sanity-check them, and only if it finds the prefix in place does the free succeed. Therefore, it will not allow you to free some number of bytes starting in the middle of a block.
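The prefix idea can be illustrated with a thin wrapper around malloc. This is only a sketch of the general technique, not the layout of any real malloc: the names tagged_malloc, tagged_size and tagged_free, the header fields, and the magic value are all made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define HDR_MAGIC 0xA110CA7Eu   /* arbitrary sanity-check value */

/* Hypothetical header stored just before each user pointer.
   The union pads it so that the user data stays maximally aligned. */
union hdr {
    struct { size_t size; unsigned magic; } info;
    max_align_t align;
};

void *tagged_malloc(size_t n) {
    if (n > SIZE_MAX - sizeof(union hdr))
        return NULL;                      /* header + n would overflow */
    union hdr *h = malloc(sizeof *h + n);
    if (h == NULL)
        return NULL;
    h->info.size = n;
    h->info.magic = HDR_MAGIC;
    return h + 1;                         /* user data starts after the header */
}

/* Recover the requested size by stepping back to the header. */
size_t tagged_size(const void *p) {
    const union hdr *h = (const union hdr *)p - 1;
    assert(h->info.magic == HDR_MAGIC);
    return h->info.size;
}

void tagged_free(void *p) {
    if (p == NULL)
        return;
    union hdr *h = (union hdr *)p - 1;
    /* A pointer into the middle of a block has no header in front of it,
       so this check would normally fail for it. */
    assert(h->info.magic == HDR_MAGIC);
    h->info.magic = 0;
    free(h);
}
```

This shows why freeing from the middle of a block cannot work in such a design: the bytes just before an interior address are user data, not a header.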

how is dynamic memory allocation better than array?

int *numbers;
numbers = malloc ( sizeof(int) * 10 );
I want to know how this is dynamic memory allocation if I can store just 10 int items in the memory block. I could just use an array and store elements dynamically using an index. Why is the above approach better?
I am new to C, and this is my 2nd day and I may sound stupid, so please bear with me.

In this case you could replace 10 with a variable that is assigned at run time. That way you can decide how much memory space you need. But with arrays, you have to specify an integer constant in the declaration. So you cannot tell whether the user will actually need as many locations as were declared or, even worse, whether they will be enough.
With a dynamic allocation like this, you could assign a larger memory location and copy the contents of the first location to the new one to give the impression that the array has grown as needed.
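A common way to get this "growing array" effect is to let realloc do the allocate-and-copy step. The sketch below assumes a small hypothetical struct int_vec and a doubling growth policy; neither comes from the question.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable array of int: len elements in use, cap allocated. */
struct int_vec { int *data; size_t len, cap; };

/* Append v, doubling the capacity when the array is full.
   Returns 0 on success, 1 on allocation failure (vector left unchanged). */
int int_vec_push(struct int_vec *vec, int v) {
    assert(vec != NULL && vec->len <= vec->cap);
    if (vec->len == vec->cap) {
        size_t new_cap = vec->cap ? vec->cap * 2 : 4;
        int *tmp = realloc(vec->data, new_cap * sizeof *tmp);
        if (tmp == NULL)
            return 1;
        vec->data = tmp;        /* realloc copied the old contents if it moved */
        vec->cap = new_cap;
    }
    vec->data[vec->len++] = v;
    return 0;
}
```

Start from a zero-initialized struct (data set to NULL, len and cap set to 0), push as many values as needed, and call free(vec.data) when done.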
This helps to ensure optimum memory utilization.

The main reason why malloc() is useful is not because the size of the array can be determined at runtime - modern versions of C allow that with normal arrays too. There are two reasons:
Objects allocated with malloc() have flexible lifetimes;
That is, you get runtime control over when to create the object, and when to destroy it. The array allocated with malloc() exists from the time of the malloc() call until the corresponding free() call; in contrast, declared arrays either exist until the function they're declared in exits, or until the program finishes.
malloc() reports failure, allowing the program to handle it in a graceful way.
On a failure to allocate the requested memory, malloc() can return NULL, which allows your program to detect and handle the condition. There is no such mechanism for declared arrays - on a failure to allocate sufficient space, either the program crashes at runtime, or fails to load altogether.
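Both points can be seen in a few lines. In this sketch (the function name make_squares is made up), the array outlives the function that created it, and an allocation failure is reported to the caller instead of crashing.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns a heap array holding 0*0, 1*1, ..., (n-1)*(n-1),
   or NULL if the allocation fails. The caller owns it and must free() it. */
int *make_squares(size_t n) {
    assert(n > 0);
    int *a = malloc(n * sizeof *a);
    if (a == NULL)
        return NULL;            /* failure is reported, not fatal */
    for (size_t i = 0; i < n; i++)
        a[i] = (int)(i * i);
    return a;                   /* still valid after this function returns */
}
```

A caller checks the result against NULL, uses the array for as long as it likes, and then calls free on it. A local array declared inside make_squares could not be returned this way.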

There is a difference with where the memory is allocated. Using the array syntax, the memory is allocated on the stack (assuming you are in a function), while malloc'ed arrays/bytes are allocated on the heap.
/* Allocates 4*1000 bytes on the stack (which might be a bit much depending on your system) */
int a[1000];
/* Allocates 4*1000 bytes on the heap */
int *b = malloc(1000 * sizeof(int));
Stack allocations are fast - and often preferred when:
"Small" amount of memory is required
Pointer to the array is not to be returned from the function
Heap allocations are slower, but have these advantages:
Available heap memory is (normally) >> than available stack memory
You can freely pass the pointer to the allocated bytes around, e.g. returning it from a function -- just remember to free it at some point.
A third option is to use statically initialized arrays if you have some common task, that always requires an array of some max size. Given you can spare the memory statically consumed by the array, you avoid the hit for heap memory allocation, gain the flexibility to pass the pointer around, and avoid having to keep track of ownership of the pointer to ensure the memory is freed.
Edit: If you are using C99 (default with the gnu c compiler i think?), you can do variable-length stack arrays like
int a = 4;
int b[a*a];

In the example you gave
int *numbers;
numbers = malloc ( sizeof(int) * 10 );
there are no explicit benefits. Though, imagine 10 is a value that changes at runtime (e.g. user input), and that you need to return this array from a function. E.g.
int *aFunction(size_t howMany, ...)
{
int *r = malloc(sizeof(int)*howMany);
// do something, fill the array...
return r;
}
The malloc takes room from the heap, while something like
int *aFunction(size_t howMany, ...)
{
int r[howMany];
// do something, fill the array...
// you can't return r unless you make it static, but this is in general
// not good
return somethingElse;
}
would consume the stack that is not so big as the whole heap available.
More complex example exists. E.g. if you have to build a binary tree that grows according to some computation done at runtime, you basically have no other choices but to use dynamic memory allocation.

The size of an ordinary array is fixed at compile time, whereas dynamic allocation is done at run time.
In your case, you can use your pointer just like an array: numbers[5] is valid.
If you don't know the size of your array when writing the program, runtime allocation is not optional: you have to use it. Otherwise, you're free to use an array, which might be simpler (less risk of forgetting to free memory, for example).
Examples:
to store a 3-D position, you might want to use an array, as it always has 3 coordinates
to create a sieve to calculate prime numbers, you might want to use a parameter to give the max value and thus use dynamic allocation to create the memory area

An array is used to allocate memory statically and in one go.
To allocate memory dynamically, malloc is required.
e.g. int numbers[10];
This will allocate memory statically, and it will be contiguous memory.
If you do not know the count of numbers in advance, use a variable such as count.
int count;
int *numbers;
scanf("%d", &count);
numbers = malloc ( sizeof(int) * count );
This is not possible with a fixed-size array.

Dynamic does not refer to the access; it refers to the size passed to malloc. If you just use a constant number, like 10 in your example, it is no better than an array. The advantage comes when you don't know in advance how big it must be, e.g. because the user enters the size at runtime. Then you can allocate with a variable, e.g. malloc(sizeof(int) * userEnteredNumber). This is not possible with a fixed-size array, as there you have to know the (maximum) size at compile time.

Memory allocator in C -- how to utilize sbrk()'ed space

I've been writing an implementation of malloc and was wondering if someone could help me with this problem.
Basically, I would like to reuse memory after allocating it using sbrk(), and having made certain that the memory is free.
So essentially, imagine my memory is like this
|------------------------------|
...and I do some allocations. When I allocate memory, each bit has a head (h) and data (d).
|hddddddhddd---hdd--hddd-------|
Now I've got these holes, and if I want to use say, the first gap in my diagram, how do I set it up so that it's got a head (h) and a body (dd) also?
I've gotten to the point where now I've got a pointer to the memory location I want. In C, its pointed to by a pointer. The pointer has a custom type, where "meta" is a struct I defined. So now I have
metaStruct * mypointer = the memory address.
But when I try to do
mypointer->size = 30;
Or
mypointer->buddy = 1;
I get a segfault.
The question: how do I set it up so that the memory address, which has been allocated via sbrk(), will have the form of my struct? Obviously I can't just go myPointer = malloc(sizeof(metaStruct)), because I am writing malloc itself. I'm also not interested in sbrk()'ing more space, but rather, utilizing the existing space that I'm pointing to (I want to disregard its junk data and use the space).
How do I go about doing that?

As far as I know, p=sbrk(n) enlarges the available address space by (at least) n bytes and returns the base address of the newly allocated area in "p". So you now have a block of memory starting at "p" and n bytes long (probably more than n, depending on the system).
So I suppose that your "metaStruct" contains a "size" field, a "next free area" field, and a "data" field:
metaStruct * m ;
p=sbrk(sizeof(metaStruct)+ data_size);
m = (metaStruct *)p;
m->size = data_size;
m->next = NULL;
memcpy(m->data, ...,data_size);
The code is not perfect: on some systems the sbrk function (indeed it's often a function, not a basic system call, and of course you should check whether sbrk fails) doesn't return aligned pointers, so you have to align the pointer manually. Also, you can obtain the actual allocated size by calling sbrk(0) after sbrk(n) and calculating the difference between the two pointers. In general, you should maintain a collection of "free blocks" and try to use them first, then call sbrk only if none of them is large enough.
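Reusing a hole comes down to the same cast, applied to an address that is suitably aligned and lies inside memory the program already owns. The sketch below assumes a hypothetical struct meta standing in for the question's metaStruct (only the size and buddy fields are taken from the question) and a helper named carve_block; it works on any writable buffer, so it can be tried without sbrk.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the question's metaStruct. */
struct meta { size_t size; int buddy; };

/* Write a header at the first suitably aligned address inside the hole
   [hole, hole + hole_len) and return a pointer to the payload after it.
   Returns NULL if the hole cannot hold the header plus `payload` bytes. */
void *carve_block(void *hole, size_t hole_len, size_t payload) {
    assert(hole != NULL);
    size_t a = alignof(struct meta);
    size_t pad = (a - (uintptr_t)hole % a) % a;   /* bytes skipped to align */
    if (hole_len < pad ||
        hole_len - pad < sizeof(struct meta) ||
        hole_len - pad - sizeof(struct meta) < payload)
        return NULL;
    struct meta *m = (struct meta *)((unsigned char *)hole + pad);
    m->size = payload;          /* any junk previously in the hole is overwritten */
    m->buddy = 0;
    return m + 1;               /* payload begins right after the header */
}
```

If writing through the header pointer still faults, the address is most likely outside the range between the initial break and the current sbrk(0), which is worth checking before suspecting the struct itself.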
