Implementing segregated memory storage (malloc) in C

I'm trying to implement my own malloc using a segregated free list (using this textbook as a reference: http://csapp.cs.cmu.edu/), but I'm not sure how to start.
I have a method, malloc_init() that uses sbrk to return a slab of memory. Within the context of my assignment, I'm not allowed to ask for more memory after this initial call, and the amount of memory I'm allowed to request is limited by MAX_HEAP_SIZE (set by someone else). I am thinking that I will keep an array of pointers, each of which points to a freelist of predetermined size.
How do I set up this array after calling sbrk? How do I figure out how many bytes should go into each "bucket", and what the size class of each freelist should be? In terms of code implementation, how does one set up the array of freelist pointers? Any tips or hints would be greatly appreciated! I've looked for example code online but have not found anything satisfying.

Memory allocation theory takes entire chapters or books, but here are some quick ideas to get you started.
You could do something like:
char *blobs[10];
where blobs[0] points to chunks of 16 bytes, blobs[1] points to chunks of 32 bytes, blobs[2] points to 64-byte chunks, ... up to blobs[9] pointing at 8K chunks. Then when you get the initial chunk, do something like:
bsize = 8192;
idx = 9;
memsize = MAX_HEAP_SIZE;
while (idx >= 0) {
    while (memsize >= bsize) {
        /* carve a bsize chunk from your initial block */
        /* and insert it onto a singly-linked list headed by blobs[idx]; */
        /* use the first (sizeof(char *)) bytes of each chunk as a next pointer */
        memsize -= bsize;
    }
    bsize /= 2;
    idx--;
}
Then whenever you need to allocate, find the right list and grab a chunk from it.
You'll need to grab a slightly larger chunk than the request so you have a
place to record which list the chunk came from, so you can free it later.
You may find you need more than 10 entries in the blobs array in order to handle larger requests.
If you want to be more sophisticated you can split blocks when servicing requests.
That is, if somebody requests 33.2KB from a 64KB blob, maybe you want to give them only 34KB and divide the remaining space in the 64KB blob into 16KB, 8KB, 4KB, and 2KB chunks to add to those free lists.
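To make that concrete, here is a minimal sketch of the allocate/free path described above. The names list_alloc and list_free are made up for illustration; it assumes the blobs[] array from the carving loop, with each free chunk's first sizeof(char *) bytes used as the next pointer, and it spends one extra word at the front of every handed-out chunk to remember its size class:
#include <stddef.h>

extern char *blobs[10];   /* the free-list heads set up by the loop above */

void *list_alloc(size_t request)
{
    size_t bsize = 16;
    int idx = 0;

    /* find the smallest size class whose chunks can hold the request
       plus one word of bookkeeping at the front */
    while (idx < 10 && bsize < request + sizeof(char *)) {
        bsize *= 2;
        idx++;
    }
    if (idx >= 10 || blobs[idx] == NULL)
        return NULL;                       /* no suitable free chunk */

    char *chunk = blobs[idx];
    blobs[idx] = *(char **)chunk;          /* pop: follow the embedded next pointer */

    *(int *)chunk = idx;                   /* remember which list it came from */
    return chunk + sizeof(char *);         /* hand the rest to the caller */
}

void list_free(void *p)
{
    char *chunk = (char *)p - sizeof(char *);
    int idx = *(int *)chunk;

    *(char **)chunk = blobs[idx];          /* push back onto its free list */
    blobs[idx] = chunk;
}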

Not sure if there's a "standard" way to do this, but just thinking it through logically: you have a big blob of memory and you want to carve it up into different sized buckets. So first you need to figure out the bucket sizes you're going to support. I'm not a systems programmer, so I can't say what a "good" bucket size is, but I imagine they'll be in some non-consecutive powers of 2 (e.g., 16 bytes, 64 bytes, 512 bytes, etc).
Once you have your bucket sizes, you will need to divide up the memory blob into buckets. The best way is to use a bit of the blob space for a header at the start of each block. The header will contain the size of the block and a flag indicating whether or not it's free.
// typedef so the later snippets can use "header" as a type name in C
typedef struct header
{
    unsigned short blkSize;
    unsigned char free;
} header;
In your init function you will divide up the blob:
// "base" is a pointer to the start of the blob; keep it at file scope
// so my_malloc() and my_free() below can see it too
// (sbrk needs <unistd.h>, exit needs <stdlib.h>)
static void *base = NULL;

void my_init()
{
    base = sbrk((intptr_t)MAX_HEAP_SIZE);
    if (base == (void*)-1)
    {
        // something went wrong; sbrk returns (void*)-1 on failure
        exit(1);
    }
    // carve up the blob into buckets of varying size
    header *hdr = (header*)base;
    for (int i = 0; i < num16Bblocks; i++)
    {
        hdr->blkSize = 16;
        hdr->free = 1;
        // advance to the start of the next block's header
        // (cast to char* so the arithmetic is in bytes, not header units)
        hdr = (header*)((char*)hdr + 16 + sizeof(header));
    }
    // repeat for other sizes
    for (int i = 0; i < num64Bblocks; i++)
    {
        hdr->blkSize = 64;
        hdr->free = 1;
        hdr = (header*)((char*)hdr + 64 + sizeof(header));
    }
    // etc
}
When a user requests some memory, you walk the blob until you find a free bucket that is large enough for the request, mark it as no longer free, and return a pointer to the memory just after its header:
void *my_malloc(size_t allocationSize)
{
    // walk the blocks until we find a free one of an appropriate size
    header *hdr = (header*)base;
    while ((char*)hdr < (char*)base + MAX_HEAP_SIZE)
    {
        if (hdr->blkSize >= allocationSize && hdr->free)
        {
            // we found a free block of an appropriate size, so we're going to mark it
            // as not free and return the memory just after the header
            hdr->free = 0;
            return (void*)(hdr + 1);   // hdr + 1 skips exactly one header
        }
        // didn't fit or isn't free, go to the next block
        // (again, do the pointer arithmetic in bytes)
        hdr = (header*)((char*)hdr + sizeof(header) + hdr->blkSize);
    }
    // did not find any available memory
    return NULL;
}
To free (reclaim) some memory, simply mark it as free in the header.
void my_free(void *mem)
{
    // back up to the header (cast to char* so the arithmetic is in bytes)
    header *hdr = (header*)((char*)mem - sizeof(header));
    // it's free again
    hdr->free = 1;
}
This is a very basic implementation and has several drawbacks (e.g., doesn't handle fragmentation, is not very dynamic), but it may give you a good jumping off point.
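For completeness, here is a hypothetical usage sketch of the functions above, assuming my_init(), my_malloc() and my_free() are compiled together with the file-scope base pointer, and that num16Bblocks/num64Bblocks were chosen so at least one 64-byte bucket exists:
#include <stdio.h>
#include <string.h>

int main(void)
{
    my_init();                     // carve the sbrk'd blob into buckets

    char *msg = my_malloc(32);     // served from the first free bucket
                                   // with blkSize >= 32 (a 64-byte one here)
    if (!msg)
        return 1;

    strcpy(msg, "hello from my_malloc");
    puts(msg);

    my_free(msg);                  // just flips the header's free flag
    return 0;
}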

Related

C how to free sub memory?

I allocate a big block of memory, char* test = malloc(10000000);, then I put values into this memory and do some work on each value.
What I want is: every 1000 indexes, I want to release all the memory up to that point.
For ex.
for (long i = 0; i < 10000000; i++) {
    DoSomeWork(test[i]);
    if (i % 1000 == 0)
        releaseMemory(i - 1000, i);
}
How can I do it in C?
I know that free can only release the whole allocation, but I don't want to wait until the end of the work to free all the memory.
I want to free each block of 1000 as soon as the work on it is done.
I must allocate all the memory at the beginning of the program.
What you want can be achieved by allocating the memory in smaller chunks.
You have to adjust your algorithm to handle a bunch of small sub-arrays which you can then release after use.
In this case, it might be useful to allocate the chunks in reverse order to give the libc the chance to release the freed memory to the underlying OS.
Let me enhance a bit here:
Assume you want an array with 10000000 (10 million) entries. Instead of allocating it as one chunk as depicted in the question, it could be possible to have
#include <stdlib.h>

#define CHUNKSIZE 10000
#define ENTRYSIZE 8
#define NUM_CHUNKS 1000

void test(void)
{
    void **outer_array = malloc(NUM_CHUNKS * sizeof(void*));
    for (int i = 0; i < NUM_CHUNKS; i++) {
        void *chunk = malloc(CHUNKSIZE * ENTRYSIZE);
        outer_array[NUM_CHUNKS - 1 - i] = chunk;
        // allocate them in reverse order
    }
    // now, set item #123456
    size_t item_index = 123456;
    // TODO check if the index is below the maximum
    size_t chunk_index = item_index / CHUNKSIZE;
    size_t index_into_chunk = item_index % CHUNKSIZE;
    void *item_address = (char *)outer_array[chunk_index] + index_into_chunk * ENTRYSIZE;
    // after having processed one chunk, you can free it:
    free(outer_array[0]);
    outer_array[0] = NULL;
}
There are (roughly) two ways a program can grow the heap in order to allocate memory:
It can obtain a completely new memory block from the OS, independent from the "main address space". Then it can use it for allocation and return it to the OS as soon as it is free()d. This happens in some allocators if the allocation size is above a certain threshold.
It can extend the program's address space. Then, the new memory is added at the end. After free()ing the last memory block, the program's address space can be shrunk again. This happens in some allocators if the allocation size is below a certain threshold.
This way, your program's memory footprint decreases over time.
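As an illustration of those two mechanisms (glibc-specific, not portable C): glibc exposes both thresholds through mallopt(), so you can tune where the switch between the brk heap and per-allocation mmap happens. The exact values below are just examples:
#include <malloc.h>   /* glibc: mallopt, M_MMAP_THRESHOLD, M_TRIM_THRESHOLD */
#include <stdlib.h>

int main(void)
{
    /* allocations of at least 1 MiB get their own mmap'ed region,
       which is returned to the OS immediately on free() */
    mallopt(M_MMAP_THRESHOLD, 1024 * 1024);

    /* give heap-top memory back to the OS once at least 128 KiB
       at the end of the brk heap is free */
    mallopt(M_TRIM_THRESHOLD, 128 * 1024);

    void *big   = malloc(8u * 1024 * 1024);  /* above the threshold: mmap'ed */
    void *small = malloc(4096);              /* below it: from the brk heap */

    free(big);     /* munmap'ed right away, footprint drops immediately */
    free(small);
    return 0;
}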

How to use or free dynamically allocated memory when I run the program multiple times?

How do I free dynamically allocated memory?
Suppose the input (given by the user) is 1000, and I allocate memory for 1000 elements. If the second time the user gives 500 as input, can I reuse the already allocated memory?
If the user now inputs, say, 3000, how do I handle it? Can I reuse the already allocated 1000 blocks of memory and then create another 2000 blocks, or should I create all 3000 blocks of memory?
Which of these is advisable?
#include <stdio.h>
#include <stdlib.h>

typedef struct a
{
    int a;
    int b;
} aa;

aa *ptr = NULL;

int main() {
    //code
    int input = 2;
    ptr = malloc(sizeof(aa) * input);
    for (int i = 0; i < input; i++)
    {
        ptr[i].a = 10;
        ptr[i].b = 20;
    }
    for (int i = 0; i < input; i++)
    {
        printf("%d %d\n", ptr[i].a, ptr[i].b);
    }
    return 0;
}
I believe you need to read about the "lifetime" of allocated memory.
For allocator functions like malloc() and family (quoting from C11, chapter §7.22.3, "Memory management functions"):
[...] The lifetime of an allocated object extends from the allocation
until the deallocation. [....]
So, once allocated, the returned pointer to the memory remains valid until it is deallocated. There are two ways it can be deallocated
Using a call to free() inside the program
Once the program terminates.
So, the allocated memory is available, from the point of allocation, to the termination of the program, or the free() call, whichever is earlier.
As it stands, there can be two aspects, let me clarify.
Scenario 1:
You allocate memory (size M)
You use the memory
You want the allocated memory to be resized (expanded / shrunk)
You use some more
You're done using
If this is the flow you expect, you can use realloc() to resize the allocated memory (there is a short sketch of this after the note below). Once you're done, use free().
Scenario 2:
You allocate memory (size M)
You use the memory
You're done using
If this is the case, once you're done, use free().
Note: In both the cases, if the program is run multiple times, there is no connection between or among the allocation happening in each individual invocation. They are independent.
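Here is a minimal sketch of Scenario 1, applied to the question's struct and input sizes (1000, then 500, then 3000): keep one buffer and a capacity counter, and only realloc() when the new input is larger than what is already allocated.
#include <stdio.h>
#include <stdlib.h>

typedef struct a { int a; int b; } aa;

int main(void)
{
    aa *ptr = NULL;
    size_t capacity = 0;
    size_t inputs[] = { 1000, 500, 3000 };   /* the sizes from the question */

    for (size_t k = 0; k < 3; k++) {
        size_t input = inputs[k];
        if (input > capacity) {              /* grow only when needed; the
                                                500-element pass reuses the
                                                1000-element block as-is */
            aa *tmp = realloc(ptr, input * sizeof *tmp);
            if (!tmp) {
                free(ptr);
                return 1;
            }
            ptr = tmp;
            capacity = input;
        }
        for (size_t i = 0; i < input; i++) {
            ptr[i].a = 10;
            ptr[i].b = 20;
        }
        printf("processed %zu records\n", input);
    }

    free(ptr);
    return 0;
}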
When you use dynamically allocated memory, and adjust its size, it is important to keep track of exactly how many elements you have allocated memory for.
I personally like to keep the number of elements in use in variable named used, and the number of elements I have allocated memory for in size. For example, I might create a structure for describing one-dimensional arrays of doubles:
typedef struct {
    size_t size;  /* Number of doubles allocated for */
    size_t used;  /* Number of doubles in use */
    double *data; /* Dynamically allocated array */
} double_array;

#define DOUBLE_ARRAY_INIT { 0, 0, NULL }
I like to explicitly initialize my dynamically allocated memory pointers to NULL, and their respective sizes to zero, so that I only need to use realloc(). This works because realloc(NULL, size) is exactly equivalent to malloc(size). I also often utilize the fact that free(NULL) is safe, and does nothing.
I would probably write a couple of helper functions. Perhaps a function that ensures there is room for at_least entries in the array:
void double_array_resize(double_array *ref, size_t at_least)
{
    if (ref->size < at_least) {
        void *temp;

        temp = realloc(ref->data, at_least * sizeof ref->data[0]);
        if (!temp) {
            fprintf(stderr, "double_array_resize(): Out of memory (%zu doubles).\n", at_least);
            exit(EXIT_FAILURE);
        }
        ref->data = temp;
        ref->size = at_least;
    }
    /* We could also shrink the array if
       at_least < ref->size, but usually
       this is not needed/useful/desirable. */
}
I would definitely write a helper function that not only frees the memory used, but also updates the fields to reflect that, so that it is completely safe to call double_array_resize() after freeing:
void double_array_free(double_array *ref)
{
    if (ref) {
        free(ref->data);
        ref->size = 0;
        ref->used = 0;
        ref->data = NULL;
    }
}
Here is how a program might use the above.
int main(void)
{
    double_array stuff = DOUBLE_ARRAY_INIT;

    /* ... Code and variables omitted ... */

    if (some_condition) {
        double_array_resize(&stuff, 321);
        /* stuff.data[0] through stuff.data[320]
           are now accessible (dynamically allocated) */
    }

    /* ... Code and variables omitted ... */

    if (weird_condition) {
        /* For some reason, we want to discard the
           possibly dynamically allocated buffer */
        double_array_free(&stuff);
    }

    /* ... Code and variables omitted ... */

    if (other_condition) {
        double_array_resize(&stuff, 48361242);
        /* stuff.data[0] through stuff.data[48361241]
           are now accessible. */
    }

    double_array_free(&stuff);
    return EXIT_SUCCESS;
}
If I wanted to use the double_array as a stack, I might do
void double_array_clear(double_array *ref)
{
    if (ref)
        ref->used = 0;
}

void double_array_push(double_array *ref, const double val)
{
    if (ref->used >= ref->size) {
        /* Allocate, say, room for 100 more! */
        double_array_resize(ref, ref->used + 100);
    }
    ref->data[ref->used++] = val;
}

double double_array_pop(double_array *ref, const double errorval)
{
    if (ref->used > 0)
        return ref->data[--ref->used];
    else
        return errorval; /* Stack was empty! */
}
The above double_array_push() reallocates for 100 more doubles, whenever the array runs out of room. However, if you pushed millions of doubles, this would mean tens of thousands of realloc() calls, which is usually considered wasteful. Instead, we usually apply a reallocation policy, that grows the size proportionally to the existing size.
My preferred policy is something like (pseudocode)
If (elements in use) < LIMIT_1 Then
    Resize to LIMIT_1
Else If (elements in use) < LIMIT_2 Then
    Resize to (elements in use) * FACTOR
Else
    Resize to (elements in use) + LIMIT_2
End If
The LIMIT_1 is typically a small number, the minimum size ever allocated. LIMIT_2 is typically a large number, something like 2^21 (two million plus change), so that at most LIMIT_2 unused elements are ever allocated. FACTOR is between 1 and 2; many suggest 2, but I prefer 3/2.
The goal of the policy is to keep the number of realloc() calls at an acceptable (unnoticeable) level, while keeping the amount of allocated but unused memory low.
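One possible C rendering of that pseudocode (the LIMIT_1, LIMIT_2 and FACTOR = 3/2 values below are just the illustrative choices discussed above):
#include <stddef.h>

#define LIMIT_1 8              /* minimum size ever allocated */
#define LIMIT_2 2097152        /* 2^21, about two million plus change */

static size_t next_size(size_t used)
{
    if (used < LIMIT_1)
        return LIMIT_1;
    else if (used < LIMIT_2)
        return used + used / 2;      /* FACTOR = 3/2 */
    else
        return used + LIMIT_2;
}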
The final note is that you should only try to keep around a dynamically allocated buffer, if you reuse it for the same (or very similar) purpose. If you need an array of a different type, and don't need the earlier one, just free() the earlier one, and malloc() a new one (or let realloc() in the helpers do it). The C library will try to reuse the same memory anyway.
On current desktop machines, something like a hundred or a thousand malloc() or realloc() calls is probably unnoticeable compared to the start-up time of the program. So, it is not that important to minimize the number of those calls. What you want to do, is keep your code easily maintained and adapted, so logical reuse and variable and type names are important.
The most typical case where I reuse a buffer, is when I read text input line by line. I use the POSIX.1 getline() function to do so:
char *line = NULL;
size_t size = 0;
ssize_t len; /* Not 'used' in this particular case! :) */

while (1) {
    len = getline(&line, &size, stdin);
    if (len < 1)
        break;
    /* Have 'len' chars in 'line'; may contain '\0'! */
}
if (ferror(stdin)) {
    fprintf(stderr, "Error reading standard input!\n");
    exit(EXIT_FAILURE);
}
/* Since the line buffer is no longer needed, free it. */
free(line);
line = NULL;
size = 0;

Why should we use `realloc` if we need a `tmp buffer`

As far as I'm concerned, if realloc fails we lose the information, and realloc sets the buffer (pointer) to NULL.
Consider the following program:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    char *ptr = malloc(256);

    if (!ptr) {
        printf("Error, malloc\n");
        exit(1);
    }
    strcpy(ptr, "Michi");

    ptr = realloc(ptr, 1024 * 102400000uL); /* I ask for a big chunk here to make realloc fail */

    if (!ptr) {
        printf("Houston we have a Problem\n");
    }
    printf("PTR = %s\n", ptr);

    if (ptr) {
        free(ptr);
        ptr = NULL;
    }
}
And the output of course is:
Houston we have a Problem
PTR = (null)
I just lost the information inside ptr.
Now, to fix this, we should use a temporary buffer (pointer) first to see whether we get that chunk of memory. If we get it, we can use it; if not, we still have the main buffer (pointer) safe.
Now please consider the following program, where instead of calling realloc I call malloc on a temporary buffer (pointer):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    char *ptr = malloc(256);
    char *tmpPTR = NULL;

    if (!ptr) {
        printf("Error, malloc\n");
        exit(1);
    }
    strcpy(ptr, "Michi");

    tmpPTR = malloc(1024 * 102400000uL);

    if (tmpPTR) {
        strcpy(tmpPTR, ptr);
        strcat(tmpPTR, " - Aloha");
        if (ptr) {
            free(ptr);
            ptr = NULL;
        }
    } else {
        printf("Malloc failed on tmpPTR\n\n");
    }

    if (ptr) {
        printf("PTR = %s\n", ptr);
        free(ptr);
        ptr = NULL;
    } else if (tmpPTR) {
        printf("tmpPTR = %s\n", tmpPTR);
        free(tmpPTR);
        ptr = NULL;
    }
}
And the output is:
Malloc failed on tmpPTR
PTR = Michi
Now why should I ever use realloc?
Is there any benefit of using realloc instead of malloc based on this context?
Your problem is with how you use realloc. You don't have to assign the result of realloc to the same pointer that you re-allocate. And as you point out it even poses a problem if the realloc fails. If you immediately assign the result to ptr then indeed you lose the previous buffer when something goes wrong. However, if you assign the result of realloc to tmpPTR, then ptr remains fine, even if the realloc fails. Use realloc as follows:
char * ptr = malloc(256);
if (!ptr) {
    return 1;
}

char * tmpPTR = realloc(ptr, 512);
if (!tmpPTR) {
    printf("Houston, we have a problem");
    // ptr is fine
} else {
    ptr = tmpPTR;
}
// ptr is realloc()ed
In the above code, tmpPTR is not a new (temporary) buffer, but just a (temporary) pointer. If the realloc is successful, it points to the same buffer (though possibly at a different location), and if it fails, it is NULL. realloc doesn't always need to allocate a new buffer, but may be able to change the existing one to fit the new size. But if it fails, the original buffer will not be changed.
If you use malloc with a temporary buffer, then (for this example) you need at least 256 + 512 = 768 bytes and you always need to copy the old data. realloc may be able to re-use the old buffer so copying is not necessary and you don't use more memory than requested.
You can use your malloc approach, but realloc is almost always more efficient.
The realloc scheme is simple. You do not need a separate call to malloc. For example if you initially have 256 bytes allocated for ptr, simply use a counter (or index, i below) to keep track of how much of the memory within the block allocated to ptr has been used, and when the counter reaches the limit (1 less than the max for 0-based indexes, or 2 less than the max if you are using ptr as a string), realloc.
Below shows a scheme where you are simply adding 256 additional bytes to ptr each time the allocation limit is reached:
int i = 0, max = 256;
char *ptr = malloc(max);

/* do whatever until i reaches 255 */

if (i + 1 >= max) {
    void *tmp = realloc(ptr, max + 256);
    if (!tmp) {
        fprintf(stderr, "error: realloc - memory exhausted.\n");
        /* handle error -- break/goto here so ptr is not overwritten below */
    }
    ptr = tmp;
    max += 256;
}
Note: your error handling can exit whatever loop you are in, preserving the existing data in ptr. You do not need to exit the program at that point.
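For example, here is a hypothetical read loop built on that scheme: if realloc() ever fails, we simply stop reading, and everything collected so far in ptr is still intact and usable.
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    size_t max = 256, i = 0;
    char *ptr = malloc(max);
    int c;

    if (!ptr)
        return 1;

    while ((c = getchar()) != EOF) {
        if (i + 1 >= max) {                    /* need room for c and '\0' */
            void *tmp = realloc(ptr, max + 256);
            if (!tmp)
                break;                         /* keep what we already have */
            ptr = tmp;
            max += 256;
        }
        ptr[i++] = (char)c;
    }
    ptr[i] = '\0';

    printf("read %zu bytes\n", i);
    free(ptr);
    return 0;
}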
The advantage of realloc over malloc is that it may be able to extend the original dynamic memory area in place, without the need to copy all the previous elements; you can't do that with malloc. And when that optimization is available, it costs you no extra work.
Let's assume you have a previously allocated pointer:
char *some_string = malloc(size); // assume non-NULL
Then
if (realloc_needed) {
    char *tmp = realloc(some_string, new_size);

    if (tmp == NULL) {
        /* handle error */
    } else {
        some_string = tmp; /* (1) */
    }
}
At (1), you update the old pointer with the new one. Two things can happen: the address has effectively changed (and the elements been automatically copied) or it hasn't - you don't really care. Either way, your data is now at some_string.
Only the actual implementation (OS / libc) knows whether it's possible to enlarge the block: you don't get to see it, it's an implementation detail. You can however check your implementation's code and see how it's implemented.
Now to fix this we should use a temporary buffer(pointer) before to see if we get that chunk of memory and if we get it we can use it, if not we still have the main buffer(pointer) safe.
That not only doesn't help, it makes things worse because now you no longer have the pointer to the block you tried to reallocate. So how can you free it?
So it:
Wastes memory.
Requires an extra allocation, copy, and free.
Makes the realloc more likely to fail because of 1.
Leaks memory since the pointer to the block you tried to reallocate is lost.
So no, that's not a good way to handle realloc returning NULL. Save the original pointer when you call realloc so you can handle failure sanely. The point of realloc to save you from having to manage two copies of the data and to avoid even making them when that's possible. So let realloc do this work for you whenever you can.
It is technically malloc(size) that is unneeded, because realloc(NULL, size) performs the exact same task.
I often read inputs of indeterminate length. As in the following function example, I rarely use malloc(), and instead use realloc() extensively:
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>

struct record {
    /* fields in each record */
};

struct table {
    size_t size;          /* Number of records allocated */
    size_t used;          /* Number of records in table */
    struct record item[]; /* C99 flexible array member */
};

#define MAX_ITEMS_PER_READ 1

struct table *read_table(FILE *source)
{
    struct table *result = NULL, *temp;
    size_t size = 0;
    size_t used = 0, n;
    int err = 0;

    /* Read loop */
    while (1) {
        if (used + MAX_ITEMS_PER_READ > size) {
            /* Array size growth policy.
             * Some suggest doubling the size,
             * or using a constant factor.
             * Here, the minimum is
             *     size = used + MAX_ITEMS_PER_READ;
             */
            const size_t newsize = 2*MAX_ITEMS_PER_READ + used + used / 2;

            temp = realloc(result, sizeof (struct table) +
                                   newsize * sizeof (result->item[0]));
            if (!temp) {
                err = ENOMEM;
                break;
            }
            result = temp;
            size = newsize;
        }

        /* Read a record to result->item[used],
         * or up to (size-used) records starting at result->item + used.
         * If there are no more records, break.
         * If an error occurs, set err = errno, and break.
         *
         * Increment used by the number of records read: */
        used++;
    }

    if (err) {
        free(result); /* NOTE: free(NULL) is safe. */
        errno = err;
        return NULL;
    }
    if (!used) {
        free(result);
        errno = ENODATA; /* POSIX.1 error code, not C89/C99/C11 */
        return NULL;
    }

    /* Optional: optimize table size. */
    if (used < size) {
        /* We don't mind even if realloc were to fail here. */
        temp = realloc(result, sizeof (struct table) +
                               used * sizeof (result->item[0]));
        if (temp) {
            result = temp;
            size = used;
        }
    }

    result->size = size;
    result->used = used;
    errno = 0; /* Not normally zeroed; just my style. */
    return result;
}
My own practical reallocation policies tend to be very conservative, limiting the size increase to a megabyte or so. There is a very practical reason for this.
On most 32-bit systems, userspace applications are limited to 2 to 4 gigabyte virtual address space. I wrote and ran simulation systems on a lot of different x86 systems (32-bit), all with 2 to 4 GB of memory. Usually, most of that memory is needed for a single dataset, which is read from disk, and manipulated in place. When the data is not in final form, it cannot be directly memory-mapped from disk, as a translation -- usually from text to binary -- is needed.
When you use realloc() to grow the dynamically allocated array to store such huge (on 32-bit) datasets, you are only limited by the available virtual address space (assuming there is enough memory available). (This especially applies to 32-bit applications on 64-bit systems.)
If, instead, you use malloc() -- i.e., when you notice your dynamically allocated array is not large enough, you malloc() a new one, copy the data over, and discard the old one -- your final dataset size is limited to something smaller, with the difference depending on your exact array size growth policy. If you use the typical double-the-size-when-resizing policy, your final dataset is limited to about half of the available virtual address space or available memory, whichever is smaller.
On 64-bit systems with lots and lots of memory, realloc() still matters, but it is much more of a performance issue, whereas on 32-bit systems the malloc()-only approach is a hard limiting factor. You see, when you use malloc() to allocate a completely new array and copy the old data to the new array, the resident set size -- the actual amount of physical RAM needed by your application -- is larger; you use 50% more physical RAM to read the data than you would when using realloc(). You also do a lot of large memory-to-memory copies (when reading a huge dataset), which are limited by physical RAM bandwidth, and which indeed slow down your application (although, if you are reading from a spinning disk, that is the actual bottleneck anyway, so it won't matter much).
The nastiest effect, and the most difficult to benchmark, are the indirect effects. Most operating systems use "free" RAM to cache recently accessed files not modified yet, and this really does decrease the wall clock time used by most workloads. (In particular, caching typical libraries and executables may shave off seconds from the startup time of large application suites, if the storage media is slow (ie. a spinning disk, and not a SSD).) Your memory-wasting malloc()-only approach gobbles up much more actual physical RAM than needed, which evicts cached, often useful, files from memory!
You might benchmark your program, and note that there is no real difference in run times between using your malloc()-only approach and realloc() approach I've shown above. But, if it works with large datasets, the users will notice that using the malloc()-only program slows down other programs much more than the realloc()-using program, with the same data!
So, although on 64-bit systems with lots of RAM using malloc() only is basically an inefficient way to approach things, on 32-bit systems it limits the size of dynamically allocated arrays when the final size is unknown beforehand. Only using realloc() can you there achieve the maximum possible dataset size.
Your assumption is wrong. Please note that a pointer is not a buffer. When realloc() succeeds, it deallocates the old pointer (frees the original buffer) and returns a new pointer to the new allocation (buffer), but when it fails, it leaves the old buffer intact and returns NULL.
So you do not need a temporary buffer; you need a temporary pointer. I am going to borrow the example from kninnug; this is what you need to do:
char * ptr = malloc(256);
if (!ptr) {
    return 1;
}

char * tmpPTR = realloc(ptr, 512);
if (!tmpPTR) {
    printf("Houston, we have a problem");
    // ptr is fine
}
else {
    ptr = tmpPTR;
}
// ptr is realloc()ed

Trick to avoid needing to initialize an array

Normally if I want to allocate a zero initialized array I would do something like this:
int size = 1000;
int* i = (int*)calloc(sizeof(int), size);
And later my code can do this to check if an element in the array has been initialized:
if(!i[10]) {
// i[10] has not been initialized
}
However, in this case I don't want to pay the upfront cost of zero-initializing the array, because the array may be quite large (i.e., gigabytes). But in this case I can afford to use as much memory as I want.
I think I remember that there is a technique to keep track of the elements in the array that have been initialized, without paying any upfront cost, that also has O(1) lookup cost (not amortized, as with a hash table). My recollection is that the technique requires an extra array of the same size.
I think it was something like this:
int size = 1000;
int* i = (int*)malloc(size * sizeof(int));
int* i_markers = (int*)malloc(size * sizeof(int));
If an entry in the array is used it is recorded like this:
i_markers[10] = &i[10];
And then it's use can be checked later like this:
if(i_markers[10] != &i[10]) {
// i[10] has not been initialized
}
Of course this isn't quite right because i_markers[10] could have been randomly set to &i[10].
Can anyone out there remind me of the technique?
Thank you!
I think I remembered it.
Is this right? Is there a better way or are there variations on this?
Thanks again.
(This was updated to be the right answer)
#include <stdlib.h>

struct lazy_array {
    int size;
    int* values;
    int* used;
    int* back_references;
    int num_used;
};

int is_index_used(struct lazy_array* lazy, int index);

struct lazy_array* create_lazy_array(int size) {
    struct lazy_array* lazy = (struct lazy_array*)malloc(sizeof(struct lazy_array));
    lazy->size = size;
    lazy->values = (int*)malloc(size * sizeof(int));
    lazy->used = (int*)malloc(size * sizeof(int));
    lazy->back_references = (int*)malloc(size * sizeof(int));
    lazy->num_used = 0;
    return lazy;
}

void use_index(struct lazy_array* lazy, int index, int value) {
    lazy->values[index] = value;
    if (is_index_used(lazy, index))
        return;
    lazy->used[index] = lazy->num_used;
    lazy->back_references[lazy->num_used] = index;
    ++lazy->num_used;
}

int is_index_used(struct lazy_array* lazy, int index) {
    return lazy->used[index] >= 0 &&
           lazy->used[index] < lazy->num_used &&
           lazy->back_references[lazy->used[index]] == index;
}
On most compilers/standard libraries I know of, large calloc requests (and malloc for that matter) are implemented in terms of the OS's bulk memory request logic. On Linux, that means a copy-on-write mmap-ing of the zero page, and on Windows it means VirtualAlloc. In both cases, the OS gives you memory that is already zero, and calloc recognizes this; it only explicitly zeroes the memory if it was doing a small calloc from the small allocation heap. So until you write to any given page in the allocation, it's zero "for free". No need to be explicitly lazy; the allocator is being lazy for you.
For small allocations it does need to memset to clear the memory, but then, it's fairly cheap to memset a few thousand (or tens of thousands of) bytes. For the really large allocations where zeroing would be costly, you're getting OS-provided memory that's zeroed for free (separate from the rest of the heap); e.g. for dlmalloc in a typical configuration, allocations beyond 256 KB will always be freshly mmap-ed and munmap-ed, which means you're getting freshly mapped copy-on-write mappings of the zero page (the cost to zero them being deferred until you perform a write somewhere in the page, and paid whether you got the 256 KB via malloc or calloc).
If you want better guarantees about zeroing, or to get free zeroing on smaller allocations (though it's more wasteful the closer to one page you get), you can just explicitly do what malloc/calloc do implicitly and use the OS provided zero-ed memory, e.g. replace:
sometype *x = calloc(num, sizeof(*x)); // Or the similar malloc(num * sizeof(*x));
if (!x) { ... do error handling stuff ... }
...
free(x);
with either:
sometype *x = mmap(NULL, num * sizeof(*x), PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
if (x == MAP_FAILED) { ... do error handling stuff ... }
...
munmap(x, num * sizeof(*x));
or on Windows:
sometype *x = VirtualAlloc(NULL, num * sizeof(*x), MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
if (!x) { ... do error handling stuff ... }
...
VirtualFree(x, 0, MEM_RELEASE); // VirtualFree with MEM_RELEASE only takes size of 0
It gets you the same lazy initialization (though on Windows, this may mean that the pages have simply been lazily zeroed in the background between requests, so they'd be "real" zeroes when you got them, vs. *NIX where they'd be CoW-ed from the zero page, so they get zeroed live when you write to them).
This can be done, although it relies on undefined behavior. It is called a lazy array.
The trick is to use a reverse lookup table. Every time you store a value, you store its index in the lazy array:
/* assumes globals: int *lazy_array, *table; int next_index = 0; */
int is_stored(int value);

void store(int value)
{
    if (is_stored(value)) return;
    lazy_array[value] = next_index;
    table[next_index] = value;
    ++next_index;
}

int is_stored(int value)
{
    if (lazy_array[value] < 0) return 0;
    if (lazy_array[value] >= next_index) return 0;
    if (table[lazy_array[value]] != value) return 0;
    return 1;
}
The idea is that if the value has not been stored in the lazy array, then the lazy_array[value] will be garbage. Its value will either be an invalid index or a valid index into your reverse lookup table. If it is an invalid index, then you immediately know nothing has been stored there. If it is a valid index, then you check your table. If you have a match then the value was stored, otherwise it wasn't.
The downside is that reading from uninitialized memory is undefined behavior. Based on my experience, it will probably work, but there are no guarantees.
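A hypothetical setup for that sketch: both arrays are deliberately created with malloc() rather than calloc() (skipping the zeroing is the whole point), which is exactly where the reliance on uninitialized reads comes from.
#include <stdlib.h>

#define RANGE 1000000        /* values are indexes 0 .. RANGE-1 */

int *lazy_array;             /* lazy_array[value] -> slot in table (may be garbage) */
int *table;                  /* table[slot] -> value stored in that slot */
int next_index;

int lazy_init(void)
{
    lazy_array = malloc(RANGE * sizeof *lazy_array);   /* NOT zeroed */
    table      = malloc(RANGE * sizeof *table);
    next_index = 0;
    return lazy_array != NULL && table != NULL;
}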
There are many possible techniques. Everything depends on your task. For instance, you can remember the maximal index max of the initialized elements of your array. I.e., if your algorithm can guarantee that all elements from 0 to max are initialized, you can use a simple check like if (0 <= i && i <= max).
But if your algorithm needs to initialize arbitrary elements (i.e., random access), you need a general solution, for instance a more effective data structure (not a simple array, but a sparse array or something like that).
So, add more details about your task. I expect we'll find the best solution for it.

malloc code in C

I have a code block that seems to be the code behind malloc. But as I go through the code, I get the feeling that parts of the code are missing. Does anyone know if there is a part of the function that's missing? Does malloc always combine adjacent chunks together?
int heap[10000];
void* malloc(int size) {
int sz = (size + 3) / 4;
int chunk = 0;
if(heap[chunk] > sz) {
int my_size = heap[chunk];
if (my_size < 0) {
my_size = -my_size
}
chunk = chunk + my_size + 2;
if (chunk == heap_size) {
return 0;
}
}
The code behind malloc is certainly much more complex than that. There are several strategies. One popular code is the dlmalloc library. A simpler one is described in K&R.
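For reference, here is a very condensed sketch in the spirit of the K&R (section 8.7) first-fit free-list allocator; it is an approximation for illustration, not the book's exact code, and it returns NULL instead of calling the book's morecore() when the free list is exhausted:
#include <stddef.h>

typedef long Align;                 /* forces alignment of blocks */

typedef union header_u {
    struct {
        union header_u *next;       /* next block on the circular free list */
        unsigned        size;       /* size of this block, in header units */
    } s;
    Align x;
} KRHeader;

static KRHeader  base;              /* empty list to get started */
static KRHeader *freep = NULL;      /* start of the free list */

void *kr_malloc(size_t nbytes)
{
    unsigned nunits = (nbytes + sizeof(KRHeader) - 1) / sizeof(KRHeader) + 1;
    KRHeader *prevp = freep, *p;

    if (prevp == NULL) {            /* no free list yet: make a degenerate one */
        base.s.next = freep = prevp = &base;
        base.s.size = 0;
    }
    for (p = prevp->s.next; ; prevp = p, p = p->s.next) {
        if (p->s.size >= nunits) {          /* big enough */
            if (p->s.size == nunits) {      /* exact fit: unlink it */
                prevp->s.next = p->s.next;
            } else {                        /* split: hand out the tail end */
                p->s.size -= nunits;
                p += p->s.size;
                p->s.size = nunits;
            }
            freep = prevp;
            return (void *)(p + 1);
        }
        if (p == freep)             /* wrapped around: nothing big enough */
            return NULL;            /* (the real K&R code calls morecore here) */
    }
}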
The code is obviously incomplete (not all paths return a value). But in any case this is not a "real" malloc. This is probably an attempt to implement a highly simplified "model" of 'malloc'. The approach chosen by the author of the code can't really lead to a useful practical implementation.
(And BTW, standard 'malloc's parameter has type 'size_t', not 'int').
Well, one error in that code is that it doesn't return a pointer to the data.
I suspect the best approach to that code is [delete].
When possible, I expect that malloc will try to put different requests close to each other, as it will have a block of memory available for allocations until it has to get a new block.
But that also depends on the requirements imposed by the OS and hardware architecture. If you are only allowed to request a certain minimum amount of memory at a time, then the individual allocations may not end up near each other.
As others mentioned, there are problems with the code snippet.
You can find various open-source projects that have their own malloc function, and it may be best to look at one of those, in order to get an idea what is missing.
malloc is for dynamically allocated memory. And this involves sbrk, mmap, or maybe some other system functions for Windows and/or other architectures. I am not sure what your int heap[10000] is for, as the code is too incomplete.
Effo's version makes a little bit more sense, but then it introduces another black-box function, get_block, so it doesn't help much.
The code seems to be meant for a bare-metal machine; normally there is no virtual address mapping on such a system, which uses the physical address space directly.
Here is my understanding; on a 32-bit system, sizeof(ptr) = 4 bytes:
extern block_t *block_head;        // the real heap, and its address
                                   // is >= 0x80000000, see below "my_size < 0"
extern void *get_block(int index); // get a block from the heap
                                   // (led by block_head)

int heap[10000]; // just the indicators, not the real heap

void* malloc(int size)
{
    int sz = (size + 3) / 4;       // make the size align with 4 bytes;
                                   // allocated sizes are aligned
    int chunk = 0;                 // the first check point
    if (heap[chunk] > sz) {        // the value is either a valid free-block size
                                   // which meets my requirement, or an
                                   // address of an allocated block
        int my_size = heap[chunk]; // verify size or address
        if (my_size < 0) {         // it is an address, say a 32-bit value which
                                   // is > 0x8000...., not a size
            my_size = -my_size;    // the algo, convert it
        }
        chunk = chunk + my_size + 2;  // the algo too, get available
                                      // block index
        if (chunk == heap_size) {     // no free chunks left
            return NULL;              // Out of Memory
        }
        void *block = get_block(chunk);
        heap[chunk] = (int)block;
        return block;
    }
    // my blocks are too small initially; none of the blocks
    // will meet the requirement
    return NULL;
}
EDIT: Could somebody help to explain the algorithm, that is, converting address -> my_size -> chunk? You know, when reclaiming, say free(void *addr), it'll use this address -> my_size -> chunk algorithm too, to update heap[chunk] accordingly after returning the block to the heap.
Too small to be a whole malloc implementation.
Take a look in the sources of the C library of Visual Studio 6.0; there you will find the implementation of malloc, if I remember correctly.
