So I have this assignment to implement my own malloc and free in C. The problem is one of the requirements for the memory_free(void *ptr) function. It has to return 1 if the pointer is invalid, i.e. it hasn't been allocated by the memory_alloc(unsigned int size), or return 0 otherwise. I just can't figure out a way to do this, without it being absolutely time inefficient.
So my memory structure is this: I have a global pointer to the beginning of the array I get to act as a heap. Every block of memory has an int header to tell the size of it and whether it's free or not.
This is my memory_free(void *ptr) function right now, TYPE is typedef unsigned int:
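int memory_free(void *ptr)
{
    unsigned char *head = ptr; /* byte pointer: arithmetic on void * is not standard C */
    if (head == NULL)
        return 1;
    head -= sizeof(TYPE); /* step back from the user block to its header */
    if (!((*(TYPE*) head) & 1 ))
        return 1;
    (*(TYPE*) head) &= ~0x1;
    return 0;
}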
The pointer ptr points to the first byte of user block, which means that if I want to read the header, I have to go back 4 bytes. One solution to check the validity of the pointer is to go through the heap from the beginning and see if I get on the header in question, but that's not time efficient. Could anyone tell me a better way?
One O(1) solution would be to make the header 8 bytes instead of four; use the extra four bytes to indicate validity. For example, it could be the one's complement of what you store in the other four bytes. So you look at the header and if those extra bytes contain anything other than the one's complement of the first part of the header, you know it's not a valid block.
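A minimal sketch of that idea, assuming an 8-byte header made of the original size/flag word plus a second word holding its one's complement. The struct layout and the header_write helper are made up for illustration; TYPE is the typedef from the question. Note that reading the bytes in front of an arbitrary pointer is only safe if the pointer is already known to lie inside the heap, so this check is best combined with a range check.

```c
#include <stddef.h>
#include <string.h>

typedef unsigned int TYPE;

/* Hypothetical 8-byte header: the size/flag word plus its one's complement. */
struct header {
    TYPE info;   /* block size, with the low bit set while allocated */
    TYPE check;  /* must always equal ~info */
};

/* Write a header in front of a user block (what memory_alloc would do). */
void header_write(void *user_ptr, TYPE info)
{
    struct header h = { info, ~info };
    memcpy((unsigned char *)user_ptr - sizeof h, &h, sizeof h);
}

/* Return 0 and mark the block free if the header is plausible, else 1. */
int memory_free(void *ptr)
{
    struct header h;
    unsigned char *raw;

    if (ptr == NULL)
        return 1;
    raw = (unsigned char *)ptr - sizeof h;
    memcpy(&h, raw, sizeof h);        /* memcpy avoids unaligned access */
    if (h.check != (TYPE)~h.info)     /* check word does not match */
        return 1;
    if (!(h.info & 1u))               /* block is not currently allocated */
        return 1;
    h.info &= ~1u;                    /* clear the allocated bit */
    h.check = ~h.info;                /* keep the check word in sync */
    memcpy(raw, &h, sizeof h);
    return 0;
}
```

This is a probabilistic check: user data that happens to contain a word followed by its complement would still pass, which is why it is usually paired with other tests.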
I see 2 possible alternatives:
Keep a linked list of pointers that you have allocated : filled by memory_alloc and consumed by memory_free. This way you can double-check if what has been passed to memory_free is coherent.
The linked list might be time-consuming: as a compromise, you can just store the addresses of the beginning and the end of your memory pool and check that pointers passed to memory_free fall within those bounds. It's far less precise and reliable, but faster.
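The bounds check could look roughly like this. The names memory_init, heap_start, heap_end and pointer_in_pool are assumptions, not part of the assignment; the comparison goes through uintptr_t because comparing unrelated pointers with < is not defined in C.

```c
#include <stddef.h>
#include <stdint.h>

typedef unsigned int TYPE;

/* Hypothetical globals, set once when the heap array is handed over. */
static unsigned char *heap_start;
static unsigned char *heap_end;   /* one past the last byte of the pool */

void memory_init(void *region, unsigned int size)
{
    heap_start = region;
    heap_end = heap_start + size;
}

/* Return 1 if ptr could be a user block inside the pool, 0 otherwise. */
int pointer_in_pool(const void *ptr)
{
    uintptr_t p  = (uintptr_t)ptr;
    /* The first user byte can never come before the first header. */
    uintptr_t lo = (uintptr_t)heap_start + sizeof(TYPE);
    uintptr_t hi = (uintptr_t)heap_end;

    return ptr != NULL && p >= lo && p < hi;
}
```

memory_free would call pointer_in_pool first and return 1 when it fails, before touching the header at all.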
I have pointer to buffer that was initialised with calloc:
notEncryptBuf = (unsigned char*) calloc(1024, notEncryptBuf_len);
Later I moved pointer to another position:
notEncryptBuf+=20;
And finally I free buffer:
free(notEncryptBuf);
Will it free the whole allocated size? How does C know the size of the memory it needs to free?
The behavior of free is defined only if it is passed an address that was previously returned by malloc or a related routine, or a null pointer (in which case it does nothing). If you pass an address modified from an original allocation, as by notEncryptBuf += 20;, the behavior is undefined.
C implementations commonly know how much space is in an allocation because they store it in some bytes immediately preceding the allocation. For example, if you ask for 1,024 bytes, it may allocate 1,040, record information about the allocation in the first 16 bytes, and return to you the address 16 bytes after the start of the whole allocation. Then, when you pass that address to free, it looks in the 16 bytes before that address to see the amount of space.
Other implementations are theoretically possible. For example, a memory manager could designate one zone of memory for common fixed-size allocations, such as 32 bytes, and then use a bitmap to indicate whether each 32-byte block in that zone is free or allocated. Or it could keep a database of allocations, using a hash table or trees or other data structures. When free is called, it would look up the address in the database.
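To make the bitmap variant concrete, here is a small sketch of a zone of fixed 32-byte blocks with one bit per block. All names (zone, used, zone_alloc, zone_free) and the block count are invented for the example, and real allocators are considerably more involved. The zone array in this sketch is not given any special alignment.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE  32u
#define BLOCK_COUNT 64u

static unsigned char zone[BLOCK_SIZE * BLOCK_COUNT];
static unsigned char used[BLOCK_COUNT / 8];   /* one bit per block */

/* Hand out the first block whose bit is clear, or NULL if the zone is full. */
void *zone_alloc(void)
{
    for (size_t i = 0; i < BLOCK_COUNT; i++) {
        if (!(used[i / 8] & (1u << (i % 8)))) {
            used[i / 8] |= (unsigned char)(1u << (i % 8));
            return zone + i * BLOCK_SIZE;
        }
    }
    return NULL;
}

/* Return 0 on success, 1 if ptr is not an allocated block of this zone. */
int zone_free(void *ptr)
{
    uintptr_t a = (uintptr_t)ptr;
    uintptr_t base = (uintptr_t)zone;
    size_t offset, i;

    if (ptr == NULL || a < base || a >= base + sizeof zone)
        return 1;                      /* outside the zone */
    offset = (size_t)(a - base);
    if (offset % BLOCK_SIZE != 0)
        return 1;                      /* points into the middle of a block */
    i = offset / BLOCK_SIZE;
    if (!(used[i / 8] & (1u << (i % 8))))
        return 1;                      /* block is already free */
    used[i / 8] &= (unsigned char)~(1u << (i % 8));
    return 0;
}
```

With this design no size is stored next to the data at all, and a shifted pointer such as the original plus 20 can be rejected because it does not sit on a block boundary.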
How does C know the size of the memory it needs to free?
"C" does not know about memory allocation or freeing. It relies on the underlying memory manager to keep track of the allocated memory and free it.
That said, if you pass a pointer to free() which was not returned by any allocator function, it invokes undefined behaviour. So, you cannot pass the pointer which you have shifted to free(). You need to pass the pointer which was returned by calloc().
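One way to stay safe is to keep the pointer that calloc() returned untouched and move a second pointer instead. A small sketch, where use_buffer and the value 42 are made up for the example:

```c
#include <stdlib.h>

/* Hypothetical helper: skip a 20-byte prefix, then free the buffer correctly. */
int use_buffer(size_t len)
{
    unsigned char *buf = calloc(len, 1);  /* this exact pointer goes to free() */
    unsigned char *cursor;
    int result;

    if (buf == NULL || len <= 20) {
        free(buf);                        /* free(NULL) is a no-op */
        return -1;
    }
    cursor = buf + 20;                    /* move a copy, not the original */
    cursor[0] = 42;
    result = buf[20];                     /* same byte, seen through buf */
    free(buf);                            /* releases the whole allocation */
    return result;
}
```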
A good way to answer these kinds of questions is to ask yourself, “how would I myself write malloc() and free()?”
Suppose you have a “memory pool” — fancy words for just an array of bytes:
unsigned char memory[10000];
Now the user wants eight bytes of that. The user calls ptr = my_malloc(8). You know full well that you can’t just give the user any random spot in your memory array — you can only give away stuff that hasn’t already been given away.
In other words, you somehow need to keep track of what pieces of memory have been given away.
Linked-lists → Variable-sized elements in an array
One way we know of to manage dynamic memory is through linked-lists. A linked list is a block of memory that you organize with a struct:
struct node
{
SOME_TYPE data; // the data to store
struct node * next; // a pointer to the next node
};
However, since we are the memory manager, we don’t have some magic pool to allocate our node. We have to use our own memory[] to create space for the node.
Let’s make a simple modification. Instead of a pointer, we will keep track of how big a piece is. We can do this with a structure:
struct piece_of_memory
{
int size;
unsigned char memory[size];
};
Memory, then, is just an array of those things, where all the sizes add up to our available memory pool size:
piece_of_memory memory[...];
So now our initial pool of memory looks like this:
int size = 9992; // 10000 minus an eight-byte size field (note: int is usually four bytes even on 64-bit systems; size_t would be eight)
unsigned char memory[9992];
Graphically, that’s something like
[----,------------------------------------------------------------]
↑ ↑
9992 memory
If I give away eight bytes, that gets reordered:
[----,---][----,--------------------------------------------------]
↑ ↑ ↑ ↑
↑ ↑ ↑ memory
↑ ↑ new size = 9992 - 8 - sizeof(int) = 9976
↑ ↑
8 returned from malloc
That is two of those structs in a row
int size = 8
unsigned char memory[8]
int size = 9976
unsigned char memory[9976]
We can verify that the pieces all use exactly 10000 bytes:
{(size) 8 + (8 bytes) 8} + {(size) 8 + (9976 bytes) 9976}
= 16 + 9984
= 10000
So when the user asks us to ptr = my_malloc(8), we find a piece of memory with at least eight available bytes, rearrange things, then return a pointer to the ‘memory’ part (not the ‘size’!).
Freeing allocated memory
Suppose our user is now finished with the eight bytes and calls my_free(ptr).
[----,---][----,--------------------------------------------------]
8 ↑ 9976
↑
free me!
We can find our struct piece_of_memory (it is sizeof(int) bytes before the address returned to us), and we can recombine the free pieces of memory into a whole free block:
[----,------------------------------------------------------------]
9992
Notice how this only works if the user gives us an address we gave it earlier, right? What would happen if I returned a wrong ptr value?
More to think about
Naturally we must also be able to keep track of which blocks are available to return and which ones are in use. This makes our struct piece_of_memory a bit more complicated. We could do something like:
struct piece_of_memory
{
int size;
bool is_used;
unsigned char memory[];
};
We also need a way for the memory manager to search through the memory blocks for a piece that is big enough for the requested size. If we want to be smart about it, we might take some time to find the smallest available block that is big enough for the requested size.
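Putting the pieces so far together, a toy version might look like the sketch below. It is only one possible reading of the description: the pool is declared as an array of header-sized units rather than raw bytes (so every header is properly aligned), sizes are counted in those units, and the names piece, UNITS, my_malloc and my_free are choices made for this example.

```c
#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE 10000

/* Header in front of every piece; the union pads it to the strictest alignment. */
typedef union piece {
    struct { size_t units; bool is_used; } h;  /* payload length in header-sized units */
    max_align_t align;
} piece;

#define UNITS (POOL_SIZE / sizeof(piece))

static piece pool[UNITS];
static bool ready;

static piece *next_piece(piece *p) { return p + 1 + p->h.units; }

void *my_malloc(size_t size)
{
    piece *best = NULL;
    size_t units;

    if (!ready) {                        /* start with one big free piece */
        pool[0].h.units = UNITS - 1;
        pool[0].h.is_used = false;
        ready = true;
    }
    if (size == 0 || size > POOL_SIZE)
        return NULL;
    units = (size + sizeof(piece) - 1) / sizeof(piece);

    /* Best fit: the smallest free piece that is big enough. */
    for (piece *p = pool; p < pool + UNITS; p = next_piece(p))
        if (!p->h.is_used && p->h.units >= units &&
            (best == NULL || p->h.units < best->h.units))
            best = p;
    if (best == NULL)
        return NULL;

    if (best->h.units >= units + 2) {    /* split off the remainder */
        piece *rest = best + 1 + units;
        rest->h.units = best->h.units - units - 1;
        rest->h.is_used = false;
        best->h.units = units;
    }
    best->h.is_used = true;
    return best + 1;                     /* the 'memory' part, not the header */
}

void my_free(void *ptr)
{
    if (ptr == NULL)
        return;
    ((piece *)ptr - 1)->h.is_used = false;

    /* Recombine neighbouring free pieces in one pass over the pool. */
    for (piece *p = pool; p < pool + UNITS; ) {
        piece *n = next_piece(p);
        if (!p->h.is_used && n < pool + UNITS && !n->h.is_used)
            p->h.units += 1 + n->h.units;   /* absorb n, then re-check p */
        else
            p = n;
    }
}
```

Like the description, my_free trusts its argument completely: handing it an address that my_malloc never returned makes it write into the middle of someone's data.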
We don’t actually have to keep the (‘size’ and ‘is_used’) with the ‘memory’ pieces, either. We could split up our struct to simply have an array of (‘size’ + ‘is_used’) structures at one end of our memory[] array and all the pieces of returned memory at the other end.
Finally, we must waste a little memory when we divide it up in order to make sure that we always return a pointer that is aligned for the worst-case alignment needs our user might put it to. For example, if the user wants to get dynamic memory for an array of double, we don’t want to return something that is only byte-aligned.
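The usual trick for that rounding is shown below, assuming the alignment is a power of two. The function names are made up; _Alignof and max_align_t are standard since C11.

```c
#include <stddef.h>

/* Round n up to the next multiple of align (align must be a power of two). */
size_t align_up(size_t n, size_t align)
{
    return (n + align - 1) & ~(align - 1);
}

/* Pad a request so the piece after it starts on a worst-case boundary. */
size_t padded_request(size_t n)
{
    return align_up(n, _Alignof(max_align_t));
}
```

For example, align_up(13, 16) gives 16 and align_up(17, 16) gives 32, so a 13-byte request wastes three bytes of padding.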
This isn’t the only way to do it!
This is just one simple way. More advanced structures could certainly be used as well.
Conclusions
Hopefully you can answer your own questions now:
How does the memory manager know how much memory to free?
(Because it keeps track of it.)
Can I return a pointer that was not given to me by the memory manager?
(No, because it would break things.)
Obviously the memory manager can be written to prevent things from breaking if you try to free a pointer it did not give you, but the C specification does not require it to. It requires (expects) the user to not give it bad input.
I am storing input data that includes a specific order, so I chose to use an array to keep the nodes sorted:
struct Node** array = (struct Node**)malloc(sizeof(Node**) * DEFAULT_SIZE);
int i;
int size = DEFAULT_SIZE;
while(/* reading input */) {
// do something
int index = token; // token is part of an input line, which specifies the order
struct Node* node = (struct Node*)malloc(sizeof(struct Node));
*node = (struct Node){value, index};
// do something
if (index >= size) {
array = realloc(array, index + 1);
size = index + 1;
}
array[index] = node;
}
I am trying to loop through the array and do something when the node exists at the index
int i;
for (i = 0; i < size; i++) {
if (/* node at array[i] exists */) {
// do something
}
}
How can I check if node exists at the specific index of the array? (Or what is the "default value" of the struct node after I allocated its memory?) I only know it is not NULL...
Should I use calloc and try if ((int)array[index] != 0)? Or there is a better data structure I am able to use?
When you realloc (or malloc) your list of pointers, the system resizes or moves the array, copying your existing data if needed, and reserves the extra space at the end without initializing it. The old elements keep their values, but you cannot rely on the values of the new ones.
Only calloc does a zero init, but you cannot calloc when you realloc.
For starters you should probably use calloc:
struct Node** array = calloc(DEFAULT_SIZE,sizeof(*array));
In your loop, just use realloc and set the new memory to NULL so you can test for null pointers
Note that your realloc size is incorrect, you have to multiply by the size of the element. Also update the size after reallocation or that won't work more than once.
Note the tricky memset, which zeroes only the newly added elements without changing the valid pointer data. array+size computes the proper address thanks to pointer arithmetic, but the size parameter is in bytes, so you have to multiply by sizeof(*array) (the size of the element).
if (index >= size)
{
    array = realloc(array, (index + 1)*sizeof(*array)); // fixed size
    memset(array+size,0,(index+1-size) * sizeof(*array)); // zero the rest of elements
    size = index+1; // update size
}
aside:
realloc for each element is inefficient, you should realloc by chunks to avoid too many system calls/copies
I have simplified the malloc calls: there is no need to cast the return value of malloc, and it is also better to pass sizeof(*array) instead of sizeof(Node **). In case the type of array changes you're covered (it also protects you from off-by-one errors with starred types).
The newly-allocated memory contains garbage and reading a pointer from uninitialized memory is a bug.
If you allocated using calloc( DEFAULT_SIZE, sizeof(Node*) ) instead, the contents of the array would be defined: all bits would be set to zero. On many implementations, this is a NULL pointer, although the standard does not guarantee it. Technically, there could be a standard-conforming compiler that makes the program crash if you attempt to read a pointer with all bits set to zero.
(Only language lawyers need to worry about that, though. In practice, even the fifty-year-old mainframes people bring up as examples of machines where NULL was not binary 0 updated their C compilers to recognize 0 as a NULL pointer, because the alternative broke too much code.)
The safe, portable way to do what you want is to initialize every pointer in the array to NULL:
struct Node** const array = malloc(sizeof(*array) * DEFAULT_SIZE);
// Check for out-of-memory error if you really want to.
for ( ptrdiff_t i = 0; i < DEFAULT_SIZE; ++i ) // ptrdiff_t is declared in <stddef.h>
    array[i] = NULL;
After the loop executes, every pointer in the array is equal to NULL, and the ! operator returns 1 for it, until it is set to something else.
The realloc() call is erroneous. If you do want to do it that way, the size argument should be the new number of elements times the element size. That code will happily make it a quarter or an eighth the desired size. Even without that memory-corruption bug, you’ll find yourself doing reallocations far too often, which might require copying the entire array to a new location in memory.
The classic solution to that is to create a linked list of array pages, but if you’re going to realloc(), it would be better to multiply the array size by a constant each time.
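Growing by a constant factor could be sketched like this. The helper name ensure_index and its interface are inventions for this example; the point is the doubling, the corrected byte count, and the explicit NULL fill of the new slots.

```c
#include <stdlib.h>

struct Node;   /* the element type from the question; only pointers are stored */

/*
 * Make sure array can hold index; new slots are set to NULL.
 * Returns the (possibly moved) array, or NULL on failure, in which
 * case the old array is still valid and *size is unchanged.
 */
struct Node **ensure_index(struct Node **array, size_t *size, size_t index)
{
    size_t new_size = *size ? *size : 1;
    struct Node **grown;

    if (index < *size)
        return array;                    /* already big enough */
    while (new_size <= index)
        new_size *= 2;                   /* grow geometrically */
    grown = realloc(array, new_size * sizeof *grown);
    if (grown == NULL)
        return NULL;
    for (size_t i = *size; i < new_size; i++)
        grown[i] = NULL;                 /* portable, unlike memset to 0 */
    *size = new_size;
    return grown;
}
```

Inside the reading loop this would replace the if (index >= size) block, and the later scan can simply test array[i] != NULL. The sketch does not guard against new_size overflowing for absurdly large indices.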
Similarly, when you create each Node, you’d want to initialize its pointer fields, if you care about portability. No compiler this century will generate less-efficient code if you do.
If you only allocate nodes in sequential order, an alternative is to create an array of Node rather than Node*, and maintain a counter of how many nodes are in use. A modern desktop OS will only map in as many pages of physical memory for the array as your process writes to, so simply allocating and not initializing a large dynamic array does not waste real resources in most environments.
One other mistake that’s probably benign: the elements of your array have type struct Node*, but you allocate sizeof(Node**) rather than sizeof(Node*) bytes for each. However, the compiler does not type-check this, and I am unaware of any compiler where the sizes of these two kinds of object pointer could be different.
You might need something like this
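unsigned long i;
for (i = 0; i < size; i++) {
    // skip empty slots before dereferencing; assumes unused slots were set to NULL
    if (array[i] != NULL && array[i]->someValidationMember == yourIntValue) {
        // do something
    }
}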
Edit.
The allocated memory must be zeroed first. Or, if an item is deleted, simply set the Node member to zero or any marker value of your choice.
I've read and understand the concepts behind the binary buddies approach to memory allocation, and I'm trying to put it to work in C but I have a few implementation specific questions before I can really get started.
https://drive.google.com/file/d/0BxJX9LHXUU59OWZ6ZmhvV1lBX2M/view?usp=sharing
- This is a link to the assignment specification, my question pertains to problem 5.
The problem specifies that one call to malloc is to be made at the initialization of the allocator, and all requests for memory must be serviced using the space acquired from this call.
It's clear that the initial pointer to this space must be incremented in some way when a call to get_memory() is made, and the new pointer will be returned to the calling process. How can I increment the pointer by a specific number of bytes?
I understand that free lists for each block size must be kept, but I'm unsure exactly how these will be initialized and maintained. What is stored in the free list exactly? The memory pointer?
I apologize if these questions have been asked before, I haven't found a relevant question that provided enough clarity for me to get working.
For your first question, you just have to increment your pointer like a normal variable.
The value of a pointer corresponds to the address in memory of the data it points to. Pointer arithmetic counts in units of the pointed-to type, so to move by a number of bytes, use a pointer to char (or unsigned char): incrementing such a pointer by, say, 10 moves you 10 bytes further into your memory.
As for the free list, malloc() typically keeps a structure contiguous with the allocated memory block, containing information such as the address of the memory block, its size, and whether it is free or not.
Your goal is to create these structures so you can keep track of the status of the different memory blocks you have allocated or freed with your get_memory() and release_memory() functions.
You might also find this useful : https://stackoverflow.com/a/1957125/4758798
It's clear that the initial pointer to this space must be incremented in some way when a call to get_memory() is made, and the new pointer will be returned to the calling process. How can I increment the pointer by a specific number of bytes?
When you call get_memory(), you will return a pointer to the main memory added to some offset. The word 'increment' implies that you are going to change the value of the initial pointer, which you should not do.
Here is some simple code of me subaddressing one big memory block.
#include <stdlib.h>
#include <stdio.h>
int main (void)
{
    // Allocate a block of memory
    // (unsigned char * is used because arithmetic on void * is a compiler extension)
    unsigned char * memory_block = malloc (512);
    if (memory_block == NULL)
        return 1;
    // Now "Split" that memory into two halves.
    unsigned char * first_half = memory_block;
    unsigned char * second_half = memory_block + 256;
    // We can even keep splitting...
    unsigned char * second_first_half = second_half;
    unsigned char * second_second_half = second_half + 128;
    // Note that this splitting doesn't actually change the main memory block.
    // We're just bookmarking locations in it.
    printf ("memory_block %p\n", (void *) memory_block);
    printf ("first_half %p\n", (void *) first_half);
    printf ("second_half %p\n", (void *) second_half);
    printf ("second_first_half %p\n", (void *) second_first_half);
    printf ("second_second_half %p\n", (void *) second_second_half);
    free (memory_block);
    return 0;
}
I understand that free lists for each block size must be kept, but I'm unsure exactly how these will be initialized and maintained. What is stored in the free list exactly? The memory pointer?
At a minimum, you probably want to keep track of the memory pointer and the size of that memory block, so something like this...
typedef struct memory_block {
void * memory;
size_t size;
} memory_block_t;
There are other ways to represent this though. For example, you get equivalent information by keeping track of their memory offsets relative to the global malloc. I would suggest treating memory as a set of offsets like this:
void * global_memory; // Assigned by start_memory()
// Functionally equivalent to the above struct
// memory = global_memory + begin;
// size = end - begin;
typedef struct memory_block {
size_t begin;
size_t end;
} memory_block_t;
There are multiple approaches to this difficult problem.
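For the binary buddy case specifically, one common arrangement is to keep one list head per block size and to store the list link inside the free block itself, so the free list costs no extra memory. The sketch below shows only initialization and allocation, under assumed limits (16-byte minimum block, 4096-byte pool); the function names follow the question, but everything else, including the missing release_memory() with buddy merging, is an assumption of this example.

```c
#include <stddef.h>
#include <stdlib.h>

#define MIN_ORDER 4u    /* smallest block: 1 << 4 = 16 bytes (assumed) */
#define MAX_ORDER 12u   /* whole pool: 1 << 12 = 4096 bytes (assumed) */

/* A free block stores its own list link in its first bytes. */
struct free_block { struct free_block *next; };

static unsigned char *pool;                            /* from the single malloc */
static struct free_block *free_lists[MAX_ORDER + 1];   /* one list per order */

/* Smallest order that can hold size; MAX_ORDER + 1 means "too big". */
unsigned order_for(size_t size)
{
    unsigned order = MIN_ORDER;
    while (order <= MAX_ORDER && ((size_t)1 << order) < size)
        order++;
    return order;
}

/* Offset of the buddy of the block at offset (relative to the pool start). */
size_t buddy_of(size_t offset, unsigned order)
{
    return offset ^ ((size_t)1 << order);
}

static void push_free(size_t offset, unsigned order)
{
    struct free_block *b = (struct free_block *)(pool + offset);
    b->next = free_lists[order];
    free_lists[order] = b;
}

int start_memory(void)
{
    pool = malloc((size_t)1 << MAX_ORDER);   /* the one and only malloc */
    if (pool == NULL)
        return -1;
    for (unsigned k = 0; k <= MAX_ORDER; k++)
        free_lists[k] = NULL;
    push_free(0, MAX_ORDER);                 /* one free block: the whole pool */
    return 0;
}

void *get_memory(size_t size)
{
    unsigned want = order_for(size), k = want;
    struct free_block *b;
    size_t offset;

    while (k <= MAX_ORDER && free_lists[k] == NULL)
        k++;                                 /* find a big enough free block */
    if (k > MAX_ORDER)
        return NULL;
    b = free_lists[k];
    free_lists[k] = b->next;
    offset = (size_t)((unsigned char *)b - pool);
    while (k > want) {                       /* split, keeping the lower half */
        k--;
        push_free(buddy_of(offset, k), k);
    }
    return pool + offset;
}
```

Because the buddy of a block is found by flipping one bit of its offset, working with offsets from the pool start (as suggested above) is more convenient than working with raw addresses.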
I'm trying to create a memory allocation system, and part of this involves storing integers at pointer locations to create a sort of header. I store a couple of integers, and then two pointers (with locations to the next and prev spots in memory).
Right now I'm trying to figure out if I can store the pointer at a location that I could later use as the original pointer.
int * header;
int * prev;
int * next;
...
*(header+3) = prev;
*(header+4) = next;
Then later...
headerfunction(*(header+4));
would perform an operation using the pointer to the 'next' location in memory.
(code for illustration only)
Any help or suggestions greatly appreciated!
Don't do direct pointer manipulation. Structs were made to eliminate the need for you to do that directly.
Instead, do something a bit more like this:
#include <stddef.h>
#include <stdint.h>
typedef struct
{
    size_t cbSize;
} MyAwesomeHeapHeader;
void* MyAwesomeMalloc(size_t cbSize)
{
    MyAwesomeHeapHeader* header;
    void* internalAllocatorPtr;
    size_t cbAlloc;
    // TODO: Maybe I want a heap footer as well?
    // TODO: I should really check the following for an integer overflow:
    cbAlloc = sizeof(MyAwesomeHeapHeader) + cbSize;
    // MyAwesomeRawAllocator is whatever hands out raw memory; it is not defined here.
    internalAllocatorPtr = MyAwesomeRawAllocator(cbAlloc);
    if (internalAllocatorPtr == NULL)
        return NULL;
    header = (MyAwesomeHeapHeader*)internalAllocatorPtr;
    header->cbSize = cbSize; // the field is cbSize, as declared above
    // TODO: other fields here.
    return (uint8_t*)(internalAllocatorPtr) + sizeof(MyAwesomeHeapHeader);
}
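Applied to the layout in the question (a couple of integers followed by prev and next), the same idea might look like the sketch below. The field names and helper functions are guesses at what the asker wants, not a known design.

```c
#include <stddef.h>

/* Hypothetical header layout: two integers followed by the two links. */
typedef struct block_header {
    int size;
    int flags;
    struct block_header *prev;
    struct block_header *next;
} block_header;

/* Place a header at the start of raw memory and link it after prev. */
block_header *header_init(void *raw, int size, block_header *prev)
{
    block_header *h = raw;   /* raw must be suitably aligned for the struct */
    h->size = size;
    h->flags = 0;
    h->prev = prev;
    h->next = NULL;
    if (prev != NULL)
        prev->next = h;
    return h;
}

/* The user data starts right after the header. */
void *header_payload(block_header *h)
{
    return h + 1;
}

/* Recover the header from a pointer previously returned by header_payload. */
block_header *header_from_payload(void *payload)
{
    return (block_header *)payload - 1;
}
```

The stored next pointer can then be passed around directly, for example headerfunction(h->next), with no manual offset arithmetic.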
What you are doing is not safe, because you are writing to a memory location that does not belong to header: *(header+3) writes 3 ints (12 bytes with a 4-byte int) past the header pointer, and if that memory is held by another variable, the write will corrupt it.
Instead, first allocate one big block of memory. Its start address is the base of your pool, and you can use the first bytes of it, laid out with structs, to hold the control information for the rest of the memory.
Akp is correct, just looking at what you are trying to accomplish in your code segment, if you are trying to store integer pointers in header, header should be defined as such:
int **header;
and then memory should be allocated for it.
With regards to the actual memory allocation, if on a Unix machine, you should look into the brk() syscall.
You are building a memory allocation system, and thus we assume you have a chunk of memory somewhere that you can use freely to manage allocations and frees.
As per your question, the header pointer is allocated in the heap memory (by the compiler and libraries) - and you may wonder if it is safe to use that memory since you are allocating memory. It depends on your system, and if there is another (system) memory allocation management.
But what you could do is
int main(void) {
    void *header;
    void *prev;
    void *next;
    manage_memory_allocations(&header, &prev, &next); // never returns
}
In this case, the pointers are created on the stack - so the allocation depends on the memory where the processor stack points to.
Note the "never returns" as the memory is "freed" as soon as main ends.
Suppose I have a block of memory as such:
void *block = malloc(sizeof(void *) + size);
How do I set a pointer to the beginning of the block while still being able to access the rest of the reserved space? For this reason, I do not want to simply assign 'block' to another pointer or NULL.
How do I set the first two bytes of the block as NULL or have it point somewhere?
This doesn't make any sense unless you're running on a 16-bit machine.
Based on the way that you're calling malloc(), you're planning to have the first N bytes be a pointer to something else (where N may be 2, 4, or 8 depending on whether you're running on a 16-, 32-, or 64-bit architecture). Is this what you really want to do?
If it is, then you can use a pointer-to-a-pointer approach (recognizing that you can't actually use a void* to change anything, but I don't want to confuse matters by introducing a real type):
void** ptr = block;
However, it would be far more elegant to define your block with a struct (this may contain syntax errors; I haven't run it through a compiler):
typedef struct {
    void* ptr; /* replace void* with whatever your pointer type really is */
    char data[1]; /* first byte of the extra space that follows the pointer */
} MY_STRUCT;
MY_STRUCT* block = malloc(sizeof(MY_STRUCT) + additional);
block->ptr = /* something */;
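Since C99 the same layout can be written with a flexible array member, which avoids the one-element array. A sketch, with the type and function names invented for illustration:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical block: one pointer followed by a variable-length payload. */
typedef struct block {
    void *ptr;              /* the pointer stored at the start of the block */
    size_t size;            /* number of payload bytes that follow */
    unsigned char data[];   /* flexible array member (C99) */
} block;

/* Allocate a block with size payload bytes; the pointer starts out NULL. */
block *block_create(size_t size)
{
    block *b = malloc(sizeof *b + size);
    if (b == NULL)
        return NULL;
    b->ptr = NULL;
    b->size = size;
    memset(b->data, 0, size);
    return b;
}
```

The rest of the reserved space is then simply b->data, and the whole thing is released with a single free(b).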
memset(block, 0, 2);
memset can be found in string.h
Setting the first two bytes of the allocated memory block to 0 is easy. There are many ways to do it, for example:
((char*)block)[0] = 0;
((char*)block)[1] = 0;
Now, the way the question is asked shows some misunderstanding.
You can put anything in the first two bytes of your allocated block; it doesn't change anything for accessing the following bytes. The only difference is that the C string functions use the convention that strings end with a 0 byte. So if you do things like strcpy(target, (char*)block), it will stop copying immediately if the first byte is a zero. But you can still do strcpy(target, (char*)block+2).
Now, if you want to store a pointer at the beginning of the block (and usually it's not 2 bytes):
You can do the same thing as above but using void* instead of char.
((void**)block)[0] = your_pointer;
You access the rest of the block as you like: just get its address and go on. You could do it, for example, with:
void * pointer_to_rest = &((void**)block)[1];
PS: I do not recommend such pointer games. They are very error-prone. Your best move would probably be to follow the struct method proposed by @Anon.
void *block = malloc(sizeof(void *) + size); // allocate block
void *ptr = NULL; // some pointer
memcpy(block, &ptr, sizeof(void *)); // copy pointer to start of block
I have a guess at what you're trying to ask, but your wording is so confusing that I could be totally wrong. I am assuming that you want a pointer that points to the "first 2 bytes" of the block you allocated, and then another pointer that points to the rest of the block.
Pointers carry no information about the size of the memory block that they point to, so you can do this:
void *block = malloc(sizeof(void *) + size);
void *first_two_bytes = block;
void *rest_of_block = ((char*)block)+2;
Now, first_two_bytes points to the beginning of the block that you allocated, and you should just treat it as if it pointed to a memory area 2 bytes long.
And rest_of_block points to the portion of the block starting 2 bytes in (at the third byte), and you should treat it as if it pointed to a memory area 2 bytes smaller than what you allocated.
Note, however, that this is still only a single allocation, and you should only free the block pointer. If you free all three pointers, you will corrupt the heap, since you will be calling free more than once on the same block.
While implementing a map interface using a hash table I faced a similar issue, where each key-value pair (both of which are not statically sized, omitting the option of defining a compile-time struct) had to be stored in block of heap memory that also included a pointer to the next element in a linked list (should the blocks be chained in the event that more than one is hashed to the same index in the hash table array). Leaving space for the pointer at the beginning of the block, I found that the solution mentioned by kriss:
((void**)block)[0] = your_pointer;
where you cast the pointer to the block as an array, and then use the bracket syntax to handle pointer arithmetic and dereferencing, was the cleanest solution for copying a new value into this pointer "field" of the block.