Opinions and suggestions regarding my approach to a first-fit malloc function - C

I'm writing a malloc function for a college assignment. Here's a basic layout of my idea:
1) Define a node struct with pointers to the previous and next nodes, as well as a char for size and vacancy. Each region in the heap will contain a hidden node with this information.
2) Malloc function: starting with the first node, loop through each node checking for vacancy. If a node is vacant and large enough, return a pointer to the beginning of the region, not including the node. If no space is available, use sbrk to allocate the requested space PLUS space for a node.
3) Free function: go to (the pointer passed as a parameter) - sizeof(struct node) and set the vacancy flag to vacant. Then, starting at the beginning of the list, traverse the list, merging adjacent free spaces.
How does this approach sound? My main concern is with actually starting the linked list. For instance, should I create a node with sbrk before I do any allocations and store a pointer to it as a global variable? If so, how do I initialize a first node before I allow the malloc function to be called by a driver program?
Thanks in advance. I'm not asking for someone to write my code, only to provide some insight and suggestions regarding my ideas.
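For reference, the usual answer to the initialization concern is lazy initialization: keep a global head pointer that starts out NULL, and let the very first call to your malloc create the first node with sbrk. A minimal sketch under that assumption (the header layout and the name my_malloc are made up for illustration, not a required design):

#include <stddef.h>
#include <unistd.h>   /* sbrk */

/* Hypothetical header kept in front of every region. */
struct node {
    size_t size;          /* usable bytes after the header */
    int vacant;           /* nonzero if the region is free */
    struct node *prev;
    struct node *next;
};

static struct node *head = NULL;   /* global start of the list */

void *my_malloc(size_t size)
{
    if (head == NULL) {                       /* first call: create the list */
        void *p = sbrk(sizeof(struct node) + size);
        if (p == (void *)-1)
            return NULL;
        head = p;
        head->size = size;
        head->vacant = 0;
        head->prev = head->next = NULL;
        return (char *)head + sizeof(struct node);
    }
    /* ... first-fit search over the list would go here ... */
    return NULL;
}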

I would avoid keeping all the bookkeeping information on nodes while they're allocated. I'd have the bare minimum of information (usually just the block size) at the beginning of the block, but nothing more.
I'd track free blocks and allocated blocks separately, so when you're searching for a free block, you don't waste time on blocks that are already in use.
I'd separate the free list into two pieces, and coalesce blocks lazily. In other words, have one free list you're allocating from, and a second that's just a holding area. When the user calls free, just link the block into the holding area, nothing more. When the list you're using for allocations starts to run low, sort the blocks in the holding area by address, then merge with the allocation free list. Then walk the list and merge adjacent blocks.
When you do need to call sbrk (or whatever) to allocate more space from the system, do not just allocate enough space to satisfy the current allocation request. Instead, allocate a fairly large block (e.g., a megabyte), split that to satisfy the request, and add the rest as a block to the free list. If you're running low enough on memory that you have to go to sbrk once, chances are the next few calls will do the same, so you might as well be greedy and grab enough memory immediately to stand a decent chance of satisfying more requests as well.
The basic idea of the third point is to put off coalescing as long as possible to increase the chances of finding adjacent blocks, so when you do coalesce you'll probably do some real good, and to avoid wasting time trying to coalesce when only a few adjacent blocks are free.
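A rough sketch of that lazy-coalescing idea, assuming a minimal singly linked free-block header (the names free_block, lazy_free and coalesce_free_list are made up for illustration):

#include <stddef.h>

/* Hypothetical free-block header; allocated blocks carry only their size. */
struct free_block {
    size_t size;                       /* usable bytes after this header */
    struct free_block *next;
};

static struct free_block *free_list = NULL;  /* list allocations come from     */
static struct free_block *holding = NULL;    /* deferred-coalesce holding area */

/* free(): O(1) - just push the block onto the holding area. */
static void lazy_free(struct free_block *b)
{
    b->next = holding;
    holding = b;
}

/* Called only when free_list runs low: after sorting the holding area by
 * address and splicing it into the (address-ordered) free_list, one pass
 * absorbs physically adjacent blocks. The sort/splice is elided here. */
static void coalesce_free_list(void)
{
    for (struct free_block *b = free_list; b && b->next; ) {
        char *end = (char *)(b + 1) + b->size;   /* first byte past block b */
        if (end == (char *)b->next) {            /* physically adjacent?    */
            b->size += sizeof(struct free_block) + b->next->size;
            b->next = b->next->next;             /* absorb the neighbor     */
        } else {
            b = b->next;
        }
    }
}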

Related

Is there a way to free all nodes in a linked list in one go instead of iterating through each node?

I have the following linked list that has several nodes attached to it. Is there a way to free all the nodes at once instead of iterating through each node?
This is the struct I have:
typedef struct Courses {
    char *courseName;
    int creditValue;
    struct Courses *next;
} Courses;
You can write your own intermediate allocator that allocates a big block of (say) 1000 node structures. Then you can build your list by "allocating" nodes out of your big chunk one at a time, with a simple variable to keep track of how many you've used, and some code to catch the case where you've used them all up.
Then when it's time to free your list, you can just free that one block in one go.
In your example, you may also have to worry about memory dynamically allocated for the courseName pointer. You can handle that, too, although your intermediate allocator gets more complicated, because it ends up being more of a general-purpose malloc replacement, not just a special-purpose Courses node allocator.
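A minimal sketch of such an intermediate node allocator (the names course_alloc and course_free_all are hypothetical, and it deliberately ignores the separately allocated courseName strings discussed above):

#include <stdlib.h>

typedef struct Courses {
    char *courseName;
    int creditValue;
    struct Courses *next;
} Courses;

#define POOL_SIZE 1000

static Courses *pool = NULL;     /* one big block of nodes        */
static size_t pool_used = 0;     /* how many have been handed out */

/* Hand out the next unused node from the big block. */
Courses *course_alloc(void)
{
    if (pool == NULL)
        pool = malloc(POOL_SIZE * sizeof *pool);
    if (pool == NULL || pool_used == POOL_SIZE)
        return NULL;             /* malloc failed or pool exhausted */
    return &pool[pool_used++];
}

/* Freeing the whole list is then one call, not a walk over every node. */
void course_free_all(void)
{
    free(pool);
    pool = NULL;
    pool_used = 0;
}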
No. There's no way other than iterating over the list.
On modern operating systems, allocated memory will be reclaimed by the operating system once the process exits, so you can avoid free()'ing it yourself. But this is not a good alternative, especially if the program is a long-running one, nor is it a universal approach to freeing memory.

Separating the linked list implementation from the malloc function

I have been told to design a linked list module where the linked list functions will all be in one header file, but the memory allocation (malloc) will not happen in those modules.
The memory allocation should ONLY happen in the main function. I am unable to figure it out. Do help me.
That has been implemented already: look at <sys/queue.h>; it's a header-only linked list.
In main you should allocate a sufficiently large amount of memory (a memory pool) in one go. Then in your module you manage (allocate and free) memory chunks from this pool, and don't bother with malloc.
If you don't know about memory pools, read this - http://en.wikipedia.org/wiki/Memory_pool.
However, there is the problem of fragmentation, which you need to tackle. In the steps below I use a bit array to mark the free and allocated nodes.
Example:
In main you allocate 50*sizeof(node) bytes (50 depends on the application).
Now you pass the pointer to the allocated pool to your function.
Keep a counter for the number of allocated nodes, initialized to 0.
Also keep a bit array of size 50, initialized to 0 (all free).
When allocating, check for overflow, then iterate over the bit array to find the first free node. If the j-th bit is 0, hand out the address of the new node as Base + j*sizeof(node), increment the counter, and set the j-th bit to 1.
When deallocating, simply decrement the counter and set the corresponding bit back to 0.
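A sketch of those steps, using one flag byte per node instead of a packed bit array purely for brevity (the node type and function names are hypothetical):

#include <string.h>

#define POOL_NODES 50

typedef struct node {
    int data;
    struct node *next;
} node;

static node *pool;                       /* base of the 50-node block    */
static unsigned char used[POOL_NODES];   /* one flag per node (0 = free) */
static int allocated;                    /* counter of nodes in use      */

/* In main: base = malloc(POOL_NODES * sizeof(node)); then pass it in. */
void pool_init(node *base)
{
    pool = base;
    memset(used, 0, sizeof used);
    allocated = 0;
}

node *pool_alloc(void)
{
    if (allocated == POOL_NODES)         /* overflow check */
        return NULL;
    for (int j = 0; j < POOL_NODES; j++) {
        if (!used[j]) {                  /* first free slot */
            used[j] = 1;
            allocated++;
            return pool + j;             /* Base + j*sizeof(node) */
        }
    }
    return NULL;
}

void pool_free(node *n)
{
    int j = (int)(n - pool);             /* index of this node in the pool */
    used[j] = 0;
    allocated--;
}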
HTH
You can do this as an array of structures and link it by array index. That array can then be allocated in the main function. Note that you have to keep track of the number of entries you have in your list, as the list will be limited to the number of entries you allocate.
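A sketch of that index-linked variant (names are hypothetical; the node array could equally be malloc'd once in main and passed in):

#define MAX_NODES 100

/* Index-linked node: -1 plays the role of NULL. */
struct inode {
    int value;
    int next;                            /* index of the next node, or -1  */
};

struct ilist {
    struct inode nodes[MAX_NODES];       /* fixed pool of nodes            */
    int head;                            /* index of the first node, or -1 */
    int used;                            /* slots handed out so far        */
};

/* Push a value at the head; returns 0, or -1 when the array is full.
 * Initialize the list with head = -1 and used = 0. */
int ilist_push(struct ilist *l, int v)
{
    if (l->used == MAX_NODES)
        return -1;                       /* the list is limited to MAX_NODES */
    l->nodes[l->used].value = v;
    l->nodes[l->used].next = l->head;
    l->head = l->used++;
    return 0;
}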

Malloc Allocation Schemes

Yes, I am taking a Computer systems course.
I had a few questions about the various allocation schemes to implement malloc.
For explicit lists, if I implement malloc using a LIFO-like stack, what exactly is the purpose of having pointers to previous freed memory? Like why do you need doubly-linked lists? Wouldn't singly linked lists work just as well?
Malloc lecture.
I found this link online, you can look at slide 7 to see what I'm talking about.
When looking at a segregated list allocation scheme, these lists are uni-directional, right? And also, what exactly is the coalescing mechanism? For example, if 4 words are freed, would you first try to join them with the free space around them before inserting the block back into the respective segregated linked list? Or would you simply insert the 4-word block into the '4 word' section of the respective segregated linked list?
Thank you.
Since a freed block always has room for two pointers, why not doubly-link the list? It simplifies the coalescing code so it doesn't have to maintain a trailing pointer while traversing the list. It also allows traversing the list in either direction in case there is a hint for which end of the list might be closer to begin the search. One obscure system I once looked at kept a pointer in the "middle", where the last activity occurred.
When freeing a block, there are only four possible cases:
The freed block is immediately after a free block.
The freed block is immediately before a free block.
The freed block is between and adjacent to free blocks both before and after it.
The freed block is not adjacent to any free block.
The purposes of coalescing adjacent free blocks are:
to reduce the length of the linked list
to accurately reflect the size of a free block without burdening the allocator to look ahead to see if two blocks are adjacent
Sorting a free block into a specific-length freelist often has benefits, but in most practical implementations, coalescing is a priority so that an alloc() request for a different size block isn't inappropriately denied when there are many differently-sized free blocks.
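For concreteness, a sketch of how those four cases fall out with an address-ordered free list (singly linked here for brevity; a doubly linked list removes the need for the trailing prev pointer kept below; names are hypothetical):

#include <stddef.h>

struct block {
    size_t size;                     /* usable bytes after this header    */
    struct block *next;              /* next free block, in address order */
};

static struct block *free_list;      /* address-ordered free list */

static int adjacent(const struct block *a, const struct block *b)
{
    return (const char *)(a + 1) + a->size == (const char *)b;
}

/* Insert a freed block in address order, then merge with the free block
 * after it, the one before it, both, or neither. */
void release_block(struct block *b)
{
    struct block *prev = NULL, *cur = free_list;
    while (cur && cur < b) {             /* find the insertion point */
        prev = cur;
        cur = cur->next;
    }
    b->next = cur;
    if (prev) prev->next = b; else free_list = b;

    if (cur && adjacent(b, cur)) {       /* free block adjacent after us  */
        b->size += sizeof *cur + cur->size;
        b->next = cur->next;
    }
    if (prev && adjacent(prev, b)) {     /* free block adjacent before us */
        prev->size += sizeof *b + b->size;
        prev->next = b->next;
    }
    /* Both branches running means we sat between two free blocks; neither
     * running means the block is not adjacent to any free block. */
}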

Is it better to use LIFO order or FIFO order when keeping a list of freed blocks in a dynamic memory allocator?

I'm trying to implement malloc() in C for class, and I can't decide whether a block should be added to the end of the free list or the head of the free list. Which would be better, and why? The list I'm using is a doubly linked list and (for now) is unordered.
Without running a benchmark, the most likely choice to give the best performance is LIFO, i.e. put freed blocks at the head of the free list (the same end you allocate from).
This is because LIFO is most likely to provide temporal locality of reference: a just-freed block is more likely to still reside in a CPU cache than a block freed earlier and not used for a longer period of time.
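A sketch of what that looks like with an unordered, doubly linked free list like the one in the question (names are hypothetical): frees push onto the head, and the allocation search also starts at the head, so the most recently freed, cache-warm block is considered first.

#include <stddef.h>

struct fblock {
    size_t size;
    struct fblock *prev, *next;
};

static struct fblock *free_head;     /* head of the unordered free list */

/* LIFO insertion: the block freed last is the first one the
 * allocation search will see. */
void push_free(struct fblock *b)
{
    b->prev = NULL;
    b->next = free_head;
    if (free_head)
        free_head->prev = b;
    free_head = b;
}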
The difference between the two shouldn't be obvious (if there is one): the order in which blocks are allocated and freed depends on the user (the programmer who's using your malloc), so you can consider it random.
At the very least, keep the list ordered by size.
Take a look at some other techniques if you really want something fast; for instance, implement a buddy system.

Array Performance very similar to LinkedList - What gives?

So the title is somewhat misleading... I'll keep this simple: I'm comparing these two data structures:
An array, whereby it starts at size 1, and for each subsequent addition, there is a realloc() call to expand the memory, and then append the new (malloced) element to the n-1 position.
A linked list, whereby I keep track of the head, tail, and size. And addition involves mallocing for a new element and updating the tail pointer and size.
Don't worry about any of the other details of these data structures. This is the only functionality I'm concerned with for this testing.
In theory, the LL should be performing better. However, they're near identical in time tests involving 10, 100, 1000... up to 5,000,000 elements.
My gut feeling is that the heap is large. I think the data segment defaults to 10 MB on Redhat? I could be wrong. Anyway, realloc() is first checking to see if space is available at the end of the already-allocated contiguous memory location (0-[n-1]). If the n-th position is available, there is not a relocation of the elements. Instead, realloc() just reserves the old space + the immediately following space. I'm having a hard time finding evidence of this, and I'm having a harder time proving that this array should, in practice, perform worse than the LL.
Here is some further analysis, after reading posts below:
[Update #1]
I've modified the code to have a separate list that mallocs memory every 50th iteration for both the LL and the Array. For 1 million additions to the array, there are almost consistently 18 moves. There's no concept of moving for the LL. I've done a time comparison, they're still nearly identical. Here's some output for 10 million additions:
(Array)
time ./a.out a 10,000,000
real 0m31.266s
user 0m4.482s
sys 0m1.493s
(LL)
time ./a.out l 10,000,000
real 0m31.057s
user 0m4.696s
sys 0m1.297s
I would expect the times to be drastically different with 18 moves. The array addition requires one more assignment and one more comparison, to get and check the return value of realloc to detect whether a move occurred.
[Update #2]
I ran an ltrace on the testing that I posted above, and I think this is an interesting result... It looks like realloc (or some memory manager) is preemptively moving the array to larger contiguous locations based on the current size.
For 500 iterations, a memory move was triggered on iterations:
1, 2, 4, 7, 11, 18, 28, 43, 66, 101, 154, 235, 358
Which is roughly geometric - each value is about 1.5 times the previous one. I find this to be pretty interesting - thought I'd post it.
You're right, realloc will just increase the size of the allocated block unless it is prevented from doing so. In a real-world scenario you will most likely have other objects allocated on the heap in between subsequent additions to the list. In that case realloc will have to allocate a completely new chunk of memory and copy the elements already in the list.
Try allocating another object on the heap using malloc for every ten insertions or so, and see if they still perform the same.
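A sketch of that experiment (a hypothetical test loop; the small interleaved allocations are deliberately leaked so they keep sitting between the array and the top of the heap):

#include <stdlib.h>

int main(void)
{
    size_t n = 1000000;
    int *arr = NULL;

    for (size_t i = 0; i < n; i++) {
        int *p = realloc(arr, (i + 1) * sizeof *arr);   /* grow by one, as in the test */
        if (p == NULL) {
            free(arr);
            return 1;
        }
        arr = p;
        arr[i] = (int)i;
        if (i % 10 == 0)
            (void)malloc(64);   /* unrelated heap object blocks in-place growth */
    }
    free(arr);
    return 0;
}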
So you're testing how quickly you can expand an array versus a linked list?
In both cases you're calling a memory allocation function. Generally memory allocation functions grab a chunk of memory (perhaps a page) from the operating system, then divide that up into smaller pieces as required by your application.
The other assumption is that, from time to time, realloc() will spit the dummy and allocate a large chunk of memory elsewhere because it could not get contiguous chunks within the currently allocated page. If you're not making any other calls to memory allocation functions in between your list expand then this won't happen. And perhaps your operating system's use of virtual memory means that your program heap is expanding contiguously regardless of where the physical pages are coming from. In which case the performance will be identical to a bunch of malloc() calls.
Expect performance to change where you mix up malloc() and realloc() calls.
Assuming your linked list is a pointer to the first element, if you want to add an element to the end, you must first walk the list. This is an O(n) operation.
Assuming realloc has to move the array to a new location, it must traverse the array to copy it. This is an O(n) operation.
In terms of complexity, both operations are equal. However, as others have pointed out, realloc may be avoiding relocating the array, in which case adding the element to the array is O(1). Others have also pointed out that the vast majority of your program's time is probably spent in malloc/realloc, which both implementations call once per addition.
Finally, another reason the array is probably faster is cache locality and the generally high performance of linear copies. Jumping around to erratic addresses with significant gaps between them (both the larger elements and the malloc bookkeeping) is not usually as fast as doing a bulk copy of the same volume of data.
The performance of an array-based solution expanded with realloc() will depend on your strategy for creating more space.
If you increase the amount of space by adding a fixed amount of storage on each re-allocation, you'll end up with an average cost per addition that grows with the number of elements already stored in the array. This is on the assumption that realloc will need to (occasionally) allocate space elsewhere and copy the contents, rather than just expanding the existing allocation.
If you increase the amount of space by a proportion of your current number of elements (doubling is pretty standard), you'll end up with an expansion that takes amortized constant time.
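A sketch of the doubling strategy (hypothetical vec type): with doubling, the total copying over n appends is bounded by roughly 2n elements, so the average cost per append is constant.

#include <stdlib.h>

/* Hypothetical growable array of ints. */
struct vec {
    int *data;
    size_t len;    /* elements in use    */
    size_t cap;    /* elements allocated */
};

/* Append with doubling: realloc runs only O(log n) times over n appends,
 * so the copying work totals O(n), i.e. amortized O(1) per append. */
int vec_push(struct vec *v, int x)
{
    if (v->len == v->cap) {
        size_t newcap = v->cap ? v->cap * 2 : 1;
        int *p = realloc(v->data, newcap * sizeof *v->data);
        if (p == NULL)
            return -1;             /* out of memory; old data untouched */
        v->data = p;
        v->cap = newcap;
    }
    v->data[v->len++] = x;
    return 0;
}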
Will the compiler output be much different in these two cases?
This is not a real life situation. Presumably, in real life, you are interested in looking at or even removing items from your data structures as well as adding them.
If you allow removal, but only from the head, the linked list becomes better than the array because removing an item is trivial and, if instead of freeing the removed item, you put it on a free list to be recycled, you can eliminate a lot of the mallocs needed when you add items to the list.
On the other hand, if you need random access to the structure, clearly an array beats the linked list.
(Updated.)
As others have noted, if there are no other allocations in between reallocs, then no copying is needed. Also as others have noted, the risk of a memory copy lessens (but so does its impact, of course) for very small blocks, smaller than a page.
Also, if all you ever do in your test is to allocate new memory space, I am not very surprised you see little difference, since the syscalls to allocate memory are probably taking most of the time.
Instead, choose your data structures depending on how you want to actually use them. A framebuffer is for instance probably best represented by a contiguous array.
A linked list is probably better if you have to reorganise or sort data within the structure quickly.
Then these operations will be more or less efficient depending on what you want to do.
(Thanks for the comments below, I was initially confused myself about how these things work.)
What's the basis of your theory that the linked list should perform better for insertions at the end? I would not expect it to, for exactly the reason you stated. realloc will only copy when it has to in order to maintain contiguity; in other cases it may just combine free chunks and/or increase the chunk size.
However, every linked list node requires fresh allocation and (assuming double linked list) two writes. If you want evidence of how realloc works, you can just compare the pointer before and after realloc. You should find that it usually doesn't change.
I suspect that since you're calling realloc for every element (obviously not wise in production), the realloc/malloc call itself is the biggest bottleneck for both tests, even though realloc often doesn't provide a new pointer.
Also, you're confusing the heap and data segment. The heap is where malloced memory lives. The data segment is for global and static variables.
