custom memory allocator - c

I wrote a custom memory allocator. It has 2 restrictions that I want to remove so it works like malloc/free.
1.) The mem_free call requires a cast to an unsigned char * for its input parameter. I would like it to take a pointer of any type. How can this be done?
2.) The memory allocator I wrote places each new block at the front of the unused part of the buffer and writes the block's size just before it. The free function removes the last allocated block in the buffer, so the order of the malloc/free calls matters or it will not work. How can I remove this restriction?
want to be able to do this:
char* ptr1 = mem_alloc(10);
char* ptr2 = mem_alloc(4);
mem_free(ptr1);
mem_free(ptr2);
have to do this now:
char* ptr1 = mem_alloc(10);
char* ptr2 = mem_alloc(4);
mem_free(ptr2);
mem_free(ptr1);
code:
unsigned char* mem_alloc(unsigned int size) {
unsigned int s;
if( (size + MEM_HEADER_SIZE) > (MEM_MAX_SIZE - mem_current_size_bytes) ) {
return NULL;
}
if(is_big_endian() == 0) {
s = (mem_buff[3] << 24) + (mem_buff[2] << 16) + (mem_buff[1] << 8) + mem_buff[0];
} else {
s = (mem_buff[0] << 24) + (mem_buff[1] << 16) + (mem_buff[2] << 8) + mem_buff[3];
}
memcpy(mem_buff + mem_current_size_bytes, &size, sizeof(unsigned int));
unsigned char* result = mem_buff + (mem_current_size_bytes + MEM_HEADER_SIZE);
mem_current_size_bytes += MEM_HEADER_SIZE + size;
return result;
}
void mem_free(unsigned char* ptr) {
unsigned int i,s;
for(i=0; i<mem_current_size_bytes; i++) {
if( (char*)ptr == (char*)(mem_buff + i) ) {
if(is_big_endian() == 0) {
s = (*(ptr - 1) << 24) + (*(ptr - 2) << 16) + (*(ptr - 3) << 8) + *(ptr - 4);
} else {
s = (*(ptr - 4) << 24) + (*(ptr - 3) << 16) + (*(ptr - 2) << 8) + *(ptr - 1);
}
mem_current_size_bytes-=s;
mem_current_size_bytes-=MEM_HEADER_SIZE;
break;
}
}
}

1) Use a void* instead.
2) Store a map of addresses to allocated blocks and a separate map of unallocated blocks. You can then look up which block is being freed in the allocated map, remove it, and add the block to the unallocated map (making sure to merge it with any free blocks on either side of it). Of course, this can and does lead to memory fragmentation, but that is largely unavoidable.
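The two suggestions above can be sketched together. This is only a minimal illustration, not the asker's allocator: instead of two separate maps it keeps a single address-ordered list of block headers with a free flag, and the names `block_t`, `POOL_BLOCKS` and `round_up` are hypothetical. `mem_free` takes a `void *`, so callers need no cast, and blocks may be freed in any order.

```c
#include <stddef.h>

#define POOL_BLOCKS 64   /* hypothetical pool size, in header-sized units */

/* Every block, used or free, starts with one header. The union with
 * max_align_t keeps payloads aligned for any object type. */
typedef union block {
    struct {
        size_t size;          /* payload bytes following this header */
        int is_free;          /* 1 if the block can be handed out */
        union block *next;    /* next block in address order */
    } h;
    max_align_t align;
} block_t;

static block_t pool[POOL_BLOCKS];
static block_t *head = NULL;

/* Round a byte count up to a whole number of header-sized units. */
static size_t round_up(size_t n) {
    size_t u = sizeof(block_t);
    return (n + u - 1) / u * u;
}

void *mem_alloc(size_t size) {
    if (size == 0) return NULL;
    if (head == NULL) {               /* first call: one big free block */
        head = pool;
        head->h.size = sizeof pool - sizeof(block_t);
        head->h.is_free = 1;
        head->h.next = NULL;
    }
    size = round_up(size);
    for (block_t *b = head; b != NULL; b = b->h.next) {
        if (!b->h.is_free || b->h.size < size) continue;
        /* Split if the remainder can hold a header plus one unit of payload. */
        if (b->h.size >= size + 2 * sizeof(block_t)) {
            block_t *rest = (block_t *)((unsigned char *)(b + 1) + size);
            rest->h.size = b->h.size - size - sizeof(block_t);
            rest->h.is_free = 1;
            rest->h.next = b->h.next;
            b->h.size = size;
            b->h.next = rest;
        }
        b->h.is_free = 0;
        return b + 1;                 /* payload starts right after the header */
    }
    return NULL;                      /* first fit found nothing large enough */
}

void mem_free(void *ptr) {
    if (ptr == NULL) return;
    block_t *b = (block_t *)ptr - 1;
    b->h.is_free = 1;
    /* Merge neighbouring free blocks; list order equals address order. */
    for (block_t *cur = head; cur != NULL; ) {
        block_t *nxt = cur->h.next;
        if (cur->h.is_free && nxt != NULL && nxt->h.is_free) {
            cur->h.size += sizeof(block_t) + nxt->h.size;
            cur->h.next = nxt->h.next;
        } else {
            cur = nxt;
        }
    }
}
```

With this layout, `mem_free(ptr1); mem_free(ptr2);` and the reverse order both leave the pool as one merged free block. The cost is a linear scan on every call, which a real allocator would try to avoid.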

You wrote that you are looking for ideas, so I am attaching one of my university projects, in which we had to implement malloc() and free()...
You can see how I've done it and perhaps get some inspiration, or use it all if you want. It surely isn't the fastest implementation possible, but it is rather simple and should be bug-free ;-)
(Note: if someone from the same course happens to come across this, please don't use it and rather write your own, hence the license ;-)
/*
* The 1st project for Data Structures and Algorithms course of 2010
*
* The Faculty of Informatics and Information Technologies at
* The Slovak University of Technology, Bratislava, Slovakia
*
*
* Own implementation of stdlib's malloc() and free() functions
*
* Author: mnicky
*
*
* License: modified MIT License - see the section b) below
*
* Copyright (C) 2010 by mnicky
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* a) The above copyright notice and this permission notice - including the
* section b) - shall be included in all copies or substantial portions
* of the Software.
*
* b) the Software WILL NOT BE USED IN ANY WORK DIRECTLY OR INDIRECTLY
* CONNECTED WITH The Faculty of Informatics and Information Technologies at
* The Slovak University of Technology, Bratislava, Slovakia
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*
*/
#include <stdio.h>
typedef unsigned int MEMTYPE;
MEMTYPE *mem;
MEMTYPE memSize;
MEMTYPE avail; //1st index of the 1st free range
//return size of block
MEMTYPE blockSize(MEMTYPE x) {
return mem[x];
}
//return next free block
MEMTYPE next(MEMTYPE x) {
return mem[x + mem[x]];
}
//return index of pointer to next free block
MEMTYPE linkToNext(MEMTYPE x) {
return x + mem[x];
}
//initialize memory
void my_init(void *ptr, unsigned size) {
mem = (MEMTYPE *) ptr;
memSize = size / sizeof(MEMTYPE);
mem[0] = memSize - 1;
mem[memSize - 1] = memSize;
avail = 0;
}
//allocate memory
void *my_alloc(unsigned size) {
if (size == 0) { //return NULL pointer after attempt to allocate 0-length memory
return NULL;
}
MEMTYPE num = size / sizeof(MEMTYPE);
if (size % sizeof(MEMTYPE) > 0) num++;
MEMTYPE cur, prev; //pointer to (actually index of) current block, previous block
MEMTYPE isFirstFreeBeingAllocated = 1; //whether the first free block is being allocated
prev = cur = avail;
//testing, whether we have enough free space for allocation
test:
if (avail == memSize) { //if we are on the end of the memory
return NULL;
}
if (blockSize(cur) < num) { //if the size of free block is lower than requested
isFirstFreeBeingAllocated = 0;
prev = cur;
if (next(cur) == memSize) { //if not enough memory
return NULL;
}
else
cur = next(cur);
goto test;
}
if (blockSize(cur) == num) { //if the size of free block is equal to requested
if (isFirstFreeBeingAllocated)
avail = next(cur);
else
mem[linkToNext(prev)] = next(cur);
}
else { //if the size of free block is greater than requested
if (isFirstFreeBeingAllocated) {
if ((blockSize(cur) - num) == 1) //if there is only 1 free item left from this (previously) free block
avail = next(cur);
else
avail = cur + num + 1;
}
else {
if ((blockSize(cur) - num) == 1) //if there is only 1 free item left from this (previously) free block
mem[linkToNext(prev)] = next(cur);
else
mem[linkToNext(prev)] = cur + num + 1;
}
if ((blockSize(cur) - num) == 1) //if there is only 1 free item left from this (previously) free block
mem[cur] = num + 1;
else {
mem[cur + num + 1] = blockSize(cur) - num - 1;
mem[cur] = num;
}
}
return (void *) &(mem[cur+1]);
}
//free memory
void my_free(void *ptr) {
MEMTYPE toFree; //pointer to block (to free)
MEMTYPE cur, prev;
toFree = ((MEMTYPE *)ptr - (mem + 1));
if (toFree < avail) { //if block, that is being freed is before the first free block
if (((linkToNext(toFree) + 1) == avail) && avail < memSize) //if next free block is immediately after block that is being freed
mem[toFree] += (mem[avail] + 1); //defragmentation of free space
else
mem[linkToNext(toFree)] = avail;
avail = toFree;
}
else { //if block, that is being freed isn't before the first free block
prev = cur = avail;
while (cur < toFree) {
prev = cur;
cur = next(cur);
}
if ((linkToNext(prev) + 1) == toFree) { //if previous free block is immediately before block that is being freed
mem[prev] += (mem[toFree] + 1); //defragmentation of free space
if (((linkToNext(toFree) + 1) == cur) && cur < memSize) //if next free block is immediately after block that is being freed
mem[prev] += (mem[cur] + 1); //defragmentation of free space
else
mem[linkToNext(toFree)] = cur;
}
else {
mem[linkToNext(prev)] = toFree;
mem[linkToNext(toFree)] = cur;
}
}
}
It was designed to use as little space for metadata as possible. An allocated range is marked by a number in its first node, giving the number of allocated nodes that follow.
A free range is marked the same way: its first node holds the number of nodes that follow it in the range, and the last of those nodes holds the index of the first node of the next free range. (The original post illustrated this with a figure, not reproduced here, in which allocated space was red and free space white.)
And it can be used like this:
char region[30000000]; //space for memory allocation
my_init(region, 30000000); //memory initialization
var = (TYPE *) my_alloc(sizeof(TYPE)); //memory allocation
my_free((void *) var); //freeing the memory
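The free-range layout described above can be traced with a short helper. This is a sketch under the same conventions as the listing (`mem[x]` is the node count after a range's header, `mem[x + mem[x]]` is the link to the next free range, and `memSize` marks the end of the list); the function name `count_free_nodes` is hypothetical and not part of the original project.

```c
typedef unsigned int MEMTYPE;

/* Walks the chain of free ranges starting at index avail and sums the
 * nodes that follow each range header. The walk stops when a link holds
 * memSize, which the allocator uses as its end-of-list marker. */
MEMTYPE count_free_nodes(const MEMTYPE *mem, MEMTYPE memSize, MEMTYPE avail) {
    MEMTYPE total = 0;
    for (MEMTYPE x = avail; x < memSize; x = mem[x + mem[x]]) {
        total += mem[x];
    }
    return total;
}
```

Right after `my_init` on a region of 8 nodes, `mem[0]` is 7 and `mem[7]` is 8, so the helper reports 7 free nodes in a single range.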

Just use a void* pointer instead of char* in the argument of mem_free.
To make the memory allocator work with any memory location, you need to add much more complexity. I'd recommend researching how memory heaps are managed; you will find some basic schemes to try out.

You have to rewrite the whole code. Memory allocators usually use a linked list to store used and unused chunks of memory; you can put the link in the header of the chunk, before the size. I advise you to search for some articles about how memory allocators work. Writing a well-performing memory allocator is a hard task.

Related

Fragmentation in custom allocator. Unable to order pointers correctly

Context: I have written a best-fit memory allocator. It allocates large blocks of memory and serves the best-fitting chunks of them to programs upon request. If memory reserves are exhausted, it asks the OS for more. All free blocks are stored on a linked list, ordered by increasing pointer value.
Problem: When memory is released, the program is supposed to link it back into the free list of blocks for recycling and, more crucially, merge the given block with its surrounding blocks if possible. Unfortunately, this only works well as long as I do not need to ask the OS for more than one super-block. When this does occur, the new blocks I receive have nonsensical addresses and get inserted within the address space of other super-blocks. This leads to permanent fragmentation.
Tl;dr: I am being given new super-blocks of memory whose address is within the address space of another super-block, leading to fragmentation when sub-blocks are returned.
Illustration of problem:
The original post included a diagram here (not reproduced), annotated with memory addresses from a real execution; beige blocks were free memory and white blocks were memory unlinked for use.
The diagram showed the progression of memory usage until the fragmentation catalyst occurs: once the new block gets inserted, the busy blocks can never be merged when they are checked back in.
Code: Reproducible Example:
The following includes the allocator and a small test program. It must be compiled with at least C99.
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
/*****************************************************************************/
/* SYMBOLIC CONSTANTS */
/*****************************************************************************/
#define NEXT(hp) ((hp)->key.next)
#define UNITS(hp) ((hp)->key.units)
#define UNIT_SIZE sizeof(Header)
#define MIN_UNITS_ALLOC 64
#define MAX(a,b) ((a) > (b) ? (a) : (b))
/*****************************************************************************/
/* TYPE DEFINITIONS */
/*****************************************************************************/
typedef union header {
intmax_t align;
struct {
union header *next;
unsigned units;
} key;
} Header;
/*****************************************************************************/
/* GLOBAL VARIABLES */
/*****************************************************************************/
Header base = {.key = { NULL, 0 }};
Header *list;
/*****************************************************************************/
/* PROTOTYPES */
/*****************************************************************************/
/* Allocates a 'bytes' size block of memory. On success, returns pointer to
* the block. On error, NULL is returned. */
void *alloc (size_t bytes);
/* Returns a 'bytes' size block of allocated memory for reuse */
void release (void *bytes);
/* Attempts to reserve memory from the operating system via a system call */
static Header *reserve (unsigned units);
/*****************************************************************************/
/* FUNCTION IMPLEMENTATIONS */
/*****************************************************************************/
void *alloc (size_t bytes) {
size_t units, diff;
Header *block, *lastBlock, *best, *lastBest;
if (bytes == 0) {
return NULL;
}
if (list == NULL) {
list = base.key.next = &base;
}
best = lastBest = NULL;
units = (bytes + UNIT_SIZE - 1) / UNIT_SIZE + 1;
diff = SIZE_MAX;
for (lastBlock = list, block = NEXT(list); ; lastBlock = block, block = NEXT(block)) {
/* Loop across list, find closest fitting block */
if (UNITS(block) >= units && UNITS(block) - units < diff) {
diff = UNITS(block) - units;
best = block;
lastBest = lastBlock;
}
/* Upon cycle completion */
if (block == list) {
/* If no block available, reserve some. */
if (best == NULL) {
if ((lastBest = reserve(units)) == NULL) {
return NULL;
} else {
fprintf(stderr, "\nalloc: Out of memory, linking new block %lld of size %u.\n\n", (long long)lastBest, UNITS(lastBest));
release((void *)(lastBest + 1));
}
/* If block of perfect size, return. Else slice and return */
} else {
if (diff == 0) {
NEXT(lastBest) = NEXT(best);
} else {
UNITS(best) = diff;
best += diff;
UNITS(best) = units;
}
fprintf(stderr, "alloc: Unlinked block %lld of %u units.\n", (long long)best, UNITS(best));
return (void *)(best + 1);
}
}
}
}
void release (void *bytes) {
Header *p, *block;
block = (Header *)bytes - 1;
/* Choose p such that: p -> block -> NEXT(p) */
for (p = list; !(p < block && NEXT(p) > block); p = NEXT(p)) {
if (p >= NEXT(p) && (block > p || block < NEXT(p))) {
break;
}
}
/* Merge block with NEXT(p) if adjacent */
if (block + UNITS(block) == NEXT(p)) {
NEXT(block) = NEXT(NEXT(p));
UNITS(block) += UNITS(NEXT(p));
} else {
NEXT(block) = NEXT(p);
}
/* Merge block with p if adjacent */
if (p + UNITS(p) == block) {
NEXT(p) = NEXT(block);
UNITS(p) += UNITS(block);
} else {
NEXT(p) = block;
}
}
static Header *reserve (unsigned units) {
char *bytes, *sbrk(int);
Header *block;
units = MAX(units, MIN_UNITS_ALLOC);
if ((bytes = sbrk(units)) == (char *)-1) {
return NULL;
} else {
block = (Header *)bytes;
UNITS(block) = units;
}
return block;
}
/*****************************************************************************/
/* TESTING FUNCTIONS (DELETE) */
/*****************************************************************************/
void printFreeList (unsigned byAddress) {
Header *lp = list;
if (lp == NULL) {
fprintf(stdout, "List is NULL\n");
return;
}
do {
fprintf(stdout, "[ %lld ] -> ", (byAddress ? (long long)lp : (long long)UNITS(lp)));
lp = NEXT(lp);
} while (lp != list);
putc('\n', stdout);
}
void releaseItemAtIndex (int i, int k, long long *p[]) {
fprintf(stderr, "Releasing block %d/%d\n", i, k);
release(p[i]);
for (int j = i; j < k; j++) {
if (j + 1 < k) {
p[j] = p[j + 1];
}
}
}
#define MAX_TEST_SIZE 5000
int main (int argc, const char *argv[]) {
/* Seed PRNG: For removed random deletion (now manual) */
srand(time(NULL));
/* Array of pointers to store allocated blocks, blockSize we want to allocate. */
long long *p[MAX_TEST_SIZE], blockSize = 256;
/* Number of blocks we choose to allocate */
int k = 6;
/* Allocate said blocks */
for (int i = 0; i < k; i++) {
p[i] = alloc(blockSize * sizeof(char));
printFreeList(0); printFreeList(1); putchar('\n');
}
fprintf(stderr, "\n\n");
int idx;
while (k > 0) {
fprintf(stderr, "Delete an index between 0 up to and including %d:\n", k - 1);
scanf("%d", &idx);
releaseItemAtIndex(idx, k, p);
printFreeList(0); printFreeList(1); putchar('\n');
k--;
}
return 0;
}
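One detail in `reserve` above may be worth checking: `sbrk` takes an increment in bytes, whereas the listing passes it a count of header-sized units while recording that same count in `UNITS(block)`. K&R's `morecore` multiplies by `sizeof(Header)` before calling `sbrk`. If the two disagree, a super-block would be recorded as larger than the memory actually obtained, which would be consistent with later super-blocks appearing inside earlier ones. A minimal sketch of the conversion follows; `units_to_bytes` is a hypothetical helper, not part of the original code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: converts a request expressed in header-sized units
 * into the byte count a call such as sbrk() expects.
 * Returns 0 if the multiplication would overflow size_t. */
size_t units_to_bytes(size_t units, size_t unit_size) {
    if (unit_size != 0 && units > SIZE_MAX / unit_size) {
        return 0;
    }
    return units * unit_size;
}
```

In `reserve`, the result of `units_to_bytes(units, UNIT_SIZE)` would be what is passed to `sbrk`, while `UNITS(block)` keeps the unit count.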
Miscellaneous Details:
I am running a 64 bit operating system.
I do not know if pointer comparison on the heap is guaranteed to be valid. This is not guaranteed by the standard according to K&R.
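On the pointer-comparison concern: in standard C, relational operators such as `<` are only defined for pointers into the same array or object. A common workaround, sketched here as an assumption rather than a guarantee, is to compare the addresses as integers via `uintptr_t`. The conversion is implementation-defined, but on flat-address-space platforms it yields the expected ordering. The helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Nonzero if address a sorts before address b when both are viewed as
 * integers. Relies on the implementation-defined pointer-to-integer
 * mapping being monotonic, which holds on typical flat-memory systems. */
int addr_before(const void *a, const void *b) {
    return (uintptr_t)a < (uintptr_t)b;
}

/* Nonzero if the n-byte region starting at a ends exactly where b begins,
 * which is the test a free list uses before merging two blocks. */
int addr_adjacent(const void *a, size_t n, const void *b) {
    return (uintptr_t)a + n == (uintptr_t)b;
}
```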
Well I eventually wrote one a couple years later that seems to work alright. Bonus: It's using static memory.
Header File
#if !defined(STATIC_ALLOCATOR_H)
#define STATIC_ALLOCATOR_H
/*
*******************************************************************************
* (C) Copyright 2020 *
* Created: 07/04/2020 *
* *
* Programmer(s): *
* - Jillian Oduber *
* - Charles Randolph *
* *
* Description: *
* Static first-fit memory allocator *
* *
*******************************************************************************
*/
#include <iostream>
#include <cstddef>
#include <cstdint>
#include "dx_types.h"
extern "C" {
#include "stddef.h"
}
/*
*******************************************************************************
* Type Definitions *
*******************************************************************************
*/
// Defines the header block used in the custom allocator
typedef union block_h {
struct {
union block_h *next;
size_t size; // Size is in units of sizeof(block_h)
} d;
max_align_t align;
} block_h;
/*
*******************************************************************************
* Class Definitions *
*******************************************************************************
*/
class Static_Allocator {
public:
/*\
* #brief Configures the class with the given static memory capacity
* #param memory Pointer to static array (must support address comparison)
* #param size Total number of bytes allotted for distribution
\*/
Static_Allocator(uint8_t *memory, size_t size);
/*\
* #brief Returns a memory block of the desired size
* #param size Number of bytes to allocate
* #return
* - valid pointer if memory is available
* - NULL otherwise
\*/
uint8_t *alloc (size_t size);
/*\
* #brief Releases the allocated block of memory
* #param ptr Pointer to allocated block of memory
* #return
* - STATUS_OK on success
* - STATUS_ERR if invalid block
* - STATUS_BAD_PARAM on NULL parameter
\*/
dx::status_t free (uint8_t *ptr);
/*\
* #brief Returns the amount of free memory
* #return size_t Bytes (of available free memory)
\*/
size_t free_memory_size ();
/*\
* #brief Debug utility which shows what is on the free-list
* #return None
\*/
void show ();
/*\
* #brief Debug utility which asserts there is only a single
* memory block
\*/
bool unified ();
private:
// Fixed allocator capacity
size_t d_capacity;
// Fixed memory
uint8_t *d_memory;
// Pointer to head of the linked list
block_h *d_free_list;
// Available memory
size_t d_free_memory_size;
// Unit size
static const size_t mem_unit_size = sizeof(block_h);
};
#endif
Source File
#include "static_allocator.h"
Static_Allocator::Static_Allocator (uint8_t *memory, size_t size):
d_capacity(0),
d_memory(NULL),
d_free_list(NULL),
d_free_memory_size(0)
{
// Assign memory array
d_memory = memory;
// Trim off two mem_unit_size for head + init-block + spillover
d_capacity = size - (size % mem_unit_size) - 2 * mem_unit_size;
// Ensure free memory is set to capacity
d_free_memory_size = d_capacity;
}
uint8_t * Static_Allocator::alloc (size_t size)
{
block_h *last, *curr;
// Number of blocks sized units to allocate
size_t nblocks = (size + mem_unit_size - 1) / mem_unit_size + 1;
// If larger than total capacity or invalid size; immediately return
if ((nblocks * sizeof(block_h)) > d_capacity || size == 0) {
return NULL;
}
// If uninitialized, create initial list units
if ((last = d_free_list) == NULL) {
block_h *head = reinterpret_cast<block_h *>(d_memory);
head->d.size = 0;
block_h *init = head + 1;
init->d.size = d_capacity / sizeof(block_h);
init->d.next = d_free_list = last = head;
head->d.next = init;
}
// Problem: You must not be allowed to merge the init block
// Look for space - stop if wrapped around
for (curr = last->d.next; ; last = curr, curr = curr->d.next) {
// If sufficient space available
if (curr->d.size >= nblocks) {
if (curr->d.size == nblocks) {
last->d.next = curr->d.next;
} else {
curr->d.size -= nblocks;
curr += curr->d.size; // Suspect
curr->d.size = nblocks;
}
// Reassign free list head
d_free_list = last;
// Update amount of free memory available
d_free_memory_size -= nblocks * sizeof(block_h);
return reinterpret_cast<uint8_t *>(curr + 1);
}
// Otherwise not sufficient space. If reached
// the head of the list again, there is no more
// memory left
if (curr == d_free_list) {
return NULL;
}
}
}
dx::status_t Static_Allocator::free (uint8_t *ptr)
{
block_h *b, *p;
// Check if parameter is valid
if (ptr == NULL) {
std::cerr << "Null pointer!";
return dx::STATUS_BAD_PARAM;
}
// Check if memory is in range
if (!(ptr >= (d_memory + 2 * mem_unit_size)
&& ptr < (d_memory + d_capacity + 2 * mem_unit_size))) {
std::cerr << "Pointer out of range!";
return dx::STATUS_ERR;
}
// Obtain block header (ptr - sizeof(block_h))
b = reinterpret_cast<block_h *>(ptr) - 1;
// Update available memory size (size is in units; the counter is in bytes, as in alloc)
d_free_memory_size += b->d.size * sizeof(block_h);
// Find insertion location for block
for (p = d_free_list; !(b >= p && b < p->d.next); p = p->d.next) {
// If the block comes at the end of the list - break
if (p >= p->d.next && b > p) {
break;
}
// If at the end of the list, but block comes before next link
if (p >= p->d.next && b < p->d.next) {
break;
}
}
// [p] <----b----> [p->next] ----- [X]
// Check if we can merge forwards
if (b + b->d.size == p->d.next) {
b->d.size += (p->d.next)->d.size;
b->d.next = (p->d.next)->d.next;
} else {
b->d.next = p->d.next;
}
// Check if we can merge backwards
if (p + p->d.size == b) {
p->d.size += b->d.size;
p->d.next = b->d.next;
} else {
p->d.next = b;
}
d_free_list = p;
return dx::STATUS_OK;
}
size_t Static_Allocator::free_memory_size ()
{
return d_free_memory_size;
}
void Static_Allocator::show ()
{
std::cout << "Block Size: " << sizeof(block_h) << "\n"
<< "Capacity: " << d_capacity << "\n"
<< "Free Size (B): " << d_free_memory_size << "\n";
if (d_free_list == NULL) {
std::cout << "<uninitialized>\n";
return;
}
// Show all blocks
block_h *p = d_free_list;
do {
std::cout << "-------------------------------\n";
std::cout << "Block Address: " << (uint64_t)(p) << "\n";
std::cout << "Blocks (32B): " << p->d.size << "\n";
std::cout << "Next: " << (uint64_t)(p->d.next) << "\n";
std::cout << "{The next block is " << (uint64_t)((p->d.next) - (p)) << " bytes away, but we claim the next " << p->d.size * mem_unit_size << " bytes\n";
p = p->d.next;
} while (p != d_free_list);
std::cout << "-------------------------------\n\n\n";
}
bool Static_Allocator::unified () {
if (d_free_list == NULL) {
return false;
}
return ((d_free_list->d.next)->d.next == d_free_list);
}

Confused about K&R implementation of malloc

The code for function malloc() in K&R section 8.7 is below
void *malloc(unsigned nbytes) {
Header *p, *prevp;
Header *morecore(unsigned);
unsigned nunits;
nunits = (nbytes+sizeof(Header)-1)/sizeof(Header) + 1;
if ((prevp = freep) == NULL) { /* no free list yet */
base.s.ptr = freep = prevp = &base;
base.s.size = 0;
}
for (p = prevp->s.ptr; ; prevp = p, p = p->s.ptr) {
if (p->s.size >= nunits) { /* big enough */
if (p->s.size == nunits) { /* exactly */
prevp->s.ptr = p->s.ptr;
} else { /* allocate tail end */
p->s.size -= nunits;
p += p->s.size;
p->s.size = nunits;
}
freep = prevp;
return (void *)(p+1);
}
if (p == freep) /* wrapped around free list */
if ((p = morecore(nunits)) == NULL)
return NULL; /* none left */
}
}
I'm mainly confused by the "allocate tail end" part.
Suppose p->s.size = 5 and nunits = 2. According to the code, we first subtract 2 from p->s.size, advance p by 3, record allocated size at that address and return (void *)(p+1).
Let p' denote p after self-increment and * denote free space. The memory after above operations should look like this:
p * * p' * *
We've actually allocated 2 units of memory, but the remaining free space for p should be 2 instead of 3, since one unit is occupied by the header information for the allocated tail end.
So I think the line
p->s.size -= nunits;
should be replaced by
p->s.size -= nunits + 1;
Have I missed anything?
The answer is in this line
nunits = (nbytes+sizeof(Header)-1)/sizeof(Header) + 1;
That line takes the number of bytes requested, nbytes, adds sizeof(Header)-1 so that the division rounds up, and divides by sizeof(Header) to get the number of units needed to hold nbytes. Finally, it adds 1 to make room for the header. So all the code after that assumes that you're reserving space for nbytes (plus padding if needed) plus the header.
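The arithmetic can be checked with a small sketch. The helper names `units_needed` and `tail_offset` are hypothetical, and `unit_size` stands in for `sizeof(Header)`. Note that a block's size counts its own header unit, so a free block of size 5 spans units 0 to 4: after taking 2 units from the tail, units 0 to 2 stay free (size 3) and units 3 and 4 become the allocated block's header and its single payload unit.

```c
#include <stddef.h>

/* Units reserved for a request of nbytes: the payload rounded up to whole
 * units, plus one unit for the block header. */
size_t units_needed(size_t nbytes, size_t unit_size) {
    return (nbytes + unit_size - 1) / unit_size + 1;
}

/* When nunits are carved from the tail of a free block of free_units units,
 * this is both the size left in the free block and the unit offset (from
 * the block's start) at which the allocated block's header lands. */
size_t tail_offset(size_t free_units, size_t nunits) {
    return free_units - nunits;
}
```

For a 16-byte header, requests of 1 and 16 bytes both need 2 units, and a request of 17 bytes needs 3.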

Strange behaviour on Realloc: invalid next size [duplicate]

I know there are tons of other realloc questions and answers and I have read almost all of them, but I still couldn't manage to fix my problem.
I decided to stop trying when I accidentally discovered a very strange behaviour in my code.
I introduced a line to try something, but although I don't use the value of newElems in main, the line changes the behaviour.
When the line is commented out, the code fails at the first realloc. With the line included, the first realloc works (it still crashes on the second one).
Any ideas on what might be happening?
int main(int argc, char** argv) {
Pqueue q = pqueue_new(3);
Node a = {.name = "a"}, b = {.name = "b"},
c = {.name = "c"}, d = {.name = "d"};
push(& q, & a, 3);
// the next one is the strange line: as you can see, it doesn't modify q
// but commenting it out produces different behaviour
Pqueue_elem* newElems = realloc(q.elems, 4 * q.capacity * sizeof *newElems);
push(& q, & b, 5);
push(& q, & c, 4);
char s[5];
Node* n;
for (int i = 1; i <= 65; ++i) {
sprintf(s, "%d", i);
n = malloc(sizeof *n);
n->name = strdup(s);
push(& q, n, i);
}
Node* current = NULL;
while ((current = pop(& q))) {
printf("%s ", current->name);
}
return 0;
}
and the push function:
void push(Pqueue* q, Node* item, int priority) {
if (q->size >= q->capacity) {
if (DEBUG)
fprintf(stderr, "Reallocating bigger queue from capacity %d\n",
q->capacity);
q->capacity *= 2;
Pqueue_elem* newElems = realloc(q->elems,
q->capacity * sizeof *newElems);
check(newElems, "a bigger elems array");
q->elems = newElems;
}
// append at the end, then find its correct place and move it there
int idx = ++q->size, p;
while ((p = PARENT(idx)) && priority > q->elems[p].priority) {
q->elems[idx] = q->elems[p];
idx = p;
}
// after exiting the while, idx is at the right place for the element
q->elems[idx].data = item;
q->elems[idx].priority = priority;
}
The pqueue_new function:
Pqueue pqueue_new(unsigned int size) {
if (size < 4)
size = 4;
Pqueue* q = malloc(sizeof *q);
check(q, "a new queue.");
q->capacity = size;
q->elems = malloc(q->capacity * sizeof *(q->elems));
check(q->elems, "queue's elements");
return *q;
}
realloc will change the amount of memory that is allocated, if needed. It is also free to move the data to another place in memory if that's more efficient (avoiding memory fragmentation).
The function, then, returns a new pointer to the new location in memory where your data is hiding. You're calling realloc, and allocating (probably) four times as much memory as before, so it's very likely that that allocated memory is situated elsewhere in memory.
In your comment, you said realloc works like free + malloc. In some cases it can behave similarly, but realloc and free are fundamentally different functions that do different tasks. Both manage dynamic memory, so there are obvious similarities, and realloc can sometimes seem to do the same thing as a free followed by a malloc, but the two should not be confused.
However, by not assigning the return value of realloc to q.elems, you're left with a pointer to a memory address that is no longer valid. The rest of your program can, and probably does, exhibit signs of undefined behaviour, then.
Unless you show some more code, I suspect this will take care of the problem:
//change:
Pqueue_elem* newElems = realloc(q.elems, 4 * q.capacity * sizeof *newElems);
//to
q.elems = realloc(q.elems, 4 * q.capacity * sizeof *q.elems);
Or better yet, check for NULL pointers:
Pqueue_elem* newElems = realloc(q.elems, 4 * q.capacity * sizeof *newElems);
if (newElems == NULL)
exit( EXIT_FAILURE );// + fprintf(stderr, "Fatal error...");
q.elems = newElems;//<-- assign new pointer!
Looking at your pqueue_new function, I would suggest a different approach. Have it return the pointer to Pqueue. You're working with a piece of dynamic memory, treat it accordingly, and have your code reflect that all the way through:
Pqueue * pqueue_new(size_t size)
{//size_t makes more sense
if (size < 4)
size = 4;
Pqueue* q = malloc(sizeof *q);
check(q, "a new queue.");
q->capacity = size;
q->size = 0; /* assumed fix: push() reads q->size, which malloc leaves uninitialized */
q->elems = malloc(q->capacity * sizeof *(q->elems));
check(q->elems, "queue's elements");
return q;
}
Alternatively, pass the function a pointer to a stack variable:
void pqueue_new(Pqueue *q, size_t size)
{
if (q == NULL)
{
fprintf(stderr, "pqueue_new does not do NULL pointers, I'm not Chuck Norris");
return;//or exit
}
if (size < 4)
size = 4;
q->capacity = size;
q->size = 0; /* assumed fix: push() reads q->size, which is otherwise uninitialized */
q->elems = malloc(q->capacity * sizeof *(q->elems));
check(q->elems, "queue's elements");
}
//call like so:
int main ( void )
{
Pqueue q;
pqueue_new(&q, 3);
}
Those would be the more common approaches.
Thank you all for the suggestions! I wouldn't have solved it without them,
The strange behaviour was caused by an off by one error. I was reallocating the queue only when q->size >= q->capacity, but since q was indexed from 0, it meant that before realloc I was writing in a forbidden location (q->elems[q->size]), which messed everything up.
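The off-by-one can be illustrated with a simplified heap. This is a sketch, not the asker's final code: it assumes a 1-based max-heap of plain ints (slot 0 unused, as `push` writes to index `++q->size`), and the names `IntHeap` and `heap_push` are hypothetical. Because the new element lands at index `size + 1` and valid indices run from 0 to `capacity - 1`, the array has to grow as soon as `size + 1` reaches `capacity`.

```c
#include <stdlib.h>

typedef struct {
    int *elems;        /* slot 0 is unused; the heap occupies 1..size */
    size_t size;       /* number of stored elements */
    size_t capacity;   /* number of allocated slots */
} IntHeap;

/* Inserts value into the max-heap. Returns 0 on success, -1 if memory
 * could not be obtained (the heap is left unchanged in that case). */
int heap_push(IntHeap *h, int value) {
    if (h->size + 1 >= h->capacity) {      /* elems[size + 1] must exist */
        size_t new_cap = h->capacity ? h->capacity * 2 : 4;
        int *grown = realloc(h->elems, new_cap * sizeof *grown);
        if (grown == NULL) return -1;
        h->elems = grown;
        h->capacity = new_cap;
    }
    size_t idx = ++h->size;
    /* Sift up: pull smaller parents down until value fits. */
    while (idx > 1 && value > h->elems[idx / 2]) {
        h->elems[idx] = h->elems[idx / 2];
        idx /= 2;
    }
    h->elems[idx] = value;
    return 0;
}
```

Starting from a zero-initialized `IntHeap`, the first push allocates four slots, and every later growth happens one element earlier than a `size >= capacity` test would trigger it.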

Explain this implementation of malloc from the K&R book

This is an excerpt from the book on C by Kernighan and Ritchie. It shows how to implement a version of malloc. Although well commented, I am having great difficulty in understanding it. Can somebody please explain it?
typedef long Align; /* for alignment to long boundary */
union header { /* block header */
struct {
union header *ptr; /* next block if on free list */
unsigned size; /* size of this block */
} s;
Align x; /* force alignment of blocks */
};
typedef union header Header;
static Header base; /* empty list to get started */
static Header *freep = NULL; /* start of free list */
/* malloc: general-purpose storage allocator */
void *malloc(unsigned nbytes)
{
Header *p, *prevp;
Header *morecore(unsigned);
unsigned nunits;
nunits = (nbytes+sizeof(Header)-1)/sizeof(Header) + 1;
if ((prevp = freep) == NULL) { /* no free list yet */
base.s.ptr = freep = prevp = &base;
base.s.size = 0;
}
for (p = prevp->s.ptr; ; prevp = p, p = p->s.ptr) {
if (p->s.size >= nunits) { /* big enough */
if (p->s.size == nunits) /* exactly */
prevp->s.ptr = p->s.ptr;
else { /* allocate tail end */
p->s.size -= nunits;
p += p->s.size;
p->s.size = nunits;
}
freep = prevp;
return (void *)(p+1);
}
if (p == freep) /* wrapped around free list */
if ((p = morecore(nunits)) == NULL)
return NULL; /* none left */
}
}
#define NALLOC 1024 /* minimum #units to request */
/* morecore: ask system for more memory */
static Header *morecore(unsigned nu)
{
char *cp, *sbrk(int);
Header *up;
if (nu < NALLOC)
nu = NALLOC;
cp = sbrk(nu * sizeof(Header));
if (cp == (char *) -1) /* no space at all */
return NULL;
up = (Header *) cp;
up->s.size = nu;
free((void *)(up+1));
return freep;
}
/* free: put block ap in free list */
void free(void *ap) {
Header *bp, *p;
bp = (Header *)ap - 1; /* point to block header */
for (p = freep; !(bp > p && bp < p->s.ptr); p = p->s.ptr)
if (p >= p->s.ptr && (bp > p || bp < p->s.ptr))
break; /* freed block at start or end of arena */
if (bp + bp->s.size == p->s.ptr) {
bp->s.size += p->s.ptr->s.size;
bp->s.ptr = p->s.ptr->s.ptr;
} else
bp->s.ptr = p->s.ptr;
if (p + p->s.size == bp) {
p->s.size += bp->s.size;
p->s.ptr = bp->s.ptr;
} else
p->s.ptr = bp;
freep = p;
}
Ok, what we have here is a chunk of really poorly written code. What I will do in this post could best be described as software archaeology.
Step 1: fix the formatting.
The indentation and compact format don't do anyone any good. Various spaces and empty rows need to be inserted. The comments could be written in more readable ways. I'll start by fixing that.
At the same time I'm changing the brace style from K&R style - please note that the K&R brace style is acceptable, this is merely a personal preference of mine. Another personal preference is to write the * for pointers next to the type pointed at. I'll not argue about (subjective) style matters here.
Also, the type definition of Header is completely unreadable, it needs a drastic fix.
And I spotted something completely obscure: they seem to have declared a function prototype inside the function: Header* morecore(unsigned);. This is very old and very poor style. C still allows such block-scope declarations, but they hide the prototype from the rest of the file. Let's just remove that line; whatever that function does, it will have to be defined elsewhere.
typedef long Align; /* for alignment to long boundary */
typedef union header /* block header */
{
struct
{
union header *ptr; /* next block if on free list */
unsigned size; /* size of this block */
} s;
Align x; /* force alignment of blocks */
} Header;
static Header base; /* empty list to get started */
static Header* freep = NULL; /* start of free list */
/* malloc: general-purpose storage allocator */
void* malloc (unsigned nbytes)
{
Header* p;
Header* prevp;
unsigned nunits;
nunits = (nbytes + sizeof(Header) - 1) / sizeof(header) + 1;
if ((prevp = freep) == NULL) /* no free list yet */
{
base.s.ptr = freeptr = prevptr = &base;
base.s.size = 0;
}
for (p = prevp->s.ptr; ; prevp = p, p = p->s.ptr)
{
if (p->s.size >= nunits) /* big enough */
{
if (p->s.size == nunits) /* exactly */
prevp->s.ptr = p->s.ptr;
else /* allocate tail end */
{
p->s.size -= nunits;
p += p->s.size;
p->s.size = nunits
}
freep = prevp;
return (void *)(p+1);
}
if (p == freep) /* wrapped around free list */
if ((p = morecore(nunits)) == NULL)
return NULL; /* none left */
}
}
Ok now we might actually be able to read the code.
Step 2: weed out widely-recognized bad practice.
This code is filled with things that are nowadays regarded as bad practice. They need to be removed, since they jeopardize the safety, readability and maintenance of the code. If you want a reference to an authority preaching the same practices as me, check out the widely-recognized coding standard MISRA-C.
I have spotted and removed the following bad practices:
1) Just typing unsigned in the code could lead to confusion: was this a typo by the programmer or was the intention to write unsigned int? We should replace all unsigned with unsigned int. But as we do that, we find that it is used in this context to give the size of various binary data. The correct type to use for such matters is the C standard type size_t. This is also an unsigned integer type, but it is guaranteed to be "large enough" for the particular platform. The sizeof operator returns a result of type size_t and if we look at the C standard's definition of the real malloc, it is void *malloc(size_t size);. So size_t is the most correct type to use.
2) It is a bad idea to use the same name for our own malloc function as the one residing in stdlib.h. Should we need to include stdlib.h, things will get messy. As a rule of thumb, never use identifier names of C standard library functions in your own code. I'll change the name to kr_malloc.
3) The code is abusing the fact that all static variables are guaranteed to be initialized to zero. This is well-defined by the C standard, but a rather subtle rule. Let's initialize all statics explicitly, to show that we haven't forgotten to init them by accident.
4) Assignment inside conditions is dangerous and hard to read. This should be avoided if possible, since it can also lead to bugs, such as the classic = vs == bug.
5) Multiple assignments on the same row are hard to read, and also possibly dangerous, because of the order of evaluation.
6) Multiple declarations on the same row are hard to read, and dangerous, since they could lead to bugs when mixing data and pointer declarations. Always declare each variable on a row of its own.
7) Always use braces after every control statement. Not doing so will lead to bugs bugs bugs.
8) Never type cast from a specific pointer type to void*. It is unnecessary in C, and could hide away bugs that the compiler would otherwise have detected.
9) Avoid using multiple return statements inside a function. Sometimes they lead to clearer code, but in most cases they lead to spaghetti. As the code stands, we can't change that without rewriting the loop though, so I will fix this later.
10) Keep for loops simple. They should contain one init statement, one loop condition and one iteration, nothing else. This for loop, with the comma operator and everything, is very obscure. Again, we spot a need to rewrite this loop into something sane. I'll do this next, but for now we have:
typedef long Align; /* for alignment to long boundary */
typedef union header /* block header */
{
struct
{
union header *ptr; /* next block if on free list */
size_t size; /* size of this block */
} s;
Align x; /* force alignment of blocks */
} Header;
static Header base = {0}; /* empty list to get started */
static Header* freep = NULL; /* start of free list */
/* malloc: general-purpose storage allocator */
void* kr_malloc (size_t nbytes)
{
Header* p;
Header* prevp;
size_t nunits;
nunits = (nbytes + sizeof(Header) - 1) / sizeof(header) + 1;
prevp = freep;
if (prevp == NULL) /* no free list yet */
{
base.s.ptr = &base;
freeptr = &base;
prevptr = &base;
base.s.size = 0;
}
for (p = prevp->s.ptr; ; prevp = p, p = p->s.ptr)
{
if (p->s.size >= nunits) /* big enough */
{
if (p->s.size == nunits) /* exactly */
{
prevp->s.ptr = p->s.ptr;
}
else /* allocate tail end */
{
p->s.size -= nunits;
p += p->s.size;
p->s.size = nunits
}
freep = prevp;
return p+1;
}
if (p == freep) /* wrapped around free list */
{
p = morecore(nunits);
if (p == NULL)
{
return NULL; /* none left */
}
}
} /* for */
}
Step 3: rewrite the obscure loop.
This is for the reasons mentioned earlier. We can see that this loop goes on forever; it terminates by returning from the function, either when the allocation is done or when there is no memory left. So let's make that the loop condition, and lift the return out to the end of the function where it should be. And let's get rid of that ugly comma operator.
I'll introduce two new variables: one result variable to hold the resulting pointer, and another to keep track of whether the loop should continue or not. I'll blow K&R's minds by using the bool type, which has been part of the C language since 1999.
(I hope I haven't altered the algorithm with this change, I believe I haven't)
#include <stdbool.h>
typedef long Align; /* for alignment to long boundary */
typedef union header /* block header */
{
struct
{
union header *ptr; /* next block if on free list */
size_t size; /* size of this block */
} s;
Align x; /* force alignment of blocks */
} Header;
static Header base = {0}; /* empty list to get started */
static Header* freep = NULL; /* start of free list */
/* malloc: general-purpose storage allocator */
void* kr_malloc (size_t nbytes)
{
Header* p;
Header* prevp;
size_t nunits;
void* result;
bool is_allocating;
nunits = (nbytes + sizeof(Header) - 1) / sizeof(header) + 1;
prevp = freep;
if (prevp == NULL) /* no free list yet */
{
base.s.ptr = &base;
freeptr = &base;
prevptr = &base;
base.s.size = 0;
}
is_allocating = true;
for (p = prevp->s.ptr; is_allocating; p = p->s.ptr)
{
if (p->s.size >= nunits) /* big enough */
{
if (p->s.size == nunits) /* exactly */
{
prevp->s.ptr = p->s.ptr;
}
else /* allocate tail end */
{
p->s.size -= nunits;
p += p->s.size;
p->s.size = nunits
}
freep = prevp;
result = p+1;
is_allocating = false; /* we are done */
}
else if (p == freep) /* wrapped around free list */
{
p = morecore(nunits);
if (p == NULL)
{
result = NULL; /* none left */
is_allocating = false;
p = freep; /* keep p valid for the loop's iteration step */
}
}
prevp = p;
} /* for */
return result;
}
Step 4: make this crap compile.
Since this is from K&R, it is filled with typos. sizeof(header) should be sizeof(Header). There are missing semicolons. They use different names freep, prevp versus freeptr, prevptr, but clearly mean the same variables. I believe the latter were actually better names, so let's use those.
#include <stdbool.h>
typedef long Align; /* for alignment to long boundary */
typedef union header /* block header */
{
struct
{
union header *ptr; /* next block if on free list */
size_t size; /* size of this block */
} s;
Align x; /* force alignment of blocks */
} Header;
static Header base = {0}; /* empty list to get started */
static Header* freeptr = NULL; /* start of free list */
/* malloc: general-purpose storage allocator */
void* kr_malloc (size_t nbytes)
{
Header* p;
Header* prevptr;
size_t nunits;
void* result;
bool is_allocating;
nunits = (nbytes + sizeof(Header) - 1) / sizeof(Header) + 1;
prevptr = freeptr;
if (prevptr == NULL) /* no free list yet */
{
base.s.ptr = &base;
freeptr = &base;
prevptr = &base;
base.s.size = 0;
}
is_allocating = true;
for (p = prevptr->s.ptr; is_allocating; p = p->s.ptr)
{
if (p->s.size >= nunits) /* big enough */
{
if (p->s.size == nunits) /* exactly */
{
prevptr->s.ptr = p->s.ptr;
}
else /* allocate tail end */
{
p->s.size -= nunits;
p += p->s.size;
p->s.size = nunits;
}
freeptr = prevptr;
result = p+1;
is_allocating = false; /* we are done */
}
else if (p == freeptr) /* wrapped around free list */
{
p = morecore(nunits);
if (p == NULL)
{
result = NULL; /* none left */
is_allocating = false;
p = freeptr; /* keep p valid for the loop's iteration step */
}
}
prevptr = p;
} /* for */
return result;
}
And now we have somewhat readable, maintainable code, without numerous dangerous practices, that will even compile! So now we could actually start to ponder about what the code is actually doing.
The struct "Header" is, as you might have guessed, the declaration of a node in a linked list. Each such node contains a pointer to the next one. I don't quite understand the morecore function, nor the "wrap-around", I have never used this function, nor sbrk. But I assume that it allocates a header as specified in this struct, and also some chunk of raw data following that header. If so, that explains why there is no actual data pointer: the data is assumed to follow the header, adjacently in memory. So for each node, we get the header, and we get a chunk of raw data following the header.
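The assumed layout (header first, raw data right behind it) can be sketched as below. This is only an illustration: the helper names payload_of and header_of are hypothetical and not part of K&R, and a static array stands in for memory that would really come from the system.

```c
#include <assert.h>
#include <stddef.h>

typedef long Align;

typedef union header
{
  struct
  {
    union header* ptr; /* next block if on free list */
    size_t size;       /* size of this block, counted in Header units */
  } s;
  Align x;
} Header;

/* Hypothetical helper: the user's data starts one Header past the block header. */
static void* payload_of (Header* block)
{
  assert(block != NULL);
  return block + 1;
}

/* Hypothetical helper: step back one Header to get from the data to its header. */
static Header* header_of (void* payload)
{
  assert(payload != NULL);
  return (Header*)payload - 1;
}

/* Stand-in for system memory: one header unit followed by three data units. */
static Header demo_block[4];
```

This is the same arithmetic as the p+1 at the end of the malloc loop and the (Header *)ap - 1 at the start of free.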
The iteration itself is pretty straight-forward, they are going through a single-linked list, one node at a time.
At the end of the loop, they set the pointer to point one past the end of the "chunk", then store that in a static variable, so that the program will remember where it previously allocated memory, next time the function is called.
They are using a trick to make their header end up on an aligned memory address: they store all the overhead info in a union together with a variable large enough to correspond to the platform's alignment requirement. So if the size of "ptr" plus the size of "size" are too small to give the exact alignment, the union guarantees that at least sizeof(Align) bytes are allocated. I believe that this whole trick is obsolete today, since the C standard mandates automatic struct/union padding.
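To see what the union actually buys, here is a small check. It is a sketch that assumes a C11 compiler (for _Alignof and static_assert); the function name header_units_stay_aligned is made up for this example.

```c
#include <assert.h>
#include <stddef.h>

typedef long Align;

typedef union header
{
  struct
  {
    union header* ptr;
    size_t size;
  } s;
  Align x; /* never read; it only influences the union's size and alignment */
} Header;

/* A union is always large enough to hold its widest member. */
static_assert(sizeof(Header) >= sizeof(Align), "Header must be able to hold an Align");

/* Hypothetical check: returns 1 if stepping through memory in Header units
 * keeps every unit suitably aligned for a long. */
static int header_units_stay_aligned (void)
{
  /* A union is at least as strictly aligned as each of its members. */
  int aligned_like_long = _Alignof(Header) >= _Alignof(Align);
  /* sizeof is a multiple of the alignment, so p + n is aligned whenever p is. */
  int size_is_multiple = (sizeof(Header) % _Alignof(Header)) == 0;

  return aligned_like_long && size_is_multiple;
}
```

Since all block sizes are counted in Header units, every pointer the allocator hands out inherits this alignment.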
I'm studying K&R as I'd imagine OP was when he asked this question, and I came here because I also found these implementations to be confusing. While the accepted answer is very detailed and helpful, I tried to take a different tack, which was to understand the code as it was originally written. I've gone through the code and added comments to the sections that were difficult for me. This includes code for the other routines in the section, free and morecore. I've renamed malloc and free to kandr_malloc and kandr_free to avoid conflicts with the stdlib. I thought I would leave this here as a supplement to the accepted answer, for other students who may find it helpful.
I acknowledge that the comments in this code are excessive. Please know that I am only doing this as a learning exercise and I am not proposing that this is a good way to actually write code.
I took the liberty of changing some variable names to ones that seemed more intuitive to me; other than that the code is essentially left intact. It seems to compile and run fine for the test programs that I used, although valgrind had complaints for some applications.
Also: some of the text in the comments is lifted directly from K&R or the man pages - I do not intend to take any credit for these sections.
#include <unistd.h> // sbrk
#define NALLOC 1024 // Number of block sizes to allocate on call to sbrk
#ifdef NULL
#undef NULL
#endif
#define NULL 0
// long is chosen as an instance of the most restrictive alignment type
typedef long Align;
/* Construct Header data structure. To ensure that the storage returned by
* kandr_malloc is aligned properly for the objects that are stored in it, all
* blocks are multiples of the header size, and the header itself is aligned
* properly. This is achieved through the use of a union; this data type is big
* enough to hold the "widest" member, and the alignment is appropriate for all
* of the types in the union. Thus by including a member of type Align, which
* is an instance of the most restrictive type, we guarantee that the size of
* Header is aligned to the worst-case boundary. The Align field is never used;
* it just forces each header to the desired alignment.
*/
union header {
struct {
union header *next;
unsigned size;
} s;
Align x;
};
typedef union header Header;
static Header base; // Used to get an initial member for free list
static Header *freep = NULL; // Free list starting point
static Header *morecore(unsigned nblocks);
void kandr_free(void *ptr);
void *kandr_malloc(unsigned nbytes) {
Header *currp;
Header *prevp;
unsigned nunits;
/* Calculate the number of memory units needed to provide at least nbytes of
* memory.
*
* Suppose that we need n >= 0 bytes and that the memory unit sizes are b > 0
* bytes. Then n / b (using integer division) yields one less than the number
* of units needed to provide n bytes of memory, except in the case that n is
* a multiple of b; then it provides exactly the number of units needed. It
* can be verified that (n - 1) / b provides one less than the number of units
* needed to provide n bytes of memory for all values of n > 0. Thus ((n - 1)
* / b) + 1 provides exactly the number of units needed for n > 0.
*
* The extra sizeof(Header) in the numerator is to include the unit of memory
* needed for the header itself.
*/
nunits = ((nbytes + sizeof(Header) - 1) / sizeof(Header)) + 1;
// case: no free list yet exists; we have to initialize.
if (freep == NULL) {
// Create degenerate free list; base points to itself and has size 0
base.s.next = &base;
base.s.size = 0;
// Set free list starting point to base address
freep = &base;
}
/* Initialize pointers to two consecutive blocks in the free list, which we
* call prevp (the previous block) and currp (the current block)
*/
prevp = freep;
currp = prevp->s.next;
/* Step through the free list looking for a block of memory large enough to
* fit nunits units of memory into. If the whole list is traversed without
* finding such a block, then morecore is called to request more memory from
* the OS.
*/
for (; ; prevp = currp, currp = currp->s.next) {
/* case: found a block of memory in free list large enough to fit nunits
* units of memory into. Partition block if necessary, remove it from the
* free list, and return the address of the block (after moving past the
* header).
*/
if (currp->s.size >= nunits) {
/* case: block is exactly the right size; remove the block from the free
* list by pointing the previous block to the next block.
*/
if (currp->s.size == nunits) {
/* Note that this line wouldn't work as intended if we were down to only
* 1 block. However, we would never make it here in that scenario
* because the block at &base has size 0 and thus the conditional will
* fail (note that nunits is always >= 1). It is true that if the block
* at &base had combined with another block, then previous statement
* wouldn't apply - but presumably since base is a global variable and
* future blocks are allocated on the heap, we can be sure that they
* won't border each other.
*/
prevp->s.next = currp->s.next;
}
/* case: block is larger than the amount of memory asked for; allocate
* tail end of the block to the user.
*/
else {
// Changes the memory stored at currp to reflect the reduced block size
currp->s.size -= nunits;
// Find location at which to create the block header for the new block
currp += currp->s.size;
// Store the block size in the new header
currp->s.size = nunits;
}
/* Set global starting position to the previous pointer. Next call to
* malloc will start either at the remaining part of the partitioned block
* if a partition occurred, or at the block after the selected block if
* not.
*/
freep = prevp;
/* Return the location of the start of the memory, i.e. after adding one
* so as to move past the header
*/
return (void *) (currp + 1);
} // end found a block of memory in free list case
/* case: we've wrapped around the free list without finding a block large
* enough to fit nunits units of memory into. Call morecore to request that
* at least nunits units of memory are allocated.
*/
if (currp == freep) {
/* morecore returns freep; the reason that we have to assign currp to it
* again (since we just tested that they are equal), is that there is a
* call to free inside of morecore that can potentially change the value
* of freep. Thus we reassign it so that we can be assured that the newly
* added block is found before (currp == freep) again.
*/
if ((currp = morecore(nunits)) == NULL) {
return NULL;
}
} // end wrapped around free list case
} // end step through free list looking for memory loop
}
static Header *morecore(unsigned nunits) {
void *freemem; // The address of the newly created memory
Header *insertp; // Header ptr for pointer arithmetic and constructing header
/* Obtaining memory from OS is a comparatively expensive operation, so obtain
* at least NALLOC blocks of memory and partition as needed
*/
if (nunits < NALLOC) {
nunits = NALLOC;
}
/* Request that the OS increment the program's data space. sbrk changes the
* location of the program break, which defines the end of the process's data
* segment (i.e., the program break is the first location after the end of the
* uninitialized data segment). Increasing the program break has the effect
* of allocating memory to the process. On success, sbrk returns the previous
* break - so if the break was increased, then this value is a pointer to the
* start of the newly allocated memory.
*/
freemem = sbrk(nunits * sizeof(Header));
// case: unable to allocate more memory; sbrk returns (void *) -1 on error
if (freemem == (void *) -1) {
return NULL;
}
// Construct new block
insertp = (Header *) freemem;
insertp->s.size = nunits;
/* Insert block into the free list so that it is available for malloc. Note
* that we add 1 to the address, effectively moving to the first position
* after the header data, since of course we want the block header to be
* transparent for the user's interactions with malloc and free.
*/
kandr_free((void *) (insertp + 1));
/* Returns the start of the free list; recall that freep has been set to the
* block immediately preceding the newly allocated memory (by free). Thus by
* returning this value the calling function can immediately find the new
* memory by following the pointer to the next block.
*/
return freep;
}
void kandr_free(void *ptr) {
Header *insertp, *currp;
// Find address of block header for the data to be inserted
insertp = ((Header *) ptr) - 1;
/* Step through the free list looking for the position in the list to place
* the insertion block. In the typical circumstances this would be the block
* immediately to the left of the insertion block; this is checked for by
* finding a block that is to the left of the insertion block and such that
* the following block in the list is to the right of the insertion block.
* However, this test can never succeed in two cases: when the insertion block
* is to the left of every block in the free list, and when it is to the right
* of every block in the free list. These two cases are what is checked for by
* the condition inside the body of the loop.
*/
for (currp = freep; !((currp < insertp) && (insertp < currp->s.next)); currp = currp->s.next) {
/* currp >= currp->s.next implies that the current block is the rightmost
* block in the free list. Then if the insertion block is to the right of
* that block, then it is the new rightmost block; conversely if it is to
* the left of the block that currp points to (which is the current leftmost
* block), then the insertion block is the new leftmost block. Note that
* this conditional handles the case where we only have 1 block in the free
* list (this case is the reason that we need >= in the first test rather
* than just >).
*/
if ((currp >= currp->s.next) && ((currp < insertp) || (insertp < currp->s.next))) {
break;
}
}
/* Having found the correct location in the free list to place the insertion
* block, now we have to (i) link it to the next block, and (ii) link the
* previous block to it. These are the tasks of the next two if/else pairs.
*/
/* case: the end of the insertion block is adjacent to the beginning of
* another block of data owned by malloc. Absorb the block on the right into
* the block on the left (i.e. the previously existing block is absorbed into
* the insertion block).
*/
if ((insertp + insertp->s.size) == currp->s.next) {
insertp->s.size += currp->s.next->s.size;
insertp->s.next = currp->s.next->s.next;
}
/* case: the insertion block is not left-adjacent to the beginning of another
* block of data owned by malloc. Set the insertion block member to point to
* the next block in the list.
*/
else {
insertp->s.next = currp->s.next;
}
/* case: the end of another block of data owned by malloc is adjacent to the
* beginning of the insertion block. Absorb the block on the right into the
* block on the left (i.e. the insertion block is absorbed into the preceding
* block).
*/
if ((currp + currp->s.size) == insertp) {
currp->s.size += insertp->s.size;
currp->s.next = insertp->s.next;
}
/* case: the insertion block is not right-adjacent to the end of another block
* of data owned by malloc. Set the previous block in the list to point to
* the insertion block.
*/
else {
currp->s.next = insertp;
}
/* Set the free pointer list to start the block previous to the insertion
* block. This makes sense because calls to malloc start their search for
* memory at the next block after freep, and the insertion block has as good a
* chance as any of containing a reasonable amount of memory since we've just
* added some to it. It also coincides with calls to morecore from
* kandr_malloc because the next search in the iteration looks at exactly the
* right memory block.
*/
freep = currp;
}
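The rounding formula described in the comment at the top of kandr_malloc can be checked in isolation. The sketch below is illustrative only: units_needed is a hypothetical helper, and the unit size is passed in as a parameter instead of being taken from sizeof(Header).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of unit_size-byte units needed to hold nbytes
 * of user data, plus one extra unit for the block header. */
static size_t units_needed (size_t nbytes, size_t unit_size)
{
  assert(unit_size > 0);
  /* (nbytes + unit_size - 1) / unit_size rounds nbytes / unit_size upwards */
  return (nbytes + unit_size - 1) / unit_size + 1;
}
```

With 16-byte units, requests of 1 and 16 bytes both need 2 units (one for the header, one for data), while 17 bytes needs 3.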
The basics of malloc()
In Linux, there are two typical ways to request memory: sbrk and mmap. These system calls have severe limitations on frequent small allocations. malloc() is a library function to address this issue. It requests large chunks of memory with sbrk/mmap and returns small memory blocks inside large chunks. This is much more efficient and flexible than directly calling sbrk/mmap.
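As a rough sketch of "small blocks inside a large chunk", the code below hands out pieces of one chunk from front to back. It is not what malloc() does in full: a static buffer stands in for memory obtained via sbrk/mmap, CHUNK_SIZE and the name carve are assumptions made for this example, and nothing can be freed. The free list described next is what makes reuse possible.

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK_SIZE 4096 /* assumed size; a real allocator asks sbrk/mmap for it */

typedef long Align;

/* Stand-in for one large chunk obtained from the system. The Align member is
 * never used; it only forces the byte array onto a long boundary. */
static union
{
  Align force_alignment;
  unsigned char bytes[CHUNK_SIZE];
} chunk;

static size_t chunk_used = 0; /* bytes already handed out */

/* Hypothetical helper: return the next nbytes of the chunk, or NULL if the
 * remaining space is too small. */
static void* carve (size_t nbytes)
{
  size_t rounded;
  void* result = NULL;

  assert(nbytes > 0);
  if (nbytes <= CHUNK_SIZE - chunk_used)
  {
    /* round up so that the next block also starts on a long boundary */
    rounded = (nbytes + sizeof(Align) - 1) / sizeof(Align) * sizeof(Align);
    if (rounded <= CHUNK_SIZE - chunk_used)
    {
      result = chunk.bytes + chunk_used;
      chunk_used += rounded;
    }
  }
  return result;
}
```

Two calls such as carve(10) and carve(4) return neighbouring addresses inside the same chunk without any further request to the system.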
K&R malloc()
In the K&R implementation, a core (more commonly called an arena) is a large chunk of memory. morecore() requests a core from the system via sbrk(). When you call malloc()/free() multiple times, some blocks in the cores are used/allocated while others are free. K&R malloc stores the addresses of free blocks in a circular singly linked list. In this list, each node is a block of free memory. The first sizeof(Header) bytes keep the size of the block and the pointer to the next free block. The rest of the bytes in the free block are uninitialized. Unlike typical lists in textbooks, nodes in the free list are just pointers to unused areas inside cores; you never allocate individual nodes, only cores. This list is the key to understanding the algorithm.
The following diagram shows an example memory layout with two cores/arenas. In the diagram, each character takes sizeof(Header) bytes. # is a Header, + marks allocated memory and - marks free memory inside cores. In the example, there are three allocated blocks and three free blocks. The three free blocks are stored in the circular list. For the three allocated blocks, only their sizes are stored in Header.
This is core 1                     This is core 2
#---------#+++++++++#++++++++++++  #----------#+++++++++++++++++#------------
|                                  |                            |
p->ptr->ptr                        p = p->ptr->ptr->ptr         p->ptr
In your code, freep is an entry point to the free list. If you repeatedly follow freep->ptr, you will come back to freep: the list is circular. Once you understand the circular singly linked list, the rest is relatively easy. malloc() finds a free block and possibly splits it. free() adds a free block back to the list and may merge it with adjacent free blocks. Both try to maintain the structure of the list.
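The "follow ptr until you are back at the start" idea can be sketched like this. The helper count_free_blocks is hypothetical (it is not in K&R), and it only reads the list; it assumes the list is already a well-formed circle.

```c
#include <assert.h>
#include <stddef.h>

typedef union header
{
  struct
  {
    union header* ptr; /* next free block */
    size_t size;       /* size of this block, in Header units */
  } s;
  long x;
} Header;

/* Hypothetical helper: walk the circular free list once, starting at freep.
 * Returns the number of nodes and stores the sum of their sizes in
 * *total_units. */
static size_t count_free_blocks (const Header* freep, size_t* total_units)
{
  size_t count = 0;
  const Header* p = freep;

  assert(total_units != NULL);
  *total_units = 0;
  if (freep != NULL)
  {
    do
    {
      count++;
      *total_units += p->s.size;
      p = p->s.ptr;
    } while (p != freep); /* back at the entry point: one full lap done */
  }
  return count;
}
```

For the three free blocks in the diagram (10, 11 and 13 units including their headers), the walk visits 3 nodes and totals 34 units, whichever node it starts from.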
Other comments on the implementation
The code comments mention "wrapped around" in malloc(). That branch is reached when you have traversed the entire free list without finding a free block large enough for the request. In this case, you have to add a new core with morecore().
base is a zero-sized block that is always included in the free list. It is a trick to avoid special casing. It is not strictly necessary.
free() may look a little complex because it has to consider four different cases when merging a newly freed block with other free blocks in the list. This detail is not that important unless you want to reimplement it yourself.
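The four cases come from two independent adjacency tests, which can be sketched as follows. This is a simplified illustration, not the K&R code: classify_merge and the enum are hypothetical names, p is assumed to be the free block just before the freed block bp, and the wrap-around at the ends of the arena is left out.

```c
#include <assert.h>
#include <stddef.h>

typedef union header
{
  struct
  {
    union header* ptr; /* next free block */
    size_t size;       /* size of this block, in Header units */
  } s;
  long x;
} Header;

enum merge_case
{
  MERGE_NONE  = 0, /* bp touches neither neighbour */
  MERGE_UPPER = 1, /* bp ends exactly where the next free block starts */
  MERGE_LOWER = 2, /* the previous free block ends exactly where bp starts */
  MERGE_BOTH  = 3  /* bp fills the gap between two free blocks */
};

/* Hypothetical helper: report which neighbours the freed block bp would be
 * merged with. p is the free block before bp, p->s.ptr the one after it. */
static int classify_merge (const Header* p, const Header* bp)
{
  int result = MERGE_NONE;

  assert(p != NULL && bp != NULL);
  if (bp + bp->s.size == p->s.ptr)
  {
    result |= MERGE_UPPER;
  }
  if (p + p->s.size == bp)
  {
    result |= MERGE_LOWER;
  }
  return result;
}
```

These are the same two comparisons that guard the two if/else pairs in free().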
This blog post explains K&R malloc in more detail.
PS: K&R malloc is one of the most elegant pieces of code in my view. It was really eye opening when I first understood the code. It makes me sad that some modern programmers, not even understanding the basics of this implementation, are calling this masterpiece crap based solely on its coding style.
I also found this exercise great and interesting.
In my opinion visualizing the structure may help a lot with understanding the logic - or at least this worked for me. Below is my code, which prints as much as possible about the flow of the K&R malloc.
The most significant change I made to the K&R malloc is in 'free': it now takes the address of the caller's pointer and sets that pointer to NULL, so that a stale pointer cannot be used again.
Other than that I added comments and fixed some small typos.
Experimenting with NALLOC, MAXMEM and the test variables in 'main' may also help.
On my computer (Ubuntu 16.04.3) this compiled without errors with:
gcc -g -std=c99 -Wall -Wextra -pedantic-errors krmalloc.c
krmalloc.c :
#include <stdio.h>
#include <unistd.h>
typedef long Align; /* for alignment to long boundary */
union header { /* block header */
struct {
union header *ptr; /* next block if on free list */
size_t size; /* size of this block */
/* including the Header itself */
/* measured in count of Header chunks */
/* not less than NALLOC Header's */
} s;
Align x; /* force alignment of blocks */
};
typedef union header Header;
static Header *morecore(size_t);
void *mmalloc(size_t);
void _mfree(void **);
void visualize(const char*);
size_t getfreem(void);
size_t totmem = 0; /* total memory in chunks */
static Header base; /* empty list to get started */
static Header *freep = NULL; /* start of free list */
#define NALLOC 1 /* minimum chunks to request */
#define MAXMEM 2048 /* max memory available (in bytes) */
#define mfree(p) _mfree((void **)&p)
void *sbrk(__intptr_t incr);
int main(void)
{
char *pc, *pcc, *pccc, *ps;
long *pd, *pdd;
int dlen = 100;
int ddlen = 50;
visualize("start");
/* trying to fragment as much as possible to get a more interesting view */
/* claim a char */
if ((pc = (char *) mmalloc(sizeof(char))) == NULL)
return -1;
/* claim a string */
if ((ps = (char *) mmalloc(dlen * sizeof(char))) == NULL)
return -1;
/* claim some long's */
if ((pd = (long *) mmalloc(ddlen * sizeof(long))) == NULL)
return -1;
/* claim some more long's */
if ((pdd = (long *) mmalloc(ddlen * 2 * sizeof(long))) == NULL)
return -1;
/* claim one more char */
if ((pcc = (char *) mmalloc(sizeof(char))) == NULL)
return -1;
/* claim the last char */
if ((pccc = (char *) mmalloc(sizeof(char))) == NULL)
return -1;
/* free and visualize */
printf("\n");
mfree(pccc);
/* bugged on purpose to test free(NULL) */
mfree(pccc);
visualize("free(the last char)");
mfree(pdd);
visualize("free(lot of long's)");
mfree(ps);
visualize("free(string)");
mfree(pd);
visualize("free(less long's)");
mfree(pc);
visualize("free(first char)");
mfree(pcc);
visualize("free(second char)");
/* check memory condition */
size_t freemem = getfreem();
printf("\n");
printf("--- Memory claimed : %zu chunks (%zu bytes)\n",
totmem, totmem * sizeof(Header));
printf(" Free memory now : %zu chunks (%zu bytes)\n",
freemem, freemem * sizeof(Header));
if (freemem == totmem)
printf(" No memory leaks detected.\n");
else
printf(" (!) Leaking memory: %zu chunks (%zu bytes).\n",
(totmem - freemem), (totmem - freemem) * sizeof(Header));
printf("// Done.\n\n");
return 0;
}
/* visualize: print the free list (educational purpose) */
void visualize(const char* msg)
{
Header *tmp;
printf("--- Free list after \"%s\":\n", msg);
if (freep == NULL) { /* does not exist */
printf("\tList does not exist\n\n");
return;
}
if (freep == freep->s.ptr) { /* self-pointing list = empty */
printf("\tList is empty\n\n");
return;
}
printf(" ptr: %10p size: %-3zu --> ", (void *) freep, freep->s.size);
tmp = freep; /* find the start of the list */
while (tmp->s.ptr > freep) { /* traverse the list */
tmp = tmp->s.ptr;
printf("ptr: %10p size: %-3zu --> ", (void *) tmp, tmp->s.size);
}
printf("end\n\n");
}
/* calculate the total amount of available free memory */
size_t getfreem(void)
{
if (freep == NULL)
return 0;
Header *tmp;
tmp = freep;
size_t res = tmp->s.size;
while (tmp->s.ptr > tmp) {
tmp = tmp->s.ptr;
res += tmp->s.size;
}
return res;
}
/* mmalloc: general-purpose storage allocator */
void *mmalloc(size_t nbytes)
{
Header *p, *prevp;
size_t nunits;
/* smallest count of Header-sized memory chunks */
/* (+1 additional chunk for the Header itself) needed to hold nbytes */
nunits = (nbytes + sizeof(Header) - 1) / sizeof(Header) + 1;
/* too much memory requested? */
if (((nunits + totmem + getfreem())*sizeof(Header)) > MAXMEM) {
printf("Memory limit overflow!\n");
return NULL;
}
if ((prevp = freep) == NULL) { /* no free list yet */
/* set the list to point to itself */
base.s.ptr = freep = prevp = &base;
base.s.size = 0;
}
/* traverse the circular list */
for (p = prevp->s.ptr; ; prevp = p, p = p->s.ptr) {
if (p->s.size >= nunits) { /* big enough */
if (p->s.size == nunits) /* exactly */
prevp->s.ptr = p->s.ptr;
else { /* allocate tail end */
/* adjust the size */
p->s.size -= nunits;
/* find the address to return */
p += p->s.size;
p->s.size = nunits;
}
freep = prevp;
return (void *)(p+1);
}
/* back where we started and nothing found - we need to allocate */
if (p == freep) /* wrapped around free list */
if ((p = morecore(nunits)) == NULL)
return NULL; /* none left */
}
}
/* morecore: ask system for more memory */
/* nu: count of Header-chunks needed */
static Header *morecore(size_t nu)
{
char *cp;
Header *up;
/* get at least NALLOC Header-chunks from the OS */
if (nu < NALLOC)
nu = NALLOC;
cp = (char *) sbrk(nu * sizeof(Header));
if (cp == (char *) -1) /* no space at all */
return NULL;
printf("... (sbrk) claimed %zu chunks.\n", nu);
totmem += nu; /* keep track of allocated memory */
up = (Header *) cp;
up->s.size = nu;
/* add the free space to the circular list */
void *n = (void *)(up+1);
mfree(n);
return freep;
}
/* mfree: put block ap in free list */
void _mfree(void **ap)
{
    if (*ap == NULL)
        return;
    Header *bp, *p;
    bp = (Header *)*ap - 1; /* point to block header */
    if (bp->s.size == 0 || bp->s.size > totmem) {
        printf("_mfree: impossible value for size\n");
        return;
    }
    /* the free space is only marked as free, but 'ap' still points to it */
    /* to avoid reusing this address and corrupting our structure, set it to NULL */
    *ap = NULL;
    /* look where to insert the free space */
    /* (bp > p && bp < p->s.ptr) => between two nodes */
    /* (p > p->s.ptr) => this is the end of the list */
    /* (p == p->s.ptr) => list is one element only */
    for (p = freep; !(bp > p && bp < p->s.ptr); p = p->s.ptr)
        if (p >= p->s.ptr && (bp > p || bp < p->s.ptr))
            /* freed block at start or end of arena */
            break;
    if (bp + bp->s.size == p->s.ptr) { /* join to upper nbr */
        /* the new block fits perfectly up to the upper neighbor */
        /* merging up: adjust the size */
        bp->s.size += p->s.ptr->s.size;
        /* merging up: point to the second next */
        bp->s.ptr = p->s.ptr->s.ptr;
    } else
        /* set the upper pointer */
        bp->s.ptr = p->s.ptr;
    if (p + p->s.size == bp) { /* join to lower nbr */
        /* the new block fits perfectly on top of the lower neighbor */
        /* merging below: adjust the size */
        p->s.size += bp->s.size;
        /* merging below: point to the next */
        p->s.ptr = bp->s.ptr;
    } else
        /* set the lower pointer */
        p->s.ptr = bp;
    /* reset the start of the free list */
    freep = p;
}
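The unit arithmetic at the top of mmalloc can be checked on its own. A minimal sketch, assuming a K&R-style Header union (its definition is not part of the listing above, so the layout here is a guess) and a hypothetical helper units_for that repeats the nunits formula:

```c
#include <assert.h>
#include <stddef.h>

/* Assumed K&R-style header; the real definition is outside the listing. */
typedef long Align;

typedef union header {
    struct {
        union header *ptr;  /* next block on the free list */
        size_t size;        /* size of this block, in Header units */
    } s;
    Align x;                /* forces alignment of blocks */
} Header;

/* Hypothetical helper: the same formula as nunits in mmalloc. */
static size_t units_for(size_t nbytes)
{
    /* guard against wrap-around in the addition below */
    assert(nbytes <= (size_t)-1 - sizeof(Header));
    /* round nbytes up to whole Headers, then add one unit for the header itself */
    return (nbytes + sizeof(Header) - 1) / sizeof(Header) + 1;
}
```

Any request from 1 byte up to sizeof(Header) bytes costs 2 units (one for the header, one for the payload), and one byte more than that costs 3.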

malloc implementation?

I'm trying to implement malloc and free for C, and I am not sure how to reuse memory. I currently have a struct that looks like this:
typedef struct _mem_dictionary {
    void *addr;
    size_t size;
    int freed;
} mem_dictionary;
My malloc looks like this:
void *malloc(size_t size) {
    void *return_ptr = sbrk(size);
    if (dictionary == NULL)
        dictionary = sbrk(1024 * sizeof(mem_dictionary));
    dictionary[dictionary_ct].addr = return_ptr;
    dictionary[dictionary_ct].size = size;
    dictionary[dictionary_ct].freed = 1;
    dictionary_ct++;
    return return_ptr;
}
When I free memory, I would just mark the address as 0 (that would indicate that it is free). In my malloc, I would then use a for loop to look for any value in the array to equal 0 and then allocate memory to that address. I'm kind of confused how to implement this.
The easiest way to do it is to keep a linked list of free blocks. In malloc, if the list is not empty, you search for a block large enough to satisfy the request and return it. If the list is empty or no such block can be found, you call sbrk to allocate some memory from the operating system. In free, you simply add the memory chunk to the list of free blocks. As a bonus, you can try to merge contiguous freed blocks, and you can change the policy for choosing the block to return (first fit, best fit, ...). You can also choose to split the block if it is larger than the request.
A sample implementation (it is not tested and is obviously not thread-safe; use at your own risk):
typedef struct free_block {
    size_t size;
    struct free_block* next;
} free_block;
static free_block free_block_list_head = { 0, 0 };
static const size_t overhead = sizeof(size_t);
static const size_t align_to = 16;
void* malloc(size_t size) {
    /* add room for the size field and round up to a multiple of align_to */
    size = (size + sizeof(size_t) + (align_to - 1)) & ~ (align_to - 1);
    free_block* block = free_block_list_head.next;
    free_block** head = &(free_block_list_head.next);
    /* first fit: unlink and return the first block that is large enough */
    while (block != 0) {
        if (block->size >= size) {
            *head = block->next;
            return ((char*)block) + sizeof(size_t);
        }
        head = &(block->next);
        block = block->next;
    }
    block = (free_block*)sbrk(size);
    if (block == (void*)-1) /* sbrk failed */
        return 0;
    block->size = size;
    return ((char*)block) + sizeof(size_t);
}
void free(void* ptr) {
    if (ptr == 0) /* free(NULL) is a no-op */
        return;
    free_block* block = (free_block*)(((char*)ptr) - sizeof(size_t));
    /* push the block onto the front of the free list */
    block->next = free_block_list_head.next;
    free_block_list_head.next = block;
}
Note: (n + align_to - 1) & ~ (align_to - 1) is a trick to round n up to the smallest multiple of align_to that is greater than or equal to n. This only works when align_to is a power of two and depends on the binary representation of numbers.
When align_to is a power of two, it has only one bit set, and thus align_to - 1 has all the lower bits set (i.e. align_to is of the form 000...010...0, and align_to - 1 is of the form 000...001...1). This means that ~ (align_to - 1) has all the high bits set and the low bits unset (i.e. it is of the form 111...110...0). So x & ~ (align_to - 1) sets all the low bits of x to zero and rounds it down to the nearest multiple of align_to.
Finally, adding align_to - 1 to size ensures that we round up to the nearest multiple of align_to (unless size is already a multiple of align_to, in which case we get size back unchanged).
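To make the rounding trick concrete, here is a small sketch. The helper name round_up is made up for illustration; it is the same expression as in malloc above, minus the header overhead:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: round n up to a multiple of align (a power of two). */
static size_t round_up(size_t n, size_t align)
{
    /* the mask trick is only valid when exactly one bit of align is set */
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}
```

With align set to 16, round_up(17, 16) gives 32, while round_up(16, 16) stays at 16 because the value is already a multiple.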
You don't want to set the size field of the dictionary entry to zero -- you will need that information for re-use. Instead, set freed=1 only when the block is freed.
You cannot coalesce adjacent blocks because there may have been intervening calls to sbrk(), so that makes this easier. You just need a for loop which searches for a large enough freed block:
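#include <string.h> /* memset */
#include <unistd.h> /* sbrk */
typedef struct _mem_dictionary
{
    void *addr;
    size_t size;
    int freed;
} mem_dictionary;
/* assumed file-scope state; the question does not show these declarations */
static mem_dictionary *dictionary = NULL;
static int dictionary_ct = 0;
void *malloc(size_t size)
{
    void *return_ptr = NULL;
    int i;
    if (dictionary == NULL) {
        dictionary = sbrk(1024 * sizeof(mem_dictionary));
        if (dictionary == (void *)-1) { /* sbrk failed */
            dictionary = NULL;
            return NULL;
        }
        memset(dictionary, 0, 1024 * sizeof(mem_dictionary));
    }
    /* reuse the first freed block that is large enough */
    for (i = 0; i < dictionary_ct; i++)
        if (dictionary[i].size >= size
            && dictionary[i].freed)
        {
            dictionary[i].freed = 0;
            return dictionary[i].addr;
        }
    if (dictionary_ct == 1024) /* dictionary is full */
        return NULL;
    return_ptr = sbrk(size);
    if (return_ptr == (void *)-1) /* sbrk failed */
        return NULL;
    dictionary[dictionary_ct].addr = return_ptr;
    dictionary[dictionary_ct].size = size;
    dictionary[dictionary_ct].freed = 0;
    dictionary_ct++;
    return return_ptr;
}
void free(void *ptr)
{
    int i;
    if (!dictionary)
        return;
    for (i = 0; i < dictionary_ct; i++ )
    {
        if (dictionary[i].addr == ptr)
        {
            dictionary[i].freed = 1;
            return;
        }
    }
}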
This is not a great malloc() implementation. In fact, most malloc/free implementations will allocate a small header for each block returned by malloc. The header might start at the address eight (8) bytes less than the returned pointer, for example. In those bytes you can store a pointer to the mem_dictionary entry owning the block. This avoids the O(N) operation in free. You can avoid the O(N) in malloc() by implementing a priority queue of freed blocks. Consider using a binomial heap, with block size as the index.
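As a rough sketch of the header idea: every block gets a small header holding a back-pointer to its dictionary entry, so free no longer has to search. Everything named here (demo_alloc, demo_free, block_header, the static arena standing in for sbrk, the size limits) is made up for illustration, and the search in the allocation path is still O(N):

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ENTRIES 64    /* hypothetical limits for this demo */
#define ARENA_UNITS 256

typedef struct mem_dictionary {
    void *addr;
    size_t size;
    int freed;
} mem_dictionary;

/* Header placed directly in front of every payload. */
typedef union block_header {
    mem_dictionary *owner;  /* back-pointer to the owning dictionary entry */
    long long align_ll;     /* the remaining members only force alignment */
    long double align_ld;
} block_header;

static mem_dictionary dict[MAX_ENTRIES];
static size_t dict_ct;
static block_header arena[ARENA_UNITS];  /* stand-in for memory from sbrk() */
static size_t arena_used;                /* in header-sized units */

static void *demo_alloc(size_t size)
{
    size_t i, units;
    block_header *h;
    /* reuse the first freed block that is large enough (still O(N)) */
    for (i = 0; i < dict_ct; i++) {
        if (dict[i].freed && dict[i].size >= size) {
            dict[i].freed = 0;
            return dict[i].addr;
        }
    }
    if (size > sizeof arena || dict_ct == MAX_ENTRIES)
        return NULL;
    /* payload rounded up to whole units, plus one unit for the header */
    units = (size + sizeof(block_header) - 1) / sizeof(block_header) + 1;
    if (units > ARENA_UNITS - arena_used)
        return NULL;
    h = &arena[arena_used];
    arena_used += units;
    h->owner = &dict[dict_ct++];
    h->owner->addr = h + 1;
    h->owner->size = (units - 1) * sizeof(block_header);
    h->owner->freed = 0;
    return h + 1;
}

static void demo_free(void *ptr)
{
    block_header *h;
    if (ptr == NULL)
        return;
    h = (block_header *)ptr - 1;      /* step back to the header */
    assert(h->owner->addr == ptr);    /* sanity check on the back-pointer */
    h->owner->freed = 1;              /* O(1): no search through the dictionary */
}
```

Freeing a block and then asking for a size that fits into it hands the same address back, which is the reuse behaviour the question was after.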
I am borrowing code from Sylvain's response. He seems to have left the size of the free_block* next pointer out of the overhead calculation.
Overall, the code works by prepending this free_block as a header to the allocated memory.
1. When the user calls malloc, malloc returns the address of the payload, right after this header.
2. When free is called, the address of the start of the block's header is calculated (by subtracting the header size from the block address) and that is added to the free block pool.
typedef struct free_block {
    size_t size;
    struct free_block* next;
} free_block;
static free_block free_block_list_head = { 0, 0 };
// static const size_t overhead = sizeof(size_t);
static const size_t align_to = 16;
void* malloc(size_t size) {
    /* add room for the whole header and round up to a multiple of align_to */
    size = (size + sizeof(free_block) + (align_to - 1)) & ~ (align_to - 1);
    free_block* block = free_block_list_head.next;
    free_block** head = &(free_block_list_head.next);
    /* first fit: unlink and return the first block that is large enough */
    while (block != 0) {
        if (block->size >= size) {
            *head = block->next;
            return ((char*)block) + sizeof(free_block);
        }
        head = &(block->next);
        block = block->next;
    }
    block = (free_block*)sbrk(size);
    if (block == (void*)-1) /* sbrk failed */
        return 0;
    block->size = size;
    return ((char*)block) + sizeof(free_block);
}
void free(void* ptr) {
    if (ptr == 0) /* free(NULL) is a no-op */
        return;
    free_block* block = (free_block*)(((char*)ptr) - sizeof(free_block ));
    /* push the block onto the front of the free list */
    block->next = free_block_list_head.next;
    free_block_list_head.next = block;
}
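To try the header-prepending scheme without replacing the system malloc, here is a sketch of the same logic under different names. my_malloc, my_free and the static pool that stands in for sbrk are all assumptions made for the demo, not part of the answer above:

```c
#include <assert.h>
#include <stddef.h>

typedef struct free_block {
    size_t size;               /* total block size in bytes, header included */
    struct free_block *next;   /* next block on the free list */
} free_block;

#define POOL_BLOCKS 256
/* Hypothetical stand-in for sbrk(): a static pool handed out front to back. */
static free_block pool[POOL_BLOCKS];
static size_t pool_used;       /* in units of sizeof(free_block) */

static free_block free_list_head = { 0, NULL };
static const size_t align_to = 16;

static void *my_malloc(size_t size)
{
    free_block **link;
    free_block *b, *fresh;
    size_t units;
    /* the pool is counted in headers, so rounded sizes must divide evenly */
    assert(align_to % sizeof(free_block) == 0);
    if (size > sizeof pool)
        return NULL;           /* also keeps the rounding below from wrapping */
    size = (size + sizeof(free_block) + (align_to - 1)) & ~(align_to - 1);
    /* first fit: unlink and return the first free block that is large enough */
    link = &free_list_head.next;
    for (b = *link; b != NULL; link = &b->next, b = b->next) {
        if (b->size >= size) {
            *link = b->next;
            return b + 1;      /* payload starts right after the header */
        }
    }
    /* nothing reusable: carve a fresh block out of the pool */
    units = size / sizeof(free_block);
    if (units > POOL_BLOCKS - pool_used)
        return NULL;
    fresh = &pool[pool_used];
    pool_used += units;
    fresh->size = size;
    return fresh + 1;
}

static void my_free(void *ptr)
{
    free_block *b;
    if (ptr == NULL)
        return;
    b = (free_block *)ptr - 1; /* step back to the header */
    b->next = free_list_head.next;
    free_list_head.next = b;
}
```

Blocks can be freed in any order here. A later request of a size that fits takes a block back off the list instead of using fresh pool memory.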

Resources