How to write a simple malloc function in c - c

As an assignment in operating systems, we have to write our own malloc and free in C. I know that if I just asked for the code there would be no point in my studying it. My problem is that I don't know where to put the initialization of a 50000-byte char array and the creation of the two lists, free and used. From inside my functions I can't see how to make that setup happen automatically, and a third-party main program will be used to test my functions.
Say my file is mymalloc.c or whatever:
void* myalloc(size_t size)
{
//code for allocating memory
}
void myfree(void *ptr)
{
//code for free the memory
}
Where should the code that initializes the memory space and the lists go?

I will provide you with the basic concept which you can use to write your own code for malloc() and free() functions using C.
Assume that we have a contiguous block of memory of a certain size. It will be our abstract sense of memory which will carry all the requested memory allocations plus the data structures that are used to hold data about those allocated blocks.
We use a simple linked list to carry the data related to the allocated as well as free blocks of memory.
Its structure is as follows.
struct block{
size_t size; /*Specifies the size of the block to which it refers*/
int free; /*This is the flag used to identify whether a block is free
or not*/
struct block *next; /*This points to the next metadata block*/
};
You will need 2 source files for this purpose. One is mymalloc.h which is the header file which contains the initialization parts and the function prototypes of the rest of the functions that we are going to implement. The other is the mymalloc.c source file which contains all the necessary function implementations.
There needs to be a function to initialize the first free memory block.
And another function to split a block of memory which has more than enough space to give to the requested size. And another method to scan through the linked list and merge any consecutive blocks that are free, so that it prevents external fragmentation.
Note: We use the First-fit-algorithm to find a free block to allocate memory.
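A minimal sketch of the pieces described above, assuming a 50000-byte static pool as in the question. The names POOL_SIZE, ALIGN, ROUND_UP, HDR, initialize, split and merge are my own choices for illustration, not taken from the linked article, and the rounding to max_align_t (C11) is an addition so that returned pointers are usable for any type:

```c
#include <stddef.h>

#define POOL_SIZE 50000
#define ALIGN _Alignof(max_align_t)
#define ROUND_UP(n) (((n) + ALIGN - 1) / ALIGN * ALIGN)
#define HDR ROUND_UP(sizeof(struct block))

/* Metadata header stored in front of every block (same fields as above). */
struct block {
    size_t size;         /* usable bytes that follow this header */
    int free;            /* 1 if the block is free, 0 if it is in use */
    struct block *next;  /* next header, in address order */
};

/* The union only exists to give the byte array a strict alignment. */
static union { unsigned char bytes[POOL_SIZE]; max_align_t align; } pool;
static struct block *head = NULL;

/* Turn the whole pool into a single free block. */
static void initialize(void)
{
    head = (struct block *)pool.bytes;
    head->size = POOL_SIZE - HDR;
    head->free = 1;
    head->next = NULL;
}

/* Keep `size` bytes in `b` and make the remainder a new free block. */
static void split(struct block *b, size_t size)
{
    struct block *rest = (struct block *)((unsigned char *)b + HDR + size);
    rest->size = b->size - size - HDR;
    rest->free = 1;
    rest->next = b->next;
    b->size = size;
    b->next = rest;
}

/* Join every run of consecutive free blocks into one block. */
static void merge(void)
{
    struct block *b = head;
    while (b != NULL && b->next != NULL) {
        if (b->free && b->next->free) {
            b->size += HDR + b->next->size;
            b->next = b->next->next;
        } else {
            b = b->next;
        }
    }
}

void *myalloc(size_t size)
{
    if (head == NULL)
        initialize();
    if (size == 0 || size > POOL_SIZE)
        return NULL;
    size = ROUND_UP(size);
    /* First fit: take the first free block that is large enough. */
    for (struct block *b = head; b != NULL; b = b->next) {
        if (b->free && b->size >= size) {
            if (b->size >= size + HDR + ALIGN)
                split(b, size);
            b->free = 0;
            return (unsigned char *)b + HDR;
        }
    }
    return NULL;
}

void myfree(void *ptr)
{
    if (ptr == NULL)
        return;
    struct block *b = (struct block *)((unsigned char *)ptr - HDR);
    b->free = 1;
    merge();
}
```

This sketch does not validate the pointer passed to myfree, so handing it anything that did not come from myalloc is undefined.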
I think this will help anyone who is in search of a simple way to write their own malloc and free functions using C. Please follow the following link for a detailed explanation.
http://tharikasblogs.blogspot.com/p/how-to-write-your-own-malloc-and-free.html

I think you only have to implement a memory manager. So you don't have to use brk, sbrk, ...
Just put used memory in a simple array and fragment it somehow. Since it's homework you want to make it as simple as possible or else you run into problems due to complexity/time constraints of your assignment.
You only have to decide which tactic you want to use. I'd suggest to use the buddy system. Though it's a bit more complicated than the most simple ones.. maybe fixed sized fragmentation is simpler..
Maybe this is also a good read.
Don't do something low-level as suggested in the other answers..

The implementation greatly depends upon operating system and architecture, anyhow you may take a look at this: http://www.raspberryginger.com/jbailey/minix/html/lib_2ansi_2malloc_8c-source.html
(and study how it works!).

If you are on a unix system you can look at the manual pages of brk and sbrk. Those system calls "push/set" the limit of the heap.
Using those you can manage your memory pages, allocating them as you need.
I would advise a chained-list to manage your different allocated spaces and building functions to split them or to merge them if they are free.
If you need to try your code with high-level applications, you can name your functions malloc/free, compile them to a shared-object (.so) and then use LD_PRELOAD and LD_LIBRARY_PATH environment variables to load your .so and replace system's malloc.
Every command you call then will use your shared object and thus your malloc, telling you if your malloc is stable or if it fails to comply with reality.
If you need a clear example of this i'd be happy to put some code here, but I do not want to make my answer too hard to read.

First, you could make a fake malloc which always fails
/* fake malloc */
void* myalloc(size_t sz)
{ return NULL; }
but that is "cheating". You want to make a malloc which is useful.
You probably want to make a system call which asks the kernel for memory. Of course, you'll need the symmetrical syscall to release memory. On Linux and many Posix systems you'll often use mmap and munmap syscalls.
(You could also use sbrk, but using mmap with munmap is easier and more general)
The idea is that you get big chunks of memory (with mmap) and then you manage smaller memory zones inside. The interesting detail is how to manage these smaller zones. You may want to deal with large malloc differently than "small" allocations.
You really want to read the Wikipedia page on memory allocation

You could have a global static variable that is initialized to zero. Then check that variable at the start of your malloc and free function. In your malloc function, if the variable is zero then initialize whatever you need, and then set the variable to non-zero. In your free function, just return if the variable is zero.
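A small sketch of that idea, assuming the 50000-byte array from the question. The allocation itself is deliberately trivial here (it just hands out the next unused bytes and never reuses them); the point is only where the array, the flag and the one-time setup go. The names pool, next_free, initialized and init_pool are made up for the example:

```c
#include <stddef.h>

#define POOL_SIZE 50000
#define ALIGN _Alignof(max_align_t)

/* File-scope objects: the memory is reserved before main() runs. */
static _Alignas(max_align_t) unsigned char pool[POOL_SIZE];
static size_t next_free;   /* offset of the first unused byte */
static int initialized;    /* static storage, so it starts out as 0 */

static void init_pool(void)
{
    next_free = 0;   /* a real version would build the free/used lists here */
    initialized = 1;
}

void *myalloc(size_t size)
{
    if (!initialized)
        init_pool();               /* happens once, on the first call */
    if (size == 0 || size > POOL_SIZE)
        return NULL;
    size = (size + ALIGN - 1) / ALIGN * ALIGN;   /* keep pointers aligned */
    if (size > POOL_SIZE - next_free)
        return NULL;
    void *p = &pool[next_free];
    next_free += size;
    return p;
}

void myfree(void *ptr)
{
    if (!initialized || ptr == NULL)
        return;                    /* nothing was ever allocated */
    /* list bookkeeping for releasing the block would go here */
}
```

The third-party main program never has to call an init function; the first call to myalloc does the setup.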

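Something like this is a simple malloc:
#include <unistd.h>
void* my_malloc(size_t size)
{
    void *p = sbrk(size);
    /* sbrk() returns (void *) -1 on failure, not NULL */
    return (p == (void *) -1) ? NULL : p;
}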
man sbrk will help you.
The problem now is to create a free and to create an efficient malloc :-)
If you want to test your malloc you can do it like this:
$> LD_PRELOAD=/mypath/my_malloc.so /bin/ls
but you need to build your code as a shared library (.so) first, with the function named malloc, because LD_PRELOAD works by loading a shared object ahead of the system's C library.

Related

c malloc functionality for custom memory region

Is there any malloc/realloc/free like implementation where i can specify a memory region where to manage the memory allocation?
I mean regular malloc (etc.) functions manages only the heap memory region.
What if I need to allocate some space in a shared memory segment or in a memory mapped file?
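Not 100% sure, but as per your question you want to maintain your own memory region, so you need to write your own my_malloc, my_realloc and my_free.
Implementing your own my_malloc may help you:
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
void* my_malloc(size_t size)
{
    /* reserve one max_align_t in front so the returned pointer stays aligned */
    char* ptr = malloc(size + sizeof(max_align_t));
    if (ptr == NULL)
        return NULL;
    memcpy(ptr, &size, sizeof(size));   /* remember the size in the header */
    return ptr + sizeof(max_align_t);
}
This is just a small idea; a full implementation will take you to the answer.
Refer to this question.
Use the same method to implement my_realloc and my_free.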
I asked myself this question recently too, because I wanted a malloc implementation for my security programs which could safely wipe out a static memory region just before exit (which contains sensitive data like encryption keys, passwords and other such data).
First, I found this. I thought it could be very good for my purpose, but I really could not understand its code completely. The license status was also unclear, and that is very important for one of my projects too.
I ended up writing my own.
My own implementation supports multiple heaps at same time, operating over them with pool descriptor structure, automatic memory zeroing of freed blocks, undefined behavior and OOM handlers, getting exact usable size of allocated objects and testing that pointer is still allocated, which is very sufficient for me. It's not very fast and it is more educational grade rather than professional one, but I wanted one in a hurry.
Note that it does not (yet) know about alignment requirements, but at least it returns an address suitable for storing a 32-bit integer.
I am using Tasking and I can store data in a specific space of memory. For example I can use:
testVar _at(0x200000);
I'm not sure if this is what you are looking for, but for example I'm using it to store data to external RAM. But as far as I know, it only works for global variables.
It is not very hard to implement your own my_alloc and my_free and use preferred memory range. It is simple chain of: block size, flag free/in use, and block data plus final-block marker (e.g. block size = 0). In the beginning you have one large free block and know its address. Note that my_alloc returns the address of block data and block size/flag are few bytes before.
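The chain described above could look roughly like this. It is a sketch under my own assumptions: the header type struct hdr, the function region_init and the extra first argument to my_alloc are invented for the example, the region must be suitably aligned, and sizes are rounded to a multiple of the header size so that every header stays aligned:

```c
#include <stddef.h>
#include <stdint.h>

/* Header placed right before each block's data; size == 0 marks the end. */
struct hdr {
    size_t size;    /* bytes of data following this header */
    size_t in_use;  /* 0 = free, 1 = in use */
};

static struct hdr *next_hdr(struct hdr *h)
{
    return (struct hdr *)((unsigned char *)(h + 1) + h->size);
}

/* Set up `region` as one large free block followed by the end marker. */
struct hdr *region_init(void *region, size_t len)
{
    const size_t unit = sizeof(struct hdr);
    if (region == NULL || len < 3 * unit)
        return NULL;
    struct hdr *first = region;
    first->size = (len - 2 * unit) / unit * unit;
    first->in_use = 0;
    struct hdr *end = next_hdr(first);
    end->size = 0;
    end->in_use = 1;
    return first;
}

void *my_alloc(struct hdr *first, size_t size)
{
    const size_t unit = sizeof(struct hdr);
    if (size == 0 || size > SIZE_MAX - unit)
        return NULL;
    size = (size + unit - 1) / unit * unit;
    for (struct hdr *h = first; h->size != 0; h = next_hdr(h)) {
        if (h->in_use)
            continue;
        /* Absorb free blocks that directly follow this one. */
        for (struct hdr *n = next_hdr(h); n->size != 0 && !n->in_use; n = next_hdr(h))
            h->size += unit + n->size;
        if (h->size < size)
            continue;
        if (h->size >= size + 2 * unit) {   /* split off the unused tail */
            size_t rest = h->size - size - unit;
            h->size = size;
            struct hdr *r = next_hdr(h);
            r->size = rest;
            r->in_use = 0;
        }
        h->in_use = 1;
        return h + 1;   /* the data starts right after the header */
    }
    return NULL;
}

void my_free(void *ptr)
{
    if (ptr != NULL)
        ((struct hdr *)ptr - 1)->in_use = 0;   /* header sits just before the data */
}
```

The region can be anything you own: a static array, a shared memory segment, or a memory-mapped file.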

C - Design your own free( ) function

Today, I appeared for an interview and the interviewer asked me this,
Tell me the steps how will you design your own free( ) function for
deallocate the allocated memory.
How can it be more efficient than C's default free() function ? What can you conclude ?
I was confused, couldn't think of the way to design.
What do you think guys ?
EDIT : Since we need to know about how malloc() works, can you tell me the steps to write our own malloc() function
That's actually a pretty vague question, and that's probably why you got confused. Does he mean, given an existing malloc implementation, how would you go about trying to develop a more efficient way to free the underlying memory? Or was he expecting you to start discussing different kinds of malloc implementations and their benefits and problems? Did he expect you to know how virtual memory functions on the x86 architecture?
Also, by more efficient, does he mean more space efficient or more time efficient? Does free() have to be deterministic? Does it have to return as much memory to the OS as possible because it's in a low-memory, multi-tasking environment? What's our criteria here?
It's hard to say where to start with a vague question like that, other than to start asking your own questions to get clarification. After all, in order to design your own free function, you first have to know how malloc is implemented. So chances are, the question was really about whether or not you knew anything about how malloc can be implemented.
If you're not familiar with the internals of memory management, the easiest way to get started with understanding how malloc is implemented is to first write your own.
Check out this IBM DeveloperWorks article called "Inside Memory Management" for starters.
But before you can write your own malloc/free, you first need memory to allocate/free. Unfortunately, in a protected mode OS, you can't directly address the memory on the machine. So how do you get it?
You ask the OS for it. With the virtual memory features of the x86, any piece of RAM or swap memory can be mapped to a memory address by the OS. What your program sees as memory could be physically fragmented throughout the entire system, but thanks to the kernel's virtual memory manager, it all looks the same.
The kernel usually provides system calls that allow you to map in additional memory for your process. On older UNIX OS's this was usually brk/sbrk to grow heap memory onto the edge of your process or shrink it off, but a lot of systems also provide mmap/munmap to simply map a large block of heap memory in. It's only once you have access to a large, contiguous looking block of memory that you need malloc/free to manage it.
Once your process has some heap memory available to it, it's all about splitting it into chunks, with each chunk containing its own meta information about its size and position and whether or not it's allocated, and then managing those chunks. A simple list of structs, each containing some fields for meta information and a large array of bytes, could work, in which case malloc has to run through the list until it finds a large enough unallocated chunk (or chunks it can combine), and then map in more memory if it can't find a big enough chunk. Once you find a chunk, you just return a pointer to the data. free() can then use that pointer to reverse back a few bytes to the member fields that exist in the structure, which it can then modify (i.e. marking chunk.allocated = false;). If there are enough unallocated chunks at the end of your list, you can even remove them from the list and unmap or shrink that memory off your process's heap.
That's a real simple method of implementing malloc though. As you can imagine, there's a lot of possible ways of splitting your memory into chunks and then managing those chunks. There's as many ways as there are data structures and algorithms. They're all designed for different purposes too, like limiting fragmentation due to small, allocated chunks mixed with small, unallocated chunks, or ensuring that malloc and free run fast (or sometimes even more slowly, but predictably slowly). There's dlmalloc, ptmalloc, jemalloc, Hoard's malloc, and many more out there, and many of them are quite small and succinct, so don't be afraid to read them. If I remember correctly, "The C Programming Language" by Kernighan and Ritchie even uses a simple malloc implementation as one of their examples.
You can't blindly design free() without knowing how malloc() works under the hood because your implementation of free() would need to know how to manipulate the bookkeeping data and that's impossible without knowing how malloc() is implemented.
So an answerable question could be how you would design malloc() and free() together. That is not a trivial question either, but you could answer it partially, for example by proposing a very simple implementation of a memory pool. It would not be equivalent to malloc(), of course, but it would show that you know the subject.
One common approach when you only have access to user space (generally known as memory pool) is to get a large chunk of memory from the OS on application start-up. Your malloc needs to check which areas of the right size of that pool are still free (through some data structure) and hand out pointers to that memory. Your free needs to mark the memory as free again in the data structure and possibly needs to check for fragmentation of the pool.
The benefits are that you can do allocation in nearly constant time, the drawback is that your application consumes more memory than actually is needed.
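To make the constant-time claim concrete, here is a sketch of the simplest kind of pool, one that only hands out fixed-size slots. This is a narrower design than a general malloc, and SLOT_SIZE, SLOT_COUNT, pool_alloc and pool_free are hypothetical names chosen for the example:

```c
#include <stddef.h>

#define SLOT_SIZE  64    /* hypothetical object size */
#define SLOT_COUNT 128   /* hypothetical number of objects */

/* A free slot stores the link to the next free slot in its own bytes. */
union slot {
    union slot *next_free;
    max_align_t align;               /* forces strict alignment */
    unsigned char data[SLOT_SIZE];
};

static union slot slots[SLOT_COUNT];
static union slot *free_list;
static int pool_ready;

/* Chain all slots into one free list. */
static void pool_init(void)
{
    for (size_t i = 0; i + 1 < SLOT_COUNT; i++)
        slots[i].next_free = &slots[i + 1];
    slots[SLOT_COUNT - 1].next_free = NULL;
    free_list = &slots[0];
    pool_ready = 1;
}

/* O(1): pop the first free slot, or return NULL when the pool is empty. */
void *pool_alloc(void)
{
    if (!pool_ready)
        pool_init();
    union slot *s = free_list;
    if (s != NULL)
        free_list = s->next_free;
    return s;
}

/* O(1): push the slot back onto the free list. */
void pool_free(void *ptr)
{
    if (ptr == NULL)
        return;
    union slot *s = ptr;
    s->next_free = free_list;
    free_list = s;
}
```

The drawback mentioned above is visible here too: all SLOT_COUNT slots are reserved whether or not the program ever uses them.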
Tell me the steps how will you design your own free( ) function for deallocate the allocated memory.
#include <stdlib.h>
#undef free
#define free(X) my_free(X)
static inline void my_free(void *ptr) { (void)ptr; }
How can it be more efficient than C's default free() function ?
It is extremely fast, requiring zero machine cycles. It also makes use-after-free bugs go away. It's a very useful free function for use in programs which are instantiated as short-lived batch processes; it can usefully be deployed in some production situations.
What can you conclude ?
I really want this job, but in another company.
Memory usage patterns could be a factor. A default implementation of free can't assume anything about how often you allocate/deallocate and what sizes you allocate when you do.
For example, if you frequently allocate and deallocate objects that are of similar size, you could gain speed, memory efficiency, and reduced fragmentation by using a memory pool.
EDIT: as sharptooth noted, only makes sense to design free and malloc together. So the first thing would be to figure out how malloc is implemented.
malloc and free only have a meaning if your app runs on top of an OS. If you would like to write your own memory management functions, you have to know how to request memory from that specific OS. Alternatively, you could reserve the heap memory up front using the existing malloc and then use your own functions to distribute and redistribute that memory throughout your app.
There is an architecture that malloc and free are supposed to adhere to -- essentially a class architecture permitting different strategies to coexist. Then the version of free that is executed corresponds to the version of malloc used.
However, I'm not sure how often this architecture is observed.
Knowing how malloc() works is necessary to implement free(). You can find an implementation of malloc() and free() using the sbrk() system call in K&R The C Programming Language Chapter 8, Section 8.7 "Example--A Storage Allocator" pp.185-189.

Removing Dynamic Memory Allocation - from an embedded C program

I'm trying to port a C library to an embedded platform (Xilinx Microblaze), and the library contains some calls to malloc(), alloc(), calloc() and free().
These function calls require additional libraries to be imported into the embedded platform, and will make the program code larger.
What are the best steps to take to remove dynamic allocation from a C program and use only static allocation? What facts should I find out, and what calculations should I make? Any tips are welcome.
example of malloc call:
decoder->sync = malloc(sizeof(*decoder->sync));
if (decoder->sync == 0)
return -1;
Many Thanks,
Rosh
There are two issues to deal with when converting dynamic memory allocations (runtime) to static allocations (compile time). First, the compiler obviously has to know how much memory to allocate at compile time. In your example above it looks like whatever decoder->sync points to is a constant size, so it shouldn't be a problem. If you were allocating memory for a byte array for a variable length data sequence, though, you would have a problem. You would either have to allocate enough for the maximum possible data length, or break the data up into chunks, or... hopefully you get the idea.
The other issue is heap vs. stack. All dynamic memory allocations come from the heap. Automatic (non-static local) variables live on the stack, and stacks can be pretty small in embedded environments. This means that if the memory allocation is medium-to-largeish, you will probably need to make it global or "static" (static variables, including locally scoped ones, are placed in a fixed data area reserved at link time, not on the stack) to avoid stack overflow, even if the variable would not otherwise need to be global.
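For the fixed-size case in the question, the conversion could look something like the sketch below. The struct layouts and the names sync_storage, decoder_init and decoder_finish are hypothetical stand-ins, since the real library's types are not shown, and it assumes only one decoder is alive at a time:

```c
#include <stddef.h>

/* Hypothetical stand-ins for the library's real types. */
struct sync_state { unsigned char buffer[256]; size_t length; };
struct decoder { struct sync_state *sync; };

/* One statically allocated instance replaces the malloc() call. */
static struct sync_state sync_storage;
static int sync_storage_used;

int decoder_init(struct decoder *decoder)
{
    if (sync_storage_used)
        return -1;                  /* same error path as a failed malloc() */
    sync_storage_used = 1;
    decoder->sync = &sync_storage;  /* was: malloc(sizeof(*decoder->sync)) */
    return 0;
}

void decoder_finish(struct decoder *decoder)
{
    decoder->sync = NULL;           /* was: free(decoder->sync) */
    sync_storage_used = 0;
}
```

If several decoders must exist at once, the single instance would become a static array with one in-use flag per element.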
Hope this makes sense.

When is malloc necessary in C?

I think all malloc(sizeof(structure)) can be replaced this way:
char[sizeof(structure)]
Then when is malloc necessary?
When you don't know how many object of some kind you need (e.g. linked list elements);
when you need to have data structures of size known only at runtime (e.g. strings based on unknown input); this is somewhat mitigated by the introduction of VLAs in C99, but see the next point:
when you know at compile time their size (or you can use VLAs), but it's just too big for the stack (typically a few MBs at most) and it would make no sense to make such thing global (e.g. big vectors to manipulate);
when you need to have an object whose lifetime is different than what automatic variables, which are scope-bound (=>are destroyed when the execution exits from the scope in which they are declared), can have (e.g. data that must be shared between different objects with different lifetimes and deleted when no one uses it anymore).
Notice that it isn't completely impossible to do without dynamic memory allocation (e.g. the whole rockbox project works almost without it), but there are cases in which you actually need to emulate it by using a big static buffer and writing your own allocator.
By the way, in C++ you would normally not use malloc()/free(), but the operators new and delete.
Related: a case in which trying to work without malloc has proven to be a big mess.
You will use malloc to dynamically allocate memory, either because:
you don't know at compile-time how much memory will be required,
you want to be able to reallocate memory later on (for instance using realloc),
you want to be able to discard the allocated memory earlier than by waiting for its release based on the scope of your variable.
I can see your point. You might think you could always use a declarative syntax for all of these, even using variables to declare the size of your memory spaces, but that would:
be non-standard,
give you less control,
possibly use more memory as you will need to do copies instead of re-allocating.
You will probably get to understand this in time, don't worry.
Also, you should try to learn more about the memory model. You don't use the same memory spaces when using a dynamic allocation and when using a static allocation.
For first pointers, visit:
Dynamic Memory Allocation
Static Memory Allocation
Stack vs Heap
Stack vs Heap?
How C Programming Works - Dynamic Data Structures
Friendly advice: I don't know if you develop C on *NIX or Windows, but in any case if you use gcc, I recommend using the following compilation flags when you teach yourself:
-Wall -ansi -pedantic -Wstrict-prototypes
You should read about dynamic memory allocation. You obviously don't know what it is.
The main difference between the two is that memory allocated with malloc() exists until you free it. An automatic array such as char buff[10]; declared inside a function only exists until that function returns.
malloc is a dynamic memory allocator that lets you assign memory to your variables according to your needs, and therefore reduces wasted memory. It is complemented by the realloc() function, with which you can change the size of memory you obtained earlier through malloc() or calloc(). So in short, malloc() can be used for managing memory space and using only the memory that is needed.
You never should do this the way you are proposing. Others already told you about the difference of allocating storage on the heap versus allocation on the function stack. But if and when you are allocating on the stack you should just declare your variable:
structure A = { /* initialize correctly */ };
There is no sense or point in doing that as a (basically) untyped char array. If you also need the address of that beast, well, take the address with &A.
When you don't know how much memory to allocate at compile time. Take a very simple program where you need to store the numbers entered by the user in a linked list. Here you don't know how many numbers will be entered by the user. So as the user enters a number, you create a node for it using malloc and store it in the linked list.
If you use char[sizeof(structure)] instead of malloc, then I think no dynamic memory allocation is done.
Besides the fact that your char[] method cannot resize or determine the size at runtime, your array might not be properly aligned for the type of structure you want to use it for. This can result in undefined behaviour.

Checking if something was malloced

Given a pointer to some variable.. is there a way to check whether it was statically or dynamically allocated??
Quoting from your comment:
im making a method that will basically get rid of a struct. it has a data member which is a pointer to something that may or may not be malloced.. depending on which one, i would like to free it
The correct way is to add another member to the struct: a pointer to a deallocation function.
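A minimal sketch of that suggestion; the names struct holder, release and holder_destroy are made up for the example:

```c
#include <stdlib.h>

struct holder {
    char *data;
    void (*release)(void *);   /* how to free data; NULL means "do not free" */
};

/* Whoever fills in the struct decides how (and whether) data is released. */
void holder_destroy(struct holder *h)
{
    if (h->release != NULL)
        h->release(h->data);
    h->data = NULL;
    h->release = NULL;
}
```

A holder whose data came from malloc() would set release to free, while one pointing at a static buffer would leave release as NULL, so the destroy function never has to guess.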
It is not just static versus dynamic allocation. There are several possible allocators, of which malloc() is just one.
On Unix-like systems, it could be:
A static variable
On the stack
On the stack but dynamically allocated (i.e. alloca())
On the heap, allocated with malloc()
On the heap, allocated with new
On the heap, in the middle of an array allocated with new[]
On the heap, within a struct allocated with malloc()
On the heap, within a base class of an object allocated with new
Allocated with mmap
Allocated with a custom allocator
Many more options, including several combinations and variations of the above
On Windows, you also have several runtimes, LocalAlloc, GlobalAlloc, HeapAlloc (with several heaps which you can create easily), and so on.
You must always release memory with the correct release function for the allocator you used. So, either the part of the program responsible for allocating the memory should also free the memory, or you must pass the correct release function (or a wrapper around it) to the code which will free the memory.
You can also avoid the whole issue by either requiring the pointer to always be allocated with a specific allocator or by providing the allocator yourself (in the form of a function to allocate the memory and possibly a function to release it). If you provide the allocator yourself, you can even use tricks (like tagged pointers) to allow one to also use static allocation (but I will not go into the details of this approach here).
Raymond Chen has a blog post about it (Windows-centric, but the concepts are the same everywhere): Allocating and freeing memory across module boundaries
The ACE library does this all over the place. You may be able to check how they do it. In general you probably shouldn't need to do this in the first place though...
Since the heap, the stack, and the static data area generally occupy different ranges of memory, it is possible with intimate knowledge of the process memory map, to look at the address and determine which allocation area it is in. This technique is both architecture and compiler specific, so it makes porting your code more difficult.
Many libc malloc implementations store a header just before each returned memory block. The header has fields (used by the free() call) that record the size of the block and, in some implementations, a 'magic' value. The magic value protects against the user accidentally freeing a pointer which wasn't alloc'd (or freeing a block whose header was overwritten by the user). It's very system specific, so you'd have to look at the implementation of your libc to see whether a magic value is there and what it is.
Once you know that, you move the given pointer back to point at header and then check it for the magic value.
Can you hook into malloc() itself, like the malloc debuggers do, using LD_PRELOAD or something? If so, you could keep a table of all the allocated pointers and use that. Otherwise, I'm not sure. Is there a way to get at malloc's bookkeeping information?
Not as a standard feature.
A debug version of your malloc library might have some function to do this.
You can compare its address to something you know to be static, and say it's malloced only if it's far away, if you know the scope it should be coming from, but if its scope is unknown, you can't really trust that.
1.) Obtain a map file for the code you have.
2.) The underlying process/hardware target platform should have a memory map file, which typically indicates the starting address of each memory block (stack, heap, global), the size of that block, and its read-write attributes.
3.) After getting the address of the object (pointer variable) from the map file in 1.), try to see which block that address falls into. You might get some idea.

Resources