Unit testing involving C free() - c

I'm working in Unix/Linux with C. I have a basic understanding of how memory allocation works, enough to know that if I malloc() then free(), I'm not likely going to actually free an entire page; thus if I use getrusage() before and after a free() I'm not likely going to see any difference.
I'd like to write a unit test for a function which destroys a data structure to see that the memory regions involved has actually been freed. I'm open to an OS-dependent solution, in which case my primary platform is
Linux beast 3.2.0-23-generic #36-Ubuntu SMP Tue Apr 10 20:39:51 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
with OS X and FreeBSD as secondaries. I'm also open to a drop in replacement malloc() if there's a solution that makes checking free() relatively easy.
To be clear, I'm testing a routine that is to delete a large data structure, and I want to make sure that all the allocated regions are actually freed, in essence a unit test that the particular unit does not have a basic memory leak. I'm going to assume that free() does its job, I'm just making sure that my code actually calls free on all allocated regions it's responsible for.
In this particular case it's a tree structure and for each piece of data in the tree the structure is responsible for calling the routine that deletes the data stored in the tree as well, which might be some other arbitrary thing...

I hate to do this again, but after a night of sleep I found an explicit answer, it comes from SVID (does anyone even remember System V?), but is incorporated in glibc on Linux, likely through it's use in dlmalloc; hence it can be used on other system by using dlmalloc() as a drop in replacement malloc.
use the routine
struct mallinfo mallinfo(void);
and struct mallinfo is
struct mallinfo
{
int arena; /* non-mmapped space allocated from system */
int ordblks; /* number of free chunks */
int smblks; /* number of fastbin blocks */
int hblks; /* number of mmapped regions */
int hblkhd; /* space in mmapped regions */
int usmblks; /* maximum total allocated space */
int fsmblks; /* space available in freed fastbin blocks */
int uordblks; /* total allocated space */
int fordblks; /* total free space */
int keepcost; /* top-most, releasable (via malloc_trim) space */
};
In particular arena and uordblks will give you the number of bytes allocated by malloc() independent of the size of the pages requested from the OS using sbrk() or mmap(), which are given by arena and hblkhd.

My best suggestion is to use valgrind.
They make wrappers for malloc and free and shows you memory leaks better than you will do with a homegrown system. And they catch all kinds of other errors not easily spotted and tested with unittests like uninitialized variables.
So always run your tests with valgrind and make your success criteria that all tests pass and that valgrind shows no errors and no un-allocated memory.
PS: It might be a bit of a pain getting started if you already have a bit of a codebase, but if you take the time to fix all the valgrind errors both your tests and your system will be much more reliable!

One trick you can use for all kinds of memory leak debug is to simply increase the scale until the problem stands out more. Do one malloc/free cycle, then capture the results of getrusage(), then do 1000 more malloc/free cycles and ensure your process didn't actually allocate more memory. By leaking 1000 of them you increase the scale of the problem to the point where it stands out.
If you want something more deterministic (but slightly more intrusive) you can override malloc() and free() to track allocations and observe the tracking data from the test harness. You can find several examples of debug malloc libraries to show you how it's done.

The easiest way is to make wrappers for malloc and free. If you use the real malloc and free then the following could happen:
you allocate memory
you do something with that memory
you free the memory
someone else in the OS, even something else in your own code, needs memory and the OS allocates that memory elsewhere, since it's now free
you test to see if the memory is free, and it's not, but you think it should be, so you get a false test failure.

Related

Is it a memory leak in C when code fails to free memory, but the OS will anyway?

Say I have the following program for demonstration purposes only:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
char *my_memory = malloc(50000);
int *my_int = malloc(sizeof(int));
//Do other things but never free my_memory or my_int
return 0;
}
According to Wikipedia:
In computer science, a memory leak is a type of resource leak that occurs when a computer program incorrectly manages memory allocations1 in such a way that memory which is no longer needed is not released. A memory leak may also happen when an object is stored in memory but cannot be accessed by the running code.[2]
Sentence #1 implies that a memory leak can be unneeded, unfreed memory. However, many, many programmers I've worked with have stated that "there is no memory leak because the OS frees the memory." These programmers believe that memory leaks only occur if there is no longer a reference point or handle to some allocation, thus making it no longer accessible and no longer able to be freed.
As far as I know, it is true that the OS does free the memory, in regards to modern macOS, Windows, and Linux.
I've used AddressSanitizer, Dr. Memory, and Valgrind; they flag this type of program as having "memory leaks."
So my question is, is an example such as the above, where memory is allocated and not freed before program termination, a memory leak or not?
Yes and no.
Most programs are written with the expectation that they free their memory before termination. But most operating systems are written to not assume anything about the memory used by applications, so they will indeed reclaim that memory after a program has terminated.
It's not that freeing memory in your application code always directly returns it to the operating system anyway--typical memory allocators will just mark the memory as unused and available for use in a subsequent call to malloc().
There are also programs written with the express intent of not freeing their allocated memory. This is a simple, effective way to implement very high performance deallocation, because the operating system sees the allocated memory as a small number of large chunks and takes much less time to deallocate those chunks than if the application were to explicitly free each of them (assuming they were allocated as a much larger number of small objects from the application perspective). This is a valid approach, though it does complicate the use of some analysis tools like valgrind, so some programs avoid it by explicitly allocating and deallocating a small number of large "pools" of memory and internally dividing those up to store small objects. This "pool allocator" approach is analogous to what the C standard malloc() and free() normally do for you.
yes this is a memory leak.
each time you allocate space you must free that space when you don't need it any more.
since allocated memory can be shared between threads, it's hard for OS to detect that there is no more programs that doesn't use that space any more and it will still allocated after the end of the program.

Malloc vs custom allocator: Malloc has a lot of overhead. Why?

I have an image compression application that now has two different versions of memory allocation systems. In the original one, malloc is used everywhere, and in the second one, I implemented a simple pool-allocator, that just allocates chunk of memory and returns parts of that memory to myalloc() calls.
We've been noticing a huge memory overhead when malloc is used: At the height of its memory usage, the malloc() code requires about 170 megabytes of memory for a 1920x1080x16bpp image, while the pool allocator allocates just 48 megabytes, of which 47 are used by the program.
In terms of memory allocation patterns, the program allocates a lot of 8byte(most), 32-byte(many) and 1080byte-blocks(some) with the test image. Apart from these, there are no dynamic memory allocations in the code.
The OS of the testing system is Windows 7 (64 Bit).
How did we test memory usage?
With the custom allocator, we could see how much memory is used because all malloc calls are defered to the allocator. With malloc(), in Debug mode we just stepped through the code and watched the memory usage in the task manager. In release mode we did the same, but less fine grained because the compiler optimizes a lot of stuff away so we couldn't step through the code piece by piece (the memory difference between release and debug was about 20MB, which I would attribute to optimization and lack of debug information in release mode).
Could malloc alone be the cause of such a huge overhead? If so, what exactly causes this overhead inside malloc?
On Windows 7 you will always get the low-fragmentation heap allocator, without explicitly calling HeapSetInformation() to ask for it. That allocator sacrifices virtual memory space to reduce fragmentation. Your program is not actually using 170 megabytes, you are just seeing a bunch of free blocks lying around, waiting for an allocation of a similar size.
This algorithm is very easy to beat with a custom allocator that doesn't do anything to reduce fragmentation. Which may well work out for you, albeit that you don't see the side effects of it until you keep the program running longer than a single debug session. You do need to make sure it is stable for days or weeks if that is the expected usage pattern.
Best thing to do is just not fret about it, 170 MB is rather small potatoes. And do keep in mind that this is virtual memory, it doesn't cost anything.
First at all malloc aligns the pointers to 16 byte boundaries. Furthermore they store at least one pointer (or allocated length) in the addresses preceding the returned value. Then they probably add a magic value or release counter to indicate that the linked list is not broken or that the memory block has not been released twice (free ASSERTS for double frees).
#include <stdlib.h>
#include <stdio.h>
int main(int ac, char**av)
{
int *foo = malloc(4);
int *bar = malloc(4);
printf("%d\n", (int)bar - (int)foo);
}
Return: 32
Caution: When you run your program in the Visual Studio or with any debugger attached, by default the malloc behaviour is changed a lot, Low Fragmentation Heap is not used and a memory overhead may be not representative of real usage (see also https://stackoverflow.com/a/3768820/16673). You need to use environment variable _NO_DEBUG_HEAP=1 to avoid being hit by this, or to measure the memory usage when not running under a debugger.

How to write a simple malloc function in c

As an assignment in operating systems we have to write our own code for malloc and free in C programming language, I know if i asked the code for it there is no point of me to study. i'm facing the problem of not knowing where to include initializing char array with 50000 bytes and making two lists free and used. in my function i can't trigger malloc or free to happen automatically. and a 3rd party main program will be used to test my functions.....
if my file is mymalloc.c or what ever
void* myalloc(size_t size)
{
//code for allocating memory
}
void myfree(void *ptr)
{
//code for free the memory
}
where do the code for initiating memory space and lists will go..
I will provide you with the basic concept which you can use to write your own code for malloc() and free() functions using C.
Assume that we have a contiguous block of memory of a certain size. It will be our abstract sense of memory which will carry all the requested memory allocations plus the data structures that are used to hold data about those allocated blocks.
We use a simple linked list to carry the data related to the allocated as well as free blocks of memory.
Its structure is as follows.
struct block{
size_t size; /*Specifies the size of the block to which it refers*/
int free; /*This is the flag used to identify whether a block is free
or not*/
struct block *next; /*This points to the next metadata block*/
};
You will need 2 source files for this purpose. One is mymalloc.h which is the header file which contains the initialization parts and the function prototypes of the rest of the functions that we are going to implement. The other is the mymalloc.c source file which contains all the necessary function implementations.
There needs to be a function to initialize the first free memory block.
And another function to split a block of memory which has more than enough space to give to the requested size. And another method to scan through the linked list and merge any consecutive blocks that are free, so that it prevents external fragmentation.
Note: We use the First-fit-algorithm to find a free block to allocate memory.
I think this will help anyone who is in search of a simple way to write their own malloc and free functions using C. Please follow the following link for a detailed explanation.
http://tharikasblogs.blogspot.com/p/how-to-write-your-own-malloc-and-free.html
I think you only have to implement a memory manager. So you don't have to use brk, sbrk, ...
Just put used memory in a simple array and fragment it somehow. Since it's homework you want to make it as simple as possible or else you run into problems due to complexity/time constraints of your assignment.
You only have to decide which tactic you want to use. I'd suggest to use the buddy system. Though it's a bit more complicated than the most simple ones.. maybe fixed sized fragmentation is simpler..
Maybe this is also a good read.
Don't do something low-level as suggested in the other answers..
The implementation greatly depends upon operating system and architecture, anyhow you may take a look at this: http://www.raspberryginger.com/jbailey/minix/html/lib_2ansi_2malloc_8c-source.html
(and study how it works!).
If you are on a unix system you can look the manual of brk and sbrk. Those system calls "push/set" the limit of the heap.
Using those you can manage your memory pages, allocating them as you need.
I would advise a chained-list to manage your different allocated spaces and building functions to split them or to merge them if they are free.
If you need to try your code with high-level applications, you can name your functions malloc/free, compile them to a shared-object (.so) and then use LD_PRELOAD and LD_LIBRARY_PATH environment variables to load your .so and replace system's malloc.
Every command you call then will use your shared object and thus your malloc, telling you if your malloc is stable or if it fails to comply with reality.
If you need a clear example of this i'd be happy to put some code here, but I do not want to make my answer too hard to read.
First, you could make a fake malloc which always fail
/* fake malloc */
void* myalloc(size_t sz)
{ return NULL; }
but that is "cheating". You want to make a malloc which is useful.
You probably want to make a system call which asks the kernel for memory. Of course, you'll need the symetrical syscall to release memory. On Linux and many Posix systems you'll often use mmap and munmap syscalls.
(You could also use sbrk, but using mmap with munmap is easier and more general)
The idea is that you get big chunks of memory (with mmap) and then you manage smaller memory zones inside. The interesting detail is how to manage these smaller zones. You may want to deal with large malloc differently than "small" allocations.
You really want to read wikipedia page on memory allocation
You could have a global static variable that is initialized to zero. Then check that variable at the start of your malloc and free function. In your malloc function, if the variable is zero then initialize whatever you need, and then set the variable to non-zero. In your free function, just return if the variable is zero.
More like that, is a simple malloc :
void* my_malloc(size_t size)
{
return (sbrk(size));
}
man sbrk will help you.
The problem now is to create a free and to create a efficient malloc :-)
if you want to test your malloc you can do like this :
$> LD_PRELOAD=/mypath/my_malloc.so /bin/ls
but you need to create a dynamic library before because malloc is a .so

How to find how much memory is actually used up by a malloc call?

If I call:
char *myChar = (char *)malloc(sizeof(char));
I am likely to be using more than 1 byte of memory, because malloc is likely to be using some memory on its own to keep track of free blocks in the heap, and it may effectively cost me some memory by always aligning allocations along certain boundaries.
My question is: Is there a way to find out how much memory is really used up by a particular malloc call, including the effective cost of alignment, and the overhead used by malloc/free?
Just to be clear, I am not asking to find out how much memory a pointer points to after a call to malloc. Rather, I am debugging a program that uses a great deal of memory, and I want to be aware of which parts of the code are allocating how much memory. I'd like to be able to have internal memory accounting that very closely matches the numbers reported by top. Ideally, I'd like to be able to do this programmatically on a per-malloc-call basis, as opposed to getting a summary at a checkpoint.
There isn't a portable solution to this, however there may be operating-system specific solutions for the environments you're interested in.
For example, with glibc on Linux, you can use the mallinfo() function from <malloc.h> which returns a struct mallinfo. The uordblks and hblkhd members of this structure contains the dynamically allocated address space used by the program including book-keeping overhead - if you take the difference of this before and after each malloc() call, you will know the amount of space used by that call. (The overhead is not necessarily constant for every call to malloc()).
Using your example:
char *myChar;
size_t s = sizeof(char);
struct mallinfo before, after;
int mused;
before = mallinfo();
myChar = malloc(s);
after = mallinfo();
mused = (after.uordblks - before.uordblks) + (after.hblkhd - before.hblkhd);
printf("Requested size %zu, used space %d, overhead %zu\n", s, mused, mused - s);
Really though, the overhead is likely to be pretty minor unless you are making a very very high number of very small allocations, which is a bad idea anyway.
It really depends on the implementation. You should really use some memory debugger. On Linux Valgrind's Massif tool can be useful. There are memory debugging libraries like dmalloc, ...
That said, typical overhead:
1 int for storing size + flags of this block.
possibly 1 int for storing size of previous/next block, to assist in coallescing blocks.
2 pointers, but these may only be used in free()'d blocks, being reused for application storage in allocated blocks.
Alignment to an approppriate type, e.g: double.
-1 int (yes, that's a minus) of the next/previous chunk's field containing our size if we are an allocated block, since we cannot be coallesced until we're freed.
So, a minimum size can be 16 to 24 bytes. and minimum overhead can be 4 bytes.
But you could also satisfy every allocation via mapping memory pages (typically 4Kb), which would mean overhead for smaller allocations would be huge. I think OpenBSD does this.
There is nothing defined in the C library to query the total amount of physical memory used by a malloc() call. The amount of memory allocated is controlled by whatever memory manager is hooked up behind the scenes that malloc() calls into. That memory manager can allocate as much extra memory as it deemes necessary for its internal tracking purposes, on top of whatever extra memory the OS itself requires. When you call free(), it accesses the memory manager, which knows how to access that extra memory so it all gets released properly, but there is no way for you to know how much memory that involves. If you need that much fine detail, then you need to write your own memory manager.
If you do use valgrind/Massif, there's an option to show either the malloc value or the top value, which differ a LOT in my experience. Here's an excerpt from the Valgrind manual http://valgrind.org/docs/manual/ms-manual.html :
...However, if you wish to measure all the memory used by your program,
you can use the --pages-as-heap=yes. When this option is enabled,
Massif's normal heap block profiling is replaced by lower-level page
profiling. Every page allocated via mmap and similar system calls is
treated as a distinct block. This means that code, data and BSS
segments are all measured, as they are just memory pages. Even the
stack is measured...

how to free memory assigned using malloc?

struct element {
unsigned long int ip;
int type;
int rtt;
struct element * next;
struct element * edge;
};
I have a linked list. I create new nodes using malloc.
I tried to free up memory using free (ptr to node)
but when I run the traverse function again, I can traverse the linked list and the rtt value is correct as well as the next and edge pointers as I can follow the linked list. ONly the ip value is corrupted. why is this?
The behaviour of malloc() and free() depends heavily on the operating system and C library that you are using. In most implementations there are actually two memory allocators in play:
The OS memory allocator, which uses the virtual memory facilities of the processor to provide a process with its own address space and maps physical memory pages into that address space for use.
The C library memory allocator, which is in fact part of the application code and uses the pages provided by the OS to provide fine-grained memory management facilities, as provided by malloc() and free().
In general, calling free() does one or more of the following:
It marks the pointed-to memory area as free in the C memory allocator. This allows that memory to be reused. free() does not zero-out freed memory.
It may return memory to the OS, depending on the settings of the C memory allocator and whether it is actually possible to free that part of the heap. If memory is not returned to the OS, it can be reused by future malloc() calls by the same application.
If you try to access memory that has been freed, usually one of three things will happen:
The memory has been returned to the OS, and your program, typically, crashes. If you ask me, that is probably the best scenario - you have a problem, sure, but you know it.
The memory has not been reused, therefore your old data is still there. Your program goes on as if nothing was wrong. This is in my opinion the worst case scenario. Your code appears to work correctly and, if Murphy has a say in this, it will continue to do so until it reaches your end users - then it will fail spectacularly.
The memory has been reused by your program, and your code will start messing around with its own data. If you are careful (and lucky?), you will probably notice that the results are off. If not, well...
If you are on Linux/Unix Valgrind is a good tool to catch memory management problems like this. There are also replacement libraries for the C memory allocator, such as DUMA that will also allow you to detect such issues.
Memory is not wiped when you free it - that would be a waste of processor time. It is just allocated to the "free list". That is why your data is still there
Whenever you free a block, you should set the corresponding pointer to NULL, so you don't accidentally reference it - it could be reused at any time.
Actually free doesn't delete anything, it just tells the OS it can use that memory again, for example next time you call malloc() it could overwrite some of your nodes.
Freeing the memory releases it for reuse. It doesn't necessarily destroy the data that was at that location. It is up to you not to access a memory region that has been released because the behavior is undefined (i.e. usually very bad).
If you want the data destroyed for some strange reason, then overwrite the memory area prior to freeing it (e.g. memset(buf, 0, len)).

Resources