I have a C application which I was building using MS Visual Studio 2005. It had an output data buffer which was allocated dynamically using malloc.
For some test cases, the size being malloc'd fell short of the actual output size in bytes that was generated. The larger output was written into the smaller buffer, causing a buffer overflow. As a result, the test run crashed, with MSVS 2005 showing a "Heap corruption ...." window.
I knew it had to do with dynamic memory allocation, but it took me a long time to find the root cause, because I did not suspect the allocation: I believed I was allocating a size large enough for the output. But one particular test case was generating more output than I had calculated, hence the crash.
My question is:
1.) What tools can I use to detect such dynamic-memory buffer overflow conditions? Can they also detect buffer overflow conditions in general (irrespective of whether the buffer/array is on the heap, the stack, or in a global memory area)?
2.) Would memory leak tools (say, Purify) or code analysis tools like lint or Klocwork have helped in this particular case? I believe it would have to be a run-time analysis tool.
Thank you.
-AD.
One solution, which I first encountered in the book Writing Solid Code, is to "wrap" the malloc() API with diagnostic code.
First, the diagnostic malloc() arranges to allocate additional bytes for a trailing sentinel. For example, an additional four bytes following the allocated memory are reserved and contain the characters 'FINE'.
Later, when the pointer from malloc() is passed to free(), a corresponding diagnostic version of free() is called. Before calling the standard implementation of free() and relinquishing the memory, the trailing sentinel is verified; it should be unmodified. If the sentinel is modified, then the block pointer has been misused at some point subsequent to being returned from the diagnostic malloc().
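A minimal sketch of that idea (the names diagnostic_malloc/diagnostic_free and the 'FINE' pattern here are illustrative, not the book's actual code; a real wrapper would also record the requested size itself, e.g. in a hidden header before the block, instead of making the caller pass it to free):
#include <stdlib.h>
#include <string.h>
#include <assert.h>

#define SENTINEL     "FINE"
#define SENTINEL_LEN 4

void *diagnostic_malloc(size_t size)
{
    unsigned char *block = malloc(size + SENTINEL_LEN);
    if (block != NULL)
        memcpy(block + size, SENTINEL, SENTINEL_LEN);   /* plant the trailing sentinel */
    return block;
}

void diagnostic_free(void *ptr, size_t size)
{
    unsigned char *block = ptr;
    /* If the sentinel was overwritten, someone wrote past the end of the block. */
    assert(memcmp(block + size, SENTINEL, SENTINEL_LEN) == 0);
    free(block);
}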
There are advantages of using a memory protection guard page rather than a sentinel pattern for detecting buffer overflows. In particular, with a pattern-based method, the illegal memory access is detected only after the fact. Only illegal writes are detected by the sentinel pattern method. The memory protection method catches both illegal reads and writes, and they are detected immediately as they occur.
Diagnostic wrapper functions for malloc() can also address other misuses of malloc(), such as multiple calls to free() for same memory block. Also, realloc() can be modified to always move blocks when executed in a debugging environment, to test the callers of realloc().
In particular, the diagnostic wrappers may record all of the blocks allocated and freed, and report on memory leaks when the program exits. Memory leaks are blocks which were allocated but not freed during the program's execution.
When wrapping the malloc() API, one must wrap all of the related functions, including calloc(), realloc(), and strdup().
The typical way of wrapping these functions is via preprocessor macros:
#define malloc(s) diagnostic_malloc(s, __FILE__, __LINE__)
/* etc.... */
If the need arises to code a call to the standard implementation (for example, the allocated block will be passed to a third-party, binary-only library which expects to free the block using the standard free() implementation), the original function can still be reached despite the preprocessor macro by writing (malloc)(s) -- that is, by placing parentheses around the function name.
Something you can try is to allocate enough pages plus one using VirtualAlloc, use VirtualProtect with the PAGE_READONLY | PAGE_GUARD flags on the last page, then align the suspected allocation so that the end of the object sits right at the beginning of the protected page. If all goes well, you should get an access violation the moment the guard page is accessed. It helps if you know approximately which allocation is being overwritten; otherwise you would have to wrap all allocations this way, which can require a lot of extra memory (at least 2 pages per allocation).
A variation on this technique, which I'm hereby christening "statistical page guard", is to allocate memory that way for only a relatively small, random percentage of allocations, to avoid large bloat for small objects. Over a large number of execution runs you should be able to hit the error. The random number generator would have to be seeded off something like the time in this case. Similarly, you can place the guard page in front of the object if you suspect an overwrite at a lower address (you can't do both at the same time, but you can randomly mix the two as well).
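A rough sketch of the guard-page idea (my own illustration, not production code; freeing would need VirtualFree on the base address, and the returned pointer ignores alignment requirements):
#include <windows.h>

void *guarded_alloc(size_t size)
{
    SYSTEM_INFO si;
    GetSystemInfo(&si);
    size_t page = si.dwPageSize;

    size_t data_pages = (size + page - 1) / page;
    size_t total = (data_pages + 1) * page;            /* +1 for the trailing guard page */

    unsigned char *base = VirtualAlloc(NULL, total,
                                       MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
    if (base == NULL)
        return NULL;

    DWORD old;
    /* Any touch of the last page now raises a guard-page exception / access violation. */
    VirtualProtect(base + data_pages * page, page, PAGE_READONLY | PAGE_GUARD, &old);

    /* Place the object so it ends exactly where the guard page begins. */
    return base + data_pages * page - size;
}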
An update: it turns out the gflags.exe Microsoft utility (it used to be pageheap.exe) already supports "statistical page guard", so I reinvented the wheel :) All you need to do is run gflags.exe /p /enable [/random 0-100] YourApplication.exe and then run your app. If you are using a custom heap or custom guards on your heap allocations, you can simply switch to using HeapAlloc, at least for catching bugs, and then switch back. Gflags.exe is part of the Support Tools package and can be downloaded from the Microsoft Download Center; just do a search there.
PC-Lint can catch some forms of malloc/new size problems, but I'm not sure if it would have found yours.
VS2005 has good buffer overflow checking for stack objects in debug mode (runs at end of function). And it does periodic checking of the heap.
As for helping to track down where the problem occurred, this is where I tend to start using macros to dump all allocations, to match against the corrupted memory later (once it is detected).
Painful process, so I'm keen to learn better ways also.
Consider our Memory Safety Check. I think it will catch all the errors you describe. Yes, it is runtime checking of every access, with considerable overhead (not as bad as valgrind, we think), with the benefit of diagnosing the first program action that is erroneous.
I am trying to understand the scenarios in which a call to memcpy can fail silently, since invalid pointers will result in an access violation/segfault. Also, there will be issues in the case of overlapping pointers. Apart from these, are there any other ways the memcpy call can fail? Or can we consider that it will pass all the time without any error? How can I verify this?
memcpy has the precondition that the memory areas must not overlap. You invoke undefined behavior if you violate that precondition.
Similarly, you also invoke undefined behavior if you read past the bounds of the source buffer or write past the bounds of the destination buffer. This is dictated in the standard.
When you invoke undefined behavior, you can't predict how the program will behave in the future (or even in the past). It could crash, it could output strange results, or it could appear to work normally.
Using a tool such as valgrind is very helpful in identifying when your program violates various memory constraints, such as reading or writing past the end of a buffer, using an uninitialized value, dereferencing a null pointer, or performing a double free.
If you give it valid pointers that do not overlap and you do not overrun the buffers with your reads/writes, it will not fail. If you do some of those things, it may still not fail, but could do unexpected things. It will always return dest.
I am trying to understand the scenarios in which call to memcpy can fail silently because invalid pointers will result in access violation/segfaults.
Typically when there's an access violation/segfault, software fails loudly. This can happen if memcpy() is given dodgy pointers or a bad size, which includes "correct pointers but heap was corrupted" (e.g. the metadata that malloc()/free() uses to keep track of allocated areas was overwritten by a bug causing free() to give the underlying virtual RAM back to the kernel for an area that should've been kept, and causing memcpy() to fail with an access violation because an area it should've been able to access can't be accessed).
The other cases are external failure conditions. If the OS decided to send some of the data to swap space but gets a read error from the device when trying to fetch the data back from swap space when you access it, there's very little the OS can do about it (your process, and any other process using that data, can't continue correctly). If you're using ECC RAM and the memory controller says there's an uncorrectable error in the RAM you're using, it's similar. It's also possible for the OS to use "lazy page allocation" (e.g. pages of memory are allocated when you write to them and not when you thought you allocated them) and "over commit" (pretend that more pages were allocated than can be provided), so that when memcpy() writes to an area that was allocated the OS can't handle it (e.g. it triggers an "OOM/out-of-memory killer" that terminates a process to free up some RAM). Finally, it's possible for the code itself to be corrupted (e.g. faulty RAM without ECC, a malicious attack like "Rowhammer", a corrupted shared library, ...) so that (e.g.) calling memcpy() causes a SIGILL. Of course none of these things is specific to memcpy() itself, and they can just as easily happen anywhere else.
Also, there will be issues in case of overlapping pointers.
Yes. Some (most) implementations of memcpy() are optimised to copy larger blocks (e.g. optimised to use SSE on 80x86 and moving 16 bytes at a time) where if the areas overlap the data gets mangled. Some (most) implementations of memcpy() assume that it can copy data in one specific direction which will cause data to be corrupted if areas overlap in the wrong way (e.g. if the implementation uses the "lowest address first" direction and the destination area overlaps and is at a higher address than the source, then writes to the destination will overwrite source data that hasn't been copied yet).
Apart from these, are there any other ways the memcpy call can fail?
No, I think I covered all the possible failure cases above.
Or we can consider it'll pass all the time without any error. How to verify it?
For the "overlapping areas" problem it shouldn't be hard to write a wrapper around memcpy() that detects overlap and generates an error (so that it doesn't silently corrupt data). Unfortunately this only finds problems at run-time (after it's too late - e.g. possibly after it's been released and running on the end user's computer). For some of the cases might be easy enough to detect "overlapping areas" using a static source code analyser, but these cases are likely to be the "easily detected by testing at run-time before software is released" cases.
For some things (dodgy pointers, corrupted heap) there are tools (valgrind) to detect problems. Unfortunately these only detect problems when they actually happen (and don't detect problems that don't happen during testing but do happen when software is running on the end-user's computer).
For the remainder (OS failures and hardware failures), if you can't trust the OS or hardware then you can't assume any code that verifies anything will work properly either.
I have a large body of legacy code that I inherited. It has worked fine until now. Suddenly, at a customer trial that I cannot reproduce in-house, it crashes in malloc. I think I need to add instrumentation, e.g. on top of malloc have my own malloc that stores some meta information about each allocation, such as who made the malloc call. When it crashes, I can then look up the meta information and see what was happening. I had done something similar years ago but cannot recall it now... I am sure people have come up with better ideas. I will be glad to have inputs.
Thanks
Is memory allocation broken?
Try valgrind.
Malloc is still crashing.
Okay, I'm going to have to assume that you mean SIGSEGV (segmentation fault) is firing in malloc. This is usually caused by heap corruption. Heap corruption, that itself does not cause a segmentation fault, is usually the result of an array access outside of the array's bounds. This is usually nowhere near the point where you call malloc.
malloc stores a small header of information "in front of" the memory block that it returns to you. This information usually contains the size of the block and a pointer to the next block. Needless to say, changing either of these will cause problems. Usually, the next-block pointer is changed to an invalid address, and the next time malloc is called, it eventually dereferences the bad pointer and segmentation faults. Or it doesn't and starts interpreting random memory as part of the heap. Eventually its luck runs out.
Note that free can have the same thing happen, if the block being released or the free block list is messed up.
How you catch this kind of error depends entirely on how you access the memory that malloc returns. A malloc of a single struct usually isn't a problem; it's malloc of arrays that usually gets you. Using a negative (-1 or -2) index will usually give you the block header for your current block, and indexing past the array end can give you the header of the next block. Both are valid memory locations, so there will be no segmentation fault.
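A tiny illustration of the kind of bug being described (an assumed scenario, not the asker's code); the off-by-one write lands in valid, mapped memory -- typically the allocator's header for the next block -- so nothing faults at the time of the write, and the crash shows up later inside malloc() or free():
#include <stdlib.h>

int main(void)
{
    int *a = malloc(100 * sizeof *a);
    for (int i = 0; i <= 100; i++)      /* <= instead of <: writes a[100], one past the end */
        a[i] = 0;
    int *b = malloc(50 * sizeof *b);    /* may crash or misbehave here, far from the real bug */
    free(a);
    free(b);
    return 0;
}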
So the first thing to try is range checking. You mention that this appeared at the customer's site; maybe it's because the data set they are working with is much larger, or that the input data is corrupt (e.g. it says to allocate 100 elements and then initializes 101), or they are performing things in a different order (which hides the bug in your in-house testing), or doing something you haven't tested. It's hard to say without more specifics. You should consider writing something to sanity check your input data.
Try ASan
AddressSanitizer (aka ASan) is a memory error detector for C/C++. It finds:
Use after free (dangling pointer dereference)
Heap buffer overflow
Stack buffer overflow
Global buffer overflow
Use after return
Use after scope
Initialization order bugs
Memory leaks
See the links below to learn more and to find out how to use it:
https://github.com/google/sanitizers/wiki/AddressSanitizer and
https://github.com/google/sanitizers/wiki/AddressSanitizerFlags
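For example (my own toy program, not from the ASan wiki), compiling a one-byte heap overflow with GCC or Clang using -fsanitize=address makes ASan report a heap-buffer-overflow at the exact faulting line:
/* overflow.c -- build with: gcc -fsanitize=address -g overflow.c -o overflow */
#include <stdlib.h>

int main(void)
{
    char *buf = malloc(8);
    buf[8] = 'x';     /* one byte past the end: ASan reports heap-buffer-overflow here */
    free(buf);
    return 0;
}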
I know this is old, but issues like this will continue to exist as long as we have pointers. Although valgrind is the best tool for this purpose, it has a steep learning curve and often the results are too intimidating to understand.
Assuming you are working on some *nix, another tool I can suggest is Electric Fence. Quote:
Electric Fence helps you detect two common programming bugs:
software that overruns the boundaries of a malloc() memory allocation,
software that touches a memory allocation that has been released by free().
Unlike other malloc() debuggers, Electric Fence will detect read accesses
as well as writes, and it will pinpoint the exact instruction that causes
an error.
Usage is amazingly simple. Just link your code against the additional library, efence (e.g. with -lefence).
When you run the application, a corefile will be generated when memory is corrupted, instead of when corrupted memory is used.
Interviewer - If you have no tools to check how would you detect memory leak problems?
Answer - I will read the code and see if all the memory I have allocated has been freed by me in the code itself.
Interviewer wasn't satisfied. Is there any other way to do so?
For all of the approaches described below, you need to write wrappers around the malloc() and free() functions.
1. To keep things simple, keep a count of malloc() and free() calls. If they are not equal, you have a memory leak.
2. A better version is to keep track of the addresses that were malloc()'ed and free()'ed, so that you can identify which addresses were malloc()'ed but never free()'ed. This alone won't help much either, though, since you can't relate the addresses back to the source code, which becomes a real challenge in a large code base.
3. So you can add one more feature. For example, I wrote a similar tool for the FreeBSD kernel: modify the malloc() call to store the module/file information (give each module/file a number, which you can #define in some header) and the stack trace of the function calls leading to that malloc(), and record this in a data structure alongside the address whenever malloc() or free() is called. Match the addresses returned by malloc() against those passed to free(). Then, when there is a memory leak, you have the information about which addresses were not free()'d, in which file, and exactly which functions were called (through the stack trace), so you can pinpoint it.
The way this tool worked was: on a crash, I would get a core dump. I had defined globals (the data structure where I was collecting the data) in kernel memory space, which I could access using gdb to retrieve the information.
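A minimal user-space sketch of points 1 and 2 above (the names and the fixed-size table are mine, just to keep the example short):
#include <stdio.h>
#include <stdlib.h>

#define MAX_TRACKED 1024

static struct { void *ptr; const char *file; int line; } table[MAX_TRACKED];
static int n_malloc, n_free;

void *debug_malloc(size_t size, const char *file, int line)
{
    void *p = malloc(size);
    if (p != NULL) {
        n_malloc++;
        for (int i = 0; i < MAX_TRACKED; i++)
            if (table[i].ptr == NULL) {
                table[i].ptr = p; table[i].file = file; table[i].line = line;
                break;
            }
    }
    return p;
}

void debug_free(void *p)
{
    if (p != NULL) {
        n_free++;
        for (int i = 0; i < MAX_TRACKED; i++)
            if (table[i].ptr == p) { table[i].ptr = NULL; break; }
    }
    free(p);
}

void debug_report(void)   /* call at exit: anything still in the table leaked */
{
    printf("malloc calls: %d, free calls: %d\n", n_malloc, n_free);
    for (int i = 0; i < MAX_TRACKED; i++)
        if (table[i].ptr != NULL)
            printf("leak: %p allocated at %s:%d\n", table[i].ptr, table[i].file, table[i].line);
}

/* Route the rest of the program through the wrappers: */
#define malloc(s) debug_malloc((s), __FILE__, __LINE__)
#define free(p)   debug_free(p)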
Edit:
Recently, while debugging a memory leak in the Linux kernel, I came across a tool called kmemleak, which implements an algorithm similar to the one I described in point 3 above. Read the Basic Algorithm section here: https://www.kernel.org/doc/Documentation/kmemleak.txt
My response when I had to do this for real was to build tools... a debugging heap layer, wrapped around the C heap, and macros to switch code to running against those calls rather than accessing the normal heap library directly. That layer included some fencepost logic to detect array bounds violations, some instrumentation to monitor what the heap was doing, optionally some recordkeeping of exactly who allocated and freed each block...
Another approach, of course, is "divide and conquer". Build unit tests to try to narrow down which operations are causing the leak, then to subdivide that code further.
Depending on what "no tools" means, core dumps are also sometimes useful; seeing the content of the heap may tell you what's being leaked, for example.
And so on....
I have to do a project in C where I constantly allocate memory for big data structures and then free it. Does there exist a library with a function that helps keep track of memory usage, so I can be sure I am doing things correctly? (I'm new to C.)
For example, a function that returns:
A) The total of memory used by the program at the moment, OR
B) The total of memory left,
would do the job. I already googled for that and searched in other answers.
Thanks!
Try tcmalloc: you are looking for a heap profiler, although valgrind might be more useful initially.
If you're worried about memory leaks, valgrind is probably what you need. On the other hand, if you're more concerned with whether your data structures are using excessive memory, you might just use the mallinfo function, included as an extension to malloc in many Unix standard libraries, including glibc on Linux.
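A minimal sketch of reading those numbers with glibc's mallinfo() (the field names are glibc-specific; newer glibc versions also offer mallinfo2(), which I'm not using here):
#include <malloc.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char *big = malloc(1024 * 1024);              /* allocate 1 MiB */
    struct mallinfo mi = mallinfo();
    printf("allocated via malloc: %d bytes\n", mi.uordblks);
    printf("free within the heap: %d bytes\n", mi.fordblks);
    free(big);
    return 0;
}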
Although some people excoriate it, the book "Writing Solid Code" by Steve Maguire has a lot of reasonable ideas about how to track your memory usage without modifying the system memory allocation functions. Basically, instead of calling the raw malloc() etc functions directly, you call your own memory allocation API built on top of the standard one. Your API can track allocations and frees, detect double frees, frees of non-allocated memory, unreleased (leaked) memory, complete dumps of what is allocated, etc. You either need to crib the code from the book or write your own equivalent code. One interesting problem is providing a stack trace for each allocation; there isn't a standard way to determine the call stack. (The book is a bit dated now; it was written just a few years after the C89 standard was published and does not exploit const qualifiers.)
Some will argue that these services can be provided by the system malloc(); indeed, they can, and these days often are. You should look carefully at the manual provided for your version of malloc(), and decide whether it provides enough for you. If not, then the wrapper API mechanism is reasonable. Note that using your own API means you track what you explicitly allocate, while leaving library functions not written to use your API using the system services - as, indeed, does your code, under the covers.
You should also look into valgrind. It does a superb job tracking memory abuses, and in particular will report leaked memory (memory that was allocated but not freed). It also spots when you read or write outside the bounds of an allocated space, spotting buffer overflows.
Nevertheless, ultimately, you need to be disciplined in the way you write your code, ensuring that every time you allocate memory, you know when it will be released.
Every time you allocate/free memory, you could log how big your data structure is.
A project I am working on involves a flight vehicle with GNC code written in a C library (.out). We must call this C code from LabVIEW (the primary avionics software) in the form of a .out library, and the nature of the software requires static pointers to store data between successive calls to the function. We call the GNC executive function at regular intervals throughout a flight. I'm now trying to call this function using a Matlab MEX wrapper in a DLL on Windows, and this has uncovered some memory management issues.
I am declaring the structures at the beginning of the function like this:
static Nav_str *Nav_IN_OUT_ptr;
static hguid_ref *Guid_IN_OUT_ptr;
static HopControl *Control_IN_OUT_ptr;
Nav_IN_OUT_ptr = (Nav_str *)malloc(sizeof(Nav_str));
Guid_IN_OUT_ptr = (hguid_ref *)malloc(sizeof(hguid_ref));
Control_IN_OUT_ptr = (HopControl *)malloc(sizeof(HopControl));
This happens during every run of the function. However, after this function is called iteratively several times, it always crashes with a memory segmentation fault after it tries to exit. My understanding was that this memory was supposed to clean itself up, is that incorrect?
In order to clean it up manually, I added these lines to the end, to be called only on a clean-up iteration:
free(Nav_IN_OUT_ptr);
free(Guid_IN_OUT_ptr);
free(Control_IN_OUT_ptr);
Is this the correct way to free this memory? Can I free this memory? Might there be another reason for the segmentation error other than C not giving up the memory properly after the last call, or Matlab not properly managing its memory? I've searched all over for someone with a similar problem (even contacting Mathworks) without much luck, so any comments or suggestions would be much appreciated.
Failing to free memory is not going to cause a segmentation fault. It's likely your problem lies somewhere else. The likely conditions are:
Overflowing a buffer
Using a pointer to memory that has previously been free'd.
Using a bad pointer value, somehow set incorrectly.
Trying to free a pointer not returned by malloc (or one that has already been free'd).
My understanding was that this memory was supposed to clean itself up, is that incorrect?
Yes, that is incorrect: you need to call free() to release the memory back to the heap. I would also suggest that you set the pointer value to NULL after the free; this may help you catch condition 2 from above.
Nav_IN_OUT_ptr = (Nav_str *)malloc(sizeof(Nav_str));
This code statement is questionable. What is the Nav_str type? Are you sure you don't mean to use strlen(Nav_str) + 1?
I also need to ask: what is the purpose of making your pointers static? Static function variables are basically globals and should only be used in rare cases.
Your code does have a memory leak - it is allocating that memory each time the function is called. Even your current method still has the memory leak - if you only call free() once, in the final iteration, then you have only freed the most recent allocation.
However, a memory leak will not generally cause a segmentation fault (unless your memory leak exhausts all available memory, causing subsequent malloc() calls to return NULL).
If you wish to have static structures that are only allocated once and re-used, you do not need to use malloc() at all - you can simply change your declarations to:
static Nav_str Nav_IN_OUT;
static hguid_ref Guid_IN_OUT;
static HopControl Control_IN_OUT;
... and use Nav_IN_OUT.field instead of Nav_IN_OUT_ptr->field, and &Nav_IN_OUT in place of Nav_IN_OUT_ptr (if you are directly passing the pointer value to other functions).
My understanding was that this memory was supposed to clean itself up, is that incorrect?
Sorry, but you were incorrect. :) Memory allocated with malloc() will persist until you manually remove it with free(). (You did get this right in the end. Hooray. :)
Is this the correct way to free this memory? Can I free this memory?
That is the correct way to free the memory, but it might not be in the correct place. In general, try to write your free() calls the same time you write your malloc() calls.
Maybe you allocate at the start of a function and then free at the end of the function. (In that case, on-stack memory use might be better, if the memory is only ever used by functions called by the original function.)
Maybe you have a foo_init() function that calls malloc() and creates an associated context for an API; you then pass that context into other routines that operate on that data, and the free() calls go into a foo_destroy() or foo_free() or similar routine. All your callers then need to balance the foo_init() and foo_destroy() calls. This would be especially appropriate if you can't just write the foo_init() and foo_destroy() calls in one function; say, your objects might need to be removed at some random point in a larger event loop. (A small sketch of this pattern follows below.)
And maybe the data should just be allocated once and live forever. That would be correct for some application designs, and it's tough to tell just from the variable names whether or not these blocks of data should live forever.
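A minimal sketch of that init/destroy pattern (foo and its fields are placeholder names, not anything from the question):
#include <stdlib.h>

typedef struct {
    double *samples;
    size_t  count;
} foo;

foo *foo_init(size_t count)
{
    foo *f = malloc(sizeof *f);
    if (f == NULL)
        return NULL;
    f->samples = malloc(count * sizeof *f->samples);
    if (f->samples == NULL) {
        free(f);
        return NULL;
    }
    f->count = count;
    return f;
}

void foo_destroy(foo *f)
{
    if (f != NULL) {       /* every foo_init() is balanced by exactly one foo_destroy() */
        free(f->samples);
        free(f);
    }
}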
Might there be another reason for the segmentation error other than C not giving up the memory properly after the last call, or Matlab not properly managing its memory?
There sure could be: perhaps this memory is being returned too soon, perhaps some pointer is being free()'d two or more times, or perhaps you're overwriting your buffers. (That malloc(sizeof(Nav_str)) call is a little worrying; if Nav_str is a pointer type, it is probably allocating just four or eight bytes, depending on the pointer size of your platform. And before you replace it with strlen(), note that strlen() won't leave space for a NUL byte at the end of the string; malloc(len + 1) is the usual pattern for allocating memory for a string, and I get concerned any time I don't see that +1 in the call.)
Some time with valgrind would doubtless help find memory errors, and maybe some time with Electric Fence could help too. valgrind is definitely newer and can definitely handle 'large' programs better (since Electric Fence allocates a new page for every malloc(), it can be expensive).