Debugging memory corruption - c

Earlier I ran into a problem with dynamic memory in C (Visual Studio).
I had a more or less working program that threw a run-time error when freeing one of the buffers. It was a clear case of memory corruption: the program wrote past the end of the buffer.
My problem is that it was very time-consuming to track down. The error was thrown long after the corruption happened, and I had to manually debug the entire run to find out when the end of the buffer was overwritten.
Is there any tool or technique that helps track down this kind of issue? If the program had crashed immediately, I would have found the problem a lot faster...
An example of the issue:
int *pNum = malloc(10 * sizeof(int));
// ||
// \/
for(int i = 0; i < 13; i++)
{
pNum[i] = 3;
}
// error....
free(pNum);

I use "data breakpoints" for that. In your case, when the program crashes, it might first complain like this:
Heap block at 00397848 modified at 0039789C past requested size of 4c
Then start your program again and set a data breakpoint at address 0039789C. When the code writes to that address, execution will stop. I often find the bug immediately at this point.
If your program allocates and deallocates memory repeatedly, so that other allocations end up reusing this exact address, just delay deallocations so that freed blocks are not handed out again:
_CrtSetDbgFlag(_CrtSetDbgFlag(_CRTDBG_REPORT_FLAG) | _CRTDBG_DELAY_FREE_MEM_DF);

I use pageheap. This is a tool from Microsoft that changes how the allocator works. With pageheap on, when you call malloc, the allocation is rounded up to the nearest page (a fixed-size block of memory), and an additional page of virtual memory, marked as neither readable nor writable, is placed after it. The memory you allocate is positioned so that the end of your buffer falls at the end of the last usable page, just before the inaccessible page. This way, if you go over the edge of your buffer, often by as little as a single byte, the access faults immediately and the debugger can catch it easily.
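The guard-page idea can be sketched on a POSIX system with mmap and mprotect. This is not how pageheap itself is implemented; it is only an illustration of the layout described above, and guarded_alloc is a hypothetical name. It assumes MAP_ANONYMOUS is available, and it ignores alignment and freeing for brevity.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1 /* exposes MAP_ANONYMOUS under strict -std= modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: returns `size` bytes whose last byte sits directly
   before an inaccessible guard page, or NULL on failure. */
void *guarded_alloc(size_t size)
{
    long page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);
    size_t page = (size_t)page_size;

    /* Round the request up to whole pages, then add one guard page. */
    size_t data_pages = (size + page - 1) / page;
    if (data_pages == 0)
        data_pages = 1;
    size_t total = (data_pages + 1) * page;

    unsigned char *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* Make the last page inaccessible: any read or write there faults. */
    unsigned char *guard = base + data_pages * page;
    if (mprotect(guard, page, PROT_NONE) != 0) {
        munmap(base, total);
        return NULL;
    }

    /* Right-align the buffer so its end touches the guard page. */
    return guard - size;
}
```

With a buffer obtained this way, writing to index size (one past the end) lands in the guard page and should stop the program at the offending instruction instead of at a later free.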

Is there any tool\ way to assist in tracking down this issue?
Yes, that's precisely the type of error which static code analysers try to locate. e.g. splint/PC-Lint
Here's a list of such tools:
http://en.wikipedia.org/wiki/List_of_tools_for_static_code_analysis
Edit: In trying out splint on your code snippet I get the following warning:
main.c:9:2: Possible out-of-bounds store: pnum[i]
Presumably this warning would have assisted you.

Our CheckPointer tool can help find memory management errors. It works with GCC 3/4 and Microsoft dialects of C.
Many dynamic checkers only catch accesses outside of an object, and then only if the object is heap allocated. CheckPointer will find memory access errors inside a heap-allocated object; it is illegal to access off the end of a field in a struct regardless of the field type; most dynamic checkers cannot detect such errors. It will also find accesses off the edge of locals.

Related

simple overwriting buffer not causing bufferoverflow C valgrind gcc no error [duplicate]

Why is this not giving an error when I compile?
#include <iostream>
using namespace std;
int main()
{
int *a = new int[2];
// int a[2]; // even this is not giving error
a[0] = 0;
a[1] = 1;
a[2] = 2;
a[3] = 3;
a[100] = 4;
int b;
return 0;
}
Can someone explain why this is happening?
Thanks in advance.
Because undefined behavior == anything can happen. You're unlucky that it doesn't crash; this sort of behavior can potentially hide bugs.
Declaring two variables called a certainly is an error; if your compiler accepts that, then it's broken. I assume you mean that you still don't get an error if you replace one declaration with the other.
Array access is not range-checked. At compile time, the size of an array is often not known, and the language does not require a check even when it is. At run time, a check would degrade performance, which would go against the C++ philosophy of not paying for something you don't need. So access beyond the end of an array gives undefined behaviour, and it's up to the programmer to make sure it doesn't happen.
Sometimes, an invalid access will cause a segmentation fault, but this is not guaranteed. Typically, memory protection is only applied to whole pages of memory, with a typical page size of a few kilobytes. Any access within a page of valid memory will not be caught. There's a good chance that the memory you access contains some other program variable, or part of the call stack, so writing there could affect the program's behaviour in just about any way you can imagine.
If you want to be safe, you could use std::vector, and only access its elements using its at() function. This will check the index, and throw an exception if it's out of range. It will also manage memory allocation for you, fixing the memory leak in your example.
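Plain C has no equivalent of at(), but a similar check can be written by hand. The following is a minimal sketch with hypothetical names (int_array, int_array_set); it reports an out-of-range index through the return value, since C has no exceptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical array type that carries its own length. */
struct int_array {
    int   *data;
    size_t len;
};

/* Allocates `len` zeroed ints; returns false on allocation failure. */
bool int_array_init(struct int_array *a, size_t len)
{
    assert(a != NULL);
    a->data = calloc(len ? len : 1, sizeof *a->data);
    a->len = a->data ? len : 0;
    return a->data != NULL;
}

/* Stores `value` at `index`; returns false instead of writing out of range. */
bool int_array_set(struct int_array *a, size_t index, int value)
{
    assert(a != NULL);
    if (index >= a->len)
        return false;
    a->data[index] = value;
    return true;
}

/* Releases the storage and resets the struct. */
void int_array_free(struct int_array *a)
{
    assert(a != NULL);
    free(a->data);
    a->data = NULL;
    a->len = 0;
}
```

With a two-element array, int_array_set for index 2 or 100 returns false rather than silently writing past the allocation.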
I'm guessing you're coming from Java or a Java-like language where once you step out of the boundary of an array, you get the "array index out of bounds" exception.
Well, C expects more from you; it saves up the space you ask for, but it doesn't check to see if you're going outside the boundary of that saved up space. Once you do that as mentioned above, the program has that dreaded undefined behavior.
And remember for the future that if you have a bug in your program and you can't seem to find it, and when you go over the code/debug it, everything seems OK, there is a good chance you're "out of bounds" and accessing an unallocated place.
Compilers with good code analysis would certainly warn about code that references memory beyond your array allocation. Setting aside the duplicate declaration of a, if you ran it, it may or may not fault (undefined behavior, as others have said). If, for example, you got a 4 KB page of heap (in the processor's address space), then as long as you don't write outside that page you won't get a fault from the processor. Had you deleted the array, then depending on the heap implementation, the heap might have detected that it was corrupted.

malloc causes the application to crash and show memory map [duplicate]

I have a large body of legacy code that I inherited. It has worked fine until now. Suddenly, at a customer trial that I cannot reproduce in-house, it crashes in malloc. I think I need to add instrumentation, e.g. put my own malloc on top of malloc that stores some meta information about each call, such as who made it. When it crashes, I can then look up the meta information and see what was happening. I did something similar years ago but cannot recall it now... I am sure people have come up with better ideas. I will be glad to have input.
Thanks
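The instrumentation idea described above could look roughly like the sketch below. All names (tracked_malloc, TRACKED_MALLOC, find_record, MAX_RECORDS) are hypothetical, the table has a fixed capacity, and nothing here is thread-safe, so treat it as a starting point rather than a finished wrapper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_RECORDS 1024 /* hypothetical fixed capacity */

/* One entry per allocation: what was returned and who asked for it. */
struct alloc_record {
    void       *ptr;
    size_t      size;
    const char *file;
    int         line;
};

static struct alloc_record records[MAX_RECORDS];
static size_t record_count;

/* Calls malloc and remembers the call site while the table has room. */
void *tracked_malloc(size_t size, const char *file, int line)
{
    assert(file != NULL);
    void *p = malloc(size);
    if (p != NULL && record_count < MAX_RECORDS) {
        records[record_count++] = (struct alloc_record){ p, size, file, line };
    }
    return p;
}

/* Finds the most recent record for `p`, or NULL if none was stored. */
const struct alloc_record *find_record(const void *p)
{
    for (size_t i = record_count; i-- > 0; ) {
        if (records[i].ptr == p)
            return &records[i];
    }
    return NULL;
}

#define TRACKED_MALLOC(size) tracked_malloc((size), __FILE__, __LINE__)
```

After a crash, the table can be inspected in the debugger or a core dump to see which call site owns the block nearest to the corrupted address.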
Is memory allocation broken?
Try valgrind.
Malloc is still crashing.
Okay, I'm going to have to assume that you mean SIGSEGV (segmentation fault) is firing in malloc. This is usually caused by heap corruption. Heap corruption that does not itself cause a segmentation fault is usually the result of an array access outside of the array's bounds. This is usually nowhere near the point where you call malloc.
malloc stores a small header of information "in front of" the memory block that it returns to you. This information usually contains the size of the block and a pointer to the next block. Needless to say, changing either of these will cause problems. Usually, the next-block pointer is changed to an invalid address, and the next time malloc is called, it eventually dereferences the bad pointer and segmentation faults. Or it doesn't and starts interpreting random memory as part of the heap. Eventually its luck runs out.
Note that free can have the same thing happen, if the block being released or the free block list is messed up.
How you catch this kind of error depends entirely on how you access the memory that malloc returns. A malloc of a single struct usually isn't a problem; it's malloc of arrays that usually gets you. Using a negative (-1 or -2) index will usually give you the block header for your current block, and indexing past the array end can give you the header of the next block. Both are valid memory locations, so there will be no segmentation fault.
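To make the header layout concrete, here is a toy allocator over a static arena. It is only an illustration under simplifying assumptions: real malloc implementations use different and more elaborate bookkeeping, and toy_header, toy_alloc and toy_block_size are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical block header stored directly in front of each payload. */
struct toy_header {
    size_t size; /* payload size in bytes */
};

static _Alignas(max_align_t) unsigned char arena[256];
static size_t arena_used;

/* Carves [header][payload] out of the arena; returns NULL when it is full. */
void *toy_alloc(size_t size)
{
    size_t unit = sizeof(struct toy_header);
    size_t rounded = (size + unit - 1) / unit * unit; /* keep headers aligned */
    size_t free_bytes = sizeof arena - arena_used;
    if (rounded < size || free_bytes < unit || rounded > free_bytes - unit)
        return NULL;

    struct toy_header h = { size };
    memcpy(arena + arena_used, &h, sizeof h);      /* header first */
    void *payload = arena + arena_used + sizeof h; /* caller's block follows */
    arena_used += sizeof h + rounded;
    return payload;
}

/* Reads back the header that sits just before `payload`. */
size_t toy_block_size(const void *payload)
{
    assert(payload != NULL);
    struct toy_header h;
    memcpy(&h, (const unsigned char *)payload - sizeof h, sizeof h);
    return h.size;
}
```

If two 8-byte blocks a and b are allocated back to back, writing 8 + sizeof(struct toy_header) bytes through a overwrites the header of b without any fault, and toy_block_size(b) then returns garbage. That is the same silent damage the paragraphs above describe for a real heap.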
So the first thing to try is range checking. You mention that this appeared at the customer's site; maybe it's because the data set they are working with is much larger, or that the input data is corrupt (e.g. it says to allocate 100 elements and then initializes 101), or they are performing things in a different order (which hides the bug in your in-house testing), or doing something you haven't tested. It's hard to say without more specifics. You should consider writing something to sanity check your input data.
Try ASan
AddressSanitizer (aka ASan) is a memory error detector for C/C++. It finds:
Use after free (dangling pointer dereference)
Heap buffer overflow
Stack buffer overflow
Global buffer overflow
Use after return
Use after scope
Initialization order bugs
Memory leaks
See these links to learn more about it and how to use it:
https://github.com/google/sanitizers/wiki/AddressSanitizer and
https://github.com/google/sanitizers/wiki/AddressSanitizerFlags
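As a small illustration, the function below reproduces the kind of overrun from the first question in a form that can be tried under ASan. The name fill_and_sum and the buffer length are assumptions made for this sketch; the code is well defined for counts up to 10 and overflows the heap block for anything larger.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define BUF_LEN 10

/* Writes 3 into the first `count` slots of a 10-int heap buffer and returns
   their sum. Any count above BUF_LEN writes past the end of the block. */
long fill_and_sum(size_t count)
{
    int *nums = malloc(BUF_LEN * sizeof *nums);
    assert(nums != NULL); /* demo only: treat allocation failure as fatal */

    long sum = 0;
    for (size_t i = 0; i < count; i++) {
        nums[i] = 3; /* for i >= BUF_LEN this is the heap buffer overflow */
        sum += nums[i];
    }
    free(nums);
    return sum;
}
```

Built with GCC or Clang using -fsanitize=address -g, a call such as fill_and_sum(13) should stop at the offending store with a heap-buffer-overflow report. Without the sanitizer the same call is undefined behaviour and may appear to work.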
I know this is old, but issues like this will continue to exist as long as we have pointers. Although valgrind is the best tool for this purpose, it has a steep learning curve and often the results are too intimidating to understand.
Assuming you are working on some *nix, another tool I can suggest is Electric Fence. Quote:
Electric Fence helps you detect two common programming bugs:
software that overruns the boundaries of a malloc() memory allocation,
software that touches a memory allocation that has been released by free().
Unlike other malloc() debuggers, Electric Fence will detect read accesses
as well as writes, and it will pinpoint the exact instruction that causes
an error.
Usage is amazingly simple. Just link your code with one additional library, libefence (-lefence).
When you run the application, a core file will be generated at the moment memory is corrupted, rather than later when the corrupted memory is used.

Random malloc crash?

I'm trying to read a binary file that has blocks starting with an identifier (like a 3DS file). I loop through the file, and using a switch the program determines what identifier a block has and then reads the data into the file struct. Sometimes I need to use malloc to allocate memory for dynamically sized data. While reading, the switch often goes through the same case wherein memory is allocated, but at a specific point in the file it crashes on that same malloc. The file that I want to read is about 1 MB. But when I try the program with another file of about 10 kB and the same structure, it reads it successfully.
What could be causing this problem?
The error code that I get when debugging is:
Heap corruption detected at 0441F080
HEAP[prog.exe]: HEAP: Free Heap block 441f078 modified at 441f088 after it was freed
Also when I execute it in debug mode, for some reason I can read more data from the file. The program lives longer before it crashes.
Here is the code piece where it crashes:
switch (id) {
case 0x62:
case 0x63:
// ...
{
char n_vertices = id - 0x60 + 1;// just how I calculate the n_vertices from the block ID
fread(&mem.blocks[i].data.attr_6n.height, 2, 1, f);
mem.blocks[i].data.attr_6n.vertices = malloc(2 * n_vertices);// crash
for (short k = 0; k < n_vertices; k++) {
fread(&mem.blocks[i].data.attr_6n.vertices[k], 2, 1, f);// read shorts
}
}
break;
// ...
}
You probably have a corrupt heap. This could be caused by invalid deallocations (deallocating unowned or already freed memory), or by some random chunk of code writing outside its memory area into a place that happens to hold the heap bookkeeping data structures. This most likely will be a piece of code that has nothing whatsoever to do with that dynamically allocated memory.
Tracking down bugs like this is a real bear. They tend to appear long after the offending code has executed, and they have an annoying tendency to turn into heisenbugs (bugs that move or go away when you attempt to debug them).
My suggestion for approaching debugging would be to try to comment out portions of your code and see what causes the problem to go away. That isn't foolproof, as you could just end up moving the out-of-bounds write to somewhere else.
Looking over the code you just posted, one thing I would highly suggest you do is verify that your malloc specified enough memory to hold all the data you are attempting to load into it. It looks to me like you are assuming 2 bytes for each vertex. That seems a bit suspicious to me. I don't know your code, but 4 or 8 would be much more common element sizes to see there. Regardless, industry practice is to use sizeof() on the target type to help ensure you have it right.
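The sizeof advice can be sketched as follows. It assumes the vertices really are 16-bit values, as the fread calls in the question suggest; alloc_vertices is a hypothetical helper, not part of the original program.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* The file format in the question reads 2 bytes per vertex. */
static_assert(sizeof(int16_t) == 2, "vertices are stored as 2-byte values");

/* Hypothetical helper: allocates `count` 16-bit vertices, or returns NULL. */
int16_t *alloc_vertices(size_t count)
{
    int16_t *vertices;

    /* Reject empty requests and counts whose byte size would overflow. */
    if (count == 0 || count > SIZE_MAX / sizeof *vertices)
        return NULL;

    /* sizeof *vertices follows the element type if it ever changes. */
    vertices = malloc(count * sizeof *vertices);
    return vertices;
}
```

Writing the size as count * sizeof *vertices instead of a literal 2 keeps the allocation correct even if the element type is later widened to 4 or 8 bytes.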
Another option, if that debugger message of yours can show you where it is happening, would be to put a debugger watch point there (or write some watching code...or manually dump and inspect the area) when stepping in the debugger to try to figure out which is the offending line of code.
Good luck. I hate these bugs.
Most likely the heap gets corrupted somehow, and malloc crashes while, for example, trying to traverse a corrupted linked list of free blocks (or a similar structure; I'm not exactly sure what modern heap allocators use these days).
Make sure your code is not writing past the end of an allocated block.
You need to run this in a memory debugger like valgrind. Since it looks like you're on Windows, see the following: Is there a good Valgrind substitute for Windows?

Strange C program behaviour

I really have a strange situation. I'm making a Linux multi-threaded C application using all the nitty-gritty memory stuff involving char* strings, and I'm stuck in a really odd position.
Basically, what happens is that, using POSIX threads, I'm reading and writing to a two-dimensional char array, but I get unusual errors. You have my word that I have done extensive testing on what the threads individually access, and they don't read another thread's data, let alone write to it. When the last thread that works with the array changes its parts of the array, it seems to change the last few chars of its arrays and put characters in there that I can't explain; mainly ones that print as black diamond question mark things.
I use valgrind and GDB, and they don't really help. As far as I can tell, all should work. Valgrind tells me I'm not freeing everything.
I know all that sounds fairly undescriptive, but here's where it gets weird: if I compile my program with electric fence, then it all works. Valgrind tells me I'm freeing everything and that there's no memory errors at all, just as I thought it should have been. It works absolutely flawlessly!
So, I guess my question is, why does my program work fine when compiled with electric fence?
(And also as a side question, what steps need to be taken to ensure 100% "thread-safe" code?)
Electric Fence allocates pages, I've heard at least two, for each allocation you make. It uses the OS's paging mechanisms to check for accesses outside of the allocation. This means that if you want a new 14-character array you end up with a whole new page to hold it, say 8k. Most of the page is unused, but errant accesses can be detected by watching which pages get touched. I can imagine that, on account of having so much extra space, you wouldn't see an error if a problem gets past the guards.
If you don't have a bad access but rather corruption due to two threads not locking correctly, efence won't detect it. efence also likely keeps pointers to allocated memory, fooling valgrind into reporting no problems. You should run valgrind with the --show-reachable=yes flag and see what's unclaimed at the end of your run.
It sounds like you're trashing your data structures. Try putting canaries at the beginning and end of your arrays, open up GDB, then put write breakpoints on the canaries.
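A canary is a constant value that should never be changed - its only purpose is to detect memory corruption should it be overwritten. For example:
size_t the_size_i_need = 14; /* example value; use the size you actually need */
/* unsigned char: where plain char is signed, array[0] != 0xAA would always be true */
unsigned char *array = malloc(the_size_i_need + 2); /* +2 for the two canaries */
if (array == NULL) {
    abort();
}
array[0] = 0xAA; /* leading canary */
array[the_size_i_need + 1] = 0xFF; /* trailing canary */
unsigned char *real_array = array + 1;
/* Do some stuff here using real_array */
if (array[0] != 0xAA || array[the_size_i_need + 1] != 0xFF) {
    printf("Oh noes! We're corrupted\n");
}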
Oh god, I'm so sorry. I've worked it out: there was a variable given to the thread for each to put their answer into, but I didn't define it as zero, and it contains 2 funny chars. Maybe the electric fence malloc() allocates 'zeroed' memory like calloc(), but standard malloc() of course doesn't.
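A minimal sketch of the fix hinted at here: allocate the per-thread answer buffer with calloc so it starts out zero-filled. The name new_answer_buffer is hypothetical, and whether Electric Fence really zeroes its allocations is the poster's guess, not something this example demonstrates.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: returns a zero-filled buffer of `len` bytes, so it is
   already a valid empty string before any thread writes into it. */
char *new_answer_buffer(size_t len)
{
    assert(len > 0);
    return calloc(len, 1); /* unlike malloc, calloc zeroes the memory */
}
```

A buffer from malloc holds indeterminate bytes until it is written, which is consistent with the stray characters described in the question.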

Heap error in C

I know this is really general, but I get "this" (see below) when I run my .c file in Visual C++ 2008 Express. It happens when I call malloc(). Take my word on this - I dynamically allocate memory properly.
HEAP[Code.exe]: HEAP: Free Heap block 211a10 modified at 211af8 after it was freed
Windows has triggered a breakpoint in Code.exe.
This may be due to a corruption of the heap, which indicates a bug in Code.exe or any of the DLLs it has loaded.
This may also be due to the user pressing F12 while Code.exe has focus.
The output window may have more diagnostic information.
Why do I get this error? What does this even mean?
The error message tells you exactly why you got it:
Free Heap block 211a10 modified at 211af8 after it was freed
You had a heap allocated block that was freed then something wrote to that area of memory. It's not nice to write to a freed block of memory.
The error isn't actually happening when you call malloc; that's just when it triggers a free heap scan. The actual error happened somewhere before. You malloced some memory at address 211a10 (that's what malloc returned to you). Then you (or some other lib) freed it. Then later, when you call malloc in debug mode, it scans the whole heap -- as a courtesy to you, the poor programmer. It discovers that somebody (you or some lib you call) wrote over part of that array, specifically at address 211af8, or 0xe8 bytes into the array. So you're either still hanging onto a pointer that's been freed (most likely) and using it, or you're just trashing random memory.
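One common way to make a stale pointer fail early is to reset it right after freeing. The sketch below uses a hypothetical FREE_AND_NULL macro and session struct; it only protects the pointer that is reset, so any other copies of the address remain dangling.

```c
#include <assert.h>
#include <stdlib.h>

/* Frees the block and clears the pointer in one step, so a later write
   through it dereferences NULL instead of scribbling on freed heap memory. */
#define FREE_AND_NULL(p) \
    do {                 \
        free(p);         \
        (p) = NULL;      \
    } while (0)

/* Hypothetical owner of a heap-allocated name. */
struct session {
    char *name;
};

/* Releases the name; calling it twice is harmless because free(NULL) is a no-op. */
void session_clear(struct session *s)
{
    assert(s != NULL);
    FREE_AND_NULL(s->name);
}
```

On most platforms a write through the cleared pointer faults at the offending line, which is much easier to track down than a heap scan complaining long afterwards.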
In my case, with similar symptoms, the issue was a struct alignment mismatch (/Zp option).
I defined a different struct alignment for my code than the one used by an external library (wxWidgets).
However, wxWidgets was built with its makefile, so it was compiled using the default /Zp.
And wxWidgets is statically linked.
You can do that, but if you try to delete a wxWidgets class object from your code, your code and the library disagree about the exact size and layout of struct members.
And when running, you get this message:
HEAP[Code.exe]: HEAP: Free Heap block 211a10 modified at 211af8 after it was freed
Windows has triggered a breakpoint in Code.exe.
Solution:
Be sure to use the same "Struct Member Alignment" in all code and libraries.
The best rule is to leave /Zp at its "default" value.
In Visual Studio, this is under Properties > C/C++ > Code Generation.
MSDN cite: "You should not use this option unless you have specific alignment requirements."
Tip: use #pragma pack if you need to control the alignment in some structs
Example:
#pragma pack(1) // - 1 byte alignment
typedef union
{
u64 i;
struct{ // CUSTOMS.s is used by Folders
u32 uidx; // Id, as registered
byte isoS, isoT; // isoS/isoT combination.
byte udd0, udd1; // custom values (TBD)
}s;
}CUSTOMS;
struct Header // exactly 128 bits
{
u32 version;
u32 stamp; // creation time
CUSTOMS customs; // properties
};
#pragma pack() // this pragma restores the **default** alignment
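A layout mismatch of this kind can often be caught at compile time by asserting the expected size and offsets next to the struct definition. The sketch below uses hypothetical struct names and the push/pop form of #pragma pack, which MSVC, GCC and Clang accept; the numbers assume 8-bit bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#pragma pack(push, 1) /* 1-byte alignment for the structs that follow */
struct packed_header {
    uint8_t  kind;
    uint32_t stamp;
};
#pragma pack(pop) /* restore the previous alignment */

/* Same members, laid out with the compiler's current default alignment. */
struct natural_header {
    uint8_t  kind;
    uint32_t stamp;
};

/* Compilation fails if the packed layout is not what the code expects. */
static_assert(sizeof(struct packed_header) == 5,
              "packed_header must be exactly 5 bytes");
static_assert(offsetof(struct packed_header, stamp) == 1,
              "stamp must follow kind without padding");
```

The same kind of assertion on an unpacked struct, placed in a header shared by the application and the library, would fail to compile in whichever side was built with a different alignment setting.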
Hope this explanation helps, because this is not actually a bug in the code but a serious configuration mistake, one that is difficult to detect because it hides in subtle compiler options.
I dynamically allocate memory properly.
I think that the problem here is that you deallocate the memory improperly. What I mean by this is that you might be trying to use freed memory. Sorry I can't help any further; you could add the actual code.
Take my word on this - I dynamically allocate memory properly.
But are you sure your buffers are all of the correct size and you free() them properly? Double frees and buffer overflows can easily lead to heap corruption that can cause malloc() to fail in all kind of ways.
If the management structures used internally by malloc() get damaged, it will usually not lead to an error immediately. But later calls to malloc() or free() that try to use these damaged structures will fail or do erratic things.
Are you using malloc() on an array? Because I think the error just might be you forgetting to allocate an extra memory location at the end -- what happens is it tries to write to that location, which isn't allocated to it, and assumes it's trying to write to a place that has already been freed.
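The classic instance of the missing extra location is a string buffer allocated without room for the terminating '\0'. A minimal sketch of the correct pattern, with copy_string as a hypothetical helper name:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a heap copy of `src`, or NULL if allocation fails. */
char *copy_string(const char *src)
{
    assert(src != NULL);
    size_t len = strlen(src);     /* does not count the terminator */
    char *dst = malloc(len + 1);  /* +1 for the '\0' */
    if (dst == NULL)
        return NULL;
    memcpy(dst, src, len + 1);    /* copies the terminator too */
    return dst;
}
```

Allocating only strlen(src) bytes and then copying the string writes one byte past the block, which is exactly the kind of small overrun the heap checker reports later.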

Resources