Tips for debugging memory overwrite errors in C

Periodically in C I hit errors that I believe come from a bug in my code corrupting data, possibly by writing to an array using a size that is too large, so the write flows over the allocated memory into other variables. I find this very painful to debug, but I figure it must be a common problem, so maybe there are better approaches.
My current approach is to put print statements to print the value stored at the offending address throughout my code to figure out where it is getting overwritten.
Are there better ways to go about it? I have tried putting a breakpoint on that memory address, but somehow it doesn't trigger.
An example is shown below - results will of course depend on the compiler.
PLEASE NOTE - code below has error by design to illustrate the sort of issue I am trying to debug.
#include <stdio.h>
#include <stdint.h>
uint16_t array_a[256];
uint8_t array_b[256];
int main(void)
{
    array_a[4] = 0;
    printf("Before: array_a[4] = %u\n", array_a[4]);
    // Intentionally write to memory beyond end of array_b to illustrate issue.
    for (uint16_t k=0;k<266;k++)
    {
        array_b[k] = k;
    }
    printf("After: array_a[4] = %u\n", array_a[4]);
    getchar(); // wait for Enter (stands in for the non-standard getch())
    return 0;
}
Output:
Before: array_a[4] = 0
After: array_a[4] = 2312
So although I haven't explicitly touched array_a, its values have changed in my loop. Easy to debug here, but this soon gets pretty complicated when code in one class is affecting variables in another!
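One approach that sometimes helps narrow this kind of thing down is to put known guard values on either side of the suspect buffer and check them at intervals. A minimal sketch, assuming the buffer can be wrapped in a struct; guarded_buf, GUARD_VALUE, guarded_init, guarded_ok and guarded_demo are hypothetical names, not part of the code above:

```c
#include <stdint.h>
#include <string.h>

#define GUARD_VALUE 0xDEADBEEFu  /* arbitrary pattern, unlikely to occur by chance */
#define BUF_LEN     256

/* Hypothetical wrapper: the buffer sits between two guard words. */
typedef struct {
    uint32_t guard_before;
    uint8_t  data[BUF_LEN];
    uint32_t guard_after;
} guarded_buf;

/* Set both guards and clear the payload. */
void guarded_init(guarded_buf *b)
{
    b->guard_before = GUARD_VALUE;
    memset(b->data, 0, sizeof b->data);
    b->guard_after = GUARD_VALUE;
}

/* Returns 1 while both guards are intact, 0 once either has been overwritten. */
int guarded_ok(const guarded_buf *b)
{
    return b->guard_before == GUARD_VALUE && b->guard_after == GUARD_VALUE;
}

/* Usage sketch: the overrun is simulated by writing the guard directly,
   because a real out-of-bounds write would be undefined behavior. */
int guarded_demo(void)
{
    guarded_buf b;
    guarded_init(&b);
    int before = guarded_ok(&b);   /* guards untouched */
    b.guard_after = 0;             /* stand-in for a stray write past data[] */
    int after = guarded_ok(&b);    /* guard no longer matches */
    return before == 1 && after == 0;
}
```

Calling guarded_ok() after each suspect stretch of code brackets where the corruption happens. Where the target supports it, a data watchpoint in the debugger (for example gdb's watch command) can give the same information without code changes.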

Related

Understanding the reason behind a output

I am a beginner in Computer Science and I recently started learning the language C.
I was studying the for loop and in the book it was written that even if we replace the initialization;testing;incrementation statement of a for loop by any valid statement the compiler will not show any syntax error.
So now I run the following program:
#include<stdio.h>
int main()
{
    int i;
    int j;
    for(i<4;j=5;j=0)
        printf("%d",i);
    return 0;
}
I have got the following output.
OUTPUT:
1616161616161616161616161616161616161616.........indefinitely
I understand why this is an infinite loop, but I am unable to understand why my PC prints this specific output. Is there any way to work out what output the system will produce for programs like this?
This is a case of undefined behaviour.
You have declared i, so it has a memory address, but you haven't set its value, so its value is just whatever was already at that address (in this case 16).
I'd guess if you ran the code multiple times (maybe with restarts between) the outputs would change.
Some more information on uninitialized variables and undefined behaviour: https://www.learncpp.com/cpp-tutorial/uninitialized-variables-and-undefined-behavior/
Looks like i was never initialized, and happens to contain 16, which the for loop is printing continuously. Note that printf does not add a new line automatically. Probably want to read the manual on how to use for.
To better understand what is going on, you can add additional debugging output to see what other variables are set to (don't forget the \n newline!), but the real problem is that the for loop doesn't seem right. Unless there's something going on with j that you aren't showing us, it really doesn't belong there at all.
You are declaring the variable i but never initializing it; meanwhile, you are assigning a value to j but never using it.
In a for loop, you first initialize the control variable, if you haven't already done so. Then you specify the condition used to decide whether to run another iteration. And last but not least, you change the value of the control variable so that the loop eventually ends.
An example:
#include <stdio.h>
int main()
{
    // 1. we declare i as an integer starting from 1
    // 2. we will keep iterating as long as i is smaller than 10
    // 3. at the end of each iteration, we increment its value by 1
    for (int i = 1; i < 10; i++) {
        printf("%d\n", i);
    }
    return 0;
}

Why is my variable not being updated?

I have defined a variable with an initial value.
When stepping through the code:
I can see the initial value
My function changes the value
When I use the variable later it has the wrong value
What is happening?
Note: this is intended as a reference question for common problems. If the generic answers here don't help you, post a question containing your complete, actual code.
There are several reasons why a variable might not keep a value. While some are arcane and difficult to debug, some of the most common reasons include:
The variable is modified by an interrupt
//---in main()---
uint8_t rxByte = 0;
printf("%d", rxByte); //prints "0"
//---later in Uart0_Rx_Handler()---
rxByte = U0RXREG; //rxByte set to (for example) 55
//---later in main()---
printf("%d", rxByte); //still prints "0"!!!
If a variable is modified by an interrupt handler, it needs to be declared volatile. Volatile lets the compiler know that the variable could be modified asynchronously and that it shouldn't use a cached copy in a register.
//---in main()---
volatile uint8_t rxByte = 0;
printf("%d", rxByte); //prints "0"
//---later in Uart0_Rx_Handler()---
rxByte = U0RXREG; //rxByte set to 55
//---later in main()---
printf("%d", rxByte); //correctly prints 55
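The same pattern can be tried on a desktop machine, using a signal handler as a stand-in for the interrupt. A minimal sketch, assuming a hosted C environment; rx_flag, on_signal and wait_for_event are hypothetical names, and SIGINT is chosen only because it is one of the signals the C standard defines:

```c
#include <signal.h>

/* Written by the handler, read by the main code. volatile sig_atomic_t is
   the type the C standard provides for this kind of flag. */
static volatile sig_atomic_t rx_flag = 0;

static void on_signal(int sig)
{
    (void)sig;
    rx_flag = 1;
}

/* Installs the handler, triggers it, then polls the flag.
   Returns 1 once the handler's write has been seen, -1 on setup failure. */
int wait_for_event(void)
{
    rx_flag = 0;
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;
    if (raise(SIGINT) != 0) {
        signal(SIGINT, SIG_DFL);
        return -1;
    }
    while (!rx_flag) {
        /* volatile forces a fresh read of rx_flag on every pass */
    }
    signal(SIGINT, SIG_DFL);   /* restore default handling */
    return 1;
}
```

On a microcontroller the handler would be the real interrupt routine and the flag would typically be the received byte or a "data ready" marker; the volatile qualifier plays the same role in both cases.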
Overrunning an array's bounds
There are no checks in C to prevent you from going beyond the bounds of an array.
int array[10];
int my_var = 55;
printf("%d", my_var); //prints "55"
for(int i=0; i<11; i++) // eleven is one too many indexes for this array
{
    array[i] = i;
}
printf("%d", my_var); // may print "10"!!! (array[10] = 10 can land on my_var)
In this case, we go through the loop 11 times, which is one more iteration than the array has elements. Writing to array[10] is undefined behavior; with many compilers the write lands on a variable stored next to the array, which may be declared anywhere on the page, not necessarily on the next line. This scenario can occur in many different circumstances, including multi-dimensional arrays and stack corruption.
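Since the language will not do the check, one option is to do it by hand at the point of the write. A minimal sketch, assuming the array length is known where the write happens; ARRAY_LEN, checked_store and count_refused_writes are hypothetical names:

```c
#include <stddef.h>

/* Element count of a true array (this does not work on a pointer). */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Stores value at arr[index] only if index is in range.
   Returns 1 on success, 0 if the write was refused. */
int checked_store(int *arr, size_t len, size_t index, int value)
{
    if (index >= len)
        return 0;
    arr[index] = value;
    return 1;
}

/* Repeats the loop from the example with the same off-by-one bound
   and counts how many writes the check refused. */
size_t count_refused_writes(void)
{
    int array[10];
    size_t refused = 0;
    for (size_t i = 0; i < 11; i++) {
        if (!checked_store(array, ARRAY_LEN(array), i, (int)i))
            refused++;
    }
    return refused;
}
```

Here the eleventh write is rejected instead of landing on a neighbouring variable. Deriving the loop bound from ARRAY_LEN(array) in the first place avoids the mismatch entirely.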
Forgetting to dereference a pointer
While trivial, forgetting the asterisk on a pointer when making assignments will not set the variable correctly.
int* pCount;
pCount = 10; //forgot the asterisk!!!
printf("%d", *pCount); //prints ??
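For comparison, a sketch of the intended assignment, assuming the pointer is first made to point at a real int (the snippet above never does this, so dereferencing pCount there would be invalid either way); store_via_pointer is a hypothetical name:

```c
/* Returns the value of count after writing through the pointer. */
int store_via_pointer(void)
{
    int count = 0;
    int *pCount = &count;  /* point at valid storage first */
    *pCount = 10;          /* the asterisk writes to count, not to the pointer */
    return count;
}
```

Compilers typically diagnose the asterisk-free version, because it assigns an integer to a pointer without a cast, so it is worth reading the warnings.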
Masking a variable with the same name
Reusing a variable name in an inner scope (like inside an if/for/while block or inside a function) hides a variable with the same name elsewhere.
int count = 10; //count is 10
if(byteReceived)
{
    int count = U0RXREG; //count redeclared!!!
    DoSomething(count);
    printf("%d", count); //prints "55"
}
printf("%d", count); //prints "10"
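The difference comes down to one keyword. A small sketch with both versions side by side; the function names are hypothetical and the constant 55 stands in for the register read:

```c
/* Inner declaration shadows the outer count: the outer one is untouched. */
int count_with_shadowing(int received)
{
    int count = 10;
    if (received) {
        int count = 55;   /* new variable, visible only inside this block */
        (void)count;
    }
    return count;
}

/* Plain assignment updates the outer variable. */
int count_without_shadowing(int received)
{
    int count = 10;
    if (received) {
        count = 55;       /* no type name, so this is the outer count */
    }
    return count;
}
```

GCC and Clang can flag the first form if the -Wshadow warning is enabled.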
I'd like to add another possible reason to Zack's excellent answer.
Optimisation
This one frequently surprises me. A good optimising compiler will notice when two different variables are never used at the same time, and will optimise the program by giving those variables the same address in memory. When you are stepping through the code, you may see a variable apparently changing in the watch window. But what's really happening is that the variable that shares its address is being written to.
The other trick the compiler pulls is simply getting rid of a variable that it realises isn't necessary. Sometimes you might be doing the equivalent of this in your code:
force_a = mass_a * acceleration_a;
force_b = mass_b * acceleration_b;
total_force = force_a + force_b;
The compiler sees that there's no real need for the variables force_a and force_b, and so changes the code to this:
total_force = (mass_a * acceleration_a) + (mass_b * acceleration_b);
You'll never see force_a and force_b being updated, but you'll still be able to add them to the watch window.
When I step through my program, I'm convinced that some variable or other has the wrong value in it, but when I let my program run through without stepping, it seems to work. Check that this isn't happening to you.
Added:
As Ashish Kulkarni mentioned, you can check this by turning off optimisation.
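If turning optimisation off changes the behaviour too much, one temporary debugging aid is to mark the intermediates volatile so that the compiler has to keep them as real objects. A sketch using the variable names from the example above; the function wrapper and the double type are assumptions:

```c
/* Computes the same sum, but keeps force_a and force_b as real objects. */
double total_force(double mass_a, double acceleration_a,
                   double mass_b, double acceleration_b)
{
    /* volatile: every write and read below must actually happen, so the
       compiler cannot fold these into the final expression */
    volatile double force_a = mass_a * acceleration_a;
    volatile double force_b = mass_b * acceleration_b;
    return force_a + force_b;
}
```

With this in place the two intermediates should show up with live values when stepping. The qualifier costs some performance, so it is worth removing again once the problem is found.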

Why is this piece of code triggering "Write overrun warning (C6386)" with MSVS2012

I have the following piece of C code:
#include <stdint.h>
typedef union{
    uint8_t c[4];
    uint16_t s[2];
    uint32_t l;
}U4;
uint32_t cborder32(uint32_t l)
{
    U4 mask,res;
    unsigned char* p = (unsigned char*)&l;
    mask.l = 0x00010203;
    res.c[(uint8_t)(mask.c[0])] = (uint8_t)p[0]; // <-- this line gives C6386
    res.c[(uint8_t)(mask.c[1])] = (uint8_t)p[1];
    res.c[(uint8_t)(mask.c[2])] = (uint8_t)p[2];
    res.c[(uint8_t)(mask.c[3])] = (uint8_t)p[3];
    return res.l;
}
And it triggers a Write overrun warning when running code analysis on it. http://msdn.microsoft.com/query/dev11.query?appId=Dev11IDEF1&l=EN-US&k=k%28C6386%29&rd=true
The error is:
C6386 Write overrun Buffer overrun while writing to 'res.c': the writable size is '4' bytes, but '66052' bytes might be written.
Invalid write to 'res.c[66051]', (writable range is 0 to 3)
And I just don't understand why. Can anyone explain it to me?
I'd put this down as a potential bug in the Microsoft product. It appears to be using the full value of mask.l (0x00010203 being decimal 66051) when figuring out the array index, despite the fact you clearly want mask.c[0] forced to a uint8_t value.
So the first step is to notify Microsoft. They may come back and tell you you're wrong, and hopefully give you the C++ standard section that states why what you're doing is wrong. Or they may just state the code analysis tool is "best effort only". Since it's not actually preventing you from compiling (and it's not generating errors or warnings during compilation), they could still claim VC++ is compliant.
I would hope, of course, they wouldn't take that tack since they have a lot of interest in ensuring their tools are the best around.
The second step you should take is to question why you want to do it that way in the first place. What you have seems to be a simple byte-ordering switcher based on a mask. The statement:
res.c[(uint8_t)(mask.c[0])] = (uint8_t)p[0];
is problematic anyway since (uint8_t)(mask.c[0]) may well evaluate out to something greater than 3, and you're going to write beyond the end of your union in that case.
You may think that ensuring mask has no bytes greater than 3 may prevent this but it may be that the analyser doesn't know this. In any case, there are many ways already to switch byte order, such as with the htons family of functions or, since your stuff is hard-coded anyway, just use one of:
res.c[0] = p[0]; res.c[1] = p[1]; res.c[2] = p[2]; res.c[3] = p[3];
or:
res.c[0] = p[3]; res.c[1] = p[2]; res.c[2] = p[1]; res.c[3] = p[0];
or something else, for stranger byte ordering requirements. Using this method doesn't cause any complaints from the analyser at all.
If you really want to do it with the current mask method, you can remove the analyser warning (at least in VS2013, which is what I'm using) by temporarily suppressing it (for one line):
#pragma warning(suppress : 6386)
res.c[mask.c[0]] = p[0];
res.c[mask.c[1]] = p[1];
res.c[mask.c[2]] = p[2];
res.c[mask.c[3]] = p[3];
(with casts removed since the types are already correct).
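Another way to avoid indexing through the union at all is to reorder the bytes with shifts and masks. A minimal sketch, assuming a full byte reversal is what is wanted (the mask-based original produces an order that depends on the host's endianness, so this is an alternative rather than a drop-in equivalent); swap_bytes32 is a hypothetical name:

```c
#include <stdint.h>

/* Reverses the byte order of a 32-bit value using only shifts and masks. */
uint32_t swap_bytes32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}
```

For example, swap_bytes32(0x01020304) gives 0x04030201. There is no array index here for a static analyser to reason about.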

I have a for loop that refuses to stay within the defined limits. The limits have been verified (by printing) during runtime

I am inserting the snippet of code that is causing the problem. Please focus on the uncommented code, as that is the actual code. The commented out code was there to help me debug. Apologies for poor readability, but I decided to let the comments stay, in case anyone had a doubt as to how I arrived at my diagnosis of the problem.
void backProp(double back_prop_err, double x[])// x is the complete output of input layer neurons.
{
    //j++;
    //printf("\nA"); printf("\nj=%d",j);
    error = out*(1-out)*back_prop_err;
    printf("\nB:%20.18f",error);
    //if(j==8)
    //{
    // printf("\nstart= %d, stop= %d",start, stop);
    // temp=1;
    //}
    for(int i=start; i<= stop; i++)
    {
        // if(i==24)
        // break;
        // if(temp==1)
        // {
        // printf("\nstart= %d and stop= %d", start, stop); //temp=0;
        // }
        //j++;
        //printf("\nC");
        del_w[i] = c*error*x[i];
        printf("\ndel_w[%d]=%20.18f",i,del_w[i]);
    }
}
Please ignore the commented out sections. They were there to display stuff on the screen, so I could debug the code.
The Problem:
There are 3 classes, let's call them A, B and C. A has 24 objects, stored as an array. B has 10 objects, again in an array and C has only 1 object.
The above code is from the B class.
For class B's object[7], the values of start and stop (see above code) are 0 and 23, respectively. I have verified this at runtime using the commented-out pieces of code that you see above.
Yet, when the backProp function is called, at that particular for loop, an infinite loop is entered. The value of i keeps increasing without bound till DevC++ crashes. The commented out if(i==24) was put there to prevent that.
This only happens with class B's object[7]... Not with the previous 7 objects (object[0]...object[6]). For them, that particular loop starts and stops as per the set "start" and "stop" variables. Is it important that for those objects, "stop" is a small number (like 6 or 12 max)?
Class B's object[8] and object[9] also have stop = 23, which leads me to suspect that they too would face the same problem.
The Question: Why is this happening? start and stop variables are set. Why isn't the loop staying within those limits?
I have tried to keep this as concise as possible. Thank you for your efforts at reading this wall of a question.
EDIT: start and stop are private variables of class B (of which, the above function is a public function). They are set to their respective values using another public function, which I call from main.
j is somewhat embarrassing. It's a global static integer, initially set to 0. The reason I did this is so that I could count when class B's object[7] was accessing the backProp function. j gets incremented each time backProp is called, and as it's a static global, it acts as a counter so I can count till the object[7].
As per Oak's suggestion, I am putting up the link for the code here: http://pastebin.com/ftxBGs2y
Yeah, efforts are on to learn the DevC++ debugger. :P
I am pretty sure I don't have any problems with the dimensions of x. I will look into the matter again though.
My bet is that you are modifying i unintentionally when writing elements to del_w[]: perhaps the number of elements in the array is not large enough for what you are doing.
I say this because i and del_w could be close to each other on the stack but don't know for sure because I don't know how you allocated the memory for del_w.
Check the bounds of your arrays carefully, remembering that they are zero based.
It will not be anything funny in the for loop; you can trust the compiler!
But use a good debugger though or you'll waste so much time.
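To test that theory without a debugger, the loop can be made to refuse any index that falls outside either array. A rough sketch, assuming the array lengths can be passed in; del_w, x, start and stop follow the question, while update_weights, the length parameters, scale and the sizes 13 and 24 in the demo are made up for illustration:

```c
#include <stddef.h>

/* Writes del_w[i] = scale * x[i] for i in [start, stop], but never past the
   end of either array. Returns the number of elements actually written. */
size_t update_weights(double *del_w, size_t del_w_len,
                      const double *x, size_t x_len,
                      size_t start, size_t stop, double scale)
{
    size_t written = 0;
    for (size_t i = start; i <= stop; i++) {
        if (i >= del_w_len || i >= x_len)
            break;   /* would overrun: stop instead of corrupting memory */
        del_w[i] = scale * x[i];
        written++;
    }
    return written;
}

/* Usage sketch: del_w holds only 13 elements but the loop asks for 0..23. */
size_t update_weights_demo(void)
{
    double del_w[13] = {0};
    double x[24] = {0};
    return update_weights(del_w, 13, x, 24, 0, 23, 0.5);
}
```

If the returned count is smaller than stop - start + 1, the requested range does not fit the arrays, which would be consistent with i being overwritten by the out-of-range writes.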

Need help tracking down an illegal write. valgrind not noticing it

I've got a C program that gets caught in a for loop that it shouldn't. Running it with
valgrind --tool=memcheck --leak-check=yes a.out
doesn't report anything, even up to the point where the program gets caught. Is there a way to change Valgrind's settings to help me find the leak? (As many have pointed out, it wouldn't be considered a leak; apologies.)
Thanks in advance.
Here is the loop in question:
int clockstate=0;
int clocklength=0;
int datalength=0;
int datastate=0;
int dataloc = 9;
((((some other code that i don't think is important to this part))))
int dataerr[13] = {0};
int clockerr[13] = {0}; // assumes that spill does not change within an event.
int spill=0;
int k = 0;
spill = Getspill(d+4*255+1); // get spill bit from around the middle
//printf("got spill: %d \n", spill); // third breakpoint
for( k = 0; k < 512; k++)
{
    // Discardheader(d); // doesnt actually do anything, since it's a header.f
    int databit = Getexpecteddata(d+4*k+1);
    printf("%d ",k);
    int transmitted = Datasample(&datastate, &datalength, d+4*k+2,dataerr,dataloc, databit);
    printf("%d ",k);
    Clocksample(&clockstate, &clocklength, d+4*k+3,clockerr, transmitted);
    printf("%d \n",k);
    // assuming only one error per event (implying the possibility of multi-error "errors"
    // we construct the final error at the very end of the (outside this loop)
}
and the loop repeats after printing
254 254 254
255 255 255
256 256 1 <- this is the problem
2 2 2
3 3 3
Edit: I've tracked down where it is happening. At one point in
void Clocksample (int* state, int* length, char *d, int *type, int transbit);
I have code that says *length = 1; so it seems that this statement is somehow writing onto int k. My question now is: how did this happen, why isn't it changing length back to one like I want, and how do I fix it? If you want, I can post the whole code of Clocksample.
Similar to last time, something in one of those functions, Clocksample() this time, is writing to memory that doesn't belong to the data/arrays that the function should be using. Most likely an out of bounds array write. Note: this is not a memory leak, which is allocating then losing track of memory blocks that should be freed.
Set a breakpoint at the call to Clocksample() for when k is 256. Then step into Clocksample(), keeping a watch on k (or the memory used by k). You can probably also just set a hardware memory write breakpoint on the memory allocated to k. How you do any of this depends on the debugger you're using.
Now single-step (or just run to the return of Clocksample() if you have a hardware breakpoint set) and when k changes, you'll have the culprit.
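Where no debugger watchpoint is available, the same idea can be approximated in code by snapshotting the variable around each suspect call. A rough sketch, assuming the variable is a plain scalar such as k; CHECK_UNCHANGED, tripwire_hits and the two stand-in functions are hypothetical and only mimic the shape of the calls in the question:

```c
#include <stdio.h>
#include <string.h>

static int tripwire_hits = 0;   /* number of detected changes */

/* Runs stmt and reports if the bytes of var differ afterwards. */
#define CHECK_UNCHANGED(var, stmt)                                   \
    do {                                                             \
        unsigned char snapshot_[sizeof(var)];                        \
        memcpy(snapshot_, &(var), sizeof(var));                      \
        stmt;                                                        \
        if (memcmp(snapshot_, &(var), sizeof(var)) != 0) {           \
            tripwire_hits++;                                         \
            fprintf(stderr, "%s:%d: %s changed during: %s\n",        \
                    __FILE__, __LINE__, #var, #stmt);                \
        }                                                            \
    } while (0)

/* Stand-ins for the real functions: one behaves, one clobbers its argument. */
static void well_behaved(int *length) { *length = 1; }
static void clobbers(int *victim)     { *victim = 1; }

/* Usage sketch: returns the number of calls that modified k. */
int tripwire_demo(void)
{
    int k = 256;
    int clocklength = 0;
    tripwire_hits = 0;
    CHECK_UNCHANGED(k, well_behaved(&clocklength));  /* k untouched */
    CHECK_UNCHANGED(k, clobbers(&k));                /* k overwritten: reported */
    return tripwire_hits;
}
```

Wrapping each call in the loop body this way narrows the overwrite down to one call, after which the check can be moved inside that function.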
Please note that Valgrind is exceedingly weak when it comes to detecting stack buffer overflows (which is what appears to be happening here).
Google's AddressSanitizer (enabled in GCC and Clang with -fsanitize=address) is much better at detecting stack buffer overflows, and I suggest you try it instead.
So your debugging output indicates that k is being changed during the call to your function Clocksample. I see that you are passing the addresses of at least two variables, &clockstate and &clocklength into that call. It seems quite likely to me that you have an array overrun or some other wild pointer in Clocksample that ends up overwriting the memory location where k is stored.
It might be possible to narrow down the bug if you post the code where k is declared (and whatever other variables are declared nearby in the same scope). For example if clocklength is declared right before k then you probably have a bug in using the pointer value &clocklength that leads to writing past the end of clocklength and corrupting k. But it's hard to know for sure without having the actual layout of variables you're using.
valgrind doesn't catch this because if, say, clocklength and k are right next to each other on the stack, valgrind can't tell if you have a perfectly valid access to k or a buggy access past the end of clocklength, since all it checks is what memory you actually access.

Resources