Need help tracking down an illegal write. valgrind not noticing it - c

I've got a C program that gets stuck in a for loop when it shouldn't. Running it with
valgrind --tool=memcheck --leak-check=yes a.out
doesn't report anything, even up to the point where the program gets stuck. Is there a way to change valgrind's settings to help me find the leak? (As many have pointed out, this wouldn't be considered a leak; apologies.)
Thanks in advance.
Here is the loop in question:
int clockstate=0;
int clocklength=0;
int datalength=0;
int datastate=0;
int dataloc = 9;
((((some other code that i don't think is important to this part))))
int dataerr[13] = {0};
int clockerr[13] = {0}; // assumes that spill does not change within an event.
int spill=0;
int k = 0;
spill = Getspill(d+4*255+1); // get spill bit from around the middle
//printf("got spill: %d \n", spill); // third breakpoint
for (k = 0; k < 512; k++)
{
    // Discardheader(d); // doesn't actually do anything, since it's a header.
    int databit = Getexpecteddata(d+4*k+1);
    printf("%d ",k);
    int transmitted = Datasample(&datastate, &datalength, d+4*k+2, dataerr, dataloc, databit);
    printf("%d ",k);
    Clocksample(&clockstate, &clocklength, d+4*k+3, clockerr, transmitted);
    printf("%d \n",k);
    // assuming only one error per event (implying the possibility of multi-error "errors"),
    // we construct the final error at the very end (outside this loop)
}
and the loop repeats after printing
254 254 254
255 255 255
256 256 1 <- this is the problem
2 2 2
3 3 3
Edit: I've tracked down where it happens. At one point in
void Clocksample (int* state, int* length, char *d, int *type, int transbit);
I have code that says *length = 1; so it seems that this statement is somehow writing onto int k. My question now is: how did this happen, why isn't it setting length back to 1 like I want, and how do I fix it? If you want, I can post the whole code of Clocksample.

Similar to last time, something in one of those functions, Clocksample() this time, is writing to memory that doesn't belong to the data/arrays that the function should be using. Most likely an out of bounds array write. Note: this is not a memory leak, which is allocating then losing track of memory blocks that should be freed.
Set a breakpoint at the call to Clocksample() for when k is 256. Then step into Clocksample(), keeping a watch on k (or the memory used by k). You can probably also just set a hardware memory write breakpoint on the memory allocated to k. How you do any of this depends on the debugger you're using.
Now single-step (or just run to the return of Clocksample() if you have a hardware breakpoint set) and when k changes, you'll have the culprit.

Please note that Valgrind is exceedingly weak when it comes to detecting stack buffer overflows (which is what appears to be happening here).
AddressSanitizer (from Google, available in GCC and Clang via -fsanitize=address) is much better at detecting stack buffer overflows, and I suggest you try it instead.

So your debugging output indicates that k is being changed during the call to your function Clocksample. I see that you are passing the addresses of at least two variables, &clockstate and &clocklength into that call. It seems quite likely to me that you have an array overrun or some other wild pointer in Clocksample that ends up overwriting the memory location where k is stored.
It might be possible to narrow down the bug if you post the code where k is declared (and whatever other variables are declared nearby in the same scope). For example if clocklength is declared right before k then you probably have a bug in using the pointer value &clocklength that leads to writing past the end of clocklength and corrupting k. But it's hard to know for sure without having the actual layout of variables you're using.
valgrind doesn't catch this because if, say, clocklength and k are right next to each other on the stack, valgrind can't tell if you have a perfectly valid access to k or a buggy access past the end of clocklength, since all it checks is what memory you actually access.

Related

What exactly happens if I declare a 10 elements array and try to access a bigger position within it?

It just happened to me. A bug. I declared a 5-element array and a position variable to step through all its indices:
int matematica[5];
int pos = 0;
and then I had my loop working just fine. Like this:
while (pos < 5) {
    printf("Entre com o número da matrícula do %dº aluno: \n", pos+1);
    scanf("%d", &num);
    if (num != 35)
        matematica[pos] = num;
    pos++;
}
Everything worked like a charm. After that, I had to do the same for 150 positions, so I changed the while loop from while (pos < 5) to while (pos < 150) but forgot to do the same with the array. What happened then is the subject of my question. The program didn't crash or anything; the printf and scanf statements just ran a few more than 5 times and then stopped (sometimes 8 times, sometimes 7...).
Why does that happen? I of course fixed it later, but I still can't grasp the logic behind that bug.
The C standard says this triggers undefined behavior: anything could happen.
It could appear to work "correctly",
it could terminate with an error code,
or it could do something unexpected.
This type of bug is called a buffer overrun, and these can often lead to arbitrary code execution (which is a special subclass of "something unexpected").
In your example pos probably occupies the same memory as matematica[5] (because most (all?) compilers pack global variables together, much like fields in a struct), so depending on what number you enter in the sixth place the loop may stop or continue. Negative numbers could cause interesting results.
When you declare an array of 5 integers, you reserve a chunk of memory that holds it, and the array name gives you the address of that chunk. When you access matematica[0], you refer to the beginning of the array and use the value stored there. When you access matematica[6] (which is outside the bounds), the same address arithmetic is applied, but the result lies outside your array. The compiler will accept it, but the behavior is undefined and it is unknown what is stored there. When you write a value there, you might overwrite your own data and cause weird bugs, and when you read from it, you will probably get a garbage value. It may not crash, but you have been warned :)

Array & segmentation fault

I'm creating the below array:
int p[100];
int
main ()
{
    int i = 0;
    while (1)
    {
        p[i] = 148;
        i++;
    }
    return (0);
}
The program aborts with a segmentation fault after writing 1000 positions of the array, instead of the 100. I know that C doesn't check if the program writes out of bounds, this is left to the OS. I'm running it on ubuntu, the size of the stack is 8MB (limit -s). Why is it aborting after 1000? How can I check how much memory my OS allocates for an array?
Sorry if it's been asked before, I've been googling this but can't seem to find a specific explanation for this.
Accessing an invalid memory location leads to Undefined Behavior which means that anything can happen. It is not necessary for a segmentation-fault to occur.
...the size of the stack is 8MB (limit -s)...
The variable int p[100]; is not on the stack but in the data area, because it is defined as a global. It is not initialized, so it is placed in the BSS area and filled with zeros. You can check that by printing the array values at the very beginning of main().
As others said, by executing p[i] = 148; with an unbounded i you produced undefined behaviour. After filling about 1000 positions you most probably reached the end of the BSS area (more precisely, the end of the last memory page mapped for it) and got a segmentation fault.
It appears that you clearly go past the 100 elements defined (int p[100];), since your loop has no limit (while (1)).
I would suggest using a for loop instead:
for (i = 0; i < 100; i++) {
    // do your stuff...
}
Regarding your more specific question about memory: any out-of-range access (in your situation, beyond the 100 elements of the array) can produce an error. The fact that it happened at 1000 in your case is not guaranteed and can change from one build or system to another.
It will fail once the CPU says
HEY! that's not Your memory, leave it!
The fact that the memory is not inside of the array does not mean that it's not for the application to manipulate.
The program aborts with a segmentation fault after writing 1000 positions of the array, instead of the 100.
You cannot reason about undefined behavior. It's like asking: if 1000 people are under a coconut tree, will 700 of them always fall unconscious if a coconut smacks each of their heads?

Write magic bytes to the stack to monitor its usage

I have a problem on an embedded device that I think might be related to a stack overflow.
In order to test this I was planning to fill the stack with magic bytes and then periodically check whether the stack has overflowed by examining how many of my magic bytes are left intact.
But I can't get the routine for marking the stack to work. The application keeps crashing instantly. This is what I have done just at the entry point of the program.
//fill most of stack with magic bytes
int stackvar = 0;
int stackAddr = (int)&stackvar;
int stackAddrEnd = stackAddr - 25000;
BYTE* stackEnd = (BYTE*) stackAddrEnd;
for(int i = 0; i < 25000; ++i)
{
    *(stackEnd + i) = 0xFA;
}
Please note that the allocated stack is larger than 25k. So I'm counting on some stack space to already be used at this point. Also note that the stack grows from higher to lower addresses that's why I'm trying to fill from the bottom and up.
But as I said, this will crash. I must be missing something here.
From what I can see, you may easily be overwriting the stackEnd variable itself in the last few iterations of the loop. That is obviously a bad thing, as you're using it in the very same loop. Does stopping the loop at, say, 24900 help?
I'd suggest stopping the loop at a carefully calculated value that depends on the size of an integer on your platform.
As others have already noted, you overwrite stackEnd. Depending on the endianness of your platform, you end up with a pointer to 0xXXXXFAFA, which is already larger (0xFAFA = 64250) than 25000, or with a pointer to 0xFAFAXXXX, which is somewhere else entirely.
Since this is an embedded device and thus possibly an entirely different architecture than i386, it is quite possible that the stack grows upwards instead of downwards.

How is this loop ending and are the results deterministic?

I found some code and I am baffled as to how the loop exits, and how it works. Does the program produce a deterministic output?
The reason I am baffled is:
1. `someArray` is of size 2, but clearly, the loop goes till size 3,
2. The value is deterministic and it always exits when `someNumber` reaches 4
Can someone please explain how this is happening?
The code was not printing correctly when I put angle brackets <> around include's library names.
#include <stdlib.h>
#include <time.h>
#include <stdio.h>
int main() {
    int someNumber = 97;
    int someArray[2] = {0,1};
    int findTheValue;
    for (findTheValue=0; (someNumber -= someArray[findTheValue]) >0; findTheValue++) {
    }
    printf("The crazy value is %d", findTheValue);
    return EXIT_SUCCESS;
}
Accessing an array element beyond its bounds is undefined behavior. That is, the program is allowed to do anything it pleases, reply 42, eat your hard disk or spend all your money. Said in other words what is happening in such cases is entirely platform dependent. It may look "deterministic" but this is just because you are lucky, and also probably because you are only reading from that place and not writing to it.
This kind of code is just bad. Don't do that.
Depending on your compiler, someArray[2] may be the very same memory as findTheValue!
Because these variables are declared one after another, it's entirely possible that they are positioned consecutively in memory (on the stack, I believe). C doesn't do any bounds checking, so someArray[2] just means the int located 2 * sizeof(int) bytes past the address of someArray[0].
So, if that layout holds: when findTheValue is 0, we subtract 0; when findTheValue is 1, we subtract 1. When findTheValue is 2, someArray[2] is findTheValue itself, so we subtract 2 and someNumber becomes 94. If someNumber happens to sit right after that, then at index 3 we subtract someNumber (which is now 94) from itself, get 0, and exit.
This behavior is by no means guaranteed. Don't rely on it!
EDIT: It is probably more likely that someArray[2] just points to garbage (unspecified) values in your RAM. These values are likely more than 93 and will cause the loop to exit.
EDIT2: Or maybe someArray[2] and someArray[3] are large negative numbers, and subtracting both causes someNumber to roll over to negative.
The loop exits because (someNumber -= someArray[findTheValue]) doesn't set.
Adding a debug line, you can see
value 0 number 97 array 0
value 1 number 96 array 1
value 2 number 1208148276 array -1208148180
that is printing out findTheValue, someNumber, someArray[findTheValue]
It's not the answer I would have expected at first glance.
Checking addresses:
printf("&someNumber = %p\n", &someNumber);
printf("&someArray[0] = %p\n", &someArray[0]);
printf("&someArray[1] = %p\n", &someArray[1]);
printf("&findTheValue = %p\n", &findTheValue);
gave this output:
&someNumber = 0xbfc78e5c
&someArray[0] = 0xbfc78e50
&someArray[1] = 0xbfc78e54
&findTheValue = 0xbfc78e58
It seems that for some reason the compiler puts the array in the beginning of the stack area, then the variables that are declared below and then those that are above in the order they are declared. So someArray[3] effectively points at someNumber.
I really do not know the reason, but I tried gcc on Ubuntu 32 bit and Visual Studio with and without optimisation and the results were always similar.

Segmentation Fault after slightly modifying my code

I'll copy the relevant lines:
(Declarations)
typedef struct { /* one entry of the position table */
    int f;
    int c;
} pos;
pos *p_opo[9];
(in main)
for (i = 0; i < num; i++) {
    p_opo[i] = (pos *) calloc(n_fil * n_col / 2, sizeof (pos));
}
Now, after only having introduced these lines, the code breaks at an arbitrary point (in a call to some library function). I suspect I'm corrupting something with this, although I don't know what.
All I want is to have an array of variable size arrays!
P.S.: num is an argument of the program. I've been running it with num=1 anyway.
num must be less than or equal to 9 (p_opo has 9 elements, so the valid indices are 0..8).
Note that in C, memory errors (corruption, leaks, etc.) often show up as a failure in a completely different place. The reason is that when you change some code, other code or data can be rearranged, and a latent bug may then end up causing a segmentation fault.
So the problem may very well be in another part of your program. Make sure you have all your warnings turned on (like the -Wall option in gcc); they may give you some clues.
If your call to calloc asks for memory of size 0 it may return NULL, and if you are making use of that memory it could be causing the segmentation fault. So if:
0 == (n_fil * n_col / 2)
or somehow
0 == sizeof (pos) /* I don't think that this is possible */
the size of the memory that you are asking for is 0, and so calloc can return NULL.
If this is not the case then I don't think that you have enough code up there for anyone to know why it is segfaulting. You should keep in mind that errors like this can go unnoticed until you add or change some code that seems to be totally unrelated to the code that has the actual error.
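Seeing you cast the return value of calloc makes me suspicious. Don't do that: it hides a typical error if you forget to include the header (stdlib.h) that declares the function. Without the declaration, an older compiler assumes calloc returns int, and the cast silences the warning.
This causes real trouble on a machine with 64-bit pointers and 32-bit int, where the returned pointer gets truncated.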

Resources