Write magic bytes to the stack to monitor its usage

Write magic bytes to the stack to monitor its usage - c

I have a problem on an embedded device that I think might be related to a stack overflow.
In order to test this I was planning to fill the stack with magic bytes and then periodically check if the stack has overflowed by examining how much of my magic bytes that are left intact.
But I can't get the routine for marking the stack to work. The application keeps crashing instantly. This is what I have done just at the entry point of the program.
//fill most of stack with magic bytes
int stackvar = 0;
int stackAddr = (int)&stackvar;
int stackAddrEnd = stackAddr - 25000;
BYTE* stackEnd = (BYTE*) stackAddrEnd;
for(int i = 0; i < 25000; ++i)
{
*(stackEnd + i) = 0xFA;
}
Please note that the allocated stack is larger than 25k. So I'm counting on some stack space to already be used at this point. Also note that the stack grows from higher to lower addresses that's why I'm trying to fill from the bottom and up.
But as I said, this will crash. I must be missing something here.

From what I can see, you may be easily overwriting the contents of the stackEnd variable in the last few iterations of the loop. This is obviously a bad thing, as you're using it in the very same loop. Does stopping in your loop at, say 24900, help?
I'd suggest to stop the loop at a well calculated value depending on the size of integer on your platform then.

As others have already noted, you overwrite stackEnd. Depending on the endianness of you end up with a pointer to 0xXXXXFAFA, which is already larger (64250) than 25000 or with a pointer to 0xFAFAXXXX, which is somewhere else.
Since this is an embedded device and thus an entirely different architecture than i386, it is quite possible, that the stack grows up- instead of downwards.

Related

Using memcpy() to move tail of buffer to its beginning? (overlap)

I have binary file read buffer which reads structures of variable length. Near the end of buffer there will always be incomplete struct. I want to move such tail of buffer to its beginning and then read buffer_size - tail_len bytes during next file read. Something like this:
char[8192] buf;
cur = 0, rcur = 0;
while(1){
read("file", &buf[rcur], 8192-rcur);
while (cur + sizeof(mystruct) < 8192){
mystruct_ptr = &buf[cur];
if (mystruct_prt->tailsize + cur >= 8192) break; //incomplete
//do stuff
cur += sizeof(mystruct) + mystruct_ptr->tailsize;
}
memcpy(buf,&buf[cur],8192-cur);
rcur=8192-cur;
cur = 0;
}
It should be okay if tail is small and buffer is big because then memcpy most likely won't overlap copied memory segment during single copy iteration. However it sounds slightly risky when tail becomes big - bigger than 50% of buffer.
If buffer is really huge and tail is also huge then it still should be okay since there's physical limit of how much data can be copied in single operation which if I remember correctly is 512 bytes for modern x86_64 CPUs using vector units. I thought about adding condition that checks length of tail and if it's too big comparing to size of buffer, performs naive byte-by-byte copy but question is:
How big is too big to consider such overlapping memcpy more or less safe. tail > buffer size - 2kb?

Per the standard, memcpy() has undefined behavior if the source and destination regions overlap. It doesn't matter how big the regions are or how much overlap there is. Undefined behavior cannot ever be considered safe.
If you are writing to a particular implementation, and that implementation defines behavior for some such copying, and you don't care about portability, then you can rely on your implementation's specific behavior in this regard. But I recommend not. That would be a nasty bug waiting to bite people who decide to use the code with some other implementation after all. Maybe even future you.
And in this particular case, having the alternative of using memmove(), which is dedicated to this exact purpose, makes gambling with memcpy() utterly reckless.

Array & segmentation fault

I'm creating the below array:
int p[100];
int
main ()
{
int i = 0;
while (1)
{
p[i] = 148;
i++;
}
return (0);
}
The program aborts with a segmentation fault after writing 1000 positions of the array, instead of the 100. I know that C doesn't check if the program writes out of bounds, this is left to the OS. I'm running it on ubuntu, the size of the stack is 8MB (limit -s). Why is it aborting after 1000? How can I check how much memory my OS allocates for an array?
Sorry if it's been asked before, I've been googling this but can't seem to find a specific explanation for this.

Accessing an invalid memory location leads to Undefined Behavior which means that anything can happen. It is not necessary for a segmentation-fault to occur.

...the size of the stack is 8MB (limit -s)...
variable int p[100]; is not at the stack but in data area because it is defined as global. It is not initialized so it is placed into BSS area and filled with zeros. You can check that printing array values just at the beginning of main() function.
As other said, using p[i] = 148; you produced undefined behaviour. Filling 1000 position you most probably reached end of BSS area and got segmentation fault.

It appear that you clearly get over the 100 elements defined (int p[100];) since you make a loop without any limitation (while (1)).
I would suggest to you to use a for loop instead:
for (i = 0; i < 100; i++) {
// do your stuff...
}
Regarding you more specific question about the memory, consider that any outside range request (in your situation over the 100 elements of the array) can produce an error. The fact that you notice it was 1000 in your situation can change depending on memory usage by other program.

It will fail once the CPU says
HEY! that's not Your memory, leave it!
The fact that the memory is not inside of the array does not mean that it's not for the application to manipulate.

The program aborts with a segmentation fault after writing 1000 positions of the array, instead of the 100.
You do not reason out Undefined Behavior. Its like asking If 1000 people are under a coconut tree, will 700 hundred of them always fall unconscious if a Coconut smacks each of their heads?

Purposely waste all of main memory to learn fragmentation

In my class we have an assignment and one of the questions states:
Memory fragmentation in C: Design, implement, and run a C-program that does the following: it allocated memory for a sequence of of 3m arrays of size 500000 elements each; then it deallocates all even-numbered arrays and allocates a sequence of m arrays of size 700000 elements each. Measure the amounts of time your program requires for the allocations of the first sequence and for the second sequence. Choose m so that you exhaust all of the main memory available to your program. Explain your timings
My implementation of this is as follows:
#include <iostream>
#include <time.h>
#include <algorithm>
void main(){
clock_t begin1, stop1, begin2, stop2;
double tdif = 0, tdif2 = 0;
for(int k=0;k<1000;k++){
double dif, dif2;
const int m = 50000;
begin1 = clock();
printf("Step One\n");
int *container[3*m];
for(int i=0;i<(3*m);i++)
{
int *tmpAry = (int *)malloc(500000*sizeof(int));
container[i] = tmpAry;
}
stop1 = clock();
printf("Step Two\n");
for(int i=0;i<(3*m);i+=2)
{
free(container[i]);
}
begin2 = clock();
printf("Step Three\n");
int *container2[m];
for(int i=0;i<m;i++)
{
int *tmpAry = (int *)malloc(700000*sizeof(int));
container2[i] = tmpAry;
}
stop2 = clock();
dif = (stop1 - begin1)/1000.00;
dif2 = (stop2 - begin2)/1000.00;
tdif+=dif;
tdif/=2;
tdif2+=dif2;
tdif2/=2;
}
printf("To Allocate the first array it took: %.5f\n",tdif);
printf("To Allocate the second array it took: %.5f\n",tdif2);
system("pause");
};
I have changed this up a few different ways, but the consistencies I see are that when I initially allocate the memory for 3*m*500000 element arrays it uses up all of the available main memory. But then when I tell it to free them the memory is not released back to the OS so then when it goes to allocate the m*700000 element arrays it does it in the page file (swap memory) so it does not actually display memory fragmentation.
The above code runs this 1000 times and averages it, it takes quite some time. The first sequence average took 2.06913 seconds and the second sequence took 0.67594 seconds. To me the second sequence is supposed to take longer to show how fragmentation works, but because of the swap being used this does not occur. Is there a way around this or am I wrong in my assumption?
I will ask the professor about what I have on monday but until then any help would be appreciated.

Many libc implementations (I think glibc included) don't release memory back to the OS when you call free(), but keep it so you can use it on the next allocation without a syscall. Also, because of the complexity of modern paging and virtual memory stratagies, you can never be sure where anything is in physical memory, which makes it almost imposible to intentionally fragment it (even if it comes fragmented). You have to remember, all virtual memory, and all physical memory are different beasts.
(The following is written for Linux, but probably applicable to Windows and OSX)
When your program makes the first allocations, let's say there is enough physical memory for the OS to squeeze all of the pages in. They aren't all next to each-other in physical memory -- they are scattered wherever they can be. Then the OS modifies the page table to make a set of continuous virtual addresses, that refer to the scattered pages around in memory. But here's the thing -- because you don't really use the first memory you allocate, it becomes a really good candidate for swapping out. So, when you come along to do the next allocations, the OS, running out of memory, will probably swap out some of those pages to make room for the new ones. Because of this, you are actually measuring disk speeds, and the efficiency of the operations systems paging mechanism -- not fragmentation.
Remember, an set of continuous virtual addresses is almost never physically continuous in practice (or even in memory).

Need help tracking down an illegal write. valgrind not noticing it

ive got a C program that gets caught in a for loop that it shouldn't, running it with
valgrind --tool=memcheck --leak-check=yes a.out
doesnt return anything even up until the program gets caught. is there a way to change the settings of valgrind to help me find the leak? as many have pointed out, it wouldnt be considered a leak, apologies
thanks in advance
here is the loop in question
int clockstate=0;
int clocklength=0;
int datalength=0;
int datastate=0;
int dataloc = 9;
((((some other code that i don't think is important to this part))))
int dataerr[13] = {0};
int clockerr[13] = {0}; // assumes that spill does not change within an event.
int spill=0;
int k = 0;
spill = Getspill(d+4*255+1); // get spill bit from around the middle
//printf("got spill: %d \n", spill); // third breakpoint
for( k = 0; k < 512; k++)
{
// Discardheader(d); // doesnt actually do anything, since it's a header.f
int databit = Getexpecteddata(d+4*k+1);
printf("%d ",k);
int transmitted = Datasample(&datastate, &datalength, d+4*k+2,dataerr,dataloc, databit);
printf("%d ",k);
Clocksample(&clockstate, &clocklength, d+4*k+3,clockerr, transmitted);
printf("%d \n",k);
// assuming only one error per event (implying the possibility of multi-error "errors"
// we construct the final error at the very end of the (outside this loop)
}
and the loop repeats after printing
254 254 254
255 255 255
256 256 1 <- this is the problem
2 2 2
3 3 3
edit** so i've tracked down where it is happening, and at one point in
void Clocksample (int* state, int* length, char *d, int *type, int transbit);
i have code that says *length = 1; so it seems that this command is somehow writing onto int k. my question now is, how did this happen, why isnt it changing length back to one like i want, and how do i fix it. if you want, i can post the whole code to Clocksample

Similar to last time, something in one of those functions, Clocksample() this time, is writing to memory that doesn't belong to the data/arrays that the function should be using. Most likely an out of bounds array write. Note: this is not a memory leak, which is allocating then losing track of memory blocks that should be freed.
Set a breakpoint at the call to Clocksample() for when k is 256. Then step into Clocksample(), keeping a watch on k (or the memory used by k). You can probably also just set a hardware memory write breakpoint on the memory allocated to k. How you do any of this depends on the debugger you're using.
Now single-step (or just run to the return of Clocksample() if you have a hardware breakpoint set) and when k changes, you'll have the culprit.

Please note that Valgrind is exceedingly weak when it comes to detecting stack buffer overflows (which is what appears to be happening here).
Google address-sanitizer is much better at detecting stack overflows, and I suggest you try it instead.

So your debugging output indicates that k is being changed during the call to your function Clocksample. I see that you are passing the addresses of at least two variables, &clockstate and &clocklength into that call. It seems quite likely to me that you have an array overrun or some other wild pointer in Clocksample that ends up overwriting the memory location where k is stored.
It might be possible to narrow down the bug if you post the code where k is declared (and whatever other variables are declared nearby in the same scope). For example if clocklength is declared right before k then you probably have a bug in using the pointer value &clocklength that leads to writing past the end of clocklength and corrupting k. But it's hard to know for sure without having the actual layout of variables you're using.
valgrind doesn't catch this because if, say, clocklength and k are right next to each other on the stack, valgrind can't tell if you have a perfectly valid access to k or a buggy access past the end of clocklength, since all it checks is what memory you actually access.

Frame Pointer / Program Counter / Array Overflow

I'm working on a practice problem set for C programming, and I've encountered this question. I'm not entirely sure what the question is asking for... given that xDEADBEEF is the halt instruction, but where do we inject deadbeef? why is the FP relevant in this question? thank you!
You’ve been assigned as the lead computer engineer on an interplanetary space mission to Jupiter. After several months in space, the ship’s main computer, a HAL9000, begins to malfunction and starts killing off the crew members. You’re the last crew member left alive and you need to trick the HAL 9000 computer into executing a HALT instruction. The good news is that you know that the machine code for a halt instruction is (in hexadecimal) xDEADBEEF (in decimal, this is -559,038,737). The bad news is that the only program that the HAL 9000 operating system is willing to actually run is chess. Fortunately, we have a detailed printout of the source code for the chess program (an excerpt of all the important parts is given below). Note that the getValues function reads a set of non-zero integers and places each number in sequence in the array x. The original author of the program obviously expected us to just provide two positive numbers, however there’s nothing in the program that would stop us from inputting three or more numbers. We also know that the stack will use memory locations between 8000 and 8999, and that the initial frame pointer value will be 8996.
void getValues(void) {
int x[2]; // array to hold input values
int k = 0;
int n;
n = readFromKeyboard(); // whatever you type on the keyboard is assigned to n
while (n != 0) {
x[k] = nextNumber;
k = k + 1;
n = readFromKeyboard();// whatever you type on the keyboard is assigned to n
}
/* the rest of this function is not relevant */
}
int main(void) {
int x;
getValues();
/* the rest of main is not relevant */
}
What sequence of numbers should you type on the keyboard to force the computer to execute a halt instruction?
SAMPLE Solution
One of the first three numbers should be -559038737. The fourth number must be the address of where 0xdeadbeef was placed into memory. Typical values for the 4th number are 8992 (0xdeadbeef is the second number) or 8991 (0xdeadbeef is first number).

What you want to do is overflow the input such that the program will return into a set of instructions you have overwritten at the return address.
The problem lies here:
int x[2]; // array to hold input values
By passing more than 3 values in, you can overwrite memory that you shouldn't. Explaining the sample example:
First input -559,038,737 puts xDEADBEEF in memory
Second input -559,038,737, why not.
Third number -559,038,737 can't hurt
Fourth number 8992 is the address we want the function to return into.
When the function call returns, it will return to the address overwrote the return address on the stack with (8992).
Here are some handy resources, and an excerpt:
The actual buffer-overflow hack work slike this:
Find code with overflow potential.
Put the code to be executed in the
buffer, i.e., on the stack.
Point the return address to the same code
you have just put on the stack.
Also a good book on the topic is "Hacking: The art of exploitation" if you like messing around with stacks and calling procedures.
In your case, it seems they are looking for you to encode your instructions in integers passed to the input.
An article on buffer overflowing

Hint: Read about buffer overflow exploits.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight