ARM PC overwritten with incorrect value in buffer overflow - arm

I am working on stack smashing on ARM and I have a buffer declared as:
char buff[12];
in my code.
In order to find the location where the PC gets overwritten in gdb I write
AAAABBBBCCCCDDDDEEEEFFFF to buff
I expected DDDD to overwrite FP(r11) as 0x44444444 (and it execute correct) but the PC was overwitten with 0x45454544 (instead of 0x45454545)
Does anyone have an idea why the last byte is D(44) instead of E(45)? I have tried longer input but the value in the PC always drops by a few bits.
Screenshot
of GDB output

The PC register cannot hold an odd value - the LSB are forced to 0 to ensure the address is aligned.

Related

How to read & write 1 byte into a particular memory address?

I'm trying to read a single byte from a particular memory address of a structure followed by writing that same byte into that same address. My goal is to load 64-byte memory associated with that memory address into the cache line.
I have a structure variable testStructure of size 12584. I tried the following code to read and write back 1 byte into the memory address,
unsigned char *p;
int forEachCacheLine = sizeof(testStructure);
printf("size of forEachCacheLine is %d\n", forEachCacheLine);
for (int i = 0; i<forEachCacheLine ; i+=64) {
printf("i is %d\n",i);
// read 1 byte
p=(unsigned char *)&testStructure+i;
printf("Read from %p byte is %hhx\n", &testStructure+i, p);
// write 1 byte
*(unsigned char *)(&testStructure+i)=p;
printf("Write into %p byte is %hhx\n\n", &testStructure+i, p);
}
upon running the code I get the following output:
size of forEachCacheLine is 12584
i is 0
Read from 0x7f5d42e71f80 byte is 80
Write into 0x7f5d42e71f80 byte is 80
i is 64
Read from 0x7f5d42f36980 byte is c0
Segmentation fault (core dumped)
As the output shows, the writing attempt in the first iteration was successful. However, the second iteration causes a Segfault. Any comments on why I'm getting this Segmentation fault?
I'm not quite sure if this is the correct approach to achieve my goal. But as of now, this is the only way I can think off. If this is an incorrect approach can someone please suggest an alternate approach?
While your for loop only seems to increment i by 64 bytes each iteration, your debug output shows that the second read is 805376 bytes beyond the first - which is out of bounds and likely causing the segfault. 805376 is 64 times 12584. What this means is that each iteration is incrementing p by 64 teststructure's instead of 64 chars.
Try replacing
p=(unsigned char *)&testStructure+i;
with
p=((unsigned char *)&testStructure)+i;
This ensures that &teststructure is a char* and not a teststructure* when we add i to it.
On most modern systems, there is little to no performance gain from explicitly caching memory like this. If you really want the memory in cache though - try using __builtin_prefetch() on each 64 bytes of teststructure. This is a GNU extension that populates a cache line. You should also pay attention to it's optional arguments.

x86 assembly function stack

i am honing my assembly for buffer overflow exploitation.
Machine specs : kali linux 32 bit running in virtual box.
i am running this code
#include <stdio.h>
getinput(){
char buffer[8]; //allocating 8 bytes
gets(buffer); //read input
puts(buffer); // print;
}
main() {
getInput();
return 0 ;
}
My understaning is that when the function getInput() is invoked the following happens :
1 - the address of the next instruction in main is pushed on the stack.
2 - ebp register is pushed on the stack.
3 - 8 bytes are allocated on the stack for the buffer.
That a total of 16 bytes.. but
As you can see in the image , just before reading the input in the getInput() function
it shows a total of 24 bytes of the stack.
specifically, i don't know why there is an extra 0x0000000 on the top of the stack
moreover, when i try to over-write the return address by inputing something like (ABCDABCDABCDABCD[desired address for target program]) it justs over-writes everything.
And if i try to input something like \xab\xab\xab\xab it gives a segementation fault , although this is only 4 bytes and should fit perfectly into the 8 bytes buffer.
Thank you in advance.
In real life buffer overflow attacks, you never know the size of the stack frame. If you discover a buffer overflow bug, it's up to you to determine the offset from the buffer start to the return address. Treat your toy example exactly like that.
That said, the structure of the stack frame can be driven by numerous considerations. The calling convention might call for specific stack alignment. The compiler might create invisible variables for its own internal bookkeeping, which may vary depending on compiler flags, such as the level of optimization. There might be some space for saved caller registers, the number of which is driven by the register usage of the function itself. There might even be a guard variable specifically to detect buffer overflows. In general, you can't deduce the stack frame structure from the source alone. Unless you wrote the compiler, that is.
After diassembling the getInput routin , it turned out that the extra 4 bytes came from the compiler pushing $ebx on the stack for some reason.
After testing with various payloads , it appeared that i was not considering the deceptive null byte that is added at the end of the string. so i only need to add one extra byte to mitigate the effect of the null byte.
The proper payload was : printf "AAAAAAAAAAAAA\xc9\x61\x55\x56" | ./test

Attempting to parse a WAV file, memcpy results are unexpected

Assume that I have a small WAV file I've opened and dumped as a array of char for processing.
Right now, I am attempting to memcpy the fmt chunk ID into a 4 byte buffer.
char fmt[4];
memcpy(fmt_chunk_id, raw_file + 12, sizeof(char) * 4);
From my understanding of memcpy, this will copy the 4 bytes starting at offset 12 into fmt. However, when I go to debug the program I get some very strange output:
It seems to copy the fmt section correctly, but now for some reason I have a bunch of garbage after it. Interestingly, this garbage comes before format at offset bytes 0 (RIFF), and 8 (WAVE). It is a little endian file (RIFF).
I can't for the life of me figure out why I'm getting data from the beginning of the buffer at the end of this given that I only copied 4 bytes worth of data (which should exactly fit the first 4 characters f m t and space).
What is going on here? The output seems to indicate to me I'm somehow over-reading memory somewhere - but if that was the case I'd expect garbage rather than the previous offset's data.
EDIT:
If it matters, the data type of raw_file is const char* const.
The debugger is showing you an area of memory which has been dynamically allocated on the stack.
What is in all probability happening is that you read data from the file, and even if you asked to read, say, 50 bytes, the underlying system might have decided to read more (1024, 2048, or 4096 bytes usually). So those bytes passed around in memory, likely some on the stack, and that stack is being reused by your function now. If you asked to read more than those four bytes, then this is even more likely to happen.
Then the debugger sees that you are pointing to a string, but in C strings run until they get terminated by a zero (ASCIIZ). So what you're shown is the first four bytes and everything else that followed, up to the first 0x00 byte.
If that's important to you, just
char fmt[5];
fmt[4] = 0;
// read four bytes into fmt.
Now the debugger will only show you the first four bytes.
But now you see why you should always scrub and overwrite sensitive information from a memory area before free()ing it -- the data might remain there and even be reused or dumped by accident.

Why doesn't scanf() generate memory clobbering errors when filling array.

Given an array with 5 elements, it is well known that if you use scanf() to read in exactly 5 elements, then scanf() will fill the array and then clobber memory by putting a null character '\0' into the 6th element without generating an error(Im calling it a 6th element but I know its memory thats not part of the array) As is described here: Null termination of char array
However when you try to read in 6 elements or more an error is generated because the OS detects that memory is being clobbered and the kernel sends a signal. Can someone clear up why an error is not generated in the first case of memory clobbering above?
Example code:
// ex1.c
#include <stdio.h>
int main(void){
char arr[5];
scanf("%s", arr);
printf("%s\n", arr);
return 0;
}
Compile, run and enter four characters: 1234. This stores them in the array correctly and doesn't clobber memory. No error here.
$ ./ex1
1234
1234
Run again and enter five characters. This will clobber memory because scanf() stored an extra '\0' null character in memory after the 5th element. No error is generated.
$ ./ex1
12345
12345
Now enter six characters which we expect to clobber memory. The error that is generated looks like(ie. Im guessing) its the result of a signal sent by the kernel saying that we just clobbered the stack(local memory) somehow....Why is an error being generated for this memory clobbering but not for the previous one above?
$ ./ex1
123456
123456
*** stack smashing detected ***: ./ex1 terminated
Aborted (core dumped)
This seems to happen no matter what size I make the array.
The behaviour is undefined if in both the cases where you input more than characters than the buffer can hold.
The stack smashing detection mechanism works by using canaries. When the canary value gets overwritten SIGABRT is generated. The reason why it doesn't get generated is probably because there's at least one extra byte of memory after the array (typically one-past-the-end of an object is required to be a valid pointer. But it can't be used to store to values -- legally).
In essence, the canary wasn't overwritten when you input 1 extra char but it does get overwritten when you input 2 bytes for one reason or another, triggering SIGABRT.
If you have some other variables after arr such as:
#include <stdio.h>
int main(void){
char arr[5];
char var[128];
scanf("%s", arr);
printf("%s\n", arr);
return 0;
}
Then the canary may not be overwritten when you input few more bytes as it might be simply overwriting var. Thus prolonging the buffer overflow detection by the compiler. This is a plausible explanation. But in any case, your program is invalid if it overruns buffer and you should not rely the stack smashing detection by the compiler to save you.
.Why is an error being generated for this memory clobbering but not for the previous one above?
Because for the 1st test it seemed to work just because of (bad) luck.
In both cases arr was accessed out-of-bounds and by doing so the code invoked undefined behaviour. This means the code might do what you expect or not or what ever, like booting the machine, formatting the disk ...
C does not test for memory access, but leaves this to the programmer. Who could have made the call to scanf() save by doing:
char arr[5];
scanf("%4s", arr); /* Stop scanning after 4th character. */
Stack Smashing here is actually caused due to a protection mechanism used by compiler to detect buffer overflow errors.The compiler adds protection variables (known as canaries) which have known values.
In your case when an input string of size greater than 5 causes corruption of this variable resulting in SIGABRT to terminate the program.
You can read more about buffer overflow protection. But as #alk answered you are invoking Undefined Behavior
Actually
If we declare a array of size 5, then also rather we can put and access data from this array as memory beyond this array is empty and we can do the same till this memory is free but once it assigned to another program now even we are unable to acces a data present there

Format String Vulnerability troubles

So I have this function:
void print_usage(char* arg)
{
char buffer[640];
sprintf(buffer, "Usage: %s [options]\n"
"Randomly generates a password, optionally writes it to /etc/shadow\n"
"\n"
"Options:\n"
"-s, --salt <salt> Specify custom salt, default is random\n"
"-e, --seed [file] Specify custom seed from file, default is from stdin\n"
"-t, --type <type> Specify different encryption method\n"
"-v, --version Show version\n"
"-h, --help Show this usage message\n"
"\n"
"Encryption types:\n"
" 0 - DES (default)\n"
" 1 - MD5\n"
" 2 - Blowfish\n"
" 3 - SHA-256\n"
" 4 - SHA-512\n", arg);
printf(buffer);
}
I wish to utilize a format string vulnerability attack (my assignment). Here is my attempt:
I have an exploit program which fills a buffer with noops and shell code (I have used this program to buffer overflow the same function, so I know its good). Now, I did an object dump of the file to find the .dtors_list address and I got 0x0804a20c, adding 4 bytes to get the end I get 0x804a210.
Next I used gdb to find at what address my noops begin while running my program. Using this I got 0xffbfdbb8.
So up to this point I feel like I'm correct, now I know I want to use format string to copy the noop address into my .dtors_end address. Here is the string I came up with (this is the string I'm providing as user input to the function):
"\x10\xa2\x04\x08\x11\xa2\x04\x08\x12\xa2\x04\x08\x13\xa2\x04\x08%%.168u%%1$n%%.51u%%2$n%%.228u%%3$n%%.64u%%4$n"
This doesn't work for me. The program runs normally and the %s is replaced with the string I input (minus the little endian memory address at the front, and the two percent signs are now one percent sign for some reason).
Anyways, I'm kind of stumped here, any help would be appreciated.
Disclaimer: I'm no expert.
You're passing "\x10\xa2\x04\x08\x11\xa2\x04\x08\x12\xa2\x04\x08\x13\xa2\x04\x08%%.168u%%1$n%%.51u%%2$n%%.228u%%3$n%%.64u%%4$n" as the value of arg? That means that buffer will contain
"Usage:\x20\x10\xa2\x04\x08\x11\xa2\x04\x08\x12\xa2\x04\x08\x13\xa2\x04\x08%.168u%1$n%.51u%2$n%.228u%3$n%.64u%4$n [options]\x0aRandomly..."
Now let's further assume that you're on an x86-32 target (if you're on x86-64, this won't work), and that you're compiling with an optimization level that doesn't put anything in print_usage's stack frame except for the 640-byte buffer array.
Then printf(buffer) will do the following things, in order:
Push the 4-byte address &buffer.
Push a 4-byte return address.
Invoke printf...
Print out "Usage:\x20\x10\xa2\x04\x08\x11\xa2\x04\x08\x12\xa2\x04\x08\x13\xa2\x04\x08" (a sequence of 23 bytes).
%.168u: Interpret the next argument to printf as an unsigned int and print it in a field of width 168. Since printf has no next argument, this is actually going to print the next thing on the stack; that is, the first four bytes of buffer; that is, "Usag" (0x67617355).
%1$n: Interpret the second argument to printf as a pointer to int and store 23+168 at that location. This stores 0x000000bf in location 0x67617355. So this is your main problem: You should have used %2$n instead of %1$n and added one junk byte to the front of your arg. (Incidentally, notice that GNU says "If any of the formats has a specification for the parameter position all of them in the format string shall have one. Otherwise the behavior is undefined." So you should go through and add 1$s to all your %us just to be on the safe side.)
%.51u: Print another 51 bytes of garbage.
%2$n: Interpret the third argument to printf as a pointer to int and store 0x000000f2 in that garbage location. As above, this should have been %3$n.
... etc. etc. ...
So, your major bug here is that you forgot to account for the "Usage: " prefix.
I assume you were trying to store the four bytes 0xffbfdbb8 into address 0x804a210. Let's say you'd gotten that to work. But then what would your next step be? How do you get the program to treat the four-byte quantity at 0x804a210 as a function pointer and jump through it?
The traditional way to exploit this code would be to exploit the buffer overflow in sprintf, rather than the more complicated "%n" vulnerability in printf. You just need to make your arg roughly 640 characters long and make sure that the 4 bytes of it that correspond to print_usage's return address contain the address of your NOP sled.
Even that part is tricky, though. You might conceivably be running into something related to ASLR: just because your sled exists at address 0xffbfdbb8 in one run doesn't mean it'll exist at that same address in the next run.
Does this help?

Resources