I am honing my assembly skills for buffer overflow exploitation.
Machine specs: Kali Linux 32-bit running in VirtualBox.
I am running this code:
#include <stdio.h>
void getInput(void) {
    char buffer[8]; // allocate 8 bytes on the stack
    gets(buffer);   // read input; gets() performs no bounds check
    puts(buffer);   // print it back
}
int main(void) {
    getInput();
    return 0;
}
My understanding is that when the function getInput() is invoked, the following happens:
1 - the address of the next instruction in main is pushed on the stack.
2 - ebp register is pushed on the stack.
3 - 8 bytes are allocated on the stack for the buffer.
That is a total of 16 bytes, but
as you can see in the image, just before reading the input in the getInput() function,
it shows a total of 24 bytes on the stack.
Specifically, I don't know why there is an extra 0x0000000 at the top of the stack.
Moreover, when I try to overwrite the return address by entering something like (ABCDABCDABCDABCD[desired address for target program]), it just overwrites everything.
And if I try to enter something like \xab\xab\xab\xab, it gives a segmentation fault, although this is only 4 bytes and should fit perfectly into the 8-byte buffer.
Thank you in advance.
In real life buffer overflow attacks, you never know the size of the stack frame. If you discover a buffer overflow bug, it's up to you to determine the offset from the buffer start to the return address. Treat your toy example exactly like that.
That said, the structure of the stack frame can be driven by numerous considerations. The calling convention might call for specific stack alignment. The compiler might create invisible variables for its own internal bookkeeping, which may vary depending on compiler flags, such as the level of optimization. There might be some space for saved caller registers, the number of which is driven by the register usage of the function itself. There might even be a guard variable specifically to detect buffer overflows. In general, you can't deduce the stack frame structure from the source alone. Unless you wrote the compiler, that is.
After disassembling the getInput routine, it turned out that the extra 4 bytes came from the compiler pushing $ebx on the stack for some reason.
After testing with various payloads, it appeared that I was not accounting for the null byte that is appended at the end of the string, so I only needed to add one extra byte to compensate for it.
The proper payload was : printf "AAAAAAAAAAAAA\xc9\x61\x55\x56" | ./test
Related
Assume that I have a small WAV file I've opened and dumped as a array of char for processing.
Right now, I am attempting to memcpy the fmt chunk ID into a 4 byte buffer.
char fmt[4];
memcpy(fmt, raw_file + 12, sizeof(char) * 4);
From my understanding of memcpy, this will copy the 4 bytes starting at offset 12 into fmt. However, when I go to debug the program I get some very strange output:
It seems to copy the fmt section correctly, but now for some reason I have a bunch of garbage after it. Interestingly, this garbage comes before format at offset bytes 0 (RIFF), and 8 (WAVE). It is a little endian file (RIFF).
I can't for the life of me figure out why I'm getting data from the beginning of the buffer at the end of this given that I only copied 4 bytes worth of data (which should exactly fit the first 4 characters f m t and space).
What is going on here? The output seems to indicate to me I'm somehow over-reading memory somewhere - but if that was the case I'd expect garbage rather than the previous offset's data.
EDIT:
If it matters, the data type of raw_file is const char* const.
The debugger is showing you an area of memory which has been dynamically allocated on the stack.
What is in all probability happening is that you read data from the file, and even if you asked to read, say, 50 bytes, the underlying system might have decided to read more (1024, 2048, or 4096 bytes usually). So those bytes passed around in memory, likely some on the stack, and that stack is being reused by your function now. If you asked to read more than those four bytes, then this is even more likely to happen.
Then the debugger sees that you are pointing to a string, but in C strings run until they get terminated by a zero (ASCIIZ). So what you're shown is the first four bytes and everything else that followed, up to the first 0x00 byte.
If that's important to you, just
char fmt[5];
fmt[4] = 0;
// read four bytes into fmt.
Now the debugger will only show you the first four bytes.
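The suggestion above can be sketched as a small helper. The name copy_chunk_id and its parameters are made up for this illustration; the only point is that the destination holds 5 bytes so the 4 copied bytes can be followed by a terminator.

```c
#include <stddef.h>
#include <string.h>

/* Copy a 4-byte chunk ID starting at data + offset into out and
   terminate it. out must have room for 5 bytes: 4 ID bytes plus '\0'.
   (copy_chunk_id is a hypothetical helper, not part of any library.) */
void copy_chunk_id(const char *data, size_t offset, char out[5])
{
    memcpy(out, data + offset, 4);  /* exactly the 4 ID bytes */
    out[4] = '\0';                  /* so string viewers stop here */
}
```

With a buffer that starts with a RIFF header, calling copy_chunk_id(raw_file, 12, fmt) with a 5-byte fmt should leave just the chunk ID in fmt, with nothing trailing after it when it is viewed as a string.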
But now you see why you should always scrub and overwrite sensitive information from a memory area before free()ing it -- the data might remain there and even be reused or dumped by accident.
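A minimal sketch of such scrubbing, assuming a hand-written helper (the name scrub is made up here). Writing through a volatile pointer is a common way to discourage the compiler from dropping the stores as dead code, but it is not a hard guarantee; where available, platform facilities such as explicit_bzero (glibc, BSD) or the optional C11 memset_s are usually preferable.

```c
#include <stddef.h>

/* Overwrite len bytes at ptr with zeros before the memory is released.
   The volatile qualifier is there so that the stores are less likely
   to be optimized away; this is a sketch, not a hardened routine. */
void scrub(void *ptr, size_t len)
{
    volatile unsigned char *p = ptr;

    while (len--) {
        *p++ = 0;
    }
}
```

Typical use would be scrub(password, sizeof password); immediately before the buffer goes out of scope or is passed to free().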
Given an array with 5 elements, it is well known that if you use scanf() to read in exactly 5 elements, then scanf() will fill the array and then clobber memory by putting a null character '\0' into the 6th element without generating an error (I'm calling it a 6th element, but I know it's memory that's not part of the array), as is described here: Null termination of char array
However when you try to read in 6 elements or more an error is generated because the OS detects that memory is being clobbered and the kernel sends a signal. Can someone clear up why an error is not generated in the first case of memory clobbering above?
Example code:
// ex1.c
#include <stdio.h>
int main(void){
char arr[5];
scanf("%s", arr);
printf("%s\n", arr);
return 0;
}
Compile, run and enter four characters: 1234. This stores them in the array correctly and doesn't clobber memory. No error here.
$ ./ex1
1234
1234
Run again and enter five characters. This will clobber memory because scanf() stored an extra '\0' null character in memory after the 5th element. No error is generated.
$ ./ex1
12345
12345
Now enter six characters, which we expect to clobber memory. The error that is generated looks like (I'm guessing) the result of a signal sent by the kernel saying that we just clobbered the stack (local memory) somehow. Why is an error being generated for this memory clobbering but not for the previous one above?
$ ./ex1
123456
123456
*** stack smashing detected ***: ./ex1 terminated
Aborted (core dumped)
This seems to happen no matter what size I make the array.
The behaviour is undefined in both cases where you input more characters than the buffer can hold.
The stack smashing detection mechanism works by using canaries. When the canary value gets overwritten, SIGABRT is generated. The reason why it doesn't get generated in the first case is probably that there's at least one extra byte of memory after the array (typically, one-past-the-end of an object is required to be a valid pointer, but it can't legally be used to store values).
In essence, the canary wasn't overwritten when you input 1 extra char, but it does get overwritten when you input 2 extra bytes for one reason or another, triggering SIGABRT.
If you have some other variables after arr such as:
#include <stdio.h>
int main(void){
char arr[5];
char var[128];
scanf("%s", arr);
printf("%s\n", arr);
return 0;
}
Then the canary may not be overwritten when you input a few more bytes, as the input might simply be overwriting var, thus delaying the buffer overflow detection by the compiler. This is a plausible explanation. But in any case, your program is invalid if it overruns a buffer, and you should not rely on the compiler's stack smashing detection to save you.
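The canary idea can be illustrated with a simulation that stays within defined behaviour. Everything about the layout below is an assumption made for the sketch (a 5-byte buffer, one byte of padding, then a 4-byte canary with made-up contents); it is not how any particular compiler actually arranges a frame.

```c
#include <stddef.h>
#include <string.h>

enum { BUF_SIZE = 5, PAD_SIZE = 1, CANARY_SIZE = 4 };  /* hypothetical layout */

static const unsigned char CANARY[CANARY_SIZE] = { 0xDE, 0xAD, 0xBE, 0xEF };

/* Returns 1 if storing input plus its '\0' at the start of a simulated
   frame leaves the canary intact, 0 if the canary was overwritten. */
int canary_survives(const char *input)
{
    unsigned char frame[BUF_SIZE + PAD_SIZE + CANARY_SIZE];
    size_t n = strlen(input) + 1;          /* "%s" also stores a '\0' */

    if (n > sizeof frame) {
        n = sizeof frame;                  /* keep the simulation in bounds */
    }
    memset(frame, 0, sizeof frame);
    memcpy(frame + BUF_SIZE + PAD_SIZE, CANARY, CANARY_SIZE);
    memcpy(frame, input, n);               /* the simulated overflowing write */

    return memcmp(frame + BUF_SIZE + PAD_SIZE, CANARY, CANARY_SIZE) == 0;
}
```

Under these assumed sizes, "12345" only spills its terminator into the padding and goes unnoticed, while "123456" reaches the first canary byte and would be detected.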
Why is an error being generated for this memory clobbering but not for the previous one above?
Because for the 1st test it seemed to work just because of (bad) luck.
In both cases arr was accessed out-of-bounds and by doing so the code invoked undefined behaviour. This means the code might do what you expect or not or what ever, like booting the machine, formatting the disk ...
C does not check memory accesses but leaves this to the programmer, who could have made the call to scanf() safe by doing:
char arr[5];
scanf("%4s", arr); /* Stop scanning after 4th character. */
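To see the effect of the field width without typing at a terminal, the same conversion can be tried with sscanf on a fixed string. The helper name read_word4 is made up for this sketch.

```c
#include <stdio.h>

/* Read at most 4 non-whitespace characters from input into arr.
   arr has 5 elements: 4 characters plus the terminating '\0'.
   Returns 1 if a word was read, 0 otherwise. */
int read_word4(const char *input, char arr[5])
{
    return sscanf(input, "%4s", arr) == 1;
}
```

Given the input "123456", arr ends up holding "1234"; the remaining characters are simply left unread instead of being written past the array.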
Stack smashing here is actually reported by a protection mechanism that the compiler uses to detect buffer overflow errors. The compiler adds protection variables (known as canaries) which have known values.
In your case, an input string of size greater than 5 corrupts this variable, resulting in SIGABRT to terminate the program.
You can read more about buffer overflow protection. But as @alk answered, you are invoking undefined behavior.
Actually, if we declare an array of size 5, we can still store and access data beyond the array as long as the memory past it is unused. We can keep doing so while that memory stays free, but once it is assigned to another program we can no longer access the data there.
I am using gcc on Linux, and the code below compiles successfully but does not print the values of the variable i correctly: if a character is entered one at a time, i jumps or drops to 0. I know I am using %d for a char in scanf (I was trying to erase the stack). Is this a case of an attempt to erase the stack or something else? (I thought that if the stack was erased, the program would crash.)
#include <stdio.h>
int main()
{
int i;
char c;
for (i=0; i<5; i++) {
scanf ("%d", &c);
printf ("%d ", i);
}
return 0;
}
Besides the arguments to main, you have an int and a char on the stack.
Let's assume sizeof(int) == 4 and only have a look at i and c.
( int i )(char c )
[ 0 ][ 1 ][ 2 ][ 3 (&i)][ 4 (&c)]
So this is actually your stack layout without argc and *argv.
With i consuming four times more memory than c in this case.
The stack grows in the opposite direction, so if you write something bigger than a char to c, it will write to [4] and further to the left, never to the right. So [5] will never get written to. Instead you overwrite [3].
For the case where you write an int to c and int is four times bigger than c, you'll actually write to [1][2][3][4], just [0] will not be overwritten, but 3/4 of the memory for the int will be corrupted.
On a big-endian system, the most significant byte of i will be stored in [3] and therefore get overwritten by this operation. On a little-endian system, the most significant byte is stored in [0] and would be preserved. Nonetheless, you corrupt your stack this way.
As ams mentions this is not always true. There could be different alignments for efficiency or because the platform only supports aligned access, leaving gaps between variables. Also a compiler is allowed to do any optimizations as long as it has no visible side-effects as stated by the as-if rule. In this case the variables could perfectly be stored in a register and never be saved on the stack at all. But a lot of other compiler optimizations and platform dependencies can make this way more complex.
So this is only the simplest case without taking platform dependencies and compiler optimizations into account and also seems to be what happens in your special case with maybe some minor differences.
With your scanf(), you are inserting a int inside a char. One char is usually stored using just 1 byte, so your compiler will probably overflow, but it could or not overwrite other values, depending on the alignment of the variables.
If your compiler reserves 1 byte for the char, and the memory address of the int is just after the address of the char (that will probably be the case), then your scanf() will just overwrite the first bytes of i. If you are in a little-endian machine and you enter values smaller than 256, then i will always be 0.
But it can grow larger if you enter a bigger value. Try entering 256; i will become 1. With 512, i will become 2, and so on.
But you are not "erasing the stack", just overwriting some bytes (in fact, you are overwriting sizeof(int) bytes; one of them correspond to the char and the others will probably be all the bytes in your int but one).
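The arithmetic above can be checked with a simulation that avoids the undefined behaviour itself. The layout is an assumption taken from this answer (c at the lowest address, i immediately after it), and the helper name i_after_store is made up; a real compiler is free to arrange things differently.

```c
#include <string.h>

/* Simulated memory, lowest address first: [ c ][ i, sizeof(int) bytes ].
   Returns the value i would hold after an int-sized store of value
   at the address of c, given that i held old_i before. */
int i_after_store(int old_i, int value)
{
    unsigned char mem[1 + sizeof(int)];
    int new_i;

    mem[0] = 0;                             /* c */
    memcpy(mem + 1, &old_i, sizeof old_i);  /* i sits right after c */

    memcpy(mem, &value, sizeof value);      /* int-sized store starting at c */

    memcpy(&new_i, mem + 1, sizeof new_i);  /* read i back */
    return new_i;
}
```

On a little-endian machine with a 4-byte int, i_after_store(3, 65) gives 0, i_after_store(3, 256) gives 1, and i_after_store(3, 512) gives 2, matching the description. On a big-endian machine the results would differ.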
If you really want to "erase the stack", you could do something like:
#include <string.h>
int
main(void) {
char c;
memset(&c, 0, 10000);
return 0;
}
@foobar has given a very nice description of what one compiler on one architecture happens to do. He(?) also gives an answer to what a hypothetical big-endian compiler/system might do.
To be fair, that's probably the most useful answer, but it's still not correct.
There is no rule that says the stack must grow in one way or the other (although, much like whether to drive on the left or the right, a choice must be made, and a descending stack is most common).
There is no rule that says the compiler must lay out the variables on the stack in any particular way. The C standard doesn't care. Not even the official architecture/OS-specific ABI cares a jot about that, as long as the bits necessary for unwinding work. In this case, the compiler could even choose a different scheme for every function in the system (not that it's likely).
All that is certain is that scanf will try to write something int-sized to a char, and that this is undefined. In practice there are several possible outcomes:
It works fine, nothing extra is overwritten, and you got lucky. (Perhaps int is the same size as char, or there was padding in the stack.)
It overwrites the data following the char in memory, whatever that is.
It overwrites the data just before and/or after the char. (This might happen on an architecture where an aligned store instruction disregards the bottom bits of the write address.)
It crashes with an unaligned access exception.
It detects the stack scribble, prints a message, and reports the incident.
Of course, none of this will happen because you compile with -Wall -Wextra -Werror enabled, and GCC tells you that your scanf format doesn't match your variable types.
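For comparison, a well-defined version makes the conversion specifier match the variable's type, for example %c for a char. The sketch below uses sscanf on a string so it can be run without interactive input; the helper name count_chars is made up.

```c
#include <stdio.h>

/* Read up to max non-whitespace characters from input, one per
   conversion, and return how many were read. " %c" skips leading
   whitespace and stores exactly one char; %n records how many
   input characters the call consumed so the next call can continue. */
int count_chars(const char *input, int max)
{
    char c;
    int consumed;
    int i = 0;

    while (i < max && sscanf(input, " %c%n", &c, &consumed) == 1) {
        input += consumed;
        i++;
    }
    return i;
}
```

Because each store is exactly one char wide, the loop counter is never touched by the read, and the loop runs the expected number of times.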
As Kerrek SB commented, the behavior is undefined.
As you know, you pass a char * to the scanf function, but tell the function to treat it like an int *.
It might (although very unlikely to) overwrite something else on the stack, for example, i, the previous stack pointer or the return address.
It might just override unused bytes, for example if the compiler uses padding to align the stack.
It might cause a crash, for example if the address of c is not 4- or 8-byte aligned and the platform requires de-reference of int to be 4- or 8-byte aligned.
And it might do anything else.
But the answer is still - anything is possible in such case. The behavior is simply not defined.
Let me preface this by saying that I am a newbie, and I'm in an entry-level C class at school.
I'm writing a program that requires me to use malloc, and malloc is allocating 8x the space I expect it to in all cases. Even when I just call malloc(1), it is allocating 8 bytes instead of 1, and I am confused as to why.
Here is the code I tested with. This should only allow one character to be entered plus the terminating null character. Instead I can enter 8, so it is allocating 8 bytes instead of 1; this is the case even if I just use an integer in malloc(). Please ignore the x variable; it is used in the actual program, but not in this test:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main (int argc ,char* argv[]){
int x = 0;
char *A = NULL;
A=(char*)malloc(sizeof(char)+1);
scanf("%s",A);
printf("%s", A);
free(A);
return 0;
}
A=(char*)malloc(sizeof(char)+1);
is going to allocate at least 2 bytes (sizeof(char) is always 1).
I don't understand how you are determining that it is allocating 8 bytes, however malloc is allowed to allocate more memory than you ask for, just never less.
The fact that you can use scanf to write a longer string to the memory pointed to by A does not mean that you have that memory allocated. It will overwrite whatever is there, which may result in your program crashing or producing unexpected results.
malloc is allocating as much memory as you asked for.
If you can read more than the allocated bytes (using scanf), it's because scanf is also writing past the memory you own: it's a buffer overflow.
You should limit the data scanf can read this way:
scanf( "%10s", ... ); // scanf will read a string no longer than 10
Im writing a program that required me to use malloc and malloc is allocating 8x the space i expect it to in all cases. Even when just to malloc(1), it is allocation 8 bytes instead of 1, and i am confused as to why.
Theoretically speaking, the way you do things in the program, is not allocating 8 bytes.
You can still type in 8 bytes (or any number of bytes) because in C there is no check, that you are still using a valid place to write.
What you see is Undefined Behaviour, and the reason for that is that you write in memory that you shouldn't. There is nothing in your code that stops the program after n byte(s) you allocated have been used.
You might get Seg Fault now, or later, or never. This is Undefined Behaviour. Just because it appears to work, does not mean it is right.
Now, Your program could indeed allocate 8 bytes instead of 1.
The reason for that is alignment.
The same program might allocate a different size in a different machine and/or a different Operating System.
Also, since you are using C you don't really need to cast. See this for a start.
In your code, there is no limit on how much data you can load in with scanf, leading to a buffer overflow (security flaw/crash). You should use a format string that limits the amount of data read in to the one or two bytes that you allocate. The malloc function will probably allocate some extra space to round the size up, but you should not rely on that.
malloc is allowed to allocate more memory than you ask for. It's only required to provide at least as much as you ask for, or fail if it can't.
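On systems that use glibc, one way to observe this is malloc_usable_size, which reports how many bytes are actually usable in a block. This sketch assumes glibc's malloc.h; other C libraries may not provide the function, and the wrapper name usable_bytes is made up.

```c
#include <malloc.h>   /* glibc-specific: declares malloc_usable_size */
#include <stdlib.h>

/* Allocate requested bytes and report how many bytes the allocator
   says are usable in the resulting block. Returns 0 if malloc fails. */
size_t usable_bytes(size_t requested)
{
    void *p = malloc(requested);
    size_t usable;

    if (p == NULL) {
        return 0;
    }
    usable = malloc_usable_size(p);
    free(p);
    return usable;
}
```

The reported value is at least the requested size and is often larger, but the exact number is an implementation detail; a program should still treat only the bytes it asked for as its own.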
Using malloc or creating a buffer on the stack will allocate memory in words.
On a 32-bit system the word size is 4 bytes, so when you ask for
A=(char*)malloc(sizeof(char)+1);
(which is essentially A=(char*)malloc(2);)
the system will actually give you 4 bytes. On a 64-bit machine you should get 8 bytes.
The way you use scanf there is dangerous: it will overflow the buffer if the string is longer than the allocated size, leaving a heap overflow vulnerability in your program. scanf in this case will attempt to stuff a string of any length into that memory, so using it to measure the allocated size will not work.
What system are you running on? If it's 64 bit, it is possible that the system is allocating the smallest possible unit that it can. 64 bits being 8 bytes.
EDIT: Just a note of interest:
char *s = malloc (1);
Causes 16 bytes to be allocated on iOS 4.2 (Xcode 3.2.5).
If you enter 8, it will just allocate 2 bytes, since sizeof(char) == 1 (unless you are on some obscure platform), and you will write your number to that char. Then printf will output the number you stored in there. So if you store the number 8, it'll display 8 on the command line. It has nothing to do with the count of chars allocated.
Unless of course you looked up in a debugger or somewhere else that it is really allocating 8 bytes.
scanf has no idea how big the target buffer actually is. All it knows is the starting address of the buffer. C does no bounds checking, so if you pass it the address of a buffer sized to hold 2 characters, and you enter a string that's 10 characters long, scanf will write those extra 8 characters to the memory following the end of the buffer. This is called a buffer overrun, which is a common malware exploit. For whatever reason, the six bytes immediately following your buffer aren't "important", so you can enter up to 8 characters with no apparent ill effects.
You can limit the number of characters read in a scanf call by including an explicit field width in the conversion specifier:
scanf("%2s", A);
but it's still up to you to make sure that the target buffer is large enough to accommodate that width. Unfortunately, there's no way to specify the field width dynamically as there is with printf:
printf("%*s", fieldWidth, string);
because %*s means something completely different in scanf (basically, skip over the next string).
You could use sprintf to build your format string:
sprintf(format, "%%%ds", max_bytes_in_A);
scanf(format, A);
but you have to make sure the buffer format is wide enough to hold the result, etc., etc., etc.
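A slightly fuller sketch of that approach, using snprintf so the format buffer itself cannot be overrun. The helper name read_bounded and the 32-byte format buffer are choices made for this illustration, and sscanf stands in for scanf so the example runs without interactive input.

```c
#include <stdio.h>

/* Read one whitespace-delimited word from input into dst, never
   storing more than dst_size bytes including the '\0'.
   Returns 1 if a word was read, 0 otherwise. */
int read_bounded(const char *input, char *dst, size_t dst_size)
{
    char format[32];  /* ample for "%" + digits of a size_t + "s" */

    if (dst_size < 2) {
        return 0;     /* need room for one character plus '\0' */
    }
    /* The width is dst_size - 1 because "%Ns" stores N chars plus '\0'. */
    snprintf(format, sizeof format, "%%%zus", dst_size - 1);
    return sscanf(input, format, dst) == 1;
}
```

With a 4-byte destination and the input "abcdefgh", the generated format is "%3s" and dst receives "abc".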
This is why I usually recommend fgets() for interactive input.
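A minimal sketch of the fgets approach, wrapped in a hypothetical helper named read_line. fgets is told the buffer size directly, so it stops before overrunning the buffer; the helper then strips the trailing newline if one was read.

```c
#include <stdio.h>
#include <string.h>

/* Read one line from in into buf, storing at most size bytes
   including the '\0'. Longer lines are truncated. Returns 1 if
   something was read, 0 on end of file or error. */
int read_line(FILE *in, char *buf, size_t size)
{
    if (size == 0 || fgets(buf, (int)size, in) == NULL) {
        return 0;
    }
    buf[strcspn(buf, "\n")] = '\0';  /* drop the newline, if present */
    return 1;
}
```

For interactive input this would be called as read_line(stdin, A, size), where size is whatever was passed to malloc. Any part of the line that did not fit stays in the stream for the next read.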
I might be stupid and you need to excuse me in that case...but I don't get this.
I'm allocating a buffer of 16 chars and then (in a for loop) put in 23(!?) random characters and then printing that stuff out.
What I don't get is how I can put 23 chars into a buffer that is malloc'ed as 16 chars...When I change the loop to 24 characters I get an error though(at least in Linux with gcc)...but why not "earlier" (17 characters should break it...no?)
This is my example code:
#include <stdio.h>
#include <stdlib.h>
int main()
{
int n;
char *buf;
buf = malloc(16 * sizeof(*buf));
if(buf == NULL) exit(1);
for(n = 0; n < 22; n++)
{
buf[n] = rand()%26+'a';
}
buf[n]='\0';
printf("Random string: %s\n", buf);
free(buf);
buf = NULL;
getchar();
return 0;
}
You are producing an error, but like many bugs it just happens to not be noticed. That's all there is to it.
It might be for one of several reasons, most probably that, because of the way the free store is structured, there's slack space between allocations: the system needs (or wants) to keep addresses of allocatable blocks aligned on certain boundaries. So writes a little past your allocated block don't interfere with the free store's data structures, but a little bit further and they do.
It is also quite possible that your bug did corrupt something the free store manager was using, but it just happened to not be actually used in your simple program so the error wasn't noticed (yet).
Most memory allocation strategies round your malloc request up to some quantization value. Often 16 or 32 bytes. That quantization usually happens after the allocator has added in its overhead (used to keep track of the allocated blocks), so it's common to find that you can overrun a malloc by some small number of bytes without doing any actual damage to the heap, especially when the allocation size is odd.
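The rounding described above can be sketched as plain arithmetic. The overhead and quantum values are purely illustrative parameters, not the figures used by any specific allocator, and both function names are made up.

```c
#include <stddef.h>

/* Size of the block an allocator would hand out if it adds overhead
   bytes of bookkeeping to the request and then rounds the total up
   to a multiple of quantum (quantum must be non-zero). */
size_t rounded_block_size(size_t request, size_t overhead, size_t quantum)
{
    size_t total = request + overhead;

    return (total + quantum - 1) / quantum * quantum;
}

/* Bytes past the requested size that still lie inside the same block. */
size_t slack_bytes(size_t request, size_t overhead, size_t quantum)
{
    return rounded_block_size(request, overhead, quantum) - overhead - request;
}
```

For example, assuming 8 bytes of overhead and a 16-byte quantum, a 16-byte request would occupy a 32-byte block and leave 8 bytes of slack, which is the kind of margin that lets a small overrun go unnoticed.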
Of course, this isn't something that you want to depend on; it's an implementation detail of the C runtime library, and subject to change without notice.
The behaviour when you overrun a buffer in C is undefined, so anything may happen, including nothing. Specifically, C is not required to perform bounds checking and is intentionally designed not to.
If you get any runtime error at all, it will generally be because it has been detected and trapped by the operating system, not by the C runtime. And that will only occur if the access encroaches upon memory not mapped to the process. More often it will simply access memory belonging to your process but which may be in use by other memory objects; the behaviour of your program will then depend on what your code does with the invalid data.
In C you will get away with these kinds of things. Sometime later other parts of the programs may come in and overwrite the area you are not supposed to use. So it is better not to test these things.