I am trying to work through an example from Smashing the Stack for Fun and Profit in C, but I am stuck at one point.
Here is the code (I have a 64-bit machine running 64-bit Ubuntu):
int main()
{
int x;
x = 0;
func(1,2,3);
x = 1;
printf("x is : %d\n", x);
}
void func(int a, int b, int c)
{
char buffer[1];
int *ret;
ret = buffer + 17;
(*ret) += 7;
}
The above code works: on return from func, the x = 1 line is not executed. But I can't understand the logic behind ret = buffer + 17;. Shouldn't it be ret = buffer + 16;, i.e. 8 bytes for buffer and 8 for the saved base pointer on the stack?
Secondly, my understanding is that char buffer[1] takes 8 bytes (owing to the 64-bit architecture), so if I increase the buffer to, say, buffer[2], the same code should still work. But that is not what happens: it starts giving a segfault.
Regards,
Numan
'char' on every architecture I've used is 8 bits wide, irrespective of whether it's an 8-bit micro, a 16-bit micro, a 32-bit PC, or a new 64-bit PC. int, on the other hand, tends to be the word size (although on 64-bit Linux int is still 4 bytes, while pointers are 8).
The order in which the locals are put on the stack is implementation-specific. My guess is that your compiler is putting "int *ret" on the stack before "char buffer[1]". So, to get to the return address, we have to go past "char buffer[1]" (1 byte), "int *ret" (8 bytes), and the saved base pointer (8 bytes), for a total of 17 bytes.
Here's a description of the stack frame on x86 64-bit:
http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-035-computer-language-engineering-spring-2010/projects/x86-64
Step through the disassembly in gdb (disassemble, stepi, nexti) and look at the registers at each step (info registers).
Here is how you can step through the disassembly:
gdb ./myprogram
break main
run
display/4i $rip
stepi
stepi
...
info registers
...
You should also know (you probably already do given that you got part of it working) that on many distros, the stack protector is enabled by default in gcc. You can manually disable it with -fno-stack-protector.
With a lot of this stack-smashing stuff, your best friend is gdb. Since you're already segfaulting, you're already writing to memory you're not supposed to touch (a good sign). A more effective approach is to change the return address to some other valid address (e.g. func's address or some shellcode you've got). A great resource I'd recommend is the Shellcoder's Handbook, but since you're on a 64-bit architecture, a lot of the examples need a bit of work to get going.
Aside from (or better yet, in addition to) running a debugger, you can also use the printf "%p" construct to print the addresses of your variables, e.g.:
printf("buf: %p\n", (void *)buffer); // &buffer[0] works too; &buffer works for an array
printf("ret: %p\n", (void *)&ret);
printf("a: %p\n", (void *)&a);
Printing the addresses of various things can give great insight into how your compiler/implementation is arranging things in the background. And you can do it directly from C code, too!
Consider taking a look at stealth's borrowed code chunk technique, if you're interested in x64 buffer overflow exploitation.
I have the following program in C:
#include <stdio.h>

int main(void) {
    int i=0;
    for (int k=0; k<10; k++)
        printf("Number: %d", k);
    printf("Hello\n");
    return 0;
}
When I run it in gdb it gives me a listing of all the registers, but I don't see the variable k in any of those registers. For example, in the screenshot below, I know k=4, but I don't see that value in any of the registers. Where would this number be stored then?
I know k=4, but I don't see that value in any of the registers. Where would this number be stored then?
If you optimized the program, the value would indeed likely be stored in a register (but the program will be much harder to debug).
Without optimization, the value is stored on stack (to be precise, given the disassembly, it is stored at location $rbp-8), and is loaded into a register by the very next instruction (the one before which you have stopped).
If you do stepi and look at the value of $rax, you will find it right there.
P.S. info locals will give you info about local variables.
Update:
What does stepi do?
It executes a single machine instruction, then stops. You can find this out by reading the manual, or by using the help stepi GDB command.
What/where is $rbp-8? Could you please explain a bit more about what that is and how it works?
That is something that would be covered in every introductory x86 programming book or tutorial.
Briefly, current state of the program execution can be described as a series of linked activation records or "frames". On x86 without optimization, the $RBP register is usually used as a frame pointer register (i.e. it points to the current frame). Locals are stored at negative offsets from the frame pointer (here, k is stored at offset -8).
I am very new to C, it's my second high-level programming language after Java. I have gotten most of the basics down, but for whatever reason I am unable to write a single character to screen memory.
This program is compiled using Turbo C for DOS on an Am486-DX4-100 running at 120 MHz. The graphics card is a very standard VLB Diamond Multimedia Stealth SE using a Trio32 chip.
For an OS I am running PC-DOS 2000 with an ISO codepage loaded. I am running in standard MDA/CGA/EGA/VGA style 80 column text mode with colour.
Here is the program as I have it written:
#include <stdio.h>
int main(void) {
unsigned short int *Video = (unsigned short int *)0xB8000;
*Video = 0x0402;
getchar();
return 0;
}
As I stated, I am very new to C, so I apologize if my error seems obvious, I was unable to find a solid source on how to do this that I could understand.
To my knowledge, in real mode on the x86 platform, the screen memory for text mode starts at 0xB8000. Each character is stored in two bytes, one for the character, and one for the background/foreground. The idea is to write the value 0x0402 (which should be a red smiling face) to 0xB8000. This should put it at the top left of the screen.
I have taken into account the possibility that the screen may be scrolling, and thus immediately removing my character upon execution in two ways. To resolve this issue, I have tried:
Repeatedly write this value using a loop
Write it a bit further down.
I can read and print the value I wrote to memory, so it's obviously still somewhere in memory, but for whatever reason I do not get anything onscreen. I'm obviously doing something wrong, however I do not know what could be the issue. If any other details are needed, please ask. Thank you for any possible help you can give.
In real mode, the first full 1 MiB of memory is addressed through a mechanism called 20-bit segment:offset addressing. 0xb8000 is a physical memory address. You need to use a far pointer, which lets you address memory through real-mode segmentation. The different types of pointers are described in this Stack Overflow answer.
0xb8000 can be represented as a segment of 0xb800 and an offset of 0x0000. The calculation to get physical address is segment*16+offset. 0xb800*16+0x0000=0xb8000. With this in mind you can include dos.h and use the MK_FP C macro to initialize a far pointer to such an address given segment and offset.
From the documentation MK_FP is defined as:
MK_FP() Make a Far Pointer
#include <dos.h>
void far *MK_FP(seg,off);
unsigned seg; Segment
unsigned off; Offset
MK_FP() is a macro that makes a far pointer from its component segment 'seg' and offset 'off' parts.
Returns: A far pointer.
Your code could be written like this:
#include <stdio.h>
#include <dos.h>
int main(void) {
unsigned short int far *Video = (unsigned short int far *)MK_FP(0xB800,0x0000);
*Video = 0x0402;
getchar();
return 0;
}
The memory segment address depends on the video mode used:
0xA0000 for EGA/VGA graphics modes (64 KB)
0xB0000 for monochrome text mode (32 KB)
0xB8000 for color text mode and CGA-compatible graphics modes (32 KB)
To access VRAM directly you need a 32-bit pointer that holds both the segment and the offset; otherwise the write lands in your program's own data and can corrupt your heap. This usually leads to undefined behaviour.
char far *Video = (char far *)0xb8000000;
See also: What are near, far and huge pointers?
As @stacker pointed out, in a 16-bit environment you need to assign the pointer carefully. AFAIK you need to use the far keyword (my gosh, what nostalgia).
Also make sure you don't compile in so-called "Huge" memory model. It's incompatible with far addressing, because every 32-bit pointer is automatically "normalized" to 20 bits. Try selecting "Large" memory model.
I am using gcc version 4.7.2 on Ubuntu 12.10 x86_64.
First of all these are the sizes of data types on my terminal:
sizeof(char) = 1
sizeof(short) = 2 sizeof(int) = 4
sizeof(long) = 8 sizeof(long long) = 8
sizeof(float) = 4 sizeof(double) = 8
sizeof(long double) = 16
Now please have a look at this code snippet:
#include <stdio.h>

int main(void)
{
    char c = 'a';
    printf("&c = %p\n", (void *)&c);
    return 0;
}
If I am not wrong we can't predict anything about the address of c. But each time this program gives some random hex address ending in f. So the next available location will be some hex value ending in 0.
I observed this pattern in case of other data types too. For an int value the address was some hex value ending in c. For double it was some random hex value ending in 8 and so on.
So I have 2 questions here.
1) Who governs this kind of memory allocation? Is it gcc or the C standard?
2) Whoever it is, why is it so? Why is the variable stored in such a way that the next available memory location starts at a hex value ending in 0? Is there any specific benefit?
Now please have a look at this code snippet:
#include <stdio.h>

int main(void)
{
    double a = 10.2;
    int b = 20;
    char c = 30;
    short d = 40;
    printf("&a = %p\n", (void *)&a);
    printf("&b = %p\n", (void *)&b);
    printf("&c = %p\n", (void *)&c);
    printf("&d = %p\n", (void *)&d);
    return 0;
}
Now here what I observed is completely new for me. I thought the variable would get stored in the same order they are declared. But No! That's not the case. Here is the sample output of one of random run:
&a = 0x7fff8686a698
&b = 0x7fff8686a694
&c = 0x7fff8686a691
&d = 0x7fff8686a692
It seems that the variables get sorted in increasing order of their sizes and are then stored in that sorted order, while still maintaining observation 1, i.e. the last variable (the largest one) gets stored in such a way that the next available memory location is a hex value ending in 0.
Here are my questions:
3) Who is behind this ? Is it gcc or C standard ?
4) Why waste time sorting the variables first and then allocating the memory, instead of directly allocating memory on a 'first come, first served' basis? Is there any specific benefit to this kind of sorting?
Now please have a look at this code snippet:
#include <stdio.h>

int main(void)
{
    char array1[] = {1, 2};
    int array2[] = {1, 2, 3};
    printf("&array1[0] = %p\n", (void *)&array1[0]);
    printf("&array1[1] = %p\n\n", (void *)&array1[1]);
    printf("&array2[0] = %p\n", (void *)&array2[0]);
    printf("&array2[1] = %p\n", (void *)&array2[1]);
    printf("&array2[2] = %p\n", (void *)&array2[2]);
    return 0;
}
Now this is also shocking to me. What I observed is that an array is always stored at some random hex address ending in '0' if it has 2 or more elements; if it has fewer than 2 elements, it gets a memory location following observation 1.
So here are my questions:
5) Who is behind this storing an array at some random hex value ending at 0 thing ? Is it gcc or C standard ?
6) Now why waste the memory? I mean, array2 could have been stored immediately after array1 (and hence array2 would have a memory location ending in 2). Instead, array2 is stored at the next hex value ending in 0, leaving 14 memory locations in between. Is there any specific benefit?
The address at which the stack and the heap start is given to the process by the operating system. Everything else is decided by the compiler, using offsets that are known at compile time. Some of these things may follow an existing convention followed in your target architecture and some of these do not.
The C standard does not mandate anything regarding the order of the local variables inside the stack frame (as pointed out in a comment, it doesn't even mandate the use of a stack at all). The standard only bothers to define order when it comes to structs and, even then, it does not define specific offsets, only the fact that these offsets must be in increasing order. Usually, compilers try to align the variables in such a way that access to them takes as few CPU instructions as possible - and the standard permits that, without mandating it.
Part of the reason is that some of this is mandated by the application binary interface (ABI) specifications for your system and processor.
See the x86 calling conventions and the SVR4 x86-64 ABI supplement (I'm giving the URL of a recent copy; the latest original is surprisingly hard to find on the Web).
Within a given call frame, the compiler may place variables in arbitrary stack slots. When optimizing, it may reorganize the stack at will, e.g. by decreasing alignment constraints. You should not worry about that.
A compiler tries to put local variables at stack locations with suitable alignment. See the __alignof__ extension of GCC. Where exactly the compiler puts these variables is not important; see my answer here. (If it is important to your code, you really should pack the variables into a single common local struct, since each compiler, version, and set of optimization flags could do different things; so don't depend on the precise behavior of your particular compiler.)
While I was doing "Learn C The Hard Way" examples, I thought to myself:
I set int a = 10; but where does that value 10 actually live? Can I access it manually from the outside while my program is running?
Here's a little C code snippet for demonstration purposes:
int main (int argc, char const* argv[]) {
int a = 10;
int b = 5;
int c = a + b;
return 0;
}
I opened up the GNU Project Debugger (GDB) and entered:
break main
run
next 2
From what I understood, 0x7fff5bffb04 is the memory address of int c. I then used the hexdump -C /dev/mem command to dump the entire memory into the terminal.
Now the question is where do I look for the variable c in this massive hex dump? My hope is that given the address 0x7fff5bffb04 I can find its value, which is 15. Also, bonus question, what does each column in hexdump -C represent? (I know the last column is ASCII representation)
I then used the hexdump -C /dev/mem command to dump the entire memory into the terminal.
Your hexdump dumped physical memory addresses. The address 0x7fff5bffb04 is a virtual address of the variable in the process you are debugging. It is mapped to some physical address, but you will not be able to find which without examining kernel mapping tables (as Mat already told you in a comment).
To examine virtual address space, use /proc/<pid>/mem (as Barmar already told you in a comment).
But this entire exercise is pointless, because you already can examine the virtual memory in GDB, and you are not going to see anything when you look at virtual memory that GDB didn't already show you much more conveniently [1].
[1] Except you could see GDB-inserted breakpoints, but you are not expected to understand that :-)
Firstly, there is no reason why the values would even exist in RAM. More than likely, the machine code for this program simply keeps the values in CPU registers. You would have to use more bytes (try at least 512) and set them to a random value, which you could then search for in the memory dump.
You are far better off looking at the assembly code produced by the C compiler.
Good day everyone!
I am trying to understand how buffer overflow works. Right now, I’m in the process of determining the location of a function’s return address, which I’m supposed to change to perform a buffer overflow attack. I’ve written a simple program based on an example I read on the internet. The program creates an integer pointer to hold the stack location of the function's return address. To do this (assuming I understand how a function's variables get organized on the stack), I add 8 to the buffer variable’s address and set it as the value of ret. I’m not doing anything here that would change the value stored at the location of func’s return address.
UPDATE: I've modified the program a bit, so it prints the stack address of func's parameter a. As you can see, the distance between a and buffer is about 8 bytes, so that would probably mean, based from the stack layout, that saved FP and old EIP (func return address) is in between. Am I right?
Here's the program:
#include <stdio.h>

void func(int a) {
    char buffer[3];
    int *ret;
    ret = (int *)(buffer + 11); // this is the configuration that made the whole program work; ret now points to the location holding func's return address
    printf("address of a is %p\n", (void *)&a);
    printf("address of buffer is %p\n", (void *)buffer);
    printf("address of ret is %p\n", (void *)ret);
    printf("value of ret is %x\n", (unsigned)(*ret));
}

int main(void) {
    int num;
    num = 0;
    func(num);
    num = 1;
    printf("Num now is %d", num);
    return 0;
}
Output of the program when executed:
[image: program output] http://img20.imageshack.us/img20/2034/72783404.png
As you can see, I’m printing the address of the variables buffer and ret. I’ve added an additional statement printing the value of the ret variable (supposed location of func return address, so this should print the address of the next instruction which will get executed after func returns from execution).
Here is the dump which shows the supposed address of the instruction to be executed after func returns (underlined in green). As you can see, that value is very different from the value printed for the variable ret.
[image: assembly dump] http://img717.imageshack.us/img717/8273/assemblycodecopy.png
My question is, why are they different? (of course in the assumption that what I’ve done are all correct).
Otherwise, what have I done wrong? Is my understanding of the program’s runtime stack wrong? Please help me understand this. My project is due next week and I’ve barely touched it yet. I’m sorry if I’m being demanding; I badly need your help.
For the following program
int main(int argc, char **argv) {
int v[2];
return 0;
}
The stack layout is basically the following:
-------------
arg n
-------------
.........
-------------
0x1010 arg 0
-------------
0x100C ret address
=============
0x1008 old fp
-------------
0x1004 v[1]
-------------
0x1000 v[0]
-------------
You can find out main's return address using v + 3.
Taking the addresses shown on the left side of the diagram, v has address 0x1000 and the return address is at v + 3 => 0x1000 + 4 * 3 = 0x100C.
First off, notice that the address of buffer is an odd number 0xbffffd51 and then you add 8 to it to get 0xbffffd59. I would be quite surprised if the return address on the stack was not aligned to a four byte address.
Depending on the compiler, exactly how the stack frame is laid out can vary (for example, even though buffer is first in the source code, the compiler could put ret higher on the stack), so you may need to experiment with your values. I would do a couple of things:
Change buffer to be 4 bytes.
Experiment with different offsets. I have a feeling that you may need to look 12 bytes or even 16 bytes up to find your return address.
Of course, you can't modify the original num unless you pass a pointer to it; so in main, num is first 0, then 1, and it is never really modified by func. The address of a (&a) in func is the address of the local copy (by value) of the argument, likely an address on the stack in most cases. And what would ret point to? You have a 3-char buffer, and you take an address beyond it; you must now consider it a pointer to garbage, even though you're likely pointing to something "interesting", depending on how local variables are "organized" in memory. So you can't be 100% sure it really points to the return address. You're assuming the following:
0 4 bytes (for char, assuming 4bytes alignment)
4 4 bytes (for whatever, maybe argument)
8 4 bytes (return address)
And it depends. It depends on the architecture; it depends on how the compiler "translates" the code of the function. Let us imagine x86. The following is a reasonable way of compiling func:
func:
push ebp ; save some regs...
push eax ; or with pusha?
mov ebp, esp
push 0 ; for char a[3]
mov eax, ebp
add eax, 4 ; -4 + 8
push eax ; for int *ret
; -4(ebp) gives a
; -8(ebp) gives int *ret
; so ebp-4 is the pointer to a, we
; add 8, to obtain ebp+4, which points
; to saved ebp... missing the ret ptr
; (other code...)
mov esp, ebp
pop eax ; or with popa?
pop ebp
ret
And what if more registers are saved? What if the order of char a[4] and int *ret is swapped? How can you know? You can't assume anything unless you write the code yourself directly in asm, in which case you control exactly what's happening. Otherwise, C code that does what you want would only work by chance...