How to call a function located in an executable from a loaded DLL? - c

I have located a function inside an executable which I'd like to call from my DLL. The address of it would be 0x0090DE00 according to OllyDbg. I've tried to call it directly:
luaL__openlib *f = ((luaL__openlib*)(0x0090DE00));
but also with adding the base of the module handle to it as suggested here:
uint8_t * module_handle = (uint8_t *)GetModuleHandle(L"ForgedAlliance1.exe");
luaL__openlib *f = ((luaL__openlib*)(module_handle + 0x0090DE00));
It appears that this is not working as I get access violation exceptions - it appears that the pointer is not valid.
So: How can I call this function by using its address?
I just inserted a simple RET instruction at 0x00C0B530. My code now looks as follows:
typedef void (*test) ();
EXTERN_DLL_EXPORT void initialize(lua_State *L)
{
// Adding this should not be necessary. I get 0x00C0B530 from
// OllyDbg where the offset 0x00401000 is included
uint8_t * module_handle = (uint8_t *)GetModuleHandle(L"ForgedAlliance1.exe");
test *f = NULL;
f = ((test*)(0x00C0B530));
(*f)(); // Crashing
}
What I don't quite understand is why I get a different address in the exception message:
Exception thrown at 0x909090C3 in ForgedAlliance1.exe: 0xC0000005: Access violation executing location 0x909090C3.
UPDATE: I just realized that 0x909090C3 is not just a pointer here, it is the code itself
90 | NOP
90 | NOP
90 | NOP
C3 | RETN
It seems I am messing something up with pointers. Why does it try to execute "location" 0x909090C3? That's not the location.

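Alright, it was just a pointer mess-up. Sorry for that, I have not written C for quite a while. I did it right, basically, but the problem with
f = ((test*)(0x00C0B530));
(*f)();
is that f is a test*, i.e. a pointer to a function pointer, so (*f) loads the four bytes stored at 0x00C0B530. Those bytes are the instructions inside the executable (C3 90 90 90, which reads as 0x909090C3 in little-endian order), and that value is the address the program tries to jump to, which is of course invalid.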
So the trick was:
int test_addr = 0x00C0B530;
f = ((test*)(&test_addr));
(*f)();
I am sure this can be done a bit simpler but this is working now.
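The extra variable can be avoided by making f the function pointer itself instead of a pointer to one. The following is a minimal sketch under a few assumptions: the numeric address is valid in the current process, the target really has the signature void(void), and the names call_at, sample_target and sample_target_address are hypothetical. The demo uses the address of a local function as a stand-in for 0x00C0B530 so that it can run anywhere.

```c
#include <stdint.h>

typedef void (*test)(void);

int calls = 0;

/* Stand-in for the code at 0x00C0B530 (hypothetical, for demonstration). */
void sample_target(void) { calls++; }

/* Numeric address of the stand-in, as a debugger would report it. */
uintptr_t sample_target_address(void) { return (uintptr_t)&sample_target; }

/* Convert a numeric address straight into a function pointer and call it.
 * The integer-to-function-pointer conversion is implementation-defined,
 * but it is the usual idiom on Windows and POSIX systems. */
void call_at(uintptr_t address)
{
    test f = (test)address;   /* f holds the address itself, no dereference */
    f();
}
```

Inside the DLL the call would then presumably be call_at(0x00C0B530), provided the executable is loaded at the base address OllyDbg showed.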

Related

Is Ghidra misinterpreting a function call?

When analyzing the assembly listing in Ghidra, I stumbled upon this instruction:
CALL dword ptr [EBX*0x4 + 0x402ac0]=>DAT_00402abc
I assumed that the program was calling a function whose address was stored in DAT_00402abc, which I initially thought was a dword variable. Indeed, when I tried to create a function at the location of DAT_00402abc, Ghidra wouldn't let me do it.
The decompiler translates that instruction into this line of code:
(*(code *)(&int2)[iVar2])();
So I was wondering, what does it mean and what's the program supposed to do with this call? Is there a possibility that Ghidra totally messed up? And if so, how should I interpret that instruction?
I'm not at all familiar with Ghidra, but I can tell you how to interpret the machine instruction...
CALL dword ptr [EBX*0x4 + 0x402ac0]
There is a table of function addresses at 0x402ac0; the EBX'th entry in that table is being called. I have no idea what DAT_00402abc means, but if you inspect memory in dword-sized chunks at address 0x00402ac0 you should find plausible function addresses. [EDIT: 0x0040_2abc = 0x0040_2ac0 - 4. I suspect this means Ghidra thinks EBX has value -1 when control reaches this point. It may be wrong, or maybe the program has a bug. One would expect EBX to have a nonnegative value when control reaches this point.]
The natural C source code corresponding to this instruction would be something like
extern void do_thing_zero(void);
extern void do_thing_one(void);
extern void do_thing_two(void);
extern void do_thing_three(void);
typedef void (*do_thing_ptr)(void);
const do_thing_ptr do_thing_table[4] = {
do_thing_zero, do_thing_one, do_thing_two, do_thing_three
};
// ...
void do_thing_n(unsigned int n)
{
if (n >= 4) abort();
do_thing_table[n]();
}
If the functions in the table take arguments or return values, you'll see argument-handling code before and after the CALL instruction you quoted, but the CALL instruction itself will not change.
You would be seeing something different and much more complicated if the functions didn't all take the same set of arguments.
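The address arithmetic behind the EDIT note above can be checked with a few lines of C. This is only a sketch of the 32-bit effective-address computation for CALL dword ptr [EBX*0x4 + 0x402ac0]; the function name table_slot_address is made up for illustration.

```c
#include <stdint.h>

/* Effective address of `CALL dword ptr [index*4 + base]` on 32-bit x86:
 * the CPU loads a 4-byte function address from this slot and calls it. */
uint32_t table_slot_address(uint32_t base, int32_t index)
{
    /* Unsigned arithmetic wraps modulo 2^32, just like the CPU does. */
    return base + (uint32_t)index * 4u;
}
```

With base 0x402ac0, an index of -1 lands on 0x402abc, which matches the DAT_00402abc label Ghidra attached to the instruction, while index 0 is the first real table entry.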

qemu: uncaught target signal 11 (Segmentation fault) - core dumped, when trying to return a struct

I just noticed that I am unable to have a function return a struct.
I am running this on ARM32/debian docker image with threads enabled.
This is the function that gives me the run time error:
struct CEC_call des_CEC_call(char * buffy){
char request = buffy[0]; // fails here
buffy+=4;
char obligation = buffy[1];
buffy+=4;
struct CEC_call ceccall;
pepcall.request = request;
pepcall.obligation = obligation;
return ceccall;
}
But if I change the return type to void, there is no issue in running:
void des_CEC_call(char * buffy){
char request = buffy[0]; // doesn't fail here
buffy+=4;
char obligation = buffy[1];
buffy+=4;
struct CEC_call ceccall;
pepcall.request = request;
pepcall.obligation = obligation;
}
Return works fine as well with any of the standard return types.
Header where the struct is defined is included in the file with the function although it will still crash even if the struct is defined in the same file. Not sure how to proceed with debugging, any help appreciated.
EDIT:
More details, based on suggestions from comments:
I have rerun the same program on my mac as well as some other non arm architectures with docker, and it runs without any noticeable issues. Some aspects relating to bit shifting are slightly different as expected but no run time error from the segmentation fault. I tried running it with various optimisation levels, but to no avail.
I have used GDB before so I thought that might provide some insight, sadly I have not been able to get it to work on this container.
I ensured GDB is installed and recompiled the binary with -Og.
I ran docker with --cap-add=SYS_PTRACE and --security-opt seccomp=unconfined.
Each time I got:
warning: Could not trace the inferior process.
Error:
warning: ptrace: Function not implemented
During startup program exited with code 127.
I am able to use GDB with other non-arm, non-32bit docker images without any issues. I think this is enough for another question, as I've spent ages trying to get GDB working with that environment.
I am not really sure how to verify this otherwise, but I have printed out the address buffy is pointing to and the value held by buffy[0] in the preceding functions as well as in the problematic one.
Without struct return:
address of buffy = 0xff58b9ec
buffer[0] = ff
address of buffy = 0xff58b9ec
buffer[0] = ff
address of buffy = 0xff58b9ec
buffer[0] = ff
With struct return:
address of buffy = 0xff58b9ec
buffer[0] = ff
address of buffy = 0xff58b9ec
buffer[0] = ff
address of buffy = (nil)
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Struct CEC_call does not have any other fields.
It could be a buffer overflow somewhere, but there aren't any buffers, at least none made by me. IIRC I have not used QEMU or Valgrind before, but I will look into them in more detail. I cannot test natively at the moment as I do not have access to the intended embedded Linux.
struct CEC_call ceccall;
pepcall.request = request;
pepcall.obligation = obligation;
It seems you have a mismatch in the names of your variables, ceccall and pepcall, and you return an uninitialized variable, ceccall.
My problem was that the header containing the declaration of struct CEC_call des_CEC_call(char * buffy) had not been included in the calling file.
The called function worked fine when it returned a standard type or void, but with the custom struct return type the array pointer passed in arrived as NULL. This baffled me initially, as I didn't think the code would compile with a missing declaration, and the segmentation fault only happened on the ARM32 architecture; I didn't get the crash on OS X.
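A sketch of the fix, assuming the struct has two char fields (the source does not show the struct definition) and using hypothetical file names in the comments. Without a visible prototype an older-style C compiler assumes the function returns int, so caller and callee can disagree about how the return value and arguments are passed; how that shows up depends on the ABI, which may be why only the ARM32 build crashed.

```c
/* cec_call.h (hypothetical name): included by BOTH the defining file
 * and every file that calls the function. */
struct CEC_call {
    char request;      /* field types are an assumption */
    char obligation;
};

/* The prototype every caller must see before the call. */
struct CEC_call des_CEC_call(char *buffy);

/* cec_call.c (hypothetical name): the definition, with one variable
 * name used consistently. */
struct CEC_call des_CEC_call(char *buffy)
{
    struct CEC_call ceccall;
    ceccall.request = buffy[0];
    buffy += 4;
    ceccall.obligation = buffy[1];
    return ceccall;
}
```

Compiling with GCC or Clang and -Werror=implicit-function-declaration turns a missing prototype like this into a build error instead of a run-time surprise.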

SPARC assembly jmp \boot

I'll explain the problem briefly. I have a Leon3 board (gr-ut-g99). Using GRMON2 I can load executables at the desired address in the board.
I have two programs. Let's call them A and B. I tried to load both in memory and individually they work.
What I would like to do now is to make the A program call the B program.
Both programs are written in C using a variant of the gcc compiler (the Gaisler Sparc GCC).
To do the jump I wrote a tiny inline assembler function in program A that jumps to a memory address where I loaded the program B.
Below is a snippet of program A:
unsigned int return_address;
unsigned int * const RAM_pointer = (unsigned int *) RAM_ADDRESS;
printf("RAM pointer set to: 0x%08x \n",(unsigned int)RAM_pointer);
printf("jumping...\n");
__asm__(" nop;" //clean the pipeline
"jmp %1;" // jmp to programB
:"=r" (return_address)
:"r" (RAM_pointer)
);
RAM_ADDRESS is a #define
#define RAM_ADDRESS 0x60000000
The program B is a simple hello world. The program B is loaded at the 0x60000000 address. If I try to run it, it works!
int main()
{
printf ("HELLO! I'M BOOTED! \n");
fflush(stdout);
return 0;
}
What I expect when I run the ProgramA, is to see the "jumping..." message on the console and then see the "HELLO! I'M BOOTED!" from the programB
What happens instead is an IU exception.
Below I have posted the messages shown by the grmon2 monitor. I also included the "inst" report, which should show the last operations performed before the exception.
grmon2> run
IU exception (tt = 0x07, mem address not aligned)
0x60004824: 9fc04000 call %g1
grmon2> inst
TIME ADDRESS INSTRUCTION RESULT SYMBOL
407085 600047FC mov %i3, %o2 [600063B8] -
407086 60004800 cmp %i4 [00000013] -
407089 60004804 be 0x60004970 [00000000] -
407090 60004808 mov %i0, %o0 [6000646C] -
407091 6000480C mov %i4, %o3 [00000013] -
407092 60004810 cmp %i4, %l0 [80000413] -
407108 60004814 bleu 0x60004820 [00000000] -
407144 60004818 ld [%i1 + 0x20], %o1 [FFFFFFFF] -
407179 60004820 ld [%i1 + 0x28], %g1 [FFFFFFFF] -
407186 60004824 call %g1 [ TRAP ] -
I also tried to substitute the "jmp" with a "jmpl" or a "call", but that did not work either.
I'm quite confused.
I do not know how to approach the problem, and therefore I do not know what other information I should provide.
I can say that program B is loaded at 0x60000000 and its entry point is, of course, 0x60000000. Running program B directly from that entry point works fine!
Thanks in advance for your help!
Looks to me like you did execute the jump, and it got to program B, as evidenced by the addresses of the instructions in the trace buffer. But where you crashed was in stdio trying to print stuff. Stdio makes extensive use of function pointers, and the sequence clearly shows a call instruction with the target address in a register, which indicates use of a function pointer.
I suggest putting fflush(stdout) in program A just before the jump, and this will allow you to see the messages before doing the jump. Then, in program B, instead of using printf, just put some known value in memory that you can examine later via the monitor to verify that it got there.
My guess is that the stdio library has some data or parameter that needs to be set up at the start of the program, and that's not being done or not done properly. Not sure about the platform you are running on, but do you have some sort of debugging or single stepping ability, like in a debugger? If so, just single step through the jump and follow where the program goes.
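The "known value in memory" suggestion could look roughly like the sketch below. The marker value and the function name leave_marker are made up; on the board, the address passed in would have to be some RAM location that neither program otherwise uses, which is an assumption the reader has to check against the memory map.

```c
#include <stdint.h>

#define MARKER_VALUE 0xB007ED01u   /* arbitrary, hypothetical magic number */

/* Store a recognizable value at a known location. The volatile qualifier
 * keeps the compiler from optimizing the store away. */
void leave_marker(volatile uint32_t *where)
{
    *where = MARKER_VALUE;
}
```

Program B would call leave_marker as the first statement of main instead of printf, and after the run the chosen address can be inspected from the monitor to see whether execution got that far without involving stdio at all.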

Generating functions at runtime in C

I would like to generate a function at runtime in C. And by this I mean I would essentially like to allocate some memory, point at it and execute it via function pointer. I realize this is a very complex topic and my question is naïve. I also realize there are some very robust libraries out there that do this (e.g. nanojit).
But I would like to learn the technique, starting with the basics. Could someone knowledgeable give me a very simple example in C?
EDIT: The answer below is great but here is the same example for Windows:
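#include <Windows.h>
#define MEMSIZE (100*1024*1024)
typedef void (*func_t)(void);
int main() {
    HANDLE proc = GetCurrentProcess();
    /* Reserve and commit memory that is readable, writable and executable. */
    LPVOID p = VirtualAlloc(
        NULL,
        MEMSIZE,
        MEM_RESERVE|MEM_COMMIT,
        PAGE_EXECUTE_READWRITE);
    if (p == NULL)
        return 1; /* allocation failed */
    func_t func = (func_t)p;
    unsigned char *code = (unsigned char *)p;
    code[0] = 0xC3; // ret
    /* Make sure the CPU does not execute stale instructions. */
    if(FlushInstructionCache(
        proc,
        NULL,
        0))
    {
        func();
    }
    /* GetCurrentProcess returns a pseudo-handle, so no CloseHandle is needed. */
    VirtualFree(p, 0, MEM_RELEASE);
    return 0;
}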
As said previously by other posters, you'll need to know your platform pretty well.
Ignoring the issue of casting an object pointer to a function pointer being, technically, UB, here's an example that works for x86/x64 OS X (and possibly Linux too). All the generated code does is return to the caller.
#include <unistd.h>
#include <sys/mman.h>
typedef void (*func_t)(void);
int main() {
    /*
     * Get a RWX bit of memory.
     * We can't just use malloc because the memory it returns might not
     * be executable.
     */
    unsigned char *code = mmap(NULL, getpagesize(),
                               PROT_READ|PROT_EXEC|PROT_WRITE,
                               MAP_SHARED|MAP_ANON, -1, 0); /* fd is -1 for anonymous maps */
    if (code == MAP_FAILED)
        return 1; /* the OS refused the mapping */
    /* Technically undefined behaviour */
    func_t func = (func_t) code;
    code[0] = 0xC3; /* x86 'ret' instruction */
    func();
    return 0;
}
Obviously, this will be different across different platforms but it outlines the basics needed: get executable section of memory, write instructions, execute instructions.
This requires you to know your platform. For instance, what is the C calling convention on your platform? Where are parameters stored? What register holds the return value? What registers must be saved and restored? Once you know that, you can essentially write some C code that assembles code into a block of memory, then cast that memory into a function pointer (though this is technically forbidden in ANSI C, and it will not work if your platform marks the memory in question as non-executable, aka the NX bit).
The simple way to go about this is simply to write some code, compile it, then disassemble it and look at what bytes correspond to which instructions. You can write some C code that fills allocated memory with that collection of bytes and then casts it to a function pointer of the appropriate type and executes.
It's probably best to start by reading the calling conventions for your architecture and compiler. Then learn to write assembly that can be called from C (i.e., follows the calling convention).
If you have tools, they can help you get some things right easier. For example, instead of trying to design the right function prologue/epilogue, I can just code this in C:
int foo(void* Data)
{
return (Data != 0);
}
Then (MicrosoftC under Windows) feed it to "cl /Fa /c foo.c". Then I can look at "foo.asm":
_Data$ = 8
; Line 2
push ebp
mov ebp, esp
; Line 3
xor eax, eax
cmp DWORD PTR _Data$[ebp], 0
setne al
; Line 4
pop ebp
ret 0
I could also use "dumpbin /all foo.obj" to see that the exact bytes of the function were:
00000000: 55 8B EC 33 C0 83 7D 08 00 0F 95 C0 5D C3
Just saves me some time getting the bytes exactly right...

Buffer overflow in C

I'm attempting to write a simple buffer overflow using C on Mac OS X 10.6 64-bit. Here's the concept:
void function() {
char buffer[64];
buffer[offset] += 7; // i'm not sure how large offset needs to be, or if
// 7 is correct.
}
int main() {
int x = 0;
function();
x += 1;
printf("%d\n", x); // the idea is to modify the return address so that
// the x += 1 expression is not executed and 0 gets
// printed
return 0;
}
Here's part of main's assembler dump:
...
0x0000000100000ebe <main+30>: callq 0x100000e30 <function>
0x0000000100000ec3 <main+35>: movl $0x1,-0x8(%rbp)
0x0000000100000eca <main+42>: mov -0x8(%rbp),%esi
0x0000000100000ecd <main+45>: xor %al,%al
0x0000000100000ecf <main+47>: lea 0x56(%rip),%rdi # 0x100000f2c
0x0000000100000ed6 <main+54>: callq 0x100000ef4 <dyld_stub_printf>
...
I want to jump over the movl instruction, which would mean I'd need to increment the return address by 42 - 35 = 7 (correct?). Now I need to know where the return address is stored so I can calculate the correct offset.
I have tried searching for the correct value manually, but either 1 gets printed or I get abort trap – is there maybe some kind of buffer overflow protection going on?
Using an offset of 88 works on my machine. I used Nemo's approach of finding out the return address.
This 32-bit example illustrates how you can figure it out, see below for 64-bit:
#include <stdio.h>
void function() {
char buffer[64];
char *p;
asm("lea 4(%%ebp),%0" : "=r" (p)); // loads address of return address
printf("%d\n", p - buffer); // computes offset
buffer[p - buffer] += 9; // 9 from disassembling main
}
int main() {
volatile int x = 7;
function();
x++;
printf("x = %d\n", x); // prints 7, not 8
}
On my system the offset is 76. That's the 64 bytes of the buffer (remember, the stack grows down, so the start of the buffer is far from the return address) plus whatever other detritus is in between.
Obviously if you are attacking an existing program you can't expect it to compute the answer for you, but I think this illustrates the principle.
(Also, we are lucky that +9 does not carry out into another byte. Otherwise the single byte increment would not set the return address how we expected. This example may break if you get unlucky with the return address within main)
I overlooked the 64-bitness of the original question somehow. The equivalent for x86-64 is 8(%rbp) because pointers are 8 bytes long. In that case my test build happens to produce an offset of 104. In the code above substitute 8(%%rbp) using the double %% to get a single % in the output assembly. This is described in this ABI document. Search for 8(%rbp).
There is a complaint in the comments that 4(%ebp) is just as magic as 76 or any other arbitrary number. In fact the meaning of the register %ebp (also called the "frame pointer") and its relationship to the location of the return address on the stack is standardized. One illustration I quickly Googled is here. That article uses the terminology "base pointer". If you wanted to exploit buffer overflows on other architectures it would require similarly detailed knowledge of the calling conventions of that CPU.
Roddy is right that you need to operate on pointer-sized values.
I would start by reading values in your exploit function (and printing them) rather than writing them. As you crawl past the end of your array, you should start to see values from the stack. Before long you should find the return address and be able to line it up with your disassembler dump.
Disassemble function() and see what it looks like.
Offset needs to be positive, maybe 64+8, as it's a 64-bit address. Also, you should do the '+7' on a pointer-sized object, not on a char. Otherwise, if the two addresses cross a 256-byte boundary, you will have exploited your exploit.
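The 256-byte boundary problem can be illustrated without touching the stack at all. The sketch below only models the arithmetic on a little-endian machine; the function names are invented, 0x100000ec3 is the return address from the disassembly in the question, and 0x100000efc is a made-up address chosen so that the low byte overflows.

```c
#include <stdint.h>

/* What `buffer[offset] += delta` does to a stored address on a
 * little-endian machine: only the lowest byte changes, carries are lost. */
uint64_t bump_low_byte(uint64_t address, uint8_t delta)
{
    uint8_t low = (uint8_t)(address & 0xFF);
    low = (uint8_t)(low + delta);               /* wraps at 256 */
    return (address & ~(uint64_t)0xFF) | low;
}

/* What a pointer-sized increment does: carries propagate normally. */
uint64_t bump_whole_address(uint64_t address, uint8_t delta)
{
    return address + delta;
}
```

For 0x100000ec3 both functions give 0x100000eca, which is why the char-based version happens to work in the question. For 0x100000efc the byte-wise version yields 0x100000e03 while the pointer-sized one yields 0x100000f03, so the byte-wise overwrite would send the return somewhere else entirely.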
You might try running your code in a debugger, stepping each assembly line at a time, and examining the stack's memory space as well as registers.
I always like to operate on nice data types, like this one:
struct stackframe {
char *sf_bp;
char *sf_return_address;
};
void function() {
/* the following code is dirty. */
char *dummy;
dummy = (char *)&dummy;
struct stackframe *stackframe = (struct stackframe *)(dummy + 24); /* try multiples of 4 here. */
/* here starts the beautiful code. */
stackframe->sf_return_address += 7;
}
Using this code, you can easily check with the debugger whether the value in stackframe->sf_return_address matches your expectations.

Resources