Executing machine code in memory - c

I'm trying to figure out how to execute machine code stored in memory.
I have the following code:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char* argv[])
{
FILE* f = fopen(argv[1], "rb");
fseek(f, 0, SEEK_END);
unsigned int len = ftell(f);
fseek(f, 0, SEEK_SET);
char* bin = (char*)malloc(len);
fread(bin, 1, len, f);
fclose(f);
return ((int (*)(int, char *)) bin)(argc-1, argv[1]);
}
The code above compiles fine in GCC, but when I try and execute the program from the command line like this:
./my_prog /bin/echo hello
The program segfaults. I've figured out the problem is on the last line, as commenting it out stops the segfault.
I don't think I'm doing it quite right, as I'm still getting my head around function pointers.
Is the problem a faulty cast, or something else?

You need a page with write execute permissions. See mmap(2) and mprotect(2) if you are under unix. You shouldn't do it using malloc.
Also, read what the others said, you can only run raw machine code using your loader. If you try to run an ELF header it will probably segfault all the same.
Regarding the content of replies and downmods:
1- OP said he was trying to run machine code, so I replied on that rather than executing an executable file.
2- See why you don't mix malloc and mman functions:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <sys/mman.h>
int main()
{
char *a=malloc(10);
char *b=malloc(10);
char *c=malloc(10);
memset (a,'a',4095);
memset (b,'b',4095);
memset (c,'c',4095);
puts (a);
memset (c,0xc3,10); /* return */
/* c is not alligned to page boundary so this is NOOP.
Many implementations include a header to malloc'ed data so it's always NOOP. */
mprotect(c,10,PROT_READ|PROT_EXEC);
b[0]='H'; /* oops it is still writeable. If you provided an alligned
address it would segfault */
char *d=mmap(0,4096,PROT_READ|PROT_WRITE|PROT_EXEC,MAP_PRIVATE|MAP_ANON,-1,0);
memset (d,0xc3,4096);
((void(*)(void))d)();
((void(*)(void))c)(); /* oops it isn't executable */
return 0;
}
It displays exactly this behavior on Linux x86_64 other ugly behavior sure to arise on other implementations.

Using malloc works fine.
OK this is my final answer, please note I used the orignal poster's code.
I'm loading from disk, the compiled version of this code to a heap allocated area "bin", just as the orignal code did (the name is fixed not using argv, and the value 0x674 is from;
objdump -F -D foo|grep -i hoho
08048674 <hohoho> (File Offset: 0x674):
This can be looked up at run time with the BFD (Binary File Descriptor library) or something else, you can call other binaries (not just yourself) so long as they are statically linked to the same set of lib's.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
unsigned char *charp;
unsigned char *bin;
void hohoho()
{
printf("merry mas\n");
fflush(stdout);
}
int main(int argc, char **argv)
{
int what;
charp = malloc(10101);
memset(charp, 0xc3, 10101);
mprotect(charp, 10101, PROT_EXEC | PROT_READ | PROT_WRITE);
__asm__("leal charp, %eax");
__asm__("call (%eax)" );
printf("am I alive?\n");
char *more = strdup("more heap operations");
printf("%s\n", more);
FILE* f = fopen("foo", "rb");
fseek(f, 0, SEEK_END);
unsigned int len = ftell(f);
fseek(f, 0, SEEK_SET);
bin = (char*)malloc(len);
printf("read in %d\n", fread(bin, 1, len, f));
printf("%p\n", bin);
fclose(f);
mprotect(&bin, 10101, PROT_EXEC | PROT_READ | PROT_WRITE);
asm volatile ("movl %0, %%eax"::"g"(bin));
__asm__("addl $0x674, %eax");
__asm__("call %eax" );
fflush(stdout);
return 0;
}
running...
co tmp # ./foo
am I alive?
more heap operations
read in 30180
0x804d910
merry mas
You can use UPX to manage the load/modify/exec of a file.
P.S. sorry for the previous broken link :|

It seems to me you're loading an ELF image and then trying to jump straight into the ELF header? http://en.wikipedia.org/wiki/Executable_and_Linkable_Format
If you're trying to execute another binary, why don't you use the process creation functions for whichever platform you're using?

An typical executable file has:
a header
entry code that is called before main(int, char **)
The first means that you can't generally expect byte 0 of the file to be executable; intead, the information in the header describes how to load the rest of the file in memory and where to start executing it.
The second means that when you have found the entry point, you can't expect to treat it like a C function taking arguments (int, char **). It may, perhaps, be usable as a function taking no paramters (and hence requiring nothing to be pushed prior to calling it). But you do need to populate the environment that will in turn be used by the entry code to construct the command line strings passed to main.
Doing this by hand under a given OS would go into some depth which is beyond me; but I'm sure there is a much nicer way of doing what you're trying to do. Are you trying to execute an external file as a on-off operation, or load an external binary and treat its functions as part of your program? Both are catered for by the C libraries in Unix.

It is more likely that that it is the code that is jumped to by the call through function-pointer that is causing the segfault rather than the call itself. There is no way from the code you have posted to determine that that code loaded into bin is valid. Your best bet is to use a debugger, switch to assembler view, break on the return statement and step into the function call to determine that the code you expect to run is indeed running, and that it is valid.
Note also that in order to run at all the code will need to be position independent and fully resolved.
Moreover if your processor/OS enables data execution prevention, then the attempt is probably doomed. It is at best ill-advised in any case, loading code is what the OS is for.

What you are trying to do is something akin to what interpreters do. Except that an interpreter reads a program written in an interpreted language like Python, compiles that code on the fly, puts executable code in memory and then executes it.
You may want to read more about just-in-time compilation too:
Just in time compilation
Java HotSpot JIT runtime
There are libraries available for JIT code generation such as the GNU lightning and libJIT, if you are interested. You'd have to do a lot more than just reading from file and trying to execute code, though. An example usage scenario will be:
Read a program written in a scripting-language (maybe
your own).
Parse and compile the source into an
intermediate language understood by
the JIT library.
Use the JIT library to generate code
for this intermediate
representation, for your target platform's CPU.
Execute the JIT generated code.
And for executing the code you'd have to use techniques such as using mmap() to map the executable code into the process's address space, marking that page executable and jumping to that piece of memory. It's more complicated than this, but its a good start in order to understand what's going on beneath all those interpreters of scripting languages such as Python, Ruby etc.
The online version of the book "Linkers and Loaders" will give you more information about object file formats, what goes on behind the scenes when you execute a program, the roles of the linkers and loaders and so on. It's a very good read.

You can dlopen() a file, look up the symbol "main" and call it with 0, 1, 2 or 3 arguments (all of type char*) via a cast to pointer-to-function-returning-int-taking-0,1,2,or3-char*

Use the operating system for loading and executing programs.
On unix, the exec calls can do this.
Your snippet in the question could be rewritten:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(int argc, char* argv[])
{
return execv(argv[1],argv+2);
}

Executable files contain much more than just code. Header, code, data, more data, this stuff is separated and loaded into different areas of memory by the OS and its libraries. You can't load a program file into a single chunk of memory and expect to jump to it's first byte.
If you are trying to execute your own arbitrary code, you need to look into dynamic libraries because that is exactly what they're for.

Related

How can I access interpreter path address at runtime in C?

By using the objdump command I figured that the address 0x02a8 in memory contains start the path /lib64/ld-linux-x86-64.so.2, and this path ends with a 0x00 byte, due to the C standard.
So I tried to write a simple C program that will print this line (I used a sample from the book "RE for beginners" by Denis Yurichev - page 24):
#include <stdio.h>
int main(){
printf(0x02a8);
return 0;
}
But I was disappointed to get a segmentation fault instead of the expected /lib64/ld-linux-x86-64.so.2 output.
I find it strange to use such a "fast" call of printf without specifiers or at least pointer cast, so I tried to make the code more natural:
#include <stdio.h>
int main(){
char *p = (char*)0x02a8;
printf(p);
printf("\n");
return 0;
}
And after running this I still got a segmentation fault.
I don't believe this is happening because of restricted memory areas, because in the book it all goes well at the 1st try. I am not sure, maybe there is something more that wasn't mentioned in that book.
So need some clear explanation of why the segmentation faults keep happening every time I try running the program.
I'm using the latest fully-upgraded Kali Linux release.
Disappointing to see that your "RE for beginners" book does not go into the basics first, and spits out this nonsense. Nonetheless, what you are doing is obviously wrong, let me explain why.
Normally on Linux, GCC produces ELF executables that are position independent. This is done for security purposes. When the program is run, the operating system is able to place it anywhere in memory (at any address), and the program will work just fine. This technique is called Address Space Layout Randomization, and is a feature of the operating system that nowdays is enabled by default.
Normally, an ELF program would have a "base address", and would be loaded exactly at that address in order to work. However, in case of a position independent ELF, the "base address" is set to 0x0, and the operating system and the interpreter decide where to put the program at runtime.
When using objdump on a position independent executable, every address that you see is not a real address, but rather, an offset from the base of the program (that will only be known at runtime). Therefore it is not possible to know the position of such a string (or any other variable) at runtime.
If you want the above to work, you will have to compile an ELF that is not position independent. You can do so like this:
gcc -no-pie -fno-pie prog.c -o prog
It no longer works like that. The 64-bit Linux executables that you're likely using are position-independent and they're loaded into memory at an arbitrary address. In that case ELF file does not contain any fixed base address.
While you could make a position-dependent executable as instructed by Marco Bonelli it is not how things work for arbitrary executables on modern 64-bit linuxen, so it is more worthwhile to learn to do this with position-independent ones, but it is a bit trickier.
This worked for me to print ELF i.e. the elf header magic, and the interpreter string. This is dirty in that it probably only works for a small executable anyway.
#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>
int main(){
// convert main to uintptr_t
uintptr_t main_addr = (uintptr_t)main;
// clear bottom 12 bits so that it points to the beginning of page
main_addr &= ~0xFFFLLU;
// subtract one page so that we're in the elf headers...
main_addr -= 0x1000;
// elf magic
puts((char *)main_addr);
// interpreter string, offset from hexdump!
puts((char *)main_addr + 0x318);
}
There is another trick to find the beginning of the ELF executable in memory: the so-called auxiliary vector and getauxval:
The getauxval() function retrieves values from the auxiliary vector,
a mechanism that the kernel's ELF binary loader uses to pass certain
information to user space when a program is executed.
The location of the ELF program headers in memory will be
#include <sys/auxv.h>
char *program_headers = (char*)getauxval(AT_PHDR);
The actual ELF header is 64 bytes long, and the program headers start at byte 64 so if you subtract 64 from this you will get a pointer to the magic string again, therefore our code can be simplified to
#include <stdio.h>
#include <inttypes.h>
#include <sys/auxv.h>
int main(){
char *elf_header = (char *)getauxval(AT_PHDR) - 0x40;
puts(elf_header + 0x318); // or whatever the offset was in your executable
}
And finally, an executable that figures out the interpreter position from the ELF headers alone, provided that you've got a 64-bit ELF, magic numbers from Wikipedia...
#include <stdio.h>
#include <inttypes.h>
#include <sys/auxv.h>
int main() {
// get pointer to the first program header
char *ph = (char *)getauxval(AT_PHDR);
// elf header at this position
char *elfh = ph - 0x40;
// segment type 0x3 is the interpreter;
// program header item length 0x38 in 64-bit executables
while (*(uint32_t *)ph != 3) ph += 0x38;
// the offset is 64 bits at 0x8 from the beginning of the
// executable
uint64_t offset = *(uint64_t *)(ph + 0x8);
// print the interpreter path...
puts(elfh + offset);
}
I guess it segfaults because of the way you use printf: you dont use the format parameter how it is designed to be.
When you want to use the printf function to read data the first argument it takes is a string that will format how the display will work int printf(char *fmt , ...) "the ... represent the data you want to display accordingly to the format string parameter
so if you want to print a string
//format as text
printf("%s\n", pointer_to_beginning_of_string);
//
If this does not work cause it probably will it is because you are trying to read memory that you are not supposed to access.
try adding extra flags " -Werror -Wextra -Wall -pedantic " with your compiler and show us the errors please.

How to not write more bytes than are in the buffer in C?

I'm trying to write a simple copy program. It reads test_data.txt in chunks of 100 bytes and copies those bytes to test_dest.txt. I find that the destination file is at least one unit of chunk larger than the source file. How could I adjust it so that just the right number of bytes are copied? Would I need a copy buffer of size 1?
Please not the point is to solve it using low level I/O system calls.
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>
#include <sys/types.h>
int main() {
int fh = open("test_data.txt", O_RDONLY);
int trg = open("test_dest.txt", O_CREAT | O_WRONLY);
int BUF_SIZE = 100;
char inp[BUF_SIZE];
int read_bytes = read(fh, inp, BUF_SIZE);
while (read_bytes > 0) {
write(trg, inp, BUF_SIZE);
read_bytes = read(fh, inp, BUF_SIZE);
}
close(trg);
close(fh);
return 0;
}
The read function just told you how many bytes it read. You should write that amount of bytes:
write(trg, inp, read_bytes);
On another note, you really should check for failures from the write call as well. And definitely the open calls.
On yet another note, you only really need one call to read:
ssize_t read_bytes; // The read function is specified by POSIX to return a ssize_t
while ((read_bytes = read(fh, inp, sizeof inp)) > 0)
{
write(trg, inp, read_bytes);
}
Your code is not standard C11. Check by reading the standard n1570, and read before the Modern C book.
Your code is more or less POSIX, and certainly compiles on most Linux distributions, e.g. Debian or Ubuntu (you want to install their build-essentials metapackage).
Please read the documentation of open(2), read(2), write(2), every syscalls(2) you are using, and of errno(3).
Notice that each of the functions you are calling can fail, and your code should test for the failure case. Also be aware that a write (or a read) could be partial in some cases, and this is documented.
Recommendation:
with a recent GCC -the usual C compiler on most Linux distributions, compile with all warnings and debug info, so gcc -Wall -Wextra -g.
Read Advanced Linux Programming and How to debug small programs
Learn to use the GDB debugger.
Read about build automation tools, such as GNU make (a very common tool on most Linux systems) or ninja.
Be aware of strace(1). You might use it on cp(1), or study the source code of GNU coreutils (providing cp).
Remember that most Linux distributions are mostly made of open source software.
You are allowed to study their source code.
I even believe that you should study their source code, at least for inspiration !
I'm trying to write a simple copy program. It reads test_data.txt in chunks of 100 bytes and copies those bytes to test_dest.txt
If performance matters, the chunk size of 100 bytes is really too small in practice. I would recommend a power of two bigger than 4Kbytes (the usual page size on x86-64).

Linux: executing code that is loaded to memory manually

I'm expermenting with function pointers on Linux and trying to execute this C program:
#include <stdio.h>
#include <string.h>
int myfun()
{
return 42;
}
int main()
{
char data[500];
memcpy(data, myfun, sizeof(data));
int (*fun_pointer)() = (void*)data;
printf("%d\n", fun_pointer());
return 0;
}
Unfortunately it segfaults on fun_pointer() call. I suspect that it is connected with some memory flags, but I don't found information about it.
Could you explain why this code segfaults? Don't see to the fixed data array size, it is ok and copying without calling the function is successfull.
UPD: Finally I've found that the memory segment should be marked as executable using mprotect system call called with PROT_EXEC flag. Moreover the memory segment should be returned by mmap function as stated in the POSIX specification.
There is the same code that uses allocated by mmap memory with PROT_EXEC flag (and works):
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
int myfun()
{
return 42;
}
int main()
{
size_t size = (char*)main - (char*)myfun;
char *data = mmap(NULL, size, PROT_EXEC | PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
memcpy(data, myfun, size);
int (*fun_pointer)() = (void*)data;
printf("%d\n", fun_pointer());
munmap(data, size);
return 0;
}
This example should be complied with -fPIC gcc option to ensure that the code in functions is position-independent.
Several problems there:
Your data array stays in data segment, not in code segment.
The address relocation is not handled.
The code size is not known, just guessed.
In addition to Diask's answer you probably want to use some JIT compilation techniques (to generate executable code in memory), and you should be sure that the memory zone containing the code is executable (see mprotect(2) and the NX bit; often the call stack is not executable for security reasons). You could use GNU lightning (quickly emitting slow machine code), asmjit, libjit, LLVM, GCCJIT (able to slowly emit fast optimized machine code). You could also emit some C code in some temporary file /tmp/emittedcode.c, fork a compilation command gcc -Wall -O -fPIC -shared /tmp/emittedcode.c -o /tmp/emittedcode.so then dlopen(3) that shared object /tmp/emittedcode.so and use dlsym(3) to find function pointers by their name there.
See also this, this, this, this and that answers. Read about trampoline code, closures, and continuations & CPS.
Of course, copying code from one zone to another usually don't work (it has to be position independent code to make that work, or you need your own relocation machinery, a bit like a linker does).
It's because this line is wrong:
memcpy(data, myfun, sizeof(data));
You are copying the code (compiled) of the function instead of the address of the function.
myfun and &myfun will have the same adress, so to do your memcpy operation, you will have to use a function pointer and then copy from its address.
Example:
int (*p)();
p = myfun;
memcpy(data, &p, sizeof(data));

Trying to implement enable_execute_stack (Mac OS X)

I have downloaded and compiled Apples source and added it to Xcode.app/Contents/Developer/usr/bin/include/c++/v1. Now how do I go about implementing in C? The code I am working with is from this post about Hackadays shellcode executer. My code is currently like so:
#include <stdio.h>
#include <stdlib.h>
unsigned char shellcode[] = "\x31\xFA......";
int main()
{
int *ret;
ret = (int *)&ret + 2;
(*ret) = (int)shellcode;
printf("2\n");
}
I have compiled with both:
gcc -fno-stack-protector shell.c
clang -fno-stack-protector shell.c
I guess my final question is, how do I tell the compiler to implement "__enable_execute_stack"?
The stack protector is different from an executable stack. That introduces canaries to detect when the stack has been corrupted.
To get an executable stack, you have to link saying to use an executable stack. It goes without saying that this is a bad idea as it makes attacks easier.
The option for the linker is -allow_stack_execute, which turns into the gcc/clang command line:
clang -Wl,-allow_stack_execute -fno-stack-protector shell.c
your code, however, does not try to execute code on the stack, but it does attempt to change a small amount of the stack content, trying to accomplish a return to the shellcode, which is one of the most common ROP attacks.
On a typically compiled OSX 32bit environment this would be attempting to overwrite what is called the linkage area (this is the address of the next instruction that will be called upon function return). This assumes that the code was not compiled with -fomit-frame-pointer. If it's compiled with this option, then you're actually moving one extra address up.
On OSX 64bit it uses the 64bit ABI, the registers are 64bit, and all the values would need to be referenced by long rather than by int, however the manner is similar.
The shellcode you've got there, though, is actually in the data segment of your code (because it's a char [] it means that it's readable/writable, not readable-executable. You would need to either mmap it (like nneonneo's answer) or copy it into the now-executable stack, get it's address and call it that way.
However, if you're just trying to get code to run, then nneonneo's answer makes it pretty easy, but if you're trying to experiment with exploit-y code, then you're going to have to do a little more work. Because of the non-executable stack, the new kids use return-to-library mechanisms, trying to get the return to call, say, one of the exec/system calls with data from the stack.
With modern execution protections in place, it's a bit tricky to get shellcode to run like this. Note that your code is not attempting to execute code on the stack; rather, it is storing the address of the shellcode on the stack, and the actual code is in the program's data segment.
You've got a couple options to make it work:
Put the shellcode in an actual executable section, so it is executable code. You can do this with __attribute__((section("name"))) with GCC and Clang. On OS X:
const char code[] __attribute__((section("__TEXT,__text"))) = "...";
followed by a
((void (*)(void))code)();
works great. On Linux, use the section name ".text" instead.
Use mmap to create a read-write section of memory, copy your shellcode, then mprotect it so it has read-execute permissions, then execute it. This is how modern JITs execute dynamically-generated code. An example:
#include <sys/mman.h>
void execute_code(const void *code, size_t codesize) {
size_t pagesize = (codesize + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);
void *chunk = mmap(NULL, pagesize, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANON, -1, 0);
if(chunk == MAP_FAILED) return;
memcpy(chunk, code, codesize);
mprotect(chunk, pagesize, PROT_READ|PROT_EXEC);
((void (*)(void)chunk)();
munmap(chunk, pagesize);
}
Neither of these methods requires you to specify any special compiler flags to work properly, and neither of them require fiddling with the saved EIP on the stack.

unix command result to a variable - char*

How can I assign "pwd" (or any other command in that case) result (present working dir) to a variable which is char*?
command can be anything. Not bounded to just "pwd".
Thanks.
Start with popen. That will let you run a command with its standard output directed to a FILE * that your parent can read. From there it's just a matter of reading its output like you would any normal file (e.g., with fgets, getchar, etc.)
Generally, however, you'd prefer to avoid running an external program for that -- you should have getcwd available, which will give the same result much more directly.
Why not just call getcwd()? It's not part of C's standard library, but it is POSIX, and it's very widely supported.
Anyway, if pwd was just an example, have a look at popen(). That will run an external command and give you a FILE* with which to read its output.
There is a POSIX function, getcwd() for this - I'd use that.
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
int main(int argc, char* argv[]) {
char *dir;
dir = getcwd(NULL, 0);
printf("Current directory is: %s\n", dir);
free(dir);
return 0;
}
I'm lazy, and like the NULL, 0 parameters, which is a GNU extension to allocate as large a buffer as necessary to hold the full pathname. (It can probably still fail, if you're buried a few hundred thousand characters deep.)
Because it is allocated for you, you need to free(3) it when you're done. I'm done with it quickly, so I free(3) it quickly, but that might not be how you need to use it.
You can fork and use one of the execv* functions to call pwd from your C program, but getting the result of that would be messy at best.
The proper way to get the current working directory in a C program is to call char* getcwd(char* name, size_t size);

Resources