Protect memory on x86-64 - c

I'm implementing a compiler which compiles a simulated processor's instruction set to x86 instructions. No physical processor exists which runs the simulated instruction set; there is just the simulation on x86. When executing the compiled machine code (the simulation), I want to make sure that only memory designated for the simulation is read or written. This serves two purposes: 1) An access outside of the designated memory area could mean that I have a bug in my compiler. I want to find such bugs. 2) An access outside of the designated memory area can also mean that the source instructions that I compiled have a logical error and therefore try to access a memory address which does not exist in my simulation, so an error should be raised.
In a simpler form, you can imagine my code to look like this:
void simulate(char* designated_memory, size_t len) {
// code intended to access *designated_memory till *(designated_memory + len - 1) only
}
Is there a way on x86-64 and/or in Linux to enforce that simulate() can only access its own stack and designated_memory, so that any other access generates an error? E.g. the code could look like this:
restrict_access_to(designated_memory, designated_memory + len - 1);
simulate(designated_memory, len);
remove_access_restriction();
A solution in C would be nice; asm is fine, too.
UPDATE:
Following Jester's comments, I came to try out this:
#include <stdio.h>
#include <unistd.h>
#include <malloc.h>
#include <sys/mman.h>
int main() {
size_t pagesize = sysconf(_SC_PAGESIZE);
printf("pagesize...........: %lu\n", pagesize);
char* m;
size_t len = 12345;
len = (len + pagesize - 1) / pagesize * pagesize;
posix_memalign(&m, pagesize, len);
printf("page aligned memory: %lx - %lx\n", (unsigned long) m, (unsigned long) m + len);
printf("protecting 0 till m..."); fflush(stdout);
mprotect(0, (size_t) m, PROT_NONE);
printf("done\n");
printf("protecting (m + len) till ?..."); fflush(stdout);
mprotect(m + len, 0x7fffffff, PROT_NONE);
printf("done\n");
printf("trying to modify memory..."); fflush(stdout);
*(m - 1000) = 5;
printf("done: %i\n", *(m - 1000));
free(m);
}
Which outputs:
pagesize...........: 4096
page aligned memory: 9ac000 - 9b0000
protecting 0 till m...done
protecting (m + len) till ?...done
trying to modify memory...done: 5
Segmentation fault (core dumped)
I think that this shows that modifying data outside of the allowed area still works, but that should not happen.

i do not use an interpreter. i use a compiler to compile the simulated instructions to x86 instructions. i do not want to include all the range checking in the compilation to x86 instructions for performance reasons.
So your task is very similar to what qemu does: when it emulates, for example, ARM on x86, it converts ARM instructions to x86 instructions,
so I suggest you look at the source code of qemu: http://wiki.qemu.org/Main_Page
The next project very similar to yours is valgrind.
The way valgrind works is very similar to what you do:
it executes the program on a kind of virtual CPU to check memory accesses,
and to speed things up it uses a JIT (http://valgrind.org/).
And the last open-source project that solves problems similar to yours is
https://github.com/google/sanitizers/tree/master/address-sanitizer
Yes, it is instrumented code, but the result runs much faster than valgrind.
You can find the idea of how to instrument generated code while keeping performance at a suitable level in this video about ASan internals:
https://www.youtube.com/watch?v=Q2C2lP8_tNE

If you are executing arbitrary assembly instructions in user-space code, there isn't really a way to enforce protection of the memory -- memory protection generally requires having the ability to activate kernel-mode flags on the CPU.
However, since you are writing a simulator, and compiling another language to a set of assembly instructions of your choosing, you have a different option: control the instructions being emitted. Instead of emitting raw memory access instructions for simulated memory accesses, have the simulation compiler replace instructions which would access memory with instructions which call a memory-access function of your own design, and implement memory access protection in that function. This could also be done, as Lorehead states, by having the memory region be something like a std::vector.
Note that just replacing normal move instructions with your own move function isn't really sufficient to protect your own wrapper code from being hacked by "simulated" code. If your simulation emits raw jump commands, then it can be broken by simulated code which jumps outside of the simulation. Or by carefully planned push/pop instructions which cause the wrapper code to return to other simulator code later, after you should have left the simulation.
To actually be (somewhat) safe (this is by no means exhaustive, or guaranteed to be sufficient to make the simulated code safe), you would need to make sure that your simulation compiler generates safe code; things which affect what memory may be accessed, including mov, jump, push, pop, call, and ret would need to be replaced with function calls that perform equivalent actions safely, rather than just executing assembly code that you can't trust.
You would also need to make sure to wrap this in code which saves your registers for your outer program, since the inner code may arbitrarily change their contents.
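The idea of routing every simulated memory access through a function of your own can be sketched roughly as follows. This is only a minimal illustration under assumptions: the names sim_memory, sim_store8 and sim_load8 are hypothetical, and a real compiler would emit calls to (or inline) something similar for each simulated load and store.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for the memory the simulation may touch. */
struct sim_memory {
    uint8_t *base;  /* start of designated_memory */
    size_t   len;   /* number of valid bytes */
};

/* Store one byte; returns false instead of writing out of bounds. */
bool sim_store8(struct sim_memory *m, size_t addr, uint8_t value)
{
    if (addr >= m->len)
        return false;       /* simulated access fault */
    m->base[addr] = value;
    return true;
}

/* Load one byte into *out; returns false on an out-of-bounds address. */
bool sim_load8(const struct sim_memory *m, size_t addr, uint8_t *out)
{
    if (addr >= m->len)
        return false;
    *out = m->base[addr];
    return true;
}
```

The caller decides what a false return means, for example raising the simulated machine's access-fault error.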

You write that you don’t want to add runtime bounds checking for performance reasons. I suggest you look closely at the actual runtime cost and reconsider: it’s literally a single instruction if you use indirect addressing (ja) which even the simplest branch predictors can get right if you put the exception handler last. If you can recognize and optimize (some) loops, you can do one bounds check before the loop, instead of one before every access inside the loop.
If you really can’t afford this overhead, one option you can try is to compile the executable and fork off a process that has no other memory in its address space. If you’re going to run in a process that uses the stack, though, it’s theoretically possible that native code could clobber the stack. You can also make it impossible to access memory outside the address space if you make the size of the address space exactly 2^16 or 2^32 bytes and have the compiled code access it with that size register. Otherwise, you’re stuck trying to poison every single other page of the address space.
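The 2^16 variant mentioned above might look like the sketch below. It assumes a simulated machine with 16-bit addresses, which the question does not state, and all names are hypothetical. Because the index is a uint16_t, it cannot name a byte outside the 65536-byte array, so no explicit bounds check is needed for single-byte accesses.

```c
#include <stdint.h>

#define SIM_MEM_SIZE 65536u   /* exactly 2^16 bytes */

/* Hypothetical backing store for the whole simulated address space. */
uint8_t sim_mem[SIM_MEM_SIZE];

/* A uint16_t index can only take values 0..65535, all of them valid. */
void sim_store8_wrapped(uint16_t addr, uint8_t value)
{
    sim_mem[addr] = value;
}

uint8_t sim_load8_wrapped(uint16_t addr)
{
    return sim_mem[addr];
}

/* Address arithmetic is truncated back to 16 bits, so it wraps around
   inside the simulated memory instead of escaping it. */
uint16_t sim_effective_address(uint16_t base, uint16_t offset)
{
    return (uint16_t)(base + offset);
}
```

Multi-byte accesses near the top of the address space would still need care, since the second byte of a 16-bit load at 0xFFFF has to wrap as well.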


C: How to change my own program in my program in runtime?

At runtime, either the assembler or machine code (which is it?) should be somewhere in RAM. Can I somehow get access to it, and read or even write to it?
This is just for educational purposes.
So, I just could compile this code. Am I really reading myself here?
#include <stdio.h>
#include <sys/mman.h>
int main() {
void *p = (void *)main;
mprotect(p, 4098, PROT_READ | PROT_WRITE | PROT_EXEC);
printf("Main: %p\n Content: %i", p, *(int *)(p+2));
unsigned int size = 16;
for (unsigned int i = 0; i < size; ++i) {
printf("%i ", *((int *)(p+i)) );
}
}
Though, if I add
*(int*)p =4;
then it's a segmentation fault.
From the answers, I could construct the following code which modifies itself during runtime:
#include <stdio.h>
#include <sys/mman.h>
#include <errno.h>
#include <string.h>
#include <stdint.h>
void * alignptr(void * ptr, uintptr_t alignment) {
return (void *)((uintptr_t)ptr & ~(alignment - 1));
}
// pattern is a 0-terminated string
char* find(char *string, unsigned int stringLen, char *pattern) {
unsigned int iString = 0;
unsigned int iPattern;
for (unsigned int iString = 0; iString < stringLen; ++iString) {
for (iPattern = 0;
pattern[iPattern] != 0
&& string[iString+iPattern] == pattern[iPattern];
++iPattern);
if (pattern[iPattern] == 0) { return string+iString; }
}
return NULL;
}
int main() {
void *p = alignptr(main, 4096);
int result = mprotect(p, 4096, PROT_READ | PROT_WRITE | PROT_EXEC);
if (result == -1) {
printf("Error: %s\n", strerror(errno));
}
// Correct a part of THIS program directly in RAM
char programSubcode[12] = {'H','e','l','l','o',
' ','W','o','r','l','t',0};
char *programCode = (char *)main;
char *helloWorlt = find(programCode, 1024, programSubcode);
if (helloWorlt != NULL) {
helloWorlt[10] = 'd';
}
printf("Hello Worlt\n");
return 0;
}
This is amazing! Thank you all!
In principle it is possible, in practice your operating system will protect itself from your dangerous code!
Self-modifying code may have been regarded as a "neat-trick" in the days when computers had very tiny memories (in the 1950's). It later (when it was no longer necessary) came to be regarded as bad practice - resulting in code that was hard to maintain and debug.
In more modern systems (at the end of the 20th Century) it became a behaviour indicative of viruses and malware. As a consequence all modern desktop operating systems disallow modification of the code space of a program and also prevent execution of code injected into data space. Modern systems with an MMU can mark memory regions as read-only, and not-executable for example.
The simpler question of how to obtain the address of the code space - that is simple. A function pointer value for example is generally the address of the function:
int main()
{
printf( "Address of main() = %p\n", (void*)main ) ;
}
Note also that on a modern system this address will be a virtual rather than physical address.
Machine code is loaded into memory. In theory you can read and write it just like any other part of memory your program has access to.
There can be some roadblocks to doing this in practice. Modern OSes try to limit the data sections of memory to read/write operations but no execution, and machine code sections of memory to read/execute but no writing. This is to limit the potential security vulnerabilities that come with allowing a program to execute whatever it feels like putting into memory (like random stuff it might pull down from the Internet).
Linux provides the mprotect system call to allow some amount of customization for memory protection. Windows provides the SetProcessDEPPolicy system call.
Edit for updated question
It looks like you're trying this on Linux, and using mprotect. The code you posted is not checking the return value from mprotect, so you don't know if the call is succeeding or failing. Here is an updated version that checks the return value:
#include <stdio.h>
#include <sys/mman.h>
#include <errno.h>
#include <string.h>
#include <stdint.h>
void * alignptr(void * ptr, uintptr_t alignment)
{
return (void *)((uintptr_t)ptr & ~(alignment - 1));
}
int main() {
void *p = alignptr((void *)main, 4096);
int result = mprotect(p, 4096, PROT_READ | PROT_WRITE | PROT_EXEC);
if (result == -1) {
printf("Error: %s\n", strerror(errno));
}
// Arithmetic on a function pointer is a GCC extension, so go through a byte pointer
unsigned char *code = (unsigned char *)main;
printf("Main: %p\n Content: %i", (void *)main, *(int *)(code+2));
unsigned int size = 16;
for (unsigned int i = 0; i < size; ++i) {
printf("%i ", *((int *)(code+i)) );
}
}
Note the changes to the length parameter passed to mprotect and the function aligning the pointer to a system page boundary. You'll need to investigate on your specific system. My system has an alignment of 4096 bytes (determined by running getconf PAGE_SIZE) and after aligning the pointer and changing the length parameter to mprotect to the page size this works, and lets you write over your pointer to main.
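Rather than hard-coding 4096, the page size can be queried at run time. The helper below is a small sketch of that; the name page_align_down is hypothetical, and it assumes the page size is a power of two, which holds on common systems.

```c
#include <stdint.h>
#include <unistd.h>

/* Round ptr down to the start of the page that contains it.
   Returns NULL if the page size cannot be determined. */
void *page_align_down(const void *ptr)
{
    long pagesize = sysconf(_SC_PAGESIZE);
    if (pagesize <= 0)
        return NULL;
    /* Assumes pagesize is a power of two, so the mask clears the low bits. */
    return (void *)((uintptr_t)ptr & ~((uintptr_t)pagesize - 1));
}
```

The result can then be passed to mprotect together with a length that covers every page you want to change.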
As others have said, this is a bad way to dynamically load code. Dynamic libraries, or plugins, are the preferred method.
On most operating systems (Linux, Windows, Android, MacOSX, etc...), a program doesn't execute (directly) in RAM but has its own virtual address space and runs in it (stricto sensu, the code is not always or necessarily in RAM; you can have code which is not in RAM and which gets executed after some page fault brings it transparently into RAM). The RAM is (directly) managed by the OS, but your process only sees its virtual address space (initialized at execve(2) time and modified with mmap(2), munmap, mprotect, mlock(2)...). Use proc(5) and try cat /proc/$$/maps in a Linux shell to understand more about the virtual address space of your shell process. On Linux, you could query the virtual address space of your process by reading the /proc/self/maps file (sequentially, it is a textual pseudo-file).
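Reading /proc/self/maps from inside a process could look like the following sketch. It is Linux-specific, and the function name dump_self_maps is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Copy the current process's memory map to out.
   Returns the number of mapping lines, or -1 if the file cannot be opened. */
int dump_self_maps(FILE *out)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    if (!maps)
        return -1;

    char chunk[512];
    int lines = 0;
    /* Read sequentially; a long line may arrive in several chunks,
       so only chunks that end a line are counted. */
    while (fgets(chunk, sizeof chunk, maps)) {
        fputs(chunk, out);
        if (strchr(chunk, '\n'))
            lines++;
    }
    fclose(maps);
    return lines;
}
```

Calling dump_self_maps(stdout) before and after an mprotect call shows how the permissions column of the affected mapping changes.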
Read Operating Systems: Three Easy Pieces to learn more about OSes.
In practice, if you want to augment the code inside your program (running on some common OS) you'll better use plugins and the dynamic loading facilities. On Linux and POSIX systems you'll use dlopen(3) (which uses mmap etc...) then with dlsym(3) you'll obtain the (virtual) address of some new function and you could call it (by storing it in some function pointer of your C code).
You don't really define what a program is. I claim that a program is not only an executable, but is also made of other resources (such as specific libraries, perhaps fonts or configuration files, etc...) and that is why when you install some program, quite often much more than the executable is moved or copied (look into what make install does for most free software programs, even ones as simple as GNU coreutils). Therefore, a program (on Linux) which generates some C code (e.g. in some temporary file /tmp/genecode.c), compiles that C code into a plugin /tmp/geneplug.so (by running gcc -Wall -O -fPIC -shared /tmp/genecode.c -o /tmp/geneplug.so), then dlopens that /tmp/geneplug.so plugin is genuinely modifying itself. And if you code in C exclusively, that is a sane way of writing self-modifying programs.
Generally, your machine code sits in the code segment, and that code segment is read-only (and sometimes even execute-only; read about the NX bit). If you really want to overwrite code (and not to extend it), you'll need to use facilities (perhaps mprotect(2) on Linux) to change those permissions and enable rewriting inside the code segment.
Once some part of your code segment is writable, you could overwrite it.
Consider also some JIT-compiling libraries, such as libgccjit or asmjit (and others), to generate machine code in memory.
When you execve a fresh executable, most of its code does not (yet) sit in RAM. But (from the point of view of user code in the application) you can run it (and the kernel will transparently, but lazily, bring code pages into RAM through demand paging). That is what I try to explain by saying that your program runs in its virtual address space (not directly in RAM). An entire book is needed to explain that further.
For example, suppose you have a huge executable (for simplicity, assume it is statically linked) of one gigabyte. When you start that executable (with execve) the entire gigabyte is not brought into RAM. If your program exits quickly, most of the gigabyte has not been brought into RAM and stays on the disk. Even if your program runs for a long time, but never calls a huge routine of a hundred megabytes of code, that code part (the 100 Mbyte of the never used routine) won't be in RAM.
BTW, stricto sensu, self-modifying code is rarely used these days (and current processors don't even handle it efficiently, e.g. because of caches and branch predictors). So in practice, you don't modify exactly your machine code (even if that would be possible).
And malware doesn't have to modify the currently executed code. It could (and often does) inject new code in memory and jump somehow to it (more precisely, call it through some function pointer). So in general you don't overwrite existing "actively used" code, you create new code elsewhere and you call it or jump to it.
If you want to create new code elsewhere in C, plugin facilities (e.g. dlopen and dlsym on Linux), or JIT libraries, are more than enough.
Notice that the mention of "changing your program" or "writing code" is very ambiguous in your question.
You might just want to extend the code of your program (and then using plugin techniques, or JIT-compilation libraries, is relevant). Notice that some programs (e.g. SBCL) are able to generate machine code at every user interaction.
You could change the existing code of your program, but then you should explain what that exactly means (what does "code" mean for you exactly ? Is it only the currently executed machine instruction or is it the entire code segment of your program?). Do you think of self-modifying code, of generating new code, of dynamic software updating?
Can I somehow get access to it, and read or even write to it?
Of course yes. You need to change protection in your virtual address space for your code (e.g. with mprotect) and then to write many bytes on some "old code" part. Why would you want to do that is a different story (and you have not explained why). I don't see any educational purposes in doing that -you are likely to crash your program quite quickly (unless you take a lot of precautions to write good enough machine code in memory).
I am a great fan of metaprogramming but I generally generate some new code and jump into it. On our current machines, I see no value in overwriting existing code. And (on Linux), my manydl.c program demonstrates that you could generate C code, compile, and dynamically link more than a million plugins (and dlopen all of them) in a single program. In practice, on current laptop or desktop computers, you can generate a lot of new code (before being concerned by limits). And C is fast enough (both in compilation time and in run time) that you could generate a thousand C lines at every user interaction (so several times per second), compile and dynamically load them (I did that ten years ago in my defunct GCC MELT project).
If you want to overwrite executable files on disk (I see no value in doing that, it is much simpler to create fresh executables), you need to understand deeply their structure. For Linux, dive into the specifications of ELF.
In the edited question, you forgot to test for failure of mprotect. It is probably failing (note also that 4098 is not a multiple of the page size; 4096 was presumably intended). So please at least code:
int c = mprotect(p, 4096, PROT_READ | PROT_WRITE | PROT_EXEC);
if (c) { perror("mprotect"); exit(EXIT_FAILURE); };
Even with the 4096 (instead of 4098) that mprotect is likely to fail with EINVAL, because main is probably not aligned to a 4K page. (Don't forget that your executable also contains crt0 code).
BTW, for educational purposes, you should add the following code near the start of your main:
char cmdbuf[80];
snprintf (cmdbuf, sizeof(cmdbuf), "/bin/cat /proc/%d/maps", (int)getpid());
fflush(NULL);
if (system(cmdbuf))
{ fprintf(stderr, "failed to run %s\n", cmdbuf); exit(EXIT_FAILURE); }
and you could add a similar code chunk near the end. You might replace the snprintf format string for cmdbuf with "pmap %d".
The most straightforward and practical way to accomplish this is to use function pointers. You can declare a pointer such as:
void (*contextual_proc)(void) = default_proc;
Then call it with the syntax contextual_proc();. You can also assign a different function with the same signature to contextual_proc, say contextual_proc = proc_that_logs;, and any code that calls contextual_proc() will then (modulo thread-safety) call the new code instead.
This is a lot like self-modifying code in effect, but it is easier to understand, portable, and actually works on modern CPUs where executable memory is not writable and instructions are cached.
In C++, you would use subclasses with virtual functions for this; dynamic dispatch is implemented the same way under the hood.
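The function-pointer approach described above can be sketched like this. The names contextual_proc, default_proc and proc_that_logs come from the answer; do_work and calls_logged are hypothetical additions used only to show the effect of swapping the pointer.

```c
#include <stdio.h>

/* Hypothetical counter so the effect of the swap can be observed. */
int calls_logged = 0;

void default_proc(void)
{
    /* default behaviour: nothing special */
}

void proc_that_logs(void)
{
    calls_logged++;
    fprintf(stderr, "contextual_proc called\n");
}

/* The pointer starts out bound to the default implementation. */
void (*contextual_proc)(void) = default_proc;

/* Any code that calls through the pointer picks up whatever is assigned. */
void do_work(void)
{
    contextual_proc();
}
```

After contextual_proc = proc_that_logs; every later do_work() runs the logging version, without any machine code having been rewritten.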

how to fix stack overflow error?

So, I was making this program that tells people the number of contiguous subarrays whose sum is equal to a certain value.
I have written the code, but when I try to run it in vcexpress 2010, it gives this error:
Unhandled exception at 0x010018e7 in test 9.exe: 0xC00000FD: Stack overflow.
I have tried to search for a solution on this website and other websites, but I can't seem to find one that helps me fix the error in this code (they are using recursion while I'm not).
I would be really grateful if you would kindly explain what causes this error in my code, and how to fix it. Any help would be appreciated. Thank you.
here is my code :
#include <stdio.h>
int main ()
{
int n,k,a=0,t=0;
unsigned long int i[1000000];
int v1,v2=0,v3;
scanf("%d %d",&n,&k);
for(v3=0;v3<n;v3++)
{
scanf("%d",&i[v3]);
}
do
{
for(v1=v2;v1<n;v1++)
{
t=i[v1]+t;
if(t==k)
{
a++;
break;
}
}
t=0;
v2++;
}while(v2!=n);
printf("%lu",a);
return 0;
}
Either move
unsigned long int i[1000000];
outside of main, thus making it a global variable (not an automatic one), or better yet, use some C dynamic heap allocation:
// inside main
unsigned long int *i = calloc(1000000, sizeof(unsigned long int));
if (!i) { perror("calloc"); exit(EXIT_FAILURE); };
BTW, for such a pointer, I would use (for readability reasons) some other name than i. And near the end of main you'd better free(i); to avoid memory leaks.
Also, you could move these 2 lines after the read of n and use calloc(n, sizeof(unsigned long int)) instead of calloc(1000000, sizeof(unsigned long int)) ; then you can handle arrays bigger than a million elements if your computer and system provides enough resources for that.
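The heap-allocation advice above can be sketched as follows. The function names alloc_values and count_matches are hypothetical, and the counting rule simply mirrors the loop in the question (for each start index, count once if some run beginning there sums to k).

```c
#include <stdlib.h>

/* Allocate n zero-initialised elements on the heap; NULL on failure.
   The caller releases the memory with free(). */
unsigned long *alloc_values(size_t n)
{
    return calloc(n, sizeof(unsigned long));
}

/* For every start index, add up elements until the running sum equals k;
   count that start once and move on, as the loop in the question does. */
unsigned long count_matches(const unsigned long *values, size_t n,
                            unsigned long k)
{
    unsigned long count = 0;
    for (size_t start = 0; start < n; start++) {
        unsigned long sum = 0;
        for (size_t j = start; j < n; j++) {
            sum += values[j];
            if (sum == k) {
                count++;
                break;
            }
        }
    }
    return count;
}
```

In main, you would read n first, call alloc_values(n), check the result for NULL, fill the array with scanf, and free it at the end.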
Your initial code is declaring an automatic variable which goes into the call frame of main on your call stack (which has a limited size, typically a megabyte or a few of them). On some operating systems there is a way to increase the size of that call stack (in an OS-specific way). BTW each thread has its own call stack.
As a rule of thumb, your C functions (including main) should avoid having call frames bigger than a few kilobytes. With the GCC compiler, you could invoke it with gcc -Wall -Wextra -Wframe-larger-than=1024 -g to get useful warnings and debug information.
Read the virtual address space wikipage. It has a nice picture worth many words. Later, find the way to query, on your operating system, the virtual address space of your process (on Linux, use proc(5) like cat /proc/$$/maps etc...). In practice, your virtual address space is likely to contain many segments (perhaps a dozen, sometimes thousands). Often, the dynamic linker or some other part of your program (or of your C standard library) uses memory-mapped files. The standard C heap (managed by malloc etc) may be organized in several segments.
If you want to understand more about virtual address space, take time to read a good book, like: Operating systems, three easy pieces (freely downloadable).
If you want to query the organization of the virtual address space in some process, you need to find an operating-system specific way to do that (on Linux, for a process of pid 1234, use /proc/1234/maps or /proc/self/maps from inside the process).
Memory is laid out quite differently from the simple four-segment model of long ago. The answer to the question can be generalized this way: the system handles global or dynamically allocated memory differently from local variables. Whereas the memory for local variables is limited in size, memory for dynamic allocation or global variables has no comparably low limit.
Modern systems have the concept of a virtual address space. The process running your program gets a chunk of it, and that portion of memory is responsible for holding everything the process needs.
For dynamic allocation, the process can request more memory, and the request is serviced depending on what other processes are using. For a dynamic or global array there is no per-process limit (system-wide there is, of course; it can't request all memory). That's why dynamic allocation or a global variable won't make the process run out of its allocated memory, unlike the automatic-lifetime memory it originally had for local variables.
Basically, you can check your stack size, for example on Linux with ulimit -s (the result is in kilobytes), and then decide how to adapt your code to it.
As a rule, I would never allocate a big piece of memory on the stack: unless you know exactly the depth of your function calls and their stack use, it's hard to control precisely how much stack memory is allocated at run time.
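The same limit that ulimit -s reports can also be queried from inside a C program on POSIX systems with getrlimit. The wrapper name stack_soft_limit below is hypothetical; note that getrlimit reports bytes, whereas ulimit -s prints kilobytes.

```c
#include <sys/resource.h>

/* Fetch the soft limit on the stack size, in bytes.
   Returns 0 on success and -1 on failure; *bytes may be RLIM_INFINITY
   when no limit is set. */
int stack_soft_limit(rlim_t *bytes)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    *bytes = rl.rlim_cur;
    return 0;
}
```

Comparing this value with the size of a planned automatic array gives a rough idea of whether it can fit at all.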

Assigning (const char *) to function pointer executing a hex code

I found a C code that looks like this:
#include <stdio.h>
char code[] =
"\x31\xd2\xb2\x30\x64\x8b\x12\x8b\x52\x0c\x8b\x52\x1c\x8b\x42"
"\x08\x8b\x72\x20\x8b\x12\x80\x7e\x0c\x33\x75\xf2\x89\xc7\x03"
"\x78\x3c\x8b\x57\x78\x01\xc2\x8b\x7a\x20\x01\xc7\x31\xed\x8b"
"\x34\xaf\x01\xc6\x45\x81\x3e\x46\x61\x74\x61\x75\xf2\x81\x7e"
"\x08\x45\x78\x69\x74\x75\xe9\x8b\x7a\x24\x01\xc7\x66\x8b\x2c"
"\x6f\x8b\x7a\x1c\x01\xc7\x8b\x7c\xaf\xfc\x01\xc7\x68\x72\x6c"
"\x64\x01\x68\x6c\x6f\x57\x6f\x68\x20\x48\x65\x6c\x89\xe1\xfe"
"\x49\x0b\x31\xc0\x51\x50\xff\xd7";
int main(void)
{
int (*func)();
func = (int(*)()) code;
(int)(*func)();
return 0;
}
For the given HEX CODE this program runs well and prints "HelloWorld". I was thinking that the HEX CODE is some machine instructions, and that by calling a function pointer that points to that CODE we are executing that CODE.
Was my thought right? Is there something to improve it?
How does this HEX CODE get generated?
Thanks in advance.
You are correct that by forcing a function pointer like this you are calling into machine instructions written as a hexadecimal string variable.
I doubt that a program like this would work on any CPU since about 2005.
On most RISC CPUs (like ARM) and on all Intel and AMD CPUs that support 64-bit, memory pages have a No Execute bit. Or in reverse an Execute bit.
On memory pages that do not have an Execute bit, the CPU will not run code. Compilers do not put variables into executable memory pages.
In order to run injected shell codes, attackers now have to use "return into libc" or function pointer overwrite attacks which set things up to call mprotect or VirtualProtect to set the execute bit on their shell code. Either that or get it injected into an executable space such as the one the Java, .NET, or Javascript JIT compiler uses.
Security hardened kernels will deny the ability to call mprotect. Once the program's address space is set by the dynamic library loader, it sets a security flag and no new executable pages can be created.
In order to make it always work you could allocate some executable, readable and writable space with VirtualAlloc or the like, put the code in there, and then execute it. Then there won't be any access violation faults.
#include <windows.h>
#include <cstring>
#include <iostream>
// code[] is the 113-byte array from the question
int main()
{
    // Passing NULL lets the system choose where to place the region
    void* PointerToNewMemoryRegion = VirtualAlloc(NULL, 113, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
    if (PointerToNewMemoryRegion == NULL)
    {
        std::cout << "Failed to Allocate Memory region Error code: " << GetLastError();
        return 1;
    }
    memcpy(PointerToNewMemoryRegion, code, 113);
    void (*FunctionPointer)() = (void(*)()) PointerToNewMemoryRegion;
    FunctionPointer();
    // MEM_RELEASE requires a size of 0 and frees the whole region
    VirtualFree(PointerToNewMemoryRegion, 0, MEM_RELEASE);
    return 0;
}
But the code never returns to my code, so my last line is pointless. So my code has a memory leak.
To ask this question from a "general C" point of view isn't all that meaningful.
First of all, your code has many major problems:
The literal "\xFF\xFF\xFF" is stored as the bytes FF FF FF 00, with the terminating null last. Read as a big-endian 32-bit value that is 0xFFFFFF00, not 0x00FFFFFF as may or may not have been the intention.
What this hex code means, and whether it is meaningful at all, is endian-dependent and also depends on the address bus width of the given CPU.
As others have mentioned, casts between function pointers and regular pointers aren't supported or well-defined by C; the C standard lists them as a "common extension".
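The first two points can be checked with a short sketch. The function names are hypothetical; the code copies the four bytes of the literal (three 0xFF bytes plus the terminator) into a 32-bit integer exactly as they sit in memory, so the resulting value depends on the byte order of the host.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the literal's four bytes as a 32-bit integer in host order. */
uint32_t literal_as_u32(void)
{
    static const char bytes[] = "\xFF\xFF\xFF";   /* sizeof bytes == 4 */
    uint32_t value;
    memcpy(&value, bytes, sizeof value);
    return value;
}

/* Returns 1 on a little-endian host, 0 otherwise. */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);   /* look at the lowest-addressed byte */
    return first == 1;
}
```

On a little-endian machine such as x86 the terminator ends up in the most significant byte, so the same bytes read back as 0x00FFFFFF.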
That being said, code like this has about one single purpose, and that is various forms of boot loaders and self-updating software used in embedded systems.
Suppose for example that you have a boot loader program that is tasked with re-programming something in the very same segment of flash memory where said program itself is executed from. That is impossible because of the way the memory hardware works. So in order to do so, you would have to execute the actual flash programming routine from RAM. Since the array of hex gibberish is stored in RAM, the program can execute from there with the function pointer trick, assuming that the C compiler has a non-standard extension that allows the cast.
As for how to generate the code, you either write it all in assembler and then translate the assembler instructions to op codes manually (very tedious). Or more likely, you write the function in C and then disassemble it and copy/paste the op codes from the disassembly.
The latter is more dangerous though, as the critical part of getting code like this to work is calling convention: you must be absolutely sure that the function stacks/unstacks things properly when it is called and when it is done, restoring the contents of any CPU registers used etc. Which may force you to write part of the function in assembler anyhow. Needless to say, the code will be completely non-portable.

C (or asm): how to execute c code stored in memory (copied from labels)

I'm trying to "inline" my VM by copying code segments from C code between labels to memory allocated by malloc. So I have Ops defined with start and end labels, and I want to copy the instructions defined by the following code to a buffer and then have them executed (I'm not sure if this is even possible)
OP_PUSH0_START:
sp += 4; *sp = 0; // I WANT THE INSTRUCTIONS OF THIS LINE COPIED TO THE BUFFER
OP_PUSH0_END:
to do so I thought the following code snippet will work
void * ptr0 = &&OP_PUSH0_START;
void * ptr1 = &&OP_PUSH0_END;
while(ptr0 < ptr1)
{
buf[c++] = *ptr0;
ptr0++;
}
goto buf; //jump to start of buffer
but I can't even read it out without getting a memory error.
I would be happy about any links or suggestions on how to achieve this.
The only legal way to transfer execution to an arbitrary location is to use a function pointer. goto only jumps to labels, not arrays or anything else.
Also, in standard C you cannot take the address of a label; a label is not an object or a function. The &&label syntax used in the question is a GCC extension.
It is rightly pointed out that data areas are often placed in memory whose content cannot be executed as CPU instructions. There are, however, often workarounds for that. Windows and Linux provide functions to change the permissions/rights/privileges/whatever-you-call-it of a region of the memory.
For example, here's an example of doing the kind of thing you're trying to do on Windows.
Just an addition to Alexey's answer I would link my own sample of creating the jit-executor.
How to make a C program that can run x86 hex codes
The AsmJIT library is a fine x86/x64 "one line" assembler which actually creates a complete executable chunk of memory.
The portable version of jit engine is the LuaJIT. It supports the creation of function trampolines for the ARM/x86/PowerPC/MIPS architectures.
The thing about "pointer to the label" cannot be standard in C, because there are hardware architectures in which data and code do not share the same memory.

Kind of self-modifying program in C [duplicate]

Closed 14 years ago.
Is it possible to write a C function that does the following?
Allocate a bunch of memory in the heap
Writes machine code in it
Executes those machines instructions
Of course, I would have to restore the state of the stack to what it was prior to the execution of those machine instructions manually, but I want to know if this is feasible in first place.
It's certainly possible. For various reasons, we've spent a lot of effort over the last 30-40 years trying to make it as difficult as possible, but it is possible. In most systems now, there are hardware and software mechanisms that attempt to protect data space from being executed.
The basics, though, are fairly straightforward: you construct a piece of code, and assemble it, either by hand or via a compiler. You then need a fragment of code space, so you insert the code into your program
unsigned char prgm[] = { 0x0F, 0xAB, 0x9A ... }; // Random numbers, just as an example; one array element per code byte
since you wanted to use the heap you need to malloc the space
void * myspace ;
if((myspace= malloc(sizeof(prgm))) != NULL) {
memcpy(myspace, prgm, sizeof(prgm));
} else { // allocation error
}
Now, what you need is a way to get the program counter to point to that chunk of data that is also your chunk of code. Here's where you need a little craftiness. Setting the program counter is no big deal; that's just a JUMP instruction for your underlying machine. But how to do that?
One of the easiest ways is by purposefully messing with the stack. The stack, again conceptually, looks something like this (the details depend on both your OS and compiler pairs, and on your hardware):
| subroutine return addr |
| parameters ... |
| automatic variables |
The basic trick here is to sneakily get the address of your code into the return address; when a routine returns, it basically jumps to that return address. If you can fake it out, the PC will be set to where you like.
So, what you need is a routine, let's call it "goThere()"
void goThere(void * addr){
int a ; // observe above; this is the first space
// on the stack following the parameters
int * pa; // so we use its address
pa = (&a - (sizeof(int)+(2*sizeof(void*)))) ; // so use the address
// but back up by the size of an int, the pointer on the
// stack, and the return address
// Now 'pa' points to the routine's return address on the stack.
*pa = addr; // sneak the address of the new code into return addr
return ; // and return, tricking it into "returning"
// to the address of your special code block
}
Will it work? Well, maybe, depending on the hardware and OS. Most modern OS's will protect the heap (via memory mapping or similar) from the PC moving into it. This is a useful thing for security purposes, because we'd just as well not let you take that kind of complete control.
This is very similar to this question :)
Read calling code stored in the heap from vc++. On posix, mprotect seems to be appropriate (look into man mprotect):
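// mprotect needs a page-aligned address, which malloc does not guarantee,
// so the buffer comes from mmap instead
char *mem = mmap(NULL, sizeof(code), PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
if (mem == MAP_FAILED) { perror("mmap"); exit(EXIT_FAILURE); }
memcpy(mem, code, sizeof(code));
if (mprotect(mem, sizeof(code), PROT_READ|PROT_EXEC) == -1) { perror("mprotect"); exit(EXIT_FAILURE); }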
// now arrange some code to jump to mem. But read the notes here on casting
// from void* to a function pointer:
// http://www.opengroup.org/onlinepubs/009695399/functions/dlsym.html
However, it says:
Whether PROT_EXEC has any effect different from PROT_READ is architecture- and kernel version-dependent. On some hardware architectures (e.g., i386), PROT_WRITE implies PROT_READ.
So better first check whether that works on your operating system.
RE: manually restoring the stack
If you follow the calling conventions used by your platform / compiler inside the machine code you generate, then you shouldn't have to do any manual stack restoring. The compiler will do that for you, when you do
(*pfunc)(args)
it should add any appropriate pre or post call stack manipulation steps that are necessary.
Just make sure that you follow the right conventions inside the generated code, however.
