gcc inline assembly for context-switching - c

I am trying to implement a context switch using gcc for m68k processors. I need to use inline assembly to save all the registers d0, d1...d7 and a0,...a7. I was wondering if I can use a loop in my inline assembly to save these registers instead of writing a separate line of code for each register.
For example:
move.l %d0, temp
pcb.cpuregs.d0 = temp
I want the 0 in d0 to act like a loop counter.

Here you go:
MOVEM.L D0-D7/A0-A7,-(A7) ;Save registers onto stack.
You don't have to use the stack, you can use some other address.
I have a feeling that the pre-decrement mode is compulsory,
but I can't test that right now as I don't have a 68k machine.
p.s. that's probably not gcc dialect, seeing as gcc didn't exist when
I wrote that code, but I'm sure you can figure it out.
p.p.s why not use setjmp instead of inline assembly?
then your context switcher would be semi-portable.
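A minimal sketch of the setjmp idea, assuming a single hypothetical jmp_buf named saved_context standing in for the register save area. setjmp records the callee-saved registers and stack pointer, and longjmp restores them. Note that this alone only jumps back to a frame that is still live; switching between independent tasks also needs a separate stack per task, which this sketch does not attempt.

```c
#include <setjmp.h>

/* Hypothetical save area; a real switcher would keep one per task. */
static jmp_buf saved_context;

/* Stands in for a task that gives control back to the saved context. */
static void task_body(void)
{
    /* Resume at the setjmp call below; setjmp then returns 7. */
    longjmp(saved_context, 7);
}

/* Saves the current context, runs the task, and reports how we came back. */
int run_task_once(void)
{
    switch (setjmp(saved_context)) {
    case 0:              /* direct return: context was just saved */
        task_body();     /* does not return normally */
        return -1;
    case 7:              /* returned here through longjmp */
        return 7;
    default:
        return -2;
    }
}
```

Calling run_task_once() saves the context, enters task_body(), and comes back through longjmp with the value passed to it.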

You may want to consider macros:
#define SAVE_REG_DXX(no) __asm__ __volatile__("move.l %%d" #no ", %0" : "=g" (pcb.cpuregs.d ## no))
SAVE_REG_DXX(0);
SAVE_REG_DXX(1);
SAVE_REG_DXX(2);
#undef SAVE_REG_DXX

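You can't use a C-style for loop inside the asm block. The asm template has to be a string literal known at compile time, so you can't assemble it at run time either, but you can have the preprocessor build the string for you (macros plus string-literal concatenation) and pass that along to asm.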
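One way to get the effect of a loop is to let the preprocessor paste the template together. The sketch below is an assumption-laden illustration: struct cpuregs, SAVE_D and save_d0_d3 are hypothetical names, only d0-d3 are covered to keep it short, and the m68k branch has not been tested on real hardware. The register number doubles as the operand number, so each generated line reads "move.l %dN, %N".

```c
#define STR(x) #x
/* One template line per register: "move.l %%dN, %N\n\t". */
#define SAVE_D(n) "move.l %%d" STR(n) ", %" STR(n) "\n\t"
#define SAVE_D0_D3 SAVE_D(0) SAVE_D(1) SAVE_D(2) SAVE_D(3)

/* Hypothetical save area for four data registers. */
struct cpuregs { unsigned long d[4]; };

/* The template as the compiler sees it, exposed here for inspection. */
const char save_d0_d3_template[] = SAVE_D0_D3;

#if defined(__m68k__)
/* Only built for m68k targets; stores d0-d3 into the save area. */
static inline void save_d0_d3(struct cpuregs *regs)
{
    __asm__ __volatile__(SAVE_D0_D3
        : "=m"(regs->d[0]), "=m"(regs->d[1]),
          "=m"(regs->d[2]), "=m"(regs->d[3]));
}
#endif
```

The expansion happens entirely at compile time, so the asm statement still receives a single string literal.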

Related

What happens in the assembly output when we add "cc" to clobber list

I read that specifying "cc" in the clobber list tells the compiler that the assembler code modifies the flags register.
I wrote a sample program to check the difference between adding "cc" and not adding it.
Comparing the generated assembly, there is no change when "cc" is added.
#include <stdio.h>
int main(void)
{
    unsigned long sum;
    asm("incq %0"
        : "=r"(sum)  // output operand
        : "r"(sum)   // input operand
    );
    printf("sum= %lu\n", sum);
    return 0;
}
When should we use "cc", and what is its effect on the assembly output?
For x86, absolutely nothing. For x86 and x86-64, a cc clobber is implicit in every asm() statement. This design decision makes some sense because most x86 instructions write FLAGS, and because a missing clobber is easy to overlook and could be hard to catch with testing. (Although there's no shortage of things that are easy to get wrong with GNU C inline asm. There's usually no need to use inline asm at all.)
(It does make it impossible to tell the compiler when your asm statement doesn't modify flags, but the cost of that is probably low, usually just one more instruction to redo a compare or something, or to save a variable so it can be compared later.)
If you want to be pedantic, you can use a "cc" clobber in every asm statement that modifies FLAGS.
For non-x86, you must use a cc clobber in every asm statement that modifies flags / condition codes (on ISAs that have them). e.g. ARM. On ARM, setting flags is optional; instructions with an S suffix set flags. So adds r0, r1, r2 sets flags according to r0 = r1+r2, but add r0, r1, r2 leaves flags untouched.
If you left out a "cc" clobber (on non-x86), the compiler might emit asm that set flags before an asm statement and read them afterwards, as part of implementing some other non-asm statement. So it could be essentially the same as destroying a register: nonsensical behaviour that depends on the details of what the compiler was using the register or the flags for, and which varies with optimization level and/or compiler version.
This is why testing isn't sufficient to prove inline asm is safe. With one compiler version, you could easily get lucky and have the compiler generate code that happened not to keep anything in the status register / condition codes across an asm statement, but a different compiler version or different surrounding code in a function where this inlines could be vulnerable to a buggy asm statement.
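As a hedged illustration of where the clobber goes, here is a sketch of an increment helper (the name increment is made up for this example). It uses a "+r" read-write operand so the input and output are the same register, and lists "cc" explicitly. On x86-64 the clobber is redundant but harmless; on AArch64, where adds sets the flags, it is required. Other targets fall back to plain C.

```c
/* Adds one to value using a flag-setting instruction where available. */
unsigned long increment(unsigned long value)
{
#if defined(__x86_64__)
    /* inc writes FLAGS; "cc" is implicit on x86 but stated for clarity. */
    __asm__("incq %0" : "+r"(value) : : "cc");
#elif defined(__aarch64__)
    /* adds (with the S suffix) sets NZCV, so "cc" must be declared. */
    __asm__("adds %0, %0, #1" : "+r"(value) : : "cc");
#else
    /* Portable fallback for targets not covered by this sketch. */
    value += 1;
#endif
    return value;
}
```

Unlike the snippet in the question, the operand here is initialized by the caller and tied to the output, so the result is well defined.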

Is there a conventional way to spend exactly one clock cycle in C?

Essentially, I want to execute a "NOP" in assembly.
I know tricks like (void)0 exist, but from what I understand, these compile to literally nothing. What I want is something that will waste exactly one clock cycle when compiled. Is there a standard way to do such a thing?
Depending on compiler-specifics you can create a naked function that executes a NOP in an asm inline block of code.
In VC++, that would be something like this:
__declspec(naked) void foo()
{
__asm NOP;
}
Just a side-note: you can use the _emit keyword too. In IA-32 doing that would look something like this:
__asm
{
_emit 0x90; // This effectively creates a NOP instruction.
}
An alternative in GCC and Clang would be:
asm volatile("nop");
Remember that naked functions don't have any sort of prologue or epilogue.
As well noted by Eric Postpischil in comments, there is no guarantee at all that a NOP instruction will spend exactly 1 cycle to execute.
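For GCC and Clang, the one-liner above can be wrapped in a small helper. This is only a sketch with made-up names (cpu_nop, spin_nops), and it assumes the target assembler accepts the mnemonic nop, which is the case on x86, ARM, AArch64, RISC-V and m68k. The volatile qualifier keeps the compiler from deleting the statement; it says nothing about how many cycles the instruction takes.

```c
/* Emits a single NOP instruction at the call site. */
static inline void cpu_nop(void)
{
    __asm__ __volatile__("nop");
}

/* Executes count NOPs in a loop and returns how many were issued. */
unsigned spin_nops(unsigned count)
{
    unsigned executed = 0;
    for (unsigned i = 0; i < count; i++) {
        cpu_nop();
        executed++;
    }
    return executed;
}
```

The loop overhead dominates here, so this is not a calibrated delay; it only shows that the instruction survives optimization.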

GCC Assembly Inline: Function Body with Only Inlined Assembly Code

I am trying to reuse some assembly code in my C project. Suppose I have a sequence of instructions, and I would like to organize them as a function:
void foo() {
    __asm__ (
        "mov %eax, %ebx\n\t"
        "push %eax\n\t"
        ...
    );
}
However, one obstacle is that in the compiled assembly code of function foo, besides the inlined assembly code, the compiler also generates some prologue instructions for this function, and the whole assembly program becomes something like:
foo:
push %ebp <---- routine code generated by compilers
mov %esp, %ebp <---- routine code generated by compilers
mov %eax, %ebx
push %eax
Given my usage scenario, such routine code actually breaks the original semantics of the inlined assembly.
So here is my question: is there any way to prevent the compiler from generating those function prologue and epilogue instructions, so that only the inlined assembly code is included?
You mention that you use gcc for compiling.
In this case you can use -O2 optimization level. This will cause the compiler to do stack optimization and if your inline assembly is simple, it won't insert the prologue and epilogue. Although, this might not be guaranteed in every case because optimizations keep changing. (My gcc with -O2 does it).
Another option is that you can put the entire function (including the foo:) inside an assembly block as
__asm__ (
"foo:\n"
"mov ..."
);
With this option you need to know the name mangling specifications if any. You will also have to add .globl foo before the function start if you want the function to be non static.
Lastly you can check the gcc __attribute__ ((naked)) attribute on the function declaration. But as mentioned by MichaelPetch, this was not available for the x86 target at the time of writing; GCC 8 and later do support it on x86.
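The "whole function inside an assembly block" option could look like the sketch below. It assumes x86-64 with an ELF toolchain and the System V calling convention (first argument in rdi, result in rax); the name add_one_asm is invented for the example. The section is pushed and popped so the compiler's own section state is left alone, and other targets get a plain C fallback.

```c
/* Prototype so C callers know the signature of the asm-defined symbol. */
long add_one_asm(long x);

#if defined(__x86_64__) && defined(__ELF__)
/* Top-level (basic) asm: no operands, so registers use a single '%'. */
__asm__(
    ".pushsection .text\n"
    ".globl add_one_asm\n"
    ".type add_one_asm, @function\n"
    "add_one_asm:\n"
    "    leaq 1(%rdi), %rax\n"   /* rax = rdi + 1, no prologue emitted */
    "    ret\n"
    ".size add_one_asm, .-add_one_asm\n"
    ".popsection\n"
);
#else
/* Fallback for targets this sketch does not cover. */
long add_one_asm(long x) { return x + 1; }
#endif
```

Because the compiler never sees a function body, it cannot add a prologue or epilogue, and it also cannot inline or check the code.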
The whole point of inline asm code is to interface with the C compiler's scheduler and register allocator in a sane way, by giving you a way to specify how to hook up the assembly code to the compiler's constraint solving machinery. That's why it rarely makes sense to have inline asm code with specific registers in it; you instead want to use constraints to allocate some registers and have the compiler tell you what they are.
If you really want to write stand-alone asm code that communicates with the rest of your program by the system ABI, write that code in a separate .s (or .S) file that you include in your project, rather than trying to use inline asm code.
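To make the point about constraints concrete, here is a small sketch (the function name add_with_constraints is hypothetical). No register is named in the template; the "r" constraints ask the compiler to pick registers and substitute them for %0, %1 and %2. The x86-64 branch uses lea, which does not touch the flags; other targets use plain C.

```c
/* Adds two values, letting the compiler choose every register involved. */
long add_with_constraints(long a, long b)
{
    long sum;
#if defined(__x86_64__)
    /* %0 is the output register, %1 and %2 the input registers. */
    __asm__("leaq (%1,%2), %0" : "=r"(sum) : "r"(a), "r"(b));
#else
    /* Portable fallback for targets not covered by this sketch. */
    sum = a + b;
#endif
    return sum;
}
```

Since the statement has no volatile qualifier and its only effect is the output operand, the compiler remains free to schedule it or drop it if sum is unused.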

How can I write in (GNU) C a proxy function to interface two different calling conventions?

I'm writing an interpreter/compiler hybrid where the calling convention passes parameters on the CPU stack. Functions are simply pointers to machine code (like C function pointers) potentially generated at runtime. I need a proxy function to interface with the custom calling convention. I want to write as much as possible of this function in C, although necessarily some parts will have to be written in assembly. I will refer to this proxy function as apply.
I don't fully understand the semantics of GCC inline assembly and I would like to know if the following tentative implementation of a 1-ary apply function is correct, or where it goes wrong. In particular, I wonder about the integrity of the stack between the many __asm__ blocks: how does the compiler (GCC and clang in my case) interpret the stack pointer register being clobbered, and what are the consequences of that in the generated code? Does the compiler understand that I want to "own" the stack? Is the memory clobber necessary?
Through experimentation I found that clang with -fomit-frame-pointer correctly disables this optimization for a function when it sees the rsp register in a clobber list, since rsp is obviously not anymore a reliable way of addressing local variables on the stack. This is not true in GCC, and as a consequence it generates buggy code (this seems like a bug in GCC). So I guess this answers some of my questions. I can live with -fno-omit-frame-pointer, but it seems as if GCC doesn't consider the various implications of rsp being clobbered.
This is written for x86-64, although I am interested in eventually porting it to other architectures. We assume all registers are preserved across calls in the custom calling convention.
#define push(x) \
__asm__ volatile ("pushq %0;" : : "g" (x) : "rsp", "memory")
#define pop(n) \
__asm__ volatile ("addq %0, %%rsp;" : : "g" (n * 8) : "rsp", "memory")
#define call(f) \
__asm__ volatile ("callq *%0;" : : "g" (f) : "cc", "memory")
void apply(void* f, void* x) {
push(x);
call(f);
pop(1);
}
I think the -mno-red-zone flag is technically necessary to use the stack in the way I want. Is this correct?
The previous code assumes all registers are preserved across calls. But if there's a set of registers which aren't preserved, how should I reflect this in the code? I get the feeling that adding them to the call clobber list won't produce correct results because the registers may be pushed onto the top of the stack, shadowing the pushed x. If instead they are saved on a previously reserved area of the call frame, it may work. Is this the case? Can I rely on this behaviour? (Is it silly of me to hope so?)
Another option would be to manually preserve and restore these registers but I have a strong feeling this will only give the illusion of safety and break at some point.
"I need a proxy function to interface with the custom calling convention. I want to write as much as possible of this function in C, although necessarily some parts will have to be written in assembly."
I'm sorry, this simply will not work. You must write the entire proxy function in assembly language.
More concretely -- I don't know about clang, but GCC assumes at a very basic level that nobody touches the stack pointer in inline assembly, ever. That doesn't mean it will error out -- it means it will blithely mis-optimize on the assumption that you didn't do that, even though you told it you did. This is not something that is likely ever to change; it's baked into the register allocator and all umpteen CPU back ends.
Now, the good news is, you may be able to persuade libffi to do what you want. It's got proxy functions which someone else has written in assembly language for you; if it fits your use case, it'll save you quite a bit of trouble.

Arbitrary code execution using existing code only

Let's say I want to execute an arbitrary mov instruction. I can write the following function (using GCC inline assembly):
void mov_value_to_eax()
{
asm volatile("movl %0, %%eax"::"m"(function_parameter):"%eax");
// will move the value of the variable function_parameter to register eax
}
And I can make functions like this one that will work on every possible register.
I mean -
void movl_value_to_ebx() { asm volatile("movl %0, %%ebx"::"m"(function_parameter):"%ebx"); }
void movl_value_to_ecx() { asm volatile("movl %0, %%ecx"::"m"(function_parameter):"%ecx"); }
...
In a similar way I can write functions that will move memory in arbitrary addresses into specific registers, and specific registers to arbitrary addresses in memory. (mov eax, [memory_address] and mov [memory_address],eax)
Now, I can perform these basic instructions whenever I want, so I can create other instructions. For example, to move a register to another register:
function_parameter = 0x028FC;
mov_eax_to_memory(); // parameter is a pointer to some temporary memory address
mov_memory_to_ebx(); // same parameter
So I can parse an assembly instruction and decide what functions to use based on it, like this:
if (sourceRegister == ECX) mov_ecx_to_memory();
if (sourceRegister == EAX) mov_eax_to_memory();
...
if (destRegister == EBX) mov_memory_to_ebx();
if (destRegister == EDX) mov_memory_to_edx();
...
If it can work, it allows you to execute arbitrary mov instructions.
Another option is to make a list of functions to call and then loop through the list and call each function. Maybe it requires more tricks for making equivalent instructions like these.
So my question is this: is it possible to do this for all (or some) of the possible opcodes? It would probably take a lot of functions, but is it possible to write a parser that somehow builds code from given assembly instructions and then executes it, or is that impossible?
EDIT: You cannot change memory protections or write to executable memory locations.
It is really unclear to me why you're asking this question. First of all, this function...
void mov_value_to_eax()
{
asm volatile("movl %0, %%eax"::"m"(function_parameter):"%eax");
// will move the value of the variable function_parameter to register eax
}
...uses GCC inline assembly, but the function itself is not inline, meaning that there will be prologue & epilogue code wrapping it, which will probably affect your intended result. You may instead want to use GCC inline assembly functions (as opposed to functions that contain GCC inline assembly), which may get you closer to what you want, but there are still problems with that.....
OK, so supposing you write a GCC inline assembly function for every possible x86 opcode (at least the ones that the GCC assembler knows about). Now supposing you want to invoke those functions in arbitrary order to accomplish whatever you might wish to accomplish (taking into account which opcodes are legal to execute at ring 3 (or in whatever ring you're coding for)). Your example shows you using C statements to encode logic for determining whether to call an inline assembly function or not. Guess what: Those C statements are using processor registers (perhaps even EAX!) to accomplish their tasks. Whatever you wanted to do by calling these arbitrary inline assembly functions is being stomped on by the compiler-emitted assembly code for the logic (if (...), etc). And vice-versa: Your inline assembly function arbitrary instructions are stomping on the registers that the compiler-emitted instructions expect to not be stomped-on. The result is not likely to run without crashing.
If you want to write code in assembly, I suggest you simply write it in assembly & use the GCC assembler to assemble it. Alternatively, you can write whole C-callable assembly functions within an asm() statement, and call them from your C code, if you like. But the C-callable assembly functions you write need to operate within the rules of the calling convention (ABI) you're using: If your assembly functions use a callee-saved register, your function will need to save the original value in that register (generally on the stack), and then restore it before returning to the caller.
...OK, based on your comment: "Because if it's working it can be a way to execute code if you can't write it to memory. (the OS may prevent it)"....
Of course you can execute arbitrary instructions (as long as they're legal for whatever ring you're running in). How else would JIT work? You just need to call the OS system call(s) for setting the permissions of the memory page(s) in which your instructions reside... change them to "executable" and then call 'em!
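A rough sketch of that approach on a POSIX system, using mmap and mprotect. It assumes Linux-like behaviour and only knows machine code for x86-64 and AArch64 (a function that returns 42); the name run_generated_code is invented. It does require permission to map executable memory, which the question's EDIT rules out, so it illustrates the answer rather than the constrained scenario.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std modes */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Copies a tiny routine into fresh memory, makes it executable, calls it.
   Returns the routine's result, or -1 on failure or unsupported targets. */
int run_generated_code(void)
{
#if defined(__x86_64__) || defined(__aarch64__)
#if defined(__x86_64__)
    static const unsigned char code[] = {
        0xB8, 0x2A, 0x00, 0x00, 0x00,   /* mov $42, %eax */
        0xC3                            /* ret */
    };
#else
    static const unsigned char code[] = {
        0x40, 0x05, 0x80, 0x52,         /* mov w0, #42 */
        0xC0, 0x03, 0x5F, 0xD6          /* ret */
    };
#endif
    const size_t length = 4096;
    /* Map writable first, then flip to executable (never both at once). */
    void *mem = mmap(NULL, length, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;
    memcpy(mem, code, sizeof code);
    if (mprotect(mem, length, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, length);
        return -1;
    }
    /* Needed on targets with separate instruction caches; no-op on x86. */
    __builtin___clear_cache((char *)mem, (char *)mem + sizeof code);
    int (*generated)(void) = (int (*)(void))mem;
    int result = generated();
    munmap(mem, length);
    return result;
#else
    return -1;
#endif
}
```

This is essentially what a JIT does on a small scale: emit bytes, change the page permissions, then call into the page through a function pointer.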
