void save_context(uint8_t index) {
context *this_context = contextArray + index;
uint8_t *this_stack = this_context->stack;
asm volatile("st %0 r0": "r"(this_stack));
}
I have something like this.
I would like to store the registers r0, r1, r2, ... into my stack[] array.
I am programming a context switch. The context structure looks like this:
typedef struct context_t {
uint8_t stack[THREAD_STACK_SIZE];
void *pstack;
struct context_t *next;
}context;
My problem is that I am not able to pass the C variable "this_stack" to the inline assembly. My aim is to store all the registers, the stack pointer and SREG on my stack.
Compiling gives this error:
Description Resource Path Location Type
`,' required 5_multitasking line 754, external location: C:\Users\Jiadong\AppData\Local\Temp\ccDo7xn3.s C/C++ Problem
I looked at the AVR inline assembly tutorial, but I don't understand much of it.
Could anyone help me?
asm volatile ("st %0 r0": "r"(this_stack));
There are several problems in that line: Wrong % print-modifier, missing , between the operands, incorrect constraint and missing description of side effects.
The memory access is supposed to use indirect addressing, so one way is to use indirect+displacement with "b"ase register Y or Z:
asm volatile ("std %a1+0, r0" "\n\t"
"std %a1+1, r1" "\n\t"
"..."
: "+m" (this_context->stack)
: "b" (this_stack));
Notice print modifier %a which prints R30 as Z and not as r30.
Operand 0 is just used to express that this_context->stack is being changed if you don't want the all-memory-clobber "memory". Moreover, there's no need for an intermediate variable for operand 1 because it's not altered: you can use just as well "b" (this_context->stack) for operand 1.
Alternatively, post-increment addressing on "e"xtended (pointer) registers X, Y or Z can be used:
asm volatile ("st %a1+, r0" "\n\t"
"st %a1+, r1" "\n\t"
"..."
: "=m" (this_context->stack), "+e" (this_stack));
"label" makes no sense, that should be a constraint. It also makes no sense trying to save the stack pointer into an array. It might make sense to load the stack pointer with the address of that array, but that's not the save_context.
Anyway, to get the value of SPL which is the stack pointer you can do something like this:
asm volatile("in %0, %1": "=r" (*this_stack) : "I" (_SFR_IO_ADDR(SPL)));
(There is a q constraint but at least my gcc version doesn't like it.)
To get true registers, for example r26 you can do:
register uint8_t r26_value __asm__("r26");
asm volatile("": "=r" (r26_value));
There is a constraint, "m", documented in the GCC manual, but it doesn't always work on AVR. Here is an example of how it should work from sanguino/bootloaders/atmega644p/ATmegaBOOT
asm volatile("...
...
"sts %0,r16 \n\t"
...
: "=m" (SPMCSR) : ... );
I have found "m" to be fragile though. If a function uses a variable in C code, outside of the inline assembly, the compiler may choose to place it in the Z register and it will try to use Z in assembler too. This causes an assembler error when used with the sts instruction. Looking at the assembler output from the C compiler is the best way to debug this kind of problem.
Rather than using an "m" constraint, you can just put the literal address you want into your assembler code. For an example, see pins_teensy.c, where timer0_fract_count is referenced by name in the template and is not included in the operand lists after the colons:
asm volatile(
...
"sts timer0_fract_count, r24" "\n\t"
I want to read a register named x0 on arm64 (not x86_64) using C. What is the best way to do this, in terms of being bug-free and portable?
I searched online and only found these approaches:
register int *foo asm ("a5"); //1
register int foo asm ("a5"); //2 which right?
or
intptr_t sp;
asm ("movl %%esp, %0" : "=r" (sp) ); //3
I think the first way has a bug: x0 on arm64 is 64 bits wide, and I am not sure an int can hold a 64-bit value.
The last way is for x86. Adapting it like this does not seem to work:
asm ("movl %x0, %0" : "=r" (sp) );
So what is the correct way to read a register in C?
The easiest way to do so is like this:
uint64_t foo;
asm volatile ("mov %0, x0" : "=r"(foo) ::);
This copies the content of register x0 into the variable foo. Note that the content of x0 is going to be fairly unpredictable at any given point in the code; I don't quite see the use in finding its contents. You should especially not rely on x0 containing any particular value at the beginning or end of a function, or right before or after calling a function. The C compiler is allowed to use any register for any purpose at any point in the program, and it is known to make use of this right.
This is basically used to swap the bytes of buffers while transferring a message buffer. The statement left me puzzled because I am unfamiliar with assembly code embedded in C. It is a PowerPC instruction:
#define ASMSWAP32(dest_addr,data) __asm__ volatile ("stwbrx %0, 0, %1" : : "r" (data), "r" (dest_addr))
Besides being unsafe because of a bug, this macro is also less efficient than what the compiler will generate for you.
stwbrx = store word byte-reversed. The x stands for indexed.
You don't need inline asm for this in GNU C, where you can use __builtin_bswap32 and let the compiler emit this instruction for you.
void swapstore_asm(int a, int *p) {
ASMSWAP32(p, a);
}
void swapstore_c(int a, int *p) {
*p = __builtin_bswap32(a);
}
Compiled with gcc4.8.5 -O3 -mregnames, we get identical code from both functions (Godbolt compiler explorer):
swapstore_asm:
stwbrx %r3, 0, %r4
blr
swapstore_c:
stwbrx %r3,0,%r4
blr
But with a more complicated address (storing to p[off], where off is an integer function arg), the compiler knows how to use both register inputs, while your macro forces the compiler to have the address in a single register:
void swapstore_offset(int a, int *p, int off) {
p[off] = __builtin_bswap32(a);
}
swapstore_offset:
slwi %r5,%r5,2 # *4 = sizeof(int)
stwbrx %r3,%r4,%r5 # use an indexed addressing mode, with both registers non-zero
blr
swapstore_offset_asm:
slwi %r5,%r5,2
add %r4,%r4,%r5 # extra instruction forced by using the macro
stwbrx %r3, 0, %r4
blr
BTW, if you're having trouble understanding GNU C inline asm templates, looking at the compiler's asm output can be a useful way to see what gets substituted in. See How to remove "noise" from GCC/clang assembly output? for more about reading compiler asm output.
Also note that this macro is buggy: it's missing a "memory" clobber for the store. And yes, you still need that with asm volatile. The compiler doesn't assume that *dest_addr is modified unless you tell it, so it could hoist a non-volatile load of *dest_addr ahead of this insn, or more likely to be a real problem, sink a store after it. (e.g. if you zeroed a buffer before storing to it with this, the compiler might actually zero after this instruction.)
Instead of a "memory" clobber (and also leaving out volatile), you could tell the compiler which memory location you modify with a "=m" (*dest_addr) operand, either as a dummy operand or with a constraint on the addressing mode so you could use it as reg+reg. (IDK PPC well enough to know what "=m" usually expands to.)
In most cases this bug won't bite you, but it's still a bug. Upgrading your compiler version or using link-time optimization could maybe make your program buggy with no source-level changes.
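As a rough sketch of the dummy "=m" operand idea (not tested on real PowerPC hardware here): the function below is a hypothetical replacement for the macro, and the name store_swapped32 is made up for illustration. On non-PowerPC targets it falls back to __builtin_bswap32 so it still builds and can be tried out.

```c
#include <stdint.h>

/* Hypothetical helper, not part of the original code: store `data`
   byte-reversed at `dest`. */
static inline void store_swapped32(uint32_t *dest, uint32_t data)
{
#if defined(__powerpc__) || defined(__PPC__)
    /* "=m" (*dest) is a dummy output: the template never prints %0, but it
       tells the compiler exactly which object this statement writes, so no
       "memory" clobber and no volatile are needed. */
    __asm__ ("stwbrx %1, 0, %2"
             : "=m" (*dest)
             : "r" (data), "r" (dest));
#else
    /* Fallback so the sketch compiles on other targets. */
    *dest = __builtin_bswap32(data);
#endif
}
```

Because the asm now has an output that describes the store, the compiler is free to drop it if *dest is never read, and it will not reorder other accesses to *dest around it.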
This kind of thing is why it is usually better to avoid inline asm: https://gcc.gnu.org/wiki/DontUseInlineAsm
See also https://stackoverflow.com/tags/inline-assembly/info.
#define ASMSWAP32(dest_addr,data) ...
This part should be clear
__asm__ volatile ( ... : : "r" (data), "r" (dest_addr))
This is the actual inline assembly:
Two values are passed to the assembly code; no value is returned from the assembly code (this is what the colons after the actual assembly code express).
Both parameters are passed in registers ("r"). The expression %0 will be replaced by the register that contains the value of data while the expression %1 will be replaced by the register that contains the value of dest_addr (which will be a pointer in this case).
The volatile here means that the assembly code has to be executed at this point and cannot be moved to somewhere else.
So if you use the following code in the C source:
ASMSWAP32(&a, b);
... the following assembler code will be generated:
# write the address of a to register 5 (for example)
...
# write the value of b to register 6
...
stwbrx 6, 0, 5
So the first argument of the stwbrx instruction is the value of b and the last argument is the address of a.
stwbrx x, 0, y
This instruction writes the value in register x to the address stored in register y; however, it writes the value in "reverse endian" (on a big-endian CPU it writes the value "little endian").
The following code:
uint32_t a;
ASMSWAP32(&a, 0x12345678);
... should therefore result in a = 0x78563412.
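To see where 0x78563412 comes from without PowerPC hardware, here is a small plain-C model of the byte order. It is only a sketch: both function names are made up, and it assumes a big-endian PowerPC for the read-back step.

```c
#include <stdint.h>

/* Model of what stwbrx writes: the least significant byte goes to the
   lowest address (little-endian byte order). */
static void store_byte_reversed(uint8_t dst[4], uint32_t value)
{
    dst[0] = (uint8_t)(value);
    dst[1] = (uint8_t)(value >> 8);
    dst[2] = (uint8_t)(value >> 16);
    dst[3] = (uint8_t)(value >> 24);
}

/* Model of an ordinary 32-bit load on a big-endian CPU: the byte at the
   lowest address is the most significant one. */
static uint32_t load_big_endian(const uint8_t src[4])
{
    return ((uint32_t)src[0] << 24) | ((uint32_t)src[1] << 16) |
           ((uint32_t)src[2] << 8)  |  (uint32_t)src[3];
}
```

Storing 0x12345678 with store_byte_reversed leaves the bytes 78 56 34 12 in memory, and load_big_endian on those bytes returns 0x78563412.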
I'm experimenting with GCC's inline assembler (I use MinGW, my OS is Win7).
Right now I'm only getting some basic C stdlib functions to work. I'm generally familiar with the Intel syntax, but new to AT&T.
The following code works fine:
char localmsg[] = "my local message";
asm("leal %0, %%eax" : "=m" (localmsg));
asm("push %eax");
asm("call %0" : : "m" (puts));
asm("add $4,%esp");
That LEA seems redundant, however, as I can just push the value straight onto the stack. Well, due to what I believe is an AT&T peculiarity, doing this:
asm("push %0" : "=m" (localmsg));
will generate the following assembly code in the final executable:
PUSH DWORD PTR SS:[ESP+1F]
So instead of pushing the address to my string, its contents were pushed because the "pointer" was "dereferenced", in C terms. This obviously leads to a crash.
I believe this is just GAS's normal behavior, but I was unable to find any information on how to overcome this. I'd appreciate any help.
P.S. I know this is a trivial question to those who are experienced in the matter. I expect to be downvoted, but I've just spent 45 minutes looking for a solution and found nothing.
P.P.S. I realize the proper way to do this would be to call puts( ) in the C code. This is for purely educational/experimental reasons.
While inline asm is always a bit tricky, calling functions from it is particularly challenging. It is not something I would suggest for a "getting to know inline asm" project. If you haven't already, I suggest looking through the very latest inline asm docs. A lot of work has been done to try to explain how inline asm works.
That said, here are some thoughts:
1) Using multiple asm statements like this is a bad idea. As the docs say: Do not expect a sequence of asm statements to remain perfectly consecutive after compilation. If certain instructions need to remain consecutive in the output, put them in a single multi-instruction asm statement.
2) Directly modifying registers (like you are doing with eax) without letting gcc know you are doing so is also a bad idea. You should either use register constraints (so gcc can pick its own registers) or clobbers to let gcc know you are stomping on them.
3) When a function (like puts) is called, while some registers must have their values restored before returning, some registers can be treated as scratch registers by the called function (ie modified and not restored before returning). As I mentioned in #2, having your asm modify registers without informing gcc is a very bad idea. If you know the ABI for the function you are calling, you can add its scratch registers to the asm's clobber list.
4) While in this specific example you are using a constant string, as a general rule, when passing asm pointers to strings, structs, arrays, etc, you are likely to need the "memory" clobber to ensure that any pending writes to memory are performed before starting to execute your asm.
5) Actually, the lea is doing something very important. The value of esp is not known at compile time, so it's not like you can perform push $12345. Someone needs to compute (esp + the offset of localmsg) before it can be pushed on the stack. Also, see second example below.
6) If you prefer intel format (and what right-thinking person wouldn't?), you can use -masm=intel.
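Points 1 and 2 can be tried in isolation, without any function call, with something like the sketch below. The function name twice_sum is made up, the asm path assumes x86 with the default AT&T syntax (no -masm=intel), and other targets take the plain C fallback.

```c
/* Returns 2 * (a + b). Both instructions live in ONE asm statement so they
   stay consecutive (point 1), and gcc picks the registers itself through
   the "r" constraints instead of the code hard-coding eax (point 2). */
static int twice_sum(int a, int b)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ ("add %[b], %[a]\n\t"   /* a += b */
             "add %[a], %[a]"       /* a += a */
             : [a] "+r" (a)         /* read and written by the asm */
             : [b] "r" (b)          /* read only */
             : "cc");               /* add modifies the flags */
#else
    a = 2 * (a + b);                /* fallback for other targets */
#endif
    return a;
}
```

Looking at the compiler's asm output for this function shows which registers were substituted for %[a] and %[b].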
Given all this, my first cut at this code looks like this. Note that this does NOT clobber puts' scratch registers. That's left as an exercise...
#include <stdio.h>
int main()
{
const char localmsg[] = "my local message";
int result;
/* Use 'volatile' since 'result' is usually not going to get used,
which might tempt gcc to discard this asm statement as unneeded. */
asm volatile ("push %[msg] \n\t" /* Push the address of the string. */
"call %[puts] \n \t" /* Call the print function. */
"add $4,%%esp" /* Clean up the stack. */
: "=a" (result) /* The result code from puts. */
: [puts] "m" (puts), [msg] "r" (localmsg)
: "memory", "esp");
printf("%d\n", result);
}
True, this doesn't avoid the lea, due to #5. However, if that's really important, try this:
#include <stdio.h>
const char localmsg[] = "my local message";
int main()
{
int result;
/* Use 'volatile' since 'result' is usually not going to get used. */
asm volatile ("push %[msg] \n\t" /* Push the address of the string. */
"call %[puts] \n \t" /* Call the print function. */
"add $4,%%esp" /* Clean up the stack. */
: "=a" (result) /* The result code. */
: [puts] "m" (puts), [msg] "i" (localmsg)
: "memory", "esp");
printf("%d\n", result);
}
As a global, the address of localmsg is now knowable at compile time (ok, I'm simplifying a bit), so the asm produced looks like this:
push $__ZL8localmsg
call _puts
add $4,%esp
Tada.
I'm writing inline assembly statements using a GNU-based toolchain, and there are three instructions within the inline assembly to update a single bit of a system register. The steps will be:
move(read) a system register to a general register
'AND' it with the variable value from C code
move(write) back to the system register just read
In the instruction set I'm using, the inline assembly looks like this:
unsigned int OV_TMP = 0xffefffff;
asm volatile ( "mfsr %0, $PSW\n\t"
"and %0, %0, %1\n\t"
"mtsr %0, $PSW"
: : "r"(OV_TMP) : );
%1 is the register that should receive the value of OV_TMP.
%0 is the problem. My question is:
How do I write the inline assembly when a register is used only internally, and is neither assigned from nor copied to a C variable?
The thing to consider here is that, from the compiler's perspective, the register is assigned-to by the inline assembly, even if you don't use it again later. That is, you're generating the equivalent of:
register unsigned int OV_TMP = 0xffefffff, scratch;
scratch = magic() & OV_TMP;
more_magic(scratch);
/* and then don't re-use scratch for anything from here on */
The magic and/or more_magic steps cannot be moved or combined away because of the volatile, so the compiler cannot simply delete the written-but-unused register.
The mfsr and mtsr look like powerpc instructions to me, and I would probably do the and step in C code (see footnote); but the following should generally work:
unsigned int OV_TMP = 0xffefffff, scratch;
asm volatile("mfsr %0, $PSW\n\t"
"and %0, %0, %1\n\t"
"mtsr %0, $PSW"
: "=&r"(scratch) : "r"(OV_TMP));
Here the "=&r" constraint says that the output operand (%0) is written before the input operand (%1) is read.
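I can't test the mfsr/mtsr version, so here is the same early-clobber pattern as a sketch on x86, where it can actually be run. The function name masked_copy is made up, the asm assumes the default AT&T syntax, and other targets use the C fallback.

```c
/* Returns value & mask, using a compiler-chosen scratch register. */
static unsigned int masked_copy(unsigned int value, unsigned int mask)
{
    unsigned int scratch;
#if defined(__i386__) || defined(__x86_64__)
    __asm__ ("mov %[val], %[tmp]\n\t"  /* tmp is written here ...        */
             "and %[mask], %[tmp]"     /* ... before mask is read here.  */
             : [tmp] "=&r" (scratch)   /* & = must not share a register
                                          with any input operand         */
             : [val] "r" (value), [mask] "r" (mask)
             : "cc");
#else
    scratch = value & mask;            /* fallback for other targets */
#endif
    return scratch;
}
```

Without the &, the compiler would be allowed to put scratch and mask in the same register, and the mov would then overwrite mask before the and reads it.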
Footnote: As far as I know (which is not very far, I've only ever done a tiny bit of ppc assembly) there's no need to keep the mfsr and mtsr instructions a specific distance apart, unlike certain lock-step sequences on other processors. If so, I would write something more like this:
static inline unsigned int read_psw() {
unsigned int result;
asm volatile("mfsr %0, $PSW" : "=r"(result));
return result;
}
static inline void write_psw(unsigned int value) {
asm volatile("mtsr %0, $PSW" :: "r"(value));
}
#define PSW_FE0 0x00100000 /* this looks like it's FE0 anyway */
...
write_psw(read_psw() & ~PSW_FE0); /* some appropriate comment here */
I am trying to use a thread-local variable in inline assembly, but when I look at the disassembled code, it appears that the compiler doesn't generate the right code. For the following inline code, where saved_sp is globally declared as __thread long saved_sp,
__asm__ __volatile__ (
"movq %rsp, saved_sp\n\t");
The disassembly looks like the following.
mov %rsp,0x612008
This is clearly not the right thing, because I know that gcc uses the fs segment for thread-local variables. It should have generated something like
mov %rsp, fs:somevalue
which it is not. Why is that so? Is using thread local variables in inline assembly problematic?
A simple thing that would surely work is to take a pointer to the thread local variable, and write to it.
Your compiler will surely do long *saved_sp_p = &saved_sp correctly, and the inline assembly will only deal with saved_sp_p, which is a local variable.
You can also use gcc's input and output syntax:
__asm__ __volatile__ (
"mov %%rsp, 0(%0)" : : "r" (&saved_sp) : "memory"
);
This puts the compiler in charge of resolving the address of saved_sp, and the assembly code gets it in a register.
We found out that this also works,
__asm__ __volatile__ ("mov %%rsp, %0" : "=m" (saved_sp));
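Putting both variants side by side, a sketch could look like this. The function names are made up, the asm paths assume x86-64 with gcc or clang, and on other targets the code just records the address of a local as a rough stand-in for the stack pointer.

```c
#include <stdint.h>

static __thread long saved_sp;   /* one copy per thread */

/* Variant 1: let the compiler print the operand for the thread-local
   variable itself through an "=m" output. */
static long save_sp_mem(void)
{
#if defined(__x86_64__)
    __asm__ __volatile__ ("mov %%rsp, %0" : "=m" (saved_sp));
#else
    long marker;                           /* fallback: approximate sp */
    saved_sp = (long)(intptr_t)&marker;
#endif
    return saved_sp;
}

/* Variant 2: the compiler resolves the address in C, and the asm only
   sees a pointer in a register. */
static long save_sp_ptr(void)
{
    long *p = &saved_sp;
#if defined(__x86_64__)
    /* "memory" clobber: the asm writes through p, which is not otherwise
       described to the compiler. */
    __asm__ __volatile__ ("mov %%rsp, (%0)" : : "r" (p) : "memory");
#else
    long marker;
    *p = (long)(intptr_t)&marker;
#endif
    return saved_sp;
}
```

In both variants the compiler, not the hand-written template, decides how the thread-local address is formed, which is what the plain symbol name in the question's template could not do.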