How to use a variable offset in inline assembly?


The overall problem I am trying to solve, is to call printf, while fetching its format string and arguments from a raw buffer. So far, the solution that seems to be working the best is through the use of inline assembly as a way to pass the mixed typing variadic arguments to the function.
Currently we have chars and ints working flawlessly, and floats/doubles working up until we need to pass them on the stack. (Passing through xmm0 - xmm7 works flawlessly for us). The goal here is to push these floating point values to the stack once xmm0-xmm7 have all been used. These values would then be used in the subsequent call to printf. The way we handle this for the chars and ints is to push them onto the stack just by simply using the push instruction, which the call to printf is able to use just fine, but since that instruction doesn't work for floating point values we have to manually 'push' it onto the stack with the method below. I realize that this is very likely to be the wrong way to handle this, but we haven't been able to figure a way out of doing it this way.
Currently our solution to passing more than eight floating point values on the stack requires us to know the offset of the argument that is being passed to our printf call. In this case the offsets correspond to 8-byte increments: the 9th argument is loaded into (%rsp), the 10th into 0x8(%rsp), the 11th into 0x10(%rsp), the 12th into 0x18(%rsp), with the rest of the arguments continuing this trend.
My goal with this "variable offset" is to just reduce the amount of repeated code that handles the incremented offset. Currently it just checks which argument is being processed, and jumps to the hardcoded constant offset. But this has led to a lot of duplicated code, which I was hoping to clean up.
Below is a small snippet of what we are doing currently to move one of the arguments into its appropriate place for the call to printf to access the argument.
double myDouble = 1.23;
asm volatile (
"movsd %0, 0x8(%%rsp)" /* The 0x8 is the offset we are hoping to pass in */
:: "m" (myDouble)
);
I am looking for a way to store this offset (0x8, 0x10, 0x18,...) in a variable that can be incremented by eight as I process the arguments, though I now fear that this will break once we start mixing in more mixed typed values that are pushed onto the stack.
Any guidance would be greatly appreciated!

That's not possible with a constant offset encoded in the instruction: the offset would have to be known at compile time, so it cannot be a variable. You have to use a different addressing mode, an indirect store with a base register plus an index register:
void foo(int64_t offset, double value)
{
asm volatile (
"movsd %0, (%%rsp,%1)" :: "x" (value), "r" (offset)
: "memory"
);
}
You could also let the CPU do the multiplication by 8 by using a scaled offset addressing mode:
void foo(int64_t offset, double value)
{
asm volatile (
"movsd %0, (%%rsp,%1,8)" :: "x" (value), "r" (offset)
: "memory"
);
}
Or if you want to emulate push, then sub $8, %%rsp / movsd %0, (%%rsp), but you can't mess with the stack pointer from inline asm without breaking compiler-generated code.
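As a minimal sketch of the scaled-index form, the following stores a double into a slot chosen at run time. To keep it safe to run, it indexes a caller-supplied buffer instead of %rsp. The name store_double_at and its parameters are illustrative assumptions, not part of the original code, and the sketch assumes GCC or Clang targeting x86-64.

```c
#include <stdint.h>

/* Hypothetical helper: writes `value` to base[index] using the
 * (base,index,8) addressing mode, so the CPU does the multiply by 8. */
void store_double_at(double *base, int64_t index, double value)
{
    __asm__ volatile (
        "movsd %[val], (%[base],%[idx],8)"
        :                                  /* no register outputs */
        : [val] "x" (value),               /* any SSE register */
          [base] "r" (base),               /* base address */
          [idx] "r" (index)                /* slot number, scaled by 8 */
        : "memory");                       /* the asm writes to memory */
}
```

Calling store_double_at(buf, 2, 1.23) writes to the same address as buf[2] = 1.23 would; incrementing the index by one moves the store forward by eight bytes.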

Related

How use INT %0 in inline asm with the interrupt number coming from a C variable?

I want to call the BIOS from inline asm in my C code. I tried asm("int %%al"::"a" (interrupt)); but gcc reports Error: operand size mismatch for 'int'. I wonder how to make that code work.

The int instruction must take its vector as an immediate; it has no form that takes the number from a register. See the instruction description; note that the second form is INT imm8 and there is nothing like INT r8 or INT r/m8 that would allow a register or memory operand.
If interrupt can be evaluated as a compile-time constant then you may be able to do
asm volatile("int %0" : : "i" (interrupt));
Note that in order for the interrupt to do something useful, you probably have to load various values into registers beforehand, and retrieve the values returned. Those will need to be done as part of the same asm block, requiring more operands and constraints. You cannot put something like asm("mov $0x1a, %%ah"); in a preceding block; the compiler need not preserve register contents between blocks.
If you truly don't know the interrupt number until runtime, your options are either to assemble all 256 possible int instructions and jump to the right one, or else use self-modifying code.
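The "jump to the right one" option can be sketched as a switch in which every case passes a compile-time constant through the "i" constraint. The helper name raise_interrupt and the three vectors listed are assumptions chosen for illustration; a real BIOS call would also need register operands for its arguments and results. Executing any of these int instructions in an ordinary user-space process will fault.

```c
/* One case per supported vector; each expands to an `int $imm8`. */
#define INT_CASE(n) \
    case (n): __asm__ volatile ("int %0" : : "i" (n) : "memory"); return 0

/* Hypothetical helper: returns 0 after executing the matching int
 * instruction, or -1 if the vector has no case in the table. */
int raise_interrupt(unsigned char vector)
{
    switch (vector) {
    INT_CASE(0x10);
    INT_CASE(0x13);
    INT_CASE(0x16);
    default:
        return -1;   /* no instruction was assembled for this vector */
    }
}
```

Extending the table to all 256 vectors is mechanical, since each case only differs in its immediate.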

What is the use of matching constraints in inline assembly

From the following link,
https://www.ibm.com/developerworks/library/l-ia/index.html
a single variable may serve as both the input and the output operand.
I wrote the following code:
#include <stdio.h>
int main()
{
int num = 1;
asm volatile ("incl %0"
:"=a"(num)
:"0"(num));
printf("num:%d\n", num);
return 0;
}
The above code increments the value of num.
What is the use of matching constraints? If I don't use matching constraints, the code does not work as expected:
asm volatile ("incl %0"
:"=a"(num));

why and when should we use matching constraints
That's not the question you asked; you asked why you need an input at all, which should be fairly obvious when you know what the syntax actually means. (That "=r"(var) is a pure output, independent of any previous value the C variable had, like var = 123; would be). So "=r" with an inc instruction is like var = stale_garbage + 1;
But anyway, as I commented, the interesting question is "why do matching constraints exist when you can just use "+r"(var) for a read/write operand, instead of the more complicated matching-constraint syntax?"
They're rarely useful; usually you can use the same variable for input and output, especially if you have your asm inside a C wrapper function. But if you don't want to use the same C var for input and output, but still need them to pick the same register or memory, then you want a matching constraint. Wrapping a system call might be one use-case; you might want to use a different C variable for the call number vs. the return value. (Except you could just use "=a" and "a" instead of a matching constraint; the compiler doesn't have a choice.) An output var of a narrower or different type than the input var could be another use-case.
IIRC, x87 is another use-case; I seem to recall "+t" not working.
I think that "+r" RMW constraints are internally implemented as an output with a "hidden" matching constraint. But while %1 normally errors in an asm template that only has one operand, if that operand is an in/out "+something" then GCC doesn't reject %1 as being too high an operand number. And if you look at the asm to see which register or memory it actually chose for that out-of-bounds operand number, it does match the in/out operand.
So "+r" is basically syntactic sugar for matching constraints. I'm not sure if it was new at some point, and before GCC version x.y you had to use matching constraints? It's not rare to see tutorial examples that use matching constraints with the same var for both input and output that would be simpler to read with "+" RMW constraints.
Basics:
With constraints like "a" and "=a" you don't need a matching constraint; the compiler only has 1 choice anyway. Where it's useful is "=r" where the compiler could pick any register, and you need it to pick the same register for an input operand.
If you just used "=r" and a separate "r" input, you'd be telling the compiler that it can use this as a copy-and-whatever operation, leaving the original input unmodified and producing the output in a new register. Or overwriting the input if it wants to. That would be appropriate for lea 1(%[srcreg]), %[dstreg] but not inc %0. The latter would assume that %0 and %1 are the same register, therefore you need to do something to make sure that's true!
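A minimal sketch of the three cases discussed here, assuming GCC or Clang on x86; the function names are made up for illustration. The first two need input and output in the same register, the third does not.

```c
/* Matching constraint: separate C variables, but "0" forces `in` into
 * whatever register the compiler picked for output operand %0. */
int inc_matching(int in)
{
    int out;
    __asm__ ("incl %0" : "=r" (out) : "0" (in) : "cc");
    return out;
}

/* Read/write operand: one C variable is both input and output. */
int inc_rmw(int v)
{
    __asm__ ("incl %0" : "+r" (v) : : "cc");
    return v;
}

/* Copy-and-modify: lea reads %1 and writes %0, so independent "=r" and
 * "r" operands are fine (lea does not modify the flags). */
long add_one_lea(long in)
{
    long out;
    __asm__ ("lea 1(%1), %0" : "=r" (out) : "r" (in));
    return out;
}
```

All three compute in + 1; only the first two would be wrong if the input and output landed in different registers.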

This code:
asm volatile ("incl %0"
:"=a"(num));
Doesn't work because in order to increase the value in a register (by 1 in this case) an original value needs to be read from the register; 1 added to it; and the value written back to the register. =a only says that the output of the register EAX will be moved to num when finished but the compiler won't load the register EAX with the original value of num. The code above will just add 1 to whatever happens to be in EAX (could be anything) and puts that in num when the inline assembly is finished.
asm volatile ("incl %0"
:"=a"(num)
:"0"(num));
On the other hand this says that num is both used as an input (so the value of num is moved to EAX) and that it also outputs a value in EAX so the compiler will move the value in EAX to num when the inline assembly is finished.
It could have been rewritten to use an input/output constraint as well (this does the same thing):
asm volatile ("incl %0"
:"+a"(num));
There is no need for volatile here either since all of the side effects are captured in the constraints. Adding volatile unnecessarily can lead to less efficient code generation but the code will still work. I would have written it this way:
asm ("incl %0"
:"+a"(num));

What does cmp %eax,0x80498d4(,%ebx,4) mean?

I know there are some other questions similar to this, but I'm still having trouble understanding the () part of it. Could someone spell this syntax out for me? Thanks.

cmp %eax,0x80498d4(,%ebx,4)
cmp is the comparison assembly instruction. In this (AT&T) syntax it subtracts the first (left) operand from the second (right) operand, discards the result, and sets the status flags in the CPU's EFLAGS register accordingly. These flags can then be used to do conditional branching / moving, etc.
First argument: %eax (the value in the %eax register)
Second argument: 0x80498d4(,%ebx,4). This is read as offset(base, index, scale). In your example the base is left empty, so no base register is added, and the second argument is the value in memory at address 0x80498d4 + (value in %ebx register * 4 (scaling factor)).
You can take a look at http://docs.oracle.com/cd/E19120-01/open.solaris/817-5477/ennby/index.html for more information on the syntax for Intel x86 assembly instructions.
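The address arithmetic behind offset(base, index, scale) can be written out in C. This is only a sketch of the calculation; the function name effective_address is an assumption, and an omitted base is modelled by passing 0.

```c
#include <stdint.h>

/* Address computed by the operand form disp(base,index,scale) in
 * 32-bit code: disp + base + index * scale (scale is 1, 2, 4 or 8). */
uint32_t effective_address(uint32_t disp, uint32_t base,
                           uint32_t index, uint32_t scale)
{
    return disp + base + index * scale;
}
```

With a scale of 4 this is the address of element number %ebx in an array of 4-byte values starting at 0x80498d4, so the instruction compares %eax against that array element.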

inline asm unknown

static inline void *__memset(void *s, char c, size_t n) {
int d0, d1;
asm volatile (
"rep; stosb;"
: "=&c" (d0), "=&D" (d1)
: "0" (n), "a" (c), "1" (s)
: "memory");
return s;
}
What are "d0" and "d1" used for? Could you please explain all the code completely? Thank you!

You need to understand gcc extended inline asm format:
The first part is the actual assembly. In this case it is a single stosb instruction with a rep prefix
The second part specifies output constraints and the third part specifies input constraints. The fourth part specifies the assembly will clobber the memory
Output
"=&c" associates d0 with the ecx register and marks it as write-only. & (early clobber) means the register may be written before the asm has finished reading its inputs, so the compiler must not put an unrelated input in it
"=&D" means the same thing, for the edi register
Input
"0" (n) associates n with the first mentioned register. In your case, with ecx
"a" (c) associates c with eax
"1" (s) associates s with edi
Assembly
So there you have it. Repeat this ecx times (n times): store al, the low byte of eax (c), to the memory that edi points to (s), then increment edi.
So then, why the unused d0 and d1 ? I'm not sure. I too think they are useless in this case and the whole output section could be left empty BUT I don't think it's possible to specify "writable" and "early-clobbered" in the input constraints. So I think d0 and d1 are there to make & possible.
I would try writing it like this:
asm volatile (
"rep\n"
"stosb\n"
:
: "c" (n), "a" (c), "D" (s)
: "%ecx", "%edi", "memory"
);

What are "d0" and "d1" used for?
In effect, it says that the final values of %ecx, %edi (assuming 32-bit) are stored in d0, d1 respectively. This serves a couple of purposes:
It lets the compiler know that, as outputs, these registers are effectively clobbered. By assigning them to temporary variables, an optimizing compiler also knows that there is no need to actually perform the 'store' operation.
The "=&" specifies these as early-clobber operands. They may be written to before all the inputs are consumed. So if the compiler is free to choose an input register, it shouldn't alias these two.
This isn't technically necessary for %ecx, since it's explicitly named as an input: "0" (n) - the 'rep' count in this case. I'm not sure it's necessary for %edi either, since it can't be updated before the input "1" (s) is consumed, and the instruction executed. And again, as it's explicitly named as an input, the compiler isn't free to choose another register. In short, "=&" doesn't hurt here, but it doesn't do anything.
As "a" (c) specifies an input-only register %eax set to (c), the compiler may assume that %eax still holds this value after the 'asm' - which is indeed the case with "rep; stosb;".
"memory" specifies that memory can be modified in a way unknown to the compiler - which is true in this case, it's setting (n) bytes starting at (s) to the value (c) - assuming the direction flag is cleared, which it should be. This does have the effect of forcing a reload of values, as the compiler can't assume that registers reflect the memory values they're supposed to anymore. It doesn't hurt, and it may be necessary to make it safe for a general case memset, but it's often overkill.
Edit: Input operands may not overlap clobber operands. It doesn't make sense to specify something as input-only and clobbered. I don't think the compiler allows this, and it wouldn't be wise to use an ambiguous specification even if it did. From the manual:
You may not write a clobber description in a way that overlaps with an input or output operand. For example, you may not have an operand describing a register class with one member if you mention that register in the clobber list.
Reviewing some old answers, I thought I would add a link to the excellent Lockless GCC inline ASM tutorial. The article builds on prior sections, unlike the gcc manual which is best described as a 'reference', and not really suited to any sort of structured learning.
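One way to tell the compiler that the count and pointer registers are modified, without listing an input register as a clobber, is to make them read/write operands on local copies. This is a sketch under the assumption of GCC or Clang on x86 with the direction flag clear; the name memset_stosb is made up, and it is not the original kernel code.

```c
#include <stddef.h>

/* Hypothetical rewrite using "+" operands instead of dummy outputs. */
void *memset_stosb(void *s, int c, size_t n)
{
    void *dst = s;       /* copy: rep stosb advances the pointer */
    size_t count = n;    /* copy: rep stosb counts this down to zero */
    __asm__ volatile (
        "rep stosb"
        : "+D" (dst), "+c" (count)   /* both are read and then modified */
        : "a" (c)                    /* byte to store is taken from al */
        : "memory");                 /* the asm writes through dst */
    return s;                        /* original pointer, unmodified */
}
```

Because dst and count are throwaway locals, the optimizer is free to discard their final values, which is the same role d0 and d1 play in the original.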

What is r() and double percent %% in GCC inline assembly language?

Example:
int main(void)
{
int x = 10, y;
asm ("movl %1, %%eax;"
"movl %%eax, %0;"
:"=r"(y) /* y is output operand */
:"r"(x) /* x is input operand */
:"%eax"); /* %eax is clobbered register */
}
what is r(y)?
also why %% is used before eax? Generally single % is used right?

Okay, this is gcc inline assembler, which is very powerful but difficult to understand.
First off, the % char is a special char. It lets you define register and number placeholders (more on this later). Unfortunately the % is also used as part of a register name (such as %EAX), so in gcc inline assembler you have to use two percent chars if you want to name a register.
%0, %1 and %2 (etc.) are placeholders for the input and output operands. These are defined in the lists following the assembler string.
In your example %0 becomes a placeholder for y, and %1 becomes a placeholder for x. The compiler will make sure the variables will be in the registers for input operands before the asm-code gets executed, and it will make sure the output operand will get written to the variable specified in the output operand list.
Now you should get an idea what "=r"(y) is: it is an output operand that reserves a register for the variable y and assigns it to the placeholder %0 (because it is the first operand listed after the inline assembler string). Likewise, "r"(x) is an input operand assigned to the placeholder %1.
There are lots of other placeholder types. m lets you specify a memory location, and if I'm not mistaken i can be used for numeric constants. You'll find them all listed in the gcc documentation.
Then there is the clobber list. This list is important! It lists all registers, flags, memory locations etc. that get modified in your assembler code (such as the EAX in your example). If you get this wrong the optimizer will not know what has been modified and it is very likely that you end up with code that doesn't work.
Your example is by the way almost pointless. It just loads the value X into a register and assigns this register to EAX. Afterwards EAX gets stored into another register which will then later become your y variable. So all it does is a simple assignment:
y = x;
A last thing: If you have worked with Intel-style assembler before: You have to read the arguments backwards. For all instructions the source operand is the one following the instruction itself, and the target operand is the one on the right of the comma. Compared to Intel syntax this is exactly the other way around.

Try this tutorial. It covers everything you ask: for example, try section 6 - it explains constraints quite well, and what the "=" sign is for. Even the concept of clobbered registers is covered (section 5.3).

The lines with "r" or "=r" are operand constraints. The "=" means output operand. Essentially, this:
:"=r"(y)
:"r"(x)
means that %0 (ie: the first operand) corresponds to y and is for output, and %1 (the second operand) corresponds to x.
A single % is normally used in AT&T syntax assembly, but for inline assembly the single % is used for operand references (eg: %0, %1) while a double % is used for literal register references. Think of it like the way you have to use a double % in a printf format if you want a literal % in the output.
A clobbered register is a register whose value will be modified by the assembly code. As you can see from the code, eax is written to. You need to tell gcc about this so that it knows that the compiled code can't keep anything it needs for later in eax when it's about to invoke this assembly.
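To see the two uses of % side by side, here is a small sketch (function names are illustrative assumptions, GCC or Clang on x86 assumed). The first version names %eax literally and therefore must list it as clobbered; the second lets the compiler pick both registers and needs no clobber.

```c
/* Literal register: %%eax in the template, "eax" in the clobber list. */
int copy_via_eax(int x)
{
    int y;
    __asm__ ("movl %1, %%eax\n\t"
             "movl %%eax, %0"
             : "=r" (y)      /* %0: output, compiler-chosen register */
             : "r" (x)       /* %1: input, compiler-chosen register */
             : "eax");       /* the template overwrites eax */
    return y;
}

/* Operand placeholders only: no hard-coded register, no clobber. */
int copy_direct(int x)
{
    int y;
    __asm__ ("movl %1, %0" : "=r" (y) : "r" (x));
    return y;
}
```

Both functions behave like y = x; the second gives the register allocator more freedom because nothing is reserved.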

I can't answer all of this, but a clobbered register is one that will get used somewhere in the computation in a way that will destroy its current value. So if the caller wants to use the current value later, it needs to save it somehow.
In asm directives like this, when you write the assembly you figure out which registers are going to be clobbered by it; you then tell the compiler this (as shown in your example), and the compiler does what it has to do to preserve the current value of that register if necessary. The compiler knows a lot about how values in registers and elsewhere will be used for later computations, but it usually can't analyse embedded assembly. So you do the analysis yourself and the compiler uses the clobbering information to safely incorporate the assembly into its optimisation choices.
