I just learned how clobbers are used from the GCC docs (https://gcc.gnu.org/onlinedocs/gcc-7.5.0/gcc/Extended-Asm.html) and from previous questions.
I also read another introduction post.
The description in the post is:
Its purpose is to inform the compiler that the registers in the
Clobbers list will be implicitly modified by the assembly code in the
asm statement. Since the registers in Clobbers will be implicitly
modified by the assembly code in the asm statement, the compiler will
not use the registers specified in Clobbers when selecting registers
for input operands and output operands, thus avoiding logical errors
such as data overwriting.
In actual use, I still have some confusion.
If I specify the registers used by the inputs and outputs, do I still need to write a clobber list? I saw an example introducing clobbers.
#include <stdio.h>
int inc2(int src) {
int dst;
asm("mov %1, %0\n\t"
"mov $3, %%eax\n\t"
"add $1, %0"
: "=r"(dst)
: "r"(src));
return dst;
}
int inc3(int src) {
int dst;
asm("mov %1, %0\n\t"
"mov $3, %%eax\n\t"
"add $1, %0"
: "=r"(dst)
: "r"(src)
: "%eax");
return dst;
}
int main(int argc, char *argv[]) {
printf("inc2: %d\n", inc2(1));
printf("inc3: %d\n", inc3(1));
}
The example prints 4 for inc2 and 2 for inc3. The reason is that inc2 uses the eax register in mov $3, %eax while the compiler has also selected eax for an operand.
If clobbers only indicate which registers must not be chosen for operands, is a clobber list unnecessary when I specify the operand registers explicitly?
For example
int inc4(int src) {
int dst;
asm("mov %%rcx, %%rax\n\t"
"add $1, %%rax"
: "=a"(dst)
: "c"(src)
: );
return dst;
}
In the above example, the input register is explicitly ecx and the output is explicitly eax. Is it possible that by this method I can avoid writing clobber altogether? Or should the registers modified in assembly be marked?
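A minimal sketch of the two cases, assuming GCC or Clang on x86/x86-64 (the function names inc_fixed_regs and inc_modifies_input are made up for illustration). When the asm writes only registers that are declared as outputs, no clobber entry is needed. When it also changes an input register, that register has to be declared as a read-write operand, because GCC does not allow a clobber to overlap an operand.

```c
/* Sketch only: the asm path assumes x86 or x86-64 and AT&T syntax. */

/* Writes only EAX, which is the declared output, so no register clobber
   is needed. "cc" notes that addl changes the flags. */
int inc_fixed_regs(int src) {
    int dst;
#if defined(__i386__) || defined(__x86_64__)
    __asm__("movl %%ecx, %%eax\n\t"
            "addl $1, %%eax"
            : "=a"(dst)
            : "c"(src)
            : "cc");
#else
    dst = src + 1; /* portable fallback so the sketch builds elsewhere */
#endif
    return dst;
}

/* Also modifies ECX. An input-only "c" would misinform the compiler,
   so ECX is declared read-write ("+c") instead of being clobbered. */
int inc_modifies_input(int src) {
    int dst;
#if defined(__i386__) || defined(__x86_64__)
    __asm__("addl $1, %%ecx\n\t"
            "movl %%ecx, %%eax"
            : "=a"(dst), "+c"(src)
            :
            : "cc");
#else
    src += 1;      /* portable fallback */
    dst = src;
#endif
    return dst;
}
```

Any register the asm changes that is neither an output nor a read-write operand still belongs in the clobber list.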
I have seen inline assembly in some open source code that does not match my understanding. That code is written by very experienced people and is widely used, so I assume it is precise. I would like to use these examples to learn how to write inline assembly canonically.
(1) Why do the clobber lists of the two functions below include edx and ecx? Is eax left out of the clobber list because the output constraint (=a) already selects it? The first function has no input operands, only one output. What is the significance of marking these two registers?
static inline uint32_t rdtscp() {
uint32_t rv;
asm volatile ("rdtscp": "=a" (rv) :: "edx", "ecx");
return rv;
}
static inline uint32_t memaccesstime(void *v) {
uint32_t rv;
asm volatile (
"mfence\n"
"lfence\n"
"rdtscp\n"
"mov %%eax, %%esi\n"
"mov (%1), %%eax\n"
"rdtscp\n"
"sub %%esi, %%eax\n"
: "=&a" (rv): "r" (v): "ecx", "edx", "esi");
return rv;
}
Thanks!
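For reference, the RDTSCP instruction writes three registers: the low half of the counter goes to EAX, the high half to EDX, and the IA32_TSC_AUX value to ECX. A hedged sketch of an alternative to the clobber list is to declare all three as outputs, which also gives access to the full 64-bit counter. The names combine_tsc and read_tscp are hypothetical, and the asm path assumes an x86 CPU that supports RDTSCP.

```c
#include <stdint.h>

/* Joins the two 32-bit halves that RDTSC/RDTSCP leave in EDX:EAX. */
uint64_t combine_tsc(uint32_t lo, uint32_t hi) {
    return ((uint64_t)hi << 32) | lo;
}

/* Sketch: every register RDTSCP writes is an output operand, so no
   register clobbers are needed. Assumes the CPU supports RDTSCP. */
uint64_t read_tscp(uint32_t *aux) {
#if defined(__i386__) || defined(__x86_64__)
    uint32_t lo, hi, c;
    __asm__ volatile("rdtscp" : "=a"(lo), "=d"(hi), "=c"(c));
    *aux = c;
    return combine_tsc(lo, hi);
#else
    *aux = 0;      /* placeholder on other architectures */
    return 0;
#endif
}
```

If only the low 32 bits are wanted, listing "edx" and "ecx" as clobbers, as the quoted functions do, tells the compiler the same thing.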
Related
I am trying to figure out how to use the variable ret within the inline assembly code below, but I keep getting this error: undefined reference to 'ret'.
char getkey(void){
int ret;
asm(
"movq $0, %RAX\n\t"
"INT $0X16\n\t"
"movq %RAX, ret"
);
return ret;
}
What you're trying to do won't work. PC BIOS interrupts, like int 16h, are only available when the system is running in real mode (i.e, at startup before the MMU is enabled); they cannot be used in Linux executables.
That being said, in general, you can specify an output register using gcc assembler constraints. For example:
asm(
"movq $0, %%rax\n"
"int $0x16\n"
: "=a" (ret)
);
Note that there's no mov instruction at the end of this code! The "=a" constraint tells the compiler that the result will be left in the A register (RAX/EAX); it'll figure out what to do from there. (There are ways to eliminate the first mov as well, if you're clever about it.)
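As a rough illustration that runs in an ordinary user-space program (no BIOS call), here is a sketch of the "=a" output constraint, plus one way to drop the initial mov by passing the starting value as an input tied to the same register. The function names are invented for this example, and the asm path assumes GCC or Clang on x86/x86-64.

```c
/* The asm leaves its result in EAX; "=a" tells the compiler to read
   ret from there, so no trailing mov is required. */
int load_via_mov(void) {
    int ret;
#if defined(__i386__) || defined(__x86_64__)
    __asm__("movl $7, %%eax" : "=a"(ret));
#else
    ret = 7;       /* portable fallback */
#endif
    return ret;
}

/* The matching constraint "0" makes the compiler place the initial
   value in operand 0's register (EAX) before the asm runs, which
   replaces the hand-written mov. */
int load_via_constraint(void) {
    int ret;
#if defined(__i386__) || defined(__x86_64__)
    __asm__("addl $35, %%eax" : "=a"(ret) : "0"(7) : "cc");
#else
    ret = 7 + 35;  /* portable fallback */
#endif
    return ret;
}
```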
I have this code
int64_t ret;
char *fmt = "\n\n\n%s\n\n\n";
char *s;
s = kmalloc(13, GFP_KERNEL);
memcpy(s, "Hello world!\0", 13);
__asm__ __volatile__ (
"movq %2, %%rdi;"
"movq %1, %%rsi;"
"movq $2, %%rax;"
"call printk;"
"movq %%rax, %0;"
: "=r" (ret)
: "r" (s), "r" (fmt)
:
);
__asm__ __volatile__ (
"movq %0, %%rdi;"
"movq $1, %%rax;"
"call kfree;"
:
: "r" (s)
:
);
return ret;
which crashes on the kfree call. It must be something silly, but I can't find out what it is. What is the problem?
EDIT: "Crashes" == The kernel module triggers an Oops.
EDIT: It seems to work just fine if I use m instead of r. But why?
You should look at the generated assembly code (use -S option to gcc) to see exactly what is happening.
I am guessing that the problem is that you didn't specify the correct clobbers for the printk block. Remember that a called function may freely use certain registers. The compiler doesn't parse the contents of the inline asm block, so it doesn't know there is a call inside. It expects every register not listed as an output or clobber to come out of the asm block unchanged. Thus, I assume the compiler keeps the variable s in a register that is destroyed either by printk or by your own use of rax, rdi and rsi, which you also don't tell the compiler about. Note that there are constraints for specific registers, so you could use those instead of moving the arguments around in assembly.
Also note that rax (actually al) only matters when calling varargs/stdarg functions, and even then it should contain just the number of arguments passed in SSE vector (xmm) registers, not the total number. See the x86-64 ABI docs for more information.
Finally, if you really want to write asm code, you should consider writing a standalone asm module instead of inline asm in C if possible. You could save yourself some headaches. See this question for an example.
I came across a code that looks like this:
asm volatile (
# [...]
"movl $1200, %%ecx;"
# [...]
);
I know what movl $1200, %ecx does in x86, but I am confused about why there are two percent signs.
GCC inline assembly uses %0, %1, %2, etc. to refer to input and output operands. That means you need to write %% to name real registers.
Check this howto for great information.
It depends
If there is a colon : after the string, then it is an extended asm, and %% escapes the percent sign, which otherwise has a special meaning, as mentioned by Carl. Example:
uint32_t in = 1;
uint32_t out = 0;
asm volatile (
"movl %1, %%eax;"
"inc %%eax;"
"movl %%eax, %0"
: "=m" (out) /* Outputs. '=' means written to. */
: "m" (in) /* Inputs. No '='. */
: "%eax"
);
assert(out == in + 1);
Otherwise, writing %% is a build error: without a colon the statement is a basic asm, which does not support operands or constraints and neither needs nor supports escaping %. E.g.:
asm volatile ("movl $1200, %ecx;");
works just fine.
Extended asm is more often used since it is much more powerful.
This helps GCC distinguish between operands and registers. In extended asm, operands have a single % as prefix, and %% is always used with registers.
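A small sketch that puts both prefixes side by side, assuming GCC or Clang on x86/x86-64 (load_1200 is a made-up name): %%ecx names the real register, %0 refers to the first operand, and the clobber list keeps the compiler from choosing ecx for that operand.

```c
#include <stdint.h>

uint32_t load_1200(void) {
    uint32_t out;
#if defined(__i386__) || defined(__x86_64__)
    __asm__("movl $1200, %%ecx\n\t"  /* %%ecx: literal register name  */
            "movl %%ecx, %0"         /* %0: operand 0, chosen by GCC  */
            : "=r"(out)              /* any general register but ecx  */
            :
            : "ecx");                /* ecx is overwritten by the asm */
#else
    out = 1200;                      /* portable fallback */
#endif
    return out;
}
```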
I'm trying to use inline assembly...
I read this page http://www.codeproject.com/KB/cpp/edujini_inline_asm.aspx but I can't understand the parameters passing to my function.
I'm writing a C write example.. this is my function header:
write2(char *str, int len){
}
And this is my assembly code:
global write2
write2:
push ebp
mov ebp, esp
mov eax, 4 ;sys_write
mov ebx, 1 ;stdout
mov ecx, [ebp+8] ;string pointer
mov edx, [ebp+12] ;string size
int 0x80 ;syscall
leave
ret
What do I have to do to put that code into the C function? I'm doing something like this:
write2(char *str, int len){
asm ( "movl 4, %%eax;"
"movl 1, %%ebx;"
"mov %1, %%ecx;"
//"mov %2, %%edx;"
"int 0x80;"
:
: "a" (str), "b" (len)
);
}
That's because I don't have an output variable, so how do I handle that?
Also, with this code:
global main
main:
mov ebx, 5866 ;PID
mov ecx, 9 ;SIGKILL
mov eax, 37 ;sys_kill
int 0x80 ;interruption
ret
How can I put that code inline in my C code, so I can ask the user for the PID, like this?
This is my draft code:
void killp(int pid){
asm ( "mov %1, %%ebx;"
"mov 9, %%ecx;"
"mov 37, %%eax;"
:
: "a" (pid) /* optional */
);
}
Well, you don't say specifically, but by your post, it appears like you're using gcc and its inline asm with constraints syntax (other C compilers have very different inline syntax). That said, you probably need to use AT&T assembler syntax rather than Intel, as that's what gets used with gcc.
So with the above said, let's look at your write2 function. First, you don't want to create a stack frame, as gcc will create one, so if you create one in the asm code, you'll end up with two frames, and things will probably get very confused. Second, since gcc is laying out the stack frame, you can't access vars with "[ebp + offset]" as you don't know how it's being laid out.
That's what the constraints are for -- you say what kind of place you want gcc to put the value (any register, memory, specific register) and then use "%X" in the asm code. Finally, if you use explicit registers in the asm code, you need to list them in the 3rd section (after the input constraints) so gcc knows you are using them. Otherwise it might put some important value in one of those registers, and you'd clobber that value.
You also need to tell the compiler that inline asm will or might read from or write to memory pointed-to by the input operands; that is not implied.
So with all that, your write2 function looks like:
void write2(char *str, int len) {
__asm__ volatile (
"movl $4, %%eax;" // SYS_write
"movl $1, %%ebx;" // file descriptor = stdout_fd
"movl %0, %%ecx;"
"movl %1, %%edx;"
"int $0x80"
:: "g" (str), "g" (len) // input values we MOV from
: "eax", "ebx", "ecx", "edx", // registers we destroy
"memory" // memory has to be in sync so we can read it
);
}
Note the AT&T syntax -- src, dest rather than dest, src and % before the register name.
Now this will work, but it's inefficient as it will contain lots of extra movs. In general, you should NEVER use mov instructions or explicit registers in asm code, as you're much better off using constraints to say where you want things and let the compiler ensure that they're there. That way, the optimizer can probably get rid of most of the movs, particularly if it inlines the function (which it will do if you specify -O3). Conveniently, the i386 machine model has constraints for specific registers, so you can instead do:
void write2(char *str, int len) {
__asm__ volatile (
"movl $4, %%eax;"
"movl $1, %%ebx;"
"int $0x80"
:: "c" (str), /* c constraint tells the compiler to put str in ecx */
"d" (len) /* d constraint tells the compiler to put len in edx */
: "eax", "ebx", "memory");
}
or even better
// UNSAFE: destroys EAX (with return value) without telling the compiler
void write2(char *str, int len) {
__asm__ volatile ("int $0x80"
:: "a" (4), "b" (1), "c" (str), "d" (len)
: "memory");
}
Note also the use of volatile which is needed to tell the compiler that this can't be eliminated as dead even though its outputs (of which there are none) are not used. (asm with no output operands is already implicitly volatile, but making it explicit doesn't hurt when the real purpose isn't to calculate something; it's for a side effect like a system call.)
edit
One final note -- this function is doing a write system call, which does return a value in eax -- either the number of bytes written or an error code. So you can get that with an output constraint:
int write2(const char *str, int len) {
__asm__ volatile ("int $0x80"
: "=a" (len)
: "a" (4), "b" (1), "c" (str), "d" (len),
"m"( *(const char (*)[])str ) // "dummy" input instead of memory clobber
);
return len;
}
All system calls return in EAX. Values from -4095 to -1 (inclusive) are negative errno codes, other values are non-errors. (This applies globally to all Linux system calls).
If you're writing a generic system-call wrapper, you probably need a "memory" clobber because different system calls have different pointer operands, and might be inputs or outputs. See https://godbolt.org/z/GOXBue for an example that breaks if you leave it out, and this answer for more details about dummy memory inputs/outputs.
With this output operand, you need the explicit volatile -- exactly one write system call per time the asm statement "runs" in the source. Otherwise the compiler is allowed to assume that it exists only to compute its return value, and can eliminate repeated calls with the same input instead of writing multiple lines. (Or remove it entirely if you didn't check the return value.)
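The answer above uses the 32-bit int $0x80 ABI. As a hedged sketch of the same ideas on 64-bit Linux, the version below swaps in the x86-64 syscall instruction instead: the system call number goes in RAX (write is number 1 there), the arguments go in RDI, RSI and RDX, and the instruction itself overwrites RCX and R11, so those are listed as clobbers. The names raw_write and is_syscall_error are invented for this example; on other platforms the code falls back to the libc write() so it still builds.

```c
#include <stddef.h>
#if !(defined(__linux__) && defined(__x86_64__))
#include <errno.h>
#include <unistd.h>
#endif

/* Linux convention: return values from -4095 to -1 are negated errno codes. */
int is_syscall_error(long ret) {
    return ret < 0 && ret >= -4095;
}

/* Sketch of a write(2) wrapper. volatile keeps one system call per
   execution of the statement even when the result is unused. */
long raw_write(int fd, const void *buf, size_t len) {
    long ret;
#if defined(__linux__) && defined(__x86_64__)
    __asm__ volatile("syscall"
                     : "=a"(ret)              /* result comes back in RAX    */
                     : "0"(1L),               /* RAX = 1, write on x86-64    */
                       "D"((long)fd),         /* RDI = file descriptor       */
                       "S"(buf),              /* RSI = buffer                */
                       "d"(len)               /* RDX = byte count            */
                     : "rcx", "r11",          /* destroyed by syscall itself */
                       "memory");             /* the kernel reads *buf       */
#else
    ssize_t r = write(fd, buf, len);          /* fallback: libc wrapper      */
    ret = (r < 0) ? -(long)errno : (long)r;
#endif
    return ret;
}
```

A caller would check the result with is_syscall_error(ret) and treat -ret as the errno value when it returns nonzero.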
I am trying to write some inline assembly into C. I have two arrays as input, what I need is to copy one element in array1 into array2, and the following is what I have at the moment:
asm (
"movl %0,%%eax;"
"movl %1,%%ebx;"
"movl (%%eax),%%ecx;"
"movl %%ecx,(%ebx);"
"xor %%ecx,%%ecx;"
"movl 4(%%eax),%%ecx;"
//do something on %ecx
"movl %%ecx,4(%ebx);" //write second
:
:"a"(array1),"b"(array2)
);
Why do I get a segmentation fault?
Your inline assembler code is broken. You can't directly use EAX and EBX without adding them to the clobber list. Otherwise the compiler does not know which registers have been modified.
It is very likely that one of the registers that you've modified contained something damn important that later caused the segmentation fault.
This code will copy one element from array1 to array2:
asm (
"movl (%0), %%eax \n\t" /* read first dword from array1 into eax */
"movl %%eax, (%1) \n\t" /* write dword into array2 */
: /* outputs */
: /* inputs */ "r"(array1),"r"(array2)
: /* clobber */ "eax", "memory"
);
A better version with proper register constraints would drop the hard coded EAX like this:
int dummy;
asm (
"movl (%1), %0 \n\t"
"movl %0, (%2) \n\t"
: /* outputs, temps; early-clobber (&) because %0 is written before %2 is read */ "=&r" (dummy)
: /* inputs */ "r"(array1),"r"(array2)
: /* clobber */ "memory"
);
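Another possible variant, sketched under the assumption of GCC or Clang on x86/x86-64, describes the memory accesses as "m" operands rather than using a blanket "memory" clobber, so the compiler knows exactly which objects are read and written. The temporary is an early-clobber output ("=&r") because it is written before the asm has finished using the other operands. copy_dword is a hypothetical helper name.

```c
#include <stdint.h>

/* Copies one 32-bit element from *src to *dst. */
void copy_dword(const uint32_t *src, uint32_t *dst) {
#if defined(__i386__) || defined(__x86_64__)
    uint32_t tmp;
    __asm__("movl %2, %0\n\t"   /* tmp  = *src */
            "movl %0, %1"       /* *dst = tmp  */
            : "=&r"(tmp),       /* scratch register picked by the compiler */
              "=m"(*dst)        /* memory that is written */
            : "m"(*src));       /* memory that is read */
#else
    *dst = *src;                /* portable fallback */
#endif
}
```

For example, copy_dword(&array1[1], &array2[0]) copies the second element of array1 into the first slot of array2.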
Btw - In general I have the feeling that you're not that familiar with assembler yet. Writing inline assembler is a bit harder to get right due to all the compiler magic. I suggest that you start by writing some simple functions in assembler and putting them into a separate .S file first. That's much easier.
Your best option is C code:
target_array[target_idx] = source_array[source_idx];
This avoids segmentation faults as long as the indexes are under control.
What about memcpy?