What is wrong with the following inline assembly code?

What is wrong with the following code?
__asm__("mov %0, %%eax" : :"a" (ptr));
__asm__(".intel_syntax noprefix");//switch to intel syntax.
asm("lidt [eax]");
I get a compilation error like this:
/tmp/cciOoSro.s: Assembler messages:
/tmp/cciOoSro.s:1737: Error: no such instruction: popl %ebp
This is meant to load the interrupt descriptor table (IDT) for my OS, but something seems to be wrong. I am not used to AT&T syntax; I am used to Intel syntax.
The function loads the pointer to my IDT into the processor using lidt.
void setup_idt(uint32 ptr) //to setup the idt i.e to load the idt's pointer
{
__asm__("mov %0, %%eax" : :"a" (ptr));
__asm__(".intel_syntax noprefix");//switch to intel syntax.
__asm__("lidt [eax]");
}

I think the .intel_syntax noprefix directive applies to everything that follows it, up to the end of the source file, so the assembler tried to interpret GCC's own generated code (AT&T syntax, such as popl %ebp) as Intel syntax.
You should:
1. Merge all the assembly lines into one __asm__ statement (__asm__("line one\n" "line two\n")).
2. Make the last line .att_syntax prefix, to return to AT&T syntax.
Or just use AT&T syntax. It isn't so hard.
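For reference, the "just use AT&T syntax" route might look like the sketch below. It assumes a 32-bit protected-mode kernel; the struct idt_pointer type and the load_idt name are hypothetical and do not come from the question. Passing the descriptor as an "m" operand lets GCC write the memory operand itself, so no register is hard-coded and no syntax switch is needed.

```c
#include <stdint.h>

/* Hypothetical 6-byte operand for lidt in 32-bit mode: a 16-bit limit
 * followed by a 32-bit linear base address, with no padding between. */
struct idt_pointer {
    uint16_t limit;
    uint32_t base;
} __attribute__((packed));

#if defined(__i386__) || defined(__x86_64__)
/* lidt is privileged: call this only from ring 0 (kernel code). */
static inline void load_idt(const struct idt_pointer *idtr)
{
    /* "m" makes GCC emit the memory operand itself, e.g. lidt (%eax). */
    __asm__ __volatile__("lidt %0" : : "m"(*idtr));
}
#endif
```

Calling load_idt from ordinary user-space code would fault, so this is only meaningful inside the kernel being written.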

Related

Switching between Intel and ATT mode in GCC

So I have this inline assembly code along with my C code, and I want to use intel syntax for this particular call to asm(), however I need to switch back to ATT syntax or else it will give a long list of errors.
asm(".intel_syntax prefix");
asm volatile (
"add %0, $1 \n\t"
: "=r" (dst)
: "r" (src));
asm(".att_syntax prefix");
Now it gives the following error
/tmp/ccDNa2Wk.s: Assembler messages:
/tmp/ccDNa2Wk.s:180: Error: no such instruction: `movl -16(%ebp),%eax'
/tmp/ccDNa2Wk.s:187: Error: no such instruction: `movl %eax,-12(%ebp)'
I don't understand how to fix the error; I have no call to movl in any part of my code.
Since you haven't yet accepted an answer (<hint><hint>), let me add a third thought:
1) Instead of having 3 asm statements, do it in 1:
asm(".intel_syntax prefix\n\t"
"add %0, 1 \n\t"
".att_syntax prefix"
: "=r" (dst)
: "r" (src));
2) Change your compile options to include -masm=intel and omit the 2 syntax statements.
3) It is possible to support both intel and att at the same time. This way your code works whatever value is passed for -masm:
asm("{addl $1, %0 | add %0, 1}"
: "=r" (dst)
: "r" (src));
I should also mention that your asm may not work as expected. Since you are updating the contents of dst (instead of overwriting it), you probably want to use "+r" instead of "=r". And you do realize that this code doesn't actually use src, right?
Oh, and your original asm is NOT intel format (the $1 is the give-away).
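As a small self-contained sketch of option 3 combined with the "+r" advice: the function name add_one is made up for illustration, and the non-x86 fallback is only there so the snippet builds everywhere.

```c
/* Adds 1 to value. Falls back to plain C on non-x86 targets. */
static int add_one(int value)
{
#if defined(__i386__) || defined(__x86_64__)
    /* {AT&T | Intel}: the compiler emits the half that matches the
     * -masm= dialect. "+r" marks value as both read and written. */
    __asm__("{addl $1, %0 | add %0, 1}" : "+r"(value) : : "cc");
#else
    value += 1;
#endif
    return value;
}
```

Because the operand is read and written in place, there is no separate src operand left unused.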
I would try to do the following tests:
In some C code not containing inline assembler insert the line
asm(".att_syntax prefix");
in multiple different locations. Then compile the C code to object files and disassemble these object files (compiling to assembler won't work for this test).
Then compare the disassembly of the original code with the disassembly of the code containing the ".att_syntax" lines.
If the line ".att_syntax prefix" indeed is the correct line for switching back to AT&T mode the disassemblies must be equal AND compiling must work without any errors.
In the next step take your code and compile to assembler instead of object code ("-S" option of GCC). Then you can look at the assembler code.
My idea is the following one:
If you pass operands to inline assembler ("=r" and "r" for example), GCC needs to insert code that moves the data in and out:
asm(".intel_syntax prefix");
// GCC inserts code moving "src" to "%0" here
asm volatile (
"add %0, $1 \n\t"
: "=r" (dst)
: "r" (src));
// GCC inserts code moving "%0" to "dst" here
asm(".att_syntax prefix");
This code inserted by GCC is of course in AT&T syntax.
If you want to use Intel syntax in inline assembly you have to use the ".intel_syntax" and ".att_syntax" directives in the same inline assembly block, just like this:
// GCC inserts code moving "src" to "%0" here
asm volatile (
".intel_syntax prefix \n\t"
"add %0, 1 \n\t"
".att_syntax prefix \n\t"
: "=r" (dst)
: "r" (src));
// GCC inserts code moving "%0" to "dst" here

extended asm: invalid instruction suffix for 'mov'

Using i686-elf-gcc and i686-elf-ld to compile and link.
/tmp/ccyjfCee.s:25: Error: invalid instruction suffix for 'mov'
makefile:21: recipe for target 'Release/boot.o' failed
When I changed movw %0, %%dx to movw $0x1, %%dx, it compiled and linked successfully, so I wonder what is wrong with that line. Given .code16, the offset address of pStr should be 16 bits, which fits into the dx register. What's wrong with it?
__asm__(".code16\n");
void printString(const char* pStr) {
__asm__ __volatile__ ("movb $0x09, %%ah\n\t"
"movw %0, %%dx\n\t"
"int $0x21"
:
:"r"(pStr)
:"%ah", "%dx");
}
void _start() {
printString("Hello, World");
}
The "r" operand is a 32-bit pointer, so %0 expands to a 32-bit register name such as %eax, which movw rejects. Technically you can use the .code16gcc directive to generate 16-bit code and the %w0 substitution to force a word-sized register name.
Note that the above will only let you create a program that will run in 16-bit real mode under DOS (after some postprocessing to get it to the proper format). If that's not what you want, you will need to use the appropriate OS system calls instead of int 0x21 and not write 16-bit code.
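The %w modifier can be tried in an ordinary hosted program, without any 16-bit code generation. A minimal sketch (the low_word name is hypothetical, and the plain C branch is only a fallback for non-x86 targets):

```c
/* Returns the low 16 bits of x. */
static unsigned short low_word(unsigned int x)
{
    unsigned short out;
#if defined(__i386__) || defined(__x86_64__)
    /* x is a 32-bit operand, so plain %1 would print e.g. %eax, which
     * movw rejects; %w1 prints the 16-bit name (%ax) instead. */
    __asm__("movw %w1, %w0" : "=r"(out) : "r"(x));
#else
    out = (unsigned short)x;
#endif
    return out;
}
```

The same substitution applied to the question's template would give movw %w0, %%dx.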

GCC INLINE ASSEMBLY Won't Let Me Overwrite $esp

I'm writing code to temporarily use my own stack for experimentation. This worked when I used literal inline assembly, where I hardcoded the variable locations as offsets from ebp. However, I wanted my code to work without having to hardcode memory addresses into it, so I've been looking into GCC's extended inline assembly. What I have is the following:
volatile intptr_t new_stack_ptr = (intptr_t) MY_STACK_POINTER;
volatile intptr_t old_stack_ptr = 0;
asm __volatile__("movl %%esp, %0\n\t"
"movl %1, %%esp"
: "=r"(old_stack_ptr) /* output */
: "r"(new_stack_ptr) /* input */
);
The point of this is to first save the stack pointer into the variable old_stack_ptr. Next, the stack pointer (%esp) is overwritten with the address I have saved in new_stack_ptr.
Despite this, I found that GCC was saving %esp into old_stack_ptr, but was NOT replacing %esp with new_stack_ptr. Upon deeper inspection, I found that it actually expanded my assembly and added its own instructions, producing the following:
mov -0x14(%ebp),%eax
mov %esp,%eax
mov %eax,%esp
mov %eax,-0x18(%ebp)
I think GCC is trying to preserve %esp because I don't have it explicitly declared as an "output" operand, but I could be totally wrong about this.
I really want to use extended inline assembly for this, because otherwise it seems I have to hardcode the offsets from %ebp into the assembly, and I'd rather use the variable names. This code needs to work on a few different systems, which all seem to place my variables at different offsets, and extended inline assembly lets me name the variable instead. But I don't understand why GCC adds the extra instructions and no longer lets me overwrite the stack pointer; it has been doing this ever since I started using extended assembly.
I appreciate any help!!!
Okay, so the problem is that gcc is allocating the input and the output to the same register, eax. You need to tell gcc that you write the output before you have finished using the input, a.k.a. "earlyclobber".
asm __volatile__("movl %%esp, %0\n\t"
"movl %1, %%esp"
: "=&r"(old_stack_ptr) /* output */
: "r"(new_stack_ptr) /* input */
);
Notice the & sign for the output. This should fix your code.
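The same effect can be reproduced without touching the stack pointer. In this sketch (add_pair is a made-up name, and the #else branch is just a portable fallback), the output is written by the first instruction while an input is still needed by the second:

```c
/* Returns a + b, computed as: out = a; out += b. */
static long add_pair(long a, long b)
{
    long out;
#if defined(__i386__) || defined(__x86_64__)
    /* out is written before b is read, so "&" tells the compiler not to
     * place out in the same register as any input. */
    __asm__("mov %1, %0\n\t"
            "add %2, %0"
            : "=&r"(out)
            : "r"(a), "r"(b)
            : "cc");
#else
    out = a + b;
#endif
    return out;
}
```

Without the &, the compiler would be free to put out and b in one register, and the mov would destroy b before the add reads it.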
Update: alternatively, you could force input and output to be the same register and use xchg, like so:
asm __volatile__("xchg %%esp, %0\n\t"
: "=r"(old_stack_ptr) /* output */
: "0"(new_stack_ptr) /* input */
);
Notice the "0" that says "put this into the same register as argument 0".
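A runnable variation on the matching-constraint idea, swapping a register with a memory slot instead of with %esp. The exchange function and its parameters are hypothetical; the plain C branch is a fallback for non-x86 targets.

```c
/* Stores new_value into *slot and returns the previous contents. */
static long exchange(long *slot, long new_value)
{
    long old;
#if defined(__i386__) || defined(__x86_64__)
    /* "0" puts new_value in the same register as operand 0 (old), so
     * after xchg that register holds the old contents of *slot. */
    __asm__ __volatile__("xchg %0, %1"
                         : "=r"(old), "+m"(*slot)
                         : "0"(new_value));
#else
    old = *slot;
    *slot = new_value;
#endif
    return old;
}
```

For example, with long slot = 5, the call exchange(&slot, 9) returns 5 and leaves 9 in slot.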

gcc removes inline assembler code

It seems like gcc 4.6.2 removes code it considers unused from functions.
test.c
int main(void) {
goto exit;
handler:
__asm__ __volatile__("jmp 0x0");
exit:
return 0;
}
Disassembly of main()
0x08048404 <+0>: push ebp
0x08048405 <+1>: mov ebp,esp
0x08048407 <+3>: nop # <-- This is all that's left of my jmp.
0x08048408 <+4>: mov eax,0x0
0x0804840d <+9>: pop ebp
0x0804840e <+10>: ret
Compiler options
No optimizations enabled, just gcc -m32 -o test test.c (-m32 because I'm on a 64 bit machine).
How can I stop this behavior?
Edit: Preferably by using compiler options, not by modifying the code.
Looks like that's just the way it is - When gcc sees that code within a function is unreachable, it removes it. Other compilers might be different.
In gcc, an early phase in compilation is building the "control flow graph" - a graph of "basic blocks", each free of conditions, connected by branches. When emitting the actual code, parts of the graph, which are not reachable from the root, are discarded.
This isn't part of the optimization phase, and is therefore unaffected by compilation options.
So any solution would involve making gcc think that the code is reachable.
My suggestion:
Instead of putting your assembly code in an unreachable place (where GCC may remove it), you can put it in a reachable place, and skip over the problematic instruction:
int main(void) {
goto exit;
exit:
__asm__ __volatile__ (
"jmp 1f\n"
"jmp 0x0\n"
"1:\n"
);
return 0;
}
Also, see this thread about the issue.
I do not believe there is a reliable way using just compile options to solve this. The preferable mechanism is something that will do the job and work on future versions of the compiler regardless of the options used to compile.
Commentary about Accepted Answer
In the accepted answer there is an edit to the original that suggests this solution:
int main(void) {
__asm__ ("jmp exit");
handler:
__asm__ __volatile__("jmp $0x0");
exit:
return 0;
}
First off jmp $0x0 should be jmp 0x0. Secondly C labels usually get translated into local labels. jmp exit doesn't actually jump to the label exit in the C function, it jumps to the exit function in the C library effectively bypassing the return 0 at the bottom of main. Using Godbolt with GCC 4.6.4 we get this non-optimized output (I have trimmed the labels we don't care about):
main:
pushl %ebp
movl %esp, %ebp
jmp exit
jmp 0x0
.L3:
movl $0, %eax
popl %ebp
ret
.L3 is actually the local label for exit. You won't find the exit label in the generated assembly. It may compile and link if the C library is present. Do not use C local goto labels in inline assembly like this.
Use asm goto as the Solution
As of GCC 4.5 (OP is using 4.6.x) there is support for asm goto extended assembly templates. asm goto allows you to specify jump targets that the inline assembly may use:
6.45.2.7 Goto Labels
asm goto allows assembly code to jump to one or more C labels. The GotoLabels section in an asm goto statement contains a comma-separated list of all C labels to which the assembler code may jump. GCC assumes that asm execution falls through to the next statement (if this is not the case, consider using the __builtin_unreachable intrinsic after the asm statement). Optimization of asm goto may be improved by using the hot and cold label attributes (see Label Attributes).
An asm goto statement cannot have outputs. This is due to an internal restriction of the compiler: control transfer instructions cannot have outputs. If the assembler code does modify anything, use the "memory" clobber to force the optimizers to flush all register values to memory and reload them if necessary after the asm statement.
Also note that an asm goto statement is always implicitly considered volatile.
To reference a label in the assembler template, prefix it with ‘%l’ (lowercase ‘L’) followed by its (zero-based) position in GotoLabels plus the number of input operands. For example, if the asm has three inputs and references two labels, refer to the first label as ‘%l3’ and the second as ‘%l4’.
Alternately, you can reference labels using the actual C label name enclosed in brackets. For example, to reference a label named carry, you can use ‘%l[carry]’. The label must still be listed in the GotoLabels section when using this approach.
The code could be written this way:
int main(void) {
__asm__ goto ("jmp %l[exit]" :::: exit);
handler:
__asm__ __volatile__("jmp 0x0");
exit:
return 0;
}
We can use asm goto. I prefer __asm__ over asm since it will not throw warnings if compiling with -ansi or -std=? options.
After the clobbers you can list the jump targets the inline assembly may use. The compiler doesn't actually know whether we jump or not, as GCC doesn't analyze the actual code in the inline assembly template. It can't remove this jump, nor can it assume what comes after is dead code. Using Godbolt with GCC 4.6.4 the unoptimized code (trimmed) looks like:
main:
pushl %ebp
movl %esp, %ebp
jmp .L2 # <------ this is the goto exit
jmp 0x0
.L2: # <------ exit label
movl $0, %eax
popl %ebp
ret
With optimizations enabled, the Godbolt output with GCC 4.6.4 still looks correct:
main:
jmp .L2 # <------ this is the goto exit
jmp 0x0
.L2: # <------ exit label
xorl %eax, %eax
ret
This mechanism should also work whether you have optimizations on or off, and shouldn't matter whether you are compiling for 64-bit or 32-bit x86 targets.
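As a further sketch, asm goto also works with input operands, in which case referring to the label by name avoids having to count operands. The is_zero function is hypothetical and only illustrates the syntax; the #else branch is a plain C fallback for non-x86 targets.

```c
/* Returns 1 if x is zero, 0 otherwise. */
static int is_zero(int x)
{
#if defined(__i386__) || defined(__x86_64__)
    /* The fifth section lists the C labels the asm may jump to. */
    __asm__ goto("test %0, %0\n\t"
                 "jz %l[zero]"
                 : /* no outputs */
                 : "r"(x)
                 : "cc"
                 : zero);
    return 0;
zero:
    return 1;
#else
    return x == 0;
#endif
}
```

Since the compiler has to assume either path may be taken, neither return statement is treated as unreachable.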
Other Observations
When there are no output constraints in an extended inline assembly template the asm statement is implicitly volatile. The line
__asm__ __volatile__("jmp 0x0");
Can be written as:
__asm__ ("jmp 0x0");
asm goto statements are considered implicitly volatile. They don't require a volatile modifier either.
Would this work? It makes it so gcc can't know the code is unreachable:
int main(void)
{
volatile int y = 1;
if (y) goto exit;
handler:
__asm__ __volatile__("jmp 0x0");
exit:
return 0;
}
If a compiler thinks it can cheat you, just cheat back: (GCC only)
int main(void) {
{
/* Place this code anywhere in the same function, where
* control flow is known to still be active (such as at the start) */
extern volatile unsigned int some_undefined_symbol;
__asm__ __volatile__(".pushsection .discard" : : : "memory");
if (some_undefined_symbol) goto handler;
__asm__ __volatile__(".popsection" : : : "memory");
}
goto exit;
handler:
__asm__ __volatile__("jmp 0x0");
exit:
return 0;
}
This solution will not add any additional overhead for meaningless instructions, though it only works for GCC when used with GNU as (which is the default).
Explanation: .pushsection switches text output of the compiler to another section, in this case .discard (which is deleted during linking by default). The "memory" clobber prevents GCC from trying to move other text within the section that will be discarded. However, GCC doesn't realize (and never could because the __asm__s are __volatile__) that anything happening between the 2 statements will be discarded.
As for some_undefined_symbol, that is literally just any symbol that is never being defined (or is actually defined, it shouldn't matter). And since the section of code using it will be discarded during linking, it won't produce any unresolved-reference errors either.
Finally, the conditional jump to the label you want to make appear as though it was reachable does exactly that. Besides the fact that it won't appear in the output binary at all, GCC realizes that it can't know anything about some_undefined_symbol, meaning it has no choice but to assume that both of the if's branches are reachable, meaning that as far as it is concerned, control flow can continue both by reaching goto exit, or by jumping to handler (even though there won't be any code that could even do this).
However, be careful when enabling garbage collection in your linker (ld --gc-sections; it's disabled by default), because it might then get the idea to get rid of the still unused label regardless.
EDIT:
Forget all that. Just do this:
int main(void) {
__asm__ __volatile__ goto("" : : : : handler);
goto exit;
handler:
__asm__ __volatile__("jmp 0x0");
exit:
return 0;
}
Update 2012/6/18
Just thinking about it, one can put the goto exit in an asm block, which means that only 1 line of code needs to change:
int main(void) {
__asm__ ("jmp exit");
handler:
__asm__ __volatile__("jmp $0x0");
exit:
return 0;
}
That is significantly cleaner than my other solution below (and possibly nicer than @ugoren's current one too).
This is pretty hacky, but it seems to work: hide the handler in a conditional that can never be followed under normal conditions, but stop it from being eliminated by stopping the compiler from being able to do its analysis properly with some inline assembler.
int main (void) {
int x = 0;
__asm__ __volatile__ ("" : "+r"(x)); // "+r" keeps the 0 as an input; "=r" would leave x undefined
// compiler can't tell what the value of x is now, but it's always 0
if (x) {
handler:
__asm__ __volatile__ ("jmp $0x0");
}
return 0;
}
Even with -O3 the jmp is preserved:
testl %eax, %eax
je .L2
.L3:
jmp $0x0
.L2:
xorl %eax, %eax
ret
(This seems really dodgy, so I hope there is a better way to do this. Edit: just putting a volatile in front of x works, so one doesn't need the inline asm trickery.)
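The empty-asm trick can be isolated into a helper. This is only a sketch (hide_from_optimizer is a made-up name); it uses "+r" so the existing value is passed in as well as out, which keeps the result well defined.

```c
/* Returns value unchanged, but the optimizer must assume the empty asm
 * may have modified it, so it cannot constant-fold tests on the result. */
static int hide_from_optimizer(int value)
{
    __asm__ __volatile__("" : "+r"(value));
    return value;
}
```

A test such as if (hide_from_optimizer(0)) then guards code that is never executed but that the compiler still has to keep.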
I've never heard of a way to prevent gcc from removing unreachable code; it seems that no matter what you do, once gcc detects unreachable code it always removes it (use gcc's -Wunreachable-code option to see what it considers to be unreachable).
That said, you can still put this code in a static function and it won't be optimized out:
static void func(void)
{
__asm__ __volatile__("jmp $0x0");
}
int main(void)
{
goto exit;
handler:
func();
exit:
return 0;
}
P.S
This solution is particularly handy if you want to avoid code redundancy when inserting the same "handler" code block in more than one place in the original code.
GCC may duplicate asm statements inside functions and remove them during optimisation (even at -O0), so this will never work reliably.
One way to do this reliably is to use a global asm statement (i.e. an asm statement outside of any function). GCC will copy this straight to the output and you can use global labels without any problems.
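A minimal sketch of a global asm statement, assuming an x86 ELF target (for example Linux). The asm_answer symbol is invented for illustration; on other targets the snippet falls back to an ordinary C definition so that it still builds.

```c
/* Declared in C, defined in the top-level asm below. */
int asm_answer(void);

#if (defined(__i386__) || defined(__x86_64__)) && defined(__ELF__)
/* Top-level (basic) asm: copied to the assembler output as written,
 * so registers use a single % and global labels can be defined. */
__asm__(
    ".pushsection .text\n"
    ".globl asm_answer\n"
    ".type asm_answer, @function\n"
    "asm_answer:\n"
    "    movl $42, %eax\n"
    "    ret\n"
    ".size asm_answer, .-asm_answer\n"
    ".popsection\n"
);
#else
int asm_answer(void) { return 42; }
#endif
```

The .pushsection/.popsection pair is a precaution so the statement does not change the section the compiler believes it is emitting into.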

Thread local variables and inline assembly

I am trying to use a thread local variable in inline assembly, but when I look at the disassembled code, it appears that the compiler doesn't generate the right code. For the following inline code, where saved_sp is globally declared as __thread long saved_sp,
__asm__ __volatile__ (
"movq %rsp, saved_sp\n\t");
The disassembly looks like the following.
mov %rsp,0x612008
This is clearly not right, because I know that gcc uses the fs segment for thread local variables. It should have generated something like
mov %rsp, fs:somevalue
but it does not. Why is that? Is using thread local variables in inline assembly problematic?
A simple thing that would surely work is to take a pointer to the thread local variable and write through it.
Your compiler will surely do long *saved_sp_p = &saved_sp correctly, and the inline assembly will only deal with saved_sp_p, which is a local variable.
You can also use gcc's input and output syntax:
__asm__ __volatile__ (
"mov %%rsp, 0(%0)" : : "r" (&saved_sp)
);
This puts the compiler in charge of resolving the address of saved_sp, and the assembly code gets it in a register.
We found out that this also works:
__asm__ __volatile__ ("mov %%rsp, %0" : "=m" (saved_sp));
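Put together as a compilable sketch: the capture_stack_pointer wrapper is hypothetical, and the fallback branch (using the __builtin_frame_address builtin) is only there so the snippet builds on targets other than x86-64.

```c
/* Thread-local slot, as in the question. */
static __thread long saved_sp;

/* Stores the current stack pointer in saved_sp and returns it. */
static long capture_stack_pointer(void)
{
#if defined(__x86_64__)
    /* With an "=m" operand the compiler generates the thread-local
     * address itself instead of a fixed absolute address. */
    __asm__ __volatile__("mov %%rsp, %0" : "=m"(saved_sp));
#else
    saved_sp = (long)__builtin_frame_address(0);
#endif
    return saved_sp;
}
```

Disassembling the x86-64 build should show the store going through an %fs-relative (or otherwise thread-pointer-relative) address rather than an absolute one.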

Resources