Strange behavior with gcc inline assembly? - c

I'm new to gcc inline assembly.
Why does this code output "1" instead of "5"?
code:
#include <stdio.h>
static inline int atomic_add(volatile int *mem, int add)
{
asm volatile(
"lock xadd %0, (%1);"
: "=a"(add)
: "r"(mem), "a"(add)
: "memory"
);
return add;
}
int main(void)
{
int a=1;
int b=5;
printf ( "%d\n", atomic_add(&a, b) );
return 0;
}
run:
$ ./a.out
1 # why not 5?
many thx. :)

The variable add starts out with the value 5 and *mem starts out with 1.
The lock xadd %0, (%1) assembly template gets compiled by gcc to:
lock xadd %eax, (%edx)
GCC has to use eax because your "a" constraint requires %0 to be %eax. The same constraint also ties %eax to the variable add. GCC is free to use whatever register it wants for the other operand (in my test it happened to use %edx).
So:
%eax starts with 5, and %edx points to a memory location that holds the value 1.
The xadd instruction swaps the two operands and then places their sum in the destination, so after it executes %eax holds 1 and the memory pointed to by %edx contains 6.
Your constraint also indicates that %eax should be stored back into the variable add, so add gets 1. And that is what is returned from the function.
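To make the fetch-and-add behaviour explicit, here is a minimal sketch. The name fetch_and_add is made up for illustration, and it uses "+r" and "+m" read-write operands instead of the question's "a"/"r" pair. On non-x86 targets it falls back to GCC's __atomic_fetch_add builtin, which is an assumption about the compiler (GCC or Clang).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: atomically adds `add` to *mem and returns the value
 * that *mem held BEFORE the addition, which is what xadd leaves in the
 * register operand. */
static inline int fetch_and_add(volatile int *mem, int add)
{
    assert(mem != NULL);
#if defined(__x86_64__) || defined(__i386__)
    __asm__ volatile(
        "lock xaddl %0, %1"          /* swap %0 with *mem, then *mem = sum */
        : "+r"(add), "+m"(*mem)      /* both operands are read and written */
        :
        : "memory", "cc");           /* acts as a barrier and changes flags */
    return add;                      /* old value of *mem */
#else
    /* Assumption: GCC/Clang atomic builtin with the same semantics. */
    return __atomic_fetch_add(mem, add, __ATOMIC_SEQ_CST);
#endif
}
```

With a = 1, calling fetch_and_add(&a, 5) returns 1 and leaves a equal to 6, matching the walk-through above.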

In x86, XADD is the Exchange and Add instruction: it leaves the old value of the memory operand in the register operand. So the register holding the add parameter became 1 after the lock xadd instruction. add is then returned by atomic_add(), so you see 1 printed instead of 5.
For atomic_add() you probably want to just use lock add instead of lock xadd:
#include <stdio.h>
static inline int atomic_add(volatile int *mem, int add)
{
asm volatile(
"lock add %0, (%1);"
: "=a"(add)
: "r"(mem), "a"(add)
: "memory"
);
return add;
}
int main(void)
{
int a=1;
int b=5;
printf ( "%d\n", atomic_add(&a, b) );
return 0;
}
And this prints 5 like you expect:
$ ./a.out
5

Related

extended asm in gcc: ‘asm’ operand has impossible constraints

This function "strcpy" aims to copy the content of src to dest, and it works fine: it displays two lines of "Hello_src".
#include <stdio.h>
static inline char * strcpy(char * dest,const char *src)
{
int d0, d1, d2;
__asm__ __volatile__("1:\tlodsb\n\t"
"stosb\n\t"
"testb %%al,%%al\n\t"
"jne 1b"
: "=&S" (d0), "=&D" (d1), "=&a" (d2)
: "0"(src),"1"(dest)
: "memory");
return dest;
}
int main(void) {
char src_main[] = "Hello_src";
char dest_main[] = "Hello_des";
strcpy(dest_main, src_main);
puts(src_main);
puts(dest_main);
return 0;
}
I tried to change the line : "0"(src),"1"(dest) to : "S"(src),"D"(dest), and this error occurred: ‘asm’ operand has impossible constraints. I just cannot understand it. I thought that "0"/"1" here specified the same constraint as the 0th/1st output operand. The constraint of the 0th output is =&S, and the constraint of the 1st output is =&D. If I change 0-->S and 1-->D, there shouldn't be anything wrong. What's the matter with it?
Do "clobbered registers" or the earlyclobber modifier (&) have any use? I tried removing "&" or "memory", and in either case the result is the same as the original: two lines of "Hello_src". So why should I use the "clobbered" things?
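The earlyclobber & means that the particular output is written before the inputs are consumed. As such, the compiler may not allocate any input to the same register. With "S"(src) as an input and "=&S"(d0) as an output, both operands require esi yet are forbidden from sharing it, hence the "impossible constraints" error. A matching constraint such as "0" explicitly ties the input to that output's register, so sharing is permitted there.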
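One way to sidestep the matching-constraint question is to declare the pointers as read-write operands with "+". The sketch below is an illustration, not the original poster's code: copy_string is a made-up name, and the plain C loop is only a fallback for non-x86 targets.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical variant of the question's strcpy: "+S" and "+D" say that
 * esi/edi are both read and modified, so no dummy outputs are needed. */
static char *copy_string(char *dest, const char *src)
{
    char *d = dest;                  /* working copy; dest itself is returned */
    assert(dest != NULL && src != NULL);
#if defined(__x86_64__) || defined(__i386__)
    int scratch;                     /* receives eax, which lodsb overwrites */
    __asm__ __volatile__(
        "1:\tlodsb\n\t"              /* al = *src++ */
        "stosb\n\t"                  /* *d++ = al   */
        "testb %%al,%%al\n\t"
        "jne 1b"
        : "+S"(src), "+D"(d), "=&a"(scratch)
        :
        : "memory", "cc");
#else
    while ((*d++ = *src++) != '\0') { /* portable fallback */ }
#endif
    return dest;
}
```

The "memory" clobber still matters here: the asm writes through d, and nothing else tells the compiler that the destination buffer has changed.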
Of course the clobber list also has important use. The compiler does not parse your assembly code. It needs the clobber list to figure out which registers your code will modify. You'd better not lie, or subtle bugs may creep in. If you want to see its effect, try to trick the compiler into using a register around your asm block:
extern int foo();
int bar()
{
int x = foo();
asm("nop" ::: "eax");
return x;
}
Relevant part of the generated assembly code:
call foo
movl %eax, %edx
nop
movl %edx, %eax
Notice how the compiler had to save the return value from foo into edx because it believed that eax will be modified. Normally it would just leave it in eax, since that's where it will be needed later. Here you can imagine what would happen if your asm code did modify eax without telling the compiler: the return value would be overwritten.

Beginner Inline Assembly Segmentation fault

I am writing Inline assembly for the first time and I don't know why I'm getting a Seg fault when I try to run it.
#include <stdio.h>
int very_fast_function(int i){
asm volatile("movl %%eax,%%ebx;"
"sall $6,%%ebx;"
"addl $1,%%ebx;"
"cmpl $1024,%%ebx;"
"jle Return;"
"addl $1,%%eax;"
"jmp End;"
"Return: movl $0,%%eax;"
"End: ret;": "=eax" (i) : "eax" (i) : "eax", "ebx" );
return i;
/*if ( (i*64 +1) > 1024) return ++i;
else return 0;*/
}
int main(int argc, char *argv[])
{
int i;
i=40;
printf("The function value of i is %d\n", very_fast_function(i));
return 0;
}
Like I said this is my first time so if it's super obvious I apologize.
Do not use ret directly. On entry, every function runs setup code (pushing registers onto the stack, saving the frame pointer), and on exit it runs the matching cleanup. A bare ret skips that cleanup and returns with the stack not restored.
Remove the ret and the segmentation fault goes away.
However, I suspect the result is still not what you expect, because your input/output constraints do not mean what you think. "=eax" (i) does not select %eax as the output for i; each letter is a separate constraint, so it applies the constraints e, a and x to the output variable i.
For your purpose you could simply use r to specify a register. See this edited code which I've just tested:
asm volatile("movl %1,%%ebx;"
"sall $6,%%ebx;"
"addl $1,%%ebx;"
"cmpl $1024,%%ebx;"
"jle Return;"
"addl $1,%0;"
"jmp End;"
"Return: movl $0,%0;"
/* "0" ties the input to the output register, so addl $1,%0 reads a defined value */
"End: ;": "=r" (i) : "0" (i) : "ebx", "cc" );
To use %eax explicitly, use "=a" instead of "=r".
For further information, please read this http://www.ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html
ret should not be used in inline assembly blocks - the function you're in needs some cleanup beyond what a simple ret will handle.
Remember, inline assembly is inserted directly into the function it's embedded in. It's not a function unto itself.
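Because the asm text is pasted into whichever function contains it, named labels such as Return and End can collide if the compiler inlines or duplicates the block. A hedged sketch using GAS numeric local labels and a single read-write operand is shown below. The function names are made up, and the C version comes from the comment in the question.

```c
#include <assert.h>

/* Reference behaviour, taken from the commented-out C in the question. */
static int very_fast_c(int i)
{
    return (i * 64 + 1 > 1024) ? i + 1 : 0;
}

/* Hypothetical asm variant: no ret, no hard-coded registers, local labels. */
static int very_fast_asm(int i)
{
#if defined(__x86_64__) || defined(__i386__)
    int tmp;                          /* scratch register chosen by GCC */
    __asm__(
        "movl %0, %1\n\t"
        "sall $6, %1\n\t"             /* tmp = i * 64 */
        "addl $1, %1\n\t"             /* tmp += 1     */
        "cmpl $1024, %1\n\t"
        "jle 1f\n\t"                  /* tmp <= 1024: result is 0 */
        "addl $1, %0\n\t"             /* otherwise:   result is i + 1 */
        "jmp 2f\n"
        "1:\tmovl $0, %0\n"
        "2:"
        : "+r"(i), "=&r"(tmp)
        :
        : "cc");
    return i;
#else
    return very_fast_c(i);            /* fallback on non-x86 targets */
#endif
}

/* Compare the asm against the C reference over a small range. */
static void check_against_reference(void)
{
    for (int i = 0; i < 100; i++)
        assert(very_fast_asm(i) == very_fast_c(i));
}
```

The labels 1: and 2: may be reused in any number of asm blocks; 1f and 2f always refer to the next definition going forward.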

Swap with push / assignment / pop in GNU C inline assembly?

I was reading some questions and answers on here and kept coming across this suggestion, but I noticed no one ever explained exactly how to do it on Windows, with an Intel CPU and the GCC compiler. Commented below is exactly what I am trying to do.
#include <stdio.h>
int main()
{
int x = 1;
int y = 2;
//assembly code begin
/*
push x into stack; < Need Help
x=y; < With This
pop stack into y; < Please
*/
//assembly code end
printf("x=%d,y=%d",x,y);
getchar();
return 0;
}
You can't safely push/pop from inline asm if the code is going to be portable to systems with a red zone. That includes every non-Windows x86-64 platform. (There's no way to tell gcc you want to clobber it.) You could add rsp, -128 first to skip past the red zone before pushing/popping anything, then restore it later. But then you can't use an "m" constraint, because the compiler might use RSP-relative addressing with offsets that assume RSP hasn't been modified.
But really this is a ridiculous thing to be doing in inline asm.
Here's how you use inline-asm to swap two C variables:
#include <stdio.h>
int main()
{
int x = 1;
int y = 2;
asm("" // no actual instructions.
: "=r"(y), "=r"(x) // request both outputs in the compiler's choice of register
: "0"(x), "1"(y) // matching constraints: request each input in the same register as the other output
);
// apparently "=m" doesn't compile: you can't use a matching constraint on a memory operand
printf("x=%d,y=%d\n",x,y);
// getchar(); // Set up your terminal not to close after the program exits if you want similar behaviour: don't embed it into your programs
return 0;
}
gcc -O3 output (targeting the x86-64 System V ABI, not Windows) from the Godbolt compiler explorer:
.section .rodata
.LC0:
.string "x=%d,y=%d"
.section .text
main:
sub rsp, 8
mov edi, OFFSET FLAT:.LC0
xor eax, eax
mov edx, 1
mov esi, 2
#APP
# 8 "/tmp/gcc-explorer-compiler116814-16347-5i3lz1/example.cpp" 1
# I used "\n" instead of just "" so we could see exactly where our inline-asm code ended up.
# 0 "" 2
#NO_APP
call printf
xor eax, eax
add rsp, 8
ret
C variables are a high level concept; it doesn't cost anything to decide that the same registers now logically hold different named variables, instead of swapping the register contents without changing the varname->register mapping.
When hand-writing asm, use comments to keep track of the current logical meaning of different registers, or parts of a vector register.
The inline-asm didn't lead to any extra instructions outside the inline-asm block either, so it's perfectly efficient in this case. Still, the compiler can't see through it, and doesn't know that the values are still 1 and 2, so further constant-propagation would be defeated. https://gcc.gnu.org/wiki/DontUseInlineAsm
#include <stdio.h>
int main()
{
int x=1;
int y=2;
printf("x::%d,y::%d\n",x,y);
__asm__( "movl %1, %%eax;"
"movl %%eax, %0;"
:"=r"(y)
:"r"(x)
:"%eax"
);
printf("x::%d,y::%d\n",x,y);
return 0;
}
/* Load x to eax
Load eax to y */
Values can also be moved between variables this way, with explicit mov instructions; note that this code only copies x into y and does not exchange them. The clobber list tells GCC that the EAX register is modified. For educational purposes this is okay, but I find it more suitable to leave micro-optimizations to the compiler.
You can use extended inline assembly. It is a compiler feature which allows you to write assembly instructions within your C code. A good reference for inline gcc assembly is available here.
The following code swaps the values of x and y using push and pop instructions.
( compiled and tested using gcc on x86_64 )
This is only safe if compiled with -mno-red-zone, or if you subtract 128 from RSP before pushing anything. It will happen to work without problems in some functions: testing with one set of surrounding code is not sufficient to verify the correctness of something you did with GNU C inline asm.
#include <stdio.h>
int main()
{
int x = 1;
int y = 2;
asm volatile (
"pushq %%rax\n" /* Push x into the stack */
"movq %%rbx, %%rax\n" /* Copy y into x */
"popq %%rbx\n" /* Pop x into y */
: "=b"(y), "=a"(x) /* OUTPUT values */
: "a"(x), "b"(y) /* INPUT values */
: /*No need for the clobber list, since the compiler knows
which registers have been modified */
);
printf("x=%d,y=%d",x,y);
getchar();
return 0;
}
Result x=2 y=1, as you expected.
The Intel compiler works in a similar way; I think you just have to change the keyword asm to __asm__. You can find info about inline assembly for the Intel compiler here.
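The red-zone caveat mentioned earlier can be handled inside the asm block itself. The following is only a sketch under stated assumptions: swap_via_stack is a made-up name, the operands are register-only ("r") so that no RSP-relative addressing is involved, and targets other than x86-64 get a plain C swap.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: swaps *x and *y with push/pop, stepping over the
 * 128-byte red zone first so the push cannot overwrite compiler data. */
static void swap_via_stack(int *x, int *y)
{
    assert(x != NULL && y != NULL);
#if defined(__x86_64__)
    long long a = *x, b = *y;         /* 64-bit so push/pop take full registers */
    __asm__ volatile(
        "add $-128, %%rsp\n\t"        /* skip past the red zone */
        "push %0\n\t"                 /* save a                 */
        "mov %1, %0\n\t"              /* a = b                  */
        "pop %1\n\t"                  /* b = saved a            */
        "sub $-128, %%rsp"            /* restore rsp            */
        : "+r"(a), "+r"(b)
        :
        : "memory", "cc");
    *x = (int)a;
    *y = (int)b;
#else
    int t = *x; *x = *y; *y = t;      /* fallback on other targets */
#endif
}
```

This is still strictly worse than letting the compiler rename registers, as the empty-asm swap shows; it only illustrates what a safe push/pop sequence has to do.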

asm in C "too many memory references for `mov'"

I've seen the post about the same error, but I'm still getting these errors:
too many memory references for `mov'
junk `hCPUIDmov buffer' after expression
... here's the code (mingw compiler / C::B) :
#include <iostream>
using namespace std;
union aregister
{
int theint;
unsigned bits[32];
};
union tonibbles
{
int integer;
short parts[2];
};
void GetSerial()
{
int part1,part2,part3;
aregister issupported;
int buffer;
__asm(
"mov %eax, 01h"
"CPUID"
"mov buffer, edx"
);//do the cpuid, move the edx (feature set register) to "buffer"
issupported.theint = buffer;
if(issupported.bits[18])//it is supported
{
__asm(
"mov part1, eax"
"mov %eax, 03h"
"CPUID"
);//move the first part into "part1" and call cpuid with the next subfunction to get
//the next 64 bits
__asm(
"mov part2, edx"
"mov part3, ecx"
);//now we have all the 96 bits of the serial number
tonibbles serial[3];//to split it up into two nibbles
serial[0].integer = part1;//first part
serial[1].integer = part2;//second
serial[2].integer = part3;//third
}
}
Your assembly code is not correctly formatted for gcc.
Firstly, gcc uses AT&T syntax (EDIT: by default, thanks nrz), so it needs a % added for each register reference and a $ for immediate operands. The destination operand is always on the right side.
Secondly, you'll need to pass a line separator (for example \n\t) for a new line. Since gcc passes your string straight to the assembler, it requires a particular syntax.
You should usually try hard to minimize your assembler since it may cause problems for the optimizer. Simplest way to minimize the assembler required would probably be to break the cpuid instruction out into a function, and reuse that.
#include <stdint.h> /* for int32_t */
void cpuid(int32_t *peax, int32_t *pebx, int32_t *pecx, int32_t *pedx)
{
__asm(
"CPUID"
/* All outputs (eax, ebx, ecx, edx) */
: "=a"(*peax), "=b"(*pebx), "=c"(*pecx), "=d"(*pedx)
/* All inputs (eax) */
: "a"(*peax)
);
}
Then simply call it like this:
int a=1, b, c, d;
cpuid(&a, &b, &c, &d);
Another possibly more elegant way is to do it using macros.
Because of how C works,
__asm(
"mov %eax, 01h"
"CPUID"
"mov buffer, edx"
);
is equivalent to
__asm("mov %eax, 01h" "CPUID" "mov buffer, edx");
which is equivalent to
__asm("mov %eax, 01hCPUIDmov buffer, edx");
which isn't what you want.
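The concatenation can be checked in plain C without assembling anything. In this small sketch the two arrays hold template text only; count_lines and is_fused are made-up helpers for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Adjacent string literals are joined with nothing in between. */
static const char joined[] = "mov %eax, 01h" "CPUID" "mov buffer, edx";

/* Ending each piece with "\n\t" gives the assembler one statement per line.
 * (Text as it would be written in an extended-asm template.) */
static const char separated[] = "movl $1, %%eax\n\t" "cpuid\n\t" "movl %%edx, %0";

/* Number of lines the assembler would see in a template. */
static size_t count_lines(const char *tmpl)
{
    size_t lines = 1;
    assert(tmpl != NULL);
    for (; *tmpl != '\0'; tmpl++)
        if (*tmpl == '\n')
            lines++;
    return lines;
}

/* Returns 1 if the mnemonics have been glued to their neighbours. */
static int is_fused(const char *tmpl)
{
    assert(tmpl != NULL);
    return strstr(tmpl, "01hCPUIDmov") != NULL;
}
```

The fused text "01hCPUIDmov buffer" is exactly what the assembler's "junk ... after expression" message is complaining about.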
AT&T syntax (GAS's default) puts the destination register at the end.
AT&T syntax requires immediates to be prefixed with $.
You can't reference local variables like that; you need to pass them in as operands.
Wikipedia's article gives a working example that returns eax.
The following snippet might cover your use-cases (I'm not intricately familiar with GCC inline assembly or CPUID):
int eax = 1, ebx, ecx = 0, edx; /* eax selects the leaf; ecx is zeroed in case the leaf takes a sub-leaf */
__asm( "cpuid"
: "+a" (eax), "=b" (ebx), "+c" (ecx), "=d" (edx));
buffer = edx;
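Putting the pieces together, a wrapper along these lines could replace the three separate asm blocks in the question. It is a sketch, not a vetted implementation: cpuid_leaf and has_psn are made-up names, it assumes the CPU supports the cpuid instruction at all, and on non-x86 targets it simply reports failure. Note that the feature bit is tested with a shift and mask rather than by indexing a union of 32 unsigned values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wrapper: runs cpuid for the given leaf/sub-leaf and stores
 * eax, ebx, ecx, edx in regs[0..3]. Returns 1 on x86, 0 elsewhere. */
static int cpuid_leaf(unsigned leaf, unsigned subleaf, unsigned regs[4])
{
    assert(regs != NULL);
#if defined(__x86_64__) || defined(__i386__)
    unsigned a = leaf, b, c = subleaf, d;
    __asm__ volatile("cpuid"
                     : "+a"(a), "=b"(b), "+c"(c), "=d"(d));
    regs[0] = a; regs[1] = b; regs[2] = c; regs[3] = d;
    return 1;
#else
    regs[0] = regs[1] = regs[2] = regs[3] = 0;
    return 0;
#endif
}

/* Bit 18 of edx from leaf 1 is the flag the question tests before
 * reading the serial number. */
static int has_psn(void)
{
    unsigned regs[4];
    if (!cpuid_leaf(1, 0, regs))
        return 0;
    return (int)((regs[3] >> 18) & 1u);
}
```

Keeping all four registers as outputs of one asm statement avoids the question's mistake of reading eax or edx in a later asm block, by which time the compiler may have reused them.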

Inline assembly and function overwriting resulting in a segfault

Somebody over at SO posted a question asking how he could "hide" a function. This was my answer:
#include <stdio.h>
#include <stdlib.h>
int encrypt(void)
{
char *text="Hello World";
asm("push text");
asm("call printf");
return 0;
}
int main(int argc, char *argv[])
{
volatile unsigned char *i=encrypt;
while(*i!=0x00)
*i++^=0xBE;
return EXIT_SUCCESS;
}
but, there are problems:
encode.c: In function `main':
encode.c:13: warning: initialization from incompatible pointer type
C:\DOCUME~1\Aviral\LOCALS~1\Temp/ccYaOZhn.o:encode.c:(.text+0xf): undefined reference to `text'
C:\DOCUME~1\Aviral\LOCALS~1\Temp/ccYaOZhn.o:encode.c:(.text+0x14): undefined reference to `printf'
collect2: ld returned 1 exit status
My first question is why is the inline assembly failing ... what would be the right way to do it? Other thing -- the code for "ret" or "retn" is 0x00 , right... my code xor's stuff until it reaches a return ... so why is it SEGFAULTing?
As a high-level point, I'm not quite sure why you're trying to use inline assembly for a simple call to printf, since all you've done is create an incorrect version of a function call. Your inline asm pushes something onto the stack but never pops it off, which will most likely cause problems because GCC isn't aware that you've modified the stack pointer in the middle of the function. This is fine in a trivial example, but could lead to non-obvious errors in a more complicated function.
Here's a correct implementation of your top function:
int encrypt(void)
{
char *text="Hello World";
char *formatString = "%s\n";
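// 32-bit x86 (cdecl) only; volatile really isn't necessary but I just use it by habit
asm volatile("pushl %0;\n\t"
"pushl %1;\n\t"
"call printf;\n\t"
"addl $0x8, %%esp\n\t"
:
: "r"(text), "r"(formatString)
// the call may overwrite the caller-saved registers, the flags and memory
: "eax", "ecx", "edx", "cc", "memory"
);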
return 0;
}
As for your last question, the usual opcode for RET is "C3", but there are many variations, have a look at http://pdos.csail.mit.edu/6.828/2009/readings/i386/RET.htm
Your idea of searching for RET is also faulty, because seeing the byte 0xC3 in a stream of instructions does NOT mean you've encountered a ret. The 0xC3 may simply be an operand or encoding byte of another instruction. (As a side note, x86 is particularly hard to parse the way you're doing it, because it is a CISC architecture with instruction lengths from 1 to 15 bytes.)
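A small byte-level illustration of that point follows. The six bytes are a hand-assembled example (movl $0xc3, %eax followed by ret), and first_index_of is a made-up helper.

```c
#include <assert.h>
#include <stddef.h>

/* Two instructions: the first one carries 0xC3 as immediate data. */
static const unsigned char code[] = {
    0xB8, 0xC3, 0x00, 0x00, 0x00,    /* movl $0xc3, %eax */
    0xC3                             /* ret              */
};

/* Index of the first occurrence of `byte` in buf, or -1 if absent. */
static ptrdiff_t first_index_of(const unsigned char *buf, size_t len,
                                unsigned char byte)
{
    assert(buf != NULL);
    for (size_t i = 0; i < len; i++)
        if (buf[i] == byte)
            return (ptrdiff_t)i;
    return -1;
}
```

A naive scan for 0xC3 stops at index 1, inside the mov, while the real ret is at index 5. A scan for 0x00, as in the question's loop, stops at index 2, again in the middle of the mov.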
As another note, not all OSes allow modification of the text/code segment (where executable instructions are stored), so the code you have in main may not work regardless.
GCC inline asm uses AT&T syntax (if no specific options are selected for using Intel's one).
Here's an example:
int a=10, b;
asm ("movl %1, %%eax;"
"movl %%eax, %0;"
:"=r"(b) /* output */
:"r"(a) /* input */
:"%eax" /* clobbered register */
);
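Thus, your problem is that the local variable "text" is not a symbol the assembler can see from your push instruction (and the same applies to the following instruction); it has to be passed in as an operand.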
See here for reference.
Moreover your code is not portable between 32 and 64 bit environments. Compile it with -m32 flag to ensure proper analysis (GCC will complain anyway if you fall in error).
A complete solution to your problem is on this post on GCC Mailing list.
Here's a snippet:
for ( i = method->args_size - 1; i >= 0; i-- ) {
asm( "pushl %0": /* no outputs */: \
"g" (stack_frame->op_stack[i]) );
}
asm( "call *%0" : /* no outputs */ : "g" (fp) :
"%eax", "%ecx", "%edx", "%cc", "memory" );
asm ( "movl %%eax, %0" : "=g" (ret_value) : /* No inputs */ );
On windows systems there's also an additional asm ( "addl %0, %%esp" : /* No outputs */ : "g" (method->args_size * 4) ); to do. Google for better details.
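On this target the symbol is not printf but _printf: 32-bit Windows toolchains such as MinGW prefix C symbol names with an underscore.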

Resources