INC malfunction in inline assembly - c

In this code:
int a[2]={5,2},i=0;
asm volatile
(
"incl %1\n"
"incl %0"
:"+r"(a[i]),"+r"(i)
:
:
);
printf("%d\n",a[i]);
I'm trying to increase a[1] by 1 (for a result of 2+1=3) but the output shows 2, which means it hasn't changed. What's the problem and how can I fix it?
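A likely explanation: the compiler evaluates the operand expressions before the asm statement runs, so "+r"(a[i]) is bound to a[0] while i is still 0. The asm then increments i and a register copy of a[0], and that copy is written back to a[0]. a[1] is never touched, so printf reads an unchanged a[1]. A minimal sketch of one possible fix, assuming x86 with GCC or Clang, is to split the work into two statements so that a[i] is only evaluated after i has changed. The name bump_second is made up for this example, and the #else branch is just a plain-C fallback for other targets.

```c
#include <stdio.h>

/* Hypothetical helper: increments i, then a[i], and returns the new a[i]. */
int bump_second(void)
{
    int a[2] = {5, 2}, i = 0;
#if defined(__x86_64__) || defined(__i386__)
    /* First statement: only i changes (0 -> 1). */
    __asm__ volatile ("incl %0" : "+r"(i));
    /* Second statement: a[i] is evaluated here, with i already equal to 1. */
    __asm__ volatile ("incl %0" : "+r"(a[i]));
#else
    /* Fallback for non-x86 targets, so the example still builds. */
    ++i;
    ++a[i];
#endif
    printf("%d\n", a[i]);   /* prints 3 */
    return a[i];
}
```

Incrementing i in plain C and keeping only the second asm statement would work just as well.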

Related

arm inline assembly - store C variable in arm register

I'm trying to store a variable in an ARM register using inline assembly.
unsigned int lma_offset = 0x1234; // typically calculated, hardcoded for example
__asm volatile ("MOV R10, %[input]"
: [input] "=r" (lma_offset)
);
This changes lma_offset to 0xd100 in my case, instead of setting the register. What am I doing wrong?
PS: when I declare lma_offset as const, I get a compiler error because lma_offset is used as an output. So obviously something is wrong, but I still can't find the correct syntax for this.
For future reference, according to Eric's comment:
const unsigned int lma_offset = 0x10000;
__asm__ volatile ("MOV R10, %[input]"
: // no C variable outputs
: [input] "r" (lma_offset)
: "R10" // tell the compiler R10 is modified
);
Using a double colon (leaving the output list empty) and replacing "=r" with "r" does indeed solve the problem.
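The difference between the two operand lists can be seen in a small sketch. Here value sits in the input list (the asm only reads it) and out sits in the output list (the asm only writes it). The helper name copy_through_register is hypothetical. The asm is compiled only for 32-bit ARM, and other targets fall back to plain C so the file still builds.

```c
/* Hypothetical helper: copies value into out through a register move. */
unsigned int copy_through_register(unsigned int value)
{
    unsigned int out;
#if defined(__arm__)
    __asm__ volatile ("MOV %[out], %[in]"
                      : [out] "=r" (out)    /* output: written by the asm */
                      : [in]  "r"  (value)  /* input: only read by the asm */
                      );
#else
    /* Fallback for non-ARM targets. */
    out = value;
#endif
    return out;
}
```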
It would also be possible to ask the compiler to have that constant already in R10 for an asm statement, by using a register local variable to force the "r" input to pick r10. (Then we can omit the redundant mov r10, r10).
register unsigned r10 __asm__("r10") = lma_offset; // picks r10 for "r" constraints, no other *guaranteed* effects
__asm__ volatile ("# inline asm went here" // empty template, actually just a comment you can see if looking at the compiler's asm output
: // no C variable outputs
: [input] "r" (r10) // pass the register variable itself, so "r" picks r10
: // no clobbers needed
);
Going the other way, copying a register into an output C variable looks like this:
unsigned int lma_offset = 0x0;
__asm__ volatile ("MOV %[output], R11"
: [output] "=r" (lma_offset)
// no clobbers needed; reading a register doesn't step on the compiler's toes
);

How can I call a function in inline assembly from C [duplicate]

This question already has answers here:
Referencing memory operands in .intel_syntax GNU C inline assembly
Calling printf in extended inline ASM
Is this assembly function call safe/complete?
Calling a function in gcc inline assembly
I am currently playing around with inline assembly and I've gotten a bit stuck. I have managed to call a function with no parameters, but I can't work out how to call one with two parameters.
My code below should call a function (add) that adds two predefined numbers together, and then a second one (add_parameter) that takes two parameters and adds them together.
#include <stdio.h>
int c = 4;
int d = 5;
void add() {
int result = 1 + 2;
printf("Result: %d\n", result);
}
void add_parameter(int a, int b) {
int result = a + b;
printf("Result: %d\n", result);
}
int main()
{
__asm__ __volatile__ ( "call add" );
// __asm__ __volatile__(
// "mov eax, offset c"
// "push eax"
// "mov eax, offset d"
// "push eax"
// "call add_parameter"
// "pop ebx"
// "pop ebx"
// );
__asm__ __volatile__ ( "mov eax, offset c" );
__asm__ __volatile__ ( "push eax" );
__asm__ __volatile__ ( "mov eax, offset d" );
__asm__ __volatile__ ( "push eax" );
__asm__ __volatile__ ( "call add_parameter" );
__asm__ __volatile__ ( "pop ebx" );
__asm__ __volatile__ ( "pop ebx" );
return 0;
}
My problem at the moment is that when I try to compile the program, I get these errors:
p_function.c:31: Error: too many memory references for `mov'
p_function.c:33: Error: too many memory references for `mov'
In my program I've tried two approaches: a single asm statement containing all of the assembly, and one asm statement per instruction.
Unfortunately, I am not sure which of these approaches is correct, let alone the most effective, but I get the same error with either one.
How can I fix this problem and call the function add_parameter?
Thanks
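The error most likely comes from the syntax rather than from the call itself: GNU as defaults to AT&T syntax, so an Intel-style line such as mov eax, offset c is not parsed as intended. Building with -masm=intel is one option, and writing AT&T syntax is another. Note also that offset c would pass the address of c, while add_parameter expects the values. Here is a minimal sketch, assuming x86 with GCC or Clang, that sidesteps a hand-written call: the constraints let the compiler supply the operands, and the call is made in C. In this sketch add_parameter returns its sum (a change from the question, so the result can be checked), and add_globals is a hypothetical name.

```c
#include <stdio.h>

int c = 4;
int d = 5;

/* Unlike the question's version, this returns the sum so it can be checked. */
int add_parameter(int a, int b)
{
    return a + b;
}

/* Hypothetical helper: loads c and d through inline asm, then calls from C. */
int add_globals(void)
{
    int a, b;
#if defined(__x86_64__) || defined(__i386__)
    /* AT&T syntax: source operand first, destination last. */
    __asm__ ("movl %1, %0" : "=r"(a) : "m"(c));
    __asm__ ("movl %1, %0" : "=r"(b) : "m"(d));
#else
    /* Fallback for non-x86 targets. */
    a = c;
    b = d;
#endif
    int result = add_parameter(a, b);
    printf("Result: %d\n", result);   /* prints "Result: 9" */
    return result;
}
```

A call written inside the asm template also has to account for the calling convention, the registers the callee may clobber, and the stack, which is why the call stays in C here.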

How to update an array in vectorized assembly(AVX)?

inline void addition(double * x, const double * vx,uint32_t size){
/*for (uint32_t i=0;i<size;++i){
x[i] = x[i] + vx[i];
}*/
__asm__ __volatile__ (
"1: \n\t"
"vmovupd -32(%0), %%ymm1\n\t"
"vmovupd (%0), %%ymm0\n\t"
"vaddpd -32(%1), %%ymm0, %%ymm0\n\t"
"vaddpd (%1), %%ymm1, %%ymm1\n\t"
"vmovupd %%ymm0, -32(%0)\n\t"
"vmovupd %%ymm1, (%0)\n\t"
"addq $128, %0\n\t"
"addq $128, %1\n\t"
"addl $-8, %2\n\t"
"jne 1b"
:
: "r" (x),"r"(vx),"r"(size)
: "ymm0", "ymm1"
);
}
I am practicing assembly (AVX instructions) right now, so I wrote the above piece of inline assembly to replace the C code in the original function (which is commented out). It compiles successfully, but when I run the program it fails with: Bus error: 10
Any thoughts on this bug? I don't know what's wrong here. The compiler version is clang 602.0.53. Thank you!
Inline assembly is a complicated beast. If you just want to practice AVX assembly, use a separate asm file where you don't have to put up with the compiler. In exchange, you will need to observe the calling convention.
You have some issues with the constraints. For example, you change all of your input registers without telling the compiler, which can cause all sorts of weird problems elsewhere in the compiler-generated code. You also need a "memory" clobber, because the asm writes to x[] behind the compiler's back.
Also, learn to use a debugger so you can find the exact cause of problems and fix your own code.
Failing that, at least comment your code so we can figure out your intentions. In this case, I am particularly puzzled why you use a -32 offset, which addresses memory before the start of the array; I think you wanted +32 there. With two AVX registers of 32 bytes each, you need to advance the pointers by 64 bytes per iteration, not 128. You also have ymm0 and ymm1 swapped in the initial load.
This code seems to work fine for me:
#include <stdio.h>
#include <stdint.h>
static inline void addition(double * x, const double * vx,uint32_t size){ /* static: a plain C99 inline needs a separate external definition to link */
/*for (uint32_t i=0;i<size;++i){
x[i] = x[i] + vx[i];
}*/
__asm__ __volatile__ (
"1: \n\t"
"vmovupd 32(%0), %%ymm0\n\t"
"vmovupd (%0), %%ymm1\n\t"
"vaddpd 32(%1), %%ymm0, %%ymm0\n\t"
"vaddpd (%1), %%ymm1, %%ymm1\n\t"
"vmovupd %%ymm0, 32(%0)\n\t"
"vmovupd %%ymm1, (%0)\n\t"
"addq $64, %0\n\t"
"addq $64, %1\n\t"
"addl $-8, %2\n\t"
"jne 1b"
: "+r" (x),"+r"(vx),"+r"(size)
:
: "ymm0", "ymm1", "memory"
);
}
int main()
{
double x[] = { 1, 2, 3, 4, 5, 6, 7, 8 };
double vx[] = { 9, 10, 11, 12, 13, 14, 15, 16 };
int i;
addition(x, vx, 8);
for(i = 0; i < 8; i++) printf("%g ", x[i]);
putchar('\n');
return 0;
}
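Note that the asm loop above assumes size is a nonzero multiple of 8. With any other value the counter never hits zero at the right moment and the loop runs past the end of the arrays. As a sketch of how the remainder could be handled, here is the same addition written with AVX intrinsics instead of inline asm (a different technique from the answer above), assuming GCC or Clang. The names add_avx and addition_checked are hypothetical. The AVX path is taken only when the CPU reports AVX support, and a scalar loop covers the tail and non-x86 targets.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

/* AVX path: 4 doubles per iteration, then a scalar tail. */
__attribute__((target("avx")))
static void add_avx(double *x, const double *vx, uint32_t n)
{
    uint32_t i = 0;
    for (; n - i >= 4; i += 4) {
        __m256d a = _mm256_loadu_pd(x + i);   /* unaligned load, like vmovupd */
        __m256d b = _mm256_loadu_pd(vx + i);
        _mm256_storeu_pd(x + i, _mm256_add_pd(a, b));
    }
    for (; i < n; ++i)                        /* leftover 0..3 elements */
        x[i] += vx[i];
}
#endif

/* Hypothetical wrapper: x[i] += vx[i] for any n, including 0. */
void addition_checked(double *x, const double *vx, uint32_t n)
{
#if defined(__x86_64__) || defined(__i386__)
    if (__builtin_cpu_supports("avx")) {
        add_avx(x, vx, n);
        return;
    }
#endif
    for (uint32_t i = 0; i < n; ++i)          /* portable fallback */
        x[i] += vx[i];
}
```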

Inserting the address of a constant in inline assembly code

I want to translate this function:
iowrite32(value, mem1); /* the kernel's argument order is (value, address) */
into assembly code.
mem1 is defined as:
int * mem1;
in order to use ioremap.
I've written this code:
asm volatile(
"mov %[whr],%[wht]"
: [whr] "=r" (mem1)
: [wht] "r" (value)
);
Then I realized that I don't want to move value into mem1 itself, but to the address stored in mem1.
How do I write it in assembly?
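You might want to take a look at the "m" (memory) constraint. With *mem1 as the operand, the asm gets the memory location that mem1 points to rather than the pointer itself: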
asm volatile(
"mov %[wht], %[whr];"
: [whr] "=m" (*mem1)
: [wht] "r" (value)
);
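A user-space sketch of the same idea, assuming x86 with AT&T syntax (source first, destination last) and GCC or Clang. store32 is a hypothetical name, not a kernel API. On ARM, MOV cannot write to memory, so a store instruction would be needed there instead; the #else branch only keeps the example buildable on other targets.

```c
/* Hypothetical helper: writes value to the location addr points to. */
void store32(volatile int *addr, int value)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ volatile ("movl %[val], %[dst]"
                      : [dst] "=m" (*addr)   /* the pointed-to memory is the output */
                      : [val] "r"  (value)   /* the value comes in a register */
                      );
#else
    /* Fallback for non-x86 targets. */
    *addr = value;
#endif
}
```

Because the pointed-to object itself is listed as an output, the compiler knows which memory changes, and no separate "memory" clobber is needed for this single store.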

Use C variables in ARM Neon assembly

I have a problem using C/C++ variables inside ARM NEON assembly code written in:
__asm__ __volatile()
I've read about the following possibilities, which should move values from ARM to NEON registers. Each of them causes a fatal signal in my Android application:
VDUP.32 d0, %[variable]
VMOV.32 d0[0], %[variable]
The input argument list includes:
[variable] "r" (variable)
The only approach that works for me is a load:
int variable = 0;
int *address = &variable;
....
VLD1.32 d0[0], [%[address]]
: [address] "+r" (address)
But I think a load is not the best option for performance if I don't need to modify the variable, and I also need to understand how to move data from ARM to NEON registers for other purposes.
EDIT: added an example as requested; both possibility 1 and possibility 2 result in a "fatal signal". I know that in this example the NEON assembly should simply modify the first 2 elements of "array[4]".
int c = 10;
int *array4;
array4 = new int[64];
for(int i = 0; i < 64; i++){
array4[i] = 100*i;
}
__asm__ __volatile ("VLD1.32 d0, [%[array4]] \n\t"
"VMOV.32 d1[0], %[c] \n\t" //this is possibility 1
"VDUP.32 d2, %[c] \n\t" //this is possibility 2
"VMUL.S32 d0, d0, d2 \n\t"
"VST1.32 d0, [%[output_array1]] \n\t"
: [output_array1] "=r" (output_array1)
: [c] "r" (c), [array4] "r" (array4)
: "d0", "d1", "d2");
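The problem is caused by the output list. Declaring output_array1 as an "=r" output tells the compiler that the asm writes that register, so the pointer value is never loaded into it and VST1 stores through whatever the register happens to contain. Passing the output array address as an input operand fixes the crashes.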
int c = 10;
int *array4;
array4 = new int[64];
for(int i = 0; i < 64; i++){
array4[i] = 100*i;
}
__asm__ __volatile ("VLD1.32 d0, [%[array4]] \n\t"
"VMOV.32 d1[0], %[c] \n\t" //this is possibility 1
"VDUP.32 d2, %[c] \n\t" //this is possibility 2
"VMUL.S32 d0, d0, d2 \n\t"
"VST1.32 d0, [%[output_array1]] \n\t"
:
: [c] "r" (c), [array4] "r" (array4), [output_array1] "r" (output_array1)
: "d0", "d1", "d2", "memory"); // "memory": the asm stores through output_array1
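If the goal is only to get a scalar from an ARM register into a NEON register, NEON intrinsics are an alternative to inline asm that leaves register allocation to the compiler. The sketch below is not the code from the answer above. It assumes a compiler that provides arm_neon.h, and the helper name scale2 is hypothetical. On targets without NEON it falls back to plain C so that it still builds.

```c
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: out[0..1] = in[0..1] * c. */
void scale2(int32_t *out, const int32_t *in, int32_t c)
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    int32x2_t v = vld1_s32(in);      /* load two lanes, like VLD1.32 d0 */
    int32x2_t k = vdup_n_s32(c);     /* broadcast c into both lanes, like VDUP.32 */
    vst1_s32(out, vmul_s32(v, k));   /* multiply lane-wise and store two lanes */
#else
    /* Fallback for targets without NEON. */
    out[0] = in[0] * c;
    out[1] = in[1] * c;
#endif
}
```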
