Pass user input from C to Assembler - c

I am new to ARM Assembler. Using qemu emulator.
This solution didn't work for me.
I have this C file md1_main.c:
#include <stdio.h>
#include <stdlib.h>
#include "md1.h"
int main (void)
{
int n;
scanf("%d", &n);
printf("Result = %u\n", asum(n));
return 0;
}
and .h file contains function prototype unsigned int asum(unsigned int n);
I am really confused, how to pass n into the assembler code.
Assembler code is md1.s:
.text
.align 2
.global asum
.type asum, %function
asum:
mov r1, #0
mov r2, #1
loop:
cmp r2, #3 ; instead of 3 there should be my input
bgt end
add r1, r1, r2
add r2, r2, #1
b loop
end:
mov r0, r1
bx lr
Just can't get it.

OP has mentioned the architecture as ARM 64. So I will answer according the calling convention.
The first 4 arguments are passed in r0, r1, r2, r3.
That is what the compiler will also do for you when compiling the C file. So you can expect your parameter n in the r0 register and you can directly use it.
I also see that your function returns an unsigned value. That will be returned in the r0 register.
See this, for a more detailed description of the calling convention.

Related

gcc arm optimizes away parameters before System Call

I'm trying to implement some "OSEK-Services" on an arm7tdmi-s using gcc arm. Unfortunately turning up the optimization level results in "wrong" code generation. The main thing I dont understand is that the compiler seems to ignore the procedure call standard, e.g. passing parameters to a function by moving them into registers r0-r3. I understand that function calls can be inlined but still the parameters need to be in the registers to perform the system call.
Consider the following code to demonstrate my problem:
unsigned SysCall(unsigned param)
{
volatile unsigned ret_val;
__asm __volatile
(
"swi 0 \n\t" /* perform SystemCall */
"mov %[v], r0 \n\t" /* move the result into ret_val */
: [v]"=r"(ret_val)
:: "r0"
);
return ret_val; /* return the result */
}
int main()
{
unsigned retCode;
retCode = SysCall(5); // expect retCode to be 6 when returning back to usermode
}
I wrote the Top-Level software interrupt handler in assembly as follows:
.type SWIHandler, %function
.global SWIHandler
SWIHandler:
stmfd sp! , {r0-r2, lr} #save regs
ldr r0 , [lr, #-4] #load sysCall instruction and extract sysCall number
bic r0 , #0xff000000
ldr r3 , =DispatchTable #load dispatchTable
ldr r3 , [r3, r0, LSL #2] #load sysCall address into r3
ldmia sp, {r0-r2} #load parameters into r0-r2
mov lr, pc
bx r3
stmia sp ,{r0-r2} #store the result back on the stack
ldr lr, [sp, #12] #restore return address
ldmfd sp! , {r0-r2, lr} #load result into register
movs pc , lr #back to next instruction after swi 0
The dispatch table looks like this:
DispatchTable:
.word activateTaskService
.word getTaskStateService
The SystemCall function looks like this:
unsigned activateTaskService(unsigned tID)
{
return tID + 1; /* only for demonstration */
}
running without optimization everything works fine and the parameters are in the registers as to be expected:
See following code with -O0 optimization:
00000424 <main>:
424: e92d4800 push {fp, lr}
428: e28db004 add fp, sp, #4
42c: e24dd008 sub sp, sp, #8
430: e3a00005 mov r0, #5 #move param into r0
434: ebffffe1 bl 3c0 <SysCall>
000003c0 <SysCall>:
3c0: e52db004 push {fp} ; (str fp, [sp, #-4]!)
3c4: e28db000 add fp, sp, #0
3c8: e24dd014 sub sp, sp, #20
3cc: e50b0010 str r0, [fp, #-16]
3d0: ef000000 svc 0x00000000
3d4: e1a02000 mov r2, r0
3d8: e50b2008 str r2, [fp, #-8]
3dc: e51b3008 ldr r3, [fp, #-8]
3e0: e1a00003 mov r0, r3
3e4: e24bd000 sub sp, fp, #0
3e8: e49db004 pop {fp} ; (ldr fp, [sp], #4)
3ec: e12fff1e bx lr
Compiling the same code with -O3 results in the following assembly code:
00000778 <main>:
778: e24dd008 sub sp, sp, #8
77c: ef000000 svc 0x00000000 #Inline SystemCall without passing params into r0
780: e1a02000 mov r2, r0
784: e3a00000 mov r0, #0
788: e58d2004 str r2, [sp, #4]
78c: e59d3004 ldr r3, [sp, #4]
790: e28dd008 add sp, sp, #8
794: e12fff1e bx lr
Notice how the systemCall gets inlined without assigning the value 5 t0 r0.
My first approach is to move those values manually into the registers by adapting the function SysCall from above as follows:
unsigned SysCall(volatile unsigned p1)
{
volatile unsigned ret_val;
__asm __volatile
(
"mov r0, %[p1] \n\t"
"swi 0 \n\t"
"mov %[v], r0 \n\t"
: [v]"=r"(ret_val)
: [p1]"r"(p1)
: "r0"
);
return ret_val;
}
It seems to work in this minimal example but Im not very sure whether this is the best possible practice. Why does the compiler think he can omit the parameters when inlining the function? Has somebody any suggestions whether this approach is okay or what should be done differently?
Thank you in advance
A function call in C source code does not instruct the compiler to call the function according to the ABI. It instructs the compiler to call the function according to the model in the C standard, which means the compiler must pass the arguments to the function in a way of its choosing and execute the function in a way that has the same observable effects as defined in the C standard.
Those observable effects do not include setting any processor registers. When a C compiler inlines a function, it is not required to set any particular processor registers. If it calls a function using an ABI for external calls, then it would have to set registers. Inline calls do not need to obey the ABI.
So merely putting your system request inside a function built of C source code does not guarantee that any registers will be set.
For ARM, what you should do is define register variables assigned to the required register(s) and use those as input and output to the assembly instructions:
unsigned SysCall(unsigned param)
{
register unsigned Parameter __asm__("r0") = param;
register unsigned Result __asm__("r0");
__asm__ volatile
(
"swi 0"
: "=r" (Result)
: "r" (Parameter)
: // "memory" // if any inputs are pointers
);
return Result;
}
(This is a major kludge by GCC; it is ugly, and the documentation is poor. But see also https://stackoverflow.com/tags/inline-assembly/info for some links. GCC for some ISAs has convenient specific-register constraints you can use instead of r, but not for ARM.) The register variables do not need to be volatile; the compiler knows they will be used as input and output for the assembly instructions.
The asm statement itself should be volatile if it has side effects other than producing a return value. (e.g. getpid() doesn't need to be volatile.)
A non-volatile asm statement with outputs can be optimized away if the output is unused, or hoisted out of loops if its used with the same input (like a pure function call). This is almost never what you want for a system call.
You also need a "memory" clobber if any of the inputs are pointers to memory that the kernel will read or modify. See How can I indicate that the memory *pointed* to by an inline ASM argument may be used? for more details (and a way to use a dummy memory input or output to avoid a "memory" clobber.)
A "memory" clobber on mmap/munmap or other system calls that affect what memory means would also be wise; you don't want the compiler to decide to do a store after munmap instead of before.

Different gcc assembly when using designated initializers

I was checking some gcc generated assembly for ARM and noticed that I get strange results if I use designated initializers:
E.g. if I have this code:
struct test
{
int x;
int y;
};
__attribute__((noinline))
struct test get_struct_1(void)
{
struct test x;
x.x = 123456780;
x.y = 123456781;
return x;
}
__attribute__((noinline))
struct test get_struct_2(void)
{
return (struct test){ .x = 123456780, .y = 123456781 };
}
I get the following output with gcc -O2 -std=C11 for ARM (ARM GCC 6.3.0):
get_struct_1:
ldr r1, .L2
ldr r2, .L2+4
stm r0, {r1, r2}
bx lr
.L2:
.word 123456780
.word 123456781
get_struct_2: // <--- what is happening here
mov r3, r0
ldr r2, .L5
ldm r2, {r0, r1}
stm r3, {r0, r1}
mov r0, r3
bx lr
.L5:
.word .LANCHOR0
I can see the constants for the first function, but I don't understand how get_struct_2 works.
If I compile for x86, both functions just load the same single 64-bit value in a single instruction.
get_struct_1:
movabs rax, 530242836987890956
ret
get_struct_2:
movabs rax, 530242836987890956
ret
Am I provoking some undefined behavior, or is this .LANCHOR0 somehow related to these constants?
Looks like gcc shoots itself in the foot with an extra level of indirection after merging the loads of the constants into an ldm.
No idea why, but pretty obviously a missed optimization bug.
x86-64 is easy to optimize for; the entire 8-byte constant can go in one immediate. But ARM often uses PC-relative loads for constants that are too big for one immediate.

embedding ARM assembly in C language

My goal is to implement sorting algorithm using C language.
I have to make a C code that converts into least number of instructions when compiled by gcc -O0(no optimization option) in ARM machine.
So, My idea is to embed quicksort implemented in assembly directly into C code.
I referred to several following documents and tried to implement my goal.
However, I don't know how to put intarray into my assembly function 'QuickSort' as a parameter.
Reference
1.https://en.wikibooks.org/wiki/Algorithm_Implementation/Sorting/Quicksort#ARM_Assembly
2.http://forum.falinux.com/zbxe/index.php?mid=lecture_tip&comment_srl=517498&sort_index=readed_count&order_type=asc&l=fr&page=58&document_srl=567970 (sorry for non-english website)
I'm newbie in assembly.
Please help me..
#include <stdio.h>
#include <stdint.h>
int Quicksort(uint32_t intarray[]);
asm(
".global Quicksort\n\
Quicksort:\n\
qsort:\n\
stmfd sp!,{r4, r6, lr} \n\
mov r6,r2 \n\
qsort_tailcall_entry:\n\
sub r7,r6,r1\n\
cmp r7,#1\n\
ldmlefd sp!,{r4,r6,pc}\n\
ldr r7,[r0,r1,asl#2]\n\
add r2,r1,#1\n\
mov r4,r6\n\
partition_loop:\n\
ldr r3,[r0, r2, asl #2]\n\
cmp r3,r7\n\
addle r2,r2, #1\n\
ble partition_test\n\
sub r4,r4, #1\n\
ldr r5,[r0, r4, asl #2]\n\
str r5,[r0, r2, asl #2]\n\
str r3,[r0, r4, asl #2]\n\
partition_test:\n\
cmp r2,r4\n\
blt partition_loop\n\
partition_finish:\n\
sub r2,r2,#1\n\
ldr r3,[r0,r2,asl #2]\n\
str r3,[r0,r1,asl #2]\n\
str r7,[r0,r2,asl #2]\n\
bl qsort\n\
mov r1,r4\n\
b qsort_tailcall_entry\n\
"
);
int main(void){
uint32_t intarray[10] = {5,2,5,1,7,5,7,2,3,8};
Quicksort(intarray);
return 0;
}
Since you mentioned that you are compiling with gcc, you could use the gcc asm extension (as the name says, it's a gcc extension and might not be compatible with other compilers). Take a look at basic asm and extended asm. Since you will probably be accessing data from your C code, I advise you to stick with the advanced version which lets you specify memory operands.

Is there any gcc compiler primitive for "svc"?

I'm working on writing a program running on Cortex-m3.
At first I wrote an assembly file which executes 'svc'.
svc:
svc 0
bx lr
I decided to use gcc's inline asm, so I wrote it as follows, but the svc function was not inlined.
__attribute__((naked))
int svc(int no, ...)
{
(void)no;
asm("svc 0\n\tbx lr");
}
int f() {
return svc(0,1,2);
}
------------------ generated assembly ------------------
svc:
svc 0
bx lr
f:
mov r0, #0
mov r1, #1
mov r2, #2
b svc
I guess it's not inlined since it is naked, so I dropped the naked attribute and wrote like this.
int svc(int __no, ...)
{
register int no asm("r0") = __no;
register int ret asm("r0");
asm("svc 0" : "=r"(ret) : "r"(no));
return ret;
}
------------------ generated assembly ------------------
svc:
stmfd sp!, {r0, r1, r2, r3}
ldr r0, [sp]
add sp, sp, #16
svc 0
bx lr
f:
mov r0, #0 // missing instructions setting r1 and r2
svc 0
bx lr
Although I don't know why gcc adds some unnecessary stack operations, svc is good. The problem is that svc is not inlined properly, the variadic parameters were dropped.
Is there any svc primitive in gcc? If gcc does not have one, how do I write the right one?
Have a look at the syntax that is used in core_cmFunc.h which is supplied as part of the ARM CMSIS for the Cortex-M family. Here's an example that writes a value to the Priority Mask Register:
__attribute__ ((always_inline)) static inline void __set_PRIMASK(uint32_t priMask)
{
__ASM volatile ("MSR primask, %0"::"r" (priMask));
}
However, creating a variadic function like this sounds difficult.
You can use a macro like this.
#define __svc(sNum) __asm volatile("SVC %0" ::"M" (sNum))
And use it just like any compiler-primitive function, __svc(2);.
Since it is just a macro, it will only generate the provided instruction.

Inline Assembly: Passing pointers to a function and using it in that function in assembly

I'm using ARM/Cortex-A8 processor platform.
I have a simple function where I have to pass two pointers to a function. These pointers are later used in that function which has only my inline assembly code This plan is only to achieve performance.
function(unsigned char *input, unsigned char *output)
{
// What are the assembly instructions to use these two pointers here?
// I will inline the assembly instructions here
}
main()
{
unsigned char input[1000], output[1000];
function(input, output);
}
Thanks
Assuming you're using a normal ARM ABI, those two parameters will be passed in R0 and R1. Here is a quick example showing how to copy the bytes from the input buffer to the output buffer (gcc syntax):
.text
.globl _function
_function:
mov r2, #0 // initialize loop counter
loop:
ldrb r3, [r0, r2] // load r3 with input[r2]
strb r3, [r1, r2] // store r3 to output[r2]
add r2, r2, #1 // increment loop counter
cmp r2, #1000 // test loop counter
bne loop
mov pc, lr

Resources