ARM Assembly - Why does my app crash when zeroing r7? - c

I'm currently having a weird issue when trying to run a C program that calls a very simple ARM assembly function. Here's my C code:
#include <stdio.h>
#include <stdlib.h>
extern void getNumber(int* pointer);
int main()
{
int* pointer = malloc(sizeof(int));
getNumber(pointer);
printf("%d\n", *pointer);
return 0;
}
And here's my assembly code:
.section .text
.align 4
.arm
.global getNumber
.type getNumber STT_FUNC
getNumber:
mov r1, #0
str r1, [r0]
bx lr
So far so good. However, if I add a line with mov r7, #0 at the top of getNumber, the program segfaults when trying to access pointer. After inspecting it with gdb I noticed now the pointer itself is stored at a very low address, such as 0xa.
Now, I did a bit of research and apparently r7 is the frame pointer for THUMB code (according to this). However, I'm clearly stating I don't want to use THUMB instructions in the .arm line in my assembly code. Why on earth is it failing?
I'm compiling both the .c and .s files using arm-linux-gnueabihf-gcc, and I'm running the program on a Cortex-A8 based board running Arch Linux.
Edit: The program runs fine if I compile using the -fomit-frame-pointer flag. However, I still want to know why it is using r7 as the frame pointer.
Edit 2: It's still failing even if I use .code 32 instead of .arm.

The ARM Procedure Call Standard specifies the following:
A subroutine must preserve the contents of the registers r4-r8, r10, r11 and SP (and r9 in PCS variants that designate r9 as v6).
So your assembly language subroutine must save & restore r7 if it uses it.
You might avoid the problem in your small test program by not compiling for Thumb mode, but that only avoids it by accident. Anything that links to your assembly routine is entitled to expect that r7 will be preserved.

You're crashing the program because you are corrupting the frame pointer, like you mentioned. There is really no rhyme or reason to the convention; ARM just reserves certain registers for certain things, much like esp is the stack pointer on x86.
Here's a pretty good reference for registers to avoid:
http://msdn.microsoft.com/en-us/library/ms253599(v=vs.80).aspx

I finally got it: doing $ arm-linux-gnueabihf-gcc -v showed me the default options my compiler is using. Among those is: --with-mode=thumb.
Compiling with -marm fixed it. Now it's working as intended!
Edit: Upon reading the comments here I realize I was mistaken. I should've saved/restored r7 so it wouldn't screw up the rest of my program. Good thing I learned this now with a toy project and not while working on something real!

Related

ARM GCC + Cortex M4: Calling address as function generates BLX instead of BL

I built a little OS for a Cortex-M4 CPU which is able to receive compiled binaries over UART and schedule them dynamically. I want to use that feature to craft a test suite which uploads test programs that can directly call OS functions, such as memory allocation, without doing an SVC. Therefore I need to cast the fixed addresses of those OS routines to function pointers. Now, casting the memory addresses results in wrong / non-Thumb instruction code: BL is needed instead of BLX, resulting in HardFaults.
void (*functionPtr_addr)(void);
functionPtr_addr = (void (*)()) (0x0800084C);
This is the assembly when calling this function
8000838: 4b03 ldr r3, [pc, #12] ; (8000848 <idle+0x14>)
800083a: 681b ldr r3, [r3, #0]
800083c: 4798 blx r3
Is there a way to force the BL instruction for such a case? It works with inline assembly, I could write macros but it would be much cleaner do it this way.
The code gets compiled and linked, among other things, with
-mcpu=cortex-m4 -mthumb.
Toolchain:
gcc version 12.2.0 (Arm GNU Toolchain 12.2.MPACBTI-Bet1 (Build arm-12-mpacbti.16))
The bl instruction is limited in range. The compiler does not know where your code will be placed, so it cannot know whether bl can be used.
Regarding "resulting in HardFaults":
The address passed to blx has to be odd on Cortex-M4 MCUs to execute the code in Thumb mode. Your address is even, so the MCU tries to execute ARM code, which this core does not support.
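As a minimal sketch of the point about odd addresses: the function pointer can be built from the routine address with bit 0 set, so that blx stays in Thumb state. The names OS_ROUTINE_ADDR, os_fn_t, thumb_entry and os_routine_ptr are hypothetical and used only for illustration; the address is the one from the question.

```c
#include <stdint.h>

/* Address of the OS routine as used in the question. */
#define OS_ROUTINE_ADDR 0x0800084CuL

/* Hypothetical function-pointer type for an OS routine taking no arguments. */
typedef void (*os_fn_t)(void);

/* Return the routine address with bit 0 (the Thumb bit) set. */
uintptr_t thumb_entry(uintptr_t addr)
{
    return addr | (uintptr_t)1u;
}

/* Build the function pointer; on the target, calling it would emit blx
 * with an odd address. The pointer is only constructed here, not called. */
os_fn_t os_routine_ptr(void)
{
    return (os_fn_t)thumb_entry((uintptr_t)OS_ROUTINE_ADDR);
}
```

With this, the assignment in the question would become functionPtr_addr = os_routine_ptr(); setting the bit on an address that is already odd changes nothing.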

GCC startup code _start does not end in main()

I could only find bits and pieces of information on the symbol _start, which is called from the target startup code in order to establish the C runtime environment. This would be necessary to ensure that all initialized global/static variables are properly loaded prior to branching to main().
In my case, I am using an MCU with an ARM Cortex-R4F core CPU. When the device resets, I implement all of the steps recommended by the MCU manufacturer then attempt to branch to the symbol _start using the following lines of code:
extern void _start(void);
_start();
I am using something similar to the following to link the program:
"armeb-eabi-gcc-7.5.0" -marm -fno-exceptions -Og -ffunction-sections -fdata-sections -g -gdwarf-3 -gstrict-dwarf -Wall -mbig-endian -mcpu=cortex-r4 -Wl,-Map,"app_tms570_dev.map" --entry main -static -Wl,--gc-sections -Wl,--build-id=none -specs="nosys.specs" -o[OUTPUT FILE NAME HERE] [ALL OBJECT FILES HERE] -Wl,-T[LINKER COMMAND FILE NAME HERE]
My toolchain in this case is gcc-linaro-7.5.0-2019.12-i686-mingw32_armeb-eabi, which is being used since my MCU device is big-endian.
As I trace through the call to symbol _start, I can see my program branch to symbol _start then a few unexpected things happen.
First, there are a couple of places where the following instruction is called:
EF123456 svc #0x123456
This basically generates a software interrupt, which causes the program to branch to the software interrupt handler that I have configured for the device.
Secondly, the device eventually branches to __libc_init_array then _init. However, symbol _init does not contain any branch instruction and allows the program to flow into _fini, which also does not contain any branch instruction and allows the program to flow into whatever code was placed next in memory. This eventually causes some type of abort exception, as would be expected.
The disassembly associated with _init and _fini:
_init():
00003b00: E1A0C00D mov r12, r13
00003b04: E92DDFF8 push {r3, r4, r5, r6, r7, r8, r9, r10, r11, r12, r14, pc}
00003b08: E24CB004 sub r11, r12, #4
_fini():
00003b0c: E1A0C00D mov r12, r13
00003b10: E92DDFF8 push {r3, r4, r5, r6, r7, r8, r9, r10, r11, r12, r14, pc}
00003b14: E24CB004 sub r11, r12, #4
Based on some other documentation I read, I also attempted to call main() directly, but this just caused the program to jump to main() without initializing anything. I also tried to call symbol __main() similar to what is done when using the ARM Compiler in order to execute startup code, but this symbol is not found.
Note that this is for a bare-metal-ish system that does not use semihosting.
My question is: Is there a way to set up the system and call a function that will establish the C runtime environment automatically and branch to main() using the GCC linker?
For the time being, I have implemented my own function to initialize .data sections and the .bss sections are already being zeroed at reset using a built in feature of the MCU device.
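For readers unfamiliar with that manual step, here is a rough sketch of what such an initialization routine typically looks like. It is not the asker's code: the function names copy_data and zero_bss are hypothetical, and on a real target the pointers would come from symbols defined in the linker command file (whatever names that file uses) rather than from ordinary arrays.

```c
#include <stdint.h>

/* Copy the initial values of .data from their load address (flash)
 * to their run address (RAM), one 32-bit word at a time. */
void copy_data(uint32_t *dst, const uint32_t *src, const uint32_t *dst_end)
{
    while (dst < dst_end) {
        *dst++ = *src++;
    }
}

/* Clear .bss so that uninitialized globals/statics start at zero. */
void zero_bss(uint32_t *start, const uint32_t *end)
{
    while (start < end) {
        *start++ = 0u;
    }
}
```

Both functions assume word-aligned, word-sized sections, which is an assumption about the linker script and not something stated in the question.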
Adding some more details here:
The specific MCU that I am using should not be relevant, particularly taking the following discussion into consideration.
First, I have already set up the exception vectors for the device in an assembler file:
.section .excvecs,"ax",%progbits
.type Exc_Vects, %object
.size Exc_Vects, .-Exc_Vects
// See DDI0363G, Table 3-6
Exc_Vects:
b c_int00 // Reset vector
b exc_undef // Undefined instruction
b exc_software // Software
b exc_prefetch // Pre-fetch abort
b exc_data // Data abort
b exc_invalid // Invalid vector
There are two instructions that follow for the IRQ and FIQ interrupts as well, but they are set according to the MCU datasheet. I have defined handlers for the undefined instruction, prefetch abort, data abort and invalid vector exceptions. For the software exception I use some assembly to jump to an address that can be changed at runtime. My startup sequence begins at c_int00. These have all been tested and work with no problems.
My reset handler takes care of all of the steps needed for initializing the MCU in accordance with the MCU datasheet. This includes initializing CPU registers and the stack pointers, which are loaded using symbols from the linker file.
The toolchain that I am using, noted above, includes the C standard libraries and other libraries needed to compile and link my program with no problems. This includes the symbol _start that I mentioned previously.
From what I understand, the function _start typically wraps main(). Before it calls main() it initializes .bss and .data sections, configures the heap, as well as performing some other tasks to set up the environment. When main() returns, it performs some clean up tasks and branches to a designated exit() function. (Side note: _start is defined in newlib based on the source code that I downloaded from linaro).
There is some detail regarding this in a separate response here:
What is the use of _start() in C?
I have been using the ARM Compiler as an alternative for the same project. There, __main performs these functions. For the stack initialization, I basically provide it an empty hook function and for exit I provide it with a function that safely terminates the program should main() return for some reason. I am not sure if something like this is needed for GCC.
I would note that I have included option -specs="nosys.specs" without option -nostartfiles. My understanding is that this avoids implementing some of the functions that I do not want to use in my application, such as I/O operations, but links the startup code.
I am not using the heap in my project as dynamic memory use is frowned upon, but I was hoping to be able to use the startup code primarily in order to avoid having to remember to initialize .data sections manually. Above I noted that my application is baremetal-ish. I am actually using an RTOS and have the memory partitioned into blocks so that I can use the device MPU.

What is proper syntax to pass variables from C code to Assembly and back?

Struggling electrical engineering student trying to link C and Assembly (ARM32 Cortex-M) for an Embedded Systems final project. I don't fully understand the proper syntax for this project.
I was instructed to combine 2 previous labs - along with additional code - to build a simple calculator (+,-,*,/) with C and Assembly language in the MBED environment. I've set the C file to scan a keypad, take 3 user inputs to 3 strings, then pass these strings to an Assembly file. The Assembly file is to perform the arithmetic function and save the result in an EXPORT PROC. My C file then takes the result and printf to the user (which we read with PuTTY).
Here is my assembly header and import links:
AREA calculator, CODE, READONLY ; assembly header
compute_asm
IMPORT OPERAND_1 ; imports from C file
IMPORT OPERAND_2 ; imports from C file
IMPORT USER_OPERATION ; imports from C file
ALIGN ; aligns memory
initial_values PROC
LDR R1, =OPERAND_1; loads R1 with OPERAND_1
LDR R2, =OPERAND_2; loads R2 with OPERAND_2
Here are a few lines from my C file linking to Assembly:
int OPERAND_1; //declares OPERAND_1 for Assembly use
int OPERAND_2; //declares OPERAND_2 for Assembly use
int USER_OPERATION; //declares USER_OPERATION for Assembly use
extern int add_number(); //links add_number function in Assembly
extern int subtract_number(); //links subtract_number function in Assembly
I expected to be able to compile and use this code (the previous labs went much smoother than this project). But after working through some other syntax issues, I'm getting the following when I compile: Error: "/tmp/fOofpw", line 39: Warning: #47-D: incompatible redefinition of macro "MBED_RAM_SIZE".
Coding is my weak spot. Any help or pointers would be appreciated!
In general, the calling convention used by a specific version of a compiler for a specific target is specific to that compiler and version. Technically it is subject to change at any time (we have seen that even with GNU and ARM), and there is no reason to expect any other compiler to conform to the same convention. That said, compilers like gcc and clang conform to some version of the ARM-recommended ABI; that ABI has changed over time, and gcc has changed along with it.
As Peter pointed out:
LDR R1, =OPERAND_1; loads R1 with OPERAND_1
(You are clearly not using the GNU assembler, so not the GNU toolchain, correct? Probably Keil or ARM?)
This puts the address of that label into r1; to get the contents you need another load:
ldr r1,[r1]
and now the contents are there.
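A rough C analogy for those two loads may help; this is only a sketch, the value 42 is arbitrary, and the helper names load_address and load_contents are hypothetical.

```c
/* Global shared with the assembly file, as in the question.
 * The initial value 42 is arbitrary and only for this sketch. */
int OPERAND_1 = 42;

/* Analogous to "LDR R1, =OPERAND_1": the register receives the
 * address of the variable, not its value. */
int *load_address(void)
{
    return &OPERAND_1;
}

/* Analogous to the follow-up "LDR R1, [R1]": read the word stored
 * at that address. */
int load_contents(const int *addr)
{
    return *addr;
}
```

In other words, =OPERAND_1 behaves like &OPERAND_1 in C, and the second load is the dereference.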
Using global variables gets you around the calling convention problem.
Using a simple example and disassembling you can discover the calling convention for your compiler:
extern unsigned int add ( unsigned int, unsigned int);
unsigned int fun ( void )
{
return(add(3,4)+2);
}
00000000 <fun>:
0: b510 push {r4, lr}
2: 2104 movs r1, #4
4: 2003 movs r0, #3
6: f7ff fffe bl 0 <add>
a: 3002 adds r0, #2
c: bd10 pop {r4, pc}
e: 46c0 nop ; (mov r8, r8)
The first parameter is in r0, the second in r1, and the return value in r0. That could technically change in any future version of GNU, but I can tell you that from gcc 2.x.x to the present 9.1.0 this is how it has been for ARM, and from gcc 3.x.x to the present for Thumb, which is what you are using.
How you have done it is fine; you just need to recognize what the =LABEL shortcut really does.

Illegal instruction when running simple ELLCC-generated ELF binary on a Raspberry Pi

I have an empty program in LLVM IR:
define i32 @main(i32 %argc, i8** %argv) nounwind {
entry:
ret i32 0
}
I'm cross-compiling it on Intel x86-64 Windows for ARM Linux using ELLCC, with the following command:
ecc++ hw.ll -o hw.o -target arm-linux-engeabihf
It completes without errors and generates an ELF binary.
When I take the binary to a Raspberry Pi Model B+ (running Raspbian), I get only the following error:
Illegal instruction
I don't know how to tell what's wrong from the disassembled code. I tried other ARM Linux targets but the behavior was the same. What's wrong?
The exact same file builds, links and runs fine for other targets like i386-linux-eng, x86_64-w64-mingw32, etc (that I could test on), again using the ELLCC toolchain.
Assuming the library and startup code isn't at fault, this is what the disassembly of main itself looks like:
.text:00010188 e24dd008 sub sp, sp, #8
.text:0001018c e3002000 movw r2, #0
.text:00010190 e58d0004 str r0, [sp, #4]
.text:00010194 e1a00002 mov r0, r2
.text:00010198 e58d1000 str r1, [sp]
.text:0001019c e28dd008 add sp, sp, #8
.text:000101a0 e12fff1e bx lr
I'd guess it's choking on the movw at 0x0001018c. The movw/movt encodings which can handle full 16-bit immediate values first appeared in the ARMv6T2 version of the architecture - the ARM1176 in the original Pi models predates that, only supporting original ARMv6*.
You need to tell the compiler to generate code appropriate to the thing you're running on - I don't know ELLCC, but I'd guess from this it's fairly modern and up-to-date and thus defaulting to something newer like ARMv6T2 or ARMv7. Otherwise, it's akin to generating code for a Pentium and hoping it works on an 80486 - you might be lucky, you might not. That said, there's no good reason it should have chosen that encoding in the first place - it's not as if 0 can't be encoded in a 'classic' mov instruction...
The decadent option, however, would be to consider this a perfect excuse to replace the Pi with a Pi 2 - the Cortex-A7s in that are nice capable ARMv7 cores ;)
* Lies for clarity. I think 1176 might actually be v6K, but that's irrelevant here. I'm not sure if anything actually exists as plain ARMv6, and all the various architecture extensions are frankly a hideous mess
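To make the guess about movw checkable from the hex words in the disassembly, here is a small sketch in C that tests whether a 32-bit ARM-state instruction word is a MOVW and extracts its fields. The helper names are hypothetical, and the check ignores the condition field (it assumes a normal conditional encoding such as 0xE, "always").

```c
#include <stdbool.h>
#include <stdint.h>

/* ARM-state MOVW layout: cond 0011 0000 imm4 Rd imm12.
 * Bits 27..20 are therefore 0x30 for MOVW. */
bool is_arm_movw(uint32_t insn)
{
    return ((insn >> 20) & 0xFFu) == 0x30u;
}

/* Destination register, bits 15..12. */
unsigned movw_rd(uint32_t insn)
{
    return (unsigned)((insn >> 12) & 0xFu);
}

/* 16-bit immediate: imm4 (bits 19..16) followed by imm12 (bits 11..0). */
unsigned movw_imm16(uint32_t insn)
{
    return (unsigned)(((insn >> 4) & 0xF000u) | (insn & 0xFFFu));
}
```

Applied to the listing, 0xe3002000 decodes as a MOVW into r2 with immediate 0, while 0xe1a00002 (mov r0, r2) does not match.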

How to use C defines in ARM assembler

How can I use external defines such as LONG_MIN and LONG_MAX in ARM assembler code?
Let's say my_arm.h looks like this:
int my_arm(int foo);
Let's say I have a my_main.c as follows:
...
#include <limits.h>
#include "my_arm.h"
...
int main (int argc, char *argv[])
{
int foo=0;
...
printf("My arm assembler function returns (%d)\n", my_arm(foo));
...
}
And my_arm.s looks like this:
.text
.align 2
.global my_arm
.type my_arm, %function
my_arm:
...
ADDS r1, r1, r2
BVS overflow
...
overflow:
LDR r0, LONG_MAX # this is probably wrong, how to do it correctly?
BX lr # return with max value
The second to last line, I am not sure how to load correctly, I vaguely remember reading somewhere, that I had to define LONG_MAX in .global, but can't find the link to a working example anymore.
I am compiling with arm-linux-gnueabi-gcc version 4.3.2
==================
UPDATE: Appreciate the suggestions! Unfortunately, I am still having trouble with syntax.
First, I made a little header file mylimits.h (for now in same dir as .S)
#define MY_LONG_MIN 0x80000000
In my_arm.S I added the following:
...
.include "mylimits.h"
...
ldr r7, =MY_LONG_MIN # when it was working it was ldr r7, =0x80000000
...
Two problems with this approach.
First the biggest problem: the symbol MY_LONG_MIN is not recognized...so something is still not right
Second: syntax for .include does not let me include <limits.h>, I would have to add that in mylimits.h, seems a bit kludgy, but I suppose, that is ok :)
Any pointers?
I have access to ARM System Developer’s Guide: Designing and Optimizing System Software [2004] and ARM Architecture Reference Manual [2000]; my target is XScale-IXP42x Family rev 2 (v5l), though.
Often the lowercase file extension .s implies that the assembler source should not be passed through the C preprocessor, whereas the uppercase extension .S implies that it should.
It's up to your compiler to follow this convention though (gcc ports normally do), so check its documentation.
(EDIT: note that this means you can use #include directives, but remember that most of the files you would include would not normally be valid assembler unless they consist entirely of #define directives, so you may have to write your own header that is.)
edit 5 years later:
Note that the armcc v5 compiler follows this behaviour under linux... but not on windows.
If you are using gcc and its assembler, it is straightforward: name the file with a final .S, then add #include <limits.h> at the beginning and use the constant wherever you need it, e.g. ldr r0, =SOMETHING. I did my tests on x86 since that is what I have, but the same should work since it is a gcc feature.
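A sketch of a header that can be shared between C and a preprocessed .S file follows. It is an illustration under assumptions, not a tested ARM build: the macro names MY_LONG_MIN and MY_LONG_MAX are made up for this example, the values assume a 32-bit long as on the asker's target, and plain literals are used because the assembler may not accept C integer suffixes such as L. GCC defines __ASSEMBLER__ when preprocessing assembly, which lets the header hide C-only declarations.

```c
#include <stdint.h>

/* Contents of a hypothetical shared header, e.g. mylimits.h.
 * Only #define lines are visible to both C and assembly. */
#define MY_LONG_MIN 0x80000000
#define MY_LONG_MAX 0x7fffffff

#ifndef __ASSEMBLER__
/* C-only part: prototypes are not valid assembler, so they are
 * hidden when the file is included from a .S source. */
int my_arm(int foo);
#endif

/* Small helpers so the values can be checked from C. */
uint32_t my_long_min_bits(void) { return (uint32_t)MY_LONG_MIN; }
int32_t my_long_max_value(void) { return (int32_t)MY_LONG_MAX; }
```

In the .S file the header would be pulled in with #include "mylimits.h" (not .include), and the constant loaded with ldr r7, =MY_LONG_MIN.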
What I ended up doing is this:
in my_main.c
#include <limits.h>
...
int my_LONG_MAX=LONG_MAX;
then in my_arm.S
ldr r8, =my_LONG_MAX
ldr r10, [r8]
It looks convoluted, and it is (plus the portability gains of this approach are questionable).
There must be a way to access LONG_MAX directly in assembly. Such a way I would gladly accept as the full answer.
I have seen that simply feeding the assembler source to gcc, rather than to gas directly, will allow you to do C-like things in assembler. It is actually a bit scary when you come across situations where you must use gcc as a front end to gas to get something to work, but that is another story.
Use the --cpreproc option for armasm and add
#include "my_arm.h"
to my_arm.s.
This works for Keil ARM.
