I have to write a program in ARM assembly that finds the smallest element in an array. Normally this is a pretty easy thing to do in any programming language, but I just can't get my head around what I'm doing wrong in ARM assembly. I'm a beginner in ARM, but I know my way around C. So I first wrote the algorithm for finding the smallest number in an array in C, like this:
int minarray = arr[0];
for (int i = 0; i < len; i++) {
    if (arr[i] < minarray) {
        minarray = arr[i];
    }
}
It's easy and nothing special really.
Then I tried to carry the algorithm over to ARM almost unchanged. Two things are already set up at the start: the address of the first element is stored in register r0, and the length of the array is stored in register r1. At the end, the smallest element must be stored back in register r0. Here is what I did:
This is almost the same algorithm as the one in C. First I load the first element into a new register, r4, so the first element is now the smallest. Then I load the first element again, into r8. I compare the two; if r8 <= r4, I copy the content of r8 to r4. After that (because I'm working with 32-bit numbers) I add 4 bytes to r0 to move on to the next element of the array. Then I subtract 1 from the array length, so the loop runs through the array until the length is below 0 and the program stops.
The feedback from the testing function we were given to check our program says that it works partly: it works for short arrays and arrays of length 0, but not for long arrays. I'm honestly lost. I think I'm making a really dumb mistake, but I just cannot find it. I've been stuck on this easy problem for 3 days now, and everything I have tried either did not work or, as I said, only worked "partly". I would really appreciate it if someone could help me out.
This is the feedback that i get:
✗ min works with other numbers
✗ min works with a long array
✓ min works with a short array
✓ min tolerates size = 0
(✗ is for "it does not work", ✓ is for "it works")
So you see what I'm saying? I just do not understand how to make it work with a longer array.
I'm not very good at ARM assembly, but to my understanding R4 is expected to keep the minimum value, and R8 is used to keep the most recently fetched value from the input array.
The minimum is updated with this instruction:
MOVLE r8, r4
But it actually updates R8, not R4.
Try:
MOVLE r4, r8
EDIT
Another issue is the use of an incorrect branch instruction:
SUBS r1, r1, #1
BPL loop1
works like:
r1 = r1 - 1
if (r1 >= 0) goto loop1;
For R1 equal to 1 the loop is executed twice.
r1 = 1
... do stuff
r1 = r1 - 1 // r1 is 0 now
if (r1 >= 0) goto loop1; // 0>=0 TRUE!
... do stuff, overflow the input by indexing at `[r0 + 4]`
r1 = r1 - 1 // r1 is -1
if (r1 >= 0) goto loop1; // -1 >= 0 FALSE
// exit function
To fix it, branch back only while the counter is non-zero:
BNE loop1
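To see the off-by-one concretely, here is a minimal C sketch that models only the loop counter. The function names are mine, and the do/while stands in for a loop whose branch sits at the bottom, as in the snippets above. Note that the BNE variant still needs a separate guard for length 0 (the counter would wrap around instead of stopping), so the sketch assumes it is only called with a length of at least 1.

```c
/* Passes through a bottom-tested loop that ends in
   SUBS r1, r1, #1 / BPL loop1. */
int iterations_bpl(int r1)
{
    int count = 0;
    do {
        count++;          /* the loop body runs once per pass */
        r1 = r1 - 1;      /* SUBS r1, r1, #1 */
    } while (r1 >= 0);    /* BPL: branch while the result is not negative */
    return count;
}

/* Same loop ending in SUBS r1, r1, #1 / BNE loop1.
   Assumption: r1 >= 1 on entry. */
int iterations_bne(int r1)
{
    int count = 0;
    do {
        count++;
        r1 = r1 - 1;      /* SUBS r1, r1, #1 */
    } while (r1 != 0);    /* BNE: branch while the result is not zero */
    return count;
}
```

For a length of 3, the BPL form makes 4 passes (one read past the end of the array), while the BNE form makes exactly 3.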
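When coding this in C, use the correct types (size_t for the length and the index).
You also do not have to start iterating at index 0; index 1 is enough, because arr[0] is already the initial minimum.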
int foo(const int *arr, size_t len)
{
int minarray = arr[0];
for (size_t i = 1; i < len; i++)
{
if (arr[i] < minarray)
{
minarray = arr[i];
}
}
return minarray;
}
And it generates this code:
foo:
mov r3, r0
subs r1, r1, #1
ldr r0, [r3], #4
beq .L1
.L3:
ldr r2, [r3], #4
cmp r0, r2
it ge
movge r0, r2
subs r1, r1, #1
bne .L3
.L1:
bx lr
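One thing the C function above does not cover is the "size = 0" case from the test feedback: it reads arr[0] before looking at len. A minimal sketch of a guarded variant follows. The name min_array_checked and the fallback parameter are hypothetical; what the test harness actually expects for an empty array is not stated in the question.

```c
#include <stddef.h>

/* Hypothetical variant: returns `fallback` instead of touching arr[0]
   when the array is empty. */
int min_array_checked(const int *arr, size_t len, int fallback)
{
    if (len == 0)
        return fallback;

    int min = arr[0];
    for (size_t i = 1; i < len; i++) {
        if (arr[i] < min)
            min = arr[i];
    }
    return min;
}
```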
Related
I have some C code in mind that I want to implement in ARM assembly language.
The C code I have in my mind is something of this sort:
int a;
scanf("%d",&a);
if(a == 0 || a == 1){
a = 1;
}
else{
a = 2;
}
What I have tried:
//arm equivalent of taking input to reg r0
//check for first condition
cmp r0,#1
moveq r0,#1
//if false
movne r0,#2
//check for second condition
cmp r0,#0
moveq r0,#1
Is this the correct way of implementing it?
Your code is broken for a=0 - single step through it in your head, or in a debugger, to see what happens.
Given this specific condition, it's equivalent to (unsigned)a <= 1U (because negative integers convert to huge unsigned values). You can do a single cmp and movls / movhi. Compilers already spot this optimization; here's how to ask a compiler to make asm for you so you can learn the tricks clever humans programmed into them:
int foo(int a) {
if(a == 0 || a == 1){
a = 1;
}
else{
a = 2;
}
return a;
}
With ARM GCC10 -O3 -marm on the Godbolt compiler explorer:
foo:
cmp r0, #1
movls r0, #1
movhi r0, #2
bx lr
See How to remove "noise" from GCC/clang assembly output? for more about making functions that will have useful asm output. In this case, r0 is the first arg-passing register in the calling convention, and also the return-value register.
I also included another C version using if (a <= 1U) to show that it compiles to the same asm. (1U is an unsigned constant, so C integer promotion rules implicitly convert a to unsigned so the types match for the <= operator. You don't need to explicitly do (unsigned)a <= 1U.)
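If you want to convince yourself that the two conditions agree, a small C sketch like this one can be compiled and checked. The function names are mine, and the explicit cast is only there to keep sign-compare warnings quiet.

```c
/* The condition as written in the question. */
int is_zero_or_one(int a)
{
    return a == 0 || a == 1;
}

/* Single unsigned compare: a negative a converts to a huge unsigned
   value, so it fails the <= 1U test just like any a above 1. */
int is_zero_or_one_unsigned(int a)
{
    return (unsigned)a <= 1U;
}
```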
General case: not a single range
For a case like a==0 || a==3 that isn't a single range-check, you can predicate a 2nd cmp. (Godbolt)
foo:
cmp r0, #3 @ sets Z if a was 3
cmpne r0, #0 @ leaves Z unmodified if it was already set, else sets it according to a == 0
moveq r0, #1
movne r0, #2
bx lr
You can similarly chain && like a==3 && b==4, or for checks like a >= 3 && a <= 7 you can sub / cmp, using the same unsigned-compare trick as the 0 or 1 range check after sub maps a values into the 0..n range. See the Godbolt link for that.
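The sub / cmp idea for a >= 3 && a <= 7 can be sketched in C as follows (function names are mine): subtracting the lower bound maps the range onto 0..4, and everything outside it wraps to a large unsigned value.

```c
/* Straightforward two-sided range check. */
int in_range_plain(int a)
{
    return a >= 3 && a <= 7;
}

/* Same check as one unsigned compare: subtract the lower bound (3),
   then compare against the width of the range (7 - 3 = 4). */
int in_range_unsigned(int a)
{
    return (unsigned)a - 3U <= 4U;
}
```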
No that does not work.
cmp r0,#1 @ is it a one?
moveq r0,#1 @ yes, make it a one again?
movne r0,#2 @ otherwise make it a 2; what if it was a zero to start? now it is a 2
cmp r0,#0 @ at this point it is either a 1 or a 2, you forced it, so it cannot be zero; what it started off as is now lost
moveq r0,#1
You have the right concept but need to order things better.
Following that line of thinking, though, maybe use another register:
x = 2;
if(a==0) x = 1;
if(a==1) x = 1;
a = x;
Ponder this
if(a==0) a = 1;
if(a!=1) a = 2;
Or, as everyone else is going to say, ask the compiler.
Because of the OR (test OR test), the tests generally need to be done separately. The first test being false does not mean you take the else branch; you have to do the other test before declaring the whole condition false. But if the first test is true, you need to hop over everything and not fall into the second test, because that might (in this case will) be false...
As Peter points out you can use unsigned less than or equal and greater than conditions (even though in C it is a signed int, bits is bits).
LS Unsigned lower or same
HI Unsigned higher
Depending on the ARM instruction set, it can be:
cmp r0, #1
movls r0, #1
movhi r0, #2
bx lr
or
cmp r0, #1
ite ls
movls r0, #1
movhi r0, #2
bx lr
Am I smarter than you? NO I simply use the compiler to compile the C code.
https://godbolt.org/z/dqxv64Eb9
I was trying to learn how to multiply in LC-3, but I'm having trouble modifying my old program, which was only meant for adding two numbers. How would I go about modifying this program to multiply the two given inputs?
Code:
.ORIG x3000 ; begin at x3000
; input two numbers
IN ;input an integer character (ascii) {TRAP 23}
LD R3, HEXN30 ;subtract x30 to get integer
ADD R0, R0, R3
ADD R1, R0, x0 ;move the first integer to register 1
IN ;input another integer {TRAP 23}
ADD R0, R0, R3 ;convert it to an integer
; add the numbers
ADD R2, R0, R1 ;add the two integers
; print the results
LEA R0, MESG ;load the address of the message string
PUTS ;"PUTS" outputs a string {TRAP 22}
ADD R0, R2, x0 ;move the sum to R0, to be output
LD R3, HEX30 ;add 30 to integer to get integer character
ADD R0, R0, R3
OUT ;display the sum {TRAP 21}
; stop
HALT ;{TRAP 25}
; data
MESG .STRINGZ "The sum of those two numbers is: "
HEXN30 .FILL xFFD0 ; -30 HEX
HEX30 .FILL x0030 ; 30 HEX
.END
The simplest approach to multiply on LC-3 is repetitive addition. So keep summing the multiplicand and decrement the multiplier; the iteration stops when the multiplier is consumed (i.e. zero).
There are lots of caveats: if the multiplier is negative, then we would either negate it to use with the count down, or count up instead; either way, the final result would be negated.
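As a rough model of that loop, here is a minimal C sketch of multiplication by repeated addition, including the negate-and-fix-up handling for a negative multiplier. The function name is mine, and it assumes small values that do not overflow.

```c
/* Multiply by repeated addition, the way a simple LC-3 loop would:
   add the multiplicand once per count of the multiplier. */
int mul_repeated_add(int multiplicand, int multiplier)
{
    int negate = 0;
    if (multiplier < 0) {          /* count down needs a positive counter */
        multiplier = -multiplier;
        negate = 1;                /* remember to negate the final result */
    }

    int product = 0;
    while (multiplier != 0) {      /* stop when the multiplier is consumed */
        product += multiplicand;
        multiplier--;
    }
    return negate ? -product : product;
}
```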
Since multiplication is commutative, we might consider using the lesser (absolute) value for the multiplier so that fewer iterations are done. But for more optimal multiplication, we would switch to a whole 'nother algorithm, shift and add. Note that this algorithm is usually presented for hardware implementation, in which saving precious register bits is important, whereas for software this is not really a significant concern.
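For comparison, a minimal C sketch of shift and add on 16-bit unsigned values (the LC-3 word size) might look like this. The function name is mine. On LC-3 itself the doubling step would be an ADD of a register to itself; how to walk the multiplier bits there is left out of this sketch.

```c
#include <stdint.h>

/* Shift-and-add multiplication: for every set bit in b, add the
   correspondingly shifted copy of a. The result wraps modulo 2^16. */
uint16_t mul_shift_add(uint16_t a, uint16_t b)
{
    uint16_t product = 0;
    while (b != 0) {
        if (b & 1)
            product = (uint16_t)(product + a);
        a = (uint16_t)(a << 1);    /* next bit of b is worth twice as much */
        b >>= 1;
    }
    return product;
}
```

The loop runs at most 16 times regardless of the operand values, whereas repeated addition runs once per unit of the multiplier.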
Why are there two separate instructions instead of one? In practice, in what kind of situations do we need to use the CMP and TEQ instructions?
I know how both instructions work.
Short: they serve different purposes. cmp is subs without a destination, while teq is eors without a destination.
cmp is very straightforward: you compare two numbers A and B
signed:
gt: A > B
ge: A >= B
eq: A == B
le: A <= B
lt: A < B
unsigned:
hi: A > B
hs: A >= B
eq: A == B
ls: A <= B
lo: A < B
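The reason there are two sets of condition codes is that the same bit patterns order differently depending on whether you read them as signed or unsigned. A small C sketch (function names are mine) shows the difference:

```c
#include <stdint.h>

/* Signed view of the operands: what the LT condition answers after cmp. */
int signed_less(int32_t a, int32_t b)
{
    return a < b;
}

/* Unsigned view of the same bit patterns: what the LO condition answers. */
int unsigned_lower(int32_t a, int32_t b)
{
    return (uint32_t)a < (uint32_t)b;
}
```

With A = -1 and B = 1, the signed comparison says A < B, but the unsigned one does not, because -1 has the bit pattern 0xFFFFFFFF.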
Let's assume the problem below though:
int32_t foo(int32_t A)
{
if (((A < 0) && ((A & 1) == 1)) || ((A >= 0) && ((A & 1) == 0)))
{
A += 1;
}
else
{
A -= 1;
}
return A;
}
In human language, the if statement is true if A is either an odd negative number or an even non-negative number, and Linaro GCC 7.4.1 with -O3 will generate the mess below:
foo
0x00000000: CMP r0,#0
0x00000004: AND r3,r0,#1
0x00000008: BLT {pc}+0x14 ; 0x1c
0x0000000C: CMP r3,#0
0x00000010: BEQ {pc}+0x14 ; 0x24
0x00000014: SUB r0,r0,#1
0x00000018: BX lr
0x0000001C: CMP r3,#0
0x00000020: BEQ {pc}-0xc ; 0x14
0x00000024: ADD r0,r0,#1
0x00000028: BX lr
People knowledgeable in the field of bit hacking would alter the if statement like below:
int32_t bar(int32_t A)
{
if ((A ^ (A<<31)) >= 0)
{
A += 1;
}
else
{
A -= 1;
}
return A;
}
And the results are:
bar
0x0000002C: EORS r3,r0,r0,LSL #31
0x00000030: ADDPL r0,r0,#1
0x00000034: SUBMI r0,r0,#1
0x00000038: BX lr
And finally, assembly programmers will replace EORS with teq r0, r0, lsl #31.
It won't make the code any faster, but it doesn't need R3 as the scratch register.
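If you want to check that the bit trick really matches the original condition, here is a minimal C sketch of just the two predicates. The function names are mine, and I do the shift on an unsigned copy and test the sign bit explicitly, which is my adjustment to avoid shifting a negative signed value; it is not the code from above.

```c
#include <stdint.h>

/* The condition spelled out as in foo. */
int cond_branchy(int32_t a)
{
    return ((a < 0) && ((a & 1) == 1)) || ((a >= 0) && ((a & 1) == 0));
}

/* The condition as in bar: XOR the sign bit with bit 0 (moved up to
   bit 31). The result's top bit is clear when sign and parity agree. */
int cond_bitwise(int32_t a)
{
    uint32_t u = (uint32_t)a;
    uint32_t mixed = u ^ (u << 31);
    return (mixed & 0x80000000u) == 0;
}
```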
Note that the code above is just a showcase, being a separate function where you have an excess of available registers.
In real life, however, registers are by far the scarcest resource, especially inside a loop, and even compilers will make use of the teq instruction in similar situations.
Summing it up, there are fields such as error correction, decryption/encryption, etc. where tons of xor operations are done, and people dealing with those problems just know to appreciate instructions such as teq and when to use them.
And always remember: never trust compilers
Hoping for just a bit of help with something.
I have a college assignment which involves making what is basically a calculator that takes values entered in by the user as a string of ASCII characters and stores and displays the entered value, as well as making various computations such as sum, min, etc. and stores them all as well. The code only executes after a value is entered.
My code itself is working how I need it to so far, and I think I know how to write in all the computations, but my issue is that most of the registers hold non-zero values from the start, and so if I start adding in values to a register right away -- for the sum, for instance -- the end value will be incorrect. I can't just use LDR and set them to zero beforehand, though, since that will happen every time the code is run, and I need to keep the added values around to make the computations each time.
I don't know if I'm overthinking this or if there's something really simple that I'm missing, but I can't think of a way to do what I need to with non-zero register values.
This is my working code so far:
AREA ConsoleInput, CODE, READONLY
IMPORT main
IMPORT getkey
IMPORT sendchar
EXPORT start
PRESERVE8
start
read
LDR R7, =0
BL getkey ; read key from console
CMP R0, #0x0D ; while (key != CR)
BEQ endRead ; {
BL sendchar ; echo key back to console
;R4 is used to store the hex value of whatever is entered (as a full number)
;R5 stored the entered input safely (as R0 can change)
;R6 holds the constant 10, for increasing successive entries by a base of ten
;R10 is where successive values are multiplied by 10
;R11 is used to hold the count (+1 each time the code is run)
LDR R6, =10
MUL R10, R6, R10
MOV R5, R0
AND R5, R5, #&F
ADD R10, R10, R5
MOV R4, R10 ;NUMBER ENTERED SENT TO R4
ADD R11, R11, #1 ;COUNT
B read ; }
endRead
stop B stop
END
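For reference, the digit handling described in the register comments can be modelled in C roughly like this. It is only a sketch under an assumption the question does not confirm: that the accumulators may be cleared once, before the read loop is entered, rather than inside it. The function name and parameters are hypothetical.

```c
/* Accumulate ASCII digits into a number, stopping at carriage return
   (0x0D) or at the end of the string. */
unsigned parse_decimal(const char *keys, unsigned *count)
{
    unsigned value = 0;    /* plays the role of R10 / R4; cleared once */
    *count = 0;            /* plays the role of R11; cleared once */

    while (*keys != '\0' && *keys != 0x0D) {
        /* value = value * 10 + digit, like MUL / AND #&F / ADD */
        value = value * 10 + (unsigned)(*keys & 0xF);
        (*count)++;
        keys++;
    }
    return value;
}
```

The point is only the placement of the two initialisations: they sit above the loop, so they run once and do not wipe the running value on each key.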
I am interested in converting a Fibonacci sequence code in C++ into ARM assembly language. The code in C++ is as follows:
#include <iostream>
using namespace std;
int main()
{
int range, first = 0 , second = 1, fibonacci;
cout << "Enter range for the Fibonacci Sequence" << endl;
cin >> range;
for (int i = 0; i < range; i++)
{
if (i <=1)
{
fibonacci = i;
}
else
{
fibonacci = first + second;
first = second;
second = fibonacci;
}
}
cout << fibonacci << endl;
return 0;
}
My attempt at converting this to assembly is as follows:
ldr r0, =0x00000000 ;loads 0 in r0
ldr r1, =0x00000001 ;loads 1 into r1
ldr r2, =0x00000002 ;loads 2 into r2, this will be the equivalent of 'n' in C++ code,
;but I will force the value of 'n' when writing this code
ldr r3, =0x00000000 ;r3 will be used as a counter in the loop
;r4 will be used as 'fibonacci'
loop:
cmp r3, #2 ;Compares r3 with a value of 0
it lt
movlt r4, r3 ;If r3 is less than #0, r4 will equal r3. This means r4 will only ever be
;0 or 1.
it eq ;If r3 is equal to 2, run through these instructions
addeq r4, r0, r1
moveq r0,r1
mov r1, r4
adds r3, r3, #1 ;Increases the counter by one
it gt ;Similarly, if r3 is greater than 2, run though these instructions
addgt r4, r0, r1
movgt r0, r1
mov r1, r4
adds r3, r3, #1
I'm not entirely sure if that is how you do if statements in Assembly, but that will be a secondary concern for me at this point. What I am more interested in, is how I can incorporate an if statement in order to test for the initial condition where the 'counter' is compared to the 'range'. If counter < range, then it should go into the main body of the code where the fibonacci statement will be iterated. It will then continue to loop until counter = range.
I am not sure how to do the following:
cmp r3, r2
;If r3 < r2
{
<code>
}
;else, stop
Also, in order for this to loop correctly, am I able to add:
cmp r3, r2
bne loop
So that the loop iterates until r3 = r2?
Thanks in advance :)
It's not wise to put if-statements inside a loop. Get rid of them.
An optimized (kinda) standalone Fibonacci function should look like this:
unsigned int fib(unsigned int n)
{
unsigned int first = 0;
unsigned int second = 1;
unsigned int temp;
if (n > 47) return 0xffffffff; // overflow check
if (n < 2) return n;
n -= 1;
while (1)
{
n -= 1;
if (n == 0) return second;
temp = first + second;
first = second;
second = temp;
}
}
Much like factorial, optimizing Fibonacci sequence is somewhat nonsense in real world computing, because they exceed the 32-bit barrier really soon: It's 12 with factorial and 47 with Fibonacci.
If you really need them, you are served the best with very short lookup tables.
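As an illustration of how short such a table is, here is a minimal C sketch for factorial (names are mine). A Fibonacci table would work the same way with 48 entries; the 0xffffffff return for out-of-range input just mirrors the overflow check in the function above.

```c
#include <stdint.h>

/* 0! through 12!; 13! no longer fits in 32 bits. */
static const uint32_t factorial_table[13] = {
    1u, 1u, 2u, 6u, 24u, 120u, 720u, 5040u, 40320u,
    362880u, 3628800u, 39916800u, 479001600u
};

/* Table lookup with a range check instead of a loop. */
uint32_t factorial_lookup(unsigned n)
{
    return n <= 12 ? factorial_table[n] : 0xffffffffu;
}
```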
If you need Fibonacci fully implemented for larger values:
https://www.nayuki.io/page/fast-fibonacci-algorithms
Last but not least, here is the function above in assembly:
cmp r0, #47 // r0 is n
movhi r0, #-1 // overflow check
bxhi lr
cmp r0, #2
bxlo lr
sub r2, r0, #1 // r2 is the counter now
mov r1, #0 // r1 is first
mov r0, #1 // r0 is second
loop:
subs r2, r2, #1 // n -= 1
add r12, r0, r1 // temp = first + second
mov r1, r0 // first = second
bxeq lr // return second when condition is met
mov r0, r12 // second = temp
b loop
Please note that the last bxeq lr can be placed immediately after subs which might seem more logical, but with the multiple issuing capability of the Cortex series in mind, it's better in this order.
It might not be exactly the answer you were looking for, but keep this in mind: a single if statement inside a loop can seriously cripple the performance - a nested one even more.
And there are almost always ways to avoid these. You just have to look for them.
Conditionals compile to conditional jumps in almost all assembly languages:
if (condition)
..iftrue..
else
..iffalse..
becomes
eval condition
conditional_jump_if_true truelabel
..iffalse..
unconditional_jump endlabel
truelabel:
..iftrue..
endlabel:
or the other way around (exchange false and true).
ARM supports conditional execution to eliminate these jumps when compiling the innermost conditionals: http://www.davespace.co.uk/arm/introduction-to-arm/conditional.html
IT... is a Thumb-2 instruction: http://en.wikipedia.org/wiki/ARM_architecture#Thumb-2 to support unified assemblies. See http://www.keil.com/support/man/docs/armasm/armasm_BABJGFDD.htm for more details.
Your code for looping (cmp and bne) is fine.
In general, try to rewrite your code using gotos instead of loops and else parts.
else can remain only at the deepest nesting level.
Then you can convert this semi-assembly code to assembly much more easily.
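As a sketch of what that semi-assembly form could look like for the loop in the question, here it is in C with the for/if/else lowered to labels and gotos. The function name is mine, and it assumes first + second is the intended update. Each `if (...) goto` corresponds to one cmp plus one conditional branch.

```c
/* The question's loop written with gotos only.
   Returns the last value assigned to fibonacci (0 if range is 0). */
unsigned fib_goto(unsigned range)
{
    unsigned first = 0, second = 1, fibonacci = 0, i = 0;

loop:
    if (i >= range) goto done;     /* cmp i, range ; branch if not lower */
    if (i > 1) goto general;       /* cmp i, #1    ; branch if higher    */
    fibonacci = i;                 /* i is 0 or 1 */
    goto next;
general:
    fibonacci = first + second;
    first = second;
    second = fibonacci;
next:
    i = i + 1;
    goto loop;
done:
    return fibonacci;
}
```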
HTH