ARM Assembly, Keil Microvision 4; Working with non-zero registry values - arm

Hoping for just a bit of help with something.
I have a college assignment which involves making what is basically a calculator that takes values entered in by the user as a string of ASCII characters and stores and displays the entered value, as well as making various computations such as sum, min, etc. and stores them all as well. The code only executes after a value is entered.
My code itself is working how I need it to so far, and I think I know how to write in all the computations, but my issue is that most of the registers hold non-zero values from the start, and so if I start adding in values to a register right away -- for the sum, for instance -- the end value will be incorrect. I can't just use LDR and set them to zero beforehand, though, since that will happen every time the code is run, and I need to keep the added values around to make the computations each time.
I don't know if I'm overthinking this or if there's something really simple that I'm missing, but I can't think of a way to do what I need to with non-zero registry values.
This is my working code so far:
AREA ConsoleInput, CODE, READONLY
IMPORT main
IMPORT getkey
IMPORT sendchar
EXPORT start
PRESERVE8
start
read
LDR R7, =0
BL getkey ; read key from console
CMP R0, #0x0D ; while (key != CR)
BEQ endRead ; {
BL sendchar ; echo key back to console
;R4 is used to store the hex value of whatever is entered (as a full number)
;R5 stored the entered input safely (as R0 can change)
;R6 holds the constant 10, for increasing successive entries by a base of ten
;R10 is where successive values are multiplied by 10
;R11 is used to hold the count (+1 each time the code is run)
LDR R6, =10
MUL R10, R6, R10
MOV R5, R0
AND R5, R5, #&F
ADD R10, R10, R5
MOV R4, R10 ;NUMBER ENTERED SENT TO R4
ADD R11, R11, #1 ;COUNT
B read ; }
endRead
stop B stop
END

Related

How can I use arm adcs in loop?

In arm assembly language, the instruction ADCS will add with condition flags C and set condition flags.
And the CMP instruction do the same things, so the condition flags will be recovered.
How can I solve it ?
This is my code, it is doing BCD adder with r0 and r1 :
ldr r8, =#0
ldr r9, =#15
adds r7, r8, #0
ADDLOOP:
and r4, r0, r9
and r5, r1, r9
adcs r6, r4, r5
orr r7, r6, r7
add r8, r8, #1
mov r9, r9, lsl #4
cmp r8, #3
bgt ADDEND
bl ADDLOOP
ADDEND:
mov r0, r7
I tried to save the state of condition flags, but I don't know how to do.
To save/restore the Carry flag, you could create a 0/1 integer in a register (perhaps with adc reg, zeroed_reg, #0?), then next iteration cmp reg, #1 or rsbs reg, reg, #1 to set the carry flag from it.
ARM can't materialize C as an integer 0/1 with a single instruction without any setup; compilers normally use movcs r0, #1 / movcc r0, #0 when not in a loop (Godbolt), but in a loop you'd probably want to zero a register once outside the loop instead of using two instructions predicated on carry-set / carry-clear.
Loop without modifying C
Use teq r8, #4 / bne ADDLOOP as the loop branch, like the bottom of a do{}while(r8 != 4).
Or count down from 4 with tst r8,r8 / bne ADDLOOP, using sub r8, #1 instead of add.
TEQ updates N and Z but not C or V flags. (Unless you use a shifted source operand, then it can update C). docs - unlike cmp, it sets flags like eors. The eq / ne conditions work the same: subtraction and XOR both produce zero when the inputs are equal, and non-zero in every other case. But teq doesn't even set C or V flags, and greater / less wouldn't be meaningful anyway.
This is what optimized BigInt code like GMP does, for example in its mpn_add_n function (source) which adds two bigint inputs (arrays of 32-bit chunks).
IDK why you were jumping forwards over a bl (branch-and-link) which sets lr as a return address. Don't do that, structure your asm loops like a do{}while() because it's more efficient, especially when the trip-count is known to be non-zero so you don't have to worry about running the loop zero times in some cases.
There are cbz/cbnz instructions (docs) that jump on a register being zero or non-zero without affecting flags, but they can only jump forwards (out of the loop, past an unconditional branch). They're also only available in Thumb mode, unlike teq which was probably specifically designed to give ARM an efficient way to write BigInt loops.
BCD adding
Your algorithm has bugs; you need base-10 carry, like 0x05 + 0x06 = 0x11 not 0x0b in packed BCD.
And even the binary Carry flag isn't set by something like 0x0005000 + 0x0007000; there's no carry-out from the high bit, only into the next nibble. Also, adc adds the carry-in at the bottom of the register, not at nibble your mask isolated.
So maybe you need to do something like subtract 0x000a000 from the sum (for that example shift position), because that will carry-out. (ARM sets C as a !borrow on subtraction, so maybe rsb reverse-subtract or swap the operands.)
NEON should make it possible to unpack to 8-bit elements (mask odd/even and interleave) and do all nibbles in parallel, but carry propagation is a problem; ARM doesn't have an efficient way to branch on SIMD vector conditions (unlike x86 pmovmskb). Just byte-shifting the vector and adding could generate further carries, as with 999999 + 1.
IDK if this can be cut down effectively with the same techniques hardware uses, like carry-select or carry-lookahead, but for 4-bit BCD digits with SIMD elements instead of single bits with hardware full-adders.
It's not worth doing for binary bigint because you can work in 32 or 64-bit chunks with the carry flag to help, but maybe there's something to gain when primitive hardware operations only do 4 bits at a time.

Converting a code a* b * c written in C to LC3

I don't know if this question is vague or lacks enough information but I was just wondering If I want to convert this line a = a * b * c written in C language to LC-3, how can I do it? Assuming that a, b and c are local variables and that the offset of a is 0, b is -1, c is -2?
I know that I can start off like this:
LDR R0, R5, #0; to load A
LDR R1, R5, #-1; to load B
Is there a limitation to the registers I can use? Can I use R2 to load C?
Edit:
LDR R0, R5, #0; LOAD A
LDR R1, R5, #-1; LOAD B
LDR R2, R5, #-2; LOAD C
AND R3, R3, #0; Sum = 0
LOOP ADD R3, R3, R1; Sum = sum + B
ADD R0, R0, #-1; A = A-1
STR R0, R5, #0; SAVE TO A (a = a*b)
BRp LOOP
ADD R4, R4, R2; Sum1 = sum1 + C
ADD R2, R2, #-1; C = C-1
BRp LOOP
STR R0, R5, #0; SAVE TO A (a = a*c = a*b*c)
If you're writing a whole program, which is often the case with LC-3, the only physical limit is the instruction set, so modulo that, you can use the registers as you like.
Your coursework/assignment may impose some environmental requirements, such as local variables and parameters being accessible from a frame pointer, e.g. R5, and having the stack pointer in R6.  If so, then those should probably be left alone, but in a pinch you could save them, and later restore them.
If you're writing just a function that is going to be called, then you'll need to follow the calling convention.  Decode the signature of the function you're implementing according to the parameter passing approach.  If you want to use R7 (e.g. as a scratch register, or if you want to call another function) be aware that on entry to your function, it holds the return address, whose value is needed to return to the caller, but you can save it on the stack or in a global storage for later retrieval.
The calling convention in use should also inform which registers are call clobbered vs. call preserved.  Within a function, call-clobbered registers can be used without fuss, but call-preserved registers require being saving before use, and later restored to original values before returning to the function's caller.

KEIL: The reason for working a MUL within a loop only in the 1st iteration

I have written a code in keil in which there is a loop containing a multiplication which can be done either by shift left by 1 or MUL command. However, both are executed only in the 1st iteration and in the next ones, compiler executes the code, while nothing happens.
Is there someone who knows the probable reason?
Thanks
Here is the code of the loop:
loop ADD r9, r9, #1
;MUL r8, r7, r11 ;double the X
LSL r12, r7, #1
CMP r12, r4 ;CMP the X with P
BHI G1
BLS G0
it was solved. it occured because this is a consecutive shift and it should be shifted by 1 at each iteration. the written code copies the source into destination and then, shift by 1. So it is not modified after the 1st iteration. the solution is modifying r7 by LSL r7, #1. This is shifted r7 by 1 position at all iterations.

How to make this LC3 program multiply instead?

Was trying to learn how to multiply in LC3 but having trouble modifying my old program that was just meant for adding sums. How would I go about modifying this program to multiply by the 2 given inputs?
Code:
.ORIG x3000 ; begin at x3000
; input two numbers
IN ;input an integer character (ascii) {TRAP 23}
LD R3, HEXN30 ;subtract x30 to get integer
ADD R0, R0, R3
ADD R1, R0, x0 ;move the first integer to register 1
IN ;input another integer {TRAP 23}
ADD R0, R0, R3 ;convert it to an integer
; add the numbers
ADD R2, R0, R1 ;add the two integers
; print the results
LEA R0, MESG ;load the address of the message string
PUTS ;"PUTS" outputs a string {TRAP 22}
ADD R0, R2, x0 ;move the sum to R0, to be output
LD R3, HEX30 ;add 30 to integer to get integer character
ADD R0, R0, R3
OUT ;display the sum {TRAP 21}
; stop
HALT ;{TRAP 25}
; data
MESG .STRINGZ "The sum of those two numbers is: "
HEXN30 .FILL xFFD0 ; -30 HEX
HEX30 .FILL x0030 ; 30 HEX
.END```
The simplest approach to multiply on LC-3 is repetitive addition. So keep summing the multiplicand and decrement the multiplier; the iteration stops when the multiplier is consumed (i.e. zero).
There are lot's of caveats: if the multiplier is negative, then we would either negate it to use with count down, or count up instead — either way, the final result would be negated.
Since multiplication is commutative, we might consider using the lessor (absolute) value for the multiplier so that fewer iterations are done. But for more optimal multiplication, we would switch to a whole 'nother algorithm, the shift and add. Note that this algorithm is usually presented for hardware implementation, in which saving precious register bits is important, whereas for software this is not a really significant concern.

How to start the basic programming task using Pennsim and LC-3 programming language?

The task at hand is to write a subroutine STRCPY to implement a string copy function like the C the programming language's strcpy() function.
I know:
R1 is the address of the string to copy from
R2 is the address of where the string is to be copied to
It's to be assumed that the function should copy every character of the source string to the destination address (including the null terminator), thus creating a complete duplicate copy of the source string. Also, it can be assumed that the caller has allocated sufficient space for the new string, and the subroutine will not return any information to the caller.
The starting code is given here.
I guess I'm just somewhat overwhelmed by all of the LEA commands at the beginning and as such, any guidance/assistance would be tremendously appreciated.
I don't have your starting code, but guessing at what is required...
You might start by producing a more detailed explanation in English.
For example:
You have a string to copy. The string is in memory, beginning at a fixed address. The fixed address is in R1. You are going to copy the string to memory, to another fixed address. That other fixed address is in R2.
You can approach this by copying one character, moving to the next, and repeating. You will stop when you hit the last character (the one with value 0).
Copy the character at the address in R1 to the address in R2
Check if the character copied was the null terminator
If it was, stop.
If it was not, move the addresses in R1 and R2 up one character (point to next character), and go to step (1).
Then turn that into assembly:
Some of the sorts of instructions you will to translate this English recipe into LC3 assembly are likely to be:
LDR R3, R1, #0 will load the character at the address in R1 into R3.
STR is the analogous command you can use for storing the character in R2 (but you should work that out yourself if this is homework).
ADD R1, R1, #1 will add 1 to the address in register R1 (R1 = R1 + 1).
AND R4, R3, x1111 will set R4 to 0 if R3 is 0 (the null character) (R4 = R3 & 0x1111).
BRZ DONE will go to the label "DONE:" if the last instruction set the zero flag.
LEA R5, NEXT followed by JMP R5 will go to the label "NEXT:" by loading the address of the label "NEXT:" into R5 and then jumping to that value.
I imagine your code will look something like:
LEA R5, NEXT Put the address of NEXT in R5
NEXT:
LDR R3, R1, #0 Copy what is in the address in R1 (a character) to R3
STR... Store the character in R2
AND R4, R3, x1111 See if the character in R3 (the one copied) is 0
BRZ DONE If it is, finish
ADD R1, R1, #1 If not, go to the next character
ADD ...
JMP R5 Jump to the address in R5 (which is NEXT)
DONE:
...
You should not assume I have this correct. I don't have an LC3 simulator handy.
Best of luck.

Resources