I'm trying to copy array A into array N and then print the array (to test that it has worked) but all it outputs is -1
Here is my code:
ORG $1000
START: ; first instruction of program
clr.w d1
movea.w #A,a0
movea.w #N,a2
move.w #6,d2
for move.w (a0)+,(a2)+
DBRA d2,for
move.w #6,d2
loop
move.l (a2,D2),D1 ; get number from array at index D2
move.b #3,D0 ; display number in D1.L
trap #15
dbra d2,loop
SIMHALT ; halt simulator
A dc.w 2,2,3,4,5,6
N dc.l 6
END START ; last line of source
Why is -1 in the output only? If there is a better solution for this that would be very helpful
Since I don't have access to whatever assembler/simulator you're using, I can't actually test it, but here are a few things (some of which are already noted in the comments):
dc.l 6 declares a single long (with the value 6); you want ds.l (or similar) to allocate storage for 6 longs
dbra decrements its counter and branches until the counter reaches -1, so the loop body runs count+1 times. You'll probably want to turn
move.w #loop_times,d0
loop
....
dbra d0,loop
into
move.w #loop_times-1,d0
loop
....
dbra d0,loop
(this works as long as loop_times is > 0; otherwise you'll have to check the condition before entering the loop)
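To see the off-by-one, here is a small Python model of the counter behaviour. It is only a sketch: dbra_iterations is a made-up helper, it models just the counter (not the 68000 itself), and it assumes the initial count is between 0 and 32767.

```python
def dbra_iterations(initial_count):
    """Count how many times a loop body that ends in DBRA executes.

    Only the counter is modelled: the body runs, the counter is
    decremented, and the branch is taken unless the counter became -1.
    """
    counter = initial_count
    iterations = 0
    while True:
        iterations += 1      # the loop body runs
        counter -= 1         # DBRA decrements the counter...
        if counter == -1:    # ...and falls through once it reaches -1
            break
    return iterations


print(dbra_iterations(6))      # 7
print(dbra_iterations(6 - 1))  # 6
```

So loading 6 processes 7 elements, and loading 6-1 processes the intended 6.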
Your display loop has a few problems: 1. On entry a2 points past the end of the N array. 2. Even after fixing that, the way you're indexing it will cause problems. On the first iteration you're trying to fetch a 4-byte long from address a2 + 6, then a long from a2 + 5, and so on.
What you want is to fetch longs from address a2 + 0, a2 + 4 .... One way of doing that:
move.w #6-1, d2 ; note the -1
movea.l #N, a2
loop
move.l (a2)+,D1 ; get next number from array
; use d1 here
dbra d2,loop
As already pointed out, your new array is only 4 bytes in size; you should change
dc.l 6 to ds.w 6
Also, you process 7 elements instead of 6, since DBRA counts down to -1.
Second, and that's why you get -1 everywhere, you use A2 as the pointer to the new array, but you do not reset it to point at the first word of the new array. Since you increased it by one word per element during the copy, A2 points to the first word after the array once the for loop has completed.
The fact that your simulator outputs more than one number with your display loop indicates that it does not faithfully emulate an MC68000. A real MC68000 would take an address error exception at "MOVE.L (A2,D2),D1" as soon as the sum A2+D2 is odd, because the 68000 does not allow word- or long-sized accesses to odd addresses (the MC68020 and higher do).
A cleaned-up, MC68000-compatible version could look like this:
lea A,a0
lea N,a2
moveq #5,d2
for move.w (a0)+,(a2)+
dbra d2,for
lea N,a2
moveq #5,d2
loop
move.w (a2)+,D1 ; get number (16 bits only)
ext.l d1 ; make the number 32 bits
moveq #3,D0 ; display number in D1.L
trap #15
dbra d2,loop
It probably contains some instructions you haven't encountered yet.
I am practicing using arrays and loops, and I am trying to have the user enter fewer than 100 characters in the console to fill up my array. The user can press ENTER whenever they are done entering however many characters they want, and the program will print out what they entered.
The program works, but I am wondering how the program checks whether the user pressed ENTER.
I have it so the program adds #-10 to the inputted character, and ENTER is x0A, which is 10 in decimal. I'm assuming that once the program detects this, the result is 0, which is false, and it exits the loop. That is my thought process.
Also, how would I change my code to make it so I can have the exit character be anything?
.orig x3000
LD R1,DATA_PTR ;load the memory address of array into R1
DO_WHILE_LOOP
GETC ;read characters into R0
OUT ;print R0 onto console as ASCII
STR R0,R1, #0 ;stores into memory location in R1
ADD R1,R1, #1 ;increment to next memory address
ADD R0,R0,#-10 ;looks at inputted character and checks if its is ASCII #10
BRp DO_WHILE_LOOP
LD R0,newline
OUT
LD R1,DATA_PTR
DO_WHILE_LOOP2
LDR R0,R1,#0 ;load R1 into R0
OUT ;print
ADD R2,R0,#0 ;move R0 to R2
LD R0,newline ;newline
OUT ;print
ADD R1,R1,#1 ;increment
ADD R2,R2,#-10 ;check if printed character is enter ASCII #10
BRp DO_WHILE_LOOP2 ;if not print next character(loop)
HALT
;Data
DATA_PTR .FILL ARRAY ;DATA_PTR gets the beginning of the ARRAY
newline .FILL x0A
ARRAY .BLKW #100
.END
"I'm assuming that once the program detects this, the result is 0, which is false, and it exits the loop."
It is not that the result 0 means "false" here. Rather, the difference between the input character and 10 is 0, meaning the character was exactly 0xA (10 decimal).
NB: the use of BRp can probably be considered a bug, though with the usual simulators I've had trouble entering a character whose ASCII value is smaller than 10.
In high level language terms what it is saying is:
do { ... } while ( in > 10 );
Though using BRnp would mean:
do { ... } while ( in != 10 );
which is more specific to newline.
If you want a different terminal character, change the value subtracted to the value of another character.
LC-3 does not offer subtraction, but it can "add" a negative number. However, the immediate form of ADD has only a 5-bit signed field, so it cannot add a number smaller than -16 (or larger than 15). So, if you want to check for an ASCII character with a code larger than 16, you'll have to use the register form of ADD instead, and use another instruction to load that register with the value, usually from a labeled constant declared with .FILL.
Instead of:
ADD R2,R2,#-10 ;check if printed character is enter ASCII #10
BRp DO_WHILE_LOOP
Do something like the following:
LD R3, value ; load value to subtract
ADD R2, R2, R3 ; subtract them
BRnp DO_WHILE_LOOP
...
...
value .FILL #-65 ; letter A, negated.
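As a rough way to decide which form of ADD you need for a given exit character, here is a Python sketch (not LC-3 code). The helper names are made up for illustration, and it assumes the 5-bit signed immediate range of -16 to 15.

```python
def fits_imm5(value):
    """True if value fits the 5-bit signed immediate of LC-3's ADD."""
    return -16 <= value <= 15


def needs_register_form(exit_char):
    """True if the negated character code must be loaded from a .FILL constant."""
    return not fits_imm5(-ord(exit_char))


print(needs_register_form("\n"))  # False
print(needs_register_form("A"))   # True
```

Newline (10) can be tested with ADD R0,R0,#-10 directly, while 'A' (65) needs the LD plus register-form ADD shown above.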
In LC-3 the ADD instruction sets condition codes.
There are three condition codes: N for negative, Z for zero, and P for positive. If you add zero to some register, then as part of the addition those three flags (condition codes) are set as follows: N if the original value was negative, Z if it was zero, P if it was positive; that is, < 0, = 0, > 0.
If we use ADD to add a non-zero value, here X (but in its negation, -X) to a register value, V, we get flags that tell us:
N = (V < X) i.e. N is true if V < X,
Z = (V = X) i.e. Z is true if V = X, and,
P = (V > X) i.e. P is true if V > X
(all ignoring possibilities for overflow).
The BR instruction can then test flags as follows:
If you would like to change the program flow of control on:
relation | idea  | Opcode
---------+-------+-------
<        | N     | BRn
>=       | not N | BRzp
=        | Z     | BRz
!=       | not Z | BRnp
>        | P     | BRp
<=       | not P | BRnz
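The flag and branch rules above can be modelled in a few lines of Python. This is just a sketch to play with the table, not LC-3 code: lc3_flag and branch_taken are hypothetical helpers, and overflow is ignored as in the text.

```python
def lc3_flag(result):
    """Return 'n', 'z' or 'p' for a 16-bit two's-complement result."""
    value = result & 0xFFFF
    if value == 0:
        return "z"
    return "n" if value & 0x8000 else "p"


def branch_taken(mnemonic, v, x):
    """Would the branch be taken after ADD computed v + (-x)?

    mnemonic is e.g. "BRnp"; the letters after "BR" are the flags tested.
    """
    return lc3_flag(v - x) in mnemonic[2:].lower()


print(branch_taken("BRp", 65, 10))   # True
print(branch_taken("BRnp", 10, 10))  # False
print(branch_taken("BRnp", 3, 10))   # True
print(branch_taken("BRp", 3, 10))    # False
```

The last two lines show the difference discussed earlier: for a character code below 10, BRnp keeps looping while BRp leaves the loop.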
I have a function that receives a number from 0 to 10 as an input in R0. Then I need to place the multiplication table from 1 to 10 into an array in the data segment and place the address of the result array in R1.
I have a loop to perform the arithmetic and have the array set up; however, I have no idea how to place the values in the array.
My original idea is that each time the loop runs, it calculates one iteration, stores the result in the array, and so on.
myArray db 1000 dup (0)
.code
MOV R0,#8 ;user input
MOV R11, #9 ;reference to stop loop when it reaches 10th iteration
loop
ADD R10, R10, #1 ;functions as counter
ADD R1,R0,R1 ;add the input number to itself and stores it in r1
CMP R11,R10 ;subtracts counter from 9
BMI finish ;if negative flag is set it ends the loop
B loop ;if negative flag is zero it continues
finish
end
Any help is much appreciated
Your code is on the right track but it needs some fixing.
To specifically answer your question about load and store, you need to reserve space in memory, make a pointer, and load and store to the location the pointer is pointing to. The pointer can be specified by a register, like R0.
Here is a playlist of YouTube videos that covers everything you need to make a loop (from memory allocation to load/store and looping). At the very least, you can watch the videos on code sections, load-store instructions, and looping and branch instructions.
Good luck!
I have just started to learn assembly language at school, and as an exercise I have to make a program that calculates the sum of the first n integers (1+2+3+4+5+...+n).
I managed to build this program, but during the comparison (line 9) I only compare the even numbers in register R1, so I would have to do another comparison for the odd numbers in R0.
MOV R0,#1 ; I put a register at 1 to start the sequence
INP R2,2 ; I ask the user up to what number the program should calculate, and I put its response in the R2 register
B myloop ; I define a loop
myloop:
ADD R1,R0,#1 ; I calculate n+1 and put it in register 1
ADD R3,R1,R0 ; I add R0 and R1, and I put the result in the register R3
ADD R0,R1,#1 ; I calculate n+2 and I put in the register R0, and so on...
ADD R4,R4,R3 ; R4 is the total result of all additions between R0 and R1
CMP R1,R2 ; I compare R1 and the maximum number to calculate
BNE myloop ; I only exit the loop if R1 and R2 are equal
STR R4,100 ; I store the final result in memory 100
OUT R4,4 ; I output the final result of the sequence
HALT ; I stop the execution of the program
I've tried several methods but I can't manage to perform this double comparison... (a bit like an "elif" in python)
Basically I would like to add this piece of code to also compare odd numbers:
CMP R0,R2
BNE myloop
But adding this like this directly after comparing even numbers doesn't work no matter if I put "BNE" or not.
You're trying to do a conjunction, in context, something like this:
do {
...
} while ( odd != n && even != n );
...
Eventually, one of those counters should reach the value n and stop the loop. So, both tests must pass in order to continue the loop. However, if either test fails, then the loop should stop.
First, we'll convert this loop into the if-goto-label form of assembly (while still using the C language!):
loop1:
...
if ( odd != n && even != n ) goto loop1;
...
Next, let's break down the conjunction to get rid of it. The intent of the conjunction is that if the first component fails, to stop the loop, without even checking the second component. However, if the first component succeeds, then go on to check the second component. And if the second also succeeds, then, and only then return to the top of the loop (knowing both have succeeded), and otherwise fall off the bottom. Either way, whether the first component fails or the second component fails, the loop stops.
This intent is fairly easy to accomplish in if-goto-label:
loop1:
...
if ( odd == n ) goto endLoop1;
if ( even != n ) goto loop1;
endLoop1:
...
Can you figure out how to follow this logic in assembly?
Another analysis might look like this:
loop1:
...
if ( odd == n || even == n ) goto endLoop1;
goto loop1;
endLoop1:
...
This is the same logic, but stated as how to exit the loop rather than how to continue the loop. The condition is inverted (De Morgan), but the target of the if-goto statement is also changed, so the logic is effectively doubly negated and therefore means the same thing.
From that we would strive to remove the disjunction, again by making two statements instead of one with disjunction, also relatively straightforward using if-goto:
loop1:
...
if ( odd == n ) goto endLoop1;
if ( even == n ) goto endLoop1;
goto loop1;
endLoop1:
...
And with an optimization sometimes known as branch over unconditional branch (the unconditional branch is goto loop1;), we perform a pattern substitution, namely: (1) reversing the condition of the conditional branch, (2) changing the target of the conditional branch to the target of the unconditional branch, and (3) removing the unconditional branch.
loop1:
...
if ( odd == n ) goto endLoop1;
if ( even != n ) goto loop1;
endLoop1:
...
In summary, one takeaway is to understand how powerful the primitive if-goto is, and that it can be composed into conditionals of any complexity.
Another takeaway is that logic can be transformed by pattern matching and substitution into something logically equivalent but more desirable for some purpose, like writing assembly! Here we work toward greater use of simple if-gotos and lesser use of compound conditions.
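If you want to convince yourself that the transformed form really is equivalent, a brute-force check is easy to write. The following Python sketch uses made-up function names; it compares the compound condition with the two-statement if-goto form over a small range of values.

```python
from itertools import product


def continue_compound(odd, even, n):
    """Loop continues while ( odd != n && even != n )."""
    return odd != n and even != n


def continue_if_goto(odd, even, n):
    """Same decision written as two simple tests, as in the if-goto form."""
    if odd == n:
        return False   # goto endLoop1
    if even != n:
        return True    # goto loop1
    return False       # fall through to endLoop1


# Exhaustively compare both forms on a small range of values.
assert all(
    continue_compound(o, e, n) == continue_if_goto(o, e, n)
    for o, e, n in product(range(4), repeat=3)
)
print("both forms agree")
```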
Also, as @vorrade says, programs generally should not make assumptions about register values, so I suggest loading 0 into the registers that need it at the beginning of the program to ensure they are initialized. We generally don't clear registers after their use but rather set them before use. In another, larger program, your code might run with values left over in those registers by some other code.
Also, I answer the question as posed, which is about compound conditionals, and explain how those work in some detail. As stated elsewhere, though, there's no need to separate even and odd numbers in order to sum them, and there's also a single formula that can compute the sum without iteration (though it requires multiplication, which may not be directly available and so would itself require a bounded iteration).
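For reference, the single formula is n(n+1)/2. Here is a short Python sketch (function names are mine, not from the question) that checks it against the plain loop:

```python
def sum_loop(n):
    """Sum 1..n by iteration, like the assembly loop does."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def sum_formula(n):
    """Closed form: n * (n + 1) / 2 (always an integer)."""
    return n * (n + 1) // 2


print(sum_loop(100), sum_formula(100))  # 5050 5050
```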
First of all your code assumes that R4 is 0 at the beginning.
This might not be true.
Your program becomes simpler and easier to understand if you add each number in a smaller loop, like this:
INP R2,2 ; I ask the user up to what number the program should calculate, and I put its response in the R2 register
MOV R0,#0 ; I put a register at 0 to start the sequence
MOV R4,#0 ; I put a register at 0 to start the sum
B testdone ; Jump straight to test if done
myloop:
ADD R0,R0,#1 ; I calculate n+1 and keep it in register 0
ADD R4,R4,R0 ; I add R4 and R0, and I put the result in the register R4
testdone:
CMP R0,R2 ; I compare R0 and the maximum number to calculate
BNE myloop ; I only exit the loop if R0 and R2 are equal
STR R4,100 ; I store the final result in memory 100
OUT R4,4 ; I output the final result of the sequence
HALT ; I stop the execution of the program
You only need 3 registers: R0 for current n, R2 for limit and R4 for the sum.
However, if you really have to add the even and odd numbers separately, you could do it this way:
INP R2,2 ; I ask the user up to what number the program should calculate, and I put its response in the R2 register
MOV R0,#0 ; I put a register at 0 to start the sequence
MOV R4,#0 ; I put a register at 0 to start the sum
B testdone ; Jump straight to test if done
myloop:
ADD R0,R0,#1 ; I calculate n+1 (odd numbers), still register 0
ADD R4,R4,R0 ; I add R4 and R0, keep the result in the register R4
CMP R0,R2 ; I compare R0 and the maximum number to calculate
BEQ done ; I only exit the loop if R0 and R2 are equal
ADD R0,R0,#1 ; I calculate n+1 (even numbers) and keep it in register 0
ADD R4,R4,R0 ; I add R4 and R0, and I put the result in the register R4
testdone:
CMP R0,R2 ; I compare R0 and the maximum number to calculate
BNE myloop ; I only exit the loop if R0 and R2 are equal
done:
STR R4,100 ; I store the final result in memory 100
OUT R4,4 ; I output the final result of the sequence
HALT ; I stop the execution of the program
First of all, I would like to thank you very much, @vorrade, @Erik Eidt and @Peter Cordes. I read your comments and advice very carefully, and they are very useful to me :)
But in fact, after posting my question I continued to look for a solution by myself, and I came up with this code, which works perfectly!
// Date : 31/01/2022 //
// Description : A program that calculate the sum of the first n integers (1+2+3+4+5+...+n) //
MOV R0,#1 // I put a register at 1 to start the sequence of odd number
INP R2,2 // I ask the user up to what number the program should calculate, and I put its response in the R2 register
B myloop // I define a main loop
myloop:
ADD R1,R0,#1 // I calculate n+1 (even) and I put in the register R1, and so on...
ADD R3,R1,R0 // I add R0 and R1, and I put the result in the register R3
ADD R4,R4,R3 // R4 is the total result of all additions between R0 and R1, which is the register that temporarily stores the calculated results in order to increment them later in register R4
ADD R0,R1,#1 // I calculate the next odd number to add to the sequence
B test1 // The program goes to the first comparison loop
test1:
CMP R0,R2 // I compare the odd number which is in the current addition with the requested maximum, this comparison can only be true if the maximum is also odd.
BNE test2 // If the comparison is not equal, then I move on to the next test which does exactly the same thing but for even numbers this time.
ADD R4,R4,R2 // If the comparison is equal, then I add a step to the final result because my main loop does the additions 2 by 2.
B final // The program goes to the final loop
test2:
CMP R1,R2 // I compare the even number which is in the current addition with the requested maximum, this comparison can only be true if the maximum is also even.
BNE myloop // If the comparison is not equal, then the program returns to the main loop because this means that all the comparisons (even or odd) have concluded that the maximum has not yet been reached and that it is necessary to continue adding.
B final // The program goes to the final loop
final:
STR R4,100 // I store the final result in memory 100
OUT R4,4 // I output the final result of the sequence
HALT // I stop the execution of the program
I made some comments to explain my process!
I now realize that the decomposition into several loops was indeed the right solution to my problem, it allowed me to better realize the different steps that I had first written down on paper.
Could someone help me speed up my Delphi function, which finds a value in a byte array without using binary search?
I call this function thousands of times. Is it possible to optimize it with assembly?
Thank you so much.
function IndexOf(const List: TArray< Byte >; const Value: byte): integer;
var
I: integer;
begin
for I := Low( List ) to High( List ) do begin
if List[ I ] = Value then
Exit ( I );
end;
Result := -1;
end;
The length of the array is about 15 items.
Well, let's think. First, please edit this line:
For I := Low( List ) to High( List ) do
(you forgot 'do' at the end). When we compile it without optimization, here is the assembly code for this loop:
Unit1.pas.29: If List [I] = Value then
005C5E7A 8B45FC mov eax,[ebp-$04]
005C5E7D 8B55F0 mov edx,[ebp-$10]
005C5E80 8A0410 mov al,[eax+edx]
005C5E83 3A45FB cmp al,[ebp-$05]
005C5E86 7508 jnz $005c5e90
Unit1.pas.30: Exit (I);
005C5E88 8B45F0 mov eax,[ebp-$10]
005C5E8B 8945F4 mov [ebp-$0c],eax
005C5E8E EB0F jmp $005c5e9f
005C5E90 FF45F0 inc dword ptr [ebp-$10]
Unit1.pas.28: For I := Low (List) to High (List) do
005C5E93 FF4DEC dec dword ptr [ebp-$14]
005C5E96 75E2 jnz $005c5e7a
This code is far from optimal: the local variable i really is a local variable, that is, it is stored in RAM, on the stack (you can tell from the [ebp-$10] addresses; ebp is the stack frame pointer).
So at each new iteration we load the address of the array into the eax register (mov eax, [ebp-$04]),
then we load i from the stack into the edx register (mov edx, [ebp-$10]),
then at last we load List[i] into the al register, which is the low byte of eax (mov al, [eax+edx]),
after which we compare it with the argument 'Value', taken again from memory, not from a register!
This implementation is extremely slow.
But let's turn optimization on at last! It's done in Project options -> compiling -> code generation. Let's look at new code:
Unit1.pas.29: If List [I] = Value then
005C5E5A 3A1408 cmp dl,[eax+ecx]
005C5E5D 7504 jnz $005c5e63
Unit1.pas.30: Exit (I);
005C5E5F 8BC1 mov eax,ecx
005C5E61 5E pop esi
005C5E62 C3 ret
005C5E63 41 inc ecx
Unit1.pas.28: For I := Low (List) to High (List) do
005C5E64 4E dec esi
005C5E65 75F3 jnz $005c5e5a
Now there are just 5 instructions that get repeated over and over (cmp, jnz, inc, dec, jnz).
Value is stored inside dl register (lower byte of edx register),
address of 0-th element of array is stored in eax register,
i is stored in ecx register.
So the line 'if List[i] = Value' converts into just 1 assembly line:
005C5E5A 3A1408 cmp dl,[eax+ecx]
The next line is a conditional jump; the 3 lines after that are executed just once or never (only if the condition is true); and at last there is the increment of i and
the decrement of the loop counter (it's easier to compare it with zero than with anything else).
So, there is little we can do that the Delphi compiler with its optimizer didn't!
If it's permitted by your program, you can try to reverse direction of search, from last element to first:
For I := High( List ) downto Low( List ) do
This way the compiler will be happy to compare i with zero to detect that we have checked everything (this operation is free: when we decrement i and get zero, the CPU zero flag is set!)
But with such an implementation the behaviour may be different: if you have several entries equal to Value, you'll get not the first one, but the last one!
Another very easy thing is to declare this IndexOf function as inline: this way you'll probably have no function call here: this code will be inserted at each place where you call it. Function calls are rather slow things.
There are also some crazy methods described in Knuth for searching a simple array as fast as possible. He introduces a 'dummy' last element of the array which equals your 'Value'; that way you don't have to check boundaries (the search will always find something before going out of range), so there is just 1 condition inside the loop instead of 2. Another method is 'unrolling' the loop: you write out 2 or 3 or more iterations inside the loop body, so there are fewer jumps per check. But this has even more downsides: it is beneficial only for rather large arrays and may make the search even slower for arrays with 1 or 2 elements.
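To make the 'dummy last element' idea concrete, here is a minimal Python sketch. It only illustrates the control flow (one test per iteration); it is not meant as a speed-up in Python itself, and index_of_sentinel is a made-up name. It assumes the list may be temporarily extended by one element.

```python
def index_of_sentinel(items, value):
    """Linear search with a sentinel: only one test inside the loop."""
    items.append(value)           # dummy last element equal to value
    i = 0
    while items[i] != value:      # no bounds check needed: the sentinel stops us
        i += 1
    items.pop()                   # restore the original list
    return i if i < len(items) else -1


data = [3, 1, 4, 1, 5]
print(index_of_sentinel(data, 4))  # 2
print(index_of_sentinel(data, 9))  # -1
```

If the loop stops on the sentinel itself, the value was not in the original data.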
As others said: the biggest improvement would come from understanding what kind of data you store: does it change frequently or stay the same for a long time? Do you look for random elements, or are there some 'leaders' which get the most attention? Must the elements stay in the order you put them in, or may you rearrange them as you wish? Then you can choose a data structure accordingly. If you look for the same 1 or 2 entries all the time and they can be rearranged, a simple 'move-to-front' method would be great: you don't just return the index but first move the element to the first place, so it will be found very quickly the next time.
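A possible shape for 'move-to-front', again as a Python sketch with a hypothetical function name. It assumes reordering the list is acceptable and returns the position where the value was found before it was moved.

```python
def find_move_to_front(items, value):
    """Linear search that moves a hit to index 0 for faster repeat lookups.

    Returns the index where the value was found (before moving), or -1.
    """
    for i, item in enumerate(items):
        if item == value:
            if i > 0:
                items.insert(0, items.pop(i))  # move the hit to the front
            return i
    return -1


data = [5, 7, 9]
print(find_move_to_front(data, 9))  # 2
print(data)                         # [9, 5, 7]
print(find_move_to_front(data, 9))  # 0
```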
If your arrays are long, you can use the x86 built-in string scan instruction REPNE SCASB. It is implemented in microcode and has a moderate start-up time, but it is heavily optimized in the CPU and runs fast given long enough data structures (>= 100 bytes).
In fact on a modern CPU it frequently outperforms very clever RISC code.
If your arrays are short, then no amount of optimization of this routine will help, because then your problem is in code not shown in the question, so there is no answer I can give you.
See: http://docwiki.embarcadero.com/RADStudio/Tokyo/en/Internal_Data_Formats_(Delphi)
function IndexOf({$ifndef RunInSeperateThread} const {$endif} List: TArray<byte>; const Value: byte): integer;
//Lock the array if you run this in a separate thread.
{$ifdef CPUX64}
asm
//RCX = List
//DL = byte.
mov r8,[rcx-8] //3 - get the length ASAP.
push rdi //0 - hidden in mov r,m
mov eax,edx //0 - rename
mov rdi,rcx //0 - rename
mov rcx,r8 //0 - rename
mov rdx,r8 //0 - remember the length
//8 cycles setup
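repne scasb //2n - repeat until byte found.
pop rdi //1 - pop does not change the flags
jne @NotFound //ZF clear: no byte matched
not rcx //0 - rcx = -(bytes left) - 1
lea rax,[rdx+rcx] //1 result = length - bytes left - 1 (zero-based index)
jmp @Done
@NotFound:
mov eax,-1 //not found, same as the Pascal version
@Done:
end;
{$ENDIF}
{$ifdef CPUX86}
asm
//EAX = List
//DL = byte.
push edi
mov edi,eax
mov ecx,[eax-4] //get the length
mov eax,edx
mov edx,ecx //remember the length
repne scasb //repeat until byte found.
pop edi //pop does not change the flags
jne @NotFound //ZF clear: no byte matched
not ecx //ecx = -(bytes left) - 1
lea eax,[edx+ecx] //result = length - bytes left - 1 (zero-based index)
jmp @Done
@NotFound:
mov eax,-1 //not found, same as the Pascal version
@Done:
end;
{$ENDIF}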
Timings
On my laptop using an array of 1KB with the target byte at the end this gives the following timings (lowest time using a 100.0000 runs)
Code | CPU cycles
| Len=1024 | Len=16
-------------------------------+----------+---------
Your code optimizations off | 5775 | 146
Your code optimizations on | 4540 | 93
X86 my code | 2726 | 60
X64 my code | 2733 | 69
The speed-up is OK (ish), but hardly worth the effort.
If your arrays are short, then this code will not help you and you'll have to resort to other options to optimize your code.
Speed up possible when using binary search
Binary search is an O(log n) operation, vs O(n) for naive search, but it requires the array to be sorted.
Using the same array, this will find your data in log2(1024) * CPU cycles per step = 10 * 20 = roughly 200 cycles. That is a 10+ times speed-up over my optimized code.
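For comparison, this is roughly what a binary-search IndexOf looks like, sketched in Python with the standard bisect module. It assumes the data is kept sorted; index_of_sorted is a made-up name.

```python
from bisect import bisect_left


def index_of_sorted(sorted_items, value):
    """Binary search: index of value in a sorted list, or -1 if absent."""
    i = bisect_left(sorted_items, value)   # first position where value could go
    if i < len(sorted_items) and sorted_items[i] == value:
        return i
    return -1


print(index_of_sorted([2, 3, 5, 7, 11], 7))  # 3
print(index_of_sorted([2, 3, 5, 7, 11], 4))  # -1
```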
Here's a small snippet of assembly code (TASM) where I simply try to increment the value at the current index of the array. The idea is that the "freq" array will store a number (DWord size) that represents how many times that ASCII character was seen in the file. To keep the code short, "b" stores the current byte being read.
Declared in data segment
freq DD 256 DUP (0)
b DB ?
___________
Assume b contains current byte
mov bl, b
sub bh, bh
add bx, bx
inc freq[bx]
I receive this error at compilation time at the line containing "inc freq[bx]": ERROR Argument to operation or instruction has illegal size.
Any insight is greatly appreciated.
There is no inc that can increment a dword in 16 bit mode. You will have to synthesize it from add/adc, such as:
add freq[bx], 1
adc freq[bx + 2], 0
You might need to add a size override, such as word ptr, or change your array definition to freq DW 512 DUP (0).
Also note that you have to scale the index by 4, not 2.
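To illustrate both points (index scaled by 4, and a 32-bit increment built from two 16-bit operations), here is a small Python model of the memory. It is a sketch, not TASM: the helper names are made up, and it assumes little-endian storage as on x86.

```python
def read_word(mem, offset):
    """Read a 16-bit little-endian word."""
    return int.from_bytes(mem[offset:offset + 2], "little")


def write_word(mem, offset, value):
    """Write the low 16 bits of value as a little-endian word."""
    mem[offset:offset + 2] = (value & 0xFFFF).to_bytes(2, "little")


def inc_freq(mem, byte_value):
    """Increment the 32-bit counter for byte_value using 16-bit steps."""
    bx = byte_value * 4                       # scale by 4: each counter is a dword
    total = read_word(mem, bx) + 1            # add word ptr freq[bx], 1
    write_word(mem, bx, total)
    carry = total >> 16                       # carry out of the low word
    write_word(mem, bx + 2, read_word(mem, bx + 2) + carry)  # adc word ptr freq[bx+2], 0


def read_dword(mem, index):
    """Read the 32-bit counter at the given index."""
    return int.from_bytes(mem[index * 4:index * 4 + 4], "little")


freq = bytearray(256 * 4)                     # freq DD 256 DUP (0)
write_word(freq, 65 * 4, 0xFFFF)              # pretend 'A' was already seen 65535 times
inc_freq(freq, 65)
print(hex(read_dword(freq, 65)))              # 0x10000
```

The carry from the low word ends up in the high word, and the neighbouring counters stay untouched.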