How do I copy an array into another array?

I am just starting out learning assembly and I am having a hard time understanding how I would copy one array into another array.
For example, let's say I have two arrays, J and K:
J and K both contain 5 elements which are numbers that are 8 bits wide.
J = [0, 1, 2, 3, 4]
K = [5, 6, 7, 8, 9]
J is located in register 1 and K is located in register 2
How would I go about "appending"/"copying" J to K? (If that is even the correct way to think about it)
Would it just be:
LDR R3, R1[0] ; placing 0th J element into register R3
MOV R2, R3 ; Moving the R3 element into the array K
....
....
....
Continue like that until all elements have been copied over to array K
So the result I am trying to obtain is an array with the elements from both initial arrays: result = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
I am sure this is completely wrong, so if anyone is able to shed some light on this for me it would be much appreciated!

This can definitely be done in assembly, but it's difficult for a few reasons:
You need to know where the arrays are, and make sure you don't overwrite anything important when trying to copy them. Let's say for example that you want to just copy K directly behind J. If something else important is stored behind J then it will get erased!
How big is each array? It's obvious to you and me but the computer doesn't really have a clue. You have two choices: measure it yourself or know the array sizes ahead of time.
You'll need a different routine for each element size (byte, halfword, word), because the load/store instructions and the pointer step change with it.
For this example code I'll assume that you want to place the appended array J+K in a separate section of memory that isn't going to overwrite anything you're using. Also this code assumes that both array sizes are 5 bytes. This isn't particularly useful since chances are you're going to want your code to be able to handle arrays of various sizes, not just this one particular size.
LDR R2,=ARRAY_J
LDR R3,=ARRAY_K
LDR R4,=ARRAY_L
MOV R1,#5 @ size of ARRAY_J goes here.
loop_append_J:
LDRB R0,[R2],#1 @ load R0 from ARRAY_J, then add 1 to the pointer so we read the
@ next byte on the next pass.
STRB R0,[R4],#1 @ store R0 into ARRAY_L (ARRAY J + ARRAY K), then add 1 to the
@ pointer
SUBS R1,R1,#1 @ decrease loop counter and set the flags accordingly
BNE loop_append_J @ if R1 doesn't equal zero, loop again.
MOV R1,#5 @ size of ARRAY_K goes here
loop_append_K:
LDRB R0,[R3],#1 @ this is pretty much the same story, just with the second
@ array. R4 already points where it needs to.
STRB R0,[R4],#1
SUBS R1,R1,#1
BNE loop_append_K
@ the copy is done; return however your program needs to (BX LR, for example)
.data @ forgive me if the syntax is wrong, I'm used to VASM which doesn't
@ have data directives like this. (GNU as for ARM uses @ for comments.)
ARRAY_J:
.byte 0,1,2,3,4 @ whatever directive you use to define 8-bit values
ARRAY_K:
.byte 5,6,7,8,9 @ whatever directive you use to define 8-bit values
ARRAY_L: @ empty space to hold the new array.
.space 64,0 @ this is more than enough room to store the new array.
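For comparison, here is a minimal C sketch of the same append. It assumes, like the assembly, that the destination is a separate buffer big enough for both arrays. The names copy_bytes and append_arrays are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy n bytes from src to dst one at a time, like the LDRB/STRB loop.
   Returns the position just past the last byte written, mirroring how
   R4 keeps advancing in the assembly. (Hypothetical helper.) */
static uint8_t *copy_bytes(uint8_t *dst, const uint8_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    while (n-- > 0) {
        *dst++ = *src++;   /* load a byte, store it, bump both pointers */
    }
    return dst;
}

/* Write j followed by k into out and return the combined length.
   The caller must make out large enough. (Hypothetical helper.) */
static size_t append_arrays(uint8_t *out,
                            const uint8_t *j, size_t j_len,
                            const uint8_t *k, size_t k_len)
{
    uint8_t *next = copy_bytes(out, j, j_len);  /* first array */
    copy_bytes(next, k, k_len);                 /* second array goes right after */
    return j_len + k_len;
}
```

Calling append_arrays(out, J, 5, K, 5) with a 64-byte out fills the first ten bytes with 0 through 9. Unlike the SUBS/BNE loops, the while loop also copes with a length of zero.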

Related

Loading values to array in data segment with assembly

I have a function that receives a number from 0 to 10 as an input in R0. Then I need to place the multiplication table from 1 to 10 into an array in the data segment and place the address of the result array in R1.
I have a loop to make the arithmetic operation and have the array setup however I have no idea how to place the values in the array.
My original idea is that each time the loop runs, it calculates one iteration and stores the result in the array, and so on.
myArray db 1000 dup (0)
.code
MOV R0,#8 ;user input
MOV R11, #9 ;reference to stop loop when it reaches 10th iteration
loop
ADD R10, R10, #1 ;functions as counter
ADD R1,R0,R1 ;adds the input number to the running total and stores it in R1
CMP R11,R10 ;subtracts counter from 9
BMI finish ;if negative flag is set it ends the loop
B loop ;if negative flag is zero it continues
finish
end
Any help is much appreciated
Your code is on the right track but it needs some fixing.
To specifically answer your question about load and store, you need to reserve space in memory, make a pointer, and load and store to the location the pointer is pointing to. The pointer can be specified by a register, like R0.
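As a rough illustration of "reserve space, make a pointer, store through the pointer", here is a small C sketch of the multiplication-table loop. The names table and fill_table are invented for this example, and the table is held as plain int values, which is an assumption about the element size.

```c
#include <assert.h>
#include <stddef.h>

#define TABLE_ENTRIES 10

/* Reserved space, playing the role of the array in the data segment. */
static int table[TABLE_ENTRIES];

/* Fill table with n*1 .. n*10 and return its address (the value the
   question wants to end up in R1). Hypothetical helper. */
static int *fill_table(int n)
{
    int *p = table;       /* the pointer: starts at the first element */
    int product = 0;

    assert(n >= 0 && n <= 10);
    for (int i = 0; i < TABLE_ENTRIES; i++) {
        product += n;     /* next multiple, by repeated addition */
        *p++ = product;   /* store through the pointer, then advance it */
    }
    return table;
}
```

In ARM terms, *p++ = product corresponds to a post-indexed store such as STR Rx,[Rp],#4 for 4-byte elements.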
Here is a playlist of YouTube videos that covers everything you need to build the loop (from memory allocation to load/store and looping). At the very least, watch the videos on code sections, load-store instructions, and looping and branch instructions.
Good luck!

simple for loop and sum

I'm trying to learn HCS12 assembly language, but there are not enough examples on the internet. I tried to write the code below without success, and I'm stuck. This is not homework. Can someone write it in HCS12 assembly language with comments? I want code because I want to read it step by step. By the way, is there a simpler way to define an array?
;The array arr will be located at $1500 and the contents {2, 5, 6, 16, 100, 29, 60}
sum = 0;
for i = 0 : 6
x = arr[i];
if( x < 50 )
sum = sum + x
end
My try:
Entry:
;2,5,6,16,100,39,60
LDAA #2
STAA $1500
LDAA #5
STAA $1501
LDAA #6
STAA $1502
LDAA #16
STAA $1503
LDAA #100
STAA $1504
LDAA #39
STAA $1505
LDAA #60
STAA $1506
CLRA ; 0 in accumulator A
CLRB ; 0 in accumulator B
ADDB COUNT ; B accumulator has 6
loop:
;LDAA 1, X+ ; 1500 should be x because it should increase up to 0 from 6
; A accumulator has 2 now
BLO 50; number less than 50
;ADDA
DECB
BNE loop
Below is one possible way to implement your specific FOR loop.
It's mostly for the HC11, which is source-level compatible with the HCS12, so it should also assemble correctly for the HCS12. However, the HCS12 has some extra instructions and addressing modes (e.g., indexed auto-increment) which can make the code a bit shorter and more readable. Anyway, I haven't actually tried this, but it should be OK.
BTW, your code shows a misunderstanding of certain instructions. For example, BLO 50 does not mean "branch if the accumulator is below 50". It means: check the CCR (Condition Code Register) flags, which should already have been set by some previous instruction, and branch to address 50 (obviously not what you intended) if they indicate that the compared value was lower than the target. To compare a register with a value or a memory location you must use the CMPx instructions (e.g., CMPA).
;The array arr will be located at $1500 and the contents {2, 5, 6, 16, 100, 29, 60}
org $1500 ;(somewhere in ROM)
arr fcb 2,5,6,16,100,29,60 ;as bytes (use dw if words)
org $100 ;wherever your RAM is
;sum = 0;
sum rmb 2 ;16-bit sum
org $8000 ;wherever your ROM is
;for i = 0 : 6
clrb ;B is your loop counter (i)
stab sum ;initialize sum to zero (MSB)
stab sum+1 ; -//- (LSB)
ForLoop cmpb #6 ;compare against terminating value
bhi ForEnd ;if above, exit FOR loop
; x = arr[i];
ldx #arr ;register X now points to array
abx ;add offset to array element (byte size assumed)
ldaa ,x ;A is your target variable (x)
;;;;;;;;;;;;;;;;;;; ldaa b,x ;HCS12 only version (for the above two HC11-compatible lines)
inx ;X points to next value for next iteration
;;;;;;;;;;;;;;;;;;; ldaa 1,x+ ;HCS12 only version (for the above two HC11-compatible lines)
; if( x < 50 )
cmpa #50
bhs EndIf
; sum = sum + x
adda sum+1
staa sum+1
ldaa sum
adca #0
staa sum
EndIf
incb ;(implied i = i + 1 at end of loop)
bra ForLoop
;end
ForEnd
The above assumes your array is constant, so it is placed somewhere in ROM at assembly time. If your array is dynamic, it should be located in RAM, and you would need to use code to load it (similar to how you did). However, for efficiency, a loop is usually used when loading (copying) multiple values from one location to another. This is both more readable and more efficient in terms of needed code memory.
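To make the "copy with a loop, then sum" idea concrete, here is a hedged C sketch of the same logic. The names arr_rom, arr_ram, load_array and sum_below are invented for illustration. The sum is 16 bits wide to match the two-byte sum variable in the assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARR_LEN 7

/* Constant copy of the data (would live in ROM) and a RAM working copy. */
static const uint8_t arr_rom[ARR_LEN] = {2, 5, 6, 16, 100, 29, 60};
static uint8_t arr_ram[ARR_LEN];

/* Load the array with a loop instead of one LDAA/STAA pair per element. */
static void load_array(uint8_t *dst, const uint8_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++) {
        dst[i] = src[i];
    }
}

/* Sum every element that is below limit (unsigned compare, like BHS/BLO). */
static uint16_t sum_below(const uint8_t *a, size_t n, uint8_t limit)
{
    uint16_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        uint8_t x = a[i];                 /* x = arr[i] */
        if (x < limit) {                  /* cmpa #50 / bhs EndIf */
            sum = (uint16_t)(sum + x);    /* 16-bit add, as adda + adca #0 */
        }
    }
    return sum;
}
```

After load_array(arr_ram, arr_rom, ARR_LEN), sum_below(arr_ram, ARR_LEN, 50) adds 2, 5, 6, 16 and 29 and skips 100 and 60.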
Hope this helps.
Edited: Forgot to initialize SUM to zero.
Edited: Unlike in the HC08, a CLRA in HC11 clears the Carry so the sequence CLRA, ADCA is wrong. Replaced with correct one: LDAA, ADCA #0

Assembly: Error when attempting to increment at array index

Here's a small snippet of assembly code (TASM) where I simply try to increment the value at the current index of the array. The idea is that the "freq" array will store a number (DWord size) that represents how many times that ASCII character was seen in the file. To keep the code short, "b" stores the current byte being read.
Declared in data segment
freq DD 256 DUP (0)
b DB ?
___________
Assume b contains current byte
mov bl, b
sub bh, bh
add bx, bx
inc freq[bx]
I receive this error at compilation time at the line containing "inc freq[bx]": ERROR Argument to operation or instruction has illegal size.
Any insight is greatly appreciated.
There is no inc that can increment a dword in 16 bit mode. You will have to synthesize it from add/adc, such as:
add freq[bx], 1
adc freq[bx + 2], 0
You might need to add a size override, such as word ptr, or change your array definition to freq DW 512 DUP (0).
Also note that you have to scale the index by 4, not 2: each dword entry is 4 bytes wide, and add bx, bx only doubles the index.
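The add/adc pair can be pictured with the following C sketch, which keeps each 32-bit counter as two 16-bit halves and propagates the carry by hand. The type counter32 and the functions bump and value are hypothetical names used only here.

```c
#include <assert.h>
#include <stdint.h>

/* One 32-bit counter stored as two 16-bit words, low word first
   (little-endian layout, as on x86). */
typedef struct {
    uint16_t lo;
    uint16_t hi;
} counter32;

static counter32 freq[256];   /* 256 entries of 4 bytes each */

/* Increment the counter for byte b using only 16-bit arithmetic. */
static void bump(uint8_t b)
{
    counter32 *c = &freq[b];  /* indexing scales b by the entry size, 4 */
    uint16_t old_lo = c->lo;
    unsigned carry;

    assert(sizeof(counter32) == 4);
    c->lo = (uint16_t)(old_lo + 1u);     /* add word ptr freq[bx], 1   */
    carry = (c->lo < old_lo);            /* low word wrapped to 0?     */
    c->hi = (uint16_t)(c->hi + carry);   /* adc word ptr freq[bx+2], 0 */
}

/* Reassemble the full 32-bit count for byte b. */
static uint32_t value(uint8_t b)
{
    return ((uint32_t)freq[b].hi << 16) | freq[b].lo;
}
```

The carry only becomes 1 when the low word wraps from 0xFFFF to 0, which is exactly the case adc handles in the assembly version.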

Storing an array into another array

I'm trying to copy array A into array N and then print the array (to test that it has worked) but all it outputs is -1
Here is my code:
ORG $1000
START: ; first instruction of program
clr.w d1
movea.w #A,a0
movea.w #N,a2
move.w #6,d2
for move.w (a0)+,(a2)+
DBRA d2,for
move.w #6,d2
loop
move.l (a2,D2),D1 ; get number from array at index D2
move.b #3,D0 ; display number in D1.L
trap #15
dbra d2,loop
SIMHALT ; halt simulator
A dc.w 2,2,3,4,5,6
N dc.l 6
END START ; last line of source
Why is -1 in the output only? If there is a better solution for this that would be very helpful
Since I don't have access to whatever assembler/simulator you're using, I can't actually test it, but here are a few things (some of which are already noted in the comments):
dc.l declares a single long, you want ds.l (or similar) to allocate storage for 6 longs
dbra branches until the operand is equal to -1, so you'll probably want to turn
move.w #loop_times, d0
loop
....
dbra d0, loop
into
move.w #loop_times-1, d0
loop
....
dbra d0, loop
(this works as long as loop_times is > 0, otherwise you'll have to check the condition before entering the loop)
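The off-by-one is easy to check with a tiny C model of a dbra-terminated loop. This is only a sketch: dbra_iterations is a made-up name, and it models just the counter behaviour (decrement, then exit when the result is -1).

```c
#include <assert.h>
#include <stdint.h>

/* Count how many times the loop body runs when the counter register
   starts at `initial` and the loop ends with dbra. dbra decrements the
   low 16 bits of the register and branches back unless the result is -1. */
static unsigned dbra_iterations(int16_t initial)
{
    unsigned runs = 0;
    int16_t d0 = initial;

    assert(initial >= 0);
    do {
        runs++;            /* loop body */
        d0--;              /* dbra: decrement ...              */
    } while (d0 != -1);    /* ... and branch unless it hit -1  */
    return runs;
}
```

Starting the counter at 6 runs the body 7 times, and starting at 5 runs it 6 times, which is why the counter is loaded with loop_times-1.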
Your display loop has a few problems: 1. On entry, a2 points past the end of the N array. 2. Even with that fixed, the way you're indexing it will cause problems: on the first pass you're trying to fetch a 4-byte long from address a2 + 6, then a long from a2 + 5...
What you want is to fetch longs from address a2 + 0, a2 + 4 .... One way of doing that:
move.w #6-1, d2 ; note the -1
movea.l #N, a2
loop
move.l (a2)+,D1 ; get next number from array
; use d1 here
dbra d2,loop
As already pointed out, your new array is only 4 bytes in size; you should change
dc.l 6 to ds.w 6
and you also process 7 elements, since DBRA counts down to -1.
Second, and that's why you get -1 everywhere, you use A2 as the pointer to the new array, but you do not reset it to point at the first word of the new array. Since you increased it by one word per element during the copy, A2 points to the first word after the array once the for loop has completed.
The fact that your simulator outputs more than one number with your display loop indicates that it does not emulate an MC68000. A real MC68000 would take a trap at "MOVE.L (A2,D2),D1" as soon as the sum A2+D2 is odd, because the 68000 does not allow W/L sized accesses to odd addresses (the MC68020 and higher do).
A cleaned-up, MC68000-compatible version could look like this:
lea A,a0
lea N,a2
moveq #5,d2
for move.w (a0)+,(a2)+
dbra d2,for
lea N,a2
moveq #5,d2
loop
move.w (a2)+,D1 ; get number (16 bits only)
ext.l d1 ; make the number 32 bits
moveq #3,D0 ; display number in D1.L
trap #15
dbra d2,loop
It probably contains some instructions you haven't encountered yet.

faster strlen?

A typical strlen() traverses from the first character until it finds \0.
This requires you to visit each and every character.
In algorithmic terms, it's O(N).
Is there any faster way to do this when the input is only vaguely defined?
For example: the length would be less than 50, or the length would be around 200 characters.
I thought of lookup blocks and the like but didn't get any optimization.
Sure. Keep track of the length while you're writing to the string.
Actually, glibc's implementation of strlen is an interesting example of the vectorization approach. It is peculiar in that it doesn't use vector instructions, but finds a way to use only ordinary instructions on 32- or 64-bit words from the buffer.
Obviously, if your string has a known minimum length, you can begin your search at that position.
Beyond that, there's not really anything you can do; if you try to do something clever and find a \0 byte, you still need to check every byte between the start of the string and that point to make sure there was no earlier \0.
That's not to say that strlen can't be optimized. It can be pipelined, and it can be made to process word-size or vector chunks with each comparison. On most architectures, some combination of these and other approaches will yield a substantial constant-factor speedup over a naive byte-comparison loop. Of course, on most mature platforms, the system strlen is already implemented using these techniques.
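As a rough sketch of the word-sized approach (not the glibc source), the following C function checks four bytes per comparison using the well-known "has a zero byte" bit trick. The names has_zero_byte and strlen_words are invented for this example. One loud assumption: reading the rest of the aligned 4-byte word that contains the terminator goes beyond what standard C guarantees, so the sketch relies on the string sitting in storage that is padded to a multiple of 4 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if any of the four bytes in w is zero. */
static int has_zero_byte(uint32_t w)
{
    return ((w - 0x01010101u) & ~w & 0x80808080u) != 0;
}

/* Word-at-a-time strlen sketch (hypothetical helper). */
static size_t strlen_words(const char *str)
{
    const char *s = str;

    assert(str != NULL);

    /* Step byte by byte until s is 4-byte aligned. */
    while (((uintptr_t)s & 3u) != 0) {
        if (*s == '\0') {
            return (size_t)(s - str);
        }
        s++;
    }

    /* Test one aligned word per iteration.
       ASSUMPTION: the bytes after the terminator, up to the end of its
       aligned word, are readable (storage padded to a multiple of 4). */
    for (;;) {
        uint32_t w;
        memcpy(&w, s, sizeof w);
        if (has_zero_byte(w)) {
            break;
        }
        s += 4;
    }

    /* The terminator is somewhere in this word: find it bytewise. */
    while (*s != '\0') {
        s++;
    }
    return (size_t)(s - str);
}
```

This is still O(N); it only cuts the number of comparisons by roughly a factor of four, which is the constant-factor speedup described above.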
Jack,
strlen works by looking for the ending '\0', here's an implementation taken from OpenBSD:
size_t
strlen(const char *str)
{
const char *s;
for (s = str; *s; ++s)
;
return (s - str);
}
Now, consider that you know the length is about 200 characters, as you said. Say you start at 200 and loop up and down for a '\0'. You've found one at 204, what does it mean? That the string is 204 chars long? NO! It could end before that with another '\0' and all you did was look out of bounds.
Get a Core i7 processor.
Core i7 comes with the SSE 4.2 instruction set. Intel added four additional vector instructions to speed up strlen and related search tasks.
Here are some interesting thoughts about the new instructions:
http://smallcode.weblogs.us/oldblog/2007/11/
The short answer: no.
The longer answer: do you really think that if there were a faster way to check string length for barebones C strings, something as commonly used as the C string library wouldn't have already incorporated it?
Without some kind of additional knowledge about a string, you have to check each character. If you're willing to maintain that additional information, you could create a struct that stores the length as a field in the struct (in addition to the actual character array/pointer for the string), in which case you could then make the length lookup constant time, but would have to update that field each time you modified the string.
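A minimal sketch of such a struct might look like this. The names counted_str, counted_init, counted_append and counted_length are made up, and the fixed 64-byte capacity is an arbitrary choice to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define COUNTED_CAP 64   /* arbitrary capacity for the sketch */

/* A string that carries its own length alongside the characters. */
typedef struct {
    size_t len;                 /* number of characters, excluding '\0' */
    char   data[COUNTED_CAP];   /* still NUL-terminated for C APIs */
} counted_str;

static void counted_init(counted_str *s)
{
    s->len = 0;
    s->data[0] = '\0';
}

/* Append text and update the stored length.
   Returns 0 on success, -1 if the text would not fit. */
static int counted_append(counted_str *s, const char *text)
{
    size_t n;

    assert(s != NULL && text != NULL);
    n = strlen(text);                     /* paid once, at modification time */
    if (n >= COUNTED_CAP - s->len) {
        return -1;                        /* no room for n chars plus '\0' */
    }
    memcpy(s->data + s->len, text, n + 1);
    s->len += n;
    return 0;
}

/* Constant-time length lookup. */
static size_t counted_length(const counted_str *s)
{
    return s->len;
}
```

The cost has not disappeared: it moves into every function that modifies the string, which must keep len in sync.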
You can try to use vectorization. I'm not sure whether the compiler will be able to do it for you, but I did it manually (using intrinsics). It will only help for long strings, though.
Use STL strings; they are safer, and the std::string class stores its length.
Here is the asm code from glibc 2.29 (I removed the snippet for ARM cpus). I tested it, and it is really fast, beyond my expectations. It merely does alignment and then 4-byte comparisons.
ENTRY(strlen)
bic r1, r0, $3 # addr of word containing first byte
ldr r2, [r1], $4 # get the first word
ands r3, r0, $3 # how many bytes are duff?
rsb r0, r3, $0 # get - that number into counter.
beq Laligned # skip into main check routine if no more
orr r2, r2, $0x000000ff # set this byte to non-zero
subs r3, r3, $1 # any more to do?
orrgt r2, r2, $0x0000ff00 # if so, set this byte
subs r3, r3, $1 # more?
orrgt r2, r2, $0x00ff0000 # then set.
Laligned: # here, we have a word in r2. Does it
tst r2, $0x000000ff # contain any zeroes?
tstne r2, $0x0000ff00 #
tstne r2, $0x00ff0000 #
tstne r2, $0xff000000 #
addne r0, r0, $4 # if not, the string is 4 bytes longer
ldrne r2, [r1], $4 # and we continue to the next word
bne Laligned #
Llastword: # drop through to here once we find a
tst r2, $0x000000ff # word that has a zero byte in it
addne r0, r0, $1 #
tstne r2, $0x0000ff00 # and add up to 3 bytes on to it
addne r0, r0, $1 #
tstne r2, $0x00ff0000 # (if first three all non-zero, 4th
addne r0, r0, $1 # must be zero)
DO_RET(lr)
END(strlen)
If you control the allocation of the string, you could make sure there is not just one terminating \0 byte, but several in a row depending on the maximum size of vector instructions for your platform. Then you could write the same O(n) algorithm using X bytes at a time comparing for 0, making strlen amortized O(n/X). Note that the amount of extra \0 bytes would not be equal to the amount of bytes on which your vector instructions operate (X), but rather 2*X - 1 since an aligned region should be filled with zeroes.
You would need to iterate over a couple of bytes normally in the beginning though, until you reach an address that is aligned to a boundary of X bytes.
The use case for this is kind of non-existent though: the number of extra bytes you need to allocate would easily be more than simply storing a 4- or 8-byte integer containing the size directly. Even if it is important to you for some reason that this string can be passed solely as a pointer, without passing its size as well, I think storing the size in the first Y bytes during allocation might be the fastest. But this is already far from the strlen optimization you're asking about.
Clarification:
the_size | the string ...
^
the pointer to the string
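The layout in the diagram could be sketched in C as follows. The names prefixed_new, prefixed_length and prefixed_free are hypothetical, and using a size_t header (so Y is sizeof(size_t)) is an assumption made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Allocate a copy of text with its length stored just before the
   characters. Returns a pointer to the characters, or NULL on failure. */
static char *prefixed_new(const char *text)
{
    size_t n;
    size_t *block;
    char *chars;

    assert(text != NULL);
    n = strlen(text);
    block = malloc(sizeof(size_t) + n + 1);
    if (block == NULL) {
        return NULL;
    }
    *block = n;                    /* the_size */
    chars = (char *)(block + 1);   /* the string starts right after it */
    memcpy(chars, text, n + 1);
    return chars;
}

/* Constant-time length: read the header that sits before the pointer. */
static size_t prefixed_length(const char *chars)
{
    size_t n;
    memcpy(&n, chars - sizeof(size_t), sizeof n);
    return n;
}

/* Free using the address of the header, not the character pointer. */
static void prefixed_free(char *chars)
{
    if (chars != NULL) {
        free(chars - sizeof(size_t));
    }
}
```

The returned pointer can still be handed to ordinary C string functions, but only pointers created by prefixed_new may be passed to prefixed_length or prefixed_free.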
The glibc implementation is way cooler.
