Assembly: Error when attempting to increment at array index - arrays

Here's a small snippet of assembly code (TASM) where I simply try to increment the value at the current index of the array. The idea is that the "freq" array will store a number (DWord size) that represents how many times that ASCII character was seen in the file. To keep the code short, "b" stores the current byte being read.
Declared in data segment
freq DD 256 DUP (0)
b DB ?
___________
Assume b contains current byte
mov bl, b
sub bh, bh
add bx, bx
inc freq[bx]
I receive this error at compilation time at the line containing "inc freq[bx]": ERROR Argument to operation or instruction has illegal size.
Any insight is greatly appreciated.

There is no inc that can increment a dword in 16 bit mode. You will have to synthesize it from add/adc, such as:
add freq[bx], 1
adc freq[bx + 2], 0
You might need to add a size override, such as word ptr or change your array definition to freq DW 512 DUP (0).
Also note that you have to scale the index by 4, not 2.

Related

Need help converting pseudo code to an 80x86 Assembly language program with the MASM assembler

I'm using Visual Studio 2012, editing the .asm file inside a windows32 solution.
This is the pseudo code that needs to be changed into assembly:
Declare a 32-bit integer array A[10] in memory
repeat
Prompt for and input user's array length L
until 0 <= L <= 10
for i := 0 to (L-1)
Prompt for and input A[i]
end for
while (First character of prompted input string for searching = 'Y' or 'y')
Prompt for and input value V to be searched
found := FALSE
for i := 0 to (L-1)
if V = A[i] then
found := TRUE
break
end if
end for
if (found) then
Display message that value was found at position i
else
Display message that value was not found
end if
end while
I can manage the inputs, loops, and jumps well enough but the things tripping me up are making the array have a length that the user inputs, and running through the array to compare the values. What would help me out the most is if someone would make and explain a segments of code to help me understand the parts I'm not getting. I've tried searching for it online but nearly everything I've come across uses a different assembler making it hard to dissect.
My code thus far is:
.586
.MODEL FLAT
INCLUDE io.h ; header file for input/output
.STACK 4096
.DATA
lengthA DWORD ?
promptL BYTE "Please enter a length of the array between 0 and 10: ", 0
string BYTE 40 DUP (?)
A DWORD 0 DUP (?)
.CODE
_MainProc PROC
Reread: input promptL, string, 40 ; read ASCII characters
atod string ; convert to integer
;while (promptL < 0 or > 10)
cmp eax, 0
jl Reread
cmp eax, 10
jg Reread
;end while
mov lengthA, eax ; store in memory
_MainProc ENDP
END ; end of source code
After checking that the user input is within range I just hit a brick wall and am not sure how to set up the array A to have the specified length or even if I declared A correctly.
You can't modify the assembly-time constant at runtime. The DUP directive is that, assembly-time directive reserving memory space by emitting the "init" value as many times, as you want, into resulting machine code. That forms a fixed executable, which is limited to that size used during assembling.
As your maximum L is 10, you can reserve array for the maximum possible size:
A DWORD 10 DUP (?)
And then later in the code you will have to fetch the [lengthA] to know how many elements are used (to make it look as dynamically resized array and to not process remaining unused part of that reserve after address A).
Other option is to reserve memory dynamically at runtime, either by calling OS service for allocation of heap memory, or reserving the array on stack. But both options are considerably more advanced than using the fixed-size memory like above.
I don't see in your pseudo code any request for dynamic memory allocation, the solution with fixed size array looks OK to me, especially if you are struggling with it already, then you are probably not ready to program your own dynamic memory manager/allocator.
EDIT: Actually the pseudo code does specify you should reserve fixed size memory, I somehow overlooked it at first read:
Declare a 32-bit integer array A[10] in memory
So example of pseudo code using the [lengthA]:
; for i := 0 to (L-1)
xor edi,edi ; index i
input_loop:
cmp edi,[lengthA]
jge input_loop_ends ; if (i >= L), end for
; TODO: Prompt for and input A[i]
; eax = input value, preserve edi (!)
mov [A + 4*edi],eax ; store value into A[i]
; end for
inc edi ; ++i
jmp input_loop
input_loop_ends:
; memory at address A contains L many DWORD values from user

Delphi Optimize IndexOf Function

Could someone help me speed up my Delphi function
To find a value in a byte array without using binary search.
I call this function thousands of times, is it possible to optimize it with assembly?
Thank you so much.
function IndexOf(const List: TArray< Byte >; const Value: byte): integer;
var
  I: integer;
begin
  for I := Low( List ) to High( List ) do begin
   if List[ I ] = Value then
    Exit ( I );
end;
  Result := -1;
end;
The length of the array is about 15 items.
Well, let's think. At first, please edit this line:
For I := Low( List ) to High( List ) do
(you forgot 'do' at the end). When we compile it without optimization, here is the assembly code for this loop:
Unit1.pas.29: If List [I] = Value then
005C5E7A 8B45FC mov eax,[ebp-$04]
005C5E7D 8B55F0 mov edx,[ebp-$10]
005C5E80 8A0410 mov al,[eax+edx]
005C5E83 3A45FB cmp al,[ebp-$05]
005C5E86 7508 jnz $005c5e90
Unit1.pas.30: Exit (I);
005C5E88 8B45F0 mov eax,[ebp-$10]
005C5E8B 8945F4 mov [ebp-$0c],eax
005C5E8E EB0F jmp $005c5e9f
005C5E90 FF45F0 inc dword ptr [ebp-$10]
Unit1.pas.28: For I := Low (List) to High (List) do
005C5E93 FF4DEC dec dword ptr [ebp-$14]
005C5E96 75E2 jnz $005c5e7a
This code is far from being optimal: local variable i is really local variable, that is: it is stored in RAM, in stack (you can see it by [ebp-$10] adresses, ebp is stack pointer).
So at each new iteration we see how we load address of array into eax register (mov eax, [ebp-$04]),
then we load i from stack into edx register (mov edx, [ebp-$10]),
then we at least load List[i] into al register which is lower byte of eax (mov al, [eax+edx])
after which compare it with argument 'Value' taken again from memory, not from register!
This implementation is extremely slow.
But let's turn optimization on at last! It's done in Project options -> compiling -> code generation. Let's look at new code:
Unit1.pas.29: If List [I] = Value then
005C5E5A 3A1408 cmp dl,[eax+ecx]
005C5E5D 7504 jnz $005c5e63
Unit1.pas.30: Exit (I);
005C5E5F 8BC1 mov eax,ecx
005C5E61 5E pop esi
005C5E62 C3 ret
005C5E63 41 inc ecx
Unit1.pas.28: For I := Low (List) to High (List) do
005C5E64 4E dec esi
005C5E65 75F3 jnz $005c5e5a
now there are just 4 lines of code which gets repeated over and over.
Value is stored inside dl register (lower byte of edx register),
address of 0-th element of array is stored in eax register,
i is stored in ecx register.
So the line 'if List[i] = Value' converts into just 1 assembly line:
005C5E5A 3A1408 cmp dl,[eax+ecx]
the next line is conditional jump, 3 lines after that are executed just once or never (it's if condition is true), and at last there is increment of i,
decrement of loop variable (it's easier to compare it with zero then with anything else)
So, there is little we can do which Delphi compiler with optimizer didn't!
If it's permitted by your program, you can try to reverse direction of search, from last element to first:
For I := High( List ) downto Low( List ) do
this way compiler will be happy to compare i with zero to indicate that we checked everything (this operation is free: when we decrement i and got zero, CPU zero flag turns on!)
But in such implementation behaviour may be different: if you have several entries = Value, you'll get not the first one, but the last one!
Another very easy thing is to declare this IndexOf function as inline: this way you'll probably have no function call here: this code will be inserted at each place where you call it. Function calls are rather slow things.
There are also some crazy methods described in Knuth how to search in simple array as fast as possible, he introduces 'dummy' last element of array which equals your 'Value', that way you don't have to check boundaries (it will alway find something before going out of range), so there is just 1 condition inside loop instead of 2. Another method is 'unrolling' of loop: you write down 2 or 3 or more iterations inside a loop, so there are less jumps per each check, but this has even more downsides: it will be beneficial only for rather large arrays while may make it even slower for arrays with 1 or 2 elements.
As others said: the biggest improvement would be to understand what kind of data you store: does it change frequently or stays the same for long time, do you look for random elements or there are some 'leaders' which gets the most attention. Must these elements be in the same order as you put them or it's allowed to rearrange them as you wish? Then you can choose data structure accordingly. If you look for some 1 or 2 same entries all the time and they can be rearranged, a simple 'Move-to-front' method would be great: you don't just return index but first move element to first place, so it will be found very quickly the next time.
If your arrays are long, you can use the x86 built in string scan REP SCAS.
It is coded in microcode and has a moderate start-up time, but it is
heavily optimized in the CPU and runs fast given long enough data structures (>= 100 bytes).
In fact on a modern CPU it frequently outperforms very clever RISC code.
If your arrays are short, then no amount of optimization of this routine will help, because then your problem is in code not shown in the question, so there is no answer I can give you.
See: http://docwiki.embarcadero.com/RADStudio/Tokyo/en/Internal_Data_Formats_(Delphi)
function IndexOf({$ifndef RunInSeperateThread} const {$endif} List: TArray<byte>; const Value: byte): integer;
//Lock the array if you run this in a separate thread.
{$ifdef CPUX64}
asm
//RCX = List
//DL = byte.
mov r8,[rcx-8] //3 - get the length ASAP.
push rdi //0 - hidden in mov r,m
mov eax,edx //0 - rename
mov rdi,rcx //0 - rename
mov rcx,r8 //0 - rename
mov rdx,r8 //0 - remember the length
//8 cycles setup
repne scasb //2n - repeat until byte found.
pop rdi //1
neg rcx //0
lea rax,[rdx+rcx] //1 result = length - bytes left.
end;
{$ENDIF}
{$ifdef CPUX86}
asm
//EAX = List
//DL = byte.
push edi
mov edi,eax
mov ecx,[eax-4] //get the length
mov eax,edx
mov edx,ecx //remember the length
repne scasb //repeat until byte found.
pop edi
neg ecx
lea eax,[edx+ecx] //result = length - bytes left.
end;
Timings
On my laptop using an array of 1KB with the target byte at the end this gives the following timings (lowest time using a 100.0000 runs)
Code | CPU cycles
| Len=1024 | Len=16
-------------------------------+----------+---------
Your code optimizations off | 5775 | 146
Your code optimizations on | 4540 | 93
X86 my code | 2726 | 60
X64 my code | 2733 | 69
The speed-up is OK (ish), but hardly worth the effort.
If your array's are short, then this code will not help you and you'll have to resort to better other options to optimize your code.
Speed up possible when using binary search
Binary search is a O(log n) operation, vs O(n) for naive search.
Using the same array this will find your data in log2(1024) * CPU cycles per search = 10 * 20 +/- 200 cycles. A 10+ times speed up over my optimized code.

variable changing value unexpectedly only when equal to zero in MASM32

I recently got arrays working in masm32, but I'm running into a very confusing snag. I have a procedure (AddValue) that accepts one argument and adds that argument to an element in an array called bfmem. Which element to affect is determined by a variable called index. However, index appears to be changing its value where i would not expect it to.
If index is greater than 0, the program behaves normally. However, if index equals 0, its value will be changed to whatever value was passed to the procedure. This is utterly baffling to me, especially as this only happens when index is set to zero. I don't know much MASM so forgive me if this is a really simple problem.
.386
.model flat,stdcall
option casemap:none
include \masm32\include\windows.inc
include \masm32\include\masm32.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\masm32.lib
includelib \masm32\lib\kernel32.lib
.data
bfmem dd 0 dup(30000)
index dd 0
_output db 10 dup(0)
.code
AddValue proc val ; adds val to bfmem[index]
invoke dwtoa, index, addr _output
invoke StdOut, addr _output ;prints '0' (as expected)
mov ecx, index ;mov index to register for arithmatic
mov eax, val ;mov val to register for arithmatic
add [ecx * 4 + bfmem], eax ;multiply index by 4 for dword
;then add to array to get the element's
;memory address, and add value
invoke dwtoa, index, addr _output
invoke StdOut, addr _output ;prints '72' (not as expected)
ret
AddValue endp
main:
mov index, 0
invoke AddValue, 72
invoke StdIn, addr _output, 1
invoke ExitProcess, 0
end main
The only thing I can think of is that the assembler is doing some kind of arithmetic optimization (noticing ecx is zero and simplifying the [ecx * 4 + bfmem] expression in some way that changes the output). If so, how can I fix this?
Any help would be appreciated.
The problem is that your declaration:
bfmem dd 0 dup(30000)
Says to allocate 0 bytes initialized with the value 30000. So when index is 0, you are overwriting the value of index (the address of index and bfmem coincide). Larger indexes you don't see the problem because you're overwriting other memory, like your output buffer. If you want to test to see that this is what's happening, try this:
bfmem dd 0 dup(30000)
index dd 0
_messg dd "Here is an output message", 13, 10, 0
Run your program with an index value of 1, 2, 3 and then display the message (_messg) using invoke StdOut.... You'll see that it overwrites parts of the message.
I assume you meant:
bfmem dd 30000 dup(0)
Which is 30000 bytes initialized to 0.

Storing an array into another array

I'm trying to copy array A into array N and then print the array (to test that it has worked) but all it outputs is -1
Here is my code:
ORG $1000
START: ; first instruction of program
clr.w d1
movea.w #A,a0
movea.w #N,a2
move.w #6,d2
for move.w (a0)+,(a2)+
DBRA d2,for
move.w #6,d2
loop
move.l (a2,D2),D1 ; get number from array at index D2
move.b #3,D0 ; display number in D1.L
trap #15
dbra d2,loop
SIMHALT ; halt simulator
A dc.w 2,2,3,4,5,6
N dc.l 6
END START ; last line of source
Why is -1 in the output only? If there is a better solution for this that would be very helpful
Since I don't have access to whatever assembler/simulator you're using, I can't actually test it, but here a few things (some of which are already noted in the comments):
dc.l declares a single long, you want ds.l (or similar) to allocate storage for 6 longs
dbra branches until the operand is equal to -1, so you'll probably want to turn
movw #loop_times, d0
loop
....
dbra d0, loop
into
movw #loop_times-1, d0
loop
....
dbra d0, loop
(this works as long as loop_times is > 0, otherwise you'll have to check the condition before entering the loop)
You display loop has a few problems: 1. On entry a2 points past the end of the N array. 2. Even fixing that, the way you're indexing it will cause problems. On the first entry you're trying to fetch a 4-byte long from address a2 + 6,then a long from a2 + 5...
What you want is to fetch longs from address a2 + 0, a2 + 4 .... One way of doing that:
move.w #6-1, d2 ; note the -1
movea.l #N, a2
loop
move.l (a2)+,D1 ; get next number from array
; use d1 here
dbra d2,loop
As already pointed out, your new array is only 4 bytes in size, you should change
dc.l 6 to ds.w 6
and also you work on 7 elements, since DBRA counts down to -1.
Second, and thats why you get -1 everywhere, you use A2 as pointer to the new array, but you do not reset it to point at the first word in new array. Since you increased it by one word per element during the copy, after the for loop has completed, A2 points to the first word after the array.
Your simulator outputting more than one number with your display loop indicates that your simulator does not emulate an MC68000, a real MC68000 would take a trap at "MOVE.L (A2,D2),D1" as soon as the sum of A2+D2 is odd - the 68000 does not allow W/L sized accesses to odd addresses (MC68020 and higher do).
A cleaned MC68000 compatible code could look like this:
lea A,a0
lea N,a2
moveq #5,d2
for move.w (a0)+,(a2)+
dbra d2,for
lea N,a2
moveq #5,d2
loop
move.w (a2)+,D1 ; get number (16 bits only)
ext.l d1 ; make the number 32 bits
moveq #3,D0 ; display number in D1.L
trap #15
dbra d2,loop
It probably contains some instructions you haven't encountered yet.

emu8086 find minimum and maximum in an array

I have to come up with an ASM code (for emu8086) that will find the minimum and maximum value in an array of any given size. In the sample code, my instructor provides (what appears to be) a data segment that contains an array named LIST. He claims that he will replace this list with other lists of different sizes, and our code must be able to handle it.
Here's the sample code below. I've highlighted the parts that I've added, just to show you that I've done my best to solve this problem:
; You may customize this and other start-up templates;
; The location of this template is c:\emu8086\inc\0_com_template.txt
org 100h
data segment
LIST DB 05H, 31H, 34H, 30H, 38H, 37H
MINIMUM DB ?
MAXIMUM DB ?
AVARAGE DB ?
**SIZE=$-OFFSET LIST**
ends
stack segment **;**
DW 128 DUP(0) **; I have NO CLUE what this is supposed to do**
ends **;**
code segment
start proc far
; set segment registers:
MOV AX,DATA **;**
MOV DS,AX **;I'm not sure what the point of this is, especially since I'm supposed to be the programmer, not my teacher.**
MOV ES,AX **;**
; add your code here
**;the number of elements in LIST is SIZE (I think)
MOV CX,SIZE ;a loop counter, I think
;find the minimum value in LIST and store it into MINIMUM
;begin loop
AGAIN1:
LEA SI,LIST
LEA DI,MINIMUM
MOV AL,[SI]
CMP AL,[SI+1]
If carry flag=1:{I got no idea}
LOOP AGAIN1
;find the maximum value in LIST and store it into MAXIMUM
;Something similar to the other loop, but this time I gotta find the max.
AGAIN2:
LEA SI,LIST
LEA DI,MINIMUM
MOV AL,[SI]
CMP AL,[SI-1] ;???
LOOP AGAIN2
**
; exit to operating system.
MOV AX,4C00H
INT 21H
start endp
ends
end start ; set entry point and stop the assembler.
ret
I'm not positive, but I think you want to move the SIZE variable immediately after the LIST variable:
data segment
LIST DB 05H, 31H, 34H, 30H, 38H, 37H
SIZE=$-OFFSET LIST
MINIMUM DB ?
MAXIMUM DB ?
AVARAGE DB ?
ends
What it does is give you the number of bytes between the current address ($) and the beginning of the LIST variable - thus giving you the size (in bytes) of the list variable itself. Because the LIST is an array of bytes, SIZE will be the actual length of the array. If LIST was an array of WORDS, you'd have to divide SIZE by two. If your teacher wrote that code then perhaps you should leave it alone.
I'm not entirely clear on why your teacher made a stack segment, I can't think of any reason to use it, but perhaps it will become clear in a future assignment. For now, you probably should know that DUP is shorthand for duplicate. This line of code:
DW 128 DUP(0)
Is allocating 128 WORDS of memory initialized to 0.
The following lines of code:
MOV AX,DATA
MOV DS,AX
MOV ES,AX
Are setting up your pointers so that you can loop through the LIST. All you need to know is that, at this point, AX points to the beginning of the data segment and therefor the beginning of your LIST.
As for the rest... it looks like you have an endless loop. What you need to do is this:
Set SI to point to the beginning of the LIST
Set CX to be the length of the LIST, you've done that
Copy the first byte from [SI] to AX
Compare AX to the memory variable MINIMUM
If AX is smaller, copy it to MINIMUM
Increment IS and decriment CX
If CX = 0 (check the ZERO flag) exit the loop, otherwise, go back to #3

Resources