Assembler array max element search - arrays

I need to write asm function in Delphi to search for max array element. So that wat I wrote.
Got few prolbems here.
First - mov ecx, len just dosen't work in right way here. Actually it replaces value in ECX but not with value in len! And if I just wirte an example mov ecx, 5 there appears 5 in ecx.
Second - i test this function on array of 5 elements (using mov ecx, 5 ofc ) it returns some strange result. I think maybe because of I do someting worng when trying to read arrays 0 element like this
mov edx, arr
lea ebx, dword ptr [edx]
But if I read it like this
lea ebx, arr
it says that operation is invalid and if I try like this
lea bx, arr
it says that sizes mismatch.
How could I solve this problem? Full code here:
program Project2;
{$APPTYPE CONSOLE}
uses
SysUtils;
Type
TMyArray = Array [0..255] Of Byte;
function randArrCreate(len:Integer):TMyArray;
var temp:TMyArray; i:Integer;
begin
Randomize;
for i:=0 to len-1 do
temp[i]:=Random(100);
Result:= temp;
end;
procedure arrLoop(arr:TMyArray; len:Integer);
var i:integer;
begin
for i:=0 to len-1 do begin
Write(' ');
Write(arr[i]);
Write(' ');
end;
end;
function arrMaxAsm(arr:TMyArray; len:integer):Word; assembler;
asm
mov edx, arr
lea ebx, dword ptr [edx]
mov ecx, len
xor ax,ax //0
mov ax, [ebx] //max
#cycle:
mov dx, [ebx]
cmp dx, ax
jg #change
jmp #cont
#change:
mov ax, dx
#cont:
inc ebx
loop #cycle
mov result, ax
end;
var massive:TMyArray; n,res:Integer;
begin
Readln(n);
massive:=randArrCreate(n);//just create random array
arrLoop(massive,n);//just to show what in it
res:=arrMaxAsm(massive, n);
Writeln(res);
Readln(n);
end.

First off, calling conventions: what data is sent to the function and where?
According to the documentation, arrays are passed as 32-bit pointers to the data, and integers are passed as values.
According to the same documentation, multiple calling conventions are supported. Unfortunately, the default one isn't documented - explicitly specifying one would be a good idea.
Based on your description that mov ecx, len doesn't work, I'm guessing the compiler used the register convention by default, and the arguments were already placed in ecx and edx, then your code went and mixed them up. You can either change your code to work with that convention, or tell the compiler to pass the arguments using the stack - use the stdcall convention. I arbitrarily picked the second option. Whichever one you pick, make sure to specify the calling convention explicitly.
Next, actual function logic.
Is there a reason why you're working with 16 bit registers instead of full 32-bit ones?
Your array contains bytes, but you're reading and comparing words.
lea ebx, dword ptr [edx] is the same as mov ebx, edx. You're just introducing another temporary variable.
You're comparing elements as if they were signed.
Modern compilers tend to implement loops without using loop.
The documentation also says that ebx needs to be preserved - because the function uses ebx, its original value needs to be saved at the start and restored afterwards.
This is how I rewrote your function (using Lazarus, because I haven't touched Delphi in about 8 years - no compiler within reach):
function arrMaxAsm(arr:TMyArray; len:integer):Word; assembler; stdcall;
asm
push ebx { save ebx }
lea edx, arr { Lazarus accepts a simple "mov edx, arr" }
mov edx, [edx] { but Delphi 7 requires this indirection }
mov ecx, len
xor ax, ax { set default max to 0 }
test ecx, ecx
jle #done { if len is <= 0, nothing to do }
movzx ax, byte ptr [edx] { read a byte, zero-extend it to a word }
{ and set it as current max }
#cont:
dec ecx
jz #done { if no elements left, return current max }
#cycle:
inc edx
movzx bx, byte ptr [edx] { read next element, zero-extend it }
cmp bx, ax { compare against current max as unsigned quantities }
jbe #cont
mov ax, bx
jmp #cont
#done:
pop ebx { restore saved ebx }
mov result, ax
end;
It might be possible to optimize it further by reorganizing the loop jumps - YMMV.
Note: this will only work correctly for byte-sized unsigned values. To adapt it to values of different size/signedness, some changes need to be made:
Data size:
Read the right amount of bytes:
movzx bx, byte ptr [edx] { byte-sized values }
mov bx, word ptr [edx] { word-sized values }
mov ebx, dword ptr [edx] { dword-sized values }
{ note that the full ebx is needed to store this value... }
Mind that this reading is done in two places. If you're dealing with dwords, you'll also need to change the result from ax to eax.
Advance over the right amount of bytes.
#cycle:
inc edx { for an array of bytes }
add edx, 2 { for an array of words }
add edx, 4 { for an array of dwords }
Dealing with signed values:
The value extension, if it's applied, needs to be changed from movzx to movsx.
The conditional jump before setting new maximum needs to be adjusted:
cmp bx, ax { compare against current max as unsigned quantities }
jbe #cont
cmp bx, ax { compare against current max as signed quantities }
jle #cont

Related

ASSEMBLY - output an array with 32 bit register vs 16 bit

I'm was working on some homework to print out an array as it's sorting some integers from an array. I have the code working fine, but decided to try using EAX instead of AL in my code and ran into errors. I can't figure out why that is. Is it possible to use EAX here at all?
; This program sorts an array of signed integers, using
; the Bubble sort algorithm. It invokes a procedure to
; print the elements of the array before, the bubble sort,
; once during each iteration of the loop, and once at the end.
INCLUDE Irvine32.inc
.data
myArray BYTE 5, 1, 4, 2, 8
;myArray DWORD 5, 1, 4, 2, 8
currentArray BYTE 'This is the value of array: ' ,0
startArray BYTE 'Starting array. ' ,0
finalArray BYTE 'Final array. ' ,0
space BYTE ' ',0 ; BYTE
.code
main PROC
MOV EAX,0 ; clearing registers, moving 0 into each, and initialize
MOV EBX,0 ; clearing registers, moving 0 into each, and initialize
MOV ECX,0 ; clearing registers, moving 0 into each, and initialize
MOV EDX,0 ; clearing registers, moving 0 into each, and initialize
PUSH EDX ; preserves the original edx register value for future writeString call
MOV EDX, OFFSET startArray ; load EDX with address of variable
CALL writeString ; print string
POP EDX ; return edx to previous stack
MOV ECX, lengthOf myArray ; load ECX with # of elements of array
DEC ECX ; decrement count by 1
L1:
PUSH ECX ; save outer loop count
MOV ESI, OFFSET myArray ; point to first value
L2:
MOV AL,[ESI] ; get array value
CMP [ESI+1], AL ; compare a pair of values
JGE L3 ; if [esi] <= [edi], don't exch
XCHG AL, [ESI+1] ; exchange the pair
MOV [ESI], AL
CALL printArray ; call printArray function
CALL crlf
L3:
INC ESI ; increment esi to the next value
LOOP L2 ; inner loop
POP ECX ; retrieve outer loop count
LOOP L1 ; else repeat outer loop
PUSH EDX ; preserves the original edx register value for future writeString call
MOV EDX, OFFSET finalArray ; load EDX with address of variable
CALL writeString ; print string
POP EDX ; return edx to previous stack
CALL printArray
L4 : ret
exit
main ENDP
printArray PROC uses ESI ECX
;myArray loop
MOV ESI, OFFSET myArray ; address of myArray
MOV ECX, LENGTHOF myArray ; loop counter (5 values within array)
PUSH EDX ; preserves the original edx register value for future writeString call
MOV EDX, OFFSET currentArray ; load EDX with address of variable
CALL writeString ; print string
POP EDX ; return edx to previous stack
L5 :
MOV AL, [ESI] ; add an integer into eax from array
CALL writeInt
PUSH EDX ; preserves the original edx register value for future writeString call
MOV EDX, OFFSET space
CALL writeString
POP EDX ; restores the original edx register value
ADD ESI, TYPE myArray ; point to next integer
LOOP L5 ; repeat until ECX = 0
CALL crlf
RET
printArray ENDP
END main
END printArray
; output:
;Starting array. This is the value of array: +1 +5 +4 +2 +8
;This is the value of array: +1 +4 +5 +2 +8
;This is the value of array: +1 +4 +2 +5 +8
;This is the value of array: +1 +2 +4 +5 +8
;Final array. This is the value of array: +1 +2 +4 +5 +8
As you can see the output sorts the array just fine from least to greatest. I was trying to see if I could move AL into EAX, but that gave me a bunch of errors. Is there a work around for this so I can use a 32 bit register and get the same output?
Using EAX is definitely possible, in fact you already are. You asked "I was trying to see if I could move AL into EAX, but that gave me a bunch of errors." Think about what that means. EAX is the extended AX register, and AL is the lower partition of AX. Take a look at this diagram:image of EAX register
. As you can see, moving AL into EAX using perhaps the MOVZX instruction would simply put the value in AL into EAX and fill zeroes in from right to left. You'd be moving AL into AL, and setting the rest of EAX to 0. You could actually move everything into EAX and run the program just the same and there'd be no difference because it's using the same part of memory.
Also, why are you pushing and popping EAX so much? The only reason to push/pop things from the runtime stack is to recover them later, but you never do that, so you can just let whatever is in EAX at the time just die.
If you still want to do an 8-bit store, you need to use an 8-bit register. (AL is an 8-bit register. IDK why you mention 16 in the title).
x86 has widening loads (movzx and movsx), but integer stores from a register operand always take a register the same width as the memory operand. i.e. the way to store the low byte of EAX is with mov [esi], al.
In printArray, you should use movzx eax, byte ptr [esi] to zero-extend into EAX. (Or movsx to sign-extend, if you want to treat your numbers as int8_t instead of uint8_t.) This avoids needing the upper 24 bits of EAX to be zeroed.
BTW, your code has a lot of unnecessary instructions. e.g.
MOV EAX,0 ; clearing registers, moving 0 into each, and initialize
totally pointless. You don't need to "init" or "declare" a register before using it for the first time, if your first usage is write-only. What you do with EDX is amusing:
MOV EDX,0 ; clearing registers, moving 0 into each, and initialize
PUSH EDX ; preserves the original edx register value for future writeString call
MOV EDX, OFFSET startArray ; load EDX with address of variable
CALL writeString ; print string
POP EDX ; return edx to previous stack
"Caller-saved" registers only have to be saved if you actually want the old value. I prefer the terms "call-preserved" and "call-clobbered". If writeString destroys its input register, then EDX holds an unknown value after the function returns, but that's fine. You didn't need the value anyway. (Actually I think Irvine32 functions at most destroy EAX.)
In this case, the previous instruction only zeroed the register (inefficiently). That whole block could be:
MOV EDX, OFFSET startArray ; load EDX with address of variable
CALL writeString ; print string
xor edx,edx ; edx = 0
Actually you should omit the xor-zeroing too, because you don't need it to be zeroed. You're not using it as counter in a loop or anything, all the other uses are write-only.
Also note that XCHG with memory has an implicit lock prefix, so it does the read-modify-write atomically (making it much slower than separate mov instructions to load and store).
You could load a pair of bytes using movzx eax, word ptr [esi] and use a branch to decide whether to rol ax, 8 to swap them or not. But store-forwarding stalls from byte stores forwarding to word loads isn't great either.
Anyway, this is getting way off topic from the title question, and this isn't codereview.SE.

Arrays in MASM Assembly (very confused beginner)

I have a pretty basic question:
How do you populate arrays in assembly? In high level programming languages you can use a for-loop to set a value to each index, but I'm not sure of how to accomplish the same thing assembly. I know this is wrong, but this is what I have:
ExitProcess PROTO
.data
warray WORD 1,2,3,4
darray DWORD ?
.code
main PROC
mov edi, OFFSET warray
mov esi, OFFSET darray
mov ecx, LENGTHOF warray
L1:
mov ax, [edi] ;i want to move a number from warray to ax
movzx esi,ax ;i want to move that number into darray...
add edi, TYPE warray ;this points to the next number?
loop L1
call ExitProcess
main ENDP
END
Each time the loop runs, ax will be overwritten with the value of the array's index, right? Instead how do I populate darray with the array elements from warray? Any help would be very much appreciated...I'm pretty confused.
There are more than one way to populate an array and your code is almost working. One way is to use counter in the indirect address so you don't have to modify destination and source array pointers each loop:
ExitProcess PROTO
.data
warray WORD 1,2,3,4
darray DWORD 4 dup (?) ; 4 elements
.code
main PROC
mov edi, OFFSET warray
mov esi, OFFSET darray
xor ecx, ecx ; clear counter
L1:
mov ax, [edi + ecx * 2] ; get number from warray
movzx [esi + ecx * 4], ax ; move number to darray
inc ecx ; increment counter
cmp ecx, LENGTHOF warray
jne L1
call ExitProcess
main ENDP
END
Of course this code could be modified to fill the array backwards to possibly save couple of bytes like you probably meant to do in your original code. Here is another way that has more compact loop:
ExitProcess PROTO
.data
warray WORD 1,2,3,4
darray DWORD 4 dup (?) ; 4 elements
.code
main PROC
mov edi, OFFSET warray
mov esi, OFFSET darray
mov ecx, LENGTHOF warray - 1 ; start from end of array
L1:
mov ax, [edi + ecx * 2] ; get number from warray
movzx [esi + ecx * 4], ax ; move number to darray
loop L1
; get and set element zero separately because loop terminates on ecx = 0:
mov ax, [edi]
movzx [esi], ax
call ExitProcess
main ENDP
END
You should also note that when working with arrays of the same type you can do simple copy very efficiently using repeat prefix with instructions like MOVSD:
ExitProcess PROTO
.data
array1 DWORD 1,2,3,4
array2 DWORD 4 dup (?)
.code
main PROC
mov esi, OFFSET array1 ; source pointer in esi
mov edi, OFFSET array2 ; destination in edi
mov ecx, LENGTHOF array1 ; number of dwords to copy
cld ; clear direction flag so that pointers are increasing
rep movsd ; copy ecx dwords
call ExitProcess
main ENDP
END
You're probably not "supposed to know" this, but anyway, there were (way back when) an instruction and an instruction prefix that were made to do exactly this.
Take a look here at this Microsoft page: HERE (click on it)
On that page scroll down until you find this phrase...
"...These instructions are remnants of the x86's CISC heritage and in recent processors are actually slower than the equivalent instructions written out the long way...."
What you do is...
Put the size of the array in Ecx
Point Edi at the start of the arry
Use the appropriate string instruction to populate it
The syntax (Masm/Tasm/etc.) will probably look something like this...
Mov Ecx, The_Length_Of_The_Array ;Figure this out somehow
Lea Edi, The_Target_You_Want_To_Fill ;Define this somewhere
Now, if you want to copy from one place to another, do this...
Lea Esi, The_Source_You_Want_To_Copy ;Whatever, define it
Cld ;This is the direction flag, make it inc
Rep Movsb ;Movsb means move byte for byte
Now, if you want to stuff the same value in each byte in the arrray do this...
Mov AL, The_Value_You_Want_To_Stuff ;Define this to your liking
Cld ;This is the direction flag, make it inc
Rep Stosb ;Stosb means store AL into each byte
Again, these instructions are, for reasons others will elucidate, not cool anymore and if you use them you will get cooties or something.
There are also string instructions for comparison, "Scanning", "Loading", and so on. They were once quite useful (and still are, but the "modern" gang today won't admit it) particularly with the Rep prefix added to them.
If this helps, but you need more detail, feel free to ask.

Implementing a toupper function in x86 assembly

I'm playing around with x86 assembly in VS 2012 trying to convert some old code I have to assembly. The problem I'm having is accessing and changing array values (the values are characters) and I'm not sure how to go about it. I've included comments so you can see my thought process
void toUpper(char *string) {
__asm{
PUSH EAX
PUSH EBX
PUSH ECX
PUSH EDX
PUSH ESI
PUSH EDI
MOV EBX, string
MOV ECX, 0 // counter
FOR_EXPR: // for loop
CMP EBX, 0 //compare ebx to 0
JLE END_FOR // if ebx == 0, jump to end_for
CMP EBX, 97 // compare ebx to 97
JL ELSE // if ebx < 97, jump else
CMP EBX, 122 // compare ebx to 122
JG ELSE // if ebx > 122, jump else
// subtract 32 from current array value
// jump to next element
JMP END_IF
ELSE:
// jump to next element
END_IF:
JMP FOR_EXPR
END_FOR:
POP EDI
POP ESI
POP EDX
POP ECX
POP EBX
POP EAX
}
}
Any help is much appreciated!
Looks to me like the basic problem is that you're loading EBX with the address of the string, but then trying to use it as if it contained a byte of data from inside the string.
I'd probably do things a bit differently. I'd probably load the address of the string into ESI and use it to read the contents of the string indirectly.
mov esi, string
next_char:
lodsb
test al, al ; check for end of string
jz done
cmp al, 'a' ; ignore unless in range
bl next_char
cmp al, 'z'
bg next_char
sub al, 'a'-'A' ; convert to upper case
mov [esi-1], al ; write back to string
jmp next_char
You can use EBX for that instead of ESI, but ESI is a lot more idiomatic. There are also some tricks you could use to optimize this a little, but until you understand the basics, they'd mostly add confusion. With a modern processor, they probably wouldn't make much difference anyway--this is likely to run as fast as your bandwidth to memory anyway.

x86 Assembly Lowercase unhandled exception [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
x86 convert to lower case assembly
This program is to convert a 2d char array into lower case
Quickie Edit: I'm using Visual Studio 2010
int b_search (char list[100][20], int count, char* token)
{
__asm
{
mov eax, 0 ; zero out the result
mov esi, list ; move the list pointer to ESI
mov ebx, count ; move the count into EBX
mov edi, token ; move the token to search for into EDI
MOV ecx, 0
LOWERCASE_TOKEN: ;lowercase the token
OR [edi], 20h
INC ecx
CMP [edi+ecx],0
JNZ LOWERCASE_TOKEN
MOV ecx, 0
At my OR instruction, where I'm trying to change the register that contains the address to token into all lower case, I keep getting unhandled exception...access violation, and without the brackets nothing gets lowercased. Later in my code I have
LOWERCASE_ARRAY: ;for(edi = 0, edi<ebx; edi++), loops through each name
CMP ecx, ebx
JGE COMPARE
INC ecx ;ecx++
MOV edx, 0; ;edx = 0
LOWERCASE_STRING: ;while next char != 0, loop through each byte to convert to lower case
OR [esi+edx],20h ;change to lower case
INC edx
CMP [esi+edx],0 ;if [esi+edx] not zero, loop again
JNZ LOWERCASE_STRING
JMP LOWERCASE_ARRAY ;jump back to start case change of next name
and the OR instruction there seems to work perfectly so I don't know why the first won't work. Also, I am trying to convert several strings.
After I finish one string, any ideas how I would go about going to the next string (as in list[1][x], list[2][x], etc...) I tried adding 20 as in [esi+20*ecx+edi] but that doesn't work. Can I get advice on how to proceed?
One possibility:
If parameters of procedure b_search are stored as registers (register calling convention) then you override list pointer in your first asm line, because eax point to the list array:
mov eax, 0 ; zero out the result
Because:
mov esi, list ; move the list pointer to ESI
should be converted to:
mov esi, eax
Try to exchange first and second line to:
mov esi, list ; move the list pointer to ESI
mov eax, 0 ; zero out the result

x86 convert to lower case assembly

This program is to convert a char pointer into lower case. I'm using Visual Studio 2010.
This is from another question, but much simpler to read and more direct to the point.
int b_search (char* token)
{
__asm
{
mov eax, 0 ; zero out the result
mov edi, [token] ; move the token to search for into EDI
MOV ecx, 0
LOWERCASE_TOKEN: ;lowercase the token
OR [edi], 20h
INC ecx
CMP [edi+ecx],0
JNZ LOWERCASE_TOKEN
MOV ecx, 0
At my OR instruction, where I'm trying to change the register that contains the address to token into all lower case, I keep getting unhandled exception...access violation, and without the brackets nothing, I don't get errors but nothing gets lowercased. Any advice?
This is part of some bigger code from another question, but I broke it down because I needed this solution only.
Your code can alter only the first char (or [edi], 20h) - the EDI does not increment.
EDIT: found this thread with workaround. Try using the 'dl' instead of al.
; move the token address to search for into EDI
; (not the *token, as would be with mov edi, [token])
mov edi, token
LOWERCASE_TOKEN: ;lowercase the token
mov al, [edi]
; check for null-terminator here !
cmp al, 0
je GET_OUT
or al, 20h
mov dl, al
mov [edi], dl
inc edi
jmp LOWERCASE_TOKEN
GET_OUT:
I would load the data into a register, manipulate it there, then store the result back to memory.
int make_lower(char* token) {
__asm {
mov edi, token
jmp short start_loop
top_loop:
or al, 20h
mov [edi], al
inc edi
start_loop:
mov al, [edi]
test al, al
jnz top_loop
}
}
Note, however, that your conversion to upper-case is somewhat flawed. For example, if the input contains any control characters, it will change them to something else -- but they aren't upper case, and what it converts them to won't be lower case.
The problem is, that the OR operator like many others don't allow two memory or constant parameters. That means: The OR operator can only have following parameters:
OR register, memory
OR register, register
OR register, constant
The second problem is, that the OR has to store the result to a register, not to memory.
Thats why you get an access violation, when the brackets are set. If you remove the brackets, the parameters are ok, but you don't write your lowercase letter to memory, what you intend to do. So use another register, to copy the letter to, and then use OR.
For example:
mov eax, 0 ; zero out the result
mov edi, [token] ; move the token to search for into EDI
MOV ecx, 0
LOWERCASE_TOKEN: ;lowercase the token
MOV ebx, [edi] ;## Copy the value to another register ##
OR ebx, 20h ;## and compare now the register and the memory ##
MOV [edi], ebx ;##Save back the result ##
INC ecx
CMP [edi+ecx],0
JNZ LOWERCASE_TOKEN
MOV ecx, 0
That should work^^

Resources