Assembly Addressing mode

Assembly Addressing mode - arrays

This is the code:
section .data
v dw 4, 6, 8, 12
len equ 4
section .text
global main
main:
mov eax, 0 ;this is i
mov ebx, 0 ;this is j
cycle:
cmp eax, 2 ;i < len/2
jge exit
mov ebx, 0
jmp inner_cycle
continue:
inc eax
jmp cycle
inner_cycle:
cmp ebx, 2
jge continue
mov di, [v + eax * 2 * 2 + ebx * 2]
inc ebx
jmp inner_cycle
exit:
push dword 0
mov eax, 0
sub esp, 4
int 0x80
I'm using an array and scanning it as a matrix, this is the C translation of the above code
int m[4] = {1,2,3,4};
for(i = 0; i < 2; i++){
for(j = 0; j < 2; j++){
printf("%d\n", m[i*2 + j]);
}
}
When I try to compile the assembly code I get this error:
DoubleForMatrix.asm:20: error: invalid effective address
which refers to this line
mov di, [v + eax * 2 * 2 + ebx * 2]
Can someone explain what is wrong with this line? I thought it was because of the register sizes, so I tried
mov edi, [v + eax * 2 * 2 + ebx * 2]
but I've got the same error.
This is assembly for Mac OS X; to make it work on another OS you have to change the exit syscall.

You can't use arbitrary expressions in an effective address. Only a few addressing modes are allowed.
Basically, the most complex form is base register + index register*scale + constant displacement, with scale 1, 2, 4 or 8.
Of course constants (like 2*2) will probably be folded to 4, so that counts as a single scale with 4 (not as two multiplications)
Your example tries to do two multiplies at once.
Solution: insert an extra LEA instruction to calculate v+ebx*2 and use the result in the mov.
lea regx , [v+ebx*2]
mov edi, [eax*2*2+regx]
where regx is a free register.

The SIB (Scale-Index-Base) addressing mode takes only one scale factor (1, 2, 4 or 8), applied to exactly one register (the index).
The proposed solution is to premultiply eax by 4 (also has to modify the comparison). Then inc eax can be replaced with add eax,4 and the illegal instruction by mov di,[v+eax+ebx*2]
A higher level optimization would be just to for (i=0;i<4;i++) printf("%d\n",m[i]);
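As a rough C model of the premultiplied form [v+eax+ebx*2], the sketch below computes the same byte offset in two parts: a row offset that advances by 4 per row (the role of eax) and a single scaled column index (ebx*2). The function name load_word and the fixed two-column layout are assumptions made for illustration, not part of the original code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reads element (i, j) of a flat array of 16-bit
   words viewed as rows of 2 columns, mirroring mov di, [v + eax + ebx*2]
   where eax has already been multiplied by the row size in bytes. */
uint16_t load_word(const uint16_t *v, size_t i, size_t j)
{
    assert(v != NULL);
    const unsigned char *base = (const unsigned char *)v;
    size_t row_offset = i * 4;      /* premultiplied eax: 2 columns * 2 bytes */
    size_t col_offset = j * 2;      /* the one scaled index the CPU allows: ebx*2 */
    return *(const uint16_t *)(base + row_offset + col_offset);
}
```

With v holding 4, 6, 8, 12, the four (i, j) pairs visit the elements in the same order as m[i*2 + j] in the C translation.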

Related

Modify C array in external assembler routine

Summary: We have an int variable and 4 double arrays in C, 2 of which hold input data and 2 of which we want to write output data to. We pass the variable and arrays to a function in an external .asm file, where the input data is used to determine output data and the output data is written to the output arrays.
Our problem is, that the output arrays remain seemingly untouched after the assembly routine finishes its work. We don't even know, if the routine reads the correct input data. Where did we go wrong?
We compile with the following commands
nasm -f elf32 -o calculation.o calculation.asm
gcc -m32 -o programm main.c calculation.o
If you need any more information, feel free to ask.
C code:
// before int main()
extern void calculate(int32_t counter, double radius[], double distance[], double paper[], double china[]) asm("calculate");
// in int main()
double radius[counter];
double distance[counter];
// [..] Write Input Data to radius & distance [...]
double paper[counter];
double china[counter];
for(int i = 0; i < counter; i++) {
paper[i] = 0;
china[i] = 0;
}
calculate(counter, radius, distance, paper, china);
// here we expect paper[] and china[] to contain output data
Our Assembly code currently only takes in the values, puts them into the FPU, then places them into the output array.
x86 Assembly (Intel Syntax) (I know, this code looks horrible, but we're beginners, so bear with us, please; Also I can't get syntax highlighting to work correctly for this one):
BITS 32
GLOBAL calculate
calculate:
SECTION .BSS
; declare all variables
pRadius: DD 0
pDistance: DD 0
pPaper: DD 0
pChina: DD 0
numItems: DD 0
counter: DD 0
; populate them
POP DWORD [numItems]
POP DWORD [pRadius]
POP DWORD [pDistance]
POP DWORD [pPaper]
POP DWORD [pChina]
SECTION .TEXT
PUSH EBX ; because of cdecl calling convention
JMP calcLoopCond
calcLoop:
; get input array element
MOV EBX, [counter]
MOV EAX, [pDistance]
; move it into fpu, then move it to output
FLD QWORD [EAX + EBX * 8]
MOV EAX, [pPaper]
FSTP QWORD [EAX + EBX * 8]
; same for the second one
MOV EAX, [pRadius]
FLD QWORD [EAX + EBX * 8]
MOV EAX, [pChina]
FSTP QWORD [EAX + EBX * 8]
INC EBX
MOV [counter], EBX
calcLoopCond:
MOV EAX, [counter]
MOV EBX, [numItems]
CMP EAX, EBX
JNZ calcLoop
POP EBX
RET

There are a couple of problems in the assembler routine. The POP instructions are emitted into the .bss section, so they are never executed. In the sequence of POPs, the return address (pushed by the caller) is not accounted for. Depending on the ABI, you must leave the arguments on the stack anyway. Because the POPs are never executed, counter and numItems both stay 0, so the loop exit condition is true right away and the loop body never runs.
And you really should not use global variables this way. Instead, create a stack frame and use that.

Thanks to all your answers and comments and some heavy research, we managed to finally produce functioning code, which now properly uses stack frames and fulfills the cdecl calling convention:
BITS 32
GLOBAL calculate
SECTION .DATA
electricFieldConstant DQ 8.85e-12
permittivityPaper DQ 3.7
permittivityChina DQ 7.0
SECTION .TEXT
calculate:
PUSH EBP
MOV EBP, ESP
PUSH EBX
PUSH ESI
PUSH EDI
MOV ECX, 0 ; counter for loop
JMP calcLoopCond
calcLoop:
MOV EAX, [EBP + 12]
FLD QWORD [EAX + ECX * 8]
MOV EAX, [EBP + 20]
FSTP QWORD [EAX + ECX * 8]
MOV EAX, [EBP + 16]
FLD QWORD [EAX + ECX * 8]
MOV EAX, [EBP + 24]
FSTP QWORD [EAX + ECX * 8]
ADD ECX, 1 ; increase loop counter
calcLoopCond:
MOV EDX, [EBP + 8]
CMP ECX, EDX
JNZ calcLoop
POP EDI
POP ESI
POP EBX
MOV ESP, EBP
POP EBP
RET
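For checking the output of the routine above, a plain C reference can help. This is only a sketch: the name calculate_ref is hypothetical, and the mapping follows the stack offsets in the listing ([EBP + 12] is radius, [EBP + 16] is distance, [EBP + 20] is paper, [EBP + 24] is china), so radius is copied to paper and distance to china.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C reference for the assembly routine above.
   Copies radius[] into paper[] and distance[] into china[]. */
void calculate_ref(int32_t counter, const double radius[], const double distance[],
                   double paper[], double china[])
{
    assert(counter >= 0);   /* the assembly loops until ECX == counter */
    for (int32_t i = 0; i != counter; i++) {
        paper[i] = radius[i];     /* FLD [radius + i*8] / FSTP [paper + i*8] */
        china[i] = distance[i];   /* FLD [distance + i*8] / FSTP [china + i*8] */
    }
}
```

Calling both versions on the same input and comparing the output arrays element by element is one way to confirm that the assembly reads the arguments it is expected to read.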

how do i simplify/condense the code (if possible)? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I'd like to present my program in C and the assembler code that goes with it. I've also got some questions.
here is a piece of code in c
#include <stdio.h>
void podaj_znak(int tab[], int n);
int main()
{
int tab[7] = {4, 5, 6, 2, -80, 0, 56};
printf("Przed: ");
for (int i = 0; i < 7; i++)
printf("%d ", tab[i]);
printf("\n");
podaj_znak(tab, 7);
printf("Po: %d %d %d %d %d %d %d", tab[0], tab[1], tab[2], tab[3], tab[4], tab[5], tab[6]);
printf("\n");
return 0;
}
and asm right here
.686
.model flat
public _podaj_znak
.code
_podaj_znak PROC
push ebp
mov ebp, esp
mov edx, [ebp+8]
mov ecx, [ebp+12]
ptl:
mov eax, [edx]
cmp eax, 0
jl minus
ja plus
mov ebx, 0
mov [edx], ebx
jmp dalej
minus: mov ebx, -1
mov [edx], ebx
jmp dalej
plus: mov ebx, 1
mov [edx], ebx
jmp dalej
dalej: add edx, 4
sub ecx, 1
jnz ptl
pop ebp
ret
_podaj_znak ENDP
END
my question is, how can I simplify/condense the code?
Edit: here is what the program does and what I would like to change. It is just practice for me, to get used to assembler. The program takes numbers from -inf to inf: a value equal to 0 stays as it is, a value less than 0 is replaced by -1, and a value greater than 0 is replaced by 1. I would like to optimize the assembler code somehow, but I don't know whether it is even possible to condense it.

Not really a good fit for this forum, but still:
For the C code, I'd create a PrintTab function that accepts tab and count and prints the table. Then invoke it both before and after the podaj_znak call.
For the asm code:
PLEASE add comments. I know this is probably just a class project, but still, get in the habit.
Why move [edx] to eax instead of just cmp [edx],0?
If perf matters, perhaps skip prolog/epilog and use a 'fastcall' calling convention.
Why repeat "mov [edx], ebx" for each case? Move it down to dalej.
As a 'trick' you might try checking for -1, but then handle the other 2 cases with setnz.

nasm syntax, may need subtle fixing for other asm, my solution:
; converts values in tab into [-1, 0, 1] as sgn()
; arguments: two on stack(int tab[], int n)
; modified registers: esi, edi, eax, ebx
; "no branch" version (except loop itself)
_podaj_znak:
mov esi,[esp+4] ; tab ptr
mov eax,[esp+8] ; count
xor ebx,ebx
lea edi,[esi+eax*4] ; tab.end() ptr
sgn_loop:
lodsd ; eax = [ds:esi], esi += 4
; change eax to [-1, 0, 1] by sgn(eax)
test eax,eax
setnz bl
sar eax,31
or eax,ebx
; overwrite original value with sgn() result
cmp esi,edi ; test if end of tab was reached
mov [esi-4],eax
jb sgn_loop
ret
Out of curiosity I also searched the Internet and found a 3-instruction version (mine takes 4); only the loop part is different:
...
; modifies also edx in this variant
sgn_loop:
lodsd ; eax = [ds:esi], esi += 4
; set edx to [-1, 0, 1] by sgn(eax)
cdq
cmp edx,eax
adc edx,ebx
; overwrite original value with sgn() result
cmp esi,edi
mov [esi-4],edx
jb sgn_loop
ret
Both variants are branch-less, so they should have superior performance to any branch variant (but I'm not going to profile it).
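To see why both instruction sequences produce -1, 0 or 1, here is a small C model of each. It is a sketch only: the function names are made up for illustration, and the conditional expressions stand in for sar/cdq (a compiler may or may not turn them into branch-free code).

```c
#include <assert.h>
#include <stdint.h>

/* Model of: test eax,eax / setnz bl / sar eax,31 / or eax,ebx */
int32_t sgn_setnz(int32_t x)
{
    int32_t nonzero = (x != 0);         /* setnz bl: 1 unless x == 0 */
    int32_t mask = (x < 0) ? -1 : 0;    /* sar eax,31: all ones if negative */
    return mask | nonzero;              /* -1 | 1 == -1, 0 | 1 == 1, 0 | 0 == 0 */
}

/* Model of: cdq / cmp edx,eax / adc edx,ebx (with ebx == 0) */
int32_t sgn_adc(int32_t x)
{
    int32_t edx = (x < 0) ? -1 : 0;                  /* cdq: sign of eax fills edx */
    int32_t carry = (uint32_t)edx < (uint32_t)x;     /* cmp sets CF on unsigned borrow */
    return edx + carry;                              /* adc adds the carry flag */
}

/* Hypothetical C counterpart of _podaj_znak using the first model. */
void podaj_znak_ref(int32_t tab[], int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        tab[i] = sgn_setnz(tab[i]);
}
```

For the array from the question (4, 5, 6, 2, -80, 0, 56), podaj_znak_ref leaves 1, 1, 1, 1, -1, 0, 1.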

The assembly can be optimized a little by executing mov [edx], ebx in only one place, as follows:
ptl:
mov eax, [edx]
cmp eax, 0
jl minus
ja plus
mov ebx, 0 ; only set to 0
jmp dalej
minus: mov ebx, -1 ; only set to -1
jmp dalej
plus: mov ebx, 1 ; only set to 1
jmp dalej
dalej: mov [edx], ebx ; update the array[edx]
add edx, 4
sub ecx, 1
jnz ptl

Traversing a 2-d array

I'm having trouble understanding how to traverse a 2-d array in x86 assembly language. I am missing a bit of understanding. This is what I have so far.
The issue is the lines with //offset and //moving through array
For the //offset line the error I am getting is "non constant expression in second operand"
and also
"ebx: illegal register in second operand"
For the next line I get the error
"edx: illegal register in second operand"
mov esi, dim
mul esi
mov eax, 0 // row index
mov ebx, 0 // column index
mov ecx, dim
StartFor:
cmp ecx, esi
jge EndFor
lea edi, image;
mov edx, [eax *dim + ebx] // offset
mov dl, byte ptr [edi + esi*edx] // moving through array
mov edx, edi
and edx, 0x80
cmp edx, 0x00
jne OverThreshold
mov edx, 0xFF
OverThreshold:
mov edx, 0x0

See x86 tag wiki, including the list of addressing modes.
You can scale an index register by a constant, but you can't multiply two registers in an addressing mode. You'll have to do that yourself (e.g. with imul edx, esi) if the number of columns isn't a compile time constant. If it's a power-of-2 constant, you can shift, or even use a scaled addressing mode like [reg + reg*8].
re: edit: *dim should work if dim is defined with something like dim equ 8. If it's a memory location holding a value, then of course it's not going to work. The scale factor can be 1, 2, 4, or 8. (The machine code format has room for a 2-bit shift count, which is why the options are limited.)
I'd also recommend loading with movzx to zero-extend a byte into edx, instead of only writing dl (the low byte). Actually nvm, your code doesn't need that. In fact, you overwrite the value you loaded with edi. I assume that's a bug.
You can replace
imul edx, esi
mov dl, byte ptr [edi + edx] ; note the different addressing mode
mov edx, edi ; bug? throw away the value you just loaded
and edx, 0x80 ; AND already sets flags according to the result
cmp edx, 0x00 ; so this is redundant
jne OverThreshold
with
imul edx, esi
test byte ptr [edi + edx], 0x80 ; AND, discard the result and set flags.
jnz OverThreshold
Of course, instead of multiplying inside the inner loop, you can just add the columns in the outer loop. This is called Strength Reduction. So you do p+=1 along each row, and p+=cols to go from row to row. Or if you don't need to care about rows and columns, you can just iterate over the flat memory of the 2D array.
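The strength-reduced traversal can be sketched in C as follows. This is an illustration under assumptions: the function name count_high_bit and the idea of counting the bytes whose 0x80 bit is set are not from the question, they just give the loop something observable to do.

```c
#include <assert.h>
#include <stddef.h>

/* Counts the bytes with bit 0x80 set in a rows x cols byte image,
   without any multiplication inside the loops. */
size_t count_high_bit(const unsigned char *image, size_t rows, size_t cols)
{
    assert(image != NULL);
    size_t count = 0;
    const unsigned char *row = image;
    for (size_t r = 0; r < rows; r++) {
        const unsigned char *p = row;
        for (size_t c = 0; c < cols; c++) {
            if (*p & 0x80)      /* same check as test byte ptr [p], 0x80 */
                count++;
            p += 1;             /* next column */
        }
        row += cols;            /* next row: add the column count once per row */
    }
    return count;
}
```

Since the rows are contiguous, the two loops could also be collapsed into one pass over rows * cols bytes when the row and column numbers are not needed.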

A 2-dimensional array is just an interpretation of a sequence of bytes. You'll have to choose in which order to store the items. For example, you might choose "row-major order".
I've written a demo where a buffer is filled with a sequence of numbers. Then the sequence is interpreted as single- and two-dimensional array.
tx86.s
%define ITEM_SIZE 4
extern printf
section .bss
cols: equ 3
rows: equ 4
buf: resd cols * rows
c: resd 1
r: resd 1
section .data
fmt: db "%-4d", 0 ; fmt string, '0'
fmt_nl: db 10, 0 ; "\n", '0'
section .text ; Code section.
global main
main:
push ebp
mov ebp, esp
; fill buf
mov ecx, cols * rows - 1
; the loop below stores items 11 down to 1; buf[0] keeps the 0 it gets from .bss
.fill_buf:
mov [buf + ecx * ITEM_SIZE], ecx
dec ecx
jnz .fill_buf
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; buf as 1-dimensional array
; buf[c] = [buf + c * ITEM_SIZE]
xor ecx, ecx
mov [c], ecx
.lp1d:
mov ecx, [c]
push dword [buf + ecx * ITEM_SIZE]
push dword fmt
call printf
mov ecx, [c]
inc ecx
mov [c], ecx
cmp ecx, cols * rows
jl .lp1d
; print new line
push dword fmt_nl
call printf
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; buf as 2-dimensional array
; buf[r][c] = [buf + (r * cols + c) * ITEM_SIZE]
xor ecx, ecx
mov [r], ecx
.lp1:
xor ecx, ecx
mov [c], ecx
.lp2:
; calculate address
mov eax, [r]
mov edx, cols
mul edx ; eax = r * cols
add eax, [c] ; eax = r * cols + c
; print buf[r][c]
push dword [buf + eax * ITEM_SIZE]
push dword fmt
call printf
; next column
mov ecx, [c]
inc ecx
mov [c], ecx
cmp ecx, cols
jl .lp2
; print new line
push dword fmt_nl
call printf
; next row
mov ecx, [r]
inc ecx
mov [r], ecx
cmp ecx, rows
jl .lp1
mov esp, ebp
pop ebp ; restore stack
xor eax, eax ; normal exit code
ret
Building (on Linux)
nasm -f elf32 -l tx86.lst tx86.s
gcc -Wall -g -O0 -m32 -o tx86 tx86.o
Running
./tx86
0 1 2 3 4 5 6 7 8 9 10 11
0 1 2
3 4 5
6 7 8
9 10 11
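The row-major address calculation used in the demo, buf[r][c] = [buf + (r * cols + c) * ITEM_SIZE], looks like this in C. The helper names are hypothetical; the multiplication by the item size is done implicitly by C array indexing.

```c
#include <assert.h>
#include <stddef.h>

/* Flat index of element (r, c) in a row-major array with `cols` columns. */
size_t row_major_index(size_t r, size_t c, size_t cols)
{
    assert(c < cols);
    return r * cols + c;
}

/* Fills buf with 0, 1, 2, ... like the .fill_buf loop in the demo. */
void fill_sequence(int *buf, size_t count)
{
    assert(buf != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        buf[i] = (int)i;
}
```

With cols = 3 and rows = 4 as in the demo, element (2, 1) is at flat index 7, which matches the third row of the printed output (6 7 8).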

Bubble sort not working with local array in assembly language

I'm trying to create a bubble sort with assembly. I've done so successfully with nearly identical code, only now I'm passing in a LOCAL array instead of one defined in the .data section. Everything runs, but there seems to be no switching.
Here is the code
start
call main
exit
main PROC
LOCAL numArray[5]:DWORD ; Create a local array for the numbers
mov [numArray + 0], 5
mov [numArray + 4], 4
mov [numArray + 8], 3
mov [numArray + 12], 2
mov [numArray + 16], 1
push numArray
call BubbleSort
ret
main ENDP
array EQU [ebp + 8]
FLAG EQU DWORD PTR [ebp - 4]
BubbleSort PROC
enter 4, 0 ; Enter with one int local memory (for flag)
outer_bubble_loop:
mov ecx, 1 ; Set the count to 1
mov FLAG, 0 ; And clear the flag (detects if anything changes)
inner_bubble_loop: ; Loop through all values
mov ebx, [array + ecx * 4 - 4] ; Move the (n - 1)th index to ebx
cmp ebx, [array + ecx * 4] ; Compare ebx against the (n)th index
jle end_loop ; If the result was less than or equal, skip the swapping part
mov ebx, [array + ecx * 4] ; Move (n) into ebx
mov edx, [array + ecx * 4 - 4] ; Move (n - 1) into edx
mov [array + ecx * 4], edx ; Move (n - 1) into n
mov [array + ecx * 4 - 4], ebx ; Move (n) into (n - 1)
mov FLAG, 1 ; Set the changed flag
end_loop: ; End loop label
inc ecx ; Increase the count
cmp ecx, NDATES ; Check if we've made it to the end yet
jl inner_bubble_loop ; If not, repeat the inner loop
cmp FLAG, 0 ; Check if we changed anything
je loop_end ; If we didn't, go to the end
jmp outer_bubble_loop ; (Else) Jump to the beginning of the loop
loop_end: ; Loop end label
leave
ret
BubbleSort ENDP
My output is, strangely:
4
5
5
2
1
If I use a different data set, it doesn't do the duplication, but things still aren't moved.
Where am I going wrong with this?

; push numArray
lea eax, numArray
push eax
call BubbleSort
...
... unless I'm mistaken...
Edit: Ahhh... worse than that. I think you're going to have to "dereference" it in BubbleSort, too.
mov edx, array ; [ebp + 8], right?
; then use edx instead of "array"... or so...
Edit2 ; Whoops, you're already using edx in the swap. Use esi or edi, then...
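In C terms, the two fixes above correspond to passing the address of the local array (the lea/push pair) and indexing through the pointer that was received (loading [ebp + 8] into a register first). A minimal sketch of the same flag-based bubble sort, with the function name and signature chosen for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Sorts n ints in ascending order. `array` holds the ADDRESS of the
   caller's data, so array[i] dereferences it, which is what loading
   [ebp + 8] into esi and then using [esi + ecx*4] does in assembly. */
void bubble_sort(int *array, size_t n)
{
    assert(array != NULL || n == 0);
    bool swapped;
    do {
        swapped = false;                      /* mov FLAG, 0 */
        for (size_t i = 1; i < n; i++) {
            if (array[i - 1] > array[i]) {    /* swap only when out of order */
                int tmp = array[i];
                array[i] = array[i - 1];
                array[i - 1] = tmp;
                swapped = true;               /* mov FLAG, 1 */
            }
        }
    } while (swapped);                        /* repeat until a pass changes nothing */
}
```

A caller with a local array, such as int nums[5] = {5, 4, 3, 2, 1};, would call bubble_sort(nums, 5); the array name decays to its address, which is what lea eax, numArray computes.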

You are missing a ret after the call to BubbleSort. I am unsure where you are setting BP for the stack frame indexing, but when falling through to the second execution of BubbleSort the stack won't be aligned the same.
call BubbleSort
ret
Or exit the code execution.

NASM Assembly mathematical logic

I have a program in assembly for the Linux terminal that's supposed to work through a series of mathematical manipulations, compare the final value to 20, and then using if logic, report <, > or = relationship. Code is:
segment .data
out_less db "Z is less than 20.", 10, 0
out_greater db "Z is greater than 20.", 10, 0
out_equal db "Z is equal to 20.", 10, 0
segment .bss
segment .text
global main
extern printf
main:
mov eax, 10
mov ebx, 12
mov ecx, eax
add ecx, ebx ;set c (ecx reserved)
mov eax, 3
mov ebx, ecx
sub ebx, eax ;set f (ebx reserved)
mov eax, 12
mul ecx
add ecx, 10 ;(a+b*c) (ecx reserved)
mov eax, 6
mul ebx
mov eax, 3
sub eax, ebx
mov ebx, eax ;(d-e*f) (ebx reserved) reassign to ebx to free eax
mov eax, ecx
div ebx
add ecx, 1 ;(a+b*c)/(d-e*f) + 1
cmp ecx, 20
jl less
jg greater
je equal
mov eax, 0
ret
less:
push out_less
call printf
jmp end
greater:
push out_greater
call printf
jmp end
equal:
push out_equal
call printf
jmp end
end:
mov eax, 0
ret
Commands for compiling in terminal using nasm and gcc:
nasm -f elf iftest.asm
gcc -o iftest iftest.o
./iftest
Equivalent C code would be:
main() {
int a, b, c, d, e, f, z;
a = 10;
b = 12;
c = a + b;
d = 3;
e = 6;
f = c - d;
z = ((a + b*c) / (d - e*f)) + 1;
if (z < 20) {
printf("Z (%d) is less than 20.\n", z);
}
else if (z > 20) {
printf("Z is greater than 20.\n");
}
else {
printf("Z is equal to 20.\n");
}
}
The current output is incorrect. The C code will report that z = -1, and therefore less than 20, but the assembly code outputs Z is greater than 20. I've tried printing out the final value, but I run into this issue where printing the value somehow changes it. I've checked and rechecked the math logic and I can't see why it shouldn't give the correct value, unless I'm using the math operators incorrectly. Any and all help is appreciated.

I think the problem is here:
div ebx
add ecx, 1 ;(a+b*c)/(d-e*f) + 1
The result of the div instruction is not in ecx: div leaves the quotient in eax and the remainder in edx.
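A second thing to keep in mind is that div is an unsigned divide, while the divisor here is negative; a signed division needs idiv, with edx sign-extended from eax (cdq) beforehand. The C sketch below models that signed division with the standard div() function, whose quot and rem fields play the roles of eax and edx after idiv. The function name compute_z and its parameters are assumptions for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Computes z = ((a + b*c) / (d - e*f)) + 1 with c = a + b and f = c - d,
   as in the C code from the question. */
int compute_z(int a, int b, int d, int e)
{
    int c = a + b;
    int f = c - d;
    int numerator = a + b * c;
    int denominator = d - e * f;
    assert(denominator != 0);                 /* idiv would fault on a zero divisor */
    div_t qr = div(numerator, denominator);   /* qr.quot ~ eax, qr.rem ~ edx */
    return qr.quot + 1;                       /* add 1 to the quotient, not to ecx */
}
```

With a = 10, b = 12, d = 3 and e = 6 the numerator is 274 and the denominator is -111, so the quotient truncates to -2 and z is -1, as the question expects.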
