Assembly Power(A,b) function - c

I'm trying to make an assembly power(a, b) function, based on this c code:
int power(int x, int y)
{
int z;
z = 1;
while (y > 0) {
if ((y % 2) == 1) {
y = y - 1;
z = z * x;
} else {
y = y / 2;
x = x * x;
}
}
return z;
}
Though, for some reason, it only gets some outputs right, and I can't figure out where the problem is. Here is my assembly code:
;* int power(int x, int y); *
;*****************************************************************************
%define y [ebp+12]
%define x [ebp+8]
%define z [ebp-4]
power:
push ebp
mov ebp,esp
sub esp,16
mov byte[ebp-4], 1
jmp L3;
L1:
mov eax,[ebp+12]
and eax,0x1
test eax,eax
je L2;
sub byte[ebp+12],1
mov eax,[ebp-4]
imul eax, [ebp+8]
mov [ebp-4],eax
jmp L3;
L2:
mov eax,[ebp+12]
mov edx,eax
shr edx,31
add eax,edx
sar eax,1
mov [ebp+12],eax
mov eax,[ebp+8]
imul eax,[ebp+8]
mov [ebp+8],eax
L3:
cmp byte[ebp+12],0
jg L1;
mov eax,[ebp-4]
leave
ret
When I run it, this is my output:
power(2, 0) returned -1217218303 - incorrect, should return 1
power(2, 1) returned 2 - correct
power(2, 2) returned -575029244 - incorrect, should return 4
power(2, 3) returned 8 - correct
Could someone please help me figure out where I've gone wrong, or to correct my code, any help would be appreciated.

You're storing a single byte, then reading back that byte + 3 garbage bytes.
%define z [ebp-4] ; what's the point of this define if you write it out explicitly instead of mov eax, z?
; should just be a comment if you don't want use it
mov byte[ebp-4], 1 ; leaves [ebp-3..ebp-1] unmodified
...
mov eax, [ebp-4] ; loads 4 bytes, including whatever garbage was there
Using a debugger would have shown you that you were getting garbage in EAX at this point. You could fix it by using movzx eax, byte [ebp-4], or by storing 4B in the first place.
Or, better, don't use any extra stack memory at all, since you can use ECX without even saving/restoring it in the usual 32-bit calling conventions. Keep your data in registers, that's what they're for.
Your "solution" of using [ebp+16] is writing to stack space owned by the caller. You're just getting lucky that it happens to have the upper 3 bytes zeroed, I guess, and that clobbering it doesn't lead to a crash.
and eax,1
test eax,eax
is redundant: test eax, 1 is sufficient. (Or if you want to destroy eax, and eax, 1 sets flags based on the result).
There's a huge amount of other inefficiency in your code, too.

Related

Modify C array in external assembler routine

Summary: We have an int variable and 4 double arrays in C, 2 of which hold input data and 2 of which we want to write output data to. We pass the variable and arrays to a function in an external .asm file, where the input data is used to determine output data and the output data is written to the output arrays.
Our problem is, that the output arrays remain seemingly untouched after the assembly routine finishes its work. We don't even know, if the routine reads the correct input data. Where did we go wrong?
We compile with the following commands
nasm -f elf32 -o calculation.o calculation.asm
gcc -m32 -o programm main.c calculation.o
If you need any more information, feel free to ask.
C code:
// before int main()
extern void calculate(int32_t counter, double radius[], double distance[], double paper[], double china[]) asm("calculate");
// in int main()
double radius[counter];
double distance[counter];
// [..] Write Input Data to radius & distance [...]
double paper[counter];
double china[counter];
for(int i = 0; i < counter; i++) {
paper[i] = 0;
china[i] = 0;
}
calculate(counter, radius, distance, paper, china);
// here we expect paper[] and china[] to contain output data
Our Assembly code currently only takes in the values, puts them into the FPU, then places them into the output array.
x86 Assembly (Intel Syntax) (I know, this code looks horrible, but we're beginners, so bear with us, please; Also I can't get syntax highlighting to work correctly for this one):
BITS 32
GLOBAL calculate
calculate:
SECTION .BSS
; declare all variables
pRadius: DD 0
pDistance: DD 0
pPaper: DD 0
pChina: DD 0
numItems: DD 0
counter: DD 0
; populate them
POP DWORD [numItems]
POP DWORD [pRadius]
POP DWORD [pDistance]
POP DWORD [pPaper]
POP DWORD [pChina]
SECTION .TEXT
PUSH EBX ; because of cdecl calling convention
JMP calcLoopCond
calcLoop:
; get input array element
MOV EBX, [counter]
MOV EAX, [pDistance]
; move it into fpu, then move it to output
FLD QWORD [EAX + EBX * 8]
MOV EAX, [pPaper]
FSTP QWORD [EAX + EBX * 8]
; same for the second one
MOV EAX, [pRadius]
FLD QWORD [EAX + EBX * 8]
MOV EAX, [pChina]
FSTP QWORD [EAX + EBX * 8]
INC EBX
MOV [counter], EBX
calcLoopCond:
MOV EAX, [counter]
MOV EBX, [numItems]
CMP EAX, EBX
JNZ calcLoop
POP EBX
RET
There are a couple of problems in the assembler routine. The POP instructions are emitted into the .bss section, so they are never executed. In the sequence of POPs, the return address (pushed by the caller) is not accounted for. Depending on the ABI, you must leave the arguments on the stack anyway. Because the POPs are never executed, the loop exit condition always happens to be true.
And you really should not use global variables this way. Instead, create a stack frame and use that.
Thanks to all your answers and comments and some heavy research, we managed to finally produce functioning code, which now properly uses stack frames and fulfills the cdecl calling convention:
BITS 32
GLOBAL calculate
SECTION .DATA
electricFieldConstant DQ 8.85e-12
permittivityPaper DQ 3.7
permittivityChina DQ 7.0
SECTION .TEXT
calculate:
PUSH EBP
MOV EBP, ESP
PUSH EBX
PUSH ESI
PUSH EDI
MOV ECX, 0 ; counter for loop
JMP calcLoopCond
calcLoop:
MOV EAX, [EBP + 12]
FLD QWORD [EAX + ECX * 8]
MOV EAX, [EBP + 20]
FSTP QWORD [EAX + ECX * 8]
MOV EAX, [EBP + 16]
FLD QWORD [EAX + ECX * 8]
MOV EAX, [EBP + 24]
FSTP QWORD [EAX + ECX * 8]
ADD ECX, 1 ; increase loop counter
calcLoopCond:
MOV EDX, [EBP + 8]
CMP ECX, EDX
JNZ calcLoop
POP EDI
POP ESI
POP EBX
MOV ESP, EBP
POP EBP
RET

Assembly intel x86 which addresses can be used to define arguments

Which memory points can I actually use?
I have this assembly code to power(a,b) by recursion:
int power(int x, int y); *
;*****************************************************************************
%define x [ebp+8]
%define y [ebp+12]
power:
push ebp
mov ebp, esp
mov eax, y ;move y into eax
cmp eax, 0 ;compare y to 0
jne l10; ;if not equal, jump to l10
mov eax, 1 ;move 1 into eax
jmp l20; ;jump to l20, the leave, ret label
l10:
mov eax, y ; move y into eax
sub eax, 1 ; y-1
push eax ; push y-1 onto stack
mov ebx, x ; move x into ebx
push ebx ; push x onto stack
call power ; call power/recursion
add esp, 8 ; add 8 to stack(4*2) for the two vars
imul eax, x ; multiply the returned value from call by x
l20:
leave ; leave
ret ;ret
It's coded straight from this c code:
int power_c(int x, int y) {
if (y == 0) {
return 1;
} else {
return power_c(x, y - 1)*x;
}
}
The asm code works perfectly, any suggested adjustments would be great, I'm still new to assembly. My question is, which addresses can I actually use to define arguments? Here, I have two, and I use:
%define x [ebp+8]
%define y [ebp+12]
If I have more, do I just increase it? Lets say all are ints, 4bytes, like so?
%define x [ebp+8]
%define y [ebp+12]
%define z [ebp+16]
%define a [ebp+20]
%define b [ebp+24]
I've hit a snag with code where I need to define more arguments, I just can't figure this out, any help would be greatly appreciated.
Arguments are passed on the stack - the caller is responsible for pushing them. The space is already reserved when your function starts. If you changed your prototype to power_c(int x, int y, int z), then z would be at [ebp+16].
Local Variables (or automatics) are your responsibility. The standard way is to subtract the space you need from esp, as #Peter Cordes mentions in the comments. So to create variable a and b, you would do:
sub esp, 8
%define a [ebp+4]
%define b [ebp-8]
Note that at this point ebp == esp+8. We define the variables relative to ebp, not esp, so that you can continue to use push and pop instructions (which change the stack pointer). Remember to set esp back to ebp (mov esp, ebp) before you exit the function, so that the return address is found correctly.

How safe is this swap w/o pushing registers?

I'm very new to Assembly and the code below is supposed to swap two integers via two different functions: first using swap_c and then using swap_asm.
However, I doubt, whether I need to push (I mean save) each value of registers before assembly code and pop them later (just before returning to main). In other words, will the CPU get mad at me if I return different register content (not the crucial ones like ebp or esp; but, just eax, ebx, ecx & edx) after running swap_asm function? Is it better to uncomment the lines in the assembly part?
This code runs OK for me and I managed to reduce the 27 lines of assembled C code down to 7 Assembly lines.
p.s.: System is Windows 10, VS-2013 Express.
main.c part
#include <stdio.h>
extern void swap_asm(int *x, int *y);
void swap_c(int *a, int *b) {
int t = *a;
*a = *b;
*b = t;
}
int main(int argc, char *argv[]) {
int x = 3, y = 5;
printf("before swap => x = %d y = %d\n\n", x, y);
swap_c(&x, &y);
printf("after swap_c => x = %d y = %d\n\n", x, y);
swap_asm(&x, &y);
printf("after swap_asm => x = %d y = %d\n\n", x, y);
getchar();
return (0);
}
assembly.asm part
.686
.model flat, c
.stack 100h
.data
.code
swap_asm proc
; push eax
; push ebx
; push ecx
; push edx
mov eax, [esp] + 4 ; get address of "x" stored in stack into eax
mov ebx, [esp] + 8 ; get address of "y" stored in stack into ebx
mov ecx, [eax] ; get value of "x" from address stored in [eax] into ecx
mov edx, [ebx] ; get value of "y" from address stored in [ebx] into edx
mov [eax], edx ; store value in edx into address stored in [eax]
mov [ebx], ecx ; store value in ecx into address stored in [ebx]
; pop edx
; pop ecx
; pop ebx
; pop eax
ret
swap_asm endp
end
Generally, this depends on the calling convention of the system you are working on. The calling convention specifies how to call functions. Generally, it says where to put the arguments and what registers must be preserved by the called function.
On i386 Windows with the cdecl calling convention (which is the one you probably use), you can freely overwrite the eax, ecx, and edx registers. The ebx register must be preserved. While your code appears to work, it mysteriously fails when a function starts to depend on ebx being preserved, so better save and restore it.

NASM Assembly mathematical logic

I have a program in assembly for the Linux terminal that's supposed to work through a series of mathematical manipulations, compare the final value to 20, and then using if logic, report <, > or = relationship. Code is:
segment .data
out_less db "Z is less than 20.", 10, 0
out_greater db "Z is greater than 20.", 10, 0
out_equal db "Z is equal to 20.", 10, 0
segment .bss
segment .text
global main
extern printf
main:
mov eax, 10
mov ebx, 12
mov ecx, eax
add ecx, ebx ;set c (ecx reserved)
mov eax, 3
mov ebx, ecx
sub ebx, eax ;set f (ebx reserved)
mov eax, 12
mul ecx
add ecx, 10 ;(a+b*c) (ecx reserved)
mov eax, 6
mul ebx
mov eax, 3
sub eax, ebx
mov ebx, eax ;(d-e*f) (ebx reserved) reassign to ebx to free eax
mov eax, ecx
div ebx
add ecx, 1 ;(a+b*c)/(d-e*f) + 1
cmp ecx, 20
jl less
jg greater
je equal
mov eax, 0
ret
less:
push out_less
call printf
jmp end
greater:
push out_greater
call printf
jmp end
equal:
push out_equal
call printf
jmp end
end:
mov eax, 0
ret
Commands for compiling in terminal using nasm and gcc:
nasm -f elf iftest.asm
gcc -o iftest iftest.o
./iftest
Equivalent C code would be:
main() {
int a, b, c, d, e, f, z;
a = 10;
b = 12;
c = a + b;
d = 3;
e = 6;
f = c - d;
z = ((a + b*c) / (d - e*f)) + 1;
if (z < 20) {
printf("Z (%d) is less than 20.\n", z);
}
else if (z > 20) {
printf("Z is greater than 20.\n");
}
else {
printf("Z is equal to 20.\n");
}
}
The current output is incorrect. The C code will report that z = -1, and therefore less than 20, but the assembly code outputs Z is greater than 20. I've tried printing out the final value, but I run into this issue where printing the value somehow changes it. I've checked and rechecked the math logic and I can't see why it shouldn't give the correct value, unless I'm using the math operators incorrectly. Any and all help is appreciated.
I think the problem is here:
div ebx
add ecx, 1 ;(a+b*c)/(d-e*f) + 1
The result of the div instruction is not in ecx.

Assembly Addressing mode

This is the code:
section .data
v dw 4, 6, 8, 12
len equ 4
section .text
global main
main:
mov eax, 0 ;this is i
mov ebx, 0 ;this is j
cycle:
cmp eax, 2 ;i < len/2
jge exit
mov ebx, 0
jmp inner_cycle
continue:
inc eax
jmp cycle
inner_cycle:
cmp ebx, 2
jge continue
mov di, [v + eax * 2 * 2 + ebx * 2]
inc ebx
jmp inner_cycle
exit:
push dword 0
mov eax, 0
sub esp, 4
int 0x80
I'm using an array and scanning it as a matrix, this is the C translation of the above code
int m[4] = {1,2,3,4};
for(i = 0; i < 2; i++){
for(j = 0; j < 2; j++){
printf("%d\n", m[i*2 + j]);
}
}
When I try to compile the assembly code I get this error:
DoubleForMatrix.asm:20: error: beroset-p-592-invalid effective address
which refers to this line
mov di, [v + eax * 2 * 2 + ebx * 2]
can someone explain me what is wrong with this line? I think that it's because of the register dimensions, I tried with
mov edi, [v + eax * 2 * 2 + ebx * 2]
but I've got the same error.
This is assembly for Mac OS X, to make it work on another SO you have to change the exit syscall.
You can't use arbitrary expressions in assembler. Only a few addressingmodes are allowed.
basically the most complex form is register/imm+register*scale with scale 1,2,4,8
Of course constants (like 2*2) will probably be folded to 4, so that counts as a single scale with 4 (not as two multiplications)
Your example tries to do two multiplies at once.
Solution: insert an extra LEA instruction to calculate v+ebx*2 and use the result in the mov.
lea regx , [v+ebx*2]
mov edi, [eax*2*2+regx]
where regx is a free register.
The SIB (Scale Immediate Base) addressing mode takes only one Scale argument (1,2,4 or 8) to be applied to exactly one register.
The proposed solution is to premultiply eax by 4 (also has to modify the comparison). Then inc eax can be replaced with add eax,4 and the illegal instruction by mov di,[v+eax+ebx*2]
A higher level optimization would be just to for (i=0;i<4;i++) printf("%d\n",m[i]);

Resources