Assembly x86-32 and some c functions - c

I never learn C language so it makes me confuse. I just like to know if I did it correctly or where I need to improve. For this code I used assembly x86 32 bit. Thanks
This is what I supposed to do:
Write a procedure with the signature
char *strchar(char *s1, char c1)
that returns a pointer to the first occurrence of the character c1 within the string s1 or, if not found, returns a null.
This is what I came out with:
strchar (char*, char):
push ebp
mov ebp, esp
mov dword ptr [ebp-24], edi
mov EAX , esi
mov BYTE PTR [ebp-28], al
.L5:
mov EAX , dword ptr [ebp-24]
movzx EAX , byte ptr [ EAX ]
test AL, AL
je .L2
mov EAX , dword PTR [ebp-24]
movzx EAX , BYTE PTR [ EAX ]
cmp BYTE PTR [ebp-28], al
jne .L3
mov eax, dword PTR [ebp-24]
jmp .L6
.L3:
add dword PTR [ebp-24], 1
jmp .L5
.L2:
LEA eax, [ebp-9]
MOV DWORD PTR [EBP-8], eax
MOV EAX, DWORD PTR [ebp-8]
.L6:
POP EBP
RET

The lines:
mov dword ptr [ebp-24], edi
mov EAX , esi
mov BYTE PTR [ebp-28], al
assume that a stack frame has been allocated for this function which doesn’t appear true; I think you should have something like:
sub esp, 32
after the
mov ebp,esp
Also, the three lines after L2 seem confused. The only way to get to L2 is if the nil (0) byte is discovered in the string, at which point, the code should return a NULL pointer.
The exit path in the code (L6) leaves eax alone, so all that should be needed is:
L2:
mov eax, 0
It might make debugging easier if you kept the alias up to date; so:
L2:
mov eax, 0
mov [ebp-24], eax
Also, the calling convention used here is a bit odd: the string is passed in edi and the character in esi. Normally, in x86-32, these would both be passed on the stack. This looks like it might have been x86-64 code, converted to x86-32....
A final note; this assembly code looks like the output of a compiler with optimisations disabled. Often, generating the assembly with the optimisations enabled generates easier to understand code. This code, for example, could be much more concisely written as below, without even devolving into weird intel ops:
strchar:
mov edx, esi
mov eax, edi
L:
mov dh, [eax]
test dh, dh
jz null
cmp dh, dl
je done
inc eax
jmp L
null:
mov eax, 0
done:
ret

Here is one with stack overhead
[global strchar]
strchar:
push ebp
mov ebp, esp
mov dl, byte [ebp + 12]
mov ecx, dword [ebp + 8]
xor eax, eax
.loop: mov al, [ecx]
or al, al
jz .exit
cmp al, dl
jz .found
add ecx, 1
jmp .loop
.found: mov eax, ecx
.exit:
leave
ret
Here is one without stack overhead
[global strchar]
strchar:
mov dl, byte [esp + 8]
mov ecx, dword [esp + 4]
xor eax, eax
.loop: mov al, [ecx]
or al, al
jz .exit
cmp al, dl
jz .found
add ecx, 1
jmp .loop
.found: mov eax, ecx
.exit:
ret
These are using the 'cdecl' calling convention. For 'stdcall' change the last 'ret' to 'ret 8'.

Related

Allocate a vector of bytes on stack in assembly x86

I try to put on stack the values contained by array_output but every time I print the values from the stack it prints just 0s.
If I try to print the array simple not using the stack it works.
What I do wrong?
stack_alloc:
sub esp, 1
mov al, byte [array_output + ecx]
mov byte [esp], al
add ecx, 1
cmp ebx, ARRAY_OUTPUT_LEN
cmp ebx, ecx
jg stack_alloc
cmp ebx, ARRAY_OUTPUT_LEN
cmp ebx, ecx
jg stack_alloc
You have a typo in your code. The cmp ebx, ARRAY_OUTPUT_LEN instruction should not be comparing but rather loading the EBX register.
You could correct the problem replacing the cmp with a mov but I would propose to simplify your code and just compare the index in ECX to ARRAY_OUTPUT_LEN. This will require choosing the opposite conditional branch and saves from using the additional register EBX:
xor ecx, ecx
stack_alloc:
sub esp, 1
mov al, [array_output + ecx]
mov [esp], al
inc ecx
cmp ecx, ARRAY_OUTPUT_LEN
jb stack_alloc

How to reverse an array is assembly?

I'm trying to write a simple program that reverses an array. It almost works, well it works for the first half of the array anyway. The term in the middle is always "0" and then it returns to the pattern. For example, if I input: 1,2,3,4,5,6,7,8,9,10, the output will be 10,9,8,7,6,0,6,5,4,3. The order of the program is basically this:
Call to ReverseArray:
INVOKE ReverseArray,
LENGTHOF array, SIZEOF array, TYPE array, OFFSET array
ReverseArray procedure (loops through the array and calls mSwap32):
ReverseArray PROC,
len:DWORD, s:DWORD, increment:DWORD,address:DWORD
mov edx, OFFSET revArray
call WriteString
mov edi, address
mov esi, address
add esi, s
sub esi, TYPE array
mov ecx, len
shr ecx, 1
L2:
mSwap32 edi, esi
add edi, increment
sub esi, increment
loop L2
ret
ReverseArray endp
mSwap32 Macro (is given two addresses to swap):
mSwap32 MACRO address1:REQ, address2:REQ
mov edi, address1
mov esi, address2
mov eax, [esi]
push eax
mov eax, [edi]
mov [esi], eax
pop eax
mov [edi], eax
ENDM
I'm fairly certain that the error is in mSwap32...I'm sure it is some small logical error I'm missing, but no matter how much I toy with it I can not figure it out. ??
it's a WORD array
With this crucial piece of information we can solve the problems. LENGTHOF will be 10 (as before), SIZEOF will now be 20 and TYPE will now be 2.
ReverseArray PROC,
len:DWORD, s:DWORD, increment:DWORD,address:DWORD
mov edx, OFFSET revArray
call WriteString
mov edi, address
mov esi, address
add esi, s
sub esi, increment ;Better not refer to the actual array here!
mov ecx, len
shr ecx, 1
L2:
mSwap32 edi, esi
add edi, increment
sub esi, increment
loop L2
ret
ReverseArray endp
and the macro:
mSwap32 MACRO address1:REQ, address2:REQ
mov edi, address1
mov esi, address2
mov ax, [esi]
push eax
mov ax, [edi]
mov [esi], ax
pop eax
mov [edi], ax
ENDM
Shorter and still using a macro:
mSwap32 MACRO address1:REQ, address2:REQ
mov edi, address1
mov esi, address2
mov ax, [esi]
xchg ax, [edi]
mov [esi], ax
ENDM

x86 Struct scanf

I am trying to convert C to x86. I am using a struct...
struct person_record_struct
{
char last_name[128];
char first_name[128];
char year_of_birth[10];
int month_of_birth; // January => 1
int day_of_birth; // 1st Day of a Month => 1
char drivers_license_no[128];
};
typedef struct person_record_struct person_record;
I am having trouble getting my scanf to work. Here is the C..
result = scanf("%s\n%s\n%s\n%d\n%d\n%s\n", &records[counter].last_name[0],
&records[counter].first_name[0], &records[counter].year_of_birth[0],
&records[counter].month_of_birth, &records[counter].day_of_birth,
&records[counter].drivers_license_no[0]);
And my x86..
;counter # [ebp-4]
;records # [ebp-16]
; format_string_main_2 db '%s\n%s\n%s\n%d\n%d\n%s\n', 0
; read in info
; push drivers_license_no
mov ebx, [ebp-16] ;
mov eax, [ebp-4]
mov ecx, struct_size
mul ecx
add eax, ebx
lea eax, [eax+276]
push eax
; push day_of_birth
mov ebx, [ebp-16]
mov eax, [ebp-4]
mov ecx, struct_size
mul ecx
add eax, ebx
lea eax, [eax+272]
push eax
; push month_of_birth
mov ebx, [ebp-16]
mov eax, [ebp-4]
mov ecx, struct_size
mul ecx
add eax, ebx
lea eax, [eax+268]
push ax
; push year_of_birth
mov ebx, [ebp-16]
mov eax, [ebp-4]
mov ecx, struct_size
mul ecx
add eax, ebx
lea eax, [eax+256]
push eax
; push first_name
mov ebx, [ebp-16]
mov eax, [ebp-4]
mov ecx, struct_size
mul ecx
add eax, ebx
lea eax, [eax+128]
push eax
; push last_name
mov ebx, [ebp-16]
mov eax, [ebp-4]
mov ecx, struct_size
mul ecx
add eax, ebx
lea eax, [eax+0]
push eax
push format_string_main_2
call scanf
add esp, 28
mov [ebp-12], eax
I'm using a check to see if result is 6 and if it's not my program that prints an error and exits. It keeps having an error and I'm not sure what I am doing wrong. Any help would be much appreciated. Thank you.
This is my calloc call which appears to be correct...
; // allocate the buffer of all the records
; records = (person_record *)calloc(number_of_records, sizeof(person_record));
push struct_size
mov eax, [ebp-8]
push eax
call calloc
add esp, 8
mov [ebp-16], eax
Under month_of_birth you have push ax instead of push eax. This would push only the lower 16 bits of the address on the stack, virtually guaranteeing a crash in scanf. Fix that and it should be OK.
There are many weird/wrong things going on in your code. It will be easier to show a cleaner way. You have not mentioned the Assembler you are using, there are a few for x86 and each has its own syntax. Here is how you can do it using NASM:
extern printf, scanf, calloc, exit, free, puts
global main
struc person_record
.last_name resb 128
.first_name resb 128
.year_of_birth resb 10
.month_of_birth resd 1
.day_of_birth resd 1
.drivers_license_no resb 128
.size equ $ - person_record
endstruc
MAX_RECORDS equ 2
section .data
Space db 32, 0
input_format db "%s%s%s%d%d%s", 0
output_format db "%s %s %s %d %d %s", 0
section .text
main:
push person_record.size
push MAX_RECORDS
call calloc
add esp, 4 * 2
mov esi, eax
mov ebx, eax
mov edi, MAX_RECORDS - 1
.FillRecord:
lea eax, [ebx + person_record.drivers_license_no]
push eax
lea ecx, [ebx + person_record.day_of_birth]
push ecx
lea edx, [ebx + person_record.month_of_birth]
push edx
lea eax, [ebx + person_record.year_of_birth]
push eax
lea ecx, [ebx + person_record.first_name]
push ecx
lea edx, [ebx + person_record.last_name]
push edx
push input_format
call scanf
add esp, 4 * 7
push Space
call puts
add esp, 4 * 1
add ebx, person_record.size
dec edi
jns .FillRecord
mov ebx, esi
mov edi, MAX_RECORDS - 1
.ShowRecord:
lea eax, [ebx + person_record.drivers_license_no]
push eax
mov ecx, [ebx + person_record.day_of_birth]
push ecx
mov edx, [ebx + person_record.month_of_birth]
push edx
lea eax, [ebx + person_record.year_of_birth]
push eax
lea ecx, [ebx + person_record.first_name]
push ecx
lea edx, [ebx + person_record.last_name]
push edx
push output_format
call printf
add esp, 4 * 7
push Space
call puts
add esp, 4 * 1
add ebx, person_record.size
dec edi
jns .ShowRecord
push esi
call free
add esp, 4 * 1
push 0
call exit
And the input and output of 2 records:

Assembly EAX register resetting without reason

I have the following assembly code:
; File: strrev.asm
; A subroutine called from C programs.
; Parameters: string A
; Result: String is reversed and returned.
SECTION .text
global strrev
_strrev: nop
strrev:
push ebp
mov ebp, esp
; registers ebx,esi, and edi must be saved if used
push ebx
push edi
xor esi, esi
xor eax, eax
mov ecx, [ebp+8] ; load the start of the array into ecx
jecxz end ; jump if [ecx] is zero
mov edi, ecx
reverseLoop:
cmp byte[edi], 0
je reverseLoop_1
inc edi
inc eax
jmp reverseLoop
reverseLoop_1:
mov esi, edi ;move end of array into esi
mov edi, ecx ;reset start of array to edi
reverseLoop_2:
mov al, [esi]
mov bl, [edi]
mov [esi], bl
mov [edi], al
inc edi
dec esi
dec eax
jnz reverseLoop_2
end:
pop edi ; restore registers
pop ebx
mov esp, ebp ; take down stack frame
pop ebp
ret
Which works fine until you start looping through reverseLoop_2. Using gdb, eax is listed as being 11, which it should be (this is the length of the string I passed in through a separate c program). This is show in the debugger as:
Breakpoint 2, reverseLoop_2 () at strrev.asm:40
40 mov al, [esi]
(gdb) display $eax
1: $eax = 11
However, if I step through the program to the next line, it resets to 0.
(gdb) next
41 mov bl, [edi]
1: $eax = 0
I need eax to be preserved since its the one keeping track of how many times reverseLoop_2 needs to loop. Why is it resetting to 0 after the call to mov?
If you're using eax as a loop counter, you shouldn't write to it inside the loop :
reverseLoop_2:
mov al, [esi]
Remember that al is the least significant byte of eax :
I think this should work.
mov eax, address of your string
push esi
push edi
mov edi, eax
mov esi, eax
; find end of string
sub ecx, ecx
not ecx
sub al, al
cld
repne scasb
; points to the byte after '0x00'
dec edi
dec edi
; main loop will swap the first with the last byte
; and increase/decrease the pointer until the cross each other
_loop:
cmp esi, edi ; if both pointers meet, we are done
jg _done
mov al, [edi]
mov bl, [esi]
mov [esi], al
mov [edi], bl
inc esi
dec edi
jmp _loop
_done:
pop edi
pop esi

declaring a string in assembly

I have this assembly code that computes some prime numbers:
#include <stdio.h>
int main() {
char format[] = "%d\t";
_asm{
mov ebx, 1000
mov ecx, 1
jmp start_while1
incrementare1:
add ecx, 1
start_while1:
cmp ecx, ebx
jge end_while1
mov edi, 2
mov esi, 0
jmp start_while2
incrementare2:
add edi, 1
start_while2:
cmp edi, ecx
jge end_while2
mov eax, ecx
xor edx, edx
div edi
test edx, edx
jnz incrementare2
mov esi, 1
end_while2:
test esi, esi
jnz incrementare1
push ecx
lea ecx, format
push ecx
call printf
pop ecx
pop ecx
jmp incrementare1
end_while1:
nop
}
return 0;
}
It works fine but I would like to also declare the 'format' string in asm, not in C code. I have tried adding something like format db "%d\t", 0 but it didn't work.
If all else fails there's always the ugly way:
format_minus_1:
mov ecx,0x00096425 ; '%', 'd', '\t', '\0' in little-endian format
lea ecx,format_minus_1 + 1 ; skip past the "mov ecx" opcode
push ecx
call printf
You cannot define objects inside the _asm block with those directives. The C declaration is allocating space on the stack for you so if you want to do something like that inside the _asm block you need to manipulate the stack pointer and initialize the memory yourself:
sub esp, 4
mov [esp], '%'
mov [esp + 1], 'd'
mov [esp + 2], '\t'
mov [esp + 3], '\0'
...
push ecx
push esp + 4
call printf
Note this is one way. Not necessarily the best way. The best way being let C do your memory management for you.

Resources