I have two strings and one letter.
selectedWords BYTE "BICYCLE"
guessWords BYTE "-------"
inputLetter BYTE 'C'
Based on this, I wrote code that checks whether selectedWords contains the letter C and, if it does, updates the string guessWords:
guessWords "--C-C--"
But for some strange reason I get every other result, just not the correct one. Any suggestions on how to solve this problem?
First, forget the so-called string instructions (scas, cmps, movs). Second, you need a fixed pointer (displacement) with an index, e.g. [esi+ebx]. Have you considered that WriteString needs a null-terminated string?
INCLUDE Irvine32.inc
.DATA
selectedWords BYTE "BICYCLE"
guessWords BYTE SIZEOF selectedWords DUP ('-'), 0 ; With null-termination for WriteString
inputLetter BYTE 'C'
.CODE
main PROC
mov esi, offset selectedWords ; Source
mov edi, offset guessWords ; Destination
mov ecx, LENGTHOF selectedWords ; Number of bytes to check
mov al, inputLetter ; Search for that character
xor ebx, ebx ; Index EBX = 0
ride_hard_loop:
cmp [esi+ebx], al ; Compare memory/register
jne @F ; Skip next line if no match
mov [edi+ebx], al ; Hang 'em lower
@@:
inc ebx ; Increment pointer
dec ecx ; Decrement counter
jne ride_hard_loop ; Jump if ECX != 0
mov edx, edi
call WriteString ; Irvine32: Write a null-terminated string pointed to by EDX
exit ; Irvine32: ExitProcess
main ENDP
END main
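The indexed loop above can also be read as the following C sketch. It is only a model of the assembly under the assumption that both buffers hold the same number of bytes; the function name reveal_letter is made up for illustration and is not part of Irvine32.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy `letter` into guess[i] wherever selected[i]
   matches. Both buffers hold `len` bytes. Returns the number of matches. */
size_t reveal_letter(const char *selected, char *guess, size_t len, char letter)
{
    assert(selected != NULL && guess != NULL);
    size_t hits = 0;
    for (size_t i = 0; i < len; i++) {   /* i plays the role of EBX */
        if (selected[i] == letter) {     /* cmp [esi+ebx], al */
            guess[i] = letter;           /* mov [edi+ebx], al */
            hits++;
        }
    }
    return hits;
}
```

The point of the shared index is that a match at position i in the source is written to position i in the destination, so the two strings never get out of step.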
I am creating a program which reads a list of integers separated by a single space from the console and prints the sum of all the integers. The main problem is extracting the integers from the string array into a signed integer array.
Some examples of input are "-20 30 5" (each integer is separated by a single space) or " [space]-20 30 5 [space]" (there may be spaces at the beginning and the end of the list, but the numbers are still separated by a single space).
Also, after printing the sum, the program returns to reading another input unless only the enter key is typed.
After writing the code and pressing the Debug button, I get the following two build errors:
A2005 symbol redefinition: InBuffer
A2111 conflicting parameter definition
I've checked the error messages and apparently both of them are related to the PROTO and PROC directives. But there seem to be no problems with the parameter definitions.
Here is my code.
INCLUDE Irvine32.inc
ArrayGet PROTO, ; convert string array into int array
inBuffer: PTR BYTE,
inBufferN: DWORD,
intArray: PTR SDWORD
.data
BUF_SIZE EQU 256
inBuffer BYTE BUF_SIZE DUP(?) ; input buffer
inBufferN DWORD ? ; length of input
intArray SDWORD BUF_SIZE/2 DUP(?) ; integer array for storing converted string
intArrayN DWORD ? ; number of integers
prompt BYTE "Enter numbers(<ent> to exit) : ", 0
bye BYTE "Bye!", 0
.code
main PROC
L1:
mov esi, 0
mov edx, OFFSET prompt
call WriteString
mov edx, OFFSET inBuffer
mov ecx, BUF_SIZE
call ReadString
cmp inBuffer[0], 0ah
je L3 ; only typing <ent> ends the program
mov inBufferN, eax
mov ecx, inBufferN
SpaceCheck: ; calls procedure when it finds a number
cmp inBuffer[esi], 20h
jne L2
inc esi
loop SpaceCheck
jmp L1
L2:
INVOKE ArrayGet, ADDR inBuffer, inBufferN, ADDR intArray ; put inBuffer offset on edx, inBufferN on ecx
mov intArrayN, eax
mov ecx, intArrayN
mov eax, 0
mov esi, OFFSET intArray
Ladd: ; adding the integer array
add eax, [esi]
inc esi
loop Ladd
call WriteInt
call CRLF
jmp L1
L3:
mov edx, OFFSET bye
call WriteString
exit
main ENDP
; procedure definition
ArrayGet PROC USES edx ecx,
inBuffer : PTR BYTE,
inBufferN: DWORD,
intArray: PTR SDWORD
LOCAL ArrayNum: DWORD
mov ArrayNum, 0
mov ecx, inBufferN
sub ecx, esi ; ecx(loop count) from first char to the end
LOOP1:
lea edx, inBuffer
add edx, esi ; edx points the offset of first char
mov edi, esi ; save location of first char
LOOP2: ; check spaces between integers
cmp inBuffer[esi], 20h
je getNum
inc esi
loop LOOP2
jmp getNum ; jump to getNum if array ends with a number
getNum: ; converting char into int
push ecx
inc esi
cmp inBuffer[esi], 20h ; two spaces in a row is considered as no more numbers afterwards
je EndBuffer
dec esi
mov ecx, esi
sub ecx, edi ; length of single number in char
call ParseInteger32
mov edi, ArrayNum
mov intArray[edi], eax
inc ArrayNum
inc esi
pop ecx
loop LOOP1
jmp EndBuffer ; end procedure when loop is over
EndBuffer:
mov eax, ArrayNum
inc eax
ret
ArrayGet ENDP
END main
If you have questions about my intentions in the code or about the form of the input, feel free to leave them in the comments.
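For reference, the parsing described in this question (skip spaces, read one signed integer, add it to the sum, repeat until the line ends) can be sketched in C. This is not a translation of the MASM code and says nothing about the build errors; the name sum_ints is hypothetical, and strtol stands in for Irvine32's ParseInteger32.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: sum the whitespace-separated signed integers in
   `line`. Stores how many integers were read in *count if count is not NULL. */
long sum_ints(const char *line, int *count)
{
    assert(line != NULL);
    long sum = 0;
    int n = 0;
    const char *p = line;
    for (;;) {
        char *end;
        long v = strtol(p, &end, 10);  /* skips leading whitespace itself */
        if (end == p)                  /* nothing converted: end of list */
            break;
        sum += v;
        n++;
        p = end;                       /* continue after the parsed number */
    }
    if (count != NULL)
        *count = n;
    return sum;
}
```

An empty line yields a count of zero, which is the condition the main loop would use to stop reading input.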
I have input like this:
This is, ,,, *&% a ::; demo + String. +Need to**#!/// format:::::!!! this.
Output Required:
ThisisademoStringNeedtoformatthis
I have to do this without using str_trim.
Edit: I am writing an encryption program. I have to remove all punctuation from the string and turn all lower case letters to uppercase before I encrypt it.
I added the code. I need to remove the spaces, or any punctuation before I turn it to upper case. So far I haven't found anything in my book that could help with this except str_trim which we aren't allowed to use.
INCLUDE Irvine32.inc
.data
source byte "This is the source string",0
.code
main proc
mov esi,0 ; index register
mov ecx,SIZEOF source ; loop counter
L1:
mov al,source[esi] ; get a character from source
and source[esi], 11011111b ; convert lower case to upper case
inc esi ; move to next character
loop L1 ; repeat for entire string
mov edx, OFFSET source
call WriteString
exit
main endp
end main
You are already trying to change from lowercase to uppercase, so I will give you a hand with removing the punctuation. The following code uses my suggestion: move the uppercase letters to an auxiliary string, ignoring the punctuation characters. I used the EMU8086 assembler:
.stack 100h
.data
source db "STRING, WITH. PUNCTUATION : AND * SPACES!$"
aux db 42 dup(' ') ; room for every character of source
.code
mov ax, @data
mov ds, ax
;REMOVE EVERYTHING BUT UPPERCASE LETTERS.
mov si, offset source ; POINT TO STRING.
mov di, offset aux ; POINT TO AUXILIARY.
L1:
mov al, [ si ] ; get character from source
;CHECK IF END STRING ($).
cmp al, '$'
je finale
;CHECK IF CHAR IS UPPERCASE LETTER.
cmp al, 65
jb is_not_a_letter ; CHAR IS LOWER THAN 'A'.
cmp al, 90
ja is_not_a_letter ; CHAR IS HIGHER THAN 'Z'.
;COPY LETTER TO AUX STRING.
mov [ di ], al
inc di ; POSITION FOR NEXT CHARACTER.
is_not_a_letter:
inc si ; move to next character
jmp L1
finale:
mov [ di ], al ; '$', NECESSARY TO PRINT.
;PRINT STRING.
mov dx, OFFSET aux
mov ah, 9
int 21h
;END PROGRAM.
mov ax, 4c00h
int 21h
I ended the strings with '$' because I print the string with int 21h.
As you can see, I'm not using CX nor the LOOP instruction. What I do is to repeat until '$' is found. You can do the same until 0 is found.
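The same "repeat until the terminator is found" idea, with 0 as the terminator, might look like this in C. It is a sketch, not the answer's code: unlike the assembly above it also accepts lowercase letters and converts them, and the name letters_upper is made up for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: copy only the letters of src into dst, converted to
   upper case. dst must have room for strlen(src) + 1 bytes.
   Returns the length of the filtered string. */
size_t letters_upper(const char *src, char *dst)
{
    assert(src != NULL && dst != NULL);
    size_t n = 0;
    for (; *src != '\0'; src++) {            /* stop at the terminator */
        unsigned char ch = (unsigned char)*src;
        if (isalpha(ch))                     /* drop spaces and punctuation */
            dst[n++] = (char)toupper(ch);    /* keep the letter, upper-cased */
    }
    dst[n] = '\0';                           /* terminate the result */
    return n;
}
```

Filtering before (or while) upper-casing matters: clearing bit 5 on every byte turns a space (20h) into 0, which would cut the string short.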
This is my code after removing all the punctuation and turning it to uppercase.
INCLUDE Irvine32.inc
.data
source byte "STriNG, ## WITH.[][][ lalalala PUncTuATION : AND * SpaceS!", 0
target byte SIZEOF source DUP(0), 0
.code
main PROC
pushad
mov edx, offset source
call WriteString
call CrlF
mov edx, 0
mov esi, offset source
mov edi, offset target
L1:
mov al, [ esi ] ; get character from source
cmp al, 0
je final
cmp al, 65
jb not_letter ; if char is lower than 'A' jump to not letter
cmp al, 122
ja not_letter ; if char is greater than 'z' jump to not letter
cmp al, 90
ja Label1 ; jump if above 'Z'
jmp next ; false
Label1:
cmp al, 97
jl Label2 ; jmp if less than 'a'
jmp next ; false
Label2: ; if both are true than is greater than 'Z' but less than 'a'
jmp not_letter ; jump to not letter
next:
mov [ edi ], al
inc edi ; position to next character (32-bit pointer, so use EDI, not DI)
not_letter:
inc esi ; move to next character (32-bit pointer, so use ESI, not SI)
jmp L1
final:
mov [ edi ], al
mov edx, OFFSET target
call WriteString
call CrlF
mov esi,0 ; index register
mov ecx,SIZEOF source ; loop counter
L2:
mov al, target[esi] ; get a character from source
and target[esi], 11011111b ; convert lower case to upper case
inc esi ; move to next character
loop L2 ; repeat for entire string
mov edx, OFFSET target
call WriteString
call CrlF
popad
exit
main endp
end main
This is my code to check whether a string contains a given letter:
selectedWords BYTE "BICYCLE"
inputLetter BYTE 'Y'
cld
mov ecx, LENGTHOF selectedWords
mov edi, offset selectedWords
mov al, inputLetter ;Load character to find
repne scasb ;Search
jne notfound
But how do I get a pointer to the letter in the string?
Afterwards I want to replace that letter with another one, which is easy to do if you have a pointer to the letter in the string.
If you read the documentation for REP and SCASB, you'll see that SCAS updates EDI after every comparison. After a match, EDI therefore points one byte past the matching character, so the match itself is at EDI-1.
All you have to do is return EDI-1 if ZF=1 and return 0 if ZF=0.
cld
mov ecx, LENGTHOF selectedWords
mov edi, offset selectedWords
mov al, inputLetter ;Load character to find
repne scasb ;Search
jne notfound
found:
dec edi ;EDI is one past the match, so step back
mov eax,edi ;return the address of the match.
ret
notfound:
xor eax,eax ;return 0 aka not found as address.
ret
If repne scasb finds the element, EDI points to the element after the first match. You have to decrement it to get the pointer to the desired element.
You don't need to clear the direction flag (cld). It's very unlikely that the direction flag is set without any involvement on your part. And if you do set it, you should set it back to its former state.
INCLUDE Irvine32.inc
.DATA
selectedWords BYTE "BICYCLE"
inputLetter BYTE 'Y'
err_msg BYTE "Not found.", 0
.CODE
main PROC
mov ecx, LENGTHOF selectedWords
mov edi, offset selectedWords
mov al, inputLetter ; Load character to find
repne scasb ; Search
jne notfound
dec edi
mov al, [edi]
call WriteChar ; Irvine32: Write a character in AL
exit ; Irvine32: ExitProcess
notfound:
lea edx, err_msg
call WriteString ; Irvine32: Write a null-terminated string pointed to by EDX
exit ; Irvine32: ExitProcess
main ENDP
END main
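The off-by-one behaviour of repne scasb is easier to see in a C model of the instruction. This is a sketch of the scan semantics only; find_letter is a hypothetical name, and the comments map each step to the register it stands for.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of `repne scasb`: the scan pointer is advanced after
   every comparison, so on a match it already points one past the hit. */
const char *find_letter(const char *s, size_t len, char letter)
{
    assert(s != NULL);
    const char *p = s;          /* EDI */
    while (len-- > 0) {         /* ECX counts down */
        if (*p++ == letter)     /* compare, then advance the pointer */
            return p - 1;       /* dec edi: step back onto the match */
    }
    return NULL;                /* ZF=0 after the scan: not found */
}
```

With the pointer in hand, replacing the letter is a single byte store through it.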
If you don't like repne scasb, you can scan the word with a "normal" comparison loop:
INCLUDE Irvine32.inc
.DATA
selectedWords BYTE "BICYCLE"
inputLetter BYTE 'Y'
err_msg BYTE "Not found.", 0
.CODE
main PROC
mov edi, offset selectedWords
mov ecx, LENGTHOF selectedWords
mov al, inputLetter
@@:
cmp [edi], al ; Compare memory/register
je found ; JE = jump if equal
inc edi ; Increment pointer
dec ecx ; Decrement counter
jne @B ; Jump back to the last @@ if ECX != 0
jmp notfound
found:
mov al, [edi]
call WriteChar ; Irvine32: Write a character in AL
exit ; Irvine32: ExitProcess
notfound:
lea edx, err_msg
call WriteString ; Irvine32: Write a null-terminated string pointed to by EDX
exit ; Irvine32: ExitProcess
main ENDP
END main
I cannot use this declaration, because selectedWords can be any string:
.DATA
guessWords BYTE SIZEOF selectedWords DUP ('-'), 0
So I tried to do this:
;Word that we select by random code
selectedWords BYTE ?
lengthSelectedWorld DWORD ?
;Letter that we guess, input from keyboard
guessLetter BYTE ?
guessWords BYTE ?
;Letters that are unknown are replaced with -
letterUnknown BYTE "-", 0
And I have written this function:
make_array1 PROC
mov edx,OFFSET selectedWords
call StrLength
mov lengthSelectedWorld,eax
mov lengthSelectedWorld1 ,eax
inc lengthSelectedWorld
loop_add_more:
cmp lengthSelectedWorld, 1
je done
dec lengthSelectedWorld
mov eax, '-'
mov ecx, lengthSelectedWorld1
mov edi, offset guessWords
rep stosw
mov edx, offset guessWords
call WriteString
call Crlf ; print a newline
jmp loop_add_more
done:
mov eax, '0'
mov ecx, lengthSelectedWorld1
mov edi, offset guessWords
rep stosw
mov edx, offset guessWords
call WriteString
call Crlf ; print a newline
ret
make_array1 ENDP
But after this function, guessWords is a string of ------- without a 0 at the end. So how do I make the string guessWords = -------0?
It's important for me to have the 0 at the end of the string because of some other comparisons in the code.
selectedWords BYTE ? reserves just one byte for selectedWords. The same issue applies to guessWords BYTE ?. Don't play with dynamically allocated memory as a newbie. Instead, reserve space that is sufficient in any case: guessWords BYTE 50 DUP (?). The question mark means that MASM can decide to treat it as uninitialized memory (not stored in the .exe file, but allocated at program start).
STOSW stores a WORD (= two characters). However Irvine's StrLength returns the number of bytes of the string. Use STOSB instead. After STOSB, EDI points to the character after the last stored AL. You can store a null there. If you want to see it, temporarily change 0 to '0'.
INCLUDE Irvine32.inc
.DATA
;Word that we select by random code
selectedWords BYTE "WEIGHTLIFTING", 0
lengthSelectedWord DWORD ?
;Letter that we guess, input from keyboard
guessLetter BYTE ?
guessWords BYTE 50 DUP (?)
;Letters that are unknown are replaced with -
letterUnknown BYTE "-", 0
.CODE
make_array1 PROC
mov edx,OFFSET selectedWords
call StrLength ; Irvine32: Length of a null-terminated string pointed to by EDX
mov lengthSelectedWord,eax
loop_add_more:
mov al, '-' ; Default character for guessWords
mov ecx, lengthSelectedWord ; REP counter
mov edi, offset guessWords ; Destination
rep stosb ; Build guessWords
mov BYTE PTR [edi], 0 ; Store the null termination
mov edx, offset guessWords
call WriteString ; Irvine32: write a string pointed to by EDX
call Crlf ; Irvine32: new line
ret
make_array1 ENDP
main PROC
call make_array1
exit ; Irvine32: ExitProcess
main ENDP
END main
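In C terms, make_array1 does roughly the following. This is a sketch under the same assumption as the answer, namely a fixed buffer that is large enough for any word; make_mask is a hypothetical name, and the capacity check is an addition that the assembly version does not perform.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: fill guess with one '-' per character of word and
   null-terminate it. cap is the size of guess in bytes.
   Returns 0 on success, -1 if the mask does not fit. */
int make_mask(const char *word, char *guess, size_t cap)
{
    assert(word != NULL && guess != NULL);
    size_t len = strlen(word);     /* StrLength */
    if (len + 1 > cap)             /* extra safety check, not in the asm */
        return -1;
    memset(guess, '-', len);       /* rep stosb with AL = '-' */
    guess[len] = '\0';             /* mov BYTE PTR [edi], 0 */
    return 0;
}
```

The terminator is written at index len, which is exactly where EDI points after rep stosb has stored len bytes.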
I am trying to write an assembly program that calls a function in C that will replace certain characters in a string with a predefined character, provided that the current character in the char array meets some qualification.
My c file:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
//display *((char *) $edi)
// These functions will be implemented in assembly:
//
int strrepl(char *str, int c, int (* isinsubset) (int c) ) ;
int isvowel (int c) {
if (c == 'a' || c == 'e' || c == 'i' || c == 'o' || c == 'u')
return 1 ;
if (c == 'A' || c == 'E' || c == 'I' || c == 'O' || c == 'U')
return 1 ;
return 0 ;
}
int main(){
char *str1;
int r;
// I ran my code through a debugger again, and it seems that when displaying
// the character stored in ecx is listed as "A" (correct) right before the call
// to "add ecx, 1" at which point ecx somehow resets to 0 when it should be "B"
str1 = strdup("ABC 123 779 Hello World") ;
r = strrepl(str1, '#', &isdigit) ;
printf("str1 = \"%s\"\n", str1) ;
printf("%d chararcters were replaced\n", r) ;
free(str1) ;
return 0;
}
And my .asm file:
; File: strrepl.asm
; Implements a C function with the prototype:
;
; int strrepl(char *str, int c, int (* isinsubset) (int c) ) ;
;
;
; Result: chars in string are replaced with the replacement character and string is returned.
SECTION .text
global strrepl
_strrepl: nop
strrepl:
push ebp ; set up stack frame
mov ebp, esp
push esi ; save registers
push ebx
xor eax, eax
mov ecx, [ebp + 8] ;load string (char array) into ecx
jecxz end ;jump if [ecx] is zero
mov esi, [ebp + 12] ;move the replacement character into esi
mov edx, [ebp + 16] ;move function pointer into edx
xor bl, bl ;bl will be our counter
firstLoop:
add bl, 1 ;inc bl would work too
add ecx, 1
mov eax, [ecx]
cmp eax, 0
jz end
push eax ; parameter for (*isinsubset)
;BREAK
call edx ; execute (*isinsubset)
add esp, 4 ; "pop off" the parameter
mov ebx, eax ; store return value
end:
pop ebx ; restore registers
pop esi
mov esp, ebp ; take down stack frame
pop ebp
ret
When running this through gdb and putting a breakpoint at ;BREAK, it segfaults after I take a step to the call command with the following error:
Program received signal SIGSEGV, Segmentation fault.
0x0081320f in isdigit () from /lib/libc.so.6
isdigit is part of the standard C library that I have included in my C file, so I am not sure what to make of this.
Edit: I have edited my firstLoop and included a secondLoop which should replace any digits with "#", however it seems to replace the entire array.
firstLoop:
xor eax, eax
mov edi, [ecx]
cmp edi, 0
jz end
mov edi, ecx ; save array
movzx eax, byte [ecx] ;load single byte into eax
mov ebp, edx ; save function pointer
push eax ; parameter for (*isinsubset)
call edx ; execute (*isinsubset)
;cmp eax, 0
;jne end
mov ecx, edi ; restore array
cmp eax, 0
jne secondLoop
mov edx, ebp ; restore function pointer
add esp, 4 ; "pop off" the parameter
mov ebx, eax ; store return value
add ecx, 1
jmp firstLoop
secondLoop:
mov [ecx], esi
mov edx, ebp
add esp, 4
mov ebx, eax
add ecx, 1
jmp firstLoop
Using gdb, when the code gets to secondLoop, everything is correct. ecx is showing as "1", which is the first digit in the string that was passed in from the .c file. esi is displaying as "#", as it should be. However, after I do mov [ecx], esi it seems to fall apart. ecx is displaying as "#", as it should at this point, but once I increment it by 1 to get to the next character in the array, it is listed as "/000" with display. Every character after the 1 that was replaced with "#" is listed as "/000" with display. Before I had secondLoop trying to replace the characters with "#", I just had firstLoop looping by itself to see if it could make it through the entire array without crashing. It did, and after each increment ecx was displaying the correct character. I am not sure why doing mov [ecx], esi would have set the rest of the string at ecx to null.
In your firstLoop: you're loading characters from the string using:
mov eax, [ecx]
which is loading 4 bytes at a time instead of a single byte. So the int that you're passing to isdigit() is likely to be far out of range for it to handle (it probably uses a simple table lookup).
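You can load a single byte using the following Intel asm syntax (this is the MASM form; NASM, which your file appears to use, omits ptr and writes movzx eax, byte [ecx]):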
movzx eax, byte ptr [ecx]
A few other things:
loading 4 bytes also means that the end of the string probably won't be detected properly, since the null terminator might not be followed by three other zero bytes.
I'm not sure why you increment ecx before processing the first character in the string
the assembly code you posted doesn't appear to actually loop over the string
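The difference between the two loads can be shown with a small C sketch. The function names load_byte and load_dword are made up for illustration; memcpy is used so that the 4-byte read is well defined in C.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* What `movzx eax, byte [ecx]` reads: one byte, zero-extended. */
uint32_t load_byte(const char *p)
{
    assert(p != NULL);
    return (unsigned char)*p;
}

/* What `mov eax, [ecx]` reads: four consecutive bytes as one value.
   The caller must guarantee that four bytes are readable at p. */
uint32_t load_dword(const char *p)
{
    assert(p != NULL);
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

For a string starting with "123 ", the byte load gives the character code of '1', while the dword load gives a number far above 255 that isdigit() was never meant to receive. At a null terminator followed by other data, the byte load gives 0 but the dword load usually does not, so the end-of-string test is missed.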
I've put some comments into your code:-
; this is OK: setting up the stack frame and saving important registers
; on Win32, the registers that need saving are: esi, edi and ebx (plus ebp, which the frame setup already saves)
; the rest can be used without needing to preserve them
push ebp
mov ebp, esp
push esi
push ebx
xor eax, eax
mov ecx, [ebp + 8]
; you said that this checked [ecx] for zero, but I think you've just written
; that wrong, this checks the value of ecx for zero, the [reg] form usually indicates
; the value at the address defined by reg
; so this is effectively doing a null pointer check (which is good)
jecxz end
mov esi, [ebp + 12]
mov edx, [ebp + 16]
xor bl, bl
firstLoop:
add bl, 1
; you increment ecx before loading the first character, this means
; that the function ignores the first character of the string
; and will therefore produce an incorrect result if the string
; starts with a character that needs replacing
add ecx, 1
; characters are 8 bit, not 32 bit (mentioned in comments elsewhere)
mov eax, [ecx]
cmp eax, 0
jz end
push eax
; possibly segfaults due to character out of range
; also, as mentioned elsewhere, the function you call here must conform to the
; standard calling convention of the system (e.g., preserve esi, edi and ebx for
; Win32 systems), so eax, ecx and edx can change, so next time you call
; [edx] it might be referencing random memory
; either save edx on the stack (push before pushing parameters, pop after add esp)
; or just load edx with [ebp+16] here instead of at the start
call edx
add esp, 4
mov ebx, eax
; more functionality required here!
end:
; restore important values, etc
pop ebx
pop esi
mov esp, ebp
pop ebp
; the result of the function should be in eax, but that's not set up properly yet
ret
Comments on your inner loop:-
firstLoop:
xor eax, eax
; you're loading a 32 bit value and checking for zero,
; strings are terminated with a null character, an 8 bit value,
; not a 32 bit value, so you're reading past the end of the string
; so this is unlikely to correctly test the end of string
mov edi, [ecx]
cmp edi, 0
jz end
mov edi, ecx ; save array
movzx eax, byte [ecx] ;load single byte into eax
; you need to keep ebp! its value must be saved (at the end,
; you do a mov esp,ebp)
mov ebp, edx ; save function pointer
push eax ; parameter for (*isinsubset)
call edx ; execute (*isinsubset)
mov ecx, edi ; restore array
cmp eax, 0
jne secondLoop
mov edx, ebp ; restore function pointer
add esp, 4 ; "pop off" the parameter
mov ebx, eax ; store return value
add ecx, 1
jmp firstLoop
secondLoop:
; again, you're accessing the string using a 32 bit value, not an 8 bit value
; so you're replacing the matched character and the three next characters
; with the new value
; the upper 24 bits are probably zero so the loop will terminate on the
; next character
; also, the function seems to be returning a count of characters replaced,
; but you're not recording the fact that characters have been replaced
mov [ecx], esi
mov edx, ebp
add esp, 4
mov ebx, eax
add ecx, 1
jmp firstLoop
You do seem to be having trouble with the way memory works: you are getting confused between 8 bit and 32 bit memory access.
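As a reference for what the finished assembly routine should compute, here is a plain C version of strrepl with the prototype from the question. It is a sketch of the intended behaviour, not the asker's or the answerer's code, and is_ascii_digit is a hypothetical stand-in predicate so the example does not depend on the C library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical predicate used only for this example. */
int is_ascii_digit(int ch)
{
    return ch >= '0' && ch <= '9';
}

/* Replace every character of str for which isinsubset() is non-zero with c.
   Returns the number of characters replaced. */
int strrepl(char *str, int c, int (*isinsubset)(int))
{
    assert(isinsubset != NULL);
    int replaced = 0;
    if (str == NULL)                          /* jecxz end: null pointer */
        return 0;
    for (char *p = str; *p != '\0'; p++) {    /* 8-bit end-of-string test */
        if (isinsubset((unsigned char)*p)) {  /* movzx, push, call */
            *p = (char)c;                     /* 8-bit store, not 32-bit */
            replaced++;                       /* the count goes into eax */
        }
    }
    return replaced;
}
```

Every memory access here touches exactly one byte, which is the property the assembly version has to reproduce with byte-sized loads and stores.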