x86 function returning char* in C

I want to write a function in x86 which will be called from C program.
The function should look like this:
char *remnth(char *s, int n);
I want it to remove every nth letter from string s and return that string. Here's my remnth.s file:
section .text
global remnth
remnth:
; prolog
push ebp
mov ebp, esp
; procedure
mov eax, [ebp+8]; Text in which I'm removing every nth letter
mov ebx, [ebp+12]; = n
mov ecx, [ebp+8] ; pointer to next letter (replacement)
lopext:
mov edi, [ebp+12] ; edi = n //setting counter
dec edi ; edi-- //we don't go from 'n' to '1' but from 'n-1' to '0'
lop1:
mov cl, [ecx] ; letter which will be a replacement
mov byte [eax], cl ; replace
test cl,cl ; was the replacement equal to 0?
je exit ; if yes that means the function is over
inc eax ; else increment pointer to letter which will be replaced
inc ecx ; increment pointer to letter which is a replacement
dec edi ; is it already nth number?
jne lop1 ; if not then repeat the loop
inc ecx ; else skip that letter by proceeding to the next one
jmp lopext ; we need to set counter (edi) once more
exit:
; epilog
pop ebp
ret
The problem is that when I call this function from main() in C, I get Segmentation fault (core dumped).
As far as I know, this is usually related to pointers. Here I'm returning a char *, and since I've seen functions that return int work just fine, I suspect I forgot something important about returning a char * properly.
This is what my C file looks like:
#include <stdio.h>
extern char *remnth(char *s,int n);
int main()
{
char txt[] = "some example text\0";
printf("orginal = %s\n",txt);
printf("after = %s\n",remnth(txt,3));
return 0;
}
Any help will be appreciated.

You're using ecx as a pointer, and cl as a work register. Since cl is the low 8 bits of ecx, you're corrupting your pointer with the mov cl, [ecx] instruction. You'll need to change one or the other. Typically, al/ax/eax/rax is used for a temporary work register, as some accesses to the accumulator use shorter instruction sequences. If you use al as a work register, you'll want to avoid using eax as a pointer and use a different register instead (remembering to preserve its contents if necessary).

You need to load the return value into eax before the return. I assume you want to return a pointer to the beginning of the string, so that would be [ebp+8].
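
As a reference for what the fixed assembly should compute, here is a minimal C sketch of the same routine. It is not the asker's code: it assumes "every nth letter" means the characters at positions n, 2n, 3n, ... (1-based), and it keeps the read pointer, the write pointer, and the current character in three separate variables, which is the separation the answers above ask for in registers.

```c
#include <stddef.h>

/* Remove every n-th character of s in place and return s. */
char *remnth(char *s, int n)
{
    if (s == NULL || n <= 0)
        return s;               /* nothing sensible to do */

    const char *src = s;        /* read pointer  (ecx in the question) */
    char *dst = s;              /* write pointer (eax in the question) */
    int count = 0;

    while (*src != '\0') {
        char c = *src++;        /* the value lives in its own variable */
        if (++count == n) {
            count = 0;          /* skip the n-th character */
            continue;
        }
        *dst++ = c;
    }
    *dst = '\0';
    return s;                   /* pointer to the start of the string */
}
```

Unlike the loop in the question, this version checks for the terminator before skipping a character, so it cannot step past the end of the string.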

Related

ASSEMBLY - output an array with 32 bit register vs 16 bit

I was working on some homework that prints out an array as it sorts the integers in it. The code works fine, but when I tried using EAX instead of AL I ran into errors, and I can't figure out why. Is it possible to use EAX here at all?
; This program sorts an array of signed integers, using
; the Bubble sort algorithm. It invokes a procedure to
; print the elements of the array before, the bubble sort,
; once during each iteration of the loop, and once at the end.
INCLUDE Irvine32.inc
.data
myArray BYTE 5, 1, 4, 2, 8
;myArray DWORD 5, 1, 4, 2, 8
currentArray BYTE 'This is the value of array: ' ,0
startArray BYTE 'Starting array. ' ,0
finalArray BYTE 'Final array. ' ,0
space BYTE ' ',0 ; BYTE
.code
main PROC
MOV EAX,0 ; clearing registers, moving 0 into each, and initialize
MOV EBX,0 ; clearing registers, moving 0 into each, and initialize
MOV ECX,0 ; clearing registers, moving 0 into each, and initialize
MOV EDX,0 ; clearing registers, moving 0 into each, and initialize
PUSH EDX ; preserves the original edx register value for future writeString call
MOV EDX, OFFSET startArray ; load EDX with address of variable
CALL writeString ; print string
POP EDX ; return edx to previous stack
MOV ECX, lengthOf myArray ; load ECX with # of elements of array
DEC ECX ; decrement count by 1
L1:
PUSH ECX ; save outer loop count
MOV ESI, OFFSET myArray ; point to first value
L2:
MOV AL,[ESI] ; get array value
CMP [ESI+1], AL ; compare a pair of values
JGE L3 ; if [esi+1] >= [esi], don't exchange
XCHG AL, [ESI+1] ; exchange the pair
MOV [ESI], AL
CALL printArray ; call printArray function
CALL crlf
L3:
INC ESI ; increment esi to the next value
LOOP L2 ; inner loop
POP ECX ; retrieve outer loop count
LOOP L1 ; else repeat outer loop
PUSH EDX ; preserves the original edx register value for future writeString call
MOV EDX, OFFSET finalArray ; load EDX with address of variable
CALL writeString ; print string
POP EDX ; return edx to previous stack
CALL printArray
L4 : ret
exit
main ENDP
printArray PROC uses ESI ECX
;myArray loop
MOV ESI, OFFSET myArray ; address of myArray
MOV ECX, LENGTHOF myArray ; loop counter (5 values within array)
PUSH EDX ; preserves the original edx register value for future writeString call
MOV EDX, OFFSET currentArray ; load EDX with address of variable
CALL writeString ; print string
POP EDX ; return edx to previous stack
L5 :
MOV AL, [ESI] ; add an integer into eax from array
CALL writeInt
PUSH EDX ; preserves the original edx register value for future writeString call
MOV EDX, OFFSET space
CALL writeString
POP EDX ; restores the original edx register value
ADD ESI, TYPE myArray ; point to next integer
LOOP L5 ; repeat until ECX = 0
CALL crlf
RET
printArray ENDP
END main
END printArray
; output:
;Starting array. This is the value of array: +1 +5 +4 +2 +8
;This is the value of array: +1 +4 +5 +2 +8
;This is the value of array: +1 +4 +2 +5 +8
;This is the value of array: +1 +2 +4 +5 +8
;Final array. This is the value of array: +1 +2 +4 +5 +8
As you can see, the output sorts the array just fine from least to greatest. I tried to move AL into EAX, but that gave me a bunch of errors. Is there a workaround so I can use a 32-bit register and get the same output?

Using EAX is definitely possible; in fact, you already are. You wrote, "I was trying to see if I could move AL into EAX, but that gave me a bunch of errors." Think about what that means. EAX is the extended (32-bit) AX register, and AL is the low half of AX: EAX is bits 31..0, AX is bits 15..0, AH is bits 15..8, and AL is bits 7..0. So moving AL into EAX with, say, the MOVZX instruction simply keeps the value in AL and zeroes the upper 24 bits of EAX. You'd be moving AL into AL and setting the rest of EAX to 0. For register-to-register work you could use EAX instead and the program would behave the same, because AL is part of the same register rather than separate storage.
Also, why are you pushing and popping EDX so much? The only reason to push/pop things on the runtime stack is to recover them later, but you never need the old value, so you can just let whatever is in EDX at the time die.
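
The AL/EAX relationship described above can be modelled in C with masks. This is only an illustrative sketch; write_al and read_al are hypothetical helper names, not part of Irvine32 or any library.

```c
#include <stdint.h>

/* Model of "mov al, v": replace the low 8 bits of EAX, keep the upper 24. */
uint32_t write_al(uint32_t eax, uint8_t v)
{
    return (eax & 0xFFFFFF00u) | v;
}

/* Model of reading AL: the low 8 bits of EAX. */
uint8_t read_al(uint32_t eax)
{
    return (uint8_t)(eax & 0xFFu);
}
```

Writing AL leaves whatever was in the upper 24 bits of EAX in place, which is why a routine that reads all of EAX (such as one that prints a 32-bit integer) needs those bits set to something known first.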

If you still want to do an 8-bit store, you need to use an 8-bit register. (AL is an 8-bit register. IDK why you mention 16 in the title).
x86 has widening loads (movzx and movsx), but integer stores from a register operand always take a register the same width as the memory operand. i.e. the way to store the low byte of EAX is with mov [esi], al.
In printArray, you should use movzx eax, byte ptr [esi] to zero-extend into EAX. (Or movsx to sign-extend, if you want to treat your numbers as int8_t instead of uint8_t.) This avoids needing the upper 24 bits of EAX to be zeroed.
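
The difference between the two widening loads can be sketched in C. The function names below are made up for the illustration; the arithmetic form of sign extension is used so the result does not depend on implementation-defined conversions.

```c
#include <stdint.h>

/* What movzx does: the byte becomes the low 8 bits, upper bits are 0. */
uint32_t zero_extend8(uint8_t b)
{
    return (uint32_t)b;
}

/* What movsx does: bit 7 of the byte is copied into the upper 24 bits. */
int32_t sign_extend8(uint8_t b)
{
    return (int32_t)(b ^ 0x80) - 0x80;
}
```

For the byte 0xF8, zero extension gives 248 and sign extension gives -8, so the choice depends on whether the array elements are meant as unsigned or signed values.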
BTW, your code has a lot of unnecessary instructions. e.g.
MOV EAX,0 ; clearing registers, moving 0 into each, and initialize
totally pointless. You don't need to "init" or "declare" a register before using it for the first time, if your first usage is write-only. What you do with EDX is amusing:
MOV EDX,0 ; clearing registers, moving 0 into each, and initialize
PUSH EDX ; preserves the original edx register value for future writeString call
MOV EDX, OFFSET startArray ; load EDX with address of variable
CALL writeString ; print string
POP EDX ; return edx to previous stack
"Caller-saved" registers only have to be saved if you actually want the old value. I prefer the terms "call-preserved" and "call-clobbered". If writeString destroys its input register, then EDX holds an unknown value after the function returns, but that's fine. You didn't need the value anyway. (Actually I think Irvine32 functions at most destroy EAX.)
In this case, the previous instruction only zeroed the register (inefficiently). That whole block could be:
MOV EDX, OFFSET startArray ; load EDX with address of variable
CALL writeString ; print string
xor edx,edx ; edx = 0
Actually you should omit the xor-zeroing too, because you don't need it to be zeroed. You're not using it as counter in a loop or anything, all the other uses are write-only.
Also note that XCHG with memory has an implicit lock prefix, so it does the read-modify-write atomically (making it much slower than separate mov instructions to load and store).
You could load a pair of bytes using movzx eax, word ptr [esi] and use a branch to decide whether to rol ax, 8 to swap them or not. But store-forwarding stalls from byte stores being forwarded to word loads aren't great either.
Anyway, this is getting way off topic from the title question, and this isn't codereview.SE.
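
The byte-pair idea mentioned above (load two bytes as one 16-bit word, rotate by 8 to swap them) might look like this in C. It is a sketch under assumptions: the helper names are invented, the pair is assembled in little-endian order as x86 would load it, and the comparison is signed to match the JGE in the question.

```c
#include <stdint.h>

/* Equivalent of "rol ax, 8": swap the two bytes of a 16-bit value. */
uint16_t rol16_by8(uint16_t x)
{
    return (uint16_t)((x << 8) | (x >> 8));
}

/* Put p[0] and p[1] in ascending (signed) order. */
void order_pair(unsigned char *p)
{
    /* like movzx eax, word ptr [esi]: p[0] is the low byte */
    uint16_t w = (uint16_t)(p[0] | (p[1] << 8));

    /* XOR with 0x80 maps signed byte order onto unsigned order */
    if ((p[1] ^ 0x80) < (p[0] ^ 0x80))
        w = rol16_by8(w);

    p[0] = (unsigned char)(w & 0xFF);
    p[1] = (unsigned char)(w >> 8);
}
```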

Get returned char from a C function using inline assembly

I have this function that uses inline assembly that basically calls a C function, gets the returned value, and passes that value as a parameter to another function that returns a character.
void convertText(FILE *arch, FILE *result)
{
int i = 0;
int n = arch->size;
_asm {
mov esi, 0
whileS:
cmp esi, n
jge end
mov ebx, result
mov ebx, [ebx]result.information ; Pointer to an array of characters
push esi ; Push parameters to get5bitsFunc
push arch ; Push parameters to get5bitsFunc
call get5bitsFunc
pop arch ; Restore values
pop esi ; Restore values
push eax ; push get5bitsFunc returned value to codify as parameter
call codify
mov edi, eax ; <- HERE move returned value from codify to edi register
pop eax ; restore eax
inc esi
jmp whileS
end:
}
}
Think of codify as function of the type
unsigned char codify(unsigned char parameter) {
unsigned char resp;
// Do something to the parameter
resp = 'b'; // asign value to resp
return resp;
}
I have already tested codify from C code, and it works fine, returning the value I want. The problem is that when I run and debug convertText, at the line I have marked "HERE" the value in eax is something like 3424242 rather than an ASCII code of 97 or above, which is what I need.
How can I get the char value?

The Windows ABI apparently doesn't require functions returning char to zero- or sign-extend the value into EAX, so you need to assume that the bytes above AL hold garbage. (This is the same as in the x86 and x86-64 System V ABI. See also the x86 tag wiki for ABI/calling convention docs).
You can't assume that zeroing EAX before calling codify() is sufficient. It's free to use all of EAX as a scratch register before returning with the char in AL, but garbage in the rest of EAX.
You actually need to movzx esi, al, (or MOVSX), or mov [mem], al or whatever else you want to do to ignore garbage in the high bytes.
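
In C terms, ignoring the high bytes amounts to masking the raw 32-bit value down to its low byte. The snippet below is only a model of that step; char_from_eax is a hypothetical name, and the raw value in the usage note is an invented example, not the one from the question.

```c
#include <stdint.h>

/* Keep only AL: the low byte of a raw 32-bit return register. */
unsigned char char_from_eax(uint32_t eax)
{
    return (unsigned char)(eax & 0xFFu);
}
```

For instance, a raw register value of 0x00343F62 carries 0x62 in its low byte, which is the character 'b'; the other three bytes are leftovers.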

An unsigned char is only 1 byte, while eax is a 32-bit (4-byte) register. If codify() returns only 1 byte, the return value is in al (the low byte of eax), and the upper three bytes of eax are unspecified: the callee may have used eax as scratch space, so zeroing it with xor eax, eax before the call does not help. Zero-extend after the call instead, e.g. movzx eax, al, before using the full register.

MASM Why doesn't decrementing a register find the next value in an array?

I'm testing to see if an entered string is a palindrome by taking the string, moving it into a character array and comparing first and last elements of the char array to each other to test. I can get the first element of the array to find the second character easily, but to find the last acceptable value and decrement that, it doesn't find the next character in the array. So if the corrected/cleaned char array looks like:
['A']['B']['C']['D']['A']
ebx will go from 'A' -> 'B' but edi will not change from 'A' -> 'D'
Why will ebx change characters but edi only subtracts 1 from its register value? What can I do to have edi change character value? Thanks!
C++ code: (just in case)
#include <iostream>
#include <cstring>
#include <sstream>
using namespace std;
extern"C"
{
char stringClean(char *, int);
char isPalindrome(char *, int);
}
int main()
{
int pal = 0;
const int SIZE = 30;
string candidate = "";
char strArray[SIZE] = { '\0' };
cout << "enter a string to be tested: ";
getline(cin, candidate);
int j = 0;
for (int i = 0; i < candidate.length(); i++) //getting rid of garbage before entering into array
{
if (candidate[i] <= 'Z' && candidate[i] >= 'A' || candidate[i] <= 'z' && candidate[i] >= 'a')
{
strArray[j] = candidate[i];
j++;
}
}
if (int pleaseWork = stringClean(strArray, SIZE) == 0)
pal = isPalindrome(strArray, SIZE);
if (pal == 1)
cout << "Your string is a palindrome!" << endl;
else
cout << "Your string is NOT a palindrome!" << endl;
system("pause");
return 0;
}
masm code:
.686
.model flat
.code
_isPalindrome PROC ; leading underscore because C automatically prepends an underscore; it is needed to interoperate
push ebp
mov ebp,esp ; stack pointer to ebp
mov ebx,[ebp+8] ; address of first array element
mov ecx,[ebp+12] ; number of elements in array
mov ebp,0
mov edx,0
mov eax,0
push edi ;save this
push ebx ;save this
mov edi, ebx ;make a copy of first element in array
add edi, 29 ;move SIZE-1 (30 - 1 = 29) elements down to, HOPEFULLY, the last element in array
mov bl, [ebx]
mov dl, [edi]
cmp dl, 0 ;checks if last element is null
je nextElement ;if null, find next
jne Comparison ;else, start comparing at Comparison:
nextElement:
dec edi ;finds next element
mov dl, [edi] ;move next element into lower edx
cmp dl, 0 ;checks if new element is null
je nextElement ;if null, find next
jne Comparison ;else, start comparing at Comparison:
Comparison:
cmp bl,dl ;compares first element and last REAL element
je testNext ;jump to testNext: for further testing
mov eax,1 ;returns 1 (false) because the test failed
jne allDone ;jump to allDone because it's not a palindrome
testNext:
dec edi ;finds last good element -1 --------THIS ISN'T DOING the right thing
inc ebx ;finds second element
cmp ebx, edi ;checks if elements are equal because that has tested all elements
je allDone
;mov bl,[ebx] ;move incremented ebx into bl
;mov dl,[edi] ;move decremented edi into dl
jmp Comparison ;compare newly acquired elements
allDone:
xor eax, eax
mov ebp, eax
pop edi
pop edx
pop ebp
ret
_isPalindrome ENDP
END

I haven't tested your code, but looking at it I noticed some possible problems.
Why will ebx change characters
It seems that way, but it's not what you intended. You commented out the lines that read the characters from memory (the array) after the initial phase (see below). So you did change the character in EBX, but not the way you expected (and presumably wanted). With INC EBX you increased the char value from 'A' (65 decimal) to 'B' (66 decimal). That 'B' is also the second char of the string is merely a coincidence. Try changing the string from ABCDA to ARRCD or something, and you'd still get a 'B' on the second round. So EBX does indeed change.
...
;mov bl,[ebx] ;move incremented ebx into bl
;mov dl,[edi] ;move decremented edi into dl
jmp Comparison ;compare newly acquired elements
...
but edi only subtracts 1 from its register value?
What can I do to have edi change character value?
Yes. That's what your code does and it's correct. Uncomment the above line containing [edi] and the char pointed at by EDI will be loaded into the lower byte of EDX = DL.
The problem with your code is that you are using EBX both as a pointer and as a (char) value. Loading the next char into EBX will destroy the pointer, and your program is likely to crash with an ACCESS_VIOLATION in the next iteration or show random behaviour that would be hard to debug.
Separate pointers from values as you have done with EDI/EDX (EDI = pointer to char, EDX (DL) = char value).
Another problem is: your code will only work for strings with an odd length.
testNext:
dec edi ; !!!
inc ebx ; !!!
cmp ebx, edi ; checks if elements are equal because that has tested all elements
je allDone
So you are incrementing and decrementing the (should-be) pointers and then comparing them. Now consider this case of an even-length string:
ABBA
^  ^ (EBX (first) and EDI (second))
=> inc EBX, dec EDI =>
ABBA
 ^^ (EBX (first) and EDI (second))
=> inc EBX, dec EDI =>
ABBA
 ^^ (EDI (first) and EBX (second))
=> inc EBX, dec EDI =>
ABBA
^  ^ (EDI (first) and EBX (second))
=> inc EBX, dec EDI =>
 ABBA
^    ^ (EDI (first) and EBX (second), both outside the string)
...
=> Problem! It won't terminate: the condition EBX = EDI will never be met.
Possible solution: use jae (A = Above, i.e. greater for unsigned values) instead of je
...
cmp ebx, edi
jae allDone
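
The same two-pointer check, written as a C sketch, shows why an ordering comparison is the right loop condition. The function name and the explicit length parameter are assumptions made for the example; unlike the assembly in the question, it returns 1 for a palindrome and 0 otherwise.

```c
#include <stddef.h>

/* Return 1 if the first len characters of s read the same both ways. */
int is_palindrome_buf(const char *s, size_t len)
{
    if (len == 0)
        return 1;

    const char *lo = s;             /* pointer to the front character */
    const char *hi = s + len - 1;   /* pointer to the back character  */

    while (lo < hi) {               /* stops when they meet OR cross  */
        if (*lo != *hi)             /* compare values, not pointers   */
            return 0;
        ++lo;
        --hi;
    }
    return 1;
}
```

Because the loop tests lo < hi rather than lo == hi, it ends for even-length input such as ABBA, where the two pointers cross without ever being equal.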

How do I put a register into an array index in MASM?

I'm having a really hard time with arrays in MASM. I don't understand how to put the value of a register into an index of an array. I can't seem to find where arr[i] is. What is it I'm missing or what do I have wrong?
Thanks for your time!
C++ code:
#include <iostream>
using namespace std;
extern"C"
{
char intToBinary(char *, int, int);
}
int main()
{
const int SIZE = 16;
char arr[SIZE] = { '/0' };
cout << "What integer do you want converted?" << endl;
cin >> decimal;
char value = intToBinary(arr, SIZE, decimal);
return 0;
}
Assembly code:
.686
.model flat
.code
_intToBinary PROC ; leading underscore because C automatically prepends an underscore; it is needed to interoperate
push ebp
mov ebp,esp ; stack pointer to ebp
mov ebx,[ebp+8] ; address of first array element
mov ecx,[ebp+12] ; number of elements in array
mov edx, 0 ;has to be 0 to check remainder
mov esi, 2 ;the new divisor
mov edi, 12
LoopMe:
add ebx, 4
xor edx, edx ;keep this 0 at all divisions
div esi ;divide eax by 2
inc ebx ;increment by 1
mov [ebp + edi], edx ;put edx into the next array index
add edi, 4 ;add 4 bytes to find next index
cmp ecx, ebx ;compare iterator to number of elements (16)
jg LoopMe
pop ebp ;return
ret
_intToBinary ENDP
END

In your C++ code
decimal is not defined.
'/0' is not the null character; it is a multi-character literal. Use \, not /, to write escape sequences in C++.
value isn't used.
Your code should be like this:
#include <iostream>
using namespace std;
extern"C"
{
char intToBinary(char *, int, int);
}
int main()
{
const int SIZE = 16;
char arr[SIZE] = { '\0' };
int decimal;
cout << "What integer do you want converted?" << endl;
cin >> decimal;
intToBinary(arr, SIZE, decimal);
for (int i = SIZE - 1; i >= 0; i--) cout << arr[i];
cout << endl;
return 0;
}
In your assembly code
You stored the "address of first array element" to ebx by mov ebx,[ebp+8], so the address of arr will be there.
Unfortunately, it is destroyed by add ebx, 4 and inc ebx.
"put edx into the next array index" No, [ebp + edi] isn't the next array index, and the store is destroying data on the stack. That is very bad.
Don't add 4 bytes to "find next index" if your char is 1 byte in size.
Your code should be like this (sorry, this is NASM code because I am unfamiliar with MASM):
bits 32
global _intToBinary
_intToBinary:
push ebp
mov ebp, esp ; stack pointer to ebp
push esi ; save this register before clobbering it in the code
push edi ; save this, too
push ebx ; save this, too
mov ebx, [ebp + 8] ; address of first array element
mov ecx, [ebp + 12] ; number of elements in array
mov eax, [ebp + 16] ; the number to convert
xor edi, edi ; the index of array to store
mov esi, 2 ; the new divisor
LoopMe:
xor edx, edx ; keep this 0 at all divisions
div esi ; divide eax by 2
add dl, 48 ; convert the number in dl to a character representing it
mov [ebx + edi], dl ; put dl into the next array index
inc edi ; add 1 byte to find next index
cmp ecx, edi ; compare iterator to number of elements
jg LoopMe
xor eax, eax ; return 0
pop ebx ; restore the saved register
pop edi ; restore this, too
pop esi ; restore this, too
mov esp, ebp ; restore stack pointer
pop ebp
ret
Note that this code will store the binary text in reversed order, so I wrote the C++ code to print them from back to front.
Also note that there is no terminating null character in arr, so do not do cout << arr;.
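
For comparison, the loop in the NASM routine above corresponds roughly to the following C sketch. The name int_to_binary and the unsigned parameter types are choices made for the illustration; like the assembly, it writes the least significant bit first and does not add a terminating null character.

```c
#include <stddef.h>

/* Fill arr[0..size-1] with '0'/'1' digits of value, lowest bit first. */
void int_to_binary(char *arr, size_t size, unsigned int value)
{
    for (size_t i = 0; i < size; ++i) {
        arr[i] = (char)('0' + value % 2);   /* remainder -> digit (add dl, 48) */
        value /= 2;                         /* quotient stays for next round   */
    }
}
```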

You have the address of the first array element in ebx, so with edi as your loop counter (starting at 0), mov [ebx + edi], dl would store the low byte of edx into arr[edi].
Also note that your loop condition is wrong (your cmp is comparing the number of elements against the starting address of the array.)
Avoid div whenever possible. To divide by two, right-shift by one. div is very slow (like 10 to 30 times slower than a shift).
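
For unsigned values, dividing by two and taking the remainder can be replaced by a shift and a mask, which is what the advice above amounts to. A minimal C illustration (the helper names are invented for the example):

```c
/* Same result as x % 2 for unsigned x: the lowest bit. */
unsigned int low_bit(unsigned int x)
{
    return x & 1u;
}

/* Same result as x / 2 for unsigned x: shift right by one. */
unsigned int half(unsigned int x)
{
    return x >> 1;
}
```

In the assembly loop this would mean a shift of eax (shr eax, 1) in place of div esi, with the bit taken from the low end of the value before shifting.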
BTW, since you have a choice of which registers to use (out of the ones the ABI says you're allowed to clobber without saving/restoring), edi is used for a "destination" pointer by convention (i.e. when it doesn't cost any extra instructions), while esi is used as a "source" pointer.
Speaking of the ABI, you need to save/restore ebx in functions that use it, same as ebp. It keeps its value across function calls (because any ABI-compliant function you call preserves it). In the usual 32-bit calling conventions, esi and edi are call-preserved as well. You can check at the helpful links in https://stackoverflow.com/tags/x86/info. 32-bit is obsolete; 64-bit has a more efficient ABI and includes SSE2 as part of the baseline.

x86 Assembly Lowercase unhandled exception [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
x86 convert to lower case assembly
This program is to convert a 2d char array into lower case
Quickie Edit: I'm using Visual Studio 2010
int b_search (char list[100][20], int count, char* token)
{
__asm
{
mov eax, 0 ; zero out the result
mov esi, list ; move the list pointer to ESI
mov ebx, count ; move the count into EBX
mov edi, token ; move the token to search for into EDI
MOV ecx, 0
LOWERCASE_TOKEN: ;lowercase the token
OR [edi], 20h
INC ecx
CMP [edi+ecx],0
JNZ LOWERCASE_TOKEN
MOV ecx, 0
At my OR instruction, where I'm trying to convert the string that the token pointer addresses to all lower case, I keep getting an unhandled exception (access violation), and without the brackets nothing gets lowercased. Later in my code I have
LOWERCASE_ARRAY: ;for(edi = 0, edi<ebx; edi++), loops through each name
CMP ecx, ebx
JGE COMPARE
INC ecx ;ecx++
MOV edx, 0; ;edx = 0
LOWERCASE_STRING: ;while next char != 0, loop through each byte to convert to lower case
OR [esi+edx],20h ;change to lower case
INC edx
CMP [esi+edx],0 ;if [esi+edx] not zero, loop again
JNZ LOWERCASE_STRING
JMP LOWERCASE_ARRAY ;jump back to start case change of next name
and the OR instruction there seems to work perfectly so I don't know why the first won't work. Also, I am trying to convert several strings.
After I finish one string, any ideas how I would go about moving to the next string (as in list[1][x], list[2][x], etc.)? I tried adding 20, as in [esi+20*ecx+edi], but that doesn't work. Can I get advice on how to proceed?

One possibility:
If the parameters of b_search are passed in registers (a register calling convention), then your first asm line overwrites the list pointer, because eax points to the list array:
mov eax, 0 ; zero out the result
Because:
mov esi, list ; move the list pointer to ESI
would then be translated to:
mov esi, eax
Try swapping the first and second lines:
mov esi, list ; move the list pointer to ESI
mov eax, 0 ; zero out the result
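
On the follow-up about stepping to the next string: in a char list[100][20], row i starts 20 bytes after row i-1. x86 addressing modes can scale an index only by 1, 2, 4, or 8, so an operand like [esi+20*ecx+edi] cannot be encoded; one option is to add 20 to the row pointer after each string. The C sketch below shows that layout. The function name is hypothetical, and OR-ing with 0x20 is only meaningful for ASCII letters, as in the question.

```c
#define NAME_LEN 20   /* row length, from char list[100][20] */

/* Lowercase the first `count` rows of list by setting bit 5 of each byte. */
void lowercase_list(char list[][NAME_LEN], int count)
{
    for (int i = 0; i < count; ++i) {
        char *row = list[i];              /* base address + NAME_LEN * i */
        for (int j = 0; j < NAME_LEN && row[j] != '\0'; ++j)
            row[j] |= 0x20;               /* same effect as OR byte, 20h */
    }
}
```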
