Operand Size Conflict in x86 Assembly

I've just started programming in Assembly for my computer organization course, and I keep getting an operand size conflict error whenever I try to compile this asm block within a C program.
The arrayOfLetters[] object is a char array, so shouldn't each element be one byte? The code works when I do mov eax, arrayOfLetters[1], but I'm not sure why that works, as the eax register is 4 bytes.
#include <stdio.h>
#define SIZE 3
char findMinLetter( char arrayOfLetters[], int arraySize )
{
char min;
__asm{
push eax
push ebx
push ecx
push edx
mov dl, 0x7f // initialize DL
mov al, arrayOfLetters[1] //Problem occurs here
mov min, dl // read DL
pop edx
pop ecx
pop ebx
pop eax
}
return min;
}
int main()
{
char arrayOfLetters[ SIZE ] = {'a','B','c'};
int i;
printf("\nThe original array of letters is:\n\n");
for(i=0; i<SIZE; i++){
printf("%c ", arrayOfLetters[i]);
}
printf("\n\n");
printf("The smallest (potentially capitalized) letter is: %c\n", findMinLetter( arrayOfLetters, SIZE ));
return 0;
}

Use mov al, BYTE PTR arrayOfLetters[1].
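You can compile the code with MSVC using cl input.c /Faoutput.asm to get an assembly listing. It shows that plain arrayOfLetters[1] is treated as a DWORD PTR operand, so you need to explicitly state that you want a BYTE PTR. The operand is a DWORD because an array parameter is really a pointer (4 bytes in 32-bit code), which is also why mov eax, arrayOfLetters[1] assembles. Keep in mind that this operand addresses the pointer variable itself, so to read the actual second character you would typically load the pointer into a register first (for example mov ebx, arrayOfLetters followed by mov al, [ebx + 1]).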
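In C terms, the parameter is a char * and an element is reached by dereferencing it. A minimal sketch of that two-step access, where second_letter is a hypothetical helper name that does not appear in the question:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: mirrors what the inline asm has to do by hand.
   Step 1: fetch the address stored in the parameter.
   Step 2: read a single byte at offset 1 from that address. */
char second_letter(char arrayOfLetters[])
{
    char *base = arrayOfLetters;   /* the parameter holds an address */
    assert(base != NULL);
    return *(base + 1);            /* one byte, like mov al, [reg + 1] */
}
```

Calling second_letter on the array from main() in the question would yield the character stored at index 1.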

Related

How to pass array from assembly to a C function

I want to pass an array defined in assembly code to a C function, but I'm getting a segmentation fault when I try to access that array in my C code. Here is the assembly code (I'm using NASM):
%include "io.inc"
extern minimo ;My C function
extern printf
section .data
array db 1, 2, 3, 4, 5 ;My array
alen db 5 ;My array length
fmt db "%d", 10, 0 ;Format for the printf function
section .text
global CMAIN
CMAIN:
xor eax, eax
mov ebx, [alen]
mov ecx, [array]
push ebx
push ecx
call minimo
add esp, 8
push eax
push fmt
call printf
add esp, 8
mov eax, 1
mov ebx, 0
int 80h
And here is my C code:
int minimo(int *array, int size){
int ret = array[0];
for (int i = 1; i < size; i++){
if(array[i] < ret){
ret = array[i];
}
}
return ret;
}
mov ecx, [array] loads the value stored at the location that array labels, not its address. The C function expects a pointer, so you need to pass the address: mov ecx, array will do.
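Seen from the C side, the call needs the address of the first element plus a full-width int length. The sketch below repeats minimo from the question and adds a hypothetical caller, minimo_of_sample, to show the shape of the arguments. It assumes a 4-byte int, as on typical x86 targets, which suggests the NASM data would also need dd rather than db for both array and alen (with db, mov ebx, [alen] reads the 5 together with the first three bytes of fmt).

```c
#include <assert.h>
#include <stddef.h>

/* Same logic as the minimo function in the question. */
int minimo(const int *array, int size)
{
    assert(array != NULL && size > 0);
    int ret = array[0];
    for (int i = 1; i < size; i++) {
        if (array[i] < ret) {
            ret = array[i];
        }
    }
    return ret;
}

/* Hypothetical caller: passes the address of the first element and an
   int-sized length, which is what the assembly side has to push. */
int minimo_of_sample(void)
{
    static const int array[] = {1, 2, 3, 4, 5};   /* int-sized elements */
    int alen = (int)(sizeof array / sizeof array[0]);
    return minimo(array, alen);                    /* address, not array[0] */
}
```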

x86 function returning char* in C

I want to write a function in x86 which will be called from C program.
The function should look like this:
char *remnth(char *s, int n);
I want it to remove every nth letter from string s and return that string. Here's my remnth.s file:
section .text
global remnth
remnth:
; prolog
push ebp
mov ebp, esp
; procedure
mov eax, [ebp+8]; Text in which I'm removing every nth letter
mov ebx, [ebp+12]; = n
mov ecx, [ebp+8] ; pointer to next letter (replacement)
lopext:
mov edi, [ebp+12] ; edi = n //setting counter
dec edi ; edi-- //we don't go from 'n' to '1' but from 'n-1' to '0'
lop1:
mov cl, [ecx] ; letter which will be a replacement
mov byte [eax], cl ; replace
test cl,cl ; was the replacement equal to 0?
je exit ; if yes that means the function is over
inc eax ; else increment pointer to letter which will be replaced
inc ecx ; increment pointer to letter which is a replacement
dec edi ; is it already nth number?
jne lop1 ; if not then repeat the loop
inc ecx ; else skip that letter by proceeding to the next one
jmp lopext ; we need to set counter (edi) once more
exit:
; epilog
pop ebp
ret
The problem is that when I call this function from main() in C, I get Segmentation fault (core dumped).
From what I know, this is closely related to pointers. In this case I'm returning a char *, and since I've seen functions that return int work just fine, I suspect I've forgotten something important about returning a char * properly.
This is what my C file looks like:
#include <stdio.h>
extern char *remnth(char *s,int n);
int main()
{
char txt[] = "some example text\0";
printf("orginal = %s\n",txt);
printf("after = %s\n",remnth(txt,3));
return 0;
}
Any help will be appreciated.
You're using ecx as a pointer, and cl as a work register. Since cl is the low 8 bits of ecx, you're corrupting your pointer with the mov cl, [ecx] instruction. You'll need to change one or the other. Typically, al/ax/eax/rax is used for a temporary work register, as some accesses to the accumulator use shorter instruction sequences. If you use al as a work register, you'll want to avoid using eax as a pointer and use a different register instead (remembering to preserve its contents if necessary).
You need to load the return value into eax before the return. I assume you want to return a pointer to the beginning of the string, so that would be [ebp+8].
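When debugging the assembly, it can help to have a plain C version to compare against. This is a minimal sketch of the behavior described in the question (copy n-1 characters, skip the nth, repeat), with remnth_ref as a hypothetical name so it does not clash with the assembly symbol. It uses separate read and write pointers plus a separate work variable, which is the separation the answer above asks for, and it returns the original start of the string.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C reference for remnth: removes every nth character of s
   in place and returns s. For n <= 0 nothing is removed. */
char *remnth_ref(char *s, int n)
{
    assert(s != NULL);
    char *dst = s;                       /* where the next kept char goes */
    int count = 0;                       /* position within current group */
    for (const char *src = s; *src != '\0'; ++src) {
        if (++count == n) {              /* this is the nth char: skip it */
            count = 0;
            continue;
        }
        *dst++ = *src;                   /* keep this character */
    }
    *dst = '\0';                         /* terminate the shortened string */
    return s;                            /* pointer to the start, like [ebp+8] */
}
```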

How to return an assembler value to a C Int Pointer?

I am writing a small ASM/C program for calculating the number of divisors of a number. I have the following C function:
#include <stdio.h>
extern void getDivisorCounter(int value, int* result);
int main(int argc, char** argv) {
int number;
printf("Please insert number:\n");
scanf("%d", &number);
int* result;
getDivisorCounter(number, result);
printf("amount of div: %d\n", *result);
return 0;
}
where I receive a result from the following assembler program:
section .text
global getDivisorCounter
getDivisorCounter:
push ebp
mov ebp, esp
mov ecx, [ebp+8]
mov eax, 0
push ebx
for_loop:
mov ebx, ecx
jmp checking
adding:
add ebx, ecx
checking:
cmp ebx, [ebp+8]
jg looping
jl adding
inc eax
looping:
loop for_loop
mov [ebp+12], eax
pop ebx
pop ebp
ret
From debugging, I know that I end up with the right value in eax, but somehow I cannot get it printed by my C program.
Could you give me a hint on how to solve this?
In case it matters, I am using NASM and GCC.
You do not need a pointer for this. Anyway, if you (or the assignment) insist, you must 1) initialize said pointer on the C side and 2) write through that pointer on the asm side.
E.g.
int value;
int* result = &value;
and
mov ecx, [ebp+12]
mov [ecx], eax
If you must use a pointer, this does not mean you need to create an extra pointer variable. You can just pass the address of a variable of proper type. This would eliminate the risk of missing memory allocation.
Missing memory allocation is the reason for your problem. result does not point to valid memory.
Instead of
int val;
int *result = &val; // <<== note the mandatory initialization of your pointer.
getDivisorCounter(number, result);
printf("amount of div: %d\n", val);
you could use this:
int result;
getDivisorCounter(number, &result);
printf("amount of div: %d\n", result);
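For comparison, here is a C stand-in for the assembly routine that shows the write through the pointer. The name getDivisorCounter_ref is hypothetical, and the count uses simple trial division rather than the repeated-addition approach of the assembly in the question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C reference: counts the positive divisors of value and
   stores the count in the int that result points to. */
void getDivisorCounter_ref(int value, int *result)
{
    assert(result != NULL);          /* caller must pass valid storage */
    int count = 0;
    for (int d = 1; d <= value; d++) {
        if (value % d == 0) {
            count++;
        }
    }
    *result = count;                 /* like mov ecx, [ebp+12] / mov [ecx], eax */
}
```

The caller owns the storage: declaring int result; and passing &result is enough, as in the snippet above.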

How do I put a register into an array index in MASM?

I'm having a really hard time with arrays in MASM. I don't understand how to put the value of a register into an index of an array. I can't seem to find where arr[i] is. What is it I'm missing or what do I have wrong?
Thanks for your time!
C++ code:
#include <iostream>
using namespace std;
extern"C"
{
char intToBinary(char *, int, int);
}
int main()
{
const int SIZE = 16;
char arr[SIZE] = { '/0' };
cout << "What integer do you want converted?" << endl;
cin >> decimal;
char value = intToBinary(arr, SIZE, decimal);
return 0;
}
Assembly code:
.686
.model flat
.code
_intToBinary PROC ; leading underscore because C automatically prepends one to symbol names, it is needed to interoperate
push ebp
mov ebp,esp ; stack pointer to ebp
mov ebx,[ebp+8] ; address of first array element
mov ecx,[ebp+12] ; number of elements in array
mov edx, 0 ;has to be 0 to check remainder
mov esi, 2 ;the new divisor
mov edi, 12
LoopMe:
add ebx, 4
xor edx, edx ;keep this 0 at all divisions
div esi ;divide eax by 2
inc ebx ;increment by 1
mov [ebp + edi], edx ;put edx into the next array index
add edi, 4 ;add 4 bytes to find next index
cmp ecx, ebx ;compare iterator to number of elements (16)
jg LoopMe
pop ebp ;return
ret
_intToBinary ENDP
END
In your C++ code
decimal is not defined.
'/0' is not the null character (it is a multi-character literal). Use \, not /, to write escape sequences in C++.
value isn't used.
Your code should be like this:
#include <iostream>
using namespace std;
extern"C"
{
char intToBinary(char *, int, int);
}
int main()
{
const int SIZE = 16;
char arr[SIZE] = { '\0' };
int decimal;
cout << "What integer do you want converted?" << endl;
cin >> decimal;
intToBinary(arr, SIZE, decimal);
for (int i = SIZE - 1; i >= 0; i--) cout << arr[i];
cout << endl;
return 0;
}
In your assembly code
You stored the "address of first array element" to ebx by mov ebx,[ebp+8], so the address of arr will be there.
Unfortunately, it is destroyed by add ebx, 4 and inc ebx.
"put edx into the next array index" No, [ebp + edi] isn't the next array index, and writing there destroys data on the stack.
Don't add 4 bytes to "find next index" when the elements are chars, which are 1 byte each.
Your code should be like this (sorry, this is NASM code because I am unfamiliar with MASM):
bits 32
global _intToBinary
_intToBinary:
push ebp
mov ebp, esp ; stack pointer to ebp
push esi ; save this register before breaking in the code
push edi ; save this, too
push ebx ; save this, too
mov ebx, [ebp + 8] ; address of first array element
mov ecx, [ebp + 12] ; number of elements in array
mov eax, [ebp + 16] ; the number to convert
xor edi, edi ; the index of array to store
mov esi, 2 ; the new divisor
LoopMe:
xor edx, edx ; keep this 0 at all divisions
div esi ; divide eax by 2
add dl, 48 ; convert the number in dl to a character representing it
mov [ebx + edi], dl ; put dl into the next array index
inc edi ; add 1 byte to find next index
cmp ecx, edi ; compare iterator to number of elements
jg LoopMe
xor eax, eax ; return 0
pop ebx ; restore the saved register
pop edi ; restore this, too
pop esi ; restore this, too
mov esp, ebp ; restore stack pointer
pop ebp
ret
Note that this code stores the binary digits in reversed order (least significant bit first), so I wrote the C++ code to print them from back to front.
Also note that there is no terminating null character in arr, so do not do cout << arr;.
You have the address of the first array element in ebx, and edi is your loop counter. So mov [ebx + edi], dl would store the low byte of edx into arr[edi] (storing all of edx would write 4 bytes into a char array).
Also note that your loop condition is wrong (your cmp is comparing the number of elements against the starting address of the array.)
Avoid div whenever possible. To divide by two, right-shift by one. div is very slow (like 10 to 30 times slower than a shift).
BTW, since you have a choice of which registers to use (out of the ones the ABI says you're allowed to clobber without saving/restoring), edi is used for a "destination" pointer by convention (i.e. when it doesn't cost any extra instructions), while esi is used as a "source" pointer.
Speaking of the ABI, you need to save/restore ebx in functions that use it, same as ebp. It keeps its value across function calls (because any ABI-compliant function you call preserves it). I forget which other registers are callee-saved in the 32bit ABI. You can check at the helpful links in https://stackoverflow.com/tags/x86/info. 32bit is obsolete; 64bit has a more efficient ABI, and includes SSE2 as part of the baseline.
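The suggestion to shift instead of divide can be sketched in C. The function int_to_binary_ref is a hypothetical reference, not the asker's routine. It takes the value as unsigned int (an assumption made here so the right shift is well defined) and writes the digits least significant bit first, matching the NASM version above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C reference: fills arr[0..size-1] with '0'/'1' characters,
   least significant bit first. No terminating null is written. */
void int_to_binary_ref(char *arr, int size, unsigned int value)
{
    assert(arr != NULL && size >= 0);
    for (int i = 0; i < size; i++) {
        arr[i] = (char)('0' + (value & 1u));   /* low bit as a character */
        value >>= 1;                           /* divide by two with a shift */
    }
}
```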

How to refresh a C array after using assembly to sort

I have been working on a program that will do a bubble sort for n integers. I have hit a wall, as I do not know how to refresh the array once my assembler operations are done. Any suggestions would be great.
#include <stdio.h>
#include <stdlib.h>
int n;
int *input;
int output;
int i;
int main(void)
{
scanf("%d", &n);
input = (int *)malloc(sizeof(n));
for (i = 0; i < n; i++)
{
scanf("%d", &input[i]);
}
__asm
{
mov ebx, input
mov esi, n
outer_loop:
dec esi
jz end_outer
mov edi, n
inner_loop:
dec edi
jz outer_loop
compare:
mov al, [ebx + edi - 1]
mov dl, [ebx + edi]
cmp al, dl
jnl inner_loop
swap:
mov [ebx + edi], al
mov [ ebx + edi - 1], dl
jmp inner_loop
end_outer:
}
for (i = 0; i < n; i++)
{
printf("%d\n", input[i]);
}
scanf("%d", &output);
}
There's nothing to "refresh". Your code runs. ebx contains input and that's that. (Hint: Your C code also gets transformed into assembly. Looking at what your compiler generates through a disassembler might give you some insight.)
That said I see some problems:
input = (int *)malloc(sizeof(n));
This allocation is not big enough and your program will crash. You want to allocate sizeof(int) * n. You should also check the allocation for errors.
mov al, [ebx + edi - 1]
mov dl, [ebx + edi]
cmp al, dl
Kind of verbose. You should be able to compare a register directly against memory (e.g. cmp al, byte ptr [ebx + edi]).
Not to mention it's a complete waste of time to implement bubble sort in assembly. Rephrase: Learning assembly is great, but it would be a bad idea to use this in anything that matters. One of the most important things about knowing assembly is knowing when you don't need to use it. You'd probably find very often that what your compiler generates is good enough. Let's also not forget that a good algorithm in C will beat a bad algorithm in assembly, such as bubble sort.
@Giorgio also raises a good point in the comments. Your assembly is comparing and sorting bytes. You want to be doing things like this:
mov eax, [ebx + edi - 4] ; assumes edi is a byte offset, see next comment
mov edx, [ebx + edi]
And instead of dec edi etc., you want to do:
sub edi, 4
Your swap would also have to be re-done to use 32-bit quantities.
This is of course assuming int is 32 bits, which may not be the case. If you're using (non-standard) inline assembly it's probably fair that you're doing this - it means you're already targeting a particular compiler. (Based on the syntax I'd say VC++) Nitpickers might say you should use int32_t instead of int.
Note I'm not sure if this is the only problem, I haven't looked at your code too thoroughly.
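As a point of comparison, here is a minimal C sketch of the two fixes discussed above: allocating n * sizeof(int) bytes and comparing whole int elements. The names alloc_ints and bubble_sort_desc are hypothetical. The sort is descending because that is the order the comparison in the question's assembly produces.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: room for n ints. Returns NULL on failure, so the
   caller should check the result before use. */
int *alloc_ints(size_t n)
{
    return malloc(n * sizeof(int));
}

/* Hypothetical reference sort: bubble sort over int elements, descending.
   Each pass walks from the last element down to index 1. */
void bubble_sort_desc(int *a, size_t n)
{
    assert(a != NULL || n == 0);
    for (size_t pass = 1; pass < n; pass++) {
        for (size_t i = n - 1; i > 0; i--) {
            if (a[i - 1] < a[i]) {       /* compare whole ints, not bytes */
                int tmp = a[i - 1];
                a[i - 1] = a[i];
                a[i] = tmp;
            }
        }
    }
}
```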
I will also give it a try.
#include <stdio.h>
#include <stdlib.h>
int n;
int *input;
int output;
int i;
int s;
int main(void)
{
s = sizeof(int);
scanf("%d", &n);
input = (int *)malloc(n * sizeof(int));
for (i = 0; i < n; i++)
{
scanf("%d", &input[i]);
}
__asm
{
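mov ecx, s
mov ebx, input
mov esi, n
imul esi, ecx ; esi = n * sizeof(int); mul only accepts a single operand
outer_loop:
sub esi, ecx
jz end_outer
mov edi, n
imul edi, ecx ; every pass starts again from the end of the array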
inner_loop:
sub edi, ecx
jz outer_loop
compare:
mov edx, [ebx + edi]
sub edi, ecx
mov eax, [ebx + edi]
add edi, ecx
cmp eax, edx
jnl inner_loop
swap:
mov [ebx + edi], eax
sub edi, ecx
mov [ebx + edi], edx
add edi, ecx
jmp inner_loop
end_outer:
}
for (i = 0; i < n; i++)
{
printf("%d\n", input[i]);
}
scanf("%d", &output);
}
I used variable s to hold the size of an integer. To my knowledge it is not allowed to use an indirection like
mov eax, [ebx + edi + ecx]
therefore I had to add separate add and sub. It is not very nice, does anyone see a better solution?
You seem to intend to allocate and input an array of n int values. (Although the memory size in your malloc is incorrect, as has already been noted).
But then you proceed to sort your array as an array of n bytes. Why are you sorting bytes instead of sorting ints?
Even if your sorting algorithm is implemented correctly (as byte-sorting implementation), the end result will look totally meaningless, since you are printing your array as an array of ints in the end.
First make up your mind what it is that you are trying to work with, ints or bytes (chars), and then act accordingly and consistently.
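To see why byte-wise sorting of int data gives meaningless output, consider this small sketch. The function sort_bytes_desc is a hypothetical helper that sorts raw bytes and ignores element boundaries, which is roughly what byte-sized loads and stores do to an int array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bubble sort over raw bytes, descending. It knows
   nothing about the int elements the bytes may belong to. */
void sort_bytes_desc(unsigned char *bytes, size_t count)
{
    assert(bytes != NULL || count == 0);
    for (size_t pass = 1; pass < count; pass++) {
        for (size_t i = count - 1; i > 0; i--) {
            if (bytes[i - 1] < bytes[i]) {
                unsigned char tmp = bytes[i - 1];
                bytes[i - 1] = bytes[i];
                bytes[i] = tmp;
            }
        }
    }
}
```

Applied to the bytes of an int array holding 256 and 1, the two nonzero bytes end up next to each other at the front, so neither original value survives when the memory is read back as ints.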
