How to pass an array from assembly to a C function

I want to pass an array defined in assembly code to a C function, but I'm getting a segmentation fault when I try to access that array in my C code. Here is the assembly code (I'm using NASM):
%include "io.inc"
extern minimo ;My C function
extern printf
section .data
array db 1, 2, 3, 4, 5 ;My array
alen db 5 ;My array length
fmt db "%d", 10, 0 ;Format for the printf function
section .text
global CMAIN
CMAIN:
xor eax, eax
mov ebx, [alen]
mov ecx, [array]
push ebx
push ecx
call minimo
add esp, 8
push eax
push fmt
call printf
add esp, 8
mov eax, 1
mov ebx, 0
int 80h
And here is my C code:
int minimo(int *array, int size){
int ret = array[0];
for (int i = 1; i < size; i++){
if(array[i] < ret){
ret = array[i];
}
}
return ret;
}

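mov ecx, [array] loads the value stored at the location that array names (the first array elements), not its address. The C function needs the address, so use mov ecx, array instead. Also note that array and alen are declared with db (one byte each), while minimo reads int elements and mov ebx, [alen] loads four bytes; declaring both with dd (array dd 1, 2, 3, 4, 5 and alen dd 5) keeps the sizes consistent.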
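For comparison, the corrected call can be sketched in C. This is only an illustration: call_minimo is a hypothetical wrapper that is not part of the original program, and the sketch assumes the array elements are 32-bit ints, which is what the int *array parameter of minimo expects.

```c
/* minimo as given in the question: returns the smallest element. */
int minimo(int *array, int size) {
    int ret = array[0];
    for (int i = 1; i < size; i++) {
        if (array[i] < ret) {
            ret = array[i];
        }
    }
    return ret;
}

/* Assumed data layout: five 32-bit ints, like "array dd 1, 2, 3, 4, 5". */
static int array[] = {1, 2, 3, 4, 5};
static int alen = 5;

/* Hypothetical wrapper: what the fixed assembly call amounts to. */
int call_minimo(void) {
    /* "array" decays to the address of the first element (mov ecx, array);
       array[0] would be the value stored there (mov ecx, [array]). */
    return minimo(array, alen);
}
```

Passing array[0] instead of array would hand minimo the number 1 as if it were an address, which is the crash described in the question.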

Related

To Pass The Content Of The Pointer In An Inline Assembly Function On Visual Studio

I want to understand assembly language in Visual Studio by using the __asm keyword in a C program.
What I am trying to do is:
- Create an int array with 5 elements
- Loop through the array and add the values to the accumulator
Here is code that works fine:
#include <stdio.h>
int main()
{
int intArr[5] = { 1, 2, 3, 4, 5 };
int sum;
char printFormat[] = "Sum=%i\n";
__asm
{
lea esi, [intArr] // get the address of the intArr
mov ebx,5 // EBX is our loop counter, set to 5
mov eax, 0 // EAX is where we add up the values
label1: add eax, [esi] // add the current number on the array to EAX
add esi, 4 // increment pointer by 4 to next number in array
dec ebx // decrement the loop counter
jnz label1 // jump back to label1 if ebx is non-zero
mov [sum], eax // save the accumulated value in memory
}
printf(printFormat, sum);
return 0;
}
The output is as below;
Sum=15
I want to use the inline assembly part as a separate function, and do the same with a function call as below;
#include <stdio.h>
// function declaration
int addIntArray(int[], int);
int main()
{
int intArr[5] = { 1, 2, 3, 4, 5 };
char printFormat[] = "Sum=%i\n";
int result;
result = addIntArray(intArr, 5);
printf(printFormat, result);
return 0;
}
int addIntArray(int intArr[], int size)
{
int sum;
__asm
{
lea esi, [intArr] // get the address of the intArr
mov ebx, 5 // EBX is our loop counter, set to 5
mov eax, 0 // EAX is where we add up the values
label1: add eax, [esi] // add the current number on the array to EAX
add esi, 4 // increment pointer by 4 to next number in array
dec ebx // decrement the loop counter
jnz label1 // jump back to label1 if ebx is non-zero
mov[sum], eax // save the accumulated value in memory
}
return sum;
}
The output is wrong, as shown below:
Sum=2145099747
While debugging, I found that I am adding up the address values stored in the esi register instead of the contents at those addresses.
I am confused about why the same inline assembly routine works when it runs in main but not when I call it as a separate function.
Where is the problem, why does the code behave differently in main and in the function, and how can I fix it?
You are actually using the address of the argument that was passed on the stack, not the address of the array itself.
This is easier to see at the assembly level with a debugger such as WinDbg.
Here's the code at the point where addIntArray is called; everything is OK so far:
00b91010 c745e001000000 mov dword ptr [ebp-20h],1 ; start filling array
00b91017 c745e402000000 mov dword ptr [ebp-1Ch],2
00b9101e c745e803000000 mov dword ptr [ebp-18h],3
00b91025 c745ec04000000 mov dword ptr [ebp-14h],4
00b9102c c745f005000000 mov dword ptr [ebp-10h],5
[...]
00b91044 6a05 push 5 ; pass number of elements in array
00b91046 8d55e0 lea edx,[ebp-20h] ; load array address in edx
00b91049 52 push edx ; pass edx to addIntArray
00b9104a e831000000 call Tmp!addIntArray
Let's take a look at the stack when the CALL is made:
0:000> dd @esp L1
00fbfdb0 00fbfdbc
The value above is the address of the array; displaying the memory there shows its contents:
0:000> dd 00fbfdbc L5
00fbfdbc 00000001 00000002 00000003 00000004
00fbfdcc 00000005
Now let's take a look at addIntArray:
Tmp!addIntArray:
00b91080 55 push ebp
00b91081 8bec mov ebp,esp
00b91083 83ec08 sub esp,8
[...]
00b91090 53 push ebx
00b91091 56 push esi
00b91092 8d7508 lea esi,[ebp+8] ; load what ???
So, what's in ebp+8?
0:000> dd @ebp+8 L1
00fbfdb0 00fbfdbc
This value (0x00fbfdbc) is the address of the array, but because you are using LEA instead of MOV, you are loading the address of the stack slot at ebp+8, not the address of the array.
Let's check the value of esi after the LEA has been executed:
0:000> r @esi
esi=00fbfdb0
This value (0x00fbfdb0) is the address of the first argument, which in turn contains the address of the array; you are not pointing at the array itself.
The command below dereferences the esi register:
0:000> dd poi(@esi) L5
00fbfdbc 00000001 00000002 00000003 00000004
00fbfdcc 00000005
So instead of using LEA, use MOV:
; ...
mov esi, [intArr] // load the pointer stored in intArr (the array's address)
mov ebx, 5 // EBX is our loop counter, set to 5
mov eax, 0 // EAX is where we add up the values
; ...
Executing the program with a mov instead of an LEA now displays the expected value:
Sum=15
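The same distinction can be written in plain C. The sketch below is illustrative only, and add_int_array_c and param_differs_from_array are hypothetical names. Inside the function, &intArr is the address of the parameter (what lea esi, [intArr] loads), while intArr is the pointer value stored in it (what mov esi, [intArr] loads).

```c
/* Sum the array through the pointer value, mirroring mov esi, [intArr]. */
int add_int_array_c(int intArr[], int size) {
    const int *p = intArr;      /* pointer value: address of the first element */
    int sum = 0;
    for (int n = size; n > 0; n--) {
        sum += *p;              /* add eax, [esi] */
        p++;                    /* advance by one int (add esi, 4 on x86) */
    }
    return sum;
}

/* Returns 1 when the parameter's own address differs from the array address. */
int param_differs_from_array(int intArr[]) {
    return (void *)&intArr != (void *)intArr;
}
```

In main the situation is different: there intArr is the array object itself, so its address and the address of its first element coincide, which is why the LEA version worked there.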
As your question title says, arrays are passed as pointers to functions. Thus, you need to treat it as a pointer: mov esi, [intArr] should work.
As illustration, consider this C code:
#include <stdio.h>
void func(int intArr[])
{
printf("In func, sizeof(intArr) = %d\n", (int)sizeof(intArr));
}
int main()
{
int intArr[5] = { 1, 2, 3, 4, 5 };
printf("In main, sizeof(intArr) = %d\n", (int)sizeof(intArr));
func(intArr);
return 0;
}
Sample output:
In main, sizeof(intArr) = 20
In func, sizeof(intArr) = 4
You can see what was an array in main is a pointer in func.
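Because the size information is lost at the call, the element count has to travel with the pointer. A minimal sketch, assuming a hypothetical ARRAY_LEN macro and helper names that are not from the original post:

```c
#include <stddef.h>

/* Only valid where the operand is still a real array, not a pointer. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* The callee sees only a pointer, so the count is an explicit parameter. */
int sum_ints(const int *values, size_t count) {
    int total = 0;
    for (size_t i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}

/* In the caller the array type is intact, so ARRAY_LEN yields 5 here. */
int sum_sample(void) {
    int intArr[5] = { 1, 2, 3, 4, 5 };
    return sum_ints(intArr, ARRAY_LEN(intArr));
}
```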

How to call C extern function and get return struct?

I have an extern function and a struct defined in token.c:
#include <stdio.h>
typedef struct token {
int start;
int length;
} t;
extern t get_token(int, int);
t get_token(int s, int l) {
printf("[C] new token: start [%d] length [%d]\n\n", s, l);
t m_T = {0};
m_T.start = s;
m_T.length = l;
return m_T;
}
... so that I can call _get_token from my assembly and get a new token. In make_token.asm I have the following:
SECTION .data ; initialized data
mtkn: db "call: token(%d, %d)", 10, 0
mlen db "length: %d", 10, 0
mstt: db "start: %d", 10, 0
mend: db 10, "*** END ***", 10, 0
SECTION .text ; code
extern _get_token
extern _printf
global _main
_main:
; stash base stack pointer
push ebp
mov ebp, esp
mov eax, 5
mov ebx, 10
push ebx ; length
push eax ; start
call _get_token ; get a token
mov [tkn], eax
add esp, 8
; test token properties
push DWORD [tkn]
push mstt
call _printf
push DWORD [tkn + 4]
push mlen
call _printf
add esp, 16
.end:
push DWORD mend
call _printf
; restore base stack pointer
mov esp, ebp
pop ebp
SECTION .bss ; uninitialized data
tkn: resd 1
The output is:
[C] new token: start [5] length [10]
start: 5
length: 0
What am I missing to get both start and length? The output verifies that the extern function in C is getting called and the values are pushed into the function.
I believe the problem lies in your .bss section:
SECTION .bss ; uninitialized data
tkn: resd 1
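Here, you set aside a single dword (one integer's worth of memory) for the token. However, in your C code, the token struct has two ints (start and length), or two dwords' worth of memory. This means there is only room to store part of the struct (start), and reading [tkn + 4] for length goes past the reserved space. Note also that mov [tkn], eax stores only one dword; on ABIs that return an 8-byte struct in the EDX:EAX register pair (which the output above suggests is the case here), the second member has to be stored as well with mov [tkn + 4], edx. First, though, reserve enough space by defining tkn as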
tkn: resd 2
or
tkn: resq 1 ;; 1 QWORD == 2 DWORDs
Hope this helps ;)
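One way to keep the two sides in step is to check the struct size on the C side. The sketch below is illustrative: token_bytes is a hypothetical helper, and the compile-time assertion assumes a C11 compiler.

```c
typedef struct token {
    int start;
    int length;
} t;

/* The assembly must reserve this many bytes for tkn (two ints). */
_Static_assert(sizeof(t) == 2 * sizeof(int), "token must be exactly two ints");

/* Hypothetical helper: lets other code query the size at run time. */
int token_bytes(void) {
    return (int)sizeof(t);
}

/* Same construction as in the question, without the printf. */
t get_token(int s, int l) {
    t m_T = {0};
    m_T.start = s;
    m_T.length = l;
    return m_T;
}
```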
Instead of creating a single token at a time, I decided to allocate a buffer and fill it: determine how many tokens are required, malloc the buffer, then call get_tokens with the pointer to the buffer and the number of tokens.
The get_tokens method fills the buffer and returns a count of tokens created.
The assembly then iterates the token buffer and displays the values - start and length - for each token.
token.c:
#include <stdio.h>
typedef struct token {
int start;
int length;
} t;
extern int get_tokens(t*, int);
extern int token_size();
/*
p_t: pointer to allocated buffer
num: number of tokens with which to fill buffer
*/
int get_tokens(t* p_t, int num) {
printf("[C] create %d tokens: %d bytes\n", num, token_size() * num);
int idx = 0;
while (idx < num) {
// values are arbitrary for testing purposes
t tkn = {idx, idx * 10};
p_t[idx] = tkn;
printf("[C] [%d] start: %d; len: %d\n", idx, tkn.start, tkn.length);
++idx;
}
return idx;
}
int token_size() {
return sizeof(t);
}
make_tokens.asm:
SECTION .data ; initialized data
endl: db 10, 0
mszt: db "token size: %d bytes", 10, 0
tk_info: db "[%d]: s[%d] l[%d]", 10, 0
mlen db "length: %d", 10, 0
mstt: db "start: %d", 10, 0
mend: db 10, "*** END ***", 10, 0
mt1 db "malloc space for 3 tokens: %d bytes", 10, 0
mty db 10, "success", 10, 0
mtn db 10, "fail", 10, 0
SECTION .text ; code
extern _get_tokens
extern _token_size
extern _free
extern _malloc
extern _printf
global _main
_main:
; stash base stack pointer
push ebp
mov ebp, esp
; get token size
call _token_size
mov [tsz], eax
push DWORD [tsz]
push DWORD mszt
call _printf
add esp, 8
mov eax, [tsz]
mov edx, 3
mul edx
mov [tbsz], eax
push DWORD [tbsz]
push DWORD mt1
call _printf
add esp, 8
push DWORD [tbsz] ; malloc 3 tokens
call _malloc
mov [tkn_buf], eax
add esp, 4
mov ecx, 3 ; 3 tokens
push DWORD ecx
push DWORD [tkn_buf]
call _get_tokens
add esp, 8
cmp eax, 3
je .yes
.no:
push DWORD mtn
call _printf
add esp, 4
jmp .end
.yes:
push DWORD mty
call _printf
add esp, 4
mov ecx, 0
mov ebx, [tkn_buf]
.loopTokens:
mov eax, [tsz] ; determine next token
mul ecx ; start location => eax
mov edi, ecx ; preserve counter
push DWORD [ebx + eax + 4] ; length
push DWORD [ebx + eax] ; start
push DWORD ecx
push DWORD tk_info
call _printf
add esp, 16
mov ecx, edi
inc ecx
cmp ecx, 3
jl .loopTokens
.end:
push DWORD [tkn_buf]
call _free
push DWORD mend
call _printf
; restore base stack pointer
mov esp, ebp
pop ebp
xor eax, eax ; return 0 from main
ret
SECTION .bss ; uninitialized data
tkn_buf: resd 1
tbsz: resd 1
tsz: resd 1
...and the resulting output:
token size: 8 bytes
malloc space for 3 tokens: 24 bytes
[C] create 3 tokens: 24 bytes
[C] [0] start: 0; len: 0
[C] [1] start: 1; len: 10
[C] [2] start: 2; len: 20
success
[0]: s[0] l[0]
[1]: s[1] l[10]
[2]: s[2] l[20]
As I stated in a comment, it would be far better to pass a pointer to an instance of struct token rather than return the struct by value as the current code does.
The following keeps the current approach, but remember that returning a struct by value can involve hidden calls to memcpy() and hidden memory allocation.
otherfile.h contains:
#ifndef OTHER_FILE_H
#define OTHER_FILE_H
struct token
{
int start;
int length;
};
struct token get_token( int, int );
#endif // OTHER_FILE_H
in file otherfile.c
#include <stdio.h>
#include "otherfile.h"
struct token get_token( int tokenStart, int tokenLength )
{
printf("[C] new token: start [%d] length [%d]\n\n", tokenStart, tokenLength);
struct token m_T = {0,0};
m_T.start = tokenStart;
m_T.length = tokenLength;
return m_T;
}
in file token.c
#include <stdio.h>
#include "otherfile.h"
...
struct token myToken = {0,0};
myToken = get_token( tokenStart, tokenLength );
...
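The pointer-based alternative recommended at the start of this answer might look like this. It is a sketch only: init_token is a hypothetical name and is not part of the original code.

```c
#include <stddef.h>

struct token {
    int start;
    int length;
};

/* The caller owns the storage; the function fills it in place.
   Returns 0 on success, -1 if no destination was supplied. */
int init_token(struct token *out, int tokenStart, int tokenLength) {
    if (out == NULL) {
        return -1;
    }
    out->start = tokenStart;
    out->length = tokenLength;
    return 0;
}
```

From assembly, the caller would push the length, the start, and the address of an 8-byte buffer, so no struct has to come back through the return value.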

assign a pointer to a pointer in assembly and c

In a C function, the only local variable is int i (uninitialized), which I'd like to keep in the register %ecx. Given the following struct a located at %ebp+8:
typedef struct {
char c;
int k;
int *m;
} S1;
how do I translate the following code into assembly (AT&T syntax):
i=*(a.m);
i=i+a.k;
Thanks!
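Given that i is an int, in MASM syntax it's going to be something like the following. Note that the offsets assume default alignment, where three padding bytes follow c, so k is at offset 4 and m is at offset 8 within the struct:
;i = *(a.m);
mov eax, dword ptr [ebp+16] ; a.m: 16 = 8 + 8
mov ecx, dword ptr [eax] ; i = *(a.m), kept in ecx
;i = i + a.k;
mov eax, dword ptr [ebp+12] ; a.k: 12 = 8 + 4
add ecx, eax ; new value of i
In AT&T syntax the same sequence is:
movl 16(%ebp), %eax
movl (%eax), %ecx
addl 12(%ebp), %ecx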
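The member offsets depend on how the compiler pads the struct, so it is safer to check them with offsetof than to add up member sizes by hand. A small sketch follows; the function names are hypothetical. On typical 32-bit x86 compilers with default alignment, the two offsets come out as 4 and 8.

```c
#include <stddef.h>

typedef struct {
    char c;
    int k;
    int *m;
} S1;

/* Member offsets as the compiler lays them out (padding included). */
size_t offset_of_k(void) { return offsetof(S1, k); }
size_t offset_of_m(void) { return offsetof(S1, m); }

/* The two statements from the question, with i as the return value. */
int eval(S1 a) {
    int i;
    i = *(a.m);
    i = i + a.k;
    return i;
}
```

Adding 8 to each offset gives the displacement from %ebp to use in the assembly version.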

Assembly Return int to C function segfaults

I am finishing up an assembly program that replaces characters in a string with a given replacement character. The assembly code calls C functions and the assembly program itself is called from main in my .c file. However, when trying to finish and return a final int value FROM the assembly program TO C, I get segfaults. My .asm file is as follows:
; File: strrepl.asm
; Implements a C function with the prototype:
;
; int strrepl(char *str, int c, int (* isinsubset) (int c) ) ;
;
;
; Result: chars in string are replaced with the replacement character and string is returned.
SECTION .text
global strrepl
_strrepl: nop
strrepl:
push ebp ; set up stack frame
mov ebp, esp
push esi ; save registers
push ebx
xor eax, eax
mov ecx, [ebp + 8] ;load string (char array) into ecx
jecxz end ;jump if [ecx] is zero
mov al, [ebp + 12] ;move the replacement character into esi
mov edx, [ebp + 16] ;move function pointer into edx
firstLoop:
xor eax, eax
mov edi, [ecx]
cmp edi, 0
jz end
mov edi, ecx ; save array
movzx eax, byte [ecx] ;load single byte into eax
push eax ; parameter for (*isinsubset)
mov edx, [ebp + 16]
call edx ; execute (*isinsubset)
mov ecx, edi ; restore array
cmp eax, 0
jne secondLoop
add esp, 4 ; "pop off" the parameter
mov ebx, eax ; store return value
add ecx, 1
jmp firstLoop
secondLoop:
mov eax, [ebp+12]
mov [ecx], al
mov edx, [ebp+16]
add esp, 4
mov ebx, eax
add ecx, 1
jmp firstLoop
end:
pop ebx ; restore registers
pop esi
mov esp, ebp ; take down stack frame
pop ebp
mov eax, 9
push eax ;test
ret
and my c file is:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
//display *((char *) $edi)
// These functions will be implemented in assembly:
//
int strrepl(char *str, int c, int (* isinsubset) (int c) ) ;
int isvowel (int c) {
if (c == 'a' || c == 'e' || c == 'i' || c == 'o' || c == 'u')
return 1 ;
if (c == 'A' || c == 'E' || c == 'I' || c == 'O' || c == 'U')
return 1 ;
return 0 ;
}
int main(){
char *str1;
int r;
str1 = strdup("ABC 123 779 Hello World") ;
r = strrepl(str1, '#', &isdigit) ;
printf("str1 = \"%s\"\n", str1) ;
printf("%d characters were replaced\n", r) ;
free(str1) ;
return 0;
}
In my assembly code, you can see this at the end:
mov eax, 9
push eax
I am simply trying to return the value 9 to r, which is an int in the C file. This is just a test to see if I can return an int back to r in the C file. Eventually I will be returning the number of characters that were replaced. However, I need to figure out why the code above is segfaulting. Any ideas?
mov eax, 9
push eax ; NOT a good idea
ret
That is a big mistake. ret pops whatever is on top of the stack and jumps to it, and you've just pushed something onto the stack that's almost certainly not a valid return address.
Most functions return a value by simply placing it in eax (this depends on the calling convention, of course, but that's a pretty common one). There's generally no need to push it onto the stack, and plenty of downside to doing so.
Return values are normally stored in EAX on 32-bit x86 machines. So pushing the value onto the stack after storing it in EAX is wrong: ret will pop the value you just pushed (9) and use it as the new value of EIP (the instruction pointer) instead of the real return address.
Ret with no argument pops the return address off of the stack and jumps to it.
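As a reference for what the finished routine should compute, here is a C sketch with the same prototype. It is an assumption about the intended behavior (replace every character accepted by the predicate and return the count), and strrepl_ref and is_ascii_digit are hypothetical names. The point to notice is that the result is handed back with a plain return, which a 32-bit x86 cdecl compiler places in EAX without touching the stack.

```c
/* Reference version: replace every char accepted by isinsubset with c
   and return how many were replaced. */
int strrepl_ref(char *str, int c, int (*isinsubset)(int c)) {
    int count = 0;
    if (str == 0) {
        return 0;
    }
    for (char *p = str; *p != '\0'; p++) {
        if (isinsubset((unsigned char)*p)) {
            *p = (char)c;
            count++;
        }
    }
    return count;   /* on 32-bit x86 cdecl the compiler leaves this in EAX */
}

/* Hypothetical predicate for testing: 1 for ASCII digits, 0 otherwise. */
int is_ascii_digit(int ch) {
    return ch >= '0' && ch <= '9';
}
```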

Operand Size Conflict in x86 Assembly

I've just started programming in Assembly for my computer organization course, and I keep getting an operand size conflict error whenever I try to compile this asm block within a C program.
The arrayOfLetters[] object is a char array, so shouldn't each element be one byte? The code works when I do mov eax, arrayOfLetters[1], but I'm not sure why that works, as the eax register is 4 bytes.
#include <stdio.h>
#define SIZE 3
char findMinLetter( char arrayOfLetters[], int arraySize )
{
char min;
__asm{
push eax
push ebx
push ecx
push edx
mov dl, 0x7f // initialize DL
mov al, arrayOfLetters[1] //Problem occurs here
mov min, dl // read DL
pop edx
pop ecx
pop ebx
pop eax
}
return min;
}
int main()
{
char arrayOfLetters[ SIZE ] = {'a','B','c'};
int i;
printf("\nThe original array of letters is:\n\n");
for(i=0; i<SIZE; i++){
printf("%c ", arrayOfLetters[i]);
}
printf("\n\n");
printf("The smallest (potentially capitalized) letter is: %c\n", findMinLetter( arrayOfLetters, SIZE ));
return 0;
}
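Use mov al, BYTE PTR arrayOfLetters[1].
You can compile the code with MSVC using cl input.c /Faoutput.asm to get an assembly listing; it shows that plain arrayOfLetters[1] is treated as a DWORD PTR operand, so you need to state explicitly that you want a BYTE PTR.
Be aware that arrayOfLetters is a pointer parameter here, so in MSVC inline assembly arrayOfLetters[1] addresses a byte of the pointer variable itself rather than the second letter. To read the element, load the pointer first (mov esi, arrayOfLetters) and then use mov al, [esi + 1].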
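For checking the assembly's result, a plain C version of the intended search may help. This is a sketch of what the finished routine presumably computes (the question shows only its first few instructions), and find_min_letter_c is a hypothetical name.

```c
/* Plain C reference: smallest char value in the array. */
char find_min_letter_c(const char arrayOfLetters[], int arraySize) {
    char min = 0x7f;                      /* same start value as mov dl, 0x7f */
    for (int i = 0; i < arraySize; i++) {
        char current = arrayOfLetters[i]; /* one byte per element */
        if (current < min) {
            min = current;
        }
    }
    return min;
}
```

Each element is read as a single char, which is the byte-sized access that BYTE PTR requests in the inline assembly.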
