Assembly retrieving buffer to c function parameter - c

I'm writing an assembly function that will read from IDE through ports.
I'm calling the parameters through x86 base pointer (EBP).
I debugged my kernel.bin (with gdb and qemu) and I saw that when I pass my recv buffer to print, eax returns values like: 36h, 01h, 10h
My disk.asm is divided into read and write routines. Is it possible that I'm writing it wrong? Is it legal to move [ebp+16] directly into esi (to write)? In the read function, is it wrong to move [ebp+16] directly into edi? I'm using a register pointing to that address and making edi point to that register:
In my disk.asm, to read the disk I have this:
sub dx, 7 ;dx = 0x1f0
mov ecx, 256
mov edi, bufferrecv
rep insw
(...)
push ebx
mov ebx, [ebp+16]
mov [ebx], long word bufferrecv
pop ebx
mov esp, ebp
pop ebp
ret
And to write disk:
sub dx, 7 ;dx = 0x1f0
mov ecx, 256
mov esi, [ebp+16]
rep outsw
(...)
I'm declaring those functions this way:
Kernel.c
extern int _readd(int sector_count, int nmrsector, STRING in_msg);
extern int writed(int sector_count, int nmrsector, STRING out_msg);
The STRING type was declared inside my types.h as char*

Related

Stack cleanup not working (__stdcall MASM function)

There's something weird going on here. Visual Studio is telling me the ESP value was not properly saved, but I cannot see any mistakes in the code (32-bit, Windows, __stdcall).
MASM code:
.MODEL FLAT, STDCALL
...
memcpy PROC dest : DWORD, source : DWORD, size : DWORD
MOV EDI, [ESP+04H]
MOV ESI, [ESP+08H]
MOV ECX, [ESP+0CH]
AGAIN_:
LODSB
STOSB
LOOP AGAIN_
RETN 0CH
memcpy ENDP
I am passing 12 bytes (0xC) on the stack and then cleaning them up. I have confirmed by looking at the symbols that the function's symbol looks like "memcpy@12", so it is indeed finding the proper symbol.
This is the C prototype:
extern void __stdcall * _memcpy(void*,void*,unsigned __int32);
Compiling in 32-bit. The function copies the memory (I can see in the debugger), but the stack cleanup appears not to be working
EDIT:
MASM code:
__MyMemcpy PROC _dest : DWORD, _source : DWORD, _size : DWORD
MOV EDI, DWORD PTR [ESP + 04H]
MOV ESI, DWORD PTR [ESP + 08H]
MOV ECX, DWORD PTR [ESP + 0CH]
PUSH ESI
PUSH EDI
__AGAIN:
LODSB
STOSB
LOOP __AGAIN
POP EDI
POP ESI
RETN 0CH
__MyMemcpy ENDP
C code:
extern void __stdcall __MyMemcpy(void*, void*, int);
typedef struct {
void(__stdcall*MemCpy)(void*,void*,int);
}MemFunc;
int initmemfunc(MemFunc*f){
f->MemCpy=__MyMemcpy;
}
When I call it like this I get the error:
MemFunc mf={0};
initmemfunc(&mf);
mf.MemCpy(dest,src,size);
When I call it like this I don't:
__MyMemcpy(dest,src,size)
Since you have provided an update to your question and comments suggesting you disable prologue and epilogue code generation for functions created with the MASM PROC directive I suspect your code looks something like this:
.MODEL FLAT, STDCALL
OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE
.CODE
__MyMemcpy PROC _dest : DWORD, _source : DWORD, _size : DWORD
MOV EDI, DWORD PTR [ESP + 04H]
MOV ESI, DWORD PTR [ESP + 08H]
MOV ECX, DWORD PTR [ESP + 0CH]
PUSH ESI
PUSH EDI
__AGAIN:
LODSB
STOSB
LOOP __AGAIN
POP EDI
POP ESI
RETN 0CH
__MyMemcpy ENDP
END
A note about this code: if your source and destination buffers overlap, this forward byte-by-byte copy can produce wrong results. If the buffers don't overlap, what you are doing should work. You can document that requirement by marking the pointers __restrict. __restrict is an MSVC/C++ extension that acts as a hint to the compiler that the argument doesn't overlap with another. It does not make an overlapping call safe; it records the assumption your assembly code already relies on, and it may let the compiler warn about some violations. Your prototypes could have been written as:
extern void __stdcall __MyMemcpy( void* __restrict, void* __restrict, int);
typedef struct {
void(__stdcall* MemCpy)(void* __restrict, void* __restrict, int);
}MemFunc;
You are using PROC but not taking advantage of any of the underlying power it affords (or obscures). You have disabled PROLOGUE and EPILOGUE generation with the OPTION directive. You properly use RET 0Ch to have the 12 bytes of arguments cleaned from the stack.
From the perspective of the STDCALL calling convention your code is correct as it pertains to stack usage. There is a serious issue, though: the Microsoft Windows STDCALL calling convention requires the callee to preserve all the registers it uses except EAX, ECX, and EDX. You clobber EDI and ESI, and both need to be saved before you use them. In your code you save them after their contents are destroyed. You have to push both ESI and EDI on the stack first. This requires adding 8 to the offsets relative to ESP. Your code should have looked like this:
__MyMemcpy PROC _dest : DWORD, _source : DWORD, _size : DWORD
PUSH EDI ; Save registers first
PUSH ESI
MOV EDI, DWORD PTR [ESP + 0CH] ; Arguments are offset by an additional 8 bytes
MOV ESI, DWORD PTR [ESP + 10H]
MOV ECX, DWORD PTR [ESP + 14H]
__AGAIN:
LODSB
STOSB
LOOP __AGAIN
POP ESI ; Restore the caller (non-volatile) registers
POP EDI
RETN 0CH
__MyMemcpy ENDP
You asked why you appear to be getting an error about ESP or a stack issue. I assume the error says that the value of ESP was not properly saved across a function call.
This could be a result of either ESP being incorrect when mixing STDCALL and CDECL calling conventions or it can arise out of the value of the saved ESP being clobbered by the function. It appears in your case it is the latter.
I wrote a small C++ project with this code that has similar behaviour to your C program:
#include <iostream>
extern "C" void __stdcall __MyMemcpy( void* __restrict, void* __restrict, int);
typedef struct {
void(__stdcall* MemCpy)(void* __restrict, void* __restrict, int);
}MemFunc;
int initmemfunc(MemFunc* f) {
f->MemCpy = __MyMemcpy;
return 0;
}
char buf1[] = "Testing";
char buf2[200];
int main()
{
MemFunc mf = { 0 };
initmemfunc(&mf);
mf.MemCpy(buf2, buf1, strlen(buf1));
std::cout << "Hello World!\n" << buf2;
}
When I use code like yours that doesn't properly save ESI and EDI I discovered this in the generated assembly code displayed in the Visual Studio C/C++ debugger:
I have annotated the important parts. The compiler has generated C runtime checks (these can be disabled, but they will just hide the problem and not fix it) including a check of ESP across a STDCALL function call. Unfortunately it relies on saving the original value of ESP (before pushing parameters) into the register ESI. As a result a runtime check is made after the call to __MyMemcpy to see if ESP and ESI are still the same value. If they aren't you get the warning about ESP not being saved correctly.
Since your code incorrectly clobbers ESI (and EDI) the check fails. I have annotated the debug output to hopefully provide a better explanation.
You can avoid the LODSB/STOSB loop to copy data. There is an instruction that does this very operation: REP MOVSB copies ECX bytes from the address in ESI to the address in EDI. A version of your code could have been written as:
__MyMemcpy PROC _dest : DWORD, _source : DWORD, _size : DWORD
PUSH EDI ; Save registers first
PUSH ESI
MOV EDI, DWORD PTR [ESP + 0CH] ; Arguments are offset by an additional 8 bytes
MOV ESI, DWORD PTR [ESP + 10H]
MOV ECX, DWORD PTR [ESP + 14H]
REP MOVSB
POP ESI ; Restore the caller (non-volatile) registers
POP EDI
RETN 0CH
__MyMemcpy ENDP
If you want to use the power of PROC to save the registers ESI and EDI, you can list them with the USES keyword. You can also reference the argument locations on the stack by name, and you can have MASM generate the proper EPILOGUE sequence for the calling convention by simply using ret. This cleans up the stack appropriately; in the case of STDCALL it returns by removing the specified number of bytes from the stack (ret 0ch in this case, since there are three 4-byte arguments).
The downside is that MASM then generates PROLOGUE and EPILOGUE code, which can make things less efficient:
.MODEL FLAT, STDCALL
.CODE
__MyMemcpy PROC USES ESI EDI dest : DWORD, source : DWORD, size : DWORD
MOV EDI, dest
MOV ESI, source
MOV ECX, size
REP MOVSB ; Use instead of LODSB/STOSB+Loop
RET
__MyMemcpy ENDP
END
The assembler would generate this code for you:
PUBLIC __MyMemcpy@12
__MyMemcpy@12:
push ebp
mov ebp,esp ; Function prologue generated by PROC
push esi ; USES caused the assembler to push ESI/EDI on the stack
push edi
mov edi,dword ptr [ebp+8]
mov esi,dword ptr [ebp+0Ch]
mov ecx,dword ptr [ebp+10h]
rep movs byte ptr es:[edi],byte ptr [esi]
; MASM generated this from the simple RET instruction to restore registers,
; clean up stack and return back to caller per the STDCALL calling convention
pop edi ; Assembler restores the registers listed in USES
pop esi
leave
ret 0Ch
Some may rightly argue that having the assembler obscure all this work makes the code potentially harder to understand for someone who doesn't realize the special processing MASM can do with a PROC declared function. This may result in harder to maintain code for someone else that is unfamiliar with MASM's nuances in the future. If you don't understand what MASM may generate, then sticking to coding the body of the function yourself is probably a safer bet. As you have found that also involves turning PROLOGUE and EPILOGUE code generation off.
The reason the stack is corrupted is that MASM "secretly" inserts prologue code into your function. When I added the option to disable that, the function worked for me.
You can see this when you switch to assembly mode while still in the C code and then step into your function. It seems that VS doesn't switch to assembly mode when already in the assembly source.
.586
.MODEL FLAT,STDCALL
OPTION PROLOGUE:NONE
.CODE
mymemcpy PROC dest:DWORD, src:DWORD, sz:DWORD
MOV EDI, [ESP+04H]
MOV ESI, [ESP+08H]
MOV ECX, [ESP+0CH]
AGAIN_:
LODSB
STOSB
LOOP AGAIN_
RETN 0CH
mymemcpy ENDP
END

Mixing C and Assembly

I'm writing a program in assembly to read a disk through ports (0x1f0-0x1f7) and I'm mixing it with C. I have a function in assembly that I call from my C main function. The function has 1 parameter, the sectors to read:
Kernel.c
extern int _readd(int nmrsector);
(...)
int sector = 257;
int error = _readd(sector);
if(error == 0) PrintString("Error"); //It is declared on my screen.h file
disk.asm
global _readd
_readd:
push eax
push ebx
push ecx
push edx
push ebp
mov ebp, esp
mov eax, [ebp+8]
mov ecx, eax
cmp ecx, 256
jg short _fail
jne short _good
_fail:
xor eax, eax
leave
ret
_good:
xor eax, eax
mov eax, 12
leave
ret
It crashes when I run it with VirtualBox. Any ideas?
If you save CPU registers when you enter a function, you need to restore them when you are finished. Your PUSHes need to be matched with POPs. Here, leave only undoes the frame setup, so the four registers pushed before it are still on the stack and ret pops a saved register value (EDX) instead of the return address.
Also, if you use a stack frame to access local variables and parameters, set up the frame (push ebp ; mov ebp, esp) before everything else, so you can more easily refer to them. Here [ebp+8] doesn't refer to a parameter, because you alter the stack before setting up the frame.

What is the difference between mov al, byte ptr [esi] and mov al,[num]

I have some code that works properly, but I want to know the difference between using mov al, byte ptr [esi] and mov al,[num]. Also, why do I need to define the pointer variable with dd instead of db? Here's the code:
.386
.model flat, stdcall
.stack 1000h
Sleep proto arg1:dword
printf proto c arg1:ptr byte, printlist:vararg
.data
array db "hello" ,0
pointerByte dd offset array
fmtmsg1 db "%c",0
.code
public main
main proc
mov esi,pointerByte
mov cl,0
repeat_loop:
push ecx
mov al,byte ptr [esi]
invoke printf,addr fmtmsg1,al
inc esi
pop ecx
inc cl
cmp cl,5
jne repeat_loop
;done
ret
main endp
end main
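mov al,[num] reads one byte from the fixed address of the label num (direct addressing), while mov al, byte ptr [esi] reads one byte from whatever address esi currently holds (register-indirect addressing). Given that esi holds the address of num, both load the same byte, so there is no practical difference between the two. The byte ptr is optional here because the 8-bit destination al already fixes the operand size.
You need to define the pointer variable with dd (define doubleword) because you are storing a 32-bit offset. db (define byte) only reserves 8 bits. Also, take into consideration that dw (define word, 16 bits) exists.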

Initialize char[] fails, esi contains wrong value

I want to initialize a char array, but when I do this my program crashes. Here's my code:
void kernelEnteredMsg() {
char str[] = "Kernel successfully entered!";
}
Here's the disassembly:
push ebp
mov ebp,esp
push edi
push esi
push ebx
sub esp,byte +0x30
lea edx,[ebp-0x2d]
mov ebx,0x402000 ; load an address outside my data segment
mov eax,0x1d
mov edi,edx
mov esi,ebx ; move this address to esi
mov ecx,eax
rep movsb ; here the program crashes
add esp,byte +0x30
pop ebx
pop esi
pop edi
pop ebp
ret
I don't understand why it loads esi with 0x402000. But this seems to cause the error. Can somebody explain what happens here and how to fix it?
PS: "Kernel successful entered!" is at 0x1000 in binary file.
C code:
void kernelEnteredMsg();
void entryPoint() {
kernelEnteredMsg();
}
void kernelEnteredMsg() {
char str[] = "Kernel successfully entered!";
int size = 28;
}
Calling assembly code:
extern _entryPoint
global _main
section .text
_main: ; start of kernel
nop
; setup ds, es, ss and gs
mov ax, 16
mov ds, ax
mov es, ax
mov ss, ax
mov sp, 0x4000
mov ax, 24
mov gs, ax
mov [gs:0], dword 0x07690748 ; test graphics
call _entryPoint ; enter kernel C code
jmp $
This code copies the string from the .text section to the local stack, because the char array is not const. If you do not need to modify the string, a simple solution may be to avoid the copy: just make it const char, for example a const char * that points at the literal instead of a local array.
I don't understand why it loads esi with 0x402000.
ESI is the source of the string copy instruction 'rep movsb', EDI is the destination.
The address is constructed as IMAGE_BASE+SECTION (IIRC) in the PE file (assuming it is PE). Remember that the file has a FILE_ALIGN and a SECTION_VIRTUAL_ADDRESS, so a section may be at position 0x1000 in the file (FILE_ALIGN) and at 0x2000 in memory (VIRTUAL_ADDRESS), resulting in IMAGE_BASE+VIRTUAL_ADDRESS=0x402000. You can use a PE explorer like CFF Explorer (http://www.ntcore.com/exsuite.php) to display this (if it's a .bin file it may not be applicable, but it has to have some kind of format).
Another possibility may be a wrong state of the DF-Flag leading to wrong behaviour of the string copy instruction (should not happen, because the compiler should take care of this).
Try inserting
__asm__ ("cld");
before the char str[] or in the _main procedure, to clear the direction flag so that string instructions increment their pointers.

Get address of current instruction for x86 [duplicate]

I am using Linux on x86 (64-bit to be precise). Is there a way I can get the address of the current instruction? I actually want to write my own simplified versions of setjmp/longjmp. Here, R.. posted a simplified version of longjmp. Any idea how setjmp is implemented? A simplified version, that is, without taking exceptions, signals, etc. into account.
I believe in 64-bit code you can simply do lea rax, [rip].
The 32-bit idiom is:
call next
next: pop eax
If using GCC, you could also use __builtin_return_address(0), which returns the address the current function will return to.
The offset-into-the-current-segment register (EIP) is not normally accessible. However, there is a hackish-way to read it indirectly - you trick the program into pushing the value of EIP onto the stack, then just read it off. You could create a subroutine that looks like this:
GetAddress:
mov eax, [esp]
ret
...
call GetAddress ; address of the instruction after this call is stored in eax
Or, even simpler:
call NextLine
NextLine:
pop eax ; EAX = return address pushed by the call, i.e. the address of this pop
If you use a CALL FAR instruction, the segment value (CS) will be pushed on the stack as well.
If you're using C, there are various compiler-specific C-extensions you could use on this page. See also this interesting article.
This site gives a simple version of setjmp and longjmp, which is as follows.
#include "setjmp.h"
#define OFS_EBP 0
#define OFS_EBX 4
#define OFS_EDI 8
#define OFS_ESI 12
#define OFS_ESP 16
#define OFS_EIP 20
__declspec(naked) int setjmp(jmp_buf env)
{
__asm
{
mov edx, 4[esp] // Get jmp_buf pointer
mov eax, [esp] // Save EIP
mov OFS_EIP[edx], eax
mov OFS_EBP[edx], ebp // Save EBP, EBX, EDI, ESI, and ESP
mov OFS_EBX[edx], ebx
mov OFS_EDI[edx], edi
mov OFS_ESI[edx], esi
mov OFS_ESP[edx], esp
xor eax, eax // Return 0
ret
}
}
__declspec(naked) void longjmp(jmp_buf env, int value)
{
__asm
{
mov edx, 4[esp] // Get jmp_buf pointer
mov eax, 8[esp] // Get return value (eax)
mov esp, OFS_ESP[edx] // Switch to new stack position
mov ebx, OFS_EIP[edx] // Get new EIP value and set as return address
mov [esp], ebx
mov ebp, OFS_EBP[edx] // Restore EBP, EBX, EDI, and ESI
mov ebx, OFS_EBX[edx]
mov edi, OFS_EDI[edx]
mov esi, OFS_ESI[edx]
ret
}
}
