Get address of current instruction for x86 [duplicate] - c

This question already has answers here:
Reading program counter directly
(7 answers)
Closed 4 years ago.
I am using Linux with x86 (64 bit to be precise). Is there a way I can get the address of the current instruction. Actually I want to write my own simplified versions of setjmp/longjmp. Here, R.. posted a simplified version of longjmp. Any idea how setjmp is implemented. A simplified version that is, without taking into account of exceptions and signals etc...

I believe in 64-bit code you can simply do lea rax, [rip].
The 32-bit idiom is:
call next
next: pop eax

If using GCC, you could also use __builtin_return_address

The offset-into-the-current-segment register (EIP) is not normally accessible. However, there is a hackish-way to read it indirectly - you trick the program into pushing the value of EIP onto the stack, then just read it off. You could create a subroutine that looks like this:
GetAddress:
mov eax, [esp]
ret
...
call GetAddress ; address of this line stored in eax
Or, even simpler:
call NextLine
NextLine:
pop eax ; address of previous line stored in EAX
If you use a CALL FAR instruction, the segment value (CS) will be pushed on the stack as well.
If you're using C, there are various compiler-specific C-extensions you could use on this page. See also this interesting article.

This site gives a simple version of setjmp and longjmp, which is as follows.
#include "setjmp.h"
#define OFS_EBP 0
#define OFS_EBX 4
#define OFS_EDI 8
#define OFS_ESI 12
#define OFS_ESP 16
#define OFS_EIP 20
__declspec(naked) int setjmp(jmp_buf env)
{
__asm
{
mov edx, 4[esp] // Get jmp_buf pointer
mov eax, [esp] // Save EIP
mov OFS_EIP[edx], eax
mov OFS_EBP[edx], ebp // Save EBP, EBX, EDI, ESI, and ESP
mov OFS_EBX[edx], ebx
mov OFS_EDI[edx], edi
mov OFS_ESI[edx], esi
mov OFS_ESP[edx], esp
xor eax, eax // Return 0
ret
}
}
__declspec(naked) void longjmp(jmp_buf env, int value)
{
__asm
{
mov edx, 4[esp] // Get jmp_buf pointer
mov eax, 8[esp] // Get return value (eax)
mov esp, OFS_ESP[edx] // Switch to new stack position
mov ebx, OFS_EIP[edx] // Get new EIP value and set as return address
mov [esp], ebx
mov ebp, OFS_EBP[edx] // Restore EBP, EBX, EDI, and ESI
mov ebx, OFS_EBX[edx]
mov edi, OFS_EDI[edx]
mov esi, OFS_ESI[edx]
ret
}
}

Related

Stack cleanup not working (__stdcall MASM function)

there's something weird going on here. Visual Studio is letting me know the ESP value was not properly saved but I cannot see any mistakes in the code (32-bit, windows, __stdcall)
MASM code:
.MODE FLAT, STDCALL
...
memcpy PROC dest : DWORD, source : DWORD, size : DWORD
MOV EDI, [ESP+04H]
MOV ESI, [ESP+08H]
MOV ECX, [ESP+0CH]
AGAIN_:
LODSB
STOSB
LOOP AGAIN_
RETN 0CH
memcpy ENDP
I am passing 12 bytes (0xC) to the stack then cleaning it up. I have confirmed by looking at the symbols the functions symbol goes like "memcpy#12", so its indeed finding the proper symbol
this is the C prototype:
extern void __stdcall * _memcpy(void*,void*,unsigned __int32);
Compiling in 32-bit. The function copies the memory (I can see in the debugger), but the stack cleanup appears not to be working
EDIT:
MASM code:
__MyMemcpy PROC _dest : DWORD, _source : DWORD, _size : DWORD
MOV EDI, DWORD PTR [ESP + 04H]
MOV ESI, DWORD PTR [ESP + 08H]
MOV ECX, DWORD PTR [ESP + 0CH]
PUSH ESI
PUSH EDI
__AGAIN:
LODSB
STOSB
LOOP __AGAIN
POP EDI
POP ESI
RETN 0CH
__MyMemcpy ENDP
C code:
extern void __stdcall __MyMemcpy(void*, void*, int);
typedef struct {
void(__stdcall*MemCpy)(void*,void*,int);
}MemFunc;
int initmemfunc(MemFunc*f){
f->MemCpy=__MyMemcpy
}
when I call it like this I get the error:
MemFunc mf={0};
initmemfunc(&mf);
mf.MemCpy(dest,src,size);
when I call it like this I dont:
__MyMemcpy(dest,src,size)
Since you have provided an update to your question and comments suggesting you disable prologue and epilogue code generation for functions created with the MASM PROC directive I suspect your code looks something like this:
.MODEL FLAT, STDCALL
OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE
.CODE
__MyMemcpy PROC _dest : DWORD, _source : DWORD, _size : DWORD
MOV EDI, DWORD PTR [ESP + 04H]
MOV ESI, DWORD PTR [ESP + 08H]
MOV ECX, DWORD PTR [ESP + 0CH]
PUSH ESI
PUSH EDI
__AGAIN:
LODSB
STOSB
LOOP __AGAIN
POP EDI
POP ESI
RETN 0CH
__MyMemcpy ENDP
END
A note about this code: beware that if your source and destination buffers overlap this can cause problems. If the buffers don't overlap then what you are doing should work. You can avoid this by marking the pointers __restrict. __restrict is an MSVC/C++ extension that will act as a hint to the compiler that the argument doesn't overlap with another. This can allow the compiler to potentially warn of this situation since your assembly code is unsafe for that situation. Your prototypes could have been written as:
extern void __stdcall __MyMemcpy( void* __restrict, void* __restrict, int);
typedef struct {
void(__stdcall* MemCpy)(void* __restrict, void* __restrict, int);
}MemFunc;
You are using PROC but not taking advantage of any of the underlying power it affords (or obscures). You have disabled PROLOGUE and EPILOGUE generation with the OPTION directive. You properly use RET 0Ch to have the 12 bytes of arguments cleaned from the stack.
From a perspective of the STDCALL calling convention your code is correct as it pertains to stack usage. There is a serious issue in that the Microsoft Windows STDCALL calling convention requires the caller to preserve all the registers it uses except EAX, ECX, and EDX. You clobber EDI and ESI and both need to be saved before you use them. In your code you save them after their contents are destroyed. You have to push both ESI and EDI on the stack first. This will require you adding 8 to the offsets relative to ESP. Your code should have looked like this:
__MyMemcpy PROC _dest : DWORD, _source : DWORD, _size : DWORD
PUSH EDI ; Save registers first
PUSH ESI
MOV EDI, DWORD PTR [ESP + 0CH] ; Arguments are offset by an additional 8 bytes
MOV ESI, DWORD PTR [ESP + 10H]
MOV ECX, DWORD PTR [ESP + 14H]
__AGAIN:
LODSB
STOSB
LOOP __AGAIN
POP ESI ; Restore the caller (non-volatile) registers
POP EDI
RETN 0CH
__MyMemcpy ENDP
You asked the question why it appears you are getting an error about ESP or a stack issue. I assume you are getting an error similar to this:
This could be a result of either ESP being incorrect when mixing STDCALL and CDECL calling conventions or it can arise out of the value of the saved ESP being clobbered by the function. It appears in your case it is the latter.
I wrote a small C++ project with this code that has similar behaviour to your C program:
#include <iostream>
extern "C" void __stdcall __MyMemcpy( void* __restrict, void* __restrict, int);
typedef struct {
void(__stdcall* MemCpy)(void* __restrict, void* __restrict, int);
}MemFunc;
int initmemfunc(MemFunc* f) {
f->MemCpy = __MyMemcpy;
return 0;
}
char buf1[] = "Testing";
char buf2[200];
int main()
{
MemFunc mf = { 0 };
initmemfunc(&mf);
mf.MemCpy(buf2, buf1, strlen(buf1));
std::cout << "Hello World!\n" << buf2;
}
When I use code like yours that doesn't properly save ESI and EDI I discovered this in the generated assembly code displayed in the Visual Studio C/C++ debugger:
I have annotated the important parts. The compiler has generated C runtime checks (these can be disabled, but they will just hide the problem and not fix it) including a check of ESP across a STDCALL function call. Unfortunately it relies on saving the original value of ESP (before pushing parameters) into the register ESI. As a result a runtime check is made after the call to __MyMemcpy to see if ESP and ESI are still the same value. If they aren't you get the warning about ESP not being saved correctly.
Since your code incorrectly clobbers ESI (and EDI) the check fails. I have annotated the debug output to hopefully provide a better explanation.
You can avoid the use of a LODSB/STOSB loop to copy data. There is an instruction that just this very operation (REP MOVSB) that copies ECX bytes pointed to by ESI and copies them to EDI. A version of your code could have been written as:
__MyMemcpy PROC _dest : DWORD, _source : DWORD, _size : DWORD
PUSH EDI ; Save registers first
PUSH ESI
MOV EDI, DWORD PTR [ESP + 0CH] ; Arguments are offset by an additional 8 bytes
MOV ESI, DWORD PTR [ESP + 10H]
MOV ECX, DWORD PTR [ESP + 14H]
REP MOVSB
POP ESI ; Restore the caller (non-volatile) registers
POP EDI
RETN 0CH
__MyMemcpy ENDP
If you were to use the power of PROC to save the registers ESI and EDI you could list them with the USES directive. You can also reference the argument locations on the stack by name. You can also have MASM generate the proper EPILOGUE sequence for the calling convention by simply using ret. This will clean the up the stack appropriately and in the case of STDCALL return by removing the specified number of bytes from the stack (ie ret 0ch) in this case since there are 3 4-byte arguments.
The downside is that you do have to generate the PROLOGUE and EPILOGUE code that can make things more inefficient:
.MODEL FLAT, STDCALL
.CODE
__MyMemcpy PROC USES ESI EDI dest : DWORD, source : DWORD, size : DWORD
MOV EDI, dest
MOV ESI, source
MOV ECX, size
REP MOVSB ; Use instead of LODSB/STOSB+Loop
RET
__MyMemcpy ENDP
END
The assembler would generate this code for you:
PUBLIC __MyMemcpy#12
__MyMemcpy#12:
push ebp
mov ebp,esp ; Function prologue generate by PROC
push esi ; USES caused assembler to push EDI/ESI on stack
push edi
mov edi,dword ptr [ebp+8]
mov esi,dword ptr [ebp+0Ch]
mov ecx,dword ptr [ebp+10h]
rep movs byte ptr es:[edi],byte ptr [esi]
; MASM generated this from the simple RET instruction to restore registers,
; clean up stack and return back to caller per the STDCALL calling convention
pop edi ; Assembler
pop esi
leave
ret 0Ch
Some may rightly argue that having the assembler obscure all this work makes the code potentially harder to understand for someone who doesn't realize the special processing MASM can do with a PROC declared function. This may result in harder to maintain code for someone else that is unfamiliar with MASM's nuances in the future. If you don't understand what MASM may generate, then sticking to coding the body of the function yourself is probably a safer bet. As you have found that also involves turning PROLOGUE and EPILOGUE code generation off.
The reason why the stack is corrupted is that MASM "secretly" inserts the prologue code to your function. When I added the option to disable that, the function works for me now.
You can see this, when you switch to assembly mode while still in the C code and then step into your function. It seems that VS doesn't swtich to assembly mode when already in the assembly source.
.586
.MODEL FLAT,STDCALL
OPTION PROLOGUE:NONE
.CODE
mymemcpy PROC dest:DWORD, src:DWORD, sz:DWORD
MOV EDI, [ESP+04H]
MOV ESI, [ESP+08H]
MOV ECX, [ESP+0CH]
AGAIN_:
LODSB
STOSB
LOOP AGAIN_
RETN 0CH
mymemcpy ENDP
END

Assembly retrieving buffer to c function parameter

I'm writing an assembly function that will read from IDE through ports.
I'm calling the parameters through x86 base pointer (EBP).
I debugged my kernel.bin (with gdb and qemu) and I that when I'm calling my recv buffer to print, eax will return values like:36h01h10h
IBM Char Table
My disk.asm is divided by read and write. Is it possible that I'm writing it wrong? Is it legal to move directly [ebp+16] to esi (to write)? If I, on read function, move [ebp+16] directly to edi is wrong? I'm using a register poiting to that address and making edi to point to that register:
In my disk.asm, to read the disk I have this:
sub dx, 7 ;dx = 0x1f0
mov ecx, 256
mov edi, bufferrecv
rep insw
(...)
push ebx
mov ebx, [ebp+16]
mov [ebx], long word bufferrecv
pop ebx
mov esp, ebp
pop ebp
ret
And to write disk:
sub dx, 7 ;dx = 0x1f0
mov ecx, 256
mov esi, [ebp+16]
rep outsw
(...)
I'm declaring those functions this way:
Kernel.c
extern int _readd(int sector_count, int nmrsector, STRING in_msg);
extern int writed(int sector_count, int nmrsector, STRING out_msg);
The STRING type was declared inside my types.h as char*

Mixing C and Assembly

I'm doing a program in assembly to read a disk through ports (0x1f0-0x1f7) and I'm mixing it with c. I have a function in assembly that I will call in my c main funtion. My main function as 1 parameter: sectors to read:
Kernel.c
extern int _readd(int nmrsector);
(...)
int sector = 257;
int error = _readd(sector);
if(error == 0) PrintString("Error"); //It is declared on my screen.h file
disk.asm
global _readd
_readd:
push eax
push ebx
push ecx
push edx
push ebp
mov ebp, esp
mov eax, [ebp+8]
mov ecx, eax
cmp ecx, 256
jg short _fail
jne short _good
_fail:
xor eax, eax
leave
ret
_good:
xor eax, eax
mov eax, 12
leave
ret
It crashes when run it with VirtualBox. Any ideas?
If you save CPU registers when you enter a function, you need to restore them when you are finished. Your PUSHs need to be matched with POPs.
Also, if you use a stack frame to access local variables and parameters, setup the frame (push ebp ; mov ebp, esp) before everything, so you can more easily refer to them. Here [ebp+8] doesn't refer to a parameter, because you alter the stack before setting up the frame.

I am dealing with a possible array in assembly, but I cannot figure out what the starter value is

Size contains the number 86.
var_10= dword ptr -10h
var_C= dword ptr -0Ch
size= dword ptr 8
push ebp
mov ebp, esp
sub esp, 28h
mov eax, [ebp+size]
mov [esp], eax ; size
call _malloc
mov ds:x, eax
mov [ebp+var_C], 0
jmp short loc_804889E
loc_804889E: ~~~~~~~~~~~~~~~~~~~~~
mov eax, [ebp+size]
sub eax, 1
cmp eax, [ebp+var_C]
jg short loc_8048887
loc_8048887: ~~~~~~~~~~~~~~~~~~~~~
mov edx, ds:x
mov eax, [ebp+var_C]
add edx, eax
mov eax, [ebp+var_C]
add eax, 16h
mov [edx], al
add [ebp+var_C], 1
I am having difficulties reversing this portion of a project I am working on. There's a portion of the code where ds:x is moved into edx and is added with var_c and I am unsure where to go with that.
To me the program looks like it calls malloc and then moves that into ds:x and then moves 0 to var_c.
After that it simply subtracts 1 from the size of my pointer array and compares that number to 0, then jumps to a portion where it adds ds:x into edx so it can add eax to edx.
Am I dealing with some sort of array here? What is the first value that's going to go into edx in loc_8048887? Another way this could help would be to see a C equivalent of it... But that would be what I am trying to accomplish and would rather learn the solution through a different means.
Thank you!
In x86 assembly there's no strict distinction between a variable stored in memory and an array in memory. It only depends on how you access the memory region. All you have is code and data. Anyway, I'd say that ds:x is an array as because of this code here:
mov edx, ds:x ; edx = [x]
mov eax, [ebp+var_C] ; eax = something
add edx, eax ; edx = [x] + something
mov eax, [ebp+var_C] ; eax = something
add eax, 16h ; eax = something + 0x16
mov [edx], al ; [[x] + something ] = al . Yes, ds:x is an array!
What is the value of edx in loc_8048887? To find it out you only need some very basic debugging skills. I assume you have gdb at hand, if not, get it ASAP. Then compile the code with debug symbols and link it, then run gdb with the executable, set a code breakpoint at loc_8048887, run the program with r, and finally check the value of edx.
These are the commands you need:
gdb myexecutable
(gdb) b loc_8048887
(gdb) r
(gdb) info registers edx

porting c function to NASM

I want to port this c++ function to NASM.
DWORD WINAPI Generic(LPVOID lpParameter) {
__asm {
mov eax, [lpParameter]
call eax
push 0
call ExitThread
}
return 0;
}
I have some problems understanding how lpParameter works here, and i have and error there when i compile this on NASM.
this is my current code:
BITS 32
global _start
_start:
mov eax, [lpParameter]
call eax
push 0
call exitfunk
exitfunk:
mov ebx, 0x0A2A1DE0
push 0x9DBD95A6
call ebp
cmp al, byte 6
jl short goodbye
cmp bl, 0xE0
jne short goodbye
mov ebx, 0x6F721347
goodbye:
push byte 0
push ebx
call ebp
any one can help me?
IIRC, NASM does not support named function parameters, and so you have to extract the parameters manually from registers or the stack. AFAIR, all Win32 API functions expect parameters on the stack, so you should replace this
mov eax, [lpParameter]
with this
mov eax, [esp+4]

Resources