I want to understand how my data string ends up in rdx. As I understand it, the mov instruction copies the data found at the source address into the target, so the contents of rbp-0x28 are put into rdx. I checked what's at rbp-0x28 and it is not the data string ('AAAAAAA'). If, however, I let the instruction execute with ni, then rdx contains the string. I don't know how the string ends up in rdx, as it is not contained in rbp-0x28 beforehand. I know that my data is located at 0x7fffffffe58f, but I'm not sure how or when it's loaded into rdx. Any help is greatly appreciated!
This depends a lot on which compiler and debugger you're using, as well as the architecture and calling convention. I ran your code with Apple's Clang compiler and lldb and got the expected results. There are minor variations between my output and yours, but it's relatively easy to follow. Since you only posted partial output of your function's disassembly at offset +0x12, I'll assume that, earlier in the function, whichever register held the first argument to the function call (in my case rdi) was stored to [rbp-0x28], which put the pointer there.
This was my output.
mov rsi, qword ptr [rbp-0x30] is the equivalent of your mov rdx,[rbp-0x28]. I think you're under Microsoft's x64 calling convention, so your first argument is passed in rcx. Before that instruction, my output has mov [rbp-0x30], rdi, which I believe in your case will be mov [rbp-0x28],rcx.
I set another breakpoint at the next instruction, mov rdi,rcx. There I read the contents of rsi, which in your case would be rdx. It printed rsi = 0x00007ffeefbff94a
At that memory address I found the string 'AAAAAAA'. Next I read the register rbp, which printed rbp = 0x00007ffeefbff740. Then I read the memory at 0x00007ffeefbff740-0x30 (in your case it would be -0x28), which is 0x7ffeefbff710, and it held the same address that was stored in rsi:
0x7ffeefbff94a (stored little-endian), which we know points to the string 'AAAAAAA'. So I'm going to assume that what you're expecting at rbp-0x28 is the string itself. Instead, rbp-0x28 is the address of a stack slot that holds a pointer to the string. Also make sure to do your offsets correctly. Follow these steps:
Breakpoint at lea rax,[rbp-0x20]
Check the value of rdx, view the memory at that address and it should give you the string.
Then check the value of rbp. Subtract 0x28 from it. View the memory at the offset.
This should give you the value of rdx. Which should in turn point to the string you're looking for.
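To make the pointer-versus-string distinction concrete, here is a minimal C sketch. The function name read_slot and the local variable slot are made up for illustration; slot merely stands in for the stack location at [rbp-0x28], and a real compiler is free to place it elsewhere.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model of the stack slot at [rbp-0x28].
   Copies the raw bytes of the slot into `raw` (roughly what examining
   8 bytes at $rbp-0x28 in the debugger shows) and returns the value that
   a `mov rdx, [rbp-0x28]` would load from it. */
const char *read_slot(const char *arg, unsigned char raw[sizeof(char *)])
{
    const char *slot = arg;            /* like: mov [rbp-0x28], <arg register> */
    assert(raw != NULL);
    memcpy(raw, &slot, sizeof slot);   /* the slot's bytes are an address      */
    return slot;                       /* like: mov rdx, [rbp-0x28]            */
}
```

The bytes copied into raw are the bytes of the pointer, not the characters 'A'. Only following the returned pointer reaches the string, which matches what the debugger shows before and after the mov.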
I'm having some issues at the moment, I'm not sure if this is the problem or not with my program, but this is one thing that I'm not 100% about, so I'm going to use it as a learning opportunity.
I have this instruction: in al, 0x60 to read the scancode from the keyboard.
I'm trying to send this scancode to a function written in C. The C function declaration looks like: void cFunction(unsigned int scancode).
So basically, here is what I'm doing:
in al, 0x60
movzx EAX, AL
push EAX
call cFunction
The goal is to get a value like this into the C function: 0x10, which would mean the Q was pressed, 0x11 the W was pressed, 0x12 is E, and so on...
Questions:
Is what I'm doing passing the right value to the function or not?
Is the result going to be different if I were to push only AX instead of EAX?
I only need the byte AL, but obviously I cannot push AL, so I've been zero extending it to EAX. So, let's say if Q was pressed and I compared it like: if(scancode == 0x10), would this interpret correctly no matter that EAX vs AX was pushed? Or do I only need to get the value of AL into the scancode? If not, how can I go about getting AL to the function?
The answer depends on which calling convention your C compiler uses. If it is standard cdecl, then yes, you are generally doing it right.
Some notes:
It is better to use C data types with an exact size, such as uint32_t, rather than int, whose size isn't fixed. These data types are defined in stdint.h
If you want to use AX instead of EAX, you must define your function as
void cFunction(uint16_t scancode).
Since AL is a part of AX (and EAX), it is better to just zero AX (or EAX) before reading the key scancode than to extend it with MOVZX after reading. The typical ways to do this in assembly:
XOR register with itself
XOR EAX, EAX
Moving zero in register
MOV EAX, 0
is also correct, but XOR has a shorter encoding and is usually a bit faster (and using it to zero a register is something of a tradition now)
So there are a few things to talk about here. Let's cover them one at a time.
First, I'm assuming you're running either without an OS (i.e., making your own OS) or running in an OS like MS-DOS that doesn't get in the way of I/O. As Olaf rightly pointed out, in al, 0x60 isn't going to work on a modern, protected-mode OS; nor will it work on a USB keyboard (unless you're running in an emulator or virtual machine that is pretending to provide a classic PS/2 keyboard).
Second, I'm assuming you're programming on a 32-bit CPU. The C ABI (application binary interface) is different for a 16-bit CPU, a 32-bit CPU, and a 64-bit CPU, so the code will read differently depending on which CPU you're using.
Third, port 60h is a weird beast, and it's been a long time since I wrote a driver for it. The values you read in aren't the values you think you're going to read in a lot of the time, and there are the E0h extended codes, and there's the behavior of the Pause key. Writing a bug-free keyboard driver is a lot harder than it looks. But let's ignore all that for this question.
So let's assume you have no OS to get in the way, and a 32-bit CPU, and only the most basic keystrokes. How would you pass data from the keyboard to a C function? Pretty much the way you did:
in al, 0x60
movzx eax, al
push eax
call cFunction
Why is this correct?
Well, the first line loads an 8-bit register with the keyboard; and it must be al because that's the only 8-bit register that in can write to.
The 32-bit C ABI (cdecl) expects a function's parameters to be pushed onto the stack from right to left, so in order to call the C function there must be a push instruction before the call. In 32-bit mode, push normally operates on 32-bit values, so you push eax, ebx, esi, edi, and so on: there is no instruction to push al directly. (You could write the byte to the stack yourself, and a 16-bit push ax exists via an operand-size prefix, but either one leaves the stack misaligned, and the callee expects every argument to occupy a 4-byte-aligned slot.) So the way to pass the value is first to promote it from 8 bits to 32 bits, and movzx does that nicely. Under cdecl the caller also removes the argument after the call returns, for example with add esp, 4.
There are, for what it's worth, other ways to do it. You could clear eax before the in:
xor eax, eax
in al, 0x60
push eax
call cFunction
Whether this is slower than the original depends on the processor. The concern is a partial-register stall: some processors track al separately from the rest of eax, so reading eax right after writing al can force a brief stall while the pieces are merged. Many processors recognize xor eax, eax as a zeroing idiom that avoids this penalty, and in either case the in instruction itself is far slower than anything else in the sequence, so the difference is unlikely to matter here.
It's worth pointing out that if you're in classic 16-bit mode (8086, or '286 protected mode), the calling sequence is slightly different:
in al, 0x60
xor ah, ah ; zero-extend al into ax (movzx requires a 386 or later)
push ax
call cFunction
In this case, int is 16 bits wide, so doing everything as 16 bits is correct. In 64-bit mode the sequence changes more, because the standard 64-bit calling conventions pass the first integer argument in a register instead of on the stack:
in al, 0x60
movzx edi, al ; System V AMD64: first integer argument in edi (use ecx under the Microsoft x64 convention)
call cFunction
Writing a 32-bit register such as edi clears the upper half of the full 64-bit register, so no separate 64-bit extension is needed. The stack pointer must still be 16-byte aligned at the call, and the Microsoft convention additionally requires the caller to reserve 32 bytes of shadow space.
So there you have it. Various ways of interacting with the C ABI to get your port data into your function, depending on your CPU and environment.
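For the C side of the call, here is a small sketch of why the zero-extension matters. The names zero_extend_al, scancode_to_key, and demo_scancode are hypothetical; scancode_to_key just stands in for whatever cFunction does with its argument, and only the three codes mentioned in the question are mapped.

```c
#include <assert.h>
#include <stdint.h>

/* What movzx eax, al computes: the byte in the low 8 bits, zeros above. */
uint32_t zero_extend_al(uint8_t al)
{
    return (uint32_t)al;
}

/* Hypothetical stand-in for cFunction: maps a few make codes to letters
   and returns 0 for anything it does not recognize. */
char scancode_to_key(uint32_t scancode)
{
    switch (scancode) {
    case 0x10: return 'Q';
    case 0x11: return 'W';
    case 0x12: return 'E';
    default:   return 0;
    }
}

/* Usage sketch: the comparison from the question works once the byte has
   been zero-extended, and fails if the upper bits hold leftover garbage. */
void demo_scancode(void)
{
    assert(scancode_to_key(zero_extend_al(0x10)) == 'Q');
    assert(scancode_to_key(0xABCD0010u) == 0);   /* upper bits not cleared */
}
```

The second assertion is the case to avoid: if only al is meaningful and the rest of the pushed value is stale, a comparison such as scancode == 0x10 no longer matches.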
I'm trying to understand the basics of addressing in PE files, so I made a simple application with a couple of functions that call malloc, linked statically against the msvcr110 library. I opened the resulting executable in IDA Pro, found the offset of the malloc function (which is not imported), added the base address, and tried to call it like so:
HMODULE hCurrentModule = GetModuleHandle(NULL); // get current module base address
DWORD_PTR hMallocAddr = (0x0048AD60 + (DWORD_PTR)hCurrentModule);
char *pointer;
__asm //calling malloc
{
push 80
mov eax,dword ptr[static_addr]
call eax
add esp,2
mov [pointer],eax
}
I then checked the rebuilt program in IDA Pro to make sure that the malloc offset remains the same, and it's still 0x0048AD60. The problem is that offset+hCurrentModule gives me an incorrect address, and the program crashes when I call it. For example, my hMallocAddr is 0x0186AD60, but in the MSVC debug session the disassembly window shows malloc at 0x0146AD60. What is wrong here?
0x0048AD60 is not the offset of malloc but the actual address of the function when the EXE is loaded at its default load address of 0x00400000. Subtract that default base to get the offset from the start of the image, 0x0008AD60, and add the offset to the real module base instead.
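The arithmetic can be sketched in C as follows. The function name rebase is made up for illustration, and the actual load base 0x013E0000 used in the usage note is not stated in the question; it is inferred from the question's own numbers (0x0186AD60 - 0x0048AD60).

```c
#include <assert.h>
#include <stdint.h>

/* Converts an address shown by a disassembler (valid when the image sits
   at its preferred base) into the address at the actual load base. */
uintptr_t rebase(uintptr_t shown_addr, uintptr_t preferred_base,
                 uintptr_t actual_base)
{
    assert(shown_addr >= preferred_base);
    uintptr_t rva = shown_addr - preferred_base;  /* offset within the image */
    return actual_base + rva;
}
```

With the values from the question, rebase(0x0048AD60, 0x00400000, 0x013E0000) gives 0x0146AD60, which is the address the MSVC disassembly window reports for malloc.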
I see one thing that I don't understand: you push a value but never pop it. When you add 2 to esp, are you trying to fix up the stack? The assembler may encode 80 as an 8-bit immediate, but in 32-bit code push 80 still pushes 4 bytes, so removing it takes add esp,4 rather than add esp,2.
No guarantee that this is the cause, but those are the things I see at first glance; I'm not there and can't see the debug screen.
{
push 80 ; Where do you pop this?
mov eax,dword ptr[static_addr]
call eax
add esp,2 ; Is this the "pop"? Possible bug: is "80" a 16-bit value?
mov [pointer],eax
}
Along the same lines, I'm not totally certain how your app is structured, but are you safe in using eax without pushing it before and popping it afterward? No clue whether that makes a difference; it's just something from a cursory look at the code.
My program redirects a function to another function by writing a jmp instruction to the first few bytes of the function (only i386). It works like expected but it means that I can't call the original function anymore, because it will always jump to the new one.
There are two possible workarounds I could think of:
Create a new function, which overwrites the jmp instruction of the target function and call it. Afterwards the function writes back the jmp instruction.
But I'm not sure how to pass the arguments since there can be any number of them. And I wonder if the target function can jmp somewhere else and skip writing back the jmp instruction (like throw catch?).
Create a new function which executes the code I have overwritten with the jmp instruction. But I can't be sure that the overwritten data is a complete instruction. I'd have to know how many bytes I have to copy to get complete instructions.
So, finally, my questions:
Is there another way I didn't think of?
How do I find the size of an instruction? I already looked at binutils and found this but I don't know how to interpret it.
Here is a sample:
mov, 2, 0xa0, None, 1, Cpu64, D|W|CheckRegSize|No_sSuf|No_ldSuf, { Disp64|Unspecified|Byte|Word|Dword|Qword, Acc|Byte|Word|Dword|Qword }
The 2nd column shows the number of operands (2), and the last column has information about the operands, separated by a comma.
I also found this question which is pretty much the same but I can't be sure that the 7 bytes contain a whole instruction.
Writing a Trampoline Function
Any help is appreciated! Thanks.
Sebastian, you can use the exe_load_symbols() function in hotpatch to get a list of the symbols and their location in the existing exe and then see if you can overwrite that in memory. I have not tried it yet. You may be able to do it with the LD_PRELOAD environment variable as well instead of hotpatch.
--Vikas
How about something like this:
Let's say this is the original function:
Instruction1
Instruction2
Instruction3
...
RET
convert it to this:
JMP new_stuff
old:
Instruction2
Instruction3
...
RET
...
new_stuff:
CMP call_my_function,0
JNZ my_function
Instruction1
JMP old
my_function:
...
Of course you'd have to take the size of the original instructions into account (you could find that out by disassembling with objdump, for example) so that the first JMP fits perfectly (pad with NOPs if the JMP is shorter than the original instruction(s)).
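As a rough sketch of the byte-level side of this, the following C code encodes a 5-byte near jump (opcode E9 followed by a 32-bit relative displacement) and pads any leftover bytes of the overwritten instructions with NOPs. The function names are made up, the addresses in the usage note are hypothetical, and this only builds the bytes; making the code page writable and copying the bytes in is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { JMP_REL32_SIZE = 5 };

/* Builds `jmp target` as it would be encoded at address `from`.
   The displacement is relative to the end of the 5-byte instruction. */
void encode_jmp_rel32(uint8_t out[JMP_REL32_SIZE], uint32_t from,
                      uint32_t target)
{
    assert(out != NULL);
    uint32_t rel = target - (from + JMP_REL32_SIZE); /* wraps modulo 2^32 */
    out[0] = 0xE9;                                   /* jmp rel32 opcode  */
    out[1] = (uint8_t)(rel);                         /* little-endian     */
    out[2] = (uint8_t)(rel >> 8);
    out[3] = (uint8_t)(rel >> 16);
    out[4] = (uint8_t)(rel >> 24);
}

/* Fills bytes [used, total) with NOP (0x90) so the patch ends on an
   instruction boundary of the original code. */
void pad_with_nops(uint8_t *code, size_t used, size_t total)
{
    assert(code != NULL && used <= total);
    memset(code + used, 0x90, total - used);
}
```

For example, a jump placed at 0x00401000 that targets 0x00402000 encodes as E9 FB 0F 00 00. If the original instructions being replaced occupy 7 bytes, pad_with_nops(buf, 5, 7) fills the remaining two.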
I have an application which creates .text segment dumps of win32 processes. It then divides the code into basic blocks. A basic block is a sequence of instructions that always execute one after another (jumps are always the last instructions of such basic blocks). Here is an example:
Basic block 1
mov ecx, dword ptr [ecx]
test ecx, ecx
je 00401013h
Basic block 2
mov eax, dword ptr [ecx]
call dword ptr [eax+08h]
Basic block 3
test eax, eax
je 0040100Ah
Basic block 4
mov edx, dword ptr [eax]
push 00000001h
mov ecx, eax
call dword ptr [edx]
Basic block 5
ret 000008h
Now I would like to group such basic blocks into functions, that is, determine which basic blocks form a function. What's the algorithm? I have to remember that there might be many ret instructions inside one function. And how do I detect fastcall functions?
The simplest algorithm for grouping blocks into functions would be:
note all addresses to which calls are made with call some_address instructions
if the first block after such an address ends with ret, you're done with the function, else
follow the jump in the block to another block and so on until you've followed all possible execution paths (remember about conditional jumps, each of which splits a path into two) and all the paths have finished with ret. You'll need to recognize jumps that organize loops so your program itself does not hang by entering an infinite loop
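The traversal described in these steps can be sketched in C as a worklist walk over the blocks. Everything here is a simplified assumption: struct block, NO_SUCC, MAX_BLOCKS, and assign_function are hypothetical names, each block has at most two successors (fall-through and jump target), and calls are not followed. Marking a block before pushing it is what keeps loops from running forever.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_BLOCKS = 16, NO_SUCC = -1 };

/* Hypothetical basic block: indices of up to two successor blocks.
   A block that ends in ret has no successors. */
struct block {
    int succ[2];
};

/* Labels every block reachable from `entry` with `func_id` in owner[].
   Assumes func_id differs from the initial values stored in owner[]. */
void assign_function(const struct block *blocks, size_t count, int entry,
                     int func_id, int *owner)
{
    int stack[MAX_BLOCKS];
    size_t top = 0;

    assert(count <= MAX_BLOCKS && entry >= 0 && (size_t)entry < count);
    owner[entry] = func_id;
    stack[top++] = entry;
    while (top > 0) {
        int b = stack[--top];
        for (int i = 0; i < 2; i++) {
            int s = blocks[b].succ[i];
            if (s != NO_SUCC && owner[s] != func_id) {
                owner[s] = func_id;   /* mark on push so loops terminate */
                stack[top++] = s;
            }
        }
    }
}
```

Running this once per known call target gives one group of blocks per function. Blocks that end up claimed by two different entry points are the shared-tail case listed among the problems.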
Problems:
a number of calls can be made indirectly by reading function pointers from memory, e.g. you'd have call [some_address] instead of call some_address
some indirect calls can be made to calculated addresses
functions that call other functions before returning may have jmp some_address (a tail call) instead of call some_address immediately followed by ret
call some_address can be simulated with a combination of push some_address + ret OR push some_address + jmp some_other_address
some functions may share code at their end (e.g. they have different entry points, but one or more exit points are the same)
You may use some heuristic to determine where functions start by looking for the most common prolog instruction sequence:
push ebp
mov ebp, esp
Again, this may not work if functions are compiled with the frame pointer suppressed (i.e. they'd use esp instead of ebp to access their parameters on the stack, it's possible).
The compiler (e.g. MSVC++) may also pad the inter-function space with the int 3 instruction and that too can serve as a hint for an upcoming function beginning.
As for differentiating between the various calling conventions, it's perhaps the easiest to look at the symbols (of course, if you have them). MSVC++ generates different name prefixes and suffixes, e.g.:
_function - cdecl
_function@number - stdcall
@function@number - fastcall
If you cannot extract this information from the symbols, you must analyze code to see how parameters are passed to functions and whether functions or their callers remove them from the stack.
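If symbols are available, the decoration check can be sketched in C like this. The enum and the function name classify_decorated are made up for illustration, and the sketch only covers the three C-style patterns listed above; C++ decorated names and anything else fall through to "unknown".

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

enum convention { CONV_UNKNOWN, CONV_CDECL, CONV_STDCALL, CONV_FASTCALL };

/* Guesses the calling convention from a C-style decorated symbol name. */
enum convention classify_decorated(const char *name)
{
    assert(name != NULL);
    const char *at = strrchr(name, '@');  /* last '@', if any */
    int has_size = 0;

    /* A trailing "@<digits>" is the argument byte count. */
    if (at != NULL && at != name && at[1] != '\0') {
        has_size = 1;
        for (const char *p = at + 1; *p != '\0'; p++)
            if (!isdigit((unsigned char)*p))
                has_size = 0;
    }
    if (name[0] == '@' && has_size)
        return CONV_FASTCALL;             /* @function@number */
    if (name[0] == '_' && has_size)
        return CONV_STDCALL;              /* _function@number */
    if (name[0] == '_' && at == NULL)
        return CONV_CDECL;                /* _function        */
    return CONV_UNKNOWN;
}
```

This is only a heuristic on names. When the symbols are stripped, the code analysis described above is still needed.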
You could use the presence of enter to denote the beginning of a function, or certain code which sets up a frame.
push ebp
mov ebp, esp
sub esp, (bytes for "local" stack space)
Later you'll find the opposite code (or leave) before the ret instruction:
mov esp, ebp
pop ebp
You can also use the number of bytes for local stack space to identify local variables.
Identifying thiscall, fastcall, etc, will take some analysis of the code just prior to calls which use the initial location and an evaluation of the registers used/cleaned up.
Have a look at software like windasm or ollydbg. The call and ret operations denote function calls. However code does not run sequentially and jumps can be made all over the place. call dword ptr [edx] depends on the edx register and thus you won't be able to know where it goes unless you do runtime debugging.
To recognize fastcall functions you have to look at how parameters are passed. Fastcall puts the first two pointer-sized parameters in the ecx and edx registers, whereas stdcall pushes them on the stack. See this article for an explanation.
I'm trying to understand what is happening in this code, specifically within __asm__. How do I step through the assembly code, so I can print each variable and what not?
Specifically, I am trying to step through this to figure out what the 8() means and to see how it knows that it is going into the array at index 2.
/* a[2] = 99 in assembly */
__asm__("\n\
movl $_a, %eax\n\
movl $99, 8(%eax)\n\
");
The stepi command steps through the assembly one instruction at a time. There is also nexti, which steps over function calls. These commands do not follow the 'typing a unique prefix of a command is enough' rule that works for most commands, partly because the next and step commands are themselves prefixes of these commands, and partly because they are not used very often, and when they are, it is typically by someone who knows they really want them. They do have the short aliases si and ni.
info registers displays a lot of the register contents.
You'll also want to view the disassembly with the disassemble command.
More info on all of these commands is available with the help command, for instance:
(gdb) help info registers
tells you that info registers displays the integer registers and their contents, but it also tells you that if you supply a register name it will limit output to that register's value:
(gdb) info registers rax
rax 0x0 0
(rax is the x86_64 version of eax)
The first column is the register name, the second is the hex value, and the third is the integer value.
There is useful help for the disassemble command as well.
Remember that gdb has tab completion for many commands, and this can be used for more than just simple commands, though many times it offers you bad suggestions -- it's sometimes helpful, though.
Including a label within your inline assembly will allow you to easily make a break point at the beginning of it.
I was never any good at AT&T syntax, but I'm pretty sure the 8(%eax) part means "the address 8 bytes after the address stored in EAX", that is, it's the offset relative to the address stored in the register.
Approximate equivalent in Intel syntax would be something like this (off the top of my head, so it's entirely possible that there is some minor mistake here...)
mov eax, OFFSET a
mov DWORD PTR [eax+8], 99
movl $_a, %eax // load the memory address of a into %eax
movl $99, 8(%eax) // store the number 99 at the address 8 bytes past %eax (which is a[2])
It seems to me that a is an int array (an int is 4 bytes on most platforms). So each increment of 4 bytes moves to the next item of the array, and a[2] sits at offset 2 * 4 = 8. Other examples of assigning values to this array would be:
movl $10, (%eax) // store the number 10 at the first position: a[0]
movl $20, 4(%eax) // move 4 bytes past the address loaded in %eax
// and store the number 20 at the next position (a[1])
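The same addressing can be written out in C to check the arithmetic. The helper names element_offset and store_at_offset are made up for illustration; store_at_offset does by hand what movl $value, disp(%eax) does, assuming base points at an int array and the offset is a multiple of sizeof(int).

```c
#include <assert.h>
#include <stddef.h>

/* Byte displacement of element `index` when each element is
   `elem_size` bytes: this is the number written before (%eax). */
size_t element_offset(size_t index, size_t elem_size)
{
    return index * elem_size;
}

/* Stores `value` at `byte_offset` bytes past `base`, like
   movl $value, byte_offset(%eax) with the address of the array in %eax. */
void store_at_offset(int *base, size_t byte_offset, int value)
{
    assert(base != NULL && byte_offset % sizeof(int) == 0);
    *(int *)((char *)base + byte_offset) = value;
}
```

With a 4-byte int, element_offset(2, 4) is 8, so store_at_offset(a, 8, 99) writes the same location as a[2] = 99.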