To preface this: yes, this is a project to take control of an executable externally. No, I do not have any malicious intent; the end result of this project won't be anything useful anyway. I am writing this in Cygwin on a 32-bit installation of XP.
What I need to do is change the first few bytes of a COM file to be a jump instruction so that, on execution, it jumps to the very end of the COM file. I have looked in assembler manuals to find what the bytes of that instruction would be so that I can just hard-code them in C, but have had no luck.
First Question: Can I do this in C? It seems to me like I could just insert OpCodes in the beginning of any COM file so that it would execute that instead of the COM file.
Second Question: Does someone know where I can find a resource for OpCodes so that I can insert them in my file? Or does anyone know what the bytes would be for a jump instruction?
If you have any question about the authenticity of this, feel free to ask.
The Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 2A: Instruction Set Reference, explains the encoding of the JMP instruction (real mode is a subset of IA-32).
For a 16-bit near jump (within the current code segment) you'd use 0xE9 followed by the 16-bit relative offset to jump to, low byte first. If your jump is the first bytes of the COM file then the offset will be relative to address 0x103: the first instruction of a COM file is always loaded at address 0x100, and the jump is relative to the instruction following the 3-byte jump.
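As a minimal sketch of that encoding in C (the helper name `encode_com_jump` and the two constants are made up for illustration; only the 0xE9 opcode, the 0x100 load address and the 0x103 base come from the answer above):

```c
#include <stdint.h>

/* Address at which DOS loads the first byte of a COM file. */
#define COM_LOAD_ADDR 0x100u
/* Size of the near jump: opcode 0xE9 plus a 16-bit relative offset. */
#define JMP_NEAR_LEN 3u

/*
 * Hypothetical helper: build a near JMP, meant to sit at the very start of
 * a COM file, that transfers control to the byte at file offset `target`.
 */
void encode_com_jump(uint16_t target, uint8_t out[3])
{
    uint16_t target_addr = (uint16_t)(COM_LOAD_ADDR + target);
    /* The offset is relative to the instruction after the jump (0x103). */
    uint16_t rel = (uint16_t)(target_addr - (COM_LOAD_ADDR + JMP_NEAR_LEN));

    out[0] = 0xE9;                   /* JMP rel16 */
    out[1] = (uint8_t)(rel & 0xFF);  /* low byte first (little-endian) */
    out[2] = (uint8_t)(rel >> 8);
}
```

Passing the length of the COM file as `target` would aim the jump at the first byte past the original contents, which is presumably where "the very end" of the file is meant to be.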
On XP there should be debug.exe. Simply start it, begin assembling with 'a', type jmp ff00, and disassemble ([u]nassemble) the result with 'u' if the corresponding hex dump was not shown.
Notice first that your program is necessarily operating system, ABI, and machine instruction set specific. (e.g. it won't run under Linux/x86-64 or Linux/PowerPC)
You could write the machine instructions in C as a sequence of bytes. Which bytes you have to write (i.e. the encoding of the appropriate jump instructions) is left to you.
Of course, that is not portable C. But you could basically do a memcpy with some appropriate source byte zone.
Maybe libraries like asmjit or GNU lightning might inspire you.
You probably cannot use them directly, but studying their code could help you.
See also x86 wikipedia pages for more references.
I am trying to use the a.out format for my bootloader, and I recall being able to do it in the past. ELF doesn't support 16-bit code very well and produces a lot of undefined behavior when linked with C code. I am using BCC/dev86 to compile the code. The problem I'm having is finding any documentation on where in memory you are supposed to place the text segment of a position-dependent 8086/real-mode a.out file. The entry point is in the header, but I am unable to locate any documentation on how an a.out file is loaded. Any help would be much appreciated.
Usually, the first part of a bootloader (the 16-bit part) is written in assembly. This is mainly because a bootloader needs to do such low-level tasks that using C is not really worth it. Once the bootloader gets itself to protected or long mode, it can use C. Another reason why using C is a bad idea for a bootloader is that a bootloader's task usually is to get the system into a state in which the kernel can start executing. On x86 this usually means going to protected mode. This involves linking some 16-bit C code to some 32-bit C code (and maybe even some 64-bit code). This is really hard to do.
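If you really think you want to continue with what you're doing: an a.out file is an older and simpler format than ELF, essentially a small header (which records the entry point) followed by the text and data segments. The BIOS loads the boot sector at address 0x7c00 and the CPU starts executing it there. So I suppose you should link the .text section to 0x7c00.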
One more comment I'd like to make is that on modern CPUs you don't even need to worry much about how to get sections out of an a.out file. UEFI firmware can load an executable image directly (it expects PE/COFF rather than ELF), and QEMU can load an ELF kernel directly as well.
In advance: I do not want any 'ready-to-use' solution. IMHO it would defeat the purpose of learning something, and that is my primary goal: what I'd like to have is a few explanations/hints, or a deeper understanding.
Now to the problem:
After using gdb and setting a breakpoint, the following dump of the stack is produced (C program):
0xbfa62f84: 0x08048350 0xbfa62fe8 0xb7df0390 0x00000001
0xbfa62f94: 0xbfa63014 0xbfa6301c 0xb7f262d0 0x00000000
The question that emerges now is: what do these values stand for? Or how can they be disassembled/decomposed?
I assume that they encode the memory address + some OP-code like mov, sub etc.
But how? and why? Or asked in a different fashion: how can these instructions be 'read out'?
Thanks in advance
Dan
If you want to understand such a flow, use a debugger like Keil. There you can see the assembly code, the generated hex file, and your source code at the same time. Then, when you step through the code, you will understand how the assembly is related to the hex file and the source code.
Machine code is not stored in the stack; however, the return address stored in the stack frame points to machine code. 0x08048350 is a good candidate for a code address (on x86, the code segment starts at a low address); you can examine the memory starting at that address and try to puzzle out opcodes and registers.
Or you could use the gdb command x/i to display the instructions starting at that address - x/16i 0x08048350 will display the first 16 instructions starting at that address.
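To see from inside a C program that the stack holds return addresses rather than instructions, here is a small sketch. It relies on `__builtin_return_address`, which is a GCC/Clang extension and not standard C; the function name `where_will_i_return` is made up for illustration.

```c
/*
 * Returns the address this call will return to, i.e. the return address
 * that was pushed on the stack by the caller's call instruction.
 * __builtin_return_address is a GCC/Clang extension (assumed available).
 * noinline keeps the function from being merged into its caller.
 */
__attribute__((noinline)) void *where_will_i_return(void)
{
    return __builtin_return_address(0);
}
```

Printing the result with `printf("%p\n", where_will_i_return());` and then passing that value to `x/4i` in gdb should show the instructions that follow the call, which is the same exercise as the one suggested for 0x08048350.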
I'm very new to embedded programming (I started yesterday, actually) and I've noticed something I think is strange. I have a very simple program doing nothing but return 0.
int main() {
return 0;
}
When I run this in IAR Embedded Workbench I have a memory view showing me the program's memory. I've noticed that there is some used memory, then a big block of empty space, and then used memory again (I suck at explaining :P so here is an image of the memory)
Please help me understand this a little more than I do now. I don't really know what to search for because I'm so new to this.
The first two lines are the 8 interrupt vectors, expressed as 32-bit instructions with the highest byte last (little-endian). That is, read them in groups of 4 bytes, with the highest byte last, and then convert to an instruction via the usual method. The first few vectors, including the reset at memory location 0, turn out to be LDR instructions, which load an address from a nearby memory location into the PC register. This causes the processor to jump to that address. (The reset vector is also the first instruction to run when the device is switched on.)
You can see the structure of an LDR instruction here, or at many other places via an internet search. If we write the reset vector 18 f0 95 e5 as e5 95 f0 18, then we see that the PC register is loaded with the address located at an offset of 0x20.
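The byte reordering and the 0x20 figure can be checked with a short sketch. The function names are invented for this example, and the last helper assumes, as the answer describes, that the load is PC-relative with the offset added; in ARM state the PC reads as the instruction's own address plus 8.

```c
#include <stdint.h>

/* Assemble four bytes, stored lowest byte first, into one 32-bit word. */
uint32_t le32(const uint8_t b[4])
{
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Destination register of an ARM single data transfer (bits 15..12). */
unsigned ldr_dest_reg(uint32_t insn)
{
    return (insn >> 12) & 0xFu;
}

/* 12-bit immediate offset of the same instruction (bits 11..0). */
unsigned ldr_imm12(uint32_t insn)
{
    return insn & 0xFFFu;
}

/*
 * Address read by a PC-relative LDR located at insn_addr.
 * ASSUMPTION: the base register is the PC and the offset is added.
 */
uint32_t ldr_pc_relative_source(uint32_t insn_addr, uint32_t insn)
{
    return insn_addr + 8u + ldr_imm12(insn);
}
```

For the reset vector bytes 18 f0 95 e5 at address 0, this gives destination register 15 (the PC), an immediate offset of 0x18, and a source location of 0 + 8 + 0x18 = 0x20.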
So the next two lines are memory locations referred to by instructions in the first two lines. The reset vector sends the PC to 0x00000080, which is where the C runtime of your program starts. (The other vectors send the PC to 0x00000170 near the end of your program. What this instruction is is left to the reader.)
Typically, the C runtime is code added to the front of your program that loads the global variables into RAM from flash, and sets the uninitialized RAM to 0. Your program starts after that.
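A rough sketch of those two steps in C follows. The function name and its parameters are hypothetical; real startup code usually takes these addresses and sizes from symbols defined by the linker configuration, and is often written in assembly.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical startup helper:
 *   1. copy the initial values of initialized globals from flash to RAM;
 *   2. clear the RAM that holds uninitialized globals.
 * The C runtime would call main() only after both steps are done.
 */
void runtime_init(void *data_ram, const void *data_flash, size_t data_size,
                  void *bss_ram, size_t bss_size)
{
    memcpy(data_ram, data_flash, data_size); /* initialized globals */
    memset(bss_ram, 0, bss_size);            /* uninitialized globals start at 0 */
}
```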
Your original question was: why have such a big gap of unused flash? The answer is that flash memory is not really at a premium, so we can waste a little, and that having extra space there allows for forward-compatibility. If we need to increase the vector table size, then we don't need to move the code around. In fact, this interrupt model has been changed in the new ARM Cortex processors anyway.
Physical (not virtual) memory addresses map to physical circuits. The lowest addresses often map to registers, not RAM arrays. In the interest of consistency, a given address usually maps to the same functionality on different processors of the same family, and missing functionality appears as a small hole in the address mapping.
Furthermore, RAM is assigned to a contiguous address range, after all the I/O registers and housekeeping functions. This produces a big hole between all the registers and the RAM.
Alternatively, as @Martin suggests, it may represent uninitialized and read-only Flash memory as -- bytes. Unlike truly unassigned addresses, access to this is unlikely to produce an exception, and you might even be able to make them "reappear" using appropriate Flash controller commands.
On a modern desktop-class machine, virtual memory hides all this from you, and even parts of the physical address map may be configurable. Many embedded-class processors allow configuration to the extent of specifying the location of the interrupt vector table.
UncleO is right but here is some additional information.
The project's linker command file (*.icf for IAR EW) determines where sections are located in memory. (Look under Project->Options->Linker->Config to identify your linker configuration file.) If you view the linker command file with a text editor you may be able to identify where it locates a section named .intvec (or similar) at address 0x00000000. And then it may locate another section (maybe .text) at address 0x00000080.
You can also see these memory sections identified in the .map file, along with their locations. (Ensure "Generate linker map file" is checked under Project->Options->Linker->List.) The map file is an output from the build, however, and it's the linker command file that determines the locations.
So that space in memory is there because the linker command file instructed it to be that way. I'm not sure whether that space is necessary but it's certainly not a problem. You might be able to experiment with the linker command file and move that second section around. But the exception table (a.k.a. interrupt vector table) must be located at 0x00000000. And you'll want to ensure that the reset vector points to the new location of the startup code if you move it.
I am working on code where I need to get the instructions executed by a program, given the instruction pointers. Assume for now that I have a mechanism that provides me with the addresses of the instructions; would it be possible to get the opcode from this (on an IA32 instruction set)?
You need an in-memory disassembler, such as BeaEngine or DiStorm. These can be passed a memory address to read from; just make sure the address is readable. If you know the length in bytes of the function, it's a little better to use the length disassemblers also provided on those sites.
If you are looking for hardware supported help, that's not how it works. This needs to be done in software. Your code needs a table of opcodes and instructions and just has to perform a lookup.
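A toy version of such a lookup table might look like the sketch below. It covers only a handful of one-byte x86 opcodes, and the names `opcode_entry`, `opcode_table` and `lookup_mnemonic` are made up for this example. Real IA-32 decoding also has to handle prefixes, multi-byte opcodes, ModRM/SIB bytes and immediates, which is why an existing disassembler library is the practical choice.

```c
#include <stddef.h>
#include <stdint.h>

/* One table row: a one-byte opcode and the mnemonic it stands for. */
struct opcode_entry {
    uint8_t opcode;
    const char *mnemonic;
};

/* A deliberately tiny subset of the one-byte x86 opcode map. */
static const struct opcode_entry opcode_table[] = {
    { 0x90, "nop" },
    { 0xC3, "ret" },
    { 0xCC, "int3" },
    { 0xE8, "call" },
    { 0xE9, "jmp" },
    { 0xEB, "jmp short" },
};

/* Return the mnemonic for `opcode`, or NULL if it is not in the table. */
const char *lookup_mnemonic(uint8_t opcode)
{
    size_t n = sizeof opcode_table / sizeof opcode_table[0];
    for (size_t i = 0; i < n; i++) {
        if (opcode_table[i].opcode == opcode)
            return opcode_table[i].mnemonic;
    }
    return NULL;
}
```

Given an instruction pointer `ip` that is known to be readable, `lookup_mnemonic(*(const uint8_t *)ip)` would name the instruction only in the simple cases the table covers.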
What you describe is known as disassembly. There are many open source disassemblers and if you could use one of those it would make your task very simple. Look here: http://en.wikibooks.org/wiki/X86_Disassembly/Disassemblers_and_Decompilers
It's said that position-independent code uses only relative positions instead of absolute positions. How is this implemented in C and in assembly, respectively?
Let's take char test[] = "string"; as an example, how to reference it by relative address?
In C, position-independent code is a detail of the compiler's implementation. See your compiler manual to determine whether it is supported and how.
In assembly, position-independent code is a detail of the instruction set architecture. See your CPU manual to find out how to read the PC (program counter) register, how efficient that is, and what the recommended best practices are in translating a code address to a data address.
Position-relative data is less popular now that code and data are separated into different pages on most modern operating systems. It is a good way to implement self-contained executable modules, but the most common such things nowadays are viruses.
On x86, position-independent code in principle looks like this:
call 1f
1: popl %ebx
followed by use of ebx as a base pointer with a displacement equal to the distance between the data to be accessed and the address of the popl instruction.
In reality it's often more complicated, and typically a tiny thunk function might be used to load the PIC register like this:
load_ebx:
movl (%esp),%ebx        # return address = address of the instruction after the call
addl $some_offset,%ebx
ret
where the offset is chosen such that, when the thunk returns, ebx contains a pointer to a designated special point in the program/library (usually the start of the global offset table), and then all subsequent ebx-relative accesses can simply use the distance between the desired data and the designated special point as the offset.
On other archs everything is similar in principle, but there may be easier ways to load the program counter. Many simply let you use the pc or ip register as an ordinary register in relative addressing modes.
In pseudo code it could look like:
lea str1(pc), r0 ; load address of string relative to the pc (assuming constant strings, maybe)
st r0, test ; save the address in test (test could also be PIC, in which case it could be relative
; to some register)
A lot depends on your compiler and CPU architecture, as the previous answer stated. One way to find out would be to compile with the appropriate flags (-fPIC -S for gcc) and look at the assembly language you get.
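For the `char test[] = "string";` example from the question, a minimal file to experiment with could look like this. The accessor name `get_test` and the file name `pic_demo.c` are made up for illustration.

```c
/* pic_demo.c (hypothetical file name) */

/* The global from the question. */
char test[] = "string";

/*
 * Returns the address of `test`. When built as position-independent code,
 * the compiler cannot embed an absolute address here and has to compute it
 * relative to the program counter or through the global offset table.
 */
const char *get_test(void)
{
    return test;
}
```

Compiling it with `gcc -fPIC -S pic_demo.c` and reading the generated `pic_demo.s` shows how the address is formed on your target. On 32-bit x86 (add `-m32`) you would expect something along the lines of the thunk described above, although the exact output depends on the compiler version and options.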