I'm studying about windows and DLL stuffs and I have some question about it. :)
I made a simple program that loads my own DLL. This DLL has just simple functions, plus, minus.
This is the question : if I load some DLL (for example, text.dll), is this DLL always have the same Base Address? or it changes when I restart it? and can I hold the DLL's Base Address?
When I test it, it always have same Base Address, but I think when I need to do about this, I have to make some exception about the DLL Base Address.
The operating system will load your DLL in whatever base address it pleases. You can specify a "preferred" base address, but if that does not happen to be available, (for whatever reason, which may well be completely out of your control,) your DLL will be relocated by the operating system to whatever address the operating system sees fit.
i load some DLL(for example, text.dll), is this DLL always have the same Base Address?
No. It is a preferred base address. If something is already loaded at that address, the loader will rebase it and fixup all of the addresses.
Other things, like Address Space Layout Randomization could cause it to be different every time the process starts.
That's a common problem with DLLs that we encountered when trying to decode stacktraces issued by GNAT runtime (Ada).
When presented with a list of addresses (traceback) when our executables crash, we are able to perform addr2line on the given addresses and rebuild the call tree without issues.
On DLLs, this isn't the case (that's why I highly doubt that this issue is ASLR-related, else the executables would have the same random shift), vcsjones answer explains the "why".
Now to workaround this issue, you can write the address of a given symbol (example: the main program) to disk. When analysing a crash, just perform a difference between the address of the symbol in the mapfile and the address written to disk. Apply this difference to your addresses, and you'll be able to compute the theorical addresses, thus the call stack.
Related
Virtual memory along with logical memory helps to make sure programs do not corrupt each others data.
Program relocation does an almost similar thing of making sure that multiple programs does not corrupt each other.Relocation modifies object program so that it can be loaded at a new, alternate address.
How are virtual memory, logical memory and program relocation related ? Are they similar ?
If they are same/similar, then why do we need program relocation ?
Relocatable programs, or said another way position-independent code, is traditionally used in two circumstances:
systems without virtual memory (or too basic virtual memory, e.g. classic MacOS), for any code
for dynamic libraries, even on systems with virtual memory, given that a dynamic library could find itself lodaded on an address that is not its preferred one if other code is already at that space in the address space of the host program.
However, today even main executable programs on systems with virtual memory tend to be position-independent (e.g. the PIE* build flag on Mac OS X) so that they can be loaded at a randomized address to protect against exploits, e.g. those using ROP**.
* Position Independent Executable
** Return-Oriented Programming
Virtual memory does not prevent programs from interfering with out other. It is logical memory that does so. Unfortunately, it is common for the two concepts to be conflated to "virtual memory."
There are two types of relocation and it is not clear which you are referring to. However, they are connected. On the other hand, the concept is not really related to virtual memory.
The first concept of relocatable code. This is critical for shared libraries that usually have to be mapped to different addresses.
Relocatable code uses offsets rather than absolute addresses. When a program results in an instruction sequence something like:
JMP SOMELABEL
. . .
SOMELABEL:
The computer or assembler encodes this as
JUMP the-number-of-bytes-to-SOMELABEL
rather than
JUMP to-the-address-of-somelabel.
By using offsets the code works the same way no matter where the JMP instruction is located.
The second type of relocation uses the first. In the past relocation was mostly used for libraries. Now, some OS's will load program segments at different places in memory. That is intended for security. It is designed to keep malicious cracks that depend upon the application being loaded at a specific address.
Both of these concepts work with or without virtual memory.
Note that generally the program is not modified to relocated it. I generally, because an executable file will usually have some addresses that need to be fixed up at run time.
When a function is called, execution is shifted to a point indicated by the function pointer. At the start of execution, the executable code has to be loaded from disk.
How is the correct function pointer called? The executable code is not mapped into virtual memory at the same location every time, right? So how does the runtime make sure that a call to a function always calls the correct function even if the location of the executable code is different for each execution?
Consider the following code:
void func(void); //Func defined in another dynamic library
int main()
{
func();
//How is the pointer to func known if the file containing func is loaded from disk at run time?
};
The way that function pointers are resolved is really quite simple. When the compiler chain spits out an executable binary, all internal addresses are relative to a "base address." In some executable formats, this base address is specified, in others it is implied.
Basically, the compiler says that it assumes execution will start at address A. The runtime decides that it should actually start at B. The runtime then subtracts A and adds B to all non-relative addresses in the binary before executing it.
This process also applies to things like DLLs. Dynamic libraries store a list of addresses relative to the base pointer that point to each exported function. Names are often also associated with the list, so that you can reference a function by name. When the library is loaded, the address translation is applied to everything, including the address table. At that point, a caller just has to look up the address in the table that was translated, and then they'll have the absolute address of a given function.
In older operating systems, long long ago (and, in some cases, even today), well before things like address space layout randomization, memory pages, and multitasking operating systems, programs would just be copied to the specified base address in memory where it would then be executed.
In modern operating systems, one of a few things can happen, depending on the capabilities or requirements of the platform and application. Most operating systems handle native binaries as I described in the second paragraph, however some applications (such as running 16-bit x86 on later architectures) can involve more complex strategies. One such strategy involves giving the code a static virtual address space. This has various limitations, such as the need for an emulation/compatibility layer if you want it to interact with external code (like a windowed console or the network stack).
As the need for 16-bit support declines though, that sort of scheme is used less and less. Giving all programs their own unique address space (rather than letting it overlap) promotes the use of shared libraries, services, and other shared goodies.
In general, function calls are resolved statically. When you compile the file, first - .o (or .obj) file is created. All known addresses - are local functions (from this file). Unknown are "extern" functions.
Then, linking is performed. Linking completes address mapping for every function which is "extern". If any names are missing - linking error occurs.
How is the correct function pointer called?
Function pointer is function address, function name is function address. Both are values, not L-values. &func and func are absolutely same.
Loading or PE (or ELF) files is a process or loading the executable to memory. Too much information to explain. Basically, just for clarification, consider: every function has its own address in the process address space.
You can print the 'func' and see whether is has the same address during every execution like this:
printf("%u", function);
For me it's the same address every time (virtual memory wise).
Generally the user program binaries will be loaded in low address (usually around 0x400000) in the programs address space which will be specified in the elf binary (in the case of linux).
Can we force a user binary to load at a high address, possibly within the 2GB range of addresses where libc or other such libraries are loaded?
I have tried finding a solution on the net but could not find any concrete solution for this.
(I am working on Ubuntu 12.10 64bit OS)
Thanks
Unless the binary is position-independent (PIE), this is not possible. Normal (non-PIE) binaries are hard-coded for a particular load address at link time, and during linking, the information necessary for relocating to a different address was already lost.
Edit: The above is assuming you're working with an existing binary. If you are producing the binary yourself, you can control the load address that's hard-coded into it with the following link options:
-Wl,-Ttext-segment,0x80000000
replacing 0x80000000 by your desired address. Certain addresses (such as those reserved for kernel use, typically beginning at 0xc0000000) will not work, and the address must be page-aligned (the last 3 hex digits must be 0).
For example in my program I called a function foo(). The compiler and assembler would eventually write jmp someaddr in the binary. I know the concept of virtual memory. The program would think that it has the whole memory at disposal, and the start position is 0x000. In this way the assembler can calculate the position of foo().
But in fact this is not decided until runtime right? I have to run the program to know where I loaded the program into, hence the address of the jmp. But when the program actually runs, how does the OS come in and change the address of the jmp? These are direct CPU instructions right?
This question can't be answered in general because it's totally hardware and OS dependent. However a typical answer is that the initially loaded program can be compiled as you say: Because the VM hardware gives each program its own address space, all addresses can be fixed when the program is linked. No recalculation of addresses at load time is needed.
Things get much more interesting with dynamically loaded libraries because two used by the same initially loaded program might be compiled with the same base address, so their address spaces overlap.
One approach to this problem is to require Position Independent Code in DLLs. In such code all addresses are relative to the code itself. Jumps are usually relative to the PC (though a code segment register can also be used). Data are also relative to some data segment or base register. To choose the runtime location, the PIC code itself needs no change. Only the segment or base register(s) need(s) be set whenever in the prelude of every DLL routine.
PIC tends to be a bit slower than position dependent code because there's additional address arithmetic and the PC and/or base registers can bottleneck the processor's instruction pipeline.
So the other approach is for the loader to rebase the DLL code when necessary to eliminate address space overlaps. For this the DLL must include a table of all the absolute addresses in the code. The loader computes an offset between the assumed code and data base addresses and actual, then traverses the table, adding the offset to each absolute address as the program is copied into VM.
DLLs also have a table of entry points so that the calling program knows where the library procedures start. These must be adjusted as well.
Rebasing is not great for performance either. It slows down loading. Moreover, it defeats sharing of DLL code. You need at least one copy per rebase offset.
For these reasons, DLLs that are part of Windows are deliberately compiled with non-overlapping VM address spaces. This speeds loading and allows sharing. If you ever notice that a 3rd party DLL crunches the disk and loads slowly, while MS DLLs like the C runtime library load quickly, you are seeing the effects of rebasing in Windows.
You can infer more about this topic by reading about object file formats. Here is one example.
Position-independent code is code that you can run from any address. If you have a jmp instruction in position-independent code, it will often be a relative jump, which jumps to an offset from the current location. When you copy the code, it won't change the offsets between parts of the code so it will still work.
Relocatable code is code that you can run from any address, but you might have to modify the code first (maybe you can't just copy it). The code will contain a relocation table which tells how it needs to be modified.
Non-relocatable code is code that must be loaded at a certain address or it will not work.
Each program is different, it depends on how the program was written, or the compiler settings, or other various factors.
Shared libraries are usually compiled as position-independent code, which allows the same library to be loaded at different locations in different processes, without having to load multiple copies into memory. The same copy can be shared between processes, even though it is at a different address in each process.
Executables are often non-relocatable, but they can be position-independent. Virtual memory allows each program to have the entire address space (minus some overhead) to itself, so each executable can choose the address at which it's loaded without worrying about collisions with other executables. Some executables are position-independent, which can be used to increase security (ASLR).
Object files and static libraries are usually relocatable code. The linker will relocate them when combining them to create a shared library, executable, or other image.
Boot loaders and operating system kernels are almost always non-relocatable.
Yes, it is at runtime. The operating system, the part managing starting and switching tasks is ideally at a different protection level, it has more power. It knows what memory is in use and allocates some for the new task. It configures the mmu so that the new task has a virtual address space starting at zero or whatever the rule is for that operating system and processor. How you get into user mode at that starting address, is very processor specific.
One method for example is the hardware might save some state not just address but mode or virtual id or something when an interrupt occurs, lets say on the stack. And the return from interrupt instruction as defined by that processor takes the address, and state/mode, off of the stack and switches there (causing lets assume the mmu to react to its next fetch based on the new mode not the old). For a processor that works like that then you may be able to fake an interrupt return by placing the right items on the stack such that when you kick the interrupt return instruction it basically does a jump with additional features of mode switching, etc.
The ARM family for example (not cortex-m) has a processor state register for what you are running now (in the case of an interrupt or service call) and a second state register for where you came from, the state that was interrupted, when you do the proper return you give it the address and it switches back to that mode using the other register. You can directly access that register from the non-users modes so you can manipulate the state of the return. There is no return instruction in arm, just flavors of jump (modifications to the program counter), so it is a special kind of jump.
The short answer is that it is very specific to the processor as to what your choices are for jumping to the first time or returning to after a task switch to a running task in an application mode in a virtual address space. Either directly or indirectly the processor documentation will describe these modes and how you change them. If not explicitly described then you have to figure out on your own from the instructions and the mmu protections and such how to switch tasks.
In a 32 bit system, each process virtually has 2^32 bytes of CONTIGUOUS address space. So why the final executable code generated by a linker needs to be relocatable. What is the requirement since all addresses generated would be virtual addresses in the process's own address space and other process CANNOT use the same.
Hence the process can be placed in anywhere it wants to be. Why relocatable?
Some operating systems make the executable code relocatable (this is definitely not universal to all operating systems) to allow for address space layout randomization. This helps mitigate certain attacks.
In the past when stacks were executable a buffer overflow could be exploited by writing executable code directly on the overflowed stack or heap. As operating systems became smarter and started preventing execution of the stack and the heap, attacks became more sophisticated and started using known code sequences in memory by doing return oriented programming. The mitigation to that class of attacks was first done by randomizing the memory layout for shared libraries (since those were easier to exploit) and then when attackers switched to attacking the main executable, by randomizing the memory position of the executable. To make it possible the main executable needs to be relocatable.
Executable code does not always contain relative addresses. On Windows, for example, addressing is often absolute (e.g. for global data).
Consider two different dynamic libraries. Both were compiled for a fixed base address of 0x00100000. Your program tries to load both of them. Where is the loader to place the 2nd DLL? Its preferred base address is already used by the other DLL.
In this case relocatable code helps placing the 2nd DLL at a different address and patching its internal pointers to the new location. With fixed base addresses, loading the 2nd DLL would just fail.
It needs to be relocatable because in order to execute your process needs to be put into the actual main memory in a ready queue. Now where in the main memory it shall be placed is not fixed (it is placed wherever sufficient space is available) so the actual addresses of the instructions varies from its virtual address .
Hence statements making calls to functions ,returns etc need to be updated accordingly pointing to the actual address of those functions