While doing homework from Computer Systems: A Programmer's Perspective (2nd Edition), I came across a question in chapter 7. The question is:
The swap routine in Figure 7.10 contains five relocated references. For each relocated reference, give its line number in Figure 7.10, its run-time memory address, and its value. The original code and relocation entries in the swap.o module are shown in Figure 7.19.
Here are the two figures, 7.10 and 7.19:
[Figure 7.10]
[Figure 7.19]
The solution I found on a website is:
Line # in Fig. 7.10    Address       Value
15 (bufp0)             0x080483CB    0x0804945c
16 (buf[1])            0x080483D0    0x08049458
18 (bufp1)             0x080483D8    0x08049548
18 (buf[1])            0x080483DC    0x08049458
23 (bufp1)             0x080483E7    0x08049548
I don't know how to calculate the address.
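One way to read the Address column is through the rule for relocated references: the run-time address of a reference is the run-time address of the code that contains it plus the offset recorded in its relocation entry. The sketch below is only an illustration. It assumes swap is placed at 0x080483c8 and that the five relocation entries in Figure 7.19 carry the offsets 0x3, 0x8, 0x10, 0x14 and 0x1f. Both are assumptions chosen to be consistent with the table above, since the figures are not reproduced here, so check them against the book.

```python
# Assumed values, not taken from the figures themselves:
SWAP_ADDR = 0x080483C8                          # assumed run-time address of swap
RELOC_OFFSETS = [0x3, 0x8, 0x10, 0x14, 0x1F]    # assumed offsets of the five entries

def reference_addresses(section_addr, offsets):
    """Run-time address of each relocated reference: section address + offset."""
    return [section_addr + off for off in offsets]

addresses = reference_addresses(SWAP_ADDR, RELOC_OFFSETS)
for addr in addresses:
    print(f"0x{addr:08x}")
```

Under those assumptions the five results match the Address column. The Value column would then be the run-time address of the referenced symbol plus whatever constant was already stored at that location in swap.o.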
I came across this question in a book: can two different far pointers contain two different addresses but refer to the same physical location in memory? The answer was 'YES'. But for the same question involving near and huge pointers, the answer was 'NO'.
P.S. Please don't dismiss this question just because far, near and huge pointers are obsolete nowadays.
To be using far pointers, you have to be working with primitive 80x86 chips, or modern chips in a compatibility mode. A far pointer consists of a segment number and an offset, but different segment numbers point to overlapping addresses, so different combinations of segment number and offset can point to the same physical address.
The segment number is multiplied by 16 and the offset added to produce the physical address. Hence:
segment    offset    address
0x100      0x0030    0x1030
0x101      0x0020    0x1030
Etc.
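The arithmetic in the table can be checked with a small sketch. The function names are made up for illustration. The normalize helper reflects the usual convention that huge pointers are kept in a normalized form with an offset below 16, which is an assumption about the compiler in question and not something stated above.

```python
def physical_address(segment, offset):
    """Real-mode translation: segment number times 16, plus the offset."""
    return segment * 16 + offset

def normalize(segment, offset):
    """Assumed huge-pointer normalization: fold everything but the low 4 bits
    of the linear address into the segment, so each address has one spelling."""
    linear = physical_address(segment, offset)
    return linear >> 4, linear & 0xF

a = physical_address(0x100, 0x0030)
b = physical_address(0x101, 0x0020)
print(hex(a), hex(b), a == b)
print(normalize(0x100, 0x0030) == normalize(0x101, 0x0020))
```

Two far pointers with different segment:offset pairs can therefore alias, while normalized pointers that refer to the same location always have identical fields.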
When I run into a fault handler on my ARM Cortex-M4 (Thumb), I get a snapshot of the CPU registers from just before the fault occurred. With this information I can find where the stack pointer was. Now, what I want is to backtrace through all the functions it passed through. The only problem I see is that I don't have a frame pointer, so I cannot really see where a given subroutine has saved its LR, and so on up the call chain.
How would one tackle this problem if the frame pointer is not available in r7?
This blog post discusses this issue with reference to the MIPS architecture - the principles can be readily adapted to ARM architectures.
In short, it describes three possibilities for locating the stack frame for a given SP and PC:
Using compiler-generated debug information (not included in the executable image) to calculate it.
Using compiler-generated stack-unwinding (exception handling) information (included in the executable image) to calculate it.
Scanning the call site to locate the prologue or epilogue code that adjusts the stack pointer, and deducing the stack frame address from that.
Obviously it's very compiler- and compiler-option dependent, and not guaranteed to work in all cases.
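The third possibility can be sketched for the simplest Thumb case. This is a rough illustration, not a complete unwinder: it assumes the prologue consists only of 16-bit PUSH {..., lr} and SUB sp, #imm instructions at the very start of the function, and it stops at the first halfword it does not recognise. Real compilers also emit 32-bit forms (for example when saving high registers), which this sketch ignores. The function name is hypothetical.

```python
def scan_prologue(halfwords):
    """Scan 16-bit Thumb prologue instructions.

    Returns (lr_offset, frame_size): the byte offset from the post-prologue SP
    to the saved LR (None if LR was not pushed) and the total bytes the
    prologue subtracted from SP.
    """
    frame = 0               # bytes subtracted from SP so far
    lr_below_entry = None   # distance from the entry SP down to the saved LR
    for hw in halfwords:
        if (hw & 0xFE00) == 0xB400:            # PUSH {reglist[, lr]}
            if (hw & 0x0100) and lr_below_entry is None:
                # LR is stored at the highest address of this push
                lr_below_entry = frame + 4
            frame += 4 * bin(hw & 0x01FF).count("1")
        elif (hw & 0xFF80) == 0xB080:          # SUB sp, sp, #imm7*4
            frame += 4 * (hw & 0x7F)
        else:
            break                              # unknown instruction: stop scanning
    if lr_below_entry is None:
        return None, frame
    return frame - lr_below_entry, frame

# push {r4, r7, lr} ; sub sp, #8
result = scan_prologue([0xB590, 0xB082])
print(result)
```

With the result in hand, the caller's return address would be the word at SP + lr_offset (with bit 0 set, since it is a Thumb address) and the caller's SP would be SP + frame_size, after which the same scan can be repeated for the caller.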
R7 is not the frame pointer on the M4, it's R11. R7 is the FP for Cortex-M0+/M1 where only the lower registers are generally available. In any case, when Cortex-M makes a call to a function using BL and variants, it saves the return address into LR (link register). At function entry, a function that itself makes calls saves the LR onto the stack. So in theory, to get a call trace, you would "chase" the chain of saved LRs.
Unfortunately, the location where LR is saved on the stack is not defined by the calling convention, so it must be deduced from the debug info for that function in the DWARF records (in the .elf file). I do not know if there is a utility that would extract the LR locations from an ELF file, but it should not be too difficult to write one.
Richard at ImageCraft is right.
More information can be found here
This works fine with C code. I had a harder time applying it to C++ but it's not impossible.
In advance: I do not want any 'ready-to-use' solution. IMHO it would defeat the purpose, which is to learn something. That is my primary goal: what I'd like to have is a few explanations or hints, or a deeper understanding.
Now to the problem:
After using gdb and setting a breakpoint, the following dump of the stack is produced (C program):
0xbfa62f84: 0x08048350 0xbfa62fe8 0xb7df0390 0x00000001
0xbfa62f94: 0xbfa63014 0xbfa6301c 0xb7f262d0 0x00000000
The question that emerges now is: what do the values stand for? Or how can they be disassembled/decomposed?
I assume that they encode a memory address plus some opcode like mov, sub etc.
But how, and why? Or, asked differently: how can these instructions be 'read out'?
Thanks in advance
Dan
If you want to understand such a flow, use a debugger like Keil. There you can see the assembly code, the generated hex file and your source code at the same time. Then, when you step through the code, you will understand how the assembly is related to the hex file and the source code.
Machine code is not stored in the stack; however, the return address stored in the stack frame points to machine code. 0x08048350 is a good candidate for a code address (on 32-bit x86 Linux, the code of a non-relocatable executable is typically mapped at low addresses starting around 0x08048000); you can examine the memory starting at that address and try to puzzle out opcodes and registers.
Or you could use the gdb command x/i to display the instructions starting at that address - x/16i 0x08048350 will display the first 16 instructions starting at that address.
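To make the point that the dumped words are data and addresses, not opcodes, here is a small sketch that parses the dump from the question and guesses what each word is. The address ranges used for the guesses are assumptions based on a typical 32-bit Linux layout; they are heuristics, and the labels are only hints about where to look next.

```python
DUMP = """\
0xbfa62f84:0x08048350 0xbfa62fe8 0xb7df0390 0x00000001
0xbfa62f94:0xbfa63014 0xbfa6301c 0xb7f262d0 0x00000000
"""

def parse_dump(text):
    """Map each stack address to the 32-bit word gdb printed for it."""
    words = {}
    for line in text.strip().splitlines():
        base, _, rest = line.partition(":")
        start = int(base, 16)
        for i, token in enumerate(rest.split()):
            words[start + 4 * i] = int(token, 16)
    return words

def guess_kind(word):
    """Heuristic classification; the ranges are assumed, not guaranteed."""
    if 0x08048000 <= word < 0x08100000:
        return "pointer into the program's code or data"
    if 0xB7000000 <= word < 0xB8000000:
        return "pointer into a shared library mapping"
    if 0xBF000000 <= word < 0xC0000000:
        return "pointer into the stack"
    return "plain value"

words = parse_dump(DUMP)
for addr, word in words.items():
    print(f"0x{addr:08x}: 0x{word:08x}  {guess_kind(word)}")
```

A word flagged as a code pointer is the kind of address worth passing to x/i.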
I am working on obtaining all the data of a program using its ELF and DWARF info and by hooking a Pin tool to a process that is currently running. It is a kind of debugger built on a Pin tool.
To get the local variables from the stack I am working with the registers EIP, EBP and ESP, which I have access to from Pin.
What struck me as weird is that I was expecting EIP to point into the function that was running when the Pin tool was attached to the process, but instead EIP is pointing into the .PLT section. In other words, if the Pin tool was hooked into the process while Foo() was running, I was expecting EIP to point to some address inside the Foo function. However, it is pointing to the beginning of the .PLT section.
What I need to know is which function the process is currently in. Is there any way to get the address of the function using the .PLT section? Are there any other ways to get the address of the function from the stack or using Pin? I hope I was clear enough; let me know if there are any questions.
I might not be understanding exactly what is going on here... Is the instruction pointer really in the .plt section, or are you just getting a garbage value from Pin?
You call the instruction pointer you are reading EIP, which might be a problem if you are running on a 64-bit system. Is that the case?
The instruction pointer register is a 32-bit value on a 32-bit system and a 64-bit value on a 64-bit system. So Pin actually provides three REG_* names for the instruction pointer: EIP, RIP and INST_PTR. EIP is always the lower 32-bit half of the register, RIP the 64-bit value, and INST_PTR one of the two depending on your architecture. Asking for EIP on a 64-bit system gives you garbage, and the same goes for asking for RIP on a 32-bit one.
Otherwise, a quick look on Google gives me this. Quoting a bit:
By default the .plt entries are all initialized by the linker not to point to the correct target functions, but instead to point to the dynamic loader itself. Thus, the first time you call any given function, the dynamic loader looks up the function and fixes the target of the .plt so that the next time this .plt slot is used we call the correct function.
And more importantly:
It is possible to instruct the dynamic loader to bind addresses to all of the .plt slots before transferring control to the application—this is done by setting the environment variable LD_BIND_NOW=1 before running the program. This turns out to be useful in some cases when you are debugging a program, for example.
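The behaviour described in the two quotes can be modelled with a toy simulation. This is not how the dynamic loader is implemented; it is a hypothetical model in which each slot starts out pointing at a resolver stub that patches the slot on first use, and in which a bind_now flag plays the role of LD_BIND_NOW=1.

```python
class ToyProcess:
    """Toy model of lazily bound .plt slots (illustration only)."""

    def __init__(self, library, bind_now=False):
        self.library = library   # name -> function, stands in for a shared library
        self.lookups = 0         # how many symbol lookups the "loader" performed
        self.slots = {}          # name -> current target of the slot
        for name in library:
            if bind_now:
                self.slots[name] = self._lookup(name)      # resolve before running
            else:
                self.slots[name] = self._make_stub(name)   # resolve on first call

    def _lookup(self, name):
        self.lookups += 1
        return self.library[name]

    def _make_stub(self, name):
        def stub(*args):
            target = self._lookup(name)
            self.slots[name] = target    # fix the slot so later calls go direct
            return target(*args)
        return stub

    def call(self, name, *args):
        return self.slots[name](*args)

def foo(x):
    return x + 1

lazy = ToyProcess({"foo": foo})
first = lazy.call("foo", 1)     # goes through the stub, which patches the slot
second = lazy.call("foo", 2)    # goes straight to foo
eager = ToyProcess({"foo": foo}, bind_now=True)
print(first, second, lazy.lookups, eager.slots["foo"] is foo)
```

In the lazy case, a snapshot taken during the first call sees control inside the stub and not inside foo, which mirrors the situation in the question; with eager binding the slot already holds the final target.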
Hope that helps.
I'm trying to implement some parts of what dyld does and I'm a little bit stuck at stub trampolines.
Consider the following ARM instruction:
BL 0x2fec
It branches with link (a subroutine call) to 0x2fec. I know that there is a section __symbolstub1 in the __TEXT segment starting at 0x2fd8, so this is a jump to 20 bytes inside __symbolstub1.
Now, there is a symbol
(undefined) external _objc_autoreleasePoolPush (from libobjc)
that I've resolved through LC_SYMTAB load command. There is no known address provided. I know, as a fact, that 0x2fec address is a trampoline to _objc_autoreleasePoolPush, but I cannot prove it via any means.
I've checked the LC_DYLD_INFO_ONLY command, and I had a slight hint in there, in the lazy_bind symbols I've found:
{:offset=>20, :segment=>2, :library=>6, :flags=>[], :name=>"_objc_autoreleasePoolPush"}
where the name and offset match exactly what I have, and library #6 is "/usr/lib/libobjc.A.dylib", which is also perfect. Now the issue is that segment #2 is __TEXT, but __TEXT starts at 0x1000, and __symbolstub1 is way down at 0x2fd8. So I'm missing some reference down to the section.
Any ideas on how I am supposed to map the virtual address 0x2fec to _objc_autoreleasePoolPush?
Heh, just a little more digging and I've found it at LC_DYSYMTAB's indirect symbols.
Now the long answer.
Find a section for given address;
The section should be of type S_NON_LAZY_SYMBOL_POINTERS, S_LAZY_SYMBOL_POINTERS, S_LAZY_DYLIB_SYMBOL_POINTERS, S_THREAD_LOCAL_VARIABLE_POINTERS or S_SYMBOL_STUBS;
If the section type is S_SYMBOL_STUBS, then the size of each stub in bytes is stored in reserved2, otherwise the entry size is taken to be 4 (the pointer size on a 32-bit target);
The offset into indirect symbols table is stored in reserved1;
The index into indirect symbols table is calculated as
index = sect.reserved1 + (vmaddr - sect.addr) / bytesize;
The symbol in the symbols table is found at symbols[indirect_symbols[index]].
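The steps above can be sketched as follows. The Section class and the function name are hypothetical stand-ins for whatever your parser produces, section types are modelled as strings instead of the numeric constants, and the sample tables (reserved1, the stub size, the indirect symbol indices) are made-up values for illustration, not data from the binary in the question.

```python
from dataclasses import dataclass

@dataclass
class Section:
    kind: str        # section type name, e.g. "S_SYMBOL_STUBS"
    addr: int        # virtual address of the section
    size: int        # size of the section in bytes
    reserved1: int   # starting index into the indirect symbol table
    reserved2: int   # stub size in bytes (for S_SYMBOL_STUBS)

INDIRECT_KINDS = {
    "S_NON_LAZY_SYMBOL_POINTERS", "S_LAZY_SYMBOL_POINTERS",
    "S_LAZY_DYLIB_SYMBOL_POINTERS", "S_THREAD_LOCAL_VARIABLE_POINTERS",
    "S_SYMBOL_STUBS",
}

def symbol_for_address(vmaddr, sections, indirect_symbols, symbols, pointer_size=4):
    """Return the symbol behind a stub or pointer address, or None."""
    for sect in sections:
        if not (sect.addr <= vmaddr < sect.addr + sect.size):
            continue
        if sect.kind not in INDIRECT_KINDS:
            return None
        entry_size = sect.reserved2 if sect.kind == "S_SYMBOL_STUBS" else pointer_size
        index = sect.reserved1 + (vmaddr - sect.addr) // entry_size
        return symbols[indirect_symbols[index]]
    return None

# Hypothetical tables: 4-byte stubs, section entries start at indirect index 2.
stubs = Section("S_SYMBOL_STUBS", addr=0x2FD8, size=40, reserved1=2, reserved2=4)
symbols = ["_a", "_b", "_c", "_objc_autoreleasePoolPush"]
indirect_symbols = [0, 1, 0, 1, 2, 0, 1, 3, 2, 0, 1, 2]
found = symbol_for_address(0x2FEC, [stubs], indirect_symbols, symbols)
print(found)
```

A real indirect symbol table can also contain special marker entries for local or absolute symbols, which this sketch does not handle.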