Honestly, I am really confused with this particular virtual memory related concept.
Q1) When a page fault occurs, does the processor first finish executing the current instruction and then move the IP register contents (the address of the next instruction) to the stack? Or does it abort the instruction being executed and move the contents of the instruction pointer register to the stack?
Q2) If the second case is true, how does it resume the aborted instruction? When it resumes, the stack contains the instruction pointer value, which is nothing but the address of the next instruction, so it would never re-execute the instruction where the page fault occurred.
What I think
I think the second case sounds wrong. The confusion arose while I was reading Operating System Principles by Silberschatz and Galvin, where they write:
when a page fault occurs, we will have to bring in the desired page, correct page table and restart the instruction.
But the instruction pointer always points to the next instruction. So, according to what the book is trying to convey, are we decrementing the value of IP just to restart the instruction where the page fault occurred?
In the Intel System Programming guide, chapter 6.5, it says
Faults — A fault is an exception that can generally be corrected and that, once corrected, allows the program
to be restarted with no loss of continuity. When a fault is reported, the processor restores the machine state to
the state prior to the beginning of execution of the faulting instruction. The return address (saved contents of
the CS and EIP registers) for the fault handler points to the faulting instruction, rather than to the instruction
following the faulting instruction.
A page fault is classified as a fault (no surprises there), so when a page fault happens you're in the state "before it ever happened". Well, not really, because you're in the fault handler (so EIP and ESP are definitely different, and CR2 contains the faulting address), but when you return it'll be the state before the fault ever happened, only with the changes made by the handler (so, put the page there, or kill the process).
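The restart behavior can be observed from user space. The following is a minimal sketch, assuming Linux (or a similar POSIX system that delivers SIGSEGV for an access to a PROT_NONE page). The names write_through_fault, on_guard_fault and fault_count are made up for this illustration. It calls mprotect() from a signal handler, which works on Linux in practice but is not on the POSIX list of async-signal-safe functions.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static volatile sig_atomic_t fault_count = 0; /* faults fixed by the handler */
static void *guard_page = NULL;               /* page that starts inaccessible */
static size_t guard_size = 0;

/* SIGSEGV handler: "correct" the fault by making the page accessible.
 * Returning from the handler resumes at the saved instruction pointer,
 * which still points at the faulting store, so the store is retried. */
static void on_guard_fault(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)guard_page;
    if (addr < base || addr >= base + guard_size)
        _exit(1); /* unrelated fault: give up instead of looping */
    fault_count++;
    if (mprotect(guard_page, guard_size, PROT_READ | PROT_WRITE) != 0)
        _exit(1);
}

/* Stores `value` into a page mapped with PROT_NONE and reads it back.
 * Returns the value read, or -1 if the setup failed. */
int write_through_fault(int value)
{
    struct sigaction sa, old;

    guard_size = (size_t)sysconf(_SC_PAGESIZE);
    guard_page = mmap(NULL, guard_size, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (guard_page == MAP_FAILED)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_guard_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    volatile int *slot = guard_page;
    *slot = value;      /* faults once; retried after the handler returns */
    int seen = *slot;

    sigaction(SIGSEGV, &old, NULL);
    munmap(guard_page, guard_size);
    return seen;
}
```

Calling write_through_fault(42) should return 42 with fault_count equal to 1: the store is not skipped and not executed twice; it is restarted once the handler has fixed the cause.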
I was debugging a seg fault in a Linux app that was caused by a program trying to change a static constant array structure (so the data was in the read-only section of the ELF and subsequently loaded in a page that was then given read-only permission).
While in GDB I put a breakpoint on the line of assembler that did the bad store, and when it stopped there I manually performed the equivalent write action using GDB. GDB did this without any complaints, and reading the value back proved it had indeed been written. I looked in /proc/thepid/maps and that particular page was still marked as "not writeable".
So my question is: does GDB temporarily set write permissions on a read-only page, perform the write, then reset the permissions? Thanks.
does GDB temporarily set write permissions
No.
On Linux/x86, ptrace() (which is what GDB uses to read and write the memory of the inferior, i.e. the process being debugged) allows reads and writes to pages that are not readable/writable by the inferior, leading to exactly the confusion you've described.
This could be considered a bug in the kernel.
It should be noted that the kernel has to allow ptrace to write to the normally non-writable .text section for the debugger to be able to plant breakpoints (which is done by overwriting the original instruction with the breakpoint/trap instruction -- int3 -- via a PTRACE_POKETEXT request).
The kernel doesn't have to do the same for PTRACE_POKEDATA, but man ptrace says:
PTRACE_POKETEXT, PTRACE_POKEDATA
Copies the word data to location addr in the child's memory.
As above, the two requests are currently equivalent.
I believe it's that equivalence that causes the current behavior.
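The behavior can be reproduced without GDB. This is a hedged sketch, assuming Linux with ptrace permitted for a child that calls PTRACE_TRACEME (some sandboxes block it). The function name poke_read_only_page and the marker value 0x1111 are invented for the example. A parent maps a page, makes it read-only, forks, and then pokes a word into the child's copy of that page.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if PTRACE_POKEDATA changed a word in a read-only page of the
 * tracee, 0 if the kernel refused the write, -1 if the setup failed. */
int poke_read_only_page(long new_word)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    long *ro = mmap(NULL, page, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (ro == MAP_FAILED)
        return -1;
    ro[0] = 0x1111;
    if (mprotect(ro, page, PROT_READ) != 0) /* read-only from now on */
        return -1;

    pid_t child = fork(); /* child inherits the mapping at the same address */
    if (child < 0)
        return -1;
    if (child == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) != 0)
            _exit(2);     /* tracing not permitted here */
        raise(SIGSTOP);   /* stop so the parent can poke at us */
        _exit(0);
    }

    int status = 0;
    if (waitpid(child, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;        /* child exited instead of stopping */

    int result = 0;
    if (ptrace(PTRACE_POKEDATA, child, ro, (void *)new_word) == 0) {
        errno = 0;        /* PEEKDATA returns the word itself, so check errno */
        long seen = ptrace(PTRACE_PEEKDATA, child, ro, NULL);
        if (errno == 0 && seen == new_word)
            result = 1;
    }

    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
    munmap(ro, page);
    return result;
}
```

On a typical Linux/x86 system this is expected to return 1 even though the child's page is mapped read-only, which is the same effect as the manual write performed from GDB in the question.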
From Understanding the Linux Kernel, 3rd edition, chapter 10.4.4, which discusses page fault exceptions raised while the kernel is accessing user-space memory:
the Page Fault handler do_page_fault( ) executes the following
statements:
    if ((fixup = search_exception_tables(regs->eip))) {
        regs->eip = fixup->fixup;
        return 1;
    }
The regs->eip field contains the value of the eip register saved on
the Kernel Mode stack when the exception occurred. If the value in the
register (the instruction pointer) is in an exception table,
do_page_fault( ) replaces the saved value with the address found in
the entry returned by search_exception_tables( ). Then the Page Fault
handler terminates and the interrupted program resumes with execution
of the fixup code.
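The lookup in the quoted passage can be modeled in ordinary user-space C. This is only an illustrative sketch: struct fixup_entry, find_fixup and apply_fixup are hypothetical names, the table is assumed to be sorted by instruction address, and the real kernel table layout differs between versions and architectures.

```c
#include <stddef.h>

/* One table entry: if the instruction at `insn` faults,
 * execution should continue at `fixup` instead. */
struct fixup_entry {
    unsigned long insn;
    unsigned long fixup;
};

/* Binary search over a table sorted by `insn`.
 * Returns the matching entry, or NULL if `ip` is not listed. */
const struct fixup_entry *find_fixup(const struct fixup_entry *table,
                                     size_t count, unsigned long ip)
{
    size_t lo = 0, hi = count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (table[mid].insn == ip)
            return &table[mid];
        if (table[mid].insn < ip)
            lo = mid + 1;
        else
            hi = mid;
    }
    return NULL;
}

/* Model of the quoted snippet: if the saved instruction pointer is in the
 * table, overwrite it with the fixup address and return 1; otherwise leave
 * it untouched and return 0. */
int apply_fixup(const struct fixup_entry *table, size_t count,
                unsigned long *saved_ip)
{
    const struct fixup_entry *entry = find_fixup(table, count, *saved_ip);
    if (entry == NULL)
        return 0;
    *saved_ip = entry->fixup;
    return 1;
}
```

The scheme only works if the saved instruction pointer identifies the exact instruction that faulted, because the table is keyed by that address.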
I understand the quoted mechanism, except for one crucial point: memory operations can be cached. That would mean that, at the moment of the page fault exception, the instruction pointer could contain the address of another instruction unrelated to the exception, because the instruction that caused it was executed earlier by the CPU and its memory access stayed in the cache until this moment (see the last paragraphs of the linked reference).
How can the Linux kernel execute the above code with no side effects? How can it be sure that, at the time of the page fault exception, the instruction pointer register contained the address of the memory-access instruction (which can be cached and performed later) that touched the illegal address?
EDIT #1:
I believe the same guide has a hint in chapter 2.4.7:
the cache unit is inserted between the paging unit and the main memory
Maybe this implies that address translation and checking are always done before (or during) caching (at least on the x86 architecture, on which the guide is based). That would resolve my issue, because the address check is done in the MMU circuitry at the moment the instruction executes. Unfortunately, I could not find a definitive 'yes' in the guide or in my searches online.
EDIT #2:
I found another source on Super User that strengthens my speculation in EDIT #1:
Permissions still need to be checked before the access can be
committed
So it seems (without confirmation from a formal source) that on a memory access the address goes through the MMU circuitry for translation before or during caching, which means that the address check is done at instruction execution time, and that this is the moment when a page fault can be raised. So the cache latency of a main memory access is irrelevant to the timing of the page fault, and when a page fault occurs, the instruction pointer register does contain the address of the faulting instruction that tried to access the invalid address. I'll keep searching for a source that confirms this, or alternatively one that contradicts it and offers a proper solution.
How can Instruction Pointer register recover from a bad read or bad jump?
The kernel calls init code that in turn calls the program's main(). If main() overflows the stack or otherwise fills RIP/EIP/IP with junk, how can the OS recover the CPU register?
The CPU has only one instruction pointer, right? So recovering from an overflow seems impossible from my point of view.
Yes, if the IP gets trashed and that causes a fault, only the bad value is known. It's unclear what you mean by "recovering from overflow". Of course, the OS fault handler has a well-defined address and the CPU goes there, so the IP will be well defined from then on. The OS may decide to terminate the process, or, if the program has installed a signal/exception handler, the OS will make sure that it is called. This handler can then load the IP with an appropriate value.
When you trash the IP in user mode, eventually a hardware fault occurs, be it a page fault, an illegal opcode, or something like that. Then the processor switches to supervisor/kernel mode and starts running a fault handler by setting the instruction pointer to a well-defined value.
The kernel code will then inspect the address at which the exception happened and/or the type of the exception. Upon finding that it was caused by one of these, the kernel will usually terminate the malfunctioning user-mode process.
If the IP gets loaded with an address from which it cannot execute, it triggers an EXCEPTION. A CPU usually recognizes a number of different types of exceptions, each identified by a different number.
When the exception occurs, it causes the CPU to switch to kernel mode. That in turn causes the CPU to load the IP with the address of a handler defined to handle the specific type of exception and to load a kernel mode stack.
Exceptions are commonly classified as faults, traps, and aborts (this is Intel's terminology). After a fault, the saved IP points at the faulting instruction, so that instruction can be restarted. After a trap, the saved IP points at the instruction following the trapping one. An abort is a severe error that generally does not allow the program to continue. What happens at this point depends upon the type of exception.
If it's a page fault, the handler will try to load the page into memory.
For most other exceptions, the handler will try to find a user-mode handler for the specific type of exception. See the signal function in Unix.
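A user-mode handler of this kind can be sketched as follows, assuming Linux or another POSIX system. The names survive_bad_jump, on_bad_jump and recovery_point are made up for the illustration. The program deliberately jumps to a page that cannot be executed, the kernel delivers SIGSEGV, and the handler uses siglongjmp() to load a known-good instruction pointer and stack pointer.

```c
#define _GNU_SOURCE
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf recovery_point; /* saved registers of a known-good state */

/* SIGSEGV handler: abandon the bad control flow and resume at the point
 * recorded by sigsetjmp(), which restores IP, SP and the signal mask. */
static void on_bad_jump(int sig)
{
    (void)sig;
    siglongjmp(recovery_point, 1);
}

/* Jumps to an address that cannot be executed and recovers from it.
 * Returns 1 if control came back through the handler, 0 if it did not,
 * -1 if the setup failed. */
int survive_bad_jump(void)
{
    struct sigaction sa, old;
    size_t page = (size_t)sysconf(_SC_PAGESIZE);

    /* A PROT_NONE page stands in for a "junk" instruction pointer. */
    void *junk = mmap(NULL, page, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (junk == MAP_FAILED)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_bad_jump;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    volatile int recovered = 0;
    if (sigsetjmp(recovery_point, 1) == 0) {
        void (*bad_target)(void) = (void (*)(void))(uintptr_t)junk;
        bad_target(); /* the instruction fetch at this address faults */
    } else {
        recovered = 1; /* reached only via siglongjmp from the handler */
    }

    sigaction(SIGSEGV, &old, NULL);
    munmap(junk, page);
    return recovered;
}
```

This only shows that the CPU and kernel stay in control after a bad jump. A real program whose stack was corrupted before the jump usually cannot trust its own state and is better off terminating.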
While developing an ARM-based project, we get data aborts at random: when we exercise the system, a data abort interrupt is raised. But the data abort does not always occur at the same point when we check the register map using r14 or r13, even after checking the function callbacks. Is there any way to get precise information about the root cause of the data abort? I have tried the second reference below (ref2), but I could not get the same point when I trap the data abort interrupt.
Related
ARM Data Abort error exception debugging
ARM: HOW TO ANALYZE A DATA ABORT EXCEPTION
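Checking the link register (r14) as described in your Keil link above will show you the instruction that triggered the data abort. On classic ARM cores, the abort-mode r14 holds the address of the aborting instruction plus 8, so the instruction itself is at r14 - 8. From there you'll have to figure out why it triggered a data abort and how that could have happened, which is the difficult part.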
In my experience what most likely happened is that you accessed an invalid pointer. It can be invalid for many reasons. Here are a few candidates:
You used the pointer before it was initialized
You used the pointer after it, or the containing memory, had been freed (and was subsequently modified when another function allocated it)
The pointer was corrupted by a stack overflow
The pointer was corrupted by other, unrelated, misbehaving code that is trampling on memory
The pointer was allocated on the stack as a local variable and then used after the allocating function had exited
The pointer has incorrect alignment for its type (for example, trying to access 0x4001 as a uint32_t)
As you can see, lots of things can be the root cause of an ARM data abort. Finding the root cause is part of what makes ARM software/firmware development so much fun! Good luck figuring out your puzzle.
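For the alignment case in particular, a small check can help narrow things down. The sketch below is an illustration only: is_aligned_for_u32 and load_u32_any_alignment are hypothetical helper names, and it assumes a C11 compiler for _Alignof. It does nothing for the other causes in the list, such as dangling or corrupted pointers.

```c
#include <stdint.h>
#include <string.h>

/* Returns non-zero if `p` is suitably aligned for a uint32_t access. */
int is_aligned_for_u32(const void *p)
{
    return ((uintptr_t)p % _Alignof(uint32_t)) == 0;
}

/* Reads a uint32_t from any address without a misaligned word access.
 * memcpy copies byte by byte as far as the language is concerned, so the
 * compiler has to emit code that is valid for an unaligned source. */
uint32_t load_u32_any_alignment(const void *p)
{
    uint32_t value;
    memcpy(&value, p, sizeof value);
    return value;
}
```

A debug build could call is_aligned_for_u32() on suspicious pointers before dereferencing them, and log the address when the check fails, so that the bad pointer is caught before the data abort rather than after.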
I am trying to develop a runtime stack tracer. I have a function that returns the EIP address whenever the program being traced segfaults. How can I get back to the ebp of the current function (the one during which the program under observation crashed) so that I can start tracing up?
There is no way to convert an instruction pointer to a stack frame pointer. The same function may be invoked many times (even recursively) with different stack addresses; that's the whole point of having a call stack. If you have a crash dump file (core file, etc.), it should contain a dump of all the registers. If you want the register values, you must read them from there.
The current ebp and esp (and all other registers) at the time of the segfault are available in the ucontext, which is passed as the third argument to the signal handler. The details of what's where in the ucontext are OS and CPU specific.
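As a rough sketch of reading those registers, assuming Linux with glibc on x86-64, i386 or AArch64 (the field names below are specific to those targets): struct fault_regs, captured, record_registers and capture_fault_registers are names invented for this example. Note that the frame pointer register only holds a frame pointer if the code was built without -fomit-frame-pointer.

```c
#define _GNU_SOURCE
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <ucontext.h>
#include <unistd.h>

struct fault_regs {
    uintptr_t fault_addr; /* address whose access faulted (si_addr) */
    uintptr_t ip;         /* instruction pointer at the time of the fault */
    uintptr_t frame_ptr;  /* frame pointer register at the time of the fault */
};

static struct fault_regs captured;
static sigjmp_buf after_fault;

/* SIGSEGV handler: copy the interesting registers out of the ucontext,
 * then jump past the faulting access instead of retrying it. */
static void record_registers(int sig, siginfo_t *info, void *ctx)
{
    ucontext_t *uc = ctx;
    (void)sig;
    captured.fault_addr = (uintptr_t)info->si_addr;
#if defined(__x86_64__)
    captured.ip = (uintptr_t)uc->uc_mcontext.gregs[REG_RIP];
    captured.frame_ptr = (uintptr_t)uc->uc_mcontext.gregs[REG_RBP];
#elif defined(__i386__)
    captured.ip = (uintptr_t)uc->uc_mcontext.gregs[REG_EIP];
    captured.frame_ptr = (uintptr_t)uc->uc_mcontext.gregs[REG_EBP];
#elif defined(__aarch64__)
    captured.ip = (uintptr_t)uc->uc_mcontext.pc;
    captured.frame_ptr = (uintptr_t)uc->uc_mcontext.regs[29];
#else
    (void)uc; /* other targets: layout not covered by this sketch */
#endif
    siglongjmp(after_fault, 1);
}

/* Triggers one fault on purpose and records the registers.
 * Returns 1 if the recorded fault address matches, 0 if not, -1 on setup failure. */
int capture_fault_registers(void)
{
    struct sigaction sa, old;
    size_t size = (size_t)sysconf(_SC_PAGESIZE);
    void *page = mmap(NULL, size, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = record_registers;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    if (sigsetjmp(after_fault, 1) == 0) {
        volatile char *p = page;
        char c = *p; /* this read faults and enters the handler */
        (void)c;
    }

    sigaction(SIGSEGV, &old, NULL);
    munmap(page, size);
    return captured.fault_addr == (uintptr_t)page;
}
```

A stack tracer would start from captured.frame_ptr and captured.ip and walk the saved frame pointers from there, which is only reliable when every frame on the stack actually maintains one.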