What happens at CPU-Level if you dereference a null pointer? - c

Suppose I have following program:
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>
static void myHandler(int sig){
abort();
}
int main(void){
signal(SIGSEGV,myHandler);
char* ptr=NULL;
*ptr='a';
return 0;
}
As you can see, I register a signalhandler and some lines further, I dereference a null pointer ==> SIGSEGV is triggered.
But how is it triggered?
If I run it using strace (Output stripped):
//Set signal handler (In glibc signal simply wraps a call to sigaction)
rt_sigaction(SIGSEGV, {sa_handler=0x563b125e1060, sa_mask=[SEGV], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7ffbe4fe0d30}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
//SIGSEGV is raised
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=NULL} ---
rt_sigprocmask(SIG_UNBLOCK, [ABRT], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [SEGV], 8) = 0
But something is missing, how does a signal go from the CPU to the program?
My understanding:
[Dereferences null pointer] -> [CPU raises an exception] -> [??? (How does it go from the CPU to the kernel?) ] -> [The kernel is notified, and sends the signal to the process] -> [??? (How does the process know, that a signal is raised?)] -> [The matching signal handler is called].
What happens at these two places marked with ????

A NULL pointer in most (but not all) C implementations is address 0. Normally this address is not in a valid (mapped) page.
Any access to a virtual page that's not mapped by the HW page tables results in a page-fault exception. e.g. on x86, #PF.
This invokes the OS's page-fault exception handler to resolve the situation. On x86-64 for example, the CPU pushes exception-return info on the kernel stack and loads a CS:RIP from the IDT (Interrupt Descriptor Table) entry that corresponds to that exception number. Just like any other exception triggered by user-space, e.g. integer divide by zero (#DE), or a General Protection fault #GP (trying to run a privileged instruction in user-space, or a misaligned SIMD instruction that required alignment, or many other possible things).
The page-fault handler can find out what address user-space tried to access. e.g. on x86, there's a control register (CR2) that holds the linear (virtual) address that caused the fault. The OS can get a copy of that into a general-purpose register with mov rax, cr2.
Other ISAs have other mechanisms for the OS to tell the CPU where its page-fault handler is, and for that handler to find out what address user-space was trying to access. But it's pretty universal for systems with virtual memory to have essentially equivalent mechanisms.
The access is not yet known to be invalid. There are several reasons why an OS might not have bothered to "wire" a process's allocated memory into the hardware page tables. This is what paging is all about: letting the OS correct the situation, like copy-on-write, lazy allocation, or bringing a page back in from swap space.
Page faults come in three categories: (copied from my answer on another question). Wikipedia's page-fault article says similar things.
valid (the process logically has the memory mapped, but the OS was lazy or playing tricks like copy-on-write):
hard: the page needs to be paged in from disk, either from swap space or from a disk file (e.g. a memory mapped file, like a page of an executable or shared library). Usually the OS will schedule another task while waiting for I/O: this is the key difference between hard (major) and soft (minor).
soft: No disk access required, just for example allocating + zeroing a new physical page to back a virtual page that user-space just tried to write. Or copy-on-write of a writeable page that multiple processes had mapped, but where changes by one shouldn't be visible to the other (like mmap(MAP_PRIVATE)). This turns a shared page into a private dirty page.
invalid: There wasn't even a logical mapping for that page. A POSIX OS like Linux will deliver SIGSEGV signal to the offending process/thread.
So only after the OS consults its own data structures to see which virtual addresses a process is supposed to own can it be sure that the memory access was invalid.
Deciding whether a page fault is invalid or not is completely up to software. As I wrote on Why page faults are usually handled by the OS, not hardware? - if the HW could figure everything out, it wouldn't need to trap to the OS.
Fun fact: on Linux it's possible to configure the system so virtual address 0 is (or can be) valid. Setting mmap_min_addr = 0 allows processes to mmap there. e.g. WINE needs this for emulating a 16-bit Windows memory layout.
Since that wouldn't change the internal object-representation of a NULL pointer to be other than 0, doing that would mean that NULL dereference would no longer fault. That makes debugging harder, which is why the default for mmap_min_addr is 64k.
On a simpler system without virtual memory, the OS might still be able to configure an MMU to trap on memory access to certain regions of address space. The OS's trap handler doesn't have to check anything, it knows any access that triggered it was invalid. (Unless it's also emulating something for some regions of address space...)
Delivering a signal to user-space
This part is pure software. Delivering SIGSEGV is no different than delivering SIGALRM or SIGTERM sent by another process.
Of course, a user-space process that just returns from a SIGSEGV handler without fixing the problem will make the main thread re-run the same faulting instruction again. (The OS would return to the instruction that raised the page-fault exception.)
This is why the default action for SIGSEGV is to terminate, and why it doesn't make sense to set the behaviour to "ignore".

Typically what happens is that when the CPU’s Memory Management Unit finds that the virtual address the program is trying to access is not in any of the mappings to physical memory, it raises an interrupt. The OS will have set up an Interrupt Service Routine just in case this happens. That routine will do whatever is necessary inside the OS to signal the process with SEGV. In return from the ISR the offending instruction has not been completed.
What happens then depends on whether there’s a handler installed or not for SEGV. The language’s runtime may have installed one that raises it as an exception. Almost always the process is terminated, as it is beyond recovery. Something like valgrind would do something useful with the signal, eg telling you exactly where in the code the program had got to.
Where it gets interesting is when you look at the memory allocation strategies used by C runtime libraries like glibc. A NULL pointer dereference is a bit of an obvious one, but what about accessing beyond the end of an array? Often, calls to malloc() or new will result in the library asking for more memory than has been asked for. The bet is that it can use that memory to satisfy further requests for memory without troubling the OS - which is nice and fast. However, the CPU’s MMU has no idea that that’s happened. So if you do access beyond the end of the array, you’re still accessing memory that the MMU can see is mapped to your process, but in reality you’re beginning to trample where one shouldn’t. Some very defensive OSes don’t do this, specifically so that the MMU does catch out of bounds accesses.
This leads to interesting results. I’ve come across software that builds and runs just fine on Linux which, compiled for FreeBSD, starts throwing SEGVs. GNURadio is one such piece of software (it was a complex flow graph). Which is interesting because it makes heavy use of boost / c++11 smart pointers specifically to help avoid memory misuse. I’ve not yet been able to identify where the fault is to submit a bug report for that one...

Related

Writing directly to memory location in C works in microcontrollers but not on PC

I'd like to know why something like the C code below works in microcontrollers but not in mainstream computers:
// 1. Get memory location of GPIOA SET Register.
uint32_t *gpioa = (uint32_t *)(0x40020000 + 0x18);
// 2. Set bit to 1 to enable it.
*gpioa |= (1<<5);
Statement 1. works on computers, but trying to access the memory location in any way leads to a segmentation fault.
Is the operating system blocking direct memory access in this way?
Yes, on typical multi-user systems, the operating system controls access to memory.
Your process has only a virtual address space. The operating sets special registers or other features in the hardware to regulate your address space. Parts of your virtual address space are mapped to physical memory, and parts are not mapped at all. (A mapping specifies how a virtual address is translated to a physical address.) The operating system also determines whether you can read memory, write memory, or execute instructions from memory.
At times, the operating system may change what parts of memory your process can access. It may keep data your process is not currently using on disk and mark that part of your virtual address space inaccessible. When your process tries to access it, the hardware generates an exception, and the kernel handles the exception by reading the data from disk to memory, marking the memory accessible to your process, and restarting your process at the instruction that generated the exception.
To add to Eric Postpischil's answer, the memory locations of various microcontroller registers are different for every model of microntroller you may try to program. So not only is your code not portable to a PC (where it segfaults), but it may also segfault or misbehave on a different microcontroller (unless it's the same family of microntroller, and they're specifically designed to be compatible).

Behavior of mprotect with multiple threads

For the purpose of concurrent/parallel GC,
I'm interested in what memory order guarantee is provided by the mprotect syscall (i.e. the behavior of mprotect with multiple threads or the memory model of mprotect). My questions are (assuming no compilier reordering or with sufficient compiler barrier)
If thread 1 triggers a segfault on an address due to a mprotect on
thread 2, can I be sure that everything happens on thread 2 before the
syscall can be observed in thread 1 in the signal handler of the
segfault? What if a full memory barrier is placed in the signal
handler before performing load on thread1?
If thread 1 does an volatile load on an address that is set to
PROT_NONE by thread 2 and didn't trigger a segfault, is this enough of
a happens before relation between the two. Or in another word, if the
two threads do (*ga starts as 0, p is a page aligned address started readonly)
// thread 1
*ga = 1;
*(volatile int*)p; // no segfault happens
// thread 2
mprotect(p, 4096, PROT_NONE); // Or replace 4096 by the real userspace-visible page size
a = *ga;
is there a guarantee that a on thread 2 will be 1? (assuming no
segfault observed on thread 1 and no other code modifies *ga)
I'm mostly interested in Linux behavior and particularly on x86(_64), arm/aarch64 and ppc though information about other archs/OS are welcome to (for windows, replace mprotect by VirtualProtect or whatever it is called....). So far my tests on x64 and aarch64 Linux suggests no violations of these though I'm not sure if my test is conclusive or if the behavior can be relied on in the long term.
Some searching suggests that mprotect may issue a TLB shootdown on all threads with the address mapped when permission is removed which might provide the guarantee stated here (or in another word, providing this guarantee seems to be the goal of such operation) though it's unclear to me if future optimization of the kernel code could break this guarentee.
Ref LKML post where I asked about this a week ago with no reply yet...
Edit: clearification about the question. I was aware that a tlb shootdown should provide the guarantee I'm looking for but I'd like to know if such a behavior can be relied on. In another word, what's the reason such requests are issued by the kernel since it shouldn't be needed if not for providing some kind of ordering guarantee.
So I asked this on the mechanical-sympathy group a day after posting here and got an answer from Gil Tene. With his permission here's my summary of his answers. The full thread is available here in case there's anything I didn't include that isn't clear.
For the overall behavior one can expect from the OS.
(as in "would be surprising for an OS to not meet):
A call to mprotect() is fully ordered with respect to loads and stores that happen before and after the call. This tends to be trivially achieved at the CPU and OS level because mprotect is a system call, which involves a trap, which in turn involves full ordering. [In strange no-ring-transition-implementations (e.g. in-kernel execution, etc.) the protect call would be presumably responsible for emulating this ordering assumption].
A call to mprotect will not return before the protection request semantically takes hold everywhere within the process. If the mprotect() call sets a protection that would cause a fault, any operation on any thread that happens after this mprotect() call is required to fault. Similarly, if the mprotect() call sets a protection that would prevent a fault, any operation on any thread that happens after this mprotect() call is required to NOT fault.
This essentially means that the memory operation on the affected pages on other threads are synchronized with the thread calling mprotect. More specifically, one can expect both of the two cases mentioned in the original question are guaranteed. I.e.
If it is observed that a load on one thread in the affected page faults due to the mprotect call, this fault happens after mprotect() call and therefore after and is able to observer all memory operations that happens before mprotect.
If it is observed that a load on one thread in the affected page doesn't fault disbite the mprotect call, the load happens before mprotect call and the mprotect call and any code after it are after the load and will be able to observe any memory operations that happens before the load.
It was also pointed out that transitivity may not work, i.e. a fault load on one thread may not be after a non-fault load on another thread. This can (effectively) be caused by the non-atomicity of the tlb flush causing different threads/cpus to observer the change in access permission at different times.

When does memory load cause bus error on x86-64 linux?

I used to think that x86-64 supports unaligned memory access and invalid memory access always causes segmentation fault (except, perhaps, SIMD instructions like movdqa or movaps). Nevertheless recently I observed bus error with normal mov instruction. Here is a reproducer:
void test(void *a)
{
asm("mov %0, %%rbp\n\t"
"mov 0(%%rbp), %%rdx\n\t"
: : "r"(a) : "rbp", "rdx");
}
int main()
{
test((void *)0x706a2e3630332d69);
return 0;
}
(must be compiled with frame pointer omission, e.g. gcc -O test.c && ./a.out).
mov 0(%rbp), %rdx instruction and the address 0x706a2e3630332d69 were copied from a coredump of the buggy program. Changing it to 0 causes segfault, but just aligning to 0x706a2e3630332d60 is still bus error (my guess is that it is related to the fact that address space is 48-bit on x86-64).
The question is: which addresses cause bus error (SIGBUS)? Is it determined by architecture or configured by OS kernel (i.e. in page table, control registers or something similar)?
SIGBUS is in a sad state. There's no consensus between different operating systems what it should mean and when it is generated varies wildly between operating systems, cpu architectures, configuration and the phase of the moon. Unless you work with a very specific configuration you should just treat it "just like SIGSEGV, but different".
I suspect that originally it was supposed to mean "you tried a memory access that could not possibly be successful no matter what the kernel does", so in other words the exact bit pattern you have in the address can never be a valid memory access. Most commonly this would mean unaligned access on strict alignment architectures. Then some systems started using it for accesses to virtual address space that doesn't exist (like in your example, the address you have can't exist). Then by accident some systems made it also mean that userland tried to touch kernel memory (since at least technically it's virtual address space that doesn't exist from the point of view of userland). Then it became just random.
Other than that I've seen SIGBUS from:
access to non-existent physical address from mmap:ed hardware.
exec of non-exec mapping
access to perfectly valid mapping, but overcommitted memory couldn't be faulted in at this moment (I've seen SIGSEGV, SIGKILL and SIGBUS here, at least one operating system does this differently depending on which architecture you're on).
memory management deadlocks (and other "something went horribly wrong, but we don't know what" memory management errors).
stack red zone access
hardware errors (ECC memory, pci bus parity errors, etc.)
access to mmap:ed file where the file contents don't exist (past the end of the file or a hole).
access to mmap:ed file where the file contents should exist, but don't (I/O errors).
access to normal memory that got swapped out and swap in couldn't be performed (I/O error).
Generally, a SIGBUS can be sent on an unaligned memory access, i.e. when writing a 64-bit integer to an address, which is not 8-byte aligned. However, in recent systems. either the hardware itself handles it correctly (albeit a bit slower than an aligned access), or the OS emulates the access it in an exception handler (with 2 or more separate memory accesses).
In this case, the problem is, that an address outside the permissible virtual address address space was specified. Despite a pointer has 64-bit, only the address space from 0-(2^48-1) (0x0-0xffffffffffff) is valid on current 64-bit intel processors. Linux provides even less address space to its processes, from 0-(2^47-1) (which is 0-0x7fffffffffff), the rest (0x800000000000-0xffffffffffff) is used by the kernel.
This means, that the kernel sends a SIGBUS because of an access to an invalid address (every address >= 0x800000000000), as opposed to a SIGSEGV, which means, that an access error to a valid address occurred (missing page entry, wrong access rights, etc.).
The only situation where POSIX specifically requires generation of a SIGBUS is, when you create a file-backed mmap region that extends beyond the end of the backing file by more than a whole page, and then access addresses sufficiently far past the end. (The exact words are "References within the address range starting at pa and continuing for len bytes to whole pages following the end of an object shall result in delivery of a SIGBUS signal.", from the specification of mmap.)
In all other circumstances, whether you get a SIGSEGV or a SIGBUS for an invalid memory access, or no signal at all, is left completely up to the implementation.

What's the way to simulate success after SIGSEGV

I'm capturing a SIGSEGV on a read/write to a known block of memory. The block is mmaped and under my control, so it can be manipulated. I'd like to simulate the read/write succeeding, actually process the data and continue the application. I've got two possible solutions, but they all seem too complicated. I'm hoping there's a better way to achieve this:
Borrow the trick from debuggers and:
mmap the area and protect
wait for SIGSEGV
get the read/write size from instruction type
for reads, put the required data in memory and remove protection
single-step the app
for writes, read what was written and process
in the single-step TRAP protect the page again and continue the app
Do some crazy processing on the instruction itself and:
mmap the area and protect
wait for SIGSEGV
get the instruction under eip and simulate its effects
return after the instruction
The app is not running under root account, in case that matters.
I'm assuming x86_64 and don't really care about other platforms at the moment.

scan memory of calling process

I have to scan the memory space of a calling process in C. This is for homework. My problem is that I don't fully understand virtual memory addressing.
I'm scanning the memory space by attempting to read and write to a memory address. I can not use proc files or any other method.
So my problem is setting the pointers.
From what I understand the "User Mode Space" begins at address 0x0, however, if I set my starting point to 0x0 for my function, then am I not scanning the address space for my current process? How would you recommend adjusting the pointer -- if at all -- to address the parent process address space?
edit: Ok sorry for the confusion and I appreciate the help. We can not use proc file system, because the assignment is intended for us to learn about signals.
So, basically I'm going to be trying to read and then write to an address in each page of memory to test if it is R, RW or not accessible. To see if I was successful I will be listening for certain signals -- I'm not sure how to go about that part yet. I will be creating a linked list of structure to represent the accessibility of the memory. The program will be compiled as a 32 bit program.
With respect to parent process and child process: the exact text states
When called, the function will scan the entire memory area of the calling process...
Perhaps I am mistaken about the child and parent interaction, due to the fact we've been covering this (fork function etc.) in class, so I assumed that my function would be scanning a parent process. I'm going to be asking for clarification from the prof.
So, judging from this picture I'm just going to start from 0x0.
From a userland process's perspective, its address space starts at address 0x0, but not every address in that space is valid or accessible for the process. In particular, address 0x0 itself is never a valid address. If a process attempts to access memory (in its address space) that is not actually assigned to that process then a segmentation results.
You could actually use the segmentation fault behavior to help you map out what parts of the address space are in fact assigned to the process. Install a signal handler for SIGSEGV, and skip through the whole space, attempting to read something from somewhere in each page. Each time you trap a SIGSEGV you know that page is not mapped for your process. Go back afterward and scan each accessible page.
Do only read, however. Do not attempt to write to random memory, because much of the memory accessible to your programs is the binary code of the program itself and of the shared libraries it uses. Not only do you not want to crash the program, but also much of that memory is probably marked read-only for the process.
EDIT: Generally speaking, a process can only access its own (virtual) address space. As #cmaster observed, however, there is a syscall (ptrace()) that allows some processes access to some other processes' memory in the context of the observed process's address space. This is how general-purpose debuggers usually work.
You could read (from your program) the /proc/self/maps file. Try first the following two commands in a terminal
cat /proc/self/maps
cat /proc/$$/maps
(at least to understand what are the address space)
Then read proc(5), mmap(2) and of course wikipages about processes, address space, virtual memory, MMU, shared memory, VDSO.
If you want to share memory between two processes, read first shm_overview(7)
If you can't use /proc/ (which is a pity) consider mincore(2)
You could also non-portably try reading from (and perhaps rewriting the same value using volatile int* into) some address and catching SIGSEGV signal (with a sigsetjmp(3) in the signal handler), and do that in a -dichotomical- loop (in multiple of 4Kbytes) - from some sane start and end addresses (certainly not from 0, but probably from (void*)0x10000 and up to (void*)0xffffffffff600000)
See signal(7).
You could also use the Linux (Gnu libc) specific dladdr(3). Look also into ptrace(2) (which should be often used from some other process).
Also, you could study elf(5) and read your own executable ELF file. Canonically it is /proc/self/exe (a symlink) but you should be able to get its from the argv[0] of your main (perhaps with the convention that your program should be started with its full path name).
Be aware of ASLR and disable it if your teacher permits that.
PS. I cannot figure out what your teacher is expecting from you.
It is a bit more difficult than it seems at the first sight. In Linux every process has its own memory space. Using any arbitrary memory address points to the memory space of this process only. However there are mechanisms which allow one process to access memory regions of another process. There are certain Linux functions which allow this shared memory feature. For example take a look at
this link which gives some examples of using shared memory under Linux using shmget, shmctl and other system calls. Also you can search for mmap system call, which is used to map a file into a process' memory, but can also be used for the purpose of accessing memory of another process.

Resources