Address ranges reserved by gdb?

I have a program that mmaps memory at higher addresses using MAP_FIXED at TASK_SIZE - PAGE_SIZE.
This program runs fine if I execute it, but if I run it with gdb, it segfaults just after the mmap. Also at this point, the gdb state seems to be completely corrupted and it appears that the execution reaches an address range filled with 0's (could be from the new mappings just created).
Does gdb use this address range in the running process? Have I cleared out some of gdb's state? Is this address range documented somewhere?
Following is my call to mmap and the address calculation -
#define TASK_SIZE64 (0x800000000000UL - 4096)
#define TASK_SIZE TASK_SIZE64
#define PAGE_OFFSET (void*)TASK_SIZE
...
char *load_address = PAGE_OFFSET - file_size_aligned;
if(load_address != mmap(load_address, file_size_aligned, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0)){
err("Failed to allocate memory for raw_binary with: %d\n", errno);
return -1;
}
file_size_aligned works out to one PAGE_SIZE. This is one of the allocations. There is one more that starts from load_address and allocates a few more pages backwards, toward lower addresses (with PROT_READ and PROT_WRITE only).

Does gdb use this address range in the running process?
No.
Have I cleared out some of gdb's state?
No.
Is this address range documented somewhere?
Possibly in kernel sources.
Your program makes invalid assumptions about available address space, and "blows itself up" when run with ASLR turned off (which GDB does by default).
You can confirm this by running your program outside GDB, but with ASLR disabled. It should also crash. Try one of these:
# echo 0 > /proc/sys/kernel/randomize_va_space
or
setarch $(uname -m) -R /path/to/exe
You can also confirm that your program will run under GDB if you enable ASLR:
gdb /path/to/exe
(gdb) set disable-randomization off
(gdb) run

Related

mallinfo doesn't show mmap allocation's information

The mallinfo structure has two fields, hblks and hblkhd. The man page says they hold the number of blocks allocated by mmap and their total size in bytes. But when I run the following code
void * ptr = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
*(int *) ptr = 10;
the fields hblks and hblkhd are still zero, while the total number of free bytes in the blocks decreases. Could you please explain why this behavior is observed?
I also tried allocating all the free space first and calling mmap afterwards, but the fields are still zero in that case.
Compiler: gcc 9.4.0
OS: Ubuntu 20.04.1
I did some experiments, and they led me to conclude that these fields are updated only when malloc itself calls mmap to satisfy a request. A direct mmap call does not show up in the statistics, which is logical: mmap is a system call, and the statistics are collected in user space by the allocator.

When accessing mmap address, signal SIGBUS was received

When I tried to access the address that mmap returned, a bus error occurred.
My code is below:
ftruncate(fd, shared_size);
addr = mmap(shared_start, shared_size, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED, fd, 0);
shared_size == 256*1024*1024
shared_start == 0x401000000000 (I used flag MAP_FIXED)
ftruncate extends the file to 256M:
-rw-r--r-- 1 root 0 256.0M Mar 4 03:47 mem.alloc
The mmap call itself succeeds, and only part of the address range is inaccessible.
From the gdb output below, we can see that address 0x40100f11ff00 is not accessible, but address 0x40100fe00000 is:
(gdb) p *((char *)addr+0xf11ff00)
Cannot access memory at address 0x40100f11ff00
(gdb) p *((char *)addr+0xfe1ff00)
Cannot access memory at address 0x40100fe1ff00
(gdb) p *((char *)addr+0xfe00000)
$17 = 0 '\000'
From the maps information below, we can see that the addresses I accessed above are all within the mapped range:
0x401000000000 0x401010000000 0x10000000 0x0 /dev/mem.alloc
However, when writing to these inaccessible addresses, a bus error occurs:
Program received signal SIGBUS, Bus error.
PS: When reducing shared_size from 256M to 128M, there is no issue.
I have fixed it. This is a problem that can be easily overlooked: the filesystem mounted at /dev is too small to back the whole file.
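On filesystems that allocate lazily, ftruncate typically only sets the file length, so the shortage shows up later as SIGBUS when a page that cannot be backed is touched. A hedged sketch of one way to surface the problem earlier is to reserve the blocks with posix_fallocate before mapping, so that a full filesystem is reported as an error code instead. map_reserved_file is a hypothetical helper name.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Sizes the file, reserves its storage, then maps it shared and writable.
 * Returns the mapping, or NULL with errno set (ENOSPC if the filesystem
 * cannot hold the whole file). */
void *map_reserved_file(int fd, size_t size)
{
    if (ftruncate(fd, (off_t)size) == -1)
        return NULL;

    /* posix_fallocate returns the error number directly; it does not set errno. */
    int err = posix_fallocate(fd, 0, (off_t)size);
    if (err != 0) {
        errno = err;
        return NULL;
    }

    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

Checking the return value of ftruncate, which the original snippet ignores, is part of the same habit.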

Not able to write in /dev/mem

The issue that I am experiencing is not related to the open() or mmap() functions, which execute properly. I have disabled CONFIG_STRICT_DEVMEM in the kernel so I can read from /dev/mem without problems. Actually, I can do the following:
const char *path = "/dev/mem";
int fd = open(path, O_RDWR); /* read and write flags*/
p = mmap(0, SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, BASE_ADDR); /* read and write flags*/
And the code does not fail. Nonetheless, I am using this code to write in the PCI address space. So, basically the BASE_ADDR is 0xc000000, and the size is 256 MiB (0x10000000, all the PCI address space).
That said, when I try to write to these positions (with a specific offset, BDF format), nothing is written; again, the code does not fail, it just does not write anything.
In case my code was wrong, I tried BusyBox, with the following parameters:
[horro# ~]$ sudo busybox devmem 0xc00b0a8c w 0xffffffff
[horro# ~]$ sudo busybox devmem 0xc00b0a8c
0x00000000
So, basically it is not writing anything.
There is a CONFIG_STRICT_DEVMEM kernel config option. My understanding is that it must be disabled at compile time (CONFIG_STRICT_DEVMEM=n) to allow this kind of access; the restriction exists for security reasons.
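To rule out a mistake in the mapping itself, note that mmap requires a page-aligned file offset, so a single register such as 0xc00b0a8c is reached by mapping the page that contains it and adding the remainder. The following is a minimal sketch under stated assumptions: /dev/mem is accessible, the address is 4-byte aligned, and both function names are hypothetical. Whether the device keeps the written value still depends on the register.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Splits a physical address into a page-aligned base and the offset in that page.
 * page_size must be a power of two. */
void split_phys_addr(uint64_t phys, uint64_t page_size,
                     uint64_t *base, uint64_t *delta)
{
    *base = phys & ~(page_size - 1);
    *delta = phys - *base;
}

/* Writes a 32-bit value at phys through /dev/mem and reads it back.
 * Returns 0 and stores the value read back, or -1 on error.
 * Assumes phys is 4-byte aligned. */
int devmem_write32(uint64_t phys, uint32_t value, uint32_t *readback)
{
    uint64_t page = (uint64_t)sysconf(_SC_PAGESIZE), base, delta;
    split_phys_addr(phys, page, &base, &delta);

    int fd = open("/dev/mem", O_RDWR | O_SYNC);
    if (fd < 0)
        return -1;
    void *map = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_SHARED,
                     fd, (off_t)base);
    close(fd);                      /* the mapping stays valid after close */
    if (map == MAP_FAILED)
        return -1;

    /* volatile forces a real load and store instead of a cached value. */
    volatile uint32_t *reg = (volatile uint32_t *)((char *)map + delta);
    *reg = value;
    *readback = *reg;
    munmap(map, page);
    return 0;
}
```

If the value read back differs from the value written, the store reached the mapping but the target did not retain it.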

Linux error from munmap

I have a simple question regarding mmap and munmap in Linux : is it possible that mmap succeeds but munmap fails?
Assuming all the parameters are given correctly, as in the following code snippet, under what circumstances will "munmap failed!" be printed?
char *addr = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
... exit if mmap was not successful ...
... do some stuff using mmaped area ...
if( munmap(addr, 4096) == -1 ){
printf("munmap failed!\n");
}
Yes, it can fail. From the munmap man page:
Upon successful completion, munmap() shall return 0; otherwise, it shall return -1 and set errno to indicate the error.
The error codes indicate that people do pass invalid parameters. But if you pass the pointer you got from mmap() and the correct size, then it will not fail.
Assuming all the parameters are correctly given,
Then it will not fail. That is why most implementations (99%) just don't check the return value of munmap(). In such a case, even if it fails, you can't do anything (other than informing the user).
munmap can fail with EINVAL if it receives an invalid parameter.
It can also fail with ENOMEM if the process's maximum number of mappings would have been exceeded. That can happen if you try to unmap a region in the middle of an existing map, and that results in two smaller maps.
You can see the maximum number of mappings per process with:
sysctl vm.max_map_count
You can increase it with, for example:
sysctl -w vm.max_map_count=131060
Be warned that if you don't check munmap's return value, you will not have any clue when a failure causes a crash or another kind of problem, especially in long-running processes where the error can linger.
An munmap EINVAL, for instance, can point to a memory leak.
Also notice that mmap fails with MAP_FAILED, which is (void*)-1 and not NULL.

Mmap() an entire large file

I am trying to "mmap" a binary file (~ 8Gb) using the following code (test.c).
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#define handle_error(msg) \
do { perror(msg); exit(EXIT_FAILURE); } while (0)
int main(int argc, char *argv[])
{
const char *memblock;
int fd;
struct stat sb;
fd = open(argv[1], O_RDONLY);
fstat(fd, &sb);
printf("Size: %lu\n", (uint64_t)sb.st_size);
memblock = mmap(NULL, sb.st_size, PROT_WRITE, MAP_PRIVATE, fd, 0);
if (memblock == MAP_FAILED) handle_error("mmap");
for(uint64_t i = 0; i < 10; i++)
{
printf("[%lu]=%X ", i, memblock[i]);
}
printf("\n");
return 0;
}
test.c is compiled using gcc -std=c99 test.c -o test and file of test returns: test: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.15, not stripped
Although this works fine for small files, I get a segmentation fault when I try to load a big one. The program actually returns:
Size: 8274324021
mmap: Cannot allocate memory
I managed to map the whole file using boost::iostreams::mapped_file but I want to do it using C and system calls. What is wrong with my code?
Writable MAP_PRIVATE mappings require a memory reservation, as writing to these pages may result in copy-on-write allocations. This means that you can't map something much larger than your physical RAM + swap. Try using a MAP_SHARED mapping instead. This means that writes to the mapping will be reflected on disk; as such, the kernel knows it can always free up memory by doing writeback, so it won't limit you.
I also note that you're mapping with PROT_WRITE, but you then go on and read from the memory mapping. You also opened the file with O_RDONLY - this itself may be another problem for you; you must specify O_RDWR if you want to use PROT_WRITE with MAP_SHARED.
As for PROT_WRITE only, this happens to work on x86, because x86 doesn't support write-only mappings, but may cause segfaults on other platforms. Request PROT_READ|PROT_WRITE - or, if you only need to read, PROT_READ.
On my system (VPS with 676MB RAM, 256MB swap), I reproduced your problem; changing to MAP_SHARED results in an EACCES error (since I'm not allowed to write to the backing file opened with O_RDONLY). Changing to PROT_READ and MAP_SHARED allows the mapping to succeed.
If you need to modify bytes in the file, one option would be to make private just the ranges of the file you're going to write to. That is, munmap and remap with MAP_PRIVATE the areas you intend to write to. Of course, if you intend to write to the entire file then you need 8GB of memory to do so.
Alternately, you can write 1 to /proc/sys/vm/overcommit_memory. This will allow the mapping request to succeed; however, keep in mind that if you actually try to use the full 8GB of COW memory, your program (or some other program!) will be killed by the OOM killer.
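For the read-only case, the suggested combination of PROT_READ and MAP_SHARED can be sketched as follows. map_file_readonly is a hypothetical helper name; it returns unsigned char so that printing bytes with %X does not sign-extend values above 0x7F.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Maps a whole file read-only. Returns the mapping and stores its length,
 * or returns NULL on any failure (including an empty file). */
const unsigned char *map_file_readonly(const char *path, size_t *len_out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat sb;
    if (fstat(fd, &sb) == -1 || sb.st_size == 0) {
        close(fd);
        return NULL;
    }

    /* A read-only shared mapping needs no copy-on-write reservation. */
    void *p = mmap(NULL, (size_t)sb.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return NULL;

    *len_out = (size_t)sb.st_size;
    return p;
}
```

The caller releases the mapping with munmap, passing the length stored in len_out.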
Linux (and apparently a few other UNIX systems) has the MAP_NORESERVE flag for mmap(2), which can be used to explicitly enable swap space overcommitting. This can be useful when you wish to map a file larger than the amount of free memory available on your system.
This is particularly handy when you use MAP_PRIVATE and only intend to write to a small portion of the memory mapped range, since this would otherwise trigger swap space reservation of the entire file (or cause the system to return ENOMEM, if system wide overcommitting hasn't been enabled and you exceed the free memory of the system).
The issue to watch out for is that if you do write to a large portion of this memory, the lazy swap space reservation may cause your application to consume all the free RAM and swap on the system, eventually triggering the OOM killer (Linux) or causing your app to receive a SIGSEGV.
You don't have enough virtual memory to handle that mapping.
As an example, I have a machine here with 8G RAM, and ~8G swap (so 16G total virtual memory available).
If I run your code on a VirtualBox snapshot that is ~8G, it works fine:
$ ls -lh /media/vms/.../snap.vdi
-rw------- 1 me users 9.2G Aug 6 16:02 /media/vms/.../snap.vdi
$ ./a.out /media/vms/.../snap.vdi
Size: 9820000256
[0]=3C [1]=3C [2]=3C [3]=20 [4]=4F [5]=72 [6]=61 [7]=63 [8]=6C [9]=65
Now, if I drop the swap, I'm left with 8G total memory. (Don't run this on an active server.) And the result is:
$ sudo swapoff -a
$ ./a.out /media/vms/.../snap.vdi
Size: 9820000256
mmap: Cannot allocate memory
So make sure you have enough virtual memory to hold that mapping (even if you only touch a few pages in that file).
