When would one use mmap MAP_FIXED? - c

I've been looking at the different flags for the mmap function, namely MAP_FIXED, MAP_SHARED, MAP_PRIVATE. Can someone explain to me the purpose of MAP_FIXED? There's no guarantee that the address space will be used in the first place.

MAP_FIXED is dup2 for memory mappings, and it's useful in exactly the same situations where dup2 is useful for file descriptors: when you want to perform a replace operation that atomically reassigns a resource identifier (memory range in the case of MAP_FIXED, or fd in the case of dup2) to refer to a new resource without the possibility of races where it might get reassigned to something else if you first released the old resource then attempted to regain it for the new resource.
As an example, take loading a shared library (by the dynamic loader). It consists of at least three types of mappings: read+exec-only mapping of the program code and read-only data from the executable file, read-write mapping of the initialized data (also from the executable file, but typically with a different relative offset), and read-write zero-initialized anonymous memory (for .bss). Creating these as separate mappings would not work because they must be at fixed relative addresses relative to one another. So instead you first make a dummy mapping of the total length needed (the type of this mapping doesn't matter) without MAP_FIXED just to reserve a sufficient range of contiguous addresses at a kernel-assigned location, then you use MAP_FIXED to map over top of parts of this range as needed with the three or more mappings you need to create.
Further, note that use of MAP_FIXED with a hard-coded address or a random address is always a bug. The only correct way to use MAP_FIXED is to replace an existing mapping whose address was assigned by a previous successful call to mmap without MAP_FIXED, or in some other way where you feel it's safe to replace whole pages. This aspect too is completely analogous to dup2; it's always a bug to use dup2 when the caller doesn't already have an open file on the target fd with the intent to replace it.

If the file you are loading contains pointers, you will need to load it at a fixed location in order to ensure that the pointers are correct. In some cases, this can merely be an optimization.
Executables which are not position-independent must be loaded at fixed addresses.
Shared memory may contain pointers.
Executables which use prebinding will attempt to load dynamic libraries at predetermined memory locations as an optimization, but will fall back to normal loading techniques if a different location is used (or if the library has changed).
So MAP_FIXED is not typical usage.

Related

Always add MAP_NORESERVE flag in mmap for a regular file?

According to mmap's manual:
MAP_NORESERVE
Do not reserve swap space for this mapping. When swap space
is reserved, one has the guarantee that it is possible to modify
the mapping. When swap space is not reserved one might get
SIGSEGV upon a write if no physical memory is available.
To my understanding, if a regular file is mapped into the virtual address range, there is no need for any swap space. Only MAP_ANONYMOUS may need some swap space.
So, is it correct to always add MAP_NORESERVE flag in mmap for a regular file?
Update:
To be more specific, is it correct to always add MAP_NORESERVE flag in mmap for a regular file, when MAP_SHARED is used?
To my understanding, if a regular file is mapped into the virtual address range, there is no need for any swap space. Only MAP_ANONYMOUS may need some swap space.
That depends on the mmap flags. If a regular file is mapped with MAP_PRIVATE then the memory region is initialized from the file but not backed by the file. The system will need swap space for such a mapping if it decides to swap out any of its pages.
So, is it correct to always add MAP_NORESERVE flag in mmap for a regular file?
It is not incorrect to specify MAP_NORESERVE for any mapping. It's simply a question of what guarantees you want to have about program behavior.
Moreover, you seem looking at this from the wrong direction. If a particular mapping can never require swap space then the system will not reserve swap space for it, regardless of the flags. It doesn't hurt to use MAP_NORESERVE in such a case, but it doesn't help, either, so what would be the point?
On the other hand, if you want to be sure that mapping cannot fail on account of using MAP_NORESERVE then the most appropriate course of action is to avoid using that flag. You can completely ignore its existence if you wish, and in fact you should do so if you want maximum portability, because MAP_NORESERVE is a Linux extension not specified by POSIX.
Update:
As I understand it, you are asserting that you can reproducibly observe successful mapping of existing ranges of existing files with MAP_SHARED to require MAP_NORESERVE. That is, two such mapping attempts that differ only in whether MAP_NORESERVE is specified will produce different results for you, in a manner that you can predict and reliably reproduce.
I find that surprising, even dubious. I do not expect pages of a process's virtual address space that the page table maps to existing regions of a regular file to have any association with swap space, and therefore I do not expect the system to try to reserve any swap space for such pages when it establishes a mapping, flags notwithstanding. If you genuinely observe different behavior then I would attribute it to a library or kernel bug, about which you should file an issue.
Consider, for example, the GNU libc manual, which explicitly says that a memory mapping can be larger than physical memory and swap space, but does not document Linux-specific MAP_NORESERVE at all.
With that said, again, it is not incorrect (on Linux) to specify MAP_NORESERVE for any given memory mapping. I expect it to be meaningless for a MAP_SHARED mapping of an existing region of a regular file, however, so I would not consider it good -- much less best -- practice to routinely use that flag for such mappings. On the other hand, if specifying that flag works around a library or kernel bug that otherwise interferes with establishing certain mappings then I see no particular reason to avoid doing that, but I would expect each such use to be accompanied by a documentary comment explaining why that flag is used.

Custom heap/memory allocation ranges

I am writing a 64-bit application in C (with GCC) and NASM under Linux.
Is there a way to specify, where I want my heap and stack to be located. Specifically, I want all my malloc'ed data to be anywhere in range 0x00000000-0x7FFFFFFF. This can be done at either compile time, linking or runtime, via C code or otherwise. It doesn't matter.
If this is not possible, please explain, why.
P.S. For those interested, what the heck I am doing:
The program I am working on is written in C. During runtime it generates NASM code, compiles it and dynamically links to the already running program. This is needed for extreme optimization, because that code will be run thousands-if-not-billions of times, and is not known at compile time. So the reason I need 0x00000000-0x7FFFFFFF addresses is because they fit in immediates in assembler code. If I don't need to load the addresses separately, I can just about half the number of memory accesses needed and increase locality.
For Linux, the standard way of acquiring any Virtual Address range is using the mmap(2) function.
You can specify the starting virtual address and the size. If the address is not already in use and it not reserved by prior calls (or by the kernel) you will get access to the virtual address.
The success of this call can be checked by comparing the return value to the start address you passed. If the call fails, the function returns NULL.
In general mmap is used to map virtual addresses to file descriptors. But this mapping has to happen through physical pages on the RAM. Since the applications cannot directly access the disk.
Since you do not want any file backing, you can use the MAP_ANONYMOUS flag in the mmap call (also pass -1 as the fd).
This is the excerpt for the related part of the man-page -
MAP_ANONYMOUS
The mapping is not backed by any file; its contents are
initialized to zero. The fd argument is ignored; however,
some implementations require fd to be -1 if MAP_ANONYMOUS (or
MAP_ANON) is specified, and portable applications should
ensure this. The offset argument should be zero. The use of
MAP_ANONYMOUS in conjunction with MAP_SHARED is supported on
Linux only since kernel 2.4.

Is it possible to perform munmap based on information in /proc/self/maps?

I need to be able to unmap files that were opened through some libraries I'm linking with. The reason for needing to do this is that the mappings made by these libraries hold references to modules that may need to be reloaded while the program is executing (potentially long running executions). The problem is the modules cannot be unloaded while my process is holding a reference.
I've written C code to parse the information in proc/self/maps in an effort to read the address range of the mappings and calculate its length. I've calculated the length by subtracting the starting address from the ending address then I pass the starting address and the calculated length as the respective parameters to munmap. The problem is that munmap fails with EINVAL (Invalid Argument).
I've checked the size of the page my machine uses with sysconf(_SC_PAGESIZE) and it returned 4096, which is the value of my calculated length. The GNU manual says that munmap can fail with EINVAL if:
The memory range given was outside the user mmap range or wasn’t page
aligned.
Am I missing anything or is this at all not possible? My last resort would be to carefully comb through the system calls made and examine each mmap via strace, but I'd like for this to be a last resort, thanks.
The method you have chosen will not work, but, not necessarily for the reason(s) you think. However, there is a way, so read on ...
I need to be able to unmap files that were opened through some libraries I'm linking with.
Once you've let the ELF loader (e.g. ld.linux.so) load the libraries for you, you've lost control.
You can not just unmap the area [regardless of method]. The loader has already done relocations and symbol linkages for these libraries. The unmap removes the area, but now everything breaks because the various pointers set up by the linker now point to empty space. The loader will have no knowledge of what you've done.
Then, how would you remap the new version of the library [and where in memory]? Even remapping it to the same address is no guarantee because you can't adjust the things that the loader has already done.
The reason for needing to do this is that the mappings made by these libraries hold references to modules that may need to be reloaded while the program is executing (potentially long running executions).
Most programs that need to update to new versions of libraries simply reexec themselves. If you need to preserve the data, you can work out a dump/restore mechanism.
However, if you truly want to unload/load a newer library version, you can do this using dynamic linking.
Instead of using ld to link with (e.g.) libA, leave if off the ld command line and have the program do its own loading of libA.
You use dlopen/dlsym/dlclose to open/load the library under your control.
You'll have to keep track of the symbol tables, but changing to the new version is easy. When you want the new version, simply do dlclose and then a dlopen. You'll have to redo the dlsym calls to get the updated addresses, but all this is fairly easy and standard
The problem is the modules cannot be unloaded while my process is holding a reference.
The reason is that ELF loader did this. With dlopen et. al., you don't have the same problem.
Yes it is possible. I had a closer look at my code; there was something I was misunderstanding when it came to passing the start address of the mapping to munmap. I had initially read the start address into an unsigned long long and for some reason, I was converting this value into a hex string instead of just casting to a void pointer when calling munmap. In essence:
/* Values assigned here are really read from /proc/self/maps */
unsigned long long vm_start = 140013986873344;
unsigned long long vm_end = 140013986877440;
unsigned long long length = vm_end - vm_start;
munmap((void *)vm_start, length);

Using mremap() to merge two identical pages into one physical page

I have a C code where I know that the content of the page pointed to by void *p1 is the same as the content pointed to by page void *p2. p1 and p2 were dynamically allocated. My question is can I use remap() to let these two pages point to the same physical page instead of having two identical physical pages?
Edit: I am trying to change the virtual to physical mapping in the page table of this process so that p1 and p2 point to the same physical address. I do not want to make p1 and p2 to point to the same thing virtually.
If you are trying to map multiple virtual memory addresses to a single physical address, using the linux page scheme, that isn't what mremap() is for. mremap is for moving (remapping) an existing region, and if you use it to map to a specific newaddress, any old mappings to that address become invalid (per the man page). http://man7.org/linux/man-pages/man2/mremap.2.html
See the emphasized section...
MREMAP_FIXED (since Linux 2.3.31)
This flag serves a similar purpose to the MAP_FIXED flag of
mmap(2). If this flag is specified, then mremap() accepts a
fifth argument, void *new_address, which specifies a page-
aligned address to which the mapping must be moved. Any
previous mapping at the address range specified by new_address
and new_size is unmapped. If MREMAP_FIXED is specified, then
MREMAP_MAYMOVE must also be specified.
If you are simply trying to merge the storage of 2 identical data structures, you would't need mremap() to point 2 "pages" to the same identical page, you'd need to point the 2 different data structure pointers to the same page and free the redundant page.
If the content is the same, you'd need to convert any pointers that are pointing to p2 to addresses into p1.
Even using mremap properly requires you to take care of your own pointer housekeeping, it doesn't magically do that for you; if you fail to do that, after the remap you may have dangling pointers.
PS: Its been years since I did kernel programming, so I might be wrong or out of date in my next statement, but I think you will need to use kernel calls (ie. kernel module / driver level calls) to get to the physical mappings because mmap() and mremap() are user-land calls and work within the virtual address space. The "page mapping" is done at the kernel level, outside of user space.

How to use a mmap file mapping for variables

I'm currently experimenting with IPC via mmap on Unix.
So far, I'm able to map a sparse-file of 10MB into the RAM and access it reading and writing from two separated processes. Awesome :)
Now, I'm currently typecasting the address of the memory-segment returned by mmap to char*, so I can use it as a plain old cstring.
Now, my real question digs a bit further. I've quite a lot experience with higher levels of programming (ruby, java), but never did larger projects in C or ASM.
I want to use the mapped memory as an address-space for variable-allocation. I dont't wether this is possible or does make any sense at all. Im thinking of some kind of a hash-map-like data structure, that lives purely in the shared segment. This would allow some interesting experiments with IPC, even with other languages like ruby over FFI.
Now, a regular implementation of a hash would use pretty often something like malloc. But this woult allocate memory out of the shared space.
I hope, you understand my thoughts, although my english is not the best!
Thank you in advance
Jakob
By and large, you can treat the memory returned by mmap like memory returned by malloc. However, since the memory may be shared between multiple "unrelated" processes, with independent calls to mmap, the starting address for each may be different. Thus, any data structure you build inside the shared memory should not use direct pointers.
Instead of pointers, offsets from the initial map address should be used instead. The data structure would then compute the right pointer value by adding the offset to the starting address of the mmamp region.
The data structure would be built from the single call to mmap. If you need to grow the data structure, you have to extend the mmap region itself. The could be done with mremap or by manually munmap and mmap again after the backing file has been extended.

Resources