Basic mmap(2) call fails - c

I've been pulling my hair out for hours over this very basic piece of code, and I still haven't managed to understand why mmap(2) is failing.
#include <sys/mman.h>
#include <sys/user.h>
#include <err.h>
#include <stdio.h>
#include <stdlib.h>
#define NPAGES 50
#define SLABSIZE (PAGE_SIZE * NPAGES) // 200 kB
int
main(int argc, char *argv[])
{
char *slab;
printf("DEBUG: mmap(%#x, %u, %#x, %#x, %d, %u)\n",
NULL, SLABSIZE, PROT_READ | PROT_WRITE, MAP_ANON, -1, 0);
slab = mmap(NULL, SLABSIZE, PROT_READ | PROT_WRITE, MAP_ANON, -1, 0);
if (slab == MAP_FAILED)
err(1, "mmap");
}
But when I run it:
$ make mmap
cc mmap.c -o mmap
$ ./mmap
DEBUG: mmap(0, 204800, 0x3, 0x20, -1, 0)
mmap: mmap: Invalid argument
I checked and re-checked mmap(2)'s manpage [1], and it seems that all requirements are OK but I must be missing something.
I'm running Linux kernel 4.8.13.
Thanks.
-- Jeremie
[1] http://man7.org/linux/man-pages/man2/mmap.2.html

When I strace your program, I see:
mmap(NULL, 204800, PROT_READ|PROT_WRITE, MAP_FILE|MAP_ANONYMOUS, -1, 0) = -1 EINVAL (Invalid argument)
You forgot a MAP_SHARED flag (or a MAP_PRIVATE one). With it (either MAP_SHARED or MAP_PRIVATE, but you need one of them) your program works:
slab = mmap(NULL, SLABSIZE, PROT_READ | PROT_WRITE,
MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
Quoting the mmap(2) man page:
This behavior is determined by including exactly one of the following values in flags:
(emphasis is mine)
MAP_SHARED
Share this mapping. Updates to the mapping are visible to
other processes mapping the same region, and (in the case of
file-backed mappings) are carried through to the underlying
file. (To precisely control when updates are carried through
to the underlying file requires the use of msync(2).)
MAP_PRIVATE
Create a private copy-on-write mapping. Updates to the
mapping are not visible to other processes mapping the same
file, and are not carried through to the underlying file. It
is unspecified whether changes made to the file after the
mmap() call are visible in the mapped region.
So the general advice before pulling your hair out: read the documentation once again; take a nap; then read the documentation another time and think about what you missed.
Another hint would be to use strace(1) on some (or several) existing executable(s). You'll learn a lot.
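As a minimal sketch of the point above (Linux is assumed, and the helper name try_anon_map is made up for illustration), the following function attempts an anonymous mapping with whatever sharing flag the caller passes and reports the errno value from mmap(). Passing 0, as the question effectively does, should come back as EINVAL, while MAP_PRIVATE or MAP_SHARED should succeed.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc under strict -std= modes */
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper, not part of any library.
 * Tries an anonymous read/write mapping of `len` bytes with the given
 * sharing flag (MAP_PRIVATE, MAP_SHARED, or 0 to reproduce the bug).
 * Returns 0 on success, or the errno value reported by mmap(). */
int try_anon_map(size_t len, int sharing_flag)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_ANONYMOUS | sharing_flag, -1, 0);
    if (p == MAP_FAILED)
        return errno;

    munmap(p, len); /* this sketch only probes; it does not keep the mapping */
    return 0;
}
```

Calling try_anon_map(204800, 0) mirrors the failing call from the question, and try_anon_map(204800, MAP_PRIVATE) mirrors the fixed one.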

Related

Can I create a circular buffer on Linux? (current code segfaults)

Inspired by this example for Windows. In short, they create a file handle (with CreateFileMapping) then create 2 different pointers to the same memory (MapViewOfFileEx or MapViewOfFile3)
So I tried to do the same thing with shm_open, ftruncate and mmap. I used mmap a few times in the past for memory and files but I never mixed it with shm_open or used shm_open.
My code fails on the second mmap with a segfault. I tried doing a syscall directly on both mmaps and it still segfaults :( How do I do this properly? The idea is that I can do memcpy(p+len-10, src, 20) and have the first 10 bytes of src land at the end of the memory and the last 10 at the start (hence circular).
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <assert.h>
int main()
{
write(2, "Start\n", 6); //prints this
int len = 1024*1024*2;
int fd = shm_open("example", O_RDWR | O_CREAT, 0777);
assert(fd > 0); //ok
int r1 = ftruncate(fd, len);
assert(r1 == 0); //ok
char*p = (char*)mmap(0, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
assert((long long)p>0); //ok
//Segfaults on next line
char*p2 = (char*)mmap(p+len, len, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_FIXED, fd, 0); //segfaults
write(2, "Finish\n", 7); //doesn't print this
return 0;
}
Linux usually selects address space for mappings starting from a certain point and goes lower with each reservation. So your 2nd mmap call replaces one of the previous file mappings (likely libc.so), which leads to SIGSEGV with SEGV_ACCERR (invalid access permissions). You are overwriting an executable section of libc.so (one that is being executed right now) with non-executable data.
Use strace to check what is going on inside:
$ strace ./a.out
...
openat(AT_FDCWD, "/dev/shm/example", O_RDWR|O_CREAT|O_NOFOLLOW|O_CLOEXEC, 0777) = 3
ftruncate(3, 2097152) = 0
mmap(NULL, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE, 3, 0) = 0x7f134c1bf000
mmap(0x7f134c3bf000, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0x7f134c3bf000
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_ACCERR, si_addr=0x7f134c4ccc37} ---
+++ killed by SIGSEGV +++
Compare addresses you are passing around with /proc/$pid/maps file and you will see what you are overwriting.
Your mistake was to assume MAP_FIXED can be used without reserving memory beforehand. To do this properly you need to:
1. Reserve memory by calling mmap with len * 2 size, PROT_NONE and MAP_ANONYMOUS | MAP_PRIVATE (and without file)
2. Use mmap with MAP_FIXED to overwrite portions of that mapping with the content you need
Additionally, you should prefer memfd_create over shm_open on Linux to keep shared memory files from staying around. Unlinking them with shm_unlink doesn't help if your program crashes before it gets there. memfd_create also gives you a file that is private to your program instance.
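The two steps above, combined with memfd_create, might look like the sketch below. It assumes Linux with a glibc that provides memfd_create, and a len that is a multiple of the page size; the helper name ring_map is invented for this example. Note one deliberate change from the question's code: both views use MAP_SHARED rather than MAP_PRIVATE, because with private copy-on-write mappings a write through one view would not show up in the other.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* needed for memfd_create() on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns the base of a 2*len region whose second
 * half aliases the first half, or NULL on failure.
 * Assumption: len is a multiple of the page size. */
char *ring_map(size_t len)
{
    int fd = memfd_create("ring", 0); /* anonymous file, gone when closed */
    if (fd == -1)
        return NULL;
    if (ftruncate(fd, (off_t)len) == -1) {
        close(fd);
        return NULL;
    }

    /* Step 1: reserve 2*len of address space that nothing else can take. */
    char *base = mmap(NULL, 2 * len, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) {
        close(fd);
        return NULL;
    }

    /* Step 2: overlay both halves of the reservation with the same file. */
    void *lo = mmap(base, len, PROT_READ | PROT_WRITE,
                    MAP_SHARED | MAP_FIXED, fd, 0);
    void *hi = mmap(base + len, len, PROT_READ | PROT_WRITE,
                    MAP_SHARED | MAP_FIXED, fd, 0);
    close(fd); /* the mappings keep the file alive */

    if (lo == MAP_FAILED || hi == MAP_FAILED) {
        munmap(base, 2 * len);
        return NULL;
    }
    return base;
}
```

With p = ring_map(len), a call such as memcpy(p + len - 10, src, 20) leaves the first 10 bytes of src at the end of the buffer and the last 10 at its start, which is the wrap-around behavior the question asks for.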
You do not need to call mmap again to get a new pointer. (In fact, you must not.) Just increment the one you have.
The pointer p2 will not point to the address just after the allocated memory block.
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <assert.h>
int main()
{
write(2, "Start\n", 6); //prints this
int len = 1024*1024*2;
int fd = shm_open("example", O_RDWR | O_CREAT, 0777);
assert(fd > 0); //ok
int r1 = ftruncate(fd, len);
assert(r1 == 0); //ok
char*p = (char*)mmap(0, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
assert((long long)p>0); //ok
char*p2 = p+len;
write(2, "Finish\n", 7); //prints this now
return 0;
}

map shared memory size using mmap more than set size done by ftruncate

I have a few questions based on the source below:
#include <unistd.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <stdio.h>
int g;
int main(void) {
int fd = shm_open("/myregion", O_CREAT | O_RDWR, S_IRUSR | S_IWUSR);
ftruncate(fd, sizeof(int)); // set size by sizeof(int)
int *p1 = mmap(NULL, 10*sizeof(int), PROT_READ | PROT_WRITE, MAP_SHARED, fd,0); //now map 10*sizeof(int).
if (p1== MAP_FAILED) {
printf("*****************error");
}
*p1 = mmap(NULL, 8*sizeof(int), PROT_READ | PROT_WRITE, MAP_SHARED, fd,0);
if (p1== MAP_FAILED) {
printf("*****************error");
}
*p1=89;
return g;
}
Question 1:
Why don't I see any error when I set the size to sizeof(int) and then map 10*sizeof(int)?
Question 2:
How many instances of shared memory are created here? I mean, is only one shared memory region created, or two, since I called mmap twice?
Thanks
Given the code
#include <unistd.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <stdio.h>
int g;
int main(void) {
int fd = shm_open("/myregion", O_CREAT | O_RDWR, S_IRUSR | S_IWUSR);
ftruncate(fd, sizeof(int)); // set size by sizeof(int)
int *p1 = mmap(NULL, 10*sizeof(int), PROT_READ | PROT_WRITE, MAP_SHARED, fd,0); //now map 10*sizeof(int).
*p1 = mmap(NULL, 8*sizeof(int), PROT_READ | PROT_WRITE, MAP_SHARED, fd,0);
if (!p1){
printf("*****************error");
}
*p1 = g;
*p1=89;
return g;
}
Question 1: Why don't I see any error when I set the size to sizeof(int) and then map 10*sizeof(int)?
Because you don't check the return value from mmap(), you don't know if an error happened or not. By immediately calling mmap() again with
*p1 = mmap(NULL, 8*sizeof(int), PROT_READ | PROT_WRITE, MAP_SHARED, fd,0);
you mask any potential errors from the first mmap() call, and you also leak any memory that was successfully allocated.
Question 2: How many instances of shared memory are created here? I mean, is only one shared memory region created, or two, since I called mmap twice?
If the first call succeeded, you mapped two memory segments. If it failed, you mapped only one, provided the second call succeeded.
If the first call did succeed, you leaked the memory.
Note that if you tried to write to the mmap()'d segment past the end of the size of the file you set with ftruncate(), you will not cause the file to grow. Per the POSIX mmap() documentation:
Note that references beyond the end of the object do not extend the object as the new end cannot be determined precisely by most virtual memory hardware. Instead, the size can be directly manipulated by ftruncate().
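On Linux, trying to access mmap()'d data in a page that lies entirely beyond the end of the mapped file will likely result in your process receiving a SIGBUS signal. Within the file's last partial page, bytes past the end of the file read as zero, and writes to them are not carried through to the file.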
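To make the "does not extend the object" point concrete, here is a small sketch. It uses an unlinked temporary file from mkstemp() as a stand-in for the shared memory object (an assumption made so the example needs no extra libraries), and the function name size_after_write_past_eof is invented. It checks the return value of mmap(), writes past the end of the file but inside its only page, and then asks fstat() for the file size.

```c
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical demo: returns the file size after writing past EOF through
 * an oversized mapping, or -1 if any step fails. */
long size_after_write_past_eof(void)
{
    char path[] = "/tmp/mmap-eof-XXXXXX"; /* assumed writable temp location */
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path); /* the file disappears once fd is closed */

    if (ftruncate(fd, sizeof(int)) == -1) { /* file is sizeof(int) bytes */
        close(fd);
        return -1;
    }

    /* Map more than the file holds; this succeeds, so check for MAP_FAILED
     * rather than assuming an error would have been visible some other way. */
    size_t map_len = 10 * sizeof(int);
    int *p = mmap(NULL, map_len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    p[5] = 89; /* past EOF, but still inside the file's single page */

    struct stat st;
    long size = (fstat(fd, &st) == 0) ? (long)st.st_size : -1;

    munmap(p, map_len);
    close(fd);
    return size;
}
```

The write to p[5] lands in the zero-filled tail of the last page and is not written back, so the size reported by fstat() stays at sizeof(int). Touching a page entirely beyond the file, for example at an offset of several page sizes in a larger mapping, is the case that raises SIGBUS.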

Using mmap() instead of malloc()

I am trying to complete an exercise that is done with system calls and need to allocate memory for a struct *. My code is:
myStruct * entry = (myStruct *)mmap(0, SIZEOF(myStruct), PROT_READ|PROT_WRITE,
MAP_ANONYMOUS, -1, 0);
To clarify, I cannot use malloc() but can use mmap(). I had no issues with this on Windows in NetBeans; however, now that I'm compiling and running from the command line on Ubuntu, I get "Segmentation Fault" each time I try to access it.
Is there a reason why it will work on one and not the other, and is mmap() a valid way of allocating memory in this fashion? My worry was I was going to be allocating big chunks of memory for each mmap() call initially, now I just cannot get it to run.
Additionally, the error returned by mmap is 22 (Invalid argument). I did some troubleshooting while writing the question, so the error check isn't in the code above. The address is 0, the custom SIZEOF() function works in other mmap arguments, and I am using MAP_ANONYMOUS, so the fd and offset parameters must be -1 and 0 respectively.
Is there something wrong with the PROT_READ|PROT_WRITE sections?
You need to specify MAP_PRIVATE in your flags.
myStruct * entry = (myStruct *)mmap(0, SIZEOF(myStruct),
PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
From the manual page:
The flags argument determines whether updates to the mapping are
visible to other processes mapping the same region, and whether
updates are carried through to the underlying file. This behavior is
determined by including exactly one of the following values in flags:
You need exactly one of the flags MAP_PRIVATE or MAP_SHARED - but you didn't give either of them.
A complete example:
#include <sys/mman.h>
#include <stdio.h>
typedef struct
{
int a;
int b;
} myStruct;
int main()
{
myStruct * entry = (myStruct *)mmap(0, sizeof(myStruct),
PROT_READ|PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
if (entry == MAP_FAILED) {
printf("Map failed.\n");
}
else {
entry->a = 4;
printf("Success: entry=%p, entry->a = %d\n", entry, entry->a);
}
return 0;
}
(The above, without MAP_PRIVATE of course, is a good example of what you might have provided as an MCVE. This makes it much easier for others to help you, since they can see exactly what you've done, and test their proposed solutions. You should always provide an MCVE).
The man page for mmap() says that you must specify exactly one of MAP_SHARED and MAP_PRIVATE in the flags argument. In your case, to act like malloc(), you'll want MAP_PRIVATE:
myStruct *entry = mmap(0, sizeof *entry,
PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
(I've also made this more idiomatic C by omitting the harmful cast and matching the sizeof to the actual variable rather than its type).
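Since the goal is to stand in for malloc(), a pair of thin wrappers may be a convenient way to package the call. This is only a sketch: the names page_alloc and page_free are invented here, and unlike free(), the release function needs the size passed back in because munmap() requires a length. Keep in mind that the kernel rounds every mapping up to whole pages, so each small struct allocated this way occupies at least one page.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc under strict -std= modes */
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical malloc-like wrapper: returns zero-filled private memory,
 * or NULL on failure (mmap itself reports failure as MAP_FAILED). */
void *page_alloc(size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return (p == MAP_FAILED) ? NULL : p;
}

/* Hypothetical free-like wrapper: the caller must supply the same size
 * that was passed to page_alloc(). Returns 0 on success, -1 on error. */
int page_free(void *p, size_t size)
{
    return munmap(p, size);
}
```

Usage would follow the earlier example: myStruct *entry = page_alloc(sizeof *entry); then check for NULL, use entry, and finish with page_free(entry, sizeof *entry).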

Implementing copy-on-write buffer with mmap on Mac OS X

I've been playing around with copy-on-write buffers on Linux and the following example seems to work as intended:
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#define SIZE 4096
#define SHM_NAME "foobar"
int main(void)
{
int fd = shm_open(SHM_NAME, O_RDWR | O_CREAT, 0666);
int r = ftruncate(fd, SIZE);
char *buf1 = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
MAP_SHARED, fd, 0);
strcpy(buf1, "Original buffer");
char *buf2 = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
MAP_PRIVATE, fd, 0);
// At this point buf2 is aliased to buf1
// Now modifying buf2 should trigger copy-on-write...
strcpy(buf2, "Modified buffer");
// buf1 and buf2 are now two separate buffers
strcpy(buf1, "Modified original buffer");
// clean up
r = munmap(buf2, SIZE);
printf("munmap(buf2): %i\n", r);
r = munmap(buf1, SIZE);
printf("munmap(buf1): %i\n", r);
r = shm_unlink(SHM_NAME);
printf("shm_unlink: %i\n", r);
return EXIT_SUCCESS;
}
However under OS X (10.10) the second mmap call returns MAP_FAILED, with errno = 22 (EINVAL). The OS X man page for mmap seems to suggest that this should work (it even mentions copy-on-write in the description of the MAP_PRIVATE flag), and I've experimented with various different flags for the calls to mmap, but nothing seems to work. Any ideas ?
It appears that using shm_open with MAP_SHARED and MAP_PRIVATE does something undesirable with the file descriptor. Using open is a possible workaround:
int fd = open(SHM_NAME, O_RDWR | O_CREAT, 0666);
...
Result:
munmap(buf2): 0
munmap(buf1): 0
shm_unlink: -1
Using shm_open with MAP_SHARED and MAP_PRIVATE results in an Invalid file descriptor, although using it with MAP_SHARED and MAP_SHARED for example does not. It's unclear to me whether this is a bug, or by design - the behavior does not seem correct though.

mprotect on a mmap-ed shared memory segment

When two processes share a segment of memory opened with shm_open and then mmap-ed, does calling mprotect on a portion of the shared memory in one process affect the permissions seen by the other process on that same portion? In other words, if one process makes part of the shared memory segment read-only, does it become read-only for the other process too?
I always like to address those questions in two parts.
Part 1 - Let's test it
Let's consider an example that is relatively similar to the one at shm_open(3).
Shared header - shared.h
#include <sys/mman.h>
#include <fcntl.h>
#include <semaphore.h>
#include <sys/stat.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#define errExit(msg) do { perror(msg); exit(EXIT_FAILURE); } while (0)
struct shmbuf {
char buf[4096];
sem_t sem;
};
Creator process - creator.c (Compile using gcc creator.c -o creator -lrt -lpthread)
#include <ctype.h>
#include <string.h>
#include "shared.h"
int
main(int argc, char *argv[])
{
if (argc != 2) {
fprintf(stderr, "Usage: %s /shm-path\n", argv[0]);
exit(EXIT_FAILURE);
}
char *shmpath = argv[1];
int fd = shm_open(shmpath, O_CREAT | O_EXCL | O_RDWR,
S_IRUSR | S_IWUSR);
if (fd == -1)
errExit("shm_open");
struct shmbuf *shm;
if (ftruncate(fd, sizeof(*shm)) == -1)
errExit("ftruncate");
shm = mmap(NULL, sizeof(*shm), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
if (shm == MAP_FAILED)
errExit("mmap");
if (sem_init(&shm->sem, 1, 0) == -1)
errExit("sem_init");
if (mprotect(&shm->buf, sizeof(shm->buf), PROT_READ) == -1)
errExit("mprotect");
if (sem_wait(&shm->sem) == -1)
errExit("sem_wait");
printf("got: %s\n", shm->buf);
shm_unlink(shmpath);
exit(EXIT_SUCCESS);
}
Writer process - writer.c (Compile using gcc writer.c -o writer -lrt -lpthread)
#include <string.h>
#include "shared.h"
int
main(int argc, char *argv[])
{
if (argc != 3) {
fprintf(stderr, "Usage: %s /shm-path string\n", argv[0]);
exit(EXIT_FAILURE);
}
char *shmpath = argv[1];
char *string = argv[2];
int fd = shm_open(shmpath, O_RDWR, 0);
if (fd == -1)
errExit("shm_open");
struct shmbuf *shm = mmap(NULL, sizeof(*shm), PROT_READ | PROT_WRITE,
MAP_SHARED, fd, 0);
if (shm == MAP_FAILED)
errExit("mmap");
strcpy(&shm->buf[0], string);
if (sem_post(&shm->sem) == -1)
errExit("sem_post");
exit(EXIT_SUCCESS);
}
What's supposed to happen?
The creator process creates a new shared memory object, "sets" its size, maps it into memory (as shm), uses mprotect to allow only reads of the buffer (shm->buf), and waits on a semaphore to know when the writer (which we will discuss in a moment) finishes its thing.
The writer process starts, opens the same shared memory object, writes into it whatever we tell it to and signals the semaphore.
Question is, will the writer be able to write to the shared memory object even though the creator changed the protection to READ-ONLY?
Let's find out. We can run it using:
# ./creator /shmulik &
# ./writer /shmulik hi!
got: hi!
#
[1]+ Done ./creator /shmulik
As you can see, the writer was able to write to the shared memory, even though the creator set its protection to READ-ONLY.
Maybe the creator does something wrong? Let's try adding the following line to creator.c:
if (mprotect(&shm->buf, sizeof(shm->buf), PROT_READ) == -1)
errExit("mprotect");
memset(&shm->buf, 0, sizeof(shm->buf)); // <-- This is the new line
if (sem_wait(&shm->sem) == -1)
errExit("sem_wait");
Let's recompile & run the creator again:
# gcc creator.c -o creator -lrt -lpthread
# ./creator /shmulik
Segmentation fault
As you can see, the mprotect worked as expected.
How about we let the writer map the shared memory, then we change the protection? Well, it ain't going to change anything. mprotect ONLY affects the memory protection of the process calling it (and its descendants).
Part 2 - Let's understand it
First, you have to understand that shm_open is a glibc function, not a system call.
You can get the glibc source code from their website and look for shm_open to see that for yourself.
The underlying implementation of shm_open is a regular call to open, just like the man page suggests.
As we already saw, most of the magic happens in mmap. When calling mmap, we have to use MAP_SHARED (rather than MAP_PRIVATE), otherwise every process is going to get a private memory segment to begin with, and obviously one ain't gonna affect the other.
When we call mmap, the hops are roughly:
1. ksys_mmap_pgoff
2. vm_mmap_pgoff
3. do_mmap
4. mmap_region
At that last point, you could see that we take the process' memory management context mm and allocate a new virtual memory area vma:
struct mm_struct *mm = current->mm;
...
vma = vm_area_alloc(mm);
...
vma->vm_page_prot = vm_get_page_prot(vm_flags);
This memory area is not shared with other processes.
Since mprotect changes only the vm_page_prot on the per-process vma, it doesn't affect other processes that map the same memory space.
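The per-mapping nature of the protection can also be observed inside a single process, which may be easier to experiment with than the two-program setup. The sketch below is an illustration under assumptions rather than a restatement of the test above: it uses an unlinked temporary file from mkstemp() in place of a shm_open object, maps it twice with MAP_SHARED, makes one view read-only, and then writes through the other. The function name write_through_second_view is made up.

```c
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical demo: returns the byte seen through the read-only view
 * after writing through the second view, or -1 if any step fails. */
int write_through_second_view(void)
{
    char path[] = "/tmp/mprotect-demo-XXXXXX"; /* assumed writable temp location */
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || ftruncate(fd, page) == -1) {
        close(fd);
        return -1;
    }

    /* Two independent mappings (two vmas) of the same shared page. */
    char *ro = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    char *rw = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    if (ro == MAP_FAILED || rw == MAP_FAILED) {
        if (ro != MAP_FAILED) munmap(ro, (size_t)page);
        if (rw != MAP_FAILED) munmap(rw, (size_t)page);
        return -1;
    }

    int seen = -1;
    /* Restrict only the first view; the second keeps PROT_WRITE. */
    if (mprotect(ro, (size_t)page, PROT_READ) == 0) {
        rw[0] = 'x';  /* allowed: this view was never restricted */
        seen = ro[0]; /* same physical page, so the write is visible here */
    }

    munmap(ro, (size_t)page);
    munmap(rw, (size_t)page);
    return seen;
}
```

Writing through ro after the mprotect call would fault, just like the memset in the creator did, while the write through rw goes through and is visible via ro.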
The POSIX specification for mprotect() suggests that changes in the protection of shared memory should affect all processes using that shared memory.
Two of the error conditions detailed are:
[EAGAIN]
The prot argument specifies PROT_WRITE over a MAP_PRIVATE mapping and there are insufficient memory resources to reserve for locking the private page.
[ENOMEM]
The prot argument specifies PROT_WRITE on a MAP_PRIVATE mapping, and it would require more space than the system is able to supply for locking the private pages, if required.
These strongly suggest that memory that is mapped with MAP_SHARED should not fail because of a lack of memory for making copies.
See also the POSIX specification for mmap().

Resources