Confused about mmap() return - two different pointers? - c

Currently trying to understand how memory mapping works in Linux (or in general, really), and I'm following with this one example of Shared Memory in POSIX systems from Operating System Concepts. The two files are as follows:
Producer file
// This is the producer file for the Shared memory object
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <sys/shm.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <errno.h>
#include <sys/mman.h>
int main()
{
// ---------------- PRODUCER ESTABLISHES SHARED MEMORY OBJECT AND WRITES TO IT ----------------
// Specifying the size in bytes of the shared memory object
const int SIZE = 4096;
// Name of the shared memory space
const char *name = "OS";
// The actual strings to write to shared memory
const char *message_0 = "Hello";
const char *message_1 = "World!";
// Shared memory file descriptor will be stored in this
int fd;
// Pointer to the shared memory object will be stored in this
char *ptr;
// Checking error
int errnum;
// Create a shared memory object. This opens (establishes a connection to) a shared memory object.
fd = shm_open(name, O_CREAT | O_RDWR,0666);
if (fd == -1)
{
errnum = errno;
fprintf(stdout, "Value of errno: %d\n", errno);
perror("Error printed by perror");
fprintf(stdout, "Error opening file: %s\n", strerror(errnum));
return 1;
}
// Configure the size of the shared-memory object to be 4096 bytes
ftruncate(fd, SIZE);
// Memory map the shared memory object
ptr = (char *) mmap(0, SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
printf("Ptr is: %p\n", ptr);
if (ptr == MAP_FAILED)
{
errnum = errno;
fprintf(stdout, "Value of errno in ptr: %d\n", errno);
perror("Error printed by perror");
fprintf(stdout, "Error opening file: %s\n", strerror(errnum));
return 1;
}
// Write to the shared memory object
sprintf(ptr, "%s", message_0);
ptr += strlen(message_0);
sprintf(ptr, "%s", message_1);
ptr += strlen(message_1);
return 0;
}
Consumer file
// This is the consumer file for the Shared memory object, in which it reads what is in the memory object OS
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <sys/shm.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <errno.h>
#include <sys/mman.h>
int main()
{
// Size in bytes of shared memory object
const int SIZE = 4096;
// Name of the shared memory space
const char *name = "OS";
// Shared memory file descriptor will be stored in this
int fd;
// Pointer to the shared memory object will be stored in this
char *ptr;
// Checking error
int errnum;
// Open the shared memory object
fd = shm_open(name, O_RDWR, 0666);
// If error in shm_open()
if (fd == -1)
{
errnum = errno;
fprintf(stdout, "Value of errno: %d\n", errno);
perror("Error printed by perror");
fprintf(stdout, "Error opening file: %s\n", strerror(errnum));
return 1;
}
// Memory-map the shared memory object
ptr = (char *) mmap(0, SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
printf("Ptr is: %p\n", ptr);
if (ptr == MAP_FAILED)
{
errnum = errno;
fprintf(stdout, "Value of errno in ptr: %d\n", errno);
perror("Error printed by perror");
fprintf(stdout, "Error opening file: %s\n", strerror(errnum));
return 1;
}
// Read from the shared memory object
printf("%s\n", (char *) ptr);
// Remove the shared memory object (delete it)
shm_unlink(name);
return 0;
}
When I print the pointer to the shared memory object (printf("Ptr value: %p\n, ptr)), I get two different values for the consumer and producer files. Why does this happen?
As I understand, the pointer ptr points at the shared memory object, which is in the physical memory. This physical memory is just shared amongst two processes, by mapping it onto their address space. However, this would require that the pointer point to the same address in physical memory, no? Or is it pointing to the virtual memory (i.e. the address space of the processes)? If so, does that itself point to the physical memory?
Thanks!

Each process has its own virtual memory address space. The mappings from virtual memory to physical memory are, in general, different for each process. While the shared memory may be in physical memory at address 0x1000, the producer process may have it mapped into virtual memory at address 0x7000, and the consumer process may have it mapped into virtual memory at address 0x4000.
Further, if the shared memory is swapped out of memory for some reason, the system could later reload it to a different physical address, say 0x13000, and update the mappings in the processes so that it appears at the same addresses as before in each of the producer and consumer processes.

Producer and consumer are two different processes, so they have their own virtual memory space each. When you attach the shared segment, you specify the kernel that you have no preference on where (in your unallocated virtual address space) to put it, so you normally get different pointers because most probably both processes have different memory maps. Just think on an scenario where one process has allocated the memory map at, say, address A, and the other process has that address A occupied by some other thing it has allocated (e.g. dynamic memory for the heap, or a different shared library) It is clear that the kernel cannot allocate two different things in the same (virtual) address range (or even overlap some other mapping), so it uses a differen place, and returns a different pointer.
Just don't worry about it, as both virtual addresses map to the same phisical address (this is sure, as it is a shared memory segment), and the pointers finally end pointing to the same thing. The kernel is normally not aware of where it has placed that same segment in a different process, it doesn't need to do. This means that it is highly improbable that you get both pointers equal, contrary to what you though.
As none of these processes knows about the virtual address space of the other, there's no conflict, but never pass the address to the other process to use, because an address in a virtual address space has absolutely no meaning in the virtual address space of another, different process.
A process has normally no means (and no possibility) to know how the kernel assings the memory and builds the map of virtual to physicall addresses. Even for itself (the kernel also runs in it's virtual memory space, and only deals with the tables that map ---these tables are in the virtual address space of the kernel, but not in the virtual address space of any user process, so this is the reason that makes the kernel capable of changing the mappings but it is not possible for other processes)
It is impossible for a process to know where in actual memory a virtual address points to. That is possible for some administrator processes through a device (and a special device is needed for this) /dev/mem. But if you try to look in there without the actual mapping you'll get a mess of pages some belonging to a process, some to another, with no apparent structure, or you cannot also know what does it mean the contents of those pages.
There's another device /dev/kmem that maps to the kernel virtual address space, and allows a process (e.g. a debugger) to get access to the mapping table there.... but this is a very dangerous bend, as you can easily crash your system by tweaking there. Think that you cannot normally stop the kernel (this stops the system) or if you can do, you will stop essential parts of the system that should be running.

Related

How to create Posix shared memory to work with integer

here is a bit of background on what I am trying to do for this class assignment. I am going to have 10 total programs that are going using semaphores to be synchronized to accessing a shared memory construct that is going to hold an Integer. This integer will hold the amount of text that has currently been read. How exactly am I am going to take this setup from my textbook for strings and convert it for use with Integers?
The 9 of the programs will only increment the value of the shared memory and then one of the programs will just read the shared memory value for later use. "Producer" is going to be the 9 programs that will increment the value of the shared memory. "Consumer" is going to be the single program that will read the value later on to do separate operations.
The "Consumer" program will be initializing the shared memory and setting the int value to 0.
Below is the "Producer" program that will build and write to the shared memory construct.
#include <stdio.h>
#include <stlib.h>
#include <string.h>
#include <fcntl.h>
#include <sys/shm.h>
#include <sys/stat.h>
int main()
{
/* the size (in bytes) of shared memory object */
const int SIZE 4096;
/* name of the shared memory object */
const char *name = "OS";
/* strings written to shared memory */
const char *message 0 = "Hello";
const char *message 1 = "World!";
/* shared memory file descriptor */
int shm fd;
/* pointer to shared memory obect */
void *ptr;
/* create the shared memory object */
shm fd = shm open(name, O CREAT | O RDRW, 0666);
/* configure the size of the shared memory object */
ftruncate(shm fd, SIZE);
/* memory map the shared memory object */
ptr = mmap(0, SIZE, PROT WRITE, MAP SHARED, shm fd, 0);
/* write to the shared memory object */
sprintf(ptr,"%s",message 0);
ptr += strlen(message 0);
sprintf(ptr,"%s",message 1);
ptr += strlen(message 1);
return 0;
}
Now below is the "Consumer" program that will read from the shared memory
#include <stdio.h>
#include <stlib.h>
#include <fcntl.h>
#include <sys/shm.h>
#include <sys/stat.h>
int main()
{
/* the size (in bytes) of shared memory object */
const int SIZE 4096;
/* name of the shared memory object */
const char *name = "OS";
/* shared memory file descriptor */
int shm fd;
/* pointer to shared memory obect */
void *ptr;
/* open the shared memory object */
shm fd = shm open(name, O RDONLY, 0666);
/* memory map the shared memory object */
ptr = mmap(0, SIZE, PROT READ, MAP SHARED, shm fd, 0);
/* read from the shared memory object */
printf("%s",(char *)ptr);
/* remove the shared memory object */
shm unlink(name);
return 0;
}
My idea of how to change this is to end up having the following changes for the Producer
change SIZE to 4 bytes for the size of Int
read the value of the int from shared memory assign that value to shmValue
shmValue then gets incremented
shmValue then gets written back to shared memory
My idea of how to change the Consumer program
change SIZE to 4 bytes for the size of Int
initialize the shared memory to int with a value of 0
reads shared memory to shmValue
then uses shmValue for later operations.
Again, guys, I would very much appreciate the help in dealing with this issue that I am having in understanding how to use this technology. Thank you for the help in advance!!

C/Linux: dual-map memory with different permissions

My program passes data pointers to third-party plugins with the intention that the data should be read-only, so it would be nice to prevent the plugins from writing to the data objects. Ideally there would be a segfault if a plugin attempts a write. I've heard there is some way to double-map a memory region, such that a second virtual address range points to the same physical memory pages. The second mapping would not have write permission, and the exported pointers would use this address range instead of the original (writable) one. I would prefer not to change the original memory allocations, whether they happen to use malloc or mmap or whatever. Can someone explain how to do this?
It is possible to get a dual mapping, but it requires some work.
The only way I know how to create such a dual mapping is to use the mmap function call. For mmap you need some kind of file-descriptor. Fortunately Linux allows you to get a shared memory object, so no real file on a storage medium is required.
Here is a complete example that shows how to create the shared memory object, creates a read/write and read-only pointer from it and then does some basic tests:
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
int main()
{
// Lets do this demonstration with one megabyte of memory:
const int len = 1024*1024;
// create shared memory object:
int fd = shm_open("/myregion", O_CREAT | O_RDWR, S_IRUSR | S_IWUSR);
printf ("file descriptor is %d\n", fd);
// set the size of the shared memory object:
if (ftruncate(fd, len) == -1)
{
printf ("setting size failed\n");
return 0;
}
// Now get two pointers. One with read-write and one with read-only.
// These two pointers point to the same physical memory but will
// have different virtual addresses:
char * rw_data = mmap(0, len, PROT_READ|PROT_WRITE, MAP_SHARED, fd,0);
char * ro_data = mmap(0, len, PROT_READ , MAP_SHARED, fd,0);
printf ("rw_data is mapped to address %p\n", rw_data);
printf ("ro_data is mapped to address %p\n", ro_data);
// ===================
// Simple test-bench:
// ===================
// try writing:
strcpy (rw_data, "hello world!");
if (strcmp (rw_data, "hello world!") == 0)
{
printf ("writing to rw_data test passed\n");
} else {
printf ("writing to rw_data test failed\n");
}
// try reading from ro_data
if (strcmp (ro_data, "hello world!") == 0)
{
printf ("reading from ro_data test passed\n");
} else {
printf ("reading from ro_data test failed\n");
}
printf ("now trying to write to ro_data. This should cause a segmentation fault\n");
// trigger the segfault
ro_data[0] = 1;
// if the process is still alive something didn't worked.
printf ("writing to ro_data test failed\n");
return 0;
}
Compile with: gcc test.c -std=c99 -lrt
For some reason I get a warning that ftruncate is not declared. No idea why. The code works well though. Example output:
file descriptor is 3
rw_data is mapped to address 0x7f1778d60000
ro_data is mapped to address 0x7f1778385000
writing to rw_data test passed
reading from ro_data test passed
now trying to write to ro_data. This should cause a segmentation fault
Segmentation fault
I've left the memory deallocation as an exercise for the reader :-)

mprotect on a mmap-ed shared memory segment

When two processes share a segment of memory opened with shm_open and then it gets mmap-ed, does doing an mprotect on a portion of the shared memory in one process affects the permissions seen by the other process on this same portion? In other words, if one process makes part of the shared memory segment read-only, does it become read-only for the other process too?
I always like to address those questions in two parts.
Part 1 - Let's test it
Let's consider an example that is relatively similar to the one at shm_open(3).
Shared header - shared.h
#include <sys/mman.h>
#include <fcntl.h>
#include <semaphore.h>
#include <sys/stat.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#define errExit(msg) do { perror(msg); exit(EXIT_FAILURE); } while (0)
struct shmbuf {
char buf[4096];
sem_t sem;
};
Creator process - creator.c (Compile using gcc creator.c -o creator -lrt -lpthread)
#include <ctype.h>
#include <string.h>
#include "shared.h"
int
main(int argc, char *argv[])
{
if (argc != 2) {
fprintf(stderr, "Usage: %s /shm-path\n", argv[0]);
exit(EXIT_FAILURE);
}
char *shmpath = argv[1];
int fd = shm_open(shmpath, O_CREAT | O_EXCL | O_RDWR,
S_IRUSR | S_IWUSR);
if (fd == -1)
errExit("shm_open");
struct shmbuf *shm;
if (ftruncate(fd, sizeof(*shm)) == -1)
errExit("ftruncate");
shm = mmap(NULL, sizeof(*shm), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
if (shm == MAP_FAILED)
errExit("mmap");
if (sem_init(&shm->sem, 1, 0) == -1)
errExit("sem_init");
if (mprotect(&shm->buf, sizeof(shm->buf), PROT_READ) == -1)
errExit("mprotect");
if (sem_wait(&shm->sem) == -1)
errExit("sem_wait");
printf("got: %s\n", shm->buf);
shm_unlink(shmpath);
exit(EXIT_SUCCESS);
}
Writer process - writer.c (Compile using gcc writer.c -o writer -lrt -lpthread)
#include <string.h>
#include "shared.h"
int
main(int argc, char *argv[])
{
if (argc != 3) {
fprintf(stderr, "Usage: %s /shm-path string\n", argv[0]);
exit(EXIT_FAILURE);
}
char *shmpath = argv[1];
char *string = argv[2];
int fd = shm_open(shmpath, O_RDWR, 0);
if (fd == -1)
errExit("shm_open");
struct shmbuf *shm = mmap(NULL, sizeof(*shm), PROT_READ | PROT_WRITE,
MAP_SHARED, fd, 0);
if (shm == MAP_FAILED)
errExit("mmap");
strcpy(&shm->buf[0], string);
if (sem_post(&shm->sem) == -1)
errExit("sem_post");
exit(EXIT_SUCCESS);
}
What's supposed to happen?
The creator process, creates a new shared memory object, "sets" its size, maps it into memory (as shm), uses mprotect to allow only writes to the buffer (shm->buf) and waits for a semaphore to know when the writer (which we will discuss in a moment finishes its thing).
The writer process starts, opens the same shared memory object, writes into it whatever we tell it to and signals the semaphore.
Question is, will the writer be able to write to the shared memory object even though the creator changed the protection to READ-ONLY?
Let's find out. We can run it using:
# ./creator.c /shmulik &
# ./writer.c /shmulik hi!
got: hi!
#
[1]+ Done ./creator /shmulik
As you can see, the writer was able to write to the shared memory, even though the creator set it's protection to READ-ONLY.
Maybe the creator does something wrong? Let's try to add the following line to creator.c:
if (mprotect(&shm->buf, sizeof(shm->buf), PROT_READ) == -1)
errExit("mprotect");
memset(&shm->buf, 0, sizeof(shm->buf)); // <-- This is the new line
if (sem_wait(&shm->sem) == -1)
errExit("sem_wait");
Let's recompile & run the creator again:
# gcc creator.c -o creator -lrt -lpthread
# ./creator /shmulik
Segmentation fault
As you can see, the mprotect worked as expected.
How about we let the writer map the shared memory, then we change the protection? Well, it ain't going to change anything. mprotect ONLY affects the memory protection of the process calling it (and it's descendants).
Part 2 - Let's understand it
First, you have to understand that shm_open is a glibc method, it's not a systemcall.
You can get the glibc source code from their website and just look for shm_open to see that yourself.
The underlying implementation of shm_open is a regular call for open, just like the man page suggests.
As we already saw, most of the magic happens in mmap. When calling mmap, we have to use MAP_SHARED (rather than MAP_PRIVATE), otherwise every process is going to get a private memory segment to begin with, and obviously one ain't gonna affect the other.
When we call mmap, the hops are roughly:
ksys_mmap_pgoff.
vm_mmap_pgoff.
do_mmap
mmap_region
At that last point, you could see that we take the process' memory management context mm and allocate a new virtual memory area vma:
struct mm_struct *mm = current->mm;
...
vma = vm_area_alloc(mm);
...
vma->vm_page_prot = vm_get_page_prot(vm_flags);
This memory area is not shared with other processes.
Since mprotect changes only the vm_page_prot on the per-process vma, it doesn't affect other processes that map the same memory space.
The POSIX specification for mprotect() suggests that changes in the protection of shared memory should affect all processes using that shared memory.
Two of the error conditions detailed are:
[EAGAIN]
The prot argument specifies PROT_WRITE over a MAP_PRIVATE mapping and there are insufficient memory resources to reserve for locking the private page.
[ENOMEM]
The prot argument specifies PROT_WRITE on a MAP_PRIVATE mapping, and it would require more space than the system is able to supply for locking the private pages, if required.
These strongly suggest that memory that is mapped with MAP_SHARED should not fail because of a lack of memory for making copies.
See also the POSIX specification for mmap().

Can anyone help me make a Shared Memory Segment in C

I need to make a shared memory segment so that I can have multiple readers and writers access it. I think I know what I am doing as far as the semaphores and readers and writers go...
BUT I am clueless as to how to even create a shared memory segment. I want the segment to hold an array of 20 structs. Each struct will hold a first name, an int, and another int.
Can anyone help me at least start this? I am desperate and everything I read online just confuses me more.
EDIT: Okay, so I do something like this to start
int memID = shmget(IPC_PRIVATE, sizeof(startData[0])*20, IPC_CREAT);
with startData as the array of structs holding my data initialized and I get an error saying
"Segmentation Fault (core dumped)"
The modern way to obtain shared memory is to use the API, provided by the Single UNIX Specification. Here is an example with two processes - one creates a shared memory object and puts some data inside, the other one reads it.
First process:
#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>
#include <fcntl.h>
#define SHM_NAME "/test"
typedef struct
{
int item;
} DataItem;
int main (void)
{
int smfd, i;
DataItem *smarr;
size_t size = 20*sizeof(DataItem);
// Create a shared memory object
smfd = shm_open(SHM_NAME, O_RDWR | O_CREAT, 0600);
// Resize to fit
ftruncate(smfd, size);
// Map the object
smarr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, smfd, 0);
// Put in some data
for (i = 0; i < 20; i++)
smarr[i].item = i;
printf("Press Enter to remove the shared memory object\n");
getc(stdin);
// Unmap the object
munmap(smarr, size);
// Close the shared memory object handle
close(smfd);
// Remove the shared memory object
shm_unlink(SHM_NAME);
return 0;
}
The process creates a shared memory object with shm_open(). The object is created with an initial size of zero, so it is enlarged using ftruncate(). Then the object is memory mapped into the virtual address space of the process using mmap(). The important thing here is that the mapping is read/write (PROT_READ | PROT_WRITE) and it is shared (MAP_SHARED). Once the mapping is done, it can be accessed as a regular dynamically allocated memory (as a matter of fact, malloc() in glibc on Linux uses anonymous memory mappings for larger allocations). Then the process writes data into the array and waits until Enter is pressed. Then it unmaps the object using munmap(), closes its file handle and unlinks the object with shm_unlink().
Second process:
#include <stdio.h>
#include <sys/mman.h>
#include <fcntl.h>
#define SHM_NAME "/test"
typedef struct
{
int item;
} DataItem;
int main (void)
{
int smfd, i;
DataItem *smarr;
size_t size = 20*sizeof(DataItem);
// Open the shared memory object
smfd = shm_open(SHM_NAME, O_RDONLY, 0600);
// Map the object
smarr = mmap(NULL, size, PROT_READ, MAP_SHARED, smfd, 0);
// Read the data
for (i = 0; i < 20; i++)
printf("Item %d is %d\n", i, smarr[i].item);
// Unmap the object
munmap(smarr, size);
// Close the shared memory object handle
close(smfd);
return 0;
}
This one opens the shared memory object for read access only and also memory maps it for read access only. Any attempt to write to the elements of the smarr array would result in segmentation fault being delivered.
Compile and run the first process. Then in a separate console run the second process and observe the output. When the second process has finished, go back to the first one and press Enter to clean up the shared memory block.
For more information consult the man pages of each function or the memory management portion of the SUS (it's better to consult the man pages as they document the system-specific behaviour of these functions).

Why does mmap() fail with ENOMEM on a 1TB sparse file?

I've been working with large sparse files on openSUSE 11.2 x86_64. When I try to mmap() a 1TB sparse file, it fails with ENOMEM. I would have thought that the 64 bit address space would be adequate to map in a terabyte, but it seems not. Experimenting further, a 1GB file works fine, but a 2GB file (and anything bigger) fails. I'm guessing there might be a setting somewhere to tweak, but an extensive search turns up nothing.
Here's some sample code that shows the problem - any clues?
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>
int main(int argc, char *argv[]) {
char * filename = argv[1];
int fd;
off_t size = 1UL << 40; // 30 == 1GB, 40 == 1TB
fd = open(filename, O_RDWR | O_CREAT | O_TRUNC, 0666);
ftruncate(fd, size);
printf("Created %ld byte sparse file\n", size);
char * buffer = (char *)mmap(NULL, (size_t)size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
if ( buffer == MAP_FAILED ) {
perror("mmap");
exit(1);
}
printf("Done mmap - returned 0x0%lx\n", (unsigned long)buffer);
strcpy( buffer, "cafebabe" );
printf("Wrote to start\n");
strcpy( buffer + (size - 9), "deadbeef" );
printf("Wrote to end\n");
if ( munmap(buffer, (size_t)size) < 0 ) {
perror("munmap");
exit(1);
}
close(fd);
return 0;
}
The problem was that the per-process virtual memory limit was set to only 1.7GB. ulimit -v 1610612736 set it to 1.5TB and my mmap() call succeeded. Thanks, bmargulies, for the hint to try ulimit -a!
Is there some sort of per-user quota, limiting the amount of memory available to a user process?
My guess is the the kernel is having difficulty allocating the memory that it needs to keep up with this memory mapping. I don't know how swapped out pages are kept up with in the Linux kernel (and I assume that most of the file would be in the swapped out state most of the time), but it may end up needing an entry for each page of memory that the file takes up in a table. Since this file might be mmapped by more than one process the kernel has to keep up with the mapping from the process's point of view, which would map to another point of view, which would map to secondary storage (and include fields for device and location).
This would fit into your addressable space, but might not fit (at least contiguously) within physical memory.
If anyone knows more about how Linux does this I'd be interested to hear about it.

Resources