Memory Management for Mapped Data in Shared Memory Segments - c

I'm working on a project in C that uses shared memory for IPC on a Linux system. However, I'm a little bit confused about memory management in these segments. I'm using the POSIX API for this project.
I understand how to create the shared segments, and that these persist until a reboot if you fail to properly remove them with shm_unlink(). Additionally, I understand how to do the actual mapping and unmapping with mmap and munmap respectively. However, the usage of these operations and how it affects the stored data in these shared segments is confusing me.
Here is what I'm trying to properly understand:
Let's say I create a segment using shm_open() with the O_CREAT flag. This gives me a file descriptor that I've named msfd in the example below. Now I have a struct that I map into that address space with the following:
mystruct* ms = (mystruct*)mmap(NULL, sizeof(mystruct), PROT_READ | PROT_WRITE, MAP_SHARED, msfd, 0);
//set the elements of the struct here using ms->element = X as usual
Part 1)
Here's where my confusion begins. Let's say that this process is now done accessing that location, since it was just setting data for another process to read. Do I still call munmap()?
I want the other process to still have access to all of the data that the current process has set. Normally, you wouldn't call free() on a malloc'ed pointer until it is permanently no longer needed. However, I understand that when this process exits, the unmapping happens automatically anyway. Is the data persisted inside the segment, or does that segment just get reserved with its allotted size and name?
Part 2)
We're now in the process of the other application that needs to access and read from that shared segment. I understand that we now open that segment with shm_open() and then perform the same mapping operation with mmap(). Now we have access to the structure in that segment. When we call munmap() from this process (NOT the one that created the data) it "unlinks" us from that pointer, however the data is still accessible. Does this assume that process 1 (the creator) has NOT called munmap()?

Is the data persisted inside the segment,
Yes.
does that segment just get reserved with its allotted size and name?
Also yes.
Does this assume that process 1 (the creator) has NOT called munmap()?
No.
The shared memory object gets created via shm_open() with the O_CREAT flag (and sized with ftruncate()), its memory being taken from available OS memory. From that moment on it carries whatever content has been written into it until it is given back to the OS via shm_unlink(). The memory itself is released once the name has been unlinked and no process has the object mapped or open any longer.
shm_open() and shm_unlink() are system oriented: the shared memory is a system-wide resource, not a process-specific one.
mmap() and munmap() are process oriented: they map the system resource into, and unmap it out of, the process's address space.
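To make the split between the system-wide object and the per-process mapping concrete, here is a minimal sketch. The struct layout and the helper names write_segment() and read_segment() are made up for illustration (they are not from the question), and error handling is kept short. The writer unmaps before the reader ever maps, and the data is still there:

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical layout standing in for the question's mystruct. */
typedef struct {
    int  id;
    char text[32];
} mystruct;

/* Create (or open) the object, store one struct, then unmap.
 * Returns 0 on success, -1 on error. */
int write_segment(const char *name, int id, const char *text)
{
    int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
    if (fd == -1)
        return -1;
    if (ftruncate(fd, sizeof(mystruct)) == -1) {  /* give the object a size */
        close(fd);
        return -1;
    }
    mystruct *ms = mmap(NULL, sizeof *ms, PROT_READ | PROT_WRITE,
                        MAP_SHARED, fd, 0);
    close(fd);                       /* the mapping survives the close */
    if (ms == MAP_FAILED)
        return -1;

    ms->id = id;
    strncpy(ms->text, text, sizeof ms->text - 1);
    ms->text[sizeof ms->text - 1] = '\0';

    munmap(ms, sizeof *ms);          /* drops this process's view only */
    return 0;
}

/* Map the existing object read-only and copy the struct out. */
int read_segment(const char *name, mystruct *out)
{
    int fd = shm_open(name, O_RDONLY, 0);
    if (fd == -1)
        return -1;
    const mystruct *ms = mmap(NULL, sizeof *ms, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);
    if (ms == MAP_FAILED)
        return -1;
    *out = *ms;
    munmap((void *)ms, sizeof *ms);
    return 0;
}
```

Call write_segment() in one process and read_segment() in another (or later in the same one), and call shm_unlink() on the name when nobody needs the object any more. On older glibc versions you may need to link with -lrt.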

Related

How to remap a file mmap(2)-ed in memory like shmget

I have a massive file (about 1 TiB) owned by the user 'filehandler', with permissions rwx------. I have mmap(2)-ed it into the 64-bit address space, and that all works. The file is handled by a process running as user 'filehandler'.
Other processes, running as users other than 'filehandler', request services from this handler process. They log in to the handler through a Unix socket and communicate over IPC; all of that works.
For security reasons, the entire file must not be shared with the requesters. Requester processes are allowed to access only some parts of the file.
The best performance would come from sharing memory directly with the requesting processes, but only the allowed parts of the file.
For example, shm hands out a key that other processes use to access a segment, which is a practical way to target a particular requester.
Is there any way to share only the allowed parts of a mmap(2)-ed space to any processes identified like shm technology?
Is there any way to share only the allowed parts of a mmap(2)-ed space to any processes identified like shm technology?
TL;DR: No.
In more detail,
How to remap a file mmap(2)-ed in memory like shmget
mmap() and shmget() are not really comparable. A better comparison would be between the combination of shm_open() / ftruncate() / mmap() on one hand and the combination of shmget() / shmat() on the other. These are the main alternatives in POSIX for creating labeled shared-memory segments and mapping them into the process's address space. You should recognize there that the analog of shmget() is shm_open(), and the analog of mmap() in this context is shmat().
Now, returning to
Is there any way to share only the allowed parts of a mmap(2)-ed space to any processes identified like shm technology?
Note well that in both cases above, it is the object being mapped (a shared memory segment) that provides for sharing between unrelated processes, not anything to do with mmap() itself. The same applies when mmap() maps any other kind of object, such as a regular file. It is always the mapped object through which any shared access is mediated. It has to be this way, because a memory mapping is a property of one process -- it is not itself share-able.
Your design calls for a filehandler process to serve as gatekeeper to the data, rather than allowing clients to access it directly. That's fine, but it precludes the clients mapping the file into memory. You could probably arrange for a client to access the data through a shared memory segment of either flavor, but that would require the server to copy the right data out of the big file into the client's shared memory segment. That might indeed be something to consider, but you can forget about the server providing clients direct memory-mapped access to the file.
There's no connection between the implementations of the shmget system call (derived from AT&T System V) and mmap (derived from Berkeley's BSD). It's true that on BSD systems, System V shared memory is implemented using mmapped private segments with no file attached, but that is of no use here either, because you need the shared segment to be associated with a file.
For what you need, the only way to share memory segments tied to a file's contents is the mmap system call, because System V shared memory segments have no means of associating a file with them.
All of these resources (either SysV or BSD) have a set of permission bits associated with them that allows them to be used with some security but, as with files, only globally (for the entire resource): you can access the whole thing or nothing at all.
BTW, you can implement what you want by copying the parts of the file a client is allowed to access into a different, private segment (only as large as what the client is allowed to see). That way you have finer control over who the clients are and what they are allowed to do.
And last, remember that this approach requires copying segments of shared memory, so you need to remember to copy the exported segment back for a client if you don't want the modifications made by that client to be lost when the client finishes using it.
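A rough sketch of that copy-out / copy-back idea, under several assumptions: the big file is already mapped at big in the handler process, the helper names export_range() and import_range() are invented for this example, and the POSIX shm_open() flavor is used for the per-client segment. How the client gets permission to open the segment (object mode and ownership, or passing the descriptor over the Unix socket with SCM_RIGHTS) is left out.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Copy len bytes starting at big + off into a freshly created shm object.
 * Only this slice becomes visible to whoever opens shm_name. */
int export_range(const char *big, size_t off, size_t len, const char *shm_name)
{
    int fd = shm_open(shm_name, O_CREAT | O_EXCL | O_RDWR, 0600);
    if (fd == -1)
        return -1;
    if (ftruncate(fd, (off_t)len) == -1) {
        close(fd);
        shm_unlink(shm_name);
        return -1;
    }
    void *dst = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    if (dst == MAP_FAILED) {
        shm_unlink(shm_name);
        return -1;
    }
    memcpy(dst, big + off, len);
    munmap(dst, len);
    return 0;
}

/* Copy the (possibly modified) slice back to big + off and remove the
 * object, so the client's changes are not lost. */
int import_range(char *big, size_t off, size_t len, const char *shm_name)
{
    int fd = shm_open(shm_name, O_RDONLY, 0);
    if (fd == -1)
        return -1;
    const void *src = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);
    if (src == MAP_FAILED)
        return -1;
    memcpy(big + off, src, len);
    munmap((void *)src, len);
    return shm_unlink(shm_name);
}
```

The handler would call export_range() when it grants a client access to a slice and import_range() when the client is done. Nothing here validates what the client wrote, which a real handler would probably want to do before copying back.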
From my point of view, you are complicating things a little, but you know how your application is designed better than I do.

Can I with PTEs from one process which indicate to fragments of physical memory to create appropriate PTEs in other process?

When we in Linux use function mmap (,,, MAP_ANON | MAP_SHARED);, then for the same region of fragmented physically memory (which allocated) between processes are allocating virtual memory pages (PTEs). Ie these PTEs are copied from page table of one process to the page table of another process (with the same sequence of fragments of physical addresses allocated memory), is this true?
But mmap () needs to be done before fork (). And if we already have two working process (ie after fork ()), then we need to use a file for the mmap(). Which functions used to copying mechanism of PTEs between the two already established processes to create a shared memory?
Can I with PTEs/SGL(scatter-gather-list) which indicate to fragments of physical memory which have been allocated to create appropriate PTEs in other process by using linux-kernel, and how to do it?
I want to understand how it mmap() works at a lower level .
When we in Linux use function mmap (,,, MAP_ANON | MAP_SHARED);, then
for the same region of fragmented physically memory (which allocated)
between processes are allocating virtual memory pages (PTEs).
Restate the question/statement, please, the above does not make sense.
Ie these PTEs are copied from page table of one process to the page
table of another process (with the same sequence of fragments of
physical addresses allocated memory), is this true?
No, it is not true.
When you establish a new mapping, the kernel first looks for a sufficiently large unused range of addresses in the virtual address space of the process. Then it modifies the corresponding page table entries to indicate that the address range is valid, but that no physical pages are present there yet.
When you attempt to access an address in that range, a page fault is generated. The kernel looks in its data structures and determines that the access is valid. Then it allocates a fresh physical page, modifies the page table entry to establish the mapping between the virtual address and the physical address, and marks the page as present. Upon return from the page fault exception, the offending instruction is restarted and this time executes successfully.
But mmap () needs to be done before fork (). And if we already have
two working process (ie after fork ()), then we need to use a file for
the mmap(). Which functions used to copying mechanism of PTEs between
the two already established processes to create a shared memory?
If you do an mmap after the fork, the two processes will create and initialize their page table entries entirely independently of each other. However, when you mmap a file, the kernel will not simply allocate a free physical page: it will allocate a page, fill it with data from the file, and put the page in the page/buffer cache. When a second process mmaps the same file, the kernel looks in the page cache, finds the physical page that corresponds to the same file and the required file offset, and points the PTE to that page. Now there are two completely independently created PTEs that just point to the same physical page.
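From user space you cannot see the PTEs, but you can see the effect. In the sketch below (the function name and the temporary-file approach are just for illustration), both processes call mmap() on the same file after the fork(), each building its own mapping, and the parent still reads what the child wrote:

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child maps the file and stores value; parent maps the same file on its
 * own and reads the value back. Returns what the parent saw, -1 on error. */
int child_writes_parent_reads(const char *path, int value)
{
    int fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0600);
    if (fd == -1)
        return -1;
    if (ftruncate(fd, sizeof(int)) == -1) {
        close(fd);
        return -1;
    }
    close(fd);                       /* nothing is mapped before the fork */

    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {                  /* child: its own open + mmap */
        int cfd = open(path, O_RDWR);
        if (cfd == -1)
            _exit(1);
        int *p = mmap(NULL, sizeof *p, PROT_READ | PROT_WRITE,
                      MAP_SHARED, cfd, 0);
        if (p == MAP_FAILED)
            _exit(1);
        *p = value;                  /* lands in the shared page-cache page */
        _exit(0);
    }

    /* parent: a second, independently created mapping of the same file */
    int pfd = open(path, O_RDONLY);
    if (pfd == -1)
        return -1;
    const int *q = mmap(NULL, sizeof *q, PROT_READ, MAP_SHARED, pfd, 0);
    close(pfd);
    if (q == MAP_FAILED)
        return -1;

    int status;
    int seen = -1;
    if (waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0)
        seen = *q;                   /* read only after the child finished */

    munmap((void *)q, sizeof *q);
    return seen;
}
```

waitpid() is used here as the only synchronization; two long-running processes would need a semaphore or similar before reading each other's data.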
Can I with PTEs/SGL(scatter-gather-list) which indicate to fragments
of physical memory which have been allocated to create appropriate
PTEs in other process by using linux-kernel, and how to do it?
Restate this too, it's not clear what you are asking.
I want to understand how it mmap() works at a lower level .
I would recommend an operating systems book with a chapter on virtual memory management, something like Operating System Concepts by Silberschatz et al.
http://www.amazon.co.uk/Operating-System-Concepts-Abraham-Silberschatz/dp/1118112733/ref=sr_1_5?ie=UTF8&qid=1386065707&sr=8-5&keywords=Operating+System+Concepts%2C+by+Silberschatz%2C+Galvin%2C+and+Gagne

shmat for attaching shared memory segment

When I looked through the man page of shmat, its primary function is described as attaching the memory segment associated with shmid to the calling process's address space.
The questions I have are the following.
The term "attach" looks generic to me. I have difficulty understanding what underlying activity "attach" refers to.
What does it mean to map a segment of memory?
Use it as char *ptr=shmat(seg_id,NULL,0);
It attaches the segment identified by seg_id, which was created by shmget(), to the process that executes the code above.
seg_id is the segment ID of the newly created segment.
NULL means the operating system chooses the starting address of the segment on the user's behalf.
0 as the flag attaches the segment for both reading and writing.
Several processes can have the same segment attached at the same time; attaching does not lock it, so any mutual exclusion has to come from a separate locking mechanism (a semaphore, for example). A process should detach the segment once it no longer needs it.
to detach : shmdt(ptr);
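Here is a small sketch of attach and detach in practice. The function name sysv_round_trip() and the 4096-byte size are arbitrary choices for this example, and IPC_PRIVATE is used instead of a fixed key so the demo cannot collide with an existing segment. The child attaches, writes and detaches; the parent then attaches the same segment and finds the data:

```c
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>
#include <unistd.h>

#define SEG_SIZE 4096

/* Child writes msg into a System V segment, parent reads it into out.
 * Returns 0 on success, -1 on error. */
int sysv_round_trip(const char *msg, char *out, size_t outlen)
{
    if (outlen == 0)
        return -1;

    int id = shmget(IPC_PRIVATE, SEG_SIZE, IPC_CREAT | 0600);
    if (id == -1)
        return -1;

    pid_t pid = fork();
    if (pid == 0) {
        char *p = shmat(id, NULL, 0);        /* map segment into the child */
        if (p == (void *)-1)
            _exit(1);
        strncpy(p, msg, SEG_SIZE - 1);
        p[SEG_SIZE - 1] = '\0';
        shmdt(p);                            /* removes the mapping only */
        _exit(0);
    }

    int rc = -1;
    int status;
    if (pid > 0 && waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0) {
        const char *q = shmat(id, NULL, SHM_RDONLY);  /* parent's own view */
        if (q != (void *)-1) {
            strncpy(out, q, outlen - 1);
            out[outlen - 1] = '\0';
            shmdt(q);
            rc = 0;
        }
    }

    shmctl(id, IPC_RMID, NULL);              /* destroy the segment itself */
    return rc;
}
```

So "attach" amounts to the kernel adding the segment's pages to the caller's address space, and shmdt() undoes that for the caller alone; the segment and its contents stay around until it is removed with shmctl(IPC_RMID).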
There's a good explanation here: http://www.makelinux.net/alp/035
"Under Linux, each process's virtual memory is split into pages. Each process maintains a mapping from its memory addresses to these virtual memory pages, which contain the actual data. Even though each process has its own addresses, multiple processes' mappings can point to the same page, permitting sharing of memory"

How do I choose a fixed address for mmap?

mmap() can be optionally supplied with a fixed location to place the map. I would like to mmap a file and then have it available to a few different programs at the same virtual address in each program. I don't care what the address is, just as long as they all use the same address. If need be, the address can be chosen by one of them at run time (and communicated with the others via some other means).
Is there an area of memory that Linux guarantees to be unused (by the application and by the kernel) that I can map to? How can I find one address that is available in several running applications?
Not really, no. With address space randomisation on modern Linux systems, it is very hard to guarantee anything about which addresses may or may not be in use.
Also, if you're thinking of using MAP_FIXED, be aware that you need to be very careful: it causes mmap to unmap anything that may already be mapped at that address, which is generally a very bad thing.
I really think you will need to find another solution to your problem...
Two processes can map a shared memory block to the same virtual address using shm_open() and mmap(); that is, the virtual address returned from mmap() can be the same for both processes. I found that Linux will by default give different virtual addresses to different processes for the same piece of shared memory, but that using mmap() with MAP_FIXED will force Linux to supply the same virtual address to multiple processes.
The process that creates the shared memory block must store the virtual address somewhere, either within the shared memory, in a file, or with some other method so that another process can determine the original virtual address. The known virtual address is then used in the mmap() call, along with the MAP_FIXED flag.
I was able to use the shared memory to do this. When doing so, the "golden" virtual address is stored within the shared memory block; I made a structure that contains a number of items, the address being one of them, and initialized it at the beginning of the block.
A process that wants to map that shared memory must execute the mmap() function twice; once to get the "golden" virtual address, then to map the block to that address using the MAP_FIXED flag.
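A minimal sketch of that two-step mapping, assuming a header struct at the start of the block. The names shared_hdr, create_block() and attach_at_golden() are invented for this example, and len is assumed to be at least sizeof(shared_hdr). Note that MAP_FIXED silently replaces anything already mapped in the target range, so this is only safe if the attaching process is known to have that range free.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical header placed at the start of the shared block. */
typedef struct {
    void *golden;   /* address the creator mapped the block at */
    int   payload;  /* stand-in for the real shared data */
} shared_hdr;

/* Creator: map wherever the kernel likes and record that address. */
void *create_block(const char *name, size_t len)
{
    int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
    if (fd == -1)
        return MAP_FAILED;
    if (ftruncate(fd, (off_t)len) == -1) {
        close(fd);
        return MAP_FAILED;
    }
    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    if (addr != MAP_FAILED)
        ((shared_hdr *)addr)->golden = addr;
    return addr;
}

/* Attacher: first mapping reads the golden address, second mapping
 * places the block there. WARNING: MAP_FIXED clobbers existing mappings. */
void *attach_at_golden(const char *name, size_t len)
{
    int fd = shm_open(name, O_RDWR, 0);
    if (fd == -1)
        return MAP_FAILED;

    shared_hdr *probe = mmap(NULL, sizeof *probe, PROT_READ,
                             MAP_SHARED, fd, 0);
    if (probe == MAP_FAILED) {
        close(fd);
        return MAP_FAILED;
    }
    void *golden = probe->golden;
    munmap(probe, sizeof *probe);

    void *addr = mmap(golden, len, PROT_READ | PROT_WRITE,
                      MAP_SHARED | MAP_FIXED, fd, 0);
    close(fd);
    return addr;
}
```

On Linux 4.17 and later, MAP_FIXED_NOREPLACE can be used instead of MAP_FIXED so that the call fails rather than overwriting an existing mapping; whether that flag is available on your target is something to check.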
Interestingly, I'm working with an embedded system running a 2.6 kernel. It will, by default, supply the same virtual address to all mmap() calls to a given file descriptor. Go figure.
Bob Wirka
You could look into doing a shared memory object using shmget(), shmat(), etc. First have the process that obtains the right to initialize the shared memory object read in your file and copy it into the shared memory object address space. Now any other process that simply gets a return shared memory ID value can access the data in the shared memory space. So for instance, you could employ some type of initialization scheme like the following:
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/stat.h>

#define KEYVALUE 1000 /* arbitrary value ... just needs to be shared between your processes */

/* Wrapper added so the snippet compiles; the name is only illustrative.
   file_size: read your file and obtain its size in bytes before calling. */
void *attach_shared_copy(size_t file_size)
{
    int shared_mem_id;
    void *shared_mem_ptr = NULL;

    /* try to create the shared memory object */
    if ((shared_mem_id = shmget(KEYVALUE, file_size, S_IRUSR | S_IWUSR | IPC_CREAT | IPC_EXCL)) == -1)
    {
        if (errno == EEXIST)
        {
            /* shared memory segment was already created, so just get its ID value */
            shared_mem_id = shmget(KEYVALUE, file_size, S_IRUSR | S_IWUSR);
            shared_mem_ptr = shmat(shared_mem_id, NULL, 0);
        }
        else
        {
            perror("Unable to create shared memory object");
            exit(1);
        }
    }
    else
    {
        shared_mem_ptr = shmat(shared_mem_id, NULL, 0);
        /* copy your file into shared memory via the shared_mem_ptr */
    }

    if (shared_mem_ptr == (void *)-1)
    {
        perror("Unable to attach shared memory object");
        exit(1);
    }
    return shared_mem_ptr; /* work with the shared data ... */
}
The last process to use the shared memory object will, just before destroying it, copy the modified contents from shared memory back into the actual file. You may also want to allocate a structure at the beginning of your shared memory object that can be used for synchronization, i.e., there would be some type of "magic number" that the initializing process will set so that your other processes will know that the data has been properly initialized in the shared memory object before accessing it. Alternatively, you could use a named semaphore or System V semaphore to make sure that no process tries to access the shared memory object before it's been initialized.

Is it possible to turn a segment of shared memory into private memory?

Say I have a C program (in a Linux environment) that uses shared memory to send data to and from several processes. Let's say that later in the program the parallel processes finish and only one process remains. Now I want to fork() off another process, but this time I don't want that memory segment to be shared: I want both the parent and the child process to be able to modify the values without affecting one another, as if it were private memory. Is there any way to do this, that is, to convert shared memory to private memory while it occupies the same space in virtual memory, or to make shared memory copy-on-write?
Well, the only way I can think of from a portable POSIX API to do this is to have the child map some new segment of the same size somewhere else (random), copy the data over, and then detach the original segment and re-attach the new segment to the correct address. Sounds ugly.
You can unlink the new segment after you are done to prevent other people from attaching to it.
Now that I look at the man page, if you have the FD to the shm object, you could try re-mmapping the shm object as MAP_PRIVATE in the child at the right address. However ``It is unspecified whether changes made to the file after the mmap() call are visible in the mapped region.'' so you either need to test that and live dangerously or use the other technique.
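The re-mmap idea could look roughly like this. The helper names are made up for the example, and it assumes you still hold the descriptor from shm_open(). remap_private() replaces the existing shared mapping in place, so pointers into the region stay valid, but later writes through them go to private copy-on-write pages instead of the shared object:

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Open (creating if needed) a shm object of len bytes; returns fd or -1. */
int open_shm(const char *name, size_t len)
{
    int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
    if (fd == -1)
        return -1;
    if (ftruncate(fd, (off_t)len) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Ordinary shared mapping: writes are visible to every other mapper. */
void *map_shared(int fd, size_t len)
{
    return mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}

/* Replace the mapping at addr with a private, copy-on-write view of the
 * same object. addr must be the page-aligned start of the old mapping;
 * MAP_FIXED atomically swaps the old mapping for the new one. */
void *remap_private(int fd, void *addr, size_t len)
{
    return mmap(addr, len, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_FIXED, fd, 0);
}
```

After remap_private() succeeds, the region initially shows the object's current contents, and writes made through it no longer reach other processes. The caveat quoted above still applies in the other direction: whether later changes to the shared object show up in pages the private mapping has not yet written to is unspecified.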
