I'm trying to port some legacy code (a large program) to CentOS 7, but I'm hitting a snag. The core of the code is a rather awkward structure built around using mmap to allocate a hard-coded address and map a file to it. The file acts like a database (and is built by one) and includes hard-coded pointers to different sections of the mapped memory. Very ugly, but it is what it is. The entire program is built around this structure, and nobody is going to fund a rewrite.
The problem comes on the mmap line. This worked before, but on CentOS 7 the program now crashes with a SIGBUS:
mmapAddr = mmap((void *) SMAddr, SMA_WINDOW_SIZE, PROT_READ | (readOnly ? 0 : PROT_WRITE), MAP_FILE | MAP_FIXED | MAP_SHARED, SMFileDesc, 0);
... where SMAddr is 0x8000000, SMA_WINDOW_SIZE is 127926272, and readOnly is false. So basically it's trying to map a file to the address 0x8000000 with size 122MB.
What might have changed between versions, I have no clue. But I do note that the file it's mapping is only 1.5MB. I'm not sure exactly why it needs to map so much more than the file size, but I know it's needed, and I know that a lot of nuance has gone into picking the size "122MB" for some reason.
Could a mismatch between actual file size and allocated size have been fine in the past but not any more? I know that SIGBUS means an attempt to access an invalid memory region. Given that mmap doesn't take any sort of allocated pointer, this has to be something it's doing internally.
I tried catching and blocking SIGBUS (thinking that maybe it'd be ignorable?), but the program still crashed with a SIGBUS at the same spot. Maybe I did that wrong.
Thoughts?
From the POSIX rationale for mmap():
The mmap() function can be used to map a region of memory that is
larger than the current size of the object. Memory access within the
mapping but beyond the current end of the underlying objects may
result in SIGBUS signals being sent to the process. The reason for
this is that the size of the object can be manipulated by other
processes and can change at any moment. The implementation should tell
the application that a memory reference is outside the object where
this can be detected; otherwise, written data may be lost and read
data may not reflect actual data in the object.
Note that references beyond the end of the object do not extend the
object as the new end cannot be determined precisely by most virtual
memory hardware. Instead, the size can be directly manipulated by
ftruncate().
So most likely the bug is that your program tries to access a region of the mapped memory which lies outside the file. The mmap call should succeed, however. Which return value do you get?
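A minimal sketch of one way to avoid touching unbacked pages, assuming it is acceptable to grow the file on disk (that may not hold for the legacy database). The helper name map_with_backing is hypothetical, and it lets the kernel pick the address rather than using MAP_FIXED as the question does. It extends the file with ftruncate() before mapping, so every page in the window is backed by the file:

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: map `window` bytes of `fd`, first growing the file
 * so that the whole window is backed. Returns MAP_FAILED on error. */
void *map_with_backing(int fd, size_t window)
{
    struct stat st;
    if (fstat(fd, &st) != 0)
        return MAP_FAILED;

    /* Pages that lie wholly past the end of the file raise SIGBUS when
     * touched, so extend the file to cover the window first. */
    if ((size_t)st.st_size < window && ftruncate(fd, (off_t)window) != 0)
        return MAP_FAILED;

    return mmap(NULL, window, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}
```

With this, a write to the last byte of the window lands in the (now larger) file instead of raising SIGBUS.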
I want to lock the memory to physical RAM in C with mlock and munlock, but I'm unsure about the correct usage.
Allow me to explain in a step by step scenario:
Let's assume that I dynamically allocate a pointer using calloc:
char * data = (char *)calloc(12, sizeof(char*));
Should I do mlock right after that?
Let's also assume that I later attempt to resize the memory block with realloc:
(char *)realloc(data, 100 * sizeof(char*));
Note that the increase above (100) is arbitrary, and sometimes I will decrease the memory block instead.
Should I first do munlock and then mlock again to address the changes made?
Also when I want to free the pointer data later, should I munlock first?
I hope someone can please explain the correct steps to me so I can understand better.
From the POSIX specification of mlock() and munlock():
The mlock() function shall cause those whole pages containing any part
of the address space of the process starting at address addr and
continuing for len bytes to be memory-resident until unlocked or until
the process exits or execs another process image. The implementation
may require that addr be a multiple of {PAGESIZE}.
The munlock() function shall unlock those whole pages containing any
part of the address space of the process starting at address addr and
continuing for len bytes, regardless of how many times mlock() has
been called by the process for any of the pages in the specified
range. The implementation may require that addr be a multiple of
{PAGESIZE}.
Note that:
Both functions work on whole pages
Both functions might require addr to be a multiple of page size
munlock doesn't use any reference counting to track lock lifetime
This makes it almost impossible to use them with pointers returned by malloc/calloc/realloc, because:
You might accidentally lock or unlock nearby pages (for example, unlocking pages that must stay memory-resident)
The allocator might return pointers that are not suitable for those functions
You should use mmap instead or any other OS-specific mechanism. For example Linux has mremap which allows you to "reallocate" memory. Whatever you use, make sure mlock behavior is well-defined for it. From Linux man pages:
If the memory segment specified by old_address and old_size is locked
(using mlock(2) or similar), then this lock is maintained when the
segment is resized and/or relocated. As a consequence, the amount of
memory locked by the process may change.
Note Nate Eldredge's comment below:
Another problem with using realloc with locked memory is that the data
will be copied to the new location before you have a chance to find
out where it is and lock it. If your purpose in using mlock is to
ensure that sensitive data never gets written out to swap, this
creates a window of time where that might happen.
TL;DR
Memory locking doesn't mix with general-purpose memory allocation using the C language runtime.
Memory locking does mix with page-oriented virtual memory mapping OS-level APIs.
The above hold unless special circumstances arise (that's my way out of this :)
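As a rough sketch of the page-oriented approach, assuming Linux and glibc (mremap and MREMAP_MAYMOVE need _GNU_SOURCE). The names round_to_pages, locked_alloc, locked_resize and locked_free are hypothetical helpers, not an existing API. Locking can fail when RLIMIT_MEMLOCK is too low, so every call is checked:

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round len up to a whole number of pages. */
size_t round_to_pages(size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    return (len + page - 1) / page * page;
}

/* Allocate page-aligned memory and lock it. Returns NULL on failure. */
void *locked_alloc(size_t len)
{
    size_t size = round_to_pages(len);
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    if (mlock(p, size) != 0) {
        munmap(p, size);
        return NULL;
    }
    return p;
}

/* Resize a locked mapping; per the man page quoted above, the lock is
 * maintained. Returns NULL on failure, leaving the old mapping intact. */
void *locked_resize(void *p, size_t old_len, size_t new_len)
{
    void *q = mremap(p, round_to_pages(old_len), round_to_pages(new_len),
                     MREMAP_MAYMOVE);
    return q == MAP_FAILED ? NULL : q;
}

/* Release the memory; unmapping also removes the lock. */
void locked_free(void *p, size_t len)
{
    if (p != NULL)
        munmap(p, round_to_pages(len));
}
```

Because the mapping is private to these helpers, no unrelated allocation shares its pages, which sidesteps the accidental lock/unlock problem described above.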
Currently I use shm_open to get a file descriptor and then use ftruncate and mmap whenever I want to add a new buffer to the shared memory. Each buffer is used individually for its own purposes.
Now what I need to do is arbitrarily resize buffers.
And also munmap buffers and reuse the free space again later.
The only solution I can come up with for the first problem is: ftruncate(file_size + old_buffer_size + extra_size), mmap, copy the data across into the new buffer, and then munmap the original data. This looks very expensive to me and there is probably a better way. It also entails removing the original buffer every time.
For the second problem I don't even have a bad solution; I clearly can't move memory around every time a buffer is removed. And if I keep track of free memory and use it whenever possible, it will slow down the allocation process as well as leave me with unused bits and pieces in between.
I hope this is not too confusing.
Thanks
As best I understand, you need to grow (or shrink) an existing memory mapping.
Under Linux, shared memory is implemented as a file located in the /dev/shm memory filesystem. All operations on this file are the same as on regular files (and file descriptors).
If you want to grow the existing mapping, first expand the file with ftruncate (as you wrote), then use mremap to expand the mapping to the requested size.
If you store pointers into this region you may have to update them, but first try calling mremap with flags set to 0. In that case the system tries to grow the existing mapping in place (which works if there is no collision with another mapped region), and the pointers remain valid.
If that fails, use the MREMAP_MAYMOVE flag. The system may then remap the region to another location, which is usually done efficiently (the system does not copy the data); afterwards, update the pointers.
Shrinking is the same in reverse order.
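The growing steps could be sketched roughly as follows, assuming Linux with glibc (mremap requires _GNU_SOURCE). The helper name grow_mapping is hypothetical, and fd is assumed to be the descriptor that backs the existing MAP_SHARED mapping:

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: grow a MAP_SHARED mapping of `fd` from old_size to
 * new_size. Returns the (possibly moved) base, or MAP_FAILED on error. */
void *grow_mapping(int fd, void *base, size_t old_size, size_t new_size)
{
    /* 1. Grow the backing object first so the new pages are backed. */
    if (ftruncate(fd, (off_t)new_size) != 0)
        return MAP_FAILED;

    /* 2. Try to extend in place (flags == 0): existing pointers stay valid. */
    void *p = mremap(base, old_size, new_size, 0);
    if (p != MAP_FAILED)
        return p;

    /* 3. Otherwise let the kernel relocate the mapping; the caller must
     *    then rebase any pointers into the old region. */
    return mremap(base, old_size, new_size, MREMAP_MAYMOVE);
}
```

A caller can compare the returned address with the old base to decide whether stored pointers need updating.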
I wrote an open source library for just this purpose:
rszshm - resizable pointer-safe shared memory
To quote from the description page:
To accommodate resizing, rszshm first maps a large, private, noreserve
map. This serves to claim a span of addresses. The shared file mapping
then overlays the beginning of the span. Later calls to extend the
mapping overlay more of the span. Attempts to extend beyond the end of
the span return an error.
I extend a mapping by calling mmap with MAP_FIXED at the original address, and with the new size.
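The idea in the quoted description can be illustrated with a short sketch. This is not rszshm's actual code; reserve_span and overlay_file are hypothetical names, and the sketch assumes Linux (MAP_NORESERVE and MAP_ANONYMOUS):

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Claim `span` bytes of address space without committing any memory. */
void *reserve_span(size_t span)
{
    return mmap(NULL, span, PROT_NONE,
                MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
}

/* Overlay the first `size` bytes of the span with a shared mapping of fd,
 * setting the file length to `size` first. Because `base` never changes,
 * pointers into the region stay valid across extensions. */
void *overlay_file(void *base, size_t span, int fd, size_t size)
{
    if (size > span)
        return MAP_FAILED;      /* cannot extend past the reserved span */
    if (ftruncate(fd, (off_t)size) != 0)
        return MAP_FAILED;
    return mmap(base, size, PROT_READ | PROT_WRITE,
                MAP_SHARED | MAP_FIXED, fd, 0);
}
```

Calling overlay_file again with a larger size extends the mapping at the same address, which is why the span has to be chosen large enough up front.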
I'm working on a project in C that uses shared memory for IPC on a Linux system. However, I'm a little bit confused about memory management in these segments. I'm using the POSIX API for this project.
I understand how to create the shared segments, and that these persist until a reboot if you fail to properly remove them with shm_unlink(). Additionally, I understand how to do the actual mapping and unmapping with mmap and munmap respectively. However, the usage of these operations and how it affects the stored data in these shared segments is confusing me.
Here is what I'm trying to properly understand:
Let's say I create a segment using shm_open() with the O_CREAT flag. This gives me a file descriptor that I've named msfd in the example below. Now I have a struct that I map into that address space with the following:
mystruct* ms = (mystruct*)mmap(NULL, sizeof(mystruct), PROT_READ | PROT_WRITE, MAP_SHARED, msfd, 0);
//set the elements of the struct here using ms->element = X as usual
Part 1)
Here's where my confusion begins. Let's say that this process is now done accessing that location, since it was just setting data for another process to read. Do I still call munmap()?
I want the other process to still have access to all of the data that the current process has set. Normally, you wouldn't call free() on a malloc'ed pointer until it is permanently no longer needed. However, I understand that when this process exits the unmapping happens automatically anyway. Is the data persisted inside the segment, or does that segment just get reserved with its allotted size and name?
Part 2)
We're now in the process of the other application that needs to access and read from that shared segment. I understand that we now open that segment with shm_open() and then perform the same mapping operation with mmap(). Now we have access to the structure in that segment. When we call munmap() from this process (NOT the one that created the data) it "unlinks" us from that pointer, however the data is still accessible. Does this assume that process 1 (the creator) has NOT called munmap()?
Is the data persisted inside the segment,
Yes.
does that segment just get reserved with its allotted size and name?
Also yes.
Does this assume that process 1 (the creator) has NOT called munmap()?
No.
The shared memory gets created via shm_open() with O_CREAT (taken from available OS memory), and from that moment on it carries whatever content has been written into it until it is given back to the OS via shm_unlink().
shm_open() and shm_unlink() are system-oriented: the shared memory is a system-wide (not process-specific) resource.
mmap() and munmap() are process-oriented: they map and unmap that system resource into and out of the process's address space.
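A small sketch of that lifecycle, assuming a POSIX system with shm_open available (on older glibc, link with -lrt). The mystruct layout and the names publish_value and read_value are stand-ins for illustration. The writer unmaps before the reader ever maps, and the data is still there:

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

typedef struct { int value; } mystruct;   /* stand-in for the question's struct */

/* Writer: create the segment, store a value, then unmap and close. */
int publish_value(const char *name, int value)
{
    int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)sizeof(mystruct)) != 0) {
        close(fd);
        return -1;
    }
    mystruct *ms = mmap(NULL, sizeof *ms, PROT_READ | PROT_WRITE,
                        MAP_SHARED, fd, 0);
    close(fd);                       /* the mapping stays valid after close() */
    if (ms == MAP_FAILED)
        return -1;
    ms->value = value;
    return munmap(ms, sizeof *ms);   /* the data remains in the segment */
}

/* Reader: map the existing segment and read the value back. */
int read_value(const char *name, int *out)
{
    int fd = shm_open(name, O_RDONLY, 0);
    if (fd < 0)
        return -1;
    mystruct *ms = mmap(NULL, sizeof *ms, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);
    if (ms == MAP_FAILED)
        return -1;
    *out = ms->value;
    return munmap(ms, sizeof *ms);
}
```

Only shm_unlink(name) removes the segment's name; until then any process with permission can open and map it again.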
I have a call to mmap() in which I try to map 64MB using MAP_ANONYMOUS as follows:
void *block = mmap(0, 67108864, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
if (block == MAP_FAILED)
    exit(1);
I understand that to actually own the memory, I need to hit that block of memory. I want to add some sort of 0's or empty strings to actually own the memory. How would I do that? I tried the following, but that obviously segfaults (I know why it does):
char *temp = block;
for (int i = 0; i < 67108864; i++) {
    *temp = '0';
    temp++;
}
How would I actually gain ownership of that block by assigning something in that block?
Thanks!
Your process already owns the memory, but what I think you want is to make it resident. That is, you want the kernel to allocate physical memory for the mmaped region.
The kernel allocates a virtual memory area (VMA) for the process, but this just specifies a valid region and doesn't actually allocate physical pages (or frames as they are also sometimes called). To make the kernel allocate entries in the page table, all you need to do is force a page fault.
The easiest way to force a page fault is to touch the memory just like you're doing. Though, because your page size is almost certainly 4096 bytes, you really only need to read one byte every 4096 bytes thereby reducing the amount of work you actually need to do.
Finally, because you are setting the pages PROT_READ, you will actually want to read from each page rather than try to write.
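A minimal sketch of the page-touching loop, reading one byte per page and using sysconf() rather than a hard-coded 4096. The names touch_pages and map_resident are hypothetical. One caveat, as far as I know: on Linux a read fault on untouched anonymous memory may be served from a shared zero page, so if private physical pages are really required, mapping with PROT_WRITE and writing one byte per page (or using MAP_POPULATE) is the more direct route.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Read one byte from every page of [block, block + len) to fault it in.
 * Returns the number of pages touched. */
size_t touch_pages(const void *block, size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    const volatile unsigned char *p = block;
    size_t touched = 0;
    for (size_t off = 0; off < len; off += page) {
        (void)p[off];   /* volatile read: the compiler may not drop it */
        touched++;
    }
    return touched;
}

/* Map `len` bytes read-only, as in the question, and touch every page. */
void *map_resident(size_t len)
{
    void *block = mmap(NULL, len, PROT_READ,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (block != MAP_FAILED)
        touch_pages(block, len);
    return block;
}
```

For the 64MB block in the question this does one read per page instead of 67108864 byte accesses.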
Your question is not very well formulated. I don't understand why you think the process does not own the memory it obtained through mmap.
Your newly mmap-ed memory zone has only PROT_READ (so you can only read the zeros inside); it needs to be PROT_READ|PROT_WRITE for you to be able to write to it.
But your process already "owns" the memory once the mmap returned.
If the process has pid 1234, you can read its memory map through /proc/1234/maps (for example with cat /proc/1234/maps in a different terminal); from inside your process, use /proc/self/maps.
Maybe you are interested in memory overcommit; there is a way to disable that.
Perhaps the mincore(2), msync(2), and mlock(2) syscalls are of interest to you.
Maybe you want the MAP_POPULATE or MAP_LOCKED flag of mmap(2).
I don't really understand what you mean by "own the memory" in your question. If you just want to disable memory overcommit, please say so.
You could also mmap some file segment; I believe no overcommit is possible in that case. But I would just suggest disabling memory overcommit for your entire system, through /proc/sys/vm/overcommit_memory.
I'm implementing persistent large constant arrays via mmap. Are there any tips, tricks, or gotchas one should be aware of when using mmap?
All pointers that are stored inside the mmap'd region should be done as offsets from the base of the mmap'd region, not as real pointers! You won't necessarily be getting the same base address when you mmap the region on the next run of the program. (I have had to clean up code that made incorrect assumptions about mmap region base address constancy).
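To illustrate the offset approach, here is a small sketch. The region_header layout and the helper names region_ptr and region_offset are hypothetical; the point is only that what gets stored in the file is a byte offset from the base, which is converted to a pointer after each mmap:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: a header at offset 0 of the mapped region that
 * refers to its payload by offset rather than by absolute address. */
typedef struct {
    uint64_t count;         /* number of elements in the array */
    uint64_t data_offset;   /* byte offset of the array from the region base */
} region_header;

/* Convert an offset stored in the region into a usable pointer. */
void *region_ptr(void *base, uint64_t offset)
{
    return (unsigned char *)base + offset;
}

/* Convert a pointer inside the region into an offset that can be stored. */
uint64_t region_offset(const void *base, const void *ptr)
{
    return (uint64_t)((const unsigned char *)ptr - (const unsigned char *)base);
}
```

After mapping, a reader would call region_ptr(base, header->data_offset) with whatever base address mmap returned on that run.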
This is the most straightforward use case for mmap(), so there shouldn't be much to trip you up.
You are effectively just loading a large constant array. Being constants you shouldn't need to worry about synchronization. It would be advisable to make sure the prot parameter is set to PROT_READ only since you won't be writing.
If one or more programs using the constants are going to be run continually, it might be worthwhile to have a separate program that loads the data and keeps it resident. Runs of the other programs then essentially just do a shared memory attach rather than repeatedly reading the file into memory.
Make sure you check for restrictions on open file size or memory usage. On Linux there is a built in shell command ulimit. Run as ulimit -a to see the current settings.
Flush writes to the in-memory array to the file with the msync(2) syscall or else they may stay in memory until munmap(2) and there may be a power outage or something before then!
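A rough sketch of flushing a write with msync(), assuming a file-backed MAP_SHARED mapping of an int array. The helper name store_and_flush and its parameters are hypothetical:

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: store `value` at `index` of a file-backed array of
 * `count` ints and flush it to the file. Returns 0 on success, -1 on error. */
int store_and_flush(const char *path, size_t index, int value, size_t count)
{
    if (index >= count)
        return -1;
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    size_t len = count * sizeof(int);
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }
    int *arr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    if (arr == MAP_FAILED)
        return -1;
    arr[index] = value;
    int rc = msync(arr, len, MS_SYNC);   /* block until written back */
    munmap(arr, len);
    return rc;
}
```

MS_SYNC waits for the write-back to complete; MS_ASYNC only schedules it.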
If multiple processes are mmap'ing the same memory region shared with read and write privileges, make sure that only one is writing to it at a time to avoid corrupting your data. Or use file locking or some other means of synchronization.