Mapping existing memory (data segment) to another memory segment - c

As the title suggests, I would like to ask if there is any way for me to map the data segment of my executable to another memory so that any changes to the second are updated instantly on the first. One initial thought I had was to use mmap, but unfortunately mmap requires a file descriptor and I do not know of a way to somehow open a file descriptor on my running processes memory. I tried to use shmget/shmat in order to create a shared memory object on the process data segment (&__data_start) but again I failed ( even though that might have been a mistake on my end as I am unfamiliar with the shm API). A similar question I found is this: Linux mapping virtual memory range to existing virtual memory range? , but the replies are not helpful.. Any thoughts are welcome.
Thank you in advance.
Some pseudocode would look like this:
extern char __data_start, _end;
char test = 'A';
int main(int argc, char *argv[]){
size_t size = &_end - &__data_start;
char *mirror = malloc(size);
magic_map(&__data_start, mirror, size); //this is the part I need.
printf("%c\n", test) // prints A
int offset = &test - &__data_start;
*(mirror + offset) = 'B';
printf("%c\n", test) // prints B
free(mirror);
return 0;
}

it appears I managed to solve this. To be honest I don't know if it will cause problems in the future and what side effects this might have, but this is it (If any issues arise I will try to log them here for future references).
Solution:
Basically what I did was use the mmap flags MAP_ANONYMOUS and MAP_FIXED.
MAP_ANONYMOUS: With this flag a file descriptor is no longer required (hence the -1 in the call)
MAP_FIXED: With this flag the addr argument is no longer a hint, but it will put the mapping on the address you specify.
MAP_SHARED: With this you have the shared mapping so that any changes are visible to the original mapping.
I have left in a comment the munmap function. This is because if unmap executes we free the data_segment (pointed to by &__data_start) and as a result the global and static variables are corrupted. When at_exit function is called after main returns the program will crash with a segmentation fault. (Because it tries to double free the data segment)
Code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#define _GNU_SOURCE 1
#include <unistd.h>
#include <sys/mman.h>
extern char __data_start;
extern char _end;
int test = 10;
int main(int argc, char *argv[])
{
size_t size = 4096;
char *shared = mmap(&__data_start, 4096, PROT_READ | PROT_WRITE, MAP_FIXED | MAP_ANONYMOUS | MAP_SHARED, -1, 0);
if(shared == (void *)-1){
printf("Cant mmap\n");
exit(-1);
}
printf("original: %p, shared: %p\n",&__data_start, shared);
size_t offset = (void *)&test - (void *)&__data_start;
*(shared+offset) = 50;
msync(shared, 4096, MS_SYNC);
printf("test: %d :: %d\n", test, *(shared+offset));
test = 25;
printf("test: %d :: %d\n", test, *(shared+offset));
//munmap(shared, 4096);
}
Output:
original: 0x55c4066eb000, shared: 0x55c4066eb000
test: 50 :: 50
test: 25 :: 25

Related

copy whole of a file into memory using mmap

i want to copy whole of a file to memory using mmap in C.i write this code:
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <errno.h>
int main(int arg, char *argv[])
{
char c ;
int numOfWs = 0 ;
int numOfPr = 0 ;
int numberOfCharacters ;
int i=0;
int k;
int pageSize = getpagesize();
char *data;
float wsP = 0;
float prP = 0;
int fp = open("2.txt", O_RDWR);
data = mmap((caddr_t)0, pageSize, PROT_READ, MAP_SHARED, fp,pageSize);
printf("%s\n", data);
exit(0);
}
when i execute the code i get the Bus error message.
next, i want to iterate this copied file and do some thing on it.
how can i copy the file correctly?
2 things.
The second parameter of mmap() is the size of the portion of file you want to make visible in your address space. The last one is the offset in the file from which you want the map. This means that as you have called mmap() you will see only 1 page (on x86 and ARM it's 4096 bytes) starting at offset 4096 in your file. If your file is smaller than 4096 bytes, then there will be no mapping and mmap() will return MAP_FAILED (i.e. (caddr_t)-1). You didn't check the return value of the function so the following printf() dereferences an illegal pointer => BUS ERROR.
Using a memory map with string functions can be difficult. If the file doesn't contain binary 0. It can happen that these functions then try to access past the mapped size of the file and touch unmapped memory => SEGFAULT.
To open a memory for a file, you have to know the size of the file.
struct stat filestat;
if(fstat(fd, &filestat) !=0) {
perror("stat failed");
exit(1);
}
data = mmap(NULL, filestat.st_size, PROT_READ, MAP_SHARED, fp, 0);
if(data == MAP_FAILED) {
perror("mmap failed");
exit(2);
}
EDIT: The memory map will always be opened with a size that is a multiple of the pagesize. This means that the last page will be filled with 0 up to the next multiple of the pagesize. Often programs using memory mapped files with string functions (like your printf()) will work most of the time, but will suddenly crash when mapping a file whith a size exactly a multiple of the page size (4096, 8192, 12288 etc.). The often seen advice to pass to mmap() a size bigger than real file size works on Linux but is not portable and is even in violation of Posix, which explicitly states that mapping beyond the file size is undefined behaviour. The only portable way is to not use string functions on memory maps.
The last parameter of mmap is the offset within the file, where the part of file mapped to memory starts. It shall be 0 in your case
data = mmap(NULL, pageSize, PROT_READ, MAP_SHARED, fp,0);
If your file is shorter than pageSize, you will not be able to use addresses beyond the end of file. To use the full size, you shall expand the size to pageSize before calling mmap. Use something like:
ftruncate(fp, pageSize);
If you want to write to the memory (file) you shall use flag PROT_WRITE as well. I.e.
data = mmap(NULL, pageSize, PROT_READ|PROT_WRITE, MAP_SHARED, fp,0);
If your file does not contain 0 character (as end of string) and you want to print it as a string, you shall use printf with explicitly specified maximum size:
printf("%.*s\n", pageSize, data);
Also, of course, as pointed by #Jongware, you shall test result of open for -1 and mmap for MAP_FAILED.

C/Linux: dual-map memory with different permissions

My program passes data pointers to third-party plugins with the intention that the data should be read-only, so it would be nice to prevent the plugins from writing to the data objects. Ideally there would be a segfault if a plugin attempts a write. I've heard there is some way to double-map a memory region, such that a second virtual address range points to the same physical memory pages. The second mapping would not have write permission, and the exported pointers would use this address range instead of the original (writable) one. I would prefer not to change the original memory allocations, whether they happen to use malloc or mmap or whatever. Can someone explain how to do this?
It is possible to get a dual mapping, but it requires some work.
The only way I know how to create such a dual mapping is to use the mmap function call. For mmap you need some kind of file-descriptor. Fortunately Linux allows you to get a shared memory object, so no real file on a storage medium is required.
Here is a complete example that shows how to create the shared memory object, creates a read/write and read-only pointer from it and then does some basic tests:
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
int main()
{
// Lets do this demonstration with one megabyte of memory:
const int len = 1024*1024;
// create shared memory object:
int fd = shm_open("/myregion", O_CREAT | O_RDWR, S_IRUSR | S_IWUSR);
printf ("file descriptor is %d\n", fd);
// set the size of the shared memory object:
if (ftruncate(fd, len) == -1)
{
printf ("setting size failed\n");
return 0;
}
// Now get two pointers. One with read-write and one with read-only.
// These two pointers point to the same physical memory but will
// have different virtual addresses:
char * rw_data = mmap(0, len, PROT_READ|PROT_WRITE, MAP_SHARED, fd,0);
char * ro_data = mmap(0, len, PROT_READ , MAP_SHARED, fd,0);
printf ("rw_data is mapped to address %p\n", rw_data);
printf ("ro_data is mapped to address %p\n", ro_data);
// ===================
// Simple test-bench:
// ===================
// try writing:
strcpy (rw_data, "hello world!");
if (strcmp (rw_data, "hello world!") == 0)
{
printf ("writing to rw_data test passed\n");
} else {
printf ("writing to rw_data test failed\n");
}
// try reading from ro_data
if (strcmp (ro_data, "hello world!") == 0)
{
printf ("reading from ro_data test passed\n");
} else {
printf ("reading from ro_data test failed\n");
}
printf ("now trying to write to ro_data. This should cause a segmentation fault\n");
// trigger the segfault
ro_data[0] = 1;
// if the process is still alive something didn't worked.
printf ("writing to ro_data test failed\n");
return 0;
}
Compile with: gcc test.c -std=c99 -lrt
For some reason I get a warning that ftruncate is not declared. No idea why. The code works well though. Example output:
file descriptor is 3
rw_data is mapped to address 0x7f1778d60000
ro_data is mapped to address 0x7f1778385000
writing to rw_data test passed
reading from ro_data test passed
now trying to write to ro_data. This should cause a segmentation fault
Segmentation fault
I've left the memory deallocation as an exercise for the reader :-)

Segmentation fault with Posix-C program using mmap and mapfile

Well I have this program and I get a segmentation fault: 11 (core dumped). After lots of checks I get this when the for loop gets to i=1024 and it tries to mapfile[i]=0. The program is about making a server and a client program that communicates by read/writing in a common file made in the server program. This is the server program and it prints the value inside before and after the change. I would like to see what's going on, if it's a problem with the mapping or just problem with memory of the *mapfile. Thanks!
#include <sys/shm.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <errno.h>
#include <math.h>
int main()
{
int ret, i;
int *mapfile;
system("dd if=/dev/zero of=/tmp/c4 bs=4 count=500");
ret = open("/tmp/c4", O_RDWR | (mode_t)0600);
if (ret == -1)
{
perror("File");
return 0;
}
mapfile = mmap(NULL, 2000, PROT_READ | PROT_WRITE, MAP_SHARED, ret, 0);
for (i=1; i<=2000; i++)
{
mapfile[i] = 0;
}
while(mapfile[0] != 555)
{
mapfile = mmap(NULL, 2000, PROT_READ | PROT_WRITE, MAP_SHARED, ret, 0);
if (mapfile[0] != 0)
{
printf("Readed from file /tmp/c4 (before): %d\n", mapfile[0]);
mapfile[0]=mapfile[0]+5;
printf("Readed from file /tmp/c4 (after) : %d\n", mapfile[0]);
mapfile[0] = 0;
}
sleep(1);
}
ret = munmap(mapfile, 2000);
if (ret == -1)
{
perror("munmap");
return 0;
}
close(ret);
return 0;
}
mapfile = mmap(NULL, 2000, PROT_READ | PROT_WRITE, MAP_SHARED, ret, 0);
for (i=1; i<=2000; i++)
{
mapfile[i] = 0;
}
In this code here, you see that you are requesting 2000 units of memory. In this case mmap is taking in a size_t type meaning that its looking for a size, and not an amount of things for memory. As #Mat mentioned, you will need t use the sizeof(int) operator in order to feed mmap the proper size it requires.
The other issue that should be noted about this code that may cause a problem for you down the road, is beginning your loop index at i=1 rather than i=0. Starting your index at 0 wil ensure that you are going from the indices 0 - 1999, which corresponds to the memory you are trying to allocate.
Overall here, it looks like what your trying to do is initialize the values of your memory to 0. perhaps you could do this easier by relying on a builtin function called memset:
void *memset(void *str, int c, size_t n)
your code then becomes:
mapfile = mmap(NULL, 2000*sizeof(int), PROT_READ | PROT_WRITE, MAP_SHARED, ret, 0);
void *returnedPointer = memset(mapfile, 0, 2000*sizeof(int));
docs for memset can be found here:
http://www.tutorialspoint.com/c_standard_library/c_function_memset.htm
You're requesting 2000 bytes from mmap, but treating the returned value as an array of 2000 ints. That can't work, an int is usually 4 or 8 bytes these days. You'll be writing past the end of the reserved memory in your loop.
Change the mmap calls to use 2000*sizeof(int). And while you're at it, give that 2000 constant a name (e.g. const int num_elems = 2000; near the top) and don't repeat the magic constant all over the place. And once that's done change it to 1024 or 2048 so that the resulting size is a multiple of the page size (if you're not sure of your page size, getconf PAGE_SIZE on the command line).
And also change your dd command to create a large-enough file. It is currently creating a 2000 byte file, you'll need to increase that as well.
And validate the return value of mmap - it can fail, and you should detect that.
Finally, don't continuously remap, you're using MAP_SHARED modifications through other shared mappings of the same file and offset will be visible to your process. (Must really be the same file, if the other process also does a dd, that might not work. Only one process should have the responsibility of creating that file.)
If you do want to remap, you must also unmap each time. Otherwise you're leaking mappings.

Does any operating system allow an application programmer to create pointers out of thunks?

Many operating systems allow one to memory map files, and read from them lazily. If the operating system can do this then effectively it has the power to create regular pointers out of thunks.
Does any operating system allow an application programmer to create pointers out of his own thunks?
I know that to a limited extent operating systems already support this capability because one can create a pipe, map it into memory, and connect a process to the pipe to accomplish some of what I'm talking about so this functionality doesn't seem too impossible or unreasonable.
A simple example of this feature is a pointer which counts the number of times it's been dereferenced. The following program would output zero, and then one.
static void setter(void* environment, void* input) {
/* You can't set the counter */
}
static void getter(void* environment, void* output) {
*((int*)output) = *((int*)environment)++;
}
int main(int argc, char** argv) {
volatile int counter = 0;
volatile int * const x = map_special_voodoo_magic(getter, setter, sizeof(*x),
&counter);
printf("%i\n", *x);
printf("%i\n", *x);
unmap_special_voodoo_magic(x);
}
P.S. A volatile qualifier is required because the value pointed to by x changes unexpectedly right? As well, the compiler has no reason to think that dereferencing x would change counter so a volatile is needed there too right?
The following code has no error checking, and is not a complete implementation. However, the following code does illustrate the basic idea of how to make a thunk into a pointer. In particular, the value 4 is calculated on demand and not assigned to memory earlier.
#include <stdlib.h>
#include <stdio.h>
#include <sys/mman.h>
#include <signal.h>
void * value;
static void catch_function(int signal, siginfo_t* siginfo,
void* context) {
if (value != siginfo->si_addr) {
puts("ERROR: Wrong address!");
exit(1);
}
mmap(value, sizeof(int), PROT_READ | PROT_WRITE, MAP_FIXED | MAP_SHARED
| MAP_ANON, -1, 0);
*((int*)value) = 2 * 2;
}
int main(void) {
struct sigaction new_action;
value = mmap(NULL, sizeof(int), PROT_NONE, MAP_ANON | MAP_SHARED, -1,
0);
sigemptyset(&new_action.sa_mask);
new_action.sa_sigaction = catch_function;
new_action.sa_flags = SA_SIGINFO;
sigaction(SIGBUS, &new_action, NULL);
printf("The square of 2 is: %i\n", *((int*)value));
munmap(value, sizeof(int));
return 0;
}

Can I do a copy-on-write memcpy in Linux?

I have some code where I frequently copy a large block of memory, often after making only very small changes to it.
I have implemented a system which tracks the changes, but I thought it might be nice, if possible to tell the OS to do a 'copy-on-write' of the memory, and let it deal with only making a copy of those parts which change. However while Linux does copy-on-write, for example when fork()ing, I can't find a way of controlling it and doing it myself.
Your best chance is probably to mmap() the original data to file, and then mmap() the same file again using MAP_PRIVATE.
Depending on what exactly it is that you are copying, a persistent data structure might be a solution for your problem.
Its easier to implement copy-on-write in a object oriented language, like c++. For example, most of the container classes in Qt are copy-on-write.
But if course you can do that in C too, it's just some more work. When you want to assign your data to a new data block, you don't do a copy, instead you just copy a pointer in a wrapper strcut around your data. You need to keep track in your data blocks of the status of the data. If you now change something in your new data block, you make a "real" copy and change the status.
You can't of course no longer use the simple operators like "=" for assignment, instead need to have functions (In C++ you would just do operator overloading).
A more robust implementation should use reference counters instead of a simple flag, I leave it up to you.
A quick and dirty example:
If you have a
struct big {
//lots of data
int data[BIG_NUMBER];
}
you have to implement assign functions and getters/setters yourself.
// assume you want to implent cow for a struct big of some kind
// now instead of
struct big a, b;
a = b;
a.data[12345] = 6789;
// you need to use
struct cow_big a,b;
assign(&a, b); //only pointers get copied
set_some_data(a, 12345, 6789); // now the stuff gets really copied
//the basic implementation could look like
struct cow_big {
struct big *data;
int needs_copy;
}
// shallow copy, only sets a pointer.
void assign(struct cow_big* dst, struct cow_big src) {
dst->data = src.data;
dst->needs_copy = true;
}
// change some data in struct big. if it hasn't made a deep copy yet, do it here.
void set_some_data(struct cow_big* dst, int index, int data } {
if (dst->needs_copy) {
struct big* src = dst->data;
dst->data = malloc(sizeof(big));
*(dst->data) = src->data; // now here is the deep copy
dst->needs_copy = false;
}
dst->data[index] = data;
}
You need to write constructors and destructors as well. I really recommend c++ for this.
The copy-on-write mechanism employed e.g. by fork() is a feature of the MMU (Memory Management Unit), which handles the memory paging for the kernel. Accessing the MMU is a priviledged operation, i.e. cannot be done by a userspace application. I am not aware of any copy-on-write API exported to user-space, either.
(Then again, I am not exactly a guru on the Linux API, so others might point out relevant API calls I have missed.)
Edit: And lo, MSalters rises to the occasion. ;-)
You should be able to open your own memory via /proc/$PID/mem and then mmap() the interesting part of it with MAP_PRIVATE to some other place.
Here's a working example:
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#define SIZE 4096
int main(void) {
int fd = shm_open("/tmpmem", O_RDWR | O_CREAT, 0666);
int r = ftruncate(fd, SIZE);
printf("fd: %i, r: %i\n", fd, r);
char *buf = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
MAP_SHARED, fd, 0);
printf("debug 0\n");
buf[SIZE - 2] = 41;
buf[SIZE - 1] = 42;
printf("debug 1\n");
// don't know why this is needed, or working
//r = mmap(buf, SIZE, PROT_READ | PROT_WRITE,
// MAP_FIXED, fd, 0);
//printf("r: %i\n", r);
char *buf2 = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
MAP_PRIVATE, fd, 0);
printf("buf2: %i\n", buf2);
buf2[SIZE - 1] = 43;
buf[SIZE - 2] = 40;
printf("buf[-2]: %i, buf[-1]: %i, buf2[-2]: %i, buf2[-1]: %i\n",
buf[SIZE - 2],
buf[SIZE - 1],
buf2[SIZE - 2],
buf2[SIZE - 1]);
unlink(fd);
return EXIT_SUCCESS;
}
I'm a little unsure of whether I need to enable the commented out section, for safety.

Resources