Mmap a large 1TB file and write 1's to it? - c

I am very new to mmap and memset. I have been assigned a task to create a large file (1TB) and write 1's to it as we are trying to understand the performance.
Now from what I understand, I can fallocate a 1 TB file, then in a C function mmap it with PROT_READ, PROT_WRITE and MAP_SHARED, and fill the mmap'ed pointer with memset, like this:
int fd = open(FILEPATH, O_RDWR, (mode_t)0700);
size_t data_length = 1000000000000;
char *data = (char*)mmap(NULL, data_length, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_POPULATE, fd , 0);
memset(data, '1', data_length);
Is this correct?
Do I need to sync or anything to make this data persistent?
If so, and I'm really writing a 1 TB file, why does my function finish in a split second?
I've tried to cat the output file and there are indeed 1's in it; it's just that my terminal craps out after a few seconds, but the data is actually getting written.
Am I doing this correctly, or should I write the data to memory some other way than memset? If so, how should I do it?
Thanks for any help.

Related

Read file stored in segment and send bytes

I have a file I stored in a structure in the segment from a different process A. Now from process B I need to get this file and convert it to bytes so I can send it or send it while reading its bytes , what would be an ideal way of doing this? see below:
typedef struct mysegment_struct_t {
    FILE *stream;
    size_t size;
} mysegment_struct_t;
So I have the mapping to the segment and everything; I'm just not sure how to get at the file now:
size_t bytes_sent = 0;
struct mysegment_struct_t *fileinfo =
(struct mysegment_struct_t *)mmap(NULL,size,PROT_READ | PROT_WRITE, MAP_SHARED, fd,0);
//read stream into a byte array? (how can this be done in c)
//FILE *f = fopen(fileinfo->stream, "w+b"); //i a bit lost here, the file is in the segment already
//send bytes
while (bytes_sent < fileinfo->size) {
bytes_sent +=send_to_client(buffer, size); //some buffer containing bytes?
}
I am kind of new to C programming, but I can't find anything like "read the file in memory into a byte array", for example.
Thanks
from blog https://www.softprayog.in/programming/interprocess-communication-using-posix-shared-memory-in-linux
there has to be a way i can share the file between processes using the shared memory.
You simply can't do this. The pointer stream points to objects that only exist in the memory of process A, and are not in the shared memory area (and even if they were, they wouldn't typically be mapped at the same address). You're going to have to design something else.
One possibility is to send the file descriptor over a Unix domain socket, see Portable way to pass file descriptor between different processes. However, it is probably worth stepping back and thinking about why you want to pass an open file between processes in the first place, and whether there is a better way to achieve your overall goal.
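As a rough illustration of the file-descriptor route mentioned above, here is a minimal sketch that passes an open descriptor over an already connected Unix domain socket using SCM_RIGHTS ancillary data. The helper names send_fd and recv_fd are made up for this example, and how the two processes obtain the connected socket (socketpair before fork, or a named AF_UNIX socket) is assumed to be handled elsewhere.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: send the open descriptor fd_to_send over the
 * connected Unix domain socket sock. Returns 0 on success, -1 on error. */
int send_fd(int sock, int fd_to_send)
{
    assert(fd_to_send >= 0);

    char dummy = 'F';                 /* one byte of ordinary data travels with the fd */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {                           /* the union keeps the buffer aligned for cmsghdr */
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    struct msghdr msg;

    memset(&ctrl, 0, sizeof ctrl);
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;     /* "this control message carries descriptors" */
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd_to_send, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive a descriptor sent by send_fd().
 * Returns the new descriptor (valid in this process), or -1 on error. */
int recv_fd(int sock)
{
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    struct msghdr msg;
    int fd = -1;

    memset(&ctrl, 0, sizeof ctrl);
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

Process A would call send_fd(sock, fileno(stream)) and process B would call recv_fd(sock) and then read() from the result (or wrap it with fdopen()). The descriptor number B receives will usually differ from the one A sent, but both refer to the same open file.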

MMAP segmentation fault

int fp, page;
char *data;
if(argc > 1){
printf("Read the docs");
exit(1);
}
fp = open("log.txt", O_RDONLY); //Opening file to read
page = getpagesize();
data = mmap(0, page, PROT_READ, 0,fp, 0);
initscr(); // Creating the ncurse screen
clear();
move(0, 0);
printw("%s", data);
endwin(); //Ends window
fclose(fp); //Closing file
return 0;
Here is my code. I keep getting a segmentation fault for some reason.
All my header files have been included, so that's not the problem (clearly, because it's something to do with memory). Thanks in advance.
Edit: Got it. The data wasn't being formatted as a string, and I also had to use stat() to get the file info rather than getpagesize().
You can't fclose() a file descriptor you got from open(). You must use close(fp) instead. What you are doing is passing a small int that gets treated as a pointer, which causes a segmentation fault.
Note that your choice of identifier naming is unfortunate. Usually fp would be a pointer-to-FILE (FILE*, as used by the standard IO library), while fd would be a file descriptor (a small integer), used by the kernel's IO system calls.
Your compiler should have told you that you pass an int where a pointer-to-FILE was expected, or that you use fclose() without a prototype in scope. Did you enable the maximum warning level of your compiler?
Another segfault is possible if the data pointer does not point to a NUL (0) terminated string. Does your log.txt contain NUL-terminated strings?
You should also check whether mmap() failed, which it signals by returning MAP_FAILED.
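To pull those points together, here is a minimal sketch of one way to map a whole file read-only with the checks in place. The function names map_whole_file and print_mapped are invented for this example; the file size comes from fstat(), the descriptor is released with close(), and the data is printed with an explicit length so no NUL terminator is needed.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map the whole file at path read-only.
 * On success returns the mapping and stores its length in *len_out.
 * Returns NULL on any error or if the file is empty. */
char *map_whole_file(const char *path, size_t *len_out)
{
    assert(path != NULL && len_out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return NULL;
    }

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }

    /* Exactly one of MAP_SHARED or MAP_PRIVATE must be given. */
    char *data = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);

    /* A descriptor from open() is released with close(), not fclose().
     * The mapping stays valid after the descriptor is closed. */
    close(fd);

    if (data == MAP_FAILED) {
        perror("mmap");
        return NULL;
    }

    *len_out = (size_t)st.st_size;
    return data;
}

/* Print exactly len bytes; the mapped data need not be NUL-terminated. */
void print_mapped(const char *data, size_t len)
{
    fwrite(data, 1, len, stdout);
}
```

With ncurses, the equivalent of print_mapped would be something like printw("%.*s", (int)len, data), which also bounds the output by the length instead of relying on a terminator. When finished, release the mapping with munmap(data, len).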
Okay so here is the code that got it working
#include <sys/stat.h>
int status;
struct stat s;
status = stat(file, &s);
if(status < 0){
    perror("stat");
    exit(1);
}
data = mmap(NULL, s.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
Before, I was using getpagesize(). Thanks, beej!
mmap's man page gives you information on the parameters:
void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);
As you can see, your second argument may be wrong (unless you really want to map only the part of the file that fits into a single page).
Also: Probably 0 is not a valid flag value? Let's have a look again at the man page:
The flags argument determines whether updates to the mapping are
visible to other processes mapping the same region, and whether
updates are carried through to the underlying file. This behavior is
determined by including exactly one of the following values in flags: MAP_SHARED or MAP_PRIVATE
So you could try something like
data = mmap(0, size, PROT_READ, MAP_SHARED, fp, 0);
Always use the symbolic flag names, as the underlying values may differ from machine to machine.
Also, the mapped area should not be larger than the underlying file. Check the size of log.txt beforehand.
The second argument to mmap should not be page size, it should be the size of your file. Here is a nice example.

copy_to_user fails to copy data to mmap user from kernel?

In the user space program I am allocating some memory via mmap with the following calls:
void *memory;
int fd;
fd = open(filepath, O_RDWR);
if (fd < 0)
return errno;
memory = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED, fd, 0);
if (memory == MAP_FAILED)
return -1;
//syscall() goes here
In the kernel space in my system call I am trying to copy data to the memory mapped region as follows:
copy_to_user(memory,src,4096);
EDIT: added error checking code to the post for clarification
The copy_to_user() call repeatedly fails in this case, whereas if I use memory = malloc() instead, it always succeeds.
Am I getting some permission flags wrong in this case for mmap ?
Does the open succeed? What about mmap? Is the target file big enough? Can you write to the file through the mapping in userspace?
Also, the repeated 4096 is a strong hint that your code is wrong. Userspace should pass the expected size instead.
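One way to answer those questions before involving the kernel at all is a small userspace check. The sketch below is only an illustration: check_writable_mapping is a made-up name, and requesting PROT_READ | PROT_WRITE (rather than PROT_WRITE alone, as in the question) is a choice made for this example. It overwrites the start of the file, so it should only be pointed at a scratch file.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical diagnostic: returns 0 if the first len bytes of the file at
 * path can be mapped MAP_SHARED and written from userspace, -1 otherwise.
 * WARNING: destroys the first len bytes of the file. */
int check_writable_mapping(const char *path, size_t len)
{
    assert(path != NULL);

    int fd = open(path, O_RDWR);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    struct stat st;
    if (fstat(fd, &st) < 0) {
        perror("fstat");
        close(fd);
        return -1;
    }
    if (st.st_size < 0 || (size_t)st.st_size < len) {
        /* Touching mapped pages that lie wholly past end-of-file raises SIGBUS. */
        fprintf(stderr, "file has %lld bytes, need %zu\n",
                (long long)st.st_size, len);
        close(fd);
        return -1;
    }

    unsigned char *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_SHARED, fd, 0);
    close(fd);
    if (mem == MAP_FAILED) {
        perror("mmap");
        return -1;
    }

    memset(mem, 0xAB, len);              /* write through the mapping */
    int rc = msync(mem, len, MS_SYNC);   /* push the pages to the file */
    munmap(mem, len);
    return rc == 0 ? 0 : -1;
}
```

If this check fails, the problem is in the open/mmap/file-size setup rather than in copy_to_user() itself. Passing len in from the caller also avoids hard-coding 4096 in two places.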

how to decrypt large files in memory using c

A C-based application uses AES with a key size of 256 bits. The data is in binary form; it is encrypted and written to a binary file. The requirement is to decrypt this binary file in RAM (i.e. on-the-fly / real-time encryption). The question is how to achieve on-the-fly encryption efficiently. Any good web links or code references for understanding on-the-fly encryption would be appreciated.
In more simple way the question is how to decrypt large files in memory using c (Linux)? Like in truecrypt.
Use mmap on the file; the file contents are then accessible as a byte array in memory. For example, a simple memory-changing function that XORs each byte of a large (say, 400 GB) file with a key byte:
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
// The "encryption" function: XOR every byte with a one-byte key
void xor_ram (unsigned char *buffer, size_t len, unsigned char key) {
    while (len--) *buffer++ ^= key;
}
// The file we want to encrypt
int fd = open ("/path/to/file", O_RDWR);
// Figure out the file length
struct stat st;
fstat (fd, &st);
size_t length = (size_t) st.st_size;
// Memory map the file using the fd; with MAP_PRIVATE the changes
// stay in memory and are not written back to the file
unsigned char *mapped_file = mmap (NULL, length,
    PROT_READ | PROT_WRITE, MAP_PRIVATE,
    fd, 0);
// Call the encryption function (0x5A is an arbitrary example key)
xor_ram (mapped_file, length, 0x5A);
// All done now
munmap (mapped_file, length);
close (fd);
You can read the manpage for mmap here: http://unixhelp.ed.ac.uk/CGI/man-cgi?mmap
Although you should really find the documentation for mmap on your particular platform (man mmap if you're on a Unix system of some sort, or search the platform's libraries if not).

Is there really no mremap in Darwin?

I'm trying to find out how to remap memory-mapped files on a Mac (when I want to expand the available space).
I see our friends in the Linux world have mremap but I can find no such function in the headers on my Mac. /Developer/SDKs/MacOSX10.6.sdk/usr/include/sys/mman.h has the following:
mmap
mprotect
msync
munlock
munmap
but no mremap
man mremap confirms my fears.
I'm currently having to munmap and mmap if I want to change the size of the mapped file, which involves invalidating all the loaded pages. There must be a better way, surely?
I'm trying to write code that will work on Mac OS X and Linux. I could settle for a macro to use the best function in each case if I had to but I'd rather do it properly.
If you need to shrink the map, just munmap the part at the end you want to remove.
If you need to enlarge the map, you can mmap the proper offset with MAP_FIXED to the addresses just above the old map, but you need to be careful that you don't map over something else that's already there...
The MAP_FIXED suggestion above (struck out in the original answer) is an awful idea; MAP_FIXED is fundamentally wrong unless you already know what's at the target address and want to atomically replace it. If you are trying to opportunistically map something new if the address range is free, you need to use mmap with a requested address but without MAP_FIXED and see if it succeeds and gives you the requested address; if it succeeds but with a different address, you'll want to unmap the new mapping you just created and assume that allocation at the requested address was not possible.
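The hint-without-MAP_FIXED approach described above could look roughly like this. It is a sketch under assumptions: try_extend_in_place is a made-up name, the existing mapping is assumed to be MAP_SHARED and page-aligned in length, and the file is assumed to already be large enough to back the new range.

```c
#include <assert.h>
#include <fcntl.h>      /* open() flags for callers */
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: try to map `extra` bytes of fd, starting at file
 * offset `offset`, directly after the existing mapping [base, base + old_len).
 * old_len and offset must be multiples of the page size.
 * Returns 0 if the mapping was extended in place, -1 if the kernel could not
 * (or chose not to) place it at the requested address. */
int try_extend_in_place(void *base, size_t old_len, int fd,
                        off_t offset, size_t extra)
{
    assert(old_len % (size_t)sysconf(_SC_PAGESIZE) == 0);

    char *want = (char *)base + old_len;

    /* No MAP_FIXED: the address is only a hint, so nothing that already
     * lives at `want` can be silently replaced. */
    void *got = mmap(want, extra, PROT_READ | PROT_WRITE, MAP_SHARED,
                     fd, offset);
    if (got == MAP_FAILED)
        return -1;

    if (got != (void *)want) {
        /* The kernel picked a different spot; undo and report failure. */
        munmap(got, extra);
        return -1;
    }
    return 0;
}
```

On failure the caller still has the original mapping untouched and can fall back to munmap followed by a fresh, larger mmap.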
If you expand in large enough chunks (say, 64 MB, but it depends on how fast it grows) then the cost of invalidating the old map is negligible. As always, benchmark before assuming a problem.
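A minimal sketch of that chunked-growth idea, using only munmap and mmap so it should work on both Mac OS X and Linux. The names GROW_CHUNK and ensure_mapped are invented for this example, and 64 MB is just the figure from the text, not a recommendation.

```c
#include <assert.h>
#include <fcntl.h>      /* open() flags for callers */
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define GROW_CHUNK ((size_t)64 << 20)   /* 64 MB; tune by benchmarking */

/* Round `needed` up to a whole number of chunks. */
static size_t round_up_to_chunk(size_t needed)
{
    return (needed + GROW_CHUNK - 1) / GROW_CHUNK * GROW_CHUNK;
}

/* Hypothetical helper: make sure at least `needed` bytes of the file behind
 * fd are mapped. *map and *map_len describe the current mapping (NULL and 0
 * if there is none) and are updated on success.
 * Returns 0 on success, -1 on error. If the final mmap fails, the old
 * mapping is already gone and *map is reset to NULL. */
int ensure_mapped(int fd, void **map, size_t *map_len, size_t needed)
{
    assert(map != NULL && map_len != NULL);

    if (needed <= *map_len)
        return 0;                        /* common case: nothing to do */

    size_t new_len = round_up_to_chunk(needed);

    /* Grow the file first so the whole new mapping is backed by it. */
    if (ftruncate(fd, (off_t)new_len) != 0)
        return -1;

    if (*map != NULL && munmap(*map, *map_len) != 0)
        return -1;

    void *p = mmap(NULL, new_len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        *map = NULL;
        *map_len = 0;
        return -1;
    }

    *map = p;
    *map_len = new_len;
    return 0;
}
```

Because the mapping only changes when a chunk boundary is crossed, the remap cost is paid once per 64 MB of growth. The base address may change on each remap, so callers should store offsets rather than raw pointers into the mapping.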
You can ftruncate the file to a large size (creating a hole) and mmap all of it. If the file is persistent I recommend filling the hole with write calls rather than by writing in the mapping, as otherwise the file's blocks may get unnecessarily fragmented on the disk.
I have no experience with memory mapping, but it looks like you can temporarily map the same file twice as a means to expand the mapping without losing anything.
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main() {
    int fd;
    char *fp, *fp2, *pen;
    /* create 1K file */
    fd = open( "mmap_data.txt", O_RDWR | O_CREAT, 0777 );
    lseek( fd, 1000, SEEK_SET );
    write( fd, "a", 1 );
    /* map and populate it */
    fp = mmap( NULL, 1000, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0 );
    pen = memset( fp, 'x', 1000 );
    /* expand to 8K and establish overlapping mapping */
    lseek( fd, 8000, SEEK_SET );
    write( fd, "b", 1 );
    fp2 = mmap( NULL, 7000, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0 );
    /* demonstrate that mappings alias */
    *fp = 'z';
    printf( "%c ", *fp2 );
    /* eliminate first mapping */
    munmap( fp, 1000 );
    /* populate second mapping, staying inside its 7000 bytes */
    pen = memset( fp2+10, 'y', 6990 );
    /* wrap up */
    munmap( fp2, 7000 );
    close( fd );
    printf( "%d\n", errno );
}
The resulting file contains zxxxxxxxxxyyyyyy.....
I suppose, if you pound on this, it may be possible to run out of address space faster than with mremap. But nothing is guaranteed either way and it might on the other hand be just as safe.

Resources