Why (ftruncate+mmap+memcpy) is faster than (write)?

I found a different way to write data that is faster than the normal Unix write function.
First, ftruncate the file to the length we need, then mmap that region of the file, and finally use memcpy to fill in the file content. The example code is below.
As far as I know, mmap maps the file into the process address space and speeds things up by bypassing the page cache. But I have no idea why it speeds up writing.
Did I write a wrong test case, or can this be used as an optimization trick?
Here is the test code. Some of it is written in ObjC, but that doesn't matter. WCTTicker is just a timing class that uses gettimeofday.
//find a dir to test
NSString* document = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES)[0];
NSString* dir = [document stringByAppendingPathComponent:@"testDir"];
//remove all existing test
if ([[NSFileManager defaultManager] fileExistsAtPath:dir]) {
if (![[NSFileManager defaultManager] removeItemAtPath:dir error:nil]) {
NSLog(@"fail to remove dir");
return;
}
}
//create dir to test
if (![[NSFileManager defaultManager] createDirectoryAtPath:dir withIntermediateDirectories:YES attributes:nil error:nil]) {
NSLog(@"fail to create dir");
}
//pre-alloc memory
const int length = 10000000;
const int count = 100;
char* mem = (char*)malloc(length);
memset(mem, 'T', length);
{
//start testing mmap
// ftruncate && mmap(private) &&memcpy
NSString* mmapFileFormat = [dir stringByAppendingPathComponent:@"privateMmapFile%d"];
[WCTTicker tick];
for (int i = 0; i < count; i++) {
NSString* path = [[NSString alloc] initWithFormat:mmapFileFormat, i];
int fd = open(path.UTF8String, O_CREAT | O_RDWR, S_IRWXG | S_IRWXU | S_IRWXO);
if (fd<0) {
NSLog(@"fail to open");
}
int rc = ftruncate(fd, length);
if (rc<0) {
NSLog(@"fail to truncate");
}
char* map = (char*)mmap(NULL, length, PROT_WRITE | PROT_READ, MAP_PRIVATE, fd, 0);
if (!map) {
NSLog(@"fail to mmap");
}
memcpy(map, mem, length);
close(fd);
}
[WCTTicker stop];
}
{
//start testing write
// normal write
NSString* writeFileFormat = [dir stringByAppendingPathComponent:@"writeFile%d"];
[WCTTicker tick];
for (int i = 0; i < count; i++) {
NSString* path = [[NSString alloc] initWithFormat:writeFileFormat, i];
int fd = open(path.UTF8String, O_CREAT | O_RDWR, S_IRWXG | S_IRWXU | S_IRWXO);
if (fd<0) {
NSLog(@"fail to open");
}
int written = (int)write(fd, mem, length);
if (written!=length) {
NSLog(@"fail to write");
}
close(fd);
}
[WCTTicker stop];
}
{
//start testing mmap
// ftruncate && mmap(shared) &&memcpy
NSString* mmapFileFormat = [dir stringByAppendingPathComponent:@"sharedMmapFile%d"];
[WCTTicker tick];
for (int i = 0; i < count; i++) {
NSString* path = [[NSString alloc] initWithFormat:mmapFileFormat, i];
int fd = open(path.UTF8String, O_CREAT | O_RDWR, S_IRWXG | S_IRWXU | S_IRWXO);
if (fd<0) {
NSLog(@"fail to open");
}
int rc = ftruncate(fd, length);
if (rc<0) {
NSLog(@"fail to truncate");
}
char* map = (char*)mmap(NULL, length, PROT_WRITE | PROT_READ, MAP_SHARED, fd, 0);
if (!map) {
NSLog(@"fail to mmap");
}
memcpy(map, mem, length);
close(fd);
}
[WCTTicker stop];
}
Here is the test result (in the order the tests run: private mmap, write, shared mmap):
2016-07-05 11:44:08.425 TestCaseiOS[4092:1070240]
0: 1467690246.689788, info: (null)
1: 1467690248.419790, cost 1.730002, info: (null)
2016-07-05 11:44:14.126 TestCaseiOS[4092:1070240]
0: 1467690248.427097, info: (null)
1: 1467690254.126590, cost 5.699493, info: (null)
2016-07-05 11:44:14.814 TestCaseiOS[4092:1070240]
0: 1467690254.126812, info: (null)
1: 1467690254.813698, cost 0.686886, info: (null)

You call mmap() without a corresponding munmap().
From the mmap manual page (Linux):
MAP_SHARED Share this mapping. Updates to the mapping are visible
to other processes that map this file, and are carried through to the
underlying file. The file may not actually be updated until msync(2)
or munmap() is called.
Perhaps you should run your tests again so that there is a call to munmap:
char* map = (char*)mmap(NULL, length, PROT_WRITE | PROT_READ, MAP_SHARED, fd, 0);
if (map == MAP_FAILED) { // mmap reports failure with MAP_FAILED, not NULL
NSLog(@"fail to mmap");
}
memcpy(map, mem, length);
munmap(map, length);
close(fd);
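For a fairer comparison, the mmap path can be made to wait for the data to reach the file before the timer stops. The following is a minimal plain C sketch of that idea, not the asker's benchmark: the helper name write_with_mmap is hypothetical, and it assumes a POSIX system where msync with MS_SYNC blocks until the dirty pages are written back.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: writes len bytes from src to path through a shared
 * mapping and flushes them before returning.
 * Returns 0 on success, -1 on any failure. */
int write_with_mmap(const char *path, const void *src, size_t len)
{
    assert(len > 0);                  /* mmap rejects a zero length */

    int fd = open(path, O_CREAT | O_RDWR | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd < 0)
        return -1;

    /* Grow the file first so the mapped pages are backed by the file. */
    if (ftruncate(fd, (off_t)len) < 0) {
        close(fd);
        return -1;
    }

    void *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {          /* failure is MAP_FAILED, not NULL */
        close(fd);
        return -1;
    }

    memcpy(map, src, len);

    int rc = 0;
    if (msync(map, len, MS_SYNC) < 0) /* wait for the write-back */
        rc = -1;
    if (munmap(map, len) < 0)
        rc = -1;
    if (close(fd) < 0)
        rc = -1;
    return rc;
}
```

Timing this helper against a loop that calls write() followed by fsync() would compare both paths with the flush included; without the flush, the mmap loop may only be measuring the memcpy.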

Even with the munmap (or msync) added, I think this should be faster at least for big data transfers because write() results in a copy operation while mmap and access to the map do not.

Related

How to detect which piece of my code is generating "Fatal error: glibc detected an invalid stdio handle" error?

I'm trying to write a simple program that copies one file into another using two processes.
I want to use shared memory to hold both open files, plus one byte of shared memory used as an exchange buffer in a mutually exclusive way.
The main process should open both files, put them in shared memory, and then fork twice to obtain two processes, A and B.
Process A should read one byte from the first file, put it in the shared exchange memory, and unlock the mutex for process B.
Process B should copy the byte from the shared exchange memory into its file and unlock the mutex for process A.
And so on.
#define SIZE 4096
void reader_process(FILE* fptr,char*exch, sem_t*mut){
while(1){
sem_wait(mut);
*exch = (char) getc(fptr);
sem_post(mut+1);
}
}
void writer_process(FILE* fptr,char*exch, sem_t*mut){
if(*exch == EOF){
printf("done\n");
exit(0);
}
while(1){
sem_wait(mut);
putc((int)*exch,fptr);
sem_post(mut-1);
}
}
int main(int argc, char *argv[]){
FILE* shared_f_ptr[2];
pid_t pid;
//2 files name.
char *files[2];
int fd[2];
//open files.
files[0] = argv[1];
printf("%s\n",files[0]);
FILE* fpointer1 = fopen(files[0],"r+");
if (fpointer1 == NULL){
perror("fopen\n");
exit(-1);
}
fd[0] = fileno(fpointer1);
files[1] = argv[2];
printf("%s\n",files[1]);
FILE* fpointer2 = fopen(files[1],"r+");
if (fpointer2 == NULL){
perror("fopen\n");
exit(-1);
}
fd[1] = fileno(fpointer2);
//shared File pointers.
shared_f_ptr[0] = (FILE*)mmap(NULL, SIZE*sizeof(char),
PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS,
fd[0], 0);
if (shared_f_ptr[0] == MAP_FAILED){
perror("mmap\n");
exit(-1);
}
shared_f_ptr[1] = (FILE*)mmap(NULL, SIZE*sizeof(char),
PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS,
fd[1], 0);
if (shared_f_ptr[1] == MAP_FAILED){
perror("mmap\n");
exit(-1);
}
//shared mem for 1B exchange.
char *shared_exchange = (char*)mmap(NULL, sizeof(char),
PROT_READ | PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS,
-1, 0);
if (shared_exchange == MAP_FAILED){
perror("mmap\n");
exit(-1);
}
//mutex.
sem_t *mut = (sem_t*)mmap(NULL, 2*sizeof(sem_t),
PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS,
-1, 0);
sem_init(&mut[0],1,0);
sem_init(&mut[1],1,0);
//fork.
pid = fork();
if (pid == 0) {
reader_process(shared_f_ptr[0],
shared_exchange, &mut[0]);
}
if (pid == -1){
perror("fork\n");
exit(-1);
}
else pid = fork();
if (pid == 0) writer_process(shared_f_ptr[1],
shared_exchange, &mut[1]);
if (pid == -1){
perror("fork\n");
exit(-1);
}
else{
sem_post(&mut[0]);
}
}
I don't expect the error I am getting, "Fatal error: glibc detected an invalid stdio handle", and I don't really know how to find what's causing it.

Don't do this:
//shared File pointers.
shared_f_ptr[0] = (FILE*)mmap(NULL, SIZE*sizeof(char),
PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS,
fd[0], 0);
if (shared_f_ptr[0] == MAP_FAILED){
perror("mmap\n");
exit(-1);
}
shared_f_ptr[1] = (FILE*)mmap(NULL, SIZE*sizeof(char),
PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS,
fd[1], 0);
if (shared_f_ptr[1] == MAP_FAILED){
perror("mmap\n");
exit(-1);
}
Use this instead:
shared_f_ptr[0] = fpointer1;
shared_f_ptr[1] = fpointer2;
Don't use the file descriptors underlying each FILE. Instead, simply use the FILE itself.
Also, instead of using fpointer1 and fpointer2, just use shared_f_ptr[0] and shared_f_ptr[1].
This is a possible definition of the FILE structure:
typedef struct _IO_FILE
{
int __fd;
int __flags;
int __unget;
char *__buffer;
struct {
size_t __orig;
size_t __size;
size_t __written;
} __bufsiz;
fpos_t __fpos;
} FILE;
As you can see, it's a structure, not just a flat pointer, and a freshly mmap'ed region does not contain a valid one.
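To see why the mapped pointers cannot work as FILE handles, it may help to look at what such a mapping actually contains. The sketch below assumes Linux, where mmap ignores the fd argument when MAP_ANONYMOUS is set; the function names are hypothetical. It maps memory the same way the question does and checks that the region is all zero bytes, regardless of what the file holds.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS and mkstemp on glibc */
#endif
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical check: map size bytes with MAP_SHARED | MAP_ANONYMOUS while
 * passing a real file descriptor, as the question does.
 * Returns 1 if every byte is zero, 0 if not, -1 if mmap fails. */
int anon_mapping_is_zero_filled(int fd, size_t size)
{
    assert(size > 0);
    unsigned char *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                            MAP_SHARED | MAP_ANONYMOUS, fd, 0);
    if (p == MAP_FAILED)
        return -1;

    int all_zero = 1;
    for (size_t i = 0; i < size; i++) {
        if (p[i] != 0) {
            all_zero = 0;
            break;
        }
    }
    munmap(p, size);
    return all_zero;
}

/* Creates a temporary file with non-zero content and runs the check on it. */
int demo_anon_mapping(void)
{
    char path[] = "/tmp/anon_demo_XXXXXX";
    const char text[] = "not zero";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;

    int result = -1;
    if (write(fd, text, sizeof text) == (ssize_t)sizeof text)
        result = anon_mapping_is_zero_filled(fd, 4096);

    close(fd);
    unlink(path);
    return result;
}
```

Under that assumption, the memory has no connection to the file, so casting it to FILE* hands stdio a block of zeros rather than the stream that fopen returned.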

ELF - Getting a SEGFAULT when changing the entry point

I'm trying to patch the entry point of an ELF file directly via the e_entry field:
Elf64_Ehdr *ehdr = NULL;
Elf64_Phdr *phdr = NULL;
Elf64_Shdr *shdr = NULL;
if (argc < 2)
{
printf("Usage: %s <executable>\n", argv[0]);
exit(EXIT_SUCCESS);
}
fd = open(argv[1], O_RDWR);
if (fd < 0)
{
perror("open");
exit(EXIT_FAILURE);
}
if (fstat(fd, &st) < 0)
{
perror("fstat");
exit(EXIT_FAILURE);
}
/* map whole executable into memory */
mapped_file = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
if (mapped_file < 0)
{
perror("mmap");
exit(EXIT_FAILURE);
}
// check for an ELF file
check_elf(mapped_file, argv);
ehdr = (Elf64_Ehdr *) mapped_file;
phdr = (Elf64_Phdr *) &mapped_file[ehdr->e_phoff];
shdr = (Elf64_Shdr *) &mapped_file[ehdr->e_shoff];
mprotect((void *)((uintptr_t)&ehdr->e_entry & ~(uintptr_t)4095), 4096, PROT_READ | PROT_WRITE);
if (ehdr->e_type != ET_EXEC)
{
fprintf(stderr, "%s is not an ELF executable.\n", argv[1]);
exit(EXIT_FAILURE);
}
printf("Program entry point: %08x\n", ehdr->e_entry);
int text_found = 0;
uint64_t test_addr;
uint64_t text_end;
size_t test_len = strlen(shellcode);
int text_idx;
for (i = 0; i < ehdr->e_phnum; ++i)
{
if (text_found)
{
phdr[i].p_offset += PAGE_SIZE;
continue;
}
if (phdr[i].p_type == PT_LOAD && phdr[i].p_flags == ( PF_R | PF_X))
{
test_addr = phdr[i].p_vaddr + phdr[i].p_filesz;
text_end = phdr[i].p_vaddr + phdr[i].p_filesz;
printf("TEXT SEGMENT ends at 0x%x\n", text_end);
puts("Changing entry point...");
ehdr->e_entry = (Elf64_Addr *) test_addr;
memmove(test_addr, shellcode, test_len);
phdr[i].p_filesz += test_len;
phdr[i].p_memsz += test_len;
text_found++;
}
}
//patch sections
for (i = 0; i < ehdr->e_shnum; ++i)
{
if (shdr->sh_offset >= test_addr)
shdr->sh_offset += PAGE_SIZE;
else
if (shdr->sh_size + shdr->sh_addr == test_addr)
shdr->sh_size += test_len;
}
ehdr->e_shoff += PAGE_SIZE;
close(fd);
}
The shellcode in this case is just a bunch of NOPs with an int3 instruction at the end.
I made sure to adjust the segments and sections that come after this new code, but as soon as I patch the entry point the program crashes. Why is that?

Changing:
memmove(test_addr, shellcode, test_len);
to:
memmove(mapped_file + phdr[i].p_offset + phdr[i].p_filesz, shellcode, test_len);
seems to fix your problem. test_addr is a virtual address belonging to the file you have mapped; you cannot use it directly as a pointer. The values you want to work with are the address of the file mapping, p_offset, and p_filesz.

I suspect that you haven't enabled write access to the program's header. You can do this with something like:
const uintptr_t page_size = 4096;
mprotect((void *)((uintptr_t)&ehdr->e_entry & ~(page_size - 1)), page_size, PROT_READ | PROT_WRITE);
ehdr->e_entry = test_addr;

shm_open() leads to a No such file or directory

Description :
I have a project directory named "Projet" which contains two directories named "Serveur" and "Client".
(1) Serveur contains serveur.c (2) Client contains client.c
Following the man page, I chose the name "/shm_request_stack".
Source files description :
serveur.c :
#define SHM_REQUEST "/shm_request_stack"
int main(void) {
int shm = open_shm(SHM_REQUEST,
O_RDWR | O_CREAT | O_EXCL,
S_IRUSR | S_IWUSR);
unlink_shm(SHM_REQUEST);
size_t memsize = sizeof(int);
setsize_shm(shm, memsize);
int * ptr = project_shm(shm, memsize);
*ptr = 0;
while(*ptr == 0);
printf("Client modify the value\n");
}
client.c :
#define SHM_REQUEST "/shm_request_stack"
int main(void) {
int shm = open_shm(SHM_REQUEST,
O_RDWR,
S_IRUSR | S_IWUSR);
unlink_shm(SHM_REQUEST);
size_t memsize = sizeof(int);
int * ptr = project_shm(shm, memsize);
*ptr = 1;
}
Wrapper functions :
int open_shm(char *name, int oflag, mode_t mode) {
int shm = shm_open(name, oflag, mode);
if (shm == -1) {
fprintf(stderr, "Error while opening %s\n", strerror(errno));
exit(EXIT_FAILURE);
}
return shm;
}
void unlink_shm(char *name) {
if (shm_unlink(name) == -1) {
perror("shm_unlink");
exit(EXIT_FAILURE);
}
}
void setsize_shm(int shm, size_t size) {
if (ftruncate(shm, size) == -1) {
perror("ftruncate");
exit(EXIT_FAILURE);
}
}
void * project_shm(int shm, size_t size) {
int *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, shm, 0);
if (ptr == MAP_FAILED) {
perror("mmap");
exit(EXIT_FAILURE);
}
return ptr;
}
Problems :
The client can't find the named shm created by the server.
I tried to find the shared memory with ipcs -m, but it isn't listed.
I tried modifying the value from the server and it works, so the memory exists.
How can I successfully open the shm from the client?

You appear to be deleting the shared object immediately after creating it (the unlink).
It is a bit like a file. If you have an open reference to the object, it is retained, but the unlink removes the name. That is why the server can still write to the object after the unlink (shm is still in scope), but the client cannot open the object by name.
The critical words in the doc you quote are: "all open and map references". What you can't do is create a new reference after the unlink.
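The file analogy can be tried directly. The sketch below uses an ordinary temporary file as a stand-in for the shared memory object, on the assumption that shm_open and shm_unlink follow the same name-versus-reference rule as open and unlink; the function name is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Returns 1 if, after unlink, the descriptor that was already open still
 * works while a new open() by name fails with ENOENT.
 * Returns 0 otherwise, and -1 if the setup fails. */
int demo_unlink_keeps_open_reference(void)
{
    char path[] = "/tmp/unlink_demo_XXXXXX";
    int fd = mkstemp(path);           /* creates and opens a unique file */
    if (fd < 0)
        return -1;

    if (unlink(path) < 0) {           /* removes the name only */
        close(fd);
        return -1;
    }

    /* Existing reference: reading and writing still work. */
    int value = 42;
    int back = 0;
    int old_ref_ok =
        pwrite(fd, &value, sizeof value, 0) == (ssize_t)sizeof value &&
        pread(fd, &back, sizeof back, 0) == (ssize_t)sizeof back &&
        back == value;

    /* New reference by name: the name is gone. */
    errno = 0;
    int fd2 = open(path, O_RDWR);
    int new_ref_fails = (fd2 < 0 && errno == ENOENT);
    if (fd2 >= 0)
        close(fd2);

    close(fd);
    return old_ref_ok && new_ref_fails;
}
```

Applied to the question, this suggests the server should call its unlink wrapper only once the client has opened the object, for example after the wait loop finishes.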

c - writing to shared memory segment causes segmentation fault

I'm having problems writing to a shared memory segment. Here's the code:
EDIT: after I removed that == (a mistake), I'm now getting Bus Error (Core Dumped). Here's the edited code:
// Struct for data from shared memory
typedef struct {
pthread_mutex_t shared_mutex;
int last_used_job_id;
} shared1;
static void *job_generator(void *param)
{
int J = *((int *) param);
shared1 *shd;
int shm;
int job_id;
// Open shared memory, don't create it if doesn't exist
shm = shm_open("/lab5", O_RDWR, 00600);
// Check
if (shm == -1) {
// Open shared memory, create it if doesn't exist (O_CREAT)
shm = shm_open(q_name, O_RDWR | O_CREAT, 00600);
// Map space for struct
shd = mmap(NULL, sizeof(shared1), PROT_READ | PROT_WRITE, MAP_SHARED, shm, 0);
if (shd == (void *) -1) {
perror ( "mmap" );
exit(1);
}
// Initialize mutex
if (pthread_mutex_init(&(shd->shared_mutex), NULL) != 0)
{
printf("Mutex initialization failed!\n");
exit(1);
}
}
else
{
// Map space for struct
shd = mmap(NULL, sizeof(shared1), PROT_READ | PROT_WRITE, MAP_SHARED, shm, 0);
if (shd == (void *) -1) {
perror ( "mmap" );
exit(1);
}
}
// Lock mutex
pthread_mutex_lock(&(shd->shared_mutex));
job_id = shd->last_used_job_id + 1;
shd->last_used_job_id = job_id + J;
printf("a: %d\n", shd->last_used_job_id);
return NULL;
}
It's caused by any of the instructions that use shd, so any of these:
// Lock mutex
pthread_mutex_lock(&(shd->shared_mutex));
job_id = shd->last_used_job_id + 1;
shd->last_used_job_id = job_id + J;
printf("a: %d\n", shd->last_used_job_id);

I think this is where your problem lies:
shd == mmap(NULL, sizeof(shared1), PROT_READ | PROT_WRITE, MAP_SHARED, shm, 0);
You're comparing shd to the return value of mmap with '=='. I think you meant to use a single '=' which would assign the return value to shd.
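The question's edit reports a bus error once the assignment is fixed. The thread does not confirm the cause, but one common one is that a newly created shared memory object has length zero until ftruncate is called, so touching the mapping faults. The sketch below shows the sizing step under that assumption. It uses a temporary file as a stand-in for the shm object, omits the mutex, and all names in it are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical stand-in for the question's struct, without the mutex. */
typedef struct {
    int last_used_job_id;
} shared_counter;

/* Maps a shared_counter over fd, growing the object first if it is too
 * small to back the mapping. Returns NULL on failure. */
shared_counter *map_counter(int fd)
{
    assert(fd >= 0);
    struct stat st;
    if (fstat(fd, &st) < 0)
        return NULL;

    /* A new object has size 0; without this step the first access to the
     * mapping would fault. */
    if (st.st_size < (off_t)sizeof(shared_counter) &&
        ftruncate(fd, (off_t)sizeof(shared_counter)) < 0)
        return NULL;

    void *p = mmap(NULL, sizeof(shared_counter), PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    return p == MAP_FAILED ? NULL : (shared_counter *)p;
}

/* Stores 7 through the mapping and reads it back through the descriptor. */
int demo_map_counter(void)
{
    char path[] = "/tmp/counter_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);

    shared_counter *shd = map_counter(fd);
    if (shd == NULL) {
        close(fd);
        return -1;
    }
    shd->last_used_job_id = 7;
    munmap(shd, sizeof *shd);

    int back = -1;
    if (pread(fd, &back, sizeof back, 0) != (ssize_t)sizeof back)
        back = -1;
    close(fd);
    return back;
}
```

In the question's code, the equivalent change would be an ftruncate(shm, sizeof(shared1)) call in the branch that creates the object, before the mmap.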

MMAP reading and writing files

I'm trying to use mmap to read in a file, encrypt it, and then write the encrypted data to the output file. I'm also trying to do the writing with mmap, but when I run the code, it tells me that it was not able to munmap due to "Invalid Argument".
//Open files initially and obtain a handle to the file.
inputFile = open(inFileName, O_RDONLY, S_IREAD);
outputFile = open(outFileName, O_APPEND | O_CREAT | O_TRUNC | O_WRONLY, S_IWRITE);
//Allocate buffers for encryption.
from = (unsigned char*)malloc(blockSize);
to = (unsigned char*)malloc(blockSize);
mmapWriteBuff = (unsigned char*)malloc(blockSize);
mmapReadBuff = (unsigned char*)malloc(blockSize);
memset(to, 0, blockSize);
memset(from, 0, blockSize);
memset(mmapWriteBuff, 0, blockSize);
memset(mmapReadBuff, 0, blockSize);
//Make sure we have permission to read the file provided.
setFilePermissions(inFileName, PERMISSION_MODE);
setFilePermissions(outFileName, PERMISSION_MODE);
if(encriptParam)
{
printf("*Encripting file: %s *\n", inFileName);
do//Go through the entire file.
{
if(memParam)
{
currAmt = lseek(inputFile, blockSize, SEEK_SET);
mmapReadBuff = mmap(0, blockSize, PROT_READ, MAP_SHARED, inputFile, 0);
/*
*This is how you encrypt an input char* buffer "from", of length "len",
*onto output buffer "to", using key "key". Just pass "iv" and "&n" as
*shown, and don't forget to actually tell the function to BF_ENCRYPT.
*/
BF_cfb64_encrypt(mmapReadBuff, mmapWriteBuff, blockSize, &key, iv, &n, BF_ENCRYPT);
if(currAmt < blockSize)
{
writeAmt = lseek(outputFile, currAmt, SEEK_SET);
mmapWriteBuff = mmap(0, currAmt, PROT_WRITE, MAP_SHARED, outputFile, 0);
if(errno == EINVAL)
{
perror("MMAP failed to start write buffer: ");
exit(MMAP_IO_ERROR);
}
}
else
{
writeAmt = lseek(outputFile, blockSize, SEEK_SET);
mmapWriteBuff = mmap(0, blockSize, PROT_WRITE, MAP_SHARED, outputFile, 0);
if(errno == EINVAL)
{
perror("MMAP failed to start write buffer: ");
exit(MMAP_IO_ERROR);
}
}
mmapWriteBuff = to;
}
else
{
currAmt = read(inputFile, from, blockSize);
/*
*This is how you encrypt an input char* buffer "from", of length "len",
*onto output buffer "to", using key "key". Just pass "iv" and "&n" as
*shown, and don't forget to actually tell the function to BF_ENCRYPT.
*/
BF_cfb64_encrypt(from, to, blockSize, &key, iv, &n, BF_ENCRYPT);
if(currAmt < blockSize)
{
writeAmt = write(outputFile, to, currAmt);
}
else
{
writeAmt = write(outputFile, to, blockSize);
}
}
if(memParam)
{
//if(currAmt < blockSize)
//{
// if(munmap(mmapWriteBuff, currAmt) == -1)
// {
// perror("MMAP failed to unmap itself: ");
//
// exit(MMAP_IO_ERROR);
// }
//
// if(munmap(mmapReadBuff, currAmt) == -1)
// {
// perror("MMAP failed to unmap itself: ");
//
// exit(MMAP_IO_ERROR);
// }
//}
//else
//{
if(munmap(mmapReadBuff, blockSize) == -1)
{
perror("MMAP Read Buffer failed to unmap itself: ");
exit(MMAP_IO_ERROR);
}
if(munmap(mmapWriteBuff, blockSize) == -1)
{
perror("MMAP Write Buffer failed to unmap itself: ");
exit(MMAP_IO_ERROR);
}
//}
}
memset(to, 0, strlen((char *)to));
memset(from, 0, strlen((char *)from));
memset(mmapReadBuff, 0, strlen((char*)mmapReadBuff));
memset(mmapWriteBuff, 0, strlen((char*)mmapWriteBuff));
}
while(currAmt > 0);
printf("*Saving file: %s *\n", outFileName);
}

Generally speaking, it seems like you might want to try opening the outputFile file descriptor without the O_WRONLY flag. O_WRONLY alone isn't sufficient for mmap(): a writable MAP_SHARED mapping needs a descriptor opened with O_RDWR.
Thus, you may need to change this:
outputFile = open(outFileName, O_APPEND | O_CREAT | O_TRUNC | O_WRONLY, S_IWRITE);
to this:
outputFile = open(outFileName, O_APPEND | O_CREAT | O_TRUNC | O_RDWR, S_IWRITE);
I am not an expert with mmap(), but I know that you might want to give it both read and write permissions when opening the file descriptor.
EDIT:
You may also want to try casting every mmap() call to the buffer's pointer type, like so:
mmapReadBuff = (unsigned char*)mmap(0, blockSize, PROT_READ, MAP_SHARED, inputFile, 0);
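A compact version of the whole flow may make the ordering clearer. This is a sketch, not a fix of the posted program: a byte-wise XOR stands in for BF_cfb64_encrypt, the whole file is mapped at once instead of block by block, and the function names are hypothetical. It opens the output with O_RDWR, sizes it with ftruncate before mapping, and calls munmap with the pointers that mmap returned. The question reassigns mmapWriteBuff to a malloc'ed buffer before unmapping, which is one likely source of the "Invalid Argument" error.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: copy in_path to out_path through two mappings,
 * XOR-ing every byte with key. Returns 0 on success, -1 on failure. */
int xor_file_with_mmap(const char *in_path, const char *out_path,
                       unsigned char key)
{
    assert(in_path != NULL && out_path != NULL);
    struct stat st;
    unsigned char *src = MAP_FAILED;
    unsigned char *dst = MAP_FAILED;
    int rc = -1;

    int in_fd = open(in_path, O_RDONLY);
    /* O_RDWR: a writable MAP_SHARED mapping needs a readable descriptor. */
    int out_fd = open(out_path, O_RDWR | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);

    if (in_fd >= 0 && out_fd >= 0 && fstat(in_fd, &st) == 0 &&
        st.st_size > 0 && ftruncate(out_fd, st.st_size) == 0) {
        size_t len = (size_t)st.st_size;
        src = mmap(NULL, len, PROT_READ, MAP_SHARED, in_fd, 0);
        dst = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, out_fd, 0);
        if (src != MAP_FAILED && dst != MAP_FAILED) {
            for (size_t i = 0; i < len; i++)
                dst[i] = (unsigned char)(src[i] ^ key);
            rc = 0;
        }
        /* Unmap with the exact pointers that mmap returned. */
        if (dst != MAP_FAILED && munmap(dst, len) < 0)
            rc = -1;
        if (src != MAP_FAILED && munmap(src, len) < 0)
            rc = -1;
    }
    if (out_fd >= 0)
        close(out_fd);
    if (in_fd >= 0)
        close(in_fd);
    return rc;
}

/* Returns 1 if the output holds the input XOR-ed with the key, else 0. */
int demo_xor_file(void)
{
    char in_path[] = "/tmp/xor_in_XXXXXX";
    char out_path[] = "/tmp/xor_out_XXXXXX";
    const unsigned char msg[] = "hello mmap";
    unsigned char back[sizeof msg] = {0};
    int ok = 0;

    int in_fd = mkstemp(in_path);
    int out_fd = mkstemp(out_path);
    if (in_fd >= 0 && out_fd >= 0 &&
        write(in_fd, msg, sizeof msg) == (ssize_t)sizeof msg &&
        xor_file_with_mmap(in_path, out_path, 0x5a) == 0 &&
        pread(out_fd, back, sizeof back, 0) == (ssize_t)sizeof back) {
        ok = 1;
        for (size_t i = 0; i < sizeof msg; i++)
            if (back[i] != (unsigned char)(msg[i] ^ 0x5a))
                ok = 0;
    }
    if (in_fd >= 0) { close(in_fd); unlink(in_path); }
    if (out_fd >= 0) { close(out_fd); unlink(out_path); }
    return ok;
}
```

A block-by-block variant would also need each mmap offset to be a multiple of the page size, which lseek on the descriptor does not provide.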
