File Path to in memory file - c

I have a void *buffer that is an instance of a file in RAM.
The file type is in a format that must be parsed by the API given.
Unfortunately, the only way to open this file type through the API is to supply the API with the file path:
sample_api_open(char *file_name, ...);
I understand that shm_open returns the file descriptor, but the API only takes a file path.
Is there a work around to read this type of file in memory?

Is there a work around to read this type of file in memory?
Dump the content of the buffer into a temporary file on /tmp.
On many modern *NIX systems, /tmp is an in-memory file system (tmpfs), although this depends on the distribution and its configuration; on Linux, /dev/shm is normally a tmpfs mount as well. In that case the content reaches the disk only under memory pressure, through swapping.
If the buffer is too large to keep two copies around, dump the content to /tmp, free the local memory, and mmap() the file instead.

Instead of using POSIX shared memory, you could create a temporary file, size it with ftruncate(), and mmap() it. Then copy the buffer into the mmap()-ed region (or build it there in the first place) and call the API with the temporary file's path.
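A minimal sketch of the temporary-file approach, assuming a POSIX system. The helper name dump_buffer_to_temp and the /tmp/membufXXXXXX template are made up for illustration; sample_api_open is the API from the question and is not called here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: write `len` bytes from `buf` into a fresh temporary
 * file and store its name in `path` (capacity `path_cap`).
 * Returns 0 on success, -1 on failure. */
static int dump_buffer_to_temp(const void *buf, size_t len,
                               char *path, size_t path_cap)
{
    assert(buf != NULL && path != NULL);

    /* Assumed location; mkstemp() replaces the XXXXXX suffix. */
    int n = snprintf(path, path_cap, "%s", "/tmp/membufXXXXXX");
    if (n < 0 || (size_t)n >= path_cap)
        return -1;

    int fd = mkstemp(path);              /* creates and opens a unique file */
    if (fd == -1)
        return -1;

    const char *p = buf;
    size_t left = len;
    while (left > 0) {
        ssize_t w = write(fd, p, left);  /* write() may be partial */
        if (w < 0) {
            if (errno == EINTR)
                continue;                /* interrupted, try again */
            close(fd);
            unlink(path);
            return -1;
        }
        p += w;
        left -= (size_t)w;
    }
    return close(fd);
}
```

After a successful call, path can be passed to sample_api_open(path, ...), and the file can be removed with unlink(path) once the API has closed it.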

Related

What data is held in the operating system's file descriptors?

I've been working with assembly and with file I/O. From what I've learned, the process goes as follows. The CPU makes a system call to the kernel to open a file, e.g. "hello.txt". The kernel then finds that location in the filesystem (persistent storage), makes it accessible for reading and/or writing, and returns a file descriptor that uniquely identifies that file. From my understanding, the file descriptor is an index into a table that stores file data. My question is: what data is stored? Presumably storing the entire file contents would get grossly memory-expensive for large files. Does it store file metadata like MIME type, encoding, etc.? Or does it actually store the whole contents?

If a file is stored in the hard drive, why do we use a pointer to create it?

I've always thought of pointers as being RAM addresses, pointing to bytes of memory that can be randomly accessed. However, when we create a file in C, we use the pointer FILE*, which points to the file. But after I close the program, isn't the created file saved on my HD? So I see two possibilities here:
1) A pointer can point to an HDD file
2) The file is saved in RAM (that doesn't make much sense to me)
Which one of it is true? Or, if there is a third possibility, what is it?
Thanks in advance.
As mentioned in "How exactly does fopen(), fclose() work":
When called, fopen allocates a FILE object on the heap. Note that the data in a FILE object is undocumented - FILE is an opaque struct, you can only use pointers-to-FILE from your code.
The FILE object gets initialized. For example, something like fillLevel = 0 where fillLevel is the amount of buffered data that hasn't been flushed yet.
A call to the filesystem driver (FS driver) opens the file and provides a handle to it, which is put somewhere in the FILE struct.
To do this, the FS driver figures out the HDD address corresponding to the requested path, and internally remembers this HDD address, so it can later fulfill calls to fread etc.
The FS driver uses a sort of indexing table (stored on the HDD) to figure out the HDD address corresponding to the requested path. This will differ a lot depending on the filesystem type - FAT32, NTFS and so on.
The FS driver relies on the HDD driver to perform the actual reads and writes to the HDD.
A cache might be allocated in RAM for the file. This way, if the user requests 1 byte to be read, the C library may read a KB just in case, so later reads are served from memory.
A pointer to the allocated FILE gets returned from fopen.
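The steps above can be sketched with a toy stand-in for FILE. This is an illustration only: the real FILE layout is private to each C library, and the ToyFile type and toy_* functions are hypothetical names. POSIX open() and read() stand in for the call to the FS driver.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical, simplified stand-in for FILE. */
typedef struct {
    int fd;                   /* handle obtained from the kernel */
    size_t fill;              /* number of valid bytes in buf */
    size_t pos;               /* index of the next unread byte in buf */
    unsigned char buf[1024];  /* read cache kept in RAM */
} ToyFile;

static ToyFile *toy_open(const char *path)
{
    assert(path != NULL);
    ToyFile *f = malloc(sizeof *f);   /* the object lives on the heap */
    if (f == NULL)
        return NULL;
    f->fd = open(path, O_RDONLY);     /* no file data is read here */
    if (f->fd == -1) {
        free(f);
        return NULL;
    }
    f->fill = 0;
    f->pos = 0;
    return f;
}

/* Return the next byte, or -1 at end of file or on error.
 * When the cache is empty, up to 1 KB is read in one go. */
static int toy_getc(ToyFile *f)
{
    if (f->pos == f->fill) {
        ssize_t n = read(f->fd, f->buf, sizeof f->buf);
        if (n <= 0)
            return -1;
        f->fill = (size_t)n;
        f->pos = 0;
    }
    return f->buf[f->pos++];
}

static void toy_close(ToyFile *f)
{
    close(f->fd);
    free(f);
}
```

The pointer returned by toy_open is an ordinary RAM address of the bookkeeping struct; the file contents stay on disk until toy_getc asks the kernel for them.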

Memory Mapped I/O in Unix

I am unable to understand how files are managed in memory-mapped I/O. Normally, if we open a file using open or fopen, it returns an fd or a file pointer respectively. After this open, where does the file reside for processing? Is it in memory (a copy of the file which is on the hard disk) or not? If it is not in memory, where is the data fetched from by subsequent read or write system calls? Does it fetch data from the hard disk on each call to read or write? Or is a copy of the file stored in memory, accessed by the process for further manipulation, and copied back to the hard disk once the process is completed? Which of these scenarios actually happens?
The following is the definition given for memory-mapped I/O in the book Advanced Programming in the UNIX Environment (2nd Edition):
Memory-mapped I/O lets us map a file on disk into a buffer in memory so that, when we fetch bytes from the buffer, the corresponding bytes of the file are read. Similarly, when we store data in the buffer, the corresponding bytes are automatically written to the file. This lets us perform I/O without using read or write.
What is mapping a file into memory? Here, they say the mapped memory is placed between the stack and the heap. What type of data is present in this memory after mapping a file? Does it contain a copy of the file, or the address of the file which resides on the hard disk? And how does the above scenario become true?
Can anyone explain the working mechanism of memory-mapped I/O and the mmap functionality?
Normally when you open a file, the system sets up some bookkeeping structures (metadata) but does not need to read any part of the actual data of the file. When you call read(), the system loads a chunk of the file into (virtual) memory which you allocated for the purpose.
When you memory-map a file, the system again sets up bookkeeping, and also sets up a (virtual) memory "mapping" which means a range of valid addresses which, if used, will reflect reads (or writes) of the underlying file. It does not mean the entire file needs to be read at once, because it can be "paged in" on demand, i.e. the system can give you an address range to use, then wait for you to actually use it before loading any data there. This "page faulting" is supported by a hardware device called the Memory Management Unit, or MMU. The same system is used when you run an executable file--the system can simply map it into virtual memory and read pages (chunks) from disk only as needed.
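A minimal sketch of a read-only mapping, assuming a POSIX system. The helper names map_file and count_newlines are made up for illustration. Only the pages that the loop actually touches need to be loaded by the kernel.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map a whole file read-only. Returns NULL on failure or for an empty
 * file; on success *len holds the mapping size in bytes. */
static const unsigned char *map_file(const char *path, size_t *len)
{
    assert(path != NULL && len != NULL);

    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) == -1 || st.st_size == 0) {
        close(fd);
        return NULL;
    }

    /* Sets up the address range; no file data has to be read yet. */
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                 /* the mapping stays valid after close() */
    if (p == MAP_FAILED)
        return NULL;

    *len = (size_t)st.st_size;
    return p;
}

/* Ordinary memory access; touching a page that is not loaded yet
 * triggers a page fault and the kernel fills it from the file. */
static size_t count_newlines(const unsigned char *data, size_t len)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++)
        if (data[i] == '\n')
            n++;
    return n;
}
```

When the mapping is no longer needed, release it with munmap((void *)data, len).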
It is in memory(copy of the file which is in hard disk) or not?
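According to Computer Programming and Utilization, when you open a file with fopen, its contents are loaded into memory (partially or wholly). In practice this happens through stdio buffering once you start reading, not at the fopen call itself.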
If it is not in memory where the data is fetch by consequent read or
write system call
When you fwrite some data, it is eventually copied into the kernel which will then write it to disk (or wherever) after buffering. In general, no part of a file needs to be loaded in order to write.
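A small sketch of that path, assuming a POSIX system for stat(). The names size_seen_by_kernel and buffered_write_demo are made up for illustration. fwrite typically places the bytes in a user-space stdio buffer first; fflush hands them to the kernel, which writes them to disk at some later point.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* File size as the kernel currently reports it, or -1 on error. */
static long size_seen_by_kernel(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? (long)st.st_size : -1;
}

/* Write `len` bytes through stdio and record the file size before and
 * after fflush(). Returns 0 on success, -1 on failure. */
static int buffered_write_demo(const char *path, const char *data,
                               size_t len, long *before, long *after)
{
    assert(path != NULL && data != NULL);
    assert(before != NULL && after != NULL);

    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    if (fwrite(data, 1, len, fp) != len) {
        fclose(fp);
        return -1;
    }
    *before = size_seen_by_kernel(path);  /* data may still be buffered */

    if (fflush(fp) != 0) {                /* push the buffer to the kernel */
        fclose(fp);
        return -1;
    }
    *after = size_seen_by_kernel(path);

    return fclose(fp) == 0 ? 0 : -1;
}
```

For a small write, before is usually smaller than after, because the bytes have not left the process yet. The C standard does not guarantee when the library flushes, so treat the first number as informative only.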
what is mapping a file into memory?
For more refer here
In this memory, what type of data is present after mapping a file. It
contains copy of the file or the address of the file which resides in
hard disk.
A memory-mapped file is a segment of virtual memory which has been assigned a direct byte-for-byte correlation with some portion of a file or file-like resource.
It is possible to mmap a file to a region of memory. When this is done, the file can be accessed just like an array in the program. This can be more efficient than read or write, as only the regions of the file that a program actually accesses are loaded. Accesses to not-yet-loaded parts of the mmapped region are handled in the same way as swapped-out pages.
After this open where the file resides for processing. It is in memory(copy of the file which is in hard disk) or not?
On the disk. It may also be partly or completely in memory if the operating system does a read-ahead, but that isn't detectable by you. You still have to issue reads to get data from the file.
If it is not in memory where the data is fetch by consequent read or write system call
From the disk.
or It fetches data from the hard disk for each time of calling read or write.
In effect, but you also have to consider the effect of any caching.
Otherwise the copy of the file is stored in memory and the file is accessed by process for further manipulation and once the process is completed the file is copied to hard disk.
No. The file behaves as though it is all on the disk.
And here, they defined the memory is placed in between stack and heap.
Not in what you quoted.
In this memory, what type of data is present after mapping a file.
The data in the file. The question 'what type of data' doesn't make sense. Data is data.
It contains copy of the file or the address of the file which resides in hard disk.
It effectively contains a copy of the file.
And how the above scenario becomes true.
Via virtual memory. Too broad to cover here.

Load file content into memory, C

I will be dealing with really huge files, for which I want just partially to load the content into memory. So I was wondering if the command:
FILE* file=fopen("my/link/file.txt", "r");
loads the whole file content into memory or it is just a pointer to the content? After I open the file I use fgets() to read the file line by line.
And what about fwrite()? Do I need to open and close the file every time I write something so that it doesn't overload memory, or is that managed in the background?
Another thing: is there maybe a nice bash command like "time" which could tell me the peak memory usage of my executed program? I am using OS X.
As per the man page for fopen(),
The fopen() function opens the file whose name is the string pointed to by path and associates a stream with it.
So, no, it does not load the content of the file into memory or elsewhere.
To operate on the returned file pointer, as you already know, you need to use fgets() and family.
Also, once you have opened the file and obtained a pointer, and as long as you have not called fclose() on it, you can use the pointer any number of times to write to the file (remember to open the file in append mode). You don't need to open and close the file for every read or write.
Also, FWIW, if you want to move the file position back and forth, fseek() may come in handy.
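A minimal sketch of both points, using only standard C. The helper names append_lines and count_lines are made up for illustration. One fopen in append mode serves several writes, and fgets reads through a small fixed buffer, so memory use does not grow with the file size.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Open once in append mode, write `n` lines, close once.
 * Returns 0 on success, -1 on failure. */
static int append_lines(const char *path, const char *const *lines, size_t n)
{
    assert(path != NULL && lines != NULL);

    FILE *fp = fopen(path, "a");
    if (fp == NULL)
        return -1;

    for (size_t i = 0; i < n; i++) {
        if (fprintf(fp, "%s\n", lines[i]) < 0) {
            fclose(fp);
            return -1;
        }
    }
    return fclose(fp) == 0 ? 0 : -1;
}

/* Count newline-terminated lines. Only `chunk` is held in memory,
 * no matter how large the file is. Returns -1 if the file cannot
 * be opened. A final line without '\n' is not counted. */
static long count_lines(const char *path)
{
    assert(path != NULL);

    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    char chunk[256];
    long count = 0;
    while (fgets(chunk, sizeof chunk, fp) != NULL) {
        size_t len = strlen(chunk);
        if (len > 0 && chunk[len - 1] == '\n')
            count++;           /* long lines arrive in several chunks */
    }
    fclose(fp);
    return count;
}
```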
fopen does not load the whole file into memory. It creates a stream backed by a file descriptor, which acts like an index into the open file table.
The open file table in turn refers to the location of the file on the disk.
If you want to go to a particular place in the file, use fseek.
Another option is to use mmap. This creates a new mapping in the virtual address space of the calling process, and you can then access the file like an array. (The whole file is not loaded into memory; the paging mechanism loads the data as it is accessed.)
fopen does not read the file, fread and fgets and similar functions do.
Personally I've never tried reading and writing a file at the same time.
It should work, though.
You can use multiple file pointers to the same file.
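The shell's built-in time does not report memory consumption, but on OS X /usr/bin/time -l prints the maximum resident set size along with the timings. Another simple way is to look at top. There are also malloc/new replacement libraries which can track this for you.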
loads the whole file content into memory or it is just a pointer to the content?
No.
fopen() opens the file with the specified filename and associates it with a stream that can be identified by the FILE pointer.
fread() can be used to get file contents into buffer.
Multiple read/write operations can be carried out on the same stream without reopening the file.
Functions like rewind() and fseek() can be used to change position of cursor in file.
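A short sketch of positioned reads with these functions, using only standard C. The helper name read_at is made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Read up to `n` bytes starting at byte `offset` of an open stream.
 * Returns the number of bytes actually read (0 if the seek fails). */
static size_t read_at(FILE *fp, long offset, void *out, size_t n)
{
    assert(fp != NULL && out != NULL);

    if (fseek(fp, offset, SEEK_SET) != 0)   /* move the file position */
        return 0;
    return fread(out, 1, n, fp);            /* copy bytes into the buffer */
}
```

The same FILE pointer can be reused for any number of such calls, and rewind(fp) moves the position back to the start of the file.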

How to get memory address from file path without opening file

I have a file. I know its file path, and I want to know its memory address. My problem is that I can't actually open the file (fopen can't be used because the OS thinks the file is in use) to get its memory address.
Ex.
fopen("C:/example/file", "r") returns null
From what I understand, the OS returns the memory address after it confirms the file isn't in use. So is it even possible to bypass the OS?
By finding the process ID of the process that holds locks on the file, you could get somewhere. You might be able to track down the file's contents in memory as part of the memory space allocated to that process.
However, just because a file is locked does not at all mean that the file is in memory. Sometimes just a part of a file is used, like the functions within a DLL, where only the used and necessary chunks of the file would be in memory. Other times, the entire document (file) will be present contiguously in memory (consider a text file open in Notepad). It is also possible that the file is locked purely as a placeholder, where the lock is all that matters and none of the file is actually loaded. You really need to know a lot about the process that holds locks on the file.
Now if you simply want to copy the file to another file, then launch the copy before the 'Process' locks the file. You could try a batch file that runs at Windows Startup - and see if that is early enough to copy the file before a lock is placed on it.

Resources