In a course I am enrolled in, I was tasked with creating a filesystem with some custom features. I simply created an image of zeroes using dd, then built my filesystem inside it by writing the superblock, inodes, stat files, etc. It can read and write files and import and export files and directories, with a proper directory hierarchy.
Now I want to make this work with an actual physical partition. I have looked in many places and saw that device files can be read and written like plain files, but I want to know whether that relies on an existing filesystem in the partition. Can I bypass everything and simply get a block-wise read/write interface, with the ability to seek to bytes or blocks? What would the overhead of that be?
Also, I want to turn it into a Linux kernel module so that my filesystem can work with file managers. What is the standard API that I need to implement to make that happen?
Please point me in the right direction.
You have a lot of options. Integration into the kernel is relatively hard: the interface to implement there is the VFS layer (struct super_operations, inode_operations, file_operations and so on). Integrating with a user-space filesystem framework instead, such as libfuse (on GitHub), is a good intermediate step, and at the end you will still have a usable, mountable filesystem. As for raw access: a partition's device node can be opened and driven exactly like a plain file with open(2), lseek(2), read(2) and write(2), bypassing any filesystem the partition may contain; the overhead is essentially one system call per access, with the kernel's page cache in between.
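On the raw-access side, here is a minimal sketch (the partition /dev/sdb1 and the 4 KiB block size are placeholders for this example; run it with sufficient privileges):

    /* Sketch: treat a partition's device node as a flat array of blocks. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        enum { BLOCK_SIZE = 4096 };
        unsigned char buf[BLOCK_SIZE];

        int fd = open("/dev/sdb1", O_RDWR);   /* no filesystem involved */
        if (fd < 0) { perror("open"); return 1; }

        /* Seek to block 10 and read it. */
        if (lseek(fd, (off_t) 10 * BLOCK_SIZE, SEEK_SET) < 0) { perror("lseek"); return 1; }
        if (read(fd, buf, BLOCK_SIZE) != BLOCK_SIZE) { perror("read"); return 1; }

        printf("first byte of block 10: 0x%02x\n", buf[0]);
        close(fd);
        return 0;
    }

For the FUSE route, the interface to implement is struct fuse_operations: you fill in callbacks (getattr, readdir, read, write, ...) and hand them to fuse_main. A minimal read-only skeleton against libfuse 3 follows; the single hard-coded /hello file is a placeholder to keep the sketch self-contained:

    #define FUSE_USE_VERSION 31
    #include <fuse3/fuse.h>
    #include <errno.h>
    #include <string.h>
    #include <sys/stat.h>

    static const char *hello_path = "/hello";            /* placeholder file */
    static const char *hello_data = "hello from myfs\n";

    static int myfs_getattr(const char *path, struct stat *st,
                            struct fuse_file_info *fi)
    {
        (void) fi;
        memset(st, 0, sizeof *st);
        if (strcmp(path, "/") == 0) {
            st->st_mode = S_IFDIR | 0755;
            st->st_nlink = 2;
        } else if (strcmp(path, hello_path) == 0) {
            st->st_mode = S_IFREG | 0444;
            st->st_nlink = 1;
            st->st_size = strlen(hello_data);
        } else {
            return -ENOENT;
        }
        return 0;
    }

    static int myfs_readdir(const char *path, void *buf, fuse_fill_dir_t filler,
                            off_t offset, struct fuse_file_info *fi,
                            enum fuse_readdir_flags flags)
    {
        (void) offset; (void) fi; (void) flags;
        if (strcmp(path, "/") != 0)
            return -ENOENT;
        filler(buf, ".", NULL, 0, 0);
        filler(buf, "..", NULL, 0, 0);
        filler(buf, hello_path + 1, NULL, 0, 0);
        return 0;
    }

    static int myfs_read(const char *path, char *buf, size_t size, off_t offset,
                         struct fuse_file_info *fi)
    {
        (void) fi;
        if (strcmp(path, hello_path) != 0)
            return -ENOENT;
        size_t len = strlen(hello_data);
        if ((size_t) offset >= len)
            return 0;
        if (offset + size > len)
            size = len - offset;
        memcpy(buf, hello_data + offset, size);
        return (int) size;
    }

    static const struct fuse_operations myfs_ops = {
        .getattr = myfs_getattr,
        .readdir = myfs_readdir,
        .read    = myfs_read,
    };

    int main(int argc, char *argv[])
    {
        return fuse_main(argc, argv, &myfs_ops, NULL);
    }

Build with gcc myfs.c $(pkg-config fuse3 --cflags --libs) -o myfs, then run ./myfs /some/mountpoint; file managers will then see the mount like any other filesystem.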
I'm looking for a way to get a file's inode on Linux and a file identity on Windows, or any other unique file identity that is as cross-platform as possible and more efficient than hashing file contents. The platforms are Linux, Windows and Mac.
The purpose of this is to be able to track files as efficiently as possible, so the program can locate them after the user has renamed or moved them. If I have a file path plus file ID and the file is not at that path, I need a way to find the new file path on all mounted volumes (or fail if it has been deleted). Hashing all files is prohibitively expensive and creates a whole bunch of new error possibilities, since files are going to be updated and renamed regularly.
I'm hoping there is an existing library but haven't found anything cross-platform yet. If there is none, I'd be very grateful for help on how to use syscalls in x/sys, and to hear about any limitations inodes and file identities have.
I am trying to scan a USB mass-storage device on an STM32 for audio files. The MP3 files are scattered across many folders that are unknown to the application.
First I scan the root folder for directories, then I scan each of those folders for MP3 files.
This is very time consuming for a tree depth of 8 folders with many files in each folder.
Is there a better approach that scans only for folders which have MP3 files in them?
Directory structure for testing is something like this:
It is not clear what your problem might be, since you have not provided any code showing how you are scanning, any quantitative information on the file/folder structure ("many files" is rather vague), or even the media type used.
However, a solution that might overcome all the variables of filesystem performance, hardware I/O, driver implementation and media type, and make access more deterministic regardless, is to maintain a separate index file or database in a single file in the root directory mapping each MP3 file to its path. You then need only search the index/database for the MP3 you need (or use it to directly list all MP3s without scanning the file system).
If you keep that file sorted (or maintain a separate sorted index file), you can use a binary search to find a specific file. Or simply use a real database, though that might be a rather heavyweight solution for this purpose. You might even load the metadata into memory for even faster access, and write it back to the filesystem only when it changes.
Either way, the solution I suggest is to isolate your application from the variability of the filesystem/media, and from the lack of scalability of FAT in general, by maintaining your own "metadata" file(s) indicating what is stored and where. You can then use that to access the files directly, without scanning the file system with findfirst/findnext semantics and recursion; recursion is always best avoided, but it is the obvious way to scan a directory tree.
Incidentally, this is precisely how iTunes works, for example. The "iTunes Library.xml" file contains metadata about "songs", including their location. Clearly you need not have anything quite that detailed, but the principle is the same, and there may be merit in using XML or JSON for your application, given a suitable library for updating and accessing such a file.
By doing that, performance is more directly within your control, rather than dependent on the filesystem, media and/or device-driver level. You still have some control over (and responsibility for) the media and its interface (SPI, SDIO, USB or whatever) and the device I/O layer (DMA, interrupts, polled, bit-banged); and while you may have little control over the choice of FAT and the ELM FatFs implementation, you can certainly affect its performance greatly at the device-driver, hardware-interface and physical-media level.
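To illustrate the idea, here is a minimal sketch on top of ELM FatFs. The index format (one full path per line in 0:/mp3index.txt) and the file name are assumptions made up for this example; note that f_gets() requires the string-functions option (FF_USE_STRFUNC) to be enabled in ffconf.h:

    #include <string.h>
    #include "ff.h"

    /* Look up the first index entry whose path contains `name`.
     * Returns 1 and fills `out` on success, 0 otherwise. */
    int mp3_index_lookup(const char *name, char *out, unsigned out_len)
    {
        FIL idx;
        char line[256];
        int found = 0;

        if (out_len == 0)
            return 0;
        if (f_open(&idx, "0:/mp3index.txt", FA_READ) != FR_OK)
            return 0;                             /* no index present */

        while (f_gets(line, sizeof line, &idx) != NULL) {
            line[strcspn(line, "\r\n")] = '\0';   /* strip the line ending */
            if (strstr(line, name) != NULL) {
                strncpy(out, line, out_len - 1);
                out[out_len - 1] = '\0';
                found = 1;
                break;
            }
        }
        f_close(&idx);
        return found;
    }

A linear scan of one small file is already far cheaper than walking the directory tree; if you keep the index sorted as suggested above, you can replace strstr() with a binary search over fixed-size records.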
I'd like to create a shared library in C for Linux, an abstract implementation of database management. The shared library would be responsible for reading the file containing the database and writing differences back into it. But I have no idea how to handle the multiprocessing problems of file handling in this case, e.g.: App1 tries to write differences into the database file while App2 currently has the file open for reading. In this example I'd like to inform App1 that the file is currently open and delay the write sequence until App2 has finished reading the database file.
I was thinking of using some mutual-exclusion mechanism, or a global enum variable to manage the current file status, but from some posts I read I understood that every application using a shared library gets its own copy of its data in memory, and the applications don't share any memory sections while running.
The shared library would be responsible for reading the file containing the database and writing differences back into it.
That is possible, but it is quite a complicated solution.
You would need to make sure that multiple processes do not interfere with each other. That is possible with file locks (see man flock) and record locking (see man fcntl), but you must ensure that multiple processes update disjoint file chunks "in place" (without resizing the file).
This is also prone to deadlocks -- if one of the processes takes a lock on a region and then goes into an infinite loop, other processes may get stuck as well.
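For the record-locking route, here is a minimal sketch of advisory locking with fcntl(2); the file name db.bin and the fixed 4 KiB record at offset 0 are placeholders for this example:

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("db.bin", O_RDWR);
        if (fd < 0) { perror("open"); return 1; }

        struct flock lk = { 0 };
        lk.l_type   = F_WRLCK;       /* exclusive lock for writing    */
        lk.l_whence = SEEK_SET;
        lk.l_start  = 0;             /* lock bytes 0..4095 only, so   */
        lk.l_len    = 4096;          /* other records stay accessible */

        if (fcntl(fd, F_SETLKW, &lk) < 0) {   /* blocks until granted */
            perror("fcntl");
            return 1;
        }

        /* ... update the locked region in place, without resizing ... */

        lk.l_type = F_UNLCK;                  /* release the lock     */
        fcntl(fd, F_SETLK, &lk);
        close(fd);
        return 0;
    }

Readers would take F_RDLCK instead, so that multiple readers can proceed concurrently while a writer waits, which is exactly the App1/App2 scenario described in the question.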
A much simpler solution is a client-server design, where the server performs all writes and clients send it read and modify requests.
P.S. There are existing libraries implementing either approach (SQLite, for example, handles multi-process access to a single database file). You will likely save yourself several months of development time by using an existing solution.
I was trying to use FileSystemWatcher to detect whether some files or directories have been moved to another location. The problem was that I had to use the OnCreated and OnDeleted events to handle this, and there are many issues with that solution:
How could I detect the change if I select more than one file and press Ctrl+C, Ctrl+V, or right-click and select Copy and then Paste in the same directory?
How could I detect it if I select more than one directory?
And last, what if a move is simulated? I could delete a file and create one with the same name in a different place.
I know I could use timers, process-locking detection, or checking which process has the file open (if it is explorer.exe, it could be a file move), but this solution is not perfect and is very ineffective. I was thinking about how to solve this issue, and I have decided to implement it in a low-level language. Is it possible to do this using C, or assembler? I know that everything is possible in assembler, so can it be implemented in asm? I would like to create my own FileSystemWatcher in assembler or C, but where should I look for information on how to do this?
File movement within the same filesystem can be detected easily using a filesystem filter driver, as the filesystem receives the corresponding request from the OS. Other scenarios, such as moving to another disk or moving via a copy/delete sequence, are hardly traceable even with a filter driver, because you would need to match the file which has been created and written to against the file which is being deleted (possibly on another disk).
If you plan to write some security mechanism (like DRM), then I need to remind you that the data can be altered during copying (e.g. encrypted or compressed), which makes your task even harder.
Still, you can look at filesystem filter drivers: should you decide to go on with detection of filesystem events, such a driver is a much more reliable and powerful mechanism than FileSystemWatcher.
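If a kernel filter driver is more than you need, it is worth knowing that FileSystemWatcher is a wrapper around the Win32 ReadDirectoryChangesW API, which you can call directly from C. A minimal synchronous sketch follows (the watched path C:\watched is a placeholder); note the RENAMED_OLD_NAME / RENAMED_NEW_NAME pairing, which is how a same-volume move shows up:

    #include <windows.h>
    #include <wchar.h>

    int main(void)
    {
        HANDLE dir = CreateFileW(L"C:\\watched", FILE_LIST_DIRECTORY,
                                 FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                                 NULL, OPEN_EXISTING,
                                 FILE_FLAG_BACKUP_SEMANTICS, /* required for directories */
                                 NULL);
        if (dir == INVALID_HANDLE_VALUE) return 1;

        DWORD buf[16 * 1024];  /* DWORD-aligned 64 KiB notification buffer */
        DWORD got;
        while (ReadDirectoryChangesW(dir, buf, sizeof buf, TRUE,
                                     FILE_NOTIFY_CHANGE_FILE_NAME |
                                     FILE_NOTIFY_CHANGE_DIR_NAME,
                                     &got, NULL, NULL)) {
            FILE_NOTIFY_INFORMATION *fni = (FILE_NOTIFY_INFORMATION *) buf;
            for (;;) {
                /* A same-volume move arrives as an OLD_NAME/NEW_NAME pair;
                 * a cross-volume move still appears as REMOVED + ADDED. */
                switch (fni->Action) {
                case FILE_ACTION_RENAMED_OLD_NAME: wprintf(L"moved from: "); break;
                case FILE_ACTION_RENAMED_NEW_NAME: wprintf(L"moved to:   "); break;
                case FILE_ACTION_ADDED:            wprintf(L"created:    "); break;
                case FILE_ACTION_REMOVED:          wprintf(L"deleted:    "); break;
                default:                           wprintf(L"changed:    "); break;
                }
                /* FileName is NOT null-terminated; bound it by length. */
                wprintf(L"%.*ls\n", (int) (fni->FileNameLength / sizeof(WCHAR)),
                        fni->FileName);
                if (fni->NextEntryOffset == 0) break;
                fni = (FILE_NOTIFY_INFORMATION *)
                      ((BYTE *) fni + fni->NextEntryOffset);
            }
        }
        CloseHandle(dir);
        return 0;
    }

This has the same fundamental limits described above (copy/paste and cross-volume moves cannot be reliably paired), but it gives you direct control over the buffer and the raw event stream.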
So I know that a file is composed of its data and also of metadata, which is information about it (usually the name, the type of the file, dates of creation and modification, etc.).
My question is where exactly that information is stored. I know it can be kept inside the file itself, in the directory, or in a database, but for the Windows, Linux and macOS file systems I can't seem to find this information...
Most of this information, in the case of Windows and Mac, is proprietary.
For Windows I can say for certain that a close-enough version of the NTFS file system driver has been written for Linux. You can look into that; there is also some documentation, most of it written by Richard 'Flatcap' Russon (http://www.flatcap.org/ntfs/).
Documentation on the FAT filesystem was made public a long time ago, with the intent of providing ample information for developers and engineers working on flash drives and the like (http://msdn.microsoft.com/en-us/library/windows/hardware/gg463080.aspx).
Documentation on the Ext filesystem used by Linux distributions can be found on the web easily (Ext2: http://www.nongnu.org/ext2-doc/ext2.html).
Mac uses Apple's own proprietary formats, HFS+ (and more recently APFS); these are not derived from ext.
All these formats have some sort of structure that holds metadata; the file itself is just a stream of bytes somewhere on the physical drive. Most filesystems have a structure that stores at least the file's location (usually a starting cluster for each fragment of the file) and the file's size. The rest of the metadata is up to each file system to implement.
For example, in the FAT filesystem there is a table for each directory, and each directory stores metadata about the files it contains. But there is also the FAT table itself, which holds the fragment (cluster-chain) locations for every file in the filesystem.
The NTFS filesystem has a big table called the Master File Table, which holds a metadata record for each file contained by the filesystem, including the table itself. Each record holds all the metadata, including the file's location on the physical drive for each fragment.
The directory structure, meanwhile, is held as data in directory file records. NTFS also has further structures that hold information about files, such as the USN Journal and the Volume Bitmap.
To access the metadata contained by the file system, you either have to parse the raw volume or use functions exposed by the operating system's API. The API generally doesn't give you all the information you might want about the metadata. For example, the Windows API gives you functions to iterate through the USN Journal to find information about a particular file, but you can't read a file's MFT attributes directly.
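As a small illustration of the API route: on Linux (and other POSIX systems), stat(2) exposes the per-file metadata that the filesystem keeps. A minimal sketch:

    #include <stdio.h>
    #include <sys/stat.h>
    #include <time.h>

    int main(int argc, char *argv[])
    {
        struct stat st;
        if (argc < 2 || stat(argv[1], &st) != 0) {
            perror("stat");
            return 1;
        }
        /* All of these fields come from the filesystem's metadata
         * structures (the inode on ext-style filesystems). */
        printf("inode: %llu\n", (unsigned long long) st.st_ino);
        printf("size:  %lld bytes\n", (long long) st.st_size);
        printf("links: %lu\n", (unsigned long) st.st_nlink);
        printf("mode:  %o\n", (unsigned) (st.st_mode & 07777));
        printf("mtime: %s", ctime(&st.st_mtime));
        return 0;
    }

Exactly which of these fields are real on-disk metadata and which are synthesized by the driver depends on the filesystem, as described above.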
Again, I have to stress that even with most of the documentation on these proprietary file systems in hand, you are taking shots in the dark, since it is their owners' intellectual property. Some, if not most, of the documentation we have now comes from reverse engineering.
It depends on the filesystem. Take a look at http://lxr.free-electrons.com/source/fs/fat/fat.h for instance.
The Wikipedia comparison of file systems (https://en.wikipedia.org/wiki/Comparison_of_file_systems#Metadata) shows details on which file systems store which information as metadata.
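To make that concrete, here is the classic FAT short (8.3) directory entry, the 32-byte on-disk structure that holds all of a FAT file's metadata. The field names follow the Microsoft FAT specification; fat.h in the kernel source linked above defines the same layout:

    #include <stdint.h>

    /* On-disk FAT 8.3 directory entry (32 bytes). Packed, because it
     * mirrors the raw bytes on the volume exactly. */
    #pragma pack(push, 1)
    struct fat_dir_entry {
        uint8_t  DIR_Name[11];      /* 8.3 name, space padded              */
        uint8_t  DIR_Attr;          /* attributes (read-only, hidden, dir) */
        uint8_t  DIR_NTRes;         /* reserved                            */
        uint8_t  DIR_CrtTimeTenth;  /* creation time, 10 ms units          */
        uint16_t DIR_CrtTime;       /* creation time                       */
        uint16_t DIR_CrtDate;       /* creation date                       */
        uint16_t DIR_LstAccDate;    /* last access date                    */
        uint16_t DIR_FstClusHI;     /* first cluster, high word (FAT32)    */
        uint16_t DIR_WrtTime;       /* last write time                     */
        uint16_t DIR_WrtDate;       /* last write date                     */
        uint16_t DIR_FstClusLO;     /* first cluster, low word             */
        uint32_t DIR_FileSize;      /* file size in bytes                  */
    };
    #pragma pack(pop)

Everything the filesystem knows about a FAT file (name, attributes, timestamps, where it starts on disk, how big it is) lives in this one record inside its parent directory.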