Basic File System Implementation - filesystems

I've been given 2k bytes to make an ultra-minimalistic file system, and I thought about making a stripped-down version of FAT16.
My only problem is understanding how to store the FAT in the volume. Let's say I use 2 bytes per block, hence I'd have 1024 blocks. I need a table with 1024 rows, and in each row I'll save the next block of a file.
As each of these blocks can address the other 1023 blocks, I fail to see how this table would not use my entire 2k space. I do not understand how to save this table to my hard drive using only a few bytes, rather than using 1024 blocks to write a 1024-row table.

Given that you are allowed to implement a flat filesystem and have such a small space to work with, I would look at something like the Apple DOS 3.3 filesystem rather than a hierarchical filesystem like FAT16. Even the flat filesystem predecessor of FAT16, FAT12, is overly complex for your purposes.
I suggest that you divide your 2 kiB volume up into 256 byte "tracks" with 16 byte "sectors," to use the Apple DOS 3.3 nomenclature. Call them what you like in your own implementation. It just helps you to map the concepts if you reuse the same terms here at the design stage.
You don't need a DOS boot image, and you don't have the seek time of a moving disk drive head to be concerned about, so instead of setting aside tracks 0-2 and putting the VTOC track in the middle of the disk, let's put our VTOC on track 0. The VTOC contains the free sector bitmap, the location of the first catalog sector, and other things.
If we reserve the entirety of track 0 for the VTOC, we would have 112 of our 16-byte sectors left. Tracking those takes only 14 bytes of bitmap, which suggests that we really don't need the entirety of track 0 for this.
Let's set aside the first two sectors of track 0 instead, and include track 0 in the free sector bitmap. That causes a certain amount of redundancy, in that we will always have the first two sectors mapped as "used," but it makes the implementation simpler, since there are now no special cases.
Let's split Apple DOS 3.3's VTOC concept into two parts: the Volume Label Sector (VLS) and the volume free sector bitmap (VFSB).
We'll put the VLS on track 0 sector 0.
Let's set aside the first 2-4 bytes of the VLS for a magic number to identify this volume file as belonging to your filesystem. Without this, the only identifying characteristic of your volume files is that they are 2 kiB in size, which means your code could be induced to trash an innocent file that happened to be the same size. You want more insurance against data destruction than that.
The VLS should also name this volume. Apple DOS 3.3 just used a volume number, but maybe we want to use several bytes for an ASCII name instead.
The VLS also needs to point to the first catalog sector. We need at least 2 bytes for this. We have 128 sectors, which means we need at least 7 bits. Let's use two bytes: track and sector. This is where you get into the nitty-gritty of design choices. We could now consider moving to 4 kiB volume sizes by defining 256 sectors. Or, maybe at this point we decide that 16-byte sectors are too small, and increase them so we can move beyond 4 kiB later. Let's stick with 16-byte sectors for now, though.
We only need one sector for the VFSB: the 2 kiB volume ÷ 16 bytes per sector = 128 sectors, and 128 sectors ÷ 8 bits per byte = 16 bytes. But, with the above thoughts in mind, we might consider setting aside a byte in the VLS for the number of VFSB sectors following the VLS, to allow for larger volumes.
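To make the layout concrete, here is one way the VLS and VFSB could be expressed in C. This is only a sketch of the design described above; the field names and the exact split (4-byte magic, 8-byte name) are my own choices, not part of any specification.

#include <stdint.h>

#define SECTOR_SIZE 16                 /* 2 kiB volume / 128 sectors */

/* Volume Label Sector: track 0, sector 0 */
struct vls {
    uint8_t magic[4];        /* identifies the volume as ours, e.g. "MYFS" */
    char    name[8];         /* ASCII volume name, space padded            */
    uint8_t catalog_track;   /* first catalog sector: track...             */
    uint8_t catalog_sector;  /* ...and sector                              */
    uint8_t vfsb_sectors;    /* number of VFSB sectors following the VLS   */
    uint8_t reserved;        /* pad out to a full 16-byte sector           */
};

/* Volume Free Sector Bitmap: track 0, sector 1. One bit per sector,
   so 128 sectors / 8 bits per byte = 16 bytes -- exactly one sector. */
struct vfsb {
    uint8_t bitmap[SECTOR_SIZE];
};

_Static_assert(sizeof(struct vls)  == SECTOR_SIZE, "VLS must fit one sector");
_Static_assert(sizeof(struct vfsb) == SECTOR_SIZE, "VFSB must fit one sector");

Bit n of the bitmap would describe sector n of the volume, so a freshly formatted volume simply has bits 0 and 1 (the VLS and VFSB themselves) pre-marked as used.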
The Apple DOS 3.3 catalog sector idea should translate pretty much directly over into this new filesystem, except that with only 16 bytes per sector to play with, we can't describe 7 files per sector. We need 2 bytes for the pointer to the next catalog sector, leaving 14 bytes. Each file should have a byte for flags: deleted, read-only, etc. That means we can have either a 13-byte file name for 1 file per catalog sector, or two 6-byte file names for 2 files per catalog sector. We could do 7 single-letter file names, but that's lame. If we go with your 3-character file name idea, that's 3 files per catalog sector after accounting for the flag byte per file, leaving 2 extra bytes to define. I'd go with 1 or 2 files per sector, though.
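Continuing the sketch, the two-files-per-sector catalog layout could look like this (again, the names are mine). Note that a real entry would also need some way to locate the file's data, along the lines of Apple DOS 3.3's track/sector lists; that is left out here because it doesn't fit the 16-byte budget described above and belongs to the expansion step.

struct catalog_entry {
    uint8_t flags;       /* deleted, read-only, etc.            */
    char    name[6];     /* 6-character file name, space padded */
};

struct catalog_sector {
    uint8_t next_track;              /* next catalog sector: track...            */
    uint8_t next_sector;             /* ...and sector; (0,0) could mean "none",  */
                                     /* since track 0 sector 0 is always the VLS */
    struct catalog_entry files[2];   /* 2 entries x 7 bytes = 14 bytes           */
};

_Static_assert(sizeof(struct catalog_sector) == SECTOR_SIZE, "one 16-byte sector");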
That's pretty much what you need. The rest is implementation and expansion.
One other idea for expansion: what if we want to use this as a bootable disk medium? Such things usually do need a boot loader, so do we need to move the VLS and VFSB sectors down 1, to leave track 0 sector 0 aside for a boot image? Or, maybe the VLS contains a pointer to the first catalog sector that describes the file containing the boot image instead.

Related

Printing bits in a buffer with C?

What is the best way in C to write and read a specific number of bits to/from a file at a time, say the first 16 bits, or the 12 bits of the lower half of an integer? I can't seem to find any threads or documentation on it other than to use fwrite. I don't think I can write a specific number of bits directly and would need a buffer, but can anyone direct me to the correct way to perform this?
With the available APIs, the smallest amount of information you can write to a file at a time is 1 byte. To achieve what you want, you have to read the byte from the file, modify it using bitwise operators, and write it back to the file. If you are writing data as a stream, you would call fwrite once each byte is complete, or once you are done. You would then have to pad the last byte with zeros or ones, whichever is more appropriate, in view of the fact that the file system keeps track of the file size in bytes. To do otherwise would require a file system that provides bit-level operations and the corresponding support at the operating system level.
In fact, the smallest physical amount of data that can be written to a disk is a sector of 512 bytes and more recently 4096 bytes. At the file system level, several sectors are bundled together into a block. The operating system "hides" this fact and allows us to deal with individual bytes.
What seems to make this question sound stupid is the fact that we are so used to the current file abstraction that it has become like second nature. However, behind the scenes a lot is going on to support this illusion.
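As a rough sketch of the buffer-and-fwrite approach described above (the bitwriter type and the bw_* function names are my own, not a standard API):

#include <stdio.h>
#include <stdint.h>

/* Accumulates bits in a one-byte buffer and writes the byte out with
   fwrite as soon as it is complete. */
typedef struct {
    FILE    *fp;
    uint8_t  acc;    /* partially filled byte           */
    int      nbits;  /* number of bits currently in acc */
} bitwriter;

static void bw_put_bit(bitwriter *bw, int bit)
{
    bw->acc = (uint8_t)((bw->acc << 1) | (bit & 1));
    if (++bw->nbits == 8) {               /* byte complete: write it out */
        fwrite(&bw->acc, 1, 1, bw->fp);
        bw->acc = 0;
        bw->nbits = 0;
    }
}

static void bw_put_bits(bitwriter *bw, uint32_t value, int count)
{
    for (int i = count - 1; i >= 0; i--)  /* most significant bit first */
        bw_put_bit(bw, (value >> i) & 1);
}

static void bw_flush(bitwriter *bw)
{
    while (bw->nbits != 0)                /* pad the last byte with zeros */
        bw_put_bit(bw, 0);
}

int main(void)
{
    bitwriter bw = { fopen("bits.bin", "wb"), 0, 0 };
    if (!bw.fp) return 1;
    bw_put_bits(&bw, 0xABC, 12);    /* the low 12 bits of an integer */
    bw_put_bits(&bw, 0x1234, 16);   /* then a full 16-bit value      */
    bw_flush(&bw);                  /* zero-pad to a byte boundary   */
    fclose(bw.fp);
    return 0;
}

Reading would be the mirror image: fread a byte at a time and hand out bits from it until it is exhausted.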

What are the EXT2 file system structure details?

I'm trying to wrap my head around the EXT2 file system, but I can't find a single place that shows me the EXT2 file system in detail.
I finally drew up a diagram myself. So I got that far. Now I'm trying to figure out the following (I've found some info already):
Number of bytes per sector: 0.5kB - 4kB
Number of bytes per block: 4kB - 64kB
Number of sectors per block: 1 - 128
Number of blocks per block group: ?
Number of block groups per partition: ?
It's crazy to me that I can't find a single place that has this information.
EDIT: Also just found this, which means my bytes per block number is probably wrong:
#define EXT2_MIN_BLOCK_SIZE 1024
#define EXT2_MAX_BLOCK_SIZE 4096
I usually find my information about ext2 at the osdev wiki, which in turn links here.
The number of bytes per block is 1024<<n, where n is a 32-bit integer given in the superblock. So in theory, a block could be anywhere between 1024 bytes and... lots of bytes. Normally, block sizes of 1, 2, 4 or 8 kB are used, depending on several factors such as partition size and expected average file size.
Each block group contains a single one-block bitmap of free blocks. This constrains the number of blocks per block group to at most 8 * block size. The same is true for inodes per block group. The actual values are found in the superblock.
This in turn gives a lower bound to the number of block groups needed to fill the partition.
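For what it's worth, here is a small C sketch that pulls those numbers out of an ext2 image. The superblock offsets used below (24 for s_log_block_size, 32 for s_blocks_per_group, 40 for s_inodes_per_group, 56 for s_magic) are taken from the ext2 on-disk layout as documented in the references above, and "ext2.img" is just a placeholder path.

#include <stdio.h>
#include <stdint.h>

/* little-endian reads at a byte offset within the superblock buffer */
#define LE16(buf, o) ((uint16_t)((buf)[o] | (buf)[(o)+1] << 8))
#define LE32(buf, o) ((uint32_t)((buf)[o] | (buf)[(o)+1] << 8 | \
                      (buf)[(o)+2] << 16 | (uint32_t)(buf)[(o)+3] << 24))

int main(void)
{
    FILE *fp = fopen("ext2.img", "rb");          /* placeholder image path */
    if (!fp) { perror("open"); return 1; }

    uint8_t sb[1024];
    fseek(fp, 1024, SEEK_SET);                   /* superblock starts at byte 1024 */
    if (fread(sb, sizeof sb, 1, fp) != 1) { fclose(fp); return 1; }
    fclose(fp);

    uint16_t magic            = LE16(sb, 56);    /* s_magic, 0xEF53 for ext2 */
    uint32_t log_block_size   = LE32(sb, 24);    /* s_log_block_size         */
    uint32_t blocks_per_group = LE32(sb, 32);    /* s_blocks_per_group       */
    uint32_t inodes_per_group = LE32(sb, 40);    /* s_inodes_per_group       */

    uint32_t block_size = 1024u << log_block_size;

    printf("magic:            0x%04x\n", magic);
    printf("block size:       %u bytes\n", (unsigned)block_size);
    printf("blocks per group: %u (bitmap allows up to %u)\n",
           (unsigned)blocks_per_group, (unsigned)(8 * block_size));
    printf("inodes per group: %u (bitmap allows up to %u)\n",
           (unsigned)inodes_per_group, (unsigned)(8 * block_size));
    return 0;
}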

Split file occupying the same memory space as source file

I have a file, say 100MB in size. I need to split it into (for example) 4 different parts.
Let's say the first file is 0-20MB, the second 20-60MB, the third 60-70MB, and the last 70-100MB.
But I do not want to do a safe split into 4 separate output files; I would like to do it in place. So the output files should use the same place on the hard disk that is occupied by this one source file, and literally split it, without making a copy (so at the moment of the split, we lose the original file).
In other words, the input file is the output files.
Is this possible, and if yes, how?
I was thinking maybe to manually add a record to the filesystem that file A starts here and ends here (in the middle of another file), do it 4 times, and afterwards remove the original file. But for that I would probably need administrator privileges, and it probably wouldn't be safe or healthy for the filesystem.
Programming language doesn't matter, I'm just interested if it would be possible.
The idea is not so mad as some comments paint it. It would certainly be possible to have a file system API that supports such reinterpreting operations (to be sure, the desired split is probably not exactly aligned to block boundaries, but you could reallocate just those few boundary blocks and still save a lot of temporary space).
None of the common file system abstraction layers support this; but recall that they don't even support something as reasonable as "insert mode" (which would rewrite only one or two blocks when you insert something into the middle of a file, instead of all blocks), only an overwrite and an append mode. The reasons for that are largely historical, but the current model is so entrenched that it is unlikely a richer API will become common any time soon.
As I explain in this question on SuperUser, you can achieve this using the technique outlined by Tom Zych in his comment.
bigfile="mybigfile-100Mb"
chunkprefix="chunk_"
# Chunk offsets
OneMegabyte=1048576
chunkoffsets=(0 $((OneMegabyte*20)) $((OneMegabyte*60)) $((OneMegabyte*70)))
currentchunk=$((${#chunkoffsets[@]}-1))
while [ $currentchunk -ge 0 ]; do
# Print current chunk number, so we know it is still running.
echo -n "$currentchunk "
offset=${chunkoffsets[$currentchunk]}
# Copy the end of $bigfile to a new chunk file
tail -c +$((offset+1)) "$bigfile" > "$chunkprefix$currentchunk"
# Chop the end off $bigfile
truncate -s $offset "$bigfile"
currentchunk=$((currentchunk-1))
done
You need to give the script the starting position (offset in bytes, zero means a chunk starting at bigfile's first byte) of each chunk, in ascending order, like on the fifth line.
If necessary, you can automate it using seq: the following command builds a chunkoffsets array with one chunk at 0, then one starting at 100k, then one for every megabyte in the range 1-10 MB (note the -1 for the last parameter, so it is excluded), then one chunk every two megabytes for the range 10-20 MB.
OneKilobyte=1024
OneMegabyte=$((1024*OneKilobyte))
chunkoffsets=(0 $((100*OneKilobyte)) $(seq $OneMegabyte $OneMegabyte $((10*OneMegabyte-1))) $(seq $((10*OneMegabyte-1)) $((2*OneMegabyte)) $((20*OneMegabyte-1))))
To see which chunks you have set:
for offset in "${chunkoffsets[@]}"; do echo "$offset"; done
0
102400
1048576
2097152
3145728
4194304
5242880
6291456
7340032
8388608
9437184
10485759
12582911
14680063
16777215
18874367
20971519
This technique has the drawback that it needs free space at least the size of the largest chunk (you can mitigate that by making smaller chunks and concatenating them somewhere else, though). Also, it copies all the data, so it's nowhere near instant.
As to the fact that some hardware video recorders (PVRs) manage to split videos within seconds, they probably only store a list of offsets for each video (a.k.a. chapters), and display these as independent videos in their user interface.

The FAT, Linux, and NTFS file systems

I heard that the NTFS file system is basically a b-tree. Is that true? What about the other file systems? What kind of trees are they?
Also, how is FAT32 different from FAT16?
What kind of tree are the FAT file systems using?
The FAT file systems (FAT12, FAT16, and FAT32) do not use a tree of any kind. Two interesting data structures are used, in addition to a block of data describing the partition itself. Full details, at the level required to write a compatible implementation in an embedded system, are available from Microsoft and third parties. Wikipedia has a decent article as an alternative starting point that also includes a lot of the history of how it got the way it is.
Since the original question was about the use of trees, I'll provide a quick summary of what little data structure is actually in a FAT file system. Refer to the above references for accurate details and for history.
The set of files in each directory is stored in a simple list, initially in the order the files were created. Deletion is done by marking an entry as deleted, so a subsequent file creation might re-use that slot. Each entry in the list is a fixed-size struct, just large enough to hold the classic 8.3 file name along with the flag bits, size, dates, and the starting cluster number. Long file name support (which also covers international characters) is done by using extra directory entry slots to hold the long name alongside the original 8.3 slot that holds all the rest of the file attributes.
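For reference, the classic 32-byte short entry can be described in C roughly as follows. The field names are mine; on disk this is a packed, little-endian structure, so reading it byte-for-byte is safer than trusting the compiler's layout (the static assert is there to catch surprises).

#include <stdint.h>

/* one 32-byte FAT directory entry (8.3 "short" entry) */
struct fat_dirent {
    uint8_t  name[11];          /* 8 name + 3 extension, space padded;    */
                                /* a first byte of 0xE5 marks it deleted  */
    uint8_t  attr;              /* flag bits; 0x0F marks a long-name slot */
    uint8_t  nt_reserved;
    uint8_t  create_time_tenths;
    uint16_t create_time;
    uint16_t create_date;
    uint16_t access_date;
    uint16_t first_cluster_hi;  /* high 16 bits, used by FAT32 only */
    uint16_t write_time;
    uint16_t write_date;
    uint16_t first_cluster_lo;  /* starting cluster number          */
    uint32_t file_size;         /* size in bytes                    */
};

_Static_assert(sizeof(struct fat_dirent) == 32, "directory entries are 32 bytes");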
Each file on the disk is stored in a sequence of clusters, where each cluster is a fixed number of adjacent disk blocks. Each directory (except the root directory of a disk) is just like a file, and can grow as needed by allocating additional clusters.
Clusters are managed by the (misnamed) File Allocation Table from which the file system gets its common name. This table is a packed array of slots, one for each cluster in the disk partition. The name FAT12 implies that each slot is 12 bits wide, FAT16 slots are 16 bits, and FAT32 slots are 32 bits. The slot stores code values for empty, last, and bad clusters, or the cluster number of the next cluster of the file. In this way, the actual content of a file is represented as a linked list of clusters called a chain.
Larger disks require wider FAT entries and/or larger allocation units. FAT12 is essentially only found on floppy disks where its upper bound of 4K clusters makes sense for media that was never much more than 1MB in size. FAT16 and FAT32 are both commonly found on thumb drives and flash cards. The choice of FAT size there depends partly on the intended application.
Access to the content of a particular file is straightforward. From its directory entry you learn its total size in bytes and its first cluster number. From the cluster number, you can immediately calculate the address of the first logical disk block. From the FAT indexed by cluster number, you find each allocated cluster in the chain assigned to that file.
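As a sketch of that lookup, with the FAT already loaded into memory as an array of 16-bit entries (the function names and the toy FAT contents below are mine):

#include <stdio.h>
#include <stdint.h>

#define FAT16_BAD 0xFFF7   /* cluster marked bad                */
#define FAT16_EOC 0xFFF8   /* entries >= this mean end of chain */

/* Visit every cluster in the chain that starts at first_cluster.
   fat[] is the File Allocation Table, indexed by cluster number. */
static void walk_chain(const uint16_t *fat, uint16_t first_cluster,
                       void (*visit)(uint16_t cluster))
{
    uint16_t c = first_cluster;
    while (c >= 0x0002 && c < FAT16_BAD) {   /* 0 and 1 are reserved entries      */
        visit(c);                            /* read/process this cluster's data  */
        c = fat[c];                          /* the FAT entry is the next cluster */
    }
}

static void print_cluster(uint16_t cluster)
{
    printf("cluster %u\n", (unsigned)cluster);
}

int main(void)
{
    /* toy FAT for illustration: the file starts at cluster 2 and its
       chain is 2 -> 3 -> 5 -> end; cluster 4 is free.              */
    uint16_t fat[8] = { 0xFFF8, 0xFFFF, 3, 5, 0x0000, FAT16_EOC, 0, 0 };
    walk_chain(fat, 2, print_cluster);
    return 0;
}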
Discovery of free space suitable for storage of a new file or extending an existing file is not as easy. The FAT file system simply marks free clusters with a code value. Finding one or more free clusters requires searching the FAT.
Locating the directory entry for a file is not fast either since the directories are not ordered, requiring a linear time search through the directory for the desired file. Note that long file names increase the search time by occupying multiple directory entries for each file with a long name.
FAT still has the advantage that it is simple enough to implement that it can be done in small microprocessors so that data interchange between even small embedded systems and PCs can be done in a cost effective way. I suspect that its quirks and oddities will be with us for a long time as a result.
ext3 and ext4 use "H-trees", which are apparently a specialized form of B-tree.
BTRFS uses B-trees (B-Tree File System).
ReiserFS uses B+trees, which are apparently what NTFS uses.
By the way, if you search for these on Wikipedia, it's all listed in the info box on the right side under "Directory contents".
Here is a nice chart on FAT16 vs FAT32.
The numerals in the names FAT16 and FAT32 refer to the number of bits required for a file allocation table entry.
FAT16 uses a 16-bit file allocation table entry (2^16 allocation units).
Windows 2000 reserves the first 4 bits of a FAT32 file allocation table entry, which means FAT32 has a maximum of 2^28 allocation units. However, this number is capped at 32 GB by the Windows 2000 format utilities.
http://technet.microsoft.com/en-us/library/cc940351.aspx
FAT32 uses 32-bit numbers to store cluster numbers. It supports larger disks and files up to 4 GiB in size.
As far as I understand the topic, FAT uses File Allocation Tables, which store the allocation status of clusters on disk. It appears that it doesn't use trees. I could be wrong, though.

Basic concepts in file system implementation

I am unclear about file system implementation. Specifically, Operating Systems (Tanenbaum, 3rd edition, page 275) states: "The first word of each block is used as a pointer to the next one. The rest of block is data".
Can anyone please explain to me the hierarchy of the division here? Like, each disk partition contains blocks, blocks contain words? and so on...
I don't have the book in front of me, but I suspect that the quoted sentence isn't really talking about files, directories, or other file system structures. (Note that a partition isn't a file system concept, generally.) I think your quoted sentence is really just pointing out how the data structures stored in disk blocks are chained together. It means just what it says. Each block (usually 4k, but maybe just 512B) looks very roughly like this:
+------------------+------------- . . . . --------------+
| next blk pointer | another 4k - 4 or 8 bytes of stuff |
+------------------+------------- . . . . --------------+
The stuff after the next block pointer depends on what's stored in this particular block. From just the sentence given, I can't tell how the code figures that out.
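Purely as a sketch of that chaining idea, assuming a 512-byte block, a 32-bit "word", block number 0 meaning end of chain, and a placeholder image file called "disk.img" (none of which comes from the book):

#include <stdio.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 512

int main(void)
{
    FILE *disk = fopen("disk.img", "rb");   /* placeholder disk image */
    if (!disk) { perror("open"); return 1; }

    uint8_t  block[BLOCK_SIZE];
    uint32_t block_no = 1;                  /* assume the chain starts at block 1 */

    while (block_no != 0) {                 /* 0 = end of chain in this sketch */
        fseek(disk, (long)block_no * BLOCK_SIZE, SEEK_SET);
        if (fread(block, BLOCK_SIZE, 1, disk) != 1)
            break;

        /* first word: pointer to the next block (assumes on-disk and
           host byte order match)                                     */
        uint32_t next;
        memcpy(&next, block, sizeof next);

        /* the rest of the block is data */
        fwrite(block + sizeof next, 1, BLOCK_SIZE - sizeof next, stdout);

        block_no = next;                    /* follow the chain */
    }
    fclose(disk);
    return 0;
}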
With regard to file system structures:
A disk is an array of sectors, almost always 512B in size. Internally, disks are built of platters, which are the spinning disk-shaped things covered in rust, and each platter is divided up into many concentric tracks. However, these details are entirely hidden from the operating system by the ATA or SCSI disk interface hardware.
The operating system divides the array of sectors up into partitions. Partitions are contiguous ranges of sectors, and partitions don't overlap. (In fact this is allowed on some operating systems, but it's just confusing to think about.)
So, a partition is also an array of sectors.
So far, the file system isn't really in the picture yet. Most file systems are built within a partition. The file system usually has the following concepts. (The names I'm using are those from the unix tradition, but other operating systems will have similar ideas.)
At some fixed location on the partition is the superblock. The superblock is the root of all the file system data structures, and contains enough information to point to all the other entities. (In fact, there are usually multiple superblocks scattered across the partition as a simple form of fault tolerance.)
The fundamental concept of the file system is the inode, pronounced "eye-node". Inodes represent the various types of objects that make up the file system, the most important being plain files and directories. An inode might be its own block, but some file systems pack multiple inodes into a single block. Inodes can point to a set of data blocks that make up the actual contents of the file or directory. How the data blocks for a file are organized and indexed on disk is one of the key tasks of a file system. For a directory, the data blocks hold information about the files and subdirectories contained within the directory, and for a plain file, the data blocks hold the contents of the file.
Data blocks are the bulk of the blocks on the partition. Some are allocated to various inodes (ie, to directories and files), while others are free. Another key file system task is allocating free data blocks as data is written to files, and freeing data blocks from files when they are truncated or deleted.
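If it helps to see the shape of those structures, here is a toy version in C. The names, field sizes, and the twelve direct pointers are simplified inventions for illustration, not any real on-disk format.

#include <stdint.h>

#define NDIRECT 12

/* root of the file system metadata, at a fixed location on the partition */
struct superblock {
    uint32_t magic;            /* identifies the file system type      */
    uint32_t block_size;       /* bytes per block                      */
    uint32_t inode_table_blk;  /* block where the inode table starts   */
    uint32_t free_bitmap_blk;  /* block where the free-block bitmap is */
    uint32_t root_inode;       /* inode number of the root directory   */
};

/* one inode: describes a plain file or a directory */
struct inode {
    uint16_t type;             /* plain file, directory, ...             */
    uint16_t links;            /* directory entries referring to it      */
    uint32_t size;             /* length in bytes                        */
    uint32_t direct[NDIRECT];  /* block numbers of the first data blocks */
    uint32_t indirect;         /* a block holding further block numbers  */
};

/* a directory's data blocks hold (inode number, name) pairs */
struct dirent {
    uint32_t inode;
    char     name[28];
};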
There are many many variations on all of these concepts, and I'm sure there are file systems where what I've said above doesn't line up with reality very well. However, with the above, you should be in a position to reason about how file systems do their job, and understand, at least a bit, the differences you run across in any specific file system.
I don't know the context of this sentence, but it appears to be describing a linked list of blocks. Generally speaking, a "block" is a small number of bytes (usually a power of two). It might be 4096 bytes, it might be 512 bytes, it depends. Hard drives are designed to retrieve data a block at a time; if you want to get the 1234567th byte, you'll have to get the entire block it's in. A "word" is much smaller and refers to a single number. It may be as low as 2 bytes (16-bit) or as high as 8 bytes (64-bit); again, it depends on the filesystem.
Of course, blocks and words aren't all there is to filesystems. Filesystems typically implement a B-tree of some sort to make lookups fast (it won't have to search the whole filesystem to find a file, just walk down the tree). In a filesystem B-tree, each node is stored in a block. Many filesystems use a variant of the B-tree called a B+-tree, which connects the leaves together with links to make traversal faster. The structure described here might be describing the leaves of a B+-tree, or it might be describing a chain of blocks used to store a single large file.
In summary, a disk is like a giant array of bytes which can be broken down into words, which are usually 2-8 bytes, and blocks, which are usually 512-4096 bytes. There are other ways to break it down, such as heads, cylinders, sectors, etc.. On top of these primitives, higher-level index structures are implemented. By understanding the constraints a filesystem developer needs to satisfy (emulate a tree of files efficiently by storing/retrieving blocks at a time), filesystem design should be quite intuitive.
Tracks >> Blocks >> Sectors >> Words >> Bytes >> Nibbles >> Bits
Tracks are concentric rings from inside to the outside of the disk platter.
Each track is divided into slices called sectors.
A block is a group of sectors (1, 2, 4, 8, 16, etc). The bigger the drive, the more sectors that a block will hold.
A word is the number of bits a CPU can handle at once (16-bit, 32-bit, 64-bit, etc), and in your example, stores the address (or perhaps offset) of the next block.
Bytes contain nibbles and bits. 1 Byte = 2 Nibbles; 1 Nibble = 4 Bits.
