The FAT, Linux, and NTFS file systems - filesystems

I heard that the NTFS file system is basically a b-tree. Is that true? What about the other file systems? What kind of trees are they?
Also, how is FAT32 different from FAT16?
What kind of tree are the FAT file systems using?

The FAT family (FAT12, FAT16, and FAT32) does not use a tree of any kind. Two interesting data structures are used, in addition to a block of data describing the partition itself. Full details, at the level required to write a compatible implementation in an embedded system, are available from Microsoft and third parties. Wikipedia has a decent article as an alternative starting point that also covers much of the history of how it got the way it is.
Since the original question was about the use of trees, I'll provide a quick summary of what little data structure is actually in a FAT file system. Refer to the above references for accurate details and for history.
The set of files in each directory is stored in a simple list, initially in the order the files were created. Deletion is done by marking an entry as deleted, so a subsequent file creation might re-use that slot. Each entry in the list is a fixed-size struct, just large enough to hold the classic 8.3 file name along with the flag bits, size, dates, and the starting cluster number. Long file names (which also bring international character support) are handled by using extra directory entry slots to hold the long name alongside the original 8.3 slot that holds all the rest of the file attributes.
Each file on the disk is stored in a sequence of clusters, where each cluster is a fixed number of adjacent disk blocks. Each directory (except the root directory of a disk) is just like a file, and can grow as needed by allocating additional clusters.
Clusters are managed by the (misnamed) File Allocation Table from which the file system gets its common name. This table is a packed array of slots, one for each cluster in the disk partition. The name FAT12 implies that each slot is 12 bits wide, FAT16 slots are 16 bits, and FAT32 slots are 32 bits. The slot stores code values for empty, last, and bad clusters, or the cluster number of the next cluster of the file. In this way, the actual content of a file is represented as a linked list of clusters called a chain.
Larger disks require wider FAT entries and/or larger allocation units. FAT12 is essentially only found on floppy disks where its upper bound of 4K clusters makes sense for media that was never much more than 1MB in size. FAT16 and FAT32 are both commonly found on thumb drives and flash cards. The choice of FAT size there depends partly on the intended application.
Access to the content of a particular file is straightforward. From its directory entry you learn its total size in bytes and its first cluster number. From the cluster number, you can immediately calculate the address of the first logical disk block. By following the FAT entries, indexed by cluster number, you find each subsequent cluster in the chain assigned to that file.
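For concreteness, here is a minimal sketch in C of that lookup for FAT32, assuming a read_sector() helper and a volume description already filled in from the boot sector; all names are illustrative, not part of any real API:

#include <stdint.h>
#include <string.h>

/* Hypothetical volume description, filled in from the boot sector / BPB. */
struct fat32_volume {
    uint32_t fat_start_lba;       /* first sector of the (first) FAT  */
    uint32_t data_start_lba;      /* first sector of the data region  */
    uint32_t sectors_per_cluster;
    uint32_t bytes_per_sector;    /* usually 512                      */
};

/* Assumed helper supplied by the surrounding program. */
extern int read_sector(uint32_t lba, void *buf);

/* Cluster numbering starts at 2, so cluster 2 maps to the first data sector. */
static uint32_t cluster_to_lba(const struct fat32_volume *v, uint32_t cluster)
{
    return v->data_start_lba + (cluster - 2) * v->sectors_per_cluster;
}

/* Follow the chain: look up the next cluster by indexing the FAT. */
static uint32_t next_cluster(const struct fat32_volume *v, uint32_t cluster)
{
    uint8_t sector[4096];
    uint32_t fat_offset = cluster * 4;   /* 4 bytes per FAT32 entry */
    uint32_t entry;

    read_sector(v->fat_start_lba + fat_offset / v->bytes_per_sector, sector);
    memcpy(&entry, &sector[fat_offset % v->bytes_per_sector], sizeof entry);
    return entry & 0x0FFFFFFF;   /* values >= 0x0FFFFFF8 mean end of chain */
}

Reading a file is then a loop: convert the current cluster to an LBA, read sectors_per_cluster sectors, and call next_cluster() until an end-of-chain value comes back.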
Discovery of free space suitable for storage of a new file or extending an existing file is not as easy. The FAT file system simply marks free clusters with a code value. Finding one or more free clusters requires searching the FAT.
Locating the directory entry for a file is not fast either since the directories are not ordered, requiring a linear time search through the directory for the desired file. Note that long file names increase the search time by occupying multiple directory entries for each file with a long name.
FAT still has the advantage of being simple enough to implement on small microcontrollers, so data interchange between even small embedded systems and PCs can be done in a cost-effective way. I suspect that its quirks and oddities will be with us for a long time as a result.

ext3 and ext4 use "H-trees", which are apparently a specialized form of B-tree.
BTRFS uses B-trees (B-Tree File System).
ReiserFS uses B+trees, which are apparently what NTFS uses.
By the way, if you search for these on Wikipedia, it's all listed in the info box on the right side under "Directory contents".

Here is a nice chart on FAT16 vs FAT32.
The numerals in the names FAT16 and FAT32 refer to the number of bits required for a file allocation table entry. FAT16 uses a 16-bit file allocation table entry (2^16 allocation units). Windows 2000 reserves the first 4 bits of a FAT32 file allocation table entry, which means FAT32 has a maximum of 2^28 allocation units. However, this number is capped at 32 GB by the Windows 2000 format utilities.
http://technet.microsoft.com/en-us/library/cc940351.aspx

FAT32 uses 32-bit numbers to store cluster numbers. It supports larger disks and files up to 4 GiB in size.
As far as I understand the topic, FAT uses a File Allocation Table, which stores the allocation status of each cluster on the disk. It doesn't appear to use trees. I could be wrong though.

Related

Inserting a block of bytes in the middle of a file without rewriting everything?

Since files are stored as blocks on disk, is it possible to insert a block in the middle of the chain of blocks?
This is because, without such an API, if I wanted to insert a 4kb block in the middle of a file at a certain position using the traditional read/write APIs, I would essentially have to rewrite everything in the file after that position and shift it by 4kb.
I'm ok with an answer that only works for some OS or some file systems. It doesn't have to be cross-platform or work for every file system.
(I also understand that not all file systems or hardware use 4kb for blocks - answers that work for different numbers are also ok).
I am not sure about filesystems that would allow extending a file in the middle easily. Then again many modern filesystems do not actually have a chain of blocks. The chain of blocks was a thing on the FAT filesystem family. Instead the blocks in modern filesystems are often organized as a tree. Within a tree you can find the block containing any byte position in O(lg n) reads, with the logarithm having such a large base that it can be considered essentially constant.
While the chain would allow for the operation of "insert n blocks in between" with comparative ease, the tree unfortunately does not. This doesn't mean that the tree is the wrong structure - on the contrary, many database systems benefit greatly from the fast random access that it offers.
Note that the tree structure enables another thing that can be useful here: holes. Unix file systems support sparse files - any blocks of the file that are known to contain only zeroes need not actually use disk space at all; instead, those blocks are marked unallocated in the tree structure itself and read back as zeroes.
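As a small aside on the sparse-file point, here is a minimal sketch on Linux/Unix that creates a hole by seeking past the end of a file before writing; the file name and sizes are just examples:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd = open("sparse.bin", O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }

    /* Seek 1 GiB past the start and write a single byte.                */
    /* The skipped-over blocks are never allocated; they read back as 0. */
    if (lseek(fd, (off_t)1024 * 1024 * 1024, SEEK_SET) < 0) { perror("lseek"); return 1; }
    if (write(fd, "x", 1) != 1) { perror("write"); return 1; }

    close(fd);
    return 0;
}

Comparing ls -l sparse.bin (the apparent size) with du -h sparse.bin (blocks actually allocated) shows the difference.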

FAT32: root directory entries

We are building a FAT32 filesystem manipulation tool in C and are currently trying to access all the entries in the root directory (situated right after the two FAT tables).
The first question is: are all the root directory entries contiguous in the data region? If not, given the first entry, how can we access the next entry?
Does it have anything to do with the "low cluster / high cluster" fields, or do we need to look in the FAT table for it (the root directory)?
Basically, we have the "equation" that leads us to the data region. Based on that, we point to the cluster, but after that, we don't really know how to find the next entry in the root directory.
This might seem confusing, but if you need pieces of code or more information, I will provide them.
Thank you in advance.
FAT (also FAT32) directory entries are 32 bytes each and appear in sequential order.
To store a long file name, an entry can need multiple 32-byte slots.
About how L(ong)F(ile)N(ames) are marked (from Wikipedia):
Long File Names (LFN) are stored on a FAT file system using a trick—adding (possibly multiple) additional entries into the directory before the normal file entry. The additional entries are marked with the Volume Label, System, Hidden, and Read Only attributes (yielding 0x0F), which is a combination that is not expected in the MS-DOS environment, and therefore ignored by MS-DOS programs and third-party utilities. (ff)
Regarding your second question (from Wikipedia):
[...] VFAT LFN entries always have the cluster value at 0x1A set to 0x0000 and the length entry at 0x1C is never 0x00000000 [...]
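For concreteness, here is a sketch of the 32-byte short-name entry layout as documented in Microsoft's FAT specification (field names are illustrative); the "high cluster / low cluster" fields from the question are first_cluster_hi at offset 0x14 and first_cluster_lo at offset 0x1A:

#include <stdint.h>

#pragma pack(push, 1)
struct fat_dir_entry {              /* exactly 32 bytes on disk, little-endian */
    uint8_t  name[11];              /* 8.3 name; name[0] == 0xE5 means deleted, 0x00 means end of directory */
    uint8_t  attr;                  /* 0x0F marks a long-file-name entry */
    uint8_t  nt_reserved;
    uint8_t  create_time_tenth;
    uint16_t create_time;
    uint16_t create_date;
    uint16_t last_access_date;
    uint16_t first_cluster_hi;      /* offset 0x14: high 16 bits of start cluster (FAT32) */
    uint16_t write_time;
    uint16_t write_date;
    uint16_t first_cluster_lo;      /* offset 0x1A: low 16 bits of start cluster */
    uint32_t file_size;             /* offset 0x1C: size in bytes (0 for directories) */
};
#pragma pack(pop)

static inline uint32_t first_cluster(const struct fat_dir_entry *e)
{
    return ((uint32_t)e->first_cluster_hi << 16) | e->first_cluster_lo;
}

On FAT32 the root directory is an ordinary cluster chain: its starting cluster comes from the BPB_RootClus field of the boot sector (often cluster 2, i.e. right after the FATs). You read that cluster's worth of 32-byte entries, and when it is exhausted you look up the next cluster in the FAT, just as for any other file.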

FAT System Identification of free space and structure of entry files?

I've been searching Google for a good explanation of how FAT systems identify free space and of the structure of FAT entry files.
A lot of the explanations I've found are quite hard to follow. Can anyone help briefly sum these up?
I understand that clusters are marked as unused, but is this within the root directory or the data region? And is the information on cluster status just marked in a table?
I haven't managed to gain any knowledge of the structure of the entry files either, just that they use chains to keep the clusters together.
Can anyone help?
A file system can be thought of as having three types of data: file data, file meta-data, and file system meta-data. File data is file or directory contents. File meta-data tells us where the file data is stored on the disk. File system meta-data tells us how the file system allocates the blocks used in the file system.
The FAT file system, however, does not keep these lines so clear-cut. Its disk structures often blur the distinctions.
The File Allocation Table (FAT) itself blurs the line between file meta-data and file system meta-data. That is, a FAT entry both identifies the cluster number where the next cluster of file (or directory) data can be found and indicates to the file system whether the cluster identified by that index into the FAT is available or not. As you indicated in your question, this forms a chain. A special marker value (the specific value escapes my memory) indicates that the cluster identified by the index into the FAT is the last cluster in the chain.
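A sketch of what searching the table for free space looks like on FAT32, assuming a read_sector() helper and illustrative names (a FAT32 entry value of 0 means the corresponding cluster is free):

#include <stdint.h>
#include <string.h>

extern int read_sector(uint32_t lba, void *buf);   /* assumed helper */

/* Scan the FAT for the first free cluster (entry value 0). Returns 0 if none. */
static uint32_t find_free_cluster(uint32_t fat_start_lba, uint32_t fat_sectors,
                                  uint32_t bytes_per_sector)
{
    uint8_t sector[4096];

    for (uint32_t s = 0; s < fat_sectors; s++) {
        read_sector(fat_start_lba + s, sector);
        for (uint32_t i = 0; i < bytes_per_sector; i += 4) {
            uint32_t entry;
            memcpy(&entry, &sector[i], sizeof entry);
            if ((entry & 0x0FFFFFFF) == 0) {
                uint32_t cluster = (s * bytes_per_sector + i) / 4;
                if (cluster >= 2)          /* entries 0 and 1 are reserved */
                    return cluster;
            }
        }
    }
    return 0;
}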
Directory entries in a FAT based file system are both file data and file meta-data. They read like files with their entries being the "file data". However, their entries are also interpreted as file meta-data, for they contain the file attributes (permissions, file size, and the starting cluster number--which is an index into the FAT).
The root directory is a special directory on a FAT file system. If memory serves, it has neither a "." nor a ".." entry. On FAT12 and FAT16 systems, the size of the root directory is specified when the disk is formatted and is thus fixed; it occupies a reserved region between the FATs and the data area rather than ordinary clusters. On FAT32, the root directory size is not set at format time and can grow; its starting cluster is stored in a field of the boot sector's BIOS Parameter Block.
Hope this helps.
Here is a fairly long article that has lots of information about FAT file systems.
It should provide all the details you need.
http://en.wikipedia.org/wiki/File_Allocation_Table

How does the length of a filename affect remaining storage space on a disk?

How does the length of a filename affect remaining storage space on a disk?
I realize this is filesystem dependent. In particular I am thinking about the EXT series of file systems. I don't fully understand how inodes affect disk space and how the filename itself is stored. It's difficult to get relevant search results for this question too. That's why I'm asking here. On linux, the maximum file name length is usually 255 or 256 characters. When the file system is created, is that amount of space "reserved" for each and every file name? In other words, is disk storage not affected by the actual file name because the maximum is already used? Or is it more complicated than that?
Suppose, I have a file named "joe.txt" and rename it to "joe2.txt". Has the amount of available disk space decreased after this? What about longer names like "joe_version.txt" or "joe_original_version_with_bug_that_Jim_solved.txt"? I am worried about thresholds at 8, 16, 32, 64, etc characters. I will be storing millions of images. I have never bothered to worry about such an issue before so I'm not completely sure how this works.
Although EXT is the only filesystem I'm using, discussing FAT and others might be useful to somebody else that has a similar question.
On Linux (or more generally, Unix-type filesystems), file names are stored in directory entries: a directory is itself a file whose contents are a list of (filename, inode number) mappings, one for each file in the directory. NAME_MAX (255 on Linux) is only the upper limit on a name's length; space is not reserved per name.
So, to answer your question, when the file system is created there is no space reserved for file names, and when you create a file only a directory entry sized for the actual name is consumed (on ext2/3/4 these are variable-length records). Moreover, for the directory itself, at least on ext2/3/4, space is allocated in disk-block (4 KB, unless you're doing something very strange) granularity as needed. I.e. a directory takes up at minimum 4 KB (plus an entry in the parent directory), and if the list of (filename, inode) pairs doesn't fit into that 4 KB (minus other overhead), a new 4 KB block is allocated to continue the list, and so forth (ext2/3 use an indirect block scheme, whereas ext4 uses extents).
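As a rough illustration (a sketch of the classic ext2/3-style on-disk directory record; ext4 adds a hashed-tree index on top, but the per-name cost is similar), each entry is an 8-byte header plus the name, rounded up to a 4-byte boundary:

#include <stdint.h>
#include <stdio.h>

/* Simplified ext2-style variable-length directory record. */
struct ext2_dir_entry {
    uint32_t inode;      /* inode number of the file              */
    uint16_t rec_len;    /* total length of this record, in bytes */
    uint8_t  name_len;   /* actual length of the name             */
    uint8_t  file_type;
    char     name[];     /* the name itself, not null-terminated  */
};

/* Space consumed by one entry for a name of the given length. */
static unsigned dirent_size(unsigned name_len)
{
    return (8 + name_len + 3) & ~3u;   /* header + name, rounded up to 4 */
}

int main(void)
{
    printf("joe.txt  -> %u bytes\n", dirent_size(7));   /* 16 */
    printf("joe2.txt -> %u bytes\n", dirent_size(8));   /* 16 */
    return 0;
}

So renaming joe.txt to joe2.txt costs nothing in this sketch, and in any case the directory only grows one whole block at a time.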
FAT16 pre-allocates.
FAT32 uses a work-around to provide long filenames: as the filename becomes longer, additional 32-byte directory entries are required to store the extra characters - and a directory is stored like a regular file, so this can consume additional disk space. However, the smallest allocation is one cluster, so unless the extra entries push the directory past a cluster boundary, no additional disk space is consumed beyond what you could otherwise have used.
I'm not offhand familiar with how filenames are handled in the UNIX type filesystems.

Basic concepts in file system implementation

I am unclear about file system implementation. Specifically, (Operating Systems - Tanenbaum (Edition 3), Page 275) states "The first word of each block is used as a pointer to the next one. The rest of the block is data".
Can anyone please explain to me the hierarchy of the division here? Like, each disk partition contains blocks, blocks contain words? and so on...
I don't have the book in front of me, but I suspect that quoted sentence isn't really talking about files, directories, or other file system structures. (Note that a partition isn't a file system concept, generally.) I think your quoted sentence is really just pointing out how the data structures stored in disk blocks are chained together. It means just what it says. Each block (usually 4k, but maybe just 512B) looks very roughly like this:
+------------------+------------- . . . . --------------+
| next blk pointer | another 4k - 4 or 8 bytes of stuff |
+------------------+------------- . . . . --------------+
The stuff after the next block pointer depends on what's stored in this particular block. From just the sentence given, I can't tell how the code figures that out.
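A minimal sketch of walking such a chain, assuming a read_block() helper, a 4 KB block size, and a 4-byte next-block pointer at the start of each block (all names are illustrative):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BLOCK_SIZE   4096
#define END_OF_CHAIN 0                 /* assume block 0 never holds data */

extern int read_block(uint32_t block_no, uint8_t buf[BLOCK_SIZE]);   /* assumed helper */

/* Follow the "first word of each block points to the next one" chain,
 * writing the data portion of every block to stdout. */
static void dump_chain(uint32_t first_block)
{
    uint8_t buf[BLOCK_SIZE];
    uint32_t block_no = first_block;

    while (block_no != END_OF_CHAIN) {
        read_block(block_no, buf);

        uint32_t next;
        memcpy(&next, buf, sizeof next);               /* the "first word"      */
        fwrite(buf + sizeof next, 1,                   /* the rest is file data */
               BLOCK_SIZE - sizeof next, stdout);

        block_no = next;
    }
}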
With regard to file system structures:
A disk is an array of sectors, almost always 512B in size. Internally, disks are built of platters, which are the spinning disk-shaped things covered in rust, and each platter is divided up into many concentric tracks. However, these details are entirely hidden from the operating system by the ATA or SCSI disk interface hardware.
The operating system divides the array of sectors up into partitions. Partitions are contiguous ranges of sectors, and partitions don't overlap. (In fact this is allowed on some operating systems, but it's just confusing to think about.)
So, a partition is also an array of sectors.
So far, the file system isn't really in the picture yet. Most file systems are built within a partition. The file system usually has the following concepts. (The names I'm using are those from the unix tradition, but other operating systems will have similar ideas.)
At some fixed location on the partition is the superblock. The superblock is the root of all the file system data structures, and contains enough information to point to all the other entities. (In fact, there are usually multiple superblocks scattered across the partition as a simple form of fault tolerance.)
The fundamental concept of the file system is the inode, pronounced "eye-node". Inodes represent the various types of objects that make up the file system, the most important being plain files and directories. An inode might be its own block, but some file systems pack multiple inodes into a single block. Inodes point to the set of data blocks that make up the actual contents of the file or directory. How the data blocks for a file are organized and indexed on disk is one of the key tasks of a file system. For a directory, the data blocks hold information about the files and subdirectories contained within it, and for a plain file, the data blocks hold the contents of the file.
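For a rough picture, here is a sketch of a classic Unix-style on-disk inode, loosely modeled on ext2 (the exact fields, sizes, and pointer counts vary by file system):

#include <stdint.h>

/* Simplified, illustrative on-disk inode. Note that the file's name is NOT
 * here; names live in directory entries that point at the inode number. */
struct inode_disk {
    uint16_t mode;         /* object type (file, directory, ...) and permission bits */
    uint16_t uid;
    uint16_t gid;
    uint16_t links_count;  /* number of directory entries referring to this inode    */
    uint32_t size;         /* size of the contents in bytes                          */
    uint32_t mtime;        /* last modification time                                 */
    uint32_t block[15];    /* 12 direct data-block pointers, then one single-,       */
                           /* one double-, and one triple-indirect block pointer     */
};

The array of block pointers is the "set of data blocks" mentioned above; large files spill over into the indirect blocks.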
Data blocks are the bulk of the blocks on the partition. Some are allocated to various inodes (ie, to directories and files), while others are free. Another key file system task is allocating free data blocks as data is written to files, and freeing data blocks from files when they are truncated or deleted.
There are many many variations on all of these concepts, and I'm sure there are file systems where what I've said above doesn't line up with reality very well. However, with the above, you should be in a position to reason about how file systems do their job, and understand, at least a bit, the differences you run across in any specific file system.
I don't know the context of this sentence, but it appears to be describing a linked list of blocks. Generally speaking, a "block" is a small number of bytes (usually a power of two). It might be 4096 bytes, it might be 512 bytes, it depends. Hard drives are designed to retrieve data a block at a time; if you want to get the 1234567th byte, you'll have to get the entire block it's in. A "word" is much smaller and refers to a single number. It may be as low as 2 bytes (16-bit) or as high as 8 bytes (64-bit); again, it depends on the filesystem.
Of course, blocks and words aren't all there is to filesystems. Filesystems typically implement a B-tree of some sort to make lookups fast (it won't have to search the whole filesystem to find a file, just walk down the tree). In a filesystem B-tree, each node is stored in a block. Many filesystems use a variant of the B-tree called a B+-tree, which links the leaves together to make traversal faster. The structure described here might be describing the leaves of a B+-tree, or it might be describing a chain of blocks used to store a single large file.
In summary, a disk is like a giant array of bytes which can be broken down into words, which are usually 2-8 bytes, and blocks, which are usually 512-4096 bytes. There are other ways to break it down, such as heads, cylinders, sectors, etc.. On top of these primitives, higher-level index structures are implemented. By understanding the constraints a filesystem developer needs to satisfy (emulate a tree of files efficiently by storing/retrieving blocks at a time), filesystem design should be quite intuitive.
Tracks >> Blocks >> Sectors >> Words >> Bytes >> Nibbles >> Bits
Tracks are concentric rings from inside to the outside of the disk platter.
Each track is divided into slices called sectors.
A block is a group of sectors (1, 2, 4, 8, 16, etc). The bigger the drive, the more sectors that a block will hold.
A word is the number of bits a CPU can handle at once (16-bit, 32-bit, 64-bit, etc), and in your example, stores the address (or perhaps offset) of the next block.
Bytes contain nibbles and bits. 1 Byte = 2 Nibbles; 1 Nibble = 4 Bits.
