I am trying to understand how the FAT file system works. From the attached first sector of a FAT16 partition I could determine:
Bytes per sector = 512.
Sectors per cluster = 4.
File system type = FAT16.
Reserved sectors = 4.
Number of FATs = 2.
Number of root directory entries = 512.
Total sectors = 204800.
Root directory size = 32 sectors (512 entries * 32 bytes / 512).
Size of each FAT = 200 sectors.
First data sector = 436 (4 + 2 * 200 + 32).
Cluster count = 51091 ((204800 - 436) / 4).
Root directory starts at sector 404 (byte offset 0x32800 = 404 * 512).
A dump of the root directory at address 0x32800 is attached. The root directory contains two folders named a and b, and one file named file.txt. In the attached image, how do I distinguish between a file and a folder?
My doubts are listed below:
1. A folder entry should start with 0x2E, but there is no such value here. So how do I find out whether a given entry is a file or a folder?
2. As you can see, each entry in the root directory occupies 64 bytes instead of 32. There seem to be two 32-byte entries for each file and folder. For example, folder 'a' has entries at 0x32800 and 0x32820 (64 bytes in total).
3. What does the value 0x41 denote in this context? It appears at 0x32800, 0x32820, 0x32840 and 0x32880, while the values at 0x32860 and 0x328A0 are different.
4. The byte at offset 0x1A from address 0x32800 (0x32800 + 0x1A = 0x3281A) has value 0, while the byte at offset 0x1A from address 0x32820 (0x32820 + 0x1A = 0x3283A) has value 3. Which is the correct cluster number for folder 'a'?
No, folder entries do NOT start with "." (0x2E) unless they are the . and .. entries of subdirectories (and those aren't in the root). The dirent's attribute byte has the 0x10 bit set if the dirent is a directory.
You are also looking at a directory that has long file names. The original FAT file system specification only allowed 11-character names that were all upper case and in the OEM codepage. Windows 95 extended this. It's pretty complicated to explain here how this works, so I suggest looking at the MSDN documentation for LFN (Long File Names):
http://technet.microsoft.com/en-us/library/cc938438.aspx
A FAT file system allocates space in fixed-size clusters: every file occupies at least one whole cluster, and if the file is larger than one cluster it is given as many clusters as it needs to hold the entire file.
But the point here is that a FAT file system is mainly good for small, simple volumes; otherwise I would recommend using an NTFS file system if possible. Also, the image you are showing looks like the boot sector of a small floppy-style FAT volume.
Page 301 of Tanenbaum's Modern Operating Systems contains the table below. It gives the file sizes on a 2005 commercial Web server. The chapter is on file systems, so these data points are meant to be similar to what you would see on a typical storage device.
File length (bytes)    Percentage of files less than length
  1                     6.67
  2                     7.67
  4                     8.33
  8                    11.30
 16                    11.46
 32                    12.33
 64                    26.10
128                    28.49
...                    ...
1 KB                   47.82
...                    ...
1 MB                   98.99
...                    ...
128 MB                100
In the table, you will see that 6.67% of files on this server are 1 byte in length. What kinds of processes are creating 1 byte files? What kind of data would be stored in these files?
I wasn't familiar with that table, but it piqued my interest. I'm not sure what the 1-byte files were at the time, but perhaps the 1-byte files of today can shed some light?
I searched for files of size 1 byte with
sudo find / -size 1c 2>/dev/null | while read -r line; do ls -lah "$line"; done
Looking at the contents of these files on my system, they contain a single character: a newline. This can be verified by running the file through hexdump. A file with a single newline can exist for multiple reasons, but it probably has to do with the convention of terminating a line with a newline.
There is a second type of file with size 1 byte: symbolic links where the target is a single character. ext4 appears to report the length of the target as the size of the symbolic link (at least for short-length targets).
I'm currently working on a mini library for myself to compress and extract ZIP files. So far I don't have any major problems with the documentation, except that I don't get what "disks" are in a ZIP file and how to calculate the disk numbers:
4.3.16 End of central directory record:
end of central dir signature 4 bytes (0x06054b50)
number of this disk 2 bytes <= What does "disk" mean here?
number of the disk with the
start of the central directory 2 bytes <= What does "disk" mean here?
total number of entries in the
central directory on this disk 2 bytes <= What does "disk" mean here?
total number of entries in
the central directory 2 bytes
size of the central directory 4 bytes
offset of start of central
directory with respect to
the starting disk number 4 bytes <= What does "disk" mean here?
.ZIP file comment length 2 bytes
.ZIP file comment (variable size)
The term disk refers to floppy diskettes in the context of splitting and spanning ZIP files (see chapter 8.0 of the documentation that you provided; emphasis mine):
8.1.1 Spanning is the process of segmenting a ZIP file across
multiple removable media. This support has typically only
been provided for DOS formatted floppy diskettes.
Nowadays, implementations often no longer support splitting and spanning (e.g. here (see Limitations) or here (see procedure finish_zip())), as floppy disks or even compact disks went out of fashion. If you are fine with not supporting splitting and spanning (at first), you can set the values as follows:
number of this disk 2 bytes <= You only have one disk/file, so set it to 0
                               (disk numbers are 0-based).
number of the disk with the
start of the central directory 2 bytes <= You only have one disk/file, so set it to 0.
total number of entries in the
central directory on this disk 2 bytes <= Set it to the overall number of records.
offset of start of central
directory with respect to
the starting disk number 4 bytes <= Set this offset (in bytes) relative to
the start of your archive.
If you want to support splitting or spanning, then you have to increase the disk number every time you start writing to a new disk/file. Reset the total number of entries in the central directory on this disk for each new disk/file. Calculate the offset relative to the start of the disk/file on which the central directory starts.
I am working with the EXT2 File System and spent the last 2 days trying to figure out how to create a symbolic link. From http://www.nongnu.org/ext2-doc/ext2.html#DEF-SYMBOLIC-LINKS, "For all symlink shorter than 60 bytes long, the data is stored within the inode itself; it uses the fields which would normally be used to store the pointers to data blocks. This is a worthwhile optimization as it we avoid allocating a full block for the symlink, and most symlinks are less than 60 characters long"
To create a sym link at /link1 to /source I create a new inode and say it gets index 24. Since it's <60 characters, I placed the string "/source" starting at the i_block[0] field (so printing new_inode->i_block[0] in gdb shows "/dir2/source") and set i_links_count to 1, i_size and i_blocks to 0. I then created a directory entry at the inode 2 (root inode) with the properties 24, "link1", and file type EXT2_FT_SYMLINK.
A link called "link1" gets created, but it's a directory, and when I click it, it goes to "/". I'm wondering what I'm doing wrong...
A (very) late response, but just because the symlink's data is in the block pointers, that doesn't mean the file size is 0! You need to set the i_size field in the symlink's inode equal to the length of the target path.
I'm currently trying to code a FAT system in C on a Xilinx Kintex-7 board. It is equipped with a MicroBlaze, and I've already managed to create most of the required functions.
The problem I'm facing is the total capacity of a folder. I've read on the web that in FAT32 a folder should be able to contain more than 65,000 files, but with the system I've put in place I'm limited to 509 files per folder. I think it's because of my understanding of the way FAT32 works, but here's what I've made so far:
I've created a format function that writes the correct data in the MBR (sector 0) and the Volume ID (sector 2048 on my disk).
I've created a function that writes the content of the root directory (first cluster, which starts on sector 124148).
I've created a function that writes a new folder that contains N files of size X. The name of the folder is written in the root directory (sector 124148) and the filenames are written on the next cluster (sector 124212 since I've set cluster size to 64 sectors). Finally, the content of the files (a simple counter) is written on the next cluster that starts on sector 124276.
Here, the thing is that a folder has a size of 1 cluster, which means that it has a capacity of 64 sectors = 32 KB, and I can create only 512 (minus 2) files in a directory! So my question is: is it possible to change the size of a folder in number of clusters? Currently I use only 1 cluster, and I don't understand how to change that. Is it related to the FAT of the drive?
Thanks in advance for your help!
NOTE: My drive is recognized by Windows when I plug it in, I can access and read every file (except those that exceed the 510 limit) and I can create new files through the Windows Explorer. It obviously comes from the way I understand file creation and folder creation!
A directory in the FAT filesystem is only a special type of file. So use more clusters for this "file" just as you would with any other file.
The cluster number of the root directory is stored at offset 0x2c of the FAT32 header and is usually cluster 2. The entry in the cluster map for cluster 2 contains the value 0x0FFFFFFF (end-of-clusters) if this is the only cluster for the root directory. You can use two clusters (for example cluster 2 and 3) for the root directory if you set cluster 3 in the cluster map as the next cluster for cluster 2 (set 0x00000003 as value for the entry of cluster 2 in the cluster map). Now, cluster 3 can either be the last cluster (by setting its entry to 0x0FFFFFFF) or can point in turn to another cluster, to make the space for the root directory even bigger.
The clusters do not need to be consecutive, but keeping them consecutive usually gives a performance gain on sequential reads (that's why defragmenting a volume can largely increase performance).
The maximum number of files within a directory of a FAT file system is 65,536 if all files have short filenames (8.3 format). Short filenames are stored in a single 32-byte entry.
That means the maximum size of a directory (file) is 65,536 * 32 bytes, i.e. 2,097,152 bytes.
Short filenames in 8.3 format consist of up to 8 characters, plus optionally a "." followed by up to 3 characters. The character set is limited. Filenames that contain lower-case letters are additionally stored in Long File Name entries.
If the filename is longer (a Long File Name), it is spread over multiple 32-byte entries. Each entry holds 13 characters of the filename. If the length of the filename is not a multiple of 13, the last entry is padded.
Additionally, there is one short filename entry for each Long File Name.
Two 32-byte entries are already taken by the "." and ".." entries in each directory (except the root).
One 32-byte entry is needed as the end marker (an entry whose first byte is 0x00).
So the actual maximum number of files in a directory depends on the length of the filenames.
I have a project for school which involves making a C program that works like tar on a Unix system. I have some questions that I would like someone to explain to me:
The size of the archive. I understood (from browsing the internet) that an archive consists of blocks of 512 bytes each. So the header takes 512 bytes, then it's followed by the content of the file (if there is only one file to archive) organized in blocks of 512 bytes, then 2 more zero-filled blocks of 512 bytes.
For example: let's say that I have a txt file of 0 bytes to archive. This should mean 512 * 3 bytes in total. Why, when I do it with the tar command on Unix and click Properties, is it 10,240 bytes? I think it adds some 0 (NULL) bytes, but I don't know where, why, and how many...
The header checksum. As far as I understand, this should be the size of the archive. When I check it with hexdump -C, it appears as a number near the real size (from Properties) of the archive, for example 11200 or 11205 or something similar if I archive a 0-byte txt file. Is this size in octal or decimal? My bet is that it's octal, because all information in the header needs to be in octal. My second question at this point: what is added on top of the original size of 10,240 bytes?
The header mode. Let's say that I have a file with permissions 664; with the format digit 0, I should put 0664 in the header. Why, in an authentic archive, are 3 more 0s printed at the start (0000664)?
There have been various versions of the tar format, and not all of the extensions to previous formats were compatible with each other, so there's always a bit of guessing involved. For example, in very old Unix systems, file names were not allowed to have more than 14 bytes, so the space for the file name (including path) was plenty; later, with longer file names, it had to be extended, but there wasn't space, so the file name got split in 2 parts; even later, GNU tar introduced the ././@LongLink pseudo-entries that would make older tars at least restore the file to its original name.
1) Tar was originally a Tape ARchiver. To achieve constant throughput to tapes and avoid starting/stopping the tape too much, several blocks needed to be written at once. 20 blocks of 512 bytes were the default, and the -b option is there to set the number of blocks. Very often, this size was pre-defined by the hardware, and using wrong blocking factors made the resulting tape unusable. This is why tar appends \0-filled blocks until the tar size is a multiple of the blocking size.
2) The file size is in octal, and contains the true size of the original file that was put into the tar. It has nothing to do with the size of the tar file.
The checksum is calculated from the sum of the header bytes, but then stored in the header as well. So the act of storing the checksum would change the header, thus invalidate the checksum. That's why you store all other header fields first, set the checksum to spaces, then calculate the checksum, then replace the spaces with your calculated value.
Note that the header of a tarred file is pure ASCII. In those old days, when a tar file (whose components were plain ASCII) got corrupted, an admin could just open the tar file with an editor and restore the components manually. That's why the designers of the tar format were wary of \0 bytes and used spaces instead.
3) Tar files can store block devices, character devices, directories and similar things. Unix stores these file types in the same place as the permission flags, and the header mode field contains the whole file mode, including the file type bits. That's why the number is longer than the pure permissions.
There's a lot of information at http://en.wikipedia.org/wiki/Tar_%28computing%29 as well.