Reading a FAT16 file system - file

I am trying to read a FAT16 file system to get information about it, such as the number of sectors, clusters, bytes per sector, and so on.
I am reading it like this:
FILE *floppy;
unsigned char bootDisk[512];
floppy = fopen(name, "rb");
fread(bootDisk, 1, 512, floppy);
int i;
for (i = 0; i < 80; i++){
printf("%u,",bootDisk[i]);
}
and it outputs this:
235,60,144,109,107,100,111,115,102,115,0,0,2,1,1,0,2,224,0,64,11,240,9,0,18,0,2,0,0,0,0,0,0,0,0,0,0,0,41,140,41,7,68,32,32,32,32,32,32,32,32,32,32,32,70,65,84,49,50,32,32,32,14,31,190,91,124,172,34,192,116,11,86,180,14,187,7,0,205,16,
What do these numbers represent and what type are they? Bytes?

You are not reading the values properly. Most of them are longer than 1 byte.
From the spec you can obtain the length and meaning of every field in the boot sector:
Offset Size (bytes) Description
0000h 3 Code to jump to the bootstrap code.
0003h 8 Oem ID - Name of the formatting OS
000Bh 2 Bytes per Sector - Usually there are 512 bytes per sector.
000Dh 1 Sectors per Cluster
000Eh 2 Reserved sectors from the start of the volume.
0010h 1 Number of FAT copies - Usually 2 copies are used to prevent data loss.
0011h 2 Number of possible root entries - 512 entries are recommended.
0013h 2 Small number of sectors - Used when volume size is less than 32 MB.
0015h 1 Media Descriptor
0016h 2 Sectors per FAT
0018h 2 Sectors per Track
001Ah 2 Number of Heads
001Ch 4 Hidden Sectors
0020h 4 Large number of sectors - Used when volume size is greater than 32 MB.
0024h 1 Drive Number - Used by some bootstrap code, e.g. MS-DOS.
0025h 1 Reserved - Is used by Windows NT to decide if it shall check disk integrity.
0026h 1 Extended Boot Signature - Indicates that the next three fields are available.
0027h 4 Volume Serial Number
002Bh 11 Volume Label - Should be the same as in the root directory.
0036h 8 File System Type - The string should be 'FAT16' padded with spaces to 8 bytes.
003Eh 448 Bootstrap code - May shrink in the future.
01FEh 2 Boot sector signature - This is the AA55h signature
You should probably use a custom struct to read the boot sector.
Like:
typedef struct {
unsigned char jmp[3];
char oem[8];
unsigned short sector_size;
unsigned char sectors_per_cluster;
unsigned short reserved_sectors;
unsigned char number_of_fats;
unsigned short root_dir_entries;
[...]
} my_boot_sector;
Keep in mind your endianness and padding rules in your implementation. This struct is an example only.
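To illustrate the endianness and padding point, here is a minimal sketch that avoids overlaying a struct on the buffer and instead assembles each field byte by byte from the offsets in the table above. The names fat_bpb, le16, le32 and parse_bpb are made up for this example, and only a handful of fields are decoded.

```c
#include <stdint.h>

/* Assemble little-endian values byte by byte, so the result does not
 * depend on host endianness or on struct padding. */
static uint16_t le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical container for a few boot sector fields. */
typedef struct {
    uint16_t bytes_per_sector;    /* offset 0x0B */
    uint8_t  sectors_per_cluster; /* offset 0x0D */
    uint16_t reserved_sectors;    /* offset 0x0E */
    uint8_t  number_of_fats;      /* offset 0x10 */
    uint16_t root_dir_entries;    /* offset 0x11 */
    uint32_t total_sectors;       /* offset 0x13, or 0x20 if that is zero */
    uint16_t sectors_per_fat;     /* offset 0x16 */
} fat_bpb;

/* sector must point to at least the first 36 bytes of the boot sector. */
static void parse_bpb(const unsigned char *sector, fat_bpb *out)
{
    uint16_t small_count = le16(sector + 0x13);

    out->bytes_per_sector    = le16(sector + 0x0B);
    out->sectors_per_cluster = sector[0x0D];
    out->reserved_sectors    = le16(sector + 0x0E);
    out->number_of_fats      = sector[0x10];
    out->root_dir_entries    = le16(sector + 0x11);
    out->total_sectors       = small_count ? small_count : le32(sector + 0x20);
    out->sectors_per_fat     = le16(sector + 0x16);
}
```

Fed the bytes printed in the question, this gives 512 bytes per sector, 1 sector per cluster, 1 reserved sector, 2 FATs, 224 root entries, 2880 total sectors and 9 sectors per FAT. Incidentally, the bytes at offset 0x36 in that dump spell "FAT12", so the image looks like a FAT12 floppy rather than FAT16; the boot sector layout listed above is the same for both.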
If you need more details this is a thorough example.

Related

Linux sound programming. How to determine a buffer size in frames?

I'm experimenting with ALSA and came across the following configuration parameter in this howto, Section 2:
The unit of the buffersize depends on the function. Sometimes it is
given in bytes, sometimes the number of frames has to be specified.
One frame is the sample data vector for all channels. For 16 Bit
stereo data, one frame has a length of four bytes.
/* Set buffer size (in frames). The resulting latency is given by */
/* latency = periodsize * periods / (rate * bytes_per_frame) */
if (snd_pcm_hw_params_set_buffer_size(pcm_handle, hwparams, (periodsize * periods)>>2) < 0) {
fprintf(stderr, "Error setting buffersize.\n");
return(-1);
}
I don't understand this part: "For 16 Bit stereo data, one frame has a length of four bytes."
Why is it four? Does it follow from the number of channels, 2? Earlier they configured it as follows:
/* Set number of channels */
if (snd_pcm_hw_params_set_channels(pcm_handle, hwparams, 2) < 0) {
fprintf(stderr, "Error setting channels.\n");
return(-1);
}
What if my audio system has 4 outputs? Or 6? Does it mean that I have to configure it as 16 Bit * 4 and 16 Bit * 6 respectively?
Why is it four? Does it follow from the number of channels, 2?
Yes, as the passage quoted earlier says:
One frame is the sample data vector for all channels.
So for stereo 16 bit data, there are two (left and right) channels of 16 bits (=2 bytes) each, so that totals to 4 bytes per frame.
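The same arithmetic extends to 4 or 6 channels. A small sketch, with hypothetical helper names (bytes_per_frame and bytes_to_frames are not ALSA functions), assuming interleaved samples whose width is a whole number of bytes:

```c
#include <stddef.h>

/* One frame holds one sample for every channel. */
static size_t bytes_per_frame(unsigned channels, unsigned bits_per_sample)
{
    return (size_t)channels * (bits_per_sample / 8);
}

/* Convert a buffer size in bytes to a size in frames. For 16-bit stereo
 * this is the same as the ">> 2" in the howto's code. */
static size_t bytes_to_frames(size_t bytes, unsigned channels,
                              unsigned bits_per_sample)
{
    return bytes / bytes_per_frame(channels, bits_per_sample);
}
```

With 16-bit samples this gives 4 bytes per frame for 2 channels, 8 for 4 channels and 12 for 6 channels, so the hard-coded shift by 2 in the howto would only be right for the stereo case.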

Parsing a .pcap file in plain C

I'm trying to create my own pcap files parser. According to Wireshark's docs:
Global Header
This header starts the libpcap file and will be followed by the first packet header:
typedef struct pcap_hdr_s {
guint32 magic_number; /* magic number */
guint16 version_major; /* major version number */
guint16 version_minor; /* minor version number */
gint32 thiszone; /* GMT to local correction */
guint32 sigfigs; /* accuracy of timestamps */
guint32 snaplen; /* max length of captured packets, in octets */
guint32 network; /* data link type */
} pcap_hdr_t;
magic_number: used to detect the file format itself and the byte ordering. The writing application writes 0xa1b2c3d4 with its native byte ordering format into this field. The reading application will read either 0xa1b2c3d4 (identical) or 0xd4c3b2a1 (swapped). If the reading application reads the swapped 0xd4c3b2a1 value, it knows that all the following fields will have to be swapped too. For nanosecond-resolution files, the writing application writes 0xa1b23c4d, with the two nibbles of the two lower-order bytes swapped, and the reading application will read either 0xa1b23c4d (identical) or 0x4d3cb2a1 (swapped).
version_major, version_minor: the version number of this file format (current version is 2.4)
thiszone: the correction time in seconds between GMT (UTC) and the local timezone of the following packet header timestamps. Examples: If the timestamps are in GMT (UTC), thiszone is simply 0. If the timestamps are in Central European time (Amsterdam, Berlin, ...) which is GMT + 1:00, thiszone must be -3600. In practice, time stamps are always in GMT, so thiszone is always 0.
sigfigs: in theory, the accuracy of time stamps in the capture; in practice, all tools set it to 0
snaplen: the "snapshot length" for the capture (typically 65535 or even more, but might be limited by the user), see: incl_len vs. orig_len below
network: link-layer header type, specifying the type of headers at the beginning of the packet (e.g. 1 for Ethernet, see tcpdump.org's link-layer header types page for details); this can be various types such as 802.11, 802.11 with various radio information, PPP, Token Ring, FDDI, etc.
/!\ Note: if you need a new encapsulation type for libpcap files (the value for the network field), do NOT use ANY of the existing values! I.e., do NOT add a new encapsulation type by changing an existing entry; leave the existing entries alone. Instead, send mail to tcpdump-workers@lists.tcpdump.org, asking for a new link-layer header type value, and specifying the purpose of the new value.
The first integer in the file should be either 0xA1B2C3D4 or 0xD4C3B2A1, but my code's output:
#include <stdio.h>
typedef unsigned int guint32;
typedef unsigned short guint16;
int main()
{
FILE * file = fopen("test.pcap", "rb");
guint32 magic_number;
fscanf(file, "%d", &magic_number);
printf("%x\n", magic_number);
return 0;
}
Is 0x8. Why is that?
The magic number is the first 4 bytes of the file. With fscanf(..., "%d", ...) you don't read these 4 bytes; instead you try to interpret the beginning of the file as the ASCII representation of a number, i.e. "1234" instead of "\x01\x02\x03\x04". So instead of fscanf you need to use fread to read exactly 4 bytes:
fread(&magic_number, 4, 1, file);
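Putting that together with the byte-order rule from the quoted docs, a minimal sketch could look like this. The function names (swap32, pcap_byte_order, pcap_check_file) are made up for the example, and it only inspects the magic number, not the rest of the header.

```c
#include <stdint.h>
#include <stdio.h>

#define PCAP_MAGIC_USEC 0xa1b2c3d4u /* microsecond-resolution files */
#define PCAP_MAGIC_NSEC 0xa1b23c4du /* nanosecond-resolution files */

/* Reverse the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Returns 0 if the file was written in the reader's byte order,
 * 1 if every following field must be byte-swapped,
 * -1 if the value is not a pcap magic number at all. */
static int pcap_byte_order(uint32_t magic)
{
    uint32_t swapped = swap32(magic);

    if (magic == PCAP_MAGIC_USEC || magic == PCAP_MAGIC_NSEC)
        return 0;
    if (swapped == PCAP_MAGIC_USEC || swapped == PCAP_MAGIC_NSEC)
        return 1;
    return -1;
}

/* Read the first 4 raw bytes of a file and classify them.
 * Returns -1 if the file cannot be opened or is too short. */
static int pcap_check_file(const char *path)
{
    uint32_t magic;
    size_t n;
    FILE *f = fopen(path, "rb");

    if (f == NULL)
        return -1;
    n = fread(&magic, sizeof magic, 1, f);
    fclose(f);
    return n == 1 ? pcap_byte_order(magic) : -1;
}
```

Calling pcap_check_file("test.pcap") should then return 0 or 1 for a valid capture, and the return value of fread is worth checking in any case, since a short or missing file would otherwise leave magic_number uninitialized.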

Portable way to determine sector size in Linux

I want to write a small program in C which can determine the sector size of a hard disk. I wanted to read the file located in /sys/block/sd[X]/queue/hw_sector_size, and it worked in CentOS 6/7.
However when I tested in CentOS 5.11, the file hw_sector_size is missing, and I have only found max_hw_sectors_kb and max_sectors_kb.
Thus, I'd like to know how I can determine the sector size in CentOS 5 (which APIs to use), or whether there is a better way to do so. Thanks.
The fdisk utility displays this information (and runs successfully on kernels even older than the 2.6.x vintage on CentOS 5), so that seems a likely place to look for an answer. Fortunately, we're living in the wonderful world of open source, so all it requires is a little investigation.
The fdisk program is provided by the util-linux package, so we need that first.
The sector size is displayed in the output of fdisk like this:
Disk /dev/sda: 477 GiB, 512110190592 bytes, 1000215216 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
If we look for Sector size in the util-linux code, we find it in disk-utils/fdisk-list.c:
fdisk_info(cxt, _("Sector size (logical/physical): %lu bytes / %lu bytes"),
fdisk_get_sector_size(cxt),
fdisk_get_physector_size(cxt));
So, it looks like we need to find fdisk_get_sector_size, which is defined in libfdisk/src/context.c:
unsigned long fdisk_get_sector_size(struct fdisk_context *cxt)
{
assert(cxt);
return cxt->sector_size;
}
Well, that wasn't super helpful. We need to find out where cxt->sector_size is set:
$ grep -lri 'cxt->sector_size.*=' | grep -v tests
libfdisk/src/alignment.c
libfdisk/src/context.c
libfdisk/src/dos.c
libfdisk/src/gpt.c
libfdisk/src/utils.c
I'm going to start with alignment.c, since that filename sounds promising. Looking through that file for the same regex I used to list the files, we find this:
cxt->sector_size = get_sector_size(cxt->dev_fd);
Which leads me to:
static unsigned long get_sector_size(int fd)
{
int sect_sz;
if (!blkdev_get_sector_size(fd, &sect_sz))
return (unsigned long) sect_sz;
return DEFAULT_SECTOR_SIZE;
}
Which in turn leads me to the definition of blkdev_get_sector_size in lib/blkdev.c:
#ifdef BLKSSZGET
int blkdev_get_sector_size(int fd, int *sector_size)
{
if (ioctl(fd, BLKSSZGET, sector_size) >= 0)
return 0;
return -1;
}
#else
int blkdev_get_sector_size(int fd __attribute__((__unused__)), int *sector_size)
{
*sector_size = DEFAULT_SECTOR_SIZE;
return 0;
}
#endif
And there we go. There is a BLKSSZGET ioctl that seems useful. A search for BLKSSZGET leads us to this stackoverflow question, which includes the following information in a comment:
For the record: BLKSSZGET = logical block size, BLKBSZGET = physical
block size, BLKGETSIZE64 = device size in bytes, BLKGETSIZE = device
size/512. At least if the comments in fs.h and my experiments can be
trusted. – Edward Falk Jul 10 '12 at 19:33
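Based on what the util-linux code does, a stripped-down version might look like the sketch below. The function names are hypothetical, it is Linux-only, and I have not run it on CentOS 5 itself; opening a block device usually needs root.

```c
#include <fcntl.h>
#include <linux/fs.h>   /* BLKSSZGET */
#include <sys/ioctl.h>
#include <unistd.h>

/* Ask the kernel for the logical sector size of an open block device.
 * Returns 0 on success and stores the size in *size, -1 on failure. */
static int logical_sector_size_fd(int fd, int *size)
{
    return ioctl(fd, BLKSSZGET, size) < 0 ? -1 : 0;
}

/* Convenience wrapper taking a device path such as "/dev/sda". */
static int logical_sector_size(const char *device, int *size)
{
    int rc;
    int fd = open(device, O_RDONLY);

    if (fd < 0)
        return -1;
    rc = logical_sector_size_fd(fd, size);
    close(fd);
    return rc;
}
```

Typical use would be to call logical_sector_size("/dev/sda", &size) and, as fdisk does, fall back to a default of 512 if it returns -1.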

LBA and cluster

I wonder about LBA and cluster number.
My question is this:
1. Is LBA 0 always cluster 2?
2. Then what are clusters 0 and 1 for?
3. Is the only difference between a cluster and an LBA where they start on the disk?
4. What is the relation among CHS, LBA, and cluster number?
5. And in the following code, what is the add ax, WORD [datasector] instruction for?
;************************************************;
; Convert CHS to LBA
; LBA = (cluster - 2) * sectors per cluster
;************************************************;
ClusterLBA:
sub ax, 0x0002 ; zero base cluster number
xor cx, cx
mov cl, BYTE [bpbSectorsPerCluster] ; convert byte to word
mul cx
add ax, WORD [datasector] ; base data sector
ret
There are many sector numbering schemes on disk drives. One of the earliest was CHS (Cylinder-Head-Sector). One sector can be selected by specifying the cylinder (track), read/write head and sector per track triplet. This numbering scheme depends on the actual physical characteristics of the disk drive.
The first logical sector resides on cylinder 0, head 0, sector 1. The second is on sector 2, and so on. When the sectors on a track run out (e.g. a 1.44 MB floppy disk has 18 sectors per track), the next head is used, starting at sector 1 again, and so on.
You can convert CHS addresses to an absolute (or logical) sector number with a little math:
L = (C * Nh + H) * Ns + S - 1
where C, H and S are the cylinder, head and sector numbers according to CHS addressing, while Nh and Ns are the number of heads and the number of sectors per track, respectively. The reverse calculation (to convert LBA to CHS) is just as simple.
In this numbering scheme, which is called LBA (Logical Block Addressing), each sector can be identified by a single number. The first logical sector is LBA 0, the second is LBA 1, and so on. This scheme is linear and easier to deal with.
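The CHS formula and its inverse can be sketched in C as follows. The type and function names are made up for the example; nh and ns are the number of heads and sectors per track.

```c
#include <stdint.h>

typedef struct {
    uint32_t c; /* cylinder, counted from 0 */
    uint32_t h; /* head, counted from 0 */
    uint32_t s; /* sector, counted from 1 */
} chs_t;

/* L = (C * Nh + H) * Ns + S - 1 */
static uint32_t chs_to_lba(chs_t a, uint32_t nh, uint32_t ns)
{
    return (a.c * nh + a.h) * ns + a.s - 1;
}

/* Inverse of the formula above. */
static chs_t lba_to_chs(uint32_t lba, uint32_t nh, uint32_t ns)
{
    chs_t a;

    a.c = lba / (nh * ns);
    a.h = (lba / ns) % nh;
    a.s = lba % ns + 1;
    return a;
}
```

For a 1.44 MB floppy (2 heads, 18 sectors per track), CHS 0/0/1 maps to LBA 0, CHS 0/1/1 to LBA 18, and the last sector, LBA 2879, maps back to CHS 79/1/18.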
Clusters are simply groups of contiguous sectors on the disk, which are treated together by the operating system and the file system, in order to reduce disk fragmentation and the disk space needed for file system metadata (e.g. to describe in which sectors a specific file can be found on the disk). A cluster may consist of only 1 sector (512 bytes), up to 128 sectors (64 kilobytes) or more, depending on the capacity of the disk.
Again, the logical sector number of the first sector of a cluster can be easily calculated:
L = ((U - Sc) * Nc) + Sd
where U is the cluster number, Nc is the number of sectors in a cluster, Sc is the first valid cluster number, and Sd is the number of the first logical sector available for generic file data. The latter two parameters (Sc and Sd) are completely filesystem and operating system specific values.
Some filesystems (for example FAT16, and the whole FAT family) reserve cluster numbers 0 and 1 for special purposes, which is why the first actual cluster is cluster number two (Sc = 2 in this case). Similarly, there may be some reserved sectors at the beginning of the disk, where no data is allowed to be written to or read from. This reserved area can range from a few sectors (e.g. a boot record) to millions of sectors (e.g. a completely different partition which precedes our partition on the hard disk).
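As a sketch of the cluster formula for the FAT12/FAT16 case: the data area starts after the reserved sectors, the FAT copies and the fixed-size root directory (32 bytes per entry), so Sd can be computed from the boot sector fields. The function names are hypothetical, and the sector numbers are relative to the start of the volume, not the whole disk.

```c
#include <stdint.h>

/* Sd for FAT12/FAT16: reserved sectors + all FAT copies + root directory. */
static uint32_t fat_first_data_sector(uint32_t reserved_sectors,
                                      uint32_t number_of_fats,
                                      uint32_t sectors_per_fat,
                                      uint32_t root_dir_entries,
                                      uint32_t bytes_per_sector)
{
    /* Each directory entry is 32 bytes; round up to whole sectors. */
    uint32_t root_dir_sectors =
        (root_dir_entries * 32 + bytes_per_sector - 1) / bytes_per_sector;

    return reserved_sectors + number_of_fats * sectors_per_fat +
           root_dir_sectors;
}

/* L = ((U - Sc) * Nc) + Sd */
static uint32_t cluster_to_lba(uint32_t cluster, uint32_t first_cluster,
                               uint32_t sectors_per_cluster,
                               uint32_t first_data_sector)
{
    return (cluster - first_cluster) * sectors_per_cluster +
           first_data_sector;
}
```

For a standard 1.44 MB floppy (1 reserved sector, 2 FATs of 9 sectors each, 224 root entries, 512-byte sectors) the data area starts at sector 33, so cluster 2 is at LBA 33 and cluster 3 at LBA 34. This matches the shape of the ClusterLBA routine in the question: subtract 2, multiply by sectors per cluster, add the base data sector.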
Huh, that was the long answer. The short answers to your questions can be summarized as follows:
1. No, LBA 0 is not always cluster 2; it's filesystem specific (in the case of FAT, cluster 2 is the first data cluster on the volume, but it does not start at LBA 0 - see answer 5).
2. The interpretation of cluster numbers 0 and 1 is also filesystem specific (in the case of FAT, cluster number 0 represents an empty cluster in the File Allocation Table, and cluster number 1 is reserved).
3. No, the main difference is that a cluster number addresses a group of contiguous sectors, while an LBA addresses a single sector on the disk.
4. See the formulas and the accompanying description in the long answer above.
5. It's hard to tell from such a short piece of assembly code, but my best guess would be the number of reserved sectors at the beginning of the partition (denoted by Sd in the formula above).

Programatically determining file "size on disk" in advance

I need to know how big a given in-memory buffer will be as an on-disk (usb stick) file before I write it. I know that unless the size falls on the block size boundary, it's likely to get rounded up, e.g. a 1 byte file takes up 4096 bytes on-disk. I'm currently doing this using GetDiskFreeSpace() to work out the disk block size, then using this to calculate the on-disk size like this:
GetDiskFreeSpace(szDrive, &dwSectorsPerCluster,
&dwBytesPerSector, NULL, NULL);
dwBlockSize = dwSectorsPerCluster * dwBytesPerSector;
if (dwInMemorySize % dwBlockSize != 0)
{
dwSizeOnDisk = ((dwInMemorySize / dwBlockSize) * dwBlockSize) + dwBlockSize;
}
else
{
dwSizeOnDisk = dwInMemorySize;
}
Which seems to work fine, BUT GetDiskFreeSpace() only works on disks up to 2GB according to MSDN. GetDiskFreeSpaceEx() doesn't return the same information, so my question is, how else can I calculate this information for drives >2GB? Is there an API call I've missed? Can I assume some hard values depending on the overall disk size?
MSDN only states that the GetDiskFreeSpace() function cannot report volume sizes greater than 2GB. It works fine for retrieving sectors per cluster and bytes per sector, I've used it myself for very similar-looking code ;-)
But if you want disk capacity too, you'll need an additional call to GetDiskFreeSpaceEx().
The size of a file on disk is a fuzzy concept. In NTFS, a file consists of a set of data elements. You're primarily thinking of the "unnamed data stream". That's an attribute of a file that, if small, can be packed with the other attributes in the directory entry. Apparently, you can store a data stream of up to 700-800 bytes in the directory entry itself. Hence, your hypothetical 1 byte file would be as big as a 0 byte or 700 byte file.
Another influence is file compression. This will make the on-disk size potentially smaller than the in-memory size.
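Leaving those NTFS special cases aside, the rounding from the question can be written without the branch. This is only a sketch of the plain "round up to a whole cluster" rule; size_on_disk is a made-up name, and the cluster size is assumed to come from sectors per cluster times bytes per sector as in the question.

```c
#include <stdint.h>

/* Round size up to the next multiple of cluster_size.
 * Ignores resident (in-directory) data and compression. */
static uint64_t size_on_disk(uint64_t size, uint64_t cluster_size)
{
    if (cluster_size == 0)
        return size; /* nothing sensible to round to */
    return (size + cluster_size - 1) / cluster_size * cluster_size;
}
```

With 4096-byte clusters a 1 byte buffer comes out as 4096, exactly 4096 bytes stays 4096, and 19912 bytes becomes 20480.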
You should be able to obtain this information using the DeviceIoControl function and DISK_GEOMETRY_EX. I think it returns a structure that contains the information you are looking for:
http://msdn.microsoft.com/en-us/library/aa363216(VS.85).aspx
http://msdn.microsoft.com/en-us/library/ms809010.aspx
In ActionScript:
var size:Number = 19912;
var sizeOnDisk:Number = size;
var remainder:Number = size % (1024 * 4);
if (remainder > 0) {
sizeOnDisk = size + ((1024 * 4) - remainder);
}
trace(size);
trace(sizeOnDisk);
