File-system: Overwriting data of equal length

I have a project where I have to update the data on disk very frequently, so that as little as possible is lost in case of power failure. When I overwrite exactly 512 bytes (one sector of my drive) in a file with data of equal length, does the file system mark only the sectors that have changed and update just those on disk when it is ready to flush? Or does it write the whole file every time it flushes a change? I am mainly concerned with ext4, but I am curious whether it is the same with every file system.
If the standard behavior is not to track changes but to overwrite the entire file, is there a way to change this? Some write option?

In general, with Linux, files are cached in the page cache, and whether or not a page is dirty is tracked at the page level. On Intel x86 platforms the page size is 4K, so if you dirty a 4K page, it is that 4K page which gets written back.
If you want to overwrite only a single 512-byte sector, and you have an HDD with 512-byte sectors, you can open the file with the O_DIRECT flag. If you then issue a 512-byte write at a file offset which is a multiple of 512 bytes, from a memory buffer which is itself 512-byte aligned, the write bypasses the page cache and goes directly to the disk (hence O_DIRECT).
Note that a number of modern disks really use a 4K physical sector but emulate 512-byte sectors for backwards-compatibility reasons; these are sometimes called 512e drives (e for emulated). On such a drive, if you do a 512-byte sector write, the disk will do a read-modify-write cycle, since it can internally only write 4K at a time. This will be visible to you as a performance hit, but from a functional perspective it will otherwise look the same as a traditional, old-fashioned 512-byte-sectored HDD.
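Putting those alignment rules together, here is a minimal sketch of such a direct overwrite (my own illustration, not from the answer above; it assumes a 512-byte logical sector, and the file name "data.bin" and the target offset are made up for the example):

#define _GNU_SOURCE             /* O_DIRECT is Linux-specific */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define SECTOR 512

int main(void)
{
    int fd = open("data.bin", O_WRONLY | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    /* O_DIRECT requires the source buffer itself to be sector-aligned */
    void *buf;
    if (posix_memalign(&buf, SECTOR, SECTOR) != 0) return 1;
    memset(buf, 'A', SECTOR);

    /* offset and length are both multiples of 512: overwrite sector #1 */
    if (pwrite(fd, buf, SECTOR, 1 * SECTOR) != SECTOR) {
        perror("pwrite");
        return 1;
    }

    free(buf);
    close(fd);
    return 0;
}

Note that O_DIRECT bypasses the page cache but not necessarily the drive's own volatile write cache; given the power-loss concern, following the write with fdatasync() (or also opening with O_DSYNC) is what actually forces the data to stable storage.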

Related

How to get the size of a disk drive's sector

Can I get the size of a disk sector via the Linux API/ABI? It's the quantum of disk I/O; normally it's 512 bytes, but other values are possible (usually a multiple of 512 bytes).
It should not be confused with the size of a logical block or the sector size of a file system.
A block device is exposed as a file in a UNIX file system (/dev/sda, /dev/sr etc.). That means you can open that file and manipulate its contents just as you would the contents of the corresponding block device.
So working with a real block device is similar to working with a virtual hard disk (the .vhd format, for instance).
But I don't know how to get the sector size in the general case.
At the moment my only solution is to get the maximal CHS address and the size of the hard drive, both via the BIOS. But I think that's a bad idea, because it sacrifices portability.
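For what it's worth, Linux exposes this directly through the BLKSSZGET (logical sector size) and BLKPBSZGET (physical sector size) ioctls on the block device node. A minimal sketch (the device path is illustrative, and you need read permission on the device):

#include <fcntl.h>
#include <linux/fs.h>       /* BLKSSZGET, BLKPBSZGET */
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    const char *dev = argc > 1 ? argv[1] : "/dev/sda";
    int fd = open(dev, O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    int logical = 0;            /* the quantum the kernel addresses I/O in */
    unsigned int physical = 0;  /* what the drive writes internally (512e drives differ) */
    if (ioctl(fd, BLKSSZGET, &logical) < 0)   { perror("BLKSSZGET");  return 1; }
    if (ioctl(fd, BLKPBSZGET, &physical) < 0) { perror("BLKPBSZGET"); return 1; }

    printf("%s: logical %d bytes, physical %u bytes\n", dev, logical, physical);
    close(fd);
    return 0;
}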

Giving read() a start position

When you give read a start position - does it slow down read()? Does it have to read everything before the position to find the text it's looking for?
In other words, we have two different read commands,
read(fd,1000,2000)
read(fd,50000,51000)
where we give it two arguments:
read(file descriptor, start, end)
is there a way to implement read so that the two commands take the same amount of computing time?
You don't name a specific file system implementation or a specific language library, so I will comment in general.
In general, a file interface is built directly on top of the OS-level file interface. In the OS-level interface for most types of drives, data can be read in sectors with random access: the drive can seek to the start of a particular sector (without reading data) and can then read that sector without reading any of the data before it in the file. Because data is typically read in sector-sized chunks, if the data you request doesn't align perfectly on a sector boundary, the OS may read the entire sector containing the first byte you requested; but that isn't much extra data and won't make a meaningful difference in performance, since once the read/write head is positioned correctly, a sector is typically read in one DMA transfer.
Disk access times to read a given set of bytes for a spinning hard drive are not entirely predictable so it's not possible to design a function that will take exactly the same time no matter which bytes you're reading. This is because there's OS level caching, disk controller level caching and a difference in seek time for the read/write head depending upon what the read/write head was doing beforehand. If there are any other processes or services running on your system (which there always are) some of them may also be using the disk and contending for disk access too. In addition, depending upon how your files were written and how many bytes you're reading and how well your files are optimized, all the bytes you read may or may not be in one long readable sequence. It's possible the drive head may have to read some bytes, then seek to a new position on the disk and then read some more. All of that is not entirely predictable.
Oh, and some of this is different if it's a different type of drive (like an SSD) since there's no drive head to seek.
When you give read a start position - does it slow down read()?
No. The OS reads the directory entry to find out where the file is located on the disk, then calculates where on the disk your desired read should be, seeks to that position on the disk and starts reading.
Does it have to read everything before the position to find the text it's looking for?
No. Since it reads sectors at a time, it may read a few bytes before what you requested (whatever precedes it in the sector), but sectors are not huge (typically 512 bytes or 4K) and are read in one fell swoop using DMA, so the extra part of the sector before your desired data is not likely noticeable.
Is there a way to implement read so that the two commands take the same amount of computing time?
So no, not really. Disk reads, even of an identical number of bytes, vary a bit depending upon the situation, what else might be happening on the computer, and what else might already be cached by the OS or the drive itself.
If you share what problem you're really trying to solve, we could probably suggest alternate approaches rather than relying on a given disk read taking an exact amount of time.
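For what it's worth, POSIX already provides a positioned read very close to what the question sketches: pread() reads a byte count starting at an absolute offset, without scanning any of the bytes before it. A minimal sketch mapping the question's hypothetical read(fd, start, end) onto it (the file name is illustrative):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* hypothetical wrapper matching the question's (fd, start, end) interface */
static ssize_t read_range(int fd, off_t start, off_t end, char *buf)
{
    return pread(fd, buf, (size_t)(end - start), start);
}

int main(void)
{
    int fd = open("data.bin", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    char buf[1000];
    /* both calls seek directly to 'start'; neither reads the earlier bytes */
    read_range(fd, 1000, 2000, buf);
    read_range(fd, 50000, 51000, buf);

    close(fd);
    return 0;
}

As explained above, the two calls do the same amount of computing; only the physical seek and caching behavior underneath can differ.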
Well, file systems usually split the data in a file into even-sized blocks. In most file systems the allocated blocks are organized in trees with a high branching factor, so finding the nth data block takes effectively the same time as finding the first data block of the file, computing-wise.
The only general exception to this rule is the brain-damaged floppy-disk file system FAT from Microsoft, which should have become extinct in the 1980s: in it the blocks of a file are organized in a singly-linked list, so to find the nth block you need to scan through n items in the list. Of course, decent operating systems have all sorts of tricks to address the shortcomings here.
The next thing is that your reads should touch the same number of blocks or operating-system memory pages. OS pages are usually 4K nowadays and disk blocks something like 4K too, so making every offset and count a multiple of 4096, 8192 or 16384 is a better design than using round decimal numbers,
i.e.
read(fd, 4096, 8192)
read(fd, 50 * 4096, 51 * 4096)
While it does not affect the computing time in a multiprocessing system, the type of media matters a lot: on magnetic disks the heads need to move to find the new read position and the disk must spin into reading position, whereas SSDs have identical random-access timings regardless of where the data is located. Additionally, the operating system might cache frequently accessed locations, or expect that the block read after block N will be N + 1 and make such sequential access faster. But most of the time you wouldn't care.
Finally: perhaps instead of read you should consider using memory mapped I/O for random accesses!
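A minimal sketch of that memory-mapped approach (the file name is illustrative): after mmap(), random access at any offset is just pointer arithmetic, and the kernel pages data in on demand.

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = open("data.bin", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

    char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    /* "reading" 16 bytes at offset 50*4096 is just a copy; the kernel
       faults the page in if it is not already cached */
    char buf[16];
    memcpy(buf, p + 50 * 4096, sizeof buf);

    munmap(p, st.st_size);
    close(fd);
    return 0;
}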
read() typically reads data from the given file descriptor into a buffer. With the interface in the question, the data read runs from start (arg 2) to end (arg 3); more generally put, the amount of data read is (end - start). So if you have the following reads
read(fd1, 0xffff, 0xffffffff)
and
read(fd2, 0xf, 0xff)
the second read will be quicker, because its end (0xff) minus its start (0xf) is less than the first read's end (0xffffffff) minus its start (0xffff). That is, fewer bytes are being read.

FATFS porting on STM32F103 SPI Flash

I have ported FATFS for FreeRTOS on an STM32F103 with a 32 Mbit SPI flash. In a demo application I have successfully created a file, written to it, and read back from it. My requirement is to store multiple files (images) in the SPI flash and read them back when required.
I have the following questions:
I have set the sector size to 512 bytes, and the block erase size for the SPI flash is 4K. Since a block of SPI flash must be erased before it is written, do I need to keep track of whether a particular block has been erased, or does the file system manage this?
How can I verify whether the sector I am about to write has been erased? What I currently do is erase the complete block containing the sector I am going to write.
How can I make sure that the SPI-flash block I am going to erase does not contain any sector with useful data?
The simplest solution is to define the "cluster" size as 4K, the same as the erase-block size of your flash. That means each file, even if only 1 byte long, takes 4K, which is 8 consecutive sectors of 512 bytes each.
As soon as you need to reserve one more cluster, when the file grows above 4096 bytes, you pick a free cluster, chain it into the FAT, and write the next byte.
For performance reasons, and to increase the endurance of the flash, you should avoid erasing a flash block when it isn't needed: reading is many orders of magnitude faster than erasing. So, when you select a free cluster, you can start a loop that reads each of its 8 sectors. As soon as you find even a single byte not equal to 0xFF, abort the loop and call the flash erase for that block.
A further optimization is possible if the flash controller can perform the blank test directly. Such a test may take only a few microseconds, while reading 8 sectors and looping to check each of the 4096 bytes is probably slower.
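As a sketch of that read-before-erase loop (it uses the FatFs diskio interface for reads and assumes drive number 0; spi_flash_erase_4k() is a hypothetical driver function, not part of FatFs):

#include <stdint.h>
#include "diskio.h"           /* FatFs low-level disk I/O interface */

#define SECTOR_SIZE     512
#define SECTORS_PER_BLK 8     /* one 4K erase block = 8 x 512-byte sectors */

/* hypothetical driver call: erases the 4K block starting at 'sector' */
extern void spi_flash_erase_4k(uint32_t sector);

/* Erase the 4K block starting at first_sector only if it is not
   already blank (i.e. if any byte differs from 0xFF). */
void erase_block_if_needed(uint32_t first_sector)
{
    static BYTE buf[SECTOR_SIZE];

    for (uint32_t s = 0; s < SECTORS_PER_BLK; s++) {
        if (disk_read(0, buf, first_sector + s, 1) != RES_OK) {
            spi_flash_erase_4k(first_sector);   /* play safe on read error */
            return;
        }
        for (uint32_t i = 0; i < SECTOR_SIZE; i++) {
            if (buf[i] != 0xFF) {               /* not blank: erase needed */
                spi_flash_erase_4k(first_sector);
                return;
            }
        }
    }
    /* all 8 sectors are blank: skip the slow erase entirely */
}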

Can sector size of USB flash drive be changed?

I know the cluster size of a USB flash drive can be changed; can the sector size be changed too?
Sector size isn't a configurable parameter in ATA/SATA/SCSI/etc devices and, from my experience, USB flash drives implement one of these protocols. The sector size is reported by the device itself but, even if you could set it to something other than 512, you would likely run into a latent bug somewhere in a driver or file system package that assumed a sector size of 512.
There are real reasons for using a sector size like 512. For example, addressing larger sectors can be done more quickly and efficiently (not just in time but in size/space as well), and throughput to these devices is also better with something like 512. Consider that, if you could set the sector size to something like 16 bytes, you might waste less space with 16-byte sectors compared to a number of half-full 512-byte sectors, but your throughput to the device would probably be worse. In fact, writing a single 16-byte sector would be only slightly faster than writing a single 512-byte sector, while writing 32 16-byte sectors (to write a total of 512 bytes) would likely take longer than writing a single 512-byte sector, simply due to the overhead of transferring multiple sectors.
I would suggest you buy a larger USB flash drive if you are worried about wasting space with 512-byte sectors.

Is there any way to write a few bytes to a disk sector without reading it first?

I've been experimenting with the performance of reading and writing files on Linux, specifically with O_DIRECT, and I'm wondering: both at the hard-drive level and at the POSIX/Linux API level, is it possible to write only a few bytes to a sector, without destroying the rest of the sector, and without reading it first?
My experience with disk drives is that they expect data to be sent to them in entire sectors. So, basically, there's no way of writing less than an entire sector: if you wish to change the start of a sector without changing the end, you must read the whole sector, modify it, and write it back. That is partly to do with how the disk head interacts with the platter (for physical disks, anyway; in the case of flash drives, it's more to do with how small a chunk of the flash can be erased in one go).
In a portable way? Probably not.
In Linux and a few other Unix-like systems, you can open the block device for the drive, seek to a position (probably aligned to the sector size) and write some data to it, but I don't know what effect it would have on the remaining portion of that block.
Your best bet is to try it out on a virtual machine and see what happens. (Obviously, you'll have to have permission to write to the block device.)
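To make the read-modify-write cycle concrete, here is a minimal sketch of doing it yourself under O_DIRECT (512-byte logical sectors assumed; the file name and offsets are illustrative):

#define _GNU_SOURCE             /* O_DIRECT is Linux-specific */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define SECTOR 512

int main(void)
{
    int fd = open("data.bin", O_RDWR | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    void *buf;
    if (posix_memalign(&buf, SECTOR, SECTOR) != 0) return 1;

    off_t off = 3 * SECTOR;                 /* the sector holding the bytes */
    if (pread(fd, buf, SECTOR, off) != SECTOR) { perror("pread"); return 1; }

    memcpy((char *)buf + 100, "patch", 5);  /* change a few bytes in memory */

    if (pwrite(fd, buf, SECTOR, off) != SECTOR) { perror("pwrite"); return 1; }

    free(buf);
    close(fd);
    return 0;
}

Without O_DIRECT, an ordinary sub-sector pwrite() works fine at the API level; the kernel's page cache performs the same read-modify-write on your behalf.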
