A common approach to saving and loading of user / application settings in a GUI application looks like this:
private void FormMain_Load(object sender, EventArgs e)
{
m_foo = Properties.Settings.Default.Foo;
}
private void FormMain_FormClosing(object sender, FormClosingEventArgs e)
{
Properties.Settings.Default.Save();
}
At first glance, this looks reasonable. However, once you consider what actually transpires inside the save and (implicit) load operations, namely disk I/O, it appears a bit suspect. Doesn't it stand against the principle of avoiding potentially long running operations in general and especially I/O on the GUI thread?
I realize that in the vast majority of cases we're talking about very small files located on a local hard drive, but I can come up with scenarios where the operation would take some time to complete if I really wanted to (disk was put to sleep, disk is under stress, disk is actually network storage, etc).
Also, it's not clear what I should do about it. Startup is easy enough to handle, disabling the GUI while the settings are loading asynchronously. But what about the close event? Sure, I could cancel the event, save asynchronously, and then close it myself but it gets a little messy as I need to take care of cases where the user tries to close it again before I finished saving (so I don't initiate another save, or exit while in the middle of saving etc). And I'm not sure it can even be done (at least simply) in other frameworks, say GTK# where the OnDeleteEvent is typically used when it's too late. I suppose I could fire a foreground thread and save it there, but then the user might think the application is closed and run it again before the settings were actually saved (and there could be other consequences for having multiple instances of it alive regardless).
Should I be worried about such scenarios or am I overthinking it?
You are over-thinking it. Presumably based on the assumption that saving settings is slow. It is not in a Winforms app, you are not running this app on a mobile device. File writes are always cached by the file system cache, nothing but a memory-to-memory copy. They run at memory bus speeds, 5 gigbytes/sec at its slowest (old DDR2), more typically around 35 GB/sec.
After which the file system driver lazily writes the changes to disk, highly optimized to minimize the write head movement. Writes up to ~1 gigabyte can fit on modern machines, much more if it has enough RAM. An easy 6 orders of magnitude more than you'll ever need for a user.config file.
Any optimization that turns it into an async write cannot ever make the responsiveness of your app better than a millisecond, completely unobservable by a human. Also the reason why this isn't supported by System.Configuration. You don't have a real problem.
I think you're forgetting the "why" of the general rule (which is the danger with all general rules). The "why" behind "the principle of avoiding potentially long running operations in general and especially I/O on the GUI thread" is to maintain responsiveness (low latency) in the user interaction. If you do long-running work in the GUI thread then your GUI won't refresh as quickly.
What user interaction is needed as the application is loading (before it's ready to run), or as it is closing (when it can no longer be interacted with)? There are scenarios when there is a legitimate answer (interacting with another window in this app, etc.), but this is a minority of the time.
Also, there are various buffers that can help mitigate any latency here. The file is likely read into a disk buffer before your app calls the read of the config (not necessarily). Disk output need only go to the nearest buffer before the app can close.
Related
My C program which uses sorting runs 10x slower the first time than other times. It uses file of integers to sort and even if I change the numbers, program still runs faster. When I restart the PC, the very first time program runs 10x slower. I use time to count the time.
The operating system holds the data in RAM even if it's not needed anymore (this is called "caching"), so when the program runs again, it gets all data from there and there's no disk I/O. Even when you change the data, that change happens in RAM first, and it stays there even after its written to the file.
It doesn't stay in RAM forever though, mind you. If the memory is needed for something else, the cache is deleted. At that point, a disk access is needed (and it's cached in RAM again at that point.)
This is why first access after a reboot is always slow; the data hasn't been cached yet since it was never read from the file.
You have to make hypothesis and confront them to reality. The first you can reasonably make is that it does smell a lot like a caching issue !
Ask yourself those questions :
Does my data fits in free RAM (= is my file cached by the OS FS cache
?)
Does my data fits in CPU data cache ?
Does my data fits in HDD internal cache ?
The most easy hypothesis to discard is the FS cache. Under linux, just issue sync; echo 3 > /proc/sys/vm/drop_caches between each call to your program. The first will make sure the cached data will make it to the physical medium (hard drive), the second will drop the content of the filesystem cache from memory.
The 'physical medium' might be the HDD cache itself, so beware... Under linux you can disable this "write-back" cache with the command hdparm -W 0 <device>, for instance if you are working with drive sda, hdparm -W 0 /dev/sda will do the job. You might want to re-enable it after you are finished with your tests :)
Another hypothesis is the CPU cache, have a look at How can I do a CPU cache flush in x86 Windows? and How to clear CPU L1 and L2 cache
Well, it may or may not be one of those, but it doesn't hurt trying :)
If your program does network access then that could be the reason for the initial delay. Many network protocols need time to setup things. Some examples:
DNS: if your program does any network access, chances are it needs to resolve a hostname to an IP address. The first time it would need at least a network round trip to populate a local cache. Following requests would be shorter.
Networked filesystems (NFS, CIFS and others): opening files can happen through the network.
Even some seemingly innocuous library functions can require network access: the users list for the host can be on a remote directory server.
Appart from this you could use some low level tracing tool to see where the time is spent. On linux a basic tool is strace -r. There is probably some similar tool for other systems. Your compiler must also come with a profiler (i.e. gprof for GCC or maybe Valgrind).
I had a very similar issue but I wasn't loading in a large file - so I was baffled at the long first execution time (caching couldn't have been the issue).
This answer pointed me in the right direction - it was my real-time anti-virus protection. Every time I recompiled the program it would re-scan it as being potentially malicious. I added my project path as an "Exception" to Avira's (in my case) real-time virus protection.
Program execution on the first execution is now lightening quick!
This is nothing new, not just your program many popular commercial softwares face this problem.
To start with check this MATLAB Article about slow fist time execution
In case of other programming language which runs on a Virtual Machine like C# or Java this is quite common.
http://en.wikipedia.org/wiki/Just-in-time_compilation#Startup_delay_and_optimizations
Caching is a good reason for that to happen in C but still 10x is quite a long duration..It might be also possible that you system was loading other resources after you restart.
You should run the program after say 10 minutes after restart for better results. All the startup application would be loaded by that time. (10 minutes ---- depends on the number of startup applications and the time it takes to start each of them)
This is because of compiler optimatization ,what it does is it caches the result for Temoparal Locality and the activation record is saved,time is also saved because the binding object donot have to be reloaded again during Linking Stage
There are two components to the time measured
If you are reading a file from disk,and loading it in memory - and sorting :
1)Time to read the file & store it in an array
2)Time of sorting
Were these measured separately?
Can you check this out?
Invalidating Linux Buffer Cache
Instead of doing a restart, if repeating the experiment with clearing the cache gives the same result, then you can infer that File buffer caching effects were not factored into.
I am working on an embedded application without any OS that needs the use of a File System. I've been over this many times with the people in the project and some agree with me that the system must make a proper shut down of the system whenever there is a power failure or else the file system might go crazy.
Some people say that it doesn't matter if you simply power off the system and let nature run its course, but I think that's one of the worst things to do, especially if you know this will bring you a problem and probably shorten your product's life span.
In the last paragraph I just assumed that it is a problem, but my question remains:
Does a power down have any effect on the file system?
Here is a list of various techniques to help an embedded system tolerate a power failure. These may not be practical for your particular application.
Use a Journaling File System - Can tolerate incomplete writes due to power failure, OS crash, etc. Most modern filesystems are journaled, but do your homework to confirm.
Unless your application needs the write performance, disable all write caching. Check your disk drivers for caching options. Under Linux/Unix, consider mounting the filesystem in sync mode.
Unless it must be writable, make it read-only. Try to keep your application executables and operating system files on their own partition(s), with write protections in place (e.g. mount read only in Linux). Your read/write data should be on its own partition. Even if your application data gets corrupted, your system should still be able to boot (albeit with a fail safe default configuration).
3a. For data that is only written once (e.g. Configuration Settings), try to keep it mounted as read-only most of the time. If there is a settings change mount is as R/W temporarily, update the data, and then unmount/remount it as read-only.
3b. Use a technique similar to 3a to handle application/OS updates in the field.
3c. If it is impractical for you to mount the FS as read-only, at least consider opening individual files as read-only (e.g. fp=fopen("configuration.ini", "r")).
If possible, use separate devices for your storage. Keeping things in separate partitions provides some protection, but there are still edge cases where a partition table may become corrupt and render the entire drive unreadable. Using physically separate devices further isolates against one corrupt device bringing down the whole system. In a perfect world, you would have at least 4 separate devices:
4a. Boot Loader
4b. Operating System & Application Code
4c. Configuration Settings
4e. Application Data
Know the characteristics of your storage devices, and control the brand/model/revision of devices used. Some hard disks ignore cache flush commands from the OS. We had cases where some models of CompactFlash cards would corrupt themselves during a power failure, but the "industrial" models did not have this problem. Of course, this information was not published in any datasheet, and had to be gathered by experimental testing. We developed a list of approved CF cards, and kept inventory of those cards. We periodically had to update this list as older cards became obsolete, or the manufacturer would make a revision.
Put your temporary files in a RAM Disk. If you keep those writes off-disk, you eliminate them as a potential source of corruption. You also reduce flash wear and tear.
Develop automated corruption detection and recovery methods. - All of the above techniques will not help you if the application simply hangs because a missing config file. You need to be able to recover as gracefully as possible:
7a. Your system should maintain at least two copies of its configuration settings, a "primary" and a "backup". If the primary fails for some reason, switch to the backup. You should also consider mechanisms for making backups whenever whenever the configuration is changed, or after a configuration has been declared "good" by the user (testing vs production mode).
7b. Did your Application Data partition fail to mount? Automatically run chkdsk/fsck.
7c. Did chkdsk/fsck fail to fix the problem? Automatically re-format the partition and get it back to a known state.
7d. Do you have a Boot Loader or other method to restore the OS and application after a failure?
7e. Make sure your system will beep, flash an LED, or something to indicate to the user what happened.
Power Failures should be part of your system qualification testing. The only way you will be sure you have a robust system is to test it. Yank the power cord from the system and document what happens. Try yanking the power at multiple points in the system operation (during runtime, while booting, mid configuration, etc). Repeat each test multiple times.
If you cannot mitigate all power failure problems, incorporate a battery or Supercapacitor into the system - Keep in mind that you will need a background process in your OS to initiate a graceful shutdown when power gets low. Also, batteries will require periodic testing and replacement with age.
Addition to msemack's response, unfortunately my rating is too low to post a comment to his answer vs. a separate answer.
Does a power down have any effect on the file system?
Yes, if proper measures aren't put in place to prevent corruption. See previous answers for file system options to help mitigate. However if ATA flush/sleep aren't properly implemented on your device you may run into the scenario we did. In our scenario the device was corrupt beyond the file system, and fdisk/format would not recover the device.
Instead an ATA security-erase was required to recover the device once corruption occurs. In order to avoid this, we implemented an ATA sleep command prior to power loss. This required hold-up of 400ms to support the 160ms ATA sleep took, and leave some head room for degradation of the caps over the life of the product.
Notes from our scenario:
fdisk/format failed to repair/recover the drive.
Our power-safe file system's check disk utility returned that the device had bad blocks, but there really weren't any.
flush/sync returned success, quickly, and most likely weren't implemented.
Once corrupt, dd could not read the device beyond the 1st partition boundary and returned i/o errors after.
hdparm used to issue ATA security-erase, as only method of recovery for some corruption scenarios.
For non-journalling filesystem unexpected turn-off can mean corruption of certain data including directory structure. This happens if there's unsaved data in the cache or if the FS is in the process of writing multi-block update and interruption happens when only some blocks are written.
Journalling addresses this problem mostly - if there's interruption in the middle, recovery routine or check-and-repair operation done by the FS (usually implicitly) brings the filesystem to consistent state. However this state is not always the latest - i.e. if there were some data in the memory cache, they can be lost even with journalling. This is because journalling saves you from corruption of the filesystem but doesn't do magic.
Write-through mode (no write caching) reduces possibility of the data loss but doesn't solve the problem completely, as journalling will work as a cache (for a very short time).
So unfortunately backup or data duplication are the main ways to prevent data loss.
It totally depends on the file system you are using and if it is acceptable to loose some data at power off based on your project requirements.
One could imagine using a file system that is secured against unattended power-off and is able to recover from a partial write sequence. So on the applicative side, if you don't have critic data that absolutely needs to be written before shuting down, there is no need for a specific power off detection procedure.
Now if you want a more specific answer for your project you will have to give more information on the file system you are using and your project requirements.
Edit: As you have critical applicative data to save before power-off, i think you have answered the question yourself. The only way to secure unattended power-off is to have a brown-out detection that alerts your embedded device coupled with some hardware circuitry that allows keeping delivering enought power to the device to perform the shutdown procedure.
The FAT file-system is particularly prone to corruption if a write is in progress or a file is open on shutdown - specifically if ther is a buffered operation that is not flushed . On one project I worked on the solution was to run a file system integrity check and repair (essentially chkdsk/scandsk) on start-up. This strategy did not prevent data loss, but it did prevent the file system becoming unusable.
A number of vendors provide journalling add-on components for FAT to counter exactly this problem. These include Segger, Quadros and Micrium for example.
Either way, your system should generally adopt a open-write-close approach to file access, or open-write-flush if you feel the need to keep the file open.
Before overwriting data in a file, I would like to be pretty sure the old data is stored on disk. It's potentially a very big file (multiple GB), so in-place updates are needed. Usually writes will be 2 MB or larger (my plan is to use a block size of 4 KB).
Instead of (or in addition to) calling fsync(), I would like to retain (not overwrite) old data on disk until the file system has written the new data. The main reasons why I don't want to rely on fsync() is: most hard disks lie to you about doing an fsync.
So what I'm looking for is what is the typical maximum delay for a file system, operating system (for example Windows), hard drive until data is written to disk, without using fsync or similar methods. I would like to have real-world numbers if possible. I'm not looking for advice to use fsync.
I know there is no 100% reliable way to do it, but I would like to better understand how operating systems and file systems work in this regard.
What I found so far is: 30 seconds is / was the default for /proc/sys/vm/dirty_expire_centiseconds. Then "dirty pages are flushed (written) to disk ... (when) too much time has elapsed since a page has stayed dirty" (but there I couldn't find the default time). So for Linux, 40 seconds seems to be on the safe side. But is this true for all file systems / disks? What about Windows, Android, and so on? I would like to get an answer that applies to all common operating systems / file system / disk types, including Windows, Android, regular hard disks, SSDs, and so on.
Let me restate this your problem in only slightly-uncharitable terms: You're trying to control the behavior of a physical device which its driver in the operating system cannot control. What you're trying to do seems impossible, if what you want is an actual guarantee, rather than a pretty good guess. If all you want is a pretty good guess, fine, but beware of this and document accordingly.
You might be able to solve this with the right device driver. The SCSI protocol, for example, has a Force Unit Access (FUA) bit in its READ and WRITE commands that instructs the device to bypass any internal cache. Even if the data were originally written buffered, reading unbuffered should be able to verify that it was actually there.
The only way to reliably make sure that data has been synced is to use the OS specific syncing mechanism, and as per PostgreSQL's Reliability Docs.
When the operating system sends a write request to the storage
hardware, there is little it can do to make sure the data has arrived
at a truly non-volatile storage area. Rather, it is the
administrator's responsibility to make certain that all storage
components ensure data integrity.
So no, there are no truly portable solutions, but it is possible (but hard) to write portable wrappers and deploy a reliable solution.
First of all thanks for the information that hard disks lie about flushing data, that was new to me.
Now to your problem: you want to be sure that all data that you write has been written to the disk (lowest level). You are saying that there are two parts which need to be controlled: the time when the OS writes to the hard drive and the time when the hard drive writes to the disk.
Your only solution is to use a fuzzy logic timer to estimate when the data will be written.
In my opinion this is the wrong way. You have control about when the OS is writing to the hard drive, so use the possibility and control it! Then only the lying hard drive is your problem. This problem can't be solved reliably. I think, you should tell the user/admin that he must take care when choosing the right hard drive. Of course it might be a good idea to implement the additional timer you proposed.
I believe, it's up to you to start a row of tests with different hard drives and Brad Fitzgerald's tool to get a good estimation of when hard drives will have written all data. But of course - if the hard drive wants to lie, you can never be sure that the data really has been written to the disk.
There are a lot of caches involved in giving users a responsive system.
There is cpu cache, kernel/filesystem memory cache, disk drive memory cache, etc. What you are asking is how long does it take to flush all the caches?
Or, another way to look at it is, what happens if the disk drive goes bad? All the flushing is not going to guarantee a successful read or write operation.
Disk drives do go bad eventually. The solution you are looking for is how can you have a redundant cpu/disk drive system such that the system survives a component failure and still keeps working.
You could improve the likelihood that system will keep working with aid of hardware such as RAID arrays and other high availability configurations.
As far software solution goes, I think the answer is, trust the OS to do the optimal thing. Most of them flush buffers out routinely.
This is an old question but still relevant in 2019. For Windows, the answer appears to be "at least after every one second" based on this:
To ensure that the right amount of flushing occurs, the cache manager spawns a process every second called a lazy writer. The lazy writer process queues one-eighth of the pages that have not been flushed recently to be written to disk. It constantly reevaluates the amount of data being flushed for optimal system performance, and if more data needs to be written it queues more data.
To be clear, the above says the lazy writer is spawned after every second, which is not the same as writing out data every second, but it's the best I can find so far in my own search for an answer to a similar question (in my case, I have an Android apps which lazy-writes data back to disk and I noticed some data loss when using an interval of 3 seconds, so I am going to reduce it to 1 second and see if that helps...it may hurt performance but losing data kills performance a whole lot more if you consider the hours it takes to recover it).
We have a problem which is embarrassingly parallel - we run a large number of instances of a single program with a different data set for each; we do this simply by submitting the application many times to the batch queue with different parameters each time.
However with a large number of jobs, not all of them complete. It does not appear to be a problem in the queue - all of the jobs are started.
The issue appears to be that with a large number of instances of the application running, lots of jobs finish at roughly the same time and thus all try to write out their data to the parallel file-system at pretty much the same time.
The issue then seems to be that either the program is unable to write to the file-system and crashes in some manner, or just sits there waiting to write and the batch queue system kills the job after it's been sat waiting too long. (From what I have gathered on the problem, most of the jobs that fail to complete, if not all, do not leave core files)
What is the best way to schedule disk-writes to avoid this problem? I mention our program is embarrassingly parallel to highlight the fact the each process is not aware of the others - they cannot talk to each other to schedule their writes in some manner.
Although I have the source-code for the program, we'd like to solve the problem without having to modify this if possible as we don't maintain or develop it (plus most of the comments are in Italian).
I have had some thoughts on the matter:
Each job write to the local (scratch) disk of the node at first. We can then run another job which checks every now and then what jobs have completed and moves the files from the local disks to the parallel file-system.
Use an MPI wrapper around the program in master/slave system, where the master manages a queue of jobs and farms these off to each slave; and the slave wrapper runs the applications and catches the exception (could I do this reliably for a file-system timeout in C++, or possibly Java?), and sends a message back to the master to re-run the job
In the meantime I need to pester my supervisors for more information on the error itself - I've never run into it personally, but I haven't had to use the program for a very large number of datasets (yet).
In case it's useful: we run Solaris on our HPC system with the SGE (Sun GridEngine) batch queue system. The file-system is NFS4, and the storage servers also run Solaris. The HPC nodes and storage servers communicate over fibre channel links.
Most parallel file systems, particularly those at supercomputing centres, are targetted for HPC applications, rather than serial-farm type stuff. As a result, they're painstakingly optimized for bandwidth, not for IOPs (I/O operations per sec) - that is, they are aimed at big (1000+ process) jobs writing a handful of mammoth files, rather than zillions of little jobs outputting octillions of tiny little files. It is all to easy for users to run something that runs fine(ish) on their desktop and naively scale up to hundreds of simultaneous jobs to starve the system of IOPs, hanging their jobs and typically others on the same systems.
The main thing you can do here is aggregate, aggregate, aggregate. It would be best if you could tell us where you're running so we can get more information on the system. But some tried-and-true strategies:
If you are outputting many files per job, change your output strategy so that each job writes out one file which contains all the others. If you have local ramdisk, you can do something as simple as writing them to ramdisk, then tar-gzing them out to the real filesystem.
Write in binary, not in ascii. Big data never goes in ascii. Binary formats are ~10x faster to write, somewhat smaller, and you can write big chunks at a time rather than a few numbers in a loop, which leads to:
Big writes are better than little writes. Every IO operation is something the file system has to do. Make few, big, writes rather than looping over tiny writes.
Similarly, don't write in formats which require you to seek around to write in different parts of the file at different times. Seeks are slow and useless.
If you're running many jobs on a node, you can use the same ramdisk trick as above (or local disk) to tar up all the jobs' outputs and send them all out to the parallel file system at once.
The above suggestions will benefit the I/O performance of your code everywhere, not juston parallel file systems. IO is slow everywhere, and the more you can do in memory and the fewer actual IO operations you execute, the faster it will go. Some systems may be more sensitive than others, so you may not notice it so much on your laptop, but it will help.
Similarly, having fewer big files rather than many small files will speed up everything from directory listings to backups on your filesystem; it is good all around.
It is hard to decide if you don't know what exactly causes the crash. If you think it is an error related to the filesystem performance, you can try an distributed filesystem: http://hadoop.apache.org/common/docs/r0.20.0/hdfs_user_guide.html
If you want to implement Master/Slave system, maybe Hadoop can be the answer.
But first of all I would try to find out what causes the crash...
OSes don't alway behave nicely when they run out of resources; sometimes they simply abort the process that asks for the first unit of resource the OS can't provide. Many OSes have file handle resource limits (Windows I think has a several-thousand handle resource, which you can bump up against in circumstances like yours), and failure to find a free handle usually means the OS does bad things to the requesting process.
One simple solution requiring a program change, is to agree that no more than N of your many jobs can be writing at once. You'll need a shared semaphore that all jobs can see; most OSes will provide you with facilities for one, often as a named resource (!). Initialize the semaphore to N before you launch any job.
Have each writing job acquire a resource unit from the semaphore when the job is about to write, and release that resource unit when it is done. The amount of code to accomplish this should be a handful of lines inserted once into your highly parallel application. Then you tune N until you no longer have the problem. N==1 will surely solve it, and you can presumably do lots better than that.
All,
I'm working on a Real-time system, VxWorks I think, I'm saving application settings to a file. What's the best way to handle preserving the settings if the system shuts down or loses power in the middle of a file write? All I can think of is shuffling a few files around or reducing the frequency at which i Save variables in order to reduce incidents.
Hardware to hold up the power a few milliseconds and give the OS a power down warning is quite helpful!
Use an ACID compliant database for critical settings, such as SQLite; it is fully transactional even in the event of a power failure