I am working on an application that runs on a small Linux computer with an SD card for storage. The application runs automatically on startup, and we want to be able to easily check the logs that it produces. Normally I would just write to a file, since that also seems to be what most normal software does. But I am hesitant to do this because I think continuously writing logs is a bad idea when the storage is an SD card that I might wear out.
The problem is that sometimes, when we want to check what is happening on the system, say for debugging purposes, we have to stop the application via SSH and then start it again so that we can see the output messages.
So my question is: is there a way to, say, write logs to some kind of circular buffer that can then be viewed when connecting to the system over SSH? The application is written in C and C++, if that matters.
Is your application on a Raspberry Pi?
The Linux operating system, and everything else running on the device, is probably writing so much to the SD card that your 500 KB/hour would be next to nothing in comparison.
I would personally just have the program log to the file.
If you really do not want this, you have a few other options:
Have the application send the logs via the internet to some service, which you can then monitor
Have your application store the logs in an in-memory buffer and write them to file when you reach some threshold. Expose an endpoint on localhost which listens for a message and, when one is received, writes the in-memory contents to the file. This lets you see the current in-memory logs on demand, without having to wait for the threshold.
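A minimal sketch of that second option, assuming a single-threaded C application; the buffer size, log path, and UDP trigger port are made up for illustration, and error handling is omitted. From an SSH session you could force a dump with, for example, `echo dump | nc -u -w1 127.0.0.1 5140` and then read the file:

```c
/* Hedged sketch: in-memory log buffer with a flush-on-demand trigger.
 * LOG_FILE, LOG_DUMP_PORT and the sizes below are illustrative only. */
#include <stdio.h>
#include <stdarg.h>
#include <fcntl.h>
#include <unistd.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/socket.h>

#define LOG_FILE      "/var/log/myapp.log"
#define LOG_CAPACITY  256            /* buffered lines before a forced flush */
#define LOG_LINE_LEN  128
#define LOG_DUMP_PORT 5140           /* hypothetical localhost UDP port      */

static char   log_buf[LOG_CAPACITY][LOG_LINE_LEN];
static size_t log_count;

/* Append everything buffered so far to the log file, then reset. */
static void log_flush(void)
{
    FILE *fp = fopen(LOG_FILE, "a");
    if (!fp)
        return;
    for (size_t i = 0; i < log_count; i++)
        fprintf(fp, "%s\n", log_buf[i]);
    fclose(fp);
    log_count = 0;
}

/* Buffer one formatted line; hit the SD card only when the threshold is reached. */
static void log_msg(const char *fmt, ...)
{
    va_list ap;
    if (log_count == LOG_CAPACITY)
        log_flush();
    va_start(ap, fmt);
    vsnprintf(log_buf[log_count++], LOG_LINE_LEN, fmt, ap);
    va_end(ap);
}

/* Non-blocking UDP socket bound to localhost; any datagram means "dump now". */
static int log_open_trigger(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(LOG_DUMP_PORT);
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    bind(fd, (struct sockaddr *)&addr, sizeof(addr));
    fcntl(fd, F_SETFL, O_NONBLOCK);
    return fd;
}

/* Call this from the main loop: flushes when a trigger datagram arrives. */
static void log_poll_trigger(int fd)
{
    char tmp[16];
    if (recv(fd, tmp, sizeof(tmp), 0) > 0)
        log_flush();
}

int main(void)
{
    int trig = log_open_trigger();
    for (;;) {
        log_msg("heartbeat");
        log_poll_trigger(trig);
        sleep(5);
    }
}
```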
First of all, I think the SD driver already takes care of writes and schedules I/O operations in the way that is best for the health of the SD card itself (through the virtual filesystem layer). Beyond that, you can work on your log level to make sure you write only the necessary information and nothing more.
Based on the inputs shared so far, SD card wear-out might not happen easily. However, there are several ways to handle this scenario, depending on the hardware and software architecture of your system:
Check if you can write the logs to some other storage device within your system (Depends on your architecture).
If external communication peripherals are available in your system, check if logging can be done by redirecting the logs to remote servers or other devices.
Perform selective logging based on a log level, as your architecture/framework allows. You can also keep only critical logging on the SD card and redirect the other logs elsewhere, depending on your architecture. This can reduce the number of writes (see the sketch below).
Based on your need/architecture, check if the data can be compressed before being logged. This can reduce the number of times the logs are written to the SD card.
To keep the application running and view the logs at the same time:
Based on your architecture/need, check if you can write to a file periodically or based on a threshold, so that you can view the file regardless of what the application is doing.
Send selected logs to an external server or device.
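As an illustration of the selective-logging point above, here is a hedged sketch of a simple log-level filter in C. The level names, threshold, and log path are assumptions rather than part of any particular framework; only messages at or above the threshold ever touch the SD card:

```c
#include <stdio.h>

/* Hypothetical severity levels; adjust to your framework. */
enum { LOG_DEBUG, LOG_INFO, LOG_WARN, LOG_ERROR };

#define LOG_THRESHOLD LOG_WARN              /* persist only warnings and errors */
#define LOG_FILE      "/var/log/myapp.log"  /* illustrative path */

#define LOG(level, ...)                                                  \
    do {                                                                 \
        if ((level) >= LOG_THRESHOLD) {                                  \
            FILE *fp_ = fopen(LOG_FILE, "a");                            \
            if (fp_) { fprintf(fp_, __VA_ARGS__); fclose(fp_); }         \
        } else {                                                         \
            fprintf(stderr, __VA_ARGS__);   /* volatile output only */   \
        }                                                                \
    } while (0)

int main(void)
{
    LOG(LOG_DEBUG, "this never reaches the SD card\n");
    LOG(LOG_ERROR, "this one is persisted, code %d\n", 42);
    return 0;
}
```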
Related
I have a smart card-like miniSD card (it's a javacard as far as I know) and I'm trying to write an emulator for it that runs on Windows and Linux. The emulator will be used in software integration tests. I want to test my client without using the actual hardware for several reasons. One reason is that the actual hardware will change its state irreversibly and doesn't allow a complete reset.
The device implements a mass storage with FAT32 file system. It contains a special device file that is being used for controlling the device via simple file write/read operations.
My goal is that the virtual (emulated) device appears with a drive letter in Windows Explorer as soon as the emulator is started, just as if someone had actually plugged in a real device.
I wonder if there is an existing open-source software project that I can base my program on. The biggest challenges are obviously:
Providing/developing a "virtual" (USB/SD) mass storage device
Intercepting file I/O operations on the special device file.
According to Wikipedia, device files are a common way to simplify driver development. So I wondered if there are existing emulation solutions for driver developers. At least I couldn't find any.
Simulating the device file itself would be an important first step. My first idea was to use a normal file and to communicate with the client by actually reading/writing to this file while observing it, i.e. clearing the file as soon as the client writes to it and writing the response into it. I don't know if this could work at all. One problem is that the client doesn't open the file in shared mode, so my simulator cannot access it at the same time.
Then I found out that QEMU can emulate mass storage; however, it seems to only support image files, and that probably doesn't allow for the device file.
Microsoft has some documentation about how to write USB device emulators and drivers, but it seems to be very complex, and I wondered if there is an existing solution that could be extended.
Finally, there is the USB/IP project, but I don't know if it is helpful, as I would still need to develop a driver, and then I'm back at the complex MS documentation above.
I'm currently working with an application that has branching logic depending on whether a specific USB drive is inserted into the system. It does this by polling all drive letters looking for a path on the root of each drive.
This works for the majority of machines, but often this application runs on startup with the USB drive already inserted. In addition, some machines are especially slow and take a good minute to load the USB drive once Windows boots. On these machines, when the code reaches the check for whether the drive exists, it can't find the drive, and the wrong branch is executed.
It could be possible to wait a minute before checking for drives. I'd prefer to have the application wait for all USB devices (or simply only mass storage devices) to load before checking for the drives, or something even more intelligent.
Unfortunately, I am unfamiliar with the methods needed to wait for all USB devices to finish loading, and with the DDK in general. I can see that it's possible to register a window for device notifications with GUID_DEVINTERFACE_USB_DEVICE, possibly receiving messages like DBT_DEVICEARRIVAL, DBT_DEVNODES_CHANGED, and WM_DEVICECHANGE, the distinctions of which I do not know.
However, the USB drive will likely already be inserted and detected (but not yet have a drive letter) before the program executes. So registering for all device change notifications would not make sense. If it's possible to identify devices which are inserted but not loaded (possibly with SetupDiEnumDeviceInterfaces) and then register for "loaded" notifications on all of those devices, it might work. I don't have familiarity with any of this, so pointers (or sample code) would be extremely helpful.
I don't think you can.
The problem is that you're trying to distinguish two cases which are not actually distinct: plugging in a USB device during boot and plugging it in later.
You have to know that USB is a protocol which requires a non-trivial amount of intelligence in the USB slave devices. They exchange multiple messages with the USB host (i.e. your OS). This exchange isn't instant. For instance, your USB hard drive will need to ask for permission to draw more than 100 mA of power. To answer that, the power drivers of Windows must be up and running. The physical disk can only spin up when the answer arrives.
So, there's a whole message sequence going on, and the drive letter shows up only fairly late. Windows must know how many partitions exist. So during this exchange, new devices are being created all the time.
When you enumerate devices while devices are actively being added, you're really asking for trouble. The SetupDiEnumDeviceInterfaces API doesn't operate on a snapshot (which we know because there's no Close method); you ask for the N'th device until you get a "no more devices" error, and then you know N was too big. But when devices are still actively being added, N changes. And I don't see a guarantee that the list order is by age; devices might be added in the middle as well.
I don't think that getting notifications about drivers being installed for newly plugged-in devices would help you much. When you plug in the same device repeatedly, the drivers are usually installed only the first time.
Also, a USB flash drive, although it physically looks like one compact device, is represented by Windows as at least three PnP devices: a USB mass storage device (representing the USB endpoint), a disk device (representing the physical disk inside the flash drive), and one or more Volume devices (each representing one volume, i.e. a partition in your case). Drive letters are assigned to the Volume devices.
What you can possibly do is monitor the arrival and removal of Volume devices (RegisterDeviceNotification for GUID_DEVINTERFACE_VOLUME) and examine each Volume device that arrives (I believe the Setup API allows you to walk its "parents" up to the USB stack).
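A hedged Win32 C sketch of that idea: register a window for Volume device-interface notifications and react to DBT_DEVICEARRIVAL by rescanning drive letters. Error handling is omitted, and the use of a message-only window is an assumption; a hidden top-level window works as well if that gives you trouble:

```c
#include <windows.h>
#include <dbt.h>

/* GUID_DEVINTERFACE_VOLUME, normally available via <ioevent.h>. */
static const GUID GuidVolume =
    { 0x53F5630D, 0xB6BF, 0x11D0, { 0x94, 0xF2, 0x00, 0xA0, 0xC9, 0x1E, 0xFB, 0x8B } };

static LRESULT CALLBACK WndProc(HWND hwnd, UINT msg, WPARAM wp, LPARAM lp)
{
    if (msg == WM_DEVICECHANGE && wp == DBT_DEVICEARRIVAL) {
        const DEV_BROADCAST_HDR *hdr = (const DEV_BROADCAST_HDR *)lp;
        if (hdr && hdr->dbch_devicetype == DBT_DEVTYP_DEVICEINTERFACE) {
            /* A Volume device has appeared: re-run the drive-letter scan here. */
            OutputDebugStringA("Volume arrived - rescan drives\n");
        }
    }
    return DefWindowProc(hwnd, msg, wp, lp);
}

int main(void)
{
    WNDCLASSA wc = {0};
    wc.lpfnWndProc   = WndProc;
    wc.hInstance     = GetModuleHandle(NULL);
    wc.lpszClassName = "VolumeWatch";
    RegisterClassA(&wc);

    /* Message-only window: no UI, just a message queue for notifications. */
    HWND hwnd = CreateWindowA("VolumeWatch", "", 0, 0, 0, 0, 0,
                              HWND_MESSAGE, NULL, wc.hInstance, NULL);

    DEV_BROADCAST_DEVICEINTERFACE_A filter = {0};
    filter.dbcc_size       = sizeof(filter);
    filter.dbcc_devicetype = DBT_DEVTYP_DEVICEINTERFACE;
    filter.dbcc_classguid  = GuidVolume;
    RegisterDeviceNotificationA(hwnd, &filter, DEVICE_NOTIFY_WINDOW_HANDLE);

    MSG m;
    while (GetMessage(&m, NULL, 0, 0) > 0) {
        TranslateMessage(&m);
        DispatchMessage(&m);
    }
    return 0;
}
```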
I am developing a system which copies and writes files on NTFS inside a virtual machine. At any time the VM can power off (a hard shutdown). The power-off is controlled from the outside, so I have no way to detect it. Because of that, files and even complete directories that are being written to get lost. Is there any way to prevent that, or do I have to develop my own file system? I have to store the files on the local disk and cannot send them over the network.
There always exists a [short] period between when your data is written (sent to the API) and when this data is written to the physical hardware. If the system crashes in the middle, the data will be lost.
There is a setting in Windows to disable system write cache for certain disks. This setting can help you ensure that the data is at least sent to the host's hardware. Probably that's the answer you've been looking for.
Writing your own filesystem won't help much because it's mainly the write cache that causes the data to be lost. There can exist a filesystem-level cache as well, though, and I don't know if the write cache setting I mentioned above also affects internal filesystem cache.
If you write data to a file opened with "write through" enabled, the write call only returns after the data has been physically written to the disk, so you can be sure it got written. You normally do that by passing a write-through flag when you open the file.
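In Win32 terms this is the FILE_FLAG_WRITE_THROUGH flag to CreateFile; a minimal sketch, with an illustrative path and only token error handling:

```c
#include <windows.h>
#include <stdio.h>

int main(void)
{
    /* FILE_FLAG_WRITE_THROUGH asks the OS to push the data through the
     * system cache to the device before WriteFile returns. Adding
     * FILE_FLAG_NO_BUFFERING as well bypasses the cache entirely, but
     * then offsets and sizes must be sector-aligned. */
    HANDLE h = CreateFileA("C:\\data\\journal.bin",
                           GENERIC_WRITE, 0, NULL, OPEN_ALWAYS,
                           FILE_ATTRIBUTE_NORMAL | FILE_FLAG_WRITE_THROUGH,
                           NULL);
    if (h == INVALID_HANDLE_VALUE) {
        printf("CreateFile failed: %lu\n", GetLastError());
        return 1;
    }

    const char msg[] = "checkpoint\n";
    DWORD written = 0;
    WriteFile(h, msg, sizeof(msg) - 1, &written, NULL);

    /* For handles opened without the flag, an explicit FlushFileBuffers()
     * call is another way to force cached data out to the device. */
    CloseHandle(h);
    return 0;
}
```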
I am working on an embedded application without any OS that needs to use a file system. I've been over this many times with the people on the project, and some agree with me that the system must perform a proper shutdown whenever there is a power failure, or else the file system might go crazy.
Some people say that it doesn't matter if you simply power off the system and let nature run its course, but I think that's one of the worst things to do, especially if you know this will bring you a problem and probably shorten your product's life span.
In the last paragraph I just assumed that it is a problem, but my question remains:
Does a power down have any effect on the file system?
Here is a list of various techniques to help an embedded system tolerate a power failure. These may not be practical for your particular application.
1. Use a Journaling File System - it can tolerate incomplete writes due to power failure, OS crash, etc. Most modern filesystems are journaled, but do your homework to confirm.
2. Unless your application needs the write performance, disable all write caching. Check your disk drivers for caching options. Under Linux/Unix, consider mounting the filesystem in sync mode.
3. Unless it must be writable, make it read-only. Try to keep your application executables and operating system files on their own partition(s), with write protection in place (e.g. mounted read-only in Linux). Your read/write data should be on its own partition. Even if your application data gets corrupted, your system should still be able to boot (albeit with a fail-safe default configuration).
3a. For data that is only written once (e.g. configuration settings), try to keep it mounted read-only most of the time. If there is a settings change, remount it as R/W temporarily, update the data, and then remount it as read-only (see the sketch after this list).
3b. Use a technique similar to 3a to handle application/OS updates in the field.
3c. If it is impractical for you to mount the FS as read-only, at least consider opening individual files as read-only (e.g. fp=fopen("configuration.ini", "r")).
4. If possible, use separate devices for your storage. Keeping things in separate partitions provides some protection, but there are still edge cases where a partition table may become corrupt and render the entire drive unreadable. Using physically separate devices further isolates against one corrupt device bringing down the whole system. In a perfect world, you would have at least 4 separate devices:
4a. Boot Loader
4b. Operating System & Application Code
4c. Configuration Settings
4d. Application Data
5. Know the characteristics of your storage devices, and control the brand/model/revision of devices used. Some hard disks ignore cache flush commands from the OS. We had cases where some models of CompactFlash cards would corrupt themselves during a power failure, but the "industrial" models did not have this problem. Of course, this information was not published in any datasheet, and had to be gathered by experimental testing. We developed a list of approved CF cards, and kept inventory of those cards. We periodically had to update this list as older cards became obsolete, or the manufacturer would make a revision.
6. Put your temporary files in a RAM disk. If you keep those writes off-disk, you eliminate them as a potential source of corruption. You also reduce flash wear and tear.
7. Develop automated corruption detection and recovery methods - all of the above techniques will not help you if the application simply hangs because of a missing config file. You need to be able to recover as gracefully as possible:
7a. Your system should maintain at least two copies of its configuration settings, a "primary" and a "backup". If the primary fails for some reason, switch to the backup. You should also consider mechanisms for making backups whenever the configuration is changed, or after a configuration has been declared "good" by the user (testing vs production mode).
7b. Did your Application Data partition fail to mount? Automatically run chkdsk/fsck.
7c. Did chkdsk/fsck fail to fix the problem? Automatically re-format the partition and get it back to a known state.
7d. Do you have a Boot Loader or other method to restore the OS and application after a failure?
7e. Make sure your system will beep, flash an LED, or something to indicate to the user what happened.
8. Power failures should be part of your system qualification testing. The only way you will be sure you have a robust system is to test it. Yank the power cord from the system and document what happens. Try yanking the power at multiple points in the system operation (during runtime, while booting, mid configuration, etc.). Repeat each test multiple times.
9. If you cannot mitigate all power-failure problems, incorporate a battery or supercapacitor into the system - keep in mind that you will need a background process in your OS to initiate a graceful shutdown when power gets low. Also, batteries will require periodic testing and replacement as they age.
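To illustrate technique 3a, here is a hedged Linux sketch that remounts a hypothetical read-only settings partition read-write, updates one file, syncs, and goes back to read-only. The mount point and file name are made up, and real code should also preserve whatever other mount flags it needs:

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/mount.h>

#define CFG_MOUNTPOINT "/config"     /* hypothetical settings partition */

static int save_setting(const char *path, const char *value)
{
    /* 1. Temporarily allow writes on the settings partition. */
    if (mount(NULL, CFG_MOUNTPOINT, NULL, MS_REMOUNT, NULL) != 0) {
        perror("remount rw");
        return -1;
    }

    /* 2. Write the new value and push it to the device. */
    FILE *fp = fopen(path, "w");
    if (fp) {
        fprintf(fp, "%s\n", value);
        fflush(fp);
        fsync(fileno(fp));
        fclose(fp);
    }

    /* 3. Flush everything and go back to read-only. */
    sync();
    if (mount(NULL, CFG_MOUNTPOINT, NULL, MS_REMOUNT | MS_RDONLY, NULL) != 0) {
        perror("remount ro");
        return -1;
    }
    return 0;
}

int main(void)
{
    return save_setting(CFG_MOUNTPOINT "/network.conf", "dhcp=1");
}
```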
In addition to msemack's response (unfortunately my reputation is too low to post this as a comment on his answer rather than as a separate answer):
Does a power down have any effect on the file system?
Yes, if proper measures aren't put in place to prevent corruption. See the previous answers for file-system options that help mitigate this. However, if ATA flush/sleep aren't properly implemented on your device, you may run into the scenario we did: the device was corrupted beyond the file system, and fdisk/format would not recover it.
Instead, an ATA security-erase was required to recover the device once that corruption occurred. To avoid it, we implemented an ATA sleep command prior to power loss. This required a hold-up time of 400 ms to cover the 160 ms the ATA sleep took and leave some headroom for degradation of the capacitors over the life of the product.
Notes from our scenario:
fdisk/format failed to repair/recover the drive.
Our power-safe file system's check disk utility returned that the device had bad blocks, but there really weren't any.
flush/sync returned success quickly, and most likely weren't actually implemented.
Once corrupted, dd could not read the device beyond the first partition boundary and returned I/O errors afterwards.
hdparm was used to issue the ATA security-erase, as the only method of recovery for some corruption scenarios.
For a non-journalling filesystem, an unexpected power-off can mean corruption of certain data, including the directory structure. This happens if there is unsaved data in the cache, or if the FS is in the middle of a multi-block update and the interruption comes when only some of the blocks have been written.
Journalling mostly addresses this problem - if there is an interruption in the middle, the recovery routine or the check-and-repair operation done by the FS (usually implicitly) brings the filesystem back to a consistent state. However, this state is not always the latest - i.e. if there was data in the memory cache, it can be lost even with journalling. Journalling saves you from corruption of the filesystem, but it doesn't do magic.
Write-through mode (no write caching) reduces the possibility of data loss but doesn't solve the problem completely, as the journal itself acts as a cache (for a very short time).
So unfortunately backup or data duplication are the main ways to prevent data loss.
It depends entirely on the file system you are using and on whether, given your project requirements, it is acceptable to lose some data at power-off.
One could imagine using a file system that is secured against unattended power-off and is able to recover from a partial write sequence. In that case, on the application side, if you don't have critical data that absolutely needs to be written before shutting down, there is no need for a specific power-off detection procedure.
Now if you want a more specific answer for your project you will have to give more information on the file system you are using and your project requirements.
Edit: As you have critical application data to save before power-off, I think you have answered the question yourself. The only way to handle an unattended power-off safely is to have brown-out detection that alerts your embedded device, coupled with some hardware circuitry that keeps delivering enough power to the device to perform the shutdown procedure.
The FAT file system is particularly prone to corruption if a write is in progress or a file is open on shutdown - specifically, if there is a buffered operation that has not been flushed. On one project I worked on, the solution was to run a file system integrity check and repair (essentially chkdsk/scandisk) on start-up. This strategy did not prevent data loss, but it did prevent the file system from becoming unusable.
A number of vendors provide journalling add-on components for FAT to counter exactly this problem. These include Segger, Quadros and Micrium for example.
Either way, your system should generally adopt an open-write-close approach to file access, or open-write-flush if you feel the need to keep the file open.
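A rough C illustration of the open-write-flush-close pattern; fsync() is the POSIX call that pushes the data to the device, and your particular filesystem library may have its own equivalent (the path below is illustrative):

```c
#include <stdio.h>
#include <unistd.h>

/* Write one record and get it onto the media before returning.
 * Keeping the open/write/close window short minimises the chance that
 * a power failure catches the file in a half-written state. */
static int write_record(const char *path, const char *line)
{
    FILE *fp = fopen(path, "a");
    if (!fp)
        return -1;

    fputs(line, fp);

    fflush(fp);                 /* C library buffer -> kernel          */
    fsync(fileno(fp));          /* kernel/page cache -> storage device */

    return fclose(fp);
}

int main(void)
{
    return write_record("/data/events.log", "boot complete\n");
}
```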
I've been assigned to upgrade an embedded application written in C. The application is configured via a web interface.
When the user modifies settings through the web application, a file is written to /var/www/settings.json and the file /var/www/UPDATE_SETTINGS is touched.
The main application loop checks whether UPDATE_SETTINGS exists. If it does, it parses settings.json with json-c and then deletes UPDATE_SETTINGS.
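The current mechanism boils down to something like the following hedged sketch (the "hostname" key and the one-second poll are made up for illustration; error handling is minimal):

```c
#include <stdio.h>
#include <unistd.h>
#include <json-c/json.h>

#define SETTINGS_PATH "/var/www/settings.json"
#define FLAG_PATH     "/var/www/UPDATE_SETTINGS"

/* One pass of the polling check described above. */
static void poll_settings(void)
{
    if (access(FLAG_PATH, F_OK) != 0)
        return;                              /* no pending update */

    struct json_object *root = json_object_from_file(SETTINGS_PATH);
    if (root) {
        struct json_object *val;
        if (json_object_object_get_ex(root, "hostname", &val))
            printf("new hostname: %s\n", json_object_get_string(val));
        json_object_put(root);               /* release the parse tree */
    }
    unlink(FLAG_PATH);                       /* acknowledge the update */
}

int main(void)
{
    for (;;) {                               /* the existing main loop */
        poll_settings();
        /* ... rest of the application ... */
        sleep(1);
    }
}
```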
This works well enough; however, we would prefer to move to an event-driven architecture (perhaps libev) in which settings.json is fed directly to the program by the webapp script over a plain old UDP port, and a callback is then issued to perform the update.
What are some other elegant ways to solve this problem? Should we just stick with the current approach?
Just use inotify. It was created for cases like yours.
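For example, a minimal blocking inotify loop in C; in an event-driven design you would instead hand the inotify file descriptor to your libev/select loop. The watched directory and file names come from the question, the rest is illustrative:

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/inotify.h>

#define WATCH_DIR "/var/www"

int main(void)
{
    int fd = inotify_init();
    if (fd < 0) { perror("inotify_init"); return 1; }

    /* Watch for files created, moved into place, or closed after a write;
     * the web app touching UPDATE_SETTINGS should trigger one of these. */
    int wd = inotify_add_watch(fd, WATCH_DIR,
                               IN_CREATE | IN_MOVED_TO | IN_CLOSE_WRITE);
    if (wd < 0) { perror("inotify_add_watch"); return 1; }

    char buf[4096] __attribute__((aligned(__alignof__(struct inotify_event))));
    for (;;) {
        ssize_t len = read(fd, buf, sizeof(buf));   /* blocks until an event */
        if (len <= 0)
            break;
        for (char *p = buf; p < buf + len; ) {
            struct inotify_event *ev = (struct inotify_event *)p;
            if (ev->len && strcmp(ev->name, "UPDATE_SETTINGS") == 0) {
                /* reload /var/www/settings.json here */
                printf("settings changed, reloading\n");
            }
            p += sizeof(struct inotify_event) + ev->len;
        }
    }
    return 0;
}
```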
I am making some assumptions here.
1) Your embedded device is connected to the internet all the time.
2) Your device can set up interrupts on things like "USART RX buffer not empty".
Note: depending on what kind of hardware you are using, you could set up interrupts on things like pings and other events; this could be another way of interrupting the embedded device.
If those two assumptions are correct, you could do this: have another "script" on a server or computer somewhere that watches /var/www/settings.json for changes (you could use something like rsync to watch for changes). When this "script" notices that the JSON file has changed, it communicates with the embedded device over TCP/IP; you can either ping the device or just send the file over. If you can set a USART interrupt on the embedded device, the device will be able to detect the incoming data and respond by either reading the data you are sending or going to the website to download the JSON file to be parsed.
This way you will have an event-driven embedded device, and it will not waste time checking to see if the JSON file has changed.
I hope this helps