mkstemp and hard disk stress - c

Are temporary files created with mkstemp synced to disk?
Here is what I have:
A program creates a temporary file using mkstemp and sends the fd to another program.
This temporary file is mmapped by both programs and used heavily (up to 400 MB/s of writes and 400 MB/s of reads; up to 60 read and write operations per second).
I can't use memfd_create (it may not be supported on target devices).
Let's also assume (and this is almost true) that I can't create this file on tmpfs (e.g., in /tmp).
What I need is a guarantee that such a file will not stress the hard disk. I can't allow it to be written to disk, even if that only happens once every 5 seconds. If I can't get such a guarantee, I will look for another way.
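For concreteness, here is a minimal sketch of the fd-passing step described above (socket setup omitted; sock is assumed to be an already-connected AF_UNIX SOCK_STREAM socket):

    /* Hedged sketch: pass an open fd to another process via SCM_RIGHTS. */
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/uio.h>

    static int send_fd(int sock, int fd)
    {
        char dummy = 'F';                      /* must send at least one byte */
        struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
        char ctrl[CMSG_SPACE(sizeof fd)];
        struct msghdr msg = {
            .msg_iov = &iov, .msg_iovlen = 1,
            .msg_control = ctrl, .msg_controllen = sizeof ctrl,
        };
        struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
        cmsg->cmsg_level = SOL_SOCKET;
        cmsg->cmsg_type = SCM_RIGHTS;          /* kernel duplicates the fd */
        cmsg->cmsg_len = CMSG_LEN(sizeof fd);
        memcpy(CMSG_DATA(cmsg), &fd, sizeof fd);
        return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
    }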
Additional info (not important):
I am writing a Wayland compositor for Android devices. Currently temporary files (Wayland surfaces, actually) are created on tmpfs, and everything works fine as long as SELinux is not enabled. But if I enable SELinux, it prevents fds from being transferred from client to compositor. The only solution I currently know of is to create temporary files in the app's home directory. But if that approach is dangerous, I will find another.

Are temporary files created with mkstemp synced to disk?
The mkstemp function does not impart any special properties to files it opens that would prevent them from being synced to disk. The filesystem on which they are created might have such a property, but that's independent of file creation. In particular, files created via mkstemp() will persist indefinitely if not removed.
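To illustrate, the usual idiom for keeping a mkstemp file from persisting is to unlink it immediately after creation; the inode then survives only while file descriptors or mappings reference it. The directory in this sketch is a hypothetical placeholder:

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int make_temp_fd(void)
    {
        char path[] = "/some/dir/surface-XXXXXX";  /* hypothetical location */
        int fd = mkstemp(path);
        if (fd < 0) {
            perror("mkstemp");
            return -1;
        }
        unlink(path);  /* file now lives only as long as fds/mmaps reference it */
        return fd;
    }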
What I need is a guarantee that such a file will not stress the hard disk. I can't allow it to be written to disk, even if that only happens once every 5 seconds. If I can't get such a guarantee, I will look for another way.
As far as I am aware, even tmpfs filesystems do not guarantee that their contents will remain locked in memory rather than being paged out to swap; they are backed by virtual memory. But if the file is comparatively small and all of its pages are hot, they are likely to stay in memory.
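If you need a hard guarantee of residency, one option (my addition here, not something tmpfs itself promises) is to pin the mapping with mlock(2). Locked pages count against RLIMIT_MEMLOCK and may require privilege; a minimal sketch:

    #include <stddef.h>
    #include <sys/mman.h>

    void *map_pinned(int fd, size_t len)
    {
        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED)
            return NULL;
        if (mlock(p, len) != 0) {   /* pages stay resident until munlock/munmap */
            munmap(p, len);
            return NULL;
        }
        return p;
    }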
With regard to the larger problem,
everything works fine as long as SELinux is not enabled. But if I enable SELinux, it prevents fds from being transferred from client to compositor. The only solution I currently know of is to create temporary files in the app's home directory.
By default, newly created files inherit the SELinux type of their parent directory. Your Wayland clients presumably do not have sufficient privilege to modify the SELinux labels of the files they create, but you should be able to administratively create a directory wherever you like with a label conducive to your needs. For example, you could have a subdirectory of /dev/shm created for this purpose at every boot and relabeled with chcon to carry an appropriate type. If the clients create their temp files there, then they should inherit the SELinux type you chose.
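A rough sketch of that setup using libselinux, where both the path and the context string are placeholders you would replace with values from your device's actual policy:

    #include <errno.h>
    #include <sys/stat.h>
    #include <selinux/selinux.h>   /* link with -lselinux */

    int make_labeled_dir(void)
    {
        const char *dir = "/dev/shm/wayland-shm";        /* hypothetical path */
        const char *con = "u:object_r:my_shm_file:s0";   /* hypothetical type */
        if (mkdir(dir, 01777) != 0 && errno != EEXIST)
            return -1;
        return setfilecon(dir, con);   /* the programmatic equivalent of chcon(1) */
    }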

Related

Restricting folder/file access to one program?

What I need, boiled down, is a way to 'selectively' encrypt either a folder or a zip file. Whatever the solution is, it needs to block (or redirect) all reads/writes EXCEPT those from one specific program (not mine; a legacy application I don't have source code access to, so I cannot modify the program that would have sole permission to read and write the encrypted folder/zip file). I would like to avoid a constantly running background app, since all the end user would have to do to circumvent the protection would be to kill that program.
The purpose is, of course, to protect the files within the folder from tampering.
I could modify folder permissions at install time, but that would block all programs from access, wouldn't it? I more or less need to block only File Explorer from accessing the files, but not the program that needs to read them... if that makes sense. Or, if I could protect the (plaintext) files somehow without affecting the legacy application's reading of them... argh.
I wonder if it would be possible with CreateProcess() to run the legacy application as a high-privilege user and give the folders it needs access to the same permissions, such as TrustedInstaller or SYSTEM (accounts which, in Windows, own things that not even administrators can touch, like System Volume Information).
This would allow the program to read/write to the folders, but not the user.
I was looking at LockFile, which seems close to what I am looking for, but not quite. I need something like semi-exclusive access.
I am fairly fluent in C++ and VB.NET, with only some Python, but I am willing to use any language that allows a solution to this problem (though it could probably be implemented in any language, if it is possible at all).

A way to make a file contents snapshot in Linux

What is the best way to create an "atomic" snapshot of file contents in Linux? Emphasis is not on performance, but on getting contents as a whole.
I might think of using sendfile(2) (since 2.6.33) or splice(2), but neither gives any indication of operation atomicity. Both run entirely in kernel space, but at least sendfile(2) implies it uses mmap(2), and mmap gives no guarantee that writes to the same region mmapped MAP_SHARED in other processes won't be visible, even with MAP_PRIVATE (they probably will be, because those are the same pages).
Given that these functions are written with performance in mind and sendfile(2) is optimized for DMA, I can only assume they just copy memory in some background kernel thread, and it's quite possible that other operations may also affect the data being copied.
So the only possible solution I see is to place a read lease with fcntl(2) (F_SETLEASE) and copy the file as normal, but if someone opens it for writing, either try to "rush" it (very reliable, I know) and beat the timer, or just give up and try again later. Is that correct?
So the only possible [filesystem-independent] solution I see is to place a read lease with fcntl(2) (F_SETLEASE) and copy the file as normal, but if someone opens it for writing, either try to "rush" it (very reliable, I know) and beat the timer, or just give up and try again later. Is that correct?
Almost; there is also fanotify. Plus, as mentioned in a comment, there are some filesystem-specific options, and some possibilities only available in certain configurations.
The lease break timer is configurable via /proc/sys/fs/lease_break_time (in seconds); the default is 45 seconds.
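A minimal sketch of the lease approach, assuming the process owns the file (or has CAP_LEASE); handling the SIGIO notification a lease break raises is elided:

    #define _GNU_SOURCE            /* F_SETLEASE is Linux-specific */
    #include <fcntl.h>
    #include <unistd.h>

    int snapshot_with_lease(const char *path)
    {
        int fd = open(path, O_RDONLY);
        if (fd < 0)
            return -1;
        if (fcntl(fd, F_SETLEASE, F_RDLCK) != 0) {  /* fails if a writer has it open */
            close(fd);
            return -1;
        }
        /* ... copy the contents here; an open() for writing now blocks
           until we release the lease or lease_break_time expires ... */
        fcntl(fd, F_SETLEASE, F_UNLCK);             /* release the lease */
        close(fd);
        return 0;
    }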
"Just give up and try later" is also a bit defeatist; you do have ways to monitor when the snapshot might work. Consider placing an inotify IN_CLOSE_WRITE and IN_CLOSE_NOWRITE watch on the file, and try the snapshot whenever you receive such an event.
fanotify:
For a few years now, I've been monitoring the progress of Linux fanotify, in the hope that it would grow enough features to be usable for automagic file versioning. Essentially, whenever someone opens the file with write permissions, the current file would be snapshotted to temporary storage and marked with some metadata (timestamp, the real human user backtracked through sudo/su, and so on). When that descriptor is closed, another snapshot is taken, and a helper thread/process diffs the two, annotating the changes (or even pushing them to git).
It is limited to local filesystems, but with 2.6.37 and later kernels (including 3.x), the interface is sufficient for specific files or an entire mount. In your case, the fanotify interface offers features similar to file leases, though only on local filesystems, and you can simply deny any accesses during the snapshot. (One can argue whether that is a good idea at all, especially if the file to be snapshotted is a system or configuration file; many programmers overlook error checking because "some files just have to be always accessible, or your system is broken".)
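A rough sketch of the deny-during-snapshot idea; fanotify permission events require CAP_SYS_ADMIN, and a real implementation would run the loop in its own thread and switch to FAN_ALLOW once the snapshot completes:

    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/fanotify.h>

    /* Deny every open of `path` while the caller is snapshotting. */
    int deny_opens(const char *path)
    {
        struct fanotify_event_metadata meta;
        struct fanotify_response resp;
        int fan = fanotify_init(FAN_CLASS_CONTENT, O_RDONLY);
        if (fan < 0)
            return -1;   /* typically needs CAP_SYS_ADMIN */
        if (fanotify_mark(fan, FAN_MARK_ADD, FAN_OPEN_PERM, AT_FDCWD, path) < 0) {
            close(fan);
            return -1;
        }
        while (read(fan, &meta, sizeof meta) == (ssize_t)sizeof meta) {
            resp.fd = meta.fd;
            resp.response = FAN_DENY;   /* the blocked open() fails with EPERM */
            write(fan, &resp, sizeof resp);
            close(meta.fd);             /* fd the kernel handed us for the file */
        }
        close(fan);
        return 0;
    }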
As far as my change monitoring goes, fanotify should now have all the features needed, but only if an entire mount is monitored. I was hoping to monitor configuration files on multi-admin clusters, but those files reside on the same mount as all the system libraries and binaries, so the monitoring causes considerable overhead. So much so that it seems more appropriate to just modify the SSH configuration, console configuration (getty etc.), sudo configuration, and possibly su, to always include a dynamic library that interposes file access syscalls and does the versioning on behalf of the user. This way service binaries are not affected; only user actions are monitored.
This might work under some circumstances (a sketch of the freeze/thaw steps follows the list):
1. (Optional) Do something to prevent new processes from opening the file:
   a. rename the file, or
   b. restrict its permissions.
2. Find all existing readers/writers via lsof and kill -STOP them.
3. Take your snapshot.
4. kill -CONT all readers/writers.
5. (Optional) Undo step 1.
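A sketch of steps 2-4 in C, assuming the pids were collected beforehand (e.g., by parsing lsof output):

    #include <signal.h>
    #include <sys/types.h>

    /* pids: readers/writers found via lsof; snapshot: the actual copy. */
    void snapshot_frozen(const pid_t *pids, int n, void (*snapshot)(void))
    {
        for (int i = 0; i < n; i++)
            kill(pids[i], SIGSTOP);   /* step 2: freeze */
        snapshot();                   /* step 3: copy while nothing moves */
        for (int i = 0; i < n; i++)
            kill(pids[i], SIGCONT);   /* step 4: thaw */
    }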

Adding Capability to NFS Server - Compressing/Decompressing Stored/Retrieved Files

I need to build a custom SUSE Linux NFS server that compresses certain files as they are stored to disk and decompresses them as they are read back. This needs to be transparent to the remote users of the file system, meaning that if a user saves a 10 MB file named XYZZY.tif on /archiveDirectoryOnNFSServer, then when they do an ls -l on that mounted directory they will see a 10 MB file called XYZZY.tif, even though the file actually stored on the NFS server's disk will be XYZZY.tif.compressed and only 2 MB in size.
I'm expecting that I need to build this as a driver that sits below the NFS server software stack, but I'm having difficulty finding where to start. Are there existing NFS servers that provide this level of customization through APIs? Will I need to modify the source of an open-source NFS server, and if so, is there one that would be easiest to start with, and is it modular enough that this would be straightforward? I'm having difficulty locating relevant content on the internet, and any pointers would be greatly appreciated.
IMO that kind of functionality is absolutely not the NFS server's responsibility (an NFS server should, well, serve files over NFS) but the underlying filesystem's. However, there's not that much choice in Linux-land; you could start by checking out fusecompress and btrfs.
This post is a bit old, so you may already be aware of some options here, but there are a couple of others (both server-side).
http://zfsonlinux.org/
The ZFS filesystem has built-in compression. I typically use lzjb, as it is the fastest compression algorithm and does a reasonable job (MySQL databases get 2-4x compression; filesystems with uncompressed data get around 4x). You have a choice of algorithm depending on how much CPU time you wish to spend on compression.
If you want different file types compressed differently, you might consider layering Gluster on top of a set of ZFS filesystems.
Gluster will allow you to store certain file types (by extension) on different underlying filesystems.
In this case, you specify the underlying filesystem as a ZFS volume with the particular options you need (for example, .zip and .png files go on an uncompressed filesystem, while things you write once and read many times, like static HTML files, might go on higher compression; you pay once when a file is written, but reads should be really fast since they scan fewer disk blocks and decompression is very fast).
ZFS will manage the NFS mounts if you use it as your NFS server; you won't want this if you layer Gluster on top.
It's easy to specify other attributes dynamically per filesystem (atime/noatime, the number of copies if you want redundancy beyond your normal RAID; you can also add SSDs as cache devices for more performance).
In these solutions you still send the full uncompressed files over the wire, so they don't improve network performance, but they give you a lot of options if you're trying to speed up disk I/O or get more utilization out of your drives.

How do I copy a locked file directly from the disk and make sure that the file is intact?

The application I am writing needs to be able to copy files that are locked. We attempted to use Volume Shadow Copy, and while it was successful in copying the file, the application that had the lock on the file crashed because it could not acquire a lock while we were copying the file.
I am led to believe that my only option is to bypass the OS and read directly from the disk. The problem is that I then cannot be sure of the file's integrity; if it is in the middle of a write, the file will be in a damaged state.
After hours of searching I was able to find one utility that copied the file directly from the disk and used a file system driver to cache writes while copying so that it could make sure that the file was in an intact state. However, that utility is extraordinarily expensive, 100k+ for the license I would likely need to use.
Does anyone have any ideas on how to accomplish what I am trying to?
We are planning on restricting the system to NTFS volumes only.
I ended up using a C program called DirectCopy written by Napalm. It works rather well.
http://www.rohitab.com/discuss/topic/24252-ntfs-directcopy-method-from-napalm/
Can you grab the process ID of the application that holds the lock and suspend its threads while you perform the copy? Something like this: http://www.codeproject.com/KB/threads/pausep.aspx
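A hedged sketch of that idea in C using the Win32 Toolhelp API (roughly what the pausep utility linked above does):

    #include <windows.h>
    #include <tlhelp32.h>

    /* Suspend (or resume) every thread belonging to the process `pid`. */
    void set_process_suspended(DWORD pid, BOOL suspend)
    {
        HANDLE snap = CreateToolhelp32Snapshot(TH32CS_SNAPTHREAD, 0);
        THREADENTRY32 te;
        te.dwSize = sizeof te;
        if (snap == INVALID_HANDLE_VALUE)
            return;
        if (Thread32First(snap, &te)) {
            do {
                if (te.th32OwnerProcessID != pid)
                    continue;
                HANDLE t = OpenThread(THREAD_SUSPEND_RESUME, FALSE,
                                      te.th32ThreadID);
                if (t) {
                    if (suspend) SuspendThread(t); else ResumeThread(t);
                    CloseHandle(t);
                }
            } while (Thread32Next(snap, &te));
        }
        CloseHandle(snap);
    }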
This description of "layered drivers" might be useful. I know nothing about it though.
Also, if the file is locked, can you just 'watch' it, wait for it to be unlocked, and then quickly copy it?

Garbage data in file after sudden power loss

I am using a flash device with a FAT32 filesystem. I am continuously writing data to a file using the file system APIs from an RTOS (SMX). However, after a sudden power-off, on reboot the file contains garbage values just above the first file entry.
I ran the chkdsk utility, but it doesn't fix the problem.
Any idea how I can get rid of these garbage entries even after unclean power-offs?
If you expect sudden power loss, you'll need to disable all caching/buffering on file writes. Of course you'll also need to deal with partially-written files, but that should at least prevent trailing garbage.
I don't know the API you're using, but this might be done by mounting the drive "synchronously" (e.g., mount -o sync in Linux) or by opening individual files with specific options. If you do disable buffering on individual file writes, you may still run the risk of corrupting the FAT, however, and losing all the files.
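For illustration, a POSIX-flavored sketch of that advice; the SMX API will differ, but the principle of forcing each write through to the medium is the same:

    #include <fcntl.h>
    #include <unistd.h>

    int append_durable(const char *path, const void *buf, size_t len)
    {
        int fd = open(path, O_WRONLY | O_CREAT | O_APPEND | O_SYNC, 0644);
        if (fd < 0)
            return -1;
        ssize_t n = write(fd, buf, len);  /* O_SYNC: returns only after the
                                             data reaches the device */
        fsync(fd);                        /* belt and braces for metadata */
        close(fd);
        return n == (ssize_t)len ? 0 : -1;
    }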
