How can I set file creation times in ZFS?

I've just got a NAS running ZFS and I'd like to preserve creation times when transferring files into it. Both linux/ext4 (where the data is now) and zfs store a creation time or birth time; in the case of zfs it's even reported by the stat command. But I haven't been able to figure out how to set the creation time of a file so that it mirrors the creation time in the original file system. This is unlike an ext4->ext4 transfer, where I can feed debugfs a script to set the file creation times.
Is there a tool similar to debugfs for ZFS?
PS. To explain better:
I have a USB drive attached to an Ubuntu 14.04 laptop. It holds a file system where I care about the creation date (birth date) of the individual files. I consult these creation timestamps often using a script based on debugfs, which reports them as crtime.
I want to move the data to a NAS box running ZFS, but the methods I know (scp -p -r, rsync -a, and tar, among others I've tried) preserve the modification time but not the creation time.
If I were moving to another ext4 file system I would solve the problem using the fantastic tool debugfs. Specifically I can make a list of (filename, crtime) pairs on the source fs (file system), then use debugfs -w on the target fs to read a script with lines of the form
set_inode_field filename crtime <value>
I've tested this and it works just fine.
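For concreteness, a minimal sketch of that ext4 workflow might look like the following. The device name /dev/sdc1, the file paths, and the set_crtimes.dbg script name are placeholders, and the time format accepted by set_inode_field can vary between debugfs versions, so treat this as an illustration rather than a recipe:
# Build a debugfs script from the (filename, crtime) list, one line per file:
cat > set_crtimes.dbg <<'EOF'
set_inode_field /photos/a.jpg crtime 20150507123835
set_inode_field /photos/b.jpg crtime 20150507123900
EOF
# Replay it against the (ideally unmounted) target ext4 device in write mode:
debugfs -w -f set_crtimes.dbg /dev/sdc1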
But my target fs is not ext4 but ZFS and although debugfs runs on the target machine, it is entirely useless there. It doesn't even recognize the fs. Another debug tool that lets you alter timestamps by editing an inode directly is fsdb; it too runs on the target machine, but again I can't seem to get it to recognize a ZFS file system.
I'm told by the folks who sold me the NAS box that debugfs and fsdb are not meant for ZFS filesystems, but they haven't been able to come up with an equivalent. So, after much googling and trying out things I finally decided to post a question here today, hoping someone might have the answer.
I'm surprised at how hard this is turning out to be. The question of how to replicate a dataset so all timestamps are identical seems quite natural from an archival point of view.

Indeed, neither fsdb nor debugfs is likely to be suitable for use with ZFS. What you might need to do instead is find an archive format that will preserve the crtime field that presumably is already set for the files on your fileserver. If there is a version of pax or another archiving tool for your system, it may be able to do this (cf. the -pe "preserve everything" flag for pax, which in current versions it seems does not actually preserve "everything" - viz. it does not preserve crtime/birth_time). You will likely have more success finding an archiving application that is "crtime aware" than trying to set the creation times by hacking on the ZFS-based FreeBSD system with what are likely to be rudimentary tools.
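For what it's worth, the pax invocation in question would look roughly like the sketch below (source and target paths are placeholders); as noted above, current versions reportedly still do not carry crtime across:
# Copy a tree in pax's read-write mode, asking it to preserve everything it can:
cd /mnt/source && pax -rw -pe . /mnt/zfs-target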
You may be able to find more advanced tools on OpenSolaris-based systems like Illumos or SmartOS (e.g. mdb). Whether it would be possible to transfer your data to a ZFS dataset on one of those platforms and then combine the tools they have with, say, dtrace to rewrite the crtime fields is more of a theoretical question. If it worked, you could then export the pool and its datasets to FreeBSD - exporting a pool does seem to preserve the crtime time stamps. If you are able to preserve crtime while dumping your ext4 filesystem to a ZFSonLinux dataset on the same host (nb: I have not tested this), you could then use zfs send to transfer the whole filesystem to your NAS.
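If that intermediate ZFSonLinux step did preserve crtime, the final hop could be a plain dataset replication, roughly as in this sketch (pool, dataset and host names are placeholders):
# Snapshot the staging dataset and stream it to the NAS; the received dataset
# should arrive with its file metadata, including crtime, intact:
zfs snapshot tank/staging@migrate
zfs send tank/staging@migrate | ssh nas zfs receive naspool/archive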
This coreutils bug report may shed some light on the state of user- and operating-system-level tools on Linux. Arguably, the filesystem-level crtime field of an inode should be difficult to change. While ZFS on FreeBSD "supports" crtime, the state of low-level filesystem debugging tools on FreeBSD might not have kept pace in earlier releases (cf. the zdb manual page). Are you sure you want to "set" (or reset) inode creation times? Or do you want to preserve them after they have been set on a system that already supports them?
On a FreeBSD system if you stat a file stored on a ZFS dataset you will often notice that the crtime field of the file is set to the same time as the ctime field. This is likely because the application that wrote the file did not have access to library and kernel functions required to set crtime at the time the file was "born" and its inode entries were created. There are examples of applications / libraries that try to preserve crtime at the application level such as libarchive(3) (see also: archive_entry_atime(3)) and gracefully handle inode creation if the archive is restored on a filesystem that does not support the crtime field. But that might not be relevant in your case.
As you might imagine, there are a lot of applications that write files to filesystems ... especially with Unix/POSIX systems where "everything is a file". I'm not sure if older applications would need to be modified or recompiled to support those fields, or whether they would pick them up transparently from the host system's C libraries. Applications being used on older FreeBSD releases or on a Linux system without ext4 could be made to run in compatibility mode on an up to date OS, for example, but whether they would properly handle the time fields is a good question.
For me, running this little script as sh birthtime_test confirms that file creation times are "turned on" on my FreeBSD systems (all of which use ZFS post-v28, i.e. with feature flags):
#!/bin/sh
#birthtime_test
uname -r
# start from a fresh file
if [ -f new_born ] ; then rm -f new_born ; fi
touch new_born
sleep 3
# touch the access time
touch -a new_born
sleep 3
# append content: this "modifies" the file
echo "Hello from new_born at:" >> new_born
echo `date` >> new_born
sleep 3
# alter permissions: this "changes" the file without modifying its contents
chmod o+w new_born
stat -f "Name:%t%N
Born:%t%SB
Access:%t%Sa
Modify:%t%Sm
Change:%t%Sc" new_born
cat new_born
Output:
9.2-RELEASE-p10
Name: new_born
Born: May 7 12:38:35 2015
Access: May 7 12:38:38 2015
Modify: May 7 12:38:41 2015
Change: May 7 12:38:44 2015
Hello from new_born at:
Thu May 7 12:38:41 EDT 2015
(NB: the chmod operation "changes" the file but does not "modify" its contents - modifying is what the echo command does by appending content to the file. See the touch manual page for explanations of the -m and -a flags.)
This is the oldest FreeBSD release I have access to right now. I'd be curious to know how far back in the release cycle FreeBSD is able to handle this (on ZFS or UFS2 file systems). I'm pretty sure this has been a feature for quite a while now. It would also be useful to know how the OSX and Linux versions of ZFS behave with regard to this feature.
Just one more thing ...
Here is an especially nice feature for simple "forensics". Say we want to send our new_born file back to when time began, back to the leap second that never happened and when - in a moment of timeless time - Unix was born ... :-) [1] We can just change the date using touch -d and everyone will think new_born is old and wise, right?
Nope:
~/ % touch -d "1970-01-01T00:00:01" new_born
~/ % stat -f "Name:%t%N
Born:%t%SB
Access:%t%Sa
Modify:%t%Sm
Change:%t%Sc" new_born
Name: new_born
Born: May 7 12:38:35 2015
Access: Jan 1 00:00:01 1970
Modify: Jan 1 00:00:01 1970
Change: May 7 13:29:37 2015
It's always more truthful to actually be as young as you look :-)
Time and Unix - a subject both practical and poetic: after all, what is "change"; and what does it mean to "modify" or "create" something? Thanks for your great post Silvio - I hope it lives on and gathers useful answers.
You can improve and generalize your question if you can be more specific about your requirements for preserving, setting, archiving of file timestamp fields. Don't get me wrong: this is a very good question and it will continue to get up votes for a long time.
You might take a look at Dylan Leigh's presentation Forensic Timestamp Analysis of ZFS or even contact Dylan for clues on how to access crtime information.
[1] There was a legend that claimed in the beginning, seconds since long (SSL) ago was never less than date -u -j -f "%Y-%m-%d:%T" "1970-01-01:00:00:01" "+%s" because of a leap second ...

Related

How to display the top largest files in a non-blocking manner on Linux?

For years I have been using variations of the du command below in order to produce a report of the largest files from a specific location, and most of the time it worked well.
du -L -ch /var/log | sort -rh | head -n 10 &> log-size.txt
This proved to get stuck in several cases, in a way that prevented stopping it even with the timeout -s KILL 5m ... approach.
A few years back this was caused by stalled NFS mounts, but more recently I ran into this on VMs where I didn't use NFS at all. Apparently there is a ~1:30 chance of getting this on OpenStack builds.
I read that following symbolic links (-L) can block "du" in some cases if there are loops, but my tests failed to reproduce the problem, even when I created a loop.
I cannot avoid following the symlinks because that's how the files are organized.
What would be a safer alternative to generate this report, one that would not block, or at least one that could be constrained to a maximum running duration? It is essential to limit the execution of this command to a number of minutes -- if I can also get a partial result on timeouts, or some debugging info, even better.
If you don't care about sparse files and can make do with apparent size (and not the on-disk size), then ls should work just fine: ls -L --sort=s | head -n10 > log-size.txt
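A slightly fuller variant of that idea, assuming GNU ls and reusing the timeout wrapper from the question, might look like this (-l prints the apparent sizes, -S sorts by size descending):
# the first line of ls -l output is a "total" header, hence 11 lines for 10 files
timeout -s KILL 5m ls -LlS /var/log | head -n 11 > log-size.txt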

Why does this Armbian repeatedly lose connectivity?

I have an Olimex Lime2 running Armbian, headless. On this board I only care about SSH and MiniDLNA. I hope to get around to including the whole configuration, but one important bit might be that in /boot/armbianEnv.txt I put
extraargs=acpi=off
For one year now I have experienced very-hard-to-debug availability issues. The machine randomly stops being accessible via ping or ssh. The issues are hard to debug because they seem to disappear when a monitor or keyboard is connected, while I can't find any trace of them when the system runs headless. While I got the problem mostly under control without knowing how, the Olimex still stops responding now and then. This time I want to ask why.
I noticed that the Olimex stopped providing DLNA access on 10/25, around 2pm. I did not touch it, to see if it would recover (which happens sometimes). This time the system remained unreachable for 2 days until I unplugged the power.
Below you can find links to two logs. I would be very glad if anything suspicious in them could be pointed out so I can start resolving them.
One particular thing I wonder: why did the system decide to reboot? There was no power outage that day. I expect that a normal reboot would manifest itself in the logs, wouldn't it?
The logs:
/var/logs/messages: https://pastebin.com/qgRumreB
/var/logs/syslog: https://pastebin.com/U5jpHNHm
The logs are complete. I only removed lines in the beginning and at the end, but not in between.
Although I did not find a clear solution, I want to share what I have tried. Hopefully this will provide inspiration for others with similar problems.
One thing that helped me a lot (and which wasn't available back then) was switching to a newer OS. I am now running Armbian based on Ubuntu 18.04 and the logs are less cluttered. Also, some minor details improved. If you landed here and still run Armbian Stretch, you should upgrade.
One reason for frequent crashes could be a corrupt file system, caused by frequent crashes :-(. Armbian mounts your SD card with
UUID=<uid> / ext4 defaults,noatime,nodiratime,commit=600,errors=remount-ro 0 1
note the commit=600, which means that changes are written out only every 10 minutes. If your machine crashes in between, the file system might become corrupt. Thus, you might run fsck.ext4 on your SD card's file system. To counteract the problem in general, you could (see the sketch after this list):
run fsck at every boot
omit the commit setting. My SD card can handle the additional stress quite well.
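A sketch of those two options (the device name /dev/mmcblk0p1 is an assumption; adjust it to your SD card's root partition):
# 1. Force a file system check on every boot by setting the maximum mount count to 1:
tune2fs -c 1 /dev/mmcblk0p1
# 2. Drop the commit=600 option, i.e. edit the /etc/fstab line to read:
#    UUID=<uid> / ext4 defaults,noatime,nodiratime,errors=remount-ro 0 1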
What I think solved the problem for me was putting my external HDD, attached via SATA, to sleep. I was surprised that this does not happen automatically. Now I have the following section appended to my /etc/hdparm.conf:
/dev/disk/by-uuid/<uid> {
    spindown_time = 60
    write_cache = off
}
This tells hdparm to put the HDD into standby after 5 minutes of inactivity (spindown_time values up to 240 are multiples of 5 seconds, so 60 means 300 seconds). Turning the write cache off is again a safety measure against file system corruption. Some disks reorder BtrFS write instructions when they shouldn't.
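The same settings can also be tried out once, without editing hdparm.conf, by calling hdparm directly (assuming the disk is /dev/sda):
# -S 60 = spin down after 60 * 5 s = 5 minutes; -W 0 = disable the write cache
hdparm -S 60 -W 0 /dev/sda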
Something I observed is that putting some load on the machine helped to keep the system alive. Unfortunately, that was only true up to a point. However, if attaching a keyboard and mouse, or letting a script run continuously, helps against the crashes, you will have something to work with.
I use the following script to log information that might help me in case of a crash:
#!/bin/bash
# LICENSE: GPLv3 or later
set -euo pipefail

# Default log file; can be overridden by the first CLI argument.
LOGFILE=/home/mgoerner/error-detection.log

function main() {
    parse_cli_args "$@"
    # Append a debug snapshot every 3 minutes and flush it to disk.
    while true
    do
        print_debug_information >>"$LOGFILE" 2>&1
        sync
        sleep 3m
    done
}

function parse_cli_args() {
    if [[ $# -eq 1 ]]
    then
        arg="$1"; shift
        if [[ "$arg" == "--help" || "$arg" == "-h" ]]
        then
            print_usage
            exit
        fi
        LOGFILE="$arg"
    elif [[ $# -gt 1 ]]
    then
        echo "Please provide at most one argument!" >&2
        exit 1
    fi
}

function print_usage() {
    cat <<EOF
$0 [LOGFILE]
EOF
}

function print_debug_information() {
    # uptime, kernel messages, network and disk state (several of these need root)
    echo
    date
    uptime
    dmesg -uT | tail
    ip addr show wlxd85d4c97e434
    iwlist wlxd85d4c97e434 scan | egrep ESSID
    hdparm -acdgkmurC /dev/sda
    free
}

main "$@"
I let it start automatically after boot. Setting the sleep time below ~10 minutes used to make the crashes disappear, but it no longer does. Unfortunately, the error log produced by this script never gave me any insight. The same goes for the various logs under /var/log/. However, that might be different for you.
Furthermore, I suspect that my WiFi dongle does not like being warm. I reuse a child's shoe box as a case, and putting the dongle inside the closed box caused some connectivity problems.
Last but not least, I turned off automatic updates after reboots. Very often the crashes happened directly after some (mysterious) reboot. Turning off automatic updates helped me get rid of this altogether.

busybox ntpd does not resync date/time after changing it

I'm trying to figure out how ntpd (from busybox) works.
I'm running the following scenario as a test (sketched as shell commands below):
set the date/time, using date -s, to an arbitrary value (e.g. 2000-01-01 00:00:00);
run the command ntpd -N -p <server_address> to start the daemon. Just after that, the date/time is successfully synced;
change the date/time again, using date -s, to the same value used in the 1st step (i.e. 2000-01-01 00:00:00).
After that, I expected the date/time to be synchronized again, but this doesn't happen, even if I wait for a couple of hours.
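The same steps as commands, for reference (<server_address> is a placeholder, as above):
date -s "2000-01-01 00:00:00"    # step 1: set an arbitrary old date
ntpd -N -p <server_address>      # step 2: start busybox ntpd; the time gets corrected
date -s "2000-01-01 00:00:00"    # step 3: set the clock back again
# expectation: ntpd steps the clock forward once more; observed: it does not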
My question is: is my understanding of ntpd's behavior correct? Should the date/time be resynced automatically after the 3rd step? If not, what should I do to resync the date/time?
I would check internally whether the trimmed-down busybox implementation actually covers this use case. Some options may simply be ignored, and that can cause confusion.
If not, and in case it is a Yocto-based embedded system, you should consider bringing in the actual, complete ntpd instead of the busybox one.

Get full path of executable of running process on AIX

This is most similar to Get full path of executable of running process on HPUX…, except for AIX.
The basic question is: how, on AIX, can I determine the full path to the current executable? Having to do this before doing anything else (e.g., chdir) is fine.
The most accurate answer I've found so far is to check the output from
svmon -P $$ -O format=nolimit,filename=on,filtertype=client
(where $$ has its shell meaning: the current pid). That's not only a heavy amount of C to wrap, but svmon is also not very fast and can easily overwhelm the runtime of the rest of the application.
The next best answer seems to be to simply look at argv[0], and, if it has a slash in it, it's either a full path name (starts with a leading /) or a relative-to-current-dir name (does not start with a leading /). If it doesn't have a slash in it, it's relative to something in PATH.
And if, after this resolution, I end up with a symlink, then there's all the resolution of symlink(s) to deal with as well (hard links are probably beyond the scope of any solution). This solution looks like it's relatively cross-platform, but is also very heavy in the C code (should be faster than svmon). And I expect there are race-conditions and such to deal with.
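To make the argv[0] logic concrete, here is the resolution rule sketched in shell ($0 stands in for argv[0]; the real implementation would live in C inside perl, and readlink -f availability on AIX is an assumption):
case "$0" in
  /*)  exe="$0" ;;                    # already an absolute path
  */*) exe="$(pwd)/$0" ;;             # has a slash: relative to the current directory
  *)   exe="$(command -v "$0")" ;;    # no slash: search PATH
esac
# resolve any symlinks left over (readlink -f is GNU-ish; may not exist on AIX)
exe="$(readlink -f "$exe" 2>/dev/null || echo "$exe")"
echo "$exe"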
Thanks,
Update: I'm looking for a solution to submit to the perl devs. And they're going to worry about an insecure PATH, especially in setuid/setgid scenarios. See perlsec. I think that we could be okay here, but if you combined setuid with faking argv[0], you could force perl to think it's somewhere else, and load the wrong modules. The "next best" answer above only really works when not in perl's taint-mode.
Why can't you use ps options as a baseline? Granted, you'll still need to process the cmd value to see if it has a leading '/' or not. Something like
ps -o pid,env,cwd,cmd | grep youAppName | awk -f programToRationalizePathName
I don't have access to AIX anymore, but I did work on it for 2 1/2 years and I know I've used this sort of feature. I didn't think it was slow, but I might have had different requirements than you do.
I get the impression you want a utility function, a one-at-a-time call that returns the full path, but if you need an ongoing process and are concerned about restarting ps every minute (for example), look at the AIX-specific nmon utility. I'm not sure if it can generate output similar to ps -o cmd, but it can be set up to run as long as you want, as often as you want (down to 1-second intervals), and it is just one process whose output can be redirected as needed. (nmon is not part of the standard install, but most organizations do install it, as it is IBM-blessed, if not supported directly.)
Of course all of the 'not 100%' caveats apply from the similar questions mentioned by you and in the comments.
I hope this helps.
Use ps to get the executable path
ps -aef | grep app | awk '{print $8}'
The above command gives your app's executable path.

Following multiple log files efficiently

I'm intending to create a programme that can permanently follow a large, dynamic set of log files and copy their entries over to a database for easier near-realtime statistics. The log files are written by various daemons and applications, but their format is known so they can be parsed. Some of the daemons write logs into one file per day, like Apache's cronolog, which creates files like access.20100928. Those files appear with each new day and may disappear when they're gzipped away the next day.
The target platform is an Ubuntu Server, 64 bit.
What would be the best approach to efficiently reading those log files?
I could think of scripting languages like PHP that either open the files themselves and read new data, or use system tools like tail -f to follow the logs, or other runtimes like Mono. Bash shell scripts probably aren't so well suited for parsing the log lines and inserting them into a database server (MySQL), not to mention easy configuration of my app.
If my programme reads the log files itself, I'd think it should stat() each file once a second or so to get its size and open the file when it has grown. After reading the file (which should hopefully only return complete lines) it could call tell() to get the current position and next time seek() directly to the saved position to continue reading. (These are C function names, but I wouldn't actually want to do this in C; Mono/.NET or PHP offer similar functions as well.)
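As a rough illustration of that polling loop in shell (access.log is a placeholder; a PHP or Mono implementation would use the equivalent stat/seek calls):
offset=0
while sleep 1; do
    size=$(stat -c %s access.log 2>/dev/null) || continue   # file may not exist yet
    if [ "$size" -gt "$offset" ]; then
        # emit only the bytes appended since the last round, then remember the position
        tail -c +"$((offset + 1))" access.log | head -c "$((size - offset))"
        offset=$size
    fi
done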
Is that constant stat()ing of the files and subsequent opening and closing a problem? How would tail -f do that? Can I keep the files open and be notified about new data with something like select()? Or does it always return at the end of the file?
In case I'm blocked in some kind of select() or an external tail, I'd need to interrupt that every one or two minutes to scan for new or deleted files that should (no longer) be followed. Resuming with tail -f then is probably not very reliable. That should work better with my own saved file positions.
Could I use some kind of inotify (file system notification) for that?
If you want to know how tail -f works, why not look at the source? In a nutshell, you don't need to periodically interrupt or constantly stat() to scan for changes to files or directories. That's what inotify does.
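A minimal sketch of that approach using inotify-tools (assuming inotifywait is installed and /var/log/myapp is a hypothetical directory being followed):
# -m keeps monitoring instead of exiting after the first event
inotifywait -m -e modify -e create -e delete --format '%e %w%f' /var/log/myapp |
while read -r event file; do
    echo "event=$event file=$file"    # parse the new lines / reopen rotated files here
done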
