Why does EXT4/JBD2 keep calling ext4_journal_stop after mounting?

I was investigating the journaling layer used by ext4 (JBD2), and I added some printk calls to observe the behavior of the ext4_journal_start and ext4_journal_stop functions.
This is the procedure:
I first format a given partition using:
sudo mke2fs -t ext4 /dev/vdb
(I am using QEMU to run this experiment)
Then I mount it:
sudo mount /dev/vdb /mnt/mydisk
That is the normal mounting procedure, but when I mount the partition, my printk calls in ext4_journal_start/stop show up in dmesg as many calls to journal_stop without any matching journal_start.
P.S.: I would guess this is some background behavior of ext4, but I have no idea what it is.
Here is the dmesg output:
* Restoring resolver state... [ OK ]
* Stopping System V runlevel compatibility [ OK ]
[ 124.648904] JOURNAL STOP
[ 124.778691] JOURNAL STOP
...
[... ] # it is called maybe more than 40 times
...
[ 129.641895] jbd2_journal_commit_transaction
[ 129.769132] JOURNAL STOP
...
[... ] # it is called maybe more than 40 times
...
[ 134.766164] jbd2_journal_commit_transaction
After about 134 seconds these messages stop. I then write a file into the mount point, and it behaves as expected:
[ 624.995549] JOURNAL START
[ 624.996849] JOURNAL STOP
[ 625.000676] JOURNAL START
[ 625.001757] JOURNAL START
[ 625.002822] JOURNAL STOP
[ 625.003773] JOURNAL STOP
[ 631.004110] jbd2_journal_commit_transaction
So it is strange that, even though I did absolutely nothing after mounting, journal_stop is called many times; furthermore, after two commits (calls to jbd2_journal_commit_transaction) dmesg settles down and follows the expected behavior.
To make it clear, my question is: what causes these many seemingly unprovoked calls to ext4_journal_stop?

By debugging the ext4 source code, I discovered what causes the many journal operations right after the file system is mounted: ext4 is initializing the inode tables. With mke2fs's default lazy_itable_init behavior, the inode tables are not zeroed at format time; after the first mount, the kernel's ext4lazyinit worker zeroes them in the background, going through the journal as it works. That is why the journal is exercised repeatedly right after mounting a freshly formatted partition.
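If you want to confirm this, a hedged experiment (lazy_itable_init and lazy_journal_init are mke2fs extended options; /dev/vdb matches the question's setup) is to reformat with lazy initialization disabled and check that the post-mount journal traffic disappears:
# Reformat with lazy inode-table/journal initialization disabled;
# the burst of journal_stop calls after mount should be gone.
sudo mke2fs -t ext4 -E lazy_itable_init=0,lazy_journal_init=0 /dev/vdb
sudo mount /dev/vdb /mnt/mydisk
# With the defaults, the background worker is visible while it runs:
ps ax | grep '[e]xt4lazyinit'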

Related

gem5 full system Linux boot fails with "Kernel panic - not syncing: VFS: Unable to mount root fs"

I want to run an ARM Linux system in gem5's full-system (FS) mode. I downloaded the related files from this address:
http://www.gem5.org/documentation/general_docs/fullsystem/guest_binaries
I was able to configure the correct file paths, but finally got this output in terminal 2:
[ 0.661620] No filesystem could mount root, tried:
[ 0.661621] ext3
[ 0.661650] ext4
[ 0.661663] ext2
[ 0.661676] vfat
[ 0.661690] fuseblk
[ 0.661703]
[ 0.661728] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(254,0)
And got this output in terminal 1:
warn: Tried to read RealView I/O at offset 0x60 that doesn't exist
warn: Tried to write RVIO at offset 0xa8 (data 0) that doesn't exist
warn: Kernel panic in simulated kernel
Here is my command-line input; simply adjusting the configuration inside it only leads to the same result:
./build/ARM/gem5.opt configs/example/arm/starter_fs.py --cpu="minor" --num-cores=4 --disk-image=/home/ad/GEM5/ARM_GEM5/gem5/my_image/disks/aarch64-ubuntu-trusty-headless.img --dtb=/home/ad/GEM5/ARM_GEM5/gem5/my_image/binaries/armv7_gem5_v1_4cpu.dtb --kernel=/home/ad/GEM5/ARM_GEM5/gem5/my_image/binaries/vmlinux
How can I solve this?
Regards,
Here is a full diagnostic procedure for this kind of problem: https://askubuntu.com/questions/41930/kernel-panic-not-syncing-vfs-unable-to-mount-root-fs-on-unknown-block0-0/1048477#1048477
In summary, you have to ensure that:
the kernel has the config to read the disk type, for emulation usually:
CONFIG_VIRTIO_PCI=y
CONFIG_VIRTIO_BLK=y
This seems to be the problem here, since no list of partitions was printed above (please confirm). If there is none, it seems the kernel can't read bytes from the disk at all.
the kernel has the config to read the filesystem type. Your kernel mentions ext2/3/4 though, so that's likely not the problem.
you are pointing the root= kernel CLI to the right partition
See also: https://cirosantilli.com/linux-kernel-module-cheat/#not-syncing That also contains a Buildroot setup that just works.
I also highly recommend that you first get it working on QEMU which boots much faster.
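To check the first point above, a hedged sketch (the .config path is an assumption for your build tree):
# From a running guest, if the kernel was built with CONFIG_IKCONFIG_PROC:
zcat /proc/config.gz | grep -E 'CONFIG_VIRTIO_(PCI|BLK)='
# Otherwise, grep the .config of the tree that produced vmlinux:
grep -E 'CONFIG_VIRTIO_(PCI|BLK)=' /path/to/linux/.config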
I had a similar problem. Trying to run a full-system emulation of both Ubuntu and Linaro minimal (from the gem5 website) under a 64-bit kernel, with the original starter_fs.py script, gives me this kernel panic:
[ 0.224367] List of all partitions:
[ 0.224394] fe00 1048320 vda
[ 0.224397] driver: virtio_blk
[ 0.224440] fe01 1048288 vda1 00000000-01
[ 0.224441]
[ 0.224480] No filesystem could mount root, tried:
[ 0.224481] ext3
[ 0.224510] ext4
[ 0.224524] ext2
[ 0.224537] squashfs
[ 0.224551] vfat
[ 0.224566] fuseblk
[ 0.224579]
[ 0.224606] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(254,0)
[ 0.224656] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0+ #1
[ 0.224692] Hardware name: V2P-CA15 (DT)
[ 0.224717] Call trace:
[ 0.224741] dump_backtrace+0x0/0x1c0
[ 0.224765] show_stack+0x14/0x20
[ 0.224790] dump_stack+0x8c/0xac
[ 0.224812] panic+0x130/0x288
[ 0.224836] mount_block_root+0x22c/0x294
[ 0.224861] mount_root+0x140/0x174
[ 0.224884] prepare_namespace+0x138/0x180
[ 0.224910] kernel_init_freeable+0x1c0/0x1e0
[ 0.224939] kernel_init+0x10/0x108
[ 0.224961] ret_from_fork+0x10/0x18
[ 0.224987] Kernel Offset: disabled
[ 0.225009] CPU features: 0x21c06492
[ 0.225032] Memory Limit: 2048 MB
[ 0.225056] ---[ end Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(254,0) ]---
The weird thing is that, a few weeks ago, it worked like a charm. The problem lies in the specification of the root partition on the kernel command line. In starter_fs.py, change this line:
"root=/dev/vda",
to this:
"root=/dev/vda1",
You can see that, before, the whole VirtIO block device was specified. The kernel wants a partition, not a block device. Then you can run gem5:
build/ARM/gem5.opt configs/example/arm/starter_fs.py --cpu="hpi" --num-cores=1 --disk-image="linaro-minimal-aarch64.img" --kernel="vmlinux.arm64"
And for me, the kernel panic is gone and I am able to boot my system again:
[ 0.228847] EXT4-fs (vda1): mounted filesystem without journal. Opts: (null)
[ 0.228906] VFS: Mounted root (ext4 filesystem) on device 254:1.
[ 0.229539] devtmpfs: mounted
[ 0.229792] Freeing unused kernel memory: 448K
INIT: version 2.88 booting
[ 0.234168] random: fast init done
Starting udev
[ 0.277039] udevd[715]: starting version 182
[ 0.411534] EXT4-fs (vda1): re-mounted. Opts: block_validity,delalloc,barrier,user_xattr
Starting Bootlog daemon: bootlogd.
[ 0.426573] random: dd: uninitialized urandom read (512 bytes read)
Populating dev cache
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.all.rp_filter = 1
hwclock: can't open '/dev/misc/rtc': No such file or directory
Mon Jan 27 08:00:00 UTC 2014
hwclock: can't open '/dev/misc/rtc': No such file or directory
INIT: Entering runlevel: 5
Configuring network interfaces... ifconfig: SIOCGIFFLAGS: No such device
Starting rpcbind daemon...rpcbind: cannot create socket for udp6
rpcbind: cannot create socket for tcp6
done.
rpcbind: cannot get uid of '': Success
creating NFS state directory: done
starting statd: done
Starting auto-serial-console: done
Stopping Bootlog daemon:
bootlogd.
Last login: Mon Jan 27 08:00:00 UTC 2014 on tty1
INIT: no more processes left in this runlevel
root@genericarmv8:~# id
id
uid=0(root) gid=0(root) groups=0(root)
root@genericarmv8:~#
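If you are unsure whether a given image is partitioned, and therefore whether root= should name vda or vda1, a hedged check is to run fdisk directly against the image file (util-linux fdisk accepts plain files):
# A partitioned image lists entries such as ...img1;
# a bare filesystem image lists no partition table at all.
fdisk -lu linaro-minimal-aarch64.img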

Emulating the reMarkable tablet (i.MX6 ARMv7) with QEMU

I'm trying to emulate the reMarkable tablet with QEMU in order to create a proper development environment for it, instead of cross-compiling and copying to the hardware device.
The firmware flasher repo contains the rootfs, kernel, DTB, and u-boot files. I've created an .img file from the rootfs in order to boot it in QEMU with the following command:
qemu-system-arm \
-M sabrelite \
-bios "files/u-boot.imx" \
-kernel "zImage" \
-append "console=ttymxc0 rootfstype=ext4 root=/dev/mmcblk1p2 rw rootwait init=/bin/bash loglevel=8 bootmem-debug earlyprintk" \
-dtb "zero-gravitas.dtb" \
-drive file="floppy.img",format=raw,id=mmcblk1p2 \
-device sd-card,drive=mmcblk1p2
but the kernel does not seem to be starting, as I get the same log whether the floppy.img file (drive+device) is given or not. Startup loops on this error:
[ 0.713093] 2020000.serial: ttymxc0 at MMIO 0x2020000 (irq = 19, base_baud = 5000000) is a IMX
[ 0.732268] console [ttymxc0] enabled
[ 0.736333] phy index low: 1, phy index high: 2
[ 240.289647] INFO: task swapper:1 blocked for more than 120 seconds.
[ 240.290160] Not tainted 4.1.28-zero-gravitas-01866-ge0b823726ea4-dirty #82
[ 240.290318] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 240.290662] swapper D 8051c44c 0 1 0 0x00000000
[ 240.292245] [<8051c44c>] (__schedule) from [<8051c73c>] (schedule+0x40/0x98)
[ 240.292473] [<8051c73c>] (schedule) from [<8051e7b8>] (schedule_timeout+0x114/0x168)
[ 240.292781] [<8051e7b8>] (schedule_timeout) from [<8051d248>] (wait_for_common+0x88/0x130)
[ 240.292953] [<8051d248>] (wait_for_common) from [<80262c74>] (imx_rng_init+0x158/0x2a8)
[ 240.293117] [<80262c74>] (imx_rng_init) from [<80262574>] (set_current_rng+0xc0/0x15c)
[ 240.293276] [<80262574>] (set_current_rng) from [<80262874>] (hwrng_register+0x190/0x1b8)
[ 240.293436] [<80262874>] (hwrng_register) from [<807c3fd8>] (imx_rng_probe+0xd4/0x134)
[ 240.293682] [<807c3fd8>] (imx_rng_probe) from [<802748e0>] (platform_drv_probe+0x44/0xac)
[ 240.293852] [<802748e0>] (platform_drv_probe) from [<802735ac>] (driver_probe_device+0x178/0x2b8)
[ 240.294009] [<802735ac>] (driver_probe_device) from [<802737bc>] (__driver_attach+0x8c/0x90)
[ 240.294158] [<802737bc>] (__driver_attach) from [<80271d50>] (bus_for_each_dev+0x68/0x9c)
[ 240.294352] [<80271d50>] (bus_for_each_dev) from [<802726bc>] (bus_add_driver+0x13c/0x1e4)
[ 240.294600] [<802726bc>] (bus_add_driver) from [<80273ed4>] (driver_register+0x78/0xf8)
[ 240.294843] [<80273ed4>] (driver_register) from [<807c434c>] (__platform_driver_probe+0x20/0x70)
[ 240.295092] [<807c434c>] (__platform_driver_probe) from [<807a9d78>] (do_one_initcall+0x118/0x1c4)
[ 240.295367] [<807a9d78>] (do_one_initcall) from [<807a9f48>] (kernel_init_freeable+0x124/0x1c4)
[ 240.295609] [<807a9f48>] (kernel_init_freeable) from [<8051883c>] (kernel_init+0x8/0xe8)
[ 240.295844] [<8051883c>] (kernel_init) from [<8000ef88>] (ret_from_fork+0x14/0x2c)
full log here
I will update this question when I have new findings, but I'm new to QEMU, and I'm quite stuck and have run out of options. The repository I'm working in is here. Thanks for any input!
I haven't investigated too closely, but the fact that the backtrace shows a hang in the imx_rng_init function suggests that the problem is that QEMU doesn't have an emulation of the imx SoC's builtin RNG device, and so the guest is hanging forever waiting for a response from hardware that doesn't exist.
You'll need to either implement a model of that device, or else use a guest kernel which doesn't try to probe for that device.
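If patching the device tree is easier than rebuilding the kernel, one hedged option (the rng node name is an assumption; inspect the decompiled source for the real one) is to mark the node disabled so the driver never probes it:
# Decompile, edit, and recompile with the device tree compiler.
dtc -I dtb -O dts zero-gravitas.dtb -o zero-gravitas.dts
# In zero-gravitas.dts, add inside the (hypothetical) rng@... node:
#     status = "disabled";
dtc -I dts -O dtb zero-gravitas.dts -o zero-gravitas-norng.dtb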
More generally, running an Arm kernel that's intended for one piece of hardware on a different piece of hardware will usually not work. The sabrelite has the same SoC here, so booting works better than it would if you tried to do it with an entirely unrelated QEMU machine, but if at any time your guest code tries to access hardware outside the SoC which is specific to the reMarkable then you will find it doesn't work. If you really need to get the stock kernel for the hardware to boot you will probably at some point need to bite the bullet and implement a proper machine model of it in QEMU with the relevant devices present.
If you don't actually need to run anything on the guest that cares about the specific differences between one imx6 system and another, you might be able to get away with using a kernel and DTB for the sabrelite board plus the rootfs from the reMarkable.

Why would file checksums inconsistently fail?

I created a ~2MiB file.
dd if=/dev/urandom of=file.bin bs=2M count=1
Then I copied that file a large number of times and generated a checksum for each (identical) copy.
for i in `seq 50000`;
do
name="file.${i}.bin"
cp file.bin "${name}"
sha512sum "${name}" > "${name}.sha512"
done
I then verified all of those checksummed files with a validation script to run sha512sum against each file.
for file in `find . -regex ".*\.sha512"`
do
sha512sum --check --quiet "${file}" || (
cat "${file}" && sha512sum "${file%.sha512}"
)
done
I just created these files, and when I validate them moments later, I see intermittent failures and inconsistencies in the data (console text truncated for readability):
will:/mnt/usb $ for file in `find ...
file.5602.bin: FAILED
sha512sum: WARNING: 1 computed checksum did NOT match
91fc201a3812e93ef3d4890 ... file.5602.bin
b176e8e3ea63a223130f3a0 ... ./file.5602.bin
The checksum files are all identical, since the source files are all identical.
The problem seems to be that my computer is, seemingly at random, generating the wrong checksum for some of my files when I go to validate. A different file fails the checksum every time, and files that previously failed will pass.
will:/mnt/usb $ for file in `find ...
sha512sum: WARNING: 1 computed checksum did NOT match
91fc201a3812e93ef3d4890 ... file.3248.bin
442a1d8805ed134c9ab5252 ... ./file.3248.bin
Keep in mind that all of these files are identical.
I see the same behavior with SATA SSDs and HDDs, and with USB devices; with md5 and sha512; and with xfs, btrfs, ext4, and vfat. I tried live-booting another OS and saw the same strange behavior regardless. I also see that rsync --checksum thinks the checksums are wrong and re-copies these files even though they have not changed.
What could explain this behavior? Since it's happening on multiple devices with all the scenarios I described, I doubt this is bit rot. My kernel logs show no obvious errors. I would assume this is a hardware issue based on my troubleshooting, but how can this be diagnosed? Is it the CPU, the motherboard, the RAM?
What could explain this behavior? How can this be diagnosed?
From what I've read, a number of issues could explain this behavior. Bad disk(s), bad PSU (power supply), bad RAM, filesystem issues.
I tried the following to determine what was happening. I repeated the experiment with different...
Disks
Types of disks (SDD vs HDD)
External drives (3.5 and 2.5 enclosures)
Flash drives (USB 2 and 3 on various ports)
Filesystems (ext4, vfat (fat32), xfs, btrfs)
Different PSU
Different OS (live boot)
Nothing seemed to resolve this.
Finally, I gave memtest86+ v5.0.1 a try via an Ubuntu live USB.
Voilà: it found bad memory. Through a process of elimination, I determined that one of my memory sticks was bad, then tested the other overnight to ensure it was in good shape. I re-ran my experiment, and I am now seeing consistent checksums on all my files.
What a subtle bug. I only noticed this bad behavior by accident. If I hadn't been messing around with file checksums, I do not think I would have found this bad RAM.
This makes me want to schedule a routine that regularly verifies and tests my RAM. A consequence of this bad memory stick is that some of my test data did end up corrupt, but more often than not, the checksum verifications were just intermittent failures.
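For a scheduled spot-check from a running system, memtester is one option (a hedged sketch; it cannot reach RAM the kernel itself occupies, so a bootable memtest86+ run remains the thorough test):
# Lock and test 1 GiB of RAM for 3 passes (root needed for mlock).
sudo memtester 1024M 3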
In one sample data pool, nearly all the checksums start with cb2848ca0e1ff27202a309408ec76..., because the ~50,000 files are identical. There are, however, two corrupt files, and this is not bit rot or on-disk integrity damage: most likely they were written corrupted in the first place, because cp hit the bad RAM when I created them. Those two files consistently return bad checksums of 58fe24f0e00229e8399dc6668b9... and bd85b51065ce5ec31ad7ebf3..., while the other 49,998 files return the same checksum.
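A hedged way to spot such outliers in a pool of supposedly identical files is to group them by checksum:
# Count how many files share each checksum; corrupt copies
# stand out as singletons.
sha512sum file.*.bin | awk '{print $1}' | sort | uniq -c | sort -rn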
This has been a fun, extremely frustrating experiment in debugging.

Google Compute Engine VM instance: VFS: Unable to mount root fs on unknown-block

My instance on Google Compute Engine is not booting up due to some boot-order issues.
So I have created another instance and re-configured my machine.
My questions:
How can I handle these issues when I host some websites?
How can I recover my data from old disk?
Logs:
[ 0.348577] Key type trusted registered
[ 0.349232] Key type encrypted registered
[ 0.349769] AppArmor: AppArmor sha1 policy hashing enabled
[ 0.350351] ima: No TPM chip found, activating TPM-bypass!
[ 0.351070] evm: HMAC attrs: 0x1
[ 0.351549] Magic number: 11:333:138
[ 0.352077] block ram3: hash matches
[ 0.352550] rtc_cmos 00:00: setting system clock to 2015-12-19 17:06:53 UTC (1450544813)
[ 0.353492] BIOS EDD facility v0.16 2004-Jun-25, 0 devices found
[ 0.354108] EDD information not available.
[ 0.536267] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input2
[ 0.537862] md: Waiting for all devices to be available before autodetect
[ 0.538979] md: If you don't use raid, use raid=noautodetect
[ 0.539969] md: Autodetecting RAID arrays.
[ 0.540699] md: Scanned 0 and added 0 devices.
[ 0.541565] md: autorun ...
[ 0.542093] md: ... autorun DONE.
[ 0.542723] VFS: Cannot open root device "sda1" or unknown-block(0,0): error -6
[ 0.543731] Please append a correct "root=" boot option; here are the available partitions:
[ 0.545011] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
[ 0.546199] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.19.0-39-generic #44~14.04.1-Ubuntu
[ 0.547579] Hardware name: Google Google, BIOS Google 01/01/2011
[ 0.548728] ffffea00008ae140 ffff880024ee7db8 ffffffff817af92b 000000000000111e
[ 0.549004] ffffffff81a7c7c8 ffff880024ee7e38 ffffffff817a976b ffff880024ee7dd8
[ 0.549004] ffffffff00000010 ffff880024ee7e48 ffff880024ee7de8 ffff880024ee7e38
[ 0.549004] Call Trace:
[ 0.549004] [] dump_stack+0x45/0x57
[ 0.549004] [] panic+0xc1/0x1f5
[ 0.549004] [] mount_block_root+0x210/0x2a9
[ 0.549004] [] mount_root+0x54/0x58
[ 0.549004] [] prepare_namespace+0x16d/0x1a6
[ 0.549004] [] kernel_init_freeable+0x1f6/0x20b
[ 0.549004] [] ? initcall_blacklist+0xc0/0xc0
[ 0.549004] [] ? rest_init+0x80/0x80
[ 0.549004] [] kernel_init+0xe/0xf0
[ 0.549004] [] ret_from_fork+0x58/0x90
[ 0.549004] [] ? rest_init+0x80/0x80
[ 0.549004] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 0.549004] ---[ end Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
What Causes This?
That is the million-dollar question. After inspecting my GCE VM, I found that 14 different kernels were installed, taking up several hundred MB of space. Most of the kernels didn't have a corresponding initrd.img file and were therefore not bootable (including 3.19.0-39-generic).
I certainly never went around trying to install random kernels, and once removed, they no longer appear as available upgrades, so I'm not sure what happened. Seriously, what happened?
Edit: New response from Google Cloud Support.
I received another disconcerting response. This may explain the additional, errant kernels.
"On rare occasions, a VM needs to be migrated from one physical host to another. In such case, a kernel upgrade and security patches might be applied by Google."
1. "How can I handle these issues when I host some websites?"
My first instinct is to recommend using AWS instead of GCE. However, GCE is less expensive. Before doing any upgrades, make sure you take a snapshot, and try rebooting the server to see if the upgrades broke anything.
2. How can I recover my data from old disk?
Even Better - How to recover your instance...
After several back-and-forth emails, I finally received a response from support that allowed me to resolve the issue. Be mindful, you will have to change things to match your unique VM.
Take a snapshot of the disk first in case we need to roll back any of the changes below.
Edit the properties of the broken instance to disable this option: "Delete boot disk when instance is deleted"
Delete the broken instance.
IMPORTANT: ensure not to select the option to delete the boot disk. Otherwise, the disk will get removed permanently!!
Start up a new temporary instance.
Attach the broken disk (this will appear as /dev/sdb1) to the temporary instance
When the temporary instance is booted up, do the following:
In the temporary instance:
# Run fsck to fix any disk corruption issues
$ sudo fsck.ext4 -a /dev/sdb1
# Mount the disk from the broken vm
$ sudo mkdir /mnt/sdb
$ sudo mount /dev/sdb1 /mnt/sdb/ -t ext4
# Find out the UUID of the broken disk. In this case, the uuid of sdb1 is d9cae47b-328f-482a-a202-d0ba41926661
$ ls -alt /dev/disk/by-uuid/
lrwxrwxrwx. 1 root root 10 Jan 6 07:43 d9cae47b-328f-482a-a202-d0ba41926661 -> ../../sdb1
lrwxrwxrwx. 1 root root 10 Jan 6 05:39 a8cf6ab7-92fb-42c6-b95f-d437f94aaf98 -> ../../sda1
# Update the UUID in grub.cfg (if necessary)
$ sudo vim /mnt/sdb/boot/grub/grub.cfg
Note: This ^^^ is where I deviated from the support instructions.
Instead of modifying all the boot entries to set root=UUID=[uuid character string], I looked for all the entries that set root=/dev/sda1 and deleted them. I also deleted every entry that didn't set an initrd.img file. The top boot entry with correct parameters in my case ended up being 3.19.0-31-generic. But yours may be different.
# Flush all changes to disk
$ sudo sync
# Shut down the temporary instance
$ sudo shutdown -h now
Finally, detach the HDD from the temporary instance, and create a new instance based off of the fixed disk. It will hopefully boot.
Assuming it does boot, you have a lot of work to do. If you have half as many unused kernels as I did, you might want to purge the unused ones (especially since some are likely missing a corresponding initrd.img file).
I used the second answer (the terminal-based one) in this askubuntu question to purge the other kernels.
Note: Make sure you don't purge the kernel you booted in with!
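A hedged sketch of the purge itself (package names are examples taken from this question; never remove the kernel you are running):
# Identify the running kernel first.
uname -r
# List installed kernel packages.
dpkg -l 'linux-image-*' | grep '^ii'
# Purge an unused one, e.g. the initrd-less 3.19.0-39, then refresh grub:
sudo apt-get purge linux-image-3.19.0-39-generic
sudo update-grub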
How to handle these issues when I host some websites?
I'm not sure how you got into this situation, but it would be nice to have additional information (see my comment above) to be able to understand what triggered this issue.
How to recover my data from old disk?
Attach and mount the disk
Assuming you did not delete the original disk when you deleted the instance, you can simply mount this disk from another VM to read the data from it. To do this:
attach the disk to another VM instance, e.g.,
gcloud compute instances attach-disk $INSTANCE --disk $DISK
mount the disk:
sudo mkdir -p /mnt/disks/[MNT_DIR]
sudo mount [OPTIONS] /dev/disk/by-id/google-[DISK_NAME] /mnt/disks/[MNT_DIR]
Note: you'll need to substitute appropriate values for:
MNT_DIR: directory
OPTIONS: options appropriate for your disk and filesystem
DISK_NAME: the id of the disk after you attach it to the VM
Unmounting and detaching the disk
When you are done using the disk, reverse the steps:
Note: Before you detach a non-root disk, unmount the disk first. Detaching a mounted disk might result in incomplete I/O operation and data corruption.
unmount the disk
sudo umount /dev/disk/by-id/google-[DISK_NAME]
detach the disk from the VM:
gcloud compute instances detach-disk $INSTANCE --device-name my-new-device
In my case, grub's first menuentry in /boot/grub/grub.cfg (3.19.0-51-generic) was missing an initrd entry and was unable to boot.
Investigating further, dpkg shows that specific kernel marked as failed and unconfigured:
dpkg -l | grep 3.19.0-51-generic
iF linux-image-3.19.0-51-generic 3.19.0-51.58~14.04.1
iU linux-image-extra-3.19.0-51-generic 3.19.0-51.58~14.04.1
This all stemmed from the Ubuntu image supplied by Google having unattended-upgrades enabled. For some reason the initrd build was killed, and something else later came along and ran update-grub2:
unattended-upgrades-dpkg_2016-03-10_06:49:42.550403.log:update-initramfs: Generating /boot/initrd.img-3.19.0-51-generic
Killed
E: mkinitramfs failure cpio 141 xz -8 --check=crc32 137
unattended-upgrades-dpkg_2016-03-10_06:49:42.550403.log:update-initramfs: failed for /boot/initrd.img-3.19.0-51-generic with 1.
To work around the immediate problem, run:
dpkg --force-confold --configure -a
Although unattended-upgrades in theory is a great idea, having it enabled by default can have unattended consequences.
There are a few cases where the kernel fails to handle an initrd-less boot. Disable the GRUB_FORCE_PARTUUID option so that it boots with an initrd.
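A hedged sketch of that change (on Ubuntu cloud images the setting usually lives in a grub.d snippet; the exact file name is an assumption, so check /etc/default/grub.d/ on your system):
# Comment out the forced-PARTUUID line and regenerate the grub config.
sudo sed -i 's/^GRUB_FORCE_PARTUUID=/#&/' /etc/default/grub.d/40-force-partuuid.cfg
sudo update-grub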

Linux programming: which device a file is in

I would like to know which entry under /dev a file is in. For example, if /dev/sdc1 is mounted under /media/disk, and I ask for /media/disk/foo.txt, I would like to get /dev/sdc as response.
Using stat system call on that file I will get its partition major and minor numbers (8 and 33, for sdc1). Now I need to get the "root" device (sdc) or its major/minor from that. Is there any syscall or library function I could use to link a partition to its main device? Or even better, to get that device directly from the file?
brw-rw---- 1 root floppy 8, 32 2011-04-01 20:00 /dev/sdc
brw-rw---- 1 root floppy 8, 33 2011-04-01 20:00 /dev/sdc1
Thanks in advance!
The quick and dirty version: df $file | awk 'NR == 2 {print $1}'.
Programmatically... well, there's a reason I started with the quick and dirty version. There's no portable way to programmatically get the list of mounted filesystems. (getmntent() gets fstab entries, which is not the same thing.) Moreover, you can't even parse the output of mount(8) reliably; on different Unixes, the mountpoint may be the first or the last item. The most portable way to do this ends up being... parsing df output (And even that is iffy, as you noticed with the partition number.). So you're right back to the quick and dirty shell solution anyway, unless you want to traverse /dev and look for block devices with matching major(st_rdev) (major() being from sys/types.h).
If you restrict this to Linux, you can use /proc/mounts to get the list of mounted filesystems. Other specific Unixes can similarly be optimized: for example, on OS X and I think FreeBSD, you can use sysctl() on the vfs tree to get mountpoints. At worst you can find and use the appropriate header file to decipher whatever the mount table file is (and yes, even that varies: on Solaris it's /etc/mnttab, on many other systems it's /etc/mtab, some systems put it in /var/run instead of /etc, and on many Linuxes it's either nonexistent or a symlink to /proc/mounts). And its format is different on pretty much every Unix-like OS.
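On Linux, a hedged sketch of the /proc/mounts approach is a longest-prefix match of the path against the mount points (naive about mount points containing escaped spaces):
# Print the device whose mount point is the longest prefix of the path.
f=/media/disk/foo.txt
awk -v p="$f" 'index(p, $2) == 1 && length($2) > best { best = length($2); dev = $1 } END { print dev }' /proc/mounts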
The information you want exists in sysfs which exposes the linux device tree. This models the relationships between the devices on the system and since you are trying to determine a parent disk device from a partition, this is the place to look. I don't know if there are any hard and fast rules you can rely on to stop your code breaking with future versions of the kernel, but the kernel developers do try to maintain sysfs as a stable interface.
If you look at /sys/dev/block/<major>:<minor>, you'll see it is a symlink with the tail components being block/<disk-device-name>/<partition-device-name>. If you were to perform a readlink(2) system call on that, you could parse the link destination to get the disk device name. In shell (since it's easier to express this way, but doing it in C will be pretty easy):
$ echo $(basename $(dirname $(readlink /sys/dev/block/8:33)))
sdc
Alternatively, you could take advantage of the nesting of the partition directories in the disk directories (again in shell, but from C it's an open(2), read(2), and close(2)):
$ cat /sys/dev/block/8:33/../dev
8:32
That assumes your starting major:minor is actually for a partition, not some other sort of non-nested device.
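Putting it together from a file path (a hedged sketch: %Hd/%Ld are newer GNU stat format specifiers for the major/minor of the containing device, and the dirname step assumes the device is a partition, per the caveat above):
#!/bin/bash
# file -> major:minor of its filesystem's device -> parent disk via sysfs.
f=${1:-/media/disk/foo.txt}
devnum=$(stat -c '%Hd:%Ld' "$f")           # e.g. 8:33 for sdc1
link=$(readlink "/sys/dev/block/$devnum")  # ...ends in block/sdc/sdc1
echo "/dev/$(basename "$(dirname "$link")")"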
What you are looking for is impossible: there is no 1:1 connection between a block device file and the partition it describes.
Consider:
You can create multiple block device files with different names (but the same major and minor numbers) and they are indistinguishable (N:1)
You can use a block device file as an argument to mount to mount a partition and then delete the block device file leaving the partition mounted. (0:1)
So there is no way to do what you want except in a few specific and narrow cases.
The major number tells you which device it is: 3 for IDE on the 1st controller, 22 for IDE on the 2nd controller, and 8 for SCSI.
The minor number tells you the partition number and, for IDE devices, whether it's the primary or secondary drive. The calculation differs between IDE and SCSI.
For IDE it is x*64 + p, where x is the drive number on the controller (0 or 1) and p is the partition.
For SCSI it is y*16 + p, where y is the drive number and p is the partition.
Not a syscall, but:
df -h /path/to/my/file
From https://unix.stackexchange.com/questions/128471/determine-what-device-a-directory-is-located-on
So you could look at df's source code and see what it does.
I realize this post is old, but this question was the 2nd result in my search and no one has mentioned df -h
