gem5 full system Linux boot fails with "Kernel panic - not syncing: VFS: Unable to mount root fs" - arm

I want to run arm's linux system in gem5's fs mode,I download related files from this address:
http://www.gem5.org/documentation/general_docs/fullsystem/guest_binaries
I was able to configure the correct file path, but finally got this output in the terminal2:
[ 0.661620] No filesystem could mount root, tried:
[ 0.661621] ext3
[ 0.661650] ext4`enter code here`
[ 0.661663] ext2
[ 0.661676] vfat
[ 0.661690] fuseblk
[ 0.661703]
[ 0.661728] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(254,0)
And got this terrible output in the terminal1:
warn: Tried to read RealView I/O at offset 0x60 that doesn't exist
warn: Tried to write RVIO at offset 0xa8 (data 0) that doesn't exist
warn: Kernel panic in simulated kernel
I can provide my command line input, but simply adjusting the configuration inside will only lead to the same result:
./build/ARM/gem5.opt configs/example/arm/starter_fs.py --cpu="minor" --num-cores=4 --disk-image=/home/ad/GEM5/ARM_GEM5/gem5/my_image/disks/aarch64-ubuntu-trusty-headless.img --dtb=/home/ad/GEM5/ARM_GEM5/gem5/my_image/binaries/armv7_gem5_v1_4cpu.dtb --kernel=/home/ad/GEM5/ARM_GEM5/gem5/my_image/binaries/vmlinux
How can I solve this?
Regards,

Here is a full diagnostic procedure for this kind of problem: https://askubuntu.com/questions/41930/kernel-panic-not-syncing-vfs-unable-to-mount-root-fs-on-unknown-block0-0/1048477#1048477
In summary, you have to ensure that:
the kernel has the config to read the disk type, for emulation usually:
CONFIG_VIRTIO_PCI=y
CONFIG_VIRTIO_BLK=y
This seems to be the problem since there was no list of partitions given above? Please confirm. If not, the kernel can't read bytes from the disk it seems.
the kernel has the config to read the filesystem type. You kernel mentions ext2,3,4 though, so likely that's not the problem.
you are pointing the root= kernel CLI to the right partition
See also: https://cirosantilli.com/linux-kernel-module-cheat/#not-syncing That also contains a Buildroot setup that just works.
I also highly recommend that you first get it working on QEMU which boots much faster.

I had a similar problem. Trying to run a full-system emulation of both Ubuntu and Linaro minimal (from the gem5 website) under a 64-bit kernel, with the original starter_fs.py script, gives me this kernel panic:
[ 0.224367] List of all partitions:
[ 0.224394] fe00 1048320 vda
[ 0.224397] driver: virtio_blk
[ 0.224440] fe01 1048288 vda1 00000000-01
[ 0.224441]
[ 0.224480] No filesystem could mount root, tried:
[ 0.224481] ext3
[ 0.224510] ext4
[ 0.224524] ext2
[ 0.224537] squashfs
[ 0.224551] vfat
[ 0.224566] fuseblk
[ 0.224579]
[ 0.224606] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(254,0)
[ 0.224656] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0+ #1
[ 0.224692] Hardware name: V2P-CA15 (DT)
[ 0.224717] Call trace:
[ 0.224741] dump_backtrace+0x0/0x1c0
[ 0.224765] show_stack+0x14/0x20
[ 0.224790] dump_stack+0x8c/0xac
[ 0.224812] panic+0x130/0x288
[ 0.224836] mount_block_root+0x22c/0x294
[ 0.224861] mount_root+0x140/0x174
[ 0.224884] prepare_namespace+0x138/0x180
[ 0.224910] kernel_init_freeable+0x1c0/0x1e0
[ 0.224939] kernel_init+0x10/0x108
[ 0.224961] ret_from_fork+0x10/0x18
[ 0.224987] Kernel Offset: disabled
[ 0.225009] CPU features: 0x21c06492
[ 0.225032] Memory Limit: 2048 MB
[ 0.225056] ---[ end Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(254,0) ]---
Weird thing is that, few weeks ago, it worked like a charm. The problem lying into the specification of the root partition, on the kernel command line. In the starter_fs.py, change this line:
"root=/dev/vda",
By this:
"root=/dev/vda1",
You can see that before, the VirtIO block device was specified. The kernel wants a partition, not a block device. Then, you can run gem5:
build/ARM/gem5.opt -configs/example/arm/starter_fs.py --cpu="hpi" --num-cores=1 --disk-image="linaro-minimal-aarch64.img" --kernel="vmlinux.arm64"
And for me, the kernel panic is gone and I am able to boot my system again:
[ 0.228847] EXT4-fs (vda1): mounted filesystem without journal. Opts: (null)
[ 0.228906] VFS: Mounted root (ext4 filesystem) on device 254:1.
[ 0.229539] devtmpfs: mounted
[ 0.229792] Freeing unused kernel memory: 448K
INIT: version 2.88 booting
[ 0.234168] random: fast init done
Starting udev
[ 0.277039] udevd[715]: starting version 182
[ 0.411534] EXT4-fs (vda1): re-mounted. Opts: block_validity,delalloc,barrier,user_xattr
Starting Bootlog daemon: bootlogd.
[ 0.426573] random: dd: uninitialized urandom read (512 bytes read)
Populating dev cache
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.all.rp_filter = 1
hwclock: can't open '/dev/misc/rtc': No such file or directory
Mon Jan 27 08:00:00 UTC 2014
hwclock: can't open '/dev/misc/rtc': No such file or directory
INIT: Entering runlevel: 5
Configuring network interfaces... ifconfig: SIOCGIFFLAGS: No such device
Starting rpcbind daemon...rpcbind: cannot create socket for udp6
rpcbind: cannot create socket for tcp6
done.
rpcbind: cannot get uid of '': Success
creating NFS state directory: done
starting statd: done
Starting auto-serial-console: done
Stopping Bootlog daemon:
bootlogd.
Last login: Mon Jan 27 08:00:00 UTC 2014 on tty1
INIT: no more processes left in this runlevel
root#genericarmv8:~# id
id
uid=0(root) gid=0(root) groups=0(root)
root#genericarmv8:~#

Related

Bootloading QEMU emulating Freescale i.MX6 Quad SABRE Lite Board

I am trying to start a QEMU sabrelite machine with U-Boot. I am managed to build it according to guide.
I run it as
./qemu-6.1.0/build/qemu-system-arm \
-M sabrelite \
-m 1G \
-kernel u-boot \
-drive file=disk.img,format=raw,id=mycard \
-device sd-card,drive=mycard,bus=sd-bus \
-serial null \
-serial stdio \
-nic user \
-no-reboot
Disk.img is a raw formatted file with the first vfat partition. I tried partition types 0xc 'EFI (FAT-12/16/32)' and 0xef 'W95 FAT32 (LBA)'. It starts with a bundle of errors.
U-Boot 2021.07 (Aug 24 2021 - 19:43:55 +0300)
CPU: Freescale i.MX6Q rev1.0 at 792 MHz
Reset cause: POR
Model: Freescale i.MX6 Quad SABRE Lite Board
Board: SABRE Lite
I2C: ready
DRAM: 1 GiB
force_idle_bus: sda=0 scl=0 sda.gp=0x5c scl.gp=0x55
force_idle_bus: failed to clear bus, sda=0 scl=0
force_idle_bus: sda=0 scl=0 sda.gp=0x6d scl.gp=0x6c
force_idle_bus: failed to clear bus, sda=0 scl=0
force_idle_bus: sda=0 scl=0 sda.gp=0xcb scl.gp=0x5
force_idle_bus: failed to clear bus, sda=0 scl=0
MMC: FSL_SDHC: 0, FSL_SDHC: 1
Loading Environment from MMC... *** Warning - No block device, using default environment
In: serial
Out: serial
Err: serial
Net: using phy at 6
FEC [PRIME]
Error: FEC address not set.
, usb_ether
Error: usb_ether address not set.
starting USB...
Bus usb#2184000: usb dr_mode not found
probe failed, error -22
Bus usb#2184200: probe failed, error -22
No working controllers found
Hit any key to stop autoboot: 0
SATA Device Info:
S/N:
Product model number:
Firmware version:
Capacity: 0 sectors
Device 0: starting USB...
Bus usb#2184000: usb dr_mode not found
probe failed, error -22
Bus usb#2184200: probe failed, error -22
No working controllers found
USB is stopped. Please issue 'usb start' first.
starting USB...
Bus usb#2184000: usb dr_mode not found
probe failed, error -22
Bus usb#2184200: probe failed, error -22
No working controllers found
*** ERROR: `ethaddr' not set
missing environment variable: pxeuuid
missing environment variable: bootfile
Retrieving file: pxelinux.cfg/00000000
*** ERROR: `serverip' not set
…
=> mmcinfo
=> mmc list
FSL_SDHC: 0
FSL_SDHC: 1
The issue is that SD/MMC and USB Storage fails. So, I am not able to load an OS.
I came across a discussion. It mentions that this board have some SD controller specific and that there were SD related bug. But the fix was applied a couple years ago.
I guess there is a mismatch between QEMU and U-Boot SD controller definitions.
But I have no idea how to fix it. I would be relly happy if anyone could give me a hint.
Real i.MX6Q SABRE Lite board boots from SPI-NOR flash that is preloaded with U-Boot. QEMU emulates it as sst25vf016b device. I tried to load it with a proprietary bootloader as
-device sst25vf016b,drive=spi \
-drive if=none,file=flash.bin,format=raw,id=spi \
But it doesn’t work. Is it possible to bootload a QEMU sabrelite machine this way?
Some time later...
I built a linux kernel and started it with rootfs from an SD card. QEMU 5.1.0, 5.2.0, 6.0.0 and 6.1.0 plus linux handles SD/MMC well.
[ 10.400092] sdhci: Secure Digital Host Controller Interface driver
[ 10.401012] sdhci: Copyright(c) Pierre Ossman
[ 10.401982] sdhci-pltfm: SDHCI platform and OF driver helper
[ 10.425404] caam 2100000.crypto: device ID = 0x0000000000000000 (Era -524)
[ 10.427688] sdhci-esdhc-imx 2198000.mmc: Got CD GPIO
[ 10.427954] caam 2100000.crypto: job rings = 2, qi = 0
[ 10.429650] sdhci-esdhc-imx 2198000.mmc: Got WP GPIO
[ 10.435224] sdhci-esdhc-imx 219c000.mmc: Got CD GPIO
[ 10.459144] caam_jr 2101000.jr: failed to flush job ring 0
[ 10.472458] caam_jr: probe of 2101000.jr failed with error -5
[ 10.481084] caam_jr 2102000.jr: failed to flush job ring 1
[ 10.488221] caam_jr: probe of 2102000.jr failed with error -5
[ 10.507965] usbcore: registered new interface driver usbhid
[ 10.515781] usbhid: USB HID core driver
[ 10.571593] mmc2: SDHCI controller on 2198000.mmc [2198000.mmc] using ADMA
[ 10.583693] mmc3: SDHCI controller on 219c000.mmc [219c000.mmc] using ADMA
[ 10.590880] ipu1_csi0: Registered ipu1_csi0 capture as /dev/video0
[ 10.611980] ipu1_ic_prpenc: Registered ipu1_ic_prpenc capture as /dev/video1
[ 10.622227] ipu1_ic_prpvf: Registered ipu1_ic_prpvf capture as /dev/video2
[ 10.643235] ipu1_csi1: Registered ipu1_csi1 capture as /dev/video3
[ 10.655850] ipu2_csi0: Registered ipu2_csi0 capture as /dev/video4
[ 10.660920] ipu2_ic_prpenc: Registered ipu2_ic_prpenc capture as /dev/video5
[ 10.670717] ipu2_ic_prpvf: Registered ipu2_ic_prpvf capture as /dev/video6
[ 10.690556] ipu2_csi1: Registered ipu2_csi1 capture as /dev/video7
[ 10.702178] mmc3: host does not support reading read-only switch, assuming write-enable
[ 10.710571] mmc3: new high speed SD card at address 4567
[ 10.729816] mmcblk3: mmc3:4567 QEMU! 32.0 MiB
I bet that it is a U-Boot issue. On the other hand I believe that U-Boot works with real hardware. Probably it is worth it to use U-Boot from Boundary? Can anyone with a real board confirm what U-Boot version does it support?

Gem5 ARM FS - Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(8,1) [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers. If you believe the question would be on-topic on another Stack Exchange site, you can leave a comment to explain where the question may be able to be answered.
Closed 2 years ago.
Improve this question
I'm trying to run my first full-system simulation in Gem5, but I got the following error
[ 5.703750] VFS: Cannot open root device "sda1" or unknown-block(8,1): error -6
[ 5.704398] Please append a correct "root=" boot option; here are the available partitions:
[ 5.706209] 0800 252 sda
[ 5.706370] driver: sd
[ 5.707223] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(8,1)
[ 5.708000] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0+ #1
[ 5.708570] Hardware name: V2P-CA15 (DT)
[ 5.709020] Call trace:
[ 5.709601] dump_backtrace+0x0/0x1c0
[ 5.710230] show_stack+0x14/0x20
[ 5.710844] dump_stack+0x8c/0xac
[ 5.711375] panic+0x130/0x288
[ 5.711980] mount_block_root+0x1a8/0x294
[ 5.712617] mount_root+0x140/0x174
[ 5.713243] prepare_namespace+0x138/0x180
[ 5.713891] kernel_init_freeable+0x1c0/0x1e0
[ 5.714489] kernel_init+0x10/0x108
[ 5.715069] ret_from_fork+0x10/0x18
[ 5.715584] Kernel Offset: disabled
[ 5.716051] CPU features: 0x21c0649a
[ 5.716509] Memory Limit: 512 MB
[ 5.717254] ---[ end Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(8,1) ]---
And the output from the gem5 command is
warn: Tried to write RVIO at offset 0xa8 (data 0) that doesn't exist
warn: Tried to read RealView I/O at offset 0x8 that doesn't exist
warn: Tried to read RealView I/O at offset 0x48 that doesn't exist
warn: EnergyCtrl: Disabled handler, ignoring read from reg 0
info: Dumping kernel dmesg buffer to system.workload.dmesg...
warn: Kernel panic in simulated kernel
I am using
Gem5 20.1.0.2 (stable branch)
cmd: ./build/ARM/gem5.opt configs/example/fs.py --cpu-type=TimingSimpleCPU --cpu-clock=1GHz --kernel=binaries/vmlinux.arm64 --disk-image=disks/m5_exit.squashfs.arm64 --bootloader=binaries/boot.arm64
The kernel and disk image is taken from gem5 guest binary page, I took the first link (Latest Linux Kernel Image / Bootloader)
I notice this question but the error is slightly different, and the running command line, kernel, image is different.
How can I solve this issue?
This answer to the question you mentioned points to: https://askubuntu.com/questions/41930/kernel-panic-not-syncing-vfs-unable-to-mount-root-fs-on-unknown-block0-0/1048477#1048477 which contains a detailed diagnosis procedure for this error.
For that and from the kernel message, we see clearly that root= kernel CLI parameter is incorrect: the default sda1 was used instead of the required sda.
On fs.py, the correct root= can be set with:
--command-line 'earlyprintk=pl011,0x1c090000 lpj=19988480 rw loglevel=8 mem=256MB root=/dev/sda console_msg_format=syslog nokaslr norandmaps panic=-1 printk.devkmsg=on printk.time=y rw console=ttyAMA0 - lkmc_home=/lkmc'
I think there isn't a way to set just root= in fs.py , so all the other options were copied from the default fs.py kernel CLI, and just --root was modified. This can also be seen here.

emulating the reMarkable tablet (i.MX6 ARMv7) with Qemu

I'm trying to emulate the reMarkable tablet with Qemu in order to create a proper development environment for it, instead of cross-compiling and sending to the hardware device.
The firmware flasher repo contains the rootfs, kernel, DTB and u-boot files. I've created an .img file from the rootfs in order to boot it in Qemu with the following command:
qemu-system-arm \
-M sabrelite \
-bios "files/u-boot.imx" \
-kernel "zImage" \
-append "console=ttymxc0 rootfstype=ext4 root=/dev/mmcblk1p2 rw rootwait init=/bin/bash loglevel=8 bootmem-debug earlyprintk" \
-dtb "zero-gravitas.dtb" \
-drive file="floppy.img",format=raw,id=mmcblk1p2 \
-device sd-card,drive=mmcblk1p2
but the kernel does not seem to be starting as I have the same log whether the floppy.img file (drive+device) is given or not. The startup loops on this error:
[ 0.713093] 2020000.serial: ttymxc0 at MMIO 0x2020000 (irq = 19, base_baud = 5000000) is a IMX
[ 0.732268] console [ttymxc0] enabled
[ 0.736333] phy index low: 1, phy index high: 2
[ 240.289647] INFO: task swapper:1 blocked for more than 120 seconds.
[ 240.290160] Not tainted 4.1.28-zero-gravitas-01866-ge0b823726ea4-dirty #82
[ 240.290318] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 240.290662] swapper D 8051c44c 0 1 0 0x00000000
[ 240.292245] [<8051c44c>] (__schedule) from [<8051c73c>] (schedule+0x40/0x98)
[ 240.292473] [<8051c73c>] (schedule) from [<8051e7b8>] (schedule_timeout+0x114/0x168)
[ 240.292781] [<8051e7b8>] (schedule_timeout) from [<8051d248>] (wait_for_common+0x88/0x130)
[ 240.292953] [<8051d248>] (wait_for_common) from [<80262c74>] (imx_rng_init+0x158/0x2a8)
[ 240.293117] [<80262c74>] (imx_rng_init) from [<80262574>] (set_current_rng+0xc0/0x15c)
[ 240.293276] [<80262574>] (set_current_rng) from [<80262874>] (hwrng_register+0x190/0x1b8)
[ 240.293436] [<80262874>] (hwrng_register) from [<807c3fd8>] (imx_rng_probe+0xd4/0x134)
[ 240.293682] [<807c3fd8>] (imx_rng_probe) from [<802748e0>] (platform_drv_probe+0x44/0xac)
[ 240.293852] [<802748e0>] (platform_drv_probe) from [<802735ac>] (driver_probe_device+0x178/0x2b8)
[ 240.294009] [<802735ac>] (driver_probe_device) from [<802737bc>] (__driver_attach+0x8c/0x90)
[ 240.294158] [<802737bc>] (__driver_attach) from [<80271d50>] (bus_for_each_dev+0x68/0x9c)
[ 240.294352] [<80271d50>] (bus_for_each_dev) from [<802726bc>] (bus_add_driver+0x13c/0x1e4)
[ 240.294600] [<802726bc>] (bus_add_driver) from [<80273ed4>] (driver_register+0x78/0xf8)
[ 240.294843] [<80273ed4>] (driver_register) from [<807c434c>] (__platform_driver_probe+0x20/0x70)
[ 240.295092] [<807c434c>] (__platform_driver_probe) from [<807a9d78>] (do_one_initcall+0x118/0x1c4)
[ 240.295367] [<807a9d78>] (do_one_initcall) from [<807a9f48>] (kernel_init_freeable+0x124/0x1c4)
[ 240.295609] [<807a9f48>] (kernel_init_freeable) from [<8051883c>] (kernel_init+0x8/0xe8)
[ 240.295844] [<8051883c>] (kernel_init) from [<8000ef88>] (ret_from_fork+0x14/0x2c)
full log here
I will update this question when I have new findings, but i'm new to Qemu and I'm quite stuck and ran out of options. The repository i'm working in is here. Thanks for any input !
I haven't investigated too closely, but the fact that the backtrace shows a hang in the imx_rng_init function suggests that the problem is that QEMU doesn't have an emulation of the imx SoC's builtin RNG device, and so the guest is hanging forever waiting for a response from hardware that doesn't exist.
You'll need to either implement a model of that device, or else use a guest kernel which doesn't try to probe for that device.
More generally, running an Arm kernel that's intended for one piece of hardware on a different piece of hardware will usually not work. The sabrelite has the same SoC here, so booting works better than it would if you tried to do it with an entirely unrelated QEMU machine, but if at any time your guest code tries to access hardware outside the SoC which is specific to the reMarkable then you will find it doesn't work. If you really need to get the stock kernel for the hardware to boot you will probably at some point need to bite the bullet and implement a proper machine model of it in QEMU with the relevant devices present.
If you don't actually need to run anything on the guest that cares about the specific differences between one imx6 system and another, you might be able to get away with using a kernel and DTB for the sabrelite board plus the rootfs from the reMarkable.

Why EXT4/JBD2 after mounted keeps calling ext4_journal_stop?

I was investigating the journaling layer used in the EXT4 (JBD2) and I added some printk to see the behavior of the ext4_journal_start and ext4_journal_stop functions being called.
This is the procedure:
I first format a given partition using:
sudo mke2fs -t ext4 /dev/vdb
(I am using QEMU to run this experiment)
Then I mount it:
sudo mount /dev/vdb /mnt/mydisk
That is the normal procedure for mounting, but when I mount it, because of my printk's functions in both ext4_journal_start/stop, the dmesg shows a lot of calls to journal_stop without any journal_start.
P.S.: I should guess that it is some background behavior of EXT4 or something, but I have no idea what is it.
Here is the dmesg output:
* Restoring resolver state... [ OK ]
* Stopping System V runlevel compatibility [ OK ]
[ 124.648904] JOURNAL STOP
[ 124.778691] JOURNAL STOP
...
[... ] # it is called maybe more than 40 times
...
[ 129.641895] jbd2_journal_commit_transaction
[ 129.769132] JOURNAL STOP
...
[... ] # it is called maybe more than 40 times
...
[ 134.766164] jbd2_journal_commit_transaction
After 134 seconds, it stops these messages, and then I try to write some file into that mounting point, and it behaves as expected.
[ 624.995549] JOURNAL START
[ 624.996849] JOURNAL STOP
[ 625.000676] JOURNAL START
[ 625.001757] JOURNAL START
[ 625.002822] JOURNAL STOP
[ 625.003773] JOURNAL STOP
[ 631.004110] jbd2_journal_commit_transaction
So, it is strange that after mounting, even that I did absolutely nothing, these functions are being called (journal_stop) several times and, furthermore, after two commits (the function call jbd2_journal_commit_transaction) the dmesg gets stable, and it then follows an expected behavior.
To make it clear, my question is: what causes this several calls without any reason (the ext4_journal_stop)?
By debugging the ext4 source code, I discovered what causes the several journal operations right after mounting the file system.
The ext4 is creating the inode table, so that is why the journal is called several times right after mounting a partition.

Google Compute Engine VM instance: VFS: Unable to mount root fs on unknown-block

My instance on Google Compute Engine is not booting up due to having some boot order issues.
So, I have created a another instance and re-configured my machine.
My questions:
How can I handle these issues when I host some websites?
How can I recover my data from old disk?
logs
[ 0.348577] Key type trusted registered
[ 0.349232] Key type encrypted registered
[ 0.349769] AppArmor: AppArmor sha1 policy hashing enabled
[ 0.350351] ima: No TPM chip found, activating TPM-bypass!
[ 0.351070] evm: HMAC attrs: 0x1
[ 0.351549] Magic number: 11:333:138
[ 0.352077] block ram3: hash matches
[ 0.352550] rtc_cmos 00:00: setting system clock to 2015-12-19 17:06:53 UTC (1450544813)
[ 0.353492] BIOS EDD facility v0.16 2004-Jun-25, 0 devices found
[ 0.354108] EDD information not available.
[ 0.536267] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input2
[ 0.537862] md: Waiting for all devices to be available before autodetect
[ 0.538979] md: If you don't use raid, use raid=noautodetect
[ 0.539969] md: Autodetecting RAID arrays.
[ 0.540699] md: Scanned 0 and added 0 devices.
[ 0.541565] md: autorun ...
[ 0.542093] md: ... autorun DONE.
[ 0.542723] VFS: Cannot open root device "sda1" or unknown-block(0,0): error -6
[ 0.543731] Please append a correct "root=" boot option; here are the available partitions:
[ 0.545011] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
[ 0.546199] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.19.0-39-generic #44~14.04.1-Ubuntu
[ 0.547579] Hardware name: Google Google, BIOS Google 01/01/2011
[ 0.548728] ffffea00008ae140 ffff880024ee7db8 ffffffff817af92b 000000000000111e
[ 0.549004] ffffffff81a7c7c8 ffff880024ee7e38 ffffffff817a976b ffff880024ee7dd8
[ 0.549004] ffffffff00000010 ffff880024ee7e48 ffff880024ee7de8 ffff880024ee7e38
[ 0.549004] Call Trace:
[ 0.549004] [] dump_stack+0x45/0x57
[ 0.549004] [] panic+0xc1/0x1f5
[ 0.549004] [] mount_block_root+0x210/0x2a9
[ 0.549004] [] mount_root+0x54/0x58
[ 0.549004] [] prepare_namespace+0x16d/0x1a6
[ 0.549004] [] kernel_init_freeable+0x1f6/0x20b
[ 0.549004] [] ? initcall_blacklist+0xc0/0xc0
[ 0.549004] [] ? rest_init+0x80/0x80
[ 0.549004] [] kernel_init+0xe/0xf0
[ 0.549004] [] ret_from_fork+0x58/0x90
[ 0.549004] [] ? rest_init+0x80/0x80
[ 0.549004] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 0.549004] ---[ end Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
What Causes This?
That is the million dollar question. After inspecting my GCE VM, I found out there were 14 different kernels installed taking up several hundred MB's of space. Most of the kernels didn't have a corresponding initrd.img file, and were therefore not bootable (including 3.19.0-39-generic).
I certainly never went around trying to install random kernels, and once removed, they no longer appear as available upgrades, so I'm not sure what happened. Seriously, what happened?
Edit: New response from Google Cloud Support.
I received another disconcerting response. This may explain the additional, errant kernels.
"On rare occasions, a VM needs to be migrated from one physical host to another. In such case, a kernel upgrade and security patches might be applied by Google."
1. "How can I handle these issues when I host some websites?"
My first instinct is to recommend using AWS instead of GCE. However, GCE is less expensive. Before doing any upgrades, make sure you take a snapshot, and try rebooting the server to see if the upgrades broke anything.
2. How can I recover my data from old disk?
Even Better - How to recover your instance...
After several back-and-forth emails, I finally received a response from support that allowed me to resolve the issue. Be mindful, you will have to change things to match your unique VM.
Take a snapshot of the disk first in case we need to roll back any of the changes below.
Edit the properties of the broken instance to disable this option: "Delete boot disk when instance is deleted"
Delete the broken instance.
IMPORTANT: ensure not to select the option to delete the boot disk. Otherwise, the disk will get removed permanently!!
Start up a new temporary instance.
Attach the broken disk (this will appear as /dev/sdb1) to the temporary instance
When the temporary instance is booted up, do the following:
In the temporary instance:
# Run fsck to fix any disk corruption issues
$ sudo fsck.ext4 -a /dev/sdb1
# Mount the disk from the broken vm
$ sudo mkdir /mnt/sdb
$ sudo mount /dev/sdb1 /mnt/sdb/ -t ext4
# Find out the UUID of the broken disk. In this case, the uuid of sdb1 is d9cae47b-328f-482a-a202-d0ba41926661
$ ls -alt /dev/disk/by-uuid/
lrwxrwxrwx. 1 root root 10 Jan 6 07:43 d9cae47b-328f-482a-a202-d0ba41926661 -> ../../sdb1
lrwxrwxrwx. 1 root root 10 Jan 6 05:39 a8cf6ab7-92fb-42c6-b95f-d437f94aaf98 -> ../../sda1
# Update the UUID in grub.cfg (if necessary)
$ sudo vim /mnt/sdb/boot/grub/grub.cfg
Note: This ^^^ is where I deviated from the support instructions.
Instead of modifying all the boot entries to set root=UUID=[uuid character string], I looked for all the entries that set root=/dev/sda1 and deleted them. I also deleted every entry that didn't set an initrd.img file. The top boot entry with correct parameters in my case ended up being 3.19.0-31-generic. But yours may be different.
# Flush all changes to disk
$ sudo sync
# Shut down the temporary instance
$ sudo shutdown -h now
Finally, detach the HDD from the temporary instance, and create a new instance based off of the fixed disk. It will hopefully boot.
Assuming it does boot, you have a lot of work to do. If you have half as many unused kernels as me, then you might want to purge the unused ones (especially since some are likely missing a corresponding initrd.img file).
I used the second answer (the terminal-based one) in this askubuntu question to purge the other kernels.
Note: Make sure you don't purge the kernel you booted in with!
How to handle these issues when I host some websites?
I'm not sure how you got into this situation, but it would be nice to have additional information (see my comment above) to be able to understand what triggered this issue.
How to recover my data from old disk?
Attach and mount the disk
Assuming you did not delete the original disk when you deleted the instance, you can simply mount this disk from another VM to read the data from it. To do this:
attach the disk to another VM instance, e.g.,
gcloud compute instances attach-disk $INSTANCE --disk $DISK
mount the disk:
sudo mkdir -p /mnt/disks/[MNT_DIR]
sudo mount [OPTIONS] /dev/disk/by-id/google-[DISK_NAME] /mnt/disks/[MNT_DIR]
Note: you'll need to substitute appropriate values for:
MNT_DIR: directory
OPTIONS: options appropriate for your disk and filesystem
DISK_NAME: the id of the disk after you attach it to the VM
Unmounting and detaching the disk
When you are done using the disk, reverse the steps:
Note: Before you detach a non-root disk, unmount the disk first. Detaching a mounted disk might result in incomplete I/O operation and data corruption.
unmount the disk
sudo umount /dev/disk/by-id/google-[DISK_NAME]
detach the disk from the VM:
gcloud compute instances detach-disk $INSTANCE --device-name my-new-device
In my case grub's (/boot/grub/grub.cfg) first menuentry (3.19.0-51-generic) was missing an initrd entry and was unable to boot.
Upon further investigating, looking at dpkg for the specific kernel its marked as failed and unconfigured
dpkg -l | grep 3.19.0-51-generic
iF linux-image-3.19.0-51-generic 3.19.0-51.58~14.04.1
iU linux-image-extra-3.19.0-51-generic 3.19.0-51.58~14.04.1
This all stemmed from the Ubuntu image supplied by Google having unattended-upgrades enabled. For some reason the initrd was killed when it was being built and something else came along and ran update-grub2.
unattended-upgrades-dpkg_2016-03-10_06:49:42.550403.log:update-initramfs: Generating /boot/initrd.img-3.19.0-51-generic
Killed
E: mkinitramfs failure cpio 141 xz -8 --check=crc32 137
unattended-upgrades-dpkg_2016-03-10_06:49:42.550403.log:update-initramfs: failed for /boot/initrd.img-3.19.0-51-generic with 1.
To work around the immediate problem run.
dpkg --force-confold --configure -a
Although unattended-upgrades in theory is a great idea, having it enabled by default can have unattended consequences.
There are a few cases where the kernel fails to handle the initrdless boot. Disable the GRUB_FORCE_PARTUUID options so that it boots with initrd.

Resources