My system runs ZFS and lately has been dropping to the initramfs / busybox prompt on boot. I had a hard time finding a fleshed-out guide on how to mount ZFS in a live environment for performing disaster recovery tasks like chroot and GRUB repair, so I thought I’d write something up.
My system was dropping to the busybox prompt right after the GRUB menu. I started experiencing the issue after a routine apt upgrade: I rebooted and wasn’t able to get any of my initramfs images to boot. That seemed a little strange, because usually the inability to boot is limited to a new initramfs – e.g. an older kernel will still have the ZFS drivers and the other components necessary to boot, while the newer ones (the ones just installed) are missing them for whatever reason.
First of all, burn yourself a copy of a live USB, and boot into it. Luckily, the newest version of Ubuntu (22.04 – Jammy Jellyfish) has the ZFS drivers and executables installed by default, unlike prior versions where you had to add the multiverse repo manually, download the packages, and enable the ZFS drivers using modprobe.
A peek at lsmod shows the ZFS drivers are indeed loaded, and lo-and-behold, there are the zpool and zfs executables:
ubuntu@ubuntu:~$ lsmod | grep zfs
zfs 3751936 29
zunicode 348160 1 zfs
zzstd 487424 1 zfs
zlua 155648 1 zfs
zavl 20480 1 zfs
icp 319488 1 zfs
zcommon 102400 2 zfs,icp
znvpair 94208 2 zfs,zcommon
spl 122880 6 zfs,icp,zzstd,znvpair,zcommon,zavl
ubuntu@ubuntu:~$ which {zpool,zfs}
/usr/sbin/zpool
/usr/sbin/zfs
The drive I am diagnosing is the internal NVMe, so there’s no need to attach it. One question I had was how to mount the two pools, and in what order. By default, Ubuntu creates an rpool for the root partition and a bpool for the boot partition.
Generally, on an EFI system, one would mount the root partition in a clean directory like /mnt first, then mount boot at /mnt/boot once that directory is provided by the root partition, and then mount efi at /mnt/boot/efi once that is provided by the boot partition. The order of mounting these partitions is therefore of paramount importance, but as there are only three of them, it’s not too complicated.
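On a conventional (non-ZFS) EFI system, that ordering would look something like this – the device names here are just placeholders, and the real ones come from lsblk further down:
# mount /dev/sdX2 /mnt (root first)
# mount /dev/sdX3 /mnt/boot (then boot, into the directory the root partition provides)
# mount /dev/sdX1 /mnt/boot/efi (then the EFI partition, into the directory the boot partition provides)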
You’ll need to be root for basically all of these commands. Using sudo su without a password will typically get you to a root prompt (#) in a live environment.
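On the stock live session that looks something like this:
ubuntu@ubuntu:~$ sudo su
root@ubuntu:/home/ubuntu#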
TL;DR – probably way more than you ever wanted to know about an lsblk device list:
First, we should identify the storage devices using lsblk -f (the -f flag includes the filesystem information, which is important for our purposes):
# lsblk -f
NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS
loop0
squash 4.0 0 100% /rofs
loop1
squash 4.0 0 100% /snap/bare/5
loop2
squash 4.0 0 100% /snap/core20/1405
loop3
squash 4.0 0 100% /snap/snapd/15177
loop4
squash 4.0 0 100% /snap/snap-store/575
loop5
squash 4.0 0 100% /snap/gtk-common-themes/1534
loop6
squash 4.0 0 100% /snap/firefox/1232
loop7
squash 4.0 0 100% /snap/snapd-desktop-integration/10
loop8
squash 4.0 0 100% /snap/gnome-3-38-2004/99
sda iso966 Jolie Ubuntu 22.04 LTS amd64
│ 2022-04-19-10-23-19-00
├─sda1
│ iso966 Jolie Ubuntu 22.04 LTS amd64
│ 2022-04-19-10-23-19-00 0 100% /cdrom
├─sda2
│ vfat FAT12 ESP 8D6C-A9F8
├─sda3
│
└─sda4
ext4 1.0 writable
bb277d84-75cc-473b-b327-fd885d85889a 24.5G 0% /var/crash
/var/log
zd0 btrfs b6239f8a-058b-4a6c-8258-b9a7b50f6c23
zd16
└─zd16p1
btrfs d6074499-b9aa-47e0-a08a-58e27c73e771
zd32 btrfs c68aa9ca-933a-48cb-9adb-22fd6a8ca8c8
zd48
└─zd48p1
btrfs f52702bd-c805-4edc-87d1-6fb877ee6738
nvme1n1
│
├─nvme1n1p1
│ vfat FAT32 B045-5C3B
├─nvme1n1p2
│ swap 1 584b9b78-7d8d-4a5a-9263-d6f6a48adc6b
├─nvme1n1p3
│ zfs_me 5000 bpool 11241115695889536197
└─nvme1n1p4
zfs_me 5000 rpool 16130566787573079380
nvme0n1
│
├─nvme0n1p1
│ vfat FAT32 EC9D-0344
├─nvme0n1p2
│
├─nvme0n1p3
│ ntfs A4EEBDB4EEBD7F5C
└─nvme0n1p4
ntfs 989EE7E99EE7BDBE
OK, there’s a lot there, so what are we looking at? Well, the first nine devices that say loop are snaps, since we’re on Ubuntu. They’re responsible for storing some of the programs run by the OS, and each one gets its own virtual storage device, sometimes referred to as an “overlay”. They create a fair amount of clutter in our device list, but that’s about all – you can ignore them.
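(If you ever want to see which snap file backs a given loop device, losetup will list the mappings – purely an aside, not needed for the recovery:)
# losetup -l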
Then, /dev/sda is the copy of the Ubuntu ISO we booted from – you can see how it says cdrom there, and iso9660 (the CD-ROM spec). It’s read-only, so we couldn’t do anything with it if we wanted to, and we don’t, so let’s move on…
There’s also a writable partition mounted at /var/log and /var/crash, which is kind of interesting. I imagine the live image sets that up since you can’t write to the ISO filesystem itself, seeing as it’s a virtual CD-ROM, and CD-ROMs are read-only. Then there’s a bunch of what are called “zvols” (the zd0, zd16, etc. devices – see those?). Those are virtual block devices created within ZFS that are isolated from the rest of the filesystem; you can use them just like any other block device, but in this context they’re typically either formatted with a different filesystem or shared over iSCSI for block-level access. You can see these ones say btrfs – they were actually created for use with container runtimes, namely podman and systemd-container, both of which support btrfs very well and ZFS either nominally or not at all.
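(If you’re curious which datasets those zd* devices belong to, you can list just the volumes once the pools are imported – another aside, not needed for the recovery itself:)
# zfs list -t volume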
Now we get to nvme1n1 – this is the first NVMe drive listed. Generally nvme0 would be listed first, but for some reason it shows up second here. The 1 in nvme1 identifies the drive (the second NVMe drive in the laptop, counting from zero), n1 is the namespace, and after that the partitions are listed as p1, p2, p3, and so on. Here’s the drive in isolation:
nvme1n1
│
├─nvme1n1p1
│ vfat FAT32 B045-5C3B
├─nvme1n1p2
│ swap 1 584b9b78-7d8d-4a5a-9263-d6f6a48adc6b
├─nvme1n1p3
│ zfs_me 5000 bpool 11241115695889536197
└─nvme1n1p4
zfs_me 5000 rpool 16130566787573079380
The canonical addresses for these partitions are /dev/nvme1n1p{1,2,3,4}. The /dev (device) folder, while not listed in this output, is important to include, as the full path is required for mounting a partition. The curly braces are just shell brace expansion – shorthand for referring to all four partition paths at once (illustrated below) – not something mount itself understands; you’ll be mounting one partition at a time anyway, usually in different locations (e.g. /mnt, /mnt/boot).
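If brace expansion is new to you, here’s a quick illustration – this is purely shell behavior, nothing ZFS-specific, and lsblk will also accept a single device argument if you want to cut the clutter:
# echo /dev/nvme1n1p{1,2,3,4}
/dev/nvme1n1p1 /dev/nvme1n1p2 /dev/nvme1n1p3 /dev/nvme1n1p4
# lsblk -f /dev/nvme1n1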
If you remember back at the start, I mentioned the rpool and bpool. These live on /dev/nvme1n1p4 and /dev/nvme1n1p3 respectively. If the disk were formatted with a conventional filesystem such as ext4 (Ubuntu’s default), the root partition could be mounted by attaching /dev/nvme1n1p4 to an empty folder. The command would therefore be:
# mount /dev/nvme1n1p4 /mnt
And then you’d be able to ls /mnt and see the files contained on your newly mounted root partition. E.g.:
# ls /mnt
Qogir boot dev home lib32 libx32 mnt proc run snap sys usr
bin cdrom etc lib lib64 media opt root sbin srv tmp var
But this NVMe is formatted using ZFS. So what to do? That’s the process I had difficulty finding documentation for, which is what inspired this blog post.
End TL;DR – here’s the ZFS-specific stuff again:
First, confirm that your ZFS modules are loaded by checking the list of loaded kernel modules, and that the ZFS executables are available in your PATH (here’s the syntax again so you don’t have to scroll back):
# lsmod | grep zfs
zfs 3751936 29
zunicode 348160 1 zfs
zzstd 487424 1 zfs
zlua 155648 1 zfs
zavl 20480 1 zfs
icp 319488 1 zfs
zcommon 102400 2 zfs,icp
znvpair 94208 2 zfs,zcommon
spl 122880 6 zfs,icp,zzstd,znvpair,zcommon,zavl
# which {zpool,zfs}
/usr/sbin/zpool
/usr/sbin/zfs
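One thing worth knowing before going further: running zpool import with no arguments just scans for pools and lists what it finds (names, IDs, and state) without importing anything, so it’s a safe way to confirm the live environment can actually see rpool and bpool:
# zpool import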
Here’s where it’s different from your typical mount. You use zpool to import rpool, but you need to import it with an alternate root (at /mnt) – otherwise it’ll try to mount itself over your live environment! Then confirm that the import worked:
# zpool import -f rpool -R /mnt
# ls /mnt
Qogir boot dev home lib32 libx32 mnt proc run snap sys usr
bin cdrom etc lib lib64 media opt root sbin srv tmp var
OK, that went well. You can see that we now have a /mnt/boot folder – that’s the boot directory inside rpool, which is where the initramfs images live, but they’re actually stored in the bpool. We needed that folder to be available to mount our bpool into. So let’s import bpool with the same alternate root, /mnt – its datasets carry their own /boot mountpoint, so they end up in /mnt/boot. (If we didn’t use an alternate root, it would try to mount over the live environment’s own /boot. Note that this syntax is correct – ZFS has a slightly different method of dealing with mounts and folders than GNU software; if curious, see the related issue with rsync: https://superuser.com/questions/1425744/why-is-rsync-creating-a-new-subdirectory):
# zpool import -f bpool -R /mnt
# ls /mnt/boot
config-5.15.32-xanmod1 memtest86+_multiboot.bin
config-5.15.34-xanmod1 System.map-5.15.32-xanmod1
config-5.15.36-xanmod1 System.map-5.15.34-xanmod1
config-5.17.0-xanmod1 System.map-5.15.36-xanmod1
config-5.17.1-xanmod1 System.map-5.17.0-xanmod1
config-5.17.3-xanmod1 System.map-5.17.1-xanmod1
config-5.17.5-xanmod1 System.map-5.17.3-xanmod1
config-5.17.9-xanmod1 System.map-5.17.5-xanmod1
config-5.17.9-xanmod1-x64v2 System.map-5.17.9-xanmod1
efi System.map-5.17.9-xanmod1-x64v2
grub vmlinuz
initrd.img vmlinuz-5.15.32-xanmod1
initrd.img-5.15.32-xanmod1 vmlinuz-5.15.34-xanmod1
initrd.img-5.15.34-xanmod1 vmlinuz-5.15.36-xanmod1
initrd.img-5.17.0-xanmod1 vmlinuz-5.17.0-xanmod1
initrd.img-5.17.1-xanmod1 vmlinuz-5.17.1-xanmod1
initrd.img-5.17.3-xanmod1 vmlinuz-5.17.3-xanmod1
initrd.img-5.17.5-xanmod1 vmlinuz-5.17.5-xanmod1
initrd.img.old vmlinuz-5.17.9-xanmod1
memtest86+.bin vmlinuz-5.17.9-xanmod1-x64v2
memtest86+.elf vmlinuz.old
That looks like a bunch of initramfs files to me! Good – so the early-boot images that GRUB loads are all still there.
If you look in that list, you’ll also see both efi and grub folders. Both of those are empty, waiting for storage to be attached. The EFI partition is the first partition on the same NVMe drive and is formatted with FAT, while grub is a bind mount (you can see it in /etc/fstab):
# mount -t msdos /dev/nvme1n1p1 /mnt/boot/efi
You can also use the UUID from lsblk if you prefer (just use one or the other, not both):
# mount -t msdos UUID=B045-5C3B /mnt/boot/efi
# ls /mnt/boot/efi
efi grub system~1 (confirm it's mounted)
# grep grub /mnt/etc/fstab
/boot/efi/grub /boot/grub none defaults,bind 0 0
(we'll bind-mount this in the next step)
Then you’ll want to mount a few system folders inside your drive’s filesystem so you can access them inside the chroot (required for things to work OK):
# for i in proc dev sys dev/pts; do mount -v --bind /$i /mnt/$i; done
mount: /proc bound on /mnt/proc.
mount: /dev bound on /mnt/dev.
mount: /sys bound on /mnt/sys.
mount: /dev/pts bound on /mnt/dev/pts.
# mount -v --bind /mnt/boot/efi/grub /mnt/boot/grub
mount: /mnt/boot/efi/grub bound on /mnt/boot/grub.
“chroot”ing: Now that all three partitions are mounted together in a cohesive filesystem tree, and you’ve got all your necessary bind mounts, one of the most effective ways to diagnose issues as if you were running from the affected disk is to chroot into the filesystem. Run # chroot /mnt and you’ll now see /mnt as / (root), and you can run your programs as if you had booted the computer from that drive (from the terminal, anyway):
# chroot /mnt
# apt update (failed)
# cd /etc
# ls -la resolv.conf
lrwxrwxrwx 1 root root 39 Feb 17 12:09 resolv.conf -> ../run/systemd/resolve/stub-resolv.conf
If your network connection fails inside the chroot like mine did, go to /etc and delete resolv.conf if it’s a symlink into systemd-resolved’s runtime directory (as shown above). Then point /etc/resolv.conf at a known-good DNS forwarder (e.g. 1.1.1.1, 8.8.8.8, etc.):
# echo 'nameserver 8.8.8.8' > resolv.conf
# apt update (works)
# apt list --installed | grep dkms
dkms/jammy,now 2.8.7-2ubuntu2 all [installed,automatic]
zfs-dkms/jammy-proposed,now 2.1.4-0ubuntu0.1 all [installed]
I was really hoping zfs-dkms had somehow been uninstalled, because I thought that might explain why my initramfs files didn’t have the ZFS modules. Unfortunately it’s still there, so I had to keep looking to figure out what was wrong…
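If you want to verify directly whether a given initramfs actually contains the ZFS module, lsinitramfs (from initramfs-tools) will list its contents – substitute one of your own kernel versions from /boot:
# lsinitramfs /boot/initrd.img-5.17.9-xanmod1 | grep -i zfs
If nothing comes back, that image has no chance of mounting a ZFS root.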
Note, you’ll probably see this error a lot, but it’s safe to ignore:
ERROR couldn't connect to zsys daemon: connection error: desc = "transport: Error while dialing dial unix /run/zsysd.sock: connect: connection refused"
Let’s try upgrading the packages and see what shakes out:
# apt upgrade
The following packages were automatically installed and are no longer required:
linux-headers-5.15.32-xanmod1 linux-headers-5.15.34-xanmod1
linux-headers-5.15.36-xanmod1 linux-headers-5.17.0-xanmod1
linux-headers-5.17.1-xanmod1 linux-headers-5.17.3-xanmod1
linux-headers-5.17.5-xanmod1 linux-image-5.15.32-xanmod1
linux-image-5.15.34-xanmod1 linux-image-5.15.36-xanmod1
linux-image-5.17.0-xanmod1 linux-image-5.17.1-xanmod1
linux-image-5.17.3-xanmod1 linux-image-5.17.5-xanmod1
Use 'sudo apt autoremove' to remove them.
That was … interesting … and then the issue presented itself next, while I ran apt autoremove:
Setting up linux-image-5.17.9-xanmod1 (5.17.9-xanmod1-0~git20220518.d88d798) ...
* dkms: running auto installation service for kernel 5.17.9-xanmod1 [ OK ]
update-initramfs: Generating /boot/initrd.img-5.17.9-xanmod1
zstd: error 25 : Write error : No space left on device (cannot write compressed
block)
(emphasis added)
bpool has no space left. That’s almost certainly the problem. I’m going to remove a couple of kernels and rebuild all my initramfs images; that ought to do it. I’m also noticing my bpool is full of snapshots. List the current snapshots with the first command below, then destroy them with the second:
// This lists the snapshots:
# zfs list -H -o name -t snapshot | grep bpool
...auto-snapshots look like pool/BOOT/ubuntu_pd3ehl@autozsys_xxxx;
snapshots always have an @ symbol – no @ symbol, not a snapshot, don't delete it!
// This destroys the snapshots:
# zfs list -H -o name -t snapshot | grep bpool | xargs -n1 zfs destroy -r
What this does:
(list all snapshots by full name) | (keep only bpool's) | (destroy each one, one per line)
It's the same pipeline as above, but each snapshot name gets fed to the destroy command.
Make sure you understand what's going on with this command, as it makes it very easy to delete things you didn't mean to. Please be careful.
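To see how much space this actually frees, check the pool before and after – zpool list reports pool-level capacity, and zfs list -o space breaks usage down per dataset, including how much is held by snapshots:
# zpool list bpool
# zfs list -o space -r bpool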
… looks pretty good to me – much more tidy:
# ls /boot
config-5.15.0-33-generic memtest86+.elf
config-5.15.40-xanmod1-tt memtest86+_multiboot.bin
efi System.map-5.15.0-33-generic
grub System.map-5.15.40-xanmod1-tt
initrd.img vmlinuz
initrd.img-5.15.0-33-generic vmlinuz-5.15.0-33-generic
initrd.img-5.15.40-xanmod1-tt vmlinuz-5.15.40-xanmod1-tt
initrd.img.old vmlinuz.old
memtest86+.bin
Install a generic kernel to make sure you have one available, and check that zfs-initramfs is installed if a generic kernel is all you’re going to use (or zfs-dkms if you’re using xanmod or another third-party kernel). E.g. I got rid of my xanmod kernels just so I wouldn’t have to deal with building custom dkms modules:
# apt list --installed | grep xanmod
linux-headers-5.15.40-xanmod1-tt/unknown,now 5.15.40-xanmod1-tt-0~git20220515.867e3cb amd64 [installed,automatic]
linux-image-5.15.40-xanmod1-tt/unknown,now 5.15.40-xanmod1-tt-0~git20220515.867e3cb amd64 [installed,automatic]
linux-xanmod-tt/unknown,now 5.15.40-xanmod1-tt-0 amd64 [installed]
xanmod-repository/unknown,now 1.0.5 all [installed]
# apt remove linux-headers-5.15.40-xanmod1-tt linux-image-5.15.40-xanmod1-tt xanmod-repository linux-xanmod-tt zfs-dkms
. . .
The following packages will be REMOVED:
linux-headers-5.15.40-xanmod1-tt linux-image-5.15.40-xanmod1-tt
linux-xanmod-tt xanmod-repository zfs-dkms
Do you want to continue? [Y/n]
. . .
# apt autoremove -y
... install a couple kernels...
# apt install -y linux-{image,headers}-5.15.0-28-generic linux-{image,headers}-5.15.0-33-generic
. . . using versions that are most current & 2nd most current now . . .
Then update all the initramfs images one last time, just in case. I’ll probably re-install grub, too, just because – but one thing at a time…
# update-initramfs -uvk all
. . . lots of output . . . that's how you know it's working . . .
Let’s re-install grub and run update-grub
# grub-install --bootloader-id=GRUB --recheck --target=x86_64-efi --efi-directory=/boot/efi --no-floppy
Installing for x86_64-efi platform.
grub-install: warning: EFI variables cannot be set on this system.
grub-install: warning: You will have to complete the GRUB setup manually.
Installation finished. No error reported.
When you get this warning, it just means you can’t set the UEFI boot order from inside a chroot. I also like to run update-grub for good measure (that’s grub2-mkconfig -o /boot/grub/grub.cfg on most other systems, if that sounds more familiar). update-grub rebuilds the entries in your GRUB menu, along with the parameters detailed in /etc/default/grub.
Speaking of which, you can always take a peek at /etc/default/grub before you run this command – just in case.
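For example, something like this prints just the active (non-comment) settings:
# grep -v '^#' /etc/default/grub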
# which update-grub
/usr/sbin/update-grub
# cat /usr/sbin/update-grub
// update-grub:
#!/bin/sh
set -e
exec grub-mkconfig -o /boot/grub/grub.cfg "$@"
# update-grub
Sourcing file `/etc/default/grub'
Sourcing file `/etc/default/grub.d/init-select.cfg'
Generating grub configuration file ...
Found linux image: vmlinuz-5.15.0-33-generic in rpool/ROOT/ubuntu_pd3ehl
Found initrd image: initrd.img-5.15.0-33-generic in rpool/ROOT/ubuntu_pd3ehl
Found linux image: vmlinuz-5.15.0-28-generic in rpool/ROOT/ubuntu_pd3ehl
Found initrd image: initrd.img-5.15.0-28-generic in rpool/ROOT/ubuntu_pd3ehl
Found linux image: vmlinuz-5.15.0-33-generic in rpool/ROOT/ubuntu_pd3ehl@autozsys_yg50xc
. . . snapshot menu entries . . .
Now leave the chroot, remove the system folder bind mounts, unmount the EFI partition, export the pools, and reboot, like so:
# exit
# for i in proc dev/pts dev sys boot/grub; do umount -v /mnt/$i; done
umount: /mnt/proc unmounted
umount: /mnt/dev/pts unmounted
umount: /mnt/dev unmounted
umount: /mnt/sys unmounted
umount: /mnt/boot/grub unmounted
# umount -v /dev/nvme1n1p1
umount: /mnt/boot/efi (/dev/nvme1n1p1) unmounted
# zpool export bpool
# zpool export rpool
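If you want to be sure both pools were actually released before rebooting, zpool list makes a quick sanity check – it should report that no pools are available once both exports succeed:
# zpool list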
One last quick thing you can do before rebooting is check efibootmgr and see what order your system will start up in. This is a little easier and more predictable than mashing the boot-menu key at startup to make sure the firmware loads the correct disk / EFI file. Below is some stuff I was messing with while trying to cover all the bases. efibootmgr reference: https://wiki.archlinux.org/title/GRUB/EFI_examples#Asus
# efibootmgr -v
Boot0000* ubuntu HD(1,GPT,544a9120-eef7-4aae-8311-cd6ca6929213,0x800,0x100000)/File(\EFI\ubuntu\shimx64.efi)
. . .
# efibootmgr -B Boot0000 -b 0
# efibootmgr --create /dev/nvme1n1 --part 1 --write-signature --loader /EFI/GRUB/grubx64.efi --label "GRUB" --verbose
BootCurrent: 0002
Timeout: 0 seconds
BootOrder: 0000,0001,0002
Boot0001* UEFI: Samsung SSD 980 1TB, Partition 1 HD(1,GPT,6afa5e93-54a5-4628-978f-313a0dcfe27b,0x800,0xfa000)/File(\EFI\Microsoft\Boot\bootmgfw.efi)..BO
Boot0002* UEFI: Samsung Flash Drive DUO 1100, Partition 2 PciRoot(0x0)/Pci(0x14,0x0)/USB(16,0)/HD(2,GPT,a09db2b8-b5f6-43ae-afb1-91e0a90189a1,0x6cc954,0x2130)..BO
Boot0003 Windows Boot Manager HD(1,GPT,6afa5e93-54a5-4628-978f-313a0dcfe27b,0x800,0xfa000)/File(\EFI\Microsoft\Boot\bootmgfw.efi)WINDOWS.........x...B.C.D.O.B.J.E.C.T.=.{.9.d.e.a.8.6.2.c.-.5.c.d.d.-.4.e.7.0.-.a.c.c.1.-.f.3.2.b.3.4.4.d.4.7.9.5.}....................
Boot0000* GRUB HD(1,GPT,a09db2b8-b5f6-43ae-afb2-91e0a90189a1,0x40,0x6cc914)/File(\EFI\GRUB\grubx64.efi)/dev/nvme1n1
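If all you actually need is to reorder the existing entries rather than delete and recreate one, efibootmgr can also set the boot order directly, using the entry numbers from the -v listing above:
# efibootmgr -o 0000,0001,0002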
A troubleshooting tip: if you have issues using the pool names with zpool for some reason, the pool IDs are listed in lsblk’s UUID column. While technically interchangeable, the ID can coax some commands into working correctly when the name can’t.
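For example, rpool’s ID from the lsblk listing above can stand in for the pool name when importing – a sketch; running zpool import with no arguments first will show the IDs it detects:
# zpool import -f -R /mnt 16130566787573079380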
If it doesn’t boot from the ZFS drive again, boot it into the live ISO and go through everything all over … 😉 Good luck!!
6 responses to “Mount Ubuntu 22.04 ZFS partitions using live ISO for disaster recovery”
Dear Averyfreeman,
You did such a professional investigation and conclusion.
Though I have been working with Linux for 20 years, I couldn’t really follow your description so easily. There are some shortcuts in your explanations that a professional certainly knows.
Could you write it in a more step by step approach please?
I haven’t found any comparable good description as you did. So many people depend on your explanation.
Thank you so much.
Kind regards
York
The steps are in the code blocks, and if there’s a hash mark (number character – #) it means you should run ‘sudo’ first, unless you elevated your shell (e.g. $ sudo su).
1. Use a bootable USB flash drive to run Ubuntu, then check that the zfs modules are loaded:
$ lsmod | grep zfs
2. Assuming they are, import rpool to /mnt:
$ sudo zpool import -f rpool -R /mnt
3. Check that the /mnt/boot folder is there:
$ ls -a /mnt | grep boot
4. Import bpool the same way (its datasets carry a /boot mountpoint, so they land in /mnt/boot):
$ sudo zpool import -f bpool -R /mnt
5. Make sure your initramfs files are present in /mnt/boot (that’s where you’re going to save a new one once you’re chrooted into /mnt) and that you have a folder there named ‘efi’:
$ ls /mnt/boot
6. Mount your efi partition to /mnt/boot/efi (using your NVMe device/partition, with the correct one identified from the lsblk command – example shown: /dev/nvme0n1p1):
$ sudo mount -t msdos /dev/nvme0n1p1 /mnt/boot/efi
7. Make sure your efi files are there now:
$ ls /mnt/boot/efi
8. Bind-mount some system resource folders into /mnt, because it’s going to be your root. This binds /proc to /mnt/proc, /dev to /mnt/dev, etc., because you’re going to chroot there:
$ for i in proc dev sys dev/pts; do sudo mount -v --bind /$i /mnt/$i; done
9. And bind-mount the grub folder on your efi partition to /mnt/boot/grub:
$ sudo mount -v --bind /mnt/boot/efi/grub /mnt/boot/grub
10. Chroot into your /mnt folder and run a shell (e.g. bash):
$ sudo chroot /mnt /bin/bash
11. Once you’re chrooted, make sure you have external DNS resolution. /etc/resolv.conf is usually missing because it’s typically a symlink generated by systemd-resolved during the startup process. Just stick Google’s DNS in a text file located at /etc/resolv.conf:
$ sudo nano /etc/resolv.conf
Add this line to the /etc/resolv.conf file: nameserver 8.8.8.8
12. Save and exit. Now see if you can resolve an external domain name:
$ ping -4 github.com
13. If the ping works, run apt update and see what DKMS packages you have installed:
$ sudo apt update; apt list --installed | grep dkms
14. See if your bpool is out of space:
$ sudo zpool list bpool
15. If there’s room, just upgrade your packages:
$ sudo apt upgrade -y
16. If it’s full, delete the snapshots probably gratuitously auto-created by zsys (note: this deletes ALL snapshots on bpool):
$ sudo zfs list -H -o name -t snapshot | grep bpool | xargs -n1 sudo zfs destroy -r
17. Install some of your latest kernels and headers (these versions are just an example – find yours by searching with apt):
$ sudo apt install -y linux-{image,headers}-5.15.0-28-generic linux-{image,headers}-5.15.0-33-generic
18. Manually update your initramfs just to be safe:
$ sudo update-initramfs -uvk all
19. Reinstall grub, because why not:
$ sudo grub-install --bootloader-id=GRUB --recheck --target=x86_64-efi --efi-directory=/boot/efi --no-floppy
20. Update grub:
$ sudo update-grub
21. You should be good at this point; just back out of the chroot the same way you went in, but in reverse:
$ exit
22. Remove the resource folders bound to folders in /mnt (including /boot/grub, just to be safe):
$ for i in proc dev/pts dev sys boot/grub; do sudo umount -v /mnt/$i; done
23. Unmount the efi folder:
$ sudo umount -v /dev/nvme0n1p1
24. Export (ZFS-speak for unmount) bpool, then rpool, in that order:
$ sudo zpool export bpool
$ sudo zpool export rpool
25. Use efibootmgr to see what order your BIOS will boot your system in:
$ sudo efibootmgr -v
26. Figure out which entry is GRUB on YOUR SSD’s efi partition (not the USB flash drive). If a stale or wrong entry is in the way (example shown: Boot0000), delete it:
$ sudo efibootmgr -b 0000 -B
27. If you need to create a new entry for YOUR efi partition, try this, substituting your drive’s location and partition, of course:
$ sudo efibootmgr --create --disk /dev/nvme0n1 --part 1 --write-signature --loader /EFI/GRUB/grubx64.efi --label "GRUB" --verbose
28. Reboot, cross your fingers, and say a prayer to your deity of preference:
$ sudo reboot
Piece of cake, eh?
Two things:
1) zpool import -f bpool -R /mnt/boot created nested /mnt/boot/boot for me. I had to run zpool import -f bpool -R /mnt instead.
2) There is an inconsistency between:
grub-install --bootloader-id=ubuntu --recheck --target=x86_64-efi --efi-directory=/boot/efi --no-floppy
and
efibootmgr --create /dev/nvme1n1 --part 1 --write-signature --loader /EFI/GRUB/grubx64.efi --label "GRUB" --verbose
You should either change the first one to --bootloader-id=grub or the second one to --loader /EFI/UBUNTU/grubx64.efi
Thanks for pointing those out, I’ll do that – you’re absolutely right.
Re: #1 I’ve made that mistake countless times, and it still trips me up when using software with that behavior, like `rsync`.
Reference: https://superuser.com/questions/1425744/why-is-rsync-creating-a-new-subdirectory
Gratuitous nerd discussion about trailing slash: https://news.ycombinator.com/item?id=30965200
I usually try and be super-explicit when I am running into that issue – e.g. `rsync -avhP /subdir1/dir1/* /subdir2/dir1/.` (notice the dot) but unfortunately I usually catch it after it’s too late. It’s hard for me to accept that `rsync -avhP /subdir1/dir1 /subdir2` won’t put all the files in `/subdir2` with that syntax, but I’m getting there…
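To make the trailing-slash difference concrete (a sketch with made-up paths): `rsync -avhP /subdir1/dir1 /subdir2` copies the directory itself, so you end up with `/subdir2/dir1/…`, while `rsync -avhP /subdir1/dir1/ /subdir2` copies dir1’s contents directly into `/subdir2`.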
Re: #2 I think GRUB is probably the most universally accepted label, so I think I’ll go with that.
Thanks again for your feedback.
Thank you for your guide!
I had a long weekend trying to recover my ZFS bpool successfully 😅
My issue was related to the zfs bpool snapshot bug in ubuntu with grub < 2.06
I documented my journey here: https://gist.github.com/faustinoaq/d267102dd004651801c13fae9d7973ec
Thanks, sorry to hear about your issues, but I’m glad you documented the recovery, and I hope it helps people!