Adventures in distro-hopping

Earlier this evening, I helped a good friend with a System76 laptop that would not boot reliably. Actually, the saga spans about a year or so, and there is a lot more to it than just booting, but all he wanted was to have a small PopOS partition to grab at the Oryx firmware updates without issue, followed by Manjaro. Manjaro has the more recent packages he needs for what he does, and after troubleshooting various desktop configuration issues, it came down to the boot.

Looking back at the variety of issues he ran into, it seems that he encountered a few peculiarities of Linux, which seemed interesting to go over.

systemd delenda est

The biggest obstacle was the boot process - PopOS, like all Ubuntu-based distros, has started using systemd-grub-boot to manage the boot process. Add one more to the list of reasons why systemd is a bad idea. In this case, this means that not only does systemd extend into the very early stages of the boot process, well past init of the OS, but it also fails to play nicely with other grub configs. This is true even with manual manipulation of fstab.

This was a massive distraction, because it masked the real issue that was stopping my friend from getting anywhere. In theory, you are supposed to be able to install many flavours of Linux on the same device, and boot into whichever one you want - grub can handle it, as long as you tell it where everything is.

In practice? Ubuntu presumes that it is the only distro installed. And if you have more than one Ubuntu-based distro - PopOS and elementaryOS, for instance - both their systemd-grub-boot instances will fight each other to the death for control over the boot. I observed similar issues with OpenSUSE on one of my machines, likely also due to use of systemd-grub-boot.
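When several distros are fighting over the boot like this, it helps to see what is actually sitting on the EFI System Partition before changing anything. What follows is only a rough sketch, not something we ran that evening: the function names are made up for illustration, and the /boot/efi mount point is an assumption (some distros mount the partition at /boot or /efi instead).

```shell
#!/bin/sh
# Report whether the running system was started via UEFI or legacy BIOS.
# The kernel only creates /sys/firmware/efi when it was booted by UEFI firmware.
# The optional argument exists purely so the check can be pointed elsewhere for testing.
boot_mode() {
    if [ -d "${1:-/sys/firmware/efi}" ]; then
        echo "UEFI"
    else
        echo "BIOS"
    fi
}

# List every .efi binary below an EFI System Partition mount point.
# Default mount point is an assumption; pass your own as the first argument.
list_efi_loaders() {
    esp="${1:-/boot/efi}"
    if [ ! -d "$esp" ]; then
        echo "no such directory: $esp" >&2
        return 1
    fi
    find "$esp" -type f -iname '*.efi' | sort
}

echo "Boot mode: $(boot_mode)"
# Reading the partition may need root; failures are tolerated here.
list_efi_loaders /boot/efi || true
```

Each distro normally gets its own directory under EFI/ on that partition, so the listing shows at a glance how many boot loaders have been installed side by side.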

Desktop environments are horribly inconsistent

GNOME, in particular, is not very good. PopOS has added so many of their own custom extensions to GNOME that they are now moving towards their own custom backend. I hope they succeed.

My friend tried Manjaro Xfce early on, and bounced around a few distros to see if he could get his preferred workflow going. PopOS and other GNOME-based desktops were the only ones that consistently allowed him to use his external monitor without issues, but even then, there were still glitches in how settings got applied, how they could (or could not) be changed...

It took a lot longer than it needed to for him to find a trouble-free desktop environment on a distro that was not tied to Ubuntu LTS. I eventually helped him find the recent Manjaro GNOME - of all the current GNOME desktops, this one is the best so far. It plays nice with his external display and other hardware.

Too many distros blindly grab the upstream desktop environment, half-ass a theme, and do nothing else to ensure a smoother integration. It is not uncommon to see two or three tools to handle network configuration, for instance, but only one incomplete, bare-bones file manager. The GNOME Foundation being openly antagonistic towards the open source community that is not in their fiefdom certainly does not help matters, either.

The boot process is still a dark art

Even after EFI came along, and even though it has been around since 2005, all the utilities around it seem like clumsy hacks, and there is still a tonne of information about traditional BIOS boot in Linux. It is a pity, as it makes more trouble than there really needs to be at this point. Yes, there is libreboot/coreboot - System76 ships several laptops with this out of the box - but very few machines are able to be retroactively flashed with it. There really is more to life than a ThinkPad X220. Seriously.

For as graphical and user friendly as many distro installs are nowadays, the initial boot configuration is still using the same command line tools that were in use in the late 1990s. My friend had to troubleshoot around both grub's interaction with EFI, and systemd-boot, and it damn near drove him crazy. Fortunately, there is rEFInd.

<https://www.rodsbooks.com/refind/>

rEFInd is an open source, cross-platform boot manager. Nothing else. Not a boot loader, not an "init system", just a plain, tightly scoped, tightly focused tool. Once I finally convinced my friend to install it, he was able to tame the boot process, put systemd back in its place, and finally see what was holding him up...
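For anyone wanting to try the same thing, installing rEFInd on a Manjaro or Arch system looks roughly like the sketch below. I am assuming the package is called refind in your repos and that it ships the refind-install script, which is how I know it; check your own distro before trusting this. The run helper is made up for illustration, and only prints the commands unless you set RUN=1.

```shell
#!/bin/sh
# Dry-run helper (hypothetical, for illustration): print each command
# instead of executing it, unless RUN=1 is set in the environment.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

install_refind() {
    # Package name as I know it from the Arch/Manjaro repos (assumption).
    run sudo pacman -S --needed refind
    # Installer script that comes with rEFInd; it copies rEFInd onto the
    # EFI System Partition.
    run sudo refind-install
}

install_refind
```

Run it once as is to read what it would do, then again with RUN=1 once you are happy with it.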

The last remaining obstacle

Manjaro would not boot, because somewhere along the way, pkgfile became corrupted or locked. So, the desktop would not load, because pkgfile would cause SDDM to crash out before it could even open an xsession. The full shock and horror of what this is, as well as two solutions, can be found in this forum thread:

[FAILED] failed to start pkgfile database update <https://forum.manjaro.org/t/failed-failed-to-start-pkgfile-database-update/31731>

pkgfile comes along with zsh on Manjaro, and is used in autocompletion, autosuggestions, and a few other things. It keeps a database of which files belong to which packages, which allows zsh to be more responsive - querying the pkgfile database is faster than querying the entire system. When it breaks, there are basically two paths:

  1. Force a rebuild of pkgfile with 'sudo pkgfile -u'

  2. Switch away from zsh, and uninstall pkgfile
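The two paths above can be sketched as a small script. Treat it as a hedged illustration rather than the exact commands we typed: the fix_pkgfile and run names are invented for this example, and the pacman removal may be refused if another package (a zsh configuration package, say) still depends on pkgfile, in which case that package has to go too. Nothing is executed unless RUN=1 is set.

```shell
#!/bin/sh
# Dry-run helper (hypothetical): print commands unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Pick one of the two paths: "rebuild" or "remove".
fix_pkgfile() {
    case "$1" in
        rebuild)
            # Path 1: force a refresh of the pkgfile database.
            run sudo pkgfile -u
            ;;
        remove)
            # Path 2: uninstall pkgfile along with its unused dependencies
            # and leftover config files (after switching away from zsh).
            run sudo pacman -Rns pkgfile
            ;;
        *)
            echo "usage: fix_pkgfile rebuild|remove" >&2
            return 2
            ;;
    esac
}

fix_pkgfile rebuild
```

If the desktop will not come up at all, the same commands can be typed by hand from a text console or a chroot from a live USB.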

Because my friend is an old school traditionalist, he already switched to bash. We removed pkgfile and a few other zsh-related items - et voilà! The issue was resolved, and Manjaro boots now.

For the first time in a year, my friend finally has a reliable boot into an environment that has the packages he needs, and that plays nice with his external monitor. Yes, part of the reason it took so long is due to his stubbornness, but still, I think we in the open source community can do better. This really should not have been so difficult, and I look upon renewed scrutiny of the Linux desktop with keen interest.