Earlier, while instrumenting a failing boot on some peculiar nodes we were trying to provision, I came across the following gem in the Linux kernel documentation, from Documentation/filesystems/ramfs-rootfs-initramfs.txt:
Note: The cpio man page contains some bad advice that will break your initramfs archive if you follow it. It says “A typical way to generate the list of filenames is with the find command; you should give find the -depth option to minimize problems with permissions on directories that are unwritable or not searchable.” Don’t do this when creating initramfs.cpio.gz images, it won’t work. The Linux kernel cpio extractor won’t create files in a directory that doesn’t exist, so the directory entries must go before the files that go in those directories. The above script gets them in the right order.
Yup. If you follow the documentation for the tool, you end up with an unbootable system. The linked documentation is actually pretty cool: it explains the rationale for the current state of the boot process, including that charming behavior, and links to the original discussions. But the particular behavior is still kind of psychotic.
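To make the ordering constraint concrete, here is a rough Python sketch of what "directory entries must go before the files that go in those directories" means for the file list you feed to cpio. This is not the script the kernel documentation refers to; the function names initramfs_file_list and parents_come_first are made up for illustration. It just walks a tree top-down, which gives the same kind of ordering as find without -depth, and checks that every path shows up after its parent directory.

```python
import os


def initramfs_file_list(root):
    """Hypothetical helper: list paths under root, relative to root,
    with every directory listed before anything inside it."""
    paths = ["."]
    # topdown=True visits a directory before its subdirectories,
    # the opposite of what find -depth would do.
    for dirpath, dirnames, filenames in os.walk(root, topdown=True):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(dirnames) + sorted(filenames):
            paths.append(os.path.relpath(os.path.join(dirpath, name), root))
    return paths


def parents_come_first(paths):
    """Hypothetical check: True if each path appears after its parent
    directory, which is the ordering the kernel note asks for."""
    seen = set()
    for path in paths:
        parent = os.path.dirname(path)
        if parent and parent not in seen:
            return False
        seen.add(path)
    return True
```

A depth-first list such as ["bin/sh", "bin"] fails the check, since the file is named before the directory that is supposed to hold it. That is the ordering the cpio man page's advice would produce.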
Upside: After today’s digging I know all kinds of neat things about the current Linux boot process, which I hadn’t relearned after it changed at the 2.4/2.6 transition. Similarly, the last couple of times we had problems with Warewulf 3 (or, actually, Red Hat-isms interfering with Warewulf) brought me back up to speed on interpreting raw packet logs from Wireshark, so this has all been thoroughly educational.
Downside: I have even less idea why the nodes won’t finish booting – the check I was adding was to test our theory that they were running out of memory, and they don’t seem to be.