void-installer/docs/LIVE_ISO.md
2026-04-25 13:23:49 +02:00


# Live ISO Build — Findings & Architecture Notes
## Overview
The live ISO boots directly into a Cinnamon desktop session as user `live` with no password prompt. It is designed for hardware testing on an XPS 9700 and serves as the installer delivery vehicle.
Builder: `iso/build-live-iso.sh` (host) → Docker container running `iso/_inner-build-live.sh` → `void-mklive/mklive.sh`
---
## Boot + Session Startup
### Kernel Cmdline
```
live.user=live console=tty0 console=ttyS0,115200
```
The `live.user=live` parameter is consumed by the vmklive dracut hook (`adduser.sh`) which creates the user inside the initramfs and sets password `voidlinux`.
### runit Stage 2 Override
We override `/etc/runit/2` to run `/etc/runit/live-setup.sh` before handing off to `runsvdir`. The script:
1. Adds extra groups (`plugdev input network docker`) to the live user
2. Writes `/etc/sudoers.d/live` (full passwordless sudo)
3. Configures `/etc/nix/nix.conf` (daemon mode, `trusted-users = root live`)
4. Auto-detects GPU and writes `/etc/X11/xorg.conf.d/20-gpu.conf`
After live-setup.sh, stage 2 mirrors the real `runit-void` exactly:
```sh
runsvchdir "${runlevel}"
ln -sf /etc/runit/runsvdir/current /run/runit/runsvdir/current
exec runsvdir -P /run/runit/runsvdir/current
```
### Services (runsvdir/default symlinks in overlay)
Enabled at build time via symlinks in `build/live-includes/etc/runit/runsvdir/default/`:
- `dbus`
- `NetworkManager`
- `lightdm`
- `nix-daemon`
> **Note:** Do NOT use mklive.sh's `-S` flag for service enable — it is not supported by the version used. Services must be wired via runsvdir symlinks in the include overlay.
---
## LightDM Autologin
### Critical: `lightdm-session` does not exist on Void Linux
The Void `lightdm` 1.32 package does **not** ship the `lightdm-session` binary. The default LightDM behaviour of spawning `lightdm-session` causes the session to crash immediately (exit code 1 in ~20ms) with no error message.
**Fix:** Set `session-wrapper=/etc/lightdm/Xsession` in `lightdm.conf`. The `/etc/lightdm/Xsession` wrapper **is** provided by the Void lightdm package and correctly sources `/etc/profile` and, through it, the scripts in `/etc/profile.d/`.
### `greeter-env=` and `session-env=` are not supported
These options are silently ignored in LightDM 1.32 on Void. To propagate environment variables to the session use `/etc/profile.d/` scripts instead.
### lightdm.conf autologin lines must be commented
The vmklive dracut hook `display-manager-autologin.sh` uses `sed` to uncomment lines. The autologin lines in `lightdm.conf` must be present but commented out — the hook finds them by regex and uncomments them at boot.
```ini
[Seat:*]
#autologin-user=
#autologin-user-timeout=0
#autologin-session=
#user-session=
session-wrapper=/etc/lightdm/Xsession
greeter-session=lightdm-gtk-greeter
```
The `/etc/lightdm/.session` file (content: `cinnamon`) is read by the hook to set the session name.
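
The uncommenting step can be illustrated with a small sketch. This is not the code of `display-manager-autologin.sh`; the function name `enable_autologin` and the exact `sed` expressions are assumptions chosen to show why the lines must exist in commented form.

```sh
# Hypothetical helper: uncomment and fill the autologin lines in a lightdm.conf.
# Usage: enable_autologin <conf-file> <user> <session>
enable_autologin() {
    conf=$1 user=$2 session=$3
    # Each expression only matches a line that is present and commented out.
    sed -i \
        -e "s|^#autologin-user=.*|autologin-user=${user}|" \
        -e "s|^#autologin-user-timeout=.*|autologin-user-timeout=0|" \
        -e "s|^#autologin-session=.*|autologin-session=${session}|" \
        -e "s|^#user-session=.*|user-session=${session}|" \
        "$conf"
}
```

If a line is missing from the file, or already uncommented with a different value, a substitution of this kind simply does nothing, which is consistent with the requirement above.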
---
## GPU Auto-Detection
`live-setup.sh` runs `lspci` at boot and writes `/etc/X11/xorg.conf.d/20-gpu.conf`:
| Detected | Xorg Config | Extra |
|----------|-------------|-------|
| Virtual (virtio/VMware/QEMU/VirtualBox) | `modesetting`, `AccelMethod none` | `LIBGL_ALWAYS_SOFTWARE=1` in `/etc/profile.d/live-env.sh` |
| NVIDIA + proprietary driver (`nvidia_drv.so`) | PRIME offload: Intel `modesetting` + NVIDIA `nvidia` | No software GL |
| NVIDIA without proprietary driver | `modesetting` | — |
| Intel / AMD / other | `modesetting` | — |
`LIBGL_ALWAYS_SOFTWARE=1` is set via `/etc/profile.d/live-env.sh`, not via `session-env=` (unsupported).
---
## Nix Integration
### Prebake architecture (packages baked into squashfs)
Nix packages are **pre-installed at ISO build time** inside the Docker container and the entire `/nix` store is rsynced into the squashfs overlay. This means packages are available immediately on boot — no downloads, no tmpfs space pressure.
**Why not install at first login?** The live system mounts squashfs + tmpfs overlay. Installing ~4 GB of nix packages at runtime fills the tmpfs overlay and causes out-of-space failures. Baking them into squashfs sidesteps this completely.
### Build-time nix install (inside Docker, single-user)
Docker runs as root. Nix is installed in single-user mode (no daemon, no nixbld group):
```sh
mkdir -m 0755 -p /nix
export NIX_CONFIG="build-users-group = " # suppress nixbld group requirement
curl -L https://nixos.org/nix/install | sh -s -- --no-daemon
source /root/.nix-profile/etc/profile.d/nix.sh
export PATH="/root/.nix-profile/bin:$PATH"
NIXPKGS_ALLOW_UNFREE=1 nix profile add \
--extra-experimental-features "nix-command flakes" --impure \
nixpkgs#spotify nixpkgs#discord ...
```
The full `/nix` directory is then staged into the squashfs overlay:
```sh
rsync -a /nix/ "$INCLUDE_DIR/nix/"
```
### Nix prebake cache
To avoid re-downloading packages on every build, the nix store is cached at:
```
cache/nix-prebake/<md5-of-package-list>/
```
If the cache exists and the package list md5 matches, the build restores from cache instead of re-running `nix profile add`. Cache is ~5 GB. Subsequent builds with an unchanged package list complete the nix step in ~1 minute instead of ~20 minutes.
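
A minimal sketch of the cache-key idea follows. The helper name `nix_cache_dir` is hypothetical, and sorting the list before hashing is an assumption; the real script may hash the package list exactly as written.

```sh
# Hypothetical helper: derive the cache directory from the package list.
# Sorting makes the key independent of the order in which packages are listed.
nix_cache_dir() {
    key=$(printf '%s\n' "$@" | sort | md5sum | cut -d' ' -f1)
    printf 'cache/nix-prebake/%s\n' "$key"
}

dir=$(nix_cache_dir 'nixpkgs#spotify' 'nixpkgs#discord')
if [ -d "$dir" ]; then
    echo "cache hit: restore /nix from $dir"
else
    echo "cache miss: run the nix install, then save /nix to $dir"
fi
```

Because the key depends only on the package names, a change in the upstream `nixpkgs` revision does not invalidate a cache built this way.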
### Current packages (NIX_USER_PACKAGES in build-live-iso.sh)
- `nixpkgs#google-chrome` — replaces chromium (removed from xbps packages)
- `nixpkgs#spotify`
- `nixpkgs#discord`
- `nixpkgs#localsend`
- `nixpkgs#mission-center`
- `nixpkgs#vscode`
### XDG / PATH setup for live user
For Cinnamon to find nix `.desktop` files and for terminals to find nix binaries:
- `/etc/environment`: `XDG_DATA_DIRS=/home/live/.nix-profile/share:/usr/local/share:/usr/share`
- `/etc/profile.d/nix-prebaked.sh`: adds nix profile to `PATH` for terminal sessions
- `/etc/skel/.nix-profile` → symlink to the pre-baked store profile, copied to `/home/live/` when the live user is created by the dracut hook
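
The contents of `nix-prebaked.sh` are not reproduced in this document. As a rough sketch of what such a profile script might do, assuming the hypothetical function name `add_nix_to_path`:

```sh
# Sketch of a profile.d-style snippet: prepend the nix profile's bin directory
# to PATH once, and only if the profile link resolves to a real directory.
add_nix_to_path() {
    nix_bin="$HOME/.nix-profile/bin"
    if [ -d "$nix_bin" ]; then
        case ":$PATH:" in
            *":$nix_bin:"*) ;;                # already on PATH, do nothing
            *) PATH="$nix_bin:$PATH" ;;
        esac
    fi
    export PATH
}
```

The `case` guard keeps `PATH` from growing when nested shells source the script again.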
### Live system nix-daemon (daemon mode)
On the **booted live system**, the Void `nix` xbps package provides `nix-daemon` as a runit service. `/nix/store` stays root-owned; the live user is granted trust via `nix.conf`:
```
experimental-features = nix-command flakes
sandbox = false
auto-optimise-store = true
trusted-users = root live
```
The daemon socket is at `/var/nix/daemon-socket/socket` (Void's path, not the upstream default `/nix/var/nix/daemon-socket/socket`).
`sandbox = false` is required — no `nixbld` users exist in the dracut initramfs environment.
### postinstall.sh socket path (installed system)
In the **installed system** (not live), `installer/lib/postinstall.sh` polls for the nix-daemon socket at:
```
/var/nix/daemon-socket/socket
```
Not `/nix/var/nix/daemon-socket/socket` — Void's package uses `/var/nix/`.
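
The polling step might look roughly like the sketch below. It is not the code from `postinstall.sh`; the function name `wait_for_socket` and the one-second interval are assumptions.

```sh
# Hypothetical polling helper: wait until a Unix socket exists at the path.
# Usage: wait_for_socket <path> <timeout-seconds>
# Returns 0 once the socket is present, 1 if the timeout expires first.
wait_for_socket() {
    sock=$1 timeout=$2 waited=0
    while [ ! -S "$sock" ]; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

A call such as `wait_for_socket /var/nix/daemon-socket/socket 30` would then gate any step that needs the daemon. Testing with `-S` rather than `-e` avoids a false positive if a regular file happens to sit at that path.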
---
## dconf / Theme
Cinnamon settings (theme, keyboard layout, dark mode, etc.) are pre-applied via a dconf system-db. The binary database is compiled at **ISO build time** inside the Docker container.
### Build-time compilation
`iso/_inner-build-live.sh` runs inside the Debian Docker container. The Dockerfile installs `dconf-cli` for this step. The correct Debian `dconf-cli` API is:
```sh
dconf compile <output_binary_db> <input_keyfile_dir>
# e.g.:
dconf compile build/live-includes/etc/dconf/db/local \
build/live-includes/etc/dconf/db/local.d
```
> **Note:** `dconf update <path>` does not work in Debian's `dconf-cli` — it only updates the user's own db. `dconf compile` is the correct tool for building a system-db binary.
### dconf profile
`/etc/dconf/profile/user` must point to the system-db:
```
user-db:user
system-db:local
```
Without this file, the compiled system-db is ignored and Cinnamon shows a black wallpaper with default GTK theme.
### System DB keyfile (`/etc/dconf/db/local.d/00-cinnamon`)
Built by `iso/build-live-iso.sh` from config values. Relevant excerpts:
```ini
[org/gnome/desktop/input-sources]
sources=[('xkb', 'ch+fr_nodeadkeys')]
[org/gnome/desktop/interface]
color-scheme='prefer-dark'
```
The `KEYMAP` variable comes from `config/install.conf` as `ch-fr_nodeadkeys` (vconsole dash format). The system DB uses XKB plus format. The substitution `${KEYMAP//-/+}` handles this conversion at build time.
### dconf lock file (critical for keyboard)
A lock file at `/etc/dconf/db/local.d/locks/keyboard` lists:
```
/org/gnome/desktop/input-sources/sources
```
This makes the keyboard setting **non-writable from the user session**: `gsettings set org.gnome.desktop.input-sources sources ...` silently does nothing when this lock is in place. The correct value must be set in the system DB itself (see above). Do not attempt to override the keyboard via `gsettings` from `apply-live-settings.sh` or any autostart script.
### Keyboard format: vconsole (dash) vs XKB (plus)
- mklive.sh `-k` flag accepts vconsole format: `ch-fr_nodeadkeys` (dash-separated)
- XKB / gsettings / dconf uses plus format: `ch+fr_nodeadkeys`
- Bash substitution: `${KEYMAP//-/+}` converts vconsole → XKB
- `KEYMAP` is defined in `config/install.conf` in vconsole (dash) format
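
The conversion and the resulting keyfile section can be sketched as follows. The helper name `to_xkb_layout` is hypothetical, and `tr` is used here as a POSIX `sh` equivalent of the bash form `${KEYMAP//-/+}`.

```sh
# Hypothetical helper: convert a vconsole keymap name to an XKB layout string
# by replacing every dash with a plus.
to_xkb_layout() {
    printf '%s\n' "$1" | tr '-' '+'
}

KEYMAP=ch-fr_nodeadkeys            # vconsole format, as in config/install.conf
xkb=$(to_xkb_layout "$KEYMAP")

# Print the dconf keyfile section that would carry the converted value
printf "[org/gnome/desktop/input-sources]\nsources=[('xkb', '%s')]\n" "$xkb"
```

A blanket replacement like this works for `ch-fr_nodeadkeys`. It should not be assumed that every vconsole keymap name maps to a valid XKB layout and variant this way.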
---
## First-Login Setup (`apply-live-settings.sh`)
A lightweight XDG autostart script runs once when Cinnamon first loads and applies theme/UX settings via `gsettings`. It does **not** install packages (packages are pre-baked into squashfs).
**Location in ISO:** `/usr/local/libexec/apply-live-settings.sh`
**Autostart:** `/etc/xdg/autostart/void-live-settings.desktop` (only in Cinnamon: `OnlyShowIn=X-Cinnamon`)
**Idempotency guard:** creates `~/.void-live-settings-done` on success
Settings applied:
- GTK/icon/cursor theme (Gruvbox-Dark)
- Cinnamon shell theme
- Wallpaper
- Default terminal (alacritty)
The script waits for `DBUS_SESSION_BUS_ADDRESS` to be set before calling `gsettings`. It does **not** set keyboard layout — that is locked in the dconf system DB (see dconf section above).
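
The run-once behaviour can be sketched as below. This is a simplified illustration, not the contents of `apply-live-settings.sh`: the function name `run_once` is hypothetical, and the wait for the session bus is reduced to a single check.

```sh
# Hypothetical skeleton of a run-once autostart script.
# Usage: run_once <marker-file> <apply-function>
run_once() {
    marker=$1 apply_fn=$2
    [ -e "$marker" ] && return 0                        # already applied earlier
    [ -n "${DBUS_SESSION_BUS_ADDRESS:-}" ] || return 1  # no session bus: gsettings would fail
    # Write the marker only if the settings function succeeded
    "$apply_fn" && : > "$marker"
}
```

In the real script the marker would be `~/.void-live-settings-done` and the second argument a function wrapping the `gsettings` calls. Creating the marker only after success means a failed first attempt is retried at the next login.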
---
## Build Pipeline
```
iso/build-live-iso.sh  (host: stages overlay, builds Docker image if needed)
  └─ Docker: void-installer-builder:latest  (debian:stable-slim)
       └─ iso/_inner-build-live.sh
            ├─ nix prebake: install packages into /nix, rsync to $INCLUDE_DIR/nix/
            │    └─ cache/nix-prebake/<md5>/ used if package list unchanged
            ├─ dconf compile  (compiles system-db binary from keyfile)
            ├─ void-mklive/mklive.sh -a x86_64 -r <repo> -I <include_dir> ...
            │    └─ squashfs (xz) + GRUB + ISO 9660
            └─ chown -R $HOST_UID:$HOST_GID $INCLUDE_DIR  (fix Docker root ownership)
```
Output: `out/void-live-stable.iso` (~4.8 GB; the xz-compressed squashfs expands to ~22 GB uncompressed)
### Docker UID/GID ownership fix
Docker runs as root. Without remediation, files created inside the container (especially the ~5 GB nix store) are owned by `root` on the host, causing `rm -rf build/live-includes` to fail with `Permission denied` on the next build.
**Fix in `_inner-build-live.sh`** (end of script):
```sh
# Fix ownership so host user can clean up on next build
if [[ -n "${HOST_UID:-}" && "$HOST_UID" != "0" ]]; then
    chmod -R u+w "$INCLUDE_DIR" 2>/dev/null || true
    chown -R "${HOST_UID}:${HOST_GID}" "$INCLUDE_DIR" 2>/dev/null || true
fi
```
`HOST_UID` and `HOST_GID` are passed via `docker run -e HOST_UID=$(id -u) -e HOST_GID=$(id -g)`.
**Belt-and-suspenders guard in `build-live-iso.sh`** (before `rm -rf $INCLUDE_DIR`):
```sh
chmod -R u+w "$INCLUDE_DIR/nix" 2>/dev/null || sudo rm -rf "$INCLUDE_DIR/nix"
```
**Emergency manual cleanup:** `sudo rm -rf build/live-includes/nix`
### Dockerfile dependencies
`iso/Dockerfile` (based on `debian:stable-slim`) installs: `bash git curl ca-certificates xz-utils tar patch python3 mtools xorriso squashfs-tools dosfstools e2fsprogs kmod dconf-cli rsync`. The `rsync` package is required for nix store staging.
### Build artifacts that must NOT be committed
- `build/live-includes/` — generated staging tree (hundreds of binary assets, nix store)
- `out/` — ISO output
- `cache/` — cloned void-mklive, xbps/nix package cache
---
## Known Issues & Fixes
### `nix-env --switch-profile "$HOME/.nix-profile"` creates a circular symlink
**Symptom:** `error: filesystem error: status: Too many levels of symbolic links [/home/live/.nix-profile/manifest.json]` and `tar: xz: Cannot exec: Too many levels of symbolic links` (all binaries fail to exec via nix PATH).
**Cause:** Passing `$HOME/.nix-profile` as the target to `nix-env --switch-profile` creates `~/.nix-profile -> .nix-profile` — a symlink that points to itself. This corrupts the nix profile directory and causes ELOOP on any file lookup under that path.
**Fix:** Do not call `nix-env --switch-profile` at all when using `nix profile add` (new-style commands). Let `nix profile add` initialise the profile automatically. The first-login script also contains a guard that detects and removes the circular symlink before proceeding.
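
A guard of the kind mentioned above could look like this sketch. The function name `fix_circular_profile` is hypothetical, and the check only covers the specific self-reference described here (`.nix-profile -> .nix-profile`).

```sh
# Hypothetical guard: remove the link if it is a symlink whose target is its
# own file name, i.e. the self-referencing case described above.
fix_circular_profile() {
    link=$1
    [ -L "$link" ] || return 0            # not a symlink: nothing to do
    if [ "$(readlink "$link")" = "$(basename "$link")" ]; then
        rm -f "$link"
        echo "removed circular symlink: $link"
    fi
}
```

Called as `fix_circular_profile "$HOME/.nix-profile"`, it leaves a healthy profile link untouched because that link's target is a path into the store profile, not its own name.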
### `nix profile install` is deprecated
Use `nix profile add` instead. `nix profile install` is an alias that emits a warning and will be removed in a future Nix version.
### DNS hang in live environment (nsswitch `mdns` without Avahi)
**Symptom:** `getent hosts github.com` hangs indefinitely; `first-login.sh` stuck at "starting".
**Cause:** `/etc/nsswitch.conf` includes `mdns` in the `hosts:` line. On Void Linux, `libnss_mdns.so.2` may not be present, and even if it is, the Avahi daemon is not running in the live session. The lookup then blocks waiting for the Avahi daemon until it times out.
**Fix:** `live-setup.sh` runs at boot and removes `mdns` from `nsswitch.conf`: `sed -i '/^hosts:/s/mdns[^ ]* *//g' /etc/nsswitch.conf`. This is safe on real hardware (NetworkManager provides proper DNS via DHCP).
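
To see the effect of that expression without touching the real file, it can be run against a sample. The wrapper name `strip_mdns` and the sample contents are assumptions for illustration.

```sh
# Apply the same sed expression to a given file instead of /etc/nsswitch.conf
strip_mdns() {
    sed -i '/^hosts:/s/mdns[^ ]* *//g' "$1"
}

sample=$(mktemp)
printf 'passwd: files\nhosts: files mdns dns\n' > "$sample"
strip_mdns "$sample"
cat "$sample"     # the hosts: line now reads "hosts: files dns"
rm -f "$sample"
```

One caveat: the expression removes only the `mdns*` token itself. If a `hosts:` line carried an action such as `mdns4_minimal [NOTFOUND=return]`, the bracketed action would stay behind and attach to the preceding source, so a line of that shape would need extra handling.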
### QEMU internal DNS (10.0.2.3) unreliable
**Symptom:** Even after removing `mdns`, DNS queries to QEMU's built-in resolver (10.0.2.3) time out.
**Cause:** QEMU's user-mode DNS proxy may not forward queries correctly depending on the host network configuration.
**Workaround for QEMU testing:** `echo nameserver 8.8.8.8 > /etc/resolv.conf`. This is not needed on real hardware.
### Docker root-owned files break next build
**Symptom:** `rm -rf build/live-includes` or `rm -rf build/live-includes/nix` fails with `Permission denied` at the start of a rebuild.
**Cause:** Docker runs as root. The ~5 GB nix store rsynced into `build/live-includes/nix/` is owned by `root:root` on the host.
**Fix:** `_inner-build-live.sh` now runs `chown -R $HOST_UID:$HOST_GID $INCLUDE_DIR` at the end of each Docker run. `HOST_UID`/`HOST_GID` are passed as env vars. See Build Pipeline section.
**Emergency cleanup:** `sudo rm -rf build/live-includes/nix`
### dconf lock file silently blocks `gsettings set`
**Symptom:** `gsettings set org.gnome.desktop.input-sources sources "[('xkb', 'ch+fr_nodeadkeys')]"` runs without error but the keyboard layout is not applied.
**Cause:** `/etc/dconf/db/local.d/locks/keyboard` locks the `input-sources` key. Any `gsettings set` targeting a locked key is silently ignored in the user session.
**Fix:** Set the correct value in the system dconf DB keyfile at ISO build time. Do not attempt to set it from an autostart script.
### Keyboard format mismatch (vconsole dash vs XKB plus)
**Symptom:** Keyboard layout reverts to US QWERTY even though `KEYMAP=ch-fr_nodeadkeys` is set.
**Cause:** mklive.sh accepts the vconsole format (`ch-fr_nodeadkeys`, dash-separated). XKB / dconf uses plus format (`ch+fr_nodeadkeys`). Passing the vconsole string directly to the dconf system DB or to `gsettings` sets an unknown layout that falls back to US.
**Fix:** In `build-live-iso.sh`, use `${KEYMAP//-/+}` when writing the dconf keyfile:
```ini
[org/gnome/desktop/input-sources]
# value generated from ${KEYMAP//-/+}
sources=[('xkb', 'ch+fr_nodeadkeys')]
```
---
## QEMU Testing
### Quick launch
```bash
bash tests/launch-live-qemu.sh
# or via Makefile:
make live-qemu
```
### What `launch-live-qemu.sh` does
- RAM: 12288 MB, 4 CPUs, KVM acceleration
- Device: `virtio-vga` with `-display gtk,gl=off` (no hardware GL)
- Searches `out/void-live-stable*.iso` for the ISO
- Serial console socket: `out/live-serial.sock`
- Monitor socket: `out/qemu-monitor.sock`
- Credentials: `live`/`voidlinux` (desktop), `root`/`voidlinux` (TTY)
### Manual launch (if needed)
```bash
cp /usr/share/OVMF/OVMF_VARS.fd out/OVMF_VARS.live.fd
qemu-system-x86_64 -name void-live-test -machine q35,accel=kvm:tcg -cpu max \
-m 12288 -smp 4 \
-drive "if=pflash,format=raw,readonly=on,file=/usr/share/OVMF/OVMF_CODE.fd" \
-drive "if=pflash,format=raw,file=out/OVMF_VARS.live.fd" \
-cdrom out/void-live-stable.iso -boot order=d,menu=off \
-netdev user,id=n0 -device virtio-net-pci,netdev=n0 \
-serial "unix:out/live-serial.sock,server,nowait" \
-monitor "unix:out/qemu-monitor.sock,server,nowait" \
-device virtio-vga -display gtk,gl=off &
```
### Serial console access (Python)
```python
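import socket
# Connect to the Unix socket that QEMU exposes via -serial
s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
s.connect('out/live-serial.sock')
s.sendall(b'\n')  # nudge the console so it prints a prompt
print(s.recv(4096).decode(errors='replace'))  # blocks until some output arrives
s.close()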
```
### GPU in QEMU
`virtio-vga` is detected as a virtual GPU by `live-setup.sh` → writes `modesetting + AccelMethod none` xorg conf, sets `LIBGL_ALWAYS_SOFTWARE=1` in `/etc/profile.d/live-env.sh`.
### Verifying keyboard layout (in live session)
```bash
DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus \
gsettings get org.gnome.desktop.input-sources sources
# expected: [('xkb', 'ch+fr_nodeadkeys')]
```