# nix-ota

Open-source OTA updates for fleets of NixOS devices. A self-hostable control
server + lightweight device agent that ship prebuilt system closures from a
binary cache to devices that don't have your flake. Think Cachix Deploy, but
you run it.

## Architecture

```
┌─────────┐  1. nix build + nix copy   ┌──────────┐
│ CI /    │ ─────────────────────────► │ Binary   │  (Attic / S3 / nix-serve / Cachix)
│ Builder │                            │ Cache    │
└────┬────┘                            └────▲─────┘
     │ 2. publish signed manifest           │
     ▼                                      │ 4. nix copy --from
┌─────────────┐   3. GET current   ┌────────┴─────┐
│ Control     │ ◄───────────────── │ Device       │
│ Server + UI │   5. POST checkin  │ Agent        │ ──► switch-to-configuration
└─────────────┘                    └──────────────┘
```

The control server **never holds the signing key**. Operators (or CI) sign
manifests with an offline ed25519 key and POST them; devices verify against a
pinned public key. A server compromise cannot push arbitrary closures.

## Components

| Crate              | Binary           | Role                                 |
|--------------------|------------------|--------------------------------------|
| `crates/server`    | `nix-ota-server` | REST API + SQLite + HTMX dashboard   |
| `crates/agent`     | `nix-ota-agent`  | Polls, verifies, applies, rolls back |
| `crates/publisher` | `nix-ota`        | Operator/CI CLI (keygen + publish)   |
| `crates/common`    | (lib)            | Manifest types + ed25519             |

## Quickstart (< 10 minutes)

> 👉 For a complete copy-pasteable setup with two real NixOS flakes
> (server host + device host), see [`examples/`](./examples/).

### 1. Generate a signing key on your workstation

```sh
nix run git+https://linus.dyrehytten.dk/max/nix-ota#nix-ota -- keygen --out ./sign.key
# prints the public key; save it, you'll bake it into every device.
```

### 2. Deploy the server

```nix
# configuration.nix
{
  imports = [ nix-ota.nixosModules.server ];

  services.nix-ota-server = {
    enable = true;
    openFirewall = true;
    publishTokenFile = "/run/secrets/nix-ota-publish-token";
  };
}
```

### 3. Install the agent on a device

```nix
{
  imports = [ nix-ota.nixosModules.agent ];

  services.nix-ota-agent = {
    enable = true;
    server = "https://ota.example.com";
    channel = "prod";
    deviceId = "fridge-007";
    publicKey = "";
    cacheUrl = "https://cache.example.com";
    cachePublicKey = "cache.example.com:abc...=";
    healthCmd = "systemctl is-system-running --wait"; # optional
  };
}
```

### 4. Publish your first update

```sh
nix build .#nixosConfigurations.fridge-007.config.system.build.toplevel
nix copy --to s3://my-cache ./result

nix run git+https://linus.dyrehytten.dk/max/nix-ota#nix-ota -- publish \
  --server https://ota.example.com \
  --token $(cat publish-token) \
  --key ./sign.key \
  --channel prod \
  --store-path $(readlink -f result) \
  --substituter https://cache.example.com
```

Open `https://ota.example.com/` to watch the fleet pick it up.

## How updates apply

On each poll the agent:

1. Fetches `/channels//current`.
2. Verifies the ed25519 signature against the pinned key.
3. Rejects manifests with a revision ≤ the last one applied (replay defense).
4. `nix copy --from `: Nix verifies cache signatures on every store path.
5. `nix-env -p /nix/var/nix/profiles/system --set `
6. `/bin/switch-to-configuration switch`
7. Runs the optional `healthCmd`. On failure: switches back to the previous
   generation and reports `rolled_back`.

## Threat model

| Threat                                  | Mitigation                                                                              |
|-----------------------------------------|-----------------------------------------------------------------------------------------|
| Compromised control server pushes evil  | Manifests must be signed by offline ed25519 key pinned on every device.                 |
| Compromised cache serves wrong closure  | Nix verifies per-path signatures against `trusted-public-keys`.                         |
| Replay of an older (vulnerable) closure | Manifest carries monotonic `revision`; agent persists & rejects rollbacks.              |
| Random internet caller publishes        | `POST /channels/:name/publish` requires bearer token.                                   |
| Random caller reads fleet state         | UI/API should be put behind your reverse proxy / SSO. (v1: no built-in auth on reads.)  |
| Bad closure bricks device               | Health-check + magic-rollback to previous system generation.                            |

**Key management:** keep `sign.key` offline (hardware token, ops laptop, or a
sealed CI secret). The server never sees it. Rotating: generate a new key,
update `publicKey` on devices in a closure published with the old key, then
start signing with the new one.

## Non-goals (v1)

- The server does no Nix evaluation or building; CI does that.
- No replacement for your binary cache; use Attic, Cachix, S3, nix-serve.
- No per-device secrets (use sops-nix / agenix inside the closure).
- No web-based config editing; config lives in your flake repo.

## Development

```sh
nix develop
cargo build --workspace
cargo test --workspace
nix flake check    # runs the full NixOS VM test
```

## License

MIT OR Apache-2.0.
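
## Sketch: replay defense and rollback

The replay defense and health-check rollback described under "How updates apply" can be illustrated with a small, std-only Rust sketch. This is not the agent's actual code: `Outcome`, `is_newer`, and `poll_once` are hypothetical names, signature verification and the Nix commands are left out, and the choice not to record a revision that failed its health check is an assumption of this sketch.

```rust
// Hypothetical sketch; names and types are illustrative, not nix-ota's real API.

/// Result of one poll, loosely mirroring what the agent might report.
#[derive(Debug, PartialEq)]
enum Outcome {
    /// Manifest revision is not newer than the last applied one.
    Skipped,
    /// Switched to the new closure and the health check passed.
    Applied,
    /// Health check failed; the previous generation was restored.
    RolledBack,
}

/// Replay defense: accept only revisions strictly greater than the last applied.
fn is_newer(last_applied: u64, manifest_revision: u64) -> bool {
    manifest_revision > last_applied
}

/// Decide the outcome of one poll.
/// `health_ok` stands in for the exit status of the optional `healthCmd`.
fn poll_once(last_applied: &mut u64, manifest_revision: u64, health_ok: bool) -> Outcome {
    if !is_newer(*last_applied, manifest_revision) {
        return Outcome::Skipped;
    }
    // Signature verification, the cache copy, and the switch would happen here.
    if health_ok {
        // Assumption: the revision is persisted only after a healthy switch.
        *last_applied = manifest_revision;
        Outcome::Applied
    } else {
        Outcome::RolledBack
    }
}

fn main() {
    let mut last_applied = 41;
    println!("{:?}", poll_once(&mut last_applied, 41, true));
    println!("{:?}", poll_once(&mut last_applied, 42, false));
    println!("{:?}", poll_once(&mut last_applied, 42, true));
}
```

Under these assumptions a replayed manifest at revision 41 is skipped, a failing revision 42 is rolled back without advancing the stored revision, and a later healthy attempt at revision 42 is applied.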