Self-contained example under examples/ with full NixOS flakes for both sides of a deployment (control server + binary cache vs. an agent device), plus a README walking through the end-to-end install + first publish.
# Worked example: standing up nix-ota

This walks through deploying a real fleet with two boxes:

- **`ota.example.com`**: runs `nix-ota-server` (control plane) and
  `nix-serve` (binary cache). One machine, public DNS.
- **`fridge-007`**: a NixOS device that pulls updates. No flake, no
  Nix evaluation. Could be an RPi, a kiosk, an edge node, anything.

You drive everything from your laptop. The laptop holds the signing
key and runs `nix-ota publish` to ship updates.

```
laptop ──signed manifest──► ota.example.com ◄──poll── fridge-007
  │                         (server + cache)              │
  └────nix copy closure───────────────┘                   │
                                    └────nix copy closure─┘
```

---

## 0. One-time: generate the manifest signing key

On your laptop:

```sh
nix run git+https://linus.dyrehytten.dk/max/nix-ota#nix-ota -- keygen --out ~/.config/nix-ota/sign.key
# prints the public key; save it, you'll bake it into every device.
```
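
The `--out` path above sits under `~/.config/nix-ota`. Whether `keygen` creates that directory or restricts its permissions isn't stated here, so a cautious sketch is to prepare it yourself first. `prepare_key_dir` is a hypothetical helper name for this example, not part of nix-ota:

```sh
# prepare_key_dir DIR: create DIR with owner-only access and, if a
# sign.key is already inside, make it readable/writable by the owner alone.
prepare_key_dir() {
  mkdir -p "$1" && chmod 700 "$1" || return 1
  if [ -f "$1/sign.key" ]; then
    chmod 600 "$1/sign.key"
  fi
}

# Matches the directory used by the keygen command above.
prepare_key_dir "$HOME/.config/nix-ota"
```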

Keep `sign.key` somewhere you trust (password manager, hardware token,
or a sealed CI secret). The server **never sees this key**.

---

## 1. The server host

See [`server-host/`](./server-host/) for a complete flake.

What you need on the server:

- `services.nix-ota-server`: the control plane (HTTP API + dashboard)
- `services.nix-serve`: the binary cache (or use Attic / S3 / Cachix)
- A reverse proxy with TLS in front of both (nginx, Caddy, traefik...)
- A bearer token for publishes, stored in `/run/secrets/...` (sops-nix
  or agenix; the example uses a plain file for clarity)

Deploy it however you normally deploy a NixOS box (nixos-rebuild,
deploy-rs, colmena). Yes, you can use any of those *to deploy the
nix-ota server itself*; we're only replacing them for the fleet.

After it's up:

```sh
curl https://ota.example.com/healthz   # -> ok
curl https://ota.example.com/          # dashboard
```
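
Right after a deploy the service may need a few seconds to come up, so a single `curl` can fail spuriously. A small retry loop is one way around that. This sketch assumes, as the comment above suggests, that `/healthz` answers with the body `ok`; `wait_healthy` is a hypothetical helper name, not part of nix-ota:

```sh
# wait_healthy URL [TRIES] [DELAY]: poll URL until its body is exactly "ok".
# Returns 0 on the first healthy answer, 1 if every attempt fails.
wait_healthy() {
  url=$1
  tries=${2:-30}
  delay=${3:-2}
  i=0
  while [ "$i" -lt "$tries" ]; do
    # -f: treat HTTP errors as failures, -s: no progress output
    if [ "$(curl -fs "$url")" = "ok" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage:
#   wait_healthy https://ota.example.com/healthz || echo "server not healthy" >&2
```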

---

## 2. The device

See [`device-host/`](./device-host/) for a complete flake.

What goes on each device:

- `services.nix-ota-agent`: the polling agent (a single static binary
  driven by a systemd timer)
- The matching ed25519 **public key** (so the device rejects manifests
  not signed by your key)
- The binary cache's URL **and public key** (so Nix accepts store paths
  fetched from it)

You deploy this flake to the device **once**, manually. From then on
you never touch the device's config: subsequent updates ride on top
through `nix-ota publish`.

---

## 3. Publishing your first update

From your laptop:

```sh
# 1. Build the device's system closure.
nix build .#nixosConfigurations.fridge-007.config.system.build.toplevel

# 2. Push it to the cache.
nix copy --to 'https://ota.example.com/cache?secret-key=/path/to/cache.key' \
  ./result

# 3. Publish a signed manifest pointing at it.
nix run git+https://linus.dyrehytten.dk/max/nix-ota#nix-ota -- publish \
  --server https://ota.example.com \
  --token "$(cat ~/.config/nix-ota/publish-token)" \
  --key ~/.config/nix-ota/sign.key \
  --channel prod \
  --store-path "$(readlink -f ./result)" \
  --substituter https://ota.example.com/cache
```
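
If step 2 pushed nothing, devices would receive a manifest pointing at a path they can't fetch. One way to guard against that between steps 2 and 3 is to ask the cache whether it serves the path. The sketch below uses `nix path-info --store`, which needs the `nix-command` experimental feature; `in_cache` is a hypothetical helper name, and it only checks the top-level path, not every dependency in the closure:

```sh
# in_cache STORE_PATH CACHE_URL: succeed only if CACHE_URL serves STORE_PATH.
in_cache() {
  nix path-info --store "$2" "$1" >/dev/null 2>&1
}

# Usage, between steps 2 and 3 above:
#   in_cache "$(readlink -f ./result)" https://ota.example.com/cache \
#     || { echo "closure missing from cache" >&2; exit 1; }
```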

Within `interval` seconds (default 60), `fridge-007` polls, verifies
the signature, copies the closure from your cache, switches into it,
runs your health check, and checks in. Open
`https://ota.example.com/` to watch it happen.

Rolling back is just publishing the previous store path again: bump
the revision and ship it.
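
How nix-ota itself exposes earlier revisions isn't covered here, so one low-tech option is to keep a local log of what you published. The helpers below are hypothetical conventions for your laptop, not nix-ota features, and the log location is an assumption of this sketch:

```sh
# Where published store paths are logged; the directory must already exist.
PUBLISH_LOG="${PUBLISH_LOG:-$HOME/.config/nix-ota/published.log}"

# record_publish STORE_PATH: append the path that was just published.
record_publish() {
  printf '%s\n' "$1" >> "$PUBLISH_LOG"
}

# previous_publish: print the entry before the latest one.
# Fails if the log is missing or has fewer than two entries.
previous_publish() {
  [ -f "$PUBLISH_LOG" ] || return 1
  [ "$(wc -l < "$PUBLISH_LOG")" -ge 2 ] || return 1
  tail -n 2 "$PUBLISH_LOG" | head -n 1
}

# After a successful publish:
#   record_publish "$(readlink -f ./result)"
# To roll back, publish again with:
#   --store-path "$(previous_publish)"
```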

---

## What you do NOT need

- ❌ SSH access from the server to the devices.
- ❌ The flake on the device. Once the agent + initial config are
  installed, you can drop your nix-ota flake reference from the device
  entirely (subsequent updates carry it).
- ❌ Per-device builds. Build once, publish, every device on that channel
  picks it up.
- ❌ A Nix daemon talking to the control server. Devices talk to the
  *cache*; the control server only hands out signed pointers.

---

## File map

```
examples/
├── README.md              (this file)
├── server-host/
│   ├── flake.nix          full flake for the control-plane host
│   └── configuration.nix
└── device-host/
    ├── flake.nix          full flake for a device
    └── configuration.nix
```