Production Headscale terminates TLS for both the control plane (via the
TS2021 HTTP CONNECT upgrade endpoint) and the embedded DERP relay.
Without TLS, the daemon could only talk to plain-HTTP test stacks.
- New crate::tls module: shared TlsMode (Verify | InsecureSkipVerify)
+ tls_wrap helper. webpki roots in Verify mode; an explicit
ServerCertVerifier that accepts any cert in InsecureSkipVerify
(test-only).
- Cargo.toml: add tokio-rustls, webpki-roots, rustls-pemfile.
- noise/handshake: perform_handshake is now generic over the underlying
stream and takes an explicit `host_header` argument instead of using
`peer_addr`. Lets callers pass either a TcpStream or a TLS-wrapped
stream.
- noise/stream: NoiseStream<S> is generic over the underlying transport
with `S = TcpStream` as the default. The AsyncRead+AsyncWrite impls
forward to whatever S provides.
- control/client: ControlClient::connect detects `https://` in
coordination_url and TLS-wraps the TCP stream before the Noise
handshake. fetch_server_key now also TLS-wraps when needed. Both
honor the new derp_tls_insecure config flag (which is misnamed but
controls all TLS verification, not just DERP).
- derp/client: DerpClient::connect_with_tls accepts a TlsMode and uses
the shared tls::tls_wrap helper instead of duplicating it. The
client struct's inner Framed is now generic over a Box<dyn
DerpTransport> so it can hold either a plain or TLS-wrapped stream.
- daemon/lifecycle: derive the DERP URL scheme from coordination_url
(https → https) and pass derp_tls_insecure through.
- config.rs: new `derp_tls_insecure: bool` field on VpnConfig.
- src/vpn_cmds.rs: pass `derp_tls_insecure: false` for production.
Two bug fixes found while wiring this up:
- proxy/engine: bridge_connection used to set remote_done on any
smoltcp recv error, including the transient InvalidState that
smoltcp returns while a TCP socket is still in SynSent. That meant
the engine gave up on the connection before the WG handshake even
finished. Distinguish "not ready yet" (returns Ok(0)) from
"actually closed" (returns Err) inside tcp_recv, and only mark
remote_done on the latter.
- proxy/engine: the connection's "done" condition required
local_read_done, but most clients (curl, kubectl) keep their write
side open until they read EOF. The engine never closed its local
TCP, so clients sat in read_to_end forever. Drop the connection as
soon as the remote side has finished and we've drained its buffer
to the local socket — the local TcpStream drop closes the socket
and the client sees EOF.
`sunbeam connect` now fork-execs itself with a hidden `__vpn-daemon`
subcommand instead of running the daemon in-process. The user-facing
command spawns the child detached (stdio → log file, setsid for no
controlling TTY), polls the IPC socket until the daemon reaches
Running, prints a one-line status, and exits. The user gets back to
their shell immediately.
- src/cli.rs: `Connect { foreground }` instead of unit. Add hidden
`__vpn-daemon` Verb that the spawned child runs.
- src/vpn_cmds.rs: split into spawn_background_daemon (default path)
and run_daemon_foreground (used by both `connect --foreground` and
`__vpn-daemon`). Detached child uses pre_exec(setsid) and inherits
--context from the parent so it resolves the same VPN config.
Refuses to start if a daemon is already running on the control
socket; cleans up stale socket files. Switches the proxy bind from
16443 (sienna's existing SSH tunnel uses it) to 16579.
- sunbeam-net/src/daemon/lifecycle: add a SocketGuard RAII type so the
IPC control socket is unlinked when the daemon exits, regardless of
shutdown path. Otherwise `vpn status` after a clean disconnect would
see a stale socket and report an error.
End-to-end smoke test against the docker stack:
$ sunbeam connect
==> VPN daemon spawned (pid 90072, ...)
Connected (100.64.0.154, fd7a:115c:a1e0::9a) — 2 peers visible
$ sunbeam vpn status
VPN: running
addresses: 100.64.0.154, fd7a:115c:a1e0::9a
peers: 2
derp home: region 0
$ sunbeam disconnect
==> Asking VPN daemon to stop...
Daemon acknowledged shutdown.
$ sunbeam vpn status
VPN: not running
Adds the foreground VPN client commands. The daemon runs in-process
inside the CLI for the lifetime of `sunbeam connect` — no separate
background daemon yet, that can come later if needed.
- Cargo.toml: add sunbeam-net as a workspace dep, plus hostname/whoami
for building a per-machine netmap label like "sienna@laptop"
- src/config.rs: new `vpn-url` and `vpn-auth-key` fields on Context
- src/cli.rs: `Connect`, `Disconnect`, and `Vpn { Status }` verbs
- src/vpn_cmds.rs: command handlers
- cmd_connect reads VPN config from the active context, starts the
daemon at ~/.sunbeam/vpn, polls for Running, then blocks on ^C
before calling DaemonHandle::shutdown
- cmd_disconnect / cmd_vpn_status are placeholders that report based
on the control socket; actually talking to a backgrounded daemon
needs an IPC client (not yet exposed from sunbeam-net)
- src/workflows/mod.rs: `..Default::default()` on Context literals so
the new fields don't break the existing tests
Add the workspace crate that will host a pure Rust Headscale/Tailscale-
compatible VPN client. This first commit lands the crate skeleton plus
the leaf modules that the rest of the stack builds on:
- error: thiserror Error enum + Result alias
- config: VpnConfig
- keys: Curve25519 node/disco/wg key types with on-disk persistence
- proto/types: PascalCase serde wire types matching Tailscale's JSON
Replace hand-rolled OpenBao HTTP client with vaultrs 0.8.0, which
has official OpenBao support. BaoClient remains the public API so
callers are unchanged. KV patch uses raw HTTP since vaultrs doesn't
expose it yet.
WFE now populates execution pointer step_name from the workflow
definition, so print_summary shows actual step names instead of
"step-0", "step-1", etc.
Dispatch `sunbeam up`, `sunbeam seed`, `sunbeam verify`, and
`sunbeam bootstrap` through WFE workflows instead of monolithic
functions. Steps communicate via JSON workflow data and each
workflow is persisted in a per-context SQLite database.
- Parallel file uploads with --parallel flag (default 4)
- indicatif MultiProgress: overall bar with file count, speed, ETA
- Per-file spinner bars showing filename during upload
- Phase 1: walk tree + create folders sequentially
- Phase 2: upload files concurrently via semaphore
- Summary line on completion (files, bytes, time, speed)
- Fixed DriveFile/DriveFolder types to match actual API fields
- DriveClient now Clone for Arc sharing across tasks
SunbeamClient accessors are now async and resolve auth per-client:
- SSO bearer (get_token) for admin APIs, Matrix, La Suite, OpenSearch
- Gitea PAT (get_gitea_token) for VCS
- None for Prometheus, Loki, S3, LiveKit
Fixes client URLs to match deployed routes: hydra→hydra.{domain},
matrix→messages.{domain}, grafana→metrics.{domain},
prometheus→systemmetrics.{domain}, loki→systemlogs.{domain}.
Removes all ad-hoc token helpers from CLI modules (matrix_with_token,
os_client, people_client, etc). Every dispatch just calls
client.service().await?.
Slim binary that depends on sunbeam-sdk for all logic. Replaces 62
crate:: refs with sunbeam_sdk::. Tracing filter updated to include
sunbeam_sdk=info.
SunbeamError enum with typed variants (Kube, Config, Network, Secrets,
Build, Identity, ExternalTool, Io, Json, Yaml, Other) each mapping to
a process exit code. ResultExt trait replaces anyhow's .context().
main.rs initializes tracing-subscriber with RUST_LOG env filter and
routes all errors to exit codes via SunbeamError::exit_code().
Removes anyhow dependency.
services.rs:
- Pod status with unicode icons, grouped by namespace
- VSO sync status (VaultStaticSecret/VaultDynamicSecret via kube-rs DynamicObject)
- Log streaming via kube-rs log_stream + futures::AsyncBufReadExt
- Pod get in YAML/JSON format
- Rollout restart with namespace/service filtering
checks.rs:
- 11 health check functions (gitea, postgres, valkey, openbao, seaweedfs, kratos, hydra, people, livekit)
- AWS4-HMAC-SHA256 S3 auth header generation using sha2 + hmac
- Concurrent execution via tokio JoinSet
- mkcert root CA trust for local TLS
secrets.rs:
- Stub with cmd_seed/cmd_verify (requires live cluster for full impl)
users.rs:
- All 10 Kratos identity operations via reqwest + kubectl port-forward
- Welcome email via lettre SMTP through port-forwarded postfix
- Employee onboarding with auto-assigned ID, HR metadata
- Offboarding with Kratos + Hydra session revocation
gitea.rs:
- Bootstrap without Lima VM: admin password, org creation, OIDC auth source
- Gitea API via kubectl exec curl
images.rs:
- BuildEnv detection, buildctl build + push via port-forward
- Per-service builders for all 17 build targets
- Deploy rollout, node image pull, uv Dockerfile patching
- Mirror scaffolding (containerd operations marked TODO)
cluster.rs:
- Pure K8s cmd_up: cert-manager, linkerd, rcgen TLS certs, core service wait
- No Lima VM operations
manifests.rs:
- Full cmd_apply: kustomize build, two-pass convergence, ConfigMap restart detection
- Pre-apply cleanup, webhook wait, mkcert CA, tuwunel OAuth2 redirect patch
Test coverage: 142 tests across 14 modules (44 in checks, 27 in cli, 13 in images, 12 in tools, 12 in services, 11 in users, 10 in manifests, 9 in kube, 9 in cluster, 7 in update, 6 in gitea, 4 in openbao, 3 in output, 2 in config).
Phase 0 of Python-to-Rust CLI rewrite:
- Cargo.toml with all dependencies (kube-rs, reqwest, russh, rcgen, lettre, etc.)
- build.rs: downloads kustomize v5.8.1 + helm v4.1.0 at compile time, embeds as bytes, sets SUNBEAM_COMMIT from git
- src/main.rs: tokio main with anyhow error formatting
- src/cli.rs: full clap derive struct tree matching all Python argparse subcommands
- src/config.rs: SunbeamConfig serde struct, load/save ~/.sunbeam.json
- src/output.rs: step/ok/warn/table with exact Python format strings
- src/tools.rs: embedded kustomize+helm extraction to cache dir
- src/kube.rs: parse_target, domain_replace, context management
- src/manifests.rs: filter_by_namespace with full test coverage
- Stub modules for all remaining features (cluster, secrets, images, services, checks, gitea, users, update)
23 tests pass, cargo check clean.