Commit Graph

140 Commits

26013049ac refactor(cli): delete images.rs (build command removed) 2026-04-07 19:26:22 +01:00
0c55be8d13 refactor(sdk): remove lasuite from sunbeam-sdk 2026-04-07 19:26:11 +01:00
db925a54a6 refactor(cli): migrate gitea oidc and welcome mail off lasuite 2026-04-07 19:26:04 +01:00
7c64093342 refactor(cli): remove lasuite from workflows 2026-04-07 19:25:59 +01:00
bfbc391e7f refactor(cli): remove build command (replaced by wfe) 2026-04-07 19:25:52 +01:00
f7a6a4bf24 refactor(cli): remove lasuite namespace and services 2026-04-07 19:25:46 +01:00
65c8fb80e3 docs: service discovery labels migration guide
313-line walkthrough for adopting the `sunbeam.pt/*` label scheme on
existing manifests in sbbb. Documents the required labels, optional
annotations, virtual-service ConfigMap pattern, and the
multi-deployment grouping convention. Includes a complete table of
the 33 services with their target K8s resources and the values to put
on each. Teams onboarding new services can follow this without having
to read the registry source.
2026-04-07 17:52:50 +01:00
f1700efc7e feat(cli): service-oriented commands (deploy/secrets/shell)
Adds the user-facing half of the service registry refactor.

src/service_cmds.rs (new):
- cmd_deploy: resolves a service/category/namespace target via the
  registry, applies manifests for each unique namespace, then
  rollout-restarts the resolved deployments.
- cmd_secrets: looks up the service's `sunbeam.pt/kv-path`
  annotation, port-forwards to OpenBao, and either lists every key
  in the path (with values masked) or — given `get <key>` —
  prints the single field. Replaces a hand-rolled secret-fetching
  flow with one that's driven by the same registry as everything else.
- cmd_shell: drops into a shell on a service's pod. Special-cases
  `postgres` to spawn psql against the CNPG primary; everything else
  gets `/bin/sh` via kubectl exec.

src/services.rs:
- Drop the static `SERVICES_TO_RESTART` table and the static
  MANAGED_NS dependency. Both `cmd_status` and `cmd_restart` now ask
  the registry. The legacy `namespace` and `namespace/name` syntaxes
  still work as a fallback when the registry can't resolve the input,
  so existing muscle memory keeps working during the transition.
- The two static-table tests are removed (they tested the static
  tables that no longer exist); the icon helper test stays.

Together with the earlier `Verb::{Deploy,Secrets,Shell}` additions
in src/cli.rs, this completes the service-oriented command surface
for status / logs / restart / deploy / secrets / shell.
2026-04-07 17:52:41 +01:00
db97853f9c feat(sdk): dynamic service registry from K8s labels
Adds `sunbeam_sdk::registry`, the discovery layer that the new
service-oriented CLI commands use to resolve names like "hydra",
"auth", or "ory" into the right Kubernetes resources.

Instead of duplicating service definitions in Rust code, the registry
queries Deployments, StatefulSets, DaemonSets, and ConfigMaps that
carry the `sunbeam.pt/service` label and reads everything else from
labels and annotations:

- sunbeam.pt/service / sunbeam.pt/category — required, the primary keys
- sunbeam.pt/display-name — human-readable label for status output
- sunbeam.pt/kv-path — OpenBao KV v2 path (for `sunbeam secrets <svc>`)
- sunbeam.pt/db-user / sunbeam.pt/db-name — CNPG postgres credentials
- sunbeam.pt/build-target — buildkit target for `sunbeam build`
- sunbeam.pt/depends-on — comma-separated dependency names
- sunbeam.pt/health-check — pod-ready / cnpg / seal-status / HTTP path
- sunbeam.pt/virtual=true — for ConfigMap-only "external" services

`ServiceRegistry::resolve(input)` does name → category → namespace
matching in that order, so `sunbeam logs hydra`, `sunbeam restart auth`,
and `sunbeam status ory` all work uniformly.

Multi-deployment services (e.g. messages-{backend,mta-in,mta-out})
share a service label and the registry merges them into a single
ServiceDefinition with multiple `deployments`.

Includes 14 unit tests covering name/category/namespace resolution,
case-insensitivity, virtual services, and the empty registry case.
2026-04-07 17:52:26 +01:00
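The name → category → namespace resolution order above can be sketched as follows. This is a minimal illustration with invented map-backed lookups, not the actual `ServiceRegistry` type from sunbeam-sdk:

```rust
use std::collections::HashMap;

/// Hypothetical mini-registry for the sketch; the real registry is built
/// from labeled K8s resources, not hand-filled maps.
struct Registry {
    by_name: HashMap<String, String>,           // service name -> service id
    by_category: HashMap<String, Vec<String>>,  // category -> service ids
    by_namespace: HashMap<String, Vec<String>>, // namespace -> service ids
}

impl Registry {
    /// Try an exact (case-insensitive) service name first, then a category,
    /// then a namespace — mirroring the resolve order described above.
    fn resolve(&self, input: &str) -> Option<Vec<String>> {
        let key = input.to_lowercase();
        if let Some(svc) = self.by_name.get(&key) {
            return Some(vec![svc.clone()]);
        }
        if let Some(svcs) = self.by_category.get(&key) {
            return Some(svcs.clone());
        }
        self.by_namespace.get(&key).cloned()
    }

    /// Demo data so the sketch is self-contained.
    fn demo() -> Registry {
        let mut r = Registry {
            by_name: HashMap::new(),
            by_category: HashMap::new(),
            by_namespace: HashMap::new(),
        };
        r.by_name.insert("hydra".into(), "hydra".into());
        r.by_category
            .insert("auth".into(), vec!["hydra".into(), "kratos".into()]);
        r.by_namespace.insert(
            "ory".into(),
            vec!["hydra".into(), "kratos".into(), "keto".into()],
        );
        r
    }
}
```

With this shape, `resolve("Hydra")` hits the name table, `resolve("auth")` falls through to the category table, and `resolve("ory")` lands on the namespace table — so `logs`, `restart`, and `status` can all accept any of the three.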
5f97d063cb feat(seed): provision postgres role + KV slot for headscale
Adds `headscale` to the lists that drive the seed workflow so the
existing CNPG role/database creation and OpenBao KV path provisioning
pick up the new VPN coordination service alongside everything else:

- src/secrets.rs: PG_USERS list grows from 15 → 16 (test asserts the
  full ordered list, so it's updated to match)
- src/workflows/seed/steps/postgres.rs: pg_db_map adds headscale →
  headscale_db
- src/workflows/seed/definition.rs: bumps the role/db step count
  assertions from 15 → 16
- src/workflows/primitives/kv_service_configs.rs: new headscale entry
  with a single `api-key` field generated as `static:` (placeholder).
  The user runs `kubectl exec -n vpn deploy/headscale -- headscale
  apikeys create` and pastes the result into vault before calling
  `sunbeam vpn create-key`. Bumps service_count test from 18 → 19.
- src/constants.rs: add `vpn` to MANAGED_NS so the legacy namespace
  list includes the new namespace.
2026-04-07 17:35:17 +01:00
b5795fd97b feat(cli): sunbeam vpn create-key + vpn-tls-insecure config flag
`sunbeam vpn create-key` calls Headscale's REST API at
`/api/v1/preauthkey` to mint a new pre-auth key for onboarding a new
client. Reads `vpn-url` and `vpn-api-key` from the active context;
the user generates the API key once via `headscale apikeys create`
on the cluster and stores it in their context config.

Flags:
- --user <name>      Headscale user the key belongs to
- --reusable         allow multiple registrations with the same key
- --ephemeral        auto-delete the node when its map stream drops
- --expiration <dur> human-friendly lifetime ("30d", "1h", "2w")

Also adds a `vpn-tls-insecure` context flag that controls TLS
verification across the whole VPN integration: it's now used by both
the daemon (for the Noise control connection + DERP relay) and the
new create-key REST client. Test stacks with self-signed certs set
this to true; production stacks leave it false.

Verified end-to-end against the docker test stack:

  $ sunbeam vpn create-key --user test --reusable --expiration 1h
  ==> Creating pre-auth key on https://localhost:8443
      Pre-auth key for user 'test':
  ebcd77f51bf30ef373c9070382b834859935797a90c2647f

  Add it to a context with:
    sunbeam config set --context <ctx> vpn-auth-key ebcd77f5...
2026-04-07 15:32:44 +01:00
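The `--expiration` flag's human-friendly lifetimes ("30d", "1h", "2w") imply a small suffix parser. A hedged sketch — the actual flag handling in the CLI may differ in units and error reporting:

```rust
/// Parse "30d"/"1h"/"2w"-style durations into seconds. Illustrative only;
/// the supported suffixes here (s/m/h/d/w) are an assumption.
fn parse_duration(s: &str) -> Option<u64> {
    let s = s.trim();
    // Split off the one-character unit suffix; empty input yields None.
    let (num, unit) = s.split_at(s.len().checked_sub(1)?);
    let n: u64 = num.parse().ok()?;
    let mult = match unit {
        "s" => 1,
        "m" => 60,
        "h" => 3_600,
        "d" => 86_400,
        "w" => 604_800,
        _ => return None,
    };
    Some(n * mult)
}
```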
94fb6155f7 test(net): TLS-enabled docker stack and active e2e test
The docker-compose stack now serves Headscale (and its embedded DERP)
over TLS on port 8443 with a self-signed cert covering localhost,
127.0.0.1, and the docker-network hostname `headscale`. Tailscale
peers trust the cert via SSL_CERT_FILE; our test daemon uses
`derp_tls_insecure: true` (gated on the SUNBEAM_NET_TEST_DERP_INSECURE
env var) since pinning a self-signed root in tests is more trouble
than it's worth.

With TLS DERP working, the previously-ignored
`test_e2e_tcp_through_tunnel` test now passes: the daemon spawns,
registers, completes a Noise handshake over TLS, opens a TLS DERP
relay session, runs a real WireGuard handshake with peer-a (verified
via boringtun ↔ tailscale interop), and TCP-tunnels an HTTP GET
through smoltcp ↔ engine ↔ proxy ↔ test client. The 191-byte echo
response round-trips and the test asserts on its body.

- tests/config/headscale.yaml: tls_cert_path + tls_key_path, listen on
  8443, server_url=https://headscale:8443
- tests/config/test-cert.pem + test-key.pem: 365-day self-signed RSA
  cert with SAN DNS:localhost, DNS:headscale, IP:127.0.0.1
- tests/docker-compose.yml: mount certs into headscale + both peers,
  set SSL_CERT_FILE on the peers, expose 8443 instead of 8080
- tests/run.sh: switch to https://localhost:8443, set
  SUNBEAM_NET_TEST_DERP_INSECURE=1
- tests/integration.rs: drop the #[ignore] on test_e2e_tcp_through_tunnel,
  read derp_tls_insecure from env in all four test configs
2026-04-07 15:29:03 +01:00
2624a13952 feat(net): TLS support for HTTPS coordination URLs and DERP relays
Production Headscale terminates TLS for both the control plane (via the
TS2021 HTTP CONNECT upgrade endpoint) and the embedded DERP relay.
Without TLS, the daemon could only talk to plain-HTTP test stacks.

- New crate::tls module: shared TlsMode (Verify | InsecureSkipVerify)
  + tls_wrap helper. webpki roots in Verify mode; an explicit
  ServerCertVerifier that accepts any cert in InsecureSkipVerify
  (test-only).
- Cargo.toml: add tokio-rustls, webpki-roots, rustls-pemfile.
- noise/handshake: perform_handshake is now generic over the underlying
  stream and takes an explicit `host_header` argument instead of using
  `peer_addr`. Lets callers pass either a TcpStream or a TLS-wrapped
  stream.
- noise/stream: NoiseStream<S> is generic over the underlying transport
  with `S = TcpStream` as the default. The AsyncRead+AsyncWrite impls
  forward to whatever S provides.
- control/client: ControlClient::connect detects `https://` in
  coordination_url and TLS-wraps the TCP stream before the Noise
  handshake. fetch_server_key now also TLS-wraps when needed. Both
  honor the new derp_tls_insecure config flag (which is misnamed but
  controls all TLS verification, not just DERP).
- derp/client: DerpClient::connect_with_tls accepts a TlsMode and uses
  the shared tls::tls_wrap helper instead of duplicating it. The
  client struct's inner Framed is now generic over a Box<dyn
  DerpTransport> so it can hold either a plain or TLS-wrapped stream.
- daemon/lifecycle: derive the DERP URL scheme from coordination_url
  (an https:// coordination URL yields an https DERP URL) and pass
  derp_tls_insecure through.
- config.rs: new `derp_tls_insecure: bool` field on VpnConfig.
- src/vpn_cmds.rs: pass `derp_tls_insecure: false` for production.

Two bug fixes found while wiring this up:

- proxy/engine: bridge_connection used to set remote_done on any
  smoltcp recv error, including the transient InvalidState that
  smoltcp returns while a TCP socket is still in SynSent. That meant
  the engine gave up on the connection before the WG handshake even
  finished. Distinguish "not ready yet" (returns Ok(0)) from
  "actually closed" (returns Err) inside tcp_recv, and only mark
  remote_done on the latter.
- proxy/engine: the connection's "done" condition required
  local_read_done, but most clients (curl, kubectl) keep their write
  side open until they read EOF. The engine never closed its local
  TCP, so clients sat in read_to_end forever. Drop the connection as
  soon as the remote side has finished and we've drained its buffer
  to the local socket — the local TcpStream drop closes the socket
  and the client sees EOF.
2026-04-07 15:28:44 +01:00
e934eb45dc feat(net): derive cluster API target from netmap by hostname
Adds an optional `cluster_api_host` field to VpnConfig. When set, the
daemon resolves it against the netmap's peer list once the first
netmap arrives and uses that peer's tailnet IP as the proxy backend,
overriding the static `cluster_api_addr`. Falls back to the static
addr if the hostname doesn't match any peer.

The resolver tries hostname first, then peer name (FQDN), then a
prefix match against name. Picks v4 over v6 from the peer's address
list.

- sunbeam-net/src/config.rs: new `cluster_api_host: Option<String>`
- sunbeam-net/src/daemon/lifecycle.rs: resolve_peer_ip helper +
  resolution at proxy bind time
- sunbeam-net/tests/integration.rs: pass cluster_api_host: None in
  the existing VpnConfig literals
- src/config.rs: new context field `vpn-cluster-host`
- src/vpn_cmds.rs: thread it from context → VpnConfig
2026-04-07 15:00:30 +01:00
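The resolver's matching order (hostname, then FQDN name, then name prefix, preferring v4) can be sketched like this. The `Peer` shape is invented for the example; the real netmap peer type differs:

```rust
use std::net::IpAddr;

/// Invented peer shape for the sketch.
struct Peer {
    hostname: String,
    name: String, // FQDN as it appears in the netmap
    addresses: Vec<IpAddr>,
}

/// Hostname first, then exact FQDN name, then a prefix match on name;
/// prefer an IPv4 address over IPv6 — the order described above.
fn resolve_peer_ip(peers: &[Peer], host: &str) -> Option<IpAddr> {
    let peer = peers
        .iter()
        .find(|p| p.hostname == host)
        .or_else(|| peers.iter().find(|p| p.name == host))
        .or_else(|| peers.iter().find(|p| p.name.starts_with(host)))?;
    peer.addresses
        .iter()
        .find(|a| a.is_ipv4())
        .or_else(|| peer.addresses.first())
        .copied()
}
```

If no peer matches, the caller falls back to the static `cluster_api_addr`, per the commit.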
27a6f4377c feat(cli): background the VPN daemon with re-exec + clean shutdown
`sunbeam connect` now fork-execs itself with a hidden `__vpn-daemon`
subcommand instead of running the daemon in-process. The user-facing
command spawns the child detached (stdio → log file, setsid for no
controlling TTY), polls the IPC socket until the daemon reaches
Running, prints a one-line status, and exits. The user gets back to
their shell immediately.

- src/cli.rs: `Connect { foreground }` instead of unit. Add hidden
  `__vpn-daemon` Verb that the spawned child runs.
- src/vpn_cmds.rs: split into spawn_background_daemon (default path)
  and run_daemon_foreground (used by both `connect --foreground` and
  `__vpn-daemon`). Detached child uses pre_exec(setsid) and inherits
  --context from the parent so it resolves the same VPN config.
  Refuses to start if a daemon is already running on the control
  socket; cleans up stale socket files. Switches the proxy bind from
  16443 (sienna's existing SSH tunnel uses it) to 16579.
- sunbeam-net/src/daemon/lifecycle: add a SocketGuard RAII type so the
  IPC control socket is unlinked when the daemon exits, regardless of
  shutdown path. Otherwise `vpn status` after a clean disconnect would
  see a stale socket and report an error.

End-to-end smoke test against the docker stack:
  $ sunbeam connect
  ==> VPN daemon spawned (pid 90072, ...)
      Connected (100.64.0.154, fd7a:115c:a1e0::9a) — 2 peers visible
  $ sunbeam vpn status
  VPN: running
    addresses: 100.64.0.154, fd7a:115c:a1e0::9a
    peers: 2
    derp home: region 0
  $ sunbeam disconnect
  ==> Asking VPN daemon to stop...
      Daemon acknowledged shutdown.
  $ sunbeam vpn status
  VPN: not running
2026-04-07 14:57:15 +01:00
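The SocketGuard idea — unlink the IPC socket path on Drop so every shutdown path cleans up — can be sketched in a few lines. The field names here are assumed, not the real sunbeam-net type:

```rust
use std::path::PathBuf;

/// RAII guard: removes the control-socket file when it goes out of scope,
/// regardless of which shutdown path ran.
struct SocketGuard {
    path: PathBuf,
}

impl Drop for SocketGuard {
    fn drop(&mut self) {
        // Ignore errors: the file may already be gone on a racing shutdown.
        let _ = std::fs::remove_file(&self.path);
    }
}
```

Because Drop runs on every exit from the scope (normal return, `?`, panic unwind), a stale socket can no longer survive a clean disconnect and confuse `vpn status`.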
7019937f6f feat(net): real IPC client + working remote shutdown
DaemonHandle's shutdown_tx (oneshot) is replaced with a CancellationToken
shared between the daemon loop and the IPC server. The token is the
single source of truth for "should we shut down" — `DaemonHandle::shutdown`
cancels it, and an IPC `Stop` request also cancels it.

- daemon/state: store the CancellationToken on DaemonHandle and clone it
  on Clone (so cached IPC handles can still trigger shutdown).
- daemon/ipc: IpcServer takes a daemon_shutdown token; `Stop` now cancels
  it instead of returning Ok and doing nothing. Add IpcClient with
  `request`, `status`, and `stop` methods so the CLI can drive a
  backgrounded daemon over the Unix socket.
- daemon/lifecycle: thread the token through run_daemon_loop and
  run_session, pass a clone to IpcServer::new.
- lib.rs: re-export IpcClient/IpcCommand/IpcResponse so callers don't
  have to reach into the daemon module.
- src/vpn_cmds.rs: `sunbeam disconnect` now actually talks to the daemon
  via IpcClient::stop, and `sunbeam vpn status` queries IpcClient::status
  and prints addresses + peer count + DERP home.
2026-04-07 14:46:47 +01:00
a57246fd9f feat(cli): wire sunbeam-net into sunbeam connect/disconnect/vpn
Adds the foreground VPN client commands. The daemon runs in-process
inside the CLI for the lifetime of `sunbeam connect` — no separate
background daemon yet; that can come later if needed.

- Cargo.toml: add sunbeam-net as a workspace dep, plus hostname/whoami
  for building a per-machine netmap label like "sienna@laptop"
- src/config.rs: new `vpn-url` and `vpn-auth-key` fields on Context
- src/cli.rs: `Connect`, `Disconnect`, and `Vpn { Status }` verbs
- src/vpn_cmds.rs: command handlers
  - cmd_connect reads VPN config from the active context, starts the
    daemon at ~/.sunbeam/vpn, polls for Running, then blocks on ^C
    before calling DaemonHandle::shutdown
  - cmd_disconnect / cmd_vpn_status are placeholders that report based
    on the control socket; actually talking to a backgrounded daemon
    needs an IPC client (not yet exposed from sunbeam-net)
- src/workflows/mod.rs: `..Default::default()` on Context literals so
  the new fields don't break the existing tests
2026-04-07 14:39:40 +01:00
f1668682b7 test(net): TUN-mode docker stack and ignored e2e test
- docker-compose.yml: run peer-a and peer-b with TS_USERSPACE=false +
  /dev/net/tun device + cap_add. Pin peer-a's WG listen port to 41641
  via TS_TAILSCALED_EXTRA_ARGS and publish it to the host so direct
  UDP from outside docker has somewhere to land.
- run.sh: use an ephemeral pre-auth key for the test client so
  Headscale auto-deletes the test node when its map stream drops
  (instead of accumulating hundreds of stale entries that eventually
  slow netmap propagation to a crawl). Disable shields-up on both
  peers so the kernel firewall doesn't drop inbound tailnet TCP. Tweak
  the JSON key extraction to handle pretty-printed output.
- integration.rs: add `test_e2e_tcp_through_tunnel` that brings up
  the daemon, dials peer-a's echo server through the proxy, and
  asserts the echo body comes back. Currently `#[ignore]`d — the
  docker stack runs Headscale over plain HTTP, but Tailscale's client
  unconditionally tries TLS to DERP relays ("tls: first record does
  not look like a TLS handshake"), so peer-a can never receive
  packets we forward via the relay. Unblocking needs either TLS
  termination on the docker DERP or running the test inside the same
  docker network as peer-a. Test stays in the tree because everything
  it tests up to the read timeout is real verified behavior.
2026-04-07 14:33:59 +01:00
dca8c3b643 fix(net): protocol fixes for Tailscale-compatible peer reachability
A pile of correctness bugs that all stopped real Tailscale peers from
being able to send WireGuard packets back to us. Found while building
out the e2e test against the docker-compose stack.

1. WireGuard static key was wrong (lifecycle.rs)
   We were initializing the WgTunnel with `keys.wg_private`, a separate
   x25519 key from the one Tailscale advertises in netmaps. Peers know
   us by `node_public` and compute mac1 against it; signing handshakes
   with a different private key meant every init we sent was silently
   dropped. Use `keys.node_private` instead — node_key IS the WG static
   key in Tailscale.

2. DERP relay couldn't route packets to us (derp/client.rs)
   Our DerpClient was sealing the ClientInfo frame with a fresh
   ephemeral NaCl keypair and putting the ephemeral public in the frame
   prefix. Tailscale's protocol expects the *long-term* node public key
   in the prefix — that's how the relay knows where to forward packets
   addressed to our node_key. With the ephemeral key, the relay
   accepted the connection but never delivered our peers' responses.
   Now seal with the long-term node key.

3. Headscale never persisted our DiscoKey (proto/types.rs, control/*)
   The streaming /machine/map handler in Headscale ≥ capVer 68 doesn't
   update DiscoKey on the node record — only the "Lite endpoint update"
   path does, gated on Stream:false + OmitPeers:true + ReadOnly:false.
   Without DiscoKey our nodes appeared in `headscale nodes list` with
   `discokey:000…` and never propagated into peer netmaps. Add the
   DiscoKey field to RegisterRequest, add OmitPeers/ReadOnly fields to
   MapRequest, and call a new `lite_update` between register and the
   streaming map. Also add `post_json_no_response` for endpoints that
   reply with an empty body.

4. EncapAction is now a struct instead of an enum (wg/tunnel.rs)
   Routing was forced to either UDP or DERP. With a peer whose
   advertised UDP endpoint is on an unreachable RFC1918 network (e.g.
   docker bridge IPs), we'd send via UDP, get nothing, and never fall
   back. Send over every available transport — receivers dedupe via
   the WireGuard replay window — and let dispatch_encap forward each
   populated arm to its respective channel.

5. Drop the dead PacketRouter (wg/router.rs)
   Skeleton from an earlier design that never got wired up; it's been
   accumulating dead-code warnings.
2026-04-07 14:33:43 +01:00
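Fix 4's enum→struct change is the interesting one structurally: instead of choosing one transport, every populated arm is sent and receivers dedupe via the WireGuard replay window. A minimal sketch (field and function names assumed):

```rust
use std::net::SocketAddr;

/// Was `enum EncapAction { SendUdp(..), SendDerp(..) }`; as a struct, both
/// arms can be populated at once.
#[derive(Default)]
struct EncapAction {
    udp: Option<(SocketAddr, Vec<u8>)>,
    derp: Option<Vec<u8>>,
}

/// Forward each populated arm to its respective channel (Vecs stand in for
/// the real mpsc senders in this sketch).
fn dispatch_encap(
    action: EncapAction,
    udp_out: &mut Vec<(SocketAddr, Vec<u8>)>,
    derp_out: &mut Vec<Vec<u8>>,
) {
    if let Some(pkt) = action.udp {
        udp_out.push(pkt);
    }
    if let Some(pkt) = action.derp {
        derp_out.push(pkt);
    }
}
```

Sending on both paths is safe precisely because WireGuard's replay protection discards the duplicate, so an unreachable RFC1918 UDP endpoint no longer strands the peer.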
85d34bb035 feat(net): add UDP transport for direct peer connections
DERP works for everything but adds relay latency. Add a parallel UDP
transport so peers with reachable endpoints can talk directly:

- wg/tunnel: track each peer's local boringtun index in PeerTunnel and
  expose find_peer_by_local_index / find_peer_by_endpoint lookups
- daemon/lifecycle: bind a UdpSocket on 0.0.0.0:0 alongside DERP, run
  the recv loop on a clone of an Arc<UdpSocket> so send and recv can
  proceed concurrently
- run_wg_loop: new udp_in_rx select arm. For inbound UDP we identify
  the source peer by parsing the WireGuard receiver_index out of the
  packet header (msg types 2/3/4) and falling back to source-address
  matching for type-1 handshake initiations
- dispatch_encap: SendUdp now actually forwards via the UDP channel

UDP failure is non-fatal — DERP can carry traffic alone if the bind
fails or packets are dropped.
2026-04-07 13:48:59 +01:00
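Identifying the source peer by receiver_index relies on the WireGuard header layout: handshake responses (type 2) carry sender then receiver index, while cookie replies (3) and data packets (4) carry the receiver index directly after the 4-byte type field, all little-endian. A sketch of that parse (not the actual run_wg_loop code):

```rust
/// Extract the WireGuard receiver_index from a packet, per the wire format.
/// Type-1 handshake initiations have no receiver index, so they fall back
/// to source-address matching (not shown here).
fn receiver_index(pkt: &[u8]) -> Option<u32> {
    let off = match *pkt.first()? {
        2 => 8,           // handshake response: sender(4..8), receiver(8..12)
        3 | 4 => 4,       // cookie reply / data: receiver(4..8)
        _ => return None, // type 1 or unknown
    };
    let bytes: [u8; 4] = pkt.get(off..off + 4)?.try_into().ok()?;
    Some(u32::from_le_bytes(bytes))
}
```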
bea8a308da test(net): add integration test harness against Headscale
Spins up Headscale 0.23 (with embedded DERP) plus two Tailscale peers
in docker compose, generates pre-auth keys, and runs three integration
tests behind the `integration` feature:

- test_register_and_receive_netmap: full TS2021 → register → first
  netmap fetch
- test_proxy_listener_accepts: starts the daemon and waits for it to
  reach the Running state
- test_daemon_lifecycle: full lifecycle including DERP connect, then
  clean shutdown via the DaemonHandle

Run with `sunbeam-net/tests/run.sh` (handles compose up/down + auth
key provisioning) or manually via cargo nextest with the env vars
SUNBEAM_NET_TEST_AUTH_KEY and SUNBEAM_NET_TEST_COORD_URL set.
2026-04-07 13:42:46 +01:00
9750d4e0b3 feat(net): add VPN daemon lifecycle, state, and IPC
The daemon orchestrates everything: it owns reconnection backoff, the
WireGuard tunnel, the smoltcp engine, the DERP relay loop, the local
TCP proxy, and a Unix-socket IPC server for status queries.

- daemon/state: DaemonStatus state machine + DaemonHandle for shutdown
  signaling and live status access
- daemon/ipc: newline-delimited JSON Unix socket server (Status,
  Disconnect, Peers requests)
- daemon/lifecycle: VpnDaemon::start spawns run_daemon_loop, which pins
  a session future and selects against shutdown_rx so shutdown breaks
  out cleanly. run_session brings up the full pipeline:
  control client → register → map stream → wg tunnel → engine →
  proxy listener → wg encap/decap loop → DERP relay → IPC server.

DERP transport: when the netmap doesn't surface a usable DERP endpoint
(Headscale's embedded relay returns host_name="headscale", port=0),
fall back to deriving host:port from coordination_url. WG packets to
SendDerp peers go via a dedicated derp_out channel; inbound DERP frames
flow back through derp_in into the decap arm, which forwards Packet
results to the engine and Response results back to derp_out for the
handshake exchange.
2026-04-07 13:42:36 +01:00
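The DERP fallback — deriving host:port from coordination_url when the netmap returns `host_name="headscale", port=0` — amounts to splitting the URL authority. A hedged sketch without a URL crate (does not handle IPv6 literals; the real code may differ):

```rust
/// Derive a DERP (host, port) from the coordination URL when the netmap
/// doesn't surface a usable endpoint. Default ports are an assumption.
fn derp_fallback(coordination_url: &str) -> Option<(String, u16)> {
    let (scheme, rest) = coordination_url.split_once("://")?;
    let authority = rest.split('/').next()?;
    match authority.rsplit_once(':') {
        Some((host, port)) => Some((host.to_string(), port.parse().ok()?)),
        None => {
            let default = if scheme == "https" { 443 } else { 80 };
            Some((authority.to_string(), default))
        }
    }
}
```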
f903c1a073 feat(net): add network engine and TCP proxy
- proxy/engine: NetworkEngine that owns the smoltcp VirtualNetwork and
  bridges async TCP streams to virtual sockets via a 5ms poll loop.
  Each ProxyConnection holds the local TcpStream + smoltcp socket
  handle and shuttles data between them with try_read/try_write so the
  engine never blocks.
- proxy/tcp: skeleton TcpProxy listener (currently unused; the daemon
  inlines its own listener that hands off to the engine via mpsc)
2026-04-07 13:42:15 +01:00
d9d0d64236 feat(net): add control protocol (register + map stream)
- control/client: TS2021 connection setup — TCP, HTTP CONNECT-style
  upgrade to /ts2021, full Noise IK handshake via NoiseStream, then
  HTTP/2 client handshake on top via the h2 crate
- control/register: POST /machine/register with pre-auth key, PascalCase
  JSON serde matching Tailscale's wire format
- control/netmap: streaming MapStream that reads length-prefixed JSON
  messages from POST /machine/map, classifies them into Full/Delta/
  PeersChanged/PeersRemoved/KeepAlive, and transparently zstd-decodes
  by detecting the 0x28 0xB5 0x2F 0xFD magic (Headscale only compresses
  if the client opts in)
2026-04-07 13:41:58 +01:00
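The transparent zstd detection in the map stream is just a sniff of the standard Zstandard frame magic (0x28 0xB5 0x2F 0xFD, i.e. little-endian 0xFD2FB528) before attempting JSON parse:

```rust
/// True if the buffer starts with the zstd frame magic; plain JSON map
/// messages start with '{' and never match.
fn is_zstd(frame: &[u8]) -> bool {
    frame.len() >= 4 && frame[..4] == [0x28, 0xB5, 0x2F, 0xFD]
}
```

Sniffing per-message keeps the client correct whether or not Headscale honors the compression opt-in.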
0fe55d2bf6 feat(net): add WireGuard tunnel and smoltcp virtual network
- wg/tunnel: per-peer boringtun Tunn management with peer table sync
  from netmap (add/remove/update endpoints, allowed_ips, DERP region)
  and encapsulate/decapsulate/tick that route to UDP or DERP
- wg/socket: smoltcp Interface backed by an mpsc-channel Device that
  bridges sync poll-based smoltcp with async tokio mpsc channels
- wg/router: skeleton PacketRouter (currently unused; reserved for the
  unified UDP/DERP ingress path)
2026-04-07 13:41:43 +01:00
76ab2c1a8e feat(net): add DERP relay client
DERP is Tailscale's TCP relay protocol for peers that can't establish a
direct UDP path. Add the standalone client:

- derp/framing: 5-byte frame codec (1-byte type + 4-byte BE length)
- derp/client: HTTP /derp upgrade, Tailscale's NaCl SealedBox handshake
  (ServerKey → ClientInfo → ServerInfo → NotePreferred), and
  send_packet/recv_packet for forwarding WireGuard datagrams

Includes the 8-byte DERP\xf0\x9f\x94\x91 magic prefix in the ServerKey
payload and reads the HTTP upgrade response one byte at a time so the
inline first frame isn't swallowed by a buffered reader.
2026-04-07 13:41:17 +01:00
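The 5-byte frame codec described above (1-byte type + 4-byte big-endian length, then payload) can be sketched directly; this mirrors the stated layout, not the actual derp/framing module:

```rust
/// Encode one DERP frame: type byte, BE u32 payload length, payload.
fn encode_frame(frame_type: u8, payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(5 + payload.len());
    out.push(frame_type);
    out.extend_from_slice(&(payload.len() as u32).to_be_bytes());
    out.extend_from_slice(payload);
    out
}

/// Decode (type, payload) if `buf` holds at least one complete frame.
fn decode_frame(buf: &[u8]) -> Option<(u8, &[u8])> {
    if buf.len() < 5 {
        return None;
    }
    let len = u32::from_be_bytes(buf[1..5].try_into().ok()?) as usize;
    buf.get(5..5 + len).map(|payload| (buf[0], payload))
}
```

The byte-at-a-time read of the HTTP upgrade response mentioned above matters precisely because a buffered reader would consume part of the first such frame along with the headers.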
91cef0a730 feat(net): add Noise IK + HTTP/2 stream layer
Tailscale's TS2021 protocol layers HTTP/2 over an encrypted Noise IK
channel reached via HTTP CONNECT-style upgrade. Add the lower half:

- noise/handshake: hand-rolled Noise_IK_25519_ChaChaPoly_BLAKE2s
  initiator with HKDF + ChaCha20-Poly1305 (no snow dependency)
- noise/framing: 3-byte frame codec (1-byte type + 2-byte BE length)
- noise/stream: NoiseStream implementing AsyncRead + AsyncWrite over
  the framed channel so the h2 crate can sit on top
2026-04-07 13:41:01 +01:00
13539e6e85 feat(net): scaffold sunbeam-net crate with foundations
Add the workspace crate that will host a pure Rust Headscale/Tailscale-
compatible VPN client. This first commit lands the crate skeleton plus
the leaf modules that the rest of the stack builds on:

- error: thiserror Error enum + Result alias
- config: VpnConfig
- keys: Curve25519 node/disco/wg key types with on-disk persistence
- proto/types: PascalCase serde wire types matching Tailscale's JSON
2026-04-07 13:40:27 +01:00
cc2c3f7a3b refactor(openbao): migrate to vaultrs client library
Replace hand-rolled OpenBao HTTP client with vaultrs 0.8.0, which
has official OpenBao support. BaoClient remains the public API so
callers are unchanged. KV patch uses raw HTTP since vaultrs doesn't
expose it yet.
2026-04-05 22:34:41 +01:00
971810433c fix(openbao): fix init response field name for keys_base64
OpenBao returns `keys_base64` not `unseal_keys_b64`. Added serde
alias to accept both field names for compatibility.
2026-04-05 20:41:54 +01:00
dce1cec6ac fix(openbao): create placeholder secret before waiting for pod
On a clean cluster, the OpenBao pod can't start because it mounts
the openbao-keys secret as a volume, but that secret doesn't exist
until init runs. Create a placeholder secret in WaitPodRunning so
the pod can mount it and start. InitOrUnsealOpenBao overwrites it
with real values during initialization.
2026-04-05 20:33:19 +01:00
70b1f84caa fix(config): delete legacy ~/.sunbeam.json after migration
The migration from ~/.sunbeam.json to ~/.sunbeam/config.json
copied but never removed the legacy file, which could cause
confusion with older binaries still writing to the old path.
2026-04-05 20:20:38 +01:00
0c7f1543f5 fix(wfe): update to 1.6.3 for step name display in print_summary
WFE now populates execution pointer step_name from the workflow
definition, so print_summary shows actual step names instead of
"step-0", "step-1", etc.
2026-04-05 20:09:32 +01:00
ff297b61b5 chore: remove pre-WFE monolithic functions
Remove ~1400 lines of dead code now replaced by WFE primitives:
- cluster.rs: cmd_up, ensure_cert_manager, ensure_tls_cert/secret,
  wait_for_core, print_urls, secrets_dir, CERT_MANAGER_URL
- secrets.rs: cmd_seed, cmd_verify, seed_openbao,
  seed_kratos_admin_identity, SeedResult, delete_crd,
  delete_k8s_secret, kubectl_jsonpath
2026-04-05 19:44:23 +01:00
69b0ca4871 chore: remove vendored dependencies and publish script
Dependencies are now sourced from the sunbeam registry.
2026-04-05 19:15:14 +01:00
73e72550a7 refactor(manifests): extract OpenSearch ML into parallel WFE step
Move ensure_opensearch_ml and inject_opensearch_model_id out of
cmd_apply post-hooks into dedicated WFE steps that run in a
parallel branch alongside rollout waits. The ML model download
(10+ min on first run) no longer blocks the rest of the pipeline.
2026-04-05 18:28:21 +01:00
aa19590c73 fix(secrets): add max retries and backoff to port-forward loop
The port-forward background task retried infinitely on 500 errors
when the target pod wasn't ready. Add a 30-attempt limit with 2s
backoff between retries so the step eventually fails instead of
spinning forever.
2026-04-05 18:27:08 +01:00
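The bounded-retry shape described above — N attempts with a fixed backoff so the step fails instead of spinning — can be sketched generically (names and signature invented for the example):

```rust
use std::time::Duration;

/// Retry `op` up to `max_attempts` times, sleeping `backoff` between
/// failures. Returns the last error if every attempt fails.
/// Panics if max_attempts is 0.
fn retry<T, E>(
    max_attempts: u32,
    backoff: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut last_err = None;
    for attempt in 0..max_attempts {
        match op() {
            Ok(v) => return Ok(v),
            Err(e) => last_err = Some(e),
        }
        if attempt + 1 < max_attempts {
            std::thread::sleep(backoff);
        }
    }
    Err(last_err.expect("max_attempts must be > 0"))
}
```

With the commit's numbers that would be `retry(30, Duration::from_secs(2), …)` around the port-forward attempt.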
3cfa0fe755 refactor(wfe): decompose steps into atomic config-driven primitives
Replace big-bag steps with 10 atomic primitives that each do one
thing and read config from step_config:

- ApplyManifest (replaces 12 identical apply structs)
- WaitForRollout (replaces WaitForCore loop)
- CreatePGRole, CreatePGDatabase (replaces EnsurePGRolesAndDatabases)
- EnsureNamespace, CreateK8sSecret (replaces CreateK8sSecrets)
- SeedKVPath, WriteKVPath, CollectCredentials (replaces SeedAllKVPaths + WriteDirtyKVPaths)
- EnableVaultAuth, WriteVaultAuthConfig, WriteVaultPolicy, WriteVaultRole (replaces ConfigureKubernetesAuth)

Workflow definitions now use parallel branches for independent
operations (infra, KV seeding, PG roles, platform manifests,
K8s secrets, rollout waits).
2026-04-05 18:23:36 +01:00
9cd3c641da feat(wfe): integrate workflow engine for up, seed, verify, bootstrap
Dispatch `sunbeam up`, `sunbeam seed`, `sunbeam verify`, and
`sunbeam bootstrap` through WFE workflows instead of monolithic
functions. Steps communicate via JSON workflow data and each
workflow is persisted in a per-context SQLite database.
2026-04-05 18:21:59 +01:00
dce085cd0c feat(auth): add auth token command, penpot seed support
- `sunbeam auth token` prints JSON headers for MCP headersHelper:
  {"Authorization": "Bearer <token>"}
- Add penpot to PG_USERS, pg_db_map, KV seed, and all_paths
- Add cert-manager to VSO auth role bound namespaces
2026-04-04 12:53:53 +01:00
c6bd8be030 chore: remove Python code and pyproject.toml 2026-03-27 09:59:20 +00:00
e568ddf82a chore: checkpoint before Python removal 2026-03-26 22:33:59 +00:00
683cec9307 release: v1.1.2
- fix(opensearch): make ML model registration idempotent
2026-03-25 18:09:25 +00:00
30dc4f9c5e fix(opensearch): make ML model registration idempotent
Reuse any existing model version (including DEPLOY_FAILED) instead of
registering a new copy. Prevents accumulation of stale model chunks
in .plugins-ml-model when OpenSearch restarts between applies.
2026-03-25 18:04:28 +00:00
3d2d16d53e feat(secrets): add xchacha20-poly1305 cipher key seeding for Kratos
Add rand_alphanum() using OsRng for generating fixed-length
alphanumeric secrets. Seed secrets-cipher (32 chars) into the
kratos KV path for at-rest encryption of OIDC tokens.
2026-03-24 20:51:13 +00:00
80ab6d6113 feat: enable Meet external API, fix SDK path
- Meet config: EXTERNAL_API_ENABLED=True
- Meet backend: added lasuite-resource-server configmap + RS creds
- Pingora: added /external-api/ route for Meet
- SDK: fixed Meet URL to use /external-api/ (hyphenated)

NOTE: Meet RS requires ES256 tokens + lasuite_meet scope — CLI
tokens use RS256 + generic scopes. Needs RS config adjustment.
2026-03-24 17:03:55 +00:00
b08a80d177 refactor: nest infra commands under sunbeam platform
Moves up, status, apply, seed, verify, logs, get, restart, build,
check, mirror, bootstrap, k8s under `sunbeam platform <command>`.
Top-level now has 19 commands instead of 32.
2026-03-24 15:52:44 +00:00
530b2a22b8 chore: remove solution branding from CLI help text 2026-03-24 15:44:39 +00:00
6a2b62dc42 refactor: remove bao, docs, and people subcommands
- bao: replaced by `sunbeam vault` with proper JWT auth
- docs: La Suite Docs not ready for production
- people: La Suite People not ready for production
2026-03-24 15:40:58 +00:00
4d9659a8bb chore: bump to v1.1.1, update CHANGELOG v1.1.1 2026-03-24 15:29:05 +00:00