Commit Graph

40 Commits

Author SHA1 Message Date
cc2c3f7a3b refactor(openbao): migrate to vaultrs client library
Replace hand-rolled OpenBao HTTP client with vaultrs 0.8.0, which
has official OpenBao support. BaoClient remains the public API so
callers are unchanged. KV patch uses raw HTTP since vaultrs doesn't
expose it yet.
2026-04-05 22:34:41 +01:00
971810433c fix(openbao): fix init response field name for keys_base64
OpenBao returns `keys_base64` not `unseal_keys_b64`. Added serde
alias to accept both field names for compatibility.
2026-04-05 20:41:54 +01:00
dce1cec6ac fix(openbao): create placeholder secret before waiting for pod
On a clean cluster, the OpenBao pod can't start because it mounts
the openbao-keys secret as a volume, but that secret doesn't exist
until init runs. Create a placeholder secret in WaitPodRunning so
the pod can mount it and start. InitOrUnsealOpenBao overwrites it
with real values during initialization.
2026-04-05 20:33:19 +01:00
70b1f84caa fix(config): delete legacy ~/.sunbeam.json after migration
The migration from ~/.sunbeam.json to ~/.sunbeam/config.json
copied but never removed the legacy file, which could cause
confusion with older binaries still writing to the old path.
2026-04-05 20:20:38 +01:00
ff297b61b5 chore: remove pre-WFE monolithic functions
Remove ~1400 lines of dead code now replaced by WFE primitives:
- cluster.rs: cmd_up, ensure_cert_manager, ensure_tls_cert/secret,
  wait_for_core, print_urls, secrets_dir, CERT_MANAGER_URL
- secrets.rs: cmd_seed, cmd_verify, seed_openbao,
  seed_kratos_admin_identity, SeedResult, delete_crd,
  delete_k8s_secret, kubectl_jsonpath
2026-04-05 19:44:23 +01:00
73e72550a7 refactor(manifests): extract OpenSearch ML into parallel WFE step
Move ensure_opensearch_ml and inject_opensearch_model_id out of
cmd_apply post-hooks into dedicated WFE steps that run in a
parallel branch alongside rollout waits. The ML model download
(10+ min on first run) no longer blocks the rest of the pipeline.
2026-04-05 18:28:21 +01:00
aa19590c73 fix(secrets): add max retries and backoff to port-forward loop
The port-forward background task retried infinitely on 500 errors
when the target pod wasn't ready. Add a 30-attempt limit with 2s
backoff between retries so the step eventually fails instead of
spinning forever.
2026-04-05 18:27:08 +01:00
3cfa0fe755 refactor(wfe): decompose steps into atomic config-driven primitives
Replace big-bag steps with 10 atomic primitives that each do one
thing and read config from step_config:

- ApplyManifest (replaces 12 identical apply structs)
- WaitForRollout (replaces WaitForCore loop)
- CreatePGRole, CreatePGDatabase (replaces EnsurePGRolesAndDatabases)
- EnsureNamespace, CreateK8sSecret (replaces CreateK8sSecrets)
- SeedKVPath, WriteKVPath, CollectCredentials (replaces SeedAllKVPaths + WriteDirtyKVPaths)
- EnableVaultAuth, WriteVaultAuthConfig, WriteVaultPolicy, WriteVaultRole (replaces ConfigureKubernetesAuth)

Workflow definitions now use parallel branches for independent
operations (infra, KV seeding, PG roles, platform manifests,
K8s secrets, rollout waits).
2026-04-05 18:23:36 +01:00
9cd3c641da feat(wfe): integrate workflow engine for up, seed, verify, bootstrap
Dispatch `sunbeam up`, `sunbeam seed`, `sunbeam verify`, and
`sunbeam bootstrap` through WFE workflows instead of monolithic
functions. Steps communicate via JSON workflow data and each
workflow is persisted in a per-context SQLite database.
2026-04-05 18:21:59 +01:00
dce085cd0c feat(auth): add auth token command, penpot seed support
- `sunbeam auth token` prints JSON headers for MCP headersHelper:
  {"Authorization": "Bearer <token>"}
- Add penpot to PG_USERS, pg_db_map, KV seed, and all_paths
- Add cert-manager to VSO auth role bound namespaces
2026-04-04 12:53:53 +01:00
13e3f5d42e fix opensearch pod resolution + sol-agent vault policy
os_api: resolve pod name by label instead of hardcoded opensearch-0.
added find_pod_by_label helper to kube.rs.

secrets.py: sol-agent policy (read/write sol-tokens/*) and k8s auth
role bound to matrix namespace default SA.
2026-03-23 08:48:33 +00:00
31fde1a8c6 fix: forge URL derivation for bare IP hosts, add Cargo registry config
forge_url() now checks active context domain first before falling back
to production_host. Bare IP addresses are skipped in the host heuristic.
Adds .cargo/config.toml for the sunbeam Gitea Cargo registry.
2026-03-21 15:17:47 +00:00
b6daf608af chore: suppress dead_code warning on exit code constants 2026-03-20 21:33:00 +00:00
8d6e815a91 feat: --no-cache build flag and Sol build target
- Add --no-cache flag to sunbeam build (passes --no-cache to buildctl)
- Add Sol (virtual librarian) as a build target
- Wire no_cache through all build functions and dispatch
2026-03-20 21:31:42 +00:00
f75f61f238 feat: user provisioning — mailbox, Projects, welcome email
Onboarding now provisions app-level accounts:
- create_mailbox: Django ORM via kubectl exec into messages-backend
- setup_projects_user: knex.js via kubectl exec into projects pod
- Welcome email includes job title and department when provided

Offboarding cleans up:
- delete_mailbox: removes mailbox + Django user
- cleanup_projects_user: soft-deletes Planka user + memberships

All provisioning is best-effort (warns on failure, doesn't block).
2026-03-20 21:30:27 +00:00
c6aa1bd8ce feat: complete pm subcommands with board discovery and user resolution
Planka:
- Board discovery via GET /api/projects (no hardcoded IDs)
- String IDs (snowflake) throughout — TicketRef::Planka holds String
- Create auto-discovers first board/list, or matches --target by name
- Close finds "Done"/"Closed" list and moves card automatically
- Assign resolves users via search, supports "me" for self-assign
- Ticket IDs use p:/g: short prefixes

Gitea:
- Assign uses PATCH on issue (not POST /assignees which needs collaborator)
- Create requires --target org/repo

All pm subcommands tested against live Planka + Gitea instances.
2026-03-20 21:16:55 +00:00
ffc0fe917b feat: split auth into sso/git, Planka token exchange, board discovery
Auth:
- sunbeam auth login runs SSO (Hydra OIDC) then Git (Gitea PAT)
- SSO callback auto-redirects browser to Gitea token page
- sunbeam auth sso / sunbeam auth git for individual flows
- Gitea PAT verified against API before saving

Planka:
- Token exchange via /api/access-tokens/exchange-using-token endpoint
- Board discovery via GET /api/projects
- String IDs (snowflake) handled throughout

Config:
- kubectl-style contexts: --context flag > current-context > "local"
- Removed --env flag
- Per-domain auth token storage
2026-03-20 19:25:10 +00:00
ded0ab442e refactor: remove --env flag, use --context like kubectl
Context resolution: --context flag > current-context from config > "local".
No more production/local distinction in the CLI flags — the context
determines everything (domain, kube-context, ssh-host, infra-dir).

Remove Env enum entirely. Production detection is now "context has ssh-host".
2026-03-20 15:23:54 +00:00
88b02acdd1 feat: kubectl-style contexts with per-domain auth tokens
Config now supports named contexts (like kubectl), each bundling
domain, kube-context, ssh-host, infra-dir, and acme-email. Legacy
flat config auto-migrates to a "production" context on load.

- sunbeam config set --domain sunbeam.pt --host user@server
- sunbeam config use-context production
- sunbeam config get (shows all contexts)

Auth tokens stored per-domain (~/.local/share/sunbeam/auth/{domain}.json)
so local and production don't clobber each other. pm and auth commands
read domain from active context instead of K8s cluster discovery.
2026-03-20 15:17:57 +00:00
3a5e1c62ba fix: use predictable client_id via pre-seeded K8s secret
Pre-create oidc-sunbeam-cli secret with CLIENT_ID=sunbeam-cli before
hydra-maester reconciles. No cluster access needed at login time.
2026-03-20 15:08:59 +00:00
1029ff0747 fix: auth login UX — timeout, Ctrl+C, suppress K8s error, center HTML
- 5-minute timeout on callback wait (Ctrl+C now works)
- Skip K8s client_id lookup when no cluster configured (removes noisy ERROR)
- Center the success page HTML to match Sunbeam Studios branding
2026-03-20 14:31:59 +00:00
43b5a4eef9 fix: URL-encode scope parameter with %20 instead of + 2026-03-20 14:30:31 +00:00
7fab2a7f3c fix: auth login domain resolution with --domain flag
Domain resolves from: --domain flag > cached token > config
production_host > cluster discovery. Clear error when none available.
2026-03-20 14:29:08 +00:00
184ad85c60 fix: install rustls ring crypto provider at startup
Rustls 0.23 requires an explicit CryptoProvider. Enable the ring
feature and call install_default() before any TLS operations.
2026-03-20 14:15:16 +00:00
5bdb78933f feat: unified project management across Planka and Gitea
New src/pm.rs module with sunbeam pm subcommand:
- Planka client: cards, boards, lists, comments, assignments
  via OIDC token exchange for Planka JWT
- Gitea client: issues, comments, labels, milestones
  via OAuth2 Bearer token
- Unified Ticket type with p:/g: ID prefixes
- pm list: parallel fetch from both sources, merged display
- pm show/create/comment/close/assign across both systems
- Auth via crate::auth::get_token() (Hydra OAuth2)
2026-03-20 14:11:16 +00:00
d4421d3e29 feat: OAuth2 CLI authentication with PKCE and token caching
New src/auth.rs module:
- Authorization Code + PKCE flow via localhost redirect
- OIDC discovery from Hydra well-known endpoint
- Browser-based login (opens system browser automatically)
- Token caching at ~/.local/share/sunbeam/auth.json (0600 perms)
- Automatic refresh when access token expires (refresh valid 7 days)
- get_token() for use by other modules (pm, etc.)
- cmd_auth_login/logout/status subcommands
2026-03-20 14:10:37 +00:00
aad469e9c6 fix: stdin password, port-forward retry, seed advisory lock
- set-password reads from stdin when password arg omitted
- Port-forward proxy retries on pod restart instead of failing
- cmd_seed acquires PID-based advisory lockfile to prevent concurrent runs
2026-03-20 13:37:33 +00:00
dff4588e52 fix: employee ID pagination, add async tests
- next_employee_id now paginates through all identities (was limited to 200)
- Add #[tokio::test] tests: ensure_tunnel noop, BaoClient connection error,
  check_update_background returns quickly when forge URL empty
2026-03-20 13:37:25 +00:00
019c73e300 fix: S3 auth signature tested against AWS reference vector
Refactor s3_auth_headers into deterministic s3_auth_headers_at that
accepts a timestamp. Add test with AWS example credentials and fixed
date verifying canonical request, string-to-sign, and final signature.
2026-03-20 13:37:17 +00:00
e95ee4f377 fix: rewrite users.rs to fully async (was blocking tokio runtime)
Replace all blocking I/O with async equivalents:
- tokio::process::Command instead of std::process::Command
- tokio::time::sleep instead of std::thread::sleep
- reqwest::Client (async) instead of reqwest::blocking::Client
- All helper functions (api, find_identity, generate_recovery, etc.) now async
- PortForward::Drop uses start_kill() (sync SIGKILL) for cleanup
- send_welcome_email wrapped in spawn_blocking for lettre sync transport
2026-03-20 13:31:45 +00:00
24e98b4e7d fix: CNPG readiness, DKIM SPKI format, kv_patch, container name
- Check CNPG Cluster CRD status.phase instead of pod Running phase
- DKIM public key: use SPKI format (BEGIN PUBLIC KEY) matching Python
- Use kv_patch instead of kv_put for dirty paths (preserves external fields)
- Vault KV only written when password is newly generated
- Gitea exec passes container name Some("gitea")
- Fix openbao comment (400 not 409)
2026-03-20 13:29:59 +00:00
6ec0666aa1 fix: SSH tunnel leak, cmd_bao injection, discovery cache, DNS async
- Store SSH tunnel child in static Mutex (was dropped immediately)
- cmd_bao: use env(1) for VAULT_TOKEN instead of sh -c (no shell injection)
- Cache API discovery across kube_apply documents (was per-doc roundtrip)
- Replace blocking ToSocketAddrs with tokio::net::lookup_host
- Remove double YAML->JSON->string->JSON serialization in kube_apply
- ResultExt::ctx now preserves all SunbeamError variants
2026-03-20 13:29:51 +00:00
bcfb443757 refactor: deduplicate constants, fix secret key mismatch, add VSS pruning
- New src/constants.rs: single source for MANAGED_NS (includes monitoring)
  and GITEA_ADMIN_USER, imported by all modules that previously had copies
- Fix checks.rs reading wrong key names from gitea-admin-credentials secret
- Add VaultStaticSecret pruning in pre_apply_cleanup (H1)
- Fix cert_manager_present check (was always true after canonicalize)
- Add warnings for silent failures in pre_apply_cleanup
- Fix os_api dead variable assignment
- Set TLS private key permissions to 0600
- Redact Gitea admin password in print_urls
2026-03-20 13:29:35 +00:00
503e407243 feat: implement OpenSearch ML setup and model_id injection
ensure_opensearch_ml: cluster settings, model registration/deployment
(all-mpnet-base-v2), ingest + search pipelines for hybrid BM25+neural.

inject_opensearch_model_id: reads model_id from ingest pipeline,
writes to matrix/opensearch-ml-config ConfigMap.

os_api helper: kube exec curl inside opensearch pod.
2026-03-20 13:16:00 +00:00
bc5eeaae6e feat: implement secrets.rs with OpenBao HTTP API
Full cmd_seed implementation using openbao::BaoClient:
- OpenBao init/unseal via HTTP API (no kubectl exec)
- KV v2 seeding with get_or_create pattern and dirty-path tracking
- Kubernetes auth method + VSO policy configuration
- Database secrets engine with vault PG user and static roles
- DKIM key generation via rsa + pkcs8 crates
- Kratos admin identity seeding via port-forward + reqwest

cmd_verify: VSO E2E test with test sentinel, sync poll, cleanup.
2026-03-20 13:15:53 +00:00
7fd8874d99 refactor: migrate all modules from anyhow to SunbeamError
Replace anyhow::{bail, Context, Result} with crate::error::{Result,
SunbeamError, ResultExt} across all modules. Each module uses the
appropriate error variant (Kube, Secrets, Build, Identity, etc).
2026-03-20 13:15:45 +00:00
cc0b6a833e refactor: add thiserror error tree and tracing logging
SunbeamError enum with typed variants (Kube, Config, Network, Secrets,
Build, Identity, ExternalTool, Io, Json, Yaml, Other) each mapping to
a process exit code. ResultExt trait replaces anyhow's .context().

main.rs initializes tracing-subscriber with RUST_LOG env filter and
routes all errors to exit codes via SunbeamError::exit_code().

Removes anyhow dependency.
2026-03-20 13:15:26 +00:00
ec235685bf feat: Phase 2 feature modules + comprehensive test suite (142 tests)
services.rs:
- Pod status with unicode icons, grouped by namespace
- VSO sync status (VaultStaticSecret/VaultDynamicSecret via kube-rs DynamicObject)
- Log streaming via kube-rs log_stream + futures::AsyncBufReadExt
- Pod get in YAML/JSON format
- Rollout restart with namespace/service filtering

checks.rs:
- 11 health check functions (gitea, postgres, valkey, openbao, seaweedfs, kratos, hydra, people, livekit)
- AWS4-HMAC-SHA256 S3 auth header generation using sha2 + hmac
- Concurrent execution via tokio JoinSet
- mkcert root CA trust for local TLS

secrets.rs:
- Stub with cmd_seed/cmd_verify (requires live cluster for full impl)

users.rs:
- All 10 Kratos identity operations via reqwest + kubectl port-forward
- Welcome email via lettre SMTP through port-forwarded postfix
- Employee onboarding with auto-assigned ID, HR metadata
- Offboarding with Kratos + Hydra session revocation

gitea.rs:
- Bootstrap without Lima VM: admin password, org creation, OIDC auth source
- Gitea API via kubectl exec curl

images.rs:
- BuildEnv detection, buildctl build + push via port-forward
- Per-service builders for all 17 build targets
- Deploy rollout, node image pull, uv Dockerfile patching
- Mirror scaffolding (containerd operations marked TODO)

cluster.rs:
- Pure K8s cmd_up: cert-manager, linkerd, rcgen TLS certs, core service wait
- No Lima VM operations

manifests.rs:
- Full cmd_apply: kustomize build, two-pass convergence, ConfigMap restart detection
- Pre-apply cleanup, webhook wait, mkcert CA, tuwunel OAuth2 redirect patch

Test coverage: 142 tests across 14 modules (44 in checks, 27 in cli, 13 in images, 12 in tools, 12 in services, 11 in users, 10 in manifests, 9 in kube, 9 in cluster, 7 in update, 6 in gitea, 4 in openbao, 3 in output, 2 in config).
2026-03-20 12:45:07 +00:00
42c2a74928 feat: Phase 1 foundations — kube-rs client, OpenBao HTTP client, self-update
kube.rs:
- KubeClient with lazy init from kubeconfig + context selection
- SSH tunnel via subprocess (port 2222, forward 16443->6443)
- Server-side apply for multi-document YAML via kube-rs discovery
- Secret get/create, namespace ensure, exec in pod, rollout restart
- Domain discovery from gitea-inline-config secret
- kustomize_build with embedded binary, domain/email/registry substitution
- kubectl and bao CLI passthrough commands

openbao.rs:
- Lightweight Vault/OpenBao HTTP API client using reqwest
- System ops: seal-status, init, unseal
- KV v2: get, put, patch, delete with proper response parsing
- Auth: enable method, write policy, write roles
- Database secrets engine: config, static roles
- Replaces all kubectl exec bao shell commands from Python version

update.rs:
- Self-update from latest mainline commit via Gitea API
- CI artifact download with SHA256 checksum verification
- Atomic self-replace (temp file + rename)
- Background update check with hourly cache (~/.local/share/sunbeam/)
- Enhanced version command with target triple and build date

build.rs:
- Added SUNBEAM_TARGET and SUNBEAM_BUILD_DATE env vars

35 tests pass.
2026-03-20 12:37:02 +00:00
80c67d34cb feat: Rust rewrite scaffolding with embedded kustomize+helm
Phase 0 of Python-to-Rust CLI rewrite:

- Cargo.toml with all dependencies (kube-rs, reqwest, russh, rcgen, lettre, etc.)
- build.rs: downloads kustomize v5.8.1 + helm v4.1.0 at compile time, embeds as bytes, sets SUNBEAM_COMMIT from git
- src/main.rs: tokio main with anyhow error formatting
- src/cli.rs: full clap derive struct tree matching all Python argparse subcommands
- src/config.rs: SunbeamConfig serde struct, load/save ~/.sunbeam.json
- src/output.rs: step/ok/warn/table with exact Python format strings
- src/tools.rs: embedded kustomize+helm extraction to cache dir
- src/kube.rs: parse_target, domain_replace, context management
- src/manifests.rs: filter_by_namespace with full test coverage
- Stub modules for all remaining features (cluster, secrets, images, services, checks, gitea, users, update)

23 tests pass, cargo check clean.
2026-03-20 12:24:21 +00:00