Commit Graph

22 Commits

Author SHA1 Message Date
9af3cd3c49 feat: expose admin APIs behind OIDC auth_request
Adds pingora routes for id, hydra, search, vault subdomains.
Each gated by auth_request to Hydra userinfo — only valid SSO
bearer tokens pass through. Adds new SANs to the TLS certificate.
2026-03-22 18:59:22 +00:00
d3943c9a84 feat(monitoring): wire up full LGTM observability stack
- Prometheus: discover ServiceMonitors/PodMonitors in all namespaces,
  enable remote write receiver for Tempo metrics generator
- Tempo: enable metrics generator (service-graphs + span-metrics)
  with remote write to Prometheus
- Loki: add Grafana Alloy DaemonSet to ship container logs
- Grafana: enable dashboard sidecar, add Pingora/Loki/Tempo/OpenBao
  dashboards, add stable UIDs and cross-linking between datasources
  (Loki↔Tempo derived fields, traces→logs, traces→metrics, service map)
- Linkerd: enable proxy tracing to Alloy OTLP collector, point
  linkerd-viz at existing Prometheus instead of deploying its own
- Pingora: add OTLP rollout plan (endpoint commented out until proxy
  telemetry panic fix is deployed and Alloy is verified healthy)
2026-03-21 17:36:54 +00:00
bfe0280732 feat(lasuite): add Projects (Planka Kanban) service
Deploy Planka-based project management at projects.DOMAIN_SUFFIX:
- ConfigMap with OIDC, S3, SMTP, La Gaufre widget config
- Deployment + Service (init container for DB migrations, Sails on 1337)
- OAuth2Client (client_secret_basic, redirect to /oidc-callback)
- VaultDynamicSecret for DATABASE_URL, VaultStaticSecret for SECRET_KEY
- Pingora route with websocket support (Socket.io)
- Image overrides in both local and production overlays
- TLS cert dnsNames updated for projects subdomain
- Integration service.json updated with Projects entry
- seaweedfs-s3-credentials rolloutRestartTargets includes projects
2026-03-20 13:41:54 +00:00
3c7460f4a6 feat(lasuite): add calendars service deployment manifests
Add K8s manifests for calendars backend, frontend (Caddy), CalDAV
server, and Celery worker. Wire Pingora routing for cal.sunbeam.pt
with path-based backend/caldav/static splits. Add OAuth2Client for
OIDC, VaultDynamicSecret for DB credentials, VaultStaticSecret for
Django/CalDAV keys, and TLS cert coverage for the cal subdomain.
Register calendars in the integration service gaufre widget.
2026-03-18 18:36:05 +00:00
ccfe8b877a feat: La Suite email/messages, buildkitd, monitoring, vault and storage updates
- Add Messages (email) service: backend, frontend, MTA in/out, MPA, SOCKS
  proxy, worker, DKIM config, and theme customization
- Add Collabora deployment for document collaboration
- Add Drive frontend nginx config and values
- Add buildkitd namespace for in-cluster container builds
- Add SeaweedFS remote sync and additional S3 buckets
- Update vault secrets across namespaces (devtools, lasuite, media,
  monitoring, ory, storage) with expanded credential management
- Update monitoring: rename grafana→metrics OAuth2Client, add Prometheus
  remote write and additional scrape configs
- Update local/production overlays with resource patches
- Remove stale login-ui resource patch from production overlay
2026-03-10 19:00:57 +00:00
e5741c4df6 feat: integrate tuwunel with Ory SSO, rename chat to messages subdomain
- Add matrix to hydra-maester enabledNamespaces for OAuth2Client CRD
- Update allowed_return_urls and selfservice URLs: chat→messages
- Add Kratos verification flow, employee/external identity schemas
- Extend session lifespan to 30 days with persistent cookies
- Route messages.* to tuwunel via Pingora with WebSocket support
- Replace login-ui with kratos-admin-ui as unified auth frontend
- Update TLS certificate SANs: chat→messages, add monitoring subdomains
- Add tuwunel + La Suite images to production overlay
- Switch DDoS/scanner detection to compiled-in ensemble models (observe_only)
2026-03-10 18:52:47 +00:00
91983ddf29 feat(observability): enable OTLP tracing, fix Prometheus scraping, add proxy ServiceMonitor
- Set otlp_endpoint to Tempo HTTP receiver (port 4318) for request tracing
- Add hostNetwork to prometheusSpec so it can reach kubelet/node-exporter on node public IP
- Add ServiceMonitor for proxy metrics scrape on port 9090
- Add CORS origin and Grafana datasource config for monitoring stack
2026-03-09 08:20:42 +00:00
caefb071a8 fix(ingress): use 10.0.0.0/8 bypass for all cluster-internal traffic
Pod IPs are in 10.0.0.0/24, not 10.42.0.0/16 as assumed. Broadening
to 10.0.0.0/8 covers pods, services, and CNI overlays.
2026-03-09 08:00:46 +00:00
a101ea4b06 fix(ingress): add localhost to rate-limit bypass CIDRs
Adds 127.0.0.0/8 and ::1/128 so host-networked pods (buildkitd) are
not blocked by the detection pipeline.
2026-03-09 01:40:25 +00:00
7c1676d2b9 feat(ingress): add detection pipeline config and metrics port
- Add DDoS, scanner, and rate limiter configuration to pingora-config
- Add kubernetes config section with configurable namespace/resource names
- Expose metrics port 9090 on deployment and service
2026-03-08 20:37:49 +00:00
f3faf31d4b Fix meet: ALLOWED_HOSTS, OIDC callback, and LiveKit connectivity
- meet-config: rename ALLOWED_HOSTS → DJANGO_ALLOWED_HOSTS (django-configurations
  ListValue uses DJANGO_ prefix by default; without it the list was empty and
  every browser request got 400 DisallowedHost)
- meet-config: set LIVEKIT_API_URL to public https://livekit.DOMAIN_SUFFIX so
  the meet frontend can reach LiveKit for WebSocket signaling
- pingora-config: add livekit.DOMAIN_SUFFIX → livekit-server:80 WebSocket route
- cert-manager: add livekit.DOMAIN_SUFFIX to TLS cert dnsNames
- oidc-clients: fix meet redirect URI /oidc/callback/ → /api/v1.0/callback/
  (meet embeds mozilla-django-oidc inside the api/v1.0/ prefix); add
  postLogoutRedirectUri for clean logout
- livekit-values: replace hardcoded devkey:secret-placeholder with key_file
  loaded from a VSO-managed K8s Secret (secret/livekit in OpenBao)
- media/vault-secrets: add VaultAuth + VaultStaticSecret for media namespace
  to sync livekit API credentials from OpenBao
2026-03-06 13:56:29 +00:00
424db43ccf feat(infra): Meet integration, La Suite theming, Pingora SSH + meet routes
Meet: add backend/frontend/celery deployments and services, meet-config
ConfigMap, nginx SPA config, VSO secrets (meet-db-credentials VDS,
meet-django-secret and meet-livekit VSS). Wire oidc-meet OAuth2Client.

La Suite overlay discipline: move people/docs frontend nginx ConfigMaps
and patches from overlays/local to base so both environments share them.
Remove values-ory.yaml (folded into base). Add docs-frontend nginx config
with sub_filter theming. Add local gitea mkcert CA patch.

Pingora: add [ssh] TCP passthrough block (port 22 → Gitea SSH pod) and
split meet route into frontend default + backend paths for /api/, /admin/,
/oidc/, /static/, /__. Remove now-unused values-pingora.yaml from production
overlay (host ports moved to patch-pingora-hostport.yaml).

Update both overlay kustomizations to reference all new resources and
add meet-backend/meet-frontend image entries.
2026-03-06 12:08:21 +00:00
7ffddcafcd fix(ory,lasuite): harden session security and fix logout + WebSocket routing
- Fix Hydra postLogoutRedirectUris for docs and people to match the
  actual URI sent by mozilla_django_oidc v5 (/api/v1.0/logout-callback/)
  instead of the root URL, resolving 599 logout errors.

- Fix docs y-provider WebSocket backend port: use Service port 443
  (not pod port 4444 which has no DNAT rule) in Pingora config.

- Tighten VSO VaultDynamicSecret rotation sync: add allowStaticCreds:true
  and reduce refreshAfter from 1h to 5m across all static-creds paths
  (kratos, hydra, gitea, hive, people, docs) so credential rotation is
  reflected within 5 minutes instead of up to 1 hour.

- Set Hydra token TTLs: access_token and id_token to 5m; refresh_token
  to 720h (30 days). Kratos session carries silent re-auth so the short
  access token TTL does not require users to log in manually.

- Set SESSION_COOKIE_AGE=3600 (1h) in docs and people backends. After
  1h, apps silently re-auth via the active Kratos session. Disabled
  identities (sunbeam user disable) cannot re-auth on next expiry.
2026-03-03 18:07:08 +00:00
2e89854f86 feat(lasuite): deploy La Suite Docs (impress)
Adds the impress Helm chart (suitenumerique/docs, v4.5.0) to the lasuite
namespace with full Pingora routing, VSO secrets, and local overlay
resource tuning.

Routing (pingora-config.yaml):
- docs.* frontend -> docs-frontend:80 (nginx, static Next.js export)
- /api/* and /admin/* -> docs-backend:80 (Django/uvicorn)
- /collaboration/ws/* -> docs-y-provider:4444 (Hocuspocus WebSocket)
- integration.* -> integration:80 (La Gaufre hub, same file)

Secrets (vault-secrets.yaml):
- VaultDynamicSecret docs-db-credentials (DB engine, static role)
- VaultStaticSecret docs-django-secret (DJANGO_SECRET_KEY)
- VaultStaticSecret docs-collaboration-secret (y-provider shared secret)

OIDC client (oidc-clients.yaml):
- Fix redirect_uri from /oidc/callback/ to /api/v1.0/callback/ -- impress
  mounts all OIDC URLs under api/{API_VERSION}/ via lasuite.oidc_login,
  same pattern as people.

Local overlay (values-resources.yaml):
- docs-backend: 512Mi limit, WEB_CONCURRENCY=2 (4 uvicorn workers
  exceeded 384Mi at startup on the arm64 Lima VM)
- docs-celery-worker: 384Mi limit, CELERY_WORKER_CONCURRENCY=2
- docs-y-provider: 256Mi limit
- seaweedfs-filer: raised from 256Mi to 512Mi (OOMKilled during 188MB
  multipart S3 upload of impress-y-provider image layer)

Local overlay (kustomization.yaml):
- Image mirrors for impress-backend, impress-frontend, impress-y-provider
  (amd64-only images retagged to Gitea via cmd_mirror before deploy)
2026-03-03 14:30:45 +00:00
6cc60c66ff feat(ory): add kratos-admin-ui service
Deploy the custom Kratos admin UI (Deno/Hono + Cunningham React):
- K8s Deployment + Service in ory namespace
- VSO VaultStaticSecret for cookie/csrf/admin-identity-ids secrets
- Pingora route for admin.DOMAIN_SUFFIX
2026-03-03 11:30:52 +00:00
419a45b3a7 fix: route people.* to frontend; path-route API/admin/oauth2 to backend
people.* now routes / to people-frontend (nginx/React SPA).
Path prefixes /api/, /admin/, and /o/ are forwarded to people-backend
(Django/gunicorn), matching the app's URL structure.

Previously all people.* traffic hit people-backend directly, causing
Django to return 404 "Page not found at /" for the root path.

The [[routes.paths]] mechanism already existed in the proxy (used by
the auth route) — only a config update was needed.
2026-03-03 01:04:10 +00:00
8621c0dd65 fix: correct Pingora upstream ports and kustomize namespace conflict
pingora-config.yaml: kratos-public and people-backend K8s Services
expose port 80, not 4433/8000. The wrong ports caused Pingora to
return timeouts for /kratos/* and all people.* routes.

ory/kustomization.yaml: remove kustomization-level namespace: ory
transformer. All non-Helm resources already declare namespace: ory
explicitly. The transformer was incorrectly moving hydra-maester's
enabledNamespaces Role (generated for the lasuite namespace) into ory,
producing a duplicate-name conflict during kustomize build.
2026-03-03 00:57:58 +00:00
e0f1803e33 docs(ingress): document disable_secure_redirection and other per-route options 2026-03-02 18:45:19 +00:00
3f516dc4d3 fix(ingress): fix People backend service name; add find route
The People backend service is named people-backend (not people) in the
desk chart. Add a route for find-backend to front the future OpenSearch
Dashboards service.
2026-03-02 18:33:34 +00:00
cdddc334ff feat: replace nginx placeholder with custom Pingora proxy; add Postfix MTA
Ingress:
- Deploy custom sunbeam-proxy (Pingora/Rust) replacing nginx placeholder
- HTTPS termination with mkcert (local) / rustls-acme (production)
- Host-prefix routing with path-based sub-routing for auth virtual host:
  /oauth2 + /.well-known + /userinfo → Hydra, /kratos → Kratos (prefix stripped), default → login-ui
- HTTP→HTTPS redirect, WebSocket passthrough, JSON audit logging, OTEL stub
- cert-manager HTTP-01 ACME challenge routing via Ingress watcher
- RBAC for Ingress watcher (pingora-watcher ClusterRole)
- local overlay: hostPorts 80/443, LiveKit TURN demoted to ClusterIP to avoid klipper conflict

Infrastructure:
- socket_vmnet shared network for host↔VM reachability (192.168.105.2)
- local-up.sh: cert-manager installation, eth1-based LIMA_IP detection, correct DOMAIN_SUFFIX sed substitution
- Postfix MTA in lasuite namespace: outbound relay via Scaleway TEM, accepts SMTP from cluster pods
- Kratos SMTP courier pointed at postfix.lasuite.svc.cluster.local:25
- Production overlay: cert-manager ClusterIssuer, ACME-enabled Pingora values
2026-03-01 16:25:11 +00:00
a589e6280d feat: bring up local dev stack — all services running
- Ory Hydra + Kratos: fixed secret management, DSN config, DB migrations,
  OAuth2Client CRD (helm template skips crds/ dir), login-ui env vars
- SeaweedFS: added s3.json credentials file via -s3.config CLI flag
- OpenBao: standalone mode with auto-unseal sidecar, keys in K8s secret
- OpenSearch: increased memory to 1.5Gi / JVM 1g heap
- Gitea: SSL_MODE disable, S3 bucket creation fixed
- Hive: automountServiceAccountToken: false (Lima virtiofs read-only rootfs quirk)
- LiveKit: API keys in values, hostPort conflict resolved
- Linkerd: native sidecar (proxy.nativeSidecar=true) to avoid blocking Jobs
- All placeholder images replaced: pingora→nginx:alpine, login-ui→oryd/kratos-selfservice-ui-node

Full stack running: postgres, valkey, openbao, opensearch, seaweedfs,
kratos, hydra, gitea, livekit, hive (placeholder), login-ui
2026-02-28 22:08:38 +00:00
5d9bd7b067 chore: initial infrastructure scaffold
Kustomize base + overlays for the full Sunbeam k3s stack:
- base/mesh      — Linkerd edge (crds + control-plane + viz)
- base/ingress   — custom Pingora edge proxy
- base/ory       — Kratos 0.60.1 + Hydra 0.60.1 + login-ui
- base/data      — CloudNativePG 0.27.1, Valkey 8, OpenSearch 2
- base/storage   — SeaweedFS master + volume + filer (S3 on :8333)
- base/lasuite   — Hive sync daemon + La Suite app placeholders
- base/media     — LiveKit livekit-server 1.9.0
- base/devtools  — Gitea 12.5.0 (external PG + Valkey)
overlays/local   — sslip.io domain, mkcert TLS, Lima hostPort
overlays/production — stub (TODOs for sunbeam.pt values)
scripts/         — local-up/down/certs/urls helpers
justfile         — up / down / certs / urls targets
2026-02-28 13:42:27 +00:00