feat: replace nginx placeholder with custom Pingora proxy; add Postfix MTA

Ingress:
- Deploy custom sunbeam-proxy (Pingora/Rust) replacing nginx placeholder
- HTTPS termination with mkcert (local) / rustls-acme (production)
- Host-prefix routing with path-based sub-routing for auth virtual host:
  /oauth2 + /.well-known + /userinfo → Hydra, /kratos → Kratos (prefix stripped), default → login-ui
- HTTP→HTTPS redirect, WebSocket passthrough, JSON audit logging, OTEL stub
- cert-manager HTTP-01 ACME challenge routing via Ingress watcher
- RBAC for Ingress watcher (pingora-watcher ClusterRole)
- local overlay: hostPorts 80/443, LiveKit TURN demoted to ClusterIP to avoid klipper conflict

Infrastructure:
- socket_vmnet shared network for host↔VM reachability (192.168.105.2)
- local-up.sh: cert-manager installation, eth1-based LIMA_IP detection, correct DOMAIN_SUFFIX sed substitution
- Postfix MTA in lasuite namespace: outbound relay via Scaleway TEM, accepts SMTP from cluster pods
- Kratos SMTP courier pointed at postfix.lasuite.svc.cluster.local:25
- Production overlay: cert-manager ClusterIssuer, ACME-enabled Pingora values
2026-03-01 16:25:11 +00:00
parent a589e6280d
commit cdddc334ff
15 changed files with 391 additions and 64 deletions

View File

@@ -5,6 +5,7 @@ namespace: ingress
resources:
- namespace.yaml
- pingora-rbac.yaml
- pingora-deployment.yaml
- pingora-service.yaml
- pingora-config.yaml

View File

@@ -5,39 +5,39 @@ metadata:
namespace: ingress
data:
config.toml: |
# Pingora hostname routing table
# The domain suffix (sunbeam.pt / <LIMA_IP>.sslip.io) is patched per overlay.
# TLS cert source (rustls-acme / mkcert) is patched per overlay.
[tls]
cert_path = "/etc/tls/tls.crt"
key_path = "/etc/tls/tls.key"
# acme = true # Uncommented in production overlay (rustls-acme + Let's Encrypt)
acme = false
# Sunbeam proxy config.
#
# Substitution placeholders (replaced by sed at deploy time):
# DOMAIN_SUFFIX — e.g. <LIMA_IP>.sslip.io (local) or yourdomain.com (production)
[listen]
http = "0.0.0.0:80"
https = "0.0.0.0:443"
[turn]
backend = "livekit.media.svc.cluster.local:7880"
udp_listen = "0.0.0.0:3478"
relay_port_start = 49152
relay_port_end = 49252
[tls]
# Cert files are written here by the proxy on startup and on cert renewal
# via the K8s API. The /etc/tls directory is an emptyDir volume.
cert_path = "/etc/tls/tls.crt"
key_path = "/etc/tls/tls.key"
# Host-prefix → backend mapping.
# Pingora matches on the subdomain prefix regardless of domain suffix,
# so these routes work identically for sunbeam.pt and *.sslip.io.
[telemetry]
# Empty = OTEL disabled. Set to http://otel-collector.data.svc:4318 when ready.
otlp_endpoint = ""
# Host-prefix → backend routing table.
# The prefix is the subdomain before the first dot, so these routes work
# identically for yourdomain.com and *.sslip.io.
# Edit to match your own service names and namespaces.
[[routes]]
host_prefix = "docs"
backend = "http://docs.lasuite.svc.cluster.local:8000"
websocket = true # Y.js CRDT sync
websocket = true
[[routes]]
host_prefix = "meet"
backend = "http://meet.lasuite.svc.cluster.local:8000"
websocket = true # LiveKit signaling
websocket = true
[[routes]]
host_prefix = "drive"
@@ -50,7 +50,7 @@ data:
[[routes]]
host_prefix = "chat"
backend = "http://conversations.lasuite.svc.cluster.local:8000"
websocket = true # Vercel AI SDK streaming
websocket = true
[[routes]]
host_prefix = "people"
@@ -58,12 +58,31 @@ data:
[[routes]]
host_prefix = "src"
backend = "http://gitea.devtools.svc.cluster.local:3000"
websocket = true # Gitea Actions runner
backend = "http://gitea-http.devtools.svc.cluster.local:3000"
websocket = true
# auth: login-ui handles browser UI; Hydra handles OAuth2/OIDC; Kratos handles self-service flows.
[[routes]]
host_prefix = "auth"
backend = "http://hydra.ory.svc.cluster.local:4444"
backend = "http://login-ui.ory.svc.cluster.local:3000"
[[routes.paths]]
prefix = "/oauth2"
backend = "http://hydra-public.ory.svc.cluster.local:4444"
[[routes.paths]]
prefix = "/.well-known"
backend = "http://hydra-public.ory.svc.cluster.local:4444"
[[routes.paths]]
prefix = "/userinfo"
backend = "http://hydra-public.ory.svc.cluster.local:4444"
# /kratos prefix is stripped before forwarding so Kratos sees its native paths.
[[routes.paths]]
prefix = "/kratos"
backend = "http://kratos-public.ory.svc.cluster.local:4433"
strip_prefix = true
[[routes]]
host_prefix = "s3"
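
The host-prefix and path rules above can be mirrored as a small pure function, useful for reasoning about match order. This is a hypothetical Python sketch, not the actual sunbeam-proxy (Rust) code: the prefix is the subdomain before the first dot, path rules are tried in declaration order, and `strip_prefix` removes the matched prefix before forwarding (falling back to `/` on an exact match).

```python
# Sketch of the host-prefix + path routing described in config.toml.
# Hypothetical helper; the real routing lives in sunbeam-proxy.

ROUTES = {
    "auth": {
        "backend": "http://login-ui.ory.svc.cluster.local:3000",
        "paths": [
            ("/oauth2", "http://hydra-public.ory.svc.cluster.local:4444", False),
            ("/.well-known", "http://hydra-public.ory.svc.cluster.local:4444", False),
            ("/userinfo", "http://hydra-public.ory.svc.cluster.local:4444", False),
            ("/kratos", "http://kratos-public.ory.svc.cluster.local:4433", True),
        ],
    },
    "docs": {"backend": "http://docs.lasuite.svc.cluster.local:8000", "paths": []},
}

def resolve(host: str, path: str):
    """Return (backend, forwarded_path) or None if no route matches."""
    prefix = host.split(".", 1)[0]  # subdomain before the first dot
    route = ROUTES.get(prefix)
    if route is None:
        return None
    for p, backend, strip in route["paths"]:
        if path == p or path.startswith(p + "/"):
            fwd = (path[len(p):] or "/") if strip else path
            return backend, fwd
    return route["backend"], path
```

Because the prefix is extracted before the first dot, the same table serves `auth.sunbeam.pt` and `auth.192.168.105.2.sslip.io` identically.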

View File

@@ -5,6 +5,9 @@ metadata:
namespace: ingress
spec:
replicas: 1
# Recreate avoids rolling-update conflicts (single-node; hostPorts in local overlay)
strategy:
type: Recreate
selector:
matchLabels:
app: pingora
@@ -16,9 +19,10 @@ spec:
# Pingora terminates TLS at the mesh boundary; sidecar injection is disabled here
linkerd.io/inject: disabled
spec:
serviceAccountName: pingora
containers:
- name: pingora
image: nginx:alpine # placeholder until custom Pingora image is built
image: sunbeam-proxy:latest # overridden per overlay via kustomize images:
ports:
- name: http
containerPort: 80
@@ -34,19 +38,20 @@ spec:
- name: config
mountPath: /etc/pingora
readOnly: true
# /etc/tls is an emptyDir written by the proxy via the K8s API on
# startup and on cert renewal, so Pingora always reads a fresh cert
# without depending on kubelet volume-sync timing.
- name: tls
mountPath: /etc/tls
readOnly: true
resources:
limits:
memory: 64Mi
memory: 256Mi
requests:
memory: 32Mi
cpu: 50m
memory: 128Mi
cpu: 100m
volumes:
- name: config
configMap:
name: pingora-config
- name: tls
secret:
secretName: pingora-tls
emptyDir: {}
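
Writing cert files into the emptyDir only stays safe if the proxy never observes a half-written file during renewal. One common way to guarantee that (a sketch under the assumption the writer uses POSIX rename semantics; the actual sunbeam-proxy implementation is not shown in this diff) is write-to-temp plus atomic rename:

```python
import os
import tempfile

def write_atomically(path: str, data: bytes) -> None:
    """Write data so readers see either the old or the new file, never a
    partial one: write a temp file in the same directory, fsync it, then
    rename over the destination (os.replace is atomic on POSIX)."""
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)  # atomic: readers never see a torn cert
    except BaseException:
        os.unlink(tmp)
        raise
```

The temp file must live in the same directory (same filesystem) as the destination, otherwise the rename degrades to a non-atomic copy.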

View File

@@ -0,0 +1,44 @@
---
# ServiceAccount used by the Pingora pod.
# The watcher in sunbeam-proxy uses in-cluster credentials (this SA's token) to
# watch the pingora-tls Secret and pingora-config ConfigMap for changes.
apiVersion: v1
kind: ServiceAccount
metadata:
name: pingora
namespace: ingress
---
# Minimal read-only role: list+watch on the two objects that drive cert reloads.
# Scoped to the ingress namespace by the Role kind (not ClusterRole).
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: pingora-watcher
namespace: ingress
rules:
- apiGroups: [""]
resources: ["secrets"]
verbs: ["get", "list", "watch"]
- apiGroups: [""]
resources: ["configmaps"]
verbs: ["get", "list", "watch"]
# Ingresses are watched to route cert-manager HTTP-01 challenges to the
# correct per-domain solver pod (one Ingress per challenge, created by
# cert-manager with the exact token path and solver Service name).
- apiGroups: ["networking.k8s.io"]
resources: ["ingresses"]
verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: pingora-watcher
namespace: ingress
subjects:
- kind: ServiceAccount
name: pingora
namespace: ingress
roleRef:
kind: Role
name: pingora-watcher
apiGroup: rbac.authorization.k8s.io
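
The reload decision this Role enables is essentially a filter over watch events. A sketch with hypothetical names (the real watcher is Rust code inside sunbeam-proxy using this ServiceAccount's in-cluster token; event shapes here are simplified dicts):

```python
# Decide whether a watch event should trigger a cert/config reload.
# Hypothetical mirror of the sunbeam-proxy watcher's filtering logic.

WATCHED = {("Secret", "pingora-tls"), ("ConfigMap", "pingora-config")}

def should_reload(event: dict) -> bool:
    obj = event.get("object", {})
    kind = obj.get("kind")
    name = obj.get("metadata", {}).get("name")
    # ADDED fires on the initial list; MODIFIED on every update (e.g. renewal).
    return event.get("type") in ("ADDED", "MODIFIED") and (kind, name) in WATCHED

# With the official `kubernetes` Python client the loop would look roughly like:
#   from kubernetes import client, config, watch
#   config.load_incluster_config()
#   v1 = client.CoreV1Api()
#   for ev in watch.Watch().stream(v1.list_namespaced_secret, namespace="ingress"):
#       if should_reload(ev):
#           reload_certs()  # hypothetical
```

Keeping the filter pure makes it trivial to test without a cluster.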

View File

@@ -5,6 +5,7 @@ namespace: lasuite
resources:
- namespace.yaml
- postfix-deployment.yaml
- hive-config.yaml
- hive-deployment.yaml
- hive-service.yaml

View File

@@ -0,0 +1,81 @@
# Postfix MTA for the Messages email platform.
#
# MTA-out: accepts SMTP from cluster-internal services (Kratos, Messages Django),
# signs with DKIM, and relays outbound via Scaleway TEM.
#
# MTA-in: receives inbound email from the internet (routed via Pingora on port 25).
# In local dev, no MX record points here so inbound never arrives.
#
# Credentials: Secret "postfix-tem-credentials" with keys:
# smtp_user — Scaleway TEM SMTP username (project ID)
# smtp_password — Scaleway TEM SMTP password (API key)
#
# DKIM keys: Secret "postfix-dkim" with key:
# private.key — DKIM private key for sunbeam.pt (generated once; add DNS TXT record)
# selector — DKIM selector (e.g. "mail")
#
apiVersion: apps/v1
kind: Deployment
metadata:
name: postfix
namespace: lasuite
spec:
replicas: 1
selector:
matchLabels:
app: postfix
template:
metadata:
labels:
app: postfix
spec:
automountServiceAccountToken: false
containers:
- name: postfix
image: boky/postfix:latest
ports:
- name: smtp
containerPort: 25
protocol: TCP
env:
# Accept mail from all cluster-internal pods.
- name: MYNETWORKS
value: "10.0.0.0/8 172.16.0.0/12 192.168.0.0/16 127.0.0.0/8"
# Sending domain — replaced by sed at deploy time.
- name: ALLOWED_SENDER_DOMAINS
value: "DOMAIN_SUFFIX"
# Scaleway TEM outbound relay.
- name: RELAYHOST
value: "[smtp.tem.scw.cloud]:587"
- name: SASL_USER
valueFrom:
secretKeyRef:
name: postfix-tem-credentials
key: smtp_user
optional: true # allows pod to start before secret exists
- name: SASL_PASSWORD
valueFrom:
secretKeyRef:
name: postfix-tem-credentials
key: smtp_password
optional: true
resources:
limits:
memory: 64Mi
requests:
memory: 32Mi
cpu: 10m
---
apiVersion: v1
kind: Service
metadata:
name: postfix
namespace: lasuite
spec:
selector:
app: postfix
ports:
- name: smtp
port: 25
targetPort: 25
protocol: TCP
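
The MYNETWORKS value above covers the RFC 1918 ranges plus loopback. Whether Postfix would treat a given client as internal (and therefore relay its mail) can be sanity-checked with the stdlib `ipaddress` module:

```python
import ipaddress

# Mirrors the MYNETWORKS env value in the Deployment above.
MYNETWORKS = ["10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "127.0.0.0/8"]

def accepted_by_postfix(ip: str) -> bool:
    """True if the mynetworks setting above would treat this client as
    cluster-internal and accept mail for relay."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(net) for net in MYNETWORKS)
```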

View File

@@ -39,7 +39,7 @@ kratos:
courier:
smtp:
connection_uri: "smtp://local:local@localhost:25/"
connection_uri: "smtp://postfix.lasuite.svc.cluster.local:25/?skip_ssl_verify=true"
from_address: no-reply@DOMAIN_SUFFIX
from_name: Sunbeam
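
The courier `connection_uri` packs host, port, and TLS behavior into one URL. A quick stdlib check of what that URI encodes (Kratos itself parses it in Go; this is just an illustrative decomposition):

```python
from urllib.parse import urlsplit, parse_qs

uri = "smtp://postfix.lasuite.svc.cluster.local:25/?skip_ssl_verify=true"
parts = urlsplit(uri)
opts = parse_qs(parts.query)

host = parts.hostname          # in-cluster Postfix Service DNS name
port = parts.port              # plain SMTP, no auth needed inside the cluster
skip_verify = opts.get("skip_ssl_verify") == ["true"]
```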

View File

@@ -20,12 +20,26 @@ resources:
- ../../base/media
- ../../base/devtools
images:
# Local dev: image is built and imported directly into k3s containerd.
# imagePullPolicy: Never is set in values-pingora.yaml so k3s never tries to pull.
# Production overlay points this at src.DOMAIN_SUFFIX/sunbeam/sunbeam-proxy:latest.
- name: sunbeam-proxy
newName: sunbeam-proxy
newTag: dev
patches:
# Disable rustls-acme; add hostPort for TURN relay range on Lima VM
# Add hostPort for TURN relay range on Lima VM
- path: values-pingora.yaml
target:
kind: Deployment
name: pingora
# Downgrade LiveKit TURN service from LoadBalancer → ClusterIP (klipper would take hostPort 443)
- path: values-livekit.yaml
target:
kind: Service
name: livekit-server-turn
# Apply §10.7 memory limits to all Deployments
- path: values-resources.yaml

View File

@@ -0,0 +1,10 @@
# Local override: change LiveKit TURN service type from LoadBalancer to ClusterIP.
# k3s klipper-lb would otherwise bind hostPort 443, conflicting with Pingora.
# External TURN on port 443 is not needed in local dev (no NAT traversal required).
apiVersion: v1
kind: Service
metadata:
name: livekit-server-turn
namespace: media
spec:
type: ClusterIP

View File

@@ -1,7 +1,6 @@
# Patch: local Pingora overrides
# - Disables rustls-acme (ACME negotiation not needed locally)
# - Mounts mkcert wildcard cert from the pingora-tls Secret
# - Exposes TURN relay range as hostPort on the Lima VM
# - ACME disabled (mkcert wildcard cert from pingora-tls Secret)
# - hostPort for TURN relay range on the Lima VM
apiVersion: apps/v1
kind: Deployment
@@ -13,10 +12,17 @@ spec:
spec:
containers:
- name: pingora
env:
- name: ACME_ENABLED
value: "false"
imagePullPolicy: Never
ports:
# Bind HTTP/HTTPS directly to the Lima VM's host network
- name: http
containerPort: 80
hostPort: 80
protocol: TCP
- name: https
containerPort: 443
hostPort: 443
protocol: TCP
# Expose full TURN relay range as hostPort so the Lima VM forwards UDP
- name: turn-start
containerPort: 49152
@@ -26,5 +32,6 @@ spec:
containerPort: 49252
hostPort: 49252
protocol: UDP
# TLS cert comes from mkcert Secret created by scripts/local-certs.sh
# Secret name: pingora-tls, keys: tls.crt / tls.key
# acme.enabled = false is the default in pingora-config.yaml.
# The mkcert cert Secret (pingora-tls) is created by scripts/local-certs.sh
# before kustomize runs, so it is always present on first apply.

View File

@@ -46,7 +46,7 @@ spec:
- name: pingora
resources:
limits:
memory: 64Mi
memory: 128Mi
---
apiVersion: apps/v1

View File

@@ -0,0 +1,58 @@
# cert-manager resources for production TLS.
#
# Prerequisites:
# cert-manager must be installed in the cluster before applying this overlay:
# kubectl apply -f https://github.com/cert-manager/cert-manager/releases/latest/download/cert-manager.yaml
#
# DOMAIN_SUFFIX and ACME_EMAIL are substituted by sed at deploy time.
# See overlays/production/kustomization.yaml for the deploy command.
---
# ClusterIssuer: Let's Encrypt production via HTTP-01 challenge.
#
# cert-manager creates one Ingress per challenged domain. The pingora proxy
# watches these Ingresses and routes /.well-known/acme-challenge/<token>
# requests to the per-domain solver Service, so multi-SAN certificates are
# issued correctly even when all domain challenges run in parallel.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: letsencrypt-production
spec:
acme:
server: https://acme-v02.api.letsencrypt.org/directory
email: ACME_EMAIL
privateKeySecretRef:
name: letsencrypt-production-account-key
solvers:
- http01:
ingress:
# ingressClassName is intentionally blank: cert-manager still creates
# the Ingress object (which the proxy watches), but no ingress
# controller needs to act on it — the proxy handles routing itself.
ingressClassName: ""
---
# Certificate: single multi-SAN cert covering all proxy subdomains.
# cert-manager issues it via HTTP-01, stores it in pingora-tls Secret, and
# renews it automatically ~30 days before expiry. The watcher in sunbeam-proxy
# detects the Secret update and triggers a graceful upgrade so the new cert is
# loaded without dropping any connections.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: pingora-tls
namespace: ingress
spec:
secretName: pingora-tls
issuerRef:
name: letsencrypt-production
kind: ClusterIssuer
dnsNames:
- docs.DOMAIN_SUFFIX
- meet.DOMAIN_SUFFIX
- drive.DOMAIN_SUFFIX
- mail.DOMAIN_SUFFIX
- chat.DOMAIN_SUFFIX
- people.DOMAIN_SUFFIX
- src.DOMAIN_SUFFIX
- auth.DOMAIN_SUFFIX
- s3.DOMAIN_SUFFIX
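
The proxy-side lookup described in the ClusterIssuer comment (route each challenge request to the exact-token solver) can be sketched as follows. The Ingress shapes here are simplified hypothetical dicts; the real watcher reads full networking.k8s.io/v1 Ingress objects created by cert-manager:

```python
# Resolve an ACME HTTP-01 challenge request to cert-manager's per-domain
# solver Service. Simplified sketch of the proxy-side lookup.

def solver_backend(ingresses, host, path):
    """Return "http://<service>:<port>" for a matching challenge Ingress,
    or None for anything that is not a known challenge path."""
    if not path.startswith("/.well-known/acme-challenge/"):
        return None
    for ing in ingresses:
        for rule in ing.get("rules", []):
            if rule.get("host") != host:
                continue
            for p in rule.get("paths", []):
                # cert-manager writes the exact token path into the Ingress,
                # so an equality check suffices (no prefix matching needed).
                if p["path"] == path:
                    return "http://{}:{}".format(p["service"], p["port"])
    return None
```

Matching on the exact token path is what lets parallel challenges for all nine SANs resolve to the right solver pod.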

View File

@@ -2,8 +2,12 @@ apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
# Production overlay — targets Scaleway Elastic Metal (Paris)
# Deploy with: kubectl apply -k overlays/production/
# TODO: fill in all production values before first production deploy
#
# Deploy (DOMAIN_SUFFIX and ACME_EMAIL are substituted by sed):
# DOMAIN="yourdomain.com" EMAIL="ops@yourdomain.com"
# kustomize build overlays/production/ \
# | sed -e "s/DOMAIN_SUFFIX/${DOMAIN}/g" -e "s/ACME_EMAIL/${EMAIL}/g" \
# | kubectl apply --server-side --force-conflicts -f -
resources:
- ../../base/mesh
@@ -14,13 +18,17 @@ resources:
- ../../base/lasuite
- ../../base/media
- ../../base/devtools
# cert-manager ClusterIssuer + Certificate (requires cert-manager to be installed)
- cert-manager.yaml
images:
# Set to your container registry. DOMAIN_SUFFIX is substituted by sed.
- name: sunbeam-proxy
newName: src.DOMAIN_SUFFIX/sunbeam/sunbeam-proxy
newTag: latest
patches:
# TODO: set domain to sunbeam.pt
# - path: values-domain.yaml
# TODO: enable rustls-acme + Let's Encrypt, bind to public IP
# - path: values-pingora.yaml
- path: values-pingora.yaml
# TODO: set OIDC redirect URIs to https://*.sunbeam.pt/...
# - path: values-ory.yaml
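
The deploy pipeline in the header comment is plain global text substitution. Mirrored in Python for clarity (same semantics as the two `sed -e` expressions):

```python
def substitute(manifest: str, domain: str, email: str) -> str:
    """Equivalent of: sed -e "s/DOMAIN_SUFFIX/${DOMAIN}/g" -e "s/ACME_EMAIL/${EMAIL}/g"."""
    return manifest.replace("DOMAIN_SUFFIX", domain).replace("ACME_EMAIL", email)
```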

View File

@@ -0,0 +1,66 @@
# Patch: production Pingora overrides
#
# DOMAIN_SUFFIX and ACME_EMAIL are substituted by sed at deploy time.
# See overlays/production/kustomization.yaml for the deploy command.
# Production config: routes only (TLS and telemetry are the same as base).
# The cert is issued by cert-manager via the ClusterIssuer defined in
# cert-manager.yaml and stored in the pingora-tls Secret. The proxy fetches
# it from the K8s API on startup and on renewal — no acme-cache PVC needed.
apiVersion: v1
kind: ConfigMap
metadata:
name: pingora-config
namespace: ingress
data:
config.toml: |
[listen]
http = "0.0.0.0:80"
https = "0.0.0.0:443"
[tls]
cert_path = "/etc/tls/tls.crt"
key_path = "/etc/tls/tls.key"
[telemetry]
otlp_endpoint = ""
[[routes]]
host_prefix = "docs"
backend = "http://docs.lasuite.svc.cluster.local:8000"
websocket = true
[[routes]]
host_prefix = "meet"
backend = "http://meet.lasuite.svc.cluster.local:8000"
websocket = true
[[routes]]
host_prefix = "drive"
backend = "http://drive.lasuite.svc.cluster.local:8000"
[[routes]]
host_prefix = "mail"
backend = "http://messages.lasuite.svc.cluster.local:8000"
[[routes]]
host_prefix = "chat"
backend = "http://conversations.lasuite.svc.cluster.local:8000"
websocket = true
[[routes]]
host_prefix = "people"
backend = "http://people.lasuite.svc.cluster.local:8000"
[[routes]]
host_prefix = "src"
backend = "http://gitea.devtools.svc.cluster.local:3000"
websocket = true
[[routes]]
host_prefix = "auth"
backend = "http://hydra-public.ory.svc.cluster.local:4444"
[[routes]]
host_prefix = "s3"
backend = "http://seaweedfs-filer.storage.svc.cluster.local:8333"

View File

@@ -75,8 +75,22 @@ fi
limactl shell sunbeam sudo rm -f /var/lib/rancher/k3s/server/manifests/traefik.yaml 2>/dev/null || true
# ---------------------------------------------------------------------------
# 5. Install Gateway API CRDs + Linkerd via CLI
# 5. Install cert-manager
# ---------------------------------------------------------------------------
if ! kubectl $CTX get ns cert-manager &>/dev/null; then
echo "==> Installing cert-manager..."
kubectl $CTX apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.17.0/cert-manager.yaml
echo " Waiting for cert-manager webhooks..."
kubectl $CTX -n cert-manager rollout status deployment/cert-manager --timeout=120s
kubectl $CTX -n cert-manager rollout status deployment/cert-manager-webhook --timeout=120s
kubectl $CTX -n cert-manager rollout status deployment/cert-manager-cainjector --timeout=120s
echo " cert-manager installed."
else
echo "==> cert-manager already installed."
fi
# ---------------------------------------------------------------------------
# 6. Install Gateway API CRDs + Linkerd via CLI
# ---------------------------------------------------------------------------
if ! kubectl $CTX get ns linkerd &>/dev/null; then
echo "==> Installing Gateway API CRDs..."
kubectl $CTX apply --server-side -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.4.0/standard-install.yaml
@@ -95,9 +109,13 @@ else
fi
# ---------------------------------------------------------------------------
# 6. Generate mkcert wildcard cert
# ---------------------------------------------------------------------------
LIMA_IP=$(limactl shell sunbeam hostname -I | awk '{print $1}')
# 7. Generate mkcert wildcard cert
# ---------------------------------------------------------------------------
# Use eth1 (socket_vmnet shared network) — the address reachable from the Mac host.
LIMA_IP=$(limactl shell sunbeam ip -4 addr show eth1 2>/dev/null | awk '/inet / {print $2}' | cut -d/ -f1)
if [[ -z "$LIMA_IP" ]]; then
# Fallback: first non-loopback IP (works on first-boot before eth1 is up)
LIMA_IP=$(limactl shell sunbeam hostname -I | awk '{print $1}')
fi
DOMAIN="${LIMA_IP}.sslip.io"
SECRETS_DIR="$REPO_ROOT/secrets/local"
@@ -114,8 +132,7 @@ else
fi
# ---------------------------------------------------------------------------
# 7. Create TLS Secret in ingress namespace
# ---------------------------------------------------------------------------
# 8. Create TLS Secret in ingress namespace
# ---------------------------------------------------------------------------
echo "==> Applying TLS Secret to ingress namespace..."
kubectl $CTX create namespace ingress --dry-run=client -o yaml | kubectl $CTX apply -f -
kubectl $CTX create secret tls pingora-tls \
@@ -125,8 +142,7 @@ kubectl $CTX create secret tls pingora-tls \
--dry-run=client -o yaml | kubectl $CTX apply -f -
# ---------------------------------------------------------------------------
# 8. Apply manifests (server-side apply handles large CRDs)
# ---------------------------------------------------------------------------
# 9. Apply manifests (server-side apply handles large CRDs)
# ---------------------------------------------------------------------------
echo "==> Applying manifests (domain: $DOMAIN)..."
cd "$REPO_ROOT"
kustomize build overlays/local --enable-helm | \
@@ -134,14 +150,12 @@ kustomize build overlays/local --enable-helm | \
kubectl $CTX apply --server-side --force-conflicts -f -
# ---------------------------------------------------------------------------
# 9. Seed secrets (waits for postgres, creates K8s secrets, inits OpenBao)
# ---------------------------------------------------------------------------
# 10. Seed secrets (waits for postgres, creates K8s secrets, inits OpenBao)
# ---------------------------------------------------------------------------
echo "==> Seeding secrets..."
bash "$SCRIPT_DIR/local-seed-secrets.sh"
# ---------------------------------------------------------------------------
# 10. Restart deployments that were waiting for secrets
# ---------------------------------------------------------------------------
# 11. Restart deployments that were waiting for secrets
# ---------------------------------------------------------------------------
echo "==> Restarting services that were waiting for secrets..."
for ns_deploy in \
"ory/hydra" \
@@ -157,8 +171,7 @@ for ns_deploy in \
done
# ---------------------------------------------------------------------------
# 11. Wait for core components
# ---------------------------------------------------------------------------
# 12. Wait for core components
# ---------------------------------------------------------------------------
echo "==> Waiting for Valkey..."
kubectl $CTX rollout status deployment/valkey -n data --timeout=120s || true
@@ -169,8 +182,7 @@ echo "==> Waiting for Hydra..."
kubectl $CTX rollout status deployment/hydra -n ory --timeout=120s || true
# ---------------------------------------------------------------------------
# 12. Print URLs
# ---------------------------------------------------------------------------
# 13. Print URLs
# ---------------------------------------------------------------------------
echo ""
echo "==> Stack is up. Domain: $DOMAIN"
echo ""
@@ -182,6 +194,7 @@ echo " Drive: https://drive.${DOMAIN}/"
echo " Chat: https://chat.${DOMAIN}/"
echo " People: https://people.${DOMAIN}/"
echo " Gitea: https://src.${DOMAIN}/"
echo " Mailpit: https://mailpit.${DOMAIN}/ (captured outbound email)"
echo ""
echo "OpenBao UI: kubectl $CTX -n data port-forward svc/openbao 8200:8200"
echo " http://localhost:8200 (token from: kubectl $CTX -n data get secret openbao-keys -o jsonpath='{.data.root-token}' | base64 -d)"
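
The eth1-based LIMA_IP detection above shells out to `ip -4 addr show eth1 | awk '/inet / {print $2}' | cut -d/ -f1`. The same extraction in stdlib Python, handy for checking the pipeline against captured `ip` output:

```python
import re

def first_inet_addr(ip_addr_output: str):
    """Extract the first IPv4 address from `ip -4 addr show <dev>` output,
    matching what the awk '/inet / {print $2}' | cut -d/ -f1 pipeline yields."""
    m = re.search(r"^\s*inet (\d+\.\d+\.\d+\.\d+)/\d+", ip_addr_output, re.M)
    return m.group(1) if m else None
```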