- Prometheus: discover ServiceMonitors/PodMonitors in all namespaces,
  enable remote write receiver for Tempo metrics generator
- Tempo: enable metrics generator (service-graphs + span-metrics)
  with remote write to Prometheus
- Loki: add Grafana Alloy DaemonSet to ship container logs
- Grafana: enable dashboard sidecar, add Pingora/Loki/Tempo/OpenBao
  dashboards, add stable UIDs and cross-linking between datasources
  (Loki↔Tempo derived fields, traces→logs, traces→metrics, service map)
- Linkerd: enable proxy tracing to Alloy OTLP collector, point
  linkerd-viz at existing Prometheus instead of deploying its own
- Pingora: add OTLP rollout plan (endpoint commented out until proxy
  telemetry panic fix is deployed and Alloy is verified healthy)
- Set otlp_endpoint to Tempo HTTP receiver (port 4318) for request tracing
- Add hostNetwork to prometheusSpec so Prometheus can reach
  kubelet/node-exporter on the node's public IP
- Add ServiceMonitor to scrape proxy metrics on port 9090
- Add CORS origin and Grafana datasource config for monitoring stack
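
For reference, a minimal sketch of the kube-prometheus-stack Helm values
that the Prometheus items above would roughly correspond to. It assumes
the chart's standard `prometheus.prometheusSpec` keys are used; the
actual values file in this repository may be laid out differently.

```yaml
# Illustrative only: assumed kube-prometheus-stack values, not copied from the repo.
prometheus:
  prometheusSpec:
    # Do not restrict selection to monitors labelled with this Helm release,
    # so ServiceMonitors/PodMonitors from any namespace are discovered.
    serviceMonitorSelectorNilUsesHelmValues: false
    podMonitorSelectorNilUsesHelmValues: false
    # Accept remote-write pushes (e.g. from the Tempo metrics generator)
    # on /api/v1/write.
    enableRemoteWriteReceiver: true
    # Run in the node's network namespace so kubelet/node-exporter
    # endpoints on the node IP are reachable.
    hostNetwork: true
```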

Add new bases for cert-manager (Let's Encrypt + wildcard cert), Longhorn
distributed storage, and monitoring (kube-prometheus-stack + Loki + Tempo
+ Grafana OIDC). Add cloud-init for Scaleway Elastic Metal provisioning.

Production overlay: add patches for Postgres sizing, SeaweedFS volume,
OpenSearch storage, LiveKit service, Pingora host ports, resource limits,
and CNPG daily Barman backups. Update cert-manager.yaml with full dnsNames
for all *.sunbeam.pt subdomains.