From 66e3692c8bd848357f170ebf736119c39baf569a Mon Sep 17 00:00:00 2001
From: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
Date: Tue, 24 Mar 2026 11:46:11 +0000
Subject: [PATCH] =?UTF-8?q?docs:=20add=20Pingora=20proxy=20documentation?=
 =?UTF-8?q?=20=E2=80=94=20The=20Bouncer=20=F0=9F=92=8E?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Security pipeline (DDoS, scanner, rate limiting), route table, ML
models, training pipeline, static serving, TLS, auth requests, metrics.
---
 docs/proxy.md | 176 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 176 insertions(+)
 create mode 100644 docs/proxy.md

diff --git a/docs/proxy.md b/docs/proxy.md
new file mode 100644
index 0000000..bbe84ed
--- /dev/null
+++ b/docs/proxy.md
@@ -0,0 +1,176 @@
+# The Bouncer 💎
+
+Every request to The Super Boujee Business Box ✨ comes through one door — a custom reverse proxy built on Cloudflare's [Pingora](https://github.com/cloudflare/pingora) framework, written in Rust. It's the bouncer at the club: checks IDs, spots trouble, and knows exactly which room to send you to.
+
+## Why Not Nginx/Traefik?
+
+We tried the off-the-rack options. They didn't fit. We wanted:
+
+- ML-powered threat detection compiled into the binary
+- Static file serving that replaces nginx sidecars entirely
+- Hot-reload TLS from K8s Secrets (not file watchers)
+- Built-in ACME challenge routing
+- Auth subrequests for admin endpoints
+- Rust throughout, with rustls for TLS (no BoringSSL dependency)
+- A small codebase: it's a few thousand lines of Rust. That's it.
+
+## How Requests Flow
+
+Every request walks the velvet rope in order: three security layers first, then the cache, then the backend.
+
+```
+Request → DDoS Detection → Scanner Detection → Rate Limiting → Cache → Backend
+```
+
+### 1. DDoS Detection (per-IP)
+
+The bouncer watches your vibe over time. Fourteen behavioral features are extracted per IP over a 60-second sliding window:
+
+- **Features:** request rate, path diversity, error rate, burst patterns, cookie/referer presence, suspicious paths, content-length patterns
+- **Two-stage ensemble:**
+  - Decision tree (fast path, ~2ns inference)
+  - MLP fallback if uncertain (~85ns inference)
+- **Threshold:** score of 0.6; an IP needs at least 10 events in the window before it is classified
+- Both models fit in L1 cache (~4KB total)
+- Currently in **observe-only mode** (logs decisions, doesn't block)
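+
+The two-stage flow above can be sketched roughly as follows. This is a minimal illustration, not the proxy's actual code: `TreeVerdict`, `tree_verdict`, `mlp_score`, the feature indices, and the toy weights are all assumptions. Only the 0.6 threshold and the 10-event minimum come from the list above.
+
+```rust
+/// Outcome of the fast-path decision tree (hypothetical type).
+#[derive(Debug, PartialEq)]
+enum TreeVerdict {
+    Benign,
+    Malicious,
+    Uncertain,
+}
+
+const THRESHOLD: f32 = 0.6; // from the doc
+const MIN_EVENTS: u32 = 10; // from the doc
+
+/// Toy stand-in for the compiled decision tree.
+/// Assumes features[0] = request rate and features[1] = error rate.
+fn tree_verdict(features: &[f32; 14]) -> TreeVerdict {
+    let (request_rate, error_rate) = (features[0], features[1]);
+    if request_rate < 5.0 {
+        TreeVerdict::Benign
+    } else if request_rate > 100.0 && error_rate > 0.5 {
+        TreeVerdict::Malicious
+    } else {
+        TreeVerdict::Uncertain
+    }
+}
+
+/// Toy stand-in for the MLP: a single logistic unit with made-up weights.
+fn mlp_score(features: &[f32; 14]) -> f32 {
+    let z = 0.02 * features[0] + 2.0 * features[1] - 2.0;
+    1.0 / (1.0 + (-z).exp())
+}
+
+/// Returns true when the window looks like an attack.
+/// The slower model runs only when the tree is uncertain.
+fn is_attack(features: &[f32; 14], events_in_window: u32) -> bool {
+    if events_in_window < MIN_EVENTS {
+        return false; // not enough evidence to classify yet
+    }
+    match tree_verdict(features) {
+        TreeVerdict::Benign => false,
+        TreeVerdict::Malicious => true,
+        TreeVerdict::Uncertain => mlp_score(features) >= THRESHOLD,
+    }
+}
+```
+
+In observe-only mode the result of something like `is_attack` would be logged rather than acted on.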
+
+### 2. Scanner Detection (per-request)
+
+Catching the people who show up with a crowbar. Twelve features analyzed: path structure, headers, user-agent, traversal patterns.
+
+**Hard allowlist for legitimate browsers:**
+- Known host + cookies → allow (legitimate browser)
+- Known host + browser UA + accept-language → allow
+
+**Bot verification:**
+- Googlebot: DNS + CIDR validation
+- Bingbot: DNS + CIDR validation
+- containerd: UA only
+
+Same ensemble architecture as DDoS — decision tree fast path, MLP fallback. If you're a scanner, you get a **403 Forbidden**. Goodbye.
+
+### 3. Rate Limiting (per-identity)
+
+Even welcome guests have limits, darling.
+
+**Identity resolution order:** session cookie > Bearer token > client IP
+
+| Tier | Burst (bucket capacity) | Sustained (refill rate) |
+|------|-------|-----------|
+| Authenticated | 200 requests | 50 tokens/sec |
+| Unauthenticated | 50 requests | 10 tokens/sec |
+
+- **CIDR bypass:** `10.0.0.0/8`, `127.0.0.0/8`, `::1/128`
+- Leaky bucket algorithm, 256 shards for low contention
+- Response: **429 Too Many Requests** with a `Retry-After` header
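+
+As a rough sketch of how the burst and sustained numbers interact, here is a per-identity bucket written in the token-bucket form, which behaves like a leaky bucket used as a meter. The proxy's real implementation may differ: `Bucket`, `try_acquire`, and `shard_for` are hypothetical names, and time is passed in as plain seconds to keep the example deterministic.
+
+```rust
+use std::collections::hash_map::DefaultHasher;
+use std::hash::{Hash, Hasher};
+
+/// One bucket per identity (hypothetical struct).
+struct Bucket {
+    capacity: f64,       // burst size, e.g. 50 for unauthenticated
+    refill_per_sec: f64, // sustained rate, e.g. 10 tokens/sec
+    tokens: f64,
+    last: f64,           // timestamp of the last update, in seconds
+}
+
+impl Bucket {
+    fn new(capacity: f64, refill_per_sec: f64, now: f64) -> Self {
+        Bucket { capacity, refill_per_sec, tokens: capacity, last: now }
+    }
+
+    /// Ok(()) admits the request; Err(secs) is a value for Retry-After.
+    fn try_acquire(&mut self, now: f64) -> Result<(), u64> {
+        let elapsed = (now - self.last).max(0.0);
+        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
+        self.last = now;
+        if self.tokens >= 1.0 {
+            self.tokens -= 1.0;
+            Ok(())
+        } else {
+            let wait = (1.0 - self.tokens) / self.refill_per_sec;
+            Err(wait.ceil().max(1.0) as u64)
+        }
+    }
+}
+
+/// Picks one of 256 shards so unrelated identities rarely share a lock.
+fn shard_for(identity: &str) -> usize {
+    let mut hasher = DefaultHasher::new();
+    identity.hash(&mut hasher);
+    (hasher.finish() % 256) as usize
+}
+```
+
+With the unauthenticated tier (`Bucket::new(50.0, 10.0, now)`), the first 50 requests in a burst pass, the next one is rejected, and capacity comes back at 10 requests per second.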
+
+### 4. Response Caching
+
+In-memory via pingora-cache. Only requests that pass the security pipeline get cached — blocked requests never touch the cache.
+
+- **Key:** `{host}{path}?{query}`
+- Respects `Cache-Control` directives: `no-store`, `private`, `s-maxage`, `max-age`, `stale-while-revalidate`
+- Per-route: configurable TTL, enable/disable
+
+## Route Table
+
+Every subdomain gets routed by prefix (the part before the first dot). The bouncer knows every room in the building:
+
+| Prefix | Backend | Notes |
+|--------|---------|-------|
+| `docs` | Collabora (WOPI) | WebSocket |
+| `meet` | Meet frontend + `/api/` → backend | WebSocket |
+| `drive` | Drive frontend + `/external_api/` → backend | versioning |
+| `mail` | Messages frontend (Caddy) | |
+| `messages` | Tuwunel (Matrix) | WebSocket |
+| `people` | People frontend + `/api/`, `/admin/` → backend | |
+| `find` | Find backend | |
+| `src` | Gitea HTTP | WebSocket, SSH on port 22 |
+| `auth` | Kratos UI + Hydra OAuth2 | `/.well-known` → Hydra |
+| `integration` | La Suite navbar | |
+| `cal` | Calendars frontend + `/api/`, `/caldav/` → backend | |
+| `projects` | Projects | WebSocket |
+| `s3` | SeaweedFS filer | |
+| `livekit` | LiveKit dashboard | WebSocket |
+| `metrics` | Grafana | |
+| `systemmetrics` | Prometheus | |
+| `systemlogs` | Loki | |
+| `systemtracing` | Tempo | |
+| `id` | Kratos admin (auth-gated) | subrequest to `/userinfo` |
+| `hydra` | Hydra admin (auth-gated) | subrequest to `/userinfo` |
+| `search` | OpenSearch (auth-gated) | subrequest to `/userinfo` |
+| `vault` | OpenBao (auth-gated) | subrequest to `/userinfo` |
+
+Path sub-routes use longest-prefix-first matching within each host.
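+
+The two lookups described here (host prefix, then longest matching path prefix) could look something like the sketch below. `host_prefix`, `pick_backend`, and the backend names in the usage note are illustrative assumptions, not the proxy's real identifiers.
+
+```rust
+/// Returns the part of the host before the first dot.
+fn host_prefix(host: &str) -> &str {
+    host.split('.').next().unwrap_or(host)
+}
+
+/// Chooses the backend whose path prefix is the longest match,
+/// falling back to `default` (typically the frontend) when none match.
+fn pick_backend<'a>(
+    path: &str,
+    sub_routes: &[(&str, &'a str)],
+    default: &'a str,
+) -> &'a str {
+    sub_routes
+        .iter()
+        .filter(|(prefix, _)| path.starts_with(*prefix))
+        .max_by_key(|(prefix, _)| prefix.len())
+        .map(|(_, backend)| *backend)
+        .unwrap_or(default)
+}
+```
+
+For the `people` row, calling `pick_backend("/api/users", &[("/api/", "people-backend"), ("/admin/", "people-backend")], "people-frontend")` would select the backend, while `/index.html` would fall through to the frontend.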
+
+## Auth Request Pattern
+
+For admin endpoints (`id`, `hydra`, `search`, `vault`) — the VIP check:
+
+1. Proxy sends `GET` to Hydra `/userinfo` with original `Cookie`/`Authorization` headers
+2. If 2xx: forward to backend (you're in, gorgeous)
+3. If non-2xx: **403 Forbidden** (not on the list)
+
+## Static File Serving
+
+Replaces nginx sidecars entirely. No more sidecar containers cluttering up your pods.
+
+- **Try-files chain:** exact path → path + `.html` → path + `/index.html` → SPA fallback
+- Content-Type auto-detection
+- Cache headers: `immutable` for hashed assets, `no-cache` for others
+- Path traversal protection (`../` rejected)
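+
+A minimal sketch of the try-files chain and the traversal check, under a few assumptions: `has_traversal` and `try_files_candidates` are hypothetical names, the SPA fallback is taken to be the root `/index.html`, and percent-encoded input is assumed to have been decoded already.
+
+```rust
+/// True if any path segment is `..` (assumes the path is already URL-decoded).
+fn has_traversal(path: &str) -> bool {
+    path.split('/').any(|segment| segment == "..")
+}
+
+/// Returns the files to probe, in order, or None if the path is rejected.
+fn try_files_candidates(path: &str) -> Option<Vec<String>> {
+    if has_traversal(path) {
+        return None;
+    }
+    let trimmed = path.trim_end_matches('/');
+    Some(vec![
+        path.to_string(),               // exact match
+        format!("{trimmed}.html"),      // path + .html
+        format!("{trimmed}/index.html"), // directory index
+        "/index.html".to_string(),      // SPA fallback (assumed location)
+    ])
+}
+```
+
+The caller would serve the first candidate that exists on disk and return an error for a rejected path.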
+
+## TLS
+
+rustls handles the TLS protocol in Rust; the crypto primitives come from aws-lc-rs, which wraps the C/assembly AWS-LC library.
+
+- **rustls** + **aws-lc-rs** crypto backend
+- **K8s Secret watcher:** cert renewal → writes to emptyDir → graceful upgrade (zero downtime)
+- **Local:** mkcert wildcard cert
+- **Production:** Let's Encrypt via cert-manager (ACME HTTP-01 challenges routed by the proxy itself)
+
+## ML Training Pipeline
+
+The models aren't downloaded — they're compiled into the binary. Baking the weights in means no model files to load at runtime and L1-cache-resident inference.
+
+1. Collect audit logs from production traffic
+2. Auto-label with heuristics (request rate, path repetition, error rate thresholds)
+3. Merge with external datasets (CSIC 2010, CIC-IDS2017)
+4. Train the ensemble offline (Burn with the WGPU backend for GPU acceleration)
+5. Export as Rust `const` arrays
+6. Recompile binary
+7. Deploy + replay logs to validate accuracy
+
+## Metrics
+
+Prometheus endpoint on `:9090/metrics`:
+
+| Metric | Labels / notes |
+|--------|--------|
+| `sunbeam_requests_total` | method, host, status, backend |
+| `sunbeam_request_duration_seconds` | histogram, 1ms–10s buckets |
+| `sunbeam_ddos_decisions_total` | |
+| `sunbeam_scanner_decisions_total` | |
+| `sunbeam_rate_limit_decisions_total` | |
+| `sunbeam_cache_status_total` | hit/miss |
+| `sunbeam_active_connections` | |
+
+## Audit Logs
+
+Every request produces structured JSON:
+
+- `request_id`, `method`, `host`, `path`, `client_ip`, `status`, `duration_ms`, `user_agent`, `backend`
+- These logs are the training data — feed them back into the pipeline to retrain models. The bouncer learns from every shift.
+
+## Deployment
+
+- **Namespace:** `ingress`
+- Single replica, `Recreate` strategy
+- 256Mi memory limit, 100m CPU
+- **ConfigMap:** `pingora-config` (config.toml with all routes + detection config)
+- **RBAC:** ServiceAccount with read access to Secrets, ConfigMaps, Ingresses
+- **ServiceMonitor** for Prometheus scraping
+
+---
+
+**Source:** [proxy repository](../proxy)