From 66e3692c8bd848357f170ebf736119c39baf569a Mon Sep 17 00:00:00 2001
From: Sienna Meridian Satterwhite
Date: Tue, 24 Mar 2026 11:46:11 +0000
Subject: [PATCH] =?UTF-8?q?docs:=20add=20Pingora=20proxy=20documentation?=
 =?UTF-8?q?=20=E2=80=94=20The=20Bouncer=20=F0=9F=92=8E?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Security pipeline (DDoS, scanner, rate limiting), route table,
ML models, training pipeline, static serving, TLS, auth requests, metrics.
---
 docs/proxy.md | 176 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 176 insertions(+)
 create mode 100644 docs/proxy.md

diff --git a/docs/proxy.md b/docs/proxy.md
new file mode 100644
index 0000000..bbe84ed
--- /dev/null
+++ b/docs/proxy.md
@@ -0,0 +1,176 @@
+# The Bouncer 💎
+
+Every request to The Super Boujee Business Box ✨ comes through one door — a custom reverse proxy built on Cloudflare's [Pingora](https://github.com/cloudflare/pingora) framework, written in Rust. It's the bouncer at the club: checks IDs, spots trouble, and knows exactly which room to send you to.
+
+## Why Not Nginx/Traefik?
+
+We tried the off-the-rack options. They didn't fit.
+
+- We wanted ML-powered threat detection compiled into the binary
+- Static file serving that replaces nginx sidecars entirely
+- Hot-reload TLS from K8s Secrets (not file watchers)
+- ACME challenge routing built-in
+- Auth subrequests for admin endpoints
+- All in pure Rust with rustls (no BoringSSL dependency)
+- It's a few thousand lines of Rust. That's it.
+
+## How Requests Flow
+
+The three-layer security pipeline — every request walks the velvet rope in order:
+
+```
+Request → DDoS Detection → Scanner Detection → Rate Limiting → Cache → Backend
+```
+
+### 1. DDoS Detection (per-IP)
+
+The bouncer watches your vibe over time. Fourteen behavioral features extracted over a 60-second sliding window:
+
+- **Features:** request rate, path diversity, error rate, burst patterns, cookie/referer presence, suspicious paths, content-length patterns
+- **Two-stage ensemble:**
+  - Decision tree (fast path, ~2ns inference)
+  - MLP fallback if uncertain (~85ns inference)
+- **Threshold:** 0.6, minimum 10 events before classifying
+- Both models fit in L1 cache (~4KB total)
+- Currently in **observe-only mode** (logs decisions, doesn't block)
+
+### 2. Scanner Detection (per-request)
+
+Catching the people who show up with a crowbar. Twelve features analyzed: path structure, headers, user-agent, traversal patterns.
+
+**Hard allowlist for legitimate bots:**
+- Known host + cookies → allow (legitimate browser)
+- Known host + browser UA + accept-language → allow
+
+**Bot verification:**
+- Googlebot: DNS + CIDR validation
+- Bingbot: DNS + CIDR validation
+- containerd: UA only
+
+Same ensemble architecture as DDoS — decision tree fast path, MLP fallback. If you're a scanner, you get a **403 Forbidden**. Goodbye.
+
+### 3. Rate Limiting (per-identity)
+
+Even welcome guests have limits, darling.
+
+**Identity resolution order:** session cookie > Bearer token > client IP
+
+| Tier | Burst | Sustained |
+|------|-------|-----------|
+| Authenticated | 200 requests | 50 tokens/sec |
+| Unauthenticated | 50 requests | 10 tokens/sec |
+
+- **CIDR bypass:** `10.0.0.0/8`, `127.0.0.0/8`, `::1/128`
+- Leaky bucket algorithm, 256 shards for low contention
+- Response: **429 + Retry-After header**
+
+### 4. Response Caching
+
+In-memory via pingora-cache. Only requests that pass the security pipeline get cached — blocked requests never touch the cache.
+
+- **Key:** `{host}{path}?{query}`
+- Respects `Cache-Control`: no-store, private, s-maxage, max-age, stale-while-revalidate
+- Per-route: configurable TTL, enable/disable
+
+## Route Table
+
+Every subdomain gets routed by prefix (the part before the first dot). The bouncer knows every room in the building:
+
+| Prefix | Backend | Notes |
+|--------|---------|-------|
+| `docs` | Collabora (WOPI) | WebSocket |
+| `meet` | Meet frontend + `/api/` → backend | WebSocket |
+| `drive` | Drive frontend + `/external_api/` → backend | versioning |
+| `mail` | Messages frontend (Caddy) | |
+| `messages` | Tuwunel (Matrix) | WebSocket |
+| `people` | People frontend + `/api/`, `/admin/` → backend | |
+| `find` | Find backend | |
+| `src` | Gitea HTTP | WebSocket, SSH on port 22 |
+| `auth` | Kratos UI + Hydra OAuth2 | `/.well-known` → Hydra |
+| `integration` | La Suite navbar | |
+| `cal` | Calendars frontend + `/api/`, `/caldav/` → backend | |
+| `projects` | Projects | WebSocket |
+| `s3` | SeaweedFS filer | |
+| `livekit` | LiveKit dashboard | WebSocket |
+| `metrics` | Grafana | |
+| `systemmetrics` | Prometheus | |
+| `systemlogs` | Loki | |
+| `systemtracing` | Tempo | |
+| `id` | Kratos admin (auth-gated) | subrequest to `/userinfo` |
+| `hydra` | Hydra admin (auth-gated) | subrequest to `/userinfo` |
+| `search` | OpenSearch (auth-gated) | subrequest to `/userinfo` |
+| `vault` | OpenBao (auth-gated) | subrequest to `/userinfo` |
+
+Path sub-routes use longest-prefix-first matching within each host.
+
+## Auth Request Pattern
+
+For admin endpoints (`id`, `hydra`, `search`, `vault`) — the VIP check:
+
+1. Proxy sends `GET` to Hydra `/userinfo` with original `Cookie`/`Authorization` headers
+2. If 2xx: forward to backend (you're in, gorgeous)
+3. If non-2xx: **403 Forbidden** (not on the list)
+
+## Static File Serving
+
+Replaces nginx sidecars entirely. No more sidecar containers cluttering up your pods.
+
+- **Try-files chain:** exact → `.html` → `/index.html` → SPA fallback
+- Content-Type auto-detection
+- Cache headers: `immutable` for hashed assets, `no-cache` for others
+- Path traversal protection (`../` rejected)
+
+## TLS
+
+Pure Rust, no C dependencies in the TLS stack.
+
+- **rustls** + **aws-lc-rs** crypto backend
+- **K8s Secret watcher:** cert renewal → writes to emptyDir → graceful upgrade (zero downtime)
+- **Local:** mkcert wildcard cert
+- **Production:** Let's Encrypt via cert-manager (ACME HTTP-01 challenges routed by the proxy itself)
+
+## ML Training Pipeline
+
+The models aren't downloaded — they're compiled into the binary. Weights baked in = zero model file overhead, L1-cache-resident inference.
+
+1. Collect audit logs from production traffic
+2. Auto-label with heuristics (request rate, path repetition, error rate thresholds)
+3. Merge with external datasets (CSIC 2010, CIC-IDS2017)
+4. Train ensemble offline (burn-rs + WGPU GPU acceleration)
+5. Export as Rust `const` arrays
+6. Recompile binary
+7. Deploy + replay logs to validate accuracy
+
+## Metrics
+
+Prometheus endpoint on `:9090/metrics`:
+
+| Metric | Labels |
+|--------|--------|
+| `sunbeam_requests_total` | method, host, status, backend |
+| `sunbeam_request_duration_seconds` | histogram, 1ms–10s buckets |
+| `sunbeam_ddos_decisions_total` | |
+| `sunbeam_scanner_decisions_total` | |
+| `sunbeam_rate_limit_decisions_total` | |
+| `sunbeam_cache_status_total` | hit/miss |
+| `sunbeam_active_connections` | |
+
+## Audit Logs
+
+Every request produces structured JSON:
+
+- `request_id`, `method`, `host`, `path`, `client_ip`, `status`, `duration_ms`, `user_agent`, `backend`
+- These logs are the training data — feed them back into the pipeline to retrain models. The bouncer learns from every shift.
+
+## Deployment
+
+- **Namespace:** `ingress`
+- Single replica, `Recreate` strategy
+- 256Mi memory limit, 100m CPU
+- **ConfigMap:** `pingora-config` (config.toml with all routes + detection config)
+- **RBAC:** ServiceAccount with read access to Secrets, ConfigMaps, Ingresses
+- **ServiceMonitor** for Prometheus scraping
+
+---
+
+**Source:** [proxy repository](../proxy)
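
A minimal sketch of the rate limiter described in section 3 of `docs/proxy.md`, assuming the "leaky bucket" is a meter whose capacity is the tier's burst size and whose drain rate is the tier's sustained rate. `Tier`, `LeakyBucket`, and `try_acquire` are hypothetical names for illustration, not the proxy's actual types. Time is passed in as seconds so the logic stays deterministic, and the 256-way sharding and CIDR bypass are left out.

```rust
/// Rate-limit tiers from the table in `docs/proxy.md`.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Tier {
    Authenticated,
    Unauthenticated,
}

impl Tier {
    /// Returns (burst capacity, sustained drain per second) as documented.
    pub fn limits(self) -> (f64, f64) {
        match self {
            Tier::Authenticated => (200.0, 50.0),
            Tier::Unauthenticated => (50.0, 10.0),
        }
    }
}

/// Hypothetical leaky-bucket meter for one identity (assumption, not the real type).
#[derive(Debug)]
pub struct LeakyBucket {
    capacity: f64,  // burst size
    leak_rate: f64, // units drained per second
    level: f64,     // current fill level
    last: f64,      // timestamp of the last update, in seconds
}

impl LeakyBucket {
    pub fn new(tier: Tier, now: f64) -> Self {
        let (capacity, leak_rate) = tier.limits();
        Self { capacity, leak_rate, level: 0.0, last: now }
    }

    /// Returns Ok(()) if the request is admitted, or Err(retry_after_secs)
    /// if it should be answered with 429 and a Retry-After header.
    pub fn try_acquire(&mut self, now: f64) -> Result<(), u64> {
        // Drain the bucket for the time elapsed since the last call.
        let elapsed = (now - self.last).max(0.0);
        self.level = (self.level - elapsed * self.leak_rate).max(0.0);
        self.last = self.last.max(now);

        if self.level + 1.0 <= self.capacity {
            self.level += 1.0;
            Ok(())
        } else {
            // Time until one unit of room has drained, rounded up to whole seconds.
            let wait = (self.level + 1.0 - self.capacity) / self.leak_rate;
            Err(wait.ceil().max(1.0) as u64)
        }
    }
}
```

With an unauthenticated bucket, 50 back-to-back calls at the same timestamp are admitted, the 51st returns `Err(1)` (a one-second `Retry-After`), and a call one second later is admitted again because 10 units have drained in the meantime.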