Sienna Meridian Satterwhite 66e3692c8b docs: add Pingora proxy documentation — The Bouncer 💎
Security pipeline (DDoS, scanner, rate limiting), route table, ML
models, training pipeline, static serving, TLS, auth requests, metrics.
2026-03-24 11:46:11 +00:00

# The Bouncer 💎
Every request to The Super Boujee Business Box ✨ comes through one door — a custom reverse proxy built on Cloudflare's [Pingora](https://github.com/cloudflare/pingora) framework, written in Rust. It's the bouncer at the club: checks IDs, spots trouble, and knows exactly which room to send you to.
## Why Not Nginx/Traefik?
We tried the off-the-rack options. They didn't fit.
- We wanted ML-powered threat detection compiled into the binary
- Static file serving that replaces nginx sidecars entirely
- Hot-reload TLS from K8s Secrets (not file watchers)
- ACME challenge routing built-in
- Auth subrequests for admin endpoints
- All in pure Rust with rustls (no BoringSSL dependency)
- It's a few thousand lines of Rust. That's it.
## How Requests Flow
The three-layer security pipeline — every request walks the velvet rope in order:
```
Request → DDoS Detection → Scanner Detection → Rate Limiting → Cache → Backend
```
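
The ordering above can be sketched as a list of stages that stops at the first block. This is a minimal illustration, not the proxy's actual code: `Verdict`, `RequestInfo`, `Stage`, and the three stage functions are hypothetical names, and the boolean fields stand in for the real detectors. The 403 and 429 statuses and the observe-only DDoS stage come from the sections below.

```rust
/// Outcome of one pipeline stage (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
pub enum Verdict {
    /// Continue to the next stage.
    Pass,
    /// Stop and answer with this HTTP status.
    Block(u16),
}

/// Stand-in for the facts each stage inspects (hypothetical fields).
pub struct RequestInfo {
    pub looks_like_scanner: bool,
    pub over_rate_limit: bool,
}

pub type Stage = fn(&RequestInfo) -> Verdict;

/// DDoS detection is documented as observe-only, so it never blocks here.
pub fn ddos_stage(_req: &RequestInfo) -> Verdict {
    Verdict::Pass
}

/// Scanners are answered with 403 Forbidden.
pub fn scanner_stage(req: &RequestInfo) -> Verdict {
    if req.looks_like_scanner { Verdict::Block(403) } else { Verdict::Pass }
}

/// Rate-limited identities are answered with 429.
pub fn rate_limit_stage(req: &RequestInfo) -> Verdict {
    if req.over_rate_limit { Verdict::Block(429) } else { Verdict::Pass }
}

/// The three security stages in their documented order.
pub const SECURITY_STAGES: [Stage; 3] = [ddos_stage, scanner_stage, rate_limit_stage];

/// Runs the stages in order and returns the first block, if any.
/// A `Pass` result means the request may go on to the cache and backend.
pub fn run_pipeline(stages: &[Stage], req: &RequestInfo) -> Verdict {
    for stage in stages {
        if let Verdict::Block(status) = stage(req) {
            return Verdict::Block(status);
        }
    }
    Verdict::Pass
}
```

A request flagged by both the scanner and the rate limiter gets the scanner's 403, because that stage runs first.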
### 1. DDoS Detection (per-IP)
The bouncer watches your vibe over time. Fourteen behavioral features extracted over a 60-second sliding window:
- **Features:** request rate, path diversity, error rate, burst patterns, cookie/referer presence, suspicious paths, content-length patterns
- **Two-stage ensemble:**
  - Decision tree (fast path, ~2ns inference)
  - MLP fallback if uncertain (~85ns inference)
- **Threshold:** 0.6, minimum 10 events before classifying
- Both models fit in L1 cache (~4KB total)
- Currently in **observe-only mode** (logs decisions, doesn't block)
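
A rough sketch of how a two-stage ensemble like this can be wired together. Everything numeric here is a placeholder: the tree splits, the feature indices, and the MLP weights are invented for illustration, and applying the 0.6 threshold to the MLP score (with `>=`) is an assumption. Only the feature count, the threshold value, and the 10-event minimum come from the list above.

```rust
pub const NUM_FEATURES: usize = 14;
pub const THRESHOLD: f32 = 0.6;
pub const MIN_EVENTS: u32 = 10;
const HIDDEN: usize = 2;

// Placeholder weights; a real build would bake in trained values.
const W1: [[f32; NUM_FEATURES]; HIDDEN] = [
    [0.05, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
];
const B1: [f32; HIDDEN] = [-2.0, 0.5];
const W2: [f32; HIDDEN] = [1.5, -1.0];
const B2: f32 = -1.0;

pub enum TreeOutcome {
    Benign,
    Attack,
    Uncertain,
}

/// Fast path: a tiny hand-written tree. Assumes index 0 is the request
/// rate and index 2 is the error rate (hypothetical feature layout).
pub fn tree(features: &[f32; NUM_FEATURES]) -> TreeOutcome {
    let (request_rate, error_rate) = (features[0], features[2]);
    if request_rate < 5.0 {
        TreeOutcome::Benign
    } else if request_rate > 100.0 && error_rate > 0.5 {
        TreeOutcome::Attack
    } else {
        TreeOutcome::Uncertain
    }
}

/// Fallback: one hidden ReLU layer followed by a sigmoid output in (0, 1).
pub fn mlp_score(features: &[f32; NUM_FEATURES]) -> f32 {
    let mut out = B2;
    for h in 0..HIDDEN {
        let z: f32 = B1[h] + W1[h].iter().zip(features).map(|(w, x)| w * x).sum::<f32>();
        out += W2[h] * z.max(0.0);
    }
    1.0 / (1.0 + (-out).exp())
}

/// Returns `None` until the window holds enough events to classify.
pub fn classify(features: &[f32; NUM_FEATURES], events_in_window: u32) -> Option<bool> {
    if events_in_window < MIN_EVENTS {
        return None;
    }
    Some(match tree(features) {
        TreeOutcome::Benign => false,
        TreeOutcome::Attack => true,
        TreeOutcome::Uncertain => mlp_score(features) >= THRESHOLD,
    })
}
```

In observe-only mode the caller would log the `Some(true)` result rather than act on it.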
### 2. Scanner Detection (per-request)
Catching the people who show up with a crowbar. Twelve features analyzed: path structure, headers, user-agent, traversal patterns.
**Hard allowlist for legitimate clients:**
- Known host + cookies → allow (legitimate browser)
- Known host + browser UA + accept-language → allow
**Bot verification:**
- Googlebot: DNS + CIDR validation
- Bingbot: DNS + CIDR validation
- containerd: UA only
Same ensemble architecture as DDoS — decision tree fast path, MLP fallback. If you're a scanner, you get a **403 Forbidden**. Goodbye.
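
The two hard-allowlist rules can be read as a short-circuit that runs before the models. A minimal sketch, assuming a hypothetical `RequestHints` struct and a deliberately crude `looks_like_browser` check; the real proxy's notion of a "browser UA" is not documented here.

```rust
/// Facts the allowlist looks at (hypothetical struct for illustration).
pub struct RequestHints<'a> {
    pub known_host: bool,
    pub has_cookies: bool,
    pub user_agent: Option<&'a str>,
    pub has_accept_language: bool,
}

/// Crude stand-in for browser detection (assumption, not the real rule).
fn looks_like_browser(ua: &str) -> bool {
    ua.starts_with("Mozilla/")
}

/// True when the request matches one of the two documented allow rules,
/// in which case the scanner models are skipped.
pub fn hard_allow(req: &RequestHints) -> bool {
    if !req.known_host {
        return false;
    }
    // Rule 1: known host + cookies.
    if req.has_cookies {
        return true;
    }
    // Rule 2: known host + browser UA + accept-language.
    let browser_ua = req.user_agent.map_or(false, looks_like_browser);
    browser_ua && req.has_accept_language
}
```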
### 3. Rate Limiting (per-identity)
Even welcome guests have limits, darling.
**Identity resolution order:** session cookie > Bearer token > client IP
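
The resolution order can be sketched as a simple fallthrough. The `Identity` enum and `resolve_identity` function are hypothetical names, and treating session and Bearer identities as the "authenticated" tier is an assumption; the source only gives the order.

```rust
use std::net::IpAddr;

/// Who a request is counted against (hypothetical type for illustration).
#[derive(Debug, PartialEq, Eq, Hash, Clone)]
pub enum Identity {
    Session(String),
    Bearer(String),
    Ip(IpAddr),
}

impl Identity {
    /// Assumption: anything other than a bare IP counts as authenticated.
    pub fn is_authenticated(&self) -> bool {
        !matches!(self, Identity::Ip(_))
    }
}

/// Applies the documented order: session cookie, then Bearer token, then client IP.
pub fn resolve_identity(
    session_cookie: Option<&str>,
    authorization: Option<&str>,
    client_ip: IpAddr,
) -> Identity {
    if let Some(session) = session_cookie.filter(|s| !s.is_empty()) {
        return Identity::Session(session.to_string());
    }
    let bearer = authorization
        .and_then(|header| header.strip_prefix("Bearer "))
        .filter(|token| !token.is_empty());
    if let Some(token) = bearer {
        return Identity::Bearer(token.to_string());
    }
    Identity::Ip(client_ip)
}
```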
| Tier | Burst | Sustained |
|------|-------|-----------|
| Authenticated | 200 requests | 50 tokens/sec |
| Unauthenticated | 50 requests | 10 tokens/sec |
- **CIDR bypass:** `10.0.0.0/8`, `127.0.0.0/8`, `::1/128`
- Leaky bucket algorithm, 256 shards for low contention
- Response: **429 + Retry-After header**
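
A minimal sketch of a per-identity bucket with the burst and sustained numbers from the table. The text names a leaky bucket; this sketch uses the equivalent token-style meter because the table is phrased in tokens per second, which is an assumption about the real implementation. `Bucket`, `try_acquire`, and `shard_index` are hypothetical names, and time is passed in explicitly to keep the example deterministic.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

pub const SHARDS: usize = 256;

/// Picks which of the 256 shards holds the bucket for this identity.
pub fn shard_index(identity: &str) -> usize {
    let mut hasher = DefaultHasher::new();
    identity.hash(&mut hasher);
    (hasher.finish() % SHARDS as u64) as usize
}

pub struct Bucket {
    capacity: f64,       // burst size, e.g. 50 for unauthenticated
    refill_per_sec: f64, // sustained rate, e.g. 10 tokens/sec
    tokens: f64,
    last_refill_secs: f64,
}

impl Bucket {
    /// Starts full, so a fresh identity can use its whole burst.
    pub fn new(capacity: f64, refill_per_sec: f64, now_secs: f64) -> Self {
        Bucket { capacity, refill_per_sec, tokens: capacity, last_refill_secs: now_secs }
    }

    /// `Ok(())` admits the request; `Err(n)` means answer 429 with
    /// `Retry-After: n` (whole seconds until one token is available).
    pub fn try_acquire(&mut self, now_secs: f64) -> Result<(), u64> {
        let elapsed = (now_secs - self.last_refill_secs).max(0.0);
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        self.last_refill_secs = now_secs;
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            Ok(())
        } else {
            let wait_secs = (1.0 - self.tokens) / self.refill_per_sec;
            Err(wait_secs.ceil() as u64)
        }
    }
}
```

In a sharded limiter, each shard would typically hold its own lock and map of buckets, so two identities only contend when they hash to the same shard.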
### 4. Response Caching
In-memory via pingora-cache. Only requests that pass the security pipeline get cached — blocked requests never touch the cache.
- **Key:** `{host}{path}?{query}`
- Respects `Cache-Control`: no-store, private, s-maxage, max-age, stale-while-revalidate
- Per-route: configurable TTL, enable/disable
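
To make the key format and the `Cache-Control` handling concrete, here is a small std-only sketch. It does not use the pingora-cache API; `cache_key`, `CachePolicy`, and `cache_policy` are hypothetical names. Preferring `s-maxage` over `max-age` follows the usual shared-cache rule, falling back to the per-route TTL when neither is present is an assumption, and `stale-while-revalidate` is left out for brevity.

```rust
/// Builds the documented key shape `{host}{path}?{query}`.
/// An absent query is rendered as an empty string after the `?`.
pub fn cache_key(host: &str, path: &str, query: Option<&str>) -> String {
    format!("{host}{path}?{}", query.unwrap_or(""))
}

#[derive(Debug, PartialEq)]
pub enum CachePolicy {
    /// `no-store` or `private`: never cache this response.
    Bypass,
    /// Cache for this many seconds.
    Ttl(u64),
    /// No freshness directive: use the per-route TTL (assumption).
    RouteDefault,
}

/// Interprets a `Cache-Control` response header value.
pub fn cache_policy(cache_control: &str) -> CachePolicy {
    let mut max_age = None;
    let mut s_maxage = None;
    for raw in cache_control.split(',') {
        let directive = raw.trim().to_ascii_lowercase();
        match directive.split_once('=') {
            None if directive == "no-store" || directive == "private" => {
                return CachePolicy::Bypass;
            }
            Some(("s-maxage", value)) => s_maxage = value.parse::<u64>().ok(),
            Some(("max-age", value)) => max_age = value.parse::<u64>().ok(),
            _ => {}
        }
    }
    match s_maxage.or(max_age) {
        Some(seconds) => CachePolicy::Ttl(seconds),
        None => CachePolicy::RouteDefault,
    }
}
```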
## Route Table
Every subdomain gets routed by prefix (the part before the first dot). The bouncer knows every room in the building:
| Prefix | Backend | Notes |
|--------|---------|-------|
| `docs` | Collabora (WOPI) | WebSocket |
| `meet` | Meet frontend + `/api/` → backend | WebSocket |
| `drive` | Drive frontend + `/external_api/` → backend | versioning |
| `mail` | Messages frontend (Caddy) | |
| `messages` | Tuwunel (Matrix) | WebSocket |
| `people` | People frontend + `/api/`, `/admin/` → backend | |
| `find` | Find backend | |
| `src` | Gitea HTTP | WebSocket, SSH on port 22 |
| `auth` | Kratos UI + Hydra OAuth2 | `/.well-known` → Hydra |
| `integration` | La Suite navbar | |
| `cal` | Calendars frontend + `/api/`, `/caldav/` → backend | |
| `projects` | Projects | WebSocket |
| `s3` | SeaweedFS filer | |
| `livekit` | LiveKit dashboard | WebSocket |
| `metrics` | Grafana | |
| `systemmetrics` | Prometheus | |
| `systemlogs` | Loki | |
| `systemtracing` | Tempo | |
| `id` | Kratos admin (auth-gated) | subrequest to `/userinfo` |
| `hydra` | Hydra admin (auth-gated) | subrequest to `/userinfo` |
| `search` | OpenSearch (auth-gated) | subrequest to `/userinfo` |
| `vault` | OpenBao (auth-gated) | subrequest to `/userinfo` |
Path sub-routes use longest-prefix-first matching within each host.
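
Both routing steps are easy to sketch: take the host label before the first dot, then pick the longest matching path prefix within that host. `PathRoute`, `host_prefix`, and `pick_backend` are hypothetical names, and the backend strings in the usage note are placeholders rather than real service names.

```rust
/// One path sub-route within a host (hypothetical type for illustration).
pub struct PathRoute {
    pub prefix: &'static str,
    pub backend: &'static str,
}

/// Returns the part of the host before the first dot, e.g. `people`.
pub fn host_prefix(host: &str) -> &str {
    host.split('.').next().unwrap_or(host)
}

/// Chooses the sub-route with the longest matching prefix, or the host's
/// default backend (typically the frontend) when nothing matches.
pub fn pick_backend(
    routes: &[PathRoute],
    default_backend: &'static str,
    path: &str,
) -> &'static str {
    routes
        .iter()
        .filter(|route| path.starts_with(route.prefix))
        .max_by_key(|route| route.prefix.len())
        .map(|route| route.backend)
        .unwrap_or(default_backend)
}
```

With routes for `/api/` and a made-up, more specific `/api/internal/`, a request for `/api/internal/x` lands on the second one regardless of declaration order, while `/index.html` falls through to the default.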
## Auth Request Pattern
For admin endpoints (`id`, `hydra`, `search`, `vault`) — the VIP check:
1. Proxy sends `GET` to Hydra `/userinfo` with original `Cookie`/`Authorization` headers
2. If 2xx: forward to backend (you're in, gorgeous)
3. If non-2xx: **403 Forbidden** (not on the list)
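
The three steps above boil down to two small decisions: which headers travel with the subrequest, and how its status maps to an outcome. A sketch with hypothetical names (`AuthDecision`, `subrequest_headers`, `decide`); the HTTP call to `/userinfo` itself is left out.

```rust
#[derive(Debug, PartialEq)]
pub enum AuthDecision {
    /// Subrequest returned 2xx: pass the original request to the backend.
    Forward,
    /// Anything else: answer with this status.
    Deny(u16),
}

/// Keeps only the credentials the subrequest needs (step 1).
/// Header names are compared case-insensitively.
pub fn subrequest_headers<'a>(original: &[(&'a str, &'a str)]) -> Vec<(&'a str, &'a str)> {
    original
        .iter()
        .copied()
        .filter(|(name, _)| {
            name.eq_ignore_ascii_case("cookie") || name.eq_ignore_ascii_case("authorization")
        })
        .collect()
}

/// Maps the `/userinfo` response status to a decision (steps 2 and 3).
pub fn decide(userinfo_status: u16) -> AuthDecision {
    if (200..300).contains(&userinfo_status) {
        AuthDecision::Forward
    } else {
        AuthDecision::Deny(403)
    }
}
```

Note that a redirect from `/userinfo` counts as non-2xx here, so it is denied rather than followed.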
## Static File Serving
Replaces nginx sidecars entirely. No more sidecar containers cluttering up your pods.
- **Try-files chain:** exact → `.html` → `/index.html` → SPA fallback
- Content-Type auto-detection
- Cache headers: `immutable` for hashed assets, `no-cache` for others
- Path traversal protection (`../` rejected)
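
One way to picture the try-files chain is as an ordered list of candidate paths, checked on disk until one exists. The sketch below only builds that list and rejects `..` segments; `try_files_candidates` is a hypothetical name, treating the root `/index.html` as the SPA fallback is an assumption, and percent-encoded traversal attempts are not handled here.

```rust
/// Returns the candidate files to try, in order, or `None` when the path
/// contains a `..` segment and should be rejected outright.
pub fn try_files_candidates(request_path: &str) -> Option<Vec<String>> {
    // Path traversal protection: refuse any parent-directory segment.
    if request_path.split('/').any(|segment| segment == "..") {
        return None;
    }

    let trimmed = request_path.trim_end_matches('/');
    let mut candidates = Vec::new();
    if !trimmed.is_empty() {
        candidates.push(trimmed.to_string()); // exact
        candidates.push(format!("{trimmed}.html")); // `.html`
    }
    candidates.push(format!("{trimmed}/index.html")); // `/index.html`

    // SPA fallback: assumed to be the root index document.
    if !candidates.iter().any(|path| path == "/index.html") {
        candidates.push("/index.html".to_string());
    }
    Some(candidates)
}
```

For `/about` this yields `/about`, `/about.html`, `/about/index.html`, then `/index.html`.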
## TLS
Pure Rust, no C dependencies in the TLS stack.
- **rustls** + **aws-lc-rs** crypto backend
- **K8s Secret watcher:** cert renewal → writes to emptyDir → graceful upgrade (zero downtime)
- **Local:** mkcert wildcard cert
- **Production:** Let's Encrypt via cert-manager (ACME HTTP-01 challenges routed by the proxy itself)
## ML Training Pipeline
The models aren't downloaded — they're compiled into the binary. Weights baked in = zero model file overhead, L1-cache-resident inference.
1. Collect audit logs from production traffic
2. Auto-label with heuristics (request rate, path repetition, error rate thresholds)
3. Merge with external datasets (CSIC 2010, CIC-IDS2017)
4. Train ensemble offline (burn-rs + WGPU GPU acceleration)
5. Export as Rust `const` arrays
6. Recompile binary
7. Deploy + replay logs to validate accuracy
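
Step 5 is the part that makes the models "baked in". As a rough idea of what an exporter could emit, the sketch below renders a weight slice as Rust source text. `render_const_array` and the array name are hypothetical; the real exporter's output format is not documented here, and non-finite weights (NaN, infinity) are not handled.

```rust
/// Renders weights as a `pub const` array declaration that can be written
/// into a generated `.rs` file and compiled with the proxy.
pub fn render_const_array(name: &str, weights: &[f32]) -> String {
    // Debug formatting keeps the decimal point (`1.0`, not `1`), so each
    // finite value is a valid Rust float literal.
    let values: Vec<String> = weights.iter().map(|w| format!("{w:?}")).collect();
    format!(
        "pub const {name}: [f32; {}] = [{}];",
        weights.len(),
        values.join(", ")
    )
}
```

Calling it with the name `W2` and the weights `[1.5, -1.0]` produces `pub const W2: [f32; 2] = [1.5, -1.0];`, which is why step 6 is a recompile rather than a model download.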
## Metrics
Prometheus endpoint on `:9090/metrics`:
| Metric | Labels |
|--------|--------|
| `sunbeam_requests_total` | method, host, status, backend |
| `sunbeam_request_duration_seconds` | histogram, 1ms to 10s buckets |
| `sunbeam_ddos_decisions_total` | |
| `sunbeam_scanner_decisions_total` | |
| `sunbeam_rate_limit_decisions_total` | |
| `sunbeam_cache_status_total` | hit/miss |
| `sunbeam_active_connections` | |
## Audit Logs
Every request produces structured JSON:
- `request_id`, `method`, `host`, `path`, `client_ip`, `status`, `duration_ms`, `user_agent`, `backend`
- These logs are the training data — feed them back into the pipeline to retrain models. The bouncer learns from every shift.
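
For a sense of what one audit line might look like, here is a std-only sketch that serializes the listed fields by hand. The field names come from the list above; the field types (`u16` status, integer milliseconds), the field order, and the `AuditRecord` struct itself are assumptions. A real implementation would more likely use a JSON library.

```rust
/// One audit log entry (hypothetical struct; field types are assumptions).
pub struct AuditRecord<'a> {
    pub request_id: &'a str,
    pub method: &'a str,
    pub host: &'a str,
    pub path: &'a str,
    pub client_ip: &'a str,
    pub status: u16,
    pub duration_ms: u64,
    pub user_agent: &'a str,
    pub backend: &'a str,
}

/// Escapes a string for use inside a JSON string literal.
fn escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

impl AuditRecord<'_> {
    /// Renders the record as a single-line JSON object.
    pub fn to_json(&self) -> String {
        format!(
            r#"{{"request_id":"{}","method":"{}","host":"{}","path":"{}","client_ip":"{}","status":{},"duration_ms":{},"user_agent":"{}","backend":"{}"}}"#,
            escape(self.request_id),
            escape(self.method),
            escape(self.host),
            escape(self.path),
            escape(self.client_ip),
            self.status,
            self.duration_ms,
            escape(self.user_agent),
            escape(self.backend),
        )
    }
}
```

Escaping matters here because `path` and `user_agent` are attacker-controlled, and these lines later feed the training pipeline.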
## Deployment
- **Namespace:** `ingress`
- Single replica, `Recreate` strategy
- 256Mi memory limit, 100m CPU
- **ConfigMap:** `pingora-config` (config.toml with all routes + detection config)
- **RBAC:** ServiceAccount with read access to Secrets, ConfigMaps, Ingresses
- **ServiceMonitor** for Prometheus scraping
---
**Source:** [proxy repository](../proxy)