Commit Graph

30 Commits

Author SHA1 Message Date
9db2b1655f test(cluster): add integration tests and proptests for cluster subsystem
7 integration tests: two-node gossip exchange, three-node mesh
propagation, tenant isolation, standalone mode, aggregate bandwidth
meter, bandwidth limiter enforcement, and default 1 Gbps cap.

8 proptests for the bandwidth limiter plus 11 existing cluster proptests
covering meter, tracker, and cluster state invariants.

Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
2026-03-10 23:38:21 +00:00
6e5cc75493 feat(cluster): add k8s headless service for gossip peer discovery
Headless Service (clusterIP: None) with gossip-udp port 11204 for
DNS-based peer discovery in k8s deployments.

Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
2026-03-10 23:38:21 +00:00
3722972ddf feat(cluster): add Prometheus metrics for cluster gossip and bandwidth
New metrics: cluster peers gauge, bandwidth in/out gauges, gossip message
counter, aggregate rate gauges (in/out/total bytes/sec), model update
counter, and bandwidth limit enforcement decision counter.

Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
2026-03-10 23:38:21 +00:00
65516404e1 feat(cluster): wire cluster into proxy lifecycle and request pipeline
Spawn cluster on dedicated thread in main.rs with graceful fallback to
standalone on failure. Add cluster field to SunbeamProxy, record
bandwidth in logging(), and enforce cluster-wide bandwidth cap in
request_filter with 429 JSON response.

Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
2026-03-10 23:38:21 +00:00
5d279f992b feat(cluster): implement gossip-based cluster subsystem with iroh
Core cluster module with four gossip channels (bandwidth, models,
leader, license) over iroh-gossip HyParView/PlumTree. Includes:
- BandwidthTracker: atomic per-node counters with zero hot-path contention
- ClusterBandwidthState: peer aggregation with stale eviction
- BandwidthMeter: sliding-window aggregate rate (power-of-2 MiB units)
- BandwidthLimiter: runtime-mutable bandwidth cap (default 1 Gbps)
- ClusterHandle/spawn_cluster: dedicated OS thread + tokio runtime
- Bincode-serialized message envelope with versioned payloads
- Bootstrap and k8s peer discovery modes
- Persistent ed25519 identity for stable EndpointId across restarts

Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
2026-03-10 23:38:20 +00:00
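
Note on 5d279f992b: the "atomic per-node counters with zero hot-path contention" idea can be sketched with plain `std` atomics. This is only an illustration, not the crate's code; the struct layout and the method names `record` and `take_snapshot` are assumptions.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Illustrative per-node byte counters (hypothetical layout, not the real type).
#[derive(Default)]
pub struct BandwidthTracker {
    bytes_in: AtomicU64,
    bytes_out: AtomicU64,
}

impl BandwidthTracker {
    /// Hot path: two relaxed atomic adds, no locks and no allocation.
    pub fn record(&self, bytes_in: u64, bytes_out: u64) {
        self.bytes_in.fetch_add(bytes_in, Ordering::Relaxed);
        self.bytes_out.fetch_add(bytes_out, Ordering::Relaxed);
    }

    /// Returns the (in, out) totals accumulated since the previous snapshot
    /// and resets both counters to zero, e.g. once per broadcast interval.
    pub fn take_snapshot(&self) -> (u64, u64) {
        (
            self.bytes_in.swap(0, Ordering::Relaxed),
            self.bytes_out.swap(0, Ordering::Relaxed),
        )
    }
}
```

Relaxed ordering is enough in this sketch because the counters are independent statistics and no other memory is published through them.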
ad5c7f0afb feat(cluster): add iroh-gossip dependencies and cluster config schema
Add iroh v0.96, iroh-gossip v0.96, blake3, hex, and rand v0.9 to
Cargo.toml. Define ClusterConfig, DiscoveryConfig, BandwidthClusterConfig,
and ModelsConfig in config.rs with serde defaults for gossip port (11204),
broadcast interval (1s), meter window (30s), and peer discovery methods
(k8s/bootstrap).

Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
2026-03-10 23:38:20 +00:00
2660ee974c fix(proxy): skip detection pipeline for bypass CIDR IPs
Trusted IPs (localhost, pod network) now skip the entire DDoS/scanner/
rate-limit pipeline via early return. Fixes buildkitd pushes to Gitea
being blocked by the scanner when using host networking.

Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
2026-03-10 23:38:20 +00:00
bc82ca2961 fix(docker): copy benches/ directory for Cargo.toml manifest parsing
Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
2026-03-10 23:38:20 +00:00
a5810dd8a7 feat: configurable k8s resources, CSIC training pipeline, unified Dockerfile
- Make K8s namespace, TLS secret, and config ConfigMap names configurable
  via [kubernetes] config section (previously hardcoded to "ingress")
- Add CSIC 2010 dataset converter and auto-download for scanner training
- Unify Dockerfile for local and production builds (remove cross-compile path)
- Bake ML models directory into container image
- Update CSIC dataset URL to self-hosted mirror (src.sunbeam.pt)
- Fix rate_limit pipeline log missing fields
- Consolidate docs/README.md into root README.md

Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
2026-03-10 23:38:20 +00:00
0baab92141 docs: add project README, reference docs, license, CLA, and contributing guide
Apache-2.0 license with CLA for dual-licensing. Lefthook enforces
Signed-off-by on all commits. AGENTS.md updated with new modules.

Signed-off-by: Sienna Meridian Satterwhite <sienna@r3t.io>
Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
2026-03-10 23:38:20 +00:00
39fe5f9f5f test: add property-based tests for new proxy features
Add proptest-based tests covering content_type_for, cache_control_for,
backend_addr, UUID v4 request IDs, rewrite rule compilation, body
rewriting, config TOML deserialization, path traversal prevention,
metrics label validation, and static file serving.

Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
2026-03-10 23:38:20 +00:00
0f31c7645c feat(cache): add pingora-cache integration with per-route config
Add in-memory HTTP response cache using pingora-cache MemCache backend.
Cache runs after the detection pipeline so cache hits bypass upstream
request modifications and body rewriting. Respects Cache-Control
(no-store, private, s-maxage, max-age), skips caching for routes with
body rewrites or auth subrequest headers, and supports configurable
default TTL, stale-while-revalidate, and max file size per route.

Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
2026-03-10 23:38:20 +00:00
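
Note on 0f31c7645c: a rough sketch of how the listed Cache-Control directives might map to a storage decision. The function name `cache_ttl_secs` and its signature are assumptions made for illustration; the actual integration goes through pingora-cache and is not shown here.

```rust
/// Returns `None` when the response must not be stored in a shared cache,
/// otherwise the TTL in seconds. Hypothetical helper, not the proxy's API.
pub fn cache_ttl_secs(cache_control: Option<&str>, default_ttl: u64) -> Option<u64> {
    let value = match cache_control {
        Some(v) => v,
        // No header at all: fall back to the route's default TTL.
        None => return Some(default_ttl),
    };
    let mut max_age: Option<u64> = None;
    let mut s_maxage: Option<u64> = None;
    for raw in value.split(',') {
        let directive = raw.trim().to_ascii_lowercase();
        if directive == "no-store" || directive == "private" {
            return None;
        }
        if let Some(v) = directive.strip_prefix("s-maxage=") {
            s_maxage = v.parse::<u64>().ok();
        } else if let Some(v) = directive.strip_prefix("max-age=") {
            max_age = v.parse::<u64>().ok();
        }
    }
    // For a shared cache, s-maxage takes precedence over max-age.
    Some(s_maxage.or(max_age).unwrap_or(default_ttl))
}
```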
76ad9e93e5 feat(static_files): add static file serving, SPA fallback, rewrites, body rewriting, and auth subrequests
Add static file serving with try_files chain ($uri, $uri.html,
$uri/index.html, fallback), regex-based URL rewrites compiled at
startup, response body find/replace for text/html and JS content,
auth subrequests with header capture for path routes, and custom
response headers per route. Extends RouteConfig with static_root,
fallback, rewrites, body_rewrites, and response_headers fields.

Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
2026-03-10 23:38:20 +00:00
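
Note on 76ad9e93e5: the try_files chain ($uri, $uri.html, $uri/index.html, fallback) could expand a request path roughly as follows. `try_files_candidates` is a hypothetical name, and the treatment of the root path is an assumption.

```rust
/// Expands a request path into the ordered list of files to try, relative
/// to the static root. Illustrative only: a real implementation must also
/// reject `..` segments before touching the filesystem.
pub fn try_files_candidates(uri: &str, fallback: Option<&str>) -> Vec<String> {
    let trimmed = uri.trim_start_matches('/').trim_end_matches('/');
    let mut candidates = Vec::new();
    if trimmed.is_empty() {
        // Assumption: the root path maps straight to index.html.
        candidates.push("index.html".to_string());
    } else {
        candidates.push(trimmed.to_string()); // $uri
        candidates.push(format!("{}.html", trimmed)); // $uri.html
        candidates.push(format!("{}/index.html", trimmed)); // $uri/index.html
    }
    if let Some(fb) = fallback {
        candidates.push(fb.to_string()); // SPA fallback
    }
    candidates
}
```

The caller would serve the first candidate that exists as a regular file.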
0fd10110ff feat(proxy): add request IDs, tracing spans, and observability hooks
Generate UUID v4 request IDs per request, create manual tracing spans
(Pingora types don't impl Debug), record Prometheus metrics for
detection decisions and request totals, and forward X-Request-Id to
both upstream requests and downstream responses.

Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
2026-03-10 23:38:20 +00:00
1ae185b5a5 feat(metrics): add Prometheus metrics and scrape endpoint
Add a prometheus metrics module with counters for requests, DDoS/scanner/
rate-limit decisions, active connections gauge, and request duration
histogram. Spawn a lightweight HTTP server on a configurable port
(default 9090) serving /metrics and /health endpoints.

Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
2026-03-10 23:38:20 +00:00
70781679b5 fix(dual_stack): set IPV6_V6ONLY on IPv6 socket to prevent EADDRINUSE
On Linux with net.ipv6.bindv6only=0 (default), binding [::]:port
claims both IPv4 and IPv6, causing the subsequent 0.0.0.0:port bind
to fail. Set IPV6_V6ONLY=1 so each socket only handles its own
address family, fixing SSH TCP proxy bind failure in containers.

Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
2026-03-10 23:38:20 +00:00
45f0751e1e feat(bench): add Criterion benchmarks and CSIC 2010 dataset converter
8 scanner benchmarks covering allowlist fast path (7.6ns), model path
(172-445ns), and feature extraction (248ns). Python converter script
transforms CSIC 2010 raw HTTP dataset into Sunbeam audit-log JSONL
with realistic scanner feature adaptation.

Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
2026-03-10 23:38:20 +00:00
867b6b2489 feat(proxy): integrate DDoS, scanner, and rate limiter into request pipeline
Wire up all three detection layers in request_filter with pipeline
logging at each stage for unfiltered training data. Add DDoS, scanner,
and rate_limit config sections. Bot allowlist check before scanner
model on the hot path. CLI subcommands for train/replay.

Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
2026-03-10 23:38:20 +00:00
ae18b00fa4 feat(scanner): add model hot-reload and verified bot allowlist
ArcSwap-based lock-free hot-reload via file mtime polling. Bot
allowlist with CIDR (instant) + reverse/forward DNS (cached with
background worker thread) IP verification to prevent UA spoofing
by known crawlers, LLM agents, and commercial B2B bots.

Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
2026-03-10 23:38:19 +00:00
273a203c41 feat(scanner): add logistic regression training pipeline
JSONL audit log ingestion with ground-truth label support for external
datasets (CSIC 2010), SecLists wordlist ingestion for synthetic attack
samples, class-weighted gradient descent, stratified 80/20 train/test
split with held-out evaluation metrics (precision, recall, F1).

Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
2026-03-10 23:38:19 +00:00
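
Note on 273a203c41: class-weighted gradient descent for logistic regression can be sketched as below. This is a minimal batch version under stated assumptions: the "balanced" weighting n / (2 * class_count), the function name `train_logreg`, and the absence of regularization are illustrative choices, not taken from the commit.

```rust
fn sigmoid(z: f64) -> f64 {
    1.0 / (1.0 + (-z).exp())
}

/// Trains weights and bias with batch gradient descent, scaling each
/// sample's error by its class weight so the rarer class is not drowned out.
pub fn train_logreg(xs: &[Vec<f64>], ys: &[bool], lr: f64, epochs: usize) -> (Vec<f64>, f64) {
    let n = xs.len();
    let dim = xs.first().map_or(0, |x| x.len());
    let mut w = vec![0.0; dim];
    let mut b = 0.0;
    if n == 0 {
        return (w, b);
    }
    let pos = ys.iter().filter(|&&y| y).count();
    let neg = n - pos;
    // Assumed "balanced" class weights: n / (2 * class_count).
    let w_pos = if pos > 0 { n as f64 / (2.0 * pos as f64) } else { 0.0 };
    let w_neg = if neg > 0 { n as f64 / (2.0 * neg as f64) } else { 0.0 };
    for _ in 0..epochs {
        let mut grad_w = vec![0.0; dim];
        let mut grad_b = 0.0;
        for (x, &y) in xs.iter().zip(ys) {
            let z: f64 = b + w.iter().zip(x).map(|(wi, xi)| wi * xi).sum::<f64>();
            let target = if y { 1.0 } else { 0.0 };
            let class_weight = if y { w_pos } else { w_neg };
            let err = class_weight * (sigmoid(z) - target);
            for (g, xi) in grad_w.iter_mut().zip(x) {
                *g += err * xi;
            }
            grad_b += err;
        }
        for (wi, g) in w.iter_mut().zip(&grad_w) {
            *wi -= lr * g / n as f64;
        }
        b -= lr * grad_b / n as f64;
    }
    (w, b)
}
```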
b7c8243955 feat(scanner): add per-request scanner detector with linear classifier
12-feature extraction (zero-alloc hot path), 2 interaction terms,
weighted linear scoring model with hard allowlist short-circuits for
configured host+cookies and host+browser UA. Returns ScannerVerdict
with score+reason for pipeline logging.

Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
2026-03-10 23:38:19 +00:00
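
Note on b7c8243955: a weighted linear score over 12 features plus 2 interaction terms might look like the sketch below. The struct, its field names, and the choice of which feature pairs interact are all hypothetical; the commit does not state them.

```rust
pub const N_FEATURES: usize = 12;
pub const N_INTERACTIONS: usize = 2;

/// Hypothetical model layout: one weight per feature, plus a weight for
/// each product of two features (an "interaction term").
pub struct LinearScorer {
    pub weights: [f64; N_FEATURES],
    pub interaction_pairs: [(usize, usize); N_INTERACTIONS],
    pub interaction_weights: [f64; N_INTERACTIONS],
    pub bias: f64,
}

impl LinearScorer {
    /// Computes bias + sum(w_i * x_i) + sum(v_k * x_a * x_b).
    /// Works on fixed-size arrays only, so it performs no heap allocation.
    pub fn score(&self, x: &[f64; N_FEATURES]) -> f64 {
        let mut s = self.bias;
        for i in 0..N_FEATURES {
            s += self.weights[i] * x[i];
        }
        for (k, &(a, b)) in self.interaction_pairs.iter().enumerate() {
            s += self.interaction_weights[k] * x[a] * x[b];
        }
        s
    }
}
```

A verdict would then compare the score against a threshold, after any allowlist short-circuits have been checked.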
4bccff3303 feat(rate_limit): add per-identity leaky bucket rate limiter
256-shard RwLock<FxHashMap> for concurrent access, auth key extraction
(ory_kratos_session cookie > Bearer token > client IP), CIDR bypass
for trusted networks, and background eviction of stale buckets.

Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
2026-03-10 23:38:19 +00:00
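
Note on 4bccff3303: a single leaky bucket, as one entry in such a sharded map might hold it, can be sketched as follows. The field names and the explicit `now` parameter (passed in to keep the logic deterministic) are assumptions for illustration; sharding, key extraction, and eviction are left out.

```rust
use std::time::Instant;

/// Leaky bucket used as a meter: each request adds one unit, and the level
/// drains at `leak_per_sec`. Hypothetical shape, not the crate's type.
pub struct LeakyBucket {
    capacity: f64,
    leak_per_sec: f64,
    level: f64,
    last: Instant,
}

impl LeakyBucket {
    pub fn new(capacity: f64, leak_per_sec: f64, now: Instant) -> Self {
        LeakyBucket { capacity, leak_per_sec, level: 0.0, last: now }
    }

    /// Returns true if the request is admitted, false if it should be limited.
    pub fn try_acquire(&mut self, now: Instant) -> bool {
        // Drain whatever leaked out since the last call, never below zero.
        let elapsed = now.saturating_duration_since(self.last).as_secs_f64();
        self.level = (self.level - elapsed * self.leak_per_sec).max(0.0);
        self.last = now;
        if self.level + 1.0 <= self.capacity {
            self.level += 1.0;
            true
        } else {
            false
        }
    }
}
```

With capacity 2 and a leak rate of 1 per second, two back-to-back requests pass, a third is rejected, and one more is admitted a second later.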
007865fbe7 feat(ddos): add KNN-based DDoS detection module
14-feature vector extraction, KNN classifier using fnntw, per-IP
sliding window aggregation, and heuristic auto-labeling for training.
Includes replay subcommand for offline evaluation and integration tests.

Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
2026-03-10 23:38:19 +00:00
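
Note on 007865fbe7: the commit uses the fnntw crate for nearest-neighbour lookups. The sketch below swaps in a brute-force search over a 14-dimensional feature vector to show only the majority-vote classification step; the function name and the labelled-tuple layout are assumptions.

```rust
use std::cmp::Ordering;

pub const DIM: usize = 14;

/// Brute-force k-nearest-neighbours vote (a stand-in for a k-d tree query).
/// Returns true when a strict majority of the k closest samples are attacks.
pub fn knn_is_attack(train: &[([f64; DIM], bool)], query: &[f64; DIM], k: usize) -> bool {
    // Squared Euclidean distance is enough for ranking neighbours.
    let mut dists: Vec<(f64, bool)> = train
        .iter()
        .map(|(point, label)| {
            let d2: f64 = point
                .iter()
                .zip(query.iter())
                .map(|(a, b)| (a - b) * (a - b))
                .sum();
            (d2, *label)
        })
        .collect();
    dists.sort_by(|a, b| a.0.partial_cmp(&b.0).unwrap_or(Ordering::Equal));
    let k = k.min(dists.len());
    let votes = dists[..k].iter().filter(|(_, is_attack)| *is_attack).count();
    votes * 2 > k
}
```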
e16299068f feat: add native dual-stack IPv4/IPv6 support
This commit implements comprehensive dual-stack support for the proxy,
allowing it to listen on both IPv4 and IPv6 addresses simultaneously.

Key changes:
- Added new dual_stack.rs module with DualStackTcpListener implementation
- Updated SSH module to use dual-stack listener
- Updated configuration documentation to reflect IPv6 support
- Added comprehensive tests for dual-stack functionality

The implementation is inspired by tokio_dual_stack but implemented
natively without external dependencies. It provides fair connection
distribution between IPv4 and IPv6 clients while maintaining full
backward compatibility with existing IPv4-only configurations.

Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
2026-03-10 23:38:19 +00:00
41cf6ccc49 fix(deps): upgrade pingora 0.7→0.8 and aws-lc-sys to patch CVEs
- pingora* 0.7.0 → 0.8.0: fixes CVE-2026-2833 (HTTP request smuggling
  via premature connection closure, CRITICAL)
- aws-lc-sys 0.37.1 → 0.38.0: fixes GHSA-65p9-r9h6-22vj (timing
  side-channel in AES-CCM tag verification, HIGH)

Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
2026-03-10 23:38:19 +00:00
e5b6802107 feat(proxy): add SSH TCP passthrough and graceful HTTP-only startup
Add optional [ssh] config block that proxies port 22 → Gitea SSH pod,
running on a dedicated thread/runtime matching the cert-watcher pattern.

Also start HTTP-only on first deploy when the TLS cert file doesn't exist
yet — once ACME challenge completes and the cert watcher writes the file,
a graceful upgrade adds the TLS listener without downtime.

Fix ACME watcher to handle InitApply events (kube-runtime v3+) so
Ingresses that existed before the proxy started are picked up correctly.

Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
2026-03-10 23:38:19 +00:00
10de00990c fix(proxy): handle Expect: 100-continue for large upstream uploads
Docker's OCI distribution protocol sends Expect: 100-continue for blob
uploads larger than ~5 MB. Without this fix, Pingora forwarded the header
to Gitea, Gitea responded with 100 Continue, and Pingora could not reliably
proxy the informational response back — causing spurious 400 errors for the
client on large image layer pushes.

Fix: respond with 100 Continue in request_filter before upstream_peer is
called, then strip the Expect header in upstream_request_filter so the
upstream never sends its own 100 Continue.

Also adds a unit test verifying that remove_header("expect") strips the
header from the upstream request without disturbing other headers.

Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
2026-03-10 23:38:19 +00:00
4ce008dc11 fix(proxy): forward X-Forwarded-Proto via insert_header; add e2e test
Root cause: upstream_request_filter was inserting x-forwarded-proto with
a raw headers.insert() call (via DerefMut) which only updates base.headers
but NOT the CaseMap. header_to_h1_wire zips CaseMap with base.headers, so
headers added without a CaseMap entry are silently dropped on the wire.

Fix: use insert_header() which keeps both maps in sync.

Also adds:
- src/lib.rs + [lib] section: exposes SunbeamProxy/RouteConfig/AcmeRoutes
  to integration tests without re-declaring modules in main.rs
- tests/e2e.rs: real end-to-end test — starts a SunbeamProxy over plain
  HTTP, routes it to a TCP echo backend, and asserts x-forwarded-proto: http
  is present in the upstream request headers
- Updated unit tests to verify header_to_h1_wire round-trip (not just that
  HeaderMap::insert works in isolation)

Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
2026-03-10 23:38:19 +00:00
d0146b47e3 feat(proxy): add per-route disable_secure_redirection; preserve query string in redirect
By default every plain-HTTP request is 301-redirected to HTTPS — no upstream
is ever contacted, making it as close to an L4 redirect as HTTP allows.

New RouteConfig field `disable_secure_redirection` (bool, default false):
when set to true on a route, plain-HTTP requests for that host pass through
to the backend unchanged instead of being redirected.

Also fixes the redirect URL to include the original query string, which was
previously dropped (e.g. ?next=/dashboard would be lost after redirect).

Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
2026-03-10 23:38:19 +00:00
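
Note on d0146b47e3: the query-string fix amounts to appending the original query when building the redirect target. A minimal sketch, assuming a hypothetical helper `https_redirect_location` that receives the host, path, and optional query separately:

```rust
/// Builds the Location value for the HTTP to HTTPS redirect, keeping the
/// original query string when one is present. Illustrative helper only.
pub fn https_redirect_location(host: &str, path: &str, query: Option<&str>) -> String {
    match query {
        Some(q) if !q.is_empty() => format!("https://{}{}?{}", host, path, q),
        _ => format!("https://{}{}", host, path),
    }
}
```

For example, a request for /login?next=/dashboard on example.com would redirect to https://example.com/login?next=/dashboard rather than dropping the `next` parameter.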
6ec0f78a5b feat: initial sunbeam-proxy implementation
Custom Pingora-based edge proxy for the Sunbeam infrastructure stack.

- HTTPS termination: mkcert file-based (local dev) or rustls-acme ACME (production)
- Host-prefix routing with path-based sub-routing (auth virtual host)
- HTTP→HTTPS redirect, WebSocket passthrough
- cert-manager HTTP-01 challenge routing via Kubernetes Ingress watcher
- TLS cert auto-reload via K8s Secret watcher
- JSON structured audit logging (tracing-subscriber)
- OpenTelemetry OTLP stub (disabled by default)
- Multi-stage Dockerfile: musl static binary on chainguard/static distroless image

Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
2026-03-10 23:38:19 +00:00