Commit Graph

2 Commits

Author SHA1 Message Date
385e9d4c59 chore: add SPDX copyright headers and update license year
Add `// Copyright Sunbeam Studios 2026` and `// SPDX-License-Identifier:
Apache-2.0` headers to all source files missing them. Update LICENSE
copyright year, Dockerfile copyright header, and .dockerignore for new
project structure (lean4/, docs/, training artifacts).

Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
2026-03-10 23:38:21 +00:00
1f4366566d feat(dataset): add dataset preparation with auto-download and heuristic labeling
Unified prepare-dataset pipeline that automatically downloads and caches
upstream datasets (CSIC 2010, CIC-IDS2017), applies heuristic auto-labeling
to unlabeled production logs, generates synthetic samples for both models,
and serializes everything as a bincode DatasetManifest. Includes OWASP
ModSec parser, CIC-IDS2017 timing profile extractor, and synthetic data
generators with configurable distributions.

Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
2026-03-10 23:38:21 +00:00
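
Note on 1f4366566d: the commit message mentions heuristic auto-labeling of unlabeled production logs but does not show the rules. The following is only a minimal sketch of what such a step might look like, assuming a Rust codebase (suggested by the bincode reference). Every name here (`Label`, `LogEntry`, `heuristic_label`, the `SUSPICIOUS` list, the status-code rule) is hypothetical and not taken from the repository.

```rust
// Hypothetical sketch: none of these names or rules come from the repository.

/// Label assigned by the heuristic pass.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Label {
    Attack,
    Normal,
}

/// Assumed minimal view of one production log entry.
struct LogEntry<'a> {
    path: &'a str,
    status: u16,
}

/// Returns `Some(label)` when a rule fires and `None` when the entry
/// should stay unlabeled.
fn heuristic_label(entry: &LogEntry) -> Option<Label> {
    // Illustrative signatures only; a real rule set would be much larger.
    const SUSPICIOUS: [&str; 4] = ["../", "<script", "union select", "/etc/passwd"];

    // Case-insensitive match against the request path and query string.
    let path = entry.path.to_ascii_lowercase();
    if SUSPICIOUS.iter().any(|needle| path.contains(needle)) {
        return Some(Label::Attack);
    }

    // A successful response with no suspicious marker is treated as normal.
    if (200..300).contains(&entry.status) {
        return Some(Label::Normal);
    }

    // Everything else is ambiguous and left unlabeled.
    None
}
```

Returning `Option<Label>` rather than forcing a label keeps ambiguous entries out of the training set instead of guessing, which is one plausible design choice for a heuristic pass; the actual pipeline may behave differently.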