Add `// Copyright Sunbeam Studios 2026` and `// SPDX-License-Identifier:
Apache-2.0` headers to all source files missing them. Update LICENSE
copyright year, Dockerfile copyright header, and .dockerignore for new
project structure (lean4/, docs/, training artifacts).
Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
Unified prepare-dataset pipeline that automatically downloads and caches
upstream datasets (CSIC 2010, CIC-IDS2017), applies heuristic auto-labeling
to unlabeled production logs, generates synthetic samples for both models,
and serializes everything as a bincode DatasetManifest. Includes OWASP
ModSec parser, CIC-IDS2017 timing profile extractor, and synthetic data
generators with configurable distributions.
Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>