feat: configurable k8s resources, CSIC training pipeline, unified Dockerfile

- Make K8s namespace, TLS secret, and config ConfigMap names configurable via
  [kubernetes] config section (previously hardcoded to "ingress")
- Add CSIC 2010 dataset converter and auto-download for scanner training
- Unify Dockerfile for local and production builds (remove cross-compile path)
- Bake ML models directory into container image
- Update CSIC dataset URL to self-hosted mirror (src.sunbeam.pt)
- Fix rate_limit pipeline log missing fields
- Consolidate docs/README.md into root README.md

Signed-off-by: Sienna Meridian Satterwhite <sienna@sunbeam.pt>
@@ -7,7 +7,7 @@ Label is determined by which file it came from (normal vs anomalous).
 
 Usage:
 # Download the dataset first:
-git clone https://github.com/msudol/Web-Application-Attack-Datasets.git /tmp/csic
+git clone https://src.sunbeam.pt/studio/csic-dataset.git /tmp/csic
 
 # Convert all three files:
 python3 scripts/convert_csic.py \
@@ -20,8 +20,9 @@ Usage:
 # Merge with production logs:
 cat logs.jsonl csic_converted.jsonl > combined.jsonl
 
-# Train:
+# Train (or just use --csic flag which does this automatically):
 cargo run -- train-scanner --input combined.jsonl --output scanner_model.bin
+# Simpler: cargo run -- train-scanner --input logs.jsonl --output scanner_model.bin --csic
 """
 
 import argparse
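
The docstring in this diff says the label comes from the source file, not from the request content. A minimal Python sketch of that idea follows; it is not the code in scripts/convert_csic.py. The function names, the JSON field names (`method`, `path`, `label`), the label values, and the rule that a file name containing "anomalous" marks attack traffic are all assumptions made for illustration.

```python
import json
from pathlib import Path


def label_for(path: str) -> str:
    """Pick a label from the file name alone.

    Assumption: anomalous-traffic files have "anomalous" in their name,
    and every other file holds normal traffic.
    """
    name = Path(path).name.lower()
    return "attack" if "anomalous" in name else "normal"


def to_jsonl_lines(request_lines, source_path):
    """Yield one JSON string per HTTP request line, tagged with the file's label.

    Assumption: each input item is a request line such as
    "GET /index.jsp HTTP/1.1". Headers and bodies are ignored here.
    """
    label = label_for(source_path)
    for line in request_lines:
        line = line.strip()
        if not line:
            continue  # skip blank separator lines
        method, _, rest = line.partition(" ")
        # Drop the trailing protocol token ("HTTP/1.1") if one is present.
        url = rest.rsplit(" ", 1)[0] if " " in rest else rest
        # Hypothetical record layout; the real converter may use other fields.
        yield json.dumps({"method": method, "path": url, "label": label})
```

Because the label depends only on the file name, every record from one file gets the same label, which matches the "normal vs anomalous" wording in the hunk header above.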