Sol — Kubernetes Deployment

Sol runs as a single-replica Deployment in the matrix namespace. SQLite is the persistence backend, so only one pod can run at a time (Recreate strategy).

resource relationships

flowchart TD
    subgraph OpenBao
        vault[("secret/sol<br/>matrix-access-token<br/>matrix-device-id<br/>mistral-api-key")]
    end

    subgraph "matrix namespace"
        vss[VaultStaticSecret<br/>sol-secrets]
        secret[Secret<br/>sol-secrets]
        cm[ConfigMap<br/>sol-config<br/>sol.toml + system_prompt.md]
        pvc[PVC<br/>sol-data<br/>1Gi RWO]
        deploy[Deployment<br/>sol]
        init[initContainer<br/>fix-permissions]
        pod[Container<br/>sol]
    end

    vault --> |VSO sync| vss
    vss --> |creates| secret
    vss --> |rolloutRestartTargets| deploy
    deploy --> init
    init --> pod
    secret --> |env vars| pod
    cm --> |subPath mounts| pod
    pvc --> |/data| init
    pvc --> |/data| pod

manifests

All manifests are in infrastructure/base/matrix/.

Deployment (sol-deployment.yaml)

strategy:
  type: Recreate         # SQLite requires single-writer
replicas: 1

initContainer: busybox runs chmod -R 777 /data && mkdir -p /data/matrix-state so that the nonroot distroless container can write to the Longhorn PVC.

Container: sol image (distroless/cc-debian12:nonroot)

  • Resources: 256Mi request / 512Mi limit memory, 100m CPU request
  • enableServiceLinks: false — avoids injecting service env vars that could conflict

Environment variables (from Secret sol-secrets):

| Env Var | Secret Key |
|---|---|
| SOL_MATRIX_ACCESS_TOKEN | matrix-access-token |
| SOL_MATRIX_DEVICE_ID | matrix-device-id |
| SOL_MISTRAL_API_KEY | mistral-api-key |

Fixed env vars:

| Env Var | Value |
|---|---|
| SOL_CONFIG | /etc/sol/sol.toml |
| SOL_SYSTEM_PROMPT | /etc/sol/system_prompt.md |

Volume mounts:

| Mount | Source | Details |
|---|---|---|
| /etc/sol/sol.toml | ConfigMap sol-config | subPath: sol.toml, readOnly |
| /etc/sol/system_prompt.md | ConfigMap sol-config | subPath: system_prompt.md, readOnly |
| /data | PVC sol-data | read-write |

PVC (sol-deployment.yaml, second document)

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: sol-data
  namespace: matrix
spec:
  accessModes: [ReadWriteOnce]
  resources:
    requests:
      storage: 1Gi

Uses the default StorageClass (Longhorn).

VaultStaticSecret (vault-secrets.yaml)

apiVersion: secrets.hashicorp.com/v1beta1
kind: VaultStaticSecret
metadata:
  name: sol-secrets
  namespace: matrix
spec:
  vaultAuthRef: vso-auth
  mount: secret
  type: kv-v2
  path: sol
  refreshAfter: 60s
  rolloutRestartTargets:
    - kind: Deployment
      name: sol
  destination:
    name: sol-secrets
    create: true
    overwrite: true

The Vault Secrets Operator (VSO) re-reads secret/sol every 60 seconds (refreshAfter: 60s). When the synced data changes, the rolloutRestartTargets field makes VSO restart the sol Deployment automatically.

Three keys are synced from secret/sol in OpenBao:

  • matrix-access-token
  • matrix-device-id
  • mistral-api-key

/data mount layout

/data/
├── sol.db           SQLite database (conversations + agents tables, WAL mode)
└── matrix-state/    Matrix SDK sqlite state store (E2EE keys, sync tokens)

Both are created automatically. The initContainer ensures directory permissions are correct for the nonroot container.

secrets in OpenBao

Store secrets at secret/sol in OpenBao KV v2:

# Via sunbeam seed (automated), or manually:
bao kv put secret/sol \
  matrix-access-token="syt_..." \
  matrix-device-id="DEVICE_ID" \
  mistral-api-key="..."

These are synced to K8s Secret sol-secrets by the Vault Secrets Operator.

build and deploy

# Build only (local Docker image)
sunbeam build sol

# Build + push to registry
sunbeam build sol --push

# Build + push + deploy (apply manifests + rollout restart)
sunbeam build sol --push --deploy

When run on macOS, the Docker build cross-compiles to the x86_64-unknown-linux-gnu target. The final image is based on gcr.io/distroless/cc-debian12:nonroot (~30MB).

startup sequence

  1. Initialize tracing_subscriber with RUST_LOG env filter (default: sol=info)
  2. Load config from SOL_CONFIG path
  3. Load system prompt from SOL_SYSTEM_PROMPT path
  4. Read 3 secret env vars (SOL_MATRIX_ACCESS_TOKEN, SOL_MATRIX_DEVICE_ID, SOL_MISTRAL_API_KEY)
  5. Build Matrix client with E2EE sqlite store, restore session
  6. Connect to OpenSearch, ensure archive + memory indices exist
  7. Initialize Mistral client
  8. Build components: Personality, ConversationManager, ToolRegistry, Indexer, Evaluator, Responder
  9. Backfill conversation context from archive (if backfill_on_join enabled)
  10. Open SQLite database (fall back to in-memory on failure)
  11. Initialize AgentRegistry + ConversationRegistry (load persisted state from SQLite)
  12. If use_conversations_api enabled: ensure orchestrator agent exists on Mistral server
  13. Backfill reactions from Matrix room timelines
  14. Start background index flush task
  15. Start Matrix sync loop
  16. If SQLite failed: send *sneezes* to all joined rooms
  17. Log "Sol is running", wait for SIGINT
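
Step 4 can be illustrated with a small pre-flight check. The sketch below is not Sol's code, which this document does not show. It is a minimal Python illustration, assuming that startup should stop when any of the three secret variables is unset or empty. The helper name missing_secret_vars is hypothetical.

```python
import os

# Names taken from the environment-variable table in this document.
REQUIRED_SECRET_VARS = (
    "SOL_MATRIX_ACCESS_TOKEN",
    "SOL_MATRIX_DEVICE_ID",
    "SOL_MISTRAL_API_KEY",
)


def missing_secret_vars(env=None):
    """Return the required secret variables that are unset or empty.

    `env` defaults to the process environment; passing a plain dict makes
    the check easy to exercise without touching os.environ.
    """
    if env is None:
        env = os.environ
    # Assumption: an empty string is treated the same as an unset variable.
    return [name for name in REQUIRED_SECRET_VARS if not env.get(name)]
```

A caller could run missing_secret_vars() before building any clients and exit with the list of missing names, which would match the "missing secrets" cause listed under troubleshooting.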

monitoring

Sol uses tracing with structured fields. Default log level: sol=info.

Key log events:

| Event | Level | Fields |
|---|---|---|
| Response sent | info | room, len, is_dm |
| Tool execution | info | tool, id, args |
| Engagement evaluation | info | sender, rule, relevance, threshold |
| Memory extraction | debug | count, user |
| Conversation created | info | room, conversation_id |
| Agent restored/created | info | agent_id, name |
| Backfill complete | info | rooms, messages / reactions |

Set RUST_LOG=sol=debug for verbose output including tool results, evaluation prompts, and memory details.

troubleshooting

Pod won't start / CrashLoopBackOff:

sunbeam logs matrix/sol

Common causes:

  • Missing secrets (env vars not set) — check sunbeam k8s get secret sol-secrets -n matrix -o yaml
  • ConfigMap not applied — check sunbeam k8s get cm sol-config -n matrix
  • PVC not bound — check sunbeam k8s get pvc -n matrix

SQLite recovery failure (sneezes):

If Sol sends *sneezes* on startup, the SQLite database at /data/sol.db could not be opened and Sol has fallen back to in-memory state, which is lost on the next restart. Check the PVC mount and file permissions:

sunbeam k8s exec -n matrix deployment/sol -- ls -la /data/
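
The fallback behaviour can be sketched as follows. This is a minimal Python illustration using the standard-library sqlite3 module, not Sol's implementation. The function name open_db and the returned degraded flag are hypothetical. It tries the on-disk database first, enables WAL mode, and opens an in-memory database if that fails.

```python
import sqlite3


def open_db(path="/data/sol.db"):
    """Open the SQLite database at `path`, or fall back to in-memory.

    Returns (connection, degraded). `degraded` is True when the on-disk
    database could not be opened, so state will not survive a restart.
    """
    conn = None
    try:
        conn = sqlite3.connect(path)
        # Setting the journal mode makes SQLite access the file right away,
        # so a broken mount is more likely to surface here than later.
        conn.execute("PRAGMA journal_mode=WAL")
        return conn, False
    except sqlite3.Error:
        if conn is not None:
            conn.close()
        # Hypothetical fallback mirroring the behaviour described above.
        return sqlite3.connect(":memory:"), True
```

A caller could use the degraded flag to decide whether to send the *sneezes* notice once the sync loop is running.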

Matrix sync errors:

Sol auto-joins rooms on invite (3 retries with exponential backoff). If it can't join, check homeserver connectivity and access token validity.
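
The retry behaviour can be sketched like this. The text above only states "3 retries with exponential backoff", so the base delay, the doubling factor, and counting the retries in addition to the first attempt are all assumptions. join_with_backoff and its parameters are hypothetical names, and the sketch is in Python rather than Sol's own code.

```python
import time


def join_with_backoff(join, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call `join()` until it succeeds or the retries are used up.

    One initial attempt is followed by up to `retries` further attempts,
    waiting base_delay, 2*base_delay, 4*base_delay, ... seconds between
    them. Returns True on success and False once every attempt failed.
    """
    for attempt in range(retries + 1):
        try:
            join()
            return True
        except Exception:
            if attempt == retries:
                return False
            # Exponential backoff: the wait doubles after each failure.
            sleep(base_delay * (2 ** attempt))
    return False
```

Passing sleep as a parameter keeps the function easy to test: a list's append method can stand in for time.sleep to record the delays without waiting.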

Agent creation failure:

If the orchestrator agent can't be created, Sol falls back to model-only conversations (no agent). Check Mistral API key and quota.