diff --git a/CLAUDE.md b/CLAUDE.md new file mode 100644 index 0000000..ebb4dad --- /dev/null +++ b/CLAUDE.md @@ -0,0 +1,71 @@ +# Sol — Developer Context + +## Build & Test + +```sh +cargo build --release # debug: cargo build +cargo test # 102 unit tests, no external services needed +cargo build --release --target x86_64-unknown-linux-gnu # cross-compile for production +``` + +Docker (multi-stage, vendored deps): + +```sh +docker build -t sol . +``` + +Production build + deploy via sunbeam CLI: + +```sh +sunbeam build sol --push --deploy +``` + +## Private Cargo Registry + +`mistralai-client` is published to Sunbeam's private Gitea cargo registry at `src.sunbeam.pt`. The crate is vendored into `vendor/` for Docker builds. Registry config lives in `.cargo/config.toml` (not checked in — generated by Dockerfile and locally via `cargo vendor vendor/`). + +## Vendor Workflow + +After adding/updating deps: + +```sh +cargo vendor vendor/ +``` + +This updates the vendored sources. Commit `vendor/` changes alongside `Cargo.lock`. + +## Key Architecture Notes + +- **`chat_blocking()` workaround**: The Mistral client's `chat_async` holds a `std::sync::MutexGuard` across `.await`, making the future `!Send`. All chat calls use `chat_blocking()` which runs `client.chat()` via `tokio::task::spawn_blocking`. +- **Two response paths**: Controlled by `agents.use_conversations_api` config toggle. + - Legacy: manual `Vec` assembly, chat completions, tool iteration loop. + - Conversations API: `ConversationRegistry` with persistent state (SQLite-backed), agents, function call loop. +- **deno_core sandbox**: `run_script` tool spins up a fresh V8 isolate per invocation with `sol.*` host API bindings. Timeout via V8 isolate termination. Output truncated to 4096 chars. 
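The output cap mentioned in the last bullet can be sketched as follows. This is a minimal illustration, not the code in `src/tools/script.rs`: the helper name `truncate_output` is hypothetical, and it assumes "4096 chars" means Unicode scalar values rather than bytes.

```rust
/// Hypothetical helper: cap script output at `max_chars` Unicode scalar values.
/// Assumes the limit counts characters, not bytes.
fn truncate_output(output: &str, max_chars: usize) -> String {
    match output.char_indices().nth(max_chars) {
        // `idx` is the byte offset of the first character past the limit,
        // so slicing there never splits a multi-byte character.
        Some((idx, _)) => output[..idx].to_string(),
        // Fewer than `max_chars + 1` characters: nothing to cut.
        None => output.to_string(),
    }
}
```

Cutting at a character boundary rather than a byte offset avoids a panic when the limit falls inside a multi-byte character such as an emoji.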
+ +## K8s Context + +- Namespace: `matrix` +- Deployment: `sol` (Recreate strategy — single replica, SQLite can't share) +- PVC: `sol-data` (1Gi RWO) mounted at `/data` — holds `sol.db` + `matrix-state/` +- ConfigMap: `sol-config` — `sol.toml` + `system_prompt.md` (subPath mounts) +- Secrets: `sol-secrets` via VaultStaticSecret from OpenBao `secret/sol` (5 keys: `matrix-access-token`, `matrix-device-id`, `mistral-api-key`, `gitea-admin-username`, `gitea-admin-password`) +- Vault auth: Sol authenticates to OpenBao via K8s auth (role `sol-agent`, policy `sol-agent`) for storing user impersonation tokens at `secret/sol-tokens/{localpart}/{service}` +- Build target: `x86_64-unknown-linux-gnu` (Scaleway amd64 server) +- Base image: `gcr.io/distroless/cc-debian12:nonroot` + +## Source Layout + +- `src/main.rs` — startup, component wiring, backfill, agent recreation + sneeze +- `src/sync.rs` — Matrix event handlers, context hint injection for new conversations +- `src/config.rs` — TOML config with serde defaults (7 sections: matrix, opensearch, mistral, behavior, agents, services, vault) +- `src/context.rs` — `ResponseContext`, `derive_user_id`, `localpart` +- `src/conversations.rs` — `ConversationRegistry` (room→conversation mapping, SQLite-backed, reset_all) +- `src/persistence.rs` — SQLite store (WAL mode, 3 tables: `conversations`, `agents`, `service_users`) +- `src/agent_ux.rs` — `AgentProgress` (reaction lifecycle + thread posting) +- `src/matrix_utils.rs` — message extraction, image download, reactions +- `src/brain/` — evaluator (full system prompt context), responder (per-message context headers + memory), personality, conversation manager +- `src/agents/` — registry (instructions hash + automatic recreation), definitions (dynamic delegation) +- `src/sdk/` — vault client (K8s auth), token store (Vault-backed), gitea client (PAT auto-provisioning) +- `src/memory/` — schema, store, extractor +- `src/tools/` — registry (12 tools), search, room_history, room_info, 
script, devtools (gitea), bridge +- `src/archive/` — schema, indexer diff --git a/README.md b/README.md index ba2cfad..5e9bab2 100644 --- a/README.md +++ b/README.md @@ -1,68 +1,403 @@ # sol -a virtual librarian for Matrix. sol lives in your chat rooms, archives conversations in OpenSearch, and responds with the help of Mistral AI — with end-to-end encryption, tool use, and per-user memory. +a virtual librarian for Matrix. sol lives in your chat rooms, archives conversations in OpenSearch, and responds with the help of Mistral AI — with end-to-end encryption, tool use, per-user memory, and a multi-agent architecture. -sol is built by [sunbeam studios](https://sunbeam.pt) as part of our self-hosted collaboration stack. +sol is built by [sunbeam studios](https://sunbeam.pt) as part of our self-hosted collaboration stack for a three-person game studio. ## what sol does - **Matrix presence** — joins rooms, reads the vibe, decides when to speak. direct messages always get a response; in group rooms, sol evaluates relevance before jumping in. - **message archive** — every message is indexed in OpenSearch with full-text and semantic search. sol can search its own archive via tools. -- **tool use** — mistral calls tools mid-conversation: archive search, room context retrieval, and a sandboxed TypeScript/JavaScript runtime (deno_core) for computation. -- **per-user memory** — sol remembers things about the people it talks to. memories are extracted automatically after conversations (via ministral-3b), injected into the system prompt before responding, and accessible from scripts via `sol.memory.get/set`. user isolation is enforced at the rust level. +- **tool use** — Mistral calls tools mid-conversation: archive search, room context retrieval, room info, and a sandboxed TypeScript/JavaScript runtime (deno_core) for computation. +- **per-user memory** — sol remembers things about the people it talks to. 
memories are extracted automatically after conversations, injected into the system prompt before responding, and accessible from scripts via `sol.memory.get/set`. +- **user impersonation** — sol acts on behalf of users when calling external services. PATs are auto-provisioned via admin APIs and stored securely in OpenBao (Vault). OIDC-to-service username mappings handle identity mismatches. +- **gitea integration** — first domain agent (sol-devtools): list repos, search issues, create issues, list PRs, get file contents — all as the requesting user. +- **multi-agent architecture** — an orchestrator agent with personality + tools + web search. domain agent delegation is dynamic — only active agents appear in instructions. agent state persisted in SQLite with instructions hash for automatic recreation on prompt changes. +- **conversations API** — persistent conversation state per room via Mistral's Conversations API, with automatic compaction at token thresholds. per-message context headers inject timestamps, room info, and memory notes. +- **multimodal** — m.image messages are downloaded from Matrix via mxc://, converted to base64 data URIs, and sent as `ContentPart::ImageUrl` to Mistral vision models. - **reactions** — sol can react to messages with emoji when it has something to express but not enough to say. - **E2EE** — full end-to-end encryption via matrix-sdk with sqlite state store. ## architecture +```mermaid +flowchart TD + subgraph Matrix + sync[Matrix Sync Loop] + end + + subgraph Engagement + eval[Evaluator] + rules[Rule Checks] + llm_eval[LLM Evaluation
ministral-3b] + end + + subgraph Response + legacy[Legacy Path
manual messages + chat completions] + convapi[Conversations API Path
ConversationRegistry + agents] + tools[Tool Execution] + end + + subgraph Persistence + sqlite[(SQLite
conversations + agents)] + opensearch[(OpenSearch
archive + memory)] + end + + sync --> |message event| eval + eval --> rules + rules --> |MustRespond| Response + rules --> |no rule match| llm_eval + llm_eval --> |MaybeRespond| Response + llm_eval --> |React| sync + llm_eval --> |Ignore| sync + legacy --> tools + convapi --> tools + tools --> opensearch + legacy --> |response text| sync + convapi --> |response text| sync + sync --> |archive| opensearch + convapi --> sqlite + sync --> |memory extraction| opensearch +``` + +## source tree + ``` src/ -├── main.rs entrypoint, Matrix client setup, backfill -├── sync.rs event loop — messages, reactions, redactions, invites -├── config.rs TOML config with serde defaults -├── context.rs ResponseContext — per-message sender identity -├── matrix_utils.rs message extraction, reply detection, room info +├── main.rs entrypoint, Matrix client setup, backfill, orchestrator init +├── sync.rs event loop — messages, reactions, redactions, invites +├── config.rs TOML config (5 sections) with serde defaults +├── context.rs ResponseContext — per-message sender identity threading +├── conversations.rs ConversationRegistry — room→conversation mapping, SQLite-backed +├── persistence.rs SQLite store (WAL mode, 2 tables: conversations, agents) +├── agent_ux.rs AgentProgress — reaction lifecycle (🔍→⚙️→✅) + thread posting +├── matrix_utils.rs message extraction, reply/edit/thread detection, image download ├── archive/ -│ ├── schema.rs ArchiveDocument, OpenSearch index mapping -│ └── indexer.rs batched indexing, reactions, edits, redactions +│ ├── schema.rs ArchiveDocument, OpenSearch index mapping +│ └── indexer.rs batched indexing, reactions, edits, redactions ├── brain/ -│ ├── conversation.rs sliding-window context per room -│ ├── evaluator.rs engagement decision (must/maybe/react/ignore) -│ ├── personality.rs system prompt templating -│ └── responder.rs Mistral chat loop with tool iterations + memory +│ ├── conversation.rs sliding-window context per room (configurable group/DM windows) +│ 
├── evaluator.rs engagement decision (MustRespond/MaybeRespond/React/Ignore) +│ ├── personality.rs system prompt templating ({date}, {room_name}, {members}, etc.) +│ └── responder.rs both response paths, tool iteration loops, memory loading ├── memory/ -│ ├── schema.rs MemoryDocument, index mapping -│ ├── store.rs query, get_recent, set — OpenSearch operations -│ └── extractor.rs post-response fact extraction via ministral-3b +│ ├── schema.rs MemoryDocument, index mapping +│ ├── store.rs query (topical), get_recent, set — OpenSearch operations +│ └── extractor.rs post-response fact extraction via ministral-3b +├── agents/ +│ ├── definitions.rs orchestrator config + 8 domain agent definitions (dynamic delegation) +│ └── registry.rs agent lifecycle with instructions hash staleness detection +├── sdk/ +│ ├── mod.rs SDK module root +│ ├── vault.rs OpenBao/Vault client (K8s auth, KV v2 read/write/delete) +│ ├── tokens.rs TokenStore — Vault-backed secrets + SQLite username mappings +│ └── gitea.rs GiteaClient — typed Gitea API v1 with PAT auto-provisioning └── tools/ - ├── mod.rs ToolRegistry, tool definitions, dispatch - ├── search.rs archive search (keyword + semantic) - ├── room_history.rs context around a timestamp or event - ├── room_info.rs room listing, member queries - └── script.rs deno_core sandbox with sol.* API + ├── mod.rs ToolRegistry — 12 tool definitions + dispatch (5 core + 7 gitea) + ├── search.rs archive search (keyword + semantic via embedding pipeline) + ├── room_history.rs context around a timestamp or event + ├── room_info.rs room listing, member queries + ├── script.rs deno_core sandbox with sol.* host API, TS transpilation + ├── devtools.rs Gitea tool handlers (repos, issues, PRs, files) + └── bridge.rs ToolBridge — generic async handler map for future SDK integration ``` + +## engagement pipeline + +```mermaid +sequenceDiagram + participant M as Matrix + participant S as Sync Handler + participant E as Evaluator + participant LLM as ministral-3b 
+ participant R as Responder + + M->>S: m.room.message + S->>S: archive message + S->>S: update conversation context + S->>E: evaluate(sender, body, is_dm, recent) + + alt own message + E-->>S: Ignore + else @mention or matrix.to link + E-->>S: MustRespond (DirectMention) + else DM + E-->>S: MustRespond (DirectMessage) + else "sol" or "hey sol" + E-->>S: MustRespond (NameInvocation) + else no rule match + E->>LLM: relevance evaluation (JSON) + LLM-->>E: {relevance, hook, emoji} + alt relevance >= spontaneous_threshold (0.85) + E-->>S: MaybeRespond + else relevance >= reaction_threshold (0.6) + emoji + E-->>S: React (emoji) + else below thresholds + E-->>S: Ignore + end + end + + alt MustRespond or MaybeRespond + S->>S: check in-flight guard + S->>S: check cooldown (15s default) + S->>R: generate response + end +``` + +## response generation + +Sol has two response paths, controlled by `agents.use_conversations_api`: + +### legacy path (`generate_response`) + +1. Apply response delay (random within configured range) +2. Send typing indicator +3. Load memory notes (topical query + recent backfill, max 5) +4. Build system prompt via `Personality` (template substitution: `{date}`, `{room_name}`, `{members}`, `{memory_notes}`, `{room_context_rules}`, `{epoch_ms}`) +5. Assemble message array: system → context messages (with timestamps) → trigger (multimodal if image) +6. Tool iteration loop (up to `max_tool_iterations`, default 5): + - If `finish_reason == ToolCalls`: execute tools, append results, continue + - If text response: strip "sol:" prefix, return +7. Fire-and-forget memory extraction + +### conversations API path (`generate_response_conversations`) + +1. Apply response delay +2. Send typing indicator +3. Format input: raw text for DMs, `<@user:server> text` for groups +4. Send through `ConversationRegistry.send_message()` (creates or appends to Mistral conversation) +5. 
Function call loop (up to `max_tool_iterations`): + - Execute tool calls locally via `ToolRegistry` + - Send `FunctionResultEntry` back to conversation +6. Extract assistant text, strip prefix, return + +## tool system + +| Tool | Parameters | Description | +|------|-----------|-------------| +| `search_archive` | `query` (required), `room`, `sender`, `after`, `before`, `limit`, `semantic` | Search the message archive (keyword or semantic) | +| `get_room_context` | `room_id` (required), `around_timestamp`, `around_event_id`, `before_count`, `after_count` | Get messages around a point in time or event | +| `list_rooms` | *(none)* | List all rooms Sol is in with names and member counts | +| `get_room_members` | `room_id` (required) | Get members of a specific room | +| `run_script` | `code` (required) | Execute TypeScript/JavaScript in a sandboxed deno_core runtime | + +### run_script sandbox + +The script runtime is a fresh V8 isolate per invocation with: + +- **TypeScript support** — code is transpiled via `deno_ast` before execution +- **Timeout** — configurable via `behavior.script_timeout_secs` (default 5s), enforced by V8 isolate termination +- **Heap limit** — configurable via `behavior.script_max_heap_mb` (default 64MB) +- **Output** — `console.log()` + last expression value, truncated to 4096 characters +- **Temp filesystem** — sandboxed `sol.fs.read/write/list` with path traversal protection +- **Network** — `sol.fetch(url)` restricted to `behavior.script_fetch_allowlist` domains + +Host API (`sol.*`): + +```typescript +sol.search(query, opts?) // search message archive +sol.rooms() // list joined rooms → [{name, id, members}] +sol.members(roomName) // get room members → [{name, id}] +sol.fetch(url) // HTTP GET (allowlisted domains only) +sol.memory.get(query?) // retrieve memories relevant to query +sol.memory.set(content, category?) 
// save a memory note +sol.fs.read(path) // read file from sandbox +sol.fs.write(path, content) // write file to sandbox +sol.fs.list(path?) // list sandbox directory +``` + +All `sol.*` methods are async — use `await`. + +## memory system + +### extraction (post-response, fire-and-forget) + +After each response, a background task sends the exchange to `ministral-3b` with a structured extraction prompt. The model returns `{"memories": [{"content": "...", "category": "preference|fact|context"}]}`. Categories are normalized via `normalize_category()` — valid categories are `preference`, `fact`, `context`; anything else falls back to `general`. + +### storage (OpenSearch) + +Each memory is a `MemoryDocument` with: `id`, `user_id`, `content`, `category`, `created_at`, `updated_at`, `source` (`"auto"` or `"script"`). The index name defaults to `sol_user_memory`. User isolation is enforced at the Rust level via `user_id` filtering on all queries. + +### pre-response loading + +Before generating a response, the responder loads up to 5 memories: + +1. **Topical query** — semantic search against the trigger message +2. **Recent backfill** — if fewer than 3 topical results, fill remaining slots with most recent memories + +Memory notes are injected into the system prompt as a `## notes about {display_name}` block with instructions to use them naturally without mentioning their existence. 
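The loading and normalization rules above can be sketched as two small functions. This is an illustrative sketch rather than the code in `src/memory/`: `select_memories` and the two constants are hypothetical names, dropping duplicates between the topical and recent sets is an assumption, and `normalize_category` is assumed here to match category strings exactly.

```rust
const MAX_NOTES: usize = 5; // "up to 5 memories"
const MIN_TOPICAL: usize = 3; // backfill only when fewer topical hits than this

/// Hypothetical selection step: topical results first, then recent ones
/// fill the remaining slots when the topical query came back thin.
fn select_memories(topical: Vec<String>, recent: Vec<String>) -> Vec<String> {
    let mut selected: Vec<String> = topical.into_iter().take(MAX_NOTES).collect();
    if selected.len() < MIN_TOPICAL {
        for note in recent {
            if selected.len() >= MAX_NOTES {
                break;
            }
            // Assumption: a note already chosen as topical is not repeated.
            if !selected.contains(&note) {
                selected.push(note);
            }
        }
    }
    selected
}

/// Sketch of the category fallback; exact matching is an assumption.
fn normalize_category(raw: &str) -> &'static str {
    match raw {
        "preference" => "preference",
        "fact" => "fact",
        "context" => "context",
        _ => "general",
    }
}
```

With one topical hit and three recent notes (one of them the same note), the result holds three entries: the topical note followed by the two distinct recent ones.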
+ +## archive + +Every message event is archived as an `ArchiveDocument` in OpenSearch: + +- **Batch indexing** — messages are buffered and flushed periodically (`opensearch.batch_size` default 50, `opensearch.flush_interval_ms` default 2000) +- **Embedding pipeline** — configurable via `opensearch.embedding_pipeline` for semantic search +- **Edit tracking** — `m.replace` events update the original document's content +- **Redaction** — `m.room.redaction` sets `redacted: true` on the original +- **Reactions** — `m.reaction` events append `{sender, emoji, timestamp}` to the document's reactions array +- **Backfill** — on startup, conversation context is backfilled from the archive; reactions are backfilled from Matrix room timelines (last 500 events per room) + +## agent architecture + +```mermaid +stateDiagram-v2 + [*] --> CheckMemory: startup + CheckMemory --> CheckServer: agent_id in SQLite? + CheckMemory --> SearchByName: not in SQLite + + CheckServer --> Ready: exists on Mistral server + CheckServer --> SearchByName: gone from server + + SearchByName --> Ready: found by name + SearchByName --> Create: not found + + Create --> Ready: agent created + Ready --> [*] +``` + +### orchestrator + +The orchestrator agent carries Sol's full personality (system prompt) plus all 5 tool definitions converted to `AgentTool` format. It's created on startup if `agents.use_conversations_api` is enabled. Temperature: 0.5. 
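The startup state diagram in this section can be read as a three-step lookup. The sketch below models it against an in-memory stand-in; `FakeServer`, `Resolution`, `resolve_agent`, and the `ag_N` id format are all hypothetical and only illustrate the order of checks, not the real registry in `src/agents/registry.rs` or the Mistral client API.

```rust
use std::collections::HashMap;

/// In-memory stand-in for the remote agents endpoint (hypothetical).
#[derive(Default)]
struct FakeServer {
    agents: HashMap<String, String>, // agent name -> agent_id
}

impl FakeServer {
    fn exists(&self, agent_id: &str) -> bool {
        self.agents.values().any(|id| id == agent_id)
    }
    fn find_by_name(&self, name: &str) -> Option<String> {
        self.agents.get(name).cloned()
    }
    fn create(&mut self, name: &str) -> String {
        let id = format!("ag_{}", self.agents.len() + 1);
        self.agents.insert(name.to_string(), id.clone());
        id
    }
}

#[derive(Debug, PartialEq)]
enum Resolution {
    Reused(String),      // id from SQLite still exists on the server
    FoundByName(String), // id was missing or stale, but the name matched
    Created(String),     // nothing found, a new agent was created
}

/// `stored_id` is the agent_id loaded from SQLite, if any.
fn resolve_agent(stored_id: Option<&str>, name: &str, server: &mut FakeServer) -> Resolution {
    if let Some(id) = stored_id {
        if server.exists(id) {
            return Resolution::Reused(id.to_string());
        }
    }
    if let Some(id) = server.find_by_name(name) {
        return Resolution::FoundByName(id);
    }
    Resolution::Created(server.create(name))
}
```

Searching by name before creating keeps a restart with an empty database from producing duplicate agents on the server.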
+ +### domain agents (8 definitions) + +| Agent | Domain | +|-------|--------| +| `sol-observability` | Metrics, logs, dashboards, alerts (Prometheus, Loki, Grafana) | +| `sol-data` | Full-text search, object storage (OpenSearch, SeaweedFS) | +| `sol-devtools` | Git repos, issues, PRs, kanban boards (Gitea, Planka) | +| `sol-infrastructure` | Kubernetes, deployments, certificates, builds | +| `sol-identity` | User accounts, sessions, OAuth2 (Kratos, Hydra) | +| `sol-collaboration` | Contacts, documents, meetings, files, email, calendars (La Suite) | +| `sol-communication` | Chat rooms, messages, members (Matrix) | +| `sol-media` | Video/audio rooms, recordings, streams (LiveKit) | + +Domain agents are defined in `agents/definitions.rs` as `DOMAIN_AGENTS` (name, description, instructions). Temperature: 0.3. + +### ToolBridge + +`tools/bridge.rs` provides a generic async handler map (`ToolBridge`) for mapping Mistral tool call names to handler functions. This is scaffolding for future SDK-based tool integration where domain agents will have their own tool sets. + +## persistence + +SQLite database at `/data/sol.db` (configurable via `matrix.db_path`), WAL mode. + +### tables + +**conversations** — room_id (PK), conversation_id, estimated_tokens, created_at + +**agents** — name (PK), agent_id, model, created_at + +### recovery behavior + +On startup, if the database fails to open: + +1. Log error +2. Fall back to in-memory SQLite (conversations won't survive restarts) +3. After sync loop starts, send `*sneezes*` to all joined rooms to signal the hiccup + +## multimodal + +When an `m.image` message arrives: + +1. Extract media source from event (`MessageType::Image`) +2. Download bytes from Matrix media API via `matrix_sdk::media::get_media_content` +3. Base64-encode as `data:{mime};base64,{data}` URI +4. Pass to Mistral as `ContentPart::ImageUrl` alongside any text caption + +Encrypted images are not supported (the `MediaSource::Encrypted` variant is skipped). 
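Step 3 above can be illustrated with a dependency-free sketch. The real code most likely relies on a base64 crate; the hand-rolled `base64_encode` below exists only to keep the example self-contained, and `image_data_uri` is a hypothetical name.

```rust
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Standard base64 with `=` padding (RFC 4648 alphabet).
fn base64_encode(bytes: &[u8]) -> String {
    let mut out = String::with_capacity((bytes.len() + 2) / 3 * 4);
    for chunk in bytes.chunks(3) {
        // Pack up to three bytes into a 24-bit group, zero-filling the tail.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        out.push(ALPHABET[(n >> 18) as usize & 63] as char);
        out.push(ALPHABET[(n >> 12) as usize & 63] as char);
        // Missing input bytes are represented by padding characters.
        out.push(if chunk.len() > 1 { ALPHABET[(n >> 6) as usize & 63] as char } else { '=' });
        out.push(if chunk.len() > 2 { ALPHABET[n as usize & 63] as char } else { '=' });
    }
    out
}

/// Build the `data:{mime};base64,{data}` URI described in step 3.
fn image_data_uri(mime: &str, bytes: &[u8]) -> String {
    format!("data:{};base64,{}", mime, base64_encode(bytes))
}
```

For example, `image_data_uri("image/png", b"Man")` yields `data:image/png;base64,TWFu`.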
+ +## configuration reference + +Config is loaded from `SOL_CONFIG` (default: `/etc/sol/sol.toml`). + +### `[matrix]` + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `homeserver_url` | string | *required* | Matrix homeserver URL | +| `user_id` | string | *required* | Bot's Matrix user ID | +| `state_store_path` | string | *required* | Path for Matrix SDK sqlite state | +| `db_path` | string | `/data/sol.db` | SQLite database for persistent state | + +### `[opensearch]` + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `url` | string | *required* | OpenSearch cluster URL | +| `index` | string | *required* | Archive index name | +| `batch_size` | usize | `50` | Messages per flush batch | +| `flush_interval_ms` | u64 | `2000` | Flush interval in milliseconds | +| `embedding_pipeline` | string | `tuwunel_embedding_pipeline` | Ingest pipeline for semantic embeddings | +| `memory_index` | string | `sol_user_memory` | Memory index name | + +### `[mistral]` + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `default_model` | string | `mistral-medium-latest` | Model for response generation | +| `evaluation_model` | string | `ministral-3b-latest` | Model for engagement evaluation + memory extraction | +| `research_model` | string | `mistral-large-latest` | Model for research tasks | +| `max_tool_iterations` | usize | `5` | Max tool call rounds per response | + +### `[behavior]` + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `response_delay_min_ms` | u64 | `100` | Min delay before direct response | +| `response_delay_max_ms` | u64 | `2300` | Max delay before direct response | +| `spontaneous_delay_min_ms` | u64 | `15000` | Min delay before spontaneous response | +| `spontaneous_delay_max_ms` | u64 | `60000` | Max delay before spontaneous response | +| `spontaneous_threshold` | f32 | `0.85` | LLM relevance score to 
trigger spontaneous response | +| `reaction_threshold` | f32 | `0.6` | LLM relevance score to trigger emoji reaction | +| `reaction_enabled` | bool | `true` | Enable emoji reactions | +| `room_context_window` | usize | `30` | Messages to keep in group room context | +| `dm_context_window` | usize | `100` | Messages to keep in DM context | +| `backfill_on_join` | bool | `true` | Backfill context from archive on startup | +| `backfill_limit` | usize | `10000` | Max messages to backfill | +| `instant_responses` | bool | `false` | Skip response delays (for testing) | +| `cooldown_after_response_ms` | u64 | `15000` | Cooldown before another spontaneous response | +| `evaluation_context_window` | usize | `25` | Recent messages sent to evaluation LLM | +| `detect_sol_in_conversation` | bool | `true` | Use active/passive evaluation prompts based on Sol's participation | +| `evaluation_prompt_active` | string? | *(built-in)* | Custom prompt when Sol is in conversation | +| `evaluation_prompt_passive` | string? 
| *(built-in)* | Custom prompt when Sol hasn't spoken | +| `script_timeout_secs` | u64 | `5` | Script execution timeout | +| `script_max_heap_mb` | usize | `64` | V8 heap limit for scripts | +| `script_fetch_allowlist` | string[] | `[]` | Domains allowed for `sol.fetch()` | +| `memory_extraction_enabled` | bool | `true` | Enable post-response memory extraction | + +### `[agents]` + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `orchestrator_model` | string | `mistral-medium-latest` | Model for orchestrator agent | +| `domain_model` | string | `mistral-medium-latest` | Model for domain agents | +| `compaction_threshold` | u32 | `118000` | Token estimate before conversation reset (~90% of 131K context) | +| `use_conversations_api` | bool | `false` | Enable Conversations API path (vs legacy chat completions) | + +## environment variables + +| Variable | Required | Description | +|----------|----------|-------------| +| `SOL_MATRIX_ACCESS_TOKEN` | yes | Matrix access token | +| `SOL_MATRIX_DEVICE_ID` | yes | Matrix device ID (for E2EE state) | +| `SOL_MISTRAL_API_KEY` | yes | Mistral API key | +| `SOL_CONFIG` | no | Config file path (default: `/etc/sol/sol.toml`) | +| `SOL_SYSTEM_PROMPT` | no | System prompt file path (default: `/etc/sol/system_prompt.md`) | + ## dependencies -sol talks to three external services: +sol talks to five external services: - **Matrix homeserver** — [tuwunel](https://github.com/tulir/tuwunel) (or any Matrix server) - **OpenSearch** — message archive + user memory indices -- **Mistral AI** — response generation, engagement evaluation, memory extraction +- **Mistral AI** — response generation, engagement evaluation, memory extraction, agents + web search +- **OpenBao** — secure token storage for user impersonation PATs (K8s auth, KV v2) +- **Gitea** — git hosting API for devtools agent (repos, issues, PRs) -## configuration - -sol reads config from `SOL_CONFIG` (default: `/etc/sol/sol.toml`) and the 
system prompt from `SOL_SYSTEM_PROMPT` (default: `/etc/sol/system_prompt.md`). - -secrets via environment: - -| Variable | Description | -|----------|-------------| -| `SOL_MATRIX_ACCESS_TOKEN` | Matrix access token | -| `SOL_MATRIX_DEVICE_ID` | Matrix device ID (for E2EE) | -| `SOL_MISTRAL_API_KEY` | Mistral API key | - -see `config/sol.toml` for the full config reference with defaults. +key crates: `matrix-sdk` 0.9 (E2EE + sqlite), `mistralai-client` 1.1.0 (private registry), `opensearch` 2, `deno_core` 0.393, `rusqlite` 0.32 (bundled), `ruma` 0.12. ## building @@ -70,31 +405,47 @@ see `config/sol.toml` for the full config reference with defaults. cargo build --release ``` -docker (cross-compile to x86_64 linux): +docker (cross-compile to x86_64 linux, vendored deps): ```sh docker build -t sol . ``` -## running +production build + deploy: ```sh -export SOL_MATRIX_ACCESS_TOKEN="..." -export SOL_MATRIX_DEVICE_ID="..." -export SOL_MISTRAL_API_KEY="..." -export SOL_CONFIG="config/sol.toml" -export SOL_SYSTEM_PROMPT="config/system_prompt.md" - -cargo run --release +sunbeam build sol --push --deploy ``` -## tests +the Dockerfile uses a two-stage build: deps layer (cached until Cargo.toml/vendor change) → source layer (only sol code recompiles). final image is `gcr.io/distroless/cc-debian12:nonroot`. + +## testing ```sh cargo test ``` -80 unit tests covering config parsing, conversation windowing, engagement rules, personality templating, memory schema/store/extraction, search query building, TypeScript transpilation, and sandbox path isolation. 
+unit tests covering: + +- config parsing (minimal, full, missing sections/fields, services, vault) +- conversation windowing, context management, reset_all, delete_all +- engagement rules (mention, DM, name invocation, case sensitivity, false positives) +- personality template substitution (date, room, members, memory notes, timestamps, room context rules) +- memory document serialization, extraction parsing, category normalization +- archive search query building (filters, date ranges, wildcards, room_name keyword field) +- TypeScript transpilation (basic, arrow, interface, invalid) +- sandbox path isolation (traversal, symlink escape, nested dirs) +- deno_core script execution (basic math, output capture) +- SQLite CRUD (conversations, agents, service_users, load_all, bulk delete) +- conversation message merging (DM, group, empty, single) +- context derivation (`@user:server` → `user@server`, localpart extraction) +- tool bridge registration and execution +- agent UX formatting (tool calls, result truncation) +- agent definitions (orchestrator instructions, dynamic delegation, deterministic hash) +- token expiry validation (PAT, future, past, malformed, null) +- Gitea API type deserialization (repos, issues, PRs, files) +- PAT conflict status codes and scope constants +- username mapping (OIDC → service identity) ## license diff --git a/docs/conversations.md b/docs/conversations.md new file mode 100644 index 0000000..95d0e49 --- /dev/null +++ b/docs/conversations.md @@ -0,0 +1,169 @@ +# Sol — Conversations API + +The Conversations API path provides persistent, server-side conversation state per Matrix room via Mistral's Conversations API. Enable it with `agents.use_conversations_api = true` in `sol.toml`. 
+ +## lifecycle + +```mermaid +sequenceDiagram + participant M as Matrix Sync + participant E as Evaluator + participant R as Responder + participant CR as ConversationRegistry + participant API as Mistral Conversations API + participant T as ToolRegistry + participant DB as SQLite + + M->>E: message event + E-->>M: MustRespond/MaybeRespond + + M->>R: generate_response_conversations() + R->>CR: send_message(room_id, input, is_dm) + + alt new room (no conversation) + CR->>API: create_conversation(agent_id?, model, input) + API-->>CR: ConversationResponse + conversation_id + CR->>DB: upsert_conversation(room_id, conv_id, tokens) + else existing room + CR->>API: append_conversation(conv_id, input) + API-->>CR: ConversationResponse + CR->>DB: update_tokens(room_id, new_total) + end + + alt response contains function_calls + loop up to max_tool_iterations (5) + R->>T: execute(name, args) + T-->>R: result string + R->>CR: send_function_result(room_id, entries) + CR->>API: append_conversation(conv_id, FunctionResult entries) + API-->>CR: ConversationResponse + alt more function_calls + Note over R: continue loop + else text response + Note over R: break + end + end + end + + R-->>M: response text (or None) + M->>M: send to Matrix room + M->>M: fire-and-forget memory extraction +``` + +## room-to-conversation mapping + +Each Matrix room maps to exactly one Mistral conversation: + +- **Group rooms**: one shared conversation per room (all participants' messages go to the same conversation) +- **DMs**: one conversation per DM room (already unique per user pair in Matrix) + +The mapping is stored in `ConversationRegistry.mapping` (HashMap in-memory, backed by SQLite `conversations` table). + +## ConversationState + +```rust +struct ConversationState { + conversation_id: String, // Mistral conversation ID + estimated_tokens: u32, // running total from response.usage.total_tokens +} +``` + +Token estimates are incremented on each API response and persisted to SQLite. 
+ +## message formatting + +Messages are formatted differently based on room type: + +- **DMs**: raw text — `trigger_body` is sent directly +- **Group rooms**: prefixed with Matrix user ID — `<@sienna:sunbeam.pt> what's for lunch?` + +This is handled by the responder before calling `ConversationRegistry.send_message()`. + +The `merge_user_messages()` helper (for buffered messages) follows the same pattern: + +```rust +// DM: "hello\nhow are you?" +// Group: "<@sienna:sunbeam.pt> hello\n<@lonni:sunbeam.pt> how are you?" +``` + +## compaction + +When a conversation's `estimated_tokens` reaches the `compaction_threshold` (default 118,000 — ~90% of the 131K Mistral context window): + +1. `ConversationRegistry.reset(room_id)` is called +2. The mapping is removed from the in-memory HashMap +3. The row is deleted from SQLite +4. The next message creates a fresh conversation + +This means conversation history is lost on compaction. The archive still has the full history, and memory notes persist independently. + +## persistence + +### startup recovery + +On initialization, `ConversationRegistry::new()` calls `store.load_all_conversations()` to restore all room→conversation mappings from SQLite. This means conversations survive pod restarts. 
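The interplay between startup recovery and compaction can be sketched with an in-memory registry. This is a simplified model, not the real `ConversationRegistry`: `Registry`, `from_rows`, and `record_usage` are hypothetical names, the SQLite writes are omitted, and it assumes each response's token usage is added to the running estimate and checked against the threshold right away.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
struct ConversationState {
    conversation_id: String,
    estimated_tokens: u32,
}

/// Simplified stand-in for the registry (no SQLite, no API client).
struct Registry {
    mapping: HashMap<String, ConversationState>, // room_id -> state
    compaction_threshold: u32,
}

impl Registry {
    /// Rebuild the mapping from persisted rows:
    /// (room_id, conversation_id, estimated_tokens).
    fn from_rows(rows: Vec<(String, String, u32)>, compaction_threshold: u32) -> Self {
        let mapping = rows
            .into_iter()
            .map(|(room_id, conversation_id, estimated_tokens)| {
                (room_id, ConversationState { conversation_id, estimated_tokens })
            })
            .collect();
        Registry { mapping, compaction_threshold }
    }

    /// Add the usage reported by one response. Returns true when the room
    /// reached the threshold and its mapping was dropped, so that the next
    /// message starts a fresh conversation.
    fn record_usage(&mut self, room_id: &str, tokens: u32) -> bool {
        let over = match self.mapping.get_mut(room_id) {
            Some(state) => {
                state.estimated_tokens = state.estimated_tokens.saturating_add(tokens);
                state.estimated_tokens >= self.compaction_threshold
            }
            None => false,
        };
        if over {
            self.mapping.remove(room_id);
        }
        over
    }
}
```

A room restored at 117,000 tokens with the default threshold of 118,000 survives one 500-token response and is reset by the second.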
+ +### SQLite schema + +```sql +CREATE TABLE conversations ( + room_id TEXT PRIMARY KEY, + conversation_id TEXT NOT NULL, + estimated_tokens INTEGER NOT NULL DEFAULT 0, + created_at TEXT NOT NULL DEFAULT (datetime('now')) +); +``` + +### write operations + +| Operation | When | +|-----------|------| +| `upsert_conversation` | New conversation created | +| `update_tokens` | After each append (token count from API response) | +| `delete_conversation` | On compaction reset | + +## agent integration + +Conversations can optionally use a Mistral agent (the orchestrator) instead of a bare model: + +- If `agent_id` is set (via `set_agent_id()` at startup): new conversations are created with the agent +- If `agent_id` is `None`: conversations use the `model` directly (fallback) + +The agent provides Sol's personality, tool definitions, and delegation instructions. Without it, conversations still work but without agent-specific behavior. + +```rust +let req = CreateConversationRequest { + inputs: message, + model: if agent_id.is_none() { Some(self.model.clone()) } else { None }, + agent_id, + // ... +}; +``` + +## error handling + +- **API failure on create/append**: returns `Err(String)`, responder logs and returns `None` (no response sent to Matrix) +- **Function result send failure**: logs error, returns `None` +- **SQLite write failure**: logged as warning, in-memory state is still updated (will be lost on restart) + +Sol never crashes on a conversation error — it simply doesn't respond. 
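The "log and return `None`" behavior combines naturally with the bounded function call loop. The sketch below uses simplified, hypothetical types (`Reply`, `respond`, and the two closures standing in for `ToolRegistry` and `send_function_result`). What happens when the iteration budget runs out while the model is still requesting tools is not stated in the source; returning `None` in that case is an assumption.

```rust
/// What one round-trip to the conversation can yield (simplified).
enum Reply {
    Text(String),
    FunctionCalls(Vec<(String, String)>), // (tool name, JSON arguments)
}

fn respond(
    first: Result<Reply, String>,
    max_tool_iterations: usize,
    mut execute_tool: impl FnMut(&str, &str) -> String,
    mut send_results: impl FnMut(Vec<String>) -> Result<Reply, String>,
) -> Option<String> {
    // API failure on create/append: log and send nothing.
    let mut reply = match first {
        Ok(reply) => reply,
        Err(e) => {
            eprintln!("conversation request failed: {}", e);
            return None;
        }
    };
    for _ in 0..max_tool_iterations {
        let calls = match reply {
            Reply::Text(text) => return Some(text),
            Reply::FunctionCalls(calls) => calls,
        };
        // Run every requested tool locally and collect the results.
        let results: Vec<String> = calls
            .iter()
            .map(|(name, args)| execute_tool(name.as_str(), args.as_str()))
            .collect();
        // Function result send failure: log and send nothing.
        reply = match send_results(results) {
            Ok(next) => next,
            Err(e) => {
                eprintln!("sending function results failed: {}", e);
                return None;
            }
        };
    }
    // Budget exhausted: accept a final text reply, otherwise give up (assumption).
    match reply {
        Reply::Text(text) => Some(text),
        Reply::FunctionCalls(_) => None,
    }
}
```

Every failure path ends in `None`, which matches the rule that a conversation error results in silence rather than a crash.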
+ +## configuration + +| Field | Default | Description | +|-------|---------|-------------| +| `agents.use_conversations_api` | `false` | Enable this path | +| `agents.orchestrator_model` | `mistral-medium-latest` | Model for orchestrator agent | +| `agents.compaction_threshold` | `118000` | Token limit before conversation reset | +| `mistral.max_tool_iterations` | `5` | Max function call rounds per response | + +## comparison with legacy path + +| Aspect | Legacy | Conversations API | +|--------|--------|------------------| +| State management | Manual `Vec` per request | Server-side, persistent | +| Memory injection | System prompt template | Agent instructions | +| Tool calling | Chat completions tool_choice | Function calls via conversation entries | +| Context window | Sliding window (configurable) | Full conversation history until compaction | +| Multimodal | ContentPart::ImageUrl | Not yet supported (TODO in responder) | +| Persistence | None (context rebuilt from archive) | SQLite + Mistral server | diff --git a/docs/deployment.md b/docs/deployment.md new file mode 100644 index 0000000..1c8c073 --- /dev/null +++ b/docs/deployment.md @@ -0,0 +1,228 @@ +# Sol — Kubernetes Deployment + +Sol runs as a single-replica Deployment in the `matrix` namespace. SQLite is the persistence backend, so only one pod can run at a time (Recreate strategy). + +## resource relationships + +```mermaid +flowchart TD + subgraph OpenBao + vault[("secret/sol
matrix-access-token
matrix-device-id
mistral-api-key")] + end + + subgraph "matrix namespace" + vss[VaultStaticSecret
sol-secrets] + secret[Secret
sol-secrets] + cm[ConfigMap
sol-config
sol.toml + system_prompt.md] + pvc[PVC
sol-data
1Gi RWO] + deploy[Deployment
sol] + init[initContainer
fix-permissions] + pod[Container
sol] + end + + vault --> |VSO sync| vss + vss --> |creates| secret + vss --> |rolloutRestartTargets| deploy + deploy --> init + init --> pod + secret --> |env vars| pod + cm --> |subPath mounts| pod + pvc --> |/data| init + pvc --> |/data| pod +``` + +## manifests + +All manifests are in `infrastructure/base/matrix/`. + +### Deployment (`sol-deployment.yaml`) + +```yaml +strategy: + type: Recreate # SQLite requires single-writer +replicas: 1 +``` + +**initContainer** — `busybox` runs `chmod -R 777 /data && mkdir -p /data/matrix-state` to ensure the nonroot distroless container can write to the Longhorn PVC. + +**Container** — `sol` image (distroless/cc-debian12:nonroot) + +- Resources: 256Mi request / 512Mi limit memory, 100m CPU request +- `enableServiceLinks: false` — avoids injecting service env vars that could conflict + +**Environment variables** (from Secret `sol-secrets`): + +| Env Var | Secret Key | +|---------|-----------| +| `SOL_MATRIX_ACCESS_TOKEN` | `matrix-access-token` | +| `SOL_MATRIX_DEVICE_ID` | `matrix-device-id` | +| `SOL_MISTRAL_API_KEY` | `mistral-api-key` | + +Fixed env vars: + +| Env Var | Value | +|---------|-------| +| `SOL_CONFIG` | `/etc/sol/sol.toml` | +| `SOL_SYSTEM_PROMPT` | `/etc/sol/system_prompt.md` | + +**Volume mounts:** + +| Mount | Source | Details | +|-------|--------|---------| +| `/etc/sol/sol.toml` | ConfigMap `sol-config` | subPath: `sol.toml`, readOnly | +| `/etc/sol/system_prompt.md` | ConfigMap `sol-config` | subPath: `system_prompt.md`, readOnly | +| `/data` | PVC `sol-data` | read-write | + +### PVC (`sol-deployment.yaml`, second document) + +```yaml +apiVersion: v1 +kind: PersistentVolumeClaim +metadata: + name: sol-data + namespace: matrix +spec: + accessModes: [ReadWriteOnce] + resources: + requests: + storage: 1Gi +``` + +Uses the default StorageClass (Longhorn). 
+ +### VaultStaticSecret (`vault-secrets.yaml`) + +```yaml +apiVersion: secrets.hashicorp.com/v1beta1 +kind: VaultStaticSecret +metadata: + name: sol-secrets + namespace: matrix +spec: + vaultAuthRef: vso-auth + mount: secret + type: kv-v2 + path: sol + refreshAfter: 60s + rolloutRestartTargets: + - kind: Deployment + name: sol + destination: + name: sol-secrets + create: true + overwrite: true +``` + +The `rolloutRestartTargets` field means VSO will automatically restart the Sol deployment when secrets change in OpenBao. + +Three keys synced from OpenBao `secret/sol`: + +- `matrix-access-token` +- `matrix-device-id` +- `mistral-api-key` + +## `/data` mount layout + +``` +/data/ +├── sol.db SQLite database (conversations + agents tables, WAL mode) +└── matrix-state/ Matrix SDK sqlite state store (E2EE keys, sync tokens) +``` + +Both are created automatically. The initContainer ensures directory permissions are correct for the nonroot container. + +## secrets in OpenBao + +Store secrets at `secret/sol` in OpenBao KV v2: + +```sh +# Via sunbeam seed (automated), or manually: +openbao kv put secret/sol \ + matrix-access-token="syt_..." \ + matrix-device-id="DEVICE_ID" \ + mistral-api-key="..." +``` + +These are synced to K8s Secret `sol-secrets` by the Vault Secrets Operator. + +## build and deploy + +```sh +# Build only (local Docker image) +sunbeam build sol + +# Build + push to registry +sunbeam build sol --push + +# Build + push + deploy (apply manifests + rollout restart) +sunbeam build sol --push --deploy +``` + +The Docker build cross-compiles to `x86_64-unknown-linux-gnu` on macOS. The final image is `gcr.io/distroless/cc-debian12:nonroot` (~30MB). + +## startup sequence + +1. Initialize `tracing_subscriber` with `RUST_LOG` env filter (default: `sol=info`) +2. Load config from `SOL_CONFIG` path +3. Load system prompt from `SOL_SYSTEM_PROMPT` path +4. Read 3 secret env vars (`SOL_MATRIX_ACCESS_TOKEN`, `SOL_MATRIX_DEVICE_ID`, `SOL_MISTRAL_API_KEY`) +5. 
Build Matrix client with E2EE sqlite store, restore session +6. Connect to OpenSearch, ensure archive + memory indices exist +7. Initialize Mistral client +8. Build components: Personality, ConversationManager, ToolRegistry, Indexer, Evaluator, Responder +9. Backfill conversation context from archive (if `backfill_on_join` enabled) +10. Open SQLite database (fallback to in-memory on failure) +11. Initialize AgentRegistry + ConversationRegistry (load persisted state from SQLite) +12. If `use_conversations_api` enabled: ensure orchestrator agent exists on Mistral server +13. Backfill reactions from Matrix room timelines +14. Start background index flush task +15. Start Matrix sync loop +16. If SQLite failed: send `*sneezes*` to all joined rooms +17. Log "Sol is running", wait for SIGINT + +## monitoring + +Sol uses `tracing` with structured fields. Default log level: `sol=info`. + +Key log events: + +| Event | Level | Fields | +|-------|-------|--------| +| Response sent | info | `room`, `len`, `is_dm` | +| Tool execution | info | `tool`, `id`, `args` | +| Engagement evaluation | info | `sender`, `rule`, `relevance`, `threshold` | +| Memory extraction | debug | `count`, `user` | +| Conversation created | info | `room`, `conversation_id` | +| Agent restored/created | info | `agent_id`, `name` | +| Backfill complete | info | `rooms`, `messages` / `reactions` | + +Set `RUST_LOG=sol=debug` for verbose output including tool results, evaluation prompts, and memory details. + +## troubleshooting + +**Pod won't start / CrashLoopBackOff:** + +```sh +sunbeam logs matrix/sol +``` + +Common causes: +- Missing secrets (env vars not set) — check `sunbeam k8s get secret sol-secrets -n matrix -o yaml` +- ConfigMap not applied — check `sunbeam k8s get cm sol-config -n matrix` +- PVC not bound — check `sunbeam k8s get pvc -n matrix` + +**SQLite recovery failure (*sneezes*):** + +If Sol sends `*sneezes*` on startup, it means the SQLite database at `/data/sol.db` couldn't be opened. 
Sol falls back to in-memory state. Check PVC mount and file permissions: + +```sh +sunbeam k8s exec -n matrix deployment/sol -- ls -la /data/ +``` + +**Matrix sync errors:** + +Sol auto-joins rooms on invite (3 retries with exponential backoff). If it can't join, check homeserver connectivity and access token validity. + +**Agent creation failure:** + +If the orchestrator agent can't be created, Sol falls back to model-only conversations (no agent). Check Mistral API key and quota.
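
The agent fallback described above can be sketched as follows. `RequestTarget` and `target_after_agent_setup` are hypothetical names used only for illustration; the sketch assumes that a conversation request carries either an `agent_id` or a `model`, never both.

```rust
/// Hypothetical pair of fields that decides what a new conversation runs on.
#[derive(Debug, PartialEq)]
struct RequestTarget {
    model: Option<String>,
    agent_id: Option<String>,
}

/// If agent setup succeeded, use the agent; otherwise log the failure and
/// fall back to the bare model so conversations keep working.
fn target_after_agent_setup(
    agent_setup: Result<String, String>,
    fallback_model: &str,
) -> RequestTarget {
    match agent_setup {
        Ok(agent_id) => RequestTarget { model: None, agent_id: Some(agent_id) },
        Err(err) => {
            eprintln!("agent creation failed, using model only: {err}");
            RequestTarget { model: Some(fallback_model.to_string()), agent_id: None }
        }
    }
}
```

With a failed setup and `mistral-medium-latest` as the fallback, the resulting target has `agent_id: None`, so replies still arrive but without agent-specific behavior.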