# Sol — Conversations API
The Conversations API path provides persistent, server-side conversation state per Matrix room via Mistral's Conversations API. Enable it with `agents.use_conversations_api = true` in `sol.toml`.
## lifecycle
```mermaid
sequenceDiagram
    participant M as Matrix Sync
    participant E as Evaluator
    participant O as Orchestrator
    participant CR as ConversationRegistry
    participant API as Mistral Conversations API
    participant T as ToolRegistry
    participant DB as SQLite

    M->>E: message event
    E-->>M: MustRespond/MaybeRespond

    M->>O: GenerateRequest
    O->>CR: send_message(conversation_key, input, is_dm)

    alt new conversation
        CR->>API: create_conversation(agent_id?, model, input)
        API-->>CR: ConversationResponse + conversation_id
        CR->>DB: upsert_conversation(room_id, conv_id, tokens)
    else existing conversation
        CR->>API: append_conversation(conv_id, input)
        API-->>CR: ConversationResponse
        CR->>DB: update_tokens(room_id, new_total)
    end

    alt response contains function_calls
        loop up to max_tool_iterations (5)
            O->>T: execute(name, args)
            T-->>O: result string
            O->>CR: send_function_result(room_id, entries)
            CR->>API: append_conversation(conv_id, FunctionResult entries)
            API-->>CR: ConversationResponse
            alt more function_calls
                Note over O: continue loop
            else text response
                Note over O: break
            end
        end
    end

    O-->>M: response text (or None)
    M->>M: send to Matrix room
    M->>M: fire-and-forget memory extraction
```
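The bounded tool-call loop in the diagram can be sketched roughly as follows. This is an illustrative stand-in, not Sol's actual API: `ApiResponse`, `execute_tool`, and `run_tool_loop` are hypothetical names, and the real loop round-trips each batch of function results through `append_conversation` rather than synthesizing a text reply locally.

```rust
// Illustrative sketch of the bounded tool loop; names are hypothetical.
const MAX_TOOL_ITERATIONS: usize = 5;

enum ApiResponse {
    Text(String),
    FunctionCalls(Vec<(String, String)>), // (tool name, args JSON)
}

// Stand-in for ToolRegistry::execute.
fn execute_tool(name: &str, _args: &str) -> String {
    format!("result of {name}")
}

// Runs tool rounds until the API answers with text or the budget runs out.
fn run_tool_loop(mut response: ApiResponse) -> Option<String> {
    for _ in 0..MAX_TOOL_ITERATIONS {
        match response {
            ApiResponse::Text(text) => return Some(text),
            ApiResponse::FunctionCalls(calls) => {
                let results: Vec<String> = calls
                    .iter()
                    .map(|(name, args)| execute_tool(name, args))
                    .collect();
                // In Sol this would go through send_function_result ->
                // append_conversation; here we fake a text reply.
                response = ApiResponse::Text(results.join("\n"));
            }
        }
    }
    None // iteration budget exhausted; responder sends nothing
}
```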
## room-to-conversation mapping
Each Matrix room maps to exactly one Mistral conversation:

- **Group rooms**: one shared conversation per room (all participants' messages go to the same conversation)
- **DMs**: one conversation per DM room (already unique per user pair in Matrix)

The mapping is stored in `ConversationRegistry.mapping` (an in-memory HashMap backed by the SQLite `conversations` table).

For gRPC coding sessions, the conversation key is the project path plus branch, giving each coding context a dedicated conversation.

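A minimal sketch of the key derivation, assuming a hypothetical helper and separator (the real key format in Sol may differ):

```rust
// Hypothetical sketch: derive a conversation key from either a Matrix
// room or a gRPC coding session. The "#" separator is an assumption.
fn conversation_key(room_id: Option<&str>, coding_session: Option<(&str, &str)>) -> String {
    match (room_id, coding_session) {
        // Matrix rooms (group or DM) key on the room ID itself.
        (Some(room), _) => room.to_string(),
        // gRPC coding sessions key on project path + branch.
        (None, Some((path, branch))) => format!("{path}#{branch}"),
        (None, None) => String::new(),
    }
}
```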
## ConversationState
```rust
struct ConversationState {
    conversation_id: String, // Mistral conversation ID
    estimated_tokens: u32,   // running total from response.usage.total_tokens
}
```

Token estimates are incremented on each API response and persisted to SQLite.

## message formatting
Messages are formatted differently based on room type:

- **DMs**: raw text — `trigger_body` is sent directly
- **Group rooms**: prefixed with the Matrix user ID — `<@sienna:sunbeam.pt> what's for lunch?`

This is handled by the responder before calling `ConversationRegistry.send_message()`.

The `merge_user_messages()` helper (for buffered messages) follows the same pattern:

```rust
// DM: "hello\nhow are you?"
// Group: "<@sienna:sunbeam.pt> hello\n<@lonni:sunbeam.pt> how are you?"
```

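Based on those comments, the helper could look like the sketch below (the signature is an assumption; Sol's actual implementation may take different types):

```rust
// Sketch of merge_user_messages(): join buffered messages, prefixing
// each with its sender in group rooms but not in DMs.
fn merge_user_messages(messages: &[(&str, &str)], is_dm: bool) -> String {
    messages
        .iter()
        .map(|(sender, body)| {
            if is_dm {
                body.to_string()
            } else {
                format!("<{sender}> {body}")
            }
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```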
## compaction
When a conversation's `estimated_tokens` reaches the `compaction_threshold` (default 118,000 — ~90% of the 131K Mistral context window):

1. `ConversationRegistry.reset(room_id)` is called
2. The mapping is removed from the in-memory HashMap
3. The row is deleted from SQLite
4. The next message creates a fresh conversation

This means conversation history is lost on compaction. The archive still has the full history, and memory notes persist independently.

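The check-and-reset step can be sketched as below, using the default threshold from the configuration section; `Registry` and `reset_if_needed` are illustrative names, and a real implementation would also delete the SQLite row.

```rust
use std::collections::HashMap;

const COMPACTION_THRESHOLD: u32 = 118_000;

// Illustrative stand-in for ConversationRegistry's in-memory mapping.
struct Registry {
    // room_id -> (conversation_id, estimated_tokens)
    mapping: HashMap<String, (String, u32)>,
}

impl Registry {
    // Drops the mapping so the next message starts a fresh conversation.
    // Returns true if a reset happened.
    fn reset_if_needed(&mut self, room_id: &str) -> bool {
        let over = self
            .mapping
            .get(room_id)
            .map_or(false, |(_, tokens)| *tokens >= COMPACTION_THRESHOLD);
        if over {
            self.mapping.remove(room_id);
        }
        over
    }
}
```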
## persistence
### startup recovery
On initialization, `ConversationRegistry::new()` calls `store.load_all_conversations()` to restore all room-to-conversation mappings from SQLite. This means conversations survive pod restarts.

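The recovery step amounts to folding persisted rows back into the in-memory map. A minimal sketch, with a row shape mirroring the `conversations` table (the real function reads via the store, not a `Vec`):

```rust
use std::collections::HashMap;

// Sketch: rebuild room_id -> (conversation_id, estimated_tokens)
// from rows loaded out of SQLite at startup.
fn load_all_conversations(rows: Vec<(String, String, u32)>) -> HashMap<String, (String, u32)> {
    rows.into_iter()
        .map(|(room_id, conversation_id, tokens)| (room_id, (conversation_id, tokens)))
        .collect()
}
```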
### SQLite schema
```sql
CREATE TABLE conversations (
    room_id          TEXT PRIMARY KEY,
    conversation_id  TEXT NOT NULL,
    estimated_tokens INTEGER NOT NULL DEFAULT 0,
    created_at       TEXT NOT NULL DEFAULT (datetime('now'))
);
```

### write operations
| Operation | When |
|-----------|------|
| `upsert_conversation` | New conversation created |
| `update_tokens` | After each append (token count from API response) |
| `delete_conversation` | On compaction reset |

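Against this schema, `upsert_conversation` can be expressed as a standard SQLite upsert. This is a sketch; Sol's actual statement may differ:

```sql
INSERT INTO conversations (room_id, conversation_id, estimated_tokens)
VALUES (?1, ?2, ?3)
ON CONFLICT (room_id) DO UPDATE SET
    conversation_id  = excluded.conversation_id,
    estimated_tokens = excluded.estimated_tokens;
```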
## agent integration
Conversations can optionally use a Mistral agent (the orchestrator) instead of a bare model:

- If `agent_id` is set (via `set_agent_id()` at startup): new conversations are created with the agent
- If `agent_id` is `None`: conversations use the `model` directly (fallback)

The agent provides Sol's personality, tool definitions, and delegation instructions. Without it, conversations still work but lack agent-specific behavior.

```rust
let req = CreateConversationRequest {
    inputs: message,
    model: if agent_id.is_none() { Some(self.model.clone()) } else { None },
    agent_id,
    // ...
};
```

## error handling
- **API failure on create/append**: returns `Err(String)`; the responder logs it and returns `None` (no response sent to Matrix)
- **Function result send failure**: logs the error, returns `None`
- **SQLite write failure**: logged as a warning; in-memory state is still updated (and will be lost on restart)

Sol never crashes on a conversation error — it simply doesn't respond.

## configuration
| Field | Default | Description |
|-------|---------|-------------|
| `agents.use_conversations_api` | `false` | Enable this path |
| `agents.orchestrator_model` | `mistral-medium-latest` | Model for the orchestrator agent |
| `agents.compaction_threshold` | `118000` | Token limit before conversation reset |
| `mistral.max_tool_iterations` | `5` | Max function-call rounds per response |

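Assuming the dotted field names above map to `[agents]` and `[mistral]` tables in `sol.toml` (the section layout is an assumption), enabling this path might look like:

```toml
[agents]
use_conversations_api = true
orchestrator_model = "mistral-medium-latest"
compaction_threshold = 118000

[mistral]
max_tool_iterations = 5
```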
## comparison with legacy path
| Aspect | Legacy | Conversations API |
|--------|--------|------------------|
| State management | Manual `Vec<ChatMessage>` per request | Server-side, persistent |
| Memory injection | System prompt template | Agent instructions |
| Tool calling | Chat completions `tool_choice` | Function calls via conversation entries |
| Context window | Sliding window (configurable) | Full conversation history until compaction |
| Multimodal | `ContentPart::ImageUrl` | Not yet supported (TODO in responder) |
| Persistence | None (context rebuilt from archive) | SQLite + Mistral server |