feat: 13 e2e integration tests against real Mistral API

Orchestrator tests:
- Simple chat roundtrip with token usage verification
- Event ordering (Started → Thinking → Done)
- Metadata pass-through (opaque bag appears in Started event)
- Token usage accuracy (longer prompts → more tokens)
- Conversation continuity (multi-turn recall)
- Client-side tool dispatch + mock result submission
- Failed tool result handling (is_error: true)
- Server-side tool execution (search_web via conversation)

gRPC tests:
- Full roundtrip (StartSession → UserInput → Status → TextDone)
- Client tool relay (ToolCall → ToolResult through gRPC stream)
- Token counts in TextDone (non-zero verification)
- Session resume (same room_id, resumed flag)
- Clean disconnect (EndSession → SessionEnd)

Infrastructure:
- ToolRegistry::new_minimal() (no OpenSearch/Matrix needed)
- ToolRegistry fields now Option for testability
- GrpcState.matrix now Option
- grpc_bridge moved to src/grpc/bridge.rs
- TestHarness loads API key from .env
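
Note on the Option-for-testability change: the Infrastructure items above can be sketched roughly as follows. This is a minimal illustration of the pattern, not the code in this commit. `SearchClient`, `MatrixClient`, `search_archive`, and `has_matrix` are hypothetical stand-ins; only the names `ToolRegistry` and `new_minimal` come from the commit message, and the real field names and signatures are not shown in this diff.

```rust
// Hypothetical stand-ins for the external clients (OpenSearch, Matrix).
// The real types in this repository are not visible in the diff.
#[derive(Debug, Clone)]
pub struct SearchClient;

#[derive(Debug, Clone)]
pub struct MatrixClient;

impl SearchClient {
    // Placeholder query; a real client would perform network I/O.
    pub fn query(&self, q: &str) -> String {
        format!("results for {}", q)
    }
}

// Each external dependency is an Option, so tests can build a registry
// without running OpenSearch or a Matrix homeserver.
pub struct ToolRegistry {
    search: Option<SearchClient>,
    matrix: Option<MatrixClient>,
}

impl ToolRegistry {
    // Full constructor, as production code would use it (assumed shape).
    pub fn new(search: SearchClient, matrix: MatrixClient) -> Self {
        Self {
            search: Some(search),
            matrix: Some(matrix),
        }
    }

    // Test-friendly constructor: no backends configured.
    pub fn new_minimal() -> Self {
        Self {
            search: None,
            matrix: None,
        }
    }

    // Tools that need a backend return an error when it is absent,
    // instead of panicking.
    pub fn search_archive(&self, q: &str) -> Result<String, String> {
        let client = self
            .search
            .as_ref()
            .ok_or_else(|| "search backend not configured".to_string())?;
        Ok(client.query(q))
    }

    pub fn has_matrix(&self) -> bool {
        self.matrix.is_some()
    }
}
```

With this shape, a test can call `ToolRegistry::new_minimal()` and still exercise tools that need no backend, while production call sites wrap their clients in `Some(...)`, as the `matrix: Some(matrix_client.clone())` change in src/main.rs does.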
2026-03-23 20:54:28 +00:00
parent 2810143f76
commit 40a6772f99
6 changed files with 980 additions and 31 deletions
--- a/src/main.rs
+++ b/src/main.rs
@@ -10,6 +10,8 @@ mod memory;
 mod persistence;
 mod grpc;
 mod orchestrator;
+#[cfg(test)]
+mod integration_test;
 mod sdk;
 mod sync;
 mod time_context;
@@ -319,7 +321,7 @@ async fn main() -> anyhow::Result<()> {
            tools: state.responder.tools(),
            store: store.clone(),
            mistral: state.mistral.clone(),
-            matrix: matrix_client.clone(),
+            matrix: Some(matrix_client.clone()),
            system_prompt: system_prompt_text.clone(),
            orchestrator_agent_id: orchestrator_id,
            orchestrator: Some(orch),