feat: 13 e2e integration tests against real Mistral API

Orchestrator tests:
- Simple chat roundtrip with token usage verification
- Event ordering (Started → Thinking → Done)
- Metadata pass-through (opaque bag appears in Started event)
- Token usage accuracy (longer prompts → more tokens)
- Conversation continuity (multi-turn recall)
- Client-side tool dispatch + mock result submission
- Failed tool result handling (is_error: true)
- Server-side tool execution (search_web via conversation)

gRPC tests:
- Full roundtrip (StartSession → UserInput → Status → TextDone)
- Client tool relay (ToolCall → ToolResult through gRPC stream)
- Token counts in TextDone (non-zero verification)
- Session resume (same room_id, resumed flag)
- Clean disconnect (EndSession → SessionEnd)

Infrastructure:
- ToolRegistry::new_minimal() — no OpenSearch/Matrix needed
- ToolRegistry fields now Option for testability
- GrpcState.matrix now Option
- grpc_bridge moved to src/grpc/bridge.rs
- TestHarness loads API key from .env
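
The Option-based registry described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `SearchClient`, `MatrixClient`, `search_archive`, `has_matrix`, and the field names are hypothetical stand-ins, and only `ToolRegistry::new_minimal()` is named in the message itself.

```rust
// Hypothetical stand-ins for the real OpenSearch and Matrix clients.
#[derive(Debug, Clone)]
pub struct SearchClient;

#[derive(Debug, Clone)]
pub struct MatrixClient;

/// Registry whose external backends are optional, so tests can build it
/// without any running services. Field names are assumed for illustration.
#[derive(Debug)]
pub struct ToolRegistry {
    search: Option<SearchClient>,
    matrix: Option<MatrixClient>,
}

impl ToolRegistry {
    /// Production constructor: every backend is present.
    pub fn new(search: SearchClient, matrix: MatrixClient) -> Self {
        Self {
            search: Some(search),
            matrix: Some(matrix),
        }
    }

    /// Test constructor: no OpenSearch or Matrix connection required.
    pub fn new_minimal() -> Self {
        Self {
            search: None,
            matrix: None,
        }
    }

    /// A tool that needs the search backend returns an error instead of
    /// panicking when the backend is absent.
    pub fn search_archive(&self, query: &str) -> Result<String, String> {
        match &self.search {
            Some(_client) => Ok(format!("results for {query}")),
            None => Err("search backend not configured".to_string()),
        }
    }

    /// Reports whether a Matrix client was supplied.
    pub fn has_matrix(&self) -> bool {
        self.matrix.is_some()
    }
}
```

With this shape, a test harness can call `ToolRegistry::new_minimal()` and still exercise tools that do not depend on the missing backends, while tools that do depend on them fail with an explicit error.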
2026-03-23 20:54:28 +00:00
parent 2810143f76
commit 40a6772f99
6 changed files with 980 additions and 31 deletions

@@ -10,6 +10,8 @@ mod memory;
mod persistence;
mod grpc;
mod orchestrator;
+#[cfg(test)]
+mod integration_test;
mod sdk;
mod sync;
mod time_context;
@@ -319,7 +321,7 @@ async fn main() -> anyhow::Result<()> {
tools: state.responder.tools(),
store: store.clone(),
mistral: state.mistral.clone(),
-matrix: matrix_client.clone(),
+matrix: Some(matrix_client.clone()),
system_prompt: system_prompt_text.clone(),
orchestrator_agent_id: orchestrator_id,
orchestrator: Some(orch),