studio/sol

Author	SHA1	Message	Date
Sienna Meridian Satterwhite	6a2aafdccc	refactor: remove legacy chat path, fix corrupted conversation recovery - Delete CodeSession::chat() — the legacy inline tool loop that duplicated the orchestrator's conversation + tool dispatch logic - Delete wait_for_tool_result() — only used by the legacy path - Make orchestrator mandatory in run_session (no more if/else fallback) - Unify conversation creation into create_fresh_conversation() - Add corrupted conversation recovery to create_or_append_conversation: detects "function calls and responses" errors from Mistral (caused by disconnecting mid-tool-call) and auto-creates a fresh conversation - Add tracing-appender for optional rotating log file (SOL_LOG_FILE env) - Add Procfile.dev for overmind process management	2026-03-24 19:49:07 +00:00
Sienna Meridian Satterwhite	495c465a01	refactor: remove legacy responder + agent_ux, add Gitea integration tests Legacy removal: - DELETE src/brain/responder.rs (900 lines) — replaced by orchestrator - DELETE src/agent_ux.rs (184 lines) — UX moved to transport bridges - EXTRACT chat_blocking() to src/brain/chat.rs (standalone utility) - sync.rs: uses ConversationRegistry directly (no responder) - main.rs: holds ToolRegistry + Personality directly (no Responder wrapper) - research.rs: progress updates via tracing (no AgentProgress) Gitea integration testing: - docker-compose: added Gitea service with healthcheck - bootstrap-gitea.sh: creates admin, org, mirrors 6 real repos from src.sunbeam.pt (sol, cli, proxy, storybook, admin-ui, mistralai-client-rs) - PAT provisioning for SDK testing without Vault - code_index/gitea.rs: fixed directory listing (direct API calls instead of SDK's single-object parser), proper base64 file decoding New integration tests: - Gitea: list_repos, get_repo, get_file, directory listing, code indexing - Web search: SearXNG query with result verification - Conversation registry: lifecycle + send_message round-trip - Evaluator: rule matching (DM, own message) - gRPC bridge: event filtering, tool call mapping, thinking→status	2026-03-24 11:45:43 +00:00
Sienna Meridian Satterwhite	c213d74620	feat: code search tool + breadcrumb context injection + integration tests search_code tool: - Server-side tool querying sol_code OpenSearch index - BM25 search across symbol_name, signature, docstring, content - Branch-aware with boost for current branch, mainline fallback - Registered in ToolRegistry execute dispatch Breadcrumb injection: - build_context_header() now async, injects adaptive breadcrumbs - Hybrid search: _analyze → wildcard symbol matching → BM25 - Token budget enforcement (default outline + relevant expansion) - Graceful degradation when OpenSearch unavailable GrpcState: - Added Option<OpenSearch> for breadcrumb retrieval - code_index_name() accessor Integration tests (6 new, 226 total): - Index + search: bulk index symbols, verify BM25 retrieval - Breadcrumb outline: aggregation query returns project structure - Breadcrumb expansion: substantive query triggers relevant symbols - Token budget: respects character limit - Branch scoping: feat/code symbols preferred over mainline - Branch deletion: cleanup removes branch symbols, mainline survives	2026-03-24 00:19:17 +00:00
Sienna Meridian Satterwhite	57f8d608a5	feat: code index + adaptive breadcrumbs foundation Code index (sol_code): - SymbolDocument: file_path, repo_name, language, symbol_name, symbol_kind, signature, docstring, branch, source, embedding (768-dim knn_vector) - CodeIndexer: batch symbol indexer with idempotent upserts - Branch-aware: symbols scoped to branch with mainline fallback Breadcrumbs: - build_breadcrumbs(): adaptive context injection for coding prompts - Default: project outline via aggregation (modules, types, fns) - Adaptive: hybrid search (_analyze → symbol matching → BM25 + neural) - Token budget enforcement with priority (outline first, then relevance) - format_symbol(): signature + first-line docstring + file:line Query optimization: uses _analyze API to extract key terms from free-form user text, matches against actual symbol names in the index before running the hybrid search.	2026-03-23 23:54:29 +00:00
Sienna Meridian Satterwhite	40a6772f99	feat: 13 e2e integration tests against real Mistral API Orchestrator tests: - Simple chat roundtrip with token usage verification - Event ordering (Started → Thinking → Done) - Metadata pass-through (opaque bag appears in Started event) - Token usage accuracy (longer prompts → more tokens) - Conversation continuity (multi-turn recall) - Client-side tool dispatch + mock result submission - Failed tool result handling (is_error: true) - Server-side tool execution (search_web via conversation) gRPC tests: - Full roundtrip (StartSession → UserInput → Status → TextDone) - Client tool relay (ToolCall → ToolResult through gRPC stream) - Token counts in TextDone (non-zero verification) - Session resume (same room_id, resumed flag) - Clean disconnect (EndSession → SessionEnd) Infrastructure: - ToolRegistry::new_minimal() — no OpenSearch/Matrix needed - ToolRegistry fields now Option for testability - GrpcState.matrix now Option - grpc_bridge moved to src/grpc/bridge.rs - TestHarness loads API key from .env	2026-03-23 20:54:28 +00:00
Sienna Meridian Satterwhite	2810143f76	feat(grpc): proper tool result relay via tokio::select session_chat_via_orchestrator now: - Spawns generation on a background task - Reads in_stream for client tool results in foreground - Forwards results to orchestrator.submit_tool_result() - Uses tokio::select! to handle both concurrently - Uses GenerateRequest + Metadata (no transport types in orchestrator) - Calls grpc::bridge (not orchestrator::grpc_bridge)	2026-03-23 19:23:51 +00:00
Sienna Meridian Satterwhite	9e5f7e61be	feat(orchestrator): Phase 2 engine + tokenizer + tool dispatch Orchestrator engine: - engine.rs: unified Mistral Conversations API tool loop that emits OrchestratorEvent instead of calling Matrix/gRPC directly - tool_dispatch.rs: ToolSide routing (client vs server tools) - Memory loading stubbed (migrates in Phase 4) Server-side tokenizer: - tokenizer.rs: HuggingFace tokenizers-rs with Mistral's BPE tokenizer - count_tokens() for accurate usage metrics - Loads from local tokenizer.json or falls back to bundled vocab - Config: mistral.tokenizer_path (optional) No behavior change — engine is wired but not yet called from sync.rs or session.rs (Phase 2 continuation).	2026-03-23 17:40:25 +00:00
Sienna Meridian Satterwhite	ec4fde7b97	feat(orchestrator): Phase 1 — event types + broadcast channel foundation Introduces the orchestrator module with: - OrchestratorEvent enum: 11 event variants covering lifecycle, tools, progress, and side effects - RequestId (UUID per generation), ResponseMode (Chat/Code), ToolSide - ChatRequest/CodeRequest structs for transport-agnostic request input - Orchestrator struct with tokio::broadcast channel (capacity 256) - subscribe() for transport bridges, emit() for the engine - Client-side tool dispatch: pending_client_tools map with oneshot channels - submit_tool_result() to unblock engine from gRPC client responses Additive only — no behavior change. Existing responder + gRPC session paths are untouched. Phase 2 will migrate the Conversations API path.	2026-03-23 17:30:36 +00:00
Sienna Meridian Satterwhite	b8b76687a5	feat(grpc): dev mode, agent prefix, system prompt, error UX - gRPC dev_mode config: disables JWT auth, uses fixed dev identity - Agent prefix (agents.agent_prefix): dev agents use "dev-sol-orchestrator" to avoid colliding with production on shared Mistral accounts - Coding sessions use instructions (system prompt + coding addendum) with mistral-medium-latest for personality adherence - Conversations API: don't send both model + agent_id (422 fix) - GrpcState carries system_prompt + orchestrator_agent_id - Session.end() keeps session active for reuse (not "ended") - User messages posted as m.notice, assistant as m.text (role detection) - History loaded from Matrix room on session resume - Docker Compose local dev stack: OpenSearch 3 + Tuwunel + SearXNG - Dev config: localhost URLs, dev_mode, opensearch-init.sh for ML setup	2026-03-23 17:07:50 +00:00
Sienna Meridian Satterwhite	71392cef9c	feat(code): wire gRPC server into Sol startup spawns gRPC server alongside Matrix sync loop when [grpc] config is present. shares ToolRegistry, Store, MistralClient, and Matrix client with the gRPC CodeSession handler.	2026-03-23 13:01:36 +00:00
Sienna Meridian Satterwhite	35b6246fa7	feat(code): gRPC server with JWT auth + tool routing tonic 0.14 gRPC server for sunbeam code sessions: - bidirectional streaming Session RPC - JWT interceptor validates tokens against Hydra JWKS - tool router classifies calls as client-side (file_read, bash, grep, etc.) or server-side (gitea, identity, search, etc.) - service stub with session lifecycle (start, chat, tool results, end) - coding_model config (default: devstral-small-2506) - grpc config section (listen_addr, jwks_url) - 182 tests (5 new: JWT claims, tool routing) phase 2 TODOs: Matrix room bridge, Mistral agent loop, streaming	2026-03-23 11:35:37 +00:00
Sienna Meridian Satterwhite	567d4c1171	fix research agent hang: per-agent timeout + startup cleanup research agents now have a 2-minute timeout via tokio::time::timeout. a hung Mistral API call can no longer block Sol's entire sync loop. timed-out agents return partial results instead of hanging forever. on startup, Sol detects research sessions with status='running' from previous crashes and marks them as failed. 6 new tests covering the full research session lifecycle: create, append findings, complete, fail, hung cleanup, and partial findings survival.	2026-03-23 09:03:03 +00:00
Sienna Meridian Satterwhite	447bead0b7	wire up identity agent, research tool, silence state main.rs: create KratosClient, pass mistral+store to ToolRegistry, build active_agents list for dynamic delegation. conversations.rs: context_hint for new conversations, reset_all. sdk/mod.rs: added kratos module.	2026-03-23 01:43:51 +00:00
Sienna Meridian Satterwhite	7bf9e25361	per-message context headers, memory notes, conversation continuity conversations API path now injects per-message context headers with live timestamps, room name, and memory notes. this replaces the template variables in agent instructions which were frozen at creation time. memory notes (topical + recent backfill) loaded before each response in the conversations path — was previously only in the legacy path. context hint seeds new conversations with recent room history after resets, so sol doesn't lose conversational continuity on sneeze. tool call results now logged with preview + length for debugging. reset_all() clears both in-memory and sqlite conversation state.	2026-03-22 15:00:43 +00:00
Sienna Meridian Satterwhite	7580c10dda	feat: multi-agent architecture with Conversations API and persistent state Mistral Agents + Conversations API integration: - Orchestrator agent created on startup with Sol's personality + tools - ConversationRegistry routes messages through persistent conversations - Per-room conversation state (room_id → conversation_id + token counts) - Function call handling within conversation responses - Configurable via [agents] section in sol.toml (use_conversations_api flag) Multimodal support: - m.image detection and Matrix media download (mxc:// → base64 data URI) - ContentPart-based messages sent to Mistral vision models - Archive stores media_urls for image messages System prompt rewrite: - 687 → 150 lines — dense, few-shot examples, hard rules - {room_context_rules} placeholder for group vs DM behavior - Sender prefixing (<@user:server>) for multi-user turns in group rooms SQLite persistence (/data/sol.db): - Conversation mappings and agent IDs survive reboots - WAL mode for concurrent reads - Falls back to in-memory on failure (sneezes into all rooms to signal) - PVC already mounted at /data alongside Matrix SDK state store New modules: - src/persistence.rs — SQLite state store - src/conversations.rs — ConversationRegistry + message merging - src/agents/{mod,definitions,registry}.rs — agent lifecycle - src/agent_ux.rs — reaction + thread progress UX - src/tools/bridge.rs — tool dispatch for domain agents 102 tests passing.	2026-03-21 22:21:14 +00:00
Sienna Meridian Satterwhite	4949e70ecc	feat: per-user auto-memory with ResponseContext Three memory channels: hidden tool (sol.memory.set/get in scripts), pre-response injection (relevant memories loaded into system prompt), and post-response extraction (ministral-3b extracts facts after each response). User isolation enforced at Rust level — user_id derived from Matrix sender, never from script arguments. New modules: context (ResponseContext), memory (schema, store, extractor). ResponseContext threaded through responder → tools → script runtime. OpenSearch index sol_user_memory created on startup alongside archive.	2026-03-21 15:51:31 +00:00
Sienna Meridian Satterwhite	4dc20bee23	feat: initial Sol virtual librarian implementation Matrix bot with E2EE (matrix-sdk 0.9) that passively archives all messages to OpenSearch and responds to queries via Mistral AI with function calling tools. Core systems: - Archive: bulk OpenSearch indexer with batch/flush, edit/redaction handling, embedding pipeline passthrough - Brain: rule-based engagement evaluator (mentions, DMs, name invocations), LLM-powered spontaneous engagement, per-room conversation context windows, response delay simulation - Tools: search_archive, get_room_context, list_rooms, get_room_members registered as Mistral function calling tools with iterative tool loop - Personality: templated system prompt with Sol's librarian persona 47 unit tests covering config, evaluator, conversation windowing, personality templates, schema serialization, and search query building.	2026-03-20 21:40:13 +00:00

17 Commits
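
Commit 35b6246fa7 describes a tool router that classifies calls as client-side (file_read, bash, grep, etc.) or server-side (gitea, identity, search, etc.). A minimal sketch of that classification, assuming a static name list, might look like the block below. The `ToolSide` name appears in the log; `CLIENT_TOOLS`, `route_tool`, and the rule that unlisted names go to the server are hypothetical and not taken from the repository.

```rust
/// Which side of the gRPC session executes a tool call.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToolSide {
    /// Runs on the developer's machine (file access, shell commands).
    Client,
    /// Runs inside the service (Gitea, identity, search).
    Server,
}

/// Hypothetical client-side tool names, limited to the three the commit
/// message lists explicitly; the real list is longer ("etc.").
const CLIENT_TOOLS: &[&str] = &["file_read", "bash", "grep"];

/// Classify a tool call by name.
/// Assumption: any name not listed as client-side is handled server-side.
pub fn route_tool(name: &str) -> ToolSide {
    if CLIENT_TOOLS.contains(&name) {
        ToolSide::Client
    } else {
        ToolSide::Server
    }
}
```

With this sketch, `route_tool("bash")` yields `ToolSide::Client`, while a name such as `search_code` falls through to `ToolSide::Server`. A real router could equally reject unknown names instead of defaulting them.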