studio/sol

Author	SHA1	Message	Date
Sienna Meridian Satterwhite	2eda81ef2e	feat: gitea branch lifecycle tests - gitea: create_branch, delete_branch, list verification on studio/sol	2026-03-24 16:05:10 +00:00
Sienna Meridian Satterwhite	ef040aae38	feat: conversation registry compaction, reset, context hint tests - needs_compaction: verify threshold triggers after token accumulation - set_agent_id / get_agent_id: round-trip agent ID storage - reset_all: verify all rooms cleared, SQLite and memory - context_hint: verify conversation receives recent history context when creating new conversation with hint parameter	2026-03-24 15:43:50 +00:00
Sienna Meridian Satterwhite	f338444087	feat: matrix room, room_info, research execute, bootstrap improvements - bootstrap: create integration test room in Tuwunel, send bootstrap message, print room ID in summary - room_info: list_rooms and get_room_members against live Tuwunel - research: execute with empty tasks against real Matrix room + Mistral - identity: fix flaky list_users_tool test (use search instead of unbounded list to avoid pagination)	2026-03-24 15:38:12 +00:00
Sienna Meridian Satterwhite	b5c83b7c34	feat: gitea SDK extended tests + research type coverage - gitea: list_org_repos, get_issue, list_notifications, list_orgs, get_org — exercises authed_get paths for more API endpoints - research: empty tasks parse, multiple tasks parse, depth boundary edge cases	2026-03-24 15:15:56 +00:00
Sienna Meridian Satterwhite	f1009ddda4	feat: deeper script sandbox and research type tests - script tool: async operations, sol.fs.list, console.log/error/warn/info, return value capture, additional sandbox coverage - research tool: tool_definition schema validation, depth boundary exhaustive testing, ResearchTask/ResearchResult roundtrips, output format verification - matrix_utils: extract_image returns None for text messages	2026-03-24 15:04:39 +00:00
Sienna Meridian Satterwhite	e59b55e6a9	feat: matrix, script, evaluator, and devtools integration tests - matrix_utils: construct ruma events in tests, verify extract_body (text/notice/emote/unsupported), extract_reply_to, extract_thread_id, extract_edit, extract_image, make_reply_content, make_thread_reply - script tool: full run_script against live Tuwunel + OpenSearch — basic math, TypeScript transpilation, filesystem sandbox read/write, error capture, output truncation, invalid args - evaluator: DM/mention/silence short-circuits, LLM evaluation path with Mistral API, reply-to-human suppression - agent registry: list/get_id, prompt reuse, prompt-change recreation - devtools: tool dispatch for list_repos, get_repo, list_issues, get_file, list_branches, list_comments, list_orgs - conversations: token tracking, multi-turn context recall, room isolation	2026-03-24 14:48:13 +00:00
Sienna Meridian Satterwhite	5dc739b800	feat: integration test suite — 416 tests, 61% coverage Add OpenBao and Kratos to docker-compose dev stack with bootstrap seeding. Full integration tests hitting real services: - Vault SDK: KV read/write/delete, re-auth on bad token, new_with_token constructor for dev mode - Kratos SDK: list/get/create/disable/enable users, session listing - Token store: PAT lifecycle with OpenBao backing, expiry handling - Identity tools: full tool dispatch through Kratos admin API - Gitea SDK: resolve_username, ensure_token (PAT auto-provisioning), list/get repos, issues, comments, branches, file content - Devtools: tool dispatch for all gitea_* tools against live Gitea - Archive indexer: batch flush, periodic flush task, edit/redact/reaction updates against OpenSearch - Memory store: set/query/get_recent with user scoping in OpenSearch - Room history: context retrieval by timestamp and event_id, access control enforcement - Search archive: keyword search with room/sender filters, room scoping - Code search: language filter, repo filter, branch scoping - Breadcrumbs: symbol retrieval, empty index handling, token budget - Bridge: full event lifecycle mapping, request ID filtering - Evaluator: DM/mention/silence short-circuits, LLM evaluation path, reply-to-human suppression - Agent registry: list/get_id, prompt reuse, prompt-change recreation - Conversations: token tracking, multi-turn context recall, room isolation Bug fixes caught by tests: - AgentRegistry in-memory cache skipped hash comparison on prompt change - KratosClient::set_state sent bare PUT without traits (400 error) - find_code_session returns None on NULL conversation_id	2026-03-24 14:34:03 +00:00
Sienna Meridian Satterwhite	4528739a5f	feat: deterministic Gitea integration tests + mutation lifecycle Bootstrap: - Creates test issue + comment on studio/sol for deterministic test data - Mirrors 6 real repos from src.sunbeam.pt Devtools tests (13, all deterministic): - Read: list_repos, get_repo, get_file, list_branches, list_issues, list_pulls, list_comments, list_notifications, list_org_repos, get_org, unknown_tool - Mutation lifecycle: create_repo → create_issue → create_comment → create_branch → create_pull → get_pull → edit_issue → delete_branch → cleanup (all arg names verified against tool impls) Additional tests: - Script sandbox: basic math, string manipulation, JSON output - Archive search: arg parsing, OpenSearch query - Persistence: agent CRUD, service user CRUD - gRPC bridge: event filtering, tool mapping	2026-03-24 12:45:01 +00:00
Sienna Meridian Satterwhite	0efd3e32c3	feat: devtools + tool dispatch tests, search_code tool definition fix Devtools integration tests (6 new, all via live Gitea): - gitea_list_repos, get_repo, get_file, list_branches, list_issues, list_orgs - Tests exercise the full Gitea SDK → tool handler → JSON response path Tool dispatch tests (8 new unit tests): - tool_definitions: base, gitea, kratos, all-enabled variants - agent_tool_definitions conversion - minimal registry creation - unknown tool error handling - search_code without OpenSearch error search_code: added to tool_definitions() (was only in execute dispatch)	2026-03-24 12:06:39 +00:00
Sienna Meridian Satterwhite	495c465a01	refactor: remove legacy responder + agent_ux, add Gitea integration tests Legacy removal: - DELETE src/brain/responder.rs (900 lines) — replaced by orchestrator - DELETE src/agent_ux.rs (184 lines) — UX moved to transport bridges - EXTRACT chat_blocking() to src/brain/chat.rs (standalone utility) - sync.rs: uses ConversationRegistry directly (no responder) - main.rs: holds ToolRegistry + Personality directly (no Responder wrapper) - research.rs: progress updates via tracing (no AgentProgress) Gitea integration testing: - docker-compose: added Gitea service with healthcheck - bootstrap-gitea.sh: creates admin, org, mirrors 6 real repos from src.sunbeam.pt (sol, cli, proxy, storybook, admin-ui, mistralai-client-rs) - PAT provisioning for SDK testing without Vault - code_index/gitea.rs: fixed directory listing (direct API calls instead of SDK's single-object parser), proper base64 file decoding New integration tests: - Gitea: list_repos, get_repo, get_file, directory listing, code indexing - Web search: SearXNG query with result verification - Conversation registry: lifecycle + send_message round-trip - Evaluator: rule matching (DM, own message) - gRPC bridge: event filtering, tool call mapping, thinking→status	2026-03-24 11:45:43 +00:00
Sienna Meridian Satterwhite	ec55984fd8	feat: Phase 5 polish — conditional LSP tools, capabilities, sidecar hooks - ToolSide enum: documented Sidecar future variant - StartSession.capabilities: client reports LSP availability - Client detects LSP binaries on PATH, sends ["lsp_rust", "lsp_typescript"] - build_tool_definitions() conditionally registers LSP tools only when client has LSP capability — model won't hallucinate unavailable tools - CodeSession stores capabilities, has_lsp(), has_capability() accessors - git_branch() reads from git for breadcrumb scoping - ToolRegistry.gitea_client() accessor for reindex endpoint	2026-03-24 09:54:14 +00:00
Sienna Meridian Satterwhite	c213d74620	feat: code search tool + breadcrumb context injection + integration tests search_code tool: - Server-side tool querying sol_code OpenSearch index - BM25 search across symbol_name, signature, docstring, content - Branch-aware with boost for current branch, mainline fallback - Registered in ToolRegistry execute dispatch Breadcrumb injection: - build_context_header() now async, injects adaptive breadcrumbs - Hybrid search: _analyze → wildcard symbol matching → BM25 - Token budget enforcement (default outline + relevant expansion) - Graceful degradation when OpenSearch unavailable GrpcState: - Added Option<OpenSearch> for breadcrumb retrieval - code_index_name() accessor Integration tests (6 new, 226 total): - Index + search: bulk index symbols, verify BM25 retrieval - Breadcrumb outline: aggregation query returns project structure - Breadcrumb expansion: substantive query triggers relevant symbols - Token budget: respects character limit - Branch scoping: feat/code symbols preferred over mainline - Branch deletion: cleanup removes branch symbols, mainline survives	2026-03-24 00:19:17 +00:00
Sienna Meridian Satterwhite	40a6772f99	feat: 13 e2e integration tests against real Mistral API Orchestrator tests: - Simple chat roundtrip with token usage verification - Event ordering (Started → Thinking → Done) - Metadata pass-through (opaque bag appears in Started event) - Token usage accuracy (longer prompts → more tokens) - Conversation continuity (multi-turn recall) - Client-side tool dispatch + mock result submission - Failed tool result handling (is_error: true) - Server-side tool execution (search_web via conversation) gRPC tests: - Full roundtrip (StartSession → UserInput → Status → TextDone) - Client tool relay (ToolCall → ToolResult through gRPC stream) - Token counts in TextDone (non-zero verification) - Session resume (same room_id, resumed flag) - Clean disconnect (EndSession → SessionEnd) Infrastructure: - ToolRegistry::new_minimal() — no OpenSearch/Matrix needed - ToolRegistry fields now Option for testability - GrpcState.matrix now Option - grpc_bridge moved to src/grpc/bridge.rs - TestHarness loads API key from .env	2026-03-23 20:54:28 +00:00

13 Commits