Commit Graph

17 Commits

Author SHA1 Message Date
6a2aafdccc refactor: remove legacy chat path, fix corrupted conversation recovery
- Delete CodeSession::chat() — the legacy inline tool loop that
  duplicated the orchestrator's conversation + tool dispatch logic
- Delete wait_for_tool_result() — only used by the legacy path
- Make orchestrator mandatory in run_session (no more if/else fallback)
- Unify conversation creation into create_fresh_conversation()
- Add corrupted conversation recovery to create_or_append_conversation:
  detects "function calls and responses" errors from Mistral (caused by
  disconnecting mid-tool-call) and auto-creates a fresh conversation
- Add tracing-appender for optional rotating log file (SOL_LOG_FILE env)
- Add Procfile.dev for overmind process management
2026-03-24 19:49:07 +00:00
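
The recovery path described in 6a2aafdccc can be sketched roughly as follows. This is a minimal illustration, not the project's code: the closure-based signature and the `(conversation_id, reply)` tuple are assumptions, and only the "function calls and responses" marker comes from the commit message.

```rust
/// Substring quoted in the commit message; the full error text is not shown there.
const CORRUPTION_MARKER: &str = "function calls and responses";

/// Returns true when an API error looks like the corrupted-history case.
pub fn is_corrupted_conversation(error: &str) -> bool {
    error.contains(CORRUPTION_MARKER)
}

/// Append to the existing conversation if there is one. If the append fails
/// with the corruption marker, fall back to a fresh conversation; any other
/// error is returned to the caller unchanged.
///
/// `append` and `create_fresh` are hypothetical stand-ins for the real API
/// calls. The result is `(conversation_id, reply)`.
pub fn create_or_append(
    existing: Option<&str>,
    append: impl FnOnce(&str) -> Result<String, String>,
    create_fresh: impl FnOnce() -> Result<(String, String), String>,
) -> Result<(String, String), String> {
    if let Some(id) = existing {
        match append(id) {
            Ok(reply) => return Ok((id.to_string(), reply)),
            // Corrupted history: drop the old conversation and start over.
            Err(e) if is_corrupted_conversation(&e) => {}
            Err(e) => return Err(e),
        }
    }
    create_fresh()
}
```

Matching on an error substring is brittle if the upstream wording changes, so a structured error code would be preferable wherever the API exposes one.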
495c465a01 refactor: remove legacy responder + agent_ux, add Gitea integration tests
Legacy removal:
- DELETE src/brain/responder.rs (900 lines) — replaced by orchestrator
- DELETE src/agent_ux.rs (184 lines) — UX moved to transport bridges
- EXTRACT chat_blocking() to src/brain/chat.rs (standalone utility)
- sync.rs: uses ConversationRegistry directly (no responder)
- main.rs: holds ToolRegistry + Personality directly (no Responder wrapper)
- research.rs: progress updates via tracing (no AgentProgress)

Gitea integration testing:
- docker-compose: added Gitea service with healthcheck
- bootstrap-gitea.sh: creates admin, org, mirrors 6 real repos from
  src.sunbeam.pt (sol, cli, proxy, storybook, admin-ui, mistralai-client-rs)
- PAT provisioning for SDK testing without Vault
- code_index/gitea.rs: fixed directory listing (direct API calls instead
  of SDK's single-object parser), proper base64 file decoding

New integration tests:
- Gitea: list_repos, get_repo, get_file, directory listing, code indexing
- Web search: SearXNG query with result verification
- Conversation registry: lifecycle + send_message round-trip
- Evaluator: rule matching (DM, own message)
- gRPC bridge: event filtering, tool call mapping, thinking→status
2026-03-24 11:45:43 +00:00
c213d74620 feat: code search tool + breadcrumb context injection + integration tests
search_code tool:
- Server-side tool querying sol_code OpenSearch index
- BM25 search across symbol_name, signature, docstring, content
- Branch-aware with boost for current branch, mainline fallback
- Registered in ToolRegistry execute dispatch
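
The branch-aware ranking above could look something like the sketch below. It re-ranks hits in memory purely for illustration; the actual tool presumably expresses the boost inside the OpenSearch query. The `Hit` struct, the `boost` parameter, and the deduplication by symbol name are all assumptions.

```rust
use std::collections::HashMap;

/// Hypothetical search hit; the real index documents carry more fields.
#[derive(Debug, Clone, PartialEq)]
pub struct Hit {
    pub symbol_name: String,
    pub branch: String,
    pub score: f32,
}

/// Keep hits from the current branch (boosted) and from mainline (fallback),
/// drop every other branch, and keep only the best-scoring hit per symbol.
pub fn rank_for_branch(hits: &[Hit], current: &str, mainline: &str, boost: f32) -> Vec<Hit> {
    let mut best: HashMap<&str, Hit> = HashMap::new();
    for hit in hits {
        let score = if hit.branch == current {
            hit.score * boost
        } else if hit.branch == mainline {
            hit.score
        } else {
            continue; // symbols from unrelated branches are out of scope
        };
        let candidate = Hit { score, ..hit.clone() };
        let key = hit.symbol_name.as_str();
        let replace = best.get(key).map_or(true, |old| candidate.score > old.score);
        if replace {
            best.insert(key, candidate);
        }
    }
    let mut ranked: Vec<Hit> = best.into_values().collect();
    // Highest score first; ties broken by name so the order is deterministic.
    ranked.sort_by(|a, b| {
        b.score
            .total_cmp(&a.score)
            .then_with(|| a.symbol_name.cmp(&b.symbol_name))
    });
    ranked
}
```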

Breadcrumb injection:
- build_context_header() now async, injects adaptive breadcrumbs
- Hybrid search: _analyze → wildcard symbol matching → BM25
- Token budget enforcement (default outline + relevant expansion)
- Graceful degradation when OpenSearch unavailable

GrpcState:
- Added Option<OpenSearch> for breadcrumb retrieval
- code_index_name() accessor

Integration tests (6 new, 226 total):
- Index + search: bulk index symbols, verify BM25 retrieval
- Breadcrumb outline: aggregation query returns project structure
- Breadcrumb expansion: substantive query triggers relevant symbols
- Token budget: respects character limit
- Branch scoping: feat/code symbols preferred over mainline
- Branch deletion: cleanup removes branch symbols, mainline survives
2026-03-24 00:19:17 +00:00
57f8d608a5 feat: code index + adaptive breadcrumbs foundation
Code index (sol_code):
- SymbolDocument: file_path, repo_name, language, symbol_name, symbol_kind,
  signature, docstring, branch, source, embedding (768-dim knn_vector)
- CodeIndexer: batch symbol indexer with idempotent upserts
- Branch-aware: symbols scoped to branch with mainline fallback

Breadcrumbs:
- build_breadcrumbs(): adaptive context injection for coding prompts
- Default: project outline via aggregation (modules, types, fns)
- Adaptive: hybrid search (_analyze → symbol matching → BM25 + neural)
- Token budget enforcement with priority (outline first, then relevance)
- format_symbol(): signature + first-line docstring + file:line
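
A rough sketch of the budget priority and symbol formatting listed above. The `Symbol` fields, the exact output layout, and the use of byte length as a stand-in for the character budget are assumptions; only the ordering (outline first, then relevance) and the "signature + first-line docstring + file:line" shape come from the commit message.

```rust
/// Hypothetical subset of an indexed symbol.
#[derive(Debug, Clone)]
pub struct Symbol {
    pub signature: String,
    pub docstring: String,
    pub file_path: String,
    pub line: u32,
}

/// Signature, first docstring line (if any), then file:line.
pub fn format_symbol(s: &Symbol) -> String {
    let first_doc = s.docstring.lines().next().unwrap_or("").trim();
    if first_doc.is_empty() {
        format!("{} ({}:{})", s.signature, s.file_path, s.line)
    } else {
        format!("{} // {} ({}:{})", s.signature, first_doc, s.file_path, s.line)
    }
}

/// Append `line` plus a newline only if the result stays within `budget` bytes.
fn try_push(out: &mut String, line: &str, budget: usize) -> bool {
    if out.len() + line.len() + 1 > budget {
        return false;
    }
    out.push_str(line);
    out.push('\n');
    true
}

/// Outline lines are spent first; relevant symbols fill whatever is left.
pub fn build_breadcrumbs(outline: &[String], relevant: &[Symbol], budget: usize) -> String {
    let mut out = String::new();
    for line in outline {
        if !try_push(&mut out, line, budget) {
            return out;
        }
    }
    for symbol in relevant {
        if !try_push(&mut out, &format_symbol(symbol), budget) {
            break;
        }
    }
    out
}
```

Stopping at the first line that does not fit keeps the output a strict prefix of the priority order; a variant could instead skip oversized entries and keep trying smaller ones.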

Query optimization: uses _analyze API to extract key terms from
free-form user text, matches against actual symbol names in the index
before running the hybrid search.
2026-03-23 23:54:29 +00:00
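
The query optimization step in 57f8d608a5 might be approximated as below. The commit uses the OpenSearch `_analyze` API for term extraction; this sketch swaps in a naive local tokenizer (lowercase, split on non-identifier characters, drop tokens shorter than four bytes) so it can run standalone. Both function names are hypothetical.

```rust
/// Naive stand-in for the `_analyze` call: extract lowercase key terms
/// from free-form text, dropping short tokens and duplicates.
pub fn extract_terms(text: &str) -> Vec<String> {
    let mut terms: Vec<String> = Vec::new();
    for raw in text.split(|c: char| !(c.is_alphanumeric() || c == '_')) {
        let term = raw.to_lowercase();
        if term.len() >= 4 && !terms.contains(&term) {
            terms.push(term);
        }
    }
    terms
}

/// Wildcard-style matching: keep the known symbol names that contain
/// any extracted term, ignoring case.
pub fn matching_symbols<'a>(terms: &[String], symbols: &[&'a str]) -> Vec<&'a str> {
    symbols
        .iter()
        .copied()
        .filter(|symbol| {
            let lowered = symbol.to_lowercase();
            terms.iter().any(|term| lowered.contains(term.as_str()))
        })
        .collect()
}
```

The matched names can then seed the hybrid search, so the query is anchored on symbols that actually exist in the index rather than on the user's raw wording.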
40a6772f99 feat: 13 e2e integration tests against real Mistral API
Orchestrator tests:
- Simple chat roundtrip with token usage verification
- Event ordering (Started → Thinking → Done)
- Metadata pass-through (opaque bag appears in Started event)
- Token usage accuracy (longer prompts → more tokens)
- Conversation continuity (multi-turn recall)
- Client-side tool dispatch + mock result submission
- Failed tool result handling (is_error: true)
- Server-side tool execution (search_web via conversation)

gRPC tests:
- Full roundtrip (StartSession → UserInput → Status → TextDone)
- Client tool relay (ToolCall → ToolResult through gRPC stream)
- Token counts in TextDone (non-zero verification)
- Session resume (same room_id, resumed flag)
- Clean disconnect (EndSession → SessionEnd)

Infrastructure:
- ToolRegistry::new_minimal() — no OpenSearch/Matrix needed
- ToolRegistry fields now Option for testability
- GrpcState.matrix now Option
- grpc_bridge moved to src/grpc/bridge.rs
- TestHarness loads API key from .env
2026-03-23 20:54:28 +00:00
2810143f76 feat(grpc): proper tool result relay via tokio::select
session_chat_via_orchestrator now:
- Spawns generation on a background task
- Reads in_stream for client tool results in foreground
- Forwards results to orchestrator.submit_tool_result()
- Uses tokio::select! to handle both concurrently
- Uses GenerateRequest + Metadata (no transport types in orchestrator)
- Calls grpc::bridge (not orchestrator::grpc_bridge)
2026-03-23 19:23:51 +00:00
9e5f7e61be feat(orchestrator): Phase 2 engine + tokenizer + tool dispatch
Orchestrator engine:
- engine.rs: unified Mistral Conversations API tool loop that emits
  OrchestratorEvent instead of calling Matrix/gRPC directly
- tool_dispatch.rs: ToolSide routing (client vs server tools)
- Memory loading stubbed (migrates in Phase 4)

Server-side tokenizer:
- tokenizer.rs: HuggingFace tokenizers-rs with Mistral's BPE tokenizer
- count_tokens() for accurate usage metrics
- Loads from local tokenizer.json or falls back to bundled vocab
- Config: mistral.tokenizer_path (optional)

No behavior change — engine is wired but not yet called from
sync.rs or session.rs (Phase 2 continuation).
2026-03-23 17:40:25 +00:00
ec4fde7b97 feat(orchestrator): Phase 1 — event types + broadcast channel foundation
Introduces the orchestrator module with:
- OrchestratorEvent enum: 11 event variants covering lifecycle, tools,
  progress, and side effects
- RequestId (UUID per generation), ResponseMode (Chat/Code), ToolSide
- ChatRequest/CodeRequest structs for transport-agnostic request input
- Orchestrator struct with tokio::broadcast channel (capacity 256)
- subscribe() for transport bridges, emit() for the engine
- Client-side tool dispatch: pending_client_tools map with oneshot channels
- submit_tool_result() to unblock engine from gRPC client responses
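
The client-side tool handoff in the last two bullets can be sketched as follows. The commit uses tokio oneshot channels; this version substitutes `std::sync::mpsc` so it needs no external crates. The `ToolResult` shape and the `register` method are assumptions.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{self, Receiver, Sender};
use std::sync::Mutex;

/// Hypothetical result payload sent back by the client.
#[derive(Debug, Clone, PartialEq)]
pub struct ToolResult {
    pub output: String,
    pub is_error: bool,
}

/// Map from tool call id to the channel the engine is waiting on.
#[derive(Default)]
pub struct PendingClientTools {
    waiting: Mutex<HashMap<String, Sender<ToolResult>>>,
}

impl PendingClientTools {
    /// Engine side: register a pending call and get a receiver to block on.
    pub fn register(&self, call_id: &str) -> Receiver<ToolResult> {
        let (tx, rx) = mpsc::channel();
        self.waiting.lock().unwrap().insert(call_id.to_string(), tx);
        rx
    }

    /// Transport side: deliver a result and unblock the engine. Returns false
    /// when the call id is unknown or the engine has stopped waiting.
    pub fn submit_tool_result(&self, call_id: &str, result: ToolResult) -> bool {
        // Removing the entry makes each call id single-use, like a oneshot.
        let sender = self.waiting.lock().unwrap().remove(call_id);
        match sender {
            Some(tx) => tx.send(result).is_ok(),
            None => false,
        }
    }
}
```

Keeping the map inside the orchestrator means a transport bridge only needs a call id and a result, without knowing anything about the engine's internal state.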

Additive only — no behavior change. Existing responder + gRPC session
paths are untouched. Phase 2 will migrate the Conversations API path.
2026-03-23 17:30:36 +00:00
b8b76687a5 feat(grpc): dev mode, agent prefix, system prompt, error UX
- gRPC dev_mode config: disables JWT auth, uses fixed dev identity
- Agent prefix (agents.agent_prefix): dev agents use "dev-sol-orchestrator"
  to avoid colliding with production on shared Mistral accounts
- Coding sessions use instructions (system prompt + coding addendum)
  with mistral-medium-latest for personality adherence
- Conversations API: don't send both model + agent_id (422 fix)
- GrpcState carries system_prompt + orchestrator_agent_id
- Session.end() keeps session active for reuse (not "ended")
- User messages posted as m.notice, assistant as m.text (role detection)
- History loaded from Matrix room on session resume
- Docker Compose local dev stack: OpenSearch 3 + Tuwunel + SearXNG
- Dev config: localhost URLs, dev_mode, opensearch-init.sh for ML setup
2026-03-23 17:07:50 +00:00
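
One way to prevent the model + agent_id conflict fixed in b8b76687a5 is to make the two options mutually exclusive in the type system. The sketch below is an assumption about how that could be modeled, not the project's actual request type; only the field names `model` and `agent_id` and the example values come from the commit message.

```rust
/// A conversation is addressed either to a raw model or to an agent,
/// never both, so the invalid combination cannot be constructed.
#[derive(Debug, Clone, PartialEq)]
pub enum Target {
    Model(String),
    Agent(String),
}

impl Target {
    /// The single request field this target contributes: (name, value).
    pub fn field(&self) -> (&'static str, &str) {
        match self {
            Target::Model(name) => ("model", name.as_str()),
            Target::Agent(id) => ("agent_id", id.as_str()),
        }
    }
}
```

With two independent `Option<String>` fields the conflict can only be caught at runtime, typically as the 422 mentioned above; with an enum it is a compile-time impossibility.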
71392cef9c feat(code): wire gRPC server into Sol startup
spawns gRPC server alongside Matrix sync loop when [grpc] config
is present. shares ToolRegistry, Store, MistralClient, and Matrix
client with the gRPC CodeSession handler.
2026-03-23 13:01:36 +00:00
35b6246fa7 feat(code): gRPC server with JWT auth + tool routing
tonic 0.14 gRPC server for sunbeam code sessions:
- bidirectional streaming Session RPC
- JWT interceptor validates tokens against Hydra JWKS
- tool router classifies calls as client-side (file_read, bash,
  grep, etc.) or server-side (gitea, identity, search, etc.)
- service stub with session lifecycle (start, chat, tool results, end)
- coding_model config (default: devstral-small-2506)
- grpc config section (listen_addr, jwks_url)
- 182 tests (5 new: JWT claims, tool routing)
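
A minimal sketch of the tool router's classification, assuming a simple name lookup. Only the three client-side tool names given in the list above are included; the full set is elided in the commit message ("etc."), and defaulting every other tool to server-side is an assumption.

```rust
/// Where a tool call is executed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToolSide {
    Client,
    Server,
}

/// Client-side tools named in the commit message; the real list is longer.
const CLIENT_TOOLS: &[&str] = &["file_read", "bash", "grep"];

/// Classify a tool call by name. Anything not known to run on the client
/// is treated as server-side (an assumption of this sketch).
pub fn classify(tool_name: &str) -> ToolSide {
    if CLIENT_TOOLS.iter().any(|known| *known == tool_name) {
        ToolSide::Client
    } else {
        ToolSide::Server
    }
}
```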

phase 2 TODOs: Matrix room bridge, Mistral agent loop, streaming
2026-03-23 11:35:37 +00:00
567d4c1171 fix research agent hang: per-agent timeout + startup cleanup
research agents now have a 2-minute timeout via tokio::time::timeout.
a hung Mistral API call can no longer block Sol's entire sync loop.
timed-out agents return partial results instead of hanging forever.

on startup, Sol detects research sessions left with status='running' by
previous crashes and marks them as failed. adds 6 new tests covering the
full research session lifecycle: create, append findings, complete,
fail, hung cleanup, and partial findings survival.
2026-03-23 09:03:03 +00:00
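
The timeout behavior in 567d4c1171 (return partial results instead of hanging) can be illustrated with the sketch below. The commit uses `tokio::time::timeout`; this version substitutes a worker thread and `recv_timeout` from the standard library so it runs without an async runtime. `ResearchOutcome` and `run_with_timeout` are hypothetical names.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::thread;
use std::time::{Duration, Instant};

/// Findings gathered so far, plus whether the agent hit its time limit.
#[derive(Debug, PartialEq)]
pub struct ResearchOutcome {
    pub findings: Vec<String>,
    pub timed_out: bool,
}

/// Run `agent` on a worker thread. The agent streams findings through the
/// sender; the caller collects them until the agent finishes or `limit`
/// elapses, whichever comes first.
pub fn run_with_timeout<F>(agent: F, limit: Duration) -> ResearchOutcome
where
    F: FnOnce(mpsc::Sender<String>) + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || agent(tx));

    let deadline = Instant::now() + limit;
    let mut findings = Vec::new();
    loop {
        let remaining = deadline.saturating_duration_since(Instant::now());
        match rx.recv_timeout(remaining) {
            Ok(finding) => findings.push(finding),
            // The agent dropped its sender: it finished normally.
            Err(RecvTimeoutError::Disconnected) => {
                return ResearchOutcome { findings, timed_out: false };
            }
            // Deadline reached: keep whatever arrived so far.
            Err(RecvTimeoutError::Timeout) => {
                return ResearchOutcome { findings, timed_out: true };
            }
        }
    }
}
```

Unlike dropping a timed-out future in tokio, this thread-based stand-in does not cancel the worker; it only stops waiting for it.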
447bead0b7 wire up identity agent, research tool, silence state
main.rs: create KratosClient, pass mistral+store to ToolRegistry,
build active_agents list for dynamic delegation.

conversations.rs: context_hint for new conversations, reset_all.
sdk/mod.rs: added kratos module.
2026-03-23 01:43:51 +00:00
7bf9e25361 per-message context headers, memory notes, conversation continuity
conversations API path now injects per-message context headers with
live timestamps, room name, and memory notes. this replaces the
template variables in agent instructions, which were frozen at
creation time.

memory notes (topical + recent backfill) are now loaded before each
response in the conversations path; previously this happened only in
the legacy path.

context hint seeds new conversations with recent room history after
resets, so sol doesn't lose conversational continuity on sneeze.

tool call results now logged with preview + length for debugging.
reset_all() clears both in-memory and sqlite conversation state.
2026-03-22 15:00:43 +00:00
7580c10dda feat: multi-agent architecture with Conversations API and persistent state
Mistral Agents + Conversations API integration:
- Orchestrator agent created on startup with Sol's personality + tools
- ConversationRegistry routes messages through persistent conversations
- Per-room conversation state (room_id → conversation_id + token counts)
- Function call handling within conversation responses
- Configurable via [agents] section in sol.toml (use_conversations_api flag)

Multimodal support:
- m.image detection and Matrix media download (mxc:// → base64 data URI)
- ContentPart-based messages sent to Mistral vision models
- Archive stores media_urls for image messages
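
The final step of the image path above (downloaded bytes to a base64 data URI) can be sketched as follows. The encoder is hand-rolled only to keep the example dependency-free; a real implementation would more likely use an existing base64 crate. The Matrix media download itself is not shown, and both function names are hypothetical.

```rust
/// Standard base64 alphabet (RFC 4648), with '=' padding.
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Encode bytes as padded base64.
pub fn base64_encode(bytes: &[u8]) -> String {
    let mut out = String::with_capacity((bytes.len() + 2) / 3 * 4);
    for chunk in bytes.chunks(3) {
        // Pack up to three bytes into a 24-bit group, zero-filling the rest.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        out.push(ALPHABET[(n >> 18) as usize & 63] as char);
        out.push(ALPHABET[(n >> 12) as usize & 63] as char);
        // Missing input bytes are represented by '=' padding.
        out.push(if chunk.len() > 1 { ALPHABET[(n >> 6) as usize & 63] as char } else { '=' });
        out.push(if chunk.len() > 2 { ALPHABET[n as usize & 63] as char } else { '=' });
    }
    out
}

/// Build a data URI from a MIME type and the raw media bytes.
pub fn data_uri(mime_type: &str, bytes: &[u8]) -> String {
    format!("data:{};base64,{}", mime_type, base64_encode(bytes))
}
```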

System prompt rewrite:
- 687 → 150 lines — dense, few-shot examples, hard rules
- {room_context_rules} placeholder for group vs DM behavior
- Sender prefixing (<@user:server>) for multi-user turns in group rooms

SQLite persistence (/data/sol.db):
- Conversation mappings and agent IDs survive reboots
- WAL mode for concurrent reads
- Falls back to in-memory on failure (sneezes into all rooms to signal)
- PVC already mounted at /data alongside Matrix SDK state store

New modules:
- src/persistence.rs — SQLite state store
- src/conversations.rs — ConversationRegistry + message merging
- src/agents/{mod,definitions,registry}.rs — agent lifecycle
- src/agent_ux.rs — reaction + thread progress UX
- src/tools/bridge.rs — tool dispatch for domain agents

102 tests passing.
2026-03-21 22:21:14 +00:00
4949e70ecc feat: per-user auto-memory with ResponseContext
Three memory channels: hidden tool (sol.memory.set/get in scripts),
pre-response injection (relevant memories loaded into system prompt),
and post-response extraction (ministral-3b extracts facts after each
response). User isolation enforced at Rust level — user_id derived
from Matrix sender, never from script arguments.

New modules: context (ResponseContext), memory (schema, store, extractor).
ResponseContext threaded through responder → tools → script runtime.
OpenSearch index sol_user_memory created on startup alongside archive.
2026-03-21 15:51:31 +00:00
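
The user isolation rule in 4949e70ecc can be illustrated with the sketch below: the memory API takes a context whose `user_id` is fixed when the context is built from the Matrix sender, so script-supplied arguments cannot select another user's memories. The in-memory `MemoryStore` is a stand-in for the OpenSearch-backed store, and all method names are assumptions.

```rust
use std::collections::HashMap;

/// Per-response context. `user_id` is private and set only from the sender.
pub struct ResponseContext {
    user_id: String,
}

impl ResponseContext {
    /// The only constructor: the identity comes from the Matrix event sender.
    pub fn from_matrix_sender(sender: &str) -> Self {
        Self { user_id: sender.to_string() }
    }

    pub fn user_id(&self) -> &str {
        &self.user_id
    }
}

/// In-memory stand-in for the per-user memory index.
#[derive(Default)]
pub struct MemoryStore {
    by_user: HashMap<String, HashMap<String, String>>,
}

impl MemoryStore {
    /// Note the absence of a user_id parameter: scripts pass only key/value.
    pub fn set(&mut self, ctx: &ResponseContext, key: &str, value: &str) {
        self.by_user
            .entry(ctx.user_id().to_string())
            .or_default()
            .insert(key.to_string(), value.to_string());
    }

    pub fn get(&self, ctx: &ResponseContext, key: &str) -> Option<&str> {
        self.by_user.get(ctx.user_id())?.get(key).map(String::as_str)
    }
}
```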
4dc20bee23 feat: initial Sol virtual librarian implementation
Matrix bot with E2EE (matrix-sdk 0.9) that passively archives all
messages to OpenSearch and responds to queries via Mistral AI with
function calling tools.

Core systems:
- Archive: bulk OpenSearch indexer with batch/flush, edit/redaction
  handling, embedding pipeline passthrough
- Brain: rule-based engagement evaluator (mentions, DMs, name
  invocations), LLM-powered spontaneous engagement, per-room
  conversation context windows, response delay simulation
- Tools: search_archive, get_room_context, list_rooms, get_room_members
  registered as Mistral function calling tools with iterative tool loop
- Personality: templated system prompt with Sol's librarian persona
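
The rule-based part of the engagement evaluator might look roughly like the sketch below. The rule order, the `Incoming` struct, and the three-way result are assumptions; the rules themselves (own message, DM, mention, name invocation) are the ones named in this log. Deferring everything else to the LLM-powered spontaneous path is modeled here as `MaybeRespond`.

```rust
/// Outcome of the rule-based pass.
#[derive(Debug, PartialEq, Eq)]
pub enum Engagement {
    Ignore,
    Respond,
    /// No rule matched; left to the spontaneous-engagement check.
    MaybeRespond,
}

/// Hypothetical view of an incoming Matrix message.
pub struct Incoming<'a> {
    pub sender: &'a str,
    pub body: &'a str,
    pub is_dm: bool,
    pub mentions_bot: bool,
}

pub fn evaluate(msg: &Incoming<'_>, bot_user_id: &str, bot_name: &str) -> Engagement {
    // Never react to the bot's own messages, even in a DM.
    if msg.sender == bot_user_id {
        return Engagement::Ignore;
    }
    if msg.is_dm || msg.mentions_bot {
        return Engagement::Respond;
    }
    // Name invocation: a plain case-insensitive substring check. This also
    // matches words that merely contain the name, so a real rule would
    // likely require word boundaries.
    if msg.body.to_lowercase().contains(&bot_name.to_lowercase()) {
        return Engagement::Respond;
    }
    Engagement::MaybeRespond
}
```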

47 unit tests covering config, evaluator, conversation windowing,
personality templates, schema serialization, and search query building.
2026-03-20 21:40:13 +00:00