Commit Graph

10 Commits

a11b313301 feat: Gitea repo indexing via gRPC ReindexCode endpoint
Gitea indexer (code_index/gitea.rs):
- Walks repos via GiteaClient API (list repos → traverse dirs → fetch files)
- Base64 decodes file content from Gitea API responses
- Extracts symbols with tree-sitter (Rust, TypeScript, Python)
- Indexes to sol_code OpenSearch index with repo/branch/source metadata
- Skips hidden dirs, vendor, node_modules, files >100KB
- delete_branch() for clean re-indexing

Server-side tree-sitter (code_index/symbols.rs):
- Full symbol extraction shared with CLI client
- extract_symbols(), extract_project_symbols(), detect_language()

gRPC ReindexCode RPC:
- ReindexCodeRequest: org, repo, branch (all optional filters)
- ReindexCodeResponse: repos_indexed, symbols_indexed, error
- Uses ToolRegistry's GiteaClient (already authenticated)
- Creates sol_code index if not exists

ToolRegistry.gitea_client() accessor for reindex endpoint.
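The skip rules above (hidden dirs, vendor, node_modules, files >100KB) can be sketched as a single predicate. `should_index` is a hypothetical helper name for illustration, not the actual function in `code_index/gitea.rs`:

```rust
/// Decide whether a repo file should be indexed, mirroring the
/// commit's skip rules: hidden directories, vendor/, node_modules/,
/// and files larger than 100 KB.
const MAX_FILE_BYTES: u64 = 100 * 1024;

fn should_index(path: &str, size_bytes: u64) -> bool {
    if size_bytes > MAX_FILE_BYTES {
        return false;
    }
    // Reject if any path component is hidden or a vendored directory.
    !path.split('/').any(|part| {
        part.starts_with('.') || part == "vendor" || part == "node_modules"
    })
}

fn main() {
    assert!(should_index("src/code_index/gitea.rs", 4_096));
    assert!(!should_index(".github/workflows/ci.yml", 512));
    assert!(!should_index("web/node_modules/left-pad/index.js", 128));
    assert!(!should_index("src/big_blob.rs", 200 * 1024));
}
```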
2026-03-24 09:36:42 +00:00
40a6772f99 feat: 13 e2e integration tests against real Mistral API
Orchestrator tests:
- Simple chat roundtrip with token usage verification
- Event ordering (Started → Thinking → Done)
- Metadata pass-through (opaque bag appears in Started event)
- Token usage accuracy (longer prompts → more tokens)
- Conversation continuity (multi-turn recall)
- Client-side tool dispatch + mock result submission
- Failed tool result handling (is_error: true)
- Server-side tool execution (search_web via conversation)

gRPC tests:
- Full roundtrip (StartSession → UserInput → Status → TextDone)
- Client tool relay (ToolCall → ToolResult through gRPC stream)
- Token counts in TextDone (non-zero verification)
- Session resume (same room_id, resumed flag)
- Clean disconnect (EndSession → SessionEnd)

Infrastructure:
- ToolRegistry::new_minimal() — no OpenSearch/Matrix needed
- ToolRegistry fields now Option for testability
- GrpcState.matrix now Option
- grpc_bridge moved to src/grpc/bridge.rs
- TestHarness loads API key from .env
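The "fields now Option" change is a common testability pattern: heavyweight backends become optional so tests can build the registry without external services. A minimal sketch, with illustrative field and type names (only `new_minimal()` is named in the commit):

```rust
// Stand-in types for the real clients; illustrative only.
struct OpenSearchClient;
struct MatrixClient;

struct ToolRegistry {
    opensearch: Option<OpenSearchClient>,
    matrix: Option<MatrixClient>,
}

impl ToolRegistry {
    /// Test-friendly constructor: no OpenSearch or Matrix required.
    fn new_minimal() -> Self {
        ToolRegistry { opensearch: None, matrix: None }
    }

    fn has_matrix(&self) -> bool {
        self.matrix.is_some()
    }
}

fn main() {
    let reg = ToolRegistry::new_minimal();
    assert!(!reg.has_matrix());
    assert!(reg.opensearch.is_none());
}
```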
2026-03-23 20:54:28 +00:00
9e5f7e61be feat(orchestrator): Phase 2 engine + tokenizer + tool dispatch
Orchestrator engine:
- engine.rs: unified Mistral Conversations API tool loop that emits
  OrchestratorEvent instead of calling Matrix/gRPC directly
- tool_dispatch.rs: ToolSide routing (client vs server tools)
- Memory loading stubbed (migrates in Phase 4)

Server-side tokenizer:
- tokenizer.rs: HuggingFace tokenizers-rs with Mistral's BPE tokenizer
- count_tokens() for accurate usage metrics
- Loads from local tokenizer.json or falls back to bundled vocab
- Config: mistral.tokenizer_path (optional)

No behavior change — engine is wired but not yet called from
sync.rs or session.rs (Phase 2 continuation).
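The event-emitting engine shape can be sketched as follows: the loop yields values that a transport layer (Matrix or gRPC) renders, rather than calling either directly. Variant names are assumptions drawn from the e2e test descriptions (Started → Thinking → Done); the real enum lives in `engine.rs`:

```rust
// Illustrative event enum; the engine emits these instead of
// touching Matrix/gRPC itself.
#[derive(Debug)]
enum OrchestratorEvent {
    Started { metadata: String },
    Thinking,
    ToolCall { name: String, args: String },
    Done { prompt_tokens: u32, completion_tokens: u32 },
}

// Stubbed turn: the real loop drives the Mistral Conversations API.
fn run_turn(user_input: &str) -> Vec<OrchestratorEvent> {
    vec![
        OrchestratorEvent::Started { metadata: String::new() },
        OrchestratorEvent::Thinking,
        OrchestratorEvent::Done {
            prompt_tokens: user_input.len() as u32 / 4, // rough stand-in
            completion_tokens: 0,
        },
    ]
}

fn main() {
    let events = run_turn("hello");
    assert!(matches!(events.first(), Some(OrchestratorEvent::Started { .. })));
    assert!(matches!(events.last(), Some(OrchestratorEvent::Done { .. })));
}
```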
2026-03-23 17:40:25 +00:00
35b6246fa7 feat(code): gRPC server with JWT auth + tool routing
tonic 0.14 gRPC server for sunbeam code sessions:
- bidirectional streaming Session RPC
- JWT interceptor validates tokens against Hydra JWKS
- tool router classifies calls as client-side (file_read, bash,
  grep, etc.) or server-side (gitea, identity, search, etc.)
- service stub with session lifecycle (start, chat, tool results, end)
- coding_model config (default: devstral-small-2506)
- grpc config section (listen_addr, jwks_url)
- 182 tests (5 new: JWT claims, tool routing)

phase 2 TODOs: Matrix room bridge, Mistral agent loop, streaming
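The tool router's client/server classification can be sketched as a match over tool names. The names below come straight from the commit; the function name and the full routing table are illustrative:

```rust
// Client-side tools run on the connected client; everything else
// executes server-side (gitea, identity, search, ...).
#[derive(Debug, PartialEq)]
enum ToolSide {
    Client,
    Server,
}

fn tool_side(name: &str) -> ToolSide {
    match name {
        "file_read" | "file_write" | "bash" | "grep" => ToolSide::Client,
        _ => ToolSide::Server,
    }
}

fn main() {
    assert_eq!(tool_side("bash"), ToolSide::Client);
    assert_eq!(tool_side("search_web"), ToolSide::Server);
}
```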
2026-03-23 11:35:37 +00:00
de33ddfe33 multi-agent research: parallel LLM-powered investigation
new research tool spawns 3-25 micro-agents (ministral-3b) in
parallel via futures::join_all. each agent gets its own Mistral
conversation with full tool access.

recursive spawning up to depth 4 — agents can spawn sub-agents.
research sessions persisted in SQLite (survive reboots).
thread UX: 🔍 reaction, per-agent progress posts, and a completion marker when done.

cost: ~$0.03 per research task (20 micro-agents on ministral-3b).
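the spawn limits above (3-25 agents, recursion to depth 4) reduce to two small guards. function names are illustrative, and whether depth 4 is inclusive or exclusive is an assumption:

```rust
// Fan-out clamped to the commit's stated range; recursion capped so
// agents can spawn sub-agents only up to depth 4 (assumed exclusive).
const MIN_AGENTS: usize = 3;
const MAX_AGENTS: usize = 25;
const MAX_DEPTH: u8 = 4;

fn clamp_fanout(requested: usize) -> usize {
    requested.clamp(MIN_AGENTS, MAX_AGENTS)
}

fn can_spawn(depth: u8) -> bool {
    depth < MAX_DEPTH
}

fn main() {
    assert_eq!(clamp_fanout(1), 3);
    assert_eq!(clamp_fanout(100), 25);
    assert!(can_spawn(3));
    assert!(!can_spawn(4));
}
```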
2026-03-23 01:42:40 +00:00
822a597a87 sol 1.0.0 — relicense to AGPL-3.0, dual-license model
relicensed from MIT to AGPL-3.0-or-later. commercial license
available for organizations that need private modifications
(does not permit redistribution).

sol 1.0.0 ships with:
- multi-agent architecture with mistral conversations API
- user impersonation via vault-backed PAT provisioning
- gitea integration (first domain agent: devtools)
- per-user memory system with automatic extraction
- full-context evaluator with system prompt awareness
- agent recreation on prompt changes with conversation reset
- web search, sandboxed deno runtime, archive search
2026-03-22 15:24:02 +00:00
7580c10dda feat: multi-agent architecture with Conversations API and persistent state
Mistral Agents + Conversations API integration:
- Orchestrator agent created on startup with Sol's personality + tools
- ConversationRegistry routes messages through persistent conversations
- Per-room conversation state (room_id → conversation_id + token counts)
- Function call handling within conversation responses
- Configurable via [agents] section in sol.toml (use_conversations_api flag)

Multimodal support:
- m.image detection and Matrix media download (mxc:// → base64 data URI)
- ContentPart-based messages sent to Mistral vision models
- Archive stores media_urls for image messages

System prompt rewrite:
- 687 → 150 lines — dense, few-shot examples, hard rules
- {room_context_rules} placeholder for group vs DM behavior
- Sender prefixing (<@user:server>) for multi-user turns in group rooms

SQLite persistence (/data/sol.db):
- Conversation mappings and agent IDs survive reboots
- WAL mode for concurrent reads
- Falls back to in-memory on failure (sneezes into all rooms to signal)
- PVC already mounted at /data alongside Matrix SDK state store

New modules:
- src/persistence.rs — SQLite state store
- src/conversations.rs — ConversationRegistry + message merging
- src/agents/{mod,definitions,registry}.rs — agent lifecycle
- src/agent_ux.rs — reaction + thread progress UX
- src/tools/bridge.rs — tool dispatch for domain agents

102 tests passing.
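The per-room conversation state (room_id → conversation_id + token counts) can be sketched as a map-backed registry, with the SQLite persistence elided. Struct and method names are illustrative:

```rust
use std::collections::HashMap;

#[derive(Default)]
struct RoomState {
    conversation_id: Option<String>,
    total_tokens: u64,
}

#[derive(Default)]
struct ConversationRegistry {
    rooms: HashMap<String, RoomState>,
}

impl ConversationRegistry {
    // Bind a Matrix room to a persistent Mistral conversation.
    fn bind(&mut self, room_id: &str, conversation_id: &str) {
        self.rooms.entry(room_id.to_string()).or_default().conversation_id =
            Some(conversation_id.to_string());
    }

    fn add_tokens(&mut self, room_id: &str, n: u64) {
        self.rooms.entry(room_id.to_string()).or_default().total_tokens += n;
    }

    fn conversation_for(&self, room_id: &str) -> Option<&str> {
        self.rooms.get(room_id)?.conversation_id.as_deref()
    }
}

fn main() {
    let mut reg = ConversationRegistry::default();
    reg.bind("!room:example.org", "conv_123");
    reg.add_tokens("!room:example.org", 42);
    assert_eq!(reg.conversation_for("!room:example.org"), Some("conv_123"));
    assert_eq!(reg.conversation_for("!other:example.org"), None);
}
```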
2026-03-21 22:21:14 +00:00
5e2186f324 add README, MIT license, and package metadata 2026-03-21 15:53:49 +00:00
4949e70ecc feat: per-user auto-memory with ResponseContext
Three memory channels: hidden tool (sol.memory.set/get in scripts),
pre-response injection (relevant memories loaded into system prompt),
and post-response extraction (ministral-3b extracts facts after each
response). User isolation enforced at Rust level — user_id derived
from Matrix sender, never from script arguments.

New modules: context (ResponseContext), memory (schema, store, extractor).
ResponseContext threaded through responder → tools → script runtime.
OpenSearch index sol_user_memory created on startup alongside archive.
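The isolation rule can be sketched as: the memory namespace is derived from the Matrix sender carried in ResponseContext, so script arguments can never select another user's store. Field and function names here are illustrative:

```rust
// Sender is set from the Matrix event on the Rust side; scripts
// never supply it.
struct ResponseContext {
    sender: String, // e.g. "@alice:example.org"
}

fn memory_key(ctx: &ResponseContext, fact_key: &str) -> String {
    // Script arguments never influence the namespace.
    format!("{}::{}", ctx.sender, fact_key)
}

fn main() {
    let ctx = ResponseContext { sender: "@alice:example.org".into() };
    assert_eq!(
        memory_key(&ctx, "favorite_editor"),
        "@alice:example.org::favorite_editor"
    );
}
```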
2026-03-21 15:51:31 +00:00
4dc20bee23 feat: initial Sol virtual librarian implementation
Matrix bot with E2EE (matrix-sdk 0.9) that passively archives all
messages to OpenSearch and responds to queries via Mistral AI with
function calling tools.

Core systems:
- Archive: bulk OpenSearch indexer with batch/flush, edit/redaction
  handling, embedding pipeline passthrough
- Brain: rule-based engagement evaluator (mentions, DMs, name
  invocations), LLM-powered spontaneous engagement, per-room
  conversation context windows, response delay simulation
- Tools: search_archive, get_room_context, list_rooms, get_room_members
  registered as Mistral function calling tools with iterative tool loop
- Personality: templated system prompt with Sol's librarian persona

47 unit tests covering config, evaluator, conversation windowing,
personality templates, schema serialization, and search query building.
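The batch/flush pattern in the Archive can be sketched as a buffer that flushes when it hits a threshold. The OpenSearch `_bulk` call is stubbed out, and names and threshold are illustrative:

```rust
struct BulkIndexer {
    batch: Vec<String>,
    batch_size: usize,
    flushed: usize, // documents sent so far
}

impl BulkIndexer {
    fn new(batch_size: usize) -> Self {
        BulkIndexer { batch: Vec::new(), batch_size, flushed: 0 }
    }

    fn push(&mut self, doc: String) {
        self.batch.push(doc);
        if self.batch.len() >= self.batch_size {
            self.flush();
        }
    }

    fn flush(&mut self) {
        // Real code sends a _bulk request to OpenSearch here.
        self.flushed += self.batch.len();
        self.batch.clear();
    }
}

fn main() {
    let mut idx = BulkIndexer::new(2);
    idx.push("a".into());
    assert_eq!(idx.flushed, 0);
    idx.push("b".into()); // hits the threshold, triggers a flush
    assert_eq!(idx.flushed, 2);
    idx.push("c".into());
    idx.flush();
    assert_eq!(idx.flushed, 3);
}
```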
2026-03-20 21:40:13 +00:00