Commit Graph

6 Commits

Author SHA1 Message Date
f338444087 feat: matrix room, room_info, research execute, bootstrap improvements
- bootstrap: create integration test room in Tuwunel, send bootstrap
  message, print room ID in summary
- room_info: list_rooms and get_room_members against live Tuwunel
- research: execute with empty tasks against real Matrix room + Mistral
- identity: fix flaky list_users_tool test (use search instead of
  unbounded list to avoid pagination)
2026-03-24 15:38:12 +00:00
5dc739b800 feat: integration test suite — 416 tests, 61% coverage
Add OpenBao and Kratos to docker-compose dev stack with bootstrap
seeding. Full integration tests hitting real services:

- Vault SDK: KV read/write/delete, re-auth on bad token, new_with_token
  constructor for dev mode
- Kratos SDK: list/get/create/disable/enable users, session listing
- Token store: PAT lifecycle with OpenBao backing, expiry handling
- Identity tools: full tool dispatch through Kratos admin API
- Gitea SDK: resolve_username, ensure_token (PAT auto-provisioning),
  list/get repos, issues, comments, branches, file content
- Devtools: tool dispatch for all gitea_* tools against live Gitea
- Archive indexer: batch flush, periodic flush task, edit/redact/reaction
  updates against OpenSearch
- Memory store: set/query/get_recent with user scoping in OpenSearch
- Room history: context retrieval by timestamp and event_id, access
  control enforcement
- Search archive: keyword search with room/sender filters, room scoping
- Code search: language filter, repo filter, branch scoping
- Breadcrumbs: symbol retrieval, empty index handling, token budget
- Bridge: full event lifecycle mapping, request ID filtering
- Evaluator: DM/mention/silence short-circuits, LLM evaluation path,
  reply-to-human suppression
- Agent registry: list/get_id, prompt reuse, prompt-change recreation
- Conversations: token tracking, multi-turn context recall, room
  isolation

Bug fixes caught by tests:
- AgentRegistry in-memory cache skipped hash comparison on prompt change
- KratosClient::set_state sent bare PUT without traits (400 error)
- find_code_session returns None on NULL conversation_id
2026-03-24 14:34:03 +00:00
4528739a5f feat: deterministic Gitea integration tests + mutation lifecycle
Bootstrap:
- Creates test issue + comment on studio/sol for deterministic test data
- Mirrors 6 real repos from src.sunbeam.pt

Devtools tests (13, all deterministic):
- Read: list_repos, get_repo, get_file, list_branches, list_issues,
  list_pulls, list_comments, list_notifications, list_org_repos,
  get_org, unknown_tool
- Mutation lifecycle: create_repo → create_issue → create_comment →
  create_branch → create_pull → get_pull → edit_issue →
  delete_branch → cleanup (all arg names verified against tool impls)

Additional tests:
- Script sandbox: basic math, string manipulation, JSON output
- Archive search: arg parsing, OpenSearch query
- Persistence: agent CRUD, service user CRUD
- gRPC bridge: event filtering, tool mapping
2026-03-24 12:45:01 +00:00
495c465a01 refactor: remove legacy responder + agent_ux, add Gitea integration tests
Legacy removal:
- DELETE src/brain/responder.rs (900 lines) — replaced by orchestrator
- DELETE src/agent_ux.rs (184 lines) — UX moved to transport bridges
- EXTRACT chat_blocking() to src/brain/chat.rs (standalone utility)
- sync.rs: uses ConversationRegistry directly (no responder)
- main.rs: holds ToolRegistry + Personality directly (no Responder wrapper)
- research.rs: progress updates via tracing (no AgentProgress)

Gitea integration testing:
- docker-compose: added Gitea service with healthcheck
- bootstrap-gitea.sh: creates admin, org, mirrors 6 real repos from
  src.sunbeam.pt (sol, cli, proxy, storybook, admin-ui, mistralai-client-rs)
- PAT provisioning for SDK testing without Vault
- code_index/gitea.rs: fixed directory listing (direct API calls instead
  of SDK's single-object parser), proper base64 file decoding

New integration tests:
- Gitea: list_repos, get_repo, get_file, directory listing, code indexing
- Web search: SearXNG query with result verification
- Conversation registry: lifecycle + send_message round-trip
- Evaluator: rule matching (DM, own message)
- gRPC bridge: event filtering, tool call mapping, thinking→status
2026-03-24 11:45:43 +00:00
9e5f7e61be feat(orchestrator): Phase 2 engine + tokenizer + tool dispatch
Orchestrator engine:
- engine.rs: unified Mistral Conversations API tool loop that emits
  OrchestratorEvent instead of calling Matrix/gRPC directly
- tool_dispatch.rs: ToolSide routing (client vs server tools)
- Memory loading stubbed (migrates in Phase 4)

Server-side tokenizer:
- tokenizer.rs: HuggingFace tokenizers-rs with Mistral's BPE tokenizer
- count_tokens() for accurate usage metrics
- Loads from local tokenizer.json or falls back to bundled vocab
- Config: mistral.tokenizer_path (optional)

No behavior change — engine is wired but not yet called from
sync.rs or session.rs (Phase 2 continuation).
2026-03-23 17:40:25 +00:00
b8b76687a5 feat(grpc): dev mode, agent prefix, system prompt, error UX
- gRPC dev_mode config: disables JWT auth, uses fixed dev identity
- Agent prefix (agents.agent_prefix): dev agents use "dev-sol-orchestrator"
  to avoid colliding with production on shared Mistral accounts
- Coding sessions use instructions (system prompt + coding addendum)
  with mistral-medium-latest for personality adherence
- Conversations API: don't send both model + agent_id (422 fix)
- GrpcState carries system_prompt + orchestrator_agent_id
- Session.end() keeps session active for reuse (not "ended")
- User messages posted as m.notice, assistant as m.text (role detection)
- History loaded from Matrix room on session resume
- Docker Compose local dev stack: OpenSearch 3 + Tuwunel + SearXNG
- Dev config: localhost URLs, dev_mode, opensearch-init.sh for ML setup
2026-03-23 17:07:50 +00:00