# LSP Testing Strategy

**Version:** 1.0

**Date:** 2026-02-12

**Owner:** LSP Test Engineer

---
## Executive Summary

This document outlines a comprehensive testing strategy for the Storybook LSP implementation, designed to achieve **85% code coverage** and **zero critical bugs** within 4-6 weeks. The strategy is divided into 5 phases, prioritized to address immediate blocking issues first, followed by coverage expansion, integration testing, automation, and finally multi-file support.

---
## Strategy Goals

### Primary Objectives

1. **Fix Critical Bugs:** Resolve 2 compilation errors blocking all tests
2. **Achieve Stability:** 100% of tests pass reliably
3. **Expand Coverage:** Reach 80%+ code coverage across all LSP features
4. **Enable Automation:** Set up CI/CD for continuous testing
5. **Support Multi-File:** Prepare infrastructure for cross-file navigation

### Success Metrics

- **Week 1:** All tests compile and run (100% pass rate)
- **Week 2:** All features have dedicated unit tests (70% coverage)
- **Week 3:** Integration tests complete (80% coverage)
- **Week 4:** CI/CD operational, benchmarks established
- **Weeks 5-6:** Multi-file support tested (85% coverage)

---
## Testing Pyramid

```
            /\
           /  \        E2E Tests (5%)
          / E2E\       - Zed extension integration
         /______\      - Real-world scenarios
        /        \
       /  INTEG   \    Integration Tests (15%)
      /____________\   - Multi-file scenarios
     /              \  - Concurrent access
    /   UNIT TESTS   \ - Document lifecycle
   /__________________\
                       Unit Tests (80%)
                       - Feature-specific tests
                       - Edge cases
                       - Error conditions
```

**Rationale:**

- **80% Unit Tests** - Fast feedback, easy to debug, high confidence in individual components
- **15% Integration Tests** - Validate feature interactions, realistic scenarios
- **5% E2E Tests** - Validate actual editor integration (Zed)

---
## Phase 1: Stabilization (Days 1-2)

### Objective

Fix blocking compilation errors and establish baseline test health.

### Tasks

#### 1.1 Fix Compilation Errors

**Priority:** CRITICAL

**Estimated Time:** 30 minutes

**Bug #1: Inlay Hints Import Path**

- **File:** `storybook/src/lsp/inlay_hints.rs:134`
- **Current:** `crate::project::positions::PositionTracker`
- **Fix:** `crate::position::PositionTracker`
- **Verification:** `cargo check`

**Bug #2: Completion Type Annotation**

- **File:** `storybook/src/lsp/completion.rs:421`
- **Current:** `let mut nesting_level = 0;`
- **Fix:** `let mut nesting_level: i32 = 0;`
- **Verification:** `cargo check`

#### 1.2 Baseline Test Run

**Priority:** CRITICAL

**Estimated Time:** 1 hour

```bash
# Compile tests
cd /Users/sienna/Development/storybook/storybook
cargo test --lib lsp --no-run

# Run all LSP tests
cargo test --lib lsp

# Document results
# - Total tests run
# - Pass/fail counts
# - Any flaky tests
# - Performance outliers
```

**Deliverable:** Test baseline report

- Total test count (expect ~147)
- Pass rate (target: 100%)
- Test execution time
- Flaky test identification

#### 1.3 Tree-sitter Verification

**Priority:** HIGH

**Estimated Time:** 30 minutes

```bash
cd /Users/sienna/Development/storybook/tree-sitter-storybook
npm run test
```

**Verification:** All 27 tests pass

### Phase 1 Deliverables

- ✅ All compilation errors fixed
- ✅ All tests compile successfully
- ✅ Baseline test report with 100% pass rate
- ✅ Tree-sitter tests verified
- ✅ List of any identified issues for Phase 2

---
## Phase 2: Coverage Expansion (Days 3-7)

### Objective

Fill test coverage gaps for under-tested features and add edge case testing.

### 2.1 Feature-Specific Unit Tests

#### 2.1.1 Hover Tests

**File:** Create `storybook/src/lsp/hover_tests.rs`

**Estimated Time:** 3 hours

**Target:** 15 tests

**Test Categories:**

1. **Basic Hover** (3 tests)
   - Hover on character name
   - Hover on template name
   - Hover on behavior name

2. **Type Information** (4 tests)
   - Hover on field shows type
   - Hover on species reference
   - Hover on template reference
   - Hover on behavior reference

3. **Documentation** (4 tests)
   - Hover shows field documentation
   - Hover shows prose blocks
   - Hover on state shows transitions
   - Hover on schedule shows time blocks

4. **Edge Cases** (4 tests)
   - Hover on whitespace (returns None)
   - Hover on comment (returns None)
   - Hover at EOF (returns None)
   - Hover on invalid position (returns None)

**Coverage Target:** 85%
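The edge-case tests above all assert `None` for non-symbol positions. A minimal sketch of the pattern under test — `hover_at` here is a hypothetical stand-in for the real hover provider (which resolves symbols via the parse tree), and the `//` comment syntax is an assumption for illustration only:

```rust
/// Hypothetical stand-in for the real hover provider: returns the word
/// under (`line`, `col`), or None for whitespace, comments, and
/// out-of-range positions.
fn hover_at(source: &str, line: usize, col: usize) -> Option<String> {
    let text = source.lines().nth(line)?; // past EOF -> None
    if text.trim_start().starts_with("//") {
        return None; // hovering inside a comment yields nothing
    }
    let bytes = text.as_bytes();
    if col >= bytes.len() || !(bytes[col].is_ascii_alphanumeric() || bytes[col] == b'_') {
        return None; // whitespace, punctuation, or past end of line
    }
    // expand outward to word boundaries
    let start = text[..col]
        .rfind(|c: char| !(c.is_alphanumeric() || c == '_'))
        .map_or(0, |i| i + 1);
    let end = text[col..]
        .find(|c: char| !(c.is_alphanumeric() || c == '_'))
        .map_or(text.len(), |i| col + i);
    Some(text[start..end].to_string())
}

fn main() {
    let src = "character Alice {\n// a comment\n}";
    assert_eq!(hover_at(src, 0, 10), Some("Alice".to_string())); // on the name
    assert_eq!(hover_at(src, 0, 9), None); // whitespace
    assert_eq!(hover_at(src, 1, 3), None); // comment
    assert_eq!(hover_at(src, 5, 0), None); // past EOF
    println!("ok");
}
```

Each edge-case test then reduces to one assertion against a fixture document, which keeps the suite fast and each failure unambiguous.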
#### 2.1.2 Formatting Tests

**File:** Create `storybook/src/lsp/formatting_tests.rs`

**Estimated Time:** 3 hours

**Target:** 15 tests

**Test Categories:**

1. **Indentation** (5 tests)
   - Character block indentation
   - Template block indentation
   - Nested behavior indentation
   - Life arc state indentation
   - Consistent tab/space usage

2. **Spacing** (4 tests)
   - Spacing around colons
   - Spacing around braces
   - Line breaks between declarations
   - Trailing whitespace removal

3. **Alignment** (3 tests)
   - Field alignment in blocks
   - Comment alignment
   - Multiline value alignment

4. **Edge Cases** (3 tests)
   - Empty document formatting
   - Already-formatted document (no changes)
   - Document with syntax errors (graceful handling)

**Coverage Target:** 80%
#### 2.1.3 Rename Tests

**File:** Create `storybook/src/lsp/rename_tests.rs`

**Estimated Time:** 2 hours

**Target:** 12 tests

**Test Categories:**

1. **Basic Rename** (4 tests)
   - Rename character (updates all references)
   - Rename template (updates all uses)
   - Rename behavior (updates all calls)
   - Rename field (updates all occurrences)

2. **Scope Testing** (3 tests)
   - Rename doesn't affect different scope
   - Rename preserves capitalization context
   - Rename updates definition + all uses

3. **Validation** (3 tests)
   - Reject invalid identifier names
   - Reject rename to existing symbol
   - Reject rename of built-in keywords

4. **Edge Cases** (2 tests)
   - Rename at EOF
   - Rename in comment (should fail gracefully)

**Coverage Target:** 80%
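The validation category needs a predicate for legal rename targets. A minimal sketch, assuming Storybook identifiers follow the common letter-or-underscore-then-alphanumerics rule (the real grammar may differ — this is the assumption to verify against the tree-sitter grammar):

```rust
/// Hypothetical validity check for rename targets. Assumes an identifier
/// must start with a letter or underscore and continue with letters,
/// digits, or underscores; adjust to match the real Storybook grammar.
fn is_valid_identifier(name: &str) -> bool {
    let mut chars = name.chars();
    match chars.next() {
        Some(c) if c.is_alphabetic() || c == '_' => {}
        _ => return false, // empty, or starts with a digit/symbol
    }
    chars.all(|c| c.is_alphanumeric() || c == '_')
}

fn main() {
    assert!(is_valid_identifier("Alice"));
    assert!(is_valid_identifier("alice_2"));
    assert!(is_valid_identifier("爱丽丝")); // unicode letters count as alphabetic
    assert!(!is_valid_identifier("2fast")); // starts with a digit
    assert!(!is_valid_identifier("")); // empty
    assert!(!is_valid_identifier("a-b")); // hyphen not allowed
    println!("ok");
}
```

Using `char::is_alphabetic` keeps the check consistent with the unicode identifier tests in section 2.2.2.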
#### 2.1.4 Semantic Tokens Tests

**File:** Create `storybook/src/lsp/semantic_tokens_tests.rs`

**Estimated Time:** 3 hours

**Target:** 15 tests

**Test Categories:**

1. **Token Types** (7 tests)
   - Character tokens (type.character)
   - Template tokens (type.template)
   - Behavior tokens (type.behavior)
   - Keyword tokens (keyword.declaration)
   - Field tokens (property)
   - String tokens (string)
   - Number tokens (constant.numeric)

2. **Token Modifiers** (3 tests)
   - Definition modifier
   - Reference modifier
   - Deprecated modifier (if applicable)

3. **Complex Scenarios** (3 tests)
   - Nested structures (correct scoping)
   - Multiline declarations
   - Mixed token types in single line

4. **Edge Cases** (2 tests)
   - Empty document (no tokens)
   - Syntax errors (partial tokenization)

**Coverage Target:** 80%
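These tests ultimately assert on the LSP wire format: semantic tokens are sent as a flat `u32` array in which each token's line is delta-encoded against the previous token, and its start column is delta-encoded only when both tokens share a line (this encoding is defined by the LSP specification). A sketch of the encoder the tests would exercise:

```rust
/// Encode (line, start, length, token_type, modifiers) tuples into the
/// LSP semantic-tokens wire format: delta-encoded line, and start column
/// relative to the previous token when on the same line.
fn encode_tokens(tokens: &[(u32, u32, u32, u32, u32)]) -> Vec<u32> {
    let (mut prev_line, mut prev_start) = (0u32, 0u32);
    let mut out = Vec::with_capacity(tokens.len() * 5);
    for &(line, start, len, ttype, mods) in tokens {
        let delta_line = line - prev_line;
        let delta_start = if delta_line == 0 { start - prev_start } else { start };
        out.extend_from_slice(&[delta_line, delta_start, len, ttype, mods]);
        prev_line = line;
        prev_start = start;
    }
    out
}

fn main() {
    // "character Alice": keyword at (0,0) len 9, name at (0,10) len 5
    let encoded = encode_tokens(&[(0, 0, 9, 0, 0), (0, 10, 5, 1, 0)]);
    assert_eq!(encoded, vec![0, 0, 9, 0, 0, 0, 10, 5, 1, 0]);
    // Empty document: no tokens at all.
    assert_eq!(encode_tokens(&[]), Vec::<u32>::new());
    println!("ok");
}
```

The "empty document" edge case falls out naturally: an empty token list encodes to an empty array, never to a degenerate 5-tuple.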
### 2.2 Edge Case Testing Suite

#### 2.2.1 Large File Tests

**File:** Create `storybook/src/lsp/stress_tests.rs`

**Estimated Time:** 4 hours

**Target:** 10 tests

**Test Categories:**

1. **Size Tests** (4 tests)
   - 1,000 line document
   - 10,000 line document
   - 50,000 line document
   - Document with 1,000+ symbols

2. **Depth Tests** (3 tests)
   - 10-level nested behaviors
   - 20-level nested behaviors
   - Deeply nested template includes

3. **Performance Tests** (3 tests)
   - Parse time < 100ms for 1,000 lines
   - Symbol extraction < 50ms
   - Completion latency < 20ms

**Coverage Target:** N/A (performance validation)
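The timing thresholds above can be asserted directly with `std::time::Instant` over synthetically generated documents. A sketch — `parse_stub` is a hypothetical stand-in; the real test would call the tree-sitter parse entry point:

```rust
use std::time::Instant;

/// Hypothetical stand-in for parsing: counts declarations line by line.
/// The real stress test would invoke the actual parser instead.
fn parse_stub(source: &str) -> usize {
    source
        .lines()
        .filter(|l| l.trim_start().starts_with("character "))
        .count()
}

fn main() {
    // Generate a synthetic 1,000-line document.
    let source: String = (0..1_000)
        .map(|i| format!("character Char{} {{}}\n", i))
        .collect();

    let start = Instant::now();
    let symbols = parse_stub(&source);
    let elapsed = start.elapsed();

    assert_eq!(symbols, 1_000);
    // Budget from the strategy: parse < 100ms for 1,000 lines.
    assert!(elapsed.as_millis() < 100, "parse took {:?}", elapsed);
    println!("ok");
}
```

Generating fixtures programmatically (rather than checking in 50,000-line files) keeps the repository small and lets the size tests scale the line count with one parameter.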
#### 2.2.2 Unicode and Special Characters

**File:** Add to `storybook/src/lsp/document_edge_tests.rs`

**Estimated Time:** 2 hours

**Target:** 8 tests

**Test Categories:**

1. **Unicode Identifiers** (3 tests)
   - Chinese character names: `character 爱丽丝 {}`
   - Emoji in prose: `---backstory\n😊\n---`
   - Mixed scripts: `character Αλίκη {}`

2. **Special Characters** (3 tests)
   - Underscores in identifiers
   - Hyphens in strings
   - Escape sequences in strings

3. **Boundary Conditions** (2 tests)
   - Zero-width characters
   - RTL text in prose blocks

**Coverage Target:** 85%
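These tests matter because LSP positions default to UTF-16 code units, while Rust strings index by UTF-8 bytes — and the byte, char, and UTF-16 lengths all disagree on exactly the inputs listed above. A quick demonstration of why a byte offset must never be used as an LSP column:

```rust
fn main() {
    let cjk = "character 爱丽丝 {}";
    // Byte, char, and UTF-16 lengths differ for CJK text.
    assert_eq!(cjk.len(), 22); // UTF-8 bytes (3 per CJK char)
    assert_eq!(cjk.chars().count(), 16); // Unicode scalar values
    assert_eq!(cjk.encode_utf16().count(), 16); // BMP chars: 1 unit each

    let emoji = "😊";
    assert_eq!(emoji.len(), 4); // 4 UTF-8 bytes
    assert_eq!(emoji.chars().count(), 1); // one scalar value
    assert_eq!(emoji.encode_utf16().count(), 2); // surrogate pair: 2 units
    println!("ok");
}
```

Any position-mapping helper (such as `PositionTracker` from Bug #1) should be tested against both a BMP-only string and a surrogate-pair emoji, since the two cases fail independently.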
#### 2.2.3 Malformed Input Tests

**File:** Create `storybook/src/lsp/error_recovery_tests.rs`

**Estimated Time:** 3 hours

**Target:** 13 tests

**Test Categories:**

1. **Truncated Files** (4 tests)
   - Incomplete character block
   - Unclosed prose block
   - Missing closing brace
   - Truncated at field value

2. **Invalid UTF-8** (2 tests)
   - Invalid byte sequences
   - Null bytes in content

3. **Syntax Errors** (4 tests)
   - Missing colons in fields
   - Invalid identifiers (starting with numbers)
   - Unmatched braces
   - Invalid keywords

4. **Graceful Degradation** (3 tests)
   - Partial symbol extraction on errors
   - Diagnostics still generated
   - Server doesn't crash

**Coverage Target:** 90%
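The graceful-degradation requirement can be sketched with a toy extractor that keeps scanning after an error instead of bailing out. `extract_symbols` here is a hypothetical stand-in for the real tree-sitter-backed extractor; the point is the behavior, not the implementation:

```rust
/// Hypothetical stand-in for symbol extraction: collects character names
/// even when blocks are unterminated, mirroring how the real extractor
/// should degrade on parse errors rather than panic.
fn extract_symbols(source: &str) -> Vec<String> {
    source
        .lines()
        .filter_map(|line| line.trim_start().strip_prefix("character "))
        .filter_map(|rest| rest.split_whitespace().next())
        .map(str::to_string)
        .collect()
}

fn main() {
    // Truncated input: Alice's block is unclosed, Bob's has no body at all.
    let truncated = "character Alice {\n  age: 10\ncharacter Bob";
    let symbols = extract_symbols(truncated);
    assert_eq!(symbols, vec!["Alice", "Bob"]); // partial extraction still works
    println!("ok");
}
```

The real tests would additionally assert that diagnostics are produced for the unclosed block and that the request handler returns an error response rather than panicking.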
### Phase 2 Deliverables

- ✅ 6 new test modules plus additions to an existing one (~88 new tests)
- ✅ Coverage increased from ~60% to 80%
- ✅ All features have dedicated unit tests
- ✅ Edge cases comprehensively tested

---
## Phase 3: Integration Testing (Days 8-12)

### Objective

Validate feature interactions and real-world scenarios.

### 3.1 Multi-File Scenarios

#### 3.1.1 Multi-File Test Suite

**File:** Create `storybook/src/lsp/integration_tests.rs`

**Estimated Time:** 6 hours

**Target:** 15 tests

**Test Categories:**

1. **Use Declarations** (3 tests)
   - Import character from another file
   - Import template from another file
   - Wildcard imports (`use characters::*`)

2. **Cross-File References** (4 tests)
   - Template include from another file
   - Species reference from another file
   - Behavior subtree reference (`@module::tree`)
   - Relationship participants from other files

3. **Workspace State** (3 tests)
   - Workspace rebuild on file add
   - Workspace rebuild on file remove
   - Workspace rebuild on file modify

4. **Symbol Resolution** (3 tests)
   - Go-to-definition across files (even if not fully supported)
   - Find references across files (even if not fully supported)
   - Completion shows cross-file symbols (even if not fully supported)

5. **Edge Cases** (2 tests)
   - Circular dependencies
   - Missing import targets

**Note:** These tests document **expected behavior** even if cross-file support isn't fully implemented. They serve as regression tests for when Phase 5 is implemented.

**Coverage Target:** Establish baseline for multi-file features
#### 3.1.2 Concurrent Access Tests

**File:** Create `storybook/src/lsp/concurrency_tests.rs`

**Estimated Time:** 4 hours

**Target:** 8 tests

**Test Categories:**

1. **Multiple Clients** (3 tests)
   - Two clients open same document
   - Simultaneous edits from different clients
   - One client closes while other edits

2. **Rapid Operations** (3 tests)
   - Rapid did_change events
   - Open/close/reopen sequences
   - Completion requests during editing

3. **Locking** (2 tests)
   - No deadlocks on concurrent reads
   - Write lock releases properly

**Coverage Target:** Validate concurrency safety
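The locking tests can be smoke-tested with plain `std` primitives. This sketch runs several reader threads and one writer against a `RwLock`-guarded document store, mirroring in miniature what the dedicated suite would do against the real server state (the `String` store is a stand-in, not the actual state type):

```rust
use std::sync::{Arc, RwLock};
use std::thread;

fn main() {
    // Shared "document store", as the server keeps behind a RwLock.
    let doc = Arc::new(RwLock::new(String::from("character Alice {}")));

    let mut handles = Vec::new();
    // Eight concurrent readers: shared read locks must not block each other.
    for _ in 0..8 {
        let doc = Arc::clone(&doc);
        handles.push(thread::spawn(move || doc.read().unwrap().len()));
    }
    // One writer: must acquire exclusively and then release the lock.
    {
        let doc = Arc::clone(&doc);
        handles.push(thread::spawn(move || {
            let mut guard = doc.write().unwrap();
            guard.push_str("\ncharacter Bob {}");
            guard.len()
        }));
    }
    for h in handles {
        assert!(h.join().unwrap() > 0); // every thread finished: no deadlock
    }
    // Write lock was released: a final read succeeds and sees the edit.
    assert!(doc.read().unwrap().contains("Alice"));
    println!("ok");
}
```

For real confidence the suite should also run under a thread sanitizer (as noted in Risk Management), since a test like this only catches deadlocks it happens to provoke.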
### 3.2 Document Lifecycle Tests

#### 3.2.1 Lifecycle Integration

**File:** Add to `storybook/src/lsp/integration_tests.rs`

**Estimated Time:** 3 hours

**Target:** 10 tests

**Test Categories:**

1. **Open/Edit/Close** (4 tests)
   - Normal lifecycle (open → edit → close)
   - Multiple edits before close
   - Close without edit
   - Reopen after close

2. **State Consistency** (3 tests)
   - Symbols update on edit
   - Diagnostics update on edit
   - Workspace rebuilds on change

3. **Error Recovery** (3 tests)
   - Server survives parse errors
   - Server survives malformed edits
   - Server recovers from invalid positions

**Coverage Target:** 90% of document.rs
### Phase 3 Deliverables

- ✅ Integration test suite (33 tests)
- ✅ Multi-file behavior documented
- ✅ Concurrency safety validated
- ✅ Document lifecycle coverage > 90%

---
## Phase 4: Automation & Performance (Days 13-15)

### Objective

Enable automated testing and establish performance baselines.

### 4.1 CI/CD Setup

#### 4.1.1 GitHub Actions Workflow

**File:** Create `.github/workflows/lsp-tests.yml`

**Estimated Time:** 4 hours

**Workflow Components:**

1. **Test Jobs**
   - Rust version: stable, nightly
   - OS: ubuntu-latest, macos-latest
   - Run: `cargo test --lib lsp`

2. **Coverage Job**
   - Tool: cargo-tarpaulin or cargo-llvm-cov
   - Upload to: codecov.io or coveralls.io
   - Fail if coverage < 75%

3. **Tree-sitter Job**
   - Setup: Node.js 18+
   - Run: `cd tree-sitter-storybook && npm test`

4. **Benchmark Job**
   - Run: `cargo bench --bench lsp_benchmarks`
   - Upload results for regression tracking

**Triggers:**

- On push to main
- On pull requests
- Nightly (for performance tracking)

#### 4.1.2 Pre-commit Hooks

**File:** Update `.github/workflows/pre-commit.yml` or `lefthook.yml`

**Estimated Time:** 1 hour

**Hooks:**

- `cargo test --lib lsp` (fast tests only)
- `cargo clippy` (linting)
- `cargo fmt --check` (formatting)
### 4.2 Performance Benchmarking

#### 4.2.1 Benchmark Suite

**File:** Create `storybook/benches/lsp_benchmarks.rs`

**Estimated Time:** 6 hours

**Benchmarks:**

1. **Parse Benchmarks** (4 benchmarks)
   - Small file (100 lines)
   - Medium file (1,000 lines)
   - Large file (10,000 lines)
   - Very large file (50,000 lines)

2. **Symbol Extraction** (3 benchmarks)
   - 10 symbols
   - 100 symbols
   - 1,000 symbols

3. **Completion** (3 benchmarks)
   - Field completion (10 options)
   - Template completion (100 options)
   - Completion in large file

4. **Navigation** (3 benchmarks)
   - Go-to-definition in small file
   - Go-to-definition in large file
   - Find references (100 occurrences)

5. **Workspace** (2 benchmarks)
   - Workspace rebuild (10 files)
   - Workspace rebuild (100 files)

**Tool:** Use `criterion.rs` for statistical analysis

**Baselines:**

- Parse: < 10ms per 1,000 lines
- Symbol extraction: < 5ms per 100 symbols
- Completion: < 20ms latency
- Navigation: < 10ms per operation
- Workspace rebuild: < 100ms per 10 files
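The real suite should use `criterion.rs` as stated above; as a dependency-free illustration of the measurement discipline it applies (warm-up, repeated runs, a median rather than a single sample), here is a minimal `std`-only harness — the workload inside the closure is a stand-in, not the real parser:

```rust
use std::time::{Duration, Instant};

/// Measure `f` over `runs` iterations and return the median duration.
/// Criterion does this (plus outlier analysis and confidence intervals)
/// properly; this is only a dependency-free sketch of the idea.
fn median_time<F: FnMut()>(mut f: F, runs: usize) -> Duration {
    f(); // warm-up run, excluded from the sample
    let mut samples: Vec<Duration> = (0..runs)
        .map(|_| {
            let t = Instant::now();
            f();
            t.elapsed()
        })
        .collect();
    samples.sort();
    samples[runs / 2]
}

fn main() {
    let source: String = (0..1_000)
        .map(|i| format!("character C{} {{}}\n", i))
        .collect();
    let median = median_time(
        || {
            // stand-in workload for "parse a 1,000-line file"
            assert_eq!(source.lines().count(), 1_000);
        },
        11,
    );
    // Baseline from the strategy: parse < 10ms per 1,000 lines.
    assert!(median < Duration::from_millis(10), "median {:?}", median);
    println!("ok");
}
```

Taking the median rather than the minimum or mean is what makes the CI regression comparison stable on noisy shared runners.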
#### 4.2.2 Memory Profiling

**Estimated Time:** 3 hours

**Profiling Scenarios:**

1. Memory usage with 100 open documents
2. Memory usage with 1,000+ symbols
3. Memory leak detection (long-running session)

**Tools:**

- `valgrind` (Linux)
- `instruments` (macOS)
- `heaptrack` (Linux, heap allocation tracking)
### Phase 4 Deliverables

- ✅ CI/CD pipeline operational
- ✅ Code coverage reporting automated
- ✅ Benchmark suite established
- ✅ Performance baselines documented
- ✅ Memory profiling completed

---
## Phase 5: Multi-File Support (Days 16-30)

### Objective

Implement and test cross-file navigation and workspace features.

### 5.1 Architecture Implementation

#### 5.1.1 Workspace NameTable Integration

**File:** `storybook/src/lsp/server.rs`

**Estimated Time:** 8 hours

**Changes:**

1. Populate `WorkspaceState.name_table` in `rebuild()`
2. Use file indices from NameTable
3. Store symbol locations with file indices
4. Expose workspace-level symbol lookup

**Testing:**

- Unit tests for NameTable population
- Verify symbol file indices are correct
- Test incremental updates
#### 5.1.2 Cross-File Go-to-Definition

**File:** `storybook/src/lsp/definition.rs`

**Estimated Time:** 6 hours

**Changes:**

1. Check local document symbols first (fast path)
2. Fall back to workspace NameTable
3. Resolve file URL from file index
4. Return Location with correct file URL

**Testing:**

- Go to character in another file
- Go to template in another file
- Go to behavior in another file
- Handle missing symbols gracefully
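The local-first lookup order above can be sketched with two maps: per-document symbols checked first, then the workspace-wide table keyed by file index. All names and shapes here are illustrative, not the real `definition.rs` API:

```rust
use std::collections::HashMap;

/// (file index, line, column) of a definition site.
type Location = (usize, usize, usize);

/// Hypothetical resolution order: the current document's symbol table is
/// the fast path; the workspace NameTable is the fallback.
fn find_definition(
    current_file: usize,
    local: &HashMap<String, (usize, usize)>, // symbol -> (line, col)
    workspace: &HashMap<String, Location>,   // symbol -> (file, line, col)
    name: &str,
) -> Option<Location> {
    if let Some(&(line, col)) = local.get(name) {
        return Some((current_file, line, col)); // fast path, no workspace scan
    }
    workspace.get(name).copied() // may point at another file
}

fn main() {
    let local = HashMap::from([("Alice".to_string(), (3, 10))]);
    let workspace = HashMap::from([
        ("Alice".to_string(), (0, 3, 10)),
        ("Bob".to_string(), (2, 7, 10)), // defined in file index 2
    ]);
    assert_eq!(find_definition(0, &local, &workspace, "Alice"), Some((0, 3, 10)));
    assert_eq!(find_definition(0, &local, &workspace, "Bob"), Some((2, 7, 10)));
    assert_eq!(find_definition(0, &local, &workspace, "Carol"), None); // missing: graceful
    println!("ok");
}
```

Returning `None` for an unknown symbol (rather than an error) is what "handle missing symbols gracefully" tests: the server replies with an empty result instead of failing the request.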
#### 5.1.3 Cross-File Find References

**File:** `storybook/src/lsp/references.rs`

**Estimated Time:** 6 hours

**Changes:**

1. Search all open documents (not just current)
2. Use workspace NameTable for symbol lookup
3. Aggregate results from multiple files
4. Return Locations with file URLs

**Testing:**

- Find character references across files
- Find template uses across files
- Find behavior calls across files
- Performance with 100+ open files
#### 5.1.4 Use Declaration Resolution

**File:** Create `storybook/src/lsp/imports.rs`

**Estimated Time:** 8 hours

**Changes:**

1. Parse `use` declarations
2. Resolve module paths to file URLs
3. Populate symbol table from imports
4. Handle grouped imports (`use foo::{bar, baz}`)
5. Handle wildcard imports (`use foo::*`)

**Testing:**

- Simple use declaration
- Grouped imports
- Wildcard imports
- Nested module paths
- Missing import targets
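Grouped imports can be normalized into plain paths before any lookup happens. A simplified sketch of that expansion — it handles the single and grouped forms listed above; the real resolver in `imports.rs` would also branch on a trailing `*` for wildcards and recurse for nested groups:

```rust
/// Expand a `use` declaration into fully qualified symbol paths.
/// Simplified sketch: supports `use a::B;` and `use a::{B, C};` only;
/// wildcard and nested-group handling are left to the real resolver.
fn expand_use(decl: &str) -> Vec<String> {
    let body = decl
        .trim()
        .trim_start_matches("use ")
        .trim_end_matches(';')
        .trim();
    if let Some((base, group)) = body.split_once("::{") {
        group
            .trim_end_matches('}')
            .split(',')
            .map(|item| format!("{}::{}", base, item.trim()))
            .collect()
    } else {
        vec![body.to_string()]
    }
}

fn main() {
    assert_eq!(expand_use("use characters::Alice;"), vec!["characters::Alice"]);
    assert_eq!(
        expand_use("use characters::{Alice, Bob};"),
        vec!["characters::Alice", "characters::Bob"]
    );
    assert_eq!(
        expand_use("use world::locations::wonderland::Garden;"),
        vec!["world::locations::wonderland::Garden"]
    );
    println!("ok");
}
```

Normalizing first keeps the rest of the pipeline uniform: symbol-table population and completion only ever see flat `module::Symbol` paths.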
### 5.2 Multi-File Test Suite

#### 5.2.1 Cross-File Navigation Tests

**File:** Create `storybook/src/lsp/cross_file_tests.rs`

**Estimated Time:** 6 hours

**Target:** 20 tests

**Test Categories:**

1. **Go-to-Definition** (5 tests)
   - Navigate to character in other file
   - Navigate to template in other file
   - Navigate to behavior in other file
   - Navigate with nested modules
   - Navigate with use declarations

2. **Find References** (5 tests)
   - Find character uses across files
   - Find template includes across files
   - Find behavior calls across files
   - Find species references across files
   - Performance with 50+ files

3. **Completion** (5 tests)
   - Complete imported symbols
   - Complete from wildcard imports
   - Complete nested module paths
   - Complete with use declarations
   - Don't show non-imported symbols

4. **Workspace Management** (5 tests)
   - Add file updates workspace
   - Remove file updates workspace
   - Modify file updates workspace
   - Rename file updates workspace
   - Large workspace (100+ files)

**Coverage Target:** 85% of cross-file features
#### 5.2.2 Import Resolution Tests

**File:** Create `storybook/src/lsp/import_tests.rs`

**Estimated Time:** 4 hours

**Target:** 15 tests

**Test Categories:**

1. **Simple Imports** (3 tests)
   - `use characters::Alice;`
   - `use templates::Child;`
   - `use behaviors::Walk;`

2. **Grouped Imports** (3 tests)
   - `use characters::{Alice, Bob};`
   - `use templates::{Child, Adult};`
   - Mixed types in group

3. **Wildcard Imports** (3 tests)
   - `use characters::*;`
   - Multiple wildcards from different modules
   - Wildcard precedence

4. **Nested Modules** (3 tests)
   - `use world::characters::Alice;`
   - `use world::locations::wonderland::Garden;`
   - Deep nesting (5+ levels)

5. **Edge Cases** (3 tests)
   - Import non-existent symbol
   - Circular imports
   - Import from self

**Coverage Target:** 90% of import resolution
### Phase 5 Deliverables

- ✅ Cross-file navigation implemented
- ✅ Use declaration resolution working
- ✅ Workspace NameTable populated
- ✅ 35 new multi-file tests
- ✅ Overall coverage > 85%

---
## Test Execution Strategy

### Daily Test Runs

```bash
# Quick sanity check (< 5 seconds)
cargo test --lib lsp -- --test-threads=1 --nocapture

# Full test suite (< 2 minutes)
cargo test --lib lsp

# With coverage (< 5 minutes)
cargo tarpaulin --lib --out Html --output-dir coverage

# Tree-sitter tests (< 10 seconds)
cd tree-sitter-storybook && npm test
```

### Weekly Test Runs

```bash
# Performance benchmarks (~10 minutes)
cargo bench --bench lsp_benchmarks

# Memory profiling (~15 minutes)
./scripts/memory_profile.sh

# Stress tests (~20 minutes)
cargo test --lib lsp --release -- stress_tests
```
### Pre-Release Test Runs

```bash
# Full suite with coverage
cargo tarpaulin --lib --out Html --output-dir coverage

# Performance regression check
cargo bench --bench lsp_benchmarks -- --save-baseline main

# Integration tests
cargo test --test '*'

# E2E tests (manual)
# 1. Build Zed extension
# 2. Test in Zed editor
# 3. Validate all LSP features work
```

---
## Coverage Measurement

### Tools

1. **cargo-tarpaulin** - Code coverage for Rust

   ```bash
   cargo install cargo-tarpaulin
   cargo tarpaulin --lib --out Html --output-dir coverage
   ```

2. **cargo-llvm-cov** - Alternative coverage tool

   ```bash
   cargo install cargo-llvm-cov
   cargo llvm-cov --html --lib
   ```

### Coverage Targets by Phase

| Phase | Target | Focus |
|-------|--------|-------|
| Phase 1 | 60% | Existing tests |
| Phase 2 | 80% | Feature coverage |
| Phase 3 | 82% | Integration |
| Phase 4 | 83% | Automation |
| Phase 5 | 85% | Multi-file |

### Coverage Exemptions

- Generated code (tree-sitter bindings)
- Test utilities and fixtures
- Deprecated code paths
- Unreachable error handling (panic paths)

---
## Risk Management

### High-Risk Areas

1. **Concurrency Bugs**
   - **Risk:** Deadlocks, race conditions in RwLock usage
   - **Mitigation:** Dedicated concurrency test suite, stress testing
   - **Detection:** Thread sanitizer, long-running tests

2. **Performance Regressions**
   - **Risk:** Features slow down with large files/workspaces
   - **Mitigation:** Benchmark suite, performance tracking in CI
   - **Detection:** Criterion benchmarks, regression alerts

3. **Memory Leaks**
   - **Risk:** Documents not cleaned up, symbol table grows unbounded
   - **Mitigation:** Memory profiling, leak detection tools
   - **Detection:** Valgrind, long-running session tests

4. **Cross-File State Corruption**
   - **Risk:** NameTable inconsistencies, stale file references
   - **Mitigation:** Workspace state validation tests
   - **Detection:** Integration tests, state invariant checks

### Medium-Risk Areas

1. **Unicode Handling**
   - **Risk:** Position calculations off by one with multi-byte chars
   - **Mitigation:** Comprehensive unicode test suite
   - **Detection:** Unicode-heavy test files

2. **Error Recovery**
   - **Risk:** Server crashes on malformed input
   - **Mitigation:** Fuzz testing, malformed input tests
   - **Detection:** Error recovery test suite

3. **Zed Integration**
   - **Risk:** LSP features not working in actual editor
   - **Mitigation:** E2E testing in Zed
   - **Detection:** Manual testing, user feedback

---
## Test Maintenance

### Test Naming Convention

```rust
// Pattern: test_<feature>_<scenario>_<expected_result>
#[test]
fn test_goto_definition_character_in_same_file() { ... }

#[test]
fn test_completion_field_after_colon() { ... }

#[test]
fn test_diagnostics_parse_error_missing_brace() { ... }
```

### Test Organization

```
src/lsp/
├── *_tests.rs            # Feature-specific unit tests
├── integration_tests.rs  # Multi-feature integration
├── stress_tests.rs       # Performance and scale
└── cross_file_tests.rs   # Multi-file scenarios
```

### Test Documentation

- Each test file has a module-level doc comment explaining its scope
- Each test has a doc comment explaining its scenario
- Complex tests have inline comments for key assertions

### Test Refactoring Guidelines

- Extract common test fixtures to `test_utils.rs`
- Use the builder pattern for complex test data
- Keep tests independent (no shared mutable state)
- Aim for tests < 50 lines each

---
## Appendix A: Test Counting Summary

### Current Tests (Phase 1)

- behavior_tests.rs: 7
- code_actions_tests.rs: 27
- completion_tests.rs: 10
- diagnostics_tests.rs: 20
- document_edge_tests.rs: 17
- navigation_tests.rs: 11
- validation_tests.rs: 10
- Integration tests: ~45
- **Total: ~147 tests**

### New Tests by Phase

**Phase 2 (+88 tests):**

- hover_tests.rs: 15
- formatting_tests.rs: 15
- rename_tests.rs: 12
- semantic_tokens_tests.rs: 15
- stress_tests.rs: 10
- document_edge_tests.rs (additions): 8
- error_recovery_tests.rs: 13

**Phase 3 (+33 tests):**

- integration_tests.rs: 25
- concurrency_tests.rs: 8

**Phase 4 (+15 benchmarks):**

- lsp_benchmarks.rs: 15 benchmarks

**Phase 5 (+35 tests):**

- cross_file_tests.rs: 20
- import_tests.rs: 15

**Grand Total: ~303 tests + 15 benchmarks**
## Appendix B: Test Commands Reference

```bash
# === LSP Tests ===

# Run all LSP tests
cargo test --lib lsp

# Run specific test file
cargo test --lib lsp::hover_tests

# Run single test
cargo test --lib lsp::hover_tests::test_hover_on_character

# Run tests with output
cargo test --lib lsp -- --nocapture

# Run tests in single thread (for debugging)
cargo test --lib lsp -- --test-threads=1

# === Tree-sitter Tests ===

# Run all tree-sitter tests
cd tree-sitter-storybook && npm test

# Run with verbose output
cd tree-sitter-storybook && npm test -- --debug

# === Coverage ===

# Generate HTML coverage report
cargo tarpaulin --lib --out Html --output-dir coverage

# Coverage with specific minimum threshold
cargo tarpaulin --lib --fail-under 80

# === Benchmarks ===

# Run all benchmarks
cargo bench --bench lsp_benchmarks

# Run specific benchmark
cargo bench --bench lsp_benchmarks -- parse

# Save baseline
cargo bench --bench lsp_benchmarks -- --save-baseline main

# Compare to baseline
cargo bench --bench lsp_benchmarks -- --baseline main

# === Memory Profiling ===

# Linux (valgrind)
valgrind --leak-check=full --show-leak-kinds=all \
  cargo test --lib lsp

# macOS (instruments)
instruments -t Leaks cargo test --lib lsp

# === Linting ===

# Run clippy
cargo clippy --lib

# Run clippy with strict rules
cargo clippy --lib -- -D warnings

# Format check
cargo fmt --check

# Auto-format
cargo fmt
```

---
## Appendix C: Coverage Report Template

```markdown
# LSP Test Coverage Report

**Date:** YYYY-MM-DD
**Commit:** <git sha>

## Summary

- **Overall Coverage:** XX%
- **Tests Run:** XXX
- **Tests Passed:** XXX
- **Tests Failed:** X
- **Duration:** XX seconds

## Coverage by Module

| Module | Coverage | Lines Covered | Total Lines |
|--------|----------|---------------|-------------|
| server.rs | XX% | XXX / XXX | XXX |
| hover.rs | XX% | XXX / XXX | XXX |
| completion.rs | XX% | XXX / XXX | XXX |
| definition.rs | XX% | XXX / XXX | XXX |
| references.rs | XX% | XXX / XXX | XXX |
| symbols.rs | XX% | XXX / XXX | XXX |
| diagnostics.rs | XX% | XXX / XXX | XXX |
| formatting.rs | XX% | XXX / XXX | XXX |
| rename.rs | XX% | XXX / XXX | XXX |
| code_actions.rs | XX% | XXX / XXX | XXX |
| semantic_tokens.rs | XX% | XXX / XXX | XXX |
| inlay_hints.rs | XX% | XXX / XXX | XXX |

## Uncovered Lines

- server.rs: lines XXX-XXX (error handling)
- completion.rs: line XXX (unreachable branch)

## Recommendations

- [ ] Add tests for uncovered error paths
- [ ] Increase coverage in completion.rs
- [ ] ...
```

---
## Conclusion

This testing strategy provides a **clear roadmap** from the current broken state (compilation errors) to a fully tested, production-ready LSP implementation with 85% coverage. The phased approach allows for **incremental progress** while delivering **immediate value** at each phase:

- **Phase 1** - Unblocks development and restores stability
- **Phase 2** - Expands unit test coverage
- **Phase 3** - Validates integration
- **Phase 4** - Enables automation
- **Phase 5** - Adds multi-file support

By following this strategy, the Storybook LSP will become **robust, maintainable, and reliable** for end users.