# LSP Testing Strategy

**Version:** 1.0

**Date:** 2026-02-12

**Owner:** LSP Test Engineer

---
## Executive Summary

This document outlines a comprehensive testing strategy for the Storybook LSP implementation, designed to achieve **85% code coverage** and **zero critical bugs** within 4-6 weeks. The strategy is divided into 5 phases, prioritized to address immediate blocking issues first, followed by coverage expansion, integration testing, automation, and finally multi-file support.

---
## Strategy Goals

### Primary Objectives

1. **Fix Critical Bugs:** Resolve 2 compilation errors blocking all tests
2. **Achieve Stability:** 100% of tests pass reliably
3. **Expand Coverage:** Reach 80%+ code coverage across all LSP features
4. **Enable Automation:** Set up CI/CD for continuous testing
5. **Support Multi-File:** Prepare infrastructure for cross-file navigation

### Success Metrics

- **Week 1:** All tests compile and run (100% pass rate)
- **Week 2:** All features have dedicated unit tests (70% coverage)
- **Week 3:** Integration tests complete (80% coverage)
- **Week 4:** CI/CD operational, benchmarks established
- **Weeks 5-6:** Multi-file support tested (85% coverage)

---
## Testing Pyramid

```
            /\
           /  \        E2E Tests (5%)
          / E2E\       - Zed extension integration
         /______\      - Real-world scenarios
        /        \
       /  INTEG   \    Integration Tests (15%)
      /____________\   - Multi-file scenarios
     /              \  - Concurrent access
    /   UNIT TESTS   \ - Document lifecycle
   /__________________\
                       Unit Tests (80%)
                       - Feature-specific tests
                       - Edge cases
                       - Error conditions
```

**Rationale:**

- **80% Unit Tests** - Fast feedback, easy to debug, high confidence in individual components
- **15% Integration Tests** - Validate feature interactions, realistic scenarios
- **5% E2E Tests** - Validate actual editor integration (Zed)

---
## Phase 1: Stabilization (Days 1-2)

### Objective

Fix blocking compilation errors and establish baseline test health.

### Tasks

#### 1.1 Fix Compilation Errors

**Priority:** CRITICAL

**Estimated Time:** 30 minutes

**Bug #1: Inlay Hints Import Path**

- **File:** `storybook/src/lsp/inlay_hints.rs:134`
- **Current:** `crate::project::positions::PositionTracker`
- **Fix:** `crate::position::PositionTracker`
- **Verification:** `cargo check`

**Bug #2: Completion Type Annotation**

- **File:** `storybook/src/lsp/completion.rs:421`
- **Current:** `let mut nesting_level = 0;`
- **Fix:** `let mut nesting_level: i32 = 0;`
- **Verification:** `cargo check`

#### 1.2 Baseline Test Run

**Priority:** CRITICAL

**Estimated Time:** 1 hour

```bash
# Compile tests
cd /Users/sienna/Development/storybook/storybook
cargo test --lib lsp --no-run

# Run all LSP tests
cargo test --lib lsp

# Document results
# - Total tests run
# - Pass/fail counts
# - Any flaky tests
# - Performance outliers
```

**Deliverable:** Test baseline report

- Total test count (expect ~147)
- Pass rate (target: 100%)
- Test execution time
- Flaky test identification

#### 1.3 Tree-sitter Verification

**Priority:** HIGH

**Estimated Time:** 30 minutes

```bash
cd /Users/sienna/Development/storybook/tree-sitter-storybook
npm run test
```

**Verification:** All 27 tests pass

### Phase 1 Deliverables

- ✅ All compilation errors fixed
- ✅ All tests compile successfully
- ✅ Baseline test report with 100% pass rate
- ✅ Tree-sitter tests verified
- ✅ List of any identified issues for Phase 2

---
## Phase 2: Coverage Expansion (Days 3-7)

### Objective

Fill test coverage gaps for under-tested features and add edge case testing.

### 2.1 Feature-Specific Unit Tests

#### 2.1.1 Hover Tests

**File:** Create `storybook/src/lsp/hover_tests.rs`

**Estimated Time:** 3 hours

**Target:** 15 tests

**Test Categories:**

1. **Basic Hover** (3 tests)
   - Hover on character name
   - Hover on template name
   - Hover on behavior name

2. **Type Information** (4 tests)
   - Hover on field shows type
   - Hover on species reference
   - Hover on template reference
   - Hover on behavior reference

3. **Documentation** (4 tests)
   - Hover shows field documentation
   - Hover shows prose blocks
   - Hover on state shows transitions
   - Hover on schedule shows time blocks

4. **Edge Cases** (4 tests)
   - Hover on whitespace (returns None)
   - Hover on comment (returns None)
   - Hover at EOF (returns None)
   - Hover on invalid position (returns None)

**Coverage Target:** 85%
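The edge-case tests above all assert `None` for non-symbol positions. A minimal sketch of the pattern under test — `hover_at` here is a hypothetical stand-in for the real hover provider (which resolves symbols via the parse tree), and the `//` comment syntax is an assumption for illustration only:

```rust
/// Hypothetical stand-in for the real hover provider: returns the word
/// under (`line`, `col`), or None for whitespace, comments, and
/// out-of-range positions.
fn hover_at(source: &str, line: usize, col: usize) -> Option<String> {
    let text = source.lines().nth(line)?; // past EOF -> None
    if text.trim_start().starts_with("//") {
        return None; // hovering inside a comment yields nothing
    }
    let bytes = text.as_bytes();
    if col >= bytes.len() || !(bytes[col].is_ascii_alphanumeric() || bytes[col] == b'_') {
        return None; // whitespace, punctuation, or past end of line
    }
    // expand outward to word boundaries
    let start = text[..col]
        .rfind(|c: char| !(c.is_alphanumeric() || c == '_'))
        .map_or(0, |i| i + 1);
    let end = text[col..]
        .find(|c: char| !(c.is_alphanumeric() || c == '_'))
        .map_or(text.len(), |i| col + i);
    Some(text[start..end].to_string())
}

fn main() {
    let src = "character Alice {\n// a comment\n}";
    assert_eq!(hover_at(src, 0, 10), Some("Alice".to_string())); // on the name
    assert_eq!(hover_at(src, 0, 9), None); // whitespace
    assert_eq!(hover_at(src, 1, 3), None); // comment
    assert_eq!(hover_at(src, 5, 0), None); // past EOF
    println!("ok");
}
```

Each edge-case test then reduces to one assertion against a fixture document, which keeps the suite fast and each failure unambiguous.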
#### 2.1.2 Formatting Tests

**File:** Create `storybook/src/lsp/formatting_tests.rs`

**Estimated Time:** 3 hours

**Target:** 15 tests

**Test Categories:**

1. **Indentation** (5 tests)
   - Character block indentation
   - Template block indentation
   - Nested behavior indentation
   - Life arc state indentation
   - Consistent tab/space usage

2. **Spacing** (4 tests)
   - Spacing around colons
   - Spacing around braces
   - Line breaks between declarations
   - Trailing whitespace removal

3. **Alignment** (3 tests)
   - Field alignment in blocks
   - Comment alignment
   - Multiline value alignment

4. **Edge Cases** (3 tests)
   - Empty document formatting
   - Already-formatted document (no changes)
   - Document with syntax errors (graceful handling)

**Coverage Target:** 80%
#### 2.1.3 Rename Tests

**File:** Create `storybook/src/lsp/rename_tests.rs`

**Estimated Time:** 2 hours

**Target:** 12 tests

**Test Categories:**

1. **Basic Rename** (4 tests)
   - Rename character (updates all references)
   - Rename template (updates all uses)
   - Rename behavior (updates all calls)
   - Rename field (updates all occurrences)

2. **Scope Testing** (3 tests)
   - Rename doesn't affect different scope
   - Rename preserves capitalization context
   - Rename updates definition + all uses

3. **Validation** (3 tests)
   - Reject invalid identifier names
   - Reject rename to existing symbol
   - Reject rename of built-in keywords

4. **Edge Cases** (2 tests)
   - Rename at EOF
   - Rename in comment (should fail gracefully)

**Coverage Target:** 80%
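The validation category needs a predicate for legal rename targets. A minimal sketch, assuming Storybook identifiers follow the common letter-or-underscore-then-alphanumerics rule (the real grammar may differ — this is the assumption to verify against the tree-sitter grammar):

```rust
/// Hypothetical validity check for rename targets. Assumes an identifier
/// must start with a letter or underscore and continue with letters,
/// digits, or underscores; adjust to match the real Storybook grammar.
fn is_valid_identifier(name: &str) -> bool {
    let mut chars = name.chars();
    match chars.next() {
        Some(c) if c.is_alphabetic() || c == '_' => {}
        _ => return false, // empty, or starts with a digit/symbol
    }
    chars.all(|c| c.is_alphanumeric() || c == '_')
}

fn main() {
    assert!(is_valid_identifier("Alice"));
    assert!(is_valid_identifier("alice_2"));
    assert!(is_valid_identifier("爱丽丝")); // unicode letters count as alphabetic
    assert!(!is_valid_identifier("2fast")); // starts with a digit
    assert!(!is_valid_identifier("")); // empty
    assert!(!is_valid_identifier("a-b")); // hyphen not allowed
    println!("ok");
}
```

Using `char::is_alphabetic` keeps the check consistent with the unicode identifier tests in section 2.2.2.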
#### 2.1.4 Semantic Tokens Tests

**File:** Create `storybook/src/lsp/semantic_tokens_tests.rs`

**Estimated Time:** 3 hours

**Target:** 15 tests

**Test Categories:**

1. **Token Types** (7 tests)
   - Character tokens (type.character)
   - Template tokens (type.template)
   - Behavior tokens (type.behavior)
   - Keyword tokens (keyword.declaration)
   - Field tokens (property)
   - String tokens (string)
   - Number tokens (constant.numeric)

2. **Token Modifiers** (3 tests)
   - Definition modifier
   - Reference modifier
   - Deprecated modifier (if applicable)

3. **Complex Scenarios** (3 tests)
   - Nested structures (correct scoping)
   - Multiline declarations
   - Mixed token types in single line

4. **Edge Cases** (2 tests)
   - Empty document (no tokens)
   - Syntax errors (partial tokenization)

**Coverage Target:** 80%
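These tests ultimately assert on the LSP wire format: semantic tokens are sent as a flat `u32` array in which each token's line is delta-encoded against the previous token, and its start column is delta-encoded only when both tokens share a line (this encoding is defined by the LSP specification). A sketch of the encoder the tests would exercise:

```rust
/// Encode (line, start, length, token_type, modifiers) tuples into the
/// LSP semantic-tokens wire format: delta-encoded line, and start column
/// relative to the previous token when on the same line.
fn encode_tokens(tokens: &[(u32, u32, u32, u32, u32)]) -> Vec<u32> {
    let (mut prev_line, mut prev_start) = (0u32, 0u32);
    let mut out = Vec::with_capacity(tokens.len() * 5);
    for &(line, start, len, ttype, mods) in tokens {
        let delta_line = line - prev_line;
        let delta_start = if delta_line == 0 { start - prev_start } else { start };
        out.extend_from_slice(&[delta_line, delta_start, len, ttype, mods]);
        prev_line = line;
        prev_start = start;
    }
    out
}

fn main() {
    // "character Alice": keyword at (0,0) len 9, name at (0,10) len 5
    let encoded = encode_tokens(&[(0, 0, 9, 0, 0), (0, 10, 5, 1, 0)]);
    assert_eq!(encoded, vec![0, 0, 9, 0, 0, 0, 10, 5, 1, 0]);
    // Empty document: no tokens at all.
    assert_eq!(encode_tokens(&[]), Vec::<u32>::new());
    println!("ok");
}
```

The "empty document" edge case falls out naturally: an empty token list encodes to an empty array, never to a degenerate 5-tuple.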
### 2.2 Edge Case Testing Suite

#### 2.2.1 Large File Tests

**File:** Create `storybook/src/lsp/stress_tests.rs`

**Estimated Time:** 4 hours

**Target:** 10 tests

**Test Categories:**

1. **Size Tests** (4 tests)
   - 1,000 line document
   - 10,000 line document
   - 50,000 line document
   - Document with 1,000+ symbols

2. **Depth Tests** (3 tests)
   - 10-level nested behaviors
   - 20-level nested behaviors
   - Deeply nested template includes

3. **Performance Tests** (3 tests)
   - Parse time < 100ms for 1,000 lines
   - Symbol extraction < 50ms
   - Completion latency < 20ms

**Coverage Target:** N/A (performance validation)
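The timing thresholds above can be asserted directly with `std::time::Instant` over synthetically generated documents. A sketch — `parse_stub` is a hypothetical stand-in; the real test would call the tree-sitter parse entry point:

```rust
use std::time::Instant;

/// Hypothetical stand-in for parsing: counts declarations line by line.
/// The real stress test would invoke the actual parser instead.
fn parse_stub(source: &str) -> usize {
    source
        .lines()
        .filter(|l| l.trim_start().starts_with("character "))
        .count()
}

fn main() {
    // Generate a synthetic 1,000-line document.
    let source: String = (0..1_000)
        .map(|i| format!("character Char{} {{}}\n", i))
        .collect();

    let start = Instant::now();
    let symbols = parse_stub(&source);
    let elapsed = start.elapsed();

    assert_eq!(symbols, 1_000);
    // Budget from the strategy: parse < 100ms for 1,000 lines.
    assert!(elapsed.as_millis() < 100, "parse took {:?}", elapsed);
    println!("ok");
}
```

Generating fixtures programmatically (rather than checking in 50,000-line files) keeps the repository small and lets the size tests scale the line count with one parameter.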
#### 2.2.2 Unicode and Special Characters

**File:** Add to `storybook/src/lsp/document_edge_tests.rs`

**Estimated Time:** 2 hours

**Target:** 8 tests

**Test Categories:**

1. **Unicode Identifiers** (3 tests)
   - Chinese character names: `character 爱丽丝 {}`
   - Emoji in prose: `---backstory\n😊\n---`
   - Mixed scripts: `character Αλίκη {}`

2. **Special Characters** (3 tests)
   - Underscores in identifiers
   - Hyphens in strings
   - Escape sequences in strings

3. **Boundary Conditions** (2 tests)
   - Zero-width characters
   - RTL text in prose blocks

**Coverage Target:** 85%
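These tests matter because LSP positions default to UTF-16 code units, while Rust strings index by UTF-8 bytes — and the byte, char, and UTF-16 lengths all disagree on exactly the inputs listed above. A quick demonstration of why a byte offset must never be used as an LSP column:

```rust
fn main() {
    let cjk = "character 爱丽丝 {}";
    // Byte, char, and UTF-16 lengths differ for CJK text.
    assert_eq!(cjk.len(), 22); // UTF-8 bytes (3 per CJK char)
    assert_eq!(cjk.chars().count(), 16); // Unicode scalar values
    assert_eq!(cjk.encode_utf16().count(), 16); // BMP chars: 1 unit each

    let emoji = "😊";
    assert_eq!(emoji.len(), 4); // 4 UTF-8 bytes
    assert_eq!(emoji.chars().count(), 1); // one scalar value
    assert_eq!(emoji.encode_utf16().count(), 2); // surrogate pair: 2 units
    println!("ok");
}
```

Any position-mapping helper (such as `PositionTracker` from Bug #1) should be tested against both a BMP-only string and a surrogate-pair emoji, since the two cases fail independently.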
#### 2.2.3 Malformed Input Tests

**File:** Create `storybook/src/lsp/error_recovery_tests.rs`

**Estimated Time:** 3 hours

**Target:** 13 tests

**Test Categories:**

1. **Truncated Files** (4 tests)
   - Incomplete character block
   - Unclosed prose block
   - Missing closing brace
   - Truncated at field value

2. **Invalid UTF-8** (2 tests)
   - Invalid byte sequences
   - Null bytes in content

3. **Syntax Errors** (4 tests)
   - Missing colons in fields
   - Invalid identifiers (starting with numbers)
   - Unmatched braces
   - Invalid keywords

4. **Graceful Degradation** (3 tests)
   - Partial symbol extraction on errors
   - Diagnostics still generated
   - Server doesn't crash

**Coverage Target:** 90%
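The graceful-degradation requirement can be sketched with a toy extractor that keeps scanning after an error instead of bailing out. `extract_symbols` here is a hypothetical stand-in for the real tree-sitter-backed extractor; the point is the behavior, not the implementation:

```rust
/// Hypothetical stand-in for symbol extraction: collects character names
/// even when blocks are unterminated, mirroring how the real extractor
/// should degrade on parse errors rather than panic.
fn extract_symbols(source: &str) -> Vec<String> {
    source
        .lines()
        .filter_map(|line| line.trim_start().strip_prefix("character "))
        .filter_map(|rest| rest.split_whitespace().next())
        .map(str::to_string)
        .collect()
}

fn main() {
    // Truncated input: Alice's block is unclosed, Bob's has no body at all.
    let truncated = "character Alice {\n  age: 10\ncharacter Bob";
    let symbols = extract_symbols(truncated);
    assert_eq!(symbols, vec!["Alice", "Bob"]); // partial extraction still works
    println!("ok");
}
```

The real tests would additionally assert that diagnostics are produced for the unclosed block and that the request handler returns an error response rather than panicking.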
### Phase 2 Deliverables

- ✅ 6 new test modules plus additions to an existing one (~88 new tests)
- ✅ Coverage increased from ~60% to 80%
- ✅ All features have dedicated unit tests
- ✅ Edge cases comprehensively tested

---
## Phase 3: Integration Testing (Days 8-12)

### Objective

Validate feature interactions and real-world scenarios.

### 3.1 Multi-File Scenarios

#### 3.1.1 Multi-File Test Suite

**File:** Create `storybook/src/lsp/integration_tests.rs`

**Estimated Time:** 6 hours

**Target:** 15 tests

**Test Categories:**

1. **Use Declarations** (3 tests)
   - Import character from another file
   - Import template from another file
   - Wildcard imports (`use characters::*`)

2. **Cross-File References** (4 tests)
   - Template include from another file
   - Species reference from another file
   - Behavior subtree reference (`@module::tree`)
   - Relationship participants from other files

3. **Workspace State** (3 tests)
   - Workspace rebuild on file add
   - Workspace rebuild on file remove
   - Workspace rebuild on file modify

4. **Symbol Resolution** (3 tests)
   - Go-to-definition across files (even if not fully supported)
   - Find references across files (even if not fully supported)
   - Completion shows cross-file symbols (even if not fully supported)

5. **Edge Cases** (2 tests)
   - Circular dependencies
   - Missing import targets

**Note:** These tests document **expected behavior** even if cross-file support isn't fully implemented. They serve as regression tests for when Phase 5 is implemented.

**Coverage Target:** Establish baseline for multi-file features
#### 3.1.2 Concurrent Access Tests

**File:** Create `storybook/src/lsp/concurrency_tests.rs`

**Estimated Time:** 4 hours

**Target:** 8 tests

**Test Categories:**

1. **Multiple Clients** (3 tests)
   - Two clients open same document
   - Simultaneous edits from different clients
   - One client closes while other edits

2. **Rapid Operations** (3 tests)
   - Rapid did_change events
   - Open/close/reopen sequences
   - Completion requests during editing

3. **Locking** (2 tests)
   - No deadlocks on concurrent reads
   - Write lock releases properly

**Coverage Target:** Validate concurrency safety
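The locking tests can be smoke-tested with plain `std` primitives. This sketch runs several reader threads and one writer against a `RwLock`-guarded document store, mirroring in miniature what the dedicated suite would do against the real server state (the `String` store is a stand-in, not the actual state type):

```rust
use std::sync::{Arc, RwLock};
use std::thread;

fn main() {
    // Shared "document store", as the server keeps behind a RwLock.
    let doc = Arc::new(RwLock::new(String::from("character Alice {}")));

    let mut handles = Vec::new();
    // Eight concurrent readers: shared read locks must not block each other.
    for _ in 0..8 {
        let doc = Arc::clone(&doc);
        handles.push(thread::spawn(move || doc.read().unwrap().len()));
    }
    // One writer: must acquire exclusively and then release the lock.
    {
        let doc = Arc::clone(&doc);
        handles.push(thread::spawn(move || {
            let mut guard = doc.write().unwrap();
            guard.push_str("\ncharacter Bob {}");
            guard.len()
        }));
    }
    for h in handles {
        assert!(h.join().unwrap() > 0); // every thread finished: no deadlock
    }
    // Write lock was released: a final read succeeds and sees the edit.
    assert!(doc.read().unwrap().contains("Alice"));
    println!("ok");
}
```

For real confidence the suite should also run under a thread sanitizer (as noted in Risk Management), since a test like this only catches deadlocks it happens to provoke.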
### 3.2 Document Lifecycle Tests

#### 3.2.1 Lifecycle Integration

**File:** Add to `storybook/src/lsp/integration_tests.rs`

**Estimated Time:** 3 hours

**Target:** 10 tests

**Test Categories:**

1. **Open/Edit/Close** (4 tests)
   - Normal lifecycle (open → edit → close)
   - Multiple edits before close
   - Close without edit
   - Reopen after close

2. **State Consistency** (3 tests)
   - Symbols update on edit
   - Diagnostics update on edit
   - Workspace rebuilds on change

3. **Error Recovery** (3 tests)
   - Server survives parse errors
   - Server survives malformed edits
   - Server recovers from invalid positions

**Coverage Target:** 90% of document.rs
### Phase 3 Deliverables

- ✅ Integration test suite (33 tests)
- ✅ Multi-file behavior documented
- ✅ Concurrency safety validated
- ✅ Document lifecycle coverage > 90%

---
## Phase 4: Automation & Performance (Days 13-15)

### Objective

Enable automated testing and establish performance baselines.

### 4.1 CI/CD Setup

#### 4.1.1 GitHub Actions Workflow

**File:** Create `.github/workflows/lsp-tests.yml`

**Estimated Time:** 4 hours

**Workflow Components:**

1. **Test Jobs**
   - Rust version: stable, nightly
   - OS: ubuntu-latest, macos-latest
   - Run: `cargo test --lib lsp`

2. **Coverage Job**
   - Tool: cargo-tarpaulin or cargo-llvm-cov
   - Upload to: codecov.io or coveralls.io
   - Fail if coverage < 75%

3. **Tree-sitter Job**
   - Setup: Node.js 18+
   - Run: `cd tree-sitter-storybook && npm test`

4. **Benchmark Job**
   - Run: `cargo bench --bench lsp_benchmarks`
   - Upload results for regression tracking

**Triggers:**

- On push to main
- On pull requests
- Nightly (for performance tracking)

#### 4.1.2 Pre-commit Hooks

**File:** Update `.github/workflows/pre-commit.yml` or `lefthook.yml`

**Estimated Time:** 1 hour

**Hooks:**

- `cargo test --lib lsp` (fast tests only)
- `cargo clippy` (linting)
- `cargo fmt --check` (formatting)
### 4.2 Performance Benchmarking

#### 4.2.1 Benchmark Suite

**File:** Create `storybook/benches/lsp_benchmarks.rs`

**Estimated Time:** 6 hours

**Benchmarks:**

1. **Parse Benchmarks** (4 benchmarks)
   - Small file (100 lines)
   - Medium file (1,000 lines)
   - Large file (10,000 lines)
   - Very large file (50,000 lines)

2. **Symbol Extraction** (3 benchmarks)
   - 10 symbols
   - 100 symbols
   - 1,000 symbols

3. **Completion** (3 benchmarks)
   - Field completion (10 options)
   - Template completion (100 options)
   - Completion in large file

4. **Navigation** (3 benchmarks)
   - Go-to-definition in small file
   - Go-to-definition in large file
   - Find references (100 occurrences)

5. **Workspace** (2 benchmarks)
   - Workspace rebuild (10 files)
   - Workspace rebuild (100 files)

**Tool:** Use `criterion.rs` for statistical analysis

**Baselines:**

- Parse: < 10ms per 1,000 lines
- Symbol extraction: < 5ms per 100 symbols
- Completion: < 20ms latency
- Navigation: < 10ms per operation
- Workspace rebuild: < 100ms per 10 files
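The real suite should use `criterion.rs` as stated above; as a dependency-free illustration of the measurement discipline it applies (warm-up, repeated runs, a median rather than a single sample), here is a minimal `std`-only harness — the workload inside the closure is a stand-in, not the real parser:

```rust
use std::time::{Duration, Instant};

/// Measure `f` over `runs` iterations and return the median duration.
/// Criterion does this (plus outlier analysis and confidence intervals)
/// properly; this is only a dependency-free sketch of the idea.
fn median_time<F: FnMut()>(mut f: F, runs: usize) -> Duration {
    f(); // warm-up run, excluded from the sample
    let mut samples: Vec<Duration> = (0..runs)
        .map(|_| {
            let t = Instant::now();
            f();
            t.elapsed()
        })
        .collect();
    samples.sort();
    samples[runs / 2]
}

fn main() {
    let source: String = (0..1_000)
        .map(|i| format!("character C{} {{}}\n", i))
        .collect();
    let median = median_time(
        || {
            // stand-in workload for "parse a 1,000-line file"
            assert_eq!(source.lines().count(), 1_000);
        },
        11,
    );
    // Baseline from the strategy: parse < 10ms per 1,000 lines.
    assert!(median < Duration::from_millis(10), "median {:?}", median);
    println!("ok");
}
```

Taking the median rather than the minimum or mean is what makes the CI regression comparison stable on noisy shared runners.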
#### 4.2.2 Memory Profiling

**Estimated Time:** 3 hours

**Profiling Scenarios:**

1. Memory usage with 100 open documents
2. Memory usage with 1,000+ symbols
3. Memory leak detection (long-running session)

**Tools:**

- `valgrind` (Linux)
- `instruments` (macOS)
- `heaptrack` (Linux, heap allocation tracking)
### Phase 4 Deliverables

- ✅ CI/CD pipeline operational
- ✅ Code coverage reporting automated
- ✅ Benchmark suite established
- ✅ Performance baselines documented
- ✅ Memory profiling completed

---
## Phase 5: Multi-File Support (Days 16-30)

### Objective

Implement and test cross-file navigation and workspace features.

### 5.1 Architecture Implementation

#### 5.1.1 Workspace NameTable Integration

**File:** `storybook/src/lsp/server.rs`

**Estimated Time:** 8 hours

**Changes:**

1. Populate `WorkspaceState.name_table` in `rebuild()`
2. Use file indices from NameTable
3. Store symbol locations with file indices
4. Expose workspace-level symbol lookup

**Testing:**

- Unit tests for NameTable population
- Verify symbol file indices are correct
- Test incremental updates
#### 5.1.2 Cross-File Go-to-Definition

**File:** `storybook/src/lsp/definition.rs`

**Estimated Time:** 6 hours

**Changes:**

1. Check local document symbols first (fast path)
2. Fall back to workspace NameTable
3. Resolve file URL from file index
4. Return Location with correct file URL

**Testing:**

- Go to character in another file
- Go to template in another file
- Go to behavior in another file
- Handle missing symbols gracefully
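The local-first lookup order above can be sketched with two maps: per-document symbols checked first, then the workspace-wide table keyed by file index. All names and shapes here are illustrative, not the real `definition.rs` API:

```rust
use std::collections::HashMap;

/// (file index, line, column) of a definition site.
type Location = (usize, usize, usize);

/// Hypothetical resolution order: the current document's symbol table is
/// the fast path; the workspace NameTable is the fallback.
fn find_definition(
    current_file: usize,
    local: &HashMap<String, (usize, usize)>, // symbol -> (line, col)
    workspace: &HashMap<String, Location>,   // symbol -> (file, line, col)
    name: &str,
) -> Option<Location> {
    if let Some(&(line, col)) = local.get(name) {
        return Some((current_file, line, col)); // fast path, no workspace scan
    }
    workspace.get(name).copied() // may point at another file
}

fn main() {
    let local = HashMap::from([("Alice".to_string(), (3, 10))]);
    let workspace = HashMap::from([
        ("Alice".to_string(), (0, 3, 10)),
        ("Bob".to_string(), (2, 7, 10)), // defined in file index 2
    ]);
    assert_eq!(find_definition(0, &local, &workspace, "Alice"), Some((0, 3, 10)));
    assert_eq!(find_definition(0, &local, &workspace, "Bob"), Some((2, 7, 10)));
    assert_eq!(find_definition(0, &local, &workspace, "Carol"), None); // missing: graceful
    println!("ok");
}
```

Returning `None` for an unknown symbol (rather than an error) is what "handle missing symbols gracefully" tests: the server replies with an empty result instead of failing the request.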
#### 5.1.3 Cross-File Find References

**File:** `storybook/src/lsp/references.rs`

**Estimated Time:** 6 hours

**Changes:**

1. Search all open documents (not just current)
2. Use workspace NameTable for symbol lookup
3. Aggregate results from multiple files
4. Return Locations with file URLs

**Testing:**

- Find character references across files
- Find template uses across files
- Find behavior calls across files
- Performance with 100+ open files
#### 5.1.4 Use Declaration Resolution

**File:** Create `storybook/src/lsp/imports.rs`

**Estimated Time:** 8 hours

**Changes:**

1. Parse `use` declarations
2. Resolve module paths to file URLs
3. Populate symbol table from imports
4. Handle grouped imports (`use foo::{bar, baz}`)
5. Handle wildcard imports (`use foo::*`)

**Testing:**

- Simple use declaration
- Grouped imports
- Wildcard imports
- Nested module paths
- Missing import targets
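Grouped imports can be normalized into plain paths before any lookup happens. A simplified sketch of that expansion — it handles the single and grouped forms listed above; the real resolver in `imports.rs` would also branch on a trailing `*` for wildcards and recurse for nested groups:

```rust
/// Expand a `use` declaration into fully qualified symbol paths.
/// Simplified sketch: supports `use a::B;` and `use a::{B, C};` only;
/// wildcard and nested-group handling are left to the real resolver.
fn expand_use(decl: &str) -> Vec<String> {
    let body = decl
        .trim()
        .trim_start_matches("use ")
        .trim_end_matches(';')
        .trim();
    if let Some((base, group)) = body.split_once("::{") {
        group
            .trim_end_matches('}')
            .split(',')
            .map(|item| format!("{}::{}", base, item.trim()))
            .collect()
    } else {
        vec![body.to_string()]
    }
}

fn main() {
    assert_eq!(expand_use("use characters::Alice;"), vec!["characters::Alice"]);
    assert_eq!(
        expand_use("use characters::{Alice, Bob};"),
        vec!["characters::Alice", "characters::Bob"]
    );
    assert_eq!(
        expand_use("use world::locations::wonderland::Garden;"),
        vec!["world::locations::wonderland::Garden"]
    );
    println!("ok");
}
```

Normalizing first keeps the rest of the pipeline uniform: symbol-table population and completion only ever see flat `module::Symbol` paths.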
### 5.2 Multi-File Test Suite

#### 5.2.1 Cross-File Navigation Tests

**File:** Create `storybook/src/lsp/cross_file_tests.rs`

**Estimated Time:** 6 hours

**Target:** 20 tests

**Test Categories:**

1. **Go-to-Definition** (5 tests)
   - Navigate to character in other file
   - Navigate to template in other file
   - Navigate to behavior in other file
   - Navigate with nested modules
   - Navigate with use declarations

2. **Find References** (5 tests)
   - Find character uses across files
   - Find template includes across files
   - Find behavior calls across files
   - Find species references across files
   - Performance with 50+ files

3. **Completion** (5 tests)
   - Complete imported symbols
   - Complete from wildcard imports
   - Complete nested module paths
   - Complete with use declarations
   - Don't show non-imported symbols

4. **Workspace Management** (5 tests)
   - Add file updates workspace
   - Remove file updates workspace
   - Modify file updates workspace
   - Rename file updates workspace
   - Large workspace (100+ files)

**Coverage Target:** 85% of cross-file features
#### 5.2.2 Import Resolution Tests

**File:** Create `storybook/src/lsp/import_tests.rs`

**Estimated Time:** 4 hours

**Target:** 15 tests

**Test Categories:**

1. **Simple Imports** (3 tests)
   - `use characters::Alice;`
   - `use templates::Child;`
   - `use behaviors::Walk;`

2. **Grouped Imports** (3 tests)
   - `use characters::{Alice, Bob};`
   - `use templates::{Child, Adult};`
   - Mixed types in group

3. **Wildcard Imports** (3 tests)
   - `use characters::*;`
   - Multiple wildcards from different modules
   - Wildcard precedence

4. **Nested Modules** (3 tests)
   - `use world::characters::Alice;`
   - `use world::locations::wonderland::Garden;`
   - Deep nesting (5+ levels)

5. **Edge Cases** (3 tests)
   - Import non-existent symbol
   - Circular imports
   - Import from self

**Coverage Target:** 90% of import resolution
### Phase 5 Deliverables

- ✅ Cross-file navigation implemented
- ✅ Use declaration resolution working
- ✅ Workspace NameTable populated
- ✅ 35 new multi-file tests
- ✅ Overall coverage > 85%

---
## Test Execution Strategy

### Daily Test Runs

```bash
# Quick sanity check (< 5 seconds)
cargo test --lib lsp -- --test-threads=1 --nocapture

# Full test suite (< 2 minutes)
cargo test --lib lsp

# With coverage (< 5 minutes)
cargo tarpaulin --lib --out Html --output-dir coverage

# Tree-sitter tests (< 10 seconds)
cd tree-sitter-storybook && npm test
```

### Weekly Test Runs

```bash
# Performance benchmarks (~10 minutes)
cargo bench --bench lsp_benchmarks

# Memory profiling (~15 minutes)
./scripts/memory_profile.sh

# Stress tests (~20 minutes)
cargo test --lib lsp --release -- stress_tests
```
### Pre-Release Test Runs

```bash
# Full suite with coverage
cargo tarpaulin --lib --out Html --output-dir coverage

# Performance regression check
cargo bench --bench lsp_benchmarks -- --save-baseline main

# Integration tests
cargo test --test '*'

# E2E tests (manual)
# 1. Build Zed extension
# 2. Test in Zed editor
# 3. Validate all LSP features work
```

---
## Coverage Measurement

### Tools

1. **cargo-tarpaulin** - Code coverage for Rust

   ```bash
   cargo install cargo-tarpaulin
   cargo tarpaulin --lib --out Html --output-dir coverage
   ```

2. **cargo-llvm-cov** - Alternative coverage tool

   ```bash
   cargo install cargo-llvm-cov
   cargo llvm-cov --html --lib
   ```

### Coverage Targets by Phase

| Phase | Target | Focus |
|-------|--------|-------|
| Phase 1 | 60% | Existing tests |
| Phase 2 | 80% | Feature coverage |
| Phase 3 | 82% | Integration |
| Phase 4 | 83% | Automation |
| Phase 5 | 85% | Multi-file |

### Coverage Exemptions

- Generated code (tree-sitter bindings)
- Test utilities and fixtures
- Deprecated code paths
- Unreachable error handling (panic paths)

---
## Risk Management

### High-Risk Areas

1. **Concurrency Bugs**
   - **Risk:** Deadlocks, race conditions in RwLock usage
   - **Mitigation:** Dedicated concurrency test suite, stress testing
   - **Detection:** Thread sanitizer, long-running tests

2. **Performance Regressions**
   - **Risk:** Features slow down with large files/workspaces
   - **Mitigation:** Benchmark suite, performance tracking in CI
   - **Detection:** Criterion benchmarks, regression alerts

3. **Memory Leaks**
   - **Risk:** Documents not cleaned up, symbol table grows unbounded
   - **Mitigation:** Memory profiling, leak detection tools
   - **Detection:** Valgrind, long-running session tests

4. **Cross-File State Corruption**
   - **Risk:** NameTable inconsistencies, stale file references
   - **Mitigation:** Workspace state validation tests
   - **Detection:** Integration tests, state invariant checks

### Medium-Risk Areas

1. **Unicode Handling**
   - **Risk:** Position calculations off by one with multi-byte chars
   - **Mitigation:** Comprehensive unicode test suite
   - **Detection:** Unicode-heavy test files

2. **Error Recovery**
   - **Risk:** Server crashes on malformed input
   - **Mitigation:** Fuzz testing, malformed input tests
   - **Detection:** Error recovery test suite

3. **Zed Integration**
   - **Risk:** LSP features not working in actual editor
   - **Mitigation:** E2E testing in Zed
   - **Detection:** Manual testing, user feedback

---
## Test Maintenance

### Test Naming Convention

```rust
// Pattern: test_<feature>_<scenario>_<expected_result>
#[test]
fn test_goto_definition_character_in_same_file() { ... }

#[test]
fn test_completion_field_after_colon() { ... }

#[test]
fn test_diagnostics_parse_error_missing_brace() { ... }
```

### Test Organization

```
src/lsp/
├── *_tests.rs            # Feature-specific unit tests
├── integration_tests.rs  # Multi-feature integration
├── stress_tests.rs       # Performance and scale
└── cross_file_tests.rs   # Multi-file scenarios
```

### Test Documentation

- Each test file has a module-level doc comment explaining its scope
- Each test has a doc comment explaining its scenario
- Complex tests have inline comments for key assertions

### Test Refactoring Guidelines

- Extract common test fixtures to `test_utils.rs`
- Use the builder pattern for complex test data
- Keep tests independent (no shared mutable state)
- Aim for tests < 50 lines each

---
## Appendix A: Test Counting Summary

### Current Tests (Phase 1)

- behavior_tests.rs: 7
- code_actions_tests.rs: 27
- completion_tests.rs: 10
- diagnostics_tests.rs: 20
- document_edge_tests.rs: 17
- navigation_tests.rs: 11
- validation_tests.rs: 10
- Integration tests: ~45
- **Total: ~147 tests**

### New Tests by Phase

**Phase 2 (+88 tests):**

- hover_tests.rs: 15
- formatting_tests.rs: 15
- rename_tests.rs: 12
- semantic_tokens_tests.rs: 15
- stress_tests.rs: 10
- document_edge_tests.rs (additions): 8
- error_recovery_tests.rs: 13

**Phase 3 (+33 tests):**

- integration_tests.rs: 25
- concurrency_tests.rs: 8

**Phase 4 (+15 benchmarks):**

- lsp_benchmarks.rs: 15 benchmarks

**Phase 5 (+35 tests):**

- cross_file_tests.rs: 20
- import_tests.rs: 15

**Grand Total: ~303 tests + 15 benchmarks**
## Appendix B: Test Commands Reference

```bash
# === LSP Tests ===

# Run all LSP tests
cargo test --lib lsp

# Run specific test file
cargo test --lib lsp::hover_tests

# Run single test
cargo test --lib lsp::hover_tests::test_hover_on_character

# Run tests with output
cargo test --lib lsp -- --nocapture

# Run tests in single thread (for debugging)
cargo test --lib lsp -- --test-threads=1

# === Tree-sitter Tests ===

# Run all tree-sitter tests
cd tree-sitter-storybook && npm test

# Run with verbose output
cd tree-sitter-storybook && npm test -- --debug

# === Coverage ===

# Generate HTML coverage report
cargo tarpaulin --lib --out Html --output-dir coverage

# Coverage with specific minimum threshold
cargo tarpaulin --lib --fail-under 80

# === Benchmarks ===

# Run all benchmarks
cargo bench --bench lsp_benchmarks

# Run specific benchmark
cargo bench --bench lsp_benchmarks -- parse

# Save baseline
cargo bench --bench lsp_benchmarks -- --save-baseline main

# Compare to baseline
cargo bench --bench lsp_benchmarks -- --baseline main

# === Memory Profiling ===

# Linux (valgrind)
valgrind --leak-check=full --show-leak-kinds=all \
  cargo test --lib lsp

# macOS (instruments)
instruments -t Leaks cargo test --lib lsp

# === Linting ===

# Run clippy
cargo clippy --lib

# Run clippy with strict rules
cargo clippy --lib -- -D warnings

# Format check
cargo fmt --check

# Auto-format
cargo fmt
```

---
## Appendix C: Coverage Report Template

```markdown
# LSP Test Coverage Report

**Date:** YYYY-MM-DD
**Commit:** <git sha>

## Summary

- **Overall Coverage:** XX%
- **Tests Run:** XXX
- **Tests Passed:** XXX
- **Tests Failed:** X
- **Duration:** XX seconds

## Coverage by Module

| Module | Coverage | Lines Covered | Total Lines |
|--------|----------|---------------|-------------|
| server.rs | XX% | XXX / XXX | XXX |
| hover.rs | XX% | XXX / XXX | XXX |
| completion.rs | XX% | XXX / XXX | XXX |
| definition.rs | XX% | XXX / XXX | XXX |
| references.rs | XX% | XXX / XXX | XXX |
| symbols.rs | XX% | XXX / XXX | XXX |
| diagnostics.rs | XX% | XXX / XXX | XXX |
| formatting.rs | XX% | XXX / XXX | XXX |
| rename.rs | XX% | XXX / XXX | XXX |
| code_actions.rs | XX% | XXX / XXX | XXX |
| semantic_tokens.rs | XX% | XXX / XXX | XXX |
| inlay_hints.rs | XX% | XXX / XXX | XXX |

## Uncovered Lines

- server.rs: lines XXX-XXX (error handling)
- completion.rs: line XXX (unreachable branch)

## Recommendations

- [ ] Add tests for uncovered error paths
- [ ] Increase coverage in completion.rs
- [ ] ...
```

---
## Conclusion

This testing strategy provides a **clear roadmap** from the current broken state (compilation errors) to a fully tested, production-ready LSP implementation with 85% coverage. The phased approach allows for **incremental progress** while delivering **immediate value** at each phase:

- **Phase 1** - Unblocks development and restores stability
- **Phase 2** - Expands unit test coverage
- **Phase 3** - Validates integration
- **Phase 4** - Enables automation
- **Phase 5** - Adds multi-file support

By following this strategy, the Storybook LSP will become **robust, maintainable, and reliable** for end users.