Spatial Audio System #4

Open
opened 2025-12-16 21:55:38 +00:00 by siennathesane · 0 comments
siennathesane commented 2025-12-16 21:55:38 +00:00 (Migrated from github.com)

Summary

Implement a 3D spatial audio engine with HRTF-based binaural rendering, distance attenuation, occlusion, and environmental reverb. Create an immersive soundscape that reinforces the sense of place and complements low-poly PBR visuals.

Scope

In Scope

  • HRTF-based binaural audio rendering for realistic 3D positioning
  • Distance-based attenuation with configurable falloff curves
  • Basic occlusion system (raycast-based sound blocking)
  • Environmental reverb zones for different spaces
  • Audio listener component tied to player/camera
  • Audio source component with 3D positioning
  • Integration with Bevy ECS for spatial queries
  • Cross-platform support (macOS, iOS)

Out of Scope

  • Advanced acoustic simulation (wave propagation, diffraction)
  • Real-time DSP effects beyond reverb
  • Audio streaming (initial implementation uses preloaded assets)
  • Dialogue system integration (separate epic)
  • Music system (separate from spatial audio)

Success Criteria

Scenario: 3D positioned sound source
  Given a sound source in 3D space
  And a listener with position and orientation
  When the sound plays
  Then the audio should pan correctly (left/right/behind)
  And distance should affect volume naturally
  And HRTF should provide elevation cues

Scenario: Occlusion affects sound
  Given a sound source behind a wall
  When the listener moves to the other side of the wall
  Then the sound should be muffled/attenuated
  And high frequencies should be reduced more than low frequencies

Scenario: Environmental reverb
  Given the listener enters a large hall
  When sounds play in that space
  Then reverb should reflect the room size and materials
  And the acoustic character should feel appropriate

Features & Tasks

This epic breaks down into the following work items:

Phase 1: Core Spatial Audio Engine

  • Research and select HRTF library (evaluate options: rust-hrtf, others)
  • Implement audio listener component (position, orientation, forward/up vectors)
  • Implement audio source component (3D position, attenuation settings)
  • Create basic 3D panning system
  • Implement distance-based attenuation with falloff curves
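For the basic panning step, a constant-power pan law is the usual starting point before HRTF lands. A minimal sketch in Rust (function name and signature are illustrative, not a settled API):

```rust
/// Map a horizontal angle (radians, 0 = straight ahead, positive = to the
/// right) to (left, right) channel gains using the constant-power pan law.
fn pan_gains(azimuth: f32) -> (f32, f32) {
    use std::f32::consts::FRAC_PI_2;
    // Clamp to the frontal half-plane, then remap [-pi/2, pi/2] -> [0, pi/2].
    let a = azimuth.clamp(-FRAC_PI_2, FRAC_PI_2);
    let t = (a + FRAC_PI_2) / 2.0; // 0 = hard left, pi/2 = hard right
    // cos^2 + sin^2 = 1, so total power stays constant across the pan range.
    (t.cos(), t.sin())
}
```

A centered source gets equal gains of about 0.707 (-3 dB) per channel, which is why constant-power panning avoids the "hole in the middle" that linear panning produces.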

Phase 2: HRTF Integration

  • Integrate HRTF library for binaural rendering
  • Load and process HRTF datasets (e.g., MIT KEMAR)
  • Implement head-related coordinate transformations
  • Test elevation and azimuth accuracy
  • Performance optimization for multiple sources
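The head-related coordinate transform boils down to projecting the source position into the listener's basis and converting to the azimuth/elevation angles an HRTF dataset is indexed by. A sketch (assumes orthonormal forward/up vectors and a Bevy-style -Z forward convention; all names are illustrative):

```rust
/// Convert a world-space source position into listener-relative
/// (azimuth, elevation) in degrees for HRTF lookup.
fn azimuth_elevation(
    listener_pos: [f32; 3],
    forward: [f32; 3], // unit vector, -Z convention in the test below
    up: [f32; 3],      // unit vector, orthogonal to forward
    source_pos: [f32; 3],
) -> (f32, f32) {
    let sub = |a: [f32; 3], b: [f32; 3]| [a[0] - b[0], a[1] - b[1], a[2] - b[2]];
    let dot = |a: [f32; 3], b: [f32; 3]| a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
    let cross = |a: [f32; 3], b: [f32; 3]| [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ];
    let to_src = sub(source_pos, listener_pos);
    let right = cross(forward, up);
    // Project into the listener's head frame: x = right, y = up, z = forward.
    let (x, y, z) = (dot(to_src, right), dot(to_src, up), dot(to_src, forward));
    let azimuth = x.atan2(z).to_degrees(); // positive = source to the right
    let elevation = y.atan2((x * x + z * z).sqrt()).to_degrees(); // positive = above
    (azimuth, elevation)
}
```

Testing "elevation and azimuth accuracy" then reduces to feeding known geometric configurations through this transform and checking the rendered output against the expected HRTF response.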

Phase 3: Occlusion System

  • Implement raycast-based occlusion detection
  • Create occlusion material properties (absorption coefficients)
  • Apply frequency-dependent filtering for occluded sounds
  • Optimize raycasting for multiple audio sources
  • Test with various building/wall configurations

Phase 4: Environmental Reverb

  • Design reverb zone system (room size, RT60, material properties)
  • Implement convolution reverb or algorithmic reverb
  • Create reverb presets (small room, hall, outdoor, cave)
  • Blend reverb when transitioning between zones
  • Optimize reverb processing performance

Phase 5: Bevy ECS Integration

  • Create ECS systems for spatial audio updates
  • Query transforms for listener and sources each frame
  • Handle audio source spawning/despawning
  • Integrate with existing platform audio backends
  • Test with agent simulation (many moving sound sources)
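In engine terms the per-frame update is a Bevy system querying listener and source transforms; stripped of the ECS plumbing, the core of that system looks roughly like this (plain-Rust sketch with illustrative types, not the real components):

```rust
struct Listener { pos: [f32; 3] }
struct Source { pos: [f32; 3], base_volume: f32 }

/// Recompute each source's effective gain from its distance to the
/// listener, using inverse-distance falloff with a 1 m reference and a
/// clamp so gain never exceeds the base volume.
fn update_spatial_gains(listener: &Listener, sources: &[Source]) -> Vec<f32> {
    sources
        .iter()
        .map(|s| {
            let d = (0..3)
                .map(|i| (s.pos[i] - listener.pos[i]).powi(2))
                .sum::<f32>()
                .sqrt();
            s.base_volume / d.max(1.0)
        })
        .collect()
}
```

In the real system the transforms come from the ECS each frame and the resulting gains are pushed to the platform audio backend rather than returned.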

Phase 6: Platform Support & Testing

  • Test on macOS with various audio output devices
  • Test on iOS with headphones vs speakers
  • Verify HRTF works with earbuds/AirPods
  • Performance profiling (CPU usage, latency)
  • Audio latency measurement and optimization

Phase 7: Content Pipeline & Documentation

  • Define audio asset formats and metadata
  • Create tools for audio source placement in scenes
  • Write spatial audio design guidelines
  • Document API for game code
  • Create example scenes demonstrating features

Dependencies

Depends on: None (can be developed in parallel with renderer work)
Blocks: Agent ambient sounds, environmental soundscapes, dialogue system
Related: RFC 0005 - Spatial Audio System Architecture

Background

Aspen is a life simulation game focused on creating a sense of place and immersion. With spatial audio:

  • Presence: 3D audio makes the world feel real and inhabited
  • Spatial Awareness: Players can locate agents and events by sound
  • Emotional Connection: Realistic acoustics deepen the "could live here" feeling
  • Complements Visuals: Low-poly PBR + spatial audio = cohesive immersive aesthetic

Without spatial audio, the game feels flat. With it, the world comes alive.

Technical Notes

HRTF (Head-Related Transfer Function):
HRTFs simulate how sound waves interact with the listener's head, ears, and torso to create realistic 3D positioning. They're essential for elevation cues and behind/front discrimination.

Distance Attenuation Models:

  • Linear: Simple but unnatural
  • Inverse: 1/distance - realistic for point sources
  • Inverse Square: 1/(distance²) - physically accurate in free space
  • Custom curves: Game-tuned for artistic control
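For illustration, the models above can be expressed as a single gain function of distance, normalized to unity at a reference distance (a sketch; the model-selection mechanism and parameter names are illustrative):

```rust
/// Gain for a source at `distance` meters, unity at `reference` meters.
/// `max_dist` only matters for the linear model, where gain reaches zero.
fn attenuation(model: &str, distance: f32, reference: f32, max_dist: f32) -> f32 {
    let d = distance.max(reference); // no amplification inside the reference radius
    match model {
        "linear" => (1.0 - (d - reference) / (max_dist - reference)).clamp(0.0, 1.0),
        "inverse" => reference / d,
        "inverse_square" => (reference / d).powi(2),
        _ => 1.0,
    }
}
```

Note the practical trade-off this makes visible: inverse-square drops to 1/4 gain at twice the reference distance, which is physically accurate but often too aggressive for gameplay, hence the custom curves.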

Occlusion Strategy:
Use raycasts from listener to each audio source. If geometry is hit:

  • Apply low-pass filter (muffle high frequencies)
  • Reduce volume based on material absorption
  • Cache results and update periodically (not every frame)
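The filtering side of that strategy can be as simple as a one-pole low-pass whose cutoff and gain are driven by the hit material's absorption coefficient. A sketch (the mapping constants are illustrative placeholders, to be tuned by ear):

```rust
/// One-pole low-pass filter used to muffle occluded sources.
struct OnePoleLowPass { state: f32, alpha: f32 }

impl OnePoleLowPass {
    /// `alpha` in (0, 1]: 1.0 = bypass, smaller = darker sound.
    fn new(alpha: f32) -> Self { Self { state: 0.0, alpha } }
    fn process(&mut self, sample: f32) -> f32 {
        self.state += self.alpha * (sample - self.state);
        self.state
    }
}

/// Map an absorption coefficient (0 = acoustically transparent,
/// 1 = fully absorbing) to a filter coefficient and a volume multiplier.
fn occlusion_params(absorption: f32) -> (f32, f32) {
    let a = absorption.clamp(0.0, 1.0);
    (1.0 - 0.9 * a, 1.0 - 0.6 * a) // (filter alpha, volume gain)
}
```

Because the raycast result is cached and updated periodically, the target `alpha` should itself be smoothed over a few frames to avoid audible steps when occlusion state flips.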

Reverb Zones:
Define volumes in the world with reverb properties:

  • RT60: Reverberation time (how long sound persists)
  • size: Room dimensions affect early reflections
  • damping: Material absorption (carpet vs concrete)

Blend between zones as the listener moves.
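The zone-transition blend can be a plain parameter crossfade driven by how far the listener has penetrated the new zone. A minimal sketch (struct fields mirror the properties listed above; the linear interpolation is illustrative):

```rust
#[derive(Clone, Copy)]
struct ReverbZone {
    rt60: f32,    // reverberation time in seconds
    size: f32,    // characteristic room dimension in meters
    damping: f32, // 0 = reflective (concrete), 1 = absorbent (carpet)
}

/// Crossfade reverb parameters between two zones; `t` in [0, 1] is the
/// blend factor (0 = fully in `from`, 1 = fully in `to`).
fn blend_zones(from: ReverbZone, to: ReverbZone, t: f32) -> ReverbZone {
    let t = t.clamp(0.0, 1.0);
    let lerp = |a: f32, b: f32| a + (b - a) * t;
    ReverbZone {
        rt60: lerp(from.rt60, to.rt60),
        size: lerp(from.size, to.size),
        damping: lerp(from.damping, to.damping),
    }
}
```

Interpolating the zone parameters and feeding the result to one reverb instance is much cheaper than running two reverbs and crossfading their outputs, at the cost of some accuracy mid-transition.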

Performance Considerations:

  • Limit active audio sources (distance culling, priority system)
  • Update spatial calculations at a lower rate than the render loop (30–60 Hz is fine)
  • Use spatial hashing for efficient queries
  • Profile on iOS to ensure battery/thermal budget
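The source-limiting bullet combines distance culling with a hard voice cap; one simple realization is "keep the N closest sources within range" (a sketch, using distance as the only priority signal, though loudness or gameplay importance could be folded in):

```rust
/// Return the indices of the sources that should stay audible: drop
/// anything beyond `max_dist`, then keep at most `max_voices` of the
/// nearest remaining sources.
fn cull_sources(distances: &[f32], max_dist: f32, max_voices: usize) -> Vec<usize> {
    let mut idx: Vec<usize> = (0..distances.len())
        .filter(|&i| distances[i] <= max_dist)
        .collect();
    // Sort surviving sources nearest-first; NaN distances are excluded above.
    idx.sort_by(|&a, &b| distances[a].partial_cmp(&distances[b]).unwrap());
    idx.truncate(max_voices);
    idx
}
```

To avoid voices audibly popping in and out at the cap boundary, a real implementation would add hysteresis or a short fade when a source gains or loses a voice.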

Priority: High (core immersion feature)
Effort: Extra Large (multi-phase system)
Type: Epic
RFC: RFC 0005 - Spatial Audio System

Reference: studio/marathon#4