meet/docs/features/transcription.md
Martin Guitteny c07b8f920f 📝(docs) add summarization documentation
Add documentation for transcription et summarization
Include sequence diagrams
2025-10-06 14:53:56 +02:00

Transcription

La Suite Meet provides a room transcription capability, currently available in beta. This feature is under active development, with ongoing enhancements planned.

The transcription feature lets users record room sessions. When a recording completes, the room owner receives a notification with a link to LaSuite Docs, where the meeting transcript can be accessed.

Note

Audio recordings are automatically deleted after the configured RECORDING_EXPIRATION_DAYS period.

For configuration and setup details of the recording functionality, refer to the Recording feature documentation. This page only describes the supplementary tools required for audio processing.

Example of a transcript:

**SPEAKER_00**: Hello everyone!
**SPEAKER_01**: Yes, it works.
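
The layout above can be produced from speaker-labeled segments. The following is a minimal sketch, not the service's actual formatter: it assumes each segment is a dict with hypothetical `speaker` and `text` keys (the real WhisperX response shape may differ), and it merges consecutive segments from the same speaker as an illustrative design choice.

```python
def format_transcript(segments):
    """Render speaker-labeled segments as '**SPEAKER**: text' lines.

    Assumption: each segment is a dict with 'speaker' and 'text' keys.
    """
    lines = []
    previous_speaker = None
    for segment in segments:
        text = segment.get("text", "").strip()
        if not text:
            continue  # skip segments without any text
        # Hypothetical fallback label for segments without a speaker.
        speaker = segment.get("speaker", "UNKNOWN_SPEAKER")
        if speaker == previous_speaker:
            # Same speaker as the previous line: extend that line.
            lines[-1] += " " + text
        else:
            lines.append(f"**{speaker}**: {text}")
            previous_speaker = speaker
    return "\n".join(lines)


segments = [
    {"speaker": "SPEAKER_00", "text": " Hello everyone!"},
    {"speaker": "SPEAKER_01", "text": " Yes, it works."},
]
print(format_transcript(segments))
```

Merging consecutive segments keeps the resulting document readable when one speaker talks across several short segments.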

Current Limitations

  • Participant identification is not yet implemented; participants are labeled generically (e.g., SPEAKER_00, as in the example above).
  • Transcription backend relies on WhisperX, which does not provide an OpenAI-compatible API.

Note

Questions? Open an issue on GitHub or join our Matrix community.

Special requirements

To enable the transcription feature, the following components must be in place:

  • Recording feature components: All dependencies and configurations required for the recording feature.
  • LaSuite Docs instance: A running LaSuite Docs capable of handling requests to the /create-for-owner endpoint.
  • WhisperX API: A running WhisperX service. An open-source implementation combining WhisperX and FastAPI is available here.
  • Summary service: A deployed summary service, together with a Celery worker and a Redis instance.

How It Works

sequenceDiagram
  participant Backend as Backend API
  participant Summary as Summary Service
  participant Celery as Celery Workers (transcribe-queue)
  participant MinIO as MinIO (Object Storage)
  participant STT as WhisperX API
  participant Docs as LaSuite Docs

  Backend->>Summary: POST /api/v1/tasks/ (bearer token, payload)
  Note right of Backend: Payload contains 7 params: owner_id, filename, email, sub, room, recording_date, recording_time

  Summary->>Celery: Register task (transcribe-queue)
  Celery->>MinIO: Fetch audio file
  Celery->>STT: Transcribe audio (WhisperX)
  STT-->>Celery: Segmented transcript

  Celery->>Celery: Format transcript (text)

  Celery->>Docs: POST /create-for-owner (title, content, email, sub, api token)
  Docs-->>Celery: Acknowledgement
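
The first step of the diagram, registering a task with the summary service, can be sketched as follows. This is an illustration under stated assumptions rather than the backend's actual client code: it assumes the payload is sent as JSON, the host name and all field values are placeholders, and the request is only built, not sent.

```python
import json
import urllib.request


def build_task_request(base_url, api_token, payload):
    """Build (but do not send) the POST request that registers a task."""
    body = json.dumps(payload).encode("utf-8")  # assumption: JSON body
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/tasks/",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
    )


# The seven parameters listed in the diagram; every value is a placeholder.
payload = {
    "owner_id": "owner-123",
    "filename": "recordings/example.ogg",
    "email": "owner@example.com",
    "sub": "subject-123",
    "room": "weekly-sync",
    "recording_date": "2025-10-06",
    "recording_time": "14:53",
}
request = build_task_request("http://summary.example", "secret-token", payload)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would actually send it; the bearer token would correspond to the `app_api_token` option of the summary service.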

Configuration Options

| Option | Type | Default | Description |
|---|---|---|---|
| app_name | String | "app" | Name of the application/service. |
| app_api_v1_str | String | "/api/v1" | Base path for the API endpoints. |
| app_api_token | Secret | | API token for authenticating requests. |
| recording_max_duration | Integer | None | Maximum duration of audio recordings in milliseconds. Set to None for unlimited. Audio recordings longer than the configured limit will be ignored and not processed. |
| celery_broker_url | String | "redis://redis/0" | Celery broker URL. |
| celery_result_backend | String | "redis://redis/0" | Celery result backend URL. |
| celery_max_retries | Integer | 1 | Maximum number of retries for Celery tasks. |
| transcribe_queue | String | "transcribe-queue" | Name of the Celery queue for transcription tasks. |
| aws_storage_bucket_name | String | | Name of the S3/MinIO bucket used for storing recordings. |
| aws_s3_endpoint_url | String | | Endpoint URL of the S3/MinIO storage. |
| aws_s3_access_key_id | String | | Access key for S3/MinIO. |
| aws_s3_secret_access_key | Secret | | Secret key for S3/MinIO. |
| aws_s3_secure_access | Boolean | True | Use HTTPS for S3/MinIO requests. |
| whisperx_api_key | Secret | | API key for accessing WhisperX. |
| whisperx_base_url | String | "https://api.whisperx.com/v1" | Base URL for the WhisperX API. |
| whisperx_asr_model | String | "whisper-1" | ASR model used for transcription. |
| whisperx_max_retries | Integer | 0 | Maximum number of retries for WhisperX API requests. |
| webhook_max_retries | Integer | 2 | Maximum retries for webhook requests. |
| webhook_status_forcelist | List[Int] | [502, 503, 504] | HTTP status codes triggering webhook retry. |
| webhook_backoff_factor | Float | 0.1 | Exponential backoff factor for webhook retries. |
| webhook_api_token | Secret | | Token to authenticate incoming webhook requests. |
| webhook_url | String | | URL to which webhook events are sent. |
| document_default_title | String | "Transcription" | Default title for generated documents. |
| document_title_template | String | 'Réunion "{room}" du {room_recording_date} à {room_recording_time}' | Template for document title. |
| sentry_is_enabled | Boolean | False | Enable or disable Sentry error tracking. |
| sentry_dsn | String | None | DSN for Sentry integration. |
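
To illustrate how a few of these options might interact, here is a minimal sketch using a plain dataclass with the documented defaults. It is not the service's settings class. The helper names are hypothetical, the fallback to `document_default_title` when a field is missing is an assumption, and so is the backoff formula `factor * 2 ** attempt`, which is a common convention rather than confirmed behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Settings:
    """Subset of the options above, with their documented defaults."""

    recording_max_duration: Optional[int] = None  # milliseconds; None = unlimited
    webhook_max_retries: int = 2
    webhook_backoff_factor: float = 0.1
    document_default_title: str = "Transcription"
    document_title_template: str = (
        'Réunion "{room}" du {room_recording_date} à {room_recording_time}'
    )


def is_within_limit(settings, duration_ms):
    """Return True when a recording of this length would be processed."""
    if settings.recording_max_duration is None:
        return True  # no limit configured
    return duration_ms <= settings.recording_max_duration


def build_title(settings, room=None, recording_date=None, recording_time=None):
    """Fill the title template, or fall back to the default title.

    Assumption: the default title is used when any field is missing.
    """
    if not (room and recording_date and recording_time):
        return settings.document_default_title
    return settings.document_title_template.format(
        room=room,
        room_recording_date=recording_date,
        room_recording_time=recording_time,
    )


def retry_delays(settings):
    """Delays in seconds before each webhook retry (assumed formula)."""
    return [
        settings.webhook_backoff_factor * 2 ** attempt
        for attempt in range(settings.webhook_max_retries)
    ]


settings = Settings(recording_max_duration=3_600_000)  # one hour
print(is_within_limit(settings, 5_400_000))
print(build_title(settings, "weekly-sync", "2025-10-06", "14:53"))
print(retry_delays(settings))
```

Under these assumptions, a 90-minute recording is rejected by a one-hour limit, and the two default webhook retries wait 0.1 and 0.2 seconds respectively.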