Files

Martin Guitteny c07b8f920f 📝(docs) add summarization documentation

Add documentation for transcription et summarization
Include sequence diagrams

2025-10-06 14:53:56 +02:00

11 KiB

Raw Blame History

Transcription

La Suite Meet provides a room transcription capability, currently available in beta. This feature is under active development, with ongoing enhancements planned.

The transcription feature enables users to record room sessions. Upon completion of a recording, the room owner receives a notification containing a link to LaSuite Docs, where the transcribed meeting content can be accessed.

Note

Audio recordings are automatically deleted after the configured RECORDING_EXPIRATION_DAYS period.

For configuration and setup details of the recording functionality, refer to the Recording feature documentation. This page only describes the supplementary tools required for audio processing.

Example of a transcript :

**SPEAKER_00**: Hello everyone!
**SPEAKER_01**: Yes, it works.

Current Limitations

Participant identification is not yet implemented; participants are labeled generically (e.g., PARTICIPANT_1).
Transcription backend relies on WhisperX, which does not provide an OpenAI-compatible API.

Note

Questions? Open an issue on GitHub or join our Matrix community.

Special requirements

To enable the transcription feature, the following components must be in place:

Recording feature components: All dependencies and configurations required for the recording feature.
LaSuite Docs instance: A running LaSuite Docs capable of handling requests to the /create-for-owner endpoint.
WhisperX API: A running WhisperX service. An open-source implementation combining WhisperX and FastAPI is available here.
Deployment of the summary service, a Celery worker, and a Redis instance.

How It Works

sequenceDiagram
  participant Backend as Backend API
  participant Summary as Summary Service
  participant Celery as Celery Workers (transcribe-queue)
  participant MinIO as MinIO (Object Storage)
  participant STT as WhisperX API
  participant Docs as LaSuite Docs

  Backend->>Summary: POST /api/v1/tasks/ (bearer token, payload)
  Note right of Backend: Payload contains 7 params: owner_id, filename, email, sub, room, recording_date, recording_time

  Summary->>Celery: Register task (transcribe-queue)
  Celery->>MinIO: Fetch audio file
  Celery->>STT: Transcribe audio (WhisperX)
  STT-->>Celery: Segmented transcript

  Celery->>Celery: Format transcript (text)

  Celery->>Docs: POST /create-for-owner (title, content, email, sub, api token)
  Docs-->>Celery: Acknowledgement

Configuration Options

Option	Type	Default	Description
app_name	String	`"app"`	Name of the application/service.
app_api_v1_str	String	`"/api/v1"`	Base path for the API endpoints.
app_api_token	Secret	—	API token for authenticating requests.
recording_max_duration	Integer	`None`	Maximum duration of audio recordings in milliseconds. Set to `None` for unlimited. Audio recordings longer than the configured limit will be ignored and not processed.
celery_broker_url	String	`"redis://redis/0"`	Celery broker URL.
celery_result_backend	String	`"redis://redis/0"`	Celery result backend URL.
celery_max_retries	Integer	`1`	Maximum number of retries for Celery tasks.
transcribe_queue	String	`"transcribe-queue"`	Name of the Celery queue for transcription tasks.
aws_storage_bucket_name	String	—	Name of the S3/MinIO bucket used for storing recordings.
aws_s3_endpoint_url	String	—	Endpoint URL of the S3/MinIO storage.
aws_s3_access_key_id	String	—	Access key for S3/MinIO.
aws_s3_secret_access_key	Secret	—	Secret key for S3/MinIO.
aws_s3_secure_access	Boolean	`True`	Use HTTPS for S3/MinIO requests.
whisperx_api_key	Secret	—	API key for accessing WhisperX.
whisperx_base_url	String	`"https://api.whisperx.com/v1"`	Base URL for the WhisperX API.
whisperx_asr_model	String	`"whisper-1"`	ASR model used for transcription.
whisperx_max_retries	Integer	`0`	Maximum number of retries for WhisperX API requests.
webhook_max_retries	Integer	`2`	Maximum retries for webhook requests.
webhook_status_forcelist	List[Int]	`[502, 503, 504]`	HTTP status codes triggering webhook retry.
webhook_backoff_factor	Float	`0.1`	Exponential backoff factor for webhook retries.
webhook_api_token	Secret	—	Token to authenticate incoming webhook requests.
webhook_url	String	—	URL to which webhook events are sent.
document_default_title	String	`"Transcription"`	Default title for generated documents.
document_title_template	String	`'Réunion "{room}" du {room_recording_date} à {room_recording_time}'`	Template for document title.
sentry_is_enabled	Boolean	`False`	Enable or disable Sentry error tracking.
sentry_dsn	String	`None`	DSN for Sentry integration.

11 KiB Raw Blame History