From c07b8f920fb00f1badd7a1d94a85545c290c3146 Mon Sep 17 00:00:00 2001
From: Martin Guitteny <“martin.guitteny@centralesupelec.fr”>
Date: Mon, 29 Sep 2025 15:52:40 +0200
Subject: [PATCH] =?UTF-8?q?=F0=9F=93=9D(docs)=20add=20summarization=20docu?=
 =?UTF-8?q?mentation?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Add documentation for transcription and summarization
Include sequence diagrams
---
 docs/features/summarization.md |  4 ++
 docs/features/transcription.md | 92 ++++++++++++++++++++++++++++++++++
 src/summary/README.md          |  8 ++-
 3 files changed, 103 insertions(+), 1 deletion(-)
 create mode 100644 docs/features/summarization.md
 create mode 100644 docs/features/transcription.md

diff --git a/docs/features/summarization.md b/docs/features/summarization.md
new file mode 100644
index 00000000..be14b2f8
--- /dev/null
+++ b/docs/features/summarization.md
@@ -0,0 +1,4 @@
+
+# Meeting summarization (WIP)
+
+This feature is currently under development and not yet ready for production use. Documentation and detailed instructions will be provided once the feature is stable and officially released.
diff --git a/docs/features/transcription.md b/docs/features/transcription.md
new file mode 100644
index 00000000..e5533795
--- /dev/null
+++ b/docs/features/transcription.md
@@ -0,0 +1,92 @@
+# Transcription
+
+La Suite Meet provides a room transcription capability, currently available in beta. This feature is under active development, with ongoing enhancements planned.
+
+The transcription feature enables users to record room sessions. Upon completion of a recording, the room owner receives a notification containing a link to LaSuite Docs, where the transcribed meeting content can be accessed.
+
+> [!NOTE]
+> Audio recordings are automatically deleted after the configured `RECORDING_EXPIRATION_DAYS` period.
+
+For configuration and setup details of the recording functionality, refer to the [Recording feature documentation](https://github.com/suitenumerique/meet/blob/main/docs/features/recording.md).
+This page only describes the supplementary tools required for audio processing.
+
+Example of a transcript:
+
+```
+**SPEAKER_00**: Hello everyone!
+**SPEAKER_01**: Yes, it works.
+```
+
+
+## Current Limitations
+
+* Participant identification is not yet implemented; speakers are labeled generically (e.g., `SPEAKER_00`).
+* The transcription backend relies on [WhisperX](https://github.com/m-bain/whisperX), which does not provide an OpenAI-compatible API.
+
+> [!NOTE]
+> Questions? Open an issue on [GitHub](https://github.com/suitenumerique/meet/issues/new?assignees=&labels=bug&template=Bug_report.md) or join our [Matrix community](https://matrix.to/#/#meet-official:matrix.org).
+
+## Special requirements
+
+To enable the transcription feature, the following components must be in place:
+
+* Recording feature components: All dependencies and configurations required for the [recording feature](https://github.com/suitenumerique/meet/blob/main/docs/features/recording.md).
+* LaSuite Docs instance: A running [LaSuite Docs](https://github.com/suitenumerique/docs) capable of handling requests to the `/create-for-owner` endpoint.
+* WhisperX API: A running WhisperX service. An open-source implementation combining WhisperX and FastAPI is available [here](https://github.com/suitenumerique/meet-whisperx).
+* Summary service: A deployment of the [summary service](https://hub.docker.com/r/lasuite/meet-summary), together with a Celery worker and a Redis instance.
+
+## How It Works
+
+```mermaid
+sequenceDiagram
+  participant Backend as Backend API
+  participant Summary as Summary Service
+  participant Celery as Celery Workers (transcribe-queue)
+  participant MinIO as MinIO (Object Storage)
+  participant STT as WhisperX API
+  participant Docs as LaSuite Docs
+
+  Backend->>Summary: POST /api/v1/tasks/ (bearer token, payload)
+  Note right of Backend: Payload contains 7 params: owner_id, filename, email, sub, room, recording_date, recording_time
+
+  Summary->>Celery: Register task (transcribe-queue)
+  Celery->>MinIO: Fetch audio file
+  Celery->>STT: Transcribe audio (WhisperX)
+  STT-->>Celery: Segmented transcript
+
+  Celery->>Celery: Format transcript (text)
+
+  Celery->>Docs: POST /create-for-owner (title, content, email, sub, api token)
+  Docs-->>Celery: Acknowledgement
+```
+
+## Configuration Options
+
+| Option                   | Type      | Default                                                               | Description                                                                                                                                                             |
+| ------------------------ | --------- |-----------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| app_name                 | String    | `"app"`                                                               | Name of the application/service.                                                                                                                                        |
+| app_api_v1_str           | String    | `"/api/v1"`                                                           | Base path for the API endpoints.                                                                                                                                        |
+| app_api_token            | Secret    | —                                                                     | API token for authenticating requests.                                                                                                                                  |
+| recording_max_duration   | Integer   | `None`                                                                | Maximum duration of audio recordings in milliseconds. Set to `None` for unlimited. Audio recordings longer than the configured limit will be ignored and not processed. |
+| celery_broker_url        | String    | `"redis://redis/0"`                                                   | Celery broker URL.                                                                                                                                                      |
+| celery_result_backend    | String    | `"redis://redis/0"`                                                   | Celery result backend URL.                                                                                                                                              |
+| celery_max_retries       | Integer   | `1`                                                                   | Maximum number of retries for Celery tasks.                                                                                                                             |
+| transcribe_queue         | String    | `"transcribe-queue"`                                                  | Name of the Celery queue for transcription tasks.                                                                                                                       |
+| aws_storage_bucket_name  | String    | —                                                                     | Name of the S3/MinIO bucket used for storing recordings.                                                                                                                |
+| aws_s3_endpoint_url      | String    | —                                                                     | Endpoint URL of the S3/MinIO storage.                                                                                                                                   |
+| aws_s3_access_key_id     | String    | —                                                                     | Access key for S3/MinIO.                                                                                                                                                |
+| aws_s3_secret_access_key | Secret    | —                                                                     | Secret key for S3/MinIO.                                                                                                                                                |
+| aws_s3_secure_access     | Boolean   | `True`                                                                | Use HTTPS for S3/MinIO requests.                                                                                                                                        |
+| whisperx_api_key         | Secret    | —                                                                     | API key for accessing WhisperX.                                                                                                                                         |
+| whisperx_base_url        | String    | `"https://api.whisperx.com/v1"`                                       | Base URL for the WhisperX API.                                                                                                                                          |
+| whisperx_asr_model       | String    | `"whisper-1"`                                                         | ASR model used for transcription.                                                                                                                                       |
+| whisperx_max_retries     | Integer   | `0`                                                                   | Maximum number of retries for WhisperX API requests.                                                                                                                    |
+| webhook_max_retries      | Integer   | `2`                                                                   | Maximum retries for webhook requests.                                                                                                                                   |
+| webhook_status_forcelist | List[Int] | `[502, 503, 504]`                                                     | HTTP status codes triggering webhook retry.                                                                                                                             |
+| webhook_backoff_factor   | Float     | `0.1`                                                                 | Exponential backoff factor for webhook retries.                                                                                                                         |
+| webhook_api_token        | Secret    | —                                                                     | Token to authenticate incoming webhook requests.                                                                                                                        |
+| webhook_url              | String    | —                                                                     | URL to which webhook events are sent.                                                                                                                                   |
+| document_default_title   | String    | `"Transcription"`                                                     | Default title for generated documents.                                                                                                                                  |
+| document_title_template  | String    | `'Réunion "{room}" du {room_recording_date} à {room_recording_time}'` | Template for document title.                                                                                                                                            |
+| sentry_is_enabled        | Boolean   | `False`                                                               | Enable or disable Sentry error tracking.                                                                                                                                |
+| sentry_dsn               | String    | `None`                                                                | DSN for Sentry integration.                                                                                                                                             |
diff --git a/src/summary/README.md b/src/summary/README.md
index 3cea22ce..34246255 100644
--- a/src/summary/README.md
+++ b/src/summary/README.md
@@ -2,7 +2,13 @@
 
 This is an experimental part of the stack.  It currently lacks proper observability, unit tests, and other production-grade features. This serves as the base for AI features in Visio.
 
-## Usage
+## How it works
+
+Please refer to the [Recording feature documentation](https://github.com/suitenumerique/meet/blob/main/docs/features/recording.md) and the [Transcription feature documentation](https://github.com/suitenumerique/meet/blob/main/docs/features/transcription.md).
+
+## How to develop
+
+To develop locally, follow the instructions on [developing La Suite Meet locally](https://github.com/suitenumerique/meet/blob/main/docs/developping_locally.md).
 
 From the root of the project: