📝(docs) add summarization documentation

Add documentation for transcription et summarization
Include sequence diagrams
This commit is contained in:
Martin Guitteny
2025-09-29 15:52:40 +02:00
committed by aleb_the_flash
parent a25baa628a
commit c07b8f920f
3 changed files with 103 additions and 1 deletions

View File

@@ -0,0 +1,4 @@
# Meeting summarization (WIP)
This feature is currently under development and not yet ready for production use. Documentation and detailed instructions will be provided once the feature is stable and officially released.

View File

@@ -0,0 +1,92 @@
# Transcription
La Suite Meet provides a room transcription capability, currently available in beta. This feature is under active development, with ongoing enhancements planned.
The transcription feature enables users to record room sessions. Upon completion of a recording, the room owner receives a notification containing a link to LaSuite Docs, where the transcribed meeting content can be accessed.
> [!NOTE]
> Audio recordings are automatically deleted after the configured `RECORDING_EXPIRATION_DAYS` period.
For configuration and setup details of the recording functionality, refer to the [Recording feature documentation](https://github.com/suitenumerique/meet/blob/main/docs/features/recording.md).
This page only describes the supplementary tools required for audio processing.
Example of a transcript :
```
**SPEAKER_00**: Hello everyone!
**SPEAKER_01**: Yes, it works.
```
### Current Limitations
* Participant identification is not yet implemented; participants are labeled generically (e.g., `PARTICIPANT_1`).
* Transcription backend relies on [WhisperX](https://github.com/m-bain/whisperX), which does not provide an OpenAI-compatible API.
> [!NOTE]
> Questions? Open an issue on [GitHub](https://github.com/suitenumerique/meet/issues/new?assignees=&labels=bug&template=Bug_report.md) or join our [Matrix community](https://matrix.to/#/#meet-official:matrix.org).
## Special requirements
To enable the transcription feature, the following components must be in place:
* Recording feature components: All dependencies and configurations required for the [recording feature](https://github.com/suitenumerique/meet/blob/main/docs/features/recording.md).
* LaSuite Docs instance: A running [LaSuite Docs](https://github.com/suitenumerique/docs) capable of handling requests to the `/create-for-owner` endpoint.
* WhisperX API: A running WhisperX service. An open-source implementation combining WhisperX and FastAPI is available [here](https://github.com/suitenumerique/meet-whisperx).
* Deployment of the [summary service](https://hub.docker.com/r/lasuite/meet-summary), a Celery worker, and a Redis instance.
## How It Works
```mermaid
sequenceDiagram
participant Backend as Backend API
participant Summary as Summary Service
participant Celery as Celery Workers (transcribe-queue)
participant MinIO as MinIO (Object Storage)
participant STT as WhisperX API
participant Docs as LaSuite Docs
Backend->>Summary: POST /api/v1/tasks/ (bearer token, payload)
Note right of Backend: Payload contains 7 params: owner_id, filename, email, sub, room, recording_date, recording_time
Summary->>Celery: Register task (transcribe-queue)
Celery->>MinIO: Fetch audio file
Celery->>STT: Transcribe audio (WhisperX)
STT-->>Celery: Segmented transcript
Celery->>Celery: Format transcript (text)
Celery->>Docs: POST /create-for-owner (title, content, email, sub, api token)
Docs-->>Celery: Acknowledgement
```
## Configuration Options
| Option | Type | Default | Description |
| ------------------------ | --------- |-----------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| app_name | String | `"app"` | Name of the application/service. |
| app_api_v1_str | String | `"/api/v1"` | Base path for the API endpoints. |
| app_api_token | Secret | — | API token for authenticating requests. |
| recording_max_duration | Integer | `None` | Maximum duration of audio recordings in milliseconds. Set to `None` for unlimited. Audio recordings longer than the configured limit will be ignored and not processed. |
| celery_broker_url | String | `"redis://redis/0"` | Celery broker URL. |
| celery_result_backend | String | `"redis://redis/0"` | Celery result backend URL. |
| celery_max_retries | Integer | `1` | Maximum number of retries for Celery tasks. |
| transcribe_queue | String | `"transcribe-queue"` | Name of the Celery queue for transcription tasks. |
| aws_storage_bucket_name | String | — | Name of the S3/MinIO bucket used for storing recordings. |
| aws_s3_endpoint_url | String | — | Endpoint URL of the S3/MinIO storage. |
| aws_s3_access_key_id | String | — | Access key for S3/MinIO. |
| aws_s3_secret_access_key | Secret | — | Secret key for S3/MinIO. |
| aws_s3_secure_access | Boolean | `True` | Use HTTPS for S3/MinIO requests. |
| whisperx_api_key | Secret | — | API key for accessing WhisperX. |
| whisperx_base_url | String | `"https://api.whisperx.com/v1"` | Base URL for the WhisperX API. |
| whisperx_asr_model | String | `"whisper-1"` | ASR model used for transcription. |
| whisperx_max_retries | Integer | `0` | Maximum number of retries for WhisperX API requests. |
| webhook_max_retries | Integer | `2` | Maximum retries for webhook requests. |
| webhook_status_forcelist | List[Int] | `[502, 503, 504]` | HTTP status codes triggering webhook retry. |
| webhook_backoff_factor | Float | `0.1` | Exponential backoff factor for webhook retries. |
| webhook_api_token | Secret | — | Token to authenticate incoming webhook requests. |
| webhook_url | String | — | URL to which webhook events are sent. |
| document_default_title | String | `"Transcription"` | Default title for generated documents. |
| document_title_template | String | `'Réunion "{room}" du {room_recording_date} à {room_recording_time}'` | Template for document title. |
| sentry_is_enabled | Boolean | `False` | Enable or disable Sentry error tracking. |
| sentry_dsn | String | `None` | DSN for Sentry integration. |

View File

@@ -2,7 +2,13 @@
This is an experimental part of the stack. It currently lacks proper observability, unit tests, and other production-grade features. This serves as the base for AI features in Visio.
## Usage
## How it works
Please refer to the [Recording feature documentation](https://github.com/suitenumerique/meet/blob/main/docs/features/recording.md) and the [Transcription feature documentation](https://github.com/suitenumerique/meet/blob/main/docs/features/transcription.md).
## How to develop
(To develop locally follow the instructions on [developing La Suite Meet locally](https://github.com/suitenumerique/meet/blob/main/docs/developping_locally.md))
From the root of the project: