Replace custom Docker Hub authentication with standard, secure,
official GitHub Actions for improved security and maintainability.
Uses officially supported actions that follow security best practices
and receive regular updates from GitHub.
Avoid insecure handling of GitHub secrets.
Implement CI build and push workflow for meet-agents Docker image,
following the same pattern established by the summary image.
Extends CI pipeline to include meet-agents image distribution through
Docker Hub for consistent deployment infrastructure.
Remove default unprivileged Docker user that was incompatible with hot
reloading in the Tilt stack. Update Tilt config to resolve path issues.
CI builds still use unprivileged user, making this change safe while
enabling proper development workflow with hot reloading functionality.
Replace outdated numerique.gouv.fr repository references with current
repository location for accurate documentation and links.
Maintenance cleanup unrelated to current PR but necessary to keep
references up-to-date. Better addressed now than deferred.
Kickstart frontend with first draft of subtitle control visible only
to users with appropriate feature flag enabled.
Opens new container at bottom of screen displaying transcription
segments organized by participant. Transcription segment handling was
heavily LLM-generated and will likely need refactoring and review to
simplify and enhance the implementation.
Initial implementation to begin testing subtitle functionality with
real transcription data from LiveKit agents.
Allow any user, anonymous or authenticated, to start subtitling
in a room only if they are an active participant of it.
Subtitling a room consists of starting the multi-user transcriber agent.
This agent forwards all participants' audio to an STT server and returns
transcription segments for any active voice to the room.
User roles in the backend room system cannot be used
to determine subtitle permissions.
The transcriber agent can be triggered multiple times but will only join a
room once. Uniqueness is managed by the agent itself.
Any user with a valid LiveKit token can initiate subtitles. Feature flag
logic is implemented on the frontend. The frontend ensures the "start
subtitle" action is only available to users who should see it. The backend
does not enforce feature flags in this version.
Authentication in our system does not imply access to a room. The only
valid proof of access is the LiveKit API token issued by the backend.
Security consideration: A LiveKit API token is valid for 6 hours and
cannot be revoked at the end of a meeting. It is important to verify
that the token was issued for the correct room.
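
The room check described above can be sketched as follows. This is a
minimal illustration, not the backend's actual code: it assumes the
token is an HS256-signed JWT whose `video` claim carries `room` and
`roomJoin`, and the helper names (`sign_token`, `token_grants_room`)
are hypothetical. A real backend would more likely rely on a JWT
library or the LiveKit SDK than on hand-rolled verification.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_decode(segment: str) -> bytes:
    # JWT segments are base64url without padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def sign_token(claims: dict, api_secret: str) -> str:
    """Build an HS256 JWT. Only used here to produce a token for the demo."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode()
    signature = hmac.new(api_secret.encode(), signing_input, hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(signature)}"


def token_grants_room(
    token: str, api_secret: str, room_id: str, now: Optional[float] = None
) -> bool:
    """Return True only if the token is authentic, unexpired, and bound to room_id."""
    now = time.time() if now is None else now
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(signature_b64)
    except ValueError:
        # Wrong segment count, bad base64, or invalid JSON.
        return False

    if not isinstance(header, dict) or header.get("alg") != "HS256":
        return False
    if not isinstance(claims, dict):
        return False

    # Check the signature before trusting any claim.
    signing_input = f"{header_b64}.{payload_b64}".encode()
    expected = hmac.new(api_secret.encode(), signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return False

    # Reject expired tokens.
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        return False

    # A valid token is not enough: it must have been issued for this room.
    video = claims.get("video")
    if not isinstance(video, dict):
        return False
    return video.get("roomJoin") is True and video.get("room") == room_id
```

The point of the last check is that a token issued for another room,
even if still within its validity window, does not authorize starting
subtitles in the requested room.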
Calls to the agent dispatch endpoint must be server-initiated. The backend
proxies these calls, as clients cannot securely contact the agent dispatch
endpoint directly (per LiveKit documentation).
Room ID is passed as a query parameter. There is currently no validation
ensuring that the room exists prior to agent dispatch.
TODO: implement validation or error handling for non-existent rooms.
The backend does not forward LiveKit tokens to the agent. Default API
rate limiting is applied to prevent abuse.
Create basic Helm chart for LiveKit agent framework deployment on
Kubernetes, inspired by meet-summary FastAPI server configuration.
Integrate chart into local Tilt development stack and properly handle
certificate issues that typically occur when calling LiveKit server
with nip.io domain names.
Create Python script based on LiveKit's multi-user transcriber example
with enhanced request_fnc handler that ensures job uniqueness by room.
A transcriber sends segments to every participant present in a room and
transcribes every participant's audio. We don't need several
transcribers in the same room. The worker is hidden: by default a
worker uses auto-dispatch and is visible like any other participant,
but a visible transcriber participant would be odd, since no other
videoconference tool treats this feature as a bot joining the call.
Job uniqueness is ensured through the agent identity: each transcriber
gets a deterministic identity derived from its room, so two
transcribers can never join the same room. This may be stricter than
needed, since the API call that lists participants before accepting a
new transcription job should already filter out cases where an agent
is triggered twice.
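
The uniqueness rule can be sketched as below. This is an illustration
under assumptions, not the actual handler: the identity prefix and the
function names are hypothetical, and the participant list is passed in
as plain strings rather than fetched through the LiveKit API.

```python
from typing import Iterable

# Hypothetical prefix; the real agent may build its identity differently.
AGENT_IDENTITY_PREFIX = "transcriber-"


def transcriber_identity(room_name: str) -> str:
    """Derive the same identity every time for a given room."""
    return f"{AGENT_IDENTITY_PREFIX}{room_name}"


def should_accept_job(room_name: str, participant_identities: Iterable[str]) -> bool:
    """Accept a transcription job only if the room has no transcriber yet."""
    return transcriber_identity(room_name) not in set(participant_identities)
```

Because the identity depends only on the room, a second transcriber
for the same room would present an identity that is already in use,
which is what makes the participant check a sufficient first filter.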
We chose explicit worker orchestration over auto-dispatch because we
want to keep control of this feature which will be challenging to
scale. LiveKit agent scaling is documented but we need to experiment in
real life situations with their Worker/Job mechanism.
Currently uses Deepgram since Arnaud's draft Kyutai plugin isn't ready
for production. This allows our ops team to advance on deploying and
monitoring agents. Deepgram was an arbitrary choice that offers 200
free hours, though it only works for English. The ASR provider needs
to be refactored into a pluggable system selectable through
environment variables or settings.
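
One possible shape for that pluggable selection is sketched below. It
is not implemented in this commit. The environment variable names
(`STT_PROVIDER`, `STT_LANGUAGE`), the registry, and the `SttSettings`
placeholder are all assumptions made for illustration; real factories
would return the provider's STT plugin instance instead.

```python
import os
from dataclasses import dataclass
from typing import Callable, Dict, Mapping, Optional


@dataclass(frozen=True)
class SttSettings:
    """Placeholder for whatever a real STT plugin would be built from."""

    provider: str
    language: str


# Registry of provider name -> factory (hypothetical entries).
STT_FACTORIES: Dict[str, Callable[[str], SttSettings]] = {
    "deepgram": lambda language: SttSettings("deepgram", language),
    "kyutai": lambda language: SttSettings("kyutai", language),
}


def build_stt(env: Optional[Mapping[str, str]] = None) -> SttSettings:
    """Select the STT provider from environment-style settings."""
    env = os.environ if env is None else env
    provider = env.get("STT_PROVIDER", "deepgram").strip().lower()
    language = env.get("STT_LANGUAGE", "en")
    try:
        factory = STT_FACTORIES[provider]
    except KeyError:
        known = ", ".join(sorted(STT_FACTORIES))
        raise ValueError(
            f"Unknown STT provider {provider!r}; expected one of: {known}"
        ) from None
    return factory(language)
```

Failing fast on an unknown provider name keeps a misconfigured
deployment from silently falling back to a different STT service.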
Agent dispatch will be triggered via a new REST API endpoint on our
backend. This is a first, naive version of a minimal dockerized
LiveKit agent to start experimenting with the framework.
Leverage reference to initial processor choice to prevent unnecessary
preview track recreation when updating processor options.
Improves performance by maintaining existing track instance during
processor updates instead of creating new tracks, eliminating visual
interruptions and reducing resource overhead.
Replace multiple processor wrappers with single unified class that
enables seamless transformer switching and option updates without
visual blinking artifacts.
Leverages LiveKit track processor v0.6.0 updateTransformerOptions fix
to provide smooth transitions between transformer types, eliminating
the recreation-based approach that caused flickering during effects
switching.
Streamline processor factory logic to prepare for unified transformer
class refactoring.
Reduces complexity and establishes foundation for consolidated
transformer approach.
Update LiveKit track processor to version 0.6.0 which includes fix for
updateTransformerOptions allowing seamless switching between transformer
types without visual artifacts.
Eliminates weird flickering behavior when users select different
transformer types by enabling proper transformer transitions instead of
recreation, improving user experience during effects switching.
Remove call to generate demo data in the Tilt stack as it was never
useful to developers and only complicated the migration job unnecessarily.
Migration job should be laser-focused on applying database migrations
rather than seeding mock data, improving clarity and reducing
complexity.
Replace mock Django secret key with longer version to resolve security
warnings in development stack.
Still not production-suitable as key remains versioned in repository,
but eliminates security warnings during development workflow.
Remove dependencies on bitnami Helm charts since recent changes in
bitnami organization led to charts no longer being maintained or
published.
Enhanced the Tilt dependencies to avoid any bootstrap or refresh
errors while developing using the Tilt stack.
Making components depend on each other slightly increases
the time required to spin up the stack the first time.
Implement pip dependency caching across all CI jobs requiring package
installation and upgrade actions/setup-python from v4 to v5.
The setup-python action can cache dependencies and reuse the cache as
long as the pyproject file has not changed. It is easy to set up: only
the package manager in use has to be declared in the cache settings.
Introduce cross icon to switch component when in disabled/negative
state to provide clearer visual feedback to users.
Improves component usability by making the negative state more
explicitly recognizable through visual indicators.
Replace settings context provider with valtio global store for easier
access outside room components and better long-term maintainability.
Prepares for upcoming prejoin screen settings access by making settings
globally available without React context limitations.
Document that toggleButtonProps are intended to override default and
computed values within ToggleComponent, acknowledging this breaks
encapsulation but serves as useful starting point.
Skip state updates when selected device hasn't actually changed to
prevent unnecessary re-renders that caused visible camera track
blinking.
Improves user experience by eliminating visual artifacts during device
selection interactions when no actual change occurs.
Enhance toggle naming in video controls to explicitly indicate special
processor handling functionality and improve toggleProps TypeScript
typing.
Makes code more self-documenting by clearly identifying processor-aware
toggle behavior while strengthening type safety.
Correct copy-paste error in audio track dynamic initialization
that was missed during previous PR merge process.
Fixes initialization logic that was accidentally duplicated or
incorrectly modified during merge conflict resolution.
Restore participant name display in transcription and recording toast
notifications that was accidentally removed in recent changes.
Simple regression fix to ensure proper participant identification in
notification messages.
Close device control popover automatically when user opens sidepanels
or external dialogs to prevent confusing UI state.
Improves focus management by ensuring only one interface element
demands user attention at a time, reducing cognitive load during
interactions.
Fix bug where device toggling shortcuts remained active despite lacking
permissions, by disabling device-related shortcuts until permissions
are granted.
Prevents confusing user experience where shortcuts appear to work but
have no effect due to missing media permissions.
Convert audio tab device selections to controlled behavior matching
video tab implementation for consistency.
Maintains current component structure without migrating to SelectDevice
component yet, focusing on controlled state pattern alignment first.
Provide direct access to background and effects options from video
device controls during conference for additional user convenience.
Creates another pathway to effects configuration, giving users more
flexibility in accessing video enhancement features while in meetings.
Update tooltip and aria-label text for in-room device controls to
indicate they now open comprehensive settings dialog instead of simple
device selection.
Enable opening settings dialog directly from device controls while
inside a conference for quick access to device configuration.
Improves UX by providing immediate settings access, enhancing
convenience during meetings.
Requested by users.
Update localization keys for device toggling and selection to be more
generic, enabling translation sharing between join and room contexts.
Eliminates duplicate translations and creates consistent messaging for
device interactions regardless of application context.
Replace separate prejoin and room toggle components with unified
component that's adaptable and easier to evolve without overfitting.
Adds responsibilities to join component but eliminates duplication. Join
component needs future refactoring as complexity is growing
significantly.
Remove ugly toggle device configuration and implement hook to determine
appropriate keyboard shortcuts based on media device kind.
Cleaner approach that encapsulates shortcut logic in reusable hook
instead of scattered configuration objects.
Create simple hook to assign icons to toggle/select components based on
media kind using dictionary lookup for optimization.
Eliminates duplicate icon assignment logic across components with
straightforward, performant implementation that's easy to maintain.
Create hook to encapsulate permission denied/prompted/loading checks
based on media kind, eliminating props drilling and simplifying code.
Returns appropriate permission state for consuming components based on
media type, cleaning up code structure with small enhancement.
Refactor device selection within rooms and add audio output selection
to audio controls as requested by users.
Ensures code reuse between join and room components by sharing device
selection logic across both contexts.
Temporary state separating audio and video controls to improve clarity
and prepare for device selection/toggle component reorganization.
Work in progress to better structure device-related components before
implementing final unified control architecture.
Add dark variant to Select component following same approach as Popover
primitive. Same design inconsistency as other variant patterns.
Quick implementation pending UI v2 refactoring for unified variant
system across all components.
Add placement prop to Popover primitive to leverage React Aria's
explicit placement control functionality.
Provides better positioning control for popovers by exposing underlying
React Aria placement options, enabling more precise UI layouts.