Introduce ENABLE_EXTERNAL_API setting (defaults to False) to allow
administrators to disable external API endpoints, preventing unintended
exposure for self-hosted instances where such endpoints aren't
needed or desired.
From a security perspective, the list endpoint should be limited to return only
rooms created by the external application. Currently, there is a risk of
exposing public rooms through this endpoint.
I will address this in upcoming commits by updating the room model to track
the source of generation. This will also provide useful information
for analytics.
The API viewset was largely copied and adapted. The serializer was heavily
restricted to return a response more appropriate for external applications,
providing ready-to-use information for their users
(for example, a clickable link).
I plan to extend the room information further, potentially aligning it with the
Google Meet API format. This first draft serves as a solid foundation.
Although scopes for delete and update exist, these methods have not yet been
implemented in the viewset. They will be added in future commits.
Enforce the principle of least privilege by granting viewset permissions only
based on the scopes included in the token.
JWTs should never be issued without controlling which actions the application
is allowed to perform.
The first and minimal scope is to allow creating a room link. Additional actions
on the viewset will only be considered after this baseline scope is in place.
This endpoint does not strictly follow the OAuth2 Machine-to-Machine
specification, as we introduce the concept of user delegation (instead of
using the term impersonation).
Typically, OAuth2 M2M is used only to authenticate a machine in server-to-server
exchanges. In our case, we require external applications to act on behalf of a
user in order to assign room ownership and access.
Since these external applications are not integrated with our authorization
server, a workaround was necessary. We treat the delegated user’s email as a
form of scope and issue a JWT to the application if it is authorized to request
it.
Using the term scope for an email may be confusing, but it remains consistent
with OAuth2 vocabulary and allows for future extension, such as supporting a
proper M2M process without any user delegation.
It is important not to confuse the scope in the request body with the scope in
the generated JWT. The request scope refers to the delegated email, while the
JWT scope defines what actions the external application can perform on our
viewset, matching Django’s viewset method naming.
The viewset currently contains a significant amount of logic. I did not find
a clean way to split it without reducing maintainability, but this can be
reconsidered in the future.
Error messages are intentionally vague to avoid exposing sensitive
information to attackers.
Prepare for the introduction of new endpoints reserved for external
applications. Configure the required router and update the Helm chart to ensure
that the Kubernetes ingress properly routes traffic to these new endpoints.
It is important to support independent versioning of both APIs.
Base route’s name aligns with PR #195 on lasuite/drive, opened by @lunika
We need to integrate with external applications. Objective: enable them to
securely generate room links with proper ownership attribution.
Proposed solution: Following the OAuth2 Machine-to-Machine specification,
we expose an endpoint allowing external applications to exchange a client_id
and client_secret pair for a JWT. This JWT is valid only within a well-scoped,
isolated external API, served through a dedicated viewset.
This commit introduces a model to persist application records in the database.
The main challenge lies in generating a secure client_secret and ensuring
it is properly stored.
The restframework-apikey dependency was discarded, as its approach diverges
significantly from OAuth2. Instead, inspiration was taken from oauthlib and
django-oauth-toolkit. However, their implementations proved either too heavy or
not entirely suitable for the intended use case. To avoid pulling in large
dependencies for minimal utility, the necessary components were selectively
copied, adapted, and improved.
A generic SecretField was introduced, designed for reuse and potentially
suitable for upstream contribution to Django.
Secrets are exposed only once at object creation time in the Django admin.
Once the object is saved, the secret is immediately hashed, ensuring it can
never be retrieved again.
One limitation remains: enforcing client_id and client_secret as read-only
during edits. At object creation, marking them read-only excluded them from
the Django form, which unintentionally regenerated new values.
This area requires further refinement.
The design prioritizes configurability while adhering to the principle of least
privilege. By default, new applications are created without any assigned scopes,
preventing them from performing actions on the API until explicitly configured.
If no domain is specified, domain delegation is not applied, allowing tokens
to be issued for any email domain.
Sadly, we used user db id as the posthog distinct id
of identified user, and not the sub.
Before this commit, we were only passing sub to the
summary microservice.
Add the owner's id. Please note we introduce a different
naming behavir, by prefixing the id with "owner". We didn't
for the sub and the email.
We cannot align sub and email with this new naming approach,
because external contributors have already started building
their own microservice.
Refactor feature flag mechanism from Django permission classes to custom
decorator that returns 404 Not Found when features are disabled instead
of exposing API structure through permission errors.
Improves security by preventing information disclosure about disabled
features and provides more appropriate response semantics. Custom
decorator approach is better suited for feature toggling than Django's
permission system which is designed for authorization.
Configure LiveKit token to explicitly allow users to subscribe to other
participants' video and audio tracks instead of relying on default
permissions.
Replace hardcoded default publishing source constants with values from
Django backend settings to prevent desynchronization between frontend
and backend configurations.
Switch from metadata to attributes when generating LiveKit tokens for
more convenient dict-like structure handling during token creation and
client-side reading.
Attributes provide better data structure flexibility compared to
metadata, simplifying both server-side token generation and client-side
data access patterns.
Refactor client-side LiveKit API calls to server-side endpoints
following LiveKit documentation recommendations for participant
management operations.
Replaces hacky direct client calls with proper backend-mediated
requests, improving security and following official LiveKit
Introduce new method on lobby system to clear lobby cache for specific
room and participant combinations.
Enables targeted cleanup of lobby state when participants leave or are
removed, improving cache management and preventing stale lobby entries.
Refactor lobby system to use consistent UUID v4 across lobby
registration and LiveKit token participant identity instead of
generating separate UUIDs.
Maintains synchronized identifiers between lobby cache and LiveKit
participants, simplifying future participant removal operations by
using the same UUID reference across both systems.
Extend LiveKit token creation utility with additional room configuration
and user role parameters to properly adapt room_admin grants and
publish sources based on permission levels.
This creates technical debt in utility function design that should be
refactored into proper service architecture for token
generation operations in future iterations.
Eliminates code duplication across validation serializers, improving
maintainability and ensuring consistent validation behavior throughout
the API layer.
Allow any user, anonymous or authenticated, to start subtitling
in a room only if they are an active participant of it.
Subtitling a room consists of starting the multi-user transcriber agent.
This agent forwards all participants' audio to an STT server and returns
transcription segments for any active voice to the room.
User roles in the backend room system cannot be used
to determine subtitle permissions.
The transcriber agent can be triggered multiple times but will only join a
room once. Unicity is managed by the agent itself.
Any user with a valid LiveKit token can initiate subtitles. Feature flag
logic is implemented on the frontend. The frontend ensures the "start
subtitle" action is only available to users who should see it. The backend
does not enforce feature flags in this version.
Authentication in our system does not imply access to a room. The only
valid proof of access is the LiveKit API token issued by the backend.
Security consideration: A LiveKit API token is valid for 6 hours and
cannot be revoked at the end of a meeting. It is important to verify
that the token was issued for the correct room.
Calls to the agent dispatch endpoint must be server-initiated. The backend
proxies these calls, as clients cannot securely contact the agent dispatch
endpoint directly (per LiveKit documentation).
Room ID is passed as a query parameter. There is currently no validation
ensuring that the room exists prior to agent dispatch.
TODO: implement validation or error handling for non-existent rooms.
The backend does not forward LiveKit tokens to the agent. Default API
rate limiting is applied to prevent abuse.
The LiveKit API URL is necessary to interact with the API. It uses https
protocol.
Eplicit wss protocol is necessary in Websocket constructor for some
older browsers.
This resolves critical compatibility issues with legacy browsers
(notably Firefox <124, Chrome <125, Edge <125) that lack support
for HTTPS URLs in the WebSocket() constructor. Without explicit WSS
URLs, WebSocket signaling connections may fail, crash, or be blocked
entirely in these environments.
The setting is optional and defaults to the current behavior when
not specified, ensuring zero breaking changes for existing deployments.
Connection warmup wasn't working properly - only works when trying to
establish WebSocket first, then workaround kicks in. Call WebSocket
endpoint without auth info expecting 401 error, but enough to initiate
cache for subsequent WebSocket functionality.
Scope this **dirty** trick to Firefox users only. Haven't figured out
how to detect proxy from JS code simply.
Tested in staging and works on our constrained WiFi.
Implement HTTPS prefetch before joining rooms to resolve WebSocket
handshake failures where Firefox+proxy returns HTTP 200 instead of 101.
Reproduced locally with Squid container. No proxy configuration fixes
found - HTTPS warmup is only working workaround. Issue doesn't occur
when signaling server shares webapp domain, making warmup unnecessary.
Use HEAD request to minimize bandwidth.
Implement method to process egress limit reached events from LiveKit
webhooks for better recording duration management.
Livekit by default is not notifying the participant of a room when
an egress reached its limit. I needed to proxy it through the back.
Create new service to handle recording-related webhooks, starting with
limit reached events. Will expand to enhance UX by notifying backend
of other LiveKit events.
Doesn't fit cleanly with existing recording package - may need broader
redesign. Chose dedicated service over mixing responsibilities.
Move from lobby service to utils for reuse across services. Method is
generic enough for utility status. Future: create dedicated LiveKit
service to encapsulate all LiveKit-related utilities.
Send backend recording duration limit to frontend to display warning
messages when recordings approach or reach maximum allowed length.
This configuration needs to be synced with the egres. I chose to keep
this duration in ms to be consistent with other settings.
Add optional room name, recording time and date to generate better
document names based on user feedback. Template is customizable for
internationalization support.
Implemented a service that automatically creates a SIP dispatch rule when
the first WebRTC participant joins a room and removes it when the room
becomes empty.
Why? I don’t want a SIP participant to join an empty room.
The PIN code could be easily leaked, and there is currently no lobby
mechanism available for SIP participants.
A WebRTC participant is still required to create a room.
This behavior is inspired by a proprietary tool. The service uses LiveKit’s
webhook notification system to react to room lifecycle events. This is
a naive implementation that currently supports only a single SIP trunk and
will require refactoring to support multiple trunks. When no trunk is
specified, rules are created by default on a fallback trunk.
@rouja wrote a minimal Helm chart for LiveKit SIP with Asterisk, which
couldn’t be versioned yet due to embedded credentials. I deployed it
locally and successfully tested the integration with a remote
OVH SIP trunk.
One point to note: LiveKit lacks advanced filtering capabilities when
listing dispatch rules. Their recommendation is to fetch all rules and
filter them within your backend logic. I’ve opened a feature request asking
for at least the ability to filter dispatch rules by room, since filtering
by trunk is already supported, room-based filtering feels like a natural
addition.
Until there's an update, I prefer to keep the implementation simple.
It works well at our current scale, and can be refactored when higher load
or multi-trunk support becomes necessary.
While caching dispatch rule IDs could be a performance optimization,
I feel it would be premature and potentially error-prone due to the complexity
of invalidation. If performance becomes an issue, I’ll consider introducing
caching at that point. To handle the edge case where multiple dispatch rules
with different PIN codes are present, the service performs an extensive
cleanup during room creation to ensure SIP routing remains clean and
predictable. This edge case should not happen.
In the 'delete_dispatch_rule' if deleting one rule fails, method would exit
without deleting the other rules. It's okay IMO for a first iteration.
If multiple dispatch rules are often found for room, I would enhance this part.
Remove assertion statement that was placed after code expected to raise an
exception. The assertion was never evaluated due to the exception flow,
making the test ineffective.
Rework regex pattern to exclude empty string matches since
url_encoded_folder_path is optional.
Add additional test cases covering edge cases and failure
scenarios to improve validation coverage
and prevent false positives.
Replace inverted boolean comparisons (not ... ==) with direct opposite
operators (!=) to improve code readability and reduce unnecessary
complexity in conditional statements.
Handle unhandled exceptions to prevent UX impact. Marketing email operations
are optional and should not disrupt core functionality.
My first implementation was imperfect, raising error in sentry.
Show recording owner(s) directly in admin list interface to speed up
troubleshooting. Previously required clicking into each object to identify
owner. Handles multiple owners (rare) by displaying a default message.