We need to integrate with external applications. Objective: enable them to
securely generate room links with proper ownership attribution.
Proposed solution: Following the OAuth2 Machine-to-Machine specification,
we expose an endpoint allowing external applications to exchange a client_id
and client_secret pair for a JWT. This JWT is valid only within a well-scoped,
isolated external API, served through a dedicated viewset.
This commit introduces a model to persist application records in the database.
The main challenge lies in generating a secure client_secret and ensuring
it is properly stored.
The restframework-apikey dependency was discarded, as its approach diverges
significantly from OAuth2. Instead, inspiration was taken from oauthlib and
django-oauth-toolkit. However, their implementations proved either too heavy or
not entirely suitable for the intended use case. To avoid pulling in large
dependencies for minimal utility, the necessary components were selectively
copied, adapted, and improved.
A generic SecretField was introduced, designed for reuse and potentially
suitable for upstream contribution to Django.
Secrets are exposed only once at object creation time in the Django admin.
Once the object is saved, the secret is immediately hashed, ensuring it can
never be retrieved again.
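
The show-once-then-hash flow described above can be sketched with the standard library alone. This is a minimal illustration, not the actual SecretField code: the helper names (`generate_secret`, `hash_secret`, `verify_secret`), the PBKDF2 parameters and the stored string layout are all assumptions.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # assumed work factor, not taken from the commit


def generate_secret(nbytes: int = 32) -> str:
    """Return a URL-safe random secret, meant to be displayed exactly once."""
    return secrets.token_urlsafe(nbytes)


def hash_secret(secret: str) -> str:
    """Derive a salted PBKDF2 hash; only this string would be persisted."""
    salt = secrets.token_hex(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", secret.encode(), salt.encode(), ITERATIONS
    )
    return f"pbkdf2_sha256${ITERATIONS}${salt}${digest.hex()}"


def verify_secret(candidate: str, stored: str) -> bool:
    """Re-derive the hash from the candidate and compare in constant time."""
    _algorithm, iterations, salt, expected = stored.split("$")
    digest = hashlib.pbkdf2_hmac(
        "sha256", candidate.encode(), salt.encode(), int(iterations)
    )
    return hmac.compare_digest(digest.hex(), expected)
```

Because only the output of `hash_secret` is stored, the plaintext cannot be recovered later. A lost secret has to be regenerated.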
One limitation remains: enforcing client_id and client_secret as read-only
during edits. At object creation, marking them read-only excluded them from
the Django form, which unintentionally regenerated new values.
This area requires further refinement.
The design prioritizes configurability while adhering to the principle of least
privilege. By default, new applications are created without any assigned scopes,
preventing them from performing actions on the API until explicitly configured.
If no domain is specified, domain delegation is not applied, allowing tokens
to be issued for any email domain.
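
A minimal sketch of these defaults, assuming a hypothetical `Application` record with `scopes` and `allowed_domains` fields (the real model's field names may differ):

```python
from dataclasses import dataclass, field


@dataclass
class Application:
    """Hypothetical stand-in for the persisted application record."""

    # Empty by default: a new application cannot do anything (least privilege).
    scopes: list[str] = field(default_factory=list)
    # Empty by default: no domain restriction is applied.
    allowed_domains: list[str] = field(default_factory=list)

    def has_scope(self, scope: str) -> bool:
        return scope in self.scopes

    def can_delegate_email(self, email: str) -> bool:
        """Allow any domain when none is configured, else require a match."""
        if not self.allowed_domains:
            return True
        domain = email.rsplit("@", 1)[-1].lower()
        return domain in self.allowed_domains
```

The two defaults pull in opposite directions: empty scopes deny everything, while an empty domain list allows every email domain.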
Unfortunately, the user's database id, and not the sub, was used as the
PostHog distinct id for identified users.
Before this commit, only the sub was passed to the
summary microservice.
Add the owner's id. Note that this introduces a different
naming convention: the id is prefixed with "owner", whereas
the sub and the email are not.
The sub and email cannot be renamed to follow this new convention,
because external contributors have already started building
their own microservices.
Refactor feature flag mechanism from Django permission classes to custom
decorator that returns 404 Not Found when features are disabled instead
of exposing API structure through permission errors.
Improves security by preventing information disclosure about disabled
features and provides more appropriate response semantics. Custom
decorator approach is better suited for feature toggling than Django's
permission system which is designed for authorization.
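
A framework-free sketch of such a decorator. The `FEATURE_FLAGS` mapping, the `NotFound` exception and the view names are hypothetical. In the project the flags would come from Django settings and the exception would be the framework's own 404.

```python
from functools import wraps

# Hypothetical flag registry; assumed to mirror values from the settings.
FEATURE_FLAGS = {"recording": True, "subtitle": False}


class NotFound(Exception):
    """Stand-in for the framework's 404 exception."""

    status_code = 404


def feature_flag_required(flag_name):
    """Hide a view entirely (404) when its feature flag is disabled."""

    def decorator(view):
        @wraps(view)
        def wrapper(*args, **kwargs):
            # Unknown flags are treated as disabled.
            if not FEATURE_FLAGS.get(flag_name, False):
                raise NotFound(f"feature '{flag_name}' is not available")
            return view(*args, **kwargs)

        return wrapper

    return decorator


@feature_flag_required("recording")
def start_recording(request):
    return {"status": "started"}


@feature_flag_required("subtitle")
def start_subtitle(request):
    return {"status": "started"}
```

With a permission class, a disabled feature would answer 403 and confirm that the route exists. Here the caller cannot tell a disabled feature from a missing endpoint.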
Configure LiveKit token to explicitly allow users to subscribe to other
participants' video and audio tracks instead of relying on default
permissions.
Replace hardcoded default publishing source constants with values from
Django backend settings to prevent desynchronization between frontend
and backend configurations.
Switch from metadata to attributes when generating LiveKit tokens for
more convenient dict-like structure handling during token creation and
client-side reading.
Attributes provide better data structure flexibility compared to
metadata, simplifying both server-side token generation and client-side
data access patterns.
Refactor client-side LiveKit API calls to server-side endpoints
following LiveKit documentation recommendations for participant
management operations.
Replaces hacky direct client calls with proper backend-mediated
requests, improving security and following official LiveKit
recommendations.
Introduce new method on lobby system to clear lobby cache for specific
room and participant combinations.
Enables targeted cleanup of lobby state when participants leave or are
removed, improving cache management and preventing stale lobby entries.
Refactor lobby system to use consistent UUID v4 across lobby
registration and LiveKit token participant identity instead of
generating separate UUIDs.
Maintains synchronized identifiers between lobby cache and LiveKit
participants, simplifying future participant removal operations by
using the same UUID reference across both systems.
Extend LiveKit token creation utility with additional room configuration
and user role parameters to properly adapt room_admin grants and
publish sources based on permission levels.
This creates technical debt in utility function design that should be
refactored into proper service architecture for token
generation operations in future iterations.
Eliminates code duplication across validation serializers, improving
maintainability and ensuring consistent validation behavior throughout
the API layer.
Allow any user, anonymous or authenticated, to start subtitling
in a room only if they are an active participant of it.
Subtitling a room consists of starting the multi-user transcriber agent.
This agent forwards all participants' audio to an STT server and returns
transcription segments for any active voice to the room.
User roles in the backend room system cannot be used
to determine subtitle permissions.
The transcriber agent can be triggered multiple times but will only join a
room once. Uniqueness is enforced by the agent itself.
Any user with a valid LiveKit token can initiate subtitles. Feature flag
logic is implemented on the frontend. The frontend ensures the "start
subtitle" action is only available to users who should see it. The backend
does not enforce feature flags in this version.
Authentication in our system does not imply access to a room. The only
valid proof of access is the LiveKit API token issued by the backend.
Security consideration: A LiveKit API token is valid for 6 hours and
cannot be revoked at the end of a meeting. It is important to verify
that the token was issued for the correct room.
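
A sketch of the room check, using only the standard library. It assumes the token is an HS256-signed JWT whose room name sits under a `video.room` claim. `sign_token` exists only to build sample input, and both function names are illustrative. A maintained JWT library would be preferable in real code.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def _b64url_decode(segment: str) -> bytes:
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _signature(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()


def sign_token(claims: dict, secret: str) -> str:
    """Build an HS256 JWT (helper used to produce sample tokens)."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    signature = _signature(f"{header}.{payload}", secret)
    return f"{header}.{payload}.{_b64url_encode(signature)}"


def token_grants_room(token: str, secret: str, room: str) -> bool:
    """True only for a correctly signed, unexpired token issued for `room`."""
    try:
        header, payload, signature = token.split(".")
        expected = _signature(f"{header}.{payload}", secret)
        if not hmac.compare_digest(expected, _b64url_decode(signature)):
            return False
        claims = json.loads(_b64url_decode(payload))
    except ValueError:  # wrong segment count, bad base64 or bad JSON
        return False
    if not isinstance(claims, dict) or claims.get("exp", 0) <= time.time():
        return False
    video = claims.get("video")
    return isinstance(video, dict) and video.get("room") == room
```

A token that is still valid but was issued for another room is rejected, which covers the case of a token outliving its meeting.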
Calls to the agent dispatch endpoint must be server-initiated. The backend
proxies these calls, as clients cannot securely contact the agent dispatch
endpoint directly (per LiveKit documentation).
Room ID is passed as a query parameter. There is currently no validation
ensuring that the room exists prior to agent dispatch.
TODO: implement validation or error handling for non-existent rooms.
The backend does not forward LiveKit tokens to the agent. Default API
rate limiting is applied to prevent abuse.
The LiveKit API URL is required to interact with the API. It uses the HTTPS
protocol.
An explicit wss protocol is required in the WebSocket constructor for some
older browsers.
This resolves critical compatibility issues with legacy browsers
(notably Firefox <124, Chrome <125, Edge <125) that lack support
for HTTPS URLs in the WebSocket() constructor. Without explicit WSS
URLs, WebSocket signaling connections may fail, crash, or be blocked
entirely in these environments.
The setting is optional and defaults to the current behavior when
not specified, ensuring zero breaking changes for existing deployments.
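
A sketch of the scheme conversion. The `force_wss` flag is a hypothetical stand-in for the optional setting. When it is left unset, the URL is returned unchanged, matching the current behavior.

```python
from urllib.parse import urlsplit, urlunsplit

_WS_SCHEMES = {"https": "wss", "http": "ws"}


def to_websocket_url(api_url: str) -> str:
    """Swap http(s) for ws(s); leave any other scheme untouched."""
    parts = urlsplit(api_url)
    scheme = _WS_SCHEMES.get(parts.scheme, parts.scheme)
    return urlunsplit(parts._replace(scheme=scheme))


def frontend_livekit_url(api_url: str, force_wss: bool = False) -> str:
    """Return the URL handed to the frontend for WebSocket signaling."""
    return to_websocket_url(api_url) if force_wss else api_url
```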
Connection warmup wasn't working properly: the workaround only kicks in
once a WebSocket connection has been attempted first. Call the WebSocket
endpoint without auth info. The expected 401 error is enough to prime the
cache for subsequent WebSocket connections.
Scope this **dirty** trick to Firefox users only. I haven't found a simple
way to detect a proxy from JS code.
Tested in staging and works on our constrained WiFi.
Implement HTTPS prefetch before joining rooms to resolve WebSocket
handshake failures where Firefox+proxy returns HTTP 200 instead of 101.
Reproduced locally with a Squid container. No proxy configuration fix was
found; HTTPS warmup is the only working workaround. The issue doesn't occur
when the signaling server shares the webapp domain, making warmup unnecessary.
Use HEAD request to minimize bandwidth.
Implement method to process egress limit reached events from LiveKit
webhooks for better recording duration management.
By default, LiveKit does not notify the participants of a room when
an egress reaches its limit, so the event has to be proxied through the backend.
Create new service to handle recording-related webhooks, starting with
limit reached events. Will expand to enhance UX by notifying backend
of other LiveKit events.
Doesn't fit cleanly with existing recording package - may need broader
redesign. Chose dedicated service over mixing responsibilities.
Move from lobby service to utils for reuse across services. Method is
generic enough for utility status. Future: create dedicated LiveKit
service to encapsulate all LiveKit-related utilities.
Send backend recording duration limit to frontend to display warning
messages when recordings approach or reach maximum allowed length.
This configuration needs to be kept in sync with the egress. I chose to keep
this duration in ms to be consistent with other settings.
Add optional room name, recording time and date to generate better
document names based on user feedback. Template is customizable for
internationalization support.
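
A sketch of template-based naming. The template string, the placeholder names (`{room}`, `{date}`, `{time}`) and the fallback title are assumptions, not the project's actual defaults.

```python
from datetime import datetime

# Assumed default; a deployment would override it with a localized template.
DEFAULT_TEMPLATE = 'Recording "{room}" on {date} at {time}'


def build_document_name(created_at, room_name=None,
                        template=DEFAULT_TEMPLATE, fallback="Recording"):
    """Format the document name, falling back when no room name is known."""
    if not room_name:
        return fallback
    return template.format(
        room=room_name,
        date=created_at.strftime("%Y-%m-%d"),
        time=created_at.strftime("%H:%M"),
    )


print(build_document_name(datetime(2025, 3, 14, 9, 30), "Weekly sync"))
```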
Implemented a service that automatically creates a SIP dispatch rule when
the first WebRTC participant joins a room and removes it when the room
becomes empty.
Why? I don’t want a SIP participant to join an empty room.
The PIN code could be easily leaked, and there is currently no lobby
mechanism available for SIP participants.
A WebRTC participant is still required to create a room.
This behavior is inspired by a proprietary tool. The service uses LiveKit’s
webhook notification system to react to room lifecycle events. This is
a naive implementation that currently supports only a single SIP trunk and
will require refactoring to support multiple trunks. When no trunk is
specified, rules are created by default on a fallback trunk.
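
The rule lifecycle can be sketched as below. The in-memory client, the simplified event payload (`event`, `room`, `num_participants`), the choice of `participant_joined` and `room_finished` as triggers, and the fallback trunk id are all assumptions for illustration.

```python
FALLBACK_TRUNK_ID = "fallback-trunk"  # assumed identifier, not from the commit


class InMemoryDispatchRules:
    """Hypothetical stand-in for the LiveKit SIP dispatch rule API."""

    def __init__(self):
        self.rules = {}  # room name -> (trunk id, PIN)

    def create(self, room, trunk_id, pin):
        self.rules[room] = (trunk_id, pin)

    def delete_for_room(self, room):
        self.rules.pop(room, None)


def handle_room_event(event, rules, pin_lookup, trunk_id=None):
    """React to a simplified webhook payload describing a room event."""
    room = event["room"]
    if event["event"] == "participant_joined" and event["num_participants"] == 1:
        # First WebRTC participant: open the room to SIP callers.
        rules.create(room, trunk_id or FALLBACK_TRUNK_ID, pin_lookup(room))
    elif event["event"] == "room_finished":
        # Room is empty: drop the rule so nobody can dial into an empty room.
        rules.delete_for_room(room)
```

Later joins are ignored because the rule already exists. Only the first participant and the end of the room change the SIP routing.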
@rouja wrote a minimal Helm chart for LiveKit SIP with Asterisk, which
couldn’t be versioned yet due to embedded credentials. I deployed it
locally and successfully tested the integration with a remote
OVH SIP trunk.
One point to note: LiveKit lacks advanced filtering capabilities when
listing dispatch rules. Their recommendation is to fetch all rules and
filter them within your backend logic. I’ve opened a feature request asking
for at least the ability to filter dispatch rules by room. Since filtering
by trunk is already supported, room-based filtering feels like a natural
addition.
Until there's an update, I prefer to keep the implementation simple.
It works well at our current scale, and can be refactored when higher load
or multi-trunk support becomes necessary.
While caching dispatch rule IDs could be a performance optimization,
I feel it would be premature and potentially error-prone due to the complexity
of invalidation. If performance becomes an issue, I’ll consider introducing
caching at that point. To handle the edge case where multiple dispatch rules
with different PIN codes are present, the service performs an extensive
cleanup during room creation to ensure SIP routing remains clean and
predictable. This edge case should not happen.
In 'delete_dispatch_rule', if deleting one rule fails, the method exits
without deleting the remaining rules. That is acceptable IMO for a first iteration.
If multiple dispatch rules are often found for a room, I will improve this part.
Remove assertion statement that was placed after code expected to raise an
exception. The assertion was never evaluated due to the exception flow,
making the test ineffective.
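
The pattern is easy to reproduce. The sketch below uses `unittest` and a hypothetical `parse_pin` function, not the project's actual test. The statement after the raising call never runs, so the first test passes whatever that statement says.

```python
import unittest


def parse_pin(value: str) -> int:
    """Hypothetical function under test."""
    if not value.isdigit():
        raise ValueError("PIN must contain digits only")
    return int(value)


class ParsePinTest(unittest.TestCase):
    def test_ineffective_assertion(self):
        with self.assertRaises(ValueError):
            parse_pin("12ab")
            # Dead code: the call above already raised and left the block.
            self.fail("never reached")

    def test_effective_assertion(self):
        with self.assertRaises(ValueError) as ctx:
            parse_pin("12ab")
        # Checks placed after the block are actually evaluated.
        self.assertIn("digits", str(ctx.exception))
```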
Rework regex pattern to exclude empty string matches since
url_encoded_folder_path is optional.
Add test cases covering edge cases and failure scenarios to improve
validation coverage and prevent false positives.
Replace inverted boolean comparisons (not ... ==) with direct opposite
operators (!=) to improve code readability and reduce unnecessary
complexity in conditional statements.
Handle unhandled exceptions to prevent UX impact. Marketing email operations
are optional and should not disrupt core functionality.
My first implementation was imperfect and raised errors in Sentry.
Show recording owner(s) directly in admin list interface to speed up
troubleshooting. Previously required clicking into each object to identify
owner. Handles multiple owners (rare) by displaying a default message.
Fix test cases for room PIN code generation that were not updated when
max retry limit was increased during code review. Aligns test expectations
with actual implementation to prevent false failures.
Enable users to join rooms via SIP telephony by:
- Dialing the SIP trunk number
- Entering the room's PIN followed by '#'
The PIN code needs to be generated before the LiveKit room is created,
allowing the owner to send invites to participants in advance.
With 10-digit PINs (10^10 combinations) and a large number of rooms
(e.g., 1M), collisions become statistically inevitable. A retry mechanism
helps reduce the chance of repeated collisions but doesn't eliminate
the overall risk.
With 100K generated PINs, the probability of at least one collision exceeds
39%, due to the birthday paradox.
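
The figure can be checked with the usual birthday-bound approximation, p ≈ 1 - exp(-n(n-1) / 2N), where N is the PIN space and n the number of PINs drawn uniformly at random. The function name is illustrative.

```python
import math


def collision_probability(n_pins: int, pin_digits: int = 10) -> float:
    """Approximate probability that at least two of n_pins PINs collide."""
    space = 10 ** pin_digits
    # -expm1(-x) equals 1 - exp(-x) and stays accurate for small x.
    return -math.expm1(-n_pins * (n_pins - 1) / (2 * space))


for rooms in (10_000, 100_000, 1_000_000):
    print(f"{rooms:>9} rooms: {collision_probability(rooms):.4%}")
```

Under this approximation, 100K PINs give roughly 39%, 10,000 PINs give roughly 0.5%, and 1M PINs make a collision practically certain.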
To scale safely, we’ll later propose using multiple trunks. Each trunk
will handle a separate PIN namespace, and the combination of trunk_id and PIN
will ensure uniqueness. Room assignment will be evenly distributed across
trunks to balance load and minimize collisions.
Following XP principles, we’ll ship the simplest working version of this
feature. The goal is to deliver value quickly without over-engineering.
We’re not solving scaling challenges we don’t currently face.
Our production load is around 10,000 rooms — well within safe limits for
the initial implementation.
Discussion points:
- The `while` loop should be reviewed. Should we add rate limiting
for failed attempts?
- A systematic existence check before `INSERT` is more costly for a rare
event and doesn't prevent race conditions, whereas retrying on integrity
errors is more efficient overall.
- Should we add logging or monitoring to track and analyze collisions?
I tried to balance performance and simplicity while ensuring the
robustness of the PIN generation process.
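
The retry-on-integrity-error approach from the discussion points can be sketched with `sqlite3`. The table layout, function names and retry limit are assumptions, and a bounded `for` loop stands in for the `while` loop. The project itself goes through the Django ORM.

```python
import secrets
import sqlite3

MAX_RETRIES = 5  # assumed limit, not the project's actual value


def generate_pin(length: int = 10) -> str:
    """Draw a random numeric PIN of the given length."""
    return "".join(secrets.choice("0123456789") for _ in range(length))


def create_room_with_pin(conn, name, pin_factory=generate_pin,
                         max_retries=MAX_RETRIES):
    """Insert directly and retry on a unique-constraint violation."""
    for _ in range(max_retries):
        pin = pin_factory()
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO room (name, pin_code) VALUES (?, ?)",
                    (name, pin),
                )
            return pin
        except sqlite3.IntegrityError:
            continue  # collision: draw another PIN
    raise RuntimeError(f"no free PIN found after {max_retries} attempts")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE room (name TEXT, pin_code TEXT UNIQUE)")
print(create_room_with_pin(conn, "daily"))
```

The unique constraint is the single source of truth, so two concurrent inserts cannot both succeed with the same PIN. An existence check before the insert would not give that guarantee.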
The idea behind wrapping choices in the `lazy` function was to allow
overriding the list of languages in tests with `override_settings`.
This caused makemigrations to keep including the field in migrations
when it was not needed. Since we ultimately don't override
the LANGUAGES setting in tests, we can remove it to fix the problem.
Taken from docs #c882f13
Remove translation markers from backend strings that are never displayed to
users. Streamlines localization process by focusing only on user-visible
content that requires actual translation.
Implement broad exception handling to catch any non-twirp errors
during recording operations. Ensures recording status is properly reset to
"failed to start" when errors occur, allowing users to retry the recording
while still logging errors to Sentry for investigation.
Catching a broad exception is generally bad practice, but it is acceptable
here: the specific exceptions are caught beforehand and this handler only
acts as a fallback.
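
A sketch of the ordering: the specific error is handled first and the broad handler is a last resort. `TwirpError`, the status strings and the `start_egress` callable are hypothetical stand-ins. It is assumed that errors logged with `logger.exception` reach Sentry through its logging integration.

```python
import logging

logger = logging.getLogger(__name__)


class TwirpError(Exception):
    """Hypothetical stand-in for the Twirp error raised by the client."""


def start_recording(recording: dict, start_egress) -> bool:
    """Start a recording; on any failure, leave it in a retryable state."""
    try:
        start_egress(recording)
    except TwirpError as exc:
        # Expected failure mode, handled first.
        logger.warning("Egress refused to start: %s", exc)
    except Exception:  # deliberate last-resort fallback
        # Unexpected error: keep the traceback for investigation.
        logger.exception("Unexpected error while starting recording")
    else:
        recording["status"] = "active"
        return True
    # Both failure paths reset the status so the user can retry.
    recording["status"] = "failed_to_start"
    return False
```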