studio/meet

Commit Graph


Author

SHA1

Message

Date

lebaudantoine

88dbfae925

🔖(minor) bump release to 0.1.35

- fix crisp regression

2025-09-09 23:30:54 +02:00

lebaudantoine

ea2e5e8609

✨(agents) initialize LiveKit agent from multi-user transcriber example

Create Python script based on LiveKit's multi-user transcriber example
with enhanced request_fnc handler that ensures job uniqueness by room.

A transcriber transcribes every participant's audio and sends segments
to every participant present in a room, so we don't need several
transcribers in the same room. The worker is hidden: by default it
uses auto dispatch and is visible like any other participant, but
having a transcriber participant would be weird, since no other
videoconferencing tool treats this feature as a bot participant joining
a call.

Job uniqueness is ensured through the agent identity: we forge a
deterministic identity for each transcriber by room, so two
transcribers can never join the same room. This might be a bit harsh,
since the API call that lists participants before accepting a new
transcription job should already filter out situations where an agent
is triggered twice.
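
A minimal sketch of the uniqueness idea described above, written in
plain Python without importing the LiveKit SDK. The prefix, the hashing
scheme, and the function names are assumptions made for illustration;
the commit does not show how the identity is actually forged or how the
request handler is wired.

```python
import hashlib
from typing import Iterable

# Hypothetical prefix; the real agent may use a different naming scheme.
AGENT_PREFIX = "transcriber-"


def transcriber_identity(room_name: str) -> str:
    """Derive a deterministic agent identity from the room name.

    The same room always maps to the same identity, so a second
    transcriber for that room would collide with the first one.
    """
    digest = hashlib.sha256(room_name.encode("utf-8")).hexdigest()[:12]
    return f"{AGENT_PREFIX}{digest}"


def should_accept_job(room_name: str, participant_identities: Iterable[str]) -> bool:
    """Return False when a transcriber is already present in the room.

    `participant_identities` stands in for the result of the API call
    that lists the room's participants before a job is accepted.
    """
    return transcriber_identity(room_name) not in set(participant_identities)
```

In a request handler, the result of `should_accept_job` would decide
whether the job is accepted (with the forged identity) or rejected.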

We chose explicit worker orchestration over auto-dispatch because we
want to keep control of this feature, which will be challenging to
scale. LiveKit agent scaling is documented, but we need to experiment
in real-life situations with their Worker/Job mechanism.

The agent currently uses Deepgram since Arnaud's draft Kyutai plugin
isn't ready for production. This allows our ops team to advance on
deploying and monitoring agents. Deepgram was an arbitrary choice that
offers 200 free hours, though it only works for English. The ASR
provider needs to be refactored as a pluggable system selectable
through environment variables or settings.
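
One possible shape for that pluggable selection, as a sketch only. The
`ASR_PROVIDER` variable name, the registry, and the factory functions
are hypothetical; the factories return plain dicts as placeholders
where real code would build the provider's speech-to-text client.

```python
import os
from typing import Callable, Dict, Mapping, Optional


def _make_deepgram() -> dict:
    # Placeholder for constructing the Deepgram STT client.
    return {"provider": "deepgram"}


def _make_kyutai() -> dict:
    # Placeholder for the Kyutai plugin once it is production-ready.
    return {"provider": "kyutai"}


# Hypothetical registry mapping a provider name to its factory.
ASR_FACTORIES: Dict[str, Callable[[], dict]] = {
    "deepgram": _make_deepgram,
    "kyutai": _make_kyutai,
}


def load_asr_provider(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build the ASR provider named by ASR_PROVIDER (default: deepgram)."""
    env = os.environ if env is None else env
    name = env.get("ASR_PROVIDER", "deepgram").strip().lower()
    try:
        factory = ASR_FACTORIES[name]
    except KeyError:
        known = ", ".join(sorted(ASR_FACTORIES))
        raise ValueError(f"Unknown ASR provider {name!r}; expected one of: {known}")
    return factory()
```

Adding a provider then means registering one more factory, without
touching the agent's entrypoint.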

Agent dispatch will be triggered via a new REST API endpoint on our
backend. This is a first, naive version of a minimal dockerized
LiveKit agent to start playing with the framework.

2025-09-03 18:09:00 +02:00

2 Commits