Consolidate scattered transcript formatting functions into a single
cohesive class that encapsulates all transcript processing logic,
for better maintainability and clearer separation of concerns.
Add a transcript cleaning step to remove spurious recognition artifacts,
such as randomly predicted "Vap'n'Roll Thierry" phrases that appear
without corresponding audio. Filtering these model hallucinations
improves transcript quality.
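
A minimal sketch of what such a cleaning step could look like. The
phrase list, the function name, and the case-insensitive matching are
assumptions for illustration; the actual implementation is not shown
in this message.

```python
import re

# Hypothetical list of known hallucinated phrases (assumption: the real
# list and matching rules live elsewhere in the project).
HALLUCINATED_PHRASES = ["Vap'n'Roll Thierry"]


def clean_transcript_text(text, phrases=HALLUCINATED_PHRASES):
    """Remove known hallucinated phrases and tidy leftover whitespace."""
    for phrase in phrases:
        # re.escape keeps quotes and other punctuation literal.
        text = re.sub(re.escape(phrase), "", text, flags=re.IGNORECASE)
    # Collapse the runs of whitespace left behind by removed phrases.
    return re.sub(r"\s{2,}", " ", text).strip()
```

A fixed phrase list only catches hallucinations that have already been
observed; it does not detect new ones.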
- add admin action to retry a recording notification to external services
- log more Celery tasks' parameters
- add multilingual support for real-time subtitles
- update backend dependencies
Add detailed logging for owner ID, recording metadata, and
processing context in transcription tasks to improve debugging
capabilities.
Logging the created document id was especially important: when the
docs API misbehaves, I can share the impacted, newly created
documents with its maintainers.
Restore the correct task_args ordering in the metadata manager. Commit f0939b6f
added a sender argument to Celery signals to scope them to transcription tasks,
which unexpectedly shifted positional arguments and broke metadata creation.
The issue went undetected because analytics were not deployed on staging:
production silently lost observability on the microservice while transcription
jobs kept running. This highlights the need to activate analytics on staging.
Add the ability to pass response_format to the call function in order to
get better results with the albert-large model.
Use response_format for next steps and plan generation.
Restrict metadata manager signal triggers to transcription-specific Celery
tasks to prevent exceptions when the new summary worker executes tasks
not designed for metadata operations, reducing false-positive Sentry errors.
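
The scoping idea can be sketched without Celery as a guard on the task
name. The task names, handler signature, and registry below are
hypothetical; with Celery itself, the same effect comes from passing
`sender=` when connecting a signal handler.

```python
# Hypothetical set of task names that the metadata manager should track.
TRANSCRIPTION_TASK_NAMES = {"process_audio_transcribe"}


def on_task_prerun(task_name, task_id, task_args, registry):
    """Record metadata only for transcription tasks.

    Returns True when metadata was recorded, False when the task was
    ignored because it is not a transcription task.
    """
    if task_name not in TRANSCRIPTION_TASK_NAMES:
        # Summary tasks carry different arguments; skip them instead of
        # raising while trying to unpack them.
        return False
    registry[task_id] = {"task_name": task_name, "task_args": tuple(task_args)}
    return True
```

Ignoring unknown tasks early keeps the handler from unpacking arguments
it was never designed for.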
Make WhisperX language detection configurable through FastAPI settings
to handle recordings that start with empty audio, where automatic detection
fails and incorrectly defaults to English despite 99% French usage.
This is a quick fix. The long-term solution should let users select the
language per recording through the web interface rather than rely on a
global server setting.
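
A dependency-free sketch of the idea, using an environment variable in
place of the FastAPI settings object. The variable name
`WHISPERX_DEFAULT_LANGUAGE` and both helper functions are assumptions
for illustration, not the project's actual setting or API.

```python
import os


def get_default_language(env=os.environ):
    """Return the forced language code, or None to keep auto-detection."""
    value = env.get("WHISPERX_DEFAULT_LANGUAGE", "").strip().lower()
    return value or None


def build_transcribe_kwargs(env=os.environ):
    """Build extra keyword arguments for the transcription request."""
    language = get_default_language(env)
    # Only pass a language when one is configured, so an unset value
    # falls back to the model's own detection.
    return {"language": language} if language else {}
```

Leaving the setting empty preserves the previous behavior, so the change
stays opt-in.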
Unfortunately, we used the user's database id, and not the sub, as the
PostHog distinct id of identified users.
Before this commit, we were only passing the sub to the
summary microservice.
Add the owner's id. Please note that this introduces a different
naming convention: the id is prefixed with "owner", which we did
not do for the sub and the email.
We cannot align sub and email with this new naming approach,
because external contributors have already started building
their own microservice.
Ensure transcribe jobs are properly assigned to their specific queue
instead of using default queue. This prevents job routing issues and
ensures proper task distribution across workers.
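
For illustration, explicit routing in Celery is typically expressed with
the `task_routes` setting. The task and queue names below are
hypothetical, and `queue_for` only mimics the lookup for demonstration;
Celery performs it internally.

```python
# Hypothetical queue and task names.
TRANSCRIBE_QUEUE = "transcribe-queue"

# Shape accepted by Celery's `task_routes` setting: task name -> options.
task_routes = {
    "process_audio_transcribe": {"queue": TRANSCRIBE_QUEUE},
}


def queue_for(task_name, routes=task_routes, default="celery"):
    """Return the queue a task would be routed to ("celery" is Celery's default)."""
    return routes.get(task_name, {}).get("queue", default)
```

A worker started with `-Q transcribe-queue` would then consume only the
tasks routed to that queue.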
Introduce FastAPI settings configuration option to completely disable
the summary feature. This improves developer experience by allowing
developers to skip summary-related setup when not needed for their
workflow.
Implement summarization functionality that processes completed meeting
transcripts to generate concise summaries.
First draft based on a simple recursive agentic scenario.
Observability and evaluation will be added in the next PRs.
Name the Celery queue used by transcription worker to prepare for
dedicated summarization queue separation, enabling faster transcript
delivery while isolating new agentic logic in separate worker processes.
Rename incorrectly named OpenAI configuration settings since
they're used to instantiate WhisperX client which is not OpenAI
compatible, preventing confusion about actual service dependencies.
Consolidate summary service into main development stack to centralize
development environment management and simplify service orchestration
with shared infrastructure like MinIO storage.
Sync ruff's target Python version to match Docker image version
used for summary component to ensure runtime consistency.
Prevents syntax/feature mismatches, catches version-specific issues
before deployment, and ensures linting targets the actual runtime
environment for better deployment safety.
Remove default unprivileged Docker user that was incompatible with hot
reloading in tilt stack. Update tilt config to resolve path issues.
CI builds still use unprivileged user, making this change safe while
enabling proper development workflow with hot reloading functionality.
Enable Celery task lifecycle events and broker dispatch events per
@rouja's exporter requirements. Basic configuration following
documentation without parameterization.
Add optional room name, recording time and date to generate better
document names based on user feedback. Template is customizable for
internationalization support.
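
A small sketch of how a customizable naming template might work. The
template text, placeholder names, fallback title, and date formats are
assumptions; the real template is defined in the project settings.

```python
from datetime import datetime

# Hypothetical default template; operators could override it with a
# translated version for internationalization.
DEFAULT_TEMPLATE = 'Transcript of "{room}" on {date} at {time}'
FALLBACK_TITLE = "Transcript"


def build_document_name(room=None, recorded_at=None, template=DEFAULT_TEMPLATE):
    """Return a descriptive document name, or a generic one if data is missing."""
    if room is None or recorded_at is None:
        # Room name and recording time are optional inputs.
        return FALLBACK_TITLE
    return template.format(
        room=room,
        date=recorded_at.strftime("%Y-%m-%d"),
        time=recorded_at.strftime("%H:%M"),
    )
```

For example, passing a French template such as
`'Réunion "{room}" du {date} à {time}'` yields a localized title without
code changes.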
Resolve float/int to string conversion problems when deserializing Redis
data for PostHog. The added type conversion is not bulletproof but works
for most cases; avoid relying on it for critical operations.
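
One way such a best-effort conversion could look, assuming values come
back from Redis as strings or bytes and should be restored to numbers
where possible. The function name is hypothetical and this is not
necessarily the conversion used in the fix.

```python
def coerce_scalar(value):
    """Best-effort conversion of a Redis value back to int or float."""
    if isinstance(value, bytes):
        value = value.decode("utf-8")
    if not isinstance(value, str):
        return value
    # Try the strictest type first, then fall back to the raw string.
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            continue
    return value
```

The limits are visible here: an identifier such as "007" becomes the
integer 7, and a string like "nan" becomes a float, which is why this
kind of conversion should not back critical operations.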