As we filter the empty documents from the batch during indexing some batches
can be empty and cause an error. Now they are ignored.
Add --batch-size argument to the index command.
Signed-off-by: Fabre Florian <ffabre@hybird.org>
Reduce the number of Find API calls by grouping all the latest changes
for indexation : send all the documents updated or deleted since the
triggering of the task.
Signed-off-by: Fabre Florian <ffabre@hybird.org>
Rename FindDocumentIndexer as SearchIndexer
Rename FindDocumentSerializer as SearchDocumentSerializer
Rename package core.tasks.find as core.task.search
Remove logs on http errors in SearchIndexer
Factorise some code in search API view.
Signed-off-by: Fabre Florian <ffabre@hybird.org>
Filter deleted documents from visited ones.
Set default ordering to the Find API search call (-updated_at)
BaseDocumentIndexer.search now returns a list of document ids instead of models.
Do not call the indexer in signals when SEARCH_INDEXER_CLASS is not defined
or properly configured.
Signed-off-by: Fabre Florian <ffabre@hybird.org>
New SEARCH_INDEXER_CLASS setting to define the indexer service class.
Raise ImpoperlyConfigured errors instead of RuntimeError in index service.
Signed-off-by: Fabre Florian <ffabre@hybird.org>
On document content or permission changes, start a celery job that will call the
indexation API of the app "Find".
Signed-off-by: Fabre Florian <ffabre@hybird.org>
This allows API users to process document content, enabling the
use of Docs as a headless CMS for instance, or any kind of document
processing. Fixes#1206.
The AI answer was activating the code block feature
in the editor, which was not desired.
The prompt for AI actions has been updated to
instruct the AI to return content directly
without wrapping it in code blocks or markdown
delimiters.
In a scenario where the first user is editing a docs without websocket
and nobody has reached the websocket server first, the y-provider
service will return a 404 and we don't handle this case in the can-edit
endpoint leading to a server error.
The endpoint can_edit is added to the DocumentViewset, it will give the
information to the frontend application id the current user can edit the
Docs based on the no-websocket rules.
When a document is updated, users not connected to the collaboration
server can override work made by other people connected to the
collaboration server. To avoid this, the priority is given to user
connected to the collaboration server. If the websocket property in the
request payload is missing or set to False, the backend fetch the
collaboration server to now if the user can save or not. If users are
already connected, the user can't save. Also, only one user without
websocket can save a connect, the first user saving acquire a lock and
all other users can't save.
To implement this behavior, we need to track all users, connected and
not, so a session is created for every user in the
ForceSessionMiddleware.
Handle the raw payloads in requests and responses to convert-endpoint.
This change replaces Base64-encoded I/O with direct binary streaming,
yielding several benefits:
- **Network efficiency**: Eliminates the ~33% size inflation of Base64,
cutting bandwidth and latency.
- **Memory savings**: Enables piping DOCX (already compressed) buffers
straight to DocSpec API without holding, encoding and decoding multi-MB
payload in RAM.
Signed-off-by: Stephan Meijer <me@stephanmeijer.com>
Renamed the `convert_markdown` method to `convert` to prepare for an
all-purpose conversion endpoint, enabling support for multiple formats
and simplifying future extension.
Signed-off-by: Stephan Meijer <me@stephanmeijer.com>
We added the `FRONTEND_URL_JSON_FOOTER` environment
variable. It will give the possibility to generate
your own footer content in the frontend.
If the variable is not set, the footer will not
be displayed.
The ai translation were quite lossy about formatting.
Colors, background, breaklines, table sizes were
lost in the translation.
We improve the AI translation request to keep
the formatting as close as possible by using
html instead of markdown.
Abstracted base URL and API key under 'y-provider' for
reuse in future endpoints, aligning with microservice naming.
Please note the YProvider API here is internal to the cluster.
In facts, we don't want these endpoints to be exposed by any ingress
Minor adjustments were needed after working in parallel on two PRs.
The microservice now accepts an API key without requiring it as a Bearer token.
A mistake in reading the microservice response was corrected after refactoring
the serializer to delegate logic to the converter microservice.
We want trusted external applications to be able to create documents
via the API on behalf of any user. The user may or may not pre-exist
in our database and should be notified of the document creation by
email.
Albert send us back a malformed IA json, the
sanitize function was not able to handle it correctly.
We add a try catch on it, to not use the sanitizer if
the json.loads fails.
When an access is updated or removed, the
collaboration server is notified to reset the
access connection; by being disconnected, the
accesses will automatically reconnect by passing
by the ngnix subrequest, and so get the good
rights.
We do the same system when the document link is
updated, except here we reset every access
connection.
We created 2 new action endpoints on the document
to perform AI operations:
- POST /api/v1.0/documents/{uuid}/ai-transform
- POST /api/v1.0/documents/{uuid}/ai-translate