We introduce a new model for user wanted to access a document or upgrade
their role if they already have access.
The viewsets does not implement PUT and PATCH, we don't need it for now.
We want to protect all requests from django with content security
policy header. We use the djang-csp library and configure it with
default values.
Fixes#1000
We added the possibility to scan all uploaded files with an anti malware
solution. Depending the backend used, we want to give the possibility to
check the file mimtype to determine if this one is tagged as unsafe or
not. To this you can set the environment variable
DOCUMENT_ATTACHMENT_CHECK_UNSAFE_MIME_TYPES_ENABLED to False. The
default value is True.
Django 5.2 is now mature enough and we can use it in production.
In some tests the number of sql queries is increasing. This is because
the `full_clean` method called in the `save` method on all our models is
creating a transaction, so a savepoint and release is added.
We also fix deprecated warning in this commit.
We want to have the media-check url returned on the attachment-upload
response instead of the media url directly. The front will know the
endpoint to use to check the media status.
With the usage of a malware detection system, we need a way to know the
file status. The front will use it to display a loader while the analyse
is not ended.
In the attachment_upload method, the status in the file metadata to
processing and the malware_detection backend is called. We check in the
media_auth if the status is ready in order to accept the request.
When 2 docs are created almost at the same time,
the second one will fail because the first one.
We get a unicity error on the path key already
used ("impress_document_path_key").
To fix this issue, we will lock the table the
time to create the document, the next query will
wait for the lock to be released.
This should work in both cases:
- search for "vélo" when the document title contains "velo"
- search for "velo" when the document title contains "vélo"
We recently extract images url in the content. For this, we assume that
the document content is always in base64. We enforce this assumption by
checking if it's a valide base64 in the serializer.
Ypy is deprecated and unmaintained. We have problem with parsing
existing documents. We replace it by pycrdt, library actively maintained
and without the issues we have with Ypy.
Every user having an access to a document, no matter its role have
access to the entire accesses list with all the user details. Only
owner or admin should be able to have the entire list, for the other
roles, they have access to the list containing only owner and
administrator with less information on the username. The email and its
id is removed
The refactor made in the tree view caching the ancestors_links to not
compute them again in the document.get_abilities method lead to a bug.
If the get_abilities method is called without ancestors_links, then they
are computed on all the ancestors but not from the highest readable
ancestor for the current user. We have to compute them with this
constraint.
We can't prevent document editors from copy/pasting content to from one
document to another. The problem is that copying content, will copy the
urls pointing to attachments but if we don't do anything, the reader of
the document to which the content is being pasted, may not be allowed to
access the attachment files from the original document.
Using the work from the previous commit, we can grant access to the readers
of the target document by extracting the attachment keys from the content and
adding themto the target document's "attachments" field. Before doing this,
we check that the current user can indeed access the attachment files extracted
from the content and that they are allowed to edit the current document.
We took this opportunity to refactor the way access is controlled on
media attachments. We now add the media key to a list on the document
instance each time a media is uploaded to a document. This list is
passed along when a document is duplicated, allowing us to grant
access to readers on the new document, even if they don't have or
lost access to the original document.
We also propose an option to reproduce the same access rights on the
duplicate document as what was in place on the original document.
This can be requested by passing the "with_accesses=true" option in
the query string.
The tricky point is that we need to extract attachment keys from the
existing documents and set them on the new "attachments" field that is
now used to track access rights on media files.
Tests were forgotten. While writing the tests, I fixed
a few edge cases like the possibility to connect to the
collaboration server for an anonymous user.
The cors-proxy endpoint allowed to use every type of files and to
execute it in the browser. We limit the scope only to images and
Content-Security-Policy and Content-Disposition headers are also added
to not allow script execution that can be present in a SVG file.
The ai translation were quite lossy about formatting.
Colors, background, breaklines, table sizes were
lost in the translation.
We improve the AI translation request to keep
the formatting as close as possible by using
html instead of markdown.
When exporting a document in PDF and if the doc contains external
resources, we want to fetch them using a proxy bypassing CORS
restrictions. To ensure this endpoint is not used for something else
than fetching urls contains in the doc, we use access control and check
if the url really exists in the document.
The "nb_accesses" field was displaying the number of access instances
related to a document or any of its ancestors. Some features on the
frontend require to know how many of these access instances are related
to the document directly.
If a document already gets a link reach/role inheriting from one of its
ancestors, we should not propose setting link reach/role on the
document that would be more restrictive than what we inherited from
ancestors.
We want to be able to make a search query inside a hierchical document.
It's elegant to do it as a document detail action so that we benefit
from access control.
the "filter_queryset" method is called in the middle of the
"get_object" method. We use the "get_object" in actions like
"children", "tree", etc. which start by calling "get_object"
but return lists of documents.
We would like to apply filters to these views but the it didn't
work because the "get_object" method was also impacted by the
filters...
In a future PR, we should take control of the "get_object" method
and decouple all this. We need a quick solution to allow releasing
the hierchical documents feature in the frontend.
We want to display the tree structure to which a document belongs
on the left side panel of its detail view. For this, we need an
endpoint to retrieve the list view of the document's ancestors
opened.
By opened, we mean that when display the document, we also need to
display its siblings. When displaying the parent of the current
document, we also need to display the siblings of the parent...
- language for invitation emails => language saved on the invited user
- if invited user does not exist yet => language of the sending user
- if for some reason no sending user => system default language
On the media upload endpoint, we want to set the content-disposition
header. Its value is based on the uploaded file mime-type and if flagged
as unsafe. If the file is not an image or is unsafe then the
contentDisposition is set to attachment to force its download.
Otherwise, we set it to inline.
The frontend cannot access custom headers of a file,
so we need to add a flag in the filename.
We add the `unsafe` flag in the filename to
indicate that the file is unsafe.
Previous filename: "/{UUID4}.{extension}"
New filename: "/{UUID4}-unsafe.{extension}"
We want to be able to define whether AI features are available to
anonymous users who gained editor access on a document, or if we
demand that they be authenticated or even if we demand that they
gained their editor access via a specific document access.
Being authenticated is now the default value. This will change the
default behavior on your existing instance (see UPGRADE.md)
We changed the way we upload the translations to
Crowdin, some translations were missing for the
email templates. We add them back and improve
the tests to make sure we don't forget them again.
When creating a new document/template via the API, we add the
logged-in user as owner of the created object. This should be
done atomically with the object creation to make sure we don't
end-up with an orphan object that the creator can't access
anymore.
Only owners can see and restore deleted documents. They can only do
it during the grace period before the document is considered hard
deleted and hidden from everybody on the API.
Now that we have introduced a document tree structure, it is not
possible to allow deleting documents anymore as it impacts the whole
subtree below the deleted document and the consequences are too big.
We introduce soft delete in order to give a second thought to the
document's owner (who is the only one to be allowed to delete a
document). After a document is soft deleted, the owner can still
see it in the trashbin (/api/v1.0/documents/trashbin).
After a grace period (30 days be default) the document disappears
from the trashbin and can't be restored anymore. Note that even
then it is still kept in database. Cleaning the database to erase
deleted documents after the grace period can be done as a maintenance
script.
Only administrators or owners of a document can move it to a target
document for which they are also administrator or owner.
We allow different moving modes:
- first-child: move the document as the first child of the target
- last-child: move the document as the last child of the target
- first-sibling: move the document as the first sibling of the target
- last-sibling: move the document as the last sibling of the target
- left: move the document as sibling ordered just before the target
- right: move the document as sibling ordered just after the target
The whole subtree below the document that is being moved, moves as
well and remains below the document after it is moved.
This test was missing the status code check. Without this check
the error that follows does not make sense because the content
returned is not at all what we expect in the following assert
statement.
user roles were already computed as an annotation on the query for
performance as we must look at all the document's ancestors to determine
the roles that apply recursively. We can easily expose them as readonly
via the serializer.
Including the content field in the list view is not efficient as we need
to query the object storage to retrieve it. We want to display an excerpt
of the content on the list view so we should store it in database. We
let the frontend compute it and save it for us in the new "excerpt" field
because we are not supposed to have access to the content (E2EE feature coming)