Initial commit — Drive, an S3 file browser with WOPI editing

Lightweight replacement for the upstream La Suite Numérique drive
(Django/Celery/Next.js) built as a single Deno binary.

Server (Deno + Hono):
- S3 file operations via AWS SigV4 (no SDK) with pre-signed URLs
- WOPI host for Collabora Online (CheckFileInfo, GetFile, PutFile, locks)
- Ory Kratos session auth + CSRF protection
- Ory Keto permission model (OPL namespaces, not yet wired to routes)
- PostgreSQL metadata with recursive folder sizes
- S3 backfill API for registering files uploaded outside the UI
- OpenTelemetry tracing + metrics (opt-in via OTEL_ENABLED)

Frontend (React 19 + Cunningham v4 + react-aria):
- File browser with GridList, keyboard nav, multi-select
- Collabora editor iframe (full-screen, form POST, postMessage)
- Profile menu, waffle menu, drag-drop upload, asset type badges
- La Suite integration service theming (runtime CSS)

Testing (549 tests):
- 235 server unit tests (Deno) — 90%+ coverage
- 278 UI unit tests (Vitest) — 90%+ coverage
- 11 E2E tests (Playwright)
- 12 integration service tests (Playwright)
- 13 WOPI integration tests (Playwright + Docker Compose + Collabora)

MIT licensed.
Commit 58237d9e44 — 2026-03-25 18:28:37 +00:00
112 changed files with 26841 additions and 0 deletions

docs/architecture.md
# Architecture
How the pieces fit together, and why there aren't very many of them.
---
## The Big Picture
Drive is one Deno binary. It serves a React SPA, proxies API calls to S3 and PostgreSQL, handles WOPI callbacks from Collabora, and validates sessions against Ory Kratos. No Django, no Celery, no Next.js, no BFF layer. One process, one binary, done.
```mermaid
graph TB
Browser["Browser (React SPA)"]
subgraph "Deno / Hono Server"
Router["Hono Router"]
Auth["Auth Middleware<br/>(Kratos sessions)"]
CSRF["CSRF Middleware<br/>(HMAC double-submit)"]
FileAPI["File API<br/>(CRUD, presigned URLs)"]
WopiAPI["WOPI Endpoints<br/>(CheckFileInfo, GetFile, PutFile, Locks)"]
PermAPI["Permission Middleware<br/>(Keto checks)"]
Static["Static File Server<br/>(ui/dist)"]
end
Browser -->|"HTTP"| Router
Router --> Auth
Auth --> CSRF
CSRF --> FileAPI
CSRF --> WopiAPI
CSRF --> PermAPI
Router --> Static
FileAPI -->|"SQL"| PostgreSQL["PostgreSQL<br/>(metadata, folder sizes)"]
FileAPI -->|"S3 API"| SeaweedFS["SeaweedFS<br/>(file storage)"]
WopiAPI -->|"Lock ops"| Valkey["Valkey<br/>(WOPI locks w/ TTL)"]
WopiAPI --> PostgreSQL
WopiAPI --> SeaweedFS
PermAPI -->|"HTTP"| Keto["Ory Keto<br/>(Zanzibar permissions)"]
Auth -->|"HTTP"| Kratos["Ory Kratos<br/>(session validation)"]
Collabora["Collabora Online"] -->|"WOPI callbacks"| WopiAPI
Browser -->|"iframe postMessage"| Collabora
```
## Request Lifecycle
Every request hits the Hono router in `main.ts`. The middleware stack is short and you can read the whole thing without scrolling:
1. **OpenTelemetry middleware** — tracing and metrics on every request.
2. **`/health`** — no auth, no CSRF. Returns `{ ok: true, time: "..." }`. K8s probes hit this.
3. **Auth middleware** — runs on everything except `/health`. Skips WOPI routes (they carry their own JWTs). Test mode (`DRIVER_TEST_MODE=1`) injects a fake identity.
4. **CSRF middleware** — validates HMAC double-submit cookies on mutating requests (`POST`, `PUT`, `PATCH`, `DELETE`) to `/api/*`. Skips WOPI routes.
5. **Route handlers** — the actual work.
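The skip rules above reduce to a small pure predicate (a sketch with illustrative names — the real middleware lives in `main.ts`):

```typescript
type Check = "auth" | "csrf";

const MUTATING = new Set(["POST", "PUT", "PATCH", "DELETE"]);

// Which checks apply to a request? Mirrors the stack above:
// /health gets none, /wopi/* validates its own JWTs instead,
// and CSRF only guards mutating requests to /api/*.
function checksFor(method: string, path: string): Check[] {
  if (path === "/health") return [];
  if (path.startsWith("/wopi/")) return [];
  const checks: Check[] = ["auth"];
  if (MUTATING.has(method) && path.startsWith("/api/")) checks.push("csrf");
  return checks;
}
```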
From `main.ts`, the routing structure:
```
GET /health Health check (no auth)
GET /api/auth/session Session info
GET /api/files List files (with sort, search, pagination)
POST /api/files Create file (form-data or JSON metadata)
GET /api/files/:id Get file metadata
PUT /api/files/:id Rename or move
DELETE /api/files/:id Soft delete
POST /api/files/:id/restore Restore from trash
GET /api/files/:id/download Pre-signed download URL
POST /api/files/:id/upload-url Pre-signed upload URL(s)
POST /api/files/:id/complete-upload Complete multipart upload
POST /api/folders Create folder
GET /api/folders/:id/children List folder contents
GET /api/recent Recently opened files
GET /api/favorites Favorited files
PUT /api/files/:id/favorite Toggle favorite
GET /api/trash Deleted files
POST /api/admin/backfill S3 -> DB backfill (internal only)
GET /wopi/files/:id CheckFileInfo (token auth)
GET /wopi/files/:id/contents GetFile (token auth)
POST /wopi/files/:id/contents PutFile (token auth)
POST /wopi/files/:id Lock/Unlock/Refresh/GetLock (token auth)
POST /api/wopi/token Generate WOPI token (session auth)
/* Static files from ui/dist
/* SPA fallback (index.html)
```
## The SPA Lifecycle
The UI is a Vite-built React SPA. In production, it's static files in `ui/dist/`. Nothing fancy.
```mermaid
graph LR
A["npm install + vite build"] -->|"outputs"| B["ui/dist/<br/>index.html + assets/"]
B -->|"served by"| C["Hono serveStatic<br/>(root: ./ui/dist)"]
C -->|"SPA fallback"| D["All non-API routes<br/>return index.html"]
D -->|"client-side"| E["React Router<br/>handles /explorer, /recent, etc."]
```
**Build step:**
```bash
cd ui && npm install && npx vite build
```
Outputs `ui/dist/index.html` and `ui/dist/assets/` with hashed JS/CSS bundles.
**Serving:**
Hono's `serveStatic` serves from `ui/dist` for any route that doesn't match an API endpoint. A second `serveStatic` call serves `index.html` as the SPA fallback — navigating to `/explorer/some-folder-id` returns the shell, React Router takes it from there.
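The serving rule is a one-liner once the router has peeled off API routes (a sketch, not the actual Hono wiring):

```typescript
// Given a request path and the set of built asset paths, decide what to
// serve: the real file from ui/dist, or index.html as the SPA shell.
// (API and WOPI routes are matched by the router before this runs.)
function resolveStatic(path: string, assets: Set<string>): string {
  return assets.has(path) ? path : "/index.html";
}
```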
**Development:**
In dev mode (`deno task dev`), both run simultaneously:
- `deno run -A --watch main.ts` — server with hot reload
- `cd ui && npx vite` — Vite dev server with HMR
The Vite dev server proxies API calls to the Deno server.
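The proxy is ordinary Vite config (illustrative — the real `ui/vite.config.ts` may differ):

```typescript
// ui/vite.config.ts — sketch; forwards API calls to the Deno server
import { defineConfig } from "vite";

export default defineConfig({
  server: {
    proxy: {
      "/api": "http://localhost:3000",
      "/wopi": "http://localhost:3000",
    },
  },
});
```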
**Compiled binary:**
```bash
deno compile --allow-net --allow-read --allow-env --include ui/dist -o driver main.ts
```
`deno compile` bundles the JS entry point and the entire `ui/dist` directory into a single executable (~450KB JS + static assets). This is what gets deployed.
## WOPI Callback Flow
This is the part that confuses people. Collabora doesn't talk to the browser — it talks to your server. The browser is out of the loop during editing:
```mermaid
sequenceDiagram
participant Browser
participant Server as Deno/Hono
participant Collabora
Browser->>Server: POST /api/wopi/token {file_id}
Server->>Server: Generate JWT (HMAC-SHA256)
Server->>Server: Fetch Collabora discovery XML
Server-->>Browser: {access_token, editor_url}
Browser->>Collabora: Form POST to editor_url<br/>(access_token in hidden field, target=iframe)
Collabora->>Server: GET /wopi/files/:id?access_token=...
Server-->>Collabora: CheckFileInfo JSON
Collabora->>Server: GET /wopi/files/:id/contents?access_token=...
Server->>SeaweedFS: GetObject
SeaweedFS-->>Server: File bytes
Server-->>Collabora: File content
Note over Collabora: User edits document...
Collabora->>Server: POST /wopi/files/:id (LOCK)
Collabora->>Server: POST /wopi/files/:id/contents (PutFile)
Server->>SeaweedFS: PutObject
Collabora->>Server: POST /wopi/files/:id (UNLOCK)
```
See [wopi.md](wopi.md) for the full breakdown.
## Auth Model
Two auth mechanisms, one server. This is the only slightly tricky part:
1. **Session auth** (most routes) — browser sends Ory Kratos session cookies. The middleware calls `GET /sessions/whoami` on Kratos. Invalid session? API routes get 401, page routes get a redirect to `/login`.
2. **Token auth** (WOPI routes) — Collabora doesn't have browser cookies (it's a separate server). WOPI endpoints accept a JWT `access_token` query parameter, HMAC-SHA256 signed, scoped to a specific file and user, 8-hour TTL.
The split happens in `server/auth.ts` based on URL prefix: anything under `/wopi/` skips session auth. WOPI handlers validate their own tokens.
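The token mechanics are plain HMAC-SHA256 JWT signing with an expiry check — a self-contained sketch (claim names and helpers here are assumptions, not the server's actual code):

```typescript
import { createHmac } from "node:crypto";

const b64url = (s: string) => Buffer.from(s).toString("base64url");

// Sign a token scoped to one file and one user, with an 8-hour TTL.
function signWopiToken(fileId: string, userId: string, secret: string, nowMs = Date.now()): string {
  const header = b64url(JSON.stringify({ alg: "HS256", typ: "JWT" }));
  const payload = b64url(JSON.stringify({
    file_id: fileId,
    sub: userId,
    exp: Math.floor(nowMs / 1000) + 8 * 60 * 60, // 8-hour TTL
  }));
  const sig = createHmac("sha256", secret).update(`${header}.${payload}`).digest("base64url");
  return `${header}.${payload}.${sig}`;
}

// Verify signature and expiry; null means "reject".
function verifyWopiToken(
  token: string, secret: string, nowMs = Date.now(),
): { file_id: string; sub: string; exp: number } | null {
  const [header, payload, sig] = token.split(".");
  const expected = createHmac("sha256", secret).update(`${header}.${payload}`).digest("base64url");
  if (sig !== expected) return null; // bad signature
  const claims = JSON.parse(Buffer.from(payload, "base64url").toString()) as
    { file_id: string; sub: string; exp: number };
  if (claims.exp * 1000 < nowMs) return null; // expired
  return claims;
}
```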
## Data Flow
**File upload (pre-signed):**
1. Client sends metadata to `POST /api/files` (filename, mimetype, size, parent_id)
2. Server creates a `files` row, computes the S3 key
3. Client calls `POST /api/files/:id/upload-url` for a pre-signed PUT URL
4. Client uploads directly to S3 — file bytes never touch the server
5. Large files use multipart: multiple pre-signed URLs, then `POST /api/files/:id/complete-upload`
6. Folder sizes propagate up the ancestor chain via `propagate_folder_sizes()`
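Step 5's multipart split is a pure planning decision (the part size here is an assumption; the real threshold lives in the server — S3 only requires parts of at least 5 MiB, except the last):

```typescript
const PART_SIZE = 8 * 1024 * 1024; // 8 MiB, illustrative

// Decide how many pre-signed part URLs to request for a given size.
function planUpload(size: number): { multipart: boolean; parts: number } {
  if (size <= PART_SIZE) return { multipart: false, parts: 1 };
  return { multipart: true, parts: Math.ceil(size / PART_SIZE) };
}
```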
**File download (pre-signed):**
1. Client calls `GET /api/files/:id/download`
2. Server hands back a pre-signed GET URL
3. Client downloads directly from S3
The server never streams file content for regular uploads/downloads. The only time bytes flow through the server is WOPI callbacks — Collabora can't use pre-signed URLs, so we proxy those.
## Database
Two tables. That's it.
- **`files`** — the file registry. UUID PK, `s3_key` (unique), filename, mimetype, size, owner_id, parent_id (self-referencing for folder hierarchy), `is_folder` flag, timestamps, soft-delete via `deleted_at`.
- **`user_file_state`** — per-user state: favorites, last-opened timestamp. Composite PK on `(user_id, file_id)`.
Two PostgreSQL functions handle folder sizes:
- `recompute_folder_size(folder_id)` — recursive CTE that sums all descendant file sizes
- `propagate_folder_sizes(start_parent_id)` — walks up the ancestor chain, recomputing each folder
Migrations live in `server/migrate.ts` and run with `deno task migrate`.
## What's Not Here (Yet)
- **Rate limiting** — relying on ingress-level limits for now. Good enough until it isn't.
- **WebSocket** — no real-time updates between browser tabs. Collabora handles its own real-time editing internally, so this hasn't been a pain point yet.

docs/deployment.md
# Deployment
How Drive runs in production as part of the SBBB Kubernetes stack.
---
## Where This Fits
Drive replaces the upstream [suitenumerique/drive](https://github.com/suitenumerique/drive) Helm chart in the SBBB stack. Same role, one binary instead of Django + Celery + Redis queues + Next.js.
Sits alongside the other La Suite apps and shares the Ory identity stack (Kratos + Hydra) for auth.
---
## The Binary
```bash
deno task build
```
Produces a single `driver` binary via `deno compile`:
```bash
deno compile --allow-net --allow-read --allow-env --include ui/dist -o driver main.ts
```
Bundles:
- Deno runtime
- All server TypeScript (compiled to JS)
- The entire `ui/dist` directory (React SPA)
No Node.js, no npm, no `node_modules` at runtime. Copy the binary into a container and run it. That's the deployment.
---
## Environment Variables
Everything is configured via environment variables. No config files.
| Variable | Default | Required | Description |
|----------|---------|----------|-------------|
| `PORT` | `3000` | No | Server listen port |
| `PUBLIC_URL` | `http://localhost:3000` | Yes | Public-facing URL. Used in WOPI callback URLs, redirects, and CSRF. Must be the URL users see in their browser. |
| `DATABASE_URL` | `postgres://driver:driver@localhost:5432/driver_db` | Yes | PostgreSQL connection string |
| `SEAWEEDFS_S3_URL` | `http://seaweedfs-filer.storage.svc.cluster.local:8333` | No | S3 endpoint |
| `SEAWEEDFS_ACCESS_KEY` | *(empty)* | No | S3 access key (empty = unauthenticated) |
| `SEAWEEDFS_SECRET_KEY` | *(empty)* | No | S3 secret key |
| `S3_BUCKET` | `sunbeam-driver` | No | S3 bucket name |
| `S3_REGION` | `us-east-1` | No | S3 region for signing |
| `VALKEY_URL` | `redis://localhost:6379/2` | No | Valkey/Redis URL for WOPI locks. Falls back to in-memory if unavailable. Set to your Valkey cluster URL in production. |
| `KRATOS_PUBLIC_URL` | `http://kratos-public.ory.svc.cluster.local:80` | No | Kratos public API for session validation |
| `KETO_READ_URL` | `http://keto-read.ory.svc.cluster.local:4466` | No | Keto read API for permission checks |
| `KETO_WRITE_URL` | `http://keto-write.ory.svc.cluster.local:4467` | No | Keto write API for tuple management |
| `COLLABORA_URL` | `http://collabora.lasuite.svc.cluster.local:9980` | No | Collabora Online for WOPI discovery |
| `WOPI_JWT_SECRET` | `dev-wopi-secret-change-in-production` | **Yes** | HMAC secret for WOPI access tokens. **Change this in production.** |
| `CSRF_COOKIE_SECRET` | `dev-secret-change-in-production` | **Yes** | HMAC secret for CSRF double-submit cookies. **Change this in production.** |
| `DRIVER_TEST_MODE` | *(unset)* | No | Set to `1` to bypass auth. **Never set this in production.** |
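Reading this table is a handful of lookups with defaults, plus hard failures for the secrets (a sketch with illustrative names, written against a plain record so it's runtime-agnostic):

```typescript
type Env = Record<string, string | undefined>;

// Optional variable: fall back to the documented default.
function env(vars: Env, name: string, fallback: string): string {
  return vars[name] ?? fallback;
}

// Required variable: refuse to start without it.
function requiredEnv(vars: Env, name: string): string {
  const v = vars[name];
  if (!v) throw new Error(`${name} must be set (no safe default in production)`);
  return v;
}
```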
---
## Health Check
```
GET /health
```
Returns:
```json
{ "ok": true, "time": "2026-03-25T10:30:00.000Z" }
```
No auth required. Use this for Kubernetes liveness and readiness probes:
```yaml
livenessProbe:
httpGet:
path: /health
port: 3000
initialDelaySeconds: 5
periodSeconds: 10
readinessProbe:
httpGet:
path: /health
port: 3000
initialDelaySeconds: 3
periodSeconds: 5
```
---
## Collabora Configuration
Collabora needs to know which hosts can make WOPI callbacks. This is the `aliasgroup1` env var on the Collabora deployment. Get this wrong and Collabora will reject every callback with a "Not allowed" error.
```yaml
# Collabora environment
aliasgroup1: "https://drive\\.example\\.com"
```
The value is a regex — dots must be escaped. Multiple hosts:
```yaml
aliasgroup1: "https://drive\\.example\\.com|https://drive\\.staging\\.example\\.com"
```
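Because the value is a regex, a candidate can be sanity-checked before deploying (illustrative — the anchoring here is ours; Collabora's own matching may differ in detail):

```typescript
// aliasgroup1 is matched as a regex against the WOPI host, so dots are
// escaped and alternative hosts are joined with "|".
const aliasgroup1 = "https://drive\\.example\\.com|https://drive\\.staging\\.example\\.com";
const allowed = new RegExp(`^(${aliasgroup1})$`);

function hostAllowed(origin: string): boolean {
  return allowed.test(origin);
}
```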
In the Docker Compose test stack, this is:
```yaml
aliasgroup1: "http://host\\.docker\\.internal:3200"
```
---
## OIDC Client Setup
Drive authenticates users via Ory Kratos sessions. For OIDC flows, you need a client registered in Ory Hydra.
Client credentials go in a Kubernetes secret:
```yaml
apiVersion: v1
kind: Secret
metadata:
name: oidc-drive
type: Opaque
stringData:
client-id: "drive"
client-secret: "your-secret-here"
```
The Kratos identity schema needs standard OIDC claims (email, name). Drive reads from session traits:
```typescript
const givenName = traits.given_name ?? traits.name?.first ?? "";
const familyName = traits.family_name ?? traits.name?.last ?? "";
```
Supports both OIDC-standard (`given_name`/`family_name`) and legacy (`name.first`/`name.last`) formats. You probably won't need to think about this.
---
## SeaweedFS Bucket
Create the bucket before first deploy (or don't — SeaweedFS creates buckets on first write with default config):
```bash
# Using AWS CLI pointed at SeaweedFS
aws --endpoint-url http://seaweedfs:8333 s3 mb s3://sunbeam-driver
```
Bucket name defaults to `sunbeam-driver`, configurable via `S3_BUCKET`.
---
## Keto Deployment
Keto is new to the SBBB stack — Drive introduced it. You need to deploy:
1. **Keto server** with read (4466) and write (4467) APIs
2. **OPL namespaces** from `keto/namespaces.ts` loaded at deploy time
The namespace file defines the permission model (User, Group, Bucket, Folder, File). See [permissions.md](permissions.md) for the details.
Typical Keto config:
```yaml
# Keto config
dsn: postgres://keto:keto@postgres:5432/keto_db
namespaces:
location: file:///etc/keto/namespaces.ts
serve:
read:
host: 0.0.0.0
port: 4466
write:
host: 0.0.0.0
port: 4467
```
Read API: accessible from Drive pods. Write API: accessible from Drive pods + admin tooling. Neither should be exposed to the internet.
---
## Database Migration
Run migrations before first deploy and after updates that add new ones:
```bash
DATABASE_URL="postgres://..." ./driver migrate
```
Or, if running from source:
```bash
DATABASE_URL="postgres://..." deno run -A server/migrate.ts
```
Idempotent — running them multiple times is safe. A `_migrations` table tracks what's been applied.
---
## Observability
OpenTelemetry tracing and metrics are built in (`server/telemetry.ts`) — every request is instrumented automatically.
Full picture:
- **OpenTelemetry** — tracing and metrics via OTLP. Especially useful for debugging WOPI callback chains.
- `/health` endpoint for uptime monitoring
- PostgreSQL query logs for database performance
- S3 access logs from SeaweedFS
- Container stdout/stderr
---
## Deployment Checklist
1. Build the binary: `deno task build`
2. Set **`WOPI_JWT_SECRET`** to a random secret (32+ characters)
3. Set **`CSRF_COOKIE_SECRET`** to a different random secret
4. Set **`PUBLIC_URL`** to the actual user-facing URL
5. Set **`DATABASE_URL`** and run migrations
6. Ensure SeaweedFS bucket exists
7. Configure Collabora `aliasgroup1` to allow WOPI callbacks from `PUBLIC_URL`
8. Register the OIDC client in Hydra (or use the `oidc-drive` secret)
9. Deploy Keto with the namespace file from `keto/namespaces.ts`
10. Verify: `curl https://drive.example.com/health`
11. **Do not** set `DRIVER_TEST_MODE=1` in production. It disables all auth. You will have a bad time.

docs/local-dev.md
# Local Development
Zero to running in 2 minutes. Full WOPI editing stack takes a bit longer, but not much.
---
## Prerequisites
| Tool | Version | What for |
|------|---------|----------|
| [Deno](https://deno.land/) | 2.x | Server runtime |
| [Node.js](https://nodejs.org/) | 20+ | UI build toolchain |
| PostgreSQL | 14+ | Metadata storage |
| [SeaweedFS](https://github.com/seaweedfs/seaweedfs) | latest | S3-compatible file storage |
### Installing SeaweedFS
`weed mini` is all you need — single-process mode that runs master, volume, and filer with S3 support:
```bash
# macOS
brew install seaweedfs
# Or download from GitHub releases
# https://github.com/seaweedfs/seaweedfs/releases
```
---
## Quick Start (No WOPI)
Gets you a working file browser with uploads, folders, sort, search — everything except document editing.
### 1. Database Setup
```bash
createdb driver_db
DATABASE_URL="postgres://localhost/driver_db" deno run -A server/migrate.ts
```
Creates `files`, `user_file_state`, `_migrations` tables and the folder size functions.
### 2. Start SeaweedFS
```bash
weed mini -dir=/tmp/seaweed-driver
```
S3 endpoint on `http://localhost:8333`. No auth, no config. It works.
### 3. Build the UI
```bash
cd ui && npm install && npx vite build && cd ..
```
### 4. Start the Server
```bash
DATABASE_URL="postgres://localhost/driver_db" \
SEAWEEDFS_S3_URL="http://localhost:8333" \
DRIVER_TEST_MODE=1 \
deno run -A main.ts
```
Open `http://localhost:3000`.
`DRIVER_TEST_MODE=1` bypasses Kratos auth and injects a fake identity. No Ory stack needed unless you're working on auth flows.
### Development Mode (Hot Reload)
Or skip the manual steps entirely:
```bash
deno task dev
```
Runs both in parallel:
- `deno run -A --watch main.ts` — server with file-watching restart
- `cd ui && npx vite` — Vite dev server with HMR
Vite proxies API calls to the Deno server. Edit code, save, see it.
---
## Full WOPI Setup (Collabora Editing)
Document editing needs Collabora Online. Docker Compose is the path of least resistance.
### 1. Start the Compose Stack
```bash
docker compose up -d
```
Starts:
- **PostgreSQL** on 5433 (not 5432, avoids conflicts)
- **SeaweedFS** on 8334 (not 8333, same reason)
- **Collabora Online** on 9980
Collabora takes ~30 seconds on first start. Wait for healthy:
```bash
docker compose ps
```
### 2. Run Migrations
```bash
DATABASE_URL="postgres://driver:driver@localhost:5433/driver_db" deno run -A server/migrate.ts
```
### 3. Start the Server
```bash
DATABASE_URL="postgres://driver:driver@localhost:5433/driver_db" \
SEAWEEDFS_S3_URL="http://localhost:8334" \
COLLABORA_URL="http://localhost:9980" \
PUBLIC_URL="http://host.docker.internal:3200" \
PORT=3200 \
DRIVER_TEST_MODE=1 \
deno run -A main.ts
```
Note `PUBLIC_URL` — Collabora (in Docker) needs to reach the server (on the host) for WOPI callbacks, hence `host.docker.internal`. Port 3200 avoids stepping on anything already on 3000.
### 4. Test It
Upload a `.docx` or `.odt`, double-click it. Collabora loads in an iframe. If it doesn't, check the `aliasgroup1` config in `compose.yaml`.
### Tear Down
```bash
docker compose down -v
```
`-v` removes volumes so you start clean next time.
---
## Integration Service Theming
To test La Suite theming locally, point at the production integration service:
```bash
INTEGRATION_URL=https://integration.sunbeam.pt deno run -A main.ts
```
Injects runtime theme CSS, fonts, dark mode, and the waffle menu. The integration tests validate this:
```bash
cd ui && INTEGRATION_URL=https://integration.sunbeam.pt npx playwright test e2e/integration-service.spec.ts
```
---
## Environment Variables for Local Dev
Minimal reference (Drive doesn't read `.env` files — set these in your shell or prefix your command):
```bash
# Required
DATABASE_URL="postgres://localhost/driver_db"
SEAWEEDFS_S3_URL="http://localhost:8333"
# Optional — defaults are fine for local dev
PORT=3000
PUBLIC_URL="http://localhost:3000"
S3_BUCKET="sunbeam-driver"
DRIVER_TEST_MODE=1
# Only needed for WOPI editing
COLLABORA_URL="http://localhost:9980"
WOPI_JWT_SECRET="dev-wopi-secret-change-in-production"
# Only needed for real auth (not test mode)
KRATOS_PUBLIC_URL="http://localhost:4433"
# Only needed for permissions
KETO_READ_URL="http://localhost:4466"
KETO_WRITE_URL="http://localhost:4467"
```
---
## Common Tasks
### Reset the database
```bash
dropdb driver_db && createdb driver_db
DATABASE_URL="postgres://localhost/driver_db" deno run -A server/migrate.ts
```
### Backfill files from S3
Uploaded files directly to SeaweedFS and they're not showing up?
```bash
curl -X POST http://localhost:3000/api/admin/backfill \
-H "Content-Type: application/json" \
-d '{"dry_run": true}'
```
Drop `"dry_run": true` to actually write the rows.
### Build the compiled binary
```bash
deno task build
```
Runs `vite build` then `deno compile`. One binary in the project root.
### Run all tests
```bash
deno task test:all
```
See [testing.md](testing.md) for the full breakdown.

docs/permissions.md
# Permissions
Zanzibar-style relationship-based access control via Ory Keto. Sounds fancy, works well.
---
## The Model
Drive uses [Ory Keto](https://www.ory.sh/keto/) for permissions. The model: relationship tuples ("user X has relation Y on object Z") checked by graph traversal. If you've read the Zanzibar paper, this is that.
The OPL (Ory Permission Language) definitions live in `keto/namespaces.ts`. This file is consumed by the Keto server at deploy time — Deno never executes it. It uses TypeScript syntax with Keto's built-in types (`Namespace`, `Context`, `SubjectSet`).
## Namespaces
Five namespaces, hierarchical:
```
User
Group members: (User | Group)[]
Bucket owners, editors, viewers --> permits: write, read, delete
Folder owners, editors, viewers, parents: (Folder | Bucket)[] --> permits: write, read, delete
File owners, editors, viewers, parents: Folder[] --> permits: write, read, delete
```
### User
```typescript
class User implements Namespace {}
```
Marker namespace. Users are their Kratos identity UUID.
### Group
```typescript
class Group implements Namespace {
related: {
members: (User | Group)[];
};
}
```
Groups can contain users or other groups (nested groups). Used in editor/viewer relations via `SubjectSet<Group, "members">`.
### Bucket
```typescript
class Bucket implements Namespace {
related: {
owners: User[];
editors: (User | SubjectSet<Group, "members">)[];
viewers: (User | SubjectSet<Group, "members">)[];
};
permits = {
write: (ctx) => owners.includes(subject) || editors.includes(subject),
read: (ctx) => write(ctx) || viewers.includes(subject),
delete: (ctx) => owners.includes(subject),
};
}
```
Top of the hierarchy. Only owners can delete; write implies read. A standard permission hierarchy.
### Folder
```typescript
class Folder implements Namespace {
related: {
owners: User[];
editors: (User | SubjectSet<Group, "members">)[];
viewers: (User | SubjectSet<Group, "members">)[];
parents: (Folder | Bucket)[];
};
permits = {
write: (ctx) =>
owners.includes(subject) ||
editors.includes(subject) ||
this.related.parents.traverse((p) => p.permits.write(ctx)),
// ... same pattern for read and delete
};
}
```
The interesting bit is `parents.traverse()`. No direct permission grant? Keto walks up the parent chain. Folder in a bucket inherits the bucket's permissions. Folder in a folder inherits from that, which may inherit from its parent, all the way up to the bucket. One tuple at the top covers an entire tree.
### File
```typescript
class File implements Namespace {
related: {
owners: User[];
editors: (User | SubjectSet<Group, "members">)[];
viewers: (User | SubjectSet<Group, "members">)[];
parents: Folder[];
};
// Same traverse pattern as Folder
}
```
Files can only have Folder parents (not Buckets directly). Same traversal logic.
## How Traversal Works
`parents.traverse()` is where Keto earns its keep. When you check "can user X write file Y?", Keto:
1. Checks if X is in Y's `owners` or `editors`
2. If not, looks at Y's `parents` relations
3. For each parent folder, recursively checks if X can write that folder
4. Each folder checks its own owners/editors, then traverses up to *its* parents
5. Eventually reaches a Bucket (no more parents) and checks there
Give a user `editor` on a bucket and they can write every folder and file in it. No per-file tuples needed.
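The traversal can be mirrored in a few lines over in-memory tuples (a toy model for intuition — Keto stores this as relation tuples and walks the graph server-side):

```typescript
type Obj = { owners: Set<string>; editors: Set<string>; parents: string[] };

// Toy relationship graph: bucket b1 contains folder f1 contains file x.
const graph = new Map<string, Obj>([
  ["bucket:b1", { owners: new Set(["alice"]), editors: new Set(["bob"]), parents: [] }],
  ["folder:f1", { owners: new Set(), editors: new Set(), parents: ["bucket:b1"] }],
  ["file:x",    { owners: new Set(), editors: new Set(), parents: ["folder:f1"] }],
]);

function canWrite(objectId: string, user: string): boolean {
  const o = graph.get(objectId);
  if (!o) return false;                                // unknown object: fail closed
  if (o.owners.has(user) || o.editors.has(user)) return true;
  return o.parents.some((p) => canWrite(p, user));     // parents.traverse()
}
```

One `editor` tuple on `bucket:b1` is enough for bob to write `file:x`, two levels down.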
---
## The HTTP API
The Keto client in `server/keto.ts` is a thin fetch wrapper. No SDK — Keto's HTTP API is simple enough that a client library would add more weight than value.
### Check Permission
```typescript
const allowed = await checkPermission(
"files", // namespace
"550e8400-e29b-41d4-a716-446655440000", // object (file UUID)
"read", // relation
"kratos-identity-uuid", // subject_id
);
```
Under the hood:
```
POST {KETO_READ_URL}/relation-tuples/check/openapi
Content-Type: application/json
{
"namespace": "files",
"object": "550e8400-...",
"relation": "read",
"subject_id": "kratos-identity-uuid"
}
```
Returns `{ "allowed": true }` or `{ "allowed": false }`. Network errors return `false` — fail closed, always.
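The fail-closed behavior is just a try/catch around the check call. A sketch with an injectable fetcher so it can be exercised offline (names and shapes here are assumptions, not the real `server/keto.ts`):

```typescript
type CheckResponse = { ok: boolean; json(): Promise<{ allowed: boolean }> };
type Fetcher = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<CheckResponse>;

async function checkPermission(
  namespace: string, object: string, relation: string, subjectId: string,
  fetcher: Fetcher,
): Promise<boolean> {
  try {
    const res = await fetcher("http://keto-read:4466/relation-tuples/check/openapi", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ namespace, object, relation, subject_id: subjectId }),
    });
    if (!res.ok) return false;          // non-2xx: deny
    return (await res.json()).allowed;
  } catch {
    return false;                       // network error: fail closed, always
  }
}
```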
### Write Relationship
```typescript
await createRelationship("files", fileId, "owners", userId);
```
```
PUT {KETO_WRITE_URL}/admin/relation-tuples
Content-Type: application/json
{
"namespace": "files",
"object": "file-uuid",
"relation": "owners",
"subject_id": "user-uuid"
}
```
### Write Relationship with Subject Set
For parent relationships (file -> folder, folder -> folder, folder -> bucket):
```typescript
await createRelationshipWithSubjectSet(
"files", fileId, "parents", // this file's "parents" relation
"folders", parentFolderId, "" // points to the folder
);
```
### Delete Relationship
```
DELETE {KETO_WRITE_URL}/admin/relation-tuples?namespace=files&object=...&relation=...&subject_id=...
```
### Batch Write
Atomic insert/delete of multiple tuples:
```typescript
await batchWriteRelationships([
{ action: "delete", relation_tuple: { namespace: "files", object: fileId, relation: "parents", subject_set: { ... } } },
{ action: "insert", relation_tuple: { namespace: "files", object: fileId, relation: "parents", subject_set: { ... } } },
]);
```
```
PATCH {KETO_WRITE_URL}/admin/relation-tuples
```
### Expand (Debugging)
```typescript
const tree = await expandPermission("files", fileId, "read", 3);
```
Returns the full permission tree. Useful for debugging why someone can or can't access something.
---
## Permission Middleware
`server/permissions.ts` exports `permissionMiddleware` for file/folder routes. Straightforward:
1. Extracts identity from `c.get("identity")`
2. Parses file/folder UUID from the URL path (`/api/files/:id` or `/api/folders/:id`)
3. Maps HTTP method to permission: `GET` → `read`, `DELETE` → `delete`, everything else → `write`
4. Checks against Keto
5. 403 if denied
For list operations (no `:id` in the path), the middleware passes through — per-item filtering happens in the handler:
```typescript
const filtered = await filterByPermission(files, userId, "read");
```
Checks permissions in parallel and returns only the allowed items.
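`filterByPermission` is essentially `Promise.all` plus a zip — a sketch (the real helper checks each id against Keto):

```typescript
type CheckFn = (fileId: string) => Promise<boolean>;

// Check every item concurrently, then keep only the ones that passed.
async function filterByPermission<T extends { id: string }>(
  items: T[], check: CheckFn,
): Promise<T[]> {
  const results = await Promise.all(items.map((it) => check(it.id)));
  return items.filter((_, i) => results[i]);
}
```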
---
## Tuple Lifecycle
Tuples live and die with their resources. No orphans.
### On File Create
```typescript
await writeFilePermissions(fileId, ownerId, parentFolderId);
```
Creates:
- `files:{fileId}#owners@{ownerId}` — creator is owner
- `files:{fileId}#parents@folders:{parentFolderId}#...` — links to parent folder
### On Folder Create
```typescript
await writeFolderPermissions(folderId, ownerId, parentFolderId, bucketId);
```
Creates:
- `folders:{folderId}#owners@{ownerId}`
- `folders:{folderId}#parents@folders:{parentFolderId}#...` — nested in another folder
- `folders:{folderId}#parents@buckets:{bucketId}#...` — at the bucket root
### On Delete
```typescript
await deleteFilePermissions(fileId);
```
Lists all tuples for the file across every relation (`owners`, `editors`, `viewers`, `parents`) and batch-deletes them. Clean break.
### On Move
```typescript
await moveFilePermissions(fileId, newParentId);
```
Batch operation: delete old parent tuple, insert new one, atomically. The file never exists in a state with no parent or two parents.
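The move is one PATCH payload — a sketch of the batch body, following the relation-tuple shape shown earlier (the helper name is illustrative):

```typescript
// Build the atomic delete+insert batch for re-parenting a file.
function moveBatch(fileId: string, oldParentId: string, newParentId: string) {
  const tuple = (folderId: string) => ({
    namespace: "files",
    object: fileId,
    relation: "parents",
    subject_set: { namespace: "folders", object: folderId, relation: "" },
  });
  return [
    { action: "delete", relation_tuple: tuple(oldParentId) },
    { action: "insert", relation_tuple: tuple(newParentId) },
  ];
}
```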
---
## What This Means in Practice
The payoff of Zanzibar for a file system:
- **Share a folder** → everything inside it is shared, recursively. No per-file grants.
- **Move a file** → it picks up the new folder's permissions automatically.
- **Groups are transitive** — add a user to a group, they inherit everything the group has.
- **One tuple on a bucket** can cover thousands of files.
The tradeoff is check latency. Every permission check is an HTTP call to Keto, which may traverse multiple levels. For list operations, checks run in parallel, but it's still N HTTP calls for N items. Fine for file browser pagination (50 items). Would need optimization for bulk operations — we'll cross that bridge when someone has 10,000 files in a folder.

docs/s3-layout.md
# S3 Layout
How files are stored in SeaweedFS, and why you can browse the bucket and actually understand what you're looking at.
---
## Key Convention
S3 keys are human-readable paths. This is intentional, and we'd do it again.
### Personal files
```
{identity-id}/my-files/{path}/{filename}
```
Examples:
```
a1b2c3d4-e5f6-7890-abcd-ef1234567890/my-files/quarterly-report.docx
a1b2c3d4-e5f6-7890-abcd-ef1234567890/my-files/Projects/game-prototype/level-01.fbx
a1b2c3d4-e5f6-7890-abcd-ef1234567890/my-files/Documents/meeting-notes.odt
```
The identity ID is the Kratos identity UUID. `my-files` is a fixed segment that separates the identity prefix from user content.
### Shared files
```
shared/{path}/{filename}
```
Examples:
```
shared/team-assets/brand-guide.pdf
shared/templates/invoice-template.xlsx
```
Shared files use `"shared"` as the owner ID.
### Folders
Folder keys end with a trailing slash:
```
a1b2c3d4-e5f6-7890-abcd-ef1234567890/my-files/Projects/
a1b2c3d4-e5f6-7890-abcd-ef1234567890/my-files/Projects/game-prototype/
```
### Why Human-Readable?
Most S3-backed file systems use UUIDs as keys (`files/550e8400-e29b-41d4.bin`). We don't. You should be able to `s3cmd ls` the bucket and immediately see who owns what, what the folder structure looks like, and what the files are.
This pays for itself when debugging, doing backfills, migrating data, or when someone asks "where did my file go?" — you can answer by looking at S3 directly, no database cross-referencing required.
The tradeoff: renames and moves require S3 copy + delete (S3 has no rename operation). The `updateFile` handler in `server/files.ts` handles this:
```typescript
if (newS3Key !== file.s3_key && !file.is_folder && Number(file.size) > 0) {
await copyObject(file.s3_key, newS3Key);
await deleteObject(file.s3_key);
}
```
---
## The PostgreSQL Metadata Layer
S3 stores the bytes. PostgreSQL tracks everything else.
### The `files` Table
```sql
CREATE TABLE files (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
s3_key TEXT NOT NULL UNIQUE,
filename TEXT NOT NULL,
mimetype TEXT NOT NULL DEFAULT 'application/octet-stream',
size BIGINT NOT NULL DEFAULT 0,
owner_id TEXT NOT NULL,
parent_id UUID REFERENCES files(id) ON DELETE CASCADE,
is_folder BOOLEAN NOT NULL DEFAULT false,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
deleted_at TIMESTAMPTZ
);
```
Design decisions worth noting:
- **UUID primary key** — API routes use UUIDs, not paths. Decouples the URL space from the S3 key space.
- **`s3_key` is unique** — the bridge between metadata and storage. UUID → s3_key and s3_key → metadata, both directions work.
- **`parent_id` is self-referencing** — folders and files share a table. A folder is a row with `is_folder = true`. The hierarchy is a tree of `parent_id` pointers.
- **Soft delete** — `deleted_at` gets set, the row stays. Trash view queries `deleted_at IS NOT NULL`. Restore clears it.
- **`owner_id` is a text field** — not a FK to a users table, because there is no users table. It's the Kratos identity UUID. We don't duplicate identity data locally.
Indexes:
```sql
CREATE INDEX idx_files_parent ON files(parent_id) WHERE deleted_at IS NULL;
CREATE INDEX idx_files_owner ON files(owner_id) WHERE deleted_at IS NULL;
CREATE INDEX idx_files_s3key ON files(s3_key);
```
Partial indexes on `parent_id` and `owner_id` exclude soft-deleted rows — most queries filter on `deleted_at IS NULL`, so the indexes stay lean. Deleted files don't bloat the hot path.
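The hot-path query shape those indexes serve looks like this — a sketch, since the actual handler's column list and ordering may differ:

```sql
-- Typical folder listing: the predicate matches idx_files_parent's
-- WHERE clause exactly, so the partial index applies.
SELECT id, filename, mimetype, size, is_folder, updated_at
FROM files
WHERE parent_id = $1 AND deleted_at IS NULL
ORDER BY is_folder DESC, filename ASC;
```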
### The `user_file_state` Table
```sql
CREATE TABLE user_file_state (
user_id TEXT NOT NULL,
file_id UUID NOT NULL REFERENCES files(id) ON DELETE CASCADE,
favorited BOOLEAN NOT NULL DEFAULT false,
last_opened TIMESTAMPTZ,
PRIMARY KEY (user_id, file_id)
);
```
Per-user state that doesn't belong on the file itself. Favorites and recent files are powered by this table.
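Favoriting is a classic upsert against the composite primary key; a sketch of the shape (the actual handler's statement may differ):

```sql
-- Set a favorite: insert the row if absent, otherwise flip the flag.
-- The (user_id, file_id) primary key makes ON CONFLICT do the work.
INSERT INTO user_file_state (user_id, file_id, favorited)
VALUES ($1, $2, true)
ON CONFLICT (user_id, file_id)
DO UPDATE SET favorited = EXCLUDED.favorited;
```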
---
## Folder Sizes
Folders show their total size (all descendants, recursively). Two PostgreSQL functions handle this.
### `recompute_folder_size(folder_id)`
Recursive CTE that walks all descendants:
```sql
CREATE OR REPLACE FUNCTION recompute_folder_size(folder_id UUID)
RETURNS BIGINT LANGUAGE SQL AS $$
WITH RECURSIVE descendants AS (
SELECT id, size, is_folder
FROM files
WHERE parent_id = folder_id AND deleted_at IS NULL
UNION ALL
SELECT f.id, f.size, f.is_folder
FROM files f
JOIN descendants d ON f.parent_id = d.id
WHERE f.deleted_at IS NULL
)
SELECT COALESCE(SUM(size) FILTER (WHERE NOT is_folder), 0)
FROM descendants;
$$;
```
Sums file sizes only (not folders) to avoid double-counting. Excludes soft-deleted items.
### `propagate_folder_sizes(start_parent_id)`
Walks up the ancestor chain, recomputing each folder's size:
```sql
CREATE OR REPLACE FUNCTION propagate_folder_sizes(start_parent_id UUID)
RETURNS VOID LANGUAGE plpgsql AS $$
DECLARE
current_id UUID := start_parent_id;
computed BIGINT;
BEGIN
WHILE current_id IS NOT NULL LOOP
computed := recompute_folder_size(current_id);
UPDATE files SET size = computed WHERE id = current_id AND is_folder = true;
SELECT parent_id INTO current_id FROM files WHERE id = current_id;
END LOOP;
END;
$$;
```
Called after every file mutation: create, delete, restore, move, upload completion. The handler calls it like:
```typescript
await sql`SELECT propagate_folder_sizes(${parentId}::uuid)`;
```
When a file moves between folders, both the old and new parent chains get recomputed:
```typescript
await propagateFolderSizes(newParentId);
if (oldParentId && oldParentId !== newParentId) {
await propagateFolderSizes(oldParentId);
}
```
---
## The Backfill API
S3 and the database can get out of sync — someone uploaded directly to SeaweedFS, a migration didn't finish, a backup restore happened. The backfill API reconciles them.
```
POST /api/admin/backfill
Content-Type: application/json
{
"prefix": "",
"dry_run": true
}
```
Both fields are optional. `prefix` filters by S3 key prefix. `dry_run` shows what would happen without writing anything.
Response:
```json
{
"scanned": 847,
"already_registered": 812,
"folders_created": 15,
"files_created": 20,
"errors": [],
"dry_run": false
}
```
### What it does
1. Lists all objects in the bucket (paginated, 1000 at a time)
2. Loads existing `s3_key` values from PostgreSQL into a Set
3. For each S3 object not in the database:
- Parses the key → owner ID, path, filename
- Infers mimetype from extension (extensive map in `server/backfill.ts` — documents, images, video, audio, 3D formats, code, archives)
- `HEAD` on the object for real content-type and size
- Creates parent folder rows recursively if missing
- Inserts the file row
4. Recomputes folder sizes for every folder
The key parsing handles both conventions:
```typescript
// {identity-id}/my-files/{path} -> owner = identity-id
// shared/{path} -> owner = "shared"
```
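A minimal sketch of that parsing — the function name and return shape are illustrative, not the actual `server/backfill.ts` API:

```typescript
interface ParsedKey {
  ownerId: string;
  path: string[];   // folder segments between the prefix and the filename
  filename: string; // empty for folder keys (trailing slash)
}

// Split an S3 key into owner, folder path, and filename,
// handling both the per-identity and shared conventions.
function parseKey(key: string): ParsedKey | null {
  const parts = key.split("/");
  let ownerId: string;
  let rest: string[];
  if (parts[0] === "shared") {
    ownerId = "shared";
    rest = parts.slice(1);
  } else if (parts.length >= 2 && parts[1] === "my-files") {
    ownerId = parts[0];
    rest = parts.slice(2);
  } else {
    return null; // key matches neither convention
  }
  const filename = rest[rest.length - 1] ?? ""; // "" when the key ends with "/"
  return { ownerId, path: rest.slice(0, -1), filename };
}
```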
### When to reach for it
- After bulk-uploading files directly to SeaweedFS
- After migrating from another storage system
- After restoring from a backup
- Any time S3 has files the database doesn't know about
The endpoint requires an authenticated session but isn't exposed via ingress — admin-only, on purpose.
---
## S3 Client
The S3 client (`server/s3.ts`) does AWS Signature V4 with Web Crypto. No AWS SDK — it's a lot of dependency for six API calls. Implements:
- `listObjects` — ListObjectsV2 with pagination
- `headObject` — HEAD for content-type and size
- `getObject` — GET (streaming response)
- `putObject` — PUT with content hash
- `deleteObject` — DELETE (404 is not an error)
- `copyObject` — PUT with `x-amz-copy-source` header
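The core of SigV4 is the signing-key derivation chain (secret → date → region → service → `aws4_request`). A sketch using Node's `crypto` for brevity — the server does the equivalent with Web Crypto, and these names are illustrative:

```typescript
import { createHmac } from "node:crypto";

// HMAC-SHA256 returning a raw Buffer, chained at each derivation step.
function hmac(key: Buffer | string, data: string): Buffer {
  return createHmac("sha256", key).update(data).digest();
}

// Derive the SigV4 signing key: each step's output keys the next HMAC.
function signingKey(secret: string, date: string, region: string, service: string): Buffer {
  const kDate = hmac("AWS4" + secret, date); // date in YYYYMMDD form
  const kRegion = hmac(kDate, region);
  const kService = hmac(kRegion, service);
  return hmac(kService, "aws4_request");
}
```

This chain is the part most worth unit-testing against AWS's published SigV4 example vectors.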
Pre-signed URLs (`server/s3-presign.ts`) support:
- `presignGetUrl` — download directly from S3
- `presignPutUrl` — upload directly to S3
- `createMultipartUpload` + `presignUploadPart` + `completeMultipartUpload` — large file uploads
Default pre-signed URL expiry is 1 hour.

---

`docs/testing.md`
# Testing
Five test suites, 90%+ coverage on both layers, and a Docker Compose stack for full WOPI integration tests.
---
## Overview
| Suite | Runner | Count | What it tests |
|-------|--------|-------|---------------|
| Server unit tests | Deno | 93 | API handlers, S3 signing, WOPI tokens, locks, auth, CSRF, Keto, permissions, backfill |
| UI unit tests | Vitest | 278 | Components, pages, hooks, stores, API client |
| E2E tests | Playwright | 11 | Full browser flows against a running server |
| Integration service tests | Playwright | 12 | Theme tokens, CSS injection, waffle menu from production integration service |
| WOPI integration tests | Playwright | 13 | End-to-end Collabora editing via Docker Compose |
---
## Server Unit Tests (Deno)
```bash
deno task test
```
Which runs:
```bash
deno test -A tests/server/
```
10 test files, one per server module:
| File | What it covers |
|------|---------------|
| `auth_test.ts` | Kratos session validation, cookie extraction, AAL2 handling, test mode |
| `csrf_test.ts` | HMAC double-submit token generation, verification, timing-safe comparison |
| `files_test.ts` | File CRUD handlers, presigned URL generation, sort/search/pagination |
| `s3_test.ts` | AWS SigV4 signing, canonical request building, presign URL generation |
| `keto_test.ts` | Keto HTTP client: check, write, delete, batch, list, expand |
| `permissions_test.ts` | Permission middleware, tuple lifecycle, filterByPermission |
| `backfill_test.ts` | Key parsing, folder chain creation, dry run, mimetype inference |
| `wopi_token_test.ts` | JWT generation, verification, expiry, payload validation |
| `wopi_lock_test.ts` | Lock acquire, release, refresh, conflict, unlock-and-relock, TTL |
| `wopi_discovery_test.ts` | Discovery XML parsing, caching, retry logic |
All use Deno's built-in test runner and assertions — no test framework dependency. WOPI lock tests inject an `InMemoryLockStore` so you don't need Valkey running.
### Running with coverage
```bash
deno test -A --coverage tests/server/
deno coverage coverage/
```
---
## UI Unit Tests (Vitest)
```bash
cd ui && npx vitest run
```
Or from the project root:
```bash
deno task test:ui
```
278 tests across 27 files:
| Area | Files | What they test |
|------|-------|---------------|
| Components | `FileBrowser`, `FileUpload`, `CollaboraEditor`, `ProfileMenu`, `FileActions`, `FilePreview`, `AssetTypeBadge`, `BreadcrumbNav`, `WaffleButton`, `ShareDialog` | Rendering, user interaction, keyboard navigation, aria attributes |
| Pages | `Explorer`, `Recent`, `Favorites`, `Trash`, `Editor` | Route-level rendering, data loading, empty states |
| API | `client`, `files`, `session`, `wopi` | Fetch mocking, error handling, request formatting |
| Stores | `selection`, `upload` | Zustand state management, multi-select, upload queue |
| Hooks | `useAssetType`, `usePreview`, `useThreeDPreview` | File type detection, preview capability determination |
| Layouts | `AppLayout` | Header, sidebar, main content area rendering |
| Cunningham | `useCunninghamTheme` | Theme integration, CSS variable injection |
| Root | `App` | CunninghamProvider + Router mounting |
### Running with coverage
```bash
cd ui && npx vitest run --coverage
```
---
## E2E Tests (Playwright)
```bash
cd ui && npx playwright test e2e/driver.spec.ts
```
Or:
```bash
deno task test:e2e
```
**Needs a running server** with:
- `DRIVER_TEST_MODE=1` (bypasses Kratos auth, injects a fake identity)
- PostgreSQL with migrations applied
- SeaweedFS (`weed mini` works fine)
11 tests covering browser-level flows:
- File browser navigation
- Folder creation and navigation
- File upload (single and multi-part)
- File rename and move
- File deletion and restore from trash
- Sort and search
- Favorites toggle
- Download via presigned URL
---
## Integration Service Tests (Playwright)
```bash
cd ui && INTEGRATION_URL=https://integration.sunbeam.pt npx playwright test e2e/integration-service.spec.ts
```
12 tests validating La Suite integration service theming:
- CSS variable injection from the integration service
- Theme token validation
- Waffle menu rendering
- Dark mode support
- Custom font loading
- Runtime theme switching
These hit the production integration service at `integration.sunbeam.pt`. No local Drive server needed — they test the theme integration layer in isolation.
---
## WOPI Integration Tests (Playwright)
```bash
# Start the test stack
docker compose up -d
# Wait for services to be healthy (Collabora takes ~30s)
docker compose ps
# Start the server pointed at compose services
DATABASE_URL="postgres://driver:driver@localhost:5433/driver_db" \
SEAWEEDFS_S3_URL="http://localhost:8334" \
COLLABORA_URL="http://localhost:9980" \
PUBLIC_URL="http://host.docker.internal:3200" \
PORT=3200 \
DRIVER_TEST_MODE=1 \
deno run -A main.ts
# Run the tests
cd ui && DRIVER_URL=http://localhost:3200 npx playwright test e2e/wopi.spec.ts
```
13 tests covering the full WOPI editing flow:
- Token generation for various file types
- Editor URL construction with WOPISrc
- Collabora iframe loading
- CheckFileInfo response validation
- GetFile content retrieval
- PutFile content writing
- Lock/unlock lifecycle
- Lock conflict handling
- Token expiry and refresh
- Concurrent editing detection
### The Docker Compose Stack
`compose.yaml` spins up three services:
| Service | Image | Port | Purpose |
|---------|-------|------|---------|
| `postgres` | `postgres:16-alpine` | 5433 | Test database (5433 to avoid conflict with host PostgreSQL) |
| `seaweedfs` | `chrislusf/seaweedfs:latest` | 8334 | S3 storage (8334 to avoid conflict with host weed mini) |
| `collabora` | `collabora/code:latest` | 9980 | Document editor |
Drive runs on the host (not in Docker) so Playwright can reach it. Collabora needs to call back to the host for WOPI — `aliasgroup1` points to `host.docker.internal:3200`.
Tear down after testing:
```bash
docker compose down -v
```
---
## Running Everything
```bash
# All automated tests (server + UI + E2E)
deno task test:all
```
Which runs:
```bash
deno task test && deno task test:ui && deno task test:e2e
```
WOPI integration tests aren't in `test:all` — they need the Docker Compose stack. Run them separately when touching WOPI code.

---

`docs/wopi.md`
# WOPI Integration
How Drive talks to Collabora Online, how Collabora talks back, and the iframe dance that ties them together.
---
## What WOPI Is
WOPI (Web Application Open Platform Interface) is Microsoft's protocol for embedding document editors in web apps. Collabora implements it. Your app (the "WOPI host") exposes a few HTTP endpoints, and Collabora calls them to read files, write files, and manage locks.
The mental model that matters: during editing, the browser talks to Collabora, and Collabora talks to your server. The browser is not in the loop for file I/O.
## The Full Flow
Here's what happens when a user double-clicks a `.docx`:
### 1. Token Generation
The browser calls our API to get a WOPI access token:
```
POST /api/wopi/token
Content-Type: application/json
{ "file_id": "550e8400-e29b-41d4-a716-446655440000" }
```
The server:
- Validates the Kratos session (normal session auth)
- Looks up the file in PostgreSQL
- Determines write access (currently: owner = can write)
- Generates a JWT signed with HMAC-SHA256
- Fetches the Collabora discovery XML to find the editor URL for this mimetype
- Returns the token, TTL, and editor URL
Response:
```json
{
"access_token": "eyJhbGciOiJIUzI1NiIs...",
"access_token_ttl": 1711382400000,
"editor_url": "https://collabora.example.com/browser/abc123/cool.html?WOPISrc=..."
}
```
### 2. Form POST to Collabora
The browser doesn't `fetch()` the editor URL — it submits a hidden HTML form targeting an iframe. Yes, a form POST in 2026. This is how WOPI works: the token goes as a form field, not a header.
From `CollaboraEditor.tsx`:
```tsx
<form
ref={formRef}
target="collabora_frame"
action={wopiData.editor_url!}
encType="multipart/form-data"
method="post"
style={{ display: 'none' }}
>
<input name="access_token" value={wopiData.access_token} type="hidden" readOnly />
<input name="access_token_ttl" value={String(wopiData.access_token_ttl)} type="hidden" readOnly />
</form>
<iframe
ref={iframeRef}
name="collabora_frame"
title="Collabora Editor"
sandbox="allow-scripts allow-same-origin allow-forms allow-popups allow-popups-to-escape-sandbox allow-downloads"
allow="clipboard-read *; clipboard-write *"
allowFullScreen
/>
```
**The timing matters.** The form submission fires in a `useEffect` on `wopiData` change — not in a callback, not in a `setTimeout`. Both the `<form>` and `<iframe name="collabora_frame">` must be in the DOM before `formRef.current.submit()`. If you submit before the named iframe exists, the browser opens the POST in the main window and your SPA is toast. Ask us how we know.
```tsx
useEffect(() => {
if (wopiData?.editor_url && formRef.current && iframeRef.current) {
formRef.current.submit()
}
}, [wopiData])
```
### 3. Collabora Calls Back
Once Collabora receives the form POST, it starts making WOPI requests to our server using the `WOPISrc` URL embedded in the editor URL. Every request includes `?access_token=...` as a query parameter.
### 4. PostMessage Communication
Collabora talks to the parent window via `postMessage`. The component listens for:
- `App_LoadingStatus` (Status: `Document_Loaded`) — hide the loading spinner, focus the iframe
- `UI_Close` — user clicked the close button in Collabora
- `Action_Save` / `Action_Save_Resp` — save status for the UI
Token refresh also uses postMessage. Before the token expires, the component fetches a new one and sends it to the iframe:
```tsx
iframeRef.current.contentWindow.postMessage(
JSON.stringify({
MessageId: 'Action_ResetAccessToken',
Values: {
token: data.access_token,
token_ttl: String(data.access_token_ttl),
},
}),
'*',
)
```
Tokens refresh 5 minutes before expiry (`TOKEN_REFRESH_MARGIN_MS = 5 * 60 * 1000`).
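The refresh timing is simple arithmetic; a sketch with illustrative names, given that `access_token_ttl` is a millisecond epoch timestamp:

```typescript
const TOKEN_REFRESH_MARGIN_MS = 5 * 60 * 1000;

// Milliseconds until the refresh should fire: margin before expiry,
// clamped to zero so an already-stale token refreshes immediately.
function refreshDelayMs(tokenTtlEpochMs: number, nowMs: number): number {
  return Math.max(0, tokenTtlEpochMs - nowMs - TOKEN_REFRESH_MARGIN_MS);
}
```

Something like `setTimeout(fetchNewToken, refreshDelayMs(ttl, Date.now()))` then drives the `Action_ResetAccessToken` postMessage.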
---
## WOPI Endpoints
### CheckFileInfo
```
GET /wopi/files/:id?access_token=...
```
Returns metadata Collabora needs to render the editor:
```json
{
"BaseFileName": "quarterly-report.docx",
"OwnerId": "kratos-identity-uuid",
"Size": 145832,
"UserId": "kratos-identity-uuid",
"UserFriendlyName": "Sienna Costa",
"Version": "2025-03-15T10:30:00.000Z",
"UserCanWrite": true,
"UserCanNotWriteRelative": true,
"SupportsLocks": true,
"SupportsUpdate": true,
"SupportsGetLock": true,
"LastModifiedTime": "2025-03-15T10:30:00.000Z"
}
```
`UserCanNotWriteRelative` is always `true` — we don't support PutRelativeFile (creating files from within the editor). PutRelative override gets a 501.
### GetFile
```
GET /wopi/files/:id/contents?access_token=...
```
Fetches the file from S3 and streams it back. GetFile and PutFile are the two places file bytes flow through the server — Collabora can't use pre-signed URLs, so we proxy.
### PutFile
```
POST /wopi/files/:id/contents?access_token=...
X-WOPI-Lock: <lock-id>
[file bytes]
```
Writes the edited file back to S3. Validates the lock — if a different lock holds the file, returns 409 with the current lock ID in `X-WOPI-Lock`. Updates file size and `updated_at` in PostgreSQL.
### Lock Operations
All lock operations go through `POST /wopi/files/:id` with the `X-WOPI-Override` header:
| Override | Headers | What it does |
|----------|---------|-------------|
| `LOCK` | `X-WOPI-Lock` | Acquire a lock. If `X-WOPI-OldLock` is also present, it's an unlock-and-relock. |
| `GET_LOCK` | — | Returns current lock ID in `X-WOPI-Lock` response header. |
| `REFRESH_LOCK` | `X-WOPI-Lock` | Extend the TTL of an existing lock. |
| `UNLOCK` | `X-WOPI-Lock` | Release the lock. |
| `RENAME_FILE` | `X-WOPI-RequestedName` | Rename the file (requires write permission). |
Lock conflicts return 409 with the conflicting lock ID in the `X-WOPI-Lock` response header.
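The dispatch reduces to a switch on the override header; a sketch with illustrative names, not the actual handler:

```typescript
type WopiOp =
  | "lock" | "unlock-and-relock" | "get-lock"
  | "refresh-lock" | "unlock" | "rename" | "unsupported";

// Map the X-WOPI-Override header (plus X-WOPI-OldLock presence)
// to a lock operation, per the table above.
function resolveOverride(override: string | undefined, hasOldLock: boolean): WopiOp {
  switch (override) {
    case "LOCK":
      return hasOldLock ? "unlock-and-relock" : "lock";
    case "GET_LOCK":
      return "get-lock";
    case "REFRESH_LOCK":
      return "refresh-lock";
    case "UNLOCK":
      return "unlock";
    case "RENAME_FILE":
      return "rename";
    default:
      return "unsupported"; // rejected by the handler
  }
}
```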
---
## Token Generation
WOPI tokens are JWTs signed with HMAC-SHA256 using Web Crypto. No external JWT library — it's 30 lines of code and one fewer dependency.
The payload:
```typescript
interface WopiTokenPayload {
fid: string // File UUID
uid: string // User ID (Kratos identity)
unm: string // User display name
wr: boolean // Can write
iat: number // Issued at (unix seconds)
exp: number // Expires at (unix seconds)
}
```
Default expiry is 8 hours (`DEFAULT_EXPIRES_SECONDS = 8 * 3600`).
The signing is textbook JWT — base64url-encode header and payload, HMAC-SHA256 sign the `header.payload` string, base64url-encode the signature:
```typescript
const header = base64urlEncode(
encoder.encode(JSON.stringify({ alg: "HS256", typ: "JWT" })),
);
const body = base64urlEncode(
encoder.encode(JSON.stringify(payload)),
);
const sigInput = encoder.encode(`${header}.${body}`);
const sig = await hmacSign(sigInput, secret);
return `${header}.${body}.${base64urlEncode(sig)}`;
```
Verification checks signature, then expiry. Token is scoped to a specific file — the handler validates `payload.fid === fileId` on every request.
Secret comes from the `WOPI_JWT_SECRET` env var. Default is `dev-wopi-secret-change-in-production` — the name is the reminder.
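Verification is the mirror image of signing; a sketch using Node's `crypto` for brevity (the server's version uses Web Crypto and returns a typed payload):

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Verify an HS256 JWT: recompute the signature over `header.payload`,
// compare in constant time, then check expiry.
function verifyJwt(token: string, secret: string, nowSeconds: number): Record<string, unknown> | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [header, body, sig] = parts;
  const expected = createHmac("sha256", secret)
    .update(`${header}.${body}`)
    .digest("base64url");
  const a = Buffer.from(sig);
  const b = Buffer.from(expected);
  if (a.length !== b.length || !timingSafeEqual(a, b)) return null;
  const payload = JSON.parse(Buffer.from(body, "base64url").toString());
  if (typeof payload.exp !== "number" || payload.exp <= nowSeconds) return null;
  return payload;
}
```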
---
## Lock Service
Locks live in Valkey (Redis-compatible) with a 30-minute TTL. The key format is `wopi:lock:{fileId}`.
From `server/wopi/lock.ts`:
```typescript
const LOCK_TTL_SECONDS = 30 * 60; // 30 minutes
const KEY_PREFIX = "wopi:lock:";
```
The lock service uses an injectable `LockStore` interface:
- **`ValkeyLockStore`** — production, uses ioredis
- **`InMemoryLockStore`** — in-memory Map for tests and local dev
Fallback chain — try Valkey, fall back to in-memory (good enough for local dev, you'd notice in production):
```typescript
function getStore(): LockStore {
if (!_store) {
try {
_store = new ValkeyLockStore();
} catch {
console.warn("WOPI lock: falling back to in-memory store");
_store = new InMemoryLockStore();
}
}
return _store;
}
```
Lock acquisition uses `SET NX EX` (set-if-not-exists with TTL) for atomicity. If the lock exists with the same lock ID, the TTL refreshes instead — Collabora does this "re-lock with same ID" thing and you have to handle it.
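The in-memory store mirrors those semantics; a sketch with an illustrative class shape (the Valkey store expresses the whole thing as one `SET key lockId NX EX ttl`):

```typescript
interface LockEntry { lockId: string; expiresAt: number }

// In-memory stand-in for SET NX EX: acquisition fails on a live
// conflicting lock, and re-locking with the same ID refreshes the TTL.
class InMemoryLocks {
  private locks = new Map<string, LockEntry>();

  acquire(fileId: string, lockId: string, ttlSeconds: number, nowMs: number): { ok: boolean; currentLockId?: string } {
    const existing = this.locks.get(fileId);
    if (existing && existing.expiresAt > nowMs && existing.lockId !== lockId) {
      return { ok: false, currentLockId: existing.lockId }; // -> 409 + X-WOPI-Lock
    }
    this.locks.set(fileId, { lockId, expiresAt: nowMs + ttlSeconds * 1000 });
    return { ok: true };
  }
}
```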
---
## Discovery Caching
Collabora publishes a discovery XML at `/hosting/discovery` that maps mimetypes to editor URLs. We cache it for 1 hour:
```typescript
const CACHE_TTL_MS = 60 * 60 * 1000; // 1 hour
```
Retries up to 3 times with exponential backoff (1s, 2s). The XML is parsed with regex — yes, regex for XML, but the discovery format is stable and an XML parser dependency isn't worth it for one endpoint. Pulls `<app name="mimetype">` blocks and extracts `<action name="..." urlsrc="..." />` entries.
If Collabora is down, the token endpoint returns `editor_url: null` and the UI shows an error. No crash.
Cache can be cleared with `clearDiscoveryCache()` for testing.
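A sketch of the regex approach with illustrative patterns — the real parser may handle attribute orderings this one doesn't:

```typescript
// Extract mimetype -> urlsrc pairs from Collabora's discovery XML.
// Takes the first <action> inside each <app> block; good enough for a
// sketch, not a general XML parser.
function parseDiscovery(xml: string): Map<string, string> {
  const actions = new Map<string, string>();
  const appRe = /<app[^>]*\bname="([^"]+)"[^>]*>([\s\S]*?)<\/app>/g;
  const actionRe = /<action[^>]*\burlsrc="([^"]+)"[^>]*\/?>/;
  for (const m of xml.matchAll(appRe)) {
    const [, mimetype, body] = m;
    const a = body.match(actionRe);
    if (a) actions.set(mimetype, a[1]);
  }
  return actions;
}
```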
---
## Iframe Sandbox
The iframe sandbox is as tight as Collabora allows:
```
sandbox="allow-scripts allow-same-origin allow-forms allow-popups allow-popups-to-escape-sandbox allow-downloads"
allow="clipboard-read *; clipboard-write *"
```
Every permission is load-bearing — remove any one of these and something breaks:
- `allow-scripts` — Collabora is a web app, needs JS
- `allow-same-origin` — Collabora's internal communication
- `allow-forms` — the initial form POST targets this iframe
- `allow-popups` — help/about dialogs
- `allow-popups-to-escape-sandbox` — those popups need full functionality
- `allow-downloads` — "Download as..." from within the editor
- `clipboard-read/write` — copy/paste