- Cloudflare Worker (Hono) serves API + static SPA
- Queue consumer handles asynchronous ingestion
- Data planes:
  - Vectorize: embeddings + lightweight metadata
  - KV: fast hydration + request telemetry + v1 progress states
  - D1: v2 relational notebook/source/chunk/job state
  - R2: raw uploaded source files
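
As a rough illustration of the asynchronous ingestion step, the sketch below processes a batch of queue messages, acknowledging those that ingest cleanly and retrying those that fail. The `QueueMessage` interface is a local stand-in that mirrors the `body` / `ack()` / `retry()` shape of Cloudflare Queues messages; `IngestJob`, its `sourceId` field, and `consumeBatch` are hypothetical names, not taken from the project.

```typescript
// Local stand-in for the Cloudflare Queues message shape (body, ack, retry).
interface QueueMessage<T> {
  body: T;
  ack(): void;
  retry(): void;
}

// Hypothetical job payload; the real message schema is not documented here.
interface IngestJob {
  sourceId: string;
}

// Process each message independently so one failure does not block the batch.
async function consumeBatch(
  messages: QueueMessage<IngestJob>[],
  ingest: (job: IngestJob) => Promise<void>,
): Promise<void> {
  for (const msg of messages) {
    try {
      await ingest(msg.body);
      msg.ack(); // ingestion succeeded: do not redeliver
    } catch {
      msg.retry(); // ingestion failed: ask the queue to redeliver
    }
  }
}
```

In a Worker, a function like this could be called from the `queue` handler with `batch.messages`; per-message acknowledgement keeps a single bad source from forcing the whole batch to be redelivered.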
- Primary primitive: conversation archive
- Storage pattern:
  - full conversation text in KV (`conv:*`)
  - vector metadata in Vectorize
- Ingestion supports JSON array/object conversation exports
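
The v1 storage split can be sketched as follows: the full text goes to KV under a `conv:` key, while the vector record carries only a pointer and light metadata. The `Conversation` fields, the `kvKey` / `title` metadata names, and the rule that a single object is treated as a one-element export are all assumptions made for illustration.

```typescript
// Hypothetical conversation shape; the real export schema may differ.
interface Conversation {
  id: string;
  title: string;
  text: string;
}

interface PlannedWrites {
  kv: { key: string; value: string };
  vector: { id: string; values: number[]; metadata: { kvKey: string; title: string } };
}

// Accept either a JSON array or a single object (assumed to be one conversation).
function normalizeExport(input: Conversation | Conversation[]): Conversation[] {
  return Array.isArray(input) ? input : [input];
}

// KV key under the conv:* prefix.
function convKey(id: string): string {
  return `conv:${id}`;
}

// Full text is planned for KV; the vector record only points back to it.
function planWrites(conv: Conversation, embedding: number[]): PlannedWrites {
  const kvKey = convKey(conv.id);
  return {
    kv: { key: kvKey, value: conv.text },
    vector: { id: conv.id, values: embedding, metadata: { kvKey, title: conv.title } },
  };
}
```

Keeping the text out of the vector metadata keeps each vector record small; a search hit is hydrated by reading the `kvKey` it points at.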
- Primary primitive: Knowledge Notebook (workspace scoped)
- Storage pattern:
  - notebooks/sources/chunks/jobs in D1
  - vectors in Vectorize with notebook/source/chunk metadata pointers
  - raw files in R2
- Ingestion supports parser-typed jobs from source registration
- Current parser coverage: `markdown`, `txt`, `chat_export` (+ NDJSON thread-compatible ingest path)
- Artifacts are persisted in D1 with `snapshot_hashes`; stale state is computed by comparing current source hashes.
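
A minimal sketch of the staleness check, assuming `snapshot_hashes` is stored as a map from source ID to content hash. The function names are hypothetical, and the rule that only sources present in the snapshot are compared (so newly added sources do not mark an artifact stale) is an assumption, not something the notes specify.

```typescript
// Assumed shape: source ID -> content hash captured when the artifact was built.
type HashMap = Record<string, string>;

// Return the IDs of snapshotted sources whose hash changed or which no longer exist.
function staleSources(snapshot: HashMap, current: HashMap): string[] {
  const stale: string[] = [];
  for (const sourceId of Object.keys(snapshot)) {
    if (current[sourceId] !== snapshot[sourceId]) {
      stale.push(sourceId); // changed content, or source removed (undefined)
    }
  }
  return stale;
}

// An artifact is stale if any source it was built from has drifted.
function isStale(snapshot: HashMap, current: HashMap): boolean {
  return staleSources(snapshot, current).length > 0;
}
```

Because the comparison is derived at read time, no background job has to flip a flag when a source changes.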
- request enters notebook scoped route
- embedding generated via Workers AI
- vector search in Vectorize
- notebook scope enforced via metadata filter + app-level guard
- fallback to D1 chunk retrieval when vector/generation path degrades
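
The request flow above can be sketched with injected dependencies so it runs without any Cloudflare bindings. `RetrievalDeps`, `Chunk`, and `retrieve` are hypothetical names; treating an empty vector result as a degraded path is an assumption, since the notes do not define what counts as degraded.

```typescript
interface Chunk {
  id: string;
  notebookId: string;
  text: string;
}

// Hypothetical seams standing in for Workers AI, Vectorize, and D1.
interface RetrievalDeps {
  embed(text: string): Promise<number[]>;
  vectorSearch(vector: number[], filter: { notebookId: string }): Promise<Chunk[]>;
  d1Chunks(notebookId: string, query: string): Promise<Chunk[]>;
}

interface RetrievalResult {
  chunks: Chunk[];
  path: "vector" | "d1_fallback";
}

async function retrieve(
  deps: RetrievalDeps,
  notebookId: string,
  query: string,
): Promise<RetrievalResult> {
  try {
    const vector = await deps.embed(query);
    // First line of defence: scope passed to the index as a metadata filter.
    const hits = await deps.vectorSearch(vector, { notebookId });
    // Second line of defence: app-level guard drops anything out of scope.
    const scoped = hits.filter((c) => c.notebookId === notebookId);
    if (scoped.length > 0) {
      return { chunks: scoped, path: "vector" };
    }
  } catch {
    // Embedding or vector search failed: fall through to D1.
  }
  return { chunks: await deps.d1Chunks(notebookId, query), path: "d1_fallback" };
}
```

The guard is deliberately redundant with the metadata filter: if the filter is misconfigured or the index returns stale metadata, chunks from another notebook still cannot reach the response.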
- Notebook deletion is a soft delete (`deleted_at`) in D1
- Sources under the notebook are soft-deleted together
- This avoids irreversible data loss and enables async cleanup workflows
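
One way to express the cascading soft delete is as a pair of parameterized `UPDATE` statements, built here as plain data so the sketch runs without a database. The `deleted_at` column comes from the notes above; the `notebooks` and `sources` table names, the `notebook_id` column, and the `softDeleteNotebook` helper are assumptions.

```typescript
interface Statement {
  sql: string;
  params: string[];
}

// Build the statements that mark a notebook and its sources as deleted.
// Rows already soft-deleted keep their original deleted_at timestamp.
function softDeleteNotebook(notebookId: string, nowIso: string): Statement[] {
  return [
    {
      sql: "UPDATE notebooks SET deleted_at = ? WHERE id = ? AND deleted_at IS NULL",
      params: [nowIso, notebookId],
    },
    {
      sql: "UPDATE sources SET deleted_at = ? WHERE notebook_id = ? AND deleted_at IS NULL",
      params: [nowIso, notebookId],
    },
  ];
}
```

In a Worker, each statement could be turned into a D1 prepared statement with `db.prepare(sql).bind(...params)` and the pair submitted through `db.batch([...])` so both updates are applied together. Reads would then need a `deleted_at IS NULL` condition, and an asynchronous cleanup job could later purge vectors and R2 objects for rows where `deleted_at` is set.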
- Module size cap: < 400 lines per source module
- Keep runtime dependency surface minimal (Hono only)
- Prefer deterministic, source-grounded outputs with explicit citations
- Keep OSS-safe boundaries: no secrets/PII artifacts in tracked files