omne42/db-vfs

db-vfs

DB-backed virtual filesystem (DB-VFS) for service workloads.

The service-facing config file maps to db_vfs_service::policy::ServicePolicy; its core VFS subset projects to db_vfs_core::policy::VfsPolicy.

What it provides

  • Safety-first policy model: Permissions + Limits + Secrets + Traversal + Auth.
  • Tool-like operations: read, write, patch, delete, glob, grep.
  • Backends: SQLite (default/dev) and Postgres (Postgres-only build uses --no-default-features --features postgres).
  • Service policy loader accepts .toml, .json, .yaml, and .yml; the format must be explicit from the file extension.

db-vfs-service now exposes backend features explicitly:

  • default build: SQLite with bundled libsqlite (cargo run -p db-vfs-service -- ...)
  • Postgres-only build: cargo run -p db-vfs-service --no-default-features --features postgres -- --postgres ...
  • SQLite without bundled libsqlite: cargo run -p db-vfs-service --no-default-features --features sqlite -- ...

Quickstart (5 min)

  1. Create a local policy and token:

     cp policy.example.toml policy.local.toml
     export DB_VFS_TOKEN='dev-token-change-me'

  2. Enable local writes in policy.local.toml:

     [permissions]
     write = true

  3. Start SQLite service:

     cargo run -p db-vfs-service -- \
       --sqlite ./db-vfs.sqlite \
       --policy ./policy.local.toml \
       --listen 127.0.0.1:8080

  4. Verify write/read:

     curl -sS http://127.0.0.1:8080/v1/write \
       -H 'content-type: application/json' \
       -H "authorization: Bearer ${DB_VFS_TOKEN}" \
       -d '{"workspace_id":"w1","path":"docs/a.txt","content":"hello","expected_version":null}'

     curl -sS http://127.0.0.1:8080/v1/read \
       -H 'content-type: application/json' \
       -H "authorization: Bearer ${DB_VFS_TOKEN}" \
       -d '{"workspace_id":"w1","path":"docs/a.txt","start_line":null,"end_line":null}'

Verification Matrix

Local baseline gates:

  • cargo fmt --all
  • cargo test --workspace
  • cargo clippy --workspace --all-targets --all-features -- -D warnings

CI extends that baseline in two directions:

  • platform matrix: Linux, macOS, Windows
  • backend matrix: a dedicated Linux Postgres Integration job runs cargo test --workspace --all-features --locked with DB_VFS_TEST_POSTGRES_URL pointed at a live Postgres service

./scripts/gate.sh is the local superset of that baseline: it already runs cargo test --workspace --all-features --locked alongside the feature-profile checks. When DB_VFS_TEST_POSTGRES_URL is unset, the Postgres-backed tests in that pass self-skip their live backend path; when the env var is set, the same local gate exercises the live Postgres coverage that Linux CI enforces before merge.

When you touch Postgres-specific code, feature gating, or the validation matrix itself, mirror that coverage locally with:

DB_VFS_TEST_POSTGRES_URL=postgres://postgres:postgres@127.0.0.1:5432/db_vfs_ci \
  cargo test --workspace --all-features --locked

API field reference (minimal)

All endpoints are JSON POST and require:

  • content-type: application/json
  • authorization: Bearer <token> (unless --unsafe-no-auth)

| Endpoint | Request fields | Key response fields | Typical errors |
|---|---|---|---|
| /v1/read | workspace_id, path, start_line?, end_line? | requested_path, path, content, bytes_read, version | unauthorized, invalid_path, not_found |
| /v1/write | workspace_id, path, content, expected_version? | requested_path, path, bytes_written, created, version | conflict, file_too_large |
| /v1/patch | workspace_id, path, patch, expected_version | requested_path, path, bytes_written, version | patch, conflict, not_found, not_permitted |
| /v1/delete | workspace_id, path, expected_version?, ignore_missing? | requested_path, path, deleted | conflict, not_found |
| /v1/glob | workspace_id, pattern, path_prefix? | matches, truncated, public scan counters | not_permitted, timeout |
| /v1/grep | workspace_id, query, regex, glob?, path_prefix? | matches[], truncated, public scan counters | invalid_regex, not_permitted, timeout |

Error body:

{"code":"<stable_code>","message":"<human message>"}

workspace_id is a literal namespace, not a glob. It must be non-empty and must not contain whitespace, path separators, :, .., or *. The * character is reserved for auth allowed_workspaces pattern syntax. allowed_workspaces exact entries and trailing-* prefix literals must themselves satisfy the same workspace_id syntax, so impossible patterns such as team/ops or team:prod-* are rejected at startup instead of silently never matching.
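
The syntax rules above can be sketched as two small predicates. This is an illustration only, not the service's validator: the function names are hypothetical, "path separators" is assumed to mean both `/` and `\`, and a bare `*` pattern is assumed to be accepted because the security notes mention it as a (discouraged) broad scope.

```rust
/// Hypothetical helper: true when `id` follows the workspace_id rules above.
/// Assumption: "path separators" covers both '/' and '\\'.
fn is_valid_workspace_id(id: &str) -> bool {
    !id.is_empty()
        && !id.contains("..")
        && !id
            .chars()
            .any(|c| c.is_whitespace() || matches!(c, '/' | '\\' | ':' | '*'))
}

/// Hypothetical helper: true when an allowed_workspaces entry could ever match.
/// Exact entries and trailing-* prefix literals must be valid workspace_id text.
fn is_valid_allowed_pattern(pattern: &str) -> bool {
    match pattern.strip_suffix('*') {
        Some("") => true, // assumption: a bare "*" is accepted
        Some(prefix) => is_valid_workspace_id(prefix),
        None => is_valid_workspace_id(pattern),
    }
}
```

Under these assumptions, `team-a-*` passes while `team/ops` and `team:prod-*` fail, which mirrors the startup rejection described above.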

ignore_missing = true makes /v1/delete idempotent for absent targets by returning 200 {"deleted":false,...}.

For glob and grep, omitting path_prefix is only allowed when the request still has a safe literal scope. Exact-file patterns auto-scope to that one path; wildcard patterns auto-scope only to their longest literal directory prefix.
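
A rough sketch of how such auto-scoping could be derived from a pattern follows. The `Scope` type, the `auto_scope` name, and the set of characters treated as wildcards (`*`, `?`, `[`, `{`) are assumptions made for illustration; the service's real glob grammar may differ.

```rust
/// Hypothetical result of deriving a scan scope when path_prefix is omitted.
#[derive(Debug, PartialEq)]
enum Scope {
    /// The pattern has no wildcard, so it names exactly one path.
    ExactFile(String),
    /// Longest literal directory prefix before the first wildcard.
    DirPrefix(String),
    /// No literal directory prefix; such a request presumably needs path_prefix.
    Unscoped,
}

fn auto_scope(pattern: &str) -> Scope {
    // Assumption: these characters mark the start of a wildcard segment.
    let first_meta = pattern.find(|c: char| matches!(c, '*' | '?' | '[' | '{'));
    match first_meta {
        None => Scope::ExactFile(pattern.to_string()),
        Some(idx) => match pattern[..idx].rfind('/') {
            // Keep everything up to and including the last '/' before the wildcard.
            Some(slash) => Scope::DirPrefix(pattern[..=slash].to_string()),
            None => Scope::Unscoped,
        },
    }
}
```

With this sketch, `docs/api/*.md` scopes to `docs/api/`, `docs/v*/x.md` scopes to `docs/`, and `*.md` has no literal scope at all.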

Line-range read still enforces max_read_bytes on the returned slice. Without secret redaction rules, the store can stop after the requested range instead of materializing the whole file. When secret redaction rules are active, both the raw backing file and the redacted whole-file intermediate must fit within the same budget; otherwise the request fails with file_too_large before slice extraction. Line-oriented reads treat \n, \r\n, and lone \r as equivalent line boundaries, including mixed-ending files. The chunked no-redaction path derives progress conservatively from the byte budget, so multi-byte UTF-8 content cannot reinterpret max_read_bytes as an equally large character budget.
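
To make the line-boundary and byte-budget rules concrete, here is a minimal in-memory sketch. It is not the store's chunked implementation: `split_lines` and `read_range` are hypothetical names, and the 1-based inclusive interpretation of `start_line` / `end_line` is an assumption.

```rust
/// Splits text into lines, treating "\n", "\r\n", and a lone "\r" as
/// equivalent terminators (mixed endings included).
fn split_lines(text: &str) -> Vec<&str> {
    let bytes = text.as_bytes();
    let mut lines = Vec::new();
    let (mut start, mut i) = (0, 0);
    while i < bytes.len() {
        match bytes[i] {
            b'\n' => {
                lines.push(&text[start..i]);
                i += 1;
                start = i;
            }
            b'\r' => {
                lines.push(&text[start..i]);
                // A following '\n' belongs to the same "\r\n" terminator.
                i += if bytes.get(i + 1) == Some(&b'\n') { 2 } else { 1 };
                start = i;
            }
            _ => i += 1,
        }
    }
    if start < bytes.len() {
        lines.push(&text[start..]); // final line without a terminator
    }
    lines
}

/// Returns the requested lines (assumed 1-based, inclusive) or
/// "file_too_large" when the slice exceeds the byte budget.
fn read_range<'a>(
    text: &'a str,
    start_line: usize,
    end_line: usize,
    max_read_bytes: usize,
) -> Result<Vec<&'a str>, &'static str> {
    let skip = start_line.saturating_sub(1);
    let selected: Vec<&str> = split_lines(text)
        .into_iter()
        .skip(skip)
        .take(end_line.saturating_sub(skip))
        .collect();
    // The budget counts bytes, not characters, so multi-byte UTF-8 costs more.
    let bytes: usize = selected.iter().map(|line| line.len()).sum();
    if bytes > max_read_bytes {
        Err("file_too_large")
    } else {
        Ok(selected)
    }
}
```

Note that two lines holding one two-byte character each need a budget of 4, not 2, which is the distinction the last sentence above draws between bytes and characters.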

grep is line-oriented for both literal and regex queries. regex = true patterns that can consume \n or \r are rejected instead of silently behaving like whole-file regex search, and literal queries containing \n or \r short-circuit to no matches without forcing content loads. Line numbering and matches[].text follow the same \n / \r\n / lone \r line-boundary semantics, including files that mix those terminators.

patch is disabled whenever secrets.redact_regexes is active. Applying unified diffs against the raw backing text would otherwise turn patch context match/no-match into a secret oracle, so the service now returns not_permitted instead of pretending redacted files are safely patchable.

expected_version must be >= 1 whenever it is present. It is monotonic per (workspace_id, path) even across delete/recreate, so recreating a deleted file does not reset its version back to 1 and stale CAS tokens cannot hit a new file lifetime by accident.
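
The version rule can be modeled with a toy in-memory store. Everything here is a hypothetical sketch rather than the backend schema: the `Store` and `Entry` types are invented for illustration, a missing or mismatched `expected_version` target is assumed to report `conflict`, and delete is assumed not to bump the counter.

```rust
use std::collections::HashMap;

#[derive(Default)]
struct Entry {
    last_version: u64,       // kept across delete, so versions never restart at 1
    content: Option<String>, // None while the file is deleted
}

#[derive(Default)]
struct Store {
    files: HashMap<(String, String), Entry>,
}

impl Store {
    /// Writes content; a present expected_version must equal the live version.
    fn write(
        &mut self,
        ws: &str,
        path: &str,
        content: &str,
        expected_version: Option<u64>,
    ) -> Result<u64, &'static str> {
        let entry = self.files.entry((ws.to_string(), path.to_string())).or_default();
        let live = entry.content.as_ref().map(|_| entry.last_version);
        if expected_version.is_some() && expected_version != live {
            return Err("conflict"); // assumption: stale or missing target is a conflict
        }
        entry.last_version += 1;
        entry.content = Some(content.to_string());
        Ok(entry.last_version)
    }

    /// Deletes the file but keeps its version counter.
    fn delete(&mut self, ws: &str, path: &str, expected_version: Option<u64>) -> Result<(), &'static str> {
        let entry = self
            .files
            .get_mut(&(ws.to_string(), path.to_string()))
            .ok_or("not_found")?;
        if entry.content.is_none() {
            return Err("not_found");
        }
        if expected_version.is_some() && expected_version != Some(entry.last_version) {
            return Err("conflict");
        }
        entry.content = None;
        Ok(())
    }
}
```

In this model, writing, deleting, and recreating a file yields versions 1, 2, then 3, so a client still holding version 1 from the first lifetime cannot accidentally match the recreated file.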

Security Baseline

  • Keep auth enabled; avoid --unsafe-no-auth outside local isolated dev.
  • Prefer sha256:<64 hex> tokens or env-backed runtime tokens.
  • auth.tokens[*].token_env_var always carries the raw bearer token; only auth.tokens[*].token accepts a pre-hashed sha256:<64 hex> value.
  • A literal sha256:<64 hex> string in token_env_var is rejected at startup because it is not a valid Bearer token value.
  • If you use plaintext env-backed tokens, keep them valid HTTP Bearer tokens (token68 syntax; no whitespace or disallowed punctuation).
  • Scope tokens with allowed_workspaces (avoid broad * in production).
  • Trailing wildcard rules like team-a-* require at least one additional character after the prefix, so they do not also authorize the bare team-a- namespace by accident.
  • Use TLS/HTTPS end-to-end for bearer token transport.
  • Enable audit log with audit.required = true.
  • --trust-mode untrusted also refuses policy-side resource amplification beyond the service's default concurrency / DB caps and rejects scan configurations whose estimated in-flight memory footprint would exceed 512 MiB.
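
The allowed_workspaces matching rules in this list (exact entries, trailing-* prefixes that need at least one extra character) can be sketched as below. The function name is hypothetical and this is not the service's matcher; it only restates the documented behavior.

```rust
/// Hypothetical matcher: true when any allowed_workspaces entry authorizes
/// the given workspace_id.
fn workspace_allowed(patterns: &[&str], workspace_id: &str) -> bool {
    patterns.iter().any(|p| match p.strip_suffix('*') {
        // Trailing-* rule: at least one character must follow the prefix,
        // so "team-a-*" does not authorize the bare "team-a-" namespace.
        Some(prefix) => workspace_id.len() > prefix.len() && workspace_id.starts_with(prefix),
        // No wildcard: the entry must match exactly.
        None => *p == workspace_id,
    })
}
```

For example, `team-a-*` authorizes `team-a-prod` but neither `team-a-` nor `team-b-prod`, and an exact entry `w1` does not authorize `w10`.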

Performance Limits

Tune policy limits for your workload:

  • request bytes: max_read_bytes, max_write_bytes, max_patch_bytes
  • scan bounds: max_results, max_walk_files, max_walk_entries, max_walk_ms, max_line_bytes
  • concurrency: max_concurrency_io, max_concurrency_scan, max_db_connections
  • timeout/rate: max_io_ms, max_requests_per_ip_per_sec, max_requests_burst_per_ip

Budget semantics:

  • max_io_ms bounds non-scan requests (read/write/patch/delete), request-body buffering / JSON decode, and healthy DB pool wait/connect time.
  • max_io_ms must stay within backend session-timeout range (1..=2147483647 ms), so policy validation fails before SQLite/Postgres setup can reject it later.
  • The router body cap still keeps its hard limit, but it now reserves worst-case JSON string escape expansion for write / patch payloads so escape-heavy yet logically valid bodies are not rejected before decoded-size enforcement runs.
  • read / delete / glob / grep stay on a fixed 64 KiB JSON frontdoor cap instead of inheriting the larger write / patch transport budget.
  • Once the JSON body is buffered, the service preflights the top-level workspace_id before full request-schema deserialization and VFS execution, so token-valid requests aimed at a disallowed workspace fail early with 403 not_permitted without materializing large content / patch fields or paying the full parse/execute cost.
  • Omitting limits.max_walk_ms in policy config deserializes to the default Some(2000) scan budget.
  • max_walk_ms bounds scan execution (glob/grep); max_walk_ms = None keeps scan runtime unbounded while DB pool wait/connect plus backend lock / statement waits remain bounded by max_io_ms.
  • These budgets cap how long the service waits before returning. They do not forcibly stop every in-flight CPU path; non-cancelable work can still finish in the background after a 408 timeout, so clients must treat timeout responses as "status unknown".
  • If pooled checkout already carries a backend connect/health-check failure detail, the service surfaces that path as 500 {"code":"db","message":"internal error"} instead of pretending it was just a request timeout.
  • max_concurrency_io / max_concurrency_scan are acquired before request body buffering and JSON schema decode, so malformed or oversized bodies cannot bypass service saturation gates.
  • When audit.required = true, the originating request keeps its concurrency permit until append+flush completes.
  • The same request runtime budget also caps any remaining required-audit append+flush wait after VFS execution begins.
  • Unauthorized and rate-limited required-audit early rejects acquire the same frontdoor IO/scan permit class and spend the same max_io_ms budget before returning, so those fail-closed paths cannot bypass audit durability or concurrency accounting.
  • Service startup DB migrations also reuse max_io_ms for connect/lock budgeting, so startup cannot hang indefinitely under backend contention.
  • SQLite busy_timeout and Postgres statement_timeout/lock_timeout follow the active request or startup migration budget.
  • Scan requests still keep DB pool wait/connect bounded by max_io_ms even when max_walk_ms = None.
  • When secret redaction rules are active, size scan concurrency for up to 2 * max_read_bytes per in-flight scan because the service may hold both the original file content and a bounded redacted copy at once.
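
For capacity planning, the 2 * max_read_bytes guidance can be turned into back-of-the-envelope arithmetic. This is only a rough planning aid under stated assumptions: the function names are hypothetical, and the service's actual in-flight memory estimator (the one checked against 512 MiB under --trust-mode untrusted) is not documented here and may count additional buffers.

```rust
/// 512 MiB, the untrusted-mode ceiling mentioned in the security baseline.
const SCAN_MEMORY_CAP_BYTES: u64 = 512 * 1024 * 1024;

/// Rough estimate: each in-flight scan may hold the original content plus a
/// bounded redacted copy when redaction rules are active.
fn estimated_scan_memory(max_concurrency_scan: u64, max_read_bytes: u64, redaction_active: bool) -> u64 {
    let per_scan = if redaction_active {
        max_read_bytes.saturating_mul(2)
    } else {
        max_read_bytes
    };
    max_concurrency_scan.saturating_mul(per_scan)
}

/// Planning check only; not the service's validation logic.
fn fits_scan_memory_cap(max_concurrency_scan: u64, max_read_bytes: u64, redaction_active: bool) -> bool {
    estimated_scan_memory(max_concurrency_scan, max_read_bytes, redaction_active) <= SCAN_MEMORY_CAP_BYTES
}
```

As a worked example, 16 concurrent scans with an 8 MiB max_read_bytes and redaction enabled come to 16 * 2 * 8 MiB = 256 MiB, while 64 concurrent scans with the same settings come to 1024 MiB.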

Secrets semantics:

  • secrets.replacement must not contain control characters, so read line ranges and grep.matches[].text stay line-oriented.
  • db_vfs_core::redaction::SecretRedactor::from_rules() enforces the same replacement size/control-character bounds as VfsPolicy::validate(), so direct crate callers cannot bypass them.
  • ValidatedVfsPolicy::new() also proves that policy-derived secret/traversal matchers compile, so validated-policy constructor families do not defer matcher failures to runtime.
  • DbVfs::new_with_supplied_matchers_validated() and DbVfs::try_new_with_supplied_matchers_validated() are the canonical strict validated constructors for caller-supplied matchers. The older *_with_matchers_validated() names remain as deprecated compatibility aliases with the same fail-fast mismatch behavior. Use DbVfs::new_validated() when callers want policy-derived matchers instead of supplying pre-built ones.
  • Multi-line secret regexes are redacted with line structure preserved before ranged read slices or grep result lines are returned.
  • When redaction rules are active, grep evaluates literal/regex matches against that redacted line view instead of the hidden raw secret text, so masked content cannot still act as a match oracle.
  • grep and redaction-backed ranged read also budget redaction-expanded intermediates against max_read_bytes; over-budget redacted content is rejected or skipped as file_too_large.

Observability / Audit

  • x-request-id is accepted/echoed; invalid/missing IDs are replaced by service-generated IDs.
  • Optional JSONL audit via audit.jsonl_path.
  • Optional audit auto-recovers after sink failure by rotating the possibly corrupted JSONL file and starting a fresh worker; the failing event can still be lost, but later requests do not permanently run without audit.
  • Audit records include auth_subject="sha256:<64 hex>" whenever the service can derive a stable bearer-token fingerprint from the request, so successful requests, post-auth rejects, and syntactically valid unauthorized attempts can still be tied back to the same caller identity without writing raw tokens to disk.
  • With audit.required = true, audit runs fail-closed after startup: each request waits for its audit record to append+flush successfully, keeps its originating concurrency slot until that wait finishes, and uses the same request runtime budget for the required audit wait; worker loss or audit-budget exhaustion turns audited traffic into a visible availability failure instead of silently dropping events.
  • The same fail-closed permit retention also applies to early rejects that already consumed a request slot (for example invalid content type / JSON / schema, invalid workspace_id, or a disallowed workspace), so audited rejection paths cannot free concurrency before append+flush finishes.
  • Unauthorized and rate-limited early rejects also follow the fail-closed path when audit.required = true; they acquire a frontdoor permit, spend the remaining max_io_ms budget on append+flush, and release that permit as soon as the request-level audit wait ends instead of waiting for any background worker lifetime.
  • Required-audit queue saturation also fails closed with 503 audit_unavailable immediately instead of blocking indefinitely on the audit channel; the originating permit is released as soon as that bounded audit wait returns.
  • If required audit append/flush fails after startup, the service returns 503 audit_unavailable; the operation may already have completed, so clients should verify state before retrying writes. The same error is used when required audit cannot finish within the request's remaining runtime budget.
  • If a request returns 408 timeout but the background worker later settles, audit emits a second JSONL record with the same request_id and late_completion=true carrying the final settled result, so operators can distinguish “timed out and later succeeded/failed” from “timed out and never reconciled”.
  • Audit path/glob redaction is conservative for malformed or pattern-based secret-ish inputs too; values such as .env/../visible.txt, ".[en]nv", or control-character variants are masked as <secret> instead of being written through to JSONL. That masking is derived from the same policy-backed core::redaction matcher semantics used by runtime secret denial, so service-layer audit fields do not maintain a separate guessing rule set.
  • Early rejects (unauthorized/invalid JSON/rate-limited) are audited with workspace_id="<unknown>".
  • Service logs use tracing; configure via RUST_LOG.
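
The late_completion reconciliation described above could be consumed along these lines once the JSONL records are parsed. This is a hedged sketch: the `AuditRecord` struct and its `http_status` field are assumptions about the record shape (only request_id and late_completion are named in this document), and JSON parsing is left out.

```rust
use std::collections::HashMap;

/// Hypothetical parsed audit record; only request_id and late_completion are
/// field names taken from the text above.
struct AuditRecord {
    request_id: String,
    http_status: u16, // assumption: the record carries the response status
    late_completion: bool,
}

#[derive(Debug, PartialEq)]
enum TimeoutOutcome {
    /// A later record with late_completion=true reported this settled status.
    Reconciled(u16),
    /// Only the original 408 record exists.
    NeverReconciled,
}

/// Classifies every timed-out request by whether a late-completion record
/// with the same request_id followed it.
fn classify_timeouts(records: &[AuditRecord]) -> HashMap<String, TimeoutOutcome> {
    let mut out = HashMap::new();
    for r in records.iter().filter(|r| r.http_status == 408 && !r.late_completion) {
        out.insert(r.request_id.clone(), TimeoutOutcome::NeverReconciled);
    }
    for r in records.iter().filter(|r| r.late_completion) {
        if let Some(slot) = out.get_mut(&r.request_id) {
            *slot = TimeoutOutcome::Reconciled(r.http_status);
        }
    }
    out
}
```

Requests left in the `NeverReconciled` bucket are the ones an operator would want to verify by hand.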

Troubleshooting matrix

| HTTP | Common causes | First checks |
|---|---|---|
| 401 | missing/invalid token | Authorization, token hash/env var |
| 403 | workspace/policy denied | allowed_workspaces, permissions.*, secrets.deny_globs |
| 409 | stale CAS version | re-read latest version before retry |
| 408 | timeout budget exceeded (operation status may be unknown) | limits.max_io_ms, limits.max_walk_ms, DB latency, pool/lock wait |
| 500 | backend unavailable or pooled connection health/bootstrap failure | service logs, DB reachability/credentials, backend health |
| 503 | concurrency saturation or required audit unavailable | max_concurrency_*, max_db_connections, audit worker / audit.jsonl_path health |
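
On the client side, the matrix above suggests a simple status-to-action mapping. The enum and function below are hypothetical client code, not part of db-vfs; they only encode the guidance from this README (for instance, that 408 and 503 leave the operation status unknown, so state should be verified before retrying writes).

```rust
/// Hypothetical client-side reaction to a db-vfs HTTP status.
#[derive(Debug, PartialEq)]
enum NextStep {
    FixCredentials,         // 401: check the Authorization header and token
    CheckPolicy,            // 403: check allowed_workspaces / permissions / deny globs
    RereadThenRetry,        // 409: fetch the latest version, then retry with it
    VerifyStateBeforeRetry, // 408 / 503: the operation may already have completed
    CheckBackend,           // 500: inspect service logs and DB reachability
    Other,
}

fn next_step(status: u16) -> NextStep {
    match status {
        401 => NextStep::FixCredentials,
        403 => NextStep::CheckPolicy,
        409 => NextStep::RereadThenRetry,
        408 | 503 => NextStep::VerifyStateBeforeRetry,
        500 => NextStep::CheckBackend,
        _ => NextStep::Other,
    }
}
```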

More docs

  • Docs entrypoints: docs/README.md and docs/docs-system-map.md
  • Human docs (mdBook): docs/ (./scripts/docs.sh)
  • LLM bundle: llms.txt and docs/llms.txt (./scripts/llms.sh)
