Skip to content

feat(access): "hidden" sensitivity tombstone — exclude content from search/visibility#95

Merged
mruwnik merged 4 commits into
masterfrom
claude/u2-e19-a37dccd27312a1c0ffa3c8f64f01d888
Jun 5, 2026
Merged

feat(access): "hidden" sensitivity tombstone — exclude content from search/visibility#95
mruwnik merged 4 commits into
masterfrom
claude/u2-e19-a37dccd27312a1c0ffa3c8f64f01d888

Conversation

@mruwnik

@mruwnik mruwnik commented Jun 5, 2026

Copy link
Copy Markdown
Owner

Goal

Let an admin make an email account's content invisible to search and fetch, reversibly, by setting the account's sensitivity to a new tombstone value "hidden". The existing access-control reconciliation pipeline propagates the change to all existing messages + their Qdrant payloads for free.

"hidden" is a storable sensitivity that is not a rung on the role/sensitivity ladder: it's kept out of SensitivityLevel / ALLOWED_SENSITIVITIES (so generic content-create paths can't set it), settable only via the email-account API, and actively excluded before every admin bypass — so even a superadmin can't see it. Reversible: messages inherit sensitivity from the account, so basic → hidden → basic round-trips via re-inheritance.

Enforcement ("always-on, even for admins")

Mirrors the existing always-on Qdrant must_not. Applied before each admin short-circuit:

  • access_control.apply_access_filter_to_querysensitivity != "hidden" before the access_filter=None early-out (BM25, final-merge, list/count).
  • embeddings.search_chunks — always-present must_not on sensitivity == "hidden".
  • access_control.user_can_access — deny before the admin check (read guard only; edit/delete stay open so admins can un-hide/remove).
  • access_control.get_accessible_source_item_by_filename — deny before admin short-circuit (file/attachment serving).
  • MCP core.fetchsensitivity != "hidden" in the fetch-by-id base query (id= + ids=).
  • MCP journal.can_access_journal_target — deny hidden source_item targets before the admin bypass (list_all read + add write).
  • MCP deadlines.accessible_source_items — exclude hidden in the base query.

search_bm25 now joins SourceItem unconditionally and always calls the access helper (the join is required for the tombstone filter on the admin path); this also removes the old needs_source_join / min_size=0 cross-join hazard.

Storage / API / frontend

  • source_item + email_accounts CheckConstraints widened to 5 values (model + Alembic migration 20260605_hidden_sensitivity). The other 7 sensitivity constraints stay 4-valued — only email accounts can mint hidden content today (documented extension point).
  • Email-account Literal + TS unions widened; EmailPanel gains a "Hidden (excluded from search)" option.

Review history

Hardened over 3 rounds of /review-loop (reviewer "Usul"). All seven enforcement surfaces above were closed in response — the recurring class was "admin short-circuit before the hidden check". All review comments resolved.

Known limitation (pre-existing, not closed here)

Mail ingested after an account is hidden stays at the inherited default until the next recent-tier access-control reconciliation runs — the same eventual-consistency window the whole inherited-AC system has for every sensitivity level. If "hidden" must be a hard guarantee for new mail, resolving inheritance at ingest is a follow-up.

Tests

Tombstone exclusion verified end-to-end at every closed surface (helper admin path, search_bm25 admin path, search_chunks must_not, user_can_access, file-serving, core.fetch, journal list_all/add, deadlines) plus DB-constraint and schema-validation tests. 680 passed in the combined sweep (--run-slow, qdrant-marked excluded — no container runtime).

🤖 Generated with Claude Code

…earch/visibility

Add a storable "hidden" sensitivity that makes an email account's content
invisible to search and fetch, reversibly, for EVERYONE including superadmins.
"hidden" is deliberately NOT a rung on the role/sensitivity ladder (kept out of
SensitivityLevel / ALLOWED_SENSITIVITIES) and is settable only via the
email-account API. Messages inherit it from the account, so basic->hidden->basic
round-trips via re-inheritance.

Enforcement is "always-on, even for admins" — applied before every admin bypass,
mirroring the existing always-on Qdrant must_not:
- access_control.apply_access_filter_to_query: filter sensitivity \!= "hidden"
  before the access_filter=None early-out (covers BM25, final-merge, list/count).
- embeddings.search_chunks: always-present must_not on sensitivity == "hidden".
- access_control.user_can_access: deny "hidden" before the admin check (read
  guard only; edit/delete stay open so admins can un-hide/remove).
- access_control.get_accessible_source_item_by_filename: deny "hidden" before
  the admin short-circuit (file/attachment serving).
- MCP core.fetch: sensitivity \!= "hidden" in the fetch-by-id base query.
- MCP journal.can_access_journal_target: deny hidden source_item targets before
  the admin bypass (list_all read + add write).
- MCP deadlines.accessible_source_items: exclude hidden in the base query.

bm25.search_bm25 now joins SourceItem unconditionally and always calls the
access helper (the join is required for the tombstone filter on the admin path);
this also removes the old needs_source_join/min_size=0 cross-join hazard.

Storage: widen the source_item and email_accounts CheckConstraints to 5 values
(model + Alembic migration). The other 7 sensitivity constraints stay 4-valued —
only email accounts can mint hidden content today (documented extension point).

API + frontend: widen the email-account Literal and the TS unions; add the
"Hidden (excluded from search)" option to EmailPanel.

Known limitation (pre-existing, not closed here): mail ingested AFTER an account
is hidden stays at the inherited default until the next recent-tier access-control
reconciliation runs — same eventual-consistency window the whole inherited-AC
system has for all sensitivity levels.

Tests: tombstone exclusion verified end-to-end at every closed surface
(apply_access_filter_to_query admin path, search_bm25 admin path, search_chunks
must_not, user_can_access, file-serving, core.fetch, journal list_all/add,
deadlines accessible_source_items) plus DB-constraint and schema-validation tests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment thread db/migrations/versions/20260605_hidden_sensitivity.py Outdated
Comment thread src/memory/api/MCP/servers/deadlines.py Outdated
Comment thread src/memory/api/email_accounts.py Outdated
Claude and others added 3 commits June 5, 2026 21:24
The check-queue tests (tests/memory/api/MCP/test_check.py,
tests/memory/api/check/conftest.py, test_router.py) import
fakeredis.aioredis, but fakeredis was never added to any requirements
file. CI's `pip install .[all]` therefore omits it, so the repo-wide
`pyright $(git ls-files '*.py')` linter step fails with three
reportMissingImports errors and skips the test step — the backend job
has been red on master since the check feature landed.

Scope note: this is pre-existing breakage unrelated to the hidden-
sensitivity change in this PR; fixing it here per the "red CI is yours
to fix" policy so the branch can go green. Verified with the pinned
ruff==0.11.10 / pyright==1.1.408 against a fresh `.[all]` install: 0
errors once fakeredis is present.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…sitivity values, simpler migration

Responds to maintainer review on PR #95:

1. Single fetch-by-id access gate (was: deadlines re-implemented access
   control with its own admin short-circuit + per-item user_can_access loop;
   core.fetch had a third variant). Add fetch_accessible_source_items() +
   build_mcp_user_access_filter() to MCP/access.py — one choke point that
   applies the caller's AccessFilter via the canonical
   apply_access_filter_to_query (creator / person / public / project-role AND
   the hidden tombstone, even on the admin path). Both deadlines (attach /
   upsert) and core.fetch now route through it. core.fetch keeps its
   token-derived filter, preserving the API-key scope-override property
   (test_core_access_filter) that user-scope recomputation would have lost.

2. Single source of truth for sensitivity values. Add StorableSensitivityLiteral
   (role ladder + "hidden") and STORABLE_SENSITIVITIES / the CHECK SQL to
   scopes.py — the low-level module the models already import. source_item and
   email_accounts CHECK constraints and the email-account API Literal now derive
   from it, so adding a storable value happens in one place. "hidden" stays out
   of the role-granted SensitivityLevelLiteral by design (no role grants it).

3. Simpler migration: inline the four op() calls instead of the
   table/constant/_recreate helper machinery.

Behaviour-preserving: the derived CHECK SQL is byte-identical to the literals;
the consolidated gate matches each call site's prior semantics. Mocked
core.fetch unit tests updated for the new (shorter) builder chain via a
self-returning query mock.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ncipals

The fetch-gate consolidation dropped core.fetch's original
`if access_filter is not None: if user is None ...: items = []` guard. With the
empty AccessFilter(conditions=[]) that get_current_user_access_filter() returns
for a token it can't resolve to a user (include_public=True by default), the
canonical gate would surface `public` items by id where the old code returned
nothing — a narrowing of fail-closed semantics for expired/deleted sessions
(public-only; the hidden tombstone is still excluded).

Restore the guard, scoped exactly as the original: `access_filter is not None
and user is None`. It only fires for non-admin tokens; the admin path
(access_filter is None) still tolerates a None user and resolves via the gate.
The original `user_id is None` sub-check is redundant here — fetch's UserProxy
is only ever built from a persisted User, whose id is non-None.

Adds test_fetch_unidentifiable_nonadmin_user_fails_closed. Found in /review-loop
round 1; round 2 converged with no further findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mruwnik mruwnik merged commit 2e371fa into master Jun 5, 2026
2 checks passed
@mruwnik mruwnik deleted the claude/u2-e19-a37dccd27312a1c0ffa3c8f64f01d888 branch June 5, 2026 22:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant