Skip to content

fix(oc): reject duplicate recording titles per project (ENG-71)#83

Open
henokteixeira wants to merge 2 commits into
mainfrom
henok/eng-71-new-recordings-with-the-same-title-silently-overwrite
Open

fix(oc): reject duplicate recording titles per project (ENG-71)#83
henokteixeira wants to merge 2 commits into
mainfrom
henok/eng-71-new-recordings-with-the-same-title-silently-overwrite

Conversation

@henokteixeira

Copy link
Copy Markdown
Contributor

Summary

🤖 Generated with Nori

Fixes ENG-71 (Urgent). POST /api/oc/recordings silently de-duplicated by (project_id, title) — it returned the existing recording instead of creating a new one. The client then requested an upload URL for that old recording_id, whose GCS path is deterministic, so the PUT overwrote the original blob. The old recording kept its metadata but now held the new audio — silent, unrecoverable data loss.

  • create_recording and update_recording now reject a duplicate title in the same project with ConflictError409 {detail, code: "CONFLICT"} (via the existing global handler — routers stay thin).
  • Match is trimmed, case-sensitive, exact. Blank/whitespace titles normalize to NULL and are exempt. Split-derived (split_from_id set) and archived-after-split rows are exempt (they legitimately reuse a parent title).
  • GET /api/oc/recordings gains an optional title filter, used by the client's save-time pre-check.

No schema change → no migration in this PR.

Known limitations / fast-follows

  • Concurrency race (deferred DB hardening). The uniqueness check is service-layer only (SELECT-then-INSERT, no row lock/constraint), so two concurrent creates of the same title in one project could still both insert. This is far narrower than the original always-overwrite bug (which the service check fully closes for the reported single-user flow), but it is reachable across two devices/users. Recommended fast-follow: a partial unique index plus an IntegrityErrorConflictError backstop, gated on a pre-deploy audit (run against prod first; expect ~0 rows since the old dedup prevented exact duplicates):
    SELECT project_id, btrim(title) AS norm, count(*), array_agg(id)
    FROM oc_recordings
    WHERE title IS NOT NULL AND btrim(title) <> ''
      AND split_from_id IS NULL AND splitting_status <> 'archived_after_split'
    GROUP BY project_id, btrim(title) HAVING count(*) > 1;
    op.create_index(
        "uq_oc_recordings_project_title", "oc_recordings", ["project_id", "title"],
        unique=True, postgresql_where=sa.text("split_from_id IS NULL"),
    )
  • Case-sensitivity is exercised in tests on SQLite (ASCII = is case-sensitive there) and relies on Postgres' default collation in prod; the fast-follow index makes it explicit.
  • The client-side 409 UX (conflict status + rename dialog + save-time pre-check) lands in a separate oral-collector PR.

Test Plan

  • uv run pytest tests/test_oc_recording_service.py38 passed (12 new ENG-71 cases: create/update reject, trim-on-store, case-sensitive, blank→NULL, per-project scoping allowed, split-child & archived-parent exemption, list title filter)
  • Full suite uv run pytest665 passed, 2 skipped
  • ruff check + ruff format --check + mypy clean
  • Staging: create a 2nd recording with an existing title in the same project → returns 409 (not 200/silent); original blob untouched
  • Staging: PATCH-rename a recording to an existing title → 409, no 500

POST /api/oc/recordings silently de-duplicated by (project_id, title),
returning the existing recording; the client then uploaded to that old
recording's deterministic GCS path, overwriting the original blob with no
warning (silent data loss).

create_recording and update_recording now reject a duplicate title in the
same project with ConflictError (409). Match is trimmed, case-sensitive and
exact; blank titles normalize to NULL and stay exempt; split-derived and
archived-after-split rows are exempt. list_recordings gains a title filter
for the client save-time pre-check.

🤖 Generated with [Nori](https://noriagentic.com)

Co-Authored-By: Nori <contact@tilework.tech>
Adds a partial unique index on oc_recordings (project_id, title), exempting
split-derived rows (split_from_id IS NOT NULL) and archived split parents
(splitting_status = 'archived_after_split') — matching the service-layer
check. NULL titles may repeat. This closes the concurrent-write race that the
service-level check alone leaves open (two simultaneous creates of the same
title in one project).

create_recording and update_recording now also catch IntegrityError and raise
ConflictError, so a race that slips past the pre-check returns a 409, not a
500.

The index is mirrored on the ORM model (so the SQLite test DB enforces it and
the in-place tests are real) and added in prod via Alembic migration
20260601_0001. That migration carries the pre-deploy duplicate-audit SQL in
its docstring and must be run by someone with database access — it is not run
or verified here.

Part of ENG-71 (backend).

🤖 Generated with [Nori](https://noriagentic.com)

Co-Authored-By: Nori <contact@tilework.tech>
@henokteixeira henokteixeira requested a review from joaocarvoli June 2, 2026 00:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant