Feat/annotation studio by joaocarvoli · Pull Request #86 · shemaobt/tripod-api

joaocarvoli · 2026-06-09T03:13:51Z

No description provided.

Nine as_* tables, every language_id FK retargeted to tripod languages.id (CASCADE). As* enums kept isolated from app.core.enums so the studio's pending/stored upload lifecycle never clashes with the oral-collector UploadStatus. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Ported from the studio interface schemas; reuses tripod LanguageResponse for the picker and drops is_seed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Collection engine ported into tripod's async-SQLAlchemy style: speakers, Tier A/B/C, export bundler, readiness, results, dashboard. naming.py and export_plan.py are pure and verbatim to preserve the CSV/zip contract. storage.py reuses the oral-collector GCS presign pattern, keys prefixed annotation-studio/ in the shared bucket. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Eight routers (speakers, Tier A/B/C, export, results, languages, audio) + _deps.py guards (require_app_access / require_role). Mounted at /api/annotation-studio (40 routes). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Migration creates only the as_* tables and seeds the App row + admin/facilitator roles idempotently. Default access-request role is facilitator. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
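The idempotent seed described above can be illustrated with a minimal sketch. It is not the PR's Alembic migration: the `apps` and `roles` tables, their columns, and the helper names are hypothetical, and SQLite's `INSERT OR IGNORE` stands in for whatever conflict handling the real migration uses (on PostgreSQL the equivalent would be `ON CONFLICT DO NOTHING`).

```python
import sqlite3

APP_KEY = "annotation-studio"
ROLES = ("admin", "facilitator")


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with hypothetical apps/roles tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE apps (key TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE roles (app_key TEXT, role_key TEXT, UNIQUE (app_key, role_key))"
    )
    return conn


def seed_app_and_roles(conn: sqlite3.Connection) -> None:
    """Insert the app row and its roles only when they are missing."""
    # The uniqueness constraints make a second run a no-op instead of an error.
    conn.execute("INSERT OR IGNORE INTO apps (key) VALUES (?)", (APP_KEY,))
    for role in ROLES:
        conn.execute(
            "INSERT OR IGNORE INTO roles (app_key, role_key) VALUES (?, ?)",
            (APP_KEY, role),
        )
    conn.commit()
```

Running `seed_app_and_roles` twice on the same connection leaves one app row and two role rows, which is the property that makes re-running the migration safe.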

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Isolate experiment audio in its own bucket (annotation-studio-audio) instead of sharing oral-collector's tripod-image-uploads. Its CORS is managed independently. Drops the now-redundant annotation-studio/ key prefix so object names are the logical naming.py keys directly. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Wrap long lines + ruff-format; contextlib.suppress for the GCS delete; type annotations on the export helpers (_config_hint/_assemble_zip/_gather); type:ignore for the GCS-stub Any returns and the generic get_or_404 model.id; Row→tuple comprehension in tier_c; avoid rebinding the CurrentUser _ in set_reference. No behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

readiness.tier_b reported only `pairs` (total created) + `recordings`, so the studio UI marked Tier B complete/export-ready even on empty pairs. Add `pairs_ready` = pairs where both sides have >= REPS_PER_SIDE (5) stored takes, mirroring Tier A's `words_ready`. The frontend gates its ready/percent on this field instead of `pairs`. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
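A rough sketch of the `pairs_ready` rule, assuming a flattened view of the recordings. The `Take` tuple, the `"a"`/`"b"` side labels, and `count_pairs_ready` are illustrative names, not the PR's code; only `REPS_PER_SIDE = 5` and the stored-takes condition come from the commit message.

```python
from collections import Counter
from typing import Iterable, NamedTuple

REPS_PER_SIDE = 5  # threshold named in the commit message


class Take(NamedTuple):
    """Hypothetical flattened row: one recording of one side of a pair."""

    pair_id: int
    side: str  # "a" or "b" (assumed labels)
    upload_status: str  # "pending" or "stored"


def count_pairs_ready(pair_ids: Iterable[int], takes: Iterable[Take]) -> int:
    """Count pairs whose two sides each have >= REPS_PER_SIDE stored takes."""
    # Only stored takes count; pending uploads do not make a pair ready.
    stored = Counter(
        (t.pair_id, t.side) for t in takes if t.upload_status == "stored"
    )
    return sum(
        1
        for pid in pair_ids
        if stored[(pid, "a")] >= REPS_PER_SIDE
        and stored[(pid, "b")] >= REPS_PER_SIDE
    )
```

A pair with no recordings still counts toward `pairs` but contributes nothing to `pairs_ready`, which is why the frontend gating on `pairs` alone reported empty pairs as complete.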

… & storage hardening) (#85)

* feat(db): add per-language membership table and indexes

Introduce as_language_members to scope annotation-studio facilitators to specific languages (admins and platform admins bypass it). The Alembic migration creates the table, seeds memberships for every current AS user across every currently-active language so no one is locked out on deploy, and adds composite (FK, upload_status) indexes on the tier recording/clip tables used by readiness and export filters.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(access): enforce per-language access across all routers

Add access helpers (assert_language_access, accessible_language_ids and the language_id resolvers) and call them on every annotation-studio data route: path routes check the URL language, by-id routes resolve the owning language from the resource. The dashboard now lists only languages the user may see. Closes the gap where any approved AS user could read or modify any language's data.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(members): admin endpoints to manage language members

Add admin-only endpoints to list, add (by email) and remove per-language facilitators, backed by member_service and the LanguageMember schemas, and mount the members router.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* perf(readiness): compute readiness with SQL aggregation

Replace the in-Python Counter/row-loading readiness computation with SQL COUNT/GROUP BY/HAVING aggregates. This avoids loading every recording, clip and sort row into memory and removes the per-language cost the dashboard multiplied across languages.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(storage): commit before deleting objects and cap upload size

Delete storage objects only after the database commit succeeds, so a failed commit can no longer leave a row pointing at a deleted file (delete word, recording, clip and export, plus the reference set/clear flows). Also enforce a maximum audio size at the upload "complete" step, deleting oversized objects instead of marking them stored.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(annotation-studio): cover access control, readiness and members

Add service-level tests for assert_language_access, the audio storage-key resolver, readiness aggregation and member management, plus an end-to-end HTTP suite that drives the real ASGI stack to prove cross-language requests are rejected with 403 (and guards against a route forgetting the check).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* style(annotation-studio): apply ruff format

Run `ruff format` on the new annotation-studio modules and tests to satisfy the CI format check (`ruff format --check`). Formatting only; no behavior change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
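The commit-before-delete ordering from the storage fix can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: `commit_then_delete` and its callable parameters are hypothetical stand-ins for the session commit and the storage delete call.

```python
from typing import Callable, Iterable


def commit_then_delete(
    commit: Callable[[], None],
    delete_object: Callable[[str], None],
    keys: Iterable[str],
) -> list[str]:
    """Commit first; delete storage objects only if the commit succeeded.

    Returns the keys whose deletion failed, so a caller could log or retry them.
    """
    pending = list(keys)
    commit()  # if this raises, no storage object has been touched
    failed: list[str] = []
    for key in pending:
        try:
            delete_object(key)
        except Exception:
            # The row is already gone, so the worst case is an orphaned
            # object, never a row pointing at a deleted file.
            failed.append(key)
    return failed
```

The trade-off in this ordering is that a failed delete leaves an unreferenced object in the bucket, which is recoverable, whereas the reverse order could leave a database row pointing at audio that no longer exists.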

Resolve conflicts for PR #86 (feat/annotation-studio → main). The annotation-studio module existed independently on both branches with byte-identical content at this branch's pre-#85 base, so every conflicted module file was resolved to this branch's version — the superset that adds the #85 per-language access control, persistence, readiness and storage changes without dropping anything from main (which only equalled the base for these files). main's unrelated changes (oral-collector, bhsa, storage, scripts, tests) merge in cleanly. Verified on the merged tree: single alembic head (20260608_0001), ruff format/check clean, mypy clean (429 files), pytest 677 passed / 2 skipped. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

joaocarvoli and others added 10 commits June 3, 2026 17:59

Add annotation-studio Pydantic schemas

6fdc118

Add annotation-studio API routers

c55a358

Add annotation-studio migration and app/role seed

3909aa5

Add annotation-studio CORS origin

770a7c8

joaocarvoli self-assigned this Jun 9, 2026

joaocarvoli merged commit 8dd5061 into main Jun 9, 2026
4 checks passed

joaocarvoli merged 11 commits into main from feat/annotation-studio
