Feat/annotation studio #86
Merged
Nine as_* tables, every language_id FK retargeted to tripod languages.id (CASCADE). As* enums kept isolated from app.core.enums so the studio's pending/stored upload lifecycle never clashes with the oral-collector UploadStatus. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Ported from the studio interface schemas; reuses tripod LanguageResponse for the picker and drops is_seed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Collection engine ported into tripod's async-SQLAlchemy style: speakers, Tier A/B/C, export bundler, readiness, results, dashboard. naming.py and export_plan.py are pure and verbatim to preserve the CSV/zip contract. storage.py reuses the oral-collector GCS presign pattern, keys prefixed annotation-studio/ in the shared bucket. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Eight routers (speakers, Tier A/B/C, export, results, languages, audio) + _deps.py guards (require_app_access / require_role). Mounted at /api/annotation-studio (40 routes). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Migration creates only the as_* tables and seeds the App row + admin/facilitator roles idempotently. Default access-request role is facilitator. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
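The idempotent seeding described above can be illustrated with a minimal sketch. This is not the actual Alembic migration: it uses the standard-library `sqlite3` module and SQLite's `INSERT OR IGNORE` as a stand-in, and the `apps` and `roles` tables, their columns, and the `annotation-studio` key are hypothetical names chosen for illustration. Only the admin/facilitator role names and the facilitator default come from the commit message.

```python
import sqlite3

APP_KEY = "annotation-studio"  # hypothetical app key, for illustration only


def seed_app_and_roles(conn: sqlite3.Connection) -> None:
    """Seed the app row and its two roles; safe to run more than once."""
    # Hypothetical schema: the real tables live in tripod's database.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS apps ("
        "key TEXT PRIMARY KEY, default_role TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS roles ("
        "app_key TEXT NOT NULL, name TEXT NOT NULL, "
        "PRIMARY KEY (app_key, name))"
    )
    # INSERT OR IGNORE skips rows that would violate the primary key,
    # so a second run changes nothing.
    conn.execute(
        "INSERT OR IGNORE INTO apps (key, default_role) VALUES (?, ?)",
        (APP_KEY, "facilitator"),
    )
    conn.executemany(
        "INSERT OR IGNORE INTO roles (app_key, name) VALUES (?, ?)",
        [(APP_KEY, "admin"), (APP_KEY, "facilitator")],
    )
    conn.commit()
```

Running the function twice against the same connection leaves one app row and two role rows, which is the property the migration relies on when it is re-applied.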
Isolate experiment audio in its own bucket (annotation-studio-audio) instead of sharing oral-collector's tripod-image-uploads. Its CORS is managed independently. Drops the now-redundant annotation-studio/ key prefix so object names are the logical naming.py keys directly. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Wrap long lines + ruff-format; contextlib.suppress for the GCS delete; type annotations on the export helpers (_config_hint/_assemble_zip/_gather); type:ignore for the GCS-stub Any returns and the generic get_or_404 model.id; Row→tuple comprehension in tier_c; avoid rebinding the CurrentUser _ in set_reference. No behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
readiness.tier_b reported only `pairs` (total created) + `recordings`, so the studio UI marked Tier B complete/export-ready even on empty pairs. Add `pairs_ready` = pairs where both sides have >= REPS_PER_SIDE (5) stored takes, mirroring Tier A's `words_ready`. The frontend gates its ready/percent on this field instead of `pairs`. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
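A minimal sketch of the `pairs_ready` rule described above, written in plain Python for readability. The `Take` record, the `"a"`/`"b"` side labels and the `tier_b_readiness` function are hypothetical names, and counting only stored takes under `recordings` is an assumption; `REPS_PER_SIDE = 5` and the pending/stored statuses come from the commit messages.

```python
from collections import Counter
from typing import Iterable, NamedTuple

REPS_PER_SIDE = 5  # threshold stated in the commit message


class Take(NamedTuple):
    pair_id: int
    side: str           # "a" or "b" (hypothetical labels)
    upload_status: str  # "pending" or "stored"


def tier_b_readiness(pair_ids: list[int], takes: Iterable[Take]) -> dict[str, int]:
    """Count pairs whose two sides each have enough stored takes."""
    # Only stored takes count toward readiness; pending uploads are ignored.
    stored = Counter(
        (t.pair_id, t.side) for t in takes if t.upload_status == "stored"
    )
    pairs_ready = sum(
        1
        for pair_id in pair_ids
        if stored[(pair_id, "a")] >= REPS_PER_SIDE
        and stored[(pair_id, "b")] >= REPS_PER_SIDE
    )
    return {
        "pairs": len(pair_ids),            # total created, ready or not
        "pairs_ready": pairs_ready,        # what the UI should gate on
        "recordings": sum(stored.values()),  # assumption: stored takes only
    }
```

With three pairs where only one has five stored takes on both sides, `pairs` is 3 but `pairs_ready` is 1, so a UI gated on `pairs_ready` no longer reports empty pairs as complete.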
… & storage hardening) (#85)

* feat(db): add per-language membership table and indexes

  Introduce as_language_members to scope annotation-studio facilitators to specific languages (admins and platform admins bypass it). The Alembic migration creates the table, seeds memberships for every current AS user across every currently-active language so no one is locked out on deploy, and adds composite (FK, upload_status) indexes on the tier recording/clip tables used by readiness and export filters.

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(access): enforce per-language access across all routers

  Add access helpers (assert_language_access, accessible_language_ids and the language_id resolvers) and call them on every annotation-studio data route: path routes check the URL language, by-id routes resolve the owning language from the resource. The dashboard now lists only languages the user may see. Closes the gap where any approved AS user could read or modify any language's data.

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(members): admin endpoints to manage language members

  Add admin-only endpoints to list, add (by email) and remove per-language facilitators, backed by member_service and the LanguageMember schemas, and mount the members router.

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* perf(readiness): compute readiness with SQL aggregation

  Replace the in-Python Counter/row-loading readiness computation with SQL COUNT/GROUP BY/HAVING aggregates. This avoids loading every recording, clip and sort row into memory and removes the per-language cost the dashboard multiplied across languages.

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(storage): commit before deleting objects and cap upload size

  Delete storage objects only after the database commit succeeds, so a failed commit can no longer leave a row pointing at a deleted file (delete word, recording, clip and export, plus the reference set/clear flows). Also enforce a maximum audio size at the upload "complete" step, deleting oversized objects instead of marking them stored.

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(annotation-studio): cover access control, readiness and members

  Add service-level tests for assert_language_access, the audio storage-key resolver, readiness aggregation and member management, plus an end-to-end HTTP suite that drives the real ASGI stack to prove cross-language requests are rejected with 403 (and guards against a route forgetting the check).

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* style(annotation-studio): apply ruff format

  Run `ruff format` on the new annotation-studio modules and tests to satisfy the CI format check (`ruff format --check`). Formatting only; no behavior change.

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
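The commit-before-delete ordering from the fix(storage) commit above can be sketched as follows. The `InMemoryDb` and `InMemoryStore` classes and the `delete_recording` function are hypothetical stand-ins for the async SQLAlchemy session and the GCS client, not the project's code; the use of `contextlib.suppress` around the object delete mirrors what an earlier commit mentions.

```python
import contextlib


class InMemoryDb:
    """Hypothetical stand-in for a database session (row id -> storage key)."""

    def __init__(self, rows: dict[int, str], fail_commit: bool = False) -> None:
        self.rows = dict(rows)
        self.fail_commit = fail_commit
        self._pending: set[int] = set()

    def delete(self, row_id: int) -> None:
        self._pending.add(row_id)  # staged, not yet durable

    def commit(self) -> None:
        if self.fail_commit:
            self._pending.clear()  # behaves like a rollback
            raise RuntimeError("commit failed")
        for row_id in self._pending:
            self.rows.pop(row_id, None)
        self._pending.clear()


class InMemoryStore:
    """Hypothetical stand-in for the object-storage bucket."""

    def __init__(self, keys: set[str]) -> None:
        self.keys = set(keys)

    def delete(self, key: str) -> None:
        self.keys.discard(key)


def delete_recording(db: InMemoryDb, store: InMemoryStore, row_id: int) -> None:
    """Remove the row first; touch storage only once the commit has succeeded."""
    storage_key = db.rows[row_id]
    db.delete(row_id)
    db.commit()  # if this raises, the object below is never deleted
    # A failed object delete leaves an orphaned file, which is harmless
    # compared with a row that points at a missing file.
    with contextlib.suppress(Exception):
        store.delete(storage_key)
```

If the commit raises, both the row and the object survive and the operation can be retried; the reverse ordering would have deleted the object while the row still referenced it.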
Resolve conflicts for PR #86 (feat/annotation-studio → main). The annotation-studio module existed independently on both branches with byte-identical content at this branch's pre-#85 base, so every conflicted module file was resolved to this branch's version — the superset that adds the #85 per-language access control, persistence, readiness and storage changes without dropping anything from main (which only equalled the base for these files). main's unrelated changes (oral-collector, bhsa, storage, scripts, tests) merge in cleanly. Verified on the merged tree: single alembic head (20260608_0001), ruff format/check clean, mypy clean (429 files), pytest 677 passed / 2 skipped. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
No description provided.