Canopy is a FastAPI backend for managing metadata and submission-tracking records for the Australian Tree of Life project.
This README is the central repository document. It gives the verified high-level picture of the application and links to more detailed maintainer documentation where the detail would otherwise make this file unwieldy.
The current codebase stores and serves metadata for:
- organisms
- taxonomy enrichment records
- samples and specimen-to-derived sample lineage
- experiments
- reads
- QC-read results
- assemblies, assembly runs, and assembly stage runs
- projects
- BPA initiatives
- genome notes
- users and refresh tokens
- broker submission attempts, events, and ToLID request state
Primary implementation paths:
- API app: app/main.py
- Versioned router: app/api/v1/api.py
- Runtime settings: app/core/settings.py
- Models: app/models/
- Schemas: app/schemas/
- Shared business logic: app/services/
- Migrations: alembic/versions/
Several entities use a main-table plus submission-table pattern.
- Main table: current application-facing record
- Submission table: staged payloads and submission lifecycle state for external workflows
This pattern is visible for:
- projects
- samples
- experiments
- QC reads
- assemblies
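The split between a main record and its submission record can be pictured with a small stdlib-only sketch. The class names, field names, and helper functions below are hypothetical illustrations, not the actual models in app/models/. The only lifecycle states used are draft and ready, which are the two this README names.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SampleRecord:
    """Main table row: the current application-facing record (hypothetical fields)."""
    id: int
    name: str


@dataclass
class SampleSubmission:
    """Submission table row: a staged payload plus lifecycle state (hypothetical fields)."""
    sample_id: int
    payload: dict[str, Any] = field(default_factory=dict)
    status: str = "draft"  # assumption: new submissions start as draft


def stage_submission(record: SampleRecord, payload: dict[str, Any]) -> SampleSubmission:
    """Stage a payload for an external workflow without touching the main record."""
    return SampleSubmission(sample_id=record.id, payload=dict(payload))


def mark_ready(submission: SampleSubmission) -> SampleSubmission:
    """Move a staged submission from draft to ready; reject any other starting state."""
    if submission.status != "draft":
        raise ValueError(f"cannot mark {submission.status!r} submission as ready")
    submission.status = "ready"
    return submission
```

The point of the separation is that the staged payload and its state can change without altering the record the application serves.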
The repo exposes broker endpoints under /api/v1/broker/... for claiming work, validating payload prerequisites, reporting outcomes, finalising failed attempts, and managing ToLID request state.
The currently routed claim/report surface is:
- /broker/claims/ready
- /broker/claims/entity
- /broker/claims/batch
- /broker/validation
- /broker/reports/{attempt_id}
- /broker/attempts/{attempt_id}/finalise
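A client needs absolute URLs for these routes, so the following sketch builds them from the path templates listed above. The dictionary keys, the broker_url helper, and the base URL used in the usage note are assumptions made for illustration. HTTP methods and request bodies are left out because this README does not state them.

```python
from urllib.parse import quote

API_PREFIX = "/api/v1"

# Path templates copied from the routed claim/report surface in this README.
# The dictionary keys are hypothetical labels chosen for this sketch.
BROKER_PATHS = {
    "claims_ready": "/broker/claims/ready",
    "claims_entity": "/broker/claims/entity",
    "claims_batch": "/broker/claims/batch",
    "validation": "/broker/validation",
    "report": "/broker/reports/{attempt_id}",
    "finalise": "/broker/attempts/{attempt_id}/finalise",
}


def broker_url(base_url: str, name: str, **params: str) -> str:
    """Build an absolute URL for one broker route, URL-encoding path parameters."""
    template = BROKER_PATHS[name]
    encoded = {key: quote(str(value), safe="") for key, value in params.items()}
    # str.format raises KeyError if a required path parameter is missing.
    return base_url.rstrip("/") + API_PREFIX + template.format(**encoded)
```

For example, broker_url("http://localhost:8000", "finalise", attempt_id="abc-123") yields http://localhost:8000/api/v1/broker/attempts/abc-123/finalise.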
Detailed broker lifecycle notes are in docs/handover/broker_and_submission_flows.md.
The current assembly flow is centred on:
- creating an assembly intent
- generating and storing an assembly manifest
- registering assembly pipeline runs by GitHub repo and commit
- reporting QC-read outputs
- reporting per-stage results
Detailed assembly instructions are in docs/assembly_reporting_api.md.
- docs/assembly_reporting_api.md
- docs/auth_refresh_tokens.md
- docs/bulk_import_api.md
- docs/migration_workflow.md
- docs/ncbi_taxonomy_sync.md
- docs/tolid_broker_api.md
- docs/handover/system_overview.md
- docs/handover/setup_and_operations.md
- docs/handover/broker_and_submission_flows.md
- docs/handover/config_reference.md
- docs/handover/troubleshooting.md
- docs/handover/open_questions.md
- Docker and Docker Compose for the container-based local path
- or uv plus a reachable PostgreSQL instance for the non-Docker local path
Settings are defined in app/core/settings.py.
At application import time, the code requires:
- JWT_SECRET_KEY
- JWT_ALGORITHM
- DATABASE_URI, or enough POSTGRES_* values for DATABASE_URI to be derived
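The import-time requirement can be checked before starting the app with a sketch like the one below. It is not the logic in app/core/settings.py. The specific POSTGRES_* variable names and the postgresql:// scheme are assumptions, since this README only says that enough POSTGRES_* values must be present.

```python
from typing import Mapping
from urllib.parse import quote

REQUIRED = ("JWT_SECRET_KEY", "JWT_ALGORITHM")

# Assumed variable names: the README only refers to "POSTGRES_*".
POSTGRES_KEYS = (
    "POSTGRES_USER",
    "POSTGRES_PASSWORD",
    "POSTGRES_HOST",
    "POSTGRES_PORT",
    "POSTGRES_DB",
)


def resolve_database_uri(env: Mapping[str, str]) -> str:
    """Return DATABASE_URI, deriving it from POSTGRES_* values when it is not set."""
    missing = [key for key in REQUIRED if not env.get(key)]
    if missing:
        raise RuntimeError(f"missing required settings: {', '.join(missing)}")

    # An explicit DATABASE_URI takes precedence in this sketch.
    if env.get("DATABASE_URI"):
        return env["DATABASE_URI"]

    missing_pg = [key for key in POSTGRES_KEYS if not env.get(key)]
    if missing_pg:
        raise RuntimeError(f"cannot derive DATABASE_URI, missing: {', '.join(missing_pg)}")

    # Percent-encode credentials so characters such as "@" do not break the URI.
    user = quote(env["POSTGRES_USER"], safe="")
    password = quote(env["POSTGRES_PASSWORD"], safe="")
    host, port, name = env["POSTGRES_HOST"], env["POSTGRES_PORT"], env["POSTGRES_DB"]
    return f"postgresql://{user}:{password}@{host}:{port}/{name}"
```

Calling resolve_database_uri(os.environ) before launching gives an explicit error message instead of a failure at import.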
At container entrypoint time, scripts/entrypoint.sh requires DATABASE_URI to exist in the shell environment before it will run migrations.
The repository includes:
- .env.example
- docker-compose.yml
- Dockerfile
- scripts/entrypoint.sh
- Copy the template: cp .env.example .env
- Set the required values in .env.
- Start the stack: docker compose up --build
- Open the API docs: http://localhost:8000/api/v1/docs or http://localhost:8000/api/v1/redoc
The Docker Compose file maps:
- API: host 8000 -> container 8000
- Postgres: host 5433 -> container 5432
- Install dependencies: uv sync --dev --frozen
- Export the required environment variables.
- Apply migrations: uv run alembic upgrade head
- Start the API: uv run uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload

- SQLAlchemy session setup is in app/db/session.py.
- Alembic config is in alembic/env.py.
- Active revisions are in alembic/versions/.
- scripts/entrypoint.sh runs uv run alembic upgrade head before starting the app.
- schema.sql is maintained as a schema snapshot, not the runtime bootstrap mechanism.
For the repo-local migration workflow, see docs/migration_workflow.md.
The current authentication surface is:
- POST /api/v1/auth/login
- POST /api/v1/auth/refresh
- POST /api/v1/auth/logout
Access tokens are JWTs. Refresh tokens are stored as hashed values in the refresh_token table.
Detailed auth behaviour is in docs/auth_refresh_tokens.md.
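Storing only a hash of each refresh token means a leaked table does not expose usable tokens. The sketch below shows the general idea using SHA-256 and a constant-time comparison. The hash algorithm, token length, and function names are assumptions for illustration. The scheme Canopy actually uses is described in docs/auth_refresh_tokens.md.

```python
import hashlib
import hmac
import secrets


def hash_token(raw: str) -> str:
    """Hash a raw refresh token; SHA-256 is an assumption, not Canopy's confirmed choice."""
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def new_refresh_token() -> tuple[str, str]:
    """Return (raw_token, token_hash); only the hash would be persisted."""
    raw = secrets.token_urlsafe(32)
    return raw, hash_token(raw)


def token_matches(raw: str, stored_hash: str) -> bool:
    """Compare a presented token against the stored hash in constant time."""
    return hmac.compare_digest(hash_token(raw), stored_hash)
```

The raw token is handed to the client once, and later requests are verified by hashing the presented value and comparing it with the stored hash.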
- OpenAPI JSON: /api/v1/openapi.json
- Swagger UI: /api/v1/docs
- ReDoc: /api/v1/redoc
- Health: /health
- Version: /version
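A quick smoke check of these endpoints can be scripted with the standard library alone. The helper names are hypothetical, and the sketch assumes the API is reachable at a base URL such as http://localhost:8000, as in the Docker Compose mapping.

```python
import urllib.request

# Paths taken from the endpoint list in this README.
ENDPOINTS = {
    "openapi": "/api/v1/openapi.json",
    "swagger": "/api/v1/docs",
    "redoc": "/api/v1/redoc",
    "health": "/health",
    "version": "/version",
}


def endpoint_urls(base_url: str) -> dict[str, str]:
    """Map each endpoint label to an absolute URL under base_url."""
    base = base_url.rstrip("/")
    return {name: base + path for name, path in ENDPOINTS.items()}


def fetch_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status of a GET request.

    Raises urllib.error.URLError on connection failure, including
    urllib.error.HTTPError for 4xx and 5xx responses.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status
```

With the stack running, looping over endpoint_urls("http://localhost:8000").items() and calling fetch_status on each URL reports whether every endpoint responds.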
- scripts/entrypoint.sh: wait for DB, run migrations, start app
- scripts/create_user.py: create a user directly in the database
- scripts/expire_leases.py: expire broker leases directly from Python
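The wait-for-DB step can be approximated outside the container with a short Python sketch that polls a TCP port. This is only an analogue: scripts/entrypoint.sh is a shell script and this README does not describe how it waits. The function name, timeout, and polling interval are assumptions.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 1.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect only shows the port is accepting connections,
            # not that the database is ready to serve queries.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For the non-Docker path, wait_for_port("localhost", 5433) could gate uv run alembic upgrade head when Postgres is exposed on the Compose host port.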
The repository contains unit tests under tests/.
Run them with:
pytest -q

The repository does not by itself tell us:
- which broker API surface is used in production
- who or what promotes submission rows from draft to ready
- the full production runtime topology behind the GitHub Actions deployment workflows
- any external runbooks or operational conventions outside version control