Summary
Separate machine predictions from human identifications in exports and API so researchers see both side-by-side. Currently the export has a single determination that gets overwritten when a human verifies — losing the original ML prediction.
Design spec: docs/superpowers/specs/2026-04-07-export-fields-design.md (in ami-devops)
New Export Fields
Machine prediction fields
best_machine_prediction_name — taxon name from best Classification
best_machine_prediction_algorithm — algorithm name
best_machine_prediction_score — confidence score
Verification fields
verified_by — username of best (most recent non-withdrawn) identification's user
verified_by_count — count of non-withdrawn identifications
agreed_with_algorithm — algorithm name if human explicitly agreed with an ML prediction
determination_matches_machine_prediction — boolean: does determination taxon match best prediction taxon?
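A minimal sketch of how the verification fields could be derived for one occurrence. The `Identification` dataclass is a hypothetical stand-in for the real model, and two choices here are assumptions not stated above: `agreed_with_algorithm` is read from the best identification, and the match flag is false when there is no machine prediction.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Identification:
    # Hypothetical stand-in for the Identification model; field names are assumptions.
    username: str
    taxon: str
    created_at: datetime
    withdrawn: bool = False
    agreed_with_algorithm: Optional[str] = None  # set when the user agreed with an ML prediction


def verification_fields(
    identifications: List[Identification],
    determination_taxon: Optional[str],
    best_prediction_taxon: Optional[str],
) -> dict:
    """Derive the verification export fields for a single occurrence."""
    # Withdrawn identifications are ignored everywhere.
    active = [i for i in identifications if not i.withdrawn]
    # "Best" identification = most recent non-withdrawn one.
    best = max(active, key=lambda i: i.created_at, default=None)
    return {
        "verified_by": best.username if best else None,
        "verified_by_count": len(active),
        # Assumption: taken from the best identification only.
        "agreed_with_algorithm": best.agreed_with_algorithm if best else None,
        # Assumption: no machine prediction means no match.
        "determination_matches_machine_prediction": (
            best_prediction_taxon is not None
            and determination_taxon == best_prediction_taxon
        ),
    }
```

With one withdrawn and two active identifications, `verified_by_count` is 2 and `verified_by` is the user of the later active one.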
Detection/capture fields
best_detection_bbox — raw [x1, y1, x2, y2]
best_detection_source_image_url — public URL to original capture image
best_detection_occurrence_url — platform UI link to occurrence in context
Modifications
determination_score → set to null when determination comes from a human ID (ML score preserved in best_machine_prediction_score)
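The rule can be illustrated with a small sketch. The function and type names below are hypothetical and exist only to show the intended behaviour: a human identification replaces the taxon and clears the score, while the ML score remains available separately.

```python
from typing import NamedTuple, Optional


class Determination(NamedTuple):
    taxon: Optional[str]
    score: Optional[float]


def resolve_determination(
    prediction_taxon: Optional[str],
    prediction_score: Optional[float],
    human_taxon: Optional[str] = None,
) -> Determination:
    """Pick the determination for an occurrence (illustrative only)."""
    if human_taxon is not None:
        # A human ID carries no confidence score, so the score is cleared.
        return Determination(taxon=human_taxon, score=None)
    # No human ID: the determination is the ML prediction with its score.
    return Determination(taxon=prediction_taxon, score=prediction_score)
```

Because the prediction score is passed in rather than overwritten, a caller can still export it as best_machine_prediction_score after verification.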
API Changes
Add best_machine_prediction nested object to OccurrenceListSerializer — always populated regardless of verification status.
Implementation Plan
Step 1: Refactor update_occurrence_determination() (model layer)
File: ami/main/models.py
- Extract find_best_prediction() → Occurrence method returning best Classification (terminal-first, highest score)
- Extract find_best_identification() → Occurrence method returning most recent non-withdrawn Identification
- Update update_occurrence_determination() to call both; set determination_score = None for human IDs
- Have existing best_prediction cached_property delegate to find_best_prediction()
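The two selection rules can be sketched in plain Python. The dataclasses are simplified stand-ins for the Django models (field names are assumptions), and the functions take lists rather than being Occurrence methods, so this shows the ordering logic only, not the actual implementation.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass
class Classification:
    # Simplified stand-in; the real model has more fields.
    taxon: str
    score: float
    terminal: bool


@dataclass
class Identification:
    # Simplified stand-in; the real model has more fields.
    username: str
    taxon: str
    created_at: datetime
    withdrawn: bool = False


def find_best_prediction(classifications: Iterable[Classification]) -> Optional[Classification]:
    # Tuples compare element-wise, so any terminal classification outranks
    # every non-terminal one; score only breaks ties within each group.
    return max(classifications, key=lambda c: (c.terminal, c.score), default=None)


def find_best_identification(identifications: Iterable[Identification]) -> Optional[Identification]:
    # Most recent identification that has not been withdrawn.
    active = (i for i in identifications if not i.withdrawn)
    return max(active, key=lambda i: i.created_at, default=None)
```

Note that terminal-first ordering means a terminal classification with a lower score is preferred over a non-terminal one with a higher score.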
Step 2: Add export queryset annotations
File: ami/main/models.py — OccurrenceExportManager
Extend existing subquery annotation pattern:
best_machine_prediction_name — Subquery: Classification → Taxon.name (ordered by -terminal, -score)
best_machine_prediction_algorithm — Subquery: Classification → Algorithm.name
best_machine_prediction_score — Subquery: Classification.score
best_identification_user — Subquery: Identification → User.username (most recent non-withdrawn)
verified_by_count — Count of non-withdrawn Identifications
best_detection_bbox — Subquery: Detection.bbox
best_detection_source_image_path + public_base_url — raw values for URL computation in serializer
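To show what such a subquery annotation computes, here is the equivalent correlated subquery run against an in-memory SQLite database. The schema is a deliberate simplification and an assumption: classifications are linked directly to occurrences and the taxon name is stored inline, which is not necessarily how the real tables are laid out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE occurrence (id INTEGER PRIMARY KEY);
CREATE TABLE classification (
    id INTEGER PRIMARY KEY,
    occurrence_id INTEGER,
    taxon_name TEXT,
    score REAL,
    terminal INTEGER
);
INSERT INTO occurrence (id) VALUES (1), (2);
INSERT INTO classification (occurrence_id, taxon_name, score, terminal) VALUES
    (1, 'Lepidoptera', 0.99, 0),
    (1, 'Noctua pronuba', 0.71, 1),
    (2, 'Erebidae', 0.40, 0);
""")

# One scalar subquery per exported column, all sharing the same ordering
# (-terminal, -score) and limited to the single best row per occurrence.
BEST = """
    SELECT {column} FROM classification c
    WHERE c.occurrence_id = o.id
    ORDER BY c.terminal DESC, c.score DESC
    LIMIT 1
"""

query = f"""
SELECT o.id,
       ({BEST.format(column='c.taxon_name')}) AS best_machine_prediction_name,
       ({BEST.format(column='c.score')}) AS best_machine_prediction_score
FROM occurrence o
ORDER BY o.id
"""
rows = conn.execute(query).fetchall()
```

Each annotation repeats the same ordering, so all three machine prediction columns are guaranteed to come from the same classification row as long as the ordering is deterministic.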
Step 3: Add fields to OccurrenceTabularSerializer
File: ami/exports/format_types.py
Add all new fields to the CSV/tabular serializer using the queryset annotations from step 2. Compute best_detection_source_image_url and best_detection_occurrence_url in serializer methods.
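A rough sketch of the URL computation, written as plain functions rather than serializer methods. The URL shapes are hypothetical: the source only says that the image URL is built from the path and public_base_url, and the occurrence route used here is an invented placeholder, not the platform's confirmed route.

```python
from typing import Optional
from urllib.parse import quote


def source_image_url(public_base_url: Optional[str], path: Optional[str]) -> Optional[str]:
    """Join the base URL and image path; returns None if either is missing."""
    if not public_base_url or not path:
        return None
    # Normalise slashes and percent-encode unsafe characters in the path.
    return public_base_url.rstrip("/") + "/" + quote(path.lstrip("/"))


def occurrence_url(ui_base_url: str, project_id: int, occurrence_id: int) -> str:
    """Build a UI link to an occurrence (hypothetical route layout)."""
    return f"{ui_base_url.rstrip('/')}/projects/{project_id}/occurrences/{occurrence_id}"
```

Returning None when the base URL or path is missing keeps the CSV cell empty instead of emitting a broken link.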
Step 4: Add best_machine_prediction to API serializer
File: ami/main/api/serializers.py
Add best_machine_prediction nested field to OccurrenceListSerializer using find_best_prediction().
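One possible shape for the nested object, sketched without Django REST Framework. The key names inside best_machine_prediction (name, algorithm, score) are assumptions that mirror the export field suffixes; the actual serializer may expose different or additional keys.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Prediction:
    # Hypothetical stand-in for the best Classification.
    name: str
    algorithm: str
    score: float


def occurrence_payload(
    determination: Optional[str],
    determination_score: Optional[float],
    best_prediction: Optional[Prediction],
) -> dict:
    """Build an illustrative API payload for one occurrence."""
    nested = None
    if best_prediction is not None:
        nested = {
            "name": best_prediction.name,
            "algorithm": best_prediction.algorithm,
            "score": best_prediction.score,
        }
    return {
        "determination": determination,
        "determination_score": determination_score,
        # Populated from the ML prediction even after a human has verified.
        "best_machine_prediction": nested,
    }
```

After human verification the top-level determination changes and its score becomes null, while the nested object still reports the original prediction.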
Step 5: Tests
File: ami/exports/tests.py, API tests
- ML prediction only → fields populated, verified_by null
- ML + agreeing human ID → verified_by set, determination_matches_machine_prediction = true, determination_score = null
- ML + disagreeing human ID → determination_matches_machine_prediction = false
- Multiple identifications → verified_by_count correct
- bbox and URL fields populated
- API: best_machine_prediction persists after human verification
Step 6: Management command for backfill
- Create command to run update_occurrence_determination() on all occurrences (backfills determination_score = null for human IDs)
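The backfill loop might look roughly like the following, shown in plain Python instead of a Django management command. The `Occurrence` class and its `update_determination()` method are hypothetical stand-ins for the model and update_occurrence_determination(), and the batching is an assumption about how a large table would be processed.

```python
from dataclasses import dataclass
from itertools import islice
from typing import Iterable, Iterator, List, Optional


@dataclass
class Occurrence:
    # Hypothetical stand-in for the Occurrence model.
    id: int
    determination_score: Optional[float]
    has_human_identification: bool

    def update_determination(self) -> bool:
        """Stand-in for update_occurrence_determination(); True if a change was made."""
        if self.has_human_identification and self.determination_score is not None:
            self.determination_score = None
            return True
        return False


def batched(items: Iterable[Occurrence], size: int) -> Iterator[List[Occurrence]]:
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def backfill(occurrences: Iterable[Occurrence], batch_size: int = 500) -> int:
    """Re-run the determination update on every occurrence; returns the number changed."""
    changed = 0
    for batch in batched(occurrences, batch_size):
        for occurrence in batch:
            changed += occurrence.update_determination()
    return changed
```

The update is idempotent in this sketch, so running the backfill a second time reports zero changes, which makes it safe to re-run after an interruption.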
Follow-up TODOs
🤖 Generated with Claude Code