
Support resumable scoring #3

Draft
ftshijt wants to merge 1 commit into codex/implement-clap-score from resume-utterance-scoring-update

ftshijt (Owner) commented May 8, 2026

Summary

Adds resume support for utterance-level scoring so interrupted runs can continue from an existing JSONL output file instead of starting over.

Details

  • Adds a --resume flag to the standard and chunked scorer entrypoints.
  • Loads completed utterance keys from an existing output file and skips them during scoring.
  • Appends new results safely, including when the existing file is missing a trailing newline.
  • Keeps resumed rows in the returned score list so summary computation includes previous and new results.
  • Adds MetricSuite.__len__ so legacy chunked scoring code can safely check loaded metric counts.
  • Adds a focused regression test for skipping completed keys.
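The resume behavior listed above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper names (`load_completed_rows`, `append_rows`, `score_with_resume`), the `score_fn` callback, and the `"key"` field name are all assumptions made for the example.

```python
import json
import os


def load_completed_rows(path, key_field="key"):
    """Read rows already written to a JSONL file (hypothetical helper)."""
    rows = []
    if not os.path.exists(path):
        return rows
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                row = json.loads(line)
            except json.JSONDecodeError:
                # Assumption: tolerate an unparsable line (e.g. a write cut
                # short by an interruption) instead of failing the resume.
                continue
            if isinstance(row, dict) and key_field in row:
                rows.append(row)
    return rows


def needs_leading_newline(path):
    """True if the file is non-empty and does not end with a newline."""
    if not os.path.exists(path) or os.path.getsize(path) == 0:
        return False
    with open(path, "rb") as f:
        f.seek(-1, os.SEEK_END)
        return f.read(1) != b"\n"


def append_rows(path, rows):
    """Append rows, first terminating any unterminated last line."""
    prefix = "\n" if needs_leading_newline(path) else ""
    with open(path, "a", encoding="utf-8") as f:
        f.write(prefix)
        for row in rows:
            f.write(json.dumps(row) + "\n")


def score_with_resume(keys, score_fn, path, key_field="key"):
    """Score only keys missing from the file; return previous + new rows."""
    previous = load_completed_rows(path, key_field)
    done = {row[key_field] for row in previous}
    new_rows = [score_fn(k) for k in keys if k not in done]
    if new_rows:
        append_rows(path, new_rows)
    # Returning both sets lets a summary cover the whole run.
    return previous + new_rows
```

Calling `score_with_resume` a second time with the same keys and output path scores nothing new and leaves the file unchanged, which is the property the validation commands below check from the command line.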

Validation

  • conda run -n versa-dev python -m pytest test/test_pipeline/test_base_metrics_pipeline.py::test_score_utterances_resume_skips_completed_keys -q
  • conda run -n versa-dev python -m pytest test/test_pipeline/test_base_metrics_pipeline.py::test_score_utterances_resume_skips_completed_keys test/test_pipeline/test_base_metrics_pipeline.py::test_stoi_and_signal_pipeline_with_registry -q
  • conda run -n versa-dev python -m versa.bin.scorer --pred test/test_samples/test2.scp --gt test/test_samples/test1.scp --score_config egs/separate_metrics/snr_related.yaml --output_file /private/tmp/versa-resume-cli.jsonl --io soundfile --resume (run twice; the output file stayed at 1 row after the second run)
  • conda run -n versa-dev python -m versa.bin.scorer_chunk --pred test/test_samples/test2.scp --gt test/test_samples/test1.scp --score_config egs/separate_metrics/snr_related.yaml --output_file /private/tmp/versa-resume-chunk.jsonl --io soundfile --enable_chunking --chunk_duration 0.5 --hop_duration 0.5 --resume (run twice; the output file stayed at 2 rows after the second run)
  • python3 -m py_compile versa/definition.py versa/scorer_shared.py versa/bin/scorer.py versa/bin/scorer_chunk.py test/test_pipeline/test_base_metrics_pipeline.py
  • git diff --check
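The "run twice, row count unchanged" checks above could be automated along these lines. This is only a sketch: `count_jsonl_rows` and `assert_rerun_is_idempotent` are hypothetical helpers, and `rerun` stands for any callable that repeats the scoring command with --resume against the same output file.

```python
def count_jsonl_rows(path):
    """Count non-blank lines in a JSONL file."""
    with open(path, "r", encoding="utf-8") as f:
        return sum(1 for line in f if line.strip())


def assert_rerun_is_idempotent(rerun, path):
    """Run `rerun()` and fail if the output file gained or lost rows."""
    before = count_jsonl_rows(path)
    rerun()
    after = count_jsonl_rows(path)
    if after != before:
        raise AssertionError(
            f"row count changed on resume: {before} -> {after}"
        )
    return after
```

In practice `rerun` might wrap a `subprocess.run` call to one of the scorer commands listed above; comparing row counts rather than keys keeps the check usable for chunked output, where the row layout per utterance is not specified here.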

@ftshijt ftshijt force-pushed the resume-utterance-scoring-update branch from cac95ac to c490875 Compare May 8, 2026 01:23
@ftshijt ftshijt force-pushed the resume-utterance-scoring-update branch from c490875 to 96215e1 Compare May 11, 2026 23:39