[codex] Make real metric caches explicit #5

Closed
ftshijt wants to merge 120 commits into main from codex-move-fastdtw-to-audio-extra

ftshijt (Owner) commented May 18, 2026

Summary

  • Add a visible Hugging Face/discrete-speech cache setup flow for real model-backed metric tests.
  • Wire discrete speech and EmoVAD loaders to use explicit cache paths and avoid hidden or incomplete cache state.
  • Tighten metric pipeline sample checks so missing outputs fail instead of silently passing.
  • Document when to run tools/setup_huggingface_cache.sh in the README and CI/testing docs.
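As a rough illustration of what "explicit cache paths" could mean in a loader, the sketch below resolves a cache directory from an environment variable and fails loudly when an expected asset is absent. The environment variable name and the default path come from the validation commands in this PR; the helper names `resolve_cache_dir` and `require_asset` and the file name `model.bin` are hypothetical and are not taken from the VERSA code base.

```python
import os
from pathlib import Path


def resolve_cache_dir(env_var: str, default: str) -> Path:
    """Return a visible cache directory: the environment variable if set, else the default."""
    return Path(os.environ.get(env_var, default)).expanduser().resolve()


def require_asset(cache_dir: Path, relative_path: str) -> Path:
    """Return the asset path, or raise if the file is missing from the cache."""
    asset = cache_dir / relative_path
    if not asset.is_file():
        # Fail here instead of letting a loader continue with incomplete cache state.
        raise FileNotFoundError(
            f"Missing asset {relative_path!r} in {cache_dir}; "
            "run tools/setup_huggingface_cache.sh first."
        )
    return asset


if __name__ == "__main__":
    # Hypothetical usage: the asset name below is illustrative only.
    hf_cache = resolve_cache_dir("VERSA_HF_CACHE_DIR", "versa_cache/huggingface")
    print(f"Using Hugging Face cache at {hf_cache}")
    try:
        require_asset(hf_cache, "model.bin")
    except FileNotFoundError as err:
        print(err)
```

The point of the design is that the cache location is always printed or passed explicitly, so a missing checkpoint surfaces as a clear error rather than as a downstream model failure.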

Root Cause

The real-model failures occurred because package availability checks passed even though required model or checkpoint assets were missing or stored in hidden or inconsistent cache locations. In addition, some pipeline checks iterated only over the keys of the returned summary, so an empty metric output triggered no assertion and looked successful.
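The second failure mode can be sketched in a few lines. This is a minimal illustration, not the actual VERSA test code: the function names and the metric key `metric_a` are hypothetical. A check that loops over the returned keys passes vacuously on an empty summary, while a check driven by the expected keys fails.

```python
from typing import Iterable, Mapping, Optional


def check_loose(summary: Mapping[str, Optional[float]]) -> None:
    """Problematic pattern: iterates over whatever came back."""
    for key in summary:
        # With an empty summary this loop body never runs, so nothing can fail.
        assert summary[key] is not None, f"{key} has no value"


def check_strict(
    summary: Mapping[str, Optional[float]], expected_keys: Iterable[str]
) -> None:
    """Tightened pattern: iterates over the keys the test expects."""
    missing = [k for k in expected_keys if summary.get(k) is None]
    if missing:
        raise AssertionError(f"missing metric outputs: {missing}")


if __name__ == "__main__":
    empty_summary: dict = {}
    check_loose(empty_summary)  # passes silently even though nothing was computed
    try:
        check_strict(empty_summary, ["metric_a"])
    except AssertionError as err:
        print(f"strict check failed as intended: {err}")
```

Driving the loop from the expected keys means a metric that silently produced nothing is reported by name.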

Validation

  • VERSA_HF_CACHE_DIR=$PWD/versa_cache/huggingface VERSA_DISCRETE_SPEECH_CACHE_DIR=$PWD/versa_cache/discrete_speech_metrics MPLCONFIGDIR=/private/tmp/matplotlib-cache .codex-test-venv/bin/python -m pytest --import-mode=importlib -q --tb=short test/test_metrics/test_discrete_speech.py test/test_metrics/test_emo_vad.py -> 14 passed
  • VERSA_RUN_REAL_MODEL_TESTS=1 VERSA_HF_CACHE_DIR=$PWD/versa_cache/huggingface VERSA_DISCRETE_SPEECH_CACHE_DIR=$PWD/versa_cache/discrete_speech_metrics MPLCONFIGDIR=/private/tmp/matplotlib-cache .codex-test-venv/bin/python -m pytest --import-mode=importlib -q -rs test -> 221 passed, 5 skipped
  • git diff --check
  • .codex-test-venv/bin/python -m compileall -q versa test
  • bash -n tools/setup_huggingface_cache.sh

Notes

The only real-model item still skipped is scoreq_versa, and only when that optional package is not installed.
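One way such a skip decision might be expressed is shown below. This is a sketch under assumptions: `VERSA_RUN_REAL_MODEL_TESTS` appears in the validation command above, but the helper name `real_model_skip_reason` and the exact gating logic are hypothetical. As the root cause section notes, a package check like this only shows that the import is available, so it would need to be paired with an asset check on the cache.

```python
import importlib.util
import os
from typing import Optional


def real_model_skip_reason(package: str) -> Optional[str]:
    """Return a skip reason, or None when a real-model test may run."""
    if os.environ.get("VERSA_RUN_REAL_MODEL_TESTS") != "1":
        return "set VERSA_RUN_REAL_MODEL_TESTS=1 to run real-model tests"
    # find_spec returns None for a top-level package that is not importable.
    if importlib.util.find_spec(package) is None:
        return f"optional package {package!r} is not installed"
    return None


if __name__ == "__main__":
    reason = real_model_skip_reason("scoreq_versa")
    print(reason or "real-model test would run")
```

In a pytest suite the returned string could feed a `pytest.mark.skipif` reason, which is what makes the skip visible in the `-rs` report.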

ftshijt and others added 30 commits June 16, 2025 02:34
1) wer with fireredasr
2) per with fireredasr
ftshijt added 29 commits May 5, 2026 19:02
refactor versa in OO (major update)
Add WER metrics: Faster-Whisper, NeMo ASR, Facebook’s HuBERT-large-finetuned
# Conflicts:
#	versa/corpus_metrics/whisper_wer.py
#	versa/scorer_shared.py
Update pseudo_mos.py for default data type
…result

# Conflicts:
#	versa/utterance_metrics/sheet_ssqa.py
…m-show-result

[codex] Fix Slurm result summary command
…t-clap-score

[codex] Implement CLAP score metric
…d-scoring

Add metric-oriented scoring mode
…ckaging-ci

Refine packaging CI and metric summaries
@ftshijt ftshijt closed this May 18, 2026
7 participants