[codex] Make real metric caches explicit by ftshijt · Pull Request #5 · ftshijt/versa

ftshijt · 2026-05-18T21:59:27Z

Summary

Add a visible Hugging Face/discrete-speech cache setup flow for real model-backed metric tests.
Wire discrete speech and EmoVAD loaders to use explicit cache paths and avoid hidden or incomplete cache state.
Tighten metric pipeline sample checks so missing outputs fail instead of silently passing.
Document when to run tools/setup_huggingface_cache.sh in the README and CI/testing docs.
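
The explicit cache paths described in the summary can be sketched as follows. This is a minimal illustration, not the code in this PR: the environment variable name is taken from the validation commands, while `resolve_cache_dir`, `missing_assets`, the fallback directory, and the asset file name are hypothetical.

```python
import os
from pathlib import Path


def resolve_cache_dir(env_var: str, fallback: str) -> Path:
    """Return the cache directory named by env_var, or a visible fallback."""
    # Hypothetical helper: prefer an explicit, user-visible location
    # over a hidden library default.
    return Path(os.environ.get(env_var) or fallback).expanduser().resolve()


def missing_assets(cache_dir: Path, required_files: list) -> list:
    """List required files that are absent, so a partial cache fails loudly."""
    return [name for name in required_files if not (cache_dir / name).is_file()]


if __name__ == "__main__":
    # The fallback path mirrors the layout used in the validation commands;
    # treating it as a default is an assumption.
    hf_cache = resolve_cache_dir("VERSA_HF_CACHE_DIR", "versa_cache/huggingface")
    # "config.json" is a placeholder asset name, not a file this PR is
    # known to require.
    absent = missing_assets(hf_cache, ["config.json"])
    if absent:
        print(f"Cache at {hf_cache} is incomplete, missing: {absent}")
    else:
        print(f"Cache at {hf_cache} looks complete")
```

Checking for the files themselves, rather than only for the directory, is what separates a complete cache from a partially populated one.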

Root Cause

The real-model failures came from package availability checks passing while required model/checkpoint assets were missing or stored in hidden/inconsistent cache locations. Some pipeline checks also iterated over returned summary keys, so an empty metric output could look successful.
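
The second failure mode, where an empty metric output looks successful, can be illustrated with a short sketch. The function names and the `metric_a` key are hypothetical and are not taken from the repository; the point is only the difference between iterating over returned keys and iterating over expected keys.

```python
import math


def check_returned_keys(summary: dict) -> bool:
    """Weak check: only inspects keys that happen to be present."""
    for key, value in summary.items():
        if not math.isfinite(value):
            raise AssertionError(f"non-finite value for {key}")
    # An empty summary reaches this line without any check having run.
    return True


def check_expected_keys(summary: dict, expected_keys: list) -> bool:
    """Strict check: every expected key must be present and finite."""
    missing = [key for key in expected_keys if key not in summary]
    if missing:
        raise AssertionError(f"missing metric outputs: {missing}")
    for key in expected_keys:
        if not math.isfinite(summary[key]):
            raise AssertionError(f"non-finite value for {key}")
    return True
```

With an empty dictionary, `check_returned_keys({})` returns `True`, while `check_expected_keys({}, ["metric_a"])` raises `AssertionError`.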

Validation

VERSA_HF_CACHE_DIR=$PWD/versa_cache/huggingface VERSA_DISCRETE_SPEECH_CACHE_DIR=$PWD/versa_cache/discrete_speech_metrics MPLCONFIGDIR=/private/tmp/matplotlib-cache .codex-test-venv/bin/python -m pytest --import-mode=importlib -q --tb=short test/test_metrics/test_discrete_speech.py test/test_metrics/test_emo_vad.py -> 14 passed
VERSA_RUN_REAL_MODEL_TESTS=1 VERSA_HF_CACHE_DIR=$PWD/versa_cache/huggingface VERSA_DISCRETE_SPEECH_CACHE_DIR=$PWD/versa_cache/discrete_speech_metrics MPLCONFIGDIR=/private/tmp/matplotlib-cache .codex-test-venv/bin/python -m pytest --import-mode=importlib -q -rs test -> 221 passed, 5 skipped
git diff --check
.codex-test-venv/bin/python -m compileall -q versa test
bash -n tools/setup_huggingface_cache.sh

Notes

The remaining skipped real-model item is scoreq_versa when that optional package is not installed.
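
A skip of this kind can be made explicit by reporting why a test did not run. The sketch below is an assumption about how such gating could look, not the repository's implementation: `skip_reason` is a hypothetical helper, and treating the value `1` as the only enabling value of `VERSA_RUN_REAL_MODEL_TESTS` is inferred from the validation command.

```python
import importlib.util
import os


def skip_reason(required_module: str, environ=None):
    """Return None if a real-model test should run, else a readable reason.

    required_module must be a top-level module name.
    """
    environ = os.environ if environ is None else environ
    if environ.get("VERSA_RUN_REAL_MODEL_TESTS") != "1":
        return "real-model tests are disabled (VERSA_RUN_REAL_MODEL_TESTS is not 1)"
    # find_spec returns None when a top-level module cannot be found.
    if importlib.util.find_spec(required_module) is None:
        return f"optional package {required_module!r} is not installed"
    return None
```

A test runner could pass the returned string to its skip mechanism so that the summary line shows the reason next to each skipped item.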

ftshijt and others added 30 commits June 16, 2025 02:34

refactor versa with oo update -> a major update

13ac00b

Merge branch 'main' into refactor

58eb805

Merge branch 'main' into refactor

8d2f6f0

init visualization - sunburst chart

2007cc6

add faster-whisper wer

461f668

add: three metrics

db72a27

fix: copyright

50d10c5

fix: a small bug in scorer_shared/list_scoring

718b2ed

add: docs

c289243

add interactive radar chart

84df3ad

update

878ef17

add description function of using text LLMs

6d5d931

add a interpreter that can directly apply for the metric results from…

0fe5a5c

… scorer

add some common functions for interpreter usage and future extension …

576794b

…model usage

Merge branch 'wavlab-speech:main' into metric_description

707c24a

run isort and black on the interpreter_shared.py

fe93aea

run isort and black on the interpreter.py

8068469

run isort and black on the text_llm_description.py

5696ef1

add todo:

d9b4ad0

1) wer with fireredasr 2) per with fireredasr

add asvspoof.py

8796810

Merge branch 'refactor' of https://github.com/ftshijt/versa into refa…

53c3e0a

…ctor

update discrets speech / chroma_alignment

7e95ee8

update test function and versa with black and emo_vad

b7b9dd4

Merge branch 'main' into refactor

dcc1822

update emo_similarity

8ff163a

fix metric list and set setup.py

78894ee

fix setup.py

e5e10bf

fix(versa/utterance_metrics/pseudo_mos.py): update singmos to new ver…

bb3de56

…sion

feat: add a new version of singmos

6112732

update README

e3ed7ab

ftshijt added 29 commits May 5, 2026 19:02

Make README example commands runnable

bcec446

Merge pull request wavlab-speech#37 from ftshijt/refactor

0344585

refactor versa in OO (major update)

Merge upstream main into PR 44

08f7083

Merge pull request wavlab-speech#44 from whr-a/main

f4c8704

Add WER metrics: Faster-Whisper, NeMo ASR, Facebook’s HuBERT-large-finetuned

Merge branch 'codex/upstream-main' into codex/pr-50

297c740

# Conflicts: # versa/corpus_metrics/whisper_wer.py # versa/scorer_shared.py

Merge pull request wavlab-speech#50 from South-Twilight/per

aca43b5

Add metrics: Phoneme Error Rate

Add UTMOSv2 data type label test

1829036

Merge branch 'codex/upstream-main' into codex/pr-62-rebase

0b35e86

Merge pull request wavlab-speech#62 from wavlab-speech/ftshijt-patch-2

33066c9

Update pseudo_mos.py for default data type

Fix Slurm result summary command

926e8fa

Trust SpeechMOS torch hub repo in CI

183504a

Retry Sheet SSQA hub download

128626c

Merge remote-tracking branch 'origin/main' into codex/fix-slurm-show-…

b1b3f40

…result # Conflicts: # versa/utterance_metrics/sheet_ssqa.py

Merge pull request wavlab-speech#68 from wavlab-speech/codex/fix-slur…

b85d4a8

…m-show-result [codex] Fix Slurm result summary command

Implement CLAP score metric

71ca920

Merge pull request wavlab-speech#69 from wavlab-speech/codex/implemen…

812c2e2

…t-clap-score [codex] Implement CLAP score metric

Support resumable scoring

b341c88

Merge pull request wavlab-speech#70 from wavlab-speech/support-resuma…

c66c76b

…ble-scoring Support resumable scoring

Document resumable scoring

7006714

Merge pull request wavlab-speech#71 from wavlab-speech/document-resum…

452a912

…able-scoring Document resumable scoring

Add metric-oriented scoring mode

69dcd38

Apply Black formatting

4f37147

Merge pull request wavlab-speech#72 from ftshijt/codex/metric-oriente…

d6ffb85

…d-scoring Add metric-oriented scoring mode

Document issue 59 installation notes

6f8e7f9

Merge pull request wavlab-speech#73 from ftshijt/codex/fix-issue-59

d7f8b22

Fix speaker metric import side effects

Refine packaging CI and metric summaries

9ec9d40

Merge pull request wavlab-speech#74 from ftshijt/codex/refine-repo-pa…

a8fa328

…ckaging-ci Refine packaging CI and metric summaries

Move fastdtw to audio extra

1b7817f

Make real metric caches explicit

c66ba0f

ftshijt closed this May 18, 2026

ftshijt wants to merge 120 commits into main from codex-move-fastdtw-to-audio-extra

7 participants
