Orchive dataset #79

Open
nkundiushuti wants to merge 3 commits into main from marius/orchive

Conversation

@nkundiushuti
Contributor

The annotated part of Orchive is quite small (I was not aware of this).
They do, however, provide a fairly large unannotated dataset of cropped calls.

@nkundiushuti nkundiushuti requested a review from GaganNarula June 25, 2025 17:04
@nkundiushuti nkundiushuti changed the title from "Orchive archive" to "Orchive dataset" Jun 25, 2025
@david-rx
Contributor

david-rx commented Jul 28, 2025

Hey @nkundiushuti @GaganNarula @mil-ad - what do we think of including this in the end? For vocal repertoire we don't need a training set; clustering/retrieval are the primary evals for many repertoire datasets anyway. This would also be our hardest, most diverse repertoire task, and the only marine one. Maybe we could just add a split for "all data", since we will probably not train on it and we want to use everything, and maybe a second split with all data restricted to classes that have at least 3 examples?
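A rough sketch of what those two splits could look like, assuming each example is a `(clip_path, label)` pair. The function name `make_splits`, the split names, and the toy file names and call-type labels are all hypothetical; none of this comes from the PR's actual code:

```python
from collections import Counter


def make_splits(examples, min_per_class=3):
    """Build an 'all data' split and a split limited to well-populated classes.

    `examples` is an iterable of (clip_path, label) pairs. This layout is an
    assumption for illustration, not the loader format used in the PR.
    """
    all_data = list(examples)
    # Count how many clips each call type has.
    counts = Counter(label for _, label in all_data)
    # Keep only clips whose class has at least `min_per_class` examples.
    filtered = [
        (clip, label) for clip, label in all_data
        if counts[label] >= min_per_class
    ]
    return {"all": all_data, f"min{min_per_class}": filtered}


# Toy data; the file names and call-type labels are made up.
examples = [
    ("a.wav", "N1"), ("b.wav", "N1"), ("c.wav", "N1"),
    ("d.wav", "N4"),
    ("e.wav", "N9"), ("f.wav", "N9"),
]
splits = make_splits(examples)
```

With the toy data, the `"all"` split keeps every clip, while `"min3"` keeps only the three `N1` clips, since `N4` and `N9` have fewer than 3 examples each.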

Lmk what you think and happy to help with that if you agree

EDIT - hold that thought, I think there is a much better one

2 participants