Add IFBench RLVR verifiers environment #31

Open
partyplatter08-lab wants to merge 4 commits into allenai:main from partyplatter08-lab:codex/ifbench-rlvr-env


@partyplatter08-lab commented May 31, 2026

Summary

This PR adds a small, self-contained Prime Verifiers environment for IFBench instruction-following RLVR work.

It includes:

  • environments/ifbench_rlvr/ with a packaged load_environment() entrypoint for prime eval run / Environment Hub usage
  • normalization for both released IFBench eval rows (prompt, instruction_id_list, kwargs) and IF-RLVR training rows from allenai/IF_multi_constraints_upto5 (messages, ground_truth)
  • reward functions that reuse IFBench verifier classes, falling back to the allenai/open-instruct IF-RLVR verifier registry for training-only constraint IDs
  • README quickstart and environment arguments
  • focused tests for dataset normalization, JSONL loading, and fractional verifier reward scoring
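The normalization step in the list above can be illustrated with a minimal sketch. It is not the code in this PR: `normalize_row` and the output keys are hypothetical names, and the layout of `ground_truth` (a JSON string holding a list of objects with `instruction_id` and `kwargs` lists) is an assumption about the training data, not something stated here.

```python
import json
from typing import Any


def normalize_row(row: dict[str, Any]) -> dict[str, Any]:
    """Map either row schema onto one common shape (hypothetical keys)."""
    if "instruction_id_list" in row:
        # Released IFBench eval row: prompt, instruction_id_list, kwargs.
        return {
            "prompt": row["prompt"],
            "instruction_ids": list(row["instruction_id_list"]),
            "kwargs": [dict(k or {}) for k in row["kwargs"]],
        }

    # IF-RLVR training row: messages, ground_truth.
    # ASSUMPTION: ground_truth is a JSON string (or already-parsed list) of
    # {"instruction_id": [...], "kwargs": [...]} objects.
    truth = row["ground_truth"]
    if isinstance(truth, str):
        truth = json.loads(truth)

    instruction_ids: list[str] = []
    kwargs: list[dict[str, Any]] = []
    for item in truth:
        instruction_ids.extend(item["instruction_id"])
        kwargs.extend(dict(k or {}) for k in item["kwargs"])

    # Use the last user turn as the prompt text.
    user_turns = [m["content"] for m in row["messages"] if m["role"] == "user"]
    return {
        "prompt": user_turns[-1],
        "instruction_ids": instruction_ids,
        "kwargs": kwargs,
    }
```

Collapsing both schemas into one shape early means the reward functions only ever see a list of instruction IDs paired with their keyword arguments, regardless of which dataset a row came from.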

Why

The README points users to the released IF-RLVR training data and Open Instruct verifier code, but there is not currently a train/eval-ready Prime verifiers environment in this repo. This gives downstream users a reproducible environment wrapper while keeping the core benchmark code unchanged.

This PR was prepared with AI assistance for the public Prime Intellect Algora IF-RLVR/Bench bounty: https://algora.io/PrimeIntellect-ai/bounties/dderbjHtPwTiGVY4. I have reviewed and tested the changes locally.

Validation

  • PYTHONDONTWRITEBYTECODE=1 uv run pytest -> 63 passed, 1 warning
  • uv build environments/ifbench_rlvr -> wheel and source distribution built successfully
  • PYTHONDONTWRITEBYTECODE=1 uv run --project environments/ifbench_rlvr --python 3.12 python ... -> installed the packaged environment, resolved last_word:last_word_answer through open_instruct.IFEvalG, constructed verifiers.envs.singleturn_env.SingleTurnEnv, and scored a demo response as reward_ok=1.0, reward_bad=0.0
  • Dataset registry smoke check on allenai/IF_multi_constraints_upto5[:20] and allenai/IFBench_test[:20] -> missing=[] for both
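For readers unfamiliar with the pattern, the registry resolution, `missing=[]` smoke check, and fractional scoring mentioned in these checks can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: the two dictionaries are stand-ins for the IFBench and open-instruct registries, the `demo:*` IDs and every function name are hypothetical, and plain callables replace the real verifier classes.

```python
from typing import Callable, Optional

Checker = Callable[[str], bool]

# Stand-in registries (hypothetical). In the real environment these would be
# the IFBench verifier classes and the open-instruct IF-RLVR registry.
PRIMARY: dict[str, Checker] = {
    "demo:lowercase": lambda text: text == text.lower(),
}
FALLBACK: dict[str, Checker] = {
    "demo:ends_with_done": lambda text: text.rstrip().endswith("done"),
}


def resolve(instruction_id: str) -> Optional[Checker]:
    """Prefer the primary registry, then fall back for training-only IDs."""
    return PRIMARY.get(instruction_id) or FALLBACK.get(instruction_id)


def missing_ids(instruction_ids: list[str]) -> list[str]:
    """Smoke check: IDs that neither registry can resolve."""
    return sorted({i for i in instruction_ids if resolve(i) is None})


def fractional_reward(response: str, instruction_ids: list[str]) -> float:
    """Fraction of constraints satisfied; unresolved IDs count as failures."""
    if not instruction_ids:
        return 0.0
    passed = 0
    for instruction_id in instruction_ids:
        checker = resolve(instruction_id)
        if checker is not None and checker(response):
            passed += 1
    return passed / len(instruction_ids)
```

With these stand-ins, a response that satisfies one of two constraints scores 0.5, which is the sense of "fractional" used here. Counting unresolved IDs as failures is one possible design choice; raising an error instead would surface registry gaps earlier.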

/claim https://algora.io/PrimeIntellect-ai/bounties/dderbjHtPwTiGVY4

@partyplatter08-lab

Author

Submitted for the public Prime Intellect Algora IF-RLVR/Bench bounty: https://algora.io/PrimeIntellect-ai/bounties/dderbjHtPwTiGVY4

Current verification on this branch:

  • PYTHONDONTWRITEBYTECODE=1 uv run pytest -> 63 passed, 1 warning
  • uv build environments/ifbench_rlvr -> wheel and sdist built successfully

The implementation is intentionally scoped as a standalone verifiers environment wrapper around the existing IFBench/Open Instruct verifier logic, so the core benchmark code stays unchanged.

@partyplatter08-lab

Author

Follow-up verification update: I found and fixed a packaging/runtime issue while doing an end-to-end environment install. The environment now pins open-instruct to a packaged revision that includes open_instruct.IFEvalG, and constrains Python to the dependency-supported >=3.12,<3.13 range.

Additional validation after commit 3d9d406:

  • PYTHONDONTWRITEBYTECODE=1 uv run pytest -> 63 passed, 1 warning
  • uv build environments/ifbench_rlvr -> wheel and sdist built successfully
  • PYTHONDONTWRITEBYTECODE=1 uv run --project environments/ifbench_rlvr --python 3.12 python ... -> installed the packaged environment, resolved last_word:last_word_answer via open_instruct.IFEvalG, constructed verifiers.envs.singleturn_env.SingleTurnEnv, and scored the demo response as reward_ok=1.0, reward_bad=0.0
  • Dataset registry smoke check on allenai/IF_multi_constraints_upto5[:20] and allenai/IFBench_test[:20] -> missing=[] for both

This should make the submitted environment reproducible from the package metadata rather than relying on whatever open-instruct HEAD happens to install.
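As a rough illustration of what the pin and the Python constraint protect against, a runtime guard along these lines could confirm both conditions before the environment is used. The helper names are hypothetical and this check is not part of the PR; only the `>=3.12,<3.13` range and the `open_instruct.IFEvalG` module name come from the description above.

```python
import importlib.util
import sys


def python_supported(version=sys.version_info) -> bool:
    """True when the interpreter falls in the >=3.12,<3.13 range."""
    return (3, 12) <= tuple(version[:2]) < (3, 13)


def module_available(name: str) -> bool:
    """True when `name` can be located without importing the module itself.

    Note: for a dotted name, find_spec imports the parent package first and
    raises ModuleNotFoundError if that parent is missing.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


if __name__ == "__main__":
    print("python ok:", python_supported())
    print("IFEvalG available:", module_available("open_instruct.IFEvalG"))
```

If the second check reports `False`, the installed open-instruct build likely lacks the packaged `IFEvalG` module, which is the failure the pinned revision is meant to avoid.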
