Add RLVR reward wrapper for IFBench verifiers by Catnap7 · Pull Request #29 · allenai/IFBench

Catnap7 · 2026-05-22T02:49:04Z

Summary

Algora bounty: https://algora.io/PrimeIntellect-ai/bounties/dderbjHtPwTiGVY4

This PR adds a small train-ready RLVR wrapper around the existing IFBench verifier logic.

It provides:

rlvr_env.score_response(...) for single prompt/completion reward scoring
strict and loose evaluation modes matching the benchmark verifier behavior
binary "all" rewards and dense partial-credit "fraction" rewards
python -m rlvr_env ... to convert prompt/response JSONL files into reward-labeled JSONL
README usage examples for RLVR loops
UTF-8 file handling in evaluation_lib.py, which avoids Windows locale decode errors on the included JSONL data
focused tests for scalar rewards, per-instruction diagnostics, and JSONL output
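
To illustrate the difference between the binary and partial-credit reward modes listed above, here is a minimal sketch. It assumes the verifiers yield one pass/fail flag per instruction; the function name aggregate_reward and its signature are hypothetical and are not the API added by this PR.

```python
from typing import Sequence


def aggregate_reward(followed: Sequence[bool], mode: str = "all") -> float:
    """Collapse per-instruction pass/fail flags into one scalar reward.

    Hypothetical helper for illustration only; not part of rlvr_env.
    """
    if not followed:
        raise ValueError("expected at least one instruction result")
    if mode == "all":
        # Binary reward: 1.0 only when every instruction was followed.
        return 1.0 if all(followed) else 0.0
    if mode == "fraction":
        # Dense reward: share of instructions that were followed.
        return sum(followed) / len(followed)
    raise ValueError(f"unknown reward mode: {mode!r}")


results = [True, False, True, True]
binary = aggregate_reward(results, "all")
dense = aggregate_reward(results, "fraction")
print(binary, dense)
```

The binary mode matches an all-or-nothing, prompt-level notion of success, while the fraction mode gives a denser signal when a completion satisfies only some of its instructions.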

This is intended as a lightweight adapter for IF-RLVR-style training and evaluation loops; it does not change the underlying instruction verifier implementations.
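
As a rough sketch of the JSONL conversion and explicit UTF-8 handling described above: the record fields ("prompt", "response", "reward"), the helper label_jsonl, and the toy reward function are all assumptions made for illustration, not the schema or code in this PR.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def label_jsonl(src: Path, dst: Path, reward_fn: Callable[[str, str], float]) -> int:
    """Copy each JSONL record from src to dst with an added "reward" field.

    Hypothetical helper; field names are assumed for this example.
    """
    count = 0
    # Explicit UTF-8 avoids relying on the platform's default locale encoding.
    with src.open(encoding="utf-8") as fin, dst.open("w", encoding="utf-8") as fout:
        for line in fin:
            if not line.strip():
                continue  # skip blank lines
            record = json.loads(line)
            record["reward"] = reward_fn(record["prompt"], record["response"])
            fout.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count


# Toy usage: the reward is 1.0 when the response text appears in the prompt.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "responses.jsonl"
    dst = Path(tmp) / "rewards.jsonl"
    src.write_text('{"prompt": "Say héllo", "response": "héllo"}\n', encoding="utf-8")
    n = label_jsonl(src, dst, lambda p, r: 1.0 if r in p else 0.0)
    labeled = [json.loads(row) for row in dst.read_text(encoding="utf-8").splitlines()]
```

In a real loop, reward_fn would be replaced by a call into the wrapper's scoring function; its exact signature is not shown in this description, so it is left abstract here.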

Verification

Ran locally on Windows with Python 3.12:

.\.venv\Scripts\python -m pytest instructions_test.py tests/test_rlvr_env.py
62 passed, 1 warning

The warning is from the existing syllapy dependency importing pkg_resources; local verification used setuptools<81 to keep that dependency path working.

Catnap7 · 2026-05-22T02:49:34Z

Submitting this PR for the Prime Intellect IF-RLVR/Bench bounty:
https://algora.io/PrimeIntellect-ai/bounties/dderbjHtPwTiGVY4

The change adds a train-ready RLVR reward wrapper around IFBench's existing verifiers, with a reproducible CLI and tests. Happy to adjust scope if the expected deliverable differs.

Add IFBench RLVR reward wrapper

c3ab2d1

Catnap7 force-pushed the codex/ifbench-rlvr-reward-env branch from 264a514 to c3ab2d1 · May 22, 2026 18:34

Catnap7 wants to merge 1 commit into allenai:main from Catnap7:codex/ifbench-rlvr-reward-env
