superfarther/K2V

Knowledge-to-Verification: Unlocking Reinforcement Learning with Verifiable Rewards for LLMs in Knowledge-Intensive Domains

What is K2V

K2V (Knowledge-to-Verification) is a framework that extends RLVR (Reinforcement Learning with Verifiable Rewards) to knowledge-intensive domains and enables verification of the model's reasoning process, without any human supervision.

How to use

  1. Clone the repository

    git clone --recurse-submodules https://github.com/superfarther/K2V.git
    cd K2V
  2. Install the graphgen-mask dependencies as described in its README, then synthesize the fill-in-the-blank QA pairs.

    cd graphgen-mask
    vim README.md 
  3. Synthesize a question-specific checklist for each QA pair.

    cd utils
    vim README.md 
  4. Install the verl dependencies as described in its README, then start training.

    cd verl
    vim README.md 
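
The steps above produce two kinds of supervision: fill-in-the-blank QA pairs and a question-specific checklist per pair. As a rough illustration of how such data could be turned into a verifiable reward, here is a minimal Python sketch. It is not the K2V implementation: the function names (`answer_reward`, `checklist_reward`, `combined_reward`), the substring-based checklist check, and the `weight` parameter are all assumptions made for this example. The actual reward logic lives in the verl submodule and is described in its README.

```python
import string


def normalize(text: str) -> str:
    """Lowercase, strip ASCII punctuation, and collapse whitespace."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())


def answer_reward(prediction: str, gold: str) -> float:
    """Hypothetical outcome reward: 1.0 if the filled blank matches the gold answer."""
    return 1.0 if normalize(prediction) == normalize(gold) else 0.0


def checklist_reward(reasoning: str, checklist: list[str]) -> float:
    """Hypothetical process reward: fraction of checklist items found in the reasoning.

    A naive substring match stands in for whatever verifier is really used.
    """
    if not checklist:
        return 0.0
    normalized_reasoning = normalize(reasoning)
    hits = sum(1 for item in checklist if normalize(item) in normalized_reasoning)
    return hits / len(checklist)


def combined_reward(
    prediction: str,
    gold: str,
    reasoning: str,
    checklist: list[str],
    weight: float = 0.5,  # assumed mixing weight, not taken from the repository
) -> float:
    """Blend the outcome reward and the process reward."""
    outcome = answer_reward(prediction, gold)
    process = checklist_reward(reasoning, checklist)
    return (1.0 - weight) * outcome + weight * process


if __name__ == "__main__":
    score = combined_reward(
        prediction="Insulin.",
        gold="insulin",
        reasoning="Beta cells in the pancreas release it after a meal.",
        checklist=["beta cells", "liver"],
    )
    print(score)
```

Both components can be computed automatically from the synthesized data, which is what makes a reward of this kind checkable without human labels. The substring check is deliberately simple and would miss paraphrases.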

About

[ACL 2026] Knowledge-to-Verification: Exploring RLVR for LLMs in Knowledge-Intensive Domains
