Knowledge-to-Verification: Unlocking Reinforcement Learning with Verifiable Rewards for LLMs in Knowledge-Intensive Domains

What is K2V

K2V (Knowledge-to-Verification) is a framework that extends RLVR (Reinforcement Learning with Verifiable Rewards) to knowledge-intensive domains and enables verification of the model's reasoning process, without any human supervision.

How to use

Clone the repository

```
git clone --recurse-submodules https://github.com/superfarther/K2V.git
cd K2V
```
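
The later steps each rely on a submodule directory (graphgen-mask, utils, verl), so it can help to confirm they were actually fetched before continuing. The following is a minimal sketch, not part of the repository: `check_submodules` is a hypothetical helper name, and it only checks that each directory exists and is non-empty.

```shell
# Hypothetical helper (not shipped with K2V): report any directory
# that is missing or empty. Returns 0 if all are present, 1 otherwise.
check_submodules() {
    missing=0
    for dir in "$@"; do
        if [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
            echo "ok: $dir"
        else
            echo "missing or empty: $dir"
            missing=1
        fi
    done
    return "$missing"
}

# Example, run from the K2V repository root:
#   check_submodules graphgen-mask utils verl
# If a directory is reported missing (for instance after cloning
# without --recurse-submodules), the standard git command to fetch it is:
#   git submodule update --init --recursive
```

The directory names passed in the example are the ones used in the steps of this README.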

Install the dependencies of graphgen-mask according to the README, and then synthesize the fill-blank style QA pairs.
```
cd graphgen-mask
vim README.md 
```
Synthesize a question-specific checklist for each QA pair.
```
cd utils
vim README.md 
```
Install the dependencies of verl according to the README, and then start training.
```
cd verl
vim README.md 
```
