superfarther/graphgen-mask

What is this repo

This repo contains the framework that K2V uses to synthesize fill-in-the-blank QA pairs. The framework is built on GraphGen.

Installation

We recommend starting with a fresh conda environment.

conda create --name graphgen-mask python=3.10 -y
conda activate graphgen-mask

Install the necessary dependencies.

git clone https://github.com/superfarther/graphgen-mask.git
cd graphgen-mask
pip install -r requirements_K2V.txt

Note: This repo is significantly out of date compared to the official GraphGen. To synthesize QA pairs with the latest code, visit the official GraphGen repository and go to examples/generate/generate_masked_fill_in_blank_qa.

Quick Start

  1. To construct a KG from the corpus, K2V deploys an LLM with vLLM to perform Named Entity Recognition (NER) and Relation Extraction (RE).

    vllm serve Qwen/Qwen2.5-72B-Instruct --max_model_len 32768
  2. Configure the environment

    • Create a .env file in the root directory
      cp .env.example .env
    • Fill in the necessary keys in .env
      • SYNTHESIZER_MODEL: Local path of the LLM deployed with vLLM.
      • SYNTHESIZER_BASE_URL: Service endpoint of the LLM deployed with vLLM.
      • SYNTHESIZER_API_KEY: (optional) API key.
  3. We provide an example corpus, stored in K2V-example/data/example_corpus.json. An example configuration file is also available at K2V-example/config.yaml.

  4. Run the generation script

    bash K2V-example/run.sh
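
Before running the script, it can help to confirm that the two required keys from step 2 are actually filled in. The sketch below is one possible check and is not part of this repo: `check_env` is a hypothetical helper name, and it assumes that .env uses plain `KEY=value` lines. It only inspects the file, it does not contact the vLLM server.

```shell
# check_env: hypothetical helper, not shipped with this repo.
# Assumes .env contains plain KEY=value lines (one per line).
check_env() {
    env_file="${1:-.env}"
    if [ ! -f "$env_file" ]; then
        echo "missing file: $env_file"
        return 1
    fi
    status=0
    # SYNTHESIZER_API_KEY is optional, so only the other two keys are checked.
    for key in SYNTHESIZER_MODEL SYNTHESIZER_BASE_URL; do
        # A key counts as set when a line "KEY=<non-empty value>" exists.
        if ! grep -Eq "^${key}=.+" "$env_file"; then
            echo "missing key: $key"
            status=1
        fi
    done
    return "$status"
}
```

With the function defined in the current shell, `check_env .env && bash K2V-example/run.sh` would start generation only when both keys are present.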
