Example of using this lib for RLHF? #41

Description

@asmith26

Just wondering if there are any examples of using this lib to implement RLHF (Reinforcement Learning from Human Feedback)?

Inspired by: https://openai.com/blog/chatgpt

Many thanks for any help! :)


Labels: question
