RL training #33

@boyu9

Description

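Suppose the training data is code that implements front-end page functionality, for example generating a web page with button clicks, selection boxes, and similar features. Such tasks have no concrete ground-truth answer. Can the approach described in the paper be used here, that is, first providing a reference template answer and then computing the reward against it? More generally, how should the reward be designed when training on this type of code-generation data with RL?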
