
Collecting trajectories in SimplerEnv to make SFT datasets #13

Description

@JaehongJaehongMin

Hello,
Thank you for sharing your great work!

I have a question about collecting trajectories in the SimplerEnv-Bridge simulation.

I fully fine-tuned the original OpenVLA checkpoint on the RLDS dataset you uploaded in issue #8 (I'll call it "your data").
The fully fine-tuned model (I'll call it OpenVLA-Full-SFT) performs well, almost on par with your LoRA fine-tuned model on Hugging Face. (Great! Thank you!)

I then collected trajectories with this OpenVLA-Full-SFT model, following your README.

Here is the problem.
I fully fine-tuned the original OpenVLA checkpoint again, this time on the trajectories I collected using OpenVLA-Full-SFT (I'll call them "my data").
I then evaluated this OpenVLA-Full-SFT (w/ my data) model.
It performs very poorly.

In short:
OpenVLA-Full-SFT (w/ your data) works well,
OpenVLA-Full-SFT (w/ my data) works poorly.

Is there anything I should be careful about when collecting trajectories or when fully fine-tuning the original OpenVLA model?
Or could you please share the details of your collection process?
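
One check that might help narrow this down is comparing the action distributions of the two datasets. Below is a minimal sketch, not the repository's tooling. It assumes each dataset has already been loaded into memory as a list of episodes, where each episode is a list of action vectors. The function names `action_stats` and `compare_datasets` and the tolerance `tol` are hypothetical, and loading from RLDS is not shown.

```python
from statistics import mean, pstdev


def action_stats(episodes):
    """Per-dimension (mean, std) over every step of every episode.

    `episodes` is assumed to be a list of episodes, each a list of
    equal-length action vectors (a hypothetical in-memory format).
    """
    steps = [action for episode in episodes for action in episode]
    if not steps:
        raise ValueError("no actions found")
    dims = list(zip(*steps))  # transpose: one tuple per action dimension
    return [(mean(d), pstdev(d)) for d in dims]


def compare_datasets(reference, collected, tol=0.1):
    """Indices of action dimensions whose mean or std differ by more than `tol`."""
    ref = action_stats(reference)
    col = action_stats(collected)
    return [
        i
        for i, ((ref_mean, ref_std), (col_mean, col_std)) in enumerate(zip(ref, col))
        if abs(ref_mean - col_mean) > tol or abs(ref_std - col_std) > tol
    ]


# Toy data: dimension 0 matches across datasets, dimension 1 is shifted.
your_data = [[[0.0, 1.0], [0.2, 1.0]]]
my_data = [[[0.0, 0.0], [0.2, 0.0]]]
flagged = compare_datasets(your_data, my_data)
```

A flagged dimension would only suggest where the two datasets diverge (for example, a different action normalization or gripper convention). It would not by itself explain the performance gap.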
