Collecting trajectories in SimplerEnv to make SFT datasets #13

Hello,
Thank you for sharing your great work!

I have a question about collecting trajectories in the SimplerEnv-Bridge simulation.

I fully fine-tuned the original OpenVLA checkpoint on the RLDS dataset you uploaded in issue #8 (let me call it **your data**).
The fully fine-tuned model (let me call it OpenVLA-Full-SFT) performs well, almost on par with your LoRA fine-tuned model on Hugging Face. (Great! Thank you!)

I then collected trajectories with this OpenVLA-Full-SFT model, following your README.

Here is the problem.
I fully fine-tuned the original OpenVLA checkpoint on the trajectories I collected with OpenVLA-Full-SFT (let me call it **my data**).
I then evaluated this OpenVLA-Full-SFT (w/ my data) model,
but it performs very poorly.

In short:
OpenVLA-Full-SFT (w/ your data) works well,
OpenVLA-Full-SFT (w/ my data) works poorly.

Is there anything I should be careful about when collecting trajectories or when fully fine-tuning the original OpenVLA model?
Or could you please share the details of your collection process?
