You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
{{ message }}
This repository was archived by the owner on Aug 1, 2024. It is now read-only.
Firstly, thank you for your great work.I freeze the target-encoder weights, and train a decoder following the RCDM framework to map the average-pool of the target-encoder outputs back to pixel space,but i can't obtain visual results similar to those in the paper.Specifically, I transformed the output of the I-JEPA target encoder, which is a tensor of [batchsize, 256,1280], into a tensor of [batch size, 1280] through torch. mean (x, dim=1), and then trained the decoder of the generated model using the RCDM framework. Due to limitations in graphics memory and graphics card, I set the batch size to 6 instead of the default 8, and used a single GPU for training. I used imagenet as the training set and trained approximately 1000000 steps, but I did not achieve good results. Should I need to make any special settings when training the decoder? By the way, will the pre trained model for generating model decoders be released? I would greatly appreciate it if you could reply to me
Firstly, thank you for your great work.I freeze the target-encoder weights, and train a decoder following the RCDM framework to map the average-pool of the target-encoder outputs back to pixel space,but i can't obtain visual results similar to those in the paper.Specifically, I transformed the output of the I-JEPA target encoder, which is a tensor of [batchsize, 256,1280], into a tensor of [batch size, 1280] through torch. mean (x, dim=1), and then trained the decoder of the generated model using the RCDM framework. Due to limitations in graphics memory and graphics card, I set the batch size to 6 instead of the default 8, and used a single GPU for training. I used imagenet as the training set and trained approximately 1000000 steps, but I did not achieve good results. Should I need to make any special settings when training the decoder? By the way, will the pre trained model for generating model decoders be released? I would greatly appreciate it if you could reply to me