generative model decoder for target-encoder

Firstly, thank you for your great work.

I froze the target-encoder weights and trained a decoder following the RCDM framework to map the average-pooled target-encoder outputs back to pixel space, but I can't obtain visual results similar to those in the paper.

Specifically, I reduced the output of the I-JEPA target encoder, a tensor of shape [batch_size, 256, 1280], to a tensor of shape [batch_size, 1280] with torch.mean(x, dim=1), and then trained the generative model's decoder using the RCDM framework. Because of GPU memory limits, I set the batch size to 6 instead of the default 8 and trained on a single GPU. I used ImageNet as the training set and trained for approximately 1,000,000 steps, but I did not get good results.

Do I need any special settings when training the decoder? Also, will the pretrained generative decoder model be released? I would greatly appreciate a reply.
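For reference, the pooling step described above amounts to roughly the following minimal sketch. The random tensor only stands in for the frozen target-encoder output so the shapes can be checked, and the function name pool_patch_tokens is hypothetical; it is not taken from the I-JEPA or RCDM code.

```python
import torch


def pool_patch_tokens(tokens: torch.Tensor) -> torch.Tensor:
    """Average over the patch dimension: [B, N, D] -> [B, D]."""
    if tokens.dim() != 3:
        raise ValueError(f"expected a [B, N, D] tensor, got shape {tuple(tokens.shape)}")
    return tokens.mean(dim=1)


# Shapes from the post: batch size 6, 256 patch tokens, 1280-dim embeddings.
batch_size, num_patches, embed_dim = 6, 256, 1280

# The encoder is frozen, so no gradients are needed for its output.
with torch.no_grad():
    # Placeholder for the target-encoder output (random values, shape check only).
    tokens = torch.randn(batch_size, num_patches, embed_dim)
    cond = pool_patch_tokens(tokens)

print(tuple(cond.shape))
```

The resulting [batch_size, 1280] tensor is what I pass to the decoder as the conditioning vector.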

generative model decoder for target-encoder #56
