This repository was archived by the owner on Aug 1, 2024. It is now read-only.

generative model decoder for target-encoder #56

@sls619

First of all, thank you for your great work. I froze the target-encoder weights and trained a decoder following the RCDM framework to map the average-pooled target-encoder outputs back to pixel space, but I can't obtain visual results similar to those in the paper.

Specifically, I took the output of the I-JEPA target encoder, a tensor of shape [batch_size, 256, 1280], and reduced it to a tensor of shape [batch_size, 1280] with torch.mean(x, dim=1). I then trained the decoder of the generative model on these representations using the RCDM framework. Due to GPU memory limits, I set the batch size to 6 instead of the default 8 and trained on a single GPU. I used ImageNet as the training set and trained for approximately 1,000,000 steps, but I did not achieve good results.

Do I need any special settings when training the decoder? Also, will the pretrained generative decoder be released? I would greatly appreciate a reply.
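
The pooling step described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the random tensor is a stand-in for the real frozen target-encoder output, and the variable names (x, cond) are hypothetical. It is not the actual training code and does not cover the RCDM decoder.

```python
import torch

# Stand-in for the frozen target-encoder output: random values with the
# shape reported in the post, [batch_size, num_patches, embed_dim].
batch_size, num_patches, embed_dim = 6, 256, 1280
x = torch.randn(batch_size, num_patches, embed_dim)

# Average over the patch dimension (dim=1) to get one vector per image.
# no_grad mirrors the frozen encoder: no gradients flow through this step.
with torch.no_grad():
    cond = torch.mean(x, dim=1)

print(cond.shape)  # torch.Size([6, 1280])
```

Each row of cond is the mean of the 256 patch embeddings for one image, which is the [batch_size, 1280] conditioning vector described in the post.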
