DiffSensei-Unofficial

Unofficial PyTorch implementation starter for DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation (CVPR 2025).

If this repo saves you reading / reproduction time, please star it and follow @StaryMoon. I am building honest open reproduction starters for recent CVPR papers.

Status

This repository is an independent, unofficial, work-in-progress starter.

Paper: DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation
Venue: CVPR 2025
Reproduction status: the paper's benchmarks have not been reproduced yet
Relationship to authors: this repo is not official and is not affiliated with the paper authors.

What Is Implemented

This v0.1.0 starter implements a compact, readable scaffold inspired by the paper:

separate visual and language token pathways
cross-modal fusion block
understanding and generation heads
toy contrastive/generation loss
smoke-test script

The goal is to make the high-level idea easy to inspect, fork, and improve.
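To make the list above concrete, here is a minimal sketch of how such a scaffold could be wired together in PyTorch. It is not the code in this repository and not the paper's architecture: the class name `ToyFusionModel`, the function `toy_loss`, every layer choice and all sizes (embedding width 32, vocabulary 32, patch size 16, 8 text tokens) are assumptions chosen only for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyFusionModel(nn.Module):
    """Hypothetical stand-in; not the repo's UnofficialStarter or the paper's model."""

    def __init__(self, dim=32, vocab_size=32, patch=16, heads=4):
        super().__init__()
        # Visual pathway: non-overlapping image patches become visual tokens.
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        # Language pathway: token ids become language tokens.
        self.text_embed = nn.Embedding(vocab_size, dim)
        # Cross-modal fusion: language tokens attend to visual tokens.
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        # Understanding head: one pooled embedding per sample.
        self.understand_head = nn.Linear(dim, dim)
        # Generation head: per-token logits over the toy vocabulary.
        self.generate_head = nn.Linear(dim, vocab_size)

    def forward(self, image, token_ids):
        vis = self.patch_embed(image).flatten(2).transpose(1, 2)  # (B, N_vis, dim)
        txt = self.text_embed(token_ids)                          # (B, T, dim)
        attended, _ = self.cross_attn(query=txt, key=vis, value=vis)
        fused = self.norm(txt + attended)                         # (B, T, dim)
        text_vec = self.understand_head(fused.mean(dim=1))        # (B, dim)
        image_vec = vis.mean(dim=1)                               # (B, dim)
        logits = self.generate_head(fused)                        # (B, T, vocab)
        return image_vec, text_vec, logits


def toy_loss(image_vec, text_vec, logits, targets, temperature=0.1):
    """Symmetric contrastive term plus token-level cross-entropy (toy objective)."""
    img = F.normalize(image_vec, dim=-1)
    txt = F.normalize(text_vec, dim=-1)
    sim = img @ txt.t() / temperature                             # (B, B)
    labels = torch.arange(sim.size(0), device=sim.device)         # matched pairs on the diagonal
    contrastive = 0.5 * (F.cross_entropy(sim, labels) + F.cross_entropy(sim.t(), labels))
    generation = F.cross_entropy(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
    return contrastive + generation


torch.manual_seed(0)
model = ToyFusionModel()
image = torch.rand(2, 3, 64, 64)            # dummy image batch
token_ids = torch.randint(0, 32, (2, 8))    # dummy token ids
image_vec, text_vec, logits = model(image, token_ids)
loss = toy_loss(image_vec, text_vec, logits, targets=token_ids)
print("loss:", loss.item())
print("logits:", logits.shape)
```

Cross-attention from language tokens to visual tokens is only one possible fusion design; concatenating both token streams and running self-attention would be an equally plausible toy choice.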

What Is Not Implemented Yet

large language model backbone
image diffusion backend
large-scale instruction tuning
official evaluation protocol

Quick Start

git clone https://github.com/StaryMoon/DiffSensei-Unofficial.git
cd DiffSensei-Unofficial
pip install -r requirements.txt
python scripts/smoke_test.py

Expected output includes:

loss: ...
logits: torch.Size([2, 8, 32])
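A check along the following lines could turn the expected output into an automated assertion. This is a sketch only: `check_smoke_output` is a hypothetical helper, not a function in this repository, and the stand-in tensors below replace real model outputs.

```python
import torch


def check_smoke_output(loss, logits, expected_shape=(2, 8, 32)):
    """Hypothetical helper: validate the two values the smoke test reports."""
    if loss.ndim != 0 or not bool(torch.isfinite(loss)):
        raise ValueError(f"loss should be a finite scalar, got {loss}")
    if tuple(logits.shape) != expected_shape:
        raise ValueError(f"logits shape {tuple(logits.shape)} != {expected_shape}")
    return True


# Stand-in tensors with the documented shape; a real run would pass model outputs.
ok = check_smoke_output(torch.tensor(1.25), torch.zeros(2, 8, 32))
print("smoke check passed:", ok)
```

Checking the shape and finiteness rather than an exact loss value keeps the check stable across random seeds and library versions.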

Minimal Usage

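import torch
from diffsensei_unofficial import UnofficialStarter

# Dummy batch: 2 random RGB images, 64x64, values in [0, 1)
image = torch.rand(2, 3, 64, 64)
model = UnofficialStarter(kind="vlm")
# Forward pass through the toy scaffold
out = model(image)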

Roadmap

Replace toy modules with a closer implementation of the paper.
Add dataset loader and config files.
Add metric scripts and visualization.
Reproduce a small benchmark or ablation table.
Add pretrained weights once experiments are stable.

Search Tags

cvpr-2025, manga-generation, diffusion, mllm, pytorch, unofficial-implementation

Citation

Please cite the original paper if you use the method. This repo is only an unofficial starter and does not replace the paper.

License

MIT License. The original paper and official materials remain owned by their respective authors / publishers.
