support ernie-image-turbo #1391
Code Review
This pull request updates the ERNIE-Image model provider from 'baidu' to 'PaddlePaddle' across documentation, configurations, and example scripts, and introduces the 'ERNIE-Image-Turbo' model. Technical changes include updating the FlowMatchScheduler to support a shift parameter for ERNIE-Image timesteps and removing the uniform training weight logic. A review comment suggests adding a docstring to the updated set_timesteps_ernie_image function and simplifying a redundant null check for the shift parameter.
 def set_timesteps_ernie_image(num_inference_steps=50, denoising_strength=1.0, shift=3.0):
     sigma_min = 0.0
     sigma_max = 1.0
     num_train_timesteps = 1000
-    sigma_start = denoising_strength
-    sigmas = torch.linspace(sigma_start, 0.0, num_inference_steps + 1)[:-1]
+    sigma_start = sigma_min + (sigma_max - sigma_min) * denoising_strength
+    sigmas = torch.linspace(sigma_start, sigma_min, num_inference_steps + 1)[:-1]
     if shift is not None and shift != 1.0:
         sigmas = shift * sigmas / (1 + (shift - 1) * sigmas)
     timesteps = sigmas * num_train_timesteps
     return sigmas, timesteps
The function set_timesteps_ernie_image has been updated, but it's missing a docstring. It's good practice to add one to explain its purpose and parameters, especially with the new shift parameter. Additionally, the condition if shift is not None and shift != 1.0: can be simplified since shift has a non-None default value.
def set_timesteps_ernie_image(num_inference_steps=50, denoising_strength=1.0, shift=3.0):
    """Sets the timesteps for the ERNIE-Image scheduler, with optional sigma shifting."""
    sigma_min = 0.0
    sigma_max = 1.0
    num_train_timesteps = 1000
    sigma_start = sigma_min + (sigma_max - sigma_min) * denoising_strength
    sigmas = torch.linspace(sigma_start, sigma_min, num_inference_steps + 1)[:-1]
    if shift != 1.0:
        sigmas = shift * sigmas / (1 + (shift - 1) * sigmas)
    timesteps = sigmas * num_train_timesteps
    return sigmas, timesteps
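To make the effect of the shift concrete, here is a minimal torch-free sketch of the same arithmetic. The helper name `shifted_sigmas` and its list-based implementation are illustrative assumptions, not part of this PR; only the formula is taken from the function above.

```python
def shifted_sigmas(num_inference_steps=50, denoising_strength=1.0, shift=3.0,
                   num_train_timesteps=1000):
    """Pure-Python illustration of the sigma schedule discussed above.

    Hypothetical helper (not part of the PR): takes num_inference_steps
    evenly spaced sigmas from sigma_start down toward sigma_min with the
    endpoint excluded, then applies the shift formula.
    """
    sigma_min, sigma_max = 0.0, 1.0
    sigma_start = sigma_min + (sigma_max - sigma_min) * denoising_strength
    # Same spacing as linspace(sigma_start, sigma_min, n + 1)[:-1]
    step = (sigma_start - sigma_min) / num_inference_steps
    sigmas = [sigma_start - i * step for i in range(num_inference_steps)]
    # With shift == 1.0 this expression reduces to s, so it is applied unconditionally here
    sigmas = [shift * s / (1 + (shift - 1) * s) for s in sigmas]
    timesteps = [s * num_train_timesteps for s in sigmas]
    return sigmas, timesteps


sigmas, timesteps = shifted_sigmas(num_inference_steps=4)
print(sigmas, timesteps)
```

With four steps and the default `shift=3.0`, the unshifted sigmas 1.0, 0.75, 0.5, 0.25 map to roughly 1.0, 0.9, 0.75, 0.5, so the schedule spends more of its steps at high noise levels. Because the formula is the identity at `shift=1.0`, the `shift != 1.0` guard only skips a no-op.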
    rand_device: str = "cuda",
    # Steps
    num_inference_steps: int = 50,
    scheduler_shift: float = 3.0,
Rename scheduler_shift to sigma_shift for consistency with other pipelines.
pipe = ErnieImagePipeline.from_pretrained(
    torch_dtype=torch.bfloat16,
    device='cuda',
    model_configs=[
In ERNIE-Image-Turbo, the text_encoder, vae, and tokenizer are the same as in ERNIE-Image. Please load them from PaddlePaddle/ERNIE-Image to avoid downloading them twice, and use PaddlePaddle/ERNIE-Image-Turbo only for the transformer backbone.
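One way to picture the requested layout is a component-to-repository map. The sketch below is illustrative only: `component_sources` is a hypothetical helper, the component keys are taken from the comment above, and it deliberately leaves out the real model config arguments and file patterns, which are not shown in this thread.

```python
BASE_MODEL_ID = "PaddlePaddle/ERNIE-Image"
TURBO_MODEL_ID = "PaddlePaddle/ERNIE-Image-Turbo"


def component_sources(backbone_model_id, shared_model_id=BASE_MODEL_ID):
    """Hypothetical helper: map each pipeline component to the repo it loads from."""
    return {
        # Only the backbone differs between ERNIE-Image and ERNIE-Image-Turbo
        "transformer": backbone_model_id,
        # Shared components come from the base repo, so they are downloaded once
        "text_encoder": shared_model_id,
        "vae": shared_model_id,
        "tokenizer": shared_model_id,
    }


sources = component_sources(TURBO_MODEL_ID)
for component, model_id in sources.items():
    print(f"{component}: {model_id}")
```

A user who has already run the ERNIE-Image example would then only fetch the Turbo transformer weights.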
 | Model ID | Inference | Low-VRAM Inference | Full Training | Validation After Full Training | LoRA Training | Validation After LoRA Training |
 |-|-|-|-|-|-|-|
-|[baidu/ERNIE-Image: T2I](https://www.modelscope.cn/models/baidu/ERNIE-Image)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/ernie_image/model_inference/Ernie-Image-T2I.py)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/ernie_image/model_inference_low_vram/Ernie-Image-T2I.py)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/ernie_image/model_training/full/Ernie-Image-T2I.sh)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/ernie_image/model_training/validate_full/Ernie-Image-T2I.py)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/ernie_image/model_training/lora/Ernie-Image-T2I.sh)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/ernie_image/model_training/validate_lora/Ernie-Image-T2I.py)|
+|[PaddlePaddle/ERNIE-Image: T2I](https://www.modelscope.cn/models/PaddlePaddle/ERNIE-Image)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/ernie_image/model_inference/Ernie-Image-T2I.py)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/ernie_image/model_inference_low_vram/Ernie-Image-T2I.py)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/ernie_image/model_training/full/Ernie-Image-T2I.sh)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/ernie_image/model_training/validate_full/Ernie-Image-T2I.py)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/ernie_image/model_training/lora/Ernie-Image-T2I.sh)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/ernie_image/model_training/validate_lora/Ernie-Image-T2I.py)|
The example file names should be consistent with the model IDs, i.e., ERNIE-Image.py and ERNIE-Image-Turbo.py.
No description provided.