feat: add fp8 optimization support for transformer model #6
Open
eric-gitta-moore wants to merge 6 commits into brandon929:main
- Implement fp8 quantization utilities for linear layers
- Add fp8 optimization option to gradio demo interface
- Modify worker function to handle fp8 optimized state dict
- Include monkey patching for fp8 linear layer forward pass
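For readers unfamiliar with the technique, here is a rough sketch of per-tensor scaled fp8 quantization combined with a monkey-patched forward pass. It is not the code from this PR: it is pure Python so it runs without a GPU, `ToyLinear`, `round_to_e4m3`, `quantize_weight`, and `apply_fp8_patch` are hypothetical names, and the e4m3 rounding is simulated with floats rather than stored in a real fp8 dtype such as `torch.float8_e4m3fn`.

```python
import math
import types

FP8_E4M3_MAX = 448.0  # largest finite magnitude in the e4m3fn format


def round_to_e4m3(x: float) -> float:
    """Simulate e4m3 rounding: saturate, then keep 4 significant bits."""
    if x == 0.0:
        return 0.0
    x = max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, x))
    _, exponent = math.frexp(x)       # x = m * 2**exponent with 0.5 <= |m| < 1
    exponent = max(exponent, -5)      # below 2**-6 the format is subnormal
    step = 2.0 ** (exponent - 4)      # spacing between representable values
    return round(x / step) * step


def quantize_weight(weight):
    """Per-tensor scaling: map the largest magnitude onto the fp8 maximum."""
    amax = max(abs(w) for row in weight for w in row)
    scale = amax / FP8_E4M3_MAX if amax > 0 else 1.0
    quantized = [[round_to_e4m3(w / scale) for w in row] for row in weight]
    return quantized, scale


class ToyLinear:
    """Hypothetical stand-in for a linear layer (weight is out x in)."""

    def __init__(self, weight, bias):
        self.weight = weight
        self.bias = bias

    def forward(self, x):
        return [sum(w * xi for w, xi in zip(row, x)) + b
                for row, b in zip(self.weight, self.bias)]


def fp8_forward(self, x):
    """Replacement forward: rescale the quantized weights on the fly."""
    return [self.scale * sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(self.weight, self.bias)]


def apply_fp8_patch(layer):
    """Quantize the weights in place and swap in the patched forward."""
    layer.weight, layer.scale = quantize_weight(layer.weight)
    layer.forward = types.MethodType(fp8_forward, layer)
    return layer


layer = ToyLinear([[0.1, -0.2], [0.3, 0.4]], [0.0, 1.0])
reference = layer.forward([1.0, 2.0])
patched = apply_fp8_patch(layer).forward([1.0, 2.0])
```

The patched output should stay close to `reference` but not match it exactly, since each weight keeps only about four significant bits. A real implementation would keep the weights in an fp8 tensor to save memory; the sketch only illustrates the scale, quantize, and patch steps.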
Add --offline flag to load models from local cache instead of downloading from HuggingFace hub. This enables usage in environments with restricted internet access.
Fix the --offline argument by removing the incorrect store_true action, and lower the minimum value of the gpu_memory_preservation slider from 6 to 0 for better flexibility in low-memory scenarios.
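As an illustration only, an offline switch like this could be wired up roughly as follows. The PR's actual argument declaration is not shown here, and the commit above changed its action, so the plain boolean flag below is an assumption, as are the helper names `build_parser` and `hub_kwargs`. The one real API detail relied on is the `local_files_only` keyword accepted by Hugging Face `from_pretrained` methods.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI with a single boolean --offline switch."""
    parser = argparse.ArgumentParser(description="demo launcher (sketch)")
    parser.add_argument(
        "--offline",
        action="store_true",  # assumption: a plain on/off flag
        help="load models from the local cache instead of the hub",
    )
    return parser


def hub_kwargs(offline: bool) -> dict:
    """Keyword arguments to forward to each from_pretrained(...) call."""
    return {"local_files_only": offline}


args = build_parser().parse_args(["--offline"])
kwargs = hub_kwargs(args.offline)
```

Each model load would then receive `**kwargs`, so a single flag controls every download attempt.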
The FP8 optimization checkbox was disabled by default, which may lead to suboptimal performance for users who are unaware of this setting. Enabling it by default ensures better performance out of the box.
merge https://github.com/kohya-ss/FramePack-LoRAReady/blob/main/utils/fp8_optimization_utils.py