Support for ToriiGate 0.5#24
Conversation
WalkthroughA new ToriiGate 0.5 model configuration is introduced with a corresponding YAML manifest, and the associated wrapper is updated to use a different model loader and refine caption post-processing to handle multiple preamble format variants. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning Review ran into problems🔥 ProblemsGit: Failed to clone repository. Please run the Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/wrappers/toriigate.py (1)
55-59:⚠️ Potential issue | 🟠 MajorRemove
.to()call on 4-bit quantized model; usedevice_mapduring loading instead.Calling
.to(self.device)on models loaded withBitsAndBytesConfig(load_in_4bit=True)is unsupported and breaks quantization state management—resulting in runtime failures during inference. Usedevice_map="auto"infrom_pretrainedinstead, which handles device placement during loading.Note:
dtype=is valid in current Transformers versions and works alongsidequantization_config.Proposed fix
- self.model = AutoModelForImageTextToText.from_pretrained( - model_path, - dtype=torch.bfloat16, - quantization_config=nf4_config, - ).to(self.device) + load_kwargs = { + "torch_dtype": torch.bfloat16, + "quantization_config": nf4_config, + } + if self.device == "cuda": + load_kwargs["device_map"] = "auto" + + self.model = AutoModelForImageTextToText.from_pretrained( + model_path, + **load_kwargs, + )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/wrappers/toriigate.py` around lines 55 - 59, The call that moves the 4-bit quantized model (self.model = AutoModelForImageTextToText.from_pretrained(...).to(self.device)) breaks bnb quantization state; change the loader to let Transformers place the model by adding device_map="auto" to AutoModelForImageTextToText.from_pretrained (keep dtype=torch.bfloat16 and quantization_config=nf4_config) and remove the trailing .to(self.device); also search for any other explicit .to(self.device) calls on this self.model and remove them so device placement is handled by device_map.
🧹 Nitpick comments (1)
src/config/models/ToriiGate0.5.yaml (1)
2-3: Consider versioned display naming for clarity.If multiple ToriiGate variants are intended to coexist, a versioned
name/id(for example,ToriiGate 0.5) avoids UI and selection ambiguity.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/config/models/ToriiGate0.5.yaml` around lines 2 - 3, The display name and identifier in ToriiGate0.5.yaml are ambiguous; update the name and id fields (currently name and id) to include the version (e.g., change name to "ToriiGate 0.5" and id to a matching versioned identifier like "ToriiGate-0.5") to avoid UI/selection collisions, and search for any references to the old id/name elsewhere (configs, registries or lookups) to update them accordingly so the variant remains uniquely addressable.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/wrappers/toriigate.py`:
- Around line 111-114: The current split unconditionally removes the substring
"assistant. " anywhere in text, which can truncate valid content; update the
logic to only strip when that token is a leading preamble—e.g., check the
trimmed start of text for a speaker prefix (use
text.lstrip().lower().startswith("assistant.") or
text.lstrip().startswith("Assistant:") or a small regex that matches a leading
"assistant." or "Assistant:" optionally followed by whitespace) before
splitting; operate on the leading match and then strip the remainder from the
variable text so in-line occurrences are not removed.
---
Outside diff comments:
In `@src/wrappers/toriigate.py`:
- Around line 55-59: The call that moves the 4-bit quantized model (self.model =
AutoModelForImageTextToText.from_pretrained(...).to(self.device)) breaks bnb
quantization state; change the loader to let Transformers place the model by
adding device_map="auto" to AutoModelForImageTextToText.from_pretrained (keep
dtype=torch.bfloat16 and quantization_config=nf4_config) and remove the trailing
.to(self.device); also search for any other explicit .to(self.device) calls on
this self.model and remove them so device placement is handled by device_map.
---
Nitpick comments:
In `@src/config/models/ToriiGate0.5.yaml`:
- Around line 2-3: The display name and identifier in ToriiGate0.5.yaml are
ambiguous; update the name and id fields (currently name and id) to include the
version (e.g., change name to "ToriiGate 0.5" and id to a matching versioned
identifier like "ToriiGate-0.5") to avoid UI/selection collisions, and search
for any references to the old id/name elsewhere (configs, registries or lookups)
to update them accordingly so the variant remains uniquely addressable.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 3a411945-cb64-4435-852e-79818b1e45b3
📒 Files selected for processing (2)
src/config/models/ToriiGate0.5.yamlsrc/wrappers/toriigate.py
| if "assistant. " in text: | ||
| text = text.split("assistant. ", 1)[-1].strip() | ||
| elif "Assistant:" in text: | ||
| text = text.split("Assistant:", 1)[-1].strip() |
There was a problem hiding this comment.
Only strip assistant preamble when it is actually a preamble.
Current logic splits on "assistant. " anywhere in the text. If that token appears in valid caption content, output gets truncated.
Proposed patch
- if "assistant. " in text:
- text = text.split("assistant. ", 1)[-1].strip()
- elif "Assistant:" in text:
- text = text.split("Assistant:", 1)[-1].strip()
+ normalized = text.lstrip()
+ if normalized.lower().startswith("assistant. "):
+ text = normalized[len("assistant. "):].strip()
+ elif normalized.startswith("Assistant:"):
+ text = normalized[len("Assistant:"):].strip()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/wrappers/toriigate.py` around lines 111 - 114, The current split
unconditionally removes the substring "assistant. " anywhere in text, which can
truncate valid content; update the logic to only strip when that token is a
leading preamble—e.g., check the trimmed start of text for a speaker prefix (use
text.lstrip().lower().startswith("assistant.") or
text.lstrip().startswith("Assistant:") or a small regex that matches a leading
"assistant." or "Assistant:" optionally followed by whitespace) before
splitting; operate on the leading match and then strip the remainder from the
variable text so in-line occurrences are not removed.
Functions for me locally, an adjustment was made to code to support newer transformers.
Summary by CodeRabbit