Support for ToriiGate 0.5 #24

Open
404-brittle wants to merge 1 commit into MNeMoNiCuZ:main from 404-brittle:main

404-brittle commented Apr 11, 2026

Works for me locally. I adjusted the code to support newer versions of transformers.

Summary by CodeRabbit

  • New Features
    • Added ToriiGate 0.5 model for image captioning with three preset options: Simple Caption, Detailed Caption, and JSON-like formats.
    • Enhanced caption generation with improved output processing.

coderabbitai Bot commented Apr 11, 2026

Walkthrough

A new ToriiGate 0.5 model configuration is introduced with a corresponding YAML manifest, and the associated wrapper is updated to use a different model loader and refine caption post-processing to handle multiple preamble format variants.

Changes

| Cohort / File(s) | Summary |
|---|---|
| ToriiGate Configuration: src/config/models/ToriiGate0.5.yaml | New model configuration declaring ToriiGate wrapper binding, supported features (task prompt, prompt presets/templates, system prompt), generation defaults (max_tokens: 500), three prompt presets (Simple/Detailed/JSON-like Caption), VRAM recommendations, and feature UI layout configuration. |
| ToriiGate Wrapper Implementation: src/wrappers/toriigate.py | Model loading updated from AutoModelForVision2Seq to AutoModelForImageTextToText with 4-bit NF4 quantization preserved; caption post-processing refactored to strip either lowercase "assistant. " or uppercase "Assistant:" preamble markers with fallback logic. |
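The loader change summarized above could be made tolerant of older Transformers installs with a small fallback. The following is only a sketch, not code from this PR: the helper name `resolve_image_text_auto_class` is hypothetical, and the demo uses stand-in namespaces instead of the real `transformers` module so it runs without any heavy dependencies.

```python
from types import SimpleNamespace


def resolve_image_text_auto_class(module):
    """Return the preferred image-text auto class exposed by `module`.

    Tries the newer class name first and falls back to the older one.
    `module` is expected to be the imported `transformers` package
    (an assumption; any object with these attributes works).
    """
    for name in ("AutoModelForImageTextToText", "AutoModelForVision2Seq"):
        cls = getattr(module, name, None)
        if cls is not None:
            return cls
    raise ImportError("no image-text auto class found on the given module")


# Stand-ins for a newer and an older install (hypothetical, for illustration).
newer = SimpleNamespace(
    AutoModelForImageTextToText="ImageTextToText",
    AutoModelForVision2Seq="Vision2Seq",
)
older = SimpleNamespace(AutoModelForVision2Seq="Vision2Seq")

picked_on_newer = resolve_image_text_auto_class(newer)
picked_on_older = resolve_image_text_auto_class(older)
```

In the wrapper this would presumably be called once as `resolve_image_text_auto_class(transformers)` after `import transformers`; whether the project wants to keep supporting older releases at all is a maintainer decision.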

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Poem

🎌 A gate of torii stands tall and true,
With captions flowing, both old and new,
From vision to text, in elegant prose,
Three simple presets—your config now knows! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the primary change: adding support for ToriiGate 0.5 model, which is the main objective of the pull request. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.

coderabbitai Bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/wrappers/toriigate.py (1)

55-59: ⚠️ Potential issue | 🟠 Major

Remove .to() call on 4-bit quantized model; use device_map during loading instead.

Calling .to(self.device) on models loaded with BitsAndBytesConfig(load_in_4bit=True) is unsupported and breaks quantization state management—resulting in runtime failures during inference. Use device_map="auto" in from_pretrained instead, which handles device placement during loading.

Note: dtype= is valid in current Transformers versions and works alongside quantization_config.

Proposed fix
-        self.model = AutoModelForImageTextToText.from_pretrained(
-            model_path,
-            dtype=torch.bfloat16,
-            quantization_config=nf4_config,
-        ).to(self.device)
+        load_kwargs = {
+            "dtype": torch.bfloat16,
+            "quantization_config": nf4_config,
+        }
+        if self.device == "cuda":
+            load_kwargs["device_map"] = "auto"
+
+        self.model = AutoModelForImageTextToText.from_pretrained(
+            model_path,
+            **load_kwargs,
+        )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/wrappers/toriigate.py` around lines 55 - 59, The call that moves the
4-bit quantized model (self.model =
AutoModelForImageTextToText.from_pretrained(...).to(self.device)) breaks bnb
quantization state; change the loader to let Transformers place the model by
adding device_map="auto" to AutoModelForImageTextToText.from_pretrained (keep
dtype=torch.bfloat16 and quantization_config=nf4_config) and remove the trailing
.to(self.device); also search for any other explicit .to(self.device) calls on
this self.model and remove them so device placement is handled by device_map.
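The placement rule described in this comment can be isolated into a small helper that decides which keyword arguments to pass to `from_pretrained` and whether a manual `.to(device)` is still appropriate. This is a minimal sketch under stated assumptions: `build_load_kwargs` and `needs_manual_move` are hypothetical names, not part of the PR or of Transformers, and the helper only builds a dictionary so it can be checked without loading a model.

```python
from typing import Any, Dict, Optional


def build_load_kwargs(
    device: Any,
    quantization_config: Optional[Any] = None,
    dtype: Optional[Any] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for a from_pretrained call.

    When a quantization config is given and the target is a CUDA device,
    placement is delegated to device_map="auto" instead of a later .to().
    """
    kwargs: Dict[str, Any] = {}
    if dtype is not None:
        kwargs["dtype"] = dtype
    if quantization_config is not None:
        kwargs["quantization_config"] = quantization_config
        # str() lets this accept either a plain string or a torch.device.
        if str(device).startswith("cuda"):
            kwargs["device_map"] = "auto"
    return kwargs


def needs_manual_move(kwargs: Dict[str, Any]) -> bool:
    """Return True only when the model is not quantized, so .to() is safe."""
    return "quantization_config" not in kwargs


# Placeholder objects stand in for nf4_config and torch.bfloat16.
quantized = build_load_kwargs("cuda", quantization_config=object(), dtype="bf16")
plain = build_load_kwargs("cpu", dtype="bf16")
```

With this shape the wrapper would call `.to(self.device)` only when `needs_manual_move(load_kwargs)` is true, which keeps the non-quantized path unchanged.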
🧹 Nitpick comments (1)
src/config/models/ToriiGate0.5.yaml (1)

2-3: Consider versioned display naming for clarity.

If multiple ToriiGate variants are intended to coexist, a versioned name/id (for example, ToriiGate 0.5) avoids UI and selection ambiguity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/config/models/ToriiGate0.5.yaml` around lines 2 - 3, The display name and
identifier in ToriiGate0.5.yaml are ambiguous; update the name and id fields
(currently name and id) to include the version (e.g., change name to "ToriiGate
0.5" and id to a matching versioned identifier like "ToriiGate-0.5") to avoid
UI/selection collisions, and search for any references to the old id/name
elsewhere (configs, registries or lookups) to update them accordingly so the
variant remains uniquely addressable.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3a411945-cb64-4435-852e-79818b1e45b3

📥 Commits

Reviewing files that changed from the base of the PR and between 20e012a and 89a2bc7.

📒 Files selected for processing (2)
  • src/config/models/ToriiGate0.5.yaml
  • src/wrappers/toriigate.py

Comment thread src/wrappers/toriigate.py
Comment on lines +111 to 114
if "assistant. " in text:
    text = text.split("assistant. ", 1)[-1].strip()
elif "Assistant:" in text:
    text = text.split("Assistant:", 1)[-1].strip()

⚠️ Potential issue | 🟡 Minor

Only strip assistant preamble when it is actually a preamble.

Current logic splits on "assistant. " anywhere in the text. If that token appears in valid caption content, output gets truncated.

Proposed patch
-            if "assistant. " in text:
-                text = text.split("assistant. ", 1)[-1].strip()
-            elif "Assistant:" in text:
-                text = text.split("Assistant:", 1)[-1].strip()
+            normalized = text.lstrip()
+            if normalized.lower().startswith("assistant. "):
+                text = normalized[len("assistant. "):].strip()
+            elif normalized.startswith("Assistant:"):
+                text = normalized[len("Assistant:"):].strip()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/wrappers/toriigate.py` around lines 111 - 114, The current split
unconditionally removes the substring "assistant. " anywhere in text, which can
truncate valid content; update the logic to only strip when that token is a
leading preamble—e.g., check the trimmed start of text for a speaker prefix (use
text.lstrip().lower().startswith("assistant.") or
text.lstrip().startswith("Assistant:") or a small regex that matches a leading
"assistant." or "Assistant:" optionally followed by whitespace) before
splitting; operate on the leading match and then strip the remainder from the
variable text so in-line occurrences are not removed.
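The regex variant mentioned in the prompt above might look like the following. It is a sketch, not the wrapper's actual code: `strip_assistant_preamble` is a hypothetical helper, and it assumes the decoded text holds only the generated reply. If the decoded text still contains the full chat transcript (for example a user turn before the assistant marker), a leading-only match would not remove it and the original split-based approach may be what is actually needed.

```python
import re

# Matches a leading "assistant." or "assistant:" (any case) plus
# surrounding whitespace. Anchored at the start of the string only.
_PREAMBLE = re.compile(r"^\s*assistant[.:]\s*", re.IGNORECASE)


def strip_assistant_preamble(text: str) -> str:
    """Remove a leading speaker prefix, leaving in-line occurrences intact."""
    return _PREAMBLE.sub("", text, count=1).strip()


lower_case = strip_assistant_preamble("assistant. A cat sits on a mat.")
upper_case = strip_assistant_preamble("Assistant: A cat sits on a mat.")
inline = strip_assistant_preamble("The assistant. stands near the door.")
```

Here `lower_case` and `upper_case` both reduce to the bare caption, while `inline` is returned unchanged because the marker is not at the start of the text.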
