
feat: Enterprise: Add agentwatch shield command to block zero-day prompt injections dynamically #433

Open
SHAURYASANYAL3 wants to merge 2 commits into sreerevanth:main from SHAURYASANYAL3:feat/issue-419


SHAURYASANYAL3 commented Jun 18, 2026

Contributor

Resolves #419

Overview

This pull request adds the agentwatch shield command to the AgentWatch CLI, implementing the feature requested in #419 (Enterprise: Add agentwatch shield command to block zero-day prompt injections dynamically).


Why do we need this?

For a 5-year-old: We need a superhero shield command to block bad guys who try to trick our robot into doing bad things!

For developers: Static safety checks aren't enough to stop advanced prompt injections (e.g., DAN-style jailbreaks). We need a dynamic, LLM-based firewall for incoming requests.

What is it?

A new CLI command, agentwatch shield enable, that activates an inline threat-intelligence gateway. The gateway intercepts prompts before they reach the model. This is a paid Enterprise feature.

Suggestions for Implementation

  • Integrate with external threat-intelligence APIs or a dedicated fast-filtering ML model.
  • Append RiskLevel.CRITICAL to SafetyCheckData when a zero-day payload is detected.
  • Maintain a highly optimized cache to prevent adding latency to standard prompts.
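
The three suggestions above could fit together roughly as follows. This is only a sketch under stated assumptions, not the code in this PR: RiskLevel and SafetyCheckData are hypothetical stand-ins for the real AgentWatch types (their actual fields are not shown here), and classify_prompt, shield_check, and SUSPICIOUS_MARKERS are made-up names. A substring match stands in for the external threat-intelligence API or fast-filtering model.

```python
from dataclasses import dataclass, field
from enum import Enum
from functools import lru_cache


class RiskLevel(Enum):
    # Hypothetical stand-in; the real enum is defined in AgentWatch.
    LOW = "low"
    CRITICAL = "critical"


@dataclass
class SafetyCheckData:
    # Hypothetical stand-in holding only the field this sketch needs.
    risk_levels: list[RiskLevel] = field(default_factory=list)


# Placeholder signatures. A real gateway would call a threat-intelligence
# API or a fast-filtering model instead of matching substrings.
SUSPICIOUS_MARKERS = ("ignore previous instructions", "you are now dan")


@lru_cache(maxsize=4096)
def classify_prompt(normalized_prompt: str) -> RiskLevel:
    """Classify a normalized prompt; repeated prompts are served from the cache."""
    if any(marker in normalized_prompt for marker in SUSPICIOUS_MARKERS):
        return RiskLevel.CRITICAL
    return RiskLevel.LOW


def shield_check(prompt: str, data: SafetyCheckData) -> bool:
    """Return True if the prompt may proceed to the model, False if it is blocked."""
    # Normalize case and whitespace so trivial variations share a cache entry.
    normalized = " ".join(prompt.lower().split())
    if classify_prompt(normalized) is RiskLevel.CRITICAL:
        data.risk_levels.append(RiskLevel.CRITICAL)
        return False
    return True
```

Caching the verdict per normalized prompt means a prompt that has already been classified skips the expensive check entirely, which is one way to keep latency low for standard traffic. A production cache would likely also need expiry so that new threat signatures take effect.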

Implementation Notes 🛠️

  • Implemented with the typer framework in agentwatch/cli/main.py.
  • Renders its terminal response with rich.
  • Passes the pytest suite, including conditional dependency checks.
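
For readers unfamiliar with typer sub-commands, the wiring described above might look something like the sketch below. It is an illustrative stub and not the contents of agentwatch/cli/main.py: the names app, shield_app, and shield_enable, the help strings, and the panel text are all assumptions, and the command body only prints a confirmation rather than activating any gateway.

```python
import typer
from rich.console import Console
from rich.panel import Panel

# Illustrative root app; the real AgentWatch app object may be named differently.
app = typer.Typer(help="AgentWatch CLI (illustrative stub).")

# A sub-application mounted under the name "shield", so that the command
# below is reachable as: agentwatch shield enable
shield_app = typer.Typer(help="Manage the prompt-injection shield.")
app.add_typer(shield_app, name="shield")


@shield_app.command("enable")
def shield_enable() -> None:
    """Activate the shield (hypothetical body: only prints a confirmation)."""
    # Create the console at call time so output follows the current stdout.
    console = Console()
    console.print(Panel("Shield enabled", title="AgentWatch Shield"))
```

Calling app() from the package's console entry point would then expose the command on the command line. In tests, typer.testing.CliRunner can invoke it with the arguments ["shield", "enable"] without spawning a subprocess.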


coderabbitai Bot commented Jun 18, 2026

Contributor

Warning

Review limit reached

@SHAURYASANYAL3, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 52 minutes and 47 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 84ded470-342a-4a69-9858-4b0ad082447b

📥 Commits

Reviewing files that changed from the base of the PR and between 19bbbeb and 718851f.

📒 Files selected for processing (2)
  • agentwatch/cli/main.py
  • tests/test_protocol.py


github-actions Bot commented Jun 18, 2026


🧪 PR Test Results

Check                    Result
Tests (pytest tests/)    ✅ success
Lint (ruff check .)      ❌ failure
Coverage (agentwatch)    74.05%

Python 3.12 · commit 718851f

SHAURYASANYAL3 changed the title from "Fixes #419: Add shield command" to "feat: Enterprise: Add agentwatch shield command to block zero-day prompt injections dynamically" on Jun 18, 2026
