feat: add OWASP Agentic Top 10 automated test harness (closes #152)#436
feat: add OWASP Agentic Top 10 automated test harness (closes #152)#436arcgod-design wants to merge 4 commits into
Conversation
|
Warning Review limit reached
More reviews will be available in 10 minutes and 32 seconds. Learn how PR review limits work. Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file). ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits. 🚦 How do rate limits work?CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate. For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (4)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
🧪 PR Test Results
Python 3.12 · commit 4b6cd50 |
|
@arcgod-design merge ready resolve conflicts |
9aa027e to
16b6be4
Compare
|
@arcgod-design fix ci please and join discord |
Summary
Adds a standalone CLI test harness that validates AgentWatch against OWASP Agentic Top 10 exploits, running prompt injections, path traversals, and shell exploit commands through the detection layer.
Changes
scripts/owasp_test_harness.py— standalone CLI script with 19 OWASP attack vectorstests/test_owasp_harness.py— 12 tests covering scanner, red-team, and CLI integration--jsonoutput for CI consumption--fail-fastto stop on first critical findingUsage
Testing
OWASP Vectors Covered
Impact
Provides a repeatable, automated way to validate AgentWatch's security defenses against the OWASP Agentic Top 10. Can be integrated into CI pipelines to catch regressions in safety detection coverage.