
docs: reframe WAF effectiveness as reported observations, not pass/fail gates #228

Merged
bihius merged 1 commit into main from zealous-hertz-ae62a8 on Jun 6, 2026
@bihius
Owner

@bihius bihius commented Jun 6, 2026

Summary

  • Remove the Detection Targets (draft) block from README.testing.md. Its TPR >95% and FPR <10% thresholds were framed as CI gates, but those numbers are properties of the CRS ruleset and paranoia level, not of Guard Proxy itself.
  • Replace it with a two-section block: what is reported (measured against a pinned baseline, with the CRS SHA, paranoia level (PL), and corpus recorded) versus what CI actually enforces (functional block/allow assertions plus the RPS overhead gate).
  • Update docs/course-project-report.md to match: a new "Cele dodatkowe" ("Additional goals") block in §1, a split test taxonomy in §12, and a new §13 "Plan ewaluacji" ("Evaluation plan") table (E1–E5, gate vs reported).
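
To illustrate the "reported" side, the sketch below shows one way a measurement could carry its reference point so that a TPR or FPR figure is never quoted without the CRS SHA, paranoia level, and corpus it was measured against. It is only a sketch: `ReportedRun`, `to_report`, and every field name are hypothetical and not taken from the repository.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ReportedRun:
    """One effectiveness measurement plus the reference point it was taken against.

    Hypothetical structure for illustration; not the project's actual schema.
    """

    crs_sha: str          # commit SHA of the CRS ruleset in use
    paranoia_level: int   # CRS paranoia level (PL)
    corpus_id: str        # identifier of the fixed request corpus
    true_positives: int   # attack requests that were blocked
    false_negatives: int  # attack requests that passed
    false_positives: int  # benign requests that were blocked
    true_negatives: int   # benign requests that passed

    @property
    def tpr(self) -> float:
        attacks = self.true_positives + self.false_negatives
        return self.true_positives / attacks if attacks else 0.0

    @property
    def fpr(self) -> float:
        benign = self.false_positives + self.true_negatives
        return self.false_positives / benign if benign else 0.0


def to_report(run: ReportedRun) -> str:
    """Serialise the run so the rates never travel without their baseline."""
    record = asdict(run)
    record["tpr"] = round(run.tpr, 4)
    record["fpr"] = round(run.fpr, 4)
    return json.dumps(record, sort_keys=True)
```

Keeping the rates as derived properties rather than stored fields means a report cannot contain a TPR or FPR that disagrees with the recorded counts.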

Why

Enforcing absolute TPR/FPR thresholds as pass/fail gates would couple CI to CRS rule quality — something Guard Proxy doesn't control. The right framing is: Guard Proxy owns the wiring and management correctness; CRS owns detection quality. The thesis evaluation should report effectiveness measurements against a pinned baseline, not gate builds on them.

The key thesis finding is now framed as the before/after FP delta from applying a rule override on a fixed corpus — a reproducible, owned result that demonstrates the tuning workflow works.
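
The before/after delta could be computed roughly as follows. This is a sketch under assumptions: `false_positive_delta` is a hypothetical helper, and it assumes each benign request in the fixed corpus has a stable ID, so the set of blocked IDs can be compared across the two runs.

```python
def false_positive_delta(blocked_before, blocked_after, corpus_size):
    """Compare false positives on the same benign corpus before and after an override.

    blocked_before / blocked_after: IDs of benign requests that were blocked
    in each run (hypothetical input shape). corpus_size: number of benign
    requests in the fixed corpus.
    """
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")

    before = set(blocked_before)
    after = set(blocked_after)

    return {
        "fp_before": len(before),
        "fp_after": len(after),
        "fpr_before": len(before) / corpus_size,
        "fpr_after": len(after) / corpus_size,
        # Negative delta means the override reduced false positives.
        "fpr_delta": (len(after) - len(before)) / corpus_size,
        # Requests the override unblocked, and any it newly blocked.
        "resolved": sorted(before - after),
        "introduced": sorted(after - before),
    }
```

Reporting the resolved and introduced request IDs alongside the rate makes the result reproducible: a reader can check exactly which benign requests the override affected.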

What CI still enforces (unchanged behaviour)

  • Known attack → blocked (403) — smoke test
  • Known benign → passes (200) — smoke test
  • Rule toggle via panel changes live WAF behaviour — policy apply e2e
  • WAF overhead < 20% RPS degradation — performance benchmarks
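
The gates above reduce to checks that depend only on Guard Proxy's own behaviour. The sketch below is illustrative: the function names are hypothetical, and the 403/200 status codes and the 20% limit are taken from the list above rather than from the actual test suite.

```python
def functional_gate_passes(attack_status: int, benign_status: int) -> bool:
    """A known attack must be blocked (403) and a known benign request must pass (200)."""
    return attack_status == 403 and benign_status == 200


def rps_degradation(baseline_rps: float, waf_rps: float) -> float:
    """Fraction of throughput lost when the WAF is enabled."""
    if baseline_rps <= 0:
        raise ValueError("baseline_rps must be positive")
    return (baseline_rps - waf_rps) / baseline_rps


def overhead_gate_passes(baseline_rps: float, waf_rps: float,
                         max_degradation: float = 0.20) -> bool:
    """The gate passes only when degradation is strictly below the limit."""
    return rps_degradation(baseline_rps, waf_rps) < max_degradation
```

For example, a drop from 1000 RPS without the WAF to 850 RPS with it is a 15% degradation and passes, while a drop to 800 RPS is exactly 20% and fails the strict `<` comparison.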

Related

  • GitHub issue #227 — post-MVP policy profiles feature (app-specific CRS exclusion profiles) created in this session; not part of this PR.
  • thesis/evaluation-plan.md §7 also updated (file is gitignored — lives outside the repo).

Test plan

  • README.testing.md renders correctly — no broken markdown, section headers make sense
  • course-project-report.md new §13 table displays correctly
  • No other files reference the removed <10% / >95% thresholds (confirmed by grep — none found)

…il gates

Remove the absolute TPR >95% / FPR <10% detection targets from README.testing.md
and course-project-report.md. Those numbers are properties of the CRS ruleset and
paranoia level, not of Guard Proxy itself — enforcing them as CI gates would couple
the project to something it doesn't own.

README.testing.md:
- Replace "Detection Targets (draft)" block with two explicit sections:
  "WAF effectiveness — reported, not gated" (pinned-baseline measurement with CRS
  SHA, PL, and corpus recorded per run) and "What CI actually enforces" (functional
  block/allow assertions + <20% overhead gate that the project actually owns).
- Key thesis finding reframed as the before/after FP delta from applying a rule
  override, not an absolute FP rate.

docs/course-project-report.md:
- §1 (goals): add "Cele dodatkowe" ("Additional goals") block distinguishing primary
  owned goals from secondary reported observations (fidelity, overhead,
  tuning-delta); add explicit statement that TPR/FPR are CRS properties measured
  against a pinned baseline.
- §12 (tests): split into "Bramy zaliczenia" ("Pass gates", CI-enforced) vs "Pomiary
  raportowane" ("Reported measurements": TP/FP/delta, measured, not gated); note
  that every reported measurement must record its CRS reference point.
- §13 (new): "Plan ewaluacji" ("Evaluation plan"), a five-row table (E1–E5) with
  measurement, method, and type (gate vs reported); "punkt odniesienia"
  ("reference point") block specifying what must be recorded with every reported run.
- §14 (formerly §13): Screenshots (renumbered).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@bihius bihius merged commit ddaa9f3 into main Jun 6, 2026
3 checks passed