test(crawler): add contract and parser coverage for crawler plugin #628

Open
anshul23102 wants to merge 4 commits into utksh1:main from anshul23102:test/494-crawler-plugin-coverage

Conversation

@anshul23102
Contributor

Summary

Closes #494

Adds backend contract and parser tests for the crawler plugin. All tests load the real plugins/crawler/metadata.json, validate through the project PluginMetadataValidator, render commands through PluginManager.build_command(), and call plugins.crawler.parser.parse() directly.

What Is Tested

Metadata contract

  • metadata.json exists and is valid JSON
  • PluginMetadataValidator passes without errors
  • id matches directory name (crawler)
  • Engine binary is katana
  • target field is declared, required, and enforces https?:// URL pattern
  • depth field is optional with a default of 2
  • Parser type is custom and parser.py exists
  • Plugin requires user consent (intrusive safety level)
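
The metadata checks above could be sketched roughly as follows. This is a minimal, self-contained illustration, not the test file from this PR: the inline JSON is a hypothetical stand-in for plugins/crawler/metadata.json, and the key names (`engine.binary`, `fields`, `pattern`, `default`, `parser.type`) are assumptions about the schema.

```python
import json
import re

# Hypothetical stand-in for plugins/crawler/metadata.json.
# The key names are assumptions, not the project's confirmed schema.
METADATA_JSON = """
{
  "id": "crawler",
  "engine": {"binary": "katana"},
  "fields": [
    {"id": "target", "required": true, "pattern": "^https?://"},
    {"id": "depth", "required": false, "default": 2}
  ],
  "parser": {"type": "custom"}
}
"""


def load_fields(metadata: dict) -> dict:
    """Index the field declarations by their id for easy lookup."""
    return {field["id"]: field for field in metadata["fields"]}


metadata = json.loads(METADATA_JSON)
fields = load_fields(metadata)

# Contract assertions mirroring the bullets above.
assert metadata["id"] == "crawler"
assert metadata["engine"]["binary"] == "katana"
assert fields["target"]["required"] is True
assert re.match(fields["target"]["pattern"], "https://example.com")
assert not re.match(fields["target"]["pattern"], "ftp://example.com")
assert fields["depth"]["required"] is False
assert fields["depth"]["default"] == 2
assert metadata["parser"]["type"] == "custom"
```

The real tests read the file from disk and run it through PluginMetadataValidator, so a schema change there would surface even if these literal keys differ.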

Command rendering (real PluginManager)

  • Full token sequence: ["katana", "-u", "<target>", "-depth", "2", "-silent"]
  • Default depth 2 from metadata.json is applied when depth is omitted
  • Explicit depth override replaces the default correctly
  • Missing required target: build_command drops the unresolved {target} token rather than returning None
  • Plugin loads successfully via PluginManager
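
As an illustration of the default-depth and override behaviour listed above, here is a small sketch. `render_command`, `COMMAND_TEMPLATE`, and `FIELD_DEFAULTS` are hypothetical names; the real rendering goes through PluginManager.build_command(), whose internals are not shown in this PR.

```python
from typing import Any

# Hypothetical template and defaults; in the project these come from metadata.json.
COMMAND_TEMPLATE = ["katana", "-u", "{target}", "-depth", "{depth}", "-silent"]
FIELD_DEFAULTS = {"depth": 2}


def render_command(inputs: dict[str, Any]) -> list[str]:
    """Fill the template, letting explicit inputs override declared defaults."""
    values = {**FIELD_DEFAULTS, **inputs}
    return [token.format(**values) for token in COMMAND_TEMPLATE]


default_cmd = render_command({"target": "https://example.com"})
# → ["katana", "-u", "https://example.com", "-depth", "2", "-silent"]

override_cmd = render_command({"target": "https://example.com", "depth": 5})
# → ["katana", "-u", "https://example.com", "-depth", "5", "-silent"]
```

Asserting on the whole token list, rather than on individual flags, is what lets a reordering of command_template fail the test.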

Parser contract (real parser.py)

  • Returns findings, count, items keys
  • count equals len(findings)
  • Each finding has title, category, severity, description, remediation, metadata
  • critical/injection keywords classified as high severity
  • found/exposed/detected keywords classified as low severity
  • items list matches all non-empty lines from output
  • Empty input handled without raising
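
The parser contract above can be pictured with a toy parser. This is a hypothetical sketch, not plugins/crawler/parser.py: the keyword lists follow the bullets, while the `info` fallback, the one-finding-per-line rule, and the text of each field are assumptions made for illustration.

```python
HIGH_KEYWORDS = ("critical", "injection")
LOW_KEYWORDS = ("found", "exposed", "detected")


def classify(line: str) -> str:
    """Map a line of crawler output to a severity by keyword."""
    lowered = line.lower()
    if any(word in lowered for word in HIGH_KEYWORDS):
        return "high"
    if any(word in lowered for word in LOW_KEYWORDS):
        return "low"
    return "info"  # assumed fallback; not stated in the PR


def parse(output: str) -> dict:
    """Return the findings/count/items shape described in the contract."""
    items = [line.strip() for line in output.splitlines() if line.strip()]
    findings = [
        {
            "title": item,
            "category": "crawler",
            "severity": classify(item),
            "description": f"Crawler reported: {item}",
            "remediation": "Review the discovered endpoint.",
            "metadata": {"raw": item},
        }
        for item in items
    ]
    return {"findings": findings, "count": len(findings), "items": items}


result = parse("https://a/ sql injection point\n\nhttps://a/admin exposed\n")
assert result["count"] == len(result["findings"]) == 2
assert [f["severity"] for f in result["findings"]] == ["high", "low"]
assert parse("") == {"findings": [], "count": 0, "items": []}
```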

Why These Tests Will Catch Regressions

  • If katana is replaced in engine.binary, test_crawler_engine_is_katana fails
  • If command_template tokens change order, test_crawler_command_full_token_sequence fails
  • If the depth default is removed, test_crawler_command_uses_default_depth fails
  • If parser.py stops returning findings, every parser test fails
  • If severity classification logic changes, severity assertion tests fail

Program

This contribution is submitted under NSoC'26 (Nexus Spring of Code 2026).

Please apply labels: type:testing, level:intermediate, area:backend, area:plugins

Two separate CI regressions were introduced by commits 0e03877 and a2a7e02:

Backend lint (F821 - Undefined name 'db')
  workflows.py._run_workflow() calls get_target_policy(db, ...) but 'db'
  was never acquired in that method. tick() obtains 'db' but does not
  pass it into _run_workflow(). Fixed by adding db = await get_db() at
  the top of _run_workflow().
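
  A minimal sketch of the shape of that fix, assuming an async get_db() helper. The
  `WorkflowRunner` class and the bodies of `get_db` and `get_target_policy` are
  hypothetical stand-ins, not the code in workflows.py:

```python
import asyncio


async def get_db():
    """Hypothetical stand-in for the project's get_db(); returns a fake handle."""
    return {"policies": {"example.com": "allow"}}


async def get_target_policy(db, target):
    """Hypothetical stand-in that looks up a policy on the fake handle."""
    return db["policies"].get(target)


class WorkflowRunner:
    async def _run_workflow(self, target):
        # Acquire the handle locally so the name 'db' is defined in this scope.
        # Without this line a linter reports F821 (undefined name 'db').
        db = await get_db()
        return await get_target_policy(db, target)


policy = asyncio.run(WorkflowRunner()._run_workflow("example.com"))
assert policy == "allow"
```

  Passing db from tick() into _run_workflow() would be an alternative; the commit
  describes acquiring it inside the method instead.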

Frontend unit test failures (3 tests)
  ToolConfig.tsx now calls listTargetPolicies(), listCredentialProfiles(),
  and listSessionProfiles() inside its useEffect via Promise.all. Tests
  that only mocked the original 3-4 API functions caused Promise.all to
  reject (unmocked vi.fn() returns undefined, not a Promise), making
  setServerLimits never execute and breaking max/min attribute assertions.

  Workflows.tsx changed emptySteps to include an execution_context object
  in each step. The createWorkflow assertion expected the old shape.

Fixes applied:
  - ToolConfigDynamic.test.tsx: add listTargetPolicies, listCredentialProfiles,
    listSessionProfiles, getSettings to vi.mock factory and beforeEach mocks;
    update startTask assertion to accept the new 5th executionContext argument
  - ToolConfigTimeout.test.tsx: add the three new API functions to vi.mock
    factory and beforeEach mocks so Promise.all resolves correctly
  - Workflows.test.tsx: update createWorkflow expectation to include
    execution_context in the steps array
…mpliance

{ items: [] } was inferred as { items: never[] }, which does not satisfy
NamedResourceList<T> (requires items: T[] and total: number). Added total: 0
to all three mock returns so TypeScript accepts the fixture without casting.

Add a backend test suite for the crawler plugin that loads the real
plugins/crawler/metadata.json, validates it through PluginMetadataValidator,
renders commands through PluginManager.build_command(), and calls the real
plugins.crawler.parser.parse() directly.

Assertions are tied to the actual plugin contract:
- engine.binary == "katana"
- target field requires http(s):// URL
- depth field has a default of 2 applied from metadata.json
- explicit depth override works correctly
- full command token sequence from real command_template
- severity classification: high for critical/injection, low for found/exposed
- required keys in each finding dict
- items list matches the parsed output lines

Tests will fail if metadata.json, command_template, or parser.py drift.

Closes utksh1#494

@anshul23102 force-pushed the test/494-crawler-plugin-coverage branch from a36e42e to ac1eabf on June 6, 2026 08:09

build_command drops the unresolved {target} token instead of returning None.
Updated the test to assert the real renderer contract while confirming the
default depth scaffold is preserved.
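
The renderer behaviour described in this commit note might look like the sketch below. `render_command` and its placeholder regex are hypothetical; this is one plausible way a renderer could drop an unresolved token, not the implementation of PluginManager.build_command().

```python
import re

# Matches {name} placeholders inside a template token.
PLACEHOLDER = re.compile(r"\{(\w+)\}")

# Hypothetical template; the real one is read from metadata.json.
COMMAND_TEMPLATE = ["katana", "-u", "{target}", "-depth", "{depth}", "-silent"]


def render_command(template, values):
    """Render tokens, skipping any token whose placeholder has no value."""
    rendered = []
    for token in template:
        names = PLACEHOLDER.findall(token)
        if any(name not in values for name in names):
            continue  # drop the unresolved token instead of returning None
        rendered.append(token.format(**values))
    return rendered


# Only the default depth is available; the target was never supplied.
partial_cmd = render_command(COMMAND_TEMPLATE, {"depth": 2})
# → ["katana", "-u", "-depth", "2", "-silent"]
```

Under this assumption the test asserts that the target value is absent from the rendered list while "-depth", "2" remain, rather than asserting that the result is None.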

Development

Successfully merging this pull request may close these issues.

[TEST] Add parser and contract coverage for plugin crawler