fix(reachability): seed native/cross-language entry points, fix Zig reachability crash, stop silent zero-seed blackout#75

Open
gadievron wants to merge 2 commits into master from fix/reachability-seed-native-cross-language-entry-points-fix


@gadievron

Collaborator

Six filed defects converge on the shared EntryPointDetector + reachability seeding
path; one is a Zig-local crash.

Central detector (utilities/agentic_enhancer/entry_point_detector.py):

  • Add 'main' / 'http_handler' / 'middleware' to ENTRY_POINT_TYPES. The C/Go/Zig
    parsers emit these program/web entry unit_types, but the set only knew the
    Python/Express web vocabulary, so a compiled binary seeded zero entry points
    (total reachability blackout).
  • Add _unit_type() dual-key read (snake 'unit_type' + camel 'unitType') for
    Check-1, Check-4, details and statistics. The per-parser reachable path
    normalizes under camelCase 'unitType' (parsers/{c,php,ruby}/test_pipeline.py:257)
    while the detector read snake-only, so Check-1/Check-4 were dead on that path
    even for valid entry types. (The earlier-filed file_path/filePath mismatch
    turned out not to exist; the actual remaining defect is this dual-read gap.)
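
A minimal sketch of the dual-key read described above. The helper names `unit_type_of` and `is_entry_by_type` are hypothetical, and the set lists only the three types this PR adds; the real ENTRY_POINT_TYPES and _unit_type() in entry_point_detector.py may differ.

```python
# Illustrative only: the real set also contains the existing web vocabulary.
ENTRY_POINT_TYPES = {"main", "http_handler", "middleware"}


def unit_type_of(unit: dict) -> str:
    """Read the unit type from either the snake_case or camelCase key."""
    return unit.get("unit_type") or unit.get("unitType") or ""


def is_entry_by_type(unit: dict) -> bool:
    """Type-based entry-point check that works on both data shapes."""
    return unit_type_of(unit) in ENTRY_POINT_TYPES
```

With a snake-only read, a unit normalized as `{"unitType": "main"}` would not match; with the dual read it does.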

Zig classifier (parsers/zig/function_extractor.py):

  • _classify_function: main -> 'main' (was generic 'function'), matching C/Go so a
    Zig binary's entry point is now seedable.

Zig reachability filter (parsers/zig/test_pipeline.py):

  • Rewrite apply_reachability_filter to the real EntryPointDetector(functions,
    call_graph).detect_entry_points() / ReachabilityAnalyzer(functions,
    reverse_call_graph, entry_points).get_all_reachable() contract. The old code
    called a non-existent API; imports succeed (sys.path), so except ImportError
    never fired and the wrong-arity TypeError crashed every Zig parse at
    --processing-level reachable. The rewrite targets Zig's own snake_case data
    shape rather than copying the C normalizer.
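
The failure mode above can be sketched as follows. The class here is a stand-in stub that only mirrors the two-argument constructor named in this description; the wrong-arity call in `filter_with_narrow_except` is a hypothetical reconstruction, not the actual old code.

```python
class EntryPointDetector:
    """Stand-in stub with the (functions, call_graph) constructor."""

    def __init__(self, functions, call_graph):
        self.functions = functions
        self.call_graph = call_graph

    def detect_entry_points(self):
        # Stub rule for illustration: unit_type 'main' is an entry point.
        return {fid for fid, fn in self.functions.items()
                if fn.get("unit_type") == "main"}


def filter_with_narrow_except(functions, call_graph):
    """Hypothetical old shape: the handler guards the import, not the call."""
    try:
        detector = EntryPointDetector(functions)  # wrong arity -> TypeError
        return detector.detect_entry_points()
    except ImportError:
        # Never reached here: the import succeeded, the call is what failed.
        return set(functions)


def filter_with_real_contract(functions, call_graph):
    """Call the detector with the arguments its constructor expects."""
    detector = EntryPointDetector(functions, call_graph)
    return detector.detect_entry_points()
```

Because `except ImportError` does not match `TypeError`, the first function propagates the exception to the caller instead of falling back.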

Empty-seed safety-net (core/parser_adapter.py):

  • A zero entry-point seed previously emptied the dataset silently (100% reduction
    reported as SUCCESS), the dominant failure for non-web library/stdlib targets.
    Now degrade to pass-through (units preserved, filtering NOT applied) and record
    a loud warning in the filter metadata so the blackout can never be silent. The
    broader generic-library seeding heuristic is architectural and out of scope.
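
A minimal sketch of the pass-through behaviour, assuming units are dicts with an `id` key. The function name `apply_seeded_filter` and the metadata keys `filter_applied` and `warnings` are hypothetical and not taken from core/parser_adapter.py.

```python
def apply_seeded_filter(units, entry_points, reachable_ids):
    """Filter units by reachability, or pass them through on an empty seed."""
    metadata = {"filter_applied": True, "warnings": []}

    if not entry_points:
        # Empty seed: keep every unit and say so loudly in the metadata,
        # instead of reporting a 100% reduction as a success.
        metadata["filter_applied"] = False
        metadata["warnings"].append(
            "zero entry points detected; reachability filtering skipped, "
            "all units preserved"
        )
        return list(units), metadata

    kept = [unit for unit in units if unit["id"] in reachable_ids]
    return kept, metadata
```

A caller can then treat `filter_applied == False` as a signal that the reduction figure is not meaningful for this run.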

Tests (RED->GREEN): tests/test_entry_point_detector_native_seeds.py (6),
tests/parsers/zig/test_zig_main_classification.py (3),
tests/parsers/zig/test_zig_reachability_api.py (2, reproduced the exact
TypeError), tests/test_reachability_empty_seed.py (2). 12 failed pre-fix (1
guard green at base by design) -> 13 passed after the fix. Full suite 189 passed /
63 skipped / 0 failed. go test/vet/build clean in parsers/go/go_parser (types.go
unchanged, gofmt-clean). ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…lean install

The Zig parser modules import tree_sitter_zig at top level, but the dependency was
absent from requirements.txt and pyproject.toml, so a clean install raised
ModuleNotFoundError: No module named 'tree_sitter_zig' and every Zig test
errored at collection. Add tree-sitter-zig, pinned to the version the Zig tests
pass with.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>