Complete Epic #23 presentation evidence gates by Protocol-zero-0 · Pull Request #29 · billion-token-one-task/Deepgraph

Protocol-zero-0 · 2026-06-04T21:01:47Z

Summary

Completes Epic #23 across the requested issue order on convergence branch epic-23-evidence-ledger:

[PR-1] M1 significance numeric fix + minimal EvidenceLedger structure #24: numeric significance gating and minimal EvidenceLedger builder
[PR-2] M4 EvidenceLedger full traceability checker #27: EvidenceLedger validation and Abstract/Conclusion traceability checks
[PR-3] M2 four deterministic checks (unresolved refs / scaffold leakage / cross-run identity / repetition) #25: deterministic LaTeX sanity checks for unresolved refs, scaffold leakage, cross-run identity, and repetition
[PR-4] M3 render-gate integration (paper_orchestra_pipeline.py:1843) #28: render-gate integration for deterministic checks
[PR-5] M5 compute/presentation dependency boundary tests #26: CPU-only presentation/import/materialization boundary tests for M5-1/2/3
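
As a rough illustration of the numeric significance gating named in #24: a minimal sketch, assuming a result packet is a dict with a `p_value` key (that key appears in the diff excerpt further down). The names `to_float` and `significance_gate`, and the shape of the returned dict, are hypothetical and not taken from the PR.

```python
import math
from typing import Any, Mapping, Optional


def to_float(value: Any) -> Optional[float]:
    """Hypothetical helper: coerce to a finite float, else return None."""
    if value is None or isinstance(value, bool):
        return None
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    return number if math.isfinite(number) else None


def significance_gate(packet: Mapping[str, Any], alpha: float = 0.05) -> dict:
    """Allow a 'significant' claim only when a numeric p-value backs it."""
    p_value = to_float(packet.get("p_value"))
    if p_value is None:
        return {"allowed": False, "reason": "missing_or_non_numeric_p_value"}
    if not 0.0 <= p_value <= 1.0:
        return {"allowed": False, "reason": "p_value_out_of_range"}
    if p_value >= alpha:
        return {"allowed": False, "reason": "p_value_not_below_alpha"}
    return {"allowed": True, "reason": "ok"}
```

The point of the sketch is that a textual label such as "significant" never passes on its own; only a parsed number compared against the threshold does.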

Includes one preliminary, branch-only commit that stabilizes a regression fixture so the required baseline tests are green on this convergence branch. The master branch was not modified.

Changed Files

agents/paper_completeness.py
agents/paper_orchestra_pipeline.py
tests/test_paper_completeness_m1.py
tests/test_paper_completeness_m4.py
tests/test_latex_sanity_m2.py
tests/test_vnext_manuscript.py
tests/test_presentation_cpu_boundary_m5.py
tests/fixtures/*

Tests Run

Preflight stabilization:

pytest tests/test_pipeline_contracts.py -> 12 passed
pytest tests/test_vnext_manuscript.py -> 5 passed

#24:

pre-change baseline: pytest tests/test_pipeline_contracts.py -> 12 passed
pre-change baseline: pytest tests/test_vnext_manuscript.py -> 5 passed
milestone: pytest tests/test_paper_completeness_m1.py -> 7 passed
regression: pytest tests/test_pipeline_contracts.py -> 12 passed
regression: pytest tests/test_vnext_manuscript.py -> 5 passed

#27:

pre-change baseline: pytest tests/test_pipeline_contracts.py -> 12 passed
pre-change baseline: pytest tests/test_vnext_manuscript.py -> 5 passed
milestone: pytest tests/test_paper_completeness_m4.py -> 6 passed
regression: pytest tests/test_pipeline_contracts.py -> 12 passed
regression: pytest tests/test_vnext_manuscript.py -> 5 passed
combined: pytest tests/test_paper_completeness_m1.py tests/test_paper_completeness_m4.py -> 13 passed

#25:

pre-change baseline: pytest tests/test_pipeline_contracts.py -> 12 passed
pre-change baseline: pytest tests/test_vnext_manuscript.py -> 5 passed
milestone: pytest tests/test_latex_sanity_m2.py -> 8 passed
regression: pytest tests/test_pipeline_contracts.py -> 12 passed
regression: pytest tests/test_vnext_manuscript.py -> 5 passed

#28:

pre-change baseline: pytest tests/test_pipeline_contracts.py -> 12 passed
pre-change baseline: pytest tests/test_vnext_manuscript.py -> 5 passed
milestone: pytest tests/test_vnext_manuscript.py -> 10 passed
regression: pytest tests/test_pipeline_contracts.py -> 12 passed
combined: pytest tests/test_latex_sanity_m2.py tests/test_vnext_manuscript.py -> 18 passed

#26:

pre-change baseline: pytest tests/test_pipeline_contracts.py -> 12 passed
pre-change baseline: pytest tests/test_vnext_manuscript.py -> 10 passed
milestone: pytest tests/test_presentation_cpu_boundary_m5.py -> 3 passed
regression: pytest tests/test_pipeline_contracts.py -> 12 passed
regression: pytest tests/test_vnext_manuscript.py -> 10 passed
combined: pytest tests/test_paper_completeness_m1.py tests/test_paper_completeness_m4.py tests/test_latex_sanity_m2.py tests/test_presentation_cpu_boundary_m5.py -> 24 passed

Non-goals / Skips

Did not modify or merge into master.
Did not run or fake the M5-4 GPU smoke test from [PR-5] M5 compute/presentation dependency boundary tests #26.
Did not add synthetic or mocked compute backends.
Did not implement cross-file LaTeX \input{content/conclusion} traceability.
Did not modify contracts/pipeline.py or require_submission_ready().

Copilot

Pull request overview

This PR completes Epic #23’s “presentation evidence gates” work by (1) introducing an EvidenceLedger-based numeric significance gate, (2) adding deterministic LaTeX sanity checks (unresolved refs, scaffold leakage, cross-run identity, boilerplate repetition) and wiring them into the submission bundle render gate, and (3) adding CPU-only boundary tests to ensure presentation code paths don’t import GPU/execution dependencies.

Changes:

Add minimal build_evidence_ledger, numeric significance gating, and EvidenceLedger traceability/schema checks in agents/paper_completeness.py.
Expand latex_sanity_check with deterministic checks and pass state from the submission pipeline so cross-run identity gating can work.
Add milestone/regression tests and fixtures for M1/M2/M4/M5 gates, including submission-bundle blocking tests.
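
To make two of the deterministic checks concrete, here is a minimal sketch of an unresolved-reference check and a scaffold-leakage check. The rule names mirror the PR's wording, but `sanity_hits`, the regular expressions, and the placeholder word list are assumptions for illustration, not the contents of `latex_sanity_check`.

```python
import re
from typing import Dict, List

LABEL_RE = re.compile(r"\\label\{([^}]+)\}")
REF_RE = re.compile(r"\\(?:ref|eqref|autoref|cref)\{([^}]+)\}")
# Assumed placeholder vocabulary; the PR's actual list is not shown.
PLACEHOLDER_RE = re.compile(r"\b(?:TODO|TBD|FIXME|PLACEHOLDER)\b")


def sanity_hits(tex: str) -> List[Dict[str, object]]:
    """Return one hit per unresolved reference or leaked placeholder."""
    labels = set(LABEL_RE.findall(tex))
    hits: List[Dict[str, object]] = []
    for line_no, line in enumerate(tex.splitlines(), start=1):
        for match in REF_RE.finditer(line):
            # \cref{a,b} style references may carry several keys.
            for key in match.group(1).split(","):
                key = key.strip()
                if key not in labels:
                    hits.append({"rule": "unresolved_ref", "token": key, "line": line_no})
        for match in PLACEHOLDER_RE.finditer(line):
            hits.append({"rule": "scaffold_leakage", "token": match.group(0), "line": line_no})
    return hits
```

Both checks are pure functions of the input text, so the same manuscript always produces the same hits, which is what lets a render gate block on them.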

Reviewed changes

Copilot reviewed 26 out of 26 changed files in this pull request and generated 4 comments.

File	Description
`agents/paper_completeness.py`	Adds EvidenceLedger builder, numeric significance gating, traceability/schema checks, and deterministic LaTeX sanity checks.
`agents/paper_orchestra_pipeline.py`	Passes `state` into `latex_sanity_check` to enable state-aware deterministic gating during bundle generation.
`tests/test_paper_completeness_m1.py`	Adds M1 tests for numeric significance gating and minimal EvidenceLedger builder contract.
`tests/test_paper_completeness_m4.py`	Adds M4 tests for EvidenceLedger schema validation and Abstract/Conclusion traceability checks.
`tests/test_latex_sanity_m2.py`	Adds M2 tests for deterministic LaTeX sanity rules (unresolved refs, placeholders, cross-run identity, repetition).
`tests/test_vnext_manuscript.py`	Adds submission-bundle integration tests ensuring render-gate blocks deterministic LaTeX violations and preserves existing gates.
`tests/test_presentation_cpu_boundary_m5.py`	Adds CPU-only boundary tests ensuring presentation modules don’t load GPU/execution dependencies and can render/materialize offline.
`tests/fixtures/*`	Adds fixtures for M1/M2/M4 deterministic checks and traceability scenarios.
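
One way a CPU-only boundary test of the kind listed for `tests/test_presentation_cpu_boundary_m5.py` could be written is sketched below. It imports a module in a fresh interpreter and reports which forbidden modules ended up in `sys.modules`. The helper name `heavy_modules_loaded_by` and the choice of a subprocess probe are assumptions; the PR's tests may work differently.

```python
import json
import subprocess
import sys
from typing import Iterable, List


def heavy_modules_loaded_by(module: str, forbidden: Iterable[str]) -> List[str]:
    """Import `module` in a clean interpreter and list forbidden modules it pulled in."""
    probe = (
        "import importlib, json, sys\n"
        f"importlib.import_module({module!r})\n"
        f"forbidden = {sorted(forbidden)!r}\n"
        "loaded = [name for name in forbidden if name in sys.modules]\n"
        "print(json.dumps(loaded))\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", probe],
        capture_output=True,
        text=True,
        check=True,
    )
    # The probe prints exactly one JSON line; take the last line defensively.
    return json.loads(result.stdout.strip().splitlines()[-1])
```

A test could then assert, for example, that `heavy_modules_loaded_by("agents.paper_completeness", ["torch"])` is empty. Running the probe in a subprocess keeps the result independent of whatever the test session has already imported.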

+def _significance_alpha() -> float:
+    raw = os.environ.get("DEEPGRAPH_SIGNIFICANCE_ALPHA")
+    alpha = _numeric(raw)
+    if alpha is None or alpha <= 0:
+        return 0.05
+    return alpha
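
A self-contained sketch of the same fallback behaviour, for readers who want to run it. `parse_number` is a hypothetical stand-in for the PR's `_numeric` helper, which is not shown in the excerpt; only the environment variable name and the `0.05` default come from the diff.

```python
import math
import os
from typing import Optional


def parse_number(raw: Optional[str]) -> Optional[float]:
    """Hypothetical stand-in for `_numeric`: finite float or None."""
    if raw is None:
        return None
    try:
        value = float(raw)
    except (TypeError, ValueError):
        return None
    return value if math.isfinite(value) else None


def significance_alpha(default: float = 0.05) -> float:
    """Read the threshold from the environment, falling back on bad input."""
    alpha = parse_number(os.environ.get("DEEPGRAPH_SIGNIFICANCE_ALPHA"))
    if alpha is None or alpha <= 0:
        return default
    return alpha
```

As in the excerpt, the sketch applies no upper bound, so a value such as `2` would be accepted as a threshold. Whether the real helper or its callers guard against that is not visible from the diff.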


+    p_value = _numeric(packet.get("p_value"))
+    effect_size = _numeric(_first_present(packet.get("effect_size"), packet.get("effect_pct")))
+    metric = _text(_first_present(packet.get("metric_name"), summary.get("primary_metric"), summary.get("metric_name")))
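
The excerpt relies on a `_first_present` helper whose body is not shown. A plausible reading, offered here only as an assumption, is "return the first argument that is not missing", which differs from chaining with `or` because a legitimate zero is kept:

```python
from typing import Any, Optional


def first_present(*candidates: Any) -> Optional[Any]:
    """Hypothetical sketch: first candidate that is neither None nor a blank string."""
    for candidate in candidates:
        if candidate is None:
            continue
        if isinstance(candidate, str) and not candidate.strip():
            continue
        return candidate
    return None
```

With this behaviour an `effect_size` of `0.0` is still picked up, whereas `packet.get("effect_size") or packet.get("effect_pct")` would skip it and fall through to `effect_pct`.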


+def _strip_latex_code_blocks(text: str) -> str:
+    stripped = re.sub(
+        r"\\begin\{(?:verbatim|lstlisting|minted)\}.*?\\end\{(?:verbatim|lstlisting|minted)\}",
+        "",
+        text or "",
+        flags=re.DOTALL | re.IGNORECASE,
+    )
+    stripped = re.sub(r"```.*?```", "", stripped, flags=re.DOTALL)
+    return stripped
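
A short usage sketch of the function above, with the body copied from the excerpt (minus the diff markers) so the block runs on its own. The sample text is made up for illustration.

```python
import re


def strip_latex_code_blocks(text: str) -> str:
    # Drop verbatim-like LaTeX environments, then Markdown-style fences.
    stripped = re.sub(
        r"\\begin\{(?:verbatim|lstlisting|minted)\}.*?\\end\{(?:verbatim|lstlisting|minted)\}",
        "",
        text or "",
        flags=re.DOTALL | re.IGNORECASE,
    )
    stripped = re.sub(r"```.*?```", "", stripped, flags=re.DOTALL)
    return stripped


sample = "\n".join(
    [
        r"Results are in Table \ref{tab:main}.",
        r"\begin{verbatim}",
        "TODO: this literal text lives inside a listing",
        r"\end{verbatim}",
        "Closing sentence.",
    ]
)
cleaned = strip_latex_code_blocks(sample)
print(cleaned.splitlines())
```

The placeholder word inside the listing no longer reaches later checks. Note that the removed block takes its newlines with it: the five-line sample becomes three lines, so line numbers computed on the stripped text are offset from the source file unless the caller compensates. Whether the PR compensates is not visible from the excerpt.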


+                    hits.append(
+                        _line_hit(
+                            "cross_run_identity",
+                            token,
+                            base_line + snippet.count("\n", 0, token_match.start()),
+                            snippet.splitlines()[0] if snippet.splitlines() else snippet,
+                        )
+                    )
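
The line-number arithmetic in the excerpt, `base_line + snippet.count("\n", 0, token_match.start())`, can be checked in isolation. The sketch below assumes the match offset is relative to the start of `snippet`; the function name `line_of_offset` and the sample run token are hypothetical.

```python
import re


def line_of_offset(snippet: str, offset: int, base_line: int = 1) -> int:
    """Line number of `offset` within `snippet`, where the snippet starts at `base_line`."""
    # Count only the newlines that occur before the offset.
    return base_line + snippet.count("\n", 0, offset)


snippet = "intro line\nrun_abc123 appears here\nlast line"
match = re.search(r"run_[a-z0-9]+", snippet)
token_line = line_of_offset(snippet, match.start(), base_line=40)
```

Here the token sits on the second line of a snippet that begins at line 40, so `token_line` is 41. The excerpt passes the first line of the snippet as the context string, which can differ from the line that actually contains the token when the snippet spans several lines; whether that is intended is not clear from the excerpt.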


cla-assistant · 2026-06-06T13:19:39Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ Protocol-zero-0
❌ Copilot
You have signed the CLA already but the status is still pending? Let us recheck it.

Protocol-zero-0 added 6 commits June 4, 2026 20:40

Stabilize vnext manuscript regression fixture

60fcc09

Fix significance evidence ledger gate (#24)

062cf55

Add EvidenceLedger traceability checks (#27)

28b2c63

Add deterministic LaTeX sanity checks (#25)

9cfc1cb

Wire deterministic sanity checks into render gate (#28)

2c769f9

Add CPU presentation boundary tests (#26)

3846642

Copilot AI review requested due to automatic review settings June 4, 2026 21:01

Copilot started reviewing on behalf of Protocol-zero-0 June 4, 2026 21:01

Copilot AI reviewed Jun 4, 2026

Copilot started work on behalf of Protocol-zero-0 June 6, 2026 13:16

Add missing CLA signer for Protocol-zero-0

df78815

Copilot finished work on behalf of Protocol-zero-0 June 6, 2026 13:20

Protocol-zero-0 merged commit 44ecc31 into master Jun 6, 2026
1 check was pending

This was referenced Jun 6, 2026

[Proposal] Presentation-layer faithfulness: EvidenceLedger evidence contract + deterministic submission gates (converging the unified priorities of #21 / #22) #23

Closed

Suggested next improvement direction: strict output completeness (verifiability admission + result binding + hard submission validation) #21

Closed

Protocol-zero-0 deleted the epic-23-evidence-ledger branch June 6, 2026 15:58

Protocol-zero-0 merged 7 commits into master from epic-23-evidence-ledger