drknowhow · Jun 1, 2026
diff --git a/examples/cholesterol_primary_prevention/README.md b/examples/cholesterol_primary_prevention/README.md
@@ -23,33 +23,39 @@ unstructured LLM-generated review would miss.
 | File | What it is |
 |---|---|
 | `protocol.md` | The pre-registered protocol JSON, formatted for reading. |
-| `run.log` | A trimmed timeline of what each phase produced. |
-| `synthesis.md` | The Synthesizer's output, converted to markdown. |
-| `synthesis.docx` | Same content as a Word document with forest plot, table, and PRISMA flow embedded (v0.2 — placeholder for now). |
+| `run.log` | Phase-by-phase counts from the actual run (project id `216f7599-1c1b-487c-a269-e67277da0e42`, completed 2026-06-01). |
+| `synthesis.md` | The Synthesizer's output, content-faithful markdown (image-free). |
+| `synthesis.docx` | The same content as a Word document with forest plot, stance × tier heat-table, PRISMA flow, and effect-size table embedded. This is the artifact the protocol actually produced. |
+| `synthesis.pdf` | PDF export of `synthesis.docx`, for read-only sharing. |
 
 ## What the protocol caught
 
 Three things worth calling out for anyone using this as a model:
 
-1. **The Skeptic surfaced a 2023 meta-analysis with a null primary-outcome
-   result that the popular narrative does not include.** Without the
-   Skeptic role explicitly running counter-queries, that paper would
-   never have entered the corpus.
-2. **The Methodologist downgraded several frequently-cited studies to
-   "RCT, high RoB" or "narrative review"** based on abstract-stated
-   methodology. The Synthesizer then weighted the effect-size table
-   accordingly.
-3. **The Synthesizer cut three claims** that were in an earlier draft of
-   the bottom-line answer because the cited rows' `quote_span` fields,
-   after Phase 4 full-text retrieval, didn't actually contain the
-   verbatim numeric claim. The cut claims became "evidence is
-   insufficient to determine X" statements.
+1. **The Skeptic surfaced strict-PP-only meta-analyses (Morsell 2026,
+   Pignone 2000) and the ALLHAT-LLT older-adult subgroup (Han 2017)** —
+   all showing null ACM in true primary prevention. The popular
+   narrative leans on broader meta-analyses (CTT/Fulcher and similar)
+   that pool primary + secondary prevention; the Skeptic's
+   counter-queries were what brought the strict-PP evidence into the corpus.
+2. **The "supports" stance row ended empty by design.** No row in the
+   corpus directly supports an ACM benefit of pharmacological LDL-C
+   lowering in true primary prevention. Broader-PP analyses that show
+   ACM benefit live as `background` because they don't match the PICO.
+   The synthesis names this explicitly rather than hiding it.
+3. **Of 20 Pass-2 candidates, 11 were retrievable at full text.** Two
+   more were abstract-only; the seven unavailable ones (paywalled with
+   no numerics, or DOI unresolved) are named in the synthesis's
+   Limitations section, so the reader can see where the corpus is thin
+   (CTT overestimation abstract, JUPITER reanalysis, Tonelli/Ray-era PP
+   meta-analyses, BMJ 2021 statin adverse-events meta-analysis).
 
 ## What to read in what order
 
 1. `protocol.md` — to see what was agreed BEFORE any search ran.
-2. `run.log` — to see the corpus shape at the end of Pass-1.
-3. `synthesis.md` — to see the no-fabrication output.
+2. `run.log` — to see the corpus shape at each phase.
+3. `synthesis.docx` (or `synthesis.pdf` / `synthesis.md`) — the
+   no-fabrication output.
 
 ## Reproducing
 
@@ -58,9 +64,10 @@ PubMed, arXiv, Europe PMC, Crossref, and Unpaywall. Running the protocol
 again on a later date will find more recent studies (use the `refresh`
 continuation mode for that).
 
-The agent runtime used was the Claude Agent SDK with three concurrent
-Pass-1 subagents and one sequential Synthesizer. Other agent runtimes
-should be able to reproduce the workflow given the prompts in `agents/`.
+The agent runtime used was the Claude Agent SDK with concurrent Pass-1
+subagents (Scout + Skeptic for this run) and one sequential Synthesizer.
+Other agent runtimes should be able to reproduce the workflow given the
+prompts in `agents/`.
 
 ## A caveat
 

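Reviewer note: the README's "no-fabrication output" rests on the verbatim check that `run.log` describes, where a drafted claim is kept only if the cited row's `quote_span` carries the number it states. The sketch below is one minimal way such a check could look, not the Synthesizer's actual implementation. The function names, the regex, and the example strings are all hypothetical; the strings are invented and are not taken from the synthesis.

```python
import re


def numbers_in(text: str) -> set[str]:
    """Return the numeric tokens (e.g. '0.91', '95') that appear in text."""
    return set(re.findall(r"\d+(?:\.\d+)?", text))


def claim_passes_verbatim_check(claim: str, quote_span: str) -> bool:
    """True only if the claim states at least one number and every number
    it states also appears verbatim in the quote span."""
    claim_numbers = numbers_in(claim)
    return bool(claim_numbers) and claim_numbers <= numbers_in(quote_span)


# Invented example strings, for illustration only.
quote = "The ratio was 0.91 (interval 0.80 to 1.04)."
kept = claim_passes_verbatim_check("Ratio of 0.91.", quote)
cut = claim_passes_verbatim_check("Absolute reduction of 1.2 points.", quote)
```

Here `kept` is true because `0.91` appears in the quote, and `cut` is false because `1.2` is a derived figure the quote never states. A claim that fails would be rewritten as an "evidence is insufficient to determine X" statement rather than paraphrased, as the log describes.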
diff --git a/examples/cholesterol_primary_prevention/run.log b/examples/cholesterol_primary_prevention/run.log
@@ -1,112 +1,123 @@
 ==============================================================================
 deep_research run log — cholesterol primary prevention
 ==============================================================================
-project slug:   cholesterol-primary-prevention
-created at:     2026-06-01T10:42Z
-completed at:   2026-06-01T18:55Z
-agent runtime:  Claude Agent SDK, three concurrent Pass-1 subagents,
-                one sequential Synthesizer.
+project id:        216f7599-1c1b-487c-a269-e67277da0e42
+project slug:      cholesterol-primary-prevention
+created at:        2026-06-01T12:28:15Z
+completed at:      2026-06-01T17:44:18Z
+synthesis v2 at:   2026-06-01T17:53:59Z   # native-formatted .docx supersedes
+                                          # the plain-text v1 doc
+total cost (USD):  $42.85
+
+agent runtime:     Claude Agent SDK
+roles:             Scout + Skeptic (Pass-1 concurrent),
+                   Synthesizer (Phase 4 sequential)
 
 ------------------------------------------------------------------------------
 Phase 1 — Protocol pre-registration
 ------------------------------------------------------------------------------
 Status:         planned → protocol_gated
-Gate 1:         approved by user 2026-06-01T11:14Z
+Gate 1:         approved by user
+
+The pre-registered protocol (see protocol.md) fixed: PICO definition,
+inclusion/exclusion criteria (true primary prevention only — no prior
+CVD/CeVD/PVD), effect measures (ACM primary, MACE secondary), source
+tiers, retraction-sweep policy, and the Pass-2 spend ceiling.
 
 ------------------------------------------------------------------------------
 Phase 2 — Pass-1 triage
 ------------------------------------------------------------------------------
 Status:         protocol_gated → pass1_running
 
-Scout
-  queries_run:           42  (7 protocol queries × 6 sources)
-  unique_hits_kept:      89
-  sources_with_zero_hits: ["arxiv"]      # expected — clinical question
-
-Skeptic
-  counter_query_count:   18
-  refutes_kept:          11
-  mixed_kept:             7
-  retractions_found:      2     # one 2019 statin meta-analysis, one 2021
-                                # PCSK9 cohort study; both written as
-                                # refute-side rows referencing the
-                                # original DOI.
-
-Methodologist
-  rows_graded:           107   # 89 scout + 18 skeptic
-  effect_sizes_extracted: 23
-  rows_flagged_unclear:   14   # all flagged with notes='unclear from abstract'
-
-Corpus rollup after Pass-1 reconciliation:
-  by stance:
-    background:   71
-    supports:      0   # nothing claimed support yet — Phase 4 sets stance
-    refutes:      13   # skeptic + retractions
-    mixed:         7
-  by tier:
-    tier 1:       58
-    tier 2:        4
-    tier 3:       29
-    tier 4:        0   # blogs/news intentionally excluded by protocol
-  retractions:    2
+Corpus rollup after Pass-1 (recorded in research_evidence):
+  total rows:                       135
+  retrieved_by_role (distinct):       2   # scout, skeptic
+  by stance (post-synthesis update):
+    background:                     125   # not directly tested against PICO
+    refutes:                          4   # quote-spanned numerics show null/harm
+    mixed:                            6   # quote-spanned MACE↓ but ACM null
+    supports:                         0   # by design — no row directly supports
+                                          #   ACM benefit in strict primary
+                                          #   prevention; broader-PP analyses
+                                          #   that show ACM benefit live as
+                                          #   background because they pool
+                                          #   primary + secondary populations
+  by source tier:
+    tier 1 (peer-reviewed):         129
+    tier 2 (preprint / gov):          2
+    tier 3 (grey lit / abstract):     4
 
 ------------------------------------------------------------------------------
 Gate 2 — Pass-2 spend approval
 ------------------------------------------------------------------------------
-Gate 2:         approved by user 2026-06-01T14:02Z
-Approved Pass-2 candidate count: 18 (capped at pass2_max_full_text_retrievals=20)
+Gate 2:         approved by user
+Pass-2 candidate cap (protocol): 20
+Pass-2 candidates approved:      20
 
 ------------------------------------------------------------------------------
 Phase 3 — Pass-2 full-text retrieval
 ------------------------------------------------------------------------------
 Status:         pass1_gated → pass2_running
 
-  candidates_selected:   18
-  retrieved_full_text:   13     # OA via Unpaywall + crossref direct
-  abstract_only:          3     # paywalled, no OA copy
-  unavailable:            2     # DOI resolution failed
+  candidates selected:           20
+  retrieved at full text:        11   # OA via Unpaywall + crossref direct
+  abstract-only:                  2   # paywalled, rich abstract usable
+  unavailable:                    7   # paywalled with no extractable numerics
+                                      #   or DOI resolution failed
 
 ------------------------------------------------------------------------------
 Phase 4 — Synthesis
 ------------------------------------------------------------------------------
 Status:         pass2_running → synthesizing
 
 Synthesizer (one sequential subagent):
-  tool_calls_used:       38 / 50
-  claims_drafted:        24
-  claims_cut:             3   # cited rows' quote_span did not contain the
-                              # paraphrased number after full-text retrieval.
-                              # Cut claims rewritten as
-                              # 'evidence is insufficient to determine X'.
-  n_cited:               21
-  rows_updated_with_quote_span:  21   # synth wrote quote_span + locator
-                                      # for every cited row before citing.
-  stance_assignments:
-    supports:             9
-    refutes:              6
-    mixed:                6
-  visuals_rendered:       forest plot (ACM panel + MACE panel),
-                          PRISMA flow,
-                          stance × tier heat-table,
-                          effect-size summary table.
-  complete:               true
+  evidence rows updated with verbatim quote_span:  7
+  rows directly cited in synthesis:                7   # every cited row carries
+                                                       #   a verbatim quote_span
+                                                       #   plus locator
+  claims drafted but cut (failed verbatim check):  several
+                                                       # rewritten as
+                                                       # 'evidence is
+                                                       # insufficient to
+                                                       # determine X' rather
+                                                       # than paraphrased
+
+  visuals rendered:
+    - effect-size summary table (10 rows, 7 unique studies)
+    - forest plot (ACM panel + MACE panel)
+    - stance × tier heat-table
+    - PRISMA flow (135 → 20 → 11 → 7)
+
+  document features:
+    - native heading hierarchy (no ASCII separators)
+    - native Word tables (not paragraph-rendered)
+    - intense-quote blockquotes for every cited row
+    - .docx uploaded via gdrive(convert_to_doc=true) for fidelity
 
 ------------------------------------------------------------------------------
 Phase 5 — Closure
 ------------------------------------------------------------------------------
 Status:         synthesizing → complete
-Synthesis doc:  see synthesis.md / synthesis.docx in this directory.
+Synthesis artifacts (this directory):
+  synthesis.docx   — native Word, embedded forest plot + PRISMA + heat-table
+  synthesis.pdf    — PDF export of the same
+  synthesis.md     — content-faithful markdown (image-free)
 
 ------------------------------------------------------------------------------
 Notes worth keeping for future runs
 ------------------------------------------------------------------------------
-- Skeptic counter-queries on "fails to replicate" and "publication bias"
-  surfaced two papers that Scout's straight queries missed even at 6 sources.
-- Two of the three cut claims involved the absolute-risk-reduction figure
-  often cited in popular summaries. The verbatim quote spans in the
-  cited papers expressed relative risk only; the absolute number was a
-  derivation that the synthesizer would not commit to without a row
-  saying so verbatim.
-- The retraction sweep was cheap (one crossref ping per Scout hit's DOI)
-  and caught two non-trivial entries. Worth keeping as a fixed Skeptic
-  task on every run.
+- Strict-PP-only meta-analyses tend to find null ACM; broader meta-analyses
+  that pool primary + secondary populations tend to find ACM benefit. The
+  difference is mostly about who's in the denominator, not about the drug.
+- Pass-2 full-text retrieval rate was 11/20 (55%). Paywalls dominated the
+  unavailable bucket. The synthesis flags every landmark paper that was
+  paywall-blocked, so the reader knows where the corpus is thin.
+- The supports stance ended empty by design. The protocol's strict PICO
+  excludes the analyses that would have populated it. This is a feature
+  of the protocol, not a hole in the corpus — and the synthesis says so
+  explicitly.
+- Two roles ran in Pass-1 here (Scout + Skeptic) rather than the
+  three-role pattern (+Methodologist) shown in SKILL.md. Methodology
+  grading was folded into the Skeptic + Synthesizer passes for this
+  question because the methodology dimension was largely uncontested
+  for the included RCTs.
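Reviewer note: the counts in the new `run.log` can be cross-checked mechanically. The sketch below copies the figures from the hunk above into plain dictionaries and confirms that the stance and tier breakdowns each sum to the 135 total rows, that the Pass-2 buckets sum to the 20 approved candidates, and that the stated retrieval rate follows. The variable and function names are illustrative assumptions, not part of the tool.

```python
# Counts copied from the run.log hunk above; all names are illustrative.
STANCE = {"background": 125, "refutes": 4, "mixed": 6, "supports": 0}
TIER = {"tier_1": 129, "tier_2": 2, "tier_3": 4}
PASS2 = {"full_text": 11, "abstract_only": 2, "unavailable": 7}
TOTAL_ROWS = 135
PASS2_CANDIDATES = 20


def check_counts() -> list[str]:
    """Return a description of every breakdown that fails to sum to its total."""
    problems = []
    if sum(STANCE.values()) != TOTAL_ROWS:
        problems.append("stance counts do not sum to total rows")
    if sum(TIER.values()) != TOTAL_ROWS:
        problems.append("tier counts do not sum to total rows")
    if sum(PASS2.values()) != PASS2_CANDIDATES:
        problems.append("Pass-2 buckets do not sum to approved candidates")
    return problems


problems = check_counts()
retrieval_rate = f"{PASS2['full_text'] / PASS2_CANDIDATES:.0%}"
print(problems, retrieval_rate)
```

With the figures as logged, no breakdown is flagged and the rate matches the 11/20 (55%) quoted in the notes.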
diff --git a/examples/cholesterol_primary_prevention/synthesis.docx b/examples/cholesterol_primary_prevention/synthesis.docx