Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
39 changes: 24 additions & 15 deletions .conductor/registry/workflows/ado-pr.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -618,9 +618,14 @@ agents:
# marker → bot reviewer (probably pr_initial_reviewer). Only
# treat items under the `**Blocking concerns**` section as
# negative; ignore everything else in the bot's body.
# 2. Comment author matches the PR author → exclude (PR author
# narrating their own work doesn't count as feedback to themselves).
# 3. Otherwise → human reviewer. Use natural-language judgement.
# 2. Otherwise → human reviewer. Use natural-language judgement,
# regardless of whether the comment author matches the PR
# author identity. Polyphony PRs are agent-authored under the
# operator's PAT, so author-identity equality is NOT a valid
# proxy for bot-vs-human — the marker is. Self-approval /
# self-waiting-for-author is a platform branch-protection
# concern, not the analyzer's. See
# docs/decisions/pr-feedback-identity-classification.md.
#
# Output digest contract: `feedback_digest` is a stable hash of
# (comment ids + thread ids + vote tuples + body sha) of everything
Expand Down Expand Up @@ -690,24 +695,28 @@ agents:
marker-posting agent). Only treat items under the `**Blocking
concerns**` section as negative; ignore everything else in
the bot's body.
2. Else if `author == PR author identity` → **author self-comment**.
Exclude entirely (the PR author narrating their own work is
not feedback to themselves).
3. Else → **human reviewer**. Use natural-language judgement of
2. Else → **human reviewer**. Use natural-language judgement of
the comment's sentiment and any embedded code-review concerns.
**Do NOT compare `author` against the PR author identity** —
polyphony PRs are agent-authored under the operator's PAT, so
the operator's own comments on the PR are valid human-in-the-
loop interventions and must be treated as feedback.

For reviewer **votes** on ADO: a `-5` (waiting for author) vote
counts as negative feedback. A `+5` (approved with suggestions)
vote ALONE does not — but if it's paired with unresolved actionable
threads, the threads still win and you should treat the unresolved
threads as negative feedback. A `+10` or `-10` was already terminated
by the deterministic classifier; if you see one here it means
mixed signals — defer to the threads / comments to decide.
counts as negative feedback regardless of who cast it (including
the PR author themselves — self-vote enforcement is a platform
branch-protection concern, not yours). A `+5` (approved with
suggestions) vote ALONE does not — but if it's paired with
unresolved actionable threads, the threads still win and you
should treat the unresolved threads as negative feedback. A
`+10` or `-10` was already terminated by the deterministic
classifier; if you see one here it means mixed signals — defer
to the threads / comments to decide.

Unresolved threads (status active / pending — anything that is NOT
resolved / fixed / closed / by-design / won-fix) that contain at
least one in-scope (non-author) comment count as negative feedback
even if the comment body is short.
least one non-bot comment count as negative feedback even if the
comment body is short.

## Output

Expand Down
41 changes: 24 additions & 17 deletions .conductor/registry/workflows/github-pr.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -491,9 +491,14 @@ agents:
# marker → bot reviewer (probably pr_initial_reviewer). Only
# treat items under the `**Blocking concerns**` section as
# negative; ignore everything else in the bot's body.
# 2. Comment author matches the PR author → exclude (PR author
# narrating their own work doesn't count as feedback to themselves).
# 3. Otherwise → human reviewer. Use natural-language judgement.
# 2. Otherwise → human reviewer. Use natural-language judgement,
# regardless of whether the comment author matches the PR
# author identity. Polyphony PRs are agent-authored under the
# operator's PAT, so author-identity equality is NOT a valid
# proxy for bot-vs-human — the marker is. Self-approval /
# self-changes-requested is a platform branch-protection
# concern, not the analyzer's. See
# docs/decisions/pr-feedback-identity-classification.md.
#
# Output digest contract: `feedback_digest` is a stable hash of
# (comment ids + thread ids + vote tuples + body sha) of everything
Expand Down Expand Up @@ -563,23 +568,25 @@ agents:
marker-posting agent). Only treat items under the `**Blocking
concerns**` section as negative; ignore everything else in
the bot's body.
2. Else if `author == PR author identity` → **author self-comment**.
Exclude entirely (the PR author narrating their own work is
not feedback to themselves).
3. Else → **human reviewer**. Use natural-language judgement of
2. Else → **human reviewer**. Use natural-language judgement of
the comment's sentiment and any embedded code-review concerns.

For reviewer **votes**: a `changes_requested` vote from a non-
author reviewer counts as negative feedback regardless of any
explanatory comment. An `approved` vote with no unresolved
threads was already terminated by the deterministic classifier —
if you see it here it means the classifier returned `none` due
to mixed signals (e.g. approved AND unresolved threads), so the
threads still win.
**Do NOT compare `author` against the PR author identity** —
polyphony PRs are agent-authored under the operator's PAT, so
the operator's own comments on the PR are valid human-in-the-
loop interventions and must be treated as feedback.

For reviewer **votes**: a `changes_requested` vote counts as
negative feedback regardless of who cast it (including the PR
author themselves — self-approval / self-changes-requested is
a platform branch-protection concern, not yours). An `approved`
vote with no unresolved threads was already terminated by the
deterministic classifier — if you see it here it means the
classifier returned `none` due to mixed signals (e.g. approved
AND unresolved threads), so the threads still win.

Unresolved threads (`is_resolved == false && is_outdated != true`)
that contain at least one in-scope (non-author) comment count as
negative feedback even if the comment body is short.
that contain at least one non-bot comment count as negative
feedback even if the comment body is short.

## Output

Expand Down
35 changes: 21 additions & 14 deletions .conductor/registry/workflows/plan-level.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -417,7 +417,7 @@
- "plan"
- "derive-ancestor-chain"
- "--root-id"
- "{{ root_resolver.output.resolved_root_id }}"

Check warning on line 420 in .conductor/registry/workflows/plan-level.yaml

View workflow job for this annotation

GitHub Actions / build-and-test

JINJA002: 'root_resolver.output.resolved_root_id' has can_omit_when_null=true and is not guarded. Wrap in '{% if root_resolver.output.resolved_root_id is defined %}', '{% if root_resolver.output is defined %}', or pipe through '| default(...)'.

Check warning on line 420 in .conductor/registry/workflows/plan-level.yaml

View workflow job for this annotation

GitHub Actions / build-and-test

JINJA002: 'root_resolver.output.resolved_root_id' has can_omit_when_null=true and is not guarded. Wrap in '{% if root_resolver.output.resolved_root_id is defined %}', '{% if root_resolver.output is defined %}', or pipe through '| default(...)'.
- "--item-id"
- "{{ workflow.input.work_item_id }}"
routes:
Expand All @@ -433,7 +433,7 @@

Could not derive the plan-tree ancestor chain for work item
**{{ workflow.input.work_item_id }}** under root
**{{ root_resolver.output.resolved_root_id }}**.

Check warning on line 436 in .conductor/registry/workflows/plan-level.yaml

View workflow job for this annotation

GitHub Actions / build-and-test

JINJA002: 'root_resolver.output.resolved_root_id' has can_omit_when_null=true and is not guarded. Wrap in '{% if root_resolver.output.resolved_root_id is defined %}', '{% if root_resolver.output is defined %}', or pipe through '| default(...)'.

**Error:** {{ ancestor_chain.output.error if ancestor_chain.output.error is defined else "Unknown error" }}

Expand Down Expand Up @@ -1616,9 +1616,13 @@
# Identity policy (load-bearing — encoded in the prompt):
# 1. Comment with a `<!-- polyphony:agent-comment agent=... -->`
# marker → bot reviewer. INCLUDE the comment's content.
# 2. Comment author matches the PR author → exclude (PR author
# narrating their own work doesn't count as feedback to itself).
# 3. Otherwise → human reviewer. INCLUDE.
# 2. Otherwise → human reviewer. INCLUDE, regardless of whether
# the comment author matches the PR author identity. Polyphony
# PRs are agent-authored under the operator's PAT, so
# author-identity equality is NOT a valid proxy for bot-vs-
# human — the marker is. Self-approval / self-changes-requested
# is a platform branch-protection concern, not the analyzer's.
# See docs/decisions/pr-feedback-identity-classification.md.
#
# Severity policy (also load-bearing):
# - For polyphony's own bot reviewers (e.g. plan_reviewer), only
Expand Down Expand Up @@ -1700,20 +1704,23 @@
posting agent). Only treat items under the `**Blocking
concerns**` section as negative; ignore everything else in the
bot's body.
2. Else if `author == PR author identity` → **author self-comment**.
Exclude entirely (the PR author narrating their own work is
not feedback to themselves).
3. Else → **human reviewer**. Use natural-language judgement of
2. Else → **human reviewer**. Use natural-language judgement of
the comment's sentiment and any embedded code-review concerns.

For reviewer **votes**: a `changes_requested` vote from a non-
author reviewer counts as negative feedback regardless of any
explanatory comment. A `rejected` vote was already terminated by
the deterministic classifier — you should not see it here.
**Do NOT compare `author` against the PR author identity** —
polyphony PRs are agent-authored under the operator's PAT, so
the operator's own comments on the PR are valid human-in-the-
loop interventions and must be treated as feedback.

For reviewer **votes**: a `changes_requested` vote counts as
negative feedback regardless of who cast it (including the PR
author themselves — self-vote enforcement is a platform branch-
protection concern, not yours). A `rejected` vote was already
terminated by the deterministic classifier — you should not see
it here.

Unresolved threads (`is_resolved == false && is_outdated != true`)
that contain at least one in-scope (non-author) comment count as
negative feedback even if the comment body is short.
that contain at least one non-bot comment count as negative
feedback even if the comment body is short.

## Output

Expand Down
128 changes: 128 additions & 0 deletions docs/decisions/pr-feedback-identity-classification.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,128 @@
# PR Feedback Identity Classification — Marker, Not Author Identity

> **Status:** Accepted (2026-05-25).
> **Affects:** `.conductor/registry/workflows/{plan-level,github-pr,ado-pr}.yaml`
> — specifically the `pr_feedback_analyzer` prompt in each.
> **Supersedes:** the prior "identity policy" rule that excluded any
> comment whose author matched the PR author identity.

## Context

Three workflows (`plan-level.yaml`, `github-pr.yaml`, `ado-pr.yaml`) run a
sentiment-driven review loop centered on a `pr_feedback_analyzer` agent.
The analyzer reads the poll envelope (comments, threads, votes) and
returns `has_negative_feedback` to drive the revise/merge/abort routing.

Polyphony's PRs are almost always **agent-authored under the operator's
PAT** — the coder agent (and the marker-posting reviewer agents) push
through the same token the human operator uses. As a result, the PR
`author_identity` on the platform API is the **human operator**, not
the agent that actually drove the commits or posted the comments.

To disambiguate bot-from-human under a shared identity, polyphony emits a
deterministic HTML marker on every machine-posted comment:

```html
<!-- polyphony:agent-comment agent=plan_reviewer head_sha=abc1234 run_id=xyz -->
```

The parser lives at `src/Polyphony/Commands/PrCommentMarker.cs` and is
called out explicitly in its summary as existing for "the common ADO case
where `plan_reviewer_poster_ado` shares the operator's PAT".

### The bug this ADR addresses

Through 2026-05, the analyzer prompt carried a three-tier classifier:

1. Has `polyphony:agent-comment` marker → **bot**.
2. Else if `author == PR author identity` → **author self-comment**, exclude.
3. Else → **human reviewer**.

Rule 2 is wrong. It excludes the human operator's own comments on
agent-authored PRs — exactly the highest-priority signal in the system,
the **human-in-the-loop intervention**. An incident on 2026-05-25
surfaced this: an analyzer run reported `has_negative_feedback: false`
while three review threads from the operator identity were silently
dropped under rule 2.

The stated rationale ("the PR author narrating their own work doesn't
count as feedback to themselves") conflates two distinct cases:

- The **agent** narrating its own progress — yes, noise, but the marker
already identifies these as bot comments.
- The **human operator** intervening on an agent-authored PR — this is
authoritative feedback that must drive the revise loop.

Identity-equality is not a valid proxy for "narrator vs. intervener"
when the agent and the operator share an identity. The marker is.

## Decision

The analyzer classifies comments **by marker presence, not by author
identity**. The full rule, applied uniformly to comments AND votes:

1. **Marker present** → bot. Classify by the marker's `agent`
attribute. Apply existing bot logic (e.g. the
`**Blocking concerns**` section convention).
2. **Marker absent** → human. Apply natural-language judgement, regardless
of whether the author identity matches the PR author identity.

There is **no identity comparison** anywhere in the rule. A `+10` /
`approved` vote from the operator on their own PR counts as positive
feedback. A `-5` / `changes_requested` vote from the operator counts as
negative feedback. A free-form comment from the operator is judged on
its content like any other human comment.

### Why no carve-out for votes

A previous draft of this ADR kept identity-exclusion for the **votes**
case on the grounds that "self-approval is platform-blocked anyway".
That carve-out was rejected because:

- Self-approval is only platform-blocked at the **final merge to main**
in default branch-protection configurations. Earlier-stage approvals
(intermediate gates, draft reviews, ADO `+5` "approved with
suggestions" on a non-protected branch) are not universally blocked.
- Whether self-approval is permitted at any given stage is a
**policy/branch-protection concern enforced by the platform**, not a
concern of the polyphony feedback analyzer. If the platform blocks the
merge for policy reasons, the platform surfaces that downstream
(`mergeable_state`, merge-time refusal). The analyzer's job is only to
classify the **feedback content**, not to predict or enforce platform
merge policy.

Mixing platform-policy reasoning into the analyzer was the original
mistake and the carve-out would have replicated it in miniature.

## Consequences

- **Human operator interventions on agent-authored PRs now drive the
revise loop.** This is the intended human-in-the-loop affordance and
the load-bearing reason for the change.
- **Self-approval is honored as positive signal at the analyzer layer.**
Whether the resulting merge actually proceeds is the platform's call
via branch protection / approval-required policy. Operators who want
to forbid self-approval should configure that in branch protection,
not rely on the analyzer to silently filter it out.
- **The `author_identity` field stays in the rendered prompt context**
for diagnostic value (it appears in the "Context — Poll envelope"
block), but the analyzer no longer compares against it. Future
versions may drop the field entirely; keeping it for now eases
triage of analyzer reasoning traces.
- **The structural consistency lint
(`lint-sentiment-loop-consistency.ps1`) is unaffected** — it pins
agent shape, output schema, and route presence, not prompt text.
- **The marker parser
(`src/Polyphony/Commands/PrCommentMarker.cs`) becomes more
load-bearing.** Any future bot poster that fails to inject the marker
will have its comments classified as human feedback. The existing
Pester tests around the marker remain the safety net.

## Non-goals

- This ADR does not change which votes are **terminal** (handled by the
deterministic classifier upstream of the analyzer). The terminal
classifier's identity rules — if any — are out of scope and continue
to be evaluated on their own merits.
- This ADR does not change marker injection by polyphony's bot posters;
it only changes how the analyzer interprets the absence of a marker.
Loading
Loading