[POC] Introducing Code Review category #610

Open
haoranpb wants to merge 38 commits into main from category/code-review

Conversation


@haoranpb haoranpb commented Apr 12, 2026

A POC showing how the Code Review category can be implemented.

Try it out locally:

# See the placeholder task in the dataset
uv run bcbench dataset view microsoft__BCApps-4699 --category code-review

# See the default copilot review live
uv run bcbench -v evaluate copilot microsoft__BCApps-4699 --category code-review --repo-path C:\depot\BCApps

Getting Started

Code Review team to implement

  • Replace the dataset (see codereview.jsonl)
  • Implement the scoring and evaluation metric calculations; the only placeholder metric for now is the count of comments
  • Add the AL Code Review skill
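
To make the placeholder metric concrete, here is a minimal sketch of a "count of comments" calculation. It is not the code in this PR: the `ReviewComment` shape and the `comment_count` and `mean_comment_count` helpers are hypothetical names chosen for illustration, and the real scoring is still to be defined by the Code Review team.

```python
from dataclasses import dataclass


@dataclass
class ReviewComment:
    """Hypothetical shape of one review comment (an assumption, not from the PR)."""
    file: str
    line: int
    body: str


def comment_count(comments: list[ReviewComment]) -> int:
    """Placeholder metric: the number of review comments produced for one task."""
    return len(comments)


def mean_comment_count(per_task: dict[str, list[ReviewComment]]) -> float:
    """Average comment count across tasks; returns 0.0 when there are no tasks."""
    if not per_task:
        return 0.0
    return sum(comment_count(c) for c in per_task.values()) / len(per_task)


# Illustrative data only; the file names and comment bodies are made up.
example = {
    "microsoft__BCApps-4699": [
        ReviewComment("src/Example.al", 10, "Consider a more descriptive name."),
        ReviewComment("src/Example.al", 42, "Possible missing error handling."),
    ],
    "hypothetical-task-2": [],
}
```

A count like this only shows that the agent produced output. A real metric would presumably compare the comments against expected findings in the dataset.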

Leaderboard-related logic is intentionally left unimplemented for now.

Comment thread src/bcbench/evaluate/codereview.py Fixed
@haoranpb haoranpb left a comment

Solid progress.

One thing we should discuss: do we want this synthetic dataset, or do we want to invest in real-world production PRs?

Comment thread tools/code-review/run_all_evals.ps1
Comment thread src/bcbench/agent/copilot/agent.py Outdated
Comment thread src/bcbench/agent/shared/config.yaml Outdated
Comment thread src/bcbench/evaluate/codereview.py Outdated
Comment thread src/bcbench/evaluate/codereview.py
Comment thread src/bcbench/results/codereview.py Outdated
Comment thread src/bcbench/results/summary.py
Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@WaelAbuSeada WaelAbuSeada marked this pull request as ready for review May 21, 2026 15:56
4 participants