Skip to content

feat(agentready): centralized assessment repo onboarding#95

Closed
CryptoRodeo wants to merge 18 commits into
konflux-ci:mainfrom
CryptoRodeo:feat/agentready-submissions-onboarding
Closed

feat(agentready): centralized assessment repo onboarding#95
CryptoRodeo wants to merge 18 commits into
konflux-ci:mainfrom
CryptoRodeo:feat/agentready-submissions-onboarding

Conversation

@CryptoRodeo

@CryptoRodeo CryptoRodeo commented Jun 3, 2026

Copy link
Copy Markdown

⚠️ Pre Checklist

Please complete ALL items in this checklist, and remove before submitting

  • I have read through the Contributing Documentation.
  • I have added relevant tests.
  • I have added relevant documentation.
  • I will add labels to the PR, such as pr-type/bug-fix, pr-type/feature-development, etc.

Summary

What does this PR do?

I added the ability to toggle between single repo and a centralized repo in the scope config settings.
Here is the repo I used to test this new feature: https://github.com/CryptoRodeo/agentready-scores
It's a pretty simple setup. In submissions you add directories in this format: org/repo and in each repo folder you add the assessment file. For example: https://github.com/CryptoRodeo/agentready-scores/tree/main/submissions/TSD-UI/rhtas-console-ui

This should make onboarding way easier. No need to keep these files in your project repo (unless you want to)

This also opens up the possibility to immediately onboard teams. You can run the agentready tool against their repo, grab the assessment file and make them a directory. Or automate that process, as mentioned in this thread.

Does this close any open issues?

Closes KFLUXDP-1010

Screenshots

Include any relevant screenshots here.

Scope config UI:

submissions-scope-config-modal.mp4

Older demo showing full workflow (missing some new UI inputs):

agentready-central-repo-demo.mp4

Other Information

Any other information that is important to this PR.

Depends on #94

CryptoRodeo and others added 15 commits June 2, 2026 16:34
Add ${project} template variable and project_mapping JOINs to all three
dashboards (fleet-overview, findings-analysis, repo-detail) so panels
scope data to the selected DevLake project.

Assisted-By: Claude Opus 4.6
Signed-off-by: Bryan Ramos <bramos@redhat.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…very by connection

Calculator now discovers submissions repo IDs the same way the extractor
does. Discovery query is scoped by connection_id to prevent cross-project
data leaks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Bryan Ramos <bramos@redhat.com>
Signed-off-by: Bryan Ramos <bramos@redhat.com>
Signed-off-by: Bryan Ramos <bramos@redhat.com>
- allow users to specify the branch for the centralized submissions repo
- add migration script
- update scope config UI

Signed-off-by: Bryan Ramos <bramos@redhat.com>
Signed-off-by: Bryan Ramos <bramos@redhat.com>
@codecov-commenter

codecov-commenter commented Jun 3, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 0% with 94 lines in your changes missing coverage. Please review.
✅ Project coverage is 17.79%. Comparing base (b0112b9) to head (b69c53f).
⚠️ Report is 9 commits behind head on main.

Files with missing lines Patch % Lines
...s/project/detail/agentready-scope-config-modal.tsx 0.00% 94 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (b0112b9) and HEAD (b69c53f). Click for more details.

HEAD has 4 uploads less than BASE
Flag BASE (b0112b9) HEAD (b69c53f)
unit-tests-go 3 1
unit-tests-python 3 1
Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##             main      #95       +/-   ##
===========================================
- Coverage   59.41%   17.79%   -41.62%     
===========================================
  Files          81      402      +321     
  Lines        6105    21243    +15138     
  Branches        0       14       +14     
===========================================
+ Hits         3627     3780      +153     
- Misses       2403    17387    +14984     
- Partials       75       76        +1     
Flag Coverage Δ
unit-tests-go 44.00% <ø> (-8.43%) ⬇️
unit-tests-python 74.36% <ø> (ø)
unit-tests-typescript 0.95% <0.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
config-ui/src/api/plugin/agentready.ts 0.00% <ø> (ø)
...s/project/detail/agentready-scope-config-modal.tsx 0.00% <0.00%> (ø)

... and 319 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@CryptoRodeo CryptoRodeo changed the title Feat/agentready submissions onboarding feat(agentready): centralized assessment repo onboarding Jun 3, 2026
@Dannyb48

Dannyb48 commented Jun 4, 2026

Copy link
Copy Markdown

/agentic_review

@qodo-app-for-konflux-ci

qodo-app-for-konflux-ci Bot commented Jun 4, 2026

Copy link
Copy Markdown

Code Review by Qodo

🐞 Bugs (5) 📘 Rule violations (1) 🔗 Cross-repo conflicts (0)

Context used
✅ Compliance rules (platform): 110 rules

Grey Divider


Action required

1. Edits outside allowed plugin dirs 📘 Rule violation § Compliance
Description
This PR modifies backend/plugins/agentready/..., which is outside the only permitted change paths
for this PR. This violates the restriction on modifying files outside the owned plugin directories.
Code

backend/plugins/agentready/tasks/submissions_collector.go[1]

+package tasks
Relevance

⭐⭐⭐ High

Repo enforces owned-plugin boundary (aireview/codecov/testregistry); agentready changes likely
flagged as non-owned.

PR-#86

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
Compliance rule 984 restricts modifications to three owned plugin directory prefixes only. The PR
adds backend/plugins/agentready/tasks/submissions_collector.go, which is outside those allowed
directories, so the rule is violated.

Rule 984: Restrict code changes to owned plugin directories only
backend/plugins/agentready/tasks/submissions_collector.go[1-1]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
PR changes include files outside the allowed plugin directories.

## Issue Context
Per compliance requirements, all touched paths must be under one of:
- `backend/plugins/aireview/`
- `backend/plugins/codecov/`
- `backend/plugins/testregistry/`

## Fix Focus Areas
- backend/plugins/agentready/tasks/submissions_collector.go[1-1]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. Submissions branch ignored 🐞 Bug ≡ Correctness
Description
collectFromSubmissionsRepo fetches the tree using submissionsBranch but downloads each JSON via
FetchGithubAssessment with an empty ref, so it reads from the repo default branch instead of the
configured submissionsBranch and may ingest wrong/missing files.
Code

backend/plugins/agentready/tasks/submissions_collector.go[R152-194]

+	branch := config.SubmissionsBranch
+
+	// Fetch the full recursive tree
+	treeResp, err := FetchGithubTree(ctx, endpoint, config.SubmissionsRepo, branch, conn.Token)
+	if err != nil {
+		logger.Warn(nil, "Failed to fetch tree for submissions repo %s: %v", config.SubmissionsRepo, err)
+		return
+	}
+
+	if treeResp.Truncated {
+		logger.Warn(nil, "Submissions repo tree is truncated; some entries may be missing")
+	}
+
+	// Parse entries from the tree
+	entries := ParseSubmissionEntries(treeResp.Tree, submissionsPath)
+	logger.Info("Found %d submission entries in %s", len(entries), config.SubmissionsRepo)
+
+	// Register submission repos in project_mapping so Grafana dashboards
+	// (which JOIN on project_mapping) can find them.
+	if projectName != "" {
+		seen := map[string]bool{}
+		for _, entry := range entries {
+			repoId := MakeSubmissionsRepoId(entry.Org, entry.Repo)
+			if seen[repoId] {
+				continue
+			}
+			seen[repoId] = true
+			mapping := &projectMappingRow{
+				ProjectName: projectName,
+				Table:       "repos",
+				RowId:       repoId,
+			}
+			if mapErr := db.CreateOrUpdate(mapping); mapErr != nil {
+				logger.Warn(mapErr, "Failed to insert project_mapping for %s", repoId)
+			}
+		}
+	}
+
+	now := time.Now()
+	for _, entry := range entries {
+		// Fetch the assessment file content via the Contents API
+		rawJSON, fetchErr := FetchGithubAssessment(ctx, endpoint, config.SubmissionsRepo, entry.TreePath, "", conn.Token)
+		if fetchErr != nil {
Relevance

⭐⭐⭐ High

Team has history fixing branch-handling correctness; mismatch between tree branch and content ref
likely unacceptable.

PR-#82

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
The tree is fetched using branch := config.SubmissionsBranch, but individual files are fetched
with ref hardcoded to empty, which will not necessarily match the tree branch.

backend/plugins/agentready/tasks/submissions_collector.go[152-156]
backend/plugins/agentready/tasks/submissions_collector.go[190-194]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`collectFromSubmissionsRepo` discovers files from a specific branch via the Git Trees API, but then downloads those files via the Contents API without passing the same branch/ref. This makes collection inconsistent and can silently skip or ingest wrong assessments when `submissionsBranch` is set.

## Issue Context
- Tree discovery uses `config.SubmissionsBranch`.
- File download calls `FetchGithubAssessment(..., ref="")`, which defaults to the repo’s default branch.

## Fix Focus Areas
- backend/plugins/agentready/tasks/submissions_collector.go[152-194]

## Proposed fix
- Thread the same `branch` value into `FetchGithubAssessment`:
 - `FetchGithubAssessment(..., entry.TreePath, branch, conn.Token)`
- Ensure the behavior for empty `branch` is consistent between tree discovery and contents download (either both default to repo default branch or both default to the same configured fallback).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


3. Tree ref handling brittle 🐞 Bug ☼ Reliability
Description
FetchGithubTree hardcodes an empty branch to "main" and interpolates the ref directly into the URL
path without URL-escaping, so repos with non-main default branches or refs containing reserved
characters (e.g., '/') can fail tree discovery.
Code

backend/plugins/agentready/tasks/submissions_collector.go[R80-90]

+func FetchGithubTree(ctx context.Context, endpoint, fullName, branch, token string) (*githubTreeResponse, error) {
+	if token == "" {
+		return nil, fmt.Errorf("a GitHub token is required to fetch the tree")
+	}
+	if branch == "" {
+		branch = "main"
+	}
+
+	endpoint = strings.TrimSuffix(endpoint, "/")
+	apiURL := fmt.Sprintf("%s/repos/%s/git/trees/%s?recursive=1", endpoint, fullName, branch)
+
Relevance

⭐⭐⭐ High

Team addressed hardcoded "main"/default-branch issues before; expects using repo default branch and
robust ref handling.

PR-#82

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
Code forces empty branch to main and places the ref directly in the URL path; README states empty
should use repo default branch.

backend/plugins/agentready/tasks/submissions_collector.go[80-90]
backend/plugins/agentready/README.md[216-224]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`FetchGithubTree` forces `branch==""` to `"main"` and uses the branch/ref as a raw URL path segment. This conflicts with documented behavior (“empty uses default branch”) and can break refs that contain `/` or other URL-reserved characters unless escaped.

## Issue Context
- README documents `submissionsBranch` empty => repository default branch.
- Current implementation uses `main` for empty and does not escape the ref in the URL.

## Fix Focus Areas
- backend/plugins/agentready/tasks/submissions_collector.go[80-90]
- backend/plugins/agentready/README.md[216-224]

## Proposed fix
- If `branch` is empty, resolve the submissions repo’s default branch via GitHub repo API (or accept an explicit `default_branch` returned by the API) instead of hardcoding `main`.
- URL-escape the ref when building `/git/trees/{ref}` (e.g., `url.PathEscape(ref)`), or resolve the commit SHA first and call the trees API using the SHA (most robust).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


View more (1)
4. Submissions IDs unscoped 🐞 Bug ≡ Correctness
Description
Extractor/metrics append submission repo IDs by querying all assessments where
provider='submissions' (optionally filtered by connection_id) without filtering by the current
project or configured submissionsRepo, so runs can process unrelated submissions from other
projects/configurations.
Code

backend/plugins/agentready/tasks/assessment_extractor.go[R222-233]

+func discoverSubmissionRepoIds(db dal.Dal, connectionId uint64) ([]string, error) {
+	var rows []struct {
+		RepoId string `gorm:"column:repo_id"`
+	}
+	clauses := []dal.Clause{
+		dal.Select("DISTINCT repo_id"),
+		dal.From(&models.AgentReadyAssessment{}),
+		dal.Where("provider = ?", "submissions"),
+	}
+	if connectionId > 0 {
+		clauses = append(clauses, dal.Where("connection_id = ?", connectionId))
+	}
Relevance

⭐⭐ Medium

No historical evidence found for scoping submissions discovery queries by project/submissionsRepo in
agentready tasks.

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
Extraction adds submissions IDs when submissionsRepo is configured, but the discovery query only
filters by provider/connection and ignores project/submissionsRepo; submissions collector does track
project membership via project_mapping, but that data isn’t used for discovery.

backend/plugins/agentready/tasks/assessment_extractor.go[79-86]
backend/plugins/agentready/tasks/assessment_extractor.go[222-233]
backend/plugins/agentready/tasks/submissions_collector.go[169-188]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`discoverSubmissionRepoIds` returns every `repo_id` ever collected with `provider='submissions'` (optionally by connection). When extraction/metrics run for a given project/scope config, this can pull in submissions from other projects or other submissions repos, contaminating results.

## Issue Context
- Submissions collector writes project mappings for synthetic repo IDs.
- Extraction currently discovers submission IDs via `_tool_agentready_assessments` without using `project_mapping` or `submissionsRepo`.

## Fix Focus Areas
- backend/plugins/agentready/tasks/assessment_extractor.go[79-86]
- backend/plugins/agentready/tasks/assessment_extractor.go[222-233]
- backend/plugins/agentready/tasks/submissions_collector.go[169-188]
- backend/plugins/agentready/tasks/metrics_calculator.go[36-43]

## Proposed fix
- Change submission repo ID discovery to be project-scoped, e.g.:
 - Query `project_mapping` for the current `projectName` + `table='repos'` + `row_id LIKE 'submissions:%'` and use those row_ids.
- Alternatively (or additionally), persist `submissionsRepo` (and maybe `submissionsPath`) onto `_tool_agentready_assessments` rows and filter by it during discovery.
- Apply the same scoping fix in both extractor and metrics calculator.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

5. Multi-project Grafana overcounts 🐞 Bug ≡ Correctness
Description
Dashboards add a multi-select project variable and JOIN project_mapping directly; if a repo is
mapped to multiple selected projects, aggregates like COUNT(*) and AVG(...) will be duplicated and
biased.
Code

backend/plugins/agentready/grafana/findings-analysis.json[24]

+          "rawSql": "SELECT\n  f.attribute_name,\n  COUNT(*) AS failure_count\nFROM _tool_agentready_findings f\nJOIN project_mapping pm ON f.repo_id = pm.row_id AND pm.`table` = 'repos'\nWHERE f.status = 'fail' AND pm.project_name in (${project})\nGROUP BY f.attribute_id, f.attribute_name\nORDER BY failure_count DESC\nLIMIT 15",
Relevance

⭐⭐⭐ High

Team previously fixed Grafana inflated averages by deduplicating rows in agentready dashboards.

PR-#87

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
The dashboard enables multi-project selection and uses a plain join in aggregate queries; the
AgentReady API already warns that project_mapping joins can duplicate rows and uses DISTINCT to
guard against it.

backend/plugins/agentready/grafana/findings-analysis.json[16-46]
backend/plugins/agentready/grafana/findings-analysis.json[103-130]
backend/plugins/agentready/api/assessments.go[26-35]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Grafana queries join `project_mapping` and then aggregate with `COUNT(*)`/`AVG(...)`. With multi-select projects enabled, a repo that belongs to more than one selected project will appear multiple times, inflating counts and biasing averages.

## Issue Context
- Dashboard defines `project` as `multi: true` + `includeAll: true`.
- Queries use `JOIN project_mapping pm ... WHERE pm.project_name in (${project})`.
- AgentReady API code already documents that project_mapping joins can create duplicates and uses `SELECT DISTINCT` to guard against them.

## Fix Focus Areas
- backend/plugins/agentready/grafana/findings-analysis.json[16-46]
- backend/plugins/agentready/grafana/findings-analysis.json[103-130]
- backend/plugins/agentready/grafana/fleet-overview.json[21-32]
- backend/plugins/agentready/api/assessments.go[26-35]

## Proposed fix
- Join against a de-duplicated mapping subquery:
 - `JOIN (SELECT DISTINCT row_id FROM project_mapping WHERE `table`='repos' AND project_name IN (${project})) pm ON ...`
- Or replace JOIN with `WHERE EXISTS (...)` to avoid row multiplication.
- Apply to panels that use `COUNT(*)` or `AVG(...)` (e.g., failure counts, tier pass rates, avg score).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


6. Progress can exceed total 🐞 Bug ◔ Observability
Description
CollectAssessments sets progress total to the number of discovered repos, then
collectFromSubmissionsRepo increments progress once per submission file, so progress accounting can
exceed 100% and mislead task monitoring.
Code

backend/plugins/agentready/tasks/assessment_collector.go[R181-184]

+	if config.SubmissionsRepo != "" {
+		logger.Info("Collecting assessments from submissions repo: %s", config.SubmissionsRepo)
+		collectFromSubmissionsRepo(ctx, db, logger, taskCtx, config, data.Options.ProjectName)
+	}
Relevance

⭐⭐ Medium

No historical evidence found about task progress totals vs IncProgress overcounting in this repo.

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
Collector progress total is set using only repo count, but submissions collection also increments
progress per entry.

backend/plugins/agentready/tasks/assessment_collector.go[100-104]
backend/plugins/agentready/tasks/assessment_collector.go[181-184]
backend/plugins/agentready/tasks/submissions_collector.go[190-203]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Progress total is set to `len(repos)` but the task increments progress for both repo collection and each submissions file. This can cause progress to exceed the configured total and makes pipeline monitoring inaccurate.

## Issue Context
- `CollectAssessments` sets total based on repo count.
- Submissions collector calls `taskCtx.IncProgress(1)` for each submission entry.

## Fix Focus Areas
- backend/plugins/agentready/tasks/assessment_collector.go[100-104]
- backend/plugins/agentready/tasks/assessment_collector.go[181-184]
- backend/plugins/agentready/tasks/submissions_collector.go[190-203]

## Proposed fix
- Option A: Before processing submissions entries, compute `len(entries)` and update total: `SetProgress(current, len(repos)+len(entries))` (or equivalent supported API).
- Option B: Track submissions progress separately (don’t reuse the same progress counter) if DevLake supports subtasks/child progress.
- Option C: Don’t call `IncProgress` for submissions entries unless total includes them.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

Qodo Logo

@@ -0,0 +1,236 @@
package tasks

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

1. Edits outside allowed plugin dirs 📘 Rule violation § Compliance

This PR modifies backend/plugins/agentready/..., which is outside the only permitted change paths
for this PR. This violates the restriction on modifying files outside the owned plugin directories.
Agent Prompt
## Issue description
PR changes include files outside the allowed plugin directories.

## Issue Context
Per compliance requirements, all touched paths must be under one of:
- `backend/plugins/aireview/`
- `backend/plugins/codecov/`
- `backend/plugins/testregistry/`

## Fix Focus Areas
- backend/plugins/agentready/tasks/submissions_collector.go[1-1]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll update this in another PR once these plugin PRs get merged.

Comment on lines +152 to +194
branch := config.SubmissionsBranch

// Fetch the full recursive tree
treeResp, err := FetchGithubTree(ctx, endpoint, config.SubmissionsRepo, branch, conn.Token)
if err != nil {
logger.Warn(nil, "Failed to fetch tree for submissions repo %s: %v", config.SubmissionsRepo, err)
return
}

if treeResp.Truncated {
logger.Warn(nil, "Submissions repo tree is truncated; some entries may be missing")
}

// Parse entries from the tree
entries := ParseSubmissionEntries(treeResp.Tree, submissionsPath)
logger.Info("Found %d submission entries in %s", len(entries), config.SubmissionsRepo)

// Register submission repos in project_mapping so Grafana dashboards
// (which JOIN on project_mapping) can find them.
if projectName != "" {
seen := map[string]bool{}
for _, entry := range entries {
repoId := MakeSubmissionsRepoId(entry.Org, entry.Repo)
if seen[repoId] {
continue
}
seen[repoId] = true
mapping := &projectMappingRow{
ProjectName: projectName,
Table: "repos",
RowId: repoId,
}
if mapErr := db.CreateOrUpdate(mapping); mapErr != nil {
logger.Warn(mapErr, "Failed to insert project_mapping for %s", repoId)
}
}
}

now := time.Now()
for _, entry := range entries {
// Fetch the assessment file content via the Contents API
rawJSON, fetchErr := FetchGithubAssessment(ctx, endpoint, config.SubmissionsRepo, entry.TreePath, "", conn.Token)
if fetchErr != nil {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

2. Submissions branch ignored 🐞 Bug ≡ Correctness

collectFromSubmissionsRepo fetches the tree using submissionsBranch but downloads each JSON via
FetchGithubAssessment with an empty ref, so it reads from the repo default branch instead of the
configured submissionsBranch and may ingest wrong/missing files.
Agent Prompt
## Issue description
`collectFromSubmissionsRepo` discovers files from a specific branch via the Git Trees API, but then downloads those files via the Contents API without passing the same branch/ref. This makes collection inconsistent and can silently skip or ingest wrong assessments when `submissionsBranch` is set.

## Issue Context
- Tree discovery uses `config.SubmissionsBranch`.
- File download calls `FetchGithubAssessment(..., ref="")`, which defaults to the repo’s default branch.

## Fix Focus Areas
- backend/plugins/agentready/tasks/submissions_collector.go[152-194]

## Proposed fix
- Thread the same `branch` value into `FetchGithubAssessment`:
  - `FetchGithubAssessment(..., entry.TreePath, branch, conn.Token)`
- Ensure the behavior for empty `branch` is consistent between tree discovery and contents download (either both default to repo default branch or both default to the same configured fallback).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

@CryptoRodeo CryptoRodeo Jun 4, 2026

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed here: bcaed61

Comment on lines +80 to +90
func FetchGithubTree(ctx context.Context, endpoint, fullName, branch, token string) (*githubTreeResponse, error) {
if token == "" {
return nil, fmt.Errorf("a GitHub token is required to fetch the tree")
}
if branch == "" {
branch = "main"
}

endpoint = strings.TrimSuffix(endpoint, "/")
apiURL := fmt.Sprintf("%s/repos/%s/git/trees/%s?recursive=1", endpoint, fullName, branch)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

3. Tree ref handling brittle 🐞 Bug ☼ Reliability

FetchGithubTree hardcodes an empty branch to "main" and interpolates the ref directly into the URL
path without URL-escaping, so repos with non-main default branches or refs containing reserved
characters (e.g., '/') can fail tree discovery.
Agent Prompt
## Issue description
`FetchGithubTree` forces `branch==""` to `"main"` and uses the branch/ref as a raw URL path segment. This conflicts with documented behavior (“empty uses default branch”) and can break refs that contain `/` or other URL-reserved characters unless escaped.

## Issue Context
- README documents `submissionsBranch` empty => repository default branch.
- Current implementation uses `main` for empty and does not escape the ref in the URL.

## Fix Focus Areas
- backend/plugins/agentready/tasks/submissions_collector.go[80-90]
- backend/plugins/agentready/README.md[216-224]

## Proposed fix
- If `branch` is empty, resolve the submissions repo’s default branch via GitHub repo API (or accept an explicit `default_branch` returned by the API) instead of hardcoding `main`.
- URL-escape the ref when building `/git/trees/{ref}` (e.g., `url.PathEscape(ref)`), or resolve the commit SHA first and call the trees API using the SHA (most robust).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Defaulting to 'main' should be fine if a branch isn't passed. I'll check on the URL-escaping.

@CryptoRodeo CryptoRodeo Jun 4, 2026

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed here: 3fd4726

Comment on lines +222 to +233
func discoverSubmissionRepoIds(db dal.Dal, connectionId uint64) ([]string, error) {
var rows []struct {
RepoId string `gorm:"column:repo_id"`
}
clauses := []dal.Clause{
dal.Select("DISTINCT repo_id"),
dal.From(&models.AgentReadyAssessment{}),
dal.Where("provider = ?", "submissions"),
}
if connectionId > 0 {
clauses = append(clauses, dal.Where("connection_id = ?", connectionId))
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

4. Submissions ids unscoped 🐞 Bug ≡ Correctness

Extractor/metrics append submission repo IDs by querying all assessments where
provider='submissions' (optionally filtered by connection_id) without filtering by the current
project or configured submissionsRepo, so runs can process unrelated submissions from other
projects/configurations.
Agent Prompt
## Issue description
`discoverSubmissionRepoIds` returns every `repo_id` ever collected with `provider='submissions'` (optionally by connection). When extraction/metrics run for a given project/scope config, this can pull in submissions from other projects or other submissions repos, contaminating results.

## Issue Context
- Submissions collector writes project mappings for synthetic repo IDs.
- Extraction currently discovers submission IDs via `_tool_agentready_assessments` without using `project_mapping` or `submissionsRepo`.

## Fix Focus Areas
- backend/plugins/agentready/tasks/assessment_extractor.go[79-86]
- backend/plugins/agentready/tasks/assessment_extractor.go[222-233]
- backend/plugins/agentready/tasks/submissions_collector.go[169-188]
- backend/plugins/agentready/tasks/metrics_calculator.go[36-43]

## Proposed fix
- Change submission repo ID discovery to be project-scoped, e.g.:
  - Query `project_mapping` for the current `projectName` + `table='repos'` + `row_id LIKE 'submissions:%'` and use those row_ids.
- Alternatively (or additionally), persist `submissionsRepo` (and maybe `submissionsPath`) onto `_tool_agentready_assessments` rows and filter by it during discovery.
- Apply the same scoping fix in both extractor and metrics calculator.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooh this is a great catch!

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed here: b69c53f

…files

collectFromSubmissionsRepo fetched the tree using submissionsBranch but
downloaded each JSON via FetchGithubAssessment with an empty ref,
causing reads from the repo default branch instead of the configured
submissions branch.

Signed-off-by: Bryan Ramos <bramos@redhat.com>
FetchGithubTree interpolated the branch ref directly into the URL path
without escaping. Branches containing reserved characters like "/"
(e.g. "chore/entries") produced malformed URLs where GitHub interpreted
the slash as a path separator instead of part of the ref name.

Use url.PathEscape on the branch before URL interpolation.

Signed-off-by: Bryan Ramos <bramos@redhat.com>
discoverSubmissionRepoIds was querying all assessments with
provider='submissions' without filtering by the current project,
causing extraction and metrics to process unrelated submissions
from other projects.

Now uses project_mapping for scoping when projectName is available,
matching the pattern used by discoverRepos.

Signed-off-by: Bryan Ramos <bramos@redhat.com>
@Dannyb48

Dannyb48 commented Jun 4, 2026

Copy link
Copy Markdown

@CryptoRodeo thinking out loud.

How do we handle deletes? Like what if someone removes their submission/org/repo because maybe they don't want to report their scores for a particular org/repo anymore or its archived. Should capture that diff somehow delete any repo that isn't their anymore? I'm curious what is the actual precedent is with these other plugins?

Also, if I want to collect this for konflux, for each Konflux team there is a project configured in devlake to point to Konflux's central agentready scores repo. It feels odd to have to set this for every project. I was wondering if we should switch more to test registries plugin model? With connection, scopeconfig, and assigning projects. It feels like a better UX. Something to think about?

@CryptoRodeo

Copy link
Copy Markdown
Author

@Dannyb48 Great points, I'll get back to you on these

@CryptoRodeo

CryptoRodeo commented Jun 4, 2026

Copy link
Copy Markdown
Author

@Dannyb48 I'm gonna look into making it more like the testregistry plugin. This will require the plugin to be converted from a metrics plugin to a datasource plugin. I'm thinking of scrapping the single-repo feature and sticking with the centralized-repo idea to keep this simple.

For deletions, it should be simple enough to delete the entry if the repo no longer exists when scanning. But I need to look into this.

@CryptoRodeo

Copy link
Copy Markdown
Author

Closing, all work moving to #97

Still need to tackle the deletion logic @Dannyb48 mentioned.

@CryptoRodeo CryptoRodeo closed this Jun 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants