feat(agentready): centralized assessment repo onboarding#95
feat(agentready): centralized assessment repo onboarding#95CryptoRodeo wants to merge 18 commits into
Conversation
Add ${project} template variable and project_mapping JOINs to all three
dashboards (fleet-overview, findings-analysis, repo-detail) so panels
scope data to the selected DevLake project.
Assisted-By: Claude Opus 4.6
Signed-off-by: Bryan Ramos <bramos@redhat.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…very by connection Calculator now discovers submissions repo IDs the same way the extractor does. Discovery query is scoped by connection_id to prevent cross-project data leaks. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Bryan Ramos <bramos@redhat.com>
…bmissions-onboarding
Signed-off-by: Bryan Ramos <bramos@redhat.com>
Signed-off-by: Bryan Ramos <bramos@redhat.com>
- allow users to specify the branch for the centralized submissions repo - add migration script - update scope config UI Signed-off-by: Bryan Ramos <bramos@redhat.com>
Signed-off-by: Bryan Ramos <bramos@redhat.com>
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #95 +/- ##
===========================================
- Coverage 59.41% 17.79% -41.62%
===========================================
Files 81 402 +321
Lines 6105 21243 +15138
Branches 0 14 +14
===========================================
+ Hits 3627 3780 +153
- Misses 2403 17387 +14984
- Partials 75 76 +1
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
|
/agentic_review |
Code Review by Qodo
Context used✅ Compliance rules (platform):
110 rules 1. Edits outside allowed plugin dirs
|
| @@ -0,0 +1,236 @@ | |||
| package tasks | |||
There was a problem hiding this comment.
1. Edits outside allowed plugin dirs 📘 Rule violation § Compliance
This PR modifies backend/plugins/agentready/..., which is outside the only permitted change paths for this PR. This violates the restriction on modifying files outside the owned plugin directories.
Agent Prompt
## Issue description
PR changes include files outside the allowed plugin directories.
## Issue Context
Per compliance requirements, all touched paths must be under one of:
- `backend/plugins/aireview/`
- `backend/plugins/codecov/`
- `backend/plugins/testregistry/`
## Fix Focus Areas
- backend/plugins/agentready/tasks/submissions_collector.go[1-1]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
There was a problem hiding this comment.
I'll update this in another PR once these plugin PRs get merged.
| branch := config.SubmissionsBranch | ||
|
|
||
| // Fetch the full recursive tree | ||
| treeResp, err := FetchGithubTree(ctx, endpoint, config.SubmissionsRepo, branch, conn.Token) | ||
| if err != nil { | ||
| logger.Warn(nil, "Failed to fetch tree for submissions repo %s: %v", config.SubmissionsRepo, err) | ||
| return | ||
| } | ||
|
|
||
| if treeResp.Truncated { | ||
| logger.Warn(nil, "Submissions repo tree is truncated; some entries may be missing") | ||
| } | ||
|
|
||
| // Parse entries from the tree | ||
| entries := ParseSubmissionEntries(treeResp.Tree, submissionsPath) | ||
| logger.Info("Found %d submission entries in %s", len(entries), config.SubmissionsRepo) | ||
|
|
||
| // Register submission repos in project_mapping so Grafana dashboards | ||
| // (which JOIN on project_mapping) can find them. | ||
| if projectName != "" { | ||
| seen := map[string]bool{} | ||
| for _, entry := range entries { | ||
| repoId := MakeSubmissionsRepoId(entry.Org, entry.Repo) | ||
| if seen[repoId] { | ||
| continue | ||
| } | ||
| seen[repoId] = true | ||
| mapping := &projectMappingRow{ | ||
| ProjectName: projectName, | ||
| Table: "repos", | ||
| RowId: repoId, | ||
| } | ||
| if mapErr := db.CreateOrUpdate(mapping); mapErr != nil { | ||
| logger.Warn(mapErr, "Failed to insert project_mapping for %s", repoId) | ||
| } | ||
| } | ||
| } | ||
|
|
||
| now := time.Now() | ||
| for _, entry := range entries { | ||
| // Fetch the assessment file content via the Contents API | ||
| rawJSON, fetchErr := FetchGithubAssessment(ctx, endpoint, config.SubmissionsRepo, entry.TreePath, "", conn.Token) | ||
| if fetchErr != nil { |
There was a problem hiding this comment.
2. Submissions branch ignored 🐞 Bug ≡ Correctness
collectFromSubmissionsRepo fetches the tree using submissionsBranch but downloads each JSON via FetchGithubAssessment with an empty ref, so it reads from the repo default branch instead of the configured submissionsBranch and may ingest wrong/missing files.
Agent Prompt
## Issue description
`collectFromSubmissionsRepo` discovers files from a specific branch via the Git Trees API, but then downloads those files via the Contents API without passing the same branch/ref. This makes collection inconsistent and can silently skip or ingest wrong assessments when `submissionsBranch` is set.
## Issue Context
- Tree discovery uses `config.SubmissionsBranch`.
- File download calls `FetchGithubAssessment(..., ref="")`, which defaults to the repo’s default branch.
## Fix Focus Areas
- backend/plugins/agentready/tasks/submissions_collector.go[152-194]
## Proposed fix
- Thread the same `branch` value into `FetchGithubAssessment`:
- `FetchGithubAssessment(..., entry.TreePath, branch, conn.Token)`
- Ensure the behavior for empty `branch` is consistent between tree discovery and contents download (either both default to repo default branch or both default to the same configured fallback).
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| func FetchGithubTree(ctx context.Context, endpoint, fullName, branch, token string) (*githubTreeResponse, error) { | ||
| if token == "" { | ||
| return nil, fmt.Errorf("a GitHub token is required to fetch the tree") | ||
| } | ||
| if branch == "" { | ||
| branch = "main" | ||
| } | ||
|
|
||
| endpoint = strings.TrimSuffix(endpoint, "/") | ||
| apiURL := fmt.Sprintf("%s/repos/%s/git/trees/%s?recursive=1", endpoint, fullName, branch) | ||
|
|
There was a problem hiding this comment.
3. Tree ref handling brittle 🐞 Bug ☼ Reliability
FetchGithubTree hardcodes an empty branch to "main" and interpolates the ref directly into the URL path without URL-escaping, so repos with non-main default branches or refs containing reserved characters (e.g., '/') can fail tree discovery.
Agent Prompt
## Issue description
`FetchGithubTree` forces `branch==""` to `"main"` and uses the branch/ref as a raw URL path segment. This conflicts with documented behavior (“empty uses default branch”) and can break refs that contain `/` or other URL-reserved characters unless escaped.
## Issue Context
- README documents `submissionsBranch` empty => repository default branch.
- Current implementation uses `main` for empty and does not escape the ref in the URL.
## Fix Focus Areas
- backend/plugins/agentready/tasks/submissions_collector.go[80-90]
- backend/plugins/agentready/README.md[216-224]
## Proposed fix
- If `branch` is empty, resolve the submissions repo’s default branch via GitHub repo API (or accept an explicit `default_branch` returned by the API) instead of hardcoding `main`.
- URL-escape the ref when building `/git/trees/{ref}` (e.g., `url.PathEscape(ref)`), or resolve the commit SHA first and call the trees API using the SHA (most robust).
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
There was a problem hiding this comment.
Defaulting to 'main' should be fine if a branch isn't passed. I'll check on the URL-escaping.
| func discoverSubmissionRepoIds(db dal.Dal, connectionId uint64) ([]string, error) { | ||
| var rows []struct { | ||
| RepoId string `gorm:"column:repo_id"` | ||
| } | ||
| clauses := []dal.Clause{ | ||
| dal.Select("DISTINCT repo_id"), | ||
| dal.From(&models.AgentReadyAssessment{}), | ||
| dal.Where("provider = ?", "submissions"), | ||
| } | ||
| if connectionId > 0 { | ||
| clauses = append(clauses, dal.Where("connection_id = ?", connectionId)) | ||
| } |
There was a problem hiding this comment.
4. Submissions ids unscoped 🐞 Bug ≡ Correctness
Extractor/metrics append submission repo IDs by querying all assessments where provider='submissions' (optionally filtered by connection_id) without filtering by the current project or configured submissionsRepo, so runs can process unrelated submissions from other projects/configurations.
Agent Prompt
## Issue description
`discoverSubmissionRepoIds` returns every `repo_id` ever collected with `provider='submissions'` (optionally by connection). When extraction/metrics run for a given project/scope config, this can pull in submissions from other projects or other submissions repos, contaminating results.
## Issue Context
- Submissions collector writes project mappings for synthetic repo IDs.
- Extraction currently discovers submission IDs via `_tool_agentready_assessments` without using `project_mapping` or `submissionsRepo`.
## Fix Focus Areas
- backend/plugins/agentready/tasks/assessment_extractor.go[79-86]
- backend/plugins/agentready/tasks/assessment_extractor.go[222-233]
- backend/plugins/agentready/tasks/submissions_collector.go[169-188]
- backend/plugins/agentready/tasks/metrics_calculator.go[36-43]
## Proposed fix
- Change submission repo ID discovery to be project-scoped, e.g.:
- Query `project_mapping` for the current `projectName` + `table='repos'` + `row_id LIKE 'submissions:%'` and use those row_ids.
- Alternatively (or additionally), persist `submissionsRepo` (and maybe `submissionsPath`) onto `_tool_agentready_assessments` rows and filter by it during discovery.
- Apply the same scoping fix in both extractor and metrics calculator.
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
…files collectFromSubmissionsRepo fetched the tree using submissionsBranch but downloaded each JSON via FetchGithubAssessment with an empty ref, causing reads from the repo default branch instead of the configured submissions branch. Signed-off-by: Bryan Ramos <bramos@redhat.com>
FetchGithubTree interpolated the branch ref directly into the URL path without escaping. Branches containing reserved characters like "/" (e.g. "chore/entries") produced malformed URLs where GitHub interpreted the slash as a path separator instead of part of the ref name. Use url.PathEscape on the branch before URL interpolation. Signed-off-by: Bryan Ramos <bramos@redhat.com>
discoverSubmissionRepoIds was querying all assessments with provider='submissions' without filtering by the current project, causing extraction and metrics to process unrelated submissions from other projects. Now uses project_mapping for scoping when projectName is available, matching the pattern used by discoverRepos. Signed-off-by: Bryan Ramos <bramos@redhat.com>
|
@CryptoRodeo thinking out loud. How do we handle deletes? Like what if someone removes their submission/org/repo because maybe they don't want to report their scores for a particular org/repo anymore or its archived. Should capture that diff somehow delete any repo that isn't their anymore? I'm curious what is the actual precedent is with these other plugins? Also, if I want to collect this for konflux, for each Konflux team there is a project configured in devlake to point to Konflux's central agentready scores repo. It feels odd to have to set this for every project. I was wondering if we should switch more to test registries plugin model? With connection, scopeconfig, and assigning projects. It feels like a better UX. Something to think about? |
|
@Dannyb48 Great points, I'll get back to you on these |
|
@Dannyb48 I'm gonna look into making it more like the testregistry plugin. This will require the plugin to be converted from a metrics plugin to a datasource plugin. I'm thinking of scrapping the single-repo feature and sticking with the centralized-repo idea to keep this simple. For deletions, it should be simple enough to delete the entry if the repo no longer exists when scanning. But I need to look into this. |
pr-type/bug-fix,pr-type/feature-development, etc.Summary
What does this PR do?
I added the ability to toggle between single repo and a centralized repo in the scope config settings.
Here is the repo I used to test this new feature: https://github.com/CryptoRodeo/agentready-scores
It's a pretty simple setup. In submissions you add directories in this format: org/repo and in each repo folder you add the assessment file. For example: https://github.com/CryptoRodeo/agentready-scores/tree/main/submissions/TSD-UI/rhtas-console-ui
This should make onboarding way easier. No need to keep these files in your project repo (unless you want to)
This also opens up the possibility to immediately onboard teams. You can run the agentready tool against their repo, grab the assessment file and make them a directory. Or automate that process, as mentioned in this thread.
Does this close any open issues?
Closes KFLUXDP-1010
Screenshots
Include any relevant screenshots here.
Scope config UI:
submissions-scope-config-modal.mp4
Older demo showing full workflow (missing some new UI inputs):
agentready-central-repo-demo.mp4
Other Information
Any other information that is important to this PR.
Depends on #94