Skip to content

SREP-347: Adding 'osdctl rhobs' subcommand to interact with RHOBS#876

Open
Nikokolas3270 wants to merge 2 commits intoopenshift:masterfrom
Nikokolas3270:SREP-347
Open

SREP-347: Adding 'osdctl rhobs' subcommand to interact with RHOBS#876
Nikokolas3270 wants to merge 2 commits intoopenshift:masterfrom
Nikokolas3270:SREP-347

Conversation

@Nikokolas3270
Copy link
Copy Markdown
Contributor

@Nikokolas3270 Nikokolas3270 commented Apr 13, 2026

Sub-subcommands which are already ready for review:

  • cell: print the RHOBS cell for a given or for the current cluster
  • metrics: print the metrics for a given or for the current cluster
  • logs: print the logs for a given or for the current MC cluster

Summary by CodeRabbit

  • New Features

    • Added a top-level rhobs command with cell, logs, and metrics subcommands to query RHOBS observability data.
  • New Features / Enhancements

    • Logs: rich Loki-style filtering, time-range controls, paging, and text/csv/json outputs.
    • Metrics: instant Prometheus queries with cluster-aware filtering and table/csv/json outputs.
  • Improvements

    • Cleaner global CLI flag handling and explicit skip-version-check flag.
    • Confirmation prompt and some CLI output now write to stderr for clearer UX.
    • Updated help and command documentation to include rhobs and streamlined flag displays.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Apr 13, 2026
@openshift-ci-robot
Copy link
Copy Markdown

openshift-ci-robot commented Apr 13, 2026

@Nikokolas3270: This pull request references SREP-347 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Sub-subcommands which are already ready for review:

  • cell: give the RHOBS cell for a given or for the current cluster
  • metrics: give the metrics for a given or for the current cluster - currently only the JSON output format is available... working on some more human friendly formats

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci Bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 13, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 13, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Refactors global flag handling into a per-command GlobalOptions API, centralizes HTTP/token/vault logic into pkg/utils (Requester, token, vault), updates Dynatrace callers to use these shared utils, and adds a new top-level rhobs command with subcommands: cell, logs, and metrics. Documentation and deps updated.

Changes

CLI Architecture Refactor

Layer / File(s) Summary
Data Shape / Constructor
internal/utils/globalflags/globalflags.go
Adds NewGlobalOptions(); GlobalOptions gains KubeFlags genericclioptions.ConfigFlags.
Flag Registration API
internal/utils/globalflags/globalflags.go
Adds instance methods: AddSkipVersionCheckFlag, AddOutputFlag, AddNoAwsProxyFlag, AddKubeFlags. Removes prior package helpers for global flags.
Root Wiring / Command Integration
cmd/cmd.go, main.go
Root now constructs globalOpts from NewGlobalOptions(), registers skip-version-check via globalOpts.AddSkipVersionCheckFlag, conditionally writes --no-proxy into viper only when flag exists on the executing command, attaches per-subcommand flags via a helper, creates kube client from &globalOpts.KubeFlags, and registers rhobs.NewCmdRhobs(). main.go sets cobra.EnableTraverseRunHooks = true.
Docs
docs/*.md
Updated help/docs to reflect condensed global-flag listings and new rhobs command pages.

RHOBS + Shared Utils & Dynatrace Refactor

Layer / File(s) Summary
New Feature API / Types
cmd/rhobs/requests.go
Adds exported RhobsFetcher, MetricsFormat/LogsFormat enums and parsers, and RhobsFetchUsage constants and helpers.
RHOBS Command Root & Options
cmd/rhobs/rootCmd.go
Adds NewCmdRhobs() root command with persistent --cluster-id/-C and --hive-ocm-url flags and lazy cluster-id resolution via PersistentPreRunE.
RHOBS Subcommands
cmd/rhobs/cellCmd.go, cmd/rhobs/logsCmd.go, cmd/rhobs/metricsCmd.go
Adds cell (prints RHOBS cell(s)), logs [pod] (Loki query builder, filters, time/limit/direction/output), and metrics <expr> (PromQL instant query) subcommands.
RHOBS Cell Resolution & Fetcher Construction
cmd/rhobs/requests.go
CreateRhobsFetcher resolves cluster context (maps HCP/HyperShift to management cluster for metrics), determines RHOBS cell via usage-specific OCM configmap or Hive ClusterDeployment fallback, reloads config, and constructs a RhobsFetcher that injects scoped bearer tokens per request.
Core RHOBS Behavior
cmd/rhobs/requests.go
Implements PrintMetrics (instant PromQL query, status validation, optional cluster filtering, table/csv/json output) and PrintLogs (LogQL range queries with sliding time-window pagination, flatten/sort/print entries, format-specific printers, 1ns boundary shift to avoid stalling).
Vault / Token / Requester Core
pkg/utils/vault.go, pkg/utils/token.go, pkg/utils/requester.go
Moves vault helpers into utils with VaultAddrKey, VaultRef, GetVaultRef, GetSecretFromVault; refactors GetCredsFromVault to return explicit errors; adds GetScopedAccessToken and GetScopedTokenProvider; adds exported Requester with Send() implementing HTTP request/response/error parsing.
Dynatrace Callers Update
cmd/dynatrace/requests.go, cmd/dynatrace/hcpGatherLogsCmd.go
Renames Dynatrace vault-path constants to DT*VaultPathKey, removes local Dynatrace Requester/error types, and replaces local request/token helpers with utils.Requester.Send() and utils.GetScopedAccessToken / utils.GetScopedTokenProvider. Updates gather-logs helper signatures to accept utils.AccessTokenProvider.
Docs / Tests
docs/osdctl_rhobs*.md, pkg/utils/vault_test.go
Adds rhobs documentation pages and README updates; test package for vault changed from dynatraceutils (no test logic changes).

Minor UX / Utilities / Dependency Changes

Layer / File(s) Summary
Prompt stream
pkg/utils/utils.go
ConfirmPrompt now writes prompt and invalid-input messages to stderr (os.Stderr) instead of stdout.
Dependency / Module
go.mod
Adds github.com/observatorium/api (with replace), adds github.com/ghodss/yaml indirectly, and updates k8s.io/component-base to v0.32.2.
Docs Formatting
docs/*.md
Extensive help-output formatting edits and relocation/condensing of --skip-version-check/global flags across many command docs and README.

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant Cmd as metrics<br/>Command
    participant Fetcher as RhobsFetcher
    participant OCM as OCM / Hive
    participant Vault as Vault
    participant RHOBS as RHOBS API

    Client->>Cmd: Execute with PromQL expression
    Cmd->>Fetcher: CreateRhobsFetcher(clusterId, hiveOcmUrl)
    Fetcher->>OCM: Resolve cluster / read configMap or ClusterDeployment
    Fetcher->>Vault: GetCredsFromVault(vaultConfigKey)
    Vault-->>Fetcher: client_id, client_secret
    Fetcher->>Vault: GetScopedAccessToken(authUrl, vaultConfigKey, scopes)
    Vault-->>Fetcher: access_token
    Fetcher-->>Cmd: RhobsFetcher ready
    Cmd->>Fetcher: PrintMetrics(expression, format, filter)
    Fetcher->>RHOBS: Send instant PromQL query (with token)
    RHOBS-->>Fetcher: Metrics results
    Fetcher->>Cmd: Formatted output
Loading
sequenceDiagram
    participant Client as Client
    participant Cmd as logs<br/>Command
    participant Fetcher as RhobsFetcher
    participant OCM as OCM / Hive
    participant Vault as Vault
    participant RHOBS as RHOBS API

    Client->>Cmd: Execute with pod/query options
    Cmd->>Cmd: Build LogQL expression (or use --query)
    Cmd->>Fetcher: CreateRhobsFetcher(clusterId, hiveOcmUrl)
    Fetcher->>OCM: Resolve cluster / cell
    Fetcher->>Vault: GetCredsFromVault(vaultConfigKey)
    Vault-->>Fetcher: credentials
    Fetcher->>Vault: GetScopedAccessToken(authUrl, vaultConfigKey, scopes)
    Vault-->>Fetcher: access_token
    Fetcher-->>Cmd: RhobsFetcher ready
    Cmd->>Fetcher: PrintLogs(expr, startTime, endTime, limit, direction)
    loop while more logs needed
        Fetcher->>RHOBS: Range query [start,end]
        RHOBS-->>Fetcher: Streams with values
        Fetcher->>Fetcher: Flatten, parse timestamps, sort
        Fetcher->>Cmd: Print entries (decrement remaining)
        Fetcher->>Fetcher: Advance time window (skip boundary or +1ns)
    end
    Cmd-->>Client: Output logs
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 11 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 15.56% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (11 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title clearly and specifically describes the main change: adding a new 'osdctl rhobs' subcommand to interact with RHOBS, which aligns with the substantial code additions for rhobs functionality across multiple files.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed No Ginkgo test files were added or modified in this PR. Check requires Ginkgo test examination, which is not applicable.
Test Structure And Quality ✅ Passed No Ginkgo test files were added or substantially modified in this PR. The check for test structure and quality is not applicable as there are no tests to evaluate.
Microshift Test Compatibility ✅ Passed This PR does not add any Ginkgo e2e tests. It adds CLI command implementations, utilities, and documentation. The custom check for MicroShift test compatibility is not applicable.
Single Node Openshift (Sno) Test Compatibility ✅ Passed No Ginkgo e2e tests are added in this PR. The PR adds CLI commands, utility libraries, unit tests (standard Go testing, not Ginkgo), and documentation. The custom check is not applicable.
Topology-Aware Scheduling Compatibility ✅ Passed This PR adds CLI tooling only (osdctl rhobs command suite) with no deployment manifests, operator code, or controllers. Not applicable to this CLI utility.
Ote Binary Stdout Contract ✅ Passed PR does not introduce OTE violations. Pre-existing stdout issues in main.go unchanged. New RHOBS code uses stdout only in command implementations. ConfirmPrompt fixed to stderr.
Ipv6 And Disconnected Network Test Compatibility ✅ Passed PR does not add any Ginkgo e2e tests. Changes consist of CLI command implementations, utility functions, and documentation. Check not applicable.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Comment @coderabbitai help to get the list of available commands and usage tips.

@Nikokolas3270 Nikokolas3270 marked this pull request as draft April 13, 2026 08:54
@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 13, 2026
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Apr 13, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Nikokolas3270

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 13, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
cmd/vault/vault_test.go (1)

37-45: These tests mirror the implementation instead of exercising it.

Both test bodies rebuild the loginArgs and stdout/stderr branches locally rather than calling setupVaultToken() in cmd/vault/vault.go. That means they can stay green even if the production logic drifts. Please extract the command-construction/output-selection into a helper and assert against that, or inject the exec factory so the tests hit the real code path.

Also applies to: 123-137

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/vault/vault_test.go` around lines 37 - 45, The tests currently
reconstruct loginArgs and stdout/stderr themselves instead of exercising
setupVaultToken; refactor so the production code exposes the
command-construction logic or an injectable exec factory and update tests to
call setupVaultToken so they assert the real behavior. Specifically, extract the
logic that builds the vault command into a helper like buildVaultLoginArgs (or
accept an execCommandFactory in setupVaultToken) and have the tests call
setupVaultToken (or use the injected factory) to capture the constructed
exec.Command and chosen output stream, then assert on that command args and
output selection instead of rebuilding loginArgs locally; apply the same change
to the other test block currently mirroring lines 123-137.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cmd/rhobs/logsCmd.go`:
- Around line 13-37: The newCmdLogs command advertises a --cluster-id flag but
never registers it and its main_ always returns nil; fix by registering the flag
in newCmdLogs (e.g., use cmd.Flags().StringVar(&clusterId, "cluster-id", "",
"cluster identifier") or equivalent) so the flag is recognized, and change main_
to return an explicit not-implemented error (e.g., return fmt.Errorf("logs
command not implemented")) until a real implementation exists; reference
newCmdLogs for flag registration and main_ for the error return.

In `@cmd/rhobs/metricsCmd.go`:
- Around line 59-66: The function retrieveMetrics should not prepend a
human-readable banner for machine-readable outputs; update retrieveMetrics to
only print the "Metrics:" banner for non-structured formats and emit the raw
metrics payload when format is JSON or CSV. Locate retrieveMetrics and the call
to getMetrics (using cmdMetricsOptions.clusterId and promExpr) and change the
printing logic to conditionally print the banner based on MetricsFormat (skip
banner for JSON/CSV), ensuring stdout contains only the raw metrics payload for
machine consumption.

---

Nitpick comments:
In `@cmd/vault/vault_test.go`:
- Around line 37-45: The tests currently reconstruct loginArgs and stdout/stderr
themselves instead of exercising setupVaultToken; refactor so the production
code exposes the command-construction logic or an injectable exec factory and
update tests to call setupVaultToken so they assert the real behavior.
Specifically, extract the logic that builds the vault command into a helper like
buildVaultLoginArgs (or accept an execCommandFactory in setupVaultToken) and
have the tests call setupVaultToken (or use the injected factory) to capture the
constructed exec.Command and chosen output stream, then assert on that command
args and output selection instead of rebuilding loginArgs locally; apply the
same change to the other test block currently mirroring lines 123-137.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b158aa44-15a4-4a81-9ef8-28bee475559b

📥 Commits

Reviewing files that changed from the base of the PR and between 4e14680 and d85e415.

📒 Files selected for processing (10)
  • cmd/cmd.go
  • cmd/dynatrace/requests.go
  • cmd/requester/requester.go
  • cmd/rhobs/logsCmd.go
  • cmd/rhobs/metricsCmd.go
  • cmd/rhobs/requests.go
  • cmd/rhobs/rootCmd.go
  • cmd/rhobs/urlCmd.go
  • cmd/vault/vault.go
  • cmd/vault/vault_test.go
👮 Files not reviewed due to content moderation or server errors (2)
  • cmd/requester/requester.go
  • cmd/rhobs/requests.go

Comment thread cmd/rhobs/logsCmd.go Outdated
Comment thread cmd/rhobs/metricsCmd.go Outdated
@openshift-ci-robot
Copy link
Copy Markdown

openshift-ci-robot commented Apr 14, 2026

@Nikokolas3270: This pull request references SREP-347 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Sub-subcommands which are already ready for review:

  • cell: give the RHOBS cell for a given or for the current cluster
  • metrics: give the metrics for a given or for the current cluster - currently only the JSON output format is available... working on some more human friendly formats

Summary by CodeRabbit

  • New Features
  • Added RHOBS command with cell, logs, and metrics subcommands for observability queries
  • Metrics command supports multiple output formats: table, CSV, and JSON
  • Added --cluster-id flag to specify cluster for RHOBS operations

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (10)
cmd/dynatrace/requests.go (3)

105-118: Variable shadows package name.

The local variable requester shadows the imported package requester, which can lead to confusion and potential issues if you need to reference the package again within this scope.

♻️ Suggested rename
-	requester := requester.Requester{
+	req := requester.Requester{
 		Method: http.MethodPost,
 		Url:    dtURL + "platform/storage/query/v1/query:execute",
 		Data:   string(payloadJSON),
 		Headers: map[string]string{
 			"Content-Type":  "application/json",
 			"Authorization": "Bearer " + accessToken,
 		},
 		SuccessCode: http.StatusAccepted,
 	}

 	var resp string
 	for {
-		resp, err = requester.Send()
+		resp, err = req.Send()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/dynatrace/requests.go` around lines 105 - 118, The local variable named
requester shadows the imported requester package; rename the local variable
(e.g., reqClient or dtRequester) wherever it is declared and used (the Requester
struct literal and the variable passed to resp, err = requester.Send()) so you
no longer mask the package name; update all references (the Requester struct
instantiation, its Headers/Method/Url/Data/SuccessCode fields, and the
subsequent Send() call) to use the new variable name.

161-172: Same shadowing issue with requester variable.

Consider renaming to req for consistency across all usages in this file.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/dynatrace/requests.go` around lines 161 - 172, The variable named
requester shadows the package/type name requester.Requester; rename the local
variable to req for consistency and to avoid shadowing: change the instantiation
from "requester := requester.Requester{...}" to "req :=
requester.Requester{...}" and update all subsequent uses (e.g., calls to Send())
in this scope/loop to use req.Send() and any other req fields; ensure there are
no leftover references to the old variable name.

205-215: Same shadowing issue with requester variable.

Consider renaming to req for consistency across all usages in this file.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/dynatrace/requests.go` around lines 205 - 215, The variable requester is
shadowing the package name requester; rename the local variable requester (from
the instantiation requester := requester.Requester{...}) to req (or similar) and
update its usage (req.Send(), req.Method, req.Url, req.Headers, req.SuccessCode)
throughout this block so you don't conflict with the requester package
identifier; ensure imports and any other references in the file remain unchanged
and compile.
cmd/requester/requester.go (3)

62-70: Error response handling loses context when JSON unmarshal fails.

If the response body is not valid JSON (e.g., HTML error page, plain text), the unmarshal error is returned without the actual response content. This makes debugging difficult.

♻️ Proposed fix to preserve response content
 	if resp.StatusCode != rh.SuccessCode {
 		var respErr responseError
-		err = json.Unmarshal([]byte(body), &respErr)
+		err = json.Unmarshal(body, &respErr)
 		if err != nil {
-			return "", err
+			// If response is not JSON, include raw body in error
+			return "", fmt.Errorf("request failed: %s, body: %s", resp.Status, string(body))
 		}

 		return "", fmt.Errorf("request failed: %v %s", resp.Status, respErr)
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/requester/requester.go` around lines 62 - 70, When handling non-success
responses in requester.go (the block comparing resp.StatusCode to
rh.SuccessCode), the current json.Unmarshal into responseError can fail and you
return that unmarshal error without preserving the raw response body; instead
catch json.Unmarshal errors and include the original body (variable body) and
HTTP status in the returned error so callers see the raw response content when
parsing fails (e.g., return an error that contains resp.Status and the raw body)
while still attempting to use responseError when unmarshal succeeds.

29-32: Consider documenting the 600-second timeout.

A 10-minute timeout is unusually long. If this is intentional (e.g., for long-running Dynatrace queries), consider adding a comment explaining why, or making it configurable.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/requester/requester.go` around lines 29 - 32, The Send method currently
constructs an http.Client with a hardcoded 600-second timeout (in func Send on
type Requester), which is unusually long; either document why 10 minutes is
required with an inline comment above the client creation (e.g., because
Dynatrace queries may be long-running) or make the timeout configurable by
replacing the literal with a Requester field or configuration value (e.g.,
rh.Timeout or a command-line/env flag) so callers can control it and add a brief
comment describing the default behavior.

17-19: responseError field naming may be misleading.

The field is named Records but maps to JSON key "error". Consider renaming to Error for clarity.

♻️ Proposed rename
 type responseError struct {
-	Records json.RawMessage `json:"error"`
+	Error json.RawMessage `json:"error"`
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/requester/requester.go` around lines 17 - 19, The struct field
responseError.Records is misleading because it maps to the JSON key "error";
rename the field to Error (i.e., change responseError.Records ->
responseError.Error) and keep the json tag `json:"error"`, then update all
usages/readers of responseError that reference Records to use Error instead
(search for responseError, Records, and any unmarshalling sites to update
variable access).
cmd/vault/vault.go (2)

113-113: Returning pointer to map is unnecessary.

Maps in Go are already reference types. Returning *map[string]string adds indirection without benefit and requires callers to dereference (e.g., (*secretData)["key"] at lines 149, 153).

♻️ Return map directly
-func GetSecretFromVault(vaultRef VaultRef) (*map[string]string, error) {
+func GetSecretFromVault(vaultRef VaultRef) (map[string]string, error) {
 	err := setupVaultToken(vaultRef.Addr)
 	if err != nil {
-		return nil, err
+		return nil, err
 	}
 	// ... rest of function ...
-	return &formattedOutput.Data.Data, nil
+	return formattedOutput.Data.Data, nil
 }

And update GetCredsFromVault:

-	clientID, ok := (*secretData)["client_id"]
+	clientID, ok := secretData["client_id"]
 	// ...
-	clientSecret, ok := (*secretData)["client_secret"]
+	clientSecret, ok := secretData["client_secret"]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/vault/vault.go` at line 113, Change GetSecretFromVault to return a
map[string]string (not *map[string]string) and update any functions that call it
(e.g., GetCredsFromVault) to expect and pass a map directly; remove pointer
indirection and replace dereferences like (*secretData)["key"] with
secretData["key"]. Ensure signatures for GetSecretFromVault and
GetCredsFromVault are updated consistently and update all callers to stop
dereferencing the map pointer.

113-136: Redundant VAULT_ADDR environment variable setting.

setupVaultToken(vaultRef.Addr) already sets VAULT_ADDR at line 42. The second os.Setenv call at line 119 is unnecessary.

♻️ Remove redundant Setenv
 func GetSecretFromVault(vaultRef VaultRef) (*map[string]string, error) {
 	err := setupVaultToken(vaultRef.Addr)
 	if err != nil {
 		return nil, err
 	}

-	err = os.Setenv("VAULT_ADDR", vaultRef.Addr)
-	if err != nil {
-		return nil, fmt.Errorf("error setting environment variable: %v", err)
-	}
-
 	kvGetCommand := exec.Command("vault", "kv", "get", "-format=json", vaultRef.Path)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/vault/vault.go` around lines 113 - 136, GetSecretFromVault redundantly
calls os.Setenv("VAULT_ADDR", vaultRef.Addr) after setupVaultToken already sets
VAULT_ADDR; remove the redundant os.Setenv call and its error handling from
GetSecretFromVault so the function just relies on setupVaultToken(vaultRef.Addr)
to set the env var, keeping the rest of GetSecretFromVault (exec.Command,
json.Unmarshal, return) unchanged and ensure setupVaultToken remains responsible
for VAULT_ADDR.
cmd/rhobs/requests.go (2)

143-152: Same variable shadowing as in dynatrace/requests.go.

The local variable requester shadows the imported package name.

♻️ Suggested rename
-	requester := requester.Requester{
+	req := requester.Requester{
 		Method: http.MethodGet,
 		Url:    "https://" + rhobsCell + "/api/metrics/v1/hcp/api/v1/query?" + queryData,
 		// ...
 	}

-	resp, err := requester.Send()
+	resp, err := req.Send()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/rhobs/requests.go` around lines 143 - 152, The local variable named
requester shadows the imported package requester; rename the local variable (for
example to req, reqBuilder, or requesterCfg) wherever it's declared and used so
it no longer collides with the package name, e.g., change the variable assigned
from requester.Requester{ ... } and all subsequent references to that variable
(including fields Method, Url, Headers, SuccessCode) to the new name while
leaving the imported package identifier requester unchanged.

238-261: Use fmt.Printf instead of fmt.Print(fmt.Sprintf(...)).

Multiple lines use redundant fmt.Sprintf calls inside fmt.Print. This is flagged by staticcheck (S1038).

♻️ Proposed fix
 	// Header
-	fmt.Print("|")
+	fmt.Print("|")
 	for _, column := range columns {
-		fmt.Print(fmt.Sprintf(" %-*s |", column.width, column.name))
+		fmt.Printf(" %-*s |", column.width, column.name)
 	}
 	// ...
 	for _, metric := range metrics {
 		// ...
-		fmt.Print(fmt.Sprintf("| %*s | %*s |", columns[0].width, time, columns[1].width, value))
+		fmt.Printf("| %*s | %*s |", columns[0].width, time, columns[1].width, value)

 		for _, column := range columns[2:] {
 			labelValue := metric.Metric[column.name]
-			fmt.Print(fmt.Sprintf(" %*s |", column.width, labelValue))
+			fmt.Printf(" %*s |", column.width, labelValue)
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/rhobs/requests.go` around lines 238 - 261, Replace redundant
fmt.Print(fmt.Sprintf(...)) calls with direct fmt.Printf or fmt.Println where
appropriate: use fmt.Printf for formatted strings (e.g., the header row, each
metric row, and label columns) and keep fmt.Println for newline-only prints;
update the blocks that render the table (references: variables columns, metrics,
separatorLine and the loop over metric.Value and columns[2:]) to call fmt.Printf
with the same format verbs and arguments instead of wrapping fmt.Sprintf inside
fmt.Print.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cmd/rhobs/requests.go`:
- Around line 84-87: The call to hiveClient.List is discarding its returned
error; capture and check the error from hiveClient.List(...) (e.g., assign to
err using err = hiveClient.List(...) or err := hiveClient.List(...) as
appropriate) and handle it the same way as the previous error check so that
failures listing clusterDeployments are detected and returned (referencing
hiveClient.List, clusterDeployments, clusterSelector and the existing fmt.Errorf
return).
- Around line 264-287: printMetricsAsCsv currently ignores errors from
csv.NewWriter's writer.Write and writer.Flush which can mask I/O failures;
update printMetricsAsCsv to return an error, check and propagate the error
returned by each writer.Write call (for header and each row) and the final
writer.Flush call, and update the caller at the previous call site accordingly;
locate the function printMetricsAsCsv and the loops that call writer.Write and
the final writer.Flush to add proper error handling and return propagation (also
ensure csv.NewWriter and getTableColumns usage remains unchanged).

---

Nitpick comments:
In `@cmd/dynatrace/requests.go`:
- Around line 105-118: The local variable named requester shadows the imported
requester package; rename the local variable (e.g., reqClient or dtRequester)
wherever it is declared and used (the Requester struct literal and the variable
passed to resp, err = requester.Send()) so you no longer mask the package name;
update all references (the Requester struct instantiation, its
Headers/Method/Url/Data/SuccessCode fields, and the subsequent Send() call) to
use the new variable name.
- Around line 161-172: The variable named requester shadows the package/type
name requester.Requester; rename the local variable to req for consistency and
to avoid shadowing: change the instantiation from "requester :=
requester.Requester{...}" to "req := requester.Requester{...}" and update all
subsequent uses (e.g., calls to Send()) in this scope/loop to use req.Send() and
any other req fields; ensure there are no leftover references to the old
variable name.
- Around line 205-215: The variable requester is shadowing the package name
requester; rename the local variable requester (from the instantiation requester
:= requester.Requester{...}) to req (or similar) and update its usage
(req.Send(), req.Method, req.Url, req.Headers, req.SuccessCode) throughout this
block so you don't conflict with the requester package identifier; ensure
imports and any other references in the file remain unchanged and compile.

In `@cmd/requester/requester.go`:
- Around line 62-70: When handling non-success responses in requester.go (the
block comparing resp.StatusCode to rh.SuccessCode), the current json.Unmarshal
into responseError can fail and you return that unmarshal error without
preserving the raw response body; instead catch json.Unmarshal errors and
include the original body (variable body) and HTTP status in the returned error
so callers see the raw response content when parsing fails (e.g., return an
error that contains resp.Status and the raw body) while still attempting to use
responseError when unmarshal succeeds.
- Around line 29-32: The Send method currently constructs an http.Client with a
hardcoded 600-second timeout (in func Send on type Requester), which is
unusually long; either document why 10 minutes is required with an inline
comment above the client creation (e.g., because Dynatrace queries may be
long-running) or make the timeout configurable by replacing the literal with a
Requester field or configuration value (e.g., rh.Timeout or a command-line/env
flag) so callers can control it and add a brief comment describing the default
behavior.
- Around line 17-19: The struct field responseError.Records is misleading
because it maps to the JSON key "error"; rename the field to Error (i.e., change
responseError.Records -> responseError.Error) and keep the json tag
`json:"error"`, then update all usages/readers of responseError that reference
Records to use Error instead (search for responseError, Records, and any
unmarshalling sites to update variable access).

In `@cmd/rhobs/requests.go`:
- Around line 143-152: The local variable named requester shadows the imported
package requester; rename the local variable (for example to req, reqBuilder, or
requesterCfg) wherever it's declared and used so it no longer collides with the
package name, e.g., change the variable assigned from requester.Requester{ ... }
and all subsequent references to that variable (including fields Method, Url,
Headers, SuccessCode) to the new name while leaving the imported package
identifier requester unchanged.
- Around line 238-261: Replace redundant fmt.Print(fmt.Sprintf(...)) calls with
direct fmt.Printf or fmt.Println where appropriate: use fmt.Printf for formatted
strings (e.g., the header row, each metric row, and label columns) and keep
fmt.Println for newline-only prints; update the blocks that render the table
(references: variables columns, metrics, separatorLine and the loop over
metric.Value and columns[2:]) to call fmt.Printf with the same format verbs and
arguments instead of wrapping fmt.Sprintf inside fmt.Print.

In `@cmd/vault/vault.go`:
- Line 113: Change GetSecretFromVault to return a map[string]string (not
*map[string]string) and update any functions that call it (e.g.,
GetCredsFromVault) to expect and pass a map directly; remove pointer indirection
and replace dereferences like (*secretData)["key"] with secretData["key"].
Ensure signatures for GetSecretFromVault and GetCredsFromVault are updated
consistently and update all callers to stop dereferencing the map pointer.
- Around line 113-136: GetSecretFromVault redundantly calls
os.Setenv("VAULT_ADDR", vaultRef.Addr) after setupVaultToken already sets
VAULT_ADDR; remove the redundant os.Setenv call and its error handling from
GetSecretFromVault so the function just relies on setupVaultToken(vaultRef.Addr)
to set the env var, keeping the rest of GetSecretFromVault (exec.Command,
json.Unmarshal, return) unchanged and ensure setupVaultToken remains responsible
for VAULT_ADDR.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 1da9481d-674c-4905-838e-f5d227bd7f37

📥 Commits

Reviewing files that changed from the base of the PR and between d85e415 and 1280a97.

📒 Files selected for processing (10)
  • cmd/cmd.go
  • cmd/dynatrace/requests.go
  • cmd/requester/requester.go
  • cmd/rhobs/logsCmd.go
  • cmd/rhobs/metricsCmd.go
  • cmd/rhobs/requests.go
  • cmd/rhobs/rootCmd.go
  • cmd/rhobs/urlCmd.go
  • cmd/vault/vault.go
  • cmd/vault/vault_test.go
✅ Files skipped from review due to trivial changes (2)
  • cmd/vault/vault_test.go
  • cmd/cmd.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • cmd/rhobs/urlCmd.go
  • cmd/rhobs/metricsCmd.go
  • cmd/rhobs/logsCmd.go
  • cmd/rhobs/rootCmd.go

Comment thread cmd/rhobs/requests.go Outdated
Comment thread cmd/rhobs/requests.go Outdated
Comment thread pkg/utils/vault.go
Comment thread cmd/rhobs/logsCmd.go Outdated

func newCmdLogs() *cobra.Command {
cmd := &cobra.Command{
Use: "logs --cluster-id <cluster-identifier>",
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would there be an option to add simple filtering? something like --since=timestamp --untill timestamp --level ERROR

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quite some filtering has been added. I will consider your suggestion for being able to select the log level... even if that not normalized...

Copy link
Copy Markdown
Contributor Author

@Nikokolas3270 Nikokolas3270 May 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@petrkotas , please see the doc (docs/osdctl_rhobs_logs.md file) for the command: it is now possible to:

  • Give a time range for the logs to retrieve
  • Grep for the logs to retain on the server side
  • Select the log levels to retain

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, this is what I was looking for.

@openshift-ci-robot
Copy link
Copy Markdown

openshift-ci-robot commented Apr 29, 2026

@Nikokolas3270: This pull request references SREP-347 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Sub-subcommands which are already ready for review:

  • cell: give the RHOBS cell for a given or for the current cluster
  • metrics: give the metrics for a given or for the current cluster - currently only the JSON output format is available... working on some more human friendly formats

Summary by CodeRabbit

Release Notes

  • New Features

  • Added RHOBS (Red Hat Observability Services) integration with three new commands: rhobs logs for querying logs with Loki expressions, namespace/pod/label filters, and multiple output formats; rhobs metrics for querying Prometheus metrics; and rhobs cell to display current cluster cell information.

  • Chores

  • Refactored internal flag handling and updated dependencies.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

♻️ Duplicate comments (2)
cmd/rhobs/requests.go (2)

151-154: ⚠️ Potential issue | 🔴 Critical

Capture the hiveClient.List error.

The return value from hiveClient.List(...) is discarded, so the err checked below still refers to the earlier selector conversion. A failed list is treated as success and the code continues with an empty clusterDeployments.

Proposed fix
-	hiveClient.List(context.TODO(), &clusterDeployments, &client.ListOptions{LabelSelector: clusterSelector})
-	if err != nil {
+	err = hiveClient.List(context.TODO(), &clusterDeployments, &client.ListOptions{LabelSelector: clusterSelector})
+	if err != nil {
 		return nil, fmt.Errorf("failed to list cluster deployments for cluster '%s': %v", managementCluster.ID(), err)
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/rhobs/requests.go` around lines 151 - 154, The call to hiveClient.List
currently discards its error return so the subsequent err check is validating
the wrong error; assign the result to err (e.g., err := hiveClient.List(...) or
err = hiveClient.List(...)) when calling hiveClient.List with
&clusterDeployments and &client.ListOptions{LabelSelector: clusterSelector},
then keep the existing error branch that returns fmt.Errorf("failed to list
cluster deployments for cluster '%s': %v", managementCluster.ID(), err) to
ensure failures from hiveClient.List are captured and propagated.

296-319: ⚠️ Potential issue | 🟡 Minor

Check CSV write errors in metrics output.

Both writer.Write calls and the final flush error are ignored here. If stdout is broken or a downstream pipe closes early, the command still reports success.

Proposed fix
-func printMetricsAsCsv(results *[]instantMetricResult) {
+func printMetricsAsCsv(results *[]instantMetricResult) error {
 	columns := getTableColumns(results)
 
 	writer := csv.NewWriter(os.Stdout)
@@
-	writer.Write(header)
+	if err := writer.Write(header); err != nil {
+		return fmt.Errorf("failed to write CSV header: %w", err)
+	}
@@
-		writer.Write(row)
+		if err := writer.Write(row); err != nil {
+			return fmt.Errorf("failed to write CSV row: %w", err)
+		}
 	}
 
 	writer.Flush()
+	return writer.Error()
 }

PrintMetrics should then return that error from the CSV branch.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/rhobs/requests.go` around lines 296 - 319, The CSV writer calls in
printMetricsAsCsv ignore errors — change printMetricsAsCsv to return an error
(rename it or adjust signature) and check the error returned from each
writer.Write(...) and from writer.Flush(); on any write/flush error return it to
the caller (so the PrintMetrics caller can propagate the failure). Update the
caller (PrintMetrics) to handle the returned error from printMetricsAsCsv and
return it up the call chain; ensure you reference the existing symbols
printMetricsAsCsv, PrintMetrics, writer.Write and writer.Flush when making the
changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cmd/cmd.go`:
- Around line 86-93: The root command is missing the --no-aws-proxy flag
registration which causes PersistentPreRun to fail when it reads
aws.NoProxyFlag; call globalOpts.AddNoAwsProxyFlag on rootCmd (in addition to
adding it to commands via addToRootCmdWithOtherGlobalOpts) so the flag is
registered for direct children like promote, dynatrace, rhobs, version, and
upgrade; update the initialization where rootCmd is set up to invoke
globalOpts.AddNoAwsProxyFlag(rootCmd) so PersistentPreRun can always access
aws.NoProxyFlag.
- Around line 117-118: The RHOBS command tree is missing kube-related flags
because rhobs.NewCmdRhobs() is added directly to rootCmd without calling
globalOpts.AddKubeFlags; fix by creating the command in a local variable (cmd :=
rhobs.NewCmdRhobs()), call globalOpts.AddKubeFlags(cmd) to attach
--kubeconfig/--context flags to that command tree, then add it to rootCmd with
rootCmd.AddCommand(cmd).

In `@cmd/requester/requester.go`:
- Around line 62-69: The Send error path in requester.go currently tries to
unmarshal the response body into responseError and returns JSON unmarshal errors
or formats the struct (causing %!s) which loses the original payload; update the
Send function to preserve the raw response body and produce a clear error by:
attempt json.Unmarshal([]byte(body), &respErr) but if unmarshal fails use the
raw body string in the returned error, and if unmarshal succeeds use
respErr.Error (or the struct's message field) instead of printing the struct;
ensure the final fmt.Errorf includes resp.Status and the chosen message (raw
body or respErr.Error) so callers always get the HTTP status plus the original
payload or parsed error.

In `@cmd/rhobs/logsCmd.go`:
- Around line 130-136: The JSON output branch incorrectly checks for a
non-existent "no-timestamp" flag; update the condition in the LogsFormatJson
branch to validate the actual "timestamps" flag instead (i.e., if
cmd.Flags().Changed("timestamps") return an error that --timestamps cannot be
used with json output), keeping the existing check for "fields" intact; modify
the check near the outputFormat == LogsFormatJson block in cmd/rhobs/logsCmd.go
so the JSON printer rejects --timestamps rather than silently ignoring it.
- Around line 80-88: In the selection.In/selection.NotIn branch where lokiExpr
is built (variables: lokiExpr, values, req.Key(), lokiOp), escape each selector
value before inserting into the alternation so literal label values like "4.14"
or "prod+canary" are not interpreted as regex; use Go's regexp.QuoteMeta on each
val when appending (and keep the existing parentheses/pipe alternation and
surrounding quotes intact) so the generated Loki regex matches the literal label
values.

In `@cmd/rhobs/requests.go`:
- Around line 507-535: Update csvLogsPrinter.PrintHeader,
csvLogsPrinter.PrintResult, and csvLogsPrinter.PrintTrailer to return error and
propagate any errors from p.writer.Write and p.writer.Flush instead of ignoring
them: have PrintHeader and PrintResult check the returned error from
p.writer.Write(header/row) and return it, and have PrintTrailer return the error
from p.writer.Flush(). Then update the caller path (the PrintLogs flow that
invokes these methods) to handle and return those errors so failures bubble up
to the command (ensure you update any interfaces/types that declare these
methods as needed). Use the existing symbols csvLogsPrinter.PrintHeader,
csvLogsPrinter.PrintResult, csvLogsPrinter.PrintTrailer and the top-level
PrintLogs caller to locate and change the signatures and error handling.
- Around line 664-691: The loop currently skips printing any result with ts ==
edgeTimeStamp which drops boundary-timestamp logs; always print each result
(remove the ts != edgeTimeStamp guard and always call logsPrinter.PrintResult
and decrement logsCount) so boundary entries are emitted, and keep the existing
cursor advancement logic around flattenedResults, getTimeStamp(), edgeTimeStamp,
startTimeStamp, endTimeStamp and isGoingForward but ensure when you need to skip
already-seen timestamps you advance the cursor by one (edgeTimeStamp+1 or
edgeTimeStamp-1) in the branch where flattenedResults[0].getTimeStamp() ==
edgeTimeStamp so you avoid re-fetching the same timestamp on the next page
without omitting the current batch’s boundary logs.

---

Duplicate comments:
In `@cmd/rhobs/requests.go`:
- Around line 151-154: The call to hiveClient.List currently discards its error
return so the subsequent err check is validating the wrong error; assign the
result to err (e.g., err := hiveClient.List(...) or err = hiveClient.List(...))
when calling hiveClient.List with &clusterDeployments and
&client.ListOptions{LabelSelector: clusterSelector}, then keep the existing
error branch that returns fmt.Errorf("failed to list cluster deployments for
cluster '%s': %v", managementCluster.ID(), err) to ensure failures from
hiveClient.List are captured and propagated.
- Around line 296-319: The CSV writer calls in printMetricsAsCsv ignore errors —
change printMetricsAsCsv to return an error (rename it or adjust signature) and
check the error returned from each writer.Write(...) and from writer.Flush(); on
any write/flush error return it to the caller (so the PrintMetrics caller can
propagate the failure). Update the caller (PrintMetrics) to handle the returned
error from printMetricsAsCsv and return it up the call chain; ensure you
reference the existing symbols printMetricsAsCsv, PrintMetrics, writer.Write and
writer.Flush when making the changes.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 5199383c-46c9-41cd-bc61-8ad15add31b2

📥 Commits

Reviewing files that changed from the base of the PR and between 1280a97 and 297d68a.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (12)
  • cmd/cmd.go
  • cmd/dynatrace/requests.go
  • cmd/requester/requester.go
  • cmd/rhobs/cellCmd.go
  • cmd/rhobs/logsCmd.go
  • cmd/rhobs/metricsCmd.go
  • cmd/rhobs/requests.go
  • cmd/rhobs/rootCmd.go
  • cmd/vault/vault.go
  • cmd/vault/vault_test.go
  • go.mod
  • internal/utils/globalflags/globalflags.go
✅ Files skipped from review due to trivial changes (1)
  • cmd/vault/vault_test.go
🚧 Files skipped from review as they are similar to previous changes (3)
  • cmd/rhobs/rootCmd.go
  • cmd/rhobs/metricsCmd.go
  • cmd/dynatrace/requests.go

Comment thread cmd/cmd.go
Comment thread cmd/cmd.go
Comment thread cmd/requester/requester.go Outdated
Comment thread cmd/rhobs/logsCmd.go
Comment thread cmd/rhobs/logsCmd.go
Comment thread cmd/rhobs/requests.go
Comment thread cmd/rhobs/requests.go
@Nikokolas3270 Nikokolas3270 marked this pull request as ready for review April 30, 2026 08:22
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 30, 2026
@openshift-ci openshift-ci Bot requested review from dustman9000 and gvnnn April 30, 2026 08:23
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cmd/dynatrace/requests.go (1)

171-184: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add poll delay/backoff in RUNNING state.

The loop retries immediately on RUNNING, which can flood Dynatrace and burn CPU. Add at least a small sleep (or backoff) before the next poll.

Proposed fix
 import (
 	"encoding/json"
 	"fmt"
 	"io"
 	"net/http"
 	"net/url"
+	"time"
@@
 	for {
 		resp, err := requester.Send()
 		if err != nil {
 			return "", err
 		}
@@
 		if dtPollRes.State == "RUNNING" {
+			time.Sleep(1 * time.Second)
 			continue
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/dynatrace/requests.go` around lines 171 - 184, The polling loop in
cmd/dynatrace/requests.go currently busy-spins when dtPollRes.State == "RUNNING"
by immediately continuing; modify the loop around requester.Send() / dtPollRes
handling to wait before retrying (e.g., call time.Sleep with a small fixed delay
like 500ms or implement simple exponential backoff increasing a retryDelay
variable) and reset/adjust the delay when the state changes; ensure you import
the time package and reference the existing requester.Send() call and
dtPollRes.State checks so the sleep/backoff is applied right before continuing
rather than immediately looping.
♻️ Duplicate comments (3)
cmd/rhobs/requests.go (3)

261-261: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Return metrics printer failures to the caller.

printMetricsAsCsv now detects Write failures, but it only logs them and PrintMetrics still returns nil. That means osdctl rhobs metrics -o csv | head can exit successfully after partial output. Make the printer return error, check writer.Error() after Flush(), and bubble the failure through createMetricsPrinter/PrintMetrics.

Also applies to: 298-329, 428-428

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/rhobs/requests.go` at line 261, Change the metricsPrinter type to return
error (e.g., type metricsPrinter func(*[]instantMetricResult) error) and update
implementations such as printMetricsAsCsv to return any write/flush failures:
call csvWriter.Flush(), then check csvWriter.Error() and return that error
instead of only logging; propagate that error through createMetricsPrinter and
have PrintMetrics return it to the caller so callers see partial-write failures.
Ensure all places that construct or call metricsPrinter (createMetricsPrinter,
PrintMetrics, and the csv/text printer implementations) are updated to handle
and propagate the returned error.

480-484: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Propagate CSV log write failures back through PrintLogs.

csvLogsPrinter still drops Write/Flush errors, and the logsPrinter interface gives PrintLogs no way to return them. Broken pipes will therefore look like success with truncated CSV output.

Also applies to: 519-547, 601-603, 708-708

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/rhobs/requests.go` around lines 480 - 484, The logsPrinter interface and
its implementations currently swallow CSV Write/Flush errors causing PrintLogs
to report success on broken pipes; change logsPrinter method signatures
(PrintHeader, PrintResult, PrintTrailer) to return error, update csvLogsPrinter
methods to return any underlying io.Writer Write/Flush errors (including
csv.Writer.Flush and any stored csv.Writer.Error), and propagate those errors up
by changing PrintLogs to accept/handle the returned errors and return a non-nil
error to callers; update all callers of PrintLogs and any other logsPrinter
implementations accordingly so Write/Flush failures are not dropped.

676-704: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

The pagination cursor still skips page-boundary logs.

After anchoring the next window to edgeTimeStamp, Lines 698-704 suppress every result with that timestamp. That drops boundary entries instead of only avoiding re-fetches. The current batch should be printed in full, and only the next cursor move should skip past an already-seen timestamp.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/rhobs/requests.go` around lines 676 - 704, The loop currently suppresses
results whose getTimeStamp() equals edgeTimeStamp, dropping page-boundary logs;
instead, print every result in flattenedResults (call logsPrinter.PrintResult
for each and decrement logsCount) regardless of equality to edgeTimeStamp, and
only use edgeTimeStamp when adjusting the cursor (startTimeStamp/endTimeStamp)
after processing the batch—advance the cursor past edgeTimeStamp by +1 when
isGoingForward or -1 when going backward to avoid refetching the same timestamp.
Ensure references: flattenedResults, edgeTimeStamp, getTimeStamp(),
logsPrinter.PrintResult, logsCount, startTimeStamp, endTimeStamp, isGoingForward
are updated accordingly.
🧹 Nitpick comments (1)
main.go (1)

22-22: Audit the hook ordering before enabling global Cobra hook traversal, or scope the change more narrowly.

Line 22 sets a Cobra-global flag that affects hook execution order across the entire CLI. Verification found only two persistent hook definitions in the codebase—cmd/cmd.go:65 (root PersistentPreRun) and cmd/rhobs/rootCmd.go:19 (rhobs PersistentPreRunE)—which limits the actual impact scope. However, because this is a process-wide setting, confirm that the hook ordering change is intentional for both the root command and RHOBS before keeping it globally enabled, or consider calling shared setup explicitly from RHOBS instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@main.go` at line 22, The global flag cobra.EnableTraverseRunHooks is being
enabled process-wide; instead either restrict its effect or make the behavior
explicit: audit the two PersistentPreRun hooks (the root command's
PersistentPreRun and rhobs' PersistentPreRunE) and either 1) remove the global
cobra.EnableTraverseRunHooks = true and call the shared setup explicitly from
the rhobs root (invoke the root setup function from the rhobs init or its RunE),
or 2) if the traversal is intentional, add a clear comment and a
unit/integration test that validates the execution order for the root
PersistentPreRun and rhobs PersistentPreRunE to document the decision; update
references to cobra.EnableTraverseRunHooks accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cmd/rhobs/logsCmd.go`:
- Around line 159-167: The flag help claims --include-events is exclusive with
--query but the code only marks namespace/selector/container mutually exclusive
with "query"; add the mutual exclusion for the include-events flag so the CLI
enforces it. Specifically, call cmd.MarkFlagsMutuallyExclusive("query",
"include-events") (or otherwise check lokiExpr vs isIncludingEvents in the
command Run and return an error) near the other MarkFlagsMutuallyExclusive calls
that reference lokiExpr, so the flags lokiExpr (query) and isIncludingEvents
(include-events) cannot be used together.
- Around line 75-80: Normalize Kubernetes label keys before embedding them into
the Loki field name by replacing illegal characters ('.', '/', '-') with
underscores: create or use a helper (e.g., normalizeLabelKey) that calls
strings.NewReplacer(".", "_", "/", "_", "-", "_").Replace(req.Key()) and use
that normalizedKey instead of req.Key() when building lokiExpr in all
occurrences (the fmt.Sprintf calls that produce `k8s_pod_label_%s ...`),
including the branches handling selection.Exists/DoesNotExist and the other two
cases shown; ensure you use the same helper for consistency.

In `@cmd/rhobs/requests.go`:
- Around line 497-499: Replace the RFC3339 timestamp formatting with nanosecond
precision so printed timestamps preserve ordering: when p.isPrintingTimeValue
writes result.getTime().Format(...), change the formatter from time.RFC3339 to
time.RFC3339Nano (apply the same change at the other occurrence around line
535); update both the text printer and CSV printer paths that call
result.getTime().Format(...) so all --ts outputs use time.RFC3339Nano.

In `@cmd/vault/vault.go`:
- Around line 24-35: GetVaultRef currently only checks viper.IsSet for
VaultAddrKey and the provided vaultPathKey but allows empty strings; update
GetVaultRef to also validate the retrieved values are non-empty (vaultAddr :=
viper.GetString(VaultAddrKey) and vaultPath := viper.GetString(vaultPathKey))
and return an error if either is "" so VaultRef{Addr: vaultAddr, Path:
vaultPath} is never constructed with empty fields; reference the VaultAddrKey,
vaultPathKey, GetVaultRef function and VaultRef type in the change.

---

Outside diff comments:
In `@cmd/dynatrace/requests.go`:
- Around line 171-184: The polling loop in cmd/dynatrace/requests.go currently
busy-spins when dtPollRes.State == "RUNNING" by immediately continuing; modify
the loop around requester.Send() / dtPollRes handling to wait before retrying
(e.g., call time.Sleep with a small fixed delay like 500ms or implement simple
exponential backoff increasing a retryDelay variable) and reset/adjust the delay
when the state changes; ensure you import the time package and reference the
existing requester.Send() call and dtPollRes.State checks so the sleep/backoff
is applied right before continuing rather than immediately looping.

---

Duplicate comments:
In `@cmd/rhobs/requests.go`:
- Line 261: Change the metricsPrinter type to return error (e.g., type
metricsPrinter func(*[]instantMetricResult) error) and update implementations
such as printMetricsAsCsv to return any write/flush failures: call
csvWriter.Flush(), then check csvWriter.Error() and return that error instead of
only logging; propagate that error through createMetricsPrinter and have
PrintMetrics return it to the caller so callers see partial-write failures.
Ensure all places that construct or call metricsPrinter (createMetricsPrinter,
PrintMetrics, and the csv/text printer implementations) are updated to handle
and propagate the returned error.
- Around line 480-484: The logsPrinter interface and its implementations
currently swallow CSV Write/Flush errors causing PrintLogs to report success on
broken pipes; change logsPrinter method signatures (PrintHeader, PrintResult,
PrintTrailer) to return error, update csvLogsPrinter methods to return any
underlying io.Writer Write/Flush errors (including csv.Writer.Flush and any
stored csv.Writer.Error), and propagate those errors up by changing PrintLogs to
accept/handle the returned errors and return a non-nil error to callers; update
all callers of PrintLogs and any other logsPrinter implementations accordingly
so Write/Flush failures are not dropped.
- Around line 676-704: The loop currently suppresses results whose
getTimeStamp() equals edgeTimeStamp, dropping page-boundary logs; instead, print
every result in flattenedResults (call logsPrinter.PrintResult for each and
decrement logsCount) regardless of equality to edgeTimeStamp, and only use
edgeTimeStamp when adjusting the cursor (startTimeStamp/endTimeStamp) after
processing the batch—advance the cursor past edgeTimeStamp by +1 when
isGoingForward or -1 when going backward to avoid refetching the same timestamp.
Ensure references: flattenedResults, edgeTimeStamp, getTimeStamp(),
logsPrinter.PrintResult, logsCount, startTimeStamp, endTimeStamp, isGoingForward
are updated accordingly.

---

Nitpick comments:
In `@main.go`:
- Line 22: The global flag cobra.EnableTraverseRunHooks is being enabled
process-wide; instead either restrict its effect or make the behavior explicit:
audit the two PersistentPreRun hooks (the root command's PersistentPreRun and
rhobs' PersistentPreRunE) and either 1) remove the global
cobra.EnableTraverseRunHooks = true and call the shared setup explicitly from
the rhobs root (invoke the root setup function from the rhobs init or its RunE),
or 2) if the traversal is intentional, add a clear comment and a
unit/integration test that validates the execution order for the root
PersistentPreRun and rhobs PersistentPreRunE to document the decision; update
references to cobra.EnableTraverseRunHooks accordingly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 902685fc-ed42-477e-90b1-8594fb5236f4

📥 Commits

Reviewing files that changed from the base of the PR and between 297d68a and ce11fda.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (14)
  • cmd/cmd.go
  • cmd/dynatrace/requests.go
  • cmd/requester/requester.go
  • cmd/rhobs/cellCmd.go
  • cmd/rhobs/logsCmd.go
  • cmd/rhobs/metricsCmd.go
  • cmd/rhobs/requests.go
  • cmd/rhobs/rootCmd.go
  • cmd/vault/vault.go
  • cmd/vault/vault_test.go
  • go.mod
  • internal/utils/globalflags/globalflags.go
  • main.go
  • pkg/utils/utils.go
✅ Files skipped from review due to trivial changes (1)
  • pkg/utils/utils.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • cmd/vault/vault_test.go
  • cmd/rhobs/cellCmd.go
  • cmd/cmd.go
  • internal/utils/globalflags/globalflags.go

Comment thread cmd/rhobs/logsCmd.go Outdated
Comment thread cmd/rhobs/logsCmd.go Outdated
Comment thread cmd/rhobs/requests.go
Comment thread pkg/utils/vault.go
@openshift-ci openshift-ci Bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 30, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cmd/dynatrace/hcpGatherLogsCmd.go (1)

134-199: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use the method parameters consistently in query construction.

dumpEvents and dumpPodLogs accept since/tail/sortOrder but currently ignore them and read from g.* instead. That makes the function contract misleading and brittle for future callers.

Proposed fix
- eventQuery, err := getEventQuery(d.Name, targetNS, g.Since, g.Tail, g.SortOrder, managementClusterName)
+ eventQuery, err := getEventQuery(d.Name, targetNS, since, tail, sortOrder, managementClusterName)

- podLogsQuery, err := getPodQuery(p.Name, targetNS, g.Since, g.Tail, g.SortOrder, managementClusterName)
+ podLogsQuery, err := getPodQuery(p.Name, targetNS, since, tail, sortOrder, managementClusterName)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/dynatrace/hcpGatherLogsCmd.go` around lines 134 - 199, The dumpEvents and
dumpPodLogs functions ignore their local parameters since, tail, and sortOrder
and instead use g.Since/g.Tail/g.SortOrder when constructing queries; update the
calls to getEventQuery and getPodQuery to pass the function parameters (since,
tail, sortOrder) rather than g.Since/g.Tail/g.SortOrder so the functions honor
their contract and callers can supply custom values; ensure any other query
construction in dumpEvents/dumpPodLogs similarly uses the local parameters.
♻️ Duplicate comments (4)
cmd/rhobs/logsCmd.go (2)

164-167: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

--include-events is not enforced as mutually exclusive with --query.

The help text on line 159 states --query is "exclusive with --namespace, --selector, --container and --include-events flags", but only the first three are enforced. Running osdctl rhobs logs --query ... --include-events will silently ignore --include-events.

🔧 Proposed fix
 cmd.MarkFlagsMutuallyExclusive("query", "namespace")
 cmd.MarkFlagsMutuallyExclusive("query", "selector")
 cmd.MarkFlagsMutuallyExclusive("query", "container")
+cmd.MarkFlagsMutuallyExclusive("query", "include-events")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/rhobs/logsCmd.go` around lines 164 - 167, The help claims --query is
exclusive with --include-events but the code only marks "namespace", "selector",
and "container" mutually exclusive; update the flag mutual-exclusion setup in
the command initialization by adding the include-events flag to the
MarkFlagsMutuallyExclusive calls (i.e., call
cmd.MarkFlagsMutuallyExclusive("query", "include-events") alongside the existing
calls in logs command setup) so that attempting to use --query and
--include-events together is rejected.

75-92: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Normalize label selector keys before using them as k8s_pod_label_* field names.

Kubernetes label keys containing special characters (., /, -) must be normalized by replacing them with underscores for RHOBS/Loki field names. For example, app.kubernetes.io/name=myapp would generate k8s_pod_label_app.kubernetes.io/name, but the backend expects k8s_pod_label_app_kubernetes_io_name.

This affects lines 75, 80, and 82 where req.Key() is used directly.

🔧 Proposed fix
+import "strings"
+
+func normalizeLabelKey(key string) string {
+	replacer := strings.NewReplacer(".", "_", "/", "_", "-", "_")
+	return replacer.Replace(key)
+}
+
 // Then in the switch cases:
 case selection.Equals, selection.DoubleEquals, selection.NotEquals:
 	if len(values) != 1 {
 		return fmt.Errorf("internal error: label selector operator '%s' is expected to have only one value in '%s'", op, labelSelectorStr)
 	}
-	lokiExpr += fmt.Sprintf(` | k8s_pod_label_%s %s "%s"`, req.Key(), lokiOp, values[0])
+	lokiExpr += fmt.Sprintf(` | k8s_pod_label_%s %s "%s"`, normalizeLabelKey(req.Key()), lokiOp, values[0])
 case selection.Exists, selection.DoesNotExist:
 	if len(values) != 0 {
 		return fmt.Errorf("internal error: label selector operator '%s' is not expected to have any value in '%s'", op, labelSelectorStr)
 	}
-	lokiExpr += fmt.Sprintf(` | k8s_pod_label_%s %s ".*"`, req.Key(), lokiOp)
+	lokiExpr += fmt.Sprintf(` | k8s_pod_label_%s %s ".*"`, normalizeLabelKey(req.Key()), lokiOp)
 case selection.In, selection.NotIn:
-	lokiExpr += fmt.Sprintf(` | k8s_pod_label_%s %s "`, req.Key(), lokiOp)
+	lokiExpr += fmt.Sprintf(` | k8s_pod_label_%s %s "`, normalizeLabelKey(req.Key()), lokiOp)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/rhobs/logsCmd.go` around lines 75 - 92, Normalize the label key before
embedding it into k8s_pod_label_%s; replace characters like '.', '/', '-' with
'_' (e.g. via strings.NewReplacer(".", "_", "/", "_", "-", "_") or a small
helper normalizeLabelKey(key string) string) and use the normalized value
instead of req.Key() when building lokiExpr in the switch handling
req.Operator() (the cases that call fmt.Sprintf(` | k8s_pod_label_%s ...`,
req.Key()) — including the Exists/DoesNotExist, equality and set operators).
Ensure you apply the normalized key in every fmt.Sprintf call that currently
uses req.Key() so field names become k8s_pod_label_<normalized_key>.
pkg/utils/vault.go (1)

24-36: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Validate that config values are non-empty, not just present.

GetVaultRef currently only checks viper.IsSet() but accepts empty string values. This will cause less actionable errors later when the vault CLI is invoked with empty arguments.

🔧 Proposed fix
+import "strings"
+
 func GetVaultRef(vaultPathKey string) (VaultRef, error) {
 	if !viper.IsSet(VaultAddrKey) {
 		return VaultRef{}, fmt.Errorf("key '%s' is not set in config file", VaultAddrKey)
 	}
 	vaultAddr := viper.GetString(VaultAddrKey)
+	if strings.TrimSpace(vaultAddr) == "" {
+		return VaultRef{}, fmt.Errorf("key '%s' is empty in config file", VaultAddrKey)
+	}
 
 	if !viper.IsSet(vaultPathKey) {
 		return VaultRef{}, fmt.Errorf("key '%s' is not set in config file", vaultPathKey)
 	}
 	vaultPath := viper.GetString(vaultPathKey)
+	if strings.TrimSpace(vaultPath) == "" {
+		return VaultRef{}, fmt.Errorf("key '%s' is empty in config file", vaultPathKey)
+	}
 
 	return VaultRef{Addr: vaultAddr, Path: vaultPath}, nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/utils/vault.go` around lines 24 - 36, GetVaultRef currently only uses
viper.IsSet but may return empty vaultAddr or vaultPath; update GetVaultRef to
validate that viper.GetString(VaultAddrKey) and viper.GetString(vaultPathKey)
return non-empty strings and return an error if either is empty. Specifically,
after retrieving vaultAddr and vaultPath in GetVaultRef, check if vaultAddr ==
"" or vaultPath == "" and return fmt.Errorf with the corresponding key name to
fail fast before any Vault CLI invocation.
cmd/rhobs/requests.go (1)

519-530: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Check error return value in csvLogsPrinter.PrintHeader.

PrintResult checks the Write error (line 542-544), but PrintHeader does not (line 529). For consistency and robustness, handle the error similarly.

🔧 Proposed fix
 func (p *csvLogsPrinter) PrintHeader() {
 	header := []string{}
 	if p.isPrintingTimeValue {
 		header = append(header, "TIME")
 	}
-	for _, fieldName := range p.fieldNames {
-		header = append(header, fieldName)
-	}
+	header = append(header, p.fieldNames...)
 	header = append(header, "MESSAGE")
 
-	p.writer.Write(header)
+	if err := p.writer.Write(header); err != nil {
+		log.Warnln("Failed to write CSV header:", err)
+	}
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/rhobs/requests.go` around lines 519 - 530, The PrintHeader method of
csvLogsPrinter currently calls p.writer.Write(header) and ignores its error;
update csvLogsPrinter.PrintHeader to capture and handle the returned error like
PrintResult does (check the error and return or propagate it appropriately),
e.g., change the signature or error handling so p.writer.Write(...)’s error is
checked and handled consistently with PrintResult; locate
csvLogsPrinter.PrintHeader and p.writer.Write in the diff and mirror the
error-check pattern used in PrintResult.
🧹 Nitpick comments (2)
cmd/rhobs/logsCmd.go (1)

171-171: 💤 Low value

Check the error return value of MarkHidden.

Static analysis indicates the error from MarkHidden is not checked. While unlikely to fail since the flag is registered immediately before, handling the error follows best practices.

🔧 Proposed fix
-cmd.Flags().MarkHidden("since-time")
+if err := cmd.Flags().MarkHidden("since-time"); err != nil {
+	// This should never happen since we just registered the flag
+	panic(fmt.Sprintf("failed to mark since-time flag as hidden: %v", err))
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/rhobs/logsCmd.go` at line 171, The call to
cmd.Flags().MarkHidden("since-time") ignores its returned error; update the
invocation where the command is constructed (the variable cmd in logsCmd.go) to
capture and handle the error from MarkHidden — e.g., assign the error (err :=
cmd.Flags().MarkHidden("since-time")) and return or log it appropriately from
the surrounding function or propagate it so failures are not silently ignored;
ensure the change references the cmd.Flags().MarkHidden call and follows the
existing error-handling pattern in this function.
cmd/rhobs/requests.go (1)

275-291: 💤 Low value

Use fmt.Printf instead of fmt.Print(fmt.Sprintf(...)).

Static analysis indicates these can be simplified. This is a minor style improvement.

🔧 Proposed fix
-fmt.Print(fmt.Sprintf(" %-*s |", column.width, column.name))
+fmt.Printf(" %-*s |", column.width, column.name)
 ...
-fmt.Print(fmt.Sprintf("| %*s | %*s |", columns[0].width, time, columns[1].width, value))
+fmt.Printf("| %*s | %*s |", columns[0].width, time, columns[1].width, value)
 ...
-fmt.Print(fmt.Sprintf(" %*s |", column.width, labelValue))
+fmt.Printf(" %*s |", column.width, labelValue)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/rhobs/requests.go` around lines 275 - 291, Several print calls use
fmt.Print(fmt.Sprintf(...)) which is redundant; replace those with fmt.Printf to
simplify formatting. In the header loop replace fmt.Print(fmt.Sprintf(" %-*s |",
column.width, column.name)) with fmt.Printf(" %-*s |", column.width,
column.name); in the rows replace fmt.Print(fmt.Sprintf("| %*s | %*s |",
columns[0].width, time, columns[1].width, value)) with fmt.Printf(...); and
replace fmt.Print(fmt.Sprintf(" %*s |", column.width, labelValue)) inside the
label loop with fmt.Printf(" %*s |", column.width, labelValue). Keep existing
variables/results/columns/separatorLine usage unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cmd/rhobs/requests.go`:
- Around line 17-18: Remove the duplicate import of
"github.com/openshift/osdctl/pkg/utils" by keeping a single import name (choose
either alias ocmutils or plain utils) and update call sites to match that name;
specifically, ensure CreateConnection, GetCluster, and GetManagementCluster
calls use the chosen import name (ocmutils.CreateConnection,
ocmutils.GetCluster, ocmutils.GetManagementCluster if keeping ocmutils) and that
GetScopedAccessToken is called with the same import name (e.g.,
utils.GetScopedAccessToken if you choose utils), then remove the other import
entry so only one import of pkg/utils remains.

---

Outside diff comments:
In `@cmd/dynatrace/hcpGatherLogsCmd.go`:
- Around line 134-199: The dumpEvents and dumpPodLogs functions ignore their
local parameters since, tail, and sortOrder and instead use
g.Since/g.Tail/g.SortOrder when constructing queries; update the calls to
getEventQuery and getPodQuery to pass the function parameters (since, tail,
sortOrder) rather than g.Since/g.Tail/g.SortOrder so the functions honor their
contract and callers can supply custom values; ensure any other query
construction in dumpEvents/dumpPodLogs similarly uses the local parameters.

---

Duplicate comments:
In `@cmd/rhobs/logsCmd.go`:
- Around line 164-167: The help claims --query is exclusive with
--include-events but the code only marks "namespace", "selector", and
"container" mutually exclusive; update the flag mutual-exclusion setup in the
command initialization by adding the include-events flag to the
MarkFlagsMutuallyExclusive calls (i.e., call
cmd.MarkFlagsMutuallyExclusive("query", "include-events") alongside the existing
calls in logs command setup) so that attempting to use --query and
--include-events together is rejected.
- Around line 75-92: Normalize the label key before embedding it into
k8s_pod_label_%s; replace characters like '.', '/', '-' with '_' (e.g. via
strings.NewReplacer(".", "_", "/", "_", "-", "_") or a small helper
normalizeLabelKey(key string) string) and use the normalized value instead of
req.Key() when building lokiExpr in the switch handling req.Operator() (the
cases that call fmt.Sprintf(` | k8s_pod_label_%s ...`, req.Key()) — including
the Exists/DoesNotExist, equality and set operators). Ensure you apply the
normalized key in every fmt.Sprintf call that currently uses req.Key() so field
names become k8s_pod_label_<normalized_key>.

In `@cmd/rhobs/requests.go`:
- Around line 519-530: The PrintHeader method of csvLogsPrinter currently calls
p.writer.Write(header) and ignores its error; update csvLogsPrinter.PrintHeader
to capture and handle the returned error like PrintResult does (check the error
and return or propagate it appropriately), e.g., change the signature or error
handling so p.writer.Write(...)’s error is checked and handled consistently with
PrintResult; locate csvLogsPrinter.PrintHeader and p.writer.Write in the diff
and mirror the error-check pattern used in PrintResult.

In `@pkg/utils/vault.go`:
- Around line 24-36: GetVaultRef currently only uses viper.IsSet but may return
empty vaultAddr or vaultPath; update GetVaultRef to validate that
viper.GetString(VaultAddrKey) and viper.GetString(vaultPathKey) return non-empty
strings and return an error if either is empty. Specifically, after retrieving
vaultAddr and vaultPath in GetVaultRef, check if vaultAddr == "" or vaultPath ==
"" and return fmt.Errorf with the corresponding key name to fail fast before any
Vault CLI invocation.

---

Nitpick comments:
In `@cmd/rhobs/logsCmd.go`:
- Line 171: The call to cmd.Flags().MarkHidden("since-time") ignores its
returned error; update the invocation where the command is constructed (the
variable cmd in logsCmd.go) to capture and handle the error from MarkHidden —
e.g., assign the error (err := cmd.Flags().MarkHidden("since-time")) and return
or log it appropriately from the surrounding function or propagate it so
failures are not silently ignored; ensure the change references the
cmd.Flags().MarkHidden call and follows the existing error-handling pattern in
this function.

In `@cmd/rhobs/requests.go`:
- Around line 275-291: Several print calls use fmt.Print(fmt.Sprintf(...)) which
is redundant; replace those with fmt.Printf to simplify formatting. In the
header loop replace fmt.Print(fmt.Sprintf(" %-*s |", column.width, column.name))
with fmt.Printf(" %-*s |", column.width, column.name); in the rows replace
fmt.Print(fmt.Sprintf("| %*s | %*s |", columns[0].width, time, columns[1].width,
value)) with fmt.Printf(...); and replace fmt.Print(fmt.Sprintf(" %*s |",
column.width, labelValue)) inside the label loop with fmt.Printf(" %*s |",
column.width, labelValue). Keep existing variables/results/columns/separatorLine
usage unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 7e44ee40-1933-41af-a59a-3cba5027d279

📥 Commits

Reviewing files that changed from the base of the PR and between ce11fda and 5529f6a.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (16)
  • cmd/cmd.go
  • cmd/dynatrace/hcpGatherLogsCmd.go
  • cmd/dynatrace/requests.go
  • cmd/rhobs/cellCmd.go
  • cmd/rhobs/logsCmd.go
  • cmd/rhobs/metricsCmd.go
  • cmd/rhobs/requests.go
  • cmd/rhobs/rootCmd.go
  • go.mod
  • internal/utils/globalflags/globalflags.go
  • main.go
  • pkg/utils/requester.go
  • pkg/utils/token.go
  • pkg/utils/utils.go
  • pkg/utils/vault.go
  • pkg/utils/vault_test.go
✅ Files skipped from review due to trivial changes (3)
  • pkg/utils/vault_test.go
  • pkg/utils/utils.go
  • cmd/rhobs/cellCmd.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • cmd/rhobs/rootCmd.go

Comment thread cmd/rhobs/requests.go Outdated
Comment thread cmd/rhobs/requests.go
Comment on lines +160 to +161
clusterDeployment := clusterDeployments.Items[0]
rhobsCell, ok := clusterDeployment.Labels[rhobsCellCdLabel]
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd recommend we avoid using the rhobs-cell label on the MC. While I would expect the label value to contain the correct RHOBS cell the vast majority of the time, app-interface can overwrite the forwarding destination for RHOBS components, which would invalidate the label's destination

Instead, you can check the annotation on a set of configmaps, which should always accurately reflect the true forwarding destination

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, as discussed I will use those config maps. I will raise tickets for clusters on which those config maps are missing

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tnierman : code has been changed to read the RHOBS cell from the config maps

@Nikokolas3270 Nikokolas3270 force-pushed the SREP-347 branch 4 times, most recently from 4ef2856 to 353eb0a Compare May 5, 2026 16:09
@Nikokolas3270
Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 5, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 15

🧹 Nitpick comments (5)
docs/osdctl_network.md (1)

22-24: 💤 Low value

Add language specifier to fenced code block for markdown compliance.

The fenced code block is missing a language specifier. While this doesn't affect functionality, adding one improves markdown quality and rendering consistency. Consider adding an empty language specifier or bash for consistency with markdown best practices.

As per static analysis tool markdownlint-cli2, fenced code blocks should have a language specified (MD040).

📝 Proposed fix
 ### Options inherited from parent commands
 
-```
+```bash
   -S, --skip-version-check   skip checking to see if this is the most recent release

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @docs/osdctl_network.md around lines 22 - 24, The fenced code block
containing the CLI option line "-S, --skip-version-check skip checking to see
if this is the most recent release" lacks a language specifier; update that
fence to include a language (e.g., ```bash) so the snippet complies with
markdownlint MD040 and renders consistently.


</details>

</blockquote></details>
<details>
<summary>pkg/utils/vault.go (1)</summary><blockquote>

`125-128`: _💤 Low value_

**Redundant environment variable setting.**

`VAULT_ADDR` is already set in `setupVaultToken()` (line 48), which is called immediately before this. This second assignment is unnecessary.


<details>
<summary>♻️ Proposed fix</summary>

```diff
 func GetSecretFromVault(vaultRef VaultRef) (*map[string]string, error) {
 	err := setupVaultToken(vaultRef.Addr)
 	if err != nil {
 		return nil, err
 	}
 
-	err = os.Setenv("VAULT_ADDR", vaultRef.Addr)
-	if err != nil {
-		return nil, fmt.Errorf("error setting environment variable: %v", err)
-	}
-
 	kvGetCommand := exec.Command("vault", "kv", "get", "-format=json", vaultRef.Path)
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/utils/vault.go` around lines 125 - 128, Remove the redundant VAULT_ADDR
assignment: since setupVaultToken() already sets VAULT_ADDR before the code in
pkg/utils/vault.go (the os.Setenv("VAULT_ADDR", vaultRef.Addr) call), delete
that second os.Setenv call and its error handling; ensure any remaining code
relies on the VAULT_ADDR set by setupVaultToken() and keep setupVaultToken() as
the single source of truth for setting the environment variable.
```

</details>

</blockquote></details>
<details>
<summary>cmd/dynatrace/requests.go (1)</summary><blockquote>

`122-148`: _⚡ Quick win_

**Redundant loop and duplicate state check.**

The `for` loop (lines 122-138) always executes exactly once and breaks unconditionally. The state check at lines 133-135 is then repeated at lines 146-148. This appears to be leftover code or a refactoring artifact.


<details>
<summary>♻️ Proposed simplification</summary>

```diff
 	var resp string
-	for {
-		resp, err = requester.Send()
-		if err != nil {
-			return "", err
-		}
-		var execState DTExecuteState
-		err = json.Unmarshal([]byte(resp), &execState)
-		if err != nil {
-			return "", err
-		}
-
-		if execState.State != "RUNNING" && execState.State != "SUCCEEDED" {
-			return "", fmt.Errorf("query failed")
-		}
-
-		break
-	}
-
-	var state DTExecuteState
-	err = json.Unmarshal([]byte(resp), &state)
+	resp, err = requester.Send()
 	if err != nil {
 		return "", err
 	}
 
+	var state DTExecuteState
+	err = json.Unmarshal([]byte(resp), &state)
+	if err != nil {
+		return "", err
+	}
+
 	if state.State != "RUNNING" && state.State != "SUCCEEDED" {
 		return "", fmt.Errorf("query failed")
 	}
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/dynatrace/requests.go` around lines 122 - 148, The loop and duplicate
state unmarshal are redundant: call requester.Send() once, unmarshal its resp
into a DTExecuteState (use the existing DTExecuteState type and resp variable),
check that state.State is either "RUNNING" or "SUCCEEDED" and return an error if
not, then proceed; remove the for { ... break } block and the second
json.Unmarshal/duplicate state check (the variables requester.Send(), resp,
DTExecuteState and state are the identifiers to update).
```

</details>

</blockquote></details>
<details>
<summary>cmd/rhobs/requests.go (1)</summary><blockquote>

`415-416`: _💤 Low value_

**Check `writer.Error()` after `Flush()` for consistency.**

The CSV writer may buffer errors until `Flush()` is called. For consistency with the error handling pattern used elsewhere in this file (and per past review discussions), consider checking `writer.Error()` after `Flush()`.



<details>
<summary>♻️ Proposed fix</summary>

```diff
 	writer.Flush()
+	if err := writer.Error(); err != nil {
+		log.Warnln("Failed to flush CSV writer:", err)
+	}
 }
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/rhobs/requests.go` around lines 415 - 416, After calling writer.Flush(),
call writer.Error() and handle any returned error consistently with the file's
existing pattern (e.g., return or log the error). Specifically, immediately
after the existing writer.Flush() invocation check err := writer.Error(); and
propagate or report err the same way other errors are handled in this function
(keep the same control flow surrounding writer.Flush() and writer.Error() to
match existing error handling semantics).
```

</details>

</blockquote></details>
<details>
<summary>cmd/rhobs/cellCmd.go (1)</summary><blockquote>

`17-45`: _⚡ Quick win_

**Convert `Run` to `RunE` and return an error instead of calling `os.Exit(1)`.**

The `cellCmd.go` handler uses `os.Exit(1)` (line 43) which bypasses normal error propagation in Cobra's command tree. Other commands in this directory (`metricsCmd.go`, `logsCmd.go`) follow the `RunE` pattern, which returns errors for consistent error handling and easier testing.

<details>
<summary>Suggested refactor</summary>

```diff
-import (
-	"fmt"
-	"os"
+import (
+	"fmt"

 	log "github.com/sirupsen/logrus"
 	"github.com/spf13/cobra"
 )
 
-		Run: func(cmd *cobra.Command, args []string) {
+		RunE: func(cmd *cobra.Command, args []string) error {
 			cmd.SilenceUsage = true
 			
 			metricsRhobsFetcher, metricsErr := CreateRhobsFetcher(commonOptions.clusterId, RhobsFetchForMetrics, commonOptions.hiveOcmUrl)
 			if metricsErr != nil {
 				log.Errorf("Error while computing metrics RHOBS cell: %v", metricsErr)
 			}
 			
 			logsRhobsFetcher, logsErr := CreateRhobsFetcher(commonOptions.clusterId, RhobsFetchForLogs, commonOptions.hiveOcmUrl)
 			if logsErr != nil {
 				log.Errorf("Error while computing logs RHOBS cell: %v", logsErr)
 			}
 			
 			if metricsErr == nil {
 				if logsErr == nil && metricsRhobsFetcher.RhobsCell == logsRhobsFetcher.RhobsCell {
 					fmt.Println("Metrics & logs RHOBS cell:", metricsRhobsFetcher.RhobsCell)
-					return
+					return nil
 				}
 				fmt.Println("Metrics RHOBS cell:", metricsRhobsFetcher.RhobsCell)
 			}
 			
 			if logsErr == nil {
 				fmt.Println("Logs RHOBS cell   :", logsRhobsFetcher.RhobsCell)
 			}
 			
 			if metricsErr != nil || logsErr != nil {
-				os.Exit(1)
+				return fmt.Errorf("failed to compute RHOBS cell (metricsErr=%v, logsErr=%v)", metricsErr, logsErr)
 			}
+			return nil
 		},
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/rhobs/cellCmd.go` around lines 17 - 45, The command's handler currently
uses Run and calls os.Exit(1) which prevents Cobra-style error propagation;
change the cobra.Command field from Run to RunE, make the anonymous function
signature return error, replace os.Exit(1) with returning a non-nil error
(wrap/return metricsErr or logsErr or a combined error) and ensure success paths
return nil; keep existing logic around CreateRhobsFetcher, metricsRhobsFetcher,
logsRhobsFetcher and commonOptions.clusterId/hiveOcmUrl but propagate failures
by returning errors instead of calling os.Exit.
```

</details>

</blockquote></details>

</blockquote></details>

<details>
<summary>🤖 Prompt for all review comments with AI agents</summary>

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @cmd/rhobs/logsCmd.go:

  • Line 216: The help text construction for the --level flag is producing
    incorrect quoting because it builds the quoted list with manual concatenation
    ("+strings.Join(...), resulting in stray spaces/quotes; update the StringArrayVar call so the allowed values are formatted as a proper comma-separated quoted list (e.g., build a single string like "+ strings.Join(allowedLogLevels,",") + "` or use fmt.Sprintf with the %q verb
    on the slice/values) and pass that string into fmt.Sprintf for the flag
    description; change the expression inside cmd.Flags().StringArrayVar (where
    logLevels and allowedLogLevels are referenced) to use the corrected quoted-list
    construction.

In @docs/osdctl_aao.md:

  • Around line 22-24: The fenced code block containing the option line "-S,
    --skip-version-check skip checking to see if this is the most recent release"
    lacks a language tag and triggers MD040; update that fenced block in
    docs/osdctl_aao.md to include an explicit language identifier (e.g., "text")
    after the opening triple backticks so the block with "-S, --skip-version-check"
    satisfies markdownlint.

In @docs/osdctl_account.md:

  • Around line 22-24: Add a language specifier to the fenced code blocks
    containing the flag descriptions: update the block that contains "-S,
    --skip-version-check skip checking to see if this is the most recent release"
    and the block that contains "--as string Username to
    impersonate for the operation. User could be a regular user or a service account
    in a namespace." to use text (i.e., change the opening fence from to

In `@docs/osdctl_cluster.md`:
- Around line 22-24: The fenced code block containing the option line "-S,
--skip-version-check   skip checking to see if this is the most recent release"
is missing a language tag; update that code fence to include a language
(preferably "text" or "bash") so markdownlint MD040 is satisfied—for example
change the opening triple backticks to "```text" (or "```bash") for the block
that contains the "-S, --skip-version-check" line.

In `@docs/osdctl_evidence.md`:
- Around line 34-36: Update the fenced code block that contains the line "-S,
--skip-version-check   skip checking to see if this is the most recent release"
to declare a language (for example "text") by adding the language identifier
after the opening triple backticks so the block becomes "```text" instead of
"```"; this will satisfy markdownlint MD040 while leaving the content of the
block unchanged.

In `@docs/osdctl_hcp.md`:
- Around line 22-24: The fenced code block containing the flag line "-S,
--skip-version-check   skip checking to see if this is the most recent release"
should include a language identifier to satisfy markdownlint MD040; update the
opening fence from ``` to ```text (or another appropriate lang like ```console)
so the block becomes ```text and keep the rest unchanged.

In `@docs/osdctl_iampermissions.md`:
- Around line 27-29: Update the fenced code block so it includes a language tag
to satisfy markdownlint MD040: change the opening triple backticks to include a
language (e.g., "text") for the block that contains "-S, --skip-version-check  
skip checking to see if this is the most recent release" so the fence becomes
```text; ensure the block around that line uses the tagged fence rather than
plain ``` to fix the lint warning.

In `@docs/osdctl_jira.md`:
- Around line 22-24: The fenced code block containing the CLI flag line "-S,
--skip-version-check   skip checking to see if this is the most recent release"
is missing a language tag (MD040); update the fence from ``` to include a
language such as ```text (or ```console) so the block becomes a labeled code
fence and satisfies markdownlint. Ensure you only change the opening fence to
include the language and do not alter the flag text itself.

In `@docs/osdctl_jumphost.md`:
- Around line 22-24: The fenced code block containing the option line "-S,
--skip-version-check   skip checking to see if this is the most recent release"
is missing a language tag and triggers markdownlint MD040; update the opening
triple-backtick for that block to include a language identifier (e.g., change
"```" to "```text") so the block becomes a labeled code fence around the "-S,
--skip-version-check" content.

In `@docs/osdctl_mc.md`:
- Line 7: Update the two fenced code blocks that currently use ``` to include a
language specifier (e.g., ```text) so markdown linters render them correctly;
specifically change the fence enclosing the "--as string                       
Username to impersonate..." block and the fence enclosing the "-S,
--skip-version-check   skip checking..." block to use ```text (or another
appropriate language).

In `@docs/osdctl_org.md`:
- Around line 26-28: The fenced code block containing the CLI flag line "-S,
--skip-version-check   skip checking to see if this is the most recent release"
is missing a language hint and triggers markdownlint MD040; change the opening
fence from ``` to a language-tagged fence such as ```text (or ```console) around
that block in docs/osdctl_org.md so the block becomes ```text ... ``` to satisfy
the linter while preserving the exact block content.

In `@docs/osdctl_rhobs_logs.md`:
- Line 20: The --level flag help text in the fmt.Sprintf call that builds the
description (using allowedLogLevels) is malformed; update the strings.Join
separator to use `", "` and ensure a trailing closing quote is appended to the
formatted string so the allowed values print as "val1", "val2", "val3". Locate
the fmt.Sprintf usage in logsCmd.go (the description for the --level flag that
references allowedLogLevels) and replace the current join/separator expression
with one that uses `", "` and concatenates the final `"` after the join.

In `@docs/osdctl_servicelog.md`:
- Around line 26-28: The fenced code block containing the option line "-S,
--skip-version-check   skip checking to see if this is the most recent release"
is missing a language tag which triggers markdownlint MD040; update that fenced
block to include a language identifier (for example "text") immediately after
the opening triple backticks so the snippet is written as ```text ... ``` to
satisfy the linter.

In `@docs/osdctl_swarm.md`:
- Around line 22-24: The fenced code block containing the option line "-S,
--skip-version-check   skip checking to see if this is the most recent release"
is missing a language identifier; update the markdown fenced block to include a
language tag (for example "text") so the block becomes ```text ... ``` to
satisfy MD040 and avoid the markdownlint warning.

In `@docs/README.md`:
- Around line 4249-4255: Add explicit language identifiers to the new fenced
code blocks to satisfy markdownlint MD040: change the opening fences for the
command blocks like "osdctl rhobs [flags]", "osdctl rhobs cell [flags]", "osdctl
rhobs logs [pod] [flags]", and "osdctl rhobs metrics prometheus-expression
[flags]" to use ```bash and change the option/list output blocks that follow
each command to use ```text (e.g., the blocks containing "-C, --cluster-id
string ..." and other flag lines); update the fences at the occurrences reported
(the blocks around those command headers and their flag lists) accordingly so
every triple-backtick has an explicit language.

---

Nitpick comments:
In `@cmd/dynatrace/requests.go`:
- Around line 122-148: The loop and duplicate state unmarshal are redundant:
call requester.Send() once, unmarshal its resp into a DTExecuteState (use the
existing DTExecuteState type and resp variable), check that state.State is
either "RUNNING" or "SUCCEEDED" and return an error if not, then proceed; remove
the for { ... break } block and the second json.Unmarshal/duplicate state check
(the variables requester.Send(), resp, DTExecuteState and state are the
identifiers to update).

In `@cmd/rhobs/cellCmd.go`:
- Around line 17-45: The command's handler currently uses Run and calls
os.Exit(1) which prevents Cobra-style error propagation; change the
cobra.Command field from Run to RunE, make the anonymous function signature
return error, replace os.Exit(1) with returning a non-nil error (wrap/return
metricsErr or logsErr or a combined error) and ensure success paths return nil;
keep existing logic around CreateRhobsFetcher, metricsRhobsFetcher,
logsRhobsFetcher and commonOptions.clusterId/hiveOcmUrl but propagate failures
by returning errors instead of calling os.Exit.

In `@cmd/rhobs/requests.go`:
- Around line 415-416: After calling writer.Flush(), call writer.Error() and
handle any returned error consistently with the file's existing pattern (e.g.,
return or log the error). Specifically, immediately after the existing
writer.Flush() invocation check err := writer.Error(); and propagate or report
err the same way other errors are handled in this function (keep the same
control flow surrounding writer.Flush() and writer.Error() to match existing
error handling semantics).

In `@docs/osdctl_network.md`:
- Around line 22-24: The fenced code block containing the CLI option line "-S,
--skip-version-check   skip checking to see if this is the most recent release"
lacks a language specifier; update that fence to include a language (e.g.,
```bash) so the snippet complies with markdownlint MD040 and renders
consistently.

In `@pkg/utils/vault.go`:
- Around line 125-128: Remove the redundant VAULT_ADDR assignment: since
setupVaultToken() already sets VAULT_ADDR before the code in pkg/utils/vault.go
(the os.Setenv("VAULT_ADDR", vaultRef.Addr) call), delete that second os.Setenv
call and its error handling; ensure any remaining code relies on the VAULT_ADDR
set by setupVaultToken() and keep setupVaultToken() as the single source of
truth for setting the environment variable.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 06320455-a969-4942-a71f-27077914609d

📥 Commits

Reviewing files that changed from the base of the PR and between ce11fda and 353eb0a.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (52)
  • cmd/cmd.go
  • cmd/dynatrace/hcpGatherLogsCmd.go
  • cmd/dynatrace/requests.go
  • cmd/rhobs/cellCmd.go
  • cmd/rhobs/logsCmd.go
  • cmd/rhobs/metricsCmd.go
  • cmd/rhobs/requests.go
  • cmd/rhobs/rootCmd.go
  • docs/README.md
  • docs/osdctl.md
  • docs/osdctl_aao.md
  • docs/osdctl_account.md
  • docs/osdctl_alert.md
  • docs/osdctl_cloudtrail.md
  • docs/osdctl_cluster.md
  • docs/osdctl_cost.md
  • docs/osdctl_dynatrace.md
  • docs/osdctl_dynatrace_dashboard.md
  • docs/osdctl_dynatrace_gather-logs.md
  • docs/osdctl_dynatrace_logs.md
  • docs/osdctl_dynatrace_url.md
  • docs/osdctl_env.md
  • docs/osdctl_evidence.md
  • docs/osdctl_hcp.md
  • docs/osdctl_hive.md
  • docs/osdctl_iampermissions.md
  • docs/osdctl_jira.md
  • docs/osdctl_jumphost.md
  • docs/osdctl_mc.md
  • docs/osdctl_network.md
  • docs/osdctl_org.md
  • docs/osdctl_promote.md
  • docs/osdctl_promote_dynatrace.md
  • docs/osdctl_promote_managedscripts.md
  • docs/osdctl_promote_saas.md
  • docs/osdctl_rhobs.md
  • docs/osdctl_rhobs_cell.md
  • docs/osdctl_rhobs_logs.md
  • docs/osdctl_rhobs_metrics.md
  • docs/osdctl_servicelog.md
  • docs/osdctl_setup.md
  • docs/osdctl_swarm.md
  • docs/osdctl_upgrade.md
  • docs/osdctl_version.md
  • go.mod
  • internal/utils/globalflags/globalflags.go
  • main.go
  • pkg/utils/requester.go
  • pkg/utils/token.go
  • pkg/utils/utils.go
  • pkg/utils/vault.go
  • pkg/utils/vault_test.go
✅ Files skipped from review due to trivial changes (14)
  • docs/osdctl_version.md
  • docs/osdctl.md
  • docs/osdctl_dynatrace_gather-logs.md
  • pkg/utils/vault_test.go
  • docs/osdctl_upgrade.md
  • docs/osdctl_promote.md
  • docs/osdctl_promote_managedscripts.md
  • main.go
  • docs/osdctl_dynatrace_logs.md
  • docs/osdctl_promote_saas.md
  • docs/osdctl_dynatrace.md
  • docs/osdctl_dynatrace_url.md
  • docs/osdctl_dynatrace_dashboard.md
  • internal/utils/globalflags/globalflags.go
🚧 Files skipped from review as they are similar to previous changes (7)
  • pkg/utils/utils.go
  • cmd/rhobs/metricsCmd.go
  • cmd/dynatrace/hcpGatherLogsCmd.go
  • cmd/cmd.go
  • pkg/utils/token.go
  • go.mod
  • pkg/utils/requester.go

Comment thread cmd/rhobs/logsCmd.go Outdated
Comment thread docs/osdctl_aao.md
Comment thread docs/osdctl_account.md
Comment thread docs/osdctl_cluster.md
Comment thread docs/osdctl_evidence.md
Comment thread docs/osdctl_org.md
Comment thread docs/osdctl_rhobs_logs.md Outdated
Comment thread docs/osdctl_servicelog.md
Comment thread docs/osdctl_swarm.md
Comment thread docs/README.md
@Nikokolas3270 Nikokolas3270 force-pushed the SREP-347 branch 2 times, most recently from aa1b380 to d8c225b Compare May 5, 2026 16:41
@Nikokolas3270
Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 5, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
pkg/utils/vault.go (2)

53-59: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Bound the non-interactive Vault calls with a timeout.

vault version, vault token lookup, and vault kv get can block indefinitely on a bad install, network stall, or Vault outage. That hangs the whole CLI. These should use exec.CommandContext with a reasonable timeout; keep vault login interactive if you want it unbounded.

Proposed fix
 import (
+	"context"
 	"encoding/json"
 	"fmt"
 	"os"
 	"os/exec"
+	"time"
@@
+const vaultCommandTimeout = 30 * time.Second
+
 func setupVaultToken(vaultAddr string) error {
-	versionCheckCmd := vaultCmd(vaultAddr, "version")
+	ctx, cancel := context.WithTimeout(context.Background(), vaultCommandTimeout)
+	defer cancel()
+	versionCheckCmd := exec.CommandContext(ctx, "vault", "version")
+	versionCheckCmd.Env = append(os.Environ(), "VAULT_ADDR="+vaultAddr)
@@
-	tokenCheckCmd := vaultCmd(vaultAddr, "token", "lookup")
+	ctx, cancel = context.WithTimeout(context.Background(), vaultCommandTimeout)
+	defer cancel()
+	tokenCheckCmd := exec.CommandContext(ctx, "vault", "token", "lookup")
+	tokenCheckCmd.Env = append(os.Environ(), "VAULT_ADDR="+vaultAddr)
@@
-	kvGetCommand := vaultCmd(vaultRef.Addr, "kv", "get", "-format=json", vaultRef.Path) //nolint:gosec
+	ctx, cancel := context.WithTimeout(context.Background(), vaultCommandTimeout)
+	defer cancel()
+	kvGetCommand := exec.CommandContext(ctx, "vault", "kv", "get", "-format=json", vaultRef.Path) //nolint:gosec
+	kvGetCommand.Env = append(os.Environ(), "VAULT_ADDR="+vaultRef.Addr)

Also applies to: 62-66, 130-134

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/utils/vault.go` around lines 53 - 59, Replace the blocking exec.Command
calls for non-interactive Vault operations with context-bound commands: create a
context with timeout (e.g., context.WithTimeout(ctx, 10*time.Second)) and use
exec.CommandContext for the version check (versionCheckCmd), the token lookup
call (tokenLookupCmd) and the kv read (kvGetCmd) so they cannot hang
indefinitely; ensure you call cancel() after command execution, preserve
Stdout/Stderr wiring, and return a clear error on context deadline exceeded
while leaving interactive vault login unchanged.

47-50: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid mutating VAULT_ADDR process-wide.

Lines 48 and 125 update global process state, and every vault subprocess here inherits it. Once this helper is shared, concurrent callers can race and send a command to the wrong Vault address. Set VAULT_ADDR on each exec.Cmd.Env instead of calling os.Setenv.

Proposed fix
+func vaultCmd(vaultAddr string, args ...string) *exec.Cmd {
+	cmd := exec.Command("vault", args...)
+	cmd.Env = append(os.Environ(), "VAULT_ADDR="+vaultAddr)
+	return cmd
+}
+
 func setupVaultToken(vaultAddr string) error {
-	err := os.Setenv("VAULT_ADDR", vaultAddr)
-	if err != nil {
-		return fmt.Errorf("error setting environment variable: %v", err)
-	}
-
-	versionCheckCmd := exec.Command("vault", "version")
+	versionCheckCmd := vaultCmd(vaultAddr, "version")
@@
-	tokenCheckCmd := exec.Command("vault", "token", "lookup")
+	tokenCheckCmd := vaultCmd(vaultAddr, "token", "lookup")
@@
-		loginCmd := exec.Command("vault", loginArgs...)
+		loginCmd := vaultCmd(vaultAddr, loginArgs...)
@@
 func GetSecretFromVault(vaultRef VaultRef) (*map[string]string, error) {
 	err := setupVaultToken(vaultRef.Addr)
 	if err != nil {
 		return nil, err
 	}
-
-	err = os.Setenv("VAULT_ADDR", vaultRef.Addr)
-	if err != nil {
-		return nil, fmt.Errorf("error setting environment variable: %v", err)
-	}
-
-	kvGetCommand := exec.Command("vault", "kv", "get", "-format=json", vaultRef.Path) //nolint:gosec
+
+	kvGetCommand := vaultCmd(vaultRef.Addr, "kv", "get", "-format=json", vaultRef.Path) //nolint:gosec

Also applies to: 53-58, 62-66, 82-83, 120-130

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/utils/vault.go` around lines 47 - 50, The helper mutates process-wide
VAULT_ADDR via os.Setenv (see setupVaultToken) which causes races; instead stop
calling os.Setenv and ensure each vault subprocess gets VAULT_ADDR only on its
exec.Cmd.Env. Change setupVaultToken (and other places that currently call
os.Setenv) to either return the per-process VAULT_ADDR string or an env []string
and update call sites that create exec.Command to append "VAULT_ADDR=<addr>" to
cmd.Env (preserving existing env from os.Environ()). Locate usages around
setupVaultToken and the vault command invocations and replace global env
mutation with setting cmd.Env per exec.Cmd so concurrent callers don't share
process-wide state.
pkg/utils/token.go (1)

20-25: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Make the cached token provider thread-safe.

Token() mutates token and expiresAt without synchronization. If the provider is shared across goroutines, this races and can cause duplicate refreshes or stale reads. A small mutex around the cache is enough here.

Proposed fix
 import (
 	"encoding/json"
 	"fmt"
 	"net/http"
 	"net/url"
+	"sync"
 	"time"
@@
 type cachedTokenProvider struct {
+	mu        sync.Mutex
 	token     string
 	expiresAt time.Time
 	fetchFunc func() (string, int, error)
 	margin    time.Duration
 }
@@
 func (p *cachedTokenProvider) Token() (string, error) {
+	p.mu.Lock()
+	defer p.mu.Unlock()
+
 	if p.token != "" && time.Now().Before(p.expiresAt.Add(-p.margin)) {
 		return p.token, nil
 	}

Also applies to: 37-49

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/utils/token.go` around lines 20 - 25, The cachedTokenProvider mutates
token and expiresAt without synchronization, causing races when shared across
goroutines; add a sync.Mutex (or sync.RWMutex) field to the cachedTokenProvider
struct and lock/unlock it around reads and writes in the Token() method (and any
other methods that access token/expiresAt), ensuring fetchFunc is called outside
the lock if possible to avoid blocking other readers; update struct definition
(cachedTokenProvider) and protect accesses to token, expiresAt, and margin
accordingly so cache reads are guarded and refreshes are serialized.
♻️ Duplicate comments (1)
docs/osdctl_hcp.md (1)

22-24: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add a language identifier to the fenced code block.

The block starting at Line 22 still violates markdownlint MD040. Use a language on the opening fence (for example, text).

Proposed fix
-```
+```text
   -S, --skip-version-check   skip checking to see if this is the most recent release
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @docs/osdctl_hcp.md around lines 22 - 24, The fenced code block containing
the option line "-S, --skip-version-check skip checking to see if this is the
most recent release" should include a language identifier to satisfy
markdownlint MD040; update the opening fence for that block from totext
(or another suitable language token) so the block reads with a language
specifier.


</details>

</blockquote></details>

</blockquote></details>

<details>
<summary>🧹 Nitpick comments (1)</summary><blockquote>

<details>
<summary>cmd/rhobs/metricsCmd.go (1)</summary><blockquote>

`37-37`: _⚡ Quick win_

**Use `%w` to wrap errors for proper error chain inspection**

Line 37 should use `%w` instead of `%v` so callers can use `errors.Is/As` to inspect the root cause.

<details>
<summary>Proposed fix</summary>

```diff
-				return fmt.Errorf("failed to print metrics: %v", err)
+				return fmt.Errorf("failed to print metrics: %w", err)
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/rhobs/metricsCmd.go` at line 37, Replace the fmt.Errorf call that returns
"failed to print metrics: %v" so it wraps the original error using %w (e.g.,
fmt.Errorf("failed to print metrics: %w", err)) to enable errors.Is/errors.As
chaining; locate the return statement inside the metrics printing function (the
fmt.Errorf call that references err) and change the verb from %v to %w without
altering the surrounding message.
```

</details>

</blockquote></details>

</blockquote></details>

<details>
<summary>🤖 Prompt for all review comments with AI agents</summary>

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @docs/osdctl_cloudtrail.md:

  • Around line 26-28: Add a language identifier to the fenced code block
    containing the CLI output line "-S, --skip-version-check skip checking to see
    if this is the most recent release" (the fenced block that currently starts with
    ). Change the opening fence from to ```text so the block is recognized as
    neutral CLI/text and resolves markdownlint MD040.

In @docs/README.md:

  • Line 2861: The README contains conflicting default values for the --tail flag
    ("defaults to 100" vs "(default 1000)"); update the documentation so it matches
    the actual default from the command flag definition (the single source of
    truth). Locate the flag definition for "--tail" in the CLI command registration
    (the code that sets the flag's default value and help string) and make the
    README line " --tail int Last 'n' logs to fetch (defaults to X)
    (default X)" use that same value (replace X with the flag's configured default)
    so both occurrences match.

Outside diff comments:
In @pkg/utils/token.go:

  • Around line 20-25: The cachedTokenProvider mutates token and expiresAt without
    synchronization, causing races when shared across goroutines; add a sync.Mutex
    (or sync.RWMutex) field to the cachedTokenProvider struct and lock/unlock it
    around reads and writes in the Token() method (and any other methods that access
    token/expiresAt), ensuring fetchFunc is called outside the lock if possible to
    avoid blocking other readers; update struct definition (cachedTokenProvider) and
    protect accesses to token, expiresAt, and margin accordingly so cache reads are
    guarded and refreshes are serialized.

In @pkg/utils/vault.go:

  • Around line 53-59: Replace the blocking exec.Command calls for non-interactive
    Vault operations with context-bound commands: create a context with timeout
    (e.g., context.WithTimeout(ctx, 10*time.Second)) and use exec.CommandContext for
    the version check (versionCheckCmd), the token lookup call (tokenLookupCmd) and
    the kv read (kvGetCmd) so they cannot hang indefinitely; ensure you call
    cancel() after command execution, preserve Stdout/Stderr wiring, and return a
    clear error on context deadline exceeded while leaving interactive vault login
    unchanged.
  • Around line 47-50: The helper mutates process-wide VAULT_ADDR via os.Setenv
    (see setupVaultToken) which causes races; instead stop calling os.Setenv and
    ensure each vault subprocess gets VAULT_ADDR only on its exec.Cmd.Env. Change
    setupVaultToken (and other places that currently call os.Setenv) to either
    return the per-process VAULT_ADDR string or an env []string and update call
    sites that create exec.Command to append "VAULT_ADDR=" to cmd.Env
    (preserving existing env from os.Environ()). Locate usages around
    setupVaultToken and the vault command invocations and replace global env
    mutation with setting cmd.Env per exec.Cmd so concurrent callers don't share
    process-wide state.

Duplicate comments:
In @docs/osdctl_hcp.md:

  • Around line 22-24: The fenced code block containing the option line "-S,
    --skip-version-check skip checking to see if this is the most recent release"
    should include a language identifier to satisfy markdownlint MD040; update the
    opening fence for that block from totext (or another suitable language
    token) so the block reads with a language specifier.

Nitpick comments:
In @cmd/rhobs/metricsCmd.go:

  • Line 37: Replace the fmt.Errorf call that returns "failed to print metrics:
    %v" so it wraps the original error using %w (e.g., fmt.Errorf("failed to print
    metrics: %w", err)) to enable errors.Is/errors.As chaining; locate the return
    statement inside the metrics printing function (the fmt.Errorf call that
    references err) and change the verb from %v to %w without altering the
    surrounding message.

</details>

<details>
<summary>🪄 Autofix (Beta)</summary>

Fix all unresolved CodeRabbit comments on this PR:

- [ ] <!-- {"checkboxId": "4b0d0e0a-96d7-4f10-b296-3a18ea78f0b9"} --> Push a commit to this branch (recommended)
- [ ] <!-- {"checkboxId": "ff5b1114-7d8c-49e6-8ac1-43f82af23a33"} --> Create a new PR with the fixes

</details>

---

<details>
<summary>ℹ️ Review info</summary>

<details>
<summary>⚙️ Run configuration</summary>

**Configuration used**: Repository: openshift/coderabbit/.coderabbit.yaml

**Review profile**: CHILL

**Plan**: Enterprise

**Run ID**: `9a259cc0-6f35-4290-ab28-f1d72457104c`

</details>

<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between 353eb0a2117fb9d358d95df1fff519f7f845cd91 and d8c225b6f99a3eb629d568f3692c1b392839b7d7.

</details>

<details>
<summary>⛔ Files ignored due to path filters (1)</summary>

* `go.sum` is excluded by `!**/*.sum`

</details>

<details>
<summary>📒 Files selected for processing (52)</summary>

* `cmd/cmd.go`
* `cmd/dynatrace/hcpGatherLogsCmd.go`
* `cmd/dynatrace/requests.go`
* `cmd/rhobs/cellCmd.go`
* `cmd/rhobs/logsCmd.go`
* `cmd/rhobs/metricsCmd.go`
* `cmd/rhobs/requests.go`
* `cmd/rhobs/rootCmd.go`
* `docs/README.md`
* `docs/osdctl.md`
* `docs/osdctl_aao.md`
* `docs/osdctl_account.md`
* `docs/osdctl_alert.md`
* `docs/osdctl_cloudtrail.md`
* `docs/osdctl_cluster.md`
* `docs/osdctl_cost.md`
* `docs/osdctl_dynatrace.md`
* `docs/osdctl_dynatrace_dashboard.md`
* `docs/osdctl_dynatrace_gather-logs.md`
* `docs/osdctl_dynatrace_logs.md`
* `docs/osdctl_dynatrace_url.md`
* `docs/osdctl_env.md`
* `docs/osdctl_evidence.md`
* `docs/osdctl_hcp.md`
* `docs/osdctl_hive.md`
* `docs/osdctl_iampermissions.md`
* `docs/osdctl_jira.md`
* `docs/osdctl_jumphost.md`
* `docs/osdctl_mc.md`
* `docs/osdctl_network.md`
* `docs/osdctl_org.md`
* `docs/osdctl_promote.md`
* `docs/osdctl_promote_dynatrace.md`
* `docs/osdctl_promote_managedscripts.md`
* `docs/osdctl_promote_saas.md`
* `docs/osdctl_rhobs.md`
* `docs/osdctl_rhobs_cell.md`
* `docs/osdctl_rhobs_logs.md`
* `docs/osdctl_rhobs_metrics.md`
* `docs/osdctl_servicelog.md`
* `docs/osdctl_setup.md`
* `docs/osdctl_swarm.md`
* `docs/osdctl_upgrade.md`
* `docs/osdctl_version.md`
* `go.mod`
* `internal/utils/globalflags/globalflags.go`
* `main.go`
* `pkg/utils/requester.go`
* `pkg/utils/token.go`
* `pkg/utils/utils.go`
* `pkg/utils/vault.go`
* `pkg/utils/vault_test.go`

</details>

<details>
<summary>✅ Files skipped from review due to trivial changes (14)</summary>

* docs/osdctl_promote_managedscripts.md
* pkg/utils/vault_test.go
* docs/osdctl.md
* docs/osdctl_dynatrace_logs.md
* docs/osdctl_version.md
* docs/osdctl_dynatrace_dashboard.md
* docs/osdctl_dynatrace_url.md
* docs/osdctl_dynatrace_gather-logs.md
* cmd/dynatrace/hcpGatherLogsCmd.go
* docs/osdctl_dynatrace.md
* go.mod
* cmd/rhobs/rootCmd.go
* pkg/utils/requester.go
* cmd/rhobs/requests.go

</details>

<details>
<summary>🚧 Files skipped from review as they are similar to previous changes (5)</summary>

* docs/osdctl_promote_saas.md
* main.go
* cmd/rhobs/cellCmd.go
* cmd/cmd.go
* cmd/rhobs/logsCmd.go

</details>

</details>

<!-- This is an auto-generated comment by CodeRabbit for review status -->

Comment thread docs/osdctl_cloudtrail.md
Comment thread docs/README.md Outdated
@Nikokolas3270
Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 5, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/utils/vault.go (1)

48-51: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid process-wide VAULT_ADDR mutation; scope it per vault command instead.

At lines 48 and 125, os.Setenv("VAULT_ADDR", ...) mutates global process state. This creates a race condition where concurrent credential fetches targeting different Vault addresses can overwrite each other's environment variables. Set the environment variable per-command using Cmd.Env instead:

Suggested approach

Create a helper function to construct vault commands with the address set per-command:

func vaultCmd(vaultAddr string, args ...string) *exec.Cmd {
	cmd := exec.Command("vault", args...)
	cmd.Env = append(os.Environ(), "VAULT_ADDR="+vaultAddr)
	return cmd
}

Then replace all exec.Command("vault", ...) calls with vaultCmd(vaultAddr, ...) and remove the os.Setenv calls at lines 48 and 125.

This applies to both setupVaultToken() (lines 48-51) and GetSecretFromVault() (lines 125-133).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/utils/vault.go` around lines 48 - 51, Replace the process-wide
os.Setenv("VAULT_ADDR", ...) usage in setupVaultToken() and GetSecretFromVault()
with a per-command environment by adding a helper func vaultCmd(vaultAddr
string, args ...string) *exec.Cmd that constructs exec.Command("vault", args...)
and sets cmd.Env = append(os.Environ(), "VAULT_ADDR="+vaultAddr); then update
all exec.Command("vault", ...) calls in setupVaultToken and GetSecretFromVault
to use vaultCmd(vaultAddr, ...), and remove the os.Setenv calls so VAULT_ADDR is
scoped to each Cmd only.
🧹 Nitpick comments (1)
pkg/utils/requester.go (1)

57-65: ⚡ Quick win

Add safe default for SuccessCode to prevent future bugs.

The current code relies on callers to set SuccessCode. While all existing instantiations do set it explicitly, adding a default of http.StatusOK in the Send() method would be a defensive measure to prevent accidental breakage if a future caller forgets to set it.

Suggested patch
 func (rh *Requester) Send() (string, error) {
+	successCode := rh.SuccessCode
+	if successCode == 0 {
+		successCode = http.StatusOK
+	}
+
 	client := http.Client{
 		Timeout: time.Second * 600,
 	}
@@
-	if resp.StatusCode != rh.SuccessCode {
+	if resp.StatusCode != successCode {
 		var respErr responseError
 		err = json.Unmarshal(body, &respErr)
 		if err != nil || len(respErr.Records) == 0 {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/utils/requester.go` around lines 57 - 65, Ensure Send() uses a safe
default for SuccessCode to avoid relying on callers: in the Send method (where
you compare resp.StatusCode != rh.SuccessCode) set a local default or assign
rh.SuccessCode to http.StatusOK when it's zero (e.g., if rh.SuccessCode == 0 {
rh.SuccessCode = http.StatusOK } or use a local successCode var). This
guarantees the comparison uses http.StatusOK if SuccessCode was not set, and
keeps the existing error-unmarshal logic unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@pkg/utils/vault.go`:
- Around line 48-51: Replace the process-wide os.Setenv("VAULT_ADDR", ...) usage
in setupVaultToken() and GetSecretFromVault() with a per-command environment by
adding a helper func vaultCmd(vaultAddr string, args ...string) *exec.Cmd that
constructs exec.Command("vault", args...) and sets cmd.Env =
append(os.Environ(), "VAULT_ADDR="+vaultAddr); then update all
exec.Command("vault", ...) calls in setupVaultToken and GetSecretFromVault to
use vaultCmd(vaultAddr, ...), and remove the os.Setenv calls so VAULT_ADDR is
scoped to each Cmd only.

---

Nitpick comments:
In `@pkg/utils/requester.go`:
- Around line 57-65: Ensure Send() uses a safe default for SuccessCode to avoid
relying on callers: in the Send method (where you compare resp.StatusCode !=
rh.SuccessCode) set a local default or assign rh.SuccessCode to http.StatusOK
when it's zero (e.g., if rh.SuccessCode == 0 { rh.SuccessCode = http.StatusOK }
or use a local successCode var). This guarantees the comparison uses
http.StatusOK if SuccessCode was not set, and keeps the existing error-unmarshal
logic unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 0d42fe79-402c-49a5-a086-d41a63f2f034

📥 Commits

Reviewing files that changed from the base of the PR and between d8c225b and f0cf074.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (53)
  • cmd/cmd.go
  • cmd/dynatrace/hcpGatherLogsCmd.go
  • cmd/dynatrace/logsCmd.go
  • cmd/dynatrace/requests.go
  • cmd/rhobs/cellCmd.go
  • cmd/rhobs/logsCmd.go
  • cmd/rhobs/metricsCmd.go
  • cmd/rhobs/requests.go
  • cmd/rhobs/rootCmd.go
  • docs/README.md
  • docs/osdctl.md
  • docs/osdctl_aao.md
  • docs/osdctl_account.md
  • docs/osdctl_alert.md
  • docs/osdctl_cloudtrail.md
  • docs/osdctl_cluster.md
  • docs/osdctl_cost.md
  • docs/osdctl_dynatrace.md
  • docs/osdctl_dynatrace_dashboard.md
  • docs/osdctl_dynatrace_gather-logs.md
  • docs/osdctl_dynatrace_logs.md
  • docs/osdctl_dynatrace_url.md
  • docs/osdctl_env.md
  • docs/osdctl_evidence.md
  • docs/osdctl_hcp.md
  • docs/osdctl_hive.md
  • docs/osdctl_iampermissions.md
  • docs/osdctl_jira.md
  • docs/osdctl_jumphost.md
  • docs/osdctl_mc.md
  • docs/osdctl_network.md
  • docs/osdctl_org.md
  • docs/osdctl_promote.md
  • docs/osdctl_promote_dynatrace.md
  • docs/osdctl_promote_managedscripts.md
  • docs/osdctl_promote_saas.md
  • docs/osdctl_rhobs.md
  • docs/osdctl_rhobs_cell.md
  • docs/osdctl_rhobs_logs.md
  • docs/osdctl_rhobs_metrics.md
  • docs/osdctl_servicelog.md
  • docs/osdctl_setup.md
  • docs/osdctl_swarm.md
  • docs/osdctl_upgrade.md
  • docs/osdctl_version.md
  • go.mod
  • internal/utils/globalflags/globalflags.go
  • main.go
  • pkg/utils/requester.go
  • pkg/utils/token.go
  • pkg/utils/utils.go
  • pkg/utils/vault.go
  • pkg/utils/vault_test.go
✅ Files skipped from review due to trivial changes (18)
  • docs/osdctl_promote.md
  • docs/osdctl_dynatrace.md
  • pkg/utils/vault_test.go
  • pkg/utils/utils.go
  • main.go
  • docs/osdctl_dynatrace_url.md
  • docs/osdctl_promote_managedscripts.md
  • docs/osdctl_version.md
  • cmd/dynatrace/logsCmd.go
  • docs/osdctl_promote_saas.md
  • docs/osdctl_upgrade.md
  • cmd/dynatrace/requests.go
  • go.mod
  • internal/utils/globalflags/globalflags.go
  • docs/osdctl_dynatrace_dashboard.md
  • cmd/rhobs/requests.go
  • docs/osdctl_dynatrace_gather-logs.md
  • docs/osdctl_promote_dynatrace.md
🚧 Files skipped from review as they are similar to previous changes (6)
  • cmd/dynatrace/hcpGatherLogsCmd.go
  • cmd/rhobs/cellCmd.go
  • docs/osdctl.md
  • cmd/rhobs/metricsCmd.go
  • cmd/rhobs/logsCmd.go
  • cmd/rhobs/rootCmd.go

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 5, 2026

@Nikokolas3270: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Comment thread cmd/rhobs/requests.go
clusterId: cluster.ID(),
clusterExternalId: cluster.ExternalID(),
clusterName: cluster.Name(),
isManagementCluster: !isHcp,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The isManagementCluster: !isHcp logic looks inverted for the metrics filtering in PrintMetrics. When the input cluster is HCP, you resolve its parent MC for the RHOBS cell, but then isManagementCluster is false, so PrintMetrics filters by _id (external ID of the HCP cluster) rather than _mc_id/mc_name. Is this intentional? I would have expected HCP clusters to need MC-based filtering since the metrics come from the MC RHOBS cell.

Copy link
Copy Markdown
Contributor Author

@Nikokolas3270 Nikokolas3270 May 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a problem with isManagementCluster but not the one you reported:

  • The actual problem:Previously the code was only allowing hosted clusters or management clusters...
    ... but in the future we may have other clusters sending metrics & logs to RHOBS.
    I will refactor this part a bit so isManagementCluster is only true when the cluster reallty is a management cluster.
  • Your problem: if isManagementCluster is true metrics are indeed filtered by _mc_id/mc_name; otherwise they are filtered by _id. Is this intentional? Yes it is but remark that this filtering only happens if the -f/--filter flag is set. The idea of the --filter option if to show only the metrics for which we are 100% sure they concern the given cluster. For a HCP cluster, metrics with a _mc_id/mc_name matching the MC cluster for that HCP cluster may concern other HCP clusters.

Comment thread cmd/rhobs/logsCmd.go
var isPrintingTimestamp bool
var printedFieldsStr string

nowTime := time.Now()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nowTime is captured when newCmdLogs() builds the cobra command, not when RunE executes. If there is any delay between CLI init and command execution, the "30 minutes ago" default and endTime will be stale. Consider moving nowTime := time.Now() inside the RunE function.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The delay is negligeable.
For the sake of simplicity & avoid calling time.Now() multiple times with a time a bit different each time which may confuse some logic, I prefer to call it only once at the command creation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants