Feature- Tests#10
Conversation
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
|
Warning Review limit reached
More reviews will be available in 28 minutes and 35 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (9)
📝 WalkthroughWalkthroughThis PR introduces a comprehensive model testing and performance infrastructure. It adds snapshot-based validation tests for three model types (binary classifier, semantic segmentation, embeddings), a performance benchmarking tool, reference data generation, and Ray Serve HTTP endpoints to execute tests remotely. Kubernetes infrastructure provisions shared storage for test reference data. ChangesModel Testing and Performance Infrastructure
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request introduces a test runner and a throughput benchmark runner for Ray Serve deployments, along with Helm chart configurations, PVC setups, and snapshot tests for model validation. The review feedback suggests several improvements: moving dynamic pip installations from the deployment initialization to the runtime_env configuration, returning proper HTTP error status codes (e.g., 500) upon test failures, avoiding hardcoded paths by introducing a helper to dynamically resolve the test references directory, and explicitly handling ValueError in the readiness check to prevent infinite retry loops.
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (6)
tests/perf_throughput.py (1)
41-45: ⚡ Quick winEliminate code duplication by importing from
_shared.py.This function duplicates
_models_base_urlfromtests/model_snapshots/_shared.py. Maintaining the same logic in two places increases the risk of inconsistency if the default URL changes.♻️ Refactor to import from shared module
+from tests.model_snapshots._shared import _models_base_url + import numpy as np from rationai import Client -def _models_base_url() -> str: - return os.environ.get( - "MODEL_SERVICE_MODELS_BASE_URL", - "http://rayservice-model-serve-svc.rationai-jobs-ns.svc.cluster.local:8000", - ) - - def make_pool(tile_size: int, n: int) -> list[np.ndarray]:🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/perf_throughput.py` around lines 41 - 45, Remove the duplicate _models_base_url function and import it from the shared module instead; update tests/perf_throughput.py to add an import from tests.model_snapshots._shared (or the correct module path) and use the imported _models_base_url function where needed so the default URL is maintained in one place (reference symbol: _models_base_url in tests/model_snapshots/_shared.py and its usages in tests/perf_throughput.py).builders/throughput_runner.py (2)
14-18: ⚡ Quick winUnnecessary
pytestinstall.
run()invokestests/perf_throughput.pydirectly viasys.executable, not through pytest, so installingpytesthere serves no purpose and just adds startup cost/failure risk. Drop it, or install the actual depsperf_throughput.pyneeds (e.g.numpy) instead.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@builders/throughput_runner.py` around lines 14 - 18, The __init__ in builders/throughput_runner.py currently runs subprocess.run to pip-install pytest, which is unnecessary because perf_throughput.py is invoked directly via sys.executable rather than via pytest; remove the pytest install call from the __init__ (or replace it with installing real runtime deps perf_throughput.py requires, e.g., numpy) so that the subprocess.run([... "pip", "install", "pytest" ...]) invocation is deleted/updated and the class initializer no longer performs an unneeded pip install.
51-55: ⚡ Quick winAdd a timeout to the benchmark subprocess.
Same pattern as
builders/test_runner.py:subprocess.runhas notimeout, so if the benchmark hangs it blocks the single replica. A hard cap slightly aboveduration_s(e.g.duration_s + buffer) plusTimeoutExpiredhandling would bound it.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@builders/throughput_runner.py` around lines 51 - 55, The subprocess.run call currently has no timeout and can hang; add a timeout parameter set to duration_s + a small buffer (e.g. buffer seconds) when calling subprocess.run(cmd, capture_output=True, text=True, timeout=duration_s+buffer) and wrap the call in a try/except that catches subprocess.TimeoutExpired (use the same pattern as builders/test_runner.py): on timeout assemble output from the exception's stdout/stderr (or note that it timed out), and return a Response (text/plain) describing the timeout and partial output (set an appropriate HTTP status like 504 if the test_runner uses that) so the replica is bounded and the caller sees the timeout.helm/rayservice/applications/throughput-test.yaml (1)
1-3: 💤 Low valueNaming inconsistency:
throughput-runnervsthroughput-test.The app
nameisthroughput-runner, but the file and thevalues.yamlapplications entry arethroughput-test, whiletest-runneris consistent across all three. Align the names to avoid confusion when correlating values entries to app configs.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@helm/rayservice/applications/throughput-test.yaml` around lines 1 - 3, The app config in throughput-test.yaml uses name: throughput-runner which mismatches the file name and the values.yaml applications entry (throughput-test) while other configs use test-runner; update the name field in the manifest to match the canonical identifier used in values.yaml and other configs (e.g., change name: throughput-runner to name: throughput-test or to test-runner depending on the canonical choice) so that builders.throughput_runner:app and route_prefix: /run-throughput remain unchanged and the values.yaml applications entry maps correctly.builders/test_runner.py (2)
14-18: ⚡ Quick winPrefer installing
pytestviaruntime_env/image instead of at replica startup.Installing inside
__init__adds cold-start latency and a network-dependent failure surface (check=Truewill crash replica init on any transient pip failure). Addingpytestto theruntime_env.piplist inhelm/rayservice/applications/test-runner.yaml(or baking it into the image) is more reliable.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@builders/test_runner.py` around lines 14 - 18, The __init__ method in builders.test_runner.py currently runs subprocess.run([sys.executable, "-m", "pip", "install", "pytest", "-q"], check=True) which causes cold-start latency and brittle failures; remove that subprocess.run call from the TestRunner.__init__ and stop installing pytest at replica startup, and instead add "pytest" to the runtime_env.pip list in the Helm manifest (helm/rayservice/applications/test-runner.yaml) or bake it into the container image so TestRunner starts without performing pip installs at init.
23-37: ⚡ Quick winAdd a timeout to the pytest subprocess.
subprocess.run(..., capture_output=True)has notimeout, so a hung or slow snapshot run will block this request — and, withnum_replicas=1/target_ongoing_requests: 1, the whole replica — indefinitely. Consider a bounded timeout and handlingTimeoutExpired.♻️ Suggested change
result = subprocess.run( [ sys.executable, "-m", "pytest", "tests/model_snapshots/", "-v", "--tb=short", "--no-header", "-s", "--color=no", ], capture_output=True, text=True, + timeout=600, )🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@builders/test_runner.py` around lines 23 - 37, The pytest subprocess call using subprocess.run(...) should include a bounded timeout and handle subprocess.TimeoutExpired; update the subprocess.run invocation in builders/test_runner.py to pass a sensible timeout (or a configurable constant) and wrap the call in a try/except that catches subprocess.TimeoutExpired, captures/uses the exception's stdout/stderr (or logs them), and returns/fails the test run with a clear timeout error instead of hanging indefinitely; reference the existing subprocess.run(...) call and the subprocess.TimeoutExpired exception for locating where to add the timeout and handler.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/model_snapshots/_shared.py`:
- Around line 186-194: Move the shape validation to occur before computing the
cosine similarity: check actual.shape == expected.shape (use the same case_name
for context) and call pytest.fail if they differ, then compute similarity via
np.dot and norms; also guard against zero norms by checking actual_norm and
expected_norm (from np.linalg.norm) and fail or handle the case explicitly to
avoid division-by-zero/nan behavior when computing similarity used against
min_cosine_similarity.
- Around line 96-119: The shape comparison should be performed immediately after
computing `actual` and loading `expected` so it fails with the clear message
instead of raising a NumPy error during subtraction; move the `if actual.shape
!= expected.shape: pytest.fail(f"Shape mismatch: expected={expected.shape},
actual={actual.shape}")` check to precede the `max_diff =
np.abs(actual.astype(np.float32) - expected.astype(np.float32)).max()` line (and
before any `stats_slice` slicing/aggregation), keeping the rest of the
`max_diff`, `stats_slice`, `min_val`, `mean_val`, `max_val`, and `frac_05`
computations unchanged.
In `@tests/perf_throughput.py`:
- Line 215: The help text for the CLI option "--models-base-url" is misleading
(it says default http://localhost:8000) — update the help string to reflect the
actual default returned by the helper _models_base_url() (the cluster URL
http://rayservice-model-serve-svc.rationai-jobs-ns.svc.cluster.local:8000 when
the MODEL_SERVICE_MODELS_BASE_URL env var is unset); locate the parser/arg
definition that sets help for "--models-base-url" in tests/perf_throughput.py
and change the descriptive text to mention the real default (or refer to
_models_base_url() / MODEL_SERVICE_MODELS_BASE_URL) so the help is accurate.
---
Nitpick comments:
In `@builders/test_runner.py`:
- Around line 14-18: The __init__ method in builders.test_runner.py currently
runs subprocess.run([sys.executable, "-m", "pip", "install", "pytest", "-q"],
check=True) which causes cold-start latency and brittle failures; remove that
subprocess.run call from the TestRunner.__init__ and stop installing pytest at
replica startup, and instead add "pytest" to the runtime_env.pip list in the
Helm manifest (helm/rayservice/applications/test-runner.yaml) or bake it into
the container image so TestRunner starts without performing pip installs at
init.
- Around line 23-37: The pytest subprocess call using subprocess.run(...) should
include a bounded timeout and handle subprocess.TimeoutExpired; update the
subprocess.run invocation in builders/test_runner.py to pass a sensible timeout
(or a configurable constant) and wrap the call in a try/except that catches
subprocess.TimeoutExpired, captures/uses the exception's stdout/stderr (or logs
them), and returns/fails the test run with a clear timeout error instead of
hanging indefinitely; reference the existing subprocess.run(...) call and the
subprocess.TimeoutExpired exception for locating where to add the timeout and
handler.
In `@builders/throughput_runner.py`:
- Around line 14-18: The __init__ in builders/throughput_runner.py currently
runs subprocess.run to pip-install pytest, which is unnecessary because
perf_throughput.py is invoked directly via sys.executable rather than via
pytest; remove the pytest install call from the __init__ (or replace it with
installing real runtime deps perf_throughput.py requires, e.g., numpy) so that
the subprocess.run([... "pip", "install", "pytest" ...]) invocation is
deleted/updated and the class initializer no longer performs an unneeded pip
install.
- Around line 51-55: The subprocess.run call currently has no timeout and can
hang; add a timeout parameter set to duration_s + a small buffer (e.g. buffer
seconds) when calling subprocess.run(cmd, capture_output=True, text=True,
timeout=duration_s+buffer) and wrap the call in a try/except that catches
subprocess.TimeoutExpired (use the same pattern as builders/test_runner.py): on
timeout assemble output from the exception's stdout/stderr (or note that it
timed out), and return a Response (text/plain) describing the timeout and
partial output (set an appropriate HTTP status like 504 if the test_runner uses
that) so the replica is bounded and the caller sees the timeout.
In `@helm/rayservice/applications/throughput-test.yaml`:
- Around line 1-3: The app config in throughput-test.yaml uses name:
throughput-runner which mismatches the file name and the values.yaml
applications entry (throughput-test) while other configs use test-runner; update
the name field in the manifest to match the canonical identifier used in
values.yaml and other configs (e.g., change name: throughput-runner to name:
throughput-test or to test-runner depending on the canonical choice) so that
builders.throughput_runner:app and route_prefix: /run-throughput remain
unchanged and the values.yaml applications entry maps correctly.
In `@tests/perf_throughput.py`:
- Around line 41-45: Remove the duplicate _models_base_url function and import
it from the shared module instead; update tests/perf_throughput.py to add an
import from tests.model_snapshots._shared (or the correct module path) and use
the imported _models_base_url function where needed so the default URL is
maintained in one place (reference symbol: _models_base_url in
tests/model_snapshots/_shared.py and its usages in tests/perf_throughput.py).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f1bad222-0af5-4760-aa06-357b910d5716
📒 Files selected for processing (19)
builders/test_runner.pybuilders/throughput_runner.pyhelm/rayservice/applications/test-runner.yamlhelm/rayservice/applications/throughput-test.yamlhelm/rayservice/values.yamlhelm/rayservice/workers/cpu-workers.yamlhelm/rayservice/workers/mig20-workers.yamlpvc/model-test-refs-pvc.yamlpyproject.tomltests/__init__.pytests/model_snapshots/__init__.pytests/model_snapshots/_shared.pytests/model_snapshots/generate_references.pytests/model_snapshots/new_images.txttests/model_snapshots/test_binary_classifier_model_snapshot.pytests/model_snapshots/test_prov_gigapath_model_snapshot.pytests/model_snapshots/test_semantic_segmentation_model_snapshot.pytests/model_snapshots/test_virchow2_model_snapshot.pytests/perf_throughput.py
Summary by CodeRabbit
Release Notes
New Features
Tests
Chores