[pull] master from ray-project:master by pull[bot] · Pull Request #1076 · garymm/ray

pull · 2026-06-16T07:18:16Z

See Commits and Changes for more details.

Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

…wn (#64100)

## Description

Convert five contributor-facing **Developer Guides** pages in `doc/source/ray-contribute/` from reStructuredText to MyST Markdown, matching the conventions already used by `docs.md` and `agent-development.md` in the same directory. `doc/.claude/CLAUDE.md` declares MyST Markdown the standard for new doc pages and a lint check rejects newly added `.rst`, so this continues the in-progress migration of the section to consistent Markdown that's friendlier for humans and agents to edit.

Pages converted:

- `index`
- `ci`
- `api-policy`
- `debugging`
- `fake-autoscaler`

**Faithful format conversion only, with no content restructuring.** RST constructs map to their MyST equivalents:

| RST | MyST |
|---|---|
| `.. meta::` / `:description:` | YAML frontmatter (`myst.html_meta.description`) |
| `.. _label:` | `(label)=` |
| `:ref:` | `{ref}` |
| `.. code-block:: LANG` | fenced ` ```LANG ` |
| `.. list-table::` | `{list-table}` |
| `.. literalinclude::` | `{literalinclude}` |
| `.. autoclass::` | wrapped in `{eval-rst}` |

All cross-reference labels (`api-policy`, `backend-logging`, `fake-multinode`, `fake-multinode-docker`) are preserved exactly, so existing `{ref}` callers keep resolving. Toctree entries are extensionless and unchanged; the parent toctree entry in `index.rst` and the `.html` redirect targets in `doc/redirects/current.yaml` are unaffected, because Sphinx resolves extensionless toctree entries to whichever source exists and HTML output paths are identical regardless of source format.

## Related issues

None.

## Additional information

Verified that every external reference resolves before pushing: the `{literalinclude}` targets (`src/ray/util/logging.h`, `python/ray/tests/test_autoscaler_fake_multinode.py` example markers), the `{eval-rst}` `autoclass` targets (`AutoscalingCluster`, `DockerCluster`), and the `{ref}` targets (`api-stability`, `temp-dir-log-files`) all exist. No bare `.md`→`.rst` whole-doc links were introduced.

Relying on the Read the Docs PR preview build for the authoritative Sphinx warning check.

Signed-off-by: Douglas Strodtman <douglas@anyscale.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
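To make two rows of the mapping concrete, here is a minimal sketch of the label and `:ref:` conversions. It is not the tool used for this migration (the commit message does not say how the conversion was done); `rst_to_myst` and its regexes are hypothetical and deliberately ignore every other RST construct.

```python
import re

# Hypothetical helper for illustration only. It handles just two rows of the
# mapping: `.. _label:` -> `(label)=` and :ref:`target` -> {ref}`target`.
_LABEL = re.compile(r"^\.\. _([^:]+):[ \t]*$", re.MULTILINE)
_REF = re.compile(r":ref:`([^`]+)`")


def rst_to_myst(text: str) -> str:
    """Rewrite RST labels and :ref: roles into their MyST equivalents."""
    text = _LABEL.sub(r"(\1)=", text)
    text = _REF.sub(r"{ref}`\1`", text)
    return text


sample = ".. _api-policy:\n\nSee :ref:`api-stability` for details.\n"
converted = rst_to_myst(sample)
print(converted)
```

Because the label name passes through unchanged, any existing reference to it would still resolve, which is the property the description relies on.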

…o MyST (#64115)

## Description

Convert the two doctest-heavy contributor-facing **Developer Guides** pages in `doc/source/ray-contribute/` from reStructuredText to MyST Markdown, matching the conventions already used by `docs.md` and `agent-development.md`. Continues the in-progress migration of the section to MyST (the declared standard for new pages).

Pages converted:

- `writing-code-snippets`
- `testing-tips`

**Faithful format conversion only, with no content restructuring.**

`writing-code-snippets` is a meta-doc that both *uses* and *demonstrates* testcode/doctest blocks, so the literal-vs-executed distinction is preserved carefully:

- The two real examples rendered under "They're rendered like this" (the `is_even` *doctest-style* and *code-output-style* blocks) become `{doctest}` and `{testcode}` + `{testoutput}`, so they stay covered by the docs doctest rule (which globs `source/**/*.md`).
- Every block shown as *syntax to copy* stays a plain literal fence, so it renders but is never executed.
- The two real `{literalinclude}` directives (of `doc_code/example_module.py`) and the `writing-code-snippets_ref` label are preserved.

`testing-tips` is **excluded** from the doctest rule, so its 6 `{testcode}` blocks render without executing. Its `doc/BUILD.bazel` exclude entry is updated from `testing-tips.rst` → `testing-tips.md` so it stays excluded; otherwise the `source/**/*.md` glob would pull it into doctest and its un-imported `ray` references would fail.

Toctree entries are extensionless and unchanged; `getting-involved.rst`'s `{ref}` to `writing-code-snippets_ref` and `docs.md`'s extensionless links keep resolving; the `testing-tips.html` redirect target is unaffected.

## Related issues

None.

## Additional information

Verified locally: the `{literalinclude}` target and `is_even` markers resolve; the two executed `is_even` blocks produce `True`/`False`; the GPU-steps numbered list keeps its nested code fences (rendered via markdown-it).

The RtD preview build confirms the page builds; the docs doctest CI target exercises the `writing-code-snippets` executed blocks.

Signed-off-by: Douglas Strodtman <douglas@anyscale.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
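The literal-vs-executed distinction can be illustrated with Python's standard `doctest` module: the same snippet text is inert as a plain string and is only checked once it is parsed and run. This is a sketch under assumptions; the `is_even` body below is a guess at what such an example might look like, not the contents of `doc_code/example_module.py`, and the docs build uses Sphinx directives rather than this direct API.

```python
import doctest


def is_even(x: int) -> bool:
    """Assumed implementation, for illustration only."""
    return x % 2 == 0


# As a plain string this is "literal": it can be displayed but nothing runs.
SNIPPET = """
>>> is_even(0)
True
>>> is_even(3)
False
"""

# "Executed": parse the same text into a DocTest and run it, comparing the
# actual output of each example against the expected output in the text.
test = doctest.DocTestParser().get_doctest(
    SNIPPET, globs={"is_even": is_even}, name="is_even_snippet",
    filename=None, lineno=0,
)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results.attempted, "examples run,", results.failed, "failed")
```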

…#64118)

The `x-request-id` HTTP header is populated by `RequestIdMiddleware`, which is always added to the ASGI app. This PR adds a comment describing that and guards the code with a safe default value in case the header is ever removed or missing.

---------

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>
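A minimal sketch of the defensive lookup described above, assuming the usual ASGI convention that headers arrive as a list of `(name, value)` byte pairs. The function name `get_request_id` and the empty-string default are hypothetical; the commit message does not say which default Serve uses.

```python
from typing import Iterable, Tuple


def get_request_id(
    headers: Iterable[Tuple[bytes, bytes]], default: str = ""
) -> str:
    """Return the x-request-id header value, or `default` if it is absent."""
    for name, value in headers:
        if name.lower() == b"x-request-id":
            return value.decode("latin-1")
    # Hypothetical fallback: the header is normally set by middleware, but the
    # lookup should not raise if that ever changes.
    return default


with_header = get_request_id([(b"host", b"example"), (b"x-request-id", b"abc-123")])
without_header = get_request_id([(b"host", b"example")])
print(repr(with_header), repr(without_header))
```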

Remove the helper in `default_impl.py` that lazily imports proxy modules, and import those modules directly at the top level of `controller.py`.

---------

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

…er-stage stats (#63933)

## Why

Two sources of run-to-run noise in the Serve release-test locust harness, surfaced while debugging the serve-perf dashboard:

1. **Latency is quantized.** Locust's `bucket_response_time` rounds response times to ~2 significant figures (1 ms below 100 ms) to bound its percentile histogram for distributed mode. For sub-10 ms Serve latencies that 1 ms floor dominates the signal: percentiles land on integer-ms steps and sub-ms differences are invisible.
2. **Per-stage RPS/percentiles are trailing-window snapshots.** `on_stage_finished` read `current_rps` and `get_current_response_time_percentile`, which are Locust's instantaneous trailing-~10 s values sampled at the stage boundary, far noisier than aggregating the whole stage.

## What

- **Finer bucketing:** install a 0.1 ms bucketer (below 100 ms; coarser above to keep keys bounded) in the worker and master processes.
- **Per-stage stats from cumulative-snapshot diffs:** report each stage's **average RPS** (requests in stage / duration) and **full-stage percentiles** (over the stage's own histogram, obtained by differencing the cumulative histogram at stage boundaries with locust's `diff_response_time_dicts` + `calculate_response_time_percentile`, so per-stage percentiles match locust's end-of-test report). Overall end-of-test metrics are unchanged.

This is benchmark/release-test tooling, so no unit test is added; the logic is exercised when the release test runs. `on_stage_finished` resolves the locust helpers at runtime, where the release env already ships locust.

## Notes

- Master-side aggregation lags worker reports by a few seconds, so a stage boundary's cumulative count is approximate, still far better than the 10 s window.
- Part of a broader serve-perf noise-reduction effort; dashboard changes (median/IQR/rolling) are handled separately.

---------

Signed-off-by: Seiji Eicher <seiji@anyscale.com>
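The two ideas above can be sketched in plain Python. This is an illustration, not the harness code: it does not call locust, the helper names are hypothetical stand-ins for the locust functions named in the description, the 1 ms tier at or above 100 ms is an assumption (the description only says "coarser"), and the percentile rule here is a simple cumulative scan that may differ in detail from locust's.

```python
from typing import Dict

Histogram = Dict[float, int]  # bucketed latency in ms -> request count


def bucket_response_time(ms: float) -> float:
    """0.1 ms buckets below 100 ms; 1 ms buckets above (assumed coarser tier)."""
    return round(ms, 1) if ms < 100 else float(round(ms))


def diff_histograms(end: Histogram, start: Histogram) -> Histogram:
    """Histogram of one stage = cumulative at stage end minus cumulative at start."""
    diff = {k: n - start.get(k, 0) for k, n in end.items()}
    return {k: n for k, n in diff.items() if n > 0}


def percentile(hist: Histogram, q: float) -> float:
    """Smallest bucket at or below which at least a fraction q of requests fall."""
    total = sum(hist.values())
    if total == 0:
        return 0.0
    seen = 0
    for bucket in sorted(hist):
        seen += hist[bucket]
        if seen >= q * total:
            return bucket
    return max(hist)


def stage_stats(start: Histogram, end: Histogram, duration_s: float) -> dict:
    """Average RPS and full-stage percentiles for one stage."""
    stage = diff_histograms(end, start)
    requests = sum(stage.values())
    return {
        "avg_rps": requests / duration_s,
        "p50": percentile(stage, 0.50),
        "p99": percentile(stage, 0.99),
    }


def record(hist: Histogram, samples_ms) -> Histogram:
    """Return a new cumulative histogram with the samples added."""
    out = dict(hist)
    for ms in samples_ms:
        key = bucket_response_time(ms)
        out[key] = out.get(key, 0) + 1
    return out


after_stage_1 = record({}, [1.23, 1.27, 4.96])
after_stage_2 = record(after_stage_1, [2.41, 2.44, 2.48, 9.02])
stats = stage_stats(after_stage_1, after_stage_2, duration_s=2.0)
print(stats)
```

With 1 ms buckets the three samples near 2.4 ms would all collapse onto `2`, which is the quantization the first point describes.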

We haven't been maintaining `text_embeddings_benchmark` (in `release_tests.yaml`), and it's similar to `text_embedding` (in `release_data_tests.yaml`). I don't think we should have duplicate release tests or code we don't maintain. So, I'm removing `text_embeddings_benchmark`. I'm choosing to delete it over `text_embedding` because I know `text_embedding` is closely based on a real user's production workload. Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>

## Description

`from_blocks` is a utility function that creates a dataset from the user-provided input blocks. It's useful for testing and reproduction scripts because it guarantees that the dataset starts with the exact given inputs without any unexpected changes. In this PR, I'm exporting it from the `ray.data` package so that linters don't complain.

<img width="323" height="100" alt="image" src="https://github.com/user-attachments/assets/b63e3f73-f62e-4320-9a0d-95eca2486569" />

## Related issues

## Additional information

Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>

…64122) Ray Data can't prevent out-of-memory errors unless you specify hints for high-memory UDFs. Since the `heterogeneous_memory_batch_inference` release test uses large batch sizes that require multiple GiBs of heap memory, I'm adding `memory` hints in this PR. There is room to optimize the memory use for these UDFs (in theory, they only need ~2 GiB rather than the ~4 GiB I observed), but I'm leaving that out of scope for now since I think there are higher priority improvements to make. Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>

When the request context contains a `request_id`, use it. Otherwise, generate a random ID.

---------

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>
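The fallback rule can be sketched as follows. `resolve_request_id` is a hypothetical name, and using `uuid.uuid4()` for the random ID is an assumption; the commit message does not say how Serve generates it.

```python
import uuid
from typing import Optional


def resolve_request_id(context_request_id: Optional[str]) -> str:
    """Prefer the caller-supplied ID; otherwise fall back to a random one."""
    if context_request_id:
        return context_request_id
    # Assumption: any random unique string works; a UUID4 is one common choice.
    return str(uuid.uuid4())


custom = resolve_request_id("my-trace-42")
generated = resolve_request_id(None)
print(custom, generated)
```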

…th (#64089)

## Description

`download(filesystem=pyarrow.fs.S3FileSystem(anonymous=True))` silently performs a *credentialed* read when obstore is the backend. The obstore download path re-derives S3 credentials from the supplied filesystem, but a native PyArrow `S3FileSystem` exposes only `region` to Python -- the credential configuration (anonymous, explicit keys, assume-role) lives in the underlying C++ object and cannot be read back.

The solution is to read the information from the `__reduce__`'d object if possible. If we didn't at least try to do that, we'd always fall back to Ray Data's threaded implementation.

Small repro to illustrate:

```python
import pyarrow.fs as pafs

from ray.data._internal.planner._obstore_download import (
    _extract_credentials_from_filesystem,
)

fs = pafs.S3FileSystem(anonymous=True, region="us-west-2")
kwargs = _extract_credentials_from_filesystem(fs)

print("anonymous attr exposed to Python:", getattr(fs, "anonymous", "MISSING"))
print("obstore kwargs Ray derives :", kwargs)
print("skip_signature present :", "skip_signature" in kwargs)
# master -> {'region': 'us-west-2'} (no skip_signature -> obstore SIGNS)
# fixed -> {'region': 'us-west-2', 'skip_signature': True} (unsigned -> anonymous)
```

Signed-off-by: Artur Niederfahrenhorst <artur@anyscale.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
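The general idea of recovering constructor options through `__reduce__` can be sketched without PyArrow. Everything below is a stand-in: `FakeS3FileSystem` only imitates an object that exposes `region` as an attribute while keeping the rest of its configuration private, `extract_kwargs` is a hypothetical name, and the shape of the reduced state (a single dict of constructor arguments) is an assumption rather than a statement about PyArrow's pickling format.

```python
class FakeS3FileSystem:
    """Stand-in: only `region` is readable; other options are hidden."""

    def __init__(self, **opts):
        self.__opts = dict(opts)  # name-mangled, so not reachable via getattr

    @property
    def region(self):
        return self.__opts.get("region")

    @classmethod
    def _reconstruct(cls, kwargs):
        return cls(**kwargs)

    def __reduce__(self):
        # Pickle support: (callable, args) where args carries the options.
        return (FakeS3FileSystem._reconstruct, (dict(self.__opts),))


def extract_kwargs(fs) -> dict:
    """Derive store kwargs, consulting the pickled state for hidden options."""
    out = {}
    region = getattr(fs, "region", None)
    if region:
        out["region"] = region
    try:
        _, args = fs.__reduce__()[:2]
    except Exception:
        return out  # state unavailable: return only what attributes expose
    state = args[0] if args and isinstance(args[0], dict) else {}
    if state.get("anonymous"):
        out["skip_signature"] = True  # anonymous access means unsigned requests
    return out


anon = extract_kwargs(FakeS3FileSystem(anonymous=True, region="us-west-2"))
signed = extract_kwargs(FakeS3FileSystem(region="us-west-2"))
print(anon, signed)
```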

The code was previously duplicated; refactor it into a shared module.

---------

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

dstrodtman and others added 11 commits June 15, 2026 18:27

[serve] Eager load proxy imports (#64113)

d3cb285

Remove the helper in `default_impl.py` that lazily imports proxy modules, and import those modules directly at the top level of `controller.py`.

---------

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

[serve] Add custom request id for direct ingress grpc (#64112)

53bf52d

When the request context contains a `request_id`, use it. Otherwise, generate a random ID.

---------

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

[serve] Dedup ingress replica + proxy metrics code (#64041)

0be9ffa

The code was previously duplicated; refactor it into a shared module.

---------

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

pull Bot locked and limited conversation to collaborators Jun 16, 2026

pull Bot added the ⤵️ pull label Jun 16, 2026

pull Bot merged commit 0be9ffa into garymm:master Jun 16, 2026

pull[bot] merged 11 commits into garymm:master from ray-project:master
