[metal] Implement MULTI_DRAW_INDIRECT_COUNT via compute shader emulation #9659

Open
Bromles wants to merge 6 commits into gfx-rs:trunk from Bromles:metal-multi_draw_indirect_count

Bromles commented Jun 10, 2026

Connections
None

Description
Metal has no native equivalent of draw_indirect_count / draw_indexed_indirect_count, so the MULTI_DRAW_INDIRECT_COUNT feature was not advertised on Metal. This PR emulates it so that the feature is available on all primary native backends.

The emulation runs a compute shader before the render pass. The shader reads the count buffer and copies that many draw commands from the source indirect buffer into a temporary buffer; commands beyond the count are zeroed so that the corresponding draws are no-ops. The render pass then issues regular draw_indirect calls against the temporary buffer.
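
To illustrate the copy-and-zero step, here is a minimal CPU-side model in Rust. It is a sketch of the idea, not the shader in this PR: `DrawArgs` and `emulate_count` are hypothetical names, the struct only mirrors the four-u32 layout of non-indexed indirect arguments, and clamping the count to `max_count` is an assumption about how out-of-range counts are handled.

```rust
/// Non-indexed indirect draw arguments (four u32 values).
/// Defined locally for this sketch; not a type from the PR.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
struct DrawArgs {
    vertex_count: u32,
    instance_count: u32,
    first_vertex: u32,
    first_instance: u32,
}

/// CPU model of the emulation pass (hypothetical name).
/// Produces exactly `max_count` commands: the first `count` are copied
/// from `src`, the rest are zeroed so that drawing them does nothing.
fn emulate_count(src: &[DrawArgs], count: u32, max_count: u32) -> Vec<DrawArgs> {
    // Assumption: a count larger than max_count is clamped.
    let effective = count.min(max_count) as usize;
    (0..max_count as usize)
        .map(|slot| {
            if slot < effective {
                // Copy the live command; a missing source entry becomes a no-op.
                src.get(slot).copied().unwrap_or_default()
            } else {
                // All-zero arguments: zero vertices and zero instances.
                DrawArgs::default()
            }
        })
        .collect()
}
```

In this model, each loop iteration plays the role of one compute invocation handling one command slot, and the caller would then issue `max_count` ordinary indirect draws from the returned buffer.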

I considered Indirect Command Buffers (ICBs), but a Metal ICB cannot take its draw count from the GPU: the command count must be known on the CPU at ICB encoding time. That would require a GPU-to-CPU sync to read back the count buffer before encoding, which is too expensive to do per multi_draw_indirect_count call.

Future interaction with draw_index: wgpu already has SHADER_DRAW_INDEX for Vulkan and GLES, but Metal does not support it yet. When Metal support is added, the emulation will need to inject the draw index into each command in the temp buffer so that the draw_indirect calls can expose it.

I used indirect_validation as a reference for the implementation: a device-level MultiDrawEmulation owns the pipeline and a temp buffer pool, while MultiDrawResources manages per-command-buffer resources and returns buffers to the pool on drop.
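
The ownership split described above can be sketched with a shared pool and a `Drop` implementation. This is only an illustration of the pattern under simplifying assumptions: `BufferPool` and `PassResources` are hypothetical stand-ins for MultiDrawEmulation and MultiDrawResources, a `Vec<u8>` stands in for a GPU buffer, and the pipeline is omitted.

```rust
use std::sync::{Arc, Mutex};

/// Stand-in for a GPU buffer; a real implementation would hold a Metal buffer.
type TempBuffer = Vec<u8>;

/// Device-level pool of reusable temporary buffers (hypothetical name).
#[derive(Clone, Default)]
struct BufferPool {
    free: Arc<Mutex<Vec<TempBuffer>>>,
}

impl BufferPool {
    /// Reuse a free buffer that is large enough, or allocate a new one.
    fn acquire(&self, size: usize) -> TempBuffer {
        let mut free = self.free.lock().unwrap();
        match free.iter().position(|b| b.len() >= size) {
            Some(i) => free.swap_remove(i),
            None => vec![0; size],
        }
    }

    /// Number of buffers currently available for reuse.
    fn free_count(&self) -> usize {
        self.free.lock().unwrap().len()
    }
}

/// Per-command-buffer resources (hypothetical name).
/// Buffers it holds go back to the pool when it is dropped.
struct PassResources {
    pool: BufferPool,
    in_use: Vec<TempBuffer>,
}

impl PassResources {
    fn new(pool: &BufferPool) -> Self {
        Self { pool: pool.clone(), in_use: Vec::new() }
    }

    /// Take a buffer from the pool, track it, and return its index.
    fn allocate(&mut self, size: usize) -> usize {
        let buf = self.pool.acquire(size);
        self.in_use.push(buf);
        self.in_use.len() - 1
    }
}

impl Drop for PassResources {
    fn drop(&mut self) {
        // Return every tracked buffer to the shared pool.
        let mut free = self.pool.free.lock().unwrap();
        free.append(&mut self.in_use);
    }
}
```

Tying the return to `Drop` means a buffer cannot be handed out again while the per-command-buffer object that tracks it is still alive.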

Testing

  • GPU tests in tests/tests/wgpu-gpu/draw_indirect.rs for indexed/non-indexed and partial count cases
  • Visual example multi_draw_indirect_count
  • All new tests pass on Metal

Squash or Rebase?
Squash

Checklist

  • I self-reviewed and fully understand this PR.
  • WebGPU implementations built with wgpu may be affected behaviorally.
  • Validation and feature gates are in place to confine behavioral changes.
  • Tests demonstrate the validation and altered logic works.
  • CHANGELOG.md entries for the user-facing effects of this change are present.
  • The PR is minimal, and doesn't make sense to land as multiple PRs.
  • Commits are logically scoped and individually reviewable.
  • The PR description has enough context to understand the motivation and solution implemented.
