feat: Enable PyTorch2 Batching Tests #8814

Open
mattwittwer wants to merge 13 commits into main from mwittwer/enable_pytorch2_batching
@mattwittwer mattwittwer commented Jun 2, 2026

Contributor

What does the PR do?

Makes the AOTI test models batch-capable and adds test coverage:

  • The simple add/sub model is exported with a dynamic batch dimension and max_batch_size: 8.
  • A new sequence (implicit-state accumulator) model and its config are added to gen_qa_implicit_models.py and wired into gen_qa_model_repository.
  • torch_aoti_infer_test.py gains batched inference cases (batch sizes 1, 4, and 8 across dtypes) and a sequence test class (single and interleaved sequences).
  • test.sh is updated to pull in the new models and run the new tests.
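
For readers unfamiliar with implicit-state sequence models, the interleaved-sequence case can be pictured with a toy stand-in. The sketch below is not the model or test added by this PR; `ToyAccumulator`, `run_interleaved`, and the sequence IDs are hypothetical names chosen for illustration, and the state is a plain Python dictionary rather than Triton-managed state. It only shows the property such a test would be expected to check: each sequence keeps its own running sum even when requests from different sequences alternate.

```python
class ToyAccumulator:
    """Hypothetical stand-in for a sequence model whose state is a running sum."""

    def __init__(self):
        # One accumulator value per live sequence ID.
        self._state = {}

    def infer(self, sequence_id, value, start=False, end=False):
        if start:
            # A start flag resets the state for this sequence.
            self._state[sequence_id] = 0
        if sequence_id not in self._state:
            raise KeyError(f"sequence {sequence_id} was never started")
        self._state[sequence_id] += value
        result = self._state[sequence_id]
        if end:
            # An end flag releases the state for this sequence.
            del self._state[sequence_id]
        return result


def run_interleaved(model, sequences):
    """Send one request per sequence in round-robin order and collect outputs."""
    outputs = {seq_id: [] for seq_id in sequences}
    longest = max(len(values) for values in sequences.values())
    for step in range(longest):
        for seq_id, values in sequences.items():
            if step < len(values):
                out = model.infer(
                    seq_id,
                    values[step],
                    start=(step == 0),
                    end=(step == len(values) - 1),
                )
                outputs[seq_id].append(out)
    return outputs


# Two sequences of different lengths, interleaved request by request.
outputs = run_interleaved(ToyAccumulator(), {1001: [1, 2, 3], 1002: [10, 20]})
```

If state leaked between sequences, the running sums for sequence 1001 would include values sent to 1002, which is the failure mode an interleaved test is meant to catch.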

Checklist

  • PR title reflects the change and is of format <commit_type>: <Title>
  • Changes are described in the pull request.
  • Related issues are referenced.
  • Populated github labels field
  • Added test plan and verified test passes.
  • Verified that the PR passes existing CI.
  • Verified copyright is correct on all changed files.
  • Added succinct git squash message before merging ref.
  • All template sections are filled out.
  • Optional: Additional screenshots for behavior/output changes with before/after.

Commit Type:

Check the conventional commit type box here and add the label to the github PR.

  • build
  • ci
  • docs
  • feat
  • fix
  • perf
  • refactor
  • revert
  • style
  • test

Related PRs:

triton-inference-server/pytorch_backend#196

Where should the reviewer start?

Test plan:

  • CI Pipeline ID:

Caveats:

Background

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

mattwittwer self-assigned this Jun 2, 2026
mattwittwer changed the title from "draft: enable pytorch2 batching" to "feat: enable pytorch2 batching" Jun 4, 2026
mattwittwer changed the title from "feat: enable pytorch2 batching" to "feat: Enable PyTorch2 Batching" Jun 4, 2026
mattwittwer changed the title from "feat: Enable PyTorch2 Batching" to "feat: Enable PyTorch2 Batching Tests" Jun 4, 2026
Comment thread qa/common/gen_qa_implicit_models.py Fixed
Comment thread qa/common/gen_qa_implicit_models.py Fixed
Comment thread qa/common/gen_qa_implicit_models.py Fixed
Comment thread qa/common/gen_qa_models.py Fixed
Comment thread qa/common/gen_qa_models.py Fixed

Copilot AI left a comment

Contributor

Pull request overview

This PR expands the PyTorch AOTInductor (PT2 / torch_aoti) QA assets to support and validate batching, including dynamic batching behavior, plus adds new sequence-batching models and corresponding L0 test coverage.

Changes:

  • Export AOTI models (simple add/sub + torchvision) with a dynamic batch dimension and configure max_batch_size: 8 + dynamic batching.
  • Add new AOTI batching-coverage models (variable non-batch dim, multi-instance) and AOTI sequence-batching models (including forward-interface + initial_state + negative-load variants).
  • Extend L0_torch_aoti tests to cover batched inference, dynamic batching coalescing, multi-instance correctness, variable-shape batching, sequence scheduling, and negative load-failure checks.
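
As a rough illustration of the first bullet, a generator script could render a batching-enabled config along the following lines. This is a minimal sketch, not the code in gen_qa_models.py: `render_addsub_config`, the model name, the tensor names (`INPUT__0`, `OUTPUT__0`, and so on), and the dims are assumptions, and the backend or platform field is deliberately left out because the source does not state it. With `max_batch_size` greater than zero, Triton treats the batch dimension as implicit, so `dims` lists only the non-batch dimensions.

```python
def render_addsub_config(name, dtype="TYPE_FP32", dims=(16,), max_batch_size=8):
    """Render a config.pbtxt fragment for a two-input, two-output add/sub model.

    All names and shapes here are illustrative assumptions.
    """
    dims_text = ", ".join(str(d) for d in dims)
    # Doubled braces are f-string escapes and render as single braces.
    return f'''name: "{name}"
max_batch_size: {max_batch_size}
input [
  {{ name: "INPUT__0", data_type: {dtype}, dims: [ {dims_text} ] }},
  {{ name: "INPUT__1", data_type: {dtype}, dims: [ {dims_text} ] }}
]
output [
  {{ name: "OUTPUT__0", data_type: {dtype}, dims: [ {dims_text} ] }},
  {{ name: "OUTPUT__1", data_type: {dtype}, dims: [ {dims_text} ] }}
]
dynamic_batching {{ }}
'''


# Hypothetical model name, used only for this example.
config = render_addsub_config("aoti_addsub_example")
print(config)
```

An empty `dynamic_batching { }` block enables the dynamic batcher with default settings; a real config may also set preferred batch sizes or a queue delay.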

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| qa/L0_torch_aoti/torch_aoti_infer_test.py | Adds batched inference cases, dynamic batching coalescing checks, variable-shape and multi-instance coverage, and new sequence-batching tests. |
| qa/L0_torch_aoti/test.sh | Adds additional models to the repo setup, pulls in sequence models, and runs a new negative load-failure phase. |
| qa/common/gen_qa_models.py | Exports AOTI models with dynamic batch dims, sets max_batch_size, and adds new batching-coverage model generators. |
| qa/common/gen_qa_model_repository | Wires AOTI implicit-sequence model generation into the model repository build step. |
| qa/common/gen_qa_implicit_models.py | Implements AOTI sequence model + configs (including variants and negative configs) and adds a --torch-aoti flag. |
Comments suppressed due to low confidence (1)

qa/L0_torch_aoti/test.sh:197

  • The redirection operator is incorrect (&1>2). This does not redirect output to stderr as intended; use 1>&2 so the test runner properly captures failures.
if [[ ${RET} -ne 0 ]]; then
    echo -e "${COLOR_ERROR}\n***\n*** Test Suite FAILED\n***${COLOR_RESET}" &1>2
else

Comment thread qa/L0_torch_aoti/test.sh Outdated
Comment thread qa/L0_torch_aoti/test.sh
Comment thread qa/L0_torch_aoti/torch_aoti_infer_test.py Outdated
Comment thread qa/L0_torch_aoti/test.sh Outdated
kill -s SIGINT ${SERVER_PID}
wait ${SERVER_PID} || true
fi
rm -rf ${BAD_MODELDIR}

Contributor

Remove the model directories at the top of the script instead, so that they can be inspected after the test completes.

Contributor Author

whoisj previously approved these changes Jun 9, 2026

@whoisj whoisj left a comment

Contributor

LGTM

max_sequence_idle_microseconds: 5000000
control_input [
{{
name: "INPUT__2"

Contributor

We should probably also have a test that uses the other parameter naming schema.

Contributor Author

Added test_forward_interface_sequence to cover the ARGS[...]/RESULT[...] schema.
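
The doubled braces in the reviewed fragment suggest that the config text is rendered from a Python format template, where `{{` and `}}` are escapes for literal braces. The sketch below shows how such a template could produce a `sequence_batching` block. It is an illustration under assumptions, not the template in gen_qa_implicit_models.py: the template name, the choice of `CONTROL_SEQUENCE_START` as the control kind for `INPUT__2`, and the `int32_false_true` values are all hypothetical; only the idle timeout and the input name come from the fragment.

```python
# Hypothetical template; doubled braces render as literal braces.
SEQUENCE_BATCHING_TEMPLATE = """sequence_batching {{
  max_sequence_idle_microseconds: {idle_us}
  control_input [
    {{
      name: "{start_name}"
      control [
        {{
          kind: CONTROL_SEQUENCE_START
          int32_false_true: [ 0, 1 ]
        }}
      ]
    }}
  ]
}}
"""

# Values taken from the reviewed fragment; the control kind is an assumption.
rendered = SEQUENCE_BATCHING_TEMPLATE.format(
    idle_us=5000000,
    start_name="INPUT__2",
)
print(rendered)
```

A test for the other naming schema would presumably render the same structure with different tensor names, so keeping the names as template parameters makes both variants share one template.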

yinggeh commented Jun 9, 2026

Contributor

@whoisj Should we also test the dynamic and sequence batching functionality in L0_batcher and L0_sequence_batcher?

whoisj commented Jun 9, 2026

Contributor

We should, but we should NOT block this PR because of it.

Labels

None yet

5 participants