Skip to content

[AIMIGRAPHX-828] Cross Compile Pt 2: construct gpu target with gfx number to enable cross compilation#4795

Merged
causten merged 46 commits into
developfrom
cross_compile_ctx
Jun 2, 2026
Merged

[AIMIGRAPHX-828] Cross Compile Pt 2: construct gpu target with gfx number to enable cross compilation#4795
causten merged 46 commits into
developfrom
cross_compile_ctx

Conversation

@kahmed10
Copy link
Copy Markdown
Collaborator

@kahmed10 kahmed10 commented Apr 17, 2026

Motivation

builds on top of part 1: #4771

Technical Details

  • adds option to construct target with gfx arch string (ex. "gfx1100"), number of CUs, number of chiplets.
  • when constructed this way, benchmarking kernels in compile_ops and finalize() will be disabled.
  • will not work for MIOpen/hipblaslt/CK.

Changelog Category

Add a CHANGELOG.md entry for any option other than Not Applicable

    • Added: New functionality.
    • Changed: Changes to existing functionality.
    • Removed: Functionality or support that has been removed. (Compared to a previous release)
    • Optimized: Component performance that has been optimized or improved.
    • Resolved Issues: Known issues from a previous version that have been resolved.
    • Not Applicable: This PR is not to be included in the changelog.

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 17, 2026

Codecov Report

❌ Patch coverage is 43.28358% with 38 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/api/api.cpp 0.00% 15 Missing ⚠️
src/register_target.cpp 0.00% 15 Missing ⚠️
src/include/migraphx/target.hpp 85.00% 3 Missing ⚠️
src/program.cpp 40.00% 3 Missing ⚠️
src/include/migraphx/register_target.hpp 0.00% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #4795      +/-   ##
===========================================
- Coverage    92.66%   92.56%   -0.10%     
===========================================
  Files          588      588              
  Lines        30397    30467      +70     
===========================================
+ Hits         28165    28200      +35     
- Misses        2232     2267      +35     
Files with missing lines Coverage Δ
src/api/include/migraphx/migraphx.hpp 99.03% <ø> (ø)
src/include/migraphx/context.hpp 83.52% <100.00%> (+2.03%) ⬆️
src/include/migraphx/register_target.hpp 80.00% <0.00%> (-20.00%) ⬇️
src/include/migraphx/target.hpp 78.82% <85.00%> (+7.16%) ⬆️
src/program.cpp 81.50% <40.00%> (-0.19%) ⬇️
src/api/api.cpp 72.41% <0.00%> (-0.83%) ⬇️
src/register_target.cpp 64.58% <0.00%> (-29.36%) ⬇️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@kahmed10 kahmed10 changed the title Cross Compile Pt 2: add env variable to enable cross compilation [AIMIGRAPHX-828] Cross Compile Pt 2: add env variable to enable cross compilation Apr 21, 2026
@kahmed10 kahmed10 requested review from CharlieL7 and pfultz2 April 21, 2026 20:25
@kahmed10
Copy link
Copy Markdown
Collaborator Author

@pfultz2 @CharlieL7 this is the implementation using environment variables. I'm thinking of instead adding it to compile options with an additional map that can be used to tweak number of CUs, chiplets, etc. Also let me know what you think of the rest of the changes.

@kahmed10 kahmed10 marked this pull request as ready for review April 23, 2026 20:09
@kahmed10 kahmed10 requested a review from a team as a code owner April 23, 2026 20:09
Copilot AI review requested due to automatic review settings April 23, 2026 20:09
@kahmed10 kahmed10 requested a review from causten as a code owner April 23, 2026 20:09
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a GPU cross-compilation mode driven by environment variables so MIGraphX can compile for a specified gfx* target without depending on a physical device, primarily by synthesizing GPU device properties and disabling tuning/finalization steps that require execution.

Changes:

  • Introduces MIGRAPHX_GPU_ARCH (plus CU/chiplet overrides) and a synthetic HIP device-props path to enable GPU cross-compilation.
  • Disables kernel benchmarking/tuning paths when cross-compiling (problem cache load + compile_ops benchmarking).
  • Updates GPU passes and data-type elimination logic to query device capabilities via context where available; documents the new env vars and adds a changelog entry.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
src/targets/gpu/target.cpp Adds env vars, creates cross-compile GPU context, and avoids problem-cache load in cross-compile mode.
src/targets/gpu/include/migraphx/gpu/eliminate_data_type_for_gpu.hpp Adds context* plumbing to support context-based device capability queries.
src/targets/gpu/include/migraphx/gpu/cross_compile_device.hpp Declares helper to build synthetic hipDeviceProp_t for cross compilation.
src/targets/gpu/include/migraphx/gpu/context.hpp Adds cross-compile constructors/mode and blocks runtime execution paths in that mode.
src/targets/gpu/eliminate_data_type_for_gpu.cpp Uses context-aware queries (when provided) for fp8/bf16 capability checks.
src/targets/gpu/device_name.cpp Refactors hipBLASLt support helpers and retains context/no-context query variants.
src/targets/gpu/cross_compile_device.cpp Implements synthetic HIP device properties for cross compilation.
src/targets/gpu/compile_ops.cpp Skips benchmarking/tuning when cross-compiling.
src/targets/gpu/CMakeLists.txt Adds new cross-compilation device source to the GPU library.
src/program.cpp Skips finalize/module-finalize when MIGRAPHX_GPU_ARCH is set.
src/include/migraphx/program.hpp Declares MIGRAPHX_GPU_ARCH env var for core program compilation logic.
docs/reference/MIGraphX-dev-env-vars.rst Documents MIGRAPHX_GPU_ARCH, CU/chiplet overrides, and stated limitations.
CHANGELOG.md Adds a changelog entry for cross-compilation support.

Comment thread src/program.cpp Outdated
Comment on lines +330 to +348
@@ -342,7 +344,8 @@ void program::compile(const target& t, compile_options options)
MIGRAPHX_THROW("Dangling reference in module " + mod->name() + " from instruction " +
std::to_string(index));
}
mod->finalize(this->impl->contexts);
if(not cross_compiling)
mod->finalize(this->impl->contexts);
Comment thread src/targets/gpu/target.cpp Outdated
if(not arch.empty())
{
auto num_cu = value_of(MIGRAPHX_GPU_NUM_CU{}, 120);
auto num_chiplets = value_of(MIGRAPHX_GPU_NUM_CHIPLETS{}, 1);
Comment on lines +27 to +40
#include <migraphx/gpu/export.h>
#include <migraphx/config.hpp>
#include <hip/hip_runtime_api.h>
#include <string>

namespace migraphx {
inline namespace MIGRAPHX_INLINE_NS {
namespace gpu {

/// Populate a hipDeviceProp_t with synthetic values for cross-compilation.
/// Used when no physical GPU is present and the target architecture
/// is specified via environment variables.
MIGRAPHX_GPU_EXPORT hipDeviceProp_t make_cross_compile_device_props(const std::string& arch_name,
std::size_t cu_count);
Comment on lines 36 to 41
struct MIGRAPHX_GPU_EXPORT eliminate_data_type_for_gpu
{
bool disable_64bit = false;
const context* ctx = nullptr;
std::string name() const { return "gpu::eliminate_data_type_for_gpu"; }
void apply(module_pass_manager& mpm) const;
Comment on lines +746 to +751
| Enables cross-compilation mode by specifying a target GPU architecture without requiring a physical GPU.
| When set, kernel benchmarking and finalization are skipped. MIOpen, hipBLASLt, and CK operations are currently not supported in this mode.

- | Takes a valid GPU architecture string (e.g. ``gfx942``, ``gfx1100``).

| Default: Not set. A physical GPU is used.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kahmed10 is the intent to let it throw in cross compile mode for external lib calls or should it be handled nicer?

Comment thread src/program.cpp Outdated
Comment on lines +311 to +312
if(string_value_of(MIGRAPHX_GPU_ARCH{}).empty())
this->finalize();
Comment thread src/targets/gpu/target.cpp Outdated
ctx.set_exhaustive_tune_flag(options.exhaustive_tune);
ctx.load_problem_cache();
if(not ctx.is_cross_compile())
ctx.load_problem_cache();
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem_cache should still be loaded when cross compiling. Otherwise compile_ops will not pick the fastest config.

Comment thread src/program.cpp Outdated
}
}
this->finalize();
if(string_value_of(MIGRAPHX_GPU_ARCH{}).empty())
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont think we should use env variable here. Instead the target or context should tell us if we can run finalize.

Comment thread src/program.cpp Outdated
auto&& passes = t.get_passes(this->impl->contexts.front(), options);
run_passes(*this, passes, options.trace);
auto mods = this->get_modules();
bool cross_compiling = not string_value_of(MIGRAPHX_GPU_ARCH{}).empty();
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont think we should use env variable here. Instead the target or context should tell us if we can run finalize.

if(ctx != nullptr)
return f(*ctx);
return f();
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this. Everything should just use the function that takes the context. The other overloads should be removed, in a later PR.

@kahmed10 kahmed10 changed the title [AIMIGRAPHX-828] Cross Compile Pt 2: add env variable to enable cross compilation [AIMIGRAPHX-828] Cross Compile Pt 2: construct gpu target with gfx number to enable cross compilation May 21, 2026
@CharlieL7 CharlieL7 requested a review from bdevorem May 27, 2026 19:42
Copy link
Copy Markdown
Member

@bdevorem bdevorem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been using this branch for a few weeks actually and it has been working well, lgtm. I don't mind it being controlled by env vars, it has been easy and straightforward as a user, though I agree you could more cleanly control a lot more toggles if using compile options. There is a copilot comment I think is worth looking at, I had a comment about potential nonsense targets, and code coverage is red

}

template <class T>
value to_value_target(const T& x)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible to add test coverage for this?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

Comment thread src/driver/main.cpp
{
if(target_name == "gpu" and not gpu_arch.empty())
{
return make_target(target_name,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what would happen if the values don't combine to be a valid target, simulated or otherwise? I'm not sure if that's even possible, but curious if there needs to be error checking here

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will throw down the line with a message like this:

Pass: gpu::compile_ops
Unsupported architecture: gfx0

Comment thread src/register_target.cpp Outdated
return it->second;
}

target make_target(const std::string& name, const value& options)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test coverage here too

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

@kahmed10 kahmed10 requested a review from bdevorem May 28, 2026 18:45
Copy link
Copy Markdown
Member

@bdevorem bdevorem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Licensing is red but I’ll approve now

Comment thread src/api/include/migraphx/migraphx.h Outdated

MIGRAPHX_C_EXPORT migraphx_status migraphx_target_create_with_options(migraphx_target_t* target,
const char* name,
const char* options_json);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just like migraphx_operation_create this should be variadic. So the user can do migraphx_target_create_with_options(&target, "gpu", "{gpu_arch: %s}", arch)

MIGRAPHX_EXPORT void register_target(const target& t);
MIGRAPHX_EXPORT void unregister_target(const std::string& name);
MIGRAPHX_EXPORT target make_target(const std::string& name);
MIGRAPHX_EXPORT target make_target(const std::string& name, const value& options);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should use std::initializer_list<std::pair<std::string, value>> and then add an template overload to handle the value type just like make_op.

Comment thread src/driver/main.cpp Outdated
if(target_name == "gpu" and not gpu_arch.empty())
{
return make_target(target_name,
value{{"gpu_arch", gpu_arch},
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove value.

Suggested change
value{{"gpu_arch", gpu_arch},
{{"gpu_arch", gpu_arch},

Comment thread src/api/api.cpp Outdated
{
if(options_json == nullptr or *options_json == '\0')
return make_target(name);
return make_target(name, from_json_string(options_json));
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to first call convert_to_json so users can pass in the keys without quotes: migraphx_target_create_with_options(&target, "gpu", "{num_cu: 32}"). This makes it consistent with migraphx_operation_create

inline namespace MIGRAPHX_INLINE_NS {
namespace gpu {

hipDeviceProp_t make_cross_compile_device_props(const std::string& arch_name, std::size_t cu_count)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we define our own device property struct instead of relying on hipDeviceProp_t? Hip could change it in the future.

@causten causten merged commit 82079fc into develop Jun 2, 2026
36 of 38 checks passed
@causten causten deleted the cross_compile_ctx branch June 2, 2026 18:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants