[AIMIGRAPHX-828] Cross Compile Pt 2: construct gpu target with gfx number to enable cross compilation#4795
Conversation
…vice_name_change
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #4795 +/- ##
===========================================
- Coverage 92.66% 92.56% -0.10%
===========================================
Files 588 588
Lines 30397 30467 +70
===========================================
+ Hits 28165 28200 +35
- Misses 2232 2267 +35
🚀 New features to boost your workflow:
|
|
@pfultz2 @CharlieL7 this is the implementation using environment variables. I'm thinking of instead adding it to compile options with an additional map that can be used to tweak number of CUs, chiplets, etc. Also let me know what you think of the rest of the changes. |
There was a problem hiding this comment.
Pull request overview
Adds a GPU cross-compilation mode driven by environment variables so MIGraphX can compile for a specified gfx* target without depending on a physical device, primarily by synthesizing GPU device properties and disabling tuning/finalization steps that require execution.
Changes:
- Introduces
MIGRAPHX_GPU_ARCH(plus CU/chiplet overrides) and a synthetic HIP device-props path to enable GPU cross-compilation. - Disables kernel benchmarking/tuning paths when cross-compiling (problem cache load + compile_ops benchmarking).
- Updates GPU passes and data-type elimination logic to query device capabilities via
contextwhere available; documents the new env vars and adds a changelog entry.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| src/targets/gpu/target.cpp | Adds env vars, creates cross-compile GPU context, and avoids problem-cache load in cross-compile mode. |
| src/targets/gpu/include/migraphx/gpu/eliminate_data_type_for_gpu.hpp | Adds context* plumbing to support context-based device capability queries. |
| src/targets/gpu/include/migraphx/gpu/cross_compile_device.hpp | Declares helper to build synthetic hipDeviceProp_t for cross compilation. |
| src/targets/gpu/include/migraphx/gpu/context.hpp | Adds cross-compile constructors/mode and blocks runtime execution paths in that mode. |
| src/targets/gpu/eliminate_data_type_for_gpu.cpp | Uses context-aware queries (when provided) for fp8/bf16 capability checks. |
| src/targets/gpu/device_name.cpp | Refactors hipBLASLt support helpers and retains context/no-context query variants. |
| src/targets/gpu/cross_compile_device.cpp | Implements synthetic HIP device properties for cross compilation. |
| src/targets/gpu/compile_ops.cpp | Skips benchmarking/tuning when cross-compiling. |
| src/targets/gpu/CMakeLists.txt | Adds new cross-compilation device source to the GPU library. |
| src/program.cpp | Skips finalize/module-finalize when MIGRAPHX_GPU_ARCH is set. |
| src/include/migraphx/program.hpp | Declares MIGRAPHX_GPU_ARCH env var for core program compilation logic. |
| docs/reference/MIGraphX-dev-env-vars.rst | Documents MIGRAPHX_GPU_ARCH, CU/chiplet overrides, and stated limitations. |
| CHANGELOG.md | Adds a changelog entry for cross-compilation support. |
| @@ -342,7 +344,8 @@ void program::compile(const target& t, compile_options options) | |||
| MIGRAPHX_THROW("Dangling reference in module " + mod->name() + " from instruction " + | |||
| std::to_string(index)); | |||
| } | |||
| mod->finalize(this->impl->contexts); | |||
| if(not cross_compiling) | |||
| mod->finalize(this->impl->contexts); | |||
| if(not arch.empty()) | ||
| { | ||
| auto num_cu = value_of(MIGRAPHX_GPU_NUM_CU{}, 120); | ||
| auto num_chiplets = value_of(MIGRAPHX_GPU_NUM_CHIPLETS{}, 1); |
| #include <migraphx/gpu/export.h> | ||
| #include <migraphx/config.hpp> | ||
| #include <hip/hip_runtime_api.h> | ||
| #include <string> | ||
|
|
||
| namespace migraphx { | ||
| inline namespace MIGRAPHX_INLINE_NS { | ||
| namespace gpu { | ||
|
|
||
| /// Populate a hipDeviceProp_t with synthetic values for cross-compilation. | ||
| /// Used when no physical GPU is present and the target architecture | ||
| /// is specified via environment variables. | ||
| MIGRAPHX_GPU_EXPORT hipDeviceProp_t make_cross_compile_device_props(const std::string& arch_name, | ||
| std::size_t cu_count); |
| struct MIGRAPHX_GPU_EXPORT eliminate_data_type_for_gpu | ||
| { | ||
| bool disable_64bit = false; | ||
| const context* ctx = nullptr; | ||
| std::string name() const { return "gpu::eliminate_data_type_for_gpu"; } | ||
| void apply(module_pass_manager& mpm) const; |
| | Enables cross-compilation mode by specifying a target GPU architecture without requiring a physical GPU. | ||
| | When set, kernel benchmarking and finalization are skipped. MIOpen, hipBLASLt, and CK operations are currently not supported in this mode. | ||
|
|
||
| - | Takes a valid GPU architecture string (e.g. ``gfx942``, ``gfx1100``). | ||
|
|
||
| | Default: Not set. A physical GPU is used. |
There was a problem hiding this comment.
@kahmed10 is the intent to let it throw in cross compile mode for external lib calls or should it be handled nicer?
| if(string_value_of(MIGRAPHX_GPU_ARCH{}).empty()) | ||
| this->finalize(); |
| ctx.set_exhaustive_tune_flag(options.exhaustive_tune); | ||
| ctx.load_problem_cache(); | ||
| if(not ctx.is_cross_compile()) | ||
| ctx.load_problem_cache(); |
There was a problem hiding this comment.
The problem_cache should still be loaded when cross compiling. Otherwise compile_ops will not pick the fastest config.
| } | ||
| } | ||
| this->finalize(); | ||
| if(string_value_of(MIGRAPHX_GPU_ARCH{}).empty()) |
There was a problem hiding this comment.
I dont think we should use env variable here. Instead the target or context should tell us if we can run finalize.
| auto&& passes = t.get_passes(this->impl->contexts.front(), options); | ||
| run_passes(*this, passes, options.trace); | ||
| auto mods = this->get_modules(); | ||
| bool cross_compiling = not string_value_of(MIGRAPHX_GPU_ARCH{}).empty(); |
There was a problem hiding this comment.
I dont think we should use env variable here. Instead the target or context should tell us if we can run finalize.
| if(ctx != nullptr) | ||
| return f(*ctx); | ||
| return f(); | ||
| } |
There was a problem hiding this comment.
Remove this. Everything should just use the function that takes the context. The other overloads should be removed, in a later PR.
…oss_compile_ctx
…oss_compile_ctx
…oss_compile_ctx
bdevorem
left a comment
There was a problem hiding this comment.
I've been using this branch for a few weeks actually and it has been working well, lgtm. I don't mind it being controlled by env vars, it has been easy and straightforward as a user, though I agree you could more cleanly control a lot more toggles if using compile options. There is a copilot comment I think is worth looking at, I had a comment about potential nonsense targets, and code coverage is red
| } | ||
|
|
||
| template <class T> | ||
| value to_value_target(const T& x) |
There was a problem hiding this comment.
is it possible to add test coverage for this?
| { | ||
| if(target_name == "gpu" and not gpu_arch.empty()) | ||
| { | ||
| return make_target(target_name, |
There was a problem hiding this comment.
what would happen if the values don't combine to be a valid target, simulated or otherwise? I'm not sure if that's even possible, but curious if there needs to be error checking here
There was a problem hiding this comment.
It will throw down the line with a message like this:
Pass: gpu::compile_ops
Unsupported architecture: gfx0
| return it->second; | ||
| } | ||
|
|
||
| target make_target(const std::string& name, const value& options) |
bdevorem
left a comment
There was a problem hiding this comment.
Licensing is red but I’ll approve now
|
|
||
| MIGRAPHX_C_EXPORT migraphx_status migraphx_target_create_with_options(migraphx_target_t* target, | ||
| const char* name, | ||
| const char* options_json); |
There was a problem hiding this comment.
Just like migraphx_operation_create this should be variadic. So the user can do migraphx_target_create_with_options(&target, "gpu", "{gpu_arch: %s}", arch)
| MIGRAPHX_EXPORT void register_target(const target& t); | ||
| MIGRAPHX_EXPORT void unregister_target(const std::string& name); | ||
| MIGRAPHX_EXPORT target make_target(const std::string& name); | ||
| MIGRAPHX_EXPORT target make_target(const std::string& name, const value& options); |
There was a problem hiding this comment.
There should use std::initializer_list<std::pair<std::string, value>> and then add an template overload to handle the value type just like make_op.
| if(target_name == "gpu" and not gpu_arch.empty()) | ||
| { | ||
| return make_target(target_name, | ||
| value{{"gpu_arch", gpu_arch}, |
There was a problem hiding this comment.
Remove value.
| value{{"gpu_arch", gpu_arch}, | |
| {{"gpu_arch", gpu_arch}, |
| { | ||
| if(options_json == nullptr or *options_json == '\0') | ||
| return make_target(name); | ||
| return make_target(name, from_json_string(options_json)); |
There was a problem hiding this comment.
This needs to first call convert_to_json so users can pass in the keys without quotes: migraphx_target_create_with_options(&target, "gpu", "{num_cu: 32}"). This makes it consistent with migraphx_operation_create
| inline namespace MIGRAPHX_INLINE_NS { | ||
| namespace gpu { | ||
|
|
||
| hipDeviceProp_t make_cross_compile_device_props(const std::string& arch_name, std::size_t cu_count) |
There was a problem hiding this comment.
Can we define our own device property struct instead of relying on hipDeviceProp_t? Hip could change it in the future.
Motivation
builds on top of part 1: #4771
Technical Details
Changelog Category
Add a
CHANGELOG.mdentry for any option other thanNot Applicable