## Summary

This PR adds multi-architecture kernel compilation support for ROCm in PyTorch's AOT Inductor module, enabling a single compiled model to run across multiple AMD GPU architectures (MI200, MI300, MI350, etc.) without recompilation.

## Implementation

- **Multi-arch compilation pipeline**: Compiles LLVM IR to multiple GPU architectures and bundles the results using `clang-offload-bundler`
- **Architecture detection**: Automatically detects target architectures from `torch.cuda.get_arch_list()`, with overrides via the `PYTORCH_ROCM_ARCH` environment variable
- **ROCm-specific utilities**: A new `rocm_multiarch_utils.py` module handles ROCm toolchain integration
- **Test infrastructure**: Adapted AOT Inductor tests to support both the CUDA and ROCm compilation paths

## Testing

Successfully tested on:
- MI200
- MI300

**Enabled tests:**
- `test_simple_multi_arch`
- `test_compile_after_package_multi_arch`
- `test_compile_with_exporter`
- `test_compile_with_exporter_weights`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/166357
Approved by: https://github.com/jeffdaily
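The architecture-detection behavior described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the function name `resolve_rocm_archs` is hypothetical, and the fallback list stands in for what `torch.cuda.get_arch_list()` would report at runtime. `PYTORCH_ROCM_ARCH` is a real PyTorch environment variable that takes a semicolon-separated list of gfx targets.

```python
import os


def resolve_rocm_archs(detected_archs):
    """Return the AMD GPU architectures to compile for.

    If PYTORCH_ROCM_ARCH is set (e.g. "gfx90a;gfx942"), it overrides
    the runtime-detected list; otherwise fall back to the architectures
    the runtime reports (in the real code, torch.cuda.get_arch_list()).
    """
    override = os.environ.get("PYTORCH_ROCM_ARCH", "").strip()
    if override:
        # Split the semicolon-separated override, dropping empty entries.
        return [arch for arch in override.split(";") if arch]
    return list(detected_archs)


# Example: with the override set, the detected list is ignored.
os.environ["PYTORCH_ROCM_ARCH"] = "gfx90a;gfx942"
print(resolve_rocm_archs(["gfx908"]))  # ['gfx90a', 'gfx942']
```

Each resolved architecture is then compiled separately and the resulting code objects are bundled into a single artifact with `clang-offload-bundler`, so one packaged model can dispatch to whichever architecture the host GPU matches.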