Fixes #167991
Example of the new warning message:
```
/home/guilhermel/git/pytorch313/torch/_dynamo/variables/functions.py:2159: UserWarning: Dynamo detected a call to a `functools.lru_cache`-wrapped function at 'script.py:12'. Dynamo ignores the cache wrapper and directly traces the wrapped function. Silent incorrectness is only a *potential* risk, not something we have observed. Enable TORCH_LOGS=+dynamo for a DEBUG stack trace.
This call originates from:
  File "/path/to/script.py", line 12, in bar
    return baz(x)
  torch._dynamo.utils.warn_once(msg)
```
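The `torch._dynamo.utils.warn_once` call at the bottom of the trace deduplicates warnings per message. A minimal self-contained sketch of that pattern (not the actual Dynamo implementation) looks like:

```python
import warnings

_warned: set = set()

def warn_once(msg: str) -> None:
    # Emit each distinct message at most once per process, mirroring the
    # deduplication behavior suggested by torch._dynamo.utils.warn_once.
    if msg not in _warned:
        _warned.add(msg)
        warnings.warn(msg, UserWarning, stacklevel=2)
```

Repeated calls with the same message therefore produce a single warning, so tracing the same `lru_cache`-wrapped function many times does not spam the log.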
Pull Request resolved: https://github.com/pytorch/pytorch/pull/171496
Approved by: https://github.com/Lucaskabela
## Summary
This PR updates the expected accuracy CSV files for inductor benchmarks based on CI results from PyTorch commit 3c98eef883.
These files serve as reference points for dynamo/inductor CI to track:
- Graph breaks
- Model accuracy
## Changes
- Updated CUDA expected accuracy files in `benchmarks/dynamo/ci_expected_accuracy/`
- Updated ROCm expected accuracy files in `benchmarks/dynamo/ci_expected_accuracy/rocm/`
## Test Plan
- [ ] Verify that the CI jobs pass with the updated expected accuracy files
- [ ] Review the diff to ensure changes are reasonable and expected
- [ ] Check that no unexpected regressions are being marked as "expected"
Pull Request resolved: https://github.com/pytorch/pytorch/pull/171533
Approved by: https://github.com/jataylo, https://github.com/atalman
**Summary:** Currently, whenever we subtract two partial DTensors, we redistribute, since linearity is -1 for `aten.sub.Tensor`. However, this redistribution is unnecessary and can be avoided in the same way as for its `add` counterpart. I moved the op to `linear_ops` and ensured that subtracting a scalar from a partial DTensor continues to redistribute.
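To see why subtraction is linear over partial placements: a partial tensor's logical value is the sum of its per-rank shards, so shard-wise subtraction commutes with the reduction, while subtracting a scalar does not. A toy sketch with plain Python numbers (no DTensor API involved) illustrates this:

```python
# A "partial" tensor's logical value is the sum of its per-rank shards.
a_shards = [1.0, 2.0, 3.0]  # logical value: 6.0
b_shards = [0.5, 0.5, 1.0]  # logical value: 2.0

# Subtracting shard-wise and then reducing matches reducing first and then
# subtracting, so partial - partial needs no redistribution.
shardwise = sum(x - y for x, y in zip(a_shards, b_shards))
reduced_first = sum(a_shards) - sum(b_shards)

# A scalar, however, would be subtracted once per shard, over-counting it
# by (num_shards - 1) * scalar -- so partial - scalar must still redistribute.
scalar = 1.0
overcounted = sum(x - scalar for x in a_shards)
correct = sum(a_shards) - scalar
```

Here `shardwise` and `reduced_first` agree (both 4.0), while `overcounted` (3.0) disagrees with `correct` (5.0), matching the scalar carve-out described above.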
**Test Cases:**
1. pytest test/distributed/tensor/test_pointwise_ops.py -k test_add_sub_scalar_norm_partial
2. pytest test/distributed/tensor/test_pointwise_ops.py -k test_add_sub_scalar_partial
Pull Request resolved: https://github.com/pytorch/pytorch/pull/170040
Approved by: https://github.com/wconstab
ghstack dependencies: #170030, #170035
We will eventually remove `current_tx` in favor of directly passing it to VTs. We also eventually intend to change call sites involving TXes so that the leaf TX is always passed. Currently, this is inconsistent, since `InstructionTranslator.current_tx()` returns the root TX.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/170234
Approved by: https://github.com/guilhermeleobas
Fixes #168862
Previously, the test broke on the MI300X at this assertion:
```cpp
TORCH_INTERNAL_ASSERT(2 * BLOCK_THREADS >= grid_size);
```
because the MI300X has 304 SMs, so `grid_size` would be set to at least 304 while the number of threads within a block would be 128 for 16-byte types (2 * 128 = 256, which is not >= 304).
The reason for this assertion seems to be that the kernel performed a simple reduction over the block aggregates: each thread held one block's aggregate, and if there were fewer threads per block than the number of blocks, each thread would add up two aggregates (hence the assertion as a safeguard). Changing the conditional to a loop should incur minimal overhead, since it executes at most one more time per thread than the old behavior.
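The conditional-to-loop change can be sketched in Python, with each list element standing in for one block's aggregate (function and variable names are illustrative, not the kernel's):

```python
def reduce_block_aggregates(aggregates, block_threads):
    # Each of block_threads "threads" accumulates block aggregates.
    # Old behavior: thread t read aggregates[t] and conditionally
    # aggregates[t + block_threads], requiring 2 * block_threads >= grid_size.
    # New behavior: thread t strides over the list in steps of block_threads,
    # so any grid size works (e.g. 304 blocks on MI300X with 128 threads).
    partials = [0] * block_threads
    for t in range(block_threads):
        i = t
        while i < len(aggregates):  # loop replaces the single conditional add
            partials[t] += aggregates[i]
            i += block_threads
    return sum(partials)
```

With 304 aggregates and 128 threads, each thread now performs at most three additions instead of failing the `2 * BLOCK_THREADS >= grid_size` check.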
Pull Request resolved: https://github.com/pytorch/pytorch/pull/170763
Approved by: https://github.com/jeffdaily
Co-authored-by: Aaron Gokaslan <aaronGokaslan@gmail.com>
Previously, `auto_chunker.num_chunks` could not be overridden via the `options` argument of `torch.compile` due to a missing type annotation. The config's type was inferred from its default value, `None`, so overriding it with an integer during compilation triggered a type mismatch and failed.
Adding type annotations fixes that.
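A minimal sketch of the failure mode, using a hypothetical `ConfigEntry` class rather than the real config machinery: when no annotation is given, the accepted type is inferred from the default value, so a `None` default rejects integer overrides until an explicit `Optional[int]` annotation is supplied.

```python
from typing import Optional, Union, get_args, get_origin

class ConfigEntry:
    """Hypothetical config entry; the real torch config machinery differs."""

    def __init__(self, default, annotation=None):
        self.default = default
        # With no explicit annotation, the accepted type is inferred from
        # the default value -- type(None) when the default is None.
        self.annotation = annotation if annotation is not None else type(default)

    def override(self, value):
        ann = self.annotation
        # Optional[int] is Union[int, None]; unpack it into a tuple of types.
        allowed = get_args(ann) if get_origin(ann) is Union else (ann,)
        if not isinstance(value, allowed):
            raise TypeError(f"expected {ann}, got {type(value).__name__}")
        return value

# Inferred from the default None: an integer override is rejected.
unannotated = ConfigEntry(default=None)
# Annotated as Optional[int]: the same override succeeds.
annotated = ConfigEntry(default=None, annotation=Optional[int])
```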
Pull Request resolved: https://github.com/pytorch/pytorch/pull/171477
Approved by: https://github.com/v0i0, https://github.com/eellison
ghstack dependencies: #171359
[No functional change] Refactor the `redistribute_cost` code by extracting the logic that computes the cost of a single collective op into `_compute_placement_transition_cost`. This lets `DTensorRedistributePlanner` reuse the same single-collective-op cost from `_collective_utils.py` when traversing the graph. Below is how the call stack will look:
```
DTensorRedistributePlanner --> one_step_redistribute_cost ----------------|
| -----> _compute_placement_transition_cost
|
redistribute_cost ---> DTensorRedistributePlanner (get transform_infos)---|
```
Without the refactor, `redistribute_cost` and `DTensorRedistributePlanner` would call each other circularly.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/170108
Approved by: https://github.com/mori360
ghstack dependencies: #170106, #170107
While looking into https://github.com/pytorch/pytorch/issues/169439, we noticed that `redistribute_cost` is incorrect when verifying the cost against the following condition:
```
redistribute_cost(SRC, DST) <= redistribute_cost(SRC, INT) + redistribute_cost(INT, DST) for all INT
```
The failing case is:
1. For SRC --> DST, the redistribution path is: `S(1)S(0)[0]S(0)[1]->S(1)S(0)R->S(1)[0]S(0)S(1)[1]->S(1)[0]RS(1)[1]->S(1)[0]S(1)[2]S(1)[1]`
The redistribute cost is then the sum of the following four step costs:
```
current=S(0), target=R, comm_bytes_gb=1.1920928955078125e-07, step_cost=7.2006796424717825
current=R, target=S(1), comm_bytes_gb=1.1920928955078125e-07, step_cost=0.0 <<<<<<<<<<<<<<<<<<< comm_bytes_gb incorrect
current=S(0), target=R, comm_bytes_gb=2.384185791015625e-07, step_cost=7.201359284943566 <<< mismatch with number 7.2006796424717825
current=R, target=S(1), comm_bytes_gb=2.384185791015625e-07, step_cost=0.0
```
2. For SRC --> INT, the redistribution path is: `S(1)S(0)[0]S(0)[1]->S(1)S(0)R->S(1)[0]S(0)S(1)[1]`
The redistribute cost is then the sum of the following two step costs:
```
current=S(0), target=R, comm_bytes_gb=1.1920928955078125e-07, step_cost=7.2006796424717825
current=R, target=S(1), comm_bytes_gb=1.1920928955078125e-07, step_cost=0.0
```
3. For INT --> DST, the redistribution path is: `S(1)[0]S(0)S(1)[1]->S(1)[0]RS(1)[1]->S(1)[0]S(1)[2]S(1)[1]`
The redistribute cost is then the sum of the following two step costs:
```
current=S(0), target=R, comm_bytes_gb=1.1920928955078125e-07, step_cost=7.2006796424717825
current=R, target=S(1), comm_bytes_gb=1.1920928955078125e-07, step_cost=0.0
```
As we can see, `redistribute_cost(SRC, DST) > redistribute_cost(SRC, INT) + redistribute_cost(INT, DST)` in this failing case. The difference comes from the conversion from `R` to `S(1)`, which produces an incorrect `comm_bytes_gb` for the subsequent cost computation.
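The violated inequality can be checked directly with the step costs from the logs above:

```python
# Step costs copied from the logs above.
src_to_dst = 7.2006796424717825 + 0.0 + 7.201359284943566 + 0.0
src_to_int = 7.2006796424717825 + 0.0
int_to_dst = 7.2006796424717825 + 0.0

# A correct cost model must satisfy the triangle-style condition
#   redistribute_cost(SRC, DST) <= redistribute_cost(SRC, INT)
#                                  + redistribute_cost(INT, DST)
# but the doubled comm_bytes_gb after the R -> S(1) step violates it here.
violated = src_to_dst > src_to_int + int_to_dst
```

Summing the logged values gives about 14.40204 for the direct path versus about 14.40136 via the intermediate, confirming the violation.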
Pull Request resolved: https://github.com/pytorch/pytorch/pull/170107
Approved by: https://github.com/mori360
ghstack dependencies: #170106
When the source and target placements are the same, we may still need to compute the `redistribute_cost`, because they can have different shard orders.
The test case is skipped in this PR, as #170109 will add a stricter test.
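As a toy illustration (hypothetical layout records, not the DTensor API): two layouts can carry identical placements yet shard the mesh dimensions in a different order, so a transformation between them still has a nonzero cost:

```python
# Both layouts shard tensor dims 0 and 1 over a 2-D mesh, but map mesh
# dims to tensor dims in the opposite order, so the physical data layout
# differs even though the placements tuples compare equal.
src = {"placements": ("S(0)", "S(1)"), "shard_order": (0, 1)}
dst = {"placements": ("S(0)", "S(1)"), "shard_order": (1, 0)}

same_placements = src["placements"] == dst["placements"]
needs_cost = src["shard_order"] != dst["shard_order"]
```

A placement-equality check alone would therefore short-circuit to zero cost and miss the required data movement.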
Pull Request resolved: https://github.com/pytorch/pytorch/pull/170106
Approved by: https://github.com/mori360
Replace the exception-based control-flow pattern with an attribute-based approach for `SkipCodeRecursiveException` and `RecompileLimitExceeded`.
Instead of using specific exception types for control flow, add a `frame_exec_strategy` attribute to `TorchDynamoException` that lets exceptions optionally specify how `convert_frame` should handle them.
Benefits:
- Cleaner separation of concerns (exceptions for errors, attributes for control flow)
- More flexible - any exception can specify a frame execution strategy
- Easier to extend - no need for new exception types for new strategies
- Better type safety with isinstance(e, exc.TorchDynamoException) check
Changes:
- torch/_dynamo/exc.py:
* Add frame_exec_strategy attribute to TorchDynamoException with documentation
* Remove SkipCodeRecursiveException and RecompileLimitExceeded classes
- torch/_dynamo/convert_frame.py:
* Remove imports of removed exception classes
* Replace isinstance checks with frame_exec_strategy attribute check
* Set frame_exec_strategy on Unsupported exception in recompile limit handler
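A minimal sketch of the new pattern, using the class and attribute names from the description (the real Dynamo code differs in detail):

```python
class TorchDynamoException(Exception):
    # Optional frame-execution strategy. convert_frame consults this
    # attribute instead of matching on dedicated exception subclasses.
    frame_exec_strategy = None

class Unsupported(TorchDynamoException):
    pass

def convert_frame(compile_fn):
    try:
        return compile_fn()
    except TorchDynamoException as e:
        # Attribute check replaces isinstance checks against the removed
        # SkipCodeRecursiveException / RecompileLimitExceeded classes.
        if e.frame_exec_strategy is not None:
            return e.frame_exec_strategy
        raise

def hit_recompile_limit():
    e = Unsupported("recompile limit exceeded")
    e.frame_exec_strategy = "skip_code_recursive"  # set at the raise site
    raise e
```

Any exception subclass can now opt into a strategy by setting the attribute, while exceptions without one propagate as ordinary errors.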
Pull Request resolved: https://github.com/pytorch/pytorch/pull/171358
Approved by: https://github.com/Lucaskabela, https://github.com/guilhermeleobas
ghstack dependencies: #170587