## Summary

Skips the CUDA device in `TestQuantizeFxModels.test_qat_embeddingbag_linear`.

**Test:** `TestQuantizeFxModels.test_qat_embeddingbag_linear`
**Repro:** `python test/test_quantization.py TestQuantizeFxModels.test_qat_embeddingbag_linear -v`

## Problem

The test at `test/quantization/fx/test_quantize_fx.py:9647` started failing after https://github.com/pytorch/pytorch/pull/167043 added `.to(device)` calls that move the models and tensors to CUDA. When the test runs with the CUDA device and the QNNPACK backend, `convert_fx()` fails at line 9672 during weight quantization. The failure occurs in `torch.quantize_per_tensor()`, which calls `at::new_qtensor()` in `aten/src/ATen/quantized/Quantizer.cpp:130`, where a `TORCH_CHECK` explicitly rejects CUDA tensors for the CPU-only quantization backends (QNNPACK, FBGEMM, ONEDNN).

## Solution

Skip the CUDA device in the test's device loop by adding a `continue` statement when `device == 'cuda'`. This change at `test/quantization/fx/test_quantize_fx.py:9653` ensures the test runs only on CPU, the only platform supported by all current quantization backends. The test now passes, restoring the working behavior from before that PR.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/170917
Approved by: https://github.com/cyyever, https://github.com/ezyang
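For context, below is a minimal, self-contained sketch of the failure mode and of the shape of the skip. The helper `quantize_weight_for_device`, the dummy weight shape, and the quantization parameters are illustrative only and are not taken from the test or the PR diff.

```python
import torch

# Use the QNNPACK engine, matching the backend the test runs with.
torch.backends.quantized.engine = "qnnpack"


def quantize_weight_for_device(device: str):
    """Quantize a dummy weight tensor on `device`, skipping CUDA.

    The early return mirrors the `continue` added to the test's device loop:
    the CPU-only quantization backends (QNNPACK, FBGEMM, ONEDNN) cannot
    produce quantized CUDA tensors.
    """
    if device == "cuda":
        # Calling torch.quantize_per_tensor() on a CUDA tensor here would hit
        # the TORCH_CHECK in at::new_qtensor() described above and raise.
        return None
    weight = torch.randn(8, 4, device=device)
    return torch.quantize_per_tensor(weight, scale=0.1, zero_point=0, dtype=torch.quint8)


for device in ("cpu", "cuda"):
    print(device, "->", type(quantize_weight_for_device(device)))
```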