pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2026-01-15 12:15:51 +00:00


can-gaa-hou 89e3bbcb5b [Accelerator] Add Accelerator Capabilities API (#165631 )

# Motivation
There are several issues related to the data types and precision that an accelerator supports (see #165038 and #143112). Today, we sometimes have to look these capabilities up in the documentation and then hard-code them. This PR proposes a new unified API that lets users query their accelerator's capabilities.

# Changes
This PR creates a new data structure `DeviceCapability` containing the capabilities that an accelerator commonly has:
- Supported data types (every data type is marked as supported by default):
  - `fp16`, `int32`, `complex`, etc.
- Other capabilities (to be discussed)

To access the structure, this PR defines a new Python API in the Accelerator module: `get_device_capability`. It takes a `device` as input and returns a dictionary of capabilities (currently the only key is `supported_dtypes`).

# Usage
```python
>>> import torch
>>> import torch_openreg
>>> torch.accelerator.get_device_capability('openreg:0')
{'supported_dtypes': [torch.uint8, torch.int8, torch.int16, torch.int32, torch.int64, torch.float16, torch.float32, torch.float64, torch.complex32, torch.complex64, torch.complex128, torch.bool, torch.qint8, torch.quint8, torch.qint32, torch.bfloat16, torch.quint4x2, torch.quint2x4, torch.bits1x8, torch.bits2x4, torch.bits4x2, torch.bits8, torch.bits16, torch.float8_e5m2, torch.float8_e4m3fn, torch.float8_e5m2fnuz, torch.float8_e4m3fnuz, torch.uint16, torch.uint32, torch.uint64, torch.uint1, torch.uint2, torch.uint3, torch.uint4, torch.uint5, torch.uint6, torch.uint7, torch.int1, torch.int2, torch.int3, torch.int4, torch.int5, torch.int6, torch.int7, torch.float8_e8m0fnu, torch.float4_e2m1fn_x2]}
```
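
As a rough sketch of how calling code might consume the returned dictionary instead of hard-coding dtype support, the helpers below pick the first supported dtype from a preference list. The names `supports_dtype` and `pick_dtype` are hypothetical and not part of this PR, and the capability dictionary is a stand-in that uses plain strings so the snippet runs without a PyTorch build; with a real build it would presumably come from `torch.accelerator.get_device_capability(device)` and hold `torch.dtype` values.

```python
from typing import Any, Mapping, Sequence


def supports_dtype(capability: Mapping[str, Any], dtype: Any) -> bool:
    """Return True if `dtype` is listed under the 'supported_dtypes' key."""
    # Assumption of this sketch: a missing key means "not supported".
    return dtype in capability.get("supported_dtypes", ())


def pick_dtype(capability: Mapping[str, Any], preferred: Sequence[Any], fallback: Any) -> Any:
    """Return the first supported entry of `preferred`, else `fallback`."""
    for dtype in preferred:
        if supports_dtype(capability, dtype):
            return dtype
    return fallback


# Stand-in for the dictionary returned by the API (strings instead of torch.dtype).
capability = {"supported_dtypes": ["float16", "float32", "int32"]}

chosen = pick_dtype(capability, preferred=["bfloat16", "float16"], fallback="float32")
print(chosen)
```

Because the helpers only rely on membership tests against `supported_dtypes`, they should work unchanged if further keys are added to the dictionary later.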
# TODO
- As far as I know, precision is the only capability to track so far. However, more capabilities common to accelerators may turn up, so the API should be designed to be easy to extend.
- Add support for the other in-tree accelerators, such as **cuda** and **mps**.
- Clarify whether the capabilities are supported in software or in hardware. (By @guangyey)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/165631
Approved by: https://github.com/guangyey, https://github.com/albanD

Co-authored-by: Yu, Guangye <106960996+guangyey@users.noreply.github.com>
Co-authored-by: Jiawei Li <ljw1101.vip@gmail.com>

2025-12-03 21:37:30 +00:00

alloc_cpu.cpp

Remove unnecessary "static" for definitions in anonymous namespace (#165035 )

2025-10-11 00:04:23 +00:00

alloc_cpu.h

…

COW.cpp

[1/N] Change C-style casts to static_cast or reinterpret_cast (#165750 )

2025-10-20 23:27:13 +00:00

COW.h

…

COWDeleter.cpp

…

COWDeleter.h

…

DeviceGuardImplInterface.cpp

Add functions to setup PrivateUse1 as a python backend device. (#157859 )

2025-10-01 21:32:59 +00:00

DeviceGuardImplInterface.h

[Accelerator] Add Accelerator Capabilities API (#165631 )

2025-12-03 21:37:30 +00:00

FakeGuardImpl.h

Mark unused parameters in C++ code (#164912 )

2025-10-09 06:23:25 +00:00

GPUTrace.cpp

…

GPUTrace.h

Mark unused parameters in C++ code (#164912 )

2025-10-09 06:23:25 +00:00

HermeticPyObjectTLS.cpp

Revert "[BE] Remove HermeticPyObjectTLS and Simplify PythonOpRegistrationTrampoline (#163464 )"

2025-09-30 18:20:20 +00:00

HermeticPyObjectTLS.h

Revert "[BE] Remove HermeticPyObjectTLS and Simplify PythonOpRegistrationTrampoline (#163464 )"

2025-09-30 18:20:20 +00:00

InlineDeviceGuard.h

…

InlineEvent.h

…

InlineStreamGuard.h

…

LocalDispatchKeySet.cpp

…

LocalDispatchKeySet.h

Mark unused parameters in C++ code (#164912 )

2025-10-09 06:23:25 +00:00

PyInterpreter.cpp

Rework PyObject preservation (v2) (#167564 )

2025-11-17 14:52:02 +00:00

PyInterpreter.h

Rework PyObject preservation (v2) (#167564 )

2025-11-17 14:52:02 +00:00

PyInterpreterHooks.cpp

[BE] Make PyObjectSlot use a global PyInterpreter and remove (#158427 )

2025-07-30 17:29:43 +00:00

PyInterpreterHooks.h

[2/N] Fix clang-tidy readability checks (#164652 )

2025-10-06 01:06:01 +00:00

PyObjectSlot.h

Rework PyObject preservation (v2) (#167564 )

2025-11-17 14:52:02 +00:00

PythonDispatcherTLS.cpp

…

PythonDispatcherTLS.h

…

README-cow.md

…

README.md

…

SizesAndStrides.cpp

…

SizesAndStrides.h

…

TorchDispatchModeTLS.cpp

[1/N] Remove unused header inclusion (#165763 )

2025-10-18 05:23:11 +00:00

TorchDispatchModeTLS.h

…

VirtualGuardImpl.h

[Accelerator] Add Accelerator Capabilities API (#165631 )

2025-12-03 21:37:30 +00:00

README.md

c10/core/impl provides headers for functionality that is only needed in very specific use-cases (e.g., you are defining a new device type), which are generally only needed by C10 or PyTorch code. If you are an ordinary end-user, you should not use headers in this folder. We permanently give NO backwards-compatibility guarantees for implementations in this folder.

Compare with c10/util, which provides functionality that is not directly related to being a deep learning library (e.g., C++20 polyfills), but may still be generally useful and visible to users.

(We don't call this c10/detail, because the detail namespace convention is for header private details. However, c10::impl may be utilized from external headers; it simply indicates that the functionality is not for end users.)