mirror of https://github.com/zebrajr/pytorch.git synced 2026-01-15 12:15:51 +00:00

Files

eellison 93bfe57144 cudagraphs: fix backward hooks & fsdp interaction (#126914)

Fixes

> ERROR: expected to be in states [<TrainingState.FORWARD_BACKWARD: 2>] but current state is TrainingState.IDLE

This error occurred when composing PT2 FSDP with cudagraphs. Cudagraphs caches output tensor impls in the fast path, so we were inadvertently accumulating multiple hooks on what should have been fresh allocations.

From the code comment:
```
# this output represents a fresh allocated tensor.
# We return the same TensorImpl from run to run to avoid overhead.
# autograd.Function will reset the Autograd meta of output tensors
# as part of aot_autograd, but _backward_hooks are stored on tensors separately,
# so we need to manually reset hooks.
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/126914
Approved by: https://github.com/awgu, https://github.com/xmfan
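
The accumulation described above can be illustrated without PyTorch. The following is a minimal sketch, not the actual cudagraphs code: `FakeTensor`, `CachedGraphRunner`, and `hooks_after_runs` are hypothetical names invented for illustration. It only shows why returning the same object from run to run lets hooks pile up unless they are reset.

```python
from collections import OrderedDict


class FakeTensor:
    """Hypothetical stand-in for a tensor object that carries backward hooks."""

    def __init__(self):
        self._backward_hooks = None

    def register_hook(self, fn):
        # Hooks live on the object itself, so they survive as long as it does.
        if self._backward_hooks is None:
            self._backward_hooks = OrderedDict()
        self._backward_hooks[len(self._backward_hooks)] = fn


class CachedGraphRunner:
    """Hypothetical fast path that returns the same output object on every run."""

    def __init__(self, reset_hooks):
        self._cached_output = FakeTensor()
        self._reset_hooks = reset_hooks

    def run(self):
        out = self._cached_output
        if self._reset_hooks:
            # Treat the cached object as a fresh allocation: drop stale hooks.
            out._backward_hooks = None
        return out


def hooks_after_runs(reset_hooks, runs=3):
    """Register one hook per run and report how many end up on the output."""
    runner = CachedGraphRunner(reset_hooks)
    for _ in range(runs):
        out = runner.run()
        # Stands in for a caller that registers a hook on each iteration.
        out.register_hook(lambda grad: grad)
    return len(out._backward_hooks)


print(hooks_after_runs(reset_hooks=False))  # 3: hooks accumulate on the cached object
print(hooks_after_runs(reset_hooks=True))   # 1: hooks are cleared before each reuse
```

Without the reset, the caller sees one more hook on every iteration even though it expects a fresh output each time.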

2024-05-28 22:07:41 +00:00

distributed

[5/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort torch (#127126)

2024-05-27 14:49:57 +00:00

dynamo

cudagraphs: fix backward hooks & fsdp interaction (#126914)

2024-05-28 22:07:41 +00:00

fastrnns

[5/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort torch (#127126)

2024-05-27 14:49:57 +00:00

framework_overhead_benchmark

[5/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort torch (#127126)

2024-05-27 14:49:57 +00:00

functional_autograd_benchmark

[5/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort torch (#127126)

2024-05-27 14:49:57 +00:00

fuser

[5/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort torch (#127126)

2024-05-27 14:49:57 +00:00

gpt_fast

[5/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort torch (#127126)

2024-05-27 14:49:57 +00:00

inference

[5/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort torch (#127126)

2024-05-27 14:49:57 +00:00

instruction_counts

[5/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort torch (#127126)

2024-05-27 14:49:57 +00:00

nested

…

operator_benchmark

[5/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort torch (#127126)

2024-05-27 14:49:57 +00:00

overrides_benchmark

[5/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort torch (#127126)

2024-05-27 14:49:57 +00:00

profiler_benchmark

[5/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort torch (#127126)

2024-05-27 14:49:57 +00:00

record_function_benchmark

…

serialization

[5/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort torch (#127126)

2024-05-27 14:49:57 +00:00

sparse

[5/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort torch (#127126)

2024-05-27 14:49:57 +00:00

static_runtime

Fix layer norm in static runtime when input is non-contiguous (#124789)

2024-04-24 19:49:36 +00:00

tensorexpr

[5/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort torch (#127126)

2024-05-27 14:49:57 +00:00

transformer

[5/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort torch (#127126)

2024-05-27 14:49:57 +00:00

compare-fastrnn-results.py

…

compare.sh

…

README.md

…

upload_scribe.py

…

README.md

PyTorch Benchmarks

This folder contains scripts that produce reproducible timings of various PyTorch features.

It also provides mechanisms to compare PyTorch with other frameworks.

Setup environment

Make sure you're on a machine with CUDA, torchvision, and pytorch installed. Install in the following order:

# Install torchvision. It comes with the pytorch stable release binary
conda install pytorch torchvision -c pytorch

# Install the latest pytorch master from source.
# It should supersede the installation from the release binary.
cd $PYTORCH_HOME
python setup.py build develop

# Check the pytorch installation version
python -c "import torch; print(torch.__version__)"
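
For a slightly fuller check than the one-liner above, the prerequisites can be verified from Python. This is a minimal sketch, not part of the benchmark scripts: the helper name `check_benchmark_env` is hypothetical, and it relies only on `importlib.util.find_spec`, `torch.__version__`, and `torch.cuda.is_available()`.

```python
import importlib.util


def check_benchmark_env(packages=("torch", "torchvision")):
    """Report which packages are importable and, if torch is, whether CUDA is usable."""
    report = {name: importlib.util.find_spec(name) is not None for name in packages}
    if report.get("torch"):
        try:
            import torch
        except ImportError:
            # Found on the path but not importable (for example, a broken build).
            report["torch"] = False
        else:
            report["torch_version"] = torch.__version__
            report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in check_benchmark_env().items():
        print(f"{key}: {value}")
```

If `torch` or `torchvision` is reported as `False`, or `cuda_available` is `False`, revisit the installation steps above before running any benchmark.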

Benchmark List

Please refer to each subfolder to discover each benchmark suite. Links are provided where descriptions exist: