mirror of
https://github.com/zebrajr/pytorch.git
synced 2026-01-15 12:15:51 +00:00
Summary: On specific hardware (A100), autocast produces a relatively large numerical error in Transformer (torch.nn.TransformerEncoder) when run under the no_grad decorator at dim=256 (and presumably larger). H100 appears unaffected, as does A100 with MIG (i.e., fewer than the full set of SMs). Backing this out for now; revisiting next week.

Test Plan: failed jobs: https://fburl.com/scuba/remote_execution_action/jzcmujgk
{F1983543613}

Reviewed By: t-ivan-gr

Differential Revision: D87111518

Pull Request resolved: https://github.com/pytorch/pytorch/pull/167884
Approved by: https://github.com/malfet