Fix edge-data handling in cudaGraphNodeGetDependencies for CUDA 13 in graph_capture_record_stream_reuse (#168305)

CUDA 13 introduced stricter behavior for querying graph edges with edge data.
According to the CUDA documentation for [cudaGraphNodeGetDependencies](https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__GRAPH.html#group__CUDART__GRAPH_1g94ee7ba53ade560483e9c5d06e8ef50d):

> If an edge has non-zero (non-default) edge data and edgeData is NULL, this API returns cudaErrorLossyQuery.
> If edgeData is non-NULL, then pDependencies must also be non-NULL.

When a graph contains edge data, we must pass a non-NULL edgeData buffer whenever we query a node's dependencies. Otherwise, on CUDA 13 the call returns cudaErrorLossyQuery.
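
The contract quoted above can be sketched without a GPU. The block below is a minimal mock, not the CUDA runtime: `QueryStatus`, `EdgeData`, `Node`, `get_dependencies`, and `query_all` are hypothetical names introduced only for illustration. It assumes that a count-only query (both pointers NULL) succeeds even when edge data exists, which is what the two-step call pattern in this change relies on. Sizing the edge-data buffer to the dependency count is a choice made for this sketch, not a description of what this commit does.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration only; none are CUDA runtime names.
enum class QueryStatus { Success, LossyQuery, InvalidValue };

struct EdgeData {
  unsigned char from_port = 0;
  unsigned char to_port = 0;
  unsigned char type = 0;
  bool is_default() const {
    return from_port == 0 && to_port == 0 && type == 0;
  }
};

struct Node {
  std::vector<int> deps;        // ids of the nodes this node depends on
  std::vector<EdgeData> edges;  // edges[i] annotates the edge to deps[i]
};

// Mock of the quoted contract. Assumption: a count-only query
// (deps == nullptr, edge_data == nullptr) succeeds even with edge data.
QueryStatus get_dependencies(const Node& n, int* deps, EdgeData* edge_data,
                             std::size_t* count) {
  if (count == nullptr) return QueryStatus::InvalidValue;
  if (edge_data != nullptr && deps == nullptr) return QueryStatus::InvalidValue;
  if (deps == nullptr) {  // count-only query
    *count = n.deps.size();
    return QueryStatus::Success;
  }
  const std::size_t capacity = *count;
  const std::size_t filled =
      capacity < n.deps.size() ? capacity : n.deps.size();
  for (std::size_t i = 0; i < filled; ++i) {
    if (edge_data == nullptr && !n.edges[i].is_default()) {
      return QueryStatus::LossyQuery;  // edge data would be silently dropped
    }
    deps[i] = n.deps[i];
    if (edge_data != nullptr) edge_data[i] = n.edges[i];
  }
  for (std::size_t i = filled; i < capacity; ++i) deps[i] = -1;  // unused slots
  *count = filled;
  return QueryStatus::Success;
}

// Two-phase query: ask for the count, then fetch into buffers of that size.
QueryStatus query_all(const Node& n, std::vector<int>& deps,
                      std::vector<EdgeData>& edge_data) {
  std::size_t count = 0;
  QueryStatus status = get_dependencies(n, nullptr, nullptr, &count);
  assert(status == QueryStatus::Success);
  deps.resize(count);
  edge_data.resize(count);  // one slot per dependency
  if (count == 0) return status;
  status = get_dependencies(n, deps.data(), edge_data.data(), &count);
  deps.resize(count);
  edge_data.resize(count);
  return status;
}
```

In this mock, calling `get_dependencies` with a dependency buffer but a NULL edge-data pointer on a node that has any non-default edge yields `QueryStatus::LossyQuery`, while `query_all` succeeds and returns the edge data alongside the dependencies.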

Pull Request resolved: https://github.com/pytorch/pytorch/pull/168305
Approved by: https://github.com/eqy, https://github.com/ezyang
Frank Lin
2025-11-21 07:16:02 +00:00
committed by PyTorch MergeBot
parent 4ee6b3d60c
commit 8b0314d1a7

@@ -1765,7 +1765,12 @@ class DeviceCachingAllocator {
     auto node_get_dependencies =
         [](cudaGraphNode_t n, cudaGraphNode_t* deps, size_t* count) -> void {
 #if (defined(CUDA_VERSION) && CUDA_VERSION >= 13000)
-      C10_CUDA_CHECK(cudaGraphNodeGetDependencies(n, deps, nullptr, count));
+      if (deps == nullptr) {
+        C10_CUDA_CHECK(cudaGraphNodeGetDependencies(n, deps, nullptr, count));
+      } else {
+        cudaGraphEdgeData edgeData;
+        C10_CUDA_CHECK(cudaGraphNodeGetDependencies(n, deps, &edgeData, count));
+      }
 #else
       C10_CUDA_CHECK(cudaGraphNodeGetDependencies(n, deps, count));
 #endif