Description
I profiled `torch.nn.Conv3d` using both PyTorch's built-in profiler and Nsight Compute. When viewing the results in TensorBoard, the PyTorch profiler reports zero Tensor Core utilization. However, Nsight Compute indicates that Tensor Cores are actually being used.

Upon investigating the codebase, I found that the Tensor Core allowlist (`TC_Allowlist`) in [tb_plugin/torch_tb_profiler/profiler/tensor_core.py](https://github.com/pytorch/kineto/blob/main/tb_plugin/torch_tb_profiler/profiler/tensor_core.py) appears to be outdated.

The kernel used in `Conv3d` is:

However, `xmma_fprop_implicit_gemm` is not included in the allowlist, which might explain the discrepancy.

Expected Behavior
The PyTorch profiler's TensorBoard view should correctly report Tensor Core utilization when kernels that use Tensor Cores are executed.
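
To illustrate how a missing allowlist entry could produce the zero-utilization report described above, here is a minimal sketch. It assumes the plugin classifies a kernel by checking whether any allowlist entry occurs as a substring of the kernel name. The allowlist entries, the helper `uses_tensor_cores`, and the kernel name are all hypothetical and are not copied from `tensor_core.py`.

```python
# Hypothetical stand-in for the plugin's allowlist. These entries are
# illustrative only and are NOT copied from tensor_core.py.
ILLUSTRATIVE_ALLOWLIST = ["xmma_implicit_gemm", "xmma_gemm"]


def uses_tensor_cores(kernel_name, allowlist):
    """Return True if any allowlist entry occurs in the kernel name.

    Substring matching is an assumption about how the check works.
    """
    return any(entry in kernel_name for entry in allowlist)


# Hypothetical kernel name containing the substring reported in this issue.
kernel = "example_xmma_fprop_implicit_gemm_kernel"

# Without the entry, the kernel is treated as not using Tensor Cores.
before = uses_tensor_cores(kernel, ILLUSTRATIVE_ALLOWLIST)

# With the entry added, the same kernel is recognized.
after = uses_tensor_cores(
    kernel, ILLUSTRATIVE_ALLOWLIST + ["xmma_fprop_implicit_gemm"]
)

print(before, after)
```

Under this assumption, a kernel whose name matches no entry is counted as non-Tensor-Core work even when the hardware counters seen by Nsight Compute say otherwise.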
Suggested Fix
The allowlist should be updated to include `xmma_fprop_implicit_gemm` and other relevant kernels.

Environment