Torch Profiler Shows Zero Tensor Core Utilization for torch.nn.Conv3d, While Nsight Compute Confirms Usage #1041

Open
BurkeHulk opened this issue Feb 14, 2025 · 0 comments
Labels
plugin PyTorch Profiler TensorBoard Plugin related

Description

I profiled torch.nn.Conv3d using both PyTorch's built-in profiler and Nsight Compute. When viewing the results in TensorBoard, the PyTorch profiler reports zero Tensor Core utilization. However, Nsight Compute indicates that Tensor Cores are actually being used.

Upon investigating the codebase, I found that the Tensor Core allowlist (TC_Allowlist) in [tb_plugin/torch_tb_profiler/profiler/tensor_core.py](https://github.com/pytorch/kineto/blob/main/tb_plugin/torch_tb_profiler/profiler/tensor_core.py) appears to be outdated.

The kernel used in Conv3d is:

sm90_xmma_fprop_implicit_gemm_bf16bf16_bf16f32_f32_nhwckrsc_nhwc_tilesize128x128x64_warpgroupsize1x1x1_g1_execute_segment_k_off_kernel__5x_cudnn

However, xmma_fprop_implicit_gemm is not included in the allowlist, which might explain the discrepancy.
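
To illustrate why the kernel could be missed, here is a minimal sketch. It assumes the plugin classifies a kernel as Tensor Core based when any allowlist pattern occurs as a substring of the kernel name. The list below is an illustrative subset chosen for this example, not a copy of the real TC_Allowlist, and uses_tensor_cores is a hypothetical helper.

```python
# Illustrative subset only (assumption); the real list in tensor_core.py may differ.
ASSUMED_ALLOWLIST = ["h884", "s884", "hmma", "xmma_implicit_gemm", "xmma_gemm"]

# Kernel name reported for Conv3d in this issue, split across two lines for readability.
KERNEL = (
    "sm90_xmma_fprop_implicit_gemm_bf16bf16_bf16f32_f32_nhwckrsc_nhwc_"
    "tilesize128x128x64_warpgroupsize1x1x1_g1_execute_segment_k_off_kernel__5x_cudnn"
)


def uses_tensor_cores(kernel_name, allowlist):
    """Return True if any allowlist pattern is a substring of the kernel name."""
    return any(pattern in kernel_name for pattern in allowlist)


# "xmma_implicit_gemm" does not match because "fprop_" sits between "xmma_" and "implicit_gemm".
before = uses_tensor_cores(KERNEL, ASSUMED_ALLOWLIST)

# Adding the pattern proposed in this issue makes the same kernel match.
after = uses_tensor_cores(KERNEL, ASSUMED_ALLOWLIST + ["xmma_fprop_implicit_gemm"])

print(before, after)
```

Under these assumptions the kernel is not classified as a Tensor Core kernel until the new pattern is added.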

Expected Behavior

The PyTorch Profiler TensorBoard plugin should report nonzero Tensor Core utilization whenever kernels that use Tensor Cores are executed.

Suggested Fix

The allowlist should be updated to include xmma_fprop_implicit_gemm and other relevant kernels.
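
One possible way to find the "other relevant kernels" is to scan an exported trace for kernel names that match no allowlist pattern and review them by hand. The sketch below is an assumption-laden illustration: unmatched_kernels is a hypothetical helper, the allowlist is an illustrative subset, the kernel names in the sample trace are made up, and it assumes GPU kernel events carry a "cat" field equal to "kernel" (case-insensitive) in the Chrome trace format.

```python
def unmatched_kernels(trace, allowlist):
    """Return sorted unique kernel names in a Chrome-trace dict that match no allowlist pattern."""
    names = set()
    for event in trace.get("traceEvents", []):
        # Assumption: GPU kernel events are tagged with category "kernel" (any letter case).
        if str(event.get("cat", "")).lower() != "kernel":
            continue
        name = event.get("name", "")
        if not any(pattern in name for pattern in allowlist):
            names.add(name)
    return sorted(names)


# Synthetic trace with made-up kernel names, for illustration only.
sample_trace = {
    "traceEvents": [
        {"cat": "cpu_op", "name": "aten::conv3d", "dur": 300},
        {"cat": "kernel", "name": "sm90_xmma_fprop_implicit_gemm_example_kernel", "dur": 120},
        {"cat": "kernel", "name": "example_hmma_gemm_kernel", "dur": 80},
        {"cat": "Kernel", "name": "vectorized_elementwise_kernel", "dur": 10},
    ]
}

# Illustrative subset only (assumption), not the real TC_Allowlist.
candidates = unmatched_kernels(sample_trace, ["hmma", "xmma_implicit_gemm"])
print(candidates)
```

For a real run, the trace dict could come from json.load on a file written by the profiler's export_chrome_trace method. The resulting list still needs manual review, since most unmatched kernels (such as elementwise ops) do not use Tensor Cores.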

Environment

  • PyTorch Version: 2.6.0+cu124
  • CUDA Version: 12.4
  • GPU: NVIDIA H200
  • Profiling Tools: PyTorch Profiler, Nsight Compute 2024.1.1.0 (build 33998838)
  • torch-tb-profiler: 0.4.3

@davidberard98 added the plugin (PyTorch Profiler TensorBoard Plugin related) label on Feb 15, 2025