[Performance] GPU Fallback to CPU Without Error When CUDA DLLs Are Missing #23372
Labels
ep:CUDA
issues related to the CUDA execution provider
performance
issues related to performance regressions
Describe the issue
When ONNX Runtime is configured to use the CUDA execution provider and the model cannot be loaded onto the GPU (for example, because CUDA DLLs are missing), execution silently falls back to the CPU. The failure is only written to the log; no explicit error is returned to the application.
As a result, the user may be unaware that inference is running on the CPU, which can significantly degrade performance.
Expected Behavior:
ONNX Runtime should raise an error and stop execution if the GPU cannot be initialized correctly, so that the user is immediately aware of the problem.
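To illustrate the fail-fast behavior requested here, below is a minimal application-side sketch in C++. `RequireProvider` is a hypothetical helper, not part of ONNX Runtime, and it assumes the application can obtain a list of the execution providers that are actually active for the session. How that list is obtained is left open, since a list of providers compiled into the build may not reflect DLLs that failed to load at run time.

```cpp
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not an ONNX Runtime API).
// `active_providers` is assumed to be the list of execution providers that
// are actually in use for the session; obtaining it is up to the application.
// Throws instead of letting execution continue on an unintended provider.
inline void RequireProvider(const std::vector<std::string>& active_providers,
                            const std::string& required) {
  const bool found =
      std::find(active_providers.begin(), active_providers.end(), required) !=
      active_providers.end();
  if (!found) {
    throw std::runtime_error("Required execution provider is not active: " +
                             required);
  }
}
```

Calling `RequireProvider(active, "CUDAExecutionProvider")` right after session setup would turn a silent CPU fallback into an exception the application can handle, which is the behavior this issue asks the library to provide itself.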
Current Behavior:
To reproduce
Urgency
No response
Platform
Windows
OS Version
10
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.18.1
ONNX Runtime API
C++
Architecture
X64
Execution Provider
CUDA
Execution Provider Library Version
CUDA 11.8
Model File
No response
Is this a quantized model?
No