Build engine failure of TensorRT 10.7 when running quantization model on GPU NVIDIA GeForce RTX 3090 #4320
Comments
Try to use
Thanks. But it doesn't seem to solve my problem.
1. Use a tiny ONNX model such as resnet50.onnx with trtexec to check the hardware environment. By the way, since your model is a quantized model, make sure it was quantized with the NVIDIA QAT tool.
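The trtexec sanity check suggested above could look like the sketch below. This only prints the command rather than executing it; resnet50.onnx is a placeholder path, and trtexec ships in the TensorRT bin/ directory, so run the printed command on the target machine.

```shell
# Hypothetical sanity check: build a small INT8 engine to rule out
# driver/CUDA/TensorRT environment problems before debugging the real model.
# resnet50.onnx is a placeholder; any small ONNX model works.
CMD="trtexec --onnx=resnet50.onnx --int8"
echo "$CMD"
```

If a small model builds cleanly, the toolchain is likely fine and the failure is specific to the quantized EfficientLoFTR graph.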
Description
I tried to quantize an FP32 ONNX model to an INT8 TRT model.
When I use TensorRT's Python API to convert this ONNX model to a TRT engine, I get an error like:
[01/14/2025-16:56:50] [TRT] [I] [MemUsageChange] Init CUDA: CPU +2, GPU +0, now: CPU 115, GPU 1322 (MiB)
[01/14/2025-16:56:51] [TRT] [I] [MemUsageChange] Init builder kernel library: CPU +2073, GPU +385, now: CPU 2343, GPU 1707 (MiB)
Parsing ONNX file: ./eloftr_outdoor_ds1_fp32_640x640_640x640_v17.onnx
Building an engine from file ./eloftr_outdoor_ds1_fp32_640x640_640x640_v17.onnx, this may take a while...
Loaded 300 images for calibration
Building INT8 engine...
[01/14/2025-16:56:52] [TRT] [W] /model/fine_matching/Reshape_12: IShuffleLayer with zeroIsPlaceHolder=true has reshape dimension at position 1 that might or might not be zero. TensorRT resolves it at runtime, but this may cause excessive memory consumption and is usually a sign of a bug in the network.
[01/14/2025-16:56:52] [TRT] [W] /model/fine_matching/Reshape_12: IShuffleLayer with zeroIsPlaceHolder=true has reshape dimension at position 1 that might or might not be zero. TensorRT resolves it at runtime, but this may cause excessive memory consumption and is usually a sign of a bug in the network.
[01/14/2025-16:56:52] [TRT] [I] Perform graph optimization on calibration graph.
[01/14/2025-16:56:52] [TRT] [I] Local timing cache in use. Profiling results in this builder pass will not be stored.
[01/14/2025-16:56:54] [TRT] [I] [GraphReduction] The approximate region cut reduction algorithm is called.
[01/14/2025-16:56:54] [TRT] [I] Detected 2 inputs and 3 output network tensors.
[01/14/2025-16:56:57] [TRT] [E] IBuilder::buildSerializedNetwork: Error Code 4: Shape Error (broadcast dimensions must be conformable)
Failed to build serialized engine
Failed to build engine
Environment
TensorRT Version: 10.7
NVIDIA GPU: NVIDIA GeForce RTX 3090
NVIDIA Driver Version: 535.183.01
CUDA Version: 12.1
CUDNN Version: 8.9.5
Operating System: Ubuntu 20.04.6 LTS
Python Version (if applicable): 3.10
Tensorflow Version (if applicable):
PyTorch Version (if applicable): 2.5.1
Baremetal or Container (if so, version):
Relevant Files
Model link:
EfficientLoFTR: https://drive.google.com/drive/folders/1nw1nhtInBfo65ux2I-GtaBXyTPLqtqnH
FP32 ONNX model: https://drive.google.com/file/d/1jKXhOtj5-LfQqrRW0R4R7j3hcStGWTGW/view?usp=drive_link
Steps To Reproduce
The conversion script (onnx2trt.py) I use to convert the ONNX model to a TRT engine:
import os
import onnx
import tensorrt as trt
import numpy as np
import cv2
from tqdm import tqdm
import pycuda.driver as cuda
import pycuda.autoinit
import torch
from onnxsim import simplify

print(f"TensorRT Version: {trt.__version__}")

class CalibrationDataLoader:
    def __init__(self, imgpath, input_shape=(1, 1, 640, 640), batch_size=2):
        """
        Args:
            imgpath (str): directory containing calibration images
            input_shape (tuple): input shape (N, C, H, W)
            batch_size (int): number of images per batch
        """
        self.root_dir = imgpath
        self.input_shape = input_shape
        self.batch_size = batch_size
        self.batch_idx = 0

class INT8Calibrator(trt.IInt8EntropyCalibrator):
    def __init__(self, dataloader):
        trt.IInt8EntropyCalibrator.__init__(self)
        self.dataloader = dataloader
        self.cache_file = "calibration.cache"

def build_int8_engine(onnx_path, imgpath):
    """Build TensorRT engine with INT8 quantization"""
    logger = trt.Logger(trt.Logger.INFO)
    builder = trt.Builder(logger)
    network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    config = builder.create_builder_config()
    parser = trt.OnnxParser(network, logger)

def main():
    # Configuration
    imgpath = "./data/calibration_images/0015/images"
    onnx_path = "./eloftr_outdoor_ds1_fp32_640x640_640x640_v17.onnx"

if __name__ == "__main__":
    main()
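The calibration loader above is truncated in the issue; a calibrator's get_batch method would need to hand TensorRT a contiguous NCHW float32 batch. Below is a minimal sketch of that preprocessing for this model's grayscale 640x640 input, using synthetic arrays in place of cv2 image reads. The function name preprocess_batch and the [0, 1] normalization are assumptions for illustration, not the author's code.

```python
import numpy as np

def preprocess_batch(images, input_shape=(1, 1, 640, 640)):
    """Stack grayscale HxW images into a contiguous NCHW float32 batch in [0, 1].

    (Hypothetical helper; the normalization scheme is an assumption.)
    """
    _, c, h, w = input_shape
    batch = []
    for img in images:
        img = img.astype(np.float32) / 255.0      # scale uint8 -> [0, 1]
        batch.append(img.reshape(c, h, w))        # HxW -> CxHxW (single channel)
    # Contiguous memory is required before copying to the device buffer.
    return np.ascontiguousarray(np.stack(batch))

# Synthetic stand-ins for cv2.imread(path, cv2.IMREAD_GRAYSCALE) results.
imgs = [np.random.randint(0, 256, (640, 640), dtype=np.uint8) for _ in range(2)]
batch = preprocess_batch(imgs)
print(batch.shape, batch.dtype)  # (2, 1, 640, 640) float32
```

In a real calibrator, get_batch would copy this array to a pre-allocated device buffer (e.g. with cuda.memcpy_htod) and return the buffer's address.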
Commands or scripts: python onnx2trt.py
Have you tried the latest release?: yes, I have also tried other versions of TensorRT, but each version reports a different error.
Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (polygraphy run <model.onnx> --onnxrt): yes