[Bug]: T5 Text Encoder producing (very) different output on GPU when model is compiled for dynamic vs. static shape #29017

RyanMetcalfeInt8 · 2025-02-16T21:29:19Z

OpenVINO Version

2025.1 Nightly (openvino_toolkit_windows_2025.1.0.dev20250214_x86_64)

Operating System

Windows System

Device used for inference

GPU

Framework

None

Model used

T5 Text Encoder

Issue description

The T5 Text Encoder model (from the Stable Diffusion 3 pipeline) produces very different output on GPU depending on whether it is compiled and run with dynamic shapes or with static shapes. The issue seems to be specific to GPU: I do not observe it when using CPU.

See the reproduction steps below for more details.

Step-by-step reproduction

I have attached a small C++ reproducer:

t5_static_vs_dynamic_reproducer.zip

Obtain the T5 Text Encoder model by generating the SD3 models with the following commands. Note: sd3_requirements.txt is packaged with the zip.

python -m venv my_env
my_env\Scripts\activate
pip install --upgrade-strategy eager -r sd3_requirements.txt
optimum-cli export openvino --model stabilityai/stable-diffusion-3-medium-diffusers --task stable-diffusion --weight-format fp16 stable-diffusion-3-medium-diffusers

Build the small C++ reproducer (cmd.exe shell):

call "openvino_toolkit_windows_2025.1.0.dev20250214_x86_64\setupvars.bat"
mkdir t5_static_vs_dynamic_reproducer-build
cd t5_static_vs_dynamic_reproducer-build
cmake ..\t5_static_vs_dynamic_reproducer
cmake --build . --config Release

Run it.

cd Release
main.exe <path_to>\stable-diffusion-3-medium-diffusers\text_encoder_3\openvino_model.xml GPU

You should see the following at the end of the log, indicating that the static and dynamic outputs differ:

Mismatch at postiion 0
  static = -0.244141
  dynamic = -0.00371361
done..

Note that if you replace 'GPU' with 'CPU', no mismatches are reported.
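For readers who want to compare the two output buffers themselves, here is a minimal sketch of a tolerance-based comparison. It is not taken from the attached reproducer: the helper name `first_mismatch` and the default tolerances are assumptions chosen for illustration. Small differences are expected between differently compiled fp16 GPU kernels, so the check combines an absolute and a relative tolerance; the mismatch reported above (-0.244141 vs. -0.00371361) is far outside either.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the attached reproducer).
// Returns the index of the first element where the two buffers differ by more
// than abs_tol + rel_tol * |static value|, or -1 if every element agrees.
// The default tolerances are illustrative assumptions, not validated thresholds.
long first_mismatch(const std::vector<float>& static_out,
                    const std::vector<float>& dynamic_out,
                    float abs_tol = 1e-2f,
                    float rel_tol = 1e-2f) {
    // Both buffers must describe outputs of the same shape.
    assert(static_out.size() == dynamic_out.size());
    for (std::size_t i = 0; i < static_out.size(); ++i) {
        const float diff = std::fabs(static_out[i] - dynamic_out[i]);
        const float limit = abs_tol + rel_tol * std::fabs(static_out[i]);
        // The negated comparison also flags NaN, because any ordered
        // comparison involving NaN evaluates to false.
        if (!(diff <= limit)) {
            return static_cast<long>(i);
        }
    }
    return -1;
}
```

Calling `first_mismatch` on the static and dynamic output vectors and printing both values at the returned index gives the same kind of report as the log above, while ignoring differences that are plausibly just rounding noise.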

Relevant log output

Issue submission checklist

I'm reporting an issue. It's not a question.
I checked the problem with the documentation, FAQ, open issues, Stack Overflow, etc., and have not found a solution.
There is reproducer code and related data files such as images, videos, models, etc.

RyanMetcalfeInt8 added the bug and support_request labels Feb 16, 2025

RyanMetcalfeInt8 mentioned this issue Feb 16, 2025

Text2Image, Stable Diffusion 3: Explicit Reshape + Compile produces very different output on GPU #29113

Open

mlukasze assigned geunhwan Feb 17, 2025

mlukasze added the category: GPU label Feb 17, 2025

ilya-lavrenov assigned e-ddykim Feb 17, 2025

e-ddykim mentioned this issue Feb 21, 2025

[GPU] fixing RMS for static layers #29110

Open
