[Performance] Multithreading for DequantizeLinear #23395

Open
tarekziade opened this issue Jan 16, 2025 · 2 comments
Labels
performance (issues related to performance regressions), quantization (issues related to quantization)

Comments

@tarekziade

Describe the issue

The current DequantizeLinear CPU operator is single-threaded.

I have implemented a quick prototype that shows a 4x speedup on that operator when used with a Qwen 2.5 0.5B model.

I do see a comment about this:

https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/core/providers/cpu/quantization/quantize_linear.cc#L302

@fajin-corp, is this something you were planning to implement? I'd be happy to help under your guidance.
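
To make the idea concrete, here is a minimal sketch of parallelizing a per-tensor dequantization, y = (x - zero_point) * scale, by splitting the input into contiguous chunks. It is not the prototype mentioned above and not ONNX Runtime code: the names `DequantizeRange` and `DequantizeLinearParallel` are hypothetical, the input is assumed to be `uint8` with a single scale and zero point, and plain `std::thread` is used only to keep the example self-contained. A real kernel would more likely go through the runtime's own intra-op thread pool.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Dequantize one contiguous range [begin, end): y = (x - zero_point) * scale.
void DequantizeRange(const std::uint8_t* input, float* output,
                     std::size_t begin, std::size_t end,
                     float scale, std::uint8_t zero_point) {
  for (std::size_t i = begin; i < end; ++i) {
    const int centered = static_cast<int>(input[i]) - static_cast<int>(zero_point);
    output[i] = static_cast<float>(centered) * scale;
  }
}

// Hypothetical helper (illustration only): split [0, n) into contiguous
// chunks and dequantize each chunk on its own thread. Threads write to
// disjoint output ranges, so no synchronization is needed beyond join().
std::vector<float> DequantizeLinearParallel(const std::vector<std::uint8_t>& input,
                                            float scale, std::uint8_t zero_point,
                                            unsigned num_threads) {
  const std::size_t n = input.size();
  std::vector<float> output(n);
  if (n == 0) return output;

  // Never start more workers than there are elements.
  const std::size_t workers =
      std::min<std::size_t>(std::max(1u, num_threads), n);
  const std::size_t chunk = (n + workers - 1) / workers;  // ceiling division

  std::vector<std::thread> threads;
  threads.reserve(workers);
  for (std::size_t begin = 0; begin < n; begin += chunk) {
    const std::size_t end = std::min(n, begin + chunk);
    threads.emplace_back(DequantizeRange, input.data(), output.data(),
                         begin, end, scale, zero_point);
  }
  for (auto& t : threads) t.join();
  return output;
}
```

Because the operation is element-wise and memory-bound, the achievable speedup depends on tensor size and memory bandwidth; for small tensors the cost of starting threads can outweigh the gain, so a minimum chunk size is usually worth adding.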

To reproduce

n/a

Urgency

No response

Platform

Windows

OS Version

any

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

main

ONNX Runtime API

Python

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response

Model File

No response

Is this a quantized model?

Yes

tarekziade added the performance label (issues related to performance regressions) on Jan 16, 2025
github-actions bot added the quantization label (issues related to quantization) on Jan 16, 2025
@yuslepukhin
Member

Go ahead and PR it.

@fajin-corp
Contributor

@tarekziade I'm not working on it. You are very welcome to open a PR for it.
