
image_transforms preprocess quite slow when run large image with qwen2vl #34272

Open
zhjunqin opened this issue Oct 21, 2024 · 9 comments · May be fixed by #35733

Comments

@zhjunqin

zhjunqin commented Oct 21, 2024

System Info

  • transformers version: 4.45.2
  • Platform: Linux-5.4.0-132-generic-x86_64-with-glibc2.31
  • Python version: 3.12.7
  • Huggingface_hub version: 0.25.1
  • Safetensors version: 0.4.5
  • Accelerate version: 1.0.0
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.4.0+cu121 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using distributed or parallel set-up in script?:
  • Using GPU in script?:
  • GPU type: NVIDIA GeForce RTX 3090

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

The functions `rescale` and `normalize` in image_transforms are quite slow when preprocessing large images.
https://github.com/huggingface/transformers/blob/main/src/transformers/image_transforms.py

Here is a benchmark:

[benchmark table attached as a screenshot in the original issue]

please refer to vllm-project/vllm#9238
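For context, the cost pattern reported here can be reproduced with a standalone sketch. The `rescale` and `normalize` functions below are hypothetical stand-ins for the numpy code paths in transformers' image_transforms, not the library's actual code; the point is that chaining them allocates several full-size float64 temporaries, whereas precomputing per-channel scale/shift factors collapses the pipeline into one float32 multiply-add per pixel:

```python
import time
import numpy as np

# Hypothetical stand-ins for the numpy paths of rescale/normalize
# in transformers' image_transforms (illustrative, not library code).
def rescale(img, scale=1 / 255):
    # uint8 * python float upcasts to float64 and allocates a new array
    return img * scale

def normalize(img, mean, std):
    # two more full-size temporaries: (img - mean), then the division
    return (img - mean) / std

img = np.random.randint(0, 256, (2160, 3840, 3), dtype=np.uint8)
mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225])

t0 = time.perf_counter()
out = normalize(rescale(img), mean, std)
t_naive = time.perf_counter() - t0

# Fused single pass in float32: fold rescale and normalize into
# per-channel scale/shift, so the pipeline is one multiply-add.
scale32 = (1.0 / (255.0 * std)).astype(np.float32)
shift32 = (-mean / std).astype(np.float32)
t0 = time.perf_counter()
fused = img.astype(np.float32) * scale32 + shift32
t_fused = time.perf_counter() - t0

print(f"naive: {t_naive:.3f}s  fused: {t_fused:.3f}s")
```

The fused version computes the same result (to float32 precision) while touching the image once instead of three times, which is essentially what torch/torchvision-based fast processors exploit on top of GPU execution.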

Expected behavior

How can this preprocessing be made faster?

@zhjunqin zhjunqin added the bug label Oct 21, 2024
@zucchini-nlp
Member

Hey @zhjunqin !

This may be related to #28847, where we enabled image processing with torchvision, but that is only supported in the ViT model. Also, @yonigozlan is working on optimizing image processing time in #33810, so he might be your point of contact :)

@yonigozlan
Member

Hey @zhjunqin !
Thanks a lot for raising this issue. Indeed I'm currently working on adding fast image processors to Transformers, and I'll try to address the Qwen one shortly. I'll ping this issue once a PR is opened!


This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@SinanAkkoyun

@yonigozlan Hey :) Did you find time to address the qwen preprocessor?

@yonigozlan
Member

Not yet, but it is still planned :). I will ping here when it's done


This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@Gladiator07
Contributor

Hi @yonigozlan, is qwen's preprocessing refactoring still planned? We are using offline inference with Qwen2-VL 7B for document extraction tasks on approximately 70 million images, and the preprocessing time is a major slowdown for us. If it is not planned immediately, is there any workaround to speed up or skip the preprocessing? I am already sending images resized with the smart_resize function, but they somehow still get sent to huggingface for resizing again. Any pointers would help a lot...
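For readers following along, the smart_resize pre-resizing step mentioned above can be sketched roughly as below. This is a re-implementation for illustration only; the canonical function lives in transformers' qwen2_vl image processing module, and the default pixel budgets here (`factor=28`, `min_pixels`, `max_pixels`) are assumed values, not guaranteed to match any particular release:

```python
import math

# Sketch of Qwen2-VL-style smart_resize: snap (height, width) to
# multiples of `factor` while keeping the total pixel count inside
# [min_pixels, max_pixels] and roughly preserving aspect ratio.
def smart_resize(height, width, factor=28,
                 min_pixels=56 * 56, max_pixels=14 * 14 * 4 * 1280):
    h_bar = round(height / factor) * factor
    w_bar = round(width / factor) * factor
    if h_bar * w_bar > max_pixels:
        # too large: shrink both sides by the same ratio, round down
        beta = math.sqrt((height * width) / max_pixels)
        h_bar = math.floor(height / beta / factor) * factor
        w_bar = math.floor(width / beta / factor) * factor
    elif h_bar * w_bar < min_pixels:
        # too small: grow both sides by the same ratio, round up
        beta = math.sqrt(min_pixels / (height * width))
        h_bar = math.ceil(height * beta / factor) * factor
        w_bar = math.ceil(width * beta / factor) * factor
    return h_bar, w_bar

print(smart_resize(3000, 4000))
```

If images are pre-resized offline to exactly these dimensions, the processor's own resize should in principle become a no-op, though as the comment above notes, whether the processor actually skips the redundant resize depends on the transformers version in use.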

@yonigozlan
Member

yonigozlan commented Jan 16, 2025

Hi @Gladiator07. Sorry for the delay on this. I was waiting for this big refactoring PR on fast image processors #35069 to be merged to continue adding new fast image processors.
But as this is taking longer than I thought, and since there is a lot of demand for qwen2vl, I'll try to open a PR for a fast qwen2vl image processor by the end of the week. I'll ping here when it's opened.
Once it's out you'll be able to check out my branch to use it.
Hope that sounds good!

@yonigozlan
Member

PR is open here #35733 !

6 participants