[Build] Issue with "include_ops_by_config" and DynamicQuantizeMatMul in WASM CPU execution provider #22761

Open
sevagh opened this issue Nov 7, 2024 · 0 comments
Labels
build build issues; typically submitted using template platform:web issues related to ONNX Runtime web; typically submitted using template quantization issues related to quantization

sevagh commented Nov 7, 2024

Describe the issue

Here's my required_operators_and_types.config file:

ai.onnx;6;InstanceNormalization
ai.onnx;10;MatMulInteger
ai.onnx;11;Conv{"inputs": {"0": ["float"]}},ConvTranspose,DynamicQuantizeLinear{"outputs": {"0": ["uint8_t"]}}
ai.onnx;13;Cast{"inputs": {"0": ["int32_t"]}, "outputs": {"0": ["float"]}},Erf,Gather{"inputs": {"0": ["float"], "1": ["int64_t"]}},Gemm{"inputs": {"0": ["float"]}},MatMul{"inputs": {"0": ["float"]}},ReduceMean{"inputs": {"0": ["float"]}},Sigmoid{"inputs": {"0": ["float"]}},Slice{"inputs": {"0": ["float"], "1": ["int64_t"]}},Softmax{"inputs": {"0": ["float"]}},Split{"inputs": {"0": ["float"]}},Sqrt{"inputs": {"0": ["float"]}},Squeeze,Transpose{"inputs": {"0": ["float"]}},Unsqueeze
ai.onnx;14;Add{"inputs": {"0": ["float"]}},Div{"inputs": {"0": ["float"]}},Mul{"inputs": {"0": ["float"]}},Reshape,Sub{"inputs": {"0": ["float"]}}
ai.onnx;17;LayerNormalization
com.microsoft;1;BiasGelu,DynamicQuantizeMatMul,FusedMatMul,Gelu
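Each line of the config above has the shape `domain;opset;op1,op2,...`, where an operator may carry a JSON type restriction in braces. As a sanity check that an operator such as DynamicQuantizeMatMul is listed, and whether it carries a type restriction, the file can be parsed with a short script. This is a minimal sketch: `split_ops` and `parse_config` are hypothetical helpers, not part of ONNX Runtime, and the opset field is kept as a plain string.

```python
import json
from collections import defaultdict


def split_ops(ops_field):
    """Split a comma-separated operator list, ignoring commas inside {...}."""
    ops, depth, start = [], 0, 0
    for i, ch in enumerate(ops_field):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        elif ch == "," and depth == 0:
            ops.append(ops_field[start:i])
            start = i + 1
    ops.append(ops_field[start:])
    return [op.strip() for op in ops if op.strip()]


def parse_config(text):
    """Return {(domain, opset): {op_name: type_spec_or_None}}."""
    result = defaultdict(dict)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        domain, opset, ops_field = line.split(";", 2)
        for op in split_ops(ops_field):
            name, brace, spec = op.partition("{")
            # An operator without a {...} suffix has no type restriction.
            result[(domain, opset)][name] = json.loads(brace + spec) if brace else None
    return dict(result)


# Three lines copied from the config file in this issue.
CONFIG = """\
ai.onnx;10;MatMulInteger
ai.onnx;11;Conv{"inputs": {"0": ["float"]}},ConvTranspose,DynamicQuantizeLinear{"outputs": {"0": ["uint8_t"]}}
com.microsoft;1;BiasGelu,DynamicQuantizeMatMul,FusedMatMul,Gelu
"""

parsed = parse_config(CONFIG)
for (domain, opset), ops in parsed.items():
    print(domain, opset, sorted(ops))
```

In this file DynamicQuantizeMatMul appears under `com.microsoft;1` with no type restriction, so the config itself requests the operator.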

When I build onnxruntime for wasm without the config file, ORT inference works:

python ./vendor/onnxruntime/tools/ci_build/build.py \
    --build_dir="./build/build-ort-wasm-simd" \
    --config=MinSizeRel \
    --build_wasm_static_lib \
    --parallel \
    --minimal_build \
    --disable_ml_ops \
    --disable_rtti \
    --use_preinstalled_eigen \
    --eigen_path=$(realpath "./vendor/eigen") \
    --skip_tests \
    --enable_wasm_simd
    # The failing build additionally enables these two flags:
    #--include_ops_by_config="./onnx-models/required_operators_and_types.config" \
    #--enable_reduced_operator_type_support \

When I build onnxruntime for wasm with the config file (that is, with the two commented-out flags above enabled), ORT inference fails with the following error:

2024-11-07 09:14:13.284000 [I:onnxruntime:, inference_session.cc:1699 Initialize] Initializing session. demucs_onnx_simd.js:1751:12
2024-11-07 09:14:13.284000 [I:onnxruntime:, inference_session.cc:1736 Initialize] Adding default CPU execution provider. demucs_onnx_simd.js:1751:12
Ort::Exception: Could not find an implementation for DynamicQuantizeMatMul(1) node with name '/crosstransformer/layers.0/self_attn/MatMul_quant'
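The working and failing builds differ only in the two flags that are commented out in the command above. The sketch below makes that difference explicit by assembling both argument lists; `build_args` is a hypothetical helper written for illustration, and every flag in it is copied from the build command in this issue.

```python
import os


def build_args(ops_config=None):
    """Assemble the build.py argument list; ops_config enables op reduction."""
    args = [
        "python", "./vendor/onnxruntime/tools/ci_build/build.py",
        "--build_dir=./build/build-ort-wasm-simd",
        "--config=MinSizeRel",
        "--build_wasm_static_lib",
        "--parallel",
        "--minimal_build",
        "--disable_ml_ops",
        "--disable_rtti",
        "--use_preinstalled_eigen",
        "--eigen_path=" + os.path.realpath("./vendor/eigen"),
        "--skip_tests",
        "--enable_wasm_simd",
    ]
    if ops_config is not None:
        # These two flags are the only difference between the two builds.
        args += [
            "--include_ops_by_config=" + ops_config,
            "--enable_reduced_operator_type_support",
        ]
    return args


working = build_args()
failing = build_args("./onnx-models/required_operators_and_types.config")
print([a for a in failing if a not in working])
```

Building once with only `--include_ops_by_config` and once with both flags may help narrow down whether the operator list or the type reduction is responsible.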

In the cmake output, we see the file that contains this operator being replaced:

File '/home/sevagh/repos/demucs.onnx-pro/vendor/onnxruntime/onnxruntime/contrib_ops/cpu/cpu_contrib_kernels.cc' substituted with reduced op version '/home/sevagh/repos/demucs.onnx-pro/build/build-ort-wasm-simd/MinSizeRel/op_reduction.generated/onnxruntime/contrib_ops/cpu/cpu_contrib_kernels.cc'.

The replacement file still contains both the declaration and the registration of the DynamicQuantizeMatMul kernel, and neither line is commented out:

$ rg 'DynamicQuantizeMatMul' build/build-ort-wasm-simd/build/build-ort-wasm-simd/MinSizeRel/op_reduction.generated/onnxruntime/contrib_ops/cpu/cpu_contrib_kernels.cc
95:class ONNX_OPERATOR_TYPED_KERNEL_CLASS_NAME(kCpuExecutionProvider, kMSDomain, 1, float, DynamicQuantizeMatMul);
260:      BuildKernelCreateInfo<ONNX_OPERATOR_TYPED_KERNEL_CLASS_NAME(kCpuExecutionProvider, kMSDomain, 1, float, DynamicQuantizeMatMul)>,
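The same check can be scripted so that it reports the domain, version, and type argument of each surviving line. This is a rough sketch: `find_registrations` is a hypothetical helper, the regular expression only covers the `ONNX_OPERATOR_TYPED_KERNEL_CLASS_NAME` form shown in the output above, and it assumes that a line excluded by op reduction would be either absent or commented out with `//`.

```python
import re

# Matches ONNX_OPERATOR_TYPED_KERNEL_CLASS_NAME(provider, domain, version, type, op).
TYPED = re.compile(
    r"ONNX_OPERATOR_TYPED_KERNEL_CLASS_NAME\("
    r"\s*(\w+),\s*(\w+),\s*(\d+),\s*(\w+),\s*(\w+)\s*\)"
)


def find_registrations(source, op_name):
    """Return (declared, registered) lists of (lineno, domain, version, type)."""
    declared, registered = [], []
    for lineno, line in enumerate(source.splitlines(), start=1):
        stripped = line.lstrip()
        if stripped.startswith("//"):
            continue  # assumption: excluded kernels are commented out
        match = TYPED.search(line)
        if not match or match.group(5) != op_name:
            continue
        entry = (lineno, match.group(2), int(match.group(3)), match.group(4))
        if "BuildKernelCreateInfo<" in line:
            registered.append(entry)
        elif stripped.startswith("class "):
            declared.append(entry)
    return declared, registered


# The two lines reported by rg above, without their line-number prefixes.
SAMPLE = """\
class ONNX_OPERATOR_TYPED_KERNEL_CLASS_NAME(kCpuExecutionProvider, kMSDomain, 1, float, DynamicQuantizeMatMul);
      BuildKernelCreateInfo<ONNX_OPERATOR_TYPED_KERNEL_CLASS_NAME(kCpuExecutionProvider, kMSDomain, 1, float, DynamicQuantizeMatMul)>,
"""

declared, registered = find_registrations(SAMPLE, "DynamicQuantizeMatMul")
print("declared:", declared)
print("registered:", registered)
```

To run it against the real build, read the generated `cpu_contrib_kernels.cc` into a string and pass it in place of `SAMPLE`.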

Urgency

No response

Target platform

WebAssembly

Build script

python ./vendor/onnxruntime/tools/ci_build/build.py \
    --build_dir="./build/build-ort-wasm-simd" \
    --config=MinSizeRel \
    --build_wasm_static_lib \
    --parallel \
    --minimal_build \
    --disable_ml_ops \
    --disable_rtti \
    --use_preinstalled_eigen \
    --eigen_path=$(realpath "./vendor/eigen") \
    --skip_tests \
    --enable_wasm_simd
    #--include_ops_by_config="./onnx-models/required_operators_and_types.config" \
    #--enable_reduced_operator_type_support \

Error / output

Ort::Exception: Could not find an implementation for DynamicQuantizeMatMul(1) node with name '/crosstransformer/layers.0/self_attn/MatMul_quant'

Visual Studio Version

No response

GCC / Compiler Version

No response

@sevagh sevagh added the build build issues; typically submitted using template label Nov 7, 2024
@github-actions github-actions bot added quantization issues related to quantization platform:web issues related to ONNX Runtime web; typically submitted using template labels Nov 7, 2024