Speedup to_hex (~2x faster) #14686

Merged (2 commits) on Feb 17, 2025

simonvandel (Contributor)

Which issue does this PR close?

N/A

Rationale for this change

We can speed up to_hex by writing string values directly to the string array, instead of making temporary allocations.

to_hex i32 array: 1024  time:   [14.239 µs 14.295 µs 14.354 µs]
                        change: [-48.522% -48.285% -48.031%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild

to_hex i64 array: 1024  time:   [15.471 µs 15.546 µs 15.624 µs]
                        change: [-46.390% -46.208% -46.007%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild
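
As a rough illustration of the idea in the rationale, here is a minimal std-only sketch, not the code from this PR. `HexColumn` is a hypothetical stand-in for an Arrow-style string array builder (one contiguous buffer plus offsets), and `to_hex_allocating` / `to_hex_direct` are names invented for this example. The allocating version creates one temporary `String` per row, while the direct version formats every value straight into the shared buffer.

```rust
use std::fmt::Write;

/// Hypothetical stand-in for a string array builder: every value lives in
/// one contiguous buffer, and `offsets` marks where each value ends.
struct HexColumn {
    data: String,
    offsets: Vec<usize>,
}

impl HexColumn {
    fn with_capacity(rows: usize) -> Self {
        let mut offsets = Vec::with_capacity(rows + 1);
        offsets.push(0);
        // Assumed average of 8 hex digits per value; only a capacity hint.
        Self { data: String::with_capacity(rows * 8), offsets }
    }

    /// Formats `value` as lowercase hex directly into the shared buffer,
    /// so no per-row temporary String is created.
    fn push_hex(&mut self, value: i64) {
        write!(self.data, "{:x}", value).expect("writing to a String cannot fail");
        self.offsets.push(self.data.len());
    }

    fn get(&self, row: usize) -> &str {
        &self.data[self.offsets[row]..self.offsets[row + 1]]
    }

    fn len(&self) -> usize {
        self.offsets.len() - 1
    }
}

/// Baseline for comparison: allocates one temporary String per row.
fn to_hex_allocating(values: &[i64]) -> Vec<String> {
    values.iter().map(|v| format!("{:x}", v)).collect()
}

/// Direct version: all rows are written into a single buffer.
fn to_hex_direct(values: &[i64]) -> HexColumn {
    let mut column = HexColumn::with_capacity(values.len());
    for &v in values {
        column.push_hex(v);
    }
    column
}
```

Both functions produce the same hex text for each row; the difference is only in how many heap allocations happen along the way. Negative inputs are printed in two's complement form, which is how Rust's `{:x}` formats signed integers.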

What changes are included in this PR?

  • Add a benchmark
  • Avoid string allocations
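
To make the two bullets above concrete, the following is a simplified timing sketch using only the standard library. It is not the benchmark added in this PR; the numbers quoted earlier come from a proper benchmark harness with statistical analysis. All function names here (`hex_with_temporaries`, `hex_into_one_buffer`, `time_it`, `compare`) are hypothetical, and the 1024-row input size in the usage note simply mirrors the benchmark label.

```rust
use std::fmt::Write;
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Variant that allocates a temporary String for every input value.
fn hex_with_temporaries(values: &[i32]) -> Vec<String> {
    values.iter().map(|v| format!("{:x}", v)).collect()
}

/// Variant that writes every value into one buffer and records end offsets.
fn hex_into_one_buffer(values: &[i32]) -> (String, Vec<usize>) {
    let mut data = String::with_capacity(values.len() * 8);
    let mut offsets = Vec::with_capacity(values.len() + 1);
    offsets.push(0);
    for v in values {
        write!(data, "{:x}", v).expect("writing to a String cannot fail");
        offsets.push(data.len());
    }
    (data, offsets)
}

/// Runs `f` `iters` times and returns the mean wall-clock time per run.
/// `iters` must be non-zero.
fn time_it<F: FnMut()>(iters: u32, mut f: F) -> Duration {
    let start = Instant::now();
    for _ in 0..iters {
        f();
    }
    start.elapsed() / iters
}

/// Times both variants on the same input and returns
/// (mean with temporaries, mean with one buffer).
fn compare(rows: usize, iters: u32) -> (Duration, Duration) {
    let input: Vec<i32> = (0..rows as i32).collect();
    // black_box keeps the optimizer from discarding the work being timed.
    let with_temporaries = time_it(iters, || {
        black_box(hex_with_temporaries(black_box(&input)));
    });
    let one_buffer = time_it(iters, || {
        black_box(hex_into_one_buffer(black_box(&input)));
    });
    (with_temporaries, one_buffer)
}
```

Calling `compare(1024, 1000)` in a release build gives a quick comparison of the two variants. Results from a loop like this are noisy and machine dependent, so treat them as a sanity check rather than a measurement.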

Are these changes tested?

Relying on existing tests.

Are there any user-facing changes?

Faster execution.

@alamb added the performance (Make DataFusion faster) label on Feb 16, 2025

@alamb (Contributor) left a comment

Love it @simonvandel -- very nice!

Avoiding allocations for the win!

@Weijun-H (Member) left a comment

Ship it!

@Weijun-H merged commit b09c09a into apache:main on Feb 17, 2025
26 checks passed

@Weijun-H (Member)

Thanks @simonvandel and @alamb 👍

Labels: functions, performance (Make DataFusion faster)

3 participants