[SPARK-51045][TESTS][4.0] Regenerate benchmark results after upgrading to Scala 2.13.16 #49741
AMD EPYC 7763 64-Core Processor
Test contains use empty Set:    Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Use HashSet                                 3              3           0       291.9           3.4       1.0X
Use EnumSet                                 4              4           0       227.7           4.4       0.8X
Use HashSet                                 1              1           0      1390.7           0.7       1.0X
Although the ratio changes in both Java 17 and 21, the measured time is too small (`1ms`). So, I believe we can ignore this, because it means it gets faster.
Legacy      9203      9215         8        0.1      9203.3       1.0X
New          813       816         2        1.2       813.1      11.3X
Legacy      6862      6873         7        0.1      6861.7       1.0X
New          796       821         9        1.3       795.6       8.6X
Both become faster, and `New` is still faster than `Legacy`.
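For reference, the `Relative` column in the rows above appears to be the baseline row's time divided by each row's time. A minimal Python sketch, assuming that definition (the helper name `relative` is hypothetical and not part of Spark; the inputs are the Per Row(ns) values quoted above):

```python
def relative(baseline_ns: float, case_ns: float) -> str:
    """Format the speedup of a case against its baseline, e.g. '11.3X'.

    Assumption: Relative = baseline time / case time, rounded to one decimal.
    """
    return f"{baseline_ns / case_ns:.1f}X"


# Legacy vs. New, using the two pairs of rows quoted in the diff above.
print(relative(9203.3, 813.1))
print(relative(6861.7, 795.6))

# HashSet vs. EnumSet from the empty-set benchmark.
print(relative(3.4, 4.4))
```

Under this reading, a ratio built from times around 1 ms moves with very small absolute changes, which is consistent with the earlier remark that the sub-millisecond ratio changes can be ignored.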
Spark              3987      3988         1        0.3      3986.6       1.2X
Spark Binary       2762      2766         3        0.4      2761.6       1.8X
Common Codecs      5040      5043         3        0.2      5039.8       1.0X
Java               3137      3141         6        0.3      3136.7       1.6X
`Java` becomes faster than `Spark`.
Spark              3482      3488        10        0.3      3482.0       1.4X
Spark Binary       2638      2639         0        0.4      2638.3       1.9X
Common Codecs      4955      4983        43        0.2      4955.2       1.0X
Java               5790      5815        28        0.2      5790.1       0.9X
`Java` is still the slowest one in Java 21.
SQL Json                                  8374      8384        14        1.9       532.4       1.2X
SQL Json with UnsafeRow                   9509      9511         3        1.7       604.6       1.1X
SQL Parquet Vectorized: DataPageV1          83        93         6      189.4         5.3     123.2X
SQL Parquet Vectorized: DataPageV2         208       217         7       75.7        13.2      49.2X
Apparently, the ratio between DataPageV1 and DataPageV2 changed too much in this case.
ConstantColumnVector       892       892         1          459.3         2.2       1.0X
OnHeapColumnVector        1020      1021         1          401.5         2.5       0.9X
OffHeapColumnVector        892       893         1          459.0         2.2       1.0X
ConstantColumnVector         0         0         0     13274135.5         0.0       1.0X
This looks strange because the value is `0`.
The 0s are strange. Do we also get 0s on the master branch for this benchmark result?
ConstantColumnVector       765       766         1          535.4         1.9       1.0X
OnHeapColumnVector         774       774         1          529.3         1.9       1.0X
OffHeapColumnVector        830       831         2          493.6         2.0       0.9X
ConstantColumnVector         0         0         0      3321170.8         0.0       1.0X
This also becomes `0`.
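One way to read the `0` entries is through the arithmetic that ties the Rate(M/s) and Per Row(ns) columns together: per-row nanoseconds equal 1000 divided by the rate in millions of rows per second. The sketch below uses a hypothetical helper, `per_row_ns`, which is not part of Spark's benchmark code, and applies it to the rates quoted in the rows above.

```python
def per_row_ns(rate_m_per_s: float) -> float:
    """Convert a rate in millions of rows per second to nanoseconds per row."""
    return 1000.0 / rate_m_per_s


# Rates copied from the ConstantColumnVector rows quoted above.
for rate in (459.3, 535.4, 13274135.5, 3321170.8):
    # One decimal place, matching the precision shown in the table.
    print(f"{rate} M/s -> {per_row_ns(rate):.1f} ns per row")
```

For the two very large rates, the per-row time is far below 0.05 ns, so it prints as `0.0`. This only shows that the zeros are what rounding produces for such rates; it does not explain why the measured rate is that high.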
There are a few places to look at, but in general, when we cross-check both Java 17 and 21, there seems to be no regression in Scala 2.13.16.
Could you review this PR when you have some time, @huaxingao ?
To @huaxingao , yes, it happens even in this PR twice for
For the record,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Thank you for spending your time here to help, @huaxingao !
…g to Scala 2.13.16

### What changes were proposed in this pull request?

This PR aims to regenerate benchmark results of `branch-4.0` after upgrading to Scala 2.13.16.
- #49478

### Why are the changes needed?

To check a regression again
- #49411 (comment)

### Does this PR introduce _any_ user-facing change?

No, this updates only test results.

### How was this patch tested?

Manual review.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #49741 from dongjoon-hyun/bm_40.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
Merged to branch-4.0.
### What changes were proposed in this pull request?

This PR aims to regenerate benchmark results of `branch-4.0` after upgrading to Scala 2.13.16.

### Why are the changes needed?

To check a regression again
- `from_json` #49411 (comment)

### Does this PR introduce _any_ user-facing change?

No, this updates only test results.

### How was this patch tested?

Manual review.

### Was this patch authored or co-authored using generative AI tooling?

No.