[GH-3146] Optimize the binaryToDecimal function #3147
Conversation
Removing parquet-pig has already been discussed: https://lists.apache.org/thread/vh1twzdbvm4fr4sl2wt8swqgq92k8369
Is it actually used in your case @qian0817?
cc @Fokko
@@ -60,12 +60,12 @@ public void testBinaryToDecimal() throws Exception {
     // Test LONG
     testDecimalConversion(Long.MAX_VALUE, 19, 0, "9223372036854775807");
     testDecimalConversion(Long.MIN_VALUE, 19, 0, "-9223372036854775808");
-    testDecimalConversion(0L, 0, 0, "0.0");
+    testDecimalConversion(0L, 0, 0, "0");
Why do these two lines need to change?
For this use case, "0" is the correct expected value. The previous code constructed the BigDecimal from a double, which can lead to incorrect behavior.
I did not use the parquet-pig module directly; while writing my own Parquet converter, I referenced some code from parquet-pig and then discovered these optimization points.
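A minimal standalone illustration of the difference described above (not code from the PR); it assumes the old path divided the unscaled long by pow(10, scale) before handing the result to BigDecimal.valueOf:

```java
import java.math.BigDecimal;

public class ZeroDecimalDemo {
    public static void main(String[] args) {
        // Going through a double uses Double.toString under the hood, so a
        // zero value with scale 0 renders as "0.0".
        System.out.println(BigDecimal.valueOf(0L / Math.pow(10, 0))); // prints 0.0

        // Building the decimal from the unscaled long and the scale keeps the
        // exact representation and renders as "0".
        System.out.println(BigDecimal.valueOf(0L, 0)); // prints 0
    }
}
```

This matches the updated test expectation of "0" instead of "0.0".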
@qian0817 I see, thanks for the added context. Since Pig is deprecated, I'm going to close this PR. That said, I really appreciate you taking the time to create this PR, and I hope we'll see more of these in the future 👍
Rationale for this change
#3146
If precision is less than 18, the condition unscaledNew <= -pow(10, 18) || unscaledNew >= pow(10, 18) cannot be true, so that check can be removed. Additionally, BigDecimal.valueOf(unscaledNew, scale) is preferable to BigDecimal.valueOf(unscaledNew / pow(10, scale)), as it does not convert the unscaled value to a double.
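A hedged sketch of the two points above, using illustrative names and structure (unscaledNew, scale, and the helper methods follow the description here, not the exact parquet-pig binaryToDecimal source):

```java
import java.math.BigDecimal;

// Illustrative only; mirrors the PR description rather than the actual code.
public final class BinaryToDecimalSketch {

    // Old-style path (simplified): dividing by pow(10, scale) converts the
    // unscaled value to a double, which can lose precision for large longs.
    static BigDecimal viaDouble(long unscaledNew, int scale) {
        return BigDecimal.valueOf(unscaledNew / Math.pow(10, scale));
    }

    // Optimized path: build the decimal directly from the unscaled long and
    // the scale. When precision is below 18 the unscaled value has fewer than
    // 18 digits, so a guard against +/- 10^18 can never fire and is unneeded.
    static BigDecimal viaUnscaled(long unscaledNew, int scale) {
        return BigDecimal.valueOf(unscaledNew, scale);
    }

    public static void main(String[] args) {
        System.out.println(viaDouble(Long.MAX_VALUE, 2));   // rounded through double
        System.out.println(viaUnscaled(Long.MAX_VALUE, 2)); // 92233720368547758.07 exactly
    }
}
```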
What changes are included in this PR?
Optimize the binaryToDecimal function
Are these changes tested?
Passes existing unit tests.
Are there any user-facing changes?
No
Closes #3146